GemBench Challenge

GemBench comprises 16 training tasks with 31 variations, covering seven action primitives. The test set includes 44 tasks with 92 variations, organized into four progressively more challenging levels that systematically evaluate generalization: novel placements, novel rigid objects, novel articulated objects, and long-horizon tasks.

Colosseum Challenge

Colosseum evaluates models' generalization across various scene perturbations. It encompasses 14 perturbation factors within 20 distinct RLBench tasks, categorized into three tiers (simple, intermediate, and complex) according to the number of waypoints involved (the task horizon). Collectively, Colosseum presents 20,371 unique task perturbation instances.

Evaluation

Simulator-based Evaluation

To participate in the challenge, please first register your team using this registration form. Note that the two challenges are evaluated independently. Please review the guidelines for your chosen challenge(s):

Dates

  • Challenge submission deadline: May 12 23:59 CET
  • Challenge report deadline: May 19 23:59 CET