Task Domains¶

Here, we describe the various task domains available in DiscoGen. We expect this to continue to grow as our benchmark scales.

BayesianOptimisation¶

The agent must maximise randomly sampled variables using Bayesian Optimisation.

Modules¶

acq_fn, acq_optimizer, sampler, next_queries, surrogate, surrogate_optimizer

Datasets¶

Ackley1d, Ackley2d, Branin2d, Bukin2d, Cosine8d, DropWave2d, EggHolder2d, Griewank5d, Hartmann6d, HolderTable2d, Levy6d

BrainSpeechDetection¶

The agent is tasked with training a speech detector based on brain MEG signals.

Modules¶

loss, networks, optim

Datasets¶

LibriBrainSherlock1, LibriBrainSherlock2, LibriBrainSherlock3, LibriBrainSherlock4, LibriBrainSherlock5, LibriBrainSherlock6, LibriBrainSherlock7

ComputerVisionClassification¶

The agent must train an image classifier for a range of different image classification datasets, of varying difficulty.

Modules¶

loss, networks, optim, preprocess

Datasets¶

CIFAR10, CIFAR10C, CIFAR10LT, CIFAR100, FashionMNIST, MNIST, OxfordFlowers, StanfordCars, TinyImageNet

ContinualLearning¶

The agent must train a model on different non-stationary continual learning tasks.

Modules¶

optim, regularizer, replay, sampler, scheduler

Datasets¶

PermutedMNIST, SplitCIFAR100, TinyImageNetSplit

GreenhouseGasPrediction¶

The agent must train a model to predict the changing concentrations of different greenhouse gases in the atmosphere.

Modules¶

data_processing, model

Datasets¶

CH4, CO2, N2O, SF6

LanguageModelling¶

The agent must pre-train a language model on different small-scale pretraining datasets.

Modules¶

loss, networks, optim

Datasets¶

LMFineWeb, OPCFineWebCode, OPCFineWebMath, TinyStories

ModelUnlearning¶

The agent must unlearn certain behaviours of a pretrained model while maintaining others.

Modules¶

loss

Datasets¶

muse, tofu, wmdp_cyber

Models¶

gemma-7b-it, Llama-2-7b-chat-hf, Llama-2-7b-hf, Llama-2-13b-hf, Llama-3.1-8b-Instruct, Llama-3.2-1B-Instruct, Llama-3.2-3B-Instruct, phi-1_5, Phi-3.5-mini-instruct, Qwen2.5-1.5B-Instruct, Qwen2.5-3B-Instruct, Qwen-2.5-7B-Instruct

Installation¶

Please note, after installing the ModelUnlearning requirements.txt, you must install flash-attn. Please use:

pip install flash-attn==2.6.3 --no-build-isolation

NeuralCellularAutomata¶

The agent must design algorithms for evolving neural cellular automata, which must do tasks like reproduce images of emojis or classify digits from MNIST.

Modules¶

loss, optimiser, perceive, train, update

Datasets¶

GrowingButterfly, GrowingLizard, MatrixOperations, MNISTInpainting, SelfClassifyingMNIST

OfflineRL¶

The agent must train a value-based RL agent in game environments.

Modules¶

actor_loss, critic_loss, networks, optim, train

Datasets¶

OGBench/antmaze-giant-navigate, OGBench/antmaze-large-navigate, OGBench/antsoccer-arena-navigate, OGBench/cube-double-play, OGBench/cube-single-play, OGBench/humanoidmaze-large-navigate, OGBench/humanoidmaze-medium-navigate, OGBench/puzzle-3x3-play, OGBench/puzzle-4x4-play, OGBench/scene-play `

OffPolicyRL¶

The agent must train a value-based RL agent in game environments.

Modules¶

config, networks, optim, policy, q_update, rb, train

Datasets¶

MinAtar/Asterix, MinAtar/Breakout, MinAtar/Freeway, MinAtar/SpaceInvaders

OnPolicyMARL¶

The agent must train multiple RL agents in cooperative and competitive multi-agent RL environments.

Modules¶

loss, networks, optim, train, activation, targets

Datasets¶

MABrax/Ant, MABrax/HalfCheetah, MABrax/Hopper, MABrax/Humanoid, MABrax/Walker, MPE/Spread, SMAX/2s3z, SMAX/3s_vs_5z, SMAX/3s5z, SMAX/3s5z_vs_3s6z, SMAX/5m_vs_6m, SMAX/6h_vs_8z, SMAX/10m_vs_11m, SMAX/27m_vs_30m, SMAX/smacv2_5_units, SMAX/smacv2_10_units, SMAX/smacv2_20_units

OnPolicyRL¶

The agent must train an on-policy RL agent in game and robotics environments.

Modules¶

loss, networks, optim, train, activation, targets

Datasets¶

Brax/Ant, Brax/HalfCheetah, Brax/Hopper, Brax/Humanoid, Brax/Pusher, Brax/Reacher, Brax/Walker2D, Craftax/Craftax, Craftax/Craftax-Classic, MinAtar/Asterix, MinAtar/Breakout, MinAtar/Freeway, MinAtar/SpaceInvaders

Please note: some datasets are currently omitted from the DiscoBench tasks due to excessive runtimes. This is likely to change in the future.

TrajectoryPrediction¶

The agent must develop algorithms to predict the trajectories of traffic participants, such as cars or pedestrians.

Modules¶

loss, networks, optim, train

Datasets¶

Argoverse2, nuScenes, Waymo

UnsupervisedEnvironmentDesign¶

The agent must develop level sampling methods for an on-policy RL agent.

Modules¶

sample_levels, train_step, variable_config

Datasets¶

Kinetix/Large, Kinetix/Medium, Kinetix/Small, Minigrid