A Playground for research at the intersection of Continual, Reinforcement, and Self-Supervised Learning.
- 5 minute intro: https://www.youtube.com/watch?v=0u48vr96zRQ
- Paper link: https://arxiv.org/abs/2108.01005
- Continual Supervised Learning Study (~6K runs)
- Continual Reinforcement Learning Study (~2300 runs)
If you have any questions or comments, please make an issue!
Most applied ML research proposes new Settings (research problems), new Methods (solutions to such problems), or both.

- When proposing new Settings, researchers almost always have to reimplement or heavily modify existing solutions before they can be applied to the new problem.
- Likewise, when creating new Methods, it is often necessary to first re-create the experimental setting of baseline papers, or even the baseline methods themselves, since experimental conditions may differ slightly between papers!
The goal of this repo is to:

- Organize various research Settings into an inheritance hierarchy (a tree!), with more general, challenging Settings with few assumptions at the top, and more constrained problems at the bottom.
- Provide a mechanism for easily reusing existing solutions (Methods) on new Settings through polymorphism!
- Allow researchers to easily create new, general Methods and quickly gather results on a multitude of Settings, ranging from Supervised to Reinforcement Learning!
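The core idea can be sketched in a few lines of plain Python. This is a toy illustration of the polymorphism mechanism described above, not Sequoia's actual class hierarchy or API: the class names and the `applies_to` helper are made up for this example.

```python
# Toy sketch (hypothetical classes, not Sequoia's actual API): Settings form
# an inheritance hierarchy, and a Method declared for a parent Setting
# automatically applies to every more specific child Setting.

class ContinualSLSetting:
    """General Setting with few assumptions: no task labels available."""
    def __init__(self, dataset: str):
        self.dataset = dataset

class TaskIncrementalSLSetting(ContinualSLSetting):
    """More constrained child Setting: task labels are available."""

class MyMethod:
    # The most general Setting this Method is designed to handle...
    target_setting = ContinualSLSetting

    def applies_to(self, setting: ContinualSLSetting) -> bool:
        # ...which means it also works on any subclass of that Setting,
        # since subclasses only *add* assumptions the Method may ignore.
        return isinstance(setting, self.target_setting)

method = MyMethod()
print(method.applies_to(ContinualSLSetting("mnist")))        # True
print(method.applies_to(TaskIncrementalSLSetting("mnist")))  # True
```

A Method written once against a general parent Setting can thus be evaluated, unchanged, on every more specific Setting below it in the tree.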
Requires python >= 3.7

- Clone the repo:

  ```console
  $ git clone https://www.github.com/lebrice/Sequoia.git
  $ cd Sequoia
  ```

- Optional: create the conda environment (only once):

  ```console
  $ conda env create -f environment.yaml
  $ conda activate sequoia
  ```

- Install the dependencies:

  ```console
  $ pip install -e .
  ```
Install the latest XQuartz app from here: https://www.xquartz.org/releases/index.html

Then run the following commands in the terminal:

```console
mkdir /tmp/.X11-unix
sudo chmod 1777 /tmp/.X11-unix
sudo chown root /tmp/.X11-unix/
```
Setting | RL vs SL | Clear task boundaries? | Task boundaries given? | Task labels at train time? | Task labels at test time? | Stationary context? | Fixed action space? |
---|---|---|---|---|---|---|---|
Continual RL | RL | no | no | no | no | no | no(?) |
Discrete Task-Agnostic RL | RL | yes | yes | no | no | no | no(?) |
Incremental RL | RL | yes | yes | yes | no | no | no(?) |
Task-Incremental RL | RL | yes | yes | yes | yes | no | no(?) |
Traditional RL | RL | yes | yes | yes | no | yes | no(?) |
Multi-Task RL | RL | yes | yes | yes | yes | yes | no(?) |
Continual SL | SL | no | no | no | no | no | no |
Discrete Task-Agnostic SL | SL | yes | no | no | no | no | no |
(Class) Incremental SL | SL | yes | yes | no | no | no | no |
Domain-Incremental SL | SL | yes | yes | yes | no | no | yes |
Task-Incremental SL | SL | yes | yes | yes | yes | no | no |
Traditional SL | SL | yes | yes | yes | no | yes | no |
Multi-Task SL | SL | yes | yes | yes | yes | yes | no |
- Active / Passive: Active Settings are those where the next observation depends on the current action, i.e. actions influence future observations (e.g. Reinforcement Learning). Passive Settings are those where the current actions don't influence the next observations (e.g. Supervised Learning).
- Bold entries in the table mark constant attributes which cannot be changed from their default value.
- \*: The environment changes constantly over time in `ContinualRLSetting`, so there aren't really "tasks" to speak of.
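The active/passive distinction above can be made concrete with a small sketch. The two classes below are hypothetical, not Sequoia's actual environment interfaces; they only illustrate whether the action fed in changes what is observed next.

```python
# Toy sketch of active vs. passive Settings (hypothetical interfaces,
# not Sequoia's actual API).

class ActiveEnvironment:
    """Active (RL-like): the next observation depends on the action."""
    def __init__(self):
        self.state = 0

    def step(self, action: int) -> int:
        self.state += action  # the action changes future observations
        return self.state

class PassiveEnvironment:
    """Passive (SL-like): observations arrive in a fixed stream."""
    def __init__(self, samples):
        self.samples = list(samples)
        self.index = 0

    def step(self, action: int) -> int:
        # the action (e.g. a prediction) has no effect on what comes next
        obs = self.samples[self.index]
        self.index += 1
        return obs

active = ActiveEnvironment()
print(active.step(1), active.step(1))     # 1 2  -> actions shaped the stream
passive = PassiveEnvironment([10, 20])
print(passive.step(1), passive.step(99))  # 10 20 -> actions were ignored
```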
--> (Reminder) First, take a look at the Examples <--
```python
from sequoia.settings import TaskIncrementalSLSetting
from sequoia.methods import BaseMethod

setting = TaskIncrementalSLSetting(dataset="mnist")
method = BaseMethod(max_epochs=1)

results = setting.apply(method)
print(results.summary())
```
```console
sequoia --setting <some_setting> --method <some_method> (arguments)
```
For example:
- Run the BaseMethod on task-incremental MNIST, with one epoch per task, and without wandb:

  ```console
  sequoia --setting task_incremental_sl --dataset mnist --method base --max_epochs 1
  ```
- Run the PPO Method from stable-baselines3 on an incremental RL setting, with the default dataset (CartPole) and 5 tasks:

  ```console
  sequoia --setting incremental_rl --nb_tasks 5 --method ppo --steps_per_task 10_000
  ```
More questions? Please let us know by creating an issue!