Code for the papers:
- Hierarchical Imitation Learning with Vector Quantized Models (ICML 2023), arXiv
- Hybrid Search for Efficient Planning with Completeness Guarantees (NeurIPS 2023), arXiv
Execute the following commands to create the environment:
```shell
conda create -n hips python=3.9
conda activate hips
pip3 install -r requirements.txt
```
Perform the following steps to train the models. Steps 2a-2d can be done in parallel. Steps 3-6 depend on 2a and 2b.
1. Download the datasets from Google Drive
2. Perform the following steps (in parallel):
    - a. Train the HIPS detector and the subgoal-conditioned low-level policy with `reinforce.py`
    - b. Train the continuous VQVAE with `vqvae.py` using the argument `--continuous`
    - c. Train the dynamics model with `train_model.py` (optional; you can use the environment dynamics as an alternative)
    - d. Create the distance dataset with `distance_dataset_creator.py` and train the distance function with `train_distance_function.py`
3. Create the discrete VQVAE dataset with `vqvae_dataset_creator.py`
4. Train the discrete VQVAE with `vqvae.py`
5. Create a dataset for training the prior and the low-level BC policy with `prior_dataset_creator.py`
6. Train the prior and the low-level BC policy with `prior.py`
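The steps above can be sketched as a single run, for example as below. The script names come from this README, but any additional flags (environments, hyperparameters) are assumptions — consult each script's `--help` before running.

```shell
# Hypothetical end-to-end training run for HIPS.
# Steps 2a-2d are independent and can run in parallel.
python reinforce.py                  # 2a: HIPS detector + subgoal-conditioned low-level policy
python vqvae.py --continuous         # 2b: continuous VQVAE
python train_model.py                # 2c: dynamics model (optional; env dynamics also work)
python distance_dataset_creator.py   # 2d: distance dataset ...
python train_distance_function.py    #     ... and distance function

# Steps 3-6 depend on 2a and 2b and must run in this order.
python vqvae_dataset_creator.py      # 3: discrete VQVAE dataset
python vqvae.py                      # 4: discrete VQVAE
python prior_dataset_creator.py      # 5: dataset for prior + low-level BC policy
python prior.py                      # 6: prior + low-level BC policy
```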
The main script for evaluation is `search.py`. The command to use is:
```shell
python search.py --env <ENV> --policy <POLICY> --vqvae <VQVAE> --prior <PRIOR> --heuristic <DIST_FUNC> --jobs <N> \
    --epsilon <E> --hybrid [--K <K>] [--ada] [--gbfs] [--astar] [--step_cost] [--model <MODEL>]
```
For evaluating
- with $\varepsilon \to 0$, use `--ada`
- baseline HIPS (no hybrid search), do not use `--hybrid`
- with GBFS or A*, use `--gbfs` or `--astar`, respectively. Then, a prior is not needed, but the value of `K` must be specified. When A* is used, you should also use `--step_cost`
- with a model, include `--model <MODEL>`
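To illustrate the idea behind hybrid search, here is a minimal, self-contained sketch: a greedy best-first search that expands both single-step (low-level) successors and longer "subgoal" jumps on a toy grid. The toy environment, the subgoal proposer, and all function names are hypothetical stand-ins, not the repo's implementation; they only show why keeping low-level successors in every expansion preserves completeness even when the subgoal proposals are poor.

```python
import heapq

def hybrid_gbfs(start, goal, low_level_successors, propose_subgoals, heuristic):
    """Greedy best-first search over states, expanding both low-level
    successors and proposed subgoals. Because every expansion includes
    the low-level actions, the search remains complete even if the
    subgoal proposer never suggests anything useful."""
    frontier = [(heuristic(start, goal), start)]
    parent = {start: None}
    while frontier:
        _, s = heapq.heappop(frontier)
        if s == goal:
            # Reconstruct the path of visited states back to the start.
            path = []
            while s is not None:
                path.append(s)
                s = parent[s]
            return path[::-1]
        for nxt in list(low_level_successors(s)) + list(propose_subgoals(s, goal)):
            if nxt not in parent:
                parent[nxt] = s
                heapq.heappush(frontier, (heuristic(nxt, goal), nxt))
    return None  # goal unreachable

# Toy 10x10 grid: low-level moves are unit steps.
def low_level_successors(s):
    x, y = s
    return [(x + dx, y + dy) for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1))
            if 0 <= x + dx < 10 and 0 <= y + dy < 10]

# Hypothetical subgoal proposer (stand-in for the discrete VQVAE prior):
# jump up to k cells toward the goal along one axis.
def propose_subgoals(s, goal, k=3):
    (x, y), (gx, gy) = s, goal
    step = lambda a, b: a + max(-k, min(k, b - a))
    return [(step(x, gx), y), (x, step(y, gy))]

def manhattan(s, goal):
    return abs(s[0] - goal[0]) + abs(s[1] - goal[1])

path = hybrid_gbfs((0, 0), (9, 9), low_level_successors, propose_subgoals, manhattan)
print("reached goal in", len(path) - 1, "moves")
```

Swapping the greedy priority `heuristic(nxt, goal)` for a cost-plus-heuristic priority would turn this into the A* variant, which is why the README pairs `--astar` with `--step_cost`.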