Created by Zekun Qi*, Runpei Dong*, Guofan Fan, Zheng Ge, Xiangyu Zhang, Kaisheng Ma, Li Yi
This repository contains the code release for the paper Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining (ICML 2023).
- 🎉 Apr, 2023: ReCon accepted by ICML 2023
- 💥 Feb, 2023: Check out our previous work ACT, which has been accepted by ICLR 2023
PyTorch >= 1.7.0, < 1.11.0; Python >= 3.7; CUDA >= 9.0; GCC >= 4.9; torchvision
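For example, a fresh environment satisfying these constraints could be set up as follows (the environment name and the exact PyTorch/CUDA versions are just one illustrative combination within the supported ranges):

```bash
# Illustrative environment setup; any versions within the ranges above should work
conda create -n recon python=3.8 -y
conda activate recon
# PyTorch 1.10.1 + CUDA 11.1 falls inside the supported ranges
pip install torch==1.10.1+cu111 torchvision==0.11.2+cu111 \
  -f https://download.pytorch.org/whl/cu111/torch_stable.html
```

Then install the remaining dependencies: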
pip install -r requirements.txt
# Chamfer Distance & EMD
cd ./extensions/chamfer_dist
python setup.py install --user
cd ../emd
python setup.py install --user
cd ../..
# PointNet++
pip install "git+https://github.com/erikwijmans/Pointnet2_PyTorch.git#egg=pointnet2_ops&subdirectory=pointnet2_ops_lib"
We use ShapeNet, ScanObjectNN, ModelNet40 and ShapeNetPart in this work. See DATASET.md for details.
| Task | Dataset | Config | Acc. | Checkpoints Download |
| :--- | :--- | :--- | :--- | :--- |
| Pre-training | ShapeNet | pretrain_base.yaml | N.A. | ReCon |
| Classification | ScanObjectNN | finetune_scan_hardest.yaml | 91.26% | PB_T50_RS |
| Classification | ScanObjectNN | finetune_scan_objbg.yaml | 95.35% | OBJ_BG |
| Classification | ScanObjectNN | finetune_scan_objonly.yaml | 93.80% | OBJ_ONLY |
| Classification | ModelNet40 (1k) | finetune_modelnet.yaml | 94.5% | ModelNet_1k |
| Classification | ModelNet40 (8k) | finetune_modelnet_8k.yaml | 94.7% | ModelNet_8k |
| Zero-Shot | ModelNet10 | zeroshot_modelnet10.yaml | 81.6% | ReCon zero-shot |
| Zero-Shot | ModelNet40 | zeroshot_modelnet40.yaml | 66.8% | ReCon zero-shot |
| Linear SVM | ModelNet40 | svm.yaml | 93.4% | ReCon svm |
| Part Segmentation | ShapeNetPart | segmentation | 86.4% mIoU | part seg |
| Task | Dataset | Config | 5w10s (%) | 5w20s (%) | 10w10s (%) | 10w20s (%) | Download |
| :--- | :--- | :--- | :--- | :--- | :--- | :--- | :--- |
| Few-shot learning | ModelNet40 | fewshot.yaml | 97.3 ± 1.9 | 98.9 ± 1.2 | 93.3 ± 3.9 | 95.8 ± 3.0 | ReCon |
The checkpoints and logs have been released on Google Drive. Use the voting strategy during classification testing to reproduce the performance reported in the paper. For classification downstream tasks, we randomly select 8 seeds and keep the best checkpoint.
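As an illustration of that seed-selection protocol, the sweep below fine-tunes with 8 different seeds and keeps the best run. It assumes main.py accepts a --seed flag, as in the Point-MAE codebase this repo builds on, so treat it as a sketch rather than a documented interface:

```bash
# Hypothetical 8-seed sweep for one classification task; --seed is an assumption
for seed in 0 1 2 3 4 5 6 7; do
  CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/full/finetune_scan_hardest.yaml \
    --finetune_model --seed ${seed} \
    --exp_name hardest_seed${seed} --ckpts <path/to/pre-trained/model>
done
```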
To pre-train with the default configuration, run the script:
sh scripts/pretrain.sh <GPU> <exp_name>
If you want to try different models, masking ratios, etc., first create a new config file and pass its path to --config (see the sketch after the command below).
CUDA_VISIBLE_DEVICES=<GPU> python main.py --config <config_path> --exp_name <exp_name>
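For instance, a new config could be derived from the default pre-training config like this. The destination path and the edited field are our own illustrative choices, not a prescribed layout:

```bash
# Illustrative: copy the default pre-training config, tweak it, then pass it to --config
cp <path/to/pretrain_base.yaml> cfgs/my_pretrain.yaml   # destination path is an assumption
# edit e.g. the mask ratio or model settings in cfgs/my_pretrain.yaml, then run:
CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/my_pretrain.yaml --exp_name my_pretrain
```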
To fine-tune with the default configuration, run the script:
bash scripts/cls.sh <GPU> <exp_name> <path/to/pre-trained/model>
Or, you can use the commands directly.
To fine-tune on ScanObjectNN, run:
CUDA_VISIBLE_DEVICES=<GPUs> python main.py --config cfgs/full/finetune_scan_hardest.yaml \
--finetune_model --exp_name <exp_name> --ckpts <path/to/pre-trained/model>
To fine-tune on ModelNet40, run:
CUDA_VISIBLE_DEVICES=<GPUs> python main.py --config cfgs/full/finetune_modelnet.yaml \
--finetune_model --exp_name <exp_name> --ckpts <path/to/pre-trained/model>
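A concrete invocation might look like this; the GPU id, experiment name, and checkpoint path are placeholders of our choosing:

```bash
# Example fine-tuning run; the checkpoint path is illustrative
CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/full/finetune_modelnet.yaml \
  --finetune_model --exp_name modelnet_ft --ckpts ckpts/recon_pretrain.pth
```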
To test with voting using the default configuration, run the script:
bash scripts/test.sh <GPU> <exp_name> <path/to/best/fine-tuned/model>
or:
CUDA_VISIBLE_DEVICES=<GPUs> python main.py --test --config cfgs/full/finetune_modelnet.yaml \
--exp_name <output_file_name> --ckpts <path/to/best/fine-tuned/model>
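For example, pointing the test command at a fine-tuned checkpoint. The experiments/... path below mirrors the default output layout of Point-MAE-style codebases and is an assumption:

```bash
# Example voting-based evaluation; the checkpoint path is an assumption
CUDA_VISIBLE_DEVICES=0 python main.py --test --config cfgs/full/finetune_modelnet.yaml \
  --exp_name modelnet_vote --ckpts experiments/finetune_modelnet/ckpt-best.pth
```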
For few-shot learning with the default configuration, run the script:
sh scripts/fewshot.sh <GPU> <exp_name> <path/to/pre-trained/model> <way> <shot> <fold>
or:
CUDA_VISIBLE_DEVICES=<GPUs> python main.py --config cfgs/full/fewshot.yaml --finetune_model \
--ckpts <path/to/pre-trained/model> --exp_name <exp_name> --way <5 or 10> --shot <10 or 20> --fold <0-9>
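The few-shot numbers in the table above are mean ± std across the 10 folds, so a full sweep for one (way, shot) setting could look like this (the loop structure and experiment names are illustrative; the flags are the ones documented above):

```bash
# Illustrative sweep over all 10 folds of the 5-way 10-shot benchmark
for fold in $(seq 0 9); do
  CUDA_VISIBLE_DEVICES=0 python main.py --config cfgs/full/fewshot.yaml --finetune_model \
    --ckpts <path/to/pre-trained/model> --exp_name fewshot_5w10s_f${fold} \
    --way 5 --shot 10 --fold ${fold}
done
```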
For zero-shot classification with the default configuration, run the script:
sh scripts/zeroshot.sh <GPU> <exp_name>
For part segmentation on ShapeNetPart, run:
cd segmentation
bash seg.sh <GPU> <exp_name> <path/to/pre-trained/model>
or:
cd segmentation
python main.py --ckpts <path/to/pre-trained/model> --log_dir <path/to/log/dir> --learning_rate 0.0001 --epoch 300
To test part segmentation on ShapeNetPart, run:
cd segmentation
bash test.sh <GPU> <exp_name> <path/to/best/fine-tuned/model>
For linear SVM evaluation on ModelNet40, run:
sh scripts/svm.sh <GPU> <exp_name> <path/to/pre-trained/model>
We use the PointVisualizaiton repo to render point cloud images, including rendering with specified colors and rendering attention distributions.
If you have any questions related to the code or the paper, feel free to email Zekun ([email protected]) or Runpei ([email protected]).
ReCon is released under the MIT License. See the LICENSE file for more details. In addition, the licensing information for the pointnet2 modules is available here.
This codebase is built upon Point-MAE, Point-BERT, CLIP, Pointnet2_PyTorch, and ACT.
If you find our work useful in your research, please consider citing:
@inproceedings{qi2023recon,
  title={Contrast with Reconstruct: Contrastive 3D Representation Learning Guided by Generative Pretraining},
  author={Qi, Zekun and Dong, Runpei and Fan, Guofan and Ge, Zheng and Zhang, Xiangyu and Ma, Kaisheng and Yi, Li},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2023}
}
and the closely related work ACT:
@inproceedings{dong2023act,
  title={Autoencoders as Cross-Modal Teachers: Can Pretrained 2D Image Transformers Help 3D Representation Learning?},
  author={Dong, Runpei and Qi, Zekun and Zhang, Linfeng and Zhang, Junbo and Sun, Jianjian and Ge, Zheng and Yi, Li and Ma, Kaisheng},
  booktitle={The Eleventh International Conference on Learning Representations (ICLR)},
  year={2023},
  url={https://openreview.net/forum?id=8Oun8ZUVe8N}
}