yashbhalgat / contrastive-lift

[NeurIPS 2023 Spotlight] Code for "Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion"

Home Page: https://www.robots.ox.ac.uk/~vgg/research/contrastive-lift/


3d-reconstruction computer-vision instance-segmentation multi-view-learning multi-view-stereo nerf pytorch rendering neurips neurips-2023

contrastive-lift's Introduction

Contrastive-Lift (NeurIPS 2023 Spotlight)

[Project page | Paper]

Yash Bhalgat, Iro Laina, João F. Henriques, Andrew Zisserman, Andrea Vedaldi

TL;DR: Our paper presents a novel "slow-fast" contrastive fusion method to lift 2D predictions to 3D for scalable instance segmentation, achieving significant improvements without requiring an upper bound on the number of objects in the scene.

Data and Pretrained checkpoints

You can download the Messy Rooms dataset from here. For all other datasets, refer to the instructions provided in Panoptic-Lifting.

NOTE: In this codebase, the term "MOS" stands for "Many Object Scenes", which was the original name of the "Messy Rooms" dataset as referenced in the paper.

You can download the pretrained checkpoints from here.

Inference and Evaluation

Download the pretrained checkpoints and place them in the pretrained_checkpoints folder. Then, run the following commands to evaluate the pretrained models:

python3 inference/render_panopli.py --ckpt_path pretrained_checkpoints/<SCENE NAME>/checkpoints/<CKPT NAME>.ckpt --cached_centroids_path pretrained_checkpoints/<SCENE NAME>/checkpoints/all_centroids.pkl

This renders the outputs to the runs/<experiment> folder. To calculate the metrics, run the following command:

python inference/evaluate.py --root_path ./data/<SCENE DATA PATH> --exp_path runs/<experiment>
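The two steps above can be chained from a small Python driver. This is only a sketch: the script paths and flags are the ones quoted above, while `build_commands`, `run_all`, and every scene, checkpoint, and experiment name passed to them are hypothetical placeholders. The experiment folder name is whatever the render step produced under runs/, so it has to be supplied by hand.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(scene, ckpt_name, data_path, experiment,
                   ckpt_root="pretrained_checkpoints"):
    """Return the render and evaluate commands as argv lists.

    All arguments are placeholders the caller fills in; nothing here is
    read from the repository itself.
    """
    ckpt_dir = Path(ckpt_root) / scene / "checkpoints"
    render = [
        sys.executable, "inference/render_panopli.py",
        "--ckpt_path", (ckpt_dir / f"{ckpt_name}.ckpt").as_posix(),
        "--cached_centroids_path", (ckpt_dir / "all_centroids.pkl").as_posix(),
    ]
    evaluate = [
        sys.executable, "inference/evaluate.py",
        "--root_path", (Path("data") / data_path).as_posix(),
        "--exp_path", (Path("runs") / experiment).as_posix(),
    ]
    return render, evaluate


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing step.
            subprocess.run(cmd, check=True)
```

Calling `run_all(build_commands(...))` with the default `dry_run=True` only prints the commands, which is a cheap way to check the paths before starting a long render.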

Citation

If you find this work useful in your research, please cite our paper:

@inproceedings{
  bhalgat2023contrastive,
  title={Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion},
  author={Bhalgat, Yash and Laina, Iro and Henriques, Jo{\~a}o F and Zisserman, Andrew and Vedaldi, Andrea},
  booktitle={Thirty-seventh Conference on Neural Information Processing Systems},
  year={2023},
  url={https://openreview.net/forum?id=bbbbbov4Xu}
}

Thanks

This code is based on the Panoptic-Lifting and TensoRF codebases. We thank the authors for releasing their code.


contrastive-lift's Issues

Pre-trained models on other datasets

Dear author,

Thanks for your previous help and for providing the pretrained models on the ScanNet dataset. I would really appreciate it if you could also provide the pretrained models (Contrastive-Lift and Panoptic-Lifting) for the other datasets (especially the MOS dataset) if possible, since I have limited compute.
Many thanks in advance!!

Bests,
Runsong

Performance on individual scenes

Hi, I might have missed it, but where can I find the per-scene performance for the Replica and ScanNet scenes? Thanks!

Poor performance on duplicate objects

Hello! Really good work.

I was benchmarking your method on Replica (I set the batch size to 4096 and the segment loss to 0.75, as they did in Panopli; I also increased max instances to 25, which is also used in Panopli).

It seems the PQ is similar (perhaps due to semantics performing well?), but CL is often able to delineate duplicate adjacent objects. For example, here is the result from training on the replica-vmap office3 split (slow-fast, above) compared to Panopli (linear assignment, below). Do you have any intuition why this is the case? Thank you!

CL:

PL:

When will the code be available?

Dear author,

Thanks for your great work.
I want to test your method in my experiments, so I would like to know when the code will be available. Also, does the proposed method take a similar optimization time to Panoptic-Lifting on ScanNet? For reference, I found that Panoptic-Lifting takes 36 hours on my single RTX 3090 for one ScanNet scene (042302).
Looking forward to your reply. Many thanks in advance.

Bests,
Runsong

Issue with training

Hello, thank you for this great work!

When running on an in-the-wild (itw) dataset, this line:

loss_semantics = (self.loss_semantics(output_semantics, probs) * confs).mean()

gives the error:

RuntimeError: 0D or 1D target tensor expected, multi-target not supported
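The message itself indicates that the loss received a target with more than one dimension where it expected a 0D or 1D tensor of class indices. Assuming `probs` holds one row of per-class probabilities per sample (an assumption, since the definition of `self.loss_semantics` is not shown here), the quantity the line appears to compute is a confidence-weighted cross-entropy against soft targets. The plain-Python sketch below writes that computation out explicitly; all function names are illustrative and not taken from the codebase.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one list of logits."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_sum for x in logits]


def soft_cross_entropy(logits, probs):
    """Cross-entropy between predicted logits and a target distribution."""
    if len(logits) != len(probs):
        # A class-count mismatch is one way such a loss can go wrong.
        raise ValueError("logits and probs must have the same number of classes")
    return -sum(p * lp for p, lp in zip(probs, log_softmax(logits)))


def weighted_semantic_loss(batch_logits, batch_probs, confs):
    """Mean over samples of confidence * soft-target cross-entropy."""
    losses = [
        c * soft_cross_entropy(l, p)
        for l, p, c in zip(batch_logits, batch_probs, confs)
    ]
    return sum(losses) / len(losses)
```

If both tensors have shape (N, C), the same per-sample term can be written in PyTorch as `-(probs * torch.log_softmax(output_semantics, dim=-1)).sum(-1)`, which does not depend on how the loss module interprets a 2D target. Whether this matches the intended behaviour of `self.loss_semantics` would need to be confirmed against the repository.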

Also, requirements.txt has duplicate packages and, I believe, is incomplete.
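Duplicate entries in a requirements file can be listed with a short script. The sketch below is generic and makes no claim about what this repository's requirements.txt actually contains; `duplicate_requirements` is a hypothetical helper, and it only compares normalized package names, not versions.

```python
import re
from collections import Counter


def duplicate_requirements(lines):
    """Return the sorted package names that appear more than once."""
    names = []
    for line in lines:
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        # Skip blanks and pip options such as "-r other.txt" or "-e .".
        if not line or line.startswith("-"):
            continue
        # The name ends at the first version, extras, marker, or URL character.
        name = re.split(r"[<>=!~;\[\s@]", line, maxsplit=1)[0]
        # Treat "-", "_" and "." as equivalent and ignore case.
        names.append(re.sub(r"[-_.]+", "-", name).lower())
    counts = Counter(names)
    return sorted(n for n, c in counts.items() if c > 1)
```

A typical call would be `duplicate_requirements(Path("requirements.txt").read_text().splitlines())` after importing `Path` from `pathlib`.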

missing mesh.ply while running preprocess_replica.py

I downloaded the Replica dataset provided by Semantic_Nerf. When I ran preprocess_replica.py, it failed because mesh.ply is missing; that file is not included in the downloaded dataset.

Can you please share where I can download the complete processed dataset, or provide the steps to create it?
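Before running the preprocessing, a quick check can show which scene folders lack the mesh. The sketch below assumes each scene is a direct subfolder of the dataset root and that the mesh, when present, sits somewhere inside it; that layout is an assumption and not taken from preprocess_replica.py, and `scenes_missing_file` is a hypothetical helper.

```python
from pathlib import Path


def scenes_missing_file(root, filename="mesh.ply"):
    """Return names of direct subfolders of root that contain no such file."""
    root = Path(root)
    return sorted(
        d.name
        for d in root.iterdir()
        # rglob searches the scene folder recursively for the file name.
        if d.is_dir() and not any(d.rglob(filename))
    )
```

An empty result means every scene folder contains a mesh.ply somewhere; it does not confirm that the file is in the location the preprocessing script expects.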

where is the training code?

Hi,

Do you provide training instructions?