yangcaoai / coda_neurips2023

Official code for NeurIPS2023 paper: CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection

Home Page: https://yangcaoai.github.io/publications/CoDA.html

License: MIT License

Languages: Python 32.48%, Jupyter Notebook 66.43%, Shell 0.41%, C 0.03%, C++ 0.21%, Cuda 0.31%, Cython 0.12%
Topics: 3d-detection, 3d-vision, multi-modality, open-vocabulary, artificial-intelligence, deep-learning, transformer, detection

coda_neurips2023's Introduction

📖 CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection (NeurIPS2023)

🔥Please star CoDA ⭐ and share it. Thanks🔥

[Paper]   [Project Page]

Yang Cao, Yihan Zeng, Hang Xu, Dan Xu
The Hong Kong University of Science and Technology
Huawei Noah's Ark Lab

🚩 Updates

☑ Our extended work CoDAv2 is released; check it out on arXiv!

☑ The latest papers and code on open-vocabulary perception are collected here.

☑ All the code, data, and pretrained models have been released!

☑ The training and testing code has been released.

☑ The pretrained models have been released.

☑ The OV-setting SUN-RGBD datasets have been released.

☑ The OV-setting ScanNet datasets have been released.

☑ The paper's LaTeX source is available at https://scienhub.com/Yang/CoDA.

Framework

Samples

Installation

Our code was developed with PyTorch 1.8.1, torchvision 0.9.1, CUDA 10.1, and Python 3.7. It may work with other versions.

Please also install the following Python dependencies:

matplotlib
opencv-python
plyfile
'trimesh>=2.35.39,<2.35.40'
'networkx>=2.2,<2.3'
scipy
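
The pinned ranges above (for example 'trimesh>=2.35.39,<2.35.40') can be checked against an installed version before building the extensions. The following is a minimal Python sketch, not part of the CoDA repository: the helper names `parse_version` and `satisfies` are hypothetical, and the parser only understands plain numeric versions with the operators `>=`, `<=`, `==`, `>`, and `<`.

```python
import re


def parse_version(text):
    """Turn '2.35.39' into (2, 35, 39). Hypothetical helper.

    Anything after a '+' (such as a local build tag) is dropped.
    """
    return tuple(int(part) for part in re.findall(r"\d+", text.split("+")[0]))


def _pad(a, b):
    """Right-pad the shorter tuple with zeros so (2, 2) compares equal to (2, 2, 0)."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))


def satisfies(installed, spec):
    """Return True if `installed` meets every comma-separated clause in `spec`."""
    ops = {
        ">=": lambda a, b: a >= b,
        "<=": lambda a, b: a <= b,
        "==": lambda a, b: a == b,
        ">": lambda a, b: a > b,
        "<": lambda a, b: a < b,
    }
    for clause in spec.split(","):
        match = re.match(r"\s*(>=|<=|==|>|<)\s*([\d.]+)\s*$", clause)
        if match is None:
            raise ValueError("unsupported version clause: %r" % clause)
        op, wanted = match.groups()
        have, want = _pad(parse_version(installed), parse_version(wanted))
        if not ops[op](have, want):
            return False
    return True
```

For instance, assuming the installed package exposes `__version__`, `satisfies(trimesh.__version__, ">=2.35.39,<2.35.40")` would report whether the installed trimesh falls inside the pinned range.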

Please install the pointnet2 layers by running:

cd third_party/pointnet2 && python setup.py install

Please install a Cythonized implementation of gIOU for faster training:

conda install cython
cd utils && python cython_compile.py build_ext --inplace

Dataset preparation

To support the open-vocabulary (OV) setting, we reorganized the original ScanNet and SUN RGB-D datasets and adopted annotations covering more categories. Please download the OV-setting datasets we provide directly here: OV SUN RGB-D and OV ScanNet.

Then run the following on the downloaded *.tar files:

bash data_preparation.sh

Evaluation

Download the pretrained models here. Then run:

bash test_release_models.sh

Training

bash scripts/coda_sunrgbd_stage1.sh
bash scripts/coda_sunrgbd_stage2.sh
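
The script names suggest that stage 2 builds on the output of stage 1, so it is reasonable to stop if the first script fails. A minimal Python sketch of that sequencing follows. This is an assumption about how the two scripts relate, not something the repository documents, and `run_stages` and `STAGES` are hypothetical names.

```python
import subprocess

# The two commands listed above, in the order they appear.
STAGES = [
    ["bash", "scripts/coda_sunrgbd_stage1.sh"],
    ["bash", "scripts/coda_sunrgbd_stage2.sh"],
]


def run_stages(stages, runner=subprocess.run):
    """Run each command in order and stop at the first non-zero exit code.

    Hypothetical helper. `runner` is injectable so the logic can be
    exercised without launching real training jobs.
    Returns the list of commands that finished successfully.
    """
    completed = []
    for command in stages:
        result = runner(command)
        if result.returncode != 0:
            break  # assumed: later stages depend on earlier ones
        completed.append(command)
    return completed
```

Calling `run_stages(STAGES)` from the repository root would launch stage 1 and start stage 2 only if stage 1 exits with code 0.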

Running Samples

bash run_samples.sh

📜 BibTeX

If CoDA is helpful, please cite:

@inproceedings{cao2023coda,
  title={CoDA: Collaborative Novel Box Discovery and Cross-modal Alignment for Open-vocabulary 3D Object Detection},
  author={Cao, Yang and Zeng, Yihan and Xu, Hang and Xu, Dan},
  booktitle={NeurIPS},
  year={2023}
}
@article{cao2024collaborative,
  title={Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection},
  author={Yang Cao and Yihan Zeng and Hang Xu and Dan Xu},
  journal={arXiv preprint arXiv:2406.00830},
  year={2024}
}

📧 Contact

If you have any questions or would like to collaborate (for research or commercial purposes), please email [email protected].

📜 Acknowledgement

CoDA is inspired by CLIP and 3DETR. We appreciate their great code.

coda_neurips2023's People

Contributors

eltociear, yangcaoai

coda_neurips2023's Issues

Training Error

Dear Authors,

When I train this code, there is an error.

(Attached screenshot: "Capture")

Training Error

Dear Authors,

When I train this code, there is still an error:

(Attached screenshot: "WeChat image_20240620110059")

Code Release

Thank you for your excellent work, it is so cool! I wonder if you will release the code? It is important for me. Waiting for your reply!

Training Time

Dear Authors,

How long does your method need to train?

About Training

Dear Authors,

I have downloaded the data you provided, but I cannot run your code. There are multiple errors:

(Attached screenshot: "Capture")

Data downloading

Thanks for your contribution and this nice work! I was setting up the repo but failed to download the large data files (~300GB) from SharePoint. Would you mind uploading them somewhere else so that the files can be downloaded from the command line? Thanks!

About categories

Hi,
Thanks for your great work! It seems the authors are busy and the code release may come later. I wonder if you could share the names of the base and novel categories on ScanNet and SUN-RGBD; that would make it easier for others to compare with CoDA. Thanks a lot!

File Error

Dear Authors,

When I run this code, there exists an error:

FileNotFoundError: [Errno 2] No such file or directory: 'Data/sunrgb_d/sunrgbd_v1_revised_0415/sunrgbd_pc_bbox_votes_50k_v1_all_classes_revised_0415_minival'
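
A missing-directory error like the one above can be caught before a long job starts by checking the expected dataset layout first. The following is a minimal sketch, assuming the path in the error message is relative to the working directory. `missing_paths` and `EXPECTED` are hypothetical names, not part of the repository.

```python
import os

# Path copied from the error message above; other required paths are not
# listed in this report, so none are guessed here.
EXPECTED = [
    "Data/sunrgb_d/sunrgbd_v1_revised_0415/"
    "sunrgbd_pc_bbox_votes_50k_v1_all_classes_revised_0415_minival",
]


def missing_paths(root, relative_paths):
    """Return the entries of `relative_paths` that do not exist under `root`."""
    return [
        rel for rel in relative_paths
        if not os.path.exists(os.path.join(root, rel))
    ]
```

Running `missing_paths(".", EXPECTED)` from the directory where training is launched returns an empty list when the folder is in place, and the missing entry otherwise.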
