arlo0o / stereoscene

Official PyTorch Implementation of Bridging Stereo Geometry and BEV Representation with Reliable Mutual Interaction for Semantic Scene Completion (IJCAI 2024)

Home Page: https://github.com/Arlo0o/StereoScene

License: Apache License 2.0

Shell 0.15% Python 99.85%
3d-scene-understanding artificial-intelligence autonomous-driving autonomous-vehicles computer-vision deep-learning machine-learning pytorch semantic-kitti semantic-scene-completion

stereoscene's Introduction

[IJCAI 2024] Bridging Stereo Geometry and BEV Representation with Reliable Mutual Interaction for Semantic Scene Completion

Demo:

Benchmark Results

News

  • [2024/04]: Our new work has been accepted to ECCV 2024; please check out HTCL.
  • [2024/04]: The paper has been accepted to IJCAI 2024.
  • [2023/03]: The paper is available on arXiv.
  • [2023/03]: Demo and code released.

Quick Installation on A100

If you are using an NVIDIA A100, you can use our pre-packed environment with the following steps:

a. Download the pre-packed environment archive: occA100.

b. Unpack the environment into the directory occA100:

cd /opt/conda/envs/
mkdir -p occA100
tar -xzf occA100.tar.gz -C occA100 

c. Activate the environment. This adds occA100/bin to your path.

source occA100/bin/activate

You can also run the environment's Python executable directly, without activating the environment or fixing the prefixes:

./occA100/bin/python

Step-by-step Installation Instructions

These steps follow the MMDetection3D installation guide: https://mmdetection3d.readthedocs.io/en/latest/getting_started.html#installation

a. Create a conda virtual environment and activate it. Python versions newer than 3.7 may not be supported, because installing open3d-python on Python > 3.7 causes errors.

conda create -n occupancy python=3.7 -y
conda activate occupancy

b. Install PyTorch and torchvision following the official instructions.

conda install pytorch==1.10.1 torchvision==0.11.2 torchaudio==0.10.1 cudatoolkit=11.3 -c pytorch -c conda-forge

(Optional) Install gcc>=5 in the conda environment. This step was not needed in our setup.

conda install -c omgarcia gcc-6 # gcc-6.2

c. Install mmcv-full.

pip install mmcv-full==1.4.0

d. Install mmdet and mmseg.

pip install mmdet==2.14.0
pip install mmsegmentation==0.14.1

e. Install mmdet3d from source code.

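# If the mmdetection3d directory is not already present, clone it first (assumption):
# git clone https://github.com/open-mmlab/mmdetection3d.git
cd mmdetection3d
git checkout v0.17.1 # Other versions may not be compatible.
python setup.py install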

f. Install other dependencies.

pip install timm
pip install open3d-python
pip install PyMCubes
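
After finishing the steps above, it can help to confirm that the installed versions match the ones pinned in this README. The following is a minimal sketch, not part of the repository: the helper names `installed_version` and `find_mismatches` are hypothetical, and the distribution names in `PINS` (in particular `mmdet3d` reporting version 0.17.1 after a source install) are assumptions.

```python
"""Compare installed package versions against the pins used in this README."""

# Versions taken from the installation steps above.
# The distribution names are assumptions about how each package registers itself.
PINS = {
    "torch": "1.10.1",
    "torchvision": "0.11.2",
    "torchaudio": "0.10.1",
    "mmcv-full": "1.4.0",
    "mmdet": "2.14.0",
    "mmsegmentation": "0.14.1",
    "mmdet3d": "0.17.1",
}


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        from importlib.metadata import PackageNotFoundError, version  # Python >= 3.8
    except ImportError:
        import pkg_resources  # ships with setuptools; covers Python 3.7

        try:
            return pkg_resources.get_distribution(dist_name).version
        except pkg_resources.DistributionNotFound:
            return None
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


def find_mismatches(pins, lookup=installed_version):
    """Return {name: (expected, found)} for every pin that is not satisfied."""
    mismatches = {}
    for name, expected in pins.items():
        found = lookup(name)
        # Ignore local build tags such as "+cu113" when comparing.
        base = found.split("+")[0] if found else None
        if base != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in sorted(find_mismatches(PINS).items()):
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

Running the script inside the `occupancy` environment prints one line per package whose version differs from the pin, and nothing if everything matches.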

Known problems

AttributeError: module 'distutils' has no attribute 'version'

This error is caused by an incompatible version of setuptools. Try pinning it:

pip install setuptools==59.5.0

Prepare Data

  • a. Download the following data:

    • The odometry calibration files (Download odometry data set (calibration files)) and the RGB images (Download odometry data set (color)) from the KITTI Odometry website. Extract them to the folder data/occupancy/semanticKITTI/RGB/.
    • The Velodyne point clouds (Download data_odometry_velodyne) and the SemanticKITTI label data (Download data_odometry_labels), which provide sparse LiDAR supervision during training. Extract them to the folders data/lidar/velodyne/ and data/lidar/lidarseg/, respectively.
  • b. Prepare the KITTI voxel labels (see process_kitti.sh for details):

bash process_kitti.sh
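
Before launching preprocessing or training, a quick check that the folders named above exist can save a failed run. This is a minimal sketch under the assumption that the paths are relative to the repository root; `missing_data_dirs` is a hypothetical helper, not something the repository provides, and it only checks that the directories exist, not their contents.

```python
from pathlib import Path

# Folders named in the data preparation steps above,
# assumed to be relative to the repository root.
EXPECTED_DIRS = (
    "data/occupancy/semanticKITTI/RGB",
    "data/lidar/velodyne",
    "data/lidar/lidarseg",
)


def missing_data_dirs(repo_root="."):
    """Return the expected data folders that do not exist under repo_root."""
    root = Path(repo_root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_data_dirs()
    if missing:
        print("Missing data folders:")
        for d in missing:
            print(f"  {d}")
    else:
        print("All expected data folders are present.")
```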

Pretrained Model

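Download the model pretrained on SemanticKITTI and the EfficientNet-B7 pretrained model, and put them in the pretrain/ folder at the repository root (the evaluation commands below load pretrain/pretrain_stereoscene.pth).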

Training & Evaluation

Single GPU

  • Train with single GPU:
export PYTHONPATH="."  
python tools/train.py   \
            projects/configs/occupancy/semantickitti/stereoscene.py
  • Evaluate with a single GPU:
export PYTHONPATH="."  
python tools/test.py    \
            projects/configs/occupancy/semantickitti/stereoscene.py \
            pretrain/pretrain_stereoscene.pth  1

Multiple GPUs

  • Train with n GPUs:
bash run.sh  \
        projects/configs/occupancy/semantickitti/stereoscene.py n
  • Evaluate with n GPUs:
 bash tools/dist_test.sh  \
            projects/configs/occupancy/semantickitti/stereoscene.py \
            pretrain/pretrain_stereoscene.pth  n
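
The four commands above differ only in the entry point and the GPU count, so they can be wrapped in a small launcher. The sketch below is illustrative and not part of the repository: `build_command` and `run` are hypothetical helpers that reproduce the command lines exactly as listed in this section, and setting `PYTHONPATH` for the multi-GPU scripts as well is an assumption.

```python
import os
import subprocess

# Paths as used in the commands above.
CONFIG = "projects/configs/occupancy/semantickitti/stereoscene.py"
CHECKPOINT = "pretrain/pretrain_stereoscene.pth"


def build_command(mode, num_gpus=1, config=CONFIG, checkpoint=CHECKPOINT):
    """Return the argument list for training or evaluation, as listed in this README."""
    if mode not in ("train", "test"):
        raise ValueError(f"mode must be 'train' or 'test', got {mode!r}")
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if mode == "train":
        if num_gpus == 1:
            return ["python", "tools/train.py", config]
        return ["bash", "run.sh", config, str(num_gpus)]
    if num_gpus == 1:
        # The trailing "1" mirrors the single-GPU evaluation command above.
        return ["python", "tools/test.py", config, checkpoint, "1"]
    return ["bash", "tools/dist_test.sh", config, checkpoint, str(num_gpus)]


def run(mode, num_gpus=1):
    """Run the command from the repository root with PYTHONPATH set to '.'."""
    # Assumption: exporting PYTHONPATH is harmless for the multi-GPU scripts too.
    env = dict(os.environ, PYTHONPATH=".")
    return subprocess.run(build_command(mode, num_gpus), env=env, check=True)
```

For example, `run("train", num_gpus=4)` would invoke `bash run.sh projects/configs/occupancy/semantickitti/stereoscene.py 4` from the current directory.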

License

This repository is released under the Apache 2.0 license as found in the LICENSE file.

Acknowledgements

Many thanks to these excellent open source projects:

Citation

If you find our paper and code useful for your research, please consider citing:

@article{li2023bridging,
  title={Bridging stereo geometry and BEV representation with reliable mutual interaction for semantic scene completion},
  author={Li, Bohan and Sun, Yasheng and Liang, Zhujin and Du, Dalong and Zhang, Zhuanghui and Wang, Xiaofeng and Wang, Yunnan and Jin, Xin and Zeng, Wenjun},
  journal={arXiv preprint arXiv:2303.13959},
  year={2023}
}
