dm-nerf's Introduction

DM-NeRF: 3D Scene Geometry Decomposition and Manipulation from 2D Images

Bing Wang, Lu Chen, Bo Yang*
Paper | Video | DM-SR

The architecture of our proposed DM-NeRF. Given a 3D point $\boldsymbol{p}$, we learn an object code through a series of loss functions using both 2D and 3D supervision signals.

1. Decomposition and Reconstruction:

2. Decomposition and Rendering:

3. Manipulation:

4. Installation

DM-NeRF uses a Conda environment that makes it easy to install all dependencies.

  1. Create the DM-NeRF Conda environment (Python 3.7) with miniconda.
conda create --name DM-NeRF python=3.7
conda activate DM-NeRF
  2. Install all dependencies by running:
pip install -r requirements.txt

4.1 Datasets

In this paper, we consider the following three different datasets:

(1) DM-SR

To the best of our knowledge, there is no existing 3D scene dataset suitable for quantitative evaluation of geometry manipulation. Therefore, we create a synthetic dataset, called DM-SR, with 8 different types of complex indoor rooms. The room types and designs follow the Hypersim Dataset. We first render the static scenes, then manipulate each scene and render it a second time. Each scene has a physical size of about 12x12x3 meters and contains around 8 objects. We will keep updating DM-SR for future research in the community.

(2) Replica

In this paper, we use 7 scenes (office0, office2, office3, office4, room0, room1, room2) from the Replica Dataset. We asked the authors of Semantic-NeRF to generate color images and 2D object masks with camera poses at 640x480 pixels for each of the 7 scenes. Each scene has 59 to 93 objects of very diverse sizes. Details of camera settings and trajectories can be found here.

(3) ScanNet

In this paper, we use 8 scenes (scene0010_00, scene0012_00, scene0024_00, scene0033_00, scene0038_00, scene0088_00, scene0113_00, scene0192_00) from the ScanNet Dataset.

4.2 Training

To train the standard DM-NeRF, run the following command with a config file that specifies the data directory and hyperparameters.

CUDA_VISIBLE_DEVICES=0 python -u train_dmsr.py --config configs/dmsr/train/study.txt

Other working modes and setups can also be selected with the same command by choosing a different config file.
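
Since every run in this README follows the same pattern (one script, one config file, one visible GPU), the command can also be assembled from Python. The following is a minimal sketch, not part of the released code: the helper names `build_command`, `build_env` and `as_shell_line` are hypothetical, and only the script and config paths are taken from the command above.

```python
import os
import shlex


def build_command(script, config, extra_args=()):
    """Build the argv list for one run (script and config paths come from this README)."""
    return ["python", "-u", script, "--config", config, *extra_args]


def build_env(gpu_id=0):
    """Copy the current environment and pin the run to a single GPU."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    return env


def as_shell_line(argv, gpu_id=0):
    """Render the run as the one-line shell command style used in this README."""
    return f"CUDA_VISIBLE_DEVICES={gpu_id} " + " ".join(shlex.quote(a) for a in argv)


# Hypothetical helper usage; only the two paths below appear in the README.
train_argv = build_command("train_dmsr.py", "configs/dmsr/train/study.txt")
print(as_shell_line(train_argv))
# To launch from Python instead of the shell:
#   subprocess.run(train_argv, env=build_env(0), check=True)
```

Switching to another working mode then only requires passing a different config path to `build_command`.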

4.3 Evaluation

In this paper, we use PSNR, SSIM, and LPIPS for rendering evaluation, and mAPs for both decomposition and manipulation evaluations.
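
As a reminder of what the simplest of these metrics measures, PSNR is defined as 10 * log10(MAX^2 / MSE) between a rendered image and its reference. The sketch below uses the standard definition with NumPy. It assumes pixel values scaled to [0, 1], and the function name `psnr` is illustrative; the evaluation scripts in this repository may compute it differently.

```python
import numpy as np


def psnr(rendered, reference, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two images of the same shape."""
    rendered = np.asarray(rendered, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    if rendered.shape != reference.shape:
        raise ValueError("images must have the same shape")
    mse = np.mean((rendered - reference) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


# Toy example: a uniform error of 0.1 per pixel gives an MSE of 0.01.
reference = np.zeros((4, 4, 3))
rendered = np.full((4, 4, 3), 0.1)
print(psnr(rendered, reference))  # roughly 20 dB
```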

(1) Decomposition

Quantitative Evaluation

For decomposition evaluation, choose a specific config file and then run:

CUDA_VISIBLE_DEVICES=0 python -u test_dmsr.py --config configs/dmsr/test/study.txt

Mesh Generation

For mesh generation, you can change the config file and then run:

CUDA_VISIBLE_DEVICES=0 python -u test_dmsr.py --config configs/dmsr/test/meshing.txt

(2) Manipulation

Quantitative Evaluation

We provide the DM-SR dataset for the quantitative evaluation of geometry manipulation.

Set the target object and the desired manipulation settings in a specific config file, and then run:

CUDA_VISIBLE_DEVICES=0 python -u test_dmsr.py --config configs/dmsr/mani/study.txt --mani_mode translation

Qualitative Evaluation

For other qualitative evaluations, you can change the config file and then run:

CUDA_VISIBLE_DEVICES=0 python -u test_dmsr.py --config configs/dmsr/mani/demo_deform.txt
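
For intuition about the `translation` mode used in the quantitative evaluation above, a rigid translation of an object can be written as a 4x4 homogeneous matrix applied to its 3D points. The sketch below is a generic illustration with NumPy, not the repository's implementation: the function names `translation_matrix` and `apply_transform` are hypothetical, as are the offset and the sample points. How the target object and offset are encoded in the config files is not shown here.

```python
import numpy as np


def translation_matrix(offset):
    """4x4 homogeneous matrix that shifts points by offset = (dx, dy, dz)."""
    matrix = np.eye(4)
    matrix[:3, 3] = offset
    return matrix


def apply_transform(matrix, points):
    """Apply a 4x4 transform to an (N, 3) array of points."""
    points = np.asarray(points, dtype=np.float64)
    ones = np.ones((len(points), 1))
    homogeneous = np.concatenate([points, ones], axis=1)  # (N, 4)
    return (homogeneous @ matrix.T)[:, :3]


# Hypothetical object points and offset, chosen only for illustration.
object_points = np.array([[0.0, 0.0, 0.0], [1.0, 2.0, 3.0]])
moved = apply_transform(translation_matrix([0.5, 0.0, -1.0]), object_points)
print(moved)
```

Rotations and scalings fit the same 4x4 form, which is why a single matrix is a convenient way to describe a manipulation.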

Citation

If you find our work useful in your research, please consider citing:

  @article{wang2022dmnerf,
  title={DM-NeRF: 3D Scene Geometry Decomposition and Manipulation from 2D Images},
  author={Wang, Bing and Chen, Lu and Yang, Bo},
  journal={arXiv preprint arXiv:2208.07227},
  year={2022}
}

License

Licensed under the CC BY-NC-SA 4.0 license, see LICENSE.

Updates

  • 31/8/2022: Data release!
  • 25/8/2022: Code release!
  • 15/8/2022: Initial release!

Related Repos

  1. RangeUDF: Semantic Surface Reconstruction from 3D Point Clouds
  2. GRF: Learning a General Radiance Field for 3D Representation and Rendering
  3. 3D-BoNet: Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds

dm-nerf's Issues

Question about the experiments

Hiļ¼ Thanks for your wonderful work.
In your paper, it says:"Note that, the recent Semantic-NeRF (Zhi et al., 2021a) is also not comparable because it only learns 3D semantic categories, not individual 3D objects."
Iā€™m a little bit confused about why dont you directly use the instance masks to train the semantic-nerf the compare with your work?

Preprocessing of ScanNet scenes

Hi,

Can you show how to preprocess ScanNet scenes for training? I tried running preprocess.py first and then split.py. This gave me the train and test splits, but when I run train_scannet.py it says ins_rgb.hdf5 is missing. How should I get this hdf5 file?

Thank you!

ValueError: Surface level must be within volume data range.

Hi,
Thanks for the excellent work.
When I run Mesh Generation (CUDA_VISIBLE_DEVICES=0 python -u test_dmsr.py --config configs/dmsr/test/meshing.txt), the error "ValueError: Surface level must be within volume data range." is reported.

Question for manipulation?

Why do we need the scene center? I don't understand.

In the generate_poses_eval function, you list the mani_centers:
mani_centers = {'bathroom': [0.779178, 1.05247, 0.380208], 'bedroom': [-1.29552, 1.72703, 0.2946], 'dinning': [-0.633653, 0.295162, 0.279743], 'kitchen': [-2.52579, -0.103821, 1.47165], 'reception': [0.579352, -0.099242, 0.092597], 'restroom': [-0.001277, -2.85079, 0.588084], 'office': [-0.717374, 0.929292, 0.904515], 'study': [-0.519422, -2.16509, 1.07392]}
Can you explain the manipulation matrix generator? Thanks!

About DM dataset

Hello,

Thanks for your great work. Will you release the video of the training dataset?

Thanks
