
SGAligner : 3D Scene Alignment with Scene Graphs

Sayan Deb Sarkar¹, Ondrej Miksik², Marc Pollefeys¹,², Daniel Barath¹, Iro Armeni¹

¹ETH Zurich ²Microsoft Mixed Reality & AI Labs

SGAligner aligns 3D scene graphs of environments using multi-modal learning and leverages the output for the downstream task of 3D point cloud registration.


[Project Webpage] [Paper]

News 📰

Code Structure 🎬

├── sgaligner
│   ├── data-preprocessing            <- subscan generation + preprocessing
│   ├── configs                       <- configuration files
│   ├── src
│   │   ├── aligner                   <- SGAligner modules
│   │   ├── datasets                  <- dataloader for 3RScan subscans
│   │   ├── engine                    <- trainer classes
│   │   ├── GeoTransformer            <- geotransformer submodule for registration
│   │   ├── inference                 <- inference files for alignment + downstream applications
│   │   └── trainers                  <- train + validation loop (EVA + SGAligner)
│   ├── utils                         <- util functions
│   ├── README.md
│   ├── scripts                       <- bash scripts for data generation + preprocessing + training
│   └── output                        <- folder that stores models and logs

Dependencies 📝

The main dependencies of the project are the following:

python: 3.8.15
cuda: 11.6

You can set up a conda environment as follows:

git clone --recurse-submodules -j8 git@github.com:sayands/sgaligner.git
cd sgaligner
conda env create -f req.yml

Please follow the instructions in the GeoTransformer submodule for its additional installation requirements and setup.

Downloads 💧

The pre-trained model and other meta files are available here.

Dataset Generation 🔨

After installing the dependencies, we preprocess the datasets and provide the benchmarks.

Subscan Pair Generation - 3RScan + 3DSSG

Download 3RScan and 3DSSG. Move all files of 3DSSG to a new files/ directory within 3RScan. The structure should be:

├── 3RScan
│   ├── files       <- all 3RScan and 3DSSG meta files (NOT the scan data)
│   ├── scenes      <- scans
│   └── out         <- default output directory for generated subscans (created when running pre-processing)

Change the absolute paths in utils/define.py.
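Before running the pre-processing, it can help to confirm that the dataset root matches the layout above. The following is a minimal sketch, not part of the repository: the helper name `missing_dataset_dirs` and the constant `REQUIRED_SUBDIRS` are hypothetical, and it only checks for the `files/` and `scenes/` folders, since `out/` is created later.

```python
from pathlib import Path

# Hypothetical helper for illustration; it is not part of the sgaligner code base.
# "out" is omitted on purpose because the pre-processing creates it.
REQUIRED_SUBDIRS = ("files", "scenes")


def missing_dataset_dirs(root):
    """Return the required sub-directories that are absent under `root`."""
    root = Path(root)
    return [name for name in REQUIRED_SUBDIRS if not (root / name).is_dir()]
```

Calling `missing_dataset_dirs("/absolute/path/to/3RScan")` with the same root you set in utils/define.py should return an empty list when the layout is in place.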

First, we create subscans from each 3RScan scan using the ground truth scene graphs from the 3DSSG dataset and then calculate the pairwise overlap ratio for the subscans in a scan. Finally, we preprocess the data for our framework. The relevant code can be found in the data-preprocessing/ directory. You can use the following command to generate the subscans.

bash scripts/generate_data_scan3r_gt.sh
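To make the idea of a pairwise overlap ratio concrete, here is a small sketch under an assumed definition: the fraction of points in one subscan that have a neighbour in the other subscan within a given radius. The function name, the `radius` parameter, and the definition itself are illustrative assumptions; the scripts in data-preprocessing/ may compute overlap differently.

```python
import math


def overlap_ratio(src, ref, radius):
    """Fraction of points in `src` with at least one `ref` point within `radius`.

    Assumed definition for illustration only. Brute force, O(len(src) * len(ref)),
    so it is suitable for toy inputs rather than full point clouds.
    """
    if not src:
        return 0.0
    hits = sum(1 for p in src if any(math.dist(p, q) <= radius for q in ref))
    return hits / len(src)


# Toy subscans: two of the four points in `a` coincide with points in `b`.
a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
b = [(2.0, 0.0, 0.0), (3.0, 0.0, 0.0), (4.0, 0.0, 0.0)]
print(overlap_ratio(a, b, radius=0.1))
```

Note that this measure is not symmetric: swapping `a` and `b` changes the denominator. For real point clouds a spatial index such as a k-d tree would replace the brute-force search.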

Note: To adhere to our evaluation procedure, please do not change the seed value in the files in the configs/ directory.

Generating Overlapping and Non-Overlapping Subscan Pairs

To generate overlapping and non-overlapping pairs, use:

python data-preprocessing/gen_all_pairs_fileset.py

This creates a fileset in which the number of randomly chosen non-overlapping pairs from the generated subscans equals the number of overlapping pairs produced earlier during subscan generation.
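The balancing step described above can be sketched as follows. This is an illustration only, not the logic of the actual script: the function name `balanced_pair_fileset`, its arguments, and the default seed are hypothetical, and it assumes that any pair of subscans not listed as overlapping counts as non-overlapping.

```python
import itertools
import random


def balanced_pair_fileset(subscan_ids, overlapping_pairs, seed=42):
    """Return (overlapping, non_overlapping) pair lists of equal size.

    Hypothetical sketch: non-overlapping pairs are drawn uniformly at random
    from all subscan pairs that are not in `overlapping_pairs`.
    """
    # Normalise pair order so that (a, b) and (b, a) are treated as the same pair.
    overlapping = {tuple(sorted(p)) for p in overlapping_pairs}
    candidates = [
        pair
        for pair in itertools.combinations(sorted(subscan_ids), 2)
        if pair not in overlapping
    ]
    # A dedicated generator with a fixed seed keeps the fileset reproducible.
    rng = random.Random(seed)
    k = min(len(overlapping), len(candidates))
    return sorted(overlapping), rng.sample(candidates, k)
```

Using a fixed seed here mirrors the reason for leaving the seed in the configuration files untouched: the sampled pairs stay the same across runs.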

Usage on Predicted Scene Graphs: Coming Soon!

Training 🚄

To train SGAligner on the 3RScan subscans generated above, use:

cd src
python trainers/trainval_sgaligner.py --config ../configs/scan3r/scan3r_ground_truth.yaml

EVA Training

We also provide training scripts for EVA, which we adapted for scene graph alignment and use as a baseline. To train EVA on the same data as SGAligner, use:

cd src
python trainers/trainval_eva.py --config ../configs/scan3r/scan3r_eva.yaml

We provide config files for the corresponding data in the configs/ directory. To tune the hyper-parameters, change the parameters in these configuration files.

Evaluation 🚦

Graph Alignment + Point Cloud Registration

cd src
python inference/sgaligner/inference_align_reg.py --config ../configs/scan3r/scan3r_ground_truth.yaml --snapshot <path to SGAligner trained model> --reg_snapshot <path to GeoTransformer model trained on 3DMatch>

Finding Overlapping vs Non-Overlapping Pairs

❗ Complete the steps in Generating Overlapping and Non-Overlapping Subscan Pairs first.

To run the inference, use:

cd src
python inference/sgaligner/inference_find_overlapper.py --config ../configs/scan3r/scan3r_gt_w_wo_overlap.yaml --snapshot <path to SGAligner trained model> --reg_snapshot <path to GeoTransformer model trained on 3DMatch>

3D Point Cloud Mosaicking

First, we generate the subscans per 3RScan scan using:

python data-preprocessing/gen_scan_subscan_mapping.py --split <the split you want to generate the mapping for>

Then, to run the inference, use:

cd src
python inference/sgaligner/inference_mosaicking.py --config ../configs/scan3r/scan3r_gt_mosaicking.yaml --snapshot <path to SGAligner trained model> --reg_snapshot <path to GeoTransformer model trained on 3DMatch>

Benchmark 📈

We provide detailed results and comparisons here.

3D Scene Graph Alignment (Node Matching)

| Method | Mean Reciprocal Rank | Hits@1 | Hits@2 | Hits@3 | Hits@4 | Hits@5 |
| --- | --- | --- | --- | --- | --- | --- |
| EVA | 0.867 | 0.790 | 0.884 | 0.938 | 0.963 | 0.977 |
| $\mathcal{P}$ | 0.884 | 0.835 | 0.886 | 0.921 | 0.938 | 0.951 |
| $\mathcal{P}$ + $\mathcal{S}$ | 0.897 | 0.852 | 0.899 | 0.931 | 0.945 | 0.955 |
| $\mathcal{P}$ + $\mathcal{S}$ + $\mathcal{R}$ | 0.911 | 0.861 | 0.916 | 0.947 | 0.961 | 0.970 |
| SGAligner | 0.950 | 0.923 | 0.957 | 0.974 | 0.9823 | 0.987 |
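For readers unfamiliar with the node matching metrics, the sketch below shows how Mean Reciprocal Rank and Hits@K are conventionally computed from the 1-based rank of the correct match for each query node. The function names are hypothetical, and the evaluation code in this repository may differ in details such as tie handling.

```python
def rank_of_match(scores, true_index):
    """1-based rank of the correct candidate; higher score means better match.

    Ties are resolved optimistically here (an assumption for illustration).
    """
    return 1 + sum(1 for s in scores if s > scores[true_index])


def mean_reciprocal_rank(ranks):
    """Average of 1 / rank over all query nodes."""
    return sum(1.0 / r for r in ranks) / len(ranks)


def hits_at_k(ranks, k):
    """Fraction of query nodes whose correct match is ranked within the top k."""
    return sum(1 for r in ranks if r <= k) / len(ranks)


# Toy example: four query nodes whose correct matches are ranked 1, 2, 1 and 4.
ranks = [1, 2, 1, 4]
print(mean_reciprocal_rank(ranks), hits_at_k(ranks, 1), hits_at_k(ranks, 2))
```

Both metrics lie in [0, 1], and higher is better; Hits@K is non-decreasing in K, which matches the pattern across the columns of the table.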

3D Point Cloud Registration

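| Method | CD | RRE | RTE | FMR | RR |
| --- | --- | --- | --- | --- | --- |
| GeoTr | 0.02247 | 1.813 | 2.79 | 98.94 | 98.49 |
| Ours, K=1 | 0.01677 | 1.425 | 2.88 | 99.85 | 98.79 |
| Ours, K=2 | 0.01111 | 1.012 | 1.67 | 99.85 | 99.40 |
| Ours, K=3 | 0.01525 | 1.736 | 2.55 | 99.85 | 98.81 |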
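Assuming RRE and RTE stand for relative rotation error and relative translation error, as is common in the point cloud registration literature, the sketch below shows their usual definitions: the geodesic angle between the estimated and ground truth rotations, and the Euclidean distance between the two translations. The function names and the choice of degrees as output unit are assumptions for illustration; the table does not state units, and the repository's evaluation code may differ.

```python
import math


def relative_rotation_error_deg(R_est, R_gt):
    """Angle (in degrees) of the rotation taking R_gt to R_est.

    Uses trace(R_gt^T R_est) = sum_ij R_gt[i][j] * R_est[i][j] for 3x3 matrices.
    """
    trace = sum(R_gt[i][j] * R_est[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating point round-off.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))


def relative_translation_error(t_est, t_gt):
    """Euclidean distance between estimated and ground truth translations."""
    return math.dist(t_est, t_gt)


# Toy example: the estimate is off by a 90 degree rotation about the z axis.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rot_z_90 = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
print(relative_rotation_error_deg(rot_z_90, identity))
```

Under these definitions lower values are better for both errors, and an exact estimate gives zero for each.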

TODO 🔜

  • Add 3D Point Cloud Mosaicking
  • Add Support For EVA
  • Add a demo for real-life point cloud testing
  • Add usage on Predicted Scene Graphs
  • Add scene graph alignment of local 3D scenes to prior 3D maps
  • Add overlapping scene finder with a traditional retrieval method (FPFH + VLAD + KNN)

BibTeX 🙏

@misc{sarkar2023sgaligner,
      title={SGAligner : 3D Scene Alignment with Scene Graphs}, 
      author={Sayan Deb Sarkar and Ondrej Miksik and Marc Pollefeys and Daniel Barath and Iro Armeni},
      year={2023},
      eprint={2304.14880},
      archivePrefix={arXiv},
      primaryClass={cs.CV}
}

Acknowledgments ♻️

In this project we use (parts of) the official implementations of the following works and thank the respective authors for open sourcing their methods:
