trans4map's Introduction

Trans4Map

Trans4Map: Revisiting Holistic Bird's-Eye-View Mapping from Egocentric Images to Allocentric Semantics with Vision Transformers

Chang Chen, Jiaming Zhang, Kailun Yang, Kunyu Peng, Rainer Stiefelhagen

News

  • [10/2022] Trans4Map [PDF] has been accepted to WACV 2023.
  • [9/2022] A demo video is at [VIDEO].

Introduction

In this work, we propose an end-to-end one-stage Transformer-based framework for Mapping, termed Trans4Map. Our egocentric-to-allocentric mapping process includes three steps: (1) the efficient transformer extracts the contextual features from a batch of egocentric images; (2) the proposed Bidirectional Allocentric Memory (BAM) module projects egocentric features into the allocentric memory; (3) the map decoder parses the accumulated memory and predicts the top-down semantic segmentation map.
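
To make step (2) more concrete, here is a minimal sketch of scattering egocentric features into a top-down memory. It is not the BAM module: the function name `project_to_memory`, the precomputed `cell_index` array, and the element-wise max used to merge pixels that land in the same cell are all assumptions made for illustration.

```python
import numpy as np


def project_to_memory(memory, features, cell_index):
    """Scatter per-pixel features into a flattened top-down memory (illustrative only).

    memory:     (num_cells, C) allocentric memory, updated in place
    features:   (num_pixels, C) egocentric features from the encoder
    cell_index: (num_pixels,) map cell hit by each pixel, or -1 if outside the map
    """
    valid = cell_index >= 0
    # Unbuffered element-wise max: when several pixels fall into the same
    # cell, the largest value per channel is kept (an assumed merge rule).
    np.maximum.at(memory, cell_index[valid], features[valid])
    return memory


if __name__ == "__main__":
    # Hypothetical toy input: a 2x2 map (4 cells), 2 channels, 3 pixels.
    memory = np.zeros((4, 2))
    features = np.array([[1.0, 2.0], [3.0, 1.0], [5.0, 5.0]])
    cell_index = np.array([0, 0, -1])  # the last pixel is outside the map
    print(project_to_memory(memory, features, cell_index))
```

Calling the function once per frame on the same `memory` array accumulates observations over a trajectory. The map decoder in step (3) would then consume the memory reshaped to the map grid.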

More details can be found in our arXiv paper.

Installation

To create the conda environment:

conda create -n Trans4Map python=3.7
conda activate Trans4Map
cd /path/to/Trans4Map
pip install -r requirements.txt

To get RGB-D renderings of the Matterport3D dataset, install Habitat-sim and Habitat-lab. To stay consistent with our working environment, please install Habitat-sim == 0.1.5 and Habitat-lab == 0.1.5.

Datasets

You can prepare the training and test datasets in the same way as SMNet.

  • data/paths.json contains the trajectories that were manually recorded by SMNet.
  • The semantic top-down ground truth is also available: GT. Please place it under data/semmap.
  • Our project works with the Matterport3D and Replica datasets. Please download them and place them under data/mp3d or data/replica, respectively.
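
Before training, it can help to confirm that the files listed above are where the code expects them. The following is a minimal sketch, assuming the script is run from the repository root; the helper name `check_layout` is hypothetical and not part of the repository.

```python
from pathlib import Path

# Paths taken from the list above, relative to the repository root.
REQUIRED = ("data/paths.json", "data/semmap")
DATASETS = ("data/mp3d", "data/replica")


def check_layout(root="."):
    """Return a list of human-readable problems; an empty list means the layout looks fine."""
    root = Path(root)
    problems = [f"missing {p}" for p in REQUIRED if not (root / p).exists()]
    # At least one of the two dataset folders should be present.
    if not any((root / d).is_dir() for d in DATASETS):
        problems.append("missing dataset: expected " + " or ".join(DATASETS))
    return problems


if __name__ == "__main__":
    issues = check_layout()
    print("\n".join(issues) if issues else "data layout looks complete")
```

The check only tests for existence. It does not validate the contents of the folders.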

Training and Evaluation

To train our Trans4Map with different backbones, run:

python train.py 

To generate the test results, run:

python build_test_date_feature.py
python test.py

To obtain the mIOU and mBF1, run:

python eval/eval.py
python eval/eval_bfscore.py
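
For reference, mIOU is the per-class intersection-over-union averaged over classes. The sketch below shows the standard confusion-matrix formulation. It is not the code in eval/eval.py, and the function name `mean_iou` and the `ignore_label` parameter are assumptions for illustration.

```python
import numpy as np


def mean_iou(pred, gt, num_classes, ignore_label=None):
    """Mean IoU over classes that occur in the prediction or the ground truth.

    pred, gt: integer label maps of identical shape, values in [0, num_classes)
    """
    pred = np.asarray(pred).ravel()
    gt = np.asarray(gt).ravel()
    if ignore_label is not None:
        keep = gt != ignore_label
        pred, gt = pred[keep], gt[keep]
    # Confusion matrix: rows are ground-truth classes, columns are predictions.
    conf = np.bincount(gt * num_classes + pred, minlength=num_classes ** 2)
    conf = conf.reshape(num_classes, num_classes)
    intersection = np.diag(conf)
    union = conf.sum(axis=0) + conf.sum(axis=1) - intersection
    present = union > 0  # skip classes absent from both maps
    return float((intersection[present] / union[present]).mean())


if __name__ == "__main__":
    # Hypothetical 2x2 label maps with two classes.
    prediction = np.array([[0, 0], [1, 1]])
    ground_truth = np.array([[0, 1], [1, 1]])
    print(mean_iou(prediction, ground_truth, num_classes=2))
```

Averaging only over classes that are present avoids division by zero. Whether absent classes should count as zero instead is a convention that varies between benchmarks.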

Main results on Matterport3D with pretrained models

Method      Backbone     mIOU (%)   Weight
ConvNeXt    ConvNeXt-T   35.91
ConvNeXt    ConvNeXt-S   36.49
FAN         FAN-T        31.07
FAN         FAN-S        34.62
Swin        Swin-T       34.19
Swin        Swin-S       36.80
Trans4Map   MiT-B2       40.02      B2
Trans4Map   MiT-B4       40.88      B4

License

This repository is released under the Apache-2.0 license. For commercial use, please contact the authors.

Citations

If you are interested in this work, please cite the following paper:

@inproceedings{chen2023trans4map,
  title={Trans4Map: Revisiting Holistic Bird's-Eye-View Mapping from Egocentric Images to Allocentric Semantics with Vision Transformers},
  author={Chen, Chang and Zhang, Jiaming and Yang, Kailun and Peng, Kunyu and Stiefelhagen, Rainer},
  booktitle={2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  year={2023}
}

trans4map's Issues

No such file or directory: smnet_training_data

Hi,

Thank you for sharing your excellent research code.

After following the instructions provided on this GitHub, I attempted to run it but encountered the following error:

File "/home/dmsai3/Trans4Map/model/loader.py", line 82, in __init__
    self.files = os.listdir(os.path.join(self.root, 'smnet_training_data'))
FileNotFoundError: [Errno 2] No such file or directory: '/home/dmsai3/Trans4Map/smnet_training_data'

It seems that a pretrained smnet is required. Could you please provide guidance on where to download the necessary files? Alternatively, could you provide detailed instructions on what needs to be done in smnet to run your model without encountering the aforementioned error?

I'll be grateful for any assistance you can provide.
