Trans4Map

Trans4Map: Revisiting Holistic Bird's-Eye-View Mapping from Egocentric Images to Allocentric Semantics with Vision Transformers

Chang Chen, Jiaming Zhang, Kailun Yang, Kunyu Peng, Rainer Stiefelhagen

News

[10/2022] Trans4Map [PDF] has been accepted to WACV 2023.
[9/2022] A demo video is available at [VIDEO].

Introduction

In this work, we propose Trans4Map, an end-to-end, one-stage Transformer-based framework for mapping. The egocentric-to-allocentric mapping process has three steps: (1) an efficient transformer extracts contextual features from a batch of egocentric images; (2) the proposed Bidirectional Allocentric Memory (BAM) module projects the egocentric features into an allocentric memory; (3) a map decoder parses the accumulated memory and predicts the top-down semantic segmentation map.
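
The three steps can be illustrated with a toy sketch. This is not the authors' implementation: the feature extractor is a stub, the pixel-to-cell projection is supplied by hand rather than derived from depth and camera pose, and the memory update is a plain element-wise maximum standing in for the BAM module. All function names below are hypothetical.

```python
from typing import List, Optional, Sequence, Tuple

Feature = List[float]
Cell = Tuple[int, int]  # (row, col) in the top-down map


def extract_features(image: Sequence[Sequence[float]]) -> List[Feature]:
    """Stub for step 1: one 2-channel feature per pixel, in row-major order."""
    return [[pixel, 1.0 - pixel] for row in image for pixel in row]


def project_to_memory(
    memory: List[List[Optional[Feature]]],
    features: Sequence[Feature],
    cells: Sequence[Optional[Cell]],
) -> None:
    """Stand-in for step 2: write each pixel feature into its map cell.

    cells[i] is the map cell that pixel i falls into, or None if the pixel
    projects outside the map. Features landing in the same cell are merged
    with an element-wise maximum (an assumption made for this sketch).
    """
    for feature, cell in zip(features, cells):
        if cell is None:
            continue
        r, c = cell
        current = memory[r][c]
        if current is None:
            memory[r][c] = list(feature)
        else:
            memory[r][c] = [max(a, b) for a, b in zip(current, feature)]


def decode_map(memory: List[List[Optional[Feature]]]) -> List[List[int]]:
    """Stand-in for step 3: argmax over channels; -1 marks unobserved cells."""
    return [
        [-1 if f is None else max(range(len(f)), key=f.__getitem__) for f in row]
        for row in memory
    ]


def build_semantic_map(frames, height: int, width: int) -> List[List[int]]:
    """Run the three steps over a sequence of (image, cells) pairs."""
    memory: List[List[Optional[Feature]]] = [[None] * width for _ in range(height)]
    for image, cells in frames:
        project_to_memory(memory, extract_features(image), cells)
    return decode_map(memory)
```

The point of the sketch is the data flow: features from every frame are accumulated in one shared top-down memory, and the map is decoded once from that memory rather than per frame.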

More details can be found in our arXiv paper.

Installation

To create the conda environment:

conda create -n Trans4Map python=3.7
conda activate Trans4Map
cd /path/to/Trans4Map
pip install -r requirements.txt

To render RGB-D images from the Matterport3D dataset, install Habitat-sim and Habitat-lab. For consistency with our working environment, please install Habitat-sim == 0.1.5 and Habitat-lab == 0.1.5.

Datasets

You can prepare the training and test datasets in the same way as SMNet.

data/paths.json contains the trajectories that were manually recorded by SMNet.
The semantic top-down ground truth is also available: GT. Please place it under data/semmap.
Our project works with the Matterport3D and Replica datasets. Please download them and place them under data/mp3d and data/replica, respectively.
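
Before training, it can help to confirm that the data layout described above is in place. The following is a minimal sketch, not part of the repository: the helper name missing_paths is hypothetical, and the list of expected paths contains only the locations named in this section.

```python
from pathlib import Path
from typing import Iterable, List

# Locations named in the Datasets section, relative to the repository root.
EXPECTED_PATHS = ("data/paths.json", "data/semmap", "data/mp3d", "data/replica")


def missing_paths(repo_root: str, expected: Iterable[str] = EXPECTED_PATHS) -> List[str]:
    """Return the expected paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in expected if not (root / rel).exists()]


if __name__ == "__main__":
    # Run from the repository root; prints nothing when everything is present.
    for rel in missing_paths("."):
        print(f"missing: {rel}")
```

The check only tests for existence. It does not validate the contents of any directory.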

Training and Evaluation

To train our Trans4Map with different backbones, run:

python train.py

To generate the test results, run:

python build_test_date_feature.py
python test.py

To obtain the mIOU and mBF1, run:

python eval/eval.py
python eval/eval_bfscore.py
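
For reference, mIoU is the per-class intersection-over-union averaged across classes. The sketch below shows the standard computation from a confusion matrix. It is an illustration only and is not taken from eval/eval.py, which may treat ignored labels or absent classes differently. The function names are hypothetical.

```python
from typing import List, Sequence


def confusion_matrix(gt: Sequence[int], pred: Sequence[int], num_classes: int) -> List[List[int]]:
    """Count (ground-truth, prediction) pairs; rows index ground-truth classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for g, p in zip(gt, pred):
        matrix[g][p] += 1
    return matrix


def mean_iou(matrix: Sequence[Sequence[int]]) -> float:
    """Average TP / (TP + FP + FN) over classes that appear in GT or prediction."""
    n = len(matrix)
    ious = []
    for c in range(n):
        tp = matrix[c][c]
        fn = sum(matrix[c]) - tp                         # GT is c, predicted otherwise
        fp = sum(matrix[r][c] for r in range(n)) - tp    # predicted c, GT otherwise
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both GT and prediction
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0
```

To apply it to a top-down map, flatten the predicted and ground-truth label maps into two sequences of equal length and pass them to confusion_matrix.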

Main results on Matterport3D with pretrained models

Method	Backbone	mIOU(%)	Weights
ConvNeXt	ConvNeXt-T	35.91	-
ConvNeXt	ConvNeXt-S	36.49	-
FAN	FAN-T	31.07	-
FAN	FAN-S	34.62	-
Swin	Swin-T	34.19	-
Swin	Swin-S	36.80	-
Trans4Map	MiT-B2	40.02	B2
Trans4Map	MiT-B4	40.88	B4

License

This repository is released under the Apache-2.0 license. For commercial use, please contact the authors.

Citations

If you are interested in this work, please cite the following paper:

@inproceedings{chen2023trans4map,
  title={Trans4Map: Revisiting Holistic Bird's-Eye-View Mapping from Egocentric Images to Allocentric Semantics with Vision Transformers},
  author={Chen, Chang and Zhang, Jiaming and Yang, Kailun and Peng, Kunyu and Stiefelhagen, Rainer},
  booktitle={2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)},
  year={2023}
}

trans4map's Issues

No such file or directory: smnet_training_data

Hi,

Thank you for sharing your excellent research code.

After following the instructions provided on this GitHub, I attempted to run it but encountered the following error:

  File "/home/dmsai3/Trans4Map/model/loader.py", line 82, in __init__
    self.files = os.listdir(os.path.join(self.root, 'smnet_training_data'))
FileNotFoundError: [Errno 2] No such file or directory: '/home/dmsai3/Trans4Map/smnet_training_data'

It seems that a pretrained smnet is required. Could you please provide guidance on where to download the necessary files? Alternatively, could you provide detailed instructions on what needs to be done in smnet to run your model without encountering the aforementioned error?

I'll be grateful for any assistance you can provide.
