This project forked from hustvl/gkt

Efficient and Robust 2D-to-BEV Representation Learning via Geometry-guided Kernel Transformer

Home Page: https://arxiv.org/abs/2206.04584

License: MIT License

Python 100.00%

gkt's Introduction

Geometry-guided Kernel Transformer

Efficient and Robust 2D-to-BEV Representation Learning via Geometry-guided Kernel Transformer
Shaoyu Chen*, Tianheng Cheng*, Xinggang Wang, Wenming Meng, Qian Zhang, Wenyu Liu

(*: equal contribution, : corresponding author)

News

  • October 14, 2022: We've released code & models for map-view segmentation.

  • June 9, 2022: We've released the tech report for Geometry-guided Kernel Transformer (GKT). This work is still in progress and code/models are coming soon. Please stay tuned! ☕️

Introduction

Framework

We present a novel and efficient 2D-to-BEV transformation, Geometry-guided Kernel Transformer (GKT).

  • GKT leverages geometric priors to guide the transformer to focus on discriminative regions when generating BEV representations from surrounding-view image features.
  • GKT is based on kernel-wise attention and is highly efficient, especially with LUT indexing.
  • GKT is robust to camera deviation, making the 2D-to-BEV transformation more stable and reliable.
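
The kernel-wise attention and LUT indexing mentioned above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes a single pinhole camera, a kernel given as (rows, cols), plain dot-product attention without learned projections, and the names `build_lut` and `kernel_attention` are hypothetical. The idea shown is that each BEV grid position is projected into the image once, the pixel indices of a small kernel around that projection are stored in a lookup table, and at run time each BEV query attends only to those pixels.

```python
import numpy as np

def build_lut(bev_xyz, K, T_cam_from_ego, feat_hw, kernel=(7, 1)):
    """Precompute kernel pixel indices for every BEV position (offline step).

    bev_xyz: (N, 3) grid positions in the ego frame (assumed layout).
    K: (3, 3) intrinsics scaled to the feature map; T_cam_from_ego: (4, 4).
    Returns idx (N, P) flat indices into an H*W feature map and valid (N, P).
    """
    H, W = feat_hw
    kh, kw = kernel
    n = bev_xyz.shape[0]
    homo = np.concatenate([bev_xyz, np.ones((n, 1))], axis=1)   # (N, 4)
    cam = (T_cam_from_ego @ homo.T).T[:, :3]                     # camera frame
    depth = cam[:, 2]
    in_front = depth > 1e-6                                      # drop points behind the camera
    uvw = (K @ cam.T).T
    safe = np.where(in_front, depth, 1.0)                        # avoid division by zero
    u = np.rint(uvw[:, 0] / safe).astype(int)
    v = np.rint(uvw[:, 1] / safe).astype(int)
    # Kernel offsets centred on the projected pixel.
    dv, du = np.meshgrid(np.arange(kh) - kh // 2, np.arange(kw) - kw // 2, indexing="ij")
    uu = u[:, None] + du.ravel()[None, :]
    vv = v[:, None] + dv.ravel()[None, :]
    valid = in_front[:, None] & (uu >= 0) & (uu < W) & (vv >= 0) & (vv < H)
    idx = np.clip(vv, 0, H - 1) * W + np.clip(uu, 0, W - 1)
    return idx, valid

def kernel_attention(queries, feats, idx, valid):
    """Run-time step: each query (N, C) attends to its P kernel pixels in feats (H*W, C)."""
    keys = feats[idx]                                            # (N, P, C) gathered via the LUT
    scores = np.einsum("nc,npc->np", queries, keys) / np.sqrt(queries.shape[1])
    scores = np.where(valid, scores, -np.inf)                    # mask out-of-image pixels
    m = np.max(scores, axis=1, keepdims=True)
    m = np.where(np.isfinite(m), m, 0.0)
    w = np.exp(scores - m)
    denom = w.sum(axis=1, keepdims=True)
    w = np.divide(w, denom, out=np.zeros_like(w), where=denom > 0)  # all-invalid rows give zeros
    return np.einsum("np,npc->nc", w, keys)

# Toy usage with made-up camera parameters (illustration only).
rng = np.random.default_rng(0)
H, W, C = 8, 16, 4
K = np.array([[8.0, 0.0, 8.0], [0.0, 8.0, 4.0], [0.0, 0.0, 1.0]])
T = np.eye(4)                                                    # camera frame == ego frame here
bev = np.array([[0.0, 0.0, 5.0], [1.0, 0.0, 4.0], [0.0, 0.0, -3.0]])
idx, valid = build_lut(bev, K, T, (H, W))
out = kernel_attention(rng.normal(size=(3, C)), rng.normal(size=(H * W, C)), idx, valid)
print(out.shape)  # (3, 4)
```

Because `build_lut` depends only on the calibration and the BEV grid, it can be computed once and reused, so the run-time step reduces to an index gather followed by a small attention over the kernel pixels.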

Getting Started

git clone https://github.com/hustvl/GKT.git

Map-view nuScenes Segmentation

Models

Method   Kernel   mIoU (Setting 1)   mIoU (Setting 2)   FPS    Model
CVT      -        39.3               37.2               34.1   model
GKT      7x1      41.4               38.0               45.6   model

Note: FPS is measured on a single 2080 Ti GPU.
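
As a rough illustration of how a frames-per-second figure like the ones above is typically obtained, here is a small timing sketch. It is not the repository's `scripts/speed.py`; the helper name `measure_fps` and the warm-up and iteration counts are assumptions. When timing a GPU model, asynchronous kernels need to be synchronized (for example with `torch.cuda.synchronize()`) before the clock is read, otherwise the result is overly optimistic.

```python
import time

def measure_fps(fn, warmup=10, iters=50):
    """Return calls per second of fn(), ignoring the first `warmup` calls."""
    for _ in range(warmup):          # warm-up: caches, lazy initialisation, autotuning
        fn()
    start = time.perf_counter()
    for _ in range(iters):
        fn()
    elapsed = time.perf_counter() - start
    return iters / elapsed

# Toy workload standing in for one forward pass of a model.
fps = measure_fps(lambda: sum(range(1000)))
print(f"{fps:.1f} calls/s")
```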

Usage

For map-view nuScenes segmentation, GKT is built mainly on the awesome CrossViewTransformer.

# map-view segmentation
cd segmentation

Prerequisites

# install dependencies
pip install -r requirements.txt
pip install -e .

Preparing the Dataset

Training / Testing / Benchmarking

  • Pretrained model

Download the pretrained model efficientnet-b4-6ed6700e.pth

mkdir pretrained_models
cd pretrained_models
# place the pretrained model here

  • Training

python scripts/train.py +experiment=gkt_nuscenes_vehicle_kernel_7x1.yaml data.dataset_dir=<path/to/nuScenes> data.labels_dir=<path/to/labels>

  • Testing

An absolute checkpoint path is recommended.

python scripts/eval.py +experiment=gkt_nuscenes_vehicle_kernel_7x1.yaml data.dataset_dir=<path/to/nuScenes> data.labels_dir=<path/to/labels> experiment.ckptt=<path/to/checkpoint>

  • Evaluating Speed

python scripts/speed.py +experiment=gkt_nuscenes_vehicle_kernel_7x1.yaml data.dataset_dir=<path/to/nuScenes> data.labels_dir=<path/to/labels>
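
The commands above all share the same Hydra-style `key=value` overrides, so they can be assembled programmatically. The sketch below is only a convenience wrapper under stated assumptions: `build_command` is a hypothetical helper, the example paths are placeholders, and the `experiment.ckptt=` form assumes the checkpoint key is passed as a `key=value` override like the others. It also resolves the checkpoint to an absolute path, following the recommendation in the testing step.

```python
from pathlib import Path

def build_command(script, experiment, dataset_dir, labels_dir, ckpt=None):
    """Assemble the argument list for scripts/<script>.py (hypothetical helper)."""
    args = [
        "python", f"scripts/{script}.py",
        f"+experiment={experiment}",
        f"data.dataset_dir={dataset_dir}",
        f"data.labels_dir={labels_dir}",
    ]
    if ckpt is not None:
        # Resolve to an absolute path so the override does not depend on the working directory.
        args.append(f"experiment.ckptt={Path(ckpt).resolve()}")
    return args

# Placeholder paths, for illustration only.
cmd = build_command(
    "eval",
    "gkt_nuscenes_vehicle_kernel_7x1.yaml",
    "/data/nuscenes",
    "/data/labels",
    ckpt="checkpoints/model.ckpt",
)
print(" ".join(cmd))
# To launch it: subprocess.run(cmd, check=True) from inside the segmentation directory.
```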

3D Object Detection

Coming soon.

Acknowledgements

We sincerely appreciate the awesome repos cross_view_transformers and fiery!

License

GKT is released under the MIT License.

Citation

If you find GKT useful in your research or applications, please consider giving us a star 🌟 and citing it with the following BibTeX entry.

@article{GeokernelTransformer,
  title={Efficient and Robust 2D-to-BEV Representation Learning via Geometry-guided Kernel Transformer},
  author={Chen, Shaoyu and Cheng, Tianheng and Wang, Xinggang and Meng, Wenming and Zhang, Qian and Liu, Wenyu},
  journal={arXiv preprint arXiv:2206.04584},
  year={2022}
}

gkt's People

Contributors

outsidercsy, wondervictor
