Awesome Vision-and-Language Navigation

A curated list of research papers in Vision-and-Language Navigation (VLN). Links to code and project websites are included where available.

Contributing

Please feel free to contact me via email ([email protected]), open an issue, or submit a pull request.

To add a new paper via pull request:

  1. Fork the repo, edit README.md.
  2. Insert the new paper at the correct chronological position, using the following format:
    1. **Paper Title** <br>
    *Author(s)* <br>
    Conference, Year. [[Paper]](link) [[Code]](link) [[Website]](link)
    
  3. Send a pull request. Ideally, I will review the request within a week.
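If you are adding several entries at once, the template above is easy to generate programmatically. Below is a minimal, hypothetical Python helper (not part of this repo; the function name and signature are my own invention) that renders one paper in the list's markdown format:

```python
def format_entry(title, authors, venue, year, paper=None, code=None, website=None):
    """Render one paper entry in the awesome-vln markdown format.

    Link arguments left as None are simply omitted, matching entries
    in the list that have no [Code] or [Website] link.
    """
    links = [("Paper", paper), ("Code", code), ("Website", website)]
    link_md = " ".join(f"[[{name}]]({url})" for name, url in links if url)
    return (f"1. **{title}** <br>\n"
            f"*{', '.join(authors)}* <br>\n"
            f"{venue}, {year}. {link_md}")

# Example with placeholder links, as in the template above:
print(format_entry(
    "Paper Title",
    ["First Author", "Second Author"],
    "Conference", 2019,
    paper="link", code="link",
))
```

Markdown renders consecutive `1.` items as an auto-numbered list, so every generated entry can start with `1.` regardless of its final position.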

Papers

  1. Vision-and-Language Navigation: Interpreting Visually-Grounded Navigation Instructions in Real Environments
    Peter Anderson, Qi Wu, Damien Teney, Jake Bruce, Mark Johnson, Niko Sünderhauf, Ian Reid, Stephen Gould, Anton van den Hengel
    CVPR, 2018. [Paper] [Code] [Website]

  2. Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation
    Xin Wang, Wenhan Xiong, Hongmin Wang, William Yang Wang
    ECCV, 2018. [Paper]

  3. Speaker-Follower Models for Vision-and-Language Navigation
    Daniel Fried, Ronghang Hu, Volkan Cirik, Anna Rohrbach, Jacob Andreas, Louis-Philippe Morency, Taylor Berg-Kirkpatrick, Kate Saenko, Dan Klein, Trevor Darrell
    NeurIPS, 2018. [Paper] [Code] [Website]

  4. Shifting the Baseline: Single Modality Performance on Visual Navigation & QA
    Jesse Thomason, Daniel Gordon, Yonatan Bisk
    NAACL, 2019. [Paper] [Poster]

  5. Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation
    Xin Wang, Qiuyuan Huang, Asli Celikyilmaz, Jianfeng Gao, Dinghan Shen, Yuan-Fang Wang, William Yang Wang, Lei Zhang
    CVPR, 2019. [Paper]

  6. Self-Monitoring Navigation Agent via Auxiliary Progress Estimation
    Chih-Yao Ma, Jiasen Lu, Zuxuan Wu, Ghassan AlRegib, Zsolt Kira, Richard Socher, Caiming Xiong
    ICLR, 2019. [Paper] [Code] [Website]

  7. The Regretful Agent: Heuristic-Aided Navigation through Progress Estimation
    Chih-Yao Ma, Zuxuan Wu, Ghassan AlRegib, Caiming Xiong, Zsolt Kira
    CVPR, 2019. [Paper] [Code] [Website]

  8. Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation
    Liyiming Ke, Xiujun Li, Yonatan Bisk, Ari Holtzman, Zhe Gan, Jingjing Liu, Jianfeng Gao, Yejin Choi, Siddhartha Srinivasa
    CVPR, 2019. [Paper] [Code] [Video]

  9. Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout
    Hao Tan, Licheng Yu, Mohit Bansal
    NAACL, 2019. [Paper] [Code]

  10. Multi-modal Discriminative Model for Vision-and-Language Navigation
    Haoshuo Huang, Vihan Jain, Harsh Mehta, Jason Baldridge, Eugene Ie
    NAACL Workshop, 2019. [Paper]

  11. Are You Looking? Grounding to Multiple Modalities in Vision-and-Language Navigation
    Ronghang Hu, Daniel Fried, Anna Rohrbach, Dan Klein, Trevor Darrell, Kate Saenko
    ACL, 2019. [Paper]

  12. Chasing Ghosts: Instruction Following as Bayesian State Tracking
    Peter Anderson, Ayush Shrivastava, Devi Parikh, Dhruv Batra, Stefan Lee
    NeurIPS, 2019. [Paper]

  13. Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters
    Federico Landi, Lorenzo Baraldi, Massimiliano Corsini, Rita Cucchiara
    BMVC, 2019. [Paper] [Code]

  14. Transferable Representation Learning in Vision-and-Language Navigation
    Haoshuo Huang, Vihan Jain, Harsh Mehta, Alexander Ku, Gabriel Magalhaes, Jason Baldridge, Eugene Ie
    ICCV, 2019. [Paper]

  15. Robust Navigation with Language Pretraining and Stochastic Sampling
    Xiujun Li, Chunyuan Li, Qiaolin Xia, Yonatan Bisk, Asli Celikyilmaz, Jianfeng Gao, Noah Smith, Yejin Choi
    EMNLP, 2019. [Paper] [Code]

  16. Vision-Language Navigation with Self-Supervised Auxiliary Reasoning Tasks
    Fengda Zhu, Yi Zhu, Xiaojun Chang, Xiaodan Liang
    arXiv:1911.07883. [Paper]

  17. Counterfactual Vision-and-Language Navigation via Adversarial Path Sampling
    Tsu-Jui Fu, Xin Wang, Matthew Peterson, Scott Grafton, Miguel Eckstein, William Yang Wang
    arXiv:1911.07308. [Paper]
