Prevalent: A Pretrained Generic VLN Agent

This repository contains source code to reproduce the results presented in the paper:

Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training, CVPR 2020
Weituo Hao*, Chunyuan Li*, Xiujun Li, Lawrence Carin, Jianfeng Gao

Pretrain

Our collected triplets can be downloaded here

The pretrained model can be downloaded here

R2R

  • Please check here for experiment setup
  • Please check here for PREVALENT application

CVDN

  • Please check here for experiment setup
  • Please check here for PREVALENT application

HANNA

  • Please check here for experiment setup
  • Please check here for PREVALENT application

Citation

If you use this code for your research, please cite our paper:

@article{hao2020prevalent,
  title={Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training},
  author={Hao, Weituo and Li, Chunyuan and Li, Xiujun and Carin, Lawrence and Gao, Jianfeng},
  journal={Conference on Computer Vision and Pattern Recognition (CVPR)},
  year={2020}
}

prevalent's Issues

How to train and use it?

Hi,

Thank you for your work!

Could you please document in the README.md how to train and use the model? For example, how to train with HANNA, CVDN, and R2R. Also, where is the pre-trained model?

Thank you.

Where is your main model?

Hi, thank you for sharing this nice work.
I am interested in your work,
but I can't find your main models.
Where are they?
Also, could you please explain your model training scripts?

question about visualizing the top-down map

Hi, thanks for sharing such great work! I'm trying to visualize the top-down map by following the script in your 3D profile, but the map I get is a black image. Could you please tell me what the problem is?

How to fine tune?

Hello. Is fine-tuning done by feeding the hidden states of the language part of the pretrained model into R2R-EnvDrop as the word embeddings?

Where is the training script?

I really appreciate this work and can't wait to try it. Could you please explain your model training scripts?
