
plate-pytorch's Introduction

PlaTe: Visually-Grounded Planning with Transformers in Procedural Tasks

[ 📺 Website | 🏗 GitHub Repo | 🎓 Paper ]

Requirements

Data Preparation

Follow this folder structure to prepare the dataset:

.
└── crosstask
    ├── crosstask_features
    └── crosstask_release
        ├── tasks_primary.txt
        ├── videos.csv
        └── videos_val.csv

The data root is set in train_gpt.py.
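Before launching training, it can help to confirm that the data root matches the layout above. The following is a minimal sketch, not part of the repository: the helper name `missing_dataset_paths` and the `EXPECTED_PATHS` list are hypothetical, and the paths are copied from the folder structure shown here.

```python
from pathlib import Path

# Relative paths copied from the folder structure in this README.
# The list name and the helper below are illustrative, not part of the repo.
EXPECTED_PATHS = [
    "crosstask/crosstask_features",
    "crosstask/crosstask_release/tasks_primary.txt",
    "crosstask/crosstask_release/videos.csv",
    "crosstask/crosstask_release/videos_val.csv",
]


def missing_dataset_paths(data_root):
    """Return the expected paths that do not exist under data_root."""
    root = Path(data_root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]
```

Calling `missing_dataset_paths(".")` from the directory that contains `crosstask` should return an empty list when the dataset is laid out as described.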

How to Run

conda env create -f environment.yml
bash srun.sh

Acknowledgement

We greatly appreciate the following GitHub repos for their valuable code bases: joaanna/something_else and karpathy/minGPT.

Citation

@ARTICLE{PlaTe_RAL_2022,  
author={Sun, Jiankai and Huang, De-An and Lu, Bo and Liu, Yun-Hui and Zhou, Bolei and Garg, Animesh},  
journal={IEEE Robotics and Automation Letters},   
title={PlaTe: Visually-Grounded Planning With Transformers in Procedural Tasks},  
year={2022},  volume={7},  number={2},  pages={4924-4930},  
doi={10.1109/LRA.2022.3150855}}


plate-pytorch's Issues

About adding label with index 0 for some examples

In line 437 below, you seem to append a label with index 0 for videos whose number of steps is less than the maximum (3 or 4). As far as I can tell, index 0 is not reserved for a special action. Moreover, there is no leading 0 for examples whose number of steps is greater than the maximum trajectory length (3 or 4). May I know the reason for adding the label with index 0 to some examples but not others?
Thanks!

if len(labels_matrix) > self.args.max_traj_len:
    idx = np.random.randint(
        0, len(labels_matrix) - self.args.max_traj_len)
else:
    idx = 0
frames = []
for i in range(self.args.max_traj_len):
    frames.extend(
        images[min(idx + i, len(images) - 1)])  # goal
frames = torch.tensor(frames)
labels = []
if idx - 1 < 0:
    labels.append([0])
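To make the condition in question easier to reason about, here is a small standalone sketch of just the index sampling and the `idx - 1 < 0` check from the snippet above. It is an illustration under assumptions, not the repository's code: the function names are hypothetical, the standard-library `random.randrange` stands in for `np.random.randint` (both exclude the upper bound), and whatever follows `labels.append([0])` in the original file is not shown and therefore not modelled.

```python
import random


def sample_start_index(num_steps, max_traj_len, rng=random):
    """Mirror the start-index choice in the quoted snippet (hypothetical helper)."""
    if num_steps > max_traj_len:
        # Half-open range [0, num_steps - max_traj_len), like np.random.randint.
        return rng.randrange(0, num_steps - max_traj_len)
    return 0


def gets_leading_zero(idx):
    """True exactly when the snippet would run labels.append([0])."""
    return idx - 1 < 0
```

Under this reading, the 0 label is prepended whenever the window starts at index 0. That is always the case for videos with at most `max_traj_len` steps, and it can also happen for longer videos when the sampled start index is 0.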

What is the `bbox` folder supposed to contain for crosstask dataset?

Hi! Thanks for providing the code for your paper. I was trying to run the code after downloading the crosstask dataset. However, I did not find any folder called bbox_v0 or bbox in the dataset. May I know how I could get this folder? I tried looking at the comments in the code as well as the paper itself but I could not find any mention of such a folder. Any help in this regard would be greatly appreciated.

Question on the visual representation

Hi, I carefully walked through the code and found that the visual feature is pooled into 1 dimension, so the prediction is based only on the previous language (action) rather than on the visual state. I just want to check whether this is intended. Thanks.
