
This project forked from lmb-freiburg/multimodal-future-prediction


The official repository for the CVPR 2019 paper "Overcoming Limitations of Mixture Density Networks: A Sampling and Fitting Framework for Multimodal Future Prediction"


Introduction

This repository corresponds to the official source code of the CVPR 2019 paper:

Overcoming Limitations of Mixture Density Networks: A Sampling and Fitting Framework for Multimodal Future Prediction

To get an overview of the method and its results, we highly recommend checking our poster and the short video at [Page]

demo

Requirements

  • Tensorflow-gpu 1.14.
  • opencv-python, sklearn, matplotlib, Pillow (via pip).

Setup

We use the source code from WEMD[1] to compute our SEMD evaluation metric.

  • Extract blitz++.zip under /wemd.
  • cd build
  • cmake ..
  • make

After compilation, you should get a library under /wemd/lib, which is loaded by wemd.py.
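As a rough sketch of how such a compiled library can be bound from Python (the actual binding lives in wemd.py; the directory and library filename below are assumptions for illustration only):

```python
import ctypes
import os

def load_wemd(lib_dir="wemd/lib", lib_name="libwemd.so"):
    """Load the compiled WEMD shared library via ctypes.

    Both lib_dir and lib_name are hypothetical defaults; check wemd.py
    for the actual path and binding used by the repository.
    """
    return ctypes.CDLL(os.path.join(lib_dir, lib_name))
```

The function is only a thin wrapper; calling it requires that the build step above has produced the shared library.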

Data

To reproduce the results in the paper, we provide the processed testing samples from SDD [2]. Please download them from [Link]

After extracting datasets.zip, you will get a set of folders representing the testing scenes. Each scene has the following structure:

  • imgs: contains the images of the scene.
  • floats: for each image, we store a -features.float3 and a -labels.float3 file. The former is a numpy array of shape (1154, 5), which can store up to 1154 annotated objects; each object has 5 components describing its bounding box and class (tl_x, tl_y, br_x, br_y, class_id). The indexes of the objects represent the tracking ids and are given in the -labels.float3 file.
  • scene.txt: each line represents one testing sequence and has the following format: tracking_id img_0,img_1,img_2,img_future.
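To illustrate these conventions, here is a small sketch; the helper below is hypothetical and not part of the repository, and only the scene.txt line format and the (1154, 5) feature layout come from the description above:

```python
import numpy as np

def parse_scene_line(line):
    """Parse one scene.txt line: 'tracking_id img_0,img_1,img_2,img_future'."""
    tracking_id, frames = line.split()
    *history, future = frames.split(",")
    return int(tracking_id), history, future

# A features array holds up to 1154 objects with 5 components each:
# (tl_x, tl_y, br_x, br_y, class_id).
features = np.zeros((1154, 5), dtype=np.float32)
features[0] = [10.0, 20.0, 50.0, 80.0, 1.0]  # hypothetical object at index 0
tl_x, tl_y, br_x, br_y, class_id = features[0]

tid, history, future = parse_scene_line("7 img_0.jpg,img_1.jpg,img_2.jpg,img_future.jpg")
```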

Additionally, we provide the processed SDD training data, which can be downloaded from [Link]

Models

We provide the final trained model for our EWTAD-MDF. Please download it from [Link]

Testing

To test our EWTAD-MDF, you can run:

python test.py --output

  • --output: writes the output files to disk under the path specified in config.py (OUTPUT_FOLDER_FLN). If you only need the testing accuracies without writing files (much faster), simply omit --output.

Training

We additionally provide the loss functions used when training our sampling-fitting network; please check the net.py file for more details.

CPI Dataset

We also provide the script to generate our CPI (Car Pedestrian Interaction) synthetic dataset. To generate the training dataset, you can run:

cd CPI/
python CPI-train.py output_folder n_scenes history n_gts dist

  • output_folder: local folder in which to store the generated dataset
  • n_scenes: number of scenes to generate; each scene corresponds to one training sample (we use 20000)
  • history: length of the history, i.e. the number of images used as input (we use 3)
  • n_gts: number of ground truths of the future (we use 20)
  • dist: the prediction horizon (we use 20)
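The positional arguments above can be assembled programmatically, for example with a small hypothetical helper that defaults to the values recommended above:

```python
def cpi_train_cmd(output_folder, n_scenes=20000, history=3, n_gts=20, dist=20):
    """Build the CPI-train.py command line with the recommended argument values."""
    return ["python", "CPI-train.py", output_folder,
            str(n_scenes), str(history), str(n_gts), str(dist)]

cmd = cpi_train_cmd("cpi_training_dataset")
# → ['python', 'CPI-train.py', 'cpi_training_dataset', '20000', '3', '20', '20']
```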

Similarly, the testing dataset can be generated using:

python CPI-test.py cpi_testing_dataset 54 3 1000 20

Citation

If you use our repository or find it useful in your research, please cite the following paper:

@InProceedings{MICB19,
  author       = "O. Makansi and E. Ilg and {\"O}. {\c{C}}i{\c{c}}ek and T. Brox",
  title        = "Overcoming Limitations of Mixture Density Networks: A Sampling and Fitting Framework for Multimodal Future Prediction",
  booktitle    = "IEEE Conference on Computer Vision and Pattern Recognition (CVPR)",
  month        = " ",
  year         = "2019",
  url          = "http://lmb.informatik.uni-freiburg.de/Publications/2019/MICB19"
}

References

[1] S. Shirdhonkar and D. W. Jacobs. Approximate earth mover's distance in linear time. In 2008 IEEE Conference on Computer Vision and Pattern Recognition, pages 1–8, June 2008.

[2] A. Robicquet, A. Sadeghian, A. Alahi, and S. Savarese. Learning Social Etiquette: Human Trajectory Prediction in Crowded Scenes. In European Conference on Computer Vision (ECCV), 2016.

License


This source code is shared under the CC-BY-NC-SA license; please refer to the LICENSE file for more information.

This source code is shared only for R&D or for evaluation of this model on user databases.

Any commercial use is strictly forbidden.

For any use with a commercial goal, please contact contact_cs or bendahan

Contributors

  • os1a