
speedchallenge's Introduction

Given: data/drive.mp4, with 8616 frames in data/IMG. Each frame is 640 (w) x 840 (h) x 3 (RGB). Ground-truth data is given in drive.json as [time, speed] for each of the 8616 frames.
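As a minimal sketch of reading the ground truth, assuming drive.json holds a single JSON array of [time, speed] pairs (the exact file layout is an assumption, and both function names are hypothetical, not taken from the notebooks):

```python
import json


def parse_ground_truth(json_text):
    """Split a JSON array of [time, speed] pairs into two parallel lists."""
    rows = json.loads(json_text)
    times = [float(row[0]) for row in rows]
    speeds = [float(row[1]) for row in rows]
    return times, speeds


def load_ground_truth(path):
    """Read the file at `path` (e.g. data/drive.json) and parse it."""
    with open(path) as handle:
        return parse_ground_truth(handle.read())
```

With 8616 frames, both returned lists should have 8616 entries, one per frame.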

Method 2: trained for 15 epochs (weights: model-weights-Vtest2.h5). MSE: ~5.6

Watch Video Here

Mean Squared Error for v2(15 epochs)

Check out the medium article

TRAIN:

  • VideoToDataset.ipynb (I used this to write the ground truth data to a dataframe and store my images separately, which helped with testing)
  • NvidiaModel-OpticalFlowDense_kerasnew.ipynb (I used this to train the model and report the MSE. I also rendered the processed dataset into a video, which is shown inline as HTML. Notes on how I did certain things are in here.)

TEST: (also found in test_suite.zip)

  • test.py
  • model.py
  • opticalHelpers.py
  • model-weights-Vtest.h5 (trained on 10 epochs, MSE ~ 10)
  • model-weights-Vtest2.h5 (trained on 15 epochs, MSE ~ 5.6) (preloaded)
  • setupstuff.sh

To test the model:

  1. run ./setupstuff.sh - this will create the necessary files and folders (driving_test.csv, test_IMG, test_predict)
  2. set the paths to your own data.json and movie.mp4 files on lines 21 and 22 of test.py
  3. python test.py - this will log the MSE for a given sample size (you pick the sample size on line 14; the weights should already be specified on line 13)
  4. python makeVideo.py - this will create a video with the predicted values overlaid on each image. Feel free to delete the ./data/predict folder after step 4.
  • Requires moviepy
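The MSE that step 3 logs is the mean of the squared differences between predicted and ground-truth speeds. A minimal sketch of that metric, not the code in test.py (the function name is hypothetical):

```python
def mean_squared_error(predicted, actual):
    """Average of squared differences between two equal-length sequences."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("need at least one sample")
    squared_errors = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return sum(squared_errors) / len(squared_errors)
```

Because the errors are squared, a few large misses dominate the score, so an MSE of ~5.6 does not mean every prediction is off by the same amount.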

Strategies:

Dense optical flow network feeding explanation:

  • Method 1: append an angle layer and a magnitude layer along the image's 3rd dimension. In NvidiaModel-OpticalFlowDense I changed my generator to yield (66, 220, 5) images, i.e. Height x Width x (R, G, B, Ang, Mag). The angles and magnitudes are the result of computing the dense optical flow using Farneback parameters. This did not help: my MSE was still ~20 and I did not observe any special results.

  • Method 2: convert the optical flow angles and magnitudes to HSV, then to RGB, and pass that into the network as (66, 220, 3) RGB values.

  • Hyperparameter selection: I trained the model with 400 samples per epoch and a batch size of 32. Therefore I sent ~16,000 images into the generator, resulting in 8k optical flow differentials. I also used the Adam optimizer and ELU activation functions, because they lead to faster convergence.
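Method 1's 5-channel input can be sketched as below. This is an illustration under assumptions, not the generator from the notebook: it takes an already computed flow field of shape (H, W, 2) holding per-pixel (dx, dy), and the function name is hypothetical.

```python
import numpy as np


def stack_flow_channels(rgb, flow):
    """Append angle and magnitude layers to an RGB frame.

    rgb:  (H, W, 3) image
    flow: (H, W, 2) per-pixel displacement (dx, dy)
    Returns a float32 array of shape (H, W, 5): R, G, B, Ang, Mag.
    """
    dx, dy = flow[..., 0], flow[..., 1]
    magnitude = np.hypot(dx, dy)
    # Angle in radians, wrapped into [0, 2*pi)
    angle = np.arctan2(dy, dx) % (2 * np.pi)
    return np.dstack([rgb.astype(np.float32), angle, magnitude]).astype(np.float32)
```

For a (66, 220, 3) frame this yields the (66, 220, 5) shape described in Method 1. Note that the RGB layers and the flow layers sit on very different scales unless they are normalized separately.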

Method 2 was the winner. I guess there was just too much noise when doing a simple image_1 (RGB) - image_2 (RGB) difference. The network held up because I converted the optical flow parameters to an RGB image, as you can see in the above video.
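The flow-to-RGB conversion behind Method 2 can be sketched roughly as follows. The mapping is an assumption based on the common convention (hue from flow angle, full saturation, value from normalized magnitude), and the notebook may scale things differently. The flow field is taken as given, and the function name is hypothetical.

```python
import numpy as np


def flow_to_rgb(flow):
    """Encode a (H, W, 2) flow field as a (H, W, 3) uint8 RGB image."""
    dx, dy = flow[..., 0], flow[..., 1]
    magnitude = np.hypot(dx, dy)
    angle = np.arctan2(dy, dx) % (2 * np.pi)

    # HSV in [0, 1]: hue = direction, saturation = 1, value = relative speed
    h = angle / (2 * np.pi)
    s = np.ones_like(h)
    peak = magnitude.max()
    v = magnitude / peak if peak > 0 else np.zeros_like(magnitude)

    # Standard vectorized HSV -> RGB conversion
    sector = np.floor(h * 6).astype(int) % 6
    f = h * 6 - np.floor(h * 6)
    p = v * (1 - s)
    q = v * (1 - f * s)
    t = v * (1 - (1 - f) * s)
    r = np.choose(sector, [v, q, p, p, t, v])
    g = np.choose(sector, [t, v, v, q, p, p])
    b = np.choose(sector, [p, p, t, v, v, q])
    return (np.dstack([r, g, b]) * 255).round().astype(np.uint8)
```

The result has the same (Height, Width, 3) layout as an ordinary frame, so it can be fed to a network that expects RGB input without changing the input layer.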

Other approaches:

  1. Nvidia Model: PilotNet-based implementation that takes the difference between two images, sends it through a network, and performs regression on the image differences
  2. DeepVO: AlexNet-like implementation that performs parallel convolutions on two images and then merges them later in the pipeline to extract special features between them
  3. DeepFlow: Large displacement optical flow with deep matching
  • I considered using DeepFlow

Implement dense optical flow analysis to get the optical flow for each pixel, as seen in this example.

Architecture Design:

architecture design

Tools used
