
anuragranj / spynet


Spatial Pyramid Network for Optical Flow

License: Other

Lua 55.74% CMake 2.12% Cuda 27.01% C 15.13%
deep-learning convolutional-networks optical-flow spatial-pyramid-network spynet

spynet's Introduction

SPyNet: Spatial Pyramid Network for Optical Flow

This code is based on the paper Optical Flow Estimation using a Spatial Pyramid Network.

[Unofficial PyTorch version] [Unofficial TensorFlow version]

First things first

You need to have Torch.

Install other required packages

cd extras/spybhwd
luarocks make
cd ../stnbhwd
luarocks make

For Easy Usage, follow this

Set up SPyNet

spynet = require('spynet')
easyComputeFlow = spynet.easy_setup()

Load images and compute flow

im1 = image.load('samples/00001_img1.ppm' )
im2 = image.load('samples/00001_img2.ppm' )
flow = easyComputeFlow(im1, im2)

To save your flow fields to a .flo file use flowExtensions.writeFLO.

For Fast Performance, follow this (recommended)

Set up SPyNet

Set up SPyNet according to the image size and model. For best performance, resize your images so that the width and height are multiples of 32. You can also choose a model. The currently supported models are the fine-tuned models sintelFinal (default), sintelClean, and kittiFinal, and the base models chairsFinal and chairsClean.

spynet = require('spynet')
computeFlow = spynet.setup(512, 384, 'sintelFinal')    -- for 384x512 images

Now you can call computeFlow anytime to estimate optical flow between image pairs.
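
The resizing advice above can be sketched in a few lines. The following Python snippet is illustrative only and is not part of the repository; the helper names round_up_to_multiple and network_size are hypothetical. It rounds each dimension up to the next multiple of 32 and reports the factors by which the x and y flow components would have to be multiplied once the flow field is resized back to the original resolution.

```python
def round_up_to_multiple(value, base=32):
    """Return the smallest multiple of `base` that is >= `value`."""
    return -(-int(value) // base) * base  # ceiling division with integers


def network_size(width, height, base=32):
    """Pick a network input size and the factors that map flow back.

    Returns (net_width, net_height, scale_x, scale_y). After the flow field
    is resized from the network size to the original size, its x component
    should be multiplied by scale_x and its y component by scale_y.
    """
    net_w = round_up_to_multiple(width, base)
    net_h = round_up_to_multiple(height, base)
    return net_w, net_h, width / net_w, height / net_h


print(network_size(960, 720))
```

For a 960x720 input this gives a 960x736 network size, since 720 is not a multiple of 32. Rounding up rather than to the nearest multiple is a design choice of this sketch.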

Computing flow

Load an image pair, stack the two images along the channel dimension, and normalize the result.

im1 = image.load('samples/00001_img1.ppm' )
im2 = image.load('samples/00001_img2.ppm' )
im = torch.cat(im1, im2, 1)
im = spynet.normalize(im)

SPyNet works with batches of data on CUDA. So, compute flow using

im = im:resize(1, im:size(1), im:size(2), im:size(3)):cuda()
flow = computeFlow(im)

You can also use batch mode. If your images im form a tensor of size Bx6xHxW, where B is the batch size and the 6 channels hold an RGB image pair, you can pass it directly:

flow = computeFlow(im)
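
The tensor layout described above may be easier to see with explicit shapes. The sketch below uses NumPy rather than Torch, with random arrays standing in for loaded images. It only illustrates the 6xHxW stacking and the Bx6xHxW batch layout, and it omits normalization because the constants used by spynet.normalize are not given here.

```python
import numpy as np

H, W = 384, 512
rng = np.random.default_rng(0)

# Stand-ins for two RGB frames in channel-first layout (3xHxW, values in [0, 1)).
im1 = rng.random((3, H, W), dtype=np.float32)
im2 = rng.random((3, H, W), dtype=np.float32)

# Stack the pair along the channel axis: 3xHxW + 3xHxW -> 6xHxW.
pair = np.concatenate([im1, im2], axis=0)

# A single pair becomes a batch of size 1: 1x6xHxW.
single = pair[np.newaxis]

# Several pairs stack along a new leading axis: Bx6xHxW (here B = 3).
batch = np.stack([pair, pair, pair], axis=0)

print(pair.shape, single.shape, batch.shape)
```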

Training

Training sequentially is faster than training end-to-end, since only a small number of parameters has to be learned at each level. To train level N, we need the trained models at levels 1 to N-1. The level-N model is also initialized with the pretrained model at level N-1.

E.g., to train level 3, we need the trained models at L1 and L2, and we initialize it with modelL2_3.t7.

th main.lua -fineWidth 128 -fineHeight 96 -level 3 -netType volcon \
-cache checkpoint -data FLYING_CHAIRS_DIR \
-L1 models/modelL1_3.t7 -L2 models/modelL2_3.t7 \
-retrain models/modelL2_3.t7
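
To see how the level number relates to image size, the sketch below lists a resolution for each pyramid level. It assumes that each level doubles the width and height of the previous one and that there are five levels. This is consistent with the 128x96 size passed for level 3 in the command above and with the 512x384 full-resolution example, but the doubling rule, the level count, and the helper name pyramid_sizes are assumptions, not taken from main.lua.

```python
def pyramid_sizes(fine_width, fine_height, level, num_levels=5):
    """List (level, width, height), assuming sizes double from level to level.

    `fine_width` and `fine_height` are the sizes at `level`; all other levels
    are derived from them. The doubling rule is an assumption of this sketch.
    """
    sizes = []
    for lvl in range(1, num_levels + 1):
        factor = 2.0 ** (lvl - level)  # > 1 above the given level, < 1 below it
        sizes.append((lvl, int(fine_width * factor), int(fine_height * factor)))
    return sizes


for lvl, w, h in pyramid_sizes(128, 96, level=3):
    print(f"level {lvl}: {w}x{h}")
```

For the command above this lists 32x24, 64x48, 128x96, 256x192 and 512x384 for levels 1 to 5.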

End2End SPyNet

The end-to-end version of SPyNet is easily trainable and is available at anuragranj/end2end-spynet.

Optical Flow Utilities

We provide flowExtensions.lua containing various functions to make your life easier with optical flow while using Torch/Lua. You can just copy this file into your project directory and use it off the shelf.

flowX = require 'flowExtensions'

[flow_magnitude] flowX.computeNorm(flow_x, flow_y)

Given flow_x and flow_y of size MxN each, evaluate flow_magnitude of size MxN.

[flow_angle] flowX.computeAngle(flow_x, flow_y)

Given flow_x and flow_y of size MxN each, evaluate flow_angle of size MxN in degrees.
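
As a rough illustration of what these two quantities mean, here is a NumPy sketch. It is not the Lua implementation: the function names are hypothetical, and the angle convention (arctan2 of y over x, reported in the range [-180, 180]) is an assumption, since the convention used by flowX.computeAngle is not stated here.

```python
import numpy as np


def compute_norm(flow_x, flow_y):
    """Per-pixel flow magnitude, same MxN shape as the inputs."""
    return np.hypot(flow_x, flow_y)


def compute_angle(flow_x, flow_y):
    """Per-pixel flow direction in degrees (assumed convention: arctan2(y, x))."""
    return np.degrees(np.arctan2(flow_y, flow_x))


flow_x = np.array([[3.0, 0.0], [-1.0, 1.0]])
flow_y = np.array([[4.0, 2.0], [0.0, 1.0]])
print(compute_norm(flow_x, flow_y))
print(compute_angle(flow_x, flow_y))
```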

[rgb] flowX.field2rgb(flow_magnitude, flow_angle, [max], [legend])

Given flow_magnitude and flow_angle of size MxN each, return an image of size 3xMxN for visualizing optical flow. max(optional) specifies maximum flow magnitude and legend(optional) is boolean that prints a legend on the image.

[rgb] flowX.xy2rgb(flow_x, flow_y, [max])

Given flow_x and flow_y of size MxN each, return an image of size 3xMxN for visualizing optical flow. max(optional) specifies maximum flow magnitude.
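
A common way to build such a visualization is to map direction to hue and magnitude to saturation. The NumPy sketch below shows that idea under stated assumptions: the function name flow_to_rgb is hypothetical, and the actual color wheel used by flowX.xy2rgb may differ from this plain HSV mapping.

```python
import numpy as np


def flow_to_rgb(flow_x, flow_y, max_mag=None):
    """Map an MxN flow field to a 3xMxN RGB image with values in [0, 1].

    Hue encodes direction, saturation encodes magnitude, value is fixed at 1,
    so zero motion is white and the largest motion is fully saturated.
    """
    mag = np.hypot(flow_x, flow_y)
    if max_mag is None:
        max_mag = mag.max()
    max_mag = max(float(max_mag), 1e-8)  # avoid dividing by zero for static fields

    hue = (np.degrees(np.arctan2(flow_y, flow_x)) % 360.0) / 60.0  # sector units
    chroma = np.clip(mag / max_mag, 0.0, 1.0)  # equals saturation because value = 1

    # Standard HSV -> RGB conversion, evaluated per hue sector.
    x = chroma * (1.0 - np.abs(hue % 2.0 - 1.0))
    zero = np.zeros_like(hue)
    sector = np.floor(hue).astype(int) % 6
    r = np.choose(sector, [chroma, x, zero, zero, x, chroma])
    g = np.choose(sector, [x, chroma, chroma, x, zero, zero])
    b = np.choose(sector, [zero, zero, x, chroma, chroma, x])
    offset = 1.0 - chroma  # lifts every pixel so that max(r, g, b) == 1
    return np.stack([r + offset, g + offset, b + offset], axis=0)


fx = np.array([[1.0, 0.0], [-1.0, 0.0]])
fy = np.array([[0.0, 0.0], [0.0, 1.0]])
rgb = flow_to_rgb(fx, fy)
print(rgb.shape)
```

Passing an explicit max_mag keeps the color scale comparable across frames, which appears to be the purpose of the optional max argument described above.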

[flow] flowX.loadFLO(filename)

Reads a .flo file. Loads x and y components of optical flow in a 2 channel 2xMxN optical flow field. First channel stores x component and second channel stores y component.

flowX.writeFLO(filename,F)

Write a 2xMxN flow field F containing x and y components of its flow fields in its first and second channel respectively to filename, a .flo file.
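
For readers working outside Torch, the .flo layout can be reproduced with NumPy. The sketch below follows the Middlebury .flo format (a float32 check value of 202021.25, then int32 width and height, then interleaved float32 x and y values in row order, all little-endian). The function names write_flo and load_flo are hypothetical, and this is not the code in flowExtensions.lua.

```python
import numpy as np

FLO_MAGIC = 202021.25  # float32 check value at the start of a .flo file


def write_flo(filename, flow):
    """Write a 2xMxN flow field (x in channel 0, y in channel 1) to a .flo file."""
    flow = np.asarray(flow, dtype=np.float32)
    if flow.ndim != 3 or flow.shape[0] != 2:
        raise ValueError("expected a 2xMxN array")
    _, height, width = flow.shape
    with open(filename, "wb") as f:
        np.array([FLO_MAGIC], dtype="<f4").tofile(f)
        np.array([width, height], dtype="<i4").tofile(f)
        # Pixels are stored row by row with x and y interleaved (MxNx2).
        np.ascontiguousarray(flow.transpose(1, 2, 0)).astype("<f4").tofile(f)


def load_flo(filename):
    """Read a .flo file and return the flow as a 2xMxN float32 array."""
    with open(filename, "rb") as f:
        magic = np.fromfile(f, dtype="<f4", count=1)
        if magic.size != 1 or magic[0] != np.float32(FLO_MAGIC):
            raise ValueError("not a valid .flo file")
        width, height = (int(v) for v in np.fromfile(f, dtype="<i4", count=2))
        data = np.fromfile(f, dtype="<f4", count=2 * width * height)
    if data.size != 2 * width * height:
        raise ValueError("file is shorter than its header claims")
    return data.reshape(height, width, 2).transpose(2, 0, 1).astype(np.float32)
```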

[flow] flowX.loadPFM(filename)

Reads a .pfm file. Loads x and y components of optical flow in a 2 channel 2xMxN optical flow field. First channel stores x component and second channel stores y component.
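
A comparable reader can be sketched in NumPy. The PFM layout used here is the usual one: a "PF" (3-channel) or "Pf" (1-channel) header line, a line with width and height, a scale line whose sign gives the byte order, then float32 rows stored bottom to top. Treating the first two channels as the x and y flow components is an assumption of this sketch, and load_pfm_flow is a hypothetical name, not the Lua function.

```python
import numpy as np


def load_pfm_flow(filename):
    """Read a .pfm file and return its first two channels as a 2xMxN flow field."""
    with open(filename, "rb") as f:
        kind = f.readline().decode("ascii").strip()
        if kind == "PF":
            channels = 3
        elif kind == "Pf":
            channels = 1
        else:
            raise ValueError("not a PFM file")
        width, height = (int(v) for v in f.readline().decode("ascii").split())
        scale = float(f.readline().decode("ascii").strip())
        dtype = "<f4" if scale < 0 else ">f4"  # negative scale marks little-endian data
        data = np.frombuffer(f.read(width * height * channels * 4), dtype=dtype)
    if channels < 2:
        raise ValueError("a flow field needs at least two channels")
    # PFM stores rows bottom to top, so flip to the usual top-to-bottom order.
    image = np.flipud(data.reshape(height, width, channels))
    return image[:, :, :2].transpose(2, 0, 1).astype(np.float32)
```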

[flow_rotated] flowX.rotate(flow, angle)

Rotates flow of size 2xMxN by angle in radians. Uses nearest-neighbor interpolation to avoid blurring at boundaries.

[flow_scaled] flowX.scale(flow, sc, [opt])

Scales flow of size 2xMxN by sc times. opt(optional) specifies interpolation method, simple (default), bilinear, and bicubic.

[flowBatch_scaled] flowX.scaleBatch(flowBatch, sc)

Scales flowBatch of size Bx2xMxN, a batch of B flow fields by sc times. Uses nearest-neighbor interpolation.
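
Scaling a flow field involves two things: resampling the grid and rescaling the vectors so they remain valid pixel displacements at the new resolution. The NumPy sketch below shows both with nearest-neighbor sampling. Whether flowX.scale multiplies the vector values in exactly this way is an assumption, and the function names are hypothetical.

```python
import numpy as np


def scale_flow(flow, sc):
    """Resize a 2xMxN flow field by `sc` with nearest-neighbor sampling.

    The vectors are multiplied by `sc` as well, so they still describe
    displacements in pixels of the resized grid.
    """
    _, height, width = flow.shape
    new_h = max(1, int(round(height * sc)))
    new_w = max(1, int(round(width * sc)))
    # Source index for each output pixel, aligned at pixel centers.
    rows = np.clip(((np.arange(new_h) + 0.5) / sc).astype(int), 0, height - 1)
    cols = np.clip(((np.arange(new_w) + 0.5) / sc).astype(int), 0, width - 1)
    return flow[:, rows[:, None], cols[None, :]] * sc


def scale_flow_batch(flow_batch, sc):
    """Apply scale_flow to every field in a Bx2xMxN batch."""
    return np.stack([scale_flow(flow, sc) for flow in flow_batch], axis=0)


flow = np.ones((2, 2, 3), dtype=np.float32)
print(scale_flow(flow, 2).shape)
```

Nearest-neighbor sampling avoids averaging vectors across motion boundaries, at the cost of blocky results when upscaling.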

Timing Benchmarks

Our timing benchmark is set up on the Flying Chairs dataset. To run it, you need to download the dataset:

wget http://lmb.informatik.uni-freiburg.de/resources/datasets/FlyingChairs/FlyingChairs.zip

Run the timing benchmark

th timing_benchmark.lua -data YOUR_FLYING_CHAIRS_DATA_DIRECTORY

References

  1. Our warping code is based on qassemoquab/stnbhwd.
  2. The images in samples are from Flying Chairs dataset: Dosovitskiy, Alexey, et al. "Flownet: Learning optical flow with convolutional networks." 2015 IEEE International Conference on Computer Vision (ICCV). IEEE, 2015.
  3. Some parts of flowExtensions.lua are adapted from marcoscoffier/optical-flow with help from fguney.
  4. The unofficial PyTorch implementation is from sniklaus.

License

Free for non-commercial and scientific research purposes. For commercial use, please contact [email protected]. Check LICENSE file for details.

When using this code, please cite

Ranjan, Anurag, and Michael J. Black. "Optical Flow Estimation using a Spatial Pyramid Network." arXiv preprint arXiv:1611.00850 (2016).

spynet's People

Contributors

anuragranj, tkoham


spynet's Issues

Performance of FlowNetS with ft* on KITTI

@anuragranj, thanks for your nice work. I have a question about the performance of FlowNetS on KITTI if it is also fine-tuned with the additional data used by SPyNet, as shown in Table 1.

Will FlowNetS fine-tuned with the additional Driving and Monkaa datasets perform better than SPyNet?

The max flow that the model can predict

Is there a limit on the maximum flow that the model can predict? Specifically, if the motion is larger than some value (perhaps the largest flow in the training set), can the model still give a good prediction? If there is such a maximum, can you tell me its value?

Training on a different dataset

Hi anuragranj,

I'm trying to use spynet on a personal dataset, but I'm having issues with the training part. I converted my data to the ppm and flo formats and renamed the files to match the structure of the Flying Chairs dataset, but when running

th main.lua -fineWidth 128 -fineHeight 96 -level 3 -netType volcon \
-cache checkpoint -data myDir \
-L1 models/modelL1_3.t7 -L2 models/modelL2_3.t7 \
-retrain models/modelL2_3.t7

it fails trying to locate a file not in the dataset:

~/spyNetTrainImages/00666_img1.ppm: No such file or directory

I would like to know if you could give me advice on how to train spynet on a different dataset

Thank you for your attention,
Best regards.

Training on a custom dataset

Hello,

Thanks for sharing your code. I'm new to the concept of optical flow and learned a lot from your paper.

I would like to train your model on a custom biometric dataset, e.g. faces, but I don't have ground-truth flow. Do you know of any unsupervised optical flow model?

Thanks for your help

Training time reported in your paper

I read your paper, SPyNet, and I am now trying to implement your model in PyTorch.
I understand that you trained the network G0 for three days and that it takes a day to train each of the other sub-networks. As written in your paper, the sub-network G0 takes 24x32 images, so I would expect this model to train quickly.
What is your opinion on this?

Error: bad argument #1 to 'copy' (sizes do not match)

I want to compute flow according to github, my code looks like this:

spynet = require('spynet')
flowX = require 'flowExtensions'
computeFlow = spynet.setup(960, 720, 'sintelFinal')
img1 = '/data/deblur/model/testdata/20200116_2/01/960x720/undistort/Sony/00010.png'
img2 = '/data/deblur/model/testdata/20200116_2/01/960x720/undistort/Sony/00011.png'
im1 = image.load(img1 )
im2 = image.load(img2 )
im = torch.cat(im1, im2, 1)
print(#im)
im = spynet.normalize(im)
im = im:resize(1, im:size(1), im:size(2), im:size(3)):cuda()
print(#im)
flow = computeFlow(im)
print(#flow)
flow = flow:resize(flow:size(2), flow:size(3), flow:size(4)):float()
print(#flow)
flowX.writeFLO('Sony.flo', flow)

But it reports error like this:
/home/torch/install/bin/luajit: ./spynet.lua:136: bad argument #1 to 'copy' (sizes do not match at /home/torch/extra/cutorch/lib/THC/THCTensorCopy.cu:31)
stack traceback:
[C]: in function 'copy'
./spynet.lua:136: in function 'computeInitFlowL2'
./spynet.lua:147: in function 'computeInitFlowL3'
./spynet.lua:160: in function 'computeInitFlowL4'
./spynet.lua:173: in function 'computeInitFlowL5'
./spynet.lua:187: in function 'computeFlow'
opt_zs.lua:13: in main chunk
[C]: in function 'dofile'
...time/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:150: in main chunk
[C]: at 0x00405d50

How to build the files in your extras folder in PyTorch?

Hi, I want to implement your method in PyTorch, but I find that the files under the "extra" folder contain implementations of many different layers. Could you share a guide on how to port all the operations to PyTorch?

Does the image format have to be “ppm”?

In your example the image format is 'ppm'. Is that essential? I ask because, when using the torch 'image' package to load images, the range of a ppm image is [0,256] while that of a png file is [0,1].

Compilation on a gcc 6 toolchain?

running luarocks make spits out errors:

In file included from /opt/cuda/include/cuda_runtime.h:78:0,
from <command-line>:0:
/opt/cuda/include/host_config.h:119:2: error: #error -- unsupported GNU version! gcc versions later than 5 are not supported!
#error -- unsupported GNU version! gcc versions later than 5 are not supported!
^~~~~
CMake Error at cuspy_generated_init.cu.o.cmake:207 (message):
Error generating
/home/tyson/Git/spynet/extras/spybhwd/build/CMakeFiles/cuspy.dir//./cuspy_generated_init.cu.o

make[2]: *** [CMakeFiles/cuspy.dir/build.make:65: CMakeFiles/cuspy.dir/cuspy_generated_init.cu.o] Error 1
make[1]: *** [CMakeFiles/Makefile2:104: CMakeFiles/cuspy.dir/all] Error 2
make: *** [Makefile:128: all] Error 2

Error: Build error: Failed building.

I've tried setting the environment variable via

export EXTRA_NVCCFLAGS="-Xcompiler -std=c++98"

with no success

error in spynet.easycomputeflow

Hi
I tried to use this function on images of type torch.CudaTensor, but I got an error at imgs = image.scale(imgs, fineWidth, fineHeight).
How can I solve this problem?
Thank you in advance for your answer.

Random rotation in training

Hi, excellent paper! I am reading your code. I see you have only random scale, random noise, random crop, and random color as data augmentation in donkey.lua. But you mentioned random rotation in your paper. Can you show an example of how to use it? I am mostly wondering how you handle the flow with random rotation.

module 'spynet' not found : No LuaRocks module found for spynet

I have installed the required packages : 'spybhwd' and 'stnbhwd'

But, when I run the command 'spynet = require('spynet')' in torch, it goes wrong. Here's the message:

th> spynet = require('spynet')
/home/liran/torch/install/share/lua/5.1/trepl/init.lua:389: module 'spynet' not found:No LuaRocks module found for spynet
no field package.preload['spynet']
no file '/home/liran/.luarocks/share/lua/5.1/spynet.lua'
no file '/home/liran/.luarocks/share/lua/5.1/spynet/init.lua'
no file '/home/liran/torch/install/share/lua/5.1/spynet.lua'
no file '/home/liran/torch/install/share/lua/5.1/spynet/init.lua'
no file './spynet.lua'
no file '/home/liran/torch/install/share/luajit-2.1.0-beta1/spynet.lua'
no file '/usr/local/share/lua/5.1/spynet.lua'
no file '/usr/local/share/lua/5.1/spynet/init.lua'
no file '/home/liran/.luarocks/lib/lua/5.1/spynet.so'
no file '/home/liran/torch/install/lib/lua/5.1/spynet.so'
no file '/home/liran/torch/install/lib/spynet.so'
no file './spynet.so'
no file '/usr/local/lib/lua/5.1/spynet.so'
no file '/usr/local/lib/lua/5.1/loadall.so'
stack traceback:
[C]: in function 'error'
/home/liran/torch/install/share/lua/5.1/trepl/init.lua:389: in function 'require'
[string "spynet = require('spynet')"]:1: in main chunk
[C]: in function 'xpcall'
/home/liran/torch/install/share/lua/5.1/trepl/init.lua:679: in function 'repl'
...iran/torch/install/lib/luarocks/rocks/trepl/scm-1/bin/th:199: in main chunk
[C]: at 0x00406670

reconstruct second image with flow

Hi
I tried to reconstruct the second image from the flow computed by spynet, but the reconstructed image was similar to the first image rather than the second, and the motion was not reconstructed.

spynet = require('spynet')
easyComputeFlow = spynet.easy_setup()
im1 = image.load('samples/00001_img1.ppm' )
im2 = image.load('samples/00001_img2.ppm' )
flow = easyComputeFlow(im1, im2)
im3=image.warp(im1, flow)

Your answer is appreciated in advance.

Some questions about the flo result.

Hi, thanks for your work, I intend to apply it to my job and I have some questions to confirm.
Which image is the optical flow referenced to in the code? im1 + flow = im2 or im2 + flow = img1?
The two channels in the flow result represent horizontal(cols) and vertical(rows) translation respectively?
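
The direction question above can be made concrete with a toy example. The NumPy sketch below assumes the common convention that the flow describes motion from the first frame to the second, with channel 0 holding the horizontal (column) displacement and channel 1 the vertical (row) displacement. That matches the 2xMxN layout described for flowExtensions, but whether the output of easyComputeFlow follows exactly this convention is not confirmed here. Under that assumption, sampling the second image at positions displaced by the flow reconstructs the first image. The function name backward_warp is hypothetical.

```python
import numpy as np


def backward_warp(image, flow):
    """Sample `image` (CxHxW) at positions displaced by `flow` (2xHxW).

    output[:, y, x] = image[:, y + flow_y, x + flow_x], using nearest-neighbor
    sampling and clamping coordinates at the image border.
    """
    _, height, width = image.shape
    ys, xs = np.mgrid[0:height, 0:width]
    src_x = np.clip(np.rint(xs + flow[0]).astype(int), 0, width - 1)
    src_y = np.clip(np.rint(ys + flow[1]).astype(int), 0, height - 1)
    return image[:, src_y, src_x]


# Toy data: frame 2 is frame 1 shifted 2 pixels to the right, so the assumed
# flow from frame 1 to frame 2 is (+2, 0) at every pixel.
im1 = np.arange(3 * 4 * 6, dtype=np.float32).reshape(3, 4, 6)
im2 = np.roll(im1, shift=2, axis=2)
flow = np.zeros((2, 4, 6), dtype=np.float32)
flow[0] = 2.0

reconstructed = backward_warp(im2, flow)
# Columns whose source position stays inside the image match frame 1.
print(np.array_equal(reconstructed[:, :, :4], im1[:, :, :4]))
```

In this convention, warping the first image with the same flow does not produce the second image, which may be related to the behavior reported in the reconstruction issue above.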
