
jihyongoh / xvfi

Stars: 273 · Watchers: 12 · Forks: 39 · Size: 114.36 MB

[ICCV 2021, Oral 3%] Official repository of XVFI

Python 99.14% MATLAB 0.86%
video-frame-interpolation frame-interpolation convolutional-neural-networks pytorch deep-learning extreme-video-frame-interpolatioin dataset 4k-frame iccv iccv2021

xvfi's Introduction

Hi there 👋


  • 👨🏻‍💻 I am currently an assistant professor at CMLab (Creative Vision and Multimedia Lab.) in Chung-Ang Univ. (CAU).
  • 👨🏻‍⚕️ Please visit my personal homepage (here).
  • 🔬 I primarily focus on a variety of deep-learning-based Computer Vision research areas, such as:
   Neural Radiance Fields (NeRF)
   Video Frame Interpolation / Super Resolution / Deblurring / Colorization
   Optical Flow Estimation
   Computational Photography
   SDR-to-HDR Inverse Tone Mapping
   Generative AI: Diffusion Models, GANs
   GAN/CNN-based Synthetic Aperture Radar (SAR) Target Recognition/Generation
  • 💻 If you are interested in collaborating with me, please don't hesitate to send an email to the address below.
  • 📧 Contact: [email protected]

xvfi's People

Contributors

hjsim, jihyongoh, kaist-viclab


xvfi's Issues

Question on training scheme

Hi,

In Section 5.1, when you compare results for 8x interpolation, does this mean that the interpolation is performed from (1000/8 =) 125 FPS to 1000 FPS? And what does 8x mean on the Adobe dataset: does it mean 30 FPS to 240 FPS?

Also, even though the videos in your dataset may be 1000 FPS, you still seem to sample them at 25 FPS: a one-second clip from a dataset video gives 32 frames, while it should give around 1000 frames for a 1000 FPS video. Can you clarify this part?
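(For reference, a minimal sketch in Python of the frame-rate arithmetic behind such an "8x" setting, under the assumption that a multiple of 8 means seven new frames are synthesized between each input pair:)

# Illustrative arithmetic only; the interpretation of "8x" is an assumption.
input_fps = 1000 / 8                                     # 125 FPS input stream
multiple = 8
output_fps = input_fps * multiple                        # back to 1000 FPS
timestamps = [i / multiple for i in range(1, multiple)]  # t = 1/8, 2/8, ..., 7/8
print(output_fps, timestamps)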

A dataset question

I have downloaded the X4K1000FPS dataset from your link. However, the data in the subfolders is in .mp4 format, while your readme.md shows it in .png format. How can I convert the dataset from .mp4 to .png?
The error is the following:
[screenshot]
I also found that the module that builds the training set doesn't work:
[screenshot]
When printing sample_paths, it shows 'sample_paths=[]', which means the list is empty.
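(For reference, a minimal sketch of extracting .png frames from the .mp4 clips with OpenCV. This is not the repository's official preprocessing script, and the folder names below are assumptions:)

# Hypothetical frame-extraction sketch (not the repository's official
# preprocessing); the "encoded_train" and "train" folder names are assumptions.
import os
import glob
import cv2

src_root = "encoded_train"          # folder containing the downloaded .mp4 clips
dst_root = "train"                  # folder that will hold per-clip .png frames

for mp4_path in glob.glob(os.path.join(src_root, "**", "*.mp4"), recursive=True):
    clip_dir = os.path.join(dst_root, os.path.splitext(os.path.basename(mp4_path))[0])
    os.makedirs(clip_dir, exist_ok=True)
    cap = cv2.VideoCapture(mp4_path)
    idx = 0
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        cv2.imwrite(os.path.join(clip_dir, f"{idx:04d}.png"), frame)
        idx += 1
    cap.release()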

Code release

Wonderful idea!
When will the code be released? Inference code alone would be enough at first.

Questions about Shared Parameters

Hi! Congratulations!
I have a question about why you share parameters between those sub-networks. Are there any motivations other than reducing the number of parameters, or perhaps some theory, explanation, or experiments on it?
I would appreciate it if you could reply as soon as you can.
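(For readers unfamiliar with the idea, a generic sketch of cross-scale parameter sharing in PyTorch. It illustrates the general technique only and is not the XVFI architecture:)

# Generic cross-scale weight sharing: one sub-network is reused at every
# scale level, so the parameter count does not depend on the number of levels.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SharedScaleNet(nn.Module):
    def __init__(self, channels=16, num_levels=3):
        super().__init__()
        self.num_levels = num_levels
        # A single block whose weights are shared by all scale levels.
        self.shared_block = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, x):
        out = None
        for level in reversed(range(self.num_levels)):        # coarse to fine
            scaled = F.interpolate(x, scale_factor=1 / 2 ** level,
                                   mode="bilinear", align_corners=False)
            pred = self.shared_block(scaled)                   # same weights at every level
            if out is not None:                                # fuse the coarser prediction
                pred = pred + F.interpolate(out, size=pred.shape[-2:],
                                            mode="bilinear", align_corners=False)
            out = pred
        return out

print(SharedScaleNet()(torch.randn(1, 3, 64, 64)).shape)       # torch.Size([1, 3, 64, 64])

Because the same block handles every level, the number of scale levels run at test time can differ from training without adding any parameters.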

Equations (1) and (2) in paper

Hi,

Thank you for sharing the dataset, and I look forward to seeing the code as well.

Regarding Equations (1) and (2) in the paper: since the linear motion approximation combines the bidirectional flows, shouldn't the division in the equation be separated into two terms, so that the normalization is only applied to the appropriate flow? Specifically, -F_0t should only be normalized with w_0 in Equation (1) and not w_1. Analogous conditions apply to Equation (2).

can't download dataset successfully

Thank you for sharing your code!
When I try to download your training dataset from https://www.dropbox.com/sh/duisote638etlv2/AABJw5Vygk94AWjGM4Se0Goza?dl=0&preview=encoded_train.tar.gz, I cannot get it successfully.

  1. I tried downloading it directly in Chrome. After the download finished, running "tar zxvf encoded_train.tar.gz" failed with "gzip: stdin: unexpected end of file. tar: Unexpected EOF in archive. tar: Unexpected EOF in archive. tar: Error is not recoverable: exiting now". I then checked the size of the archive: only 6.2 GB instead of 14 GB, so this method did not work.
  2. I then tried wget to download the dataset, but it always stops in the middle of the progress bar; I think the Dropbox server is not stable.

So, could you tell me how I can download the dataset successfully? Or would you mind making a copy and putting it on Google Drive or Baidu cloud disk?
Thank you very much!
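(As a possible workaround, a minimal sketch of a resumable download in Python. It assumes a direct-download link for the archive, e.g. Dropbox's ?dl=1 form, and that the server honors HTTP Range requests; neither is verified here:)

# Hypothetical resumable download; DIRECT_DOWNLOAD_URL is a placeholder and
# Range support on the server is an assumption.
import os
import requests

url = "DIRECT_DOWNLOAD_URL"          # direct link to encoded_train.tar.gz (assumption)
out_path = "encoded_train.tar.gz"

pos = os.path.getsize(out_path) if os.path.exists(out_path) else 0
headers = {"Range": f"bytes={pos}-"} if pos else {}

with requests.get(url, headers=headers, stream=True, timeout=60) as r:
    r.raise_for_status()
    # Status 206 means the server honored the Range header; otherwise start over.
    mode = "ab" if pos and r.status_code == 206 else "wb"
    with open(out_path, mode) as f:
        for chunk in r.iter_content(chunk_size=1 << 20):   # 1 MiB chunks
            f.write(chunk)

Alternatively, "wget -c" on the same direct link resumes a partial download; checking that the final file size matches the expected ~14 GB before untarring avoids the "unexpected end of file" error.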

Cannot download the pretrained models from Dropbox

Hi, thank you for your open-source release. I found that the pretrained models cannot be downloaded from Dropbox; it may be a problem with my internet firewall. Could you please upload the pretrained models to another service (like OneDrive or Baidu Netdisk)? Looking forward to your reply.

Very inefficient inference

Hello, the inference code seems to have rather severe bottlenecks: CUDA usage is only around 25%.

[screenshot]

RIFE and other interpolation networks usually have a usage of 80-95%.

Are any optimizations planned to reduce this overhead?
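(Not specific to this repository, but a generic sketch of the usual first steps when GPU utilization is low during PyTorch inference; the dummy dataset and model below are placeholders, not this codebase:)

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

device = "cuda" if torch.cuda.is_available() else "cpu"
torch.backends.cudnn.benchmark = True                     # autotune conv kernels for fixed input shapes

dataset = TensorDataset(torch.randn(16, 3, 256, 256))     # placeholder for real frame data
loader = DataLoader(dataset, batch_size=4,
                    num_workers=4, pin_memory=True)       # overlap CPU loading with GPU compute

model = nn.Conv2d(3, 3, 3, padding=1).to(device).eval()   # placeholder for the real network
with torch.no_grad():
    for (frames,) in loader:
        frames = frames.to(device, non_blocking=True)     # asynchronous host-to-device copy
        _ = model(frames)
# If utilization stays low, torch.profiler can show whether the time is spent
# in data loading, host-to-device copies, or the model itself.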

Inferencing on a different dataset

How can we run inference on a different video dataset, for example with images of a different resolution? It seems only the default datasets are supported for inference at the moment.

about results of Adobe

Hi, thanks for your code. Could you provide the test clips you used in the paper for Adobe240? Thanks!

Longer sequences for validation / testing?

Hi, thanks to the authors for their impressive work.

I have a question on the data for validation / testing.

The currently available public version seems to be aimed at testing environments that take two frames as input, namely the 0th and 32nd frames of each scene.

However, in that case, I'm afraid methods such as QVI (NeurIPS '19), which require more than two input frames, cannot be compared fairly.

We could use more intermediate frames as input (e.g. the 0th, 16th, and 32nd frames for a framework that requires 3 input frames),
but this may lead to a slightly different scenario (different fps settings), considering that the intended testing environment is interpolating from 30 fps to 240 fps.

More importantly, I've tried experimenting with interpolating 120 fps to 960 (or 1000) fps, using the intermediate frames as input, but the task seems to get too easy: all the methods I've tried perform very well, making it hard to compare which is better.
For these reasons I think a longer sequence would be better...

According to the example videos in the very first figure of this repository, the original video sequences for validation / testing seem to be longer than the public version.

Would it be possible for you to share a version of a longer sequence to the public?

Multiple-batch inference

Hi, it seems this code base doesn't support multi-batch inference. Is there a solution? It's too slow.
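(A hypothetical sketch of what batched inference over several frame pairs looks like in PyTorch; the stand-in model and its input format are assumptions, not this repository's API:)

import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = nn.Conv2d(6, 3, 3, padding=1).to(device).eval()   # stand-in for the interpolation network

# Four frame pairs, each pair concatenated along the channel axis (assumed input format).
pairs = [torch.randn(6, 256, 256) for _ in range(4)]
batch = torch.stack(pairs).to(device)                     # shape: (4, 6, 256, 256)

with torch.no_grad():
    mid_frames = model(batch)                             # one output frame per pair
print(mid_frames.shape)                                   # torch.Size([4, 3, 256, 256])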

some questions

Thanks for your wonderful work!

I have some questions:

  1. Have you compared it with RIFE?
  2. Does the method use explicit optical flow supervision?
  3. Some bad cases:
    "python main.py --gpu 0 --phase 'test_custom' --exp_num 1 --dataset 'X4K1000FPS' --module_scale_factor 4 --S_tst 5 --multiple 2 --custom_path ./test_img_dir/test4"
    Input image 1: 0111
    Input image 2: 0112
    Result: 0111_000
