
Aggregating Long-term Sharp Features via Hybrid Transformers for Video Deblurring

Under Review


Prerequisites

  • Python >= 3.6, PyTorch >= 1.1.0
  • Requirements: opencv-python, numpy, matplotlib, imageio, scikit-image, tqdm
  • Platforms: Ubuntu 20.04, CUDA 10.2, 4× Tesla V100 (16GB)

Datasets

GOPRO_Random (Original): To satisfy our assumption that sharp frames exist in a blurry video, we generate non-consecutively blurry frames by averaging a random number of adjacent sharp frames, where the number is chosen uniformly from 1 to 15. A generated frame Bi is labeled sharp (label 1) if fewer than 5 frames were averaged, and blurry (label 0) otherwise. Note that we randomly make 50% of the frames in a video blurry while the other 50% remain sharp, without constraining that there must be 2 sharp frames in every 7 consecutive frames.

REDS_Random (Original): We generate non-consecutively blurry frames in the same way as for GOPRO. However, when the frame rate is not high enough, simply averaging frames may produce unnatural spikes or steps in the blur trajectory, especially at high resolution and with fast motion. We therefore employ FLAVR to recursively interpolate frames, raising the frame rate to a virtual 960 fps, which lets us synthesize frames with different degrees of blur: the number of averaged frames is chosen uniformly from 3 to 39, and a generated frame Bi is labeled sharp (label 1) if fewer than 17 frames were averaged, and blurry (label 0) otherwise. A minimal sketch of this synthesis appears below.
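
For reference, here is a minimal sketch of the frame-averaging synthesis described above. The function name, boundary handling, and RNG usage are our own illustration, not the released data-generation code; only the parameter values come from the text (GOPRO: min_avg=1, max_avg=15, sharp_thresh=5; REDS: min_avg=3, max_avg=39, sharp_thresh=17).

import numpy as np

# Hypothetical sketch of the non-consecutive blur synthesis; not the released code.
# GOPRO: min_avg=1, max_avg=15, sharp_thresh=5; REDS: min_avg=3, max_avg=39, sharp_thresh=17.
def synthesize_frame(frames, center, min_avg=1, max_avg=15, sharp_thresh=5, rng=None):
    """Average a random number of frames around `center`; return (frame, label)."""
    rng = rng or np.random.default_rng()
    n = int(rng.integers(min_avg, max_avg + 1))   # number of sharp frames to average
    start = max(0, center - n // 2)               # boundary handling simplified here
    window = np.stack(frames[start:start + n]).astype(np.float64)
    blurry = window.mean(axis=0).astype(frames[0].dtype)
    label = 1 if n < sharp_thresh else 0          # sharp iff few frames were averaged
    return blurry, label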

Dataset Organization Format

|--dataset
    |--blur
        |--video 1
            |--frame 1
            |--frame 2
                :
        |--video 2
            :
        |--video n
    |--gt
        |--video 1
            |--frame 1
            |--frame 2
                :
        |--video 2
            :
        |--video n
    |--Event
        |--video 1
            |--frame 1
            |--frame 2
                :
        |--video 2
            :
        |--video n
    |--label
        |--video 1
        |--video 2
            :
        |--video n
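
For clarity, a minimal sketch that walks this layout and pairs corresponding files. The directory names follow the tree above, but the pairing-by-filename convention is our assumption, not the released dataloader.

import os

# Hypothetical sketch assuming blur/, gt/ and Event/ hold identically named
# frames per video, with one label file per video under label/.
def list_samples(root):
    samples = []
    for video in sorted(os.listdir(os.path.join(root, "blur"))):
        for frame in sorted(os.listdir(os.path.join(root, "blur", video))):
            samples.append({
                "blur": os.path.join(root, "blur", video, frame),
                "gt": os.path.join(root, "gt", video, frame),
                "event": os.path.join(root, "Event", video, frame),
                "label": os.path.join(root, "label", video),
            })
    return samples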

BSD Dataset: ESTRNN provides a real-world video blur dataset captured with a beam-splitter system of two synchronized cameras. By controlling the exposure time and exposure intensity during shooting, the system obtains a pair of sharp and blurry videos in a single shot. Blurry/sharp sequences were collected for three blur-intensity settings (sharp exposure time – blurry exposure time): 1ms–8ms, 2ms–16ms and 3ms–24ms. The test set for each setting has 20 video sequences of 150 frames. We use these test sets to evaluate generalization ability.

CED Dataset: Scheerlinck et al. presented the first Color Event Dataset (CED), captured with the color event camera ColorDAVIS346 and containing 50 minutes of footage with both color frames and events. We again employ FLAVR to interpolate frames and generate blurry frames in the same way as for REDS. We randomly split the CED sequences into training, validation and testing sets, and report comparisons against state-of-the-art models retrained with the same setting in the extension experiments.

RBE Dataset: Pan et al. presented a real blurry event dataset, where each sequence is captured with the DAVIS under different conditions, such as indoor and outdoor scenery, low lighting, and different motion patterns (e.g., camera shake, object motion) that naturally introduce motion blur into the APS intensity images. No ground-truth data is available for this dataset, so we use it only for qualitative comparison.

Download

Please download the testing and training datasets from BaiduYun[password:f94f]. Our STGTN models trained on the non-consecutively blurry GOPRO, REDS, and event datasets can be downloaded Here[password:8qpa]. Our results on all datasets can be downloaded Here[password:9jzx].

(i) If you have downloaded the pretrained models, please put the STGTN model in './experiment'.

(ii) If you have downloaded the datasets, please put them in './dataset'.

Getting Started

1) Testing

cd code

For testing w/o event data:

python inference_swin_hsa_nfs.py --default_data XXXX

For testing w/ event data:

On a synthetic event dataset:

python inference_swin_hsa_nfs_event.py

On a real event dataset:

python inference_swin_hsa_nfs_event_real.py

Results

The metric (PSNR/SSIM) calculation code is available Here.
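
For reference, a minimal PSNR/SSIM sketch using scikit-image and imageio (both listed in the requirements); the linked scripts may differ in details such as color space or border cropping.

import imageio
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

# Minimal metric sketch; the repository's own scripts may crop borders or
# convert color spaces before measuring.
def evaluate(restored_path, gt_path):
    restored = imageio.imread(restored_path)
    gt = imageio.imread(gt_path)
    psnr = peak_signal_noise_ratio(gt, restored, data_range=255)
    # channel_axis=-1 on recent scikit-image; older versions use multichannel=True.
    ssim = structural_similarity(gt, restored, channel_axis=-1, data_range=255)
    return psnr, ssim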

[Results figures]

Results on RBE dataset:

[Results figure]

Ablation

Effectiveness of NSFs:

[Ablation figures]

2) Training

Without event data:

python main_swint_hsa_nsf.py --template SWINT_HSA_NSF  # use SWINT_HSA_NSF_REDS for the REDS dataset

With event data:

python main_swint_hsa_nsf_event.py --template SWINT_HSA_NSF_EVENT_GREY  # use SWINT_HSA_NSF_EVENT for color events

Please check the paths to your datasets before training.

Cite

If you use any part of our code, or if STGTN and the non-consecutively blurry datasets are useful for your research, please consider citing:

@article{ren2023aggregating,
  title={Aggregating Long-term Sharp Features via Hybrid Transformers for Video Deblurring},
  author={Ren, Dongwei and Shang, Wei and Yang, Yi and Zuo, Wangmeng},
  journal={arXiv preprint arXiv:2309.07054},
  year={2023}
}


