
all-in-one-deflicker's Introduction

Blind Video Deflickering by Neural Filtering with a Flawed Atlas
Chenyang Lei*, Xuanchi Ren*, Zhaoxiang Zhang and Qifeng Chen
CVPR 2023
* indicates equal contribution

[Paper] [ArXiv] [Project Website] [Dataset-1] [Dataset-2]




News!

  • May 1, 2023: Our collected dataset of real-world flickering videos is released.
  • Apr 10, 2023: Our code can now use segmentation masks for a foreground object.
  • Mar 12, 2023: Inference code and paper are released! The collected dataset will be released soon.
  • Feb 28, 2023: Our paper is accepted by CVPR 2023; code will be released in two weeks.

Contents

  1. Environment & Dependency
  2. Inference
  3. All Evaluated Types of Flickering Videos
  4. Advanced Features
  5. Suggestions for Choosing the Hyperparameters
  6. Collected Real-world Dataset
  7. Discussion and Related work

Environment & Dependency

We provide an environment with Python 3.10 and torch 1.12 with CUDA 11. If you need torch 1.6 with CUDA 10, please check this env file.

Install environment:

conda env create -f environment.yml 
conda activate deflicker

Download the pretrained checkpoints:

git clone https://github.com/ChenyangLEI/cvpr2023_deflicker_public_folder
mv cvpr2023_deflicker_public_folder/pretrained_weights ./ && rm -r cvpr2023_deflicker_public_folder

Inference

Put your video or image folder under data/test. For example:

export PYTHONPATH=$PWD
python test.py --video_name data/test/Winter_Scenes_in_Holland.mp4 # for video input
python test.py --video_frame_folder data/test/Winter_Scenes_in_Holland # for image folder input

Find the results under results/$YOUR_DATA_NAME/final/output.mp4.

Note: our inference code only requires about 3000 MB of GPU memory.
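
If you have several clips to process, you can drive test.py from a small Python script. A minimal sketch (it assumes every input is an .mp4 under data/test and that the result folder is named after the video stem, as in the path above):

# batch_deflicker.py -- minimal sketch: run test.py on every mp4 under data/test
import glob
import os
import subprocess

os.environ["PYTHONPATH"] = os.getcwd()  # same effect as `export PYTHONPATH=$PWD` for the child runs

for video in sorted(glob.glob("data/test/*.mp4")):
    name = os.path.splitext(os.path.basename(video))[0]
    result = os.path.join("results", name, "final", "output.mp4")
    if os.path.exists(result):
        continue  # this clip already has a result
    subprocess.run(["python", "test.py", "--video_name", video], check=True)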

All Evaluated Types of Flickering Videos

Advanced Features

Using segmentation masks:

Currently, we support processing videos with Carvekit or Mask-RCNN. This can help improve the atlas, particularly for videos featuring a salient object or a human. Please note that the current implementation supports only one foreground object against the background.

  • To use Carvekit, which performs background removal:

git clone https://github.com/OPHoperHPO/image-background-remove-tool.git
export PYTHONPATH=$PWD
python test.py --video_name data/test/Winter_Scenes_in_Holland.mp4 --class_name portrait # portrait triggers Carvekit

  • To use Mask-RCNN (via detectron2), which performs instance segmentation:

python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
export PYTHONPATH=$PWD
python test.py --video_name data/test/Winter_Scenes_in_Holland.mp4 --class_name anything # note: this does not actually work for this particular video

where --class_name specifies the COCO class name of the desired foreground object. It is also possible to select the first instance retrieved by Mask-RCNN by using --class_name anything.

In both settings, we suggest checking the generated masks under data/test/{vid_name}_seg. If the masks are all black, you will have to use the non-segmentation implementation above.
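
A quick way to perform that check is a short script that flags all-black masks. A minimal sketch (the folder name follows the data/test/{vid_name}_seg convention above; the video name is an example):

# check_masks.py -- minimal sketch: warn if the generated segmentation masks are all black
import glob
import os

import numpy as np
from PIL import Image

vid_name = "Winter_Scenes_in_Holland"  # example; use your own video name
mask_dir = os.path.join("data", "test", f"{vid_name}_seg")

mask_paths = sorted(glob.glob(os.path.join(mask_dir, "*.png")))
black = [p for p in mask_paths if np.asarray(Image.open(p).convert("L")).max() == 0]

print(f"{len(black)} / {len(mask_paths)} masks are completely black")
if mask_paths and len(black) == len(mask_paths):
    print("All masks are black -- fall back to the non-segmentation pipeline above.")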

Suggestions for Choosing the Hyperparameters

If you want to find the best settings for obtaining an atlas for deflickering, here is a reference guide:

  1. (Important) Iteration number: Set this according to the total number of frames in your video and the downsample rate of the image size. For example, we use 10,000 iterations for the example video with 80 frames and a downsample rate of 4. If the results are not as expected, try increasing iters_num (for example, to 100000). If you use the implementation with segmentation masks, it is also suggested to increase iters_num (see the configuration sketch after this list).

  2. (Important) Optical flow loss weight: Change optical_flow_coeff and alpha_flow_factor (note: alpha_flow_factor is only used in the advanced features with segmentation masks) according to the intensity of flicker in your video. For example, we use 500.0 for optical_flow_coeff and 4900.0 for alpha_flow_factor for the sample video. If the video has only minor flickering, you can use 5.0 for optical_flow_coeff and 49.0 for alpha_flow_factor.

  3. Downsample rate: We find that downsampling the resolution of the neural atlas by 4 times makes convergence much faster and only slightly affects quality. You can choose your own downsample rate.

  4. Maximum number of frames: We set maximum_number_of_frames to 200. Performance on longer videos has not been evaluated; it is recommended to split long videos into several shorter sequences.

  5. Usefulness of segmentation masks: Perfect segmentation masks will increase the quality of the neural atlas, especially for objects with significant motion. However, in most cases the improvement on the final prediction is not significant, since neural filtering can filter out the flaws in the atlas. For now, we provide a naive version of segmentation-mask support above.
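
As a concrete example, the knobs discussed above are plain keys in the stage-1 config JSON (the same file reproduced in one of the issues below), so they can be edited programmatically. A minimal sketch; the config path below is an assumption for illustration, so point it at the JSON file your run actually loads:

# tune_config.py -- minimal sketch: adjust the stage-1 atlas hyperparameters
import json

config_path = "src/config/atlas_config.json"  # hypothetical path -- use the config your run loads

with open(config_path) as f:
    cfg = json.load(f)

cfg["iters_num"] = 100001              # more iterations for longer or segmentation-mask runs
cfg["optical_flow_coeff"] = 500.0      # lower this (e.g. 5.0) for videos with only minor flicker
cfg["alpha_flow_factor"] = 4900.0      # only used by the segmentation-mask variant
cfg["maximum_number_of_frames"] = 200  # split longer videos into shorter sequences

with open(config_path, "w") as f:
    json.dump(cfg, f, indent=2)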

Dataset

We release the dataset in two parts:

  • Dataset-1, containing our collected videos: synthesized videos from text-to-video algorithms, old movies, old cartoons, time-lapse videos, and slow-motion videos.

  • Dataset-2, containing videos processed by the image processing algorithms from fast_blind_video_consistency. Since the link in the original repo is dead, we provide a copy here.

Discussion and Related work

Potential applications: Our model can be applied to all evaluated types of flickering videos. Besides, while our approach is designed for videos, it is possible to apply Blind Deflickering to other tasks (e.g., novel view synthesis) where flickering artifacts exist.

Temporal consistency beyond our scope: Solving the temporal inconsistency of video content is beyond the scope of deflickering. For example, the content produced by video generation algorithms can differ greatly across frames. Large scratches in old films can destroy the content and result in unstable videos, which require extra restoration techniques. We leave the study of a general framework for these temporally inconsistent artifacts to future work.

Credit

Our code relies heavily on layered-neural-atlases, fast_blind_video_consistency, and pytorch-deep-video-prior.

Others

We do not work on this project full-time, but please feel free to provide any suggestions. We would also appreciate help with improving the engineering side of this project.

Citation

If you find our work useful in your research, please consider citing:

@InProceedings{Lei_2023_CVPR,
    author    = {Lei, Chenyang and Ren, Xuanchi and Zhang, Zhaoxiang and Chen, Qifeng},
    title     = {Blind Video Deflickering by Neural Filtering with a Flawed Atlas},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
}

or

@article{lei2023blind,
  title={Blind Video Deflickering by Neural Filtering with a Flawed Atlas},
  author={Lei, Chenyang and Ren, Xuanchi and Zhang, Zhaoxiang and Chen, Qifeng},
  journal={arXiv preprint arXiv:2303.08120},
  year={2023}
}

all-in-one-deflicker's People

Contributors

chenyanglei, xrenaa


all-in-one-deflicker's Issues

How to run on cpu?

Hi, I'm trying to run this on my computer, but unfortunately it does not have a GPU.
What argument should I use to get it to run on the CPU?

Adding --cpu to the following command unfortunately does not work:
python test.py --video_name data/test/Winter_Scenes_in_Holland.mp4 # for video input

Please remove os.environ["CUDA_VISIBLE_DEVICES"] = "0"

Hi guys, first of all, great scientific work!

If I may suggest an improvement to your code, though: could you factor out or fix the continuous resetting of CUDA_VISIBLE_DEVICES throughout the scripts? The problem is that different parts of the code simply disregard the --gpu option and use GPU 0, which is quite annoying on multi-GPU setups.

A better option still would be to remove the --gpu option completely and let the user launch the script with CUDA_VISIBLE_DEVICES specified beforehand.

If in reproducing your results I come up with a decent edit, I will open a PR. :)
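
For anyone patching this locally in the meantime, one possible shape of the fix is to fall back to --gpu only when the caller has not already pinned a device. A minimal sketch under the assumption that the scripts keep an argparse --gpu flag; this is not the repository's actual code:

# minimal sketch: honor a user-supplied CUDA_VISIBLE_DEVICES, otherwise fall back to --gpu
import argparse
import os

parser = argparse.ArgumentParser()
parser.add_argument("--gpu", type=int, default=0)
args, _ = parser.parse_known_args()

# only pin a device if the caller did not already do so via the environment
os.environ.setdefault("CUDA_VISIBLE_DEVICES", str(args.gpu))
print("using CUDA_VISIBLE_DEVICES =", os.environ["CUDA_VISIBLE_DEVICES"])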

AssertionError: the number of style frames is different from the number of content frames

I set this up and ran it on Google Colab. The example script executed perfectly; however, when I tried it on my own, longer video, I got the following error:

Namespace(ckpt_filter='./pretrained_weights/neural_filter.pth', ckpt_local='./pretrained_weights/local_refinement_net.pth', fps=10, video_name='neathaz_color001', gpu=0)
Load ./pretrained_weights/local_refinement_net.pth
Traceback (most recent call last):
  File "/content/All-in-one-Deflicker/src/neural_filter_and_refinement.py", line 73, in <module>
    assert len(style_names) == len(content_names), "the number of style frames is different from the number of content frames"
AssertionError: the number of style frames is different from the number of content frames

[BUG] Using image folder path - Expected to have 3 channels, but got 4

Hey team, thanks for open sourcing this awesome code! I'm really excited to try it out.

I got it to work on a video, but then I decided to try image frames as well, and it seems I'm getting an error. Here's the full log, including the error:

Namespace(ckpt_filter='./pretrained_weights/neural_filter.pth', ckpt_local='./pretrained_weights/local_refinement_net.pth', fps=10, gpu=0, video_frame_folder='/home/jupyter/repo/All-In-One-Deflicker/data/test/bunny_nadia_v4-face-frames-0_200', video_name=None)
input folder ./data/test/bunny_nadia_v4-face-frames-0_200 exist
python src/preprocess_optical_flow.py --vid-path data/test/bunny_nadia_v4-face-frames-0_200 --gpu 0 
computing flow:   0%|                                                                  | 0/200 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "src/preprocess_optical_flow.py", line 48, in <module>
    preprocess(args=args)
  File "src/preprocess_optical_flow.py", line 29, in preprocess
    flow12 = raft_wrapper.compute_flow(im1, im2)
  File "/home/jupyter/repo/All-In-One-Deflicker/src/models/stage_1/raft_wrapper.py", line 70, in compute_flow
    _, flow12 = self.model(im1, im2, iters=20, test_mode=True)
  File "/opt/conda/envs/deflicker/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/jupyter/repo/All-In-One-Deflicker/src/models/stage_1/core/raft.py", line 104, in forward
    fmap1, fmap2 = self.fnet([image1, image2])        
  File "/opt/conda/envs/deflicker/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/jupyter/repo/All-In-One-Deflicker/src/models/stage_1/core/extractor.py", line 176, in forward
    x = self.conv1(x)
  File "/opt/conda/envs/deflicker/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/opt/conda/envs/deflicker/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 419, in forward
    return self._conv_forward(input, self.weight)
  File "/opt/conda/envs/deflicker/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 416, in _conv_forward
    self.padding, self.dilation, self.groups)
RuntimeError: Given groups=1, weight of size [64, 3, 7, 7], expected input[2, 4, 768, 768] to have 3 channels, but got 4 channels instead
Traceback (most recent call last):
  File "src/stage1_neural_atlas.py", line 279, in <module>
    main(json.load(f), args)
  File "src/stage1_neural_atlas.py", line 109, in main
    resy, resx, maximum_number_of_frames, data_folder, True,  True, vid_root, vid_name)
  File "/home/jupyter/repo/All-In-One-Deflicker/src/models/stage_1/unwrap_utils.py", line 145, in load_input_data_single
    flow12 = np.load(flow12_fn)
  File "/opt/conda/envs/deflicker/lib/python3.7/site-packages/numpy/lib/npyio.py", line 417, in load
    fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: 'data/test/bunny_nadia_v4-face-frames-0_200_flow/output_frame_001.png_output_frame_002.png.npy'
Namespace(ckpt_filter='./pretrained_weights/neural_filter.pth', ckpt_local='./pretrained_weights/local_refinement_net.pth', gpu=0, video_name='bunny_nadia_v4-face-frames-0_200')
Load ./pretrained_weights/local_refinement_net.pth
Traceback (most recent call last):
  File "src/neural_filter_and_refinement.py", line 72, in <module>
    assert len(style_names) == len(content_names), "the number of style frames is different from the number of content frames"
AssertionError: the number of style frames is different from the number of content frames

I'm not fully sure what it means, but it looks like the following is causing the issue:

  1. RuntimeError: Given groups=1, weight of size [64, 3, 7, 7], expected input[2, 4, 768, 768] to have 3 channels, but got 4 channels instead -> this seems to break writing the .npy files?
  2. because of that, there seems to be a fatal bug: AssertionError: the number of style frames is different from the number of content frames

Any ideas on how to fix this? My best guess is that because my images are PNGs, it's reading an extra alpha channel as the 4th channel when it expects only 3? I'm not sure though...

Please let me know if you need any extra info. Thanks a lot!
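
If the alpha-channel guess is right, one possible workaround is to strip the alpha channel before running the pipeline. A minimal sketch (it assumes the alpha channel really is the culprit; the folder path is taken from the log above):

# minimal sketch: drop the alpha channel so the optical-flow stage sees 3-channel RGB frames
import glob

from PIL import Image

frame_dir = "data/test/bunny_nadia_v4-face-frames-0_200"  # folder from the log above
for path in sorted(glob.glob(f"{frame_dir}/*.png")):
    img = Image.open(path)
    if img.mode != "RGB":
        img.convert("RGB").save(path)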

[BUG] AssertionError: the number of style frames is different from the number of content frames

Something went wrong:

199
17.87025058686584
100% 10001/10001 [19:51<00:00, 8.39it/s]
Namespace(ckpt_filter='./pretrained_weights/neural_filter.pth', ckpt_local='./pretrained_weights/local_refinement_net.pth', fps=10, video_name='2023-05-19_20-10-50', gpu=0)
Load ./pretrained_weights/local_refinement_net.pth
Traceback (most recent call last):
  File "/content/All-In-One-Deflicker/src/neural_filter_and_refinement.py", line 73, in <module>
    assert len(style_names) == len(content_names), "the number of style frames is different from the number of content frames"
AssertionError: the number of style frames is different from the number of content frames

No module named 'src'

I ran into the following issue; please help me figure out what the problem could be.
I've verified that the 'src' folder is in my root folder and that all pip requirements are installed via conda.

(deflicker) C:\All-In-One-Deflicker>python test.py --video_name data/test/Winter_Scenes_in_Holland.mp4

Namespace(ckpt_filter='./pretrained_weights/neural_filter.pth', ckpt_local='./pretrained_weights/local_refinement_net.pth', video_name='data/test/Winter_Scenes_in_Holland.mp4', video_frame_folder=None, fps=10, gpu=0)
ffmpeg -i data/test/Winter_Scenes_in_Holland.mp4 -vf fps=10 -start_number 0 ./data/test/Winter_Scenes_in_Holland/%05d.png
ffmpeg version 2023-03-05-git-912ac82a3c-full_build-www.gyan.dev Copyright (c) 2000-2023 the FFmpeg developers
built with gcc 12.2.0 (Rev10, Built by MSYS2 project)
configuration: --enable-gpl --enable-version3 --enable-static --disable-w32threads --disable-autodetect --enable-fontconfig --enable-iconv --enable-gnutls --enable-libxml2 --enable-gmp --enable-bzlib --enable-lzma --enable-libsnappy --enable-zlib --enable-librist --enable-libsrt --enable-libssh --enable-libzmq --enable-avisynth --enable-libbluray --enable-libcaca --enable-sdl2 --enable-libaribb24 --enable-libdav1d --enable-libdavs2 --enable-libuavs3d --enable-libzvbi --enable-librav1e --enable-libsvtav1 --enable-libwebp --enable-libx264 --enable-libx265 --enable-libxavs2 --enable-libxvid --enable-libaom --enable-libjxl --enable-libopenjpeg --enable-libvpx --enable-mediafoundation --enable-libass --enable-frei0r --enable-libfreetype --enable-libfribidi --enable-liblensfun --enable-libvidstab --enable-libvmaf --enable-libzimg --enable-amf --enable-cuda-llvm --enable-cuvid --enable-ffnvcodec --enable-nvdec --enable-nvenc --enable-d3d11va --enable-dxva2 --enable-libvpl --enable-libshaderc --enable-vulkan --enable-libplacebo --enable-opencl --enable-libcdio --enable-libgme --enable-libmodplug --enable-libopenmpt --enable-libopencore-amrwb --enable-libmp3lame --enable-libshine --enable-libtheora --enable-libtwolame --enable-libvo-amrwbenc --enable-libilbc --enable-libgsm --enable-libopencore-amrnb --enable-libopus --enable-libspeex --enable-libvorbis --enable-ladspa --enable-libbs2b --enable-libflite --enable-libmysofa --enable-librubberband --enable-libsoxr --enable-chromaprint
libavutil 58. 3.100 / 58. 3.100
libavcodec 60. 6.100 / 60. 6.100
libavformat 60. 4.100 / 60. 4.100
libavdevice 60. 2.100 / 60. 2.100
libavfilter 9. 4.100 / 9. 4.100
libswscale 7. 2.100 / 7. 2.100
libswresample 4. 11.100 / 4. 11.100
libpostproc 57. 2.100 / 57. 2.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'data/test/Winter_Scenes_in_Holland.mp4':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf58.12.100
Duration: 00:00:08.00, start: 0.000000, bitrate: 474 kb/s
Stream #0:00x1: Video: h264 (High) (avc1 / 0x31637661), yuvj420p(pc, progressive), 640x360 [SAR 1:1 DAR 16:9], 472 kb/s, 12 fps, 12 tbr, 12288 tbn (default)
Metadata:
handler_name : VideoHandler
vendor_id : [0][0][0][0]
Stream mapping:
Stream #0:0 -> #0:0 (h264 (native) -> png (native))
Press [q] to stop, [?] for help
[swscaler @ 0000027a84806f00] deprecated pixel format used, make sure you did set range correctly
Last message repeated 3 times
Output #0, image2, to './data/test/Winter_Scenes_in_Holland/%05d.png':
Metadata:
major_brand : isom
minor_version : 512
compatible_brands: isomiso2avc1mp41
encoder : Lavf60.4.100
Stream #0:0(und): Video: png, rgb24(pc, gbr/unknown/unknown, progressive), 640x360 [SAR 1:1 DAR 16:9], q=2-31, 200 kb/s, 10 fps, 10 tbn (default)
Metadata:
handler_name : VideoHandler
vendor_id : [0][0][0][0]
encoder : Lavc60.6.100 png
frame= 80 fps=0.0 q=-0.0 Lsize=N/A time=00:00:07.90 bitrate=N/A speed=26.8x ts/s speed=N/A
video:15063kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown

Traceback (most recent call last):
  File "C:\All-In-One-Deflicker\src\neural_filter_and_refinement.py", line 6, in <module>
    import src.models.network_filter as net
ModuleNotFoundError: No module named 'src'

How can I fix this error?

frame= 80 fps=0.0 q=-0.0 Lsize=N/A time=00:00:08.00 bitrate=N/A speed=42.5x
video:15063kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
Traceback (most recent call last):
  File "/root/All-In-One-Deflicker/src/stage1_neural_atlas.py", line 274, in <module>
    cmd = "python src/preprocess_optical_flow.py --vid-path %s --gpu %s " % (vid_path, select_gpu)
NameError: name 'select_gpu' is not defined
Namespace(ckpt_filter='./pretrained_weights/neural_filter.pth', ckpt_local='./pretrained_weights/local_refinement_net.pth', fps=10, video_name='Winter_Scenes_in_Holland', gpu=0)
Load ./pretrained_weights/local_refinement_net.pth
Traceback (most recent call last):
  File "/root/All-In-One-Deflicker/src/neural_filter_and_refinement.py", line 73, in <module>
    assert len(style_names) == len(content_names), "the number of style frames is different from the number of content frames"
AssertionError: the number of style frames is different from the number of content frames

deflickering 16bit depth

Super cool repo! I was trying to deflicker 16-bit-depth video and it does show some promise; however, it looks like the pipeline casts to 8-bit at an intermediate step (I altered the input and output for 16-bit).

Before I invest more time into this approach, I was wondering if there's any reason 16 bit wouldn't work with any of the internal algorithms?

All-in-one-deflicker on the right (the fps is mismatched)

zoe_deflicker.mp4

Can't create the environment.

(base) F:\apps\All-In-One-Deflicker>conda env create -f environment.yml
Collecting package metadata (repodata.json): done
Solving environment: failed

ResolvePackageNotFound:

  • xorg-libxdmcp==1.1.3=h7f98852_0
  • pillow==9.4.0=py310h023d228_1
  • gmp==6.2.1=h58526e2_0
  • python==3.10.9=he550d4f_0_cpython
  • xorg-libxfixes==5.0.3=h7f98852_1004
  • libffi==3.4.2=h7f98852_5
  • p11-kit==0.24.1=hc5aa10d_0
  • lerc==4.0.0=h27087fc_0
  • tk==8.6.12=h27826a3_0
  • libiconv==1.17=h166bdaf_0
  • libgcc-ng==12.2.0=h65d4601_19
  • cudatoolkit==11.3.1=h9edb442_11
  • readline==8.1.2=h0f457ee_0
  • lcms2==2.15=hfd0df8a_0
  • libidn2==2.3.4=h166bdaf_0
  • mkl-include==2022.1.0=h84fe81f_915
  • gnutls==3.7.8=hf3e180e_0
  • x265==3.5=h924138e_3
  • xorg-libxext==1.3.4=h0b41bf4_2
  • gettext==0.21.1=h27087fc_0
  • openh264==2.3.1=hcb278e6_2
  • libuuid==2.32.1=h7f98852_1000
  • libgfortran-ng==12.2.0=h69a702a_19
  • cryptography==39.0.2=py310h34c0648_0
  • xz==5.2.6=h166bdaf_0
  • libzlib==1.2.13=h166bdaf_4
  • x264==1!164.3095=h166bdaf_2
  • ncurses==6.3=h27087fc_1
  • openssl==3.1.0=h0b41bf4_0
  • ca-certificates==2022.12.7=ha878542_0
  • brotlipy==0.7.0=py310h5764c6d_1005
  • libsqlite==3.40.0=h753d276_0
  • blas-devel==3.9.0=16_linux64_mkl
  • mkl-devel==2022.1.0=ha770c72_916
  • libxml2==2.10.3=h7463322_0
  • xorg-libx11==1.8.4=h0b41bf4_0
  • ld_impl_linux-64==2.40=h41732ed_0
  • _openmp_mutex==4.5=2_kmp_llvm
  • libunistring==0.9.10=h7f98852_0
  • zstd==1.5.2=h3eb15da_6
  • ffmpeg==5.1.2=gpl_h8dda1f0_106
  • fontconfig==2.14.2=h14ed4e7_0
  • svt-av1==1.4.1=hcb278e6_0
  • libcblas==3.9.0=16_linux64_mkl
  • mkl==2022.1.0=h84fe81f_915
  • libblas==3.9.0=16_linux64_mkl
  • numpy==1.24.2=py310h8deb116_0
  • libtasn1==4.19.0=h166bdaf_0
  • _libgcc_mutex==0.1=conda_forge
  • libwebp-base==1.3.0=h0b41bf4_0
  • libpng==1.6.39=h753d276_0
  • xorg-fixesproto==5.0=h7f98852_1002
  • lame==3.100=h166bdaf_1003
  • pytorch==1.12.0=py3.10_cuda11.3_cudnn8.3.2_0
  • xorg-xextproto==7.3.0=h0b41bf4_1003
  • nettle==3.8.1=hc379101_1
  • liblapacke==3.9.0=16_linux64_mkl
  • tbb==2021.8.0=hf52228f_0
  • xorg-kbproto==1.0.7=h7f98852_1002
  • freetype==2.12.1=hca18f0e_1
  • libnsl==2.0.0=h7f98852_0
  • libpciaccess==0.17=h166bdaf_0
  • libxcb==1.13=h7f98852_1004
  • icu==70.1=h27087fc_0
  • bzip2==1.0.8=h7f98852_4
  • libtiff==4.5.0=h6adf6a1_2
  • cffi==1.15.1=py310h255011f_3
  • xorg-libxau==1.0.9=h7f98852_0
  • libgfortran5==12.2.0=h337968e_19
  • libva==2.17.0=h0b41bf4_0
  • libhwloc==2.9.0=hd6dc26d_0
  • liblapack==3.9.0=16_linux64_mkl
  • libdrm==2.4.114=h166bdaf_0
  • libvpx==1.11.0=h9c3ff4c_3
  • libopus==1.3.1=h7f98852_1
  • llvm-openmp==15.0.7=h0cdce71_0
  • libdeflate==1.17=h0b41bf4_0
  • xorg-xproto==7.0.31=h7f98852_1007
  • pthread-stubs==0.4=h36c2ea0_1001
  • libstdcxx-ng==12.2.0=h46fd767_19
  • jpeg==9e=h0b41bf4_3
  • openjpeg==2.5.0=hfec8fc6_2
  • expat==2.5.0=h27087fc_0
  • aom==3.5.0=h27087fc_0

[Question] - How to improve the color in deflickered video?

Hey y'all, this is not a bug; I was just wondering if you had any pointers on using this tech. I'm trying to make an animation from a video generated frame by frame with Stable Diffusion. The deflickered version does indeed show less flicker; however, the color is a little off in some frames. In the picture below, the left side is from the deflickered version and the right side is the original. You can see that the arms look almost grey/blue-ish compared to the original.

The only params I changed in the config are:

  • maximum_number_of_frames: 500
  • iters_num: 60006
  • frames/second: 30 (but I also tested the default 10 and the issue persisted)

I was just wondering if you had any pointers or suggestions on how I might fix the coloring issue on this bunny avatar?

Thanks for your help & love the tech :D

Screen Shot 2023-03-13 at 11 05 30 PM

Here is the original video:

bunny_nadia_v4-face.mp4

Here is the deflickered version of the video:

output_fixed.mp4

code for train

Nice work! Will the training code be open-sourced in the future?

Some more information regarding parameters please

Currently I am recording a tutorial video about animation with Stable Diffusion.

I have achieved excellent consistency quality.

Here is an example video:

consisten.anime.raw.shorter.mp4

I will also upscale this to 1024x1024.

Now I want to reduce the flickering with your AI.

Could you give me some optimal parameters for this?

Let's say I have 1024x1024 resolution and 2000 frames.

There are so many parameters. How should I modify them? If I set iters_num = 100k, should I also change samples_batch, rgb_coeff, alpha_bootstrapping_factor, alpha_flow_factor, stop_bootstrapping_iteration, and sparsity_coeff?

Could you give me some idea about which of these should be changed? Thank you so much.

{
"results_folder_name": "results",
"maximum_number_of_frames": 2000,
"resx": 1024,
"resy": 1024,
"iters_num": 100001,
"samples_batch": 10000,
"optical_flow_coeff": 500.0,
"evaluate_every": 10000,
"derivative_amount": 1,
"rgb_coeff": 5000,
"rigidity_coeff": 1.0,
"uv_mapping_scale": 0.8,
"pretrain_mapping1": true,
"pretrain_mapping2": true,
"alpha_bootstrapping_factor": 2000.0,
"alpha_flow_factor": 4900.0,
"positional_encoding_num_alpha": 5,
"number_of_channels_atlas": 256,
"number_of_layers_atlas": 8,
"number_of_channels_alpha": 256,
"number_of_layers_alpha": 8,
"stop_bootstrapping_iteration": 10000,
"number_of_channels_mapping1": 256,
"number_of_layers_mapping1": 6,
"number_of_channels_mapping2": 256,
"number_of_layers_mapping2": 4,
"gradient_loss_coeff": 1000,
"use_gradient_loss": true,
"sparsity_coeff": 1000.0,
"positional_encoding_num_atlas": 10,
"use_positional_encoding_mapping1": false,
"number_of_positional_encoding_mapping1": 4,
"use_positional_encoding_mapping2": false,
"number_of_positional_encoding_mapping2": 2,
"pretrain_iter_number": 100,
"load_checkpoint": false,
"checkpoint_path": "",
"include_global_rigidity_loss": true,
"global_rigidity_derivative_amount_fg": 100,
"global_rigidity_derivative_amount_bg": 100,
"global_rigidity_coeff_fg": 5.0,
"global_rigidity_coeff_bg": 50.0,
"stop_global_rigidity": 5000,
"add_to_experiment_folder_name": ""
}

@xrenaa @ChenyangLEI

error in the end of the process

Thank you very much for this incredible work; unfortunately, I always get this error at the end:

AssertionError: the number of style frames is different from the number of content frames

--
Killed
Namespace(ckpt_filter='./pretrained_weights/neural_filter.pth', ckpt_local='./pretrained_weights/local_refinement_net.pth', fps=10, video_name='video', gpu=0)
Load ./pretrained_weights/local_refinement_net.pth
Traceback (most recent call last):
  File "/content/All-In-One-Deflicker/src/neural_filter_and_refinement.py", line 73, in <module>
    assert len(style_names) == len(content_names), "the number of style frames is different from the number of content frames"
AssertionError: the number of style frames is different from the number of content frames

Windows installation

Hi guys,
I'm really excited to test the awesome software that you've created.
Can someone help me install it on a Windows machine? When I try to install it using conda, packages are missing. I got:

(base) PS D:\proj\All-in-one-Deflicker> conda env create --name deflicker -f environment.yml
Collecting package metadata (repodata.json): done
Solving environment: failed

ResolvePackageNotFound:

  • lame==3.100=h166bdaf_1003
  • libzlib==1.2.13=h166bdaf_4
  • libpng==1.6.39=h753d276_0
  • blas-devel==3.9.0=16_linux64_mkl
  • xorg-xproto==7.0.31=h7f98852_1007
  • lerc==4.0.0=h27087fc_0
  • libdrm==2.4.114=h166bdaf_0
  • ca-certificates==2022.12.7=ha878542_0
  • libnsl==2.0.0=h7f98852_0
  • ffmpeg==5.1.2=gpl_h8dda1f0_106
  • numpy==1.24.2=py310h8deb116_0
  • p11-kit==0.24.1=hc5aa10d_0
  • pillow==9.4.0=py310h023d228_1
  • libblas==3.9.0=16_linux64_mkl
  • xorg-fixesproto==5.0=h7f98852_1002
  • openjpeg==2.5.0=hfec8fc6_2
  • libuuid==2.32.1=h7f98852_1000
  • libwebp-base==1.3.0=h0b41bf4_0
  • readline==8.1.2=h0f457ee_0
  • svt-av1==1.4.1=hcb278e6_0
  • python==3.10.9=he550d4f_0_cpython
  • lcms2==2.15=hfd0df8a_0
  • mkl==2022.1.0=h84fe81f_915
  • x265==3.5=h924138e_3
  • gmp==6.2.1=h58526e2_0
  • xorg-libxau==1.0.9=h7f98852_0
  • fontconfig==2.14.2=h14ed4e7_0
  • tbb==2021.8.0=hf52228f_0
  • libgfortran5==12.2.0=h337968e_19
  • _openmp_mutex==4.5=2_kmp_llvm
  • openh264==2.3.1=hcb278e6_2
  • libsqlite==3.40.0=h753d276_0
  • tk==8.6.12=h27826a3_0
  • openssl==3.1.0=h0b41bf4_0
  • aom==3.5.0=h27087fc_0
  • icu==70.1=h27087fc_0
  • bzip2==1.0.8=h7f98852_4
  • xorg-libxdmcp==1.1.3=h7f98852_0
  • mkl-include==2022.1.0=h84fe81f_915
  • llvm-openmp==15.0.7=h0cdce71_0
  • libunistring==0.9.10=h7f98852_0
  • xorg-xextproto==7.3.0=h0b41bf4_1003
  • libiconv==1.17=h166bdaf_0
  • libopus==1.3.1=h7f98852_1
  • libtasn1==4.19.0=h166bdaf_0
  • x264==1!164.3095=h166bdaf_2
  • mkl-devel==2022.1.0=ha770c72_916
  • xorg-libxext==1.3.4=h0b41bf4_2
  • libhwloc==2.9.0=hd6dc26d_0
  • libxcb==1.13=h7f98852_1004
  • libxml2==2.10.3=h7463322_0
  • libcblas==3.9.0=16_linux64_mkl
  • pthread-stubs==0.4=h36c2ea0_1001
  • xorg-kbproto==1.0.7=h7f98852_1002
  • xorg-libxfixes==5.0.3=h7f98852_1004
  • nettle==3.8.1=hc379101_1
  • ncurses==6.3=h27087fc_1
  • libgcc-ng==12.2.0=h65d4601_19
  • xz==5.2.6=h166bdaf_0
  • libdeflate==1.17=h0b41bf4_0
  • liblapack==3.9.0=16_linux64_mkl
  • libva==2.17.0=h0b41bf4_0
  • libgfortran-ng==12.2.0=h69a702a_19
  • libtiff==4.5.0=h6adf6a1_2
  • _libgcc_mutex==0.1=conda_forge
  • libvpx==1.11.0=h9c3ff4c_3
  • xorg-libx11==1.8.4=h0b41bf4_0
  • libffi==3.4.2=h7f98852_5
  • gnutls==3.7.8=hf3e180e_0
  • freetype==2.12.1=hca18f0e_1
  • jpeg==9e=h0b41bf4_3
  • ld_impl_linux-64==2.40=h41732ed_0
  • zstd==1.5.2=h3eb15da_6
  • cudatoolkit==11.3.1=h9edb442_11
  • libpciaccess==0.17=h166bdaf_0
  • cffi==1.15.1=py310h255011f_3
  • libidn2==2.3.4=h166bdaf_0
  • pytorch==1.12.0=py3.10_cuda11.3_cudnn8.3.2_0
  • cryptography==39.0.2=py310h34c0648_0
  • libstdcxx-ng==12.2.0=h46fd767_19
  • expat==2.5.0=h27087fc_0
  • liblapacke==3.9.0=16_linux64_mkl
  • gettext==0.21.1=h27087fc_0
  • brotlipy==0.7.0=py310h5764c6d_1005

I already tried removing the build string after the second '=' but with no luck so far.
Does anyone have an environment.yml that is compatible with Windows?
Any help will be appreciated.

GPU index parameter gets ignored in `test.py`

Hi, just a heads-up about the script for anyone experiencing similar problems. The current test.py script accepts gpu as an argument but does not forward it to the other scripts. This, combined with the fact that environment variables do not seem to be forwarded when running the other scripts, is particularly insidious, as setting CUDA_VISIBLE_DEVICES won't work either (at least it didn't for me).

Questions about video flicker detection

Your work is truly amazing! I think you are also familiar with the characteristics of video flicker, so I would like to ask how to detect light and dark flicker in videos. I look forward to your reply.
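
Not the paper's method, but a very naive starting point is to track the mean luminance of each frame and measure how much it jumps between consecutive frames. A minimal sketch (assumes OpenCV is available; the video path is an example):

# minimal sketch: naive brightness-flicker score (not the paper's detection method)
import cv2
import numpy as np

cap = cv2.VideoCapture("data/test/Winter_Scenes_in_Holland.mp4")  # example path
means = []
while True:
    ok, frame = cap.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    means.append(gray.mean())
cap.release()

if len(means) > 1:
    diffs = np.abs(np.diff(means))
    print("mean |brightness change| between frames:", diffs.mean())
    print("max  |brightness change| between frames:", diffs.max())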

[REQ] add a (GH-compliant) license file

Hi there, first of all thanks for this awesome work!

Since we've 'doxed' it in our HyMPS project (under the VIDEO section \ AI-based page \ Restoring), can you please add a GH-compliant license file for it? (as already requested by @thegenerativegeneration in the still-open Issue #26)

As you know, making licensing terms explicit is extremely important to let anyone understand better and faster how to reuse/adapt/modify sources (and not only) in other open projects, and vice versa.

Although it may sound like a minor aspect, the missing license file also causes the corresponding badge to be generated inconsistently:


(badge-generator URL: https://badgen.net/github/license/ChenyangLEI/All-In-One-Deflicker)

You can easily set a standardized one through GitHub's license wizard tool.

Last but not least, let us know how we could improve our categorizations and links to resources to foster collaboration between developers (and therefore the evolution) of the listed projects.

Hope that helps/inspires !

Output loses color.

Firstly, thank you very much for the code! I can't quite figure out why, but the color of my output video turns black and white. I was wondering if you have any ideas or pointers as to what could be the cause.

References:

input_video.mp4
output.mp4

License

Hey, thank you for your work :)
Are you planning on adding a license to the repo?

Transformations mimicking the Atlas distortion

Nice work! I am wondering what transformations you used to mimic the distortion caused by foreground motion, and whether there is still room for improving the performance. I could not find any details in the paper or on the project page. Thanks.

"CUDA out of memory" for large video file

Hi. Thanks for this nice work!! I'm trying to deflicker my video on Google Colab. For video files with relatively small dimensions (640x640), I can get the code running with no issue. But when I try with a larger input video (1280x1280), I get: RuntimeError: CUDA out of memory. Tried to allocate 30.00 MiB (GPU 0; 14.75 GiB total capacity; 13.69 GiB already allocated; 6.81 MiB free; 13.98 GiB reserved in total by PyTorch):


Link to the Google Colab notebook

Full code block output here
/content/All-In-One-Deflicker/data/test
video-depth-720.mov
video-depth-720.mov(video/quicktime) - 2035246 bytes, last modified: 3/14/2023 - 100% done
Saving video-depth-720.mov to video-depth-720.mov
/content/All-In-One-Deflicker
Namespace(ckpt_filter='./pretrained_weights/neural_filter.pth', ckpt_local='./pretrained_weights/local_refinement_net.pth', fps=10, gpu=0, video_frame_folder=None, video_name='data/test/video-depth-720.mov')
ffmpeg -i data/test/video-depth-720.mov -vf fps=10 -start_number 0 ./data/test/video-depth-720/%05d.png
ffmpeg version 4.0.2 Copyright (c) 2000-2018 the FFmpeg developers
  built with gcc 4.8.2 (GCC) 20140120 (Red Hat 4.8.2-15)
  configuration: --prefix=/home/conda/feedstock_root/build_artifacts/ffmpeg_1539667330082/_h_env_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_placehold_plac --disable-doc --disable-openssl --enable-shared --enable-static --extra-cflags='-Wall -g -m64 -pipe -O3 -march=x86-64 -fPIC' --extra-cxxflags='-Wall -g -m64 -pipe -O3 -march=x86-64 -fPIC' --extra-libs='-lpthread -lm -lz' --enable-zlib --enable-pic --enable-pthreads --enable-gpl --enable-version3 --enable-hardcoded-tables --enable-avresample --enable-libfreetype --enable-gnutls --enable-libx264 --enable-libopenh264
  libavutil      56. 14.100 / 56. 14.100
  libavcodec     58. 18.100 / 58. 18.100
  libavformat    58. 12.100 / 58. 12.100
  libavdevice    58.  3.100 / 58.  3.100
  libavfilter     7. 16.100 /  7. 16.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  1.100 /  5.  1.100
  libswresample   3.  1.100 /  3.  1.100
  libpostproc    55.  1.100 / 55.  1.100
Input #0, mov,mp4,m4a,3gp,3g2,mj2, from 'data/test/video-depth-720.mov':
  Metadata:
    major_brand     : qt  
    minor_version   : 0
    compatible_brands: qt  
    creation_time   : 2023-03-14T06:51:47.000000Z
  Duration: 00:00:04.03, start: 0.000000, bitrate: 4036 kb/s
    Stream #0:0(eng): Audio: aac (LC) (mp4a / 0x6134706D), 48000 Hz, stereo, fltp, 253 kb/s (default)
    Metadata:
      creation_time   : 2023-03-14T06:51:47.000000Z
      handler_name    : Core Media Data Handler
    Stream #0:1(eng): Video: h264 (Main) (avc1 / 0x31637661), yuv420p(tv, bt709), 1276x1280, 3772 kb/s, 30 fps, 30 tbr, 30k tbn, 60k tbc (default)
    Metadata:
      creation_time   : 2023-03-14T06:51:47.000000Z
      handler_name    : Core Media Data Handler
      encoder         : H.264
Stream mapping:
  Stream #0:1 -> #0:0 (h264 (native) -> png (native))
Press [q] to stop, [?] for help
Output #0, image2, to './data/test/video-depth-720/%05d.png':
  Metadata:
    major_brand     : qt  
    minor_version   : 0
    compatible_brands: qt  
    encoder         : Lavf58.12.100
    Stream #0:0(eng): Video: png, rgb24, 1276x1280, q=2-31, 200 kb/s, 10 fps, 10 tbn, 10 tbc (default)
    Metadata:
      creation_time   : 2023-03-14T06:51:47.000000Z
      handler_name    : Core Media Data Handler
      encoder         : Lavc58.18.100 png
frame=   40 fps= 29 q=-0.0 Lsize=N/A time=00:00:04.00 bitrate=N/A speed=2.93x    
video:6869kB audio:0kB subtitle:0kB other streams:0kB global headers:0kB muxing overhead: unknown
python src/preprocess_optical_flow.py --vid-path data/test/video-depth-720 --gpu 0 
computing flow:   0% 0/40 [00:01<?, ?it/s]
Traceback (most recent call last):
  File "src/preprocess_optical_flow.py", line 48, in <module>
    preprocess(args=args)
  File "src/preprocess_optical_flow.py", line 29, in preprocess
    flow12 = raft_wrapper.compute_flow(im1, im2)
  File "/content/All-In-One-Deflicker/src/models/stage_1/raft_wrapper.py", line 70, in compute_flow
    _, flow12 = self.model(im1, im2, iters=20, test_mode=True)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/All-In-One-Deflicker/src/models/stage_1/core/raft.py", line 132, in forward
    net, up_mask, delta_flow = self.update_block(net, inp, corr, flow)
  File "/usr/local/lib/python3.7/site-packages/torch/nn/modules/module.py", line 722, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/content/All-In-One-Deflicker/src/models/stage_1/core/update.py", line 135, in forward
    mask = .25 * self.mask(net)
RuntimeError: CUDA out of memory. Tried to allocate 30.00 MiB (GPU 0; 14.75 GiB total capacity; 13.69 GiB already allocated; 6.81 MiB free; 13.98 GiB reserved in total by PyTorch)
Traceback (most recent call last):
  File "src/stage1_neural_atlas.py", line 279, in <module>
    main(json.load(f), args)
  File "src/stage1_neural_atlas.py", line 109, in main
    resy, resx, maximum_number_of_frames, data_folder, True,  True, vid_root, vid_name)
  File "/content/All-In-One-Deflicker/src/models/stage_1/unwrap_utils.py", line 145, in load_input_data_single
    flow12 = np.load(flow12_fn)
  File "/usr/local/lib/python3.7/site-packages/numpy/lib/npyio.py", line 417, in load
    fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: 'data/test/video-depth-720_flow/00000.png_00001.png.npy'
Namespace(ckpt_filter='./pretrained_weights/neural_filter.pth', ckpt_local='./pretrained_weights/local_refinement_net.pth', gpu=0, video_name='video-depth-720')
Load ./pretrained_weights/local_refinement_net.pth
Traceback (most recent call last):
  File "src/neural_filter_and_refinement.py", line 72, in <module>
    assert len(style_names) == len(content_names), "the number of style frames is different from the number of content frames"
AssertionError: the number of style frames is different from the number of content frames

May I know if it is possible to process a large video file on Colab with this project? Or will I have to run it on a machine with more GPU memory? Thank you :)
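
If the 1280x1280 input is the problem, one possible workaround is to reduce the input resolution before running the pipeline. A minimal sketch calling ffmpeg from Python (the 640 px target width and the output name are just examples):

# minimal sketch: downscale a large input so optical flow / the atlas fit in GPU memory
import subprocess

src = "data/test/video-depth-720.mov"        # the 1276x1280 input from the log above
dst = "data/test/video-depth-720_small.mp4"  # downscaled copy to feed to test.py

subprocess.run(
    ["ffmpeg", "-y", "-i", src, "-vf", "scale=640:-2", dst],
    check=True,
)
# then: python test.py --video_name data/test/video-depth-720_small.mp4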

Where is the appendix paper?

# See explanation in the paper, appendix A (Second paragraph) is for the function pre_train_mapping,
But I cannot find the appendix for this paper.
Thanks for your response.

poor deflicker result

I am testing this on a video that has color flicker. The clip was originally black and white but was then colorized using the DD colorization model; an artifact of the colorization process was a lot of color flicker. Using the All-In-One-Deflicker pretrained model produced poor results: the image quality was significantly degraded, and there was still some remaining flicker and poor color. I am attaching before and after mp4 files.

flicker_test000.mp4
output.2.mp4
