In this repository, we present a versatile video frame interpolation (VFI) framework that uses the Attention-to-Motion (ATM) module to formulate motion estimation intuitively.
- Paper: (Under review)
- Video demo: Youtube
We provide the dependencies in `requirements.txt`.
```python
import cv2
import torch

from network.network_base import Network  # or use network.network_lite
from demo_2x import load_model_checkpoint, inference_2frame

# initialize the model
model = Network()
load_model_checkpoint(model, 'path_to_checkpoint')
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
model.to(device).eval()

# prepare data and run inference
img0 = cv2.imread("path_to_frame0")
img1 = cv2.imread("path_to_frame1")
pred = inference_2frame(img0, img1, model, isBGR=True)  # trace demo_2x.py -> inference_2frame() for details
```
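Many VFI networks require input height and width to be divisible by a power of two before inference. Whether `inference_2frame` already pads internally is something to verify in `demo_2x.py`; if it does not, a minimal padding helper might look like the sketch below (the multiple of 32 is an assumption, not a value taken from this repository):

```python
import numpy as np

def pad_to_multiple(img, mult=32):
    """Edge-pad an HxWxC frame so H and W are multiples of `mult`.

    Returns the padded frame plus the original (h, w), so the
    interpolated output can be cropped back afterwards.
    """
    h, w = img.shape[:2]
    ph = (mult - h % mult) % mult  # rows to add at the bottom
    pw = (mult - w % mult) % mult  # columns to add on the right
    padded = np.pad(img, ((0, ph), (0, pw), (0, 0)), mode="edge")
    return padded, (h, w)
```

After inference, crop the prediction back to the original size with `pred[:h, :w]`.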
For 2x interpolation, run one of the commands below. Use the `--global_off` flag to disable global motion estimation.

- input: 2 frames

```shell
python3 demo_2x.py --model_type <select base or lite> --ckpt <path to model checkpoint> --frame0 <path to frame 0> --frame1 <path to frame 1> --out <path to output frame>
```

- input: mp4 video

```shell
python3 demo_2x.py --model_type <select base or lite> --ckpt <path to model checkpoint> --video <path to .mp4 file>
```

Use the `--combine_video` flag to combine the original input video and the processed video.
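The demo performs a single 2x pass; higher factors (4x, 8x, ...) can be obtained by interpolating recursively. A minimal sketch of that idea, where `interp` is a hypothetical wrapper around a midpoint interpolator such as `inference_2frame` (this helper is not provided by the repository):

```python
def multi_2x(frames, interp, passes=1):
    """Apply midpoint interpolation `passes` times: 2x, 4x, 8x, ...

    `frames` is a list of frames; `interp(a, b)` returns the frame
    halfway between a and b (e.g. a wrapper around inference_2frame).
    """
    for _ in range(passes):
        doubled = []
        for a, b in zip(frames, frames[1:]):
            doubled += [a, interp(a, b)]  # keep a, insert the midpoint
        doubled.append(frames[-1])        # last frame has no successor
        frames = doubled
    return frames
```

With `passes=2`, two input frames become five, i.e. 4x temporal resolution.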
We will release the checkpoints after the final paper decision.
| Version | Link | Param (M) |
|---|---|---|
| Base | TBA | 51.56 |
| Lite | TBA | 11.98 |
| Pct | TBA | 51.56 |
We evaluate our method using the benchmark scripts provided by RIFE, EMA-VFI, and AMT for consistency.
- Vimeo90K

```shell
cd benchmark
python3 test_vimeo90k.py --path <path to Vimeo90K dataset folder> --ckpt <path to model checkpoint>
```

- UCF101

```shell
cd benchmark
python3 test_ucf101.py --path <path to UCF101 dataset folder> --ckpt <path to model checkpoint>
```

- SNU-FILM

```shell
cd benchmark
python3 test_snufilm.py --path <path to SNU-FILM dataset txt> --img_data_path <path to SNU-FILM dataset image folder> --ckpt <path to model checkpoint>
```

- Xiph

```shell
cd benchmark
python3 test_xiph.py --root <path to Xiph dataset folder> --ckpt <path to model checkpoint>
```
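VFI benchmarks such as these conventionally report PSNR (and usually SSIM). For reference, PSNR reduces to the following generic definition; this is not the repository's exact evaluation code:

```python
import numpy as np

def psnr(ref, est, peak=255.0):
    """Peak signal-to-noise ratio in dB between two images."""
    mse = np.mean((ref.astype(np.float64) - est.astype(np.float64)) ** 2)
    # Identical images have zero error, hence infinite PSNR.
    return float("inf") if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)
```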
The first two phases of the training procedure (described in our paper) use `train.py` and `trainer.py`, while the last two phases use `finetune.py` and `finetune_trainer.py`.
- Phase 1: run `train.py` with the argument `dataset` set to `vimeo90k`; the other training hyperparameters (batch size, learning rate, number of epochs, etc.) can be set as you wish. Reminder: please make sure to uncomment `model.global_motion = False`.
- Phase 2: run `train.py` with the argument `dataset` set to `X4k`, set the variable `isLoadCheckpoint` to `True`, and change `param` to the checkpoint of Phase 1. Reminder: please make sure to uncomment `model.global_motion = True` and `model.__freeze_local_motion__()`.
- Phase 3: run `finetune.py` and change `param` to the checkpoint of Phase 2. For more tweaking, please trace the source code.
- Phase 4: run `finetune.py` and change `param` to the checkpoint of Phase 3.
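Condensed, the four phases differ only in the entry script, the dataset, the checkpoint that is resumed, and the global-motion settings. The sketch below summarizes that schedule; the dict layout is purely illustrative (it is not a config the repository reads), only `dataset`, `isLoadCheckpoint`, and `param` are real names from the scripts, and the global-motion values for Phases 3-4 are assumed to carry over from Phase 2 (verify in `finetune.py`):

```python
# Illustrative summary of the four training phases described above.
PHASES = {
    1: {"script": "train.py",    "dataset": "vimeo90k", "resume_phase": None, "global_motion": False},
    2: {"script": "train.py",    "dataset": "X4k",      "resume_phase": 1,    "global_motion": True},
    3: {"script": "finetune.py", "dataset": None,       "resume_phase": 2,    "global_motion": True},
    4: {"script": "finetune.py", "dataset": None,       "resume_phase": 3,    "global_motion": True},
}

# Every phase after the first resumes from the previous phase's checkpoint
# (set isLoadCheckpoint = True and point `param` at that checkpoint).
assert all(PHASES[k]["resume_phase"] == k - 1 for k in (2, 3, 4))
```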
TBA
Thanks to EMA-VFI, AMT, RIFE, XVFI, vgg_perceptual_loss, and GMFlow for releasing their source code.