3D Cinemagraphy from a Single Image (CVPR 2023)

Xingyi Li1,3, Zhiguo Cao1, Huiqiang Sun1, Jianming Zhang2, Ke Xian3*, Guosheng Lin3

1Huazhong University of Science and Technology, 2Adobe Research, 3S-Lab, Nanyang Technological University

This repository is the official PyTorch implementation of the CVPR 2023 paper "3D Cinemagraphy from a Single Image".

Installation

git clone https://github.com/xingyi-li/3d-cinemagraphy.git
cd 3d-cinemagraphy
bash requirements.sh

Usage

Download the pretrained models from Google Drive, then unzip them and put them in the ckpts directory.

For better motion estimation and controllable animation, we provide a controllable version here.

First, use labelme to specify the target regions (masks) and the desired movement directions (hints):

conda activate 3d-cinemagraphy
cd demo/0/
labelme image.png

We recommend specifying short hints rather than long ones to avoid artifacts. Please refer to the labelme documentation for detailed instructions if needed.

This produces an image.json file. The next step is to convert the annotations stored in JSON format into a dataset that our method can use:

labelme_json_to_dataset image.json  # this will generate a folder image_json
cd ../../
python scripts/generate_mask.py --inputdir demo/0/image_json
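
To check the annotations against the recommendation of short hints, image.json can be inspected directly. The following is a minimal sketch, not part of this repository: it assumes hints are drawn as labelme "line" shapes, and the function name hint_lengths and the labels "mask" and "hint" in the example are hypothetical.

```python
import json
import math


def hint_lengths(labelme_json_text):
    """Return (label, length_in_pixels) for every line-shaped annotation."""
    data = json.loads(labelme_json_text)
    lengths = []
    for shape in data.get("shapes", []):
        # Assumption: hints are drawn as labelme "line" shapes (two points).
        if shape.get("shape_type") != "line":
            continue
        (x0, y0), (x1, y1) = shape["points"][:2]
        lengths.append((shape["label"], math.hypot(x1 - x0, y1 - y0)))
    return lengths


# Tiny hand-written example in the labelme JSON layout (not real data).
example = json.dumps({
    "shapes": [
        {"label": "mask", "shape_type": "polygon",
         "points": [[0, 0], [10, 0], [10, 10]]},
        {"label": "hint", "shape_type": "line",
         "points": [[0, 0], [3, 4]]},
    ]
})
print(hint_lengths(example))
```

For a real annotation, pass the contents of demo/0/image.json instead of the example string. What counts as "short" depends on the image resolution, so the lengths are best compared relative to the image size.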

We can now create 3D cinemagraphs according to your preference:

python demo.py -c configs/config.yaml --input_dir demo/0/ --ckpt_path ckpts/model_150000.pth --flow_scale 1.0 --ds_factor 1.0
  • input_dir: input folder that contains the source images.
  • ckpt_path: path to the checkpoint.
  • flow_scale: scale used to control the speed of the fluid; values > 1.0 slow the fluid down.
  • ds_factor: downsampling factor for the input images.

Results will be saved to input_dir/output.
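
To compare several fluid speeds, the demo command above can be generated for a range of flow_scale values. This is a small sketch that only uses the flags listed above; the helper name build_demo_command is hypothetical and not part of this repository.

```python
import shlex


def build_demo_command(input_dir, ckpt_path, flow_scale=1.0, ds_factor=1.0):
    """Assemble the demo.py invocation shown above as an argument list."""
    return [
        "python", "demo.py",
        "-c", "configs/config.yaml",
        "--input_dir", input_dir,
        "--ckpt_path", ckpt_path,
        "--flow_scale", str(flow_scale),
        "--ds_factor", str(ds_factor),
    ]


# Print one command per flow_scale value; larger values slow the fluid down.
for scale in (1.0, 2.0, 4.0):
    cmd = build_demo_command("demo/0/", "ckpts/model_150000.pth", flow_scale=scale)
    print(shlex.join(cmd))
    # To actually run it from the repository root:
    # subprocess.run(cmd, check=True)
```

Since every run writes to input_dir/output, it may be worth copying the results elsewhere between runs if you want to keep them side by side.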

Known issues

  • Due to the limited size of the training dataset, intermediate frames may occasionally flicker.
  • Using a fixed distance threshold for agglomerative clustering in disparity space can occasionally produce visible boundaries between layers.
  • Artifacts may appear when the fluid moves very fast. To avoid them, either slow the fluid down by increasing flow_scale or specify short hints rather than long ones.
  • The motion estimation module occasionally generates motion fields that do not align perfectly with the specified hints.
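
The layering issue in the second item above can be illustrated with a toy example. This is a simplified sketch under stated assumptions, not the code used in this repository: it works on a handful of scalar disparities and uses single-linkage merging, which in one dimension reduces to splitting the sorted values wherever the gap exceeds the threshold. The function name layer_by_disparity and the sample values are hypothetical.

```python
def layer_by_disparity(disparities, threshold):
    """Group scalar disparities into layers with a fixed distance threshold.

    Assumption: single-linkage agglomerative clustering. In 1D this is
    equivalent to sorting the values and starting a new layer whenever the
    gap to the previous value is larger than the threshold.
    """
    layers = []
    for d in sorted(disparities):
        if layers and d - layers[-1][-1] <= threshold:
            layers[-1].append(d)  # close enough: same layer
        else:
            layers.append([d])    # gap too large: start a new layer
    return layers


# Made-up disparity samples, for illustration only.
samples = [0.90, 0.12, 0.40, 0.10, 0.42, 0.15]
for layer in layer_by_disparity(samples, threshold=0.1):
    print(layer)
```

Because the threshold is the same for every image, a smoothly receding surface can be cut into separate layers wherever neighbouring disparities happen to differ by more than the threshold, which is one way the visible boundaries described above can arise.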

Citation

If you find our work useful in your research, please consider citing our paper:

@InProceedings{li2023_3dcinemagraphy,
    author    = {Li, Xingyi and Cao, Zhiguo and Sun, Huiqiang and Zhang, Jianming and Xian, Ke and Lin, Guosheng},
    title     = {3D Cinemagraphy From a Single Image},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {4595-4605}
}

Acknowledgement

This code borrows heavily from 3D Moments and SLR-SFS. We thank the respective authors for open-sourcing their methods.
