ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation

Home Page: https://pku-yuangroup.github.io/ChronoMagic-Bench/

License: Apache License 2.0

Python 97.83% Shell 0.46% C++ 0.53% Cuda 1.19%
benchmark diffusion-models evaluation metamorphic-video-generation open-sora-plan text-to-video time-lapse time-lapse-dataset video-generation

If you like our project, please give us a star ⭐ on GitHub for the latest updates.

This repository is the official implementation of ChronoMagic-Bench, a benchmark for metamorphic evaluation of text-to-time-lapse video generation. The key insight is to evaluate the capabilities of Text-to-Video Generation Models in physics, biology, and chemistry by enabling the generation of time-lapse videos, which are characterized by rich physics priors, through a free-form text prompt.

💡 We also have other video generation projects that may interest you ✨.

  • Open-Sora-Plan (PKU-Yuan Lab and Tuzhan AI etc.)
  • MagicTime (Shenghai Yuan, Jinfa Huang and Yujun Shi etc.)

📣 News

  • ⏳⏳⏳ Evaluate more Text-to-Video Generation Models via ChronoMagic-Bench.
  • [2024.06.30] 🔥 We release the code of the "Multi-Aspect Data Preprocessing", which is used to process the ChronoMagic-Pro dataset. Please click here and here to see more details.
  • [2024.06.29] 🔥 Support evaluating customized Text-to-Video models. The code and instructions are available in this repo.
  • [2024.06.28] 🔥 We release the ChronoMagic-Pro and ChronoMagic-ProH datasets. The datasets include 460K and 150K time-lapse video-text pairs respectively and can be downloaded at HF-Dataset-Pro and HF-Dataset-ProH.
  • [2024.06.27] 🔥 We release the arXiv paper and Leaderboard for ChronoMagic-Bench, and you can click here to read the paper and here to see the leaderboard.
  • [2024.06.26] 🔥 We release the testing prompts, reference videos and generated results by different models in ChronoMagic-Bench, and you can click here to see more details.
  • [2024.06.25] 🔥 All code and datasets are coming soon! Stay tuned 👀!

😮 Highlights

ChronoMagic-Bench reflects how well a Text-to-Video generation model captures physical priors.

📣 Overview

In contrast to existing benchmarks, ChronoMagic-Bench emphasizes generating videos with high persistence and strong variation, i.e., metamorphic time-lapse videos with high physical prior content.

| Benchmark | Type | Visual Quality | Text Relevance | Metamorphic Amplitude | Temporal Coherence |
| --- | --- | --- | --- | --- | --- |
| UCF-101 | General | ✔️ | ✔️ | ❌ | ❌ |
| Make-a-Video-Eval | General | ✔️ | ✔️ | ❌ | ❌ |
| MSR-VTT | General | ✔️ | ✔️ | ❌ | ❌ |
| FETV | General | ✔️ | ✔️ | ❌ | ✔️ |
| VBench | General | ✔️ | ✔️ | ❌ | ✔️ |
| T2VScore | General | ✔️ | ✔️ | ❌ | ❌ |
| ChronoMagic-Bench | Time-lapse | ✔️ | ✔️ | ✔️ | ✔️ |

We specifically design four major categories for time-lapse videos (as shown below), including biological, human-created, meteorological, and physical videos, and extend these to 75 subcategories. Based on this, we construct ChronoMagic-Bench, comprising 1,649 prompts and their corresponding reference time-lapse videos.

| Biological | Human Created | Meteorological | Physical |
| --- | --- | --- | --- |
| "Time-lapse of microgreens germinating and growing ..." | "Time-lapse of a modern house being constructed in ..." | "Time-lapse of a beach sunset capturing the sun's ..." | "Time-lapse of an ice cube melting on a solid ..." |
| "Time-lapse of microgreens germinating and growing ..." | "Time-lapse of a 3D printing process: starting with ..." | "Time-lapse of a solar eclipse showing the moon's ..." | "Time-lapse of a cake baking in an oven, depicting ..." |
| "Time-lapse of a butterfly metamorphosis from ..." | "Time-lapse of a busy nighttime city intersection ..." | "Time-lapse of a landscape transitioning from a ..." | "Time-lapse of a strawberry rotting: starting with ..." |

🎓 Evaluation Results

We visualize the evaluation results of various open-source and closed-source T2V generation models across ChronoMagic-Bench.

🏆 Leaderboard

See numeric values at our Leaderboard 🥇🥈🥉

or you can run it locally:

cd LeadBoard
python app.py

โš™๏ธ Requirements and Installation

We recommend the requirements as follows.

Environment

git clone --depth=1 https://github.com/PKU-YuanGroup/ChronoMagic-Bench.git
cd ChronoMagic-Bench
conda create -n chronomagic python=3.10
conda activate chronomagic

# install base packages
pip install -r requirements.txt

# install flash-attn
git clone https://github.com/Dao-AILab/flash-attention.git
cd flash-attention/csrc/layer_norm && pip install .
cd ../../../
rm -r flash-attention

Download Checkpoints

huggingface-cli download --repo-type model \
--resume-download BestWishYsh/ChronoMagic-Bench \
--local-dir BestWishYsh/ChronoMagic-Bench \
--local-dir-use-symlinks False

🔨 Usage

Use ChronoMagic-Bench to evaluate videos and video generation models.

Prepare Videos for Evaluation

The generated videos should be named after the corresponding prompt ID in ChronoMagic-Bench and placed in the evaluation folder, which is structured as follows. We also provide input examples in 'toy_video'.

# for ChronoMagic-Bench
`-- input_video_folder
    |-- model_name_a
    |   |-- 1
    |   |   |-- 3d_printing_08.mp4
    |   |   `-- ...
    |   |-- 2
    |   |   |-- 3d_printing_08.mp4
    |   |   `-- ...
    |   `-- 3
    |       |-- 3d_printing_08.mp4
    |       `-- ...
    `-- model_name_b
        |-- 1
        |   |-- 3d_printing_08.mp4
        |   `-- ...
        |-- 2
        |   |-- 3d_printing_08.mp4
        |   `-- ...
        `-- 3
            |-- 3d_printing_08.mp4
            `-- ...

# for ChronoMagic-Bench-150
`-- input_video_folder
    |-- model_name_a
    |   |-- 3d_printing_08.mp4
    |   |-- animal_04.mp4
    |   `-- ...
    |-- model_name_b
    |   |-- 3d_printing_08.mp4
    |   `-- ...
    `-- ...

The filenames of all videos to be evaluated should be "videoid.mp4". For example, if the videoid is 3d_printing_08, the video filename should be "3d_printing_08.mp4". If this naming convention is not followed, the text relevance cannot be evaluated.
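
Because text relevance depends on each file stem matching a prompt ID, it may be worth checking a model folder before launching the evaluation. The following is a minimal sketch and not part of the repository: the function name `check_video_names` and the idea of passing the prompt IDs as a plain list are assumptions made for illustration. It searches recursively, so it should cover both layouts shown above.

```python
from pathlib import Path


def check_video_names(model_dir, prompt_ids):
    """Compare .mp4 file stems under model_dir with the expected prompt IDs.

    Returns (unknown, missing):
      unknown - video paths whose stem is not a known prompt ID
      missing - prompt IDs for which no video was found
    """
    expected = set(prompt_ids)
    # rglob also descends into the 1/2/3 subfolders of the first layout.
    videos = sorted(Path(model_dir).rglob("*.mp4"))
    unknown = [path for path in videos if path.stem not in expected]
    found = {path.stem for path in videos}
    missing = sorted(expected - found)
    return unknown, missing
```

For example, `check_video_names("toy_video/test", ["3d_printing_08"])` would report any file in that folder whose name is not "3d_printing_08.mp4"; the list of IDs itself has to come from the benchmark's prompt files.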

Get MTScore, CHScore and GPT4o-MTScore

We provide output examples in the 'results'. You can run the following commands for testing, then modify the relevant parameters (such as model_names, input_folder, model_pth and openai_api) to suit the text-to-video (T2V) generation model you want to evaluate.

# To evaluate more than one model, list them all after --model_names,
# e.g. --model_names test abc
python evaluate.py \
  --model_names test \
  --input_folder toy_video \
  --output_folder results \
  --video_frames_folder video_frames_folder_temp \
  --model_pth_CHScore cotracker2.pth \
  --model_pth_MTScore InternVideo2-stage2_1b-224p-f4.pt \
  --num_workers 8 \
  --openai_api "sk-UybXXX"
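
When evaluating several models, it can help to assemble this command programmatically instead of editing it by hand. The sketch below uses only the flags shown above. The helper name `build_eval_command` and reading the key from an `OPENAI_API_KEY` environment variable are assumptions for illustration, not something the repository provides.

```python
import os
import sys


def build_eval_command(model_names, input_folder="toy_video",
                       output_folder="results", api_key=None):
    """Assemble the evaluate.py call shown above as an argument list."""
    if not model_names:
        raise ValueError("at least one model name is required")
    # Assumption: fall back to the OPENAI_API_KEY environment variable
    # so the key does not have to be hard-coded in a script.
    if api_key is None:
        api_key = os.environ.get("OPENAI_API_KEY", "")
    return [
        sys.executable, "evaluate.py",
        "--model_names", *model_names,  # several names are space-separated
        "--input_folder", input_folder,
        "--output_folder", output_folder,
        "--video_frames_folder", "video_frames_folder_temp",
        "--model_pth_CHScore", "cotracker2.pth",
        "--model_pth_MTScore", "InternVideo2-stage2_1b-224p-f4.pt",
        "--num_workers", "8",
        "--openai_api", api_key,
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)` from the repository root, for example with `build_eval_command(["test", "abc"])`.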

If you only want to evaluate one of the metrics instead of calculating all of them, you can follow the steps below. Before running, please modify the parameters in the corresponding 'xxx.sh' script as needed. (If you want to obtain the JSON to submit to the leaderboard, you can organize the output files in MTScore / CHScore / GPT4o-MTScore according to 'results' and then proceed with the following steps.)

# Run each block from the repository root.

# for MTScore
cd MTScore
bash get_mtscore.sh

# for CHScore
cd CHScore
bash get_chscore.sh

# for GPT4o-MTScore
cd GPT4o_MTScore
bash get_gp4omtscore.sh

Get UMT-FVD and UMTScore

Please refer to the folder UMT for how to compute UMT-FVD and UMTScore.

Get File and Submit to Leaderboard

python get_uploaded_json.py \
  --input_path results/all \
  --output_path results

After completing the above steps, you will obtain ChronoMagic-Bench-Input.json, and then you need to manually fill the JSON with UMT-FVD and UMTScore (as we calculate them separately). Finally, you can submit the JSON to HuggingFace.

📑 Benchmark Prompts

We provide prompt lists and the reference videos of ChronoMagic-Bench at Hugging Face. You can use this to sample videos for evaluation of your model.

๐Ÿ„ Sampled Videos

Dataset Download

To facilitate future research and to ensure full transparency, we release all the videos we sampled and used for ChronoMagic-Bench evaluation. You can download them on Hugging Face. We also provide detailed explanations of the sampled videos and detailed setting for the models under evaluation here.

๐Ÿณ ChronoMagicPro Dataset

ChronoMagic-Pro with 460K time-lapse videos, each accompanied by a detailed caption. We also released the 150K subset (ChronoMagic-ProH), which is a higher quality subset. All the dataset can be downloaded at here and here, or you can download it with the following command. Some samples can be found on our Project Page.

# For the ProH subset, replace BestWishYsh/ChronoMagic-Pro with
# BestWishYsh/ChronoMagic-ProH in both places.
huggingface-cli download --repo-type dataset \
--resume-download BestWishYsh/ChronoMagic-Pro \
--local-dir BestWishYsh/ChronoMagic-Pro \
--local-dir-use-symlinks False
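
If you switch between the two datasets often, the repository ID can be selected in a small script. This is only a sketch built from the command above: the short names "pro" and "proh" and the helper `build_download_command` are hypothetical, and the flags are copied unchanged from the command line shown.

```python
# Repository IDs as given in the download command above.
HF_DATASETS = {
    "pro": "BestWishYsh/ChronoMagic-Pro",
    "proh": "BestWishYsh/ChronoMagic-ProH",
}


def build_download_command(subset="pro"):
    """Return the huggingface-cli call for the chosen dataset as an argument list."""
    key = subset.lower()
    if key not in HF_DATASETS:
        raise ValueError(f"unknown subset {subset!r}; expected one of {sorted(HF_DATASETS)}")
    repo_id = HF_DATASETS[key]
    return [
        "huggingface-cli", "download",
        "--repo-type", "dataset",
        "--resume-download", repo_id,
        "--local-dir", repo_id,  # mirrors the repo ID as the local path
        "--local-dir-use-symlinks", "False",
    ]
```

Passing the result of `build_download_command("proh")` to `subprocess.run(..., check=True)` should behave like the shell command with both occurrences of the repository ID swapped.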

Please refer to the folder Multi-Aspect_Preprocessing for how the ChronoMagic-Pro data was processed.

๐Ÿ‘ Acknowledgement

๐Ÿ”’ License

  • The majority of this project is released under the Apache 2.0 license as found in the LICENSE file.
  • The service is a research preview. Please contact us if you find any potential violations.

โœ๏ธ Citation

If you find our paper and code useful in your research, please consider giving a star โญ and citation ๐Ÿ“.

@article{yuan2024chronomagic,
  title={ChronoMagic-Bench: A Benchmark for Metamorphic Evaluation of Text-to-Time-lapse Video Generation},
  author={Yuan, Shenghai and Huang, Jinfa and Xu, Yongqi and Liu, Yaoyang and Zhang, Shaofeng and Shi, Yujun and Zhu, Ruijie and Cheng, Xinhua and Luo, Jiebo and Yuan, Li},
  journal={arXiv preprint arXiv:2406.18522},
  year={2024}
}

๐Ÿค Contributors
