
magic-research / magic-animate


[CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model

Home Page: https://showlab.github.io/magicanimate/

License: BSD 3-Clause "New" or "Revised" License

Python 99.95% Shell 0.05%

magic-animate's People

Contributors

zcxu-eric


magic-animate's Issues

M1 Pro encounters a problem loading the models

I tried to follow the instructions to run this project, but it failed with this error:
"huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': 'magicanimate/pretrained_models/stable-diffusion-v1-5'. Use repo_type argument if needed."

I have already downloaded the entire stable-diffusion-v1-5 repository from Hugging Face to a local folder, but it still does not work on my M1 Pro laptop.
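
As a quick check, here is a minimal sketch of how this error usually arises, assuming the default layout from animation.yaml: transformers and diffusers treat the string as a local path only if that directory actually exists relative to the current working directory; otherwise they validate it as a Hub repo id and raise the HFValidationError above.

import os
from transformers import CLIPTextModel

# Assumed path from animation.yaml; run this from the repository root.
model_dir = "pretrained_models/stable-diffusion-v1-5"
# If this check fails, the loader falls back to treating the string as a Hub repo id.
assert os.path.isdir(os.path.join(model_dir, "text_encoder")), "path not found; check the working directory"
text_encoder = CLIPTextModel.from_pretrained(model_dir, subfolder="text_encoder", local_files_only=True)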

motion sequence video upload error?

After tinkering with detectron2 DensePose, I generated a motion sequence like this:

dancing.mp4

What is the next step if I want to upload the motion sequence video successfully?
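
A possible next step is to re-encode the DensePose result into the shape the Gradio demo expects; this is a minimal sketch assuming the demo accepts a plain .mp4 that it resizes to 512x512 itself (see the read_video helper in the demo script further down this page):

import imageio
import numpy as np
from PIL import Image

# Re-encode the DensePose output as a square 512x512 mp4 before uploading it
# as the motion sequence in the Gradio demo.
reader = imageio.get_reader("dancing.mp4")
frames = [np.array(Image.fromarray(frame).resize((512, 512))) for frame in reader]
imageio.mimwrite("dancing_512.mp4", frames, fps=25)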

How did the demo videos achieve facial movements when DensePose does not contain facial information?

In this demo, we can see the girl moving her mouth "lip syncing".

However, since DensePose does not contain any facial information (it's just blobs), and the initial image only contains one reference of the face, how is it extrapolating lip-sync movements?

From my personal experiments, it seems very challenging to maintain facial coherence, especially during dynamic movements.

I'd love to learn more about how those demo videos were achieved.

Please fix dependencies for Windows

Remove nvidia-cudnn-cu11==8.5.0.96 from requirements.txt because it is Linux-only (install cuDNN manually on Windows) [solved]

Remove nvidia-nccl-cu11==2.14.3 from requirements.txt because it is Linux-only (install the CUDA SDK manually on Windows) [solved]

Remove triton==2.0.0 from requirements.txt because it is Linux-only (I don't know how to fix this one)
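
A minimal sketch (a hypothetical helper, not part of the repo) that writes a Windows-friendly copy of requirements.txt with those Linux-only pins stripped, instead of deleting them by hand:

# Drop the Linux-only NVIDIA/triton pins, then run: pip install -r requirements_windows.txt
linux_only = ("nvidia-cudnn-cu11", "nvidia-nccl-cu11", "triton")

with open("requirements.txt") as src, open("requirements_windows.txt", "w") as dst:
    for line in src:
        if not line.strip().startswith(linux_only):
            dst.write(line)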

MagicAnimate Online Demo

Thanks for the amazing project ❤️
I made an online demo for more people to enjoy this awesome work 🎉

By the way, I made some changes and it now supports arbitrary aspect ratios:
image

Again, really amazing idea and great work 👍

How to use save_individual_videos

I only want to save the final video; I have changed animation.yaml:
pretrained_model_path: "pretrained_models/stable-diffusion-v1-5"
pretrained_vae_path: "pretrained_models/sd-vae-ft-mse"
pretrained_controlnet_path: "pretrained_models/MagicAnimate/densepose_controlnet"
pretrained_appearance_encoder_path: "pretrained_models/MagicAnimate/appearance_encoder"
pretrained_unet_path: ""

motion_module: "pretrained_models/MagicAnimate/temporal_attention/temporal_attention.ckpt"

savename: null

fusion_blocks: "midup"

seed: [1]
steps: 25
guidance_scale: 7.5

source_image:
  - "inputs/applications/source_image/monalisa.png"
  - "inputs/applications/source_image/0002.png"
  - "inputs/applications/source_image/demo4.png"
  - "inputs/applications/source_image/dalle2.jpeg"
  - "inputs/applications/source_image/dalle8.jpeg"
  - "inputs/applications/source_image/multi1_source.png"
video_path:
  - "inputs/applications/driving/densepose/running.mp4"
  - "inputs/applications/driving/densepose/demo4.mp4"
  - "inputs/applications/driving/densepose/demo4.mp4"
  - "inputs/applications/driving/densepose/running2.mp4"
  - "inputs/applications/driving/densepose/dancing2.mp4"
  - "inputs/applications/driving/densepose/multi_dancing.mp4"

inference_config: "configs/inference/inference.yaml"
size: 512
L: 16
S: 1
I: 0
clip: 0
offset: 0
max_length: null
video_type: "condition"
invert_video: false
save_individual_videos: true

but save_individual_videos does not work.
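
A minimal workaround sketch, assuming the saved result is the usual side-by-side grid (reference image, DensePose sequence, and animation) at size 512; whether the save_individual_videos flag is honored depends on the pipeline code, which is not shown here. Cropping the right-most panel recovers just the final animation:

import imageio

size = 512
reader = imageio.get_reader("samples/grid.mp4")  # hypothetical path to the saved grid video
# Keep only the right-most `size` columns of every frame, i.e. the animation panel;
# adjust the slice if the grid has a few pixels of border padding.
frames = [frame[:, -size:, :] for frame in reader]
imageio.mimwrite("samples/final_only.mp4", frames, fps=25)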

How can I improve the video?

How do I adjust the sampling steps and guidance scale to improve the result?

Here's what I got, but it looks a bit weird:

ezgif com-video-to-gif

Apple M2: No module named 'magicanimate'

requirements_mac.txt
I modified requirements.txt to make it suitable for my Mac.
While executing ./scripts/animate.sh, I encountered an error:

No module named 'magicanimate'

So I tried pip3 install magicanimate and got the following:

ERROR: Could not find a version that satisfies the requirement magicanimate (from versions: none)
ERROR: No matching distribution found for magicanimate
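
For reference, a minimal sketch of the usual cause, assuming the repository layout shown elsewhere on this page: magicanimate is the in-repo package, not a PyPI distribution, so it has to be importable from the checkout itself, either by running python3 -m demo.gradio_animate from the repo root or by putting the repo root on sys.path:

import sys

# Hypothetical local checkout path; adjust to wherever magic-animate was cloned.
sys.path.insert(0, "/Users/me/magic-animate")
import magicanimate  # resolves to the in-repo package once the repo root is on the path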

Custom input data error

Hey, thanks for the release; it's amazing to have access to this for free! Quick question though: I'm getting this error when using my own input image (it works fine with the example data). Any idea?

torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [128, 3, 3, 3], expected input[1, 4, 512, 512] to have 3 channels, but got 4 channels instead

Thanks!
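
A minimal sketch of a likely fix, assuming the extra channel comes from an RGBA (or palette-with-alpha) PNG: the first convolution expects 3-channel input, so converting the reference image to RGB before running the demo avoids the 4-channel mismatch above.

from PIL import Image

img = Image.open("my_reference.png")  # hypothetical path to the custom input image
if img.mode != "RGB":
    img = img.convert("RGB")  # drop the alpha channel that produced the 4th input channel
img.save("my_reference_rgb.png")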

Error

This error doesn't give me more details to share, and I can't run the model even with the samples.
The progress bar starts, but partway through Gradio just shows an error while the progress bar in the terminal keeps going; Gradio still doesn't display anything.
I'm running it via python3 -m demo.gradio_animate

image

Dense pose video for Portrait Video

Great work!
I got it working.
I can create a portrait video for TikTok by changing the code.

2023-12-05T10-35-11.mp4

However, the video looks weird because the DensePose video is square.
How can I create a portrait DensePose video for TikTok?
Thanks in advance.
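
The cleaner route is to run DensePose on a portrait source video; failing that, here is a minimal sketch that letterboxes the square DensePose frames onto a portrait canvas instead of stretching them (black padding matches DensePose's empty background). Note that the demo's read_video helper shown later on this page still forces frames back to a square size, so the demo code would also need adjusting for genuine portrait output.

import imageio
import numpy as np
from PIL import Image

width, height = 512, 896  # hypothetical portrait target, both divisible by 16
reader = imageio.get_reader("dancing.mp4")  # hypothetical square DensePose input
frames = []
for f in reader:
    canvas = Image.new("RGB", (width, height), (0, 0, 0))
    canvas.paste(Image.fromarray(f).resize((width, width)), (0, (height - width) // 2))  # center vertically
    frames.append(np.array(canvas))
imageio.mimwrite("dancing_portrait.mp4", frames, fps=25)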

Installation guide for Windows

Install Ubuntu on WSL2 on Windows 10
https://ubuntu.com/tutorials/install-ubuntu-on-wsl2-on-windows-10#7-enjoy-ubuntu-on-wsl

git lfs install:
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install git-lfs
git lfs install

ffmpeg install:
sudo add-apt-repository ppa:mc3man/trusty-media
sudo apt-get update
sudo apt-get dist-upgrade
sudo apt-get install ffmpeg

git clone https://github.com/magic-research/magic-animate.git
cd magic-animate
pip install nvidia-pyindex
pip3 install -r requirements.txt

Create the folder structure you need:
pretrained_models
git lfs clone https://huggingface.co/zcxu-eric/MagicAnimate

pretrained_models/sd-vae-ft-mse
https://huggingface.co/stabilityai/sd-vae-ft-mse/resolve/main/diffusion_pytorch_model.safetensors
https://huggingface.co/stabilityai/sd-vae-ft-mse/resolve/main/config.json

pretrained_models/stable-diffusion-v1-5
https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors

stable-diffusion-v1-5/tokenizer
All files

stable-diffusion-v1-5/text_encoder
https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/text_encoder/pytorch_model.bin

stable-diffusion-v1-5/unet
https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/unet/config.json
https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/unet/diffusion_pytorch_model.bin

stable-diffusion-v1-5/scheduler
https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/scheduler/scheduler_config.json

Run:
python3 -m demo.gradio_animate
or
bash scripts/animate.sh

Training code

hi guys, awesome work! I'm wondering if you guys have any training code readily available? Much appreciated!

Speeding up inference?

I'm no expert in video models, but surely there is a way to make these models run faster? I'm seeing >300 s of generation time on an 8x3090 node...

e.g., surely the attention here could be replaced with flash-attn?
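
A minimal sketch of one common speed-up, assuming the animation pipeline follows the standard diffusers pattern and that xformers is installed; whether MagicAnimate's pipeline exposes this helper directly has not been verified here.

import torch
from diffusers import DiffusionPipeline

# Stand-in for the MagicAnimate animation pipeline; memory-efficient attention
# typically reduces both latency and VRAM in the UNet attention blocks.
pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipe.enable_xformers_memory_efficient_attention()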

How to make it process 768 pixels?

I am giving a 768 px image and a 768x768 DensePose video but am still getting 512 px output.

I have edited:

inference_config: "configs/inference/inference.yaml"
size: 768
L: 16
S: 1
I: 0
clip: 0
offset: 0
max_length: null
video_type: "condition"
invert_video: false
save_individual_videos: true

save_individual_videos is also not working.

5_sec_768_dense.mp4

768

2023-12-05T23-26-23.mp4

requirements.txt

Multiple NVIDIA packages in requirements.txt are not available:
"ERROR: Could not find a version that satisfies the requirement nvidia-nccl-cu11==2.14.3 (from versions: 0.0.1.dev5)
ERROR: No matching distribution found for nvidia-nccl-cu11==2.14.3"
"ERROR: Ignored the following yanked versions: 8.9.4.19
ERROR: Could not find a version that satisfies the requirement nvidia-cudnn-cu11==8.5.0.96 (from versions: 0.0.1.dev5, 8.9.4.25, 8.9.5.29)
ERROR: No matching distribution found for nvidia-cudnn-cu11==8.5.0.96"

Some warnings during inference

I am confused about something during inference, since I am new to AIGC.

I am sure I have changed "pretrained_model_path" in animation.yaml to my model path, and I have set clip_sample=false in the config.json file, but the terminal still prints output like this:
has not set the configuration clip_sample. clip_sample should be set to False in the configuration file. Please make sure to update the config accordingly as not setting clip_sample in the config might lead to incorrect results in future versions. If you have downloaded this checkpoint from the Hugging Face Hub, it would be very nice if you could open a Pull request for the scheduler/scheduler_config.json file
image

The same happens for steps_offset=1.

Additionally, I don't know what this warning is about, since my images have already been resized to (512, 512):
IMAGEIO FFMPEG_WRITER WARNING: input image is not divisible by macro_block_size=16, resizing from (1544, 516) to (1552, 528) to ensure video compatibility with most codecs and players. To prevent resizing, make your input image divisible by the macro_block_size or set the macro_block_size to 1 (risking incompatibility).
image

I don't know whether the warnings above influence the results of video generation, or how to get rid of them. Sorry to bother you; I am wondering whether you could help me.
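
The clip_sample and steps_offset messages are deprecation notices from the diffusers scheduler loader and typically come from the checkpoint's scheduler_config.json rather than your edits. For the imageio warning, the (1544, 516) size is the concatenated result grid rather than your 512x512 inputs, and the ffmpeg writer pads dimensions to multiples of macro_block_size=16 by default. A minimal sketch, assuming the videos are written with imageio as the warning text indicates; macro_block_size=1 keeps the original size at the cost of somewhat reduced codec compatibility:

import imageio
import numpy as np

# Placeholder frames with the grid size from the warning; real code would use the
# frames produced by the pipeline.
frames = [np.zeros((516, 1544, 3), dtype=np.uint8) for _ in range(8)]
imageio.mimwrite("grid.mp4", frames, fps=25, macro_block_size=1)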

How to fix GPU usage on an RTX 3080 10 GB

image

 44%|████████████████████████████████████                                              | 11/25 [02:48<03:34, 15.30s/it]
Traceback (most recent call last):
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\gradio\routes.py", line 488, in run_predict
    output = await app.get_blocks().process_api(
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\gradio\blocks.py", line 1431, in process_api
    result = await self.call_function(
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\gradio\blocks.py", line 1103, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\gradio\utils.py", line 707, in wrapper
    response = f(*args, **kwargs)
  File "V:\_ANIMATION\MAGIC_ANIMATE\magic-animate-for-windows\demo\gradio_animate_gpu_1.py", line 22, in animate
    return animator(reference_image, motion_sequence_state, seed, steps, guidance_scale)
  File "V:\_ANIMATION\MAGIC_ANIMATE\magic-animate-for-windows\demo\animate_gpu_1.py", line 164, in __call__
    sample = self.pipeline(
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "V:\_ANIMATION\MAGIC_ANIMATE\magic-animate-for-windows\magicanimate\pipelines\pipeline_animation.py", line 738, in __call__
    pred = self.unet(
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "V:\_ANIMATION\MAGIC_ANIMATE\magic-animate-for-windows\magicanimate\models\unet_controlnet.py", line 462, in forward
    sample = upsample_block(
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "V:\_ANIMATION\MAGIC_ANIMATE\magic-animate-for-windows\magicanimate\models\unet_3d_blocks.py", line 653, in forward
    hidden_states = attn(hidden_states, encoder_hidden_states=encoder_hidden_states).sample
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "V:\_ANIMATION\MAGIC_ANIMATE\magic-animate-for-windows\magicanimate\models\attention.py", line 136, in forward
    hidden_states = block(
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "V:\_ANIMATION\MAGIC_ANIMATE\magic-animate-for-windows\magicanimate\models\mutual_self_attention.py", line 272, in hacked_basic_transformer_inner_forward
    hidden_states = self.ff(self.norm3(hidden_states)) + hidden_states
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\diffusers\models\attention.py", line 307, in forward
    hidden_states = module(hidden_states, scale)
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\diffusers\models\attention.py", line 356, in forward
    return hidden_states * self.gelu(gate)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 320.00 MiB (GPU 0; 10.00 GiB total capacity; 8.34 GiB already allocated; 0 bytes free; 9.12 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
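
A minimal sketch of the mitigations the error message itself points at, plus the usual diffusers-style memory savers; whether the MagicAnimate pipeline object exposes the slicing helpers has not been verified here.

import os

# Must be set before the first CUDA allocation (ideally before launching Python);
# this follows the allocator hint in the error message above.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# If the pipeline inherits from diffusers' DiffusionPipeline, these trade speed for
# a lower peak VRAM footprint (uncomment if available on the loaded pipeline object):
# pipeline.enable_attention_slicing()
# pipeline.enable_vae_slicing()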

How do you animate an image?

Hi, I'd like to know how to animate an image. I don't really understand how.

It would be to animate an image from my gallery photos for a work project, so if you could reply as soon as possible that would be great. Thank you.

Has anyone installed this on a Windows machine?

The yaml install didn't work for me, so I tried via requirements.txt. I had to install cuDNN and made some progress on the errors I received, but then got stuck trying to install this dependency: pip install nvidia-nccl-cu11. Has anyone else gotten it to run locally? I am on Windows, creating a conda env.

Do you incorporate UV from DensePose?

Thanks for this incredible work!

For the motion representation, do you use the UV coordinates from the densepose representation or only the sequence of semantic part maps? If not, why not include the texture map as inputs for motion conditioning?

Thanks again!

Code availability

Hey! I just want to know if you will release the source code of the project. Great work 🙌

typo

conda env create -f environment.yml fails because the file is not found

-> conda env create -f environment.yaml

Choice of densepose

Hi. Thanks for the great work. What was the reasoning behind the choice of DensePose? I see that the segmentation masks are not always accurate, so it leads to some inconsistencies in human biomechanics. Out of curiosity, would such a diffusion model work if one were to map the body onto a body model like SMPL and go from there?

Can't upload custom Densepose to the gradio demo

Hi, I get an error when I try to upload any DensePose videos to the Gradio demo.

I've tried resizing the video to 512 x 512, using mp4 format, and even downloading the videos attached in the example and re-uploading them.

All of them return the following error. I am using Hugging Face Spaces.

UnboundLocalError: local variable 'size' referenced before assignment
Traceback (most recent call last):
  File "/home/user/.pyenv/versions/3.8.18/lib/python3.8/site-packages/gradio/queueing.py", line 456, in call_prediction
    output = await route_utils.call_process_api(
  File "/home/user/.pyenv/versions/3.8.18/lib/python3.8/site-packages/gradio/route_utils.py", line 232, in call_process_api
    output = await app.get_blocks().process_api(
  File "/home/user/.pyenv/versions/3.8.18/lib/python3.8/site-packages/gradio/blocks.py", line 1522, in process_api
    result = await self.call_function(
  File "/home/user/.pyenv/versions/3.8.18/lib/python3.8/site-packages/gradio/blocks.py", line 1144, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/home/user/.pyenv/versions/3.8.18/lib/python3.8/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/user/.pyenv/versions/3.8.18/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/home/user/.pyenv/versions/3.8.18/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/home/user/.pyenv/versions/3.8.18/lib/python3.8/site-packages/gradio/utils.py", line 673, in wrapper
    response = f(*args, **kwargs)
  File "app.py", line 62, in read_video
    size = int(size)
UnboundLocalError: local variable 'size' referenced before assignment

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/.pyenv/versions/3.8.18/lib/python3.8/site-packages/gradio/queueing.py", line 501, in process_events
    response = await self.call_prediction(awake_events, batch)
  File "/home/user/.pyenv/versions/3.8.18/lib/python3.8/site-packages/gradio/queueing.py", line 465, in call_prediction
    raise Exception(str(error) if show_error else None) from error
Exception: None
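
For context, the local demo script reproduced later on this page takes size as a defaulted parameter, which cannot trigger this UnboundLocalError. A defensive sketch of read_video along those lines, assuming the Spaces app.py only binds size inside a conditional branch:

import os
import imageio
import numpy as np
from PIL import Image

def read_video(video, size=512):
    size = int(size)  # size is always bound here, so int(size) cannot raise UnboundLocalError
    reader = imageio.get_reader(video)
    frames = [np.array(Image.fromarray(img).resize((size, size))) for img in reader]
    save_path = "demo/tmp/input_motion_sequence.mp4"  # hypothetical output location
    os.makedirs(os.path.dirname(save_path), exist_ok=True)
    imageio.mimwrite(save_path, frames, fps=25)
    return save_path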

I created an open source 1 click installer

Hi, I got this to work on my Windows NVIDIA RTX A4500 and wrote a one-click installer. Hopefully Mac support is coming soon, in which case the same installer can be used to install on Macs as well.

You can learn more here: https://x.com/cocktailpeanut/status/1732052908227588263?s=20

Basically, I'm working on a desktop app that runs any kind of command script with the click of a button. If you have any trouble installing, please reach out on my Discord. Happy to help. Hope this is helpful.

I made an auto installer. However, the Gradio app uses more than 24 GB of VRAM, and animate_dist gives the error below

I made an auto installer script, and it works on Windows 10 with Python 3.10.11.

I am trying gradio_animate.py and it is using well over 24 GB of VRAM.

When your pre-shared motion sequences are used, it works and uses less than 10 GB of VRAM.

When I upload a raw video, it gives an out-of-VRAM error even with over 24 GB of VRAM.

2023-12-05T04-02-57.mp4

When I try the video below, it gives an out-of-VRAM error on my RTX 3090 machine no matter what.

ex2.mp4

So I tried to run gradio_animate_dist.py, but it gives the error below no matter what. I even fixed path errors.

To create a public link, set `share=True` in `launch()`.
Traceback (most recent call last):
  File "G:\magic_animate\magic-animate\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict
    output = await app.get_blocks().process_api(
  File "G:\magic_animate\magic-animate\venv\lib\site-packages\gradio\blocks.py", line 1434, in process_api
    data = self.postprocess_data(fn_index, result["prediction"], state)
  File "G:\magic_animate\magic-animate\venv\lib\site-packages\gradio\blocks.py", line 1335, in postprocess_data
    prediction_value = block.postprocess(prediction_value)
  File "G:\magic_animate\magic-animate\venv\lib\site-packages\gradio\components\video.py", line 281, in postprocess
    processed_files = (self._format_video(y), None)
  File "G:\magic_animate\magic-animate\venv\lib\site-packages\gradio\components\video.py", line 355, in _format_video
    video = self.make_temp_copy_if_needed(video)
  File "G:\magic_animate\magic-animate\venv\lib\site-packages\gradio\components\base.py", line 226, in make_temp_copy_if_needed
    temp_dir = self.hash_file(file_path)
  File "G:\magic_animate\magic-animate\venv\lib\site-packages\gradio\components\base.py", line 190, in hash_file
    with open(file_path, "rb") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'G:\\magic_animate\\magic-animate\\demo\\demo\\outputs\\2023-12-05T04-34-44.mp4'
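
One thing that may hide the real failure, sketched here under the assumption that animate() shells out to demo.animate_dist with stdout/stderr piped as in the script below: if that subprocess exits non-zero, no mp4 is ever written and Gradio later raises the FileNotFoundError above. Checking the return code and printing stderr surfaces the underlying error (the helper name is hypothetical):

from subprocess import PIPE, run

def run_checked(command: str) -> None:
    # Run the animate_dist command and fail loudly instead of silently swallowing stderr.
    result = run(command, stdout=PIPE, stderr=PIPE, universal_newlines=True, shell=True)
    if result.returncode != 0:
        print(result.stderr)
        raise RuntimeError(f"command failed with exit code {result.returncode}")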

Here is the fixed gradio_animate_dist.py:

import argparse
import imageio
import os, datetime
import numpy as np
import gradio as gr
from PIL import Image
from subprocess import PIPE, run

base_dir = os.path.dirname(os.path.abspath(__file__))
demo_dir = os.path.join(base_dir, "demo")
tmp_dir = os.path.join(demo_dir, "tmp")
outputs_dir = os.path.join(demo_dir, "outputs")

os.makedirs(tmp_dir, exist_ok=True)
os.makedirs(outputs_dir, exist_ok=True)

def animate(reference_image, motion_sequence, seed, steps, guidance_scale):
    time_str = datetime.datetime.now().strftime("%Y-%m-%dT%H-%M-%S")
    animation_path = os.path.join(outputs_dir, f"{time_str}.mp4")
    save_path = os.path.join(tmp_dir, "input_reference_image.png")
    Image.fromarray(reference_image).save(save_path)
    command = f"python -m demo.animate_dist --reference_image {save_path} --motion_sequence {motion_sequence} --random_seed {seed} --step {steps} --guidance_scale {guidance_scale} --save_path {animation_path}"
    run(command, stdout=PIPE, stderr=PIPE, universal_newlines=True, shell=True)
    return animation_path

with gr.Blocks() as demo:

    gr.HTML(
        """
        <div style="display: flex; justify-content: center; align-items: center; text-align: center;">
        <a href="https://github.com/magic-research/magic-animate" style="margin-right: 20px; text-decoration: none; display: flex; align-items: center;">
        </a>
        <div>
            <h1 >MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model</h1>
            <h5 style="margin: 0;">If you like our project, please give us a star ✨ on Github for the latest update.</h5>
            <div style="display: flex; justify-content: center; align-items: center; text-align: center;">
                <a href="https://arxiv.org/abs/2311.16498"><img src="https://img.shields.io/badge/Arxiv-2311.16498-red"></a>
                <a href='https://showlab.github.io/magicanimate'><img src='https://img.shields.io/badge/Project_Page-MagicAnimate-green' alt='Project Page'></a>
                <a href='https://github.com/magic-research/magic-animate'><img src='https://img.shields.io/badge/Github-Code-blue'></a>
            </div>
        </div>
        </div>
        """
    )
    animation = gr.Video(format="mp4", label="Animation Results", autoplay=True)
    
    with gr.Row():
        reference_image  = gr.Image(label="Reference Image")
        motion_sequence  = gr.Video(format="mp4", label="Motion Sequence")
        
        with gr.Column():
            random_seed         = gr.Textbox(label="Random seed", value=1, info="default: -1")
            sampling_steps      = gr.Textbox(label="Sampling steps", value=25, info="default: 25")
            guidance_scale      = gr.Textbox(label="Guidance scale", value=7.5, info="default: 7.5")
            submit              = gr.Button("Animate")

    def read_video(video, size=512):
        size = int(size)
        reader = imageio.get_reader(video)
        frames = []
        for img in reader:
            frames.append(np.array(Image.fromarray(img).resize((size, size))))
        
        save_path = os.path.join(tmp_dir, "input_motion_sequence.mp4")
        imageio.mimwrite(save_path, frames, fps=25)
        return save_path
    
    def read_image(image, size=512):
        img = np.array(Image.fromarray(image).resize((size, size)))
        return img
        
    # when user uploads a new video
    motion_sequence.upload(
        read_video,
        motion_sequence,
        motion_sequence
    )
    # when `first_frame` is updated
    reference_image.upload(
        read_image,
        reference_image,
        reference_image
    )
    # when the `submit` button is clicked
    submit.click(
        animate,
        [reference_image, motion_sequence, random_seed, sampling_steps, guidance_scale], 
        animation
    )

    # Examples
    gr.Markdown("## Examples")
    gr.Examples(
        examples=[
            ["inputs/applications/source_image/monalisa.png", "inputs/applications/driving/densepose/running.mp4"], 
        ],
        inputs=[reference_image, motion_sequence],
        outputs=animation,
    )

demo.launch(share=False,inbrowser=True)

Improving facial likeness

First, congratulations on this project and thank you for your excellent work!

In my sample generations (which I have attached) you can see that the source face is not preserved very well, i.e. the person in the video does not look like the person in the source image.

I'm wondering, are there any settings I can change to improve this, or any other tips on the best source image to use to preserve facial features?

Thanks again!

bart2_demo4.mp4
grid.mp4
bart3_demo4.mp4
grid.mp4

Process is "Killed" after finishing

It worked once, but now it doesn't. I have 16 GB of VRAM.

100%|████████████████████████████████████████████████████████████████████████████████| 25/25 [08:31<00:00, 20.46s/it]100%|████████████████████████████████████████████████████████████████████████████████| 112/112 [00:26<00:00, 4.16it/s]Killed

A popup on the top right:
Error
Unexpected token '<', " <h"... is not valid JSON

Use OpenPose instead of DensePose

I noticed an inconsistency when using DensePose, especially on the hands. Given this, I'm curious about the feasibility of utilizing OpenPose as an alternative to DensePose in my workflow. Have you considered any potential advantages or challenges of such a switch?

OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory pretrained_models/stable-diffusion-v1-5.

(manimate) ryan@DESKTOP-81PKUNM:/mnt/d/python/magic-animate-main$ bash scripts/animate.sh
Traceback (most recent call last):
  File "/home/ryan/miniconda3/envs/manimate/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/ryan/miniconda3/envs/manimate/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/mnt/d/python/magic-animate-main/magicanimate/pipelines/animation.py", line 282, in <module>
    run(args)
  File "/mnt/d/python/magic-animate-main/magicanimate/pipelines/animation.py", line 271, in run
    main(args)
  File "/mnt/d/python/magic-animate-main/magicanimate/pipelines/animation.py", line 76, in main
    text_encoder = CLIPTextModel.from_pretrained(config.pretrained_model_path, subfolder="text_encoder")
  File "/home/ryan/miniconda3/envs/manimate/lib/python3.8/site-packages/transformers/modeling_utils.py", line 2805, in from_pretrained
    state_dict = load_state_dict(resolved_archive_file)
  File "/home/ryan/miniconda3/envs/manimate/lib/python3.8/site-packages/transformers/modeling_utils.py", line 458, in load_state_dict
    with safe_open(checkpoint_file, framework="pt") as f:
safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge

I downloaded the models from the updated links from #2
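
A minimal diagnostic sketch, assuming the common cause of HeaderTooLarge: the downloaded weight file is a Git LFS pointer stub (or an HTML error page) rather than the real checkpoint, so safetensors cannot parse its header. The path below is hypothetical; check whichever file the traceback points at.

import os

path = "pretrained_models/stable-diffusion-v1-5/text_encoder/pytorch_model.bin"  # hypothetical file to check
size_mb = os.path.getsize(path) / 1e6
with open(path, "rb") as f:
    head = f.read(64)
if head.startswith(b"version https://git-lfs") or size_mb < 1:
    print("This is an LFS pointer/stub, not real weights; re-download it (e.g. git lfs pull).")
else:
    print(f"File looks like real weights ({size_mb:.1f} MB).")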

LCM sampler

2023-12-05T23-09-44.mp4

This is 8 steps, guidance 4, with an LCM-trained checkpoint. It looks almost like the usual generation with 25 steps. It should work a bit better if we use the LCM sampler, I guess. So how do we change the sampler?
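
A minimal sketch of swapping in diffusers' LCM scheduler, assuming the animation pipeline follows the usual pattern of holding a scheduler attribute and that the installed diffusers version ships LCMScheduler; MagicAnimate builds its own pipeline, so treat this as a starting point rather than a drop-in change.

from diffusers import DiffusionPipeline, LCMScheduler

# Stand-in for the loaded animation pipeline; swap its scheduler for the LCM one and
# then sample with the low step count and guidance used above (e.g. 8 steps, scale 4).
pipe = DiffusionPipeline.from_pretrained("pretrained_models/stable-diffusion-v1-5")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)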

Too much resource usage

Hi, thanks for this amazing project. I have issues with generation: even on a RunPod cloud 80 GB GPU, I still can't get over 25 sampling steps. It gives this error in Gradio: Error
Unexpected token '<', " <!DOCTYPE "... is not valid JSON

This is the terminal error:
workspace/magic-animate/magicanimate/pipelines/pipeline_animation.py:624: FutureWarning: Accessing config attribute in_channels directly via 'UNet3DConditionModel' object attribute is deprecated. Please access 'in_channels' over 'UNet3DConditionModel's config object instead, e.g. 'unet.config.in_channels'.
num_channels_latents = self.unet.in_channels

Blurred face

Can you add a high-definition model for the face? The face often gets obviously disfigured.
