
magic-research / magic-animate


[CVPR 2024] MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model

Home Page: https://showlab.github.io/magicanimate/

License: BSD 3-Clause "New" or "Revised" License

Python 99.95% Shell 0.05%

magic-animate's People

Contributors

zcxu-eric


magic-animate's Issues

M1 Pro encounters a problem loading the models

I tried to follow the instructions to run this project, but it failed with this error:
"huggingface_hub.utils._validators.HFValidationError: Repo id must be in the form 'repo_name' or 'namespace/repo_name': 'magicanimate/pretrained_models/stable-diffusion-v1-5'. Use repo_type argument if needed."

I have already downloaded the entire stable-diffusion-v1-5 repository from Hugging Face to a local folder, but it still does not work on my M1 Pro laptop.
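
As a quick check, here is a minimal sketch of how this error usually arises, assuming the default layout from animation.yaml: transformers and diffusers treat the string as a local path only if that directory actually exists relative to the current working directory; otherwise they validate it as a Hub repo id and raise the HFValidationError above.

import os
from transformers import CLIPTextModel

# Assumed path from animation.yaml; run this from the repository root.
model_dir = "pretrained_models/stable-diffusion-v1-5"
# If this check fails, the loader falls back to treating the string as a Hub repo id.
assert os.path.isdir(os.path.join(model_dir, "text_encoder")), "path not found; check the working directory"
text_encoder = CLIPTextModel.from_pretrained(model_dir, subfolder="text_encoder", local_files_only=True)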

motion sequence video upload error?

After tinkering with detectron2 DensePose, I generated a motion sequence like this:

dancing.mp4

What is the next step if I want to upload the motion sequence video successfully?
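
A possible next step is to re-encode the DensePose result into the shape the Gradio demo expects; this is a minimal sketch assuming the demo accepts a plain .mp4 that it resizes to 512x512 itself (see the read_video helper in the demo script further down this page):

import imageio
import numpy as np
from PIL import Image

# Re-encode the DensePose output as a square 512x512 mp4 before uploading it
# as the motion sequence in the Gradio demo.
reader = imageio.get_reader("dancing.mp4")
frames = [np.array(Image.fromarray(frame).resize((512, 512))) for frame in reader]
imageio.mimwrite("dancing_512.mp4", frames, fps=25)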

How did the demo videos achieve facial movements when DensePose does not contain facial information?

In this demo, we can see the girl moving her mouth "lip syncing".

However, since DensePose does not contain any facial information (it's just blobs), and the initial image only contains one reference of the face, how is it extrapolating lip-sync movements?

From my personal experiments, it seems very challenging to maintain facial coherence, especially during dynamic movements.

I'd love to learn more about how those demo videos were achieved.

Please fix dependencies for Windows

Remove nvidia-cudnn-cu11==8.5.0.96 from requirements.txt because it is Linux-only (install cuDNN manually on Windows) [solved]

Remove nvidia-nccl-cu11==2.14.3 from requirements.txt because it is Linux-only (install the CUDA SDK manually on Windows) [solved]

Remove triton==2.0.0 from requirements.txt because it is Linux-only (I don't know how to fix this one)
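
A minimal sketch (a hypothetical helper, not part of the repo) that writes a Windows-friendly copy of requirements.txt with those Linux-only pins stripped, instead of deleting them by hand:

# Drop the Linux-only NVIDIA/triton pins, then run: pip install -r requirements_windows.txt
linux_only = ("nvidia-cudnn-cu11", "nvidia-nccl-cu11", "triton")

with open("requirements.txt") as src, open("requirements_windows.txt", "w") as dst:
    for line in src:
        if not line.strip().startswith(linux_only):
            dst.write(line)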

MagicAnimate Online Demo

Thanks for the amazing project ❤️
I made an online demo for more people to enjoy this awesome work 🎉

By the way, I made some changes and it now supports arbitrary aspect ratios:
image

Again, really amazing idea and great work 👍

How to use save_individual_videos

I only want to save the final video; I have changed animation.yaml:
pretrained_model_path: "pretrained_models/stable-diffusion-v1-5"
pretrained_vae_path: "pretrained_models/sd-vae-ft-mse"
pretrained_controlnet_path: "pretrained_models/MagicAnimate/densepose_controlnet"
pretrained_appearance_encoder_path: "pretrained_models/MagicAnimate/appearance_encoder"
pretrained_unet_path: ""

motion_module: "pretrained_models/MagicAnimate/temporal_attention/temporal_attention.ckpt"

savename: null

fusion_blocks: "midup"

seed: [1]
steps: 25
guidance_scale: 7.5

source_image:
  - "inputs/applications/source_image/monalisa.png"
  - "inputs/applications/source_image/0002.png"
  - "inputs/applications/source_image/demo4.png"
  - "inputs/applications/source_image/dalle2.jpeg"
  - "inputs/applications/source_image/dalle8.jpeg"
  - "inputs/applications/source_image/multi1_source.png"
video_path:
  - "inputs/applications/driving/densepose/running.mp4"
  - "inputs/applications/driving/densepose/demo4.mp4"
  - "inputs/applications/driving/densepose/demo4.mp4"
  - "inputs/applications/driving/densepose/running2.mp4"
  - "inputs/applications/driving/densepose/dancing2.mp4"
  - "inputs/applications/driving/densepose/multi_dancing.mp4"

inference_config: "configs/inference/inference.yaml"
size: 512
L: 16
S: 1
I: 0
clip: 0
offset: 0
max_length: null
video_type: "condition"
invert_video: false
save_individual_videos: true

but save_individual_videos does not work.
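
A minimal workaround sketch, assuming the saved result is the usual side-by-side grid (reference image, DensePose sequence, and animation) at size 512; whether the save_individual_videos flag is honored depends on the pipeline code, which is not shown here. Cropping the right-most panel recovers just the final animation:

import imageio

size = 512
reader = imageio.get_reader("samples/grid.mp4")  # hypothetical path to the saved grid video
# Keep only the right-most `size` columns of every frame, i.e. the animation panel;
# adjust the slice if the grid has a few pixels of border padding.
frames = [frame[:, -size:, :] for frame in reader]
imageio.mimwrite("samples/final_only.mp4", frames, fps=25)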

How can I improve the video?

How do I adjust the sampling steps and guidance scale to improve the result?

Here's what I got, but it looks a bit weird:

ezgif com-video-to-gif

Apple M2: No module named 'magicanimate'

requirements_mac.txt
I modified requirements.txt to make it suitable for my Mac.
While executing ./scripts/animate.sh, I encountered an error:

No module named 'magicanimate'

So I tried pip3 install magicanimate and got the following:

ERROR: Could not find a version that satisfies the requirement magicanimate (from versions: none)
ERROR: No matching distribution found for magicanimate
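
For reference, a minimal sketch of the usual cause, assuming the repository layout shown elsewhere on this page: magicanimate is the in-repo package, not a PyPI distribution, so it has to be importable from the checkout itself, either by running python3 -m demo.gradio_animate from the repo root or by putting the repo root on sys.path:

import sys

# Hypothetical local checkout path; adjust to wherever magic-animate was cloned.
sys.path.insert(0, "/Users/me/magic-animate")
import magicanimate  # resolves to the in-repo package once the repo root is on the path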

Custom input data error

Hey, thanks for the release; it's amazing to have access to this for free! Quick question though: I'm getting this error when using my own input image (it works fine with the example data). Any idea?

torch/nn/modules/conv.py", line 459, in _conv_forward
    return F.conv2d(input, weight, bias, self.stride,
RuntimeError: Given groups=1, weight of size [128, 3, 3, 3], expected input[1, 4, 512, 512] to have 3 channels, but got 4 channels instead

Thanks!
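
A minimal sketch of a likely fix, assuming the extra channel comes from an RGBA (or palette-with-alpha) PNG: the first convolution expects 3-channel input, so converting the reference image to RGB before running the demo avoids the 4-channel mismatch above.

from PIL import Image

img = Image.open("my_reference.png")  # hypothetical path to the custom input image
if img.mode != "RGB":
    img = img.convert("RGB")  # drop the alpha channel that produced the 4th input channel
img.save("my_reference_rgb.png")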

Error

This error doesn't give me more details to share, and I can't run the model even with the samples.
The progress bar starts, but partway through Gradio just shows an error while the progress bar in the terminal keeps going; Gradio still doesn't display anything.
I'm running it via python3 -m demo.gradio_animate

image

Dense pose video for Portrait Video

Great work!
I got it working.
I can create a portrait video for TikTok by changing the code.

2023-12-05T10-35-11.mp4

However, the video looks weird because the DensePose video is square.
How can I create a portrait DensePose video for TikTok?
Thanks in advance.
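
The cleaner route is to run DensePose on a portrait source video; failing that, here is a minimal sketch that letterboxes the square DensePose frames onto a portrait canvas instead of stretching them (black padding matches DensePose's empty background). Note that the demo's read_video helper shown later on this page still forces frames back to a square size, so the demo code would also need adjusting for genuine portrait output.

import imageio
import numpy as np
from PIL import Image

width, height = 512, 896  # hypothetical portrait target, both divisible by 16
reader = imageio.get_reader("dancing.mp4")  # hypothetical square DensePose input
frames = []
for f in reader:
    canvas = Image.new("RGB", (width, height), (0, 0, 0))
    canvas.paste(Image.fromarray(f).resize((width, width)), (0, (height - width) // 2))  # center vertically
    frames.append(np.array(canvas))
imageio.mimwrite("dancing_portrait.mp4", frames, fps=25)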

Installation guide for Windows

Install Ubuntu on WSL2 on Windows 10
https://ubuntu.com/tutorials/install-ubuntu-on-wsl2-on-windows-10#7-enjoy-ubuntu-on-wsl

git lfs install:
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install git-lfs
git lfs install

ffmpeg install:
sudo add-apt-repository ppa:mc3man/trusty-media
sudo apt-get update
sudo apt-get dist-upgrade
sudo apt-get install ffmpeg

git clone https://github.com/magic-research/magic-animate.git
cd magic-animate
pip install nvidia-pyindex
pip3 install -r requirements.txt

Create the folder structure you need:
pretrained_models
git lfs clone https://huggingface.co/zcxu-eric/MagicAnimate

pretrained_models/sd-vae-ft-mse
https://huggingface.co/stabilityai/sd-vae-ft-mse/resolve/main/diffusion_pytorch_model.safetensors
https://huggingface.co/stabilityai/sd-vae-ft-mse/resolve/main/config.json

pretrained_models/stable-diffusion-v1-5
https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/v1-5-pruned-emaonly.safetensors

stable-diffusion-v1-5/tokenizer
All files

stable-diffusion-v1-5/text_encoder
https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/text_encoder/pytorch_model.bin

stable-diffusion-v1-5/unet
https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/unet/config.json
https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/unet/diffusion_pytorch_model.bin

stable-diffusion-v1-5/scheduler
https://huggingface.co/runwayml/stable-diffusion-v1-5/resolve/main/scheduler/scheduler_config.json

Run:
python3 -m demo.gradio_animate
or
bash scripts/animate.sh

Training code

hi guys, awesome work! I'm wondering if you guys have any training code readily available? Much appreciated!

Speeding up inference?

I'm no expert in video models, but surely there is a way to make these models run faster? I'm seeing >300 s of generation time on an 8x3090 node...

e.g., surely the attention here could be replaced with flash-attn?
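
A minimal sketch of one common speed-up, assuming the animation pipeline follows the standard diffusers pattern and that xformers is installed; whether MagicAnimate's pipeline exposes this helper directly has not been verified here.

import torch
from diffusers import DiffusionPipeline

# Stand-in for the MagicAnimate animation pipeline; memory-efficient attention
# typically reduces both latency and VRAM in the UNet attention blocks.
pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipe.enable_xformers_memory_efficient_attention()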

How to make it process 768 pixels?

I am giving a 768 px image and a 768x768 DensePose video but am still getting 512 px output.

I have edited:

inference_config: "configs/inference/inference.yaml"
size: 768
L: 16
S: 1
I: 0
clip: 0
offset: 0
max_length: null
video_type: "condition"
invert_video: false
save_individual_videos: true

save_individual_videos is also not working.

5_sec_768_dense.mp4

768

2023-12-05T23-26-23.mp4

requirements.txt

Multiple NVIDIA packages in requirements.txt are not available:
"ERROR: Could not find a version that satisfies the requirement nvidia-nccl-cu11==2.14.3 (from versions: 0.0.1.dev5)
ERROR: No matching distribution found for nvidia-nccl-cu11==2.14.3"
"ERROR: Ignored the following yanked versions: 8.9.4.19
ERROR: Could not find a version that satisfies the requirement nvidia-cudnn-cu11==8.5.0.96 (from versions: 0.0.1.dev5, 8.9.4.25, 8.9.5.29)
ERROR: No matching distribution found for nvidia-cudnn-cu11==8.5.0.96"

Some warnings during inference

I am confused about something during inference, since I am new to AIGC.

I am sure I have changed "pretrained_model_path" in animation.yaml to my model path, and I have set clip_sample=false in the config.json file, but the terminal still prints output like this:
has not set the configuration clip_sample. clip_sample should be set to False in the configuration file. Please make sure to update the config accordingly as not setting clip_sample in the config might lead to incorrect results in future versions. If you have downloaded this checkpoint from the Hugging Face Hub, it would be very nice if you could open a Pull request for the scheduler/scheduler_config.json file
image

The same happens for steps_offset=1.

Additionally, I don't know what this warning is about, since my images have already been resized to (512, 512):
IMAGEIO FFMPEG_WRITER WARNING: input image is not divisible by macro_block_size=16, resizing from (1544, 516) to (1552, 528) to ensure video compatibility with most codecs and players. To prevent resizing, make your input image divisible by the macro_block_size or set the macro_block_size to 1 (risking incompatibility).
image

I don't know whether the warnings above influence the results of video generation, or how to get rid of them. Sorry to bother you; I am wondering whether you could help me.
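
The clip_sample and steps_offset messages are deprecation notices from the diffusers scheduler loader and typically come from the checkpoint's scheduler_config.json rather than your edits. For the imageio warning, the (1544, 516) size is the concatenated result grid rather than your 512x512 inputs, and the ffmpeg writer pads dimensions to multiples of macro_block_size=16 by default. A minimal sketch, assuming the videos are written with imageio as the warning text indicates; macro_block_size=1 keeps the original size at the cost of somewhat reduced codec compatibility:

import imageio
import numpy as np

# Placeholder frames with the grid size from the warning; real code would use the
# frames produced by the pipeline.
frames = [np.zeros((516, 1544, 3), dtype=np.uint8) for _ in range(8)]
imageio.mimwrite("grid.mp4", frames, fps=25, macro_block_size=1)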

How to fix GPU usage on an RTX 3080 10 GB

image

 44%|████████████████████████████████████                                              | 11/25 [02:48<03:34, 15.30s/it]
Traceback (most recent call last):
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\gradio\routes.py", line 488, in run_predict
    output = await app.get_blocks().process_api(
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\gradio\blocks.py", line 1431, in process_api
    result = await self.call_function(
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\gradio\blocks.py", line 1103, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\anyio\to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\anyio\_backends\_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\anyio\_backends\_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\gradio\utils.py", line 707, in wrapper
    response = f(*args, **kwargs)
  File "V:\_ANIMATION\MAGIC_ANIMATE\magic-animate-for-windows\demo\gradio_animate_gpu_1.py", line 22, in animate
    return animator(reference_image, motion_sequence_state, seed, steps, guidance_scale)
  File "V:\_ANIMATION\MAGIC_ANIMATE\magic-animate-for-windows\demo\animate_gpu_1.py", line 164, in __call__
    sample = self.pipeline(
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\torch\utils\_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "V:\_ANIMATION\MAGIC_ANIMATE\magic-animate-for-windows\magicanimate\pipelines\pipeline_animation.py", line 738, in __call__
    pred = self.unet(
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "V:\_ANIMATION\MAGIC_ANIMATE\magic-animate-for-windows\magicanimate\models\unet_controlnet.py", line 462, in forward
    sample = upsample_block(
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "V:\_ANIMATION\MAGIC_ANIMATE\magic-animate-for-windows\magicanimate\models\unet_3d_blocks.py", line 653, in forward
    hidden_states = attn(hidden_states, encoder_hidden_states=encoder_hidden_states).sample
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "V:\_ANIMATION\MAGIC_ANIMATE\magic-animate-for-windows\magicanimate\models\attention.py", line 136, in forward
    hidden_states = block(
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "V:\_ANIMATION\MAGIC_ANIMATE\magic-animate-for-windows\magicanimate\models\mutual_self_attention.py", line 272, in hacked_basic_transformer_inner_forward
    hidden_states = self.ff(self.norm3(hidden_states)) + hidden_states
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\diffusers\models\attention.py", line 307, in forward
    hidden_states = module(hidden_states, scale)
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\torch\nn\modules\module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "C:\Users\NGXCRYPT-2ND\.conda\envs\m_animate_for_win\lib\site-packages\diffusers\models\attention.py", line 356, in forward
    return hidden_states * self.gelu(gate)
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 320.00 MiB (GPU 0; 10.00 GiB total capacity; 8.34 GiB already allocated; 0 bytes free; 9.12 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
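
A minimal sketch of the mitigations the error message itself points at, plus the usual diffusers-style memory savers; whether the MagicAnimate pipeline object exposes the slicing helpers has not been verified here.

import os

# Must be set before the first CUDA allocation (ideally before launching Python);
# this follows the allocator hint in the error message above.
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:128"

# If the pipeline inherits from diffusers' DiffusionPipeline, these trade speed for
# a lower peak VRAM footprint (uncomment if available on the loaded pipeline object):
# pipeline.enable_attention_slicing()
# pipeline.enable_vae_slicing()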

How do you animate an image?

Hi, I'd like to know how to animate an image. I don't really understand how.

It would be to animate an image from my gallery photos for a work project, so if you could reply as soon as possible that would be great. Thank you.

Has anyone installed this on a Windows machine?

The yaml install didn't work for me, so I tried via requirements.txt. I had to install cuDNN and made some progress on the errors I received, but then got stuck trying to install this dependency: pip install nvidia-nccl-cu11. Has anyone else gotten it to run locally? I am on Windows, creating a conda env.

Do you incorporate UV from DensePose?

Thanks for this incredible work!

For the motion representation, do you use the UV coordinates from the densepose representation or only the sequence of semantic part maps? If not, why not include the texture map as inputs for motion conditioning?

Thanks again!

Code availability

Hey! I just want to know if you will release the source code of the project. Great work 🙌

typo

conda env create -f environment.yml fails because the file is not found

-> conda env create -f environment.yaml

Choice of densepose

Hi. Thanks for the great work. What was the reasoning behind the choice of DensePose? I see that the segmentation masks are not always accurate, so it leads to some inconsistencies in human biomechanics. Out of curiosity, would such a diffusion model work if one were to map the body onto a body model like SMPL and go from there?

Can't upload custom Densepose to the gradio demo

Hi, I get an error when I try to upload any DensePose videos to the Gradio demo.

I've tried resizing the video to 512 x 512, using mp4 format, and even downloading the videos attached in the example and re-uploading them.

All of them return the following error. I am using Hugging Face Spaces.

UnboundLocalError: local variable 'size' referenced before assignment
Traceback (most recent call last):
  File "/home/user/.pyenv/versions/3.8.18/lib/python3.8/site-packages/gradio/queueing.py", line 456, in call_prediction
    output = await route_utils.call_process_api(
  File "/home/user/.pyenv/versions/3.8.18/lib/python3.8/site-packages/gradio/route_utils.py", line 232, in call_process_api
    output = await app.get_blocks().process_api(
  File "/home/user/.pyenv/versions/3.8.18/lib/python3.8/site-packages/gradio/blocks.py", line 1522, in process_api
    result = await self.call_function(
  File "/home/user/.pyenv/versions/3.8.18/lib/python3.8/site-packages/gradio/blocks.py", line 1144, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/home/user/.pyenv/versions/3.8.18/lib/python3.8/site-packages/anyio/to_thread.py", line 33, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/home/user/.pyenv/versions/3.8.18/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
    return await future
  File "/home/user/.pyenv/versions/3.8.18/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 807, in run
    result = context.run(func, *args)
  File "/home/user/.pyenv/versions/3.8.18/lib/python3.8/site-packages/gradio/utils.py", line 673, in wrapper
    response = f(*args, **kwargs)
  File "app.py", line 62, in read_video
    size = int(size)
UnboundLocalError: local variable 'size' referenced before assignment

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/user/.pyenv/versions/3.8.18/lib/python3.8/site-packages/gradio/queueing.py", line 501, in process_events
    response = await self.call_prediction(awake_events, batch)
  File "/home/user/.pyenv/versions/3.8.18/lib/python3.8/site-packages/gradio/queueing.py", line 465, in call_prediction
    raise Exception(str(error) if show_error else None) from error
Exception: None
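
For context, the local demo script reproduced later on this page takes size as a defaulted parameter, which cannot trigger this UnboundLocalError. A defensive sketch of read_video along those lines, assuming the Spaces app.py only binds size inside a conditional branch:

import os
import imageio
import numpy as np
from PIL import Image

def read_video(video, size=512):
    size = int(size)  # size is always bound here, so int(size) cannot raise UnboundLocalError
    reader = imageio.get_reader(video)
    frames = [np.array(Image.fromarray(img).resize((size, size))) for img in reader]
    save_path = "demo/tmp/input_motion_sequence.mp4"  # hypothetical output location
    os.makedirs(os.path.dirname(save_path), exist_ok=True)
    imageio.mimwrite(save_path, frames, fps=25)
    return save_path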

I created an open source 1 click installer

Hi, I got this to work on my Windows NVIDIA RTX A4500 and wrote a one-click installer. Hopefully Mac support is coming soon, in which case the same installer can be used to install on Macs as well.

You can learn more here: https://x.com/cocktailpeanut/status/1732052908227588263?s=20

Basically, I'm working on a desktop app that runs any kind of command script with the click of a button. If you have any trouble installing, please reach out on my Discord. Happy to help. Hope this is helpful.

I made an auto installer. However, the Gradio app uses more than 24 GB of VRAM, and animate_dist gives the error below

I made an auto installer script, and it works on Windows 10 with Python 3.10.11.

I am trying gradio_animate.py and it is using well over 24 GB of VRAM.

When your pre-shared motion sequences are used, it works and uses less than 10 GB of VRAM.

When I upload a raw video, it gives an out-of-VRAM error even with over 24 GB of VRAM.

2023-12-05T04-02-57.mp4

When I try the video below, it gives an out-of-VRAM error on my RTX 3090 machine no matter what.

ex2.mp4

So I tried to run gradio_animate_dist.py, but it gives the error below no matter what. I even fixed path errors.

To create a public link, set `share=True` in `launch()`.
Traceback (most recent call last):
  File "G:\magic_animate\magic-animate\venv\lib\site-packages\gradio\routes.py", line 488, in run_predict
    output = await app.get_blocks().process_api(
  File "G:\magic_animate\magic-animate\venv\lib\site-packages\gradio\blocks.py", line 1434, in process_api
    data = self.postprocess_data(fn_index, result["prediction"], state)
  File "G:\magic_animate\magic-animate\venv\lib\site-packages\gradio\blocks.py", line 1335, in postprocess_data
    prediction_value = block.postprocess(prediction_value)
  File "G:\magic_animate\magic-animate\venv\lib\site-packages\gradio\components\video.py", line 281, in postprocess
    processed_files = (self._format_video(y), None)
  File "G:\magic_animate\magic-animate\venv\lib\site-packages\gradio\components\video.py", line 355, in _format_video
    video = self.make_temp_copy_if_needed(video)
  File "G:\magic_animate\magic-animate\venv\lib\site-packages\gradio\components\base.py", line 226, in make_temp_copy_if_needed
    temp_dir = self.hash_file(file_path)
  File "G:\magic_animate\magic-animate\venv\lib\site-packages\gradio\components\base.py", line 190, in hash_file
    with open(file_path, "rb") as f:
FileNotFoundError: [Errno 2] No such file or directory: 'G:\\magic_animate\\magic-animate\\demo\\demo\\outputs\\2023-12-05T04-34-44.mp4'
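
One thing that may hide the real failure, sketched here under the assumption that animate() shells out to demo.animate_dist with stdout/stderr piped as in the script below: if that subprocess exits non-zero, no mp4 is ever written and Gradio later raises the FileNotFoundError above. Checking the return code and printing stderr surfaces the underlying error (the helper name is hypothetical):

from subprocess import PIPE, run

def run_checked(command: str) -> None:
    # Run the animate_dist command and fail loudly instead of silently swallowing stderr.
    result = run(command, stdout=PIPE, stderr=PIPE, universal_newlines=True, shell=True)
    if result.returncode != 0:
        print(result.stderr)
        raise RuntimeError(f"command failed with exit code {result.returncode}")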

Here is the fixed gradio_animate_dist.py:

import argparse
import imageio
import os, datetime
import numpy as np
import gradio as gr
from PIL import Image
from subprocess import PIPE, run

base_dir = os.path.dirname(os.path.abspath(__file__))
demo_dir = os.path.join(base_dir, "demo")
tmp_dir = os.path.join(demo_dir, "tmp")
outputs_dir = os.path.join(demo_dir, "outputs")

os.makedirs(tmp_dir, exist_ok=True)
os.makedirs(outputs_dir, exist_ok=True)

def animate(reference_image, motion_sequence, seed, steps, guidance_scale):
    time_str = datetime.datetime.now().strftime("%Y-%m-%dT%H-%M-%S")
    animation_path = os.path.join(outputs_dir, f"{time_str}.mp4")
    save_path = os.path.join(tmp_dir, "input_reference_image.png")
    Image.fromarray(reference_image).save(save_path)
    command = f"python -m demo.animate_dist --reference_image {save_path} --motion_sequence {motion_sequence} --random_seed {seed} --step {steps} --guidance_scale {guidance_scale} --save_path {animation_path}"
    run(command, stdout=PIPE, stderr=PIPE, universal_newlines=True, shell=True)
    return animation_path

with gr.Blocks() as demo:

    gr.HTML(
        """
        <div style="display: flex; justify-content: center; align-items: center; text-align: center;">
        <a href="https://github.com/magic-research/magic-animate" style="margin-right: 20px; text-decoration: none; display: flex; align-items: center;">
        </a>
        <div>
            <h1 >MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model</h1>
            <h5 style="margin: 0;">If you like our project, please give us a star ✨ on Github for the latest update.</h5>
            <div style="display: flex; justify-content: center; align-items: center; text-align: center;">
                <a href="https://arxiv.org/abs/2311.16498"><img src="https://img.shields.io/badge/Arxiv-2311.16498-red"></a>
                <a href='https://showlab.github.io/magicanimate'><img src='https://img.shields.io/badge/Project_Page-MagicAnimate-green' alt='Project Page'></a>
                <a href='https://github.com/magic-research/magic-animate'><img src='https://img.shields.io/badge/Github-Code-blue'></a>
            </div>
        </div>
        </div>
        """
    )
    animation = gr.Video(format="mp4", label="Animation Results", autoplay=True)
    
    with gr.Row():
        reference_image  = gr.Image(label="Reference Image")
        motion_sequence  = gr.Video(format="mp4", label="Motion Sequence")
        
        with gr.Column():
            random_seed         = gr.Textbox(label="Random seed", value=1, info="default: -1")
            sampling_steps      = gr.Textbox(label="Sampling steps", value=25, info="default: 25")
            guidance_scale      = gr.Textbox(label="Guidance scale", value=7.5, info="default: 7.5")
            submit              = gr.Button("Animate")

    def read_video(video, size=512):
        size = int(size)
        reader = imageio.get_reader(video)
        frames = []
        for img in reader:
            frames.append(np.array(Image.fromarray(img).resize((size, size))))
        
        save_path = os.path.join(tmp_dir, "input_motion_sequence.mp4")
        imageio.mimwrite(save_path, frames, fps=25)
        return save_path
    
    def read_image(image, size=512):
        img = np.array(Image.fromarray(image).resize((size, size)))
        return img
        
    # when user uploads a new video
    motion_sequence.upload(
        read_video,
        motion_sequence,
        motion_sequence
    )
    # when `first_frame` is updated
    reference_image.upload(
        read_image,
        reference_image,
        reference_image
    )
    # when the `submit` button is clicked
    submit.click(
        animate,
        [reference_image, motion_sequence, random_seed, sampling_steps, guidance_scale], 
        animation
    )

    # Examples
    gr.Markdown("## Examples")
    gr.Examples(
        examples=[
            ["inputs/applications/source_image/monalisa.png", "inputs/applications/driving/densepose/running.mp4"], 
        ],
        inputs=[reference_image, motion_sequence],
        outputs=animation,
    )

demo.launch(share=False,inbrowser=True)

Improving facial likeness

First, congratulations on this project and thank you for your excellent work!

In my sample generations (which I have attached) you can see that the source face is not preserved very well, i.e. the person in the video does not look like the person in the source image.

I'm wondering, are there any settings I can change to improve this, or any other tips on the best source image to use to preserve facial features?

Thanks again!

bart2_demo4.mp4
grid.mp4
bart3_demo4.mp4
grid.mp4

Process is "Killed" after finishing

It worked once, but now it doesn't. I have 16 GB of VRAM.

100%|████████████████████████████████████████████████████████████████████████████████| 25/25 [08:31<00:00, 20.46s/it]100%|████████████████████████████████████████████████████████████████████████████████| 112/112 [00:26<00:00, 4.16it/s]Killed

A popup on the top right:
Error
Unexpected token '<', " <h"... is not valid JSON

Use OpenPose instead of DensePose

I noticed an inconsistency when using DensePose, especially on the hands. Given this, I'm curious about the feasibility of utilizing OpenPose as an alternative to DensePose in my workflow. Have you considered any potential advantages or challenges of such a switch?

OSError: Error no file named pytorch_model.bin, tf_model.h5, model.ckpt.index or flax_model.msgpack found in directory pretrained_models/stable-diffusion-v1-5.

(manimate) ryan@DESKTOP-81PKUNM:/mnt/d/python/magic-animate-main$ bash scripts/animate.sh
Traceback (most recent call last):
  File "/home/ryan/miniconda3/envs/manimate/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/home/ryan/miniconda3/envs/manimate/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/mnt/d/python/magic-animate-main/magicanimate/pipelines/animation.py", line 282, in <module>
    run(args)
  File "/mnt/d/python/magic-animate-main/magicanimate/pipelines/animation.py", line 271, in run
    main(args)
  File "/mnt/d/python/magic-animate-main/magicanimate/pipelines/animation.py", line 76, in main
    text_encoder = CLIPTextModel.from_pretrained(config.pretrained_model_path, subfolder="text_encoder")
  File "/home/ryan/miniconda3/envs/manimate/lib/python3.8/site-packages/transformers/modeling_utils.py", line 2805, in from_pretrained
    state_dict = load_state_dict(resolved_archive_file)
  File "/home/ryan/miniconda3/envs/manimate/lib/python3.8/site-packages/transformers/modeling_utils.py", line 458, in load_state_dict
    with safe_open(checkpoint_file, framework="pt") as f:
safetensors_rust.SafetensorError: Error while deserializing header: HeaderTooLarge

I downloaded the models from the updated links from #2
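
A minimal diagnostic sketch, assuming the common cause of HeaderTooLarge: the downloaded weight file is a Git LFS pointer stub (or an HTML error page) rather than the real checkpoint, so safetensors cannot parse its header. The path below is hypothetical; check whichever file the traceback points at.

import os

path = "pretrained_models/stable-diffusion-v1-5/text_encoder/pytorch_model.bin"  # hypothetical file to check
size_mb = os.path.getsize(path) / 1e6
with open(path, "rb") as f:
    head = f.read(64)
if head.startswith(b"version https://git-lfs") or size_mb < 1:
    print("This is an LFS pointer/stub, not real weights; re-download it (e.g. git lfs pull).")
else:
    print(f"File looks like real weights ({size_mb:.1f} MB).")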

LCM sampler

2023-12-05T23-09-44.mp4

This is 8 steps, guidance 4, with an LCM-trained checkpoint. It looks almost like the usual generation with 25 steps. It should work a bit better if we use the LCM sampler, I guess. So how do we change the sampler?
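
A minimal sketch of swapping in diffusers' LCM scheduler, assuming the animation pipeline follows the usual pattern of holding a scheduler attribute and that the installed diffusers version ships LCMScheduler; MagicAnimate builds its own pipeline, so treat this as a starting point rather than a drop-in change.

from diffusers import DiffusionPipeline, LCMScheduler

# Stand-in for the loaded animation pipeline; swap its scheduler for the LCM one and
# then sample with the low step count and guidance used above (e.g. 8 steps, scale 4).
pipe = DiffusionPipeline.from_pretrained("pretrained_models/stable-diffusion-v1-5")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)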

Too much resource usage

Hi, thanks for this amazing project. I have issues with generation: even on a RunPod cloud 80 GB GPU, I still can't get over 25 sampling steps. It gives this error in Gradio: Error
Unexpected token '<', " <!DOCTYPE "... is not valid JSON

This is the terminal error:
workspace/magic-animate/magicanimate/pipelines/pipeline_animation.py:624: FutureWarning: Accessing config attribute in_channels directly via 'UNet3DConditionModel' object attribute is deprecated. Please access 'in_channels' over 'UNet3DConditionModel's config object instead, e.g. 'unet.config.in_channels'.
num_channels_latents = self.unet.in_channels

Blurred face

Can you add a high-definition model for the face? The face often gets obviously disfigured.
