
opentalker / SadTalker


[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation

Home Page: https://sadtalker.github.io/

License: Other

Python 97.13% Shell 1.48% Jupyter Notebook 1.34% Batchfile 0.05%
audio-driven-talking-face cvpr2023 deep-fake deep-fakes image-animation talking-face talking-face-generation talking-head talking-heads

sadtalker's People

Contributors

andchir, bhavybansal24, chenxwh, drv-agwl, eltociear, fakerybakery, johndpope, johnephillips, kelvinf97, monk-after-90s, ribasoka0, teamclouday, thegenerativegeneration, vantang, vinthony, winfredy, zqq-judy



sadtalker's Issues

Colab notebook running issue

There is a problem with the Colab notebook:

No module named gfpgan
How can I clone/install gfpgan inside the notebook? Could you please look into it?

no such directory examples/results
I have created the directory as required,
but it still shows the same error.

pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1

I am generating a 2D face from a single image; the audio file size is 11499622 bytes.

Python 3.8.16

pip list:

Package            Version
------------------ ------------
audioread          3.0.0
blurhash           1.1.4
boost              0.1
certifi            2022.12.7
charset-normalizer 2.1.1
cmake              3.26.0
decorator          5.1.1
dlib               19.24.0
face-alignment     1.3.5
ffmpy              0.3.0
greenlet           2.0.2
idna               3.4
imageio            2.19.3
imageio-ffmpeg     0.4.7
joblib             1.1.0
kornia             0.6.8
librosa            0.6.0
llvmlite           0.31.0
Mastodon.py        1.8.0
networkx           3.0
numba              0.48.0
numpy              1.23.4
opencv-python      4.7.0.72
packaging          23.0
Pillow             9.3.0
pip                23.0.1
pydub              0.25.1
python-dateutil    2.8.2
python-magic       0.4.27
PyWavelets         1.4.1
PyYAML             6.0
requests           2.28.1
resampy            0.3.1
scikit-image       0.19.3
scikit-learn       1.1.3
scipy              1.5.3
setuptools         65.6.3
six                1.16.0
SQLAlchemy         2.0.6
threadpoolctl      3.1.0
tifffile           2023.3.15
torch              1.12.1+cu113
torchaudio         0.12.1+cu113
torchvision        0.13.1+cu113
tqdm               4.65.0
typing_extensions  4.4.0
urllib3            1.26.13
wheel              0.38.4
yacs               0.1.8

errors:

Traceback (most recent call last):
  File "inference.py", line 98, in <module>
    main(args)
  File "inference.py", line 75, in main
    animate_from_coeff.generate(data, save_dir)
  File "/home/ubuntu/SadTalker/facerender/animate.py", line 154, in generate
    sound = AudioSegment.from_mp3(audio_path)
  File "/home/ubuntu/anaconda3/envs/sadtalker/lib/python3.8/site-packages/pydub/audio_segment.py", line 796, in from_mp3
    return cls.from_file(file, 'mp3', parameters=parameters)
  File "/home/ubuntu/anaconda3/envs/sadtalker/lib/python3.8/site-packages/pydub/audio_segment.py", line 773, in from_file
    raise CouldntDecodeError(
pydub.exceptions.CouldntDecodeError: Decoding failed. ffmpeg returned error code: 1

Output from ffmpeg/avlib:

ffmpeg version 4.2.2 Copyright (c) 2000-2019 the FFmpeg developers
  built with gcc 7.3.0 (crosstool-NG 1.23.0.449-a04d0)
  configuration: --prefix=/home/ubuntu/anaconda3/envs/sadtalker --cc=/tmp/build/80754af9/ffmpeg_1587154242452/_build_env/bin/x86_64-conda_cos6-linux-gnu-cc --disable-doc --enable-avresample --enable-gmp --enable-hardcoded-tables --enable-libfreetype --enable-libvpx --enable-pthreads --enable-libopus --enable-postproc --enable-pic --enable-pthreads --enable-shared --enable-static --enable-version3 --enable-zlib --enable-libmp3lame --disable-nonfree --enable-gpl --enable-gnutls --disable-openssl --enable-libopenh264 --enable-libx264
  libavutil      56. 31.100 / 56. 31.100
  libavcodec     58. 54.100 / 58. 54.100
  libavformat    58. 29.100 / 58. 29.100
  libavdevice    58.  8.100 / 58.  8.100
  libavfilter     7. 57.100 /  7. 57.100
  libavresample   4.  0.  0 /  4.  0.  0
  libswscale      5.  5.100 /  5.  5.100
  libswresample   3.  5.100 /  3.  5.100
  libpostproc    55.  5.100 / 55.  5.100
[mp3float @ 0x55dce8ccfc80] Header missing
    Last message repeated 162 times
[mp3 @ 0x55dce8cc5fc0] decoding for stream 0 failed
[mp3 @ 0x55dce8cc5fc0] Could not find codec parameters for stream 0 (Audio: mp3 (mp3float), 0 channels, fltp): unspecified frame size
Consider increasing the value for the 'analyzeduration' and 'probesize' options
Input #0, mp3, from './face-zxc.wav':
  Duration: N/A, start: 0.000000, bitrate: N/A
    Stream #0:0: Audio: mp3, 0 channels, fltp
Stream mapping:
  Stream #0:0 -> #0:0 (mp3 (mp3float) -> pcm_s32le (native))

How can I solve this problem?
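
A possible workaround, not an official fix: the traceback shows animate.py decoding the driven audio with AudioSegment.from_mp3(), while './face-zxc.wav' apparently does not contain MP3 data, which is what produces the repeated "Header missing" errors. Re-exporting the audio as a well-formed MP3 with pydub before running inference may get past this (if from_file also fails, the source file itself is likely corrupt):

from pydub import AudioSegment

# Let pydub/ffmpeg probe the real container instead of trusting the extension.
audio = AudioSegment.from_file("./face-zxc.wav")
# Re-export as a clean MP3, since this code path decodes with from_mp3().
audio.export("./face-zxc.mp3", format="mp3")

Then pass the re-exported file to --driven_audio.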

About training code

Great job! We want to train the network on our own datasets. Could you please tell us when you will release the training code for Audio2exp or Audio2Pose?

TypeError: zoom() got an unexpected keyword argument 'grid_mode'

python inference.py --driven_audio japanese.wav --source_image art_1.png --result_dir .
checkpoints/epoch_20.pth
checkpoints/auido2pose_00140-model.pth
checkpoints/auido2exp_00300-model.pth
checkpoints/facevid2vid_00189-model.pth.tar
checkpoints/mapping_00229-model.pth.tar
landmark Det:: 100%|██████████████████████████████| 1/1 [00:04<00:00,  4.67s/it]
 3DMM Extraction In Video:: 100%|█████████████████| 1/1 [00:00<00:00,  6.51it/s]
Traceback (most recent call last):
  File "inference.py", line 98, in <module>
    main(args)
  File "inference.py", line 73, in main
    data = get_facerender_data(coeff_path, crop_pic_path, first_coeff_path, audio_path, 
  File "/media/user/home/04_MachineLearning/10_talkingheads/SadTalker/generate_facerender_batch.py", line 20, in get_facerender_data
    source_image = transform.resize(source_image, (256, 256, 3))
  File "/media/user/home/04_MachineLearning/10_talkingheads/SadTalker/.venv/lib/python3.8/site-packages/skimage/transform/_warps.py", line 186, in resize
    out = ndi.zoom(image, zoom_factors, order=order, mode=ndi_mode,
TypeError: zoom() got an unexpected keyword argument 'grid_mode'
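
A likely cause, offered as an assumption rather than a confirmed answer: the grid_mode keyword was added to scipy.ndimage.zoom in SciPy 1.6, and the skimage.transform.resize call in the traceback passes it through, so an older SciPy raises exactly this TypeError. A quick check of the environment:

import scipy

# skimage.transform.resize forwards grid_mode to scipy.ndimage.zoom,
# which only accepts that keyword from SciPy 1.6 onwards.
print(scipy.__version__)  # anything below 1.6 needs an upgrade, e.g. pip install -U "scipy>=1.6"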

How can I increase the output resolution by tuning parameters?

How should I adjust the parameters to increase the resolution of the output video? Apart from using super-resolution, I tried modifying reshape_depth and reshape_channel under generator_params in facerender.yaml, but it raised an error. Is there another way to do this?

About Pose

Thanks for your great work! I found that the generated head keeps moving around even when I set every frame's pose to be the same as the first frame's pose. Is it possible to remove these head movements?

Keep getting this error

Traceback (most recent call last):
  File "inference.py", line 9, in <module>
    from generate_batch import get_data
  File "C:\Users\chlyw\Desktop\SadTalker\generate_batch.py", line 5, in <module>
    import librosa
  File "C:\Users\chlyw\.conda\envs\sadtalker\lib\site-packages\librosa\__init__.py", line 12, in <module>
    from . import core
  File "C:\Users\chlyw\.conda\envs\sadtalker\lib\site-packages\librosa\core\__init__.py", line 102, in <module>
    from .time_frequency import *  # pylint: disable=wildcard-import
  File "C:\Users\chlyw\.conda\envs\sadtalker\lib\site-packages\librosa\core\time_frequency.py", line 10, in <module>
    from ..util.exceptions import ParameterError
  File "C:\Users\chlyw\.conda\envs\sadtalker\lib\site-packages\librosa\util\__init__.py", line 67, in <module>
    from .utils import *  # pylint: disable=wildcard-import
  File "C:\Users\chlyw\.conda\envs\sadtalker\lib\site-packages\librosa\util\utils.py", line 111, in <module>
    def valid_audio(y, mono=True):
  File "C:\Users\chlyw\.conda\envs\sadtalker\lib\site-packages\librosa\cache.py", line 49, in wrapper
    if self.cachedir is not None and self.level >= level:
AttributeError: 'CacheManager' object has no attribute 'cachedir'

It might be worth considering integrating CodeFormer

On Windows 10, running with --enhancer gfpgan reports an error. What could be the problem?

landmark Det:: 100%|█████████████████████████████████████████████████████████████████████| 1/1 [00:13<00:00, 13.25s/it]
3DMM Extraction In Video:: 100%|████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 1.19it/s]
Face Renderer:: 100%|██████████████████████████████████████████████████████████████████| 68/68 [00:14<00:00, 4.80it/s]
Traceback (most recent call last):
  File "inference.py", line 109, in <module>
    main(args)
  File "inference.py", line 78, in main
    animate_from_coeff.generate(data, save_dir, enhancer=args.enhancer)
  File "E:\ai\SadTalker\facerender\animate.py", line 149, in generate
    enhanced_images = face_enhancer(result, method=enhancer)
  File "E:\ai\SadTalker\utils\face_enhancer.py", line 32, in enhancer
    restorer = GFPGANer(
  File "C:\ProgramData\Anaconda3\envs\sadtalker\lib\site-packages\gfpgan\utils.py", line 79, in __init__
    self.face_helper = FaceRestoreHelper(
  File "C:\ProgramData\Anaconda3\envs\sadtalker\lib\site-packages\facexlib\utils\face_restoration_helper.py", line 103, in __init__
    self.face_parse = init_parsing_model(model_name='parsenet', device=self.device, model_rootpath=model_rootpath)
  File "C:\ProgramData\Anaconda3\envs\sadtalker\lib\site-packages\facexlib\parsing\__init__.py", line 20, in init_parsing_model
    load_net = torch.load(model_path, map_location=lambda storage, loc: storage)
  File "C:\ProgramData\Anaconda3\envs\sadtalker\lib\site-packages\torch\serialization.py", line 713, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "C:\ProgramData\Anaconda3\envs\sadtalker\lib\site-packages\torch\serialization.py", line 938, in _legacy_load
    typed_storage._storage._set_from_file(
RuntimeError: unexpected EOF, expected 381073 more bytes. The file might be corrupted.
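
A hedged suggestion rather than an official fix: "unexpected EOF ... might be corrupted" during torch.load usually means the facexlib parsing model was only partially downloaded. Deleting the truncated file so it is fetched again on the next run often clears this. The path below is only an illustration; use whatever model_path resolves to in your traceback:

import os
import torch

# Hypothetical location for illustration; substitute the file your traceback points at.
model_path = r"C:\ProgramData\Anaconda3\envs\sadtalker\Lib\site-packages\facexlib\weights\parsing_parsenet.pth"

try:
    torch.load(model_path, map_location="cpu")
    print("checkpoint loads fine")
except Exception:
    os.remove(model_path)  # remove the truncated file so facexlib re-downloads it
    print("removed corrupted checkpoint; re-run inference to download it again")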

error with VideoCapture

I followed the instructions to install and generate a video, but the following error occurred. How can I resolve this issue? The video itself is output and complete.

./checkpoints\epoch_20.pth
./checkpoints\auido2pose_00140-model.pth
./checkpoints\auido2exp_00300-model.pth
./checkpoints\facevid2vid_00189-model.pth.tar
./checkpoints\mapping_00229-model.pth.tar
[ERROR] global cap_ffmpeg_impl.hpp:1223 open Could not find decoder for codec_id=61
[ERROR] global cap_ffmpeg_impl.hpp:1272 open VIDEOIO/FFMPEG: Failed to initialize VideoCapture

pytorch throws mat mismatch when using short audio as input

python inference.py --driven_audio ./speech.wav --source_image face.png --batch_size 6 --result_dir ./examples/results
checkpoints\epoch_20.pth
checkpoints\auido2pose_00140-model.pth
checkpoints\auido2exp_00300-model.pth
checkpoints\facevid2vid_00189-model.pth.tar
checkpoints\mapping_00229-model.pth.tar
landmark Det:: 100%|█████████████████████████████████████████████████████████████████████| 1/1 [00:04<00:00, 4.67s/it]
3DMM Extraction In Video:: 100%|████████████████████████████████████████████████████████| 1/1 [00:04<00:00, 4.86s/it]
Traceback (most recent call last):
  File "inference.py", line 99, in <module>
    main(args)
  File "inference.py", line 71, in main
    coeff_path = audio_to_coeff.generate(batch, save_dir, pose_style)
  File "D:\Workspace\projects\sadtalker\SadTalker\test_audio2coeff.py", line 75, in generate
    results_dict_pose = self.audio2pose_model.test(batch)
  File "D:\Workspace\projects\sadtalker\SadTalker\audio2pose_models\audio2pose.py", line 86, in test
    batch = self.netG.test(batch)
  File "D:\Workspace\projects\sadtalker\SadTalker\audio2pose_models\cvae.py", line 49, in test
    return self.decoder(batch)
  File "D:\Software\Anaconda\anaconda3\envs\sadtalker\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\Workspace\projects\sadtalker\SadTalker\audio2pose_models\cvae.py", line 139, in forward
    x_out = self.MLP(x_in) # bs layer_sizes[-1]
  File "D:\Software\Anaconda\anaconda3\envs\sadtalker\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\Software\Anaconda\anaconda3\envs\sadtalker\lib\site-packages\torch\nn\modules\container.py", line 139, in forward
    input = module(input)
  File "D:\Software\Anaconda\anaconda3\envs\sadtalker\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\Software\Anaconda\anaconda3\envs\sadtalker\lib\site-packages\torch\nn\modules\linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x196 and 262x128)

Below is my audio file:
speech.zip
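
A speculative workaround rather than an official answer: the 1x196 vs 262x128 mismatch suggests the clip is shorter than the audio window the pose network expects, so padding the audio with silence before inference may get past the error. The 2-second threshold below is only a guess:

from pydub import AudioSegment

audio = AudioSegment.from_file("speech.wav")
min_len_ms = 2000                       # assumed minimum length; adjust as needed
if len(audio) < min_len_ms:             # pydub lengths are in milliseconds
    audio += AudioSegment.silent(duration=min_len_ms - len(audio))
audio.export("speech_padded.wav", format="wav")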

cannot unpack non-iterable NoneType object

This is an excellent repo. Thank you for sharing. I am having the following issue with some image files. I tried saving as PNG and JPG, and also re-saving in Photoshop without the ICC profile, with no success. What could be the reason?

examples/source_image/yetinew3.png
./checkpoints/epoch_20.pth
./checkpoints/auido2pose_00140-model.pth
./checkpoints/auido2exp_00300-model.pth
./checkpoints/facevid2vid_00189-model.pth.tar
./checkpoints/mapping_00229-model.pth.tar
libpng warning: iCCP: known incorrect sRGB profile
Traceback (most recent call last):
  File "inference.py", line 132, in <module>
    main(args)
  File "inference.py", line 64, in main
    first_coeff_path, crop_pic_path, original_size =  preprocess_model.generate(pic_path, first_frame_dir, args.preprocess)
  File "/content/SadTalker/src/utils/preprocess.py", line 85, in generate
    x_full_frames, crop, quad = self.croper.crop(x_full_frames, xsize=pic_size)
TypeError: cannot unpack non-iterable NoneType object
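
One way to narrow this down (a sketch, not the repo's own diagnostic): croper.crop() returns None when no facial landmarks are found in the image, and unpacking None is exactly what fails here. The face-alignment package already in the environment can show whether a face is detected at all, even though the repo's cropper may use a different detector internally:

import face_alignment
from skimage import io

# LandmarksType._2D matches face-alignment 1.3.x; newer releases rename it to TWO_D.
fa = face_alignment.FaceAlignment(face_alignment.LandmarksType._2D, device="cpu")
landmarks = fa.get_landmarks(io.imread("examples/source_image/yetinew3.png"))
print("face detected" if landmarks else "no face detected, try a tighter frontal crop")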

Issue with eye movements

Hi, thanks for sharing such great work.

I have found that the eyes in the generated videos usually cannot close fully, which looks slightly unnatural. Is there a way to improve this? Thanks!

Docker GPU configuration problem

docker run -it --gpus all nvidia/cuda:11.4.0-base-ubuntu20.04 nvidia-smi
docker: Error response from daemon: failed to create shim task: OCI runtime create failed: runc create failed: unable to start container process: error during container init: error running hook #0: error running hook: exit status 1, stdout: , stderr: Auto-detected mode as 'legacy'
nvidia-container-cli: initialization error: load library failed: libnvidia-ml.so.1: cannot open shared object file: no such file or directory: unknown.

Different style

Is it possible to adapt this for cartoon/anime faces, or does it only work for human faces?

Real-time talking head

Hi, great work! Do you think this can work in real-time interactive scenarios, such as live/interactive conversations?

Exception in the audio stage when rendering with the CPU version

Full exception traceback:
Traceback (most recent call last):
  File "E:\learn\sadtalker\inference.py", line 141, in <module>
    main(args)
  File "E:\learn\sadtalker\inference.py", line 53, in main
    audio_to_coeff = Audio2Coeff(audio2pose_checkpoint, audio2pose_yaml_path,
  File "E:\learn\sadtalker\src\test_audio2coeff.py", line 35, in __init__
    self.audio2pose_model = Audio2Pose(cfg_pose, wav2lip_checkpoint, device=device)
  File "E:\learn\sadtalker\src\audio2pose_models\audio2pose.py", line 15, in __init__
    self.audio_encoder = AudioEncoder(wav2lip_checkpoint,device=device)
  File "E:\learn\sadtalker\src\audio2pose_models\audio_encoder.py", line 45, in __init__
    wav2lip_state_dict = torch.load(wav2lip_checkpoint)['state_dict']
  File "E:\venv\sadtalker\lib\site-packages\torch\serialization.py", line 713, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "E:\venv\sadtalker\lib\site-packages\torch\serialization.py", line 930, in _legacy_load
    result = unpickler.load()
  File "E:\venv\sadtalker\lib\site-packages\torch\serialization.py", line 876, in persistent_load
    wrap_storage=restore_location(obj, location),
  File "E:\venv\sadtalker\lib\site-packages\torch\serialization.py", line 175, in default_restore_location
    result = fn(storage, location)
  File "E:\venv\sadtalker\lib\site-packages\torch\serialization.py", line 152, in _cuda_deserialize
    device = validate_cuda_device(location)
  File "E:\venv\sadtalker\lib\site-packages\torch\serialization.py", line 136, in validate_cuda_device
    raise RuntimeError('Attempting to deserialize object on a CUDA '
RuntimeError: Attempting to deserialize object on a CUDA device but torch.cuda.is_available() is False. If you are running on a CPU-only machine, please use torch.load with map_location=torch.device('cpu') to map your storages to the CPU.

Fix: 1. Add a device argument to the AudioEncoder constructor and pass it in from the caller.
2. Change torch.load(wav2lip_checkpoint)['state_dict'] to
wav2lip_state_dict = torch.load(wav2lip_checkpoint,map_location=torch.device(device))['state_dict']
By the way, if a # comment ends with a trailing \, PyCharm's debugger cannot step to the next line.
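
For reference, a minimal self-contained sketch of the second change described above (the checkpoint path is hypothetical, for illustration only):

import torch

device = "cuda" if torch.cuda.is_available() else "cpu"
wav2lip_checkpoint = "checkpoints/wav2lip.pth"   # hypothetical path; use your actual checkpoint
wav2lip_state_dict = torch.load(wav2lip_checkpoint, map_location=torch.device(device))["state_dict"]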

I don't know where to put "w/ still mode"

Thanks! Great product, I am running on RTX 3090.

I noticed that you have added a version called "w/ still mode" which supposedly suppresses facial movements, and I want to install it immediately, but I couldn't find that command in the readme.

I don't know whether "w/ still mode" should be added to "--expression_scale" or to the "--enhancer" option, and I don't know how to write it, since there is no explanation of how to do so.
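
A guess at the intended usage, not confirmed by the README at the time of writing: "still mode" appears to be a boolean command-line switch of inference.py rather than a value for --expression_scale or --enhancer. Check the output of python inference.py --help for a flag such as --still and, if it is listed, append it to the normal command, for example:

python inference.py --driven_audio <audio.wav> --source_image <image.png> --still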

'CacheManager' object has no attribute 'cachedir'

Hitting this error when running inference:

(.venv) (base) ➜  SadTalker git:(main) ✗ python inference.py --driver_audio japanese.wav --source_image art_1.png --result_dir .
Traceback (most recent call last):
  File "inference.py", line 9, in <module>
    from generate_batch import get_data
  File "/media/user/home/04_MachineLearning/10_talkingheads/SadTalker/generate_batch.py", line 5, in <module>
    import librosa    
  File "/media/user/home/04_MachineLearning/10_talkingheads/SadTalker/.venv/lib/python3.8/site-packages/librosa/__init__.py", line 12, in <module>
    from . import core
  File "/media/user/home/04_MachineLearning/10_talkingheads/SadTalker/.venv/lib/python3.8/site-packages/librosa/core/__init__.py", line 102, in <module>
    from .time_frequency import *  # pylint: disable=wildcard-import
  File "/media/user/home/04_MachineLearning/10_talkingheads/SadTalker/.venv/lib/python3.8/site-packages/librosa/core/time_frequency.py", line 10, in <module>
    from ..util.exceptions import ParameterError
  File "/media/user/home/04_MachineLearning/10_talkingheads/SadTalker/.venv/lib/python3.8/site-packages/librosa/util/__init__.py", line 67, in <module>
    from .utils import *  # pylint: disable=wildcard-import
  File "/media/user/home/04_MachineLearning/10_talkingheads/SadTalker/.venv/lib/python3.8/site-packages/librosa/util/utils.py", line 111, in <module>
    def valid_audio(y, mono=True):
  File "/media/user/home/04_MachineLearning/10_talkingheads/SadTalker/.venv/lib/python3.8/site-packages/librosa/cache.py", line 49, in wrapper
    if self.cachedir is not None and self.level >= level:
AttributeError: 'CacheManager' object has no attribute 'cachedir'
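
A hedged observation: librosa 0.6.x reads Memory.cachedir, an attribute that newer joblib releases removed, which is what triggers this AttributeError. Either pinning an old joblib (e.g. joblib==0.11, or any release that still exposes cachedir) or moving to a newer librosa that no longer touches it should clear the error; check which versions the repo's requirements file expects before changing them.

import joblib

# librosa 0.6.x wraps functions through a CacheManager that reads Memory.cachedir,
# which newer joblib releases no longer provide; joblib < 0.12 still has it.
print(joblib.__version__)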

GPU usage Settings

Hi, great work!
My computer has a 16 GB GPU, but at most only 6 GB is used. I want it to generate results faster; where is this value set? Looking forward to your reply. Thank you~

ubuntu Building llvmlite requires LLVM 7.0+ Be sure to set LLVM_CONFIG to the right executable path.

Requirement already satisfied: contourpy>=1.0.1 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from matplotlib->filterpy->facexlib==0.2.5->-r requirements3d.txt (line 17)) (1.0.6)
Requirement already satisfied: python-dateutil>=2.7 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from matplotlib->filterpy->facexlib==0.2.5->-r requirements3d.txt (line 17)) (2.8.2)
Requirement already satisfied: cycler>=0.10 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from matplotlib->filterpy->facexlib==0.2.5->-r requirements3d.txt (line 17)) (0.11.0)
Requirement already satisfied: fonttools>=4.22.0 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from matplotlib->filterpy->facexlib==0.2.5->-r requirements3d.txt (line 17)) (4.38.0)
Requirement already satisfied: charset-normalizer<4,>=2 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from requests->basicsr==1.4.2->-r requirements3d.txt (line 16)) (2.0.4)
Requirement already satisfied: idna<4,>=2.5 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from requests->basicsr==1.4.2->-r requirements3d.txt (line 16)) (2.10)
Requirement already satisfied: certifi>=2017.4.17 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from requests->basicsr==1.4.2->-r requirements3d.txt (line 16)) (2022.12.7)
Requirement already satisfied: urllib3<1.27,>=1.21.1 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from requests->basicsr==1.4.2->-r requirements3d.txt (line 16)) (1.26.13)
Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from tb-nightly->basicsr==1.4.2->-r requirements3d.txt (line 16)) (1.8.1)
Requirement already satisfied: markdown>=2.6.8 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from tb-nightly->basicsr==1.4.2->-r requirements3d.txt (line 16)) (3.4.1)
Requirement already satisfied: protobuf<4,>=3.9.2 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from tb-nightly->basicsr==1.4.2->-r requirements3d.txt (line 16)) (3.20.0)
Requirement already satisfied: tensorboard-data-server<0.7.0,>=0.6.0 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from tb-nightly->basicsr==1.4.2->-r requirements3d.txt (line 16)) (0.6.1)
Requirement already satisfied: absl-py>=0.4 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from tb-nightly->basicsr==1.4.2->-r requirements3d.txt (line 16)) (1.3.0)
Requirement already satisfied: google-auth<3,>=1.6.3 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from tb-nightly->basicsr==1.4.2->-r requirements3d.txt (line 16)) (2.14.1)
Requirement already satisfied: werkzeug>=1.0.1 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from tb-nightly->basicsr==1.4.2->-r requirements3d.txt (line 16)) (2.2.2)
Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from tb-nightly->basicsr==1.4.2->-r requirements3d.txt (line 16)) (0.4.6)
Requirement already satisfied: grpcio>=1.24.3 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from tb-nightly->basicsr==1.4.2->-r requirements3d.txt (line 16)) (1.50.0)
Requirement already satisfied: pyasn1-modules>=0.2.1 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from google-auth<3,>=1.6.3->tb-nightly->basicsr==1.4.2->-r requirements3d.txt (line 16)) (0.2.8)
Requirement already satisfied: cachetools<6.0,>=2.0.0 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from google-auth<3,>=1.6.3->tb-nightly->basicsr==1.4.2->-r requirements3d.txt (line 16)) (5.2.0)
Requirement already satisfied: rsa<5,>=3.1.4 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from google-auth<3,>=1.6.3->tb-nightly->basicsr==1.4.2->-r requirements3d.txt (line 16)) (4.9)
Requirement already satisfied: requests-oauthlib>=0.7.0 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from google-auth-oauthlib<0.5,>=0.4.1->tb-nightly->basicsr==1.4.2->-r requirements3d.txt (line 16)) (1.3.1)
Requirement already satisfied: importlib-metadata>=4.4 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from markdown>=2.6.8->tb-nightly->basicsr==1.4.2->-r requirements3d.txt (line 16)) (5.1.0)
Requirement already satisfied: zipp>=0.5 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from importlib-metadata>=4.4->markdown>=2.6.8->tb-nightly->basicsr==1.4.2->-r requirements3d.txt (line 16)) (3.11.0)
Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tb-nightly->basicsr==1.4.2->-r requirements3d.txt (line 16)) (0.4.8)
Requirement already satisfied: oauthlib>=3.0.0 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tb-nightly->basicsr==1.4.2->-r requirements3d.txt (line 16)) (3.2.2)
Requirement already satisfied: MarkupSafe>=2.1.1 in /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages (from werkzeug>=1.0.1->tb-nightly->basicsr==1.4.2->-r requirements3d.txt (line 16)) (2.1.1)
Building wheels for collected packages: llvmlite
Building wheel for llvmlite (setup.py) ... error
ERROR: Command errored out with exit status 1:
command: /home/oem/miniconda3/envs/ldm2/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-mi8t38g9/llvmlite_989da752628c422a8009813d14822a81/setup.py'"'"'; file='"'"'/tmp/pip-install-mi8t38g9/llvmlite_989da752628c422a8009813d14822a81/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' bdist_wheel -d /tmp/pip-wheel-8kcoad95
cwd: /tmp/pip-install-mi8t38g9/llvmlite_989da752628c422a8009813d14822a81/
Complete output (16 lines):
running bdist_wheel
/home/oem/miniconda3/envs/ldm2/bin/python /tmp/pip-install-mi8t38g9/llvmlite_989da752628c422a8009813d14822a81/ffi/build.py
LLVM version... llvm-config: /home/oem/miniconda3/lib/libtinfo.so.6: no version information available (required by llvm-config)
14.0.0

Traceback (most recent call last):
  File "/tmp/pip-install-mi8t38g9/llvmlite_989da752628c422a8009813d14822a81/ffi/build.py", line 168, in <module>
    main()
  File "/tmp/pip-install-mi8t38g9/llvmlite_989da752628c422a8009813d14822a81/ffi/build.py", line 158, in main
    main_posix('linux', '.so')
  File "/tmp/pip-install-mi8t38g9/llvmlite_989da752628c422a8009813d14822a81/ffi/build.py", line 120, in main_posix
    raise RuntimeError(msg)
RuntimeError: Building llvmlite requires LLVM 7.0+ Be sure to set LLVM_CONFIG to the right executable path.
Read the documentation at http://llvmlite.pydata.org/ for more information about building llvmlite.

error: command '/home/oem/miniconda3/envs/ldm2/bin/python' failed with exit code 1

ERROR: Failed building wheel for llvmlite
Running setup.py clean for llvmlite
Failed to build llvmlite
Installing collected packages: llvmlite, scipy, numba, joblib, imageio, scikit-image, resampy, yacs, trimesh, librosa, kornia, imageio-ffmpeg, face-alignment, dlib-bin
Attempting uninstall: llvmlite
Found existing installation: llvmlite 0.39.1
Uninstalling llvmlite-0.39.1:
Successfully uninstalled llvmlite-0.39.1
Running setup.py install for llvmlite ... error
ERROR: Command errored out with exit status 1:
command: /home/oem/miniconda3/envs/ldm2/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-mi8t38g9/llvmlite_989da752628c422a8009813d14822a81/setup.py'"'"'; file='"'"'/tmp/pip-install-mi8t38g9/llvmlite_989da752628c422a8009813d14822a81/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-ochopfnf/install-record.txt --single-version-externally-managed --compile --install-headers /home/oem/miniconda3/envs/ldm2/include/python3.9/llvmlite
cwd: /tmp/pip-install-mi8t38g9/llvmlite_989da752628c422a8009813d14822a81/
Complete output (21 lines):
running install
/home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
running build
got version from file /tmp/pip-install-mi8t38g9/llvmlite_989da752628c422a8009813d14822a81/llvmlite/_version.py {'version': '0.31.0', 'full': 'fe7d985f6421d87f613bd414479d29d912771562'}
running build_ext
/home/oem/miniconda3/envs/ldm2/bin/python /tmp/pip-install-mi8t38g9/llvmlite_989da752628c422a8009813d14822a81/ffi/build.py
LLVM version... llvm-config: /home/oem/miniconda3/lib/libtinfo.so.6: no version information available (required by llvm-config)
14.0.0

Traceback (most recent call last):
  File "/tmp/pip-install-mi8t38g9/llvmlite_989da752628c422a8009813d14822a81/ffi/build.py", line 168, in <module>
    main()
  File "/tmp/pip-install-mi8t38g9/llvmlite_989da752628c422a8009813d14822a81/ffi/build.py", line 158, in main
    main_posix('linux', '.so')
  File "/tmp/pip-install-mi8t38g9/llvmlite_989da752628c422a8009813d14822a81/ffi/build.py", line 120, in main_posix
    raise RuntimeError(msg)
RuntimeError: Building llvmlite requires LLVM 7.0+ Be sure to set LLVM_CONFIG to the right executable path.
Read the documentation at http://llvmlite.pydata.org/ for more information about building llvmlite.

error: command '/home/oem/miniconda3/envs/ldm2/bin/python' failed with exit code 1
----------------------------------------

Rolling back uninstall of llvmlite
Moving to /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages/llvmlite
from /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages/~lvmlite
Moving to /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages/llvmlite-0.39.1-py3.9.egg-info
from /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages/~lvmlite-0.39.1-py3.9.egg-info
ERROR: Command errored out with exit status 1: /home/oem/miniconda3/envs/ldm2/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-mi8t38g9/llvmlite_989da752628c422a8009813d14822a81/setup.py'"'"'; file='"'"'/tmp/pip-install-mi8t38g9/llvmlite_989da752628c422a8009813d14822a81/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-ochopfnf/install-record.txt --single-version-externally-managed --compile --install-headers /home/oem/miniconda3/envs/ldm2/include/python3.9/llvmlite Check the logs for full command output.

UPDATE

sudo apt-get install libllvm-14-ocaml-dev libllvm14 llvm-14 llvm-14-dev llvm-14-doc llvm-14-examples llvm-14-runtime

and I updated my .zshrc:

export LD_LIBRARY_PATH=$HOME/.miniconda3/envs/ldm2/lib:$LD_LIBRARY_PATH
export LLVM_CONFIG=/usr/bin/llvm-config-14

but no joy.

Running setup.py clean for llvmlite
Failed to build llvmlite
Installing collected packages: llvmlite, scipy, numba, joblib, imageio, scikit-image, resampy, yacs, trimesh, librosa, kornia, imageio-ffmpeg, face-alignment, dlib-bin
Attempting uninstall: llvmlite
Found existing installation: llvmlite 0.39.1
Uninstalling llvmlite-0.39.1:
Successfully uninstalled llvmlite-0.39.1
Running setup.py install for llvmlite ... error
ERROR: Command errored out with exit status 1:
command: /home/oem/miniconda3/envs/ldm2/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-9jv2q5m2/llvmlite_ddde0f598ad9469b92939673cd0b4f4e/setup.py'"'"'; file='"'"'/tmp/pip-install-9jv2q5m2/llvmlite_ddde0f598ad9469b92939673cd0b4f4e/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-ubj4g4b4/install-record.txt --single-version-externally-managed --compile --install-headers /home/oem/miniconda3/envs/ldm2/include/python3.9/llvmlite
cwd: /tmp/pip-install-9jv2q5m2/llvmlite_ddde0f598ad9469b92939673cd0b4f4e/
Complete output (20 lines):
running install
/home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages/setuptools/command/install.py:34: SetuptoolsDeprecationWarning: setup.py install is deprecated. Use build and pip and other standards-based tools.
warnings.warn(
running build
got version from file /tmp/pip-install-9jv2q5m2/llvmlite_ddde0f598ad9469b92939673cd0b4f4e/llvmlite/_version.py {'version': '0.31.0', 'full': 'fe7d985f6421d87f613bd414479d29d912771562'}
running build_ext
/home/oem/miniconda3/envs/ldm2/bin/python /tmp/pip-install-9jv2q5m2/llvmlite_ddde0f598ad9469b92939673cd0b4f4e/ffi/build.py
LLVM version... 14.0.0

Traceback (most recent call last):
  File "/tmp/pip-install-9jv2q5m2/llvmlite_ddde0f598ad9469b92939673cd0b4f4e/ffi/build.py", line 168, in <module>
    main()
  File "/tmp/pip-install-9jv2q5m2/llvmlite_ddde0f598ad9469b92939673cd0b4f4e/ffi/build.py", line 158, in main
    main_posix('linux', '.so')
  File "/tmp/pip-install-9jv2q5m2/llvmlite_ddde0f598ad9469b92939673cd0b4f4e/ffi/build.py", line 120, in main_posix
    raise RuntimeError(msg)
RuntimeError: Building llvmlite requires LLVM 7.0+ Be sure to set LLVM_CONFIG to the right executable path.
Read the documentation at http://llvmlite.pydata.org/ for more information about building llvmlite.

error: command '/home/oem/miniconda3/envs/ldm2/bin/python' failed with exit code 1
----------------------------------------

Rolling back uninstall of llvmlite
Moving to /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages/llvmlite
from /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages/~lvmlite
Moving to /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages/llvmlite-0.39.1-py3.9.egg-info
from /home/oem/miniconda3/envs/ldm2/lib/python3.9/site-packages/~lvmlite-0.39.1-py3.9.egg-info
ERROR: Command errored out with exit status 1: /home/oem/miniconda3/envs/ldm2/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-9jv2q5m2/llvmlite_ddde0f598ad9469b92939673cd0b4f4e/setup.py'"'"'; file='"'"'/tmp/pip-install-9jv2q5m2/llvmlite_ddde0f598ad9469b92939673cd0b4f4e/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /tmp/pip-record-ubj4g4b4/install-record.txt --single-version-externally-managed --compile --install-headers /home/oem/miniconda3/envs/ldm2/include/python3.9/llvmlite Check the logs for full command output.

pip install llvmlite succeeds.
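
A hedged note on the likely root cause: the log shows pip compiling llvmlite 0.31.0 from source, and that release predates LLVM 14 and refuses to build against it, so pointing LLVM_CONFIG at llvm-config-14 cannot satisfy it. Installing a newer llvmlite/numba pair that ships prebuilt wheels (or relaxing the pins in requirements3d.txt, if the rest of the stack allows it) avoids the source build altogether, for example:

pip install "llvmlite>=0.39" "numba>=0.56"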

Environment configuration error

numpy 1.23.5 requires Python >= 3.8, but as you described, Python 3.7 is installed. Were you able to install numpy successfully?

audio sampling rate

Hello, I want to ask about the audio sampling rate. In the examples you provided there are 16 kHz, 24 kHz, and 48 kHz files. What sampling rate should I convert my audio to? Also, could you tell me how to build the dataset?
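
For reference, a small pydub sketch for converting audio to a given sampling rate. The 16 kHz mono target below is only an assumption; the maintainers would need to confirm what the models expect:

from pydub import AudioSegment

audio = AudioSegment.from_file("input.wav")
audio = audio.set_frame_rate(16000).set_channels(1)   # 16 kHz mono, assumed target format
audio.export("input_16k.wav", format="wav")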

Dataset question

Was the model trained on a Mandarin dataset?

About custom audio input

I want to generate speech from text using a domestic TTS service, currently testing with Databaker (标贝). After feeding the resulting audio in, it reports an error.
微信图片_20230330171502 (attached screenshot)
The error log is as follows:
Traceback (most recent call last):
  File "inference.py", line 133, in <module>
    main(args)
  File "inference.py", line 72, in main
    coeff_path = audio_to_coeff.generate(batch, save_dir, pose_style)
  File "E:\PycharmProjects\SadTalker_Git\src\test_audio2coeff.py", line 74, in generate
    results_dict_pose = self.audio2pose_model.test(batch)
  File "E:\PycharmProjects\SadTalker_Git\src\audio2pose_models\audio2pose.py", line 85, in test
    batch = self.netG.test(batch)
  File "E:\PycharmProjects\SadTalker_Git\src\audio2pose_models\cvae.py", line 49, in test
    return self.decoder(batch)
  File "E:\Anaconda3\envs\sadtalker\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "E:\PycharmProjects\SadTalker_Git\src\audio2pose_models\cvae.py", line 139, in forward
    x_out = self.MLP(x_in) # bs layer_sizes[-1]
  File "E:\Anaconda3\envs\sadtalker\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "E:\Anaconda3\envs\sadtalker\lib\site-packages\torch\nn\modules\container.py", line 139, in forward
    input = module(input)
  File "E:\Anaconda3\envs\sadtalker\lib\site-packages\torch\nn\modules\module.py", line 1130, in _call_impl
    return forward_call(*input, **kwargs)
  File "E:\Anaconda3\envs\sadtalker\lib\site-packages\torch\nn\modules\linear.py", line 114, in forward
    return F.linear(input, self.weight, self.bias)
RuntimeError: mat1 and mat2 shapes cannot be multiplied (1x220 and 262x128)
I looked through the README carefully but did not find any notes on audio file requirements. Thanks in advance.

About expression

Since it's called SadTalker, the expressions come out sad. If we want the output to have a joyful expression, or any expression other than the same sad one, how can we implement that?

Also, does the library work on videos while keeping the video's own expression? That is, if I just provide the audio, will it do the lip sync while leaving the video's expression and everything else unchanged?

RuntimeError: Unable to open checkpoints/shape_predictor_68_face_landmarks.dat

Thanks for your great work!
Following the process, I have put all the checkpoint files in the checkpoints path, but got this error:

python3 inference.py --driven_audio assets/nhk.wav --source_image assets/guren3.png --result_dir assets
checkpoints/epoch_20.pth

Traceback (most recent call last):
  File "inference.py", line 98, in <module>
    main(args)
  File "inference.py", line 48, in main
    preprocess_model = CropAndExtract(path_of_lm_croper, path_of_net_recon_model, dir_of_BFM_fitting, device)
  File "/SadTalker/preprocess.py", line 45, in __init__
    self.croper = Croper(path_of_lm_croper)
  File "/SadTalker/croper.py", line 38, in __init__
    self.predictor = dlib.shape_predictor(path_of_lm)
RuntimeError: Unable to open checkpoints/shape_predictor_68_face_landmarks.dat
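
If the file simply is not present, one way to fetch it (a sketch; the repo's own download script may already handle this, and the target path assumes a checkpoints/ folder in the working directory) is to grab the official dlib archive and unpack it:

import bz2
import urllib.request

url = "http://dlib.net/files/shape_predictor_68_face_landmarks.dat.bz2"
data = bz2.decompress(urllib.request.urlopen(url).read())
with open("checkpoints/shape_predictor_68_face_landmarks.dat", "wb") as f:
    f.write(data)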

Possible bug

Symptom: While using the tool, I prefixed the image path with .\ to read the image from a folder under the current directory, but the file format was not detected correctly.

Cause: In the preprocessing file src\utils\preprocess.py, the file format is determined by taking the second element of the list produced by split.
If the input path contains . or .., the check goes wrong and the code falls into the video-reading branch.
For example, with the image path .\imgSrc\1.png, the logic at line 66 extracts the suffix "\imgSrc\1", which is not any of the three supported formats.

Fix: Would it be better to change the index into the split result from 1 to -1?
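
For illustration, a sketch of an even more robust variant using os.path.splitext, which ignores dots elsewhere in the path:

import os

path = r".\imgSrc\1.png"
ext = os.path.splitext(path)[1].lower()        # -> '.png', regardless of '.' or '..' in the path
is_image = ext in ('.png', '.jpg', '.jpeg')
print(ext, is_image)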

Can video input be supported?

It's great to see such an excellent home-grown framework. Could video input be supported by tuning parameters, or are there plans to support it later?
