zhengkw18 / face-vid2vid

Unofficial implementation of One-Shot Free-View Neural Talking Head Synthesis

Python 100.00%

face-vid2vid's Issues

Keypoint prior loss function

Thank you for your work. May I ask why your keypoint prior loss function is slightly different from the one in the original paper?

In the paper (A.2), the keypoint prior loss function is:

However, yours in losses.py is:

loss = (
    torch.max(0 * dist_mat, self.Dt - dist_mat).sum((1, 2)).mean()
    + torch.abs(kp_d[:, :, 2].mean(1) - self.zt).mean()
    - kp_d.shape[1] * self.Dt
)

I was wondering why you subtracted kp_d.shape[1] * self.Dt in the end.
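
One possible reading, assuming dist_mat holds the pairwise squared distances between the keypoints in kp_d (its construction is not shown in the snippet above): the diagonal of such a matrix is zero, so each of the K diagonal entries contributes exactly Dt to the hinge sum. Subtracting kp_d.shape[1] * self.Dt would then cancel that constant without changing any gradient. The sketch below checks this arithmetic in plain Python; the function names, keypoints, and threshold value are made up for illustration and are not the repository's code.

```python
# Hypothetical, dependency-free illustration; not the repository's losses.py.

def pairwise_sq_dist(kp):
    """Squared Euclidean distance between every pair of keypoints (diagonal is 0)."""
    return [[sum((a - b) ** 2 for a, b in zip(p, q)) for q in kp] for p in kp]


def hinge_sum(kp, Dt, include_diagonal=True):
    """Sum of max(0, Dt - dist) over keypoint pairs, optionally skipping i == j."""
    dist = pairwise_sq_dist(kp)
    total = 0.0
    for i, row in enumerate(dist):
        for j, d in enumerate(row):
            if i == j and not include_diagonal:
                continue
            total += max(0.0, Dt - d)
    return total


Dt = 0.1  # illustrative threshold, chosen only for this example
kp = [[0.0, 0.0, 0.3], [0.2, 0.0, 0.3], [0.0, 0.5, 0.4]]  # made-up keypoints

full = hinge_sum(kp, Dt, include_diagonal=True)
off_diag = hinge_sum(kp, Dt, include_diagonal=False)

# The diagonal adds exactly K * Dt, independent of where the keypoints are.
diagonal_offset = full - off_diag
```

If that reading is right, the two forms differ only by a constant, so the reported loss value shifts while optimization is unaffected.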

There is a problem with the generated video, what is the cause?

Thank you for sharing your model!

I used your pre-trained model, but the generated video has a problem. What could be causing it?

Thanks!

load_videos.py - Can not load video UW1c9E8nfxQ, broken link

Hi! Thank you for your great contribution! I would love to use your model! I'm trying to build the default dataset folder but:

python load_videos.py --workers=8

Number of videos: 3442
0it [00:00, ?it/s]Can not load video UW1c9E8nfxQ, broken link
1it [00:08, 8.73s/it]Can not load video pbm-5KhWXlc, broken link
2it [00:10, 4.82s/it]Can not load video tMP5U3jYNkg, broken link
Can not load video LZ_Hw9J62KE, broken link
4it [00:10, 1.84s/it]Can not load video u3odsIbYouc, broken link
Can not load video yLA2n3coUgk, broken link
6it [00:11, 1.01s/it]Can not load video ULBH3A8DjPM, broken link
Can not load video B5jqlhXWkOo, broken link
8it [00:11, 1.59it/s]Can not load video LNlufCgIx_E, broken link
Can not load video shR-y9jzeHg, broken link
10it [00:22, 2.44s/it]Can not load video 8xomuTM5Jm8, broken link
Can not load video zWig265SViA, broken link
Can not load video q1mNeW_BrSw, broken link
13it [00:22, 1.42s/it]Can not load video vN5K8HEgafI, broken link
Can not load video ldAbe81ePpE, broken link
15it [00:22, 1.02s/it]Can not load video daZUIa8FA_M, broken link
Can not load video dwnIdViJS0U, broken link
17it [00:26, 1.23s/it]Can not load video QdBQTHX55yI, broken link
18it [00:33, 2.25s/it]Can not load video 1fpTDuFfoB0, broken link
19it [00:33, 1.83s/it]Can not load video DE089Obo6L4, broken link
20it [00:33, 1.46s/it]Can not load video sh6J3wEmceA, broken link
21it [00:33, 1.14s/it]Can not load video Hyzl8482nfY, broken link
22it [00:33, 1.14it/s]Can not load video vuVdwmx_1yQ, broken link

Can you help me?!
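
To find out how many of the 3442 videos fail and which ones, the output can be parsed for the failing IDs. A minimal sketch, assuming the message format shown in the log above stays the same; the function name is hypothetical and this is not part of load_videos.py.

```python
import re

# Matches the message printed in the log above; tqdm progress text may precede it
# on the same line, so the pattern is searched for anywhere in the text.
FAILED = re.compile(r"Can not load video (\S+), broken link")


def failed_video_ids(log_text):
    """Return the video IDs reported as broken, in order of appearance."""
    return FAILED.findall(log_text)


sample_log = """0it [00:00, ?it/s]Can not load video UW1c9E8nfxQ, broken link
1it [00:08, 8.73s/it]Can not load video pbm-5KhWXlc, broken link
Can not load video LZ_Hw9J62KE, broken link"""

ids = failed_video_ids(sample_log)
```

Comparing the number of failed IDs with the total helps distinguish a few dead links from a download setup that fails for every video.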

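Is this still the SOTA in this area?

Hi, I'd like to ask: is this still the state of the art in this area? I recently looked at the SadTalker work, and in comparison its results don't seem to be any better than this one's. Do you know of any projects or papers that do better?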

Training data and command

Hello, and thank you for your great contribution.
I would like to know how to train with this project. What format should the training data be in (a series of folders containing video frames as images, a series of video files, or something else), and which directory should it be placed in? What are the corresponding Python training commands?
Thank you.

Continuing training on your shared model

Thank you for sharing your model! I am now continuing training from the shared model using the VoxCeleb2 subsets (part_b, part_c and part_d, about 380k videos; the paper says it used 280k videos).
After every epoch I evaluated the model, and the performance seems to get gradually worse.
Although the training losses are decreasing, the PSNR of the generated videos is also decreasing and the visual quality is worse, which is strange.

Do you have any thoughts on this? Could you share any further training details you think are necessary? Thank you very much.

Top: shared model
Bottom: model after continued training
You can see that the background moves with my model.
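
For anyone trying to reproduce this comparison, PSNR here presumably means the standard peak signal-to-noise ratio between generated and ground-truth frames. A minimal sketch follows, assuming 8-bit pixel values passed as flat sequences; the function name and sample values are made up, and the poster's actual evaluation script is not shown.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """PSNR in dB between two equally sized frames given as flat pixel sequences."""
    if len(reference) != len(generated):
        raise ValueError("frames must have the same number of pixels")
    mse = sum((r - g) ** 2 for r, g in zip(reference, generated)) / len(reference)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(max_value ** 2 / mse)


ref = [0, 128, 255, 64]   # made-up ground-truth pixels
gen = [0, 130, 250, 64]   # made-up generated pixels
value = psnr(ref, gen)
```

Because PSNR is computed over the whole frame, a drifting background like the one described lowers it even when the face region looks unchanged.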

Which version of imageio does this project use? Thank you very much.

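Hello, could you share the processed dataset?

Could you share the processed dataset via Baidu Netdisk?

I'd like to ask about the paper: does increasing the number of keypoints always improve performance? What is your view, and does performance really grow in proportion?

I also have a question about training: how much does changing the number of keypoints increase the training cost, for example going from 15 to 20 keypoints?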

About the network

Hello, zhengkw18, thank you for your contribution!

The output "delta" of the HPE_EDE model should be the expression of the person, not the head pose, right?
But when I froze the yaw, pitch and roll matrices and extracted only the delta feature from the HPE model of the driving person, the source person still showed head motion. What am I doing wrong?

I want to transfer one person's expression to another, with no head motion. How should I do this?
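
In the paper's formulation, each keypoint is the rotated canonical keypoint plus a translation plus an expression deformation. Under that decomposition, one sketch of expression-only transfer takes the rotation and the translation from the source image and only the deformation from the driving frame. The code below is a plain-Python illustration with made-up values; the function and variable names are hypothetical and are not the repository's API.

```python
# Hypothetical sketch of x_k = R @ x_c,k + t + delta_k; not the repository's code.

def rotate(R, p):
    """Apply a 3x3 rotation matrix R (list of rows) to a 3D point p."""
    return [sum(R[r][c] * p[c] for c in range(3)) for r in range(3)]


def compose_keypoints(kp_canonical, R, t, delta):
    """Rotate each canonical keypoint, then add translation and its deformation."""
    out = []
    for x_c, d in zip(kp_canonical, delta):
        x_rot = rotate(R, x_c)
        out.append([x_rot[i] + t[i] + d[i] for i in range(3)])
    return out


kp_canonical = [[0.1, 0.0, 0.0], [0.0, 0.2, 0.0]]      # made-up canonical keypoints
R_source = [[0.0, -1.0, 0.0],                          # 90-degree rotation about z
            [1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0]]
t_source = [0.5, 0.0, 0.0]                             # source translation
delta_driving = [[0.0, 0.0, 0.01], [0.02, 0.0, 0.0]]   # driving expression offsets

# Pose (R and t) comes from the source; only delta comes from the driving frame.
kp_expression_only = compose_keypoints(kp_canonical, R_source, t_source, delta_driving)
```

Note that the translation is fixed to the source along with the rotation. If the estimated delta itself carries some pose information, head motion could still leak through; this sketch does not address that.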

About the checkpoint epoch

Thanks a lot for your code and pre-trained model.

Now I want to continue training from your pretrained model. After loading it, the epoch counter starts from 12400, but the checkpoint is named 00000100-ckp.pth.tar, which would suggest it was saved after 100 epochs. Do you have any idea what causes this discrepancy? Thank you!
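
A quick way to make the mismatch explicit is to extract the number from the checkpoint filename and compare it with the epoch the training script reports after loading. A minimal sketch, assuming the leading digits of the filename are a counter of some kind (what they count is not confirmed here); the function name is hypothetical.

```python
import re


def index_from_checkpoint_name(filename):
    """Return the leading integer of a name like '00000100-ckp.pth.tar'."""
    match = re.match(r"(\d+)-ckp\.pth\.tar$", filename)
    if match is None:
        raise ValueError(f"unexpected checkpoint name: {filename}")
    return int(match.group(1))


name_index = index_from_checkpoint_name("00000100-ckp.pth.tar")
reported_epoch = 12400  # value reported in the issue after loading

# True here, which is the discrepancy being asked about.
mismatch = name_index != reported_epoch
```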

Could you please share the trained checkpoint?

Hello, and thank you for open-sourcing the code.
Would you consider sharing the pre-trained model at some point? Thanks!