snap-research / mocogan-hd
[ICLR 2021 Spotlight] A Good Image Generator Is What You Need for High-Resolution Video Synthesis
License: Other
Hi! Thank you for the project and the codebase! I noticed that for some datasets, links to the pretrained models do not work: e.g. the image generator link on FaceForensics leads to https://github.com/snap-research/MoCoGAN-HD/blob/main/pretrained_models/faceforensics-fid10.9920-snapshot-008765.pt, which does not exist (same for (Anime, VoxCeleb) and (AFHQ, VoxCeleb) cross-domain image generators). Could you please provide a link for the pretrained image generator on FaceForensics?
Hello,
As I saw in issue #5 (specifically, the comment below), I understand that DiffAugment is applied when training on the UCF-101 dataset.
Is DiffAugment applied to the FaceForensics dataset too?
Like UCF-101, which has only a small number of samples per class,
FaceForensics has only 704 training videos, and I think this is not a sufficient amount of data to train GANs.
Hi @sihyun-yu, have you tried to use the augmentation from this work?
The FID was calculated during training from StyleGAN2.
Originally posted by @alanspike in #5 (comment)
Thanks,
Hi,
Great Work!
I was using the pre-trained models for inference on the SkyTimelapse and UCF-101 datasets. However, in both cases, gray videos are generated. I have not made any changes to the code, and there are no errors or warnings. Did you face a similar issue?
Dear authors,
I want to ask how you fine-tune the generator. Taking FaceForensics
as an example, did you use all cropped frames as the fine-tuning dataset, or only several frames per identity?
Thanks a lot.
Hello
Thank you for your great work! I read the paper carefully.
I wonder how you calculate the Inception Score on UCF-101 in detail.
I read that you follow the TGAN paper for evaluating Inception Scores and use the C3D network to get the predictions.
Which weights did you use for the C3D network? Did you train it from scratch?
If not, could you tell me which C3D weights you used and how to use the network?
Specifically, in this paper the generated UCF-101 videos are 224x224, but the pre-trained C3D network at this link (https://github.com/rezoo/tgan2/releases/download/v1.0/conv3d_deepnetA_ucf.npz) was not trained with a 224x224 configuration. How did you resize and normalize the frames?
I would be very grateful if you could reply.
Thanks.
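In case it helps other readers: this is not the authors' pipeline, but a minimal NumPy sketch of one plausible preprocessing path, assuming C3D's usual 16-frame, 112x112, channels-first input. The resize method (nearest-neighbour here, for self-containedness) and the mean subtraction would need to match whatever the released checkpoint was trained with.

```python
import numpy as np

def resize_nearest(frame, size=(112, 112)):
    """Nearest-neighbour resize; a stand-in for a proper bilinear resize."""
    h, w = frame.shape[:2]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return frame[rows][:, cols]

def preprocess_for_c3d(video, mean=None):
    """video: (T, H, W, 3) uint8 array of 224x224 generated frames.

    Returns a (3, 16, 112, 112) float32 clip, assuming C3D expects
    16 frames at 112x112, channels-first, optionally mean-subtracted.
    """
    clip = np.stack([resize_nearest(f) for f in video[:16]]).astype(np.float32)
    if mean is not None:
        clip -= mean  # per-channel mean from the C3D training data
    return clip.transpose(3, 0, 1, 2)

video = np.zeros((16, 224, 224, 3), dtype=np.uint8)
print(preprocess_for_c3d(video).shape)  # (3, 16, 112, 112)
```

Again, whether frames should be center-cropped rather than resized, and which mean statistics apply, is exactly the ambiguity this question is about.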
Hi,
I was able to run get_stats_pca.py using the pretrained image generator models you provide. Then, while I was installing a few more packages into my conda environment, get_stats_pca.py stopped running altogether.
I have tried uninstalling and reinstalling the conda environment using the requirements.txt provided in the repository. This is my command: python get_stats_pca.py --batchSize 4000 --save_pca_path pca_stats/ucf_101 --pca_iterations 250 --latent_dimension 512 --img_g_weights pretrained_checkpoints/ucf-256-fid41.6761-snapshot-006935.pt --style_gan_size 256 --gpu 0
The process just hangs forever: the GPU memory usage goes from 0 MB to 3 MB, and nothing else happens. I don't know what I could have done wrong; it was working before. As an additional step, I also set up the repository from scratch.
Any idea what might have happened?
Hi! FaceForensics contains "video starting" artifacts in the first ~0.5 seconds of many of its videos (see the gif), which might produce corresponding training artifacts. Did you remove them?
Here are random samples from FFS, cut to the first 0.5 seconds:
Also, did you account for them in any way when computing FVD?
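For what it's worth, dropping those artifact frames is straightforward once the videos are decoded into arrays. A minimal sketch, assuming ~25 fps (the actual frame rate should be read from each video):

```python
import numpy as np

def trim_start(frames, seconds=0.5, fps=25):
    """Drop the first `seconds` of a (T, H, W, C) frame array, e.g. to cut
    the "video starting" artifacts before training or FVD computation."""
    return frames[int(seconds * fps):]

frames = np.zeros((100, 64, 64, 3), dtype=np.uint8)
print(trim_start(frames).shape[0])  # 100 - 12 = 88 frames remain
```

The open question is whether the authors did something like this, both for training and for the real-video side of the FVD computation.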
Hi,
First of all, thank you for your great work!
As I read your paper,
I understand that the FVD is calculated from 2048 videos at 128x128 resolution on the UCF-101 dataset.
To evaluate your model on UCF-101, I randomly sampled 2048 real videos (random clips of 16 consecutive frames) and resized them to 128x128 resolution.
Then, I calculated the FVD between the sampled real and fake videos.
As a result, I got 625.87, which is a little lower than the distance you reported.
I think there is either some difference in how the real video samples are built compared to your implementation, or the FVD oscillates a lot due to the randomness of sampling.
Could you describe the detailed FVD evaluation process on the UCF-101 and FaceForensics datasets?
Thanks,
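For reference, my sampling procedure (my own assumption, not necessarily matching the authors') looks roughly like this; the resize to 128x128 is left to an image library of choice:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_real_clips(videos, n_clips=2048, clip_len=16):
    """videos: list of (T, H, W, 3) uint8 arrays (decoded UCF-101 videos).

    For each clip, randomly pick a video and a random window of
    `clip_len` consecutive frames, as described above.
    """
    clips = []
    for _ in range(n_clips):
        v = videos[rng.integers(len(videos))]
        start = rng.integers(v.shape[0] - clip_len + 1)
        clips.append(v[start:start + clip_len])
    return np.stack(clips)  # (n_clips, clip_len, H, W, 3)

videos = [np.zeros((40, 128, 128, 3), dtype=np.uint8) for _ in range(4)]
print(sample_real_clips(videos, n_clips=8).shape)  # (8, 16, 128, 128, 3)
```

If the official evaluation samples videos or windows differently (e.g. one clip per video, or fixed start frames), that alone could explain part of the gap.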
Hi, thanks for your great work!
I have a question about the cross-domain video discriminator.
According to your paper, you can learn to synthesize video content from one dataset A (such as Anime-Face) while taking motion from another dataset B (such as VoxCeleb). In this mode, I think the video discriminator will first learn to distinguish anime content from real-person content, rather than to distinguish meaningful motions. How do you ensure that the video discriminator is helpful during training in this mode?
Hi,
Will you release the code for ACD (average consistency distance) and FID?
Thanks
MoCoGAN-HD/train_func_cross_domain.py
Lines 245 to 247 in 27356ba
Hi,
can you give an example of how to calculate the similarity loss in Equation 3 of the paper? Thanks!
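Without speaking for the authors, a pairwise cosine-similarity term (which may or may not match Eq. 3 exactly; the reduction over frames and the sign are assumptions here) can be computed like this, where `codes` is a hypothetical (T, D) array of per-frame latent vectors:

```python
import numpy as np

def cosine_sim(a, b, eps=1e-8):
    # Cosine similarity between two 1-D vectors.
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + eps))

def similarity_loss(codes):
    """Mean cosine similarity between consecutive rows of `codes` (T, D).
    NOTE: a generic sketch, not necessarily the paper's exact Eq. 3."""
    sims = [cosine_sim(codes[t], codes[t + 1]) for t in range(len(codes) - 1)]
    return float(np.mean(sims))

print(similarity_loss(np.eye(3)))  # orthogonal one-hot codes -> 0.0
```

A confirmed example from the authors (what the vectors are, and whether the term is minimized or maximized) would still be valuable.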
Hi, thank you for sharing the code of your elegant work!
I have a question about the experimental setup on experiments with UCF-101 dataset.
Did you use the "train" split from the UCF-101 dataset or the whole dataset without split?
Thank you in advance!
Sincerely,
Sihyun
I have a custom dataset of face videos from the How2Sign dataset. I have the dataset in the format required by this repository. What are the steps for training on a custom dataset?
Hello! Thanks again for providing the implementation.
I am trying to retrain an "unconditional" image generator from scratch on the UCF-101 dataset using StyleGAN2, as you suggested.
Did you use specific hyperparameters to train such a model to reach the reported FID?
If so, can you share those hyperparameters?
Thanks in advance!
Sincerely,
Sihyun
Hi! Could you please tell us whether you used any truncation for the content or motion codes, or curated the samples for these generations: https://github.com/snap-research/MoCoGAN-HD#faceforensics-1 ? I used your pretrained checkpoint, PCA stats, and the pretrained G to generate samples with --n_frames_G=32
and without spatial noise, and the results feel lower quality than the ones you show in your README.md. Here are the samples I got (sorry for the external link; GitHub for some reason refuses to upload the gif even though it is under 10 MB):
https://i.imgur.com/1QRibnD.mp4
For example, the motion diversity is not that good, i.e. the heads do not "speak". Could you tell why there is such a difference?
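For anyone comparing: the standard StyleGAN-style truncation trick (which, as asked above, the README samples may or may not use; that is exactly the question) is simply pulling the latent toward its mean:

```python
import numpy as np

def truncate(w, w_avg, psi=0.7):
    """StyleGAN truncation trick: pull a latent `w` toward the mean `w_avg`.
    psi < 1 trades sample diversity for visual fidelity."""
    return w_avg + psi * (w - w_avg)

w = np.ones(4) * 2.0
print(truncate(w, np.zeros(4), psi=0.5))  # [1. 1. 1. 1.]
```

If truncation was applied to the content code when producing the README samples, that could explain the fidelity gap at the cost of diversity.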