
Comments (31)

shaoanlu avatar shaoanlu commented on August 26, 2024

Here are the results I got after training for ~10k iters (4 hrs on a K80) with perceptual loss. The masked faces are fairly good. So what I've learned is that, as long as perceptual loss is introduced during training, we probably don't have to curate the training data as much as I thought we would.

[image: 0218_masked_pl]
[image: 0218_mask_pl]
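The perceptual-loss idea can be sketched as a feature-matching distance. This is a generic illustration, assuming the feature maps have already been extracted from a fixed pretrained network (e.g. VGG activations), which is how such losses are typically computed; it is not the exact code used in this repo.

```python
import numpy as np

def perceptual_loss(real_feats, fake_feats):
    """Perceptual loss: mean squared distance between feature maps of
    the real and generated faces, summed over the chosen layers.
    In practice the feature maps come from a fixed pretrained network
    (e.g. VGG activations); here they are plain toy arrays."""
    return sum(float(np.mean((r - f) ** 2))
               for r, f in zip(real_feats, fake_feats))

# toy feature maps from two layers
real = [np.ones((4, 4)), np.ones((2, 2))]
fake = [np.zeros((4, 4)), np.ones((2, 2))]
print(perceptual_loss(real, fake))  # -> 1.0
```

Because the comparison happens in feature space rather than pixel space, the generator is rewarded for reproducing face structure rather than exact pixel values, which is why less curated data can still work.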

However, the generated mask cannot deal with hard samples, e.g., the faces with motion blur at the bottom left of the figures above. As a result, the output video quality on non-stationary faces is sub-optimal.

[image: shia_downey]

We can see there is one quick moment where it fails to transform the face into Downey. On the other hand, the mouth is preserved, so we can tell he is shouting "just do it!".

[image: shia_downey_ccr]

But it is somehow eliminated after color correction. (Update: it's not eliminated; rather, it just becomes less discernible.)

from faceswap-gan.

shaoanlu avatar shaoanlu commented on August 26, 2024

Jesus man, these masks are awesome. They are like dot art. Please tell me how you got them, seriously. What's the training data? How many iters did you train for? I would really like to reproduce the result.

In terms of faceswap, I have not looked into the details of the recent GAN update, but I believe it's the same as my implementation.

I have not seen such masks before, but I guess it might have something to do with a lack of diversity as well as not enough training data. However, the sharpness can be reduced to a certain extent by GaussianBlur during video making, so that's not a big deal in my opinion.


iperov avatar iperov commented on August 26, 2024

I just built the latest https://github.com/deepfakes/faceswap release + pull request 155.

I built a Windows version with cmd batches for myself; you can download it from here.
The test footage shown in the picture is included in the torrent:
deepfakes/faceswap-playground#39


shaoanlu avatar shaoanlu commented on August 26, 2024

I've downloaded the zip file you mentioned above through the torrent. Looking through it, I only found 2 mp4 videos (to be extracted as training data?), each < 30 sec. So I think such awesome masks are caused by too little training data. The model is heavily overfit, such that one of the preview mask columns is purely white (which means the model believes it can reconstruct the input image with 100% accuracy).


iperov avatar iperov commented on August 26, 2024

652 photos are not enough for a GAN?
Then how many?


iperov avatar iperov commented on August 26, 2024

actually the white column is not purely white:
[image: 2018-02-11_17-01-54]


shaoanlu avatar shaoanlu commented on August 26, 2024

LOL, love this gif. It shows that what we see is actually not what we see.

Do these 652 images come from the 2 videos (data_dst.mp4 and data_src.mp4)? If so, it's definitely not enough, since the extracted faces will all look the same (they are under the same lighting conditions).

In my experiments, it's better to have more than 1k images from various video sources (>= 3 at least), with extraction at fps < 5 (i.e., extract < 5 faces/s from the video) so that there will not be too many duplicate images.

However, you can still try a technique called transfer learning even if you have little training data. To be specific, we first train the model on other face datasets (e.g., Emma Watson, Donald Trump, CelebA, etc., whatever you can find). Once the pre-trained model shows good results (say, after 20k iters), we then train it on these 652 images.

But really, have you tried making a video with your current model? I imagine it produces faces with a lot of jitter. But considering how much facial detail it generates in the previews, it might produce interesting results in video.


iperov avatar iperov commented on August 26, 2024

data_src.mp4 - 654 images

so you mean the GAN model does not like a lot of similar images?


shaoanlu avatar shaoanlu commented on August 26, 2024

Ahh, that makes sense. A 27-sec video at avg. 25 fps: 27 x 25 is roughly 654. You probably want to add more images from other sources.

I found extraction at < 3 fps to be more effective; the information these 654 images provide to the model is probably not much different from that of fewer than 100 images.
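The low-fps extraction heuristic can be sketched as follows; `frames_to_extract` is a hypothetical helper for illustration, not a function in faceswap.

```python
def frames_to_extract(total_frames, video_fps, extract_fps=3):
    """Return indices of frames to extract so we sample at most
    about `extract_fps` faces per second, reducing near-duplicate
    training images from consecutive frames."""
    step = max(1, round(video_fps / extract_fps))
    return list(range(0, total_frames, step))

# ~27 s of video at 25 fps (675 frames) sampled at 3 fps
idx = frames_to_extract(675, 25, 3)
print(len(idx))  # -> 85 frames instead of 675
```

Subsampling like this trades raw image count for diversity: consecutive frames add almost no new information, so fewer, more spread-out frames are worth more to the model.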


iperov avatar iperov commented on August 26, 2024

result at the current stage
[image: 2018-02-11_17-30-07]


shaoanlu avatar shaoanlu commented on August 26, 2024

I think the result could be much better if the face bounding box were more stable. As we can see, the bbox size and position in the gif above are not smooth, which results in severe jitter. We can smooth the bbox position by taking a moving average over previous frames. However, as far as I know there is no such functionality in faceswap (or maybe it's still a WIP).
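The moving-average smoothing idea can be sketched like this; `BBoxSmoother` is a hypothetical helper, not existing faceswap functionality.

```python
from collections import deque

class BBoxSmoother:
    """Smooth face bounding boxes by averaging the boxes of the
    last `window` frames, reducing frame-to-frame jitter."""
    def __init__(self, window=5):
        self.history = deque(maxlen=window)

    def update(self, bbox):
        # bbox = (x, y, w, h) detected in the current frame
        self.history.append(bbox)
        n = len(self.history)
        return tuple(sum(b[i] for b in self.history) / n for i in range(4))

smoother = BBoxSmoother(window=3)
smoother.update((100, 100, 64, 64))
smoother.update((106, 100, 64, 64))
print(smoother.update((112, 100, 64, 64)))  # -> (106.0, 100.0, 64.0, 64.0)
```

A larger `window` gives a steadier box at the cost of lagging behind fast head motion, so the value is a trade-off per clip.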

What I want to say is: don't be discouraged by the current result. It can be improved if the proper tricks are introduced.


iperov avatar iperov commented on August 26, 2024

so is there no point in training the GAN model in https://github.com/deepfakes/faceswap ?


shaoanlu avatar shaoanlu commented on August 26, 2024

The GAN (and non-GAN as well) will work if we have enough training data.

I'll try your dataset and do some experiments tomorrow; hopefully I can make a breakthrough.


iperov avatar iperov commented on August 26, 2024

non-GAN works much better with the same footage

[image: 2018-02-11_18-02-54]

So maybe deepfakes/faceswap did something wrong when porting your model?


shaoanlu avatar shaoanlu commented on August 26, 2024

I don't think there is much difference between my implementation and faceswap's. Perhaps GAN is just not as data-efficient as non-GAN.


dfaker avatar dfaker commented on August 26, 2024

@shaoanlu I think the GAN + perceptual loss is very promising, but I've never been convinced that the masking doesn't degrade down to edge detection.

Have you considered training against a mask derived from the convex hull of the landmarks, with the mask scaled down or multiplied by rgb-face-loss*sigmoid to allow the network to reduce the mask only where it's beneficial to face reproduction?


shaoanlu avatar shaoanlu commented on August 26, 2024

The landmarks will be detected whether or not there is occlusion on the face. My understanding is that if we use the convex-hull mask as supervised ground truth, the model will overfit such that it cannot handle occlusion anymore. In contrast, I believe the mask is able to learn more or less semantic segmentation, as shown in LR-GAN. But we have to find a better architecture and loss function for masking in the face-swapping task.


dfaker avatar dfaker commented on August 26, 2024

The landmarks will be detected no matter there is occlusion on faces or not.

Yeah, hence the need to give the network a 'loophole' by biasing the loss using rgb-face-loss*sigmoid: if the face is being generated well, the loss will be near zero and we won't care about adjustments to the masking layer.

If the generation is poor, however, the rgb-face-loss*sigmoid is high and we do rigidly enforce the ground-truth mask.


shaoanlu avatar shaoanlu commented on August 26, 2024

Can you elaborate more on the rgb-face-loss*sigmoid?


dfaker avatar dfaker commented on August 26, 2024

So we have the rgb-face-loss, which represents purely the accuracy of the reproduction of the unmasked face.

Its output is simply passed through some mapping function (perhaps a sigmoid, perhaps something harder) to force high losses towards 0 and low losses towards 1. It then becomes suitable for multiplication with the mask loss, enforcing the full loss against the ground-truth mask when the face is accurate but allowing the mask to take on any value when the face is inaccurate.

If an occlusion appears, it returns a high rgb-face-loss, as an unexpected feature has appeared in the face. This is forced low by the mapping, and when multiplied with the mask loss it biases it towards zero, which allows the mask output to move away from the ground-truth mask.

For normal areas of the face without occlusion, the rgb-face-loss is low, which is forced high, passing the full loss back to the mask.
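A toy per-pixel sketch of this gating idea (the constants `k` and `threshold` are made-up values for illustration; the exact mapping function was deliberately left open above):

```python
import math

def gate(rgb_loss, k=50.0, threshold=0.05):
    """Map a reconstruction loss to a weight in (0, 1): low loss
    (face reproduced well) -> weight near 1, enforcing the
    ground-truth mask; high loss (e.g. an occlusion) -> weight
    near 0, letting the mask deviate from the ground truth."""
    return 1.0 / (1.0 + math.exp(k * (rgb_loss - threshold)))

def gated_mask_loss(rgb_loss, mask_pred, mask_gt):
    # squared mask error, scaled down wherever reconstruction fails
    return gate(rgb_loss) * (mask_pred - mask_gt) ** 2

print(round(gate(0.0), 3))  # well-reconstructed pixel -> 0.924
print(round(gate(0.5), 3))  # occluded pixel -> 0.0 (mask is free)
```

In a real training setup both losses would be per-pixel tensors inside the loss function, but the scalar version above shows the gating behaviour.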


shaoanlu avatar shaoanlu commented on August 26, 2024

I think this is viable. It would be good if someone could run some experiments as a proof of concept.


iperov avatar iperov commented on August 26, 2024

how can I get such masks in faceswap?


shaoanlu avatar shaoanlu commented on August 26, 2024

Yet more results, trained for 15k iters without perceptual loss. The model is not exactly the same as the v2 model, but the modifications should theoretically have little impact on output quality.

[image: just_do_it_masked_wopl]
[image: just_do_it_mask_wopl]

In light of my recent experiments, I think the sharp masks shown in your post are due to training for too long, such that the model is heavily overfit.


Jack29913 avatar Jack29913 commented on August 26, 2024

Is perceptual loss implemented in faceswap?


iperov avatar iperov commented on August 26, 2024

no


MisterGenerosity avatar MisterGenerosity commented on August 26, 2024

Hey shaoanlu, I noticed you mentioned color correction above:

But it is somehow eliminated after color correction.

What do you do for color correction? Meaning, what software or, perhaps, what code? As I'm getting into this, I'm finding my subject faces have such different skin tones that I'm concerned it'll look like rubbish when I'm done.


shaoanlu avatar shaoanlu commented on August 26, 2024

Color correction is implemented through histogram matching. You can find the code in the v2_test_video notebooks.
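Histogram matching can be implemented roughly as below; this is a generic per-channel NumPy version for illustration, not the exact code from the v2_test_video notebooks.

```python
import numpy as np

def match_histogram(source, template):
    """Adjust `source` (one color channel of the generated face) so
    its intensity histogram matches that of `template` (the same
    channel of the target frame)."""
    shape = source.shape
    source, template = source.ravel(), template.ravel()
    # unique values, indices back into `source`, and value counts
    s_values, s_idx, s_counts = np.unique(
        source, return_inverse=True, return_counts=True)
    t_values, t_counts = np.unique(template, return_counts=True)
    # normalized cumulative distribution functions
    s_cdf = np.cumsum(s_counts).astype(np.float64) / source.size
    t_cdf = np.cumsum(t_counts).astype(np.float64) / template.size
    # for each source quantile, pick the template value at that quantile
    matched = np.interp(s_cdf, t_cdf, t_values)
    return matched[s_idx].reshape(shape)

dark = np.array([[10, 20], [30, 40]])
bright = np.array([[200, 210], [220, 230]])
print(match_histogram(dark, bright))  # dark values mapped onto bright range
```

Running this per RGB channel pulls the generated face's tones towards the target frame's, which is what makes the mismatch above less discernible.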

The reason I did not include perceptual loss in faceswap is that it requires Keras > 2.1.1 and more VRAM. I worried that it would lead to lots of issue posts on the repo, so I decided to remove it.


iperov avatar iperov commented on August 26, 2024

is histogram matching implemented in faceswap?


shaoanlu avatar shaoanlu commented on August 26, 2024

Probably not; I added it recently. However, I have not followed faceswap updates closely, so maybe some similar work has been done.


Jack29913 avatar Jack29913 commented on August 26, 2024

Nope. The repo is pretty idle now.


mrgloom avatar mrgloom commented on August 26, 2024

About checkerboard artifacts: https://distill.pub/2016/deconv-checkerboard/
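A 1-D toy illustration of that article's argument: when a transposed convolution's kernel size is not divisible by its stride, output positions receive unevenly many kernel contributions, which produces the checkerboard pattern; a common remedy is nearest-neighbor or bilinear upsampling followed by an ordinary convolution.

```python
def overlap_counts(output_len, kernel, stride):
    """Count how many kernel applications of a 1-D transposed
    convolution contribute to each output position; uneven interior
    counts are the source of checkerboard artifacts."""
    counts = [0] * output_len
    pos = 0
    while pos + kernel <= output_len:
        for i in range(pos, pos + kernel):
            counts[i] += 1
        pos += stride
    return counts

# kernel not divisible by stride: interior alternates 2, 1, 2, 1, ...
print(overlap_counts(10, kernel=3, stride=2))  # -> [1, 1, 2, 1, 2, 1, 2, 1, 1, 0]
# kernel divisible by stride: even interior coverage
print(overlap_counts(10, kernel=4, stride=2))  # -> [1, 1, 2, 2, 2, 2, 2, 2, 1, 1]
```

The alternating 2, 1, 2, 1 interior in the first case is exactly the per-pixel intensity imbalance that shows up as a checkerboard in 2-D.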

