
mit-han-lab / anycost-gan

769 stars · 23 watchers · 95 forks · 17.12 MB

[CVPR 2021] Anycost GANs for Interactive Image Synthesis and Editing

Home Page: https://hanlab.mit.edu/projects/anycost-gan/

License: MIT License

Python 88.00% C++ 1.21% Cuda 9.75% Shell 1.04%
computer-vision deep-learning computer-graphics generative-adversarial-network gan image-generation image-manipulation image-editing gans pytorch

anycost-gan's People

Contributors

amrzv, junyanz, songhan, tonylins


anycost-gan's Issues

Using My Face

Hello.
Can I use my own face to add a smile? How do I do it?

Using this tool for another LSUN dataset + model

Hello! Thanks for creating this.

I am trying to use this tool with another model (the LSUN Churches dataset) with sliders that represent attributes.

As I understand it, these are the steps I need to take to configure this toolset to work with a different dataset + pre-trained network:

  • Make sure the LSUN Churches dataset is formatted as described for the other models

  • Change the config name as described here to config_name = 'stylegan2-church-config-f', referring to the pre-trained network found here

  • Run models.get_pretrained('attribute-predictor') as described in the pre-trained models section of the README

  • Change the relevant attribute labels in the files that turn up during this search

I am just wondering if there are any obvious steps I am missing to get this working; I am very new to the world of GANs and toolsets. Thank you for your time 😊
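In case it helps, here is a minimal sketch of what loading a different config might look like, using the models.get_pretrained helper that appears elsewhere on this page. Whether church weights and boundaries actually ship under this config name is an assumption, and attribute boundaries are dataset-specific, so the FFHQ sliders would not transfer to churches.

import models  # the repo's model zoo helper
import torch

# NOTE: 'stylegan2-church-config-f' is the config name mentioned in this issue;
# whether models.get_pretrained() ships weights under it is not confirmed here.
config = 'stylegan2-church-config-f'
device = 'cuda:0'

generator = models.get_pretrained('generator', config).to(device)
generator.eval()

# Sanity check: sample one image from a random z.
with torch.no_grad():
    z = torch.randn(1, 1, 512, device=device)
    img = generator(styles=z, randomize_noise=False)[0].clamp(-1, 1)

# Boundaries/attribute directions are dataset-specific; new ones would need to
# be extracted for LSUN Churches before sliders can work:
# boundaries = models.get_pretrained('boundary', config)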

low-resolution output images look unnatural

I find that low-resolution output images (below 128×128) look unnatural.
According to Figure 4b, the low-resolution outputs should also look natural, but what I get looks more like a vanilla StyleGAN2 output at that size.
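For reference, a minimal sketch (based on the target_res usage in the FaceEditor code further down this page) of how one might request the 128×128 sub-generator output; whether the result matches Figure 4b presumably depends on using the anycost-ffhq-config-f weights rather than vanilla StyleGAN2 ones.

import torch
import models

device = 'cuda:0'
generator = models.get_pretrained('generator', 'anycost-ffhq-config-f').to(device)
generator.eval()

# Ask the anycost generator for a reduced-resolution output via its
# target_res attribute (the same attribute the demo code below sets).
generator.target_res = 128

with torch.no_grad():
    style = torch.randn(1, 1, 512, device=device)
    img = generator(styles=style, randomize_noise=False)[0].clamp(-1, 1)

print(img.shape)  # expected to be [1, 3, 128, 128]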

Is AnycostGAN effective in shortening the experiment time?

Hi, @tonylins
I have a question.
As I understand it, this work was done with the goal of fast inference on various edge devices.

In addition, this technique seems to make it possible to run quick experiments at low resolution and then, once the results are confirmed, effectively apply knowledge distillation to the high-resolution setting, which otherwise requires a lot of training time.

This is the part I am interested in. I want to run various experiments (e.g., conditional GANs) at 64x64 or 128x128 using Anycost GAN and then apply them at high resolution after the experiments are complete. I am curious whether this will transfer well.

Obviously this can only be confirmed by experimenting, but if there are any additional papers or techniques that could be referenced for this research approach, I would appreciate a recommendation.

I am also curious about your opinion on whether experimenting at low resolution for speed and then applying the technique at high resolution is more effective than training at a single resolution.

Default parameters for project.py do not recreate projected latents in assets/demo/projected_latents

Hello, great work! I am wondering what options you use to calculate the projected latents in assets/demo/projected_latents?
I am trying to recreate them using the default parameters via: python3 tools/project.py 00_ryan.jpg
But the resulting vectors are numerically different and, when viewed in demo.py: (1) the projected image is good but clearly different than the projected image preloaded in the repo and (2) the editing directions don't seem to work very well for this set of latent codes.

Below I've included a screenshot of the behavior I am seeing. Note the differences in his neckline from the demo projection and the lack of any meaningful change in the output image.
[screenshot]
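As a side note, a quick way to quantify how far the re-projected latents are from the ones shipped in assets/demo/projected_latents; the second path is a placeholder for wherever your tools/project.py run saved its output.

import numpy as np

ref = np.load('assets/demo/projected_latents/00_ryan.npy')
new = np.load('projected/00_ryan.npy')  # placeholder: output of your project.py run

# Compare the shipped latents against the re-projected ones numerically.
print('shapes:', ref.shape, new.shape)
print('mean abs diff:', np.abs(ref - new).mean())
print('overall L2 distance:', np.linalg.norm(ref - new))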

How can I edit my photos using your already-trained model in demo.py?

Hello. Thank you for the work you have done.

I am far from a programmer, but I installed your project.
Can you explain in more detail how I can edit my own images using demo.py? I found the input_images and projected_latents paths.
The .npy files stay the same when I replace the images.

What steps should I take to get interactive image editing for my images?

Thanks in advance, I don’t know how to do it.
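A hedged outline of the workflow, pieced together from the APIs that appear elsewhere on this page; the latent path, slider strength, and the number of style vectors to change (12) are illustrative placeholders, not author-confirmed settings. The idea is to project the aligned photo with tools/project.py first, then shift the latent along an attribute boundary and regenerate.

import numpy as np
import torch
import models

device = 'cuda:0'
config = 'anycost-ffhq-config-f'

generator = models.get_pretrained('generator', config).to(device)
generator.eval()
boundaries = models.get_pretrained('boundary', config)

# 1) Project your aligned face photo first, e.g.:
#    python tools/project.py --config anycost-ffhq-config-f my_face.jpg
#    then load the resulting latent (path below is a placeholder).
latent = torch.from_numpy(np.load('my_face.npy')).view(1, -1, 512).to(device)

# 2) Add an editing direction (here: smiling) to the first few style vectors.
direction = boundaries['31_Smiling'].view(1, 1, -1).to(device)
edited = latent.clone()
edited[:, :12] = edited[:, :12] + 1.0 * direction  # 1.0 = slider strength

# 3) Regenerate the image from the edited style code.
with torch.no_grad():
    img = generator(styles=edited, noise=None, randomize_noise=False,
                    input_is_style=True)[0].clamp(-1, 1)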

How to compute MACs and FLOPs

Hi, thanks for your impressive work. I have some questions about MACs/FLOPs.

I compute MACs based on https://github.com/Lyken17/pytorch-OpCounter. However, the computational cost I get is a little different from your results. StyleGAN2 has some custom ops: 'PixelNorm', 'EqualConv2d', 'EqualLinear', 'ModulatedConv2d', 'StyledConv', 'ConvLayer', 'ResBlock', 'ConstantInput', 'ToRGB' (which involve FusedLeakyReLU, fused_leaky_relu, upfirdn2d, NoiseInjection, Blur).

I am not sure if you account for all of these ops.

In your fid.py, I only find:

if hvd.rank() == 0:
    try:
        from torchprofile import profile_macs
        macs = profile_macs(generator, torch.rand(1, 1, 512).to(device))

However, the torchprofile package does not handle the above operations.

In content GAN compression https://github.com/lychenyoko/content-aware-gan-compression/blob/master/Util/Calculators.py, it seems that it regards them as CONV and LINEAR layers. Other operations (e.g., FusedLeakyReLU, fused_leaky_relu, upfirdn2d, NoiseInjection, Blur) are ignored.

Could you please let me know how to compute the MACs so I can reproduce your numbers? Thanks!
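For what it's worth, a hedged sketch extending the fid.py call quoted above to different channel ratios (assuming 0.25 and 0.5 are among the ratios set_uniform_channel_ratio supports); as the question notes, torchprofile only counts the ops it hooks (mainly conv/linear layers), so the custom fused ops would still be excluded from these numbers.

import torch
import models
from models.dynamic_channel import set_uniform_channel_ratio
from torchprofile import profile_macs

device = 'cuda:0'
generator = models.get_pretrained('generator', 'anycost-ffhq-config-f').to(device)
generator.eval()

# Profile the generator MACs at a few channel ratios.
for ratio in (1.0, 0.5, 0.25):
    set_uniform_channel_ratio(generator, ratio)
    macs = profile_macs(generator, torch.rand(1, 1, 512).to(device))
    print(f'channel ratio {ratio}: {macs / 1e9:.2f} GMACs')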

Custom image editing

Question 1: How do I generate the latent code of my own image for custom image editing?
Question 2: When customizing image-editing attributes, can I use all 40 attributes for modification without retraining? I see that demo.py only uses eight attributes.
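A small sketch of how one might enumerate whichever attribute directions ship with the boundary file that demo.py loads (the 40-attribute key list appears in the code further down this page). Applying a key outside the demo's eight should not require retraining the generator, though edit quality per attribute is not guaranteed and it is an assumption that all 40 directions are present in the file.

import models

config = 'anycost-ffhq-config-f'
boundaries = models.get_pretrained('boundary', config)

# List every attribute direction contained in the boundary file.
for name, vec in boundaries.items():
    print(name, tuple(vec.shape))

# Any of these keys can be plugged into the same editing code demo.py uses,
# e.g. '35_Wearing_Hat' instead of '31_Smiling'.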

Generated images are all gray

import torch
import numpy as np
import os, random
from PIL import Image
from tqdm import tqdm
from models.dynamic_channel import set_uniform_channel_ratio, reset_generator
import models
import time
import cv2
config = 'anycost-ffhq-config-f'
device = 'cuda:2'

class Face_Editor():
    def __init__(self):
        self.init_model()

    def init_model(self):
        self.anycost_channel = 1.0
        self.anycost_resolution = 1024
        self.generator = models.get_pretrained('generator', config).to(device)
        self.generator.eval()

    def sample(self):
        torch.manual_seed(1601)
        # latent = torch.randn(1, 1, 512, device=device)
        # mean_style = self.generator.mean_style(10000)
        # self.input_kwargs = {'styles':latent, 'return_rgbs':True, 'truncation':0.5,
        #                      'truncation_style':mean_style, 'randomize_noise':False}
        # style = torch.randn(1, 18, 512, device=device)
        style = np.load('/simple/zlp1/masters/anycost-gan/assets/demo_ori/projected_latents/00_ryan.npy')
        style = torch.from_numpy(style).view(1, -1, 512).to(device)
        self.input_kwargs = {'styles': style,
                            'noise': None, 'randomize_noise': False, 'input_is_style': True}
        image = self.generate_image()
        image = cv2.cvtColor(image, cv2.COLOR_RGB2BGR)
        cv2.imshow('image', image)
        cv2.waitKey(0)

    def generate_image(self):
        def image_to_np(x):
            assert x.shape[0] == 1
            x = x.squeeze(0).permute(1, 2, 0)
            x = (x + 1) * 0.5  # 0-1
            x = (x * 255).cpu().numpy().astype('uint8')
            return x

        with torch.no_grad():
            print(self.input_kwargs)
            out = self.generator(**self.input_kwargs)[0].clamp(-1, 1)
            out = image_to_np(out)
            return out


if __name__ == '__main__':
    FE = Face_Editor()
    FE.sample()

checkpoint for multi-resolution training

Dear anycost-gan team,

Thank you for sharing this great work, I really like it.

Would you mind sharing the intermediate checkpoint from the multi-resolution step? To train Anycost GAN, we need to complete 3 steps:

  1. Training the original StyleGAN2 on FFHQ
  2. Training Anycost GAN: multi-resolution
  3. Training Anycost GAN: adaptive-channel

You provide the checkpoints for the 1st and 3rd steps. Would you mind also sharing the checkpoint for the second step? I understand that I can train it myself, but 8 GPUs for 5 days is really too heavy a resource requirement for us.

Thank you for your help.

Best Wishes,

Alex

I want to embed a 256x256 image and test generating a 256x256 image

Hi, @tonylins
Thank you for your good paper.

In this GitHub repo, only 256x256 images can be encoded. However, it seems that only the 1024x1024 and 512x512 decoders are uploaded.

What I want to test is encoding and decoding a 256x256 image and checking whether the result matches the original image.

Can you send me the 256x256 anycost-ffhq decoder weights?

Need help editing my own uploaded images

It seems like this repo generates faces from random latents and edits those random faces. I want to upload my own images and edit them. How do I do that?

Thanks in advance.

Share the pretrained anycost Discriminator

I'm using your generator

G = models.get_pretrained("generator", 'anycost-ffhq-config-f')

Can you also provide the pretrained discriminator that was used, with the same structure? I see the class Discriminator in anycost_gan.py, but doing the same for D raises a NotImplementedError.
Could you publish it?

wrong image generated when using config stylegan2-ffhq-config-f

import torch
import numpy as np
import os
from PIL import Image
from models.dynamic_channel import set_uniform_channel_ratio, reset_generator
import models


class FaceEditor:
    def __init__(self, config, device, anycost_resolution=1024, n_style_to_change=12):
        # load assets
        self.device = device
        self.anycost_channel = 1.0
        self.anycost_resolution = anycost_resolution
        self.n_style_to_change = n_style_to_change

        # build the generator
        self.generator = models.get_pretrained('generator', config).to(device)
        self.generator.eval()
        set_uniform_channel_ratio(self.generator, 0.5)  # set channel
        self.generator.target_res = anycost_resolution  # set resolution
        # self.generator.target_res = self.anycost_resolution
        self.mean_latent = self.generator.mean_style(10000)

        # select only a subset of the directions to use
        '''
        possible keys:
        ['00_5_o_Clock_Shadow', '01_Arched_Eyebrows', '02_Attractive', '03_Bags_Under_Eyes', '04_Bald', '05_Bangs',
            '06_Big_Lips', '07_Big_Nose', '08_Black_Hair', '09_Blond_Hair', '10_Blurry', '11_Brown_Hair', '12_Bushy_Eyebrows',
            '13_Chubby', '14_Double_Chin', '15_Eyeglasses', '16_Goatee', '17_Gray_Hair', '18_Heavy_Makeup', '19_High_Cheekbones',
            '20_Male', '21_Mouth_Slightly_Open', '22_Mustache', '23_Narrow_Eyes', '24_No_Beard', '25_Oval_Face', '26_Pale_Skin',
            '27_Pointy_Nose', '28_Receding_Hairline', '29_Rosy_Cheeks', '30_Sideburns', '31_Smiling', '32_Straight_Hair',
            '33_Wavy_Hair', '34_Wearing_Earrings', '35_Wearing_Hat', '36_Wearing_Lipstick', '37_Wearing_Necklace',
            '38_Wearing_Necktie', '39_Young']
        '''

        direction_map = {
            'smiling': '31_Smiling',
            'young': '39_Young',
            'wavy hair': '33_Wavy_Hair',
            'gray hair': '17_Gray_Hair',
            'blonde hair': '09_Blond_Hair',
            'eyeglass': '15_Eyeglasses',
            'mustache': '22_Mustache',
        }

        boundaries = models.get_pretrained('boundary', config)
        self.direction_dict = dict()
        for k, v in boundaries.items():
            self.direction_dict[k] = v.view(1, 1, -1)

    def get_latent_code(self, latent_code_path):
        latent_code = torch.from_numpy(np.load(os.path.join(latent_code_path))).view(1, -1, 512)
        return latent_code

    def get_direction_dict(self, attr_weights):
        final_dict = {}
        for key, value in attr_weights.items():
            if value == 0:
                continue
            final_dict[key] = value * self.direction_dict[key]
        return final_dict

    def get_boundary_dict(self):
        return self.direction_dict

    def generate_image(self, save_path, input_kwargs):
        def image_to_np(x):
            assert x.shape[0] == 1
            x = x.squeeze(0).permute(1, 2, 0)
            x = (x + 1) * 0.5  # 0-1
            x = (x * 255).cpu().numpy().astype('uint8')
            return x

        with torch.no_grad():
            out = self.generator(**input_kwargs)[0].clamp(-1, 1)
            out = image_to_np(out)
            out = np.ascontiguousarray(out)
            img_pil = Image.fromarray(out)
            img_pil.save(save_path)

    def edit(self, latent_code_path, attr_sliders, force_full_g=False):
        latent_code = torch.from_numpy(np.load(os.path.join(latent_code_path))).view(1, -1, 512).to(self.device)
        # input kwargs for the generator

        edited_code = latent_code.clone()
        for direction_name in attr_sliders.keys():
            edited_code[:, :self.n_style_to_change] = edited_code[:, :self.n_style_to_change] \
                                                 + attr_sliders[direction_name] * self.direction_dict[
                                                     direction_name].to(self.device)

        edited_code = edited_code.to(self.device)
        if not force_full_g:
            set_uniform_channel_ratio(self.generator, self.anycost_channel)
            self.generator.target_res = self.anycost_resolution
        return latent_code, edited_code

if __name__ == '__main__':
    gan_config = 'stylegan2-ffhq-config-f'
    fe = FaceEditor(config=gan_config, device='cuda:0')

    # placeholder paths: point these at a projected latent (.npy) and an output file
    latent_code_path = 'assets/demo/projected_latents/00_ryan.npy'
    ori_save_path = 'ori.png'

    ori = fe.get_latent_code(latent_code_path).to(fe.device)
    ori_kwargs = {'styles': ori, 'noise': None, 'randomize_noise': False, 'input_is_style': True}

    fe.generate_image(save_path=ori_save_path, input_kwargs=ori_kwargs)

The image generated with config anycost-ffhq-config-f looks fine, but the image generated with config stylegan2-ffhq-config-f is wrong. How can I fix this bug? Thank you!
[screenshot]
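A hedged guess, not an author-confirmed fix: the FaceEditor above unconditionally calls set_uniform_channel_ratio(self.generator, 0.5), which configures the anycost sub-generator; for the plain stylegan2-ffhq-config-f checkpoint one might try undoing that before sampling. This sketch reuses the FaceEditor class from the snippet above and assumes reset_generator takes the generator as its only argument, as its usage alongside set_uniform_channel_ratio suggests.

from models.dynamic_channel import reset_generator

fe = FaceEditor(config='stylegan2-ffhq-config-f', device='cuda:0')

# Hedged guess: undo the 0.5 channel-ratio sub-generator configuration before
# sampling from the full StyleGAN2 checkpoint.
reset_generator(fe.generator)

ori = fe.get_latent_code('assets/demo/projected_latents/00_ryan.npy').to(fe.device)
ori_kwargs = {'styles': ori, 'noise': None, 'randomize_noise': False, 'input_is_style': True}
fe.generate_image(save_path='ori_full_g.png', input_kwargs=ori_kwargs)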

About Adaptive-channel training and Generator-conditioned discriminator.

Dear Mit-han-lab:

Thank you very much for sharing your excellent work.

I have two questions regarding the code: 1) Where can I find the source code for the adaptive-channel training step? Is dynamic_channel.py required for this process, or is it included in the unpublished train.py? 2) I cannot find the G_arch operations in the Generator and DiscriminatorMultiRes classes.

Would you kindly help answer these questions in your spare time?

Best Wishes,

GreenLimeSia

About training the encoder

Hello, and thanks for sharing this work.
There are a few things I do not quite understand and hope you can clarify.

What is the training pipeline for the encoder, generator, and discriminator?
My guess is that the discriminator and generator are trained first, and the finished generator is then used to train the encoder. With this pipeline, the encoder cannot influence the generator.
So would it be possible to train the three models together, letting them influence each other to reach a better optimum?

in extract_edit_directions.py

Hi, I am quite confused; the attribute predictor is not well explained. How do I get an attribute-predictor .pt file for a different dataset?

FID for FFHQ 1024

Dear mit-han-lab,

Thank you for sharing with us this great work, I really like it.

In Table 1, you show that multi-resolution outputs have higher image quality than single-resolution training under config E. Have you tried config F, which is the standard StyleGAN2 model?

According to the FFHQ 1024 leaderboard, StyleGAN2 has an FID of 2.84 while Anycost GAN has an FID of 2.99, which is slightly worse. So I am wondering: if you use config F like standard StyleGAN2, would you get better results than standard StyleGAN2?

Thank you for your help.

Best Wishes,

Alex

Color difference in generated image for stylegan2-ffhq-config-f model

Hi, thanks so much for this awesome library. Congrats on the great work!

I have a question regarding the color of images generated with stylegan2-ffhq-config-f.

Why are they yellowish with less contrast? I am thinking it might be a difference in how the training images for the different generators are normalized; is there a way to reverse the normalization after images are generated with stylegan2-ffhq-config-f?

I tried to play around with the parameters in the image_to_np function, but it did not work.

Please see attached for a sample.

[sample image]

Please let me know what I did wrong here, or whether this is something that could be improved.

Thank you!
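For debugging, here is a small self-contained sketch that checks whether the raw generator output actually spans [-1, 1] per channel before the (x + 1) / 2 rescaling used in image_to_np. An uneven per-channel mean (e.g. a low blue mean) would be consistent with a yellow cast, though this snippet does not by itself explain or fix it.

import torch
import models

device = 'cuda:0'
generator = models.get_pretrained('generator', 'stylegan2-ffhq-config-f').to(device)
generator.eval()

with torch.no_grad():
    style = torch.randn(1, 1, 512, device=device)
    out = generator(styles=style, randomize_noise=False)[0]

# The usual post-processing assumes the output spans [-1, 1] per channel.
print('min/max:', out.min().item(), out.max().item())
print('per-channel mean over RGB:', out.mean(dim=(0, 2, 3)))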

Recommendations on speeding up tools/project.py

With no optional arguments, the projection takes 19 seconds even with a GPU. Besides tweaking --n_iter, is there any other argument I should tweak to speed up the projection process?

usage: project.py [-h] [--config CONFIG] [--encoder] [--optimizer OPTIMIZER]
                  [--n_iter N_ITER] [--optimize_sub_g]
                  [--mse_weight MSE_WEIGHT] [--enc_reg_weight ENC_REG_WEIGHT]
                  FILES [FILES ...]

Image projector to the generator latent spaces

positional arguments:
  FILES                 path to image files to be projected

optional arguments:
  -h, --help            show this help message and exit
  --config CONFIG       models config
  --encoder             use encoder prediction as init
  --optimizer OPTIMIZER
                        optimizer used
  --n_iter N_ITER       optimize iterations
  --optimize_sub_g      also optimize the sub-generators
  --mse_weight MSE_WEIGHT
                        weight of MSE loss
  --enc_reg_weight ENC_REG_WEIGHT
                        weight of encoder regularization loss
