
anvoynov / ganlatentdiscovery


The authors' official implementation of "Unsupervised Discovery of Interpretable Directions in the GAN Latent Space".

Python 91.23% Jupyter Notebook 1.76% C++ 0.92% Cuda 6.09%

ganlatentdiscovery's People

Contributors: anvoynov, dependabot[bot]

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

ganlatentdiscovery's Issues

Any tips for wrapping my trained generator?

First off, very cool work!

I've been working with a modified ClusterGAN and am struggling a bit with wrapping it for use with your approach. The generator is a pretty basic 3-layer convolutional model with spectral normalization. The trained ClusterGAN generator operates as a conditional GAN, so it takes two parameters: zn (the latent) and zc, the class label vector (basically a one-hot encoded class vector, though you can give it partial class values for interpolation).

Any guidelines on how I might go about setting it up for exploration with your model would be greatly appreciated! In particular, how should the conditional/class vector be handled? I see that the code supports conditional generation, but I'm not clear on how target_classes and mixed_classes are handled.
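For reference, here's the kind of wrapper I've been sketching (all names are mine, not from this repo): it fixes zc and exposes the forward/gen_shifted pair over the continuous latent zn only, mirroring the add_forward_with_shift helper quoted in another issue below.

import torch

# Hypothetical wrapper (my own naming, not this repo's): search for
# directions only in the continuous latent zn, keeping zc fixed.
class WrappedClusterGAN(torch.nn.Module):
    def __init__(self, generator, zn_dim, zc):
        super().__init__()
        self.generator = generator
        self.dim_z = zn_dim       # latent size the trainer samples from
        self.dim_shift = zn_dim   # shifts live in the same zn space
        self.register_buffer('zc', zc)  # fixed class vector, shape [1, num_classes]

    def forward(self, zn):
        return self.generator(zn, self.zc.expand(zn.shape[0], -1))

    def gen_shifted(self, zn, shift):
        return self.forward(zn + shift)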

Pre-trained model for SN-GAN

Hello Anvoynov,

I am wondering which SN-GAN repo you used for training on the MNIST dataset. It looks like you did not mention it on your page.

Thank you!

Can't reproduce results

Hi,
First, I would like to thank you for sharing your code for your awesome work.
I'm trying to reproduce the results, but I get really strange outputs, especially for StyleGAN FFHQ. For example:
[attached image]

Based on your human annotation, I would expect the editing directions to be:
red_light: 6
gender: 15
color_intensity: 24
lightening_2: 49
eyes: 50
contrast: 53
luminance: 57
hair-skin_inversion: 58
skin_tone: 69
redness: 70
tan: 71
saturation: 78
smile_(entangled): 96

The steps I took:

  1. I ran the download.py script to download the pretrained models.
  2. I ran the evaluation notebook.

The only change I made is in the args.json file: I renamed the resolution attribute to gan_resolution to match your code.

Thanks!

Install problem

Hi,

I am trying to use this library with a StyleGAN2 model.
I have Windows 10, Python 3.6, CUDA 10.0, Visual Studio 2017, and torch 1.4.

When I run the command, I get :

python run_train.py --gan_type StyleGAN2 --gan_weights network-snapshot-002083.pkl --deformator ortho --out rectification_results_dir
C:\Users\dummy\AppData\Local\Programs\Python\Python38\lib\site-packages\setuptools\distutils_patch.py:25: UserWarning: Distutils was imported before Setuptools. This usage is discouraged and may exhibit undesirable behaviors or errors. Please use Setuptools' objects directly or at least import Setuptools first.
warnings.warn(
C:\Users\dummy\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\utils\cpp_extension.py:287: UserWarning: Error checking compiler version for cl: 'utf-8' codec can't decode byte 0xff in position 54: invalid start byte
warnings.warn('Error checking compiler version for {}: {}'.format(compiler, error))
StyleGAN2 load fail: Error building extension 'fused': [1/2] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin\nvcc -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcompiler /EHsc -Xcompiler /wd4190 -Xcompiler /wd4018 -Xcompiler /wd4275 -Xcompiler /wd4267 -Xcompiler /wd4244 -Xcompiler /wd4251 -Xcompiler /wd4819 -Xcompiler /MD -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -IC:\Users\dummy\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\include -IC:\Users\dummy\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\dummy\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\include\TH -IC:\Users\dummy\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include" -IC:\Users\dummy\AppData\Local\Programs\Python\Python38\Include -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=sm_75 -c I:\GANLatentDiscovery-master\models\StyleGAN2\op\fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
FAILED: fused_bias_act_kernel.cuda.o
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin\nvcc -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcompiler /EHsc -Xcompiler /wd4190 -Xcompiler /wd4018 -Xcompiler /wd4275 -Xcompiler /wd4267 -Xcompiler /wd4244 -Xcompiler /wd4251 -Xcompiler /wd4819 -Xcompiler /MD -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -IC:\Users\dummy\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\include -IC:\Users\dummy\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\dummy\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\include\TH -IC:\Users\dummy\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include" -IC:\Users\dummy\AppData\Local\Programs\Python\Python38\Include -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=sm_75 -c I:\GANLatentDiscovery-master\models\StyleGAN2\op\fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
C:/Users/dummy/AppData/Local/Programs/Python/Python38/lib/site-packages/torch/include\c10/util/ThreadLocalDebugInfo.h(12): warning: modifier is ignored on an enum specifier

C:/Users/dummy/AppData/Local/Programs/Python/Python38/lib/site-packages/torch/include\ATen/core/boxing/impl/boxing.h(100): warning: integer conversion resulted in a change of sign

C:/Users/dummy/AppData/Local/Programs/Python/Python38/lib/site-packages/torch/include\ATen/record_function.h(13): warning: modifier is ignored on an enum specifier

C:/Users/dummy/AppData/Local/Programs/Python/Python38/lib/site-packages/torch/include\ATen/core/op_registration/op_whitelist.h(39): warning: integer conversion resulted in a change of sign

C:/Users/dummy/AppData/Local/Programs/Python/Python38/lib/site-packages/torch/include\ATen/core/builtin_function.h(97): warning: statement is unreachable

C:/Users/dummy/AppData/Local/Programs/Python/Python38/lib/site-packages/torch/include\torch/csrc/jit/ir/ir.h(1347): error: member "torch::jit::ProfileOptionalOp::Kind" may not be initialized

1 error detected in the compilation of "C:/Users/dummy/AppData/Local/Temp/tmpxft_00000a20_00000000-10_fused_bias_act_kernel.cpp1.ii".
fused_bias_act_kernel.cu
ninja: build stopped: subcommand failed.

Traceback (most recent call last):
  File "run_train.py", line 106, in <module>
    main()
  File "run_train.py", line 64, in main
    G = load_generator(args.__dict__, weights_path)
  File "I:\GANLatentDiscovery-master\loading.py", line 19, in load_generator
    G = make_style_gan2(args['gan_resolution'], G_weights, args['w_shift'])
  File "I:\GANLatentDiscovery-master\models\gan_load.py", line 109, in make_style_gan2
    G = StyleGAN2Generator(size, 512, 8)
NameError: name 'StyleGAN2Generator' is not defined

Can you help me?

K Value

Hi, thanks for your intriguing work. I was wondering how to change the K value inside the code so that it looks for fewer directions.
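In case it helps others: judging by flags like --shift_predictor_size appearing in other issues on this page, trainer parameters seem to be exposed as command-line flags, so something like the following might work (the --directions_count name is a guess; check run_train.py for the actual parameter):

python run_train.py --gan_type BigGAN --gan_weights weights.pth --deformator ortho --directions_count 64 --out out_dir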

Use the code for StyleGAN3

Thanks for sharing your code. I was wondering if you have any idea how to use the code for StyleGAN3?

cuda out of memory when loading from checkpoint

Hello,

Again, thank you for making such a great tool available. I am writing because I am having some issues with loading from a model checkpoint. Training runs totally fine initially, but when I resume from a checkpoint (at the 20,000th step), it cannot get past one step without running out of GPU memory. I am running the training on six V100s; here is my command:
`python run_train.py --gan_type StyleGAN2 --gan_resolution 256 --w_shift True --gan_weights /XXX/work/stylegan2-pytorch/500kNFT_tf_stylegan2.pt --deformator ortho --out /XXX/work/Notebooks/GAN/directionality/out/NFT_500k_ortho --multi_gpu 6 --shift_predictor_size 256`

How difficult is it to adapt this to stylegan2-ada-pytorch

I've read the paper and, in theory, I should be able to more or less swap in the generator, configure the latent-space dimensionality, and then have the matrix of possible directions trained by the ResNet-18 reconstructor that you used for ProgGAN/StyleGAN2.

Do you think it's reasonable to adapt the code base to the stylegan2-ada-pytorch model? And could you possibly give me a hint about the code in the /models directory, in case I want to give it a try myself? I'm not entirely sure about its purpose. Happy to help rewrite code, though. :)

EDIT: Sorry for duplication, didn't see the closed issue earlier.
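For anyone attempting this, a rough sketch under assumptions: G is an already loaded stylegan2-ada-pytorch G_ema module (z_dim and c_dim are real attributes of those generators, and forward takes a label tensor c), and the wrapper mirrors the add_forward_with_shift helper quoted in another issue below.

import types
import torch

# assumes a loaded stylegan2-ada-pytorch G_ema network `G`; its pickles
# need that repo's dnnlib and torch_utils importable to unpickle
def gen_shifted(self, z, shift, **kwargs):
    c = torch.zeros(z.shape[0], self.c_dim, device=z.device)  # empty label if unconditional
    return self(z + shift, c, **kwargs)

G.gen_shifted = types.MethodType(gen_shifted, G)
G.dim_z = G.z_dim
G.dim_shift = G.z_dim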

How to check the results after training?

I've completed training on my StyleGAN2 model and would like to explore the results. Could you describe how to do that? How many directions were found, how can I check them explicitly, etc.?

The provided Jupyter notebook gives very few clues about that; it only shows your hand-picked results.
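Here is a hedged sketch of what I pieced together from the notebook and loading.py (the exact signature of load_from_dir and attribute names like input_dim are my reading of the code, so double-check): load_from_dir restores the deformator, whose inputs index the discovered directions, and images can be compared before and after a shift.

import torch
from loading import load_from_dir

# restore the learned directions (deformator), generator and shift predictor
deformator, G, shift_predictor = load_from_dir(
    'rectification_results_dir/',           # the --out directory from training
    G_weights='path/to/generator_weights')

z = torch.randn(1, G.dim_z).cuda()
k = 0                                        # index of a direction to inspect
one_hot = torch.zeros(1, deformator.input_dim).cuda()  # assumed attribute name
one_hot[0, k] = 8.0                          # shift magnitude along direction k
img = G(z)
img_shifted = G.gen_shifted(z, deformator(one_hot))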

KeyError: 'resolution'

Hello,

I try to run this command:

!python run_train.py \
    --gan_type StyleGAN2 \
    --gan_weights /content/stylegan2-pytorch/network-snapshot-005000.pt \
    --deformator ortho \
    --out rectification_results_dir

And I encounter this error:

Traceback (most recent call last):
  File "run_train.py", line 103, in <module>
    main()
  File "run_train.py", line 62, in main
    G = load_generator(args.__dict__, weights_path, args.w_shift)
  File "/content/GANLatentDiscovery/loading.py", line 19, in load_generator
    G = make_style_gan2(args['resolution'], G_weights, shift_in_w)
KeyError: 'resolution'

From the look of it, load_generator() is referenced in two places:

  • in main(), where the error occurs:

G = load_generator(args.__dict__, weights_path, args.w_shift)

  • in load_from_dir(), where resolution is defined as a dictionary key:

if 'resolution' not in args.keys():
    args['resolution'] = 128
G = load_generator(args, G_weights, shift_in_w)

I think main() should have a similar block of code where resolution is defined. Or resolution should appear in:

class Params(object):
    def __init__(self, **kwargs):
        self.shift_scale = 6.0
        self.min_shift = 0.5
        self.shift_distribution = ShiftDistribution.UNIFORM

so that the user can set it via the command line.
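A minimal sketch of that suggestion (the flag name and default are assumptions on my part; 128 matches the fallback in load_from_dir):

# in run_train.py, alongside the existing arguments
parser.add_argument('--resolution', type=int, default=128)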

Using the load_from_dir function

Hi Andrey!
I need to load the trained generator into my project, so I cloned your repo. Everything goes according to plan, but I'm having trouble using the load_from_dir function (in loading.py).
I'm providing the path to my G_ema.pth file; load_from_dir then appends 'args.json' to this path, and I get an error saying that this directory does not exist.
What am I missing? What path do I need to provide when calling the function?

Thanks :)

Arguments used for experiments reported in the paper

Hi, could you provide the arguments you used for the experiments reported in the paper, for instance as a set of args.json and command.sh files? That would help in reproducing your results.

Thanks!

No module named 'torch_utils'

I'm running this repository's code in Google Colab, with a Conda environment set up to match requirements.txt, and I get this issue:

python run_train.py --gan_type StyleGAN2 --gan_weights ../network-snapshot-000200.pkl --deformator ortho --out ../Results/
Traceback (most recent call last):
  File "run_train.py", line 106, in <module>
    main()
  File "run_train.py", line 64, in main
    G = load_generator(args.__dict__, weights_path)
  File "/content/GANLatentDiscovery/loading.py", line 19, in load_generator
    G = make_style_gan2(args['gan_resolution'], G_weights, args['w_shift'])
  File "/content/GANLatentDiscovery/models/gan_load.py", line 110, in make_style_gan2
    G.load_state_dict(torch.load(weights, map_location='cpu')['g_ema'])
  File "/usr/local/envs/latent/lib/python3.8/site-packages/torch/serialization.py", line 529, in load
    return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
  File "/usr/local/envs/latent/lib/python3.8/site-packages/torch/serialization.py", line 692, in _legacy_load
    magic_number = pickle_module.load(f, **pickle_load_args)
ModuleNotFoundError: No module named 'torch_utils'

What might be the issue?
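My current guess (hedged, since it depends on how the snapshot was produced): the .pkl comes from NVIDIA's stylegan2-ada-pytorch, whose pickled networks reference that repo's torch_utils package, so unpickling fails anywhere that package isn't importable. A quick check:

import sys
# make a stylegan2-ada-pytorch checkout importable before unpickling
sys.path.append('/path/to/stylegan2-ada-pytorch')

import torch
ckpt = torch.load('../network-snapshot-000200.pkl', map_location='cpu')
print(ckpt.keys())
# even if this loads, make_style_gan2 expects a rosinality-style checkpoint
# with a 'g_ema' state dict, which an ada-pytorch snapshot does not have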

How to train it for off-the-shelf BigBiGAN model?

Hi, nice work! I am studying your subsequent work, Big GANs Are Watching You: Towards Unsupervised Object Segmentation with Off-the-Shelf Generative Models, which is based on the technique proposed here. I was attempting to get some interpretable directions for BigBiGAN as described in the paper. What I did was simply load the pretrained BigBiGAN weights provided in this repo: https://github.com/anvoynov/BigGANsAreWatching and then train the model in the same way as for BigGAN. I assumed that BigGAN and BigBiGAN have the same generator structure, so it should have worked. However, it didn't: the reconstructor trained very poorly. I wonder what the correct way to do this is. Thanks very much!

Training command:

python run_train.py \
   --gan_type BigGAN \
   --gan_weights /path/to/BigBiGAN/weights \
   --deformator ortho \
   --out output/BigBiGAN_ortho

Discover directions in full w space

Is it possible to discover directions in the full w space (e.g. [18, 512], one [512] latent per layer) instead of a single tiled/repeated [512]?
That approach gave much better results for the task of encoding ("projecting") foreign images into the StyleGAN latent space, so it should also work here.
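To illustrate, a conceptual sketch of the per-layer variant (mapping and synthesis are hypothetical stand-ins for a rosinality-style generator's submodules, not calls from this repo):

import torch

num_layers, w_dim = 18, 512                 # e.g. 1024px StyleGAN
z = torch.randn(1, w_dim)
w = mapping(z)                              # hypothetical mapping network
w_plus = w.unsqueeze(1).repeat(1, num_layers, 1)   # [1, 18, 512]

# one learned [512] direction per layer instead of a single tiled shift
shift = torch.zeros(1, num_layers, w_dim, requires_grad=True)
img = synthesis(w_plus + shift)             # hypothetical synthesis call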

Can the number of directions be larger than the dimensionality of the latent space?

For target_indices, you sample batch_size integers from [0, ..., directions_count-1], and then you create z_shift as a batch_size × latent_dim matrix where, for each row, you assign a (random) shift magnitude at the column given by the target_indices vector.

However, if directions_count > latent_dim, this fails, since you would try to assign a shift magnitude to a non-existent column of z_shift (it has only latent_dim columns). This effectively means that, at least with this version of the code, you cannot discover more directions than the dimensionality of the latent space. This might be reasonable in practice, but the paper states that "The method from (Ramesh et al., 2018) is also limited with the maximal number of discovered directions equal to the latent space dimensionality, while our approach can be applied for a higher number of directions."
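A minimal repro of the failure mode I mean (shapes illustrative, names paraphrased from the training code, not copied verbatim):

import torch

batch_size, latent_dim, directions_count = 4, 120, 512
target_indices = torch.randint(0, directions_count, (batch_size,))
shifts = 6.0 * (torch.rand(batch_size) - 0.5)   # random magnitudes

z_shift = torch.zeros(batch_size, latent_dim)
for i, (idx, val) in enumerate(zip(target_indices, shifts)):
    z_shift[i][idx] = val   # IndexError as soon as idx >= latent_dim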

Am I missing something here? Thanks for your time.

Alternate anime face models?

The anime face model used looks like it produces interesting results, but it is also not the highest-quality one. Have you considered using either my BigGAN or StyleGAN 1/2 pretrained anime face models?

Why use a wrapper function for generating "shifted" images?

May I ask why you chose to use the following code snippet to generate images from shifted latent codes, instead of, for instance, using the standard GAN generator's forward function?

import types
from functools import wraps


def add_forward_with_shift(generator):
    def gen_shifted(self, z, shift, *args, **kwargs):
        # apply the shift in latent space, then run the usual forward pass
        return self.forward(z + shift, *args, **kwargs)

    generator.gen_shifted = types.MethodType(gen_shifted, generator)
    generator.dim_shift = generator.dim_z


def gan_with_shift(gan_factory):
    @wraps(gan_factory)
    def wrapper(*args, **kwargs):
        # build the generator as usual, then attach gen_shifted to it
        gan = gan_factory(*args, **kwargs)
        add_forward_with_shift(gan)
        return gan

    return wrapper

More specifically, for getting images from latent codes z you apply:

imgs = G(z)

while for getting images from shifted latent codes z+shift, you apply:

imgs_shifted = G.gen_shifted(z, shift)

Are there any particular benefits to doing so, instead of getting the "shifted" images as follows?

imgs_shifted = G(z, shift)

Thank you.

Accuracy scores / tensorboard logs

Hey there. Thanks for publishing this intriguing work. Love the idea!

I'm trying to reproduce your results on BigGAN128, but I think I am missing some details, as my results are weaker than those reported in the paper.

It would be really helpful to have the accuracy scores of the latent shift predictor for each model, or, even better, the TensorBoard logs.

Thanks!

Run on a cVAE's Decoder?

Given that the generator is treated as a black box, I'm guessing we could probably use this with a conditional VAE as well, by just running it on the VAE's decoder. Does that seem reasonable, or am I missing something important?
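For concreteness, a hedged sketch of what I mean, reusing the same trick as the add_forward_with_shift helper quoted in another issue above (decoder and latent_dim are my placeholder names):

import types

def wrap_decoder(decoder, latent_dim):
    def gen_shifted(self, z, shift, *args, **kwargs):
        # shift the latent, then decode as usual
        return self.forward(z + shift, *args, **kwargs)

    decoder.dim_z = latent_dim
    decoder.dim_shift = latent_dim
    decoder.gen_shifted = types.MethodType(gen_shifted, decoder)
    return decoder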

Some errors in the code

  1. In download.py, on line 27, extra brackets are added when defining choices:
    choices=[list(SOURCES.keys()) + ['all']], default=['all'])
    should be
    choices=list(SOURCES.keys()) + ['all'], default=['all'])

  2. In gan_segmentation.py, run_in_background is imported from utils, but no such function is defined in utils, which results in an error.

  3. In train_segmentation.py, lines 54 and 55 give an error.

  4. In train_segmentation.py, on line 219, 'val_dirs=[args.val_images_dirs, args.val_masks_dirs]' gives the following error:
    AttributeError: 'Namespace' object has no attribute 'val_images_dirs'

  5. In train_segmentation.py, on lines 217, 218 and 219, when calling the function train_segmentation, gen_devices isn't provided as an argument.

  6. In train_segmentation.py, line 69 gives the following error:
    AttributeError: 'SegmentationTrainParams' object has no attribute 'test_samples_count'

Learnable directions question

Hi @anvoynov, I have a question regarding the learnable (interpretable) directions. More specifically, regarding LatentDeformator: in order to learn the directions, you define an input dimension and an output dimension, both of which are set equal to directions_count as shown here, which of course can differ from the dimensionality of the latent space.

However, precisely because the dimensionality of these directions needs to equal the dimensionality of the latent space (so that you can add shifts to latent codes), you fix their output dimension by padding with zeros or by discarding the extra dimensions, as shown here and sketched below.
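In code, the step I'm referring to looks roughly like this (my paraphrase, not the repository's exact implementation):

import torch

def fit_to_shift_dim(direction: torch.Tensor, shift_dim: int) -> torch.Tensor:
    out_dim = direction.shape[-1]
    if out_dim < shift_dim:
        pad = direction.new_zeros(*direction.shape[:-1], shift_dim - out_dim)
        return torch.cat([direction, pad], dim=-1)  # pad with zeros
    return direction[..., :shift_dim]               # discard extra dimensions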

My question is: why don't you just set out_dim equal to shift_dim in the first place? That is, instead of padding with zeros or cutting out extra dimensions, learn exactly as many dimensions as you really need. In my opinion this is not a mere detail, and I would expect that it could affect training somehow.

Is there a particular reason for doing so?

Very confusing instructions

Sorry, but the instructions are very confusing.
Can you tell me how to search for the latent directions, and in which folder they are stored?
How do I load and use the latent directions after saving them?
For example, say I want to find the direction that changes the hair color.

Across-class or "global" directions from conditional GAN?

I'm working with a conditional GAN and finding this model very useful for discovering within-class directions, but I'm curious whether it could also be used for across-class, or global, directions. The way I'm using it now is to find intelligible variations within each of my classes, but it would also be very useful to have an intuitive way to "morph" between classes.

Any thoughts appreciated.

StyleGAN2 load fail: No module named 'fused'

Dear everyone,
Thank you very much for this repository.

I am not able to run your code for StyleGAN2 models. The issue is that fused cannot be loaded.
I am using:

  • Windows
  • CUDA 10.1
  • Visual Studio 2015

Do you have any idea how I can fix this problem?

Thank you very much.

Awesome work! Suggest adding a video

This work is great; your proposed approach using a matrix A and a reconstructor network is quite distinctive. I'm wondering if it's possible to create a video like the one made for GANSpace. GANSpace is concurrent work on unsupervised latent space exploration, but it uses PCA instead. I'm curious to know how these two concurrent works compare.
