anvoynov / ganlatentdiscovery
The authors' official implementation of Unsupervised Discovery of Interpretable Directions in the GAN Latent Space
Hello,
Thank you so much for making such exciting work publicly available. I was wondering if there is any plan to add compatibility with NVIDIA's new PyTorch StyleGAN2-ADA pipeline (https://github.com/NVlabs/stylegan2-ada-pytorch). I am really struggling to load model weights trained with it.
First off, very cool work!
I've been working with a modified ClusterGAN and am struggling a bit with wrapping it for use with your approach. The generator is a pretty basic 3-layer convolutional model with spectral normalization. The trained generator of the ClusterGAN operates as a conditional GAN, so it takes two parameters: `zn` (the latent) and `zc`, the class label vector (basically a one-hot encoded class vector, though you can give it partial class values for interpolation).
Any guidelines on how I might go about setting it up for exploration with your model would be greatly appreciated! Particularly, how to handle the conditional/class vector. I see that the code indicates conditional generation, but I'm not clear on how `target_classes` and `mixed_classes` are handled.
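One possible way to approach the wrapping question, sketched under assumptions: fix the class vector `zc` and expose only the latent `zn` to the direction-discovery code. The `FixedClassGenerator` adapter and `toy_generator` below are hypothetical and not part of this repository; the only attribute the sketch assumes the training code reads is `dim_z`, which appears in the repo snippet quoted further down this page.

```python
class FixedClassGenerator:
    """Hypothetical adapter: hides the class vector of a conditional generator."""

    def __init__(self, generator, class_vector, dim_z):
        self.generator = generator        # any callable taking (zn, zc)
        self.class_vector = class_vector  # one-hot or partial class weights
        self.dim_z = dim_z                # size of the latent zn

    def __call__(self, zn):
        # Every sample is generated with the same, fixed class vector.
        return self.generator(zn, self.class_vector)


def toy_generator(zn, zc):
    # Stand-in for a trained conditional generator: echoes its inputs.
    return {"latent": list(zn), "classes": list(zc)}


G = FixedClassGenerator(toy_generator, class_vector=[0.0, 1.0, 0.0], dim_z=4)
out = G([0.1, 0.2, 0.3, 0.4])
print(out["classes"])
```

With a real PyTorch generator the class vector would additionally have to be expanded to the batch size of `zn`; that step is left out here to keep the sketch framework-agnostic.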
@anvoynov
I want to try training with StyleGANv2 in W space, but I always run out of memory, even though I have tried different parameters. From your readme, it seems that you successfully ran StyleGANv2 model-f at 1024 resolution.
Hello Anvoynov,
I am wondering which SN-GAN repo you used for training on the MNIST dataset. It does not seem to be mentioned on your page.
Thank you!
Hi,
First, I would like to thank you for sharing your code for your awesome work.
I'm trying to reproduce the results, but I get really weird results, especially for StyleGAN FFHQ. For example:
Based on your human annotation, I would expect the editing directions to be:
red_light: 6
gender: 15
color_intensity: 24
lightening_2: 49
eyes: 50
contrast: 53
luminance: 57
hair-skin_inversion: 58
skin_tone: 69
redness: 70
tan: 71
saturation: 78
smile_(entangled): 96
The steps I've done:
The only change I made is in the args.json file: I renamed the `resolution` attribute to `gan_resolution` to match your code.
Thanks!
Hi,
I am trying to use this library with a StyleGAN2 model.
I have Windows 10, python 3.6, cuda 10.0, visual studio 2017 and torch 1.4.
When I run the command, I get:
`python run_train.py --gan_type StyleGAN2 --gan_weights network-snapshot-002083.pkl --deformator ortho --out rectification_results_dir
C:\Users\dummy\AppData\Local\Programs\Python\Python38\lib\site-packages\setuptools\distutils_patch.py:25: UserWarning: Distutils was imported before Setuptools. This usage is discouraged and may exhibit undesirable behaviors or errors. Please use Setuptools' objects directly or at least import Setuptools first.
warnings.warn(
C:\Users\dummy\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\utils\cpp_extension.py:287: UserWarning: Error checking compiler version for cl: 'utf-8' codec can't decode byte 0xff in position 54: invalid start byte
warnings.warn('Error checking compiler version for {}: {}'.format(compiler, error))
StyleGAN2 load fail: Error building extension 'fused': [1/2] C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin\nvcc -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcompiler /EHsc -Xcompiler /wd4190 -Xcompiler /wd4018 -Xcompiler /wd4275 -Xcompiler /wd4267 -Xcompiler /wd4244 -Xcompiler /wd4251 -Xcompiler /wd4819 -Xcompiler /MD -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -IC:\Users\dummy\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\include -IC:\Users\dummy\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\dummy\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\include\TH -IC:\Users\dummy\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include" -IC:\Users\dummy\AppData\Local\Programs\Python\Python38\Include -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=sm_75 -c I:\GANLatentDiscovery-master\models\StyleGAN2\op\fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
FAILED: fused_bias_act_kernel.cuda.o
C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\bin\nvcc -Xcudafe --diag_suppress=dll_interface_conflict_dllexport_assumed -Xcudafe --diag_suppress=dll_interface_conflict_none_assumed -Xcudafe --diag_suppress=field_without_dll_interface -Xcudafe --diag_suppress=base_class_has_different_dll_interface -Xcompiler /EHsc -Xcompiler /wd4190 -Xcompiler /wd4018 -Xcompiler /wd4275 -Xcompiler /wd4267 -Xcompiler /wd4244 -Xcompiler /wd4251 -Xcompiler /wd4819 -Xcompiler /MD -DTORCH_EXTENSION_NAME=fused -DTORCH_API_INCLUDE_EXTENSION_H -IC:\Users\dummy\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\include -IC:\Users\dummy\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\include\torch\csrc\api\include -IC:\Users\dummy\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\include\TH -IC:\Users\dummy\AppData\Local\Programs\Python\Python38\lib\site-packages\torch\include\THC "-IC:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v10.1\include" -IC:\Users\dummy\AppData\Local\Programs\Python\Python38\Include -D_GLIBCXX_USE_CXX11_ABI=0 -D__CUDA_NO_HALF_OPERATORS__ -D__CUDA_NO_HALF_CONVERSIONS__ -D__CUDA_NO_HALF2_OPERATORS__ --expt-relaxed-constexpr -gencode=arch=compute_75,code=sm_75 -c I:\GANLatentDiscovery-master\models\StyleGAN2\op\fused_bias_act_kernel.cu -o fused_bias_act_kernel.cuda.o
C:/Users/dummy/AppData/Local/Programs/Python/Python38/lib/site-packages/torch/include\c10/util/ThreadLocalDebugInfo.h(12): warning: modifier is ignored on an enum specifier
C:/Users/dummy/AppData/Local/Programs/Python/Python38/lib/site-packages/torch/include\ATen/core/boxing/impl/boxing.h(100): warning: integer conversion resulted in a change of sign
C:/Users/dummy/AppData/Local/Programs/Python/Python38/lib/site-packages/torch/include\ATen/record_function.h(13): warning: modifier is ignored on an enum specifier
C:/Users/dummy/AppData/Local/Programs/Python/Python38/lib/site-packages/torch/include\ATen/core/op_registration/op_whitelist.h(39): warning: integer conversion resulted in a change of sign
C:/Users/dummy/AppData/Local/Programs/Python/Python38/lib/site-packages/torch/include\ATen/core/builtin_function.h(97): warning: statement is unreachable
C:/Users/dummy/AppData/Local/Programs/Python/Python38/lib/site-packages/torch/include\torch/csrc/jit/ir/ir.h(1347): error: member "torch::jit::ProfileOptionalOp::Kind" may not be initialized
1 error detected in the compilation of "C:/Users/dummy/AppData/Local/Temp/tmpxft_00000a20_00000000-10_fused_bias_act_kernel.cpp1.ii".
fused_bias_act_kernel.cu
ninja: build stopped: subcommand failed.
Traceback (most recent call last):
  File "run_train.py", line 106, in <module>
    main()
  File "run_train.py", line 64, in main
    G = load_generator(args.__dict__, weights_path)
  File "I:\GANLatentDiscovery-master\loading.py", line 19, in load_generator
    G = make_style_gan2(args['gan_resolution'], G_weights, args['w_shift'])
  File "I:\GANLatentDiscovery-master\models\gan_load.py", line 109, in make_style_gan2
    G = StyleGAN2Generator(size, 512, 8)
NameError: name 'StyleGAN2Generator' is not defined`
Can you help me?
Hi, thanks for your intriguing work. I was wondering how you can change the K value inside the code so that it looks for fewer directions?
Thanks for sharing your code. I was wondering if you have any idea how to use the code for StyleGAN3?
Hello,
Again, thank you for making such a great tool available. I am writing because I am having some issues with resuming from a model checkpoint. The training runs totally fine initially, but when I resume from a checkpoint (at the 20,000th step), it cannot get past one step without running out of GPU memory. I am running the training on 6 V100s, and here is my command:
`python run_train.py --gan_type StyleGAN2 --gan_resolution 256 --w_shift True --gan_weights /XXX/work/stylegan2-pytorch/500kNFT_tf_stylegan2.pt --deformator ortho --out /XXX/work/Notebooks/GAN/directionality/out/NFT_500k_ortho --multi_gpu 6 --shift_predictor_size 256`
I've read the paper and in theory, I should be able to more or less exchange the generator and configure the latent-space dimensionality and then have the matrix of possible directions trained by the resnet18 reconstructor that you used for proggan/stylegan2.
Do you think it's reasonable to adapt the code base to the stylegan2-ada-pytorch model? And can you possibly give me a hint on the code in the /models directory, in case I want to give it a try myself? I'm not entirely sure about the purpose of those. Happy to help rewrite code though. :)
EDIT: Sorry for duplication, didn't see the closed issue earlier.
I've completed the training on my StyleGAN2 model and would like to explore the results. Could you describe how to do that? How many directions have been found, how to check them explicitly, etc.?
The provided Jupyter notebook gives very few clues about that; it shows only your hand-picked results.
Hello,
I try to run this command:
!python run_train.py \
--gan_type StyleGAN2 \
--gan_weights /content/stylegan2-pytorch/network-snapshot-005000.pt \
--deformator ortho \
--out rectification_results_dir
And I encounter this error:
Traceback (most recent call last):
File "run_train.py", line 103, in <module>
main()
File "run_train.py", line 62, in main
G = load_generator(args.__dict__, weights_path, args.w_shift)
File "/content/GANLatentDiscovery/loading.py", line 19, in load_generator
G = make_style_gan2(args['resolution'], G_weights, shift_in_w)
KeyError: 'resolution'
From the look of it, `load_generator()` is referenced in two places:
`main()`, where the error occurs (GANLatentDiscovery/run_train.py, line 62 in 36704fe);
`load_from_dir()`, where `resolution` is defined as a dictionary key (lines 44 to 47 in 36704fe).
I think `main()` should have a similar block of code where `resolution` is defined. Or `resolution` should appear in lines 20 to 25 in 36704fe, so that the user can set it via the command line.
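Both key names, `resolution` and `gan_resolution`, show up in tracebacks on this page, so the two seem to come from different revisions of the code. As a stopgap, a small shim could make an argument dictionary carry both names before it reaches `load_generator()`. This is only a sketch: `with_resolution_aliases` is a hypothetical helper, not something the repository provides.

```python
def with_resolution_aliases(args):
    """Return a copy of args that has both 'resolution' and 'gan_resolution'.

    Hypothetical helper: whichever key is present is mirrored to the other.
    If neither key is present the dictionary is returned unchanged.
    """
    fixed = dict(args)
    if "resolution" in fixed:
        fixed.setdefault("gan_resolution", fixed["resolution"])
    elif "gan_resolution" in fixed:
        fixed["resolution"] = fixed["gan_resolution"]
    return fixed


old_style = with_resolution_aliases({"gan_type": "StyleGAN2", "resolution": 1024})
new_style = with_resolution_aliases({"gan_type": "StyleGAN2", "gan_resolution": 256})
print(old_style["gan_resolution"], new_style["resolution"])
```

If neither key is set, the value still has to be supplied by the user, for example on the command line as suggested above.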
Hi Andrey!
I need to load the trained generator to my project.
So I cloned your repo, everything goes according to plan, but I'm having trouble using the load_from_dir function (loading.py file).
I'm providing the path of my G_ema.pth file. Then the load_from_dir function concatenates 'args.json' to this path. And then there is an error that says this directory does not exist.
What am I missing? What path do I need to provide when I'm calling the function?
Thanks :)
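Going only by the description above (the function appends `args.json` to the path it receives), the argument is presumably meant to be a directory rather than the weights file itself. A minimal sketch of a pre-check, assuming `args.json` sits in the same directory as the weights file, which may not match your layout; `resolve_run_dir` is a hypothetical helper:

```python
import os
import tempfile


def resolve_run_dir(path):
    """Hypothetical helper: return the directory expected to hold args.json."""
    run_dir = path if os.path.isdir(path) else os.path.dirname(path)
    args_path = os.path.join(run_dir, "args.json")
    if not os.path.isfile(args_path):
        raise FileNotFoundError(f"expected {args_path} to exist")
    return run_dir


# Demonstration with a throwaway directory layout.
with tempfile.TemporaryDirectory() as tmp:
    open(os.path.join(tmp, "args.json"), "w").close()
    weights = os.path.join(tmp, "G_ema.pth")
    open(weights, "w").close()
    # Passing the weights file or its directory resolves to the same place.
    same = resolve_run_dir(weights) == resolve_run_dir(tmp)
print(same)
```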
Hi, could you provide the arguments that you used for the experiments reported in the paper? For instance, a set of `args.json` and `command.sh` files. That would help in reproducing your results.
Thanks!
Where can I get the real datasets, such as MNIST, that you used in your experiments?
I'm running this repository's code in Google Colab, with a Conda environment set up to match requirements.txt, and I'm getting this issue:
python run_train.py --gan_type StyleGAN2 --gan_weights ../network-snapshot-000200.pkl --deformator ortho --out ../Results/
Traceback (most recent call last):
File "run_train.py", line 106, in <module>
main()
File "run_train.py", line 64, in main
G = load_generator(args.__dict__, weights_path)
File "/content/GANLatentDiscovery/loading.py", line 19, in load_generator
G = make_style_gan2(args['gan_resolution'], G_weights, args['w_shift'])
File "/content/GANLatentDiscovery/models/gan_load.py", line 110, in make_style_gan2
G.load_state_dict(torch.load(weights, map_location='cpu')['g_ema'])
File "/usr/local/envs/latent/lib/python3.8/site-packages/torch/serialization.py", line 529, in load
return _legacy_load(opened_file, map_location, pickle_module, **pickle_load_args)
File "/usr/local/envs/latent/lib/python3.8/site-packages/torch/serialization.py", line 692, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
ModuleNotFoundError: No module named 'torch_utils'
What might be the issue?
Hi, nice work! I am studying your subsequent work, Big GANs Are Watching You: Towards Unsupervised Object Segmentation with Off-the-Shelf Generative Models, which is based on the technique proposed in this work. I was attempting to get some interpretable directions for BigBiGAN as described in the paper. What I did was simply load the pretrained BigBiGAN weights provided in this repo: https://github.com/anvoynov/BigGANsAreWatching and then train the model in the same way as for BigGAN. I assumed that BigGAN and BigBiGAN have the same generator structure, so this should have worked. However, it didn't: the reconstructor trained very poorly. I wonder what the correct way to do this is. Thanks very much!
python run_train.py \
--gan_type BigGAN \
--gan_weights /path/to/BigBiGAN/weights \
--deformator ortho \
--out output/BigBiGAN_ortho
is it possible to discover directions in full w space (e.g. [18,512], one [512] latent per layer), instead of single tiled/repeated [512]?
that provided way better results for the task of encoding ("projecting") foreign images into StyleGAN latent space, so should also work here.
For `target_indices` you sample `batch_size` integers from `[0, ..., directions_count-1]` and then you create `z_shift` as a `batch_size x latent_dim` matrix, where for each row, at the column defined by the `target_indices` vector, you assign a (random) shift magnitude value.
However, if `directions_count > latent_dim`, then this would fail, since you would try to assign a shift magnitude to a non-existent column of `z_shift` (it has only `latent_dim` columns). This effectively means that (at least with this version of the code) you cannot discover more directions than the dimensionality of the latent space. This might be reasonable in practice, but the paper states that "The method from (Ramesh et al., 2018) is also limited with the maximal number of discovered directions equal to the latent space dimensionality, while our approach can be applied for a higher number of directions."
Am I missing something here? Thanks for your time.
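The indexing concern can be made concrete with a small sketch. It illustrates the procedure as described in the question using plain Python lists; it is not the repository's code, and `one_hot_shift`, `apply_directions`, the toy sizes, and the magnitude range are all hypothetical. It shows that a one-hot shift row needs at least `directions_count` columns, and that a matrix with `latent_dim` rows and `directions_count` columns can then map such a row into the latent space.

```python
import random


def one_hot_shift(index, magnitude, width):
    """Return a row of zeros with `magnitude` at column `index`."""
    row = [0.0] * width
    row[index] = magnitude  # IndexError when index >= width
    return row


def apply_directions(matrix, shift_row):
    """Multiply a (latent_dim x directions_count) matrix by one shift row."""
    return [sum(a * s for a, s in zip(matrix_row, shift_row))
            for matrix_row in matrix]


latent_dim, directions_count = 3, 5
rng = random.Random(0)
target_index = rng.randrange(directions_count)
magnitude = rng.uniform(-6.0, 6.0)  # arbitrary range for illustration

# Row as wide as directions_count: every sampled index is valid.
shift = one_hot_shift(target_index, magnitude, width=directions_count)

# Toy direction matrix: column c points along latent axis c % latent_dim.
A = [[1.0 if col % latent_dim == row else 0.0
      for col in range(directions_count)]
     for row in range(latent_dim)]
latent_shift = apply_directions(A, shift)

# Row only latent_dim wide: the last direction index has no column.
try:
    one_hot_shift(directions_count - 1, 1.0, width=latent_dim)
    failed = False
except IndexError:
    failed = True
print(len(latent_shift), failed)
```

Whether the failure can occur therefore depends on which width the shift matrix is actually created with in the training code.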
The anime face model used looks like it has interesting results, but is also not the highest quality one. Have you considered using either my BigGAN or StyleGAN 1/2 pretrained anime face models?
May I ask you why you chose to use the following code snippet in order to generate images from shifted latent codes, instead of, for instance, using the standard GAN's generator forward function?
import types
from functools import wraps


def add_forward_with_shift(generator):
    def gen_shifted(self, z, shift, *args, **kwargs):
        return self.forward(z + shift, *args, **kwargs)

    generator.gen_shifted = types.MethodType(gen_shifted, generator)
    generator.dim_shift = generator.dim_z


def gan_with_shift(gan_factory):
    @wraps(gan_factory)
    def wrapper(*args, **kwargs):
        gan = gan_factory(*args, **kwargs)
        add_forward_with_shift(gan)
        return gan

    return wrapper
More specifically, for getting images from latent codes z you apply:
imgs = G(z)
while for getting images from shifted latent codes `z+shift`, you apply:
imgs_shifted = G.gen_shifted(z, shift)
Are there any particular benefits of doing so, instead of getting the "shifted" images as follows?
imgs_shifted = G(z, shift)
Thank you.
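To make the comparison concrete, here is a small self-contained sketch with a toy generator working on plain floats. `ToyGenerator` is hypothetical (real generators are PyTorch modules); the patching function repeats the idea from the snippet quoted above. One thing the sketch suggests is that the patched `gen_shifted` leaves the original `forward` signature untouched, whereas `G(z, shift)` would require `forward` itself to accept a second argument.

```python
import types


def add_forward_with_shift(generator):
    # Same patching idea as in the quoted snippet.
    def gen_shifted(self, z, shift, *args, **kwargs):
        return self.forward(z + shift, *args, **kwargs)

    generator.gen_shifted = types.MethodType(gen_shifted, generator)
    generator.dim_shift = generator.dim_z


class ToyGenerator:
    """Hypothetical stand-in for a GAN generator."""

    dim_z = 1

    def forward(self, z):
        return 2.0 * z

    def __call__(self, z):
        return self.forward(z)


G = ToyGenerator()
add_forward_with_shift(G)

plain = G(1.0)                     # the unshifted call is unchanged
shifted = G.gen_shifted(1.0, 0.5)  # evaluates forward(1.0 + 0.5)
print(plain, shifted)
```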
Hey there. Thanks for publishing this intriguing work. Love the idea!
I'm trying to reproduce your results on BigGAN128, but I think I am missing some details, as the results are weaker than those reported in the paper.
It would be really helpful to have the accuracy scores of the latent shift predictor for each model or, even better, the TensorBoard logs.
Thanks!
Given that the Generator is being treated as a black box, I'm guessing we could probably use this with a conditional VAE as well, by just running it on the VAE's Decoder. Does that seem reasonable, or am I missing something important?
Removed.
`max_dim` to inspect in `inspect_all_directions` is set to `G.dim_shift`; if `directions_count` is set to a lower value (say, 200 instead of 512), there's an "index out of bounds" error.
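A possible guard, sketched under the assumption that the error comes from iterating up to `G.dim_shift` when only `directions_count` directions were trained. `directions_to_inspect` is a hypothetical helper, not a function from the repository:

```python
def directions_to_inspect(directions_count, dim_shift, max_dim=None):
    """Hypothetical helper: direction indices that are safe to visualise."""
    limit = min(directions_count, dim_shift)
    if max_dim is not None:
        limit = min(limit, max_dim)
    return range(limit)


dims = directions_to_inspect(directions_count=200, dim_shift=512)
print(len(dims))
```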
In download.py, in line number 27, extra brackets are added while defining choices.
choices=[list(SOURCES.keys()) + ['all']], default=['all'])
should be
choices=list(SOURCES.keys()) + ['all'], default=['all'])
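The effect of the extra brackets can be reproduced in isolation. In this sketch the `SOURCES` contents are placeholders, and the flag name `--models` and `nargs="+"` are assumptions rather than a copy of download.py:

```python
import argparse

# Placeholder stand-in for the SOURCES mapping in download.py.
SOURCES = {"BigGAN": "placeholder", "StyleGAN2": "placeholder"}


def build_parser(nested):
    names = list(SOURCES.keys()) + ["all"]
    # nested=True reproduces the extra brackets: a list containing one list.
    choices = [names] if nested else names
    parser = argparse.ArgumentParser()
    parser.add_argument("--models", nargs="+", type=str,
                        choices=choices, default=["all"])
    return parser


ok = build_parser(nested=False).parse_args(["--models", "BigGAN"])

try:
    build_parser(nested=True).parse_args(["--models", "BigGAN"])
    rejected = False
except SystemExit:
    # argparse reports an invalid choice and exits.
    rejected = True
print(ok.models, rejected)
```

With the nested list, the only valid "choice" is the inner list itself, so no string passed on the command line can match it.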
In gan_segmentation.py, run_in_background is imported from utils. But, no such function is defined in utils which results in an error.
In train_segmentation.py, line numbers 54 and 55 give an error.
In train_segmentation.py, line number 219, 'val_dirs=[args.val_images_dirs, args.val_masks_dirs]' gives the following error:
attribute error: 'namespace' object has no attribute 'val_images_dirs'
In train_segmentation.py, in line numbers 217, 218 and 219, while calling the function train_segmentation, gen_devices hasn't been provided as an argument.
In train_segmentation.py, line number 69 gives the following error:
AttributeError: 'SegmentationTrainParams' object has no attribute 'test_samples_count'
Hi @anvoynov,
I am interested in knowing if this method can be modified to work on 3D-GANs such as this one: https://github.com/sh4174/3DStyleGAN
Hi @anvoynov, I have a question regarding the learnable (interpretable) directions. More specifically, regarding `LatentDeformator`: in order to learn the directions, you define an input dimension and an output dimension, both of which are set equal to `directions_count`, as shown here, which of course can differ from the dimensionality of the latent space.
However, precisely because the dimensionality of these directions needs to be equal to the dimensionality of the latent space (so that you can add shifts to latent codes), you fix their output dimension by padding with zeros or by discarding the extra dimensions, as shown here.
My question is why you don't just set `out_dim` equal to `shift_dim` in the first place? That is, instead of padding with zeros or cutting out extra dimensions, learn exactly as many dimensions as you really need. In my opinion this is not a minor detail, and I would expect that it could affect training somehow.
Is there a particular reason for doing so?
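For readers following along, the pad-or-truncate behaviour described in the question can be sketched as follows. This uses plain Python lists and a hypothetical `fit_to_shift_dim` helper; it only illustrates the description above and is not the repository's implementation.

```python
def fit_to_shift_dim(vector, shift_dim):
    """Pad with zeros or drop trailing entries so the result has shift_dim items."""
    if len(vector) < shift_dim:
        return list(vector) + [0.0] * (shift_dim - len(vector))
    return list(vector[:shift_dim])


padded = fit_to_shift_dim([1.0, 2.0], shift_dim=4)
cut = fit_to_shift_dim([1.0, 2.0, 3.0, 4.0, 5.0], shift_dim=4)
print(padded, cut)
```

In the padded case the trailing latent coordinates can never be shifted, and in the truncated case the trailing outputs are never used, which is the asymmetry the question is about.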
Sorry, but the instructions are very confusing.
Can you tell me how to look for the latent addresses, and in which folder they are stored?
How do I load and use the latent addresses after saving?
For example, how do I find the latent code that changes the hair color?
I'm working with a conditional GAN and finding this model very useful for finding within-class directions, but I'm curious whether it could be used for across-class, or global, directions? The way I'm using it now is to find intelligible variations within each of my classes, but it would also be very useful to have an intuitive way to "morph" between classes.
Any thoughts appreciated.
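Independent of direction discovery, one simple baseline for morphing between classes is to interpolate the class vector itself, assuming the generator accepts partial class values (as one poster above notes for their ClusterGAN). A minimal sketch with a hypothetical `interpolate_classes` helper:

```python
def interpolate_classes(source, target, steps):
    """Linearly blend two class vectors; returns steps + 1 vectors."""
    if steps < 1:
        raise ValueError("steps must be at least 1")
    path = []
    for i in range(steps + 1):
        t = i / steps
        path.append([(1.0 - t) * s + t * g for s, g in zip(source, target)])
    return path


path = interpolate_classes([1.0, 0.0, 0.0], [0.0, 0.0, 1.0], steps=4)
print(path[2])
```

Each vector in `path` would be fed to the generator together with a fixed latent code; whether the intermediate images look meaningful depends on how the conditional GAN was trained.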
Dear everyone,
Thank you very much for this repository.
I am not able to run your code for StyleGAN2 models. The issue is that the `fused` extension cannot be loaded.
I am using :
Thank you very much.
This work is great, your proposed approach using a matrix A and reconstructor network is quite unique. I'm wondering if it's possible to create a video such as the video created for GANSpace; GANSpace is a concurrent work for unsupervised latent space exploration, but they use PCA instead. I'm curious to know how these two concurrent works compare.