tohinz / ConSinGAN
PyTorch implementation of "Improved Techniques for Training Single-Image GANs" (WACV-21)
License: MIT License
I have some single-channel grayscale images to generate, so I changed the parameter:
parser.add_argument('--nc_im', type=int, help='image # channels', default=1)
However, an error was encountered:
The behavior of rgb2gray will change in scikit-image 0.19. Currently, rgb2gray allows 2D grayscale image to be passed as inputs and leaves them unmodified as outputs. Starting from version 0.19, 2D arrays will be treated as 1D images with 3 channels.
x = color.rgb2gray(x)
Traceback (most recent call last):
File "main_train.py", line 113, in <module>
train(opt)
File "/root/ConSinGAN-master/ConSinGAN/training_generation.py", line 23, in train
real = functions.adjust_scales2image(real, opt)
File "/root/ConSinGAN-master/ConSinGAN/functions.py", line 185, in adjust_scales2image
real = imresize(real_, opt.scale1, opt)
File "/root/ConSinGAN-master/ConSinGAN/imresize.py", line 52, in imresize
im = np2torch(im,opt)
File "/root/ConSinGAN-master/ConSinGAN/imresize.py", line 26, in np2torch
x = color.rgb2gray(x)
File "/usr/local/lib/python3.6/dist-packages/skimage/color/colorconv.py", line 809, in rgb2gray
rgb = _prepare_colorarray(rgb)
File "/usr/local/lib/python3.6/dist-packages/skimage/color/colorconv.py", line 150, in _prepare_colorarray
raise ValueError("Input array must have a shape == (..., 3)), "
ValueError: Input array must have a shape == (..., 3)), got (250, 250, 1)
How should I deal with it? Thanks!
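The traceback above shows `color.rgb2gray` rejecting an array of shape (250, 250, 1), since it expects a trailing axis of size 3. One possible workaround is to convert to grayscale only when the input really has colour channels. The sketch below is a hypothetical helper (`to_grayscale` is not part of the repository) written with NumPy only; it uses the luminance weights documented for `skimage.color.rgb2gray` and, unlike that function, does not rescale integer images to [0, 1].

```python
import numpy as np

# Luminance weights as documented for skimage.color.rgb2gray.
_LUMA = np.array([0.2125, 0.7154, 0.0721])


def to_grayscale(x):
    """Return a 2D (H, W) array for inputs shaped (H, W), (H, W, 1) or (H, W, 3+).

    Hypothetical helper: it only illustrates the shape handling and does
    not rescale integer images the way skimage does.
    """
    x = np.asarray(x, dtype=np.float64)
    if x.ndim == 2:
        # Already grayscale, nothing to convert.
        return x
    if x.ndim == 3 and x.shape[-1] == 1:
        # Single channel stored with an explicit channel axis: drop it.
        return x[..., 0]
    if x.ndim == 3 and x.shape[-1] >= 3:
        # RGB or RGBA: weight the first three channels.
        return x[..., :3] @ _LUMA
    raise ValueError(f"unsupported image shape: {x.shape}")
```

A call such as `to_grayscale(np.zeros((250, 250, 1)))` then yields a (250, 250) array instead of raising.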
Hi, I tried the command in the README file:
python main_train.py --gpu 0 --train_mode harmonization --train_stages 3 --min_size 120 --lrelu_alpha 0.3 --niter 1000 --batch_norm --input_name Images/Harmonization/scream.jpg
and it popped up an error:
Traceback (most recent call last):
File "main_train.py", line 105, in <module>
copyfile(py_file, osp.join(dir2save, py_file.split("/")[-1]))
File "\Python\lib\shutil.py", line 104, in copyfile
raise SameFileError("{!r} and {!r} are the same file".format(src, dst))
shutil.SameFileError: '\\ConSinGAN-master\\ConSinGAN-master\\evaluate_model.py' and '\\ConSinGAN-master\\ConSinGAN-master\\evaluate_model.py' are the same file
Please send help, thank you
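A likely explanation, judging from the paths in the traceback, is that `py_file.split("/")[-1]` returns the whole path on Windows (where the separator is a backslash), so the destination resolves to the source file itself. A portable sketch, assuming the goal is simply to copy a script into an output folder, is to take the file name with `os.path.basename` and skip the copy when source and destination are the same file. `copy_into` is a hypothetical name, not a function from the repository.

```python
import os
import shutil


def copy_into(src, dst_dir):
    """Copy src into dst_dir and return the destination path.

    Hypothetical helper: os.path.basename splits on the separator of the
    current platform, so it works with both "/" and "\\" paths.
    """
    os.makedirs(dst_dir, exist_ok=True)
    dst = os.path.join(dst_dir, os.path.basename(src))
    # Avoid shutil.SameFileError when the file is already in place.
    if os.path.exists(dst) and os.path.samefile(src, dst):
        return dst
    shutil.copyfile(src, dst)
    return dst
```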
Hi, when I try to execute the following command:
python main_train.py --gpu 0 --train_mode harmonization --train_stages 3 --min_size 120 --lrelu_alpha 0.3 --niter 1000 --batch_norm --input_name Images/oddConSinGan/good/2.png --naive_img Images/oddConSinGan/hor/8.png
I get the following error:
Training model (TrainedModels/2/2020_07_26_16_02_44_harmonization_train_depth_3_lr_scale_0.1_BN_act_lrelu_0.3)
Training model with the following parameters:
number of stages: 3
number of concurrently trained stages: 3
learning rate scaling: 0.1
non-linearity: lrelu
Traceback (most recent call last):
File "main_train.py", line 113, in <module>
train(opt)
File "/home/me/ConSinGAN/ConSinGAN/training_harmonization_editing.py", line 22, in train
real = functions.read_image(opt)
File "/home/me/ConSinGAN/ConSinGAN/functions.py", line 130, in read_image
x = np2torch(x,opt)
File "/home/me/ConSinGAN/ConSinGAN/functions.py", line 144, in np2torch
x = x[:,:,:,None]
IndexError: too many indices for array
How can I solve this? Thanks.
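The failing line `x = x[:,:,:,None]` indexes three axes, so it raises this IndexError whenever the loaded image is a 2D array, which is what a grayscale PNG typically decodes to. That is only one plausible cause here. A minimal sketch, assuming the training code expects an (H, W, 3) array, normalises the input before it reaches that line; `ensure_three_channels` is a hypothetical helper.

```python
import numpy as np


def ensure_three_channels(x):
    """Return an (H, W, 3) array for grayscale, RGB or RGBA input.

    Hypothetical helper; the (H, W, 3) target layout is an assumption.
    """
    x = np.asarray(x)
    if x.ndim == 2:
        # Grayscale without a channel axis: replicate it three times.
        x = np.stack([x, x, x], axis=-1)
    elif x.ndim == 3 and x.shape[-1] == 1:
        # Grayscale with an explicit channel axis.
        x = np.repeat(x, 3, axis=-1)
    elif x.ndim == 3 and x.shape[-1] == 4:
        # RGBA: drop the alpha channel.
        x = x[..., :3]
    if x.ndim != 3 or x.shape[-1] != 3:
        raise ValueError(f"unsupported image shape: {x.shape}")
    return x
```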
Hi, I was trying to use the animation function that's in the README.md
but I hit a wall.
my cmd input
py main_train.py --gpu 0 --train_mode animation --input_name Images/Animation/scream.jpg
main_train.py: error: argument --train_mode: invalid choice: 'animation' (choose from 'generation', 'retarget', 'harmonization', 'editing')
thank you.
edit: Realized didn't update the code. Sorry.
We would like to ask if you have encountered nan problems when calculating SIFID? My colleague and I cannot get the correct result when running the SIFID program given by SinGAN, the program gives a nan error.
The error message is as follows:
SIFID/sifid_score.py:262: RuntimeWarning: Mean of empty slice.
print('SIFID: ', sifid_values.mean())
../sinGAN/SinGAN-master/mypython/lib/python3.6/site-packages/numpy/core/_methods.py:161: RuntimeWarning: invalid value encountered in true_divide
ret = ret.dtype.type(ret / rcount)
SIFID: nan
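NumPy emits "Mean of empty slice" and returns nan when `.mean()` is called on an empty array, so the output above suggests that no scores were computed at all (for example because no image pairs were found; that cause is an assumption). A small guard makes the situation explicit. `mean_score` is a hypothetical helper, not part of the SIFID script.

```python
import numpy as np


def mean_score(values):
    """Average a list of scores, failing loudly if the list is empty."""
    values = np.asarray(values, dtype=np.float64)
    if values.size == 0:
        # An empty array is what makes numpy return nan with a warning.
        raise ValueError(
            "no scores were computed; check that both folders contain "
            "matching image files"
        )
    return float(values.mean())
```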
Hi, when I try to run this command:
python main_train.py --gpu 1 --train_mode harmonization --train_stages 3 --min_size 120 --lrelu_alpha 0.3 --niter 1000 --batch_norm --input_name Images/Harmonization/scream.jpg
The following error occurred:
Traceback (most recent call last):
File "main_train.py", line 111, in <module>
train(opt)
File "/home/lbd08/jwz/SA-ConSinGAN-master/ConSinGAN/training_harmonization_editing.py", line 75, in train
naive_img, naive_img_large, fixed_noise,
UnboundLocalError: local variable 'naive_img_large' referenced before assignment
How can I solve it? Thank you.
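An UnboundLocalError of this kind generally means the variable is assigned only inside a branch that was not taken. The command above passes no `--naive_img`, so one guess is that `naive_img_large` is set only when a naive image is supplied. The sketch below reproduces the pattern with hypothetical functions (`broken`, `fixed`) and string stand-ins for images; it does not claim to match the repository's code.

```python
def broken(naive_img_path=""):
    """Assigns the variable only inside the branch."""
    if naive_img_path != "":
        naive_img_large = "loaded:" + naive_img_path
    # Raises UnboundLocalError when the branch above was skipped.
    return naive_img_large


def fixed(naive_img_path=""):
    """Initialises the variable first, so it always exists."""
    naive_img_large = None
    if naive_img_path != "":
        naive_img_large = "loaded:" + naive_img_path
    return naive_img_large
```

With the second form the caller can test for `None` and either skip the naive image or report that `--naive_img` is missing.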
Training model (TrainedModels/pantheon/2021_02_22_15_28_21_generation_train_depth_3_lr_scale_0.1_act_lrelu_0.05)
Training model with the following parameters:
number of stages: 6
number of concurrently trained stages: 3
learning rate scaling: 0.1
non-linearity: lrelu
Training on image pyramid: [torch.Size([1, 3, 26, 42]), torch.Size([1, 3, 31, 51]), torch.Size([1, 3, 40, 66]), torch.Size([1, 3, 57, 94]), torch.Size([1, 3, 106, 175]), torch.Size([1, 3, 152, 250])]
stage [0/5]:: 0%| | 0/1000 [00:00<?, ?it/s]
Traceback (most recent call last):
File "main_train.py", line 118, in <module>
train(opt)
File "G:\ConSinGAN\ConSinGAN\training_generation.py", line 48, in train
fixed_noise, noise_amp, generator, d_curr = train_single_scale(d_curr, generator, reals, fixed_noise, noise_amp, opt, scale_num, writer)
File "G:\ConSinGAN\ConSinGAN\training_generation.py", line 156, in train_single_scale
gradient_penalty = functions.calc_gradient_penalty(netD, real, fake, opt.lambda_grad, opt.device)
File "G:\ConSinGAN\ConSinGAN\functions.py", line 122, in calc_gradient_penalty
create_graph=True, retain_graph=True, only_inputs=True)[0]
File "D:\Anaconda3\envs\ConSinGAN\lib\site-packages\torch\autograd\__init__.py", line 149, in grad
inputs, allow_unused)
RuntimeError: cuDNN error: CUDNN_STATUS_EXECUTION_FAILED
All the fake_sample and generated_sample images at every stage turn out black/empty (even though there is no black in my training image).
Training model (TrainedModels/chk1/2021_05_21_21_23_43_generation_train_depth_3_lr_scale_1.0_act_lrelu_0.05)
Training model with the following parameters:
number of stages: 8
number of concurrently trained stages: 3
learning rate scaling: 1.0
non-linearity: lrelu
Training on image pyramid: [torch.Size([1, 3, 26, 38]), torch.Size([1, 3, 29, 43]), torch.Size([1, 3, 34, 50]), torch.Size([1, 3, 40, 61]), torch.Size([1, 3, 51, 77]), torch.Size([1, 3, 72, 108]), torch.Size([1, 3, 127, 191]), torch.Size([1, 3, 166, 250])]
Can you please help resolve this?
Hi! Is there a way to save and reload the training state of the model? For example, I use Google Colab for faster training, but roughly 30% of the time the process is interrupted and I have to start from the beginning again.
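A generic way to make PyTorch training resumable is to write the model and optimizer state to disk periodically with `torch.save` and restore it with `torch.load` on start-up. The sketch below is not taken from the repository: `save_checkpoint`, `load_checkpoint` and the dictionary keys are hypothetical, and because ConSinGAN adds generator stages during training, the model would presumably have to be rebuilt to the saved stage before its weights can be loaded.

```python
import os

import torch


def save_checkpoint(path, model, optimizer, stage, iteration):
    """Write training state to disk, replacing any previous checkpoint."""
    state = {
        "model": model.state_dict(),
        "optimizer": optimizer.state_dict(),
        "stage": stage,
        "iteration": iteration,
    }
    # Write to a temporary file first so an interruption during the save
    # cannot leave a truncated checkpoint behind.
    tmp_path = path + ".tmp"
    torch.save(state, tmp_path)
    os.replace(tmp_path, path)


def load_checkpoint(path, model, optimizer):
    """Restore state if a checkpoint exists; return (stage, iteration)."""
    if not os.path.exists(path):
        return 0, 0
    state = torch.load(path, map_location="cpu")
    model.load_state_dict(state["model"])
    optimizer.load_state_dict(state["optimizer"])
    return state["stage"], state["iteration"]
```

On Colab the checkpoint path would typically point at persistent storage, so that the file survives a disconnected session.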
I have performed image generation using SinGAN with its default parameters. The generated images look better than those from ConSinGAN, even though I tried lr_scale = 0.1/0.5/1.0 and train_stages = 5/8/10 in ConSinGAN; I could not generate better samples. Can you please help me understand which settings I should tweak (in priority order, as I lack a good GPU)?
I prefer to work with ConSinGAN due to its speed & memory.
Training Image: circuit board.
-Thanks
Hi, I am running the image harmonization part of the model with --train_stages 6, --max_size 350 and --lr_scale 0.5 to increase the quality of the images.
However, once I get to the second stage of training, it crashes because it runs out of CUDA memory. I changed the torch device so that the model accepts more than one GPU (say GPUs 0 and 1) and wrapped the model in DataParallel so that it can run on multiple GPUs in parallel. However, it still runs on only one GPU.
Do you have any suggestions to fix this issue?
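One thing worth checking: `nn.DataParallel` parallelises by splitting the input batch along its first dimension, and the image pyramids logged elsewhere on this page have a batch size of 1 (`torch.Size([1, 3, ...])`). With a single sample there is nothing to split, so only one device receives work. The pure-Python sketch below is a simplified model of that chunking, not PyTorch's actual implementation; `split_batch` is a hypothetical name.

```python
import math


def split_batch(batch_size, num_devices):
    """Return the chunk sizes obtained when a batch is split across devices.

    Simplified model: chunks have size ceil(batch_size / num_devices), and
    no empty chunks are produced.
    """
    chunk = math.ceil(batch_size / num_devices)
    sizes = []
    remaining = batch_size
    while remaining > 0:
        take = min(chunk, remaining)
        sizes.append(take)
        remaining -= take
    return sizes


print(split_batch(8, 2))  # two chunks, one per device
print(split_batch(1, 2))  # a single chunk, so the second device stays idle
```

Under this model, reducing memory use (for example a smaller --max_size or fewer concurrently trained stages) may help more than adding GPUs.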
This project is very impressive work. I am wondering how to apply this model to super-resolution (SR) and animation as SinGAN does. The SinGAN SR process needs to up-sample several times, but here the generator stages cannot be run separately.
Hi, I'm interested in the actual implementation of the learning rate scaling you mention in the paper, but I can't find where it is in the code. I did find a method get_scale_factor(opt) in main_train.py, but I can't find any call to it. Could you point it out?
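As I understand the paper's description, learning rate scaling gives the most recently added stage the full learning rate and multiplies the rate by a factor below 1 for each stage further down. The sketch below only illustrates that arithmetic; `stage_learning_rates` and its arguments are hypothetical names, and the base rate in the usage line is a placeholder rather than a confirmed default.

```python
def stage_learning_rates(base_lr, lr_scale, num_trained_stages):
    """Return one learning rate per trained stage, lowest stage first.

    The last (newest) stage gets base_lr; each earlier stage is scaled
    by an additional factor of lr_scale.
    """
    return [
        base_lr * lr_scale ** (num_trained_stages - 1 - idx)
        for idx in range(num_trained_stages)
    ]


# Three concurrently trained stages with a scaling factor of 0.1.
print(stage_learning_rates(0.0005, 0.1, 3))
```

In PyTorch such a list would usually be applied through per-parameter-group `lr` options of the optimizer, one group per stage.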
Hi! I'm curious about your decision to opt for the L2 (MSE) loss for reconstruction. I've come across discussions suggesting that the L1 (MAE) loss might be a better choice, as it tends to encourage less blurring in the generated images. Interestingly, I've observed some blurriness in the results obtained with the current L2 loss settings.
With this in mind, I'm considering experimenting with changing L2 to L1 for the Reconstruction loss. Do you think this change would primarily affect the section highlighted in this screenshot:
Or would it also extend to this area?
Additionally, I would greatly appreciate it if you could provide some insight into the significance of noise_amp in this context.
Thank you for your assistance!
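For reference, the difference between the two reconstruction losses can be seen on a toy example. The sketch below uses NumPy with hypothetical helper names; in PyTorch the corresponding modules are `nn.MSELoss` and `nn.L1Loss`. It makes no claim about which lines of the repository would need to change.

```python
import numpy as np


def l2_loss(pred, target):
    """Mean squared error: small residuals contribute very little."""
    return float(np.mean((pred - target) ** 2))


def l1_loss(pred, target):
    """Mean absolute error: residuals are penalised linearly."""
    return float(np.mean(np.abs(pred - target)))


target = np.zeros(4)
slightly_off = np.full(4, 0.1)  # uniform small error, e.g. mild blur
print(l2_loss(slightly_off, target), l1_loss(slightly_off, target))
```

Because squaring shrinks residuals below 1, L2 tolerates many small errors more readily than L1 does, which is the usual argument for L1 producing sharper reconstructions.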
Hi, I find that if I change the resolution of the generated image, the results get worse. Should I increase the number of concurrently trained stages?
Hi, is there a way to set the output resolution size to be same as the input resolution size?
I try to run the code in README:
python main_train.py --gpu 0 --train_mode harmonization --train_stages 3 --min_size 120 --lrelu_alpha 0.3 --niter 1000 --batch_norm --input_name Images/Harmonization/scream.jpg
It shows the message:
Traceback (most recent call last):
File "main_train.py", line 111, in <module>
train(opt)
File "/root/code/ConSinGAN/ConSinGAN/training_harmonization_editing.py", line 76, in train
naive_img, naive_img_large, fixed_noise,
UnboundLocalError: local variable 'naive_img_large' referenced before assignment
Can you tell me how to solve this problem? Thank you!
Hi, I am trying to use the fine-tune function but this error always comes up
cmd input:
G:\projects\ConSinGAN-test-p\ConSinGAN-master\ConSinGAN-master>py main_train.py --gpu 0 --train_mode harmonization --input_name Images/Harmonization/23.jpg --naive_img Images/Harmonization/23.png --fine_tune --model_dir TrainedModels/23/2020_04_23_15_58_53_harmonization_train_depth_3_lr_scale_0.1_BN_act_lrelu_0.3
But this error pops up:
Image does not exist: G:/projects/ConSinGAN
Please specify a valid image.
The error message also drops the -test-p part of ConSinGAN-test-p and prints only G:/projects/ConSinGAN.
FYI, I am using Windows, and last time you told me to change line 107 of main_train.py to
copyfile(py_file, osp.join(dir2save, py_file.split("\\")[-1]))
Thank you
Sorry, I don't quite understand. What command do you use to generate the image? How do you determine the scale of the generated image?
Thank you very much!
Hi, I'm trying to run your implementation.
I run a command in your README.
python main_train.py --gpu 0 --train_mode harmonization --train_stages 3 --min_size 120 --lrelu_alpha 0.3 --niter 1000 --batch_norm --input_name Images/Harmonization/scream.jpg
I got an error.
Traceback (most recent call last):
File "main_train.py", line 111, in <module>
train(opt)
File "/home/ConSinGAN/training_harmonization_editing.py", line 75, in train
naive_img, naive_img_large, fixed_noise,
UnboundLocalError: local variable 'naive_img_large' referenced before assignment
Help me.
Hello, I have some problems. The picture that I run with your code is black. I tried to solve this problem with some methods, but did not succeed, so I ask you for advice.
Where was the article published?
Hi, your project ConSinGAN requires "albumentations==0.4.3" as a dependency. After analyzing the source code, we found that some other versions of albumentations, namely 0.4.4 and 0.4.5, are also suitable and do not affect your project. Therefore, we suggest loosening the dependency on albumentations from "albumentations==0.4.3" to "albumentations>=0.4.3,<=0.4.5" to avoid possible conflicts when importing more packages or for downstream projects that use ConSinGAN.
May I open a pull request to loosen the dependency on albumentations?
By the way, could you please tell us whether this kind of dependency analysis would help make dependencies easier to maintain during your development?
For your reference, here are details in our analysis.
Your project ConSinGAN(commit id: 8b4681f) directly uses 11 APIs from package albumentations.
albumentations.augmentations.transforms.Cutout.__init__, albumentations.augmentations.transforms.ChannelShuffle.__init__, albumentations.core.composition.Compose.__init__, albumentations.augmentations.transforms.ToSepia.__init__, albumentations.augmentations.transforms.MultiplicativeNoise.__init__, albumentations.imgaug.transforms.IAAAdditiveGaussianNoise.__init__, albumentations.augmentations.transforms.InvertImg.__init__, albumentations.augmentations.transforms.HueSaturationValue.__init__, albumentations.augmentations.transforms.ChannelDropout.__init__, albumentations.core.composition.OneOf.__init__, albumentations.augmentations.transforms.GaussNoise.__init__
From these, 14 functions are then called indirectly, including 12 internal albumentations APIs and 2 external APIs, as follows (omitting some repeated function occurrences).
[/tohinz/ConSinGAN]
+--albumentations.augmentations.transforms.Cutout.__init__
| +--albumentations.core.transforms_interface.BasicTransform.__init__
| +--warnings.warn
+--albumentations.augmentations.transforms.ChannelShuffle.__init__
| +--albumentations.core.transforms_interface.BasicTransform.__init__
+--albumentations.core.composition.Compose.__init__
| +--albumentations.core.composition.BaseCompose.__init__
| | +--albumentations.core.composition.Transforms.__init__
| | | +--albumentations.core.composition.Transforms._find_dual_start_end
| | | | +--albumentations.core.composition.Transforms._find_dual_start_end
| +--albumentations.augmentations.bbox_utils.BboxProcessor.__init__
| | +--albumentations.core.utils.DataProcessor.__init__
| +--albumentations.core.composition.BboxParams.__init__
| | +--albumentations.core.utils.Params.__init__
| +--albumentations.augmentations.keypoints_utils.KeypointsProcessor.__init__
| | +--albumentations.core.utils.DataProcessor.__init__
| +--albumentations.core.composition.KeypointParams.__init__
| | +--albumentations.core.utils.Params.__init__
| +--albumentations.core.composition.BaseCompose.add_targets
+--albumentations.augmentations.transforms.ToSepia.__init__
| +--albumentations.core.transforms_interface.BasicTransform.__init__
| +--numpy.matrix
+--albumentations.augmentations.transforms.MultiplicativeNoise.__init__
| +--albumentations.core.transforms_interface.BasicTransform.__init__
| +--albumentations.core.transforms_interface.to_tuple
+--albumentations.imgaug.transforms.IAAAdditiveGaussianNoise.__init__
| +--albumentations.core.transforms_interface.BasicTransform.__init__
| +--albumentations.core.transforms_interface.to_tuple
+--albumentations.augmentations.transforms.InvertImg.__init__
| +--albumentations.core.transforms_interface.BasicTransform.__init__
+--albumentations.augmentations.transforms.HueSaturationValue.__init__
| +--albumentations.core.transforms_interface.BasicTransform.__init__
| +--albumentations.core.transforms_interface.to_tuple
+--albumentations.augmentations.transforms.ChannelDropout.__init__
| +--albumentations.core.transforms_interface.BasicTransform.__init__
+--albumentations.core.composition.OneOf.__init__
| +--albumentations.core.composition.BaseCompose.__init__
+--albumentations.augmentations.transforms.GaussNoise.__init__
| +--albumentations.core.transforms_interface.BasicTransform.__init__
We compared albumentations versions 0.4.4 and 0.4.5 against 0.4.3; the changed functions (diffs listed below) do not intersect with any function or API mentioned above (whether called directly or indirectly by this project).
diff: 0.4.3(original) 0.4.4
['albumentations.augmentations.transforms.GridDropout', 'albumentations.core.composition.PerChannel', 'albumentations.core.composition.OneOf', 'albumentations.augmentations.transforms.GridDropout.apply', 'albumentations.augmentations.transforms.RandomSizedBBoxSafeCrop', 'albumentations.augmentations.transforms.GridDropout.targets_as_params', 'albumentations.augmentations.transforms.FancyPCA', 'albumentations.augmentations.transforms.FancyPCA.get_params', 'albumentations.core.composition.OneOf.__call__', 'albumentations.augmentations.transforms.GridDropout.get_params_dependent_on_targets', 'albumentations.augmentations.transforms.Lambda', 'albumentations.augmentations.transforms.Cutout', 'albumentations.augmentations.transforms.GlassBlur.targets_as_params', 'albumentations.augmentations.functional.gamma_transform', 'albumentations.augmentations.functional.glass_blur', 'albumentations.augmentations.transforms.RandomGamma.__init__', 'albumentations.augmentations.transforms.GridDropout.__init__', 'albumentations.augmentations.transforms.FancyPCA.__init__', 'albumentations.augmentations.transforms.Cutout.get_params_dependent_on_targets', 'albumentations.augmentations.transforms.RandomSizedBBoxSafeCrop.get_params_dependent_on_targets', 'albumentations.augmentations.transforms.RandomGamma.apply', 'albumentations.augmentations.functional.fancy_pca', 'albumentations.augmentations.transforms.GlassBlur.__init__', 'albumentations.core.composition.PerChannel.__call__', 'albumentations.augmentations.transforms.Lambda.__init__', 'albumentations.augmentations.functional.clahe', 'albumentations.augmentations.transforms.RandomGamma', 'albumentations.augmentations.transforms.FancyPCA.get_transform_init_args_names', 'albumentations.augmentations.bbox_utils.convert_bbox_to_albumentations', 'albumentations.augmentations.transforms.GridDropout.get_transform_init_args_names', 'albumentations.augmentations.transforms.GlassBlur.apply', 
'albumentations.augmentations.transforms.FancyPCA.apply', 'albumentations.augmentations.transforms.GlassBlur.get_transform_init_args_names', 'albumentations.augmentations.bbox_utils.denormalize_bbox', 'albumentations.augmentations.transforms.GridDropout.apply_to_mask', 'albumentations.augmentations.transforms.GlassBlur.get_params_dependent_on_targets', 'albumentations.augmentations.transforms.GlassBlur']
diff: 0.4.3(original) 0.4.5
['albumentations.augmentations.transforms.GridDropout', 'albumentations.core.composition.PerChannel', 'albumentations.core.composition.OneOf', 'albumentations.augmentations.transforms.GridDropout.apply', 'albumentations.augmentations.transforms.RandomSizedBBoxSafeCrop', 'albumentations.augmentations.transforms.GridDropout.targets_as_params', 'albumentations.augmentations.transforms.FancyPCA', 'albumentations.augmentations.transforms.FancyPCA.get_params', 'albumentations.core.composition.OneOf.__call__', 'albumentations.augmentations.transforms.GridDropout.get_params_dependent_on_targets', 'albumentations.augmentations.transforms.Lambda', 'albumentations.augmentations.transforms.Cutout', 'albumentations.augmentations.transforms.GlassBlur.targets_as_params', 'albumentations.augmentations.functional.gamma_transform', 'albumentations.augmentations.functional.glass_blur', 'albumentations.augmentations.transforms.RandomGamma.__init__', 'albumentations.augmentations.transforms.GridDropout.__init__', 'albumentations.augmentations.transforms.FancyPCA.__init__', 'albumentations.augmentations.transforms.Cutout.get_params_dependent_on_targets', 'albumentations.augmentations.transforms.RandomSizedBBoxSafeCrop.get_params_dependent_on_targets', 'albumentations.augmentations.transforms.RandomGamma.apply', 'albumentations.augmentations.functional.fancy_pca', 'albumentations.augmentations.transforms.GlassBlur.__init__', 'albumentations.core.composition.PerChannel.__call__', 'albumentations.augmentations.transforms.Lambda.__init__', 'albumentations.augmentations.functional.clahe', 'albumentations.augmentations.transforms.RandomGamma', 'albumentations.augmentations.transforms.FancyPCA.get_transform_init_args_names', 'albumentations.augmentations.bbox_utils.convert_bbox_to_albumentations', 'albumentations.augmentations.transforms.GridDropout.get_transform_init_args_names', 'albumentations.augmentations.transforms.GlassBlur.apply', 
'albumentations.augmentations.transforms.FancyPCA.apply', 'albumentations.augmentations.transforms.GlassBlur.get_transform_init_args_names', 'albumentations.augmentations.bbox_utils.denormalize_bbox', 'albumentations.augmentations.transforms.GridDropout.apply_to_mask', 'albumentations.augmentations.transforms.GlassBlur.get_params_dependent_on_targets', 'albumentations.augmentations.transforms.GlassBlur']
As for other packages, the APIs of @outside_package_name are called by albumentations in the call graph and the dependencies on these packages also stay the same in our suggested versions, thus avoiding any outside conflict.
Therefore, we believe that it is quite safe to loosen your dependency on albumentations from "albumentations==0.4.3" to "albumentations>=0.4.3,<=0.4.5". This will improve the applicability of ConSinGAN and reduce the possibility of further dependency conflicts with other projects/packages.
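To make the proposed range concrete, the check below shows which versions "albumentations>=0.4.3,<=0.4.5" would admit. It is a minimal sketch with hypothetical helper names that handles purely numeric dotted versions only; real tooling such as pip applies fuller version-parsing rules.

```python
def parse_version(text):
    """Turn a purely numeric dotted version like '0.4.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def in_range(version, low="0.4.3", high="0.4.5"):
    """Return True if low <= version <= high, comparing component-wise."""
    return parse_version(low) <= parse_version(version) <= parse_version(high)


for candidate in ("0.4.2", "0.4.3", "0.4.4", "0.4.5", "0.4.6"):
    print(candidate, in_range(candidate))
```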