
open-mmlab / mmgeneration

1.9K stars · 27 watchers · 225 forks · 27.21 MB

MMGeneration is a powerful toolkit for generative models, based on PyTorch and MMCV.

Home Page: https://mmgeneration.readthedocs.io/en/latest/

License: Apache License 2.0

Languages: Python 93.26%, Shell 0.16%, Dockerfile 0.06%, C++ 1.58%, Cuda 4.93%
Topics: generative, gan, pytorch, mmcv, openmmlab, generative-adversarial-network, diffusion-models

mmgeneration's People

Contributors

ckkelvinchan, gvalvano, jiangongwang, jimheo, leeteng2001, leoxing1996, nbei, plutoyuxie, plyfager, rangeking, shinya7y, taited, tommyzihao, vansin, yshuo-li, z-fran, zeakey, zeng-hello-world, zengyh1900


mmgeneration's Issues

An error in Verification

Hello! Thank you for your work!

When I followed the get_started.md for verification, an error occurred:
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.8/dist-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/hy-nas/mmgeneration/mmgen/apis/inference.py", line 78, in sample_uncoditional_model
    res = model.sample_from_noise(
  File "/hy-nas/mmgeneration/mmgen/models/gans/base_gan.py", line 167, in sample_from_noise
    outputs = _model(noise, num_batches=num_batches, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/hy-nas/mmgeneration/mmgen/core/runners/fp16_utils.py", line 141, in new_func
    return old_func(*args, **kwargs)
  File "/hy-nas/mmgeneration/mmgen/models/architectures/stylegan/generator_discriminator_v2.py", line 369, in forward
    styles = [self.style_mapping(s) for s in styles]
  File "/hy-nas/mmgeneration/mmgen/models/architectures/stylegan/generator_discriminator_v2.py", line 369, in <listcomp>
    styles = [self.style_mapping(s) for s in styles]
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/hy-nas/mmgeneration/mmgen/models/architectures/stylegan/modules/styleganv2_modules.py", line 104, in forward
    x = self.linear(x)
  File "/usr/local/lib/python3.8/dist-packages/torch/nn/modules/module.py", line 875, in _call_impl
    result = hook(self, input)
  File "/hy-nas/mmgeneration/mmgen/models/architectures/pggan/modules.py", line 67, in __call__
    setattr(module, self.name, self.compute_weight(module))
  File "/hy-nas/mmgeneration/mmgen/models/architectures/pggan/modules.py", line 59, in compute_weight
    weight = weight * torch.tensor(
RuntimeError: CUDA error: no kernel image is available for execution on the device
I tested the installed CUDA as shown below:

import torch
import torchvision
print(torch.cuda.is_available())
True
a = torch.Tensor(5,3)
a = a.cuda()
print (a)
tensor([[0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.],
        [0., 0., 0.]], device='cuda:0')

In fact, I also did some experiments on the same platform before, and there was no such error. Could you give me some suggestions?

P.S. I did not use Anaconda to create a virtual environment; my environment is as follows:

Ubuntu 18.04
Python 3.8
PyTorch 1.8.1
CUDA 11.1
gcc version 7.5.0
mmcv-full 1.3.6
an RTX 3090

I saw a closed issue (#4), so I tried:

import mmcv
from mmgen.apis import init_model, sample_uncoditional_model

config_file = 'configs/styleganv2/stylegan2_c2_lsun-church_256_b4x8_800k.py'
# you can download this checkpoint in advance and use a local file path.
checkpoint_file = 'https://download.openmmlab.com/mmgen/stylegan2/official_weights/stylegan2-church-config-f-official_20210327_172657-1d42b7d1.pth'
device = 'cpu'
# init a generative model
model = init_model(config_file, checkpoint_file, device=device).cpu()
# sample images
fake_imgs = sample_uncoditional_model(model, 4)

the result is:
Use load_from_http loader
2021-06-16 17:55:54,373 - mmgen - INFO - Switch to evaluation style mode: single
2021-06-16 17:55:54,374 - mmgen - INFO - Switch to evaluation style mode: single
I don’t know if this proves that the installation was successful.
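For anyone hitting the same "no kernel image is available" error: it usually means a compiled CUDA extension (or the wheel it ships in) was built without kernels for the GPU's compute capability; an RTX 3090 is sm_86. A quick diagnostic sketch using only standard PyTorch calls:

import torch

# CUDA toolkit version the installed PyTorch wheel targets
print(torch.version.cuda)
# compute capability of the local GPU, e.g. (8, 6) for an RTX 3090
print(torch.cuda.get_device_capability(0))
# architectures compiled into this PyTorch build, e.g. [..., 'sm_86']
print(torch.cuda.get_arch_list())

If sm_86 is missing from the list, or if mmcv-full was installed from a wheel built for a different CUDA/PyTorch combination, simple tensor ops can still succeed while model code fails; reinstalling mmcv-full to match the local CUDA and PyTorch versions is the usual fix.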

Checklist for unfinished Chinese docs

The following Chinese docs have not been finished:

Some questions when using MMGeneration

When using an unconditional GAN model, can we only train on a single class of images, or does it support training on multiple target classes?

How to resize the random noise input of SinGAN?

Hi,

Like the paper "Positional Encoding as Spatial Inductive Bias in GANs", I would like to resize the random noise input of SinGAN to an arbitrary size.

I want to synthesize images larger than the input (ground-truth) image.
How can I do this?

Thank you.
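Not an official answer, but since SinGAN's generators are fully convolutional, feeding a spatially larger noise map at the coarsest scale should yield a larger output, which is the trick the MS-PIE paper builds on. A minimal sketch of resizing a noise tensor (the helper name is hypothetical, not part of the mmgen API):

import torch.nn.functional as F

def resize_noise(noise, scale=1.5):
    # noise: (N, C, H, W); enlarging the spatial size lets the fully
    # convolutional generator synthesize a larger image
    return F.interpolate(noise, scale_factor=scale, mode='bilinear',
                         align_corners=False)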

TypeError: train_step() got an unexpected keyword argument 'running_status'

Thanks for your error report and we appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. I have read the FAQ documentation but cannot get the expected help.
  3. The bug has not been fixed in the latest version.

Describe the bug

There is something wrong with the DDP. I'm trying to train a pix2pix model with MMGeneration and it reports an error. However, when I train a pix2pix model with MMEditing, it does not have this issue. I believe this issue is caused by the dynamic iteration runner.

Reproduction

  1. What command or script did you run?
    bash tools/dist_train.sh configs/pix2pix/pix2pix_vanilla_unet_bn_1x1_80k_facades.py 1
  2. Did you make any modifications on the code or config? Did you understand what you have modified?
    No.
  3. What dataset did you use?
    Facades

Environment
fatal: Not a git repository (or any parent up to mount point /home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
sys.platform: linux
Python: 3.7.1 (default, Dec 14 2018, 19:28:38) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.2, V10.2.89
GPU 0,1,2,3,4,5,6,7: GeForce RTX 2080 Ti
GCC: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
PyTorch: 1.5.0
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2019.0.1 Product Build 20180928 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 10.2
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  • CuDNN 7.6.5
  • Magma 2.5.2
  • Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_INTERNAL_THREADPOOL_IMPL -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

TorchVision: 0.6.0
OpenCV: 4.1.1
MMCV: 1.3.5
MMGen: 0.2.0+
MMCV Compiler: GCC 5.4
MMCV CUDA Compiler: 10.2

  1. Please run python mmgen/utils/collect_env.py to collect necessary environment information and paste it here.
  2. You may add additional information that may be helpful for locating the problem, such as:
    • How you installed PyTorch [e.g., pip, conda, source]
      With pip. I have tried torch==1.5.0 and 1.8.1; neither works.
    • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

Error traceback
If applicable, paste the error traceback here.
Python: 3.7.1 (default, Dec 14 2018, 19:28:38) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.2, V10.2.89
GPU 0: GeForce RTX 2080 Ti
(remaining environment output identical to the Environment section above)

2021-05-31 15:45:07,821 - mmgen - INFO - Distributed training: True
2021-05-31 15:45:08,123 - mmgen - INFO - Config:
model = dict(
    type='Pix2Pix',
    generator=dict(
        type='UnetGenerator',
        in_channels=3,
        out_channels=3,
        num_down=8,
        base_channels=64,
        norm_cfg=dict(type='BN'),
        use_dropout=True,
        init_cfg=dict(type='normal', gain=0.02)),
    discriminator=dict(
        type='PatchDiscriminator',
        in_channels=6,
        base_channels=64,
        num_conv=3,
        norm_cfg=dict(type='BN'),
        init_cfg=dict(type='normal', gain=0.02)),
    gan_loss=dict(
        type='GANLoss',
        gan_type='vanilla',
        real_label_val=1.0,
        fake_label_val=0.0,
        loss_weight=1.0),
    pixel_loss=dict(type='L1Loss', loss_weight=100.0, reduction='mean'))
train_cfg = dict(direction='b2a')
test_cfg = dict(direction='b2a', show_input=True)
train_dataset_type = 'PairedImageDataset'
val_dataset_type = 'PairedImageDataset'
img_norm_cfg = dict(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
train_pipeline = [
    dict(
        type='LoadPairedImageFromFile',
        io_backend='disk',
        key='pair',
        flag='color'),
    dict(
        type='Resize',
        keys=['img_a', 'img_b'],
        scale=(286, 286),
        interpolation='bicubic'),
    dict(type='FixedCrop', keys=['img_a', 'img_b'], crop_size=(256, 256)),
    dict(type='Flip', keys=['img_a', 'img_b'], direction='horizontal'),
    dict(type='RescaleToZeroOne', keys=['img_a', 'img_b']),
    dict(
        type='Normalize',
        keys=['img_a', 'img_b'],
        to_rgb=False,
        mean=[0.5, 0.5, 0.5],
        std=[0.5, 0.5, 0.5]),
    dict(type='ImageToTensor', keys=['img_a', 'img_b']),
    dict(
        type='Collect',
        keys=['img_a', 'img_b'],
        meta_keys=['img_a_path', 'img_b_path'])
]
test_pipeline = [
    dict(
        type='LoadPairedImageFromFile',
        io_backend='disk',
        key='pair',
        flag='color'),
    dict(
        type='Resize',
        keys=['img_a', 'img_b'],
        scale=(256, 256),
        interpolation='bicubic'),
    dict(type='RescaleToZeroOne', keys=['img_a', 'img_b']),
    dict(
        type='Normalize',
        keys=['img_a', 'img_b'],
        to_rgb=False,
        mean=[0.5, 0.5, 0.5],
        std=[0.5, 0.5, 0.5]),
    dict(type='ImageToTensor', keys=['img_a', 'img_b']),
    dict(
        type='Collect',
        keys=['img_a', 'img_b'],
        meta_keys=['img_a_path', 'img_b_path'])
]
data = dict(
    samples_per_gpu=1,
    workers_per_gpu=4,
    drop_last=True,
    train=dict(
        type='PairedImageDataset',
        dataroot='data/paired/facades',
        pipeline=[
            dict(
                type='LoadPairedImageFromFile',
                io_backend='disk',
                key='pair',
                flag='color'),
            dict(
                type='Resize',
                keys=['img_a', 'img_b'],
                scale=(286, 286),
                interpolation='bicubic'),
            dict(
                type='FixedCrop',
                keys=['img_a', 'img_b'],
                crop_size=(256, 256)),
            dict(type='Flip', keys=['img_a', 'img_b'], direction='horizontal'),
            dict(type='RescaleToZeroOne', keys=['img_a', 'img_b']),
            dict(
                type='Normalize',
                keys=['img_a', 'img_b'],
                to_rgb=False,
                mean=[0.5, 0.5, 0.5],
                std=[0.5, 0.5, 0.5]),
            dict(type='ImageToTensor', keys=['img_a', 'img_b']),
            dict(
                type='Collect',
                keys=['img_a', 'img_b'],
                meta_keys=['img_a_path', 'img_b_path'])
        ],
        test_mode=False),
    val=dict(
        type='PairedImageDataset',
        dataroot='data/paired/facades',
        pipeline=[
            dict(
                type='LoadPairedImageFromFile',
                io_backend='disk',
                key='pair',
                flag='color'),
            dict(
                type='Resize',
                keys=['img_a', 'img_b'],
                scale=(256, 256),
                interpolation='bicubic'),
            dict(type='RescaleToZeroOne', keys=['img_a', 'img_b']),
            dict(
                type='Normalize',
                keys=['img_a', 'img_b'],
                to_rgb=False,
                mean=[0.5, 0.5, 0.5],
                std=[0.5, 0.5, 0.5]),
            dict(type='ImageToTensor', keys=['img_a', 'img_b']),
            dict(
                type='Collect',
                keys=['img_a', 'img_b'],
                meta_keys=['img_a_path', 'img_b_path'])
        ],
        test_mode=True),
    test=dict(
        type='PairedImageDataset',
        dataroot='data/paired/facades',
        pipeline=[
            dict(
                type='LoadPairedImageFromFile',
                io_backend='disk',
                key='pair',
                flag='color'),
            dict(
                type='Resize',
                keys=['img_a', 'img_b'],
                scale=(256, 256),
                interpolation='bicubic'),
            dict(type='RescaleToZeroOne', keys=['img_a', 'img_b']),
            dict(
                type='Normalize',
                keys=['img_a', 'img_b'],
                to_rgb=False,
                mean=[0.5, 0.5, 0.5],
                std=[0.5, 0.5, 0.5]),
            dict(type='ImageToTensor', keys=['img_a', 'img_b']),
            dict(
                type='Collect',
                keys=['img_a', 'img_b'],
                meta_keys=['img_a_path', 'img_b_path'])
        ],
        test_mode=True))
checkpoint_config = dict(interval=4000, by_epoch=False, save_optimizer=True)
log_config = dict(
    interval=100, hooks=[dict(type='TextLoggerHook', by_epoch=False)])
custom_hooks = [
    dict(
        type='VisualizationHook',
        output_dir='training_samples',
        res_name_list=['fake_b'],
        interval=100)
]
runner = dict(
    type='DynamicIterBasedRunner',
    is_dynamic_ddp=True,
    pass_training_status=True)
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
find_unused_parameters = True
cudnn_benchmark = True
dataroot = 'data/paired/facades'
optimizer = dict(
    generator=dict(type='Adam', lr=0.0002, betas=(0.5, 0.999)),
    discriminator=dict(type='Adam', lr=0.0002, betas=(0.5, 0.999)))
lr_config = None
visual_config = None
total_iters = 80000
exp_name = 'pix2pix_facades'
work_dir = './work_dirs/experiments/pix2pix_facades'
metrics = dict(
    FID=dict(type='FID', num_images=106, image_shape=(3, 256, 256)),
    IS=dict(type='IS', num_images=106, image_shape=(3, 256, 256)))
gpu_ids = range(0, 1)

2021-05-31 15:45:08,124 - mmgen - INFO - Set random seed to 2021, deterministic: False
fatal: Not a git repository (or any parent up to mount point /home)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
2021-05-31 15:45:13,765 - mmgen - INFO - Start running, host: lthpc@lthpc, work_dir: /home/lthpc/mmgeneration-master/work_dirs/experiments/pix2pix_facades
2021-05-31 15:45:13,765 - mmgen - INFO - workflow: [('train', 1)], max: 80000 iters
Traceback (most recent call last):
  File "tools/train.py", line 163, in <module>
    main()
  File "tools/train.py", line 159, in main
    meta=meta)
  File "/home/lthpc/mmgeneration-master/mmgen/apis/train.py", line 196, in train_model
    runner.run(data_loaders, cfg.workflow, cfg.total_iters)
  File "/home/lthpc/mmgeneration-master/mmgen/core/runners/dynamic_iterbased_runner.py", line 284, in run
    iter_runner(iter_loaders[i], **kwargs)
  File "/home/lthpc/mmgeneration-master/mmgen/core/runners/dynamic_iterbased_runner.py", line 214, in train
    outputs = self.model.train_step(data_batch, self.optimizer, **kwargs)
  File "/home/lthpc/anaconda3/lib/python3.7/site-packages/mmcv/parallel/distributed.py", line 51, in train_step
    output = self.module.train_step(*inputs[0], **kwargs[0])
TypeError: train_step() got an unexpected keyword argument 'running_status'
Traceback (most recent call last):
  File "/home/lthpc/anaconda3/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/lthpc/anaconda3/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/lthpc/anaconda3/lib/python3.7/site-packages/torch/distributed/launch.py", line 263, in <module>
    main()
  File "/home/lthpc/anaconda3/lib/python3.7/site-packages/torch/distributed/launch.py", line 259, in main
    cmd=cmd)
subprocess.CalledProcessError: Command '['/home/lthpc/anaconda3/bin/python', '-u', 'tools/train.py', '--local_rank=0', 'configs/pix2pix/pix2pix_vanilla_unet_bn_1x1_80k_facades.py', '--launcher', 'pytorch']' returned non-zero exit status 1.


Bug fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!
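For context, this error typically means the model's train_step was written for the plain MMCV runner, while the config uses DynamicIterBasedRunner with pass_training_status=True, which forwards a running_status keyword. A hedged sketch of the signature MMGeneration-style models are expected to expose (argument names are approximate):

def train_step(self, data_batch, optimizer, ddp_reducer=None, running_status=None):
    # `running_status` carries the current iteration so that losses with
    # intervals (e.g. lazy regularization) know when to fire
    ...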

Why "find_unused_parameters" needs to be True?

Hi, nice work. I am just studying this repo and have some small questions. I find that when training StyleGANv2 using stylegan2_c2_lsun-horse_256_b4x8_800k.py, "find_unused_parameters" in default_runtime.py must be True, otherwise there is an error saying the module has parameters that were not used in producing loss. In my opinion, this usually happens when some parameters in the model do not need gradients; however, I did not find such a case in StyleGANv2. So I wonder: must "find_unused_parameters" be True for training in this repo, or did I introduce some bug causing this error? If it is not a bug, what is the reason for it? Thanks a lot.
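One plausible explanation (an assumption, not a confirmed answer): StyleGAN2 training here uses lazy regularization, so the R1 penalty (interval 16) and the path-length regularizer (interval 4) contribute to the loss only every few iterations; on the other iterations, the parameters on those branches receive no gradients, which vanilla DDP reports as an error. A toy sketch of the pattern that triggers it:

import torch.nn as nn

class LazyRegModel(nn.Module):
    # the `extra` branch runs only every k-th iteration, so DDP without
    # find_unused_parameters=True raises "parameters that were not used
    # in producing loss" on the iterations that skip it
    def __init__(self):
        super().__init__()
        self.main = nn.Linear(8, 8)
        self.extra = nn.Linear(8, 8)

    def forward(self, x, use_extra=False):
        x = self.main(x)
        if use_extra:
            x = self.extra(x)
        return x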

RuntimeError when training stylegan2_c2_ffhq_512_b3x8_1100k

When I train stylegan2_c2_ffhq_512_b3x8_1100k using bash tools/dist_train.sh configs/positional_encoding_in_gans/stylegan2_c2_ffhq_512_b3x8_1100k.py 1,

I run into this problem:
Traceback (most recent call last):
  File "tools/train.py", line 163, in <module>
    main()
  File "tools/train.py", line 159, in main
    meta=meta)
  File "/home/wenjun/mmgeneration/mmgen/apis/train.py", line 198, in train_model
    runner.run(data_loaders, cfg.workflow, cfg.total_iters)
  File "/home/wenjun/mmgeneration/mmgen/core/runners/dynamic_iterbased_runner.py", line 285, in run
    iter_runner(iter_loaders[i], **kwargs)
  File "/home/wenjun/mmgeneration/mmgen/core/runners/dynamic_iterbased_runner.py", line 215, in train
    outputs = self.model.train_step(data_batch, self.optimizer, **kwargs)
  File "/home/wenjun/anaconda3/envs/mmgeneration/lib/python3.7/site-packages/mmcv/parallel/distributed.py", line 52, in train_step
    output = self.module.train_step(*inputs[0], **kwargs[0])
  File "/home/wenjun/mmgeneration/mmgen/models/gans/static_unconditional_gan.py", line 175, in train_step
    disc_pred_real = self.discriminator(real_imgs)
  File "/home/wenjun/anaconda3/envs/mmgeneration/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/wenjun/mmgeneration/mmgen/core/runners/fp16_utils.py", line 142, in new_func
    return old_func(*args, **kwargs)
  File "/home/wenjun/mmgeneration/mmgen/models/architectures/stylegan/generator_discriminator_v2.py", line 605, in forward
    x = self.final_linear(x)
  File "/home/wenjun/anaconda3/envs/mmgeneration/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/wenjun/anaconda3/envs/mmgeneration/lib/python3.7/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/home/wenjun/anaconda3/envs/mmgeneration/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/wenjun/mmgeneration/mmgen/models/architectures/stylegan/modules/styleganv2_modules.py", line 105, in forward
    x = self.linear(x)
  File "/home/wenjun/anaconda3/envs/mmgeneration/lib/python3.7/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/wenjun/anaconda3/envs/mmgeneration/lib/python3.7/site-packages/torch/nn/modules/linear.py", line 94, in forward
    return F.linear(input, self.weight, self.bias)
  File "/home/wenjun/anaconda3/envs/mmgeneration/lib/python3.7/site-packages/torch/nn/functional.py", line 1753, in linear
    return torch._C._nn.linear(input, weight, bias)
RuntimeError: mat1 dim 1 must match mat2 dim 0
Killing subprocess 32428
Traceback (most recent call last):
  File "/home/wenjun/anaconda3/envs/mmgeneration/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/home/wenjun/anaconda3/envs/mmgeneration/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/home/wenjun/anaconda3/envs/mmgeneration/lib/python3.7/site-packages/torch/distributed/launch.py", line 340, in <module>
    main()
  File "/home/wenjun/anaconda3/envs/mmgeneration/lib/python3.7/site-packages/torch/distributed/launch.py", line 326, in main
    sigkill_handler(signal.SIGTERM, None)  # not coming back
  File "/home/wenjun/anaconda3/envs/mmgeneration/lib/python3.7/site-packages/torch/distributed/launch.py", line 301, in sigkill_handler
    raise subprocess.CalledProcessError(returncode=last_return_code, cmd=cmd)
subprocess.CalledProcessError: Command '['/home/wenjun/anaconda3/envs/mmgeneration/bin/python', '-u', 'tools/train.py', '--local_rank=0', 'configs/positional_encoding_in_gans/stylegan2_c2_ffhq_512_b3x8_1100k.py', '--launcher', 'pytorch']' returned non-zero exit status 1.

Can you help me?
Thanks!

Differences between ModulatedPEConv2d and ModulatedConv2d

Hi, I noticed that in StyleGANv2, ModulatedConv2d is applied. However, there is also a ModulatedPEConv2d module, which seems not to be used in any model. What is the difference between ModulatedPEConv2d and ModulatedConv2d? I haven't found any information about this. Why not use ModulatedPEConv2d instead of ModulatedConv2d? Actually, I saw some checkerboard artifacts with StyleGANv2, and they seem to be due to the ConvTranspose2d operation, so I wonder whether ModulatedPEConv2d can be applied to solve such a problem. Thanks.

How to generate visualization results for `pix2pix` model

Hello!

I used the configs/pix2pix/pix2pix_vanilla_unet_bn_1x1_80k_facades.py config to train and got the expected checkpoints. But when I tried to use the checkpoints to generate visualization results for the validation dataset, none of the scripts under the apps folder nor tools/evaluation.py worked. May I ask how to generate visualization results for the pix2pix model?

Thanks in advance.
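A hedged sketch of one way to visualize a trained pix2pix checkpoint with the translation helpers in mmgen.apis (the checkpoint and image paths below are placeholders, and the post-processing assumes a (N, C, H, W) output normalized to [-1, 1]):

import mmcv
from mmgen.apis import init_model, sample_img2img_model

config_file = 'configs/pix2pix/pix2pix_vanilla_unet_bn_1x1_80k_facades.py'
checkpoint_file = 'work_dirs/pix2pix_facades/latest.pth'  # placeholder path
model = init_model(config_file, checkpoint_file, device='cuda:0')

# translate one validation image (placeholder path)
fake = sample_img2img_model(model, 'data/paired/facades/val/1.jpg')
# map from [-1, 1] to uint8 HWC before saving
img = ((fake[0].permute(1, 2, 0).cpu().numpy() + 1) / 2 * 255).astype('uint8')
mmcv.imwrite(img, 'translated.png')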

About the custom hook "VisualizeUnconditionalSamples"

Hi, I am using mmgeneration to train styleganv2.

When visualizing unconditional samples during training, I found the number of generated samples is 32. It seems these 32 images are just two copies of the same 16 images. I checked the code: num_samples defaults to 16, and I didn't change it.

So I am wondering where the number of samples is set?
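Not a confirmed answer, but two things seem worth checking: the hook's num_samples argument (default 16) can be overridden in custom_hooks, and with use_ema=True the saved grid may stack the EMA and the ordinary generator's outputs for the same noise, which would explain 16 samples appearing as 32 images. A config sketch (the override below is an assumption):

custom_hooks = [
    dict(
        type='VisualizeUnconditionalSamples',
        output_dir='training_samples',
        interval=1000,
        num_samples=16)  # assumed per-hook override of the default count
]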

fused_bias_leakyrelu(): incompatible function arguments. The following argument types are supported: 1. (arg0: at::Tensor, arg1: at::Tensor, arg2: at::Tensor, arg3: int, arg4: int, arg5: float, arg6: float) -> at::Tensor

Describe the bug
On Ubuntu 20.04 with an NVIDIA RTX 3080 and Python 3.8, I ran the official verification demo but got this TypeError:

from mmgen.apis import init_model, sample_uncoditional_model
if __name__ == '__main__':
    config_file = 'configs/styleganv2/stylegan2_c2_lsun-church_256_b4x8_800k.py'
    # you can download this checkpoint in advance and use a local file path.
    checkpoint_file = 'http://download.openmmlab.com/mmgen/stylegan2/official_weights/stylegan2-church-config-f-official_20210327_172657-1d42b7d1.pth'
    device = 'cuda:0'
    # init a generative model
    model = init_model(config_file, checkpoint_file, device=device)
    # sample images
    fake_imgs = sample_uncoditional_model(model, 4)
    print(fake_imgs)

error information:

    out = ext_module.fused_bias_leakyrelu(
TypeError: fused_bias_leakyrelu(): incompatible function arguments. The following argument types are supported:
    1. (arg0: at::Tensor, arg1: at::Tensor, arg2: at::Tensor, arg3: int, arg4: int, arg5: float, arg6: float) -> at::Tensor

Invoked with: tensor([[ 1.5543,  0.9359, -0.7516,  ...,  0.8526,  0.1092, -1.5828],
        [-0.4884, -0.5331,  0.0995,  ..., -1.1292,  0.7263,  0.0868],
        [ 1.8955, -0.2950, -0.6122,  ..., -1.5320,  0.3621, -0.0334],
        [-1.1611,  1.4324, -0.5950,  ..., -0.0754, -0.2390,  0.5369]],
       device='cuda:0'), tensor([ 2.1072e-02,  7.2350e-03, -7.3995e-03,  8.8704e-03,  1.2794e-02,
         ...
        -1.4140e-03, -4.1711e-03,  1.9475e-02, -4.1921e-03,  4.0421e-03,
         6.0400e-03, -4.2922e-03], device='cuda:0'), tensor([], device='cuda:0'); kwargs: act=3, grad=0, alpha=0.2, scale=1.4142135623730951

I debugged the 7 input args and their data types are all correct, but when I change device='cuda:0' to device='cpu', it works fine, as below:

from mmgen.apis import init_model, sample_uncoditional_model

if __name__ == '__main__':
    config_file = 'configs/styleganv2/stylegan2_c2_lsun-church_256_b4x8_800k.py'
    # you can download this checkpoint in advance and use a local file path.
    checkpoint_file = 'http://download.openmmlab.com/mmgen/stylegan2/official_weights/stylegan2-church-config-f-official_20210327_172657-1d42b7d1.pth'
    device = 'cpu'
    # init a generative model
    model = init_model(config_file, checkpoint_file, device=device)
    # sample images
    fake_imgs = sample_uncoditional_model(model, 4)
    print(fake_imgs)
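A minimal repro of just the failing op can confirm whether the mmcv-full build matches the GPU; mmcv.ops.fused_bias_leakyrelu is the public wrapper around the compiled extension the traceback hits, and an RTX 3080 needs a build with sm_86 kernels, so a wheel compiled for an older CUDA/PyTorch combination fails on CUDA while the CPU path works:

import torch
from mmcv.ops import fused_bias_leakyrelu

x = torch.randn(2, 8, device='cuda')
bias = torch.zeros(8, device='cuda')
# fails like the traceback above if the extension lacks kernels
# for this GPU architecture; succeeds on a matching build
print(fused_bias_leakyrelu(x, bias).shape)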

Environment

sys.platform: linux
Python: 3.8.8 (default, Apr 13 2021, 19:58:26) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda-11.1
NVCC: Build cuda_11.1.TC455_06.29069683_0
GPU 0: GeForce RTX 3080
GCC: gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
PyTorch: 1.8.0
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.4 Product Build 20200917 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
  • CuDNN 8.0.5
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.9.0
OpenCV: 4.5.1
MMCV: 1.3.2
MMGen: 0.1.0+d7eda0b
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.1

Error traceback
Traceback (most recent call last):
  File "/usrpath/data/openmmlab/mmgeneration/my_test.py", line 11, in <module>
    fake_imgs = sample_uncoditional_model(model, 4)
  File "/usrpath/software/miniconda3/envs/openmmlab/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/usrpath/data/openmmlab/mmgeneration/mmgen/apis/inference.py", line 78, in sample_uncoditional_model
    res = model.sample_from_noise(
  File "/usrpath/data/openmmlab/mmgeneration/mmgen/models/gans/base_gan.py", line 166, in sample_from_noise
    outputs = _model(noise, num_batches=num_batches, **kwargs)
  File "/usrpath/software/miniconda3/envs/openmmlab/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usrpath/data/openmmlab/mmgeneration/mmgen/models/architectures/stylegan/generator_discriminator_v2.py", line 340, in forward
    styles = [self.style_mapping(s) for s in styles]
  File "/usrpath/data/openmmlab/mmgeneration/mmgen/models/architectures/stylegan/generator_discriminator_v2.py", line 340, in <listcomp>
    styles = [self.style_mapping(s) for s in styles]
  File "/usrpath/software/miniconda3/envs/openmmlab/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usrpath/software/miniconda3/envs/openmmlab/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/usrpath/software/miniconda3/envs/openmmlab/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/usrpath/data/openmmlab/mmgeneration/mmgen/models/architectures/stylegan/modules/styleganv2_modules.py", line 89, in forward
    x = self.activate(x, self.bias * self.lr_mul)
  File "/usrpath/software/miniconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/ops/fused_bias_leakyrelu.py", line 158, in fused_bias_leakyrelu
    return FusedBiasLeakyReLUFunction.apply(input, bias.to(input.dtype),
  File "/usrpath/software/miniconda3/envs/openmmlab/lib/python3.8/site-packages/mmcv/ops/fused_bias_leakyrelu.py", line 71, in forward
    out = ext_module.fused_bias_leakyrelu(
TypeError: fused_bias_leakyrelu(): incompatible function arguments. The following argument types are supported:
    1. (arg0: at::Tensor, arg1: at::Tensor, arg2: at::Tensor, arg3: int, arg4: int, arg5: float, arg6: float) -> at::Tensor

Invoked with: tensor([[ 1.5543,  0.9359, -0.7516,  ...,  0.8526,  0.1092, -1.5828],
        [-0.4884, -0.5331,  0.0995,  ..., -1.1292,  0.7263,  0.0868],
        [ 1.8955, -0.2950, -0.6122,  ..., -1.5320,  0.3621, -0.0334],
        [-1.1611,  1.4324, -0.5950,  ..., -0.0754, -0.2390,  0.5369]],
       device='cuda:0'), tensor([ 2.1072e-02,  7.2350e-03, -7.3995e-03,  ...,
        -1.4140e-03, -4.1711e-03,  1.9475e-02, -4.1921e-03,  4.0421e-03,
         6.0400e-03, -4.2922e-03], device='cuda:0'), tensor([], device='cuda:0'); kwargs: act=3, grad=0, alpha=0.2, scale=1.4142135623730951

Computation of FID at different scales - MSPIE

Hello,

Thank you for the amazing work on Positional Encodings and the neat github repo!

I am trying to compute the FID at different scales and I was wondering about the process you followed to do so for MSPIE.

Following your documentation, I am currently using:
python tools/utils/inception_stat.py --imgsdir ${path-to-1024-FFHQ} --pklname ${out-pkl-destination} --size ${size-of-output-images-to-compare-with} --num-samples 50000
When using the produced pkl to compare with one of your pretrained models (using tools/evaluate.py), I am getting an FID of ~19 for 256x256 images, and I am trying to find what I am doing wrong.

Thanks!

Cheers,
Evan

CycleGAN

I tried to run inference with CycleGAN using the horse2zebra id0 config and weights, but got this error.
Could you tell me if I made a mistake or am missing something?

config_file = '/mmlab/mmgeneration/configs/cyclegan/cyclegan_lsgan_id0_resnet_in_1x1_270k_horse2zebra.py'
checkpoint_file = '/mmlab/mmgeneration/cyclegan_lsgan_id0_resnet_in_1x1_266800_horse2zebra_convert-bgr_20210902_165724-77c9c806.pth'
device = 'cuda:0'
model = init_model(config_file, checkpoint_file, device=device)


TypeError                                 Traceback (most recent call last)
~/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/utils/registry.py in build_from_cfg(cfg, registry, default_args)
     51     try:
---> 52         return obj_cls(**args)
     53     except Exception as e:

TypeError: __init__() got an unexpected keyword argument 'default_domain'

During handling of the above exception, another exception occurred:

TypeError                                 Traceback (most recent call last)
in <module>
      7 device = 'cuda:0'
      8 # init a generative model
----> 9 model = init_model(config_file, checkpoint_file, device=device)
     10 # translate a single image
     11 translated_image = sample_img2img_model(model, image_path)

~/mmlab/mmgeneration/mmgen/apis/inference.py in init_model(config, checkpoint, device, cfg_options)
     33         config.merge_from_dict(cfg_options)
     34
---> 35     model = build_model(
     36         config.model, train_cfg=config.train_cfg, test_cfg=config.test_cfg)
     37

~/mmlab/mmgeneration/mmgen/models/builder.py in build_model(cfg, train_cfg, test_cfg)
     29
     30 def build_model(cfg, train_cfg=None, test_cfg=None):
---> 31     """Build model (GAN)."""
     32     return build(cfg, MODELS, dict(train_cfg=train_cfg, test_cfg=test_cfg))
     33

~/mmlab/mmgeneration/mmgen/models/builder.py in build(cfg, registry, default_args)
     24         ]
     25         return nn.ModuleList(modules)
---> 26
     27     return build_from_cfg(cfg, registry, default_args)
     28

~/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/utils/registry.py in build_from_cfg(cfg, registry, default_args)
     53     except Exception as e:
     54         # Normal TypeError does not print class name.
---> 55         raise type(e)(f'{obj_cls.__name__}: {e}')
     56
     57

TypeError: CycleGAN: __init__() got an unexpected keyword argument 'default_domain'

A general question about the FID metric

Hi,

I would like to ask a general question about the FID metric. If we train CycleGAN on a dataset that is very different from ImageNet, e.g. an MRI dataset, do we need to re-train the Inception-V3 model in advance? I appreciate your help.

Evaluation error about input data keys

Hi, this is great work! But I have encountered a problem.

I downloaded the checkpoint cyclegan_lsgan_resnet_in_1x1_246200_summer2winter_convert-bgr_20210411_181406-b351815d.pth, and when I run this command:

python ./tools/evaluation.py \
    configs/cyclegan/cyclegan_lsgan_resnet_in_1x1_250k_summer2winter.py \
    ./work_dirs/experiments/cyclegan_summer2winter/ckpt/cyclegan_lsgan_resnet_in_1x1_246200_summer2winter_convert-bgr_20210411_181406-b351815d.pth \
    --batch-size 10 \
    --online

I got this error:

File "/opt/mmgeneration/mmgen/core/evaluation/evaluation.py", line 264, in single_gpu_online_evaluation
    raise KeyError('Cannot found key for images in data_dict. '
KeyError: 'Cannot found key for images in data_dict. Only support `real_img` for unconditional datasets and `img` for conditional datasets.'

Looking forward to your reply! Thanks!

visualization.py (NotImplementedError: There were no tensor arguments to this function)

Describe the bug

Error during training of a sample config: cyclegan_lsgan_id0_resnet_in_1x1_270k_horse2zebra.py

At first everything is fine, but each time it reaches VisualizeUnconditionalSamples, the program stops and raises NotImplementedError.

custom_hooks = [
    dict(
        type='VisualizeUnconditionalSamples',
        output_dir='training_samples',
        interval=1000)
]

  File "/home/root/git/jay-mmgeneration/mmgen/core/hooks/visualization.py", line 73, in after_train_iter
    img_cat = torch.cat(img_list, dim=3).detach()
NotImplementedError: There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::_cat.  This usually means that this function requires a non-empty list of Tensors, or that you (the operator writer) forgot to register a fallback function.  Available functions are [CPU, CUDA, QuantizedCPU, BackendSelect, Named, ADInplaceOrView, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, UNKNOWN_TENSOR_TYPE_ID, AutogradMLC, AutogradHPU, AutogradNestedTensor, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].

Reproduction

  1. What command or script did you run?
    tools/dist_train.sh configs/cyclegan/cyclegan_lsgan_id0_resnet_in_1x1_270k_horse2zebra.py 2
    Just run it; the data is saved in the data dir.
  2. Did you make any modifications on the code or config? Did you understand what you have modified?
    Yes.
  3. What dataset did you use?

Environment

  • conda, Python 3.8
  • PyTorch 1.7
  • latest mmcv-full and mmgen

Error traceback

error.log

Bug fix

I found that VisualizeUnconditionalSamples existed in the default runtime config, so I created a new custom_hooks list without it.

custom_hooks = [
    # dict(
    #     type="MMGenVisualizationHook",
    #     output_dir="training_samples",
    #     res_name_list=["fake_b"],
    #     interval=100,
    # )
]

It seems to work now.
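For reference: the default VisualizeUnconditionalSamples hook looks for unconditional result keys such as fake_imgs, so for a translation model it collects nothing and torch.cat receives an empty list, which matches the NotImplementedError above. A sketch of the uncommented hook following the block above (res_name_list must name keys the model actually returns, e.g. fake_b for this CycleGAN config):

custom_hooks = [
    dict(
        type='MMGenVisualizationHook',
        output_dir='training_samples',
        res_name_list=['fake_b'],  # must match a key in the model's results
        interval=100)
]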

Will StyleGAN3 be supported?


Training time problem when using multiple GPUs

Question:
When I train StyleGANv2 using bash tools/dist_train.sh configs/styleganv2/stylegan2_c2_ffhq_256_b4x8_800k.py 1 (1 GPU) and bash tools/dist_train.sh configs/styleganv2/stylegan2_c2_ffhq_256_b4x8_800k.py 2 (2 GPUs), the training times are almost the same.

logs:
1. bash tools/dist_train.sh configs/styleganv2/stylegan2_c2_ffhq_256_b4x8_800k.py 1

/opt/conda/lib/python3.7/site-packages/torch/distributed/launch.py:164: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
"The module torch.distributed.launch is deprecated "
The module torch.distributed.launch is deprecated and going to be removed in future.Migrate to torch.distributed.run
WARNING:torch.distributed.run:--use_env is deprecated and will be removed in future releases.
Please read local_rank from os.environ('LOCAL_RANK') instead.
INFO:torch.distributed.launcher.api:Starting elastic_operator with launch configs:
entrypoint : tools/train.py
min_nodes : 1
max_nodes : 1
nproc_per_node : 1
run_id : none
rdzv_backend : static
rdzv_endpoint : 127.0.0.1:29500
rdzv_configs : {'rank': 0, 'timeout': 900}
max_restarts : 3
monitor_interval : 5
log_dir : None
metrics_cfg : {}

INFO:torch.distributed.elastic.agent.server.local_elastic_agent:log directory set to: /tmp/torchelastic_hym17szx/none_aujbtbrg
INFO:torch.distributed.elastic.agent.server.api:[default] starting workers for entrypoint: python
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous'ing worker group
/opt/conda/lib/python3.7/site-packages/torch/distributed/elastic/utils/store.py:53: FutureWarning: This is an experimental API and will be changed in future.
"This is an experimental API and will be changed in future.", FutureWarning
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous complete for workers. Result:
restart_count=0
master_addr=127.0.0.1
master_port=29500
group_rank=0
group_world_size=1
local_ranks=[0]
role_ranks=[0]
global_ranks=[0]
role_world_sizes=[1]
global_world_sizes=[1]

INFO:torch.distributed.elastic.agent.server.api:[default] Starting worker group
INFO:torch.distributed.elastic.multiprocessing:Setting worker0 reply file to: /tmp/torchelastic_hym17szx/none_aujbtbrg/attempt_0/0/error.json
fatal: not a git repository (or any parent up to mount point /home/ma-user)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).

2022-01-06 23:05:39,038 - mmgen - INFO - Environment info:

sys.platform: linux
Python: 3.7.10 (default, Jun 4 2021, 14:48:32) [GCC 7.5.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.2, V10.2.89
GPU 0,1: Tesla P100-PCIE-16GB
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.9.0+cu102
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 10.2
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70
  • CuDNN 7.6.5
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=7.6.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.10.0+cu102
OpenCV: 4.5.5
MMCV: 1.4.0
MMGen: 0.4.0+
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 10.2

2022-01-06 23:05:39,172 - mmgen - INFO - Distributed training: True
2022-01-06 23:05:39,303 - mmgen - INFO - Config:
dataset_type = 'UnconditionalImageDataset'
train_pipeline = [
    dict(type='LoadImageFromFile', key='real_img', io_backend='disk'),
    dict(type='Flip', keys=['real_img'], direction='horizontal'),
    dict(
        type='Normalize',
        keys=['real_img'],
        mean=[127.5, 127.5, 127.5],
        std=[127.5, 127.5, 127.5],
        to_rgb=False),
    dict(type='ImageToTensor', keys=['real_img']),
    dict(type='Collect', keys=['real_img'], meta_keys=['real_img_path'])
]
val_pipeline = [
    dict(type='LoadImageFromFile', key='real_img', io_backend='disk'),
    dict(
        type='Normalize',
        keys=['real_img'],
        mean=[127.5, 127.5, 127.5],
        std=[127.5, 127.5, 127.5],
        to_rgb=True),
    dict(type='ImageToTensor', keys=['real_img']),
    dict(type='Collect', keys=['real_img'], meta_keys=['real_img_path'])
]
data = dict(
    samples_per_gpu=4,
    workers_per_gpu=4,
    train=dict(
        type='RepeatDataset',
        times=100,
        dataset=dict(
            type='UnconditionalImageDataset',
            imgs_root='../../../data/raw/sample',
            pipeline=[
                dict(
                    type='LoadImageFromFile',
                    key='real_img',
                    io_backend='disk'),
                dict(type='Flip', keys=['real_img'], direction='horizontal'),
                dict(
                    type='Normalize',
                    keys=['real_img'],
                    mean=[127.5, 127.5, 127.5],
                    std=[127.5, 127.5, 127.5],
                    to_rgb=False),
                dict(type='ImageToTensor', keys=['real_img']),
                dict(
                    type='Collect',
                    keys=['real_img'],
                    meta_keys=['real_img_path'])
            ])),
    val=dict(
        type='UnconditionalImageDataset',
        imgs_root='../../../data/raw/sample',
        pipeline=[
            dict(type='LoadImageFromFile', key='real_img', io_backend='disk'),
            dict(
                type='Normalize',
                keys=['real_img'],
                mean=[127.5, 127.5, 127.5],
                std=[127.5, 127.5, 127.5],
                to_rgb=True),
            dict(type='ImageToTensor', keys=['real_img']),
            dict(
                type='Collect', keys=['real_img'], meta_keys=['real_img_path'])
        ]))
d_reg_interval = 16
g_reg_interval = 4
g_reg_ratio = 0.8
d_reg_ratio = 0.9411764705882353
model = dict(
    type='StaticUnconditionalGAN',
    generator=dict(
        type='StyleGANv2Generator', out_size=256, style_channels=512),
    discriminator=dict(type='StyleGAN2Discriminator', in_size=256),
    gan_loss=dict(type='GANLoss', gan_type='wgan-logistic-ns'),
    disc_auxiliary_loss=dict(
        type='R1GradientPenalty',
        loss_weight=80.0,
        interval=16,
        norm_mode='HWC',
        data_info=dict(real_data='real_imgs', discriminator='disc')),
    gen_auxiliary_loss=dict(
        type='GeneratorPathRegularizer',
        loss_weight=8.0,
        pl_batch_shrink=2,
        interval=4,
        data_info=dict(generator='gen', num_batches='batch_size')))
train_cfg = dict(use_ema=True)
test_cfg = None
optimizer = dict(
    generator=dict(type='Adam', lr=0.0016, betas=(0, 0.9919919678228657)),
    discriminator=dict(
        type='Adam', lr=0.0018823529411764706, betas=(0, 0.9905854573074332)))
checkpoint_config = dict(interval=10000, by_epoch=False, max_keep_ckpts=30)
log_config = dict(interval=100, hooks=[dict(type='TextLoggerHook')])
custom_hooks = [
    dict(
        type='VisualizeUnconditionalSamples',
        output_dir='training_samples',
        interval=5000),
    dict(
        type='ExponentialMovingAverageHook',
        module_keys=('generator_ema', ),
        interval=1,
        interp_cfg=dict(momentum=0.9977843871238888),
        priority='VERY_HIGH')
]
runner = dict(
    type='DynamicIterBasedRunner',
    is_dynamic_ddp=True,
    pass_training_status=True)
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = '../../../model_zoo/mmgeneration/0.4.0/styleganv2/stylegan2_c2_ffhq_256_b4x8_20210407_160709-7890ae1f.pth'
resume_from = None
workflow = [('train', 10000)]
find_unused_parameters = True
cudnn_benchmark = True
ema_half_life = 10.0
lr_config = None
total_iters = 5000
metrics = dict(
    fid50k=dict(
        type='FID',
        num_images=40000,
        inception_pkl='../../../output/inception_pkl/sample.pkl',
        bgr2rgb=True),
    pr50k3=dict(type='PR', num_images=40000, k=3),
    ppl_wend=dict(type='PPL', space='W', sampling='end', num_images=40000))
evaluation = dict(
    type='GenerativeEvalHook',
    interval=10000,
    metrics=dict(
        type='FID',
        num_images=40000,
        inception_pkl='../../../output/inception_pkl/caleba.pkl',
        bgr2rgb=True),
    sample_kwargs=dict(sample_model='ema'))
work_dir = './work_dirs/stylegan2_c2_ffhq_256_b4x8_800k'
gpu_ids = range(0, 1)

2022-01-06 23:05:39,304 - mmgen - INFO - Set random seed to 2021, deterministic: False
2022-01-06 23:05:41,097 - mmgen - INFO - dataset_name: <class 'mmgen.datasets.unconditional_image_dataset.UnconditionalImageDataset'>, total 40000 images in imgs_root: ../../../data/raw/sample
fatal: not a git repository (or any parent up to mount point /home/ma-user)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
2022-01-06 23:05:44,954 - mmgen - INFO - dataset_name: <class 'mmgen.datasets.unconditional_image_dataset.UnconditionalImageDataset'>, total 40000 images in imgs_root: ../../../data/raw/sample
/home/ma-user/anaconda3/envs/PyTorch-1.8/lib/python3.7/site-packages/torchvision/models/inception.py:82: FutureWarning: The default weight initialization of inception_v3 will be changed in future releases of torchvision. If you wish to keep the old behavior (which leads to long initialization times due to scipy/scipy#11299), please set init_weights=True.
' due to scipy/scipy#11299), please set init_weights=True.', FutureWarning)
2022-01-06 23:08:41,239 - mmgen - INFO - FID: Adopt Inception in pytorch style
2022-01-06 23:08:41,271 - mmgen - INFO - Load reference inception pkl from ../../../output/inception_pkl/caleba.pkl
2022-01-06 23:08:41,272 - mmgen - INFO - load checkpoint from local path: ../../../model_zoo/mmgeneration/0.4.0/styleganv2/stylegan2_c2_ffhq_256_b4x8_20210407_160709-7890ae1f.pth
2022-01-06 23:08:41,581 - mmgen - INFO - Start running, host: ma-user@notebook-8057cc96-2d5a-4a22-9d1f-754b14633e6b, work_dir: /home/ma-user/work/mmgen/algorithms/mmgeneration/algorithm/work_dirs/stylegan2_c2_ffhq_256_b4x8_800k
2022-01-06 23:08:41,582 - mmgen - INFO - workflow: [('train', 10000)], max: 5000 iters
2022-01-06 23:08:41,582 - mmgen - INFO - Checkpoints will be saved to ./work_dirs/stylegan2_c2_ffhq_256_b4x8_800k/ckpt/stylegan2_c2_ffhq_256_b4x8_800k by HardDiskBackend.
/home/ma-user/work/mmgen/algorithms/mmgeneration/algorithm/mmgen/ops/conv2d_gradfix.py:190: UserWarning: conv2d_gradfix not supported on PyTorch 1.9.0+cu102. Falling back to torch.nn.functional.conv2d().
f'conv2d_gradfix not supported on PyTorch {torch.__version__}. '
2022-01-06 23:10:34,715 - mmgen - INFO - Iter [100/5000] lr_generator: 1.600e-03 lr_discriminator: 1.882e-03, eta: 1:11:51, time: 0.880, data_time: 0.003, memory: 12927, loss_disc_fake_g: 1.3611, loss_path_regular: 0.0610, loss: 1.1599, loss_disc_fake: 0.4977, loss_disc_real: 0.5007, loss_r1_gp: 2.3067
2022-01-06 23:11:54,224 - mmgen - INFO - Iter [200/5000] lr_generator: 1.600e-03 lr_discriminator: 1.882e-03, eta: 1:06:59, time: 0.795, data_time: 0.003, memory: 12927, loss_disc_fake_g: 1.0434, loss_path_regular: 0.0636, loss: 1.2430, loss_disc_fake: 0.5775, loss_disc_real: 0.5714, loss_r1_gp: 1.9658
2022-01-06 23:13:13,804 - mmgen - INFO - Iter [300/5000] lr_generator: 1.600e-03 lr_discriminator: 1.882e-03, eta: 1:04:30, time: 0.796, data_time: 0.003, memory: 12927, loss_disc_fake_g: 1.0855, loss_path_regular: 0.0574, loss: 1.2338, loss_disc_fake: 0.5709, loss_disc_real: 0.5621, loss_r1_gp: 1.8754

2. bash tools/dist_train.sh configs/styleganv2/stylegan2_c2_ffhq_256_b4x8_800k.py 2
/opt/conda/lib/python3.7/site-packages/torch/distributed/launch.py:164: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
"The module torch.distributed.launch is deprecated "
The module torch.distributed.launch is deprecated and going to be removed in future.Migrate to torch.distributed.run


Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.


WARNING:torch.distributed.run:--use_env is deprecated and will be removed in future releases.
Please read local_rank from os.environ('LOCAL_RANK') instead.
INFO:torch.distributed.launcher.api:Starting elastic_operator with launch configs:
entrypoint : tools/train.py
min_nodes : 1
max_nodes : 1
nproc_per_node : 2
run_id : none
rdzv_backend : static
rdzv_endpoint : 127.0.0.1:29500
rdzv_configs : {'rank': 0, 'timeout': 900}
max_restarts : 3
monitor_interval : 5
log_dir : None
metrics_cfg : {}

INFO:torch.distributed.elastic.agent.server.local_elastic_agent:log directory set to: /tmp/torchelastic_x1xrygdm/none_7_m468vp
INFO:torch.distributed.elastic.agent.server.api:[default] starting workers for entrypoint: python
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous'ing worker group
/opt/conda/lib/python3.7/site-packages/torch/distributed/elastic/utils/store.py:53: FutureWarning: This is an experimental API and will be changed in future.
"This is an experimental API and will be changed in future.", FutureWarning
INFO:torch.distributed.elastic.agent.server.api:[default] Rendezvous complete for workers. Result:
restart_count=0
master_addr=127.0.0.1
master_port=29500
group_rank=0
group_world_size=1
local_ranks=[0, 1]
role_ranks=[0, 1]
global_ranks=[0, 1]
role_world_sizes=[2, 2]
global_world_sizes=[2, 2]

INFO:torch.distributed.elastic.agent.server.api:[default] Starting worker group
INFO:torch.distributed.elastic.multiprocessing:Setting worker0 reply file to: /tmp/torchelastic_x1xrygdm/none_7_m468vp/attempt_0/0/error.json
INFO:torch.distributed.elastic.multiprocessing:Setting worker1 reply file to: /tmp/torchelastic_x1xrygdm/none_7_m468vp/attempt_0/1/error.json
fatal: not a git repository (or any parent up to mount point /home/ma-user)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
fatal: not a git repository (or any parent up to mount point /home/ma-user)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
2022-01-06 23:14:12,988 - mmgen - INFO - Environment info:

sys.platform: linux
Python: 3.7.10 (default, Jun 4 2021, 14:48:32) [GCC 7.5.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Cuda compilation tools, release 10.2, V10.2.89
GPU 0,1: Tesla P100-PCIE-16GB
GCC: gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
PyTorch: 1.9.0+cu102
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v2.1.2 (Git Hash 98be7e8afa711dc9b66c8ff3504129cb82013cdb)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 10.2
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70
  • CuDNN 7.6.5
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=7.6.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.9.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.10.0+cu102
OpenCV: 4.5.5
MMCV: 1.4.0
MMGen: 0.4.0+
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 10.2

2022-01-06 23:14:13,117 - mmgen - INFO - Distributed training: True
2022-01-06 23:14:13,247 - mmgen - INFO - Config:
dataset_type = 'UnconditionalImageDataset'
train_pipeline = [
dict(type='LoadImageFromFile', key='real_img', io_backend='disk'),
dict(type='Flip', keys=['real_img'], direction='horizontal'),
dict(
type='Normalize',
keys=['real_img'],
mean=[127.5, 127.5, 127.5],
std=[127.5, 127.5, 127.5],
to_rgb=False),
dict(type='ImageToTensor', keys=['real_img']),
dict(type='Collect', keys=['real_img'], meta_keys=['real_img_path'])
]
val_pipeline = [
dict(type='LoadImageFromFile', key='real_img', io_backend='disk'),
dict(
type='Normalize',
keys=['real_img'],
mean=[127.5, 127.5, 127.5],
std=[127.5, 127.5, 127.5],
to_rgb=True),
dict(type='ImageToTensor', keys=['real_img']),
dict(type='Collect', keys=['real_img'], meta_keys=['real_img_path'])
]
data = dict(
samples_per_gpu=4,
workers_per_gpu=4,
train=dict(
type='RepeatDataset',
times=100,
dataset=dict(
type='UnconditionalImageDataset',
imgs_root='../../../data/raw/sample',
pipeline=[
dict(
type='LoadImageFromFile',
key='real_img',
io_backend='disk'),
dict(type='Flip', keys=['real_img'], direction='horizontal'),
dict(
type='Normalize',
keys=['real_img'],
mean=[127.5, 127.5, 127.5],
std=[127.5, 127.5, 127.5],
to_rgb=False),
dict(type='ImageToTensor', keys=['real_img']),
dict(
type='Collect',
keys=['real_img'],
meta_keys=['real_img_path'])
])),
val=dict(
type='UnconditionalImageDataset',
imgs_root='../../../data/raw/sample',
pipeline=[
dict(type='LoadImageFromFile', key='real_img', io_backend='disk'),
dict(
type='Normalize',
keys=['real_img'],
mean=[127.5, 127.5, 127.5],
std=[127.5, 127.5, 127.5],
to_rgb=True),
dict(type='ImageToTensor', keys=['real_img']),
dict(
type='Collect', keys=['real_img'], meta_keys=['real_img_path'])
]))
d_reg_interval = 16
g_reg_interval = 4
g_reg_ratio = 0.8
d_reg_ratio = 0.9411764705882353
model = dict(
type='StaticUnconditionalGAN',
generator=dict(
type='StyleGANv2Generator', out_size=256, style_channels=512),
discriminator=dict(type='StyleGAN2Discriminator', in_size=256),
gan_loss=dict(type='GANLoss', gan_type='wgan-logistic-ns'),
disc_auxiliary_loss=dict(
type='R1GradientPenalty',
loss_weight=80.0,
interval=16,
norm_mode='HWC',
data_info=dict(real_data='real_imgs', discriminator='disc')),
gen_auxiliary_loss=dict(
type='GeneratorPathRegularizer',
loss_weight=8.0,
pl_batch_shrink=2,
interval=4,
data_info=dict(generator='gen', num_batches='batch_size')))
train_cfg = dict(use_ema=True)
test_cfg = None
optimizer = dict(
generator=dict(type='Adam', lr=0.0016, betas=(0, 0.9919919678228657)),
discriminator=dict(
type='Adam', lr=0.0018823529411764706, betas=(0, 0.9905854573074332)))
checkpoint_config = dict(interval=10000, by_epoch=False, max_keep_ckpts=30)
log_config = dict(interval=100, hooks=[dict(type='TextLoggerHook')])
custom_hooks = [
dict(
type='VisualizeUnconditionalSamples',
output_dir='training_samples',
interval=5000),
dict(
type='ExponentialMovingAverageHook',
module_keys=('generator_ema', ),
interval=1,
interp_cfg=dict(momentum=0.9977843871238888),
priority='VERY_HIGH')
]
runner = dict(
type='DynamicIterBasedRunner',
is_dynamic_ddp=True,
pass_training_status=True)
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = '../../../model_zoo/mmgeneration/0.4.0/styleganv2/stylegan2_c2_ffhq_256_b4x8_20210407_160709-7890ae1f.pth'
resume_from = None
workflow = [('train', 10000)]
find_unused_parameters = True
cudnn_benchmark = True
ema_half_life = 10.0
lr_config = None
total_iters = 5000
metrics = dict(
fid50k=dict(
type='FID',
num_images=40000,
inception_pkl='../../../output/inception_pkl/caleba.pkl',
bgr2rgb=True),
pr50k3=dict(type='PR', num_images=40000, k=3),
ppl_wend=dict(type='PPL', space='W', sampling='end', num_images=40000))
evaluation = dict(
type='GenerativeEvalHook',
interval=10000,
metrics=dict(
type='FID',
num_images=40000,
inception_pkl='../../../output/inception_pkl/caleba.pkl',
bgr2rgb=True),
sample_kwargs=dict(sample_model='ema'))
work_dir = './work_dirs/stylegan2_c2_ffhq_256_b4x8_800k'
gpu_ids = range(0, 2)

2022-01-06 23:14:13,247 - mmgen - INFO - Set random seed to 2021, deterministic: False
2022-01-06 23:14:15,078 - mmgen - INFO - dataset_name: <class 'mmgen.datasets.unconditional_image_dataset.UnconditionalImageDataset'>, total 40000 images in imgs_root: ../../../data/raw/sample
fatal: not a git repository (or any parent up to mount point /home/ma-user)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
fatal: not a git repository (or any parent up to mount point /home/ma-user)
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
2022-01-06 23:14:18,943 - mmgen - INFO - dataset_name: <class 'mmgen.datasets.unconditional_image_dataset.UnconditionalImageDataset'>, total 40000 images in imgs_root: ../../../data/raw/sample
/home/ma-user/anaconda3/envs/PyTorch-1.8/lib/python3.7/site-packages/torchvision/models/inception.py:82: FutureWarning: The default weight initialization of inception_v3 will be changed in future releases of torchvision. If you wish to keep the old behavior (which leads to long initialization times due to scipy/scipy#11299), please set init_weights=True.
' due to scipy/scipy#11299), please set init_weights=True.', FutureWarning)
/home/ma-user/anaconda3/envs/PyTorch-1.8/lib/python3.7/site-packages/torchvision/models/inception.py:82: FutureWarning: The default weight initialization of inception_v3 will be changed in future releases of torchvision. If you wish to keep the old behavior (which leads to long initialization times due to scipy/scipy#11299), please set init_weights=True.
' due to scipy/scipy#11299), please set init_weights=True.', FutureWarning)
2022-01-06 23:17:24,405 - mmgen - INFO - FID: Adopt Inception in pytorch style
2022-01-06 23:17:24,451 - mmgen - INFO - Load reference inception pkl from ../../../output/inception_pkl/caleba.pkl
2022-01-06 23:17:24,451 - mmgen - INFO - load checkpoint from local path: ../../../model_zoo/mmgeneration/0.4.0/styleganv2/stylegan2_c2_ffhq_256_b4x8_20210407_160709-7890ae1f.pth
2022-01-06 23:17:24,765 - mmgen - INFO - Start running, host: ma-user@notebook-8057cc96-2d5a-4a22-9d1f-754b14633e6b, work_dir: /home/ma-user/work/mmgen/algorithms/mmgeneration/algorithm/work_dirs/stylegan2_c2_ffhq_256_b4x8_800k
2022-01-06 23:17:24,765 - mmgen - INFO - workflow: [('train', 10000)], max: 5000 iters
2022-01-06 23:17:24,766 - mmgen - INFO - Checkpoints will be saved to ./work_dirs/stylegan2_c2_ffhq_256_b4x8_800k/ckpt/stylegan2_c2_ffhq_256_b4x8_800k by HardDiskBackend.
/home/ma-user/work/mmgen/algorithms/mmgeneration/algorithm/mmgen/ops/conv2d_gradfix.py:190: UserWarning: conv2d_gradfix not supported on PyTorch 1.9.0+cu102. Falling back to torch.nn.functional.conv2d().
f'conv2d_gradfix not supported on PyTorch {torch.__version__}. '
/home/ma-user/work/mmgen/algorithms/mmgeneration/algorithm/mmgen/ops/conv2d_gradfix.py:190: UserWarning: conv2d_gradfix not supported on PyTorch 1.9.0+cu102. Falling back to torch.nn.functional.conv2d().
f'conv2d_gradfix not supported on PyTorch {torch.__version__}. '
2022-01-06 23:19:30,464 - mmgen - INFO - Iter [100/5000] lr_generator: 1.600e-03 lr_discriminator: 1.882e-03, eta: 1:22:01, time: 1.004, data_time: 0.003, memory: 12927, loss_disc_fake_g: 1.2576, loss_path_regular: 0.0464, loss: 1.1665, loss_disc_fake: 0.5016, loss_disc_real: 0.4978, loss_r1_gp: 2.3874
2022-01-06 23:21:01,678 - mmgen - INFO - Iter [200/5000] lr_generator: 1.600e-03 lr_discriminator: 1.882e-03, eta: 1:16:39, time: 0.912, data_time: 0.003, memory: 12927, loss_disc_fake_g: 1.0306, loss_path_regular: 0.0424, loss: 1.2528, loss_disc_fake: 0.5724, loss_disc_real: 0.5765, loss_r1_gp: 2.0846
2022-01-06 23:22:32,770 - mmgen - INFO - Iter [300/5000] lr_generator: 1.600e-03 lr_discriminator: 1.882e-03, eta: 1:13:49, time: 0.911, data_time: 0.003, memory: 12927, loss_disc_fake_g: 0.9520, loss_path_regular: 0.0407, loss: 1.2777, loss_disc_fake: 0.5932, loss_disc_real: 0.5889, loss_r1_gp: 1.9291
2022-01-06 23:24:03,901 - mmgen - INFO - Iter [400/5000] lr_generator: 1.600e-03 lr_discriminator: 1.882e-03, eta: 1:11:39, time: 0.911, data_time: 0.003, memory: 12927, loss_disc_fake_g: 0.9833, loss_path_regular: 0.0393, loss: 1.2634, loss_disc_fake: 0.5889, loss_disc_real: 0.5915, loss_r1_gp: 1.7984
2022-01-06 23:25:35,512 - mmgen - INFO - Iter [500/5000] lr_generator: 1.600e-03 lr_discriminator: 1.882e-03, eta: 1:09:49, time: 0.916, data_time: 0.003, memory: 12927, loss_disc_fake_g: 0.9344, loss_path_regular: 0.0372, loss: 1.2855, loss_disc_fake: 0.5944, loss_disc_real: 0.5857, loss_r1_gp: 1.7347

Interpolate through a set of latent values?

Hi,

Is it possible to interpolate using a set of different latent variables, rather than the random endpoints I'm guessing it currently uses?

I can see from the StyleGAN projector that you can get the latent_n value for a specified image. If I were to save this value, would it be possible to use it somehow within the interpolate_sample.py code? I would like to generate a series of images which are the interpolations between N+ images from the training set (in this case using the closest latent_n).

I was just curious if anyone knew of a good way to do it :)
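For anyone looking for a starting point, here is a minimal sketch of one way to do this. Everything in it is an assumption rather than a documented mmgen recipe: the config/checkpoint paths and the latent_a.pt/latent_b.pt files are hypothetical (the latter saved from the projector), and it assumes the generator accepts input_is_latent=True as in the upstream StyleGAN2 implementation.

import torch
from mmgen.apis import init_model

# Hypothetical paths: substitute your own config, checkpoint and saved latents.
model = init_model('configs/styleganv2/stylegan2_c2_ffhq_256_b4x8_800k.py',
                   'work_dirs/ckpt/stylegan2.pth', device='cuda:0')
generator = model.generator_ema if getattr(model, 'use_ema', False) \
    else model.generator
generator.eval()

# Latents saved from the projector, assumed shape (1, style_channels).
latent_a = torch.load('latent_a.pt').cuda()
latent_b = torch.load('latent_b.pt').cuda()

frames = []
with torch.no_grad():
    for t in torch.linspace(0, 1, steps=10):
        # Linear interpolation in W space between two projected codes.
        latent = torch.lerp(latent_a, latent_b, t.item())
        frames.append(generator(latent, input_is_latent=True).cpu())

For Z-space endpoints, spherical interpolation (slerp) is usually preferred over plain lerp.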

AssertionError in metrics.py

Thanks for your error report and we appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. I have read the FAQ documentation but cannot get the expected help.
  3. The bug has not been fixed in the latest version.

Describe the bug
A clear and concise description of what the bug is.

Reproduction

  1. What command or script did you run?
sh tools/slurm_train.sh test mmg_cifar10 configs/biggan/biggan_cifar10_32x32_b25x2_500k.py exps/biggan-cifar10
  2. Did you make any modifications on the code or config? Did you understand what you have modified?
    No
  3. What dataset did you use?
    CIFAR10

Environment

  1. Please run python mmgen/utils/collect_env.py to collect necessary environment information and paste it here.
  2. You may add additional information that may be helpful for locating the problem, such as
    • How you installed PyTorch [e.g., pip, conda, source]
    • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

Error traceback
If applicable, paste the error traceback here.

INFO:mmgen:Sample 50000 fake images for evaluation
[                                                  ] 0/50000, elapsed: 0s, ETA:2021-09-05 15:49:32,938 - mmgen - INFO - `use_pil_resize` is set as True, apply Bicubic interpolation with Pillow backend. We perform type conversion between torch.tensor and PIL.Image in this function and make this process a little bit slow.
INFO:mmgen:`use_pil_resize` is set as True, apply Bicubic interpolation with Pillow backend. We perform type conversion between torch.tensor and PIL.Image in this function and make this process a little bit slow.
[>>>>>>>>>>>>>>>>>>>>>>>>] 50000/50000, 50.7 task/s, elapsed: 985s, ETA:     0s
Traceback (most recent call last):
  File "tools/train.py", line 163, in <module>
    main()
  File "tools/train.py", line 159, in main
    meta=meta)
  File "~/.local/lib/python3.6/site-packages/mmgen-0.3.0-py3.6.egg/mmgen/apis/train.py", line 198, in train_model
    runner.run(data_loaders, cfg.workflow, cfg.total_iters)
  File "~.local/lib/python3.6/site-packages/mmgen-0.3.0-py3.6.egg/mmgen/core/runners/dynamic_iterbased_runner.py", line 285, in run
    iter_runner(iter_loaders[i], **kwargs)
  File "~/.local/lib/python3.6/site-packages/mmgen-0.3.0-py3.6.egg/mmgen/core/runners/dynamic_iterbased_runner.py", line 236, in train
    self.call_hook('after_train_iter')
  File "~/.local/lib/python3.6/site-packages/mmgen-0.3.0-py3.6.egg/mmgen/core/runners/dynamic_iterbased_runner.py", line 181, in call_hook
    getattr(hook, fn_name)(self)
  File "~/.local/lib/python3.6/site-packages/mmgen-0.3.0-py3.6.egg/mmgen/core/evaluation/eval_hooks.py", line 283, in after_train_iter
    metric.summary()
  File "~/.conda/conda_envs/miniconda3/envs/openmm-lab/lib/python3.6/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "~/.local/lib/python3.6/site-packages/mmgen-0.3.0-py3.6.egg/mmgen/core/evaluation/metrics.py", line 575, in summary
    assert feats.shape[0] >= self.num_images
AssertionError
srun: error:task 0: Exited with exit code 1
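For context, here is a standalone illustration (with made-up numbers, reconstructed from the traceback rather than the exact mmgen source) of the check that fires: summary() requires that at least num_images features were actually collected, so the assertion points to fewer samples reaching the metric than it was configured for.

import torch

# Made-up shapes: the metric gathers feature batches, concatenates them,
# and then requires at least `num_images` rows before computing the score.
num_images = 50000
collected = [torch.randn(10000, 2048) for _ in range(4)]  # only 40k gathered
feats = torch.cat(collected, dim=0)
assert feats.shape[0] >= num_images  # -> AssertionError, as in the log above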

Bug fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!

configs/styleganv2/stylegan2_c2_apex_fp16_PL-R1-no-scaler_ffhq_256_b4x8_800k.py

Thanks for your error report and we appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. I have read the FAQ documentation but cannot get the expected help.
  3. The bug has not been fixed in the latest version.

Describe the bug
A clear and concise description of what the bug is.

Reproduction

  1. What command or script did you run?
A placeholder for the command.
  2. Did you make any modifications on the code or config? Did you understand what you have modified?
  3. What dataset did you use?

Environment

  1. Please run python mmgen/utils/collect_env.py to collect necessary environment information and paste it here.
  2. You may add additional information that may be helpful for locating the problem, such as
    • How you installed PyTorch [e.g., pip, conda, source]
    • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

Error traceback
If applicable, paste the error traceback here.

A placeholder for traceback.

Bug fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!
[screenshot attached]

Padding at convolution module in SinGANMSGeneratorPE's generator while using explicit positional encoding (e.g. CSG, SPE)

Describe the issue

According to "Positional Encoding as Spatial Inductive Bias in GANs", zero padding leads to an unbalanced spatial bias with vague relation between locations. Thorough out the paper, they propose other explicit positional encoding such as cartesian grid or sinusoidal positional encodings. While using these explicit positional encoding, they remove padding from convolution generators and replace them with bilinear upsampling.

However, according to configuration in this project (e.g. https://github.com/open-mmlab/mmgeneration/blob/master/configs/positional_encoding_in_gans/singan_csg_bohemian.py,), I found out that these implementation use padding with size 1 at convolution module. I wonder that this is an exact reimplementation of paper.

  1. What config did you run?
_base_ = ['../singan/singan_fish.py']

num_scales = 10  # start from zero
model = dict(
    type='PESinGAN',
    generator=dict(
        type='SinGANMSGeneratorPE',
        num_scales=num_scales,
        padding=1,
        pad_at_head=False,
        first_stage_in_channels=2,
        positional_encoding=dict(type='CSG')),
    discriminator=dict(num_scales=num_scales))

train_cfg = dict(first_fixed_noises_ch=2)

data = dict(
    train=dict(
        img_path='./data/singan/bohemian.png',
        min_size=25,
        max_size=500,
    ))

dist_params = dict(backend='nccl', port=28120)
total_iters = 22000
  2. Did you make any modifications on the code or config? Did you understand what you have modified?

No, I haven't changed anything.

  3. What dataset did you use?

I used balloons.png, which is provided in the original SinGAN repository.

Environment

sys.platform: linux
Python: 3.8.10 (default, Sep 28 2021, 16:10:42) [GCC 9.3.0]
CUDA available: True
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.1.TC455_06.29190527_0
GPU 0,1,2,3,4,5: NVIDIA RTX A6000
GCC: gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
PyTorch: 1.8.1+cu111
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  • CuDNN 8.0.5
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.9.1+cu111
OpenCV: 4.2.0
MMCV: 1.4.0
MMGen: 0.4.0+ac1c630
MMCV Compiler: GCC 9.3
MMCV CUDA Compiler: 11.1

Results

Currently I don't have any results from removing the padding in the convolution modules.
I will reproduce the results as soon as possible.
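Purely as an illustration of the paper's recipe (not a verified mmgen config, and whether this combination matches the released setup is exactly the question above), one would expect the explicit-PE variant to disable zero padding in the conv blocks, along these lines:

# Hypothetical config sketch: only `padding` and `pad_at_head` are changed
# relative to the snippet above; their exact semantics should be confirmed
# against the mmgen implementation.
model = dict(
    type='PESinGAN',
    generator=dict(
        type='SinGANMSGeneratorPE',
        num_scales=10,
        padding=0,           # no zero padding inside the conv blocks
        pad_at_head=True,    # restore the spatial size at the head instead
        first_stage_in_channels=2,
        positional_encoding=dict(type='CSG')),
    discriminator=dict(num_scales=10))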

height and width in Resize pipeline are misplaced

Reproduction

from mmgen.datasets.pipelines import Resize, LoadImageFromFile
results = {'real_img_path': 'path-to-img/0001.jpg'}
t = LoadImageFromFile(key='real_img', io_backend='disk')(results)
tt = Resize(keys=['real_img'], scale=(128, 32), keep_ratio=False)(t)

Bug fix
I think it's because mmcv.imresize uses (width, height) for the scale argument.
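A quick runnable check of this (mmcv.imresize documents its size argument as (w, h)):

import numpy as np
import mmcv

img = np.zeros((32, 128, 3), dtype=np.uint8)  # ndarray shape is (h, w, c)
out = mmcv.imresize(img, (128, 32))           # mmcv expects size=(w, h)
print(out.shape)  # (32, 128, 3): width comes first in the size tuple,
                  # so a scale written as (h, w) ends up transposed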

Generated images got oil painting style and is blurry

I trained StyleGAN2 for about 140k iterations (samples_per_gpu=3, px=1024) and visualized the training results. I found that the generated images tend to have an oil-painting style and are a little blurry. What can I do to avoid this?
Looking forward to your reply, thanks!
[sample image: 5_cvt]

Verification failed in Win10

Thanks for your error report and we appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. I have read the FAQ documentation but cannot get the expected help.
  3. The bug has not been fixed in the latest version.

Describe the bug
A clear and concise description of what the bug is.

Reproduction

  1. What command or script did you run?
A placeholder for the command.
  2. Did you make any modifications on the code or config? Did you understand what you have modified?
  3. What dataset did you use?

Environment

  1. Please run python mmgen/utils/collect_env.py to collect necessary environment information and paste it here.
  2. You may add additional information that may be helpful for locating the problem, such as
    • How you installed PyTorch [e.g., pip, conda, source]
    • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

Error traceback
If applicable, paste the error traceback here.

A placeholder for traceback.

Bug fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!


I have read the aforementioned list and still could not fix the problem. The way to reproduce my problem follows:

  1. Install mmgeneration with the following scripts:
    conda create -n open-mmlab python=3.7 -y
    conda activate open-mmlab

conda install pytorch==1.7.0 torchvision==0.8.0 cudatoolkit=10.1 -c pytorch -y

install the latest mmcv

pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu101/torch1.7.0/index.html

install mmgeneration

git clone https://github.com/open-mmlab/mmgeneration.git
cd mmgeneration
pip install -r requirements.txt
pip install -v -e .

  2. Ran verification and found that 'from mmcv.ops import get_compiling_cuda_version, get_compiler_version' causes an ImportError:
    (open-mmlab1) C:\Users\13456\mmgeneration>python
    Python 3.7.10 (default, Feb 26 2021, 13:06:18) [MSC v.1916 64 bit (AMD64)] :: Anaconda, Inc. on win32
    Type "help", "copyright", "credits" or "license" for more information.

>>> from mmcv.ops import get_compiling_cuda_version, get_compiler_version
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Users\13456\anaconda3\envs\open-mmlab1\lib\site-packages\mmcv\ops\__init__.py", line 1, in <module>
    from .bbox import bbox_overlaps
  File "C:\Users\13456\anaconda3\envs\open-mmlab1\lib\site-packages\mmcv\ops\bbox.py", line 3, in <module>
    ext_module = ext_loader.load_ext('ext', ['bbox_overlaps'])
  File "C:\Users\13456\anaconda3\envs\open-mmlab1\lib\site-packages\mmcv\utils\ext_loader.py", line 12, in load_ext
    ext = importlib.import_module('mmcv.' + name)
  File "C:\Users\13456\anaconda3\envs\open-mmlab1\lib\importlib\__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ImportError: DLL load failed: The specified procedure could not be found.

  3. Tried this and could not get useful info:
    (open-mmlab1) C:\Users\13456\mmgeneration>python mmgen/utils/collect_env.py
    'tail' is not recognized as an internal or external command, operable program or batch file.
    'gcc' is not recognized as an internal or external command, operable program or batch file.
    Traceback (most recent call last):
      File "mmgen/utils/collect_env.py", line 69, in <module>
        for name, val in collect_env().items():
      File "mmgen/utils/collect_env.py", line 44, in collect_env
        gcc = subprocess.check_output('gcc --version | head -n1', shell=True)
      File "C:\Users\13456\anaconda3\envs\open-mmlab1\lib\subprocess.py", line 411, in check_output
        **kwargs).stdout
      File "C:\Users\13456\anaconda3\envs\open-mmlab1\lib\subprocess.py", line 512, in run
        output=stdout, stderr=stderr)
    subprocess.CalledProcessError: Command 'gcc --version | head -n1' returned non-zero exit status 255.
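A minimal defensive sketch for collect_env on Windows (an assumption about the intent, not the actual patch): guard the POSIX-only call so the script degrades gracefully when gcc/head are not on PATH.

import subprocess

try:
    gcc = subprocess.check_output('gcc --version | head -n1', shell=True)
    gcc = gcc.decode('utf-8').strip()
except subprocess.CalledProcessError:
    gcc = 'n/a'  # gcc/head unavailable, e.g. on a stock Windows shell
print('GCC:', gcc)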

configs/styleganv1/styleganv1_ffhq_1024_g8_25Mimg.py

Thanks for your error report and we appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. I have read the FAQ documentation but cannot get the expected help.
  3. The bug has not been fixed in the latest version.

Describe the bug
A clear and concise description of what the bug is.

Reproduction

  1. What command or script did you run?
A placeholder for the command.
  2. Did you make any modifications on the code or config? Did you understand what you have modified?
  3. What dataset did you use?

Environment

  1. Please run python mmgen/utils/collect_env.py to collect necessary environment information and paste it here.
  2. You may add additional information that may be helpful for locating the problem, such as
    • How you installed PyTorch [e.g., pip, conda, source]
    • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

Error traceback
If applicable, paste the error traceback here.

A placeholder for traceback.

Bug fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!
[screenshot attached]

configs/pggan/pggan_celeba-cropped_128_g8_12Mimgs.py

Thanks for your error report and we appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. I have read the FAQ documentation but cannot get the expected help.
  3. The bug has not been fixed in the latest version.

Describe the bug
A clear and concise description of what the bug is.

Reproduction

  1. What command or script did you run?
python tools/train.py configs/pggan/pggan_celeba-cropped_128_g8_12Mimgs.py
  2. Did you make any modifications on the code or config? Did you understand what you have modified?
  3. What dataset did you use?

Environment

  1. Please run python mmgen/utils/collect_env.py to collect necessary environment information and paste it here.
  2. You may add additional information that may be helpful for locating the problem, such as
    • How you installed PyTorch [e.g., pip, conda, source]
    • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

Error traceback
If applicable, paste the error traceback here.

A placeholder for traceback.

Bug fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!
[screenshot attached]

cannot import name 'CUDA_HOME' from 'mmcv.utils.parrots_wrapper'

When I run the code, I meet this problem. Can you help me?

mmgeneration/mmgen/utils/collect_env.py", line 26, in collect_env
from mmcv.utils.parrots_wrapper import CUDA_HOME
ImportError: cannot import name 'CUDA_HOME' from 'mmcv.utils.parrots_wrapper'
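This usually points to a version mismatch between mmgen and the installed mmcv (some mmcv versions do not export CUDA_HOME from that module). A hedged workaround sketch, falling back to torch's own constant:

# Workaround sketch, not an official fix; aligning the mmcv/mmgen versions
# recommended in the installation docs is the proper solution.
try:
    from mmcv.utils.parrots_wrapper import CUDA_HOME
except ImportError:
    from torch.utils.cpp_extension import CUDA_HOME
print(CUDA_HOME)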

Position Encoding with SinGAN demo

I was trying to generate samples with existing models. I want to use the singan_spe model proposed in "Positional Encoding as Spatial Inductive Bias in GANs".

I downloaded the pretrained model to the download directory.

My command is this.

python3 demo/unconditional_demo.py configs/positional_encoding_in_gans/singan_spe-dim4_bohemian.py download/singan_spe-dim4_bohemian_20210406_175820-6e484a35.pth

But I got this error message:

/mmgeneration/mmgen/models/gans/singan.py", line 198, in sample_from_noise
assert self.use_ema

After changing sample_model to 'orig', I still got an error message:
/mmgeneration/mmgen/models/gans/singan.py", line 212, in sample_from_noise
if not self.fixed_noises[0].is_cuda and torch.cuda.is_available():

I guess fixed_noises should exist, and there is no way to create fixed_noises without training.
Is there any way to sample from the pretrained singan_spe model?

RuntimeError: CUDA error: no kernel image is available for execution on the device

Describe the bug
RuntimeError: CUDA error: no kernel image is available for execution on the device

Reproduction

  1. What command or script did you run?
  from mmgen.apis import init_model, sample_uncoditional_model
  
  config_file = 'configs/styleganv2/stylegan2_c2_lsun-church_256_b4x8_800k.py'
  # you can download this checkpoint in advance and use a local file path.
  checkpoint_file = 'http://download.openmmlab.com/mmgen/stylegan2/official_weights/stylegan2-church-config-f-official_20210327_172657-1d42b7d1.pth'
  device = 'cuda:0'
  # init a generative model
  model = init_model(config_file, checkpoint_file, device=device)
  # sample images
  fake_imgs = sample_uncoditional_model(model, 4)

Environment
sys.platform: linux
Python: 3.8.3 (default, May 19 2020, 18:47:26) [GCC 7.3.0]
CUDA available: True
GPU 0,1,2,3: GeForce RTX 3090
CUDA_HOME: /usr/local/cuda
NVCC: Build cuda_11.2.r11.2/compiler.29373293_0
GCC: gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609
PyTorch: 1.8.1+cu111
PyTorch compiling details: PyTorch built with:

  • GCC 7.3
  • C++ Version: 201402
  • Intel(R) Math Kernel Library Version 2020.0.0 Product Build 20191122 for Intel(R) 64 architecture applications
  • Intel(R) MKL-DNN v1.7.0 (Git Hash 7aed236906b1f7a05c0917e5257a1af05e9ff683)
  • OpenMP 201511 (a.k.a. OpenMP 4.5)
  • NNPACK is enabled
  • CPU capability usage: AVX2
  • CUDA Runtime 11.1
  • NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86
  • CuDNN 8.0.5
  • Magma 2.5.2
  • Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.1, CUDNN_VERSION=8.0.5, CXX_COMPILER=/opt/rh/devtoolset-7/root/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.8.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON,

TorchVision: 0.9.1+cu111
OpenCV: 4.5.1
MMCV: 1.3.1
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 11.1
MMDetection: 2.11.0+2313bd7

Error traceback
If applicable, paste the error trackback here.

Traceback (most recent call last):
  File "test.py", line 10, in <module>
    fake_imgs = sample_uncoditional_model(model, 4)
  File "/home/zhanronghui/.local/lib/python3.8/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/home/zhanronghui/mmlab/mmgeneration/mmgen/apis/inference.py", line 78, in sample_uncoditional_model
    res = model.sample_from_noise(
  File "/home/zhanronghui/mmlab/mmgeneration/mmgen/models/gans/base_gan.py", line 166, in sample_from_noise
    outputs = _model(noise, num_batches=num_batches, **kwargs)
  File "/home/zhanronghui/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/zhanronghui/mmlab/mmgeneration/mmgen/models/architectures/stylegan/generator_discriminator_v2.py", line 340, in forward
    styles = [self.style_mapping(s) for s in styles]
  File "/home/zhanronghui/mmlab/mmgeneration/mmgen/models/architectures/stylegan/generator_discriminator_v2.py", line 340, in <listcomp>
    styles = [self.style_mapping(s) for s in styles]
  File "/home/zhanronghui/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/zhanronghui/.local/lib/python3.8/site-packages/torch/nn/modules/container.py", line 119, in forward
    input = module(input)
  File "/home/zhanronghui/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/zhanronghui/mmlab/mmgeneration/mmgen/models/architectures/stylegan/modules/styleganv2_modules.py", line 86, in forward
    x = self.linear(x)
  File "/home/zhanronghui/.local/lib/python3.8/site-packages/torch/nn/modules/module.py", line 875, in _call_impl
    result = hook(self, input)
  File "/home/zhanronghui/mmlab/mmgeneration/mmgen/models/architectures/pggan/modules.py", line 67, in __call__
    setattr(module, self.name, self.compute_weight(module))
  File "/home/zhanronghui/mmlab/mmgeneration/mmgen/models/architectures/pggan/modules.py", line 59, in compute_weight
    weight = weight * torch.tensor(
RuntimeError: CUDA error: no kernel image is available for execution on the device
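A quick diagnostic sketch for this class of error: "no kernel image is available" means the running GPU's compute capability (sm_86 for an RTX 3090) is missing from the architecture list some compiled extension in the stack was built with, so it is worth printing both sides:

import torch

print(torch.cuda.get_device_capability(0))  # e.g. (8, 6) for an RTX 3090
print(torch.cuda.get_arch_list())           # archs baked into this torch build
print(torch.version.cuda)                   # CUDA runtime the wheel targets

If torch itself lists sm_86, the mismatch is likely in a separately compiled dependency (e.g. the installed mmcv-full wheel).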

Rename pypi package to mmgeneration

Hello Devs,

Might you consider renaming your package on pypi.org to mmgeneration or mmgenerate? This would make it consistent with the Github repository name, or with the naming of your other packages. In addition, mmgen conflicts with the package name and repo name of The MMGen Project (also written in Python), which has existed on Github since 2013.

UPDATE: I see the name of your package itself is also mmgen. If this were changed to mmgenerate, for example, then both our packages could coexist on the same computer without conflict. This would also be consistent with the naming of your other packages, which use the infinitive form of the verb. For instance, with that of your mmtracking repo, whose package name is mmtrack.

Only one GPU for training mmgen

When I follow your documentation to train on my own dataset, the program is trained with distributed launch, but I only have one GPU, so what do I need to change in 'tools/dist_train.sh'?

Here is my command:
python tools/train.py --work-dir {my-work-dir} --gpu-ids 0 --launch none configs/styleganv2/stylegan2_c2_ffhq_256_b4x8_800k.py

But I got the error: 'MMDataParallel' object has no attribute 'reducer'.
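(A workaround commonly suggested for OpenMMLab repos, offered as an unverified hint: keep the distributed launcher but give it a single GPU, e.g. bash tools/dist_train.sh configs/styleganv2/stylegan2_c2_ffhq_256_b4x8_800k.py 1, since the error above suggests the dynamic runner touches the 'reducer' attribute, which exists on DistributedDataParallel but not on MMDataParallel.)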

Some confusion about generative networks

Hello, I have some confusion about generative adversarial networks; I hope I can get your help.
Some papers mention that GANs can be used to generate images to expand a dataset. My question is: if I want to train a GAN to generate images to enlarge my dataset, do I need to provide a large training dataset for the GAN in the first place?

CycleGAN inference

I'm using this code to perform inference with the trained CycleGAN model available here for horse2zebra, but the resulting image I'm getting is not even close to the images shown on this page.

import matplotlib.pyplot as plt
import numpy as np

import mmcv
from mmgen.apis import init_model, sample_img2img_model

# Specify the path to model config and checkpoint file
config_file = 'configs/cyclegan/cyclegan_lsgan_id0_resnet_in_horse2zebra_b1x1_270k.py'
checkpoint_file = 'https://download.openmmlab.com/mmgen/cyclegan/refactor/cyclegan_lsgan_id0_resnet_in_1x1_266800_horse2zebra_convert-bgr_20210902_165724-77c9c806.pth'

# Specify the path to the image you want to translate
image_path = 'tests/data/unpaired/testA/5.jpg'

device = 'cuda:0'
# init a generative
model = init_model(config_file, checkpoint_file, device=device)

# translate a single image
translated_image = sample_img2img_model(model, image_path, target_domain='zebra')

new_image = (translated_image[0] * 255).byte()
plt.imshow(new_image.permute(1, 2, 0).flip(-1))

plt.show()

[resulting image: zebra]

Am I doing something wrong?
The pix2pix example worked fine.
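One thing worth checking, as a sketch continuing the snippet above (and assuming the model outputs images normalized to [-1, 1], the usual convention for these translation models): rescale to [0, 255] before calling .byte(), since casting negative floats directly wraps around and corrupts the image.

# Continues the variables from the snippet above.
img = translated_image[0].cpu()
img = ((img + 1.0) / 2.0).clamp(0, 1) * 255   # [-1, 1] -> [0, 255]
img = img.byte().permute(1, 2, 0).flip(-1)    # CHW -> HWC, BGR -> RGB
plt.imshow(img)
plt.show()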

configs/wgan-gp/wgangp_GN_celeba-cropped_128_b64x1_160kiter.py

Thanks for your error report and we appreciate it a lot.

Checklist

  1. I have searched related issues but cannot get the expected help.
  2. I have read the FAQ documentation but cannot get the expected help.
  3. The bug has not been fixed in the latest version.

Describe the bug
A clear and concise description of what the bug is.

Reproduction

  1. What command or script did you run?
python tools/train.py configs/wgan-gp/wgangp_GN_celeba-cropped_128_b64x1_160kiter.py
  2. Did you make any modifications on the code or config? Did you understand what you have modified?
  3. What dataset did you use?

Environment

[screenshot attached]

  1. Please run python mmgen/utils/collect_env.py to collect necessary environment information and paste it here.
  2. You may add additional information that may be helpful for locating the problem, such as
    • How you installed PyTorch [e.g., pip, conda, source]
    • Other environment variables that may be related (such as $PATH, $LD_LIBRARY_PATH, $PYTHONPATH, etc.)

Error traceback
If applicable, paste the error traceback here.

A placeholder for traceback.

Bug fix
If you have already identified the reason, you can provide the information here. If you are willing to create a PR to fix it, please also leave a comment here and that would be much appreciated!
[screenshot attached]

will data augmentation affect model performance?

Because I need more data, I want to use data augmentation, but I have a question: if I rotate many images, will StyleGANv2 generate rotated images too? If that is the case, it is not what I want. So what is the correct way to augment images?
