
Comments

jayagami commented on June 29, 2024

Same issue here:

2021-05-17 21:15:05,719 - mmgen - INFO - Environment info:
------------------------------------------------------------
sys.platform: linux
Python: 3.7.7 (default, May  7 2020, 21:25:33) [GCC 7.3.0]
CUDA available: True
CUDA_HOME: /usr
NVCC: Cuda compilation tools, release 10.1, V10.1.243
GPU 0,1: GeForce GTX 1080 Ti
GCC: gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
PyTorch: 1.6.0
PyTorch compiling details: PyTorch built with:
  - GCC 7.3
  - C++ Version: 201402
  - Intel(R) Math Kernel Library Version 2020.0.1 Product Build 20200208 for Intel(R) 64 architecture applications
  - Intel(R) MKL-DNN v1.5.0 (Git Hash e2ac1fac44c5078ca927cb9b90e1b3066a0b2ed0)
  - OpenMP 201511 (a.k.a. OpenMP 4.5)
  - NNPACK is enabled
  - CPU capability usage: AVX2
  - CUDA Runtime 10.1
  - NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
  - CuDNN 7.6.3
  - Magma 2.5.2
  - Build settings: BLAS=MKL, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -DUSE_XNNPACK -DUSE_VULKAN_WRAPPER -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -Wno-stringop-overflow, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF, 

TorchVision: 0.7.0
OpenCV: 4.2.0
MMCV: 1.3.4
MMGen: 0.1.0+0ece0cd
MMCV Compiler: GCC 7.3
MMCV CUDA Compiler: 10.1
------------------------------------------------------------

2021-05-17 21:15:05,984 - mmgen - INFO - Distributed training: False
2021-05-17 21:15:06,178 - mmgen - INFO - Config:
model = dict(
    type='CycleGAN',
    generator=dict(
        type='ResnetGenerator',
        in_channels=3,
        out_channels=3,
        base_channels=64,
        norm_cfg=dict(type='IN'),
        use_dropout=False,
        num_blocks=9,
        padding_mode='reflect',
        init_cfg=dict(type='normal', gain=0.02)),
    discriminator=dict(
        type='PatchDiscriminator',
        in_channels=3,
        base_channels=64,
        num_conv=3,
        norm_cfg=dict(type='IN'),
        init_cfg=dict(type='normal', gain=0.02)),
    gan_loss=dict(
        type='GANLoss',
        gan_type='lsgan',
        real_label_val=1.0,
        fake_label_val=0.0,
        loss_weight=1.0),
    cycle_loss=dict(type='L1Loss', loss_weight=10.0, reduction='mean'),
    id_loss=dict(type='L1Loss', loss_weight=0.5, reduction='mean'))
train_cfg = dict(direction='a2b', buffer_size=50)
test_cfg = dict(direction='a2b', show_input=False, test_direction='a2b')
train_dataset_type = 'UnpairedImageDataset'
val_dataset_type = 'UnpairedImageDataset'
img_norm_cfg = dict(mean=[0.5, 0.5, 0.5], std=[0.5, 0.5, 0.5])
train_pipeline = [
    dict(
        type='LoadImageFromFile', io_backend='disk', key='img_a',
        flag='color'),
    dict(
        type='LoadImageFromFile', io_backend='disk', key='img_b',
        flag='color'),
    dict(
        type='Resize',
        keys=['img_a', 'img_b'],
        scale=(286, 286),
        interpolation='bicubic'),
    dict(
        type='Crop',
        keys=['img_a', 'img_b'],
        crop_size=(256, 256),
        random_crop=True),
    dict(type='Flip', keys=['img_a'], direction='horizontal'),
    dict(type='Flip', keys=['img_b'], direction='horizontal'),
    dict(type='RescaleToZeroOne', keys=['img_a', 'img_b']),
    dict(
        type='Normalize',
        keys=['img_a', 'img_b'],
        to_rgb=False,
        mean=[0.5, 0.5, 0.5],
        std=[0.5, 0.5, 0.5]),
    dict(type='ImageToTensor', keys=['img_a', 'img_b']),
    dict(
        type='Collect',
        keys=['img_a', 'img_b'],
        meta_keys=['img_a_path', 'img_b_path'])
]
test_pipeline = [
    dict(
        type='LoadImageFromFile', io_backend='disk', key='img_a',
        flag='color'),
    dict(
        type='LoadImageFromFile', io_backend='disk', key='img_b',
        flag='color'),
    dict(
        type='Resize',
        keys=['img_a', 'img_b'],
        scale=(256, 256),
        interpolation='bicubic'),
    dict(type='RescaleToZeroOne', keys=['img_a', 'img_b']),
    dict(
        type='Normalize',
        keys=['img_a', 'img_b'],
        to_rgb=False,
        mean=[0.5, 0.5, 0.5],
        std=[0.5, 0.5, 0.5]),
    dict(type='ImageToTensor', keys=['img_a', 'img_b']),
    dict(
        type='Collect',
        keys=['img_a', 'img_b'],
        meta_keys=['img_a_path', 'img_b_path'])
]
data_root = None
data = dict(
    samples_per_gpu=1,
    workers_per_gpu=4,
    drop_last=True,
    val_samples_per_gpu=1,
    val_workers_per_gpu=0,
    train=dict(
        type='UnpairedImageDataset',
        dataroot='./data/horse2zebra',
        pipeline=[
            dict(
                type='LoadImageFromFile',
                io_backend='disk',
                key='img_a',
                flag='color'),
            dict(
                type='LoadImageFromFile',
                io_backend='disk',
                key='img_b',
                flag='color'),
            dict(
                type='Resize',
                keys=['img_a', 'img_b'],
                scale=(286, 286),
                interpolation='bicubic'),
            dict(
                type='Crop',
                keys=['img_a', 'img_b'],
                crop_size=(256, 256),
                random_crop=True),
            dict(type='Flip', keys=['img_a'], direction='horizontal'),
            dict(type='Flip', keys=['img_b'], direction='horizontal'),
            dict(type='RescaleToZeroOne', keys=['img_a', 'img_b']),
            dict(
                type='Normalize',
                keys=['img_a', 'img_b'],
                to_rgb=False,
                mean=[0.5, 0.5, 0.5],
                std=[0.5, 0.5, 0.5]),
            dict(type='ImageToTensor', keys=['img_a', 'img_b']),
            dict(
                type='Collect',
                keys=['img_a', 'img_b'],
                meta_keys=['img_a_path', 'img_b_path'])
        ],
        test_mode=False),
    val=dict(
        type='UnpairedImageDataset',
        dataroot='./data/horse2zebra',
        pipeline=[
            dict(
                type='LoadImageFromFile',
                io_backend='disk',
                key='img_a',
                flag='color'),
            dict(
                type='LoadImageFromFile',
                io_backend='disk',
                key='img_b',
                flag='color'),
            dict(
                type='Resize',
                keys=['img_a', 'img_b'],
                scale=(256, 256),
                interpolation='bicubic'),
            dict(type='RescaleToZeroOne', keys=['img_a', 'img_b']),
            dict(
                type='Normalize',
                keys=['img_a', 'img_b'],
                to_rgb=False,
                mean=[0.5, 0.5, 0.5],
                std=[0.5, 0.5, 0.5]),
            dict(type='ImageToTensor', keys=['img_a', 'img_b']),
            dict(
                type='Collect',
                keys=['img_a', 'img_b'],
                meta_keys=['img_a_path', 'img_b_path'])
        ],
        test_mode=True),
    test=dict(
        type='UnpairedImageDataset',
        dataroot='./data/horse2zebra',
        pipeline=[
            dict(
                type='LoadImageFromFile',
                io_backend='disk',
                key='img_a',
                flag='color'),
            dict(
                type='LoadImageFromFile',
                io_backend='disk',
                key='img_b',
                flag='color'),
            dict(
                type='Resize',
                keys=['img_a', 'img_b'],
                scale=(256, 256),
                interpolation='bicubic'),
            dict(type='RescaleToZeroOne', keys=['img_a', 'img_b']),
            dict(
                type='Normalize',
                keys=['img_a', 'img_b'],
                to_rgb=False,
                mean=[0.5, 0.5, 0.5],
                std=[0.5, 0.5, 0.5]),
            dict(type='ImageToTensor', keys=['img_a', 'img_b']),
            dict(
                type='Collect',
                keys=['img_a', 'img_b'],
                meta_keys=['img_a_path', 'img_b_path'])
        ],
        test_mode=True))
checkpoint_config = dict(interval=100, by_epoch=False, save_optimizer=True)
log_config = dict(
    interval=100, hooks=[dict(type='TextLoggerHook', by_epoch=False)])
custom_hooks = [
    dict(
        type='VisualizeUnconditionalSamples',
        output_dir='training_samples',
        interval=1000)
]
runner = dict(
    type='DynamicIterBasedRunner',
    is_dynamic_ddp=True,
    pass_training_status=True)
dist_params = dict(backend='nccl')
log_level = 'INFO'
load_from = None
resume_from = None
workflow = [('train', 1)]
find_unused_parameters = True
cudnn_benchmark = True
dataroot = './data/horse2zebra'
optimizer = dict(
    generators=dict(type='Adam', lr=0.0002, betas=(0.5, 0.999)),
    discriminators=dict(type='Adam', lr=0.0002, betas=(0.5, 0.999)))
lr_config = None
total_iters = 80000
exp_name = 'cyclegan_facades_id0'
work_dir = './work_dirs/cyclegan_facades_id0'
metrics = dict(
    FID=dict(type='FID', num_images=140, image_shape=(3, 256, 256)),
    IS=dict(type='IS', num_images=140, image_shape=(3, 256, 256)))
gpu_ids = range(0, 1)

2021-05-17 21:15:06,178 - mmgen - INFO - Set random seed to 2021, deterministic: False
/home/jay/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/mmcv/cnn/bricks/conv_module.py:107: UserWarning: ConvModule has norm and bias at the same time
  warnings.warn('ConvModule has norm and bias at the same time')
2021-05-17 21:15:08,000 - mmgen - INFO - Start running, host: jay@Tachikoma, work_dir: /home/jay/git/jay-mmgeneration/work_dirs/cyclegan_facades_id0
2021-05-17 21:15:08,000 - mmgen - INFO - workflow: [('train', 1)], max: 80000 iters
Traceback (most recent call last):
  File "tools/train.py", line 163, in <module>
    main()
  File "tools/train.py", line 159, in main
    meta=meta)
  File "/home/jay/git/jay-mmgeneration/mmgen/apis/train.py", line 196, in train_model
    runner.run(data_loaders, cfg.workflow, cfg.total_iters)
  File "/home/jay/git/jay-mmgeneration/mmgen/core/runners/dynamic_iterbased_runner.py", line 284, in run
    iter_runner(iter_loaders[i], **kwargs)
  File "/home/jay/git/jay-mmgeneration/mmgen/core/runners/dynamic_iterbased_runner.py", line 206, in train
    kwargs.update(dict(ddp_reducer=self.model.reducer))
  File "/home/jay/miniconda3/envs/open-mmlab/lib/python3.7/site-packages/torch/nn/modules/module.py", line 772, in __getattr__
    type(self).__name__, name))
torch.nn.modules.module.ModuleAttributeError: 'MMDataParallel' object has no attribute 'reducer'

nbei commented on June 29, 2024

Hi @xinzhichao and @jayagami, our MMGeneration does NOT support DataParallel training. If you start training directly with python train.py xxx, PyTorch will automatically run in DataParallel mode. We therefore recommend that all users launch distributed training with:

bash tools/dist_train.sh {CONFIG} {GPU_NUM} --work-dir {WORK_DIR}

In addition, after checking the code in detail, I found another bug in the config files for the CycleGAN and Pix2Pix models. We will fix it as soon as possible, but you can work around it by adding the following lines at the end of the config file:

runner = None
use_ddp_wrapper = True
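
For example (a minimal sketch of this workaround, assuming the CycleGAN config dumped in the log above), the tail of the config file would then read:

# ... everything else in the config file stays unchanged ...

# The config is executed as plain Python, so these two later assignments
# override the runner = dict(type='DynamicIterBasedRunner', ...) entry shown
# earlier in the config dump.
runner = None
use_ddp_wrapper = True

Training can then still be launched through tools/dist_train.sh as recommended above.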

@plyfager will follow up on this issue and fix the bugs.

> Thanks for replying. I found that the CycleGAN from MMEditing worked for me.

In the future, the image translation models will be removed from MMEditing and supported in MMGeneration. We hope you can switch to MMGeneration, and we are sorry for the inconvenience.

xinzhichao commented on June 29, 2024

I used python tools/train.py configs/cyclegan/cyclegan_lsgan_resnet_in_1x1_266800_horse2zebra.py.

jayagami commented on June 29, 2024

Thanks for replying. I found that the CycleGAN from MMEditing worked for me.

jayagami commented on June 29, 2024

Thanks again, I will take your advice.

plyfager commented on June 29, 2024

Sorry for the inconvenience. This bug has been fixed in #38.

zhjw0927 commented on June 29, 2024

Hi @nbei, I only use python tools/train.py config_file on a single-GPU machine, but I still encounter the problem described above. I haven't modified any files yet; the current commit is 3542102.

LeoXing1996 commented on June 29, 2024

> Hi @nbei, I only use python tools/train.py config_file on a single-GPU machine, but I still encounter the problem described above. I haven't modified any files yet; the current commit is 3542102.

We suggest using dist_train.sh to start your training. You can use the following command to start single-GPU training:

bash dist_train.sh CONFIG 1
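
For context, this works because dist_train.sh essentially starts tools/train.py through torch.distributed.launch with the PyTorch launcher, so even a single-GPU run gets a real process group and the model ends up wrapped in DistributedDataParallel, which is the wrapper that actually exposes the reducer attribute the DynamicIterBasedRunner reads; a plain python tools/train.py run wraps the model in MMDataParallel instead, which has no such attribute (hence the error in the traceback above). A rough illustration of the difference (an illustrative sketch only, not MMGeneration's actual launcher code):

import torch
import torch.distributed as dist

# Single-process "distributed" setup, similar in spirit to what a one-GPU
# dist_train.sh run ends up with.
dist.init_process_group(backend='nccl',
                        init_method='tcp://127.0.0.1:29500',
                        rank=0, world_size=1)
model = torch.nn.parallel.DistributedDataParallel(
    torch.nn.Linear(8, 8).cuda(), device_ids=[0])
print(hasattr(model, 'reducer'))  # True: DDP exposes the reducer the runner needs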

zhjw0927 commented on June 29, 2024

Thank you, it's OK. Let me debug in DDP mode first.
