xpixelgroup / ranksrgan

ICCV 2019 (oral) RankSRGAN: Generative Adversarial Networks with Ranker for Image Super-Resolution. PyTorch implementation

Python 82.47% MATLAB 17.04% M 0.06% Shell 0.44%

super-resolution generative-adversarial-network iccv2019 low-level-vision learning-to-rank pytorch gan

ranksrgan's Issues

AssertionError

When I run your code, I get the following error:

(pytorch) D:\CY\Super-resolution\RankSRGAN-master\codes>python test.py -opt options/test/test_RankSRGAN.yml
export CUDA_VISIBLE_DEVICES=3
19-12-03 10:15:28.664 - INFO: name: RankSRGANx4
suffix: None
model: sr
distortion: sr
scale: 4
crop_border: None
gpu_ids: [3]
datasets:[
test_1:[
name: set14
mode: LQGT
dataroot_GT: /home/wlzhang/BasicSR12/data/val/Set14_mod
dataroot_LQ: None
phase: test
scale: 4
data_type: img
]
test_2:[
name: PIRMtest
mode: LQGT
dataroot_GT: /home/wlzhang/RankSRGAN/data/val/PIRMtestHR
dataroot_LQ: /home/wlzhang/RankSRGAN/data/val/PIRMtest
phase: test
scale: 4
data_type: img
]
]
network_G:[
which_model_G: SRResNet
in_nc: 3
out_nc: 3
nf: 64
nb: 16
upscale: 4
scale: 4
]
path:[
pretrain_model_G: ../experiments/pretrained_models/mmsr_RankSRGAN_NIQE.pth
root: D:\CY\Super-resolution\RankSRGAN-master
results_root: D:\CY\Super-resolution\RankSRGAN-master\results\RankSRGANx4
log: D:\CY\Super-resolution\RankSRGAN-master\results\RankSRGANx4
]
is_train: False

Traceback (most recent call last):
  File "test.py", line 30, in <module>
    test_set = create_dataset(dataset_opt)
  File "D:\CY\Super-resolution\RankSRGAN-master\codes\data\__init__.py", line 41, in create_dataset
    dataset = D(dataset_opt)
  File "D:\CY\Super-resolution\RankSRGAN-master\codes\data\LQGT_dataset.py", line 23, in __init__
    self.paths_GT, self.sizes_GT = util.get_image_paths(self.data_type, opt['dataroot_GT'])
  File "D:\CY\Super-resolution\RankSRGAN-master\codes\data\util.py", line 53, in get_image_paths
    paths = sorted(_get_paths_from_images(dataroot))
  File "D:\CY\Super-resolution\RankSRGAN-master\codes\data\util.py", line 24, in _get_paths_from_images
    assert os.path.isdir(path), '{:s} is not a valid directory'.format(path)
AssertionError: /home/wlzhang/BasicSR12/data/val/Set14_mod is not a valid directory

How can I solve this problem?
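The assertion message names /home/wlzhang/BasicSR12/data/val/Set14_mod, a Linux path left in the options file, while the run happens under D:\ on Windows, so the directory cannot exist there. A minimal sketch of a pre-flight check follows, assuming the options have already been loaded into a nested dict shaped like the log above. The helper name find_missing_dataroots is hypothetical and not part of the repository.

```python
import os


def find_missing_dataroots(opt):
    """Return (dataset, key, path) triples whose dataroot is not a directory.

    Hypothetical helper: `opt` is assumed to mirror the nested options
    printed in the log, i.e. opt['datasets'][name]['dataroot_*'].
    """
    missing = []
    for name, dataset_opt in opt.get("datasets", {}).items():
        for key, path in dataset_opt.items():
            # Only inspect dataroot entries that are actually set.
            if key.startswith("dataroot_") and path is not None:
                if not os.path.isdir(path):
                    missing.append((name, key, path))
    return missing


if __name__ == "__main__":
    # Values copied from the log; on most machines this path is absent.
    opt = {
        "datasets": {
            "test_1": {
                "name": "set14",
                "dataroot_GT": "/home/wlzhang/BasicSR12/data/val/Set14_mod",
                "dataroot_LQ": None,
            }
        }
    }
    for name, key, path in find_missing_dataroots(opt):
        print("{}: {} points to a missing directory: {}".format(name, key, path))
```

Running such a check before creating the datasets would report every unset or stale path at once, rather than stopping at the first assertion.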

Error occurs in the dataloader during training

Thank you for your work!
I tried to train the RankSRGAN model, but I don't have MATLAB on my computer, so I disabled the MATLAB-related code. I don't know whether that is what causes the error below.
I checked the data paths and found nothing wrong.
Have you ever run into this situation?

19-08-30 14:29:21.129 - INFO: Model [SRGANModel] is created.
19-08-30 14:29:21.129 - INFO: Start training from epoch: 0, iter: 0
Traceback (most recent call last):
  File "train_without_matlab.py", line 199, in <module>
    main()
  File "train_without_matlab.py", line 110, in main
    for _, train_data in enumerate(train_loader):
  File "d:\Anaconda3\envs\pytorch110\lib\site-packages\torch\utils\data\dataloader.py", line 193, in __iter__
    return _DataLoaderIter(self)
  File "d:\Anaconda3\envs\pytorch110\lib\site-packages\torch\utils\data\dataloader.py", line 469, in __init__
    w.start()
  File "d:\Anaconda3\envs\pytorch110\lib\multiprocessing\process.py", line 105, in start
    self._popen = self._Popen(self)
  File "d:\Anaconda3\envs\pytorch110\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "d:\Anaconda3\envs\pytorch110\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "d:\Anaconda3\envs\pytorch110\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
    reduction.dump(process_obj, to_child)
  File "d:\Anaconda3\envs\pytorch110\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle Environment objects

(pytorch110) E:\heqinwen\RankSRGAN\codes>Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "d:\Anaconda3\envs\pytorch110\lib\multiprocessing\spawn.py", line 105, in spawn_main
    exitcode = _main(fd)
  File "d:\Anaconda3\envs\pytorch110\lib\multiprocessing\spawn.py", line 115, in _main
    self = reduction.pickle.load(from_parent)
EOFError: Ran out of input
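The first traceback passes through popen_spawn_win32.py: on Windows, worker processes are started with the spawn method, so the object handed to each worker has to be pickled. "can't pickle Environment objects" suggests the dataset holds an open handle that cannot be pickled (possibly an LMDB environment, which is an assumption; the log does not say). Two common workarounds are to run the dataloader with zero worker processes, or to open the handle lazily inside each worker. The sketch below illustrates the second idea with a hypothetical class, using threading.Lock as a stand-in for the unpicklable handle.

```python
import pickle
import threading


class LazyHandleDataset:
    """Hypothetical dataset that opens its handle on first access."""

    def __init__(self, source):
        self.source = source
        self._handle = None  # nothing is opened in the parent process

    def _open(self):
        # Stand-in for opening a database or environment; a lock is used
        # here only because it cannot be pickled either.
        return threading.Lock()

    def __getitem__(self, index):
        if self._handle is None:
            self._handle = self._open()
        with self._handle:
            return (self.source, index)

    def __getstate__(self):
        # Drop the open handle so the object can be sent to a worker.
        state = self.__dict__.copy()
        state["_handle"] = None
        return state


if __name__ == "__main__":
    dataset = LazyHandleDataset("train")
    dataset[0]  # opens the handle in this process
    clone = pickle.loads(pickle.dumps(dataset))  # succeeds: handle is dropped
    print(clone[3])
```

The copy reopens its own handle on first use, which is the behaviour a spawned worker needs.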

Problems when running the test

Traceback (most recent call last):
  File "test.py", line 35, in <module>
    model = create_model(opt)
  File "/home/zzh/桌面/RankSRGAN/codes/models/__init__.py", line 19, in create_model
    m = M(opt)
  File "/home/zzh/桌面/RankSRGAN/codes/models/SR_model.py", line 26, in __init__
    self.netG = networks.define_G(opt).to(self.device)
  File "/home/zzh/anaconda3/envs/py13/lib/python3.6/site-packages/torch/nn/modules/module.py", line 381, in to
    return self._apply(convert)
  File "/home/zzh/anaconda3/envs/py13/lib/python3.6/site-packages/torch/nn/modules/module.py", line 187, in _apply
    module._apply(fn)
  File "/home/zzh/anaconda3/envs/py13/lib/python3.6/site-packages/torch/nn/modules/module.py", line 193, in _apply
    param.data = fn(param.data)
  File "/home/zzh/anaconda3/envs/py13/lib/python3.6/site-packages/torch/nn/modules/module.py", line 379, in convert
    return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
  File "/home/zzh/anaconda3/envs/py13/lib/python3.6/site-packages/torch/cuda/__init__.py", line 162, in _lazy_init
    torch._C._cuda_init()
RuntimeError: cuda runtime error (38) : no CUDA-capable device is detected at /opt/conda/conda-bld/pytorch_1544174967633/work/aten/src/THC/THCGeneral.cpp:51

Can this model upscale images by 3x?

How can I resume training from a specific checkpoint?

Do I need to set 'resume_state' in train_RankSRGAN.yml? If so, what should I write?

License

Thanks for your great work.

Under which license is RankSRGAN released?

Could you please add a LICENSE.md file to the repo?

Thanks again.

How do the results compare with ESRGAN?

How do the results of this model compare with ESRGAN? https://github.com/xinntao/ESRGAN

[Prepare perceptual data] RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED

Thanks for your great work!
When dataroot_LQ contains only one image, the code runs fine.
But when I put more images (the DIV2K dataset) into the dataroot_LQ directory, /DIV2K/DIV2K_train_HR, the error occurs.

export CUDA_VISIBLE_DEVICES=4
20-01-15 19:31:02.816 - INFO:   name: DIV2K
  suffix: None
  model: sr
  distortion: sr
  scale: 4
  crop_border: None
  gpu_ids: [4]
  datasets:[
    test_1:[
      name: DIV2K_train_srres
      mode: LQ
      dataroot_GT: None
      dataroot_LQ: DIV2K/DIV2K_train_HR
      phase: test
      scale: 4
      data_type: img
    ]
  ]
  network_G:[
    which_model_G: SRResNet
    in_nc: 3
    out_nc: 3
    nf: 64
    nb: 16
    upscale: 4
    scale: 4
  ]
  path:[
    pretrain_model_G: ../experiments/pretrained_models/mmsr_RankSRGAN_NIQE.pth
    root: /RankSRGAN
    results_root: /RankSRGAN/results/DIV2K
    log: /RankSRGAN/results/DIV2K
  ]
  is_train: False

20-01-15 19:31:02.822 - INFO: Dataset [LQDataset - DIV2K_train_srres] is created.
20-01-15 19:31:02.822 - INFO: Number of test images in [DIV2K_train_srres]: 800
20-01-15 19:31:08.614 - INFO: Network G structure: DataParallel - SRResNet, with parameters: 1,554,499
20-01-15 19:31:08.615 - INFO: SRResNet(
  (conv_first): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (recon_trunk): Sequential(
    (0): ResidualBlock_noBN(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
    (1): ResidualBlock_noBN(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
    (2): ResidualBlock_noBN(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
    (3): ResidualBlock_noBN(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
    (4): ResidualBlock_noBN(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
    (5): ResidualBlock_noBN(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
    (6): ResidualBlock_noBN(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
    (7): ResidualBlock_noBN(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
    (8): ResidualBlock_noBN(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
    (9): ResidualBlock_noBN(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
    (10): ResidualBlock_noBN(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
    (11): ResidualBlock_noBN(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
    (12): ResidualBlock_noBN(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
    (13): ResidualBlock_noBN(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
    (14): ResidualBlock_noBN(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
    (15): ResidualBlock_noBN(
      (conv1): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    )
  )
  (LRconv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (upconv1): Conv2d(64, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (upconv2): Conv2d(64, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (pixel_shuffle): PixelShuffle(upscale_factor=2)
  (HRconv): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (conv_last): Conv2d(64, 3, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
  (relu): ReLU(inplace)
)
20-01-15 19:31:08.615 - INFO: Loading model for G [../experiments/pretrained_models/mmsr_RankSRGAN_NIQE.pth] ...
20-01-15 19:31:08.625 - INFO: Model [SRModel] is created.
20-01-15 19:31:08.625 - INFO:
Testing [DIV2K_train_srres]...
Traceback (most recent call last):
  File "test.py", line 55, in <module>
    model.test()
  File "/RankSRGAN/codes/models/SR_model.py", line 102, in test
    self.fake_H = self.netG(self.var_L)
  File "/anaconda3/envs/pytorch_sr/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/anaconda3/envs/pytorch_sr/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 150, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/anaconda3/envs/pytorch_sr/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/RankSRGAN/codes/models/archs/RankSRGAN_arch.py", line 50, in forward
    out = self.relu(self.pixel_shuffle(self.upconv2(out)))
  File "/anaconda3/envs/pytorch_sr/lib/python3.6/site-packages/torch/nn/modules/module.py", line 493, in __call__
    result = self.forward(*input, **kwargs)
  File "/anaconda3/envs/pytorch_sr/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 338, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: cuDNN error: CUDNN_STATUS_NOT_SUPPORTED. This error may appear if you passed in a non-contiguous input.
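The log shows dataroot_LQ pointing at DIV2K_train_HR, so full-resolution images are fed to a x4 network. One possible cause (an assumption; the trace does not confirm it) is that the resulting tensors are too large for the convolution in upconv2. A common workaround in that case is to run the network on overlapping tiles and stitch the outputs. The sketch below only computes the tile boxes; the function name, tile size, and overlap are hypothetical choices, and the stitching step is not shown.

```python
def tile_boxes(height, width, tile=256, overlap=16):
    """Yield (top, left, bottom, right) boxes that cover a height x width image.

    Neighbouring boxes share at least `overlap` pixels, and the last box in
    each direction is shifted back so that it ends exactly at the border.
    """
    if tile <= overlap:
        raise ValueError("tile must be larger than overlap")
    step = tile - overlap

    def starts(length):
        positions = list(range(0, max(length - tile, 0) + 1, step))
        # Add a final start so the far border is always covered.
        if positions[-1] + tile < length:
            positions.append(length - tile)
        return positions

    for top in starts(height):
        for left in starts(width):
            yield (top, left, min(top + tile, height), min(left + tile, width))


if __name__ == "__main__":
    # Illustrative size only; actual image sizes vary.
    boxes = list(tile_boxes(1356, 2040))
    print(len(boxes), boxes[0], boxes[-1])
```

Each box can be cropped from the input, passed through the generator, and written to the output at four times the box coordinates, with the overlapping borders blended or discarded.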

About Results on PIRM Test

Hi,
I noticed that your paper reports results on the PIRM test set, but I cannot download the HR data from the official link, which no longer seems to work. Could you provide the HR data? Thank you very much!

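Details of the rank loss

Hello, I have a question about how to interpret the rank loss:

When the ranker is trained, a better image quality is defined as a higher score.
But when the ranker is used as a loss:

According to Equation 9, a well-restored image gets a high score, so the sigmoid is larger and the loss is larger, while a poorly restored image gets a smaller loss. This seems contradictory. How should I understand this?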

Could you provide more details on generating the rank dataset?

When generating the rank dataset, what should be changed in test_RankSRGAN.yml?
For example, to generate the DIV2K_train_ESRGAN rank dataset, what should which_model_G be set to in test_RankSRGAN.yml?

There seems to be a typo in equation (4) of the paper

A minus sign seems to be missing in eq. (4), the MarginRankingLoss. It should be
max(0, -(s1 − s2) ∗ γ + ε).
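For comparison, PyTorch documents the per-pair loss of torch.nn.MarginRankingLoss as max(0, -y * (x1 - x2) + margin), which has the same form as the expression proposed above with γ in the role of y and ε in the role of margin. A minimal pure-Python sketch of that expression follows; the function name and the default ε of 0.5 are illustrative assumptions, not values taken from the paper or the repository.

```python
def margin_ranking_loss(s1, s2, gamma, epsilon=0.5):
    """Per-pair loss max(0, -(s1 - s2) * gamma + epsilon).

    gamma = 1 means s1 should be ranked above s2; gamma = -1 means the
    opposite. epsilon is the margin (0.5 here is only an example value).
    """
    return max(0.0, -(s1 - s2) * gamma + epsilon)


if __name__ == "__main__":
    # Correctly ordered pair with a gap larger than the margin: no penalty.
    print(margin_ranking_loss(2.0, 1.0, gamma=1))
    # Wrongly ordered pair: penalised by the gap plus the margin.
    print(margin_ranking_loss(1.0, 2.0, gamma=1))
```

With the minus sign in place, a pair that is already ordered as γ requires, by more than ε, contributes zero loss, and a wrongly ordered pair contributes a positive loss.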

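Training the ranker, but cannot find the label txt

I could not find the code that generates label.txt, nor any description or example of the txt format, so I cannot train the ranker.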

training time and number of GPUs

Hi, thanks for your wonderful work and for open-sourcing it.
Could you please tell me how long you trained the model, and which type and how many GPUs you used?

Best regards

Use on own images

Hi @wenlongzhang0724
Thank you very much for sharing your code with us.
Could you please explain how to use it with my own images, without using a json file?

Cannot find some functions

from data import create_dataloader, create_dataset
from models import create_model

I cannot find these functions: create_dataloader, create_dataset, and create_model.
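Both import lines import from packages (directories), so Python looks the names up in each package's __init__.py rather than in a file named after the function. The tracebacks in other issues on this page are consistent with that: they report create_dataset and create_model inside the init files of the data and models packages. The sketch below demonstrates the mechanism on a throwaway package; the package name demo_data_pkg and the function body are made up for illustration.

```python
import importlib
import os
import sys
import tempfile


def demo_package_import():
    """Create a temporary package and import a function from its __init__.py."""
    with tempfile.TemporaryDirectory() as root:
        pkg_dir = os.path.join(root, "demo_data_pkg")  # hypothetical package
        os.makedirs(pkg_dir)
        with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
            f.write(
                "def create_dataset(opt):\n"
                "    return 'dataset for ' + opt['name']\n"
            )
        sys.path.insert(0, root)
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("demo_data_pkg")
            result = module.create_dataset({"name": "set14"})
            # The defining file is the package's __init__.py.
            return result, os.path.basename(module.__file__)
        finally:
            sys.path.remove(root)
            sys.modules.pop("demo_data_pkg", None)


if __name__ == "__main__":
    print(demo_package_import())
```

In a checkout of the repository, printing the __file__ attribute of the imported data or models module in the same way would show where the functions are defined.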

Can I use the existing loss directly?

RankESRGAN performs worse than expected

Thank you for your amazing work!
I replaced the base model with RRDB_net while keeping the original Ranker model. The SR results are disappointing: it looks as if even the GAN is not working, because the SR results do not recover details.
I guess you ran this experiment too. Did you get the same result? Should I retrain the Ranker model?

Best wishes.

How to train this model in single-node multi-GPU mode?

Thanks for your project.

My environment is Ubuntu 16.04 + Python 3.6 + PyTorch 1.1 + CUDA 10.0.

I tried to run distributed training with this command:
python -m torch.distributed.launch --nproc_per_node=2 --master_port=4321 train_niqe.py -opt options/train/train_AdaGrowingNet.yml --launcher pytorch

First, for VGGFeatureExtractor, I got this error:
RuntimeError: replicas_[0].size() >= 1 ASSERT FAILED at /pytorch/torch/csrc/distributed/c10d/reducer.cpp:53, please report a bug to PyTorch. Expected at least one parameter. (Reducer at /pytorch/torch/csrc/distributed/c10d/reducer.cpp:53)
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7f27c47be441 in /home/wangzhan/anaconda3/envs/py36_pt10_tf14/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x2a (0x7f27c47bdd7a in /home/wangzhan/anaconda3/envs/py36_pt10_tf14/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #2: c10d::Reducer::Reducer(std::vector<std::vector<torch::autograd::Variable, std::allocator<torch::autograd::Variable> >, std::allocator<std::vector<torch::autograd::Variable, std::allocator<torch::autograd::Variable> > > >, std::vector<std::vector<unsigned long, std::allocator<unsigned long> >, std::allocator<std::vector<unsigned long, std::allocator<unsigned long> > > >, std::shared_ptr<c10d::ProcessGroup>) + 0x199c (0x7f280405fc1c in /home/wangzhan/anaconda3/envs/py36_pt10_tf14/lib/python3.6/site-packages/torch/lib/libtorch_python.so)

Then I set v.requires_grad = False for the parameters of netF after the line self.netF = DistributedDataParallel(self.netF, device_ids=[torch.cuda.current_device()]),
whereas the original code does this earlier, in the definition of VGGFeatureExtractor.
With that change, this error disappeared.

I then ran the code again,
but it raised a RuntimeError.
Traceback (most recent call last):
  File "train_niqe.py", line 260, in <module>
    main()
  File "train_niqe.py", line 172, in main
    model.optimize_parameters(current_step)
  File "/home/wangzhan/SRtask/data_augment/RankSRGAN-master/codes/models/RankSRGAN_model.py", line 215, in optimize_parameters
    l_d_total.backward()
  File "/home/wangzhan/anaconda3/envs/py36_pt10_tf14/lib/python3.6/site-packages/torch/tensor.py", line 107, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "/home/wangzhan/anaconda3/envs/py36_pt10_tf14/lib/python3.6/site-packages/torch/autograd/__init__.py", line 93, in backward
    allow_unreachable=True) # allow_unreachable flag
RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [512]] is at version 4; expected version 3 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

Have you encountered this problem?
How can I fix it?