swz30 / mirnet


[ECCV 2020] Learning Enriched Features for Real Image Restoration and Enhancement. SOTA results for image denoising, super-resolution, and image enhancement.

License: Other

Python 100.00%
image-denoising super-resolution image-enhancement image-restoration low-level-vision computer-vision multi-resolution-streams attention-mechanism pytorch eccv2020

mirnet's Introduction

Learning Enriched Features for Real Image Restoration and Enhancement (ECCV 2020)

Syed Waqas Zamir, Aditya Arora, Salman Khan, Munawar Hayat, Fahad Shahbaz Khan, Ming-Hsuan Yang, Ling Shao

Paper | Supplement | Video | Slides


News


Abstract: With the goal of recovering high-quality image content from its degraded version, image restoration enjoys numerous applications, such as in surveillance, computational photography, medical imaging, and remote sensing. Recently, convolutional neural networks (CNNs) have achieved dramatic improvements over conventional approaches for the image restoration task. Existing CNN-based methods typically operate either on full-resolution or on progressively low-resolution representations. In the former case, spatially precise but contextually less robust results are achieved, while in the latter case, semantically reliable but spatially less accurate outputs are generated. In this paper, we present a novel architecture with the collective goals of maintaining spatially-precise high-resolution representations through the entire network, and receiving strong contextual information from the low-resolution representations. The core of our approach is a multi-scale residual block containing several key elements: (a) parallel multi-resolution convolution streams for extracting multi-scale features, (b) information exchange across the multi-resolution streams, (c) spatial and channel attention mechanisms for capturing contextual information, and (d) attention-based multi-scale feature aggregation. In a nutshell, our approach learns an enriched set of features that combines contextual information from multiple scales, while simultaneously preserving the high-resolution spatial details. Extensive experiments on five real image benchmark datasets demonstrate that our method, named MIRNet, achieves state-of-the-art results for a variety of image processing tasks, including image denoising, super-resolution and image enhancement.

Network Architecture


Overall Framework of MIRNet

Selective Kernel Feature Fusion (SKFF)

Downsampling Module

Dual Attention Unit (DAU)

Upsampling Module

Installation

The model is built in PyTorch 1.1.0 and tested in an Ubuntu 16.04 environment (Python 3.7, CUDA 9.0, cuDNN 7.5).

To install, follow these instructions:

sudo apt-get install cmake build-essential libjpeg-dev libpng-dev
conda create -n pytorch1 python=3.7
conda activate pytorch1
conda install pytorch=1.1 torchvision=0.3 cudatoolkit=9.0 -c pytorch
pip install matplotlib scikit-image opencv-python yacs joblib natsort h5py tqdm
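
As an optional sanity check (not part of the original instructions), you can confirm from Python that the intended versions are active:

# Optional environment check (illustrative)
import torch, torchvision
print(torch.__version__, torchvision.__version__)  # expected: 1.1.0 and 0.3.0
print(torch.cuda.is_available())                    # True if CUDA 9.0 is visible to PyTorch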

Training

  1. Download the SIDD-Medium dataset from here
  2. Generate image patches
python generate_patches_SIDD.py --ps 256 --num_patches 300 --num_cores 10
  3. Download validation images of SIDD and place them in ../SIDD_patches/val
  4. Install warmup scheduler
cd pytorch-gradual-warmup-lr; python setup.py install; cd ..
  5. Train your model with default arguments by running
python train_denoising.py

Note: Our model is trained with 2 Nvidia Tesla-V100 GPUs. See #5 for changing the model parameters.
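
For reference, a hedged sketch of how the model could be wrapped for multi-GPU training follows; it mirrors the two-GPU setup mentioned above but is not taken verbatim from train_denoising.py. On a single, smaller GPU, reducing the batch size and/or patch size is the usual workaround:

# Illustrative multi-GPU wrapping; see train_denoising.py for the actual training loop.
import torch
import torch.nn as nn
from networks.MIRNet_model import MIRNet

model = MIRNet()  # default configuration; constructor arguments control width/depth (see issue #5)
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)
model = model.cuda()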

Evaluation

You can download the complete MIRNet repository (including pre-trained models, datasets, results, etc.) at once from this Google Drive link, or evaluate individual tasks with the following instructions:

Image Denoising

  • Download the model and place it in ./pretrained_models/denoising/

Testing on SIDD dataset

  • Download sRGB images of SIDD and place them in ./datasets/sidd/
  • Run
python test_sidd_rgb.py --save_images

Testing on DND dataset

  • Download sRGB images of DND and place them in ./datasets/dnd/
  • Run
python test_dnd_rgb.py --save_images
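
Both test scripts report PSNR themselves; if you want to spot-check a saved result against its ground truth independently (e.g., for SIDD), a minimal, unofficial snippet could be used. The file paths below are placeholders, not files shipped with the repository:

# Illustrative PSNR check; requires scikit-image >= 0.16 (older versions expose compare_psnr in skimage.measure).
import cv2
from skimage.metrics import peak_signal_noise_ratio

restored = cv2.imread('path/to/restored_result.png')   # placeholder path
gt = cv2.imread('path/to/ground_truth.png')             # placeholder path
print('PSNR:', peak_signal_noise_ratio(gt, restored, data_range=255))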

Image Super-resolution

  • Download the models and place them in ./pretrained_models/super_resolution/
  • Download images of different scaling factors and place them in ./datasets/super_resolution/
  • Run
python test_super_resolution.py --save_images --scale 3
python test_super_resolution.py --save_images --scale 4

Image Enhancement

Testing on LOL dataset

  • Download the LOL model and place it in ./pretrained_models/enhancement/
  • Download images of LOL dataset and place them in ./datasets/lol/
  • Run
python test_enhancement.py --save_images --input_dir ./datasets/lol/ --result_dir ./results/enhancement/lol/ --weights ./pretrained_models/enhancement/model_lol.pth

Testing on Adobe-MIT FiveK dataset

  • Download the FiveK model and place it in ./pretrained_models/enhancement/
  • Download some sample images of fiveK dataset and place them in ./datasets/fivek_sample_images/
  • Run
python test_enhancement.py --save_images --input_dir ./datasets/fivek_sample_images/ --result_dir ./results/enhancement/fivek/ --weights ./pretrained_models/enhancement/model_fivek.pth
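
Several issues below ask about running the enhancement model on a custom image. A rough single-image inference sketch is given here; it is not the repository's test script, and the checkpoint key, default constructor, and padding behaviour are assumptions, so treat test_enhancement.py as the authoritative reference:

# Rough single-image inference sketch (illustrative only).
import cv2
import numpy as np
import torch
from networks.MIRNet_model import MIRNet

model = MIRNet()  # assumes the default configuration matches the checkpoint
ckpt = torch.load('./pretrained_models/enhancement/model_lol.pth', map_location='cpu')
model.load_state_dict(ckpt['state_dict'])  # assumption: weights stored under 'state_dict'
model.eval()

img = cv2.cvtColor(cv2.imread('my_image.png'), cv2.COLOR_BGR2RGB).astype(np.float32) / 255.0
inp = torch.from_numpy(img).permute(2, 0, 1).unsqueeze(0)   # [1, 3, H, W]
# Note: the network downsamples internally, so H and W may need padding to a suitable multiple.

with torch.no_grad():                     # no gradients: much lower memory use
    out = torch.clamp(model(inp), 0, 1)   # keep the output in the valid image range

out = (out.squeeze(0).permute(1, 2, 0).numpy() * 255.0).round().astype(np.uint8)
cv2.imwrite('enhanced.png', cv2.cvtColor(out, cv2.COLOR_RGB2BGR))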

Results

Experiments are performed on five real image datasets for different image processing tasks, including image denoising, super-resolution, and image enhancement. Images produced by MIRNet can be downloaded from the Google Drive link.

Image Denoising
Image Super-resolution
Image Enhancement

Other Implementations

Citation

If you use MIRNet, please consider citing:

@inproceedings{Zamir2020MIRNet,
    title={Learning Enriched Features for Real Image Restoration and Enhancement},
    author={Syed Waqas Zamir and Aditya Arora and Salman Khan and Munawar Hayat
            and Fahad Shahbaz Khan and Ming-Hsuan Yang and Ling Shao},
    booktitle={ECCV},
    year={2020}
}

Contact

Should you have any questions, please contact [email protected]

Our Related Works

  • Learning Enriched Features for Fast Image Restoration and Enhancement, TPAMI 2022. Paper | Code
  • Restormer: Efficient Transformer for High-Resolution Image Restoration, CVPR 2022. Paper | Code
  • Multi-Stage Progressive Image Restoration, CVPR 2021. Paper | Code
  • CycleISP: Real Image Restoration via Improved Data Synthesis, CVPR 2020. Paper | Code

mirnet's People

Contributors

adityac8, swz30


mirnet's Issues

torch.clamp

Why is torch.clamp(restored, 0, 1) used after the model output?
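
(For context, the usual reason is that the raw network output can slightly overshoot the valid intensity range, and clamping prevents wrap-around when converting to 8-bit for saving; a small illustration, not code from the repo:)

import torch
restored = torch.tensor([-0.02, 0.5, 1.03])            # raw outputs can fall just outside [0, 1]
restored = torch.clamp(restored, 0, 1)                  # -> tensor([0.0000, 0.5000, 1.0000])
img_uint8 = (restored * 255).round().to(torch.uint8)    # safe 8-bit conversion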

About the SR pre-trained models

Hi, thank you for releasing the code; it is so cool!!
I want to perform image super-resolution at scales 2 and 4, but the Readme only provides the x3 and x4 pre-trained models.
May I ask when you can upload the x2 pre-trained model?
Looking forward to hearing from you soon~
Also, if I want to test on Set5, Set14, BSD100, or other datasets, how can I do that?

Code for image enhancement

First of all, thank you for your excellent work in this field. In this repo, only train_denoising.py is provided. Could you also provide the training code for image enhancement? Looking forward to your reply.

training.yml for LOL enhancement

Thanks for your excellent work! I'm trying to reproduce your enhancement result.
Can you please share the training.yml used for training the enhancement model on the LOL dataset? I can only obtain a PSNR of 20.69 if I use the settings from the training.yml you shared in this repository.

Some questions about your code

Hi, thanks for sharing this work!
In the SKFF module of your MIRNet_model.py, there is this line:
batch_size = inp_feats[0].shape[0]

I want to ask: if inp_feats is [B, C, H, W], then inp_feats[0] would be a single sample's [C, H, W], and inp_feats[0].shape[0] would be C. Why does your code treat it as batch_size?
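
(For reference, SKFF's forward pass receives a Python list of per-stream feature maps rather than a single tensor, which is why the quoted line works: indexing with [0] selects a stream, not a sample. A minimal illustration with assumed example sizes:)

import torch
B, C, H, W = 16, 64, 128, 128                              # example sizes, not repo defaults
inp_feats = [torch.randn(B, C, H, W) for _ in range(3)]     # three resolution streams
print(inp_feats[0].shape[0])                                # prints 16, i.e. B, not C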

Questions regarding the SIDD dataset split

Hi,

Thanks for the great work!

I am a little bit confused about how the SIDD validation set images are generated. Did you take the Validation .mat files from the SIDD benchmark here (as shown in the screenshot below), and then convert them to image patches as described in #12?

In other words, the validation image patches on your Google Drive should be the same content as these .mat files, and they do not have any overlap with the actual training data? Some further clarifications on the dataset split would be super helpful. Much appreciated!

(screenshot of the SIDD benchmark download page)

Training Code

Hi! The performance of your method on real noisy datasets is really attractive. I am trying to train the denoising model by following the instructions in the paper, but I have some difficulties reproducing the results. It seems that the model is too large to train with batch size 16 on a GPU with 24GB of memory. The training dataset is also very large, so it takes a very long time to process one epoch. Could you share the details of the training time and the hardware used? It would be even better if you could share the training code. Looking forward to your reply! Thank you!

Training Dataset link for Image Enhancement?

Hi, thanks for your great work!

I sincerely appreciate that your MPRNet repo provides all the training and validation dataset download links. Could you also kindly provide the training and validation datasets for RealSR, LOL, and MIT-FiveK?

There is one issue.

I tried to run the enhancement task on a custom image of mine, and it prints the following error message. Can you comment on what's wrong?

(screenshot of the error message, taken 2020-08-13)

A question about the data processing for super-resolution versus the data processing for image denoising

I have a question about how the data processing for super-resolution compares with the data processing for image denoising. The training data loader I am referring to is below:
##################################################################################################
class DataLoaderTrain(Dataset):
    def __init__(self, rgb_dir, img_options=None, target_transform=None):
        super(DataLoaderTrain, self).__init__()

        self.target_transform = target_transform

        clean_files = sorted(os.listdir(os.path.join(rgb_dir, 'groundtruth')))
        noisy_files = sorted(os.listdir(os.path.join(rgb_dir, 'input')))

        self.clean_filenames = [os.path.join(rgb_dir, 'groundtruth', x) for x in clean_files if is_png_file(x)]
        self.noisy_filenames = [os.path.join(rgb_dir, 'input', x) for x in noisy_files if is_png_file(x)]

        self.img_options = img_options

        self.tar_size = len(self.clean_filenames)  # get the size of target

    def __len__(self):
        return self.tar_size

    def __getitem__(self, index):
        tar_index = index % self.tar_size
        clean = torch.from_numpy(np.float32(load_img(self.clean_filenames[tar_index])))
        noisy = torch.from_numpy(np.float32(load_img(self.noisy_filenames[tar_index])))

        clean = clean.permute(2, 0, 1)
        noisy = noisy.permute(2, 0, 1)

        clean_filename = os.path.split(self.clean_filenames[tar_index])[-1]
        noisy_filename = os.path.split(self.noisy_filenames[tar_index])[-1]

        # Crop input and target
        ps = self.img_options['patch_size']
        H = clean.shape[1]
        W = clean.shape[2]
        r = np.random.randint(0, H - ps)
        c = np.random.randint(0, W - ps)
        clean = clean[:, r:r + ps, c:c + ps]
        noisy = noisy[:, r:r + ps, c:c + ps]

        apply_trans = transforms_aug[random.getrandbits(3)]

        clean = getattr(augment, apply_trans)(clean)
        noisy = getattr(augment, apply_trans)(noisy)

        return clean, noisy, clean_filename, noisy_filename
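
(For what it's worth, a hedged usage sketch of the loader quoted above, assuming the repo's load_img / is_png_file / augment helpers are importable and a patch folder with groundtruth/ and input/ subdirectories; the path is hypothetical:)

# Hypothetical usage of the DataLoaderTrain class quoted above (illustrative only).
from torch.utils.data import DataLoader

train_dataset = DataLoaderTrain('../SIDD_patches/train', img_options={'patch_size': 128})
train_loader = DataLoader(train_dataset, batch_size=16, shuffle=True, num_workers=4)

clean, noisy, clean_name, noisy_name = next(iter(train_loader))
# clean, noisy: float tensors of shape [16, 3, 128, 128]; the names are tuples of filenames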

General query in MIRNet_model.py

Thanks for your great work, sir!

I have one doubt about lines 296-297 of this file (MIRNet_model.py):
for j in range(self.height):
    blocks_out[j] = self.blocks[j][i](...)

Why have you used a for loop here? According to the architecture, as I understand it, we could simply pass the temp (selective kernel fusion) output to the DAU (the blocks variable). That is, couldn't we have just written blocks_out[j] = self.blocks[j](...)? Kindly help.

number of parameters

Hi, thanks for your work.

Could you please share the number of parameters of your network?
I implemented it and it seems to have more than 34M parameters.

Thanks!
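
(A quick, unofficial way to check this locally, using the import path that the test scripts use:)

# Count trainable parameters (illustrative snippet, not from the repo).
from networks.MIRNet_model import MIRNet

model = MIRNet()  # default configuration
n_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
print(f'{n_params / 1e6:.2f}M parameters')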

Confused about MRB.

In every MRB, features are downsampled and then fed into the DAUs. It is obvious that a DAU doesn't change the feature shape, so the inputs to SKFF will have different spatial sizes and channel counts. How can SKFF handle these features by simply using L = L1 + L2 + L3?

Zero division error

While running the test_enhancement.py code, a ZeroDivisionError occurred at the line psnr_val_rgb = sum(psnr_val_rgb)/len(psnr_val_rgb). How can I solve this? Thank you in advance.
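
(For context, the division only fails when the PSNR list is empty, which typically means no test images were loaded, e.g. a wrong --input_dir or unexpected file extensions; a hedged guard, not from the repo:)

# Hypothetical guard: fail early with a clear message instead of a ZeroDivisionError.
if not psnr_val_rgb:
    raise RuntimeError('No test images were loaded; check --input_dir and the dataset layout')
psnr_val_rgb = sum(psnr_val_rgb) / len(psnr_val_rgb)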

A lightweight, fast and extended version of MIRNet is accepted in TPAMI

Hi! I'm very interested in your work, especially the lightweight version published in TPAMI. I would like to know the type of GPU used, and the total GPU memory and time required for MIRNetv2 training. I noticed that the batch size is 64 in MIRNetv2 but only 16 in MIRNet. MIRNetv2 uses a larger batch size and patch size, which means more GPU memory is required.

new issue

Wow! First of all, thanks for the quick reply, and thank you for your awesome research. Could you please share the training code?

Question in MIRNet_model.py

In lines 270-275, I have a question about the downsampling.
inp changes in each iteration, which looks different from Fig. 1 in the paper.
Did I miss something?

CUDA out of memory on my own set of images

Hello @swz30 @adityac8 !

I made the necessary changes to the demo.py file present in the other repository in order to test MIRNet on my image set. During the process I had to make some configurations related to graphics compatibility, but everything was resolved.

However, at this moment I am faced with an error message:

RuntimeError: CUDA out of memory. Tried to allocate 5.38 GiB (GPU 0; 3.95 GiB total capacity; 379.90 MiB already allocated; 2.89 GiB free; 16.10 MiB cached)

Do you have any suggestions to solve my problem? I am using the provided pre-trained model, in a Linux environment, with all the correct specifications under Anaconda, and an NVIDIA GeForce GTX 960M 4 GB GPU.

Kind Regards,
João

doubt regarding training on denoising dataset

Thank you for your nice work.

Sir, if I train on the denoising datasets for more than 60 epochs, my validation accuracy continuously decreases.

Do I need to make any other changes to the code, apart from increasing the number of epochs?

Implementing SKFF with Tensorflow

Hi. I tried to implement SKFF with tf 1.15:

def SKFF(self, inputs:list, reduction=8, name='SKFF'):
    with tf.variable_scope(name):
        ch_n=inputs[0].shape[3]
        num=len(inputs)
        d=max(ch_n//reduction, 4)
        
        inputs=tf.stack(inputs, 0)
        fea=tf.reduce_sum(inputs, 0)
        fea=tf.reduce_mean(fea, [1, 2], keep_dims=True)
        fea=self.conv_layer(fea, d, 1, name='du')
        fea=tf.keras.layers.PReLU()(fea)
        
        vecs=[self.conv_layer(fea, ch_n, 1, name=str(no)) for no in range(num)]
        vec=tf.concat(vecs,axis=1)
        weight=tf.nn.softmax(vec, axis=1)
        weight=tf.transpose(weight, (1, 0, 2, 3))
        weight=tf.expand_dims(weight, 2)
        out = inputs*weight
        out=tf.reduce_sum(out, 0)
        return out

Then I used timeline to profile my network. I noticed that there were lots of transpose operations (i.e., converting data from NHWC to NCHW), so the inference speed was actually slower than directly concatenating the different scales.

Is there any way I can optimize the TensorFlow code? Thanks.

Log Files from Training

Thank you for your awesome code!

I am hoping you might open-source the log files you have from training. Maybe the training and validation loss as a function of epoch
(and/or batch) with an estimate of the runtime?

Train

Hi!
Thank you for your work!
You only provided the training code for image denoising. May I ask how to train on low-light images?

Can't evaluate - ModuleNotFoundError

Hello,

I have a problem with "test_super_resolution.py" when I execute the following command:

python3 test_super_resolution.py --save_images --input_dir=./input/ --result_dir=./results/ --scale 3

I get the following error:


  File "test_super_resolution.py", line 19, in <module>
    from networks.MIRNet_model import MIRNet
ModuleNotFoundError: No module named 'networks.MIRNet_model'

Obviously, I have the file "MIRNet_model.py" in the "networks" folder. I don't understand what the problem is.

Could you please help me with some advice? I guess it's a simple problem that I'm not noticing.

Training Details on RealSR Dataset

Hi! The performance of your method on the RealSR dataset is really attractive. I am trying to train the model by following the instructions in the paper, but I have some difficulties reproducing the results. The RealSR dataset contains a total of 505 image pairs for training and 30 image pairs for validation at each scale. In the training stage, I chose to crop 320 patches of size 128×128 from each image, so the number of patches in the training set is 505×320. The training dataset is so large that it takes a very long time (about 4.5 hours) to process one epoch with batch size 8 on a GPU with 24GB of memory. Could you share the training epochs, patch size, and the total number of patches used for the RealSR training set? Also, do you train the model on the whole dataset and validate the performance on the validation set at each scale? Looking forward to your reply! Thank you!

Processing for different upscaling factors

Thanks for your wonderful work! I have a small question about how you handle different upscaling factors in the model for the super-resolution task, since I haven't found a description in the implementation details part of your paper.

Can I train this model using one GPU?

Hi, I have followed your work, which is awesome!
I wanted to train this model with some additional data. Will it be possible to do that on a GTX 1660 Ti or an RTX 2060 laptop GPU?

I'm still a student. That's the best GPU I can afford right now.

Thanks!

Parameters and FLOPs

Hi,

Thanks for sharing this work!
Could you share the number of parameters and FLOPs of MIRNet? I didn't find such numbers in your paper.
