qc17-thu / dl-sr

TensorFlow/Keras implementation for image transformation from a low-resolution (LR) image to a super-resolved one, including single wide-field (WF) image super-resolution prediction and SIM reconstruction.

License: MIT License

Languages: MATLAB 53.34%, Objective-C 0.21%, Python 46.45%

dl-sr's Introduction

DFCAN/DFGAN

DFCAN/DFGAN software is the TensorFlow/Keras implementation for image transformation from a low-resolution (LR) image to a super-resolved one, including single wide-field (WF) image super-resolution prediction and SIM reconstruction. This repository is based on the 2021 Nature Methods paper Evaluation and development of deep neural networks for image super-resolution in optical microscopy.

Author: Chang Qiao1,#, Di Li2,#, Yuting Guo2,#, Chong Liu2,3,#, Tao Jiang2,3, Qionghai Dai1,+, Dong Li2,3,4,+
1Department of Automation, Tsinghua University, Beijing, China.
2National Laboratory of Biomacromolecules, CAS Center for Excellence in Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing, China.
3College of Life Sciences, University of Chinese Academy of Sciences, Beijing, China.
4Bioland Laboratory, Guangzhou Regenerative Medicine and Health Guangdong Laboratory, Guangzhou, China.
#Equal contribution.
+Correspondence to: [email protected] and [email protected]

Contents

  • Environment
  • File structure
  • BioSR dataset
  • Test pre-trained models
  • Train a new model
  • License
  • Citation

Environment

  • Ubuntu 16.04
  • CUDA 9.0.16
  • Python 3.6.10
  • Tensorflow 1.10.0
  • Keras 2.2.4
  • GPU: GeForce RTX 2080Ti
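
A quick way to confirm that a local setup matches the versions above before running anything; a minimal sketch using the TensorFlow 1.x API (this script is not part of the repository):

# Verify TF/Keras versions and GPU visibility (TensorFlow 1.x API).
import tensorflow as tf
import keras
from tensorflow.python.client import device_lib

print("TensorFlow:", tf.__version__)   # expect 1.10.0
print("Keras:", keras.__version__)     # expect 2.2.4

# An empty GPU list means TF will silently fall back to the CPU.
gpus = [d.name for d in device_lib.list_local_devices() if d.device_type == "GPU"]
print("GPUs:", gpus)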

File structure

  • ./dataset is the default path for training data and testing data
    • ./dataset/train is the default save path for the augmented training image patch pairs
    • ./dataset/test includes some demo images of F-actin and microtubules to test DFCAN/DFGAN models
  • ./src includes the source code of DFCAN and DFGAN
    • ./src/models includes the declarations of the DFCAN and DFGAN models
    • ./src/utils is the tool package of the DFCAN/DFGAN software
  • ./trained_models is where pre-trained DFCAN/DFGAN models should be placed for testing; newly trained models are saved here by default
  • ./data_agmt_matlab includes the MATLAB code used for data augmentation (MATLAB version: 2017b)

BioSR dataset

BioSR is a biological image dataset for super-resolution microscopy, currently including more than 2,200 pairs of low- and high-resolution images covering four biological structures (CCPs, ER, MTs, F-actin), nine signal levels (15-600 average photon count), and two upscaling factors (linear SIM and non-linear SIM). BioSR is freely available and aims to provide a high-quality dataset for developers of single-image super-resolution algorithms and advanced SIM reconstruction algorithms.

Test pre-trained models

  • Download the pre-trained models of DFCAN/DFGAN and place them in ./trained_models/
  • Download the test data and place it in ./dataset/test. You can also download BioSR for more testing data
  • Open your terminal and cd to ./src
  • Run bash demo_predict.sh in your terminal. Before running the bash file, check that the data paths and other arguments in demo_predict.sh are set correctly
  • The output SR images will be saved in --data_dir
  • Typical results: [example images]
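
For reference, prediction can also be scripted directly in Keras instead of via demo_predict.sh. The sketch below assumes the downloaded checkpoint is a full saved model and that inputs are normalized to [0, 1]; the file names are illustrative, and the repository's custom layers may require building the model from ./src/models and loading weights instead:

import numpy as np
import imageio
from keras.models import load_model

# Hypothetical file names -- substitute the checkpoint and test image you downloaded.
# If the checkpoint stores weights only, build the network from ./src/models first
# and call model.load_weights() instead of load_model().
model = load_model("../trained_models/DFCAN_F-actin.h5", compile=False)

lr = imageio.imread("../dataset/test/F-actin/wf_001.tif").astype(np.float32)
lr = (lr - lr.min()) / (lr.max() - lr.min() + 1e-8)   # normalize to [0, 1] (assumed)

sr = model.predict(lr[np.newaxis, ..., np.newaxis])   # add batch and channel axes
imageio.imwrite("sr_001.tif", np.squeeze(sr).astype(np.float32))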

Train a new model

  • Data for training: you can train a new DFCAN/DFGAN model using BioSR or your own datasets. You should divide the dataset of each specimen into a training part and a validation/testing part before training, so that you can test your model on the held-out validation/testing data
  • Data augmentation: run ./data_agmt_matlab/DataAugumentation_ForTrain.m with MATLAB to create image patch pairs from the BioSR datasets. Before running, check the image paths and other parameters following the instructions in ./data_agmt_matlab/DataAugumentation_ForTrain.m. After running, the augmented data is saved in ./dataset/train by default
  • Run bash demo_train.sh in your terminal to train a new DFCAN model. As with testing, check that the data paths and arguments in the bash file are set correctly before running it
  • You can run tensorboard --logdir [save_weights_dir]/[save_weights_name]/graph to monitor the training process via TensorBoard. If the validation loss no longer decreases, you can apply an early-stop strategy to end the training (see the callback sketch after this list)
  • Model weights will be saved in ./trained_models/ by default
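
For the early-stop strategy and TensorBoard monitoring mentioned above, standard Keras callbacks can automate both; a minimal sketch (paths and hyperparameters are illustrative, not the repository's defaults):

from keras.callbacks import EarlyStopping, ModelCheckpoint, TensorBoard

callbacks = [
    # Stop once the validation loss has not improved for 10 epochs.
    EarlyStopping(monitor="val_loss", patience=10, verbose=1),
    # Keep only the best weights seen so far.
    ModelCheckpoint("../trained_models/DFCAN_best.h5", monitor="val_loss",
                    save_best_only=True, save_weights_only=True),
    # Write curves readable by: tensorboard --logdir ../trained_models/graph
    TensorBoard(log_dir="../trained_models/graph"),
]

# model.fit(x_train, y_train, validation_data=(x_val, y_val),
#           epochs=200, batch_size=4, callbacks=callbacks)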

License

This repository is released under the MIT License (refer to the LICENSE file for details).

Citation

If you find the code or the BioSR dataset helpful in your research, please cite the following paper:

@article{qiao2021evaluation,
  title={Evaluation and development of deep neural networks for image super-resolution in optical microscopy},
  author={Qiao, Chang and Li, Di and Guo, Yuting and Liu, Chong and Jiang, Tao and Dai, Qionghai and Li, Dong},
  journal={Nature Methods},
  volume={18},
  pages={194--202},
  year={2021},
  publisher={Nature Publishing Group}
}

dl-sr's People

Contributors: qc17-thu
dl-sr's Issues

Model weights not saving

I'm having trouble: the model weights and the best weights did not get saved after training. Do you know what is causing this?

Save weights before loading weights

In line 209 of the train_DFGAN.py file, it seems that the weights are not loaded before g.save_weights(), so the initial weights will override the optimal weights each time the program is executed. I am puzzled: shouldn't this program load the best weights before writing the new weights?
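
For context, a common pattern that would avoid the overwrite described in this issue is to checkpoint only when the validation loss improves; a minimal sketch (names are illustrative, this is not the repository's code):

# `g` is the generator model; the path and names are illustrative.
best_val_loss = float("inf")

def maybe_save_weights(g, val_loss, path="../trained_models/weights_best.h5"):
    """Overwrite the checkpoint only when the validation loss improves."""
    global best_val_loss
    if val_loss < best_val_loss:
        best_val_loss = val_loss
        g.save_weights(path)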

Questions about LR data

Hi,
I don't understand how the size of the input data is determined. Why does each group of original SIM images yield an image with a resolution of 512 × 512 after being averaged into a diffraction-limited wide-field (WF) image?
Thank you in advance.
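
For context, the paper obtains each wide-field image by averaging the N × M raw SIM frames of a group, which keeps the lateral size of the raw frames unchanged; in numpy terms (a sketch with placeholder data):

import numpy as np

# raw_sim: stack of N x M raw SIM frames for one group, e.g. 9 frames
# (3 phases x 3 orientations) of 512 x 512 pixels each.
raw_sim = np.random.rand(9, 512, 512).astype(np.float32)  # placeholder data

wf = raw_sim.mean(axis=0)   # diffraction-limited wide-field image
print(wf.shape)             # (512, 512) -- same lateral size as each raw frame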

Clarification on the SIM Data

The following is an extract from the paper, and I have marked the corresponding file names for CCPs. It seems #2 is not available in the shared data. Is my understanding correct?

#1. For each ROI, we acquired nine sets of N-phase × M-orientation raw images with constant 1 ms exposure time but increasing the excitation light intensity, where N and M are three for TIRF-SIM and GI-SIM, and five for nonlinear SIM. - RawSIMData_level_*.mrc

#2. Meanwhile, each set of N × M raw images was reconstructed into a SIM image attributing the same fluorescence level as the corresponding WF image, which served as a reference to assess the quality of the DLSR image at that fluorescence level. - Missing

#3. In addition, in the same ROI, we finally elevated the excitation intensity and exposure time (typically 120 W cm−2 for 10 ms) to achieve a high fluorescence level of >1,200 average photon count, and independently acquired three sets of N × M raw images. The resulting three SIM images of ultrahigh SNR were averaged as the GT-SIM image to guarantee high quality. - SIM_gt.mrc

Obtaining the paired LR-HR dataset

Thanks for your excellent work. I am confused about how to obtain the paired WF-SIM dataset without registration. Could you please tell me the specific steps?

Dataset, splits, and evaluation issues

Hi,
Based on your paper, you obtained 50 groups, which you divided into 35 groups for training and 15 groups for validation.

  • Over the training set, you cropped low-resolution (WF) patches of 128×128 and high-resolution (GT) patches of 256×256.
  • Over the validation set, you cropped WF patches of size 256×256 and GT patches of 512×512.

It seems that throughout the paper, the quantitative evaluation was performed over the validation set.
Then, in this GitHub repository, you introduced another set: the test set.

Here are my questions:

  1. In machine learning, when dividing data into train, validation, and test sets, we usually fit the model on the training set and use the validation set for model selection. Once training is completely done, we evaluate the best model obtained during training on the test set and report that performance (not the performance on the validation set, because it is biased to be better than the test set in general). Why did you report the performance on the validation set? And why did you introduce a test set if it was not used to report quantitative performance? The validation set should be used for model selection; running the experiments the way you did is guaranteed to yield the best results on the validation set, which gives an inflated impression of the performance because you are selecting the best model on it. I recommend reporting the three metrics you used on a separate test set that was not used or seen during training. If you are not using the test set for any quantitative evaluation at all, I recommend removing the word 'testset' from the repository because it is confusing; you could call it additional data for visual evaluation.
  2. Overlap between train/validation and test sets: the so-called 'testset' in this repository seems to be extracted somehow from the 50 groups, unless it comes from groups other than these 50. If it is from the 50 groups, this test set will again give a false impression of the performance, because your model has used it for either training or validation. It should be neither; the test set should be independent.
  3. You reported three metrics: NRMSE, MS-SSIM, and resolution. Why didn't you discuss or report the PSNR metric, which is well known in the super-resolution community?
  4. Why is there no comparison to bicubic interpolation? It is a simple, common baseline in super-resolution, and all methods are expected to beat it.
  5. Can you provide ground-truth TIFF files for the so-called 'testset' in this repository? You provided only the low-resolution TIFFs, and we do not know how to map them to the super-resolution ground truth, which you said is somewhere in the train/validation BioSR dataset you provided. In the test-set folder for F-actin, for instance, there are only two folders, 'input_raw_sim_images' and 'input_wide_field_images', which are the low-resolution inputs. Inside 'input_raw_sim_images' we find level folders that each hold TIFF files, but we don't know which cell they come from. The BioSR F-actin folder has folders named "Cell_xxx", and there does not seem to be a link from the test set to the BioSR data from which to get the ground truth. It would be helpful to provide the high-resolution TIFFs of the test set.
  6. Not everyone has access to MATLAB; it would be very helpful to provide either Python code to build the TIFF files, or the TIFF files for all BioSR data.
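
Regarding points 3 and 4 above, both baselines are straightforward to compute; a minimal numpy/OpenCV sketch (array names and shapes are illustrative):

import numpy as np
import cv2

def psnr(gt, pred, data_range=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((gt.astype(np.float64) - pred.astype(np.float64)) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

# Bicubic baseline: upscale a 128x128 LR patch 2x and score it against the 256x256 GT.
lr = np.random.rand(128, 128).astype(np.float32)   # placeholder LR (WF) patch
gt = np.random.rand(256, 256).astype(np.float32)   # placeholder GT patch
bicubic = cv2.resize(lr, (256, 256), interpolation=cv2.INTER_CUBIC)
print("bicubic PSNR: %.2f dB" % psnr(gt, bicubic))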

My colleague (@shakeebmurtaza) and I are trying to use your data, reproduce your results, and do further evaluations.
So far, we are having issues with building the patches.

We appreciate your help.
Thanks

Python version of your MATLAB code to process *.mrc files

Hi,
Could you please release Python code that does the job of the MATLAB code you provided here?
Or release the final TIFF images produced by your MATLAB code, for the full data?
The Python code you provided with the data in the supplementary material only does reading/writing.
Thanks
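
Until an official script exists, a minimal Python sketch for converting an .mrc stack to TIFF using the third-party mrcfile and tifffile packages (both pip-installable; the file name is illustrative):

import mrcfile   # pip install mrcfile
import tifffile  # pip install tifffile

# Read a raw SIM stack and re-save it as a multi-page TIFF.
with mrcfile.open("RawSIMData_level_01.mrc", permissive=True) as mrc:
    data = mrc.data.copy()          # numpy array, typically (n_frames, H, W)

tifffile.imwrite("RawSIMData_level_01.tif", data)
print(data.shape, data.dtype)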

Output is noise

Hi, this work was very enlightening to me, but I encountered problems when trying to reproduce your code.
When I run bash demo_predict.sh, the program's output is noise. How can I solve this?
[screenshot: left is input, right is output]
My Environment:

Keras==2.2.4
tensorflow-gpu==1.10.0
tensorboard==1.10.0
numpy==1.19.5
opencv-python==4.1.0.25
imageio==2.5.0
scikit-image==0.15.0
scipy==1.3.1
matplotlib==3.1.1

My file structure: [screenshot]

BioSR dataset

I don't have permission to access the original link. Could you please provide the download link for the BioSR dataset again?

Question about dataset acquisition (translated from Chinese)

Hello Qiao Chang, I am also a graduate student working on SIM. When I tried to perform SIM reconstruction with your public BioSR dataset, my reconstruction results were not as good as the SIM reconstructions provided in the dataset. The paper mentions that you used a home-built multimodal SIM system and that, for linear SIM, the parameters were estimated via cross-correlation and the reconstruction was done with generalized Wiener filtering. I followed the same approach, but the artifacts are severe. I suspect the OTF I simulated (generated from the parameters given in the dataset) differs substantially from the actual system's OTF, causing large parameter-estimation errors. So I would like to ask whether you could publish the system OTF in this project. If so, many thanks.

3D DFCAN

I am interested in training a 3D DFCAN for denoising, but the models here all seem to be 2D. Is there a separate repository for 3D models? Thank you

Issue with pre-processing dataset (MATLAB script)

I tried to process all planes/folders in your BioSR dataset using the MATLAB code, but MATLAB only processes one folder successfully and generates errors for the others. For example, it produces the error below for the "F-actin_Nonlinear" folder. Could you please rectify this issue and provide the code that you used for the other planes?

Thank you very much.

[error screenshot]

MATLAB script

hi,

cc @shakeebmurtaza

  • Using the current version of the code, I am getting this error when processing 'F-actin_Nonlinear' to generate the train/validation set using DataAugumentation_ForTrain.m. Can you update the code? Please test it over all specimens before publishing so we don't have to come back again with more issues related to generating the data. We already created an issue about this matter 25 days ago, and so far there has been no answer.
Generating training data 1/30, SNR 9/9
Generating training data 1/30, SNR 10/9
Index exceeds matrix dimensions.

Error in DataAugumentation_ForTrain (line 189)
        [header, data] = XxReadMRC(files_input{j});
  • Can you fix the case of ER? I am getting this error:
Error using char
Conversion of element 1 from <missing> to character vector is not supported.

Error in XxSort (line 47)
s1 = char(c2);

Error in DataAugumentation_ForTrain (line 90)
    files_input = XxSort(XxDir(CellList{i}, DataFilter));
  • Can you fix the case of CCPs? Error:
Generating training data 1/30, SNR 8/9
Generating training data 1/30, SNR 9/9
Generating training data 1/30, SNR 10/9
Index exceeds matrix dimensions.

Error in DataAugumentation_ForTrain (line 189)
        [header, data] = XxReadMRC(files_input{j});
  • Can you fix the case of Microtubules? Error:
Generating training data 1/30, SNR 6/9
Generating training data 1/30, SNR 7/9
Generating training data 1/30, SNR 8/9
Generating training data 1/30, SNR 9/9
Generating training data 1/30, SNR 10/9
Index exceeds matrix dimensions.

Error in DataAugumentation_ForTrain (line 189)
        [header, data] = XxReadMRC(files_input{j});
  • Can you provide the exact script that you used to generate all the test-set pairs (low resolution / super resolution) in *.tiff format for the specimens CCPs, ER, MTs, F-actin, and F-actin (non-linear)? Please test your code before publishing.

Please provide the exact scripts with the exact configuration you used, so we can produce exactly the same data as you; without these scripts we can't use your data. Please avoid asking users to modify or guess anything related to your code. You know your data and your MATLAB code better than anyone, so please save everyone headaches and help us use your public data. The scripts should be ready to use: the user only needs to run them, and they have to produce the same data as yours, since you said you are unable to upload the ready-to-use data. Everyone appreciates your help.

MATLAB version: same as yours (MATLAB 2017b).

Thanks.

Loss is NaN

Hi,

I'm trying to reproduce your training results on the CCPs dataset. I set the batch size for DFCAN to 4 and the batch size for DFGAN to 2, but during training, at around 9000 epochs or so, the loss becomes NaN. I'm wondering what parameters you used to train on the CCPs dataset and how I can avoid getting NaN for my loss. I googled this, and most people said to check whether the preprocessed dataset is corrupted; I checked and did not find any corrupted data in my dataset.

Thank you in advance.
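
Two generic Keras-level guards that often help with losses blowing up to NaN, independent of this repository's code (hyperparameters are illustrative):

from keras.optimizers import Adam
from keras.callbacks import TerminateOnNaN

# Clip gradient norms so a single bad batch cannot blow up the weights.
optimizer = Adam(lr=1e-4, clipnorm=1.0)
# model.compile(optimizer=optimizer, loss="mse")

# Stop training as soon as a NaN loss appears, instead of corrupting the weights.
callbacks = [TerminateOnNaN()]
# model.fit(..., callbacks=callbacks)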

Discriminator contribution to DFGAN

Hi,

I have been testing the DFGAN model with a different dataset. To obtain better performance I have tried different parameter configurations, and I have noticed that the discriminator loss and its predictions do not change much. In general, I have observed two behaviours:

  • The discriminator catches the generator (predicting the generated images as fake), and then the generator's performance gets worse, producing black or white images.
  • Most commonly, the discriminator starts out differentiating well between real and fake, but around batch iteration 400 (quite early) it starts predicting 0.5 for all images, no longer contributing to the training.

I then tested your original train_DFGAN.py script with the original F-actin dataset, and it seems to behave like the second case, where the discriminator returns 0.5 for all images.

So the question is: does the discriminator really contribute to the training of the model, or is the improvement in performance due to the MSE and SSIM terms in the generator's loss? What happened in your experiments?

Thanks.
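
One way to verify the behaviour described in this issue is to log the discriminator's mean scores on real and generated batches during training; a minimal sketch (model and variable names are illustrative):

import numpy as np

def log_discriminator(d, g, lr_batch, hr_batch, step):
    """Print the mean discriminator score on real vs. generated images."""
    if step % 100 == 0:
        fake = g.predict(lr_batch)
        d_real = float(np.mean(d.predict(hr_batch)))
        d_fake = float(np.mean(d.predict(fake)))
        # Both scores drifting to ~0.5 suggests the discriminator has stopped learning.
        print("step %d: D(real)=%.3f  D(fake)=%.3f" % (step, d_real, d_fake))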

About models

Hi,

Why do you train a separate model for each type of specimen? And when testing a model, do you use only samples of the corresponding specimen?

Thank you in advance.

Question about the dataset

Hi. Thank you for your great work. As you said in the paper, each set of raw SIM images is averaged to obtain the WF image, while reconstruction yields the SR-SIM image. I'm wondering, in the BioSR dataset, which file is the reconstructed image (RawSIMData_gt.mrc or SIM_gt.mrc)? And which one did you use as the GT to train the network?
Thank you
Jasper


Questions about the training and test data acquisition.

Hi! Thank you for the awesome data and code. After reading your paper, I still have one question about the training and test data acquisition:

I see that there are a SIM_gt.mrc file and a RawSIMData_gt.mrc file in the BioSR dataset. What's the difference between these two files? Which one is used as the ground truth in the training?

Thanks
