
invertible-isp's People

Contributors

yzxing87, zqianaa


invertible-isp's Issues

Bug in preprocessing code

The preprocessing code contains the following lines:

if camera_name == 'Canon EOD 5D':
    raw_img = np.maximum(raw_img - 127.0, 0)

The string literal is incorrect; it should be 'Canon_EOS_5D' (with an 'S' and underscores). As a result, the Canon data has not been correctly shifted.
The network has most likely learned to correct for this on its own, but I still thought I'd let you know.
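
For clarity, a standalone illustration of the corrected check (toy data; the 127.0 black-level offset is kept as in the original code):

import numpy as np

camera_name = 'Canon_EOS_5D'        # corrected string: 'EOS' and underscores
raw_img = np.full((4, 4), 200.0)    # toy array standing in for the packed RAW data

if camera_name == 'Canon_EOS_5D':
    raw_img = np.maximum(raw_img - 127.0, 0)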

Keep in mind that fixing the typo without releasing a new pretrained model will probably result in broken outputs.

Data generalization and incorrect highlight color

From my understanding, the trained model provides an invertible function that can convert between a RAW and an RGB image. The ground-truth RGBs used in training are not the JPEGs straight out of the camera but are generated by Rawpy, so the model simulates the Rawpy processing, which is relatively simple compared to the ISP inside a real camera (e.g., most likely no local tone mapping, nor the camera manufacturer's proprietary color profile). Therefore, the trained model only applies to one specific RAW-to-JPEG process with fixed ISP parameters. White balance, which most likely differs between photos, is handled in preprocessing, but other tunable ISP steps such as the tone curve, 3D LUT, color-temperature-dependent CCM, and lens shading are left for the network to simulate, which leads to my first question:

  1. How does this method perform for a JPEG that is not generated by Rawpy? Do we need to know how the JPEG was generated before we can reconstruct the RAW?

I tried the pretrained model with the NIKON data and found that the simulated RAW-to-JPEG process causes visible highlight color shifting, as shown in the top-left corner:

(attached image: gt_pred_a0341-dgw_002_00000)

  2. Is the incorrect highlight color a bug or a known issue?
  3. Do you have any idea how to preserve the highlight color? One suggestion may be training a 3D LUT to match the color; a 3D LUT is also somewhat invertible (see the sketch below).
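
To illustrate the kind of 3D LUT I have in mind, a toy sketch (nearest-neighbour lookup for brevity; a real implementation would interpolate trilinearly, and none of this is a proposal for the repository's code):

import numpy as np

def apply_3d_lut(img, lut):
    # Look up each RGB value in an (N, N, N, 3) LUT (nearest neighbour for brevity).
    n = lut.shape[0]
    idx = np.clip(np.rint(img * (n - 1)).astype(int), 0, n - 1)
    return lut[idx[..., 0], idx[..., 1], idx[..., 2]]

# Identity LUT of size 17 as a toy example: the output matches the input up to quantisation.
grid = np.linspace(0.0, 1.0, 17)
identity_lut = np.stack(np.meshgrid(grid, grid, grid, indexing="ij"), axis=-1)
out = apply_3d_lut(np.random.rand(8, 8, 3), identity_lut)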

Unable to activate conda environment in Colab

I am unable to activate the conda environment, and I cannot import the packages installed in it.
conda activate myenv does not work; it says the shell is not configured. Could someone please help?

Training cost

What is the training cost for this project, e.g., the number of GPUs and the total training time?

Calculate Metrics

My English is not very good; please allow me to ask my questions in Chinese.

  1. In the released test_raw.py and cal_metrics.py, the PSNR metric is computed on the RAW images directly, without inverting the white balance or the demosaicing. The numbers computed this way may be problematic (without inverting the white balance, the RAW pixel values may exceed the [0, 1] range). How were the numbers in the paper computed?

  2. I am also confused about how the RGB metric is computed. In

    rgb_img = PILImage.fromarray(im).save(rgb_target_path + file_name + '.jpg', quality = JPEG_Quality, subsampling = 1)
    the ground-truth RGB is JPEG-compressed once, and in
    tar.save(out_path+"tar_%s_%05d.jpg"%(file_name, i_patch), quality=90, subsampling=1)
    it is compressed a second time. This means the code actually measures the difference between the model's output RGB compressed once and the ground-truth RGB compressed twice.

As I understand it, the goals of InvISP are:

  • to take a RAW image as input and generate an RGB image close to what the camera ISP produces;
  • to make the generated RGB robust to JPEG compression, so that even after compression it can still be invertibly mapped back to a high-quality RAW image; DiffJPEG is introduced to simulate the JPEG compression.

Why is it necessary for

rgb_img = PILImage.fromarray(im).save(rgb_target_path + file_name + '.jpg', quality = JPEG_Quality, subsampling = 1)
to compress the ground-truth RGB? In my understanding, the model's RGB output should fit the uncompressed ground-truth RGB, with DiffJPEG simulating the JPEG compression before inverting back to fit the RAW image. Then, at test time, both the ground-truth RGB and the model's RGB would be compressed with the real JPEG pipeline, and the metric computed on the compressed images.

Have I misunderstood something? Looking forward to your reply.
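
For illustration, a minimal sketch of the RGB evaluation I had in mind (the file names are hypothetical; the prediction is read back from its single JPEG save, while the ground truth stays uncompressed):

import numpy as np
from PIL import Image

def psnr(a, b, peak=255.0):
    # Standard PSNR between two uint8 images.
    mse = np.mean((a.astype(np.float64) - b.astype(np.float64)) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# Hypothetical paths: the model output saved once as JPEG, and the uncompressed ground truth.
pred = np.asarray(Image.open("pred_example.jpg"))
gt = np.asarray(Image.open("tar_example.png"))  # not re-encoded as JPEG a second time

print("RGB PSNR:", psnr(pred, gt))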

Using a target size that is different to the input size

I'm trying to train the model on another dataset. But I have encountered the following problem:

Parsed arguments: Namespace(aug=True, batch_size=1, camera='Canon1DsMkIII', data_path='/data/lly/inv_isp_data/', debug_mode=False, gamma=True, loss='L1', lr=0.0001, out_path='/data/lly/inv_isp_data/Canon1DsMkIII/', resume=False, rgb_weight=1, task='debug')
[INFO] Start data loading and preprocessing
[INFO] Start to train
task: debug Epoch: 0 Step: 0 || loss: 0.46242 raw_loss: 0.10383 rgb_loss: 0.35858 || lr: 0.000100 time: 0.316538
task: debug Epoch: 0 Step: 1 || loss: 0.24662 raw_loss: 0.01957 rgb_loss: 0.22705 || lr: 0.000100 time: 0.270781
task: debug Epoch: 0 Step: 2 || loss: 0.05458 raw_loss: 0.00540 rgb_loss: 0.04919 || lr: 0.000100 time: 0.269678
task: debug Epoch: 0 Step: 3 || loss: 0.12149 raw_loss: 0.00757 rgb_loss: 0.11392 || lr: 0.000100 time: 0.269641
task: debug Epoch: 0 Step: 4 || loss: 0.17164 raw_loss: 0.00870 rgb_loss: 0.16295 || lr: 0.000100 time: 0.282781
task: debug Epoch: 0 Step: 5 || loss: 0.09719 raw_loss: 0.00595 rgb_loss: 0.09124 || lr: 0.000100 time: 0.277356
task: debug Epoch: 0 Step: 6 || loss: 0.08278 raw_loss: 0.00824 rgb_loss: 0.07454 || lr: 0.000100 time: 0.276587
task: debug Epoch: 0 Step: 7 || loss: 0.08254 raw_loss: 0.00801 rgb_loss: 0.07453 || lr: 0.000100 time: 0.279638
task: debug Epoch: 0 Step: 8 || loss: 0.11994 raw_loss: 0.01274 rgb_loss: 0.10720 || lr: 0.000100 time: 0.270859
task: debug Epoch: 0 Step: 9 || loss: 0.07166 raw_loss: 0.00605 rgb_loss: 0.06562 || lr: 0.000100 time: 0.287317
task: debug Epoch: 0 Step: 10 || loss: 0.19911 raw_loss: 0.00554 rgb_loss: 0.19357 || lr: 0.000100 time: 0.272710
task: debug Epoch: 0 Step: 11 || loss: 0.14320 raw_loss: 0.00622 rgb_loss: 0.13698 || lr: 0.000100 time: 0.279719
task: debug Epoch: 0 Step: 12 || loss: 0.05994 raw_loss: 0.00999 rgb_loss: 0.04996 || lr: 0.000100 time: 0.282813
task: debug Epoch: 0 Step: 13 || loss: 0.04691 raw_loss: 0.00428 rgb_loss: 0.04263 || lr: 0.000100 time: 0.269908
task: debug Epoch: 0 Step: 14 || loss: 0.09645 raw_loss: 0.00515 rgb_loss: 0.09129 || lr: 0.000100 time: 0.287600
task: debug Epoch: 0 Step: 15 || loss: 0.08834 raw_loss: 0.00427 rgb_loss: 0.08407 || lr: 0.000100 time: 0.288736
train.py:69: UserWarning: Using a target size (torch.Size([1, 3, 0, 256])) that is different to the input size (torch.Size([1, 3, 256, 256])). This will likely lead to incorrect results due to broadcasting. Please ensure they have the same size.
  rgb_loss = F.l1_loss(reconstruct_rgb, target_rgb)
Traceback (most recent call last):
  File "train.py", line 98, in <module>
    main(args)
  File "train.py", line 69, in main
    rgb_loss = F.l1_loss(reconstruct_rgb, target_rgb)
  File "/home/amax/anaconda3/lib/python3.8/site-packages/torch/nn/functional.py", line 2633, in l1_loss
    expanded_input, expanded_target = torch.broadcast_tensors(input, target)
  File "/home/amax/anaconda3/lib/python3.8/site-packages/torch/functional.py", line 71, in broadcast_tensors
    return _VF.broadcast_tensors(tensors)  # type: ignore
RuntimeError: The size of tensor a (256) must match the size of tensor b (0) at non-singleton dimension 2

I have searched for this problem on Google and Stack Overflow, but the answers only mention that it may be caused by a wrong output dimension of certain layers.

So are there any fixed image-size parameters in this code? Would you mind having a look and pointing out the problem? Thanks!
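
The zero-sized dimension in the warning suggests that a crop start index fell at or beyond the image height for this dataset. A minimal sketch of the kind of guard that would catch this (hypothetical names, not the repository's dataloader):

import numpy as np

def safe_random_crop(img, patch_size=256):
    # Crop a square patch, failing loudly if the image is smaller than the patch.
    h, w = img.shape[:2]
    if h < patch_size or w < patch_size:
        raise ValueError(f"image {h}x{w} is smaller than patch size {patch_size}")
    top = np.random.randint(0, h - patch_size + 1)
    left = np.random.randint(0, w - patch_size + 1)
    return img[top:top + patch_size, left:left + patch_size]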

test_raw.py

Hi, we tried to run the test.sh script on test_raw.py. At the line input_RGBs = sorted(glob(out_path+"pred*jpg")), input_RGBs is an empty list. We looked in out_path and wanted to know whether we need to put the images in this folder ourselves or whether they should have been produced during data processing.
We ran the test with your pretrained weights.

We are trying to convert RGB image to RAW with your model.
Can you please give us some guidelines or tips to do so ?

Thanks

How can I visualize the RAW image?

Hi! Thanks for the nice work! I notice that in your paper you visualize the RAW image through bilinear demosaicing, but I don't know how to do that myself. In data/data_preprocess.py, the RAW image after bilinear demosaicing is simply saved in '.npz' format, and I can't find any code to visualize it. Could you please tell me how to visualize it? Thank you very much!
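
To clarify what I mean, a minimal sketch of the kind of preview I am after (the array key and the [0, 1] value range are guesses and may need adjusting to the actual file):

import numpy as np
from PIL import Image

# Load the demosaicked RAW saved by data_preprocess.py.
# NOTE: the array key and value range are assumptions.
data = np.load("example.npz")
raw = data[data.files[0]].astype(np.float32)

# A simple gamma (1/2.2) so the linear RAW is viewable on screen.
vis = np.clip(raw, 0.0, 1.0) ** (1.0 / 2.2)
Image.fromarray((vis * 255.0).astype(np.uint8)).save("raw_preview.png")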

RuntimeError: CUDA error: an illegal memory access was encountered

Hi,
I am currently facing the issue below when running train.py. Could you please give me a hand?
My PC environment is:

  • Ubuntu 18.04
  • NVIDIA-SMI 455.45.01
  • Driver Version: 455.45.01
  • CUDA Version: 11.1
  • python 3.8
  • torch 1.8.0

/home/anaconda3/bin/python /home/Documents/Invertible-ISP-main/train_cuda.py --task=debug --data_path=./data/ --gamma --aug --camera=NIKON_D700 --out_path=./exps/ --debug_mode
Parsed arguments: Namespace(aug=True, batch_size=1, camera='NIKON_D700', data_path='./data/', debug_mode=True, gamma=True, loss='L1', lr=0.0001, out_path='./exps/', resume=False, rgb_weight=1, task='debug')
[INFO] Start data loading and preprocessing
[INFO] Start to train
Traceback (most recent call last):
  File "/home/Documents/Invertible-ISP-main/train_cuda.py", line 99, in <module>
    main(args)
  File "/home/Documents/Invertible-ISP-main/train_cuda.py", line 72, in main
    reconstruct_raw = net(reconstruct_rgb, rev=True)
  File "/home/anaconda3/lib/python3.8/site-packages/torch/nn/modules/module.py", line 889, in _call_impl
    result = self.forward(*input, **kwargs)
  File "/home/Documents/Invertible-ISP-main/model/model.py", line 176, in forward
    out = op.forward(out, rev)
  File "/home/Documents/Invertible-ISP-main/model/model.py", line 124, in forward
    self.s = self.clamp * (torch.sigmoid(self.H(x1)) * 2 - 1)
RuntimeError: CUDA error: an illegal memory access was encountered

Process finished with exit code 1

If I switch to the invertible-isp environment as your environment.yml specifies, the code silently stops at
line 22: DiffJPEG = DiffJPEG(differentiable=True, quality=90).cuda()
without showing any errors or printing "start to train".

About the forward loss function

Hi, why is the forward L1 loss between the output and the JPEG image computed on the rendered RGB rather than the compressed RGB? In other words, why is rgb_loss computed before DiffJPEG?
(attached screenshot: 屏幕截图 2022-11-03 170238)
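
To make the question concrete, here are the two placements I am contrasting (a toy sketch with stand-in modules, not the repository's actual code):

import torch
import torch.nn.functional as F

# Stand-ins for the invertible network and the DiffJPEG module (identity maps),
# used only to show where the loss could be attached.
net = lambda x, rev=False: x
diff_jpeg = lambda x: x

raw = torch.rand(1, 3, 256, 256)
target_rgb = torch.rand(1, 3, 256, 256)

rendered_rgb = net(raw, rev=False)        # forward pass: RAW -> rendered RGB
compressed_rgb = diff_jpeg(rendered_rgb)  # simulated JPEG compression

rgb_loss_before = F.l1_loss(rendered_rgb, target_rgb)   # loss before DiffJPEG (as released)
rgb_loss_after = F.l1_loss(compressed_rgb, target_rgb)  # loss after DiffJPEG (what I am asking about)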
Thanks!

Step of demosaicing

Hi Yazhou,

An excellent work!

I notice that you use bilinear demosaicing from the Python library colour_demosaicing, and I guess this is so that the demosaicing step can be reversed. However, I wonder whether bilinear demosaicing is enough for an ISP? It seems to have some disadvantages, such as colour error and blurring. Did you notice this problem?
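
For context, the bilinear step I am referring to is roughly the following (a minimal sketch using colour_demosaicing; the RGGB pattern is an assumption and may differ per camera):

import numpy as np
from colour_demosaicing import demosaicing_CFA_Bayer_bilinear

# A toy single-channel Bayer mosaic; real data would come from the packed RAW.
cfa = np.random.rand(256, 256)
rgb = demosaicing_CFA_Bayer_bilinear(cfa, pattern="RGGB")  # (256, 256, 3) linear RGB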

Best,
Kenneth
