
Comments (3)

victorca25 commented on June 16, 2024

Hello! Thanks for doing the tests and experimenting. I am aware of the linear-space operations on images, and the correct functions are used in iNNfer:
https://github.com/victorca25/iNNfer/blob/09569a1e81cd9a72a1ece85dad73391389998d70/utils/colors.py#L29
https://github.com/victorca25/iNNfer/blob/09569a1e81cd9a72a1ece85dad73391389998d70/utils/colors.py#L49

Where a "linear_resize" is also implemented:
https://github.com/victorca25/iNNfer/blob/09569a1e81cd9a72a1ece85dad73391389998d70/utils/utils.py#L267

However, the conversions add considerable latency to the training process (every image in every batch has to be converted back and forth between sRGB and linear), and not all image operations need to be applied in linear space.
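For readers following along, the round trip being discussed (decode sRGB to linear light, resize, encode back to sRGB) can be sketched roughly as below. This is not the iNNfer code linked above; the function names `srgb_to_linear`, `linear_to_srgb`, and `downscale_2x_linear` are hypothetical, the resize is a plain 2x box filter chosen only to keep the example short, and the input is assumed to be an 8-bit H x W x C array.

```python
import numpy as np

def srgb_to_linear(srgb):
    """Decode sRGB values in [0, 1] to linear light (standard piecewise sRGB curve)."""
    srgb = np.asarray(srgb, dtype=np.float64)
    return np.where(srgb <= 0.04045, srgb / 12.92, ((srgb + 0.055) / 1.055) ** 2.4)

def linear_to_srgb(linear):
    """Encode linear-light values in [0, 1] back to sRGB."""
    linear = np.asarray(linear, dtype=np.float64)
    return np.where(linear <= 0.0031308,
                    linear * 12.92,
                    1.055 * linear ** (1.0 / 2.4) - 0.055)

def downscale_2x_linear(img_u8):
    """Halve an 8-bit H x W x C image with a 2x2 box filter applied in linear light."""
    h, w = img_u8.shape[:2]
    h2, w2 = h // 2 * 2, w // 2 * 2  # drop an odd trailing row/column
    lin = srgb_to_linear(img_u8[:h2, :w2].astype(np.float64) / 255.0)
    # Group pixels into 2x2 blocks and average each block.
    lin = lin.reshape(h2 // 2, 2, w2 // 2, 2, -1).mean(axis=(1, 3))
    out = linear_to_srgb(lin) * 255.0
    # Round before casting so values are not truncated.
    return np.around(out).clip(0, 255).astype(np.uint8)

# A black/white checkerboard: every 2x2 block holds two black and two white pixels.
checker = np.zeros((4, 4, 3), dtype=np.uint8)
checker[::2, ::2] = 255
checker[1::2, 1::2] = 255
small = downscale_2x_linear(checker)
print(small[0, 0])  # noticeably brighter than the naive sRGB-space mean of ~128
```

The two extra per-pixel power functions on every image are where the added latency mentioned above comes from.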

Additionally, the logic may be better implemented as a wrapper in https://github.com/victorca25/augmennt, but I haven't had time to evaluate the different implementation options or to compare results between the current behaviour and doing the linear conversions. Considering that no current SOTA project does sRGB to linear conversions before doing the image operations and results are not impacted, the priority to implement it is relatively low compared to other WIP elements. However, if during your testing you find that results improve with the conversions, the priority can change.

from trainner.

awused commented on June 16, 2024

Looking at linear2srgb, it appears to be missing a rounding step before the conversion to uint8, so values get truncated. It should round first (np.around(srgb).astype(uint8)).
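A small illustration of the difference, assuming the values have already been scaled to the 0-255 range before the cast (the array below is made-up example data, not output from linear2srgb):

```python
import numpy as np

# Hypothetical float results of an sRGB encode, already scaled to 0..255.
srgb = np.array([127.6, 254.999, 0.9])

# A bare cast drops the fractional part (truncates toward zero).
truncated = srgb.astype(np.uint8)

# Rounding first picks the nearest integer instead.
rounded = np.around(srgb).astype(np.uint8)

print(truncated)  # [127 254   0]
print(rounded)    # [128 255   1]
```

Truncation biases every value downward by up to one code value, which is why the rounding step matters when images are converted repeatedly.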

I did a little experiment. It's not much, but in the interest of time I ran 500 iterations on DIV2K (using the pre-trained 4xPSNR model as a starting point) with both sRGB and linear RGB downscaling (modifying MLResize in augmentations.py). I used the basic train_sr.yml with no substantial changes. With the default sRGB downscaling my first validation was:
21-11-22 03:56:04.056 - INFO: <epoch: 4, iter: 500> PSNR: 24.702, SSIM: 0.68838, LPIPS: 0.12116
With linear RGB downscaling I got:
21-11-22 04:36:38.677 - INFO: <epoch: 4, iter: 500> PSNR: 26.169, SSIM: 0.7637, LPIPS: 0.10971
I repeated the run with similar results.
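To put the PSNR numbers in context, here is a minimal sketch of the textbook PSNR definition for 8-bit images. It is an assumption that the trainer's validation uses this form; it may crop borders or evaluate on a luma channel, which this sketch does not do. The function name `psnr` is hypothetical.

```python
import numpy as np

def psnr(reference, output, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images."""
    diff = reference.astype(np.float64) - output.astype(np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(max_val ** 2 / mse)

# Tiny synthetic check: an image that is off by exactly one code value everywhere.
ref = np.zeros((4, 4), dtype=np.uint8)
off_by_one = np.ones((4, 4), dtype=np.uint8)
print(psnr(ref, off_by_one))
```

Under this definition, and for a single image, a gap of about 1.47 dB corresponds to roughly 1.4 times lower mean squared error.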

Using these two models I took this original image (original) and produced these two output images, which I've downscaled back to the original size to demonstrate the colour differences.

With no modifications to trainner (srgb downscaling): srgb_trained

With augment.py switched to use linear RGB downscaling: linear_rgb_trained

While this was a very small test, I do think it demonstrates the impact that the colour space used for downscaling has on training. Neither model was perfect within 500 iterations, but the model trained on linear RGB downscaling was much closer in colour. The point isn't that either of these two models is actually any good, but that the downscaling colour space can make a difference. This should be repeatable.

Even compared to one of the more well-regarded models (yandere neo xl) the results are roughly on par (I'd argue subjectively better, but marginally worse by PSNR/MAE) in terms of colour accuracy despite only 500 iterations. yandere_neo_xl

"no current SOTA project does sRGB to linear conversions before doing the image operations and results are not impacted"

I honestly believe this is incorrect. The results may not be obviously, visibly incorrect most of the time, but I do believe the results are impacted. The damage done to an image being downscaled in sRGB colour space can be fairly subtle (shifting hues), and the models will be trained to guess how to reverse this process, which will have unpredictable effects on output colours.
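As a rough illustration of the kind of hue shift meant here, the sketch below averages a single 2x2 block (three pure-red pixels and one pure-green pixel) once directly on the sRGB values and once in linear light. The block is made-up example data and the helper names are hypothetical; a real downscaler uses a wider filter, but the same effect applies.

```python
import numpy as np

def srgb_to_linear(srgb):
    """Decode sRGB values in [0, 1] to linear light."""
    srgb = np.asarray(srgb, dtype=np.float64)
    return np.where(srgb <= 0.04045, srgb / 12.92, ((srgb + 0.055) / 1.055) ** 2.4)

def linear_to_srgb(linear):
    """Encode linear-light values in [0, 1] back to sRGB."""
    linear = np.asarray(linear, dtype=np.float64)
    return np.where(linear <= 0.0031308,
                    linear * 12.92,
                    1.055 * linear ** (1.0 / 2.4) - 0.055)

# One 2x2 block flattened to four RGB pixels: three red, one green (8-bit sRGB).
block = np.array([[255, 0, 0],
                  [255, 0, 0],
                  [255, 0, 0],
                  [0, 255, 0]], dtype=np.float64)

# Averaging the encoded sRGB values directly.
srgb_mean = np.around(block.mean(axis=0)).astype(np.uint8)

# Averaging in linear light, then encoding back to sRGB.
linear_mean = linear_to_srgb(srgb_to_linear(block / 255.0).mean(axis=0))
linear_mean = np.around(linear_mean * 255.0).astype(np.uint8)

# The green-to-red ratio is a crude proxy for hue in this red/green mix.
srgb_ratio = float(srgb_mean[1]) / float(srgb_mean[0])
linear_ratio = float(linear_mean[1]) / float(linear_mean[0])

print("sRGB-space average:  ", srgb_mean, "G/R =", round(srgb_ratio, 3))
print("linear-space average:", linear_mean, "G/R =", round(linear_ratio, 3))
```

The sRGB-space average comes out at (191, 64, 0), while the linear-light average is roughly (225, 137, 0): not just brighter, but with a clearly different green-to-red balance, so the low-resolution training input ends up with a different hue depending on the colour space used.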


awused commented on June 16, 2024

I've had extremely positive results in terms of colour accuracy from training some non-trivial models locally. While I wasn't entirely happy with this one for other reasons and killed it at 183k iterations, this 2x model shows no discernible colour distortion at all.

Here's the same image round-tripped through my model and then downscaled to the original size to show colour accuracy. It's not just this image, either: I get noticeable colour distortion when passing other images through all the other ESRGAN models I've tried (not all of them from the wiki, but a decent selection), but my model maintains colours just about perfectly.
183k_2x_dscale

I do believe downscaling properly makes a difference, and I think at least some of the colour inaccuracies plaguing ESRGAN models can be attributed to downscaling in sRGB colour space. Without correcting for gamma, downscaling will cause hue shifts which the model will have to learn to reverse. I may repeat the experiment for the same duration without linear RGB downscaling, but because it takes so long I'll be leaving that for later. The results so far have satisfied me enough to keep using this edit locally.

