Comments (3)
Hello! Thanks for doing the tests and experimenting. I am aware of the linear space operations on images; the correct functions are used in iNNfer:
https://github.com/victorca25/iNNfer/blob/09569a1e81cd9a72a1ece85dad73391389998d70/utils/colors.py#L29
https://github.com/victorca25/iNNfer/blob/09569a1e81cd9a72a1ece85dad73391389998d70/utils/colors.py#L49
Where a "linear_resize" is also implemented:
https://github.com/victorca25/iNNfer/blob/09569a1e81cd9a72a1ece85dad73391389998d70/utils/utils.py#L267
However, the conversions add considerable latency to the training process (every image in every batch has to be converted back and forth between sRGB and linear), and not all operations on images need to be applied in linear space.
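For anyone following along, these conversions boil down to the standard IEC 61966-2-1 sRGB transfer function. Here is a minimal numpy-only sketch of the idea (this is not the iNNfer code linked above; linear_resize_2x_box is a simplified illustration that only handles a fixed 2x box filter):

```python
import numpy as np

def srgb2linear(img: np.ndarray) -> np.ndarray:
    """sRGB values in [0, 1] -> linear light (IEC 61966-2-1)."""
    return np.where(img <= 0.04045, img / 12.92, ((img + 0.055) / 1.055) ** 2.4)

def linear2srgb(img: np.ndarray) -> np.ndarray:
    """Linear-light values in [0, 1] -> sRGB."""
    return np.where(img <= 0.0031308, img * 12.92, 1.055 * img ** (1 / 2.4) - 0.055)

def linear_resize_2x_box(img: np.ndarray) -> np.ndarray:
    """2x box downscale performed in linear light: sRGB -> linear -> average -> sRGB."""
    lin = srgb2linear(img)
    h, w = lin.shape[:2]
    lin = lin[: h - h % 2, : w - w % 2]  # trim odd edges so 2x2 blocks tile exactly
    quads = lin[0::2, 0::2] + lin[0::2, 1::2] + lin[1::2, 0::2] + lin[1::2, 1::2]
    return linear2srgb(quads / 4.0)
```

For real images you would first scale uint8 data to floats in [0, 1] (img.astype(np.float32) / 255.0) and use a proper resize kernel between the two conversions rather than a box filter.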
Additionally, the logic may be better implemented as a wrapper in https://github.com/victorca25/augmennt, but I haven't had time to evaluate the different implementation options and compare results between the current case and doing the linear conversions. Considering that no current SOTA project does SRGB to linear conversions before doing the images operations and results are not impacted, the priority to implement it is relatively low compared to other WIP elements. However, if during your testing you find that results improve with the conversions, the priority can change.
from trainner.
Looking at linear2srgb, it appears to be missing a rounding step before the conversion to uint8, which would avoid truncation (np.around(srgb).astype(np.uint8)).
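The difference is easy to see: astype truncates toward zero, while rounding first removes up to a full quantisation level of downward bias. A quick illustration (the input values here are arbitrary):

```python
import numpy as np

srgb = np.array([0.0, 127.9, 254.6])  # float sRGB values on a [0, 255] scale

truncated = srgb.astype(np.uint8)           # astype truncates toward zero
rounded = np.around(srgb).astype(np.uint8)  # round to nearest first

# truncated.tolist() == [0, 127, 254]
# rounded.tolist()   == [0, 128, 255]
```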
I did a little experiment. It's not much, but in the interest of time I ran 500 iterations on DIV2K (using the pre-trained 4xPSNR model as a starting point) with both sRGB and linear RGB downscaling (modifying MLResize in augmentations.py). With the default sRGB downscaling my first validation was:

21-11-22 03:56:04.056 - INFO: <epoch: 4, iter: 500> PSNR: 24.702, SSIM: 0.68838, LPIPS: 0.12116

but with linear RGB downscaling I got:

21-11-22 04:36:38.677 - INFO: <epoch: 4, iter: 500> PSNR: 26.169, SSIM: 0.7637, LPIPS: 0.10971

I repeated the run and got similar results. I used the basic train_sr.yml with no substantial changes.
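For anyone wanting to reproduce this, the change amounts to wrapping whatever resize call the augmentation makes in the two colour-space conversions. This is a generic sketch, not the actual MLResize/augmennt code; resize_in_linear and resize_fn are made-up names (the transfer functions are repeated for completeness):

```python
import numpy as np

def srgb2linear(img):
    return np.where(img <= 0.04045, img / 12.92, ((img + 0.055) / 1.055) ** 2.4)

def linear2srgb(img):
    return np.where(img <= 0.0031308, img * 12.92, 1.055 * img ** (1 / 2.4) - 0.055)

def resize_in_linear(img_uint8, resize_fn):
    """Run an arbitrary resize callable in linear light.

    img_uint8: HxWxC uint8 sRGB image.
    resize_fn: callable taking and returning a float array in [0, 1].
    """
    lin = srgb2linear(img_uint8.astype(np.float64) / 255.0)
    out = linear2srgb(resize_fn(lin))
    # round before casting back to uint8, to avoid truncation bias
    return np.around(np.clip(out, 0.0, 1.0) * 255.0).astype(np.uint8)
```

With OpenCV, resize_fn would be something like lambda a: cv2.resize(a, (w, h), interpolation=cv2.INTER_AREA).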
Using these two models I took this original image and produced these two output images (which I've downscaled back to the original size to demonstrate the colour differences).
With no modifications to trainner (sRGB downscaling):
With augment.py switched to use linear RGB downscaling:
While this was a very small test, I do think it demonstrates the impact of the downscaling colour space on training. Neither model was perfect within 500 iterations, but the model trained with linear RGB downscaling was much closer in colour. The point isn't that either of these two models is actually any good, but that the downscaling colour space can make a difference. This should be repeatable.
Even compared to one of the more well-regarded models (yandere_neo_xl), the results are roughly on par in terms of colour accuracy despite only 500 iterations (I'd argue subjectively better, but marginally worse by PSNR/MAE).
> no current SOTA project does SRGB to linear conversions before doing the images operations and results are not impacted
I honestly believe this is incorrect. The results may not be obviously, visibly incorrect most of the time, but I do believe they are impacted. The damage done to an image downscaled in the sRGB colour space can be fairly subtle (shifted hues), and the models are trained to guess how to reverse this process, which will have unpredictable effects on output colours.
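The effect is easy to demonstrate numerically: averaging two pixels directly in sRGB undershoots the physically correct mix, because the average is taken on gamma-encoded values. A small sketch, assuming the standard sRGB transfer function:

```python
import numpy as np

def to_linear(c):  # IEC 61966-2-1 sRGB decoding
    return np.where(c <= 0.04045, c / 12.92, ((c + 0.055) / 1.055) ** 2.4)

def to_srgb(c):  # encoding back to sRGB
    return np.where(c <= 0.0031308, c * 12.92, 1.055 * c ** (1 / 2.4) - 0.055)

red = np.array([1.0, 0.0, 0.0])
green = np.array([0.0, 1.0, 0.0])

naive = (red + green) / 2                                   # averaged in sRGB
correct = to_srgb((to_linear(red) + to_linear(green)) / 2)  # averaged in linear

# naive   -> roughly [128, 128, 0] on an 8-bit scale: a visibly darker yellow
# correct -> roughly [188, 188, 0]: the physically correct mix
```

This is exactly the bias a downscaling filter bakes into every mixed-colour edge when it operates on gamma-encoded values.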
I've had extremely positive results in terms of colour accuracy from training some non-trivial models locally. While I wasn't entirely happy with this one for other reasons and killed it at 183k iterations, this 2x model shows no discernible colour distortion at all.
Here's the same image round-tripped through my model and then downscaled to the original size to show colour accuracy. It's not just this image, either: I can get noticeable colour distortion passing other images through all the other ESRGAN models I've tried (not all of them from the wiki, but a decent selection), but my model maintains colours just about perfectly.
I do believe downscaling properly makes a difference, and I think at least some of the colour inaccuracies plaguing ESRGAN models can be attributed to downscaling in the sRGB colour space. Without correcting for gamma, downscaling will cause hue shifts that the model has to learn to reverse. I may repeat the experiment for the same duration without linear RGB downscaling, but because it takes so long I'll leave that for later. I've already proven the results to my own satisfaction, enough to keep using this edit locally.