This project forked from andreguo/hdrtvdm



License: Mozilla Public License 2.0



HDRTVDM

The official repo of the paper "Learning a Practical SDR-to-HDRTV Up-conversion using New Dataset and Degradation Models" (paper (ArXiv), paper, supplementary material) in CVPR2023.

@InProceedings{Guo_2023_CVPR,
    author    = {Guo, Cheng and Fan, Leidong and Xue, Ziyu and Jiang, Xiuhua},
    title     = {Learning a Practical SDR-to-HDRTV Up-Conversion Using New Dataset and Degradation Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {22231-22241}
}

1. Introduction

1.1. Our scope

There are many HDR-related methods in this year's CVPR. Our method differs from the others in that it takes a conventional SDR/BT.709 image to HDR/WCG in PQ/BT.2020 (which is called HDRTV by HDRTVNet (ICCV21)), and is meant to be applied in the media industry.

Our task can be called: SDR-to-HDRTV, ITM (inverse tone-mapping) or HDR/WCG up-conversion.

Other methods may take a single SDR to a linear-light HDR for graphics/rendering (SI-HDR, single-image HDR reconstruction), or merge several SDRs into a single HDR in the camera imaging pipeline (MEF-HDR, multi-exposure-fusion HDR imaging). Please turn to them if that is what you are interested in.

1.2 What we provide

  • PyTorch implementation of our luminance segmented network (LSN) with Transformer-UNet and self-adaptive convolution.
  • A new training set named HDRTV4K (3848 HDR/WCG-SDR image pairs; the current largest elsewhere offers 1235).
  • HDRTV4K's new test set (400 GT-LQ pairs; the current largest elsewhere offers 160). Both the test and training sets are provided in 7 versions of degradation models.
  • MATLAB implementation of the no-reference HDR/WCG metrics FHLP/EHL/FWGP/EWG.
  • Other discussions...

1.3 Changelog

13 Dec 2023: Since most SoTA methods are still trained with the YouTube degradation model (DM), we added this DM to both our training and test sets, so you can: (1) train your network with the YouTube version of the HDRTV4K training set and get a look similar to SoTA methods; (2) directly test a SoTA method's original checkpoint (trained with the YouTube DM) on the YouTube version of the HDRTV4K test set.

14 Jan 2024: We changed LSN (our network)'s default checkpoint to the one trained with the common HDRTV1K dataset (and the YouTube DM), so you can directly compare it with SoTA methods in the old manner (PSNR, SSIM, etc.).

2. HDRTV4K Dataset (Training set & test set)

2.1 HDRTV4K Training set

Our major concerns on training data, and the corresponding benefit to the model, are:

  • (1) The label HDR/WCG's (scene) diversity, for better generalization ability.
  • (2) The label HDR/WCG's quality (especially the amount of advanced color and luminance volume), for more chance to produce advanced HDR/WCG volume.
  • (3) The SDR's extent of degradation, for a proper degradation-recovery ability.
  • (4) The style and aesthetic of the degraded SDR, for better aesthetic performance (or consistency with the SDR).

Hence, we provide the HDRTV4K label HDR (3848 individual frames) with better (1) quality and (2) diversity, available at:

Training set label HDR/WCG download
BaiduNetDisk, GoogleDrive(TODO)

After obtaining the label HDR, you can:

2.1.1. OPTION 1: Download the corresponding degraded SDR below:

Each SDR version is listed as: degradation model (DM); who uses it; (3) extent of degradation; (4) style or aesthetic; download.

  • OCIO2; used by our method; moderate degradation; good style. Download: GoogleDrive, BaiduNetDisk (2.27GB)
  • 2446c+GM; used by our method; moderate degradation; good style. Download: GoogleDrive, BaiduNetDisk (2.03GB)
  • HC+GM; used by our method; more degradation; moderate style. Download: GoogleDrive, BaiduNetDisk (2.13GB)
  • 2446a; used by Chen2021; less degradation; bad style. Download: BaiduNetDisk
  • Reinhard; used by SR-ITM-GAN etc.; less degradation; moderate style. Download: OneDrive, BaiduNetDisk
  • YouTube; used by most other methods that train on the HDRTV1K or KAIST training set (if used, you can learn a style similar to previous methods); more degradation; bad style. Download: GoogleDrive, BaiduNetDisk (2.51GB)
  • 2390EETF+GM; used by Zhang2023; degradation: TODO; style: TODO. Download: OneDrive, BaiduNetDisk
  • DaVinci (w. different settings); used by our other algorithm ITM-LUT; less degradation; good style. Download: GoogleDrive, BaiduNetDisk

and use any of them as the input to train your network.
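Whichever SDR version you pick, the frames only need consistent normalization before entering a training loop. Below is a minimal, dependency-free sketch; the function name and the assumption that the SDR is 8-bit and the label HDR is a 16-bit array (e.g. decoded from a 16-bit .tif) are ours, not part of the repo:

```python
import numpy as np

def load_pair(sdr_u8, hdr_u16):
    """Normalize an 8-bit SDR frame and a 16-bit HDR frame to float32 [0, 1].

    In practice the arrays would come from an image reader such as
    cv2.imread(..., cv2.IMREAD_UNCHANGED); arrays are taken directly here
    so the sketch stays self-contained.
    """
    sdr = sdr_u8.astype(np.float32) / 255.0    # 8-bit full range
    hdr = hdr_u16.astype(np.float32) / 65535.0  # 16-bit full range
    return sdr, hdr
```

The only design point is that both ends of a pair must be scaled by their own bit-depth maximum, so the network sees a consistent [0, 1] SDR-HDR relation.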

Since our degradation models (DMs) are just a preliminary attempt on concerns (3) and (4), we encourage you to:

2.1.2. OPTION 2 (Encouraged): Use your own degradation model to obtain input SDR

In this case, you can:

  • Change the style and aesthetic of the degraded SDR to better suit your own technical and artistic intention, or bring in your expertise in color science etc. for more precise control of the relation between SDR and HDR.
  • Control the extent of degradation to follow the statistics of the target SDR in your own application scenario (e.g. remastering legacy SDR or converting on-the-air SDR). You can even add diversity to the extent of degradation to endow your network with generalizability to various extents of degradation.
  • Add new types of degradation, e.g. camera noise, compression artifacts, motion blur, chromatic aberration, film grain etc., for a more specific application scenario; these degradations have been modeled more extensively by both traditional and deep-learning methods.
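To make the idea of a hand-rolled degradation model concrete, the sketch below maps a linear-light HDR frame down to 8-bit SDR with Reinhard global tone mapping, gamma encoding, and optional noise. Every name and parameter here is illustrative; this is not the HDRTV4K pipeline:

```python
import numpy as np

def degrade_to_sdr(hdr_linear, exposure=1.0, gamma=2.4, noise_sigma=0.0):
    """Degrade a linear-light HDR frame (float, values >= 0) to 8-bit SDR.

    Steps: Reinhard global tone mapping (compresses highlights into [0, 1)),
    a simple gamma OETF, optional Gaussian noise, then 8-bit quantization.
    """
    x = hdr_linear * exposure
    tone_mapped = x / (1.0 + x)                   # Reinhard: x -> x / (1 + x)
    encoded = np.power(tone_mapped, 1.0 / gamma)  # gamma encoding
    if noise_sigma > 0:
        encoded = encoded + np.random.normal(0.0, noise_sigma, encoded.shape)
    return np.clip(np.round(encoded * 255.0), 0, 255).astype(np.uint8)
```

Each step is a knob for the concerns above: `exposure` and the tone curve set the extent of degradation (3), `gamma` and any grading applied before quantization set the style (4), and the noise term adds a new degradation type.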

2.2 HDRTV4K Test set

The test set used in our paper (consecutive frames) is copyrighted and will not be released. We provide an alternative test set consisting of 400 individual frames covering even more scenes. The HDRTV4K test set shares similar concerns with the training set:

The better the test set is in each aspect below, the more it will manifest the corresponding ability of an algorithm:

  • (1) GT HDR/WCG's (scene) diversity: scene-generalization ability.
  • (2) GT HDR/WCG's advanced color and luminance volume: mapping/expansion ability for advanced HDR/WCG volume.
  • (3a) Input SDR's extent of degradation: degradation-recovery ability.
  • (3b) Input SDR's diversity of degradation: degradation-generalization ability.

It's available on:

Test set GT and LQ download
BaiduNetDisk and GoogleDrive(TODO)

This package contains 1 version of GT and 7 versions of LQ by different degradation models, so:

  • You should test on the matching test set (i.e. if your model is trained with OCIO2 SDR, you should also test it on OCIO2 SDR); otherwise conventional distance-based metrics (PSNR, SSIM, deltaE and VDP) will not work, since the SDR-HDR/WCG numerical relation differs between the training and test sets, just as a model trained on ExpertC of the Adobe-MIT-5K dataset will score lower on ExpertA.
  • You can also take only our GT and use your own degradation model to generate the input LQ SDR, to test different aspects of a method's performance.
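The reason a mismatched degradation model breaks distance-based metrics can be seen from how, for example, PSNR is computed: any systematic offset between prediction and GT counts as error, even if both are individually plausible. A minimal sketch, assuming float images scaled to [0, 1]:

```python
import numpy as np

def psnr(pred, gt, peak=1.0):
    """Peak signal-to-noise ratio (dB) between two arrays in [0, peak]."""
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

A constant shift of 0.1 over the whole image, the kind of brightness/style difference two degradation models can easily introduce, already caps the score at 20 dB regardless of how well structure is recovered.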

3. Luminance Segmented Network

3.1 Prerequisites

  • Python
  • PyTorch
  • OpenCV
  • ImageIO
  • NumPy

3.2 Usage (how to test)

Run method/test.py with the configuration(s) below:

python3 method/test.py frameName.jpg

For batch processing, use the wildcard *:

python3 method/test.py framesPath/*.png

or like:

python3 method/test.py framesPath/footageName_*.png
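If your shell does not expand wildcards (e.g. Windows cmd), you can expand them yourself and invoke the script once per frame. A hypothetical sketch, assuming the paths and CLI shown in the examples above:

```python
import glob
import subprocess
import sys

# Expand the wildcard in Python when the shell will not do it for us,
# then call method/test.py once per matched frame. "framesPath" is the
# placeholder directory used in the examples above.
frames = sorted(glob.glob("framesPath/footageName_*.png"))
for frame in frames:
    subprocess.run([sys.executable, "method/test.py", frame], check=True)
```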

Add the configuration(s) below for a specific purpose:

  • Specifying the output path: -out resultDir/ (default is the input directory)
  • Resizing the image before inference: -resize True -height newH -width newW
  • Adding a filename tag: -tag yourTag
  • Forcing CPU processing: -use_gpu False
  • Using input SDR with bit depth != 8: e.g. -in_bitdepth 16
  • Saving the result HDR in another format (default is an uncompressed 16-bit .tif of a single frame): -out_format suffix, where png saves a 16-bit .png and exr requires the extra package OpenEXR

Change line 104 in method/test.py to use other parameters/checkpoints:

  • The current method/params.pth is the latest checkpoint, trained on the common HDRTV1K dataset (YouTube degradation model) like most SoTA methods; it scores 37.090 dB PSNR, 0.9813 SSIM, 9.9091 $\Delta$Eitp and 9.0997 VDP3 ('task'='side-by-side', 'color_encoding'='rgb-bt.2020', 'pixel_per_degree'=60 on 1920*1080 images) on the mostly-used HDRTV1K test set. This checkpoint was not elaborately trained, so you may retrain another one using your own tricks.
  • method/params_3DM.pth was trained on our HDRTV4K dataset with 3 degradation models (2446c+GM, HC+GM and OCIO2), and produces the same look as presented in our paper.
  • method/params_DaVinci.pth was trained on our HDRTV4K dataset with the DaVinci degradation model; this degradation model is used in our other algorithm ITM-LUT.
  • We will release more interesting checkpoint(s) later.

4. Assessment criteria of HDR/WCG container and ITM process

In our paper we use 4 metrics to measure how much HDR/WCG volume a single frame possesses:

  • HDR (high dynamic range) volume: spatial fraction FHLP (Fraction of HighLight Pixels); numerical energy EHL (Extent of HighLight).
  • WCG (wide color gamut) volume: spatial fraction FWGP (Fraction of Wide Gamut Pixels); numerical energy EWG (Extent of Wide Gamut).

You can find their usage in the code comments.
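For intuition, FHLP can be thought of as the fraction of pixels whose absolute luminance, after decoding the PQ signal, exceeds a highlight threshold. The sketch below uses the standard SMPTE ST 2084 (PQ) EOTF; the 203-nit threshold and the exact pixel statistics are our assumptions, and the official definitions live in the MATLAB code:

```python
import numpy as np

# SMPTE ST 2084 (PQ) EOTF constants
M1, M2 = 2610 / 16384, 2523 / 4096 * 128
C1, C2, C3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32

def pq_eotf(e):
    """Non-linear PQ code value in [0, 1] -> absolute luminance in nits."""
    ep = np.power(np.clip(e, 0.0, 1.0), 1.0 / M2)
    return 10000.0 * np.power(np.maximum(ep - C1, 0.0) / (C2 - C3 * ep), 1.0 / M1)

def fhlp(luma_pq, threshold_nits=203.0):
    """Fraction of pixels brighter than the threshold (203 nits is the
    BT.2408 reference white; the paper's exact threshold may differ)."""
    return float(np.mean(pq_eotf(luma_pq) > threshold_nits))
```

EHL would additionally weight those highlight pixels by how far their luminance exceeds the threshold, i.e. an energy rather than a count.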

Note that, from the perspective of quality assessment (QA), these metrics have not been proven to correlate consistently and positively with a good viewing experience, so they should serve only as a reference for HDR/WCG volume. The perception of HDR/WCG involves sophisticated knowledge of color science, human vision etc., and intuitively these 4 metrics should be measured in a "naturalness" way: counting the distribution of FHLP/EHL/FWGP/EWG over large-scale visually-pleasing HDR/WCG images, and judging whether a given frame's FHLP/EHL/FWGP/EWG falls within the common distribution.

TO BE UPDATED

Still something to discuss?

  • From the perspective of quality assessment (QA), the assessment of the ITM/up-conversion (enhancement) process is still an open task. We and our colleagues are currently working on it; please refer to here or here.
  • ...TO BE UPDATED

Contact

Guo Cheng (Andre Guo) [email protected]

  • State Key Laboratory of Media Convergence and Communication (MCC), Communication University of China (CUC), Beijing, China.
  • Peng Cheng Laboratory (PCL), Shenzhen, China.

Contributors

  • andreguo
  • yinnhao
