This project forked from andreguo/hdrtvdm



License: Mozilla Public License 2.0



HDRTVDM

The official repo of the paper "Learning a Practical SDR-to-HDRTV Up-conversion using New Dataset and Degradation Models" (paper (ArXiv), paper, supplementary material) in CVPR2023.

@InProceedings{Guo_2023_CVPR,
    author    = {Guo, Cheng and Fan, Leidong and Xue, Ziyu and Jiang, Xiuhua},
    title     = {Learning a Practical SDR-to-HDRTV Up-Conversion Using New Dataset and Degradation Models},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {22231-22241}
}

1. Introduction

1.1. Our scope

There are many HDR-related methods in this year's CVPR. Our method differs from the others in that it takes a conventional SDR/BT.709 image to HDR/WCG in PQ/BT.2020 (which is called HDRTV by HDRTVNet (ICCV21)), and is meant to be applied in the media industry.

Our task can be called: SDR-to-HDRTV, ITM (inverse tone-mapping) or HDR/WCG up-conversion.

Other methods may take a single SDR to a linear-light HDR for graphics/rendering (SI-HDR, single-image HDR reconstruction), or merge several SDRs into a single HDR in the camera imaging pipeline (MEF-HDR, multi-exposure-fusion HDR imaging). Please turn to them if that is what you are interested in.

1.2 What we provide

  • PyTorch implementation of our luminance segmented network (LSN) with Transformer-UNet and self-adaptive convolution.
  • A new training set named HDRTV4K (3848 HDR/WCG-SDR image pairs; the current largest elsewhere offers 1235).
  • HDRTV4K's new test set (400 GT-LQ pairs; the current largest elsewhere offers 160). Both the test and training sets are provided in 7 versions of degradation models.
  • MATLAB implementation of the no-reference HDR/WCG metrics FHLP/EHL/FWGP/EWG.
  • Other discussions...

1.3 Changelog

13 Dec 2023: Since most SoTA methods are still trained with the YouTube degradation model (DM), we added this DM to both our training and test sets, so you can: (1) train your network with the YouTube version of the HDRTV4K training set and get a look similar to SoTA methods; (2) directly test a SoTA method's original checkpoint (trained with the YouTube DM) on the YouTube version of the HDRTV4K test set.

14 Jan 2024: We changed LSN (our network)'s default checkpoint to the one trained with the common HDRTV1K dataset (and the YouTube DM), so you can directly compare it with SoTA methods in the old manner (PSNR, SSIM, etc.).

2. HDRTV4K Dataset (Training set & test set)

2.1 HDRTV4K Training set

Our major concerns on training data, and the corresponding benefit to the model, are:

  • (1) The label HDR/WCG's (scene) diversity, for better generalization ability.
  • (2) The label HDR/WCG's quality (especially the amount of advanced color and luminance volume), for more chance to produce advanced HDR/WCG volume.
  • (3) The SDR's extent of degradation, for a proper degradation-recovery ability.
  • (4) The style and aesthetic of the degraded SDR, for better aesthetic performance (or consistency with the SDR).

Hence, we provide the HDRTV4K label HDR (3848 individual frames) with better (1) quality and (2) diversity, available at:

Training set label HDR/WCG download
BaiduNetDisk, GoogleDrive(TODO)

After obtaining the label HDR, you can:

2.1.1. OPTION 1: Download the corresponding degraded SDR below:

Each SDR version is listed as: degradation model (DM); who uses it; (3) extent of degradation; (4) style or aesthetic; download.

  • OCIO2; used by our method; moderate degradation; good style. Download: GoogleDrive, BaiduNetDisk (2.27GB)
  • 2446c+GM; used by our method; moderate degradation; good style. Download: GoogleDrive, BaiduNetDisk (2.03GB)
  • HC+GM; used by our method; more degradation; moderate style. Download: GoogleDrive, BaiduNetDisk (2.13GB)
  • 2446a; used by Chen2021; less degradation; bad style. Download: BaiduNetDisk
  • Reinhard; used by SR-ITM-GAN etc.; less degradation; moderate style. Download: OneDrive, BaiduNetDisk
  • YouTube; used by most other methods that train on the HDRTV1K or KAIST training set (if used, you can learn a style similar to previous methods); more degradation; bad style. Download: GoogleDrive, BaiduNetDisk (2.51GB)
  • 2390EETF+GM; used by Zhang2023; degradation: TODO; style: TODO. Download: OneDrive, BaiduNetDisk
  • DaVinci (w. different settings); used by our other algorithm ITM-LUT; less degradation; good style. Download: GoogleDrive, BaiduNetDisk

and use any of them as the input to train your network.
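Whichever SDR version you pick, the frames only need consistent normalization before entering a training loop. Below is a minimal, dependency-free sketch; the function name and the assumption that the SDR is 8-bit and the label HDR is a 16-bit array (e.g. decoded from a 16-bit .tif) are ours, not part of the repo:

```python
import numpy as np

def load_pair(sdr_u8, hdr_u16):
    """Normalize an 8-bit SDR frame and a 16-bit HDR frame to float32 [0, 1].

    In practice the arrays would come from an image reader such as
    cv2.imread(..., cv2.IMREAD_UNCHANGED); arrays are taken directly here
    so the sketch stays self-contained.
    """
    sdr = sdr_u8.astype(np.float32) / 255.0    # 8-bit full range
    hdr = hdr_u16.astype(np.float32) / 65535.0  # 16-bit full range
    return sdr, hdr
```

The only design point is that both ends of a pair must be scaled by their own bit-depth maximum, so the network sees a consistent [0, 1] SDR-HDR relation.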

Since our degradation models (DMs) are just a preliminary attempt on concerns (3) and (4), we encourage you to:

2.1.2. OPTION 2 (Encouraged): Use your own degradation model to obtain input SDR

In this case, you can:

  • Change the style and aesthetic of the degraded SDR to better suit your own technical and artistic intention, or bring in your expertise in color science etc. for more precise control of the relation between SDR and HDR.
  • Control the extent of degradation to follow the statistics of the target SDR in your own application scenario (e.g. remastering legacy SDR or converting on-the-air SDR). You can even add diversity to the extent of degradation to endow your network with generalizability to various extents of degradation.
  • Add new types of degradation, e.g. camera noise, compression artifacts, motion blur, chromatic aberration, film grain etc., for a more specific application scenario; these degradations have been modeled more extensively by both traditional and deep-learning methods.
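To make the idea of a hand-rolled degradation model concrete, the sketch below maps a linear-light HDR frame down to 8-bit SDR with Reinhard global tone mapping, gamma encoding, and optional noise. Every name and parameter here is illustrative; this is not the HDRTV4K pipeline:

```python
import numpy as np

def degrade_to_sdr(hdr_linear, exposure=1.0, gamma=2.4, noise_sigma=0.0):
    """Degrade a linear-light HDR frame (float, values >= 0) to 8-bit SDR.

    Steps: Reinhard global tone mapping (compresses highlights into [0, 1)),
    a simple gamma OETF, optional Gaussian noise, then 8-bit quantization.
    """
    x = hdr_linear * exposure
    tone_mapped = x / (1.0 + x)                   # Reinhard: x -> x / (1 + x)
    encoded = np.power(tone_mapped, 1.0 / gamma)  # gamma encoding
    if noise_sigma > 0:
        encoded = encoded + np.random.normal(0.0, noise_sigma, encoded.shape)
    return np.clip(np.round(encoded * 255.0), 0, 255).astype(np.uint8)
```

Each step is a knob for the concerns above: `exposure` and the tone curve set the extent of degradation (3), `gamma` and any grading applied before quantization set the style (4), and the noise term adds a new degradation type.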

2.2 HDRTV4K Test set

The test set used in our paper (consecutive frames) is copyrighted and will not be released. We provide an alternative test set consisting of 400 individual frames covering even more scenes. The HDRTV4K test set shares similar concerns with the training set:

The better the test set is in each aspect below, the more it will manifest the corresponding ability of an algorithm:

  • (1) GT HDR/WCG's (scene) diversity: scene-generalization ability.
  • (2) GT HDR/WCG's advanced color and luminance volume: mapping/expansion ability for advanced HDR/WCG volume.
  • (3a) Input SDR's extent of degradation: degradation-recovery ability.
  • (3b) Input SDR's diversity of degradation: degradation-generalization ability.

It's available on:

Test set GT and LQ download
BaiduNetDisk and GoogleDrive(TODO)

This package contains 1 version of GT and 7 versions of LQ by different degradation models, so:

  • You should test on the matching test set (i.e. if your model is trained with OCIO2 SDR, you should also test it on OCIO2 SDR); otherwise conventional distance-based metrics (PSNR, SSIM, deltaE and VDP) will not work, since the SDR-HDR/WCG numerical relation differs between the training and test sets, just as a model trained on ExpertC of the Adobe-MIT-5K dataset will score lower on ExpertA.
  • You can also take only our GT and use your own degradation model to generate the input LQ SDR, to test different aspects of a method's performance.
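The reason a mismatched degradation model breaks distance-based metrics can be seen from how, for example, PSNR is computed: any systematic offset between prediction and GT counts as error, even if both are individually plausible. A minimal sketch, assuming float images scaled to [0, 1]:

```python
import numpy as np

def psnr(pred, gt, peak=1.0):
    """Peak signal-to-noise ratio (dB) between two arrays in [0, peak]."""
    mse = np.mean((pred.astype(np.float64) - gt.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

A constant shift of 0.1 over the whole image, the kind of brightness/style difference two degradation models can easily introduce, already caps the score at 20 dB regardless of how well structure is recovered.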

3. Luminance Segmented Network

3.1 Prerequisites

  • Python
  • PyTorch
  • OpenCV
  • ImageIO
  • NumPy

3.2 Usage (how to test)

Run method/test.py with the configuration(s) below:

python3 method/test.py frameName.jpg

For batch processing, use the wildcard *:

python3 method/test.py framesPath/*.png

or like:

python3 method/test.py framesPath/footageName_*.png
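If your shell does not expand wildcards (e.g. Windows cmd), you can expand them yourself and invoke the script once per frame. A hypothetical sketch, assuming the paths and CLI shown in the examples above:

```python
import glob
import subprocess
import sys

# Expand the wildcard in Python when the shell will not do it for us,
# then call method/test.py once per matched frame. "framesPath" is the
# placeholder directory used in the examples above.
frames = sorted(glob.glob("framesPath/footageName_*.png"))
for frame in frames:
    subprocess.run([sys.executable, "method/test.py", frame], check=True)
```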

Add the configuration(s) below for a specific purpose:

  • Specifying the output path: -out resultDir/ (default is the input directory)
  • Resizing the image before inference: -resize True -height newH -width newW
  • Adding a filename tag: -tag yourTag
  • Forcing CPU processing: -use_gpu False
  • Using input SDR with bit depth != 8: e.g. -in_bitdepth 16
  • Saving the result HDR in another format (default is an uncompressed 16-bit .tif of a single frame): -out_format suffix, where png saves a 16-bit .png and exr requires the extra package OpenEXR

Change line 104 in method/test.py to use other parameters/checkpoints:

  • The current method/params.pth is the latest checkpoint, trained on the common HDRTV1K dataset (YouTube degradation model) like most SoTA methods; it scores 37.090 dB PSNR, 0.9813 SSIM, 9.9091 $\Delta$Eitp and 9.0997 VDP3 ('task'='side-by-side', 'color_encoding'='rgb-bt.2020', 'pixel_per_degree'=60 on 1920*1080 images) on the mostly-used HDRTV1K test set. This checkpoint was not elaborately trained, so you may retrain another one using your own tricks.
  • method/params_3DM.pth was trained on our HDRTV4K dataset with 3 degradation models (2446c+GM, HC+GM and OCIO2), and produces the same look as presented in our paper.
  • method/params_DaVinci.pth was trained on our HDRTV4K dataset with the DaVinci degradation model; this degradation model is used in our other algorithm ITM-LUT.
  • We will release more interesting checkpoint(s) later.

4. Assessment criteria of HDR/WCG container and ITM process

In our paper we use 4 metrics to measure how much HDR/WCG volume a single frame possesses:

  • HDR (high dynamic range) volume: spatial fraction FHLP (Fraction of HighLight Pixels); numerical energy EHL (Extent of HighLight).
  • WCG (wide color gamut) volume: spatial fraction FWGP (Fraction of Wide Gamut Pixels); numerical energy EWG (Extent of Wide Gamut).

You can find their usage in the code comments.
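For intuition, FHLP can be thought of as the fraction of pixels whose absolute luminance, after decoding the PQ signal, exceeds a highlight threshold. The sketch below uses the standard SMPTE ST 2084 (PQ) EOTF; the 203-nit threshold and the exact pixel statistics are our assumptions, and the official definitions live in the MATLAB code:

```python
import numpy as np

# SMPTE ST 2084 (PQ) EOTF constants
M1, M2 = 2610 / 16384, 2523 / 4096 * 128
C1, C2, C3 = 3424 / 4096, 2413 / 4096 * 32, 2392 / 4096 * 32

def pq_eotf(e):
    """Non-linear PQ code value in [0, 1] -> absolute luminance in nits."""
    ep = np.power(np.clip(e, 0.0, 1.0), 1.0 / M2)
    return 10000.0 * np.power(np.maximum(ep - C1, 0.0) / (C2 - C3 * ep), 1.0 / M1)

def fhlp(luma_pq, threshold_nits=203.0):
    """Fraction of pixels brighter than the threshold (203 nits is the
    BT.2408 reference white; the paper's exact threshold may differ)."""
    return float(np.mean(pq_eotf(luma_pq) > threshold_nits))
```

EHL would additionally weight those highlight pixels by how far their luminance exceeds the threshold, i.e. an energy rather than a count.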

Note that, from the perspective of quality assessment (QA), these metrics have not been proven to correlate consistently and positively with a good viewing experience, so they should serve only as a reference for HDR/WCG volume. The perception of HDR/WCG involves sophisticated knowledge of color science, human vision etc., and intuitively these 4 metrics should be measured in a "naturalness" way: counting the distribution of FHLP/EHL/FWGP/EWG over large-scale visually-pleasing HDR/WCG images, and judging whether a given frame's FHLP/EHL/FWGP/EWG falls within the common distribution.

TO BE UPDATED

Still something to discuss?

  • From the perspective of quality assessment (QA), the assessment of the ITM/up-conversion (enhancement) process is still an open task. We and our colleagues are currently working on it; please refer to here or here.
  • ...TO BE UPDATED

Contact

Guo Cheng (Andre Guo) [email protected]

  • State Key Laboratory of Media Convergence and Communication (MCC), Communication University of China (CUC), Beijing, China.
  • Peng Cheng Laboratory (PCL), Shenzhen, China.

Contributors

  • andreguo
  • yinnhao
