Perceptual Quality Assessment of Smartphone Photography

This repository contains the constructed Smartphone Photography Attribute and Quality (SPAQ) database and implementations for the paper "Perceptual Quality Assessment of Smartphone Photography", Yuming Fang*, Hanwei Zhu*, Yan Zeng, Kede Ma, Zhou Wang, IEEE Conference on Computer Vision and Pattern Recognition, 2020. (*Equal contribution)

Download:

  • paper
  • supplementary
  • SPAQ database
  • PPT

Introduction

As smartphones become people's primary cameras, the quality of their cameras and the associated computational photography modules has become a de facto standard for evaluating and ranking smartphones in the consumer market. We conduct the most comprehensive study to date of perceptual quality assessment of smartphone photography. We introduce the Smartphone Photography Attribute and Quality (SPAQ) database, consisting of 11,125 pictures taken by 66 smartphones, where each image carries the richest annotations to date. Specifically, we collect a series of human opinions for each image, including image quality, image attributes (brightness, colorfulness, contrast, noisiness, and sharpness), and scene category labels (animal, cityscape, human, indoor scene, landscape, night scene, plant, still life, and others), in a well-controlled laboratory environment. The exchangeable image file format (EXIF) data for all images are also recorded to aid deeper analysis. We also make the first attempts to use the database to train blind image quality assessment (BIQA) models constructed by baseline and multi-task deep neural networks. The results provide useful insights into how EXIF data, image attributes, and high-level semantics interact with image quality, how next-generation BIQA models can be designed, and how better computational photography systems can be optimized on mobile devices.

Database

The SPAQ database and the annotations (MOS, image attribute scores, EXIF tags, and scene category labels) can be downloaded from Baidu Yun (Code: b29m) or Google Drive.
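Once the annotations are downloaded, a first sanity check might be to average MOS per scene category. The following is a minimal sketch, assuming the annotations have been exported to a CSV file with one scene label per row. The column names (`image`, `mos`, `scene`) and the sample rows are hypothetical placeholders, not the actual layout or values of the released files.

```python
import csv
import io
from collections import defaultdict

# Placeholder rows for illustration only; these are NOT values from SPAQ.
# The column names ("image", "mos", "scene") are assumptions as well.
SAMPLE_CSV = """image,mos,scene
a.jpg,40.0,landscape
b.jpg,60.0,landscape
c.jpg,70.0,night scene
"""


def mean_mos_by_scene(csv_file):
    """Average the MOS column for each scene category label."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for row in csv.DictReader(csv_file):
        totals[row["scene"]] += float(row["mos"])
        counts[row["scene"]] += 1
    return {scene: totals[scene] / counts[scene] for scene in totals}


# Prints {'landscape': 50.0, 'night scene': 70.0} for the placeholder rows.
print(mean_mos_by_scene(io.StringIO(SAMPLE_CSV)))
```

To use it on real data, pass an open file handle instead of the in-memory sample and adjust the column names to match the downloaded annotation files.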

Proposed Models

We train a baseline (BL) to predict the quality of captured images and three variants that make use of EXIF tags (MT-E), image attributes (MT-A), and scene category labels (MT-S). We provide two images to test the blind image quality assessment (BIQA) models.
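The multi-task idea can be illustrated with a generic loss that adds a weighted auxiliary term to the quality term. This is a sketch under stated assumptions, not the loss used by the released MT-E, MT-A, or MT-S models: the use of mean absolute error, the function names, and the `aux_weight` value are all hypothetical choices made for illustration.

```python
def mean_abs_error(predicted, target):
    """Mean absolute error between two equal-length, non-empty sequences."""
    if not predicted or len(predicted) != len(target):
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(predicted)


def multitask_loss(pred_quality, true_quality, pred_aux, true_aux, aux_weight=0.5):
    """Quality loss plus a weighted auxiliary loss for one batch.

    pred_quality / true_quality: one quality score per image.
    pred_aux / true_aux: one list of auxiliary targets per image
        (for example, attribute scores).
    aux_weight: hypothetical trade-off between the two terms.
    """
    quality_term = mean_abs_error(pred_quality, true_quality)
    # Flatten the per-image auxiliary lists so every value counts equally.
    flat_pred = [value for row in pred_aux for value in row]
    flat_true = [value for row in true_aux for value in row]
    aux_term = mean_abs_error(flat_pred, flat_true)
    return quality_term + aux_weight * aux_term
```

Setting `aux_weight=0` reduces this to a single-task quality loss, which is one way to think about the relationship between BL and the multi-task variants.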

Prerequisites

The released BIQA models were implemented and tested on Ubuntu 16.04 with

  • Python = 3.5.0
  • PyTorch = 1.1.0
  • torchvision = 0.3.0
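To compare a local environment against the versions listed above, a small helper along these lines may be useful. It is a sketch rather than part of the repository: the function name `installed_versions` is hypothetical, and it only reports what is installed without enforcing any version.

```python
import importlib
import platform


def installed_versions(packages=("torch", "torchvision")):
    """Return {name: version string, or None if the package is not importable}."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            versions[name] = None
        else:
            # Fall back to "unknown" for modules without a __version__ attribute.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print("{:<12} {}".format(name, version or "not installed"))
```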

Baseline Model (BL)

The pretrained checkpoint of BL can be obtained at BL_release.pt. To test the BL model with the default setting:

python BL_demo.py

Multi-Task Learning from EXIF Tags (MT-E)

The pretrained checkpoint of MT-E can be obtained at MT-E_release.pt. To test the MT-E model with the default setting:

python MT-E_demo.py

Multi-Task Learning from Image Attributes (MT-A)

The pretrained checkpoint of MT-A can be obtained at MT-A_release.pt. To test the MT-A model with the default setting:

python MT-A_demo.py

Multi-Task Learning from Scene Semantics (MT-S)

The pretrained checkpoint of MT-S can be obtained at MT-S_release.pt. To test the MT-S model with the default setting:

python MT-S_demo.py
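The four demo commands above can also be run in sequence from a single script. The sketch below assumes it is executed from the repository root and that each pretrained checkpoint has already been downloaded to wherever its demo script expects it. The helper names `build_commands` and `run_all` are hypothetical.

```python
import subprocess
import sys

# Script names are taken from the sections above.
DEMO_SCRIPTS = ["BL_demo.py", "MT-E_demo.py", "MT-A_demo.py", "MT-S_demo.py"]


def build_commands(scripts=DEMO_SCRIPTS, python=sys.executable):
    """Return one argument list per demo script, e.g. [python, "BL_demo.py"]."""
    return [[python, script] for script in scripts]


def run_all(commands, cwd="."):
    """Run each command in order; check=True raises at the first failure."""
    for command in commands:
        print("Running:", " ".join(command))
        subprocess.run(command, cwd=cwd, check=True)
```

Calling `run_all(build_commands())` runs the demos one after another and stops with `subprocess.CalledProcessError` if any of them exits with a non-zero status.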

Citation

@inproceedings{fang2020cvpr,
title={Perceptual Quality Assessment of Smartphone Photography},
author={Fang, Yuming and Zhu, Hanwei and Zeng, Yan and Ma, Kede and Wang, Zhou},
booktitle={IEEE Conference on Computer Vision and Pattern Recognition},
pages={3677-3686},
year={2020}
}
