
Comments (6)

clennan commented on May 23, 2024

Hi, yes, it is a rather simple transfer learning application. I guess the only special part is the use of Earth Mover Loss and the label structure it requires.
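For readers unfamiliar with the loss mentioned here, the following is a minimal sketch of an Earth Mover's Distance between two score distributions, in the form commonly used for NIMA-style models (cumulative distributions compared bucket by bucket, with r = 2). It assumes each label is a normalised histogram over ordered score buckets. The function name and the plain-Python formulation are illustrative and are not taken from the repo.

```python
def earth_mover_loss(y_true, y_pred, r=2):
    """EMD between two discrete distributions over ordered score buckets.

    Both inputs are assumed to be sequences of probabilities that sum to 1
    and share the same bucket order (e.g. score 1, 2, ..., 10).
    """
    if len(y_true) != len(y_pred):
        raise ValueError("distributions must have the same number of buckets")

    cdf_true = 0.0
    cdf_pred = 0.0
    total = 0.0
    for p, q in zip(y_true, y_pred):
        # Accumulate both cumulative distribution functions bucket by bucket.
        cdf_true += p
        cdf_pred += q
        total += abs(cdf_true - cdf_pred) ** r

    # Average over buckets, then take the r-th root.
    return (total / len(y_true)) ** (1.0 / r)


if __name__ == "__main__":
    target = [1.0, 0.0, 0.0]
    near_miss = [0.0, 1.0, 0.0]
    far_miss = [0.0, 0.0, 1.0]
    # Mass moved further along the ordered buckets is penalised more.
    print(earth_mover_loss(target, near_miss), earth_mover_loss(target, far_miss))
```

Unlike plain cross-entropy, this penalises a prediction more the further its probability mass sits from the target bucket, which is why the labels need to be ordered distributions rather than single class indices.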

The quality of the technical model is also not good enough for us at the moment, so we are not using it in production. The main problems, in my opinion, are a) the TID2013 dataset itself, which is very small and whose distortions do not generalize well, and b) the use of CNNs that are optimised for image recognition (ImageNet) for a task that is inherently object-independent. So creating a bigger dataset (maybe with synthetic labels?!) and experimenting with other CNN architectures are sensible next steps in my opinion.

Why MobileNet? Mainly inference speed, as every millisecond makes a difference when predicting millions of images.

from image-quality-assessment.

soldierofhell commented on May 23, 2024
  • EML: I'm not convinced this really makes a difference, especially for TID. As you wrote, the distribution for TID is artificial:

For the TID2013 dataset, used for the technical classifications, we inferred the distribution from the mean score given for each image.

  • Ok, actually I don't have experience with MobileNet; I always thought it was for "on the mobile" applications. Have you compared its accuracy with Inception/ResNet? (I noticed a reference to keras_applications in the code.)

  • I agree that object detection is a better direction. In fact, I think that for some tasks the problem could be interpreted as an object detection task. I mean, if an image should contain an instance of an object and it is found with a good confidence level, that means the photo is of good quality.

  • The other, more technical approach could be to generate a bigger dataset in the vein of TID with an augmentation library like imgaug, train a new CNN for the specific task (possibly with transfer learning from this repo), and in the end mix it with the object detection approach and train on some dataset annotated by humans.
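The idea of generating a TID-like dataset from graded synthetic distortions can be sketched as follows. This is only an illustration under stated assumptions: it uses plain Python lists as grayscale images instead of imgaug, and the helper names (`add_noise`, `box_blur`, `make_distorted_set`) and the noise scale per level are hypothetical choices, not anything from the repo or from TID2013.

```python
import random


def add_noise(image, sigma, rng):
    """Add Gaussian noise to a grayscale image given as a list of rows."""
    return [
        [min(255, max(0, round(px + rng.gauss(0, sigma)))) for px in row]
        for row in image
    ]


def box_blur(image, radius):
    """Blur by averaging each pixel with its neighbours within `radius`."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            vals = [
                image[j][i]
                for j in range(max(0, y - radius), min(h, y + radius + 1))
                for i in range(max(0, x - radius), min(w, x + radius + 1))
            ]
            row.append(round(sum(vals) / len(vals)))
        out.append(row)
    return out


def make_distorted_set(image, levels=(1, 2, 3), seed=0):
    """Return distorted copies of `image`, each tagged with type and level."""
    rng = random.Random(seed)
    samples = []
    for level in levels:
        samples.append({"distortion": "noise", "level": level,
                        "image": add_noise(image, 5.0 * level, rng)})
        samples.append({"distortion": "blur", "level": level,
                        "image": box_blur(image, level)})
    return samples
```

The distortion level recorded for each sample could serve as a synthetic label, although how well such labels track human judgement is exactly the open question raised in this thread.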


clennan commented on May 23, 2024

Have you tried training a classification or regression model on mean scores on TID? Would be interesting to see some results for this. The distribution inference in our repo is very crude, so you might be right that EML doesn't help the technical model much.
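To make the "distribution from a mean score" step concrete, here is one simple way it could be done: a discretised Gaussian centred on the mean. This is a hypothetical sketch and not necessarily how the repo infers its distributions; the bucket range (1 to 10) and the fixed standard deviation are assumptions chosen only for illustration.

```python
import math


def mean_to_distribution(mean_score, std=1.5, buckets=tuple(range(1, 11))):
    """Spread a single mean score over ordered buckets as a normalised histogram.

    `std` is an assumed spread; with only a mean available, any choice
    here is a modelling assumption rather than information from the data.
    """
    weights = [math.exp(-0.5 * ((b - mean_score) / std) ** 2) for b in buckets]
    total = sum(weights)
    return [w / total for w in weights]


if __name__ == "__main__":
    dist = mean_to_distribution(5.0)
    print([round(p, 3) for p in dist])
```

Because the shape of such a label is fully determined by the mean and the assumed spread, it carries no more information than the mean itself, which is consistent with the doubt that EML helps much on TID.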

Just check the NIMA paper; they report results for MobileNet, Inception, and VGG16.

Actually, I was making the opposite point: don't use a CNN architecture that is optimized for image recognition to recognize technical image quality, as NIMA does. Technical image quality (like blur) should not depend on the objects in an image, in my opinion. It should rather depend on detecting sharp edges, colour compositions that relate to over- or underexposure, etc.
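As an illustration of what an object-independent sharpness cue can look like, the sketch below computes the variance of a 4-neighbour Laplacian response over a grayscale image. This is a classical blur heuristic, named here only as an example; it is not something proposed in this thread or used in the repo, and the list-of-rows image format is an assumption.

```python
def laplacian_variance(image):
    """Variance of the 4-neighbour Laplacian over the interior pixels.

    `image` is a grayscale image as a list of equal-length rows, at least
    3x3 in size. Sharp edges give large responses of both signs, so a
    higher variance suggests a sharper image.
    """
    h, w = len(image), len(image[0])
    if h < 3 or w < 3:
        raise ValueError("image must be at least 3x3")

    responses = []
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            lap = (image[y - 1][x] + image[y + 1][x]
                   + image[y][x - 1] + image[y][x + 1]
                   - 4 * image[y][x])
            responses.append(lap)

    mean = sum(responses) / len(responses)
    return sum((v - mean) ** 2 for v in responses) / len(responses)


if __name__ == "__main__":
    sharp = [[0, 0, 0, 255, 255, 255] for _ in range(5)]   # hard step edge
    soft = [[0, 51, 102, 153, 204, 255] for _ in range(5)]  # smooth ramp
    print(laplacian_variance(sharp), laplacian_variance(soft))
```

The measure only looks at local intensity changes, so it responds to how crisp the edges are and not to which objects they belong to.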


hcl14 commented on May 23, 2024

Perhaps one can make some analog of TID using random distortions and triplet loss. The idea is that it does not matter how the anchor image was initially distorted, but an image with any single additional distortion (let's call it distortion_1) must lie in between the anchor and another image with 2 distortions (distortion_1 + some distortion_2).

I wonder how good the absolute score from this model would be, but I suppose it would allow comparing distortion levels across different images.
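One way to read this proposal is as a standard triplet loss in embedding space, where the image with one extra distortion plays the positive and the image with two extra distortions plays the negative. The sketch below assumes embeddings are plain lists of floats and uses Euclidean distance with a margin of 1.0; both choices, and the function name, are illustrative assumptions.

```python
import math


def distortion_triplet_loss(anchor, one_distortion, two_distortions, margin=1.0):
    """Triplet loss: the singly distorted image should be closer to the
    anchor than the doubly distorted one, by at least `margin`."""
    d_near = math.dist(anchor, one_distortion)
    d_far = math.dist(anchor, two_distortions)
    return max(0.0, d_near - d_far + margin)


if __name__ == "__main__":
    anchor = [0.0, 0.0]
    once = [1.0, 0.0]    # anchor + distortion_1
    twice = [3.0, 0.0]   # anchor + distortion_1 + distortion_2
    print(distortion_triplet_loss(anchor, once, twice))   # ordering satisfied
    print(distortion_triplet_loss(anchor, twice, once))   # ordering violated
```

A loss of this kind constrains only the relative ordering of distances, which matches the caveat above that the absolute score may not be meaningful even if comparisons are.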


hcl14 commented on May 23, 2024

@soldierofhell

the order of predictions does not exactly match my human judgement

You can try the cosine distance between the activations of the last layers before the classifier; it worked better for me for comparing images.
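A minimal sketch of that comparison, assuming the feature vectors have already been extracted from the network as plain lists of floats (the extraction step itself depends on the framework and is not shown):

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity between two feature vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_u * norm_v)


if __name__ == "__main__":
    features_a = [0.2, 0.9, 0.1]
    features_b = [0.25, 0.8, 0.05]
    print(cosine_distance(features_a, features_b))
```

The result ranges from 0 for vectors pointing in the same direction to 2 for opposite directions, and it ignores vector magnitude, so it compares the pattern of activations and not their overall strength.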

@clennan Don't mind me throwing in my two cents?


clennan commented on May 23, 2024

Closing this for now, feel free to reopen if you want to discuss further.
