
neural-image-assessment's Introduction

NIMA: Neural IMage Assessment

Python 3.6+ MIT License

This is a PyTorch implementation of the paper NIMA: Neural IMage Assessment (accepted at IEEE Transactions on Image Processing) by Hossein Talebi and Peyman Milanfar. You can learn more from this post at Google Research Blog.

Implementation Details

  • The model was trained on the AVA (Aesthetic Visual Analysis) dataset, which contains 255,500+ images. You can get it from here. Note: there may be some corrupted images in the dataset; remove them before you start training, or use the provided CSVs, which have already filtered them out for you.

  • Dataset is split into 229,981 images for training, 12,691 images for validation and 12,818 images for testing.

  • An ImageNet-pretrained VGG-16 is used as the base network. It should be easy to plug in the other two options from the paper (MobileNet and Inception-v2); a sketch of the model head is shown after this list.

  • The learning rate setting differs from the original paper. I couldn't get the model to converge using the original parameters, and I didn't do much hyperparameter tuning, so you could probably get better results. All other settings are mirrored directly from the paper.
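
For reference, here is a minimal sketch of how the NIMA head can be attached to a VGG-16 base, following the paper's description; the dropout rate and layer sizes here are assumptions and may differ from model.py in this repo.

import torch.nn as nn
import torchvision.models as models

class NIMAHead(nn.Module):
    """VGG-16 convolutional base followed by a 10-way softmax over score buckets 1-10."""
    def __init__(self, base_model, num_classes=10):
        super().__init__()
        self.features = base_model.features          # pretrained convolutional base
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.75),                      # dropout rate reported in the paper
            nn.Linear(512 * 7 * 7, num_classes),     # 25088 features for a 224x224 input
            nn.Softmax(dim=1),                       # distribution over the 10 rating buckets
        )

    def forward(self, x):
        out = self.features(x)
        out = out.view(out.size(0), -1)              # flatten before the dense layer
        return self.classifier(out)

# model = NIMAHead(models.vgg16(pretrained=True))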

Requirements

Code is written using PyTorch 1.8.1 with CUDA 11.1. You can recreate the environment I used with conda by running

conda env create -f env.yml

to install the dependencies.

Usage

To start training on the AVA dataset, first download the dataset from the link above and decompress it, which should create a directory named images/. Then download the curated annotation CSVs below, which already split the dataset (you can of course create your own split). Then do

python main.py --img_path /path/to/images/ --train --train_csv_file /path/to/train_labels.csv --val_csv_file /path/to/val_labels.csv --conv_base_lr 5e-4 --dense_lr 5e-3 --decay --ckpt_path /path/to/ckpts --epochs 100 --early_stopping_patience 10

For inference, do

python -W ignore test.py --model /path/to/your_model --test_csv /path/to/test_labels.csv --test_images /path/to/images --predictions /path/to/save/predictions

See predictions/ for dumped predictions as an example.

Training Statistics

Training is done with early stopping. Here I set early_stopping_patience=10.
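
For illustration, a minimal sketch of that early-stopping logic; train_one_epoch, evaluate_emd and save_checkpoint are hypothetical helpers, not functions from main.py.

# Illustrative early-stopping loop with validation EMD as the monitored metric.
best_val_emd = float("inf")
patience, bad_epochs = 10, 0

for epoch in range(num_epochs):
    train_one_epoch(model, train_loader)       # hypothetical helper
    val_emd = evaluate_emd(model, val_loader)  # hypothetical helper

    if val_emd < best_val_emd:
        best_val_emd, bad_epochs = val_emd, 0
        save_checkpoint(model, epoch)          # keep the best checkpoint so far
    else:
        bad_epochs += 1
        if bad_epochs >= patience:             # stop after `patience` epochs without improvement
            break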

Pretrained Model

The pretrained model reaches ~0.069 EMD on validation. It has not fully converged yet (constrained by compute resources). To continue training, download the pretrained weights and add --warm_start --warm_start_epoch 34 to your args.

Google Drive

Annotation CSV Files

Train Validation Test

Example Results

  • Below are some good predictions from the test set. Each image title starts with the ground-truth rating, followed by the predicted mean and standard deviation in parentheses.

  • There are also some failure cases; the model tends to fail on images with very low or very high aesthetic ratings.

  • The predicted aesthetic ratings from training on the AVA dataset are sensitive to contrast adjustments and prefer images with higher contrast. Below, the top row is the reference image with contrast c=1.0, while the bottom images are adjusted with contrast factors [0.25, 0.75, 1.25, 1.75]. Contrast adjustment is done using ImageEnhance.Contrast from PIL (in this case pillow-simd); a sketch follows below.
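
A minimal sketch of how such contrast variants can be generated with PIL (the file paths are illustrative):

from PIL import Image, ImageEnhance

# Generate contrast-adjusted variants of a reference image.
img = Image.open("reference.jpg").convert("RGB")
for factor in [0.25, 0.75, 1.0, 1.25, 1.75]:
    enhanced = ImageEnhance.Contrast(img).enhance(factor)  # factor 1.0 returns the original image
    enhanced.save("contrast_{}.jpg".format(factor))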

License

MIT

neural-image-assessment's People

Contributors

george3d6, yunxiaoshi


neural-image-assessment's Issues

About the test.py

Thank you for your excellent work. When I run test.py with the CSV file, it seems to go wrong at the following line:
"gt = test_df[test_df[0] == img].to_numpy()[:, 1:].reshape(10, 1)",
with this error:
ValueError: cannot reshape array of size 0 into shape (10,1).
Sorry to disturb you about this. Looking forward to your help.

Maybe the format of my CSV is different from that of the paper. How do I generate the 11-column CSV file the paper specifies?
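
For what it's worth, a minimal sketch of one way to build an 11-column CSV (image id followed by the 10 vote counts) from AVA.txt; the column positions assume the usual AVA.txt layout (index, image id, ten vote counts, tags, challenge id) and may need adjusting.

import pandas as pd

# Read the space-separated AVA.txt and keep image id plus the 10 vote counts.
ava = pd.read_csv("AVA.txt", sep=" ", header=None)
labels = ava.iloc[:, 1:12]          # 11 columns: image id, then counts for scores 1..10
labels.to_csv("labels.csv", index=False, header=False)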

I met this error when running python3 main.py

Hi, @kentsyx @George3d6

I met this error when running python3 main.py

/home/tezro/.local/lib/python3.7/site-packages/torchvision/transforms/transforms.py:187: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
warnings.warn("The use of the transforms.Scale transform is deprecated, " +
Trainable params: 14.97 million
/home/tezro/cocoapi/PythonAPI/Neural-IMage-Assessment/data_loader.py:31: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
annotations = self.annotations.iloc[idx, 1:].as_matrix()
/home/tezro/cocoapi/PythonAPI/Neural-IMage-Assessment/data_loader.py:31: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
annotations = self.annotations.iloc[idx, 1:].as_matrix()
/home/tezro/cocoapi/PythonAPI/Neural-IMage-Assessment/data_loader.py:31: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
annotations = self.annotations.iloc[idx, 1:].as_matrix()
/home/tezro/cocoapi/PythonAPI/Neural-IMage-Assessment/data_loader.py:31: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
annotations = self.annotations.iloc[idx, 1:].as_matrix()
Traceback (most recent call last):
File "main.py", line 241, in
main(config)
File "main.py", line 96, in main
for i, data in enumerate(train_loader):
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 819, in next
return self._process_data(data)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 846, in _process_data
data.reraise()
File "/usr/local/lib/python3.7/dist-packages/torch/_utils.py", line 385, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
return self.collate_fn(data)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 74, in default_collate
return {key: default_collate([d[key] for d in batch]) for key in elem}
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 74, in
return {key: default_collate([d[key] for d in batch]) for key in elem}
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 55, in default_collate
return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 3 and 1 in dimension 1 at /pytorch/aten/src/TH/generic/THTensor.cpp:689

My System: Ubuntu 19.04, Pytorch 1.4, Torchvision 0.4.2, TitanXP.

Thanks in advance.
Best
from @bemoregt.

The torchvision pretrained VGG-16 requires normalization of inputs and you do not do this

As per the torchvision documentation:

The images have to be loaded in to a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225]. You can use the following transform to normalize:

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

Not doing this will cause VGG-16 to output the wrong feature maps and you will probably get worse results. If you add this transform you will have to retrain though.

It requires me to install so many packages

Hi
Thanks for your code, I really appreciate it, but when I run it from the macOS terminal it keeps asking for packages that are not installed. I keep installing them but am still unable to run the training, and now I am stuck on "No module named tensorboardX".

Are there basic, solid steps I should follow to get all the required packages so I can run the code?

Sincerely

Abdullah

fully connected layers

This is great work, thanks. Just wondering, is there any particular reason the model does not include fully connected layers like VGG-16 does before the softmax?

Doubt regarding computing standard deviation

In lines 195-196 of main.py you compute the standard deviation of the score. The for loop is over the variable j, while the variable used inside the loop is i. I just wanted to check whether this is correct. Thank you in advance.
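
For reference, a minimal sketch of computing the mean and standard deviation from the 10-way softmax output (not the repo's exact code):

import torch

def score_stats(probs):
    """Mean and standard deviation of the predicted score distribution.

    probs is a length-10 tensor of softmax outputs; bucket k corresponds to score k+1.
    """
    scores = torch.arange(1, 11, dtype=probs.dtype)
    mean = (scores * probs).sum()
    std = torch.sqrt(((scores - mean) ** 2 * probs).sum())
    return mean, std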

No such file or directory - Please Help

Hello
I keep getting this error when I run the code
python main.py --img_path /path/to/images/ --train --train_csv_file /path/to/train_labels.csv --val_csv_file /path/to/val_labels.csv --conv_base_lr 3e-4 --dense_lr 3e-3 --decay --ckpt_path /path/to/ckpts --epochs 100 --early_stopping_patience 10

I downloaded the CSV files and put them in the main folder of Neural-IMage-Assessment-Master, but I get the error shown in the screenshot below (screenshot omitted).

@kentsyx @George3d6 @Bubbleinpit

Problematic Implementation of EMD Loss

The loss should be the L2 distance between the CDFs of the two distributions, not between their PDFs.

There are also some typos in the naming, such as emb vs. emd.
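
A minimal sketch of the EMD loss as described in the paper, i.e. the r-norm distance between CDFs (this is not the repo's implementation):

import torch

def emd_loss(p_true, p_pred, r=2):
    """EMD loss from the NIMA paper: L_r distance between cumulative distributions.

    Both inputs are (batch, 10) probability distributions over the score buckets.
    """
    cdf_true = torch.cumsum(p_true, dim=1)
    cdf_pred = torch.cumsum(p_pred, dim=1)
    diff = torch.abs(cdf_true - cdf_pred) ** r
    return (diff.mean(dim=1) ** (1.0 / r)).mean()   # average over buckets, then over the batch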

Metrics from original NIMA paper

Do you have an implementation of LCC/SRCC for evaluation? I have implemented them myself, but I'm not sure I have done it correctly.

Issue with the test.py

At line [66](https://github.com/kentsyx/Neural-IMage-Assessment/blob/f0028cd27de5cdb20a21c2b896999b3505bcb4f6/test.py#L66), it should be l+1 rather than l, because AVA votes start from 1 not 0.

Unable to import lrs

I have tried to run the code but am unable to debug the error "No module named 'lrs'". Can you help with it?

test_labels.csv

Hello, when I want to test my own images, how is test_labels.csv generated? What does test_labels.csv contain? Looking forward to your answer!

About AVA dataset

May I ask how you obtained your dataset? I also downloaded it from the link you gave, but there are many links on that web page, so I used the torrent file to download it. However, the dataset I downloaded produced some error messages when I ran the code with the label file you provided; there seem to be some problems with the images. Which link did you use to download it, or what did you do with the images?
Thank you very much

Some mistake in main.py

When I read your code, I found some issues.

First, in train and val, I think it is better to call model.train() and model.eval(). It may not make a difference for a VGG network, but it is necessary when the network has BatchNorm or Dropout layers.

Second, also in main.py, in test I think you should add with torch.no_grad():. Without it, gradient bookkeeping accumulates, so at test time, even on an 8 GB GPU with batch size 1, it cannot run because it goes out of CUDA memory.
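
An illustrative evaluation loop applying both suggestions (model and val_loader are assumed to already exist):

import torch

model.eval()                      # puts BatchNorm/Dropout layers into eval mode
with torch.no_grad():             # no autograd graph is built during evaluation
    for images, labels in val_loader:
        outputs = model(images)
        ...                       # compute EMD / other metrics here
model.train()                     # switch back to training mode afterwards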

Learning Rate Setting

Hi, I am experimenting with the NIMA implementation for a scientific project! In your README you say that "The learning rate setting differs from the original paper. I can't seem to get the model to converge with momentum SGD using an lr of 3e-7 for the conv base and 3e-6 for the dense block." Which settings did you use? The defaults in the argparser are set to 3e-7 and 3e-6, which is why I was wondering!

Thank you :)

CSV file

The CSV file in the link cannot be opened. The link seems to be invalid.

Pre-trained model giving vague results

I am trying to run this on single images and am not getting any mean value below 5.0; good-quality images also sometimes return low values.

I am sharing the main.py file, please check if anything is wrong with the code.

import argparse
import os

import numpy as np
import matplotlib
import matplotlib.pyplot as plt

import torch
from torch import no_grad
import torch.autograd as autograd
import torch.optim as optim

import torchvision.transforms as transforms
import torchvision.datasets as dsets
import torchvision.models as models

import torch.nn.functional as F

from model import *

import cv2
file_name = 'bad'
filename = '/home/shayan/Projects/NIMA/images/'+file_name+'.jpg'

image = cv2.imread(filename)
image = cv2.resize(image,(224,224))

img_arr = image.transpose(2, 0, 1) # C x H x W
img_arr = np.expand_dims(img_arr,axis = 0)
print(img_arr.shape)

img_tensor = torch.from_numpy(img_arr)
img_tensor = img_tensor.type('torch.FloatTensor')
print(img_tensor.shape,img_tensor.size)

cuda = torch.cuda.is_available()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
if cuda:
    print("Device: GPU")
else:
    print("Device: CPU")
    
base_model = models.vgg16(pretrained=True)
model = NIMA(base_model)

model.load_state_dict(torch.load("/home/shayan/Projects/NIMA/epoch-12.pkl", map_location=lambda storage, loc: storage))
print("Successfully loaded model")

with torch.no_grad():

    model.eval()

output = model(img_tensor)
output = output.view(10, 1)

predicted_mean, predicted_std = 0.0, 0.0
for i, elem in enumerate(output, 1):
    predicted_mean += i * elem
for j, elem in enumerate(output, 1):
    predicted_std += elem * (j - predicted_mean) ** 2
print("________________")
print(u"({}) \u00B1{}".format(round(float(predicted_mean),2), round(float(predicted_std), 2)))  

How to test my own photos?

How do I test my own pictures, e.g. from the COCO dataset, with your pretrained model? There is no CSV file for them, so I do not know how to run the test. Thanks

Runtime Error

(py3.6) C:\Users\baydogan\denemeee>python untitled1.py
Traceback (most recent call last):
File "", line 1, in
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\spawn.py", line 114, in _main
prepare(preparation_data)
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\spawn.py", line 225, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
run_name="mp_main")
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\runpy.py", line 85, in run_code
exec(code, run_globals)
File "C:\Users\baydogan\denemeee\untitled1.py", line 19, in
import lera
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\site-packages\lera_init
.py", line 53, in
chunk_list = mp.Manager().list()
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\context.py", line 56, in Manager
m.start()
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\managers.py", line 513, in start
self._process.start()
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\popen_spawn_win32.py", line 33, in init
prep_data = spawn.get_preparation_data(process_obj._name)
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
_check_not_importing_main()
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
is not going to be frozen to produce an executable.''')
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

Can you help me?
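
For illustration, a minimal sketch of the guard the traceback asks for; the body of main is a placeholder:

# On Windows the multiprocessing start method is spawn, so code that creates
# worker processes (e.g. a DataLoader with num_workers > 0) must run under
# the __main__ guard.
import multiprocessing

def main():
    ...  # build the dataset, DataLoader and training loop here

if __name__ == '__main__':
    multiprocessing.freeze_support()  # only needed when freezing into an executable
    main()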
