
neural-image-assessment's Introduction

NIMA: Neural IMage Assessment

Python 3.6+ MIT License

This is a PyTorch implementation of the paper NIMA: Neural IMage Assessment (accepted at IEEE Transactions on Image Processing) by Hossein Talebi and Peyman Milanfar. You can learn more from this post at Google Research Blog.

Implementation Details

  • The model was trained on the AVA (Aesthetic Visual Analysis) dataset, which contains 255,500+ images. You can get it from here. Note: there may be some corrupted images in the dataset; remove them before you start training, or use the provided CSVs, which have already filtered them out for you.

  • Dataset is split into 229,981 images for training, 12,691 images for validation and 12,818 images for testing.

  • An ImageNet-pretrained VGG-16 is used as the base network. It should be easy to plug in the other two options from the paper (MobileNet and Inception-v2); a sketch of the model head is shown after this list.

  • The learning rate setting differs from the original paper. I couldn't get the model to converge using the original parameters, and I didn't do much hyperparameter tuning, so you could probably get better results. All other settings are mirrored directly from the paper.
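
For reference, here is a minimal sketch of how the NIMA head can be attached to a VGG-16 base, following the paper's description; the dropout rate and layer sizes here are assumptions and may differ from model.py in this repo.

import torch.nn as nn
import torchvision.models as models

class NIMAHead(nn.Module):
    """VGG-16 convolutional base followed by a 10-way softmax over score buckets 1-10."""
    def __init__(self, base_model, num_classes=10):
        super().__init__()
        self.features = base_model.features          # pretrained convolutional base
        self.classifier = nn.Sequential(
            nn.Dropout(p=0.75),                      # dropout rate reported in the paper
            nn.Linear(512 * 7 * 7, num_classes),     # 25088 features for a 224x224 input
            nn.Softmax(dim=1),                       # distribution over the 10 rating buckets
        )

    def forward(self, x):
        out = self.features(x)
        out = out.view(out.size(0), -1)              # flatten before the dense layer
        return self.classifier(out)

# model = NIMAHead(models.vgg16(pretrained=True))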

Requirements

Code is written using PyTorch 1.8.1 with CUDA 11.1. You can recreate the environment I used with conda by running

conda env create -f env.yml

to install the dependencies.

Usage

To start training on the AVA dataset, first download the dataset from the link above and decompress it, which should create a directory named images/. Then download the curated annotation CSVs below, which already split the dataset (you can of course create your own split). Then do

python main.py --img_path /path/to/images/ --train --train_csv_file /path/to/train_labels.csv --val_csv_file /path/to/val_labels.csv --conv_base_lr 5e-4 --dense_lr 5e-3 --decay --ckpt_path /path/to/ckpts --epochs 100 --early_stopping_patience 10

For inference, do

python -W ignore test.py --model /path/to/your_model --test_csv /path/to/test_labels.csv --test_images /path/to/images --predictions /path/to/save/predictions

See predictions/ for dumped predictions as an example.

Training Statistics

Training is done with early stopping. Here I set early_stopping_patience=10.
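
For illustration, a minimal sketch of that early-stopping logic; train_one_epoch, evaluate_emd and save_checkpoint are hypothetical helpers, not functions from main.py.

# Illustrative early-stopping loop with validation EMD as the monitored metric.
best_val_emd = float("inf")
patience, bad_epochs = 10, 0

for epoch in range(num_epochs):
    train_one_epoch(model, train_loader)       # hypothetical helper
    val_emd = evaluate_emd(model, val_loader)  # hypothetical helper

    if val_emd < best_val_emd:
        best_val_emd, bad_epochs = val_emd, 0
        save_checkpoint(model, epoch)          # keep the best checkpoint so far
    else:
        bad_epochs += 1
        if bad_epochs >= patience:             # stop after `patience` epochs without improvement
            break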

Pretrained Model

The pretrained model reaches ~0.069 EMD on validation. It has not fully converged yet (constrained by compute resources). To continue training, download the pretrained weights and add --warm_start --warm_start_epoch 34 to your args.

Google Drive

Annotation CSV Files

Train Validation Test

Example Results

  • Below are some good predictions from the test set. Each image title starts with the ground-truth rating, followed by the predicted mean and standard deviation in parentheses.

  • There are also some failure cases; the model tends to fail on images with very low or very high aesthetic ratings.

  • The predicted aesthetic ratings from training on the AVA dataset are sensitive to contrast adjustments and prefer images with higher contrast. Below, the top row is the reference image with contrast c=1.0, while the bottom images are adjusted with contrast factors [0.25, 0.75, 1.25, 1.75]. Contrast adjustment is done using ImageEnhance.Contrast from PIL (in this case pillow-simd); a sketch follows below.
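
A minimal sketch of how such contrast variants can be generated with PIL (the file paths are illustrative):

from PIL import Image, ImageEnhance

# Generate contrast-adjusted variants of a reference image.
img = Image.open("reference.jpg").convert("RGB")
for factor in [0.25, 0.75, 1.0, 1.25, 1.75]:
    enhanced = ImageEnhance.Contrast(img).enhance(factor)  # factor 1.0 returns the original image
    enhanced.save("contrast_{}.jpg".format(factor))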

License

MIT

neural-image-assessment's People

Contributors

george3d6, yunxiaoshi


neural-image-assessment's Issues

About the test.py

Thank you for your excellent work. When I run test.py with the CSV file, it seems to go wrong at the following line:
"gt = test_df[test_df[0] == img].to_numpy()[:, 1:].reshape(10, 1)",
with this error:
ValueError: cannot reshape array of size 0 into shape (10,1).
Sorry to disturb you about this. Looking forward to your help.

Maybe the format of my CSV is different from that of the paper. How do I generate the 11-column CSV file the paper specifies?
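
For what it's worth, a minimal sketch of one way to build an 11-column CSV (image id followed by the 10 vote counts) from AVA.txt; the column positions assume the usual AVA.txt layout (index, image id, ten vote counts, tags, challenge id) and may need adjusting.

import pandas as pd

# Read the space-separated AVA.txt and keep image id plus the 10 vote counts.
ava = pd.read_csv("AVA.txt", sep=" ", header=None)
labels = ava.iloc[:, 1:12]          # 11 columns: image id, then counts for scores 1..10
labels.to_csv("labels.csv", index=False, header=False)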

I met this error when running python3 main.py

Hi, @kentsyx @George3d6

I met this error when running python3 main.py

/home/tezro/.local/lib/python3.7/site-packages/torchvision/transforms/transforms.py:187: UserWarning: The use of the transforms.Scale transform is deprecated, please use transforms.Resize instead.
warnings.warn("The use of the transforms.Scale transform is deprecated, " +
Trainable params: 14.97 million
/home/tezro/cocoapi/PythonAPI/Neural-IMage-Assessment/data_loader.py:31: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
annotations = self.annotations.iloc[idx, 1:].as_matrix()
/home/tezro/cocoapi/PythonAPI/Neural-IMage-Assessment/data_loader.py:31: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
annotations = self.annotations.iloc[idx, 1:].as_matrix()
/home/tezro/cocoapi/PythonAPI/Neural-IMage-Assessment/data_loader.py:31: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
annotations = self.annotations.iloc[idx, 1:].as_matrix()
/home/tezro/cocoapi/PythonAPI/Neural-IMage-Assessment/data_loader.py:31: FutureWarning: Method .as_matrix will be removed in a future version. Use .values instead.
annotations = self.annotations.iloc[idx, 1:].as_matrix()
Traceback (most recent call last):
File "main.py", line 241, in
main(config)
File "main.py", line 96, in main
for i, data in enumerate(train_loader):
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 819, in next
return self._process_data(data)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py", line 846, in _process_data
data.reraise()
File "/usr/local/lib/python3.7/dist-packages/torch/_utils.py", line 385, in reraise
raise self.exc_type(msg)
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/worker.py", line 178, in _worker_loop
data = fetcher.fetch(index)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/fetch.py", line 47, in fetch
return self.collate_fn(data)
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 74, in default_collate
return {key: default_collate([d[key] for d in batch]) for key in elem}
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 74, in
return {key: default_collate([d[key] for d in batch]) for key in elem}
File "/usr/local/lib/python3.7/dist-packages/torch/utils/data/_utils/collate.py", line 55, in default_collate
return torch.stack(batch, 0, out=out)
RuntimeError: invalid argument 0: Sizes of tensors must match except in dimension 0. Got 3 and 1 in dimension 1 at /pytorch/aten/src/TH/generic/THTensor.cpp:689

My System: Ubuntu 19.04, Pytorch 1.4, Torchvision 0.4.2, TitanXP.

Thanks in advance.
Best
from @bemoregt.

The torchvision pretrained VGG-16 requires normalization of inputs and you do not do this

As per the torchvision documentation:

The images have to be loaded in to a range of [0, 1] and then normalized using mean = [0.485, 0.456, 0.406] and std = [0.229, 0.224, 0.225]. You can use the following transform to normalize:

normalize = transforms.Normalize(mean=[0.485, 0.456, 0.406],
                                 std=[0.229, 0.224, 0.225])

Not doing this will cause VGG-16 to output the wrong feature maps and you will probably get worse results. If you add this transform you will have to retrain though.

It requires me to install so many packages

Hi
Thanks for your code, I really appreciate it, but when I run it from the macOS terminal it keeps asking for packages that are not installed. I keep installing them but am still unable to run the training, and now I am stuck on "No module named tensorboardX".

Are there basic, solid steps I should follow to get all the required packages so I can run the code?

Sincerely

Abdullah

fully connected layers

This is great work, thanks. Just wondering, is there any particular reason the model does not include fully connected layers like VGG-16 does before the softmax?

Doubt regarding computing standard deviation

In lines 195-196 of main.py you compute the standard deviation of the score. The for loop is over the variable j, while the variable used inside the loop is i. I just wanted to check whether this is correct. Thank you in advance.
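
For reference, a minimal sketch of computing the mean and standard deviation from the 10-way softmax output (not the repo's exact code):

import torch

def score_stats(probs):
    """Mean and standard deviation of the predicted score distribution.

    probs is a length-10 tensor of softmax outputs; bucket k corresponds to score k+1.
    """
    scores = torch.arange(1, 11, dtype=probs.dtype)
    mean = (scores * probs).sum()
    std = torch.sqrt(((scores - mean) ** 2 * probs).sum())
    return mean, std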

No such file or directory - Please Help

Hello
I keep getting this error when I run the code
python main.py --img_path /path/to/images/ --train --train_csv_file /path/to/train_labels.csv --val_csv_file /path/to/val_labels.csv --conv_base_lr 3e-4 --dense_lr 3e-3 --decay --ckpt_path /path/to/ckpts --epochs 100 --early_stopping_patience 10

I downloaded the CSV files and put them in the main folder of Neural-IMage-Assessment-Master, but I get the error shown in the screenshot below (screenshot omitted).

@kentsyx @George3d6 @Bubbleinpit

Problematic Implementation of EMD Loss

The loss should be the L2 distance between the CDFs of the two distributions, not between their PDFs.

There are also some typos in the naming, such as emb vs. emd.
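
A minimal sketch of the EMD loss as described in the paper, i.e. the r-norm distance between CDFs (this is not the repo's implementation):

import torch

def emd_loss(p_true, p_pred, r=2):
    """EMD loss from the NIMA paper: L_r distance between cumulative distributions.

    Both inputs are (batch, 10) probability distributions over the score buckets.
    """
    cdf_true = torch.cumsum(p_true, dim=1)
    cdf_pred = torch.cumsum(p_pred, dim=1)
    diff = torch.abs(cdf_true - cdf_pred) ** r
    return (diff.mean(dim=1) ** (1.0 / r)).mean()   # average over buckets, then over the batch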

Metrics from original NIMA paper

Do you have an implementation of LCC/SRCC for evaluation? I have implemented them myself, but I'm not sure I have done it correctly.

Issue with the test.py

At line [66](https://github.com/kentsyx/Neural-IMage-Assessment/blob/f0028cd27de5cdb20a21c2b896999b3505bcb4f6/test.py#L66), it should be l+1 rather than l, because AVA votes start from 1 not 0.

Unable to import lrs

I have tried to run the code but am unable to debug the error "No module named 'lrs'". Can you help with it?

test_labels.csv

Hello, when I want to test my own images, how is test_labels.csv generated? What does test_labels.csv contain? Looking forward to your answer!

About AVA dataset

May I ask how you obtained your dataset? I also downloaded it from the link you gave, but there are many links on that web page, so I used the torrent file to download it. However, the dataset I downloaded produced some error messages when I ran the code with the label file you provided; there seem to be some problems with the images. Which link did you use to download it, or what did you do with the images?
Thank you very much

Some mistake in main.py

When I read your code, I found some issues.

First, in train and val, I think it is better to call model.train() and model.eval(). It may not make a difference for a VGG network, but it is necessary when the network has BatchNorm or Dropout layers.

Second, also in main.py, in test I think you should add with torch.no_grad():. Without it, gradient bookkeeping accumulates, so at test time, even on an 8 GB GPU with batch size 1, it cannot run because it goes out of CUDA memory.
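
An illustrative evaluation loop applying both suggestions (model and val_loader are assumed to already exist):

import torch

model.eval()                      # puts BatchNorm/Dropout layers into eval mode
with torch.no_grad():             # no autograd graph is built during evaluation
    for images, labels in val_loader:
        outputs = model(images)
        ...                       # compute EMD / other metrics here
model.train()                     # switch back to training mode afterwards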

Learning Rate Setting

Hi, I am experimenting with the NIMA implementation for a scientific project! In your README you say that "The learning rate setting differs from the original paper. I can't seem to get the model to converge with momentum SGD using an lr of 3e-7 for the conv base and 3e-6 for the dense block." Which settings did you use? The defaults in the argparser are set to 3e-7 and 3e-6, which is why I was wondering!

Thank you :)

CSV file

The CSV file in the link cannot be opened. The link seems to be invalid.

Pre-trained model giving vague results

I am trying to run this on single images and am not getting any mean value below 5.0; good-quality images also sometimes return low values.

I am sharing the main.py file, please check if anything is wrong with the code.

import argparse
import os

import numpy as np
import matplotlib
import matplotlib.pyplot as plt

import torch
from torch import no_grad
import torch.autograd as autograd
import torch.optim as optim

import torchvision.transforms as transforms
import torchvision.datasets as dsets
import torchvision.models as models

import torch.nn.functional as F

from model import *

import cv2
file_name = 'bad'
filename = '/home/shayan/Projects/NIMA/images/'+file_name+'.jpg'

image = cv2.imread(filename)
image = cv2.resize(image,(224,224))

img_arr = image.transpose(2, 0, 1) # C x H x W
img_arr = np.expand_dims(img_arr,axis = 0)
print(img_arr.shape)

img_tensor = torch.from_numpy(img_arr)
img_tensor = img_tensor.type('torch.FloatTensor')
print(img_tensor.shape,img_tensor.size)

cuda = torch.cuda.is_available()
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
if cuda:
    print("Device: GPU")
else:
    print("Device: CPU")
    
base_model = models.vgg16(pretrained=True)
model = NIMA(base_model)

model.load_state_dict(torch.load("/home/shayan/Projects/NIMA/epoch-12.pkl", map_location=lambda storage, loc: storage))
print("Successfully loaded model")

with torch.no_grad():

    model.eval()

output = model(img_tensor)
output = output.view(10, 1)

predicted_mean, predicted_std = 0.0, 0.0
for i, elem in enumerate(output, 1):
    predicted_mean += i * elem
for j, elem in enumerate(output, 1):
    predicted_std += elem * (j - predicted_mean) ** 2
print("________________")
print(u"({}) \u00B1{}".format(round(float(predicted_mean),2), round(float(predicted_std), 2)))  

How to test my own photos?

How do I test my own pictures, e.g. from the COCO dataset, with your pretrained model? There is no CSV file for them, so I do not know how to run the test. Thanks

Runtime Error

(py3.6) C:\Users\baydogan\denemeee>python untitled1.py
Traceback (most recent call last):
File "", line 1, in
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\spawn.py", line 114, in _main
prepare(preparation_data)
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\spawn.py", line 225, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
run_name="mp_main")
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\runpy.py", line 85, in run_code
exec(code, run_globals)
File "C:\Users\baydogan\denemeee\untitled1.py", line 19, in
import lera
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\site-packages\lera_init
.py", line 53, in
chunk_list = mp.Manager().list()
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\context.py", line 56, in Manager
m.start()
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\managers.py", line 513, in start
self._process.start()
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\popen_spawn_win32.py", line 33, in init
prep_data = spawn.get_preparation_data(process_obj._name)
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
_check_not_importing_main()
File "C:\Users\baydogan\AppData\Local\Continuum\anaconda3\envs\py3.6\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
is not going to be frozen to produce an executable.''')
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.

    This probably means that you are not using fork to start your
    child processes and you have forgotten to use the proper idiom
    in the main module:

        if __name__ == '__main__':
            freeze_support()
            ...

    The "freeze_support()" line can be omitted if the program
    is not going to be frozen to produce an executable.

Can you help me?
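
For illustration, a minimal sketch of the guard the traceback asks for; the body of main is a placeholder:

# On Windows the multiprocessing start method is spawn, so code that creates
# worker processes (e.g. a DataLoader with num_workers > 0) must run under
# the __main__ guard.
import multiprocessing

def main():
    ...  # build the dataset, DataLoader and training loop here

if __name__ == '__main__':
    multiprocessing.freeze_support()  # only needed when freezing into an executable
    main()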
