yangze0930 / nts-net


License: MIT License

fine-grained-visual-categorization fine-grained-classification fine-grained-recognition

nts-net's Introduction

NTS-Net

This is a PyTorch implementation of the ECCV2018 paper "Learning to Navigate for Fine-grained Classification" (Ze Yang, Tiange Luo, Dong Wang, Zhiqiang Hu, Jun Gao, Liwei Wang).

Requirements

  • python 3+
  • pytorch 0.4+
  • numpy
  • datetime

Datasets

Download the CUB-200-2011 dataset and put it in the root directory, named CUB_200_2011. You can also try other fine-grained datasets.

Train the model

If you want to train NTS-Net, just run python train.py. You may need to change the configurations in config.py. The parameter PROPOSAL_NUM is M in the original paper and CAT_NUM is K. During training, the log file and checkpoint files are saved in the save_dir directory. Set the resume parameter to choose a checkpoint model to resume from.
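For orientation, the knobs mentioned above might look like the following in config.py. This is a hypothetical sketch: the names come from this README and the issue reports further down, and only the values reported there (batch size 16, input size 448, LR 0.001, WD 1e-4, PROPOSAL_NUM 6) are grounded; CAT_NUM and the paths are illustrative assumptions.

```python
# Hypothetical config.py sketch; names per the README, values illustrative.
BATCH_SIZE = 16
PROPOSAL_NUM = 6          # "M" in the paper: proposals kept by the navigator
CAT_NUM = 4               # "K" in the paper: part features used for classification (assumed value)
INPUT_SIZE = (448, 448)   # images are resized to this before the backbone
LR = 0.001
WD = 1e-4
save_dir = './checkpoints'  # log file and checkpoints are written here (assumed path)
resume = ''                 # checkpoint to resume from; empty string starts fresh
test_model = ''             # checkpoint used by test.py
```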

Test the model

If you want to test NTS-Net, just run python test.py. You need to set test_model in config.py to choose the checkpoint model for testing.

Model

We also provide a checkpoint model trained by ourselves; you can download it from here. Testing with our provided model gives 87.6% test accuracy.

Reference

If you are interested in our work and want to cite it, please acknowledge the following paper:

@inproceedings{Yang2018Learning,
  author    = {Yang, Ze and Luo, Tiange and Wang, Dong and Hu, Zhiqiang and Gao, Jun and Wang, Liwei},
  title     = {Learning to Navigate for Fine-grained Classification},
  booktitle = {ECCV},
  year      = {2018}
}

nts-net's People

Contributors

tiangeluo, yangze0930


nts-net's Issues

Returning the image crops?

This isn't an issue so much as a question. In the paper, you show examples of the regions the model proposes. Where can these coordinates be found in forward()?

Thanks!

Size mismatch

size mismatch for concat_net.weight: copying a param with shape torch.Size([200, 10240]) from checkpoint, the shape in current model is torch.Size([200, 6144])

Is it possible to use ResNext?

I replaced the backbone with ResNeXt, but test accuracy only reached 87.5.
(500 epochs, batch size 16, lr 0.001, all other config at the defaults.)

NTS-Net for small image size

I tried NTS-Net with 320x320 microscopic images and it worked fine. However, now I have to classify images ranging from about 65 to 192 pixels. I tried padding the smaller images to 192 and training NTS-Net, but accuracy dropped to 40%.
Would it be possible to use your model for small image sizes, e.g. by tweaking the patch sizes and using a shallower ResNet?

Would appreciate your suggestion.

dropout in resnet.py

Thanks for your great work. I found an error in resnet.py at line 148:

x = nn.Dropout(p=0.5)(x)

Calling net.eval() has no effect on it. I think you need to define a self.drop = nn.Dropout(p=0.5) in the __init__ function, or change the line to F.dropout(x, p=0.5, training=self.training).
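The fix proposed in this issue can be sketched without PyTorch. The classes below are a minimal stand-in, not the real nn.Module: the point is that dropout controlled by the module's training flag (as in the suggested F.dropout(x, p=0.5, training=self.training)) is switched off by eval(), whereas an nn.Dropout constructed inside forward() is never registered on the module and therefore never sees eval mode.

```python
import random

# Minimal stand-in (not PyTorch) for the relevant nn.Module behavior.
class Module:
    def __init__(self):
        self.training = True
    def eval(self):
        self.training = False
        return self

def dropout(xs, p, training):
    # Functional dropout: only drops (and rescales) while training.
    if not training:
        return list(xs)
    return [0.0 if random.random() < p else x / (1 - p) for x in xs]

class Net(Module):
    def forward(self, xs):
        # Correct pattern: consult the module's own training flag, mirroring
        # F.dropout(x, p=0.5, training=self.training) from the issue's fix.
        return dropout(xs, p=0.5, training=self.training)

net = Net().eval()
print(net.forward([1.0, 2.0, 3.0]))  # eval mode: dropout is a no-op -> [1.0, 2.0, 3.0]
```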

The way to compute rank loss

Hi, I have a question about how the rank loss is computed.
The original code is:

part_loss = model.list_loss(part_logits.view(batch_size * PROPOSAL_NUM, -1), 
    label.unsqueeze(1).repeat(1,PROPOSAL_NUM).view(-1)).view(batch_size, PROPOSAL_NUM)
rank_loss = model.ranking_loss(top_n_prob, part_loss) 

def ranking_loss(score, targets, proposal_num=PROPOSAL_NUM):
    loss = Variable(torch.zeros(1).cuda())
    batch_size = score.size(0)
    for i in range(proposal_num):
        targets_p = (targets > targets[:, i].unsqueeze(1)).type(torch.cuda.FloatTensor)
        pivot = score[:, i].unsqueeze(1) 
        loss_p = (1 - pivot + score) * targets_p
        loss_p = torch.sum(F.relu(loss_p))
        loss += loss_p
    return loss / batch_size

def list_loss(logits, targets):
    temp = F.log_softmax(logits, -1)
    loss = [-temp[i][targets[i].item()] for i in range(logits.size(0))]
    return torch.stack(loss)

See this line:
targets_p = (targets > targets[:, i].unsqueeze(1)).type(torch.cuda.FloatTensor).
I read targets_p as a mask marking the proposals whose target value is larger than the pivot's, so targets_p depends only on the relative order of the values. The way part_loss is computed just reverses that relative order and fetches a specific index; I think fetching the specific index would be enough. So why compute the actual value of part_loss?
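For what it's worth, the pairwise hinge above can be replayed in plain Python on made-up numbers. Here targets plays the role of the per-proposal classification losses from list_loss; the comparison targets[j] > targets[i] needs those loss values to decide which proposal should outrank which, which is why part_loss is computed at all.

```python
# Toy, single-sample version of ranking_loss; values are illustrative.
def relu(v):
    return max(v, 0.0)

def ranking_loss(score, targets):
    loss = 0.0
    n = len(score)
    for i in range(n):                     # each proposal acts as the pivot in turn
        pivot = score[i]
        for j in range(n):
            if targets[j] > targets[i]:    # proposal j is *worse* (higher loss) than i
                # hinge: the worse proposal should score at least a margin of 1 below the pivot
                loss += relu(1 - pivot + score[j])
    return loss

score   = [0.9, 0.2]   # navigator thinks proposal 0 is more informative
targets = [0.1, 0.8]   # ...and proposal 0 indeed has the lower classification loss
print(ranking_loss(score, targets))  # ~0.3: the one violated pair contributes 1 - 0.9 + 0.2
```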

Proposal alignment problem

One step in the code concatenates global_feat with the sorted part_feat and then classifies into 200 classes. But part_feat is sorted by score for each image, not by body part, so the features are not aligned across images: image a might be head, leg, chest, body while image b is chest, head, leg, body. I tried replacing the concat with an add; training is a bit more stable and performance improves slightly!

Testing problem!!!

Hello:
I found that after training, your network only suits datasets with a matching feature distribution, i.e. it only works on this dataset and cannot be transferred. For the same image, the network predicts a different feature vector every time, so the prediction result differs from run to run.

Unable to achieve the performance mentioned in the paper

I trained the model with the code unchanged, but unfortunately I cannot reach the performance reported in the paper. My parameters are "BATCH_SIZE = 16, PROPOSAL_NUM = 6, INPUT_SIZE = (448, 448), LR = 0.001, WD = 1e-4"; the only schedule is the learning rate being multiplied by 0.1 after 60 and 100 epochs (as set in your code). After training for 143 epochs, the test accuracy is only 0.415. Is there a problem somewhere?

small batch_size, better result !

Hello,
When I use 2 GPUs with batch size 32, the highest accuracy I can get is 83.1 after 500 epochs, but when I change the batch size from 32 to 16 (other settings the same), the highest accuracy is 87+ after only 200 epochs. Isn't that weird?

need 4 optimizer ?

There are four identical optimizers; why?

raw_parameters = list(net.pretrained_model.parameters())
part_parameters = list(net.proposal_net.parameters())
concat_parameters = list(net.concat_net.parameters())
partcls_parameters = list(net.partcls_net.parameters())

raw_optimizer = torch.optim.SGD(raw_parameters, lr=LR, momentum=0.9, weight_decay=WD)
concat_optimizer = torch.optim.SGD(concat_parameters, lr=LR, momentum=0.9, weight_decay=WD)
part_optimizer = torch.optim.SGD(part_parameters, lr=LR, momentum=0.9, weight_decay=WD)
partcls_optimizer = torch.optim.SGD(partcls_parameters, lr=LR, momentum=0.9, weight_decay=WD)
schedulers = [MultiStepLR(raw_optimizer, milestones=[60, 100], gamma=0.1),
              MultiStepLR(concat_optimizer, milestones=[60, 100], gamma=0.1),
              MultiStepLR(part_optimizer, milestones=[60, 100], gamma=0.1),
              MultiStepLR(partcls_optimizer, milestones=[60, 100], gamma=0.1)]

I think just one optimizer would do?

Or do some parts of the network need different updates?

About the figure of training the model

It is nice work, simple and effective.
I would like to see the training curves for the different datasets, especially the decline curves of the different losses.
Would it be convenient to share them?

_, term_width = os.popen('stty size', 'r').read().split()


ValueError Traceback (most recent call last)
in ()
5 import logging
6
----> 7 _, term_width = os.popen('stty size', 'r').read().split()
8 term_width = int(term_width)
9

ValueError: not enough values to unpack (expected 2, got 0)
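A possible workaround, not the author's fix: stty size prints nothing when the process has no controlling terminal (notebooks, nohup or redirected jobs), so the split yields zero values and the unpack fails. The standard library's shutil.get_terminal_size takes an explicit fallback and does not fail in that situation.

```python
import shutil

# Replacement sketch for the progress-bar width lookup (assumed to live in
# core/utils.py): falls back to 80 columns when no terminal is attached.
term_width = shutil.get_terminal_size(fallback=(80, 24)).columns
term_width = int(term_width)  # already an int; kept to mirror the original code
print(term_width)
```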

feature extraction

Can I use concat_logits (Line 60 in test.py) as a feature extractor for the trained model?

How do I change the anchor sizes?

I want to generate anchors only at size 48*48, so I commented out the second and third dicts in core/anchors.py:

_default_anchors_setting = (
dict(layer='p3', stride=32, size=48, scale=[2 ** (1. / 3.), 2 ** (2. / 3.)], aspect_ratio=[0.667, 1, 1.5]),
#dict(layer='p4', stride=64, size=96, scale=[2 ** (1. / 3.), 2 ** (2. / 3.)], aspect_ratio=[0.667, 1, 1.5]),
#dict(layer='p5', stride=128, size=192, scale=[1, 2 ** (1. / 3.), 2 ** (2. / 3.)], aspect_ratio=[0.667, 1, 1.5]),
)

This fails with:
File "/userhome/task/baa/baa-nts-roi2-48/NTS-Net/core/model.py", line 55, in
for x in rpn_score.data.cpu().numpy()]
ValueError: all the input array dimensions except for the concatenation axis must match exactly

What is causing this, and how can I change the anchor sizes myself?
Thanks!

Time cost on Training

Hey, your work is excellent! Recently I got a task with 2000+ classes and 5.6 million images of cars. I tried training B-CNN on my dataset, but it took too long to converge, and I did not find an answer in your paper. Could you tell me roughly how long you trained NTS-Net on the CUB dataset with your 4-GPU machine? Thank you.

Consultation

Hello:
I am a student. I had the good fortune to read your NTS-Net paper, and reading the released source code alongside it made the ideas in the paper quick to grasp.
Could you share your trained car model with me, for academic research only?
If so, many thanks.

ImportError: cannot import name 'Car2000'

Hello, when I run python train.py, I run into the following problem.

Traceback (most recent call last):
  File "train.py", line 11, in <module>
    from core.utils import init_log, progress_bar
  File "/home/aibc/Wen/classification/NTS-Net/core/utils.py", line 8, in <module>
    from core.dataset import CUB, Car2000
ImportError: cannot import name 'Car2000'

Thanks to the author for the generous sharing. I want to build my own dataset to try this network; how should I annotate the images?

I am a beginner. I tested this network on the CUB-200 dataset and the results match the paper; thanks to the author for the contribution and for sharing. Now I would like to annotate a dataset of my own to test it. How should I go about it? Does anyone have suggestions? In particular, I am not clear on how to annotate my own data so that it fits this network. Any pointers would be appreciated. Thanks.

Set the rationality of the partcls_loss

First of all, thank you very much for your work. Second, I am a little confused about partcls_loss. At the beginning of training, the regions cropped from the original image may be background, yet they are given the original image's label. Could that push the network's convergence in the wrong direction? Looking forward to your reply. Thank you!

error in resnet

I got this error when I used resnet18 or resnet34 (which use BasicBlock rather than the Bottleneck used by resnet50) as the pretrained backbone network:
"RuntimeError: mat1 dim 1 must match mat2 dim 0"

I can't work out where this error comes from.

Note: the code runs perfectly with resnet50, 101 and 152.

List Loss error, IndexError: index 4 is out of bounds for dimension 0 with size 4

When I run the model, I get IndexError: index 4 is out of bounds for dimension 0 with size 4. The stack trace points to:

     15     raw_logits, concat_logits, part_logits, _, top_n_prob = net(img)
     16     part_loss = list_loss(part_logits.view(batch_size * PROPOSAL_NUM, -1),
---> 17                                label.unsqueeze(1).repeat(1, PROPOSAL_NUM).view(-1)).view(batch_size, PROPOSAL_NUM)
     18     raw_loss = creterion(raw_logits, label)
     19     concat_loss = creterion(concat_logits, label)

.........................................................................................................................
     84 def list_loss(logits, targets):
     85     temp = F.log_softmax(logits, -1)
---> 86     loss = [-temp[i][targets[i].item()] for i in range(logits.size(0))]
     87     return torch.stack(loss)
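One common cause of this IndexError (an assumption, not confirmed by the thread): a label value equal to or larger than the classifier's output width, e.g. 1-based CUB labels 1..200 indexing a 200-way log_softmax whose valid indices are 0..199. A hypothetical sanity check before training, with illustrative label values:

```python
# Hedged sketch: verify every label fits the classifier's output range before
# the list-comprehension in list_loss indexes with it.
def check_labels(labels, num_classes):
    bad = [t for t in labels if not 0 <= t < num_classes]
    if bad:
        raise ValueError(f"labels out of range for {num_classes} classes: {bad}")

check_labels([0, 3, 2, 1], num_classes=4)      # fine
try:
    check_labels([1, 2, 3, 4], num_classes=4)  # 4 is out of range: labels need shifting to 0-based
except ValueError as e:
    print(e)
```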

There is a small problem in your code!!!

I looked through your data-loading code and found that images are read in RGB channel order, while your Normalize uses the ImageNet std and mean values in BGR order. The effect is small, but it will cost some performance. The ImageNet std and mean values I looked up are listed in BGR order.
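For context: the widely used ImageNet statistics, mean [0.485, 0.456, 0.406] and std [0.229, 0.224, 0.225], are published in RGB order (this is the order torchvision uses), so they need reversing only when the loader decodes images as BGR, as cv2.imread does. A minimal sketch of keeping the two orders consistent:

```python
# Standard ImageNet normalization statistics, RGB channel order.
MEAN_RGB = [0.485, 0.456, 0.406]
STD_RGB  = [0.229, 0.224, 0.225]

# If the image tensor is BGR (e.g. decoded with OpenCV), reverse the lists
# instead of mixing an RGB image with BGR statistics (or vice versa).
MEAN_BGR = MEAN_RGB[::-1]
STD_BGR  = STD_RGB[::-1]

print(MEAN_BGR)  # [0.406, 0.456, 0.485]
```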

Problem when reading images

Training on my own dataset, why do I keep getting a "list index out of range" error while images are being read? The txt label file contains no extra blank lines or spaces.

Anchor boxes visualization

@yangze0930 Hi, I take self.top_n_cdds from model.py/attention_net, pad the image to 896 x 896, and draw the anchor boxes on the padded image. Have I done something wrong in this procedure? I would like to know how you visualize the anchor boxes on the image. I would sincerely appreciate a reply as soon as possible, thanks a lot.

This code version is sooo ~ bad! And why the memory error???

It was very difficult to match the code to the right versions of its packages...

anyway

THCudaCheck FAIL file=/opt/conda/conda-bld/pytorch_1535491974311/work/aten/src/THC/THCGeneral.cpp line=663 error=11 : invalid argument
Traceback (most recent call last):
  File "/home/kaist/PycharmProjects/NTS-Net/train.py", line 66, in <module>
    raw_logits, concat_logits, part_logits, _, top_n_prob = net(img)
  File "/home/kaist/anaconda3/envs/TASN-1/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kaist/anaconda3/envs/TASN-1/lib/python3.6/site-packages/torch/nn/parallel/data_parallel.py", line 121, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/home/kaist/anaconda3/envs/TASN-1/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kaist/PycharmProjects/NTS-Net/core/model.py", line 68, in forward
    _, _, part_features = self.pretrained_model(part_imgs.detach())
  File "/home/kaist/anaconda3/envs/TASN-1/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kaist/PycharmProjects/NTS-Net/core/resnet.py", line 142, in forward
    x = self.layer2(x)
  File "/home/kaist/anaconda3/envs/TASN-1/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kaist/anaconda3/envs/TASN-1/lib/python3.6/site-packages/torch/nn/modules/container.py", line 91, in forward
    input = module(input)
  File "/home/kaist/anaconda3/envs/TASN-1/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kaist/PycharmProjects/NTS-Net/core/resnet.py", line 74, in forward
    out = self.conv1(x)
  File "/home/kaist/anaconda3/envs/TASN-1/lib/python3.6/site-packages/torch/nn/modules/module.py", line 477, in __call__
    result = self.forward(*input, **kwargs)
  File "/home/kaist/anaconda3/envs/TASN-1/lib/python3.6/site-packages/torch/nn/modules/conv.py", line 301, in forward
    self.padding, self.dilation, self.groups)
RuntimeError: CUDA error: out of memory

Process finished with exit code 1
