
thuml / hashnet

239 stars · 84 forks · 26.71 MB

Code release for "HashNet: Deep Learning to Hash by Continuation" (ICCV 2017)

License: MIT License

CMake 1.19% Makefile 0.28% C++ 32.84% Cuda 2.36% MATLAB 0.36% Python 4.59% Shell 0.28% HTML 0.08% CSS 0.10% Jupyter Notebook 57.89% Dockerfile 0.03%
deep-learning hashing learning-to-search

hashnet's People

Contributors

caozhangjie

hashnet's Issues

The scaled tanh is useless

In my experiments, I found that even without the scaled tanh, performance does not deteriorate. Can you share some results on the effect of the scaled tanh?
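
For context, a minimal sketch of the continuation idea as I understand it from the paper (the exact schedule below is hypothetical, not the repo's):

import torch

def scaled_tanh(z, beta):
    # tanh(beta * z) approaches sign(z) as beta grows, so late-stage
    # activations behave like binary codes while staying differentiable
    return torch.tanh(beta * z)

# hypothetical continuation schedule: double beta at each training stage
betas = [2.0 ** stage for stage in range(5)]   # 1, 2, 4, 8, 16
z = torch.randn(4, 48)
codes = [scaled_tanh(z, b) for b in betas]     # increasingly saturated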

False sampling of data

Hi,

I just found that all images in the query list are also in the database list, which is not allowed for fair validation.

Thanks
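
A quick way to verify the overlap (my sketch; assumes the standard list-file format where each line is "<image_path> <label bits...>", and hypothetical file names test.txt / database.txt):

with open('test.txt') as f:
    queries = {line.split()[0] for line in f if line.strip()}
with open('database.txt') as f:
    database = {line.split()[0] for line in f if line.strip()}
print(len(queries & database), 'query images also appear in the database')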

test.py for CIFAR

I have tried to edit the code for the CIFAR dataset, but the result I got seems highly unreasonable (the mAP value was greater than 7).

The original code you provided was based on the NUS-WIDE dataset, which is a multi-labelled benchmark.
I believe I made some mistakes while editing it, but I really can't fix the bug. I have been stuck on this for days...

Could you show me the code for "def mean_average_precision(...):"? Or could you point out which part of the following code is wrong?

def mean_average_precision(params, R):
    database_code = params['database_code']
    validation_code = params['test_code']
    database_labels = params['database_labels']
    validation_labels = params['test_labels']
    query_num = validation_code.shape[0]

    sim = np.dot(database_code, validation_code.T)
    ids = np.argsort(-sim, axis=0)
    APx = []

    for i in range(query_num):
        label = validation_labels[i, :]             # I changed this line
        if label == 0: label = -1                   # I changed this line
        idx = ids[:, i]
        imatch = np.sum(database_labels[idx[0:R], :] == label, axis=1) > 0   # I changed this line
        relevant_num = np.sum(imatch)
        Lx = np.cumsum(imatch)
        Px = Lx.astype(float) / np.arange(1, R+1, 1)
        if relevant_num != 0:
            APx.append(np.sum(Px * imatch) / relevant_num)

    return np.mean(np.array(APx))
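
For what it's worth, one suspicious spot is the line `if label == 0: label = -1`: applied to a whole label row it is not an element-wise replacement (on a multi-element numpy array it even raises a ValueError). A minimal sketch of the loop body using the vectorized replacement from the original NUS-WIDE code, which should also work for one-hot CIFAR labels (my guess at a fix, not a confirmed one):

    for i in range(query_num):
        label = validation_labels[i, :].copy()
        label[label == 0] = -1    # element-wise: zeros become -1, ones stay 1
        idx = ids[:, i]
        imatch = np.sum(database_labels[idx[0:R], :] == label, axis=1) > 0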

Problem with MAP@k

Here:

https://github.com/thuml/HashNet/blob/master/pytorch/src/test.py#L51

You compute:

for i in range(query_num):
    label = validation_labels[i, :]
    label[label == 0] = -1
    idx = ids[:, i]
    imatch = np.sum(database_labels[idx[0:R], :] == label, axis=1) > 0
    relevant_num = np.sum(imatch)
    Lx = np.cumsum(imatch)
    Px = Lx.astype(float) / np.arange(1, R+1, 1)
    if relevant_num != 0:
        APx.append(np.sum(Px * imatch) / relevant_num)

Here relevant_num is the number of relevant entries within the returned list of length R, not the total number of relevant entries; relevant_num will always be less than or equal to R.

Here https://github.com/benhamner/Metrics/blob/master/Python/ml_metrics/average_precision.py#L39 it is computed differently: the divisor is min(total_number_of_relevant, k).

Also see discussion here: https://stackoverflow.com/questions/40906671/confusion-about-mean-average-precision

It is not possible to cheat AP by tweaking the size of the returned ranked list. AP is the area below the precision-recall curve, which plots precision as a function of recall, where recall is the number of returned positives relative to the total number of positives that exist in the ground truth, not relative to the number of positives in the returned list. So if you crop the list, all you are doing is cropping the precision-recall curve and omitting its tail.

And:

Your confusion might be related to the way some popular functions, such as VLFeat's vl_pr, compute precision-recall curves: they assume you have provided the entire ranked list and therefore compute the total number of positives in the ground truth by just looking at the ranked list instead of the ground truth itself. So if you used vl_pr naively on cropped lists you could indeed cheat it, but that would be an invalid computation.

Also here is an explanation of MAP@k: https://www.kaggle.com/c/FacebookRecruiting/discussion/2002

The number you divide by is the number of points possible. This is the lesser of ten (the most you can predict) and the number of actual correct answers that exist.

Am I missing something, or is your code incorrect? It is true that many other hashing papers compute mAP the same way.
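
For concreteness, a minimal sketch of AP@k with the min(m, k) divisor used in the linked ml_metrics implementation (imatch and total_relevant are assumed inputs, not names from this repo):

import numpy as np

def average_precision_at_k(imatch, total_relevant, k):
    # imatch: boolean relevance of the top-k retrieved items, in rank order
    imatch = np.asarray(imatch[:k], dtype=float)
    if total_relevant == 0:
        return 0.0
    precision = np.cumsum(imatch) / np.arange(1, len(imatch) + 1)
    return np.sum(precision * imatch) / min(total_relevant, k)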

The best results using ResNet152

Hi, according to the paper, all reported results were obtained with AlexNet.
So what are the results on the three datasets if ResNet152 is used as the feature extraction model?
I ran your PyTorch code using ResNet152 as the base CNN model, but got worse results than those in your paper.
Thank you very much if you can answer this question!

License

What is the license for the paper and this code, please?

How did you generate the "train.txt" of MS-COCO?

Hi,
I want to use more images to train the network on MS-COCO; however, I find it somewhat difficult to generate the "train.txt" for MS-COCO. Would you please provide the source code used to generate "train.txt"? Or could you provide the source code that generates the multi-label annotations, or a list of labels for all the images? (A possible generation script is sketched below.)
Thank you!
Yours sincerely.
Wu
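
Not the authors' script, but here is a sketch of how such a file could be generated with pycocotools (the annotation path and the output path prefix are assumptions based on the list files in this repo):

from pycocotools.coco import COCO

coco = COCO('annotations/instances_train2014.json')  # hypothetical path
cat_ids = sorted(coco.getCatIds())                   # 80 COCO categories
cat_index = {c: i for i, c in enumerate(cat_ids)}

with open('train.txt', 'w') as out:
    for img_id in coco.getImgIds():
        anns = coco.loadAnns(coco.getAnnIds(imgIds=img_id))
        bits = ['0'] * len(cat_ids)
        for ann in anns:
            bits[cat_index[ann['category_id']]] = '1'   # multi-hot label
        name = coco.loadImgs(img_id)[0]['file_name']
        out.write('./data/coco/images/%s %s\n' % (name, ' '.join(bits)))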

Why take two batches as input

Hello, I have some problems reading and understanding the code.

  • First, why do two batches need to be taken from the data set?
  • Second, are the two batches drawn from the same data set different or the same? (See the sketch below.)

Hoping for your answer, thanks.

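For reference, my reading of the training loop: the two batches are drawn independently from the same train.txt list, and every item of batch 1 is paired with every item of batch 2, giving B x B pairwise labels per step. A sketch (loader1 and loader2 are assumed DataLoaders built over the same list):

import torch

iter1, iter2 = iter(loader1), iter(loader2)
inputs1, labels1 = next(iter1)
inputs2, labels2 = next(iter2)
# B x B similarity matrix: 1 where a pair shares at least one label
similarity = (torch.mm(labels1.float(), labels2.float().t()) > 0).float()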

test stage question

Hi, I am here again. I want to ask whether the mAP reported in your original paper is tested 10 times (four corners, center crop, and their mirrors) or just once (center crop only).
Thanks

question on nuswide81 dataset

Thank you very much for sharing the dataset, but I found a problem: many of the images in the train.txt, test.txt, and database.txt you provided are not labeled; they do not belong to any of the 81 categories. What is going on? Are these images useless?

pairwise loss in pytorch code

Hi, I have difficulty understanding the pairwise loss in your PyTorch code. In particular:

  1. I cannot relate it to Equation (4) in the paper. What is the meaning of the parameter "l_threshold" in your code?

  2. The returned loss in the code seems to be weighted with the 1/w_ij defined in the paper, i.e., Equation (2), since I find that the loss is finally divided by |S|. Can you give me some explanation of this point?

Why does the loss function have a dot_loss?

I have read the paper; the loss is estimated by weighted maximum likelihood.
I can't understand the additional dot_loss in the implementation.
In the paper:
[screenshot of the paper's pairwise loss equation]
Corresponding to it is exp_loss,
but in the code, the total loss = exp_loss + dot_loss.
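
One plausible reading (my own derivation, not from the repo): the paper's negative log-likelihood for a pair already contains both terms. With x = alpha * <h_i, h_j> and s_ij in {0, 1}, the logistic likelihood gives

    -log P(s_ij | h_i, h_j) = log(1 + e^x) - s_ij * x

so exp_loss = log(1 + e^x) and dot_loss = -s_ij * x would be two halves of the same equation rather than an extra regularizer.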

nus_wide dataset problem

Hi, I followed the instructions in the markdown file, but I ran into a problem with the nus_wide dataset.

I noticed that you changed the directory ./data/nus_wide to ./data/nuswide_81, so I changed the code in train.py:

elif config["dataset"] == "nus_wide":
    config["data"] = {"train_set1": {"list_path": "../data/nuswide_81/train.txt", "batch_size": 36}, \
                      "train_set2": {"list_path": "../data/nuswide_81/train.txt", "batch_size": 36}}

And when I run train.py with the following command:

python train.py --gpu_id 0 --dataset nus_wide --prefix resnet50_hashnet --hash_bit 48 --net ResNet50 --lr 0.0003 --class_num 1.0

I get a no-such-file error:

  File "train.py", line 279, in <module>
    train(config)
  File "train.py", line 199, in train
    inputs1, labels1 = iter1.next()
  File "/home/yuhang/anaconda3/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 286, in __next__
    return self._process_next_batch(batch)
  File "/home/yuhang/anaconda3/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 307, in _process_next_batch
    raise batch.exc_type(batch.exc_msg)
FileNotFoundError: Traceback (most recent call last):
  File "/home/yuhang/anaconda3/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 57, in _worker_loop
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/home/yuhang/anaconda3/lib/python3.5/site-packages/torch/utils/data/dataloader.py", line 57, in <listcomp>
    samples = collate_fn([dataset[i] for i in batch_indices])
  File "/tmp/liyuhang/Hashnet/src/data_list.py", line 89, in __getitem__
    img = self.loader(path)
  File "/tmp/liyuhang/Hashnet/src/data_list.py", line 45, in default_loader
    return pil_loader(path)
  File "/tmp/liyuhang/Hashnet/src/data_list.py", line 26, in pil_loader
    with open(path, 'rb') as f:
FileNotFoundError: [Errno 2] No such file or directory: './data/nuswide_81/images/93478_2084149655_e5049606da_m.jpg'

And when I run it again, it reports a different jpg missing from the images directory. However, I'm sure those jpgs are in the images directory. Could you please help me fix this? Thank you!

I am using PyTorch 0.4.0; could that be the reason?
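
One possible cause (my guess, not confirmed): the list files store paths relative to the repository root (./data/...), while train.py runs from src/, so the relative paths don't resolve. A quick check:

import os

with open('../data/nuswide_81/train.txt') as f:
    path = f.readline().split()[0]
# does the listed path resolve from the working directory or the repo root?
print(path,
      'cwd:', os.path.exists(path),
      'repo root:', os.path.exists(os.path.join('..', path.lstrip('./'))))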

Why is the mAP in your paper lower than others'?

Recently I reproduced your excellent work HashNet and found that the mAP on CIFAR-10 is near 0.8, which is higher than that reported in your follow-up work HashGAN.
But that is not all:
In your paper, you report that CNNH achieves 0.5696 on NUS-WIDE (16 bits), but in their paper it is 0.611 on NUS-WIDE (12 bits). In their original paper, the network is composed of three conv-pooling layers, one fully connected layer, and an output layer, which is even smaller than AlexNet.
Other hashing methods use the CNN-F model as the backbone, whose architecture is similar to AlexNet, and they achieve an mAP of about 0.8898 on CIFAR-10 (16 bits), but your paper's number is much lower.
So, can you tell me why your mAP is lower than others'? It is very important for my future work. Thanks!

Some question about my result

Hi, I downloaded your nuswide_81_48_bits.caffemodel and the nuswide.tar.gz dataset, and ran the following:

python models/predict/nuswide_81/predict_parallel.py --gpu 1 --model_path ./models/train/nuswide_81/caffemodel/nuswide_81_48_bits.caffemodel --save_path ./models/predict/nuswide_81/

However, the result is 54.96. Do you know why this happens?

Data text file.

How can we make our own data text file? You provided the data text file in the format below. Can you explain it, please? (A generator script is sketched after the example.)

./data/nuswide_81/images/98796_381708603_f86eb60ccf_m.jpg 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
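
Not an official answer, but for a folder-per-class layout a list file like that can be generated with a short script (the paths below are hypothetical):

import os

root = './data/mydataset'                  # one sub-folder per class
classes = sorted(os.listdir(root))
with open('train.txt', 'w') as out:
    for i, cls in enumerate(classes):
        bits = ['0'] * len(classes)
        bits[i] = '1'                      # one-hot bit for this class
        for name in sorted(os.listdir(os.path.join(root, cls))):
            out.write('%s %s\n' % (os.path.join(root, cls, name), ' '.join(bits)))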

Zero-Shot Retrieval Protocol

Hi, congrats on the nice work.

I'm wondering if it's possible for you to release the details of Zero-Shot Retrieval Protocol on ImageNet100.

In Table 2 of the paper, the protocol refers to [28]:
C. Ma, I. W. Tsang, F. Peng, and C. Liu. Partial hash update via hamming subspace learning. TIP 2017.
But unfortunately this paper is behind a paywall.

Also, this is pure guessing, but I'm wondering if the correct reference should be:
How should we evaluate supervised hashing?
Alexandre Sablayrolles, Matthijs Douze, Nicolas Usunier, Hervé Jégou
ICASSP 2017

Can't find the image file.

Hello authors!
I recently downloaded the NUS-WIDE dataset you provided (thanks for sharing) and found a problem.
Your readme says:
For NUS-WIDE, you need to move the nus_wide.tar.gz to ./data/nuswide_81 and extract the file there.
But the extracted images don't have the same names as those in train.txt.
For example, the first entry in train.txt is:
./data/nuswide_81/images/98796_381708603_f86eb60ccf_m.jpg 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
But I can't find 98796_381708603_f86eb60ccf_m.jpg among the extracted images,
so I am puzzled.
Thank you for answering my question.

Plotting hashcodes

Hi, I have clustered the snapshots; can you please help me with how to plot them?
Thanks
Nilesh
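
Not from the repo, but one common way to plot saved hash codes is a 2-D t-SNE projection colored by class (the file names below are hypothetical):

import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

codes = np.sign(np.load('database_code.npy'))          # binarize the outputs
labels = np.load('database_labels.npy').argmax(axis=1) # class index per image
xy = TSNE(n_components=2).fit_transform(codes.astype(float))
plt.scatter(xy[:, 0], xy[:, 1], c=labels, s=3, cmap='tab10')
plt.savefig('hashcodes_tsne.png', dpi=150)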

AlexNet backbone result (COCO)

I ran PyTorch HashNet with the parameters below; I only changed the backbone network to AlexNet in order to compare this code's results with the original paper.

dataset: coco
net: AlexNet
lr: 0.0003
class_num: 1.0
hash_bit: 48

But I get 65.1 mAP, which is lower than the original paper.
Has anybody obtained the reported performance?
Could you tell me the right parameter settings?

the mAP on cifar10 is only 0.30

For CIFAR-10, I randomly selected 100 images per class as the test query set and 500 images per class as the training set. The remaining images serve as the database set.

I trained HashNet with the following command:
python -u train.py --gpu_id 0 --dataset cifar --prefix resnet50_hashnet --hash_bit 48 --net ResNet50 --lr 0.0003 --class_num 1.0

The final mAP is only 0.302986.

Pytorch with Resnet50

I changed the network to ResNet-50 in the PyTorch version of HashNet, and I cannot get an acceptable mAP on the CUB dataset. I have tried some fine-tuning, and the best mAP I can get is around 40 with 64 bits. SGD and Adam don't work; that result was obtained with RMSprop.

Any idea what the problem is and how I can fix it?

Some problem about calculating MAP

Hi, the way you calculate mAP seems imprecise. I checked your function "mean_average_precision": when relevant_num == 0, the query is ignored when calculating the mean. That eliminates some extremely bad cases (queries for which no similar image is retrieved), so the mAP will be a little higher.
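
The stricter variant would count such queries as AP = 0 instead of dropping them, a one-line change (my sketch):

if relevant_num != 0:
    APx.append(np.sum(Px * imatch) / relevant_num)
else:
    APx.append(0.0)    # keep the failed query in the mean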

Imagenet snapshot

Hi, can you please share the output/snapshot of the train/test for the model you have already run? Executing the code is taking too long on my machine. Thanks in advance.

How to do one hot encoding?

Data: suppose I have 10 folders with 10 images each (10*10 = 100 images in total).

Question 1: what will be the one-hot encoding of the 7th image in the 9th class?
Question 2: can there be only a 10-digit one-hot encoding, e.g. (img1.jpg 0 0 0 0 0 0 0 0 0 1)?

P.S. When I tried a single-column label such as (img1.jpg 1), it gives me the error below:

[cloudera@quickstart src]$ python train.py --gpu_id 0 --dataset coco --prefix resnet50_hashnet --hash_bit 48 --net ResNet50 --lr 0.0003 --class_num .1
{'class_num': 0.1, 'l_threshold': 15.0, 'q_weight': 0, 'sigmoid_param': 0.20833333333333334, 'l_weight': 1.0}
Traceback (most recent call last):
  File "train.py", line 278, in <module>
    train(config)
  File "train.py", line 145, in train
    transform=prep_dict["train_set1"])
  File "/home/cloudera/Capstone/HashNet-master/pytorch/src/data_list.py", line 71, in __init__
    imgs = make_dataset(image_list, labels)
  File "/home/cloudera/Capstone/HashNet-master/pytorch/src/data_list.py", line 18, in make_dataset
    images = [(val.split()[0], np.array([int(la) for la in val.split()[1:]])) for val in image_list]
ValueError: invalid literal for int() with base 10: 'Caves/1._maxresdefault.jpg'

I want to ask a question about the dataset

I have read your data text file. In it I see an instance like this: /home/caozhangjie/run-czj/dataset/nus_wide/42368_249918118_7f1a328add_m.jpg 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0

"/home/caozhangjie/run-czj/dataset/nus_wide/42368_249918118_7f1a328add_m.jpg" is the data path, right? And "0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0" is the label vector. But in the NUS-WIDE dataset the length of the label vector should be 81; why is your label vector only of length 21?

@bfan @caozhangjie

@bfan @caozhangjie
I added the weighting in the PyTorch version (without C).

import torch
from torch.autograd import Variable

def pairwise_loss(outputs1, outputs2, label1, label2):
    # s_ij = 1 if the two samples share at least one label, else 0
    similarity = Variable(torch.mm(label1.data.float(), label2.data.float().t()) > 0).float()
    dot_product = torch.mm(outputs1, outputs2.t())

    mask_positive = similarity.data > 0
    mask_negative = similarity.data <= 0
    # numerically stable log(1 + exp(dot_product)) - s_ij * dot_product
    exp_loss = torch.log(1 + torch.exp(-torch.abs(dot_product))) \
        + torch.max(dot_product, Variable(torch.FloatTensor([0.]).cuda())) \
        - similarity * dot_product

    # weighting: |S| / |S1| for similar pairs, |S| / |S0| for dissimilar pairs
    S1 = torch.sum(mask_positive.float())
    S0 = torch.sum(mask_negative.float())
    S = S0 + S1
    exp_loss[similarity.data > 0] = exp_loss[similarity.data > 0] * (S / S1)
    exp_loss[similarity.data <= 0] = exp_loss[similarity.data <= 0] * (S / S0)

    loss = torch.sum(exp_loss) / S

    return loss

Originally posted by @soon-will in #17 (comment)
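
A usage sketch with dummy data (my addition; it assumes a CUDA device, since the loss creates a CUDA tensor internally, and hypothetical shapes: batch 8, 48 bits, 21 classes). Note that a batch pair with no similar or no dissimilar pairs would make S1 or S0 zero and divide by zero:

outputs1 = Variable(torch.randn(8, 48).cuda())
outputs2 = Variable(torch.randn(8, 48).cuda())
label1 = Variable((torch.rand(8, 21) > 0.9).float().cuda())
label2 = Variable((torch.rand(8, 21) > 0.9).float().cuda())
print(pairwise_loss(outputs1, outputs2, label1, label2))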

caffe error while training

Hi! I'm really interested in applying the same technique to other image data with tags. While trying to replicate the training process, I followed the instructions on changing train_val.prototxt and solver.prototxt.

However, I see this error when I try to train my model:

Error parsing text-format caffe.NetParameter: 19:14: Message type "caffe.ImageDataParameter" has no field named "label_dim".

Do you have any advice? Thanks.

Best,
Haining

question on nuswide dataset

Hi authors, in the paper you mention that NUS-WIDE "uses the subset of 195,834 images that are associated with the 21 most frequent concepts", but I see vectors of size 81 in "data/nuswide_81/train.txt". May I ask where you filter the data, or are you just using the data specified in those txt files? What about data without labels (vectors with all 0s)? (A quick check is sketched below.)
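
A quick way to count unlabeled entries (my sketch, using the list-file format from this repo):

zero_label = sum(
    1 for line in open('../data/nuswide_81/train.txt')
    if line.strip() and not any(int(b) for b in line.split()[1:])
)
print(zero_label, 'images have an all-zero label vector')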

About validation in pytorch

Thank you for your effort.

In the PyTorch version of the code, the model is saved every 3,000 iterations, so after 10,000 training iterations about 3 models are saved. So, I have two questions:

  1. Which model do you use for testing?

  2. It seems that you don't validate the model after training, so how do you ensure that the saved model is the best one?

Thanks for your reply.

a question about training set

Excuse me, I have seen many papers randomly sample data when training on NUS-WIDE: often 21 classes are selected, with 500 images per class as the training set and 100 images per class as the query set. Could you tell me how you allocate the NUS-WIDE training set?

number of training images for imagenet

Hi, first, I really appreciate you sharing the code. I have a question about the experimental settings.
In your paper, the experimental setting for ImageNet is:

We randomly select 100 categories, use all the images of these categories in the training set as the database, and use all the images in the validation set as the queries; furthermore, we randomly select 100 images per category from the database as the training points

so the total number of training images is 10,000. However, the number of training images in this repo is 13,000.
Which one is correct for reproducing the results in the paper?

Could you do me a favor?

I am a beginner in deep hashing; there are two questions I cannot understand:

  1. Why not use the classification results as hash codes? What are the shortcomings?
  2. I find that almost all supervised deep hashing methods use AlexNet as the backbone network; is it acceptable to change this network when writing papers?

Please favor me with your instructions.
Thank you very much!

HashNet on CUB200

Hi, I've tried to apply HashNet to fine-grained recognition, so I adapted the PyTorch code to the CUB200 dataset with a finetuned ResNet50, but I can't make the loss converge. I've tried different optimizers such as SGD, Adam, and RMSprop, different class_num values such as 1.0 and 200.0, and different lr values from 1e-5 to 1e-3.

Here is a set of parameters which I tried:

python train.py \
    --dataset cub200 \
    --prefix resnet50_hashnet \
    --hash_bit 64 \
    --net ResNet50 \
    --lr 1e-5 \
    --class_num 1.0

{'l_weight': 1.0, 'q_weight': 0, 'l_threshold': 15.0, 'sigmoid_param': 0.15625, 'class_num': 1.0}{'type': 'RMSprop', 'optim_params': {'lr': 1.0, 'weight_decay': 1e-05}, 'lr_type': 'step', 'lr_param': {'init_lr': 1e-05, 'gamma': 0.5, 'step': 2000}}

But the training loss is always around 0.69, and the mAP is extremely low, around 0.04.

No matter what parameters are used, the mAP is always lower than 0.05. Intuitively, this is not reasonable.

Have you ever applied HashNet to CUB200? Do you have any ideas about it? Thanks.

Some issues regarding the dataset

Hi, I just looked into the PyTorch version of your code, and I wonder if you could help clarify some points.

(1) ImageNet-100 train.txt consists of 13,000 images. So I assume the reported results are from using 130 images per class during training?

(2) NUS-WIDE in the caffe folder has 81 classes, but only 21 classes in the pytorch folder. Are the files in the pytorch folder the result of selecting only the images with the 21 most frequent labels?

Thank you for answering my questions.
