Queries about the nuswide_wordvec text file (cvpr17-dvsq, closed)

zhangzeng97 commented on June 3, 2024

queries about the nuswide_wordvec text file

from cvpr17-dvsq.

Comments (10)

caoyue10 commented on June 3, 2024

Hi Zeng,

Here we use a word2vec model pretrained on the GoogleNews dataset (e.g. https://github.com/mmihaltz/word2vec-GoogleNews-vectors) to extract the word embeddings for the image labels, e.g. dog, cat and so on.

Best,
Yue

On 2017-11-24 15:18 GMT+08:00, zhangzeng97 wrote:

…

Hi Cao Yue,

Many thanks for your great work on the DVSQ project. I am currently working on my project at school, and it has helped me a lot. I have successfully deployed the whole project. However, when I tried to run it with my own dataset, I became confused. May I ask where the wordvec file in the data folder comes from? I have read in your paper about the transformer that converts the image representations to label embeddings. However, it does not seem relevant to this file. May I ask how I can generate the word vectors, and which dataset should be converted to word vectors?

Thank you.

Best,
Zhang Zeng


zhangzeng97 commented on June 3, 2024

Hi Cao Yue,

Many thanks for your fast reply!

I have looked into that and it helps a lot.

Best,
Zhang Zeng

zhangzeng97 commented on June 3, 2024

Hi,

May I ask something about the paper itself here?

I have read through it several times, but there are some points that I cannot understand. For example, why do we need the word embeddings for the labels? I tried to print the validation output, but it is the 81-dimensional label instead of the 300-dimensional word embedding.

Thanks a lot:)

Best,
Zhang Zeng

bl0 commented on June 3, 2024

Hi Zeng,
You are right: the label itself is 81-dimensional because NUS-WIDE is an 81-class dataset, and the word embedding of a single label is 300-dimensional.
Because NUS-WIDE is a multi-label dataset, the label representation of an image is actually an 81 x 300 matrix (not just a vector of dimension 81 or 300). Specifically, the i-th row is the word embedding of label i if the image has label i; otherwise, the i-th row is all zeros. (You can verify this at line 322 of the file "net.py".)
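As a rough illustration of the matrix described above, here is a minimal NumPy sketch. It is not the code from "net.py"; the function name `label_matrix` and the random placeholder embeddings are assumptions made for the example.

```python
import numpy as np

def label_matrix(multi_hot, word_vectors):
    """Row i is the embedding of label i if the image has label i, else all zeros."""
    multi_hot = np.asarray(multi_hot, dtype=word_vectors.dtype)
    # Broadcasting (n_classes, 1) * (n_classes, dim) zeroes the rows of absent labels.
    return multi_hot[:, None] * word_vectors

n_classes, dim = 81, 300
rng = np.random.default_rng(0)
# Placeholder embeddings (assumption); in practice these would be the word2vec label vectors.
word_vectors = rng.standard_normal((n_classes, dim)).astype(np.float32)

multi_hot = np.zeros(n_classes, dtype=np.float32)
multi_hot[[3, 17]] = 1.0  # hypothetical image carrying labels 3 and 17

rep = label_matrix(multi_hot, word_vectors)
print(rep.shape)  # (81, 300)
```

Rows 3 and 17 of `rep` hold the corresponding embeddings, and every other row is zero.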

zhangzeng97 commented on June 3, 2024

Hi Bin,

Thank you so much for your fast reply!

I have gone through it again. May I ask what the codebook C mentioned in Section 3.2 of the paper is? My understanding is that, for 81 classes, each class contains K centers. Is the C there the same as the C at line 68 of the net_val.py file?
I tried to print out self.C from the model, and it is a 1024 x 300 tensor. I assume the 300 corresponds to the 300-dimensional vectors, but I am not sure where the 1024 comes from.

Best,
Zeng

bl0 commented on June 3, 2024

1024 = n_subcenter(256) * n_subspace(4).
Sorry for my late reply.
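
To make the shape arithmetic concrete, here is a small NumPy sketch. Only the numbers 256, 4, 1024 and 300 come from this thread. The block layout (four consecutive groups of 256 rows) and the reconstruction as a sum of one codeword per group are assumptions for illustration, not a statement of how the repository stores or uses C.

```python
import numpy as np

n_subspace, n_subcenter, dim = 4, 256, 300

rng = np.random.default_rng(0)
# Placeholder codebook with the shape reported in the thread: (4 * 256, 300).
C = rng.standard_normal((n_subspace * n_subcenter, dim)).astype(np.float32)
print(C.shape)  # (1024, 300)

# Assumed layout: rows 0..255 form the first sub-codebook, 256..511 the second, etc.
C_blocks = C.reshape(n_subspace, n_subcenter, dim)

# Hypothetical code for one item: one codeword index per sub-codebook.
codes = np.array([5, 200, 31, 255])

# Assumed reconstruction: sum the selected codeword from each sub-codebook.
approx = C_blocks[np.arange(n_subspace), codes].sum(axis=0)

# Same thing written as a 1024-dimensional indicator vector times C.
b = np.zeros(n_subspace * n_subcenter, dtype=np.float32)
b[np.arange(n_subspace) * n_subcenter + codes] = 1.0
print(approx.shape, np.allclose(approx, b @ C, atol=1e-4))
```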

freehome1 commented on June 3, 2024

I have downloaded GoogleNews-vectors-negative300.bin, and I wonder how to generate the word2vec.txt for the cifar10 dataset.

bl0 commented on June 3, 2024

You can use gensim to load the model and extract the word vectors. Here is a short example:

import gensim
model = gensim.models.KeyedVectors.load_word2vec_format('GoogleNews-vectors-negative300.bin', binary=True)
print(model['car'])
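
Building on the snippet above, the following sketch writes one vector per class label to a text file. The helper name `save_label_vectors` and the file layout (one line per label, space-separated floats, in label order) are assumptions; the format the repository expects for its wordvec file is not stated in this thread. The helper only needs an object that supports `in` and `[]` lookups, so a plain dict stands in for the gensim model here.

```python
import os
import tempfile

import numpy as np

def save_label_vectors(vectors, labels, path):
    """Look up each label and write the vectors to `path`, one label per line."""
    # Fail early if a label is not in the vocabulary.
    missing = [w for w in labels if w not in vectors]
    if missing:
        raise KeyError(f"labels missing from the vocabulary: {missing}")
    matrix = np.stack([np.asarray(vectors[w], dtype=np.float32) for w in labels])
    np.savetxt(path, matrix, fmt="%.6f")  # space-separated floats (assumed format)
    return matrix

# Toy stand-in for the loaded model (assumption); replace with `model` from gensim.
toy_vectors = {"airplane": np.ones(300), "bird": np.zeros(300)}
out_path = os.path.join(tempfile.gettempdir(), "toy_wordvec.txt")

matrix = save_label_vectors(toy_vectors, ["airplane", "bird"], out_path)
print(matrix.shape)  # (2, 300)
```

With the real model, the call would look like `save_label_vectors(model, class_names, "word2vec.txt")`, where `class_names` is the list of class labels in the order the dataset uses.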

freehome1 commented on June 3, 2024

Thanks for your help. However, I tried model['airplane', ...] (including the 10 classes of cifar10), and the resulting .txt file is wrong. I would like to know how to get the correct word vectors.

bl0 commented on June 3, 2024

I just downloaded the pre-trained word vectors and they work, so maybe you need to check your "GoogleNews-vectors-negative300.bin".

For reference, here are the pre-trained word vectors I use:
wget -c "https://s3.amazonaws.com/dl4j-distribution/GoogleNews-vectors-negative300.bin.gz"
