navervision / lincir
Official Pytorch implementation of LinCIR: Language-only Training of Zero-shot Composed Image Retrieval (CVPR 2024)
License: Other
In line 239 of train_phi.py and line 162 of validate.py (at commit 6ffbdeb):
Hi, thank you for your interesting work.
I wonder if I can use models other than CLIP (such as BLIP, BLIP-2, etc.) as the backbone.
How can this be done, and what modifications are needed?
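A minimal sketch of what such a swap usually involves, assuming (as the paper describes) that phi is a small network mapping the backbone's projected text latent into the token-embedding space. The class name, hidden size, and dimensions below are illustrative assumptions, not the repo's actual code; changing the backbone mainly means matching these two dimensionalities:

```python
import torch
import torch.nn as nn

# Hypothetical sketch of a textual-inversion network like phi: it maps a
# backbone's projected text latent to a pseudo-token embedding. Swapping
# CLIP for another backbone (e.g. a BLIP-2 text encoder) chiefly means
# matching latent_dim and token_dim to the new backbone.
class Phi(nn.Module):
    def __init__(self, latent_dim: int, token_dim: int, hidden: int = 1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(latent_dim, hidden),
            nn.GELU(),
            nn.Linear(hidden, token_dim),
        )

    def forward(self, text_latent: torch.Tensor) -> torch.Tensor:
        return self.net(text_latent)

# CLIP ViT-L-style dims (768/768 here are assumed for illustration).
phi_clip = Phi(latent_dim=768, token_dim=768)
# A hypothetical backbone with a different latent width only changes the dims.
phi_other = Phi(latent_dim=256, token_dim=768)

pseudo_token = phi_clip(torch.randn(2, 768))
print(pseudo_token.shape)  # torch.Size([2, 768])
```

Beyond the dimensions, the tokenizer, the placeholder token, and the frozen text/image encoders would also have to come from the new backbone.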
In the paper, CC3M and 2.47M StableDiffusion prompts are employed for training. However, the released code adopts three datasets, so I want to know whether 'dataset3': 'Geonmo/midjourney-prompts-onlyonly' is also used for training, i.e., https://github.com/navervision/lincir/blob/28943db28b4f65d41dc2724b6e79596b0b8cc82d/loader.py#L219C19-L219C21
I think it's very interesting work!
The training process is clear, but there seems to be some ambiguity about inference. For example, if the pre-trained module \phi receives image features (rather than text) as input at inference time, how is its output concatenated with the condition text?
Table B.5 shows the results of different prompts. Which prompt is used for Tables 2-5?
Looking forward to the author's reply!
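For what the question is asking about, here is an illustrative sketch of the usual zero-shot CIR inference pattern (as in Pic2Word/SEARLE-style methods): the frozen image encoder produces a feature, phi projects it into the token-embedding space, and that pseudo token is spliced into the caption's embeddings at the placeholder position before the frozen text encoder runs. All names and dimensions below are toy assumptions, not the repo's code:

```python
import torch
import torch.nn as nn

# Toy dimensions and placeholder id, chosen only for illustration.
vocab_size, token_dim, feat_dim = 100, 64, 32
placeholder_id = 99                        # stands in for the "[$]" token
embed = nn.Embedding(vocab_size, token_dim)
phi = nn.Linear(feat_dim, token_dim)       # stands in for the trained phi

caption_ids = torch.tensor([[5, 12, placeholder_id, 7]])  # "a photo of [$] ..."
image_feat = torch.randn(1, feat_dim)      # from the frozen image encoder

with torch.no_grad():
    tok_embeds = embed(caption_ids)              # (1, 4, token_dim)
    pseudo = phi(image_feat)                     # (1, token_dim)
    mask = caption_ids == placeholder_id         # locate the placeholder
    tok_embeds[mask] = pseudo                    # splice in the pseudo token
# tok_embeds would now pass through the frozen text encoder to form the query.
print(tok_embeds.shape)
```

So there is no literal concatenation of separate outputs: the pseudo token replaces the placeholder's embedding inside the ordinary token sequence.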
Hi, I tried to pretrain a phi model with a ViT-G backbone, but the results are not as good. Could you provide a pretrained model with ViT-G as the backbone?
Thank you for this great work!
I trained LinCIR on my computer with a single GPU and found that the program hangs after running all the steps.
Did you face the same problem when training on a single GPU?
Nice work! Do you have a plan for when you will release the model?
Great work! In line 32 of lincir/encode_with_pseudo_tokens.py (at commit 6ffbdeb):
I've run the training code, but I can't find the log file in the logs folder. Also, there may be a problem in the training code: the training loop does not seem to stop after max_train_steps, possibly because it never exits the `while True` loop (https://github.com/navervision/lincir/blob/b1ce7d283ab92c0f131972c71d5fed1ce54f23ac/train_phi.py#L222C1-L222C1)
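The issue describes a common pattern bug. A minimal sketch (not the repo's actual loop) of how such a `while True` epoch loop can be made to terminate: the step counter must be checked inside the inner dataloader loop, and the outer loop must be broken as well:

```python
# Minimal illustration: `break` only exits the inner loop, so without an
# extra flag the outer `while True` keeps re-iterating the dataloader.
max_train_steps = 10
global_step = 0
done = False
while True:                    # epoch loop over a re-created dataloader
    for batch in range(4):     # stand-in for `for batch in dataloader:`
        global_step += 1       # one optimizer step per batch
        if global_step >= max_train_steps:
            done = True
            break              # leaves only the inner loop
    if done:
        break                  # also leaves the outer `while True`
print(global_step)  # 10
```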
According to the README, the code for evaluating the GeneCIS benchmark is located in a branch named eval_genecis. However, I could not find this specific branch upon checking your repository.
Evaluating GeneCIS requires a few additional steps. Check out the eval_genecis branch and make the necessary adjustments to the configuration in ./eval_genecis/config.py. Then, run the following script:
$ cd eval_genecis
$ python evaluate.py \
--combiner_mode phi \
--model large \
--combiner_pretrain_path /path/to/trained_your/phi_best.pt
If this branch hasn't been uploaded, could you make it available?
I tried to reproduce the full training, but the performance did not reach the level reported in the paper. It stops improving after about 5000 steps, may even decline in later iterations, and shows little gain over the starting point. Was the learning rate kept at 1e-4 throughout, and could adjusting it further improve performance?
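On the learning-rate question, one standard thing to try instead of a constant 1e-4 is warmup followed by cosine decay. The sketch below only shows the schedule mechanics with assumed step counts; it is not a claim about the paper's actual recipe:

```python
import math
import torch

# Illustrative schedule: linear warmup to the base LR, then cosine decay
# to zero. warmup/total are assumed values, not the repo's settings.
model = torch.nn.Linear(8, 8)
opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
warmup, total = 500, 10_000

def lr_lambda(step: int) -> float:
    if step < warmup:
        return step / warmup                       # linear warmup
    progress = (step - warmup) / (total - warmup)  # 0 -> 1 after warmup
    return 0.5 * (1.0 + math.cos(math.pi * progress))  # cosine decay

sched = torch.optim.lr_scheduler.LambdaLR(opt, lr_lambda)
for _ in range(total):
    opt.step()
    sched.step()
print(f"{opt.param_groups[0]['lr']:.2e}")  # decayed to 0.00e+00
```

Whether this helps past the 5000-step plateau would have to be checked empirically.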
Hi there!
First of all, I've been reading your paper, and it's a really interesting piece of work - well done!
On another note, thank you for including a citation to our work in your LinCIR paper; it's really appreciated!
While examining your code, I noticed some notable similarities. For example, the file 'validate.py' is almost identical. Could I kindly request that you include a citation to our code in your README?
On a friendly note, I noticed that the license information in the files borrowed from SEARLE is still intact. To be consistent with open-source practices, would you mind removing the license information from the files taken directly from our repository?
Thank you very much for your understanding and cooperation.
Best,
Alberto