Comments (5)
Hi, thanks for your attention!
We have open-sourced the Tag2Text forward function; please refer to README.md for details. While I would personally love to open-source all the training code right away, it is still going through the company's internal process. We are actively working towards making it available as soon as possible.
Best wishes!
from recognize-anything.
Great work, and I'm really looking forward to the training code!
Hi, thanks for your great work! I want to ask: if I replace 'blip.py' with 'ram.py', it seems it should also run (though I haven't tried it). Is that right? The code structure looks similar. Thanks for your reply.
Besides, another key change from BLIP is the data format. BLIP reads only two key-value pairs: {'image': path_of_image, 'caption': text_of_image}. Tag2Text needs to read three: {'image': path_of_image, 'caption': text_of_image, 'tag': tags_of_image}, where the tags are parsed from the caption.
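To make the difference in record format concrete, here is a minimal sketch of turning a BLIP-style record into a Tag2Text-style one. The tag vocabulary, the `add_tags` helper, the example path, and the choice to store tags as a list are all assumptions for illustration; the thread does not show the actual parser or the exact type of the 'tag' field.

```python
import re

# Hypothetical tag vocabulary; the real tag list is not given in this thread.
TAG_VOCAB = {"dog", "grass", "ball", "park"}


def add_tags(blip_record, vocab=TAG_VOCAB):
    """Return a copy of a BLIP-style record with an extra 'tag' field.

    Tags are the vocabulary words found in the caption, in order of first
    appearance. This simple word match is a stand-in for whatever caption
    parser the real pipeline uses.
    """
    words = re.findall(r"[a-z]+", blip_record["caption"].lower())
    tags = []
    for word in words:
        if word in vocab and word not in tags:
            tags.append(word)
    return {**blip_record, "tag": tags}


# BLIP-style record: two key-value pairs.
blip_record = {
    "image": "images/0001.jpg",  # hypothetical path
    "caption": "A dog chases a ball on the grass.",
}

# Tag2Text-style record: the same two pairs plus 'tag'.
tag2text_record = add_tags(blip_record)
print(tag2text_record)
```

A dataset class adapted from BLIP would then need to read and return the 'tag' field alongside 'image' and 'caption'.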
Thanks. So when will the RAM training code be released? Could you give an approximate time?