Comments (4)
Hi @xuanzebi! Good question. Yes, the fine-grained interaction of ColBERT is designed to be stronger than single-vector models like SBERT, DPR, etc. When you say text matching, are you referring to retrieval tasks (e.g., MS MARCO Ranking, Natural Questions)? For this, check our ColBERT and ColBERT-QA papers. Or do you have semantic similarity tasks (e.g., STS) in mind? We haven't tested on those, but it would certainly be a cool experiment.
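For readers unfamiliar with the distinction, here is a minimal sketch of the two scoring styles mentioned above. It is not the ColBERT codebase: the function names and the 2-dimensional toy embeddings are made up for illustration, and mean pooling is only a stand-in for how a single-vector model might collapse a text into one vector. The late-interaction score keeps one vector per token and sums, over query tokens, the best cosine match among the document tokens.

```python
import math

def normalize(v):
    # L2-normalize a vector (assumes the vector is non-zero).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def maxsim_score(query_vecs, doc_vecs):
    # Late interaction: each query token picks its best-matching
    # document token (cosine similarity), and the maxima are summed.
    q = [normalize(v) for v in query_vecs]
    d = [normalize(v) for v in doc_vecs]
    return sum(max(dot(qv, dv) for dv in d) for qv in q)

def single_vector_score(query_vecs, doc_vecs):
    # Single-vector stand-in: mean-pool the token vectors of each text,
    # then compare the two pooled vectors with cosine similarity.
    def mean_pool(vecs):
        n = len(vecs)
        return [sum(col) / n for col in zip(*vecs)]
    return dot(normalize(mean_pool(query_vecs)), normalize(mean_pool(doc_vecs)))

# Hypothetical toy token embeddings, chosen by hand for illustration.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[1.0, 0.0], [0.0, 1.0], [-1.0, -0.5]]  # matches both query tokens, plus one unrelated token
doc_b = [[1.0, 1.0]]                            # one token that sits "between" the query tokens

for name, doc in [("doc_a", doc_a), ("doc_b", doc_b)]:
    print(name, maxsim_score(query, doc), single_vector_score(query, doc))
```

In this hand-built example the two scores rank the documents differently: pooling lets the unrelated token in `doc_a` drag its single vector away from the query, while the token-level maxima ignore it. This only illustrates the mechanism and says nothing about results on real benchmarks.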
from colbert.
Thank you very much for such a quick reply. By text matching I mean the similarity of two texts, such as judging whether two texts are on the same subject. I will also run experimental comparisons afterwards. Thanks!
Awesome!
Do you have specific benchmarks in mind? I might be able to help you set things up if you like.
Closing for lack of activity. Please feel free to re-open if you have any other questions!