Comments (4)
Hi @DreamsofGg !
You can use this:
CUDA_VISIBLE_DEVICES="0,1,2,3" python -m torch.distributed.launch --nproc_per_node=4 -m \
colbert.train --amp --doc_maxlen 180 --mask-punctuation --bsize 32 --accum 1 \
--triples /path/to/triples.tsv --similarity l2 \
--root /root/to/log/experiments/data/ --experiment ExperimentName
You can modify the number of GPUs (e.g., CUDA_VISIBLE_DEVICES="0" and --nproc_per_node=1) as needed.
Let me know if you face any issues!
Wow! Your command needs only half the time to finish training (also with a single GPU). Thanks a lot!
@okhat sorry to bother you again. I found that the similarity used in the paper is cosine_sim, so how does l2_sim influence the results?
Actually both are mentioned in the paper, see Table 1 vs. Table 2.
There shouldn't be a major difference in performance; feel free to use either one.
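For intuition on why the two options tend to behave similarly, here is a minimal pure-Python sketch. It is not the code in colbert.train: the helper names (normalize, cosine_sim, neg_sq_l2_sim, maxsim) and the toy vectors are made up for illustration, and it assumes every token embedding has unit L2 norm and that no document tokens are masked. Under that assumption, -||q - d||^2 = 2 (q . d) - 2, so the summed MaxSim score under L2 is just a rescaled and shifted version of the cosine score.

```python
import math

def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_sim(q, d):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(a * b for a, b in zip(q, d))

def neg_sq_l2_sim(q, d):
    """Negative squared Euclidean distance (higher means closer)."""
    return -sum((a - b) ** 2 for a, b in zip(q, d))

def maxsim(Q, D, sim):
    """Late-interaction score: for each query token take the best-matching
    document token, then sum over query tokens."""
    return sum(max(sim(q, d) for d in D) for q in Q)

# Toy 3-dimensional token embeddings (made-up values, illustration only).
Q = [normalize(v) for v in [[1.0, 0.2, 0.0], [0.1, 1.0, 0.3]]]
D1 = [normalize(v) for v in [[0.9, 0.1, 0.1], [0.0, 0.8, 0.5], [0.3, 0.3, 0.9]]]
D2 = [normalize(v) for v in [[0.2, 0.1, 1.0], [0.5, 0.5, 0.1]]]

for name, D in [("D1", D1), ("D2", D2)]:
    cos = maxsim(Q, D, cosine_sim)
    l2 = maxsim(Q, D, neg_sq_l2_sim)
    # For unit vectors: l2 == 2 * cos - 2 * len(Q), up to floating-point error.
    print(name, round(cos, 4), round(l2, 4), round(2 * cos - 2 * len(Q), 4))
```

For a fixed query, the two scores rank documents in the same order in this idealized setting. A trained model can still differ slightly between the two settings, since the factor of 2 changes the scale of the scores seen by the loss.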