Comments (23)
This command seems like the initial indexing step, not the faiss step. Is that right?
I would run -m colbert.index from outside the colbert/ subdirectory to make sure the imports work. Also, don't pass local_rank. Basically, please follow the template in the README.
Let me know if issues still persist!
from colbert.
Another orthogonal thing is: do you want to index queries.tsv? Typically you'd want to index passages.
Yes, they're passages. The file is under a different name, and I mistakenly wrote queries.tsv instead of passages.tsv.
unfortunately, they still persist
what do you mean by initial indexing step?
I'm assuming this is the first step of indexing (before colbert.index_faiss is used).
If so, here's a sample script that's modified based on your command above:
CUDA_VISIBLE_DEVICES="0,1" OMP_NUM_THREADS=1 \
python -m torch.distributed.launch --nproc_per_node=2 -m \
colbert.index --amp --doc_maxlen 180 --mask-punctuation --bsize 256 \
--checkpoint out-of-the-box-model.pt \
--collection passages.tsv \
--index_root /faiss --index_name INDEX \
--root "$PWD/experiments/" --experiment out_of_the_box
Just to be sure: does this work for you?
oh, and yes it's the first step (encoding)
I get this now (slightly different and may be a step in the right direction)
Hmm this is strange. Could you run with one GPU in distributed mode and see if that works?
CUDA_VISIBLE_DEVICES="0" OMP_NUM_THREADS=1 \
python -m torch.distributed.launch --nproc_per_node=1 -m \
colbert.index --amp --doc_maxlen 180 --mask-punctuation --bsize 256 \
--checkpoint out-of-the-box-model.pt \
--collection passages.tsv \
--index_root /faiss --index_name INDEX \
--root "$PWD/experiments/" --experiment out_of_the_box
it didn't :/
Okay. What about single-GPU mode?
CUDA_VISIBLE_DEVICES="0" python -m \
colbert.index --amp --doc_maxlen 180 --mask-punctuation --bsize 256 \
--checkpoint out-of-the-box-model.pt \
--collection passages.tsv \
--index_root /faiss --index_name INDEX \
--root "$PWD/experiments/" --experiment out_of_the_box
It's hanging right now, so might be working!
Ah yes, I'm getting another error that I have already seen, and then it continues to hang (I saw this yesterday):
Right, so your passages must be in a TSV file whose first column is the passage ID.
Each passage ID must equal its line number, starting at 0. You can also start at 1, but then line 0 needs to be a header: id \t text
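If it helps, the rule above can be checked before launching an indexing run. This is only a minimal sketch: `check_collection` is a hypothetical helper name, not part of ColBERT, and it assumes the header line is recognized by the literal string `id` in the first column, as described above.

```python
import sys


def check_collection(lines):
    """Return (line_number, problem) tuples for a passages TSV.

    Rule being checked (as described in this thread): column 0 is the
    passage ID and must equal the 0-based line number, except that
    line 0 may be a header whose first column is the literal "id".
    """
    problems = []
    for line_idx, line in enumerate(lines):
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2:
            problems.append((line_idx, "expected at least two tab-separated columns"))
            continue
        pid = fields[0]
        if line_idx == 0 and pid == "id":
            continue  # header line, allowed only at line 0
        if not pid.isdigit() or int(pid) != line_idx:
            problems.append((line_idx, f"passage ID {pid!r} != line number {line_idx}"))
    return problems


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python check_collection.py passages.tsv
    with open(sys.argv[1], encoding="utf-8") as f:
        for line_idx, problem in check_collection(f)[:10]:
            print(f"line {line_idx}: {problem}")
```

A file whose first line is `1\tsome passage` with no header would be reported starting at line 0.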
Oh, mine start at 1. The first line is 1\tsome passage
Great to know! I'll fix that
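For a file whose IDs already start at 1, one possible fix is to prepend the header line rather than renumber everything. A rough sketch, assuming the header is `id`, a tab, then `text`, as mentioned above; `with_header` is a made-up helper name:

```python
def with_header(lines):
    """Yield the lines of a passages TSV, adding an `id<TAB>text` header
    when the first passage ID is 1. Other inputs pass through unchanged."""
    it = iter(lines)
    first = next(it, None)
    if first is None:
        return  # empty input, nothing to yield
    if first.split("\t", 1)[0] == "1":
        yield "id\ttext\n"
    yield first
    yield from it
```

Writing the result to a new file (for example with `out.writelines(with_header(src))`) keeps the original collection untouched.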
Okay, so single-GPU encoding will work.
For debugging the multi-GPU setup, could you run the following in a Python terminal?
import torch
torch.cuda.device_count()
1
Ah, so this all makes sense then. Either you only have one GPU or PyTorch is not able to detect your other GPUs.
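To narrow down which of the two it is, it may help to compare what the driver reports with what PyTorch sees in the same environment. A small sketch, not a definitive diagnostic: the helper names are made up, and it assumes `nvidia-smi` is on the PATH (it returns None otherwise).

```python
import os
import shutil
import subprocess


def parse_visible_devices(value):
    """Parse a CUDA_VISIBLE_DEVICES value; None means the variable is unset."""
    if value is None:
        return None
    return [d.strip() for d in value.split(",") if d.strip()]


def driver_gpu_count():
    """Count GPUs listed by `nvidia-smi -L`, or None if it is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
    if result.returncode != 0:
        return None
    return sum(1 for line in result.stdout.splitlines() if line.startswith("GPU "))


def torch_gpu_count():
    """Number of GPUs PyTorch can use, or None if torch is not installed."""
    try:
        import torch
    except ImportError:
        return None
    return torch.cuda.device_count()


if __name__ == "__main__":
    print("CUDA_VISIBLE_DEVICES:", parse_visible_devices(os.environ.get("CUDA_VISIBLE_DEVICES")))
    print("GPUs seen by driver: ", driver_gpu_count())
    print("GPUs seen by PyTorch:", torch_gpu_count())
```

If the driver lists two GPUs but PyTorch reports one, a CUDA_VISIBLE_DEVICES value set somewhere in the shell or job environment is one thing worth ruling out, since it limits which devices the process can see.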
(I'm not sure why that is, by the way. I have two GPUs)
But yeah that makes sense, I'll look into that. Thank you so much for your prompt replies! Very helpful
Looks resolved?
Closing but feel free to reopen if I can help with other issues!