Comments (6)
Hey, please try wrapping your code in
if __name__ == "__main__":
as per the README. This should do the trick; the issue is due to multiprocessing hanging.
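For context, here is a minimal standard-library sketch of why the guard matters. It does not use RAGatouille, and `square` and `run_pool` are hypothetical names chosen for illustration. On platforms where worker processes are started with the "spawn" method (the default on Windows), each child re-imports the main module, so process-starting code has to sit behind the guard to avoid being re-run in every child.

```python
import multiprocessing as mp


def square(x):
    # Hypothetical worker function. It lives at module top level so that
    # child processes can find it when they import this module.
    return x * x


def run_pool(values, processes=2):
    # Start a small pool of worker processes and map the function over the inputs.
    with mp.Pool(processes=processes) as pool:
        return pool.map(square, values)


if __name__ == "__main__":
    # Only the parent process enters this branch. Children that re-import
    # the module skip it, so they do not try to start pools of their own.
    print(run_pool([1, 2, 3]))
```

Without the guard, the `run_pool` call would sit at module top level and be executed again by every spawned child, which typically ends in an error or a hang.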
Also, please note: I was just updating the README to give a better example, where the collection is actually split into chunks! Please use this full code instead for a better demo:
from ragatouille import RAGPretrainedModel
from ragatouille.utils import get_wikipedia_page
from ragatouille.data import CorpusProcessor

if __name__ == "__main__":
    RAG = RAGPretrainedModel.from_pretrained("colbert-ir/colbertv2.0")
    my_documents = [get_wikipedia_page("Hayao_Miyazaki"), get_wikipedia_page("Studio_Ghibli")]
    processor = CorpusProcessor()
    my_documents = processor.process_corpus(my_documents)
    index_path = RAG.index(index_name="my_index", collection=my_documents)
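To illustrate what "split into chunks" means, here is a rough plain-Python sketch. It is not how `CorpusProcessor` works internally; `split_into_chunks`, `chunk_collection`, and the fixed word-count rule are assumptions made purely for illustration.

```python
def split_into_chunks(text, max_words=100):
    # Hypothetical splitter: cut a document into pieces of at most max_words words.
    words = text.split()
    return [
        " ".join(words[i:i + max_words])
        for i in range(0, len(words), max_words)
    ]


def chunk_collection(documents, max_words=100):
    # Flatten a list of documents into one list of chunks, preserving order.
    chunks = []
    for doc in documents:
        chunks.extend(split_into_chunks(doc, max_words))
    return chunks


if __name__ == "__main__":
    for chunk in chunk_collection(["a b c d e", "f g"], max_words=2):
        print(chunk)
```

The idea is that the index then holds many short passages rather than a couple of very long documents.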
Thanks! I tried the new demo code (also wrapped in 'if __name__ == "__main__":') and it is still not working. It gets stuck at the same place. I am running it in JupyterLab.
I would recommend setting up and using your notebook through WSL.
Thanks, will give it a try.
Oh, my bad, I assumed this was run in a script!
Yes, Windows support isn't something we're currently targeting. It appears to run well in notebooks on Windows 11 + WSL, as per #14. Closing this issue to centralise Windows-related hangups there!
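If you are unsure whether your notebook kernel is running on native Windows or inside WSL, a quick check along these lines may help. This is a heuristic sketch, not part of RAGatouille: `describe_environment` is a hypothetical helper, and it assumes that WSL kernels report a release string containing "microsoft".

```python
import platform


def describe_environment(system=None, release=None):
    # Fall back to the values reported by the running interpreter.
    system = platform.system() if system is None else system
    release = platform.release() if release is None else release

    if system == "Windows":
        return "native Windows"
    # Assumption: WSL kernel release strings contain "microsoft".
    if system == "Linux" and "microsoft" in release.lower():
        return "WSL"
    return system or "unknown"


if __name__ == "__main__":
    print(describe_environment())
```

Running this in a notebook cell shows which environment the kernel is actually using, which is worth confirming before retrying the indexing code.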