hwchase17 / chroma-langchain

334.0 334.0 75.0 22 KB

License: MIT License

Jupyter Notebook 100.00%

chroma-langchain's Issues

VectorDBQA is deprecated

When I run this code, I get a warning:

UserWarning: VectorDBQA is deprecated - please use from langchain.chains import RetrievalQA
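
The warning itself names the replacement. A minimal sketch of the switch, assuming an LLM object and a Chroma vector store have already been created as in the notebooks; the helper name `build_qa_chain` is hypothetical and only used for illustration:

```python
def build_qa_chain(llm, vectordb, chain_type="stuff"):
    """Build a RetrievalQA chain over an existing vector store."""
    # Imported lazily so this file can be loaded without langchain installed.
    from langchain.chains import RetrievalQA

    # VectorDBQA was handed the vector store itself; RetrievalQA expects a
    # retriever, which the store provides through as_retriever().
    return RetrievalQA.from_chain_type(
        llm=llm,
        chain_type=chain_type,
        retriever=vectordb.as_retriever(),
    )
```

With this in place, `qa = build_qa_chain(llm, vectordb)` followed by `qa.run(query)` should behave much like the deprecated chain, though the exact behavior depends on the installed langchain version.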

vectordb = Chroma.from_documents(texts, embeddings) is giving error!

collection.upsert() and client.get_or_create_collection()

Thanks for your work on this. I'm really enjoying Langchain, Chroma and OpenAI.

I am using this plugin as follows and it works great. I'm trying to safeguard against creating a new collection when one already exists, and to do the same for items in the collection. Ideally, I'd like to know how to incorporate

client.get_or_create_collection
collection.upsert()

into my code below to facilitate this. I've started going down the path of building the db natively with Chroma, but thought it might be possible to do it in langchain with this plugin.

db = Chroma.from_texts(texts, embeddings, metadatas=metadatas, ids=ids, collection_name=collection_name, persist_directory="db")
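
One possible approach, sketched with the native chromadb client rather than the LangChain wrapper: open the collection with `get_or_create_collection` and write items with `upsert`, using IDs derived from the text so that re-running the load does not create duplicates. The helper names (`stable_ids`, `upsert_texts`, `open_collection`) and the content-hash ID scheme are assumptions made for this example, not something the repository provides. `embeddings` is assumed to be a LangChain embeddings object exposing `embed_documents()`.

```python
import hashlib


def stable_ids(texts):
    """Derive a deterministic ID for each text from its content."""
    return [hashlib.sha256(t.encode("utf-8")).hexdigest() for t in texts]


def open_collection(collection_name, persist_directory="db"):
    """Open the named collection, creating it only if it does not exist."""
    import chromadb  # lazy import: requires chromadb to be installed

    client = chromadb.PersistentClient(path=persist_directory)
    return client.get_or_create_collection(name=collection_name)


def upsert_texts(collection, texts, embeddings, metadatas=None):
    """Upsert texts keyed by content hash; returns the IDs that were written."""
    # Drop repeated texts within the batch so every ID appears only once.
    unique = {}
    for i, text in enumerate(texts):
        meta = metadatas[i] if metadatas else None
        unique.setdefault(stable_ids([text])[0], (text, meta))

    ids = list(unique)
    docs = [unique[i][0] for i in ids]
    metas = [unique[i][1] for i in ids] if metadatas else None

    # upsert() updates rows whose ID already exists and inserts the rest.
    collection.upsert(
        ids=ids,
        documents=docs,
        embeddings=embeddings.embed_documents(docs),
        metadatas=metas,
    )
    return ids
```

The LangChain `Chroma` wrapper also accepts `client` and `collection_name` arguments, so a collection populated this way can, as far as I can tell, still be queried through LangChain afterwards.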

The API deployment for this resource does not exist. + ChromaDB + VectorStore + langChain

I am trying to use Chroma from the LangChain vector stores in the following way, after installing chromadb.

loader = PyPDFLoader("data/Diabetes.pdf")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
db = Chroma.from_documents(docs, OpenAIEmbeddings())

NOTE: I am using a Windows machine and installed chromadb via pip. The chromadb client works, and I am able to create collections.
I get the following error:

onnxruntime is not supported, how to change to pytorch

onnxruntime is not supported on Windows 2012. How can I switch to PyTorch?

Implement ChromaDB with HttpClient as a MicroService then save and persist embeddings

Hi, I found your example very easy to set up, and it gave me a fair understanding of how RAG works with LangChain and Chroma.

However, I'm more interested in hosting chromadb as a standalone microservice and accessing it from the application to store embeddings and query them later. Could you please add that part as well?

I've tried the snippet below, but for some reason the chunks are not saved to the vector db.

# create chroma db or load db from disk
from langchain.embeddings import OllamaEmbeddings
from langchain.vectorstores import Chroma
import chromadb
from chromadb.config import Settings

client_settings = Settings(
    chroma_api_impl="chromadb.api.fastapi.FastAPI",
    chroma_client_auth_provider="chromadb.auth.token.TokenAuthClientProvider",
    chroma_client_auth_credentials="xxxxxx",
    chroma_client_auth_token_transport_header="X_CHROMA_TOKEN",
    allow_reset=True,
    anonymized_telemetry=False
)

client = chromadb.HttpClient(
    host="localhost",
    port=8000,
    settings=client_settings,
)
collection = client.get_or_create_collection(name="documents")

emb_fn = OllamaEmbeddings(base_url=OLLAMA_URL, model=OLLAMA_MODEL)

def get_chroma(chroma_client):
    chroma_db = Chroma(
        collection_name="documents",
        embedding_function=emb_fn,
        client=chroma_client,
    )
    return chroma_db

chroma_db_client = get_chroma(client)

if init_db:
    chroma_db_client.from_documents(all_document_chunks, emb_fn)
    print(collection.count())
    print(collection.peek())
else:
    chroma_db_client = Chroma(embedding_function=emb_fn)

Output:

0
{'ids': [], 'embeddings': [], 'metadatas': [], 'documents': [], 'data': None, 'uris': None}

The chromadb server is running in a Docker container and shows no errors. The variable all_document_chunks contains several chunks of a local document.

Appreciate your help!
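
A likely cause, offered as a guess rather than a confirmed diagnosis: `from_documents` is a classmethod, so calling it on `chroma_db_client` builds a separate store instead of writing to the one bound to the HTTP client. Unless `client` and `collection_name` are passed to it, that new store falls back to its defaults, and the "documents" collection on the server stays empty. A minimal sketch of one fix is to call `add_documents` on the instance that is already bound to the remote client. The helper name `store_chunks` and its `store_cls` parameter (a hook for swapping in a different store class) are hypothetical.

```python
def store_chunks(client, chunks, embedding, collection_name="documents",
                 store_cls=None):
    """Write document chunks into a collection on the given Chroma client."""
    if store_cls is None:
        # Lazy import so this file can be loaded without langchain installed.
        from langchain.vectorstores import Chroma
        store_cls = Chroma

    # Bind the wrapper to the remote client and the named collection.
    store = store_cls(
        client=client,
        collection_name=collection_name,
        embedding_function=embedding,
    )
    # add_documents is an instance method, so it writes to the bound collection.
    store.add_documents(chunks)
    return store
```

In the snippet above this would replace the `from_documents` call, for example `chroma_db_client = store_chunks(client, all_document_chunks, emb_fn)`, after which `collection.count()` should reflect the added chunks if the diagnosis is right.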

Usage process

In this program, if I ask questions unrelated to the provided documents, can I get the answer I want?
