chroma-langchain's People

Contributors

atroyn, hwchase17

chroma-langchain's Issues

VectorDBQA is deprecated

When I run this code, I get the following warning:

UserWarning: VectorDBQA is deprecated - please use from langchain.chains import RetrievalQA
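
The warning names the replacement import. A minimal sketch of the migration, assuming a LangChain release that still ships RetrievalQA and a vector store object that exposes as_retriever(). The helper name build_qa_chain and the default k are illustrative and do not come from the repository:

```python
def build_qa_chain(llm, vectorstore, k=4):
    """Build a RetrievalQA chain over an existing vector store.

    Hypothetical helper: `llm` is any LangChain LLM and `vectorstore` is any
    LangChain vector store (for example a Chroma instance).
    """
    # Imported lazily so the helper can be defined without LangChain loaded.
    from langchain.chains import RetrievalQA  # import named in the warning

    # VectorDBQA took the store directly; RetrievalQA takes a retriever.
    retriever = vectorstore.as_retriever(search_kwargs={"k": k})
    return RetrievalQA.from_chain_type(
        llm=llm,
        chain_type="stuff",  # put all retrieved chunks into a single prompt
        retriever=retriever,
    )
```

A call such as build_qa_chain(llm, db).run(question) would then take the place of the VectorDBQA call, where db is the Chroma store already built in the example.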

collection.upsert() and client.get_or_create_collection()

Thanks for your work on this. I'm really enjoying LangChain, Chroma and OpenAI.

I am using this plugin as shown below, and it works great. I'd also like to guard against creating a new collection when one already exists, and to do the same for items within the collection. Ideally, I'd like to know how to incorporate

client.get_or_create_collection
collection.upsert()

into my code below. I've started down the path of building the db natively with Chroma, but thought it might be possible to do it in LangChain with this plugin.

db = Chroma.from_texts(texts, embeddings, metadatas=metadatas, ids=ids, collection_name=collection_name, persist_directory="db")
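
One way the two calls could be combined, sketched under assumptions: the helper name upsert_texts and its embed parameter are hypothetical, and the embedding step is passed in as a plain function (for example embeddings.embed_documents from LangChain) so that the collection receives precomputed vectors. The only Chroma calls used are get_or_create_collection and upsert:

```python
def upsert_texts(client, collection_name, texts, metadatas, ids, embed):
    """Reuse the collection if it exists, then insert or update items by id.

    `client` is a Chroma client and `embed` maps a list of texts to a list
    of vectors (for example `embeddings.embed_documents` in LangChain).
    """
    if not (len(texts) == len(metadatas) == len(ids)):
        raise ValueError("texts, metadatas and ids must have the same length")

    # Returns the existing collection instead of failing or duplicating it.
    collection = client.get_or_create_collection(name=collection_name)

    # upsert() updates items whose id already exists and adds the rest.
    collection.upsert(
        ids=ids,
        documents=texts,
        metadatas=metadatas,
        embeddings=embed(texts),
    )
    return collection
```

With a recent chromadb, the client could come from chromadb.PersistentClient(path="db"). The same client can then be handed to the LangChain wrapper, as in Chroma(client=client, collection_name=collection_name, embedding_function=embeddings), so that queries go through LangChain while writes stay idempotent.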

The API deployment for this resource does not exist. + ChromaDB + VectorStore + langChain

After installing chromadb, I tried to use the Chroma vector store as follows:

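from langchain.document_loaders import PyPDFLoader
from langchain.embeddings import OpenAIEmbeddings
from langchain.text_splitter import CharacterTextSplitter
from langchain.vectorstores import Chroma

loader = PyPDFLoader("data/Diabetes.pdf")
documents = loader.load()
text_splitter = CharacterTextSplitter(chunk_size=1000, chunk_overlap=0)
docs = text_splitter.split_documents(documents)
db = Chroma.from_documents(docs, OpenAIEmbeddings())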

NOTE: I am on a Windows machine and installed chromadb via pip. The chromadb client works, and I am able to create collections.
I get the following error:
[screenshot of the error message]

Implement ChromaDB with HttpClient as a MicroService then save and persist embeddings

Hi, I found your example very easy to set up, and it gave me a fair understanding of how RAG works with LangChain and Chroma.

However, I'd be more interested in hosting chromadb as a standalone microservice and accessing it from the application to store embeddings and query them later. Could you please add that part as well?

I've tried the snippet below, but for some reason the chunks are not saved to the vector db.

# create chroma db or load db from disk
from langchain.embeddings import OllamaEmbeddings
from langchain.vectorstores import Chroma
import chromadb
from chromadb.config import Settings

client_settings = Settings(
    chroma_api_impl="chromadb.api.fastapi.FastAPI",
    chroma_client_auth_provider="chromadb.auth.token.TokenAuthClientProvider",
    chroma_client_auth_credentials="xxxxxx",
    chroma_client_auth_token_transport_header="X_CHROMA_TOKEN",
    allow_reset=True,
    anonymized_telemetry=False
)

client = chromadb.HttpClient(
    host="localhost",
    port=8000,
    settings=client_settings,
)
collection = client.get_or_create_collection(name="documents")

emb_fn = OllamaEmbeddings(base_url=OLLAMA_URL, model=OLLAMA_MODEL)

def get_chroma(chroma_client):
    chroma_db = Chroma(
        collection_name="documents",
        embedding_function=emb_fn,
        client=chroma_client,
    )
    return chroma_db

chroma_db_client = get_chroma(client)

if init_db:
    chroma_db_client.from_documents(all_document_chunks, emb_fn)
    print(collection.count())
    print(collection.peek())
else:
    chroma_db_client = Chroma(embedding_function=emb_fn)

Output:

0
{'ids': [], 'embeddings': [], 'metadatas': [], 'documents': [], 'data': None, 'uris': None}

The chromadb server is running in a Docker container and shows no errors. The variable all_document_chunks holds several chunks of a local document.

Appreciate your help!
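
One possible cause, offered as a guess and not a confirmed diagnosis: from_documents is a classmethod in the LangChain Chroma wrapper, so calling it on chroma_db_client builds a separate store. Because neither client nor collection_name is passed to it, that store would not be the "documents" collection on the server, which would explain a count of 0. A minimal sketch that writes through an instance bound to the HTTP client instead; save_chunks is a hypothetical helper name:

```python
def save_chunks(chroma_client, chunks, embedding, collection_name="documents"):
    """Write chunks through a store bound to the given client and collection."""
    # Imported lazily; same legacy import path as in the snippet above.
    from langchain.vectorstores import Chroma

    store = Chroma(
        client=chroma_client,             # reuse the HttpClient connection
        collection_name=collection_name,  # same name the raw client queries
        embedding_function=embedding,
    )
    # add_documents() embeds the chunks and adds them to this collection.
    store.add_documents(chunks)
    return store
```

With the variables from the snippet, save_chunks(client, all_document_chunks, emb_fn) followed by print(collection.count()) would be the check to run. The else branch has a similar gap: Chroma(embedding_function=emb_fn) is created without client or collection_name, so it would not read from the server either.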

Usage process

In this program, if I ask questions unrelated to the provided documents, can I get the answer I want?
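
The issue has no reply, so the following is only a sketch of one way to detect unrelated questions. A nearest-neighbour search returns the closest chunks even when none of them is actually related, so a distance cutoff can decide whether the retrieved context is worth using. The function names and the cutoff value are assumptions, and a suitable cutoff depends on the embedding model and distance metric:

```python
def filter_relevant(hits, max_distance):
    """Keep only documents whose distance to the query is small enough.

    `hits` is a list of (document, distance) pairs, the shape returned by
    `similarity_search_with_score` on the LangChain Chroma store, where a
    lower distance means a closer match.
    """
    return [doc for doc, distance in hits if distance <= max_distance]


def has_supporting_context(hits, max_distance):
    """True if at least one retrieved chunk passes the distance cutoff."""
    return len(filter_relevant(hits, max_distance)) > 0
```

In use, hits = db.similarity_search_with_score(question, k=4) would feed both functions. When has_supporting_context returns False, the application could reply that the documents do not cover the question instead of passing weak context to the model.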
