Comments (3)

tazarov commented on August 27, 2024

@marichkazb, thanks for reaching out. Let me start by saying that your approach to Lambda is correct and is how many Chroma users are deploying/using Chroma in AWS.

Your original error does not seem to be an actual Chroma issue. From the trace, it appears to be related to pydantic models in the OpenAI package.

Regarding the second error, this looks like a library version conflict, which is common in the fast-moving GenAI ecosystem. Which packages do you have installed? Is it only chromadb-client and the openai library, or are there others?

Regarding your more specific question about AWS Lambda: while I'll admit I am not an expert in the AWS stack, my personal preference would be a Docker image over zipped dependencies. Have a look here for an example (https://github.com/erenyasarkurt/OpenAI-AWS-Lambda-Layer/blob/main/build/build.sh).

As I understand it, you can easily bake a Docker image, upload it to ECR, and use it as the basis for your Lambda. If you're interested, I'll happily provide you with a more detailed example.

from chroma.

marichkazb commented on August 27, 2024

Note: I'm also getting the following message when trying to install chromadb-client in the Linux environment, although when I install those packages manually, pip responds that the requirement is already satisfied.

pip3 install -t ./python/ chromadb-client

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
botocore 1.34.61 requires urllib3<1.27,>=1.25.4; python_version < "3.10", but you have urllib3 2.2.1 which is incompatible.
aws-sam-cli 1.112.0 requires requests~=2.31.0, but you have requests 2.32.1 which is incompatible.
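To answer the question about which packages are installed, the versions present in the current environment can be listed with the standard library alone. This is only a sketch: the package list is an assumption drawn from the names mentioned in this thread, and installed_versions is a hypothetical helper, not part of Chroma or pip.

```python
from importlib import metadata

# Assumption: the packages named in this thread and in the pip error message.
PACKAGES = [
    "chromadb-client",
    "openai",
    "pydantic",
    "botocore",
    "urllib3",
    "requests",
    "aws-sam-cli",
]


def installed_versions(names):
    """Return {package: version, or None if the package is not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

Note that this reports what is installed in the interpreter's own environment, which is what pip compares against in the message above. Packages installed with -t ./python/ are only visible to it if that directory is on sys.path.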


marichkazb commented on August 27, 2024

@tazarov thank you for your time!! I’ve created a Docker image and currently use it as the basis for the Lambda function. It did indeed resolve all dependency conflicts, thank you! 🙌🏻

Also, I was wondering whether Chroma uses any temporary files when querying a collection?

I’m using the following function, get_results, to build the context for the OpenAI system prompt. Something within the scope of this function seems to attempt to write files, resulting in an error: "error": "[Errno 30] Read-only file system: '/home/sbx_user1051'". On AWS Lambda, /tmp is the only writable directory, so any other write attempt fails.

I tried setting the home environment to /tmp in the Dockerfile using ENV HOME=/tmp, but it didn’t help. If you have any ideas on how to possibly fix this, I'd really appreciate it!

import chromadb
from chromadb.utils.embedding_functions import SentenceTransformerEmbeddingFunction


def get_results(message):
    # Connect to the remote Chroma server over HTTP.
    chroma_client = chromadb.HttpClient(host='11.11.111.11', port=8000)

    # Embed the query locally with the default sentence-transformers model.
    embedding_function = SentenceTransformerEmbeddingFunction()
    chroma_collection = chroma_client.get_collection("knowledge", embedding_function=embedding_function)

    # Fetch the five closest documents for the single query text.
    results = chroma_collection.query(query_texts=[message], n_results=5)
    retrieved_documents = results['documents'][0]

    # Join the retrieved documents into one context string.
    return "".join(str(document) for document in retrieved_documents)
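
One possible cause, not confirmed anywhere in this thread, is that the sentence-transformers model behind SentenceTransformerEmbeddingFunction is downloaded into a cache directory under the home directory the first time it is used. A minimal sketch of a workaround under that assumption is to point the cache-related environment variables at a directory under /tmp before the embedding function is created. The helper name use_writable_cache and the path /tmp/model_cache are hypothetical. HF_HOME, SENTENCE_TRANSFORMERS_HOME and TRANSFORMERS_CACHE are variables read by the Hugging Face and sentence-transformers libraries, but whether setting them resolves this particular error is an assumption.

```python
import os

# Cache-related variables read by huggingface_hub, sentence-transformers and
# (older versions of) transformers. Assumption: the failing write comes from
# one of these caches rather than from Chroma itself.
CACHE_ENV_VARS = ("HF_HOME", "SENTENCE_TRANSFORMERS_HOME", "TRANSFORMERS_CACHE")


def use_writable_cache(base_dir="/tmp/model_cache", environ=os.environ):
    """Create base_dir and point every cache variable at it; return the path."""
    os.makedirs(base_dir, exist_ok=True)
    for name in CACHE_ENV_VARS:
        environ[name] = base_dir
    return base_dir
```

The call would need to happen at the top of the Lambda handler module, before SentenceTransformerEmbeddingFunction is imported or instantiated, since the libraries may read these variables at import time. The same variables could instead be set with ENV lines in the Dockerfile.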
   

