Comments (7)
In this, would you propose updating the api routes from collection_name to collection_uuid? I agree it is cumbersome to do again and again, and with good local clients - not necessary.
Yes I think so. We can also keep the old routes and wrap the logic somehow, creating new ones for uuid.
Solution space includes
- What we discussed above
- Just cache it locally.
- Change our reads to filter on the name. Since we don't currently have an index on collection_uuid, we could just change it. I don't think we should do that, as we will want that soon.
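A minimal sketch of the "just cache it locally" option, assuming a client that resolves a collection name to its UUID once and reuses the result. `FakeCollectionStore` and `CachingClient` are hypothetical names made up for illustration; they are not part of the chroma API, and the store only stands in for the backend so the lookups can be counted.

```python
import uuid
from typing import Dict, Optional


class FakeCollectionStore:
    """Hypothetical stand-in for the backend; counts name -> UUID lookups."""

    def __init__(self) -> None:
        self._uuid_by_name: Dict[str, uuid.UUID] = {}
        self.name_lookups = 0

    def create(self, name: str) -> uuid.UUID:
        collection_uuid = uuid.uuid4()
        self._uuid_by_name[name] = collection_uuid
        return collection_uuid

    def lookup_uuid(self, name: str) -> Optional[uuid.UUID]:
        # Each call here represents one round trip to the database.
        self.name_lookups += 1
        return self._uuid_by_name.get(name)


class CachingClient:
    """Hypothetical local client that remembers name -> UUID mappings."""

    def __init__(self, store: FakeCollectionStore) -> None:
        self._store = store
        self._cache: Dict[str, uuid.UUID] = {}

    def resolve(self, name: str) -> uuid.UUID:
        cached = self._cache.get(name)
        if cached is not None:
            return cached
        found = self._store.lookup_uuid(name)
        if found is None:
            raise KeyError(name)
        self._cache[name] = found
        return found

    def invalidate(self, name: str) -> None:
        # Needed if a collection is renamed or deleted elsewhere.
        self._cache.pop(name, None)


store = FakeCollectionStore()
created = store.create("docs")
client = CachingClient(store)
first = client.resolve("docs")   # goes to the store
second = client.resolve("docs")  # served from the local cache
```

The trade-off is staleness: if another client renames or deletes the collection, the cached entry has to be invalidated, which is one reason the uuid routes discussed above may still be worth having.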
from chroma.
So an observation: the current user-facing API doesn't expose the notion of a collection UUID at all; users interact with a collection solely in terms of its name.
In the first version of the refactored architecture, collections are associated 1:1 with Pulsar topics, which are namespaced by tenant and have a string name (with naming conventions similar to, but more flexible than, what topics have now).
So IMO we can keep using the user-provided name as the primary key for now, and do away with the complexity of having a separate unique ID for a collection.
Regardless, I don't think this is urgent enough to address before the refactor lands (at which point the approach will change anyway).
In this, would you propose updating the api routes from collection_name to collection_uuid? I agree it is cumbersome to do again and again, and with good local clients - not necessary.
Saves a DB roundtrip, too
Saves a DB roundtrip, too
Yep
@levand correct me if I'm wrong, but the refactor should not affect this, since this is mostly about the plumbing between the backend and the frontend?
Thanks for the feedback! Let's close this for now and re-open it if/when the need arises.