Comments (3)
Hi @Shomvel - Chroma is used by people building persistent apps, but also by people who just want an "in-memory" database to run some quick computation, for example inside a Jupyter notebook.
Making persistence the default would break the idempotency of using chroma in that case (you can run the same script twice and get the same result).
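To make the idempotency point concrete, here is a toy sketch. It does not use Chroma at all: `TinyStore` and `run_script` are hypothetical stand-ins, and the JSON file is only an assumed way of simulating persistence. It shows that an in-memory store gives the same result on every run, while a persistent one carries state over from the previous run.

```python
import json
import os
import tempfile


class TinyStore:
    """Hypothetical stand-in for a collection; not part of chromadb."""

    def __init__(self, path=None):
        self.path = path
        self.items = []
        # Reload earlier state only when a persistence path is given.
        if path and os.path.exists(path):
            with open(path) as f:
                self.items = json.load(f)

    def add(self, item):
        self.items.append(item)
        # Write through to disk only in persistent mode.
        if self.path:
            with open(self.path, "w") as f:
                json.dump(self.items, f)


def run_script(path=None):
    """Simulates one run of a notebook cell or script."""
    store = TinyStore(path)
    store.add("doc-1")
    return len(store.items)


# In-memory: every run starts empty, so the result repeats.
in_memory_runs = [run_script(), run_script()]

# Persistent: the second run sees what the first run wrote.
with tempfile.TemporaryDirectory() as tmp:
    db_path = os.path.join(tmp, "store.json")
    persistent_runs = [run_script(db_path), run_script(db_path)]

print(in_memory_runs)   # [1, 1]
print(persistent_runs)  # [1, 2]
```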
That said, I do agree the docs are confusing in that place! We also want to clean up the ergonomics around how someone instantiates a persistent client, persists, loads, and generally queries it. One idea we had was to offer two separate constructors:
import chromadb
client = chromadb.Client()  # in-memory
client = chromadb.PersistentClient()  # in-memory with persistence
What do you think about this?
cc @wpnbos
from chroma.
I think it's better to centralize client settings. What about making persistence an option and making it clear in the docs that the default behavior is in-memory without persistence?
If persistence=True, default impl to 'duckdb+parquet'; you can also specify impl explicitly.
import chromadb
chromadb.Client()  # w/o persistence
chromadb.Client(persistence=True)  # persists, impl=duckdb+parquet
chromadb.Client(persistence=True, impl='clickhouse')  # impl=clickhouse
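A rough sketch of how that defaulting could work, written as plain Python rather than actual Chroma code. `ClientConfig` and `resolve_config` are hypothetical names, and using 'duckdb' as the impl for the non-persistent default is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ClientConfig:
    """Hypothetical config holder; not a chromadb class."""
    persistence: bool
    impl: str


def resolve_config(persistence: bool = False,
                   impl: Optional[str] = None) -> ClientConfig:
    """Pick an impl from the persistence flag unless one is given."""
    if impl is None:
        # Assumed defaults: 'duckdb+parquet' when persisting,
        # plain 'duckdb' for the in-memory case.
        impl = "duckdb+parquet" if persistence else "duckdb"
    return ClientConfig(persistence=persistence, impl=impl)


default_cfg = resolve_config()
persistent_cfg = resolve_config(persistence=True)
clickhouse_cfg = resolve_config(persistence=True, impl="clickhouse")
```

An explicit impl always wins over the default, so the flag only decides what happens when the caller says nothing.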
from chroma.
Persistence requires the user to pass in the persist_directory, which I think functions fine as a flag. I agree that the impl is mostly redundant unless the user wants to use ClickHouse specifically.
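Treating the directory itself as the flag could look roughly like this. It is only a sketch under assumptions: `settings_from_directory` is a hypothetical helper, the argument is called `persist_directory` here, and the 'duckdb' in-memory default is assumed.

```python
from typing import Dict, Optional


def settings_from_directory(
    persist_directory: Optional[str] = None,
    impl: Optional[str] = None,
) -> Dict[str, Optional[str]]:
    """Infer persistence from whether a directory was passed."""
    persistent = persist_directory is not None
    if impl is None:
        # No separate boolean: the directory alone selects the default impl.
        impl = "duckdb+parquet" if persistent else "duckdb"
    return {"impl": impl, "persist_directory": persist_directory}


in_memory = settings_from_directory()
on_disk = settings_from_directory(persist_directory="./chroma_data")
```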
from chroma.