Comments (6)
Hey @MentalGear, I've done some more digging. Because Voy uses the embeddings as coordinates to store and locate minimal data from the resources, the storage requirement is greatly reduced (no embeddings are stored). For the resource data, like title, description, and url, I can provide an API that accepts serialized and unstructured data to further minimize the size.
from voy.
Hey @MentalGear, thanks for checking out Voy! Voy doesn't support automatic partitioning or caching at the moment. Voy is designed for limited environments like mobile and edge servers! The current tree-structured index in WebAssembly provides efficient traversal through your data, so I think it'll perform great for your use case. But if you start to experience performance issues as your dataset grows, please do create an issue.
I really like your idea about automatic sharding. I'll look into it.
Hey, glad you like the idea!
Indeed Voy is highly efficient in the retrieval!
Here's a resource on sharding from the excellent RxDB.
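To make the sharding idea concrete, here is a minimal sketch of splitting a dataset into shards and merging per-shard results. It is not how Voy works internally: `Item`, `shardItems`, and `searchShards` are hypothetical names, and a brute-force distance scan stands in for a real index so the example stays self-contained.

```typescript
// Hypothetical types for illustration only; not part of Voy's API.
type Item = { id: string; embedding: number[] };
type Hit = { id: string; distance: number };

// Distribute items round-robin across `shardCount` shards.
function shardItems(items: Item[], shardCount: number): Item[][] {
  const shards: Item[][] = Array.from({ length: shardCount }, () => []);
  items.forEach((item, i) => shards[i % shardCount].push(item));
  return shards;
}

// Squared Euclidean distance between two vectors of equal length.
function squaredDistance(a: number[], b: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return sum;
}

// Query every shard for its local top-k, then merge into a global top-k.
// In a real setup each shard could be loaded lazily to cap memory usage.
function searchShards(shards: Item[][], query: number[], k: number): Hit[] {
  const hits: Hit[] = [];
  for (const shard of shards) {
    const local = shard
      .map((item) => ({ id: item.id, distance: squaredDistance(item.embedding, query) }))
      .sort((x, y) => x.distance - y.distance)
      .slice(0, k);
    hits.push(...local);
  }
  return hits.sort((x, y) => x.distance - y.distance).slice(0, k);
}
```

Because each shard only contributes its own top-k candidates, the merge step stays small no matter how large the individual shards are.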
I found that the overall in-memory size available per tab varies quite a bit across mobile browsers:
For example (latest OS): iPad: 2 GB, iPhone X: 1 GB. Android is different again. Also, loading a 512 MB chunk at once on iOS might throw an Out of Memory error, while loading the data in smaller chunks allows going well above the 512 MB limit.
Here's a test for in-browser memory limits.
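The chunking observation above can be sketched roughly like this. It is only an illustration: `allocateInChunks` is a hypothetical helper, the sizes are small placeholders, and actual limits depend on the device and browser.

```typescript
// Hypothetical helper: reserve `totalBytes` as several smaller buffers
// instead of one large contiguous allocation.
function allocateInChunks(totalBytes: number, chunkBytes: number): Uint8Array[] {
  if (chunkBytes <= 0) throw new RangeError("chunkBytes must be positive");
  const chunks: Uint8Array[] = [];
  let remaining = totalBytes;
  while (remaining > 0) {
    const size = Math.min(chunkBytes, remaining);
    // Each allocation can still throw a RangeError when memory runs out,
    // but a failure then affects one chunk rather than the whole request.
    chunks.push(new Uint8Array(size));
    remaining -= size;
  }
  return chunks;
}

const MiB = 1024 * 1024;
// Small demo sizes; real data would be streamed into each chunk.
const chunks = allocateInChunks(8 * MiB, 3 * MiB);
console.log(chunks.map((c) => c.byteLength / MiB));
```

The trade-off is that data no longer lives in one contiguous buffer, so any reader has to work chunk by chunk.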
On-Device Storage:
Privacy and on-device storage have become quite prominent in light of the new generative ML capabilities.
I was wondering whether Voy could offer, besides a normal export function, storage adapters, preferably for something like OPFS, which is fast and now supported by all major browsers. (Reference: https://rxdb.info/rx-storage-opfs.html)
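As a rough sketch of what such a storage adapter could look like: the `IndexStorage` interface and both classes below are hypothetical and not part of Voy. The OPFS calls (`navigator.storage.getDirectory()`, `getFileHandle()`, `createWritable()`) are standard File System API methods, though `createWritable()` may not be available in every browser. The adapter only assumes the index can be turned into a string somehow.

```typescript
// Hypothetical adapter interface; not part of Voy's API.
interface IndexStorage {
  save(name: string, serialized: string): Promise<void>;
  load(name: string): Promise<string | null>;
}

// In-memory fallback for tests and for environments without OPFS.
class MemoryStorage implements IndexStorage {
  private files = new Map<string, string>();
  async save(name: string, serialized: string): Promise<void> {
    this.files.set(name, serialized);
  }
  async load(name: string): Promise<string | null> {
    return this.files.get(name) ?? null;
  }
}

// OPFS-backed adapter. Typed loosely (`any`) so the sketch compiles
// without DOM type definitions.
class OpfsStorage implements IndexStorage {
  private async root(): Promise<any> {
    // Resolves to the origin-private root directory handle.
    return (globalThis as any).navigator.storage.getDirectory();
  }
  async save(name: string, serialized: string): Promise<void> {
    const dir = await this.root();
    const handle = await dir.getFileHandle(name, { create: true });
    const writable = await handle.createWritable();
    await writable.write(serialized);
    await writable.close();
  }
  async load(name: string): Promise<string | null> {
    try {
      const dir = await this.root();
      const handle = await dir.getFileHandle(name);
      const file = await handle.getFile();
      return await file.text();
    } catch {
      return null; // file missing, or OPFS unavailable
    }
  }
}

// Prefer OPFS when the browser exposes it, otherwise keep data in memory.
function pickStorage(): IndexStorage {
  const nav = (globalThis as any).navigator;
  return typeof nav?.storage?.getDirectory === "function"
    ? new OpfsStorage()
    : new MemoryStorage();
}
```

Keeping the adapter behind a small interface means the same calling code could work with OPFS, IndexedDB, or a plain export.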
What about using web workers? Using a shared worker might give you more freedom from the memory limitation per tab.
Yes, I'm also interested in this, but I could not find definitive resources on web workers. Shared web workers allow the same worker to be used across multiple tabs of the same origin; however, I don't think they expand the overall available memory (but I don't know for sure).
There are a few APIs for measuring memory, though with very limited browser support (https://developer.mozilla.org/en-US/docs/Web/API/Performance/memory).
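For what it's worth, here is a minimal sketch of reading the non-standard `performance.memory` object from the link above, with feature detection. `readHeapStats` is a hypothetical helper; the three field names are the ones documented for that API, and browsers without it simply yield `null`.

```typescript
// Shape of the non-standard `performance.memory` object.
interface HeapStats {
  jsHeapSizeLimit: number;
  totalJSHeapSize: number;
  usedJSHeapSize: number;
}

// Returns heap statistics when the API is present, otherwise null.
function readHeapStats(perf: unknown): HeapStats | null {
  const memory = (perf as { memory?: Partial<HeapStats> } | null | undefined)?.memory;
  const limit = memory?.jsHeapSizeLimit;
  const total = memory?.totalJSHeapSize;
  const used = memory?.usedJSHeapSize;
  if (typeof limit === "number" && typeof total === "number" && typeof used === "number") {
    return { jsHeapSizeLimit: limit, totalJSHeapSize: total, usedJSHeapSize: used };
  }
  return null;
}

const stats = readHeapStats((globalThis as any).performance);
console.log(
  stats
    ? `JS heap: ${stats.usedJSHeapSize} of ${stats.jsHeapSizeLimit} bytes used`
    : "performance.memory is not available here"
);
```

This only reports the current page's JS heap figures as the browser exposes them, so it does not settle whether a shared worker raises the overall limit.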
Yes, adding custom data to records would be a huge plus!