Comments (3)
Good point, and one I keep coming back to. One of the real challenges here is the libraries needed to make GPU acceleration work.
For instance, on Intel GPUs you need the Intel kit, which is expensive in terms of dependencies and container image size. A CPU-only user might just go with a very small image that has no acceleration dependencies at all.
However, starting with a single binary that bundles the pre-compiled versions, each built with its specific flagset, is the way to go; we can take care of the runtime dependencies later. We can already start by trying to squeeze all the backends built with each flagset into a single build.
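To make the "check capabilities and pick a build" idea concrete, here is a rough sketch of how the selection could work. This is not LocalAI code: the variant names, the flag sets attached to them, and the helpers `read_cpu_flags` and `pick_variant` are all made up for illustration, and it assumes a Linux-style `/proc/cpuinfo` with a `flags` line.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Variant:
    name: str            # hypothetical label for a pre-compiled backend build
    required: frozenset  # CPU flags that build is assumed to need


# Hypothetical variants, ordered from most to least demanding.
# The last entry needs nothing, so it always matches.
VARIANTS = [
    Variant("avx512", frozenset({"avx512f", "avx2", "avx"})),
    Variant("avx2", frozenset({"avx2", "avx"})),
    Variant("avx", frozenset({"avx"})),
    Variant("fallback", frozenset()),
]


def read_cpu_flags(cpuinfo="/proc/cpuinfo"):
    """Return the CPU flags from a Linux-style cpuinfo file, or an empty set."""
    try:
        text = Path(cpuinfo).read_text()
    except OSError:
        return set()  # file missing or unreadable: assume no known flags
    for line in text.splitlines():
        if line.startswith("flags"):
            return set(line.split(":", 1)[1].split())
    return set()


def pick_variant(flags, variants=VARIANTS):
    """Pick the first variant whose required flags are all present."""
    available = set(flags)
    for variant in variants:
        if variant.required <= available:
            return variant
    raise LookupError("no usable variant")


if __name__ == "__main__":
    print(pick_variant(read_cpu_flags()).name)
```

On a machine where the flags cannot be read, the sketch lands on the `fallback` entry, which is the behaviour we would want from a single binary. GPU detection would need its own probe and is left out here.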
from localai.
I'm an AI bot, assisting with auditing tickets in the LocalAI repository.
To address your feature request, a good starting point would be to include all the backends built with each flagset in a single build. That moves us towards a single binary that can check capabilities and fall back when needed.
While the concerns about the libraries needed for GPU support are valid, moving towards a single binary would indeed simplify AIO by handling the logic automatically inside the binary. Once this is achieved, we can focus on the runtime dependencies.
I'll update the issue with your input and thoughts. Please feel free to provide any further information or clarification as needed.
Updated the issue with subtasks.