Comments (24)
@gkucsko Absolutely - it is high on my todo list. Just need to wrap up some ongoing efforts and then I will give bark a try using ggml, potentially utilizing 4-bit quantization.
from bark.
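The 4-bit quantization mentioned above can be illustrated with a rough, self-contained sketch of block-wise quantization in the spirit of ggml's low-bit formats. This is an illustrative simplification, not the actual ggml code, and the function names here are hypothetical:

```python
import numpy as np

def quantize_q4(block):
    """Quantize a block of floats to 4-bit signed ints sharing one fp scale."""
    amax = float(np.abs(block).max())
    scale = amax / 7.0 if amax > 0 else 1.0  # map the largest magnitude to +/-7
    q = np.clip(np.round(block / scale), -8, 7).astype(np.int8)
    return scale, q

def dequantize_q4(scale, q):
    """Recover approximate floats from the 4-bit codes."""
    return scale * q.astype(np.float32)
```

Each block stores one float scale plus one 4-bit code per weight, so a 32-element block costs roughly 4.5 bits per weight instead of 32 for fp32, at the price of a small rounding error bounded by half the scale.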
@ggerganov @gkucsko FYI I've started implementing encodec.cpp
@ggerganov if you send us an email at bark at suno.ai, we would definitely help sponsor some of your coffee consumption during your coding sprees. Incredibly helpful for the community. Speaking as a C++-illiterate PyTorch user :)
I guess a first step would be to implement the pre-EnCodec part with ggml, as @gkucsko suggested. This is not ideal, since depending on a Python library at the end defeats most of the purpose of having this implemented with ggml in the first place. I think we do need to implement the EnCodec inference too, and at the moment I am not sure how difficult that will be.
The good news is that at llama.cpp we are making good progress with ggml development, and more and more people are becoming familiar with the codebase and contributing. I think the best way forward is for me to bring attention to bark, and hopefully people will help out.
But unfortunately, my expectation is that it will likely take quite some time before we have something working.
Had a bit more detailed look into this - it will be more difficult than I initially imagined, since bark uses Facebook's EnCodec codec. I'll need to implement that codec first, and it looks non-trivial.
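For context on why EnCodec is non-trivial: its core is residual vector quantization (RVQ), where each quantizer stage encodes whatever residual the previous stage left behind, and decoding sums one codebook entry per stage. A minimal numpy sketch of the idea (illustrative only; the real EnCodec additionally has convolutional encoder/decoder networks and learned codebooks):

```python
import numpy as np

def rvq_encode(x, codebooks):
    """Encode vectors x of shape (T, d) as one index per quantizer stage."""
    codes, residual = [], x.astype(np.float64).copy()
    for cb in codebooks:  # cb: (K, d) codebook for this stage
        # nearest codebook entry for each vector's remaining residual
        idx = ((residual[:, None, :] - cb[None]) ** 2).sum(-1).argmin(axis=1)
        codes.append(idx)
        residual -= cb[idx]
    return np.stack(codes)  # (n_stages, T) integer codes

def rvq_decode(codes, codebooks):
    """Decode by summing one codebook entry per stage."""
    return sum(cb[idx] for cb, idx in zip(codebooks, codes))
```

The multi-stage structure is why bark's output tokens come in "coarse" and "fine" varieties: later stages only refine what earlier stages already captured.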
Would love it if @ggerganov is willing to have a look at whether it's doable to convert to C++ with 8-bit quantization. It has been amazing for whisper and llama.
Haven't started looking into bark yet. I've posted it in the llama.cpp roadmap for this month, and hopefully it gets people's attention and we get some help from the community. Will keep you posted if there are any updates.
this is going to break the fucking internet
not until voice cloning fully works, then the Armageddon will happen.
legend!
compile mostly helps during training afaik. The model should already be held in memory and not loaded every time. There are other ways of improving inference speed: one is kv caching, currently in a PR, and then some others that we are currently working on for a new version. That said, nothing works as well as a modern GPU, unfortunately. Contributions from the community are always welcome though (such as quantization and other tricks)
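The kv caching mentioned above speeds up autoregressive generation by computing the attention keys and values for past tokens once and reusing them, so each decode step only projects the newest token. A minimal single-head numpy sketch of the idea (illustrative, not bark's actual implementation):

```python
import numpy as np

def attention(q, k, v):
    """Scaled dot-product attention for one head."""
    scores = (q @ k.T) / np.sqrt(q.shape[-1])
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ v

def step(x_t, wq, wk, wv, cache):
    """One decode step: project only the new token, append its K/V to the cache."""
    q, k_new, v_new = x_t @ wq, x_t @ wk, x_t @ wv
    cache["k"] = k_new if cache["k"] is None else np.vstack([cache["k"], k_new])
    cache["v"] = v_new if cache["v"] is None else np.vstack([cache["v"], v_new])
    return attention(q, cache["k"], cache["v"])
```

Without the cache, step t redoes the K/V projections for all t previous tokens, so total work grows quadratically with sequence length; with it, each step does a constant amount of projection work.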
awesome! where does the model get held in memory? i have a modern GPU but the inference is still not real-time for me
I have no idea how everything works, but couldn't you use xformers, or transformers with/without JAX, to increase speed? Inference is really too slow.
lots of ways to improve, especially on older hardware / GPUs. a great PR also just improved speed by almost 2x: #27
amazing tyty! happy to help in any way (finetuning etc).
Hm, happy to look into that with you. Although that step is very fast, is there an option to somehow run that part as-is without any optimizations? Shouldn’t be the bottleneck
hey @gkucsko, what GPU do we need to run bark inference in real time?
We should reach out to the community to raise whatever funds are needed to make this happen.
that would be awesome ya. one step at a time i guess and let's see what we can do :)
This is a mac-specific optimization that would be interesting to see: https://github.com/apple/ml-ane-transformers
@ggerganov how are things developing on your end? anything i could help with? btw i believe the folks at huggingface are currently in the process of integrating encodec into their stack as well, so encodec probably has a long life ahead of it in the audio world if that helps. I'm sure @adefossez would appreciate work in this direction as well.
Awesome thanks!
I don't know anything about programming, but I'm trying to use Bark following a tutorial inside the Google Colab environment. It takes a lifetime to generate one paragraph of text-to-speech, and every time I want to run a new paragraph I must re-run all the commands in the Colab notebook, no matter whether I saved the file. I mean like 6 hours or more for a one-page text file. If I am doing something wrong and someone could help me, I would be very glad.