Comments (6)

bkutasi commented on August 26, 2024

This looks like far too long a prompt; I would send at most one sentence per prompt.

from parler-tts.

ai-bits commented on August 26, 2024

@bkutasi Hey Balázs, any idea how to tweak things to make longer prompts work? I've got 256GB RAM and 2x RTX 4000 with 20GB VRAM each.
Sort of a waste. ;-) I thought I finally had a utility to read articles to me. Maybe something like the setting in helpers\model_init_scripts\init_dummy_model.py:

# set other default generation config params
model.generation_config.max_length = int(30 * model.audio_encoder.config.frame_rate)

Greetings from Upper Austria!
G.
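
For reference, the setting quoted above caps generation at a number of codec frames equal to seconds times the audio encoder's frame rate. A minimal sketch of that arithmetic follows; the helper names are hypothetical and the frame rate of 86 is only an assumed example value (the real one should be read from `model.audio_encoder.config.frame_rate`). Raising the cap does not by itself mean the model produces usable audio beyond the length it was trained on.

```python
def max_length_for_seconds(seconds: float, frame_rate: float) -> int:
    """Number of codec frames (generation steps) that cover `seconds` of audio."""
    return int(seconds * frame_rate)


def seconds_for_max_length(max_length: int, frame_rate: float) -> float:
    """Inverse mapping: how many seconds of audio a given frame cap allows."""
    return max_length / frame_rate


# ASSUMPTION: 86 frames per second is an example value, not read from a model.
ASSUMED_FRAME_RATE = 86

print(max_length_for_seconds(30, ASSUMED_FRAME_RATE))   # 2580
print(max_length_for_seconds(120, ASSUMED_FRAME_RATE))  # 10320
```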

bkutasi commented on August 26, 2024

@ai-bits Hello Günter and greetings from Tirol!
So I don't think we'll ever be able to generate minutes-long audio in a single pass, unless chunking or some kind of sequential generation gets implemented (at the moment the model is trained on 30-second voice samples).
So this is kind of a software limitation, and I wish I had time to fork/contribute to this because I'm really interested in good TTS implementations. If you simply want more output, you can build a pipeline that calls the model once per sentence or line, but the voices will differ noticeably between calls, as it can't reproduce the exact same voice.
There was some discussion about this here:
#9
#11
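
The per-sentence pipeline mentioned above could look roughly like the sketch below. It is not Parler-TTS code: `synthesize` is a hypothetical callable standing in for whatever function turns one sentence into a list of audio samples, and the regex sentence splitter and the fixed pause between sentences are illustrative assumptions. It also does nothing about the voice changing between calls.

```python
import re
from typing import Callable, List, Sequence


def split_sentences(text: str) -> List[str]:
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def synthesize_long(
    text: str,
    synthesize: Callable[[str], Sequence[float]],  # hypothetical TTS call
    sample_rate: int,
    pause_s: float = 0.3,
) -> List[float]:
    """Synthesize each sentence separately and join them with short silences."""
    silence = [0.0] * int(pause_s * sample_rate)
    audio: List[float] = []
    for i, sentence in enumerate(split_sentences(text)):
        if i > 0:
            audio.extend(silence)  # gap between consecutive sentences
        audio.extend(synthesize(sentence))
    return audio
```

In practice `synthesize` would wrap a single model call per sentence, so each call stays well inside the 30-second range the model was trained on.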

platform-kit commented on August 26, 2024

@bkutasi why not ever? What's the blocker to re-training with longer samples?

ai-bits commented on August 26, 2024

wish I had time to fork/contribute

Just too many fronts. At the current pace I'll give Parler a couple of weeks to mature and use alternatives for TTS in the meantime.
For STT, Whisper is pretty good compared with Dragon NaturallySpeaking in the '90s. <grin>
I've been talking to GPT since July and to VS Code for weeks.

Cheers
G.
PS: 49" monitor with 4 HD Chrome windows: this, Code, Bayern-Ars, MCI-RMA.
FCB just won. MCI is in overtime, with the projector in bed.

ai-bits commented on August 26, 2024

@bkutasi I've been following the whole AI craze, but lost track of what's free and local (and is really open even needed?) in TTS.

I haven't decided yet what to think of LocalAI.io, but I just got TTS debugged on the M3 Mac (AMD64 via Docker). I still don't know which special characters sometimes throw it off with text from the clipboard.

curl http://localhost:8080/tts -H "Content-Type: application/json" -d "{ \"model\":\"tts-1\", \"input\": \"$(pbpaste)\" }" --output ~/local-ai.wav

Bark etc. are available.
Hungarian (?) in the "Holy Land" of Tirol? ;-)
Cheers
G.
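
One possible cause of the special-character trouble, offered only as an assumption: the curl command above pastes the clipboard text straight into a JSON string, so double quotes, backslashes, or line breaks in the text would yield invalid JSON. A minimal sketch that lets a JSON encoder do the escaping follows; the URL and the `tts-1` model name are taken from the command above, while the function names and the output path are hypothetical.

```python
import json
import urllib.request


def build_tts_payload(text: str, model: str = "tts-1") -> bytes:
    """Encode the request body; json.dumps escapes quotes, backslashes, newlines."""
    return json.dumps({"model": model, "input": text}).encode("utf-8")


def request_tts(
    text: str,
    url: str = "http://localhost:8080/tts",  # endpoint from the curl command
    out_path: str = "local-ai.wav",          # hypothetical output file
) -> None:
    """POST the text to the TTS endpoint and save the returned audio bytes."""
    req = urllib.request.Request(
        url,
        data=build_tts_payload(text),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp, open(out_path, "wb") as f:
        f.write(resp.read())
```

The clipboard text could then be piped in on macOS, for example by reading `sys.stdin` in a small wrapper script fed by `pbpaste`.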
