Audio Alignment about transformertts HOT 3 OPEN

as-ideas avatar as-ideas commented on June 10, 2024
Audio Alignment

from transformertts.

Comments (3)

cfrancesco avatar cfrancesco commented on June 10, 2024

What exactly do you mean by aligning the audio?
The script extract_durations.py generates a dataset for the forward model from the predictions of the autoregressive model. If you add the flag useGT, the ground-truth mels (extracted directly from the wavs) are used as the target for training the forward model; otherwise (recommended), the predictions of the autoregressive model are used as the target. Hope this helps.
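
For readers unfamiliar with the idea, here is a minimal sketch of how per-token durations can be read off an attention matrix. It is an illustration only, not the code in extract_durations.py: the function name `durations_from_attention`, the (mel_frames, text_tokens) layout, and the argmax assignment are all assumptions made for this example.

```python
import numpy as np

def durations_from_attention(attention):
    """Count how many mel frames attend most strongly to each text token.

    attention: array of shape (mel_frames, text_tokens) for a single head
    (hypothetical layout, assumed for this sketch).
    """
    attention = np.asarray(attention)
    n_tokens = attention.shape[1]
    # Assign every mel frame to the text token with the highest weight.
    best_token = attention.argmax(axis=1)
    # A token's duration is the number of frames assigned to it.
    return np.bincount(best_token, minlength=n_tokens)

# Toy alignment: 5 mel frames over 3 text tokens.
attention = np.array([
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.1, 0.7, 0.2],
    [0.0, 0.2, 0.8],
    [0.0, 0.1, 0.9],
])
durations = durations_from_attention(attention)
print(durations)  # [2 1 2]
```

The durations always sum to the number of mel frames, which is what a forward model needs in order to expand each token to the right length.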

aayushkubb avatar aayushkubb commented on June 10, 2024

Hey, thanks for your reply. Just one more check: if my autoregressive model is not good, how much will that affect the forward model?

In that case, which is more advisable to use: GT mels or predicted mels?

cfrancesco avatar cfrancesco commented on June 10, 2024

Hi,
To evaluate your autoregressive model for the alignment extraction, look at the last-layer attention heads on your TRAINING SET. If these do not show significant jumps or collapses, it will be OK, regardless of how good your out-of-set predictions are (because the training set alignments are obtained with teacher forcing).
According to the literature, predicted mels (which I believe correspond to sequence-level knowledge distillation) are to be preferred.
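
As a rough illustration of what "jumps" and "collapses" could look like numerically, the sketch below checks a single attention head. It is not part of the repository: the function `alignment_report`, the thresholds `max_jump` and `min_coverage`, and the (mel_frames, text_tokens) layout are hypothetical choices for this example. Plotting the heads and inspecting them by eye remains the more reliable check.

```python
import numpy as np

def alignment_report(attention, max_jump=3, min_coverage=0.5):
    """Flag jumps and collapses in one attention head.

    attention: array of shape (mel_frames, text_tokens), assumed layout.
    max_jump and min_coverage are arbitrary illustrative thresholds.
    """
    attention = np.asarray(attention)
    n_tokens = attention.shape[1]
    # Most attended text token for each mel frame.
    best_token = attention.argmax(axis=1)
    # Jump: the attended token moves by more than max_jump positions
    # (forward or backward) between two consecutive frames.
    steps = np.diff(best_token)
    n_jumps = int(np.sum(np.abs(steps) > max_jump))
    # Collapse: only a small fraction of the tokens is ever attended.
    coverage = len(np.unique(best_token)) / n_tokens
    ok = bool(n_jumps == 0 and coverage >= min_coverage)
    return {"jumps": n_jumps, "coverage": coverage, "ok": ok}

# Clean diagonal alignment: 6 tokens, 2 frames per token.
good = np.repeat(np.eye(6), 2, axis=0)

# Collapsed alignment: every frame attends to the first token.
collapsed = np.zeros((12, 6))
collapsed[:, 0] = 1.0

# Alignment that skips to the last token and back again.
jumpy = np.eye(6)[[0, 0, 5, 5, 1, 1, 2, 2, 3, 3, 4, 4]]

for name, head in [("good", good), ("collapsed", collapsed), ("jumpy", jumpy)]:
    print(name, alignment_report(head))
```

Running a check like this over training-set utterances (with teacher forcing, as described above) gives a quick count of suspicious alignments to inspect before extracting durations.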
