Comments (3)
What do you mean exactly by aligning the audios?
The script extract_durations.py generates a dataset for the forward model from the predictions of the autoregressive model. If you add the useGT flag, the ground-truth mels (extracted directly from the wavs) are used as the target for training the forward model; otherwise (recommended), the predictions of the autoregressive model are used as the target. Hope this helps.
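To make the idea of "durations from the autoregressive model" more concrete, here is a minimal sketch of one common way to turn an attention matrix into per-token durations: count, for each input token, how many mel frames attend to it most strongly. This is only an illustration under that assumption; the function name `durations_from_attention` and the toy matrix are hypothetical, and extract_durations.py may well use a different procedure.

```python
from typing import List, Sequence


def durations_from_attention(attention: Sequence[Sequence[float]]) -> List[int]:
    """Count how many mel frames peak on each input token.

    `attention` has one row per mel frame and one column per input token.
    Assumption: each frame is assigned to its single most attended token.
    """
    n_tokens = len(attention[0])
    durations = [0] * n_tokens
    for row in attention:
        # Index of the most attended token for this frame (first one on ties).
        peak = max(range(n_tokens), key=row.__getitem__)
        durations[peak] += 1
    return durations


# Toy example: 5 mel frames attending over 3 input tokens.
toy_attention = [
    [0.9, 0.1, 0.0],
    [0.7, 0.3, 0.0],
    [0.2, 0.7, 0.1],
    [0.0, 0.4, 0.6],
    [0.0, 0.1, 0.9],
]
toy_durations = durations_from_attention(toy_attention)
```

By construction the durations sum to the number of mel frames, which is what a duration-based forward model needs in order to expand its inputs to the target length.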
from transformertts.
Hey, thanks for your reply. Just one more check: if my autoregressive model is not good, how much will that affect the forward model?
In that case, which is more advisable to use: GT mels or predicted mels?
Hi,
To evaluate your autoregressive model for alignment extraction, look at the last-layer attention heads on your training set. If these do not show significant jumps or collapses, it will be OK, regardless of how good your out-of-set predictions are (because the training-set alignments are obtained with teacher forcing).
According to the literature, predicted mels (which I believe correspond to sequence-level knowledge distillation) are to be preferred.
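For anyone who wants to check this numerically rather than by eye, below is a rough sketch of how "jumps" and "collapses" could be flagged in a single attention head. It assumes the attention is available as a matrix with one row per mel frame and one column per input token; the helper names, the `max_jump` threshold, and the coverage measure are all hypothetical choices for illustration, not something provided by the repository.

```python
from typing import Dict, List, Sequence


def attention_peaks(attention: Sequence[Sequence[float]]) -> List[int]:
    """For each mel frame, return the index of its most attended input token."""
    return [max(range(len(row)), key=row.__getitem__) for row in attention]


def alignment_report(attention: Sequence[Sequence[float]], max_jump: int = 3) -> Dict:
    """Flag large jumps between consecutive frames and measure token coverage.

    jumps:    (frame, previous_peak, current_peak) wherever the peak moves by
              more than `max_jump` tokens between consecutive frames.
    coverage: fraction of input tokens that are the peak of at least one frame;
              a very low value suggests the head collapsed onto a few tokens.
    """
    peaks = attention_peaks(attention)
    n_tokens = len(attention[0])
    jumps = [
        (t, peaks[t - 1], peaks[t])
        for t in range(1, len(peaks))
        if abs(peaks[t] - peaks[t - 1]) > max_jump
    ]
    coverage = len(set(peaks)) / n_tokens
    return {"jumps": jumps, "coverage": coverage}


# A clean, roughly diagonal alignment: 6 frames over 3 tokens.
clean = [
    [1.0, 0.0, 0.0],
    [0.8, 0.2, 0.0],
    [0.1, 0.9, 0.0],
    [0.0, 0.7, 0.3],
    [0.0, 0.2, 0.8],
    [0.0, 0.0, 1.0],
]
# A collapsed head: every frame attends to the first token.
collapsed = [[1.0, 0.0, 0.0] for _ in range(6)]

clean_report = alignment_report(clean)
collapsed_report = alignment_report(collapsed)
```

Running this over training-set utterances (where the alignments come from teacher forcing) gives a quick way to spot problematic heads, but plotting the attention maps remains the more reliable check.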