roatienza / efficientspeech

142.0 6.0 25.0 5.08 MB

PyTorch code implementation of EfficientSpeech - to be presented at ICASSP2023.

License: Apache License 2.0

Python 48.95% Shell 0.86% Jupyter Notebook 50.19%

neural speech synthesis tts

efficientspeech's Issues

Please upgrade pytorch_lightning

pytorch_lightning is now at 2.0, and so is lightning_bolts. The code has also been outdated since torch 2.0: every single API call breaks.

demo.py is broken

The last commit broke demo.py, and it now produces broken audio.

[Request] Provide German model

Many thanks, Rowel, for this repository and the provided English models. I like the quality and the RTF on CPU (it's about 10 on my PC), and I will soon install it on different kinds of RPis.

Do you have any plans to train a German model, for instance based on the ThorstenVoice Dataset 2022.10?

Can you share your experience with regard to training speed?

Mobile android test

Have you tested on Android or iOS?
I want to test on a mobile OS.

Thanks, Moon

Unable to run demo on windows

(es) Q:\Utilities\CUDA\efficientspeech>python demo.py --checkpoint https://github.com/roatienza/efficientspeech/releases/download/pytorch2.0.1/tiny_eng_266k.ckpt  --infer-device cpu --text "the quick brown fox jumps over the lazy dog" --wav-filename fox.wav
100%|█████████████████████████████████████████████████████████████████████████████| 6.76M/6.76M [00:00<00:00, 68.4MB/s]
A:\Anaconda\envs\es\lib\site-packages\torch\nn\utils\weight_norm.py:30: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Removing weight norm...
Traceback (most recent call last):
  File "Q:\Utilities\CUDA\efficientspeech\demo.py", line 121, in <module>
    model = model.load_from_checkpoint(checkpoint,
  File "A:\Anaconda\envs\es\lib\site-packages\lightning\pytorch\utilities\model_helpers.py", line 93, in __get__
    raise TypeError(
TypeError: The classmethod `EfficientSpeech.load_from_checkpoint` cannot be called on an instance. Please call it on the class type and make sure the return value is used.
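
The traceback points at line 121 of `demo.py`, where `load_from_checkpoint` is called on a model instance, which the installed Lightning version rejects. The toy example below reproduces the pattern without Lightning. `restricted_classmethod` and `TinyModel` are hypothetical stand-ins written for illustration, not Lightning's or the repository's code. The likely fix in `demo.py` follows the same idea: call `EfficientSpeech.load_from_checkpoint(...)` on the class and keep the returned object.

```python
import types


class restricted_classmethod:
    """Hypothetical stand-in: a classmethod that refuses instance calls."""

    def __init__(self, func):
        self.func = func

    def __get__(self, instance, owner):
        if instance is not None:
            raise TypeError(
                f"The classmethod `{owner.__name__}.{self.func.__name__}` "
                "cannot be called on an instance."
            )
        # Bind the function to the class, as a regular classmethod would.
        return types.MethodType(self.func, owner)


class TinyModel:
    def __init__(self, weights=None):
        self.weights = weights

    @restricted_classmethod
    def load_from_checkpoint(cls, checkpoint):
        # A real loader would read a file; here the "checkpoint" is a dict.
        return cls(weights=checkpoint["weights"])


checkpoint = {"weights": [0.1, 0.2]}

# Wrong: calling the classmethod on an instance raises TypeError.
model = TinyModel()
try:
    model = model.load_from_checkpoint(checkpoint)
    instance_call_error = None
except TypeError as err:
    instance_call_error = str(err)

# Right: call it on the class and use the return value.
model = TinyModel.load_from_checkpoint(checkpoint)
print(instance_call_error)
print(model.weights)
```

Whether the other arguments passed at that call site need changes as well is not visible from the traceback.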

ONNX models?

I've tried efficientspeech on my Raspberry Pi 4 and it works pretty well (~2s for 3s audio) 👍, but it still needs to be a bit faster to be really useful.
In your code I've seen a comment about the ONNX models being ~3 times faster.
I couldn't get the convert script to work, so I was wondering if you could upload the model for testing? 🙂
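
The timing quoted above can be expressed as a single number. This is a minimal sketch, assuming the convention in which the real-time factor is audio duration divided by synthesis time, so values above 1 are faster than real time. Some papers use the inverse ratio. The function name is hypothetical, and the ONNX figure is only a projection from the "~3 times faster" comment, not a measurement.

```python
def real_time_factor(audio_seconds: float, synthesis_seconds: float) -> float:
    """Audio duration divided by synthesis time (assumed convention)."""
    if synthesis_seconds <= 0:
        raise ValueError("synthesis_seconds must be positive")
    return audio_seconds / synthesis_seconds


# Reported on the Raspberry Pi 4: about 2 s to synthesize 3 s of audio.
rpi4_rtf = real_time_factor(audio_seconds=3.0, synthesis_seconds=2.0)

# Projection only: the same clip if synthesis were 3 times faster.
projected_onnx_rtf = real_time_factor(audio_seconds=3.0, synthesis_seconds=2.0 / 3)

print(f"Raspberry Pi 4: {rpi4_rtf:.1f}x real time")
print(f"Projected with ONNX: {projected_onnx_rtf:.1f}x real time")
```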

ONNX Inference Issue

Thank you for the excellent work and sharing this implementation.

I converted the model to ONNX and ran inference. However, I have the issue below and would appreciate any suggestions.

The ONNX input dimension remains fixed, so we need to pad the phoneme array with additional IDs. The existing code repeats the phonemes until the array reaches the ONNX input length, which in turn produces repeated audio of the same content. Is there a specific ID I can pad with to avoid unwanted audio at the end? Or is there a way to pass a dynamic-length phoneme array to the ONNX model? Please clarify if I'm missing anything here and how to avoid this.
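
One way to picture the padding approach asked about here, as a sketch only: pad with a dedicated ID instead of repeating the phonemes, then cut the waveform where the real phonemes end. `PAD_ID = 0`, `HOP_LENGTH = 256`, and the per-phoneme `durations` output are assumptions made for illustration. The repository's actual pad symbol, hop size, and ONNX outputs would need to be checked before relying on this.

```python
PAD_ID = 0        # assumed pad symbol ID, not confirmed by the repository
HOP_LENGTH = 256  # assumed number of waveform samples per mel frame


def pad_phonemes(phoneme_ids, fixed_len, pad_id=PAD_ID):
    """Right-pad to the fixed input length; return padded IDs and real length."""
    real_len = len(phoneme_ids)
    if real_len > fixed_len:
        raise ValueError(
            f"{real_len} phonemes exceed the fixed input length {fixed_len}"
        )
    padded = list(phoneme_ids) + [pad_id] * (fixed_len - real_len)
    return padded, real_len


def samples_to_keep(durations, real_len, hop_length=HOP_LENGTH):
    """Number of waveform samples covered by the real (unpadded) phonemes."""
    frames = sum(int(d) for d in durations[:real_len])
    return frames * hop_length


phonemes = [12, 7, 33, 5]
padded, real_len = pad_phonemes(phonemes, fixed_len=8)

# Hypothetical per-phoneme durations (in mel frames) returned by the model.
durations = [3, 5, 4, 2, 6, 6, 6, 6]
keep = samples_to_keep(durations, real_len)

# Placeholder waveform with one sample slot per predicted frame * hop.
fake_wav = [0.0] * (sum(durations) * HOP_LENGTH)
trimmed = fake_wav[:keep]
print(padded, keep, len(trimmed))
```

For the second question, `torch.onnx.export` accepts a `dynamic_axes` argument that marks an input dimension as variable length. Whether this model exports cleanly that way has not been verified here.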

I can promote this repo but I need some info and guide for training

I have recently released this tutorial

Training with DLAS was very easy. I used the Ozen Toolkit to prepare the whole speech training dataset with a single click.

So both data preparation and training took a single click each.

Can you give me some more information or guides on how to train a model that produces speech with your repo?

I would like to make a tutorial.

Master Deep Voice Cloning in Minutes: Unleash Your Vocal Superpowers! Free and Locally on Your PC

Error in networks.py

Hello,

I am running into this error:

File "/opt/conda/lib/python3.10/site-packages/lightning/pytorch/loops/fit_loop.py", line 354, in advance
self.epoch_loop.run(self._data_fetcher)
File "/opt/conda/lib/python3.10/site-packages/lightning/pytorch/loops/training_epoch_loop.py", line 133, in run
self.advance(data_fetcher)
File "/opt/conda/lib/python3.10/site-packages/lightning/pytorch/loops/training_epoch_loop.py", line 218, in advance
batch_output = self.automatic_optimization.run(trainer.optimizers[0], kwargs)
File "/opt/conda/lib/python3.10/site-packages/lightning/pytorch/loops/optimization/automatic.py", line 185, in run
self._optimizer_step(kwargs.get("batch_idx", 0), closure)
File "/opt/conda/lib/python3.10/site-packages/lightning/pytorch/loops/optimization/automatic.py", line 260, in _optimizer_step
call._call_lightning_module_hook(
File "/opt/conda/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py", line 140, in _call_lightning_module_hook
output = fn(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/lightning/pytorch/core/module.py", line 1256, in optimizer_step
optimizer.step(closure=optimizer_closure)
File "/opt/conda/lib/python3.10/site-packages/lightning/pytorch/core/optimizer.py", line 155, in step
step_output = self._strategy.optimizer_step(self._optimizer, closure, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/lightning/pytorch/strategies/strategy.py", line 225, in optimizer_step
return self.precision_plugin.optimizer_step(optimizer, model=model, closure=closure, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/lightning/pytorch/plugins/precision/amp.py", line 70, in optimizer_step
closure_result = closure()
File "/opt/conda/lib/python3.10/site-packages/lightning/pytorch/loops/optimization/automatic.py", line 140, in call
self._result = self.closure(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/lightning/pytorch/loops/optimization/automatic.py", line 126, in closure
step_output = self._step_fn()
File "/opt/conda/lib/python3.10/site-packages/lightning/pytorch/loops/optimization/automatic.py", line 307, in _training_step
training_step_output = call._call_strategy_hook(trainer, "training_step", *kwargs.values())
File "/opt/conda/lib/python3.10/site-packages/lightning/pytorch/trainer/call.py", line 287, in _call_strategy_hook
output = fn(*args, **kwargs)
File "/opt/conda/lib/python3.10/site-packages/lightning/pytorch/strategies/strategy.py", line 367, in training_step
return self.model.training_step(*args, **kwargs)
File "/home/tts/efficientspeech/model.py", line 214, in training_step
y_hat = self.forward(x)
File "/home/tts/efficientspeech/model.py", line 156, in forward
return self.phoneme2mel(x, train=True) if self.training else self.predict_step(x)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/tts/efficientspeech/layers/networks.py", line 421, in forward
pred = self.encoder(x, train=train)
File "/opt/conda/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/tts/efficientspeech/layers/networks.py", line 370, in forward
fused_features = torch.cat([fused_features, pitch_features,
RuntimeError: Sizes of tensors must match except in dimension 2. Expected size 1 but got size 110 for tensor number 1 in the list.
Epoch 0: 0%| | 0/129 [00:02<?, ?it/s]
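
`torch.cat` requires every tensor to have the same size in all dimensions except the one being concatenated. The message above says the tensors disagree outside dimension 2: tensor 0 has size 1 where tensor 1 has size 110. The helper below is a hypothetical, framework-free sketch of that rule, useful for checking shapes before the `torch.cat` call in `layers/networks.py`. The example shapes are invented to reproduce the message and are not taken from the model.

```python
def find_cat_mismatch(shapes, dim):
    """Return (tensor_index, axis, expected, got) for the first shape that
    cannot be concatenated along `dim`, or None if all shapes are compatible."""
    reference = shapes[0]
    for index, shape in enumerate(shapes[1:], start=1):
        if len(shape) != len(reference):
            raise ValueError(
                f"tensor {index} has {len(shape)} dims, expected {len(reference)}"
            )
        for axis, (expected, got) in enumerate(zip(reference, shape)):
            # Only the concatenation axis is allowed to differ.
            if axis != dim and expected != got:
                return index, axis, expected, got
    return None


# Invented shapes in (batch, sequence, channels) order, joined along channels.
shapes = [(8, 1, 128), (8, 110, 128), (8, 110, 128)]
mismatch = find_cat_mismatch(shapes, dim=2)
print(mismatch)

# Shapes that differ only along the concatenation axis are fine.
compatible = find_cat_mismatch([(8, 110, 128), (8, 110, 64)], dim=2)
print(compatible)
```

Printing `.shape` for each tensor passed to `torch.cat` at line 370 would show which feature tensor carries the unexpected size 1. The cause in the training data or preprocessing cannot be determined from the traceback alone.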

