
one-shot-voice-cloning's Introduction

Unet-TTS: Improving Unseen Speaker and Style Transfer in One-shot Voice Cloning

MIT License

English | 中文

❗ We now provide inference code and pre-trained models. You can synthesize speech for any text you want.

⭐ Model training uses only a neutral-emotion corpus; no strongly emotional speech is used.

⭐ Out-of-domain style transfer remains a major challenge. Limited by the training corpus, it is difficult for speaker-embedding or unsupervised style-learning methods (such as GST) to imitate unseen data.

⭐ With the help of the Unet architecture and AdaIN layers, the proposed algorithm has strong speaker and style transfer capabilities.

Demo results

Paper link

The Colab notebook is highly recommended for testing.


⭐ Now you only need the reference speech for one-shot voice cloning; you no longer need to enter the duration statistics manually.

😄 The authors are preparing a simple, clear, and well-documented training pipeline for Unet-TTS based on Aishell3.

It contains:

  • One-shot voice cloning inference
  • Automatic estimation of the reference speech's duration statistics using the Style_Encoder
  • Multi-speaker TTS with speaker-embedding instance normalization; this model provides the pre-trained Content Encoder
  • Unet-TTS training
  • C++ inference

Stay tuned!


Install Requirements

  • Only Linux is supported.
  • Install TensorFlow and tensorflow-addons versions appropriate for your CUDA version.
  • The defaults are TensorFlow 2.6 and tensorflow-addons 0.14.0; a quick version check is shown below the install commands.
cd One-Shot-Voice-Cloning/TensorFlowTTS
pip install . 
(or python setup.py install)
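
After installation, a quick sanity check (a minimal sketch, assuming the default TensorFlow 2.6 / tensorflow-addons 0.14.0 setup) can confirm that the expected versions are active in your environment:

import tensorflow as tf
import tensorflow_addons as tfa

# The repository defaults expect TensorFlow 2.6.x and tensorflow-addons 0.14.0.
print("tensorflow:", tf.__version__)
print("tensorflow-addons:", tfa.__version__)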

Usage

Option 1: Set the reference audio file to be cloned in UnetTTS_syn.py. (See that file for more details; an illustrative edit is sketched after the command below.)

cd One-Shot-Voice-Cloning
CUDA_VISIBLE_DEVICES=0 python UnetTTS_syn.py
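
A hedged illustration of the edit Option 1 asks for (the variable name wav_fpath is an assumption borrowed from the notebook example in Option 2; check UnetTTS_syn.py for the actual name):

# Inside UnetTTS_syn.py: point the reference-audio path at the speech you want to clone.
wav_fpath = "./my_reference_speech.wav"  # hypothetical path, replace with your own recording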

Option 2: Notebook

Note: Add the One-Shot-Voice-Cloning path to the system path; otherwise the required UnetTTS class cannot be imported from UnetTTS_syn.py.

import sys
sys.path.append("<your repository's parent directory>/One-Shot-Voice-Cloning")
from UnetTTS_syn import UnetTTS

from tensorflow_tts.audio_process import preprocess_wav

"""Inint models"""
models_and_params = {"duration_param": "train/configs/unetts_duration.yaml",
                    "duration_model": "models/duration4k.h5",
                    "acous_param": "train/configs/unetts_acous.yaml",
                    "acous_model": "models/acous12k.h5",
                    "vocoder_param": "train/configs/multiband_melgan.yaml",
                    "vocoder_model": "models/vocoder800k.h5"}

feats_yaml = "train/configs/unetts_preprocess.yaml"

text2id_mapper = "models/unetts_mapper.json"

Tts_handel = UnetTTS(models_and_params, text2id_mapper, feats_yaml)

"""Synthesize arbitrary text cloning voice using a reference speech""" 
wav_fpath = "./reference_speech.wav"
ref_audio = preprocess_wav(wav_fpath, source_sr=16000, normalize=True, trim_silence=True, is_sil_pad=True,
                    vad_window_length=30,
                    vad_moving_average_width=1,
                    vad_max_silence_length=1)

# Inserting #3 marks into the text is treated as punctuation, so the synthesized speech produces pauses at those points.
text = "一句话#3风格迁移#3语音合成系统"

syn_audio, _, _ = Tts_handel.one_shot_TTS(text, ref_audio)
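
To save the result, a minimal follow-up sketch (not part of the original example; it assumes the soundfile package, which is installed with the repository's dependencies, and the 16 kHz rate used by the reference preprocessing above):

import soundfile as sf

# syn_audio is the waveform returned by one_shot_TTS above; write it out at 16 kHz.
sf.write("cloned_speech.wav", syn_audio, samplerate=16000)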

Reference

https://github.com/TensorSpeech/TensorFlowTTS

https://github.com/CorentinJ/Real-Time-Voice-Cloning

one-shot-voice-cloning's People

Contributors

cmsmartvoice


one-shot-voice-cloning's Issues

How to generate the duration statistics, like the test_wavs/*.npy files

  1. The npy files in */test_wavs are generated with the MFA tool, but the corresponding phoneme sequence has to be known first.

  2. The method is not limited to MFA; any tool that can predict articulation durations can be used, such as the acoustic model of an ASR system.

  3. The above methods estimate the duration information of the reference audio accurately. For cloning, however, the duration information does not need to be that accurate; a coarse manual estimate achieves the same effect. For example, phoneme durations can be estimated by ear and eye with a spectrogram viewer or another audio annotation tool.

  4. The Style_Encoder in this model is equivalent to an audio-frame encoder: the final output of the network depends only on the content, with phoneme position information embedded in the results. Based on these temporal position encodings, the Style_Encoder can perform a simple estimation of the phoneme durations of the reference audio. Better yet, this method does not require knowledge of the phoneme sequence corresponding to the audio; see the snippet below.
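
The fragment below, taken from the author's comment, gathers frame-level features at positions derived from the duration index; the self.cc_features* tensors and the index offsets are presumably defined elsewhere in the Style_Encoder and are not shown in this excerpt.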

    indexs = tf.cast(durindex*100, tf.int32)
    cc0 = tf.gather(self.cc_features0, 400+indexs)
    cc1 = tf.gather(self.cc_features1, 300+indexs)
    cc2 = tf.gather(self.cc_features2, 200+indexs)
    cc3 = tf.gather(self.cc_features3, 100+indexs)
    ccc = tf.concat([cc0, cc1, cc2, cc3], axis=-1)

Originally posted by @CMsmartvoice in #3 (comment)

Import error

I've been trying to run this repo locally for quite some time, but whenever I try I get an import error. Is there any way I can get a better cloning engine? I have tried a voice cloning tool, and it looks like they have figured out a good cloning engine.

Python 3.9 is not compatible with llvmlite

I ran pip install One-Shot-Voice-Cloning/TensorFlowTTS and got an error; the system is Windows 10.

I looked up a similar error:
https://github.com/numba/llvmlite/issues/669

Following the issue above, I downloaded
numba===0.53.0rc1.post1
llvmlite===0.36.0rc1

But it still reports an error:

Building wheels for collected packages: llvmlite, pyworld, audioread, resampy
Building wheel for llvmlite (setup.py) ... error
error: subprocess-exited-with-error

× python setup.py bdist_wheel did not run successfully.
│ exit code: 1
╰─> [24 lines of output]
running bdist_wheel
C:\Environment\Anaconda\Anaconda3\envs\virtual_xjs\python.exe C:\Users\1135053672\AppData\Local\Temp\pip-install-jlnsqgwl\llvmlite_9c5622c13c624a7da37c2a9f225418b3\ffi\build.py
Trying generator 'Visual Studio 14 2015 Win64'
Traceback (most recent call last):
File "C:\Users\1135053672\AppData\Local\Temp\pip-install-jlnsqgwl\llvmlite_9c5622c13c624a7da37c2a9f225418b3\ffi\build.py", line 168, in
main()
File "C:\Users\1135053672\AppData\Local\Temp\pip-install-jlnsqgwl\llvmlite_9c5622c13c624a7da37c2a9f225418b3\ffi\build.py", line 156, in main
main_win32()
File "C:\Users\1135053672\AppData\Local\Temp\pip-install-jlnsqgwl\llvmlite_9c5622c13c624a7da37c2a9f225418b3\ffi\build.py", line 88, in main_win32
generator = find_win32_generator()
File "C:\Users\1135053672\AppData\Local\Temp\pip-install-jlnsqgwl\llvmlite_9c5622c13c624a7da37c2a9f225418b3\ffi\build.py", line 76, in find_win32_generator
try_cmake(cmake_dir, build_dir, generator)
File "C:\Users\1135053672\AppData\Local\Temp\pip-install-jlnsqgwl\llvmlite_9c5622c13c624a7da37c2a9f225418b3\ffi\build.py", line 28, in try_cmake
subprocess.check_call(['cmake', '-G', generator, cmake_dir])
File "C:\Environment\Anaconda\Anaconda3\envs\virtual_xjs\lib\subprocess.py", line 368, in check_call
retcode = call(*popenargs, **kwargs)
File "C:\Environment\Anaconda\Anaconda3\envs\virtual_xjs\lib\subprocess.py", line 349, in call
with Popen(*popenargs, **kwargs) as p:
File "C:\Environment\Anaconda\Anaconda3\envs\virtual_xjs\lib\subprocess.py", line 951, in init
self._execute_child(args, executable, preexec_fn, close_fds,
File "C:\Environment\Anaconda\Anaconda3\envs\virtual_xjs\lib\subprocess.py", line 1420, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified.
error: command 'C:\Environment\Anaconda\Anaconda3\envs\virtual_xjs\python.exe' failed with exit code 1
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
ERROR: Failed building wheel for llvmlite
Running setup.py clean for llvmlite
Building wheel for pyworld (pyproject.toml) ... done
Created wheel for pyworld: filename=pyworld-0.3.0-cp39-cp39-win_amd64.whl size=165598 sha256=aa377fa3596f5069cd4e2935db7cda64cf55825fe8d6a3e1f6f4bebbb579b3c5
Stored in directory: c:\users\1135053672\appdata\local\pip\cache\wheels\52\e9\41\dfd518c392d2c9fbf54cec8b7067afb83e759eda086a39aee4
Building wheel for audioread (setup.py) ... done
Created wheel for audioread: filename=audioread-2.1.9-py3-none-any.whl size=23153 sha256=1512e3e485965eb7973b0f43c950fe574fe5b73845401013c8faa26577a53b5c
Stored in directory: c:\users\1135053672\appdata\local\pip\cache\wheels\d2\1c\42\1c961e1d65429e9edffdd5fa1b69cae92a1082133abbf39835
Building wheel for resampy (setup.py) ... done
Created wheel for resampy: filename=resampy-0.2.2-py3-none-any.whl size=320732 sha256=97a16689ba045017b1f15aa3b3d9a1da59e987f6a228f75902120c5f9461168a
Stored in directory: c:\users\1135053672\appdata\local\pip\cache\wheels\17\74\46\c6570ed50edb542a09fb2e88fb135939178f11a0754ceb9752
Successfully built pyworld audioread resampy
Failed to build llvmlite
Installing collected packages: wrapt, typing-extensions, textgrid, termcolor, tensorflow-estimator, tensorboard-plugin-wit, pyasn1, llvmlite, keras, jamo, flatbuffers, distance, dataclasses, clang, certifi, audioread, appdirs, zipp, wheel, werkzeug, urllib3, unidecode, typeguard, threadpoolctl, tensorboard-data-server, six, setuptools, rsa, regex, PyYAML, pypinyin, pyparsing, pycparser, pyasn1-modules, protobuf, pillow, oauthlib, numpy, kiwisolver, joblib, inflect, idna, gast, g2pM, fonttools, filelock, decorator, cython, cycler, colorama, charset-normalizer, cachetools, tqdm, tensorflow-addons, scipy, requests, pyworld, python-dateutil, packaging, opt-einsum, numba, keras-preprocessing, importlib-metadata, h5py, grpcio, google-pasta, google-auth, click, cffi, astunparse, absl-py, soundfile, scikit-learn, resampy, requests-oauthlib, pooch, nltk, matplotlib, markdown, huggingface-hub, librosa, google-auth-oauthlib, g2p-en, tensorboard, tensorflow-gpu, TensorFlowTTS
Running setup.py install for llvmlite ... error
error: subprocess-exited-with-error

× Running setup.py install for llvmlite did not run successfully.
│ exit code: 1
╰─> [27 lines of output]
running install
running build
got version from file C:\Users\1135053672\AppData\Local\Temp\pip-install-jlnsqgwl\llvmlite_9c5622c13c624a7da37c2a9f225418b3\llvmlite/_version.py {'version': '0.31.0', 'full': 'fe7d985f6421d87f613bd414479d29d912771562'}
running build_ext
C:\Environment\Anaconda\Anaconda3\envs\virtual_xjs\python.exe C:\Users\1135053672\AppData\Local\Temp\pip-install-jlnsqgwl\llvmlite_9c5622c13c624a7da37c2a9f225418b3\ffi\build.py
Trying generator 'Visual Studio 14 2015 Win64'
Traceback (most recent call last):
File "C:\Users\1135053672\AppData\Local\Temp\pip-install-jlnsqgwl\llvmlite_9c5622c13c624a7da37c2a9f225418b3\ffi\build.py", line 168, in
main()
File "C:\Users\1135053672\AppData\Local\Temp\pip-install-jlnsqgwl\llvmlite_9c5622c13c624a7da37c2a9f225418b3\ffi\build.py", line 156, in main
main_win32()
File "C:\Users\1135053672\AppData\Local\Temp\pip-install-jlnsqgwl\llvmlite_9c5622c13c624a7da37c2a9f225418b3\ffi\build.py", line 88, in main_win32
generator = find_win32_generator()
File "C:\Users\1135053672\AppData\Local\Temp\pip-install-jlnsqgwl\llvmlite_9c5622c13c624a7da37c2a9f225418b3\ffi\build.py", line 76, in find_win32_generator
try_cmake(cmake_dir, build_dir, generator)
File "C:\Users\1135053672\AppData\Local\Temp\pip-install-jlnsqgwl\llvmlite_9c5622c13c624a7da37c2a9f225418b3\ffi\build.py", line 28, in try_cmake
subprocess.check_call(['cmake', '-G', generator, cmake_dir])
File "C:\Environment\Anaconda\Anaconda3\envs\virtual_xjs\lib\subprocess.py", line 368, in check_call
retcode = call(*popenargs, **kwargs)
File "C:\Environment\Anaconda\Anaconda3\envs\virtual_xjs\lib\subprocess.py", line 349, in call
with Popen(*popenargs, **kwargs) as p:
File "C:\Environment\Anaconda\Anaconda3\envs\virtual_xjs\lib\subprocess.py", line 951, in init
self._execute_child(args, executable, preexec_fn, close_fds,
File "C:\Environment\Anaconda\Anaconda3\envs\virtual_xjs\lib\subprocess.py", line 1420, in _execute_child
hp, ht, pid, tid = _winapi.CreateProcess(executable, args,
FileNotFoundError: [WinError 2] The system cannot find the file specified.
error: command 'C:\Environment\Anaconda\Anaconda3\envs\virtual_xjs\python.exe' failed with exit code 1
[end of output]

note: This error originates from a subprocess, and is likely not a problem with pip.
error: legacy-install-failure

× Encountered error while trying to install package.
╰─> llvmlite

note: This is an issue with the package mentioned above, not pip.
hint: See above for output from the failure.

I'm not sure whether the LLVM mentioned above is needed or how to point the build at a specific path; or is downgrading Python the only option?

Tts_handel = UnetTTS(models_and_params, text2id_mapper, feats_yaml) raises an error

Environment: Google Colab with GPU, using the notebook from the official documentation.
Running Tts_handel = UnetTTS(models_and_params, text2id_mapper, feats_yaml) raises the following error:

UnknownError                              Traceback (most recent call last)
[<ipython-input-16-2776df11a7fe>](https://localhost:8080/#) in <module>()
----> 1 Tts_handel = UnetTTS(models_and_params, text2id_mapper, feats_yaml)

23 frames
[/content/One-Shot-Voice-Cloning/UnetTTS_syn.py](https://localhost:8080/#) in __init__(self, models_and_params, text2id_mapper, feats_yaml)
     22         self.phone_dur_min          = 5
     23         self.phone_dur_max          = 20
---> 24         self.__init_models()
     25 
     26     def one_shot_TTS(self, text, src_audio, duration_stats=None, is_wrap_txt=True):

[/content/One-Shot-Voice-Cloning/UnetTTS_syn.py](https://localhost:8080/#) in __init_models(self)
     72         self.duration_model = TFAutoModel.from_pretrained(config=AutoConfig.from_pretrained(self.models_and_params["duration_param"]), 
     73                                       pretrained_path=self.models_and_params["duration_model"],
---> 74                                       name="Normalized_duration_predictor")
     75         print("duration model load finished.")
     76 

[/usr/local/lib/python3.7/dist-packages/tensorflow_tts/inference/auto_model.py](https://localhost:8080/#) in from_pretrained(cls, config, pretrained_path, **kwargs)
     59                 model = model_class(config=config, **kwargs)
     60                 if is_build:
---> 61                     model._build()
     62                 if pretrained_path is not None and ".h5" in pretrained_path:
     63                     model.load_weights(pretrained_path)

[/usr/local/lib/python3.7/dist-packages/tensorflow_tts/models/unetts.py](https://localhost:8080/#) in _build(self)
     55         char_ids = tf.convert_to_tensor([[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]], tf.int32)
     56         duration_stat = tf.convert_to_tensor([[1., 1., 1., 1.]], tf.float32)
---> 57         self(char_ids, duration_stat)
     58 
     59     def call(

[/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py](https://localhost:8080/#) in __call__(self, *args, **kwargs)
   1035         with autocast_variable.enable_auto_cast_variables(
   1036             self._compute_dtype_object):
-> 1037           outputs = call_fn(inputs, *args, **kwargs)
   1038 
   1039         if self._activity_regularizer:

[/usr/local/lib/python3.7/dist-packages/tensorflow_tts/models/unetts.py](https://localhost:8080/#) in call(self, char_ids, duration_stat, training, **kwargs)
     72         embedding_output = self.embeddings(char_ids)
     73 
---> 74         encoder_output             = self.encoder([embedding_output, attention_mask], training=training)
     75         last_encoder_hidden_states = encoder_output[0]
     76 

[/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py](https://localhost:8080/#) in __call__(self, *args, **kwargs)
   1035         with autocast_variable.enable_auto_cast_variables(
   1036             self._compute_dtype_object):
-> 1037           outputs = call_fn(inputs, *args, **kwargs)
   1038 
   1039         if self._activity_regularizer:

[/usr/local/lib/python3.7/dist-packages/tensorflow_tts/models/moduls/core.py](https://localhost:8080/#) in call(self, inputs, training)
    377 
    378             layer_outputs = layer_module(
--> 379                 [hidden_states, attention_mask], training=training
    380             )
    381             hidden_states = layer_outputs[0]

[/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py](https://localhost:8080/#) in __call__(self, *args, **kwargs)
   1035         with autocast_variable.enable_auto_cast_variables(
   1036             self._compute_dtype_object):
-> 1037           outputs = call_fn(inputs, *args, **kwargs)
   1038 
   1039         if self._activity_regularizer:

[/usr/local/lib/python3.7/dist-packages/tensorflow_tts/models/moduls/core.py](https://localhost:8080/#) in call(self, inputs, training)
    339         attention_output = attention_outputs[0]
    340         intermediate_output = self.intermediate(
--> 341             [attention_output, attention_mask], training=training
    342         )
    343         layer_output = self.bert_output(

[/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py](https://localhost:8080/#) in __call__(self, *args, **kwargs)
   1035         with autocast_variable.enable_auto_cast_variables(
   1036             self._compute_dtype_object):
-> 1037           outputs = call_fn(inputs, *args, **kwargs)
   1038 
   1039         if self._activity_regularizer:

[/usr/local/lib/python3.7/dist-packages/tensorflow_tts/models/moduls/core.py](https://localhost:8080/#) in call(self, inputs)
    290         hidden_states, attention_mask = inputs
    291 
--> 292         hidden_states = self.conv1d_1(hidden_states)
    293         hidden_states = self.intermediate_act_fn(hidden_states)
    294         hidden_states = self.conv1d_2(hidden_states)

[/usr/local/lib/python3.7/dist-packages/keras/engine/base_layer.py](https://localhost:8080/#) in __call__(self, *args, **kwargs)
   1035         with autocast_variable.enable_auto_cast_variables(
   1036             self._compute_dtype_object):
-> 1037           outputs = call_fn(inputs, *args, **kwargs)
   1038 
   1039         if self._activity_regularizer:

[/usr/local/lib/python3.7/dist-packages/keras/layers/convolutional.py](https://localhost:8080/#) in call(self, inputs)
    247       inputs = tf.pad(inputs, self._compute_causal_padding(inputs))
    248 
--> 249     outputs = self._convolution_op(inputs, self.kernel)
    250 
    251     if self.use_bias:

[/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py](https://localhost:8080/#) in wrapper(*args, **kwargs)
    204     """Call target, and fall back on dispatchers if there is a TypeError."""
    205     try:
--> 206       return target(*args, **kwargs)
    207     except (TypeError, ValueError):
    208       # Note: convert_to_eager_tensor currently raises a ValueError, not a

[/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/nn_ops.py](https://localhost:8080/#) in convolution_v2(input, filters, strides, padding, data_format, dilations, name)
   1136       data_format=data_format,
   1137       dilations=dilations,
-> 1138       name=name)
   1139 
   1140 

[/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/nn_ops.py](https://localhost:8080/#) in convolution_internal(input, filters, strides, padding, data_format, dilations, name, call_from_convolution, num_spatial_dims)
   1266           data_format=data_format,
   1267           dilations=dilations,
-> 1268           name=name)
   1269     else:
   1270       if channel_index == 1:

[/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/dispatch.py](https://localhost:8080/#) in wrapper(*args, **kwargs)
    204     """Call target, and fall back on dispatchers if there is a TypeError."""
    205     try:
--> 206       return target(*args, **kwargs)
    207     except (TypeError, ValueError):
    208       # Note: convert_to_eager_tensor currently raises a ValueError, not a

[/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/deprecation.py](https://localhost:8080/#) in new_func(*args, **kwargs)
    615                   func.__module__, arg_name, arg_value, 'in a future version'
    616                   if date is None else ('after %s' % date), instructions)
--> 617       return func(*args, **kwargs)
    618 
    619     doc = _add_deprecated_arg_value_notice_to_docstring(

[/usr/local/lib/python3.7/dist-packages/tensorflow/python/util/deprecation.py](https://localhost:8080/#) in new_func(*args, **kwargs)
    615                   func.__module__, arg_name, arg_value, 'in a future version'
    616                   if date is None else ('after %s' % date), instructions)
--> 617       return func(*args, **kwargs)
    618 
    619     doc = _add_deprecated_arg_value_notice_to_docstring(

[/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/nn_ops.py](https://localhost:8080/#) in conv1d(value, filters, stride, padding, use_cudnn_on_gpu, data_format, name, input, dilations)
   2009           data_format=data_format,
   2010           dilations=dilations,
-> 2011           name=name)
   2012     else:
   2013       result = squeeze_batch_dims(

[/usr/local/lib/python3.7/dist-packages/tensorflow/python/ops/gen_nn_ops.py](https://localhost:8080/#) in conv2d(input, filter, strides, padding, use_cudnn_on_gpu, explicit_paddings, data_format, dilations, name)
    930       return _result
    931     except _core._NotOkStatusException as e:
--> 932       _ops.raise_from_not_ok_status(e, name)
    933     except _core._FallbackException:
    934       pass

[/usr/local/lib/python3.7/dist-packages/tensorflow/python/framework/ops.py](https://localhost:8080/#) in raise_from_not_ok_status(e, name)
   6939   message = e.message + (" name: " + name if name is not None else "")
   6940   # pylint: disable=protected-access
-> 6941   six.raise_from(core._status_to_exception(e.code, message), None)
   6942   # pylint: enable=protected-access
   6943 

/usr/local/lib/python3.7/dist-packages/six.py in raise_from(value, from_value)

UnknownError: Failed to get convolution algorithm. This is probably because cuDNN failed to initialize, so try looking to see if a warning log message was printed above. [Op:Conv2D]

tensorflow-addons requirements issue

I got this error when running pip install .:
ERROR: Could not find a version that satisfies the requirement tensorflow-addons==0.14.0 (from tensorflowtts) (from versions: 0.16.1, 0.17.0, 0.17.1, 0.18.0, 0.19.0, 0.20.0, 0.21.0)
ERROR: No matching distribution found for tensorflow-addons==0.14.0

import error : from tensorflow_tts.audio_process import preprocess_wav

I just installed TensorFlowTTS version 1.8 via pip install TensorFlowTTS, but when I open notebook/OneShotVoiceClone_Inference.ipynb it shows an import error (screenshot omitted).
It seems that TensorFlowTTS 1.8 does not have the audio_process module, but I also notice that the root folder has a TensorFlowTTS folder; maybe I can install that instead. Do you plan to be compatible with TensorFlowTTS 1.8?

Other language support

Hi, @CMsmartvoice

Thanks for your great work. I would like to use your method to train on Korean/English data for my research.
Would you please give some advice? Are there any plans to provide training steps for other languages?

Thanks in advance

import error

I was following your guide to perform some inference, but I encountered a directory issue:

----> 2 from tensorflow_tts.audio_process import preprocess_wav
3 from UnetTTS_syn import UnetTTS

ModuleNotFoundError: No module named 'tensorflow_tts.audio_process'

I tried to move the folder outside TensorFlowTTS and the module worked, but then other errors came up. Could you help?

AttributeError: 'dict' object has no attribute '__NUMPY_SETUP__'

After entering the One-Shot-Voice-Cloning-master\TensorFlowTTS folder and running python setup.py install, the following error occurs:

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "C:\Users\CHOPY\Downloads\One-Shot-Voice-Cloning-master\TensorFlowTTS\setup.py", line 74, in
setup(
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools_init_.py", line 153, in setup
return distutils.core.setup(**attrs)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\distutils\core.py", line 148, in setup
dist.run_commands()
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\distutils\dist.py", line 966, in run_commands
self.run_command(cmd)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\command\install.py", line 67, in run
self.do_egg_install()
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\command\install.py", line 117, in do_egg_install
cmd.run(show_deprecation=False)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\command\easy_install.py", line 408, in run
self.easy_install(spec, not self.no_deps)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\command\easy_install.py", line 650, in easy_install
return self.install_item(None, spec, tmpdir, deps, True)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\command\easy_install.py", line 697, in install_item
self.process_distribution(spec, dist, deps)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\command\easy_install.py", line 744, in process_distribution
distros = WorkingSet([]).resolve(
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\pkg_resources_init_.py", line 766, in resolve
dist = best[req.key] = env.best_match(
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\pkg_resources_init_.py", line 1051, in best_match
return self.obtain(req, installer)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\pkg_resources_init_.py", line 1063, in obtain
return installer(requirement)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\command\easy_install.py", line 669, in easy_install
return self.install_item(spec, dist.location, tmpdir, deps)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\command\easy_install.py", line 695, in install_item
dists = self.install_eggs(spec, download, tmpdir)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\command\easy_install.py", line 890, in install_eggs
return self.build_and_install(setup_script, setup_base)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\command\easy_install.py", line 1162, in build_and_install
self.run_setup(setup_script, setup_base, args)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\command\easy_install.py", line 1146, in run_setup
run_setup(setup_script, args)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\sandbox.py", line 262, in run_setup
raise
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\contextlib.py", line 137, in exit
self.gen.throw(typ, value, traceback)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\sandbox.py", line 198, in setup_context
yield
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\contextlib.py", line 137, in exit
self.gen.throw(typ, value, traceback)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\sandbox.py", line 169, in save_modules
saved_exc.resume()
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\sandbox.py", line 143, in resume
raise exc.with_traceback(self._tb)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\sandbox.py", line 156, in save_modules
yield saved
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\sandbox.py", line 198, in setup_context
yield
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\sandbox.py", line 259, in run_setup
_execfile(setup_script, ns)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\sandbox.py", line 46, in execfile
exec(code, globals, locals)
File "C:\Users\CHOPY\AppData\Local\Temp\easy_install-f13cm3xd\pysptk-0.1.20\setup.py", line 136, in
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools_init
.py", line 153, in setup
return distutils.core.setup(**attrs)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\distutils\core.py", line 148, in setup
dist.run_commands()
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\distutils\dist.py", line 966, in run_commands
self.run_command(cmd)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\command\bdist_egg.py", line 155, in run
self.run_command("egg_info")
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\distutils\cmd.py", line 313, in run_command
self.distribution.run_command(command)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\distutils\dist.py", line 985, in run_command
cmd_obj.run()
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\command\egg_info.py", line 299, in run
self.find_sources()
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\command\egg_info.py", line 306, in find_sources
mm.run()
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\command\egg_info.py", line 541, in run
self.add_defaults()
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\site-packages\setuptools\command\egg_info.py", line 578, in add_defaults
sdist.add_defaults(self)
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\distutils\command\sdist.py", line 228, in add_defaults
self._add_defaults_ext()
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\distutils\command\sdist.py", line 311, in _add_defaults_ext
build_ext = self.get_finalized_command('build_ext')
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\distutils\cmd.py", line 299, in get_finalized_command
cmd_obj.ensure_finalized()
File "C:\Users\CHOPY\AppData\Local\Programs\Python\Python39\lib\distutils\cmd.py", line 107, in ensure_finalized
self.finalize_options()
File "C:\Users\CHOPY\AppData\Local\Temp\easy_install-f13cm3xd\pysptk-0.1.20\setup.py", line 77, in finalize_options
url="https://github.com/tensorspeech/TensorFlowTTS",

preprocess code

Hi, @CMsmartvoice

I pre-processed an English dataset by referring to TensorFlow TTS.
But I still can't produce the '-embed.npy' files.
What is -embed.npy? Would you please upload the preprocessing code for LibriTTS?
Looking forward to reading your explanations as always!

Is English text supported?

duration model load finished.
acoustics model load finished.
vocode model load finished.
['hello']
phoneme seq: sil hello sil

---------------------------------------------------------------------------

KeyError                                  Traceback (most recent call last)

[<ipython-input-40-71c2e65db483>](https://localhost:8080/#) in <module>
     29 text = "hello"
     30 
---> 31 syn_audio, _, _ = Tts_handel.one_shot_TTS(text, ref_audio)
     32 
     33 ipd.Audio(syn_audio, rate=16000)

2 frames

[/usr/local/lib/python3.7/dist-packages/tensorflow_tts/processor/multispk_voiceclone.py](https://localhost:8080/#) in text_to_sequence(self, text, inference)
    798         #print("text",text)
    799         for symbol in text.split():
--> 800             idx = self.symbol_to_id[symbol]
    801             sequence.append(idx)
    802 

KeyError: 'hello'
