Diff-SVC Refactor (inference, training, and model code simplified and updated from RCell's Diff-SVC)

This Diff-SVC implementation is based on an unofficial DiffSinger repository.

The project is still under development and testing. The training and inference code is complete.
Preliminary tests suggest that when the dataset contains too many speakers (for example, 60 or 70), sound leakage gets worse, whereas with about 5 speakers the leakage is roughly the same as with a single speaker.
The repository currently has many branches; each one is a candidate solution under testing.

Introduction

This repository performs singing voice timbre conversion based on DiffSinger + SoftVC. Compared with the original Diff-SVC repository, it has the following advantages and disadvantages:

  • Supports multiple speakers
  • This repository is based on the unofficial DiffSinger repository, so the code structure is simpler and easier to understand
  • The vocoder also uses the 44.1 kHz DiffSinger community vocoder
  • Acceleration is not supported

Pre-downloaded files

  • Place the SoftVC HuBERT checkpoint (hubert-soft-0d54a1f4.pt) in the hubert directory
  • Place the 44.1 kHz DiffSinger community vocoder (model) in the hifigan directory
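Before preprocessing, it can help to confirm that both downloads are where the code expects them. The following is a minimal sketch, not part of the repository: the function name `missing_pretrained_files` is hypothetical, and because the README does not give the vocoder's file name, the sketch only checks that the hifigan directory exists and is not empty.

```python
from pathlib import Path

# Checkpoint name and directory names are taken from the list above.
HUBERT_CHECKPOINT = Path("hubert") / "hubert-soft-0d54a1f4.pt"
# The vocoder file name is not stated in the README, so only the directory is checked.
VOCODER_DIR = Path("hifigan")


def missing_pretrained_files(repo_root="."):
    """Return a list of problems found; an empty list means nothing is missing."""
    root = Path(repo_root)
    problems = []
    if not (root / HUBERT_CHECKPOINT).is_file():
        problems.append(f"missing {HUBERT_CHECKPOINT}")
    vocoder_dir = root / VOCODER_DIR
    if not vocoder_dir.is_dir() or not any(vocoder_dir.iterdir()):
        problems.append(f"{VOCODER_DIR} is missing or empty")
    return problems
```

Calling `missing_pretrained_files()` from the repository root returns an empty list when both items are in place.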

Dataset preparation

You only need to put the dataset into the dataset_raw directory, using the following file structure (one sub-directory per speaker):

dataset_raw
├───speaker0
│   ├───xxx1-xxx1.wav
│   ├───...
│   └───Lxx-0xx8.wav
└───speaker1
    ├───xx2-0xxx2.wav
    ├───...
    └───xxx7-xxx007.wav
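The layout above can be sanity-checked with a short script before preprocessing. This is a sketch under the assumption that every speaker is a direct sub-directory of dataset_raw and that its recordings are .wav files directly inside it; the helper name `count_wavs_per_speaker` is hypothetical and not part of the repository.

```python
from pathlib import Path


def count_wavs_per_speaker(dataset_root="dataset_raw"):
    """Map each speaker sub-directory name to the number of .wav files directly inside it."""
    root = Path(dataset_root)
    counts = {}
    for speaker_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        counts[speaker_dir.name] = sum(
            1
            for f in speaker_dir.iterdir()
            if f.is_file() and f.suffix.lower() == ".wav"
        )
    return counts
```

A speaker that maps to 0 usually means the files were placed one level too deep or are not in wav format.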

Data preprocessing

The steps are essentially the same as in sovits3.0.

  1. Resampling
     python resample.py
  2. Automatically divide training set, validation set and test set
     python preprocess_flist_config.py
  3. Generate hubert, f0, mel and stats
     python preprocess_hubert_f0.py && python gen_stats.py
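The steps above can also be chained from a single Python entry point. The sketch below only assumes the four script names listed in the steps; the function `run_preprocessing` and its `run` and `python` parameters are hypothetical conveniences, not part of the repository.

```python
import subprocess
import sys

# Script names come from the steps above; the order matches the numbered list.
PREPROCESS_SCRIPTS = [
    "resample.py",
    "preprocess_flist_config.py",
    "preprocess_hubert_f0.py",
    "gen_stats.py",
]


def run_preprocessing(run=subprocess.run, python=sys.executable):
    """Run each preprocessing script in order, stopping at the first failure."""
    for script in PREPROCESS_SCRIPTS:
        # check=True raises CalledProcessError on a non-zero exit code,
        # which mirrors the `&&` chaining used in the shell command.
        run([python, script], check=True)
```

Running `run_preprocessing()` from the repository root executes all four scripts with the current interpreter; the `run` parameter exists only so the call can be replaced in a dry run or test.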

After these steps, the dataset directory contains the preprocessed data. You can then delete the dataset_raw folder, or delete the temporary wav files left over from resampling:
rm dataset/*/*.wav
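On systems without a POSIX shell, the same cleanup can be done from Python. This is a minimal sketch that matches the `dataset/*/*.wav` pattern from the text; the function name `remove_resampled_wavs` and its `dry_run` flag are hypothetical additions, with the dry run enabled by default so nothing is deleted by accident.

```python
from pathlib import Path


def remove_resampled_wavs(dataset_root="dataset", dry_run=True):
    """Find dataset/<speaker>/*.wav and delete them unless dry_run is True.

    Returns the list of matched paths in both modes.
    """
    targets = sorted(Path(dataset_root).glob("*/*.wav"))
    if not dry_run:
        for wav in targets:
            wav.unlink()
    return targets
```

Call it once with the default `dry_run=True` to review the matched files, then again with `dry_run=False` to remove them. Other preprocessed files in the same directories are left untouched.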

Training

python3 train.py --model naive --dataset ms --restore_step RESTORE_STEP
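When training is launched from another script, the command line can be assembled programmatically. The sketch below uses only the flags shown in the command above. The helper names are hypothetical, and treating RESTORE_STEP as an integer step number is an assumption, since the README does not describe it.

```python
import shlex


def build_train_command(restore_step, model="naive", dataset="ms", python="python3"):
    """Build the argv list for train.py using the flags shown in the README."""
    return [
        python,
        "train.py",
        "--model", model,
        "--dataset", dataset,
        # Assumption: RESTORE_STEP is an integer step number.
        "--restore_step", str(restore_step),
    ]


def format_train_command(restore_step, **kwargs):
    """Return the same command as a single shell-quoted string."""
    return shlex.join(build_train_command(restore_step, **kwargs))
```

The list form can be passed directly to `subprocess.run`, while the string form is convenient for logging.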

Inference

inference.py

Contributors

nlpv2011