rvc-archiver

Getting started

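Download the pretrained v2 weights (plain and F0 variants) into shared/f0. The directory must already exist, because curl -o does not create missing directories: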

curl -L -o shared/f0/D32k.pth      https://huggingface.co/lj1995/VoiceConversionWebUI/resolve/main/pretrained_v2/D32k.pth \
     -L -o shared/f0/D40k.pth      https://huggingface.co/lj1995/VoiceConversionWebUI/resolve/main/pretrained_v2/D40k.pth \
     -L -o shared/f0/D48k.pth      https://huggingface.co/lj1995/VoiceConversionWebUI/resolve/main/pretrained_v2/D48k.pth \
     -L -o shared/f0/G32k.pth      https://huggingface.co/lj1995/VoiceConversionWebUI/resolve/main/pretrained_v2/G32k.pth \
     -L -o shared/f0/G40k.pth      https://huggingface.co/lj1995/VoiceConversionWebUI/resolve/main/pretrained_v2/G40k.pth \
     -L -o shared/f0/G48k.pth      https://huggingface.co/lj1995/VoiceConversionWebUI/resolve/main/pretrained_v2/G48k.pth \
     -L -o shared/f0/f0D32k.pth    https://huggingface.co/lj1995/VoiceConversionWebUI/resolve/main/pretrained_v2/f0D32k.pth \
     -L -o shared/f0/f0D40k.pth    https://huggingface.co/lj1995/VoiceConversionWebUI/resolve/main/pretrained_v2/f0D40k.pth \
     -L -o shared/f0/f0D48k.pth    https://huggingface.co/lj1995/VoiceConversionWebUI/resolve/main/pretrained_v2/f0D48k.pth \
     -L -o shared/f0/f0G32k.pth    https://huggingface.co/lj1995/VoiceConversionWebUI/resolve/main/pretrained_v2/f0G32k.pth \
     -L -o shared/f0/f0G40k.pth    https://huggingface.co/lj1995/VoiceConversionWebUI/resolve/main/pretrained_v2/f0G40k.pth \
     -L -o shared/f0/f0G48k.pth    https://huggingface.co/lj1995/VoiceConversionWebUI/resolve/main/pretrained_v2/f0G48k.pth
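The same download can be scripted. The sketch below is only an illustration: it rebuilds the twelve file names from the command above and fetches any that are missing. The function names `weight_files` and `download_weights` are hypothetical and not part of this repository.

```python
from pathlib import Path
from urllib.request import urlretrieve

# Base URL copied from the curl command above.
BASE_URL = "https://huggingface.co/lj1995/VoiceConversionWebUI/resolve/main/pretrained_v2"


def weight_files():
    """Return the 12 file names: D/G and f0D/f0G at 32k, 40k and 48k."""
    return [
        f"{prefix}{rate}.pth"
        for prefix in ("D", "G", "f0D", "f0G")
        for rate in ("32k", "40k", "48k")
    ]


def download_weights(dest="shared/f0"):
    """Download every weight file that is not already present in dest."""
    dest_dir = Path(dest)
    dest_dir.mkdir(parents=True, exist_ok=True)  # unlike curl -o, create the directory
    for name in weight_files():
        target = dest_dir / name
        if target.exists():
            continue  # skip files from a previous run
        urlretrieve(f"{BASE_URL}/{name}", target)


# Usage (performs network requests):
# download_weights()
```

Skipping files that already exist lets an interrupted run be resumed, although a partially written file would need to be deleted by hand first.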

Code Generation

python3 src/graphql/codegen.py

TODO:

  • Add the f0 curve as an inference param for Optuna
  • Try optimizing different RVC models
  • Experiment with different Azure TTS models for each of the RVC models
    • I need to figure out whether it is better to use the same voice for all the models or a different voice for each
    • This could be turned into an Optuna parameter, or Optuna may have a way to compare multiple models in a Jupyter Notebook
    • Make f0 curve a parameter
  • Pull all the model MD5 hashes that are already stored in the database
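For the MD5 item, a minimal sketch of how known hashes might be used to skip models that are already stored. It assumes the hashes have already been loaded from the database into an iterable of hex strings; `file_md5` and `new_models` are hypothetical helper names, and the database query itself is left out.

```python
import hashlib


def file_md5(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def new_models(paths, known_hashes):
    """Return (path, md5) pairs for files whose hash is not already known.

    known_hashes is assumed to come from the database; duplicates within
    paths are also reported only once.
    """
    known = set(known_hashes)
    fresh = []
    for path in paths:
        md5 = file_md5(path)
        if md5 not in known:
            fresh.append((path, md5))
            known.add(md5)
    return fresh
```

Loading every stored hash into a set up front turns the duplicate check into one database query instead of one query per model file.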

Process

  • Pull data from RVCStatSheet.csv ✅
  • Store stat sheet data ✅
  • Label the stat sheet data and infer the character/person with OpenAI ✅
  • Pull video(s) that are compilations of quality audio samples of the character
  • Run F0 extraction on the video's audio with RMVPE ✅
  • Remove background noise from audio
  • Split the audio F0s into frequency segments ✅
  • Extract frame segments from the video that correspond to the respective audio frequency segment ✅
  • Show some of the frame segments to OpenAI and have it guess whether the character/person we want is speaking
  • Compute the average F0 on the validated audio segments
  • Use the averaged F0 for optimizing the RVC model params
  • Tune RVC model with Optuna set at the derived pitch transpose ✅
  • Store the character/person info with the correct RVC params to use for inference ✅

-- "06039080-116d-493e-87ab-6b219834a799" "7f3a561f-a167-4ac3-90db-8dd0147b28f8" 0.04 0 0 "rmvpe" 0.58 0 0.22 "f0G40k.pth" 3.63166352113088
-- "bcec0d61-9d30-4cdb-8ced-773c9b596108" "28b5807d-2f88-4950-ae19-8c71646d4034" 0.12 0 1 "rmvpe" 0.02 0 0.06 "f0G40k.pth" 3.960955540339152

TODO: Make the RVC container empty the /tmp/gradio directory periodically so that it does not grow indefinitely

Temporary fix:

sudo -E python3 src/seed.py
# See around line 100 in optimize_params.py:
# if gradio_server_url == "http://localhost:7865/":
#     empty_directory("tmp-rvc-0")
# elif gradio_server_url == "http://localhost:7866/":
#     empty_directory("tmp-rvc-1")
# elif gradio_server_url == "http://localhost:7867/":
#     empty_directory("tmp-rvc-2")
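The commented branch above calls `empty_directory`, whose body is not shown here. A minimal sketch of what such a helper could look like, with the URL-to-directory mapping taken from the comments; the implementation and the `clear_tmp_for` wrapper are assumptions, not the code in optimize_params.py.

```python
import shutil
from pathlib import Path

# Mapping copied from the commented snippet above.
TMP_DIRS = {
    "http://localhost:7865/": "tmp-rvc-0",
    "http://localhost:7866/": "tmp-rvc-1",
    "http://localhost:7867/": "tmp-rvc-2",
}


def empty_directory(directory):
    """Delete everything inside directory but keep the directory itself.

    Returns the number of top-level entries removed.
    """
    root = Path(directory)
    if not root.is_dir():
        return 0
    removed = 0
    for entry in root.iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)  # real subdirectory: remove recursively
        else:
            entry.unlink()  # regular file or symlink
        removed += 1
    return removed


def clear_tmp_for(gradio_server_url):
    """Empty the temp directory mapped to this server URL, if there is one."""
    directory = TMP_DIRS.get(gradio_server_url)
    if directory is None:
        return 0
    return empty_directory(directory)
```

A dictionary lookup replaces the if/elif chain, so adding a fourth RVC container only needs one new entry.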

Contributors

mitchsayre
