
amirldn / rtx-voice-script


A Python script that takes an input MP3/FLAC file and outputs an acapella (background-noise-stripped) WAV using the power of NVIDIA's RTX Voice

License: GNU General Public License v3.0

Python 100.00%
nvidia-rtx rtx-voice rtx-acapella flac wav rtx nvidia vb-audio

rtx-voice-script's Introduction

rtx-voice-script


A Python script that takes an input MP3/FLAC/MOV/MP4/WAV file and outputs an acapella version as a WAV using the power of NVIDIA's RTX Voice. Because RTX Voice is closed source, the script has to play each file and record the result in real time, so it is best used for experimental purposes and for library overhauls.

Getting Started

Requirements


Steps

๐Ÿ’ป+๐Ÿ anaconda

Anaconda is an open source distribution of the Python programming language. It includes many of the most commonly used Python libraries by default, as well as a user interface for managing and updating packages.

You can create a separate working environment for each project, so that different projects can use different dependencies.

  1. Install Anaconda.
  2. Open the Anaconda terminal.
  3. Create a working environment: conda create -n rtx-voice python=3.9
  4. Activate it: conda activate rtx-voice
  5. Optional: to keep the project on another partition of the hard disk, switch drives. At the prompt (base) C:\Users\UserName> type D:
  6. Change to the project folder. At the prompt (base) D:\> type cd D:\Folder\Folder

Download and use the tool

1 ) Clone the repository and cd into it

git clone https://github.com/amirmaula/rtx-voice-script.git

2 ) Install the prerequisites via

pip install -r requirements.txt

3 ) Set RTX Voice's microphone input to your VB Audio Cable (Virtual AUX)

4 ) To execute the program, run it in your CLI like so:

python ./rtx-core.py -i [input path] -o [file directory & name for output]

Example with only the input defined (the output file is created automatically):

python rtx-core.py -i "D:\UserName\Video\video.mov"

Example with a single input and an explicit output defined (not tested):

python ./rtx-core.py -i song.flac -o D:\Music\Acapella\cool_song.wav

5 ) Follow the on screen prompts.

VirtualCable

Enter just the device number, e.g. 5

NVIDIA Broadcast

Enter just the device number, e.g. 3

6 ) Your new config file will then be exported to the script's directory.
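
The command line from step 4 could be parsed roughly as follows. This is only a sketch, not the code in rtx-core.py: the long option names and the rule for choosing a default output name (the input name with a .wav suffix) are assumptions made for illustration.

```python
import argparse
from pathlib import Path


def parse_args(argv):
    """Parse -i/-o the way the usage line above suggests."""
    parser = argparse.ArgumentParser(description="Sketch of an -i/-o command line")
    parser.add_argument("-i", "--input", required=True, help="input media file")
    parser.add_argument("-o", "--output", help="output WAV file (optional)")
    args = parser.parse_args(argv)
    if args.output is None:
        # Assumption: fall back to "<input name>.wav" next to the input file.
        args.output = str(Path(args.input).with_suffix(".wav"))
    return args


args = parse_args(["-i", "song.flac"])
print(args.input, "->", args.output)
```

With this fallback rule, a .wav input would map onto itself, so a real script would need a different default name in that case.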


Sync tool

On my computer there was always a small time lag in the result. It arises because playback and recording do not start at exactly the same moment, so there is always a small time difference between the two.

python-warpdrive, by Darren Sholes, is a nice tool for this job: it syncs the filtered audio to the source audio from the video and returns the time offset.
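
To illustrate how such an offset can be estimated, here is a minimal pure-Python sketch of brute-force cross-correlation. It is not python-warpdrive's API; the function name estimate_offset, the sample values, and the 48000 Hz sample rate are all hypothetical.

```python
def estimate_offset(reference, recorded, max_lag):
    """Return the lag (in samples) at which `recorded` best matches `reference`.

    A positive result means `recorded` starts late by that many samples.
    Brute-force cross-correlation: fine for short clips, slow for long ones.
    """
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, ref_sample in enumerate(reference):
            j = i + lag
            if 0 <= j < len(recorded):
                score += ref_sample * recorded[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


# Hypothetical data: the recording is the reference delayed by 3 samples.
reference = [0.0, 1.0, -2.0, 3.0, -1.0, 0.5, 0.0, 0.0]
recorded = [0.0, 0.0, 0.0] + reference
lag = estimate_offset(reference, recorded, max_lag=5)
offset_seconds = lag / 48000  # assumed sample rate
```

For real recordings, an FFT-based correlation would be far faster than this double loop.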


To-do

  • Add FLAC support DONE
  • Create and load a config file so that the user does not need to select input/output every time DONE
  • Add a real-time timer of how long a file has been playing for DONE
  • Add GUI

Known Issues

  • Sometimes the config file gets corrupted; if that happens, just delete config.cfg.

  • The code will crash if the folder you want to export your .wav to does not already exist, so create the folder beforehand and it will save with no issues.

  • Results may vary; you can tweak the noise suppression to whatever works for you. The script can be used for songs, speech, and so on.

  • This is a work in progress and I am somewhat of an amateur when it comes to coding, so any improvements to the code and constructive criticism are greatly appreciated.

  • NVIDIA RTX is proprietary software and belongs to NVIDIA; all rights are reserved. This program uses the NVIDIA RTX software as intended and does not rely on any exploit.

  • An audio driver (Nahimic) on my laptop (MSI) interfered with my virtual audio cable, and I had to close it before I could use the virtual audio cable.
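
The missing-folder crash noted above can be avoided by creating the output folder before writing. The following is a minimal sketch of that idea, not the script's current behavior; the helper name prepare_output_path is hypothetical.

```python
from pathlib import Path


def prepare_output_path(output_file):
    """Create the parent folder of `output_file` if needed and return the path."""
    path = Path(output_file)
    # parents=True builds intermediate folders; exist_ok=True makes reruns safe.
    path.parent.mkdir(parents=True, exist_ok=True)
    return path
```

Calling prepare_output_path(r"D:\Music\Acapella\cool_song.wav") before saving would create D:\Music\Acapella if it is missing; the WAV file itself is not created by this helper.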

Other Solutions (VST Plugins)

You can use a VST plugin for Adobe Audition, Adobe Premiere Pro or Audacity. There is a YouTube tutorial on this from TroubleChute.

A cheap solution is the VST plugin ELGATO AUDIO EFFECTS, which is free and can be downloaded from the Elgato website. The last time I checked, it did not work on export: it gave me audio artefacts.

The voicefx VST costs money for more advanced features. NVIDIA lists this VST plugin here.


Notes

ffmpeg

  • ffmpeg -i input.wav -vn -ar 48000 -ac 2 -b:a 192k output.mp3
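
The same conversion can be driven from Python. This is a minimal sketch that only rebuilds the command shown above; the function names are hypothetical, and ffmpeg is assumed to be installed and on the PATH.

```python
import subprocess


def build_mp3_command(src, dst, sample_rate=48000, channels=2, bitrate="192k"):
    """Return the ffmpeg argument list for the WAV-to-MP3 conversion above."""
    return [
        "ffmpeg", "-i", src,
        "-vn",                    # drop any video stream
        "-ar", str(sample_rate),  # audio sample rate in Hz
        "-ac", str(channels),     # number of audio channels
        "-b:a", bitrate,          # audio bitrate
        dst,
    ]


def convert_to_mp3(src, dst):
    """Run ffmpeg; raises CalledProcessError if the conversion fails."""
    subprocess.run(build_mp3_command(src, dst), check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.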

pip

  • To check the installed version of a package from requirements.txt: pip show mutagen
  • To install a package: pip install mutagen

conda

  • conda create -n rtx-voice python=3.9

  • conda activate rtx-voice

  • conda deactivate

  • rtx-core.py -i audio.flac -o D:\AudioKI\audio-d.wav

rtx-voice-script's People

Contributors

amirldn, dalbyte, dependabot[bot]


rtx-voice-script's Issues

Work with separation non-voice sounds

Hello,
We need to separate one sound (not a voice) from the others.
As far as I know, the RTX Voice source code had an option to train the user's own model, but it is not accessible now.
Is it possible in your software to train my own sound model?
For example, I play piano in a public place and I want to keep only the piano sounds and cut all the other sounds.

Or maybe you can help to create this software?

Thank you.

I don't understand the readme in 3)

How do I do step 3 ) Set RTX Voice's microphone input to your VB-Audio Cable (Virtual AUX)? Can you describe this part more specifically, please?

Duration of the recording

I have a question: does a FLAC go in and a WAV come out?
Or can a FLAC go in and a FLAC come out?

  • A limitation of the previous version was a maximum duration of 10-15 min for MP3; if the file lasted longer, the output was damaged.
    I'm going to check the new version and see if that keeps happening.

  • What I did to avoid that was to use ffmpeg to split the file and then merge the parts. Perhaps using a library for this will help you fix the duration problem.

  • Let me learn Python and, if you like, I will help you. I work with JavaScript / Node, but I am now learning Python, thanks to the fact that you, my dear friend, have encouraged me to get my hands on it.

P.S. Thank you for hearing my request.

Sorry if the translation is not good, I speak Spanish

Add .wav format

Could you add compatibility with .wav? Going to MP3 loses a lot of quality. Please :D

Asynchronous tasks

Since all the task sessions use the same audio device, it seems impossible to run all of them simultaneously. Any idea?
