Open Oratio

An open source pipeline to translate .mp4 video files to .mov video files in 20 different languages.

Generate quality video and podcast localizations at scale.

Setup

Most important:

Python 3.7 or later (check with python --version)

pip install -r docs/requirements.txt

Also install rubberband:

brew install rubberband

For speech diarization, clone pyAudioAnalysis inside the oratio directory:

git clone https://github.com/tyiannak/pyAudioAnalysis.git

The dependencies for pyAudioAnalysis are already included in oratio's docs/requirements.txt. Also make sure FFmpeg is installed; installation info is at https://ffmpeg.org/.
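The prerequisites above can be checked with a short script. This is a minimal sketch, not part of oratio: the function name missing_prerequisites is hypothetical, and it assumes the tools are installed under their usual executable names (rubberband and ffmpeg).

```python
import shutil
import sys

# Minimum Python version stated in the setup notes.
MIN_PYTHON = (3, 7)

# Assumption: these are the executable names on your PATH.
REQUIRED_TOOLS = ("rubberband", "ffmpeg")


def missing_prerequisites(tools=REQUIRED_TOOLS, version=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means all good."""
    version = tuple(sys.version_info[:2]) if version is None else tuple(version)
    problems = []
    if version < MIN_PYTHON:
        problems.append("Python %d.%d or later is required" % MIN_PYTHON)
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        if which(tool) is None:
            problems.append("%s was not found on PATH" % tool)
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("missing:", problem)
```

Running the script prints one line per missing prerequisite and nothing when everything is found.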

Follow the instructions in docs/ for AWS, gcloud, and IBM integration. Then set the names of the S3 or gcloud buckets where you will store your audio: set the AWS_BUCKET_NAME and GCLOUD_BUCKET_NAME constants in src/constants/constants.py.
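As a rough illustration, the two constants might look like the fragment below. Only the constant names come from this README; the bucket names are placeholders, and the real constants.py may contain other settings that are not shown here.

```python
# Sketch of the relevant lines in src/constants/constants.py.
# The values are hypothetical; replace them with buckets you own.
AWS_BUCKET_NAME = "my-oratio-audio"      # placeholder S3 bucket name
GCLOUD_BUCKET_NAME = "my-oratio-audio"   # placeholder Google Cloud Storage bucket name
```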

Optional Setup

Install ImageMagick if you want text overlay:

brew install imagemagick

Set up pre-commit if you want to contribute:

pre-commit install

Test the setup:

pre-commit run

This should run black and run_tests.py, but both should be skipped until there are code changes.

Running the pipeline

To check that everything is in the right place, run:

python src/main.py tests/test_config.yaml

Once test_config.yaml works, make your own project folder in media/prod and edit the config.yaml to get going! Check out my test video in media/prod/kaiser to familiarize yourself with the setup.

python src/main.py will use the default config.yaml provided in the home directory.
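The two invocations above differ only in whether a config path is passed. The sketch below shows one way that choice could be made; it is an illustration, not the actual code in src/main.py, and pick_config_path is a hypothetical helper.

```python
DEFAULT_CONFIG = "config.yaml"  # the default config named in this README


def pick_config_path(argv):
    """Return the config path given on the command line, else the default."""
    # argv[0] is the script name; an optional argv[1] is the config file.
    return argv[1] if len(argv) > 1 else DEFAULT_CONFIG
```

In a real entry point this would be called as pick_config_path(sys.argv).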

Understanding the Repo

Start with src/main.py. Run it. Read it.

Follow the commands it executes with a debugger.

Then check out src/client.py. This is our biggest piece of abstraction, and especially if you are adding an API feature, you'll want a good understanding of what it is doing.

src/config.py and src/video_project.py have important setup information and maintain the state of the project.

File structure

. home
/docs - contains documentation on ideas; most documentation is in the relevant .py files
/src - contains source code for the pipeline
/src/api - the neural APIs we work with, abstracted in client.py
/media - contains input and output media
/media/dev - stores temporary files made during translation
/media/prod - stores the finalized input and output files
/media/test - stores test input files
/tests - unit tests for the pipeline
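If the pipeline fails because it cannot find a folder, it may help to confirm that the layout above exists in your checkout. The following is a small sketch under that assumption; missing_dirs is a hypothetical helper, not part of the repo.

```python
from pathlib import Path

# Directories listed in the file structure, relative to the repo root.
EXPECTED_DIRS = (
    "docs",
    "src",
    "src/api",
    "media",
    "media/dev",
    "media/prod",
    "media/test",
    "tests",
)


def missing_dirs(root="."):
    """Return the expected directories that do not exist under root."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]
```

Calling missing_dirs() from the repo root returns an empty list when every listed directory is present.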

Metrics

Performance (speed) Performance (accuracy)
