Speech Condenser

Speech condenser is a tool for reducing the size of a dialogue.

Pipeline

(Pipeline diagram)

It combines several tools to reduce the size of a dialogue. Each step of the above pipeline runs inside a container.

Steps:

  1. Audio extraction - Extracts the audio from the video file.
  2. Speaker diarization - Identifies the speakers in the audio file.
  3. Split audio - Splits the audio file into smaller chunks based on the speaker diarization.
  4. Speech to text - Transcribes the audio chunks into text.
  5. Combine ASR and diarization - Combines the results of the ASR and diarization to get the text for each speaker as a dialogue.
  6. Summarization - Summarizes the dialogue.
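The ordering of the steps above can be sketched as a dry run that only prints one container command per step. This is an illustration, not the project's actual pipeline.sh: the `sc-` image names, the `/data` mount point, and the `step_command` and `plan_pipeline` helpers are all assumptions.

```shell
#!/bin/sh
# Hypothetical sketch; the real pipeline.sh may be organized differently.

# Container runtime, as configured via SC_RUNTIME (docker or podman).
SC_RUNTIME="${SC_RUNTIME:-docker}"

# Step names in pipeline order. The "sc-" image names built from them are assumptions.
STEPS="audio-extraction diarization split-audio speech-to-text combine summarization"

# Print the container command for one step instead of executing it (dry run).
step_command() {
  echo "$SC_RUNTIME run --rm -v $PWD/data:/data sc-$1"
}

# Emit the commands in order; each step would read what the previous one
# wrote to the shared data directory.
plan_pipeline() {
  for step in $STEPS; do
    step_command "$step"
  done
}

plan_pipeline
```

Sharing a single mounted data directory between the containers is one simple way to hand results from one step to the next; the repository may do this differently.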

Installation

The setup uses Docker or Podman to run the containers. A set of local scripts is provided to run the pipeline.

  • build.sh - Builds the containers.
  • pipeline.sh - Runs the pipeline.
  • yt-pipeline.sh - Runs the pipeline on a YouTube video.

Videos need to be placed in the data/input directory. yt-pipeline.sh also uses this directory to download and cache the video. The output is written to the data/output directory.

Make sure to create a .env file based on the .env.example file and provide the required values:

  • SC_RUNTIME - The runtime to use for the containers. Either docker or podman.
  • HF_TOKEN - The Hugging Face token to use for the summarization step.
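A missing or mistyped value in .env tends to surface only once a container starts, so it can help to check both variables up front. The following is a minimal sketch; `check_env` is a hypothetical helper and not part of the repository, and it assumes .env uses plain `KEY=value` lines that a shell can source.

```shell
#!/bin/sh
# Load .env from the current directory if present, exporting every variable it sets.
if [ -f .env ]; then
  set -a
  . ./.env
  set +a
fi

# check_env (hypothetical helper): return non-zero when a required value is
# missing or SC_RUNTIME is not one of the two supported runtimes.
check_env() {
  case "${SC_RUNTIME:-}" in
    docker|podman) ;;
    *) echo "SC_RUNTIME must be docker or podman" >&2; return 1 ;;
  esac
  if [ -z "${HF_TOKEN:-}" ]; then
    echo "HF_TOKEN is not set" >&2
    return 1
  fi
  return 0
}
```

Calling `check_env || exit 1` before starting the pipeline would stop early with a readable message.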

Make sure to visit hf.co/pyannote/speaker-diarization and hf.co/pyannote/segmentation and accept the user conditions. This is required in order to run the speaker diarization.

Usage

Run against a local video file:

./pipeline.sh "data/input/video.mp4"

Run against a YouTube video:

./yt-pipeline.sh "https://www.youtube.com/watch?v=video_id"
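To process several local videos in one go, the single-file invocation can be wrapped in a loop. This is a sketch only: `run_all` is a hypothetical helper, and matching on the `.mp4` extension is an assumption taken from the example path above.

```shell
#!/bin/sh
# run_all (hypothetical helper): call a command once per .mp4 file in a directory.
#   $1 - directory to scan
#   $2 - command to run with each video path as its only argument
run_all() {
  for video in "$1"/*.mp4; do
    # Skip the unexpanded pattern when the directory has no .mp4 files.
    [ -e "$video" ] || continue
    "$2" "$video"
  done
}

# Example: run_all data/input ./pipeline.sh
```

The videos are processed one after another, and a failure on one file does not stop the remaining ones.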

Contributors

nezhar
