massey's Introduction

🎙️ MASSY - MP3 to Text Transcription Application 📝

🌟 Overview

MASSY (MP3 to Audio Summarization SYstem) transcribes multiple MP3 audio files in bulk using OpenAI's Whisper model. Beyond bulk transcription, it splits oversized files automatically, writes SRT and plain text output, and produces a report for each run.

🚀 Key Features

  • 🗃️ Bulk transcription of MP3 files
  • 🔄 Automatic file splitting for large audio files
  • 📄 Dual output formats: SRT (SubRip Subtitle) and plain text
  • ⏱️ Precise timestamp information in SRT format
  • 📊 Detailed transcription reports
  • 🖥️ User-friendly GUI with progress tracking

🎯 Purpose

MASSY serves two primary purposes:

  1. Human-Readable Transcripts: Generate plain text transcripts for easy reading and analysis.
  2. Machine-Readable Transcripts: Create SRT files with timestamp information for advanced processing and analysis.

The SRT format allows for a deeper understanding of the audio content, including:

  • Precise timing of spoken words
  • Detection of silence or pauses
  • Improved context for AI-driven analysis
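As an illustration of what the timestamps make possible, the following minimal sketch parses SRT text and lists the gaps between consecutive cues. The function names `parse_srt` and `find_pauses` and the two-second threshold are assumptions chosen for this example; they are not part of MASSY.

```python
import re
from datetime import timedelta

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,200".
TIMING = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def parse_srt(text):
    """Return a list of (start, end, caption) tuples from SRT text."""
    cues = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            match = TIMING.search(line)
            if match:
                h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, match.groups())
                start = timedelta(hours=h1, minutes=m1, seconds=s1, milliseconds=ms1)
                end = timedelta(hours=h2, minutes=m2, seconds=s2, milliseconds=ms2)
                # Everything after the timing line is the caption text.
                cues.append((start, end, " ".join(lines[i + 1:]).strip()))
                break
    return cues


def find_pauses(cues, min_gap_seconds=2.0):
    """Return (gap_start, gap_end) pairs where the gap between cues is long enough."""
    pauses = []
    for (_, prev_end, _), (next_start, _, _) in zip(cues, cues[1:]):
        if (next_start - prev_end).total_seconds() >= min_gap_seconds:
            pauses.append((prev_end, next_start))
    return pauses


SAMPLE = """1
00:00:00,000 --> 00:00:02,500
Welcome to the meeting.

2
00:00:06,000 --> 00:00:08,000
Let's get started.
"""

cues = parse_srt(SAMPLE)
pauses = find_pauses(cues)
```

Here the 3.5-second gap between the two cues is reported as a pause. The plain text output carries no timing, so it could not support this kind of check.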

🔧 How It Works

  1. File Selection: Choose a folder containing MP3 files.
  2. Transcription: MASSY uses OpenAI's Whisper model to transcribe each audio file.
  3. File Splitting: Files larger than 24 MB are automatically split into smaller parts before transcription, and the partial transcripts are merged afterwards.
  4. Output Generation: Creates SRT and/or plain text files based on user preference.
  5. Metadata Addition: Adds relevant metadata to each transcript, including:
    • File name
    • Recording date (extracted from filename)
    • Duration
    • Transcription date
  6. Report Generation: Produces a summary report of the transcription process.
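The split-and-merge step can be sketched as follows. When a file is transcribed in parts, each part's timestamps start at zero, so they need shifting by the total duration of the preceding parts before the cues are renumbered and joined. This is a minimal sketch under that assumption, not MASSY's actual code; `format_timestamp`, `merge_chunks`, and the input layout are hypothetical.

```python
def format_timestamp(seconds):
    """Format a number of seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    total_ms = int(round(seconds * 1000))
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def merge_chunks(chunks):
    """Merge per-chunk segments into a single SRT string.

    `chunks` is a list of (chunk_duration_seconds, segments) in playback
    order; each segment is (start, end, text) relative to its own chunk.
    """
    entries = []
    offset = 0.0  # total duration of all chunks already merged
    for duration, segments in chunks:
        for start, end, text in segments:
            index = len(entries) + 1  # SRT cues are numbered from 1
            entries.append(
                f"{index}\n"
                f"{format_timestamp(start + offset)} --> "
                f"{format_timestamp(end + offset)}\n"
                f"{text.strip()}\n"
            )
        offset += duration
    # A blank line separates consecutive cues.
    return "\n".join(entries)


# Hypothetical input: a 10-minute chunk followed by a 5-minute chunk.
chunks = [
    (600.0, [(0.0, 4.0, "First part.")]),
    (300.0, [(1.5, 3.0, "Second part.")]),
]
merged = merge_chunks(chunks)
```

The second cue, which began 1.5 seconds into its own chunk, lands at 00:10:01,500 in the merged transcript.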

🚀 Getting Started

Prerequisites

  • Python 3.7+
  • OpenAI API key

Installation

  1. Clone the repository:
    git clone https://github.com/yourusername/massy.git
    
  2. Install required packages:
    pip install -r requirements.txt
    

Usage

  1. Run the application:
    python massy.py
    
  2. Enter your OpenAI API key.
  3. Select the folder containing your MP3 files.
  4. Choose your preferred output format (SRT, Text, or Both).
  5. Click "Transcribe" and monitor the progress.
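For readers who want to see roughly what the "Transcribe" step involves, here is a minimal sketch of a bulk loop. It assumes the `openai` Python package (version 1.0 or later) and its `audio.transcriptions.create` call with the hosted `whisper-1` model; MASSY's own code may be organized differently, and `transcribe_folder` is a hypothetical name.

```python
from pathlib import Path


def transcribe_folder(client, folder, response_format="srt"):
    """Transcribe every .mp3 file in `folder` and return {filename: transcript}.

    `client` is assumed to be an `openai.OpenAI` instance. Oversized files
    would need to be split before upload; that step is omitted here.
    """
    results = {}
    # Note: this pattern is case-sensitive on most non-Windows systems.
    for mp3 in sorted(Path(folder).glob("*.mp3")):
        with mp3.open("rb") as audio:
            results[mp3.name] = client.audio.transcriptions.create(
                model="whisper-1",
                file=audio,
                response_format=response_format,  # "srt" or "text"
            )
    return results


# Example usage (requires network access and a valid API key):
#   from openai import OpenAI
#   client = OpenAI(api_key="YOUR_API_KEY")
#   transcripts = transcribe_folder(client, "path/to/mp3_folder")
```

Passing the client in as an argument keeps the key handling in one place and makes the loop easy to exercise with a stand-in object.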

🧠 Integration with AI Systems

MASSY is designed to be part of a larger AI-driven analysis system. The SRT output, with its precise timing information, is particularly useful for:

  • 🔍 Semantic search and retrieval
  • 📊 Time-based sentiment analysis
  • 🗣️ Speaker diarization
  • 🔗 Contextual understanding in language models

By providing both human-readable and machine-readable formats, MASSY bridges the gap between human interpretation and advanced AI analysis.
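One way a downstream system might use the timing information is to group cues into fixed time windows, so that each window can be scored, embedded, or searched separately. The sketch below assumes cues have already been reduced to `(start_seconds, text)` pairs; `group_by_window` and the 60-second default are illustrative choices, not part of MASSY.

```python
from collections import defaultdict


def group_by_window(cues, window_seconds=60):
    """Group (start_seconds, text) cues into fixed-length time windows.

    Returns a dict mapping each window's start time (in seconds) to the
    joined text of the cues that begin inside it, ordered by time.
    """
    buckets = defaultdict(list)
    for start, text in cues:
        window_start = int(start // window_seconds) * window_seconds
        buckets[window_start].append(text)
    return {start: " ".join(texts) for start, texts in sorted(buckets.items())}


# Hypothetical cues taken from a transcript.
cues = [(2.0, "Hello."), (45.0, "How are you?"), (75.5, "Fine.")]
windows = group_by_window(cues)
```

Each window's text could then be passed to a sentiment model or an embedding index, with the window start serving as a pointer back into the audio.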

🤝 Contributing

We welcome contributions to MASSY! Please see our CONTRIBUTING.md for details on how to get started.

📄 License

This project is licensed under the MIT License - see the LICENSE.md file for details.

🙏 Acknowledgments

  • OpenAI for the Whisper model
  • All contributors and users of MASSY

🌟 Remember: MASSY is more than just a transcription tool. It's a bridge between human understanding and machine analysis of audio content!

massey's People

Contributors

taskmasterpeace

massey's Issues

Split Files Issue

Progress tracking for split files is not handled correctly. The app marks some files as "Completed" prematurely, particularly files that were split.
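One possible way to avoid this, sketched here without knowledge of MASSY's actual implementation, is to track completion per chunk and report a source file as "Completed" only once every one of its chunks has finished. The `ProgressTracker` class and its status strings are hypothetical.

```python
class ProgressTracker:
    """Reports a source file as completed only when all its chunks are done."""

    def __init__(self):
        self._expected = {}  # source file -> number of chunks to transcribe
        self._done = {}      # source file -> set of finished chunk indexes

    def register(self, source, chunk_count=1):
        """Record how many chunks a source file was split into."""
        self._expected[source] = chunk_count
        self._done[source] = set()

    def mark_chunk_done(self, source, chunk_index):
        """Mark one chunk as finished; repeated calls are counted once."""
        self._done[source].add(chunk_index)

    def status(self, source):
        done = len(self._done[source])
        expected = self._expected[source]
        if done == 0:
            return "Pending"
        if done < expected:
            return f"In progress ({done}/{expected})"
        return "Completed"


tracker = ProgressTracker()
tracker.register("interview.mp3", chunk_count=3)
tracker.mark_chunk_done("interview.mp3", 0)
status_after_first_chunk = tracker.status("interview.mp3")
```

Storing finished chunk indexes in a set means a chunk reported twice, for example after a retry, cannot push the count to completion early.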
