gustavostz / whisper-clip

Home Page: https://openai.com/research/whisper

License: MIT License

Python 100.00%
audio-processing audio-transcription clipboard openai productivity productivity-tools python speech-recognition speech-to-text whisper

WhisperClip: One-Click Audio Transcription

[Image: Example using WhisperClip]

WhisperClip records audio, transcribes it automatically, and saves the text directly to your clipboard. With one click, spoken words become written text, ready to be pasted wherever you need it. Transcription uses OpenAI's open-source Whisper model, so it is free to use.

Features

  • Record audio with a simple click.
  • Automatically transcribe audio using Whisper (free).
  • Option to save transcriptions directly to the clipboard.

Installation

Prerequisites

  • Python 3.8 or higher
  • A CUDA-capable GPU is highly recommended for faster transcription, but it is not required. WhisperClip can also run on a CPU.
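
The prerequisites above can be checked with a short script. This is a minimal sketch, not part of WhisperClip: the function name check_environment is made up for illustration, and it assumes that PyTorch, once installed, is importable as torch. It runs before PyTorch is installed too, in which case CUDA is reported as unavailable.

```python
import importlib.util
import sys


def check_environment():
    """Report whether the prerequisites listed above appear to be met."""
    report = {
        # WhisperClip asks for Python 3.8 or higher.
        "python_ok": sys.version_info >= (3, 8),
        # find_spec returns None when the package cannot be found.
        "torch_installed": importlib.util.find_spec("torch") is not None,
        "cuda_available": False,
    }
    if report["torch_installed"]:
        # Imported lazily so the check also works before PyTorch is installed.
        import torch

        report["cuda_available"] = torch.cuda.is_available()
    return report


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```

If cuda_available is False, WhisperClip should still work, only more slowly, since it falls back to the CPU.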

Setting Up the Environment

  1. Clone the repository:

    git clone https://github.com/gustavostz/whisper-clip.git
    cd whisper-clip
    
  2. Install PyTorch if you don't have it already. Refer to PyTorch's website for installation instructions.

  3. Install the required dependencies:

    pip install -r requirements.txt
    

Choosing the Right Model

Choose the Whisper model that fits your GPU's VRAM. The table below lists the available models with their approximate VRAM requirements and their speed relative to the large model:

Size     Required VRAM   Relative speed
tiny     ~1 GB           ~32x
base     ~1 GB           ~16x
small    ~2 GB           ~6x
medium   ~5 GB           ~2x
large    ~10 GB          1x

For English-only applications, .en models (e.g., tiny.en, base.en) tend to perform better.
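
The table lookup can be expressed as a small helper. The sketch below is illustrative only and is not part of WhisperClip: pick_model is a hypothetical name, the VRAM figures are the approximate values from the table, and the selection rule (largest model whose listed VRAM fits) is an assumption that leaves no headroom for other GPU workloads.

```python
# (model name, approximate VRAM in GB), smallest first, taken from the table above.
MODELS = [("tiny", 1), ("base", 1), ("small", 2), ("medium", 5), ("large", 10)]


def pick_model(vram_gb, english_only=False):
    """Return the largest model whose listed VRAM requirement fits in vram_gb."""
    fitting = [name for name, needed in MODELS if needed <= vram_gb]
    # Fall back to the smallest model when nothing fits (for example, CPU-only use).
    name = fitting[-1] if fitting else "tiny"
    # Whisper provides .en variants for every size except large.
    if english_only and name != "large":
        name += ".en"
    return name
```

For example, pick_model(6, english_only=True) selects medium.en, while pick_model(12, english_only=True) selects large because no large.en model exists.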

To change the model, set the model_name value in config.json to the desired model name.
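
Editing config.json by hand is enough, but the change can also be scripted. The following is a sketch under two assumptions: config.json is a JSON object, and model_name is a top-level key in it. The helper name set_model_name is hypothetical. All other keys in the file are left as they are.

```python
import json
from pathlib import Path


def set_model_name(config_path, model_name):
    """Set the top-level model_name key in a JSON config file and return the config."""
    path = Path(config_path)
    config = json.loads(path.read_text(encoding="utf-8"))
    config["model_name"] = model_name  # other keys are preserved unchanged
    path.write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return config
```

Calling set_model_name("config.json", "small.en") from the repository root would switch to the English-only small model on the next start.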

Usage

Run the application:

    python main.py

  • Click the microphone button to start and stop recording.
  • If "Save to Clipboard" is checked, the transcription will be copied to your clipboard automatically.
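
The transcribe-then-copy flow described above can be sketched as follows. This is not WhisperClip's actual code. The function names are made up for illustration, the transcriber and clipboard function are passed in as parameters so the flow can be tried without a GPU, and the use of pyperclip for the clipboard is an assumption. The whisper.load_model and model.transcribe calls come from the openai-whisper package.

```python
def transcribe_to_clipboard(audio_path, transcribe, copy, save_to_clipboard=True):
    """Transcribe audio_path and, if requested, hand the text to copy()."""
    text = transcribe(audio_path).strip()
    if save_to_clipboard:
        copy(text)
    return text


def make_whisper_transcriber(model_name="base"):
    """Build a transcribe(path) function backed by a local Whisper model."""
    import whisper  # the openai-whisper package, imported only when needed

    model = whisper.load_model(model_name)
    # transcribe() returns a dict; the full transcription is under "text".
    return lambda path: model.transcribe(path)["text"]


# Example (requires openai-whisper and, as an assumption, pyperclip;
# "recording.wav" is a placeholder file name):
#   import pyperclip
#   transcribe_to_clipboard("recording.wav", make_whisper_transcriber("base"), pyperclip.copy)
```

Passing save_to_clipboard=False mirrors leaving the "Save to Clipboard" box unchecked: the text is returned but the clipboard is not touched.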

Configuration

  • The default shortcut for toggling recording is Alt+Shift+R. You can modify this in the config.json file.
  • You can also change the Whisper model used for transcription in the config.json file.
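
A shortcut written as Alt+Shift+R can be split into modifiers and a key before it is handed to a hotkey library. The README does not say how WhisperClip stores or interprets the shortcut, so the sketch below is purely illustrative: parse_shortcut is a hypothetical helper, and the set of recognized modifiers is an assumption.

```python
# Assumed modifier names; a real hotkey library may accept more or use other spellings.
MODIFIERS = {"alt", "ctrl", "shift"}


def parse_shortcut(shortcut):
    """Split a string like 'Alt+Shift+R' into (modifiers, key), both lowercased."""
    parts = [p.strip().lower() for p in shortcut.split("+") if p.strip()]
    modifiers = frozenset(p for p in parts if p in MODIFIERS)
    keys = [p for p in parts if p not in MODIFIERS]
    if len(keys) != 1:
        raise ValueError(f"expected exactly one non-modifier key in {shortcut!r}")
    return modifiers, keys[0]
```

With the default shortcut, parse_shortcut("Alt+Shift+R") yields the modifiers alt and shift and the key r. A string without a regular key, such as "Alt+Shift", raises ValueError.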

Feedback

If there's interest in a more user-friendly, executable version of WhisperClip, I'd be happy to consider creating one. Your feedback and suggestions are welcome! Just let me know through the GitHub issues.

Acknowledgments

This project uses OpenAI's Whisper for audio transcription.
