VoiceNoteAssistantGPT

VoiceNoteAssistantGPT is a web application that transcribes audio recordings and generates responses to them using the OpenAI GPT-3.5 Turbo model. Users upload audio files in M4A format and receive a transcription along with an AI-generated response.

Installation

  1. Clone the repository:
$ git clone <repository_url>
  2. Install the required dependencies:
$ pip install -r requirements.txt
  3. Set up environment variables:
  • Create a .env file in the root directory of the project.
  • Add the following environment variables to the .env file:
OPENAI_API_KEY=<your_openai_api_key>
PUSHOVER_USER_KEY=<your_pushover_user_key>
PUSHOVER_API_TOKEN=<your_pushover_api_token>
USERNAME=<your_username>
PASSWORD=<your_password>
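
Before starting the server, it can help to confirm that every variable above is actually set. The following is a minimal sketch using only the standard library. The helper name `missing_settings` is hypothetical and not part of the project, and the sketch assumes the values have already been loaded into the process environment (for example by dotenv, which the project lists in its credits).

```python
import os

# Variable names taken from the .env template above.
REQUIRED_SETTINGS = (
    "OPENAI_API_KEY",
    "PUSHOVER_USER_KEY",
    "PUSHOVER_API_TOKEN",
    "USERNAME",
    "PASSWORD",
)


def missing_settings(env=None):
    """Return the required variable names that are unset or blank.

    Hypothetical helper for illustration; not part of the project.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_SETTINGS if not env.get(name, "").strip()]
```

Calling `missing_settings()` with no arguments inspects the current environment; passing a dictionary makes the check easy to exercise without touching real credentials.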

Usage

  1. Start the application:

    $ uvicorn main:app --reload

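    Or, to make it accessible on your local network:

    $ uvicorn main:app --host 0.0.0.0 --port 8000

    The application will be accessible at http://localhost:8000. With --host 0.0.0.0, other devices on the network can reach it on port 8000 at the host machine's IP address.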

  2. Open your web browser and navigate to http://localhost:8000 to access the application.

  3. Authenticate using the username and password you set in the .env file.

  4. Upload an M4A audio file using the provided form. (iPhone shortcut coming soon)

  5. The application transcribes the audio file in the background and generates a response using the OpenAI GPT-3.5 Turbo model.

  6. Once transcription and response generation are complete, the transcription and the AI-generated response appear on the main page.
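
The README does not describe how step 3 checks credentials. One common approach, shown here as an assumption and not as the project's actual code, is to compare the submitted values with USERNAME and PASSWORD from the environment using a constant-time comparison. The function name `credentials_match` is hypothetical.

```python
import os
import secrets


def credentials_match(username, password, env=None):
    """Check submitted credentials against USERNAME and PASSWORD.

    Hypothetical helper; the project's real check may differ.
    """
    env = os.environ if env is None else env
    expected_user = env.get("USERNAME", "")
    expected_pass = env.get("PASSWORD", "")
    # Refuse to authenticate when credentials were never configured.
    if not expected_user or not expected_pass:
        return False
    # compare_digest does not stop at the first differing character,
    # which limits what an attacker can learn from response timing.
    user_ok = secrets.compare_digest(username.encode(), expected_user.encode())
    pass_ok = secrets.compare_digest(password.encode(), expected_pass.encode())
    return user_ok and pass_ok
```

Both comparisons run before the result is combined, so a wrong username and a wrong password take the same code path.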

Logging

The application logs events and errors to a log file named app.log. The log file is located in the same directory as the application.
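
A log file like this can be produced with Python's standard logging module. The sketch below is an assumption about the setup, not the project's actual configuration. The function name `configure_logging`, the logger name `voicenote`, and the message format are all illustrative.

```python
import logging


def configure_logging(path="app.log"):
    """Return a logger that appends events and errors to the given file.

    Hypothetical setup for illustration; the project's format may differ.
    """
    logger = logging.getLogger("voicenote")
    logger.setLevel(logging.INFO)
    # Avoid attaching duplicate handlers when called more than once.
    if not logger.handlers:
        handler = logging.FileHandler(path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

A relative path such as app.log resolves against the current working directory, so the file lands next to the application only when the server is started from the project root.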

Credits

VoiceNoteAssistantGPT is built using the following libraries and APIs:

OpenAI GPT-3.5 Turbo: Generates the AI responses.
FastAPI: A modern, fast (high-performance) web framework for building APIs with Python.
Whisper: Speech recognition using the Whisper ASR system.
dotenv: Loads environment variables from a .env file.
Pushover: A service for sending push notifications to various devices and platforms.

License

This project is licensed under the MIT License.
