This project forked from jakecyr/chatgpt-voice-assistant

A chatbot that uses speech-to-text for input, sends the text to OpenAI's ChatGPT text generation model, and speaks the response using text-to-speech.

ChatGPT Voice Assistant

A simple interface to the OpenAI ChatGPT model, with speech-to-text for input and text-to-speech for output. chatgpt-voice-assistant uses Google Translate's free text-to-speech API for audio input and output (not OpenAI Whisper).

Setup

Mac Prerequisites

Install dependencies:

brew install portaudio
brew link portaudio

Update your pydistutils config file for portaudio usage by running the following:

echo "[build_ext]" >> $HOME/.pydistutils.cfg
echo "include_dirs="`brew --prefix portaudio`"/include/" >> $HOME/.pydistutils.cfg
echo "library_dirs="`brew --prefix portaudio`"/lib/" >> $HOME/.pydistutils.cfg

Install from PyPI

Run the following to install the gptassist CLI application:

pip install chatgpt-voice-assistant

Install from Source

  1. Install Poetry (see the official docs, or run pip install poetry)
  2. Install all dependencies with poetry install

Running the Script

Either set the OPENAI_API_KEY environment variable before running the script, or pass your secret key on the command line, as in the example below:

export OPENAI_API_KEY=<OPENAI API SECRET KEY HERE>
gptassist

# OR

gptassist --open-ai-key=<OPENAI API SECRET KEY HERE>

Or, if installed from source with Poetry:

poetry run gptassist --open-ai-key=<OPENAI API SECRET KEY HERE>
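
The two ways of supplying the key can be pictured with a minimal Python sketch. This is not the project's actual code: the function name resolve_api_key is hypothetical, and the precedence (command-line flag first, then the environment variable) is an assumption made for illustration.

```python
import argparse
import os


def resolve_api_key(argv, environ=None):
    """Return the key from --open-ai-key if given, else from OPENAI_API_KEY.

    Hypothetical helper: the flag-over-environment precedence is an assumption.
    """
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(prog="gptassist")
    parser.add_argument("--open-ai-key", default=None)
    # Ignore any other options; this sketch only cares about the key.
    args, _unknown = parser.parse_known_args(argv)
    key = args.open_ai_key or environ.get("OPENAI_API_KEY")
    if not key:
        raise SystemExit("Set OPENAI_API_KEY or pass --open-ai-key.")
    return key
```

Calling resolve_api_key(sys.argv[1:]) at startup would fail early with a readable message when neither source provides a key.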

Start speaking and turn up your volume to hear the AI assistant respond.

Say the word "exit" or hit Ctrl+C in your terminal to stop the application.
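
As a rough illustration of how a spoken exit word could be handled, the sketch below classifies each transcribed utterance. It also covers the optional wake word from the Options section. The function name decide_action and the whole-word matching rule are assumptions, not the application's documented behavior.

```python
import re


def decide_action(transcript, safe_word="exit", wake_word=None):
    """Classify a transcribed utterance as 'exit', 'ignore', or 'respond'.

    Hypothetical helper: whole-word, case-insensitive matching is assumed.
    """
    # Split into lowercase words, dropping punctuation such as "exit."
    words = re.findall(r"[\w']+", transcript.lower())
    if safe_word.lower() in words:
        return "exit"
    # With a wake word configured, stay silent unless it was spoken.
    if wake_word is not None and wake_word.lower() not in words:
        return "ignore"
    return "respond"
```

A listening loop would call decide_action on each transcript and stop when it returns "exit".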

Options

Below is the help menu from the gptassist CLI detailing all available options:

-h, --help
    show this help message and exit

--log-level LOG_LEVEL
    Whether to print at the debug level or not.

--input-device-name INPUT_DEVICE_NAME
    The input device name.

--lang LANG
    The language to listen for when running speech to text (ex. en or fr).

--max-tokens MAX_TOKENS
    Max OpenAI completion tokens to use for text generation.

--tld TLD
    Top level domain (ex. com or com.au).

--safe-word SAFE_WORD
    Word to speak to exit the application.

--wake-word WAKE_WORD
    (Optional) Word to trigger a response.

--open-ai-key OPEN_AI_KEY
    Required. Open AI Secret Key (or set OPENAI_API_KEY environment variable)

--tts {apple,google}
    Choose a text-to-speech engine ('apple' (say) or 'google' (gtts), defaults to 'google')

--speech-rate SPEECH_RATE
    The rate at which to play speech. 1.0=normal
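
For readers who want to script around these options, here is a hedged argparse sketch that mirrors the flags listed above. It is a reconstruction from the help text, not the project's parser. The defaults for --lang, --tld, --safe-word, and --tts follow this README; the defaults for --log-level, --max-tokens, and --speech-rate are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented gptassist options (sketch only)."""
    parser = argparse.ArgumentParser(prog="gptassist")
    parser.add_argument("--log-level", default="INFO")  # default assumed
    parser.add_argument("--input-device-name", default=None)
    parser.add_argument("--lang", default="en")
    parser.add_argument("--max-tokens", type=int, default=None)  # default assumed
    parser.add_argument("--tld", default="com")
    parser.add_argument("--safe-word", default="exit")
    parser.add_argument("--wake-word", default=None)
    parser.add_argument("--open-ai-key", default=None)
    parser.add_argument("--tts", choices=["apple", "google"], default="google")
    parser.add_argument("--speech-rate", type=float, default=1.0)  # default assumed
    return parser
```

For example, build_parser().parse_args(["--lang=fr", "--tld=fr"]) yields a namespace with lang and tld set to "fr" and tts left at "google".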

Specifying an Output Language Accent

Specify both the --lang (LANGUAGE) and --tld (TOP_LEVEL_DOMAIN) options to override the default, English (United States):

gptassist --open-ai-key=<OPENAI_KEY> --lang=en --tld=com

Language Examples

  • English (United States) DEFAULT
    • LANGUAGE=en TOP_LEVEL_DOMAIN=com
  • English (Australia)
    • LANGUAGE=en TOP_LEVEL_DOMAIN=com.au
  • English (India)
    • LANGUAGE=en TOP_LEVEL_DOMAIN=co.in
  • French (France)
    • LANGUAGE=fr TOP_LEVEL_DOMAIN=fr

See the Localized 'accents' section of the gTTS docs for more information.
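
The language and top-level-domain pairs above correspond to the lang and tld parameters of gTTS. The sketch below shows one way to pass them through. The ACCENTS table and both function names are illustrative, and the real application may wire this up differently. The synthesize function needs the gTTS package and network access, so it imports gTTS only when called.

```python
# Accent name -> (lang, tld), taken from the examples in this README.
ACCENTS = {
    "English (United States)": ("en", "com"),
    "English (Australia)": ("en", "com.au"),
    "English (India)": ("en", "co.in"),
    "French (France)": ("fr", "fr"),
}


def gtts_kwargs(accent):
    """Return the lang/tld keyword arguments for a named accent."""
    lang, tld = ACCENTS[accent]
    return {"lang": lang, "tld": tld}


def synthesize(text, accent, path):
    """Write spoken audio for `text` to `path` using gTTS (needs network)."""
    from gtts import gTTS  # imported lazily so the table works without gTTS

    gTTS(text=text, **gtts_kwargs(accent)).save(path)
```

For instance, synthesize("G'day", "English (Australia)", "hello.mp3") would request Australian-accented English.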
