This project forked from skshadan/whiscall

A framework for AI WhatsApp calls using Whisper, Coqui TTS, GPT-3.5 Turbo, Virtual Audio Cable, and the WhatsApp Desktop App.

License: MIT License

WhisCall

Tools used in this framework

  • Whisper (Speech to Text)
  • OpenAI GPT-3.5 Turbo
  • Coqui TTS
  • Virtual Audio Cable
  • WhatsApp Desktop App

Installation

Install Build Tools from Visual Studio 2022

Download Visual Studio Installer

Install CUDA Toolkit 12.4

Download CUDA Toolkit 12.4

Install eSpeak

Download eSpeak from here
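
Once the CUDA Toolkit and eSpeak are installed, a quick check that their command-line tools are reachable can save debugging later. The sketch below is only a convenience and is not part of the repository. It assumes the executables are named `nvcc` and `espeak`; on Windows the eSpeak installer may not add its folder to PATH, so a "not found" result does not necessarily mean the install failed.

```python
import shutil


def find_tools(tool_names):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in tool_names}


if __name__ == "__main__":
    # Assumed executable names: nvcc (CUDA compiler) and espeak.
    for name, path in find_tools(["nvcc", "espeak"]).items():
        print(f"{name}: {path or 'not found on PATH'}")
```

If a tool is reported as missing, adding its install directory to PATH and reopening the terminal is usually enough.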

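Install VB-Audio Cable and Virtual Audio Cable (VAC)

Note: You need two separate virtual audio cables, one for each direction of the call audio (WhatsApp's speaker output into the script, and the script's speech back into WhatsApp's microphone). I am using VB-Audio Cable and Virtual Audio Cable (VAC). Install both.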

Install WhatsApp Desktop

Download WhatsApp

Now Clone the Repo and Install the Requirements

  git clone https://github.com/skshadan/WhisCall.git
  cd WhisCall
  pip install -r requirements.txt
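
After the install finishes, it can help to confirm that the main packages are importable before starting a call. The sketch below is not part of the repository. The module names in `ASSUMED_MODULES` are guesses based on the tools listed above; the authoritative list is whatever `requirements.txt` contains.

```python
import importlib.util

# Assumed import names for the tools listed above; adjust to requirements.txt.
ASSUMED_MODULES = ["pyaudio", "whisper", "TTS", "openai"]


def missing_modules(names):
    """Return the names from `names` that cannot be located for import."""
    missing = []
    for name in names:
        # find_spec returns None when a top-level module is not installed.
        if importlib.util.find_spec(name) is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(ASSUMED_MODULES)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All assumed modules were found.")
```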

Find Speaker And Microphone Index

Run the code below to find the device indices of your virtual audio cables for the microphone and speaker.

import pyaudio

def list_audio_devices():
    p = pyaudio.PyAudio()
    # Only devices of the first host API (index 0) are listed
    host_api = 0
    num_devices = p.get_host_api_info_by_index(host_api).get('deviceCount')

    # Lists of (device index, name) tuples to return
    speakers = []
    microphones = []

    # Scan through the host API's devices and record each global device index
    for i in range(num_devices):
        device = p.get_device_info_by_host_api_device_index(host_api, i)
        if device.get('maxInputChannels') > 0:
            microphones.append((device.get('index'), device.get('name')))
        if device.get('maxOutputChannels') > 0:
            speakers.append((device.get('index'), device.get('name')))

    p.terminate()
    return microphones, speakers

microphones, speakers = list_audio_devices()

print("Microphones:")
for idx, name in microphones:
    print(f"Index: {idx}, Name: {name}")

print("\nSpeakers:")
for idx, name in speakers:
    print(f"Index: {idx}, Name: {name}")
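
Device indices can change when audio hardware is plugged in or removed, so looking a cable up by name may be more robust than memorizing a number. The helper below is a hypothetical addition, not part of the repository. It works on lists of `(index, name)` tuples like `microphones` and `speakers`, and the sample names are illustrative only.

```python
def find_device_index(devices, name_fragment):
    """Return the index of the first device whose name contains the fragment.

    `devices` is a list of (index, name) tuples. The match is
    case-insensitive. Returns None when nothing matches.
    """
    fragment = name_fragment.lower()
    for idx, name in devices:
        if fragment in name.lower():
            return idx
    return None


# Illustrative data only; real names and indices depend on your machine.
sample_microphones = [
    (1, "Microphone (Realtek Audio)"),
    (2, "CABLE Output (VB-Audio Virtual Cable)"),
]

print(find_device_index(sample_microphones, "cable output"))
```

With real data you would pass the `microphones` or `speakers` list instead of the sample list.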

Select the Virtual Audio Cables as Microphone and Speaker in the WhatsApp App

Run the code

main.py

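from voice import select_microphone, transcribe_audio
from response import generate_response, text_to_speech, PlayAudio

def main():
    # Choose the input device that carries the caller's audio
    mic_index = select_microphone()
    # transcribe_audio yields one transcription at a time
    for text in transcribe_audio(mic_index):
        if text:
            # Generate a reply, synthesize it, then play it into the call
            gpt_response = generate_response(text)
            text_to_speech(gpt_response)
            PlayAudio()

if __name__ == "__main__":
    main()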

When prompted, select the microphone index. The TTS and Whisper models will load, and that's it!
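
The loop in main.py is a three-stage pipeline: transcribe, generate a reply, speak it. To try that control flow without audio hardware or API keys, the stages can be passed in as plain functions. This is a sketch only; `run_pipeline` and `fake_respond` are hypothetical names and do not exist in the repository.

```python
def run_pipeline(transcripts, respond, speak):
    """Feed each non-empty transcript through respond() and then speak().

    Returns the list of replies so the flow can be inspected afterwards.
    """
    replies = []
    for text in transcripts:
        if not text:
            continue  # skip empty transcriptions, as main() does
        reply = respond(text)
        speak(reply)
        replies.append(reply)
    return replies


# Hypothetical stand-in for the real language-model call.
def fake_respond(text):
    return f"You said: {text}"


spoken = []  # collects what would have been sent to text-to-speech
replies = run_pipeline(["hello", "", "how are you"], fake_respond, spoken.append)
print(replies)
```

Swapping the stand-ins for the real `generate_response` and a function that calls `text_to_speech` followed by `PlayAudio` should give the same behavior as main.py.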

If you want different voices, you need to change the TTS model as follows:

Download Models From Here:
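
As a rough illustration only: the project's `response.py` is not shown in this README, so this sketch assumes the voice is loaded through Coqui's Python API (`TTS.api.TTS`). Under that assumption, switching voices amounts to passing a different model name. The `VOICES` table, `resolve_voice`, and `synthesize` are hypothetical names; the two model identifiers come from Coqui's published English models.

```python
# Hypothetical mapping from friendly voice names to Coqui model identifiers.
VOICES = {
    "default": "tts_models/en/ljspeech/tacotron2-DDC",
    "glow": "tts_models/en/ljspeech/glow-tts",
}


def resolve_voice(voice):
    """Return the Coqui model identifier for a friendly voice name."""
    if voice not in VOICES:
        raise KeyError(f"Unknown voice: {voice!r}")
    return VOICES[voice]


def synthesize(text, voice="default", out_path="reply.wav"):
    """Render `text` to a WAV file using the selected voice."""
    # Imported lazily so this file loads even when Coqui TTS is not installed.
    from TTS.api import TTS

    tts = TTS(model_name=resolve_voice(voice))  # downloads the model on first use
    tts.tts_to_file(text=text, file_path=out_path)
    return out_path
```

Calling `synthesize("Hello", voice="glow")` would then write `reply.wav` using the second model, provided Coqui TTS is installed.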

Facing Any Issues?

Feel free to ask if you are having any issues. Also, feel free to contribute.
