WhisCall

A framework for AI WhatsApp calls using Whisper, Coqui TTS, GPT-3.5 Turbo, Virtual Audio Cable, and the WhatsApp Desktop App.


Tools used in this framework

Whisper (Speech to Text)
OpenAI GPT-3.5 Turbo
Coqui TTS
Virtual Audio Cable
WhatsApp Desktop App

Installation

Install VB-Audio Cable

Note: You need two separate virtual audio cables. I am using VB-Audio Cable and Virtual Audio Cable (VAC). Install both.

Install WhatsApp Desktop Version

Download WhatsApp

Now Clone the Repo

  git clone https://github.com/skshadan/WhisCall.git
  cd WhisCall

  pip install -r requirements.txt

Find Speaker And Microphone Index

Run the code below to find the device indices of your virtual audio cables for the microphone and speaker.

import pyaudio

def list_audio_devices():
    p = pyaudio.PyAudio()
    info = p.get_host_api_info_by_index(0)
    num_devices = info.get('deviceCount')

    # Lists of devices to return
    speakers = []
    microphones = []

    # Scan through devices and add to list
    for i in range(0, num_devices):
        device = p.get_device_info_by_index(i)
        if device.get('maxInputChannels') > 0:
            microphones.append((i, device.get('name')))
        if device.get('maxOutputChannels') > 0:
            speakers.append((i, device.get('name')))

    p.terminate()
    return microphones, speakers

microphones, speakers = list_audio_devices()

print("Microphones:")
for idx, name in microphones:
    print(f"Index: {idx}, Name: {name}")

print("\nSpeakers:")
for idx, name in speakers:
    print(f"Index: {idx}, Name: {name}")
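Once the devices are listed, picking the right index by eye is easy to get wrong when several entries have similar names. The following is a minimal sketch of a lookup by name, assuming the (index, name) tuples returned by list_audio_devices() above. The helper find_device_index and the example device names are hypothetical and not part of the repository; real device names depend on the installed drivers.

```python
def find_device_index(devices, name_fragment):
    """Return the index of the first device whose name contains name_fragment.

    devices is a list of (index, name) tuples, the same shape that
    list_audio_devices() returns for microphones and speakers.
    Matching is case-insensitive. Returns None if nothing matches.
    """
    fragment = name_fragment.lower()
    for idx, name in devices:
        if fragment in name.lower():
            return idx
    return None


if __name__ == "__main__":
    # Hypothetical device names for illustration only; real names depend
    # on the installed drivers and may be truncated by the host API.
    example_microphones = [
        (0, "Microsoft Sound Mapper - Input"),
        (1, "CABLE Output (VB-Audio Virtual Cable)"),
        (2, "Line 1 (Virtual Audio Cable)"),
    ]
    print(find_device_index(example_microphones, "cable output"))
    print(find_device_index(example_microphones, "line 1"))
```

In practice you would pass the microphones or speakers list from list_audio_devices() instead of the example list, and check for None before using the result.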

Select Input & Output for Microphone and Speaker in WhatsApp App

Run the code

main.py

from voice import select_microphone, transcribe_audio
from response import generate_response, text_to_speech, PlayAudio

def main():
    mic_index = select_microphone()
    for text in transcribe_audio(mic_index):
        if text:
            gpt_response = generate_response(text)
            text_to_speech(gpt_response) 
            PlayAudio()


if __name__ == "__main__":
    main()
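The loop in main.py chains four stages: transcription, response generation, text to speech, and playback. As a rough sketch, the same loop can be dry-run without a microphone, API key, or TTS model by passing each stage in as a plain function. The name run_pipeline and its parameters are assumptions made for illustration; they are not part of the repository, and the stand-in stages below only echo text.

```python
from typing import Callable, Iterable, List


def run_pipeline(
    transcripts: Iterable[str],
    respond: Callable[[str], str],
    speak: Callable[[str], None],
    play: Callable[[], None],
) -> List[str]:
    """Run the transcribe -> respond -> speak -> play loop over transcripts.

    Empty transcripts are skipped, as in main.py. Returns the list of
    replies that were generated, which makes the loop easy to inspect.
    """
    replies = []
    for text in transcripts:
        if not text:
            continue
        reply = respond(text)
        speak(reply)
        play()
        replies.append(reply)
    return replies


if __name__ == "__main__":
    # Stand-in stages (hypothetical): they replace the real Whisper,
    # GPT, and TTS calls so the control flow can be checked in isolation.
    spoken = []
    replies = run_pipeline(
        transcripts=["hello", "", "how are you"],
        respond=lambda text: f"echo: {text}",
        speak=spoken.append,
        play=lambda: None,
    )
    print(replies)
```

Swapping the stand-ins for transcribe_audio, generate_response, text_to_speech, and PlayAudio would give the same control flow as main.py.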

When prompted, select the microphone index. The TTS and Whisper models will load, and that's it!

If you want different voices, you need to change the TTS model as follows:

Download Models From Here:

Facing Any Issues?

Feel free to ask if you are having any issues. Also, feel free to contribute.

Additional installation steps listed in the johnnvula/whiscall fork

Install Build Tools from Visual Studio 2022

Install CUDA Toolkit 12.4

Install Espeak
