pkarpovich / kira-client

An AI-powered voice automation tool for IoT that combines voice-triggered commands, OpenAI-driven intent recognition, and HTTP server management to control smart devices.

License: MIT License

Languages: Python 98.38%, Makefile 0.56%, JavaScript 1.06%
Topics: ai-assistant, intent-classification, porcupine, trigger-word-detection, whisper

Kira Client

Overview

Kira Client is an AI-driven application for automating IoT tasks with voice commands. It listens for a specific trigger word, processes the spoken instructions that follow to determine the user's intent, and executes the matching action on IoT devices. Intent recognition is handled through OpenAI's API.

Key Features

  • Voice Activation: Activates upon hearing a pre-defined trigger word.
  • Audio Capture & Analysis: Records and analyzes spoken instructions.
  • Advanced Intent Recognition: Uses the OpenAI API to interpret user intents.
  • Dynamic IoT Interaction: Sends requests to IoT devices based on interpreted intents.
  • Visual Feedback System: Uses an LED strip to provide visual status updates.

Enhanced Intent Recognition

Kira Client uses an intent recognition system powered by OpenAI. The system matches each user command against a configured list of intents. Each intent has a name, a description, and an associated action.

Example intent configuration:

[
  {
    "name": "NewMeeting",
    "description": "create a new meeting",
    "action": {
      "type": "request",
      "options": {
        "url": "http://localhost:8090/execute?name=Create Google Meet",
        "method": "GET"
      }
    }
  }
]
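The configuration above can be read and turned into an outgoing request with a few lines of standard-library Python. The following is a minimal sketch, not the project's own loader: the names `Intent`, `load_intents`, and `build_request` are hypothetical, and it assumes that "request" is the only action type and that spaces in the query string should be percent-encoded before sending.

```python
import json
from dataclasses import dataclass
from urllib.parse import quote, urlsplit, urlunsplit
from urllib.request import Request


@dataclass(frozen=True)
class Intent:
    """One configured intent (hypothetical container, mirrors the JSON fields)."""
    name: str
    description: str
    action: dict


def load_intents(raw: str) -> list[Intent]:
    """Parse a JSON array of intents and check the three documented fields."""
    intents = []
    for entry in json.loads(raw):
        missing = {"name", "description", "action"} - set(entry)
        if missing:
            raise ValueError(f"intent is missing fields: {sorted(missing)}")
        intents.append(Intent(entry["name"], entry["description"], entry["action"]))
    return intents


def build_request(action: dict) -> Request:
    """Turn a "request" action into a urllib Request without sending it."""
    if action.get("type") != "request":
        raise ValueError(f"unsupported action type: {action.get('type')!r}")
    options = action["options"]
    parts = urlsplit(options["url"])
    # Percent-encode the query so "Create Google Meet" becomes a valid URL.
    # Assumption: the configured URL is not already percent-encoded.
    query = quote(parts.query, safe="=&")
    url = urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))
    return Request(url, method=options.get("method", "GET"))


EXAMPLE = """
[
  {
    "name": "NewMeeting",
    "description": "create a new meeting",
    "action": {
      "type": "request",
      "options": {
        "url": "http://localhost:8090/execute?name=Create Google Meet",
        "method": "GET"
      }
    }
  }
]
"""

intents = load_intents(EXAMPLE)
request = build_request(intents[0].action)
print(request.get_method(), request.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send it; the sketch stops short of that so it can run without a device listening on port 8090.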

Workflow

  1. Waits for the trigger word to start the listening mode.
  2. Records and transcribes the subsequent spoken instructions.
  3. Interprets the intent with the OpenAIClient, using the provided template.
  4. Executes actions on IoT devices based on the recognized intent.
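Steps 3 and 4 of this workflow can be sketched as a small function that takes a transcript, asks a classifier for an intent name, and runs the matching action. This is an illustration under stated assumptions: `build_prompt` and `handle_utterance` are hypothetical names, the prompt wording is invented and is not the project's template, and `classify` stands in for the call to the OpenAI API so the sketch runs offline. Trigger-word detection and transcription (steps 1 and 2) are left out.

```python
from typing import Callable, Optional


def build_prompt(intents: list[dict], transcript: str) -> str:
    """Render a hypothetical prompt listing each intent by name and description."""
    lines = [f"- {intent['name']}: {intent['description']}" for intent in intents]
    return (
        "Pick the intent that best matches the user's request.\n"
        "Reply with the intent name only, or NONE.\n"
        + "\n".join(lines)
        + f"\nRequest: {transcript}"
    )


def handle_utterance(
    transcript: str,
    intents: list[dict],
    classify: Callable[[str], str],
    execute: Callable[[dict], None],
) -> Optional[str]:
    """Classify a transcript and run the matching action.

    Returns the recognized intent name, or None when nothing matched.
    """
    answer = classify(build_prompt(intents, transcript)).strip()
    by_name = {intent["name"]: intent for intent in intents}
    intent = by_name.get(answer)
    if intent is None:
        return None  # unknown name or explicit NONE: do nothing
    execute(intent["action"])
    return answer


# Usage with stand-ins for the model call and the device request.
INTENTS = [
    {
        "name": "NewMeeting",
        "description": "create a new meeting",
        "action": {"type": "request", "options": {"url": "http://localhost:8090/execute", "method": "GET"}},
    }
]
executed: list[dict] = []
result = handle_utterance(
    "please set up a meeting",
    INTENTS,
    classify=lambda prompt: "NewMeeting\n",  # fake model reply
    execute=executed.append,                   # record instead of sending
)
print(result, len(executed))
```

Injecting `classify` and `execute` as callables keeps the matching logic testable without network access or real devices.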

HTTP Server Functionality

Kira Client incorporates an HTTP server to manage voice trigger detection and initiate intent recognition.

Key Endpoints

  • /intents/start-recognition: Pauses the voice trigger detector and starts listening for text to recognize the intent.
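How such an endpoint might pause a detector can be sketched with the standard library alone. Everything beyond the path `/intents/start-recognition` is an assumption here: the README does not state the HTTP method, port, or response format, so this sketch picks POST and a 202 status, and it models "paused" with a `threading.Event` named `detector_paused`. The names `KiraHandler`, `make_server`, and `request_recognition` are hypothetical.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.request import Request, urlopen

# Set while the trigger-word detector should stay idle (assumed mechanism).
detector_paused = threading.Event()


class KiraHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        if self.path == "/intents/start-recognition":
            detector_paused.set()  # pause trigger detection
            self._reply(202, b"recognition started\n")
        else:
            self._reply(404, b"not found\n")

    def _reply(self, status: int, body: bytes) -> None:
        self.send_response(status)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the example quiet


def make_server(port: int = 0) -> ThreadingHTTPServer:
    """Bind to localhost; port 0 lets the OS choose a free port."""
    return ThreadingHTTPServer(("127.0.0.1", port), KiraHandler)


def request_recognition(base_url: str) -> int:
    """Client side: call the endpoint and return the HTTP status code."""
    req = Request(base_url + "/intents/start-recognition", data=b"", method="POST")
    with urlopen(req, timeout=5) as response:
        return response.status


if __name__ == "__main__":
    server = make_server()
    threading.Thread(target=server.serve_forever, daemon=True).start()
    port = server.server_address[1]
    print(request_recognition(f"http://127.0.0.1:{port}"), detector_paused.is_set())
    server.shutdown()
    server.server_close()
```

A detector loop would check `detector_paused.is_set()` before processing audio and clear the event once recognition finishes.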

Installation

Prerequisites

  • Python 3.11+
  • Poetry
  • PortAudio

Steps

  1. Clone the repository
  2. Install dependencies using Poetry
     poetry install
  3. Run the application
     poetry run python kira_client/main.py

License

This project is licensed under the MIT License.

Contributors

dependabot[bot], pkarpovich
