ODBot

Please take a look at our wiki before getting started!

This is a snapshot version for the Respeaker 4 Mic Hat for Raspberry Pi. Later versions will use the Respeaker Mic Array v2.0.

Getting Started

This project uses Seeed Studio's 4 mic hat and Respeaker Mic Array v2.0. For more information on these devices, refer to Seeed Studio's wiki page (product page):

http://wiki.seeedstudio.com/ReSpeaker_Mic_Array_v2.0/

Python 2.7 is used for this project, and virtualenv is strongly recommended to isolate the Python environment.

Voice-engine is used as the voice interface and for DOA (Direction of Arrival).

CMU Sphinx and pocketsphinx are used for speech recognition.

Installation

  1. Install necessary drivers
  2. Install virtualenv (optional)
  3. Install all dependencies
  4. git clone https://github.com/Rezar/ODBot.git
  5. Apply voice-engine patch
  6. Modify keyphrase.list as necessary (optional)
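
For step 6, a pocketsphinx keyword-spotting list usually holds one phrase per line followed by a detection threshold between slashes. The sketch below parses that layout so a list can be checked before it is used. The sample phrases and thresholds are made up for illustration and are not taken from this repository's keyphrase.list.

```python
import re

# One entry per line: "<phrase> /<threshold>/", e.g. "move forward /1e-20/".
_ENTRY = re.compile(r"^(?P<phrase>.+?)\s*/(?P<threshold>[^/]+)/\s*$")


def parse_keyphrases(text):
    """Return a dict mapping each keyphrase to its float threshold."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        match = _ENTRY.match(line)
        if match is None:
            raise ValueError("malformed keyphrase line: %r" % line)
        entries[match.group("phrase")] = float(match.group("threshold"))
    return entries


# Hypothetical sample content, not the project's actual list.
SAMPLE = """
move forward /1e-20/
turn left /1e-15/
stop /1e-5/
"""

keyphrases = parse_keyphrases(SAMPLE)
```

Lower thresholds generally make a phrase easier to trigger, so longer phrases tend to need much smaller values than short ones; the right numbers have to be tuned by ear.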

Dependencies

  • Respeaker 4mic hat - Uses spidev and gpiozero.
    • Installing Snowboy and the Google Assistant library is unnecessary.
  • Voice-engine - Used as a voice interface
    • Voice-engine needs to be modified in order to be used with pocketsphinx in this project. Please refer to the instructions below after installation.
    • Make sure the right DOA code is being used: it should be the one for the 4 mic hat.
  • Pocketsphinx - Used for STT (Speech-To-Text)

Drivers

Topology

Voice-engine Audio Feed Flow

Source -> ChannelPicker -> PocketSphinx
  |
  v
 DOA
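
The audio feed topology above can be mimicked without any audio hardware. The following is a dependency-free sketch: `Element`, `ChannelPicker`, and `Recorder` are hypothetical stand-ins written for this example and are not voice-engine's actual classes. It only illustrates how one source fans out to a single-channel recognizer branch and a multi-channel DOA branch.

```python
class Element(object):
    """Hypothetical pipeline node: receives data and forwards it to its sinks."""

    def __init__(self):
        self.sinks = []

    def link(self, sink):
        self.sinks.append(sink)
        return sink  # returning the sink allows chained link() calls

    def put(self, data):
        for sink in self.sinks:
            sink.put(data)


class ChannelPicker(Element):
    """Keep one channel out of interleaved multi-channel samples."""

    def __init__(self, channels, pick):
        super(ChannelPicker, self).__init__()
        self.channels = channels
        self.pick = pick

    def put(self, data):
        super(ChannelPicker, self).put(data[self.pick::self.channels])


class Recorder(Element):
    """Terminal node standing in for PocketSphinx or DOA; stores its input."""

    def __init__(self):
        super(Recorder, self).__init__()
        self.received = []

    def put(self, data):
        self.received.append(data)


source = Element()                      # stands in for the microphone source
picker = ChannelPicker(channels=4, pick=0)
recognizer = Recorder()                 # stands in for PocketSphinx
doa = Recorder()                        # stands in for DOA

source.link(picker).link(recognizer)    # Source -> ChannelPicker -> PocketSphinx
source.link(doa)                        # Source -> DOA (all channels)

# Two interleaved 4-channel frames: samples 0-3 and 4-7.
source.put([0, 1, 2, 3, 4, 5, 6, 7])
```

After the `put` call, the recognizer branch has seen only channel 0 while the DOA branch has seen every channel, which matches the branch point drawn under Source.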

Voice Command Flow

Human Voice -> Source -> ChannelPicker -> PocketSphinx -> KWS
                                              /             \
                                            DOA            Arduino Serial Communication
                                                             \
                                                              Arduino Command Recognizer
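
The last hop of the voice command flow, from a spotted keyphrase to a command on the serial line, might look roughly like the sketch below. The command table, the single-letter codes, and the newline terminator are all assumptions made for illustration, since the actual protocol between the Raspberry Pi and the Arduino is not described here. `FakePort` is a stand-in so the example runs without hardware; any object with a `write(bytes)` method, such as a pyserial port, could take its place.

```python
# Hypothetical keyphrase-to-command table; not the project's real protocol.
COMMANDS = {
    "move forward": b"F",
    "turn left": b"L",
    "turn right": b"R",
    "stop": b"S",
}


class FakePort(object):
    """Stand-in for a serial port; records bytes instead of sending them."""

    def __init__(self):
        self.sent = []

    def write(self, data):
        self.sent.append(data)
        return len(data)


def on_keyword(keyphrase, port):
    """Send the command for a detected keyphrase; return True if one was sent."""
    command = COMMANDS.get(keyphrase.strip().lower())
    if command is None:
        return False  # unknown phrase: send nothing
    port.write(command + b"\n")
    return True


port = FakePort()
handled = on_keyword("Move Forward", port)
ignored = on_keyword("dance", port)
```

Keeping the mapping in one table means the keyphrase list and the Arduino-side command recognizer only have to agree on a small, explicit set of codes.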
															

How to build the ODbot (cooking recipe)

https://drive.google.com/open?id=1q2JHpqTQjX9pdawrUT7o65O4GIdZvtEY

Contributors

alitheg92, ikheyfets, prkhrv, rezar, sbaik2


odbot's Issues

Issue with frame size

Originally, frame sizes were around 10-20 ms. To implement KWS, however, the frame size was increased significantly so that keywords wouldn't get cut off. This works well with pocketsphinx KWS, but because voice-engine's DOA was designed for frames of 10-20 ms, DOA no longer outputs accurate results. The same goes for VAD: although we're not using it at the moment, we would need to reduce the frame time in order to use it.
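
One possible way around this, sketched below, is to keep the large frame for KWS and re-slice it into short chunks before handing it to DOA or VAD. This is only an illustration of the arithmetic, not a patch to voice-engine: the 16 kHz sample rate and the 200 ms KWS frame are assumed values, and `samples_for_ms` and `split_frame` are hypothetical helpers.

```python
def samples_for_ms(sample_rate, milliseconds):
    """Number of samples per channel in a frame of the given duration."""
    return sample_rate * milliseconds // 1000


def split_frame(samples, channels, sub_samples):
    """Yield consecutive chunks of `sub_samples` samples per channel.

    `samples` is an interleaved multi-channel sequence. A trailing partial
    chunk is dropped so every chunk has the same length.
    """
    step = sub_samples * channels
    for start in range(0, len(samples) - step + 1, step):
        yield samples[start:start + step]


RATE = 16000      # assumed sample rate in Hz
CHANNELS = 4      # 4 mic hat
KWS_MS = 200      # assumed "large" frame used for keyword spotting
DOA_MS = 20       # upper end of the 10-20 ms range DOA was built for

# A silent placeholder frame with the size KWS would receive.
kws_frame = [0] * (samples_for_ms(RATE, KWS_MS) * CHANNELS)
doa_frames = list(split_frame(kws_frame, CHANNELS, samples_for_ms(RATE, DOA_MS)))
```

With these assumed numbers, each 200 ms frame yields ten 20 ms chunks of 320 samples per channel, so DOA would see the frame duration it was written for while KWS still receives the whole frame.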
