picovoice / web-voice-processor

A library for real-time voice processing in web browsers

License: Apache License 2.0

JavaScript 29.12% TypeScript 70.53% HTML 0.35%

javascript browser web-browser real-time realtime wake-word-detection voice-commands speech-recognition speech-to-text voice-processing

web-voice-processor's Introduction

Web Voice Processor

Made in Vancouver, Canada by Picovoice

A library for real-time voice processing in web browsers.

Uses the Web Audio API to access microphone audio.
Leverages Web Workers to move compute-intensive tasks off the main thread.
Converts the microphone sampling rate to 16kHz, the de facto standard for voice processing engines.
Provides a flexible interface to pass in arbitrary voice processing workers.

Browser compatibility

All modern browsers (Chrome/Edge/Opera, Firefox, Safari) are supported, including on mobile. Internet Explorer is not supported.

Using the Web Audio API requires a secure context (HTTPS connection), with the exception of localhost, for local development.

This library includes the utility function browserCompatibilityCheck which can be used to perform feature detection on the current browser and return an object indicating browser capabilities.

ESM:

import { browserCompatibilityCheck } from '@picovoice/web-voice-processor';
browserCompatibilityCheck();

IIFE:

window.WebVoiceProcessor.browserCompatibilityCheck();

Browser features

'_picovoice' : whether all Picovoice requirements are met
'AudioWorklet' (not currently used; intended for the future)
'isSecureContext' (required for microphone permission for non-localhost)
'mediaDevices' (basis for microphone enumeration / access)
'WebAssembly' (required for all Picovoice engines)
'webKitGetUserMedia' (legacy predecessor to getUserMedia)
'Worker' (required for resampler and for all engine processing)
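
As a minimal sketch of how the returned object might be consumed, the helper below lists which capabilities are reported as unavailable. It assumes the returned object uses the feature names listed above as keys with boolean values. `missingFeatures` is a hypothetical helper, not part of the library, and the sample object is made up for illustration rather than taken from a real browser.

```javascript
// Hypothetical helper: not part of @picovoice/web-voice-processor.
// Assumes `capabilities` maps the feature names listed above to booleans.
function missingFeatures(capabilities, required) {
  return required.filter((name) => !capabilities[name]);
}

// Made-up sample standing in for the result of browserCompatibilityCheck().
const sampleCapabilities = {
  _picovoice: false,
  AudioWorklet: true,
  isSecureContext: false,
  mediaDevices: true,
  WebAssembly: true,
  webKitGetUserMedia: false,
  Worker: true,
};

// Features this sketch treats as mandatory before subscribing any engine.
const missing = missingFeatures(sampleCapabilities, [
  'isSecureContext',
  'mediaDevices',
  'WebAssembly',
  'Worker',
]);
console.log(missing);
```

In a real page, the sample object would be replaced by the value returned from browserCompatibilityCheck(), and a non-empty result could be used to show a message instead of requesting the microphone.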

Installation

npm install @picovoice/web-voice-processor

(or)

yarn add @picovoice/web-voice-processor

How to use

Via ES Modules (Create React App, Angular, Webpack, etc.)

import { WebVoiceProcessor } from '@picovoice/web-voice-processor';

Via HTML script tag

Add the following to your HTML:

<script src="@picovoice/web-voice-processor/dist/iife/index.js"></script>

The IIFE version of the library adds WebVoiceProcessor to the window global scope.

Start listening

WebVoiceProcessor follows the subscribe/unsubscribe pattern. WebVoiceProcessor will automatically start recording audio as soon as an engine is subscribed.

const worker = new Worker('${WORKER_PATH}');
const engine = {
  onmessage: function(e) {
    // ... handle inputFrame
  }
};

await WebVoiceProcessor.subscribe(engine);
await WebVoiceProcessor.subscribe(worker);
// or
await WebVoiceProcessor.subscribe([engine, worker]);

An engine is either a Web Worker or an object that implements the following interface in its onmessage method:

onmessage = function (e) {
    switch (e.data.command) {
        case 'process':
            process(e.data.inputFrame);
            break;
    }
};

where e.data.inputFrame is an Int16Array of frameLength audio samples.
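
A minimal sketch of a plain-object engine following this interface is shown below. It only counts the frames and samples it receives. The name `createFrameCounter` and the counter fields are hypothetical, and the final call simulates a single 512-sample frame by hand so the example runs without a microphone.

```javascript
// Sketch of a plain-object engine; `createFrameCounter` is a hypothetical name.
function createFrameCounter() {
  const engine = {
    framesSeen: 0,
    samplesSeen: 0,
    onmessage(e) {
      switch (e.data.command) {
        case 'process':
          // e.data.inputFrame is expected to be an Int16Array.
          engine.framesSeen += 1;
          engine.samplesSeen += e.data.inputFrame.length;
          break;
        default:
          // Ignore commands this engine does not handle.
          break;
      }
    },
  };
  return engine;
}

// Simulate one 512-sample frame being delivered, without a microphone.
const counter = createFrameCounter();
counter.onmessage({
  data: { command: 'process', inputFrame: new Int16Array(512) },
});
console.log(counter.framesSeen, counter.samplesSeen);
```

An object built this way could be passed to WebVoiceProcessor.subscribe in place of the `engine` object shown earlier.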

For examples of using engines, look at src/engines.

The subscribe call is asynchronous because it requests microphone access through the Web Audio API. The promise is rejected if the user refuses permission, no suitable device is found, and so on, so calling code should handle rejection. When the promise resolves, WebVoiceProcessor is running.
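
One way to anticipate rejection is to wrap the subscribe call in try/catch. The sketch below assumes only that the processor exposes an async subscribe(engine) method, as described above. `trySubscribe` and `denyingProcessor` are hypothetical names, and the stand-in processor always rejects so the failure path can be exercised outside a browser.

```javascript
// Hypothetical wrapper; `processor` is anything with an async subscribe(engine)
// method, such as WebVoiceProcessor.
async function trySubscribe(processor, engine) {
  try {
    await processor.subscribe(engine);
    return { ok: true, error: null };
  } catch (error) {
    // Reached when, for example, the user denies microphone permission.
    return { ok: false, error };
  }
}

// Stand-in that always rejects, to exercise the failure path without a browser.
const denyingProcessor = {
  subscribe: async () => {
    throw new Error('Permission denied');
  },
};

const subscribeResult = trySubscribe(denyingProcessor, { onmessage() {} });
subscribeResult.then((result) => {
  console.log(result.ok, result.error.message);
});
```

In application code, the failure branch is a natural place to show the user why listening did not start.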

Stop Listening

Unsubscribing all of the subscribed engines stops audio recording.

await WebVoiceProcessor.unsubscribe(engine);
await WebVoiceProcessor.unsubscribe(worker);
// or
await WebVoiceProcessor.unsubscribe([engine, worker]);

Reset

Use the reset function to remove all engines and stop recording audio.

await WebVoiceProcessor.reset();

Options

To update the audio settings in WebVoiceProcessor, use the setOptions function:

// Override default options
let options = {
  frameLength: 512,
  outputSampleRate: 16000,
  deviceId: null,
  filterOrder: 50
};

WebVoiceProcessor.setOptions(options);
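
For intuition about how two of these options interact, the sketch below derives the duration of one frame and the number of frames delivered per second. It assumes frameLength is the number of samples per frame and outputSampleRate is in Hz, which matches the descriptions above. `frameTiming` is a hypothetical helper and not part of the library.

```javascript
// Hypothetical helper: derives frame timing from the two options shown above.
function frameTiming({ frameLength, outputSampleRate }) {
  return {
    // Multiply before dividing to keep the arithmetic exact for these values.
    frameDurationMs: (frameLength * 1000) / outputSampleRate,
    framesPerSecond: outputSampleRate / frameLength,
  };
}

const timing = frameTiming({ frameLength: 512, outputSampleRate: 16000 });
console.log(timing); // { frameDurationMs: 32, framesPerSecond: 31.25 }
```

With the values shown, an engine's onmessage handler would be called roughly every 32 ms while audio is flowing.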

VuMeter

WebVoiceProcessor includes a built-in engine that reports the VU meter level. To capture the VU meter value, create a VuMeterEngine instance and subscribe it to WebVoiceProcessor:

function vuMeterCallback(dB) {
  console.log(dB);
}

const vuMeterEngine = new VuMeterEngine(vuMeterCallback);
WebVoiceProcessor.subscribe(vuMeterEngine);

The vuMeterCallback should expect a number in dBFS within the range [-96, 0].
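
To illustrate what a dBFS value in that range means, the sketch below computes the RMS level of a 16-bit frame relative to full scale and clamps it to [-96, 0]. This is an assumed formula for illustration only and is not necessarily what VuMeterEngine computes. `frameToDbfs` is a hypothetical name. The lower bound of -96 is roughly the dynamic range of 16-bit audio (about 6.02 dB per bit).

```javascript
// Illustrative only: RMS level of a 16-bit frame in dBFS, clamped to [-96, 0].
// This is an assumed formula, not necessarily what VuMeterEngine computes.
function frameToDbfs(frame) {
  if (frame.length === 0) return -96;
  let sumOfSquares = 0;
  for (const sample of frame) {
    const normalized = sample / 32768; // map the Int16 range to [-1, 1)
    sumOfSquares += normalized * normalized;
  }
  const rms = Math.sqrt(sumOfSquares / frame.length);
  if (rms === 0) return -96; // avoid log10(0), which is -Infinity
  const dB = 20 * Math.log10(rms);
  return Math.min(0, Math.max(-96, dB));
}

// A silent frame maps to the floor, a full-scale frame to the ceiling.
const silence = new Int16Array(512);
const fullScale = new Int16Array(512).fill(-32768);
console.log(frameToDbfs(silence), frameToDbfs(fullScale));
```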

Build from source

Use yarn or npm to build WebVoiceProcessor:

yarn
yarn build

(or)

npm install
npm run-script build

The build script outputs minified and non-minified versions of the IIFE and ESM formats to the dist folder. It also outputs the TypeScript type definitions.

web-voice-processor's Issues

release() method does not release the microphone in the browser

When I call release() to deallocate the component, the microphone in the browser stays open.
I have tested this on Chrome with Vue.js (Porcupine.vue).

I worked around it by stopping the media tracks before calling release() in the Porcupine.vue file:

this.webVp.audioSource.mediaStream.getTracks().forEach(track => track.stop());
this.webVp.release();

Would it be possible for pause() to release the microphone and for resume() to reattach it?

In mobile browsers such as iOS Safari there is a problem: if I access the microphone for another task, I lose the microphone in Porcupine.
In mobile browsers such as Chrome on Android, if the microphone is in use, Google webkitSpeechRecognition does not work.

Porcupine in JavaScript

Hi, thank you for your great work!
Is there any way to use Picovoice Porcupine in JavaScript?

Need a way to set microphone device id

We have a microphone selector in the UI, how would one change the device id for the wake word engine? Perhaps .init should accept another parameter for deviceId?

createScriptProcessor

Is the voice processor going to be updated to use a newer method?

This method is already deprecated according to: https://developer.mozilla.org/en-US/docs/Web/API/BaseAudioContext/createScriptProcessor

Example of an engine using JavaScript

Hi,

Is there an example of how to create an engine using JavaScript?

Regards.

filterOrder Question

Hi! First - congrats on putting together a really useful audio framework! Wow!

Question: what does filterOrder mean within the options?

CheetahInvalidArgumentError: Initialization failed

Hey there! I can't use the Cheetah model currently, because I always get this message:

[0] Cheetah model (.pv) file belongs to a different version of the library. File is `PK���` while library is `2.0.0`.
[1] Picovoice Error (code `00000136`)

when trying to:

import { CheetahTranscript, CheetahWorker } from '@picovoice/cheetah-web';
import { WebVoiceProcessor } from '@picovoice/web-voice-processor';
import cheetahParams from '../../models/stt-model';

const cheetah = await CheetahWorker.create(
  'xxxxx',
  transcriptCallback,
  {
    base64: cheetahParams
  },
);

I created the model file with npx pvbase64 -i cheetah_params.pv -o model.ts (as you suggested in your docs).

Any suggestions here?

Thanks, Sebi

Add source stream to options possibilities

In my case I already have the microphone stream open as part of WebRTC. Could an option be added to allow passing in the microphone's media stream as an optional item in the options?

Trying to convert to G711 Ulaw or Alaw

Is there anyone who tried this already? I'm trying to convert the LPCM (16kHz, 16bit) to G711 Alaw using a client side library. I'm not getting the expected results using https://github.com/rochars/alawmulaw.

TypeError: Cannot read properties of undefined (reading 'trim')

hi,

I've been trying to integrate with a web app, and have been getting:
TypeError: Cannot read properties of undefined (reading 'trim')

I tried with a fresh Create React App project and copied the component from: https://picovoice.ai/docs/api/picovoice-react/

Demo repo here https://github.com/johnbowdenatfacet/pico2

Am I missing something?

Thank you

Uncaught TypeError: engine.postMessage is not a function at web_voice_processor.js:27

I am facing this engine issue; please help me with it.

Uncaught TypeError: engine.postMessage is not a function
    at web_voice_processor.js:27
    at Array.forEach (<anonymous>)
    at Worker.downsampler.onmessage (web_voice_processor.js:26)

I am also getting the warning below:

wasm streaming compile failed: TypeError: Failed to execute 'compile' on 'WebAssembly': Incorrect response MIME type. Expected 'application/wasm'.
    (anonymous) @ VM4 pv_porcupine.js:8
    Promise.then (async)
    (anonymous) @ VM4 pv_porcupine.js:8
    Promise.then (async)
    instantiateAsync @ VM4 pv_porcupine.js:8
    createWasm @ VM4 pv_porcupine.js:8
    (anonymous) @ VM4 pv_porcupine.js:8
    (anonymous) @ VM5 porcupine.js:27
    (anonymous) @ VM5 porcupine.js:181
    (anonymous) @ VM3 porcupine_worker.js:7
falling back to ArrayBuffer instantiation

If there are specific steps we need to follow, please suggest them.

I copied the same index.html that you provide for 'Hey Edison'.

upsampling from 8k to 16k

Can you please suggest an approach for upsampling, similar to the downsampling that you implemented?

TypeError: Attempted to assign to readonly property.

TypeError: Attempted to assign to readonly property.
web-voice-processor/dist/esm/index.js:391
  388 | if (typeof AudioWorkletNode !== 'function' || !('audioWorklet' in AudioContext.prototype)) {
  389 |   if (AudioContext) {
  390 |     // @ts-ignore
> 391 |     AudioContext.prototype.audioWorklet = {
      |     ^
  392 |       // eslint-disable-next-line
  393 |       addModule: function () {
  394 |         var _addModule = _asyncToGenerator$1( /*#__PURE__*/_regeneratorRuntime.mark(function _callee(moduleURL, options) {

It seems that in some browser environments, AudioContext.prototype.audioWorklet cannot be overridden.