frame length (about porcupine, closed)

picovoice commented on September 26, 2024
frame length

from porcupine.

Comments (5)

commented on September 26, 2024

I have removed wave from the imports and am just using soundfile now.
Everything is working well.

kenarsa commented on September 26, 2024

Correct. It accepts 512 samples per frame. You can buffer the audio and pass it to the engine every other frame.
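
The buffering suggested here could look roughly like the sketch below. It assumes the incoming audio is raw 16-bit mono PCM bytes in the host's byte order; `FrameBuffer` is a hypothetical helper, not part of Porcupine. Each returned frame is a list of 512 integer samples that could then be handed to the engine.

```python
from array import array


class FrameBuffer:
    """Accumulate 16-bit PCM bytes and release fixed-size frames of samples."""

    def __init__(self, frame_length=512):
        self.frame_length = frame_length
        # "h" = signed 16-bit samples in native byte order (assumption:
        # the host is little-endian, matching WAV PCM data).
        self._samples = array("h")

    def push(self, pcm_bytes):
        """Add raw PCM bytes; return a list of complete frames (may be empty).

        Raises ValueError if len(pcm_bytes) is not a multiple of 2.
        """
        self._samples.frombytes(pcm_bytes)
        frames = []
        while len(self._samples) >= self.frame_length:
            frames.append(self._samples[: self.frame_length].tolist())
            del self._samples[: self.frame_length]
        return frames


buf = FrameBuffer(512)
chunk = bytes(512)            # 256 silent samples (2 bytes each)
print(len(buf.push(chunk)))   # 0: only 256 samples buffered so far
print(len(buf.push(chunk)))   # 1: one full 512-sample frame is ready
```

With 256-sample chunks, a full frame becomes available on every second push, which matches "every other frame" above.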

commented on September 26, 2024

Maybe I'm not understanding samples or frames correctly.

The stream comes in as a WAV file, so when I look at the WAV info I get:
_wave_params(nchannels=1, sampwidth=2, framerate=16000, nframes=256, comptype='NONE', compname='not compressed')

So I set nFrames = 256

and in code:
b = io.BytesIO(msg.payload)
audio = wave.open(b, 'rb').readframes(nFrames)
result = porcupine.process(audio)

and result never indicates that the hotword was found.

I thought maybe the audio code was bad, so I added PyAudio to listen to the stream:
pa = pyaudio.PyAudio()
stream = pa.open(format=pyaudio.paInt16, channels=1, rate=16000, output=True, frames_per_buffer=nFrames)

and then the code is:
b = io.BytesIO(msg.payload)
audio = wave.open(b, 'rb').readframes(nFrames)
stream.write(audio)
result = porcupine.process(audio)

The audio playing is crystal clear, so it's not bad code converting the audio. I then added print(len(audio)) before the stream write; it outputs 512, 512, 512, ... to the console. Is this the 512 you say Porcupine likes?

As you said Porcupine likes 512, I changed the stream to 512 and ran again:
_wave_params(nchannels=1, sampwidth=2, framerate=16000, nframes=512, comptype='NONE', compname='not compressed')
and changed nFrames = 512.

Again the audio playing from PyAudio is crystal clear. Now the console logs 1024, 1024, 1024, ...
but again Porcupine never picks up the hotword.

Any idea what might be wrong? I'm not sure whether the sample width being 2 and nFrames being 512 is what makes print(len(audio)) output 1024. Either way, with the stream at 256 or 512 frames (and len outputting 512 or 1024), Porcupine never fires.
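
The numbers above are consistent with bytes being counted rather than samples: `readframes` returns a bytes object, and with sampwidth=2 each sample occupies two bytes, so 512 frames give a length of 1024. Below is a minimal sketch, assuming mono 16-bit PCM, of decoding such a payload into integer samples. `make_wav` is a hypothetical helper that stands in for msg.payload. Whether passing raw bytes instead of samples is what kept Porcupine from firing is not confirmed in this thread.

```python
import io
import struct
import wave


def wav_bytes_to_samples(payload):
    """Decode a mono 16-bit WAV payload into a list of integer samples."""
    with wave.open(io.BytesIO(payload), "rb") as wav:
        if wav.getsampwidth() != 2 or wav.getnchannels() != 1:
            raise ValueError("expected mono 16-bit PCM")
        raw = wav.readframes(wav.getnframes())  # bytes: 2 per sample
    # WAV PCM data is little-endian, so "<h" is used regardless of host order.
    return list(struct.unpack("<%dh" % (len(raw) // 2), raw))


def make_wav(samples, rate=16000):
    """Build an in-memory WAV file (hypothetical stand-in for msg.payload)."""
    out = io.BytesIO()
    with wave.open(out, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(rate)
        wav.writeframes(struct.pack("<%dh" % len(samples), *samples))
    return out.getvalue()


payload = make_wav(list(range(512)))
samples = wav_bytes_to_samples(payload)
print(len(samples))  # 512 samples, although readframes returned 1024 bytes
```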

commented on September 26, 2024

So I changed from using wave to soundfile,
replacing the code for opening the WAV with:
audio, sample_rate = soundfile.read(b, dtype='int16')

The len of audio now prints 512. Good :)
Porcupine fires on the hotword. Good :)

PyAudio does not sound good at all, though; it's all choppy:
nFrames = 512
pa = pyaudio.PyAudio()
stream = pa.open(format=pyaudio.paInt16, channels=1, rate=16000, output=True, frames_per_buffer=nFrames)
...
stream.write(audio)

It sounds choppy, not clear like it was when using the wave lib to open it. But Porcupine works with soundfile, whereas it does not with wave, using the same code.

Update:
changing the audio to bytes for PyAudio makes it sound clear again, not choppy:
stream.write(bytes(audio))

kenarsa commented on September 26, 2024

Could it be an endianness issue?
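
One way to check the endianness idea is to decode the same bytes with both byte orders and compare. The sketch below uses only the standard library; `decode_int16` is a hypothetical helper. WAV PCM data is little-endian, so on a little-endian host a byte-order mismatch would not be expected, and nothing in this thread confirms that endianness was the cause.

```python
import struct
import sys


def decode_int16(raw, byteorder="little"):
    """Interpret raw bytes as signed 16-bit samples in the given byte order."""
    prefix = "<" if byteorder == "little" else ">"
    return struct.unpack("%s%dh" % (prefix, len(raw) // 2), raw)


raw = struct.pack("<3h", 1, 256, -2)   # three little-endian int16 samples

print(sys.byteorder)                   # byte order of the current host
print(decode_int16(raw, "little"))     # (1, 256, -2)
print(decode_int16(raw, "big"))        # (256, 1, -257)
```

If the values decoded with the host's byte order look like noise while the other order looks like a waveform, a byte swap is needed before passing the samples on.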
