Comments (19)

piem commented on May 22, 2024 6

Hi @jhoelzl ,

what happens if you set FORMAT to pyaudio.paFloat32 instead of pyaudio.paInt16 ?

best, piem

from aubio.

jhoelzl commented on May 22, 2024 4

Hello,

I am trying to integrate the pitch detection example into the PyAudio record example:

"""PyAudio example: Record a few seconds of audio and save to a WAVE file."""


import pyaudio
import wave
import numpy as np
from aubio import pitch

CHUNK = 1024
FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 16000
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "output.wav"

p = pyaudio.PyAudio()

stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK)

print("* recording")

frames = []

# Pitch
tolerance = 0.8
downsample = 1
win_s = 4096 // downsample # fft size
hop_s = 512  // downsample # hop size
pitch_o = pitch("yin", win_s, hop_s, RATE)
pitch_o.set_unit("midi")
pitch_o.set_tolerance(tolerance)


for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    buffer = stream.read(CHUNK)
    frames.append(buffer)

    # Convert buffer to numpy data array
    signal = np.fromstring(buffer, dtype=np.int16)

    # Detect Pitch
    pitch = pitch_o(signal)[0]
    confidence = pitch_o.get_confidence()

    print("{} / {}".format(pitch,confidence))


print("* done recording")

stream.stop_stream()
stream.close()
p.terminate()

wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()

However, I get an error:

pitch = pitch_o(signal)[0]
ValueError: input array should be float

When I change
signal = np.fromstring(buffer, dtype=np.int16) to signal = np.fromstring(buffer, dtype='f') the error disappears, but I only get this output (pitch / confidence):

16.7740325928 / nan
16.7740325928 / nan
16.7740325928 / nan
16.7740325928 / nan
16.7740325928 / nan
16.7740325928 / nan
16.7740325928 / nan
16.7740325928 / nan
16.7740325928 / nan
16.7740325928 / nan
16.7740325928 / nan
16.7740325928 / nan
16.7740325928 / nan
16.7740325928 / nan
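
The constant output above is consistent with the 16-bit samples being reinterpreted as 32-bit floats rather than converted to them. A minimal sketch of the difference, using only NumPy and a synthetic buffer in place of stream.read(CHUNK) (the sine tone and the variable names are assumptions made for illustration):

```python
import numpy as np

# Synthetic stand-in for stream.read(CHUNK) with FORMAT = pyaudio.paInt16:
# 1024 samples of a 440 Hz sine at 16 kHz, packed as 16-bit integers.
rate, chunk = 16000, 1024
t = np.arange(chunk) / rate
raw = (0.5 * 32767 * np.sin(2 * np.pi * 440 * t)).astype(np.int16).tobytes()

# Reinterpreting: each pair of int16 samples is fused into one meaningless float.
wrong = np.frombuffer(raw, dtype=np.float32)

# Converting: decode as int16 first, then cast and scale to [-1, 1).
right = np.frombuffer(raw, dtype=np.int16).astype(np.float32) / 32768.0

print(wrong.shape, right.shape)  # (512,) (1024,)
```

The reinterpreted array has 512 elements, which happens to match hop_s = 512 in the script, so no size error is raised even though the values are garbage. Opening the stream with pyaudio.paFloat32 avoids the conversion step altogether.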

Any suggestions?

Thanks for support!

Regards,
Josef

jhoelzl commented on May 22, 2024 4

Hi @piem , thanks, yes now it works fine!

jhoelzl commented on May 22, 2024 1

Of course, this is now working:

#! /usr/bin/env python
import pyaudio
import wave
import numpy as np
from aubio import pitch

CHUNK = 1024
FORMAT = pyaudio.paFloat32
CHANNELS = 1
RATE = 44100
RECORD_SECONDS = 5
WAVE_OUTPUT_FILENAME = "output.wav"

p = pyaudio.PyAudio()

stream = p.open(format=FORMAT,
                channels=CHANNELS,
                rate=RATE,
                input=True,
                frames_per_buffer=CHUNK)

print("* recording")

frames = []

# Pitch
tolerance = 0.8
downsample = 1
win_s = 4096 // downsample # fft size
hop_s = 1024  // downsample # hop size
pitch_o = pitch("yin", win_s, hop_s, RATE)
pitch_o.set_unit("midi")
pitch_o.set_tolerance(tolerance)


for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    buffer = stream.read(CHUNK)
    frames.append(buffer)

    # np.frombuffer replaces the deprecated binary mode of np.fromstring
    signal = np.frombuffer(buffer, dtype=np.float32)

    pitch = pitch_o(signal)[0]
    confidence = pitch_o.get_confidence()

    print("{} / {}".format(pitch,confidence))


print("* done recording")

stream.stop_stream()
stream.close()
p.terminate()

wf = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
wf.setnchannels(CHANNELS)
wf.setsampwidth(p.get_sample_size(FORMAT))
wf.setframerate(RATE)
wf.writeframes(b''.join(frames))
wf.close()
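
One caveat about the saved file: Python's wave module writes integer PCM headers only, so float32 frames written with a sample width of 4 are labelled as 32-bit integers and will probably play back as noise. A minimal sketch of one workaround, converting each buffer to 16-bit integers before writing (the helper name float32_to_int16_bytes and the synthetic tone are assumptions; an in-memory file stands in for output.wav):

```python
import io
import wave
import numpy as np

def float32_to_int16_bytes(frame_bytes):
    """Convert a buffer of float32 samples in [-1, 1] to 16-bit PCM bytes."""
    samples = np.frombuffer(frame_bytes, dtype=np.float32)
    clipped = np.clip(samples, -1.0, 1.0)
    return (clipped * 32767).astype(np.int16).tobytes()

# Synthetic stand-in for the recorded `frames` list: one second of a 440 Hz tone.
rate = 44100
t = np.arange(rate) / rate
frames = [(0.5 * np.sin(2 * np.pi * 440 * t)).astype(np.float32).tobytes()]

out = io.BytesIO()  # replace with a filename such as "output.wav" to write to disk
with wave.open(out, "wb") as wf:
    wf.setnchannels(1)
    wf.setsampwidth(2)  # 2 bytes per sample = 16-bit PCM
    wf.setframerate(rate)
    wf.writeframes(b"".join(float32_to_int16_bytes(f) for f in frames))
```

The pitch detection itself is unaffected; only the bytes handed to wave need converting.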

SutirthaChakraborty commented on May 22, 2024 1

Can we do something similar for tapping along with the beat? Can anyone help, please?

piem commented on May 22, 2024

Yes, as long as your machine can cope with the load and process the samples in time.

I just added a simple example to play an aubio source using a PySoundCard stream:

https://github.com/piem/aubio/blob/develop/python/demos/demo_pysoundcard_play.py

Let me know how it works for you.

piem commented on May 22, 2024

Also added another simple example to record the microphone input to file:

https://github.com/piem/aubio/blob/develop/python/demos/demo_pysoundcard_record.py

stuaxo commented on May 22, 2024

Cool - so if I put the audio sink bit in one thread, can I then get useful info out of aubio in realtime, like onsets/beats, in another?
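
A minimal sketch of that two-thread layout, using only the standard library and NumPy. The capture thread here pushes synthetic noise buffers and the analysis thread computes an RMS level; both are stand-ins (assumptions for illustration) for the real sound card read and for a call into aubio.onset or aubio.tempo:

```python
import queue
import threading
import numpy as np

HOP = 512
buffers = queue.Queue()  # hands audio buffers from capture to analysis
results = []

def capture(n_buffers):
    """Stand-in for the audio I/O thread: pushes synthetic buffers."""
    rng = np.random.default_rng(0)
    for _ in range(n_buffers):
        buffers.put(rng.standard_normal(HOP).astype(np.float32))
    buffers.put(None)  # sentinel: no more audio

def analyse():
    """Stand-in for the analysis thread: RMS instead of an aubio object."""
    while True:
        samples = buffers.get()
        if samples is None:
            break
        results.append(float(np.sqrt(np.mean(samples ** 2))))

producer = threading.Thread(target=capture, args=(10,))
consumer = threading.Thread(target=analyse)
producer.start()
consumer.start()
producer.join()
consumer.join()
print(len(results))  # 10
```

The queue decouples the two sides, so a slow analysis step delays results rather than blocking the audio read.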

stuaxo commented on May 22, 2024

Tested both of these and they work - recording and playback.

I couldn't play back mp3, which I guess is due to my Ubuntu setup:

ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.rear
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.center_lfe
ALSA lib pcm.c:2239:(snd_pcm_open_noupdate) Unknown PCM cards.pcm.side
bt_audio_service_open: connect() failed: Connection refused (111)
bt_audio_service_open: connect() failed: Connection refused (111)
bt_audio_service_open: connect() failed: Connection refused (111)
bt_audio_service_open: connect() failed: Connection refused (111)
ALSA lib pcm_dmix.c:961:(snd_pcm_dmix_open) The dmix plugin supports only playback stream
test.mp3
AUBIO ERROR: source_sndfile: Failed opening test.mp3: File contains data in an unknown format.
AUBIO ERROR: source_wavread: could not find RIFF header in test.mp3
AUBIO ERROR: source: failed creating aubio source with test.mp3 at samplerate 0 with hop_size 256
Traceback (most recent call last):
  File "demo_pysoundcard_play.py", line 25, in <module>
    play_source(sys.argv[1])
  File "demo_pysoundcard_play.py", line 11, in play_source
    f = source(source_path, hop_size = hop_size)
StandardError: error creating object

piem commented on May 22, 2024

great. yes, now you can access the samples from the microphone, you can also send them to aubio.onset, aubio.pitch, and so on.

as for reading mp3 files (and many other compressed formats) on linux, you will need to recompile aubio with libav.

on ubuntu, make sure you have the following packages installed:

    libavcodec-dev libavformat-dev libavresample-dev libavutil-dev

then run ./waf configure build again.

stuaxo commented on May 22, 2024

Excellent, I'll give that a try.

I'm thinking I might check what dependencies all the examples have and add a requirements.txt, if that makes sense?

piem commented on May 22, 2024

hi Stuart!

there are no real requirements for python-aubio itself, apart from numpy. so i'd rather have the user install what they need to run the demos.

i guess this one can be closed by now!

cheers, piem

piem commented on May 22, 2024

great! can you post the working version here?

thanks, piem

jhoelzl commented on May 22, 2024

So now I have one more question.

The example code with PyAudio above works fine. However, when I integrate it into the listen() function of the SpeechRecognition module (which also uses PyAudio), I always get an error when performing pitch = pitch_o(signal)[0]:

('Unexpected error:', <type 'exceptions.UnboundLocalError'>)

I compared the input data array from the working example script with the one in the listen() function; it has the same structure:

  • type(signal) returns: <type 'numpy.ndarray'>
  • signal.shape returns (480,)

In both scripts, PyAudio uses FORMAT = pyaudio.paInt16 and then I convert the values:

while True:
    buffer = stream.read(CHUNK)
    signal = np.fromstring(buffer, dtype=np.int16)
    signal = signal.astype(np.float32)
    #print(signal.shape)
    #print(signal)
    #print(type(signal))
    pitch = pitch_o(signal)[0]
    confidence = pitch_o.get_confidence()

In the example script, when hop_s does not match the number of samples passed in, pitch_o() correctly raises an error:

pitch = pitch_o(signal)[0]
ValueError: input size of pitch should be 240, not 480
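
The message above indicates that each call must receive exactly hop_s samples. When the buffer read from the stream is longer than that, one option is to cut it into hop-sized frames and call the pitch object once per frame. A minimal NumPy-only sketch (split_into_hops is a hypothetical helper, not part of aubio, and zero-padding a short final frame is an assumption about what is acceptable for the analysis):

```python
import numpy as np

def split_into_hops(signal, hop_s):
    """Yield consecutive hop_s-sample float32 frames, zero-padding the last one."""
    for start in range(0, len(signal), hop_s):
        frame = signal[start:start + hop_s]
        if len(frame) < hop_s:
            frame = np.pad(frame, (0, hop_s - len(frame)), mode="constant")
        yield frame.astype(np.float32)

# A 480-sample buffer, as in the listen() case, split for hop_s = 240.
buffer_480 = np.zeros(480, dtype=np.float32)
frames = list(split_into_hops(buffer_480, 240))
print(len(frames), frames[0].shape)  # 2 (240,)
```

Each frame could then be passed to pitch_o(frame) in turn. Alternatively, creating the pitch object with hop_s equal to the stream's buffer size avoids the split.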

But this does not happen in the listen() function, only this error appears:

('Unexpected error:', <type 'exceptions.UnboundLocalError'>)

I assume the module is correctly loaded, because no error appears when performing this code:

import numpy as np
from aubio import pitch
...
...
# Pitch
tolerance = 0.8
downsample = 1
win_s = 4096 // downsample # fft size
hop_s = 480  // downsample # hop size
pitch_o = pitch("yin", win_s, hop_s, 16000)
pitch_o.set_unit("f0")
pitch_o.set_tolerance(tolerance)
...
...

Maybe somebody has an idea...

Regards,
Josef

piem commented on May 22, 2024

Hi @jhoelzl

Would you mind opening a new issue for this? A more detailed code example showing where the exception is raised would help.

thanks, Paul

jhoelzl commented on May 22, 2024

@piem , sure, here: #67

fbukevin commented on May 22, 2024

Hi, guys.

I installed the aubio Python module and downloaded the Python demo demo_onset.py on both Windows 7 and Linux Mint (18.1). With Python 3.6, when I ran python3 demo_onset.py test.mp3, the error output was as follows:

AUBIO ERROR: source_wavread: could not find RIFF header in test.mp3
AUBIO ERROR: source: failed creating aubio source with test.mp3 at samplerate 0 with hop_size 256
Traceback (most recent call last):
File "demo_onset.py", line 18, in <module>
s = source(filename, samplerate, hop_s)
RuntimeError: error creating source with "test.mp3"

I then installed libavcodec-dev libavformat-dev libavresample-dev libavutil-dev, rebuilt aubio with ./waf configure build and sudo ./waf install, and also exported LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$PWD/build/src. But I still got the same error.

I got the same problem with Python 2.7. Did I miss a step or a prerequisite?

piem commented on May 22, 2024

fbukevin commented on May 22, 2024

Hi @piem ,

Sure!

Please refer to #81
