microsoft / cognitive-speakerrecognition-python

Python SDK for the Microsoft Speaker Recognition API, part of Cognitive Services

Home Page: https://www.microsoft.com/cognitive-services/en-us/speaker-recognition-api

License: Other

Python 100.00%

cognitive-speakerrecognition-python's Introduction

Microsoft Speaker Recognition API: Python Sample

This repo contains Python samples (using Python 3) to demonstrate the use of Microsoft Speaker Recognition API, an offering within Microsoft Cognitive Services, formerly known as Project Oxford.

Run the Sample

First, you must obtain a free Speaker Recognition API subscription key by following the instructions on our website.

The sample application covers four scenarios:

  1. Create a user profile: python Identification\CreateProfile.py <subscription_key>
  2. Print all user profiles: python Identification\PrintAllProfiles.py <subscription_key>
  3. Enroll user profiles: python Identification\EnrollProfile.py <subscription_key> <profile_id> <enrollment_file_path>
  4. Identify test files: python Identification\IdentifyFile.py <subscription_key> <identification_file_path> <profile_ids>...
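The commands above use Windows-style backslashes in the script paths. A minimal sketch, assuming the scripts live in an `Identification` folder as shown, of building the same command lines in a platform-independent way. `build_command` is a hypothetical helper, not part of the samples.

```python
import os
import sys

SAMPLE_DIR = "Identification"  # folder name taken from the commands above


def build_command(script_name, *args):
    """Return an argument list that runs one sample script with the current interpreter."""
    # os.path.join picks the right separator for the current platform.
    script_path = os.path.join(SAMPLE_DIR, script_name)
    return [sys.executable, script_path, *args]


if __name__ == "__main__":
    # Placeholder value: substitute a real subscription key before running.
    cmd = build_command("PrintAllProfiles.py", "<subscription_key>")
    print(cmd)
    # To actually run it: subprocess.run(cmd, check=True)
```

The returned list can be passed straight to `subprocess.run`, which avoids shell quoting problems with file paths that contain spaces.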

Contributing

We welcome contributions. Feel free to file issues and pull requests on the repo and we'll address them as we can. Learn more about how you can help on our Contribution Rules & Guidelines.

You can reach out to us anytime with questions and suggestions using our communities below:

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

License

All Microsoft Cognitive Services SDKs and samples are licensed with the MIT License. For more details, see LICENSE.

Sample images are licensed separately; please refer to LICENSE-IMAGE.

cognitive-speakerrecognition-python's People

Contributors

hoda-gharieb, lightfrenzy, m-sherif, msftgits, yara11

cognitive-speakerrecognition-python's Issues

Error enrolling

When running EnrollProfile.py I get the following error.

pi@raspberrypi:~/MS_spkr_recog $ python3 Identification/EnrollProfile.py ----------------------------------------- David "/home/pi/MS_spkr_recog/spkr_profile" true
ERROR:root:Error enrolling profile.
Traceback (most recent call last):
  File "Identification/EnrollProfile.py", line 70, in <module>
    enroll_profile(sys.argv[1], sys.argv[2], sys.argv[3], sys.argv[4])
  File "Identification/EnrollProfile.py", line 51, in enroll_profile
    force_short_audio.lower() == "true")
  File "/home/pi/MS_spkr_recog/Identification/IdentificationServiceHttpClientHelper.py", line 226, in enroll_profile
    with open(file_path, 'rb') as body:
IsADirectoryError: [Errno 21] Is a directory: '/home/pi/MS_spkr_recog/spkr_profile'

Is the error due to the way the command-line arguments are written, or is it something else? The dashes in the command line represent my subscription key.

Thank you,

David Stanley
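The traceback shows that the third argument, `/home/pi/MS_spkr_recog/spkr_profile`, is a directory, while the script opens it as a single audio file. The second argument (`David`) is also probably not what the service expects, since other reports on this page show the server asking for a GUID. A minimal sketch of checking both arguments before calling the script. `check_enrollment_args` is a hypothetical helper, not part of the samples.

```python
import os
import uuid


def check_enrollment_args(profile_id, enrollment_file_path):
    """Return a list of human-readable problems; an empty list means the arguments look usable."""
    problems = []
    try:
        uuid.UUID(profile_id)  # raises ValueError if the string is not a GUID
    except ValueError:
        problems.append("profile_id is not a GUID: %r" % profile_id)
    if os.path.isdir(enrollment_file_path):
        problems.append("enrollment_file_path is a directory, not an audio file: %r"
                        % enrollment_file_path)
    elif not os.path.isfile(enrollment_file_path):
        problems.append("enrollment_file_path does not exist: %r" % enrollment_file_path)
    return problems


if __name__ == "__main__":
    # Values copied from the report above.
    for problem in check_enrollment_args("David", "/home/pi/MS_spkr_recog/spkr_profile"):
        print(problem)
```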

Guid should contain 32 digits with 4 dashes

When I try to run the identify_file function from IdentifyFile.py in the Identification folder, I get the following error:
Exception: Error identifying file: { "error": { "code": "BadRequest", "message": "Guid should contain 32 digits with 4 dashes (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)." } }
I can run the script successfully from the terminal, but when I call the function with pre-filled arguments I get the error above. This is how I call the function from a Python file:

identify_file(subscriptionKey,audioFilePath,True, 'ea262cdd-2402-4418-b698-0272e8873dc3')

I thought maybe I had to convert it to a "Guid", which I did but then I just got this error..... (.-.)
  File "IdentifyFile.py", line 43, in identify_file
    str(force_short_audio).lower() == "true")
  File "/Users/txt-19/Desktop/duckAi-py/Identification/IdentificationServiceHttpClientHelper.py", line 265, in identify_file
    if len(test_profile_ids) < 1:
TypeError: object of type 'UUID' has no len()
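One plausible explanation, offered as an assumption: the traceback shows the helper calling `len(test_profile_ids)`, which suggests it expects a list of ID strings. If the helper then joins the IDs with commas (this join is assumed, it is not visible in the traceback), a bare string would be split into single characters, and none of those is a valid GUID. The sketch below shows the difference using only built-in string operations.

```python
profile_id = "ea262cdd-2402-4418-b698-0272e8873dc3"

# A bare string is iterated character by character.
as_string = ",".join(profile_id)

# A list holding one string keeps the GUID intact.
as_list = ",".join([profile_id])

print(as_string[:11])  # e,a,2,6,2,c
print(as_list)         # ea262cdd-2402-4418-b698-0272e8873dc3
```

Under that assumption, passing `[profile_id]` (a list of plain strings, not `UUID` objects) as the last argument would avoid both errors.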

Ruby API returns empty headers, no way to track operations

Hello, MS team. Sorry for leaving a Ruby-related issue in the Python repo, but it looks like the Ruby repo is not available on GitHub. I'm able to create profiles and make enroll requests successfully, but the response headers contain empty strings instead of the operation ID, so I can't track the operation's status. I know the call succeeded, since the Get All Profiles method returns the recently created profiles with Enrolled status. The same issue occurs with the Identification method (https://dev.projectoxford.ai/docs/services/563309b6778daf02acc0a508/operations/5645c523778daf217c292592): I make the request and receive only empty headers. Could your team fix that for Ruby? Thanks. Also, as a suggestion, it would be useful to have a GET /operations API that returns a list of current operations for a subscription. Is that possible?

Error: message Guid should contain 32 digits with 4 dashes (xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)

When I run IdentifyFile.py on an enrolled profile, I get the above error.

Here is the code I used:
import IdentificationServiceHttpClientHelper
import sys

def identify_file(subscription_key, file_path, force_short_audio, profile_ids):
    """Identify an audio file on the server.

    Arguments:
    subscription_key -- the subscription key string
    file_path -- the audio file path for identification
    profile_ids -- an array of test profile IDs strings
    force_short_audio -- waive the recommended minimum audio limit needed for enrollment
    """
    helper = IdentificationServiceHttpClientHelper.IdentificationServiceHttpClientHelper(
        subscription_key)

    identification_response = helper.identify_file(
        file_path, profile_ids,
        force_short_audio.lower() == "true")

    print('Identified Speaker = {0}'.format(identification_response.get_identified_profile_id()))
    print('Confidence = {0}'.format(identification_response.get_confidence()))

if __name__ == "__main__":
    # if len(sys.argv) < 5:
    #     print('Usage: python IdentifyFile.py <subscription_key> <identification_file_path>'
    #           ' <profile_ids>...')
    #     print('\t<subscription_key> is the subscription key for the service')
    #     print('\t<identification_file_path> is the audio file path for identification')
    #     print('\t<force_short_audio> True/False waives the recommended minimum audio limit needed '
    #           'for enrollment')
    #     print('\t<profile_ids> the profile IDs for the profiles to identify the audio from.')
    #     sys.exit('Error: Incorrect Usage.')

    identify_file('702227e**************', 'ironman/fan.wav', 'True', '6f3d4f1b-3b70-439d-8c78-2abb15fbc791')
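The call above passes the profile ID as a single string, while the docstring asks for an array of profile ID strings. A minimal sketch of normalizing the arguments before they reach `helper.identify_file`, assuming that mismatch is what triggers the error. `normalize_profile_ids` and `normalize_flag` are hypothetical helpers, not part of the samples.

```python
import uuid


def normalize_profile_ids(profile_ids):
    """Accept one ID or an iterable of IDs (str or uuid.UUID); return a list of GUID strings."""
    if isinstance(profile_ids, (str, uuid.UUID)):
        profile_ids = [profile_ids]
    # uuid.UUID raises ValueError on malformed input, so bad IDs fail before any request is sent.
    return [str(uuid.UUID(str(pid))) for pid in profile_ids]


def normalize_flag(value):
    """Accept True/False or 'true'/'false' (any case) and return a bool."""
    return str(value).strip().lower() == "true"


if __name__ == "__main__":
    print(normalize_profile_ids("6f3d4f1b-3b70-439d-8c78-2abb15fbc791"))
    print(normalize_flag("True"))
```

With these, the call inside `identify_file` could become `helper.identify_file(file_path, normalize_profile_ids(profile_ids), normalize_flag(force_short_audio))`.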

Operation Error: SpeakerInvalid

ERROR:root:Error polling the operation status.
ERROR:root:Error identifying file.
Traceback (most recent call last):
  File "data_processor.py", line 239, in <module>
    azure_analysis(se_times, fs1, guid, profiles, speaker_profile)
  File "data_processor.py", line 174, in azure_analysis
    resp_id = identify_file(AZURE_KEY, output_name, 'true', guid)
  File "somepath/CSRP/Identification/IdentifyFile.py", line 50, in identify_file
    force_short_audio.lower() == "true")
  File "somepath/CSRP/Identification/IdentificationServiceHttpClientHelper.py", line 288, in identify_file
    self._poll_operation(operation_url))
  File "somepath/CSRP/Identification/IdentificationServiceHttpClientHelper.py", line 327, in _poll_operation
    operation_response[self._OPERATION_MESSAGE_FIELD_NAME])
Exception: Operation Error: SpeakerInvalid

I am getting the above error when I submit an identify_file request to the server.

I think I am seeing the following case:

Case 4 - failed
HTTP/1.1 200 Ok
Content-Type: application/json
{
  "status": "failed",
  "createdDateTime":  "2015-09-30T01:28:23Z",
  "lastActionDateTime": "2015-09-30T01:35:23Z",
  "message":  "Some failure info"
}

Can I get some more detailed reasons on what the SpeakerInvalid error message means and what are some things I can do to make the file I upload go through?
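The traceback suggests that the text after "Operation Error:" is the `message` field of a failed operation-status response like the one quoted above. A minimal sketch of reading that payload directly, which can help when logging the full response for a support request. The field names come from the quoted case; placing `SpeakerInvalid` in `message` is an assumption based on the exception text, and what the value means is not stated anywhere on this page.

```python
import json


def summarize_operation(body):
    """Parse an operation-status JSON body and return (status, message)."""
    payload = json.loads(body)
    return payload.get("status"), payload.get("message")


# Example body modeled on the quoted "Case 4 - failed" response.
example = json.dumps({
    "status": "failed",
    "createdDateTime": "2015-09-30T01:28:23Z",
    "lastActionDateTime": "2015-09-30T01:35:23Z",
    "message": "SpeakerInvalid",
})

status, message = summarize_operation(example)
if status == "failed":
    print("Operation failed:", message)
```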

mic/streaming

Is there any way to use the microphone to identify the enrolled speakers, with the audio input as a stream?

thanks!
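The thread contains no answer. The samples on this page upload complete audio files, so one possible workaround is to record a short chunk from the microphone and upload it as a WAV. A minimal sketch of the second half, wrapping raw PCM bytes in a WAV container with the standard library. Capturing audio needs a third-party library and is not shown. The defaults (16 kHz, 16-bit, mono) are an assumption about what the service accepts and should be checked against the API documentation.

```python
import io
import wave


def pcm_to_wav_bytes(pcm_frames, sample_rate=16000, channels=1, sample_width=2):
    """Wrap raw PCM bytes in a WAV container and return the file contents as bytes."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav_file:
        wav_file.setnchannels(channels)
        wav_file.setsampwidth(sample_width)  # bytes per sample: 2 means 16-bit
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(pcm_frames)
    return buffer.getvalue()


if __name__ == "__main__":
    one_second_of_silence = b"\x00\x00" * 16000
    wav_bytes = pcm_to_wav_bytes(one_second_of_silence)
    print(len(wav_bytes), "bytes, starts with", wav_bytes[:4])
```

The resulting bytes can be written to a temporary file and passed to the identification script as its audio file path.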

No module named http.client

I get an error saying "No module named http.client". Sorry, I'm not very experienced with Python. Where can I get the http.client module?

Full error after running the command "python CreateProfile.py (my subscription key)"
Traceback (most recent call last):
  File "CreateProfile.py", line 33, in <module>
    import IdentificationServiceHttpClientHelper
  File "/Users/Nabil/Desktop/Cognitive-SpeakerRecognition-Python/Identification/IdentificationServiceHttpClientHelper.py", line 33, in <module>
    import http.client
ImportError: No module named http.client
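`http.client` is part of the Python 3 standard library, so nothing needs to be installed. In Python 2 the equivalent module is named `httplib`, which is why this import fails when `python` resolves to a Python 2 interpreter. The README states that the samples use Python 3. A minimal sketch of a guard that makes the cause explicit. `http_client_module_name` is a hypothetical helper used only for illustration.

```python
import sys


def http_client_module_name(major_version):
    """Name of the standard-library HTTP client module for a given Python major version."""
    return "http.client" if major_version >= 3 else "httplib"


if sys.version_info[0] < 3:
    sys.exit("These samples target Python 3; try running them with python3.")

import http.client  # ships with every Python 3 installation

print("Python", sys.version_info[0], "uses", http_client_module_name(sys.version_info[0]))
```

On systems where both versions are installed, running the command as `python3 CreateProfile.py <subscription_key>` is usually enough.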

AN EASIER WAY TO GET STARTED

For anyone who tries this repo in the future, here is an easier way. I spent a lot of time on Speaker Recognition, and the official docs say one thing while the samples do another.
I will use the official REST API docs for the text-independent method.

ONCE AND FOR ALL, IF YOU WANT TO USE PYTHON, THIS IS THE EASIEST AND BEST WAY I FOUND:

CREATE PROFILE 👇

import http.client
import json

conn = http.client.HTTPSConnection('westus.api.cognitive.microsoft.com')

headers = {'Ocp-Apim-Subscription-Key': 'YOUR_KEY',
    'Content-type': 'application/json'}

foo = {'locale': 'en-us'}
json_data = json.dumps(foo)

conn.request(method='POST', url='/speaker/verification/v2.0/text-independent/profiles', 
             body=json_data, headers=headers)
response = conn.getresponse().read().decode()
conn.close()

print(response)

ENROLL PROFILE 👇

conn = http.client.HTTPSConnection('westus.api.cognitive.microsoft.com')

headers = {'Ocp-Apim-Subscription-Key': 'YOUR_KEY'}

# Placeholder: the ID of a profile created earlier (not defined in the original post)
profileId = 'YOUR_PROFILE_ID'

with open('YOUR_WAV_FILE', 'rb') as data:
    conn.request(method='POST', 
                url=f'/speaker/verification/v2.0/text-independent/profiles/{profileId}/enrollments?ignoreMinLength=true', 
                body=data, headers=headers)
    response = conn.getresponse().read().decode()
    conn.close()
    print(response)

LIST PROFILES 👇

conn = http.client.HTTPSConnection('westus.api.cognitive.microsoft.com')

headers = {'Ocp-Apim-Subscription-Key': 'YOUR_KEY'}

conn.request(method='GET', url='/speaker/verification/v2.0/text-independent/profiles?$top=10', 
             headers=headers)
response = conn.getresponse().read().decode()
conn.close()
print(response)

And you can now make your own calls using this method and following the docs link I mentioned above.
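The three snippets above repeat the same connection and header setup. A minimal sketch of factoring that into one function, using only the host, path, and header names that appear in the post. `build_headers`, `call_api`, `create_profile`, and `list_profiles` are hypothetical names, and nothing here has been verified against the live service.

```python
import http.client
import json

HOST = "westus.api.cognitive.microsoft.com"  # region host from the post
BASE_PATH = "/speaker/verification/v2.0/text-independent/profiles"  # path from the post


def build_headers(subscription_key, content_type=None):
    """Build the request headers; Content-Type is added only when a body needs it."""
    headers = {"Ocp-Apim-Subscription-Key": subscription_key}
    if content_type:
        headers["Content-Type"] = content_type
    return headers


def call_api(method, path, subscription_key, body=None, content_type=None):
    """Send one request and return (HTTP status code, decoded response body)."""
    conn = http.client.HTTPSConnection(HOST)
    try:
        conn.request(method, path, body=body,
                     headers=build_headers(subscription_key, content_type))
        response = conn.getresponse()
        return response.status, response.read().decode()
    finally:
        conn.close()  # always release the connection, even if the request fails


def create_profile(subscription_key, locale="en-us"):
    return call_api("POST", BASE_PATH, subscription_key,
                    body=json.dumps({"locale": locale}),
                    content_type="application/json")


def list_profiles(subscription_key, top=10):
    return call_api("GET", f"{BASE_PATH}?$top={top}", subscription_key)
```

Returning the HTTP status alongside the body makes failures such as a 400 Bad Request visible instead of only printing the response text.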

Cheers and Good Luck! 🍻

Getting Bad request error for enrolment - Speaker Recognition

From @rajagopal28 on May 17, 2016 8:4

Hi,
I've been trying to enroll a voice file for a created profile using the python API.
I was able to create a profile and list all profiles successfully. But when I try to enroll a voice (.wav) file containing a simple "hello world" phrase for the created profile, I get the error 'ERROR:root:Error enrolling profile.', and the trace says 'Exception: Error enrolling profile: Bad Request'. If needed, I can attach the stack trace. Can you help me get started with this?

Copied from original issue: microsoft/ProjectOxford-ClientSDK#66
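The report does not include the server's reason for the Bad Request. Two things worth checking are the audio format and its length, since the sample scripts elsewhere on this page mention a recommended minimum audio length for enrollment and a "hello world" clip is very short. A minimal sketch of inspecting a WAV file with the standard library. `describe_wav` is a hypothetical helper, and the values to compare against (commonly 16 kHz, 16-bit, mono PCM) are an assumption that should be confirmed in the API documentation.

```python
import wave


def describe_wav(path):
    """Return basic format facts about a WAV file."""
    with wave.open(path, "rb") as wav_file:
        frames = wav_file.getnframes()
        rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_width_bytes": wav_file.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_seconds": frames / float(rate),
        }


if __name__ == "__main__":
    import sys
    if len(sys.argv) > 1:
        print(describe_wav(sys.argv[1]))
```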

How to change the predefined phrases ?

Hi,
I've been using the Speaker Recognition API. For enrollment we have to say one of the predefined phrases,
but what if I don't want to enroll my voice with a predefined phrase? What if I want to say "Abra Kadavra"? Can anyone please help me with this?

DEPENDENCIES LIST ?

My issue is quite simple :

WHERE IS THE LIST OF DEPENDENCIES THAT WE NEED TO RUN THIS ?

Please provide a requirements.txt so that we can at least use this demo. The project in its current state requires me to debug lots of dependency issues before I can actually run any script.

I expect better from a company like Microsoft.
