watson-developer-cloud / speech-to-text-nodejs

:microphone: Sample Node.js Application for the IBM Watson Speech to Text Service

Home Page: https://speech-to-text-demo.ng.bluemix.net

License: Apache License 2.0

JavaScript 94.59% CSS 4.75% Dockerfile 0.44% Shell 0.22%

speech-to-text-nodejs's Introduction

🎤 Speech to Text Demo

A Node.js sample application that shows some of the IBM Watson Speech to Text service features.


The Speech to Text service uses IBM's speech recognition capabilities to convert speech in multiple languages into text. The transcription of incoming audio is continuously sent back to the client with minimal delay, and it is corrected as more speech is heard. The service is accessed via a WebSocket interface; a REST HTTP interface is also available.
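
For orientation, here is a minimal sketch (not taken from this repo, and assuming the ibm-watson Node SDK with credentials in environment variables) of streaming a local file over the WebSocket interface and printing interim results as they are refined:

    // Hedged sketch: stream audio over the WebSocket interface with the ibm-watson SDK.
    const fs = require('fs');
    const SpeechToTextV1 = require('ibm-watson/speech-to-text/v1');
    const { IamAuthenticator } = require('ibm-watson/auth');

    const speechToText = new SpeechToTextV1({
      authenticator: new IamAuthenticator({ apikey: process.env.SPEECH_TO_TEXT_IAM_APIKEY }),
      serviceUrl: process.env.SPEECH_TO_TEXT_URL,
    });

    const recognizeStream = speechToText.recognizeUsingWebSocket({
      contentType: 'audio/flac',
      interimResults: true, // interim hypotheses are corrected as more speech is heard
      objectMode: true,     // emit parsed result objects instead of raw transcript text
    });

    fs.createReadStream('audio.flac').pipe(recognizeStream); // 'audio.flac' is a placeholder file
    recognizeStream.on('data', (event) => console.log(JSON.stringify(event)));
    recognizeStream.on('error', (err) => console.error(err));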

You can view a demo of this app.

Prerequisites

  1. Sign up for an IBM Cloud account.
  2. Download the IBM Cloud CLI.
  3. Create an instance of the Speech to Text service and get your credentials:
    • Go to the Speech to Text page in the IBM Cloud Catalog.
    • Log in to your IBM Cloud account.
    • Click Create.
    • Click Show to view the service credentials.
    • Copy the apikey value.
    • Copy the url value.

Configuring the application

  1. In the application folder, copy the .env.example file and create a file called .env

    cp .env.example .env
    
  2. Open the .env file and add the service credentials that you obtained in the previous step.

    Example .env file that configures the apikey and url for a Speech to Text service instance hosted in the US East region:

    SPEECH_TO_TEXT_IAM_APIKEY=X4rbi8vwZmKpXfowaS3GAsA7vdy17Qh7km5D6EzKLHL2
    SPEECH_TO_TEXT_URL=https://api.us-east.speech-to-text.watson.cloud.ibm.com
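
    The server reads these variables at startup; a minimal sketch of how that typically looks with the dotenv package (illustrative, not this repo's exact code):

    // Hedged sketch: load and validate the .env credentials with dotenv.
    require('dotenv').config();

    const { SPEECH_TO_TEXT_IAM_APIKEY, SPEECH_TO_TEXT_URL } = process.env;
    if (!SPEECH_TO_TEXT_IAM_APIKEY || !SPEECH_TO_TEXT_URL) {
      throw new Error('Missing Speech to Text credentials in .env');
    }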
    

Running locally

  1. Install the dependencies

    npm install
    
  2. Run the application

    npm start
    
  3. View the application in a browser at localhost:3000

Deploying to IBM Cloud as a Cloud Foundry Application

  1. Login to IBM Cloud with the IBM Cloud CLI

    ibmcloud login
    
  2. Target a Cloud Foundry organization and space.

    ibmcloud target --cf
    
  3. Edit the manifest.yml file. Change the name field to something unique. For example, - name: my-app-name.

  4. Deploy the application

    ibmcloud app push
    
  5. View the application online at the app URL, for example: https://my-app-name.mybluemix.net

License

This sample code is licensed under Apache 2.0.

Contributing

See CONTRIBUTING.

Open Source @ IBM

Find more open source projects on the IBM GitHub page.

speech-to-text-nodejs's People

Contributors

aameek, andresfvilla, apaparazzi0329, arkwl, daniel-bolanos, dependabot[bot], dpopp07, ehdsouza, esbullington, germanattanasio, greenkeeperio-bot, iankit3, jeff-arn, jsstylos, kasaby, kevinkowa, kkeerthana, kognate, leibaogit, leonrch, lhuihui, lpatino10, madi-ji, mamoonraja, mediumtaj, mikemosca, nfriedly, ptitzler, sirspidey


speech-to-text-nodejs's Issues

build script is missing

The step npm run build in the README.md is failing because the script "build" doesn't exist in the package.json file. What should I execute in order to make the app run?

Play Sample doesn't work on Safari

Clicking Play Sample 1 (or 2) in Safari on my Mac does nothing. I understand that recording audio isn't supported on Safari, but Play Sample should work.

Using the library

Hello,

I was wondering whether the use of Bluemix is necessary to develop against the Speech to Text API.

If the answer is yes, is the free plan enough to use the API? I hope some of you with development experience can help me.

Bye

Redeploy the demo

Hi German & James, I've updated the navigation link, service icon, and favicon. Please help redeploy the demo :-)

It doesn't work on mobile phones

The microphone doesn't work when I use an iPad or Android device to test the demo. Could it be a browser problem? We hope not. 😱😱😱😱😱

strange audio capture behavior in firefox

In the STT demo, pressing the microphone button to speak works fine the first time. But if we press it to stop recording and then press it to start again, the STT output is very bad. It looks like something is wrong with audio capture the second time (and all subsequent attempts are bad too). This is not an issue in Chrome, where it seems to work fine every time.

Testing Chrome and Firefox side by side, if I start/stop recording in both Chrome and Firefox, this seems to make Firefox work better - seemingly suggesting Chrome may be "initializing" something in audio capture that Firefox is not?

developer experience checklist

  • make sure it scales; it could end up on the front page of Reddit
    • Mainly looking at page weight: shoot for 2-3 MB max and rendering in 5 seconds or less.
    • Test with dev tools throttling to 3G speeds and make sure things are still reasonable.
  • Add Google Analytics
  • blue-green deployment + travis (see this)
  • testing + travis (see this)
  • security.js (helmet + express-rate-limit) + CSRF (see the personality-insights and speech demos); a sketch follows this list
  • package.json should not specify node-engine so that Bluemix will always use the latest one.
  • Google reCAPTCHA support (make sure design takes this into account when designing a demo)
    • Talk to design to add it for existing demos.
  • ~~Update Travis to send emails when there is a tag release.~~ SDKs only
  • Bluemix deployment tracker and privacy notice
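
For the security item above, a minimal sketch (assumptions: an Express app using the helmet and express-rate-limit packages; not this repo's actual security.js):

// Hedged sketch: basic Express hardening with helmet and express-rate-limit.
const express = require('express');
const helmet = require('helmet');
const rateLimit = require('express-rate-limit');

const app = express();

app.use(helmet()); // sets common security-related HTTP headers

// Limit each IP to 100 requests per 15-minute window (values are illustrative).
app.use(rateLimit({ windowMs: 15 * 60 * 1000, max: 100 }));

app.listen(process.env.PORT || 3000);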

Smart_formatting

Hi,
I would like to add smart formatting to the Speech to Text service. Can anyone help me add this feature?
I am new to Bluemix, so please help me with this.
Thank you
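
A minimal sketch of how smart formatting could be enabled (an assumption based on the ibm-watson Node SDK, where the recognize parameter is exposed as smartFormatting; on the raw WebSocket interface the field is smart_formatting; not code from this repo):

// Hedged sketch: enable smart formatting on a recognize call with the ibm-watson SDK.
const fs = require('fs');
const SpeechToTextV1 = require('ibm-watson/speech-to-text/v1');
const { IamAuthenticator } = require('ibm-watson/auth');

const speechToText = new SpeechToTextV1({
  authenticator: new IamAuthenticator({ apikey: process.env.SPEECH_TO_TEXT_IAM_APIKEY }),
  serviceUrl: process.env.SPEECH_TO_TEXT_URL,
});

speechToText.recognize({
  audio: fs.createReadStream('audio.flac'), // placeholder audio file
  contentType: 'audio/flac',
  smartFormatting: true,                    // formats dates, times, numbers, currency, etc.
})
  .then((response) => console.log(JSON.stringify(response.result, null, 2)))
  .catch((err) => console.error(err));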

No Logs in the Console

I ran the app locally. It works, but I only see the very first console.log in my console ('listening to port'). After this no other logs are printed out.

When I run node inspector and open the URL in Chrome, nothing comes up. Not sure whether that's related to the first problem.

Is it just me? Any idea how to fix this? I looked briefly in the code but couldn't find where the logs could have been turned off. Maybe in one of the dependencies? Both issues make it really hard to extend the app.

[Speech to Text]: Not working on mobile

Hello,
I have deployed the app in Bluemix. The URL https://<...>.mybluemix.net works perfectly on laptops/desktops. It also opens the initial page on mobile, but the buttons for the microphone, file open dialog, etc. appear one below another instead of side by side. The important issue is that the buttons don't work when pressed.
Is there another version created for mobile?
If not, please let me know how to run it successfully on mobile.

Watson Speech to Text Service Failed to Work on Safari 9.x

Hi Team,

I am working on an iOS web app. I found that this Node.js-based SDK doesn't work on Safari 9.x. After investigation, I've found the following issues:

  1. Safari: const declarations are not supported in strict mode.
  2. WebSocket fails with "Invalid UTF-8 sequence in header value" when the request or response contains any empty header, such as "Content-Type".

To work around these two issues, I made some modifications to the local JavaScript code:

  1. change "const" to "var"
  2. as for the blank header in the token request, I added a blank check before posting it:
if ($('meta[name="ct"]').attr('content').length > 0) {
    tokenRequest.setRequestHeader('csrf-token',$('meta[name="ct"]').attr('content'));
}

But for the blank header in the response from the Watson server, I can do nothing =(
Is there any way to fix this? Thanks!

Here's some reference for the second issue

  1. https://bugs.webkit.org/show_bug.cgi?id=139298
  2. https://bugs.chromium.org/p/chromium/issues/detail?id=380075

No/wrong error message on Safari

I realize this isn't supported on Safari, but when trying the demo on the Mac under Safari, clicking "Record Audio" does nothing (no error message), and when I then click on something else (like Play Sample), I get "Currently audio is being record...".

The "Record Audio" should probably be grayed out if not supported

STT : main text window scroll to show most recent text

At the moment, when text overflows in the main text window, users have to scroll down to see the most recent text. Could we keep the text window always scrolled to the bottom, showing the most recently recognized text? Thanks, Vaibhava
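
A minimal sketch of the requested behavior (standard DOM scrolling; 'resultsText' is a hypothetical element id used only for illustration):

// Hedged sketch: keep a transcript container scrolled to the newest text.
function scrollToBottom(el) {
  el.scrollTop = el.scrollHeight;
}

// Call after each transcript update, e.g.:
// scrollToBottom(document.getElementById('resultsText'));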

microphone capture from W540 laptops

W540 laptops come with factory microphone settings that completely break the STT service. Manually adjusting the microphone settings to disable all audio enhancements works, but people do not know about this. Can we control the microphone better from the demo application and make sure all enhancements are off?

File upload menu stop running file

Need to ensure that if a file loaded via the file upload menu is stopped prematurely, the app doesn't try to send a socket message after the socket is closed (which results in an error message).
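
A minimal sketch of such a guard (standard browser WebSocket API, not this repo's exact code):

// Hedged sketch: only send if the socket is still open.
function safeSend(socket, message) {
  if (socket && socket.readyState === WebSocket.OPEN) {
    socket.send(message);
  }
}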

Sample1 and Sample2 : transcription runs ahead of playback

This looks 'magical': we transcribe even before hearing the audio, which makes the whole thing look canned :-). Can we pace the audio for our samples so that transcription follows the audio playback? This is only for 'sample 1' and 'sample 2' (for all 6 models), not for general file upload where we do not know the sampling rate.

[speech-to-text] 1006 Connection dropped by remote peer.

I'm testing files of roughly 75 MB through the WebSocket route.
After weeks of working consistently, today I am getting the message:
1006 Connection dropped by remote peer.
This is occurring immediately, with no result messages.

About to send ./in/1861034_xxx_lp.f4v-part000.flac
connect. config is {"maxReceivedFrameSize":1048576,"maxReceivedMessageSize":8388608,"fragmentOutgoingMessages":true,"fragmentationThreshold":16384,"webSocketVersion":13,"assembleFragments":true,"disableNagleAlgorithm":true,"closeTimeout":5000,"tlsOptions":{}}
1006 Connection dropped by remote peer.

Is this issue something to report here, or to another ticketing system for the S2T service itself?

Show the playback when the url contains a debug=true or 1

We should hide/show the playback button if the url contains a specific query parameter like debug or playback

This will be really useful for users trying to test their microphone. They will be able to see what they are sending to the service.

I would suggest using debug since that will allow us to do more than just playback.

speech-to-text-demo.mybluemix.net?debug=true

@kasaby you have already implemented the playback functionality. Can you detect the query parameter and show/hide it?
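
A minimal sketch of the query-parameter check (standard browser APIs; 'playback-button' is a hypothetical element id used only for illustration):

// Hedged sketch: show the playback button only when ?debug=true or ?debug=1 is present.
const query = new URLSearchParams(window.location.search);
const debugEnabled = query.get('debug') === 'true' || query.get('debug') === '1';
document.getElementById('playback-button').hidden = !debugEnabled;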

Investigate out of memory crashes

A long-running app can crash due to an out-of-memory error (with 768 MB).

Should investigate whether this is due to heavy simultaneous usage, a very large input, an ongoing memory leak, or something else.

The URL query argument(s) ['X-WDC-PL-OPT-OUT'] are not allowed.

For me, this demo works neither when I build it myself nor when I try the official version hosted by you. I tried it with a Logitech USB headset (which correctly records sound in the OS, and Firefox 42.0 shares it), but then I get the error from the title. I have also tried both an .ogg file downloaded from the TTS demo (which btw works; the other demo, not using the file, same error) as well as the files from the Python STT demo, which also fail with the same error. Finally, since there is "opt out" in the message, I played with the option for whether Watson may learn from this session. However, whether I allow it or not, this demo still breaks.

Any suggestions how to fix this?
Thanks in advance.

Best,
Joe

STT DEMO -- Play Sample 2, Text Box typos for French broadband model--missing apostrophe in d' opinion, d'expression

Re: STT DEMO -- Typos in Text box of French broadband model; see "Play Sample 2"

When testing STT demo for French broadband model, via this link --> https://speech-to-text-demo-june20th.mybluemix.net/, Keerthana and I sighted this issue.

There are a few typos in the "Text" tab of the demo page, for "Play Sample 2".

Expected:
i) d'opinion
ii) d'expression

Actual text:
i) d opinion
ii) d expression

See details below.
Please let me know if you have any questions or comments.

Alexandra and Keerthana

Complete text below:
Tout individu a droit à la liberté d opinion et d expression. Ce qui implique le droit de ne pas être inquiété pour ses opinions et celui de chercher de recevoir et de répandre sans considérations de frontières les informations et les idées par quelque moyen d expression que ce soit.

Also, here's the JSON, for your reference:

{
"results": [
{
"word_alternatives": [
{
"start_time": 4.06,
"alternatives": [
{
"confidence": 0.9998,
"word": "ce"
}
],
"end_time": 4.27
},
{
"start_time": 4.27,
"alternatives": [
{
"confidence": 0.9995,
"word": "qui"
}
],
"end_time": 4.46
},
{
"start_time": 4.46,
"alternatives": [
{
"confidence": 0.9959,
"word": "implique"
},
{
"confidence": 0.0036,
"word": "applique"
}
],
"end_time": 4.92
},
{
"start_time": 4.92,
"alternatives": [
{
"confidence": 1,
"word": "le"
}
],
"end_time": 5.08
},
{
"start_time": 5.08,
"alternatives": [
{
"confidence": 0.9998,
"word": "droit"
}
],
"end_time": 5.4
},
{
"start_time": 5.4,
"alternatives": [
{
"confidence": 0.9998,
"word": "de"
}
],
"end_time": 5.53
},
{
"start_time": 5.53,
"alternatives": [
{
"confidence": 1,
"word": "ne"
}
],
"end_time": 5.64
},
{
"start_time": 5.64,
"alternatives": [
{
"confidence": 0.9998,
"word": "pas"
}
],
"end_time": 5.95
},
{
"start_time": 5.95,
"alternatives": [
{
"confidence": 0.9998,
"word": "être"
}
],
"end_time": 6.21
},
{
"start_time": 6.21,
"alternatives": [
{
"confidence": 0.9999,
"word": "inquiété"
}
],
"end_time": 6.71
},
{
"start_time": 6.71,
"alternatives": [
{
"confidence": 1,
"word": "pour"
}
],
"end_time": 7.01
},
{
"start_time": 7.01,
"alternatives": [
{
"confidence": 1,
"word": "ses"
}
],
"end_time": 7.2
},
{
"start_time": 7.2,
"alternatives": [
{
"confidence": 1,
"word": "opinions"
}
],
"end_time": 7.75
},
{
"start_time": 7.75,
"alternatives": [
{
"confidence": 1,
"word": "et"
}
],
"end_time": 7.78
},
{
"start_time": 7.78,
"alternatives": [
{
"confidence": 1,
"word": "celui"
}
],
"end_time": 8.19
},
{
"start_time": 8.19,
"alternatives": [
{
"confidence": 0.9999,
"word": "de"
}
],
"end_time": 8.33
},
{
"start_time": 8.33,
"alternatives": [
{
"confidence": 1,
"word": "chercher"
}
],
"end_time": 9.05
},
{
"start_time": 9.42,
"alternatives": [
{
"confidence": 1,
"word": "de"
}
],
"end_time": 9.59
},
{
"start_time": 9.59,
"alternatives": [
{
"confidence": 1,
"word": "recevoir"
}
],
"end_time": 10.52
},
{
"start_time": 10.55,
"alternatives": [
{
"confidence": 0.9999,
"word": "et"
}
],
"end_time": 10.64
},
{
"start_time": 10.64,
"alternatives": [
{
"confidence": 1,
"word": "de"
}
],
"end_time": 10.8
},
{
"start_time": 10.8,
"alternatives": [
{
"confidence": 0.9999,
"word": "répandre"
}
],
"end_time": 11.49
},
{
"start_time": 11.49,
"alternatives": [
{
"confidence": 1,
"word": "sans"
}
],
"end_time": 11.76
},
{
"start_time": 11.76,
"alternatives": [
{
"confidence": 1,
"word": "considérations"
}
],
"end_time": 12.8
},
{
"start_time": 12.8,
"alternatives": [
{
"confidence": 0.9999,
"word": "de"
}
],
"end_time": 12.93
},
{
"start_time": 12.93,
"alternatives": [
{
"confidence": 0.9999,
"word": "frontières"
}
],
"end_time": 13.7
},
{
"start_time": 13.77,
"alternatives": [
{
"confidence": 0.9999,
"word": "les"
}
],
"end_time": 14.4
},
{
"start_time": 14.4,
"alternatives": [
{
"confidence": 0.9999,
"word": "informations"
}
],
"end_time": 15.15
},
{
"start_time": 15.15,
"alternatives": [
{
"confidence": 1,
"word": "et"
}
],
"end_time": 15.19
},
{
"start_time": 15.19,
"alternatives": [
{
"confidence": 1,
"word": "les"
}
],
"end_time": 15.39
},
{
"start_time": 15.39,
"alternatives": [
{
"confidence": 1,
"word": "idées"
}
],
"end_time": 15.71
},
{
"start_time": 15.71,
"alternatives": [
{
"confidence": 0.9999,
"word": "par"
}
],
"end_time": 16.02
},
{
"start_time": 16.02,
"alternatives": [
{
"confidence": 0.9998,
"word": "quelque"
}
],
"end_time": 16.42
},
{
"start_time": 16.42,
"alternatives": [
{
"confidence": 1,
"word": "moyen"
}
],
"end_time": 16.78
},
{
"start_time": 16.78,
"alternatives": [
{
"confidence": 0.9999,
"word": "d"
}
],
"end_time": 16.9
},
{
"start_time": 16.9,
"alternatives": [
{
"confidence": 0.9999,
"word": "expression"
}
],
"end_time": 17.48
},
{
"start_time": 17.48,
"alternatives": [
{
"confidence": 1,
"word": "que"
}
],
"end_time": 17.64
},
{
"start_time": 17.64,
"alternatives": [
{
"confidence": 0.997,
"word": "ce"
},
{
"confidence": 0.003,
"word": "se"
}
],
"end_time": 17.8
},
{
"start_time": 17.8,
"alternatives": [
{
"confidence": 0.9894,
"word": "soit"
},
{
"confidence": 0.0076,
"word": "soient"
}
],
"end_time": 18.05
}
],
"keywords_result": {
"frontières": [
{
"normalized_text": "frontières",
"start_time": 12.93,
"confidence": 1,
"end_time": 13.7
}
],
"idées": [
{
"normalized_text": "idées",
"start_time": 15.39,
"confidence": 1,
"end_time": 15.71
}
]
},
"alternatives": [
{
"word_confidence": [
[
"ce",
0.9449882800711019
],
[
"qui",
0.9889126786825012
],
[
"implique",
0.7893408949680603
],
[
"le",
1
],
[
"droit",
1
],
[
"de",
1
],
[
"ne",
1
],
[
"pas",
0.9999999999999686
],
[
"être",
0.9999999999999686
],
[
"inquiété",
0.9999999999998692
],
[
"pour",
0.999999999999869
],
[
"ses",
0.9999999999999156
],
[
"opinions",
0.9999999999998309
],
[
"et",
0.9999999999997383
],
[
"celui",
0.9999999999998693
],
[
"de",
0.9999999999998689
],
[
"chercher",
0.9999999999998691
],
[
"de",
0.9999999999999346
],
[
"recevoir",
0.9999999999999342
],
[
"et",
0.9999999999999345
],
[
"de",
0.9999999999999344
],
[
"répandre",
0.9999999999999345
],
[
"sans",
0.9999999999999348
],
[
"considérations",
0.9999999999998692
],
[
"de",
0.999999999999869
],
[
"frontières",
0.999999999999874
],
[
"les",
0.9999999999997897
],
[
"informations",
0.9999999999997999
],
[
"et",
0.9999999999998037
],
[
"les",
0.9999999999997448
],
[
"idées",
0.9999999999997871
],
[
"par",
0.9999999999998365
],
[
"quelque",
0.9999999999998694
],
[
"moyen",
0.9999999999999178
],
[
"d",
0.8446927233799434
],
[
"expression",
0.9999999999998421
],
[
"que",
0.9999999999998566
],
[
"ce",
0.9835743381046492
],
[
"soit",
0.9999999999998803
]
],
"confidence": 0.99,
"transcript": "ce qui implique le droit de ne pas être inquiété pour ses opinions et celui de chercher de recevoir et de répandre sans considérations de frontières les informations et les idées par quelque moyen d expression que ce soit ",
"timestamps": [
[
"ce",
4.06,
4.27
],
[
"qui",
4.27,
4.46
],
[
"implique",
4.46,
4.92
],
[
"le",
4.92,
5.08
],
[
"droit",
5.08,
5.4
],
[
"de",
5.4,
5.53
],
[
"ne",
5.53,
5.64
],
[
"pas",
5.64,
5.95
],
[
"être",
5.95,
6.21
],
[
"inquiété",
6.21,
6.71
],
[
"pour",
6.71,
7.01
],
[
"ses",
7.01,
7.2
],
[
"opinions",
7.2,
7.75
],
[
"et",
7.75,
7.78
],
[
"celui",
7.78,
8.19
],
[
"de",
8.19,
8.33
],
[
"chercher",
8.33,
9.05
],
[
"de",
9.42,
9.59
],
[
"recevoir",
9.59,
10.52
],
[
"et",
10.55,
10.64
],
[
"de",
10.64,
10.8
],
[
"répandre",
10.8,
11.49
],
[
"sans",
11.49,
11.76
],
[
"considérations",
11.76,
12.8
],
[
"de",
12.8,
12.93
],
[
"frontières",
12.93,
13.7
],
[
"les",
13.77,
14.4
],
[
"informations",
14.4,
15.15
],
[
"et",
15.15,
15.19
],
[
"les",
15.19,
15.39
],
[
"idées",
15.39,
15.71
],
[
"par",
15.71,
16.02
],
[
"quelque",
16.02,
16.42
],
[
"moyen",
16.42,
16.78
],
[
"d",
16.78,
16.9
],
[
"expression",
16.9,
17.48
],
[
"que",
17.48,
17.64
],
[
"ce",
17.64,
17.8
],
[
"soit",
17.8,
18.05
]
]
},
{
"transcript": "ce qui implique le droit de ne pas être inquiété pour ses opinions et celui de chercher de recevoir et de répandre sans considérations de frontières les informations et les idées par quelque moyen d expression que ce soient "
},
{
"transcript": "ce qui applique le droit de ne pas être inquiété pour ses opinions et celui de chercher de recevoir et de répandre sans considérations de frontières les informations et les idées par quelque moyen d expression que ce soit "
}
],
"final": true
}
],
"result_index": 1

Split front-end code into a library

First off, this is an awesome demo and you guys did a great job on it.

That said, the front-end code is fairly complex and I think that at least part of it probably belongs in a library that we stick in bower and/or npm and then just reference it here.

That would ease developing new applications because it would be something that one could simply drop in, with a clear line between "what code I can reuse without thinking about it" and "what code I need to write for myself".

doubts / questions / suggest / request

I have Dragon NaturallySpeaking 12 Professional installed and I created a UNIVERSAL VOICE PROFILE; I have author rights to it. This is a recent video (sorry for the Spanish audio):

https://youtu.be/wvVl1eeMBRo?t=117

  1. What is the delay, and why? Is it related to broadband?
  2. Is it possible to have this offline?
  3. What is the accuracy?
  4. Is it possible to add new vocabulary?
  5. What is the maximum recording time?
    …

Thanks

Recognized Text Formatting

First, thank you so much for your work.

I have a local Node system all set up and working well on a Linux VM. I can't seem to figure out why the recognized text prior to the final result is displayed as one long string. Maybe it's because I'm tired, but it's not clear to me. I would like each word, regardless of final result, to be displayed separately.

Currently, before the final result, it looks like this:

Thequickbrownfox

I would like it to display

The quick brown fox

Regardless of whether this gets corrected/changed.

Thanks!

Sample Audio File not working

See question on Dev Works: https://developer.ibm.com/answers/questions/205005/speech-to-text-demo-play-sample-audio-not-working.html

Steps Taken:

  1. Click 'Deploy to Bluemix' Button in the README file
  2. Click through to deploy application in my own bluemix space
  3. Click route (http://speech-to-text-deploybutton.mybluemix.net/) then click any of the options (record audio, upload file, play sample 1, play sample 2) and there is no response.

There is most likely a step I'm forgetting here, but I appreciate any help you can provide.

MediaStream.stop() deprecated in Chrome

When running this sample on Chrome Version 45.0.2454.101 (64-bit) for Mac OS, I get the following message on the JavaScript console from file Microphone.js

'MediaStream.stop()' is deprecated and will be removed in M47, around November 2015. Please use 'MediaStream.active' instead.
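
The usual replacement is to stop each track individually rather than the whole stream; a minimal sketch (standard Media Capture API, not this repo's Microphone.js):

// Hedged sketch: stop every track of a MediaStream instead of calling the deprecated stream.stop().
function stopStream(stream) {
  stream.getTracks().forEach(function (track) {
    track.stop();
  });
}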

app.js config wrong?

The app.js config doesn't work for me.
I had to change it to:

var config = {
  version: 'v1',
  url: 'https://stream.watsonplatform.net/speech-to-text/api',
  username: '<username>',
  password: '<password>'
};

STT DEMO KWS -- Text normalization is needed for all models; all keywords are erroneously not spotted for all non-US English models

Re: STT DEMO KWS -- Text normalization is needed for all models; all keywords are erroneously not spotted for all non-US English models. Currently, all keywords are spotted for only US English models.

When testing STT demo, via this link --> https://speech-to-text-demo-june20th.mybluemix.net/, I sighted this issue.

For all models (except the US English broadband and narrowband models), when the same keyword is entered more than once with different casing (lowercase vs. uppercase, e.g., para, PARA) in the keyword spotting box, not all of these words are spotted.

For this example, where text is..
"PERO QUIERO PREGUNTARLE SI SOLO EXISTEN PRODUCTOS. PARA LA PIEL PARA TRATAR EL ACNE. OH HAY OTRA FORMA DE TRATARLO. "

Expected:

  • para (2)
    • Start: 5.89 End: 6.12 Confidence: 100.0%
    • Start: 6.77 End: 7.02 Confidence: 99.9%
  • PARA (2)
    • Start: 5.89 End: 6.12 Confidence: 100.0%
    • Start: 6.77 End: 7.02 Confidence: 99.9%

Actual:
para

  • PARA (2)
    • Start: 5.89 End: 6.12 Confidence: 100.0%
    • Start: 6.77 End: 7.02 Confidence: 99.9%

Please let me know if you have any questions or comments.

Alexandra

Speech to text recognize method

I am trying to create a RESTful service in Bluemix which will call the Speech to Text service, but I want to send a base64 string or byte array instead of a file. Could you please help me with that?
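
A minimal sketch of one way to do this (assuming the ibm-watson Node SDK, which accepts a Buffer as the audio parameter; names like transcribeBase64 are illustrative, not from this repo):

// Hedged sketch: decode a base64 string into a Buffer and pass it to recognize().
const SpeechToTextV1 = require('ibm-watson/speech-to-text/v1');
const { IamAuthenticator } = require('ibm-watson/auth');

const speechToText = new SpeechToTextV1({
  authenticator: new IamAuthenticator({ apikey: process.env.SPEECH_TO_TEXT_IAM_APIKEY }),
  serviceUrl: process.env.SPEECH_TO_TEXT_URL,
});

// base64Audio would arrive in your REST request body; contentType must match its encoding.
function transcribeBase64(base64Audio, contentType) {
  return speechToText
    .recognize({
      audio: Buffer.from(base64Audio, 'base64'),
      contentType: contentType, // e.g., 'audio/wav' or 'audio/flac'
    })
    .then((response) => response.result);
}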

Installation error in ubuntu 32 bits

I have Ubuntu 14 with 2 GB of RAM but with a 32-bit architecture. I have changed:

sudo nano /home/notroot/speech-to-text-nodejs/server.js

var port = process.env.VCAP_APP_PORT || 3000;
to
var port = process.env.VCAP_APP_PORT || 3002;

When I run:

npm install

(error screenshot: errorspeech)

I have installed locally with

username: process.env.STT_USERNAME || '',
password: process.env.STT_PASSWORD || ''

For another instance, do I need to create another one?

Extra and confusing .js files in projects

It appears the public/flash and public/js folders are not necessary and should be removed. The microphone code seems to have been migrated to src/Microphone.js and related files. It took a while to determine that those extra folders weren't used anymore.

Thanks
