Comments (18)
@OmriBenShoham Thank you for using JS Speech SDK and writing this issue up. You asked about a 30 second limit using startContinuousRecognitionAsync()...that limit does not exist. Even with silence, continuous recognition should not stop until 10 minutes pass, or you call stopContinuousRecognitionAsync().
Note that we do have a ReactJS integration sample using the JS Speech SDK, but the sample uses the recognizeOnceAsync() API, meaning mic input stops on the first recognized result from the service.
I can adapt that sample to add continuous start/stop options...it wouldn't shock me if the ReactJS state primitives don't play well with ongoing event callbacks, but in theory continuous recognition should work with the sample. Could you look at that sample and adapt it first to see if it addresses this issue?
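For readers following along, here is a minimal sketch of what a continuous start/stop adaptation might look like. It is not the code from the sample: `startContinuousFromMic`, `token`, `region` and `onText` are hypothetical names, and the SDK module is passed in as `speechsdk` purely to keep the wiring visible. It assumes token-based auth and a default microphone.

```javascript
// Sketch only. `speechsdk` is the imported microsoft-cognitiveservices-speech-sdk
// module; `token`, `region` and `onText` are hypothetical names for illustration.
function startContinuousFromMic(speechsdk, token, region, onText) {
  const speechConfig = speechsdk.SpeechConfig.fromAuthorizationToken(token, region);
  speechConfig.speechRecognitionLanguage = "en-US"; // assumed language

  const audioConfig = speechsdk.AudioConfig.fromDefaultMicrophoneInput();
  const recognizer = new speechsdk.SpeechRecognizer(speechConfig, audioConfig);

  // Fires once per finalized phrase. Unlike recognizeOnceAsync(), the
  // session stays open after the first result.
  recognizer.recognized = (_sender, event) => {
    if (event.result.reason === speechsdk.ResultReason.RecognizedSpeech) {
      onText(event.result.text);
    }
  };

  // Log cancellations (errors, end of stream) instead of failing silently.
  recognizer.canceled = (_sender, event) => {
    console.warn("Recognition canceled:", event.errorDetails);
  };

  recognizer.startContinuousRecognitionAsync();

  // Hand back a stop function so the caller decides when recognition ends.
  return () => recognizer.stopContinuousRecognitionAsync(() => recognizer.close());
}
```

In a React component, the returned stop function would probably be held in a ref rather than in state, so that re-renders do not create a second recognizer while the first is still running. That is a design guess, not something the sample confirms.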
from cognitive-services-speech-sdk-js.
@glharper Thank you for your fast response.
I tried adapting the example you sent me and used recognizer.recognizeOnceAsync.
My recording still stopped after 30 seconds.
I also checked, and I'm not calling stopContinuousRecognitionAsync.
Is there a place where I can see the recognizer's logs?
@OmriBenShoham I've added a branch with continuous recognition integration here. Let me know if it works for you.
@glharper I tried my recording, which is 35 seconds long, with the sample branch you provided.
It still cut me off after exactly 30 seconds. Is continuous recognition working for you with a recording longer than 30 seconds?
@OmriBenShoham Can you try speaking into your mic for more than 30 seconds? The audio config for that branch is fromDefaultMicrophoneInput(). Alternatively, you could change the sttFromMicrophoneContinuous method to use file input (see the fileChange() method for the audioConfig setup needed.)
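As a rough illustration of the file-input alternative mentioned above, the audio config could be chosen like this. The helper name `chooseAudioConfig` and the `wavFile` parameter are hypothetical, and the sketch assumes the input is a WAV file (a browser File or a Node Buffer) passed to the SDK's `AudioConfig.fromWavFileInput()`; the sample's own fileChange() method may differ.

```javascript
// Sketch only. `speechsdk` is the imported SDK module; `wavFile` is a
// hypothetical File (browser) or Buffer (Node) holding WAV audio.
function chooseAudioConfig(speechsdk, wavFile) {
  return wavFile
    ? speechsdk.AudioConfig.fromWavFileInput(wavFile)     // recognize from a recording
    : speechsdk.AudioConfig.fromDefaultMicrophoneInput(); // fall back to the live mic
}
```

The resulting config is passed to the SpeechRecognizer constructor in place of the microphone config; the rest of the continuous-recognition code stays the same.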
@glharper I spoke into my microphone for 35 seconds using "Convert speech to text from your mic (Continuous).", but only the first 30 seconds were recognized.
I'm also attaching an image of the button I used from the sample branch you sent me.
@OmriBenShoham For me, the results from the microphone keep updating, even after 30 seconds. I don't think this is an SDK issue, or even an issue with the sample. To turn on diagnostics, add this call at the beginning of the sttFromMicrophoneContinuous method:
speechsdk.Diagnostics.SetLoggingLevel(speechsdk.LogLevel.Debug);
If you can grab the console output and attach it here as a text file, I can take a look and see if I can understand what's happening.
@glharper
I uploaded the console output logs as a file here.
I removed some token logs that were in there (I guess they're not relevant). I can see InitialSilenceTimeout as a RecognitionStatus in some of the logs, but I'm not sure if it's related.
consoleOutputContinous.log
@OmriBenShoham I think your audio is cutting out on your end after 45 seconds or so. From the log:
- at 2023-10-24T15:01:40 audio recognition is started by service (turn.start)
- (multiple intermediate results (speech.hypothesis))
- 2023-10-24T15:02:21.185Z: first speech.phrase (you should see the first recognized text in the sample)
- (one more intermediate result (speech.hypothesis))
- 2023-10-24T15:02:24.445Z: the second (and last) speech.phrase (you should see the text change in your sample)
- 2023-10-24T15:02:44.165Z: initial silence timeout received, with no intermediate results
Something on your end is cutting off your audio at 45 seconds.
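To make the "45 seconds or so" figure concrete, the offsets from turn.start can be worked out from the timestamps quoted above. This is only arithmetic on those four values; it assumes the first timestamp is UTC with zero milliseconds, since the log summary gives it to the second.

```javascript
// Timestamps copied from the log summary above. The turn.start value is
// assumed to be UTC with zero milliseconds.
const turnStart = Date.parse("2023-10-24T15:01:40.000Z");

const events = [
  ["first speech.phrase", "2023-10-24T15:02:21.185Z"],
  ["last speech.phrase", "2023-10-24T15:02:24.445Z"],
  ["initial silence timeout", "2023-10-24T15:02:44.165Z"],
];

// Offset of each event from turn.start, in milliseconds.
const offsets = events.map(([label, iso]) => ({
  label,
  ms: Date.parse(iso) - turnStart,
}));

for (const { label, ms } of offsets) {
  console.log(`${label}: +${(ms / 1000).toFixed(3)} s`);
}
// first speech.phrase: +41.185 s
// last speech.phrase: +44.445 s
// initial silence timeout: +64.165 s

// Silence between the last recognized phrase and the timeout.
const silentGapMs = offsets[2].ms - offsets[1].ms;
console.log(`gap before timeout: ${(silentGapMs / 1000).toFixed(2)} s`); // 19.72 s
```

So the last recognized phrase lands about 44 seconds into the turn, and the service reports the silence timeout roughly 20 seconds after that.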
@glharper
There is nothing special happening on my side after 45 seconds.
Could we jump on a quick call or Slack so I can better demonstrate the problem to you?
@glharper There is nothing special happening on my side after 45 seconds. Could we jump on a quick call or Slack so I can better demonstrate the problem to you?
Sure, send me an email at <my_username>(at)microsoft(dot)com and we'll set something up.
@glharper
Thanks! I sent you an email.
@glharper
Kind reminder :) I sent you an email. If you can, please reply and we will set up a quick call.
@glharper Kind reminder :) I sent you an email. If you can, please reply and we will set up a quick call.
I'm not seeing it in my inbox. Could you check where you sent it? (My username is glharper, or GLHARPER in caps :-)
@glharper I sent you a connection request with a note on LinkedIn :)
Closing, resolved via Teams meeting
Hi @glharper
What are the scenarios in which speechEndDetected is triggered?
So what was the resolution for this?