Comments (6)
@faster-than-human Thanks for using the JS Speech SDK, and writing this issue up. The description for EnableDictation from our .NET SDK reads "With dictation enabled, word descriptions of sentence structures are understood. For example, ending a statement by saying "question mark" adds a question mark to the sentence when dictation is enabled."
So your expectation here is that punctuation is not displayed, even when "question mark" is spoken?
from cognitive-services-speech-sdk-js.
Hello @glharper,
Thank you for the questions. What I expected to happen was to see no punctuation unless I dictated it. Going back to my initial example I stated...
"some guy walked into my house"
I received...
"some guy walked into my house." <- Notice the period is provided automatically even though I never said, "period".
I was under the impression that unless I said "period" I should not see a period in the result. Users who dictate often (e.g. with Dragon Medical from Nuance) are used to saying punctuation while dictating, so if the service is going to auto-punctuate anyway, what is the point of enableDictation() for the average dictation user?
Thank you.
I am also not sure if this is relevant here, but I have also received reports and can reproduce another strange issue with dictation mode being enabled that doesn't make sense in the dictation context. If you dictate part of a sentence...
"The patient walked into"
Turn off the microphone, turn the microphone on and dictate again...
"my office today complaining of a headache"
I would expect "my" to be lower case in dictation mode because the sentence was not terminated, but the result I get is "My office today complaining of a headache". So any time you cycle the microphone, the first word is capitalized, even if the previous sentence was never terminated.
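For anyone hitting the same thing, here is a rough client-side workaround, sketched under assumptions rather than taken from the SDK. `adjustLeadingCapital` is a hypothetical helper name: it lowercases the first letter of a new segment when the text already in the document does not end with terminal punctuation. It cannot recognize proper nouns, so it only leaves the pronoun "I" and all-caps words alone.

```typescript
// Hypothetical post-processing helper, not part of the Speech SDK.
// previousText: what is already in the document before the microphone was cycled.
// segment: the newly recognized text returned after the microphone was turned back on.
function adjustLeadingCapital(previousText: string, segment: string): string {
  const prior = previousText.trimEnd();

  // Start of the document or a terminated sentence: keep the capital.
  if (prior.length === 0 || /[.?!]$/.test(prior)) {
    return segment;
  }

  const text = segment.trimStart();
  const firstWord = text.split(/\s+/, 1)[0];

  // Leave the pronoun "I" (including "I'm", "I've") and all-caps words such as "MRI" untouched.
  const isPronounI = /^I(?:$|['’])/.test(firstWord);
  const isAllCaps =
    firstWord.length > 1 &&
    /[A-Z]/.test(firstWord) &&
    firstWord === firstWord.toUpperCase();
  if (isPronounI || isAllCaps) {
    return text;
  }

  // Otherwise the sentence is still open, so undo the automatic capital.
  return text.charAt(0).toLowerCase() + text.slice(1);
}

const fixed = adjustLeadingCapital(
  "The patient walked into",
  "My office today complaining of a headache"
);
console.log(fixed); // my office today complaining of a headache
```

A proper noun at the start of a resumed segment would be lowercased by this sketch, so a real app would probably need an allow-list or a way for the user to override it.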
Hello Everyone. I wanted to follow up on this thread. Are there any other questions I could answer for you?
@faster-than-human Hi there,
This is intended as there are really 3 modes of punctuation at play here:
- Explicit, which is set by the service property you mentioned, where punctuation will not be added unless the user specifically says the punctuation words, such as "period"
- Explicit and Intelligent, which is the enableDictation() API, where punctuation is inferred from the speech and explicit phrases such as "question mark" are also translated. This is where the extra punctuation is coming from.
- Intelligent, which is the default, will not translate explicit phrases, but will add inferred punctuation.
Example utterance: "What do we really do question mark"
- Explicit: What do we really do?
- Explicit and Intelligent: What do we really do?
- Intelligent: What do we really do? question mark.
Example utterance: "What do we really do" <- note without the explicit punctuation
- Explicit: What do we really do
- Explicit and Intelligent: What do we really do?
- Intelligent: What do we really do?
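As a rough sketch of how the three modes might be selected in code: `enableDictation()` and `setServiceProperty()` are SpeechConfig methods in the JS SDK, but the property name and value used for explicit mode (`"punctuation"`, `"explicit"`) are an assumption here, since this thread only refers to "the service property you mentioned". `applyPunctuationMode` and `SpeechConfigLike` are hypothetical names, and whether explicit mode also needs `enableDictation()` is not stated above.

```typescript
// The three punctuation modes listed above.
type PunctuationMode = "explicit" | "explicitAndIntelligent" | "intelligent";

// Structural stand-in for the SDK's SpeechConfig so the sketch compiles
// without the SDK installed; a real SpeechConfig should satisfy it.
interface SpeechConfigLike {
  enableDictation(): void;
  setServiceProperty(name: string, value: string, channel: number): void;
}

// Hypothetical helper. uriQueryChannel is expected to be
// sdk.ServicePropertyChannel.UriQueryParameter when used with the real SDK.
function applyPunctuationMode(
  config: SpeechConfigLike,
  mode: PunctuationMode,
  uriQueryChannel: number
): void {
  switch (mode) {
    case "explicit":
      // Assumed property name and value: only spoken punctuation is emitted.
      config.setServiceProperty("punctuation", "explicit", uriQueryChannel);
      break;
    case "explicitAndIntelligent":
      // Dictation mode: inferred punctuation plus spoken punctuation.
      config.enableDictation();
      break;
    case "intelligent":
      // Service default: nothing to configure.
      break;
  }
}
```

With the real SDK the call would look something like `applyPunctuationMode(speechConfig, "explicit", sdk.ServicePropertyChannel.UriQueryParameter)` before creating the recognizer.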
Hopefully with this combination of options you can find one that is best for your scenario. If there is something that you feel is missing from the offering please let us know.
I'm closing this issue to try and keep the list of open issues down, but feel free to re-open if you think it is necessary.
Hello @chschrae,
Thank you for that explanation; it adequately covers my questions about that function. With that being the case, I will continue to use the explicit option, since my app is for medical dictation and the common workflow is to never auto-punctuate for the user.
I was wondering if you had any thoughts on the second item I noted about capitalization when using dictation mode. Is there any way to avoid having the first character automatically capitalized when the microphone is turned on and the word is not supposed to be capitalized (e.g. it is not a proper noun)? My users are getting annoyed that they cannot pause dictation for a moment without unnecessary capitalization occurring.
If you feel my question about the auto-cap issue should be its own thread, please let me know and I will open a separate ticket for it.
Thank you.