Comments (4)
@syama-aot Thanks for using the JS Speech SDK and for submitting this issue. After creating a PronunciationAssessmentConfig instance, you must set that instance's enableProsodyAssessment property to true, e.g.
pronunciationAssessmentConfig.enableProsodyAssessment = true;
Hope that helps.
from cognitive-services-speech-sdk-js.
@glharper,
Thank you for the reply.
const pronunciationAssessmentConfig =
sdk.PronunciationAssessmentConfig.fromJSON(
'{"GradingSystem": "FivePoint", "Granularity": "Word", "EnableMiscue": "False"}'
);
pronunciationAssessmentConfig.enableProsodyAssessment = true;
pronunciationAssessmentConfig.enableContentAssessmentWithTopic(
"Talk about your day today"
);
// setting the recognition language to English.
speechConfig.speechRecognitionLanguage = "en-US";
let startTime = 0; // Initialize start time
// create the speech recognizer.
var reco = new sdk.SpeechRecognizer(speechConfig, audioConfig);
pronunciationAssessmentConfig.applyTo(reco);
let totalResultText = ""; // Accumulate recognized text
let pronunciationScores = []; // Store pronunciation assessment scores
let recognitionComplete = false; // Flag to track recognition completion
let contentAssessmentResult = null;
function onRecognizedResult(result) {
if (!recognitionComplete) {
totalResultText += result.text; // Append recognized text
const pronunciationResult =
sdk.PronunciationAssessmentResult.fromResult(result);
// console.log("pronounce result", pronunciationResult);
// console.log("actual result", pronunciationResult);
pronunciationScores.push({
text: result.text,
accuracyScore: pronunciationResult.accuracyScore,
pronunciationScore: pronunciationResult.pronunciationScore,
completenessScore: pronunciationResult.completenessScore,
fluencyScore: pronunciationResult.fluencyScore,
prosodyScore: pronunciationResult.prosodyScore,
result: pronunciationResult.contentAssessmentResult,
});
// contentAssessmentResult =
// pronunciationScores?.[pronunciationScores?.length - 1];
_.forEach(pronunciationResult.detailResult.Words, (word, idx) => {
console.log(
" ",
idx + 1,
": word: ",
word.Word,
"\taccuracy score: ",
word.PronunciationAssessment.AccuracyScore,
"\terror type: ",
word.PronunciationAssessment.ErrorType,
";"
);
});
}
}
reco.recognizing = (s, e) => {
// console.log(`Recognizing: ${e.result.text}`);
onRecognizedResult(e.result);
};
reco.recognized = (s, e) => {
if (e.result.reason === sdk.ResultReason.RecognizedSpeech) {
console.log(`Recognized: ${e.result.text}`);
onRecognizedResult(e.result);
} else if (e.result.reason === sdk.ResultReason.NoMatch) {
console.log("No speech could be recognized.");
}
};
reco.sessionStopped = (s, e) => {
console.log("Session stopped event received.");
recognitionComplete = true; // Set recognition completion flag
// console.log("Total recognized text:", totalResultText);
// console.log("Pronunciation assessment scores:", pronunciationScores);
// console.log("Content assessment Result", contentAssessmentResult);
const endTime = Date.now(); // Measure end time
const totalTimeSpent = endTime - startTime; // Calculate total time spent
console.log("Total time spent (ms):", totalTimeSpent);
};
reco.startContinuousRecognitionAsync(
() => {
console.log("Continuous recognition started.");
startTime = Date.now(); // Set start time when recognition starts
},
(err) => {
console.error("Error starting continuous recognition:", err);
}
);
This is the code I have. I am still receiving only Mispronunciation as the error type; MissingBreak and UnexpectedBreak never appear in the API results.
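To confirm which error types are actually coming back, it can help to tally the ErrorType values across all words instead of reading them from the console one by one. The following is a minimal sketch, not part of the SDK: tallyErrorTypes and the sample data are hypothetical, and only the Words / PronunciationAssessment.ErrorType field names are taken from the logging loop above.

```javascript
// Hypothetical helper: count how often each ErrorType appears in a list of
// word-level results (the same shape as pronunciationResult.detailResult.Words).
function tallyErrorTypes(words) {
  const counts = {};
  for (const word of words || []) {
    const assessment = word.PronunciationAssessment || {};
    // Words without an ErrorType field are grouped under "Unknown".
    const errorType = assessment.ErrorType || "Unknown";
    counts[errorType] = (counts[errorType] || 0) + 1;
  }
  return counts;
}

// Made-up sample data for illustration only; real values come from the service.
const sampleWords = [
  { Word: "today", PronunciationAssessment: { AccuracyScore: 95, ErrorType: "None" } },
  { Word: "was", PronunciationAssessment: { AccuracyScore: 40, ErrorType: "Mispronunciation" } },
  { Word: "good", PronunciationAssessment: { AccuracyScore: 90, ErrorType: "None" } },
];

console.log(tallyErrorTypes(sampleWords)); // counts: None 2, Mispronunciation 1
```

In the code above, this could be called inside onRecognizedResult with pronunciationResult.detailResult.Words, which would show at a glance whether anything other than None and Mispronunciation is ever returned.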
@syama-aot In the Speech Studio, looking at the sample code (under Developer Resources) for JavaScript, I see this comment:
// For continuous pronunciation assessment mode, the service won't return the words with `Insertion` or `Omission`
// We need to compare with the reference text after received all recognized words to get these error words.
The sample code then details logic to perform that comparison. Is that logic what you're asking for?
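For illustration, a comparison of that kind might look roughly like the sketch below. It is not the Speech Studio sample's own logic: diffWords and normalize are hypothetical helpers, the reference and recognized sentences are made up, and the alignment uses a plain longest-common-subsequence table, which is only one of several ways to diff two word lists.

```javascript
// Hypothetical helper: lowercase the text, strip punctuation (keeping
// apostrophes), and split it into words.
function normalize(text) {
  return text
    .toLowerCase()
    .replace(/[^\p{L}\p{N}'\s]/gu, "")
    .split(/\s+/)
    .filter(Boolean);
}

// Hypothetical helper: align recognized words with reference words and label
// words found only in the reference as Omission and words found only in the
// recognized text as Insertion.
function diffWords(referenceText, recognizedText) {
  const ref = normalize(referenceText);
  const rec = normalize(recognizedText);

  // lcs[i][j] = length of the longest common subsequence of ref[i..] and rec[j..]
  const lcs = Array.from({ length: ref.length + 1 }, () =>
    new Array(rec.length + 1).fill(0)
  );
  for (let i = ref.length - 1; i >= 0; i--) {
    for (let j = rec.length - 1; j >= 0; j--) {
      lcs[i][j] =
        ref[i] === rec[j]
          ? lcs[i + 1][j + 1] + 1
          : Math.max(lcs[i + 1][j], lcs[i][j + 1]);
    }
  }

  // Walk the table from the start, emitting one labeled entry per word.
  const result = [];
  let i = 0;
  let j = 0;
  while (i < ref.length && j < rec.length) {
    if (ref[i] === rec[j]) {
      result.push({ word: ref[i], errorType: "None" });
      i++;
      j++;
    } else if (lcs[i + 1][j] >= lcs[i][j + 1]) {
      result.push({ word: ref[i], errorType: "Omission" });
      i++;
    } else {
      result.push({ word: rec[j], errorType: "Insertion" });
      j++;
    }
  }
  while (i < ref.length) result.push({ word: ref[i++], errorType: "Omission" });
  while (j < rec.length) result.push({ word: rec[j++], errorType: "Insertion" });
  return result;
}

// Made-up example: "very" was inserted and "today" was omitted.
console.log(diffWords("Talk about your day today", "talk about your very day"));
```

In continuous mode, the recognized text would be the concatenation of every final result's text once the session has stopped, so the comparison runs once over the whole utterance rather than per segment.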
@glharper I have a similar question with the SDK in Swift.
I did enable:
pronAssessmentConfig.enableProsodyAssessment()
But the word-level assessment results never return UnexpectedBreak/MissingBreak. The only errorType I can get is "Mispronunciation".
Here is my setup:
let speechRecognizer = try! SPXSpeechRecognizer(speechConfiguration: speechConfig, language: "en-US", audioConfiguration: audioConfig)
let pronAssessmentConfig = try! SPXPronunciationAssessmentConfiguration("", gradingSystem: SPXPronunciationAssessmentGradingSystem.hundredMark, granularity: SPXPronunciationAssessmentGranularity.phoneme, enableMiscue: false)
pronAssessmentConfig.enableProsodyAssessment()
try! pronAssessmentConfig.apply(to: speechRecognizer)