Note
Please see Azure Cognitive Services for Speech documentation for the latest supported speech solutions.
Modify Speech Recognition Engine Properties
Introduction
This document describes the properties of speech recognition (SR) engines that comply with the Microsoft Speech Platform. Applications can modify the properties of an SR engine using the methods of the ISpProperties interface.
ISpProperties
The ISpProperties interface enables the application to get or set various attributes for an instance of ISpRecognizer. The attributes are passed to the engine via the ISpProperties interface. Property values set using the ISpProperties methods remain in effect only for the current instance of ISpRecognizer, after which they revert to their default settings.
Each ISpProperties attribute is identified by a unique LONG value. The Speech Platform defines certain attributes, known as system attributes, whose values range from 0x0001 to 0xffff. Vendor-specific attributes are distinguished by a unique high-word value (two ANSI characters that identify the engine vendor).
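The numbering scheme above can be sketched in a few lines of C++. This is an illustration only: the helper names (isSystemAttribute, makeVendorAttribute, vendorTag) are hypothetical and not part of the Speech Platform, and the order in which the two ANSI characters are packed into the high word is an assumption, since this document does not specify it.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration; none of these names come from the
// Speech Platform headers. The platform stores attribute IDs as LONG;
// std::uint32_t is used here only to keep the bit operations well defined.

// System attributes occupy 0x0001 through 0xFFFF, so their high word is zero.
constexpr bool isSystemAttribute(std::uint32_t id) {
    return id >= 0x0001u && id <= 0xFFFFu;
}

// Pack two ANSI characters into the high word and an engine-specific index
// into the low word.
// ASSUMPTION: the first character is the more significant byte.
constexpr std::uint32_t makeVendorAttribute(char c0, char c1, std::uint16_t index) {
    return (static_cast<std::uint32_t>(static_cast<unsigned char>(c0)) << 24) |
           (static_cast<std::uint32_t>(static_cast<unsigned char>(c1)) << 16) |
           static_cast<std::uint32_t>(index);
}

// Extract the two-character vendor tag (the high word) from an attribute ID.
constexpr std::uint16_t vendorTag(std::uint32_t id) {
    return static_cast<std::uint16_t>(id >> 16);
}
```

Under these assumptions, an engine vendor tagged "MS" would build its first private attribute with makeVendorAttribute('M', 'S', 1), and isSystemAttribute would return false for the result because the high word is nonzero.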
Attributes may be LONGs, strings, or memory addresses.
SR Properties
The following table lists the speech recognition properties that are set by the application and passed to the SR engine via the Speech Platform. These attributes are not required for compliance with the Speech Platform. However, the ranges that accompany the attributes are required, while the exact interpretation of the values is left to the SR engine, so the implementation of each property can differ between engines. The ranges and defaults for each property in the Speech Platform are also shown.
dwAttrib Value | WCHAR Value | Meaning | Range |
SPPROP_RESOURCE_USAGE | ResourceUsage | Specifies the engine's CPU consumption. As the resource usage value increases, so does the CPU power the engine requires. | 0 - 100 Default = 50 |
SPPROP_HIGH_CONFIDENCE_THRESHOLD SPPROP_NORMAL_CONFIDENCE_THRESHOLD SPPROP_LOW_CONFIDENCE_THRESHOLD | HighConfidenceThreshold NormalConfidenceThreshold LowConfidenceThreshold | The threshold values are used to divide a confidence scale into four portions: rejected, low, medium, and high. The locations of the low confidence, normal confidence, and high confidence markers control how the confidence of a word is labeled. Note: SPPROP_LOW_CONFIDENCE_THRESHOLD is not used by the Microsoft Speech Platform. | 0 - 100 Defaults: LCT = 20 NCT = 50 HCT = 80 |
SPPROP_REJECTION_CONFIDENCE_THRESHOLD | CFGConfidenceRejectionThreshold | The speech recognition engine accepts full utterances with phrase confidence scores at or above this threshold, and rejects full utterances with phrase confidence scores below this threshold. This property accepts the values shown in the Range column. | -1 or 0 - 100 |
SPPROP_RESPONSE_SPEED | ResponseSpeed | This indicates the amount of silence the engine looks for before completing a recognition. This attribute is used when the recognition is not ambiguous. For example, in the case of a context-free grammar (CFG) which has two sentences: 1) new game please and 2) new game, a non-ambiguous recognition would be "new game please." | 0 - 10,000ms Default = 500ms |
SPPROP_COMPLEX_RESPONSE_SPEED | ComplexResponseSpeed | This indicates the amount of silence that the engine will look for before completing a recognition. This attribute is used when the recognition is ambiguous. For example, in the case of a CFG which has two sentences: 1) new game please and 2) new game, an ambiguous recognition would be "new game." This property's value must be greater than the ResponseSpeed value. | ResponseSpeed - 10,000ms Default = 750ms |
SPPROP_ENGINE_THREAD_PRIORITY | EngineThreadPriority | Sets the priority of the engine thread(s). The range of permitted values is defined by the OS. | Defined by the OS |
SPPROP_ASSUME_CFG_TRUSTED_SOURCE | AssumeCFGFromTrustedSource | Bypasses file integrity checks when loading a CFG, to reduce load time. This should *only* be used by applications that can guarantee that the CFG they are loading has previously been compiled by the application (or by another application it trusts) and has been stored in a secure location where it could not be edited by a malicious agent. | 0 (property is OFF) or 1 (property is ON) Default = 0 |
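The ranges in the table lend themselves to a small validation layer that an application could run before passing a value to the engine (for example, through ISpProperties::SetPropertyNum). The following C++ is a minimal sketch, not platform code: every type and function name in it is hypothetical. It also assumes that a score at or above a marker earns that marker's label, which is only one possible reading, because the exact interpretation of the thresholds is left to the SR engine.

```cpp
// Hypothetical validation helpers; the ranges and defaults come from the
// table above, but none of these names exist in the Speech Platform.

enum class Confidence { Rejected, Low, Medium, High };

struct Thresholds {
    long low;     // LowConfidenceThreshold
    long normal;  // NormalConfidenceThreshold
    long high;    // HighConfidenceThreshold
};

// Defaults listed in the table: LCT = 20, NCT = 50, HCT = 80.
constexpr Thresholds kDefaultThresholds{20, 50, 80};

constexpr bool inRange(long value, long lo, long hi) {
    return value >= lo && value <= hi;
}

// Each threshold must lie in 0 - 100.
constexpr bool thresholdsValid(Thresholds t) {
    return inRange(t.low, 0, 100) && inRange(t.normal, 0, 100) &&
           inRange(t.high, 0, 100);
}

// ASSUMPTION: a score at or above a marker receives that marker's label.
// A real engine may draw the boundaries differently.
constexpr Confidence labelConfidence(long score, Thresholds t) {
    return score >= t.high   ? Confidence::High
         : score >= t.normal ? Confidence::Medium
         : score >= t.low    ? Confidence::Low
                             : Confidence::Rejected;
}

// CFGConfidenceRejectionThreshold accepts -1 or a value in 0 - 100.
constexpr bool rejectionThresholdValid(long value) {
    return value == -1 || inRange(value, 0, 100);
}

// ResponseSpeed must lie in 0 - 10,000 ms, and ComplexResponseSpeed must be
// greater than ResponseSpeed without exceeding 10,000 ms.
constexpr bool responseSpeedsValid(long responseMs, long complexMs) {
    return inRange(responseMs, 0, 10000) && complexMs > responseMs &&
           complexMs <= 10000;
}
```

With the default thresholds, labelConfidence(85, kDefaultThresholds) yields Confidence::High under the stated assumption, and responseSpeedsValid(500, 750) accepts the default ResponseSpeed and ComplexResponseSpeed pair while rejecting a pair such as (500, 500).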