Speech Recognition Properties SAPI 5.4
Speech Recognition Properties
Introduction
This document describes the ISpProperties attributes for SAPI 5-compliant SR engines. It defines these attributes for SR engines only. Developers building a SAPI 5-compliant engine should refer to this document. For more information, see the SAPI SDK help documents.
ISpProperties
ISpProperties is an interface that enables SR and TTS engines to get or set various attributes for an object. The attributes are passed to the engine through this interface, and each attribute is identified by a unique LONG value. SAPI defines certain attributes, known as system attributes, in the range 0x0001 to 0xffff. Vendor-defined attributes are distinguished by a unique high-word value (two ANSI characters that identify the engine vendor).
Attributes may be LONGs, strings, or memory addresses.
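The numbering scheme described above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: the helper names (IsSystemAttribute, MakeVendorAttribute, VendorTag) are hypothetical and not part of SAPI, and the order in which the two vendor characters are packed into the high word is assumed, since the text does not specify it.

```cpp
#include <cassert>
#include <cstdint>

// System attributes occupy the range 0x0001..0xffff (as stated in the text).
inline bool IsSystemAttribute(std::uint32_t id) {
    return id >= 0x0001u && id <= 0xffffu;
}

// Hypothetical helper: build a vendor attribute ID whose high word holds
// two ANSI characters identifying the vendor. Placing `first` in the most
// significant byte is an assumption made for this sketch.
inline std::uint32_t MakeVendorAttribute(char first, char second,
                                         std::uint16_t index) {
    const unsigned char a = static_cast<unsigned char>(first);
    const unsigned char b = static_cast<unsigned char>(second);
    // 7-bit characters keep the result positive when stored in a signed LONG.
    assert(a != 0 && a < 0x80 && b != 0 && b < 0x80);
    return (static_cast<std::uint32_t>(a) << 24) |
           (static_cast<std::uint32_t>(b) << 16) |
           static_cast<std::uint32_t>(index);
}

// Extract the vendor tag (the high word) from an attribute ID.
inline std::uint16_t VendorTag(std::uint32_t id) {
    return static_cast<std::uint16_t>(id >> 16);
}
```

Under that byte-order assumption, MakeVendorAttribute('A', 'B', 1) yields 0x41420001, which lies outside the system range because its high word is nonzero.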
SR Properties
The following table lists the SR properties that are set by the application and passed to the SR engine via SAPI. Supporting these attributes is not required for SAPI compliance. However, the value ranges given for the attributes are required, while the exact interpretation of the values is left to the SR engine. How implementations may differ is described under each property. The SAPI ranges and defaults for each property are also shown.
NOTE: The attributes are associated with a user profile and written in the registry by SAPI. SAPI detects the correct settings. The application should not write attribute changes to the registry.
dwAttrib Value | WCHAR Value | Meaning | Range |
SPPROP_RESOURCE_USAGE | ResourceUsage | ResourceUsage specifies the engine's CPU consumption. As the value increases, so does the CPU power required. | 0 - 100; default = 50 |
SPPROP_HIGH_CONFIDENCE_THRESHOLD, SPPROP_NORMAL_CONFIDENCE_THRESHOLD, SPPROP_LOW_CONFIDENCE_THRESHOLD | HighConfidenceThreshold, NormalConfidenceThreshold, LowConfidenceThreshold | The threshold values divide a confidence scale into four portions: rejected, low, medium, and high. The positions of the low, normal, and high confidence markers control how the confidence of a word is labeled. The HighConfidenceThreshold (HCT) separates the high and medium confidence ranges. The NormalConfidenceThreshold (NCT) separates the medium and low confidence ranges. The LowConfidenceThreshold (LCT) separates the low and rejected confidence ranges. If all three thresholds are equal to 0, all words have high confidence. If all three thresholds are equal to 100, all words have low confidence. | 0 - 100; defaults: LCT = 20, NCT = 50, HCT = 80 |
SPPROP_REJECTION_CONFIDENCE_THRESHOLD | CFGConfidenceRejectionThreshold | The speech recognition engine accepts full utterances whose phrase confidence score is greater than or equal to this threshold, and rejects full utterances whose phrase confidence score is below it. This property accepts the values listed in the Range column. It is not to be confused with SPPROP_HIGH_CONFIDENCE_THRESHOLD, SPPROP_NORMAL_CONFIDENCE_THRESHOLD, or SPPROP_LOW_CONFIDENCE_THRESHOLD, which determine how any given confidence value is categorized (low, medium, or high). | -1 or 0 - 100 |
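To make the two kinds of threshold concrete, the following C++ sketch labels a word score against LCT, NCT, and HCT, and separately applies the utterance-level rejection threshold. It is an illustration only, not how any particular engine works: the type and function names are hypothetical, scores are assumed to lie on the same 0 - 100 scale as the thresholds, and a score exactly equal to a band threshold is assumed to fall in the higher band. The meaning of the -1 rejection value is not given above, so the sketch does not model it.

```cpp
#include <cassert>

// The four portions of the confidence scale described in the table.
enum class ConfidenceLabel { Rejected, Low, Medium, High };

// Hypothetical holder for the three band thresholds (each 0..100).
struct ConfidenceThresholds {
    int low;     // LowConfidenceThreshold (LCT)
    int normal;  // NormalConfidenceThreshold (NCT)
    int high;    // HighConfidenceThreshold (HCT)
};

// SAPI defaults listed in the table: LCT = 20, NCT = 50, HCT = 80.
inline ConfidenceThresholds DefaultThresholds() {
    return {20, 50, 80};
}

// Label a word score. Assumption: score == threshold counts as the
// higher band; the exact boundary handling is left to the engine.
inline ConfidenceLabel LabelConfidence(int score,
                                       const ConfidenceThresholds& t) {
    assert(0 <= t.low && t.low <= t.normal &&
           t.normal <= t.high && t.high <= 100);
    if (score >= t.high)   return ConfidenceLabel::High;
    if (score >= t.normal) return ConfidenceLabel::Medium;
    if (score >= t.low)    return ConfidenceLabel::Low;
    return ConfidenceLabel::Rejected;
}

// Utterance-level check for CFGConfidenceRejectionThreshold: accept when
// the phrase confidence is greater than or equal to the threshold.
// Only the 0..100 range is modeled here.
inline bool AcceptsUtterance(int phraseConfidence, int rejectionThreshold) {
    assert(rejectionThreshold >= 0 && rejectionThreshold <= 100);
    return phraseConfidence >= rejectionThreshold;
}
```

With the default thresholds, a word score of 60 is labeled medium and a score of 10 falls in the rejected band. When all three thresholds are 100, this sketch places scores below 100 in the rejected band; an engine that never rejects individual words would report that band as low instead, which matches the all-100 case described in the table.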