Note
Please see Azure Cognitive Services for Speech documentation for the latest supported speech solutions.
Semantic Value Selection
The /SemanticPath option uses XPath syntax to identify the semantic value to compare in the emma:interpretation nodes of both the EMMATranscription and EMMAAudio sections of the EMMA document. If more than one transcription is present, the semantic values of each transcription are compared with the semantics of the recognition result, and a match with any one of the transcriptions is counted as a match. Only the top recognition result from the N-Best list is compared.
Simple Semantics
When simple semantics are used (out = "Coffee"), the semantic value is listed in the output EMMA document as a literal, as in the following example.
<emma:interpretation emma:tokens="coffee" id="utterance-1-reco-nbest-1"
emma:start="1251745149648" emma:end="1251745150878" emma:confidence="0.8866627"
ms:typespace="ECMA-262" emma:grammar-ref="utterance-1-grammar-0" emma:lang="en-us">
<emma:derived-from resource="#utterance-1-rule-tree-1" composite="false" />
<emma:literal ms:dataType="string" ms:valueType="string">Coffee</emma:literal>
</emma:interpretation>
In this case, the XPath path specified in the SemanticPath parameter is ".". When "." is specified, utterances that contain complex semantics in either the transcription or the interpretation elements are skipped.
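The simple-semantics comparison can be sketched in Python as follows. This is an illustration only, not the tool's implementation: the function names are hypothetical, the ms namespace URI is a placeholder because the source does not state it, and the sample is a trimmed copy of the example above.

```python
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"   # W3C EMMA 1.0 namespace
MS_NS = "urn:example:ms-placeholder"         # placeholder; the real URI is not given in the source

SAMPLE = f"""
<emma:interpretation xmlns:emma="{EMMA_NS}" xmlns:ms="{MS_NS}"
    emma:tokens="coffee" id="utterance-1-reco-nbest-1" emma:confidence="0.8866627">
  <emma:literal ms:dataType="string" ms:valueType="string">Coffee</emma:literal>
</emma:interpretation>
"""

def simple_semantic_value(interpretation):
    """Return the emma:literal text, or None when the semantics are complex."""
    literal = interpretation.find(f"{{{EMMA_NS}}}literal")
    if literal is None:
        return None
    return literal.text or ""

def simple_match(transcription, recognition):
    """Return True or False for a comparison, or None when the utterance is skipped."""
    expected = simple_semantic_value(transcription)
    actual = simple_semantic_value(recognition)
    if expected is None or actual is None:
        return None  # complex semantics on either side: skip the utterance
    return expected == actual

reco = ET.fromstring(SAMPLE)
print(simple_semantic_value(reco))  # Coffee
```

Returning None for a skipped utterance keeps it distinct from a mismatch, so skipped utterances can be excluded from accuracy totals.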
Complex Semantics
In the case of complex semantics, an XML document constitutes the semantic values. An XPath path rooted at the level of the emma:interpretation identifies the relevant semantic values to be compared. Utterances that do not contain the specified semantic value(s) in either the transcription or the interpretation are skipped.
If the XPath path is a leaf node, the semantic value is compared to the corresponding semantic value in the transcription. A match is recorded if the values are exactly the same. If the XPath path is not a leaf node, the entire subtree of semantic values is matched. Each node in the subtree must exactly match the corresponding node in the subtree in the transcription semantics. The names of the nodes must match, and the values of each of the leaf nodes in the subtree must also match.
For complex semantics, multiple XPath values can be specified. All specified semantics must match for the recognition to be counted as correct. In the following example, the recognition result contains two interpretations, each with origin and destination semantic values.
<emma:one-of id="r1" emma:start="1087995961542" emma:end="1087995963542">
<emma:interpretation id="int1" emma:confidence="0.75" emma:tokens="flights from boston to denver">
<origin emma:confidence="0.981347" ms:actualConfidence="1" ms:dataType="string" ms:valueType="string" >Boston</origin>
<destination emma:confidence="0.952872" ms:actualConfidence="1" ms:dataType="string" ms:valueType="string" >Denver</destination>
</emma:interpretation>
<emma:interpretation id="int2" emma:confidence="0.68" emma:tokens="flights from austin to denver">
<origin emma:confidence="0.728751" ms:actualConfidence="1" ms:dataType="string" ms:valueType="string" >Austin</origin>
<destination emma:confidence="0.947814" ms:actualConfidence="1" ms:dataType="string" ms:valueType="string" >Denver</destination>
</emma:interpretation>
</emma:one-of>
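The rules above (top N-Best result only, every specified path must match, missing values cause a skip) can be sketched in Python. The helper names are hypothetical, and the sketch makes three assumptions: the first emma:interpretation under emma:one-of is the top result, child nodes are compared in document order, and attributes are not compared here. Paths use the XPath subset supported by xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"  # W3C EMMA 1.0 namespace

RECO = f"""
<emma:one-of xmlns:emma="{EMMA_NS}" id="r1">
  <emma:interpretation id="int1" emma:confidence="0.75">
    <origin>Boston</origin><destination>Denver</destination>
  </emma:interpretation>
  <emma:interpretation id="int2" emma:confidence="0.68">
    <origin>Austin</origin><destination>Denver</destination>
  </emma:interpretation>
</emma:one-of>
"""

# Hypothetical transcription semantics for the same utterance.
TRANSCRIPTION = (
    f'<emma:interpretation xmlns:emma="{EMMA_NS}">'
    "<origin>Boston</origin><destination>Denver</destination>"
    "</emma:interpretation>"
)

def top_interpretation(one_of):
    """Return the first interpretation (assumed to be the top N-Best result)."""
    return one_of.find("emma:interpretation", {"emma": EMMA_NS})

def subtree_equal(a, b):
    """Compare node names recursively and leaf values exactly."""
    if a.tag != b.tag:
        return False
    kids_a, kids_b = list(a), list(b)
    if len(kids_a) != len(kids_b):
        return False
    if not kids_a:
        return (a.text or "") == (b.text or "")
    return all(subtree_equal(x, y) for x, y in zip(kids_a, kids_b))

def match_paths(transcription, recognition, paths):
    """Return True if every path matches, False on a mismatch, None if skipped."""
    results = []
    for path in paths:
        expected = transcription.find(path)
        actual = recognition.find(path)
        if expected is None or actual is None:
            return None  # value absent on either side: skip the utterance
        results.append(subtree_equal(expected, actual))
    return all(results)

top = top_interpretation(ET.fromstring(RECO))
print(match_paths(ET.fromstring(TRANSCRIPTION), top, ["origin", "destination"]))  # True
```

Because only int1 is examined, a transcription with origin Austin would be counted as a mismatch even though int2 contains it.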
If attributes have been added to the semantic results, the values of these attributes must also match. EMMA and Microsoft attributes, such as emma:confidence and ms:actualConfidence, are ignored.
The following is an example of semantic results with attributes of "method" and "ratio".
<martini method="shaken">
<gin ratio="8">Bombay Sapphire</gin>
<vermouth ratio="1">Noilly </vermouth>
</martini>
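One way to picture the attribute rule is the Python sketch below. It assumes that author-defined attributes such as method and ratio are unqualified, so every namespace-qualified attribute can be treated as EMMA or Microsoft metadata and ignored. That is a simplification, the ms namespace URI is a placeholder, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

EMMA_NS = "http://www.w3.org/2003/04/emma"   # W3C EMMA 1.0 namespace
MS_NS = "urn:example:ms-placeholder"         # placeholder; the real URI is not given in the source

EXPECTED = (
    '<martini method="shaken">'
    '<gin ratio="8">Bombay Sapphire</gin>'
    '<vermouth ratio="1">Noilly </vermouth>'
    "</martini>"
)

# Same semantics, with confidence metadata added by the recognizer.
ACTUAL = (
    f'<martini xmlns:emma="{EMMA_NS}" xmlns:ms="{MS_NS}" method="shaken" emma:confidence="0.9">'
    '<gin ratio="8" ms:actualConfidence="1">Bombay Sapphire</gin>'
    '<vermouth ratio="1">Noilly </vermouth>'
    "</martini>"
)

def semantic_attributes(element):
    """Keep unqualified attributes; ElementTree stores qualified ones as '{uri}name'."""
    return {k: v for k, v in element.attrib.items() if not k.startswith("{")}

def subtree_equal(a, b):
    """Compare names, semantic attributes, and leaf values recursively."""
    if a.tag != b.tag or semantic_attributes(a) != semantic_attributes(b):
        return False
    kids_a, kids_b = list(a), list(b)
    if len(kids_a) != len(kids_b):
        return False
    if not kids_a:
        return (a.text or "") == (b.text or "")
    return all(subtree_equal(x, y) for x, y in zip(kids_a, kids_b))

print(subtree_equal(ET.fromstring(EXPECTED), ET.fromstring(ACTUAL)))  # True
```

Changing method to "stirred" or a ratio to another value on either side makes the comparison fail, while differing confidence values do not.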
When arrays are present in the selected semantic results, the number of items in the array must match, as well as each individual element of the array.
The following is an example of semantic results with an array (topping).
<pizza>
<number>3</number>
<pizzasize>large</pizzasize>
<topping length="2">
<item index="0">pepperoni</item>
<item index="1">mushrooms</item>
</topping>
</pizza>
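The array rule can be sketched in Python as follows. The helper names are hypothetical, and the sketch assumes that array elements are item children ordered by their index attribute, as in the example above.

```python
import xml.etree.ElementTree as ET

PIZZA = """
<pizza>
  <number>3</number>
  <pizzasize>large</pizzasize>
  <topping length="2">
    <item index="0">pepperoni</item>
    <item index="1">mushrooms</item>
  </topping>
</pizza>
"""

def array_items(array_element):
    """Return the item values ordered by their index attribute."""
    items = array_element.findall("item")
    ordered = sorted(items, key=lambda item: int(item.get("index")))
    return [item.text or "" for item in ordered]

def array_equal(a, b):
    """Arrays match when the item counts and every element are the same."""
    items_a, items_b = array_items(a), array_items(b)
    # The length check is implied by list equality; it is spelled out to mirror the rule.
    return len(items_a) == len(items_b) and items_a == items_b

expected = ET.fromstring(PIZZA).find("topping")
actual = ET.fromstring(PIZZA).find("topping")
print(array_equal(expected, actual))  # True
```

Under this sketch, a result with only pepperoni, or with the two toppings in the opposite order, does not match.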