Note

Please see Azure Cognitive Services for Speech documentation for the latest supported speech solutions.

Microsoft Speech Platform

American English Phoneme Representation

The information in this topic is provided for reference only. We recommend that you do not use the SAPI Phone Set to specify pronunciations for any language in your applications. Instead, use the Universal Phone Set (UPS) for ISpGrammarBuilder grammars; the Speech Platform automatically converts to the SAPI Phone Set for legacy languages, as required. See Legacy SAPI Phone Sets for more information.

Symbolic and Numerical Representation

To create pronunciations for English words that are not currently in the lexicon, application developers can use the English phones from the SAPI Phone Set, which are listed in the table in this topic. Each phone describes a unique speech sound. Each phone has a label (the letters or other characters that make up the phone) and a numerical identifier (the PhoneID that corresponds to the label).
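The relationship between labels and PhoneIDs can be sketched as a simple lookup. The following Python is a minimal illustration only: the dictionary values are copied from the American English phoneme table in this topic, while the names SAPI_EN_US_PHONES and phone_ids are hypothetical helpers and not part of any Speech Platform API.

```python
# Phone labels and PhoneIDs copied from the American English phoneme table.
# The dictionary name and the helper below are illustrative assumptions.
SAPI_EN_US_PHONES = {
    "-": 1, "!": 2, "&": 3, ",": 4, ".": 5, "?": 6, "_": 7,
    "1": 8, "2": 9,
    "aa": 10, "ae": 11, "ah": 12, "ao": 13, "aw": 14, "ax": 15, "ay": 16,
    "b": 17, "ch": 18, "d": 19, "dh": 20, "eh": 21, "er": 22, "ey": 23,
    "f": 24, "g": 25, "h": 26, "ih": 27, "iy": 28, "jh": 29, "k": 30,
    "l": 31, "m": 32, "n": 33, "ng": 34, "ow": 35, "oy": 36, "p": 37,
    "r": 38, "s": 39, "sh": 40, "t": 41, "th": 42, "uh": 43, "uw": 44,
    "v": 45, "w": 46, "y": 47, "z": 48, "zh": 49,
}


def phone_ids(pronunciation):
    """Convert a space-separated string of phone labels to a list of PhoneIDs.

    Raises ValueError if a label does not appear in the table.
    """
    ids = []
    for label in pronunciation.split():
        if label not in SAPI_EN_US_PHONES:
            raise ValueError(f"unknown phone label: {label!r}")
        ids.append(SAPI_EN_US_PHONES[label])
    return ids


print(phone_ids("h eh l ow"))  # [26, 21, 31, 35]
```

A lookup like this is mainly useful for catching typos in a pronunciation string before it is placed in a grammar, prompt, or lexicon.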

You can use the SAPI Phone Set to specify custom pronunciations in grammars (to customize speech recognition), in TTS prompts (to customize synthesized speech output), and in lexicons, which can be used for either speech recognition or TTS.

The following is an example entry in an XML-format grammar that conforms to the Speech Recognition Grammar Specification (SRGS) Version 1.0 and specifies a pronunciation for the word "hello":

  
<token sapi:pron="h eh l ow"> hello </token>
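If grammar entries are generated programmatically, a small helper can assemble a well-formed token element for a given word and phone sequence. The Python sketch below is only an illustration: srgs_token is a hypothetical function name, and it assumes the enclosing grammar document already declares the sapi namespace prefix.

```python
from xml.sax.saxutils import escape, quoteattr


def srgs_token(word, phones):
    """Return an SRGS <token> element carrying a sapi:pron attribute.

    Hypothetical helper. `phones` is a sequence of SAPI phone labels.
    The `sapi` prefix is assumed to be declared on the grammar root.
    """
    pron = " ".join(phones)
    # quoteattr adds the surrounding quotes and escapes the attribute value;
    # escape protects characters such as & and < in the word itself.
    return f"<token sapi:pron={quoteattr(pron)}> {escape(word)} </token>"


print(srgs_token("hello", ["h", "eh", "l", "ow"]))
# <token sapi:pron="h eh l ow"> hello </token>
```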

For improved accuracy, you can add primary (1) and secondary (2) stress markers and syllabic markers (-) to the pronunciation.

The following is an example entry in an XML-format TTS prompt that conforms to the Speech Synthesis Markup Language (SSML) Version 1.0 and specifies a pronunciation for the word "hello" using the primary stress (1) and syllabic (-) markers:

  
<phoneme alphabet="x-microsoft-sapi" ph="h eh - l ow 1"> hello </phoneme>
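A pronunciation with syllabic and stress markers can also be assembled from per-syllable phone lists. The Python sketch below is a hedged illustration: ssml_phoneme is a hypothetical helper, and it assumes that a stress marker is appended at the end of the stressed syllable, which matches the "hello" example but is not stated as a general rule in this topic.

```python
from xml.sax.saxutils import escape, quoteattr


def ssml_phoneme(word, syllables, primary=None, secondary=None):
    """Return an SSML <phoneme> element using the x-microsoft-sapi alphabet.

    Hypothetical helper. `syllables` is a list of lists of phone labels.
    `primary` and `secondary` are zero-based syllable indexes (or None).
    Assumption: the stress marker follows the last phone of the syllable.
    """
    parts = []
    for index, syllable in enumerate(syllables):
        phones = list(syllable)
        if index == primary:
            phones.append("1")  # primary stress marker
        elif index == secondary:
            phones.append("2")  # secondary stress marker
        parts.append(" ".join(phones))
    ph = " - ".join(parts)  # "-" marks each syllable boundary
    return (
        f'<phoneme alphabet="x-microsoft-sapi" ph={quoteattr(ph)}>'
        f" {escape(word)} </phoneme>"
    )


print(ssml_phoneme("hello", [["h", "eh"], ["l", "ow"]], primary=1))
# <phoneme alphabet="x-microsoft-sapi" ph="h eh - l ow 1"> hello </phoneme>
```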

American English Phoneme Table

Phone Label   Example                                 PhoneID
-             Syllable boundary (hyphen)              1
!             Sentence terminator (exclamation mark)  2
&             Word boundary                           3
,             Sentence terminator (comma)             4
.             Sentence terminator (period)            5
?             Sentence terminator (question mark)     6
_             Silence (underscore)                    7
1             Primary stress                          8
2             Secondary stress                        9
aa            father                                  10
ae            cat                                     11
ah            cut                                     12
ao            dog                                     13
aw            foul                                    14
ax            ago                                     15
ay            bite                                    16
b             big                                     17
ch            chin                                    18
d             dig                                     19
dh            then                                    20
eh            pet                                     21
er            fur                                     22
ey            ate                                     23
f             fork                                    24
g             gut                                     25
h             help                                    26
ih            fill                                    27
iy            feel                                    28
jh            joy                                     29
k             cut                                     30
l             lid                                     31
m             mat                                     32
n             no                                      33
ng            sing                                    34
ow            go                                      35
oy            toy                                     36
p             put                                     37
r             red                                     38
s             sit                                     39
sh            she                                     40
t             talk                                    41
th            thin                                    42
uh            book                                    43
uw            too                                     44
v             vat                                     45
w             with                                    46
y             yard                                    47
z             zap                                     48
zh            pleasure                                49

Please see Legacy SAPI Phone Sets for information on other phone sets.