speech Package
Microsoft Speech SDK for Python
Modules
audio |
Classes that handle audio input to the various recognizers and audio output from the speech synthesizer. |
dialog |
Classes related to the dialog service connector. |
enums | |
intent |
Classes related to intent recognition from speech. |
interop | |
languageconfig |
Classes that handle language configurations. |
properties | |
speech |
Classes related to recognizing text from speech, synthesizing speech from text, and general classes used in the various recognizers. |
transcription |
Classes related to conversation transcription. |
translation |
Classes related to translation of speech to other languages. |
version |
Classes
AudioDataStream |
Represents an audio data stream, used for operating on audio data as a stream. Generates an audio data stream from a speech synthesis result (type SpeechSynthesisResult) or a keyword recognition result (type KeywordRecognitionResult). |
AutoDetectSourceLanguageResult |
Represents the result of automatic source language detection. The result can be initialized from a speech recognition result. |
CancellationDetails | |
Connection |
Proxy class for managing the connection to the speech service of the specified Recognizer. By default, a Recognizer autonomously manages its connection to the service when needed. The Connection class provides additional methods for users to explicitly open or close a connection and to subscribe to connection status changes. The use of Connection is optional. It is intended for scenarios that need fine-tuning of application behavior based on connection status. Users can optionally call open to manually initiate a service connection before starting recognition on the Recognizer associated with this Connection. After starting a recognition, calling open or close might fail. This will not impact the Recognizer or the ongoing recognition. The connection might drop for various reasons; the Recognizer will always try to re-establish the connection as required to guarantee ongoing operations. In all these cases, connected/disconnected events will indicate the change of the connection status. Note: Updated in version 1.17.0. Constructor for internal use. |
ConnectionEventArgs |
Provides data for the ConnectionEvent. Note: Added in version 1.2.0. Constructor for internal use. |
EventSignal |
Clients can connect to the event signal to receive events, or disconnect from the event signal to stop receiving events. Constructor for internal use. |
KeywordRecognitionEventArgs |
Class for keyword recognition event arguments. Constructor for internal use. |
KeywordRecognitionModel |
Represents a keyword recognition model. |
KeywordRecognitionResult |
Result of a keyword recognition operation. Constructor for internal use. |
KeywordRecognizer |
A keyword recognizer. |
NoMatchDetails | |
PhraseListGrammar |
Class that allows runtime addition of phrase hints to aid in speech recognition. Phrases added to the recognizer take effect at the start of the next recognition, or the next time the speech recognizer must reconnect to the speech service. Note: Added in version 1.5.0. Constructor for internal use. |
PronunciationAssessmentConfig |
Represents pronunciation assessment configuration. Note: Added in version 1.14.0. The configuration can be initialized in two ways:
For parameter details, see https://docs.microsoft.com/azure/cognitive-services/speech-service/rest-speech-to-text#pronunciation-assessment-parameters |
PronunciationAssessmentPhonemeResult |
Contains the phoneme-level pronunciation assessment result. Note: Added in version 1.14.0. |
PronunciationAssessmentResult |
Represents a pronunciation assessment result. Note: Added in version 1.14.0. The result can be initialized from a speech recognition result. |
PronunciationAssessmentWordResult |
Contains the word-level pronunciation assessment result. Note: Added in version 1.14.0. |
PropertyCollection |
Class to retrieve or set a property value from a property collection. |
RecognitionEventArgs |
Provides data for the RecognitionEvent. Constructor for internal use. |
RecognitionResult |
Detailed information about the result of a recognition operation. Constructor for internal use. |
Recognizer |
Base class for the different recognizers. |
ResultFuture |
The result of an asynchronous operation. Private constructor. |
SessionEventArgs |
Base class for session event arguments. Constructor for internal use. |
SourceLanguageRecognizer |
A source language recognizer: a standalone language recognizer that can be used for single language or continuous language detection. Note: Added in version 1.18.0. |
SpeechConfig |
Class that defines configurations for speech / intent recognition and speech synthesis. The configuration can be initialized in different ways:
|
SpeechRecognitionCanceledEventArgs |
Class for speech recognition canceled event arguments. Constructor for internal use. |
SpeechRecognitionEventArgs |
Class for speech recognition event arguments. Constructor for internal use. |
SpeechRecognitionResult |
Base class for speech recognition results. Constructor for internal use. |
SpeechRecognizer |
A speech recognizer. If you need to specify source language information, specify only one of these three parameters: language, source_language_config, or auto_detect_source_language_config. |
SpeechSynthesisBookmarkEventArgs |
Class for speech synthesis bookmark event arguments. Note: Added in version 1.16.0. Constructor for internal use. |
SpeechSynthesisCancellationDetails |
Contains detailed information about why a result was canceled. |
SpeechSynthesisEventArgs |
Class for speech synthesis event arguments. Constructor for internal use. |
SpeechSynthesisResult |
Result of a speech synthesis operation. Constructor for internal use. |
SpeechSynthesisVisemeEventArgs |
Class for speech synthesis viseme event arguments. Note: Added in version 1.16.0. Constructor for internal use. |
SpeechSynthesisWordBoundaryEventArgs |
Class for speech synthesis word boundary event arguments. Note: Updated in version 1.21.0. Constructor for internal use. |
SpeechSynthesizer |
A speech synthesizer. |
SyllableLevelTimingResult |
Contains the syllable-level timing result. Note: Added in version 1.20.0. |
SynthesisVoicesResult |
Contains detailed information about the retrieved synthesis voices list. Note: Added in version 1.16.0. Constructor for internal use. |
VoiceInfo |
Contains detailed information about a synthesis voice. Note: Updated in version 1.17.0. Constructor for internal use. |
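The SpeechRecognizer entry above asks callers to pass only one of language, source_language_config, or auto_detect_source_language_config. The following is a minimal sketch of how an application might enforce that rule before constructing a recognizer. The helper names source_language_kwargs and make_recognizer are hypothetical and not part of the SDK, and the sketch assumes the package is imported as azure.cognitiveservices.speech and that audio comes from the default microphone.

```python
def source_language_kwargs(language=None,
                           source_language_config=None,
                           auto_detect_source_language_config=None):
    """Return constructor kwargs holding at most one source-language option.

    Hypothetical helper (not part of the SDK): it only checks that the caller
    did not combine several of the mutually exclusive parameters.
    """
    options = {
        "language": language,
        "source_language_config": source_language_config,
        "auto_detect_source_language_config": auto_detect_source_language_config,
    }
    chosen = {name: value for name, value in options.items() if value is not None}
    if len(chosen) > 1:
        raise ValueError("specify only one of: " + ", ".join(sorted(chosen)))
    return chosen


def make_recognizer(subscription_key, region, **language_options):
    """Build a SpeechRecognizer that listens on the default microphone.

    Hypothetical wrapper; subscription_key and region are placeholders that
    the caller must supply.
    """
    # Imported here so the helper above stays usable without the SDK installed.
    import azure.cognitiveservices.speech as speechsdk

    speech_config = speechsdk.SpeechConfig(subscription=subscription_key, region=region)
    audio_config = speechsdk.audio.AudioConfig(use_default_microphone=True)
    return speechsdk.SpeechRecognizer(
        speech_config=speech_config,
        audio_config=audio_config,
        **source_language_kwargs(**language_options),
    )
```

With the SDK installed, a call such as make_recognizer(key, region, language="de-DE") would pass a single source-language option through, while combining two of the options raises ValueError before any SDK object is created.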
Enums
AudioStreamContainerFormat |
Defines the supported audio stream container formats. |
AudioStreamWaveFormat |
Represents the format specified inside the WAV container. |
CancellationErrorCode |
Defines the error codes used when CancellationReason is Error. |
CancellationReason |
Defines the possible reasons a recognition result might be canceled. |
NoMatchReason |
Defines the possible reasons a recognition result might not be recognized. |
OutputFormat |
Output format. |
ProfanityOption |
Removes profanity (swearing), or replaces letters of profane words with stars. |
PronunciationAssessmentGradingSystem |
Defines the point system for pronunciation score calibration; default value is FivePoint. |
PronunciationAssessmentGranularity |
Defines the pronunciation evaluation granularity; default value is Phoneme. |
PropertyId |
Defines speech property ids. |
ResultReason |
Specifies the possible reasons a recognition result might be generated. |
ServicePropertyChannel |
Defines channels used to pass property settings to service. |
SpeechSynthesisOutputFormat |
Defines the possible speech synthesis output audio formats. |
StreamStatus |
Defines the possible statuses of an audio data stream. |
SynthesisVoiceGender |
Defines the gender of synthesis voices. |
SynthesisVoiceType |
Defines the type of synthesis voices. |
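To make the ProfanityOption description above more concrete, here is a toy, local illustration of what masking, removing, and leaving profanity untouched could look like on plain text. It is only a sketch: the real filtering happens in the speech service, the word list PROFANE_WORDS and the function apply_profanity_option are hypothetical, and the option names "Masked", "Removed", and "Raw" are used here as plain strings that mirror the enum's member names.

```python
import re

# Hypothetical word list for illustration only; the service uses its own.
PROFANE_WORDS = {"darn", "heck"}

# Matches any listed word as a whole word, ignoring case.
_PROFANITY = re.compile(
    r"\b(" + "|".join(re.escape(word) for word in sorted(PROFANE_WORDS)) + r")\b",
    re.IGNORECASE,
)


def apply_profanity_option(text, option):
    """Mimic the three profanity behaviours on a plain string (toy model)."""
    if option == "Raw":
        # Leave the text untouched.
        return text
    if option == "Masked":
        # Replace every letter of a listed word with a star.
        return _PROFANITY.sub(lambda match: "*" * len(match.group(0)), text)
    if option == "Removed":
        # Drop listed words, then collapse the whitespace left behind.
        stripped = _PROFANITY.sub("", text)
        return re.sub(r"\s{2,}", " ", stripped).strip()
    raise ValueError(f"unknown option: {option!r}")


print(apply_profanity_option("well darn it", "Masked"))   # well **** it
print(apply_profanity_option("well darn it", "Removed"))  # well it
```

In application code, the choice is made on the configuration object rather than applied locally; for example, SpeechConfig exposes set_profanity, which takes a ProfanityOption value.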