Note

Please see Azure Cognitive Services for Speech documentation for the latest supported speech solutions.

Get Started with Speech Recognition (Microsoft.Speech)

This topic provides an overview and examples for implementing speech recognition in a Windows Forms application. See the other topics in this section for more information and examples.

A speech recognition application will typically perform the following basic operations:

  1. Initialize the speech recognizer.

  2. Set the input for speech recognition.

  3. Create a speech recognition grammar.

  4. Load the grammar into the speech recognizer.

  5. Register for speech recognition event notification.

  6. Create a handler for the speech recognition event.

  7. Start recognition.

The following sections describe how to program each of these operations. See the end of this topic for a complete example.

Initialize the Speech Recognition Engine

To initialize a speech recognizer, create a new SpeechRecognitionEngine instance. The following example uses a constructor whose parameter specifies the culture of the recognizer to use, in this case US English (en-US). If you use the parameterless constructor instead, the SpeechRecognitionEngine instance uses the default recognizer on the system.

// Create a new SpeechRecognitionEngine instance.
SpeechRecognitionEngine sre = new SpeechRecognitionEngine(new System.Globalization.CultureInfo("en-US"));

Note

A recognizer is an installed Runtime Language for speech recognition. A Runtime Language includes the language model, acoustic model, and other data necessary to provision a speech recognition engine to perform speech recognition in a particular language. See InstalledRecognizers() for more information.
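To check which recognizers are available before choosing a culture, an application can enumerate them. The following is a minimal sketch rather than part of the original sample; it assumes a console application that references Microsoft.Speech, and the class name ListRecognizers is hypothetical.

```csharp
using System;
using Microsoft.Speech.Recognition;

// Hypothetical helper program: lists the installed Runtime Languages.
class ListRecognizers
{
    static void Main()
    {
        // InstalledRecognizers() returns one RecognizerInfo per installed recognizer.
        foreach (RecognizerInfo info in SpeechRecognitionEngine.InstalledRecognizers())
        {
            // Print the recognizer's display name and its culture, for example "en-US".
            Console.WriteLine("{0} ({1})", info.Name, info.Culture.Name);
        }
    }
}
```

If the culture you plan to pass to the SpeechRecognitionEngine constructor does not appear in this list, the corresponding Runtime Language probably still needs to be installed.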

Set the Input for Speech Recognition

Using the SpeechRecognitionEngine methods whose names begin with "SetInputTo", you can configure the speech recognizer to accept input from the default audio device, a WAV file, an audio stream, or a WAV stream. You can also set the input to null. The following example configures the recognizer to receive speech input from a WAV file.

// Configure the input to the recognizer.
sre.SetInputToWaveFile(@"c:\Test\Colors.wav");
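For live speech, the input can instead come from the default audio device. The following sketch assumes a console application that references Microsoft.Speech, an installed en-US Runtime Language, and a working microphone; the class name InputFromMicrophone is hypothetical.

```csharp
using System;
using System.Globalization;
using Microsoft.Speech.Recognition;

// Hypothetical example: route the default audio device to the recognizer.
class InputFromMicrophone
{
    static void Main()
    {
        // SpeechRecognitionEngine is disposable, so release it when done.
        using (SpeechRecognitionEngine sre =
            new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            // Use the system's default audio input instead of a WAV file.
            sre.SetInputToDefaultAudioDevice();
            Console.WriteLine("Input set to the default audio device.");

            // To detach the input again, set it to null.
            sre.SetInputToNull();
        }
    }
}
```

A WAV file is convenient for repeatable tests, while the default audio device is the usual choice for interactive applications.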

Create a Speech Recognition Grammar

One way to create a speech recognition grammar is to use the constructors and methods of the GrammarBuilder and Choices classes. The following example creates a simple grammar that recognizes the words "red", "green", or "blue". The words are added using a Choices object. For a match between user speech and the grammar to occur, the user must speak exactly one of the elements added to the Choices instance. The example passes the words as a string array to the Add(String[]) method.

After the Choices instance is created and set with the option strings, the example creates a GrammarBuilder instance. Using the Append(Choices) method, the example appends the colors object to the GrammarBuilder instance. In the last line, the example creates a Grammar object and initializes it with the GrammarBuilder instance.

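// Create the set of alternatives that the user can speak.
Choices colors = new Choices();
colors.Add(new string[] {"red", "green", "blue"});

// Create a GrammarBuilder object and append the Choices object.
GrammarBuilder gb = new GrammarBuilder();
gb.Append(colors);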

// Create the Grammar instance.
Grammar g = new Grammar(gb);

For more information about creating grammars, see Create Grammars (Microsoft.Speech).
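A GrammarBuilder can also combine fixed text with a Choices object, so that the user speaks a whole phrase rather than a single word. The following sketch is an illustration, not part of the original sample: the phrase "set color to", the grammar name "colorCommand", and the class and method names are assumptions made for the example.

```csharp
using System;
using System.Globalization;
using Microsoft.Speech.Recognition;

// Hypothetical example: a grammar for phrases such as "set color to red".
class PhraseGrammar
{
    static Grammar BuildColorCommandGrammar()
    {
        // The alternatives that may follow the fixed phrase.
        Choices colors = new Choices(new string[] { "red", "green", "blue" });

        GrammarBuilder gb = new GrammarBuilder();

        // Assumption: the recognizer that loads this grammar is an en-US recognizer.
        gb.Culture = new CultureInfo("en-US");

        // Fixed text first, then exactly one of the color alternatives.
        gb.Append("set color to");
        gb.Append(colors);

        Grammar g = new Grammar(gb);
        g.Name = "colorCommand"; // illustrative name, useful when several grammars are loaded
        return g;
    }

    static void Main()
    {
        Grammar g = BuildColorCommandGrammar();
        Console.WriteLine("Built grammar: " + g.Name);
    }
}
```

Setting the Culture property explicitly is a defensive choice here. A grammar whose culture differs from that of the recognizer is likely to be rejected when it is loaded, so it helps to keep the two in agreement.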

Load the Grammar into a Speech Recognizer

After the grammar is created, it must be loaded by the SpeechRecognitionEngine instance. The following example loads the grammar by calling the LoadGrammar(Grammar) method, passing the grammar created in the previous operation.

sre.LoadGrammar(g);

Register for Speech Recognition Event Notification

The SpeechRecognitionEngine object raises a number of events during its operation. One of these is the SpeechRecognized event, which the engine raises when it matches a user utterance with a grammar. An application registers for notification of this event by adding an EventHandler<SpeechRecognizedEventArgs> delegate, as shown in the following example. The argument to the delegate constructor, sre_SpeechRecognized, is the name of the developer-written event handler. For more information, see Use Speech Recognition Events (Microsoft.Speech).

sre.SpeechRecognized += new EventHandler<SpeechRecognizedEventArgs>(sre_SpeechRecognized);

Create an Event Handler for Speech Recognition

When you register a handler for a particular event, the IntelliSense feature in Microsoft Visual Studio creates a skeleton event handler if you press the TAB key. This ensures that the handler's parameters have the correct types. The handler for the SpeechRecognized event shown in the following example displays the text of the recognized word, which it reads from the Result property of the SpeechRecognizedEventArgs parameter, e.

void sre_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
{
  MessageBox.Show("Speech recognized: " + e.Result.Text);
}
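A handler can also inspect how confident the recognizer is in a result, and a second handler can report speech that did not match the grammar. The following sketch assumes a project that references Microsoft.Speech; the class name RecognitionHandlers, the Attach method, and the 0.5 threshold are hypothetical choices for illustration, not recommended values.

```csharp
using System;
using Microsoft.Speech.Recognition;

// Hypothetical helper: wires up handlers for accepted and rejected speech.
static class RecognitionHandlers
{
    // Assumed threshold; tune it for your own audio and grammar.
    const float MinimumConfidence = 0.5f;

    public static void Attach(SpeechRecognitionEngine sre)
    {
        sre.SpeechRecognized += OnSpeechRecognized;
        sre.SpeechRecognitionRejected += OnSpeechRejected;
    }

    static void OnSpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
        // Confidence is a relative score assigned by the recognizer.
        if (e.Result.Confidence >= MinimumConfidence)
        {
            Console.WriteLine("Speech recognized: " + e.Result.Text);
        }
        else
        {
            Console.WriteLine("Low-confidence result ignored: " + e.Result.Text);
        }
    }

    static void OnSpeechRejected(object sender, SpeechRecognitionRejectedEventArgs e)
    {
        // Raised when the input does not match any loaded grammar well enough.
        Console.WriteLine("Speech was detected but not recognized.");
    }
}
```

Calling RecognitionHandlers.Attach(sre) before starting recognition registers both handlers on an existing engine.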

Start Speech Recognition

Now that you have initialized the recognizer, set the input, loaded a grammar, and registered a handler for the SpeechRecognized event, you are ready to start speech recognition. You can start speech recognition either as a synchronous operation using the Recognize() method, or as an asynchronous operation using one of the RecognizeAsync() methods. The following example starts a single synchronous recognition operation.

// Start recognition.
sre.Recognize();
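For comparison, asynchronous recognition returns control to the caller immediately and reports results through events. The following is a minimal sketch under stated assumptions: a console application that references Microsoft.Speech, an installed en-US Runtime Language, and a working microphone. The class name AsyncRecognition is hypothetical.

```csharp
using System;
using System.Globalization;
using Microsoft.Speech.Recognition;

// Hypothetical example: continuous asynchronous recognition from the microphone.
class AsyncRecognition
{
    static void Main()
    {
        using (SpeechRecognitionEngine sre =
            new SpeechRecognitionEngine(new CultureInfo("en-US")))
        {
            sre.SetInputToDefaultAudioDevice();

            // Build the same three-word grammar used elsewhere in this topic.
            GrammarBuilder gb = new GrammarBuilder(new Choices("red", "green", "blue"));
            gb.Culture = sre.RecognizerInfo.Culture; // keep grammar and recognizer cultures aligned
            sre.LoadGrammar(new Grammar(gb));

            sre.SpeechRecognized += (sender, e) =>
                Console.WriteLine("Speech recognized: " + e.Result.Text);

            // RecognizeMode.Multiple keeps recognizing until cancelled or stopped.
            sre.RecognizeAsync(RecognizeMode.Multiple);

            Console.WriteLine("Say red, green, or blue. Press ENTER to quit.");
            Console.ReadLine();

            // Cancel any recognition that is still in progress before disposing the engine.
            sre.RecognizeAsyncCancel();
        }
    }
}
```

In a Windows Forms application, the asynchronous form has the advantage that the user interface thread is not blocked while the engine waits for speech.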

Speech Recognition Example

The following examples are components of a Windows Forms (System.Windows.Forms) application that features speech recognition. Although the application displays a form when it starts, nothing of interest happens with the form. The first example contains the code for the handler for the form’s Load event, which is raised when the form is loaded.

Note

To use the Microsoft.Speech types in this example, you must first add a reference in your project to Microsoft.Speech. Also, make sure that you have completed all the installation steps described in Microsoft Speech Platform SDK 11 Requirements and Installation.

Almost everything of interest in this application occurs in the Form1_Load method. The method creates a SpeechRecognitionEngine instance that uses the default recognizer on the system and sets its input to a WAV file. It then builds a grammar incrementally: it uses a Choices instance to add the strings "red", "green", and "blue", creates a GrammarBuilder instance and appends the Choices object to it, and initializes a Grammar instance with the GrammarBuilder object. The grammar, which is capable of recognizing the words "red", "green", or "blue", is then loaded into the speech recognizer. Finally, the Form1_Load method registers an event handler for the SpeechRecognized event and starts recognition.

The sre_SpeechRecognized method, which executes when the SpeechRecognitionEngine instance raises the SpeechRecognized event, displays whichever of the three colors the user spoke. The following illustration shows the interaction between the user’s speech and the SpeechRecognitionEngine with its grammar. When the utterance matches an element in the grammar, the speech recognizer makes a recognition and produces a recognition result.

[Illustration: interaction between the user’s speech and the SpeechRecognitionEngine with its grammar]

using System;
using System.Collections.Generic;
using System.ComponentModel;
using System.Data;
using System.Drawing;
using System.Linq;
using System.Text;
using System.Windows.Forms;
using Microsoft.Speech.Recognition;

namespace WindowsFormsApplication1
{
  public partial class Form1 : Form
  {
    public Form1()
    {
      InitializeComponent();
    }

    private void Form1_Load(object sender, EventArgs e)
    {

      // Create a new SpeechRecognitionEngine instance.
      SpeechRecognitionEngine sre = new SpeechRecognitionEngine();

      // Configure the input to the recognizer.
      sre.SetInputToWaveFile(@"c:\Test\Colors.wav");

      // Create a simple grammar that recognizes "red", "green", or "blue".
      Choices colors = new Choices();
      colors.Add(new string[] {"red", "green", "blue"});

      // Create a GrammarBuilder object and append the Choices object.
      GrammarBuilder gb = new GrammarBuilder();
      gb.Append(colors);

      // Create the Grammar instance and load it into the speech recognition engine.
      Grammar g = new Grammar(gb);
      sre.LoadGrammar(g);

      // Register a handler for the SpeechRecognized event.
      sre.SpeechRecognized +=
        new EventHandler<SpeechRecognizedEventArgs>(sre_SpeechRecognized);

      // Start recognition.
      sre.Recognize();
    }

    // Create a simple handler for the SpeechRecognized event.
    void sre_SpeechRecognized(object sender, SpeechRecognizedEventArgs e)
    {
      MessageBox.Show("Speech recognized: " + e.Result.Text);
    }
  }
}

The following example is auto-generated code for a Windows Forms application.

using System;
using System.Collections.Generic;
using System.Linq;
using System.Windows.Forms;

namespace WindowsFormsApplication1
{
  static class Program
  {

    // The main entry point for the application.
    [STAThread]
    static void Main()
    {
      Application.EnableVisualStyles();
      Application.SetCompatibleTextRenderingDefault(false);
      Application.Run(new Form1());
    }
  }
}

See Also

Concepts

Audio Input for Recognition (Microsoft.Speech)

Use Speech Recognition Events (Microsoft.Speech)

Create and Access Semantic Content (Microsoft.Speech)

Emulate Spoken Commands (Microsoft.Speech)