Tuning Alignments and Extractions

This topic describes using Wave Editor to tune alignments and extractions.

Wave Editor graphically displays the word boundaries, called alignments, that the speech recognition engine defines. The engine creates alignments when Speech Prompt Editor records a transcription or imports a .wav file for a transcription. Tuning alignments is the process of adjusting these word boundaries in Wave Editor.

Tuning alignments addresses two issues:

  • Whether word boundaries are correct. Each word must include all of its own phonemes and none of the phonemes of adjacent words.
  • Whether the appropriate amount of silence separates words and phrases.
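
The two issues above can be illustrated outside Wave Editor with a minimal sketch. It assumes alignments are available as (label, start, end) records in seconds; the WordAlignment type, the function names, and the sample times are hypothetical and are not part of Wave Editor or Speech Prompt Editor. An overlap between adjacent boundaries is only a rough proxy for one word containing another word's phonemes, so listening to each word remains the real test.

```python
from typing import List, NamedTuple, Tuple


class WordAlignment(NamedTuple):
    """Hypothetical record for one aligned word (times in seconds)."""
    label: str
    start: float
    end: float


def find_overlaps(words: List[WordAlignment]) -> List[Tuple[str, str]]:
    """Return pairs of adjacent words whose boundaries overlap in time."""
    ordered = sorted(words, key=lambda w: w.start)
    return [(a.label, b.label)
            for a, b in zip(ordered, ordered[1:])
            if b.start < a.end]


def silence_gaps(words: List[WordAlignment]) -> List[Tuple[str, str, float]]:
    """Return the silence, in seconds, between each pair of adjacent words."""
    ordered = sorted(words, key=lambda w: w.start)
    return [(a.label, b.label, round(max(0.0, b.start - a.end), 3))
            for a, b in zip(ordered, ordered[1:])]


# Made-up alignment of a quickly spoken "one two three".
words = [
    WordAlignment("one", 0.10, 0.45),
    WordAlignment("two", 0.40, 0.80),    # starts before "one" ends
    WordAlignment("three", 0.95, 1.30),
]

print(find_overlaps(words))   # adjacent pairs that need a boundary adjustment
print(silence_gaps(words))    # silence available between adjacent words
```

A pair reported by find_overlaps is a candidate for the boundary adjustment procedure, and a gap of zero suggests that the words may need additional silence.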

To open Wave Editor

  1. Open a prompt database containing scripts with alignments.
  2. In the Transcription pane, double-click either the Has Wave icon or the Has Alignments icon to display a .wav file.
  3. To return to the Transcription pane, do one of the following:
    • Close Wave Editor.
    • Click the appropriate tab at the top of the Design window.
    • Double-click the prompt database in Solution Explorer.

Evaluating and Adjusting Word Boundaries

The speech recognition engine places word boundaries when it creates alignments. The default boundaries are usually accurate, but manually adjusting them sometimes improves the quality of a recorded prompt. To ensure good default alignments, speak fluently and naturally when recording. Doing so helps the speech recognition engine and makes the extractions sound more natural. If a phrase such as "one two three" is spoken too quickly during prompt recording, the alignments might incorrectly assign part of one word to another.

To listen to a word boundary

  1. Open Wave Editor, displaying the .wav file.
  2. In the word track at the bottom of Wave Editor, click the arrow attached to a word label. Wave Editor plays that particular word.

To adjust a word boundary

  1. Open Wave Editor, displaying the .wav file that contains the incorrect word boundary.
  2. In the word track at the bottom of Wave Editor, position the cursor over a blue boundary line.
  3. When the cursor changes to a double-arrow, click and drag the word boundary to a new location.
  4. Listen to the new word boundaries as described in To listen to a word boundary.

Note  Limit edits that physically alter the .wav files in a .promptdb file to error correction, because the same .promptdb file can be used in multiple .prompts files.

To delete a word boundary

  1. Open Wave Editor, displaying the .wav file that contains the unneeded word boundary.
  2. In the word track at the bottom of Wave Editor, position the cursor over the word boundary.
  3. When the cursor changes to a double-arrow, right-click and select Delete Label from the shortcut menu.

Note  Moving or deleting word boundaries does not alter the .wav file, but if the word boundaries do not match words in the transcription, the resulting extractions are unusable.

Using the Spectrum View

Users with advanced backgrounds in speech signal processing might find the Spectrum view useful in determining word boundaries.

To open Spectrum view

  • On the Wave menu, select Show Spectrum.

To adjust contrast in Spectrum view

  1. On the Wave menu, select Adjust Contrast. The Spectrum view enters contrast mode.
  2. Click and drag within the Spectrum view to set the contrast level.
  3. On the Wave menu, select Adjust Contrast to exit contrast mode.

To change Spectrum view options

  1. On the Wave menu, select Spectrum Options. The Spectrum Options dialog box appears.
  2. Change settings in the dialog box, then click OK.

To exit Spectrum view

  • On the Wave menu, select Show Spectrum.

Increasing or Reducing Silence Between Words

The second issue in tuning alignments is whether enough silence surrounds words and phrases. Moving a word boundary changes the amount of silence at the beginning or end of a word or phrase. To add silence, paste a silent segment copied from another part of the same .wav file or from another file.

To paste a silent segment within the same file

  1. Open Wave Editor, displaying the .wav file that contains the word requiring additional silence.
  2. Click and drag to select part of the waveform that represents silence. These areas may be marked with a <\sil> tag.
  3. Right-click the selection, and on the shortcut menu, click Copy.
  4. Double-click the target location in the waveform to create an insertion bar. An insertion bar is a blue line, with blue arrows at the top and bottom of the line.
  5. Right-click the insertion bar, and on the shortcut menu, click Paste.

To copy a silent segment between files

  1. Open Wave Editor, displaying the .wav file that contains the source segment.
  2. Click the prompt database tab to display the Transcription pane.
  3. In the Transcription pane, double-click either the Has Wave icon or the Has Alignments icon for the recording containing the target segment.
  4. Copy and paste as described in To paste a silent segment within the same file.

Note  Pasting silent segments alters the .wav file.
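
As a rough illustration of why pasting alters the audio data, the following sketch performs the same copy-and-paste on raw PCM frames. It is not how Wave Editor is implemented; the paste_segment function and its parameters are hypothetical, and the sample bytes are made up. It assumes uncompressed PCM audio whose frame size in bytes is the sample width multiplied by the channel count.

```python
def paste_segment(pcm: bytes, frame_size: int,
                  src_start: int, src_end: int, insert_at: int) -> bytes:
    """Copy frames [src_start, src_end) and insert them at frame insert_at.

    All positions are frame indices into the original data, not byte offsets.
    """
    n_frames = len(pcm) // frame_size
    if not 0 <= src_start <= src_end <= n_frames:
        raise ValueError("source range is outside the audio data")
    if not 0 <= insert_at <= n_frames:
        raise ValueError("insertion point is outside the audio data")
    segment = pcm[src_start * frame_size:src_end * frame_size]
    cut = insert_at * frame_size
    # The result is longer than the input: the audio itself has changed.
    return pcm[:cut] + segment + pcm[cut:]


# Six 16-bit mono frames with sample values 1, 2, 0, 0, 3, 4.
# Frames 2 and 3 stand in for a silent stretch.
pcm = b"\x01\x00\x02\x00\x00\x00\x00\x00\x03\x00\x04\x00"
longer = paste_segment(pcm, frame_size=2, src_start=2, src_end=4, insert_at=5)
print(len(pcm) // 2, "frames before,", len(longer) // 2, "frames after")
```

With Python's standard wave module, the frame data and frame size for a real file could be obtained from readframes, getsampwidth, and getnchannels.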

Tuning Extractions

Extraction boundaries are separate from alignments. The start and end time for an extraction can be altered to include all of the silence between it and adjacent words. For example, the recording "one two three" might contain three extractions, one for each word. The extraction "one" can include all the silence between it and "two." The extraction "two" can include all the silence between it and "one" and "three." The extraction "three" can include all the silence between it and "two."
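
The "one two three" example can be worked through numerically. The sketch below assumes each word's alignment is available as a (label, start, end) record in seconds; the Word type, the extraction_bounds function, and the times are hypothetical and only illustrate the arithmetic. Each extraction is widened to the end of the previous word and the start of the next word, so adjacent extractions share the silence between them.

```python
from typing import List, NamedTuple, Tuple


class Word(NamedTuple):
    """Hypothetical aligned word (times in seconds)."""
    label: str
    start: float
    end: float


def extraction_bounds(words: List[Word]) -> List[Tuple[str, float, float]]:
    """Widen each word's extraction to include the silence up to its neighbors."""
    ordered = sorted(words, key=lambda w: w.start)
    bounds = []
    for i, word in enumerate(ordered):
        # Reach back to where the previous word ends, if there is one.
        start = ordered[i - 1].end if i > 0 else word.start
        # Reach forward to where the next word starts, if there is one.
        end = ordered[i + 1].start if i < len(ordered) - 1 else word.end
        bounds.append((word.label, start, end))
    return bounds


words = [
    Word("one", 0.10, 0.40),
    Word("two", 0.55, 0.85),
    Word("three", 1.00, 1.30),
]

for label, start, end in extraction_bounds(words):
    print(label, start, end)
# one 0.1 0.55
# two 0.4 1.0
# three 0.85 1.3
```

The sketch leaves the start of the first word and the end of the last word unchanged, because the example in the text describes only the silence between words.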

Another way to increase silence around words is to paste a silent segment from another part of the same .wav file, or to copy a silent segment from another file, as described in the section Increasing or Reducing Silence Between Words.

Change the start and end times for extractions using the graphic interface in Wave Editor, or edit field values in the Extraction pane.

To change extraction start and end time using Wave Editor

  1. In the Extraction pane, right-click a column heading.
  2. In the Extraction pane dialog box, select Has Wave if it is not already selected, and click OK.
  3. In the Extraction pane, double-click the Has Wave icon for the extraction.
  4. In the Ext track at the bottom of Wave Editor, position the cursor over a purple boundary line.
  5. When the cursor changes to a double-arrow, click and drag the extraction boundary to a new location.
  6. To evaluate the new settings, click the arrow on the End label to play the extractions.

See Also

Prompting the User | Creating Prompts | Prompts