Specify translation options

Completed

The Translate function of the API supports numerous parameters that affect the output.

Word alignment

In written English (using Latin script), spaces are used to separate words. However, in some other languages (and more specifically, scripts) this is not always the case.

For example, translating "Smart Services" from en (English) to zh (Simplified Chinese) produces the result "智能服务", and it's difficult to understand the relationship between the characters in the source text and the corresponding characters in the translation. To resolve this problem, you can specify the includeAlignment parameter with a value of true in your call to produce the following result:

[
   {
      "translations":[
         {
            "text":"智能服务",
            "to":"zh-Hans",
            "alignment":{
               "proj":"0:4-0:1 6:13-2:3"
            }
         }
      ]
   }
]

These results tell us that characters 0 to 4 in the source correspond to characters 0 to 1 in the translation, while characters 6 to 13 in the source correspond to characters 2 to 3 in the translation.

Sentence length

Sometimes it might be useful to know the length of a translation, for example to determine how best to display it in a user interface. You can get this information by setting the includeSentenceLength parameter to true.

For example, specifying this parameter when translating the English (en) text "Hello world" to French (fr) produces the following results:

[
   {
      "translations":[
         {
            "text":"Salut tout le monde",
            "to":"fr",
            "sentLen":{"srcSentLen":[12],"transSentLen":[20]}
         }
      ]
   }
]

Profanity filtering

Sometimes text contains profanities, which you might want to obscure or omit altogether in a translation. You can handle profanities by specifying the profanityAction parameter, which can have one of the following values:

  • NoAction: Profanities are translated along with the rest of the text.
  • Deleted: Profanities are omitted in the translation.
  • Marked: Profanities are indicated using the technique indicated in the profanityMarker parameter (if supplied). The default value for this parameter is Asterisk, which replaces characters in profanities with "*". As an alternative, you can specify a profanityMarker value of Tag, which causes profanities to be enclosed in XML tags.

For example, translating the English (en) text "JSON is ▇▇▇▇ great!" (where the blocked out word is a profanity) to German (de) with a profanityAction of Marked and a profanityMarker of Asterisk produces the following result:

[
   {
      "translations":[
         {
            "text":"JSON ist *** erstaunlich.",
            "to":"de"
         }
      ]
   }
]

Note

To learn more about the translation options, including some not described here, see the Azure AI Translator API documentation.