Multilingual voice returns wrong language when converting numbers with text to speech

TC 0 Reputation points
2024-11-20T18:51:21.77+00:00

Hello,

I am using your API with the following voice: "en-US-AndrewMultilingualNeural".

I also use "es-ES" and "en-US" to specify when each language should be used.

This works for ordinary text, but when the input is just a number such as "2.", the number is always read out in English.

I am using xml:lang='es-ES' to specify the language.

Please help me fix this issue.

Thanks and best regards

Timo

Azure API Management
Azure AI Speech
Azure AI Language

1 answer

  1. romungi-MSFT 48,221 Reputation points Microsoft Employee
    2024-11-21T09:02:44.1833333+00:00

    @TC A good way to test this scenario is to use Azure Speech Studio with a multilingual voice. Here is a screenshot of my test sentence in the studio. From the Speech Studio home page, open the Audio Content Creation tool, where you can test this with any of the voices.

    [Image: screenshot of the test sentence in Speech Studio]

    The tool can also show the SSML view of the above sentence, which confirms the SSML tags to use to get the required voice output. For the above sentence, the SSML is:

    <speak xmlns="http://www.w3.org/2001/10/synthesis" xmlns:mstts="http://www.w3.org/2001/mstts" xmlns:emo="http://www.w3.org/2009/10/emotionml" version="1.0" xml:lang="en-US">
      <voice name="en-US-AndrewMultilingualNeural">
        <lang xml:lang="en-US">This is a test to check English and Spanish language pronunciation with a multilingual voice. This is 2 in english.</lang>
        <lang xml:lang="es-ES"> Este es el 2 en español</lang>.
      </voice>
    </speak>

    The last part of the sentence, "Este es el 2 en español", is in Spanish and contains a 2, which is spoken in Spanish in my audio.

    You could try the above format in your SSML and check if the same works.
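
As a rough illustration of that format, here is a minimal Python sketch that wraps each piece of text in its own `<lang xml:lang="...">` element inside a single `<voice>` element. The function name `build_multilingual_ssml` and the `(locale, text)` segment list are hypothetical helpers invented for this example, not part of any Azure SDK. The sketch only builds and checks the SSML string; it does not call the Speech service, and it does not by itself guarantee how a bare number such as "2." will be read.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def build_multilingual_ssml(voice, segments, default_lang="en-US"):
    """Build an SSML string with one <lang> element per (locale, text) pair.

    Hypothetical helper for illustration; the element layout mirrors the
    SSML shown in the answer above.
    """
    # Escape text and attribute values so characters like & or < stay valid XML.
    body = "".join(
        f"<lang xml:lang={quoteattr(locale)}>{escape(text)}</lang>"
        for locale, text in segments
    )
    ssml = (
        '<speak version="1.0" xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(default_lang)}>"
        f"<voice name={quoteattr(voice)}>{body}</voice></speak>"
    )
    # Raises xml.etree.ElementTree.ParseError if the result is not well-formed.
    ET.fromstring(ssml)
    return ssml


ssml = build_multilingual_ssml(
    "en-US-AndrewMultilingualNeural",
    [
        ("en-US", "This is 2 in English."),
        ("es-ES", "Este es el 2 en español."),
    ],
)
print(ssml)
```

The resulting string can then be passed to whichever call you already use to submit SSML (for example, the Speech SDK's `speak_ssml_async`), which is not shown here.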

    If this answers your query, please click Accept Answer and Yes for "Was this answer helpful". If you have any further questions, do let us know.
