Azure AI Speech

0 answers

Request for Access to Azure Speaker Recognition Service

Dear Azure Support Team, I hope this message finds you well. My name is Jamil Al Nashash, and I am currently working on a project that involves speech to text. I am looking to integrate Speaker Recognition into our application using Azure speech AI…

asked

Jamil Al Nashash 0

edited the question

Grmacjon-MSFT 18,646

0 answers

Questions about AI Speech test (evaluation) error

Questions regarding abnormal behavior in Azure AI Speech. 1. Model testing failed (Invalid or empty input data for evaluation): When I tried to create a test with the uploaded dataset as sample data, the model testing failed with invalid…

asked

Hyo Choi 0 Microsoft Employee

commented

Avinash Devarakonda 605 Microsoft Vendor

1 answer

What input formats does Batch Speech to Text support?

Is there any documentation or are there any guidelines regarding the input formats supported by the Batch Speech to Text service? I have two mp4 files with different properties; one can be transcribed (bitrate 62 kbps, mono, 16,000 Hz), while the other cannot…

asked

日立s　018 20

commented

Tong Viet Anh 0

0 answers

Status: 500 trying to upload a large 400 MB file to Video Translation service in Speech Studio

See attached image. It works with small files but times out on a large file (400 MB, 40 min).

asked

VK 0

commented

Avinash Devarakonda 605 Microsoft Vendor

1 answer

"SSML parsing error: 0x8004801c - Wanted data exists" fails with lexicon based on 'fr-CA-SylvieNeural' voice in Azure Speech SDK, but works via Web Portal

Hello, We are using the microsoft-cognitiveservices-speech-sdk npm module for text-to-speech synthesis in our Node.js application. We have an SSML segment as follows: <?xml version="1.0"?> <speak version="1.0"…

asked

Hugo Machefer 0

commented

Hugo MACHEFER 0

1 answer

SpeechSynthesizer causes JVM crash during finalization (Speech client-sdk for Java)

https://github.com/Azure-Samples/cognitive-services-speech-sdk/issues/2701 As I detailed in the above GitHub issue, there is a reproducible bug in the Java client-sdk for Azure Speech. During finalization of a SpeechSynthesizer instance it can cause a…

asked

Lucas Mikulla 0

commented

Lucas Mikulla 0

2 answers

Inquiry About Azure Speech SDK for Apple Vision OS

Dear Azure Support Team, We have encountered some challenges and would appreciate your assistance. Here are the details of our inquiry: Current Status: We are currently trying to implement the Azure Speech SDK on Apple Vision OS. Unfortunately, we…

asked

XRSPACE-Bowen Huang 0

commented

Saideep Anchuri 685 Microsoft Vendor

1 answer

Accuracy issues reading numbers with Text to speech voices

I wanted your perspective on an issue I'm having with two new voices reading numbers. Occasionally, Adam Multilingual and Lewis Multilingual will read numbers that start with "four" in a way that sounds like "for five six seven" like…

asked

Andrew Silagy 0

commented

navba-MSFT 26,970 Microsoft Employee
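
One workaround that is sometimes tried for number-reading problems like the one described above is to wrap the number in an SSML `say-as` element so the engine reads it digit by digit. The following is only a sketch: the helper name `digits_ssml` and the voice name passed in are hypothetical, and nothing in the thread confirms that this resolves the issue with these particular voices.

```python
from xml.sax.saxutils import escape, quoteattr


def digits_ssml(number: str, voice: str, lang: str = "en-US") -> str:
    """Build an SSML document asking the voice to read `number` digit by digit.

    `voice` is whatever voice name the caller uses; it is not validated here.
    """
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        f"xml:lang={quoteattr(lang)}>"
        f"<voice name={quoteattr(voice)}>"
        # interpret-as="digits" requests a digit-by-digit reading.
        f'<say-as interpret-as="digits">{escape(number)}</say-as>'
        "</voice></speak>"
    )
```

The resulting string would be passed to the SDK's SSML synthesis call instead of plain text; whether the pronunciation actually improves has to be checked by listening to the output.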

2 answers

I would like to know if there are any other avatars besides the Asian figures for text to speech? How do I access them?

I would like to have the option to select different avatar figures besides the Asian one shown.

asked

Dr. Golden 0

commented

Prodip K. Saha 0

0 answers

Azure Speech JS SDK Returns Single Item in NBest Array

When using the Cognitive Services JavaScript Speech SDK with OutputFormat.Detailed and the recognizeOnceAsync approach, the NBest array consistently contains only a single object instead of the expected multiple alternatives. For example, when…

asked

Tom D 0

edited a comment

Tom D 0

1 answer

Azure Speech - Custom Keyword

I'm trying to generate a .table file, but I'm not able to get a successful training. The training model always gives a "fail" state. I've even tried the "Hello Computer" keyword... it doesn't work. Am I doing something wrong? =>…

asked

Guillaume Demicheli 126

commented

Michael Daubenmier 0

3 answers

How to use the Whisper model to transcribe audio in real time using Speech SDK?

How do I use the Whisper model to transcribe microphone input in real time using the microsoft-cognitiveservices-speech-sdk npm package? I currently have this working and my region is set to northcentralus. I want to know how to use Whisper to transcribe in…

asked

Nas 0

answered

Meezaan Ryklief👍🙂 0

1 answer

Why do I get wrong visemes when using German with English phrases?

I am using "Azure Speech" to synthesize speech from a text input, and also to generate visemes. When using the German language, if I use an English phrase, it sends me back wrong visemes. The ts value is not good: the last viseme has ts: 0, which should not happen.…

asked

Veljko Markovic | Babylon Engineer 20

edited a comment

Sina Salam 14,626

2 answers

How to access Whisper in Azure Speech Batch Transcription

When listing base models from the API, I do not have the Whisper option. How do I enable it? I am following this tutorial: https://zcusa.951200.xyz/en-us/azure/ai-services/speech-service/batch-transcription-create?pivots=rest-api#use-a-whisper-model This is…

asked

Václav Bílek 0

answered

romungi-MSFT 48,221 Microsoft Employee

1 answer

"Internal server error" from Azure on Filipino speech-to-text

We have been using Azure speech-to-text / transcription services for generating transcripts. Recently (I first noticed this on Monday 9 December 2024) we have seen a very high chance of "Internal server error" in the transcription result of…

asked

James Hu 20

accepted

James Hu 20

1 answer

About speaker separation in "fast-transcription-api"

Dear Azure Support Team https://zcusa.951200.xyz/en-us/rest/api/speechtotext/transcriptions/transcribe?view=rest-speechtotext-2024-05-15-preview&tabs=HTTP The details of the TranscribeDefinition class are not described anywhere, so how should I do…

asked

y.ashibe 45

commented

Schuster, Björn 0

1 answer

Fine-tuning speech-to-text base model for better address recognition

Hello, my team is creating a solution to transcribe addresses with higher accuracy. Our initial benchmarks for using an STT base model for address transcription suggest that it needs to be improved in order to be utilized in a production environment. I…

asked

Caesar Cavales 50

accepted

Caesar Cavales 50

1 answer

Handling Special Characters in Azure TTS Input

Hello, I’ve noticed that when sending text with special characters like \n (newline) to the Azure Text-to-Speech (TTS) engine, the output is synthesized literally as "backslash n." For now, we’re removing \n before sending the text to the TTS…

asked

Ananth Hegde (anahegde) 20

accepted

Ananth Hegde (anahegde) 20
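
The workaround mentioned in the question above (removing `\n` before sending text to the TTS engine) can be sketched as a small pre-processing step. This is a minimal sketch, not the accepted answer: the function name `normalize_tts_text` is hypothetical, and it assumes the input may contain both literal backslash sequences (two characters, such as `\` followed by `n`) and real line breaks.

```python
import re

# Matches a literal backslash followed by n, r, or t (two characters in the
# input text), as opposed to a real control character.
_LITERAL_ESCAPE = re.compile(r"\\[nrt]")


def normalize_tts_text(text: str) -> str:
    """Replace literal escape sequences and real line breaks with single spaces."""
    text = _LITERAL_ESCAPE.sub(" ", text)
    # Collapse real whitespace (newlines, tabs, repeated spaces) into one space.
    return re.sub(r"\s+", " ", text).strip()
```

Note that this sketch treats every `\n`, `\r`, or `\t` pair as an escape sequence, so text that legitimately contains a backslash followed by one of those letters (a Windows path, for example) would be altered as well.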

1 answer

Data upload from Azure Blob Storage for the usage in Speech Studio

Hello everyone, In Speech Studio we can browse for files locally. Is there a way to upload files from Blob Storage? Might this functionality be present in Azure AI Foundry in the future? Many thanks in advance for your assistance!

asked

Mariana Logvinenko 20

commented

santoshkc 11,535 Microsoft Vendor

0 answers

Azure Speech - Custom Keyword

I'm trying to generate a .table file, but I'm not able to get a successful training. The training model gives a "fail" state after more than 48 hours of training. The process is: => Speech Studio => Custom Keyword => New Project => Train…

asked

shahsen 0

commented

romungi-MSFT 48,221 Microsoft Employee

1,836 questions with Azure AI Speech tags
