What input formats does Batch Speech to Text support?

日立s 018 20 Reputation points
2024-12-19T03:13:30.4533333+00:00

Is there any documentation or guidance on the input formats supported by the Batch Speech to Text service? I have two mp4 files with different properties: one can be transcribed (62 kbps, mono, 16 kHz), while the other cannot (168 kbps, stereo, 48 kHz).

Azure AI Speech
An Azure service that integrates speech processing into apps and services.

Accepted answer
  1. navba-MSFT 27,065 Reputation points Microsoft Employee
    2024-12-19T09:31:10.32+00:00

    @日立s 018 Thanks for getting back. While the mp4 container is generally supported, the service can only process streamable files. Not all mp4 files are streamable, and the file that fails to transcribe is probably one of them.

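    The file can either be converted to a different format or rewritten in the same format so that it becomes streamable, using this ffmpeg command:

    ffmpeg -i inputvideo.mp4 -movflags faststart -acodec copy -vcodec copy outputvideo.mp4

    Here -movflags faststart moves the file's index (the moov atom) to the start of the file, and -acodec copy / -vcodec copy keep the existing audio and video streams without re-encoding.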
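To see whether a given mp4 needs this treatment before uploading it, one option is to check where the moov box sits relative to the mdat box. The following is only a minimal sketch: it assumes that "streamable" here means the moov box comes before the media data, which is what -movflags faststart arranges, and it is not a description of the service's actual validation. The function names top_level_boxes and is_faststart are hypothetical helpers written for this illustration.

```python
import struct
from typing import BinaryIO, List


def top_level_boxes(stream: BinaryIO) -> List[str]:
    """Return the four-character types of the top-level MP4 boxes, in file order."""
    boxes = []
    while True:
        header = stream.read(8)
        if len(header) < 8:
            break  # end of file (or a truncated header)
        size, box_type = struct.unpack(">I4s", header)
        header_len = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            ext = stream.read(8)
            if len(ext) < 8:
                break
            size = struct.unpack(">Q", ext)[0]
            header_len = 16
        boxes.append(box_type.decode("latin-1"))
        if size == 0:
            break  # a size of 0 means the box runs to the end of the file
        if size < header_len:
            break  # malformed box; stop rather than loop forever
        stream.seek(size - header_len, 1)  # skip the payload
    return boxes


def is_faststart(stream: BinaryIO) -> bool:
    """True if a moov box appears before the mdat box (assumed meaning of 'streamable')."""
    boxes = top_level_boxes(stream)
    if "moov" not in boxes or "mdat" not in boxes:
        return False
    return boxes.index("moov") < boxes.index("mdat")
```

For example, `with open("inputvideo.mp4", "rb") as f: print(is_faststart(f))` would report whether the input file from the command above already has its index at the front. A False result suggests running the ffmpeg command; a True result suggests the failure has another cause.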

    Hope this answers your question.


    Please do not forget to "Accept the answer" and "up-vote" wherever the information provided helps you; this can be beneficial to other community members.

    1 person found this answer helpful.

0 additional answers
