Hello Thomas Frei,
Welcome to Microsoft Q&A! Thanks for posting the question.
The error message indicates that you’ve exceeded the token rate limit of your current AI Services S0 pricing tier.
Azure OpenAI’s quota feature enables assignment of rate limits to your deployments, up to a global limit called your “quota.” Quota is assigned to your subscription on a per-region, per-model basis in units of Tokens-per-Minute (TPM).
You can check this documentation for more details.
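As a rough illustration of how a regional quota is shared between deployments, here is a minimal Python sketch. The quota figure, deployment names and the `remaining_quota` helper are all hypothetical and only show the arithmetic; this is not an Azure API.

```python
def remaining_quota(region_quota_tpm, deployment_tpm):
    """Return the TPM still unassigned in one region/model quota pool.

    region_quota_tpm: total TPM quota for the model in the region (hypothetical value).
    deployment_tpm:   mapping of deployment name -> TPM assigned to it.
    """
    assigned = sum(deployment_tpm.values())
    if assigned > region_quota_tpm:
        raise ValueError("assigned TPM exceeds the regional quota")
    return region_quota_tpm - assigned


# Hypothetical numbers: 240K TPM quota shared by two deployments.
deployments = {"chat-prod": 150_000, "chat-dev": 50_000}
print(remaining_quota(240_000, deployments))  # 40000
```

In this example only 40K TPM would be left to assign to a new deployment of the same model in that region unless the quota is increased or an existing deployment's limit is lowered.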
To give more context, both Tokens-Per-Minute (TPM) and Requests-Per-Minute (RPM) rate limits apply to the deployment.
TPM rate limits are based on the maximum number of tokens that are estimated to be processed by a request at the time the request is received.
RPM rate limits are based on the number of requests received over time. The rate limit expects that requests be evenly distributed over a one-minute period. If this average flow isn't maintained, then requests may receive an error response even though the limit isn't met when measured over the course of a minute.
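Because the RPM limit expects an even flow, one option is to pace requests on the client side. Below is a minimal Python sketch, assuming you know your deployment's RPM limit; the `RequestPacer` class is a hypothetical helper, not part of any Azure SDK, and it does not account for the TPM limit.

```python
import time


class RequestPacer:
    """Space calls so that at most rpm_limit requests start per minute, evenly spread."""

    def __init__(self, rpm_limit, clock=time.monotonic, sleep=time.sleep):
        if rpm_limit <= 0:
            raise ValueError("rpm_limit must be positive")
        self.interval = 60.0 / rpm_limit  # minimum seconds between two requests
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = None  # earliest time the next request may start

    def wait(self):
        """Block until the next request is allowed to start."""
        now = self._clock()
        if self._next_allowed is not None and now < self._next_allowed:
            self._sleep(self._next_allowed - now)
            now = self._next_allowed
        self._next_allowed = now + self.interval


# Usage: call pacer.wait() right before each request to the deployment.
pacer = RequestPacer(rpm_limit=60)  # hypothetical limit: one request per second
```

Even with pacing, the service may still return a rate-limit error when the token limit is hit, so retrying after a short delay is still worth having.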
Please refer to the document Manage Azure OpenAI Service quota for more details.
To view your quota allocations across deployments in a given region, select Shared Resources > Quota in Azure OpenAI Studio, then click the link there to request a quota increase.
Also ensure that the resource and the resource group were created in the same region.
I hope this helps. Do let me know if you have any further queries.
If this answers your query, do click Accept Answer and Yes for "Was this answer helpful".
Thank you!