Cluster in state failed with QuotaExceeded after upgrade

SK 20 Reputation points
2024-10-14T13:17:21.51+00:00

After upgrading an Azure Kubernetes Service cluster to a new Kubernetes version (1.29.8), the cluster is stuck in the cluster operation status "Failed" with the message:
"...Operation could not be completed as it results in exceeding approved Total Regional Cores quota. Additional details - Deployment Model: Resource Manager, Location: westeurope, Current Limit: 50, Current Usage: 40, Additional Required: 16, (Minimum) New Limit Required: 56. ..."
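The numbers in the error message can be checked with a little arithmetic. The sketch below only restates the values quoted above; the variable names are illustrative and not part of any Azure tooling.

```shell
# Values copied from the QuotaExceeded message above.
current_limit=50
current_usage=40
additional_required=16

# The upgrade needs the current usage plus the additional cores at the same time.
new_limit_required=$((current_usage + additional_required))

# How far the approved quota falls short of that.
shortfall=$((new_limit_required - current_limit))

echo "Minimum new limit required: ${new_limit_required}"
echo "Shortfall against current limit: ${shortfall}"
```

So, by the figures Azure reports, the operation is 6 cores over the approved regional limit.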

I was able to fix the node pools by using the Reconcile option in the Azure portal under "Diagnose and solve problems".
The "Reconcile the AKS Cluster" option, however, does not solve the issue.

The strange thing is that when I look at the quota "Total regional vCPUs" for region "West Europe" in the Azure portal, it shows a usage of "20 of 50" instead of the 40 mentioned in the error message.
The usage of 20 should be correct, since there are 5 nodes running (node size: Standard_E4as_v5 -> 4 vCPUs each, which sums to 20).
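As a rough check of the expected usage, here is a minimal sketch using the node count and node size stated above. The commented-out Azure CLI call is one way to compare against the usage Azure reports for the region; it assumes a logged-in Azure CLI and is not executed here.

```shell
# Figures from the question: 5 nodes of size Standard_E4as_v5 (4 vCPUs each).
node_count=5
vcpus_per_node=4

expected_usage=$((node_count * vcpus_per_node))
echo "Expected regional vCPU usage: ${expected_usage}"

# To see the usage and limits Azure reports for the region (requires 'az login'):
# az vm list-usage --location westeurope --output table
```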

Any ideas?
Thanks

Azure Kubernetes Service (AKS)
An Azure service that provides serverless Kubernetes, an integrated continuous integration and continuous delivery experience, and enterprise-grade security and governance.

Accepted answer
  1. Srinud 2,425 Reputation points Microsoft Vendor
    2024-10-14T21:04:27.5433333+00:00

    Hi SK,

    I'm glad that you were able to resolve your issue, and thank you for posting your solution so that others experiencing the same thing can easily reference it. Since the Microsoft Q&A community has a policy that "The question author cannot accept their own answer. They can only accept answers by others", I'll repost your solution in case you'd like to "Accept" the answer.

    Issue:

    Cluster in state failed with QuotaExceeded after upgrade.

    Solution:

    Running the command az aks upgrade --resource-group <resource_group_name> --name <cluster_name> --yes --debug without specifying a Kubernetes version successfully upgraded the AKS cluster to the latest version, resolving the issue and changing the state from "Failed" to "Succeeded".
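The solution can be sketched as a small script. This is an illustration under assumptions, not a verified procedure: the function names retry_upgrade and check_state are hypothetical helpers, the placeholders must be replaced with real values, and what happens when no Kubernetes version is given is as reported in this thread. The functions are only defined here, not run.

```shell
# Placeholders from the answer above; replace with your own values.
RESOURCE_GROUP="<resource_group_name>"
CLUSTER_NAME="<cluster_name>"

# Hypothetical helper: re-run the upgrade without specifying a Kubernetes version.
retry_upgrade() {
  az aks upgrade \
    --resource-group "$RESOURCE_GROUP" \
    --name "$CLUSTER_NAME" \
    --yes --debug
}

# Hypothetical helper: print the cluster's provisioning state afterwards.
check_state() {
  az aks show \
    --resource-group "$RESOURCE_GROUP" \
    --name "$CLUSTER_NAME" \
    --query provisioningState --output tsv
}
```

After setting the two variables, calling retry_upgrade followed by check_state should show whether the state has left "Failed".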

    If you have any other questions or are still running into more issues, please let me know. Thank you again for your time and patience throughout this issue.

    Please remember to "Accept Answer" if any answer/reply helped, so that others in the community facing similar issues can easily find the solution.

    Thank you.
