Update Kubernetes and node images across multiple clusters using Azure Kubernetes Fleet Manager

Platform admins who manage a large number of clusters often struggle to stage updates across those clusters (for example, upgrading the node OS image or Kubernetes version) in a safe and predictable way. To address this challenge, Azure Kubernetes Fleet Manager (Fleet) lets you orchestrate updates across multiple clusters by using update runs.

Update runs consist of stages, groups, and strategies. You can apply them manually for one-time updates, or automatically for ongoing regular updates by using auto-upgrade profiles. All update runs (manual or automated) honor member cluster maintenance windows.

This guide covers how to configure and manually execute update runs.

Screenshot of the Azure portal pane for a fleet resource, showing member cluster Kubernetes versions and node images in use across all node pools of member clusters.

Prerequisites

  • Read the conceptual overview of this feature, which provides an explanation of update strategies, runs, stages, and groups referenced in this guide.

  • You must have a Fleet resource with one or more member clusters. If you don't, follow the quickstart to create a Fleet resource and join Azure Kubernetes Service (AKS) clusters as members.

  • Set the following environment variables:

    export GROUP=<resource-group>
    export FLEET=<fleet-name>
    
  • If you're following the Azure CLI instructions in this article, you need Azure CLI version 2.58.0 or later installed. To install or upgrade, see Install the Azure CLI.

  • You also need the fleet Azure CLI extension, which you can install by running the following command:

    az extension add --name fleet
    

    To update to the latest released version of the extension, run the az extension update command:

    az extension update --name fleet
    

Creating update runs

Note

Update runs honor the planned maintenance windows that you set at the AKS cluster level. For more information, see planned maintenance across multiple member clusters, which explains how update runs handle member clusters configured with planned maintenance windows.

Update runs support two options for the cluster upgrade sequence:

  • One by one: If you don't need to control the cluster upgrade sequence, one by one provides a simple way to upgrade all member clusters of the fleet in sequence, one at a time.
  • Control sequence of clusters using update groups and stages: If you want to control the cluster upgrade sequence, you can structure member clusters in update groups and update stages. You can store this sequence as a template in the form of an update strategy, and then create update runs from that strategy later instead of defining the sequence every time you create an update run.

Update all clusters one by one

  1. In the Azure portal, navigate to your Azure Kubernetes Fleet Manager resource.

  2. From the service menu, under Settings, select Multi-cluster update > Create a run.

  3. Enter a name for the update run, and then select One by one for the upgrade type.

    Screenshot of the Azure portal pane for creating update runs that update clusters one by one in Azure Kubernetes Fleet Manager.

  4. Select one of the following options for the Upgrade scope:

    • Kubernetes version for both control plane and node pools
    • Kubernetes version for only control plane of the cluster
    • Node image version only
  5. Select one of the following options for the Node image:

    • Latest image: Updates every AKS cluster in the update run to the latest image available for that cluster in its region.
    • Consistent image: An update run can include AKS clusters in multiple regions, where the latest available node images can differ (check the release tracker for more information). With this option, the update run picks the latest image that's common to all of these regions to achieve consistency.

    Screenshot of the Azure portal pane for creating update runs. The upgrade scope section is shown.

  6. Select Create to create the update run.
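The portal steps above can also be sketched with the fleet Azure CLI extension. The following is a minimal sketch, not a definitive procedure: it assumes the GROUP and FLEET variables from the prerequisites are set, and the function names, run names, and version argument are hypothetical placeholders. To the best of our understanding, creating a run without a stage layout gives the one-by-one sequence. Confirm the flags with az fleet updaterun create --help for your extension version.

```shell
# Hypothetical helper functions; names and arguments are illustrative only.

# Upgrade scope "Kubernetes version for both control plane and node pools".
# $1 = update run name, $2 = target Kubernetes version
upgrade_kubernetes_one_by_one() {
  az fleet updaterun create \
    --resource-group "$GROUP" \
    --fleet-name "$FLEET" \
    --name "$1" \
    --upgrade-type Full \
    --kubernetes-version "$2" \
    --node-image-selection Latest
}

# Upgrade scope "Node image version only", using the Consistent image option.
# $1 = update run name
upgrade_node_image_one_by_one() {
  az fleet updaterun create \
    --resource-group "$GROUP" \
    --fleet-name "$FLEET" \
    --name "$1" \
    --upgrade-type NodeImageOnly \
    --node-image-selection Consistent
}

# Example calls (placeholders, not run here):
#   upgrade_kubernetes_one_by_one run-1 <kubernetes-version>
#   upgrade_node_image_one_by_one run-2
```

For the control plane only scope, the same command is expected to accept --upgrade-type ControlPlaneOnly together with --kubernetes-version.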

Assign clusters to update groups and stages

Update groups and stages provide more control over the sequence that update runs follow when you're updating the clusters. Within an update stage, updates are applied to all the different update groups in parallel. Within an update group, member clusters update sequentially.

You can assign a member cluster to a specific update group in one of two ways:

Assign to group when adding member cluster to the fleet

  1. In the Azure portal, navigate to your Azure Kubernetes Fleet Manager resource.

  2. From the service menu, under Settings, select Member clusters > Add.

    Screenshot of the Azure portal page for Azure Kubernetes Fleet Manager member clusters.

  3. Select the cluster that you want to add, and then select Next: Review + add.

  4. Enter the name of the update group that you want to assign the cluster to, and then select Add.
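As a CLI counterpart to these steps, the sketch below joins a cluster to the fleet and assigns its update group in one call. It assumes GROUP and FLEET are set as in the prerequisites; the function name and its arguments (member name, AKS cluster resource ID, group name) are hypothetical, so check az fleet member create --help before relying on it.

```shell
# Hypothetical helper; the function name and argument order are illustrative.
# $1 = member name, $2 = AKS cluster resource ID, $3 = update group name
join_member_with_group() {
  az fleet member create \
    --resource-group "$GROUP" \
    --fleet-name "$FLEET" \
    --name "$1" \
    --member-cluster-id "$2" \
    --update-group "$3"
}

# Example call (placeholders, not run here):
#   join_member_with_group member1 <aks-cluster-resource-id> group-1a
```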

Assign an existing fleet member to an update group

  1. In the Azure portal, navigate to your Azure Kubernetes Fleet Manager resource.

  2. From the service menu, under Settings, select Member clusters.

  3. Select the cluster or clusters that you want to assign to an update group, and then select Assign update group.

    Screenshot of the Azure portal page for assigning existing member clusters to a group.

  4. Enter the name of the update group that you want to assign the cluster to, and then select Assign.

    Screenshot of the Azure portal page for member clusters that shows the form for updating a member cluster's group.
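For an existing member, the same assignment can be sketched from the command line. This assumes GROUP and FLEET are set as in the prerequisites and that the fleet extension's member update command accepts an update group; the function name and the member and group names are hypothetical.

```shell
# Hypothetical helper; the function name is illustrative.
# $1 = existing member name, $2 = update group name to assign
assign_update_group() {
  az fleet member update \
    --resource-group "$GROUP" \
    --fleet-name "$FLEET" \
    --name "$1" \
    --update-group "$2"
}

# Example call (placeholders, not run here):
#   assign_update_group member1 group-1a
```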

Note

A fleet member can be part of only one update group, but an update group can have multiple fleet members assigned to it. An update group isn't a separate resource type. Update groups are only strings that fleet members reference. So, if all fleet members that reference a common update group are deleted, that update group ceases to exist as well.

Define an update run and stages

You can define an update run that uses update stages to sequentially order the application of updates to different update groups. For example, a first update stage might update test environment member clusters, and a second update stage might then update production environment member clusters. You can also specify a wait time between the update stages.

  1. In the Azure portal, navigate to your Azure Kubernetes Fleet Manager resource.

  2. From the service menu, under Settings, select Multi-cluster update > Create a run.

  3. Enter a name for the update run, and then select Stages for the update sequence type.

    Screenshot of the Azure portal page for choosing stages mode within update run.

  4. Select Create stage, and then enter a name for the stage and the wait time between stages.

    Screenshot of the Azure portal page for creating a stage and defining wait time.

  5. Select the update groups that you want to include in this stage. You can also specify the order of the update groups if you want to update them in a specific sequence. When you're done, select Create.

    Screenshot of the Azure portal page for stage creation that shows the selection of upgrade groups.

  6. Select one of the following options for the Upgrade scope:

    • Kubernetes version for both control plane and node pools
    • Kubernetes version for only control plane of the cluster
    • Node image version only
  7. Select one of the following options for the Node image:

    • Latest image: Updates every AKS cluster in the update run to the latest image available for that cluster in its region.
    • Consistent image: An update run can include AKS clusters in multiple regions, where the latest available node images can differ (check the release tracker for more information). With this option, the update run picks the latest image that's common to all of these regions to achieve consistency.

    Screenshot of the Azure portal pane for creating update runs. The upgrade scope section is shown.

  8. Select Create to create the update run.

    Specifying stages and their order every time you create an update run can get repetitive and cumbersome. Update strategies simplify this process by allowing you to store templates for update runs. For more information, see update strategy creation and usage.

  9. In the Multi-cluster update menu, select the update run, and then select Start.
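The staged flow above can be sketched with the CLI by describing the stages in a JSON file and passing it when creating the run. Treat this as a sketch under assumptions: the file name, stage names, group names, and the 3600-second wait are hypothetical, and the JSON layout reflects our understanding of what the fleet extension accepts, so verify it with az fleet updaterun create --help. GROUP and FLEET are assumed to be set as in the prerequisites.

```shell
# Write a hypothetical two-stage layout: test groups first, then production.
# Stage names, group names, and the wait time are placeholders.
cat > example-stages.json <<'EOF'
{
  "stages": [
    {
      "name": "stage1",
      "groups": [
        { "name": "group-1a" },
        { "name": "group-1b" }
      ],
      "afterStageWaitInSeconds": 3600
    },
    {
      "name": "stage2",
      "groups": [
        { "name": "group-2a" }
      ]
    }
  ]
}
EOF

# Hypothetical helper; the function name is illustrative.
# $1 = update run name, $2 = target Kubernetes version
create_staged_run() {
  az fleet updaterun create \
    --resource-group "$GROUP" \
    --fleet-name "$FLEET" \
    --name "$1" \
    --upgrade-type Full \
    --kubernetes-version "$2" \
    --node-image-selection Latest \
    --stages example-stages.json
}

# Example call (placeholders, not run here):
#   create_staged_run run-3 <kubernetes-version>
```

Creating the run doesn't start it. As in step 9, the run still has to be started before any cluster is updated.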

Create an update run using update strategies

Creating an update run requires you to specify the stages, groups, and their order each time. Update strategies simplify this process by allowing you to store templates for update runs.

Note

It's possible to create multiple update runs with unique names from the same update strategy.

You can create an update strategy using one of the following methods:

Save an update strategy while creating an update run

  • Save an update strategy while creating an update run in the Azure portal:

    A screenshot of the Azure portal showing update run stages being saved as an update strategy.

Create a new update strategy and reference it when creating an update run

  1. Navigate to the Multi-cluster update page, and then select Strategies > Create a strategy:

    A screenshot of the Azure portal showing creation of update strategy.

  2. Configure the update strategy details, and then select Create.

  3. Reference the update strategy when creating subsequent update runs:

    A screenshot of the Azure portal showing the creation of a new update run. The 'Copy from existing strategy' button is highlighted.
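A CLI sketch of the second method follows: store the stage layout once as a strategy, then reference it by name from each new run. It assumes GROUP and FLEET are set as in the prerequisites and that you have a JSON file describing the stages. The function names, strategy name, run name, and file path are hypothetical, and the flags should be confirmed with az fleet updatestrategy create --help and az fleet updaterun create --help.

```shell
# Hypothetical helpers; function names and arguments are illustrative.

# $1 = strategy name, $2 = path to a JSON file that describes the stages
create_strategy() {
  az fleet updatestrategy create \
    --resource-group "$GROUP" \
    --fleet-name "$FLEET" \
    --name "$1" \
    --stages "$2"
}

# $1 = update run name, $2 = name of an existing strategy
create_node_image_run_from_strategy() {
  az fleet updaterun create \
    --resource-group "$GROUP" \
    --fleet-name "$FLEET" \
    --name "$1" \
    --update-strategy-name "$2" \
    --upgrade-type NodeImageOnly \
    --node-image-selection Consistent
}

# Example calls (placeholders, not run here):
#   create_strategy strategy-1 <path-to-stages-json>
#   create_node_image_run_from_strategy run-4 strategy-1
```

Because the strategy is only a template, the second function can be called repeatedly with different run names against the same strategy.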

Manage an update run

The following sections explain how to manage an update run using the Azure portal and Azure CLI.

  • On the Multi-cluster update page of the fleet resource, you can Start an update run that's in either the Not started or Failed state:

    A screenshot of the Azure portal showing how to start an update run in the 'Not started' state.

  • On the Multi-cluster update page of the fleet resource, you can Stop a currently Running update run:

    A screenshot of the Azure portal showing how to stop an update run in the 'Running' state.

  • Within any update run in the Not started, Failed, or Running state, you can select any Stage and Skip the upgrade:

    A screenshot of the Azure portal showing how to skip upgrade for a specific stage in an update run.

    You can similarly skip the upgrade at the update group or member cluster level too.
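The same start, stop, and skip actions can be sketched with the Azure CLI. This is a sketch under assumptions: GROUP and FLEET are set as in the prerequisites, the function names and run, stage, and group names are hypothetical, and the skip command and its Type:name target format depend on a sufficiently recent fleet extension. Check az fleet updaterun --help for what your version supports.

```shell
# Hypothetical helpers; function names are illustrative.

# $1 = update run name (run must be in the Not started or Failed state)
start_run() {
  az fleet updaterun start \
    --resource-group "$GROUP" --fleet-name "$FLEET" --name "$1"
}

# $1 = update run name (run must be in the Running state)
stop_run() {
  az fleet updaterun stop \
    --resource-group "$GROUP" --fleet-name "$FLEET" --name "$1"
}

# $1 = update run name; remaining arguments are targets to skip, written as
# Stage:<name>, Group:<name>, or Member:<name>
skip_targets() {
  run_name="$1"
  shift
  az fleet updaterun skip \
    --resource-group "$GROUP" --fleet-name "$FLEET" --name "$run_name" \
    --targets "$@"
}

# Example calls (placeholders, not run here):
#   start_run run-1
#   skip_targets run-1 Stage:stage1 Group:group-1a
```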

For more information, see the conceptual overview of update run states and of skip behavior for runs, stages, and groups.

Next steps

Learn more about Azure Kubernetes Fleet Manager.