Resource propagation failure: ClusterResourcePlacementRolloutStarted is False

Article
08/06/2024

This article describes how to troubleshoot ClusterResourcePlacementRolloutStarted issues when you propagate resources by using the ClusterResourcePlacement API object in Azure Kubernetes Fleet Manager.

Symptoms

When you use the ClusterResourcePlacement API object in Azure Kubernetes Fleet Manager to propagate resources, the selected resources aren't rolled out to all scheduled clusters, and the status of the ClusterResourcePlacementRolloutStarted condition shows as False.

Note

For more information about why the rollout doesn't start, check the rollout controller logs.

Cause

The ClusterResourcePlacement rollout strategy is blocked because the RollingUpdate configuration is too strict.

Troubleshooting steps

In the ClusterResourcePlacement status section, check placementStatuses to identify the clusters whose RolloutStarted status is set to False.
Locate the corresponding ClusterResourceBinding for each identified cluster. For more information, see "How can I find the latest ClusterResourceBinding resource?" This resource should indicate the status of the Work (whether it was created or updated).
Verify the values of maxUnavailable and maxSurge to make sure that they match your expectations.
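
The first step can be sketched in Python. This is only an illustration: the helper name clusters_with_blocked_rollout is hypothetical, and the sample dictionary is a hand-trimmed copy of the status shown in the case study in this article. In practice you would load the status from the ClusterResourcePlacement object exported as JSON or YAML.

```python
# Hand-trimmed ClusterResourcePlacement status, shaped like the case-study output.
crp_status = {
    "placementStatuses": [
        {
            "clusterName": "kind-cluster-2",
            "conditions": [
                {"type": "Scheduled", "status": "True", "reason": "Scheduled"},
                {"type": "RolloutStarted", "status": "False", "reason": "RolloutNotStartedYet"},
            ],
        },
        {
            "clusterName": "kind-cluster-1",
            "conditions": [
                {"type": "Scheduled", "status": "True", "reason": "Scheduled"},
                {"type": "RolloutStarted", "status": "False", "reason": "RolloutNotStartedYet"},
            ],
        },
    ]
}


def clusters_with_blocked_rollout(status):
    """Return (clusterName, reason) for every cluster whose RolloutStarted condition is False.

    Hypothetical helper for illustration; it is not part of any Fleet tooling.
    """
    blocked = []
    for placement in status.get("placementStatuses", []):
        for condition in placement.get("conditions", []):
            if condition.get("type") == "RolloutStarted" and condition.get("status") == "False":
                blocked.append((placement.get("clusterName"), condition.get("reason")))
    return blocked


blocked_clusters = clusters_with_blocked_rollout(crp_status)
for cluster_name, reason in blocked_clusters:
    print(f"{cluster_name}: rollout not started ({reason})")
```

Each cluster name that this returns is a candidate for the second step, locating its ClusterResourceBinding.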

Case study

In the following example, the ClusterResourcePlacement object tries to propagate a namespace to three member clusters. However, when the ClusterResourcePlacement was first created, the namespace didn't exist on the hub cluster, and the fleet currently has only two member clusters, named kind-cluster-1 and kind-cluster-2.

ClusterResourcePlacement spec

spec:
  policy:
    numberOfClusters: 3
    placementType: PickN
  resourceSelectors:
  - group: ""
    kind: Namespace
    name: test-ns
    version: v1
  revisionHistoryLimit: 10
  strategy:
    type: RollingUpdate

ClusterResourcePlacement status

status:
  conditions:
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: could not find all the clusters needed as specified by the scheduling
      policy
    observedGeneration: 1
    reason: SchedulingPolicyUnfulfilled
    status: "False"
    type: ClusterResourcePlacementScheduled
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: All 2 cluster(s) start rolling out the latest resource
    observedGeneration: 1
    reason: RolloutStarted
    status: "True"
    type: ClusterResourcePlacementRolloutStarted
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: No override rules are configured for the selected resources
    observedGeneration: 1
    reason: NoOverrideSpecified
    status: "True"
    type: ClusterResourcePlacementOverridden
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: Works(s) are successfully created or updated in the 2 target clusters'
      namespaces
    observedGeneration: 1
    reason: WorkSynchronized
    status: "True"
    type: ClusterResourcePlacementWorkSynchronized
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: The selected resources are successfully applied to 2 clusters
    observedGeneration: 1
    reason: ApplySucceeded
    status: "True"
    type: ClusterResourcePlacementApplied
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: The selected resources in 2 cluster are available now
    observedGeneration: 1
    reason: ResourceAvailable
    status: "True"
    type: ClusterResourcePlacementAvailable
  observedResourceIndex: "0"
  placementStatuses:
  - clusterName: kind-cluster-2
    conditions:
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: 'Successfully scheduled resources for placement in kind-cluster-2 (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
      observedGeneration: 1
      reason: Scheduled
      status: "True"
      type: Scheduled
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: Detected the new changes on the resources and started the rollout process
      observedGeneration: 1
      reason: RolloutStarted
      status: "True"
      type: RolloutStarted
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: No override rules are configured for the selected resources
      observedGeneration: 1
      reason: NoOverrideSpecified
      status: "True"
      type: Overridden
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All of the works are synchronized to the latest
      observedGeneration: 1
      reason: AllWorkSynced
      status: "True"
      type: WorkSynchronized
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All corresponding work objects are applied
      observedGeneration: 1
      reason: AllWorkHaveBeenApplied
      status: "True"
      type: Applied
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All corresponding work objects are available
      observedGeneration: 1
      reason: AllWorkAreAvailable
      status: "True"
      type: Available
  - clusterName: kind-cluster-1
    conditions:
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: 'Successfully scheduled resources for placement in kind-cluster-1 (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
      observedGeneration: 1
      reason: Scheduled
      status: "True"
      type: Scheduled
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: Detected the new changes on the resources and started the rollout process
      observedGeneration: 1
      reason: RolloutStarted
      status: "True"
      type: RolloutStarted
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: No override rules are configured for the selected resources
      observedGeneration: 1
      reason: NoOverrideSpecified
      status: "True"
      type: Overridden
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All of the works are synchronized to the latest
      observedGeneration: 1
      reason: AllWorkSynced
      status: "True"
      type: WorkSynchronized
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All corresponding work objects are applied
      observedGeneration: 1
      reason: AllWorkHaveBeenApplied
      status: "True"
      type: Applied
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: All corresponding work objects are available
      observedGeneration: 1
      reason: AllWorkAreAvailable
      status: "True"
      type: Available

The preceding output was captured while the test-ns namespace didn't yet exist on the hub cluster. It shows the following ClusterResourcePlacement condition statuses:

The status of the ClusterResourcePlacementScheduled condition shows as False, because the specified policy aims to pick three clusters, but the scheduler can place resources only on the two clusters that are currently available and joined.
The status of the ClusterResourcePlacementRolloutStarted condition shows as True, because the rollout process started with the two selected clusters.
The status of the ClusterResourcePlacementOverridden condition shows as True, because no override rules are configured for the selected resources.
The status of the ClusterResourcePlacementWorkSynchronized condition shows as True.
The status of the ClusterResourcePlacementApplied condition shows as True.
The status of the ClusterResourcePlacementAvailable condition shows as True.

To propagate the namespace to the relevant clusters, create the test-ns namespace on the hub cluster.

ClusterResourcePlacement status after the "test-ns" namespace is created on the hub cluster

status:
  conditions:
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: could not find all the clusters needed as specified by the scheduling
      policy
    observedGeneration: 1
    reason: SchedulingPolicyUnfulfilled
    status: "False"
    type: ClusterResourcePlacementScheduled
  - lastTransitionTime: "2024-05-07T23:13:51Z"
    message: The rollout is being blocked by the rollout strategy in 2 cluster(s)
    observedGeneration: 1
    reason: RolloutNotStartedYet
    status: "False"
    type: ClusterResourcePlacementRolloutStarted
  observedResourceIndex: "1"
  placementStatuses:
  - clusterName: kind-cluster-2
    conditions:
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: 'Successfully scheduled resources for placement in kind-cluster-2 (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
      observedGeneration: 1
      reason: Scheduled
      status: "True"
      type: Scheduled
    - lastTransitionTime: "2024-05-07T23:13:51Z"
      message: The rollout is being blocked by the rollout strategy
      observedGeneration: 1
      reason: RolloutNotStartedYet
      status: "False"
      type: RolloutStarted
  - clusterName: kind-cluster-1
    conditions:
    - lastTransitionTime: "2024-05-07T23:08:53Z"
      message: 'Successfully scheduled resources for placement in kind-cluster-1 (affinity
        score: 0, topology spread score: 0): picked by scheduling policy'
      observedGeneration: 1
      reason: Scheduled
      status: "True"
      type: Scheduled
    - lastTransitionTime: "2024-05-07T23:13:51Z"
      message: The rollout is being blocked by the rollout strategy
      observedGeneration: 1
      reason: RolloutNotStartedYet
      status: "False"
      type: RolloutStarted
  selectedResources:
  - kind: Namespace
    name: test-ns
    version: v1

In the preceding output, the status of the ClusterResourcePlacementScheduled condition shows as False. The status of the ClusterResourcePlacementRolloutStarted condition also shows as False, with the message: The rollout is being blocked by the rollout strategy in 2 cluster(s).

Check the latest ClusterResourceSnapshot by running the command in "How can I find the latest ClusterResourceBinding resource?"

Latest ClusterResourceSnapshot

apiVersion: placement.kubernetes-fleet.io/v1
kind: ClusterResourceSnapshot
metadata:
  annotations:
    kubernetes-fleet.io/number-of-enveloped-object: "0"
    kubernetes-fleet.io/number-of-resource-snapshots: "1"
    kubernetes-fleet.io/resource-hash: 72344be6e268bc7af29d75b7f0aad588d341c228801aab50d6f9f5fc33dd9c7c
  creationTimestamp: "2024-05-07T23:13:51Z"
  generation: 1
  labels:
    kubernetes-fleet.io/is-latest-snapshot: "true"
    kubernetes-fleet.io/parent-CRP: crp-3
    kubernetes-fleet.io/resource-index: "1"
  name: crp-3-1-snapshot
  ownerReferences:
  - apiVersion: placement.kubernetes-fleet.io/v1beta1
    blockOwnerDeletion: true
    controller: true
    kind: ClusterResourcePlacement
    name: crp-3
    uid: b4f31b9a-971a-480d-93ac-93f093ee661f
  resourceVersion: "14434"
  uid: 85ee0e81-92c9-4362-932b-b0bf57d78e3f
spec:
  selectedResources:
  - apiVersion: v1
    kind: Namespace
    metadata:
      labels:
        kubernetes.io/metadata.name: test-ns
      name: test-ns
    spec:
      finalizers:
      - kubernetes

In the ClusterResourceSnapshot spec, the selectedResources section now shows the test-ns namespace.

Check whether the ClusterResourceBinding for kind-cluster-1 was updated after the test-ns namespace was created. For more information, see "How can I find the latest ClusterResourceBinding resource?"

ClusterResourceBinding for kind-cluster-1

apiVersion: placement.kubernetes-fleet.io/v1
kind: ClusterResourceBinding
metadata:
  creationTimestamp: "2024-05-07T23:08:53Z"
  finalizers:
  - kubernetes-fleet.io/work-cleanup
  generation: 2
  labels:
    kubernetes-fleet.io/parent-CRP: crp-3
  name: crp-3-kind-cluster-1-7114c253
  resourceVersion: "14438"
  uid: 0db4e480-8599-4b40-a1cc-f33bcb24b1a7
spec:
  applyStrategy:
    type: ClientSideApply
  clusterDecision:
    clusterName: kind-cluster-1
    clusterScore:
      affinityScore: 0
      priorityScore: 0
    reason: picked by scheduling policy
    selected: true
  resourceSnapshotName: crp-3-0-snapshot
  schedulingPolicySnapshotName: crp-3-0
  state: Bound
  targetCluster: kind-cluster-1
status:
  conditions:
  - lastTransitionTime: "2024-05-07T23:13:51Z"
    message: The resources cannot be updated to the latest because of the rollout
      strategy
    observedGeneration: 2
    reason: RolloutNotStartedYet
    status: "False"
    type: RolloutStarted
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: No override rules are configured for the selected resources
    observedGeneration: 2
    reason: NoOverrideSpecified
    status: "True"
    type: Overridden
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: All of the works are synchronized to the latest
    observedGeneration: 2
    reason: AllWorkSynced
    status: "True"
    type: WorkSynchronized
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: All corresponding work objects are applied
    observedGeneration: 2
    reason: AllWorkHaveBeenApplied
    status: "True"
    type: Applied
  - lastTransitionTime: "2024-05-07T23:08:53Z"
    message: All corresponding work objects are available
    observedGeneration: 2
    reason: AllWorkAreAvailable
    status: "True"
    type: Available

The ClusterResourceBinding remains unchanged. In the ClusterResourceBinding spec, resourceSnapshotName still references the old ClusterResourceSnapshot name (crp-3-0-snapshot). This issue occurs when the user provides no explicit RollingUpdate input, because the default values are applied:

The maxUnavailable value defaults to 25% × 3 (the desired number of clusters), rounded to 1.
The maxSurge value defaults to 25% × 3 (the desired number of clusters), rounded to 1.
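
The arithmetic behind these defaults can be checked with a short Python sketch. The function name scaled_value is hypothetical, and the sketch assumes that the percentage is rounded up to the next whole number, which is consistent with the result of 1 stated above (25% × 3 = 0.75).

```python
import math


def scaled_value(percent: int, target: int) -> int:
    """Convert a percentage of the target cluster count into an absolute number.

    Assumption: the fractional result is rounded up. Hypothetical helper,
    not the Fleet controller's own code.
    """
    return math.ceil(percent * target / 100)


target_clusters = 3   # numberOfClusters in the case-study spec
default_percent = 25  # default for both maxUnavailable and maxSurge, per the text above

max_unavailable = scaled_value(default_percent, target_clusters)
max_surge = scaled_value(default_percent, target_clusters)
print(f"maxUnavailable={max_unavailable}, maxSurge={max_surge}")
```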

Why ClusterResourceBinding isn't updated

Initially, when the ClusterResourcePlacement was created, two ClusterResourceBindings were generated. Because the rollout strategy doesn't apply to the initial placement, the ClusterResourcePlacementRolloutStarted condition was set to True.

After the test-ns namespace was created on the hub cluster, the rollout controller tried to update the two existing ClusterResourceBindings. However, maxUnavailable was set to 1, and the fleet is one member cluster short of the three that the policy requests. The missing cluster already uses up the single unavailable slot, so neither existing binding can be updated. In other words, the RollingUpdate configuration is too strict.

Note

During the update, if one of the bindings fails to apply, that also violates the RollingUpdate configuration, in which maxUnavailable is set to 1.
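
The reasoning in this section can be sketched as a simplified model in Python. It is not the rollout controller's actual algorithm. It only assumes that the number of available placements must never drop below the target count minus maxUnavailable, and the function name updatable_now is hypothetical.

```python
def updatable_now(target: int, available: int, max_unavailable: int) -> int:
    """Return how many bindings could be updated at once under a simplified model.

    Assumption: at least (target - max_unavailable) placements must stay
    available, so only the surplus above that floor can be taken out of
    service for an update.
    """
    min_available = target - max_unavailable
    return max(0, available - min_available)


# Case study: three clusters requested, two joined, default maxUnavailable of 1.
print(updatable_now(target=3, available=2, max_unavailable=1))

# The same fleet with maxUnavailable raised to 2.
print(updatable_now(target=3, available=2, max_unavailable=2))

# A third member cluster joined, default maxUnavailable of 1.
print(updatable_now(target=3, available=3, max_unavailable=1))
```

Under this model, the first call leaves no room to update either binding, while the other two calls each leave room for one update at a time.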

Resolution

To resolve this issue, consider manually setting maxUnavailable to a value greater than 1 to relax the RollingUpdate configuration. Alternatively, you can join a third member cluster.
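
As a sketch of the first option, the following Python snippet builds a JSON merge patch body that sets maxUnavailable under spec.strategy.rollingUpdate. The value 2 is only an illustrative choice of "greater than 1", and how you apply the patch (for example, with kubectl patch --type merge against the crp-3 placement from the case study) depends on your environment.

```python
import json

# Illustrative merge patch: relax the rolling update by raising maxUnavailable.
# The value 2 is an assumption for this example; choose one that fits your fleet.
patch = {
    "spec": {
        "strategy": {
            "type": "RollingUpdate",
            "rollingUpdate": {
                "maxUnavailable": 2,
            },
        }
    }
}

patch_json = json.dumps(patch)
print(patch_json)
```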

Contact us for help

If you have questions or need help, create a support request, or ask Azure community support. You can also submit product feedback to Azure feedback community.
