TL;DR

Find the owning GitOps resource first (an ArgoCD Application or a Flux Kustomization/HelmRelease). Check its diff and status, then inspect the underlying Kubernetes resource rather than relying on the controller UI alone.

First Steps

Start here when a GitOps app is stuck, degraded, or out of sync. These commands identify the owning controller object, the rendered source, and the live Kubernetes failure before you decide whether to fix Git or the cluster.

bash first-steps.sh
# ArgoCD
kubectl get applications -A
kubectl describe application <app> -n argocd
argocd app diff <app>

# Flux
flux get kustomizations -A
flux get helmreleases -A
kubectl describe kustomization <name> -n flux-system
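
Once the owning object is known, the live Kubernetes failure usually shows up as Warning events in the workload's namespace. The following is a minimal sketch of that step, assuming you pass in the workload namespace (`app` in the usage line is only a placeholder); `recent_warnings` is a hypothetical helper name, not part of any tool.

```bash
# Hypothetical helper: print the most recent Warning events in a namespace.
# Usage: recent_warnings <namespace> [count]
#   e.g. recent_warnings app 20
recent_warnings() {
  local ns="$1" count="${2:-20}"
  # Sort oldest-to-newest so the tail holds the latest events.
  kubectl get events -n "$ns" \
    --field-selector type=Warning \
    --sort-by=.lastTimestamp | tail -n "$count"
}
```

If the list is empty, the failure is more likely on the controller side (render, credentials, RBAC) than in the workload itself.
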

ArgoCD Symptom → Fix

| Symptom | Likely cause | Fix |
|---|---|---|
| OutOfSync loop | HPA/webhook drift, ignoreDifferences missing | Add ignoreDifferences or fix Git |
| ComparisonError | Helm/Kustomize render failure, bad repo creds | Check Application events; test render locally |
| Degraded | Pod/Deployment unhealthy | kubectl describe failing resource |
| Sync stuck / Progressing | Hook Job, sync wave, finalizer, admission | Check hook Jobs, controller logs |
| Missing resource | Manual delete or prune | Sync with prune or restore in Git |
| selfHeal reverts patch | Expected: live drift corrected | Fix Git, not cluster |
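
For the ComparisonError case, "test render locally" can be sketched as follows. This assumes a local checkout of the path the Application points at and that `helm` and `kubectl` are installed; `render_locally` and the release name `render-check` are hypothetical. ArgoCD may render with different tool versions, values files, or parameters than a local run, so a clean result here narrows the problem but does not rule out a render failure.

```bash
# Hypothetical helper: try to render a chart or kustomize directory locally.
# Usage: render_locally <path-to-source-dir>
render_locally() {
  local path="$1"
  if [ -f "$path/Chart.yaml" ]; then
    # Helm chart: render templates without touching the cluster.
    helm template render-check "$path" >/dev/null && echo "helm render OK: $path"
  elif [ -f "$path/kustomization.yaml" ]; then
    # Kustomize directory: build with kubectl's embedded kustomize.
    kubectl kustomize "$path" >/dev/null && echo "kustomize render OK: $path"
  else
    echo "no Chart.yaml or kustomization.yaml in $path" >&2
    return 1
  fi
}
```

Render errors are left on stderr so the failing template or patch is visible directly.
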

Flux Symptom → Fix

| Symptom | Likely cause | Fix |
|---|---|---|
| Ready=False on Kustomization | Build error, RBAC, invalid manifest | kubectl describe kustomization; check conditions message |
| GitRepository fetch failed | Bad creds, branch deleted, rate limit | Check source status; verify deploy key/token |
| HelmRelease install failed | Chart error, values mismatch | flux logs; compare with helm template |
| Changes slow to appear | Reconcile interval | flux reconcile kustomization <name> --with-source |
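
When a Kustomization reports Ready=False, the condition message usually names the failing step. The sketch below pulls just that condition instead of the full describe output. It assumes the Kustomization lives in `flux-system` unless a namespace is given; `flux_ready_message` is a hypothetical helper name.

```bash
# Hypothetical helper: print status, reason, and message of the Ready condition.
# Usage: flux_ready_message <name> [namespace]
flux_ready_message() {
  local name="$1" ns="${2:-flux-system}"
  kubectl get kustomization "$name" -n "$ns" \
    -o jsonpath='{range .status.conditions[?(@.type=="Ready")]}{.status}{"\t"}{.reason}{"\t"}{.message}{"\n"}{end}'
}
```

After fixing the cause in Git, `flux reconcile kustomization <name> --with-source` avoids waiting for the next interval.
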

Ownership Conflicts

Use this when manual changes keep reverting or two tools appear to fight over the same object. The goal is to find the source of truth and stop making live edits that the controller will overwrite.

bash ownership.sh
# Who manages this Deployment?
kubectl get deploy web-api -n app -o yaml | grep -E 'argocd|flux|helm|managed-by'

# Do NOT run helm upgrade on ArgoCD/Flux-managed releases.
# Do NOT kubectl apply conflicting manifests on GitOps-managed resources.
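
The grep above can be turned into a rough classifier. The sketch below is an assumption-laden heuristic, not an authoritative ownership check: it looks for Flux's `toolkit.fluxcd.io/` labels, ArgoCD's `argocd.argoproj.io/` annotations, and the `app.kubernetes.io/managed-by: Helm` label. ArgoCD installs that track resources only by the `app.kubernetes.io/instance` label are not detected, and a "helm" result can also come from a chart rendered by ArgoCD. `classify_owner` is a hypothetical helper name.

```bash
# Hypothetical helper: guess the owning tool from object YAML on stdin.
# Usage: kubectl get deploy web-api -n app -o yaml | classify_owner
classify_owner() {
  local yaml
  yaml="$(cat)"
  # Check Flux first: HelmRelease-managed objects also carry the Helm label.
  if printf '%s\n' "$yaml" | grep -q 'toolkit\.fluxcd\.io/'; then
    echo "flux"
  elif printf '%s\n' "$yaml" | grep -q 'argocd\.argoproj\.io/'; then
    echo "argocd"
  elif printf '%s\n' "$yaml" | grep -q 'app\.kubernetes\.io/managed-by: Helm'; then
    echo "helm"
  else
    echo "unknown"
  fi
}
```

Treat the answer as a pointer to where to look next; confirm it against the Application or Kustomization that actually lists the resource.
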

Sync Stuck (ArgoCD)

This checklist narrows a stuck ArgoCD sync to hooks, sync waves, finalizers, health checks, or admission failures. Use it before force-deleting resources or disabling automated sync.

bash sync-stuck.sh
kubectl describe application web-api -n argocd
kubectl get jobs -n app | grep -E 'hook|migrate|pre-'
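# The application controller runs as a StatefulSet in default installs; use deploy/... if yours differs.
kubectl logs -n argocd statefulset/argocd-application-controller --tail=200 | grep web-api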

# Emergency: terminate stuck sync (use with approval).
argocd app terminate-op web-api
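
Before terminating an operation, it helps to see what the sync is waiting on. The sketch below reads the operation phase and message plus any finalizers straight from the Application object. It assumes Applications live in the `argocd` namespace unless another is given; `sync_state` is a hypothetical helper name.

```bash
# Hypothetical helper: show operation phase, message, and finalizers.
# Usage: sync_state <app> [namespace]
sync_state() {
  local app="$1" ns="${2:-argocd}"
  # Fields that are absent on the object print as empty values.
  kubectl get application "$app" -n "$ns" \
    -o jsonpath='phase: {.status.operationState.phase}{"\n"}message: {.status.operationState.message}{"\n"}finalizers: {.metadata.finalizers}{"\n"}'
}
```

A message that names a hook Job points at the Job's own logs; a non-empty finalizer list on an Application stuck deleting points at finalizer cleanup instead.
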

Emergency Controls

| Action | ArgoCD | Flux |
|---|---|---|
| Pause auto-sync | Remove automated syncPolicy or disable selfHeal | flux suspend kustomization <name> |
| Force reconcile | argocd app sync <app> | flux reconcile kustomization <name> --with-source |
| Rollback | argocd app rollback or Git revert | Git revert (Flux follows Git history) |
⚠️ Incident kubectl patches made without first pausing sync or disabling selfHeal will be reverted at the next reconciliation, usually within minutes. Either pause sync first or commit the fix to Git immediately after patching.
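
The pause-then-patch sequence can be wrapped in a small helper so that pausing is not forgotten mid-incident. This is a sketch under assumptions: `gitops_sync` is a hypothetical name, the resume branch for ArgoCD assumes the app normally runs automated sync with selfHeal (adjust it to match the policy declared in Git), and if the Application itself is managed from Git by a parent app, the parent may revert the paused policy.

```bash
# Hypothetical helper: pause or resume reconciliation around an incident patch.
# Usage: gitops_sync <argocd|flux> <pause|resume> <name>
gitops_sync() {
  local tool="$1" action="$2" name="$3"
  case "$tool:$action" in
    argocd:pause)  argocd app set "$name" --sync-policy none ;;
    argocd:resume) argocd app set "$name" --sync-policy automated --self-heal ;;
    flux:pause)    flux suspend kustomization "$name" ;;
    flux:resume)   flux resume kustomization "$name" ;;
    *) echo "usage: gitops_sync <argocd|flux> <pause|resume> <name>" >&2; return 2 ;;
  esac
}
```

A typical order is pause, patch, commit the same fix to Git, then resume once Git and the cluster agree.
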

Gotchas

  • ComparisonError != Degraded: ComparisonError means the desired state from Git could not be rendered or compared; Degraded means live cluster resources are unhealthy.
  • CRD ordering: a sync may fail if CRDs aren't installed before the custom resources that use them; use sync waves or a separate CRD app.
  • Repo credentials expire: deploy keys and tokens are a common silent failure after rotation.
  • Helm + GitOps hooks: pre-upgrade Jobs may behave differently than they do under a direct helm upgrade.