GitOps Troubleshooting
Find the owning GitOps resource first (an ArgoCD Application or a Flux Kustomization/HelmRelease). Check its diff and status, then inspect the underlying Kubernetes resource rather than relying on the controller UI alone.
First Steps
Start here when a GitOps app is stuck, degraded, or out of sync. These commands identify the owning controller object, the rendered source, and the live Kubernetes failure before you decide whether to fix Git or the cluster.
# ArgoCD
kubectl get applications -A
kubectl describe application <app> -n argocd
argocd app diff <app>
# Flux
flux get kustomizations -A
flux get helmreleases -A
kubectl describe kustomization <name> -n flux-system
ArgoCD Symptom → Fix
| Symptom | Likely cause | Fix |
|---|---|---|
| OutOfSync loop | HPA/webhook drift, ignoreDifferences missing | Add ignoreDifferences or fix Git |
| ComparisonError | Helm/Kustomize render failure, bad repo creds | Check Application events; test render locally |
| Degraded | Pod/Deployment unhealthy | kubectl describe failing resource |
| Sync stuck / Progressing | Hook Job, sync wave, finalizer, admission | Check hook Jobs, controller logs |
| Missing resource | Manual delete, or pruned after removal from Git | Sync to recreate it if it is still in Git; otherwise restore the manifest in Git |
| selfHeal reverts patch | Expected — live drift corrected | Fix Git, not cluster |
Flux Symptom → Fix
| Symptom | Likely cause | Fix |
|---|---|---|
| Ready=False on Kustomization | Build error, RBAC, invalid manifest | kubectl describe kustomization; check conditions message |
| GitRepository fetch failed | Bad creds, branch deleted, rate limit | Check source status; verify deploy key/token |
| HelmRelease install failed | Chart error, values mismatch | flux logs; compare with helm template |
| Changes slow to appear | Reconcile interval | flux reconcile kustomization <name> --with-source |
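To read the conditions message mentioned in the first row without scanning full describe output, a small wrapper can query the Ready condition directly. This is a minimal sketch: `ready_message` is a hypothetical helper name, and the `flux-system` default namespace is an assumption carried over from the commands above.

```shell
# ready_message KIND NAME [NAMESPACE]
# Hypothetical helper: prints the message of the Ready condition for a
# Flux object (for example a kustomization, helmrelease, or gitrepository).
# NAMESPACE defaults to flux-system, which is an assumption; pass your own.
ready_message() {
  kubectl get "$1" "$2" -n "${3:-flux-system}" \
    -o jsonpath='{.status.conditions[?(@.type=="Ready")].message}'
}
```

For example, `ready_message kustomization apps` or `ready_message helmrelease web-api app`, where `apps` and `web-api` are placeholder names.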
Ownership Conflicts
Use this when manual changes keep reverting or two tools appear to fight over the same object. The goal is to find the source of truth and stop making live edits that the controller will overwrite.
# Who manages this Deployment?
kubectl get deploy web-api -n app -o yaml | grep -E 'argocd|flux|helm|managed-by'
# Do NOT run helm upgrade on ArgoCD/Flux-managed releases.
# Do NOT kubectl apply conflicting manifests on GitOps-managed resources.
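The grep above can be turned into a rough classifier. The sketch below assumes the ownership markers these tools commonly set: Flux's `kustomize.toolkit.fluxcd.io/name` and `helm.toolkit.fluxcd.io/name` labels, ArgoCD's `argocd.argoproj.io/tracking-id` annotation (present only when annotation-based tracking is in use), and Helm's `meta.helm.sh/release-name` annotation. `classify_owner` is a hypothetical helper name. With label-based tracking, ArgoCD uses `app.kubernetes.io/instance`, which Helm charts also set, so the sketch prints `unknown` in that case rather than guessing.

```shell
# classify_owner: hypothetical helper. Reads a manifest (YAML or JSON) on
# stdin and prints the first ownership marker it recognizes.
# Flux markers are checked before the plain Helm one because resources
# from a Flux HelmRelease carry Helm's annotations as well.
classify_owner() {
  manifest=$(cat)
  case "$manifest" in
    *kustomize.toolkit.fluxcd.io/name*) echo "flux-kustomization" ;;
    *helm.toolkit.fluxcd.io/name*)      echo "flux-helmrelease" ;;
    *argocd.argoproj.io/tracking-id*)   echo "argocd" ;;
    *meta.helm.sh/release-name*)        echo "helm" ;;
    *)                                  echo "unknown" ;;
  esac
}
```

Usage: `kubectl get deploy web-api -n app -o yaml | classify_owner`. Treat the result as a hint for where to look next, not as proof of ownership.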
Sync Stuck (ArgoCD)
This checklist narrows a stuck ArgoCD sync to hooks, sync waves, finalizers, health checks, or admission failures. Use it before force-deleting resources or disabling automated sync.
kubectl describe application web-api -n argocd
kubectl get jobs -n app | grep -E 'hook|migrate|pre-'
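# The application controller runs as a StatefulSet in standard installs; adjust the name for Helm-based installs.
kubectl logs -n argocd statefulset/argocd-application-controller --tail=200 | grep web-api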
# Emergency: terminate stuck sync (use with approval).
argocd app terminate-op web-api
Emergency Controls
| Action | ArgoCD | Flux |
|---|---|---|
| Pause auto-sync | Remove syncPolicy.automated (disabling selfHeal alone only stops drift reverts) | flux suspend kustomization <name> |
| Force reconcile | argocd app sync <app> | flux reconcile kustomization <name> --with-source |
| Rollback | argocd app rollback <app> (auto-sync must be disabled first) or Git revert | Git revert (Flux follows Git history) |
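The pause and resume actions in the table can be wrapped so they are previewed before they run. This is a sketch under stated assumptions: the function names and the `DRY_RUN` variable are hypothetical, the `flux-system` namespace default is an assumption, and `--sync-policy none` is used here as the CLI way to switch an ArgoCD app to manual sync.

```shell
# run: print the command when DRY_RUN=1, otherwise execute it.
# DRY_RUN is a hypothetical convention for this sketch, not a flux/argocd flag.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Switch an ArgoCD app to manual sync, and back to automated sync.
pause_argocd()  { run argocd app set "$1" --sync-policy none; }
resume_argocd() { run argocd app set "$1" --sync-policy automated; }

# Suspend and resume a Flux Kustomization; the namespace defaults to flux-system.
pause_flux()  { run flux suspend kustomization "$1" -n "${2:-flux-system}"; }
resume_flux() { run flux resume kustomization "$1" -n "${2:-flux-system}"; }
```

For example, `DRY_RUN=1 pause_flux apps` prints the command instead of running it. If the ArgoCD Application is itself managed from Git by a parent app with selfHeal enabled, a CLI change like this is likely to be reverted, so pause it in Git instead.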
Gotchas
- ComparisonError != Degraded: ComparisonError means ArgoCD could not fetch or render the manifests from Git, so nothing was compared; Degraded means the manifests were applied but live resources are unhealthy.
- CRD ordering: a sync may fail if custom resources are applied before their CRDs exist; put CRDs in an earlier sync wave or in a separate CRD app.
- Repo credentials expire — deploy keys and tokens are a common silent failure after rotation.
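- Helm + GitOps hooks: pre-upgrade Jobs may behave differently than under a direct helm upgrade. ArgoCD renders the chart and treats every operation as a sync, so it cannot tell an install from an upgrade, and pre-install and pre-upgrade hooks both run as pre-sync hooks.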