Cordon, Drain, Taint & Eviction
TL;DR
Cordon stops new Pods from scheduling on a node. Drain safely evicts existing Pods. Taints repel Pods unless they tolerate the taint. Eviction respects controllers, PodDisruptionBudgets, grace periods, and workload safety.
Quick Meaning
| Action | Effect | Use when |
|---|---|---|
| Cordon | Marks node unschedulable. | Prepare for maintenance. |
| Drain | Evicts workload Pods from node. | Patch/reboot/replace node. |
| Taint | Repels Pods without tolerations. | Reserve nodes or isolate problems. |
| Eviction | API-initiated graceful Pod removal that honors PDBs. | Drain and other voluntary disruptions (kubelet node-pressure eviction is a separate path that ignores PDBs). |
Safe Drain Workflow
The drain workflow is linear: cordon the node, evict its workloads, perform the change, uncordon to reopen scheduling, then validate.
Eviction Path And Grace
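Drain does not delete Pods directly. For each workload Pod it creates a policy/v1 Eviction through the Pod’s eviction subresource; once the API server accepts it, the Pod is deleted gracefully and the kubelet stops its containers within the Pod’s terminationGracePeriodSeconds (or the drain’s --grace-period, if set). PDBs limit how many matching Pods may be disrupted at once. When a budget has no disruptions left, the eviction is rejected and drain reports “Cannot evict pod as it would violate the pod's disruption budget.”, then retries until the budget allows it or --timeout expires.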
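To make the eviction path concrete, here is a minimal sketch of the request drain sends for each Pod: a policy/v1 Eviction POSTed to the Pod's eviction subresource. The helper names (eviction_body, evict_pod) and the Pod name web-api-0 are hypothetical, and kubectl create --raw is used only as a convenient way to send the request with kubeconfig credentials. If a PDB would be violated, the API server rejects the request (HTTP 429) instead of deleting the Pod.

```bash
# eviction_body NAMESPACE POD
# Print a policy/v1 Eviction object for the given Pod as JSON.
eviction_body() {
  printf '{"apiVersion":"policy/v1","kind":"Eviction","metadata":{"name":"%s","namespace":"%s"}}\n' "$2" "$1"
}

# evict_pod NAMESPACE POD
# POST the Eviction to the Pod's eviction subresource (needs cluster access).
evict_pod() {
  eviction_body "$1" "$2" |
    kubectl create --raw "/api/v1/namespaces/$1/pods/$2/eviction" -f -
}

# Example (hypothetical Pod name, not run here):
#   evict_pod app web-api-0
```

Unlike kubectl delete pod, this request goes through the PDB check, which is why drain uses it.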
eviction-watch.sh
# Watch PDB current healthy vs desired while draining.
kubectl get pdb -A
kubectl describe pdb web-api -n app
# See recent evictions/disruptions affecting a namespace.
kubectl get events -n app --sort-by=.lastTimestamp | tail -60
safe-drain.sh
NODE=worker-1
kubectl get node "$NODE" -o wide
kubectl describe node "$NODE" # Check taints, conditions, allocatable resources, and events.
kubectl get pods -A --field-selector spec.nodeName="$NODE" -o wide # See what will move.
kubectl get pdb -A # PodDisruptionBudgets may block drain.
kubectl cordon "$NODE" # Prevent new Pods from landing on the node.
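# --grace-period overrides each Pod's own terminationGracePeriodSeconds; omit it to use the Pod's value.
kubectl drain "$NODE" \
  --ignore-daemonsets \
  --delete-emptydir-data \
  --grace-period=60 \
  --timeout=15m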
# After maintenance:
kubectl uncordon "$NODE"
kubectl get node "$NODE"
kubectl get pods -A --field-selector spec.nodeName="$NODE"
PodDisruptionBudget
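As a rough guide to the arithmetic behind a budget with an integer minAvailable: evictions are allowed only while the number of healthy Pods exceeds the minimum. The sketch below is an illustration, not the controller's code. The function names are made up for this example, and percentage values or maxUnavailable are resolved differently. The live figure is published in the PDB's status.disruptionsAllowed field.

```bash
# allowed_disruptions CURRENT_HEALTHY MIN_AVAILABLE
# How many Pods may be evicted right now under an integer minAvailable.
allowed_disruptions() {
  healthy="$1"
  min_available="$2"
  allowed=$(( healthy - min_available ))
  if [ "$allowed" -lt 0 ]; then
    allowed=0   # Already below the minimum: nothing may be evicted.
  fi
  echo "$allowed"
}

# live_allowed_disruptions NAMESPACE PDB_NAME
# Read the value the cluster has computed (needs cluster access).
live_allowed_disruptions() {
  kubectl get pdb "$2" -n "$1" -o jsonpath='{.status.disruptionsAllowed}'
}

allowed_disruptions 3 2   # prints 1: one Pod can be evicted
allowed_disruptions 2 2   # prints 0: drain blocks until a replacement is healthy
```

With minAvailable: 2 and only two replicas, a drain can never make progress, so such a workload needs a third replica before maintenance.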
pdb.yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-api
  namespace: app
spec:
  minAvailable: 2 # At least two matching Pods must remain available during voluntary disruption.
  selector:
    matchLabels:
      app: web-api # Must match workload Pod labels.
Taints And Tolerations
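A toleration matches a taint when the key and effect line up and the operator is satisfied. The following is a simplified model of that rule, written as a shell function for illustration only. It is not the scheduler's implementation, and the function name and argument order are assumptions of this sketch.

```bash
# tolerates TOL_KEY TOL_OPERATOR TOL_VALUE TOL_EFFECT TAINT_KEY TAINT_VALUE TAINT_EFFECT
# Exit status 0 if the toleration matches the taint, 1 otherwise.
tolerates() {
  tkey="$1"; top="${2:-Equal}"; tval="$3"; teff="$4"   # operator defaults to Equal
  key="$5"; val="$6"; eff="$7"

  # An empty toleration effect matches every taint effect.
  if [ -n "$teff" ] && [ "$teff" != "$eff" ]; then return 1; fi
  # An empty toleration key is only meaningful with Exists and matches every key.
  if [ -z "$tkey" ] && [ "$top" != "Exists" ]; then return 1; fi
  if [ -n "$tkey" ] && [ "$tkey" != "$key" ]; then return 1; fi

  case "$top" in
    Exists) return 0 ;;                # Value is not compared.
    Equal)  [ "$tval" = "$val" ] ;;    # Values must be identical.
    *)      return 1 ;;
  esac
}

tolerates dedicated Equal gpu NoSchedule dedicated gpu NoSchedule && echo "tolerated"
tolerates dedicated Equal cpu NoSchedule dedicated gpu NoSchedule || echo "repelled"
```

A matching toleration only permits scheduling onto the tainted node. It does not attract the Pod there; that takes node affinity or a node selector.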
taints.sh
# Add a NoSchedule taint.
kubectl taint node worker-1 dedicated=gpu:NoSchedule
# Remove that exact taint.
kubectl taint node worker-1 dedicated=gpu:NoSchedule-
# Inspect taints.
kubectl describe node worker-1 | grep -i taints
toleration.yaml
tolerations:
  - key: dedicated
    operator: Equal
    value: gpu
    effect: NoSchedule # Pod can schedule onto nodes with dedicated=gpu:NoSchedule.
When Drain Is Blocked
- PDB violation: scale workload up, wait for healthy replicas, or coordinate outage approval.
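- Unmanaged Pod: drain refuses bare Pods (no controller) unless --force is passed, and a forced Pod is not recreated elsewhere; identify the owner before deleting.
- emptyDir data: --delete-emptydir-data means local temporary data is lost.
- DaemonSets: skipped only when --ignore-daemonsets is passed; without it drain refuses to proceed. The Pods stay because they are expected to run on the node.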
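Before draining, the two most common blockers above can be spotted with a pair of listings. This is a sketch under assumptions: the function names are hypothetical, and the filters rely on kubectl custom-columns printing <none> for a missing field. The filter functions read from stdin so they can be tried without a cluster.

```bash
# list_pod_owners NODE
# Print "namespace name ownerKind" for each Pod on the node (needs cluster access).
list_pod_owners() {
  kubectl get pods -A --field-selector "spec.nodeName=$1" --no-headers \
    -o 'custom-columns=NS:.metadata.namespace,NAME:.metadata.name,OWNER:.metadata.ownerReferences[0].kind'
}

# unmanaged_pods
# Filter list_pod_owners output down to Pods with no owner.
unmanaged_pods() {
  awk '$3 == "<none>" { print $1 "/" $2 }'
}

# list_pdb_budgets
# Print "namespace name disruptionsAllowed" for each PDB (needs cluster access).
list_pdb_budgets() {
  kubectl get pdb -A --no-headers \
    -o 'custom-columns=NS:.metadata.namespace,NAME:.metadata.name,ALLOWED:.status.disruptionsAllowed'
}

# exhausted_pdbs
# Filter list_pdb_budgets output down to budgets with no disruptions left.
exhausted_pdbs() {
  awk '$3 == "0" { print $1 "/" $2 }'
}

# Example (not run here):
#   list_pod_owners worker-1 | unmanaged_pods
#   list_pdb_budgets | exhausted_pdbs
```

Anything printed by the first pipeline would need --force; anything printed by the second will stall the drain until more replicas become healthy.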