Image Troubleshooting
TL;DR
Start with kubectl describe pod events to see the exact pull error. Check registry credentials, image name spelling, and tag existence. Use crane or docker manifest inspect to verify a remote image exists before deploying.
ErrImagePull / ImagePullBackOff
ErrImagePull means the kubelet's attempt to pull the container image failed; ImagePullBackOff means it is waiting with an increasing delay before retrying. In both cases, read the pod event message first, as it usually contains the exact registry error.
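To find which pods are affected before describing each one, the pod list can be filtered by status. This is a minimal sketch, assuming the default column layout of `kubectl get pods -A --no-headers` (NAMESPACE NAME READY STATUS RESTARTS AGE); `filter_pull_failures` is a hypothetical helper name, not part of kubectl.

```shell
# Print "namespace/pod<TAB>status" for pods stuck on an image pull.
# Reads `kubectl get pods -A --no-headers` output on stdin and matches
# the STATUS column (4th field), including init-container forms such as
# "Init:ErrImagePull".
filter_pull_failures() {
  awk '$4 ~ /(ErrImagePull|ImagePullBackOff)$/ { printf "%s/%s\t%s\n", $1, $2, $4 }'
}

# Usage (needs cluster access):
#   kubectl get pods -A --no-headers | filter_pull_failures
```

Each line of output names a pod to pass to kubectl describe pod.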
pull-errors.sh (bash)
# 1. Get the exact error from pod events
kubectl describe pod <pod> -n <ns> | grep -A 10 "Events:"
# 2. Verify the image exists in the registry
crane digest myregistry.io/myapp:v1.2.3 # exits 0 if image exists
docker manifest inspect myregistry.io/myapp:v1.2.3 # shows manifest
# 3. Check imagePullSecrets are configured
kubectl get pod <pod> -n <ns> -o jsonpath='{.spec.imagePullSecrets}'
kubectl get secret <pull-secret> -n <ns> -o jsonpath='{.data.\.dockerconfigjson}' | \
base64 -d | jq .auths
# 4. Test registry authentication manually on the node
# For ECR:
# (account ID below is a placeholder; pass the same region the registry lives in)
crictl pull --creds "AWS:$(aws ecr get-login-password --region us-east-1)" \
  123456789012.dkr.ecr.us-east-1.amazonaws.com/myapp:v1.2.3
# 5. Check node can reach the registry
curl -I https://myregistry.io/v2/
ECR Authentication
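ECR authorization tokens expire after 12 hours, so a pull secret built from a static token starts failing once the token lapses, often hours after a deployment that looked healthy. Prefer credentials that refresh on their own, such as IRSA (IAM Roles for Service Accounts) or the ECR credential helper, over static tokens.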
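Because the login call takes a region, it helps to derive the account and region from the registry hostname rather than typing them twice. The following is a minimal sketch, assuming the standard hostname form `<account>.dkr.ecr.<region>.amazonaws.com`; `ecr_parse_host` is a hypothetical helper name, and other endpoint forms (for example FIPS or China-region hostnames) are not handled.

```shell
# Split an ECR registry hostname into account ID and region.
# Prints "<account> <region>", or returns 1 if the host does not match
# the assumed form <account>.dkr.ecr.<region>.amazonaws.com.
ecr_parse_host() (
  host=$1
  case $host in
    *.dkr.ecr.*.amazonaws.com) ;;
    *) return 1 ;;
  esac
  account=${host%%.*}              # text before the first dot
  rest=${host#*.dkr.ecr.}          # "<region>.amazonaws.com"
  region=${rest%.amazonaws.com}
  printf '%s %s\n' "$account" "$region"
)

# Usage (needs the AWS CLI and credentials):
#   set -- $(ecr_parse_host 123456789012.dkr.ecr.us-east-1.amazonaws.com)
#   aws ecr get-login-password --region "$2"
```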
ecr-auth.sh (bash)
# Create or refresh ECR pull secret (valid 12h)
AWS_ACCOUNT=123456789012   # placeholder; AWS account IDs are 12 digits
REGION=us-east-1
TOKEN=$(aws ecr get-login-password --region "$REGION")
kubectl create secret docker-registry ecr-pull-secret \
  --docker-server="$AWS_ACCOUNT.dkr.ecr.$REGION.amazonaws.com" \
  --docker-username=AWS \
  --docker-password="$TOKEN" \
  -n my-namespace \
  --dry-run=client -o yaml | kubectl apply -f -
# Automate with a CronJob that refreshes the secret every 6 hours
# Or use: https://github.com/aws-samples/aws-ecr-credentials-refresher
# Or use external-secrets operator with ECR generator
# Patch default service account to use pull secret
kubectl patch serviceaccount default -n my-namespace \
  -p '{"imagePullSecrets": [{"name": "ecr-pull-secret"}]}'
Image Inspection
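Inspect image metadata, layers, and config to check which user, entrypoint, and env vars are baked into an image before deploying. crane and skopeo read this from the registry without pulling the full image; docker inspect, docker history, and dive work on a local copy.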
inspect.sh (bash)
# crane: inspect remote images without pulling (install: go install github.com/google/go-containerregistry/cmd/crane@latest)
crane config myregistry.io/myapp:v1.2.3 # full image config (entrypoint, user, env, labels)
crane config myregistry.io/myapp:v1.2.3 | jq '.config | {User, Entrypoint, Env}'
crane ls myregistry.io/myapp # list all tags
crane digest myregistry.io/myapp:v1.2.3 # image digest
# docker inspect (requires local pull)
docker inspect myapp:v1.2.3 | jq '.[0].Config | {User, Entrypoint, Cmd, Env}'
docker history myapp:v1.2.3 # layer history with commands
# dive: interactive layer explorer (install: github.com/wagoodman/dive)
dive myapp:v1.2.3 # shows layers, file changes, wasted space
# skopeo: copy and inspect images across registries without daemon
skopeo inspect docker://myregistry.io/myapp:v1.2.3
skopeo copy docker://source-registry/myapp:v1 docker://dest-registry/myapp:v1
Troubleshooting Map
| Error | Check | Fix |
|---|---|---|
| ImagePullBackOff: 401 Unauthorized | imagePullSecrets, secret contents, token expiry | Recreate pull secret with fresh token; use IRSA for ECR |
| ImagePullBackOff: 404 Not Found | crane digest image:tag | Tag doesn't exist; check image name spelling or push the image |
| ImagePullBackOff: no such host | dig <registry-hostname> from node | DNS failure or private registry not reachable from node |
| ImagePullBackOff: context deadline exceeded | Node network egress, registry rate limits | Use pull-through cache; check node security group egress |
| CrashLoopBackOff after pull success | kubectl logs <pod> --previous | App error at startup; check entrypoint, env vars, config |
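
The pull-error rows of the table can be sketched as a lookup that turns an event message into a first check. This is an illustration only: `classify_pull_error` is a hypothetical helper, and the matched substrings are assumptions about typical container runtime messages that may need tuning for a given runtime or registry.

```shell
# Map a pull error message (copied from pod events) to the first thing
# to check. Categories mirror the table above. DNS and timeout patterns
# are tested first so a stray "401"/"404" elsewhere in the message
# (for example in a port number) does not mask them.
classify_pull_error() {
  case $1 in
    *"no such host"*)
      echo "dns: check registry hostname resolution from the node" ;;
    *"context deadline exceeded"*)
      echo "network: check node egress and registry rate limits" ;;
    *401*|*[Uu]nauthorized*)
      echo "auth: check imagePullSecrets and token expiry" ;;
    *404*|*"not found"*|*"Not Found"*)
      echo "missing: check image name and tag" ;;
    *)
      echo "unknown: read the full event message" ;;
  esac
}

# Usage:
#   classify_pull_error "dial tcp: lookup myregistry.io: no such host"
```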