Kubernetes Logging
Start with kubectl logs for immediate access. For persistent, searchable logs, use Loki + Promtail (cost-effective, uses Prometheus-style labels) or Fluent Bit → Elasticsearch/OpenSearch (full-text query power at higher cost). Always write structured JSON to stdout/stderr.
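
As a minimal sketch of "structured JSON to stdout", the following uses only Python's standard logging module. The field names level and message are chosen to line up with the jq and LogQL filters used later in this document; ts and logger, and the helper names JsonFormatter and make_logger, are illustrative assumptions rather than a required schema.

```python
import json
import logging
import sys
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        entry = {
            "ts": datetime.fromtimestamp(record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname.lower(),  # "error", "info", ...
            "logger": record.name,
            "message": record.getMessage(),
        }
        if record.exc_info:
            # Keep the traceback inside the same JSON line so it is not split
            # into several log entries by the collector.
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def make_logger(name, stream=sys.stdout):
    """Return a logger that writes JSON lines to the given stream (stdout by default)."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    return logger
```

Calling make_logger("my-service").error("payment failed") would then emit a single line that kubectl logs, jq, and a JSON-aware log pipeline can all parse without a custom regex.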
kubectl logs — Immediate Access
Use these patterns daily. Add --prefix when tailing multiple pods so each line is tagged with its pod and container name, and --previous to read the logs of the previous container instance after a crash and restart.
# Single pod
kubectl logs <pod> -n <ns>
kubectl logs <pod> -n <ns> -c <container> # specific container
kubectl logs <pod> -n <ns> --previous # crashed/previous container
kubectl logs <pod> -n <ns> --tail=100
kubectl logs <pod> -n <ns> -f # follow in real time
# All pods matching a label selector
kubectl logs -n <ns> -l app=myapp --all-containers --prefix --tail=200
# Since time (useful for incident scoping)
kubectl logs <pod> -n <ns> --since=1h
kubectl logs <pod> -n <ns> --since-time="2026-05-23T14:00:00Z"
# Filter on the fly (when no log aggregation is available)
kubectl logs -n <ns> -l app=myapp --tail=500 | grep -iE "error|fatal|exception"
kubectl logs -n <ns> <pod> | jq -R 'fromjson? | select(.level? == "error")' # structured JSON logs; non-JSON lines are skipped
Loki + Promtail (Grafana Stack)
Loki stores logs as compressed chunks in object storage and indexes only the labels, not the full text, which typically makes it much cheaper to run than Elasticsearch at scale. Query it with LogQL, whose syntax is modeled on PromQL.
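
LogQL queries can also be sent to Loki's HTTP API, which is handy for scripts and incident tooling. The sketch below builds a request for the /loki/api/v1/query_range endpoint and flattens the streams response into (timestamp, line) pairs. The helper names build_query_range_url, extract_lines, and fetch_lines are illustrative, and the sketch assumes an unauthenticated Loki endpoint such as http://loki.monitoring.svc:3100; multi-tenant setups typically need extra headers.

```python
import json
import urllib.parse
import urllib.request


def build_query_range_url(base, logql, start, end, limit=100):
    """Build a query_range URL; start and end are timezone-aware datetimes."""
    params = {
        "query": logql,
        # Loki accepts Unix timestamps in nanoseconds (truncated here to whole seconds).
        "start": str(int(start.timestamp()) * 1_000_000_000),
        "end": str(int(end.timestamp()) * 1_000_000_000),
        "limit": str(limit),
    }
    return f"{base}/loki/api/v1/query_range?{urllib.parse.urlencode(params)}"


def extract_lines(response):
    """Flatten a 'streams' response into (timestamp_ns, line) tuples, oldest first."""
    lines = []
    for stream in response["data"]["result"]:
        for ts, line in stream["values"]:
            lines.append((int(ts), line))
    return sorted(lines)


def fetch_lines(url, timeout=10):
    """Run the query against a reachable Loki instance and return its log lines."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return extract_lines(json.load(resp))
```

Only the label matchers inside the braces use the index; any line filter or parser after them is applied by scanning the selected chunks, so narrow label selectors and short time ranges keep queries fast.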
# Filter log stream by namespace and app label
{namespace="production", app="my-service"}
# Lines containing "error" that also match "exception" (case-insensitive regex)
{namespace="production", app="my-service"} |= "error" |~ "(?i)exception"
# JSON log parsing — extract fields and filter
{namespace="production"} | json | level="error"
# Error log lines per second, averaged over the last minute (use count_over_time for a per-minute count)
rate({namespace="production", app="my-service"} |= "error" [1m])
# Top 10 error messages by frequency
topk(10,
sum by (message)(
count_over_time({namespace="production"} | json | level="error" [1h])
)
)
# p99 latency per path from structured logs (if request_duration_ms is logged)
quantile_over_time(0.99, {namespace="production"} | json | unwrap request_duration_ms [5m]) by (path)

# logcli: query Loki from the terminal
export LOKI_ADDR=http://loki.monitoring.svc:3100
logcli query '{namespace="production",app="myapp"}' --limit=100 --tail
logcli query '{namespace="production"} | json | level="error"' \
--from="2026-05-23T14:00:00Z" --to="2026-05-23T14:30:00Z"
logcli labels # list available labels
logcli labels app # list values for a label
Fluent Bit
Fluent Bit is a lightweight log shipper, typically deployed as a DaemonSet. It reads container logs from /var/log/containers/, parses them, and forwards them to Elasticsearch, Loki, S3, or other outputs with minimal CPU and memory overhead.
# Helm values for fluent/fluent-bit chart — forward to Loki
config:
  inputs: |
    [INPUT]
        Name              tail
        Path              /var/log/containers/*.log
        multiline.parser  docker, cri
        Tag               kube.*
        Mem_Buf_Limit     50MB
        Skip_Long_Lines   On
  filters: |
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        # Merge JSON container logs into the record
        # (classic-format comments must sit on their own line)
        Merge_Log           On
        K8S-Logging.Parser  On
        # Honour the fluentbit.io/exclude: "true" pod annotation
        K8S-Logging.Exclude On
  outputs: |
    [OUTPUT]
        Name        loki
        Match       kube.*
        Host        loki.monitoring.svc
        Port        3100
        Labels      job=fluentbit,namespace=$kubernetes['namespace_name'],app=$kubernetes['labels']['app']
        line_format json
Logging Troubleshooting
| Symptom | Check | Likely cause |
|---|---|---|
| No logs in Loki / Grafana | kubectl logs -n monitoring -l app=promtail | Promtail/Fluent Bit can't reach Loki, RBAC missing, wrong Loki URL |
| Logs missing for a pod | Check pod annotation fluentbit.io/exclude | Pod opted out of log collection |
| kubectl logs: "context deadline exceeded" | kubectl logs --limit-bytes=1000000 <pod> | Log volume too large; limit bytes or time range |
| Old logs not in Loki | Check Loki retention_period config | Logs older than retention period are compacted/deleted |
| High Loki ingest cost | Check cardinality of labels | Too many unique label values (e.g., pod name as label); reduce label cardinality |
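
To make the label-cardinality row concrete: Loki creates one stream per unique combination of label values, so a high-cardinality label such as the pod name multiplies the number of streams. The sketch below counts the streams a set of log sources would produce under two labelling schemes. The function name count_streams and the sample pod names are hypothetical, for illustration only.

```python
def count_streams(sources, label_keys):
    """Count unique label-value combinations, i.e. the streams Loki would create."""
    return len({tuple(src[key] for key in label_keys) for src in sources})


# Hypothetical log sources: three replicas of one app plus a single worker pod.
sources = [
    {"namespace": "production", "app": "api", "pod": "api-7d9f-a"},
    {"namespace": "production", "app": "api", "pod": "api-7d9f-b"},
    {"namespace": "production", "app": "api", "pod": "api-7d9f-c"},
    {"namespace": "production", "app": "worker", "pod": "worker-5c1-a"},
]

low_cardinality = count_streams(sources, ["namespace", "app"])          # 2 streams
high_cardinality = count_streams(sources, ["namespace", "app", "pod"])  # 4 streams
```

With pod as a label, every rollout or restart adds new streams, whereas namespace and app stay stable. Keeping the pod name inside the log line or structured metadata instead of the label set is one way to keep the stream count bounded.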