Kubernetes Logging — K8s SRE Reference

TL;DR

Start with kubectl logs for immediate access. For persistent, searchable logs use Loki + Promtail (cost-effective, Prometheus-native) or Fluent Bit → Elasticsearch/OpenSearch (more query power, higher cost). Always write structured JSON to stdout/stderr.
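
As a minimal sketch of the structured-JSON advice, assuming a shell entrypoint or job script, the helper below writes one JSON object per line to stdout. The function names `json_escape` and `log_json` and the field names `ts`, `level`, and `message` are illustrative choices, not a standard; most application runtimes have a logging library that does this more robustly.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: emit one JSON object per line on stdout.

json_escape() {
  # Flatten newlines to spaces, then escape backslashes and double quotes.
  # Other control characters are not handled in this sketch.
  printf '%s' "$1" | tr '\n' ' ' | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

log_json() {
  # $1 = level (e.g. info, error), $2 = free-text message
  printf '{"ts":"%s","level":"%s","message":"%s"}\n' \
    "$(date -u +%Y-%m-%dT%H:%M:%SZ)" \
    "$(json_escape "$1")" \
    "$(json_escape "$2")"
}
```

Calling `log_json error 'upstream timed out'` prints a single line with `ts`, `level`, and `message` keys, which the `jq` and LogQL `| json` filters in this page can then select on.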

kubectl logs — Immediate Access

Use these patterns daily. Add --prefix when tailing multiple pods so that each line is prefixed with its source pod and container, and --previous to read logs from the previous instance of a container after it has crashed and been restarted.

kubectl-logs.sh (bash)

# Single pod
kubectl logs <pod> -n <ns>
kubectl logs <pod> -n <ns> -c <container>   # specific container
kubectl logs <pod> -n <ns> --previous         # crashed/previous container
kubectl logs <pod> -n <ns> --tail=100
kubectl logs <pod> -n <ns> -f                 # follow in real time

# All pods matching a label selector
kubectl logs -n <ns> -l app=myapp --all-containers --prefix --tail=200

# Since time (useful for incident scoping)
kubectl logs <pod> -n <ns> --since=1h
kubectl logs <pod> -n <ns> --since-time="2026-05-23T14:00:00Z"

# Filter on the fly (when no log aggregation is available)
kubectl logs -n <ns> -l app=myapp --tail=500 | grep -iE "error|fatal|exception"
kubectl logs -n <ns> <pod> | jq -R 'fromjson? | select(.level? == "error")'  # structured JSON logs; non-JSON lines are skipped
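
During an incident it can help to capture logs to disk before pods are rescheduled and the node-local log files disappear. The following is a sketch only: `snapshot_logs`, its argument order, and the output file layout are hypothetical, while the `kubectl` flags are the ones shown above plus `--timestamps`.

```shell
#!/usr/bin/env bash
# Sketch: save current and previous logs for every pod matching a selector.
# snapshot_logs and the <pod>.log / <pod>.previous.log layout are assumptions.

snapshot_logs() {
  local ns="$1" selector="$2" since="${3:-1h}" outdir="${4:-./logs-snapshot}"
  local pod name
  mkdir -p "$outdir" || return 1

  # `-o name` prints one "pod/<name>" entry per line.
  for pod in $(kubectl get pods -n "$ns" -l "$selector" -o name); do
    name="${pod#pod/}"

    # Current container instances, limited to the incident window.
    kubectl logs -n "$ns" "$pod" --all-containers --timestamps \
      --since="$since" > "$outdir/$name.log" 2>&1

    # --previous fails when no earlier container instance exists;
    # in that case discard the output file.
    if ! kubectl logs -n "$ns" "$pod" --all-containers --timestamps \
      --previous > "$outdir/$name.previous.log" 2>/dev/null; then
      rm -f "$outdir/$name.previous.log"
    fi
  done
}
```

For example, `snapshot_logs production app=myapp 30m ./incident-logs` would write one file per pod under `./incident-logs`.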

Loki + Promtail (Grafana Stack)

Loki stores logs as compressed chunks in object storage and indexes only labels (not the full text), making it far cheaper than Elasticsearch at scale; query with LogQL, which is similar to PromQL.

logql-examples.logql (LogQL)

# Filter log stream by namespace and app label
{namespace="production", app="my-service"}

# Filter for lines containing "error" or "exception" (case-insensitive)
{namespace="production", app="my-service"} |~ "(?i)(error|exception)"

# JSON log parsing — extract fields and filter
{namespace="production"} | json | level="error"

# Per-second rate of error log lines, averaged over a 1-minute window
rate({namespace="production", app="my-service"} |= "error" [1m])

# Top 10 error messages by frequency
topk(10,
  sum by (message)(
    count_over_time({namespace="production"} | json | level="error" [1h])
  )
)

# p99 latency per path from structured logs (if request_duration_ms is logged)
quantile_over_time(0.99,
  {namespace="production"} | json | unwrap request_duration_ms [5m]
) by (path)

logcli.sh (bash)

# logcli: query Loki from the terminal
export LOKI_ADDR=http://loki.monitoring.svc:3100

logcli query '{namespace="production",app="myapp"}' --limit=100 --tail
logcli query '{namespace="production"} | json | level="error"' \
  --from="2026-05-23T14:00:00Z" --to="2026-05-23T14:30:00Z"
logcli labels                         # list available labels
logcli labels app                     # list values for a label
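
Where logcli is not installed, the same range query can be sent to Loki's HTTP API with curl. This is a sketch under assumptions: `loki_query_range` is a hypothetical wrapper, the default address is the one used above, and a multi-tenant Loki would additionally need an `X-Scope-OrgID` header.

```shell
#!/usr/bin/env bash
# Sketch: run a LogQL range query through GET /loki/api/v1/query_range.
# loki_query_range is a hypothetical helper; start/end accept RFC3339
# timestamps or Unix epoch nanoseconds.

loki_query_range() {
  local query="$1" start="$2" end="$3" limit="${4:-100}"
  local base="${LOKI_ADDR:-http://loki.monitoring.svc:3100}"

  # -G turns the --data-urlencode pairs into a URL-encoded query string.
  curl -sS -G "$base/loki/api/v1/query_range" \
    --data-urlencode "query=$query" \
    --data-urlencode "start=$start" \
    --data-urlencode "end=$end" \
    --data-urlencode "limit=$limit"
}
```

A call such as `loki_query_range '{namespace="production"} | json | level="error"' 2026-05-23T14:00:00Z 2026-05-23T14:30:00Z` prints the raw JSON response, which can be piped to jq for further filtering.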

Fluent Bit

Fluent Bit is a lightweight DaemonSet log shipper; it reads container logs from /var/log/containers/, parses them, and forwards to Elasticsearch, Loki, S3, or other outputs with minimal CPU/memory overhead.

fluent-bit-values.yaml (yaml)

# Helm values for fluent/fluent-bit chart — forward to Loki
config:
  inputs: |
    [INPUT]
        Name              tail
        Path              /var/log/containers/*.log
        multiline.parser  docker, cri
        Tag               kube.*
        Mem_Buf_Limit     50MB
        Skip_Long_Lines   On
  filters: |
    [FILTER]
        Name                kubernetes
        Match               kube.*
        Kube_URL            https://kubernetes.default.svc:443
        Kube_CA_File        /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
        Kube_Token_File     /var/run/secrets/kubernetes.io/serviceaccount/token
        # Merge JSON container logs into the record
        Merge_Log           On
        K8S-Logging.Parser  On
        # Honour the fluentbit.io/exclude: "true" pod annotation
        K8S-Logging.Exclude On
  outputs: |
    [OUTPUT]
        Name            loki
        Match           kube.*
        Host            loki.monitoring.svc
        Port            3100
        Labels          job=fluentbit,namespace=$kubernetes['namespace_name'],app=$kubernetes['labels']['app']
        line_format     json
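
To apply a values file like the one above, the chart can be installed from the Fluent Helm repository. The sketch below assumes the file is saved as `fluent-bit-values.yaml`; the function name `install_fluent_bit`, the release name `fluent-bit`, and the default namespace `logging` are hypothetical choices.

```shell
#!/usr/bin/env bash
# Sketch: install or upgrade the fluent/fluent-bit chart with custom values.
# install_fluent_bit, the release name, and the default namespace are assumptions.

install_fluent_bit() {
  local ns="${1:-logging}" values="${2:-fluent-bit-values.yaml}"

  helm repo add fluent https://fluent.github.io/helm-charts &&
    helm repo update &&
    helm upgrade --install fluent-bit fluent/fluent-bit \
      --namespace "$ns" --create-namespace -f "$values"
}
```

After `install_fluent_bit logging fluent-bit-values.yaml`, a command such as `kubectl rollout status daemonset/fluent-bit -n logging` should confirm that one pod per node is running, assuming the DaemonSet is named after the release.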

Logging Troubleshooting

Symptom	Check	Likely cause
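No logs in Loki / Grafana	`kubectl logs -n monitoring -l app=promtail` (the label may be `app.kubernetes.io/name=promtail`, depending on the chart; check the Fluent Bit pods the same way)	Promtail/Fluent Bit can't reach Loki, RBAC missing, wrong Loki URL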
Logs missing for a pod	Check pod annotation `fluentbit.io/exclude`	Pod opted out of log collection
kubectl logs: "context deadline exceeded"	`kubectl logs --limit-bytes=1000000 <pod>`	Log volume too large; limit bytes or time range
Old logs not in Loki	Check Loki `retention_period` config	Logs older than retention period are compacted/deleted
High Loki ingest cost	Check cardinality of labels	Too many unique label values (e.g., pod name as label); reduce label cardinality
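
The annotation check in the second row can be scripted. This is a sketch, with `is_log_excluded` as a hypothetical helper name; it reads the `fluentbit.io/exclude` annotation through kubectl's jsonpath output, where the dot in the annotation key has to be escaped.

```shell
#!/usr/bin/env bash
# Sketch: report whether a pod has opted out of Fluent Bit log collection.
# is_log_excluded is a hypothetical helper. Exit status: 0 = excluded,
# 1 = not excluded, 2 = the kubectl lookup itself failed.

is_log_excluded() {
  local ns="$1" pod="$2" value
  value="$(kubectl get pod "$pod" -n "$ns" \
    -o jsonpath='{.metadata.annotations.fluentbit\.io/exclude}')" || return 2
  [ "$value" = "true" ]
}
```

Usage: `is_log_excluded production my-pod && echo "pod is excluded from log collection"`.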
