Network Policies — K8s SRE Reference

TL;DR

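NetworkPolicies are Kubernetes allow-list rules for Pod traffic. By default, Pods are non-isolated: they accept traffic from any source and can send to any destination. Once a policy selects a Pod for ingress or egress, that direction becomes default-deny except for traffic matching an allow rule. Policies are additive: when several policies select the same Pod, the allowed traffic is the union of their rules. Policies are enforced by the CNI plugin, so they take effect only if the cluster's networking plugin supports them.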

Mental Model

Think of NetworkPolicy as namespace-scoped firewall intent for Pods. You select the Pods being protected, then define which sources may connect to them and which destinations they may call. Standard NetworkPolicy has no explicit deny rule; denial happens when a Pod becomes isolated and no allow rule matches.

NetworkPolicy does not control Kubernetes RBAC, image pulls, host firewall rules, cloud security groups, or node-to-node traffic, and it does not cover CNI-specific extensions. It is primarily about Pod network traffic.

A common production pattern: isolate first, then add narrow allow rules.

Core Concepts

Concept	Meaning	Common Mistake
`podSelector`	Selects the Pods the policy applies to: the destinations for ingress rules, the sources for egress rules.	Thinking it selects the traffic peers. It selects the protected Pods.
`policyTypes`	Whether policy affects Ingress, Egress, or both.	Forgetting egress and wondering why outbound traffic still works.
`ingress.from`	Allowed sources for incoming traffic.	Selector labels do not match actual Pods/namespaces.
`egress.to`	Allowed destinations for outbound traffic.	Blocking DNS by forgetting UDP/TCP 53.
`ipBlock`	Allows or excludes CIDR ranges.	Using it for Pod IPs instead of selectors.
CNI enforcement	Calico, Cilium, Antrea, Canal, and others may enforce policies.	Policies exist but no dataplane enforces them.

Default Deny

Start with a default-deny policy when you want a namespace to be isolated. Then add explicit allow policies for required paths. Do this carefully in production because it can break DNS, metrics scraping, ingress controller traffic, and app dependencies.

default-deny-all.yaml

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-all
  namespace: app
spec:
  podSelector: {}
  policyTypes:
    - Ingress
    - Egress
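
Because the policy above denies both directions at once, a less disruptive first step can be to isolate ingress only and leave egress (including DNS) untouched until its allow rules are ready. The following is a minimal sketch of that variant, assuming the same `app` namespace; the policy name `default-deny-ingress` is an illustrative choice, not something the lab depends on.

```yaml
# Sketch: ingress-only default deny for the "app" namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress   # hypothetical name
  namespace: app
spec:
  podSelector: {}              # empty selector: every Pod in the namespace
  policyTypes:
    - Ingress                  # no ingress rules are listed, so all inbound traffic is denied
```

With this variant, outbound traffic from Pods in `app` keeps working because no policy selects them for egress.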

From-Scratch Lab

This lab creates a frontend and an API from the manifest, then launches a one-off random client from the test script. The policy allows only frontend Pods to call the API on port 80. Use it to practice proving allowed and denied paths.

netpol-lab.yaml

apiVersion: v1
kind: Namespace
metadata:
  name: app
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: api
  namespace: app
spec:
  replicas: 2
  selector:
    matchLabels:
      app: api
  template:
    metadata:
      labels:
        app: api
    spec:
      containers:
        - name: api
          image: nginxdemos/hello:plain-text
          ports:
            - name: http
              containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: api
  namespace: app
spec:
  selector:
    app: api
  ports:
    - name: http
      port: 80
      targetPort: http
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: frontend
  namespace: app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: frontend
  template:
    metadata:
      labels:
        app: frontend
    spec:
      containers:
        - name: shell
          image: curlimages/curl
          command: ["sleep", "3600"]

allow-frontend-to-api.yaml

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-frontend-to-api
  namespace: app
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector:
            matchLabels:
              app: frontend
      ports:
        - protocol: TCP
          port: 80

test-netpol-lab.sh

kubectl apply -f netpol-lab.yaml
kubectl apply -f allow-frontend-to-api.yaml
kubectl rollout status deploy/api -n app
kubectl rollout status deploy/frontend -n app

# Allowed: frontend -> api.
FRONTEND=$(kubectl get pod -n app -l app=frontend -o jsonpath='{.items[0].metadata.name}')
kubectl exec -n app "$FRONTEND" -- curl -sS --max-time 3 http://api.app.svc.cluster.local

# Denied: random client -> api. The Pod lacks the app=frontend label,
# so curl should time out after 3 seconds instead of printing a response.
kubectl run random-client -n app --rm -it --image=curlimages/curl --restart=Never -- \
  curl -v --max-time 3 http://api.app.svc.cluster.local

DNS Egress Exception

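If you apply egress default-deny, DNS often breaks first. Allow both UDP and TCP port 53 to CoreDNS. In many clusters, selecting the kube-system namespace is enough; stricter environments also select the DNS Pods by label, as this policy does with `k8s-app: kube-dns`. Confirm the label on your DNS Pods before relying on it.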

allow-dns-egress.yaml

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress
  namespace: app
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: kube-system
          podSelector:
            matchLabels:
              k8s-app: kube-dns
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
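
For comparison, the looser form mentioned above selects the whole kube-system namespace and omits the Pod label. This is a sketch under the assumption that the cluster sets the standard `kubernetes.io/metadata.name` label on namespaces; the policy name is hypothetical.

```yaml
# Sketch: DNS egress to any Pod in kube-system on port 53.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-dns-egress-namespace-wide   # hypothetical name
  namespace: app
spec:
  podSelector: {}                         # applies to every Pod in "app"
  policyTypes:
    - Egress
  egress:
    - to:
        - namespaceSelector:              # no podSelector: all Pods in kube-system match
            matchLabels:
              kubernetes.io/metadata.name: kube-system
      ports:
        - protocol: UDP
          port: 53
        - protocol: TCP
          port: 53
```

The trade-off is that any Pod in kube-system listening on port 53 becomes reachable, not only the DNS Pods.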

Selectors And ipBlock

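Selectors are the heart of NetworkPolicy. A `podSelector` alone matches Pods in the policy's own namespace. A `namespaceSelector` alone matches all Pods in the matching namespaces. Both selectors in the same `from` or `to` entry match only Pods that carry the Pod labels inside namespaces that carry the namespace labels (AND). Listed as separate entries, they are evaluated independently (OR), which is usually broader than intended.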
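
The difference between combining selectors in one entry and listing them as separate entries comes down to a single `-` in the YAML. The sketch below puts both forms side by side in one policy purely for comparison; the policy name is hypothetical, and the `monitoring` namespace and `app: prometheus-agent` label are assumed example values.

```yaml
# Sketch: AND versus OR selector semantics. A real policy would keep only one rule.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: selector-semantics-demo   # hypothetical name
  namespace: app
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
    - Ingress
  ingress:
    # Rule A: ONE peer entry holding both selectors (AND).
    # Matches only prometheus-agent Pods inside the monitoring namespace.
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
          podSelector:
            matchLabels:
              app: prometheus-agent
    # Rule B: TWO peer entries (OR).
    # Matches every Pod in monitoring, plus prometheus-agent Pods in "app".
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
        - podSelector:
            matchLabels:
              app: prometheus-agent
```

Since rules within a policy are additive, Rule B makes Rule A redundant here; when the intent is the narrow match, keep only the Rule A form.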

allow-monitoring.yaml

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-monitoring-scrape
  namespace: app
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: monitoring
          podSelector:
            matchLabels:
              app: prometheus-agent
      ports:
        - protocol: TCP
          port: 8080

allow-external-api-egress.yaml

apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-payment-provider
  namespace: app
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
    - Egress
  egress:
    - to:
        - ipBlock:
            cidr: 203.0.113.0/24
            except:
              - 203.0.113.99/32
      ports:
        - protocol: TCP
          port: 443

Production Patterns

1. Namespace isolation: start new app namespaces with default-deny and explicit platform exceptions.
2. Tier boundaries: allow frontend to API, API to database, monitoring to metrics, and ingress controller to app.
3. DNS and telemetry: include DNS, metrics, tracing, logging, and service mesh sidecar requirements in egress planning.
4. Controller sources: if Ingress traffic is blocked, allow traffic from the ingress controller namespace/Pods to the backend Pods. Policies match Pods and their container ports, not Services or Service ports.
5. Document labels: policies are only as reliable as namespace and Pod labels.
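
The controller-source pattern can be sketched as a policy on the backend Pods. The namespace name `ingress-nginx` and the label `app.kubernetes.io/name: ingress-nginx` are assumptions based on a typical ingress-nginx install, and the policy name is hypothetical; check the actual namespace and Pod labels in your cluster before using anything like this.

```yaml
# Sketch: let ingress controller Pods reach the api Pods.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-ingress-controller-to-api   # hypothetical name
  namespace: app
spec:
  podSelector:
    matchLabels:
      app: api
  policyTypes:
    - Ingress
  ingress:
    - from:
        - namespaceSelector:
            matchLabels:
              kubernetes.io/metadata.name: ingress-nginx   # assumed controller namespace
          podSelector:
            matchLabels:
              app.kubernetes.io/name: ingress-nginx        # assumed controller Pod label
      ports:
        - protocol: TCP
          port: 80   # the Pod's container port, which may differ from the Service port
```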

Debugging Workflow

First prove that the app and Service work rather than assuming it. DNS may resolve and endpoints may exist, but NetworkPolicy can still block the connection. Test from the exact source namespace and Pod identity that should be allowed.

netpol-debug.sh

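# Replace the placeholders before running. They are quoted because
# unquoted < and > would be parsed by the shell as redirections.
NS="<namespace>"
SVC="<service>"
PORT="<port>"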

# See policies and labels.
kubectl get networkpolicy -n "$NS"
kubectl describe networkpolicy -n "$NS"
kubectl get pods -n "$NS" --show-labels -o wide
kubectl get ns --show-labels

# Confirm Service and endpoints exist.
kubectl get svc "$SVC" -n "$NS" -o wide
kubectl get endpointslice -n "$NS" -l kubernetes.io/service-name="$SVC" -o wide

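# Test from a chosen source. The variables expand locally before the command runs in the Pod.
# To test an allowed path, give the test Pod the labels the policy expects, e.g. --labels="app=frontend".
kubectl run netshoot -n "$NS" --rm -it --restart=Never --image=nicolaka/netshoot -- \
  /bin/bash -c "nslookup $SVC.$NS.svc.cluster.local; \
    curl -v --connect-timeout 3 http://$SVC.$NS.svc.cluster.local:$PORT; \
    nc -vz -w 3 $SVC.$NS.svc.cluster.local $PORT"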

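# Identify the CNI plugin. Running Pods show which plugin is installed, not that it enforces policy.
# The CNI may run outside kube-system (for example calico-system), so search all namespaces.
kubectl get pods -A -o wide | grep -E 'calico|cilium|antrea|weave|canal'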

Symptom To Cause

Symptom	Likely Cause	Check First
Policies applied but traffic still allowed	CNI does not enforce NetworkPolicy or policy does not select the Pod.	CNI pods, protected Pod labels, `podSelector`.
DNS stopped working	Egress policy blocks UDP/TCP 53 to CoreDNS.	DNS egress allow rule and CoreDNS labels.
Ingress returns 502/504 after policy rollout	Ingress controller namespace/Pods not allowed to backend Pods.	Source namespace labels and controller Pod labels.
Same namespace traffic blocked unexpectedly	Default-deny selected the destination Pod and no same-namespace allow exists.	All policies selecting the destination Pod.
Cross-namespace rule not matching	Namespace labels or Pod labels do not match the selectors.	`kubectl get ns --show-labels`, Pod labels.
Egress to external API blocked	No egress allow for CIDR/port, or external IP changes frequently.	`ipBlock`, provider IP ranges, proxy pattern.
Policy allows port but app still times out	Service targetPort/app listener issue, not policy.	Direct Pod curl, Service YAML, app logs.
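
For the "same namespace traffic blocked unexpectedly" row, one possible remedy is an explicit same-namespace allow. The sketch below assumes the `app` namespace from the lab, and the policy name is hypothetical. It is deliberately broad, so narrower per-tier rules are preferable where the required flows are known.

```yaml
# Sketch: allow ingress between all Pods inside the "app" namespace.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: allow-same-namespace   # hypothetical name
  namespace: app
spec:
  podSelector: {}              # applies to every Pod in "app"
  policyTypes:
    - Ingress
  ingress:
    - from:
        - podSelector: {}      # any Pod in the policy's own namespace
```

If the namespace also has an egress default-deny, the calling Pods still need a matching egress allow; this policy covers the ingress direction only.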

Safe Change Pattern

List required flows before writing YAML: source, destination, namespace, port, protocol, and DNS needs.
Apply in a lab or staging namespace first and test both allowed and denied traffic.
Roll out default-deny during a change window for production namespaces.
Keep rollback simple: remove the new policies or revert the GitOps/Helm change.
After rollout, test DNS, ingress path, service-to-service calls, metrics scraping, logging, and app startup.