AWS IAM & Security Groups For Kubernetes On EKS
Use IRSA or Pod Identity for fine-grained, per-pod AWS credentials, and reserve node instance profiles for what the kubelet, the AWS VPC CNI, and node agents need. Scope security groups deliberately for apiserver signaling, workloads, load balancers, and ephemeral pod ENIs (VPC CNI). Model everything in IaC (Terraform), and check for SG/IAM regressions first when Services or Ingress stop reconciling.
IRSA Trust Flow (OIDC)
IRSA binds a Kubernetes ServiceAccount to an IAM role via the cluster OIDC issuer; STS mints scoped credentials inside the Pod.
Automate IAM role wiring with terraform-aws-modules/iam-role-for-service-accounts-eks (Terraform IRSA section). Always scope the Federated trust for sts:AssumeRoleWithWebIdentity with both the :aud condition key and a :sub condition key of the form system:serviceaccount:NAMESPACE:SERVICEACCOUNT.
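As a minimal sketch of that scoping, the helpers below derive the two condition keys from an issuer URL. The function names `irsa_issuer_host` and `irsa_condition_keys` are hypothetical (not part of any AWS tool), and the issuer value is a placeholder.

```shell
# Hypothetical helpers: derive IRSA trust-policy condition keys from a
# cluster OIDC issuer URL.

# Strip the https:// scheme and one trailing slash from the issuer URL.
irsa_issuer_host() {
  _url=${1#https://}
  printf '%s\n' "${_url%/}"
}

# Print the :aud and :sub condition lines for one namespace/ServiceAccount.
# $1 = issuer URL, $2 = namespace, $3 = ServiceAccount name
irsa_condition_keys() {
  _host=$(irsa_issuer_host "$1")
  printf '%s:aud = sts.amazonaws.com\n' "$_host"
  printf '%s:sub = system:serviceaccount:%s:%s\n' "$_host" "$2" "$3"
}
```

Usage, assuming the issuer is read with `aws eks describe-cluster --name NAME --query 'cluster.identity.oidc.issuer' --output text`: `irsa_condition_keys "$ISSUER" payments app-sqs-consumer`.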
Node Instance Roles Vs Pod Roles
| Principal | Attached IAM | Typical permissions |
|---|---|---|
| EC2 / MNG instance profile | IAM role baked into the launch template | Pull from ECR, a limited set of ENI/ASG describe calls, KMS decrypt for node volumes. |
| Cluster addons (LB controller, CSI, CA) | Dedicated IAM roles via IRSA | ELBv2 mutate, AttachVolume/ebs CSI, DescribeAutoScalingGroups for CA. |
| Workload pods | Explicit IRSA annotations | SQS, Dynamo, Secrets Manager — never widen node profile to mimic app IAM. |
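To tell which row of the table a pod actually falls into, one option is to classify the ARN that `aws sts get-caller-identity` returns from inside the pod. This is a sketch only: `assumed_role_name` and `credential_source` are hypothetical helper names, and the node role name is whatever your node group uses.

```shell
# Hypothetical helper: extract the role name from an STS assumed-role ARN
# (arn:aws:sts::ACCOUNT:assumed-role/ROLE_NAME/SESSION_NAME).
assumed_role_name() {
  case $1 in
    arn:*:sts::*:assumed-role/*/*)
      _rest=${1#*:assumed-role/}
      printf '%s\n' "${_rest%%/*}"
      ;;
    *) return 1 ;;
  esac
}

# Report whether a caller ARN is the node role (fallback) or a pod role.
# $1 = caller ARN, $2 = node role name
credential_source() {
  _role=$(assumed_role_name "$1") || { echo "not-an-assumed-role"; return 0; }
  if [ "$_role" = "$2" ]; then
    echo "node-role-fallback"
  else
    echo "pod-role"
  fi
}
```

A result of `node-role-fallback` suggests the pod never received its own credentials and is reading the instance profile instead.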
# Exec into problematic pod — confirm env AWS_ROLE_ARN + token file mounts exist.
kubectl -n payments exec deploy/app -- env | grep '^AWS_' || true
# get-caller-identity shows the role the SDK actually resolves (needs the AWS CLI in the image).
kubectl -n payments exec deploy/app -- aws sts get-caller-identity
Security Groups: Nodes, API, LB, Pod ENIs
| Boundary | Usually attached to… | Things to nail |
|---|---|---|
| Cluster / control-plane SG | Managed ENIs bridging api → nodes | Preserve AWS-managed rules for apiserver/kubelet chatter; minimize human edits. |
| Node SG | Workers (and sometimes managed ENIs) | Expose only needed app ports intra-VPC; allow control-plane inbound on 443/10250 patterns per AWS baseline. |
| LB SG | ALB / NLB created via controller | Ingress health checks need correct target groups; correlate with subnets tagged kubernetes.io/ |
| Pod SG (VPC CNI feature) | Pods with dedicated SG refs | Understand IP prefix/SG interplay; egress to AWS APIs passes node/NAT routing. |
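When checking which boundary an ENI belongs to, listing the interfaces that carry a given security group is a reasonable first step. The wrapper name `enis_for_sg` is hypothetical; it assumes the AWS CLI is configured with ec2:DescribeNetworkInterfaces.

```shell
# Hypothetical wrapper: list the ENIs that carry a given security group.
# $1 = security group ID
enis_for_sg() {
  aws ec2 describe-network-interfaces \
    --filters "Name=group-id,Values=$1" \
    --query 'NetworkInterfaces[].[NetworkInterfaceId,PrivateIpAddress,InterfaceType,Description]' \
    --output text
}
```

The InterfaceType column should help separate ordinary node ENIs from the trunk and branch interfaces used by the pod SG feature.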
# Inspect ENIs tying SGs ↔ nodes (illustrative workflow; SG IDs differ per VPC).
# Pod spec fragment: nodeSelector belongs under spec, not metadata.
metadata:
  annotations: {}
spec:
  nodeSelector:
    eks.amazonaws.com/nodegroup: ingress-heavy
# Cross-check actual ENIs via AWS CLI / console when debugging SG egress drops.
Common EKS IAM Mistakes
| Symptom | Root cause pitfall | Rapid remediation |
|---|---|---|
| AccessDenied from pod calling AWS APIs | IAM policy missing kms:Decrypt / wrong resource ARN, or missing IRSA annotations | Inspect pod SA annotations + CloudTrail AssumeRoleWithWebIdentity events; fix trust :sub string mismatches. |
| Creds silently too powerful | Workload falls back to the node IAM role | Annotate the SA, set the IMDS hop limit to 1 where safe, tighten the node policy. |
| Webhook / aws-auth breakage | Stale aws-auth mapping or SSO role rename | Use EKS Access Entries where available; reconcile aws-auth carefully. |
| STS throttling globally | Huge fleets assume pod roles without caching | Regional STS endpoints (eks.amazonaws.com/sts-regional-endpoints annotation on the SA); tune SDK timeouts and retries. |
| IAM role quotas | Many micro-services each with IAM role churn | Reuse roles with tighter policy partitioning or attribute-based conditioning. |
| Broken OIDC discovery | Deleting the provider while IRSA workloads are still rolling out | Freeze deploys until the provider is restored; restart workloads that are stuck on STS errors. |
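For the :sub and :aud mismatches in the table, it can help to look at the claims inside the projected token itself. The sketch below decodes the payload segment of a JWT with standard tools; `jwt_payload` is a hypothetical helper name, and it performs no signature verification.

```shell
# Hypothetical helper: print the JSON payload of a JWT (no verification).
# $1 = the full token string (header.payload.signature)
jwt_payload() {
  # Take the middle segment and convert base64url to standard base64.
  _seg=$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')
  # Restore the padding that base64url omits.
  case $(( ${#_seg} % 4 )) in
    2) _seg="$_seg==" ;;
    3) _seg="$_seg=" ;;
  esac
  printf '%s' "$_seg" | base64 -d
}
```

Inside an IRSA pod the token path is exposed in AWS_WEB_IDENTITY_TOKEN_FILE, so something like `jwt_payload "$(cat "$AWS_WEB_IDENTITY_TOKEN_FILE")"` would show the `sub` and `aud` values to compare against the role trust policy.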
KMS CMKs & Secrets
Encrypt etcd secrets at rest with AWS KMS CMKs referenced in cluster config — platform teams own key policies for EKS principal use. Pods rarely talk to KMS unless app logic requires decrypt; tie IRSA principals with restrictive key policies.
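To confirm which key a cluster references, the cluster description can be queried directly. This is a sketch assuming the AWS CLI and eks:DescribeCluster; `cluster_secrets_kms_keys` is a hypothetical wrapper name.

```shell
# Hypothetical wrapper: print the KMS key ARN(s) referenced by the cluster's
# secrets encryption config.
# $1 = cluster name
cluster_secrets_kms_keys() {
  aws eks describe-cluster --name "$1" \
    --query 'cluster.encryptionConfig[].provider.keyArn' \
    --output text
}
```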
# CloudTrail event lookup via CloudWatch Logs: AssumeRole* events come from sts.amazonaws.com.
aws logs filter-log-events \
  --log-group-name CloudTrail/Default \
  --filter-pattern "AssumeRole"
IAM Trust Relationship Shape (Conceptual)
Generated JSON below matches what Terraform IAM modules emit — double-check issuer host matches your cluster’s OIDC URL (no stray trailing slashes).
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::ACCOUNT_ID:oidc-provider/oidc.eks.REGION.amazonaws.com/id/CLUSTER_ID"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.REGION.amazonaws.com/id/CLUSTER_ID:aud": "sts.amazonaws.com",
          "oidc.eks.REGION.amazonaws.com/id/CLUSTER_ID:sub": "system:serviceaccount:payments:app-sqs-consumer"
        }
      }
    }
  ]
}
Translating Humans Into apiserver Principals
| Mechanism | Use when… | Pitfalls |
|---|---|---|
| aws-auth | Legacy clusters still mapping IAM roles → Kubernetes groups | A syntax error locks every admin out until an emergency SSM session or fix. |
| Access Entries | New defaults for IAM SSO roles / break-glass users | Stale entries after IdP renaming — audit quarterly. |
| Separate admin role per env | Finance wants tight SCP boundaries prod vs dev | Copy/paste SCP statements missing kms: encrypt contexts. |
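As an illustration of the Access Entries row, the sketch below grants one IAM principal read-only access to a single namespace without touching aws-auth. `grant_namespace_view` is a hypothetical wrapper; it assumes a cluster whose authentication mode allows access entries and the AWS-managed AmazonEKSViewPolicy access policy.

```shell
# Hypothetical wrapper: namespace-scoped read-only access via Access Entries.
# $1 = cluster name, $2 = IAM principal ARN, $3 = namespace
grant_namespace_view() {
  aws eks create-access-entry --cluster-name "$1" --principal-arn "$2" &&
  aws eks associate-access-policy --cluster-name "$1" --principal-arn "$2" \
    --policy-arn arn:aws:eks::aws:cluster-access-policy/AmazonEKSViewPolicy \
    --access-scope "type=namespace,namespaces=$3"
}
```

`aws eks list-access-entries --cluster-name NAME` is a natural companion for the quarterly audit mentioned in the table.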
# Snapshot before edits — restores must be deterministic.
kubectl -n kube-system get configmap aws-auth -o yaml > /tmp/aws-auth-backup.yaml
aws-auth ConfigMap Shape (Danger Zone)
Prefer Access Entries/EKS APIs for greenfield installs; the shape below is retained purely for comprehension when navigating brownfield outages.
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: arn:aws:iam::123456789012:role/sso-infra-platform
      username: sso-infra-{{SessionName}}
      groups:
        - system:masters # illustrative; tighten in real environments
        - platform-crds-maintainers
    - rolearn: arn:aws:iam::123456789012:role/eks-worker-node-role
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
    - rolearn: arn:aws:iam::123456789012:role/legacy-batch
      username: batch-role
      groups:
        - batch-readonly-ns
  mapUsers: |
    - userarn: arn:aws:iam::123456789012:user/break-glass-vault
      username: break-glass
      groups:
        - emergency-cluster-admin-temporary
Every merge here should run through peer review correlating SSO group rename tickets and Kubernetes auth primitives.
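One check that peer review can automate is catching role ARNs that include an IAM path, since aws-auth expects the role/NAME form without a path. The lint below is a sketch; `lint_aws_auth_rolearns` is a hypothetical name and it only inspects `rolearn:` lines in a saved manifest.

```shell
# Hypothetical lint: flag rolearn entries that contain an IAM path
# (role/team/name); aws-auth expects role/NAME with no path.
# $1 = path to an aws-auth manifest file
lint_aws_auth_rolearns() {
  _bad=$(grep -E 'rolearn:' "$1" | grep -E ':role/[^[:space:]]*/' || true)
  if [ -n "$_bad" ]; then
    printf 'rolearn with path:\n%s\n' "$_bad"
    return 1
  fi
}
```

It could be run against the snapshot taken earlier, for example `lint_aws_auth_rolearns /tmp/aws-auth-backup.yaml`, before any edit is applied.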
Network Paths Pods Use For AWS APIs
AssumeRoleWithWebIdentity calls reach STS either through NAT or through interface VPC endpoints. Private clusters often mandate STS and ECR VPC endpoints, plus private DNS that resolves to those endpoint ENIs instead of relying on a 0.0.0.0/0 internet route; misroutes appear as flaky IRSA timeouts during rollout storms.
| Endpoint | Consumed by… | Operational signal |
|---|---|---|
| com.amazonaws.region.sts | IRSA, Pod Identity, External Secrets | STS call timeouts when the endpoint SG blocks the node SG. |
| ECR dkr/api | Image pulls without internet egress | 403 versus throttling differentiated by AWS support metrics. |
| Elastic Load Balancing / EC2 APIs | Controllers creating LoadBalancer backends | Insufficient ec2:* permissions surface as Ingress Events only. |
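A quick way to spot a missing endpoint from the table is to compare the VPC's endpoints with the expected service names. This is a sketch: `required_endpoint_services` and `missing_endpoints` are hypothetical helpers, and the service list simply mirrors the rows above.

```shell
# Hypothetical helper: endpoint service names from the table, for one region.
# $1 = region
required_endpoint_services() {
  for _svc in sts ecr.api ecr.dkr ec2 elasticloadbalancing; do
    printf 'com.amazonaws.%s.%s\n' "$1" "$_svc"
  done
}

# Print the services from that list that have no endpoint in the VPC.
# $1 = region, $2 = VPC ID. Needs ec2:DescribeVpcEndpoints.
missing_endpoints() {
  _present=$(aws ec2 describe-vpc-endpoints --region "$1" \
    --filters "Name=vpc-id,Values=$2" \
    --query 'VpcEndpoints[].ServiceName' --output text | tr '\t' '\n')
  for _name in $(required_endpoint_services "$1"); do
    printf '%s\n' "$_present" | grep -qxF "$_name" || printf '%s\n' "$_name"
  done
}
```

ECR serves image layers from S3, so a private cluster typically also needs an S3 gateway endpoint; that check is left out of this sketch.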
SCPs & Permission Boundaries For Platform Roles
| Artifact | Impact on EKS tooling |
|---|---|
| Explicit deny on iam:PassRole | Breaks CD pipelines that hand roles to controllers unless an exception path exists. |
| Regional restrictions | Secrets replication + multi-region STS assumptions fail. |
| Boundary on node role | Autoscaler terminate calls fail even when the inline policy permits them. |
| Wildcard deny on unmanaged ARNs | New IRSA roles may be blocked silently until exception tickets land. |
Cross-link this section from the docs for clusters built via Terraform; the execution roles used there must be reconciled with organization guardrails before merges.
Ingress / LB Security Group Hygiene
| Concern | Recommendation |
|---|---|
| Health check flap | Ensure node/SG ingress allows LB SG on targetPort—even when health checks originate from AWS IPs. |
| Sticky 0.0.0.0/0 | Replace allowances left over from iterative debugging with automated, tightened CIDRs. |
| NLB preserving client IP vs proxy protocol | Align externalTrafficPolicy semantics with LB type per Services. |
| Cross-account ACM | Validate SAN + trust chain controllers expect; Ingress secret mismatch blocks listener creation. |
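Leftover 0.0.0.0/0 rules can be found with a filtered describe call. The wrapper below is a sketch with a hypothetical name, `open_ingress_sgs`; it assumes ec2:DescribeSecurityGroups and only covers IPv4 inbound rules.

```shell
# Hypothetical wrapper: list security groups in a VPC that still have an
# inbound rule open to 0.0.0.0/0.
# $1 = VPC ID
open_ingress_sgs() {
  aws ec2 describe-security-groups \
    --filters "Name=vpc-id,Values=$1" "Name=ip-permission.cidr,Values=0.0.0.0/0" \
    --query 'SecurityGroups[].[GroupId,GroupName]' \
    --output text
}
```

Internet-facing load balancer SGs will legitimately appear in the output; the interesting entries are node or pod SGs.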
Audit & Forensics Cheat Sheet
# Inventory IRSA-managed roles referencing your cluster issuer.
aws iam list-roles --query "Roles[?contains(to_string(AssumeRolePolicyDocument), 'oidc-provider/oidc.eks')].[RoleName]" --output table
aws cloudtrail lookup-events --lookup-attributes AttributeKey=EventName,AttributeValue=AssumeRoleWithWebIdentity \
  --max-results 10
Narrow IAM Policy Attachment Example
Use condition keys to avoid bucket-wide exposures when mapping IRSA workload roles authored via Terraform modules.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ListPrefix",
      "Effect": "Allow",
      "Action": ["s3:ListBucket"],
      "Resource": "arn:aws:s3:::prod-artifacts",
      "Condition": {
        "StringLike": { "s3:prefix": ["releases/*"] }
      }
    },
    {
      "Sid": "ObjectRW",
      "Effect": "Allow",
      "Action": ["s3:GetObject", "s3:PutObject"],
      "Resource": "arn:aws:s3:::prod-artifacts/releases/*"
    }
  ]
}
IAM Pod Identity Vs IRSA Snapshot
| Topic | IRSA (classic) | Pod Identity agent |
|---|---|---|
| Trust source | OIDC federation on role | Role trusts pods.eks.amazonaws.com; agent-mediated short-lived creds |
| Debugging muscle memory | Mature community examples | Newer playbook; align versions with EKS add-ons |
| Rotation | SDK refresh via web identity file | Agent watches ServiceAccount mapping objects |
| Migration | Default for most brownfield clusters | Plan cutover windows; avoid dual annotate |
Regardless of path, never store long-lived access keys in Secrets—breaks rotation and contradicts IaC auditing.
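For comparison with the IRSA trust policy, the sketch below prints the trust policy shape a Pod Identity role uses and wraps the association call. Both function names are hypothetical; the association assumes the EKS Pod Identity Agent add-on is installed on the cluster.

```shell
# Print the trust policy a role needs for EKS Pod Identity.
pod_identity_trust_policy() {
  cat <<'EOF'
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": { "Service": "pods.eks.amazonaws.com" },
      "Action": ["sts:AssumeRole", "sts:TagSession"]
    }
  ]
}
EOF
}

# Hypothetical wrapper: bind a ServiceAccount to a role.
# $1 = cluster, $2 = namespace, $3 = ServiceAccount, $4 = role ARN
associate_pod_identity() {
  aws eks create-pod-identity-association --cluster-name "$1" \
    --namespace "$2" --service-account "$3" --role-arn "$4"
}
```

Unlike IRSA, nothing in this trust policy names the cluster issuer, so the same role can be reused across clusters.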
Cross-Account Resource Policies
When pods in account A pull from ECR or decrypt CMKs in account B, BOTH identity policy (IRSA role) and resource policy (ECR repo, KMS key) must acknowledge the foreign principal. Symptoms look like IAM “allowed in policy simulator” yet runtime 403.
aws ecr get-repository-policy \
--repository-name platform/base-image \
--region us-east-1
aws kms get-key-policy \
--key-id alias/prod-etcd \
--policy-name default
Break-Glass Roles & Session Controls
- Time-limited break-glass role with mandatory MFA for human actions on node groups or IRSA remediation.
- CloudTrail data events on security-sensitive S3/Terraform state buckets; correlate with EKS incident timestamps.
- Session policies and short session durations for support vendor roles to limit lateral movement potential.
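A time-limited, MFA-gated session like the first bullet describes can be requested with a single STS call. This is a sketch: `break_glass_session` is a hypothetical wrapper, and it assumes the role's trust policy already requires MFA.

```shell
# Hypothetical wrapper: request a 15-minute break-glass session with MFA.
# $1 = role ARN, $2 = MFA device ARN, $3 = current MFA code
break_glass_session() {
  aws sts assume-role \
    --role-arn "$1" \
    --role-session-name "break-glass-$(date -u +%Y%m%dT%H%M%SZ)" \
    --serial-number "$2" \
    --token-code "$3" \
    --duration-seconds 900 \
    --query 'Credentials' \
    --output json
}
```

The timestamped session name makes the session easy to find in CloudTrail when correlating with incident timelines.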
IAM Policy Simulation Snippets
# Human IAM role — deterministic unit tests before widening prod privileges.
ROLE_ARN="${ROLE_ARN:?set ROLE_ARN to the role under test}"
aws iam simulate-principal-policy \
--policy-source-arn "$ROLE_ARN" \
--action-names ecs:DescribeServices \
--resource-arns "arn:aws:ecs:REGION:123456789012:service/cluster/svc"
# SCP overlay awareness — org trail account must run org-level simulation separately.
AWS_PROFILE=management-org aws organizations list-policies --filter SERVICE_CONTROL_POLICY
# Quick inline deny hunting for node autoscaler regressions — spot missing autoscaling verbs.
ACTIONS=(
ec2:DescribeLaunchTemplateVersions
autoscaling:SetDesiredCapacity
autoscaling:DescribeAutoScalingGroups
autoscaling:TerminateInstanceInAutoScalingGroup
)
for verb in "${ACTIONS[@]}"; do
echo "checking $verb"
aws iam simulate-principal-policy --policy-source-arn "$ROLE_ARN" \
--action-names "$verb" || true
done
Operational Metrics To Dashboard
- STS AssumeRole success vs failure rate tagged by IAM role ARN.
- ECR image pull durations split by repo and node AZ.
- ELB IdleTimeout resets correlating HTTP 504 customer reports.
- VPC Flow Logs REJECT records (security group or NACL denies) via saved CloudWatch Logs Insights queries.
- IRSA JWT audience mismatch occurrences scraped from controller logs.
- Cross-account KMS Decrypt throttling events, alerting on FinOps dashboards.
- apiserver aggregated admission webhook latency percentile budgets.
- Node kubelet IAM credential provider plugin failures when adopting Pod Identity hybrid.
- NLB unhealthy target counts matching cluster upgrade windows.
- Route53 RRSet change backlog length for noisy ExternalDNS deploys.
- Cluster Autoscaler drains blocked by PDBs, counted to catch stuck scale-down loops.
- Pod Security labeled namespace coverage percentage.
- WAF blocked requests compared against the expected deflection in application 4xx KPIs.
- CloudTrail anomaly detection: reasons for suppressed findings, to confirm audit trail completeness.
- Permission boundary denies per CI pipeline role, to tie IaC regressions back to Terraform CI ownership.
- Ingress controller reconcile queue depth alerting before AWS API storm.
SCP Illustration (Non-Authoritative)
Coordinate this with organization SCP guidance, and simulate in the management account before activating.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "DenyDestructiveIamWithoutElevation",
      "Effect": "Deny",
      "Action": ["iam:*"],
      "Resource": "*",
      "Condition": {
        "StringNotEqualsIfExists": { "iam:PassedToService": "" },
        "BoolIfExists": { "aws:MultiFactorAuthPresent": "false" }
      }
    },
    {
      "Sid": "RequireApprovedPathsForNetworking",
      "Effect": "Deny",
      "Action": ["ec2:AuthorizeSecurityGroupIngress"],
      "Resource": "*",
      "Condition": {
        "ArnNotLikeIfExists": { "ec2:Vpc": "arn:aws:ec2:REGION:123456789012:vpc/enforced-training-vpc*" }
      }
    }
  ]
}
Gotchas
- Widening the node IAM role masks missing IRSA and creates blast-radius incidents.
- Copy-paste trust policies from another cluster without updating issuer URL → silent failures.
- Removing default cluster SG egress blocks image pulls hitting private registries over unexpected paths.
- IAM policy simulator does not emulate IRSA JWT claims fully — rely on STS + CloudTrail trails.
- Overlapping LB SG rules + stray 0.0.0.0/0 exposures often come from iterative debugging — audit regularly.
- Pod SG per Pod without IP prefix tuning can exhaust ENI/SG quotas in large fleets.
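The copy-paste gotcha about issuer URLs can be checked mechanically by comparing a cluster's issuer with a role's trust policy. This is a sketch under stated assumptions: `trust_matches_cluster_issuer` is a hypothetical name, and it needs eks:DescribeCluster and iam:GetRole.

```shell
# Hypothetical check: does a role's trust policy reference this cluster's
# OIDC issuer? Returns 0 on match, 1 on mismatch, 2 if the lookup fails.
# $1 = cluster name, $2 = role name
trust_matches_cluster_issuer() {
  _issuer=$(aws eks describe-cluster --name "$1" \
    --query 'cluster.identity.oidc.issuer' --output text) || return 2
  # Normalize: drop the scheme and one trailing slash.
  _host=${_issuer#https://}
  _host=${_host%/}
  # The provider ARN ends with the issuer host path, followed by a quote.
  aws iam get-role --role-name "$2" \
    --query 'Role.AssumeRolePolicyDocument' --output json |
    grep -qF "oidc-provider/$_host\""
}
```

Running it for each IRSA role after a cluster rebuild is one way to catch trust policies that still point at the old issuer.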