Fix: Kubernetes exceeded quota / Pod Stuck in Pending Due to Resource Quota

Q: How do I fix "Kubernetes exceeded quota / Pod Stuck in Pending Due to Resource Quota"?

How to fix Kubernetes 'exceeded quota' errors and pods stuck in Pending, caused by exhausted namespace resource quotas, missing resource requests, or LimitRange constraints.

The Error

You deploy a pod or deployment and it stays in Pending with no apparent reason, or you get:

Error from server (Forbidden): error when creating "deployment.yaml":
pods "my-app-7d9f6b4c8-xk2pq" is forbidden: exceeded quota: team-quota,
requested: cpu=1,memory=1Gi, used: cpu=3500m,memory=3584Mi, limited: cpu=4,memory=4Gi

Or from kubectl describe pod:

Events:
  Warning  FailedScheduling  default-scheduler  0/3 nodes are available:
  3 Insufficient cpu. preemption: 0/3 nodes are available:
  3 No preemption victims found for incoming pod.

Or:

Error from server (Forbidden): pods "my-app" is forbidden:
[maximum cpu usage per Pod is 2, but limit is 4,
 maximum memory usage per Pod is 2Gi, but limit is 4Gi]

Why This Happens

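Kubernetes ResourceQuotas cap the total resources (CPU and memory requests and limits, pod counts, PVCs) that objects in a namespace may claim. Quota is charged against declared requests and limits, not live usage. When a namespace quota is exhausted, the API server rejects new pods until existing ones are removed or scaled down.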

LimitRange sets per-pod or per-container min/max constraints. Pods that violate these limits are rejected at admission.

Common causes:

Namespace quota is full: the requests of existing pods already add up to the allocated CPU/memory.
Missing resource requests on pods: if a quota constrains CPU or memory and a pod does not declare the matching requests or limits (and no LimitRange supplies defaults), the pod is rejected because quota enforcement needs explicit values.
LimitRange violation: a container asks for more CPU/memory than the namespace's maximum per container.
Too many objects: quotas can limit the number of pods, services, PVCs, etc.
Wrong namespace: deploying to a namespace that has stricter quotas than the one you intended.

There are two distinct enforcement layers in play and they fail with different messages. Admission-time enforcement (ResourceQuota, LimitRange, validating webhooks) rejects the API request before any pod is created — you see Forbidden immediately on kubectl apply. Scheduling-time enforcement runs in the kube-scheduler after admission accepts the pod; the object is created and lands in Pending with FailedScheduling events. Confusing one for the other leads to the wrong fix. Quota errors do not surface in kubectl describe pod because the pod never made it into etcd; you must look at the apply output or the controller events on the parent Deployment/ReplicaSet.
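As a rough illustration of that split, the sketch below sorts a message into the layer that most likely produced it. The function name classify_failure is hypothetical, and the matching relies only on the message wording shown in the error examples above, so treat it as a starting point rather than an exhaustive classifier.

```shell
# classify_failure: hypothetical helper, not part of kubectl.
# Takes one error or event message and prints the enforcement layer
# that most likely produced it, based on the wording of the message.
classify_failure() {
  case "$1" in
    *"exceeded quota"*)
      echo "admission: ResourceQuota" ;;
    *"is forbidden"*"maximum"*|*"is forbidden"*"minimum"*)
      echo "admission: LimitRange" ;;
    *"nodes are available"*)
      echo "scheduling: kube-scheduler" ;;
    *)
      echo "unknown" ;;
  esac
}

classify_failure 'pods "my-app" is forbidden: exceeded quota: team-quota'
classify_failure '0/3 nodes are available: 3 Insufficient cpu.'
```

An admission result points you at the quota or LimitRange fixes; a scheduling result points you at node capacity, taints, or selectors.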

A second subtlety: quotas can be scoped. A ResourceQuota with a scopeSelector (for example scopeName PriorityClass, operator In, values [high]) only counts pods in that scope. A namespace can have multiple quotas, each scoped differently, and a pod must fit under every quota whose scope it matches. This is how multi-tenant clusters layer “team budget” and “priority class budget” on the same namespace. When the failing message names a quota you did not write, run kubectl get resourcequota -n NS -o yaml and look at every scopes/scopeSelector block.
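For reference, a scoped quota might look like the sketch below. The quota name high-priority-quota, the priority class high, and the hard values are all assumptions chosen for illustration; the snippet only writes the manifest to a local file so you can review it before applying anything.

```shell
# Write a hypothetical scoped quota to a file for review.
# "high-priority-quota" and the priority class "high" are illustrative
# assumptions; substitute names and values from your own cluster.
cat > scoped-quota.yaml <<'EOF'
apiVersion: v1
kind: ResourceQuota
metadata:
  name: high-priority-quota
  namespace: your-namespace
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 2Gi
    pods: "10"
  scopeSelector:
    matchExpressions:
    - scopeName: PriorityClass
      operator: In
      values: ["high"]
EOF

# After reviewing the file: kubectl apply -f scoped-quota.yaml
```

Only pods whose priorityClassName is high would be charged against this quota; other pods in the namespace are counted by whatever unscoped quotas exist.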

Platform and Environment Differences

The same Kubernetes API is exposed by every distribution, but defaults, admission policies, and scheduling behavior diverge across clouds and forks. Understanding which lever to pull saves hours when the same manifest works on one cluster and fails on another:

Vanilla Kubernetes vs OpenShift. OpenShift wraps namespaces in Project objects and applies cluster-level ClusterResourceQuota and per-project AppliedClusterResourceQuota. The kubectl describe quota output you expect may live under oc describe appliedclusterresourcequota. OpenShift also installs a default LimitRange that injects requests if you omit them, masking the “missing requests” error you would see on plain Kubernetes.
EKS, GKE, and AKS managed defaults. Amazon EKS clusters ship with no namespace quota by default; GKE Autopilot clusters apply tight LimitRange defaults plus a per-namespace pod-count cap; AKS provides quota templates that you can enable in the portal. If the same manifest is rejected only on one cloud, dump kubectl get resourcequota,limitrange -A on both clusters and diff the output against the cluster you migrated from.
Node swap and resource accounting. The kubelet supports opt-in swap on Linux nodes through the NodeSwap feature (alpha since 1.22, beta since 1.28), configured with --fail-swap-on=false and memorySwap.swapBehavior=LimitedSwap. Memory limits remain authoritative, but cgroup v2 changes how memory pressure is reported to the kubelet. If your cluster moved from cgroup v1 to v2, expect different eviction timing for memory-burst workloads and adjust requests upward rather than trusting the old headroom.
QoS classes shape scheduling outcomes. A pod is Guaranteed only when every container has requests == limits for CPU and memory. Burstable pods declare some requests or limits without meeting that bar. BestEffort pods declare none. ResourceQuota charges requests.cpu and requests.memory, so BestEffort pods consume no compute quota (they still count toward pods) and are evicted first under node pressure. A quota that constrains requests.cpu or requests.memory rejects pods that omit those requests, so BestEffort pods can only exist in namespaces whose quotas leave compute unconstrained. A namespace policy that mixes Guaranteed production pods with BestEffort jobs will see the jobs disappear during noisy-neighbor bursts.
Multi-tenant clusters and namespace fan-out. The “one tenant per namespace” pattern works only if every namespace has its own quota; otherwise a runaway tenant exhausts cluster-wide node capacity. Larger tenants often use the Hierarchical Namespace Controller (HNC) or Capsule to layer parent-child quotas. Quota messages from those tools include the parent namespace name, so read the full error string before debugging the child.
kube-scheduler vs autoscalers and custom schedulers. Karpenter, Cluster Autoscaler, and ack/scheduler-plus alter when nodes are added in response to FailedScheduling. If pods sit in Pending for minutes while the autoscaler provisions capacity, the symptom looks like a quota problem but is actually node capacity. Confirm by reading scheduler events and node pool sizing before raising the quota.
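The QoS rules in the list above can be sketched for the simple case of a single-container pod. The helper name qos_class is hypothetical, and it compares quantities as plain strings (an assumption made to keep the sketch short), so 500m and 0.5 are treated as different values.

```shell
# qos_class: hypothetical helper for a single-container pod.
# Args: cpu_request cpu_limit mem_request mem_limit ("" when unset).
# Assumption: quantities are compared as plain strings, so normalize
# units ("0.5" vs "500m") before calling.
qos_class() (
  cpu_lim=$2
  mem_lim=$4
  # Kubernetes copies a limit into a missing request of the same resource.
  cpu_req=${1:-$cpu_lim}
  mem_req=${3:-$mem_lim}
  if [ -z "$cpu_req$cpu_lim$mem_req$mem_lim" ]; then
    echo BestEffort
  elif [ -n "$cpu_lim" ] && [ -n "$mem_lim" ] &&
       [ "$cpu_req" = "$cpu_lim" ] && [ "$mem_req" = "$mem_lim" ]; then
    echo Guaranteed
  else
    echo Burstable
  fi
)

qos_class 500m 500m 512Mi 512Mi   # Guaranteed
qos_class 100m 500m 128Mi 512Mi   # Burstable
qos_class "" "" "" ""             # BestEffort
```

For multi-container pods the real rule applies per container: the pod is Guaranteed only if every container qualifies, and BestEffort only if none declares anything.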

Fix 1: Check Current Quota Usage

First, understand how much quota is used and what is available:

# Check all quotas in the namespace
kubectl describe quota -n your-namespace

# Or list quotas
kubectl get resourcequota -n your-namespace

# Example output:
# Name:            team-quota
# Namespace:       production
# Resource         Used    Hard
# --------         ----    ----
# cpu              3500m   4
# memory           3584Mi  4Gi
# pods             14      20
# requests.cpu     3500m   4
# requests.memory  3584Mi  4Gi

Compare Used against Hard to see how close you are to the limit.
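Because Used and Hard are often printed in different units (3500m against 4), the comparison is easier after converting both to millicores. The following is a minimal sketch with hypothetical helper names (to_millicores, cpu_headroom); it handles only plain and m-suffixed CPU quantities.

```shell
# to_millicores: convert a Kubernetes CPU quantity ("500m", "4", "0.5")
# to integer millicores. Hypothetical helper, not a kubectl feature.
to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  awk -v c="$1" 'BEGIN { printf "%d\n", c * 1000 + 0.5 }' ;;
  esac
}

# cpu_headroom USED HARD: print the millicores still available.
cpu_headroom() {
  echo $(( $(to_millicores "$2") - $(to_millicores "$1") ))
}

cpu_headroom 3500m 4   # prints 500
```

With the example output above, 500 millicores remain, so a new pod requesting more than 500m of CPU would be rejected.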

Check what is consuming resources:

# See live CPU and memory usage per pod (needs metrics-server; this is usage, not requests)
kubectl top pods -n your-namespace

# Or list the requests that count against the quota (first container of each pod only)
kubectl get pods -n your-namespace -o json | \
  jq -r '.items[] | .metadata.name + " cpu:" + .spec.containers[0].resources.requests.cpu + " mem:" + .spec.containers[0].resources.requests.memory'

Fix 2: Add Resource Requests to Your Pod Spec

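If a namespace has a ResourceQuota that constrains requests.cpu or requests.memory (or the matching limits), every pod must declare those values, either explicitly or through LimitRange defaults. Pods without them are rejected at admission: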

Broken — no resource requests:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: app
        image: myapp:latest
        # No resources block: rejected if the quota constrains CPU or memory

Fixed — add resource requests and limits:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
  namespace: your-namespace
spec:
  replicas: 3
  selector:           # required by apps/v1; must match the template labels
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
      - name: app
        image: myapp:latest
        resources:
          requests:
            cpu: "100m"      # 0.1 CPU cores
            memory: "128Mi"  # 128 MiB
          limits:
            cpu: "500m"      # 0.5 CPU cores
            memory: "512Mi"  # 512 MiB

CPU units:

1 = 1 full CPU core
500m = 0.5 CPU cores (500 millicores)
100m = 0.1 CPU cores

Memory units:

128Mi = 128 mebibytes (use Mi not MB)
1Gi = 1 gibibyte
512M = 512 megabytes (slightly different from Mi)
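To make the Mi versus M difference concrete, here is a small conversion sketch. The helper name mem_to_bytes is hypothetical, and it covers only integer quantities with the common suffixes.

```shell
# mem_to_bytes: convert a Kubernetes memory quantity to bytes.
# Hypothetical helper covering only integer values with common suffixes.
mem_to_bytes() {
  case "$1" in
    *Ki) echo $(( ${1%Ki} * 1024 )) ;;
    *Mi) echo $(( ${1%Mi} * 1024 * 1024 )) ;;
    *Gi) echo $(( ${1%Gi} * 1024 * 1024 * 1024 )) ;;
    *k)  echo $(( ${1%k} * 1000 )) ;;
    *M)  echo $(( ${1%M} * 1000 * 1000 )) ;;
    *G)  echo $(( ${1%G} * 1000 * 1000 * 1000 )) ;;
    *)   echo "$1" ;;   # no suffix: already bytes
  esac
}

mem_to_bytes 512Mi   # 536870912
mem_to_bytes 512M    # 512000000
```

512M is about 4.6% smaller than 512Mi, which is enough to matter when a request sits right at a quota or LimitRange boundary.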

Pro Tip: Set requests lower than limits. Kubernetes schedules based on requests (guaranteed resources) and enforces limits (maximum allowed). A container with requests.cpu: 100m and limits.cpu: 500m is scheduled on a node with 100m free but can burst to 500m if available.

Fix 3: Increase the Namespace Quota

If the quota is legitimately too low for your workload, increase it (this requires permission to modify ResourceQuota objects, which is usually reserved for cluster administrators):

# View the current quota definition
kubectl get resourcequota team-quota -n your-namespace -o yaml

# quota.yaml: edit the hard limits, then apply the file
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: your-namespace
spec:
  hard:
    requests.cpu: "8"         # Increased from 4
    requests.memory: "8Gi"    # Increased from 4Gi
    limits.cpu: "16"
    limits.memory: "16Gi"
    pods: "40"                # Increased from 20

kubectl apply -f quota.yaml

# Or patch directly
kubectl patch resourcequota team-quota -n your-namespace \
  --patch '{"spec":{"hard":{"requests.cpu":"8","requests.memory":"8Gi"}}}'

Fix 4: Scale Down or Delete Unused Workloads

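Free up quota by scaling down or removing workloads you no longer need. Pods in a terminal phase (Succeeded or Failed) no longer count against CPU, memory, or pods quota, so deleting them mainly helps with object-count quotas such as count/pods and count/jobs.batch: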

# Find pods not running (completed, failed, evicted)
kubectl get pods -n your-namespace --field-selector=status.phase!=Running

# Delete completed jobs (status.successful is the field selector Jobs support)
kubectl delete jobs -n your-namespace --field-selector=status.successful=1

# Scale down a deployment temporarily
kubectl scale deployment my-app -n your-namespace --replicas=1

# Delete a deployment entirely
kubectl delete deployment old-app -n your-namespace

Find the largest resource consumers:

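# Sort pods by total CPU request (millicores, summed over all containers)
kubectl get pods -n your-namespace -o json | jq -r '
  def millicores: if . == null then 0
    elif endswith("m") then (.[:-1] | tonumber)
    else (tonumber * 1000) end;
  .items[] |
  [.metadata.name,
   ([.spec.containers[].resources.requests.cpu | millicores] | add),
   ([.spec.containers[].resources.requests.memory // "0"] | join("+"))] |
  @tsv' | sort -k2,2 -rn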

Fix 5: Fix LimitRange Violations

A LimitRange sets constraints on individual pods/containers. If your container exceeds the maximum defined by the LimitRange, it is rejected:

# Check LimitRange in the namespace
kubectl describe limitrange -n your-namespace

# Example output:
# Type        Resource  Min   Max    Default Request  Default Limit
# ----        --------  ---   ---    ---------------  -------------
# Container   cpu       50m   2      100m             500m
# Container   memory    64Mi  2Gi    128Mi            512Mi

If your container's CPU limit (or request) is 4 but the LimitRange max is 2, it is rejected.

Fix — adjust your resource requests to stay within limits:

resources:
  requests:
    cpu: "500m"   # Within LimitRange max of 2
    memory: "1Gi" # Within LimitRange max of 2Gi
  limits:
    cpu: "2"      # At or below LimitRange max
    memory: "2Gi"
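Before editing a manifest, it can help to check a value against the Min and Max columns from kubectl describe limitrange. The sketch below does this for CPU only; within_cpu_range is a hypothetical helper name, and it accepts plain or m-suffixed CPU quantities.

```shell
# within_cpu_range MIN MAX VALUE: compare a CPU quantity against
# LimitRange bounds. Hypothetical helper; all arguments are CPU
# quantities such as "50m" or "2".
within_cpu_range() (
  milli() {
    case "$1" in
      *m) echo "${1%m}" ;;
      *)  awk -v c="$1" 'BEGIN { printf "%d\n", c * 1000 + 0.5 }' ;;
    esac
  }
  min=$(milli "$1"); max=$(milli "$2"); val=$(milli "$3")
  if [ "$val" -lt "$min" ]; then
    echo "below min"
  elif [ "$val" -gt "$max" ]; then
    echo "above max"
  else
    echo "ok"
  fi
)

within_cpu_range 50m 2 4      # above max
within_cpu_range 50m 2 500m   # ok
```

The first call mirrors the rejected case (a value of 4 against a max of 2); the second mirrors the adjusted request.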

Fix — update the LimitRange if the constraint is too restrictive:

apiVersion: v1
kind: LimitRange
metadata:
  name: container-limits
  namespace: your-namespace
spec:
  limits:
  - type: Container
    max:
      cpu: "4"      # Increased max
      memory: "4Gi"
    min:
      cpu: "50m"
      memory: "64Mi"
    defaultRequest:
      cpu: "100m"
      memory: "128Mi"
    default:
      cpu: "500m"
      memory: "512Mi"

LimitRange defaults (defaultRequest and default) are applied to containers that do not specify resource requests/limits. This is how containers without resource specs are handled when a LimitRange exists — they get the defaults.

Fix 6: Debug Pod Scheduling Failures

When a pod is stuck in Pending, kubectl describe shows the scheduling reason:

kubectl describe pod my-app-xxxxx -n your-namespace

Look for the Events section at the bottom:

Events:
  Warning  FailedScheduling  0/5 nodes are available:
    2 Insufficient cpu,
    3 node(s) had untolerated taint {node-role.kubernetes.io/master: }.

Common scheduling failure reasons:

Message	Cause	Fix
`Insufficient cpu`	No node has enough CPU	Scale cluster or reduce requests
`Insufficient memory`	No node has enough memory	Scale cluster or reduce requests
`exceeded quota`	Namespace quota full (admission error: appears in `kubectl apply` output or ReplicaSet events, not pod events)	Free quota or increase it
`didn't match node selector`	No node matches selector	Fix `nodeSelector` labels
`had taint`	Nodes are tainted	Add tolerations to pod
`volume node affinity conflict`	PVC zone mismatch	Fix storage class zone

Check node capacity:

# See allocatable resources per node
kubectl describe nodes | grep -A 5 "Allocatable:"

# Or check live usage per node (needs metrics-server; shows usage, not allocatable)
kubectl top nodes

Fix 7: Set Up Proper Resource Planning

Prevent quota issues before they happen with a planned resource allocation strategy:

# Create separate namespaces with quotas per team/environment
kubectl create namespace team-a
kubectl create namespace team-b
kubectl create namespace staging
kubectl create namespace production

# staging-quota.yaml — relaxed limits for staging
apiVersion: v1
kind: ResourceQuota
metadata:
  name: staging-quota
  namespace: staging
spec:
  hard:
    requests.cpu: "4"
    requests.memory: "4Gi"
    limits.cpu: "8"
    limits.memory: "8Gi"
    pods: "30"
---
# production-quota.yaml — larger limits for production
apiVersion: v1
kind: ResourceQuota
metadata:
  name: production-quota
  namespace: production
spec:
  hard:
    requests.cpu: "32"
    requests.memory: "64Gi"
    limits.cpu: "64"
    limits.memory: "128Gi"
    pods: "200"
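When sizing the pods value in quotas like these, keep in mind that a rolling update temporarily runs more pods than the replica count. The sketch below estimates that peak, assuming a percentage-based maxSurge that is rounded up, with 25 as the Deployment default; rollout_peak_pods is a hypothetical helper name.

```shell
# rollout_peak_pods REPLICAS [SURGE_PERCENT]: estimate the highest pod
# count a Deployment reaches during a rolling update.
# Assumption: percentage-based maxSurge, rounded up; 25 is the
# Deployment default when maxSurge is not set.
rollout_peak_pods() (
  replicas=$1
  surge_pct=${2:-25}
  surge=$(( (replicas * surge_pct + 99) / 100 ))   # integer ceiling
  echo $(( replicas + surge ))
)

rollout_peak_pods 3       # 4
rollout_peak_pods 10 25   # 13
```

Multiply the peak by the per-pod requests to see how much CPU and memory headroom a rollout needs; a quota sized exactly to the steady-state replica count will block the update.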

Use Vertical Pod Autoscaler (VPA) to automatically right-size resource requests based on actual usage:

# Install VPA using the install script in the kubernetes/autoscaler repository
git clone https://github.com/kubernetes/autoscaler.git
(cd autoscaler/vertical-pod-autoscaler && ./hack/vpa-up.sh)

# Create VPA for a deployment
kubectl apply -f - <<EOF
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
  namespace: your-namespace
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
EOF

Still Not Working?

Check object count quotas. ResourceQuota can also limit the count of objects (Services, ConfigMaps, PVCs). If your quota limits count/pods: 20 and you have 20 pods, new pods are rejected even if there is plenty of CPU/memory available:

kubectl describe quota -n your-namespace | grep count

Check for namespace-level default LimitRange. Some clusters apply a default LimitRange to all namespaces. If your containers have no resource specs, they get the default limits — which may be too small for your workload.

Check for admission webhooks. Custom admission controllers (OPA Gatekeeper, Kyverno) may enforce additional policies beyond standard ResourceQuota and LimitRange. Check webhook configurations:

kubectl get validatingwebhookconfigurations
kubectl get mutatingwebhookconfigurations

Inspect scoped quotas on terminating and PriorityClass pods. A ResourceQuota with scopes: [Terminating] only counts pods that set activeDeadlineSeconds, and scopes: [BestEffort] only counts pods with no requests or limits. If your manifest moves between scopes (for example, a CronJob’s pod template changes priority), the quota that applies changes silently. Dump the full quota spec and compare the scopes and scopeSelector blocks:

kubectl get resourcequota -n your-namespace -o yaml | grep -A 10 -E 'scopes:|scopeSelector:'

Verify the rejecting controller, not just the namespace. ReplicaSets, DaemonSets, and StatefulSets all retry pod creation on admission failure. The pod object you expect may never appear; only the parent shows the rejection in its kubectl describe events. Check the controller, not the missing pod, when nothing shows up in kubectl get pods.
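When reading controller events, the first thing to pull out is which quota was exceeded. The following is a minimal sketch; quota_from_message is a hypothetical helper that relies on the "exceeded quota: NAME," wording shown in the error at the top of this article.

```shell
# quota_from_message: read event or error text on stdin and print the
# name of each quota reported as exceeded. Hypothetical helper that
# depends on the "exceeded quota: NAME," message wording.
quota_from_message() {
  sed -n 's/.*exceeded quota: \([^,]*\),.*/\1/p' | sort -u
}

echo 'pods "my-app-7d9f6b4c8-xk2pq" is forbidden: exceeded quota: team-quota, requested: cpu=500m' |
  quota_from_message
```

In practice you would pipe controller output into it, for example kubectl describe replicaset -n your-namespace | quota_from_message, and then inspect each named quota with kubectl describe quota.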

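Watch for stale evicted pods consuming object-count quota. Evicted pods stay in the API in the Failed phase until they are deleted or garbage-collected. They stop counting against pods: N, which only tracks pods in a non-terminal state, but they still count against count/pods: N. Clean them with kubectl delete pods -n NS --field-selector=status.phase=Failed before raising the limit.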

For pods that are Pending due to image pull issues rather than quota, see Fix: Kubernetes ImagePullBackOff. For pods that crash after starting, see Fix: Kubernetes CrashLoopBackOff. For HPA-related scheduling problems, see Fix: Kubernetes HPA Not Scaling. For nodes with restart loops caused by OOM, see Fix: Docker Exited 137 OOMKilled.