Cluster Autoscaler

TL;DR

Cluster Autoscaler (CA) is the original Kubernetes SIG Autoscaling project: it scales node groups up when pods are unschedulable and scales them down when nodes are underutilised.
It works by resizing cloud-provider node groups (AWS ASG, GCP MIG, Azure VMSS, OpenStack, vSphere, Equinix Metal, Alibaba, and others).
It is slower than Karpenter (2-4 minute scale-up vs under a minute) and requires pre-defined node groups per instance type, but it is cloud-agnostic and has been battle-tested for nearly a decade.
It is still the right choice on clouds where Karpenter has no production provider, in regulated environments that require explicit ASG audit trails, and on managed Kubernetes services that bundle CA by default.

How CA Works

Cluster Autoscaler watches the Kubernetes API for pods that the scheduler has marked unschedulable (they remain in the Pending state). If a pending pod's resource requests or affinity rules cannot be satisfied by existing nodes, CA simulates which configured node groups could host the pod and asks the cloud provider to increase the chosen group's size by as many instances as the simulation says are needed, up to the group's maximum. Conversely, CA monitors node utilisation, measured as the sum of pod requests divided by the node's allocatable capacity, and removes nodes that have been below the configured threshold (default 50%) for a configurable period (default 10 minutes), provided all pods can move elsewhere.
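The scale-down rule described above can be sketched in a few lines. This is an illustrative model, not CA's actual code: the `NodeUsage` type and both function names are hypothetical, the defaults mirror the 50% threshold and 10-minute period mentioned in the text, and the check that all pods can move elsewhere is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NodeUsage:
    """Hypothetical snapshot of one node's requested vs allocatable resources."""
    name: str
    cpu_requested_m: int        # sum of pod CPU requests, in millicores
    cpu_allocatable_m: int
    mem_requested_mi: int       # sum of pod memory requests, in MiB
    mem_allocatable_mi: int
    # Time (seconds) at which the node first dropped below the threshold.
    below_threshold_since_s: Optional[float] = None


def utilisation(node: NodeUsage) -> float:
    """Utilisation is based on requests, not live usage; take the larger ratio."""
    cpu = node.cpu_requested_m / node.cpu_allocatable_m
    mem = node.mem_requested_mi / node.mem_allocatable_mi
    return max(cpu, mem)


def is_scale_down_candidate(node: NodeUsage, now_s: float,
                            threshold: float = 0.5,
                            unneeded_s: float = 600.0) -> bool:
    """True if the node has stayed under the threshold for the whole period."""
    if utilisation(node) >= threshold:
        return False
    if node.below_threshold_since_s is None:
        return False
    return now_s - node.below_threshold_since_s >= unneeded_s
```

A node that passes this check is only a candidate; in the real autoscaler it is removed only after a simulation confirms its pods fit on the remaining nodes.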

Node Groups

The fundamental abstraction is the node group — a cloud-provider construct that represents a set of identical instances (AWS Auto Scaling Group, GCP Managed Instance Group, Azure VMSS). CA scales these groups; it does not provision individual instances directly. This is the root of both CA's strengths (auditability, well-understood cloud primitives) and its weaknesses (you must pre-create a group per instance type you want to use).
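To make the group-level model concrete, here is a minimal sketch of how a scale-up decision reduces to a new target size for a group. The `NodeGroup` type and the first-fit packing are assumptions made for illustration; the real autoscaler runs a full scheduler simulation that also accounts for affinity, taints, and existing nodes.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class NodeGroup:
    """Hypothetical view of a cloud node group: identical nodes, bounded size."""
    name: str
    cpu_m: int          # allocatable CPU per node, in millicores
    mem_mi: int         # allocatable memory per node, in MiB
    min_size: int
    max_size: int
    current_size: int


def nodes_needed(group: NodeGroup, pods: List[Tuple[int, int]]) -> int:
    """First-fit pack pending pods (cpu_m, mem_mi) onto fresh nodes of this group."""
    free = []  # remaining [cpu, mem] on each hypothetical new node
    for cpu, mem in pods:
        if cpu > group.cpu_m or mem > group.mem_mi:
            continue  # this pod can never fit the group's node shape
        for slot in free:
            if slot[0] >= cpu and slot[1] >= mem:
                slot[0] -= cpu
                slot[1] -= mem
                break
        else:
            # No existing new node has room, so open another one.
            free.append([group.cpu_m - cpu, group.mem_mi - mem])
    return len(free)


def scale_up_target(group: NodeGroup, pods: List[Tuple[int, int]]) -> int:
    """The only output is a new desired size, clamped to the group's maximum."""
    return min(group.current_size + nodes_needed(group, pods), group.max_size)
```

Note that the result is a single integer handed to the cloud provider. Which instances are launched is decided by the group's own template, which is why a pod that fits no pre-created group stays pending.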

Expanders

When multiple node groups could satisfy a pending pod, CA's expander strategy picks one. Available expanders:

random: pick any qualifying group (default; rarely the right choice).
most-pods: pick the group that would schedule the most pending pods.
least-waste: pick the group whose resulting node has the smallest unused capacity.
price (GCE, GKE, and Equinix Metal only; not available on AWS): pick the cheapest qualifying group.
priority: explicit user-defined priority list.

The default `random` expander ignores both cost and fit, which tends to produce lopsided spend. Switch to `least-waste` or `priority` in production: it is a one-line change to the `--expander` flag and usually pays off quickly.
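The differences between the strategies are easiest to see side by side. The sketch below models each expander as a function over candidate options; the `Option` type, its fields, and the `choose` helper are hypothetical names for illustration, and the price expander is omitted because it depends on provider pricing data.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Option:
    """Hypothetical result of simulating a scale-up of one node group."""
    group: str
    pods_scheduled: int     # pending pods that would fit after the scale-up
    wasted_cpu: float       # fraction of new CPU left unrequested
    wasted_mem: float       # fraction of new memory left unrequested
    priority: int = 0       # user-assigned; higher value wins


def pick_random(options: List[Option]) -> Option:
    return random.choice(options)


def most_pods(options: List[Option]) -> Option:
    return max(options, key=lambda o: o.pods_scheduled)


def least_waste(options: List[Option]) -> Option:
    # Compare idle CPU first and use idle memory as the tie-breaker.
    return min(options, key=lambda o: (o.wasted_cpu, o.wasted_mem))


def by_priority(options: List[Option]) -> Option:
    return max(options, key=lambda o: o.priority)


EXPANDERS: Dict[str, Callable[[List[Option]], Option]] = {
    "random": pick_random,
    "most-pods": most_pods,
    "least-waste": least_waste,
    "priority": by_priority,
}


def choose(options: List[Option], expander: str = "random") -> Option:
    """Select one candidate group using the named strategy."""
    return EXPANDERS[expander](options)
```

With one small, tightly packed group and one large, half-empty group as candidates, `least-waste` and `most-pods` can disagree, which is exactly the trade-off the choice of expander encodes.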

Cluster Autoscaler vs Karpenter

Property	Cluster Autoscaler	Karpenter
Model	Scale fixed node groups	Provision per-pod
Scale-up latency	2-4 minutes	30-60 seconds
Instance variety	Pre-defined per group	Dynamic from family list
Cloud support	10+ providers	AWS GA, Azure GA, others dev
Consolidation	Limited	Continuous
Maturity	2017	2021 (donated to Kubernetes SIG Autoscaling in 2023)
Audit trail	ASG operations	Direct instance-launch API calls (e.g. EC2 CreateFleet)

Where CA Still Wins

On clouds without a production Karpenter provider (most of Europe's sovereign clouds, OVH, Hetzner, Equinix Metal, and many private-cloud installs), CA remains the only working option. In tightly regulated environments where every scaling action must map to a named ASG with its own change-management ticket, CA's explicit group model is an asset, not a liability. And on managed Kubernetes services that bundle a CA-based autoscaler (GKE, AKS), the bundled CA is usually fine for steady-state workloads. EKS Auto Mode is the exception among managed offerings: it is built on Karpenter rather than CA.

Operational Notes

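Pod priority and PodDisruptionBudgets are honoured during scale-down. Pods annotated with `cluster-autoscaler.kubernetes.io/safe-to-evict: "false"` block scale-down of their node, which is useful for stateful pods and dangerous if forgotten. DaemonSet pods do not block scale-down. Pods using local storage do block it unless they are explicitly annotated as safe to evict. For GPU node groups, set a high `scaleDownUtilizationThreshold` (e.g. 0.65; the corresponding CA flags are `--scale-down-utilization-threshold` and, for accelerator nodes, `--scale-down-gpu-utilization-threshold`) so that a half-utilised H100 node becomes a removal candidate rather than being kept. Because evicting a training job is worse than paying for an idle GPU, protect training pods with the `safe-to-evict: "false"` annotation or a PodDisruptionBudget.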
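The blocking rules above can be summarised as a small decision function. This is a simplified sketch covering only the rules named in this section; the `Pod` type and function names are hypothetical, and PodDisruptionBudgets and pod priority are not modelled.

```python
from dataclasses import dataclass, field
from typing import Dict, List

SAFE_TO_EVICT = "cluster-autoscaler.kubernetes.io/safe-to-evict"


@dataclass
class Pod:
    """Hypothetical, minimal view of a pod for scale-down purposes."""
    name: str
    annotations: Dict[str, str] = field(default_factory=dict)
    owned_by_daemonset: bool = False
    uses_local_storage: bool = False


def blocks_scale_down(pod: Pod) -> bool:
    """Return True if this pod prevents removal of the node it runs on."""
    if pod.owned_by_daemonset:
        return False  # DaemonSet pods never block scale-down
    value = pod.annotations.get(SAFE_TO_EVICT)
    if value == "false":
        return True   # explicit opt-out pins the node
    if value == "true":
        return False  # explicit opt-in overrides the local-storage rule
    return pod.uses_local_storage


def node_removable(pods: List[Pod]) -> bool:
    """A node is removable only if none of its pods block scale-down."""
    return not any(blocks_scale_down(p) for p in pods)
```

Running a check like this against a cluster's pod list is a quick way to find forgotten `safe-to-evict: "false"` annotations that keep otherwise idle nodes alive.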

References

Cluster Autoscaler on GitHub · GitHub
Cluster Autoscaler FAQ · Kubernetes
