TL;DR
- Cluster Autoscaler (CA) is the original Kubernetes SIG-autoscaling project — scales node groups up when pods are unschedulable and scales them down when nodes are underutilised.
- Works by extending cloud-provider node groups (AWS ASG, GCP MIG, Azure VMSS, OpenStack, vSphere, Equinix Metal, Alibaba, and others).
- Slower than Karpenter (2-4 minute scale-up vs ~1 minute) and requires pre-defined node groups per instance type, but cloud-agnostic and battle-tested for nearly a decade.
- Still the right choice on clouds where Karpenter has no production provider, in regulated environments that require explicit ASG audit trails, and on managed Kubernetes services that bundle CA by default.
How CA Works
Cluster Autoscaler watches the Kubernetes API for Pending pods that the scheduler has marked unschedulable. If a pending pod's resource requests or affinity rules cannot be satisfied by existing nodes, CA simulates which configured node groups could host the pod and asks the cloud provider to increase the chosen group's target size by however many instances the simulation calls for. Conversely, CA monitors node utilisation, measured as the sum of pod resource requests over the node's allocatable capacity rather than live usage, and removes nodes that have been below the configured threshold (default 50%) for a configurable period (default 10 minutes), provided all their pods can move elsewhere.
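The two decisions described above can be sketched in a few lines. This is a simplified illustration, not CA's actual code: the `Pod` and `Node` classes, the function names, and the greedy first-fit placement are all assumptions made for the example, and affinity rules, taints, and disruption budgets are ignored.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Pod:
    name: str
    cpu_m: int   # CPU request in millicores
    mem_mi: int  # memory request in MiB


@dataclass
class Node:
    name: str
    cpu_m: int   # allocatable CPU in millicores
    mem_mi: int  # allocatable memory in MiB
    pods: List[Pod] = field(default_factory=list)
    low_since: Optional[float] = None  # time (s) utilisation first fell below threshold


def free(node):
    """Allocatable capacity minus the sum of pod requests."""
    used_cpu = sum(p.cpu_m for p in node.pods)
    used_mem = sum(p.mem_mi for p in node.pods)
    return node.cpu_m - used_cpu, node.mem_mi - used_mem


def needs_scale_up(pod, nodes):
    """True when no existing node has room for the pod's requests."""
    return not any(
        pod.cpu_m <= cpu and pod.mem_mi <= mem
        for cpu, mem in (free(n) for n in nodes)
    )


def utilisation(node):
    """Request-based utilisation: the larger of the CPU and memory fractions."""
    cpu, mem = free(node)
    return max(1 - cpu / node.cpu_m, 1 - mem / node.mem_mi)


def scale_down_candidate(node, others, now, threshold=0.5, unneeded_s=600):
    """Low utilisation for long enough, and every pod fits on another node."""
    if utilisation(node) >= threshold:
        return False
    if node.low_since is None or now - node.low_since < unneeded_s:
        return False
    remaining = [list(free(n)) for n in others]  # mutable copies of free capacity
    for pod in node.pods:
        for r in remaining:  # greedy first-fit, an assumption of this sketch
            if pod.cpu_m <= r[0] and pod.mem_mi <= r[1]:
                r[0] -= pod.cpu_m
                r[1] -= pod.mem_mi
                break
        else:
            return False  # this pod has nowhere to go
    return True
```

The defaults `threshold=0.5` and `unneeded_s=600` mirror the 50% and 10-minute defaults mentioned above.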
Node Groups
The fundamental abstraction is the node group — a cloud-provider construct that represents a set of identical instances (AWS Auto Scaling Group, GCP Managed Instance Group, Azure VMSS). CA scales these groups; it does not provision individual instances directly. This is the root of both CA's strengths (auditability, well-understood cloud primitives) and its weaknesses (you must pre-create a group per instance type you want to use).
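As a rough mental model, a node group can be reduced to a name, one instance type, and a target size bounded by a minimum and a maximum. The `NodeGroup` class and its `resize` method below are hypothetical names for illustration, not a real cloud-provider or CA API; the point is only that scaling means changing a group's target size within its bounds.

```python
from dataclasses import dataclass


@dataclass
class NodeGroup:
    name: str
    instance_type: str  # one instance type per group in this simplified model
    min_size: int
    max_size: int
    target_size: int

    def resize(self, delta: int) -> int:
        """Change target size by delta, clamped to [min_size, max_size].

        Returns the change actually applied, which may be smaller than delta.
        """
        new_size = max(self.min_size, min(self.max_size, self.target_size + delta))
        applied = new_size - self.target_size
        self.target_size = new_size
        return applied
```

For example, asking a group with `target_size=2` and `max_size=3` to grow by 4 applies only 1, which is why undersized group maximums leave pods pending.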
Expanders
When multiple node groups could satisfy a pending pod, CA's expander strategy picks one. Available expanders:
- random — pick any qualifying group (default; rarely the right choice).
- most-pods — pick the group that would schedule the most pending pods.
- least-waste — pick the group whose resulting node has the smallest unused capacity.
- price (GCE/GKE and Equinix Metal only, per the CA FAQ; not available on AWS) — pick the cheapest qualifying group.
- priority — explicit user-defined priority list.
The default `random` expander tends to produce lopsided spend. Switch to `least-waste` or `priority` in production with the `--expander` flag; it is a one-line change that often pays for itself in the first week.
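The selection logic behind four of these strategies can be sketched as follows. The `Option` class, its fields, and `pick` are hypothetical names for this example; the real expanders work on richer simulation results, and `price` is omitted because it depends on provider pricing data.

```python
import random
from dataclasses import dataclass


@dataclass
class Option:
    group: str
    pods_scheduled: int     # pending pods that would fit if this group grew
    wasted_cpu_frac: float  # fraction of the new node's CPU left idle
    priority: int = 0       # user-assigned; higher wins in this sketch


def pick(options, strategy="random", rng=random):
    """Return the name of the node group chosen by the given strategy."""
    if strategy == "random":
        return rng.choice(options).group
    if strategy == "most-pods":
        return max(options, key=lambda o: o.pods_scheduled).group
    if strategy == "least-waste":
        return min(options, key=lambda o: o.wasted_cpu_frac).group
    if strategy == "priority":
        return max(options, key=lambda o: o.priority).group
    raise ValueError(f"unknown expander: {strategy}")
```

With one small and one large group as candidates, `most-pods` and `least-waste` can disagree: the large group may schedule more pending pods while leaving more of the new node idle.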
Cluster Autoscaler vs Karpenter
| Property | Cluster Autoscaler | Karpenter |
|---|---|---|
| Model | Scale fixed node groups | Provision per-pod |
| Scale-up latency | 2-4 minutes | 30-60 seconds |
| Instance variety | Pre-defined per group | Dynamic from family list |
| Cloud support | 10+ providers | AWS GA, Azure GA, others dev |
| Consolidation | Limited | Continuous |
| Maturity | 2017 | 2021 (joined Kubernetes SIG Autoscaling 2023) |
| Audit trail | ASG operations | Direct instance-launch calls (EC2 CreateFleet on AWS) |
Where CA Still Wins
On clouds without a production Karpenter provider (most of Europe's sovereign clouds, OVH, Hetzner, Equinix Metal, and many private-cloud installs), CA remains the only working option. In tightly regulated environments where every scaling action must map to a named ASG with its own change-management ticket, CA's explicit group model is an asset, not a liability. And on managed Kubernetes services that bundle CA (GKE and AKS, for example), the bundled autoscaler is usually fine for steady-state workloads. EKS Auto Mode is the exception: its managed compute is built on Karpenter rather than CA.
Operational Notes
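Pod priority and disruption budgets are honoured during scale-down. Pods annotated `cluster-autoscaler.kubernetes.io/safe-to-evict: "false"` block scale-down of their node, which is useful for stateful pods and dangerous if forgotten. DaemonSet pods do not block scale-down. Pods using local storage do, unless explicitly annotated `safe-to-evict: "true"`. For GPU node groups, set a high `scaleDownUtilizationThreshold` (e.g. 0.65; the corresponding CA flags are `--scale-down-utilization-threshold` and, for GPU nodes, `--scale-down-gpu-utilization-threshold`). A higher threshold makes more nodes eligible for removal, which is what you want because a half-utilised H100 is too expensive to keep. Evicting a training job mid-run is worse, though, so protect training pods with the `safe-to-evict: "false"` annotation.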
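The blocking rules above can be summarised as a small predicate. This is a simplified sketch: the pod is modelled as a plain dict with hypothetical keys (`annotations`, `owner_kind`, `uses_local_storage`) rather than a real Kubernetes object, and priority and disruption budgets are left out.

```python
SAFE_TO_EVICT = "cluster-autoscaler.kubernetes.io/safe-to-evict"


def blocks_scale_down(pod: dict) -> bool:
    """True if this pod prevents its node from being removed."""
    value = pod.get("annotations", {}).get(SAFE_TO_EVICT)
    if value == "false":
        return True   # explicit opt-out always blocks
    if value == "true":
        return False  # explicit opt-in overrides the local-storage rule
    if pod.get("owner_kind") == "DaemonSet":
        return False  # DaemonSet pods never block
    return bool(pod.get("uses_local_storage", False))


def node_removable(pods) -> bool:
    """A node can be drained only if none of its pods block scale-down."""
    return not any(blocks_scale_down(p) for p in pods)
```

Running a check like this over a cluster's pods is one way to spot forgotten `safe-to-evict: "false"` annotations that are pinning otherwise idle nodes.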