
Scaling Seamlessly with Cluster Autoscaler

Ben Grady

Article updated on March 16, 2026

Key Takeaways

  • The EKS Cluster Autoscaler monitors pending pods every 10 seconds and increases Auto Scaling group capacity when pods cannot be scheduled on existing nodes, while also scaling down underutilized nodes whose pods can be safely rescheduled elsewhere.
  • Setting up EKS Cluster Autoscaler requires tagging Auto Scaling groups for auto-discovery, creating an IAM role with Auto Scaling permissions via IRSA, and deploying the controller with matching cluster name configuration.
  • Common scale-up failures stem from missing ASG tags, IAM permission gaps, node groups at maximum capacity, or pod resource requests exceeding available instance types.
  • Mixed instances policies combining On-Demand and Spot Instances can reduce cluster costs, but instance types within an ASG should have similar CPU and memory specifications to ensure accurate scaling calculations.

EKS Cluster Autoscaler: Setup, Configuration, and Tuning

Amazon Elastic Kubernetes Service (EKS) has become a cornerstone of modern container orchestration. One of the features that makes EKS powerful is its autoscaling capabilities — and the Cluster Autoscaler sits at the center of that story.

Why Autoscaling Matters

Autoscaling in EKS dynamically adjusts the resource capacity of your Kubernetes cluster to match changing application demands. EKS offers several autoscaling options, with the Cluster Autoscaler being a central player in this landscape.

Understanding the EKS Cluster Autoscaler

The EKS Cluster Autoscaler operates via two primary loops:

  • Scale-Up Loop (every 10 seconds): Identifies Pending pods that are unschedulable. It simulates scheduling and, if necessary, increases the Auto Scaling group DesiredCapacity to trigger new instance launches. The Autoscaler selects a node group using the configured expander strategy.
  • Scale-Down Loop: Scans for underutilized nodes. If pods can be safely rescheduled (respecting PDBs and safe-to-evict annotations), it drains the node and decreases the DesiredCapacity to terminate the instance.

The Cluster Autoscaler ensures your cluster has the right number of nodes to handle workloads efficiently. It adjusts worker node group size based on the resource requirements of your pods. For more detail on how scaling decisions work, see the Cluster Autoscaler FAQ.

What IAM Permissions Does the Cluster Autoscaler Need?

The Cluster Autoscaler requires the following AWS IAM permissions. See the AWS documentation on IAM policies for Cluster Autoscaler for the full recommended policy.

  • autoscaling:DescribeAutoScalingGroups — allows the controller to see current group status and tags.
  • autoscaling:DescribeAutoScalingInstances — lets the controller map EC2 instances back to their ASG.
  • autoscaling:DescribeLaunchConfigurations — reads launch config details for capacity planning.
  • autoscaling:SetDesiredCapacity — enables the controller to trigger scale-up and scale-down events.
  • autoscaling:TerminateInstanceInAutoScalingGroup — allows the controller to remove specific underutilized nodes.
  • ec2:DescribeLaunchTemplateVersions — reads launch template specs used in scaling decisions.
  • ec2:DescribeInstanceTypes — retrieves instance type CPU/memory for scheduling simulation.

Start from the AWS-recommended policy, then scope it down to least privilege for your environment.
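Pulling the permissions above together, the policy looks roughly like the following. This is a sketch based on the list in this section; confirm it against the AWS-recommended policy before use:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "ClusterAutoscalerRead",
      "Effect": "Allow",
      "Action": [
        "autoscaling:DescribeAutoScalingGroups",
        "autoscaling:DescribeAutoScalingInstances",
        "autoscaling:DescribeLaunchConfigurations",
        "ec2:DescribeLaunchTemplateVersions",
        "ec2:DescribeInstanceTypes"
      ],
      "Resource": "*"
    },
    {
      "Sid": "ClusterAutoscalerWrite",
      "Effect": "Allow",
      "Action": [
        "autoscaling:SetDesiredCapacity",
        "autoscaling:TerminateInstanceInAutoScalingGroup"
      ],
      "Resource": "*"
    }
  ]
}
```

In production, consider restricting the write actions with resource-tag conditions so the controller can only modify ASGs tagged for this cluster.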

EKS Cluster Autoscaler in Action

EKS Cluster Autoscaler is purpose-built for Amazon EKS clusters. It leverages AWS Auto Scaling groups to manage the underlying EC2 instances in your EKS node groups.

Setting Up the EKS Cluster Autoscaler

Follow these steps to deploy the Cluster Autoscaler on EKS. For the full walkthrough, see Cluster Autoscaler on AWS.

  1. Tag your Auto Scaling groups for auto-discovery (see the auto-discovery section below).
  2. Create an IAM role for the Cluster Autoscaler using IRSA (IAM Roles for Service Accounts). This requires an OIDC provider associated with your EKS cluster, a Kubernetes service account annotated with the IAM role ARN, and a trust policy that scopes access to that service account.
  3. Deploy Cluster Autoscaler (for example via the Helm chart) with the correct auto-discovery flags and a matching Kubernetes cluster name.
  4. Verify it is running by checking logs (kubectl -n kube-system logs deploy/cluster-autoscaler) and confirming your node group min/max settings allow scale-up and scale-down.
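For step 3, a minimal Helm values sketch might look like this. The cluster name, region, and role ARN are placeholders, and the keys follow common cluster-autoscaler chart conventions — verify them against the chart version you deploy:

```yaml
# values.yaml for the cluster-autoscaler Helm chart (sketch; verify keys against your chart version)
autoDiscovery:
  clusterName: my-eks-cluster        # must match the EKS cluster name exactly
awsRegion: us-east-1
rbac:
  serviceAccount:
    create: true
    name: cluster-autoscaler
    annotations:
      # IRSA: bind the pod to the IAM role created in step 2
      eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/cluster-autoscaler
extraArgs:
  expander: least-waste
  balance-similar-node-groups: true
```

A mismatched clusterName here is one of the most common reasons the Autoscaler silently discovers zero node groups.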

Auto Scaling Group Auto-Discovery in EKS

The Cluster Autoscaler can automatically find which node groups to manage using tag-based auto-discovery. This removes the need to hard-code ASG names in the deployment.

Required Tags

Every Auto Scaling group that the Cluster Autoscaler should manage must have these two tags:

k8s.io/cluster-autoscaler/enabled = true
k8s.io/cluster-autoscaler/<cluster-name> = owned

Replace <cluster-name> with the exact name of your EKS cluster. See the auto-discovery setup docs for details.

EKS Managed Node Groups vs. Self-Managed Node Groups

  • EKS managed node groups automatically apply the required tags when you create them through the AWS console, CLI, or Terraform with the correct cluster name. No extra tagging step is needed.
  • Self-managed node groups (where you create and manage the ASG directly) require you to add both tags manually. If these tags are missing, the Cluster Autoscaler will not discover or scale those groups.
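For self-managed groups defined in Terraform, the two tags can be attached directly to the ASG resource like this (resource and cluster names are illustrative):

```hcl
resource "aws_autoscaling_group" "workers" {
  # ... launch template, min/max size, subnets, etc. ...

  tag {
    key                 = "k8s.io/cluster-autoscaler/enabled"
    value               = "true"
    propagate_at_launch = true
  }

  tag {
    key                 = "k8s.io/cluster-autoscaler/my-eks-cluster"
    value               = "owned"
    propagate_at_launch = true
  }
}
```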

Enabling Auto-Discovery in the Deployment

When deploying the Cluster Autoscaler, pass the --node-group-auto-discovery flag:

--node-group-auto-discovery=asg:tag=k8s.io/cluster-autoscaler/enabled=true,k8s.io/cluster-autoscaler/<cluster-name>=owned

This tells the controller to scan for ASGs matching those tags instead of requiring an explicit list.

EKS Autoscaling Methods: HPA and VPA vs. Cluster Autoscaler

Kubernetes offers three main autoscaling approaches. Each operates at a different level:

  • Horizontal Pod Autoscaler (HPA): HPA scales the number of pod replicas based on CPU, memory, or custom metrics. It responds to traffic changes by adding or removing pods.
  • Vertical Pod Autoscaler (VPA): VPA adjusts CPU and memory requests on individual pods based on observed usage. It right-sizes pods so they request what they actually need.
  • Cluster Autoscaler: Operates at the node level. It adds nodes when pods are unschedulable and removes nodes when they are underutilized.

HPA and VPA handle pod-level scaling. The Cluster Autoscaler handles node-level scaling. In most production environments, you combine all three: HPA or VPA adjusts pods, and the Cluster Autoscaler adjusts infrastructure to match.
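To make the division of labor concrete, a typical HPA manifest (illustrative names) scales pods based on CPU, and the Cluster Autoscaler then adds nodes if those extra pods cannot fit on existing capacity:

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa            # illustrative name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```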

How to Configure and Tune the EKS Cluster Autoscaler

Proper configuration is the difference between a responsive cluster and one that wastes money or drops traffic. Here are the key areas:

Core Setup

  • Enable and configure the Cluster Autoscaler within your EKS cluster using the setup steps above.
  • Define scaling policies and thresholds based on your application’s needs. Set appropriate minSize and maxSize on your ASGs.

Expander Strategy

The --expander flag controls how the Autoscaler picks a node group when multiple groups can satisfy a pending pod. Common options:

  • random — picks a node group at random (the default).
  • least-waste — chooses the group that will have the least idle CPU/memory after the pod is scheduled.
  • most-pods — chooses the group that can schedule the most pending pods.
  • priority — uses a ConfigMap to define explicit group priorities.

Set it with: --expander=least-waste
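The least-waste idea can be sketched in a few lines of Python. This is a toy model, not the actual implementation: among groups whose instance type can fit the pod, pick the one that leaves the least idle capacity behind.

```python
def pick_least_waste(groups, pod_cpu, pod_mem):
    """Toy model of the least-waste expander.

    groups: dict mapping group name -> (cpu, mem) capacity of one instance.
    Returns the group whose instance leaves the least idle capacity
    after scheduling the pod, or None if the pod fits nowhere.
    """
    best, best_waste = None, None
    for name, (cpu, mem) in groups.items():
        if cpu < pod_cpu or mem < pod_mem:
            continue  # pod cannot fit on this instance type at all
        # Score leftover capacity as a fraction of the instance size.
        waste = (cpu - pod_cpu) / cpu + (mem - pod_mem) / mem
        if best_waste is None or waste < best_waste:
            best, best_waste = name, waste
    return best

groups = {
    "m5-large":  (2.0, 8.0),    # 2 vCPU, 8 GiB
    "m5-xlarge": (4.0, 16.0),   # 4 vCPU, 16 GiB
}
print(pick_least_waste(groups, pod_cpu=1.5, pod_mem=6.0))  # m5-large wastes less
```

The real expander also simulates bin-packing of all pending pods at once, but the core trade-off — prefer the group with the smallest leftover — is the same.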

Scale-Down Tuning Flags

These flags control how aggressively the Autoscaler removes underutilized nodes. See the Cluster Autoscaler parameters reference for the full list.

  • --scale-down-utilization-threshold (default 0.5) — node CPU/memory utilization below which the node is considered for removal.
  • --scale-down-unneeded-time (default 10m) — how long a node must be underutilized before it is removed.
  • --scale-down-delay-after-add (default 10m) — cooldown period after a scale-up before any scale-down can occur.
  • --scale-down-delay-after-delete (default 0s) — cooldown after a node deletion before the next deletion.
  • --scale-down-delay-after-failure (default 3m) — cooldown after a failed scale-down attempt.
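The utilization threshold can be illustrated with a small sketch. This is a simplification: the controller scores a node by the larger of its CPU and memory request ratios, but real scale-down also checks PDBs, annotations, and the unneeded-time cooldown before acting.

```python
def scale_down_candidate(cpu_requested, cpu_allocatable,
                         mem_requested, mem_allocatable,
                         threshold=0.5):
    """Return True if a node's utilization is below the scale-down threshold.

    Utilization here is the max of the CPU and memory request ratios
    (simplified model of how the Cluster Autoscaler scores nodes).
    """
    utilization = max(cpu_requested / cpu_allocatable,
                      mem_requested / mem_allocatable)
    return utilization < threshold

# A node with 2 vCPU / 8 GiB allocatable, running pods requesting 0.4 vCPU / 3 GiB:
print(scale_down_candidate(0.4, 2.0, 3.0, 8.0))  # True: max(0.2, 0.375) < 0.5
```

Note that the score uses pod *requests*, not observed usage — a node full of over-requesting pods will never be scaled down, which is one reason right-sizing requests matters.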

Monitoring

Continuously monitor the Autoscaler as your application evolves. Check Cluster Autoscaler logs, Kubernetes events, and CloudWatch metrics (especially DesiredCapacity) to verify that scaling decisions match your expectations.

Using Mixed Instances Policies and Spot Instances

If your workloads can tolerate interruption, Spot Instances can cut costs significantly. Mixed instances policies let you combine On-Demand and Spot Instances in the same ASG, and specify multiple EC2 instance types to tap into different Spot capacity pools.

When using mixed instances policies, ensure the instance types in your ASG have similar CPU and memory specs. The Cluster Autoscaler assumes all instances in a group have roughly the same capacity. Mismatched instance types can cause incorrect scaling calculations.
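A mixed instances policy in Terraform might look like the following sketch (resource names and capacity numbers are illustrative). Note that the override types are all the same size class, per the guidance above:

```hcl
resource "aws_autoscaling_group" "spot_workers" {
  # ... min/max size, subnets, autoscaler tags, etc. ...

  mixed_instances_policy {
    instances_distribution {
      on_demand_base_capacity                  = 1
      on_demand_percentage_above_base_capacity = 0
      spot_allocation_strategy                 = "capacity-optimized"
    }
    launch_template {
      launch_template_specification {
        launch_template_id = aws_launch_template.workers.id
        version            = "$Latest"
      }
      # Similar-sized types so the Autoscaler's capacity estimate stays accurate
      override { instance_type = "m5.xlarge" }
      override { instance_type = "m5a.xlarge" }
      override { instance_type = "m4.xlarge" }
    }
  }
}
```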

Cluster Autoscaler vs. Other Autoscaler Types

The Cluster Autoscaler adjusts the cluster’s total node capacity. VPA and HPA adjust resource allocation at the pod level. The right choice depends on your use case — and in most scenarios, you should combine them for full coverage.

EKS Cluster Autoscaler Version Compatibility

The Cluster Autoscaler follows the Kubernetes minor version scheme. Each Cluster Autoscaler release supports a specific Kubernetes minor version. See the releases page for the full matrix.

  • EKS 1.29 → Cluster Autoscaler 1.29.x
  • EKS 1.30 → Cluster Autoscaler 1.30.x
  • EKS 1.31 → Cluster Autoscaler 1.31.x

Key rules:

  • Always match the Cluster Autoscaler minor version to your EKS Kubernetes minor version (e.g., EKS 1.30 → Cluster Autoscaler 1.30.x).
  • Use the latest patch release within that minor version for bug fixes and security patches.
  • When upgrading EKS, update the Cluster Autoscaler image tag before or during the control plane upgrade to avoid API incompatibilities.
  • Running a mismatched version can cause scaling failures or unexpected behavior. The Autoscaler may not recognize newer API fields or scheduling features.
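The matching rule reduces to comparing minor versions, which is easy to encode in a pre-upgrade check (a sketch; real version strings may carry suffixes like "-eks-1" that you would strip first):

```python
def ca_version_matches(eks_version: str, ca_version: str) -> bool:
    """True if the Cluster Autoscaler minor version matches the EKS minor version.

    e.g. EKS "1.30" matches CA "1.30.3" but not "1.29.4".
    """
    eks_major, eks_minor = eks_version.split(".")[:2]
    ca_major, ca_minor = ca_version.split(".")[:2]
    return (eks_major, eks_minor) == (ca_major, ca_minor)

print(ca_version_matches("1.30", "1.30.3"))  # True
print(ca_version_matches("1.30", "1.29.4"))  # False
```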

Troubleshooting the EKS Cluster Autoscaler

When the Cluster Autoscaler isn’t behaving as expected, work through these common issues:

AWS API Throttling

The Cluster Autoscaler makes frequent calls to the Auto Scaling and EC2 APIs. Under heavy load or in accounts with many ASGs, you may hit API rate limits. Symptoms include RequestLimitExceeded errors in the controller logs. To mitigate this, increase the scan interval (--scan-interval) so the controller polls AWS less often, or request a rate limit increase through AWS Support.

Launch Template Problems

If new nodes fail to join the cluster after a scale-up, check your launch template. Common issues include an outdated AMI that doesn’t match the EKS Kubernetes version, missing or incorrect --kubelet-extra-args in the userdata, and security group rules that block communication between worker nodes and the control plane.

Managed Node Group Tag Propagation

EKS managed node groups automatically apply auto-discovery tags, but custom tags added after group creation may not propagate to new instances. Verify that the ASG-level tags have “Propagate at launch” enabled. For self-managed groups, always set tags directly on the ASG.

OIDC and IRSA Misconfiguration

If the Cluster Autoscaler pod cannot authenticate to AWS, check these items in order:

  1. The EKS cluster has an OIDC provider associated with it.
  2. The IAM role’s trust policy references the correct OIDC provider ARN and service account namespace/name.
  3. The Kubernetes service account is annotated with eks.amazonaws.com/role-arn: <role-arn>.
  4. The Cluster Autoscaler deployment uses that service account.

Symptoms of IRSA failure include AccessDenied or ExpiredToken errors in the pod logs.
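For item 2, the IAM role's trust policy should look roughly like this. The account ID, region, and OIDC provider ID are placeholders; substitute the values from your cluster:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {
        "Federated": "arn:aws:iam::111122223333:oidc-provider/oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE"
      },
      "Action": "sts:AssumeRoleWithWebIdentity",
      "Condition": {
        "StringEquals": {
          "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE:sub": "system:serviceaccount:kube-system:cluster-autoscaler",
          "oidc.eks.us-east-1.amazonaws.com/id/EXAMPLED539D4633E53DE1B71EXAMPLE:aud": "sts.amazonaws.com"
        }
      }
    }
  ]
}
```

The sub condition is what scopes the role to one service account in one namespace — a typo there produces exactly the AccessDenied symptoms described below.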

Scale-Up Fails Silently

If pods remain Pending but no new nodes appear, check:

  • The ASG is not already at maxSize.
  • Auto-discovery tags are correct and match the cluster name.
  • The IAM role has autoscaling:SetDesiredCapacity permission.
  • Pod resource requests don’t exceed the capacity of any instance type in the group.

Run kubectl -n kube-system logs deploy/cluster-autoscaler | grep -i "scale up" to see the controller’s reasoning.

Scale-Down Is Blocked

Nodes may resist scale-down for several reasons:

  • Pods with cluster-autoscaler.kubernetes.io/safe-to-evict: "false".
  • PodDisruptionBudgets that prevent eviction.
  • Pods using local storage (emptyDir with data).
  • The --scale-down-unneeded-time cooldown hasn’t elapsed.

Run kubectl -n kube-system logs deploy/cluster-autoscaler | grep -i "cannot remove node" to identify the blocker.
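The first two blockers are controlled by the workloads themselves. For example (illustrative names):

```yaml
# Pod annotation that blocks eviction by the Cluster Autoscaler
apiVersion: v1
kind: Pod
metadata:
  name: batch-worker              # illustrative
  annotations:
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
spec:
  containers:
    - name: worker
      image: my-batch-image
---
# A PDB that allows at most one voluntary disruption at a time
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: web-pdb                   # illustrative
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: web
```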

Understanding the Limitations

Be aware of the inherent limits of EKS autoscaling. Node provisioning takes time (typically 2–5 minutes for a new EC2 instance to become Ready). Pod termination and draining add latency to scale-down. Applications that cannot tolerate brief disruption may need additional safeguards like PodDisruptionBudgets or safe-to-evict annotations.

Conclusion

EKS autoscaling, particularly the Cluster Autoscaler, lets you handle dynamic workloads in your Kubernetes cluster on AWS. By understanding the available methods, configuring the Autoscaler correctly, and tuning its behavior, you can keep your applications performant and cost-effective as demand fluctuates.

Take Your EKS Autoscaling to the Next Level

Effective EKS autoscaling is crucial for optimal performance and cost-efficiency in your Kubernetes cluster. If you’re looking to take the complexity out of managing autoscaling policies, ScaleOps Platform provides the perfect solution. While the Cluster Autoscaler is powerful on its own, its effectiveness reaches new heights when paired with the right resource management solution. ScaleOps continuously optimizes workloads both vertically and horizontally, complementing the Cluster Autoscaler seamlessly. To learn more about how ScaleOps can improve your Kubernetes resource management, check out ScaleOps.

Learn more in our detailed guide comparing Karpenter vs Cluster Autoscaler.

Frequently Asked Questions

What is the EKS Cluster Autoscaler?

The EKS Cluster Autoscaler is a Kubernetes controller that adjusts the size of your EKS node groups. It scales up when Pods are Pending due to insufficient capacity and scales down when nodes are underutilized and their Pods can be rescheduled safely.

How do I enable the Cluster Autoscaler in an EKS cluster?

  • Tag the target Auto Scaling groups for auto-discovery.
  • Create an IAM role for the Cluster Autoscaler (typically via IRSA) with Auto Scaling permissions.
  • Install Cluster Autoscaler (for example via Helm) with auto-discovery enabled.
  • Set appropriate minimum and maximum node counts on the node groups so scale-up and scale-down are allowed.

Is autoscaling possible in Kubernetes?

Yes. Kubernetes supports three common autoscaling methods:

  • Horizontal Pod Autoscaler (HPA): scales the number of Pod replicas based on metrics.
  • Vertical Pod Autoscaler (VPA): adjusts Pod CPU and memory requests over time.
  • Cluster Autoscaler: scales the number of nodes so the scheduler can place Pods.