Autoscaling is a core part of running Kubernetes in production. Pods scale horizontally with HPA. KEDA handles event-based scaling. Even Jobs and StatefulSets can scale now. But scaling nodes is where you see the biggest impact, both in savings (when it’s done right) and in problems (when it’s done wrong).
Karpenter was built for flexibility. It binpacks workloads to use nodes efficiently and keeps costs down by only adding capacity when needed. Reserved Instances, on the other hand, are built for commitment. They can save money, but only if you use them the right way.
So what happens when you try to use both? Are Karpenter and AWS Reserved Instances (RIs) the ultimate power couple or a risky match?
In this post, we’ll walk through what happens when you mix the two. We’ll look at how they behave, where the friction is, and what to do to avoid wasted money or surprise behavior.
TL;DR
- Karpenter and RIs are powerful but tricky to combine, because of the constant tension between flexibility and commitment
- To make them work, use weighted NodePools with explicit AZ and instance type constraints
- For guaranteed placement, dedicate RIs to critical workloads using node labels
- For precise cost ratios, use the advanced topologySpreadConstraints technique
- When it comes to flexibility, AWS Savings Plans are often better than RIs
Karpenter and RIs: A Cloud-Native Love Story (With Baggage)
Like any good partner, AWS RIs are both reliable and ready to commit. You lock in and save big: up to 72% off if you plan to stick around for 1-3 years. They’re a long-term partner. Your spreadsheets will sing. Karpenter, on the other hand, promotes flexibility over commitment.
Now imagine these two getting engaged. Romantic? Yes. Complicated? Of course.
Flexibility vs Commitment
Karpenter doesn’t know about your RI commitments. It picks instance types based on your constraints (e.g. t3a.large, m6i.xlarge, etc.) and what’s cheapest or most available, like choosing m6a.large even if you’ve reserved m5.large. If the types don’t match, your RIs go unused while Karpenter spins up On-Demand nodes. That means you’re paying for capacity twice.
RIs Are AZ-Specific, Karpenter Isn’t
RIs are often tied to specific Availability Zones (zonal RIs in particular). But Karpenter chooses AZs based on instance availability, spot pricing, and topology, not the location of your RIs. So if your RIs are in us-east-1a, Karpenter might still launch nodes in us-east-1b, leaving your RIs unused.
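Before trying to fix the mismatch, it helps to know exactly what you’ve committed to. As a quick sketch (assuming the AWS CLI is configured for the account that owns the reservations), you can list your active RIs with their instance type, zone, and scope:
# List active RIs: instance type, AZ (zonal RIs only), scope, and count
aws ec2 describe-reserved-instances \
  --filters Name=state,Values=active \
  --query 'ReservedInstances[].{Type:InstanceType,AZ:AvailabilityZone,Scope:Scope,Count:InstanceCount}' \
  --output table
The Scope column tells you whether a reservation is zonal (“Availability Zone”) or regional (“Region”), which determines how strict your AZ constraints in Karpenter need to be.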
What You Can Do: Making the Relationship Work
Tip 1: Use Weighted NodePools with Limits
Tell Karpenter to prioritize your RI instance types by assigning higher weights to specific NodePools. This helps it choose RI-backed nodes before falling back to others.
Since RIs are tied to specific instance types and AZs, your NodePool must match those exactly. If you don’t, Karpenter might book a quick weekend getaway in a completely different place.
Also, Karpenter won’t stop at your RI limits unless you explicitly tell it to. Set .spec.limits in the NodePool to match your RI capacity. Otherwise, it’ll keep provisioning On-Demand nodes once your RIs are used up.
Here’s how to configure it:
First, define a weighted RI NodePool with limits.
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: reserved-instance
spec:
  weight: 50 # Higher weight means higher priority
  limits:
    cpu: 100 # Limit to match your RI capacity
  template:
    spec:
      nodeClassRef: # Reference your EC2NodeClass (assumed here to be named "default")
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: "karpenter.sh/capacity-type"
          operator: In
          values: ["on-demand"] # RIs only discount On-Demand usage
        - key: "node.kubernetes.io/instance-type"
          operator: In
          values: ["m5.2xlarge"] # Match your RI type exactly
        - key: "topology.kubernetes.io/zone"
          operator: In
          values: ["us-east-1a"] # Match your RI's AZ
Then create a fallback NodePool for everything else:
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: default
spec:
  template:
    spec:
      nodeClassRef: # Reference your EC2NodeClass (assumed here to be named "default")
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: karpenter.sh/capacity-type
          operator: In
          values: ["spot", "on-demand"]
        - key: kubernetes.io/arch
          operator: In
          values: ["amd64"]
But keep in mind, even with these settings in place, Karpenter won’t always choose the highest-weight NodePool; if a pod’s requirements can’t be satisfied by that pool, it falls back to the others.
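To see whether the weighting is actually working, check which NodePool each node was provisioned from. This is a minimal sketch, assuming a recent Karpenter version that stamps nodes with the karpenter.sh/nodepool label:
# Show which NodePool, instance type, and AZ each node landed on
kubectl get nodes -L karpenter.sh/nodepool,node.kubernetes.io/instance-type,topology.kubernetes.io/zone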
Tip 2: Dedicate RIs to Critical Workloads
In situations where you want specific workloads to use RI capacity, use labels to target them. The idea is to create a dynamic nodeSelector for the workloads you want to run on RIs, resulting in the provisioning of dedicated nodes.
First, create your NodePool with a “workload-type” label (or any other meaningful label):
# RI-optimized NodePool with custom label support
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: reserved-instances
spec:
  limits:
    cpu: 100 # Match your RI capacity
  template:
    spec:
      nodeClassRef: # Reference your EC2NodeClass (assumed here to be named "default")
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: "karpenter.sh/capacity-type"
          operator: In
          values: ["on-demand"] # RIs only discount On-Demand usage
        - key: "node.kubernetes.io/instance-type"
          operator: In
          values: ["c4.large"] # Your RI instance type
        - key: "topology.kubernetes.io/zone"
          operator: In
          values: ["us-east-1a"] # Must match your RI's zone
        - key: "workload-type" # Dynamic label for targeting
          operator: Exists
Then, define a corresponding nodeSelector with a unique value in your matching workload’s manifest:
apiVersion: apps/v1
kind: Deployment
metadata:
  name: database-tier
spec:
  replicas: 10
  selector:
    matchLabels:
      app: database-tier
  template:
    metadata:
      labels:
        app: database-tier
    spec:
      nodeSelector:
        workload-type: critical-db # This value gets applied to the node
      containers:
        - name: database
          image: your-db-image # Placeholder for your application image
This way, Karpenter will provision dedicated RI nodes for these workloads and automatically apply the label workload-type: critical-db to the node at creation time. This keeps your most important workloads pinned to RI-backed nodes, while everything else can run on spot or On-Demand instances. This pattern is also mentioned in the documentation.
Unlike taints that require modifying every deployment, this method only requires you to configure the specific applications that need RI guarantees.
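To sanity-check the setup, you can list the nodes Karpenter provisioned for these workloads and confirm they carry the expected label, instance type, and zone (assuming the workload-type: critical-db value from the example above):
# Nodes dedicated to the critical-db workloads, with their type and AZ
kubectl get nodes -l workload-type=critical-db \
  -L node.kubernetes.io/instance-type,topology.kubernetes.io/zone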
Tip 3: Achieve Precise RI-to-Spot Ratios with Pod Topology Constraints
This tip is for more advanced setups.
If you want a fixed cost ratio between RI nodes and spot instances, Kubernetes can help you enforce it by spreading pod replicas evenly across “virtual domains” using topologySpreadConstraints.
The idea is to create multiple NodePools that share the same arbitrary label key (e.g., ‘capacity-spread’), but assign a different number of possible values to each pool. For example:
- Assign 1 value to the RI NodePool
- Assign 4 values to the spot NodePool
- Now you’ve created 5 virtual domains in total
When pods are scheduled with topology spread constraints using this label, Kubernetes distributes them evenly across domains. As a result, they follow a 1:4 ratio (20% RI, 80% spot).
First, create two NodePools:
# For your RI-covered instances (20% of nodes)
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: reserved-instance
spec:
  limits:
    cpu: 100 # Match your RI capacity
  template:
    spec:
      nodeClassRef: # Reference your EC2NodeClass (assumed here to be named "default")
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: "karpenter.sh/capacity-type"
          operator: In
          values: ["on-demand"]
        - key: "node.kubernetes.io/instance-type"
          operator: In
          values: ["c4.large"] # Your RI type
        - key: "topology.kubernetes.io/zone"
          operator: In
          values: ["us-east-1a"] # Must match your RI's zone
        - key: capacity-spread # This creates a virtual topology domain
          operator: In
          values: ["1"] # 1 value = 20% of nodes
# For spot instances (80% of nodes)
apiVersion: karpenter.sh/v1
kind: NodePool
metadata:
  name: spot
spec:
  template:
    spec:
      nodeClassRef: # Reference your EC2NodeClass (assumed here to be named "default")
        group: karpenter.k8s.aws
        kind: EC2NodeClass
        name: default
      requirements:
        - key: "karpenter.sh/capacity-type"
          operator: In
          values: ["spot"]
        - key: capacity-spread # Same label creates connected domains
          operator: In
          values: ["2", "3", "4", "5"] # 4 values = 80% of nodes
Then, configure your workload with a topology spread constraint using this label as the topology key.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: your-application
spec:
  replicas: 20
  selector:
    matchLabels:
      app: your-app
  template:
    metadata:
      labels:
        app: your-app # Must match the labelSelector below
    spec:
      # This is the key that creates the ratio effect
      topologySpreadConstraints:
        - maxSkew: 1
          topologyKey: capacity-spread
          whenUnsatisfiable: DoNotSchedule
          labelSelector:
            matchLabels:
              app: your-app
      containers:
        - name: app
          image: your-image
          # ...
When the scheduler distributes pods evenly across five virtual domains, it requires nodes from both NodePools: one domain backed by your RI NodePool and the remaining four by your spot NodePool. This setup naturally enforces a 1:4 ratio (or close to it) of RI to spot nodes as Karpenter provisions infrastructure to meet the topology spread constraints.
To make this work reliably, your RI-targeted instances must be provisioned in Availability Zones where your reserved instances are valid. This means that careful AZ configuration is essential.
Keep in mind that if the number of pods isn’t perfectly divisible across domains, the scheduler will still attempt an even distribution while respecting maxSkew. This can cause temporary imbalances during scaling events.
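To see how the ratio is playing out in practice, you can group nodes by their capacity-spread value and capacity type (assuming the labels from the NodePools above):
# Nodes per virtual domain, with capacity type and instance type
kubectl get nodes -L capacity-spread,karpenter.sh/capacity-type,node.kubernetes.io/instance-type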
Tip 4: Consider Savings Plans Instead
If you’re not already locked into Reserved Instances, Compute Savings Plans are often a better fit for Karpenter’s dynamic provisioning model. That’s because they:
- Work across instance types and families
- Apply in any Availability Zone
- Still offer significant discounts (up to 66%)
And remember, 100% RI utilization isn’t always the goal. The real win comes from balancing commitments with real-time, automated rightsizing and spot usage to drive overall cost efficiency.
Final Tips
It sounds obvious, but it’s critical: update your NodePool configurations as your RI commitments evolve. Misalignment between your infrastructure and your commitments means wasted savings.
Keep in mind, these strategies aren’t foolproof. During AZ outages or rapid scaling events, Karpenter will always prioritize availability over RI alignment.
And above all:
Monitor. Monitor. Monitor. Seriously. Your budget depends on it.
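One concrete thing to watch is RI utilization itself. Here’s a minimal sketch using Cost Explorer via the AWS CLI (the dates are placeholders, and this assumes Cost Explorer is enabled for the account):
# How much of your RI commitment was actually used in the period
aws ce get-reservation-utilization \
  --time-period Start=2025-06-01,End=2025-07-01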
And at the very least, verify that your nodes align with your RI instance types and zones:
# Verify your nodes match RI types and zones
kubectl get nodes -o custom-columns='NAME:.metadata.name,TYPE:.metadata.labels.node\.kubernetes\.io/instance-type,ZONE:.metadata.labels.topology\.kubernetes\.io/zone'
The Bottom Line
It’s not a perfect romance, but Karpenter and RIs can work together, with the right boundaries in place. With some upfront planning, you can build a cost-efficient infrastructure that holds up for the long run.
Until AWS and Karpenter offer true native integration, these strategies will help you avoid the worst mismatches while capturing most of the savings. Just remember: committing to RIs means giving up a little of Karpenter’s free-spirited flexibility.
C’est la vie in cloud marriages. ☁️