Karpenter changed the way Kubernetes clusters provision and de-provision nodes in AWS. By replacing the default Cluster Autoscaler with something faster and more cost-efficient, it set a new standard for optimizing how Kubernetes clusters scale.
But Kubernetes doesn’t live in just one cloud. With 86% of organizations adopting multi-cloud architectures, DevOps and Platform teams are facing a hard truth: Karpenter’s autoscaling magic doesn’t carry over to GCP, Azure, or any other cloud.
This highlights a deeper issue: Cluster Autoscaling isn’t broken in any one cloud; it’s broken between them.
Fragmentation in Multi-Cloud Cluster Autoscaling
As Karpenter matured and joined the CNCF as an incubating project, demand grew for Karpenter-like autoscaling solutions. In response, each provider built its own native autoscaler: GKE Autopilot for GCP, AKS Node Autoprovisioning (still in preview) for Azure, and so on.
These solutions are tightly coupled to their respective APIs and provisioning systems. They work well within their own clouds, but interoperability is nonexistent. And that’s by design. They were never meant to work together.
Trying to recreate Karpenter’s experience across clouds often results in a patchwork of tools that fall short of the real thing. The result? Operational fragmentation. DevOps teams are stuck managing multiple configurations, chasing policy drift, and troubleshooting inconsistent behavior from one environment to the next.
Why Node Scaling Alone Falls Short
Node autoscalers make infrastructure-level decisions. They scale nodes out when pending pods can’t be scheduled due to insufficient capacity, and scale them back in when capacity sits idle.
But most inefficiencies live higher up the stack, at the workload level: over-provisioned pods, stale resource requests, and misaligned HPAs that don’t reflect how applications actually behave.
Node autoscalers don’t question these inputs—they just react. If your pod specs are wrong, your autoscaler makes the wrong decisions. That leads to reactive, inefficient, and expensive scaling.
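To make that concrete, here’s a deliberately simplified sketch, with hypothetical numbers rather than data from any real cluster, of how request-driven node scaling amplifies an over-provisioned pod spec:

```python
# Illustrative only: a request-driven node autoscaler sizes the cluster from
# pod *requests*, not from what the pods actually consume. All numbers here
# are hypothetical.

pod_cpu_request_m = 2000       # each replica requests 2 full vCPUs...
pod_cpu_usage_m = 300          # ...but typically uses about 300 millicores
replicas = 30
node_allocatable_cpu_m = 8000  # e.g. an 8-vCPU node, ignoring system overhead

# What the autoscaler provisions: driven entirely by requests.
pods_per_node = node_allocatable_cpu_m // pod_cpu_request_m          # 4 pods/node
nodes_from_requests = -(-replicas // pods_per_node)                  # ceil -> 8 nodes

# What actual usage would justify if the requests matched reality.
pods_per_node_by_usage = node_allocatable_cpu_m // pod_cpu_usage_m   # 26 pods/node
nodes_from_usage = -(-replicas // pods_per_node_by_usage)            # ceil -> 2 nodes

print(f"nodes provisioned from requests: {nodes_from_requests}")
print(f"nodes actual usage would justify: {nodes_from_usage}")
```

The autoscaler isn’t misbehaving here. It is faithfully scaling to requests that are several times larger than real usage, which is exactly the workload-level problem node scaling alone can’t fix.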
Manual Optimization Doesn’t Scale Either
There are plenty of tools that offer optimization recommendations. But almost all of them stop at visibility and alerts.
Production, dev, and staging environments are dynamic. Application behavior shifts overnight, usage patterns spike without warning, and what worked yesterday may lead to performance issues or waste today.
In theory, recommendations help. In practice, they create overhead and are not sustainable.
Teams end up in a slow, tedious loop: reviewing suggestions, coordinating with service owners, and manually making changes. Platform engineers turn into messengers, chasing developers to make infrastructure decisions they never wanted to own.
Manual optimization doesn’t scale, and it goes against a core principle of DevOps: automate to reduce human error.
Enter ScaleOps
Real optimization requires two things working in sync:
- Node-level autoscaling that provisions and deprovisions infrastructure.
- Pod-level automation to optimize how workloads utilize it.
ScaleOps automatically optimizes pod resource requests in real time, factoring in CPU, memory, usage spikes, and historical trends, and applies the right scaling strategy for each workload.
This is the missing piece in Kubernetes optimization: real-time, context-aware automation that continuously adjusts resource requests based on actual usage patterns and application behavior.
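To illustrate the general idea only (this is a sketch of usage-based right-sizing, not a description of how ScaleOps itself is implemented), a recommendation can be derived from a high percentile of observed consumption plus headroom. The function name, parameters, and samples below are hypothetical:

```python
# Simplified, hypothetical sketch of usage-based right-sizing: pick a high
# percentile of observed CPU usage and add headroom. Not ScaleOps' algorithm.

def recommend_cpu_request_m(usage_samples_m, p=0.95, headroom=1.15, floor_m=50):
    """Recommend a CPU request (millicores) from historical usage samples."""
    s = sorted(usage_samples_m)
    idx = min(len(s) - 1, round(p * (len(s) - 1)))  # simple rank-based percentile
    return max(round(s[idx] * headroom), floor_m)

# Hypothetical per-minute CPU usage samples for one container (millicores).
samples = [180, 210, 190, 250, 900, 230, 205, 260, 215, 240, 300, 220]
print(recommend_cpu_request_m(samples))  # 345, far below a stale 2000m request
```

The arithmetic is the easy part; doing this continuously, per workload, and without manual review is where automation earns its keep.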
ScaleOps doesn’t replace your autoscaler. It adds another dimension to your scaling strategy. And the best part? It works uniformly across all clouds, ensuring the same workload behaves the same way wherever it runs. ScaleOps ensures your workloads get the resources they need at the right time: no alerts, no approvals, no config edits. Just real-time optimization that adapts to how your applications behave.
Node-level and workload-level automation are two sides of the same coin. Together, they form a closed-loop system that continuously tunes itself.
Real-time Automation, Designed for Production
In production, autoscaling decisions can’t be static or misaligned. Every workload is unique, every environment has its own complexity and challenges, and there is very little room for error. Autoscaling decisions must consider:
- Real application behavior and workload type
- Availability and operational constraints like PDBs, Safe-to-Evict annotations, and noisy neighbors
- Compatibility with HPA, KEDA objects, and GitOps tools like ArgoCD or Flux
And these constraints don’t just vary between clouds; they vary between teams, clusters, and workloads.
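As one concrete illustration of those availability constraints, here is a hedged, hypothetical sketch of the checks an automated optimizer has to make before restarting a pod to apply new requests. Real systems go through the Kubernetes eviction API and honor many more signals; the dictionaries and the pod name below are simplified stand-ins:

```python
# Hypothetical sketch: respect the cluster-autoscaler safe-to-evict annotation
# and the PodDisruptionBudget before restarting a pod to apply new requests.
# Real implementations use the Kubernetes eviction API; these dicts are
# simplified stand-ins for the corresponding API objects.

SAFE_TO_EVICT = "cluster-autoscaler.kubernetes.io/safe-to-evict"

def can_restart(pod, pdb):
    # Never touch pods explicitly marked as not safe to evict.
    if pod.get("annotations", {}).get(SAFE_TO_EVICT) == "false":
        return False
    # Respect the PDB: proceed only if at least one disruption is allowed.
    return pdb["status"]["disruptionsAllowed"] > 0

pod = {"name": "checkout-7d9f", "annotations": {SAFE_TO_EVICT: "false"}}
pdb = {"status": {"disruptionsAllowed": 1}}
print(can_restart(pod, pdb))  # False: the annotation blocks the restart
```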
That’s why ScaleOps was designed for production from day one. It’s fully self-hosted, runs anywhere Kubernetes runs, including air-gapped or regulated environments, and adapts to real-world conditions.
ScaleOps respects your policies, tunes your resources in real time, and does it all without breaking things in production.