Kubernetes continues to evolve, offering features that enhance efficiency and adaptability for developers and operators. Among these are Resize CPU and Memory Resources assigned to Containers, introduced in Kubernetes version 1.27. This feature allows for adjusting the CPU and memory resources of running pods without restarting them, helping to minimize downtime and optimize resource usage. This blog post explores how this feature works, its practical applications, limitations, and cloud provider support. Understanding this functionality is vital for effectively managing containerized workloads and maintaining system reliability.

What Is In-Place Pod Vertical Scaling?

Traditionally, modifying the resource allocation for a Kubernetes pod required a restart, potentially disrupting applications and causing downtime. In-place scaling changes this by enabling real-time CPU and memory adjustments while the pod continues running. This is particularly useful for workloads with a very low tolerance for pod evictions.

What’s behind the feature gate?

The new resizePolicy spec element allows you to specify how a pod reacts to a patch command that changes its resource requests, enabling changing resource requests without rescheduling the pod.

The result of the change attempt is communicated as part of the pods’ status in a field called  resize (for more information on the new fields, check out the Kubernetes API documentation.)

Additionally, this feature introduces the restartPolicy in the spec element for containers, allowing fine-grained control over resizing behavior and allowing the developer to choose if CPU change or Memory change should lead to rescheduling the pod.

Key Features

How It Works

The InPlacePodVerticalScaling feature integrates seamlessly into Kubernetes to provide a more dynamic approach to resource allocation. Here’s a detailed breakdown of how it operates:

Diagram of Kubernetes in-place resource adjustments showing user interaction with kube-api, kubelet, containerd, and pod within a Kubernetes cluster and node structure.
  1. Feature Gate Activation: Activating the InPlacePodVerticalScaling feature gate in your cluster configuration is required to enable this functionality. This allows the kubelet on each node to detect and process resource updates dynamically.
  2. Dynamic Resource Updates via Kube API: With the feature enabled, the kubelet directly applies resource changes to running pods without requiring restarts. Supported container runtimes (e.g., containerd v1.6.9 or later) ensure these updates are applied efficiently. If constraints like insufficient free memory or CPU prevent the changes, the pod follows the regular flow: it is recreated and rescheduled.
  3. Pod Spec Adjustments: The resizePolicy field dictates how CPU and memory adjustments are handled. For instance, you can set NotRequired for live updates without restarts or RestartContainer to force a restart when a specific resource is modified.

Limitations and Considerations

While In-Place Pod Vertical Scaling offers significant benefits, it has limitations:

1. Cloud Provider Support

2. Runtime Compatibility

Container RuntimeCompatible VersionRelease Notes
containerdv1.6.9 and abovecontainerd v1.6.9 Release Notes
CRI-Ov1.24.2 and aboveCRI-O v1.24.2 Release Notes
Podmanv4.0.0 and abovePodman v4.0.0 Release Notes

3. Policy Constraints

Several Kubernetes policies and mechanisms govern resource scaling. These include:

Resource Quotas

apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: example-namespace
spec:
  hard:
    requests.cpu: "10"
    requests.memory: "32Gi"

Limit Ranges

apiVersion: v1
kind: LimitRange
metadata:
  name: limits
  namespace: example-namespace
spec:
  limits:
  - type: Container
    max:
      cpu: "2"
      memory: "4Gi"
    min:
      cpu: "100m"
      memory: "128Mi"

Admission Controllers

Application Suitability

HPA

  1. A pod is changing from 1 core to 2 cores; this can cause a scale-down in pods and affect the bottom-line performance of the application.
  2. A pod changes from 2 cores to 1; this can cause a scale-up in pods, creating a waste of resources or potential downstream pressure due to the additional and unexpected pods created.

Use Cases

Step-by-Step Guide

1. Enable the Feature Gate

Add the following configuration to enable the InPlacePodVerticalScaling feature:

apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
apiServer:
  extraArgs:
    feature-gates: InPlacePodVerticalScaling=true
controllerManager:
  extraArgs:
    feature-gates: InPlacePodVerticalScaling=true
scheduler:
  extraArgs:
    feature-gates: InPlacePodVerticalScaling=true

For GKE, create a cluster with alpha features enabled:

gcloud container clusters create poc \
    --enable-kubernetes-alpha \
    --no-enable-autorepair \
    --no-enable-autoupgrade

2. Deploy a Pod

Define a deployment with initial CPU and memory requests and limits:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: app
spec:
  selector:
    matchLabels:
      app: app
  template:
    metadata:
      labels:
        app: app
    spec:
      containers:
      - name: nginx
        image: nginx
        resources:
          limits:
            memory: "128Mi"
            cpu: "500m"
          requests:
            memory: "64Mi"
            cpu: "250m"
        resizePolicy:
        - resourceName: cpu
          restartPolicy: NotRequired
        - resourceName: memory
          restartPolicy: NotRequired

Once deployed, you can check the cpu.weight, cpu.max, memory.max, memory.min from within the container to see the initial values that the container starts with.

kubectl exec -it $(kubectl get pods -l app=app -o jsonpath='{.items[*].metadata.name}') -- cat /sys/fs/cgroup/cpu.weight

kubectl exec -it $(kubectl get pods -l app=app -o jsonpath='{.items[*].metadata.name}') -- cat /sys/fs/cgroup/cpu.max

kubectl exec -it $(kubectl get pods -l app=app -o jsonpath='{.items[*].metadata.name}') -- cat /sys/fs/cgroup/memory.min

kubectl exec -it $(kubectl get pods -l app=app -o jsonpath='{.items[*].metadata.name}') -- cat /sys/fs/cgroup/memory.max 

3. Update Resources

Adjust resource allocations for a running pod dynamically:

kubectl patch pod $(kubectl get pods -l app=app -o jsonpath='{.items[*].metadata.name}') -p '{"spec":{"containers":[{"name":"nginx","resources":{"requests":{"cpu":"750m"}}}]}}'

4. Verify Changes

Confirm updated resource settings:

kubectl describe pod -l app=app

Additionally, you can connect to the container and see the change in cpu.weight, cpu.max, memory.max, memory.min from within the container.

kubectl exec -it $(kubectl get pods -l app=app -o jsonpath='{.items[*].metadata.name}') -- cat /sys/fs/cgroup/cpu.weight

kubectl exec -it $(kubectl get pods -l app=app -o jsonpath='{.items[*].metadata.name}') -- cat /sys/fs/cgroup/cpu.max

kubectl exec -it $(kubectl get pods -l app=app -o jsonpath='{.items[*].metadata.name}') -- cat /sys/fs/cgroup/memory.min

kubectl exec -it $(kubectl get pods -l app=app -o jsonpath='{.items[*].metadata.name}') -- cat /sys/fs/cgroup/memory.max

Conclusion

In-Place Pod Vertical Scaling is a powerful tool for managing dynamic workloads in Kubernetes, reducing downtime, and optimizing resource usage. While its adoption depends on cloud provider support and application compatibility, this feature offers significant efficiency and cost-saving benefits. As Kubernetes evolves, such features will become essential for effective container orchestration.


While Google’s Kube Startup CPU Boost example is just a specific use case scenario, ScaleOps provides an all in one resource management solution to address all needed scenarios related to Kubernetes resource management.

Schedule your demo

Submit the form and schedule your 1:1 demo with a ScaleOps platform expert.

Schedule your demo