Pod Disruption Budget: Benefits, Example & Best Practices

Ben Grady 20 November 2024 11 min read

In Kubernetes, keeping applications available during disruptions is critical for systems that require high uptime. Pod Disruption Budgets (PDBs) let you manage pod availability during planned disruptions. With a PDB, you limit how many pods of an application can be down at the same time, which keeps vital services running during node upgrades, drains, and cluster scale-downs. In this article, we cover the main components of PDBs, how to create and use them, their benefits, and best practices for tuning them for high availability.

What are Pod Disruption Budgets (PDBs)?

A Pod Disruption Budget limits how many of an application's pods may be disrupted at the same time by voluntary operations in a Kubernetes cluster. PDBs provide protection when nodes are drained for scaling or maintenance, so that too many replicas of a service do not go down at once. They are useful for applications that need a baseline level of availability, because they require a certain number of pods to stay active.

Kubernetes distinguishes two types of disruptions: voluntary and involuntary. Both affect the stability of services. PDBs govern voluntary disruptions, so that admins can carry out maintenance tasks without compromising critical services.

Voluntary Disruptions vs. Involuntary Disruptions

| Disruption Type | Description | Examples |
|---|---|---|
| Voluntary | Disruptions initiated by administrators or planned events that affect pod availability temporarily. | Node maintenance, cluster upgrades, scaling down pods, rolling updates, application restarts |
| Involuntary | Disruptions that occur unexpectedly due to failures in the environment or infrastructure without warning. | Node crashes, network partitioning, hardware failures, cloud provider outages, unhandled exceptions |

Voluntary Disruptions

These are planned disruptions initiated by cluster administrators or by automation acting on their behalf. Node maintenance, scale-downs, and upgrades are typical examples. PDBs keep such interruptions as unintrusive as possible by enforcing availability limits while the planned event is underway. Note that a PDB is enforced only for evictions that go through the Eviction API (for example, kubectl drain); deleting a pod or a Deployment directly bypasses it.

Involuntary Disruptions

Involuntary disruptions are unexpected events outside the administrator's control, such as node crashes or network partitioning. PDBs primarily control voluntary disruptions, so design your architecture to account for involuntary ones as well: adding replicas or employing other resilience strategies helps maintain high availability.

How to Create a Pod Disruption Budget

Creating a Pod Disruption Budget is straightforward and involves specifying the minimum number of pods that should remain available or the maximum number that can be disrupted. Let’s walk through the process.

1. Determine the Minimum Number of Instances

The first step in setting up a PDB is to understand the availability requirements of your application. For example, suppose you have a deployment with three replicas and need at least two replicas to provide adequate service. In that case, you’ll want a PDB that ensures two pods remain available during disruptions.

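Setting the right number is key: a minAvailable that is too low can hurt availability, while one that is too high restricts maintenance. With three replicas and minAvailable: 2, one pod can be evicted at a time; with minAvailable: 3, no pod can be evicted and node drains will stall.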
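For context, the three-replica workload described above might look like the following sketch. The Deployment name, container name, and image are placeholders chosen for illustration; only the replica count and the app: my-app label come from this walkthrough.

```yaml
# Hypothetical Deployment that a PDB in this walkthrough would protect.
# The name, container name, and image are placeholders, not from the article.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  replicas: 3              # three replicas, as in the example above
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app        # the label a PDB selector would match
    spec:
      containers:
        - name: web
          image: nginx:1.27   # placeholder image
```

The pod template label is the important part: a PDB finds its pods through labels, not through the Deployment's name.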

2. Create a YAML File

Once you’ve decided on the number of pods that need to stay available, you’ll create a YAML file to define the PDB. Here’s an example of PDB YAML configuration:

apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app

Explanation:

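  • apiVersion: Specifies the Kubernetes API version. For PDBs, use policy/v1, which has been stable since Kubernetes 1.21 (the older policy/v1beta1 was removed in 1.25).
  • minAvailable: The minimum number of matching pods (in this case, two) that must remain available; an eviction that would drop below this number is refused.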
  • selector.matchLabels: Matches pods labeled with app: my-app, so the PDB only applies to these pods.
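The same requirement can also be expressed from the other direction, as the maximum number of pods that may be disrupted. The sketch below is an alternative to example-pdb, not an addition to it; the resource name is hypothetical. A single PDB accepts either minAvailable or maxUnavailable, not both.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-pdb-max-unavailable   # hypothetical name
spec:
  maxUnavailable: 1        # at most one matching pod may be down due to voluntary disruption
  selector:
    matchLabels:
      app: my-app
```

For a three-replica workload this allows the same single eviction at a time as minAvailable: 2, but it keeps allowing one eviction if the replica count later changes.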

3. Apply the YAML File

After defining the PDB in YAML, apply it using the following command:

kubectl apply -f example-pdb.yaml

This command will create the PDB in your Kubernetes cluster, where it will monitor and control disruptions based on your availability requirements.

How to Use Pod Disruption Budgets: Example

Once the PDB is set up, it’s essential to monitor and manage it regularly to ensure it’s functioning as expected. Here are a few practical commands and scenarios to get the most out of your PDB.

1. Check the PDB Status

To check the status of your PDB, run:

kubectl get pdb

This command lists the PDBs in the current namespace, showing each budget's minAvailable or maxUnavailable setting and how many disruptions are currently allowed.

2. Simulate Disruptions

To observe how PDBs work, simulate a disruption by draining a node. Draining removes all pods from a node, except DaemonSet pods:

kubectl drain <node-name> --ignore-daemonsets

Kubernetes checks the PDB for every eviction during this operation. If evicting a pod would violate the budget, the eviction is refused and kubectl drain keeps retrying until the budget allows it or the command times out, so the node drains only as fast as the PDB permits.
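Under the hood, a drain asks the API server to evict each pod rather than deleting it directly. As a rough sketch, the request sent to a pod's eviction subresource has the shape below, shown here as YAML. The pod name and namespace are hypothetical.

```yaml
# Eviction request for one pod; the pod name and namespace are hypothetical.
apiVersion: policy/v1
kind: Eviction
metadata:
  name: my-app-6d4cf56db6-abcde
  namespace: default
```

When granting the eviction would violate a PDB, the API server answers with 429 Too Many Requests, which is the signal the drain reacts to by waiting and trying again.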

3. Update or Delete the PDB

Over time, you may need to adjust the PDB based on changing application requirements. Update the YAML file and apply it again, or delete the PDB with:

kubectl delete pdb example-pdb

Deleting the PDB removes the restrictions, allowing all pods to be disrupted.

Benefits of Using Pod Disruption Budgets

Pod Disruption Budgets help applications maintain high availability and resilience. They support reliability by allowing services to keep functioning during planned disruptions.

1. Maintaining High Availability

PDBs support high availability by enforcing a minimum number of available pods. When maintenance or updates are performed, essential services are not cut off completely, because evictions that would breach the threshold are refused. By preventing all pods from being unavailable at once, PDBs help applications meet uptime requirements and provide service continuity, which matters most in mission-critical systems.

2. Automated Management of Disruptions

Kubernetes enforces PDBs automatically during planned disruptions, so less manual intervention is needed. This reduces human error and keeps pod disruptions controlled, which streamlines maintenance. Applications stay accessible during updates and upgrades, with minimal time offline.

3. Improved Cluster Stability

PDBs control how many pods can be disrupted at any one time, which lets Kubernetes handle planned disruptions without compromising cluster stability. Node maintenance, scaling, and updates become easier to carry out without affecting the system's reliability. In this way, PDBs contribute to infrastructure resilience and keep services from going down because of maintenance.

4. Cost Savings

Because a PDB states exactly how many pods must stay available, you can size replica counts to that requirement instead of overprovisioning a large safety buffer. The goal is to avoid both extremes: too few pods, which risks downtime, and too many, which wastes resources. Using resources this way supports Kubernetes cost optimization while still meeting the application's availability requirements.

Important Considerations When Using PDBs

When configuring PDBs, it’s important to keep the following points in mind to ensure they are effective:

  • Monitoring Pod Disruption Status: Regularly monitor the PDB status to detect any issues early on.
  • Single Replicas and PDB: With a single replica, a PDB that requires one available pod blocks every voluntary eviction, and one that allows the eviction means downtime. There is no room to handle disruptions without impacting availability.
  • Horizontal Pod Autoscalers (HPA) and PDB: HPAs adjust pod counts based on demand. Be cautious with a fixed minAvailable: if the HPA scales the workload down to that number, no evictions are allowed and node drains stall.
  • Overlapping Selectors: If more than one PDB selects the same pod, the Eviction API treats it as a misconfiguration and can refuse to evict that pod. Carefully design selector labels to avoid overlap.
  • maxUnavailable and PDB: Specifies the maximum number of pods that can be disrupted at once. Because it is relative to the current replica count, it adapts as the workload scales, which makes it a good choice for balancing availability and flexibility.
  • minAvailable and PDB: Use minAvailable for stricter availability control, ensuring a specific number of pods stay available during voluntary disruptions.
  • Involuntary Disruptions and PDB: PDBs cannot prevent involuntary disruptions, although pods lost to them still count against the budget, so additional resiliency measures should be considered.
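To illustrate the HPA consideration above, here is a minimal sketch of an autoscaler whose lower bound stays above the minAvailable: 2 budget from the earlier example. The HPA name, target Deployment name, replica bounds, and CPU target are assumptions made for illustration.

```yaml
# Sketch: an HPA whose floor leaves room for the budget (names are hypothetical).
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 3           # stays above the PDB's minAvailable of 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```

With minReplicas at 3 and minAvailable at 2, at least one eviction stays possible even when the workload is scaled all the way down.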

Handling Common Issues and Limitations in PDBs

PDBs provide critical benefits, but they also come with specific limitations that need to be addressed for them to work well. Handling these issues keeps operations running smoothly during disruptions.

Challenges with Single-Replica Pods

A single-replica workload cannot withstand any interruption without downtime. To maintain high availability, important applications should run multiple replicas so that at least one pod stays up during maintenance or other interruptions.
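The problem is easiest to see in a manifest. The following is a sketch of the configuration to avoid, assuming a workload that runs exactly one replica; the PDB name and label are hypothetical.

```yaml
# Anti-pattern sketch: with replicas: 1, this budget allows zero evictions,
# so a drain of the node hosting the pod cannot complete.
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: single-replica-pdb   # hypothetical name
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: my-singleton      # hypothetical label on a one-replica workload
```

Raising the replica count to two or more is what gives a budget like this room to permit an eviction.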

Dealing with Overlapping Selectors

Conflicts arise when multiple PDBs target the same set of pods with different minAvailable or maxUnavailable settings. Use precise label selectors so that each PDB targets a unique set of pods. This prevents conflicts and keeps eviction behavior predictable.
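One way to keep selectors disjoint is to add a second label that distinguishes the components of an application. The sketch below assumes pods carry a tier label in addition to app; the tier label and both PDB names are hypothetical.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: frontend-pdb
spec:
  maxUnavailable: 1
  selector:
    matchLabels:
      app: my-app
      tier: frontend       # matches only frontend pods
---
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: backend-pdb
spec:
  minAvailable: 2
  selector:
    matchLabels:
      app: my-app
      tier: backend        # matches only backend pods
```

Because a pod can hold only one value for the tier label, no pod can be selected by both budgets.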

Limitations of PDBs in Involuntary Disruptions

PDBs do not protect against involuntary disruptions such as hardware failures or network outages. To guard against those, build in redundancy at the application level, for example by deploying across multiple zones or spreading replicas of critical services across multiple nodes.
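A multi-zone spread can be requested in the pod template itself. The fragment below is a minimal sketch using topology spread constraints, assuming the pods are labeled app: my-app and the nodes carry the standard zone label.

```yaml
# Fragment of a pod template spec (spec.template.spec in a Deployment).
topologySpreadConstraints:
  - maxSkew: 1                                  # zones may differ by at most one pod
    topologyKey: topology.kubernetes.io/zone    # spread across availability zones
    whenUnsatisfiable: ScheduleAnyway           # prefer spreading, but do not block scheduling
    labelSelector:
      matchLabels:
        app: my-app
```

This complements a PDB: the budget limits voluntary evictions, while the spread limits how many replicas a single zone failure can take out.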

Scaling and PDB Constraints

Scaling operations can conflict with PDBs, especially if minAvailable is too high. In that case, either modify the PDB temporarily so the operation can proceed, or reduce pods gradually so that availability is not affected while scaling is in progress.

Monitoring and Troubleshooting PDBs

Use kubectl describe pdb <pdb-name> to monitor the status of your PDB. The output reports how many disruptions are currently allowed and how many pods are healthy compared with the number the budget requires. Analyze this information to identify issues affecting pod availability or any conflicts in PDB settings. Regular monitoring and troubleshooting will help ensure that your PDBs function properly and maintain service availability.
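The same information is available in the PDB's status, for example through kubectl get pdb example-pdb -o yaml. The fragment below is illustrative only: the values are assumed for the earlier example-pdb with three healthy replicas, not captured from a real cluster.

```yaml
# Illustrative status for example-pdb with three healthy replicas (values assumed).
status:
  currentHealthy: 3        # matching pods that are currently healthy
  desiredHealthy: 2        # derived from minAvailable: 2
  disruptionsAllowed: 1    # currentHealthy - desiredHealthy
  expectedPods: 3          # pods counted by this budget
```

A disruptionsAllowed of 0 is the value to watch for, since it means every eviction of a matching pod will be refused.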

Best Practices for Using Pod Disruption Budgets

Following these best practices helps effectively use PDBs to maintain stability and availability:

| Best Practice | Description |
|---|---|
| Understand Application Requirements | Ensure that the PDB is aligned with the specific availability needs of your application, considering factors like traffic patterns, criticality, and SLA requirements. |
| Set Accurate Selectors | Define precise label selectors in PDBs to target the right pods. Avoid overlaps and ensure the PDB applies to the correct set of pods for efficient disruption management. |
| Use Percentage-Based Budgets | When possible, use percentage-based minAvailable or maxUnavailable values instead of fixed pod counts. This allows for more flexible disruption management, especially in dynamic environments with varying pod counts. |
| Combine PDBs with Higher-Level Resources | Pair PDBs with higher-level Kubernetes resources like Deployments, ReplicaSets, or StatefulSets to ensure proper scaling and availability while managing disruptions. |
| Prepare for Unexpected Disruptions | In addition to planned maintenance, ensure your infrastructure is resilient to unexpected disruptions like node failures or network issues, possibly by using redundant resources. |
| Monitor PDB Status Regularly | Continuously monitor the status of PDBs to detect any disruptions or conflicts early on. This allows for quicker responses to ensure system availability is maintained. |
| Review and Update as Needed | Regularly review PDB configurations, especially after application or Kubernetes setup changes. Update PDBs to reflect new requirements or to address changes in availability needs. |
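As a sketch of the percentage-based practice, the budget below caps voluntary disruptions at a share of the workload rather than a fixed count. The PDB name and the 25% figure are assumptions chosen for illustration.

```yaml
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
  name: example-pdb-percent   # hypothetical name
spec:
  maxUnavailable: "25%"    # share of the workload's desired replicas
  selector:
    matchLabels:
      app: my-app
```

Percentages do not always map to a whole number of pods, and Kubernetes rounds up when converting them. With 10 replicas, for example, 25% works out to 2.5 and is treated as 3 pods that may be disrupted at once, so check the resulting number against your availability requirement.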

Pod Disruption Budgets Use Cases

Pod Disruption Budgets are widely used in various industries where uptime and resilience are crucial.

Financial Services

PDBs keep mission-critical applications, such as payment processing and trading systems, available during planned maintenance. This limits the risk to transactions and compliance, so operations run without glitches or violations of service-level agreements.

SaaS Platforms

SaaS platforms rely on PDBs so that updates and scaling activities do not lead to downtime. Keeping a minimum number of pods available protects customers from service interruptions and helps meet the uptime commitments in service-level agreements.

E-commerce

E-commerce websites use PDBs to keep backend services available during high-traffic periods and upgrades. Core services, such as the payment processor and inventory manager, stay reachable throughout maintenance or scaling, so sales are not lost.

Conclusion

Pod Disruption Budgets are one of the most important tools in Kubernetes for high availability and resiliency, particularly in dynamic and scalable environments. They enforce a minimum threshold for pod availability so that services critical to daily operations are not disrupted by maintenance and scaling activities. Used effectively, PDBs improve application stability and cluster reliability and help deliver a good user experience with minimal downtime.
