About Booksy:
Booksy is a leading online booking and business management platform designed for service-based industries, including beauty, wellness, fitness, and healthcare. The platform offers a seamless scheduling experience for both businesses and their clients, featuring appointment booking, reminders, payment processing, and marketing tools. With a focus on streamlining operations and enhancing customer engagement, Booksy empowers businesses to grow and manage their services with ease. Trusted by millions of users worldwide, Booksy is committed to delivering innovative solutions that drive efficiency and improve the client experience.
Key Results:
Significant Reduction in CPU as a Key Driver of Cost Savings
By leveraging ScaleOps technologies to optimize its Kubernetes-based infrastructure, Booksy has reduced its CPU requests by ~80% in the past year. ScaleOps’ ongoing automation eliminates over-provisioning while ensuring system performance and availability during peak loads. These changes not only decreased cloud costs significantly but also enhanced the reliability and scalability of their operations. This holistic approach provided immediate cost benefits and improved workload performance while supporting Booksy’s global user base growth.
Enhanced Efficiency Through Automation Frees Engineering Resources
The automation implemented by ScaleOps removed the need for manual workload tuning, significantly reducing the time and effort required by engineering teams. This shift allowed Booksy to reallocate valuable resources to other high-priority projects, boosting overall productivity and enabling the team to focus on innovation and growth rather than constant operational adjustments. The result was not only more efficient use of engineering talent but also a smoother, more scalable Kubernetes environment.
Optimizing Resource Allocation with Auto-Detected Policies
Booksy leveraged the built-in auto-detected policies to quickly and easily implement ScaleOps. The out-of-the-box feature automatically identified and applied the most suitable configurations for each workload, honoring Booksy’s diverse use cases and optimizing them individually. This ensured efficient CPU and memory allocation, preventing issues like throttling and OOM errors. By dynamically adapting to each workload’s unique requirements, ScaleOps enabled Booksy to maintain consistent performance and reliability while reducing operational overhead and achieving high service availability.
The Challenge
Automating the Management of Over 3,000 Workloads
Booksy faced a critical challenge in managing over 3,000 workloads distributed across development, staging, and production environments. Manual intervention was no longer sustainable, especially for production workloads that demanded high availability and reliability. The team reviewed potential savings manually only every few weeks or even months, which meant missed opportunities for significant cost reduction. The complexity of scaling resources and maintaining consistent performance across such a vast ecosystem required a solution that could automate resource allocation and streamline infrastructure management.
Ensuring Consistent Performance in Dynamic Environments
Booksy encountered recurring challenges with CPU throttling and Out-Of-Memory (OOM) issues across their environments, which risked impacting application performance and availability. While the team worked diligently to avoid any disruptions for their customers, adhering to strict internal SLAs meant quickly identifying and mitigating performance issues. However, relying on manual intervention to monitor and resolve these issues was time-consuming and inefficient, especially as workloads scaled. Booksy needed a solution that could proactively optimize resource requests, eliminate performance degradation, and ensure reliable application performance at scale.
Manually Flagging Critical Events at Scale
In addition to resource automation needs, Booksy struggled to track crucial system events such as CPU and memory utilization percentages, workload disruptions, and pod health issues. Without detailed insights into these metrics, the team found it challenging to manually detect performance issues, inefficiencies, and potential risks before they impacted service availability. This led to increased costs and inconsistent application performance.
The Solution
Eliminating Manual Work with Automation
To address the complexity of managing thousands of workloads across multiple environments, Booksy implemented ScaleOps to fully automate and manage resource allocation at scale. By introducing automation to their environments, Booksy eliminated the need for manual intervention in managing workloads, which had previously required regular checks for resource optimization. The ScaleOps platform dynamically adjusts resources based on real-time demand, ensuring efficient performance with significantly less human oversight. The shift to automated scaling not only minimized manual effort but also allowed the team to focus on more strategic tasks. With resource management streamlined and automated, Booksy could trust the system to optimize for both cost and performance.
Dynamic Optimization with Improved Performance and Reliability
To address the risk of CPU throttling and OOM issues, Booksy implemented ScaleOps, which automated resource requests and provided real-time performance and troubleshooting capabilities at the pod, cluster, and multi-cluster levels. This proactive approach ensured that workload requests scaled dynamically with demand, eliminating manual intervention and preventing resource contention. With automated scaling and optimization, Booksy was able to meet internal SLAs, minimizing downtime and preventing customer-facing service degradation. The new system gave the team better visibility into resource usage and system performance, keeping developers in the loop while automating resource requests on their behalf.
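A common way to reason about automated request rightsizing is to derive each request from recent usage samples. The sketch below is illustrative only: this case study does not describe how ScaleOps computes its recommendations, so the function names, the 95th-percentile target, the 15% headroom, and the sample data are all assumptions made for the example.

```python
import math


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def recommend_cpu_request(usage_millicores, pct=95, headroom_pct=15, floor=10):
    """Suggest a CPU request (millicores) from observed usage samples.

    Hypothetical policy: take a high percentile of usage, add headroom so
    short bursts are less likely to be throttled, and never go below a floor.
    """
    target = percentile(usage_millicores, pct) * (100 + headroom_pct) / 100
    return max(floor, math.ceil(target))


def recommend_memory_request(usage_mib, headroom_pct=20):
    """Suggest a memory request (MiB) from the observed peak.

    Memory uses the peak rather than a percentile in this sketch, because
    exceeding a memory limit gets a container OOM-killed instead of slowed.
    """
    return math.ceil(max(usage_mib) * (100 + headroom_pct) / 100)


# Hypothetical workload that requests 1000m of CPU but peaks near 200m.
usage = [120, 150, 180, 90, 200, 160, 140, 130, 170, 110]
current_request = 1000
suggested = recommend_cpu_request(usage)
print(f"current={current_request}m suggested={suggested}m")  # suggested=230m
```

In this made-up example the suggested request is well below the current one, which is the kind of over-provisioning that automated rightsizing aims to remove. A production system would also need to account for usage history length, seasonality, and rollout safety.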
Troubleshooting Infrastructure Anomalies
ScaleOps removes the need to detect critical events manually by automatically flagging occurrences such as CPU and memory utilization, workload disruptions, and noisy neighbors. This allows Booksy to go under the hood and troubleshoot resources that are impacting availability at the workload, cluster, or multi-cluster level. Additionally, ScaleOps provides valuable metrics, such as identifying expensive or wasteful workloads, empowering the team to make data-driven decisions that improve cost efficiency and ensure consistent performance across their infrastructure.
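To make the idea of flagging wasteful or at-risk workloads concrete, here is a minimal sketch that compares requested CPU against observed usage and checks for OOM kills. It is not the ScaleOps implementation; the `WorkloadStats` type, the `flag_workload` function, the 50% waste threshold, and the sample workloads are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class WorkloadStats:
    name: str
    cpu_request_m: int  # requested CPU, in millicores
    cpu_usage_m: int    # typical observed CPU usage, in millicores
    oom_kills: int      # OOM kills seen in the observation window


def flag_workload(w, waste_threshold=0.5):
    """Return a list of flags for one workload (empty if nothing stands out)."""
    flags = []
    if w.cpu_request_m > 0:
        unused_fraction = (w.cpu_request_m - w.cpu_usage_m) / w.cpu_request_m
        if unused_fraction >= waste_threshold:
            flags.append("over-provisioned")
        if w.cpu_usage_m > w.cpu_request_m:
            flags.append("under-provisioned")
    if w.oom_kills > 0:
        flags.append("oom")
    return flags


# Hypothetical workloads used only to exercise the function.
workloads = [
    WorkloadStats("api", cpu_request_m=1000, cpu_usage_m=200, oom_kills=0),
    WorkloadStats("worker", cpu_request_m=500, cpu_usage_m=650, oom_kills=2),
    WorkloadStats("cron", cpu_request_m=100, cpu_usage_m=80, oom_kills=0),
]

for w in workloads:
    print(w.name, flag_workload(w))
```

A report like this, run across every cluster, is the sort of signal that would otherwise have to be assembled by hand from dashboards.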
The Impact
Cutting the Number of Allocatable CPUs in Half
ScaleOps reduced Booksy’s allocated CPU by ~50% across dev, staging, and production clusters. The platform automatically rightsizes workloads in real time, ensuring efficient allocation of CPU and memory. This not only reduces costs but also enhances performance and reliability, allowing Booksy to focus on scaling without worrying about excessive infrastructure expenses.
Reduced Infrastructure Management Time
Booksy has eliminated the manual effort required to rightsize its infrastructure through the automated optimization capabilities provided by ScaleOps. The platform continuously adjusts resource requests based on real-time demand, allowing Booksy’s teams to focus on higher-priority tasks and strategic initiatives rather than spending time on constant workload tuning. This has streamlined operations and ensured better resource efficiency, contributing to improved performance across environments.
Keeping Developers in the Loop
By adopting ScaleOps, Booksy gained a comprehensive solution that not only automates resource management at the pod level but also keeps developers in the loop by giving them visibility into critical events that can affect performance, reliability, and cost.
Summary
Booksy partnered with ScaleOps to optimize its Kubernetes-based infrastructure, reducing costs by 50%. Using ScaleOps’ automated pod rightsizing capabilities, the team significantly cut cloud costs while improving performance and scalability, even during peak demand. The onboarding of ScaleOps was seamless, with a simple Helm chart and an easy installation process that streamlined deployment and upgrades. ScaleOps’ out-of-the-box, auto-detected policies gave Booksy a fast, hands-free experience while ensuring optimal resource allocation and high service availability. This holistic approach has not only delivered significant savings but also increased operational efficiency.