Predictive Hybrid Autoscaling for Cloud Workloads: A Machine Learning Approach to Vertical and Horizontal Resource Optimization on AWS EC2
Abstract
Cloud infrastructure management faces a persistent challenge: balancing service reliability against cost efficiency as workload demand fluctuates unpredictably. Traditional autoscaling relies on reactive, threshold-based triggers that respond only after performance degradation has begun, and AWS native autoscaling solutions provide horizontal scaling only. This article introduces a predictive hybrid autoscaling framework that uses machine learning forecasting models to anticipate resource demand and orchestrates both vertical instance resizing and horizontal capacity adjustments. Three time-series forecasting approaches, namely ARIMA, Long Short-Term Memory (LSTM) networks, and Facebook Prophet, are compared across diverse workload patterns, including periodic traffic cycles, sudden demand spikes, gradual growth trends, and unpredictable variations. The experimental testbed employs multiple AWS EC2 instance families under realistic application scenarios and measures Service Level Objective (SLO) compliance, availability, resource utilization efficiency, and total infrastructure cost. Results demonstrate that predictive hybrid scaling substantially improves reliability while reducing operational expenses compared to reactive autoscaling and static over-provisioning. LSTM networks excel at capturing complex non-linear demand patterns, while Prophet performs best on seasonal workloads. The framework integrates with standard AWS services through open-source implementations, giving cloud operators practical tools for proactive capacity management. The work bridges predictive analytics and site reliability engineering practice, offering a systematic approach to cost-reliability optimization in dynamic cloud environments.
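To make the idea of forecast-driven hybrid scaling concrete, the following is a minimal Python sketch and not the framework's actual implementation. It assumes a seasonal-naive forecast as a simple stand-in for the ARIMA, LSTM, and Prophet models, and a hypothetical table of instance sizes whose capacities and prices are invented for illustration. Given a demand forecast, it chooses the instance size (vertical dimension) and instance count (horizontal dimension) that cover the forecast plus headroom at the lowest hourly cost.

```python
import math
from typing import NamedTuple

# Hypothetical instance sizes: (requests/second per instance, cost in cents/hour).
# These values are illustrative assumptions, not measurements or AWS prices.
SIZES = {
    "small": (100, 10),
    "medium": (220, 19),
    "large": (400, 40),
}


class ScalingPlan(NamedTuple):
    size: str        # vertical choice: which instance size to run
    count: int       # horizontal choice: how many instances
    cost_cents: int  # total hourly cost of the plan


def seasonal_naive_forecast(history, season_length):
    """Predict the next value as the value observed one season earlier.

    A deliberately simple stand-in for a trained forecasting model.
    """
    if len(history) < season_length:
        raise ValueError("need at least one full season of history")
    return history[-season_length]


def plan_capacity(forecast_rps, headroom=0.2, min_instances=2):
    """Pick the cheapest (size, count) pair that covers forecast plus headroom."""
    required = forecast_rps * (1 + headroom)
    candidates = []
    for size, (capacity, cents) in SIZES.items():
        # Enough instances to serve the load, but never below the
        # redundancy floor.
        count = max(min_instances, math.ceil(required / capacity))
        candidates.append(ScalingPlan(size, count, count * cents))
    # Lowest cost wins; ties go to the plan with fewer instances.
    return min(candidates, key=lambda p: (p.cost_cents, p.count))


if __name__ == "__main__":
    # Two "days" of four samples each; the next sample repeats slot 0.
    history = [50, 80, 300, 120, 55, 85, 310, 125]
    forecast = seasonal_naive_forecast(history, season_length=4)
    print(forecast, plan_capacity(forecast))
    print(plan_capacity(600))
```

In this sketch a low forecast keeps the fleet at the redundancy floor on the smallest size, while a higher forecast can make a larger size cheaper overall, which is the kind of trade-off a hybrid scaler has to evaluate ahead of demand. A real deployment would also need to account for resize downtime, instance warm-up, and forecast uncertainty.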