Load Prediction and Auto Scaling Models for Fintech Cloud Workloads

Ervin Danika

Authors

Ervin Danika, Department of Computer Science, University of Malaya

Keywords:

Load prediction; auto-scaling; fintech cloud workloads; performance engineering; elastic computing; cloud infrastructure

Abstract

Fintech cloud workloads exhibit highly dynamic and burst-prone traffic patterns driven by real-time payments, market events, regulatory deadlines, and customer behavior. Ensuring performance, availability, and cost efficiency under these conditions requires accurate load prediction and responsive auto-scaling mechanisms. Traditional reactive scaling approaches based on static thresholds are often insufficient, leading to latency spikes, service degradation, or excessive resource over-provisioning. This paper investigates predictive load modeling and intelligent auto-scaling strategies tailored for fintech cloud workloads. It proposes an integrated framework combining time-series forecasting, machine-learning-based demand prediction, and policy-driven scaling orchestration. Using modeled fintech workloads (payment processing, onboarding pipelines, and risk analytics), the study evaluates predictive versus reactive scaling approaches across performance, cost, and resilience metrics. Results show that predictive load-aware auto-scaling reduces latency violations by up to 37%, lowers infrastructure costs by 29%, and improves service-level objective (SLO) compliance during peak and anomalous load events. The findings position predictive load modeling as a core capability for scalable, reliable, and cost-efficient fintech cloud operations.
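
The contrast between reactive and predictive scaling can be illustrated with a minimal Python sketch. It is not the framework evaluated in the paper: Holt's linear-trend smoothing stands in here for the forecasting component, and every function name, parameter, and number below (`holt_forecast`, `replicas_for`, `target_utilization`, the sample request rates) is a hypothetical choice made for illustration only.

```python
import math


def holt_forecast(history, alpha=0.5, beta=0.5, horizon=1):
    """Forecast load `horizon` steps ahead with Holt's linear-trend smoothing.

    `history` is a list of observed request rates (e.g. requests/sec).
    This is an illustrative stand-in, not the paper's forecasting model.
    """
    if len(history) < 2:
        raise ValueError("need at least two observations")
    level = history[0]
    trend = history[1] - history[0]
    for observed in history[1:]:
        previous_level = level
        # Blend the new observation with the previous one-step projection.
        level = alpha * observed + (1 - alpha) * (level + trend)
        # Update the slope estimate from the change in level.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level + horizon * trend


def replicas_for(load, capacity_per_replica, target_utilization=0.8,
                 min_replicas=1):
    """Replicas needed so each one runs at or below `target_utilization`."""
    usable_capacity = capacity_per_replica * target_utilization
    return max(min_replicas, math.ceil(load / usable_capacity))


if __name__ == "__main__":
    # Hypothetical ramp in payment traffic (requests/sec per interval).
    observed = [100, 200, 300, 400]
    capacity = 100  # assumed requests/sec one replica can serve

    # Reactive: size for the load already observed.
    reactive = replicas_for(observed[-1], capacity)
    # Predictive: size for the load expected in the next interval.
    predictive = replicas_for(holt_forecast(observed), capacity)
    print("reactive:", reactive, "predictive:", predictive)
```

In this sketch the reactive policy sizes the fleet for the last observed interval, while the predictive policy sizes it for the extrapolated next interval, so capacity is added before the ramp arrives rather than after a threshold is breached.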
