- Launch Templates: Define the configuration for launching new EC2 instances.
- Auto Scaling Groups (ASGs): Set parameters such as minimum, maximum, and desired instance counts.
- Scaling Policies: Specify when and how to modify the number of instances.
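
To make the relationship between these components concrete, here is a minimal Python sketch of how a group's minimum, maximum, and desired counts bound whatever a scaling policy requests. The `AutoScalingGroupConfig` class, its field names, and the `web-template` name are hypothetical and for illustration only; they do not correspond to any AWS SDK type.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AutoScalingGroupConfig:
    """Hypothetical stand-in for the capacity settings of an Auto Scaling group."""
    launch_template: str   # name of the launch template (illustrative value)
    min_size: int
    max_size: int
    desired_capacity: int

    def __post_init__(self) -> None:
        # The desired count must always sit between the minimum and maximum.
        if not 0 <= self.min_size <= self.max_size:
            raise ValueError("need 0 <= min_size <= max_size")
        if not self.min_size <= self.desired_capacity <= self.max_size:
            raise ValueError("desired_capacity must lie within [min_size, max_size]")

    def clamp(self, requested: int) -> int:
        """Bound a capacity requested by a scaling policy to the group's limits."""
        return max(self.min_size, min(self.max_size, requested))


group = AutoScalingGroupConfig("web-template", min_size=2, max_size=10, desired_capacity=4)
print(group.clamp(15))  # a policy asking for 15 instances is capped at max_size
```

Whatever a scaling policy decides, the resulting instance count stays within the group's minimum and maximum, which is why those two values act as a safety net for cost and availability.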

Dynamic Scaling
Dynamic scaling adjusts the number of instances based on real-time metrics. It is available in three modes:
- Target Tracking Scaling: Specify a target metric value (e.g., maintaining average CPU utilization at 80%). The scaling policy automatically adds or removes instances to keep the metric close to your target.
- Step Scaling: Define multiple thresholds, each with its own scaling action. For instance, if CPU utilization is between 70% and 80%, the policy might add one instance; if it is between 80% and 90%, it might add two. This approach provides a graduated response to varying loads.
- Simple Scaling: Trigger a single fixed scaling action when one metric crosses a preset threshold (such as CPU utilization rising above 80%). Though effective, it is considered a legacy method compared to the other dynamic options.
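
The idea behind target tracking can be sketched with a simple proportional model. This is an illustrative approximation, not the algorithm AWS actually runs: it assumes load is spread evenly, so that the average metric falls in proportion as instances are added. The function name and parameters are hypothetical.

```python
import math


def target_tracking_capacity(current_instances: int, metric_value: float,
                             target_value: float, min_size: int, max_size: int) -> int:
    """Estimate the instance count that would bring the average metric back to target.

    Simplifying assumption: total load is constant and evenly distributed, so
    average utilization scales inversely with the number of instances.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    needed = math.ceil(current_instances * metric_value / target_value)
    # Never leave the bounds configured on the Auto Scaling group.
    return max(min_size, min(max_size, needed))


# 5 instances averaging 96% CPU against an 80% target: scale out.
print(target_tracking_capacity(5, 96.0, 80.0, min_size=2, max_size=10))
# 5 instances averaging 40% CPU against the same target: scale in.
print(target_tracking_capacity(5, 40.0, 80.0, min_size=2, max_size=10))
```

Rounding up errs on the side of slightly too much capacity, which is usually the safer direction when the goal is to keep the metric at or below the target.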
Example: Dynamic Scaling Modes in Action
Imagine you set a target tracking policy to maintain CPU utilization at 50%. Whenever usage drifts above or below 50%, the policy adjusts capacity to bring it back toward the target.
A step scaling policy, by contrast, might respond in tiers:
- Below 70%: No scaling action.
- Between 70% and 85%: Increase capacity by adding a specific number of instances.
- Above 85%: Add even more instances to handle the elevated load quickly.
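
The tiers above can be sketched as a small lookup in Python. The adjustment sizes (+1 and +3 instances) are assumed for illustration, since the example does not give specific numbers, and treating each lower bound as inclusive and each upper bound as exclusive is likewise a choice made for this sketch.

```python
from typing import List, Optional, Tuple

# (lower_bound, upper_bound, instances_to_add); an upper bound of None means "no limit".
# The +1 and +3 adjustments are hypothetical values chosen for this sketch.
STEPS: List[Tuple[float, Optional[float], int]] = [
    (70.0, 85.0, 1),
    (85.0, None, 3),
]


def step_adjustment(cpu_percent: float) -> int:
    """Return how many instances to add for the given CPU utilization."""
    for lower, upper, add in STEPS:
        if cpu_percent >= lower and (upper is None or cpu_percent < upper):
            return add
    return 0  # below the first threshold: no scaling action


for cpu in (60.0, 75.0, 92.0):
    print(cpu, step_adjustment(cpu))
```

Keeping the steps in a table rather than in nested conditionals makes it easy to add further tiers without touching the lookup logic.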


Scheduled Scaling
Scheduled scaling is based on predetermined time intervals rather than real-time metrics. This approach works best when your application’s workload follows predictable patterns. For example, if a website experiences peak traffic from 8 AM to 10 AM, you can schedule an increase in instance capacity just before 8 AM and a decrease after 10 AM.
Scheduled scaling is ideal for workloads with known traffic patterns but might be less effective if the traffic pattern deviates from the expected schedule.
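
As a rough illustration of the 8 AM to 10 AM example, the sketch below picks a desired capacity purely from the time of day. The 15-minute lead and lag, and the baseline and peak instance counts, are assumptions made for this sketch rather than recommended values.

```python
from datetime import time

# Hypothetical schedule: scale out shortly before the 8 AM peak,
# scale back in shortly after it ends at 10 AM.
PEAK_START = time(7, 45)
PEAK_END = time(10, 15)


def scheduled_capacity(now: time, baseline: int = 2, peak: int = 6) -> int:
    """Return the desired instance count for the given time of day."""
    return peak if PEAK_START <= now < PEAK_END else baseline


print(scheduled_capacity(time(7, 30)))   # before the window: baseline capacity
print(scheduled_capacity(time(9, 0)))    # inside the window: peak capacity
print(scheduled_capacity(time(11, 0)))   # after the window: baseline capacity
```

Note that the function never looks at load. If traffic arrives at 11 AM instead, capacity stays at the baseline, which is exactly the limitation described above.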
Predictive Scaling
Predictive scaling, sometimes referred to as historical scaling, uses historical data and machine learning models to forecast demand. This method is particularly valuable for applications with cyclical or seasonal traffic trends. The system analyzes past data to predict future resource needs and scales accordingly.
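Predictive scaling requires sufficient historical data; EC2 Auto Scaling, for example, needs at least 24 hours of metric history before it can generate a forecast. Without enough past metrics, forecasts are unreliable or unavailable, which makes this method a poor fit for newly launched applications.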
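
The forecasting idea can be illustrated with a deliberately naive stand-in for the machine learning models: average the load seen at each hour of the day across past days, then size capacity for that forecast ahead of time. The function names, the sample request rates, and the assumed 250 requests per instance are all hypothetical.

```python
import math
from statistics import mean
from typing import Dict, List


def forecast_by_hour(history: List[Dict[int, float]]) -> Dict[int, float]:
    """Average the load observed at each hour of day across past days."""
    hours = set().union(*(day.keys() for day in history))
    return {h: mean(day[h] for day in history if h in day) for h in sorted(hours)}


def instances_needed(forecast_load: float, load_per_instance: float,
                     min_size: int = 1) -> int:
    """Capacity required to serve the forecast load, rounded up."""
    return max(min_size, math.ceil(forecast_load / load_per_instance))


# Two past days of requests per minute, keyed by hour of day (made-up numbers).
history = [
    {8: 900.0, 9: 1100.0, 14: 300.0},
    {8: 1100.0, 9: 1300.0, 14: 500.0},
]
forecast = forecast_by_hour(history)
plan = {hour: instances_needed(load, load_per_instance=250.0)
        for hour, load in forecast.items()}
print(plan)
```

With only two days of history, a single unusual day shifts the averages substantially, which illustrates why forecasts built on sparse data are unreliable.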
Conclusion
AWS Auto Scaling provides three distinct approaches to handling varying workloads:
- Dynamic Scaling: Easily adapts in real time with options such as target tracking, step scaling, and simple scaling.
- Scheduled Scaling: Adjusts instance counts based on predetermined schedules, perfect for predictable traffic patterns.
- Predictive Scaling: Uses historical data to forecast future demand and adjust capacity ahead of it.