Autoscaling Groups Overview
In this guide, you'll learn how auto scaling groups (ASGs) help manage EC2 instance capacity automatically based on your application’s traffic demands. Imagine an application that typically runs on three EC2 instances; during peak hours or special events, these instances might experience heavy traffic. With ASGs, you can automatically scale the number of instances without any manual intervention by setting scaling policies based on metrics like traffic load or CPU utilization.
This automation removes the need for constant monitoring and manual adjustments. Best of all, auto scaling groups incur no additional charges—you only pay for the EC2 instances and the duration they run.
Benefits and Features of Auto Scaling Groups
Auto scaling groups offer several powerful advantages:
- Scalability: Automatically add or remove EC2 instances based on current demand.
- Cost Efficiency: Reduce costs by scaling down during off-peak hours to avoid underutilized resources.
- High Availability and Fault Tolerance: Distribute instances across multiple availability zones, so if one zone experiences issues, your application remains accessible.
- Load Balancer Integration: Newly launched EC2 instances are registered automatically with an Elastic Load Balancer (ELB) to ensure smooth traffic distribution.
As new EC2 instances are launched, the load balancer starts directing traffic to them once they pass its health checks.
Launch Templates
When you create an auto scaling group, you must define a launch template. This template is a blueprint that specifies how new EC2 instances should be configured and includes essential settings such as:
- AMI (Amazon Machine Image)
- Instance type
- Security groups
- Key pairs
- IAM roles
- Network interfaces
- User data
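To make the blueprint idea concrete, here is a minimal sketch that assembles most of the settings listed above into a dictionary shaped like the `LaunchTemplateData` parameter of the EC2 API. The helper name `build_launch_template_data` and every identifier value are placeholders invented for illustration, and the sketch makes no AWS calls.

```python
import base64


def build_launch_template_data(ami_id, instance_type, security_group_ids,
                               key_name, instance_profile_name, user_data):
    """Return a dict shaped like the LaunchTemplateData request parameter.

    Hypothetical helper for illustration; it only builds data locally.
    """
    return {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "SecurityGroupIds": list(security_group_ids),
        "KeyName": key_name,
        # An IAM role reaches an instance through an instance profile.
        "IamInstanceProfile": {"Name": instance_profile_name},
        # Launch template user data is passed base64-encoded.
        "UserData": base64.b64encode(user_data.encode("utf-8")).decode("ascii"),
    }


# Placeholder values only; none of these refer to real resources.
template_data = build_launch_template_data(
    ami_id="ami-0123456789abcdef0",
    instance_type="t3.micro",
    security_group_ids=["sg-0123456789abcdef0"],
    key_name="example-key",
    instance_profile_name="example-instance-profile",
    user_data="#!/bin/bash\necho hello\n",
)
```

With boto3, a dictionary like this would typically be passed as the `LaunchTemplateData` argument of the EC2 client's `create_launch_template` call.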
Scaling Methods
Auto scaling groups provide flexible methods for managing capacity according to your application's needs.
Dynamic Scaling
Dynamic scaling adjusts capacity in real-time based on demand and offers three options:
Simple Scaling:
A scaling policy tied to a CloudWatch alarm triggers a change in capacity. For example, you may add instances when CPU utilization exceeds 70% and remove them when it falls below 30%.
Step Scaling:
This method allows you to define multiple scaling actions at different thresholds. For instance:
- If CPU utilization is below 20%, significantly reduce instances.
- If utilization is between 20% and 40%, moderately decrease capacity.
- Maintain capacity when utilization is between 40% and 70%.
- Add a few instances if CPU utilization is between 70% and 85%.
- Add many instances if it exceeds 85%.
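The step thresholds above can be sketched as a small lookup function. The adjustment sizes (-2, -1, 0, +1, +3), the handling of values that fall exactly on a boundary, and the group limits are assumptions chosen for illustration; in a real ASG these values come from the step adjustments you configure on the policy.

```python
def step_adjustment(cpu_percent):
    """Map average CPU utilization to a change in instance count.

    Adjustment sizes and boundary handling are illustrative assumptions.
    """
    if cpu_percent < 20:
        return -2   # significantly reduce instances
    if cpu_percent < 40:
        return -1   # moderately decrease capacity
    if cpu_percent <= 70:
        return 0    # maintain capacity
    if cpu_percent <= 85:
        return 1    # add a few instances
    return 3        # add many instances


def apply_step_scaling(current_capacity, cpu_percent, min_size=1, max_size=10):
    """Apply the step adjustment and keep the result within the group limits."""
    desired = current_capacity + step_adjustment(cpu_percent)
    return max(min_size, min(max_size, desired))
```

For example, `apply_step_scaling(3, 95)` asks for six instances, while `apply_step_scaling(2, 10)` stops at the assumed minimum size of one instead of dropping to zero.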
Target Tracking Scaling:
With this policy, you define a target value—for example, maintaining CPU utilization at 40%. The auto scaling group automatically adjusts the instance count to keep the metric at the desired level.
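One way to build intuition for target tracking is a proportional estimate: if the metric is 50% above the target, roughly 50% more instances are needed. The sketch below uses that approximation. It is not the exact algorithm AWS runs, and the function name and the group limits are assumptions.

```python
import math


def target_tracking_capacity(current_capacity, metric_value, target_value,
                             min_size=1, max_size=10):
    """Estimate the capacity that would bring the metric back to its target.

    Proportional approximation for illustration, not the service's algorithm.
    """
    if target_value <= 0:
        raise ValueError("target_value must be positive")
    # Scale the current capacity by how far the metric is from the target.
    proportional = current_capacity * metric_value / target_value
    # Round up so the estimate never leaves the group short of capacity.
    desired = math.ceil(proportional)
    return max(min_size, min(max_size, desired))
```

With four instances averaging 60% CPU and a 40% target, the estimate is six instances. At 20% CPU, the same group would shrink to two.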
Predictive Scaling
Predictive scaling uses machine learning to analyze historical traffic patterns and forecast future demand. This proactive approach scales your ASG in advance, so capacity is already available when traffic increases, which reduces latency during the ramp-up.
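The forecasting idea can be illustrated with a deliberately naive stand-in: average the load seen at each hour over previous days, then size the group for that forecast ahead of time. This is not the machine learning model the service uses. The function names and the per-instance throughput figure are assumptions.

```python
import math
from statistics import mean


def forecast_hourly_load(history):
    """Average each hour of the day across past days.

    `history` is a list of days, each a list of 24 hourly request counts.
    A naive seasonal average stands in for the real forecasting model.
    """
    return [mean(day[hour] for day in history) for hour in range(24)]


def capacity_plan(forecast, requests_per_instance, min_size=1):
    """Turn an hourly load forecast into an hourly instance count."""
    return [max(min_size, math.ceil(load / requests_per_instance))
            for load in forecast]


# Two made-up days with a traffic spike at 6 PM (hour 18).
day_one = [100] * 24
day_one[18] = 900
day_two = [100] * 24
day_two[18] = 1100

forecast = forecast_hourly_load([day_one, day_two])
plan = capacity_plan(forecast, requests_per_instance=250)
```

Here the plan calls for four instances at hour 18 and one instance at quiet hours, so capacity can be added before the spike arrives instead of in reaction to it.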
Scheduled Scaling
Scheduled scaling allows you to predefine specific times to increase or decrease capacity. For example, you might scale out from 6 PM to 12 AM during high traffic periods and scale in from 12 AM to 6 AM when demand is lower.
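The schedule in that example can be sketched as a function from the hour of day to a desired capacity. The capacities used here (a baseline of three, six at peak, one overnight) are assumptions for illustration. In a real ASG you would express the same idea as scheduled actions.

```python
def scheduled_capacity(hour, baseline=3, peak=6, overnight=1):
    """Return the desired instance count for an hour of the day (0-23).

    The capacity values are illustrative assumptions.
    """
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    if hour < 6:
        return overnight   # 12 AM to 6 AM: scale in
    if hour >= 18:
        return peak        # 6 PM to 12 AM: scale out
    return baseline        # rest of the day
```

Calling `scheduled_capacity(19)` returns the peak capacity, and `scheduled_capacity(2)` returns the overnight capacity.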
Metrics and Cooldown Period
ASGs rely on key metrics to trigger scaling actions, including:
- ASG Average CPU Utilization
- Network In (amount and number of packets)
- Network Out
- Requests per target (from the Application Load Balancer)
Rapid fluctuations in these metrics can lead to frequent scaling events. To mitigate this, a cooldown period (300 seconds by default) follows each scaling action, during which simple scaling policies do not trigger further actions. This pause gives the previous change time to take effect before additional scaling activities are triggered.
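The effect of a cooldown can be shown with a small gate that rejects scaling requests arriving too soon after the last accepted one. The class name `CooldownGate` and the alarm timestamps are hypothetical, and times are plain seconds rather than real clock readings.

```python
class CooldownGate:
    """Allow a scaling action only when the cooldown has elapsed."""

    def __init__(self, cooldown_seconds=300):
        self.cooldown_seconds = cooldown_seconds
        self.last_action_at = None  # time of the last accepted action

    def try_scale(self, now):
        """Return True and record the action if scaling is allowed at `now`."""
        in_cooldown = (
            self.last_action_at is not None
            and now - self.last_action_at < self.cooldown_seconds
        )
        if in_cooldown:
            return False
        self.last_action_at = now
        return True


# An alarm that fires at these made-up times (in seconds).
gate = CooldownGate(cooldown_seconds=300)
alarm_times = [0, 60, 120, 300, 360]
decisions = [gate.try_scale(t) for t in alarm_times]
```

Only the alarms at 0 and 300 seconds result in a scaling action. The ones in between, and the one at 360 seconds, fall inside a cooldown window and are ignored.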
Instance Refresh
When you update your launch template, existing EC2 instances may have outdated configurations. Auto scaling groups offer an instance refresh feature that replaces these outdated instances with new ones running the updated configuration. This update is performed gradually to ensure your application remains available throughout the process.
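The gradual replacement can be pictured as working through the group in batches while keeping a minimum share of instances in service. The sketch below is loosely modelled on the minimum healthy percentage setting of an instance refresh. The batching rule is a simplification, and the function name and instance IDs are made up.

```python
import math


def refresh_batches(instance_ids, min_healthy_percent=90):
    """Split instances into replacement batches.

    Each batch is small enough that at least `min_healthy_percent` of the
    group stays in service, with a floor of one instance per batch so the
    refresh always makes progress. Simplified model for illustration.
    """
    total = len(instance_ids)
    must_stay_healthy = math.ceil(total * min_healthy_percent / 100)
    batch_size = max(1, total - must_stay_healthy)
    return [instance_ids[i:i + batch_size]
            for i in range(0, total, batch_size)]


# Hypothetical instance IDs.
old_instances = ["i-aaa", "i-bbb", "i-ccc", "i-ddd"]
batches = refresh_batches(old_instances, min_healthy_percent=50)
```

With four instances and a 50% minimum, the refresh proceeds in two batches of two. A higher minimum means smaller batches and a slower, safer rollout.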
Summary
ASGs enable you to manage EC2 instance capacity based on real-time and predicted demand, with no charge beyond the EC2 instances themselves. Key takeaways include:
- The launch template acts as a blueprint for configuring new EC2 instances.
- Simple scaling: Uses a single CloudWatch alarm and scaling policy for capacity adjustments.
- Step scaling: Provides more granular control with multiple actions at different thresholds.
- Target tracking scaling: Automatically maintains a desired metric value.
- Predictive scaling: Uses machine learning to forecast demand and scale proactively.
- Scheduled scaling: Executes scaling based on predefined time intervals.
- Cooldown period: Prevents rapid, successive scaling activities to stabilize the system.
- Instance refresh: Ensures all running instances are updated with the latest configurations.
Quick Tip
For more detailed information about auto scaling groups and other AWS services, consider exploring the AWS Documentation.