Chaos Engineering

Building a Basic FIS experiment

Experiment 1: Chaos Engineering on ASG

In this chapter, we’ll design and execute our first AWS Fault Injection Simulator (FIS) experiment against an Auto Scaling Group (ASG). The goal is to validate that terminating a single EC2 instance does not degrade application availability because the ASG will replace it automatically.

Components of the FIS Experiment

Component | Description
Given | We have an application running on EC2 instances spread across multiple Availability Zones, all managed by an Auto Scaling Group.
Hypothesis | If we terminate one EC2 instance, the Auto Scaling Group will launch a new instance, and the application will continue serving traffic without interruption.
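
One way to make the hypothesis testable is to write it as a pass/fail check over observations collected while the experiment runs. The sketch below is only an illustration: the `Observation` structure, its field names, and the `hypothesis_holds` helper are hypothetical and not part of FIS or any AWS SDK.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Observation:
    """One sample taken during the experiment (hypothetical structure)."""
    http_ok: bool     # did a probe request to the application succeed?
    in_service: int   # instances the ASG reported as InService at that moment


def hypothesis_holds(samples: List[Observation], desired_capacity: int) -> bool:
    """Return True if every probe succeeded and the ASG ended at full capacity."""
    if not samples:
        return False  # no evidence means the hypothesis is not confirmed
    no_interruption = all(s.http_ok for s in samples)
    capacity_restored = samples[-1].in_service >= desired_capacity
    return no_interruption and capacity_restored


# Example: capacity dips to 2 after the termination, then recovers to 3.
samples = [Observation(True, 3), Observation(True, 2), Observation(True, 3)]
print(hypothesis_holds(samples, desired_capacity=3))
```

How the samples are gathered (load balancer metrics, synthetic probes, ASG API calls) is left open here and depends on your setup.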

Note

We use AWS Fault Injection Simulator to safely inject failures and test application resilience. Make sure your IAM role has the required permissions to execute FIS experiments.
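
As a rough illustration of what those permissions involve, the sketch below builds two IAM policy documents as Python dictionaries: a trust policy that lets the FIS service assume the role, and a permissions policy for the terminate-instances action. The permission list and the wildcard `Resource` are assumptions made for this example; check the FIS documentation for the exact actions your experiment needs and scope them more tightly where possible.

```python
import json

# Trust policy: allows the FIS service principal to assume the experiment role.
TRUST_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {"Service": "fis.amazonaws.com"},
            "Action": "sts:AssumeRole",
        }
    ],
}

# Permissions policy (assumed minimum for aws:ec2:terminate-instances).
# "Resource": "*" is illustrative only; restrict it for real use.
PERMISSIONS_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["ec2:TerminateInstances", "ec2:DescribeInstances"],
            "Resource": "*",
        }
    ],
}

# Print both documents as JSON, ready to paste into the IAM console or CLI.
print(json.dumps(TRUST_POLICY, indent=2))
print(json.dumps(PERMISSIONS_POLICY, indent=2))
```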

Experiment Steps

  1. Create FIS Experiment Template
    Define the target resources (the ASG) and select the aws:ec2:terminate-instances action.
  2. Specify Targets and Actions
    • Target: EC2 instances belonging to your ASG
    • Action: Terminate one randomly selected instance
  3. Set Stop Conditions
    Attach CloudWatch alarms (e.g., on high error rates or latency) as stop conditions. If any of these alarms enters the ALARM state, FIS stops the experiment automatically.
  4. Run and Observe
    Execute the FIS experiment and watch the ASG replace the terminated instance.
  5. Validate Outcome
    Confirm that the new EC2 instance passes health checks and that no user-facing errors occur.
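
The steps above can be sketched as an experiment template definition. The function below only builds the template as a dictionary; the key names are intended to match the keyword arguments of boto3's `create_experiment_template` call for the FIS client, but verify them against the current documentation before relying on them. The function name, the target and action labels (`oneAsgInstance`, `terminateInstance`), and all ARNs and names in the usage example are hypothetical placeholders. The sketch also assumes instances carry the `aws:autoscaling:groupName` tag that Auto Scaling applies to the instances it launches.

```python
def build_experiment_template(asg_name: str, role_arn: str, alarm_arn: str) -> dict:
    """Build an FIS template that terminates one running instance of an ASG."""
    return {
        "description": f"Terminate one instance in ASG {asg_name}",
        "roleArn": role_arn,
        # Step 2 (target): one running EC2 instance tagged as part of the ASG.
        "targets": {
            "oneAsgInstance": {
                "resourceType": "aws:ec2:instance",
                "resourceTags": {"aws:autoscaling:groupName": asg_name},
                "filters": [{"path": "State.Name", "values": ["running"]}],
                "selectionMode": "COUNT(1)",  # pick a single instance at random
            }
        },
        # Step 2 (action): terminate the selected instance.
        "actions": {
            "terminateInstance": {
                "actionId": "aws:ec2:terminate-instances",
                "targets": {"Instances": "oneAsgInstance"},
            }
        },
        # Step 3: stop the experiment if this CloudWatch alarm fires.
        "stopConditions": [
            {"source": "aws:cloudwatch:alarm", "value": alarm_arn}
        ],
    }


# Placeholder values for illustration only.
template = build_experiment_template(
    asg_name="my-app-asg",
    role_arn="arn:aws:iam::123456789012:role/fis-experiment-role",
    alarm_arn="arn:aws:cloudwatch:us-east-1:123456789012:alarm:HighErrorRate",
)

# With boto3 installed and credentials configured, the template could be
# submitted roughly like this:
#   boto3.client("fis").create_experiment_template(**template)
```

Keeping the template as plain data makes it easy to review or version-control before anything is created in the account.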

Warning

Always run chaos experiments in a staging or non-production environment first. Verify that your CloudWatch alarms and Auto Scaling health checks are correctly configured to avoid unintended downtime.
