AWS Certified SysOps Administrator - Associate
Domain 2 Reliability and BCP
General Replication Options for Data Services on AWS
Welcome to this lesson on data replication within AWS. In this lesson, we explore the concept of replication, why it matters, and how AWS implements various replication strategies to ensure high availability, data restoration, and operational continuity.
Imagine a website user interacting with an application. To serve different use cases and provide redundancy, the application's data is replicated across several environments. Replication keeps that data available and consistent automatically, which matters because copying it by hand would be impractical.
Consider the following diagram illustrating a primary site replicating data to several secondary sites for purposes such as reporting, data warehousing, backup, and auditing:
Data replication goes beyond simple backups. It involves adapting data for different use cases with strategies that ensure either strong or eventual consistency, depending on whether synchronous or asynchronous replication is employed.
For instance, asynchronous replication is favored for workloads such as reporting and backups, where a slight delay in the secondary data set is acceptable. The following diagram contrasts asynchronous replication, where the secondary lags behind the primary, with synchronous replication, where the secondary is updated before a write is acknowledged:
Replication Strategies
Asynchronous replication is suitable for non-critical applications where a minor delay is permissible, whereas synchronous replication is key for mission-critical scenarios demanding zero data loss.
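To make the difference concrete, here is a minimal in-memory sketch. It is a toy model, not how any AWS service is implemented: the `Primary` and `Replica` classes and their methods are hypothetical names invented for this illustration. A synchronous write is applied to the replica before the call returns, while an asynchronous write is queued and shipped later, so `lag()` counts the writes that would be lost if the primary failed at that moment.

```python
from collections import deque


class Replica:
    """Hypothetical secondary copy of the data."""

    def __init__(self):
        self.data = {}

    def apply(self, key, value):
        self.data[key] = value


class Primary:
    """Hypothetical primary that replicates either synchronously or asynchronously."""

    def __init__(self, replica, synchronous):
        self.data = {}
        self.replica = replica
        self.synchronous = synchronous
        self.pending = deque()  # writes not yet shipped to the replica

    def write(self, key, value):
        self.data[key] = value
        if self.synchronous:
            # Synchronous: the replica has the write before we acknowledge it.
            self.replica.apply(key, value)
        else:
            # Asynchronous: acknowledge now, replicate later.
            self.pending.append((key, value))

    def ship_pending(self):
        """Deliver queued writes to the replica (the 'later' in asynchronous mode)."""
        while self.pending:
            self.replica.apply(*self.pending.popleft())

    def lag(self):
        """Number of acknowledged writes the replica has not seen yet."""
        return len(self.pending)


sync_replica = Replica()
sync_primary = Primary(sync_replica, synchronous=True)
sync_primary.write("order-1", "paid")

async_replica = Replica()
async_primary = Primary(async_replica, synchronous=False)
async_primary.write("order-1", "paid")
```

After these writes, `sync_replica` already holds the order and `sync_primary.lag()` is zero, whereas `async_replica` stays empty until `async_primary.ship_pending()` runs. That window is the lag the lesson refers to.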
AWS Data Replication Services
AWS offers a variety of features to handle replication, each tailored to different use cases and operating environments:
Amazon RDS Multi-AZ:
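- Multi-AZ DB instance deployments pair one primary instance with one standby instance in a different Availability Zone and replicate synchronously.
- Multi-AZ DB cluster deployments use one writer instance and two readable standby instances spread across three Availability Zones.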
DynamoDB Global Tables:
- Provides asynchronous replication across regions, supporting globally distributed applications with eventual consistency.
Cross-Region Replication for Amazon S3:
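- Automatically and asynchronously copies objects between buckets in different regions; versioning must be enabled on both buckets. Most objects replicate within 15 minutes, and S3 Replication Time Control (S3 RTC) is designed to replicate 99.99% of new objects within 15 minutes.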
AWS DataSync:
- Facilitates data transfer between on-premises storage and the AWS cloud, or between AWS services, using asynchronous methods.
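As a rough sketch of how three of these features are switched on through the AWS SDK for Python (boto3), the helpers below build the request parameters for each one. Every identifier, engine, instance class, region, and ARN passed in is a placeholder assumed for illustration, and the helper names themselves are hypothetical. Only `send_requests` talks to AWS, and it needs boto3 installed plus valid credentials.

```python
def multi_az_instance_params(db_id: str) -> dict:
    """Arguments for rds.create_db_instance with a standby in a second AZ."""
    return {
        "DBInstanceIdentifier": db_id,
        "Engine": "postgres",              # assumed engine
        "DBInstanceClass": "db.m5.large",  # assumed instance class
        "AllocatedStorage": 100,
        "MasterUsername": "appadmin",      # assumed user name
        "ManageMasterUserPassword": True,  # let RDS keep the password in Secrets Manager
        "MultiAZ": True,                   # this flag requests the standby instance
    }


def global_table_replica_params(table_name: str, region: str) -> dict:
    """Arguments for dynamodb.update_table that add a replica in another region."""
    return {
        "TableName": table_name,
        "ReplicaUpdates": [{"Create": {"RegionName": region}}],
    }


def s3_replication_config(role_arn: str, destination_bucket_arn: str) -> dict:
    """ReplicationConfiguration for s3.put_bucket_replication (replicates all objects)."""
    return {
        "Role": role_arn,
        "Rules": [
            {
                "ID": "replicate-all",
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix matches every object
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {"Bucket": destination_bucket_arn},
            }
        ],
    }


def send_requests(source_bucket: str, role_arn: str, destination_bucket_arn: str) -> None:
    """Not executed here: issues the real API calls using the parameters above."""
    import boto3  # third-party; imported lazily so the helpers work without it

    boto3.client("rds").create_db_instance(**multi_az_instance_params("orders-db"))
    boto3.client("dynamodb").update_table(
        **global_table_replica_params("orders", "eu-west-1")
    )
    # Both buckets must already have versioning enabled.
    boto3.client("s3").put_bucket_replication(
        Bucket=source_bucket,
        ReplicationConfiguration=s3_replication_config(role_arn, destination_bucket_arn),
    )
```

Keeping the parameter builders separate from the calls makes it easy to review or unit-test a replication setup before anything is created in an account.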
The diagram below summarizes these AWS services and their functions to bolster business continuity:
Choosing the Right Replication Strategy
Transactional databases often leverage synchronous replication to ensure strong consistency. In contrast, services such as DynamoDB Global Tables, S3 Cross-Region Replication, and DataSync typically use asynchronous replication. The ideal strategy depends on your use case: rapid recovery, enhanced availability, or offloading non-production tasks like analytics and reporting.
For instance, if there's a need to run reports without impacting the production database, setting up a read replica with asynchronous replication is a viable solution. In one case, a customer encountered performance issues on the production database when running reports. The recommended solution was to create a read replica, which required about 15 minutes for setup and roughly an hour for the initial large dataset to fully synchronize, thereby isolating analytics and reporting tasks from production traffic.
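A read replica like the one in this example can be requested with the RDS `create_db_instance_read_replica` API. The sketch below assumes boto3 and uses placeholder instance names and an assumed instance class. The `replica_caught_up` helper and its 5-second threshold are hypothetical: they illustrate one way to decide when the initial synchronization has finished, based on samples of the `ReplicaLag` CloudWatch metric (reported in seconds) that you would collect yourself.

```python
def read_replica_params(source_db_id: str, replica_id: str,
                        instance_class: str = "db.m5.large") -> dict:
    """Arguments for rds.create_db_instance_read_replica."""
    return {
        "DBInstanceIdentifier": replica_id,          # name of the new replica
        "SourceDBInstanceIdentifier": source_db_id,  # production instance to copy from
        "DBInstanceClass": instance_class,           # assumed size for reporting queries
    }


def replica_caught_up(lag_seconds: list, threshold: float = 5.0) -> bool:
    """True when the three most recent ReplicaLag samples are all at or below threshold.

    lag_seconds holds lag samples in seconds, oldest first.
    """
    recent = lag_seconds[-3:]
    return len(recent) == 3 and all(sample <= threshold for sample in recent)


def create_read_replica(params: dict) -> dict:
    """Not executed here: sends the request; needs boto3 and AWS credentials."""
    import boto3  # third-party; imported lazily

    return boto3.client("rds").create_db_instance_read_replica(**params)


params = read_replica_params("prod-db", "prod-db-reports")
still_syncing = replica_caught_up([3600, 120, 4])    # initial copy still draining
ready = replica_caught_up([3600, 120, 4, 2, 1])      # lag has settled
```

Reporting tools would then connect to the replica's endpoint instead of the production endpoint, ideally only after the lag has settled.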
The following diagram encapsulates various replication use cases in AWS, including disaster recovery, global data accessibility, and analytics/reporting:
Key Takeaways
- Synchronous Replication: Best for applications requiring zero data loss and fast, automatic failover, such as banking.
- Asynchronous Replication: Ideal for reporting, backups, and disaster recovery where a slight lag is acceptable.
Conclusion
Replication in AWS is a foundational technique that keeps your data available and recoverable while shielding production workloads from reporting and analytics load. By understanding and leveraging AWS replication strategies, you can tailor your approach to your business requirements, whether that means ensuring data consistency in transactional systems or offloading intensive read operations to replicas.
We hope this lesson has enriched your understanding of AWS replication options and provided you with the insights needed to make informed decisions for your data services.
We'll see you in the next article.