Cache Key Caching Policies Cache Behavior
This article provides an in-depth look at how caching works in CloudFront by examining cache behaviors, cache keys, cache policies, and origin request policies. Learn how these elements work together to optimize content delivery, minimize latency, and improve your application's overall performance.
Cache Behavior
CloudFront's cache behavior determines which origin is used to retrieve various objects based on the incoming request's path. For example, you can configure requests sent to the /images path to be directed to one origin (such as an S3 bucket) while those sent to the /app path may be delivered from another origin (like an EC2 instance). This configuration allows you to customize the delivery path for different types of content within the same CloudFront distribution.
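The path-based routing described above can be sketched in a few lines of Python. This is only an illustration: the patterns, the origin names, and the `select_origin` helper are hypothetical, and the shell-style matching from Python's `fnmatch` module is used as a stand-in for CloudFront's own path-pattern matching.

```python
from fnmatch import fnmatchcase

# Hypothetical behaviors, checked in order. The final "*" entry plays the
# role of a catch-all default behavior.
BEHAVIORS = [
    ("/images/*", "s3-static-assets"),  # assumed name for an S3 bucket origin
    ("/app/*", "ec2-app-server"),       # assumed name for an EC2 origin
    ("*", "default-origin"),
]


def select_origin(path: str) -> str:
    """Return the origin of the first behavior whose pattern matches the path."""
    for pattern, origin in BEHAVIORS:
        if fnmatchcase(path, pattern):  # case-sensitive wildcard match
            return origin
    raise LookupError(f"no behavior matches {path!r}")
```

Because the list is scanned in order, more specific patterns should come before the catch-all entry.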
How CloudFront Works Under the Hood
When a user makes a request to CloudFront, the following steps occur:
- The request reaches an edge location.
- CloudFront generates a cache key by extracting elements from the request, such as the URL, headers, cookies, and query strings based on your configuration.
- CloudFront checks whether an object matching the generated cache key exists in its cache:
  - If there is a cache hit, CloudFront returns the stored object immediately.
  - If there is a cache miss, CloudFront forwards the request to the origin, retrieves the object, caches it for future requests, and returns it to the user.
Key Insight
Understanding the cache key generation process is critical for optimizing your caching strategy and reducing origin load.
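As a rough mental model, the hit/miss flow listed above can be simulated with a dictionary. The `EdgeCache` class and `fake_origin` function are hypothetical names for this sketch, the key is simply the hostname plus the path, and TTL expiry is left out for brevity.

```python
from typing import Callable, Dict, Tuple

CacheKey = Tuple[str, str]


class EdgeCache:
    """Toy model of a single edge location's cache (no TTL handling)."""

    def __init__(self, fetch_from_origin: Callable[[str, str], bytes]) -> None:
        self._store: Dict[CacheKey, bytes] = {}
        self._fetch = fetch_from_origin
        self.origin_fetches = 0

    def handle(self, host: str, path: str) -> Tuple[bytes, str]:
        key = (host, path)  # simplified cache key: hostname + resource path
        if key in self._store:
            return self._store[key], "Hit"   # served from the cache
        body = self._fetch(host, path)       # cache miss: go to the origin
        self.origin_fetches += 1
        self._store[key] = body              # keep it for future requests
        return body, "Miss"


def fake_origin(host: str, path: str) -> bytes:
    """Stand-in for a real origin server."""
    return f"content of {path}".encode()


cache = EdgeCache(fake_origin)
first = cache.handle("example.com", "/articles/welcome.html")
second = cache.handle("example.com", "/articles/welcome.html")
```

In this model the second identical request never reaches the origin, which is the effect a good cache key is meant to produce.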
Understanding Cache Keys
A cache key is a unique identifier that CloudFront uses to locate and retrieve cached objects. When a client requests a resource from a domain associated with CloudFront (for example, /articles/welcome.html on example.com), the default cache key is composed of the hostname and the resource path.
If the object is not found in the cache, CloudFront will forward the request to the origin server to fetch it. However, the cache key can be customized to include additional request components such as query parameters, headers, and cookies.
For example, consider the HTTP request below:
GET /content/video.mp4?resolution=1080p
Host: d111111abcdef8.cloudfront.net
User-Agent: Mozilla/5.0 Gecko/20100101 Firefox/68.0
Accept-Encoding: gzip
Cookie: session_id=01234abcd
Compression: True
In this example, apart from the URI, CloudFront can include the query parameters (?resolution=1080p), headers, or cookies as part of the cache key. The cache policy (discussed in the next section) defines which components are used to construct the cache key as well as the Time To Live (TTL) for cached objects.
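To make the idea concrete, here is a minimal sketch of how a cache key could be assembled from the request above. The `KeyPolicy` class and `build_cache_key` function are hypothetical and do not mirror CloudFront's internal implementation; they only show that request parts left out of the policy do not influence the key.

```python
from dataclasses import dataclass
from typing import FrozenSet, Mapping, Tuple
from urllib.parse import parse_qsl, urlsplit


@dataclass(frozen=True)
class KeyPolicy:
    """Hypothetical allow-lists of request parts that join the cache key."""
    query_strings: FrozenSet[str] = frozenset()
    headers: FrozenSet[str] = frozenset()  # lowercase header names
    cookies: FrozenSet[str] = frozenset()


def build_cache_key(host: str, url: str, headers: Mapping[str, str],
                    cookies: Mapping[str, str], policy: KeyPolicy) -> Tuple:
    parts = urlsplit(url)
    # Keep only the parts named in the policy; everything else is ignored.
    query = tuple((k, v) for k, v in parse_qsl(parts.query)
                  if k in policy.query_strings)
    hdrs = tuple((k.lower(), v) for k, v in headers.items()
                 if k.lower() in policy.headers)
    cks = tuple((k, v) for k, v in cookies.items() if k in policy.cookies)
    return (host, parts.path, query, hdrs, cks)


by_resolution = KeyPolicy(query_strings=frozenset({"resolution"}))
request_headers = {
    "User-Agent": "Mozilla/5.0 Gecko/20100101 Firefox/68.0",
    "Accept-Encoding": "gzip",
}
key = build_cache_key(
    "d111111abcdef8.cloudfront.net",
    "/content/video.mp4?resolution=1080p",
    request_headers,
    {"session_id": "01234abcd"},
    by_resolution,
)
other_session = build_cache_key(
    "d111111abcdef8.cloudfront.net",
    "/content/video.mp4?resolution=1080p",
    request_headers,
    {"session_id": "zzz999"},
    by_resolution,
)
```

Under this assumed policy, the session cookie and the User-Agent header are not part of the key, so two viewers with different sessions map to the same cached object.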
Cache Policies in Action
Cache policies fine-tune your CloudFront configuration by specifying which components of an HTTP request should be used to form the cache key and by setting the TTL for your cached objects. Consider the following scenario:
A user makes a request:
GET /products/parts?type=motor
Host: www.carparts.com
Assuming your cache policy is configured to include the hostname, resource path, and query string, CloudFront will generate a cache key from these components. When the object is retrieved from the origin, it gets stored in the cache keyed to those properties.
Now consider a slightly different request made by another user:
GET /products/parts?type=wheel
Host: www.carparts.com
Since the query parameter differs (type=wheel versus type=motor), CloudFront will generate a unique cache key for this request. Without a matching cache entry, the request is forwarded to the origin, the new object is cached, and then returned to the user.
Tip
When designing cache policies, include only the request components that actually change the response. This maximizes the cache hit ratio and avoids storing duplicate copies of the same object.
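One way the car-parts policy could be expressed is as a `CachePolicyConfig` structure, the shape accepted by CloudFront's CreateCachePolicy API (exposed in boto3 as `create_cache_policy`). The sketch below only builds the dictionary, so it runs without AWS credentials. The helper name, the policy name, and the TTL values are assumptions chosen for illustration.

```python
from typing import Any, Dict, Iterable


def make_cache_policy_config(name: str, query_strings: Iterable[str],
                             default_ttl: int = 86400) -> Dict[str, Any]:
    """Build a CachePolicyConfig that keys on the given query strings only."""
    items = sorted(query_strings)
    return {
        "Name": name,
        "Comment": "Cache key: hostname, path, and selected query strings",
        "MinTTL": 0,                 # assumed TTL values, in seconds
        "DefaultTTL": default_ttl,
        "MaxTTL": 31536000,
        "ParametersInCacheKeyAndForwardedToOrigin": {
            "EnableAcceptEncodingGzip": True,
            "EnableAcceptEncodingBrotli": True,
            "HeadersConfig": {"HeaderBehavior": "none"},
            "CookiesConfig": {"CookieBehavior": "none"},
            "QueryStringsConfig": {
                "QueryStringBehavior": "whitelist",
                "QueryStrings": {"Quantity": len(items), "Items": items},
            },
        },
    }


config = make_cache_policy_config("carparts-by-type", ["type"])
# With boto3 installed and credentials configured, this could be submitted via:
#   boto3.client("cloudfront").create_cache_policy(CachePolicyConfig=config)
```

Listing only `type` keeps unrelated query parameters from splitting the cache into near-identical entries.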
Origin Request Policy
In the event of a cache miss, CloudFront sends an origin request which includes components defined by your cache key (such as hostname, resource path, and query string). However, your origin may require additional request data, such as a specific language header, which you might not want to include in the cache key to avoid creating separate cache entries for every variation.
The solution is to define an origin request policy. This policy specifies additional headers, cookies, or query strings to be forwarded to the origin server without impacting the cache key. AWS offers several pre-configured managed policies that cover common scenarios, or you can customize a policy to fit your requirements.
Pro Tip
Using origin request policies allows you to optimize your caching strategy while passing critical headers or cookies to your origin server for customized responses.
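The split between the two policies can be illustrated with a small simulation. The header sets and the `plan_request` helper are hypothetical. The sketch assumes the cache policy adds no headers to the key, while the origin request policy forwards `Accept-Language`.

```python
from typing import Dict, FrozenSet, Mapping, Tuple

# Assumed policy contents for this sketch (lowercase header names).
CACHE_KEY_HEADERS: FrozenSet[str] = frozenset()
ORIGIN_ONLY_HEADERS: FrozenSet[str] = frozenset({"accept-language"})


def plan_request(host: str, path: str,
                 headers: Mapping[str, str]) -> Tuple[Tuple, Dict[str, str]]:
    """Return (cache_key, headers_forwarded_to_origin_on_a_miss)."""
    lowered = {name.lower(): value for name, value in headers.items()}
    key_headers = tuple(sorted(
        (n, v) for n, v in lowered.items() if n in CACHE_KEY_HEADERS
    ))
    cache_key = (host, path, key_headers)
    # The origin receives the cache-key headers plus the origin-only ones.
    forwarded = {
        n: v for n, v in lowered.items()
        if n in CACHE_KEY_HEADERS | ORIGIN_ONLY_HEADERS
    }
    return cache_key, forwarded


key_en, to_origin_en = plan_request(
    "www.carparts.com", "/products/parts",
    {"Accept-Language": "en-US", "User-Agent": "example-agent"},
)
key_de, to_origin_de = plan_request(
    "www.carparts.com", "/products/parts",
    {"Accept-Language": "de-DE", "User-Agent": "example-agent"},
)
```

Both requests produce the same cache key, so they share one cache entry: the origin sees `Accept-Language` only on a miss, and whichever response is cached first is what later viewers receive until it expires.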
Summary
Below is a table that encapsulates the key aspects of CloudFront caching:
Component | Description | Key Elements |
---|---|---|
Cache Behavior | Directs requests to the appropriate origin based on the request path. | Request path (e.g., /images, /app) |
Cache Key | A unique identifier constructed from request details to locate cached objects. | Hostname, resource path, and optionally query strings, headers, cookies |
Cache Policy | Configures which request components form the cache key and specifies the TTL for cached objects. | Hostname, resource path, query string (as needed) |
Origin Request Policy | Specifies additional request details to forward to the origin without altering the cache key. | Extra headers, cookies, or query strings not included in the caching criteria |
By carefully configuring these policies, you can improve cache hit ratios, enhance content delivery performance, and ensure your origin server receives the necessary data when required. This not only optimizes resource utilization but also enhances end-user experience by reducing latency.
Happy caching!