Kinesis Data Firehose not only delivers streaming data to its final destination but also supports inline data transformation using AWS Lambda, so records can be converted to the format the destination expects before they arrive.
How Kinesis Data Firehose Works
The following outlines the process flow of Kinesis Data Firehose:
- Data Producers: Data is sent to Kinesis Data Firehose by various producers. Notably, Kinesis Data Streams can also forward its processed data to Firehose as a producer.
- Data Transformation: An optional AWS Lambda function can transform the incoming data into a preferred format, if necessary.
- Data Delivery: The transformed data is then forwarded to the target destination, which may be:
  - An Amazon S3 bucket
  - An OpenSearch domain
  - Splunk
  - A custom endpoint
- Error Handling: Records that fail during transformation or delivery are automatically redirected to a backup S3 bucket for further inspection.
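The transformation step can be sketched as a small Lambda handler. This is a minimal illustration, not a production function: it assumes the incoming records are JSON objects, and the `level` field and the upper-casing rule are hypothetical stand-ins for whatever transformation you need. Firehose passes each payload base64-encoded and expects one output entry per input `recordId`, with a `result` of `Ok`, `Dropped`, or `ProcessingFailed`.

```python
import base64
import json


def lambda_handler(event, context):
    """Transform each Firehose record and report a per-record result."""
    output = []
    for record in event["records"]:
        try:
            # Firehose hands each payload to the function base64-encoded.
            payload = json.loads(base64.b64decode(record["data"]))
            # Hypothetical transformation: normalise an assumed "level" field.
            payload["level"] = str(payload.get("level", "info")).upper()
            # Newline-delimit the output so records stay separable downstream.
            data = (json.dumps(payload) + "\n").encode("utf-8")
            output.append({
                "recordId": record["recordId"],
                "result": "Ok",
                "data": base64.b64encode(data).decode("utf-8"),
            })
        except (ValueError, AttributeError):
            # Payloads that are not JSON objects are returned unchanged and
            # marked as failed, so Firehose can treat them as failed records.
            output.append({
                "recordId": record["recordId"],
                "result": "ProcessingFailed",
                "data": record["data"],
            })
    return {"records": output}
```

Marking a record `ProcessingFailed` rather than raising an exception keeps one bad payload from failing the whole batch; the failed record then follows the error-handling path described above.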
Comparing Kinesis Data Streams and Kinesis Data Firehose
Below is a table summarizing the key differences between Kinesis Data Streams and Kinesis Data Firehose:

| Feature | Kinesis Data Streams | Kinesis Data Firehose |
|---|---|---|
| Primary Focus | Real-time data ingestion and processing | Direct streaming of data to destination |
| Scaling Management | User-managed scaling (e.g., shard splitting/merging) | Fully managed with automatic scaling |
| Data Storage | Retains data for up to 365 days with replay capability | No data storage or replay capability |
| Data Transformation | N/A | Supports inline transformation using AWS Lambda |
| Failure Handling | N/A | Redirects failed records to a backup S3 bucket |
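The scaling difference in the table also shows up on the producer side. The sketch below assumes the clients are created with `boto3.client("kinesis")` and `boto3.client("firehose")`; the stream names and the event shape are hypothetical. A Data Streams producer must supply a partition key, which determines the shard a record lands on, while a Firehose producer only names the delivery stream.

```python
import json


def send_to_data_stream(kinesis_client, stream_name, event, partition_key):
    """Kinesis Data Streams: the partition key selects the target shard."""
    return kinesis_client.put_record(
        StreamName=stream_name,
        Data=json.dumps(event).encode("utf-8"),
        PartitionKey=partition_key,
    )


def send_to_firehose(firehose_client, delivery_stream_name, event):
    """Kinesis Data Firehose: no partition key or shard for the producer to manage."""
    return firehose_client.put_record(
        DeliveryStreamName=delivery_stream_name,
        # Newline-delimit records so they can be split apart at the destination.
        Record={"Data": (json.dumps(event) + "\n").encode("utf-8")},
    )


# Example usage (requires boto3 and AWS credentials; names are placeholders):
#   import boto3
#   send_to_data_stream(boto3.client("kinesis"), "example-stream",
#                       {"user": "a"}, partition_key="a")
#   send_to_firehose(boto3.client("firehose"), "example-delivery-stream",
#                    {"user": "a"})
```

Passing the client in as a parameter keeps the helpers easy to exercise with a stub client in tests.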
Kinesis Data Firehose is well suited to applications that need efficient, near real-time delivery of streaming data without the overhead of managing infrastructure.