Serverless, event-driven image compression pipeline with Lambda and S3
Deep dive into using S3 Events to trigger a Lambda function, including how to handle failures and retries.

Guille Ojeda
January 23, 2023
Welcome to Simple AWS! A free newsletter that helps you build on AWS without being an expert. This is issue #11. Shall we?
Use case: Serverless, event-driven image compression pipeline with AWS Lambda and S3
AWS Services involved: Lambda, S3, CloudFront
Scenario
Your app allows users to upload images directly to S3, and then displays them publicly (think social network). The problem? Modern phones take really good pictures, but the file size is way larger than what you need, and you're predicting very high storage costs. You figured out that you can write an algorithm to resize the images to a more acceptable size without noticeable quality loss!
The catch? You don't want to change the app so that users upload their images to an EC2 instance running that algorithm. You know it won't scale fast enough to handle peaks in traffic, and it would cost more than S3. You want to implement image resizing in a scalable and cost-efficient way, without having to maintain any servers.
Services
- Lambda: Serverless processing.
- S3: Storing files.
- CloudFront: CDN.
The trick: triggering the Lambda function when an object is uploaded to the S3 bucket.
Solution
- Create an S3 bucket for users to upload images.
- Create a Lambda function that will be triggered when a new object is created in the S3 bucket, with the S3 Event as the trigger. This function runs the code that generates a smaller version of the image (there's a sketch of the handler right after this list).
- Create an S3 bucket for storing the resized images.
- Create a CloudFront distribution for the second S3 bucket and configure it to serve the images.
- Set up CloudWatch to monitor the pipeline.
- Test it.
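
Here's a minimal sketch of what the resize function (step 2) could look like in Node.js with Sharp. The destination bucket name, target width, and JPEG quality are assumptions; adjust them to your setup.

```javascript
// Sketch of the resize handler. DEST_BUCKET, the 1920px width and the
// quality setting are placeholders -- tune them for your app.
const { S3Client, GetObjectCommand, PutObjectCommand, HeadObjectCommand } = require("@aws-sdk/client-s3");
const sharp = require("sharp");

const s3 = new S3Client({});
const DEST_BUCKET = process.env.DEST_BUCKET; // bucket for resized images (assumption)

exports.handler = async (event) => {
  for (const record of event.Records) {
    const srcBucket = record.s3.bucket.name;
    // Object keys in S3 events are URL-encoded
    const key = decodeURIComponent(record.s3.object.key.replace(/\+/g, " "));

    // Idempotency check: skip if the resized image already exists
    try {
      await s3.send(new HeadObjectCommand({ Bucket: DEST_BUCKET, Key: key }));
      continue; // duplicate event, nothing to do
    } catch (err) {
      if (err.name !== "NotFound") throw err;
    }

    // Download the original, resize it, upload the result
    const original = await s3.send(new GetObjectCommand({ Bucket: srcBucket, Key: key }));
    const resized = await sharp(await original.Body.transformToByteArray())
      .resize({ width: 1920, withoutEnlargement: true })
      .jpeg({ quality: 80 })
      .toBuffer();

    await s3.send(new PutObjectCommand({
      Bucket: DEST_BUCKET,
      Key: key,
      Body: resized,
      ContentType: "image/jpeg",
    }));
  }
};
```

Sharp has native bindings, so build it for Amazon Linux or ship it as a Lambda layer, and give the function's role s3:GetObject on the uploads bucket and s3:PutObject on the destination bucket.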
Discussion
This approach resizes images eagerly, expecting that the image will be shown so many times that resizing it will in most cases save you money (or the improved user experience is worth the cost). If that's not the case for you, you could resize lazily (i.e. when an image is requested).
If your image processing results can wait a bit, you'd be better off pushing the S3 event to an SQS queue and consuming the queue from an Auto Scaling Group of EC2 instances (or ECS). AWS Batch is also a great option if the upload rate is not constant. Overall, serverless scales much faster, but server-based compute is usually cheaper at a sustained, predictable volume.
If you need to do more than just resize the images, you've got two options: For independent actions, you can send the S3 Event to multiple consumers using SNS; For a complex sequence of actions, you can use Step Functions (which we'll discuss in a future issue).
Advanced Strategies
Operational Excellence
Sorry, for this pillar I've only got generic advice for you today.
- Implement a testing environment (super cheap in this case).
- Monitor and troubleshoot the Lambda (use CloudWatch).
- Set everything up with an IaC tool (Terraform, CloudFormation, your choice).
- Remember X-Ray? This is an event-driven architecture, so it's a great fit for tracing here (there's a sketch after this list).
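
If you do enable X-Ray, this is roughly what instrumenting the resize function looks like with the X-Ray SDK for Node.js (you also need Active tracing turned on for the function); a minimal sketch, nothing more:

```javascript
// Sketch: wrap the AWS SDK client so every S3 call shows up as a subsegment
// in the trace. Assumes Active tracing is enabled on the Lambda function.
const AWSXRay = require("aws-xray-sdk-core");
const { S3Client } = require("@aws-sdk/client-s3");

// Instrumented client: calls made through it are recorded by X-Ray
const s3 = AWSXRay.captureAWSv3Client(new S3Client({}));
```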
Security
- Encrypt the data at rest: Like I mentioned last time, this is automatic now. But you can still pick your own keys.
- Use IAM roles for Lambda to access the S3 bucket: Minimum permissions, folks!
- Restrict who can upload images: Every time a user needs to upload an image, generate a presigned URL for S3 and let them use that (there's a sketch after this list). Add your own auth to the URL generation process, so only logged-in users can upload pictures.
- Restrict access to the resized images S3 bucket: Set up Origin Access Control so that users can't access the bucket directly, and can only access the content through the CloudFront distribution.
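
For the presigned upload URLs, here's a minimal sketch of a Lambda (behind API Gateway) that hands them out. The bucket name, key scheme, 5-minute expiry, and where the user identity comes from are all assumptions; plug in your own auth.

```javascript
// Sketch of an upload-URL endpoint. UPLOAD_BUCKET and the auth check
// are placeholders -- wire in your own authentication.
const { S3Client, PutObjectCommand } = require("@aws-sdk/client-s3");
const { getSignedUrl } = require("@aws-sdk/s3-request-presigner");
const { randomUUID } = require("crypto");

const s3 = new S3Client({});
const UPLOAD_BUCKET = process.env.UPLOAD_BUCKET; // uploads bucket (assumption)

exports.handler = async (event) => {
  // Placeholder auth: with a Lambda authorizer on API Gateway, the user id
  // typically arrives in the request context. Reject anything unauthenticated.
  const userId = event.requestContext?.authorizer?.principalId;
  if (!userId) return { statusCode: 401, body: "Not logged in" };

  const key = `uploads/${userId}/${randomUUID()}.jpg`;
  const url = await getSignedUrl(
    s3,
    new PutObjectCommand({ Bucket: UPLOAD_BUCKET, Key: key, ContentType: "image/jpeg" }),
    { expiresIn: 300 } // URL is only valid for 5 minutes
  );

  return { statusCode: 200, body: JSON.stringify({ uploadUrl: url, key }) };
};
```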
Reliability
Here's where the fun begins!
- Configure retries: S3 invokes Lambda asynchronously. This means events go into an eventually consistent internal queue, and the function is invoked with events from that queue. If the function fails, Lambda retries up to 2 times by default, waiting 1 minute before the first retry and 2 minutes before the second; you can lower that number if you want (there's a config sketch after this list).
- Make your function idempotent: If there's an eventually consistent queue, there will be duplicate deliveries. Make sure your function can handle them gracefully, by first checking whether the resized image already exists in the destination bucket (the handler sketch in the Solution section does exactly this).
- Process unrecoverable failures: Send failed events to a queue and have another Lambda consume them (there's a sketch after this list). Move the failed image to a third bucket (so you don't lose it when cleaning up the uploads bucket) and log it to a DynamoDB table for later analysis. Tip: Point an On-failure Destination at an SQS queue rather than using a classic Lambda DLQ; Destinations receive both the original event and the error response, while DLQs only get the event plus a few error attributes.
- Set up an alarm for failures: If you're expecting failures as part of your regular flow, you shouldn't sound the alarm for a single failure. Instead, define what's a normal amount of failures, and alert when the real number goes higher. You can do this easily by monitoring the depth of the failure queue (there's an alarm sketch after this list), or you can set up something more complex.
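
A sketch of wiring up the retry behavior and the failure destination with the SDK (in practice you'd put this in your IaC templates instead); the function name and queue ARN are placeholders:

```javascript
// Sketch: configure async-invocation behavior for the resize function.
const { LambdaClient, PutFunctionEventInvokeConfigCommand } = require("@aws-sdk/client-lambda");

const lambda = new LambdaClient({});

(async () => {
  await lambda.send(new PutFunctionEventInvokeConfigCommand({
    FunctionName: "resize-images",            // placeholder name
    MaximumRetryAttempts: 2,                  // the default; can be 0, 1 or 2
    MaximumEventAgeInSeconds: 3600,           // drop events older than 1 hour
    DestinationConfig: {
      // Events that still fail after all retries land in this queue
      OnFailure: { Destination: "arn:aws:sqs:us-east-1:123456789012:resize-failures" }, // placeholder ARN
    },
  }));
})();
```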
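And a sketch of the Lambda that drains that failure queue, parks the original image in a separate bucket, and logs the failure to DynamoDB. It assumes the failure queue is configured as an SQS event source for this function; the bucket and table names are placeholders.

```javascript
// Sketch: consume the failure queue, copy the original image to a
// "quarantine" bucket and record the failure for later analysis.
const { S3Client, CopyObjectCommand } = require("@aws-sdk/client-s3");
const { DynamoDBClient } = require("@aws-sdk/client-dynamodb");
const { DynamoDBDocumentClient, PutCommand } = require("@aws-sdk/lib-dynamodb");

const s3 = new S3Client({});
const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));
const FAILED_BUCKET = process.env.FAILED_BUCKET;   // third bucket (assumption)
const FAILURES_TABLE = process.env.FAILURES_TABLE; // DynamoDB table (assumption)

exports.handler = async (event) => {
  for (const message of event.Records) { // SQS event source delivers batches
    // On-failure Destinations deliver the full invocation record:
    // the original event (requestPayload) plus the error (responsePayload)
    const invocationRecord = JSON.parse(message.body);
    const s3Record = invocationRecord.requestPayload.Records[0];
    const bucket = s3Record.s3.bucket.name;
    const key = decodeURIComponent(s3Record.s3.object.key.replace(/\+/g, " "));

    // Keep the original around so cleaning up the uploads bucket doesn't lose it
    // (URL-encode the key in CopySource if it can contain special characters)
    await s3.send(new CopyObjectCommand({
      Bucket: FAILED_BUCKET,
      Key: key,
      CopySource: `${bucket}/${key}`,
    }));

    // Log the failure for later analysis
    await ddb.send(new PutCommand({
      TableName: FAILURES_TABLE,
      Item: {
        key,
        bucket,
        error: invocationRecord.responsePayload?.errorMessage ?? "unknown",
        failedAt: invocationRecord.timestamp,
      },
    }));
  }
};
```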
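Finally, a sketch of the alarm on the failure queue's depth; the queue name, threshold, and SNS topic are placeholders for your own "what's normal" numbers.

```javascript
// Sketch: alarm when the failure queue gets deeper than your normal failure rate.
const { CloudWatchClient, PutMetricAlarmCommand } = require("@aws-sdk/client-cloudwatch");

const cloudwatch = new CloudWatchClient({});

(async () => {
  await cloudwatch.send(new PutMetricAlarmCommand({
    AlarmName: "resize-failures-above-normal",
    Namespace: "AWS/SQS",
    MetricName: "ApproximateNumberOfMessagesVisible",
    Dimensions: [{ Name: "QueueName", Value: "resize-failures" }], // placeholder queue
    Statistic: "Maximum",
    Period: 300,                   // 5-minute windows
    EvaluationPeriods: 1,
    Threshold: 10,                 // your definition of "too many failures"
    ComparisonOperator: "GreaterThanThreshold",
    AlarmActions: ["arn:aws:sns:us-east-1:123456789012:ops-alerts"], // placeholder topic
  }));
})();
```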
Performance Efficiency
- Optimize the Lambda functions: Remember our past issue about Lambda? There are 20 tips there.
- Use CloudFront to serve the images: CloudFront is a CDN. Basically, it stores the images in a cache near the user (there are edge locations all over the world) and serves requests from there. I already included this as part of the solution, but it was worth explaining.
- Consider compressing images before uploading: This one's a clear tradeoff. On one hand, uploading will be faster. On the other hand, you'll need to uncompress the images to resize them (and pay for that extra processing time). Faster uploads for a better user experience, at a higher cost. Use this if you expect users to upload from slow networks such as 4G and the rest of the app works really well. If the rest of the app is slow, start optimizing there. If users typically upload from a 300 Mbps wifi connection, they won't even notice the improvement.
Cost Optimization
- Transition infrequently accessed objects to S3 Infrequent Access: In the scenario section I mentioned social networks. How often are old images accessed in a social network? You can set a lifecycle rule to transition objects to S3 Infrequent Access, where storage is cheaper but reads are more expensive (there's a sketch after this list). If you did the math right, you get a lower average cost. The math: if an object is accessed less than about once a month, Infrequent Access is cheaper. And if you can't find any obvious access patterns, you can use S3 Intelligent-Tiering instead.
- Set up provisioned concurrency for your lambdas: If you have a steady baseline of invocations that keeps the provisioned instances busy, the lower per-duration price of provisioned concurrency can save you some money.
- Get a Savings Plan: Compute Savings Plans are 1- or 3-year commitments to a consistent amount of compute usage (with optional upfront payment) in exchange for lower prices. You'd typically link them to EC2 or Fargate, but they apply to Lambda as well!
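
A sketch of that lifecycle rule, set up with the SDK (again, your IaC tool is the better home for this); the bucket name and the 30-day cutoff are assumptions.

```javascript
// Sketch: transition resized images to Standard-IA after 30 days.
const { S3Client, PutBucketLifecycleConfigurationCommand } = require("@aws-sdk/client-s3");

const s3 = new S3Client({});

(async () => {
  await s3.send(new PutBucketLifecycleConfigurationCommand({
    Bucket: "my-resized-images",               // placeholder bucket
    LifecycleConfiguration: {
      Rules: [{
        ID: "resized-images-to-ia",
        Status: "Enabled",
        Filter: { Prefix: "" },                // apply to the whole bucket
        Transitions: [{ Days: 30, StorageClass: "STANDARD_IA" }],
      }],
    },
  }));
})();
```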
Resources
If you want to use Node.js to resize the images, Sharp is a great library to do so.
Check out this complete solution to resize images lazily (as opposed to eagerly, like we're doing with this solution).
If getting AWS Certified is among your new year's resolutions, let me recommend Adrian Cantrill's courses. With their mix of theory and practice, they're the best I've seen. I've literally bought them all (haven't watched them all yet). <-- This recommendation contains affiliate links.
Some of the above resources are paid promotions or contain affiliate links. I only recommend resources I've tried for myself and found actually useful, regardless of whether I get paid for it or not.
Misc.
No architecture can ever be complete without a discussion. That's why I added a sub-section called Discussion. Am I arguing with myself? I guess I am.
A future issue, probably the next one, will be about building a more complex pipeline with Step Functions. Got any other use cases or services you'd like to see? Hit reply!
As a way to thank you for your continuous support, I'm offering free 30-minute consulting sessions to newsletter subscribers. Book yours here!
Thank you for reading! See ya on the next issue.