Automating Image Processing with IronWorker and Amazon S3

In this blog post, we will focus on the following use case: you have images in an S3 bucket, and you want to automate applying filters to those images and publishing the results back to S3.

The steps we will follow are:

  1. Configure AWS S3 and Iron.io
  2. Create an IronWorker task that will process images
  3. Queue the IronWorker task

Let's dive into the details of each step.

Amazon S3 image processing

Step 1: Configure AWS S3 and Iron.io

First, configure AWS S3 and Iron.io. Make sure you have an AWS account with the necessary permissions to create and manage S3 buckets, and sign up for an Iron.io account if you don't already have one.

1.1 Configure AWS S3
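You need two buckets: one for source images and one for the filtered output. If you use the AWS CLI, the setup can look like this (the bucket names and region are placeholders matching the worker code below; S3 bucket names must be globally unique, so adjust them before running):

```shell
# Create the source and destination buckets (placeholder names/region).
aws s3 mb s3://your-input-bucket --region your-s3-region
aws s3 mb s3://your-output-bucket --region your-s3-region

# Upload a sample image to process.
aws s3 cp sample.jpg s3://your-input-bucket/sample.jpg
```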

1.2 Configure Iron.io

  • Install the Iron.io command-line tool (CLI) by following the instructions in the Iron.io documentation.
  • Authenticate with your Iron.io account by setting two environment variables: IRON_PROJECT_ID and IRON_TOKEN.
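Both values come from your Iron.io project settings. For example (placeholder values):

```shell
# Credentials from your Iron.io project settings (placeholder values).
export IRON_PROJECT_ID=your_project_id
export IRON_TOKEN=your_token
```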

Step 2: Create an IronWorker to process images

Next, we will create an IronWorker that processes the images by applying filters and uploading the filtered images back to S3.

2.1 Write the IronWorker code

  • Create a new directory for your IronWorker and navigate to it.
  • Create the requirements.txt file with the worker's dependencies (Pillow provides the PIL module; iron-worker is Iron.io's Python client):
boto3
Pillow
iron-worker
  • Create the worker script (we'll call it main.py here; the Dockerfile below refers to this name) with the following content:
import os
import boto3
from PIL import Image, ImageFilter
from iron_worker import IronWorker

def apply_filter(input_image_path, output_image_path):
    # Open the source image, apply a contour filter, and save the result
    image = Image.open(input_image_path)
    filtered_image = image.filter(ImageFilter.CONTOUR)
    filtered_image.save(output_image_path)

def main():
    input_bucket = 'your-input-bucket'
    output_bucket = 'your-output-bucket'
    s3 = boto3.client(
        's3',
        region_name='your-s3-region',
        aws_access_key_id=os.environ['AWS_ACCESS_KEY_ID'],
        aws_secret_access_key=os.environ['AWS_SECRET_ACCESS_KEY'],
    )

    # Download the image from S3; the key arrives in the task payload
    payload = IronWorker.payload()
    input_image_key = payload['s3_key']
    input_image_path = '/tmp/input_image.jpg'
    s3.download_file(input_bucket, input_image_key, input_image_path)

    # Apply the filter
    output_image_path = '/tmp/output_image.jpg'
    apply_filter(input_image_path, output_image_path)

    # Upload the filtered image to S3 under a 'filtered/' prefix
    output_image_key = 'filtered/' + input_image_key
    s3.upload_file(output_image_path, output_bucket, output_image_key)

if __name__ == '__main__':
    main()
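Before packaging the worker, you can sanity-check the filter logic locally with Pillow alone, no S3 or Iron.io access required. A minimal sketch (the generated test image stands in for a downloaded S3 object; apply_filter mirrors the worker's function):

```python
import os
import tempfile

from PIL import Image, ImageFilter

def apply_filter(input_image_path, output_image_path):
    # Same logic as the worker's apply_filter
    image = Image.open(input_image_path)
    filtered_image = image.filter(ImageFilter.CONTOUR)
    filtered_image.save(output_image_path)

# Generate a small throwaway JPEG in place of a downloaded S3 object
src = os.path.join(tempfile.gettempdir(), "sanity_in.jpg")
dst = os.path.join(tempfile.gettempdir(), "sanity_out.jpg")
Image.new("RGB", (64, 64), color=(200, 50, 50)).save(src)

apply_filter(src, dst)
print(Image.open(dst).size)  # the filter preserves dimensions: (64, 64)
```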

2.2 Package the Worker's Code

  • Create a Dockerfile in the same directory as your worker code:
FROM python:3.9-slim

WORKDIR /app

# Install dependencies first so Docker can cache this layer
COPY requirements.txt ./
RUN pip install --no-cache-dir -r requirements.txt

# main.py is the worker script created above
COPY main.py ./

CMD ["python", "main.py"]
  • Build the Docker image:
docker build -t image_processing_worker .
  • Push the Docker image to Dockerhub (you need a Dockerhub account for this). In the below commands, replace “USERNAME” with your Dockerhub username:
docker tag image_processing_worker USERNAME/image_processing_worker
docker push USERNAME/image_processing_worker:latest
  • Register the Docker image with IronWorker:
iron register --name image_processing_worker -e AWS_ACCESS_KEY_ID=your_aws_access_key_id -e AWS_SECRET_ACCESS_KEY=your_aws_secret_access_key USERNAME/image_processing_worker:latest

Step 3: Queue the IronWorker task

Now, your application is ready to push a new IronWorker task passing the S3 key of your image as a parameter (payload):

iron worker queue --payload '{"s3_key":"SOURCE_IMAGE_S3_KEY"}' USERNAME/image_processing_worker

You can monitor the progress of your IronWorker tasks through the Iron.io dashboard.


By following these steps, you have set up an automated image processing pipeline using IronWorker and Amazon S3. Your application can now automatically apply filters to images and store the filtered results in a separate S3 bucket. This setup can be expanded to support various filters or more advanced image processing tasks.
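As one such expansion, the worker could pick the filter from the task payload instead of hard-coding CONTOUR. A sketch (the filter names and the apply_named_filter helper are illustrative, not part of the worker code above):

```python
from PIL import Image, ImageFilter

# Hypothetical mapping from a payload-supplied name to a Pillow filter
FILTERS = {
    "contour": ImageFilter.CONTOUR,
    "blur": ImageFilter.BLUR,
    "sharpen": ImageFilter.SHARPEN,
}

def apply_named_filter(image, name):
    # Fail loudly on unknown names so bad payloads surface in the task logs
    if name not in FILTERS:
        raise ValueError(f"unsupported filter: {name}")
    return image.filter(FILTERS[name])

# The worker would read the name from its payload,
# e.g. payload.get("filter", "contour")
img = Image.new("RGB", (32, 32))
print(apply_named_filter(img, "blur").size)  # (32, 32)
```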

What are you trying to do with IronWorker? Tell us so we can help you get started ASAP!
