There are a ton of use cases for batch processing, and almost every business is probably doing it in one way or another. Unfortunately, many of these approaches take much longer than they need to. In a world of ever-increasing data, the old way can now hinder the flow of business. Some examples of where you’ll see batch processing used are:
- Image/video processing
- ETL – Extract, Transform, Load
- Genome processing
- Big data processing
- Billing (create and send bills to customers)
- Report generation
- Notifications – mobile, email, etc.
We’ve all seen something that was created during a batch process (whether we like it or not).
Now I’m going to show you how to take a process that would typically take a long time to run and break it down into small pieces that can be distributed and processed in parallel. Doing so can turn a really long process into a very quick one.
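To make the idea concrete before we bring in any real infrastructure, here is a minimal local sketch in Python. It is not the Iron.io API. It uses a thread pool on one machine as a stand-in for distributed workers, and the names `chunk`, `process_batch`, and `run_in_parallel` are hypothetical, chosen only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(items, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_batch(batch):
    """Hypothetical stand-in for the real work (resize an image, send a bill, ...)."""
    return [item * 2 for item in batch]


def run_in_parallel(items, batch_size=25, workers=4):
    """Process each slice concurrently and return the results in input order."""
    batches = chunk(items, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map yields results in the same order as the input batches
        results = list(pool.map(process_batch, batches))
    # Flatten the per-batch results back into a single list
    return [out for batch in results for out in batch]


if __name__ == "__main__":
    doubled = run_in_parallel(list(range(100)))
    print(len(doubled))
```

The shape is the part that carries over: slice the work, hand each slice to an independent worker, and collect the results. In a distributed setup, each slice would become the payload of a separate task rather than a call on a local thread.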
We’ll use Docker to package our worker and Iron.io to do the actual processing. Why Docker? Because we can package our code and all our dependencies into an image for easy distribution. Why Iron.io? It’s the easiest way to do batch processing. I don’t need to think about servers or deal with distributing my tasks among a bunch of servers.
Alright, so let’s go through how to do our own batch processing.