There are a ton of use cases for batch processing, and every business is probably doing it in one way or another. Unfortunately, many of these approaches take much longer than they need to. In a world of ever-increasing data, the old way can now hinder the flow of business. Some examples of where you’ll see batch processing used are:
ETL – Extract Transform Load
Big data processing
Billing (create and send bills to customers)
Notifications – mobile, email, etc.
We’ve all seen something that was created during a batch process (whether we like it or not).
Now, I’m going to show you how to take a process that would typically take a long time to run, and break it down into small pieces so it can be distributed and processed in parallel. Doing so will turn a really long process into a very quick one.
We’ll use Docker to package our worker and Iron.io to do the actual processing. Why Docker? Because we can package our code and all our dependencies into an image for easy distribution. Why Iron.io? It’s the easiest way to do batch processing. I don’t need to think about servers or deal with distributing my tasks among a bunch of servers.
Alright, so let’s go through how to do our own batch processing.
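The core idea, splitting one long job into small batches and working on them in parallel, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: local threads stand in for distributed workers, and `chunk`, `process_batch`, and `run_in_parallel` are hypothetical names, not part of Docker or the Iron.io API.

```python
from concurrent.futures import ThreadPoolExecutor


def chunk(items, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_batch(batch):
    """Hypothetical worker. Doubling each item stands in for real work
    such as generating a bill or sending a notification."""
    return [item * 2 for item in batch]


def run_in_parallel(items, batch_size=100, workers=4):
    """Process `items` in batches, several batches at a time."""
    batches = chunk(items, batch_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the same order as the input batches.
        results = pool.map(process_batch, batches)
        # Flatten the per-batch outputs back into a single list.
        return [out for batch_out in results for out in batch_out]
```

Calling `run_in_parallel(records, batch_size=500)` would process a hypothetical `records` list 500 items at a time. In a real setup, each batch would be handed to a separate worker container rather than a thread.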
As one of the earliest users of Docker, I’ve had the pleasure of creating and working with multiple different platforms built on containers. Each platform has evolved in step with the current ecosystem around it, and I’ve gotten the chance to really put Docker’s “batteries included, but removable” philosophy to the test.
Here at Iron.io, we have launched over 1 billion user containers in production, not to mention the containers we launch to keep our services running. The massive volume of containers we launch is enough to place great demands upon any platform that we use.
In our search for the right direction for the evolution of our platform, we’ve explored as many tools as possible. The release of Docker 1.9, combined with production-ready Docker Swarm and Docker Networking, brings a lot of value to those wanting to roll their own platforms.
Containers make life easy. Oh, you don’t have Ruby 2.2 installed? No problem, try this Docker image. Knowing that what I tested locally is exactly what’s running in production gives me warm fuzzies.
Docker gets a lot of love because it simplifies development. That’s not all though. If Docker punished infrastructure, there’d probably be a lot less love going around. Thankfully, Docker does some cool things on the infrastructure side, as well.
The biggest benefit is the “right-sizing” of compute resources. Your program might only need 200 MB of memory. Why dedicate an entire VM + OS to that? Docker ensures our compute resources are neatly divided by memory and CPU between instances. Neat! There’s a lot to love about Docker on the infrastructure side, as well.
This post is a continuation of our Defrag 2015 coverage from yesterday. Read on to hear about our favorite talks from day two.
Where Does the Time Go? – Researching Top Activities At Work
Lisa Kamm is a Product Manager at Google. She got involved with a project to figure out how Googlers spend their time at work. How could they make it better? Do their mobile products support their own workflows?
Kamm and a team of curious Googlers embarked on a journey to find out. The project started with a collaborative session, where Lisa posed the question: “Wait… what are the top 100 things an employee does in an average day?” Oops. By asking the question she inadvertently also volunteered to find an answer.
The search for answers started the way you’d expect. Being a Googler, Kamm began the hunt for answers by analyzing a large set of data. Logs from mobile phone and computer usage seemed to be the easiest way to go. There were some hurdles with actually obtaining the logs, as well as with personal privacy. Kamm prevailed in the end, and was able to crunch down 2.5 billion records to get the data she needed.
“Docker, please visit the front desk to receive your complimentary upgrade to first class seating.”
That’s right, Docker just received a first class upgrade on Iron.io. A while back, Travis (our digital frontiersman of a CTO) announced beta support for Docker. Today, we’re ripping off the beta tag. Docker is our preferred way to package code.
Last night’s meetup, which was hosted by Betable, included two presentations and two lightning talks rounding out a solid evening for the GoSF group. Topics included identity on the web, safe storage of tokens (beyond ENV vars), and even the debut of a new Go-inspired embedded systems language.
A day ago I joined 700+ folks at the Palace Hotel in San Francisco to attend the 2015 Container Summit. Containers are young, but one thing this event made clear is that their forebears have been around quite a while.
A favorite part of the summit was hearing war stories. That is, how containers are called on to get things done in the real world. There were plenty of looks to the past and the future, as well.