Batch Processing: A Tutorial on Workers, Queueing and Gelato

Worker_Queue_Gelato_V2

Batch processing is one of the earliest ways of data processing, utilized by Herman Hollerith’s Tabulating Machine in 1890. Batch processing was developed to take advantage of scarce computing resources: it avoids idling these expensive resources by queueing instructions to process data without manual user intervention, and can shift workload to times when resources are less scarce1.

Today, we can leverage modern architectural patterns like worker systems, message queues and the cloud to level-up these advantages and simplify our code. Let’s look at an example of queueing and workers using a calorie-dense metaphor: gelato.

gelato2

Using our favorite local gelato shop as an example, we explore how architectural concepts like queueing and workers can affect a given task. We chose gelato because:

  • Each order takes time to set up. You must examine the menu and display, choose and order.
  • Each order takes time to process. The time spent preparing an order can vary based on the order size and complexity, just as job size can vary in a worker system.
  • Adding additional workers helps. The queue of customers will be processed faster, in the same way that adding more workers to a particular batch processing job will make the queue shrink faster.
  • Like your infrastructure, gelato must be kept cold, and is delicious when consumed2.

Check it out:

1 https://en.wikipedia.org/wiki/Batch_processing

2This is not true. Iron.io assumes no responsibility if you eat your infrastructure.

Leave a Comment





This site uses Akismet to reduce spam. Learn how your comment data is processed.