Thursday, August 15, 2013

Map-Reduce Capabilities and Super Easy Concurrency (via Alan deLevie and IronResponse)

We came across a great contribution the other day from Alan deLevie that makes using IronWorker for a map-reduce pattern even easier than it already is. (Love seeing tweets announcing additions to the growing list of Iron.io community addons.)

IronWorker is a cloud-based, on-demand service that lets you run massively concurrent processing across slices of data out of the box, which is essentially the core of the map-reduce pattern. (Here's a good visual explanation of map-reduce in action.)
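
To make "concurrent processing across slices of data" concrete, here is a minimal local sketch of the pattern in plain Ruby. It does not use IronWorker at all: threads stand in for remote workers, and the names prime? and count_primes_concurrently are made up for this illustration.

```ruby
# Local stand-in for the map-reduce pattern: threads play the role of
# remote workers. Nothing here touches IronWorker; names are hypothetical.

def prime?(n)
  return false if n < 2
  (2..Integer.sqrt(n)).none? { |d| (n % d).zero? }
end

def count_primes_concurrently(numbers, slice_size)
  # Map step: each slice of the data is processed in its own thread.
  threads = numbers.each_slice(slice_size).map do |slice|
    Thread.new { slice.count { |n| prime?(n) } }
  end

  # Reduce step: wait for every thread and combine the partial counts.
  threads.map(&:value).sum
end

total = count_primes_concurrently((1..100).to_a, 10)
puts total # 25 primes between 1 and 100
```

With IronWorker, each slice would be queued as a task instead of handed to a thread, which is where getting the partial results back becomes the interesting problem.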

IronResponse adds a veneer (in the form of a simple Ruby gem) that more closely mirrors the map-reduce interface. It lets you abstract away the actual queuing of tasks and management of the data. All you have to do is pass in your service credentials and the data, and IronResponse will manage the tasking and data storage within IronWorker.

IronResponse + IronWorker + S3 = Simple/Powerful Map Reduce

Alan does a great job explaining how IronResponse works and how to use it. Rather than try to replicate it, we want to include a portion here and then refer you to the GitHub repo for the full details. (Note that it’s super easy to get up and running.)

–––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––
IronResponse on GitHub

IronResponse

IronResponse glues together IronWorker and AWS S3 to provide a response object to remote worker scripts. This allows you to write massively concurrent Ruby programs without worrying about threads.

Rationale

Iron.io's IronWorker is a great product that provides a lot of powerful concurrency options. With IronWorker, you can scale tasks to hundreds and even thousands of workers. However, IronWorker was missing one useful feature for me: responses.

What do I mean by that? In the typical IronWorker setup, worker files are just one-off scripts that run independently of the client that queues them up.

For example:
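require "iron_worker_ng"

# With no arguments, the client picks up credentials from your Iron.io configuration.
client = IronWorkerNG::Client.new
100.times do |i|
  # Queue one independent "do_something" task per number.
  client.tasks.create("do_something", number: i)
end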

For many use cases, this is fine. But what if I want to know the result of do_something?

A simple way to get the result would be for your worker to POST the final result somewhere, then have the client retrieve it. This gem simply abstracts that process away, allowing the developer to avoid boilerplate and to keep worker code elegant.

Here's how you would interface with IronResponse for running map-reduce across an is_prime function:
require "iron_response"

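# Iron.io and AWS S3 credentials (elided here)
config = {...}
batch = IronResponse::Batch.new

batch.auto_update_worker = true
batch.config[:iron_io]   = config[:iron_io]
batch.config[:aws_s3]    = config[:aws_s3]
batch.worker             = "test/workers/is_prime.rb"
# One params hash per task: the numbers 1 through 10
batch.params_array       = Array(1..10).map {|i| {number: i}}

# Returns the workers' results once the batch has finished
results                  = batch.run!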

p results
#=> [{"result"=>false}, {"result"=>true}, {"result"=>true}...]

IronResponse adds a simple interface to IronWorker to make map-reduce patterns even simpler.

Under the hood, iron_response uses some functional programming and metaprogramming to capture the final expression of a worker file, convert it to JSON, and then POST it to Amazon S3. When all the workers in an IronResponse::Batch have finished, the gem retrieves the stored results and converts the JSON strings back to Ruby.
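
To illustrate the idea (not the gem's actual code), here is a rough, self-contained sketch of that round trip using only Ruby's standard library. Everything in it is an assumption made for illustration: FAKE_BUCKET is an in-memory Hash standing in for S3, WORKER_SOURCE is a made-up worker script, and run_worker and collect are hypothetical helper names.

```ruby
require "json"

# Hypothetical stand-in for an S3 bucket: an in-memory Hash keyed by task id.
FAKE_BUCKET = {}

# Hypothetical worker script. Its final expression is treated as the response.
WORKER_SOURCE = <<~RUBY
  n = params[:number]
  prime = n > 1 && (2..Integer.sqrt(n)).none? { |d| (n % d).zero? }
  { result: prime }
RUBY

# "Worker side": evaluate the script, capture the value of its last
# expression, serialize it to JSON and store it under the task id.
def run_worker(task_id, params)
  scope = binding # makes the local `params` visible to the evaluated script
  value = scope.eval(WORKER_SOURCE)
  FAKE_BUCKET[task_id] = JSON.generate(value)
end

# "Client side": fetch every stored response and parse the JSON back to Ruby.
def collect(task_ids)
  task_ids.map { |id| JSON.parse(FAKE_BUCKET.fetch(id)) }
end

params_array = Array(1..5).map { |i| { number: i } }
task_ids = params_array.each_with_index.map do |params, id|
  run_worker(id, params)
  id
end

results = collect(task_ids)
p results
```

Note that the symbol key :result comes back as the string "result" after the JSON round trip, which is why the results in the example above are keyed by strings.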

...

Contributing to IronResponse

If you would like to add to these capabilities, here’s how:

  1. Fork the repo
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Added some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request

–––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––––

Alan is a big contributor to the Iron.io community and we're grateful for his work here and on other projects that help bring super easy scaling and concurrency to the development community.

If you're a Ruby developer and want to give this a try, let us and Alan know what you think. And if you’d like to replicate this for other languages, feel free to model this approach. It's pretty sound in our opinion. Be sure to get in touch with us if you do (so we can acknowledge you and pass the word along).