Sunday, September 23, 2012

@ Big Data Hack Day

is putting in some time at the Big Data Hack Day this weekend in San Francisco. A number of teams are working with a number of data sets to solve some hard challenges.

The event was put on by the AngelHack and folks and had a concentrated set of sponsors, all suited to working out problems with collecting and processing large volumes of data. In addition to, there were folks from Couchbase and Firebase (and Google Cloud Platform and Prior Knowledge).

 Big Data Hack Day Sponsors (partial list)

Tuesday, September 18, 2012

New Feature: Auto Retry for IronWorker

IronWorker now has an auto retry feature that, if enabled, will automatically retry your tasks if they error out. You can set the number of times it should retry and the delay between retries.


While uploading your worker, simply set the retries and retries-delay parameters:

iron_worker upload my_worker --retries 5 --retries-delay 10

That example will retry up to 5 times (so a task runs up to six times in total) and will wait 10 seconds between retries. As soon as one attempt succeeds, it stops retrying. In other words, it keeps retrying only while every attempt in the sequence errors out.
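To make the attempt counting concrete, here is a minimal Ruby sketch of the same semantics run locally. It is an illustration only, not how IronWorker implements retries; `run_with_retries`, its keyword arguments, and the `sleeper` hook are hypothetical names chosen for this example.

```ruby
# Local illustration of "retries" semantics; NOT IronWorker's implementation.
# run_with_retries and sleeper are hypothetical names for this sketch.
def run_with_retries(retries:, delay:, sleeper: ->(seconds) { sleep(seconds) })
  attempts = 0
  begin
    attempts += 1
    result = yield(attempts)
    { ok: true, attempts: attempts, result: result }
  rescue StandardError => e
    # Retry only while we still have retries left: 1 initial run + `retries` retries.
    if attempts <= retries
      sleeper.call(delay)
      retry
    end
    { ok: false, attempts: attempts, error: e.message }
  end
end

# A task that fails twice, then succeeds on the third attempt.
# The no-op sleeper skips the real 10-second wait for demonstration purposes.
outcome = run_with_retries(retries: 5, delay: 10, sleeper: ->(_seconds) {}) do |attempt|
  raise "transient failure" if attempt < 3
  "done"
end
```

With these settings the task stops after its first success, and a task that always fails would run six times before giving up.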

Be sure your tasks are idempotent when using this, so that running a task more than once doesn't cause any unintended side effects.
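One common way to get idempotence is to record which units of work have already completed and skip them on later runs. The Ruby sketch below assumes a hypothetical `ChargeTask` whose side effect is charging an order; the in-memory `Set` stands in for a durable store (a database or cache) that would be shared across retries in a real worker.

```ruby
require "set"

# Hypothetical example: a task that charges an order at most once,
# no matter how many times it is run. The Set is a stand-in for a
# durable store shared across retries.
class ChargeTask
  attr_reader :charges

  def initialize(processed_ids = Set.new)
    @processed_ids = processed_ids
    @charges = []
  end

  def run(order_id, amount)
    # Skip work that an earlier attempt already completed.
    return :skipped if @processed_ids.include?(order_id)

    @charges << [order_id, amount] # the side effect
    @processed_ids << order_id     # record completion
    :charged
  end
end

task = ChargeTask.new
first  = task.run("order-42", 19.99)
second = task.run("order-42", 19.99) # simulated retry of the same task
```

Here the second run is skipped and only one charge is recorded. In a real system the side effect and the completion record would also need to be written atomically, which this sketch does not attempt.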

Cheers and let us know how it works for you.

Monday, September 17, 2012

Web Crawling at Scale with Nokogiri and IronWorker (Part 2)

This is the second part of a two-part post on using IronWorker with Nokogiri to do web crawling at scale. The first blog post can be viewed here.

Other resources for web crawling can be seen on our solutions page as well as this post on using IronWorker with PhantomJS.

Distributed Web Crawling

Crawling web sites is core to many web-based businesses – whether it’s monitoring prices on retail sites, analyzing sentiment in posts and reports, vetting SEO metrics for client sites,

Friday, September 7, 2012

Guest Post: Using IronCache as a Persistent Key Value Store for Real-time Chat

While IronCache is a great option for caching, it can also be used to persist data more permanently. There are dozens of great uses for a persistent key/value store; in this post, we’ll build a simple chat app with Sinatra using IronCache as our datastore.

This is a guest post by JP Silvashy, CTO of Motionloft. You can find him on Twitter or his blog.

First, let’s talk about a traditional cache. Take Memcached, for example: it’s an in-memory, non-durable key/value store. It’s great! It’s probably the most popular object caching system there is. Until now, Memcached has always been my go-to solution.

Initially I built my mini-chat app Chattyloo using Postgres. Postgres is a big hammer,