Parallelizing Ruby on the Cloud

I'm sure we've all wanted to run several tasks at once to speed up part of an application. There are several options, such as spinning up new Ruby threads (example from):

 

require 'net/http'

pages = %w( www.rubycentral.com
            www.awl.com
            www.pragmaticprogrammer.com
           )

threads = []

pages.each do |page|
  # One thread per page; the page is passed in so each thread gets its own copy
  threads << Thread.new(page) { |myPage|
    h = Net::HTTP.new(myPage, 80)
    puts "Fetching: #{myPage}"
    resp = h.get('/')

    # DO SOME PAGE PROCESSING AND STORE STUFF IN DATABASE
  }
end

# Wait for every thread to finish before moving on
threads.each { |aThread| aThread.join }
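
If you also want the results back from each thread, one way to do it is with Thread#value, which joins the thread and hands you whatever its block returned. Here is a minimal sketch of that idea. It swaps the real HTTP fetch for a made-up helper, process_page (an assumption for illustration, not part of the original example), so it runs without a network connection:

```ruby
# Hypothetical stand-in for the HTTP fetch and page processing.
def process_page(page)
  sleep 0.01    # simulate a little network latency
  page.length   # pretend this is the result of processing the page
end

pages = %w[www.rubycentral.com www.awl.com www.pragmaticprogrammer.com]

# Start one thread per page; the block's return value becomes the thread's value.
threads = pages.map do |page|
  Thread.new(page) { |my_page| [my_page, process_page(my_page)] }
end

# Thread#value waits for the thread to finish and returns the block's result.
results = threads.map(&:value).to_h

results.each { |page, size| puts "#{page}: #{size}" }
```

The threads still run side by side; collecting with value just replaces the separate join step.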

Or you could use a library like delayed_job, which works great once you have done a bit of database configuration and started the job process from the command line.

But now there is another option that is actually easier and much more scalable: sending jobs off into the cloud using SimpleWorker. Here is the example above rewritten with SimpleWorker:

# First we create our Worker
require 'net/http'

class WebPageWorker < SimpleWorker::Base

  # The page to fetch; set on the worker before it is queued
  attr_accessor :page

  def run
    h = Net::HTTP.new(page, 80)
    log "Fetching: #{page}"
    resp = h.get('/')

    # DO SOME PAGE PROCESSING AND STORE STUFF IN DATABASE
  end
end


# Now let's fire some of those workers up!
pages = %w( www.rubycentral.com
            www.awl.com
            www.pragmaticprogrammer.com
           )

pages.each do |page|
  worker = WebPageWorker.new
  worker.page = page
  worker.queue
end

That’s it. Easy like Sunday morning.
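
Since a worker is just a Ruby object with a run method, you can exercise that logic in your own process before queuing anything. The sketch below does not use SimpleWorker at all. LocalWorkerBase is a hypothetical stand-in class made up for illustration, its queue method simply calls run on the spot, and the HTTP fetch is replaced by a stub so the example runs offline:

```ruby
# Hypothetical stand-in for a worker base class; NOT SimpleWorker's API.
class LocalWorkerBase
  # Collect log messages in memory so they can be inspected afterwards.
  def log(message)
    (@log_lines ||= []) << message
  end

  def log_lines
    @log_lines || []
  end

  # Run the job immediately in this process instead of sending it anywhere.
  def queue
    run
  end
end

class WebPageWorker < LocalWorkerBase
  attr_accessor :page
  attr_reader :result

  def run
    log "Fetching: #{page}"
    # Stub in place of the real Net::HTTP call so the sketch runs offline.
    @result = "<html>#{page}</html>"
  end
end

worker = WebPageWorker.new
worker.page = 'www.awl.com'
worker.queue

puts worker.log_lines
puts worker.result
```

Keeping run free of anything but plain Ruby and its own attributes is what makes this kind of in-process check possible.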

1 Comment

  1. blank picardo on January 20, 2011 at 11:56 am

    So how do I test this locally?
