Wednesday, May 26, 2010

Parallelizing Ruby on the Cloud

So I'm sure we've all needed to run several tasks at once to speed up part of an application. There are several options, like spinning up new Ruby threads (example from):


require 'net/http'

pages = %w( www.rubycentral.com
            www.awl.com
            www.pragmaticprogrammer.com
           )

threads = []

pages.each do |page|
    # Pass page into Thread.new so each thread gets its own copy
    threads << Thread.new(page) { |myPage|
        h = Net::HTTP.new(myPage, 80)
        puts "Fetching: #{myPage}"
        resp = h.get('/')
        data = resp.body
        # DO SOME PAGE PROCESSING AND STORE STUFF IN DATABASE
    }
end

# Wait for every thread to finish before moving on
threads.each { |aThread| aThread.join }
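
One thing to watch with the snippet above: it starts one thread per page, which is fine for three pages but not for three thousand. Here is a minimal sketch of capping the number of threads with a small pool that pulls work off a Queue. The names each_in_pool and pool_size are made up for this illustration, and the block passed in is a stand-in for the real fetch so the sketch runs without a network connection.

```ruby
require 'thread'

# Hypothetical helper (not from any library): calls the block once per item,
# using at most pool_size threads, and returns the results in input order.
def each_in_pool(items, pool_size = 4, &block)
  queue = Queue.new
  items.each_with_index { |item, index| queue << [item, index] }
  results = Array.new(items.size)

  workers = Array.new([pool_size, items.size].min) do
    Thread.new do
      loop do
        begin
          # Non-blocking pop raises ThreadError once the queue is empty
          item, index = queue.pop(true)
        rescue ThreadError
          break
        end
        # Each thread writes to its own slot, so order is preserved
        results[index] = block.call(item)
      end
    end
  end

  # Wait for every worker thread to drain the queue
  workers.each(&:join)
  results
end

pages = %w(www.rubycentral.com www.awl.com www.pragmaticprogrammer.com)

# Stand-in work: swap the block body for the Net::HTTP fetch to use it for real
lengths = each_in_pool(pages, 2) { |page| page.length }
```

This keeps everything in one process, so it still competes with the rest of your application for the same machine.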


Or you could use a library like delayed_job, which works great once you've done a bit of database configuration and started the job process from the command line.

But now there is another option that is actually easier and much more scalable: sending jobs off into the cloud using SimpleWorker. Here is the example above rewritten with SimpleWorker:

# First we create our Worker
require 'net/http'

class WebPageWorker < SimpleWorker::Base

    # Set by the caller before the worker is queued
    attr_accessor :page

    def run
        h = Net::HTTP.new(page, 80)
        log "Fetching: #{page}"
        resp = h.get('/')
        data = resp.body
        # DO SOME PAGE PROCESSING AND STORE STUFF IN DATABASE
    end
end


# Now let's fire some of those workers up!
pages = %w( www.rubycentral.com
            www.awl.com
            www.pragmaticprogrammer.com
           )

pages.each do |page|
    worker = WebPageWorker.new
    worker.page = page
    worker.queue
end

That's it. Easy like Sunday morning.