require 'net/http'

pages = %w( www.rubycentral.com
            www.awl.com
            www.pragmaticprogrammer.com
          )
threads = []
pages.each do |page|
  threads << Thread.new(page) { |myPage|
    h = Net::HTTP.new(myPage, 80)
    puts "Fetching: #{myPage}"
    resp = h.get('/')   # returns a single response object on current Rubies
    data = resp.body
    # DO SOME PAGE PROCESSING AND STORE STUFF IN DATABASE
  }
end
threads.each { |aThread| aThread.join }

Or you could use a library like delayed_job, which works great after a bit of database configuration and starting up the job process from the command line.
But now there is another option that is actually easier and much more scalable: sending jobs off into the cloud using SimpleWorker. Here is the example above rewritten with SimpleWorker:
# First we create our Worker
class WebPageWorker < SimpleWorker::Base
  attr_accessor :page
  def run
    h = Net::HTTP.new(page, 80)
    log "Fetching: #{page}"
    resp = h.get('/')
    data = resp.body
    # DO SOME PAGE PROCESSING AND STORE STUFF IN DATABASE
  end
end
# Now let's fire some of those workers up!
pages = %w( www.rubycentral.com
            www.awl.com
            www.pragmaticprogrammer.com
          )
pages.each do |page|
  worker = WebPageWorker.new
  worker.page = page
  worker.queue
end
That's it. Easy like Sunday morning.
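If you want a feel for what "set some attributes, then queue the worker" means before sending anything to the cloud, here is a rough local sketch of the same pattern. To be clear, this is not SimpleWorker's API or how it works internally: LocalPageWorker, LocalQueue and process_locally are made-up names for illustration, the "processing" is a placeholder string instead of a real HTTP fetch, and the jobs run on a small in-process thread pool.

```ruby
# Hypothetical local stand-in for a cloud worker (not SimpleWorker's API).
class LocalPageWorker
  attr_accessor :page
  attr_reader :result

  def run
    # Placeholder for the real fetch-and-process step.
    @result = "processed #{page}"
  end
end

# A tiny in-process job queue: `queue` hands the worker to a pool thread.
class LocalQueue
  def initialize(pool_size)
    @jobs = Queue.new
    @threads = Array.new(pool_size) do
      Thread.new do
        # Keep popping workers until a :stop marker arrives.
        while (job = @jobs.pop) != :stop
          job.run
        end
      end
    end
  end

  def queue(worker)
    @jobs << worker
    worker
  end

  # Send one :stop marker per pool thread, then wait for them to finish.
  def shutdown
    @threads.size.times { @jobs << :stop }
    @threads.each(&:join)
  end
end

# Queue one worker per page and collect the results in page order.
def process_locally(pages, pool_size = 2)
  local_queue = LocalQueue.new(pool_size)
  workers = pages.map do |page|
    worker = LocalPageWorker.new
    worker.page = page
    local_queue.queue(worker)
  end
  local_queue.shutdown
  workers.map(&:result)
end

puts process_locally(%w(www.rubycentral.com www.awl.com www.pragmaticprogrammer.com))
```

The calling code has the same shape as the SimpleWorker version: build a worker, assign its page, queue it. The difference is only where the run method ends up executing.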