Sunday, March 27, 2011

Use Any Gem You Want!

Ask and ye shall receive... gems, that is. The `merge_gem` feature is now in production and ready for use, so you can use any gem you want. It uses the gems on your local file system, so whatever you have available locally, you can use in your workers.

It's very easy to use:

More info.

Note: If a gem requires native extensions, `merge_gem` probably won't work for it. In that case, contact us and let us know which gems you need so we can have them installed on the SimpleWorker system.

Thursday, March 10, 2011

Why IronWorker is Better than Heroku Workers / Delayed Job

First off, I would like to say that we use and love Heroku. The system they've built is game changing and 100% awesome, but their worker system is limited and does not meet our needs. Here are the top four reasons why IronWorker is better than Heroku's worker system and Delayed Job (Heroku's system is based on Delayed Job).

1. Cheaper

Heroku charges you whether you are using it or not. If you only run 10 minutes of jobs an hour, you are still paying for the full hour. With IronWorker, you are only paying for 10 minutes since you are charged by the second rather than the hour.

Price comparison:

If you run jobs that use up 10 minutes every hour, Heroku would cost $0.05 per hour (it charges by the hour) versus less than $0.01 per hour on IronWorker. If you run jobs continuously for a solid hour, the price would be the same.
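The comparison above can be reproduced with a few lines of Ruby. This is a minimal sketch, assuming a rate of $0.05 per worker-hour for both services (implied by the statement that a fully used hour costs the same on each). The method names are illustrative and are not part of any Heroku or IronWorker API.

```ruby
# Cost of a worker that stays up around the clock: every hour is billed,
# busy or idle. The 0.05 default is an assumed rate in dollars per hour.
def always_on_cost(hours_running, rate_per_hour = 0.05)
  hours_running * rate_per_hour
end

# Cost when only the seconds a job actually runs are billed.
def per_second_cost(seconds_used, rate_per_hour = 0.05)
  seconds_used * rate_per_hour / 3600.0
end

busy_seconds = 10 * 60 # 10 minutes of jobs in one hour

puts format('always-on worker:   $%.4f per hour', always_on_cost(1))
puts format('per-second billing: $%.4f per hour', per_second_cost(busy_seconds))
```

With a full 3600 seconds of work, both methods return the same amount, which matches the "solid hour" case.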

2. Elastic, Scalable and Massively Parallel

If you run a lot of jobs, the only way to get through your job queue quickly on Heroku is to add more worker processes. If you crank it up to the max, you can get 24 worker processes, meaning 24 jobs running concurrently. With IronWorker you can throw as many jobs as you want at it and they will all run in parallel. Need to run 1000 jobs? No problem, just queue them up in IronWorker.

Time and price comparison:

Let's say you have 1000 jobs that you need to run every hour (for instance, you need to update something for each of your users, one job per user) and each job takes 30 seconds to complete. With Heroku, even using the maximum number of workers (24), it would take about 21 minutes to complete all jobs. With IronWorker, where all the jobs run in parallel, it would take about 30 seconds.
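A rough way to check these completion times is to treat the queue as running in "waves" of at most N concurrent jobs. The sketch below is an illustration under that assumption; `drain_time_seconds` is a made-up helper, and it rounds the last partial wave up, so it gives 1260 seconds rather than the 1250 seconds you get from dividing evenly.

```ruby
# Time to drain a queue when at most `concurrency` jobs run at once.
# Jobs run in waves; the last wave may be only partially full.
def drain_time_seconds(job_count, job_seconds, concurrency)
  waves = (job_count.to_f / concurrency).ceil
  waves * job_seconds
end

jobs = 1000
job_seconds = 30

capped = drain_time_seconds(jobs, job_seconds, 24)     # 24 = worker cap cited above
parallel = drain_time_seconds(jobs, job_seconds, jobs) # one slot per job

puts "24 workers:     #{capped / 60.0} minutes"
puts "fully parallel: #{parallel} seconds"
```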

The cost for the above on Heroku would be $864 per month ($1.20 per hour, $28.80 per day). The cost for IronWorker would be $300 per month ($0.42 per hour, $10 per day).
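These monthly figures follow from simple arithmetic. The sketch below assumes $0.05 per worker-hour (implied by $1.20 per hour for 24 workers) and a 30-day month; the method names are illustrative only.

```ruby
# 24 workers kept running around the clock, billed for every hour.
def always_on_monthly_cost(workers, rate_per_hour: 0.05, hours_per_month: 24 * 30)
  workers * rate_per_hour * hours_per_month
end

# Jobs billed only for the seconds they run.
def per_second_monthly_cost(jobs_per_hour, job_seconds,
                            rate_per_hour: 0.05, hours_per_month: 24 * 30)
  busy_hours_per_hour = jobs_per_hour * job_seconds / 3600.0
  busy_hours_per_hour * rate_per_hour * hours_per_month
end

puts format('always-on, 24 workers:    $%.2f per month', always_on_monthly_cost(24))
puts format('per-second, 1000 x 30 s:  $%.2f per month', per_second_monthly_cost(1000, 30))
```

The per-second case works out to about 8.3 worker-hours of actual work per hour, versus 24 worker-hours billed when the workers are always on.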

3. Advanced Scheduling

Heroku does have scheduling capabilities via the Cron add-on, but it is very limited and only supports a single action (it can only call `rake cron` in your application's Rakefile). It also allows only two scheduling options -- hourly or daily. It does not have options to run one-time, to use different time schedules (every 15 minutes, twice a day, weekly, etc.), or to support multiple worker schedules. On top of this, there is also a monthly fee for the scheduling option, $3 per month.

IronWorker scheduling accommodates very flexible scheduling options and has no limit on the number of workers or schedules. Best of all, scheduling is 100% free. No additional fee.
  • Run a job just one time at some time in the future
  • Set a recurring schedule to run at any frequency you want (hourly, daily, monthly, every minute, every 15 minutes, whatever you want) 
  • Schedule any number of workers, each with their own schedules
  • Schedule jobs as easily as you queue them. Instead of the 'queue' command, you use the 'schedule' command and pass in scheduling parameters. IronWorker then calls the scheduled workers directly at the times you set.
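To make the scheduling options above concrete, here is a small sketch of how a schedule description could expand into run times. It does not use the IronWorker client. `expand_schedule` and its parameters (`start_at`, `run_every`, `run_times`) are hypothetical names chosen for illustration, so check the service documentation for the real 'schedule' parameters.

```ruby
# Hypothetical helper, not part of any IronWorker client: expands a schedule
# description into the concrete times a worker would fire.
#   start_at  - Time of the first run
#   run_every - interval in seconds between runs (nil means run once)
#   run_times - how many runs to generate for a recurring schedule
def expand_schedule(start_at:, run_every: nil, run_times: 1)
  return [start_at] if run_every.nil? # one-time job
  Array.new(run_times) { |i| start_at + i * run_every }
end

start = Time.utc(2011, 3, 10, 12, 0, 0)

one_time = expand_schedule(start_at: start)
every_15_min = expand_schedule(start_at: start, run_every: 15 * 60, run_times: 4)

every_15_min.each { |t| puts t.strftime('%H:%M') }
# prints 12:00, 12:15, 12:30, 12:45 (one per line)
```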

4. Monitoring, Visualization and Control

Heroku does not provide any view into your workers. With IronWorker you can:
  • View your usage patterns/trends over time
  • Check status of all your workers
  • Get notified when your workers raise errors and get down to the root of the issue fast
  • Cancel queued or scheduled jobs, kill running jobs, rerun a job

10,000 jobs at 10 seconds per job:
10000 jobs / 24 worker processes = 417 jobs / worker
417 jobs * 10 seconds / job = 4170 seconds = 69.5 minutes = about 1 hour and 10 minutes

1,000 jobs at 30 seconds per job:
1000 jobs / 24 worker processes = 41.7 jobs / worker
41.7 jobs * 30 seconds / job = 1250 seconds = about 21 minutes

Wednesday, March 2, 2011

Worker Queues as a Key Variable

Anyone working on a serious web app knows that a worker queue is a key component of the app architecture. It is so important that the infrastructure equation is typically broken into servers, workers, and datastores. Sure, there are other components and line items (CDN, bandwidth, etc.) that an app can't do without, but servers, workers, and datastores form the primary building blocks for architects and make up the main cost factors.

Cloud Infrastructure = Servers + Workers + Datastores

The standard way to allocate workers to handle worker jobs is much like the way application servers are sized -- by estimating how many worker servers will be needed to handle average-peak loads. The usage and cost of a worker system, however, doesn’t correlate well to the usage and cost of application servers. Most companies, especially those using self-managed worker queues, will over-allocate their worker infrastructure to cover peak needs. (Worker jobs don’t lend themselves well to on-demand server deployment, so worker servers often sit idle for a good portion of the time.)

No. of App Servers + No. of Worker Servers
Standard method of worker allocation

The typical worker allocation mechanism -- how many application servers and how many worker servers -- doesn’t map to what's needed for agile and efficient background processing. When it comes to workers, it’s about the jobs, specifically the number of jobs and their average duration, and not the servers. Utilization pricing for app servers is based on the hour. Utilization pricing for workers needs a much finer granularity, which in the case of SimpleWorker is seconds.

No. of App Servers + No. of Worker Jobs * Ave. Duration
Better method of worker allocation
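The difference between the two allocation methods can be sketched in code. The example below is illustrative only: the method names and all the numbers (peak of 100 concurrent jobs, 10 jobs per server, 20,000 jobs a day at 15 seconds each) are assumptions, not figures from this post.

```ruby
# Server-based allocation: provision enough worker servers for the peak,
# then pay for every server-hour whether or not jobs are running.
def server_based_hours(peak_concurrent_jobs, jobs_per_server, hours)
  servers = (peak_concurrent_jobs.to_f / jobs_per_server).ceil
  servers * hours
end

# Job-based allocation: number of jobs * average duration, in hours.
def job_based_hours(job_count, avg_job_seconds)
  job_count * avg_job_seconds / 3600.0
end

# Illustrative numbers only, for one day of work.
provisioned = server_based_hours(100, 10, 24) # 10 servers for 24 hours
used = job_based_hours(20_000, 15)            # hours of actual job time

puts "server-hours provisioned: #{provisioned}"
puts format('job-hours actually used:  %.1f', used)
```

In this made-up case roughly a third of the provisioned capacity does real work, which is the gap that job-based pricing is meant to close.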

With an efficient and scalable worker system, the jobs can be distributed across a virtual worker farm -- even running in parallel -- all the while spreading the cost and overhead of the servers. Efficiencies of scale can be achieved with a good central worker system that's elastic, scalable, and able to accommodate different job priorities, and that can analyze current and historic job queues to better anticipate server needs.

Benefit = Operational Cost Savings + Agility + Development Savings

With a different form of pricing and a more efficient way of scheduling, worker queue operating costs drop. (We calculate savings of at least 3-5x compared with self-managed queues.) Reduced operating cost, though, is only a fraction of the benefit of a cloud-based worker system. The gain in speed and agility can far surpass the cost savings. Add in the cost savings and time gained by not having to develop or manage a worker queue, and a cloud-based worker system is a clear win. That is what we're hearing more and more from people using the service.