Worker Queues as a Key Variable
Anyone working on a serious web app knows that a worker queue is a key component of the app architecture. It is so important that the infrastructure equation is typically broken into servers, workers, and datastores. Sure, there are other components and line items (CDN, bandwidth, etc.) that an app can’t do without, but servers, workers, and datastores form the primary building blocks for architects and make up the main cost factors.
The standard way to allocate workers to handle worker jobs is much like the way application servers are sized: estimate how many worker servers will be needed to handle average-peak loads. The usage and cost of a worker system, however, don’t correlate well with the usage and cost of application servers. Most companies, especially those using self-managed worker queues, over-allocate their worker infrastructure in order to cover peak needs. (Worker jobs don’t lend themselves well to on-demand server deployment, so worker servers often sit idle for a good portion of the time.)
The typical worker allocation mechanism (how many application servers and how many worker servers) doesn’t map to what’s needed for agile and efficient background processing. When it comes to workers, it’s about the jobs, specifically the number of jobs and their average duration, and not the servers. Utilization pricing for app servers is based on the hour. Utilization pricing for workers needs to be based on a much finer granularity, which in the case of SimpleWorker is seconds.
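To make the difference between the two pricing models concrete, here is a rough back-of-the-envelope sketch. Every number and name in it (the rates, the job counts, the function names) is made up for illustration and is not actual SimpleWorker pricing. It simply compares paying by the hour for servers sized to peak load against paying only for the seconds that jobs actually run.

```python
import math


def hourly_server_cost(peak_concurrent_jobs, jobs_per_server, hours, rate_per_server_hour):
    """Cost of keeping enough servers running to cover peak load, billed by the hour."""
    servers = math.ceil(peak_concurrent_jobs / jobs_per_server)
    return servers * hours * rate_per_server_hour


def per_second_job_cost(job_count, avg_duration_seconds, rate_per_second):
    """Cost when only the seconds that jobs actually run are billed."""
    return job_count * avg_duration_seconds * rate_per_second


# Hypothetical workload for one day; all figures are assumptions.
provisioned = hourly_server_cost(
    peak_concurrent_jobs=40,   # worst-case number of jobs running at once
    jobs_per_server=10,        # jobs one worker server can run concurrently
    hours=24,
    rate_per_server_hour=0.10,
)
metered = per_second_job_cost(
    job_count=20_000,
    avg_duration_seconds=6,
    rate_per_second=0.00005,
)

print(f"peak-provisioned servers: ${provisioned:.2f} per day")
print(f"per-second job billing:   ${metered:.2f} per day")
```

The first function depends only on the peak, no matter how quiet the rest of the day is. The second depends only on the number of jobs and their average duration, which are the two quantities that describe a worker load.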
With an efficient and scalable worker system, jobs can be distributed across a virtual worker farm, even run in parallel, while the cost and overhead of the servers are shared. Efficiencies of scale can be achieved with a good central worker system that’s elastic, scalable, and able to accommodate different job priorities, one that can analyze current and historic job queues to better anticipate server needs.
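As a toy illustration of job priorities and parallel execution, here is a minimal sketch of a priority queue that hands jobs to a small pool of workers. It is not how SimpleWorker is implemented. The class name, method names, and sample jobs are all hypothetical, and a local thread pool stands in for the worker farm.

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


class PriorityJobQueue:
    """Toy queue: a lower priority number means the job is dispatched sooner."""

    def __init__(self):
        self._heap = []
        self._counter = 0  # tie-breaker keeps first-in, first-out order within a priority

    def push(self, priority, name, func, *args):
        heapq.heappush(self._heap, (priority, self._counter, name, func, args))
        self._counter += 1

    def drain(self, max_workers=4):
        """Submit jobs in priority order to a thread pool and return {name: result}."""
        futures = []
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            while self._heap:
                _, _, name, func, args = heapq.heappop(self._heap)
                futures.append((name, pool.submit(func, *args)))
        # Leaving the with-block waits for every submitted job to finish.
        return {name: future.result() for name, future in futures}


# Hypothetical jobs, for illustration only.
queue = PriorityJobQueue()
queue.push(2, "resize_image", lambda w, h: (w // 2, h // 2), 800, 600)
queue.push(1, "send_email", str.upper, "welcome")
results = queue.drain(max_workers=2)
print(results)
```

Jobs are submitted in priority order, but once they are in the pool they run concurrently, so a long low-priority job does not hold up the ones behind it as long as a worker is free.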
With a different form of pricing and a more efficient way of scheduling, worker queue operating costs drop. (We calculate a reduction of at least 3-5x compared with self-managed queues.) Reduced operating cost, though, is only a fraction of the benefit of a cloud-based worker system. The gain in speed and agility can far surpass the cost savings. Add in the cost savings and the time gained by not having to develop or manage a worker queue, and a cloud-based worker system is a clear win. That is what we’re hearing more and more from people using the service.