Monday, March 12, 2012

Speaking at Data 2.0 Summit

One of the big uses of our services is working with cloud data at scale. Some customers use IronWorker to crawl the web and process the results. Others take data and transform it using a host of workers. Still others use IronMQ to connect cloud apps to legacy apps.

And so it makes sense to connect with data providers, information services, API companies, and others in the Big Data space at the Data 2.0 Summit in SF on April 3rd.

Chad Arimura will be on a panel talking about the Real-Time Data Stack. Alongside app servers, databases, and the data sources themselves, application infrastructure services like IronWorker and IronMQ are natural fits. There's a lot of processing that needs to be done in the background, and databases, unfortunately, don't work well for orchestrating process flow, which is where message queues come in.
The topic of the realtime stack is right up our alley. Ken Fromm wrote a series on the realtime web three years ago that turned out to be one of ReadWriteWeb's top stories of the year. You can read it here, here, and here. Here's an excerpt.

Because of demand within the [Twitter] eco-system, quite a bit of effort is being made on storing, slicing, dicing, and disseminating information as quickly as possible. The fundamental implication of this activity (without any explicit markers being laid down) is that the velocity of information within the Web data system has just increased by an order of magnitude. 
The pipes are moving data at the same rate: the speed of your data connection has not changed (although it is getting faster because of an independent effort by cable companies, telcos, and the like). What has changed is the flow of data from machine to machine on the Web and the processing that happens as information makes its way to users. Companies are making use of data that takes seconds to be published to the Web, as opposed to hours or minutes. Years ago, pages might have been crawled by search engines daily. With the advent of RSS, new posts would flow through the system within hours. With Twitter, the flow is propagated from company to company to user in real time. 
This facet of the real-time stream is having a profound impact on the infrastructure of the Web. New storage and retrieval methods are being developed to overcome the time lags of writing not just to disks but to traditional databases. Adaptations to traditional structured query languages are being made to index items directly from the stream. Search engines and search capabilities are being modified to make use of real-time inputs to influence the search results. This isn't just a Twitter effect. This is an effect across all uses of the Web, because the expectation of access to real-time information is now permeating all websites and the infrastructure of the Web itself.

Message queues and massively scalable background processing are big parts of this infrastructure, and they are playing bigger roles in an increasingly fast and connected web.
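
To make the queue-plus-workers idea concrete, here is a minimal sketch in Python. It uses only the standard library's in-memory `queue.Queue` and threads as a stand-in for a hosted message queue and background workers; it is not the IronMQ or IronWorker API, and the names `run_pipeline`, `transform`, and `num_workers` are hypothetical, chosen just for illustration.

```python
import queue
import threading


def run_pipeline(items, transform, num_workers=4):
    """Push items onto a queue and transform them on background worker threads.

    Returns (results, errors). Result order is not guaranteed, because
    workers finish independently of one another.
    """
    tasks = queue.Queue()
    results = []
    errors = []
    lock = threading.Lock()

    def worker():
        while True:
            item = tasks.get()
            try:
                if item is None:  # sentinel: no more work for this worker
                    return
                out = transform(item)
                with lock:
                    results.append(out)
            except Exception as exc:  # record the failure and keep working
                with lock:
                    errors.append((item, exc))
            finally:
                tasks.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    # The producer side: enqueue the work, then one sentinel per worker.
    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(None)

    tasks.join()  # block until every queued item has been handled
    for t in threads:
        t.join()
    return results, errors


if __name__ == "__main__":
    results, errors = run_pipeline(["crawl", "transform", "index"], str.upper)
    print(sorted(results), errors)
```

The point of the design is that the producer only enqueues work and never calls the processing step directly, so the number of workers can change without touching the code that generates the data. A hosted queue plays the same role across machines instead of across threads.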