|Creating an ETL pipeline with Iron.io and Redshift|
Harlow Ward is a developer at HotelTonight and he put together a great post on how they handle issues of scale. In it he talks about their use of Iron.io and Amazon's Redshift offering to create a simple highly scalable ETL pipeline.
Here's an excerpt from his article.
As the data requirements of our Business Intelligence team grow, we’ve been leveraging Iron.io’s IronWorker as our go-to platform for scheduling and running our Ruby-based ETL (Extract, Transform, Load) worker pipeline.
Business units within HotelTonight collect and store data across multiple external services. The ETL pipeline is responsible for gathering all the external data and unifying it into Amazon’s excellent Redshift offering.
Redshift hits a sweet spot for us as it uses familiar SQL query language, and supports connections from any platform using a Postgres adapter.
This allows our Ruby scripts to connect with the PG Gem, our Business Intelligence team to connect with their favorite SQL Workbench, and anyone in our organization with Looker access to run queries on the data.
|HotelTonight's Dashboard of Workers|
The team at Iron.io have been a great partner for us while building the ETL pipeline. Their worker platform gives us a quick and easy mechanism for deploying and managing all our Ruby workers.
The administration area boasts excellent dashboards for reporting worker status and gives us great visibility over the current state of our pipeline.
About the Author/Developer
Ruby Science," and enjoys writing technical articles focused on sharing development techniques throughout the community. (@futuresanta)