Posts Tagged ‘ETL’
Run ETL using Background Jobs Solution: A Hybrid Model
If you are a developer dealing with the complex process of data management, particularly with ETL (Extract, Transform, Load), this article is for you. We will guide you through the cases where running ETL jobs using background job solutions in a hybrid (or on-premise) model can be useful. Additionally, you will learn how IronWorker can…
The E.T. in ETL
Thanks to JD Hancock for the base image! CC BY 2.0
Anyone who’s ever done ETL knows it can get seriously funky. When I first started working on ETL, I was parsing data for a real estate company. Every once in a while, roofing data would appear in the pool field. “Shingles” isn’t a compelling feature…
E is for Event: A Fresh Take on ETL
As a follow-up to my previous post, The Workloads of the Internet of Things, I wanted to walk through a real-world example that fully captures the principles of event-driven computing put forth there. Let’s set the stage first… imagine we operate a windmill farm and want to continually track optimal weather conditions to maximize…
How HotelTonight Streamlined their ETL Process Using IronWorker
HotelTonight has reinvented the task of finding and booking discounted hotel rooms at travel destinations. Designed for last-minute travel planners and optimized for the mobile era, HotelTonight connects adventure-seeking, impulse travelers with just-in-time available hotel rooms wherever they land. This model has the market-enhancing effect of reducing excess inventory of unused hotel rooms, while delivering…
How to Build an ETL Pipeline for ElasticSearch Using Segment and Iron.io
Overview
ETL is a common pattern in the big data world for collecting and consolidating data for storage and/or analysis. Here’s the basic process:
Extract data from a variety of sources
Transform data through a set of custom processes
Load data to external databases or data warehouses
Segment + Iron.io + Elasticsearch = A…
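The three steps in the excerpt above can be sketched in a few lines of Python. This is only an illustration of the general pattern, not code from the linked post: the function names, the `user_id` and `event` fields, and the in-memory list standing in for a warehouse are all assumptions made for the example.

```python
import json


def extract(sources):
    """Extract: read JSON-encoded records from several sources (iterables of lines)."""
    for source in sources:
        for line in source:
            yield json.loads(line)


def transform(records):
    """Transform: drop records without a user_id and normalize the rest."""
    for record in records:
        if "user_id" not in record:
            continue  # hypothetical rule: skip records we cannot attribute
        yield {
            "user_id": str(record["user_id"]),
            "event": record.get("event", "unknown").lower(),
        }


def load(records, warehouse):
    """Load: append records to the target store and return how many were written."""
    count = 0
    for record in records:
        warehouse.append(record)  # a list stands in for a database or warehouse
        count += 1
    return count


if __name__ == "__main__":
    # Two hypothetical sources, each a list of JSON lines.
    sources = [
        ['{"user_id": 1, "event": "Signup"}', '{"event": "orphan"}'],
        ['{"user_id": 2}'],
    ]
    warehouse = []
    written = load(transform(extract(sources)), warehouse)
    print(written, warehouse)
```

Because each stage is a generator, records stream through one at a time, so the same shape would still work if the sources were files or queues rather than small lists.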
How HotelTonight uses Iron.io and AWS Redshift to create a Ruby-based ETL pipeline (repost)
Creating an ETL pipeline with Iron.io and Redshift
Operating at scale in the cloud almost always equates to having a highly distributed system architecture in place to handle workloads by auto-scaling components out horizontally. Harlow Ward is a developer at HotelTonight, and he put together a great post on how they handle issues of…