AWS launches Glue service for running automated ETL jobs

Amazon vice president and chief technology officer Werner Vogels introduces the AWS Glue service at the AWS re:Invent conference in Las Vegas on December 1, 2016.
Image Credit: Screenshot

At its re:Invent user conference in Las Vegas today, public cloud infrastructure provider Amazon Web Services (AWS) announced AWS Glue, a service that automatically runs jobs to clean up data from multiple sources and prepare it for analysis in other tools, such as business intelligence (BI) software.

This type of work is typically known as extract-transform-load, or ETL. Companies including Informatica and Talend offer software for it. Now AWS has a cloud service for it.
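
For readers unfamiliar with the term, the three ETL steps can be sketched in a few lines of Python. This is only an illustration of the general pattern, not AWS Glue's API: the sample data, the `sales` table, and the function names are all hypothetical, and it uses nothing beyond Python's standard library.

```python
import csv
import io
import sqlite3

# Hypothetical raw export from a source system (inline so the sketch is self-contained).
RAW_CSV = """customer,amount
 Alice ,10.50
bob,
Carol,7
"""

def extract(text):
    """Extract: read rows from the raw source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: trim and normalize names, drop rows with no amount."""
    cleaned = []
    for row in rows:
        amount = (row["amount"] or "").strip()
        if not amount:
            continue  # skip incomplete records
        cleaned.append((row["customer"].strip().title(), float(amount)))
    return cleaned

def load(rows, conn):
    """Load: write cleaned rows into a table an analysis tool could query."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

# An in-memory SQLite database stands in for the analytics destination.
conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
```

In a real pipeline the source would be a database or file store and the destination a data warehouse; the point here is only the extract, transform, load sequence.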

It has already been possible to do ETL work on AWS infrastructure with services like Elastic MapReduce (EMR). The other big public clouds have Hadoop-based tools for this sort of thing, too. But AWS Glue is meant to make the work easier.

With the help of JDBC connectors, AWS Glue will also be able to connect to data held on premises, making it another proof point that AWS is interested in working with organizations that still retain their own on-premises data center infrastructure.

When data changes at its original source, “jobs can be triggered again to make sure you always have access to the latest information,” Amazon vice president and chief technology officer Werner Vogels said.

“AWS Glue simplifies and automates the difficult and time consuming data discovery, conversion, mapping, and job scheduling tasks,” AWS wrote in a blog post. “AWS Glue guides you through the process of moving your data with an easy to use console that helps you understand your data sources, prepare the data for analytics, and load it reliably from data sources to destinations.”

The service is coming soon, according to a product description; customers can sign up to receive updates on its availability.