Is Mage.ai a real alternative to Airflow?
Let’s try a simple case study.
A couple of weeks ago, I started playing with Mage.ai, and to be honest, I was really impressed with how fun and easy it was to create data pipelines, as well as to capture streaming data sources and deliver them to their destination. In this article, I want to talk a little about the execution times of both tools (Airflow and Mage), using a small case study as a reference.
But what is Mage?
It is an open-source data pipeline tool for transforming and integrating data, and it is being billed as a replacement for Airflow. It promises to empower your data team with "magical powers", including:
- Effortlessly integrate and synchronize data from 3rd party sources
- Build real-time and batch pipelines to transform data using Python, SQL, and R
- Run, monitor, and orchestrate thousands of pipelines without losing sleep
Other Super Powers
- Easy implementation with Docker, Terraform Script, Pip, or Conda.
- Visualization of your datasets from UI.
- No more coding of extensive and complex DAGs.
Step-by-step case study
- Query a Snowflake table that contains 156 links to images
- Read images from url_links with Python and rename them before saving
- Send images to S3 Bucket
- Delete the local copy of the image and move on to the next one
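To make the steps above concrete, here is a minimal Python sketch of the download, rename, upload, and delete loop. It is not the exact code used in either tool. The naming scheme (`image_0001.jpg`), the helper names (`new_name`, `fetch_bytes`, `process_images`, `make_s3_uploader`), and the bucket name are assumptions made for illustration, and the list of URLs is assumed to have already been read from the Snowflake table.

```python
import os
import tempfile
import urllib.request
from urllib.parse import urlparse


def new_name(index, url):
    """Build a new file name such as 'image_0001.jpg' (hypothetical scheme)."""
    # Keep the original extension if the URL has one, else assume .jpg.
    ext = os.path.splitext(urlparse(url).path)[1] or ".jpg"
    return f"image_{index:04d}{ext}"


def fetch_bytes(url):
    """Download the raw bytes of one image."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def process_images(urls, upload, fetch=fetch_bytes, workdir=None):
    """For each URL: download, save under a new name, upload, delete locally.

    `upload` is any callable taking (local_path, key), so the loop can be
    exercised without real AWS credentials.
    """
    workdir = workdir or tempfile.mkdtemp()
    uploaded = []
    for i, url in enumerate(urls, start=1):
        name = new_name(i, url)
        local_path = os.path.join(workdir, name)
        with open(local_path, "wb") as f:
            f.write(fetch(url))
        try:
            upload(local_path, name)
        finally:
            # Remove the local copy before moving on to the next image.
            os.remove(local_path)
        uploaded.append(name)
    return uploaded


def make_s3_uploader(bucket):
    """Return an uploader backed by boto3 (third-party, imported lazily)."""
    import boto3

    s3 = boto3.client("s3")

    def upload(local_path, key):
        s3.upload_file(local_path, bucket, key)

    return upload
```

With credentials configured, a call might look like `process_images(urls, make_s3_uploader("my-bucket"))`, where `"my-bucket"` is a placeholder. In Airflow this logic would typically live in one or more tasks of a DAG, and in Mage in loader and exporter blocks.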
How easy was the process?
I have to admit that running Airflow with Astronomer greatly improved the UI, local deployment, and the ease with which we data engineers use Airflow today. Still, Mage.ai goes a step further: it lets us orchestrate and connect sources and destinations with a couple of clicks, and at the same time view our data alongside our blocks (tasks). For this reason, I encourage all my data engineer colleagues around the world to put the phrase in the following image to the test.