Hi Paul. I appreciate your thoughts on the post. I completely agree with what you said about AWS Glue. I think it can be great to bring other tools into the stack, depending on your specific needs.

Regarding Spark, that is outside the scope of this post, but I do have other posts where I talk a little bit about it.

Put simply, the Spark jobs are run from Airflow. They extract data from the sources and put it in AWS S3. I've also used Spark for heavy processing of large unstructured datasets.

https://towardsdatascience.com/implementing-the-functional-data-engineering-paradigm-in-data-load-processes-by-using-airflow-61d3bae486b0

https://towardsdatascience.com/generalizing-data-load-processes-with-airflow-a4931788a61f

Writing to learn! | LinkedIn profile: https://www.linkedin.com/in/ajhenaor | Buy me a coffee: https://www.buymeacoffee.com/ajhenaor
