Big Data Moscow 2018


Gerard Toonstra

BigData Republic,
The Netherlands


Gerard works as a data architect / engineer at BigData Republic, a multidisciplinary team of experienced and business-oriented Data Scientists, Data Engineers, and Architects.
Gerard is also a Kaggle Master and an enthusiastic supporter of Apache Airflow.


Data Orchestration with Apache Airflow

Companies are challenged to find ways to centralize data coming from a multitude of sources, process it,
and then surface the transformed data using other tools and environments. The surfaced data helps analysts
and data scientists gain insights to optimize business processes.

Data warehousing used to be a rather isolated domain for a group of skilled engineers, but data-driven
organizations need to expose data processing capabilities to a larger group of engineers, analysts and
scientists. Apache Airflow is a data workflow engine that abstracts the complexities of ETL,
post-processing and machine learning so that engineers and analysts with varying skill levels can
focus their efforts on their core activity: extracting value from data.

Date: October 11, 2018