Big Data Days 2022

Conference cancelled

Ricardo Ferreira

Observability Lead, Community

Elastic, US

Biography

Ricardo is Observability Lead for Elastic’s community team, where he acts as the voice of Elastic in the SRE, DevOps, and DataOps communities. In this role, he is responsible for developing and driving a global community and developer advocacy strategy focused on observability. With 20+ years of experience, he might have learned a thing or two about distributed systems, observability, streaming systems, and databases. Before Elastic, he worked for other vendors such as Confluent, Oracle, and Red Hat, as well as for several consulting firms.

Talk

Building Debugging-Enabled Data Pipelines

As with everything else in the software world, building data pipelines is getting ever more complex. Most data pipelines are a mix of various technologies, architecture styles, layers, and runtimes. This places an enormous burden on the data engineering teams that build and maintain them, as identifying and fixing issues looks more and more like one of those investigative TV shows where a detective searches for a culprit among many possible suspects.

But it doesn’t have to be this way. Observability technologies provide a handy way for data engineers to trace, troubleshoot, and fix data pipeline problems. This session will explain why this practice is important and what benefits it brings. It will also show in practice how to apply it to a complex-enough data pipeline built using Apache Kafka, Debezium, MySQL, and ksqlDB.

Keywords

ETL
CDC
Streaming
Tracing
Kafka
Pulsar
Flink
