On Track with Apache Kafka—Building a Streaming ETL solution with Rail Data
Want to know what you can REALLY do with Apache Kafka once you get going? This talk shows off a wide range of integration and stream processing techniques. What started out as a fun project integrating live streams of UK rail data updates turned into a full-blown data platform, with ingest from ActiveMQ and S3, through KSQL stream processing for joins and decoding, into various targets including analytics on S3, Elasticsearch, PostgreSQL, and graph analysis on Neo4j. It also uses Kafka compacted topics to put the theory of stream/table duality into practice, storing configuration that drives real-time alerts delivered through Telegram. The talk is a curated walk-through of the specifics of how I built the system, with code samples of the salient integration points in KSQL and Kafka Connect. The data may be domain-specific, but the challenges of handling batch and stream data to drive both applications and analytics are encountered by many, and this talk gives lots of concrete examples of how to tackle them.
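To make the stream/table idea concrete, here is a minimal sketch in plain Python. It is not KSQL and not the code from the talk. It treats a compacted topic as a changelog that folds into a table of the latest value per key, then looks each incoming event up in that table to decide whether to alert. The field names (`station`, `delay_mins`, `min_delay_mins`, `chat`), the station codes, and the threshold rule are all hypothetical, chosen only for illustration.

```python
from typing import Dict, Iterable, List, Optional, Tuple

# A changelog record is (key, value). A value of None is a tombstone,
# which is how a key is deleted from a compacted topic.
Record = Tuple[str, Optional[dict]]


def materialize(changelog: Iterable[Record]) -> Dict[str, dict]:
    """Fold a changelog stream into a table: the latest value per key."""
    table: Dict[str, dict] = {}
    for key, value in changelog:
        if value is None:
            table.pop(key, None)   # tombstone removes the key
        else:
            table[key] = value     # a later record overwrites an earlier one
    return table


def alerts(events: Iterable[dict], config: Dict[str, dict]) -> List[str]:
    """Join each event against the config table and emit alert messages."""
    out: List[str] = []
    for ev in events:
        rule = config.get(ev["station"])   # table lookup by key
        if rule is not None and ev["delay_mins"] >= rule["min_delay_mins"]:
            out.append(
                f"{rule['chat']}: train {ev['train_id']} is "
                f"{ev['delay_mins']} min late at {ev['station']}"
            )
    return out


# Hypothetical alert configuration, written over time to a config topic.
config_changelog: List[Record] = [
    ("LDS", {"min_delay_mins": 10, "chat": "@alice"}),
    ("YRK", {"min_delay_mins": 5, "chat": "@bob"}),
    ("LDS", {"min_delay_mins": 3, "chat": "@alice"}),  # update for LDS
    ("YRK", None),                                     # YRK alert removed
]

# Hypothetical train movement events.
movements = [
    {"train_id": "1A23", "station": "LDS", "delay_mins": 5},
    {"train_id": "2B45", "station": "YRK", "delay_mins": 20},
    {"train_id": "3C67", "station": "LDS", "delay_mins": 1},
]

config_table = materialize(config_changelog)
for message in alerts(movements, config_table):
    print(message)
```

The configuration is never edited in place. Changing an alert means writing a new record with the same key, and the table view always reflects the most recent one. That is what makes a compacted topic a workable configuration store in this sketch.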