Software Development

Level: Intermediate

Kafka Fundamentals: Building Scalable Event Streaming Systems

3 days

Welcome to this comprehensive course on Apache Kafka fundamentals. This course will equip you with the knowledge and skills to design, implement, and manage robust event streaming systems using Apache Kafka.

Apache Kafka has become the de facto standard for building scalable, high-throughput messaging systems and event-driven architectures. This course will take you through the core concepts of Kafka, its architecture, and practical applications. You’ll learn how to set up Kafka clusters, produce and consume messages, and design efficient stream processing applications.

Learning Outcomes

By the end of this course, participants will be able to:

  • Understand the fundamental concepts and architecture of Apache Kafka
  • Set up and configure Kafka clusters using both ZooKeeper and the newer KRaft consensus protocol
  • Design and implement efficient producer and consumer applications
  • Apply best practices for topic design and partitioning strategies
  • Implement stream processing applications using Kafka Streams API
  • Ensure data durability and fault tolerance in Kafka deployments
  • Monitor and troubleshoot Kafka clusters effectively
  • Integrate Kafka with other systems in a data ecosystem

Your Instructor

The course is led by Peter Munro, a seasoned IT trainer and software developer with over 30 years of experience.

With Peter’s extensive industry background, you’ll gain insights into real-world best practices and common pitfalls to avoid. By the end of this course, you’ll be well-prepared to implement Kafka in your own projects and drive your organisation’s event streaming initiatives forward.

Course Outline

Module 1: Introduction to Apache Kafka

  • Evolution of distributed messaging systems and Kafka’s place in the ecosystem
  • Core concepts: topics, partitions, brokers, and consumer groups
  • Kafka’s distributed architecture and its benefits
  • Use cases and real-world applications of Kafka
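The core concepts listed above can be pictured with a toy model (a sketch only, not Kafka's actual implementation): a topic is a set of append-only partition logs, and each appended record receives a monotonically increasing offset within its partition.

```python
# Toy model of a Kafka topic: a fixed number of append-only partition logs.
# Illustrative only -- real brokers persist segments to disk and replicate them.

class ToyTopic:
    def __init__(self, name, num_partitions):
        self.name = name
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, partition, record):
        """Append a record and return its offset within the partition."""
        log = self.partitions[partition]
        log.append(record)
        return len(log) - 1  # offsets start at 0 and only ever grow

    def read(self, partition, offset):
        """Consumers read by (partition, offset); records are never mutated."""
        return self.partitions[partition][offset]

topic = ToyTopic("orders", num_partitions=3)
o0 = topic.append(0, "order-1")
o1 = topic.append(0, "order-2")
# o0 == 0, o1 == 1: ordering is guaranteed only within a single partition
```

Note that ordering holds per partition, not across the topic as a whole, which is why partitioning strategy matters so much in later modules.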

Module 2: Setting Up Kafka

  • Installing and configuring Kafka on various platforms
  • ZooKeeper-based setup (for versions prior to 3.3)
  • Introduction to the KRaft consensus protocol (production-ready from version 3.3)
  • Configuring brokers, topics, and basic security settings
  • Kafka cluster topology best practices
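As a reference point, a minimal single-node KRaft-mode broker configuration might look like the following. This is a sketch with assumed ports and paths; always check the configuration reference for your Kafka version.

```
# server.properties -- minimal combined broker+controller KRaft sketch
process.roles=broker,controller
node.id=1
controller.quorum.voters=1@localhost:9093
listeners=PLAINTEXT://:9092,CONTROLLER://:9093
controller.listener.names=CONTROLLER
inter.broker.listener.name=PLAINTEXT
log.dirs=/tmp/kraft-combined-logs
```

In production you would separate broker and controller roles, list multiple quorum voters, and replace the PLAINTEXT listeners with the secured listeners covered in Module 7.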

Module 3: Producing Messages

  • Understanding Kafka producers and their configuration options
  • Implementing idempotent and transactional producers
  • Handling serialisation and custom message formats
  • Balancing throughput, latency, and durability in message production
  • Best practices for error handling and retry mechanisms
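The retry guidance above can be sketched as a bounded-retry loop with exponential backoff. This is illustrative client-side logic only: the real Kafka producer retries internally (governed by settings such as `retries` and `delivery.timeout.ms`), and enabling idempotence is what makes those retries safe from duplicates.

```python
import time

def send_with_retries(send, record, max_attempts=4, base_delay=0.1):
    """Call send(record) until it succeeds or attempts are exhausted.

    Sketch of retry-with-backoff; `send` is a stand-in for any delivery
    function that raises on failure.
    """
    for attempt in range(max_attempts):
        try:
            return send(record)
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            time.sleep(base_delay * (2 ** attempt))  # exponential backoff

# Example: a flaky sink that fails twice before accepting the record.
attempts = {"n": 0}
def flaky_send(record):
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise ConnectionError("broker unavailable")
    return f"acked:{record}"

result = send_with_retries(flaky_send, "order-1")
# result == "acked:order-1" after two failed attempts
```

Without idempotence, a retry after a lost acknowledgement can write the same record twice; that is exactly the failure mode the idempotent and transactional producers in this module address.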

Module 4: Consuming Messages

  • Kafka consumer concepts and consumer groups
  • Implementing efficient consumer applications
  • Managing offsets and handling consumer lag
  • Strategies for scaling out consumption with consumer groups
  • Exactly-once semantics and handling duplicate messages
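Consumer-group scaling works by dividing a topic's partitions among the group's members. The sketch below models a range-style assignment (illustrative only; Kafka ships several assignors, including range, round-robin, and cooperative-sticky):

```python
def range_assign(partitions, consumers):
    """Assign a contiguous range of partitions to each consumer.

    Toy version of range assignment: earlier consumers receive one extra
    partition when the counts do not divide evenly.
    """
    consumers = sorted(consumers)
    n, c = len(partitions), len(consumers)
    per, extra = divmod(n, c)
    assignment, start = {}, 0
    for i, consumer in enumerate(consumers):
        count = per + (1 if i < extra else 0)
        assignment[consumer] = partitions[start:start + count]
        start += count
    return assignment

# 6 partitions across 4 consumers: the first two get 2 each, the rest 1 each
print(range_assign(list(range(6)), ["c1", "c2", "c3", "c4"]))
```

A consequence worth noticing: with more consumers than partitions, some consumers receive nothing, which is why partition count caps the useful parallelism of a group.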

Module 5: Advanced Kafka Concepts

  • Deep dive into partitioning strategies and their impact on scalability
  • Kafka’s storage internals: segments, indexes, and compaction
  • Implementing custom partitioners and serialisers
  • Kafka Connect for data integration with external systems
  • Schema management with Kafka Schema Registry
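Log compaction, mentioned above, retains the latest record for each key so that a compacted topic behaves like a changelog snapshot. A minimal model of the idea (ignoring segments, the dirty ratio, and tombstone retention windows):

```python
def compact(log):
    """Keep only the newest record per key, preserving first-seen key order.

    log: list of (key, value) pairs in offset order. A value of None models
    a tombstone, which removes the key entirely once compaction runs.
    """
    latest = {}
    for key, value in log:          # later offsets overwrite earlier ones
        latest[key] = value
    # Drop tombstoned keys and emit the surviving records
    return [(k, v) for k, v in latest.items() if v is not None]

log = [("user1", "a@x"), ("user2", "b@x"), ("user1", "a@y"), ("user2", None)]
print(compact(log))
# [('user1', 'a@y')] -- user1 keeps its latest value, user2 is deleted
```

This is why compacted topics suit use cases like database changelogs and Kafka Streams state-store backups, where only the current value per key matters.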

Module 6: Stream Processing with Kafka Streams

  • Introduction to stream processing concepts
  • Kafka Streams API architecture and core abstractions
  • Implementing stateless and stateful stream processing operations
  • Windowing operations and handling time in stream processing
  • Joining streams and tables in Kafka Streams applications
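Windowing assigns each event to a time bucket derived from its timestamp. A tumbling window, for instance, maps a timestamp to a fixed-size, non-overlapping interval; the sketch below illustrates the idea (Kafka Streams exposes this through its Java API, e.g. `TimeWindows`):

```python
def tumbling_window(timestamp_ms, size_ms):
    """Return the (start, end) of the tumbling window containing timestamp_ms.

    Windows are aligned to the epoch and non-overlapping: every event falls
    into exactly one window of width size_ms.
    """
    start = timestamp_ms - (timestamp_ms % size_ms)
    return (start, start + size_ms)

def window_counts(events, size_ms):
    """Count events per tumbling window -- a stateful aggregation sketch."""
    counts = {}
    for ts in events:
        window = tumbling_window(ts, size_ms)
        counts[window] = counts.get(window, 0) + 1
    return counts

# Three events with 1-second windows: two land in [0, 1000), one in [1000, 2000)
print(window_counts([120, 980, 1500], 1000))
```

Because the bucket is computed from event time rather than arrival time, late-arriving records still land in the correct window, which is the crux of handling time in stream processing.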

Module 7: Kafka Security

  • Authentication and authorisation in Kafka clusters
  • Implementing SSL/TLS encryption for client-broker communication
  • Role-Based Access Control (RBAC) in Kafka
  • Securing inter-broker communication
  • Best practices for securing Kafka in production environments

Module 8: Monitoring and Operations

  • Key metrics for monitoring Kafka clusters
  • Using Kafka’s built-in monitoring tools
  • Integration with external monitoring systems (e.g., Prometheus, Grafana)
  • Common operational tasks: adding brokers, rebalancing partitions
  • Disaster recovery strategies and multi-datacenter deployments
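One of the most important health signals above is consumer lag: the gap between a partition's latest offset and the consumer group's committed offset. The computation itself is simple (a sketch; in practice the offsets come from the broker, for example via the `kafka-consumer-groups.sh` tool):

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Per-partition lag: how far behind the consumer group is.

    Both arguments are dicts mapping partition -> offset. A partition with
    no committed offset counts as fully lagging in this sketch.
    """
    return {
        p: end - committed_offsets.get(p, 0)
        for p, end in log_end_offsets.items()
    }

lag = consumer_lag({0: 1500, 1: 900}, {0: 1480, 1: 900})
print(lag)
# {0: 20, 1: 0} -- partition 0 is 20 records behind; partition 1 is caught up
```

A lag that grows without bound usually means consumption cannot keep pace with production, pointing to the scaling strategies covered in Module 4.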

Module 9: Kafka Ecosystem and Integration

  • Overview of key Kafka ecosystem projects (e.g., Confluent Platform)
  • Integrating Kafka with Hadoop ecosystems
  • Using Kafka with containerisation and orchestration platforms (Docker, Kubernetes)
  • Kafka in microservices architectures
  • Event-driven design patterns with Kafka

Module 10: Capstone Project

  • Design and implement a real-world event streaming solution using Kafka
  • Apply best practices for topic design, partitioning, and consumer group strategies
  • Implement a stream processing application using Kafka Streams
  • Present and defend your Kafka-based architecture

Conclusion

Throughout this course, you'll gain hands-on experience with Kafka, learning from real-world scenarios and best practices. These insights go beyond theory, helping you navigate the complexities of implementing Kafka in production environments.

By mastering Kafka, you’ll be well-equipped to build scalable, resilient, and high-performance event streaming systems that can handle the demands of modern data-intensive applications. Whether you’re looking to modernise your existing infrastructure or build new event-driven architectures from the ground up, this course will give you the tools and confidence to succeed.