Developing and Testing Against a Local Kafka Cluster

I have wanted to write this article for some time because the vast majority of blog posts, articles, and yes, even documentation get one crucial piece wrong: they leave us with a configuration that is reachable only from inside the Docker network, not from outside it. If you want to develop and test against a Kafka cluster from your local development environment, read on.

Prerequisites

Before we begin, you should have Docker and Docker Compose installed on your machine. Docker is a tool that makes it easier to create, deploy, and run applications by using containers. Containers are lightweight, standalone, executable software packages that include everything needed to run an application. Docker Compose is a tool for defining and running multi-container Docker applications.

Configuring a Kafka Cluster

We are going to use Docker Compose to define our multi-container setup. We will have a ZooKeeper container (which Kafka uses for managing its cluster state) and three Kafka broker containers, kafka-1 through kafka-3.

Here is the docker-compose.yml file for the setup.

version: '3'

services:

  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - 2181:2181

  kafka-1:
    image: wurstmeister/kafka
    environment:
      KAFKA_LISTENERS: INTERNAL://kafka-1:19091,EXTERNAL://kafka-1:9091
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka-1:19091,EXTERNAL://kafka-1:9091
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: EXTERNAL:PLAINTEXT, INTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: 'false'
      KAFKA_BROKER_ID: 1
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: ${REPLICATION_FACTOR:-3}
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    ports:
      - 9091:9091
    depends_on:
      - zookeeper

  kafka-2:
    image: wurstmeister/kafka
    environment:
      KAFKA_LISTENERS: INTERNAL://kafka-2:19092,EXTERNAL://kafka-2:9092
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka-2:19092,EXTERNAL://kafka-2:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: EXTERNAL:PLAINTEXT, INTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: 'false'
      KAFKA_BROKER_ID: 2
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: ${REPLICATION_FACTOR:-3}
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    ports:
      - 9092:9092
    depends_on:
      - kafka-1

  kafka-3:
    image: wurstmeister/kafka
    environment:
      KAFKA_LISTENERS: INTERNAL://kafka-3:19093,EXTERNAL://kafka-3:9093
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka-3:19093,EXTERNAL://kafka-3:9093
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: EXTERNAL:PLAINTEXT, INTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: 'false'
      KAFKA_BROKER_ID: 3
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: ${REPLICATION_FACTOR:-3}
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    ports:
      - 9093:9093
    depends_on:
      - kafka-2

Understanding the Environment Variables

Let's dig into some key environment variables:

  • KAFKA_BROKER_ID: This is the unique identifier for each Kafka broker. It must be unique across all the brokers in a Kafka cluster.
  • KAFKA_ZOOKEEPER_CONNECT: This is the connection string for ZooKeeper. It has the format host:port (here, zookeeper:2181).
  • KAFKA_LISTENERS: This is a comma-separated list of listeners that the broker will listen on. Each listener has the format name://host:port.
  • KAFKA_ADVERTISED_LISTENERS: Advertised listeners are the addresses that a Kafka broker tells its clients to connect to. Clients use these addresses after bootstrapping, so the host and port advertised for a listener must be resolvable and reachable from wherever those clients run.
  • KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: This is a comma-separated list of listener_name:security_protocol pairs. Each listener name should match one declared in KAFKA_LISTENERS.
  • KAFKA_INTER_BROKER_LISTENER_NAME: This is the listener that the broker will use for communication between brokers. It must match one of the names defined in KAFKA_LISTENERS.

Note that we have defined two types of listeners: INTERNAL and EXTERNAL. INTERNAL listeners are used for communication between the Kafka brokers, while EXTERNAL listeners are used for communication between the Kafka brokers and clients outside of the Docker network.
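
To make the relationship between these variables concrete, here is a minimal Python sketch that splits the listener strings from the compose file and checks the two consistency rules described in the list. The names Listener, parse_listeners, parse_protocol_map, and check_broker_config are my own illustrative helpers, not part of Kafka or the Docker image, and the parsing is deliberately naive: it assumes well-formed name://host:port entries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listener:
    name: str
    host: str
    port: int


def parse_listeners(value: str) -> list[Listener]:
    """Split a value such as 'INTERNAL://kafka-1:19091,EXTERNAL://kafka-1:9091'."""
    listeners = []
    for entry in value.split(","):
        name, _, address = entry.strip().partition("://")
        host, _, port = address.rpartition(":")
        listeners.append(Listener(name, host, int(port)))
    return listeners


def parse_protocol_map(value: str) -> dict[str, str]:
    """Split a value such as 'EXTERNAL:PLAINTEXT, INTERNAL:PLAINTEXT'."""
    pairs = (entry.strip().split(":", 1) for entry in value.split(","))
    return {name: protocol for name, protocol in pairs}


def check_broker_config(listeners: str, protocol_map: str, inter_broker: str) -> list[str]:
    """Return human-readable problems; an empty list means none were found."""
    names = [listener.name for listener in parse_listeners(listeners)]
    protocols = parse_protocol_map(protocol_map)
    # Every declared listener needs an entry in the security protocol map.
    problems = [f"{name} has no security protocol" for name in names if name not in protocols]
    # The inter-broker listener must be one of the declared listener names.
    if inter_broker not in names:
        problems.append(f"inter-broker listener {inter_broker} is not declared")
    return problems
```

Called with the values used for kafka-1 in the compose file, check_broker_config returns an empty list, since both listeners have a protocol and INTERNAL is declared.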

Starting the Kafka Cluster

To start the Kafka cluster, navigate to the directory containing the docker-compose.yml file and run the following command:

docker compose up -d

You will now have a 3-broker Kafka cluster running locally on your machine.
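
The command returns as soon as the containers are started, so the brokers may need a few more seconds before they accept connections. As a rough readiness check, the following Python sketch polls the published host ports until a TCP connection succeeds. The helper names port_is_open and wait_for_ports are hypothetical, and an open port only shows that something is listening, not that the broker has finished starting.

```python
import socket
import time


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be established."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def wait_for_ports(host: str, ports: list[int], attempts: int = 30, delay: float = 1.0) -> list[int]:
    """Poll until every port accepts connections; return the ports still closed."""
    pending = list(ports)
    for _ in range(attempts):
        pending = [port for port in pending if not port_is_open(host, port)]
        if not pending:
            break
        time.sleep(delay)
    return pending
```

For the cluster above, wait_for_ports("localhost", [9091, 9092, 9093]) should return an empty list once all three brokers are listening on their published ports.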

Accessing the Kafka Cluster

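With the current configuration, you can connect to any of the Kafka brokers from your host OS using the name of the broker and its published port (kafka-1:9091, kafka-2:9092, kafka-3:9093). This assumes the broker names resolve on your host, for example through hosts-file entries that map them to 127.0.0.1, because Docker service names are not resolvable from the host by default.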
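
Connecting by broker name from the host only works if those names resolve there. Here is a minimal Python sketch for checking that, assuming that mapping the names to 127.0.0.1 in your hosts file is acceptable on your machine. The helpers unresolved_hosts and hosts_file_lines are illustrative names of my own; the sketch only builds suggested lines and never edits any file.

```python
import socket
from typing import Callable


def unresolved_hosts(hosts: list[str], resolve: Callable[[str], str] = socket.gethostbyname) -> list[str]:
    """Return the hostnames that cannot be resolved from this machine."""
    missing = []
    for host in hosts:
        try:
            resolve(host)
        except OSError:
            # socket.gaierror (unknown host) is a subclass of OSError.
            missing.append(host)
    return missing


def hosts_file_lines(hosts: list[str], address: str = "127.0.0.1") -> list[str]:
    """Build hosts-file entries that map each given name to address."""
    return [f"{address} {host}" for host in hosts]
```

Calling hosts_file_lines(unresolved_hosts(["kafka-1", "kafka-2", "kafka-3"])) gives the lines you would still need to add. On Linux and macOS the hosts file is usually /etc/hosts.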

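For example, to produce messages to test-topic through kafka-1, you can use the following command. It assumes the Kafka command-line tools are installed on your host and that test-topic already exists (create it with kafka-topics.sh first), because the compose file sets KAFKA_AUTO_CREATE_TOPICS_ENABLE to false:

kafka-console-producer.sh --bootstrap-server kafka-1:9091 --topic test-topic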

To consume messages from test-topic, you can use the following command:

kafka-console-consumer.sh --bootstrap-server kafka-1:9091 --topic test-topic --from-beginning

That's it! You now have a 3-broker Kafka cluster running locally on Docker, AND you can interact with it from your local machine. If you are curious where the examples elsewhere on the web go wrong, it is this environment variable. Notice the use of localhost instead of a broker name (e.g., kafka-3) for the EXTERNAL listener:

KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka-3:19093,EXTERNAL://localhost:9093

I hope this helps you and opens the door for development and testing with a local Kafka cluster.