Developing and Testing Against a Local Kafka Cluster
I have wanted to write this article for some time now because the vast majority of blog posts, articles, and yes, even documentation gets one crucial piece wrong: they leave you with a configuration that is reachable only from inside the Docker network, not from outside it. So if you want to be able to develop and test against a Kafka cluster from your local development environment, read on.
Prerequisites
Before we begin, you should have Docker and Docker Compose installed on your machine. Docker is a tool designed to make it easier to create, deploy, and run applications using containers. Containers are lightweight, standalone, executable software packages that include everything needed to run an application. Docker Compose is a tool for defining and running multi-container Docker applications.
Configuring a Kafka Cluster
We are going to use Docker Compose to define our multi-container setup. We will have a Zookeeper container (which Kafka uses for managing its cluster state) and three Kafka containers.
Here is the docker-compose.yml file for the setup.
version: '3'
services:
  zookeeper:
    image: wurstmeister/zookeeper
    ports:
      - "2181:2181"
  kafka-1:
    image: wurstmeister/kafka
    environment:
      KAFKA_LISTENERS: INTERNAL://kafka-1:19091,EXTERNAL://kafka-1:9091
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka-1:19091,EXTERNAL://kafka-1:9091
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: EXTERNAL:PLAINTEXT,INTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: "false"
      KAFKA_BROKER_ID: 1
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: ${REPLICATION_FACTOR:-3}
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    ports:
      - "9091:9091"
    depends_on:
      - zookeeper
  kafka-2:
    image: wurstmeister/kafka
    environment:
      KAFKA_LISTENERS: INTERNAL://kafka-2:19092,EXTERNAL://kafka-2:9092
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka-2:19092,EXTERNAL://kafka-2:9092
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: EXTERNAL:PLAINTEXT,INTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: "false"
      KAFKA_BROKER_ID: 2
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: ${REPLICATION_FACTOR:-3}
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    ports:
      - "9092:9092"
    depends_on:
      - kafka-1
  kafka-3:
    image: wurstmeister/kafka
    environment:
      KAFKA_LISTENERS: INTERNAL://kafka-3:19093,EXTERNAL://kafka-3:9093
      KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka-3:19093,EXTERNAL://kafka-3:9093
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: EXTERNAL:PLAINTEXT,INTERNAL:PLAINTEXT
      KAFKA_INTER_BROKER_LISTENER_NAME: INTERNAL
      KAFKA_AUTO_CREATE_TOPICS_ENABLE: "false"
      KAFKA_BROKER_ID: 3
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: ${REPLICATION_FACTOR:-3}
      KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
    ports:
      - "9093:9093"
    depends_on:
      - kafka-2
Understanding the Environment Variables
Let's dig into some key environment variables:
KAFKA_BROKER_ID: The unique identifier for each Kafka broker. It must be unique across all the brokers in a Kafka cluster.
KAFKA_ZOOKEEPER_CONNECT: The connection string for the Zookeeper connection, in the format host:port (here, zookeeper:2181).
KAFKA_LISTENERS: A comma-separated list of listeners that the broker will bind to. Each listener is of the format listener_name://host:port.
KAFKA_ADVERTISED_LISTENERS: The listeners that a Kafka broker tells its clients to connect to. Each advertised host and port must be reachable from wherever the client runs, which is exactly what makes or breaks access from outside Docker.
KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: A comma-separated list of listener_name:security_protocol pairs. Each listener name must match one declared in KAFKA_LISTENERS.
KAFKA_INTER_BROKER_LISTENER_NAME: The listener that the brokers use for communication among themselves. It must match one of the names defined in KAFKA_LISTENERS.
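One more variable worth calling out is KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR, which sets the replication factor of Kafka's internal consumer-offsets topic. In the Compose file above it is filled from a shell variable with a default of 3 (${REPLICATION_FACTOR:-3}), so it matches our three brokers out of the box. If you ever run fewer brokers, you can override it at startup, for example:
REPLICATION_FACTOR=1 docker compose up -d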
Note that we have defined two types of listeners: INTERNAL and EXTERNAL. INTERNAL listeners are used for communication between the Kafka brokers, while EXTERNAL listeners are used for communication between the Kafka brokers and clients outside of the Docker network.
Starting the Kafka Cluster
To start the Kafka cluster, navigate to the directory containing the docker-compose.yml file and run the following command:
docker compose up -d
You will now have a 3-broker Kafka cluster running locally on your machine.
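Before connecting any clients, it is worth a quick sanity check that Zookeeper and all three brokers actually came up. These are plain Docker Compose commands, nothing specific to this setup:
docker compose ps
docker compose logs kafka-1
docker compose ps should show all four containers running, and each broker's logs should show a successful startup.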
Accessing the Kafka Cluster
With the current configuration, you can connect to any of the Kafka brokers from your host OS using the name of the broker and the respective port (kafka-1:9091, kafka-2:9092, kafka-3:9093), provided those names resolve to 127.0.0.1 on your host, as shown below.
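Because kafka-1, kafka-2, and kafka-3 are Docker Compose service names, your host OS will not resolve them on its own. One simple way to make them resolvable is to point them at 127.0.0.1 in your hosts file (/etc/hosts on Linux and macOS); the entries below are an example for this particular setup:
127.0.0.1 kafka-1
127.0.0.1 kafka-2
127.0.0.1 kafka-3
Also remember that KAFKA_AUTO_CREATE_TOPICS_ENABLE is false in our configuration, so test-topic needs to exist before you produce to it. Assuming the Kafka command-line tools are installed on your host (the topic name, partition count, and replication factor below are just an example), something like this creates it:
kafka-topics.sh --create --bootstrap-server kafka-1:9091 --topic test-topic --partitions 3 --replication-factor 3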
For example, to produce messages to test-topic on kafka-1, you can use the following command:
kafka-console-producer.sh --broker-list kafka-1:9091 --topic test-topic
To consume messages from test-topic, you can use the following command:
kafka-console-consumer.sh --bootstrap-server kafka-1:9091 --topic test-topic --from-beginning
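If you don't have the Kafka command-line tools installed locally, you can run the same tools from inside one of the broker containers instead. In that case you are on the Docker network, so you address the INTERNAL listener rather than the EXTERNAL one; assuming the wurstmeister image has the Kafka scripts on its PATH (which it normally does), something like this should work:
docker compose exec kafka-1 kafka-console-producer.sh --broker-list kafka-1:19091 --topic test-topic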
That's it! You now have a 3-broker Kafka cluster running locally on Docker, AND you can interact with it from your local machine. If you are curious where the rest of the web goes wrong in their examples, it's this environment variable. Notice the use of localhost for the EXTERNAL listener instead of the broker's name (e.g., kafka-3):
KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka-3:19093,EXTERNAL://localhost:9093
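Compare that with the line we used for the same broker above, where the EXTERNAL listener advertises the broker's own name:
KAFKA_ADVERTISED_LISTENERS: INTERNAL://kafka-3:19093,EXTERNAL://kafka-3:9093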
I hope this helps you and opens the door for development and testing with a local Kafka cluster.