"kafka.errors.NoBrokersAvailable: NoBrokersAvailable" problem [duplicate]

I set up a single-node Kafka Docker container on my local machine as described in the Confluent documentation (steps 2-3).
In addition, I exposed Zookeeper's port 2181 and Kafka's port 9092 so that I can connect to them from a client running on the local machine:
$ docker run -d \
-p 2181:2181 \
--net=confluent \
--name=zookeeper \
-e ZOOKEEPER_CLIENT_PORT=2181 \
confluentinc/cp-zookeeper:4.1.0
$ docker run -d \
--net=confluent \
--name=kafka \
-p 9092:9092 \
-e KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181 \
-e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092 \
-e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
confluentinc/cp-kafka:4.1.0
Problem: When I try to connect to Kafka from the host machine, the connection fails because it can't resolve address: kafka:9092.
Here is my Java code:
Properties props = new Properties();
props.put("bootstrap.servers", "localhost:9092");
props.put("client.id", "KafkaExampleProducer");
props.put("key.serializer", LongSerializer.class.getName());
props.put("value.serializer", StringSerializer.class.getName());
KafkaProducer<Long, String> producer = new KafkaProducer<>(props);
ProducerRecord<Long, String> record = new ProducerRecord<>("foo", 1L, "Test 1");
producer.send(record).get();
producer.flush();
The exception:
java.io.IOException: Can't resolve address: kafka:9092
at org.apache.kafka.common.network.Selector.doConnect(Selector.java:235) ~[kafka-clients-2.0.0.jar:na]
at org.apache.kafka.common.network.Selector.connect(Selector.java:214) ~[kafka-clients-2.0.0.jar:na]
at org.apache.kafka.clients.NetworkClient.initiateConnect(NetworkClient.java:864) [kafka-clients-2.0.0.jar:na]
at org.apache.kafka.clients.NetworkClient.ready(NetworkClient.java:265) [kafka-clients-2.0.0.jar:na]
at org.apache.kafka.clients.producer.internals.Sender.sendProducerData(Sender.java:266) [kafka-clients-2.0.0.jar:na]
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:238) [kafka-clients-2.0.0.jar:na]
at org.apache.kafka.clients.producer.internals.Sender.run(Sender.java:176) [kafka-clients-2.0.0.jar:na]
at java.lang.Thread.run(Thread.java:748) [na:1.8.0_144]
Caused by: java.nio.channels.UnresolvedAddressException: null
at sun.nio.ch.Net.checkAddress(Net.java:101) ~[na:1.8.0_144]
at sun.nio.ch.SocketChannelImpl.connect(SocketChannelImpl.java:622) ~[na:1.8.0_144]
at org.apache.kafka.common.network.Selector.doConnect(Selector.java:233) ~[kafka-clients-2.0.0.jar:na]
... 7 common frames omitted
Question: How to connect to Kafka running in Docker? My code is running from host machine, not Docker.
Note: I know that I could theoretically play around with DNS setup and /etc/hosts but it is a workaround - it shouldn't be like that.
There is also a similar question here; however, it is based on the ches/kafka image. I use a confluentinc-based image, which is not the same.

Disclaimer
tl;dr - A simple port forward from the container to the host will not work, and no hosts files should be modified. What exact IP/hostname + port do you want to connect to? Make sure that value is set as advertised.listeners on the broker. Make sure that address and the servers listed as part of bootstrap.servers are actually resolvable (ping an IP/hostname, use netcat to check ports...)
To verify the ports are mapped correctly on the host, ensure that docker ps shows the kafka container is mapped from 0.0.0.0:<host_port> -> <advertised_listener_port>/tcp. The ports must match if trying to run a client from outside the Docker network.
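The resolvability and port checks mentioned above can be sketched in Python using only the standard library. The function names are illustrative and not part of any Kafka client. This only tests DNS resolution and TCP reachability; it does not tell you whether the broker's advertised listeners are correct.

```python
import socket


def can_resolve(hostname):
    """Return True if the local resolver can turn hostname into an address."""
    try:
        socket.getaddrinfo(hostname, None)
        return True
    except socket.gaierror:
        return False


def port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable hosts.
        return False
```

For the setup in the question, can_resolve("kafka") would be expected to return False on the host, which is the same failure the Java client reports.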
The answer below uses the confluentinc Docker images to address the question that was asked, not wurstmeister/kafka. More specifically, the latter images are not well maintained despite being among the most popular Kafka Docker images.
The following sections try to aggregate all the details needed to use another image. For other commonly used Kafka images, it's all the same Apache Kafka running in a container.
You are only dependent on how it is configured, and on which variables control that configuration.
wurstmeister/kafka
Refer to their README section on listener configuration. Also read their Connectivity wiki.
bitnami/kafka
If you want a small container, try these. The images are much smaller than the Confluent ones and are much better maintained than wurstmeister. Refer to their README for listener configuration.
debezium/kafka
Docs on it are mentioned here.
Note: advertised host and port settings are deprecated. Advertised listeners covers both. Similar to the Confluent containers, Debezium can use KAFKA_ prefixed broker settings to update its properties.
Others
spotify/kafka is deprecated and outdated.
fast-data-dev or lensesio/box are great for an all in one solution, but are bloated if you only want Kafka
Your own Dockerfile - Why? Is something incomplete with these others? Start with a pull request, not starting from scratch.
For supplemental reading, a fully-functional docker-compose, and network diagrams, see this blog by @rmoff
Answer
The Confluent quickstart (Docker) document assumes all produce and consume requests will be within the Docker network.
You could fix the problem of connecting to kafka:9092 by running your Kafka client code within its own container as that uses the Docker network bridge, but otherwise you'll need to add some more environment variables for exposing the container externally, while still having it work within the Docker network.
First add a protocol mapping of PLAINTEXT_HOST:PLAINTEXT that will map the listener protocol to a Kafka protocol
Key: KAFKA_LISTENER_SECURITY_PROTOCOL_MAP
Value: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
Then set up two advertised listeners on different ports. (kafka here refers to the Docker container name; it might also be named broker, so double-check your service and hostnames.)
Key: KAFKA_ADVERTISED_LISTENERS
Value: PLAINTEXT://kafka:9092,PLAINTEXT_HOST://localhost:29092
Notice the protocols here match the left-side values of the protocol mapping setting above
When running the container, add -p 29092:29092 to map the host port to the advertised PLAINTEXT_HOST listener.
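As a minimal sketch, the two settings and the port mapping can be generated from a few parameters. The helper names dual_listener_env and docker_run_args are hypothetical; the defaults mirror the values given above (container name kafka, internal port 9092, host port 29092).

```python
def dual_listener_env(container_name="kafka", internal_port=9092,
                      host_name="localhost", host_port=29092):
    """Build the two environment settings for one internal and one host listener."""
    return {
        # Both listener names map to the unencrypted PLAINTEXT protocol.
        "KAFKA_LISTENER_SECURITY_PROTOCOL_MAP":
            "PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT",
        # One address for clients inside the Docker network, one for the host.
        "KAFKA_ADVERTISED_LISTENERS":
            f"PLAINTEXT://{container_name}:{internal_port},"
            f"PLAINTEXT_HOST://{host_name}:{host_port}",
    }


def docker_run_args(env, host_port=29092):
    """Render the -p and -e arguments for docker run as a list of strings."""
    args = ["-p", f"{host_port}:{host_port}"]
    for key, value in env.items():
        args += ["-e", f"{key}={value}"]
    return args
```

Calling docker_run_args(dual_listener_env()) yields the arguments to add to the docker run command from the question; the remaining options (network, name, Zookeeper connection, image) stay as they were.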
tl;dr
(with the above settings)
If something still doesn't work, KAFKA_LISTENERS can be set to include <PROTOCOL>://0.0.0.0:<PORT> where both options match the advertised setting and Docker-forwarded port
Client on same machine, not in a container
Advertising localhost and the associated port will let you connect outside of the container, as you'd expect.
In other words, when running any Kafka Client outside the Docker network (including CLI tools you might have installed locally), use localhost:29092 for bootstrap servers and localhost:2181 for Zookeeper (requires Docker port forwarding)
Client on another machine
If trying to connect from an external server, you'll need to advertise the external hostname/ip (e.g. 192.168.x.y) of the host as well as/in place of localhost.
Simply advertising localhost with a port forward will not work because Kafka protocol will still continue to advertise the listeners you've configured.
This setup requires Docker port forwarding and router port forwarding (and firewall / security group changes) if not in the same local network, for example, your container is running in the cloud and you want to interact with it from your local machine.
Client (or another broker) in a container, on the same host
This is the least error-prone configuration; you can use DNS service names directly.
When running an app in the Docker network, use kafka:9092 (see advertised PLAINTEXT listener config above) for bootstrap servers and zookeeper:2181 for Zookeeper, just like any other Docker service communication (doesn't require any port forwarding)
If you use separate docker run commands, or Compose files, you need to define a shared network manually
See the example Compose file for the full Confluent stack or more minimal one for a single broker.
If using multiple brokers, then they need to use unique hostnames + advertised listeners. See example
Related question
Connect to Kafka on host from Docker (ksqlDB)
Appendix
For anyone interested in Kubernetes deployments:
Accessing Kafka
Operators (recommended): https://operatorhub.io/?keyword=Kafka
Helm Artifact Hub: https://artifacthub.io/packages/search?ts_query_web=kafka&sort=stars&page=1

When you first connect to a Kafka node, it returns the list of all Kafka nodes along with the URL to use for each. Your application then connects to every Kafka node directly.
The issue is always which URL Kafka gives you. That is why KAFKA_ADVERTISED_LISTENERS exists: Kafka uses it to tell the world how it can be reached.
For your use case, there are several small things to consider:
Say you set plaintext://kafka:9092
This is OK if an application in your Docker Compose setup uses Kafka. That application gets back a URL containing kafka, which is resolvable through the Docker network.
If you try to connect from your main system, or from another container that is not in the same Docker network, this fails because the kafka name cannot be resolved.
==> To fix this, you need a dedicated DNS server such as a service discovery one, which is a lot of trouble for a small setup. Alternatively, manually map the kafka name to the container IP in each /etc/hosts.
If you set plaintext://localhost:9092
This works on your system if you have a port mapping (-p 9092:9092 when launching Kafka).
This fails if you test from an application in a container (same Docker network or not), because localhost is that container itself, not the Kafka one.
==> If you have this and want to use a Kafka client in another container, one way to fix it is to share the network between both containers (same IP).
Last option: set an IP in the name: plaintext://x.y.z.a:9092 (the Kafka advertised URL cannot be 0.0.0.0, as stated in the docs: https://kafka.apache.org/documentation/#brokerconfigs_advertised.listeners)
This works for everybody... BUT how do you get the x.y.z.a address?
The only way is to hardcode this IP when you launch the container: docker run .... --net confluent --ip 10.x.y.z .... Note that you need to adapt the IP to a valid one in the confluent subnet.
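The first two cases can be captured in a toy model. This is a deliberately simplified sketch, not how Kafka or Docker resolve names internally: the VIEWS table and the vantage point labels are assumptions made up for illustration, and the host entry assumes a -p 9092:9092 port mapping.

```python
# What each name points to from each vantage point (simplified, assumed model).
VIEWS = {
    # On the host, localhost reaches the broker through the port mapping.
    "host": {"localhost": "broker"},
    # In the same Docker network, the container name resolves to the broker,
    # while localhost is the client container itself.
    "container_same_network": {"kafka": "broker", "localhost": "self"},
    # In another network, the kafka name does not resolve at all.
    "container_other_network": {"localhost": "self"},
}


def reaches_broker(advertised_host, vantage):
    """Return True if the advertised host leads to the broker from this vantage point."""
    return VIEWS[vantage].get(advertised_host) == "broker"
```

With this model, reaches_broker("kafka", "host") is False and reaches_broker("localhost", "container_same_network") is False, which are the two failures described above.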

First, start ZooKeeper:
docker container run --name zookeeper -p 2181:2181 zookeeper
Then start Kafka:
docker container run --name kafka -p 9092:9092 -e KAFKA_ZOOKEEPER_CONNECT=192.168.8.128:2181 -e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://ip_address_of_your_computer_but_not_localhost!!!:9092 -e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 confluentinc/cp-kafka
In the Kafka consumer and producer config:
@Bean
public ProducerFactory<String, String> producerFactory() {
Map<String, Object> configProps = new HashMap<>();
configProps.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.8.128:9092");
configProps.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
configProps.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG, StringSerializer.class);
return new DefaultKafkaProducerFactory<>(configProps);
}
@Bean
public ConsumerFactory<String, String> consumerFactory() {
Map<String, Object> props = new HashMap<>();
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "192.168.8.128:9092");
props.put(ConsumerConfig.GROUP_ID_CONFIG, "group_id");
props.put(ConsumerConfig.KEY_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
props.put(ConsumerConfig.VALUE_DESERIALIZER_CLASS_CONFIG, StringDeserializer.class);
return new DefaultKafkaConsumerFactory<>(props);
}
I run my project with these settings. Good luck.

The simplest way to solve this is to add a custom hostname to your broker using the -h option:
docker run -d \
--net=confluent \
--name=kafka \
-h broker-1 \
-p 9092:9092 \
-e KAFKA_ZOOKEEPER_CONNECT=zookeeper:2181 \
-e KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://kafka:9092 \
-e KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
confluentinc/cp-kafka:4.1.0
and edit your /etc/hosts
127.0.0.1 broker-1
and use:
props.put(ConsumerConfig.BOOTSTRAP_SERVERS_CONFIG, "broker-1:9092");

This allows me to access localhost:9092 in Kafka applications on my M1 Mac
Key: KAFKA_ADVERTISED_LISTENERS
Value: PLAINTEXT://kafka:29092,PLAINTEXT_HOST://localhost:9092
plus port forwarding:
ports
- "9092:9092"
Finally, for my setup, I also have to set the listeners key this way:
Key: KAFKA_LISTENERS
Value: PLAINTEXT://0.0.0.0:29092,PLAINTEXT_HOST://0.0.0.0:9092
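A small consistency check for these two values might look like the following sketch. The function names are hypothetical, and the rule that the bound port and the advertised port must be equal is the convention used in this thread rather than something Kafka enforces in every deployment.

```python
def parse_listeners(value):
    """Turn 'NAME://host:port,NAME2://host:port' into {NAME: (host, port)}."""
    parsed = {}
    for entry in value.split(","):
        name, address = entry.strip().split("://", 1)
        host, port = address.rsplit(":", 1)
        parsed[name] = (host, int(port))
    return parsed


def check_listener_config(listeners, advertised):
    """Return a list of problems found; an empty list means the two agree."""
    bound = parse_listeners(listeners)
    problems = []
    for name, (host, port) in parse_listeners(advertised).items():
        if host == "0.0.0.0":
            problems.append(f"{name}: 0.0.0.0 cannot be advertised")
        if name not in bound:
            problems.append(f"{name}: advertised but missing from listeners")
        elif bound[name][1] != port:
            problems.append(
                f"{name}: listens on {bound[name][1]} but advertises {port}")
    return problems
```

Passing the KAFKA_LISTENERS and KAFKA_ADVERTISED_LISTENERS values from this answer returns an empty list.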

Related

How to produce/consume data from/to Docker using Kafka?

I have seen a lot of questions and topics, but they are all about docker-compose and creating a Kafka container. I have a container system with 1 namenode and 3 datanodes. Two ports are mapped to the Docker container: 8088 and 50070. I want to send data using Kafka from my local machine to Docker, and I need to send it through 8088 or 50070, but I can't figure out how to do this. I edited
listeners=SASL_PLAINTEXT://0.0.0.0:8088,PLAINTEXT://0.0.0.0:9092
advertised.listeners=SASL_PLAINTEXT://localhost:8088,PLAINTEXT://localhost:9092
security.inter.broker.protocol=SASL_PLAINTEXT
in server.properties, but it didn't work. This is what I tried: I created a topic named test in Docker and sent some data to it from the Docker terminal. Locally, I then try to consume the data from the topic with --bootstrap-server localhost:8088, but it gives an error.
Thank you.

have a 1 namenode and 3 datanode container system
These are not Kafka services, and Docker is irrelevant to the question.
If you want to send data to Hadoop datanodes on these ports, you need to use Kafka Connect, or from Python you can use Spark, Flink, etc.
If all services are running in containers, then Compose or your host isn't completely relevant, either.
You'd use a bridge network - https://docs.docker.com/network/bridge/
I need to send the data using 8088 or 50070.
Ports don't matter, but you really should consider leaving Kafka ports as the defaults.
But you must properly forward the listener port from the host with -p 8088:8088, and you must set the necessary SASL properties in whatever command you're using with --bootstrap-server localhost:8088. That should work from both the host and inside the container (which is where you should first debug the problem).

Docker Swarm Failing to Resolve DNS by Service Name With Python Celery Workers Connecting to RabbitMQ Broker Resulting in Connection Timeout

Setup
I have Docker installed and connected 9 machines, 1 manager and 8 worker nodes, using Docker swarm. This arrangement has been used in our development servers for ~5 years now.
I'm using this to launch a task queue that uses Celery for Python. Celery is using RabbitMQ as its broker and Redis for the results backend.
I have created an overlay network in Docker so that all my Celery workers launched by Docker swarm can reference their broker and results backend by name; i.e., rabbitmq or redis, instead of by IP address. The network was created by running the following command:
docker network create -d overlay <network_name>
The RabbitMQ service and Redis service were launched on the manager node under this overlay network using the following commands:
docker service create --network <my_overlay_network> --name redis --constraint "node.hostname==manager" redis
docker service create --network <my_overlay_network> --name rabbitmq --constraint "node.hostname==manager" rabbitmq
Once both of these have been launched, I deploy my Celery workers, one per each Docker swarm worker node, on the same overlay network using the following command:
docker service create --network <my_overlay_network> --name celery-worker --constraint "node.hostname!=manager" --replicas 8 --replicas-max-per-node 1 <my_celery_worker_image>
Before someone suggests it: yes, I know I should be using a Docker Compose file to launch all of this. I'm currently testing, and I'll write one up after I get everything working.
The Problem
The Celery workers are configured to reference their broker and backend by the container name:
app = Celery('tasks', backend='redis://redis', broker='pyamqp://guest@rabbitmq//')
Once all the services have been launched and verified by Docker, 3 of the 8 start successfully, connect to the broker and backend, and allow me to begin running tasks on them. The other 5 continuously time out when attempting to connect to RabbitMQ and report the following message:
consumer: Cannot connect to amqp://guest:**@rabbitmq:5672//: timed out.
I'm at my wits' end trying to find out why only 3 of my worker nodes allow the connection to occur while the other 5 cause a continuous timeout. All launched services are connected over the same overlay network.
The issue persists when I attempt to use brokers other than RabbitMQ, leading me to think that it's not specific to any one broker. I'd likely have issues connecting to any service by name on the overlay network when on the machines that are reporting the timeout. Stopping the service and launching again always produces the same results: the same 3 nodes work while the other 5 time out.
All nodes are running the same version of Docker (19.03.4, build 9013bf583a), and the machines were created from identical images. They're virtually the same. The only difference among them is their hostnames, e.g., manager, worker1, worker2, and so on.
I have been able to replicate this setup outside of Docker swarm (all on one machine) by using a bridge network instead of overlay when developing my application on my personal computer without issue. I didn't experience problems until I launched everything on our development server, using the steps detailed above, to test it before pushing it to production.
Any ideas on why this is occurring and how I can remedy it? Switching from Docker swarm to Kubernetes isn't an option for me currently.

It's not the answer I wanted, but this appears to be an ongoing bug in Docker swarm. For anyone who is interested, I'll include the issue page.
https://github.com/docker/swarmkit/issues/1429
There's a workaround listed by one user there that may work for some, but your mileage may vary. It didn't work for me. The workaround is listed in the bullets below:
Don't try to use docker for Windows to get multi-node mesh network (swarm) running. It's simply not (yet) supported. If you google around, you find some Microsoft blogs telling about it. Also the docker documentation mentions it somewhere. It would be nice, if docker cmd itself would print an error/warning when trying to set something up under Windows - which simply doesn't work. It does work on a single node though.
Don't try to use a Linux in a Virtualbox under Windows and hoping to workaround with it. It, of course, doesn't work since it has the same limitations as the underlying Windows.
Make sure you open ports at least 7946 tcp/udp and 4789 udp for worker nodes. For master also 2377 tcp. Use e.g. netcat -vz -u for udp check. Without -u for tcp.
Make sure to pass --advertise-addr on the docker worker node (!) when executing the join swarm command. Here put the external IP address of the worker node which has the mentioned ports open. Doublecheck that the ports are really open!
Using ping to check the DNS resolution for container names works. Forgetting --advertise-addr or not opening port 7946 results in DNS resolution not working on worker nodes!
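The port checks in the bullets above can be listed per node role with a short sketch. The helper name netcat_checks and the SWARM_PORTS table are illustrative; the ports and the netcat flags are the ones quoted in the bullets, and the host address passed in is whatever node you want to test.

```python
# Ports quoted in the workaround: workers need 7946 tcp/udp and 4789 udp,
# and the manager additionally needs 2377 tcp.
SWARM_PORTS = {
    "worker": [(7946, "tcp"), (7946, "udp"), (4789, "udp")],
    "manager": [(7946, "tcp"), (7946, "udp"), (4789, "udp"), (2377, "tcp")],
}


def netcat_checks(host, role="worker"):
    """Return the netcat command lines for checking each required port on host."""
    commands = []
    for port, proto in SWARM_PORTS[role]:
        # -u switches netcat to UDP; TCP is the default without it.
        flags = "-vz -u" if proto == "udp" else "-vz"
        commands.append(f"netcat {flags} {host} {port}")
    return commands
```

Printing the result of netcat_checks for each node gives a checklist to run from another machine in the swarm.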
I suggest attempting all of the above first if you encounter the same issue. To clarify a few things in the above bullet points, the --advertise-addr flag should be used on a worker node when joining it to the swarm. If your worker node doesn't have a static IP address, you can use the interface to connect it. Run ifconfig to view your interfaces. You'll need to use the interface that has your external facing IP address. For most people, this will probably be eth0, but you should still check before running the command. Doing this, the command you would issue on the worker is:
docker swarm join --advertise-addr eth0:2377 --token <your_token> <manager_ip>:2377
With 2377 being the port Docker uses. Verify that you joined with your correct IP address by going into your manager node and running the following:
docker node inspect <your_node_name>
If you don't know your node name, it should be the host name of the machine which you joined as a worker node. You can see it by running:
docker node ls
If you joined on the right interface, you will see this at the bottom of the return when running inspect:
{
"Status": "ready",
"Addr": <your_workers_external_ip_addr>
}
If you verified that everything has joined correctly but the issue still persists, you can try launching your services with the additional flag --dns-option use-vc when running docker service create, as follows:
docker service create --dns-option use-vc --network <my_overlay> ...
Lastly, if all the above fails for you as it did for me, then you can expose the port of the running service you wish to connect to in the swarm. For me, I wished to connect my services on my worker nodes to RabbitMQ and Redis on my manager node. I did so by exposing the service's port. You can do this at creation by running:
docker service create -p <port>:<port> ...
Or after the service has been launched by running:
docker service update --publish-add <port>:<port> <service_name>
After this, your worker node services can connect to the manager node service by the IP address of the worker node host and the port you exposed. For example, using RabbitMQ, this would be:
pyamqp://<user>:<pass>@<worker_host_ip_addr>:<port>/<vhost>
Hopefully this helps someone who stumbles on this post.
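A minimal sketch for assembling that broker URL in Python follows. The helper name broker_url is hypothetical, and percent-encoding the parts follows general URL rules; check how your Celery version parses the vhost before relying on it.

```python
from urllib.parse import quote


def broker_url(user, password, host, port, vhost):
    """Build pyamqp://<user>:<pass>@<host>:<port>/<vhost> with reserved characters escaped."""
    return (
        f"pyamqp://{quote(user, safe='')}:{quote(password, safe='')}"
        f"@{host}:{port}/{quote(vhost, safe='')}"
    )
```

The result can be passed as the broker argument to Celery in place of the name-based URL, for example broker_url("guest", "guest", "<worker_host_ip_addr>", 5672, "myvhost") with your own address and vhost filled in.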

Port not exposed but still reachable on internal docker network

(I'm having the inverse problem of exposing a port and it's not reachable.)
In my case I have 2 containers on the same network. One is an Alpine Python running a Python Flask app. The other is a barebones Ubuntu 18.04. The services are initialised basically like this:
docker-compose.yml:
version: '3'
services:
  pythonflask:
    build: someDockerfile # from python:3.6-alpine
    restart: unless-stopped
  ubuntucontainer:
    build: someOtherDockerfile # from ubuntu:18.04
    depends_on:
      - pythonflask
    restart: unless-stopped
The Python Flask app runs on port 5000.
Notice the lack of expose: - 5000 in the docker-compose.yml file.
The problem is that I'm able to get a correct response when cURLing http://pythonflask:5000 from inside ubuntucontainer
Steps:
$ docker exec -it ubuntucontainer /bin/bash
...and then within the container...
root@ubuntucontainer:/# curl http://pythonflask:5000/
...correctly returns my response from the Flask app.
However from my machine running docker:
$ curl http://localhost:5000/
Doesn't return anything (as expected).
As I test different ports, they get automatically exposed each time. What is doing this?

Connectivity between containers is achieved by placing the containers on the same docker network and communicating over the container ip and port (rather than the host published port). So what does expose do then?
Expose is documentation
Expose in docker is used by image creators to document the expected port that the application will listen on inside the container. With the exception of some tools and a flag in docker that uses this metadata documentation, it is not used to control access between containers or modify docker's networking. Applications may be reconfigured at runtime to listen to a different port, and you can connect to ports that have not been exposed.
For DNS lookups between containers, the network needs to be user created, not one of the default networks from docker (e.g. DNS is not enabled in the default bridge network named "bridge"). With DNS, you can lookup the container name, service name (from a compose file), and any network aliases created for that container on that network.
The other half of the equation is "publish" in docker. This creates a mapping from the host to the container to allow external access. It is implemented with a proxy process that runs on the host and forwards new connections. Because of the implementation, you can publish a port on the host even if the container is not listening on the port, though you would receive an error when trying to connect to that port in that scenario.

The lack of expose: ... just means that there is no port exposed from the service group you defined in your docker-compose.yml
Within the images you use, there are still exposed ports which are reachable from within the network that is automatically created by docker-compose.
That is why you reach one container from within another. In addition every container can be accessed via service name from the docker-compose.yml on the internal network.
You should not be able to access flask from your host (http://localhost:5000)

Accessing Docker Container on Centos Server

I've managed to deploy a Django app inside a docker container on my personal Mac using localhost with Apache. For this, I use docker-compose with the build and up commands. I'm trying to run the same Django app on a CentOS server using a docker image generated on my local machine. Apache is also running on the server on port 90.
docker run -it -d --hostname xxx.xxx.xxx -p 9090:9090 --name test idOfImage
How can I access this container with Apache using the hostname and port number in the URL? Any help would be greatly appreciated. Thanks.

From other containers the best way to access this container is to attach both to the same network and use the container's --name as a DNS name and the internal port (the second port from the -p option, which isn't strictly required for this case); from outside a container or from other hosts use the host's IP address or DNS name and the published port (the first port from the -p option).
The docker run --hostname option isn't especially useful; the only time you'd want to specify it is if you have some magic licensed software that only ran if it had a particular hostname.
Avoid localhost in a Docker context, except for the very specific case where you know you're running a process on the host system outside a container and you're trying to access a container's published port or some other service running on the host. Don't use "localhost" as a generic term, it has a very specific context-dependent meaning (every process believes it's running "on localhost").
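The rule above (container name plus internal port from other containers, host address plus published port from everywhere else) can be sketched as follows. The function names are hypothetical, and parse_publish handles only the simple two-part host_port:container_port form of -p, not the variants that include an IP address or a protocol.

```python
def parse_publish(spec):
    """Split a simple 'host_port:container_port' value as passed to -p."""
    host_port, container_port = spec.split(":")
    return int(host_port), int(container_port)


def address_for(caller, container_name, host_address, publish):
    """Pick the address a caller should use to reach the container."""
    host_port, container_port = parse_publish(publish)
    if caller == "container":
        # Same Docker network: use the container name and the internal port.
        return f"{container_name}:{container_port}"
    # Outside Docker or on another host: use the host and the published port.
    return f"{host_address}:{host_port}"
```

For the docker run command in the question, address_for("external", "test", "<host address>", "9090:9090") gives the host address with port 9090.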

Can't access app deployed with docker and google cloud

I currently have a Linux Debian VM set up through Google Cloud Platform. I have docker installed and would like to start running application containers within it.
I'm following the documentation on Docker's website, under
"Running a web application in Docker". I downloaded the image and ran it with no issue. I then ran $ sudo docker ps and got the port mapping, which is 0.0.0.0:32768->5000/tcp
I then try to browse to the website at http://"MyExternalVMIP":32768 but the application doesn't come up. Am I missing something?

First, test to see if your service works at all. To do this, from the VM itself, run:
wget http://localhost:32768
or
curl http://localhost:32768
If that works, that means the service is operating properly, so let's move further with the debugging.
There may be two firewalls that are blocking external access to your docker process:
the VM's OS firewall
Google Compute Engine firewall
You can see if you're affected by the first issue by accessing the URL from the VM itself and from another VM on the same GCE network (use the VM name in the URL, not the external IP):
wget http://[vm-name]:32768
To fix the first issue, you would have to either open up the single port (recommended):
iptables -I INPUT -p tcp -s 0.0.0.0/0 --dport 32768 -j ACCEPT
or disable firewall entirely, e.g., by stopping iptables (not recommended).
If, after fixing this, you can access the URL from another host on the same GCE network, but still can't access it from outside of Google Compute Engine, you're affected by the second issue. To fix it, you will need to open the port in the GCE firewall; this can also be done via the web UI in the Developers Console.
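The debugging order described in this answer can be sketched in Python. The names responds and diagnose are hypothetical; responds only reports whether any HTTP response came back, and diagnose simply encodes the decision steps from the text.

```python
import urllib.error
import urllib.request


def responds(url, timeout=3.0):
    """Return True if any HTTP response comes back, even an error status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just not with a success status.
        return True
    except (urllib.error.URLError, OSError):
        return False


def diagnose(local_ok, same_network_ok, external_ok):
    """Map the three probe results to the most likely culprit."""
    if not local_ok:
        return "service is not responding on the VM itself"
    if not same_network_ok:
        return "VM OS firewall"
    if not external_ok:
        return "Google Compute Engine firewall"
    return "reachable"
```

Run responds against http://localhost:32768 on the VM, against http://[vm-name]:32768 from another VM on the same network, and against the external IP from outside, then pass the three results to diagnose.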

Create an entry in your local SSH config file, as below, with a specific local forward port. In my case, the example uses YARN's IP, which I want to access in a browser.
Host hadoop
HostName <External-IP>
User <Local-machine-username>
IdentityFile ~/.ssh/<private-key-for-above-user>
LocalForward 8089 <Internal-IP>:8088
