I've created a Spark cluster with one master and two slaves, each one in a Docker container.
I launch it with the command start-all.sh.
I can reach the UI from my local machine at localhost:8080 and it shows that the cluster is up and running:
Screenshot of Spark UI
Then I try to submit a simple Python script from my host machine (not from the Docker container) with this spark-submit command: spark-submit --master spark://spark-master:7077 test.py
test.py:
import pyspark
conf = pyspark.SparkConf().setAppName('MyApp').setMaster('spark://spark-master:7077')
sc = pyspark.SparkContext(conf=conf)
But the console returned this error:
22/01/26 09:20:39 INFO StandaloneAppClient$ClientEndpoint: Connecting to master spark://spark-master:7077...
22/01/26 09:20:40 WARN StandaloneAppClient$ClientEndpoint: Failed to connect to master spark-master:7077
org.apache.spark.SparkException: Exception thrown in awaitResult:
at org.apache.spark.util.ThreadUtils$.awaitResult(ThreadUtils.scala:226)
at org.apache.spark.rpc.RpcTimeout.awaitResult(RpcTimeout.scala:75)
at org.apache.spark.rpc.RpcEnv.setupEndpointRefByURI(RpcEnv.scala:101)
at org.apache.spark.rpc.RpcEnv.setupEndpointRef(RpcEnv.scala:109)
at org.apache.spark.deploy.client.StandaloneAppClient$ClientEndpoint$$anonfun$tryRegisterAllMasters$1$$anon$1.run(StandaloneAppClient.scala:106)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.io.IOException: Failed to connect to spark-master:7077
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:245)
at org.apache.spark.network.client.TransportClientFactory.createClient(TransportClientFactory.java:187)
at org.apache.spark.rpc.netty.NettyRpcEnv.createClient(NettyRpcEnv.scala:198)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:194)
at org.apache.spark.rpc.netty.Outbox$$anon$1.call(Outbox.scala:190)
... 4 more
Caused by: java.net.UnknownHostException: spark-master
at java.base/java.net.InetAddress$CachedAddresses.get(InetAddress.java:797)
at java.base/java.net.InetAddress.getAllByName0(InetAddress.java:1509)
at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1368)
at java.base/java.net.InetAddress.getAllByName(InetAddress.java:1302)
at java.base/java.net.InetAddress.getByName(InetAddress.java:1252)
at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:146)
at io.netty.util.internal.SocketUtils$8.run(SocketUtils.java:143)
at java.base/java.security.AccessController.doPrivileged(Native Method)
at io.netty.util.internal.SocketUtils.addressByName(SocketUtils.java:143)
at io.netty.resolver.DefaultNameResolver.doResolve(DefaultNameResolver.java:43)
at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:63)
at io.netty.resolver.SimpleNameResolver.resolve(SimpleNameResolver.java:55)
at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:57)
at io.netty.resolver.InetSocketAddressResolver.doResolve(InetSocketAddressResolver.java:32)
at io.netty.resolver.AbstractAddressResolver.resolve(AbstractAddressResolver.java:108)
at io.netty.bootstrap.Bootstrap.doResolveAndConnect0(Bootstrap.java:202)
at io.netty.bootstrap.Bootstrap.access$000(Bootstrap.java:48)
at io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:182)
at io.netty.bootstrap.Bootstrap$1.operationComplete(Bootstrap.java:168)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:577)
at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:551)
at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:490)
at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:615)
at io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:604)
at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:104)
at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84)
at io.netty.channel.AbstractChannel$AbstractUnsafe.safeSetSuccess(AbstractChannel.java:985)
at io.netty.channel.AbstractChannel$AbstractUnsafe.register0(AbstractChannel.java:505)
at io.netty.channel.AbstractChannel$AbstractUnsafe.access$200(AbstractChannel.java:416)
at io.netty.channel.AbstractChannel$AbstractUnsafe$1.run(AbstractChannel.java:475)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:510)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:518)
at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
... 1 more
I also tried with a simple Scala script, just to try to reach the cluster, but I got the same error.
Do you have any idea how I can reach my cluster with a Python script?
(Edit)
I forgot to mention that I've created a Docker network between my master and my slaves.
So, with the help of MrTshoot and Gaarv, I replaced spark-master (in spark://spark-master:7077) with the IP of my master container (you can get it with the command docker network inspect my-network).
And it works! Thanks!
When you specify .setMaster('spark://spark-master:7077'), it means "reach the Spark cluster at the DNS name spark-master on port 7077", which your local machine cannot resolve.
So, in order for your host machine to reach the cluster, you must instead specify the Docker DNS name / IP address of your Spark cluster: check the "docker0" interface on your local machine and replace "spark-master" with it.
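For example, here is a minimal sketch of the driver-side configuration, assuming the master container's address on the Docker network is 172.18.0.2 (a hypothetical IP; check yours with docker network inspect):
import pyspark
# Point the driver at an address the host can actually resolve,
# instead of the container hostname spark-master.
conf = pyspark.SparkConf().setAppName('MyApp').setMaster('spark://172.18.0.2:7077')
sc = pyspark.SparkContext(conf=conf)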
You cannot reach Docker services from your host by their service name. You should use the service's DNS name or IP address, or use the following trick:
Expose your Spark cluster's ports.
Open your /etc/hosts file and add the following content to it:
127.0.0.1 localhost spark-master
::1 localhost spark-master
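Afterwards, a quick sanity check from the host (a minimal sketch; it only verifies that the name now resolves locally, not that the master's port 7077 is actually published):
import socket
# Should print 127.0.0.1 once the /etc/hosts entry is in place
print(socket.gethostbyname('spark-master'))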
I am running Cassandra in a Docker container on a custom server.
I start cassandra docker like this:
docker run --name cassandra -p 9042:9042 -d cassandra:latest
When I want to connect to the server via the Python Cassandra driver from DataStax like this:
from cassandra.cqlengine import connection
connection.setup(["http://myserver.myname.com"], "cqlengine", protocol_version=3)
The following exception is thrown:
File "C:\LONG\PATH\TO\venv\lib\site-packages\cassandra\cqlengine\connection.py", line 106, in setup
self.cluster = Cluster(self.hosts, **self.cluster_options)
File "cassandra\cluster.py", line 1181, in cassandra.cluster.Cluster.__init__
cassandra.UnresolvableContactPoints: {}
python-BaseException
After hours of searching through Docker network permissions, I found a simple solution, so maybe this will help you too.
The solution is simply to remove "http://" from the server URL, changing my code from
connection.setup(["http://myserver.myname.com"], "cqlengine", protocol_version=3)
To
connection.setup(["myserver.myname.com"], "cqlengine", protocol_version=3)
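In other words, the contact points must be bare hostnames or IPs, never URLs. To check connectivity independently of cqlengine, here is a minimal sketch using the plain DataStax driver (assuming port 9042 is published as in the docker run command above):
from cassandra.cluster import Cluster
# Contact points are plain hostnames/IPs with no URL scheme; 9042 is the default port
cluster = Cluster(['myserver.myname.com'], port=9042)
session = cluster.connect()
print(session.execute('SELECT release_version FROM system.local').one())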
I thought it was a Docker networking issue, and it took me many hours to pin it down to this simple mistake.
I'm trying to connect Mission Planner so that I can simulate, but I'm getting this error:
(proje) C:\Users\hasan>mavproxy.py --master tcp:127.0.0.1:5760 --out udp:127.0.0.1:20000 --out udp:127.0.0.1:10000
Auto-detected serial ports are:
Connect 0.0.0.0:14550 source_system=255
Loaded module console
Log Directory:
Telemetry log: mav.tlog
Waiting for heartbeat from 0.0.0.0:14550
MAV>
The IP I want to connect to is 127.0.0.1, but it tries to connect to 0.0.0.0:14550. I did some research, but none of the solutions I found solved my problem.
MAVProxy version = 1.8.35
pymavlink version = 2.4.8
Screenshot
I want to install this module, but something goes wrong when I run the step docker-compose build ...
I tried updating the Docker version and restarting Docker many times, but it didn't work.
git clone https://github.com/uhh-lt/158.git
cd 158
docker-compose build
File "/home/ming/.local/bin/docker-compose", line 8, in <module>
sys.exit(main())
File "/home/ming/.local/lib/python3.8/site-packages/compose/cli/main.py", line 67, in main
command()
File "/home/ming/.local/lib/python3.8/site-packages/compose/cli/main.py", line 123, in perform_command
project = project_from_options('.', options)
File "/home/ming/.local/lib/python3.8/site-packages/compose/cli/command.py", line 60, in project_from_options
return get_project(
File "/home/ming/.local/lib/python3.8/site-packages/compose/cli/command.py", line 131, in get_project
client = get_client(
File "/home/ming/.local/lib/python3.8/site-packages/compose/cli/docker_client.py", line 41, in get_client
client = docker_client(
File "/home/ming/.local/lib/python3.8/site-packages/compose/cli/docker_client.py", line 170, in docker_client
client = APIClient(**kwargs)
File "/home/ming/.local/lib/python3.8/site-packages/docker/api/client.py", line 188, in __init__
self._version = self._retrieve_server_version()
File "/home/ming/.local/lib/python3.8/site-packages/docker/api/client.py", line 212, in _retrieve_server_version
raise DockerException(
docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))
Update 2020-11-23
Thanks to the two of you for helping me with the error! I tried the commands but still can't connect to Docker:
ming@KITM-7664:~$ sudo /etc/init.d/docker start
[sudo] password for ming:
* Starting Docker: docker [ OK ]
ming@KITM-7664:~$ which docker
/usr/bin/docker
ming@KITM-7664:~$ docker version
Client: Docker Engine - Community
Version: 19.03.13
API version: 1.40
Go version: go1.13.15
Git commit: 4484c46d9d
Built: Wed Sep 16 17:02:52 2020
OS/Arch: linux/amd64
Experimental: false
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
ming@KITM-7664:~$ systemctl status docker
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down
ming@KITM-7664:~$ systemctl start docker
System has not been booted with systemd as init system (PID 1). Can't operate.
Failed to connect to bus: Host is down
ming@KITM-7664:~$ sudo /etc/init.d/docker start
* Starting Docker: docker [ OK ]
ming@KITM-7664:~$ docker version
Client: Docker Engine - Community
Version: 19.03.13
API version: 1.40
Go version: go1.13.15
Git commit: 4484c46d9d
Built: Wed Sep 16 17:02:52 2020
OS/Arch: linux/amd64
Experimental: false
Cannot connect to the Docker daemon at unix:///var/run/docker.sock. Is the docker daemon running?
Are you sure Docker is running on your system? You can get that error when Compose is not able to connect to Docker via the Docker socket (if no other connection method is defined).
If you are running on Linux, you can usually run systemctl status docker to check whether the Docker daemon is running and systemctl start docker to start it.
It would help to know what OS and Docker version you are using.
Set the permissions like this:
sudo chmod 666 /var/run/docker.sock
When using WSL (Windows Subsystem for Linux), you need to enable 'WSL Integration' for the required distro in Docker Desktop for Windows (Settings -> Resources -> WSL Integration -> Enable integration with required distros).
By default, the docker command can only be run by the root user or by a user in the docker group, which is automatically created during Docker’s installation process. If you want to avoid typing sudo whenever you run the docker command, add your username to the docker group:
sudo usermod -aG docker ${USER}
To apply the new group membership, log out of the server and back in, or type the following:
su - ${USER}
You will be prompted to enter your user’s password to continue.
sudo service docker start
or
sudo service docker restart
I just had the same issue after updating Docker Desktop for Windows to its latest version (20.10.2, build 2291f61). It turned out that this update had disabled the WSL2 integration with my virtual Ubuntu 18.04, which I use to run most projects.
I solved it this way:
Open Docker Desktop
Go to Settings > Resources > WSL Integration
Make sure that your distribution is enabled
Restart Docker
Docker should work again without the need to restart WSL2.
I had a similar issue, and it turned out to be due to the Docker server not running. I started the app and then ran docker-compose up, and it worked fine. Hope it helps anyone who's caught in a similar situation. :-)
You don't have permission to use the Docker socket; by default, only the docker group can access it. You can verify this with ls -l /var/run/docker.sock, which will print something like:
srw-rw----. 1 root docker 0 Oct 4 18:04 /var/run/docker.sock
To be able to access the socket and use Docker, add yourself to the Docker group with the following command:
sudo usermod -a -G docker $(whoami)
Then log out and back in. Docker will now work.
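Once the group change has taken effect, you can verify that the socket is reachable with the Python Docker SDK, which is what docker-compose uses under the hood (a minimal sketch; assumes the docker package is installed, e.g. via pip install docker):
import docker
# from_env() talks to /var/run/docker.sock by default
client = docker.from_env()
print(client.ping())                 # True if the daemon is reachable
print(client.version()['Version'])   # daemon version string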
Make sure you check the box for 'Use Docker Compose V2'. That solved it for me.
Screenshot of Docker Desktop settings
If you're using docker-compose with Podman, try the command below to resolve the "docker.errors.DockerException: Error while fetching server API version: ('Connection aborted.', FileNotFoundError(2, 'No such file or directory'))" issue:
systemctl start podman.socket
I had the same problem, but I managed to solve it by starting the Docker server.
I also get this error when I try to run docker-compose while Docker Desktop is turned off. I think it is a bug in docker-compose that it doesn't notify the user that the problem is that the Docker service is down, unlike the Docker CLI, which throws the correct error:
C:\Users\x\IdeaProjects\mongo-exmple>docker pull node
Using default tag: latest
error during connect: This error may indicate that the docker daemon is not running.: Post "http://%2F%2F.%2Fpipe%2Fdocker_engine/v1.24/images/create?fromImage=node&tag=latest": open //./pipe/docker_engine: The system cannot find
the file specified.
On macOS and Windows it is sometimes enough to start the Docker application, since this error is caused by the Docker daemon not running on the system.
My issue was that I was not signed in to Docker Desktop.
On Linux (at least Ubuntu 18.04.x), you sometimes need to log out of and back into your X user session. Logging out of bash or out of the terminal window and logging back in is not enough; see for example https://askubuntu.com/questions/1161020/groups-and-groups-user-show-different-groups-dialout-is-missing
I had this happen on a Mac after doing an OS update.
I was able to run docker on the command line but Docker Desktop wasn't running - starting Docker Desktop up fixed the issue.
On Windows, exiting Docker and running it again sometimes helps; apparently Docker doesn't always start up properly.
I had this problem while using Ubuntu 22.04 on WSL on Windows, and I solved it like this:
sudo update-alternatives --config iptables
It will show something like this:
Selection Path Priority Status
------------------------------------------------------------
0 /usr/sbin/iptables-nft 20 auto mode
* 1 /usr/sbin/iptables-legacy 10 manual mode
2 /usr/sbin/iptables-nft 20 manual mode
Then type 1 at the prompt:
Press <enter> to keep the current choice[*], or type selection number: 1
Then start Docker using the service script:
sudo service docker start
I had this error running docker-compose in the Windows Subsystem for Linux on Windows 10; the problem was that I had forgotten to start the Docker Desktop application.
I have this Python server:
import SocketServer

class TCPHandler(SocketServer.BaseRequestHandler):
    def handle(self):
        print 'client is connected'
        data = self.request.recv(1024)
        print data
        self.request.sendall('Message received!')

HOST, PORT = '0.0.0.0', 5004

server = SocketServer.TCPServer((HOST, PORT), TCPHandler)
print 'Listening on port {}'.format(PORT)
server.serve_forever()
and this client:
import socket
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect(('192.168.56.101', 5004))
s.sendall('Hello my name is John')
response = s.recv(1024)
print 'Response: {}'.format(response)
My OS is macOS, and I have Ubuntu 14.04 installed in a virtual machine using VirtualBox. In VirtualBox, I set up a NAT network and gave Ubuntu this IP address: 192.168.56.101. I put the server program on Ubuntu and added a rule in iptables to allow incoming connections on port 5004. I started the server on Ubuntu and tried to connect to it using the client above from macOS. The connection went through and the data exchange finished successfully.
Now to the problem. I installed Docker on my virtualized Ubuntu. The Docker image itself uses another copy of Ubuntu 14.04. What I want is to run the server inside the Dockerized Ubuntu, so I wrote this Dockerfile:
FROM bamos/ubuntu-opencv-dlib-torch:ubuntu_14.04-opencv_2.4.11-dlib_19.0-torch_2016.07.12
ADD . /root
EXPOSE 5004
CMD ["python2", "/root/server.py"]
I built it using this command: sudo docker build -t boring91/mock and it was built successfully. I ran the Docker container using this command: sudo docker run -p 5004:5004 -t boring91/mock and it showed that it started listening on port 5004. When I tried to connect to it using my client on macOS, the socket connected but no data exchange happened. The same thing happens when I run the client on the virtualized Ubuntu. Can anyone tell me what the issue is here?