Windows error while running standalone pyspark - python

I am trying to import pyspark in Anaconda and run some sample code. However, whenever I try to run the code in Anaconda, I get the following error message:
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:53294)
Traceback (most recent call last):
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 1021, in send_command
self.socket.sendall(command.encode("utf-8"))
ConnectionResetError: [WinError 10054] An existing connection was forcibly closed by the remote host
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 883, in send_command
response = connection.send_command(command)
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 1025, in send_command
"Error while sending", e, proto.ERROR_ON_SEND)
py4j.protocol.Py4JNetworkError: Error while sending
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 827, in _get_connection
connection = self.deque.pop()
IndexError: pop from an empty deque
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 963, in start
self.socket.connect((self.address, self.port))
ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:53294)
Traceback (most recent call last):
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 827, in _get_connection
connection = self.deque.pop()
IndexError: pop from an empty deque
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 963, in start
self.socket.connect((self.address, self.port))
ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:53294)
Traceback (most recent call last):
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 827, in _get_connection
connection = self.deque.pop()
IndexError: pop from an empty deque
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 963, in start
self.socket.connect((self.address, self.port))
ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:53294)
Traceback (most recent call last):
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 827, in _get_connection
connection = self.deque.pop()
IndexError: pop from an empty deque
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 963, in start
self.socket.connect((self.address, self.port))
ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it
ERROR:py4j.java_gateway:An error occurred while trying to connect to the Java server (127.0.0.1:53294)
Traceback (most recent call last):
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 827, in _get_connection
connection = self.deque.pop()
IndexError: pop from an empty deque
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 963, in start
self.socket.connect((self.address, self.port))
ConnectionRefusedError: [WinError 10061] No connection could be made because the target machine actively refused it
Reloaded modules: py4j.protocol, pyspark.sql.context, py4j.java_gateway, py4j.compat, pyspark.profiler, pyspark.sql.catalog, pyspark.context, pyspark.sql.group, pyspark.sql.conf, pyspark.sql.readwriter, pyspark.resultiterable, pyspark.sql, pyspark.sql.dataframe, pyspark.traceback_utils, pyspark.cloudpickle, pyspark.rddsampler, pyspark.accumulators, pyspark.broadcast, py4j, pyspark.rdd, pyspark.sql.functions, pyspark.java_gateway, pyspark.statcounter, pyspark.conf, pyspark.serializers, pyspark.files, pyspark.join, pyspark.sql.streaming, pyspark.shuffle, pyspark, py4j.version, pyspark.sql.session, pyspark.sql.column, py4j.finalizer, py4j.java_collections, pyspark.status, pyspark.sql.window, pyspark.sql.utils, pyspark.storagelevel, pyspark.heapq3, py4j.signals, pyspark.sql.types
Traceback (most recent call last):
File "", line 1, in
runfile('C:/Users/hlee/Desktop/pyspark.py', wdir='C:/Users/hlee/Desktop')
File "C:\Program Files\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 866, in runfile
execfile(filename, namespace)
File "C:\Program Files\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "C:/Users/hlee/Desktop/pyspark.py", line 38, in
sc = SparkContext()
File "C:\spark\python\lib\pyspark.zip\pyspark\context.py", line 115, in init
conf, jsc, profiler_cls)
File "C:\spark\python\lib\pyspark.zip\pyspark\context.py", line 168, in _do_init
self._jsc = jsc or self._initialize_context(self._conf._jconf)
File "C:\spark\python\lib\pyspark.zip\pyspark\context.py", line 233, in _initialize_context
return self._jvm.JavaSparkContext(jconf)
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\java_gateway.py", line 1401, in call
answer, self._gateway_client, None, self._fqn)
File "C:\spark\python\lib\py4j-0.10.3-src.zip\py4j\protocol.py", line 319, in get_return_value
format(target_id, ".", name), value)
Py4JJavaError: An error occurred while calling None.org.apache.spark.api.java.JavaSparkContext.
: java.net.BindException: Cannot assign requested address: bind: Service 'sparkDriver' failed after 16 retries! Consider explicitly setting the appropriate port for the service 'sparkDriver' (for example spark.ui.port for SparkUI) to an available port or increasing spark.port.maxRetries.
at sun.nio.ch.Net.bind0(Native Method)
at sun.nio.ch.Net.bind(Net.java:433)
at sun.nio.ch.Net.bind(Net.java:425)
at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223)
at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
at io.netty.channel.socket.nio.NioServerSocketChannel.doBind(NioServerSocketChannel.java:125)
at io.netty.channel.AbstractChannel$AbstractUnsafe.bind(AbstractChannel.java:485)
at io.netty.channel.DefaultChannelPipeline$HeadContext.bind(DefaultChannelPipeline.java:1089)
at io.netty.channel.AbstractChannelHandlerContext.invokeBind(AbstractChannelHandlerContext.java:430)
at io.netty.channel.AbstractChannelHandlerContext.bind(AbstractChannelHandlerContext.java:415)
at io.netty.channel.DefaultChannelPipeline.bind(DefaultChannelPipeline.java:903)
at io.netty.channel.AbstractChannel.bind(AbstractChannel.java:198)
at io.netty.bootstrap.AbstractBootstrap$2.run(AbstractBootstrap.java:348)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
The following is my sample code; I have no problem running Apache Spark itself from cmd.
import os
import sys

spark_path = r"C:\spark"

os.environ['SPARK_HOME'] = spark_path

sys.path.insert(0, spark_path + "/bin")
sys.path.insert(0, spark_path + "/python/pyspark/")
sys.path.insert(0, spark_path + "/python/lib/pyspark.zip")
sys.path.insert(0, spark_path + "/python/lib/py4j-0.10.3-src.zip")

from pyspark import SparkContext

sc = SparkContext()

import random

NUM_SAMPLES = 100000

def sample(p):
    x, y = random.random(), random.random()
    return 1 if x*x + y*y < 1 else 0

count = sc.parallelize(range(0, NUM_SAMPLES)).map(sample) \
           .reduce(lambda a, b: a + b)

print("Pi is roughly %f" % (4.0 * count / NUM_SAMPLES))
I have downloaded winutils.exe and added a HADOOP_HOME variable to my environment variables, and I added export SPARK_MASTER_IP=127.0.0.1 and export SPARK_LOCAL_IP=127.0.0.1 to the spark-env.sh file. However, I am still getting the same error. Can someone help me and point out what I am missing?
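For reference, this is roughly what setting the same variables from inside the script would look like (only a sketch of what I am attempting, not a confirmed fix; the C:\hadoop path is a placeholder for wherever winutils.exe lives):

import os
from pyspark import SparkConf, SparkContext

# Sketch only: mirror the spark-env.sh settings directly in the driver script.
os.environ["SPARK_HOME"] = r"C:\spark"
os.environ["HADOOP_HOME"] = r"C:\hadoop"      # placeholder: folder containing bin\winutils.exe
os.environ["SPARK_LOCAL_IP"] = "127.0.0.1"

conf = SparkConf().setMaster("local[*]").setAppName("pi-sample")
conf.set("spark.driver.host", "127.0.0.1")    # pin the driver to localhost

sc = SparkContext(conf=conf)
print(sc.version)
sc.stop()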
Thank you in advance,

In my case I just had to restart the kernel.
The problem was that I was creating the Spark environment twice: every time I made a mistake I re-ran the code from the beginning, so a new context was started while the old one was still running.
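If re-running the script in the same console is unavoidable, a minimal sketch of one way to avoid starting a second context (assuming a plain local setup) is to reuse or stop the existing one:

from pyspark import SparkConf, SparkContext

conf = SparkConf().setMaster("local[*]").setAppName("pi-sample")

# getOrCreate() returns the context that is already running (if any),
# so re-running this block does not try to launch a second driver JVM.
sc = SparkContext.getOrCreate(conf)

# ... run the job ...

sc.stop()   # release the JVM before creating a fresh context later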

Related

Nmap scan a network range with subnet raises PortScannerError

I'm trying to run an nmap scan of the 192.168.10.1/24 network, using Python 3.9.2, Nmap 7.80 and python-nmap 0.7.1.
Code:
import nmap
nm = nmap.PortScanner()
result = nm.scan(hosts = '192.168.10.1/24')
print(result)
Error:
Traceback (most recent call last):
File "/usr/local/lib/python3.9/dist-packages/nmap/nmap.py", line 369, in analyse_nmap_xml_scan
dom = ET.fromstring(self._nmap_last_output)
File "/usr/lib/python3.9/xml/etree/ElementTree.py", line 1348, in XML
return parser.close()
xml.etree.ElementTree.ParseError: no element found: line 7, column 0
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/home/fypgroup02/Documents/test.py", line 4, in <module>
result = nm.scan(hosts = '192.168.10.1/24')
File "/usr/local/lib/python3.9/dist-packages/nmap/nmap.py", line 306, in scan
return self.analyse_nmap_xml_scan(
File "/usr/local/lib/python3.9/dist-packages/nmap/nmap.py", line 372, in analyse_nmap_xml_scan
raise PortScannerError(nmap_err)
nmap.nmap.PortScannerError: "nmap: Target.cc:503: void Target::stopTimeOutClock(const timeval*): Assertion `htn.toclock_running == true' failed.\n"
I have tried scanning 192.168.10.1-10 and it works, but why does 192.168.10.1/24 not work?
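Purely as a stop-gap sketch based on the observation above that range syntax works (this is a workaround, not an explanation of the assertion failure), the /24 could be scanned in host-range chunks and the per-chunk results merged:

import nmap

nm = nmap.PortScanner()
merged = {}
# Work around the CIDR problem by scanning the /24 in the
# 192.168.10.x-y range syntax that was reported to work.
for start in range(1, 255, 32):
    end = min(start + 31, 254)
    result = nm.scan(hosts='192.168.10.%d-%d' % (start, end))
    merged.update(result.get('scan', {}))
print(merged)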

Add GPS to Your Raspberry Pi Project with Google Maps Pubnub

(I'm following this tutorial: https://www.pubnub.com/blog/raspberry-pi-gps-lte-google-maps-api/ and I use a Raspberry Pi 4 Model B connected through wifi.)
I carefully followed all the steps until I tried to run the modified gps_simpletest.py file. When I try to run it I get the following error:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/serial/serialposix.py", line 323, in _rec onfigure_port
orig_attr = termios.tcgetattr(self.fd)
termios.error: (5, 'Input/output error')
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "gps_simpletest.py", line 21, in <module>
uart = serial.Serial("/dev/ttyS0", baudrate=9600, timeout=10)
File "/usr/lib/python3/dist-packages/serial/serialutil.py", line 240, in __ini t__
self.open()
File "/usr/lib/python3/dist-packages/serial/serialposix.py", line 272, in open
self._reconfigure_port(force_update=True)
File "/usr/lib/python3/dist-packages/serial/serialposix.py", line 326, in _rec onfigure_port
raise SerialException("Could not configure port: {}".format(msg))
serial.seria
I'm pretty sure I installed all the modules, libraries, etc.
Any help would be much appreciated! Thanks in advance.
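As a purely diagnostic sketch (the device names below are common Raspberry Pi UART nodes, assumed here rather than taken from the tutorial), one could first check which serial device nodes actually exist before opening the port:

import os

# Diagnostic only: see which UART device nodes are present on the Pi.
for dev in ("/dev/serial0", "/dev/ttyS0", "/dev/ttyAMA0", "/dev/ttyUSB0"):
    print(dev, "exists" if os.path.exists(dev) else "missing")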

Python connection with Hive database in HDInsight

I am trying to create a connection to Hive hosted in an HDInsight cluster through my Python script and am getting the error below:
Traceback (most recent call last):
File "ClassLoader.java", line 357, in java.lang.ClassLoader.loadClass
File "Launcher.java", line 349, in sun.misc.Launcher$AppClassLoader.loadClass
File "ClassLoader.java", line 424, in java.lang.ClassLoader.loadClass
File "URLClassLoader.java", line 382, in java.net.URLClassLoader.findClass
java.lang.ClassNotFoundException: java.lang.ClassNotFoundException: org.apache.thrift.transport.TTransportException
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "org.jpype.JPypeContext.java", line 330, in org.jpype.JPypeContext.callMethod
File "Method.java", line 498, in java.lang.reflect.Method.invoke
File "DelegatingMethodAccessorImpl.java", line 43, in sun.reflect.DelegatingMethodAccessorImpl.invoke
File "NativeMethodAccessorImpl.java", line 62, in sun.reflect.NativeMethodAccessorImpl.invoke
File "NativeMethodAccessorImpl.java", line -2, in sun.reflect.NativeMethodAccessorImpl.invoke0
File "DriverManager.java", line 247, in java.sql.DriverManager.getConnection
File "DriverManager.java", line 664, in java.sql.DriverManager.getConnection
File "HiveDriver.java", line 105, in org.apache.hive.jdbc.HiveDriver.connect
Exception: Java Exception
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "s.py", line 5, in <module>
"/root/jdbc/hive-jdbc-1.2.1000.2.6.5.3009-43.jar")
File "/usr/local/lib64/python3.6/site-packages/jaydebeapi/__init__.py", line 412, in connect
jconn = _jdbc_connect(jclassname, url, driver_args, jars, libs)
File "/usr/local/lib64/python3.6/site-packages/jaydebeapi/__init__.py", line 230, in _jdbc_connect_jpype
return jpype.java.sql.DriverManager.getConnection(url, *dargs)
java.lang.NoClassDefFoundError: java.lang.NoClassDefFoundError: org/apache/thrift/transport/TTransportException
My script is:
import jaydebeapi
conn = jaydebeapi.connect("org.apache.hive.jdbc.HiveDriver",
"jdbc:hive2://10.20.30.40:10001/default;transportMode=http;ssl=false;httpPath=/hive2",
["username", "password"],
"/root/jdbc/hive-jdbc-1.2.1000.2.6.5.3009-43.jar")
I have exported CLASSPATH with all jar files.
The error is java.lang.ClassNotFoundException, which indicates that the execution is not able to find the jar /root/jdbc/hive-jdbc-1.2.1000.2.6.5.3009-43.jar. I believe it is only placed on the host from which you are executing the code. I would suggest placing the jar file in the same directory structure on all the nodes in the cluster, and checking permissions so that the user executing the job has access to that path.
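As an illustrative pre-flight check along the same lines (not part of the original answer), the script could verify on its own host that the driver jar exists and is readable before calling jaydebeapi.connect:

import os

jar = "/root/jdbc/hive-jdbc-1.2.1000.2.6.5.3009-43.jar"

# Fail fast with a clear message if the driver jar is missing or unreadable
# on the host that actually runs the script.
if not os.path.isfile(jar):
    raise FileNotFoundError("Hive JDBC driver jar not found: " + jar)
if not os.access(jar, os.R_OK):
    raise PermissionError("Hive JDBC driver jar is not readable: " + jar)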

Can not import theano after installation windows 10

I've still not been able to resolve this problem, after working on it for a week.
I'm thinking of giving up and just running Theano on a virtual machine; there just doesn't seem to be any support out there for Windows 10!
Or am I wrong; is there an easy fix to this?
>>> import theano
Traceback (most recent call last):
File "C:\Users\cturn\Anaconda3\lib\site-packages\theano\theano\gof\lazylinker_c.py", line 75, in <module>
raise ImportError()
ImportError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\cturn\Anaconda3\lib\site-packages\theano\theano\gof\lazylinker_c.py", line 92, in <module>
raise ImportError()
ImportError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\cturn\Anaconda3\lib\site-packages\theano\theano\gof\cmodule.py", line 1784, in _try_compile_tmp
os.remove(exe_path + ".exe")
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\cturn\\AppData\\Local\\Temp\\try_march_3v6ffkv9.exe'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Users\cturn\Anaconda3\lib\site-packages\theano\theano\__init__.py", line 66, in <module>
from theano.compile import (
File "C:\Users\cturn\Anaconda3\lib\site-packages\theano\theano\compile\__init__.py", line 10, in <module>
from theano.compile.function_module import *
File "C:\Users\cturn\Anaconda3\lib\site-packages\theano\theano\compile\function_module.py", line 21, in <module>
import theano.compile.mode
File "C:\Users\cturn\Anaconda3\lib\site-packages\theano\theano\compile\mode.py", line 10, in <module>
import theano.gof.vm
File "C:\Users\cturn\Anaconda3\lib\site-packages\theano\theano\gof\vm.py", line 659, in <module>
from . import lazylinker_c
File "C:\Users\cturn\Anaconda3\lib\site-packages\theano\theano\gof\lazylinker_c.py", line 125, in <module>
args = cmodule.GCC_compiler.compile_args()
File "C:\Users\cturn\Anaconda3\lib\site-packages\theano\theano\gof\cmodule.py", line 2088, in compile_args
default_compilation_result, default_execution_result = try_march_flag(GCC_compiler.march_flags)
File "C:\Users\cturn\Anaconda3\lib\site-packages\theano\theano\gof\cmodule.py", line 1856, in try_march_flag
flags=cflags, try_run=True)
File "C:\Users\cturn\Anaconda3\lib\site-packages\theano\theano\gof\cmodule.py", line 2188, in try_compile_tmp
comp_args)
File "C:\Users\cturn\Anaconda3\lib\site-packages\theano\theano\gof\cmodule.py", line 1789, in _try_compile_tmp
err += "\n" + str(e)
TypeError: can't concat bytes to str
Um, can't concat bytes to str? What does this mean?
The error you are experiencing appears to result from another sub-process using the same resources as the script you are attempting to run. Although it sounds trivial, I would recommend making sure that you have admin privileges (or at least privileges to the desired resources), and/or restarting your computer to kill the sub-process holding that module. You could also look in the Task Manager and kill any/all other processes using Python, but that might take longer.
(The offending process may be the one using the "resource" try_march_3v6ffkv9.exe.)
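A small sketch of the Task Manager suggestion in code (this assumes the third-party psutil package is installed; it is not part of Theano):

import os
import psutil

# List other running Python processes that might still hold the temporary
# try_march_*.exe file open, so they can be closed before retrying the import.
me = os.getpid()
for proc in psutil.process_iter(["pid", "name"]):
    name = proc.info["name"] or ""
    if "python" in name.lower() and proc.info["pid"] != me:
        print(proc.info["pid"], name)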

KeyError when assigning ''praw.Reddit'' to variable

I could successfully connect to reddit's servers with oauth2 some time ago, but when running my script just now, I get a KeyError followed by a NoSectionError. The code is below, followed by the exceptions (the code has been reduced to its essentials).
import praw
# Configuration
APP_UA = 'useragent'
...
...
...
r = praw.Reddit(APP_UA)
Error message:
Traceback (most recent call last):
File "D:\Directory\Python\lib\configparser.py", line 843, in items
d.update(self._sections[section])
KeyError: 'useragent'
A NoSectionError occurred when handling the above exception:
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "D:\Directory\Python\Projects\myprj for Reddit, globaloffensive\oddshotcrawler.py", line 19, in <module>
r = praw.Reddit(APP_UA)
File "D:\Directory\Python\lib\site-packages\praw\reddit.py", line 84, in __init__
**config_settings)
File "D:\Directory\Python\lib\site-packages\praw\config.py", line 47, in __init__
raw = dict(Config.CONFIG.items(site_name), **settings)
File "D:\Directory\Python\lib\configparser.py", line 846, in items
raise NoSectionError(section)
configparser.NoSectionError: No section: 'useragent'
[Finished in 0.2s]
Try giving it a user_agent kwarg:
r = praw.Reddit(user_agent=APP_UA)
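For context, in PRAW 4+ a positional string passed to praw.Reddit is treated as a praw.ini site name, which is why the traceback ends in NoSectionError: No section: 'useragent'. A keyword-only sketch (the client_id and client_secret values are placeholders) avoids that lookup:

import praw

APP_UA = 'useragent'

# Passing everything as keyword arguments means nothing is interpreted
# as a praw.ini site name, so no section lookup is performed.
r = praw.Reddit(client_id="CLIENT_ID",          # placeholder
                client_secret="CLIENT_SECRET",  # placeholder
                user_agent=APP_UA)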
