apache-spark - Error when starting pyspark on windows - python

I'm trying to experiment with MLlib on Windows with Python. It seems I need Spark, which in turn needs Hadoop. I've installed Anaconda2, which includes Python 2.7, numpy, etc.
I've been following this recipe, which has mostly gotten me where I need to go, but I'm stuck on this last error:
Python 2.7.13 |Anaconda 4.3.1 (64-bit)| (default, Dec 19 2016, 13:29:36) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
Anaconda is brought to you by Continuum Analytics.
Please check out: http://continuum.io/thanks and https://anaconda.org
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
Traceback (most recent call last):
File "C:\spark\bin\..\python\pyspark\shell.py", line 43, in <module>
spark = SparkSession.builder\
File "C:\spark\python\pyspark\sql\session.py", line 179, in getOrCreate
session._jsparkSession.sessionState().conf().setConfString(key, value)
File "C:\spark\python\lib\py4j-0.10.4-src.zip\py4j\java_gateway.py", line 1133, in __call__
File "C:\spark\python\pyspark\sql\utils.py", line 79, in deco
raise IllegalArgumentException(s.split(': ', 1)[1], stackTrace)
pyspark.sql.utils.IllegalArgumentException: u"Error while instantiating 'org.apache.spark.sql.hive.HiveSessionState':"
From this output it is clear that there is no error about winutils.exe not being found.
Also, the exception originates on the Java side of py4j, but the back-trace was lost when it was converted to the IllegalArgumentException.
All guidance appreciated!
Cheers
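To double-check the winutils.exe part independently of the log output, a small check like the one below can be run before starting pyspark. It is only a sketch: it assumes the usual layout of %HADOOP_HOME%\bin\winutils.exe, and check_hadoop_home is a made-up helper name, not part of Spark or Hadoop.

```python
import os


def check_hadoop_home(environ=None):
    """Return a list of problems found; an empty list means the check passed."""
    # Allow a custom mapping so the check can be tried without touching os.environ.
    environ = os.environ if environ is None else environ
    problems = []

    hadoop_home = environ.get("HADOOP_HOME")
    if not hadoop_home:
        problems.append("HADOOP_HOME is not set")
        return problems

    # Assumed location: <HADOOP_HOME>\bin\winutils.exe
    winutils = os.path.join(hadoop_home, "bin", "winutils.exe")
    if not os.path.isfile(winutils):
        problems.append("winutils.exe not found at %s" % winutils)
    return problems


if __name__ == "__main__":
    for line in check_hadoop_home() or ["no problems found"]:
        print(line)
```

An empty result only confirms that the file is where it is expected; it says nothing about the HiveSessionState failure itself.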

Related

Is psutil.net_connections() unavailable for OSX?

I was trying to run psutil.net_connections() in Python on my MacBook Pro (macOS 12.1), but was greeted with a "syscall failed" error. This is odd because most of the other psutil functions worked without issue; net_connections seems to be the only one that fails.
Python 3.8.9 (default, Oct 26 2021, 07:25:53)
[Clang 13.0.0 (clang-1300.0.29.30)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import psutil
>>> psutil.net_connections(kind='tcp')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Users/mastermi/Library/Python/3.8/lib/python/site-packages/psutil/__init__.py", line 2161, in net_connections
return _psplatform.net_connections(kind)
File "/Users/mastermi/Library/Python/3.8/lib/python/site-packages/psutil/_psosx.py", line 248, in net_connections
cons = Process(pid).connections(kind)
File "/Users/mastermi/Library/Python/3.8/lib/python/site-packages/psutil/_psosx.py", line 343, in wrapper
return fun(self, *args, **kwargs)
File "/Users/mastermi/Library/Python/3.8/lib/python/site-packages/psutil/_psosx.py", line 500, in connections
rawlist = cext.proc_connections(self.pid, families, types)
RuntimeError: proc_pidinfo(PROC_PIDLISTFDS) 2/2 syscall failed
If anyone knows how to fix this issue, it would be greatly appreciated.
P.S. My psutil version is 5.9.0, if that helps.
Short answer:
Yes, it is unavailable.
Long answer:
The most likely reason this doesn't work on macOS is that macOS has no proc filesystem. It is less a bug than a missing feature. As an alternative, you could try sysctl, which offers some of the same features as proc, but I'm not sure whether it fits your use case.
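If the failure comes from individual processes that the OS refuses to inspect rather than from the platform as a whole (an assumption this thread does not confirm), another thing to try is querying connections process by process and skipping the ones that fail. The traceback shows psutil doing the same per-process call internally. The sketch below is deliberately generic; collect_connections is a hypothetical helper, and it accepts any objects that expose pid and connections(kind), as psutil.Process does in 5.9.0.

```python
def collect_connections(processes, kind="tcp"):
    """Gather connections per process, skipping processes that cannot be inspected.

    `processes` is any iterable of objects with a `pid` attribute and a
    `connections(kind)` method, such as psutil.Process instances.
    Returns (connections, skipped), where skipped holds (pid, exception) pairs.
    """
    connections = []
    skipped = []
    for proc in processes:
        try:
            connections.extend(proc.connections(kind))
        except Exception as exc:  # e.g. an access error or the RuntimeError above
            skipped.append((proc.pid, exc))
    return connections, skipped
```

With psutil installed, collect_connections(psutil.process_iter(), kind="tcp") would return whatever could be read plus the list of processes that were skipped, which at least shows whether the error is tied to specific PIDs.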

ciscoconfparse unable to import python3.7 win10

Is there any new solution for this? I am unable to run from ciscoconfparse import CiscoConfParse.
I tried several versions of ciscoconfparse with Python 3, using PowerShell on Windows 10.
Python 3.7.0 (v3.7.0:1bf9cc5093, Jun 27 2018, 04:06:47) [MSC v.1914 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> from ciscoconfparse import CiscoConfParse
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\Python3\lib\site-packages\ciscoconfparse\__init__.py", line 1, in <module>
from .ciscoconfparse import *
File "C:\Python3\lib\site-packages\ciscoconfparse\ciscoconfparse.py", line 4018
^
SyntaxError: invalid syntax
I still have not found a version that lets me import the module.
I browsed through several related questions here but did not find an answer. The closest to my issue is:
ciscoconfparse in Python 3.4 module doesn't import correctly
but it has no answer.
Any new info?
Thanks
I have reproduced the bug where Win10 and Python 3.7.0 cause problems.
FWIW, if you completely uninstall Python 3.7.0 (including manually deleting your Windows installation directory tree), then install Python 3.7.1, you should not have this problem with Windows 10.
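To confirm which interpreter is actually being picked up after reinstalling, a quick check along these lines may help. It is a sketch only: is_affected is a made-up name, and treating exactly 3.7.0 as the problematic release is taken from this thread, not from any ciscoconfparse documentation.

```python
import sys

# The release reported as problematic in this thread (an assumption, not an official list).
AFFECTED_VERSION = (3, 7, 0)


def is_affected(version_info=None):
    """Return True when the interpreter version matches the reported bad release."""
    version_info = sys.version_info if version_info is None else version_info
    return tuple(version_info[:3]) == AFFECTED_VERSION


if __name__ == "__main__":
    print("running Python %d.%d.%d" % tuple(sys.version_info[:3]))
    print("affected release" if is_affected() else "not the affected release")
```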

ValueError with 'ciphers' when importing paramiko

I want to install paramiko on 32-bit Windows 7 with Python 3.3.
I can build it, but I get the following error on import:
Installed c:\python33\lib\site-packages\paramiko-1.8.0-py3.3.egg
Processing dependencies for paramiko==1.8.0
Searching for pycrypto==2.6
Best match: pycrypto 2.6
Adding pycrypto 2.6 to easy-install.pth file
Using c:\python33\lib\site-packages
Finished processing dependencies for paramiko==1.8.0
C:\Users\MC\Downloads\paramiko-paramiko-v1.8.0-9-g786920a>python
Python 3.3.0 (v3.3.0:bd8afb90ebf2, Sep 29 2012, 10:55:48) [MSC v.1600 32 bit (Intel)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import paramiko
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File ".\paramiko\__init__.py", line 62, in <module>
from .transport import SecurityOptions, Transport
File ".\paramiko\transport.py", line 68, in <module>
class SecurityOptions (object):
ValueError: 'ciphers' in __slots__ conflicts with class variable
>>>
Paramiko does not run on Python 3. Yet. I'm working on a development branch (https://github.com/dorianpula/paramiko/tree/python3-support) to add support for Python 3, and I'm working on fixing this particular issue.
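For anyone curious what triggers the error, it can be reproduced on Python 3 without paramiko. The classes below are stripped-down stand-ins for SecurityOptions, not paramiko's actual code, and the second class shows only one possible way around the clash, not necessarily what the branch does.

```python
def conflicting_definition_error():
    """Return the message Python 3 raises when a slot name is also a class variable."""
    try:
        class SecurityOptions(object):
            __slots__ = ["ciphers"]
            # A property is a class variable, and it has the same name as a slot.
            ciphers = property(lambda self: ())
    except ValueError as exc:
        return str(exc)
    return None


class SecurityOptionsFixed(object):
    # Only the private backing attribute is a slot; the public name is the property.
    __slots__ = ["_ciphers"]

    def __init__(self, ciphers):
        self._ciphers = tuple(ciphers)

    @property
    def ciphers(self):
        return self._ciphers


if __name__ == "__main__":
    print(conflicting_definition_error())
    print(SecurityOptionsFixed(["aes128-ctr"]).ciphers)
```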

unable to get parallel python (pp) running on python 2.7

I am working on a Windows box with 8 processors. I have to run a Python script that does mass data processing. Run as is, the script uses only one processor. I learned that I can use the Parallel Python (pp) library to harness multiple processors.
I installed the library on my machine and followed the instructions available at http://www.parallelpython.com/content/view/15/30/#QUICKSMP
However, the code to configure pp fails on my machine:
Python 2.7.1 (r271:86832, Nov 27 2010, 17:19:03) [MSC v.1500 64 bit (AMD64)] on win32
Type "copyright", "credits" or "license()" for more information.
>>> import pp
>>> job_server = pp.Server()
Traceback (most recent call last):
File "<pyshell#1>", line 1, in <module>
job_server = pp.Server()
File "C:\Python27\lib\site-packages\pp.py", line 343, in __init__
self.set_ncpus(ncpus)
File "C:\Python27\lib\site-packages\pp.py", line 503, in set_ncpus
range(ncpus - len(self.__workers))])
File "C:\Python27\lib\site-packages\pp.py", line 148, in __init__
self.start()
File "C:\Python27\lib\site-packages\pp.py", line 161, in start
self.pid = int(self.t.receive())
File "C:\Python27\lib\site-packages\pptransport.py", line 134, in receive
msg_len = struct.unpack("!Q", size_packed)[0]
error: unpack requires a string argument of length 8
>>>
Can you please tell me how to resolve this? I have installed pp version 1.6.0.
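As far as the traceback goes, the failing line decodes an 8-byte length header with the "!Q" format, so the error means fewer than 8 bytes were available. A standalone sketch of that step is below; read_length is a hypothetical helper, and the idea that the worker process sent back nothing at all is an assumption, not something the traceback proves.

```python
import struct

HEADER_FORMAT = "!Q"  # network byte order, unsigned 64-bit integer, as in the traceback
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)  # 8 bytes


def read_length(size_packed):
    """Return the decoded message length, or None when the header is incomplete."""
    if len(size_packed) != HEADER_SIZE:
        # struct.unpack would raise struct.error here, as seen in pptransport.receive.
        return None
    return struct.unpack(HEADER_FORMAT, size_packed)[0]


if __name__ == "__main__":
    print(read_length(struct.pack(HEADER_FORMAT, 1234)))  # a complete header
    print(read_length(b""))  # an empty reply, e.g. from a worker that never started
```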

How come Pylons is not recognized when I run 'import pylons' in Windows Vista command prompt?

When I try to import pylons in the virtual Python environment, I get the following error:
C:\env\Scripts>python
Python 2.7 (r27:82525, Jul 4 2010, 07:43:08) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pylons
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "C:\env\lib\site-packages\pylons-1.0-py2.7.egg\pylons\__init__.py", line 6, in <module>
from paste.registry import StackedObjectProxy
ImportError: No module named registry
As I understand this error, Python is telling me that it cannot find the module named registry. Perhaps this is a result of the error I got while installing Pylons, which is described in "Why do I get an error on the last line of installing Pylons 1.0 with easy_install and Python 2.7 in Windows Vista 64?"
It seems that many of the Pylons components were installed, but I guess registry was not, or maybe Pylons just cannot see it.
Any ideas on how to solve this?
You have to activate the virtual environment before you can import pylons.
C:\Users\Josh>env\scripts\activate
(env) C:\Users\Josh>python
ActivePython 2.6.2.2 (ActiveState Software Inc.) based on
Python 2.6.2 (r262:71600, Apr 21 2009, 15:05:37) [MSC v.1500 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> import pylons
>>>
vs. this
C:\Users\Josh\env\Scripts>python
ActivePython 2.6.2.2 (ActiveState Software Inc.) based on
Python 2.6.2 (r262:71600, Apr 21 2009, 15:05:37) [MSC v.1500 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> pylons
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'pylons' is not defined
>>>
I'm guessing that you have the pylons package installed both outside and inside your virtual environment. Python therefore lets you import pylons, but the paste package is not installed outside of your virtual environment, so you get an error.
Running the activate batch script (it should be inside your env\Scripts folder) should solve the problem.
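To see which situation you are in, it can help to ask the interpreter whether it is running inside a virtual environment and where it would load a module from. The helpers below are a sketch with made-up names (in_virtualenv, module_origin); they assume Python 3.4 or later for importlib.util.find_spec, so they would need adapting for the 2.x interpreters shown above.

```python
import importlib.util
import sys


def in_virtualenv():
    """Return True when the interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # releases make sys.base_prefix differ from sys.prefix.
    base = getattr(sys, "real_prefix", None) or getattr(sys, "base_prefix", sys.prefix)
    return base != sys.prefix


def module_origin(name):
    """Return the file a module would be loaded from, or None if it cannot be found."""
    try:
        spec = importlib.util.find_spec(name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec is not None else None


if __name__ == "__main__":
    print("inside a virtual environment:", in_virtualenv())
    for name in ("pylons", "paste"):
        print(name, "->", module_origin(name))
```

If in_virtualenv() is False, or pylons resolves to a path outside env\lib\site-packages, the environment has not been activated for that session.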
