I am using the pre-built 'spark-2.0.1-bin-hadoop2.7', and when I try to start pyspark I get the following message.
Any ideas what could be wrong? I tried using python3 and setting SPARK_LOCAL_IP to 127.0.0.1, but I get the same error.
~ -> cd /Applications/spark-2.0.1-bin-hadoop2.7/bin/
/Applications/spark-2.0.1-bin-hadoop2.7/bin -> pyspark
Python 2.7.12 (default, Oct 11 2016, 05:24:00)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.38)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel).
16/12/19 14:50:47 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
16/12/19 14:50:47 WARN Utils: Your hostname, XXXXXX.com resolves to a loopback address: 127.0.0.1; using XX.XX.XX.XXX instead (on interface en0)
16/12/19 14:50:47 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Traceback (most recent call last):
File "/Applications/spark-2.0.1-bin-hadoop2.7/python/pyspark/shell.py", line 43, in <module>
spark = SparkSession.builder\
File "/Applications/spark-2.0.1-bin-hadoop2.7/python/pyspark/sql/session.py", line 169, in getOrCreate
sc = SparkContext.getOrCreate(sparkConf)
File "/Applications/spark-2.0.1-bin-hadoop2.7/python/pyspark/context.py", line 294, in getOrCreate
SparkContext(conf=conf or SparkConf())
File "/Applications/spark-2.0.1-bin-hadoop2.7/python/pyspark/context.py", line 115, in __init__
conf, jsc, profiler_cls)
File "/Applications/spark-2.0.1-bin-hadoop2.7/python/pyspark/context.py", line 174, in _do_init
self._accumulatorServer = accumulators._start_update_server()
File "/Applications/spark-2.0.1-bin-hadoop2.7/python/pyspark/accumulators.py", line 259, in _start_update_server
server = AccumulatorServer(("localhost", 0), _UpdateRequestHandler)
File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/SocketServer.py", line 417, in __init__
self.server_bind()
File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/SocketServer.py", line 431, in server_bind
self.socket.bind(self.server_address)
File "/usr/local/Cellar/python/2.7.12_2/Frameworks/Python.framework/Versions/2.7/lib/python2.7/socket.py", line 228, in meth
return getattr(self._sock,name)(*args)
socket.gaierror: [Errno 8] nodename nor servname provided, or not known
Thanks
Found it. Somehow my host mapping was messing it up. Changing it to point to localhost worked:
/etc/hosts
#127.0.0.1 XXXXXX.com
127.0.0.1 localhost
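To confirm the mapping is fixed before starting pyspark again, a small check like the one below may help. It is only a sketch: it tries to bind a TCP socket to ("localhost", 0), which is the same address the traceback shows AccumulatorServer using. The helper name can_bind is made up for this example.

```python
import socket

def can_bind(host):
    """Return True if a TCP socket can bind to (host, 0), else False."""
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.bind((host, 0))  # port 0 lets the OS pick any free port
        return True
    except OSError:  # socket.gaierror is a subclass of OSError in Python 3
        return False

# "localhost" needs a working hosts mapping; "127.0.0.1" does not.
for name in ("localhost", "127.0.0.1"):
    print(name, "->", "ok" if can_bind(name) else "cannot bind")
```

If 127.0.0.1 works but localhost does not, the hosts file is the likely culprit.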
When you cannot clean up /etc/hosts (for example, because a VPN client keeps tampering with it), here is a workaround:
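from pyspark.sql import SparkSession

def patch_pyspark_accumulators():
    # Re-define _start_update_server with "localhost" replaced by "127.0.0.1",
    # so the accumulator server binds without a hostname lookup.
    from inspect import getsource
    import pyspark.accumulators as pa
    exec(getsource(pa._start_update_server).replace("localhost", "127.0.0.1"), pa.__dict__)

# Apply the patch before the SparkContext is created.
patch_pyspark_accumulators()
spark = SparkSession.builder.getOrCreate()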
Related
I have been trying to get PySpark up and running for a college project for about two weeks now, following every tutorial I could find online, and I keep getting the error below.
Error:
#Mac-mini ~ % pyspark
Python 3.9.7 (default, Sep 3 2021, 12:37:55)
[Clang 12.0.5 (clang-1205.0.22.9)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
Exception in thread "main" java.lang.ExceptionInInitializerError
at org.apache.spark.unsafe.array.ByteArrayMethods.<clinit>(ByteArrayMethods.java:54)
at org.apache.spark.internal.config.package$.<init>(package.scala:1095)
at org.apache.spark.internal.config.package$.<clinit>(package.scala)
at org.apache.spark.deploy.SparkSubmitArguments.$anonfun$loadEnvironmentArguments$3(SparkSubmitArguments.scala:157)
at scala.Option.orElse(Option.scala:447)
at org.apache.spark.deploy.SparkSubmitArguments.loadEnvironmentArguments(SparkSubmitArguments.scala:157)
at org.apache.spark.deploy.SparkSubmitArguments.<init>(SparkSubmitArguments.scala:115)
at org.apache.spark.deploy.SparkSubmit$$anon$2$$anon$3.<init>(SparkSubmit.scala:1022)
at org.apache.spark.deploy.SparkSubmit$$anon$2.parseArguments(SparkSubmit.scala:1022)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:85)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1039)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1048)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.lang.reflect.InaccessibleObjectException: Unable to make private java.nio.DirectByteBuffer(long,int) accessible: module java.base does not "opens java.nio" to unnamed module @24912924
at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:354)
at java.base/java.lang.reflect.AccessibleObject.checkCanSetAccessible(AccessibleObject.java:297)
at java.base/java.lang.reflect.Constructor.checkCanSetAccessible(Constructor.java:188)
at java.base/java.lang.reflect.Constructor.setAccessible(Constructor.java:181)
at org.apache.spark.unsafe.Platform.<clinit>(Platform.java:56)
... 13 more
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/pyspark/python/pyspark/shell.py", line 35, in <module>
SparkContext._ensure_initialized() # type: ignore
File "/usr/local/lib/python3.9/site-packages/pyspark/context.py", line 331, in _ensure_initialized
SparkContext._gateway = gateway or launch_gateway(conf)
File "/usr/local/lib/python3.9/site-packages/pyspark/java_gateway.py", line 108, in launch_gateway
raise Exception("Java gateway process exited before sending its port number")
Exception: Java gateway process exited before sending its port number
Change JAVA_HOME to point to a different Java version.
I had exactly the same error message with JAVA_HOME set to the openjdk-16.0.1 path. After I changed it to the adoptopenjdk-12.jdk path, the PySpark exception "Java gateway process exited before sending its port number" went away.
Thanks @werner for the suggestion.
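The InaccessibleObjectException in the trace fits this: as far as I know, JDK 16 and later block that kind of reflective access to java.nio internals by default, while JDK 12 still permits it. To see which Java a given JAVA_HOME actually provides, something like the sketch below can parse the output of java -version. The function names are made up for this example, and the parsing assumes the usual `version "..."` banner.

```python
import re
import subprocess

def parse_java_major(version_output):
    """Return the major Java version found in `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    first = int(match.group(1))
    # Legacy numbering: "1.8.0_292" means Java 8.
    if first == 1 and match.group(2) is not None:
        return int(match.group(2))
    return first

def installed_java_major(java_cmd="java"):
    """Run `java -version` (it prints to stderr) and return the major version, or None."""
    try:
        proc = subprocess.run([java_cmd, "-version"], capture_output=True, text=True)
    except FileNotFoundError:
        return None  # no java executable found
    return parse_java_major(proc.stderr)
```

Calling installed_java_major() in the same shell that launches pyspark shows whether the JAVA_HOME change took effect.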
One of my playbooks ran fine on the following setup:
ansible 2.9.11
config file = None
configured module search path = ['/home/alice/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/local/lib/python3.6/dist-packages/ansible
executable location = /usr/local/bin/ansible
python version = 3.6.5 (default, Apr 1 2018, 05:46:30) [GCC 7.3.0]
Things broke, however, when I moved the same playbook to the following setup
ansible 2.10.2
config file = /home/bob/ansible/ansible.cfg
configured module search path = ['/home/bob/.ansible/plugins/modules', '/usr/share/ansible/plugins/modules']
ansible python module location = /usr/local/lib/python3.8/dist-packages/ansible
executable location = /usr/local/bin/ansible
python version = 3.8.5 (default, Jul 28 2020, 12:59:40) [GCC 9.3.0]
I got the following error and am trying to figure out whether the cause is the new Python version, the Ansible version, or maybe the Ansible Galaxy module version (upgraded from Fortinet.Fortimanager 1.0.5 to 2.0.0; when I force-install 1.0.5 I get a different error, which makes me think this isn't the main issue).
TASK [Add model device] ******************************************************************
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: ansible_collections.fortinet.fortimanager.plugins.module_utils.common.FMGBaseException: An attempt was made at communicating with a FMG with no valid session and an unexpected error was discovered.
fatal: [fmg01]: FAILED! => {"changed": false, "module_stderr": "Traceback (most recent call last):\n File \"/tmp/ansible_fmgr_dvm_cmd_add_device_payload_ekiptyph/ansible_fmgr_dvm_cmd_add_device_payload.zip/ansible_collections/fortinet/fortimanager/plugins/modules/fmgr_dvm_cmd_add_device.py\", line 351, in main\n File \"/tmp/ansible_fmgr_dvm_cmd_add_device_payload_ekiptyph/ansible_fmgr_dvm_cmd_add_device_payload.zip/ansible/module_utils/connection.py\", line 195, in __rpc__\nansible.module_utils.connection.ConnectionError: An attempt was made at communicating with a FMG with no valid session and an unexpected error was discovered.\n\nDuring handling of the above exception, another exception occurred:\n\nTraceback (most recent call last):\n File \"/home/bob/.ansible/tmp/ansible-local-12942ssko667z/ansible-tmp-1603819794.3367019-13026-201727037698625/AnsiballZ_fmgr_dvm_cmd_add_device.py\", line 102, in <module>\n _ansiballz_main()\n File \"/home/bob/.ansible/tmp/ansible-local-12942ssko667z/ansible-tmp-1603819794.3367019-13026-201727037698625/AnsiballZ_fmgr_dvm_cmd_add_device.py\", line 94, in _ansiballz_main\n invoke_module(zipped_mod, temp_path, ANSIBALLZ_PARAMS)\n File \"/home/bob/.ansible/tmp/ansible-local-12942ssko667z/ansible-tmp-1603819794.3367019-13026-201727037698625/AnsiballZ_fmgr_dvm_cmd_add_device.py\", line 40, in invoke_module\n runpy.run_module(mod_name='ansible_collections.fortinet.fortimanager.plugins.modules.fmgr_dvm_cmd_add_device', init_globals=None, run_name='__main__', alter_sys=True)\n File \"/usr/lib/python3.8/runpy.py\", line 207, in run_module\n return _run_module_code(code, init_globals, run_name, mod_spec)\n File \"/usr/lib/python3.8/runpy.py\", line 97, in _run_module_code\n _run_code(code, mod_globals, init_globals,\n File \"/usr/lib/python3.8/runpy.py\", line 87, in _run_code\n exec(code, run_globals)\n File 
\"/tmp/ansible_fmgr_dvm_cmd_add_device_payload_ekiptyph/ansible_fmgr_dvm_cmd_add_device_payload.zip/ansible_collections/fortinet/fortimanager/plugins/modules/fmgr_dvm_cmd_add_device.py\", line 362, in <module>\n File \"/tmp/ansible_fmgr_dvm_cmd_add_device_payload_ekiptyph/ansible_fmgr_dvm_cmd_add_device_payload.zip/ansible_collections/fortinet/fortimanager/plugins/modules/fmgr_dvm_cmd_add_device.py\", line 356, in main\nansible_collections.fortinet.fortimanager.plugins.module_utils.common.FMGBaseException: An attempt was made at communicating with a FMG with no valid session and an unexpected error was discovered.\n", "module_stdout": "", "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error", "rc": 1}
Any suggestions on how to isolate this, given the error output?
Here is the playbook in question (super simple):
---
- name: Add model device to FMG and install Policy Package
  hosts: fmg01
  # gather_facts: no
  connection: httpapi
  collections:
    - fortinet.fortimanager
  tasks:
    - name: Add model device
      fmgr_dvm_cmd_add_device:
        loose_validation: true
        method: exec
        params:
          - data:
              adom: root
              device:
                # device action: add_model
                mgmt_mode: 'fmg'
                #os_ver: 6
                #mr: 4
                sn: FGVM01TMxxxxxxxx
                adm_pass: 'password'
                adm_usr: 'admin'
                ip: '192.168.0.100'
Whoops, the problem seems to be that the following setting was missing from the FortiManager configuration:
config system admin user
edit admin
set rpc-permit read-write
end
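On the question of isolating the failure from that wall of output: the fatal line is a JSON result, and the last line of its module_stderr traceback is usually the exception that matters. A rough sketch for pulling it out is below. The function name is made up, and it assumes you paste only the {...} part of the FAILED! line.

```python
import json

def last_exception_line(result_json):
    """Return the final non-empty line of module_stderr from an Ansible result, or None."""
    result = json.loads(result_json)
    stderr = result.get("module_stderr", "")
    lines = [line.strip() for line in stderr.splitlines() if line.strip()]
    return lines[-1] if lines else None
```

For the output in the question this would return the FMGBaseException line about "no valid session", which points at the FortiManager side rather than at Python or Ansible versions.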
Every time I run pyspark I get these errors, and if I ignore them and just type sc, it gives NameError: name 'sc' is not defined. Any help?
pyspark
Python 2.7.12 (default, Nov 19 2016, 06:48:10)
[GCC 5.4.0 20160609] on linux2
Type "help", "copyright", "credits" or "license" for more information.
17/08/07 13:57:59 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Traceback (most recent call last):
File "/usr/local/spark/python/pyspark/shell.py", line 45, in <module>
spark = SparkSession.builder\
File "/usr/local/spark/python/pyspark/sql/session.py", line 169, in getOrCreate
sc = SparkContext.getOrCreate(sparkConf)
File "/usr/local/spark/python/pyspark/context.py", line 334, in getOrCreate
SparkContext(conf=conf or SparkConf())
File "/usr/local/spark/python/pyspark/context.py", line 118, in __init__
conf, jsc, profiler_cls)
File "/usr/local/spark/python/pyspark/context.py", line 186, in _do_init
self._accumulatorServer = accumulators._start_update_server()
File "/usr/local/spark/python/pyspark/accumulators.py", line 259, in _start_update_server
server = AccumulatorServer(("localhost", 0), _UpdateRequestHandler)
File "/usr/lib/python2.7/SocketServer.py", line 417, in __init__
self.server_bind()
File "/usr/lib/python2.7/SocketServer.py", line 431, in server_bind
self.socket.bind(self.server_address)
File "/usr/lib/python2.7/socket.py", line 228, in meth
return getattr(self._sock,name)(*args)
socket.gaierror: [Errno -2] Name or service not known
After a week of searching I found the solution: adding localhost to the file /etc/hosts. After that everything worked.
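To check whether the localhost entry is actually present, the hosts file can be parsed with a few lines of Python. This is just a sketch; hosts_entries is a made-up helper that assumes the usual "address name [name ...]" layout with # comments.

```python
def hosts_entries(hosts_text):
    """Map each hostname in /etc/hosts-style text to the first address listed for it."""
    mapping = {}
    for line in hosts_text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            mapping.setdefault(name, address)
    return mapping

def has_localhost(hosts_text):
    """Return True if the text maps the name localhost to some address."""
    return "localhost" in hosts_entries(hosts_text)
```

Usage would be something like has_localhost(open("/etc/hosts").read()); if it returns False, add a line such as 127.0.0.1 localhost.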
I'm not able to connect to an FTP server; I get the error below:
[vmware@localhost ~]$ python try_ftp.py
Traceback (most recent call last):
File "try_ftp.py", line 5, in <module>
f = ftplib.FTP('ftp.python.org')
File "/usr/lib/python2.6/ftplib.py", line 116, in __init__
self.connect(host)
File "/usr/lib/python2.6/ftplib.py", line 131, in connect
self.sock = socket.create_connection((self.host, self.port), self.timeout)
File "/usr/lib/python2.6/socket.py", line 567, in create_connection
raise error, msg
socket.error: [Errno 101] Network is unreachable
My code is very simple:
import ftplib
f = ftplib.FTP('ftp.python.org')
f.login('anonymous','sausaxen@xyz.com')
f.dir()
f.retrlines('RETR motd')
f.quit()
I checked my proxy settings, but they are set to "System proxy settings".
Please suggest what should I do.
Thanks,
Sam
[torxed@archie ~]$ telnet ftp.python.org 21
Trying 82.94.164.162...
Connection failed: Connection refused
Trying 2001:888:2000:d::a2...
telnet: Unable to connect to remote host: Network is unreachable
The hostname itself is not the problem (you mentioned that ping works); the default FTP port 21 is. Either nothing is accepting connections on it, or they are not running a standard FTP server on that host at all and serve the files over HTTP instead: https://www.python.org/ftp/python/
Try against ftp.acc.umu.se instead.
[torxed@archie ~]$ python
Python 3.3.5 (default, Mar 10 2014, 03:21:31)
[GCC 4.8.2 20140206 (prerelease)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import ftplib
>>> f = ftplib.FTP('ftp.acc.umu.se')
>>>
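The telnet test above can also be done from Python before involving ftplib at all. A minimal sketch, with port_open being a made-up helper name:

```python
import socket

def port_open(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # covers refused, unreachable, timed out, and name lookup failures
        return False
```

For example, port_open('ftp.acc.umu.se', 21) versus port_open('ftp.python.org', 21) should show the same difference as the telnet session.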
The address ftp.python.org seems bad
EDIT:
f = ftplib.FTP('ftp.python.org') gives the error message, but ping works.
Try pinging the "ftp.python.org" address.
If you need to go through a proxy, check that you have ftp_proxy set as an environment variable. Normally, I set the proxy explicitly.
Also, as an alternative, try using httplib or requests.
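To see which FTP proxy Python would pick up from the environment, urllib.request.getproxies() can be queried as in the sketch below (written for Python 3; the helper name is made up). Note that, as far as I know, ftplib itself does not read ftp_proxy; it is the urllib-based tools that honor it.

```python
import urllib.request

def ftp_proxy_from_environment():
    """Return the FTP proxy URL that urllib would use, or None if none is configured."""
    # getproxies() reads variables such as ftp_proxy and returns a scheme -> URL dict.
    return urllib.request.getproxies().get("ftp")

print("FTP proxy:", ftp_proxy_from_environment())
```

If this prints None although you are behind a proxy, the variable is not set in the shell that runs the script.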
I have installed python-nmap several times following the instructions on their site, but it just doesn't work. Every time I try to test it with:
>>> import nmap
>>> nm = nmap.PortScanner()
I get the following error:
Python 2.7.4 (default, Apr 19 2013, 18:28:01)
[GCC 4.7.3] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import nmap
>>> nm = nmap.PortScanner()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "nmap/nmap.py", line 118, in __init__
p = subprocess.Popen(['nmap', '-V'], bufsize=10000, stdout=subprocess.PIPE)
File "/usr/lib/python2.7/subprocess.py", line 711, in __init__
errread, errwrite)
File "/usr/lib/python2.7/subprocess.py", line 1308, in _execute_child
raise child_exception
OSError: [Errno 2] No such file or directory
All help is greatly appreciated. Thanks in advance.
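This error goes away if you install a later version of nmap for Python 2.*.
Follow the link here: http://nmap.org/book/inst-macosx.html
By the looks of it, you need to install nmap itself: the traceback shows subprocess.Popen(['nmap', '-V']) failing with "No such file or directory", which means the nmap executable could not be found.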
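A quick way to confirm that is to ask Python whether it can see an nmap executable on PATH. A small sketch using shutil.which (Python 3.3+); find_tool is a made-up helper name:

```python
import shutil

def find_tool(name, path=None):
    """Return the full path of an executable found on PATH (or on `path`), else None."""
    return shutil.which(name, path=path)

nmap_path = find_tool("nmap")
if nmap_path is None:
    print("nmap executable not found on PATH; install nmap itself, not only python-nmap")
else:
    print("nmap found at", nmap_path)
```

python-nmap is only a wrapper, so nmap.PortScanner() can work only once this finds the binary.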