Trouble setting up/running Apache Spark with python (in windows 10) - python

I'm super new to spark, so my issues might have a "no duh" answer that I can't quite grasp.
Firstly, I downloaded spark 1.5.2 and extracted it. In the python folder, I tried to run pyspark, but it said something along the lines that it needs a __main__.py, so I copied __init__.py to __main__.py and started getting weird syntax errors. I realized I was using python 2.9, so I switched to 2.7 and got a different error:
Traceback (most recent call last):
File "C:\Python27\lib\runpy.py", line 162, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "C:\Python27\lib\runpy.py", line 72, in _run_code
exec code in run_globals
File "C:\spark-1.5.2\python\pyspark\__main__.py", line 40, in <module>
from pyspark.conf import SparkConf
ImportError: No module named pyspark.conf
I found this question that looked like the same error here: What to set `SPARK_HOME` to?
So I set up my environment variables as they did (except with C:/spark-1.5.2 instead of C:/spark), but that didn't fix the error for me. Then I realized they were using spark 1.4 from github. So I made a new folder and tried it as they did. I got stuck with the command:
build/mvn -DskipTests clean package
showing the error:
Java HotSpot(TM) Client VM warning: ignoring option MaxPermSize=512M; support was removed in 8.0
Error occurred during initialization of VM
Could not reserve enough space for 2097152KB object heap
I tried adding "-XX:MaxHeapSize=3g" but no change. Noting the comment "support was removed in 8.0", I downloaded java 7, but that didn't change anything either.
Thanks in advance

Related

How to debug Python 2.7 code with VS Code?

For work I have to work with Python 2.7, I work with Squish which is an equivalent of Selenium for those who know it, and this software is only configured for Python 2.7 in my environment.
So I'm trying to use VS Code as an IDE, I managed to set my interpreter correctly, my code is working correctly without errors, but when I use the "debug my python file" function with VS Code, I get this error:
cd /myPath ; /usr/bin/env /usr/bin/python2 /myHome/.vscode/extensions/ms-python.python-2022.6.1/pythonFiles/lib/python/debugpy/launcher 44547 -- myPath/test.py
Traceback (most recent call last):
File "/usr/lib/python2.7/runpy.py", line 174, in _run_module_as_main
"__main__", fname, loader, pkg_name)
File "/usr/lib/python2.7/runpy.py", line 72, in _run_code
exec code in run_globals
File "/myHome/.vscode/extensions/ms-python.python-2022.6.1/pythonFiles/lib/python/debugpy/__main__.py", line 43, in <module>
from debugpy.server import cli
File "/myHome/.vscode/extensions/ms-python.python-2022.6.1/pythonFiles/lib/python/debugpy/../debugpy/server/__init__.py", line 9, in <module>
import debugpy._vendored.force_pydevd # noqa
File "/myHome/.vscode/extensions/ms-python.python-2022.6.1/pythonFiles/lib/python/debugpy/../debugpy/_vendored/force_pydevd.py", line 37, in <module>
pydevd_constants = import_module('_pydevd_bundle.pydevd_constants')
File "/usr/lib/python2.7/importlib/__init__.py", line 37, in import_module
__import__(name)
File "/myHome/.vscode/extensions/ms-python.python-2022.6.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydevd_bundle/pydevd_constants.py", line 362, in <module>
from _pydev_bundle._pydev_saved_modules import thread, threading
File "/myHome/.vscode/extensions/ms-python.python-2022.6.1/pythonFiles/lib/python/debugpy/_vendored/pydevd/_pydev_bundle/_pydev_saved_modules.py", line 94, in <module>
import _thread as thread; verify_shadowed.check(thread, ['start_new_thread', 'start_new', 'allocate_lock'])
ImportError: No module named _thread
For the purpose of this question I changed my code to just:
test.py :
print()
I have not configured my VS Code environment beyond changing the interpreter, as I don't know what else I should do. I searched through this page: https://code.visualstudio.com/docs/python/debugging, but could not find an answer.
As rioV8 said in a comment, you have to install a previous version of the Python extension, because support for Python 2 has since been dropped.
To install a previous version you have to:
Open the Extensions pane from the bar on the left and find Python
Click on the gear icon and select "Install another version"
Choose v2021.9.1246542782.
After it's finished, restart VS Code.
If you want to understand why you need version v2021.9.1246542782:
The component that provides support to the language is Jedi, and the release notes of version 0.17.2 (2020-07-17) say that
This will be the last release that supports Python 2 and Python 3.5.
0.18.0 will be Python 3.6+.
And according to the release notes of the Python extension, the latest version that was based on Jedi 0.17 was 2021.9.3 (20 September 2021), because the following one (2021.10.0, 7 October 2021) says
Phase out Jedi 0.17
Is that all? No, because the selection that VS Code offers when selecting previous versions uses a different numbering scheme. Anyway, the latest one of the v2021.9.* branch is v2021.9.1246542782, which I suppose corresponds to 2021.9.3, so it's the one you need.

"jupyter-kernelspec" not found while installing iqsharp however it exists on PATH

While installing QDK for use with python as described in this guide, on executing dotnet iqsharp install I get the following exception
Traceback (most recent call last):
File "c:\users\hp\appdata\local\programs\python\python36\lib\runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "c:\users\hp\appdata\local\programs\python\python36\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\hp\AppData\Local\Programs\Python\Python36\Scripts\jupyter.exe\__main__.py", line 7, in <module>
File "c:\users\hp\appdata\local\programs\python\python36\lib\site-packages\jupyter_core\command.py", line 247, in main
command = _jupyter_abspath(subcommand)
File "c:\users\hp\appdata\local\programs\python\python36\lib\site-packages\jupyter_core\command.py", line 134, in _jupyter_abspath
'Jupyter command `{}` not found.'.format(jupyter_subcommand)
Exception: Jupyter command `jupyter-kernelspec` not found.
However, when I run the jupyter-kernelspec command in cmd, it is found on PATH. Why is Python unable to locate a command that cmd can find?
To address your specific question, you can see whether Python can locate jupyter-kernelspec (and if so, where) by running something like:
python -c "from shutil import which; print(which('jupyter-kernelspec'))"
But as to the underlying cause of the error, it seems likely that your Jupyter installation is incomplete and/or your environment is somehow misconfigured. You may want to try creating a new Python environment (perhaps using Anaconda, if you're new to Python development) and then following the QDK installation instructions again from inside that new environment (e.g., from an Anaconda command prompt with the new environment active).
Edit: From comments below, it sounds like the problem is that you have a trailing semicolon in your PATHEXT environment variable. This confuses shutil.which(), and this in turn prevents Jupyter from finding the necessary executable. (I can reproduce this problem locally by adding a trailing semicolon to PATHEXT.)
The fix should be simply to remove the trailing semicolon from PATHEXT.
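To check for this condition before editing the variable, something like the following may help. It is only a sketch: the helper names has_empty_entry and clean_pathext are made up for illustration, and it assumes entries are separated by semicolons, as they are on Windows.

```python
import os


def has_empty_entry(pathext):
    """Return True if a PATHEXT-style string contains an empty entry,
    for example because of a trailing or doubled semicolon."""
    if not pathext:
        return False
    return any(not entry.strip() for entry in pathext.split(";"))


def clean_pathext(pathext):
    """Return the same string with the empty entries dropped."""
    return ";".join(entry for entry in pathext.split(";") if entry.strip())


if __name__ == "__main__":
    # PATHEXT normally exists only on Windows, so it may be unset elsewhere.
    current = os.environ.get("PATHEXT")
    if current is None:
        print("PATHEXT is not set in this environment")
    elif has_empty_entry(current):
        print("PATHEXT has an empty entry; suggested value:")
        print(clean_pathext(current))
    else:
        print("PATHEXT looks fine:", current)
```

The script only reports a suggested value. The variable itself still has to be changed in the Windows environment variable settings.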

Error getting Scapy to work on Windows: "'module' object has no attribute 'ex_name'"

I'm trying to run a Python script that involves ARP sniffing and is apparently dependent on the Scapy library being present. I have absolutely no idea what I'm doing but I'm reasonably good at Googling, following directions, and copying/pasting. I have it up and running on my Mac, but I'm stuck on what I hope is the last hurdle in getting Scapy working on my Windows computer (which is ultimately the one that needs to be running this script).
I followed all of the instructions at http://www.secdev.org/projects/scapy/doc/installation.html#windows, except that I chose Python 2.7 and used the newer 2.7-compatible versions of everything listed there. I used “python setup.py install” (successfully, as best I could tell) on all installs except Pypcap and Libdnet, which I installed via the Exe as an Administrator as instructed.
Unfortunately, when I type "scapy" into the command prompt to test if it works, I get the following information & error message:
C:\scapy-2.3.1>scapy
INFO: Can't import python gnuplot wrapper . Won't be able to plot.
INFO: Can't import PyX. Won't be able to use psdump() or pdfdump().
Traceback (most recent call last):
File "C:\Python27\Scripts\\scapy", line 25, in <module>
interact()
File "C:\Python27\lib\site-packages\scapy\main.py", line 278, in interact
scapy_builtins = __import__("all",globals(),locals(),".").__dict__
File "C:\Python27\lib\site-packages\scapy\all.py", line 16, in <module>
from arch import *
File "C:\Python27\lib\site-packages\scapy\arch\__init__.py", line 79, in <module>
from windows import *
File "C:\Python27\lib\site-packages\scapy\arch\windows\__init__.py", line 214, in <module>
ifaces.load_from_dnet()
File "C:\Python27\lib\site-packages\scapy\arch\windows\__init__.py", line 173, in load_from_dnet
self.data[i["name"]] = NetworkInterface(i)
File "C:\Python27\lib\site-packages\scapy\arch\windows\__init__.py", line 93, in __init__
self.update(dnetdict)
File "C:\Python27\lib\site-packages\scapy\arch\windows\__init__.py", line 107, in update
self._update_pcapdata()
File "C:\Python27\lib\site-packages\scapy\arch\windows\__init__.py", line 118, in _update_pcapdata
win_name = pcapdnet.pcap.ex_name(guess)
AttributeError: 'module' object has no attribute 'ex_name'
Can anyone help me out? If you need more information please let me know.
I am running Windows 10.
Thanks in advance,
- Ethan
I had the same problem.
To solve it, I downloaded dnet-1.12.win32-py2.7.exe
and pcap-1.1.win32-py2.7.exe.
You might want to try with Scapy's current development version (from the Github repository). Support for Windows has been updated recently and should work without the need of Libdnet.
If that's not the case, you should probably open an issue.
Try it with scapy3k. Install python3 (e.g. I use Anaconda 3.5), and WinPcap driver. You do not need dnet or pypcap. Install using pip install scapy-python3 or from http://github.com/phaethon/scapy

python - pyinstaller "RuntimeWarning: Parent module 'PyInstaller.hooks.hook-PIL' not found while handling absolute import" and "tcl" related errors

I get a warning message while trying to create an executable file using pyinstaller. This warning appeared after installing Pillow. Previously I never got any warnings and the build went through.
The warning I get from pyinstaller is:
7314 INFO: Analyzing main.py
/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/PyInstaller-2.1.1dev_-py2.7.egg/PyInstaller/hooks/hook-PIL.Image.py:14: RuntimeWarning: Parent module 'PyInstaller.hooks.hook-PIL' not found while handling absolute import
from PyInstaller.hooks.shared_PIL_Image import *
Also, when I tried to run the console version of the executable that pyinstaller created inside the dist folder (dist/main/main), this was displayed:
Traceback (most recent call last):
File "<string>", line 26, in <module>
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/PyInstaller-2.1.1dev_-py2.7.egg/PyInstaller/loader/pyi_importers.py", line 276, in load_module
exec(bytecode, module.__dict__)
File "/Users/..../build/main/out00-PYZ.pyz/PIL.PngImagePlugin", line 40, in <module>
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/PyInstaller-2.1.1dev_-py2.7.egg/PyInstaller/loader/pyi_importers.py", line 276, in load_module
exec(bytecode, module.__dict__)
File "/Users/..../build/main/out00-PYZ.pyz/PIL.Image", line 53, in <module>
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/PyInstaller-2.1.1dev_-py2.7.egg/PyInstaller/loader/pyi_importers.py", line 276, in load_module
exec(bytecode, module.__dict__)
File "/Users/..../build/main/out00-PYZ.pyz/FixTk", line 74, in <module>
OSError: [Errno 20] Not a directory: '/Users/.../dist/main/tcl'
logout
[Process completed]
So I tried uninstalling Pillow, installing the dev versions of tk and tcl, and then installing Pillow again. Even that didn't help.
I also tried reinstalling pyinstaller; that didn't help either.
Update 1:
It seems the hook-PIL.py file was missing from the PyInstaller/hooks directory, and it was missing on all platforms (Mac, Windows and Linux). The warning/error message that I get on Windows is the same one I got on Mac and on Linux.
Later I found a link which said the file is only needed to keep Python's import machinery happy, so I created it as described. After that I no longer get that warning on any platform, but on Mac I still get the PngImagePlugin, Image and FixTk errors.
Solution for tcl:
I found what was going wrong. Every problem I faced on OS X came from the OS itself (the macport, to be exact). Python comes with Mac OS by default, and that version may be fine for learning basic Python, but it is not suitable for development.
Installing brew's Python helped. I followed this SO link. After doing this I was still getting errors, so I had to change the order of the paths in /etc/paths. Basically, rearranging them should work, but I still wasn't getting it right.
Then I had to change .bash_profile, which worked for most users, but I was still getting the Mac's version of python and pip, not brew's.
Finally, I had to restart the machine a couple of times and repeat the /etc/paths and .bash_profile steps before the system accepted brew's version of python and pip everywhere.
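To see which python and pip the shell actually resolves after each change to /etc/paths or .bash_profile, a quick check along these lines can be used. It is a sketch rather than part of the original fix: report_python_tools is a made-up helper name, and shutil.which requires Python 3.3 or later.

```python
import sys
from shutil import which  # available from Python 3.3 onwards


def report_python_tools(names=("python", "pip")):
    """Map each command name to the path PATH resolves it to (None if absent),
    alongside the interpreter that is running this script."""
    report = {"running interpreter": sys.executable}
    for name in names:
        report[name] = which(name)
    return report


if __name__ == "__main__":
    for label, path in report_python_tools().items():
        print(label, "->", path)
```

If the printed paths still point at the system Python instead of brew's, the PATH ordering has not taken effect in that shell yet.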
Solution for PIL:
Just adding an empty file called hook-PIL.py serves the purpose. I found a link that had the contents of pyinstaller's hook files.
Where to create it:
For Mac: /Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/site-packages/PyInstaller-2.1.1dev_-py2.7.egg/PyInstaller/hooks/ Actually, on Mac this step shouldn't be required. Once Python is installed through brew and the path is changed, everything you install later, whether through pip install or from source packages, tends to go to a different path, and everything is taken care of.
For Windows: C:\Python27\lib\site-packages\PyInstaller-2.1.1.dev0-py2.7.egg\PyInstaller\hooks
Please check that this is a valid path on your machine before creating the file. I'm not sure whether just adding an empty file is the right way, but it worked for me.
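The file can also be created from a script instead of by hand. The following is a minimal sketch of that step: ensure_empty_hook is a hypothetical helper, and the hooks directory must be passed in by the caller, since its location differs per machine.

```python
import os


def ensure_empty_hook(hooks_dir, hook_name="hook-PIL.py"):
    """Create an empty hook file in hooks_dir if it is not already there.

    Returns True if the file was created, False if it already existed.
    Raises ValueError if hooks_dir is not an existing directory.
    """
    if not os.path.isdir(hooks_dir):
        raise ValueError("not a directory: %r" % (hooks_dir,))
    hook_path = os.path.join(hooks_dir, hook_name)
    if os.path.exists(hook_path):
        return False  # leave an existing hook untouched
    with open(hook_path, "w"):
        pass  # an empty file is all this workaround needs
    return True
```

On the Windows machine above, for example, it would be called with the C:\Python27\lib\site-packages\PyInstaller-2.1.1.dev0-py2.7.egg\PyInstaller\hooks path. An existing hook-PIL.py is never overwritten.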

JPype won't compile properly

So I am having trouble compiling a very simple python script using JPype.
My code goes like:
from jpype import *
startJVM(getDefaultJVMPath(), "-ea")
java.lang.System.out.println("hello world")
shutdownJVM()
and when I run it I receive an error saying:
Traceback (most recent call last):
File "test.py", line 2, in <module>
startJVM(getDefaultJVMPath(), "-ea")
File "/usr/lib/pymodules/python2.7/jpype/_core.py", line 44, in startJVM
_jpype.startup(jvm, tuple(args), True)
RuntimeError: Unable to load DLL [/usr/java/jre1.5.0_05/lib/i386/client/libjvm.so], error = /usr/java/jre1.5.0_05/lib/i386/client/libjvm.so: cannot open shared object file: No such file or directory at src/native/common/include/jp_platform_linux.h:45
I'm stuck and I really need help. Thanks!
I had the same problem
RuntimeError: Unable to load DLL [/usr/java/jre1.5.0_05/lib/i386/client/libjvm.so], error = /usr/java/jre1.5.0_05/lib/i386/client/libjvm.so: cannot open shared object file: No such file or directory at src/native/common/include/jp_platform_linux.h:45
In my case, the wrong JAVA_HOME path was set:
/profile/etc
export JAVA_HOME
JAVA_HOME=/usr/lib/jvm/java-6-openjdk-amd64
PATH="$JAVA_HOME/bin:$PATH"
export PATH
The workaround is to define the full path directly in the call to the JVM:
from jpype import *
startJVM('/Library/Java/JavaVirtualMachines/jdk1.7.0_79.jdk/Contents/MacOS/libjli.dylib', "-ea", "-Djava.class.path=/tmp/Jpype/sample")
java.lang.System.out.println("Hello World!!")
shutdownJVM()
Original text:
Similar issues when trying to run JPype on MacOS El Capitan. I could not figure out how to coax the _darwin.py code into finding the correct JVM location, despite the JAVA_HOME system variable being set properly.
Caveat: trying to run the above code in the Spyder IPython console did not produce any output, but the normal console did.
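Instead of hard-coding the library path, one option is to search the Java installation for it and pass the result to startJVM. The sketch below assumes the library has one of the file names listed in JVM_LIBRARY_NAMES. Both that list and the find_jvm_library helper are illustrative and not part of JPype.

```python
import os

# File names to look for. This list is an assumption and may need adjusting
# for a particular JDK layout.
JVM_LIBRARY_NAMES = ("libjvm.so", "libjli.dylib", "libjvm.dylib", "jvm.dll")


def find_jvm_library(java_home):
    """Walk java_home and return the first JVM library found, or None."""
    for root, _dirs, files in os.walk(java_home):
        for name in JVM_LIBRARY_NAMES:
            if name in files:
                return os.path.join(root, name)
    return None


if __name__ == "__main__":
    java_home = os.environ.get("JAVA_HOME")
    if not java_home:
        print("JAVA_HOME is not set")
    else:
        # The printed path could be passed to startJVM in place of
        # getDefaultJVMPath().
        print(find_jvm_library(java_home))
```

If this prints None, JAVA_HOME probably points at a directory that does not contain a JVM, which would match the "cannot open shared object file" error above.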
