Trouble encountered when launching PySpark from the command prompt - python

When I attempt to launch Spark from the command prompt with 'spark-shell', a new command prompt window simply appears and Spark does not start. I used 'pip install pyspark' to install Spark. Thank you for any help.

It turned out I had malware on my PC. I downloaded Malwarebytes, which cleaned up all the malware, and now PySpark works fine.

Related

How to use remote Spark in local VS Code?

Starting to learn Spark but now stuck at the first step.
I downloaded Spark from the Apache website and have finished the configuration. Now if I run the pyspark command in my WSL, a Jupyter server starts, I can open it in my Windows browser, and import pyspark works just fine. But if I connect to WSL with VS Code and create a new notebook there, the pyspark module can't be found.
I didn't install the pyspark module through pip or conda because I thought it was already included in the full distribution I downloaded, so installing it again seemed redundant.
Is there any way to use the remotely installed Spark in VS Code without installing it again separately?
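One possible workaround (a sketch of my own, not from this thread; it assumes Spark is unpacked at, say, /home/you/spark and that the small findspark pip package is available to the notebook's interpreter) is to point the kernel at the existing installation before importing pyspark:
import findspark
findspark.init("/home/you/spark")  # hypothetical path to the unpacked Spark directory

import pyspark  # now resolves from the existing Spark installation
print(pyspark.__version__)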

Cannot use Python in VSCode

I simply want to be able to execute Python commands in VSCode.
I have already installed it from the marketplace, and the main software is on my computer.
Once the Python problem is solved I am looking to install brownie, but first I need to be able to run "pip install ..." commands in the VSCode terminal.
Can you help me out?

Running PySpark in (Anaconda - Spyder) on Windows OS

Dear all,
I am using Windows 10 and I am used to testing my Python code in Spyder.
However, when I try to write the "import pyspark" command, Spyder shows "No module named 'pyspark'".
PySpark is installed on my PC, and I can import pyspark in the command prompt without any error.
I found many blogs explaining how to do this on Ubuntu, but I did not find how to solve it on Windows.
Well, for using packages in Spyder, you have to install them through Anaconda. You can open
the "Anaconda Prompt" and write the command below:
conda install pyspark
That will make the package available in Spyder.
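Once the install finishes, a quick way to check from the Spyder console (my own minimal snippet, not part of the original answer):
import pyspark
print(pyspark.__version__)  # prints a version string if the package is visible to Spyder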
Hi, I installed PySpark on Windows 10 a few weeks back. Let me tell you how I did it.
I followed "https://changhsinlee.com/install-pyspark-windows-jupyter/".
After following each step precisely, you should be able to run PySpark either from the command prompt or by saving a Python file and running it.
When you run via a notebook, download Anaconda, start the Anaconda shell, and type pyspark. In that notebook you don't need to do "import pyspark";
run your program without it and it will be all right. You can also use spark-submit, but for that I found you need to remove the PYSPARK_DRIVER_PYTHON and PYSPARK_DRIVER_PYTHON_OPTS environment variables.
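As an illustration of the spark-submit route (my sketch, not from the answer; it assumes spark-submit is on your PATH), save something like the following as wordcount.py and run "spark-submit wordcount.py":
# wordcount.py - minimal PySpark job for spark-submit (illustrative sketch)
from pyspark.sql import SparkSession

spark = SparkSession.builder.appName("WordCount").getOrCreate()
sc = spark.sparkContext

words = sc.parallelize(["spark", "python", "spark", "windows"])
counts = words.map(lambda w: (w, 1)).reduceByKey(lambda a, b: a + b)
print(counts.collect())  # e.g. [('spark', 2), ('python', 1), ('windows', 1)]

spark.stop()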

Getting "targetdir variable must be provided when invoking this installer" message

When I try to install Python 3.5, I'm getting the "targetdir variable must be provided when invoking this installer" message. I have tried Run as administrator; however, I'm still getting this error.
I had the same problem today and it did not work even with "run as administrator".
To solve the problem, I ran powershell as administrator and executed the following command:
python-3.6.1.exe InstallAllUsers=1 TargetDir="c:\python3.6"
Just right-click on the exe file and run as administrator. It worked for me :)
I met this issue also and found a method to resolve it:
right-click the exe file and choose "Run as administrator".
Then the installation can continue and completes successfully.
Run as administrator solves this problem!
I was trying to install the latest Python for Windows (python-3.7.0b4-amd64-webinstall) but got the same error.
The problem was solved by the following steps:
Open command prompt in administrator mode.
Go to the location where the installer is downloaded.
C:\>cd C:\Users\XYZ\Downloads
C:\Users\XYZ\Downloads>python-3.7.0b4-amd64-webinstall.exe TargetDir=C:\Python37
The installer will open; choose the default install and click Install.

How to use Spark with Python or Jupyter notebook

I am trying to work with 12 GB of data in Python, for which I desperately need to use Spark, but I can't seem to manage the command line on my own or with help from the internet, which is why I'm turning to SO.
So far I have downloaded Spark and unzipped the tar file, but now I don't know where to go from there. I have seen the instructions in the Spark website documentation, which say:
Spark also provides a Python API. To run Spark interactively in a Python interpreter, use bin/pyspark. But where do I run this? Please help.
Edit: I am using Windows 10.
Note: I have always faced problems when trying to install things, mainly because I can't seem to understand the command prompt.
If you are more familiar with Jupyter notebooks, you can install Apache Toree, which integrates PySpark, Scala, SQL and SparkR kernels with Spark.
To install Toree:
pip install toree
jupyter toree install --spark_home=path/to/your/spark_directory --interpreters=PySpark
If you want to install other kernels you can use
jupyter toree install --interpreters=SparkR,SQL,Scala
Now run
jupyter notebook
In the UI, while selecting a new notebook, you should see the following kernels available:
Apache Toree-Pyspark
Apache Toree-SparkR
Apache Toree-SQL
Apache Toree-Scala
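To sanity-check the Apache Toree-Pyspark kernel (my own one-liner, assuming Toree preconfigures a SparkContext named sc, as Spark notebook kernels typically do), run this in a cell:
sc.parallelize(range(100)).sum()  # should return 4950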
When you unzip the file, a directory is created.
Open a terminal.
Navigate to that directory with cd.
Do an ls. You will see its contents; there should be a bin directory somewhere.
Execute bin/pyspark or maybe ./bin/pyspark.
Of course, in practice it's not that simple; you may need to set some paths, as described in TutorialsPoint, but there are plenty of such links out there.
I understand that you have already installed Spark on Windows 10.
You will need winutils.exe available as well. If you haven't already done so, download the file from http://public-repo-1.hortonworks.com/hdp-win-alpha/winutils.exe and install it at, say, C:\winutils\bin.
Set up these environment variables:
HADOOP_HOME=C:\winutils
SPARK_HOME=C:\spark (or wherever you installed Spark)
PYSPARK_DRIVER_PYTHON=jupyter (or ipython)
PYSPARK_DRIVER_PYTHON_OPTS=notebook
Now navigate to the C:\spark directory in a command prompt and type "pyspark".
A Jupyter notebook will launch in a browser.
Create a Spark context and run a count command as shown below.
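For example, a minimal sketch standing in for the screenshot the answer refers to:
from pyspark import SparkContext

sc = SparkContext.getOrCreate()  # in a pyspark-launched notebook, sc usually exists already
nums = sc.parallelize(range(1000))
print(nums.count())  # 1000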
