Running pyspark in (Anaconda - Spyder) in windows OS - python

Dears,
I am using Windows 10 and usually test my Python code in Spyder.
However, when I run "import pyspark" in Spyder, it shows "No module named 'pyspark'".
PySpark is installed on my PC, and I can run import pyspark from the command prompt without any error.
I found many blogs explaining how to do this on Ubuntu, but none explaining how to solve it on Windows.

To use a package in Spyder, it has to be installed in the Anaconda environment that Spyder runs in. Open
"Anaconda Prompt" and run the command below:
conda install pyspark
That will make the package available in Spyder.
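Before reinstalling anything, it may help to confirm that Spyder and the command prompt really use different interpreters. The snippet below is a minimal sketch of that check; describe_module is a hypothetical helper name, not part of Spyder or PySpark. Run it once in the Spyder console and once with python from the command prompt, then compare the output.

```python
import importlib.util
import sys


def describe_module(name):
    """Return (interpreter path, module file or None) as seen by this interpreter."""
    spec = importlib.util.find_spec(name)
    location = spec.origin if spec is not None else None
    return sys.executable, location


interpreter, location = describe_module("pyspark")
print("Interpreter:", interpreter)
print("pyspark:", location if location else "not importable from this interpreter")
```

If the two runs print different interpreter paths, the package was installed for one Python and Spyder is running another.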

Hi, I installed PySpark on Windows 10 a few weeks back. Let me tell you how I did it.
I followed "https://changhsinlee.com/install-pyspark-windows-jupyter/".
After following each step precisely, you will be able to run pyspark either from the command prompt or by saving a Python file and running it.
To run it in a notebook (download Anaconda first), start the Anaconda shell and type pyspark. In that case you don't need to do "import pyspark";
run your program without it and it will be fine. You can also use spark-submit, but for that I found that you need to remove the PYSPARK_DRIVER_PYTHON and PYSPARK_DRIVER_PYTHON_OPTS environment variables.
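If Spark was set up from a downloaded distribution and SPARK_HOME is set, another option for Spyder is to put Spark's bundled Python sources on sys.path before importing pyspark. The sketch below assumes the usual layout of a Spark download (a python folder, with a py4j-*-src.zip under python/lib); the function names are made up for illustration. The third-party findspark package automates a similar idea.

```python
import glob
import os
import sys


def spark_python_paths(spark_home):
    """Return the sys.path entries a Spark download needs for `import pyspark`."""
    python_dir = os.path.join(spark_home, "python")
    # The py4j version in the file name differs between Spark releases, so glob for it.
    py4j_zips = sorted(glob.glob(os.path.join(python_dir, "lib", "py4j-*-src.zip")))
    return [python_dir] + py4j_zips


def add_spark_to_sys_path(spark_home=None):
    """Prepend Spark's Python folders to sys.path and return what was added."""
    spark_home = spark_home or os.environ.get("SPARK_HOME")
    if not spark_home:
        raise RuntimeError("SPARK_HOME is not set")
    added = []
    for entry in spark_python_paths(spark_home):
        if entry not in sys.path:
            sys.path.insert(0, entry)
            added.append(entry)
    return added
```

Calling add_spark_to_sys_path() at the top of a script, before import pyspark, should be enough in that setup. It only changes the current Python session and leaves the environment variables alone.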

Related

How to use remote Spark in local VS Code?

I am starting to learn Spark but am stuck at the first step.
I downloaded Spark from the Apache website and finished the configuration. Now if I run the pyspark command in my WSL, a Jupyter server starts, I can open it in my Windows browser, and import pyspark works just fine. But if I connect to WSL with VS Code and create a new notebook in it, the pyspark module can't be found.
I didn't install the pyspark module through pip or conda because I thought it was already included in the full version that I downloaded, so it seemed redundant to me.
Is there any way to use the remotely installed Spark in VS Code without installing it again separately?

Error occurring while installing and importing pynput

I am trying to install and import pynput in VS Code, but it shows an error every time I try. I used VS Code's built-in terminal to install it with pip and typed the following:
pip install pynput
but this error is shown:
Fatal error in launcher: Unable to create process using '"c:\users\vicks\appdata\local\programs\python\python38-32\python.exe" "C:\Users\vicks\AppData\Local\Programs\Python\Python38-32\Scripts\pip.exe" install pynput': The system cannot find the file specified
After receiving this error, I tried using CMD to install it, but the same error is shown. I also tried python pip install pynput, and it shows:
Python was not found; run without arguments to install from the Microsoft Store, or disable this shortcut from Settings > Manage App Execution Aliases.
I get this even though I have Python 3.9.7, have selected it as my interpreter in VS Code, and have IDLE (Python 64 bit) installed. How can I resolve this error? Any help is appreciated.
Thanks in advance :)
VS Code's integrated terminal is not a separate shell of its own. When you open a terminal in VS Code, it opens the default shell, which on Windows is usually equivalent to opening CMD.
Selecting Python 3.9.7 as your default interpreter in VS Code does not make it visible to your CMD / terminal. It only means that the VS Code IDE will use that instance of Python when launching the program from VS Code itself with the green button (or F5), and when scanning your code to point out missing packages, etc.
CMD will only detect your Python automatically if it is in your PATH environment variable. You should add the Python 3.9.7 base and Scripts paths to it.
Also, it would be best to first uninstall conflicting versions of Python (like your 3.8.x) and remove them from PATH, assuming that this won't cause any problems for you. You may want to keep a record of all the packages installed in this old version of Python for future reference, using pip freeze or pip list.
Check whether c:\users\vicks\appdata\local\programs\python\python38-32\python.exe exists by typing cd c:\users\vicks\appdata\local\programs\python\python38-32 and then dir python.exe
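To see what the terminal can actually find, the PATH lookup described above can be reproduced in a few lines. This is only a sketch (find_on_path is a hypothetical helper); since python is not reachable from CMD here, it can be run from VS Code with F5 or through the py launcher.

```python
import os


def find_on_path(names, path_value=None):
    """List every file on PATH whose name matches one of `names`, in lookup order."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits = []
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue  # skip empty entries left by stray separators
        for name in names:
            candidate = os.path.join(directory, name)
            if os.path.isfile(candidate):
                hits.append(candidate)
    return hits


for hit in find_on_path(["python.exe", "pip.exe"]):
    print(hit)
```

The first python.exe printed is the one the terminal would start. An entry under ...\Microsoft\WindowsApps, if one shows up, is the Microsoft Store alias mentioned in the error message. Once the right python.exe is on PATH, python -m pip install pynput (note the -m) runs pip through that interpreter and does not depend on the old pip.exe launcher.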

Cannot use python in VSCode

I simply want to be able to execute Python commands in VS Code.
I have already installed it from the marketplace, as well as the main software on my computer.
Once the Python problem is solved, I want to install brownie, but first I need to be able to run Python commands such as "pip install..." in the VS Code terminal.
Can you help me out?

Windows Python 3.9, pip, vscode not working correctly tried every tutorial

I got this to work relatively easily on my Mac with Django, but for some reason Windows has been a heartache.
The problem is that in the console only py starts Python.
python and python3 do not work at all,
and I can't get pip to install anything either: the py command won't execute it, and python and python3 just open the Windows Store.
I installed Python with the installer, added it to PATH, and set the install location to C:\Python\Python39.
I reordered PATH in both the user variables and the system variables so that Python is at the top in both.
I've edited the VS Code settings.
I've also turned off the App execution aliases. That did nothing.
I'm at a complete loss, so if anyone has any advice I'd be very appreciative.
This tutorial helped me get pip working when I started programming:
https://youtu.be/28eLP22SMTA

how to use spark with python or jupyter notebook

I am trying to work with 12 GB of data in Python, for which I desperately need to use Spark, but I guess I'm too stupid to use the command line by myself or with help from the internet, which is why I have to turn to SO.
So far I have downloaded Spark and unzipped the tar file (sorry for the language, but I am feeling stupid and lost), but now I can't see where to go next. I have read the instructions in the Spark documentation, which say:
Spark also provides a Python API. To run Spark interactively in a Python interpreter, use bin/pyspark
But where do I run this? Please help.
Edit: I am using Windows 10.
Note: I have always had problems when trying to install something, mainly because I can't seem to understand the command prompt.
If you are more familiar with Jupyter Notebook, you can install Apache Toree, which integrates PySpark, Scala, SQL and SparkR kernels with Spark.
To install Toree:
pip install toree
jupyter toree install --spark_home=path/to/your/spark_directory --interpreters=PySpark
If you want to install other kernels, you can use
jupyter toree install --interpreters=SparkR,SQL,Scala
Now run
jupyter notebook
In the UI, when creating a new notebook, you should see the following kernels available:
Apache Toree-Pyspark
Apache Toree-SparkR
Apache Toree-SQL
Apache Toree-Scala
When you unzip the file, a directory is created.
Open a terminal.
Navigate to that directory with cd.
Do an ls (dir in the Windows command prompt). You will see its contents; bin should be somewhere among them.
Execute bin/pyspark or maybe ./bin/pyspark.
Of course, in practice it's not that simple; you may need to set some paths, as described on TutorialsPoint, but there are plenty of such links out there.
I understand that you have already installed Spark on Windows 10.
You will need to have winutils.exe available as well. If you haven't already done so, download the file from http://public-repo-1.hortonworks.com/hdp-win-alpha/winutils.exe and place it in, say, C:\winutils\bin
Set up the environment variables:
HADOOP_HOME=C:\winutils
SPARK_HOME=C:\spark or wherever.
PYSPARK_DRIVER_PYTHON=ipython (or jupyter)
PYSPARK_DRIVER_PYTHON_OPTS=notebook
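Before launching pyspark, a quick sanity check of these variables can save some guessing. The snippet below is a rough sketch: check_spark_env is a hypothetical helper, and it only verifies the two paths mentioned above (winutils.exe in the bin folder of HADOOP_HOME, and a bin folder under SPARK_HOME). Open a new command prompt first so the variables are picked up.

```python
import os


def check_spark_env(env=None):
    """Return a list of problems found in the Spark-related environment variables."""
    env = os.environ if env is None else env
    problems = []

    hadoop_home = env.get("HADOOP_HOME")
    if not hadoop_home:
        problems.append("HADOOP_HOME is not set")
    elif not os.path.isfile(os.path.join(hadoop_home, "bin", "winutils.exe")):
        problems.append("winutils.exe not found in the bin folder of HADOOP_HOME")

    spark_home = env.get("SPARK_HOME")
    if not spark_home:
        problems.append("SPARK_HOME is not set")
    elif not os.path.isdir(os.path.join(spark_home, "bin")):
        problems.append("no bin folder under SPARK_HOME")

    return problems


for problem in check_spark_env():
    print(problem)
```

If nothing is printed, both variables point at folders with the expected contents.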
Now navigate to the C:\Spark directory in a command prompt and type "pyspark"
Jupyter notebook will launch in a browser.
Create a spark context and run a count command as shown.
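For the last step, a first notebook cell could look roughly like the sketch below. The helper names get_context and count_numbers are made up for illustration; SparkContext.getOrCreate, parallelize and count are standard PySpark calls. It assumes pyspark is importable in the notebook, which should be the case when the notebook was started through the pyspark command.

```python
def get_context():
    """Return the running SparkContext, or create one if none exists yet."""
    # Imported lazily so that defining this function does not require pyspark.
    from pyspark import SparkContext
    return SparkContext.getOrCreate()


def count_numbers(sc, n=1000):
    """Distribute the integers 0..n-1 as an RDD and count its elements."""
    return sc.parallelize(range(n)).count()
```

In the notebook, count_numbers(get_context()) should then return the number of elements in the RDD. A notebook started through pyspark normally has a context available as sc already, in which case count_numbers(sc) works as well.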
