Run a basic command in Command Prompt using Apache Spark with PySpark - python

I am trying to import a data frame into Spark using Python (PySpark). For that I used a Jupyter notebook, wrote the code below, and got the output shown below.
After that I wanted to run it in CMD, so I saved my Python code in a text file as test.py (as a Python file) and then ran that file in cmd with the command python test.py, as in the screenshot below.
So my task was completed, but 3 or 4 hours later I tried the same process again and unfortunately got the error message below. Can someone explain why this happens? It worked correctly before, but now it does not, and I made no changes between the two attempts. The full error I am now facing is below.

It looks like you have not set the Hadoop home (winutils.exe) in the command session you opened later.
Spark is unable to locate HADOOP_HOME. Check whether you set something different when you ran the command the first time.
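If that is the cause, one fix is to set the variables at the top of test.py itself, before the SparkSession is created. A minimal sketch (the C:\hadoop path is an assumption; point it at the folder whose bin\ subfolder contains winutils.exe on your machine):

```python
import os

# Assumed install location -- replace with the folder whose bin\ subfolder
# contains winutils.exe on your machine.
hadoop_home = r"C:\hadoop"

# Setting these before the SparkSession is created lets Spark find
# winutils.exe even when the cmd session does not define them.
os.environ["HADOOP_HOME"] = hadoop_home
os.environ["PATH"] = os.environ.get("PATH", "") + os.pathsep + os.path.join(hadoop_home, "bin")
```

This way the script no longer depends on which cmd session it is launched from.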

Related

Spark with python script from console not working no errors

Hi, I have this weird problem. I am new to Spark; I just installed Spark and Java, set the environment variables, and Spark seems to work properly. I can enter the Spark console from cmd and run print etc., and it works.
But when I try to do bin\spark-submit C:\Users\User\Desktop\Big_Data\pi.py, where pi.py is just the example from the documentation, the job appears to start, almost as if Spark was working, but the Python file is not executed...
I also tried with different Python files, using the command: C:\Users\User\spark-3.1.2-bin-hadoop2.7>bin\spark-submit C:\Users\User\Desktop\Big_Data\try.py C:\Users\User\Desktop\Big_Data\pg100.txt
And the result is the same: no output, no errors, nothing. Maybe someone has an idea what is wrong?
To clarify: basically there is no output or execution of the Python file when calling Spark from the cmd console.
Looking at the examples in Spark's Quick Start Guide, can you run this command:
bin\spark-submit --master local C:\Users\User\Desktop\Big_Data\pi.py
More info regarding spark-submit can be found here.
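If the job really runs but nothing visible is printed, it can also help to capture spark-submit's stdout and stderr explicitly and look at both. A sketch (the paths are the asker's own, and the actual call is left commented out since it needs a local Spark install):

```python
import subprocess

# Command as a list, so no shell quoting is needed.
cmd = [r"bin\spark-submit", "--master", "local",
       r"C:\Users\User\Desktop\Big_Data\pi.py"]

# Uncomment on a machine with Spark installed:
# result = subprocess.run(cmd, capture_output=True, text=True)
# print("STDOUT:", result.stdout)   # the script's print() output lands here
# print("STDERR:", result.stderr)   # Spark's INFO/WARN logging lands here
```

Seeing whether stderr contains an exception, or whether stdout is genuinely empty, narrows down where the output is going.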

How to use a Python output in a powershell script?

I started learning Python this week, and I am trying to automate adding a new user to both Active Directory and Office 365.
I have managed to add the user to AD using a client and a bot, and also used another script to generate the correct New-MsolUser syntax for PowerShell.
How do I get Python to open PowerShell and run the output of "o365command"?
Also, will I need to connect to the tenant every time I do this? If so, will I need to incorporate that into the script as well?
Happy to show the code I have if needed.
If you write the output from Python as JSON to a file, then PowerShell can import it directly. See ConvertFrom-Json.
As for running PowerShell from Python, look at: Running an outside program (executable) in Python?
It's not something I've ever tried but good luck.
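Putting those two suggestions together, here is a minimal sketch. The field names and the New-MsolUser call are assumptions based on the question, and the PowerShell invocation is left commented out since it only works on a Windows machine with a connected MsolService session:

```python
import json
import os
import tempfile

# Hypothetical user record from the earlier script; the field names are assumptions.
new_user = {
    "DisplayName": "Jane Doe",
    "UserPrincipalName": "jane.doe@example.com",
    "UsageLocation": "GB",
}

# Write the data as JSON so PowerShell can read it back with ConvertFrom-Json.
json_path = os.path.join(tempfile.gettempdir(), "new_user.json")
with open(json_path, "w") as f:
    json.dump(new_user, f)

# The PowerShell side, built as a string.
ps_command = (
    f"$u = Get-Content '{json_path}' | ConvertFrom-Json; "
    "New-MsolUser -DisplayName $u.DisplayName "
    "-UserPrincipalName $u.UserPrincipalName -UsageLocation $u.UsageLocation"
)
# import subprocess
# subprocess.run(["powershell", "-Command", ps_command], check=True)
```

As for connecting to the tenant: yes, a fresh PowerShell process started this way has no session state, so the script would also need a Connect-MsolService step before New-MsolUser.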

how to execute python script on atom on windows

I am using Atom on Windows 10. While setting up Atom on my computer, I created a folder called "beyond basics" and then created a Python file in it. I installed PlatformIO on Atom and got a "+" icon on screen; clicking it opened a command line. I am trying to execute the file there by typing python filename, but I am getting an error. Any help is appreciated.
python3 myfile.py
Try typing myfile.py without the python prefix. It may work, as it does for me on Windows 10. Your bubble is covering up an error message that could help us debug. Can you add an edit and tell us the error message? Until then, just try the command without the python prefix.
You should also save before running, as was commented by Denis Fetinin.
If it still doesn't work, try adding python to the env variables. It's a simple process that you can follow here.
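One quick way to check whether PATH is the problem (this cause is an assumption) is to ask Python itself where, if anywhere, the python command resolves:

```python
import shutil

# Prints the full path of the "python" executable the shell would run,
# or None if it is not on PATH -- which would explain the error in Atom.
print(shutil.which("python"))
```

If this prints None from a terminal where python does work, the Atom/PlatformIO terminal is simply inheriting a different environment.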

Command line within python script giving syntax error

I am trying to run a simple command line from Python.
While the code works in a Jupyter notebook, it throws a syntax error in Spyder.
Strangely, if I run the same command from inside test() below in the console it executes, but the script shows an error.
Below is my code. TIA!
def test():
    !start excel

test()
!start excel works in a Jupyter notebook because the Jupyter shell understands the ! prefix and runs a native (Windows) command:
!: run a shell command. E.g., ! pip freeze | grep pandas to see what version of pandas is installed.
But !start excel isn't valid Python syntax. You need the exact Python equivalent (for Windows, at least):
import os
os.startfile("excel")
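An alternative sketch, if that doesn't find the program on your machine, is to go through the shell the way Jupyter does. Note that start is a cmd builtin rather than an executable, so it is Windows-only and needs shell=True; the launch helper name here is made up for illustration:

```python
import subprocess

# Windows-only equivalent of the cmd builtin "start excel"
# ("start" is a cmd builtin, not an executable, hence shell=True):
# subprocess.run("start excel", shell=True, check=True)

def launch(args):
    """Start a program without waiting for it to exit (hypothetical helper)."""
    return subprocess.Popen(args)
```

Unlike the ! syntax, this works identically in Jupyter, Spyder, and a plain python script.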

Use a single iPython console to run files and parts of code in PyCharm

I started using PyCharm and managed to have the Python prompt running after a file is run (link), and I found how to run pieces of code in the current IPython console (link).
For data science, however, it is very convenient to have a single IPython console/Python kernel where I can run both files and code, just like Spyder does, and keep using the same running Python kernel: for example, to load a large amount of data with one script and then write another script or pieces of code to plot and explore the data in different ways.
Is there a way in Pycharm to do it?
I think it would imply automatically generating and running a line like:
runfile('C:/temp/my_project/src/console/load_data.py', wdir='C:/temp/my_project/src')
If the option does not exist, is it possible to make a macro or similar to do it?
The option "Single instance only" in Edit configurations doesn't help.
