Debugging HTCondor issue running Python script

I am submitting a python script to condor. When condor runs it it gets
an import error. Condor runs it as
/var/lib/condor/execute/dir_170475/condor_exec.exe. If I manually copy
the python script to the execute machine and put it in the same place
and run it, it does not get an import error. I am wondering how to
debug this.
How can I see the command line condor uses to run it? Can the file
copied to /var/lib/condor/execute/dir_170475/condor_exec.exe be
retained after the failure so I can see it? Any other suggestions on
how to debug this?

You can run an interactive job (basically a job with sleep or cat as the command) and use condor_ssh_to_job to open a shell on the execute node and run the script there.
In general you need to set up your Python environment on the compute node. It is best to create a venv and activate it inside your start script.
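To compare the environment Condor gives the job with the one you get when running the script by hand, it can help to dump the interpreter details at the very top of the script. This is a minimal sketch using only the standard library; the function names are made up for illustration. The output goes to stderr, which should end up in whatever file your submit description names under error.

```python
import os
import sys


def describe_environment():
    """Collect the details that most often explain an ImportError."""
    return {
        "executable": sys.executable,       # which interpreter is running
        "version": sys.version.split()[0],  # interpreter version string
        "cwd": os.getcwd(),                 # directory the job runs in
        "sys_path": list(sys.path),         # where imports are searched
        "PATH": os.environ.get("PATH", ""),
        "PYTHONPATH": os.environ.get("PYTHONPATH", ""),
    }


def print_environment(stream=None):
    """Write one 'key: value' line per entry, to stderr by default."""
    if stream is None:
        stream = sys.stderr
    for key, value in describe_environment().items():
        print(f"{key}: {value}", file=stream)


if __name__ == "__main__":
    print_environment()
```

Run it once under Condor and once manually on the execute machine, then diff the two outputs. A different executable or a missing sys_path entry usually points at the cause.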

Related

Run Python script 24x7 on windows VM

Is there any way to run a Python script continuously on a Windows VM? The script should keep running even if the computer restarts automatically.
What am I doing right now?
I have a script called FooBar.py that contains an infinite while loop to execute the main() function continuously. I am running this script in PowerShell.
What is the problem with this approach?
Sometimes this VM restarts automatically, or the PowerShell window may be closed accidentally. These issues cause the script to stop executing.
What have I tried so far?
I tried pythonw.exe instead of python.exe to run the script, but this does not resolve my problem.
Is there any way to run the FooBar.py script continuously? Is there any way in Windows Task Scheduler to restart the script even after the computer restarts?
You can use pm2 to schedule the startup of your script. You can find more information on how to use it here:
https://towardsdatascience.com/automate-your-python-script-with-pm2-463238ea0b65
You can use a Python module named "schedule" to run your code at whatever time you need.
Use "pip install schedule" to install the library.
[Image in the original post: an example of using schedule, where job is the function you want to run.]
If you want to run it at an interval in seconds, you can use
schedule.every(#duration#).seconds.do(#declared function#)
Thank you!
Create a task in Windows Task Scheduler with the trigger On a schedule and the trigger option Repeat task every 1 minute. Then, in the Settings tab, use the dropdown If the task is already running, then the following rule applies: to choose Do not start a new instance.

Running a python script on Google Cloud Compute Engine

For a machine learning task at school I wrote my own MLP network. The data set is quite big, and training takes forever. I was alerted to the option of running my script on the Google Cloud Compute Engine. I tried to set this up, but did not succeed (yet).
The steps I undertook were:
Create an account
Create a VM
Open the VM via the browser
Can anyone help me with importing my Python script into Google Cloud and running it there? Or does anyone have a clear tutorial on how to solve this? I tried finding these myself, but have had no success so far.
I finally figured this out, so I'll post the answer that worked for me here. I'm using Debian Stretch on my VM. I'm assuming you have already uploaded your file(s) to the VM and that you are in the same directory as your script.
Make your script executable
chmod +x myscript.py
Run the script with nohup so that it keeps running after you log out; the trailing & runs it in the background. I've added a shebang line to my Python script, so there's no need to call python here
nohup /path/to/script/myscript.py &
Log out of the shell if you want
logout
Done! Now your script is up and running. You can log back in and make sure that your process is still alive by checking the output of this command:
ps -e | grep myscript.py
If anything went wrong, you can check out the nohup.out file to see the output of your script:
cat nohup.out
There is an even simpler approach to run code in the background on GCP and in any Linux terminal: using screen.
Create a new background terminal window:
screen -S WRITE_A_NAME_OF_YOUR_CHOICE_HERE
Now you are in a background window in the terminal. Run your code:
python3 mycode.py
Detach from screen with the hotkeys and the job will keep running in the background.
Ctrl+A, then D
You can close all windows now. If you want to go back and see what's happening, log in to your terminal again and type the following.
screen -ls
This will give you the list of the created "windows". Now find yours and type
screen -r WRITE_NAME_OF_YOUR_WINDOW
And there you have it :D
You can use the Google Cloud Platform tutorials themselves, which are very simple to follow. Links are given below.
Setting up Python
https://cloud.google.com/python/setup
Getting started
https://cloud.google.com/python/getting-started/hello-world
Please note that there is no free tier for running Python 3.x; the standard environment with a free tier only supports Python 2.x.
Edit: As per the latest update, Python 3.x is the default in the standard environment.
Just navigate to the directory where the script is placed.
python thenameofscript.py
I used Way Script, which is great and has a free plan that can run a script every hour.
You can check this video to see explanation

Python Script Won't Run from Aws Ubuntu Terminal

I just set up my first aws server for a personal project I'm doing. I'm running ubuntu linux, and I have a python script that accesses an sqlite database file in order to send email. I have these same files on my own ubuntu machine and the script works fine. I'm having trouble, however, figuring out how to run my script from the terminal in my aws vm. Normally I use idle to run my python script on my linux machine, so I'm trying to figure out how to run it from the terminal and it's giving me some trouble.
I tried
python script.py
which did nothing, so I converted it to an executable, ran it, and got
./script.py: line 1: import: command not found
...and so on
I realized that I had to add
#!/usr/bin/env python3
to my script, so I did that, converted to executable again, and ran it by entering
./script.py
which also did nothing. If the program had run, it would have delivered an email to my inbox. Is there any way I can tell if it's actually trying to run my script? Or am I trying to run it incorrectly?
You can modify the script to add verbose output that prints its status to the console, or, if you just want to know whether your script is running in the background, you can check whether the process is active using ps (the process name would be the name of the script):
ps aux | grep "script.py"
Either way, the former is the better practice, since it shows you the exact execution flow of your script.
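As a rough illustration of the verbose-output suggestion, the standard logging module can report each stage of the script. This is only a sketch: make_logger and main are hypothetical names, and the comment inside main stands in for the real database and email work, which the question does not show.

```python
import logging
import sys


def make_logger(name="script", stream=None):
    """Return a logger that writes timestamped lines to stream (stderr by default)."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    if not logger.handlers:  # avoid adding a second handler on repeated calls
        target = stream if stream is not None else sys.stderr
        handler = logging.StreamHandler(target)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger


def main(logger):
    logger.info("starting")
    # Placeholder: open the database, build the message, send the email.
    logger.info("finished")


if __name__ == "__main__":
    main(make_logger())
```

If "starting" never appears, the script is not being executed at all. If it appears but "finished" does not, the problem is somewhere in between.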

How to attach to PyCharm debugger when executing python script from bash?

I know how to set up run configurations to pass parameters to a specific Python script. There are several entry points, and I don't want a run configuration for each one. What I want to do instead is launch a Python script from a command-line shell script, attach the PyCharm debugger to the Python process that is executed, and have it stop at breakpoints. I've tried using a pre-launch step with a utility Python script that sleeps for 10 seconds so I can attempt "Attach to process" on the Python script. That didn't work. I tried importing pdb and calling set_trace() to see if that would stop it for attaching to the process, but that appears to be specific to command-line debugging. Any clues would be appreciated.
Thanks!
You can attach the debugger to a python process launched from terminal:
Use Menu Tools --> Attach to process then select python process to debug.
If you want to debug a file installed in site-packages you may need to open the file from its original location.
You can pause the program manually from the debugger and inspect the suspended thread to find your source file.
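Attaching only works while the process is alive, so one option is to make the target script itself pause at startup and print its PID. The sketch below assumes a hypothetical environment variable, WAIT_FOR_DEBUGGER, set by the launching shell script; neither the variable nor the function name comes from PyCharm.

```python
import os
import sys
import time


def pause_for_attach(seconds=30):
    """Print this process's PID and sleep so a debugger can be attached."""
    pid = os.getpid()
    print(
        f"PID {pid}: waiting {seconds}s for a debugger to attach",
        file=sys.stderr,
        flush=True,
    )
    time.sleep(seconds)
    return pid


# Only pause when the launcher asks for it (hypothetical variable name).
if os.environ.get("WAIT_FOR_DEBUGGER") == "1":
    pause_for_attach()
```

Place this at the top of the entry-point script, start it from bash with WAIT_FOR_DEBUGGER=1, and pick the printed PID in the attach dialog before the sleep ends.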

Problems running python script by windows task scheduler that does pscp

Not sure if anyone has run into this, but I'll take suggestions for troubleshooting and/or alternative methods.
I have a Windows 2008 server on which I am running several scheduled tasks. One of those tasks is a python script that uses pscp to log into a linux box, checks for new files and if there is anything new, copies them down to a local directory on the C: drive. I've put some logging into the script at key points as well and I'm using logging.basicConfig(level=DEBUG).
I built the command using a variable, command = 'pscp -pw xxxx name@ip:/ c:\local_dir', and then I use subprocess.call(command) to execute the command.
Now here's the weird part. If I run the script manually from the command line, it works fine. New files are downloaded and processed. However, if the Task Scheduler runs the script, no new files are downloaded. The script is running under the same user, but yet yields different results.
According to the log files created by the script and on the linux box, the script successfully logs into the linux box. However, no files are downloaded despite there being new files. Again, when I run it via the command line, files are downloaded.
Any ideas? suggestions, alternative methods?
Thanks.
You can use the Windows Task Scheduler, but make sure the "optional" field "Start in" is filled in.
In the Task Scheduler app, add an action that specifies your Python file to run ("doSomeWork") and fill in the Start in (optional) input with the directory that contains the file. So, for example, if you have a Python file in:
C:\pythonProject\doSomeWork.py
You would enter:
Program/Script: doSomeWork.py
Start in (optional): C:\pythonProject
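The "Start in" field matters because relative paths inside the script are resolved against the working directory, which Task Scheduler may set differently from your command prompt. A script can also protect itself by building paths from its own location. This is a small sketch with made-up helper names:

```python
from pathlib import Path


def script_directory(script_file):
    """Return the absolute directory that contains script_file."""
    return Path(script_file).resolve().parent


def resolve_beside_script(script_file, relative_name):
    """Build an absolute path to a file that sits next to the script."""
    return script_directory(script_file) / relative_name
```

Inside the script you would pass __file__ as script_file, for example resolve_beside_script(__file__, "settings.ini") for a hypothetical settings file, or call os.chdir(script_directory(__file__)) once at startup.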
I had the same issue when trying to open an MS Access database on a Linux VM. Running the script at the Windows 7 command prompt worked but running it in Task Scheduler didn't. With Task Scheduler it would find the database and verify it existed but wouldn't return the tables within it.
The solution was to have Task Scheduler run cmd as the Program/Script with the arguments /c python C:\path\to\script.py (under Add arguments (optional)).
I can't tell you why this works but it solved my problem.
I'm having a similar issue. In testing I found that any type of call with subprocess stops the python script when run in task scheduler but works fine when run on the command line.
import subprocess
print('Start')
test = subprocess.check_output(["dir"], shell=True)
print('First call finished')
When run on command line this outputs:
Start
First call finished
When run from task scheduler the output is:
Start
In order to get the output from task scheduler I run the python script from a batch file as follows:
python test.py >> log.txt
I run the script through the batch file both on command line and through task scheduler.
Brad's answer is right. Subprocess needs the shell context to work, and Task Scheduler can launch Python without it. Another way to do it is to have Task Scheduler launch a batch file that calls python c:\path\to\script.py etc. The only difference is that if the script calls os.getcwd(), you will always get the directory the script is in, whereas you get something else when Task Scheduler calls cmd directly.
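When a subprocess call behaves differently under Task Scheduler, it helps to capture everything the child process reports instead of only its side effects. The following is a sketch, not the method used in the answers above: run_logged is a hypothetical helper that runs a command without a shell, closes stdin so a tool waiting for input fails instead of hanging, and logs the exit code and output.

```python
import logging
import subprocess
import sys


def run_logged(args, cwd=None, timeout=300):
    """Run args (a list) without a shell and log what the command produced."""
    result = subprocess.run(
        args,
        cwd=cwd,                   # explicit working directory, if given
        stdin=subprocess.DEVNULL,  # no console input is available to a task
        capture_output=True,       # collect stdout and stderr
        text=True,                 # decode the output to str
        timeout=timeout,           # seconds before TimeoutExpired is raised
    )
    logging.debug("command: %s", args)
    logging.debug("exit code: %s", result.returncode)
    logging.debug("stdout: %s", result.stdout)
    logging.debug("stderr: %s", result.stderr)
    return result


if __name__ == "__main__":
    logging.basicConfig(level=logging.DEBUG)
    run_logged([sys.executable, "-c", "print('hello')"])
```

Comparing the logged exit code and stderr from a manual run with those from a scheduled run often shows what differs between the two contexts.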
Last edit - start
After some experiments: if you give the full path to the Python program, it works without highest privileges (that is, without running as admin). That means task settings like this:
program: "C:\Program Files\Python37\python.exe"
arguments: "D:\folder\folder\python script.py"
I have no idea why, but it works even if script uses subprocess and multiple threads.
Last edit - end
What I did was change the task settings: I checked Run with highest privileges. The task then started to work perfectly while running python [script path].
But keep in mind that the window title then always contains "Administrator: " at the beginning.
P.S. Thanks, guys, for pointing out that subprocess is a problem. It made me think of the task settings.
I had a similar problem where one script ran from Windows Task Scheduler and another one didn't.
Running cmd with python [script path] didn't work for me on Windows 8.1 Embedded x64. I'm not sure why. It was probably because the path had to contain spaces and there was an issue with quotes.
Hope my answer helps someone. ;)
Create a batch file, add your Python script to it, and then schedule that batch file. It will work.
Example: suppose your Python script is at c:\abhishek\script\merun.py
First you have to go to the directory with the cd command, so your batch file would look like:
cd c:\abhishek\script
python merun.py
It works for me.
Just leaving this for posterity: a similar issue I faced was resolved by using the UNC path (\\10.x.xx.xx\Folder\xxx) everywhere in my .bat and .py scripts instead of the letter assigned to the drive (K:\Folder\xxx).
I had this issue before. I was able to run the task manually in Windows Task Scheduler, but not automatically. I remembered that another user had changed the time, and maybe this change caused Task Scheduler to error out. I am not sure. Therefore, I created another task with a different name for the same script, and the script ran automatically. Try creating a test task running the same script. Hopefully that works!
For an Anaconda installation of Python on Windows, the solution below worked for me.
Create a batch file with the following:
"C:\Users\username\Anaconda3\condabin\activate" && python "script.py" && deactivate
Set up a task to run this batch file.
