Using a cron job to run Python programs

I'm trying to run a Python program using cron.
Initially I tried running a web scraper written in Python using cron, but I quickly realized that something was wrong.
So I broke the process down to see where the error was.
I found that cron apparently won't run any Python program that is more complicated than
print("Hello World")
For starters, I'm trying to run this code using cron:
import pandas as pd
import os
import sys
sys.stdout = open("/home/pi/testbot/crontask.txt", "w")
df = pd.read_excel('db.xlsx', engine='openpyxl')
print(df)
sys.stdout.close()
I want the program to read a .xlsx file and write the output into a file called crontask.txt.
Now here comes the twist.
When I run the command python3 testbot.py
I get the right result, and the expected content in crontask.txt.
But when I add * * * * * /usr/bin/python3 /home/pi/testbot/testbot.py >> /home/pi/testbot/log.txt to crontab, I get zero results and zero log entries in log.txt.
I've tried making the .py file executable with chmod +x testbot.py and changing permissions, without luck.
I'm starting to wonder if the only way to run such programs is to make a script that runs the program and use cron job on that.
Is that my only solution?

I did solve this one, and as embarrassing as the fault was, I still want to post this in case someone else experiences the same problem.
Cron does not run your job from the script's own directory (the working directory is usually your home directory) and gives it only a minimal environment, so you have to specify full paths for everything in order for it to execute programs that open any folder, file or other program.
Not just in crontab but also in the programs themselves.
In my case it was the df = pd.read_excel('db.xlsx', engine='openpyxl') that caused the error, so when I changed it to df = pd.read_excel('/home/pi/bot/db.xlsx', engine='openpyxl') it worked like a charm.
The same thing goes if you consider using a bash script to open another program. Only absolute paths seem to work.
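A related point that may explain the empty log.txt: the crontab line in the question redirects only stdout with >>, while a Python traceback is written to stderr, so the error never reaches the log. Below is a minimal sketch of one way to make such failures visible by writing the traceback to a log file from inside the program. The helper name run_logged, the logger name and the temp-directory log path are assumptions for illustration; in a real job you would pass an absolute path such as the one used in the crontab entry.

```python
import logging
import tempfile
from pathlib import Path


def run_logged(task, log_file):
    """Run task(); on failure write the traceback to log_file.

    Returns True if the task finished, False if it raised.
    (Hypothetical helper, not part of the original program.)
    """
    logger = logging.getLogger("cronjob")
    handler = logging.FileHandler(log_file)  # appends to the file by default
    logger.addHandler(handler)
    try:
        task()
        return True
    except Exception:
        # logger.exception records the message plus the full traceback
        logger.exception("job failed")
        return False
    finally:
        logger.removeHandler(handler)
        handler.close()


def failing_task():
    # Stand-in for pd.read_excel('db.xlsx') failing on a relative path
    open("db_that_does_not_exist.xlsx")


# Assumption: a file in the temp directory is used here only for the demo
log_path = Path(tempfile.gettempdir()) / "cron_demo_log.txt"
ok = run_logged(failing_task, log_path)
```

Alternatively, appending 2>&1 to the crontab command sends stderr to the same file as stdout, which needs no change to the program.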

Related

Crontab not working properly but log shows the opposite

I'm working on a small ETL that collects data using web scraping, cleans and manipulates it, and sends it to a local sqlite3 database.
If I execute the command /virtualenv_path/python /script_path/script.py it runs perfectly, but if I schedule this command with crontab it does not work.
It just does not send any data. However, my log file shows me that crontab is executing script.py using my venv as expected.
So, what is going on? What should I do to solve this?
I suppose that my script is correct, because if I execute it without crontab it works flawlessly, and even with crontab it does not show any error (as I said, the log file suggests that everything is going well).
This is my repository: https://github.com/raposofrct/wescraping-ETL
There we have the ETL folder that contains my script, the crontab command that I'm using and my sqlite database.
Thanks for any help or clue that you guys can give me.

Your script is likely working, but it's not putting data into the database file you're looking at. hm_db.sqlite is relative to whatever the current working directory is:
DataBase(dados,create_engine('sqlite:///hm_db.sqlite',echo=False))
That is very unlikely to be the same directory you are in when you run the script manually. Either provide an absolute path or make the path relative to your script directory, e.g.
from pathlib import Path
root_directory = Path(__file__).parent
database_file = root_directory / "hm_db.sqlite"
DataBase(dados, create_engine(f"sqlite:///{database_file}", echo=False))
Alternatively, log os.getcwd() in your existing script to figure out where your cronjob has been storing data.
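The os.getcwd() suggestion can be sketched as follows. This is only an illustration: the function name resolved_location is hypothetical, and hm_db.sqlite is the file name from the question. Running it once from the shell and once from cron should show two different directories.

```python
import os
from pathlib import Path


def resolved_location(relative_name):
    """Return where a relative file name ends up for the current process."""
    return Path(os.getcwd()) / relative_name


# Under cron these lines go to cron's output (or your log redirect),
# so compare them with what you see when running the script by hand.
print("current working directory:", os.getcwd())
print("relative database path resolves to:", resolved_location("hm_db.sqlite"))
```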

python run multiple scripts

Hi, I would like to run add1.py and add2.py simultaneously. I have looked into BAT and SH files but couldn't work it out myself. Can anyone help me? The folder is at the following path: C:\Users\Jia\Downloads\Telegram Bot\Scripts
I might also add more scripts like add3.py, add4.py and so on. Does anyone have simple tips that can help me run every script in this folder? Thank you!
It would be even better if the scripts ran one after another, for example add2.py runs after add1.py finishes.

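Just run: python add1.py & python add2.py. In the Windows command prompt, & runs the commands one after another. If you only want the second one to run if the first executes successfully, use python add1.py && python add2.py.
Running them at the same time means starting them as separate processes, for example with start python add1.py on Windows. The scripts themselves only need modifications if they share files or other resources.
NOTE: The & form runs the commands sequentially only in the Windows command prompt. On Linux or macOS, & puts the first command in the background, so for sequential runs you would use: python add1.py ; python add2.py
You can manually add more scripts. python *.py will not run every Python file in a folder: at best the first matching file runs and the others are passed to it as arguments. Instead, you could import them all as modules into a new file called main.py and execute them in whatever order you like in that file.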

As someone else already suggested, you could make a python file which executes your N python scripts.
Using subprocess as described here: https://stackoverflow.com/a/11230471/11962413
import subprocess
subprocess.call("./test1.py", shell=True)
subprocess.call("./test2.py", shell=True)
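A slightly more general sketch of the same idea, assuming the scripts follow the add*.py naming from the question: it runs every matching file in a folder one after another and stops at the first failure. The function name run_scripts_in_order and the pattern default are assumptions for illustration. It uses sys.executable so the child scripts run with the same interpreter as the runner.

```python
import subprocess
import sys
from pathlib import Path


def run_scripts_in_order(folder, pattern="add*.py"):
    """Run each matching script sequentially; stop at the first failure.

    Returns a list of (script name, return code) for the scripts that ran.
    Note: sorting is alphabetical, so add10.py would come before add2.py.
    """
    results = []
    for script in sorted(Path(folder).glob(pattern)):
        # cwd=folder lets the scripts find files next to themselves
        completed = subprocess.run([sys.executable, str(script)], cwd=folder)
        results.append((script.name, completed.returncode))
        if completed.returncode != 0:
            break
    return results


# Example call (path taken from the question, adjust as needed):
# run_scripts_in_order(r"C:\Users\Jia\Downloads\Telegram Bot\Scripts")
```

To start the scripts at the same time instead, subprocess.Popen could be used for each script, followed by wait() on every returned process.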

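You can try this, if you are just trying to run every Python file in the folder:
import os
me = os.path.basename(__file__)  # skip this runner script so it does not start itself again
lst = [l for l in os.listdir() if l.endswith(".py") and l != me]
for ls in lst:
    os.system(f'python {ls}')
Or, if the names follow a known pattern, try this:
import os
last = 2  # number of the last script (add1.py ... add2.py); adjust as needed
for i in range(1, last + 1):
    os.system(f"python add{i}.py")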

Cron scheduling a Python script, selecting correct output

I am trying to schedule running a Python module in cron. This question is somewhat similar to this but I think asks for a different use case.
Task
I have a Python script I run in the shell as a module like this:
python -m myscript
It prints a bunch of numbers (via print) and works when I run it from the shell.
Question
I am now trying to run this every minute using a cron job, like so:
*/1 * * * * python -m myscript
Question1: This does not print to the terminal as expected / I don't see any output. Why? (To test if it is running at all, I redirected output to a file, which creates an empty file every minute).
Question2: My thinking was that any command that works when I run it manually in the shell will also work the same way when started via cron. Is that mistaken? E.g do I still have to make the script executable and such?
Question3: Thinking about it, I was not quite sure where cron will direct e.g. a print / stdout command and could not find in the docs. Do I have to / Should I manually specify the output target if I want it to print to a new shell window?
Currently running this on elementary OS but I want to eventually migrate to a Raspberry Pi. Any help is much appreciated!

Script hangs when importing pandas

I have an issue when importing pandas and running the script on windows task scheduler. In the end the program just hangs and no error occurs. When I execute the script in command prompt, there's no problem. I've tried a lot of different things but couldn't fix the problem so far.
What I'm looking for now is a way to import pandas in verbose mode and write the output real time to a file. I've found a lot of explanations to do this with e.g. python -v module.py 2> output.txt in the shell. But what I'd like to do is something like this:
with profiler as context:
    import pandas
with open("output.txt", "w+") as file:
    file.write(context.output())
The script should write the output in real time so that I can kill the task and still have the output up to the point where the program hangs.
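One possible way to get this kind of output is the standard library's faulthandler module, which can dump the current traceback to a file if the program is still running after a timeout. This is a sketch under assumptions, not the profiler-style API imagined above: the helper name watch_for_hang, the 30 second timeout and the temp-directory output path are illustrative, and import json stands in for import pandas so the example runs anywhere.

```python
import faulthandler
import tempfile
from pathlib import Path


def watch_for_hang(log_path, timeout=30.0):
    """Arrange for a traceback dump into log_path after `timeout` seconds.

    The returned file must stay open until the watch is cancelled.
    """
    log_file = open(log_path, "w")
    faulthandler.dump_traceback_later(timeout, file=log_file)
    return log_file


trace_path = Path(tempfile.gettempdir()) / "hang_trace.txt"
log_file = watch_for_hang(trace_path, timeout=30.0)
try:
    import json  # stand-in for `import pandas`
finally:
    # Reached only if the import finishes; a hang leaves the dump on disk
    faulthandler.cancel_dump_traceback_later()
    log_file.close()
```

If the import hangs, the file shows which line each thread was executing at the timeout, and it survives killing the task because the dump is written directly to the file descriptor.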

I had a similar problem, only I was using PyCharm.
I had an old project from my previous job. When I tried opening this project on my new PC, Python got stuck after importing pandas: the program just hung and no error occurred. Oddly enough, in my other projects pandas worked fine.
I noticed that when starting, PyCharm displayed the path of the Python interpreter that was running. In my old project the path was
C:\Users\user\AppData\Local\Programs\Python\Python37\python.exe
In my other projects the path was
D:\Users\user\Anaconda3\python.exe
I solved this in PyCharm by selecting, for my old project, the interpreter used in my other projects.
I think you can start by finding out which interpreter is running in your Windows Task Scheduler and which interpreter runs in the command prompt. You can check this with
import sys
print(sys.executable)
If it turns out that different interpreters are running, maybe this thread will be useful: Change default python version for command prompt
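Since a task started by the scheduler has no visible console, it can help to write this information to a file. A minimal sketch, assuming hypothetical helper names interpreter_report and write_report and a report file in the temp directory:

```python
import os
import sys
import tempfile
from pathlib import Path


def interpreter_report():
    """Collect facts that often differ between a shell and a scheduler."""
    return {
        "executable": sys.executable,
        "version": sys.version.split()[0],
        "cwd": os.getcwd(),
    }


def write_report(path):
    """Write the report as 'key: value' lines and return it."""
    report = interpreter_report()
    lines = [f"{key}: {value}" for key, value in report.items()]
    Path(path).write_text("\n".join(lines) + "\n")
    return report


# Assumption: the temp directory is writable for both ways of running
report_path = Path(tempfile.gettempdir()) / "interpreter_report.txt"
write_report(report_path)
```

Run it once from the command prompt and once from the scheduled task, then compare the two files.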

how to properly run Python script with crontab on every system startup

I have a Python script that should open my Linux terminal, browser, file manager and text editor on system startup. I decided crontab was a suitable way to run the script automatically. Unfortunately, it didn't go well: nothing happened when I rebooted my laptop. So I captured the output of the script to a file in order to get some clues. It seems my script is only partially executed. I use Debian 8 (Jessie), and here's my Python script:
#!/usr/bin/env python3
import subprocess
import webbrowser

def action():
    subprocess.call('gnome-terminal')
    subprocess.call('subl')
    subprocess.call(('xdg-open', '/home/fin/Documents/Learning'))
    webbrowser.open('https://reddit.com/r/python')

if __name__ == '__main__':
    action()
Here's the entry in my crontab file:
@reboot python3 /home/fin/Labs/my-cheatcodes/src/dsktp_startup_script/dsktp_startup_script.py > capture_report.txt
Here's the content of the capture_report.txt file (I trimmed several lines since it's too long; it only prints my folder structure, and seems to come from the 'xdg-open' line in the Python script):
Directory list of /home/fin/Documents/Learning/
Type Format Sort
[Tree ] [Standard] [By Name] [Update]
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
/
... the rest of my dir structure goes here
I have no other clue about what could be going wrong here. I'd really appreciate your advice, guys. Thanks.

No, cron is not suitable for this. The cron daemon has no connection to your user's desktop session, which will not be running at system startup, anyway.
My recommendation would be to hook into your desktop environment's login scripts, which are responsible for starting various desktop services for you when you log in, anyway, and easily extended with your own scripts.

I'd do as tripleee suggested, but your job might be failing because it requires an X session, since you're trying to open a browser. You could put export DISPLAY=:0; after the schedule in your cronjob, as in
@reboot export DISPLAY=:0; python3 /home/fin/Labs/my-cheatcodes/src/dsktp_startup_script/dsktp_startup_script.py > capture_report.txt
If this doesn't work, you could try replacing :0 with the output of echo $DISPLAY in a graphical terminal.
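The same DISPLAY setting could also be applied inside the Python script rather than in the crontab line. The following is only a sketch: the helper names desktop_env and launch are hypothetical, ":0" is the value assumed in the answer above, and it still only works if a graphical session is actually running when the job starts.

```python
import os
import subprocess


def desktop_env(display=":0", base=None):
    """Copy the environment and set DISPLAY only if it is not already set."""
    env = dict(os.environ if base is None else base)
    env.setdefault("DISPLAY", display)
    return env


def launch(command, display=":0"):
    """Start a GUI program without waiting for it to exit."""
    return subprocess.Popen(command, env=desktop_env(display))


# Example (commented out so the sketch has no side effects):
# launch(["xdg-open", "/home/fin/Documents/Learning"])
```

Popen is used here instead of subprocess.call so that one program that keeps running does not block the next one from starting.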
