I have a command that takes a long time to run, which I like to run in the background like this:
python3 script.py -f input.key -o output >> logs/script.log 2>&1 &
This works in the sense that the command is indeed running in the background and I can check the output and potential errors later.
The main problem is that the output is only appended after the command is completely finished, whereas I would like up-to-date log messages so I can check the progress.
So currently the log would be empty and then suddenly at 08:30 two lines would appear:
[08:00] Script starting...
[08:30] Script finished!
Instead, I would like to have output saved to file before the command is completely finished.
Since you are calling a Python script, you will want to use the -u option, which forces the stdout and stderr streams to be unbuffered:
$ python3 -u script.py -f input.key -o output >> logs/script.log 2>&1 &
You can check the log periodically using cat, or in real time by combining it with watch:
$ watch cat logs/script.log
See https://docs.python.org/3.7/using/cmdline.html#cmdoption-u
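Alternatively, tail -f follows the file and prints new lines as they are appended, so you don't have to re-read the whole log each time:
$ tail -f logs/script.log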
I have a refresh_data.sh file which contains multiple papermill commands, for example:
papermill notebook_1.ipynb output_1.ipynb -p start "2017-12-01" -p date "2017-12-31"
papermill notebook_2.ipynb output_2.ipynb -p start "2018-01-01" -p date "2018-01-31"
If I get an error while it is running the first notebook, the process continues executing the second one.
In other words, an error in one of the notebooks doesn't "break" the overall script.
As far as I remember, with normal Python scripts, an error in one of the commands within the bash script breaks the execution of the entire script.
What is the standard behaviour of a bash script in this case? Can I change it so that it stops as soon as there is an error?
If your bash script is configured with set -e, it will stop as soon as a command errors out:
Automatic exit from bash shell script on error
#!/bin/bash
set -e
# Any subsequent(*) commands which fail will cause the shell script to exit immediately
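Applied to the refresh_data.sh from the question, a minimal sketch would be (notebook names and parameters taken from the question):

#!/bin/bash
set -e  # exit at the first papermill command that returns a non-zero status

papermill notebook_1.ipynb output_1.ipynb -p start "2017-12-01" -p date "2017-12-31"
papermill notebook_2.ipynb output_2.ipynb -p start "2018-01-01" -p date "2018-01-31"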
You can run papermill with --log-output to get more information about why your notebook fails:
papermill "${INPUT_NOTEBOOK_PATH}" "${OUTPUT_NOTEBOOK_PATH}" --log-output
To capture the notebook execution result, you can always check the exit status of the previous command using $?:
papermill "${INPUT_NOTEBOOK_PATH}" "${OUTPUT_NOTEBOOK_PATH}" --log-output
notebook_result=$?
if [[ ${notebook_result} -eq 0 ]]; then
    echo "All good"
else
    echo $notebook_result
fi
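One caveat when combining the two approaches: with set -e active, a failing papermill exits the script before the $? check is reached. Commands tested directly in an if condition are not affected by set -e, so a pattern like the following sketch works either way:

if papermill "${INPUT_NOTEBOOK_PATH}" "${OUTPUT_NOTEBOOK_PATH}" --log-output; then
    echo "All good"
else
    notebook_result=$?
    echo "papermill exited with status ${notebook_result}"
fi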
I have a python script that I want to execute in the background on my unix server. The catch is that I need the python script to wait for the previous step to finish before moving onto the next task, yet I want my job to continue to run after I exit.
I think I can set it up as follows but would like confirmation:
An excerpt of the script looks like this, where command 2 is dependent on the output from command 1, since command 1 outputs an edited executable file in the same directory. I would like to point out that commands 1 and 2 do not have nohup/& included.
subprocess.call('unix command 1 with options', shell=True)
subprocess.call('unix command 2 with options', shell=True)
When I initiate my python script like so:
% nohup python python_script.py &
will my script run in the background, given that I explicitly did not put nohup/& on my scripted unix commands but instead ran the python script itself in the background?
Yes. By running your python script with nohup (no hang up), your script won't keel over when the network connection is severed, and the trailing & symbol will run your script in the background.
You can still view the output of your script: nohup will redirect stdout to the nohup.out file. You can babysit the output in real time by tailing that output file:
$ tail -f nohup.out
A quick note about the nohup.out file:
nohup.out    The output file of the nohup execution if standard output is a terminal and if the current directory is writable.
Or redirect the output to your own file and append & to run the python script in the background, then tail the logs:
$ nohup python python_script.py > my_output.log &
$ tail -f my_output.log
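After logging back in later, you can also check that the backgrounded script is still alive, for example with pgrep (the script name here matches the question):
$ pgrep -fl python_script.py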
You can use nohup
chmod +x /path/to/script.py
nohup python /path/to/script.py &
Or
Instead of closing your terminal, use logout. It is not a SIGHUP when you do logout, thus the shell won't send a SIGHUP to any of its children.
I usually use:
nohup python -u myscript.py &> ./mylog.log & # or should I use nohup 2>&1 ? I never remember
to start a background Python process that I'd like to continue running even if I log out, and:
ps aux |grep python
# check for the relevant PID
kill <relevantPID>
It works, but it's annoying to do all these steps.
I've read some methods in which you need to save the PID in some file, but that's even more hassle.
Is there a clean method to easily start / stop a Python script? like:
startpy myscript.py # will automatically continue running in
# background even if I log out
# two days later, even if I logged out / logged in again the meantime
stoppy myscript.py
Or could this long part nohup python -u myscript.py &> ./mylog.log & be written in the shebang of the script, such that I could start the script easily with ./myscript.py instead of writing the long nohup line?
Note : I'm looking for a one or two line solution, I don't want to have to write a dedicated systemd service for this operation.
As far as I know, there are just two (or maybe three or maybe four?) solutions to the problem of running background scripts on remote systems.
1) nohup
nohup python -u myscript.py > ./mylog.log 2>&1 &
1 bis) disown
Same as above, slightly different because it actually removes the program from the shell's job list, preventing the SIGHUP from being sent.
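A minimal sketch (disown with no arguments acts on the most recently backgrounded job; the file names follow the question):

$ python -u myscript.py > ./mylog.log 2>&1 &
$ disown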
2) screen (or tmux as suggested by neared)
Here you will find a starting point for screen.
See this post for a great explanation of how background processes work. Another related post.
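A minimal screen workflow looks like this (the session name myjob is arbitrary):

$ screen -S myjob          # start a named session
$ python -u myscript.py    # run the script inside it, then detach with Ctrl-A D
$ screen -r myjob          # later, after logging back in: reattach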
3) Bash
Another solution is to write two bash functions that do the job:
mynohup () {
    [[ "$1" = "" ]] && echo "usage: mynohup python_script" && return 0
    nohup python -u "$1" > "${1%.*}.log" 2>&1 < /dev/null &
}

mykill() {
    ps -ef | grep "$1" | grep -v grep | awk '{print $2}' | xargs kill
    echo "process $1 killed"
}
Just put the above functions in your ~/.bashrc or ~/.bash_profile and use them as normal bash commands.
Now you can do exactly what you asked for:
mynohup myscript.py # will automatically continue running in
# background even if I log out
# two days later, even if I logged out / logged in again the meantime
mykill myscript.py
4) Daemon
This daemon module is very useful:
python myscript.py start
python myscript.py stop
Do you mean log in and out remotely (e.g. via SSH)? If so, a simple solution is to install tmux (terminal multiplexer). It creates a server for terminals that run underneath it as clients. You open up tmux with tmux, type in your command, type in CONTROL+B+D to 'detach' from tmux, and then type exit at the main terminal to log out. When you log back in, tmux and the processes running in it will still be running.
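For reference, the same workflow as commands (the session name myjob is arbitrary):

$ tmux new -s myjob        # start a named session
$ python -u myscript.py    # run the script inside it, then detach with Ctrl-B D
$ tmux attach -t myjob     # later, after logging back in: reattach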
I've created a simple init script for an application I'm building. The start part of the script looks like this:
user="ec2-user"
name=`basename $0`
pid_file="/var/run/python_worker.pid"
stdout_log="/var/log/worker/worker.log"
stderr_log="/var/log/worker/worker.err"
get_pid() {
    cat "$pid_file"
}

is_running() {
    [ -f "$pid_file" ] && ps `get_pid` > /dev/null 2>&1
}

case "$1" in
    start)
        if is_running; then
            echo "Already started"
        else
            echo "Starting $name"
            cd /var/lib/worker
            . venv/bin/activate
            . /etc/profile.d/worker.sh
            python run.py >> "$stdout_log" 2>> "$stderr_log" &
            echo $! > "$pid_file"
            if ! is_running; then
                echo "Unable to start, see $stdout_log and $stderr_log"
                exit 1
            fi
            echo "$name running"
        fi
I'm having trouble with this line:
python run.py >> "$stdout_log" 2>> "$stderr_log" &
I want to start my application with this code and redirect the outputs to the files specified above. However, when I include the & to make it run in the background, nothing appears in the two log files. BUT, when I remove the & from this line, the log files get data. Why is this happening?
Obviously I need to run the command as a background process in order to stop the shell from waiting.
I am also sure that the process is running when I use the &. I can find it with a ps -aux :
root 11357 7.0 3.1 474832 31828 pts/1 Sl 21:22 0:00 python run.py
Anyone know what I'm doing wrong? :)
Short Answer:
Yes. Add -u to the python command and it should work:
python -u run.py >> "$stdout_log" 2>> "$stderr_log" &
Long Answer:
It's a buffering issue (from man python):
-u     Force stdin, stdout and stderr to be totally unbuffered. On systems where it matters, also put stdin, stdout and stderr in binary mode. Note that there is internal buffering in xreadlines(), readlines() and file-object iterators ("for line in sys.stdin") which is not influenced by this option. To work around this, you will want to use "sys.stdin.readline()" inside a "while 1:" loop.
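As an alternative to changing the command line, Python also honours the PYTHONUNBUFFERED environment variable, which is equivalent to passing -u, so the init script could set it just before launching:

export PYTHONUNBUFFERED=1
python run.py >> "$stdout_log" 2>> "$stderr_log" &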
I am trying to run a shell script with the nohup command. The shell script takes an array of files, runs a python program on each file in a loop, and appends the output to a file. This works fine on the server, but if I try to use the nohup command it does not work. I have successfully run other programs using nohup on this server, just not this script.
#!/bin/sh
ARRAY=(0010.dat 0020.dat 0030.dat)
rm batch_results.dat
touch batch0.dat
touch batch_results.dat
for file in ${ARRAY[@]}
do
    python fof.py $file > /dev/null
    python mdisk5.py > ./batch0.dat
    tail -1 batch0.dat
    tail -1 batch0.dat >> batch_results.dat
done
The program works fine when I run it while staying connected to the server, for example
./batch.sh > /dev/null &
./batch.sh > ./output.txt &
However, when I try to run it with the nohup command,
nohup ./batch.sh > /dev/null &
if I exit the server and come back, the output file (batch_results.dat) does not have any data.
I am sure I am missing some simple fix or command in here. Any ideas?
Edit:
The program fof.py produces two files that are used as input for mdisk5.py.
When I exit the server while running nohup, these two files are produced, but only for the first input file '0010.dat'. The output files batch0.dat and batch_results.dat remain empty.
Here's your problem:
#!/bin/sh
sh does not support arrays. Either change your shebang line to invoke a shell that does support arrays, like bash, or use a normal, whitespace-separated string of your data files in a loop, like
DAT_FILES="0010.dat 0020.dat 0030.dat"
for file in $DAT_FILES
do
...
done
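For completeness, a sketch of the corrected batch.sh with a bash shebang, keeping the array from the question (rm -f and the quoting of $file are small additions for safety):

#!/bin/bash
ARRAY=(0010.dat 0020.dat 0030.dat)

rm -f batch_results.dat
touch batch0.dat
touch batch_results.dat

for file in "${ARRAY[@]}"
do
    python fof.py "$file" > /dev/null
    python mdisk5.py > ./batch0.dat
    tail -1 batch0.dat
    tail -1 batch0.dat >> batch_results.dat
done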