I'm writing a stop routine for a start-up service script:
do_stop()
{
    rm -f $PIDFILE
    pkill -f $DAEMON || return 1
    return 0
}
The problem is that pkill (same with killall) also matches the process of the script itself, so the script basically terminates itself. How do I fix that?
You can explicitly filter out the current PID from the results:
kill $(pgrep -f "$DAEMON" | grep -v "^$$\$")
To correctly use the -f flag, be sure to supply the whole path to the daemon rather than just a substring. That will prevent you from killing the script (and eliminate the need for the above grep) and also from killing all other system processes that happen to share the daemon's name.
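For example, assuming the daemon's full path is /usr/sbin/mydaemon (a hypothetical path used only for illustration):

pkill -f /usr/sbin/mydaemon    # the stop script's own command line does not contain this path, so it survives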
pkill -f accepts a full blown regex. So rather than pkill -f $DAEMON you should use:
pkill -f "^"$DAEMON
To make sure only if process name starts with the given daemon name then only it is killed.
A better solution is to save the PID of the process in a file when you start it. To stop the process, just read the file to get the PID to be stopped/killed.
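A minimal sketch of that approach, reusing $DAEMON and $PIDFILE from the question and assuming the daemon stays in the foreground so the shell can background it and record its PID:

do_start()
{
    $DAEMON &                  # start the daemon in the background
    echo $! > "$PIDFILE"       # record its PID for the stop routine
}

do_stop()
{
    [ -r "$PIDFILE" ] || return 1
    kill "$(cat "$PIDFILE")"   # signal exactly the recorded process
    rm -f "$PIDFILE"
}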
Judging by your question, you're not hard over on using pgrep and pkill, so here are some other options commonly used.
1) Use killproc from /etc/init.d/functions or /lib/lsb/init-functions (whichever is appropriate for your distribution and version of Linux). If you're writing a service script, you may already be including this file if you used one of the other services as an example.
Usage: killproc [-p pidfile] [ -d delay] {program} [-signal]
The main advantage to using this is that it sends SIGTERM, waits to see if the process terminates and sends SIGKILL only if necessary.
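A minimal sketch of a stop routine built on it, assuming an LSB-style system where /lib/lsb/init-functions provides killproc, and reusing $PIDFILE and $DAEMON from the question:

. /lib/lsb/init-functions

do_stop()
{
    killproc -p "$PIDFILE" "$DAEMON"   # sends TERM, falls back to KILL only if needed
    rm -f "$PIDFILE"
}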
2) You can also use the secret sauce of killproc, which is to find the process IDs to kill using pidof, which has a -o option for excluding a particular process. The argument to -o can be $$ (the current process ID) or %PPID, a special value that pidof interprets as the parent of the pidof process, i.e. the script calling pidof. Finally, if the daemon is a script, you'll need -x so that you're killing the script by its name rather than killing bash or python.
for pid in $(pidof -o %PPID -x progd); do
    kill -TERM $pid
done
You can see an example of this in the article Bash: How to check if your script is already running.
I am aware of how to check if some Python process is running. I am trying to write a script that checks whether a Python script is running; if it is not, it should rerun it.
What I have right now is:
import os
stream = os.popen("ps aux | grep combined.py")
output = stream.read()
print(output[0])
The problem is I can't get the specific process ID this way, because output is just a string of characters, not a dict where I could look up the PID with output["PID"] to check whether there is a PID at all.
How would I implement such script?
In bash script:
#!/bin/bash
pid=$(ps -ef | grep combined.py | grep -v grep | awk '{print $2}')
echo $pid
You can use crontab to run the bash script every few minutes and check whether the python process is running.
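For example, a small check-and-restart script (check_combined.sh and the paths below are hypothetical names used only for illustration) that cron could run every few minutes:

#!/bin/bash
# restart combined.py if no instance is currently running
pid=$(ps -ef | grep combined.py | grep -v grep | awk '{print $2}')
if [ -z "$pid" ]; then
    nohup python /path/to/combined.py >> /path/to/combined.log 2>&1 &
fi

and a crontab entry to run it every five minutes:

*/5 * * * * /path/to/check_combined.sh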
I usually use:
nohup python -u myscript.py &> ./mylog.log & # or should I use nohup 2>&1 ? I never remember
to start a background Python process that I'd like to continue running even if I log out, and:
ps aux |grep python
# check for the relevant PID
kill <relevantPID>
It works, but it's annoying to do all these steps.
I've read some methods in which you need to save the PID in some file, but that's even more hassle.
Is there a clean method to easily start / stop a Python script? like:
startpy myscript.py # will automatically continue running in
# background even if I log out
# two days later, even if I logged out / logged in again the meantime
stoppy myscript.py
Or could this long part nohup python -u myscript.py &> ./mylog.log & be written in the shebang of the script, such that I could start the script easily with ./myscript.py instead of writing the long nohup line?
Note : I'm looking for a one or two line solution, I don't want to have to write a dedicated systemd service for this operation.
As far as I know, there are just two (or maybe three or maybe four?) solutions to the problem of running background scripts on remote systems.
1) nohup
nohup python -u myscript.py > ./mylog.log 2>&1 &
1 bis) disown
Same as above, but slightly different because it actually removes the job from the shell's job list, preventing the SIGHUP from being sent to it when you log out.
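A minimal sketch of this variant:

python -u myscript.py > ./mylog.log 2>&1 &
disown    # remove the job from the shell's job table so the shell won't send it SIGHUP on logout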
2) screen (or tmux as suggested by neared)
Here you will find a starting point for screen.
See this post for a great explanation of how background processes work. Another related post.
3) Bash
Another solution is to write two bash functions that do the job:
mynohup () {
    [[ "$1" = "" ]] && echo "usage: mynohup python_script" && return 0
    nohup python -u "$1" > "${1%.*}.log" 2>&1 < /dev/null &
}

mykill() {
    ps -ef | grep "$1" | grep -v grep | awk '{print $2}' | xargs kill
    echo "process $1 killed"
}
Just put the above functions in your ~/.bashrc or ~/.bash_profile and use them as normal bash commands.
Now you can do exactly what you asked for:
mynohup myscript.py # will automatically continue running in
# background even if I log out
# two days later, even if I logged out / logged in again the meantime
mykill myscript.py
4) Daemon
This daemon module is very useful:
python myscript.py start
python myscript.py stop
Do you mean log in and out remotely (e.g. via SSH)? If so, a simple solution is to install tmux (terminal multiplexer). It creates a server for terminals that run underneath it as clients. You open tmux with tmux, type in your command, press CONTROL+B, then D to 'detach' from tmux, and then type exit at the main terminal to log out. When you log back in, tmux and the processes running in it will still be running.
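For example (myscript is a hypothetical session name):

tmux new -s myscript       # start a new named session
python -u myscript.py      # run the script inside it
# press CONTROL+B, then D to detach; the script keeps running
tmux attach -t myscript    # later, even after logging out and back in, reattach to it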
So we have an application that has celery workers. We start those workers using an upstart file /etc/init/fact-celery.conf that looks like the following:
description "FaCT Celery Worker."
start on runlevel [2345]
stop on runlevel [06]
respawn
respawn limit 10 5
setuid fact
setgid fact
script
    [ -r /etc/default/fact ] && . /etc/default/fact
    if [ "$START_CELERY" != "yes" ]; then
        echo "Service disabled in '/etc/default/fact'. Not starting."
        exit 1
    fi
    ARGUMENTS=""
    if [ "$BEAT_SERVICE" = "yes" ]; then
        ARGUMENTS="--beat"
    fi
    /usr/bin/fact celery worker --loglevel=INFO --events --schedule=/var/lib/fact/celerybeat-schedule --queues=$CELERY_QUEUES $ARGUMENTS
end script
It calls a python wrapper script that looks like the following:
#!/bin/bash
WHOAMI=$(whoami)
PYTHONPATH=/usr/share/fact
PYTHON_BIN=/opt/fact-virtual-environment/bin/python
DJANGO_SETTINGS_MODULE=fact.settings.staging
if [ ${WHOAMI} != "fact" ];
then
    sudo -u fact $0 $*;
else
    # Python needs access to the CWD, but we need to deal with apparmor restrictions
    pushd $PYTHONPATH &> /dev/null
    PYTHONPATH=${PYTHONPATH} DJANGO_SETTINGS_MODULE=${DJANGO_SETTINGS_MODULE} ${PYTHON_BIN} -m fact.managecommand $*;
    popd &> /dev/null
fi
The trouble with this setup is that when we stop the service, we get left-over fact-celery workers that don't die. For some reason upstart can't track the forked processes. I've read in some similar posts that upstart can't track more than two forks.
I've tried using expect fork but then upstart just hangs whenever I try to start or stop the service.
Other posts I've found on this say to call the python process directly instead of using the wrapper script, but we've already built apparmor profiles around these scripts and there are other things in our workflow that are pretty dependent on them.
Is there any way, with the current wrapper scripts, to handle killing off all the celery workers on a service stop?
There is some discussion about this in the Workers Guide, but basically the usual process is to send a TERM signal to the worker, which will cause it to wait for all the currently running tasks to finish before exiting cleanly.
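For example, assuming the main worker's PID is known (the pidfile path below is only an illustration; use however your setup records it):

kill -TERM "$(cat /var/run/celery/worker.pid)"   # worker finishes its current tasks, then exits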
Alternatively, you can send the KILL signal if you want it to stop immediately, with potential data loss, but then, as you said, celery isn't able to intercept the signal and clean up the children in that case. The only recourse that is mentioned is to manually clean up the children like this:
$ ps auxww | grep 'celery worker' | awk '{print $2}' | xargs kill -9
I can't figure out how to close a bash shell that was started via Popen. I'm on windows, and trying to automate some ssh stuff. This is much easier to do via the bash shell that comes with git, and so I'm invoking it via Popen in the following manner:
p = Popen('"my/windows/path/to/bash.exe" | git clone or other commands')
p.wait()
The problem is that after bash runs the commands I pipe into it, it doesn't close. It stays open causing my wait to block indefinitely.
I've tried stringing an "exit" command at the end, but it doesn't work.
p = Popen('"my/windows/path/to/bash.exe" | git clone or other commands && exit')
p.wait()
But still, infinite blocking on the wait. After it finishes its task, it just sits at a bash prompt doing nothing. How do I force it to close?
Try Popen.terminate(); this might help kill your process. If you only have synchronously executing commands, try using subprocess.call() directly. For example:
import subprocess
subprocess.call(["c:\\program files (x86)\\git\\bin\\git.exe",
"clone",
"repository",
"c:\\repository"])
The call returns 0 on success.
The following is an example of using a pipe, but this is a little overcomplicated for most use cases and makes sense only if you are talking to a service that needs interaction (at least in my opinion).
p = subprocess.Popen(["c:\\program files (x86)\\git\\bin\\git.exe",
"clone",
"repository",
"c:\\repository"],
stdout=subprocess.PIPE,
stderr=subprocess.PIPE
)
print p.stderr.read()
fatal: destination path 'c:\repository' already exists and is not an empty directory.
print p.wait()
128
This can be applied to ssh as well.
To kill the process tree, you could use the taskkill command on Windows:
Popen("TASKKILL /F /PID {pid} /T".format(pid=p.pid))
As #Charles Duffy said, your bash usage is incorrect.
To run a command using bash, use the -c parameter:
p = Popen([r'c:\path\to\bash.exe', '-c', 'git clone repo'])
In simple cases, you could use subprocess.check_call instead of Popen().wait():
import subprocess
subprocess.check_call([r'c:\path\to\bash.exe', '-c', 'git clone repo'])
The latter command raises an exception if bash process returns non-zero status (it indicates an error).
I'm unsure if this actually is the problem, but let me explain: I have a python script that gets started by a bash script. The bash script's job is done at that point, but when I grep the ps aux output, the call is still present.
#!/bin/bash
export http_proxy='1.2.3.4:1234'
python -u /home/user/folder/myscript.py -some Parameters >> /folder/Logfile_stout.log 2>&1
If I run ps aux | grep python I get python -u /home/user/folder/myscript.py -some Parameters as a result. According to the logfile the python script closed properly. (Code to end the script is within the script itself.)
The script gets started every hour and I still see all the calls from the hours before.
Thanks in advance for your help, tips or advice!
The parent bash script will remain as long as the child (python script) is running.
If you start the python script in the background (add & at the end of the python line), the parent will exit:
#!/bin/bash
export http_proxy='1.2.3.4:1234'
python -u /home/user/folder/myscript.py -some Parameters >> /folder/Logfile_stout.log 2>&1 &
If you examine the process list (e.g. ps -elf), it will show the child (if it is still running). The child's PPID (parent PID) will be 1 (init) instead of the parent's PID, because the parent doesn't exist any more.
It could eventually be a problem if your python script never exits.
You could make the parent script wait and kill the child, e.g. wait 30 seconds and kill the child if it is still present:
#!/bin/bash
export http_proxy='1.2.3.4:1234'
python -u /home/user/folder/myscript.py -some Parameters >> /folder/Logfile_stout.log 2>&1 &
sleep 30     # give the script 30 seconds to finish
jobs         # show whether the background job is still running
kill %1      # ask it to terminate (TERM)
kill -9 %1   # force-kill it if TERM was not enough
jobs         # confirm it is gone