I want to launch a background Python job from a bash script and then gracefully kill it with SIGINT. This works fine from the shell, but I can't seem to get it to work in a script.
loop.py:
#! /usr/bin/env python
if __name__ == "__main__":
    try:
        print 'starting loop'
        while True:
            pass
    except KeyboardInterrupt:
        print 'quitting loop'
From the shell I can interrupt it:
$ python loop.py &
[1] 15420
starting loop
$ kill -SIGINT 15420
quitting loop
[1]+ Done python loop.py
kill.sh:
#! /bin/bash
python loop.py &
PID=$!
echo "sending SIGINT to process $PID"
kill -SIGINT $PID
But from a script I can't:
$ ./kill.sh
starting loop
sending SIGINT to process 15452
$ ps ax | grep loop.py | grep -v grep
15452 pts/3 R 0:08 python loop.py
And, if it's been launched from a script I can no longer kill it from the shell:
$ kill -SIGINT 15452
$ ps ax | grep loop.py | grep -v grep
15452 pts/3 R 0:34 python loop.py
I'm assuming I'm missing some fine point of bash job control.
You're not registering a signal handler. Try the code below; it seems to work fairly reliably. I think the rare exception is when the signal arrives before Python has registered the script's handler. Note that KeyboardInterrupt is only supposed to be raised "when the user hits the interrupt key". I think the fact that it works at all for an explicit SIGINT (e.g. sent via kill) is an accident of implementation.
import signal
import sys

def quit_gracefully(*args):
    print 'quitting loop'
    sys.exit(0)

if __name__ == "__main__":
    # Install the handler explicitly so SIGINT is handled even if it was inherited as ignored.
    signal.signal(signal.SIGINT, quit_gracefully)
    try:
        print 'starting loop'
        while True:
            pass
    except KeyboardInterrupt:
        quit_gracefully()
In addition to @matthew-flaschen's answer, you can use exec in the bash script so that the background subshell is effectively replaced by the process being launched:
#!/bin/bash
exec python loop.py &
PID=$!
sleep 5 # waiting for the python process to come up
echo "sending SIGINT to process $PID"
kill -SIGINT $PID
I agree with Matthew Flaschen; the problem is with python, which apparently doesn't register the KeyboardInterrupt exception with SIGINT when it's not called from an interactive shell.
Of course, nothing prevents you from registering your signal handler like this:
def signal_handler(signum, frame):
    raise KeyboardInterrupt("Signal handler")
When you run a command in the background with & from a script (where job control is not in effect), SIGINT will be ignored.
Here's the relevant section of man bash:
Non-builtin commands run by bash have signal handlers set to the values inherited by the shell from
its parent. When job control is not in effect, asynchronous commands ignore SIGINT and SIGQUIT in
addition to these inherited handlers. Commands run as a result of command substitution ignore the
keyboard-generated job control signals SIGTTIN, SIGTTOU, and SIGTSTP.
I think you need to set the signal handler explicitly, as Matthew commented.
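To see the quoted man page passage in action, here is a small Python sketch (not from the answers here). It assumes bash and Python 3.7+ are available; CHILD and sigint_ignored are illustrative names. The child only reports whether it inherited SIGINT as ignored, once started in the foreground and once with &.

```python
import subprocess
import sys

# Child program: report whether SIGINT was inherited as "ignored".
CHILD = "import signal; print(signal.getsignal(signal.SIGINT) == signal.SIG_IGN)"


def sigint_ignored(background):
    """Run CHILD from a non-interactive bash, optionally in the background."""
    # "$0" is the Python interpreter and "$1" is the child program text.
    script = '"$0" -c "$1" & wait' if background else '"$0" -c "$1"'
    result = subprocess.run(
        ["bash", "-c", script, sys.executable, CHILD],
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip() == "True"


if __name__ == "__main__":
    print("foreground ignores SIGINT:", sigint_ignored(background=False))
    print("background ignores SIGINT:", sigint_ignored(background=True))
```

The background case should report that SIGINT is ignored, which is why KeyboardInterrupt never fires in loop.py when it is started from kill.sh.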
The script kill.sh also has a problem. Since loop.py is sent to the background, there's no guarantee that python loop.py is up and running by the time kill runs.
#! /bin/bash
python loop.py &
PID=$!
#
# NEED TO WAIT ON EXISTENCE OF python loop.py PROCESS HERE.
#
echo "sending SIGINT to process $PID"
kill -SIGINT $PID
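One way to fill in that "wait" step, sketched in Python rather than bash: have the launcher block until the child prints its first line, and only then send SIGINT. This is an illustration under assumptions (Python 3.7+, POSIX); LOOP is a stand-in for loop.py that also restores the default SIGINT handler, and start_wait_and_interrupt is a made-up name.

```python
import signal
import subprocess
import sys

# Stand-in for loop.py; it restores the default SIGINT handler first so that
# KeyboardInterrupt is raised even if SIGINT was inherited as ignored.
LOOP = """
import signal, sys, time
signal.signal(signal.SIGINT, signal.default_int_handler)
try:
    print('starting loop')
    sys.stdout.flush()
    while True:
        time.sleep(0.1)
except KeyboardInterrupt:
    print('quitting loop')
"""


def start_wait_and_interrupt():
    proc = subprocess.Popen([sys.executable, "-c", LOOP],
                            stdout=subprocess.PIPE, text=True)
    # Block until the child announces that it is inside the try block.
    first_line = proc.stdout.readline().strip()
    proc.send_signal(signal.SIGINT)
    rest, _ = proc.communicate(timeout=10)
    return first_line, rest.strip(), proc.returncode


if __name__ == "__main__":
    print(start_wait_and_interrupt())
```

Waiting for an explicit "ready" line avoids guessing a sleep duration, at the cost of requiring the child to flush its output.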
I tried @Steen's approach but, alas, it apparently does not hold on Mac.
Another solution, pretty much the same as the above but a little more general, is to just re-install the default handler if SIGINT is being ignored:
import signal

def _ensure_sigint_handler():
    # On Mac, even using `exec <cmd>` in `bash` still yields an ignored SIGINT.
    if signal.getsignal(signal.SIGINT) == signal.SIG_IGN:
        signal.signal(signal.SIGINT, signal.default_int_handler)

# ...
_ensure_sigint_handler()
I have a shell script calling Python inside it.
#! /bin/bash
shopt -s extglob
echo "====test===="
~/.conda/envs/my_env/bin/python <<'EOF'
import sys
import os
try:
    print("inside python")
    x = 2/0
except Exception as e:
    print("Exception: %s" % e)
    sys.exit(2)
print("at the end of python")
EOF
echo "end of script"
If I execute this, the lines below still get printed.
"end of script"
I want to exit the shell script from the exception block of the Python code, so that the script never gets past the EOF marker.
Is there a way to create and kill a subprocess in the except block above that will kill the entire shell script?
Can I spawn a dummy subprocess and kill it inside the exception block, thereby killing the entire shell script?
Any examples would be helpful.
Thanks in advance.
The whole EOF ... EOF block gets executed within the Python runtime so exiting from it doesn't affect the bash script. You'll need to collect the exit status and check it after the Python execution if you want to stop the further bash script progress, i.e.:
#!/bin/bash
~/.conda/envs/my_env/bin/python <<'EOF'
import sys
sys.exit(0x01) # use any exit code from 0-0xFF range, comment out for a clean exit
print("End of the Python script that will not execute without commenting out the above.")
EOF
exit_status=$? # store the exit status for later use
# now let's check the exit status and see if python returned a non-zero exit status
if [ $exit_status -ne 0 ]; then
    echo "Python exited with a non-zero exit status, abort!"
    exit $exit_status # exit the bash script with the same status
fi
# continue as usual...
echo "All is good, end of script"
From the shell script you have 2 options:
set -e: all errors quit the script
check python subcommand return code, abort if non-zero
(maybe more details here: Aborting a shell script if any command returns a non-zero value?)
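As a quick check of the set -e option, here is a hedged Python sketch that runs a tiny bash script in which the Python step exits with status 2. It assumes bash and Python 3.7+ are installed; SCRIPT and run_script are illustrative names, not part of the question.

```python
import subprocess
import sys

# "$0" is replaced by the Python interpreter path passed after the script text.
SCRIPT = """
set -e
"$0" -c 'import sys; sys.exit(2)'
echo "end of script"
"""


def run_script():
    result = subprocess.run(["bash", "-c", SCRIPT, sys.executable],
                            capture_output=True, text=True)
    return result.returncode, result.stdout


if __name__ == "__main__":
    print(run_script())
```

With set -e in place, the failing Python step ends the script with the same status, so the final echo is never reached.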
Now, if you don't want to change the handling from your shell script, you could get the parent process of the python script and kill it:
except Exception as e:
    import os,signal,sys
    print("Exception: %s" % e)
    os.kill(os.getppid(),signal.SIGTERM)
    sys.exit(2)
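If you need this on Windows, note that os.kill does not deliver POSIX-style signals there (any signal other than CTRL_C_EVENT or CTRL_BREAK_EVENT simply terminates the target unconditionally), so you may prefer to adapt it to invoke taskkill: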
subprocess.call(["taskkill","/F","/PID",str(os.getppid())])
Now I would say that killing the parent process is bad practice. Unless you don't control the code of this parent process, you should try to handle the exit gracefully.
One way to kill the entire script could be to save the shell's PID and then use Python's system commands to run kill on that PID when the exception happens. With os imported, it would be something along the lines of:
# In the shell script: export the script's PID so Python can read it
export PID=$$
...
# In Python, when the exception happens
os.system('kill -9 ' + os.environ['PID'])
Why
import subprocess
p = subprocess.Popen(["/bin/bash", "-c", "timeout -s KILL 1 sleep 5 2>/dev/null"])
p.wait()
print(p.returncode)
returns
[stderr:] /bin/bash: line 1: 963663 Killed timeout -s KILL 1 sleep 5 2> /dev/null
[stdout:] 137
when
import subprocess
p = subprocess.Popen(["/bin/bash", "-c", "timeout -s KILL 1 sleep 5"])
p.wait()
print(p.returncode)
returns
[stdout:] -9
If you change bash to dash, you'll get 137 in both cases. I know that -9 is the KILL code and 137 is 128 + 9. But it seems weird for similar code to produce different return codes.
Happens on Python 2.7.12 and python 3.4.3
Looks like Popen.wait() does not call Popen._handle_exitstatus https://github.com/python/cpython/blob/3.4/Lib/subprocess.py#L1468 when using /bin/bash but I could not figure out why.
This is due to how bash executes timeout with or without redirection, pipes, or any other bash features:
With redirection
python starts bash
bash starts timeout, monitors the process and does pipe handling.
timeout transfers itself into a new process group and starts sleep
After one second, timeout sends SIGKILL into its process group
As the process group died, bash returns from waiting for timeout, sees the SIGKILL and prints the message pasted above to stderr. It then sets its own exit status to 128+9 (a behaviour simulated by timeout).
Without redirection
python starts bash.
bash sees that it has nothing to do on its own and calls execve() to effectively replace itself with timeout.
timeout acts as above, the whole process group dies with SIGKILL.
python gets an exit status of 9 and does some mangling to turn this into -9 (SIGKILL)
In other words, without redirection/pipes/etc. bash withdraws itself from the call-chain. Your second example looks like subprocess.Popen() is executing bash, yet effectively it does not. bash is no longer there when timeout does its deed, which is why you don't get any messages and an unmangled exit status.
If you want consistent behaviour, use timeout --foreground; you'll get an exit status of 124 in both cases.
I don't know about dash, but I suppose it does not do any execve() trickery to effectively replace itself with the only program it's executing. Therefore you always see the mangled exit status of 128+9 in dash.
Update: zsh shows the same behaviour, although it drops out even for simple redirections such as timeout -s KILL 1 sleep 5 >/tmp/foo and the like, giving you an exit status of -9. timeout -s KILL 1 sleep 5 && echo $? will give you status 137 in zsh also.
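The two conventions can be reproduced without timeout at all. The sketch below is only an illustration (it assumes bash, POSIX signals and Python 3; SELF_KILL and the function names are made up): a child kills itself with SIGKILL, once as Python's direct child and once with bash kept alive in between.

```python
import signal
import subprocess
import sys

# Child that kills itself with SIGKILL, standing in for the timed-out command.
SELF_KILL = "import os, signal; os.kill(os.getpid(), signal.SIGKILL)"


def direct_returncode():
    # No shell in between: Popen reports death-by-signal as a negative number.
    return subprocess.run([sys.executable, "-c", SELF_KILL]).returncode


def via_shell_returncode():
    # The trailing "exit $?" keeps bash alive as the parent, so Python only
    # sees bash's own exit status, which encodes the signal as 128 + N.
    script = '"$0" -c "$1"; exit $?'
    return subprocess.run(["bash", "-c", script, sys.executable, SELF_KILL],
                          stderr=subprocess.DEVNULL).returncode


def shell_style(returncode):
    """Convert a Popen-style return code to the shell's 128 + N convention."""
    return 128 - returncode if returncode < 0 else returncode


if __name__ == "__main__":
    print(direct_returncode(), via_shell_returncode())
```

If you need one consistent number regardless of whether a shell stayed in the chain, normalising with something like shell_style() is one option.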
My goal is simple: kick off rsync and DO NOT WAIT.
Python 2.7.9 on Debian
Sample code:
rsync_cmd = "/usr/bin/rsync -a -e 'ssh -i /home/myuser/.ssh/id_rsa' {0}@{1}:'{2}' {3}".format(remote_user, remote_server, file1, file1)
rsync_cmd2 = "/usr/bin/rsync -a -e 'ssh -i /home/myuser/.ssh/id_rsa' {0}@{1}:'{2}' {3} &".format(remote_user, remote_server, file1, file1)
rsync_path = "/usr/bin/rsync"
rsync_args = shlex.split("-a -e 'ssh -i /home/mysuser/.ssh/id_rsa' {0}@{1}:'{2}' {3}".format(remote_user, remote_server, file1, file1))
#subprocess.call(rsync_cmd, shell=True) # This isn't supposed to work but I tried it
#subprocess.Popen(rsync_cmd, shell=True) # This is supposed to be the solution but not for me
#subprocess.Popen(rsync_cmd2, shell=True) # Adding my own shell "&" to background it, still fails
#subprocess.Popen(rsync_cmd, shell=True, stdin=None, stdout=None, stderr=None, close_fds=True) # This doesn't work
#subprocess.Popen(shlex.split(rsync_cmd)) # This doesn't work
#os.execv(rsync_path, rsync_args) # This doesn't work
#os.spawnv(os.P_NOWAIT, rsync_path, rsync_args) # This doesn't work
#os.system(rsync_cmd2) # This doesn't work
print "DONE"
(I've commented out the execution commands only because I'm actually keeping all of my trials in my code so that I know what I've done and what I haven't done. Obviously, I would run the script with the right line uncommented.)
What happens is this...I can watch the transfer on the server and when it's finished, then I get a "DONE" printed to the screen.
What I'd like to have happen is a "DONE" printed immediately after issuing the rsync command and for the transfer to start.
Seems very straight-forward. I've followed details outlined in other posts, like this one and this one, but something is preventing it from working for me.
Thanks ahead of time.
(I have tried everything I can find in StackExchange and don't feel like this is a duplicate because I still can't get it to work. Something isn't right in my setup and need help.)
Here is a verified example for the Python REPL:
>>> import subprocess
>>> import sys
>>> p = subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(100)'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT); print('finished')
finished
How to verify that via another terminal window:
$ ps aux | grep python
Output:
user 32820 0.0 0.0 2447684 3972 s003 S+ 10:11PM 0:00.01 /Users/user/venv/bin/python -c import time; time.sleep(100)
Popen() starts a child process—it does not wait for it to exit. You have to call .wait() method explicitly if you want to wait for the child process. In that sense, all subprocesses are background processes.
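A minimal sketch of that point, using a sleeping Python child instead of rsync (start_without_waiting is an illustrative name, and the 30 second sleep is arbitrary): Popen returns while the child is still running, so anything after it executes immediately.

```python
import subprocess
import sys


def start_without_waiting(seconds=30):
    """Start a sleeping child and return it immediately."""
    return subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(%d)" % seconds])


if __name__ == "__main__":
    child = start_without_waiting()
    print("DONE")                                  # printed right away
    print("still running:", child.poll() is None)  # poll() never blocks
    child.terminate()                              # clean up the demo child
    child.wait()
```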
On the other hand, the child process may inherit various properties and resources from the parent, such as open file descriptors, the process group, its controlling terminal, some signal configuration, etc. This can prevent ancestor processes from exiting (e.g., see Python subprocess .check_call vs .check_output), or the child may die prematurely on Ctrl-C (the SIGINT signal is sent to the foreground process group) or when the terminal session is closed (SIGHUP).
To disassociate the child process completely, you should make it a daemon. Sometimes something in between could be enough e.g., it is enough to redirect the inherited stdout in a grandchild so that .communicate() in the parent would return when its immediate child exits.
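As an "in between" option, here is a sketch that is not a full daemon but does cut the most common ties: the child gets its own session and its stdio is pointed at /dev/null. It assumes POSIX and Python 3; spawn_in_new_session is a made-up helper name.

```python
import os
import subprocess
import sys


def spawn_in_new_session(argv):
    """Start argv in its own session with stdio detached from the parent."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,   # POSIX only: the child calls setsid()
    )


if __name__ == "__main__":
    child = spawn_in_new_session(
        [sys.executable, "-c", "import time; time.sleep(30)"])
    print("parent session:", os.getsid(0), "child session:", os.getsid(child.pid))
    child.terminate()
    child.wait()
```

Because the child leads a new session, it is no longer in the terminal's foreground process group, so Ctrl-C in the parent's terminal does not reach it.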
I encountered a similar issue while working with QNX devices and wanted a subprocess that runs independently of the main process and keeps running even after the main process terminates.
Here's the solution I found that actually works, creationflags=subprocess.DETACHED_PROCESS (note that this flag is only available on Windows):
import subprocess
import time
# Raw string, so the \t in the path is not read as a tab character.
pid = subprocess.Popen(["python", r"path_to_script\turn_ecu_on.py"], creationflags=subprocess.DETACHED_PROCESS)
time.sleep(15)
print("Done")
Link to the doc: https://docs.python.org/3/library/subprocess.html#subprocess.Popen
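Since DETACHED_PROCESS only exists on Windows, a script meant to run on both Windows and POSIX could pick the keyword arguments per platform. This is a rough sketch under that assumption (Python 3.7+ for the Windows constants); detach_kwargs is a hypothetical helper, not something from the subprocess module.

```python
import os
import subprocess
import sys


def detach_kwargs():
    """Return Popen keyword arguments that detach the child on this platform."""
    if os.name == "nt":
        # Windows: no console inherited, separate process group.
        flags = subprocess.DETACHED_PROCESS | subprocess.CREATE_NEW_PROCESS_GROUP
        return {"creationflags": flags}
    # POSIX: run the child in a new session instead.
    return {"start_new_session": True}


if __name__ == "__main__":
    proc = subprocess.Popen([sys.executable, "-c", "pass"], **detach_kwargs())
    print("started pid", proc.pid)
    proc.wait()
```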
On Ubuntu, the following commands keep running even after the Python app exits.
import os
url = "https://www.youtube.com/watch?v=t3kcqTE6x4A"
cmd = f"mpv '{url}' && zenity --info --text 'you have watched {url}' &"
os.system(cmd)
I have a simple perl script that calls a python script to deploy a server in the cloud.
I capture the exit status of the deployment inside perl to take any further action after success/failure setup.
It's like:
$cmdret = system("python script.py ARG1 ARG2");
Here the python script runs for 3hrs to 7 hrs.
The problem is that, irrespective of the success or failure return status, the system call randomly receives a HUP signal at this step, even if the process is running in the background, and this breaks the subsequent steps.
So does anyone know whether there is a time limit for holding the return status from system that leads to a hangup signal being sent?
Inside the python script script.py, pexpect is used to execute scripts remotely:
doSsh(User,Passwd,Name,'cd '+OutputDir+';python host-bringup.py setup')
doSsh(User,Passwd,Name,'cd '+OpsHome+'/ops/hlevel;python dshost.py start')
....
And doSsh is a pexpect subroutine:
def doSsh(user,password,host,command):
    try:
        child = pexpect.spawn("ssh -o ServerAliveInterval=100 -n %s@%s '%s'" % (user,host,command),logfile=sys.stdout,timeout=None)
        i = child.expect(['password:', r'\(yes\/no\)',r'.*password for paasusr: ',r'.*[$#] ',pexpect.EOF])
        if i == 0:
            child.sendline(password)
        elif i == 1:
            child.sendline("yes")
            child.expect("password:")
            child.sendline(password)
        data = child.read()
        print data
        child.close()
        return True
    except Exception as error:
        print error
        return False
The first doSsh execution takes ~6 hours, and the session is killed after a few hours of execution with the message: Signal HUP caught; exiting. However, the execution of python host-bringup.py setup still runs on the remote host.
So on the local system, the next doSsh never runs, and the remaining steps inside the perl script never continue either.
SIGHUP is sent when the terminal disconnects. When you want to create a process that's not tied to the terminal, you daemonize it.
Note that nohup doesn't daemonize.
$ nohup perl -e'system "ps", "-o", "pid,ppid,sid,cmd"'
nohup: ignoring input and appending output to `nohup.out'
$ cat nohup.out
PID PPID SID CMD
21300 21299 21300 -bash
21504 21300 21300 perl -esystem "ps", "-o", "pid,ppid,sid,cmd"
21505 21504 21300 ps -o pid,ppid,sid,cmd
As you can see,
perl's PPID is that of the program that launched it.
perl's SID is that of the program that launched it.
Since the session hasn't changed, the terminal will send SIGHUP to perl when it disconnects as normal.
That said, nohup changes how perl handles SIGHUP by causing it to be ignored.
$ perl -e'system "kill", "-HUP", "$$"; print "SIGHUP was ignored\n"'
Hangup
$ echo $?
129
$ nohup perl -e'system "kill", "-HUP", "$$"; print "SIGHUP was ignored\n"'
nohup: ignoring input and appending output to `nohup.out'
$ echo $?
0
$ tail -n 1 nohup.out
SIGHUP was ignored
If perl is killed by the signal, it's because something changed how perl handles SIGHUP.
So, either daemonize the process, or have perl ignore SIGHUP (e.g. by using nohup). But if you use nohup, don't re-enable the default SIGHUP behaviour!
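Since script.py here is Python, the nohup experiment can be repeated from Python as well. The sketch below is an illustration only (POSIX, Python 3.7+; CHILD and run_child are made-up names): the child sends SIGHUP to itself, and the parent presets the child's SIGHUP disposition between fork and exec, which is roughly what nohup does.

```python
import signal
import subprocess
import sys

# Child sends SIGHUP to itself, mirroring the perl one-liners above.
CHILD = ("import os, signal; os.kill(os.getpid(), signal.SIGHUP); "
         "print('SIGHUP was ignored')")


def run_child(disposition):
    """Run CHILD with SIGHUP preset to signal.SIG_IGN or signal.SIG_DFL."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        # Runs in the child between fork and exec (POSIX only); an ignored
        # signal stays ignored across exec, which is what nohup relies on.
        preexec_fn=lambda: signal.signal(signal.SIGHUP, disposition),
        capture_output=True, text=True,
    )
    return result.returncode, result.stdout.strip()


if __name__ == "__main__":
    print("ignored:", run_child(signal.SIG_IGN))
    print("default:", run_child(signal.SIG_DFL))
```

With the default disposition, the return code is negative because Python reports death by signal that way; a shell would show 129 instead, as in the perl session.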
If your goal is to make your perl program ignore the HUP signal, you likely just need to set the HUP entry of the $SIG global signal handler hash:
$SIG{ 'HUP' } = 'IGNORE';
For the gory details, see
perldoc perlipc
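The Python side (script.py) can do the equivalent with the signal module. A minimal sketch, assuming POSIX and that it runs in the main thread; survives_sighup is an illustrative name, and it restores the previous handler only to keep the demo self-contained.

```python
import os
import signal


def survives_sighup():
    """Ignore SIGHUP, send it to ourselves, then restore the old handler."""
    previous = signal.signal(signal.SIGHUP, signal.SIG_IGN)
    try:
        os.kill(os.getpid(), signal.SIGHUP)   # would normally terminate us
        return True
    finally:
        # signal.signal() returns None if the old handler was not set from Python.
        if previous is not None:
            signal.signal(signal.SIGHUP, previous)


if __name__ == "__main__":
    print("still alive:", survives_sighup())
```

In a long-running script you would usually set SIG_IGN once at startup and leave it in place.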
I have a problem with the way signals are propagated within a process group. Here is my situation and an explanation of the problem:
I have an application, that is launched by a shell script (with a su). This shell script is itself launched by a python application using subprocess.Popen
I call os.setpgrp as a preexec_function and have verified using ps that the bash script, the su command and the final application all have the same pgid.
Now when I send signal USR1 to the bash script (the leader of the process group), sometimes the application sees this signal, and sometimes not. I can't figure out why I get this random behavior (the signal is seen by the app about 50% of the time).
Here is the example code I am testing against:
Python launcher :
#!/usr/bin/env python
import os
import subprocess
p = subprocess.Popen( ["path/to/bash/script"], stdout=…, stderr=…, preexec_fn=os.setpgrp )
# loop to write stdout and stderr of the subprocesses to a file
# note that I use fcntl.fcntl(p.stdXXX.fileno(), fcntl.F_SETFL, os.O_NONBLOCK)
p.wait()
Bash script :
#!/bin/bash
set -e
set -u
cd /usr/local/share/gios/exchange-manager
CONF=/etc/exchange-manager.conf
[ -f $CONF ] && . $CONF
su exchange-manager -p -c "ruby /path/to/ruby/app"
Ruby application :
#!/usr/bin/env ruby
Signal.trap("USR1") do
  puts "Received SIGUSR1"
  exit
end
while true do
  sleep 1
end
So when I send the signal to the bash wrapper (from a terminal or from the python application), sometimes the ruby application sees the signal and sometimes not. I don't think it's a logging issue, as I have tried replacing the puts with a method that writes directly to a different file.
Do you guys have any idea what could be the root cause of my problem and how to fix it ?
Your signal handler is doing too much. If you exit from within the signal handler, you cannot be sure that your buffers are properly flushed; in other words, you may not be exiting your program gracefully. Also be careful about new signals being received while the program is already inside a signal handler.
Try to modify your Ruby source to exit the program from the main loop as soon as an "exit" flag is set, and don't exit from the signal handler itself.
Your Ruby application becomes:
#!/usr/bin/env ruby
$done = false
Signal.trap("USR1") do
  $done = true
end
until $done do
  sleep 1
end
puts "** graceful exit"
Which should be much safer.
For real programs, you may consider using a Mutex to protect your flag variable.
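For comparison, the same "set a flag, exit from the main loop" pattern might look like this in Python. It is a sketch under assumptions (POSIX, main thread); run_until_usr1 and its send_to_self switch are made up so the example can be run on its own without a second process.

```python
import os
import signal
import time


def run_until_usr1(timeout=5.0, send_to_self=False):
    """Loop until SIGUSR1 sets the flag; return how the loop ended."""
    state = {"done": False}

    def on_usr1(signum, frame):
        # Keep the handler minimal: only record that the signal arrived.
        state["done"] = True

    previous = signal.signal(signal.SIGUSR1, on_usr1)
    try:
        if send_to_self:
            # For demonstration only; normally another process sends it.
            os.kill(os.getpid(), signal.SIGUSR1)
        deadline = time.monotonic() + timeout
        while not state["done"] and time.monotonic() < deadline:
            time.sleep(0.01)
        return "graceful exit" if state["done"] else "timed out"
    finally:
        if previous is not None:
            signal.signal(signal.SIGUSR1, previous)


if __name__ == "__main__":
    print(run_until_usr1(send_to_self=True))
```

All cleanup happens after the loop, in ordinary code, so output buffers are flushed by the normal exit path.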