I have a problem with the way signals are propagated within a process group. Here is my situation and an explanation of the problem:
I have an application that is launched by a shell script (with a su). This shell script is itself launched by a Python application using subprocess.Popen.
I call os.setpgrp as the preexec_fn and have verified using ps that the bash script, the su command and the final application all have the same pgid.
Now when I send signal USR1 to the bash script (the leader of the process group), sometimes the application sees this signal, and sometimes not. I can't figure out why I get this random behavior (the signal is seen by the app about 50% of the time).
Here is the example code I am testing against:
Python launcher :
#!/usr/bin/env python
import os
import subprocess
p = subprocess.Popen( ["path/to/bash/script"], stdout=…, stderr=…, preexec_fn=os.setpgrp )
# loop to write stdout and stderr of the subprocesses to a file
# note that I use fcntl.fcntl(p.stdXXX.fileno(), fcntl.F_SETFL, os.O_NONBLOCK)
p.wait()
Bash script :
#!/bin/bash
set -e
set -u
cd /usr/local/share/gios/exchange-manager
CONF=/etc/exchange-manager.conf
[ -f $CONF ] && . $CONF
su exchange-manager -p -c "ruby /path/to/ruby/app"
Ruby application :
#!/usr/bin/env ruby
Signal.trap("USR1") do
  puts "Received SIGUSR1"
  exit
end
while true do
  sleep 1
end
So when I send the signal to the bash wrapper (from a terminal or from the Python application), sometimes the Ruby application sees the signal and sometimes not. I don't think it's a logging issue, as I have tried replacing the puts with a method that writes directly to a different file.
Do you have any idea what could be the root cause of my problem and how to fix it?
Your signal handler is doing too much. If you exit from within the signal handler, you cannot be sure that your buffers are properly flushed; in other words, your program may not be exiting gracefully. Also be careful about new signals being received while the program is already inside a signal handler.
Try to modify your Ruby source to exit the program from the main loop as soon as an "exit" flag is set, and don't exit from the signal handler itself.
Your Ruby application becomes:
#!/usr/bin/env ruby
$done = false
Signal.trap("USR1") do
  $done = true
end
until $done do
  sleep 1
end
puts "** graceful exit"
Which should be much safer.
For real programs, you may consider using a Mutex to protect your flag variable.
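On the Python launcher side, it may also be worth checking how the signal is sent. A signal sent to the bash PID alone is delivered only to bash, while os.killpg delivers it to every member of the process group. The following is a minimal sketch under stated assumptions, not a diagnosis of the problem above: the helper names and the stand-in child (a small Python program using the same flag-based handler pattern) are hypothetical, and a real su may place the application in a different session or group.

```python
import os
import signal
import subprocess
import sys

def launch_in_new_group(argv):
    """Start argv as the leader of a fresh process group (POSIX only)."""
    return subprocess.Popen(argv, stdout=subprocess.PIPE, preexec_fn=os.setpgrp)

def signal_group(proc, signum=signal.SIGUSR1):
    """Deliver signum to every process in proc's group, not only the leader."""
    os.killpg(os.getpgid(proc.pid), signum)

# Hypothetical stand-in for the real application: the handler only sets a
# flag, and the main loop exits once the flag is set.
CHILD = (
    "import signal, time\n"
    "done = []\n"
    "signal.signal(signal.SIGUSR1, lambda s, f: done.append(1))\n"
    "print('ready', flush=True)\n"
    "while not done:\n"
    "    time.sleep(0.1)\n"
    "print('graceful exit', flush=True)\n"
)

def demo():
    proc = launch_in_new_group([sys.executable, "-c", CHILD])
    proc.stdout.readline()          # wait until the handler is installed
    signal_group(proc)
    out, _ = proc.communicate(timeout=10)
    return out.decode().strip(), proc.returncode
```

Waiting for the "ready" line before signalling avoids a race in which the signal arrives before the handler is installed and the default action terminates the child.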
I have a subprocess that I am running with:
proc = subprocess.Popen("python -u my_script.py", shell=True)
my_script.py should print regularly to stdout, and another, unrelated process is listening to this output, so I can't redirect the output somewhere else.
I want to ensure that the process really is printing regularly and has not got stuck in some loop, etc. Is there a way to check whether stdout has been written to within some amount of time?
Are there any other options to reach this goal?
EDIT
I am using windows
You can create a named pipe with mkfifo and use tee to send your script's output to both the process listening for it and the pipe.
mkfifo blarg
my_script.py | tee blarg | your_greedy_data_processing_instance
tail -f blarg
Instead of tail, you can use an arbitrarily complicated script to study the output and the state of the process generating it (timers, pid checks).
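The same tee-and-watch idea can be sketched in Python without mkfifo, which may help on Windows where named pipes work differently. This is a rough sketch, assuming the monitor is the one that starts the script; the class name OutputWatchdog and its attributes are hypothetical. It passes every line through to its own stdout (so the existing listener can still consume it) and records when the last line arrived.

```python
import subprocess
import sys
import threading
import time

class OutputWatchdog:
    """Run a command, pass its stdout through, and track the last output time."""

    def __init__(self, argv):
        self.proc = subprocess.Popen(argv, stdout=subprocess.PIPE, text=True)
        self.last_output = time.monotonic()
        self.lines = []
        self._thread = threading.Thread(target=self._pump, daemon=True)
        self._thread.start()

    def _pump(self):
        # Reads until the child closes its stdout.
        for line in self.proc.stdout:
            self.last_output = time.monotonic()
            self.lines.append(line)
            sys.stdout.write(line)   # pass the line through unchanged
            sys.stdout.flush()

    def silent_for(self):
        """Seconds elapsed since the child last produced a line."""
        return time.monotonic() - self.last_output

    def join(self):
        """Wait for the child to exit and for the reader thread to drain."""
        self.proc.wait()
        self._thread.join()
```

A supervising loop could then poll silent_for() every few seconds and react once it exceeds whatever threshold counts as "stuck" for the script in question.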
It appears that the access time and modification time of /dev/stdout is updated regularly. Note, however, that /dev/stdout will always be a soft link -- er, a symbolic link, I mean -- to the file handle of stdout for the process that's checking /dev/stdout. I.e., /dev/stdout links to /proc/self/fd/1.
So it seems that you could check the first file descriptor of your process to see if its modification time has changed, e.g.:
$ stat -c %y -L /proc/10830/fd/1
2021-05-13 02:34:00.367857061
-L means act on the target of the soft link, not the soft link itself; -c %y is just asking for the modification time. This Python script is running as process 10830 on my system right now, and it's occasionally updating the modification time (about every 8 seconds):
>>> import time
>>> while True: time.sleep(1); print("still alive")
still alive
still alive
still alive
....
You should Google this answer to be sure that the behavior I'm seeing is reliable, though, because I've never read anything about it before.
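The stat check above could be wrapped in a few lines of Python. This is only a sketch under the same caveat: it assumes a Linux-style /proc filesystem and that the modification time of the target really does track writes, which has not been verified here. The function names and the 30-second threshold are hypothetical.

```python
import os
import time

def proc_stdout_path(pid):
    """Path to another process's stdout (assumes /proc is available)."""
    return "/proc/{}/fd/1".format(pid)

def seconds_since_modified(path):
    """Age of the last modification; os.stat follows symlinks, like stat -L."""
    return time.time() - os.stat(path).st_mtime

def looks_stuck(pid, max_silence=30.0):
    """True if the target's stdout has not been modified for max_silence seconds."""
    return seconds_since_modified(proc_stdout_path(pid)) > max_silence
```

For example, looks_stuck(10830) would inspect the process shown in the stat command above.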
Alternatively, you could either (a) trust that the script is fine -- which it will, of course, always be (unless it's catching exceptions and refusing to exit even if it can no longer do anything useful, in which case you should change it to die the way it should), or (b) set up a daemon to do something like send a signal to the script, at which point the script could send a signal to the daemon to say "I'm still alive." There's literally no reason to do that, in my opinion, but how you write your programs is up to you.
So assuming that you want to press forward with this, here's a trivial example of the daemon that would monitor the script you want to make sure isn't stuck in a loop or something:
import time
import signal
import os
import sys
# keep a timestamp of when we receive a response
response_timestamp = time.time()
# add code here to get the process ID of the other script
other_pid = 0
def sig_handler(signum, frame):
    global response_timestamp
    response_timestamp = time.time()
if __name__ == '__main__':
    # make sure that when we receive SIGBREAK, sig_handler() gets called
    signal.signal(signal.SIGBREAK, sig_handler)
    while True:
        # send CTRL_BREAK_EVENT, which "other_pid" receives as SIGBREAK
        # (on Windows, os.kill() with any other value terminates the target;
        # the target must be a process group sharing this console)
        os.kill(other_pid, signal.CTRL_BREAK_EVENT)
        time.sleep(15)
        if time.time() - 20 > response_timestamp:
            print("the other process is frozen")
            # os.EX_SOFTWARE only exists on Unix, so use a plain exit code
            sys.exit(1)
Then you add this to the other script that you're monitoring:
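import signal
import os
# add code here to get the process ID
other_pid = 0
def sig_handler(signum, frame):
    # reply with CTRL_BREAK_EVENT, which the daemon receives as SIGBREAK
    os.kill(other_pid, signal.CTRL_BREAK_EVENT)
# register the handler so that SIGBREAK from the daemon triggers a reply
signal.signal(signal.SIGBREAK, sig_handler)
...
...
(rest of your script)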
Now be aware that the only thing this will do is make sure that the process isn't completely frozen. Regrettably, Windows doesn't have a great deal of options when it comes to signals: SIGBREAK was the best one that I saw, but note that it's the signal received by a process when you hit CTRL+BREAK (so if you manually hit CTRL+BREAK in the window running the Python program, it won't kill it, it will just make it call sig_handler()).
I would also be remiss if I did not tell you that, even though this will probably work just fine, doing almost anything inside a signal handler function is not strictly safe. It's bad form and may blow up on you unexpectedly, although in practice it rarely does.
I have two python scripts that use two different cameras for a project I am working on and I am trying to run them both inside a different script or within each other, either way is fine.
import os
os.system('python 1.py')
os.system('python 2.py')
My problem however is that they don't run at the same time, I have to quit the first one for the next to open. I also tried doing it with bash as well with the & shell operator
python 1.py &
python 2.py &
And this does in fact make them both run; however, the issue is that they both run endlessly in the background, and I need to be able to close them easily. Any suggestions on what I can do to avoid the issues with these implementations?
You could do it with multiprocessing
import os
import time
import psutil
from multiprocessing import Process
def run_program(cmd):
    # Function that processes will run
    os.system(cmd)
# Initiating Processes with desired arguments
program1 = Process(target=run_program, args=('python 1.py',))
program2 = Process(target=run_program, args=('python 2.py',))
# Start our processes simultaneously
program1.start()
program2.start()
def kill(proc_pid):
    process = psutil.Process(proc_pid)
    for proc in process.children(recursive=True):
        proc.kill()
    process.kill()
# Wait 5 seconds and kill first program
time.sleep(5)
kill(program1.pid)
program1.join()
# Wait another 1 second and kill second program
time.sleep(1)
kill(program2.pid)
program2.join()
# Print current status of our programs
print('1.py alive status: {}'.format(program1.is_alive()))
print('2.py alive status: {}'.format(program2.is_alive()))
One possible method is to use systemd to control your process (i.e. treat them as daemons).
This is how I control my Python servers, since they need to run in the background and be completely detached from the current tty so that I can close my connection to the machine and the processes continue. You can then also stop the server later using systemctl, as explained below.
Instructions:
Create a .service file and save it in /etc/systemd/system, with contents along the lines of:
[Unit]
Description=daemon one
[Service]
ExecStart=/path/to/1.py
and repeat with one going to 2.py.
Then you can use systemctl to control your daemons.
First reload all config files with:
systemctl daemon-reload
then start either of your daemons (where my_daemon.service is one of your unit files):
systemctl start my_daemon
it should now be running and you should find it in:
systemctl list-units
You can also check its status with:
systemctl status my_daemon
and stop/restart them with:
systemctl stop|restart my_daemon
Use subprocess.Popen. This will create a child process, and the returned object exposes its pid.
from subprocess import Popen
pid = Popen(["python", "1.py"]).pid
And then check out these functions for communicating with the child process and checking if it is still running.
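As a rough illustration of those functions, the sketch below starts several commands at once and later stops them using poll(), terminate(), wait() and kill() from the Popen object. The helper names start_all and stop_all are hypothetical, and the 5-second grace period is an arbitrary assumption.

```python
import subprocess
import sys

def start_all(commands):
    """Start every command without waiting; returns the Popen objects."""
    return [subprocess.Popen(cmd) for cmd in commands]

def stop_all(procs, timeout=5):
    """Ask each running child to terminate, then force-kill any stragglers."""
    for p in procs:
        if p.poll() is None:        # None means the child is still running
            p.terminate()
    codes = []
    for p in procs:
        try:
            codes.append(p.wait(timeout=timeout))
        except subprocess.TimeoutExpired:
            p.kill()
            codes.append(p.wait())
    return codes

if __name__ == "__main__":
    procs = start_all([[sys.executable, "1.py"], [sys.executable, "2.py"]])
    input("Press Enter to stop both scripts\n")
    print(stop_all(procs))
```

Keeping the Popen objects rather than only the pids is what makes closing both scripts easy later on.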
My goal is simple: kick off rsync and DO NOT WAIT.
Python 2.7.9 on Debian
Sample code:
rsync_cmd = "/usr/bin/rsync -a -e 'ssh -i /home/myuser/.ssh/id_rsa' {0}@{1}:'{2}' {3}".format(remote_user, remote_server, file1, file1)
rsync_cmd2 = "/usr/bin/rsync -a -e 'ssh -i /home/myuser/.ssh/id_rsa' {0}@{1}:'{2}' {3} &".format(remote_user, remote_server, file1, file1)
rsync_path = "/usr/bin/rsync"
rsync_args = shlex.split("-a -e 'ssh -i /home/mysuser/.ssh/id_rsa' {0}@{1}:'{2}' {3}".format(remote_user, remote_server, file1, file1))
#subprocess.call(rsync_cmd, shell=True) # This isn't supposed to work but I tried it
#subprocess.Popen(rsync_cmd, shell=True) # This is supposed to be the solution but not for me
#subprocess.Popen(rsync_cmd2, shell=True) # Adding my own shell "&" to background it, still fails
#subprocess.Popen(rsync_cmd, shell=True, stdin=None, stdout=None, stderr=None, close_fds=True) # This doesn't work
#subprocess.Popen(shlex.split(rsync_cmd)) # This doesn't work
#os.execv(rsync_path, rsync_args) # This doesn't work
#os.spawnv(os.P_NOWAIT, rsync_path, rsync_args) # This doesn't work
#os.system(rsync_cmd2) # This doesn't work
print "DONE"
(I've commented out the execution commands only because I'm actually keeping all of my trials in my code so that I know what I've done and what I haven't done. Obviously, I would run the script with the right line uncommented.)
What happens is this...I can watch the transfer on the server and when it's finished, then I get a "DONE" printed to the screen.
What I'd like to have happen is a "DONE" printed immediately after issuing the rsync command and for the transfer to start.
Seems very straight-forward. I've followed details outlined in other posts, like this one and this one, but something is preventing it from working for me.
Thanks ahead of time.
(I have tried everything I can find in StackExchange and don't feel like this is a duplicate because I still can't get it to work. Something isn't right in my setup and need help.)
Here is a verified example for the Python REPL:
>>> import subprocess
>>> import sys
>>> p = subprocess.Popen([sys.executable, '-c', 'import time; time.sleep(100)'], stdout=subprocess.PIPE, stderr=subprocess.STDOUT); print('finished')
finished
How to verify that via another terminal window:
$ ps aux | grep python
Output:
user 32820 0.0 0.0 2447684 3972 s003 S+ 10:11PM 0:00.01 /Users/user/venv/bin/python -c import time; time.sleep(100)
Popen() starts a child process—it does not wait for it to exit. You have to call .wait() method explicitly if you want to wait for the child process. In that sense, all subprocesses are background processes.
On the other hand, the child process may inherit various properties and resources from the parent, such as open file descriptors, the process group, its controlling terminal, and some signal configuration. This may prevent ancestor processes from exiting (e.g., see Python subprocess .check_call vs .check_output), or the child may die prematurely on Ctrl-C (the SIGINT signal is sent to the foreground process group) or when the terminal session is closed (SIGHUP).
To disassociate the child process completely, you should make it a daemon. Sometimes something in between could be enough e.g., it is enough to redirect the inherited stdout in a grandchild so that .communicate() in the parent would return when its immediate child exits.
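One "in between" option on POSIX systems can be sketched as follows: start the child in its own session and point its standard streams at the null device, so that it does not share the parent's terminal, process group, or pipes. This is a sketch rather than a full daemonization, and the helper name start_detached is hypothetical. start_new_session requires Python 3.2 or later; on Python 2.7, preexec_fn=os.setsid is the closest equivalent.

```python
import os
import subprocess
import sys

def start_detached(argv):
    """Start argv in a new session with no inherited stdio (POSIX only)."""
    return subprocess.Popen(
        argv,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        start_new_session=True,   # the child calls setsid() before exec
    )

if __name__ == "__main__":
    child = start_detached([sys.executable, "-c", "import time; time.sleep(30)"])
    print("DONE, child session id:", os.getsid(child.pid))
```

Because the child leads its own session, a Ctrl-C in the parent's terminal is not delivered to it, and closing that terminal does not send it SIGHUP through the parent's session.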
I encountered a similar issue while working with qnx devices and wanted a sub-process that runs independently of the main process and even runs after the main process terminates.
Here's the solution I found that actually works, creationflags=subprocess.DETACHED_PROCESS:
import subprocess
import time
# DETACHED_PROCESS is only available on Windows; the raw string keeps "\t" from being read as a tab
pid = subprocess.Popen(["python", r"path_to_script\turn_ecu_on.py"], creationflags=subprocess.DETACHED_PROCESS)
time.sleep(15)
print("Done")
Link to the doc: https://docs.python.org/3/library/subprocess.html#subprocess.Popen
In Ubuntu the following commands keep working even if python app exits.
import os
url = "https://www.youtube.com/watch?v=t3kcqTE6x4A"
cmd = f"mpv '{url}' && zenity --info --text 'you have watched {url}' &"
os.system(cmd)
I have a simple perl script that calls another python script to do the deployment of a server in cloud .
I capture the exit status of the deployment inside perl to take any further action after success/failure setup.
It's like:
$cmdret = system("python script.py ARG1 ARG2");
Here the python script runs for 3hrs to 7 hrs.
The problem is that, irrespective of whether the deployment succeeds or fails, the system() call randomly receives a SIGHUP at this step, even when the process is running in the background, and this breaks the subsequent steps.
So does anyone know whether there is a time limit on how long system can wait for the return status, after which a hangup signal is sent?
Inside the Python script script.py, pexpect is used to execute scripts remotely:
doSsh(User,Passwd,Name,'cd '+OutputDir+';python host-bringup.py setup')
doSsh(User,Passwd,Name,'cd '+OpsHome+'/ops/hlevel;python dshost.py start')
....
And doSsh is a pexpect subroutine:
def doSsh(user,password,host,command):
    try:
        child = pexpect.spawn("ssh -o ServerAliveInterval=100 -n %s@%s '%s'" % (user,host,command),logfile=sys.stdout,timeout=None)
        i = child.expect(['password:', r'\(yes\/no\)',r'.*password for paasusr: ',r'.*[$#] ',pexpect.EOF])
        if i == 0:
            child.sendline(password)
        elif i == 1:
            child.sendline("yes")
            child.expect("password:")
            child.sendline(password)
        data = child.read()
        print data
        child.close()
        return True
    except Exception as error:
        print error
        return False
The first doSsh execution takes ~6 hours, and the session is killed after a few hours of execution with the message: Signal HUP caught; exiting, but the execution of python host-bringup.py setup still continues on the remote host.
So on the local system, the next doSsh never runs, and the remaining steps inside the Perl script never continue.
SIGHUP is sent when the terminal disconnects. When you want to create a process that's not tied to the terminal, you daemonize it.
Note that nohup doesn't daemonize.
$ nohup perl -e'system "ps", "-o", "pid,ppid,sid,cmd"'
nohup: ignoring input and appending output to `nohup.out'
$ cat nohup.out
PID PPID SID CMD
21300 21299 21300 -bash
21504 21300 21300 perl -esystem "ps", "-o", "pid,ppid,sid,cmd"
21505 21504 21300 ps -o pid,ppid,sid,cmd
As you can see,
perl's PPID is that of the program that launched it.
perl's SID is that of the program that launched it.
Since the session hasn't changed, the terminal will send SIGHUP to perl when it disconnects as normal.
That said, nohup changes how perl handles SIGHUP by causing it to be ignored.
$ perl -e'system "kill", "-HUP", "$$"; print "SIGHUP was ignored\n"'
Hangup
$ echo $?
129
$ nohup perl -e'system "kill", "-HUP", "$$"; print "SIGHUP was ignored\n"'
nohup: ignoring input and appending output to `nohup.out'
$ echo $?
0
$ tail -n 1 nohup.out
SIGHUP was ignored
If perl is killed by the signal, it's because something changed how perl handles SIGHUP.
So, either daemonize the process, or have perl ignore SIGHUP (e.g. by using nohup). But if you use nohup, don't re-enable the default SIGHUP behaviour!
If your goal is to make your perl program ignore the HUP signal, you likely just need to set the HUP entry of the $SIG global signal handler hash:
$SIG{ 'HUP' } = 'IGNORE';
For the gory details, see
perldoc perlipc
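Since the long-running step in the question is itself a Python script, the same idea can be sketched on the Python side with the standard signal module. This is a minimal sketch, assuming a POSIX system and that the call is made from the main thread; the helper name ignore_sighup is hypothetical.

```python
import os
import signal

def ignore_sighup():
    """Ignore SIGHUP in this process; returns the previous handler."""
    return signal.signal(signal.SIGHUP, signal.SIG_IGN)

if __name__ == "__main__":
    ignore_sighup()
    os.kill(os.getpid(), signal.SIGHUP)   # would normally terminate the process
    print("SIGHUP was ignored")
```

An ignored disposition is normally inherited by child processes across exec, so commands started afterwards would usually ignore SIGHUP as well unless they install their own handler.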
I have a python script that needs to call the defined $EDITOR or $VISUAL. When the Python script is called alone, I am able to launch the $EDITOR without a hitch, but the moment I pipe something to the Python script, the $EDITOR is unable to launch. Right now, I am using nano which shows
Received SIGHUP or SIGTERM
every time. It appears to be the same issue described here.
sinister:Programming [1313]$ echo "import os;os.system('nano')" > "sample.py"
sinister:Programming [1314]$ python sample.py
# nano is successfully launched here.
sinister:Programming [1315]$ echo "It dies here." | python sample.py
Received SIGHUP or SIGTERM
Buffer written to nano.save.1
EDIT: Clarification; inside the program, I am not piping to the editor. The code is as follows:
editorprocess = subprocess.Popen([editor or "vi", temppath])
editorreturncode = os.waitpid(editorprocess.pid, 0)[1]
When you pipe something to a process, the pipe is connected to that process's standard input. This means your terminal input won't be connected to the editor. Most editors also check whether their standard input is a terminal (isatty), which a pipe isn't; and if it isn't a terminal, they'll refuse to start. In the case of nano, this appears to cause it to exit with the message you included:
% echo | nano
Received SIGHUP or SIGTERM
You'll need to provide the input to your Python script in another way, such as via a file, if you want to be able to pass its standard input to a terminal-based editor.
Now you've clarified your question, that you don't want the Python process's stdin attached to the editor, you can modify your code as follows:
editorprocess = subprocess.Popen([editor or "vi", temppath],
                                 stdin=open('/dev/tty', 'r'))
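Putting the pieces together, a launcher might pick the editor from the environment and fall back to /dev/tty only when its own standard input is not a terminal. This is a sketch under assumptions: the helper names are hypothetical, the precedence of $VISUAL over $EDITOR is a common convention rather than something stated above, and /dev/tty exists only on Unix-like systems with a controlling terminal.

```python
import os
import shlex
import subprocess
import sys

def editor_argv(path, environ=os.environ):
    """Build the editor command from $VISUAL or $EDITOR, defaulting to vi."""
    editor = environ.get("VISUAL") or environ.get("EDITOR") or "vi"
    # shlex.split lets values such as "code -w" carry their own arguments.
    return shlex.split(editor) + [path]

def run_editor(path):
    """Run the editor, giving it the terminal even if our stdin is a pipe."""
    if sys.stdin.isatty():
        return subprocess.call(editor_argv(path))
    with open("/dev/tty") as tty:
        return subprocess.call(editor_argv(path), stdin=tty)
```

With this, `echo "It dies here." | python sample.py` can still read the piped text from sys.stdin before calling run_editor on the temporary file.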
The specific case of find -type f | vidir - is handled here:
foreach my $item (@ARGV) {
    if ($item eq "-") {
        push @dir, map { chomp; $_ } <STDIN>;
        close STDIN;
        open(STDIN, "/dev/tty") || die "reopen: $!\n";
    }
You can re-create this behavior in Python, as well:
#!/usr/bin/python
import os
import sys
sys.stdin.close()
o = os.open("/dev/tty", os.O_RDONLY)
os.dup2(o, 0)
os.system('vim')
Of course, it closes the standard input file descriptor, so if you intend on reading from it again after starting the editor, you should probably duplicate its file descriptor before closing it.
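Duplicating the descriptor first could look roughly like the sketch below: it saves the original file descriptor 0 with os.dup, points descriptor 0 at another file for the duration of a with block, and restores the original afterwards. The context manager name stdin_from is hypothetical, and the default path /dev/tty assumes a Unix-like system with a controlling terminal.

```python
import contextlib
import os

@contextlib.contextmanager
def stdin_from(path="/dev/tty"):
    """Temporarily point file descriptor 0 at path, then restore it."""
    saved = os.dup(0)                 # keep the original stdin alive
    try:
        new = os.open(path, os.O_RDONLY)
        try:
            os.dup2(new, 0)           # descriptor 0 now refers to path
        finally:
            os.close(new)
        yield saved                   # the original stdin is readable via saved
    finally:
        os.dup2(saved, 0)             # put the original stdin back
        os.close(saved)

if __name__ == "__main__":
    with stdin_from("/dev/tty"):
        os.system("vim")
```

Inside the block, the piped data remains readable through the yielded descriptor (for example with os.read), and after the block the original standard input is back on descriptor 0.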