Know if subprocess is not stuck by it's prints to stdout - python

I have subprocess that I am running by:
proc = subprocess.Popen("python -u my_script.py", shell=True)
my_script.py should print regularly to stdout and I have other non related process that is listening to this output so I can't change the output to be printed to somewhere else.
I want to ensure that the process is really regularly printing and not got stuck in some loop .etc, do I have way to check if stdout was wroten for some amount of time?
any other options to reach this goal?
EDIT
I am using windows

you can create a named pipe with mkfifo and use tee to output your script's data to both the process listening for it and the pipe.
mkfifo blarg
my_script.py | tee blarg | your_greedy_data_processing_instance
tail -f blarg
instead of tail you can use an arbitrarly complicated script to study the output and the state of the process generating it (timers, pid checks)

It appears that the access time and modification time of /dev/stdout is updated regularly. Note, however, that /dev/stdout will always be a soft link -- er, a symbolic link, I mean -- to the file handle of stdout for the process that's checking /dev/stdout. I.e., /dev/stdout links to /proc/self/fd/1.
So it seems that you could check the first file descriptor of your process to see if its modification time has changed, e.g.:
$ stat -c %y -L /proc/10830/fd/1
2021-05-13 02:34:00.367857061
-L means act on the target of the soft link, not the soft link itself; -c %y is just asking for the modification time. This Python script is running as process 10830 on my system right now, and it's occasionally updating the modification time (about every 8 seconds):
>>> import time
>>> while True: time.sleep(1); print("still alive")
still alive
still alive
still alive
....
You should Google this answer to be sure that the behavior I'm seeing is reliable, though, because I've never read anything about it before.
Alternatively, you could either (a) trust that the script is fine -- which it will, of course, always be (unless it's catching exceptions and refusing to exit even if it can no longer do anything useful, in which case you should change it to die the way it should), or (b) set up a daemon to do something like send a signal to the script, at which point the script could send a signal to the daemon to say "I'm still alive." There's literally no reason to do that, in my opinion, but how you write your programs is up to you.
So assuming that you want to press forward with this, here's a trivial example of the daemon that would monitor the script you want to make sure isn't stuck in a loop or something:
import time
import signal
import os
import sys
# keep a timestamp of when we receive a response
response_timestamp = time.time()
# add code here to get the process ID of the other script
other_pid = 0
def sig_handler(signum, frame):
global response_timestamp
response_timestamp = time.time()
if __name__ == '__main__':
# make sure that when we receive SIGBREAK, sig_handler() gets called
signal.signal(signal.SIGBREAK, sig_handler)
while True:
# send SIGBREAK to "other_pid"
os.kill(other_pid, signal.SIGBREAK)
time.sleep(15)
if time.time() - 20 > response_timestamp:
print("the other process is frozen")
sys.exit(os.EX_SOFTWARE)
Then you add this to the other script that you're monitoring:
import signal
import os
# add code here to get the process ID
other_pid = 0
def sig_handler(signum, frame):
os.kill(other_pid, signal.SIGBREAK)
...
...
(rest of your script)
Now be aware that the only thing this will do, is make sure that the process isn't completely frozen. Regrettably, Windows doesn't have a great deal of options when it comes to signals: SIGBREAK was the best one that I saw, but note that it's the signal received by a process when you hit CTRL+C to interrupt the program (so if you manually hit CTRL+C in the window running the Python program, it won't kill it, it will just make it call sig_handler()).
I would also be remiss if I did not inform you that even though this will probably work just fine, it is not safe to do almost anything inside of a signal handler function. It's bad form and may blow up on you unexpectedly, but in practice, it's pretty safe.

Related

Stop printing to BASH console when Python script is moved to background

I'm working on developing some tools in Python to be used in Linux via Bash or another CLI. These tools all print info to the terminal to let the user know what's going on. Some have progress percentages where the same line is printed to repeatedly.
I've got everything working how I'd like with print statements, however, occasionally these tools may be sent to the background in Bash. This is causing problems, since the print function seems to print to the foreground regardless of where the process is running.
How do I only print to the console when the script in the foreground. The script should keep running and just not print to the foreground when sent to the background. I imagine there's some whole topic I'm missing to do this, as I am new to programming as a whole. Even if I could get a pointer in the right direction, that would likely get me there.
Enabling TOSTOP in termios should work for your case
#! /usr/bin/env python3
import sys
import termios
import tty
try:
[iflag, oflag, cflag, lflag, ispeed, ospeed, ccs] = termios.tcgetattr(sys.stdout)
termios.tcsetattr(sys.stdout.fileno(), termios.TCSANOW, [iflag, oflag, cflag, lflag | termios.TOSTOP, ispeed, ospeed, ccs])
except termios.error:
pass
...
# your script here
...
# restore the original state when done
so if your script tries to print to stdout when in background it's going to receive a SIGTTOU signal and stops.
This should work either if you start it with & or you send it to background later using job control (CTRL+Z).
edit
If you need it to continue running handle the SIGCONT signal
signal.signal(signal.SIGCONT, handler)
and in handler switch a flag
def handler(signum, frame):
global do_print
do_print = False
I'm not sure this is the most ideal or elegant solution, but here's what I've gotten to work based upon this answer to a similar question:
#! /usr/bin/env python3
import time, sys, os
def termianl_output(data): #Function to print to console ONLY if process in foreground
if os.getpgrp() == os.tcgetpgrp(sys.stdout.fileno()):
print(data)
count = 0
while count < 20:
count += 1
terminal_output(count)
time.sleep(2)
I've simply put together a new function to handle printing that first checks whether the process group it's running in is the process group that has control of the terminal. If it's not, it does not print. This could be easily modified to instead output to a log file or something, though I don't need that in my case.

How to guarantee file removing after script stopped working?

I have a script running by crontab every hour and interacts with API (database sync). Usually it take one hour or so, and I check for the next run if this process still in the memory or not:
#/usr/bin/env python
import os
import sys
pid = str(os.getpid())
pidfile = "/tmp/mydaemon.pid"
if os.path.isfile(pidfile):
print "%s already exists, exiting" % pidfile
sys.exit()
file(pidfile, 'w').write(pid)
try:
# Do some actual work here
finally:
os.unlink(pidfile)
BUT after some time script stopped working, when I look at the "ps aux | grep python", I don't see this script as the process, but I do see file on the place.
And when I run script manually, I see information printed iteratively on the screen, but after some time I see the word "Terminated", script exited and file still on the place.
How to guarantee 100% the file removed after the script stopped working?
Thanks!
It looks like your script is terminated unexpectedly, most probably due to too high memory usage. It's not guaranteed that finally will be executed on unexpected program termination. So, first of all I suggest you to find the cause of the unexpected termination an fix it.
Actually there is no 100% way to guarantee that the file will be removed. However, there are a few workarounds for handling dangling pid files.
Place your pid files on the /var/run volume, so they will be removed on unexpected system restart.
Check wether the process with such pid is still running on each script execution:
import os
def is_alive(pid):
try:
os.kill(pid, 0) # do nothing but throws an exception
return True
except OSError:
return False
# and add this to your code:
if os.path.isfile(pidfile):
with open(pidfile) as f:
if is_alive(f.read()):
sys.exit()
Again, provided code is not 100% safe because of possible pid collisions. You can make the verification of running process more sophisticated by adding parsing of ps command output. Try to find a line with the desired pid value and check wether it looks similar to your crontab entry.
Normally you can use atextit module functionality, but in your case (unexpected termination) it also may not work.
Maybe use of mkstemp (specifying required program suffix/refix) within with statement may work: it will create unique pidfile in /tmp and clear it, when with block completes or terminates.

Preventing write interrupts in python script

I'm writing a parser in Python that outputs a bunch of database rows to standard out. In order for the DB to process them properly, each row needs to be fully printed to the console. I'm trying to prevent interrupts from making the print command stop halfway through printing a line.
I tried the solution that recommended using a signal handler override, but this still doesn't prevent the row from being partially printed when the program is interrupted. (I think the WRITE system call is cancelled to handle the interrupt).
I thought that the problem was solved by issue 10956 but I upgraded to Python 2.7.5 and the problem still happens.
You can see for yourself by running this example:
# Writer
import signal
interrupted = False
def signal_handler(signal, frame):
global interrupted
iterrupted = True
signal.signal(signal.SIGINT, signal_handler)
while True:
if interrupted:
break
print '0123456789'
In a terminal:
$ mkfifo --mode=0666 pipe
$ python writer.py > pipe
In another terminal:
$ cat pipe
Then Ctrl+C the first terminal. Some of the time the second terminal will end with an incomplete sequence of characters.
Is there any way of ensuring that full lines are written?
This seems less like an interrupt problem per se then a buffering issue. If I make a small change to your code, I don't get the partial lines.
# Writer
import sys
while True:
print '0123456789'
sys.stdout.flush()
It sounds like you don't really want to catch a signal but rather block it temporarily. This is supported by some *nix flavours. However Python explicitly does not support this.
You can write a C wrapper for sigmasks or look for a library. However if you are looking for a portable solution...

Control a subprocess (specifically gdb) in multiple ways

I am developing a wrapper around gdb using python. Basically, I just want to be able to detect a few setup annoyances up-front and be able to run a single command to invoke gdb, rather than a huge string I have to remember each time.
That said, there are two cases that I am using. The first, which works fine, is invoking gdb by creating a new process and attaching to it. Here's the code that I have for this one:
def spawnNewProcessInGDB():
global gObjDir, gGDBProcess;
from subprocess import Popen
from os.path import join
import subprocess
binLoc = join(gObjDir, 'dist');
binLoc = join(binLoc, 'bin');
binLoc = join(binLoc, 'mycommand')
profileDir = join(gObjDir, '..')
profileDir = join(profileDir, 'trash-profile')
try:
gGDBProcess = Popen(['gdb', '--args', binLoc, '-profile', profileDir], cwd=gObjDir)
gGDBProcess.wait()
except KeyboardInterrupt:
# Send a termination signal to the GDB process, if it's running
promptAndTerminate(gGDBProcess)
Now, if the user presses CTRL-C while this is running, it breaks (i.e. it forwards the CTRL-C to GDB). This is the behavior I want.
The second case is a bit more complicated. It might be the case that I already had this program running on my system and it crashed, but was caught. In this case, I want to be able to connect to it using gdb to get a stack trace (or perhaps I was already running it, and I simply now want to connect to the process that's already in memory).
As a convenience feature, I've created a mirror function, which will connect to a running process using gdb:
def connectToProcess(procNum):
global gObjDir, gGDBProcess
from subprocess import Popen
import subprocess
import signal
print("Connecting to mycommand process number " + str(procNum) + "...")
try:
gGDBProcess = Popen(['gdb', '-p', procNum], cwd=gObjDir)
gGDBProcess.wait()
except KeyboardInterrupt:
promptAndTerminate(gGDBProcess)
Again, this seems to work as expected. It starts gdb, I can set breakpoints, run the program, etc. The only catch is that it doesn't forward CTRL-C to gdb if I press it while the program is running. Instead, it jumps immediately to promptAndTerminate().
I'm wondering if anyone can see why this is happening - the two calls to subprocess.Popen() seem identical to me, albeit that one is running gdb in a different mode.
I have also tried replacing the call to subprocess.Popen() with the following:
gGDBProcess = Popen(['gdb', '-p', procNum], cwd=gObjDir, stdin=subprocess.PIPE)
but this leads to undesirable results as well, because it doesn't actually communicate anything to the child gdb process (e.g. if I type in c to start the program running again after it is broken upon connection from gdb, it doesn't do anything). Again, it terminates the running python process when I type CTRL-C.
Any help would be appreciated!

Avoid python setup time

This image below says python takes lot of time in user space. Is it possible to reduce this time at all ?
In the sense I will be running a script several 100 times. Is it possible to start python so that it takes time to initialize once and doesn't do it the subsequent time ??
I just searched for the same and found this:
http://blogs.gnome.org/johan/2007/01/18/introducing-python-launcher/
Python-launcher does not solve the problem directly, but it points into an interesting direction: If you create a small daemon which you can contact via the shell to fork a new instance, you might be able to get rid of your startup time.
For example get the python-launcher and socat¹ and do the following:
PYTHONPATH="../lib.linux-x86_64-2.7/" python python-launcher-daemon &
echo pass > 1
for i in {1..100}; do
echo 1 | socat STDIN UNIX-CONNECT:/tmp/python-launcher-daemon.socket &
done
Todo: Adapt it to your program, remove the GTK stuff. Note the & at the end: Closing the socket connection seems to be slow.
The essential trick is to just create a server which opens a socket. Then it reads all the data from the socket. Once it has the data, it forks like the following:
pid = os.fork()
if pid:
return
signal.signal(signal.SIGPIPE, signal.SIG_DFL)
signal.signal(signal.SIGCHLD, signal.SIG_DFL)
glob = dict(__name__="__main__")
print 'launching', program
execfile(program, glob, glob)
raise SystemExit
Running 100 programs that way took just 0.7 seconds for me.
You might have to switch from forking to just executing the code instead of forking if you want to be really fast.
(That’s what I also do with emacsclient… My emacs takes ~30s to start (due to excessive use of additional libraries I added), but emacsclient -c shows up almost instantly.)
¹: http://www.socat.org
Write the "do this several 100 times" logic in your Python script. Call it ONCE from that other language.
Use timeit instead:
http://docs.python.org/library/timeit.html

Categories

Resources