How to guarantee file removal after a script stops working? - python

I have a script run by crontab every hour that interacts with an API (database sync). It usually takes an hour or so, and before the next run I check whether the process is still in memory:
#!/usr/bin/env python
import os
import sys

pid = str(os.getpid())
pidfile = "/tmp/mydaemon.pid"

if os.path.isfile(pidfile):
    print "%s already exists, exiting" % pidfile
    sys.exit()

open(pidfile, 'w').write(pid)
try:
    pass  # Do some actual work here
finally:
    os.unlink(pidfile)
BUT after some time the script stops working. When I look at "ps aux | grep python", I don't see the script running as a process, but the pid file is still in place.
And when I run the script manually, I see information printed iteratively on the screen, but after some time the word "Terminated" appears, the script exits and the file is still in place.
How can I guarantee 100% that the file is removed after the script stops working?
Thanks!

It looks like your script is being terminated unexpectedly, most probably due to excessive memory usage. It is not guaranteed that finally will be executed on unexpected program termination, so first of all I suggest you find the cause of the unexpected termination and fix it.
Actually there is no 100% way to guarantee that the file will be removed. However, there are a few workarounds for handling dangling pid files.
Place your pid files on the /var/run volume, so they will be removed on unexpected system restart.
Check whether the process with that pid is still running on each script execution:
import os

def is_alive(pid):
    try:
        os.kill(pid, 0)  # signal 0 does nothing, but raises OSError if the pid does not exist
        return True
    except OSError:
        return False

# and add this to your code:
if os.path.isfile(pidfile):
    with open(pidfile) as f:
        if is_alive(int(f.read())):
            sys.exit()
Again, the provided code is not 100% safe because of possible pid collisions. You can make the verification of the running process more sophisticated by parsing the output of the ps command: find the line with the desired pid value and check whether it looks similar to your crontab entry, as sketched below.
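For example, a minimal sketch of that check, assuming a Linux-style ps and that the crontab entry contains the script name (the marker "mydaemon.py" here is hypothetical):
import subprocess

def looks_like_my_script(pid, marker="mydaemon.py"):
    """Return True if the process with this pid appears to be our script."""
    try:
        # 'ps -p PID -o args=' prints just the command line of that pid, or fails if it is gone
        out = subprocess.check_output(["ps", "-p", str(pid), "-o", "args="])
    except subprocess.CalledProcessError:
        return False  # no such process
    return marker in out.decode()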

Normally you could use the atexit module for this, but in your case (unexpected termination) it also may not work.
Maybe using mkstemp (specifying the required program suffix/prefix) inside a with statement may work: it will create a unique pid file in /tmp and clean it up when the with block completes or terminates.
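A rough sketch of the atexit idea, with the caveat above that it cannot fire on a hard kill (e.g. SIGKILL or an OOM kill):
import atexit
import os

pidfile = "/tmp/mydaemon.pid"

def remove_pidfile():
    # Best effort: runs on normal interpreter exit and on unhandled exceptions,
    # but not if the process is killed outright.
    if os.path.isfile(pidfile):
        os.unlink(pidfile)

atexit.register(remove_pidfile)
open(pidfile, 'w').write(str(os.getpid()))
# ... do some actual work here ...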

Related

Know if a subprocess is not stuck by its prints to stdout

I have a subprocess that I am running with:
proc = subprocess.Popen("python -u my_script.py", shell=True)
my_script.py should print regularly to stdout, and I have another unrelated process listening to this output, so I can't redirect the output somewhere else.
I want to ensure that the process is really printing regularly and has not got stuck in some loop, etc. Is there a way to check whether stdout has been written to within some amount of time?
Any other options to reach this goal?
EDIT
I am using windows
You can create a named pipe with mkfifo and use tee to send your script's output both to the process listening for it and to the pipe.
mkfifo blarg
my_script.py | tee blarg | your_greedy_data_processing_instance
tail -f blarg
Instead of tail you can use an arbitrarily complicated script to study the output and the state of the process generating it (timers, pid checks).
It appears that the access time and modification time of /dev/stdout are updated regularly. Note, however, that /dev/stdout will always be a symbolic link to the file handle of stdout for the process that's checking /dev/stdout; i.e., /dev/stdout links to /proc/self/fd/1.
So it seems that you could check the first file descriptor of your process to see if its modification time has changed, e.g.:
$ stat -c %y -L /proc/10830/fd/1
2021-05-13 02:34:00.367857061
-L means act on the target of the soft link, not the soft link itself; -c %y is just asking for the modification time. This Python script is running as process 10830 on my system right now, and it's occasionally updating the modification time (about every 8 seconds):
>>> import time
>>> while True: time.sleep(1); print("still alive")
still alive
still alive
still alive
....
You should Google this answer to be sure that the behavior I'm seeing is reliable, though, because I've never read anything about it before.
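If you want to do the same check from Python rather than the shell, a rough sketch following the observation above could poll the mtime of the child's fd 1 (this assumes Linux and /proc, so it will not work on the Windows setup mentioned in the edit; pid is a hypothetical variable holding the child's process ID):
import os
import time

def stdout_mtime(pid):
    # /proc/<pid>/fd/1 is the process's stdout; os.stat follows the link and reads the target's mtime
    return os.stat("/proc/%d/fd/1" % pid).st_mtime

def looks_stuck(pid, max_silence=30):
    """Return True if the process has not written to stdout for max_silence seconds."""
    last = stdout_mtime(pid)
    time.sleep(max_silence)
    return stdout_mtime(pid) == last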
Alternatively, you could either (a) trust that the script is fine -- which it will, of course, always be (unless it's catching exceptions and refusing to exit even if it can no longer do anything useful, in which case you should change it to die the way it should), or (b) set up a daemon to do something like send a signal to the script, at which point the script could send a signal to the daemon to say "I'm still alive." There's literally no reason to do that, in my opinion, but how you write your programs is up to you.
So assuming that you want to press forward with this, here's a trivial example of the daemon that would monitor the script you want to make sure isn't stuck in a loop or something:
import time
import signal
import os
import sys

# keep a timestamp of when we receive a response
response_timestamp = time.time()

# add code here to get the process ID of the other script
other_pid = 0

def sig_handler(signum, frame):
    global response_timestamp
    response_timestamp = time.time()

if __name__ == '__main__':
    # make sure that when we receive SIGBREAK, sig_handler() gets called
    signal.signal(signal.SIGBREAK, sig_handler)
    while True:
        # send SIGBREAK to "other_pid"
        os.kill(other_pid, signal.SIGBREAK)
        time.sleep(15)
        if time.time() - 20 > response_timestamp:
            print("the other process is frozen")
            sys.exit(os.EX_SOFTWARE)
Then you add this to the other script that you're monitoring:
import signal
import os

# add code here to get the process ID of the monitoring daemon
other_pid = 0

def sig_handler(signum, frame):
    # reply to the daemon so it knows we are still alive
    os.kill(other_pid, signal.SIGBREAK)

# make sure the daemon's SIGBREAK calls sig_handler() instead of terminating us
signal.signal(signal.SIGBREAK, sig_handler)

...
...
(rest of your script)
Now be aware that the only thing this will do is make sure that the process isn't completely frozen. Regrettably, Windows doesn't have a great deal of options when it comes to signals: SIGBREAK was the best one that I saw, but note that it's the signal received by a process when you hit CTRL+C to interrupt the program (so if you manually hit CTRL+C in the window running the Python program, it won't kill it, it will just make it call sig_handler()).
I would also be remiss if I did not inform you that even though this will probably work just fine, it is not safe to do almost anything inside of a signal handler function. It's bad form and may blow up on you unexpectedly, but in practice, it's pretty safe.

How to avoid Python Subprocess stopping execution

I have a Python program that processes a lot of files, and one step is done through a .JAR file.
I currently have something like this:
for row in rows:
    try:
        subprocess.check_call(
            f'java -jar ffdec/ffdec.jar -export png "{out_dir}/" "{row[0]}.swf"',
            stdout=subprocess.DEVNULL)
    except (OSError, subprocess.SubprocessError, subprocess.CalledProcessError):
        print(f"Error on {row[0]}")
        continue
That works fine for executing the OS command (I'm on Windows 10) and not stopping on errors.
However, there is one specific error that stops the execution of my Python program.
I think it is because the .jar file doesn't really stop and keeps running in the background, thus preventing Python from continuing.
Is there a way to call a command in Python and run it asynchronously, or skip it after a timeout of 20 sec?
I could also make a Java program to run that part of the process, but for convenience I'd prefer to keep everything in Python.
Just in case, I'll put here the error that stops my program (all others get properly caught by try/except):
févr. 25, 2021 8:05:00 AM com.jpexs.decompiler.flash.console.ConsoleAbortRetryIgnoreHandler handle
GRAVE: Error occured
java.util.EmptyStackException
at java.util.Stack.peek(Unknown Source)
at com.jpexs.decompiler.flash.exporters.commonshape.SVGExporter.addUse(SVGExporter.java:230)
at com.jpexs.decompiler.flash.timeline.Timeline.toSVG(Timeline.java:1043)
at com.jpexs.decompiler.flash.exporters.FrameExporter.lambda$exportFrames$0(FrameExporter.java:216)
at com.jpexs.decompiler.flash.RetryTask.run(RetryTask.java:41)
at com.jpexs.decompiler.flash.exporters.FrameExporter.exportFrames(FrameExporter.java:220)
at com.jpexs.decompiler.flash.console.CommandLineArgumentParser.parseExport(CommandLineArgumentParser.java:2298)
at com.jpexs.decompiler.flash.console.CommandLineArgumentParser.parseArguments(CommandLineArgumentParser.java:891)
at com.jpexs.decompiler.flash.gui.Main.main(Main.java:1972)
After checking the subprocess documentation in depth, I found a parameter called timeout:
subprocess.check_call('...', stdout=subprocess.DEVNULL, timeout=20)
That can do the job for me
Documentation for timeout
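To also skip an entry that hangs rather than letting the exception stop the loop, here is a small sketch of how the loop from the question might catch the timeout as well; the 20-second value is just the one mentioned above, and rows/out_dir are assumed to exist as in the question:
import subprocess

for row in rows:
    try:
        subprocess.check_call(
            f'java -jar ffdec/ffdec.jar -export png "{out_dir}/" "{row[0]}.swf"',
            stdout=subprocess.DEVNULL,
            timeout=20)  # raises subprocess.TimeoutExpired if ffdec hangs
    except subprocess.TimeoutExpired:
        print(f"Timeout on {row[0]}")  # the child is killed and we move on
    except (OSError, subprocess.SubprocessError):
        print(f"Error on {row[0]}")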

Preventing write interrupts in python script

I'm writing a parser in Python that outputs a bunch of database rows to standard out. In order for the DB to process them properly, each row needs to be fully printed to the console. I'm trying to prevent interrupts from making the print command stop halfway through printing a line.
I tried the solution that recommended using a signal handler override, but this still doesn't prevent the row from being partially printed when the program is interrupted. (I think the WRITE system call is cancelled to handle the interrupt).
I thought that the problem was solved by issue 10956 but I upgraded to Python 2.7.5 and the problem still happens.
You can see for yourself by running this example:
# Writer
import signal

interrupted = False

def signal_handler(signal, frame):
    global interrupted
    interrupted = True

signal.signal(signal.SIGINT, signal_handler)

while True:
    if interrupted:
        break
    print '0123456789'
In a terminal:
$ mkfifo --mode=0666 pipe
$ python writer.py > pipe
In another terminal:
$ cat pipe
Then Ctrl+C the first terminal. Some of the time the second terminal will end with an incomplete sequence of characters.
Is there any way of ensuring that full lines are written?
This seems less like an interrupt problem per se than a buffering issue. If I make a small change to your code, I don't get the partial lines.
# Writer
import sys

while True:
    print '0123456789'
    sys.stdout.flush()
It sounds like you don't really want to catch a signal but rather block it temporarily. This is supported on some *nix flavours, though Python 2 does not expose it directly.
You can write a C wrapper for sigmasks or look for a library. However, if you are looking for a portable solution...
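As a side note, Python 3.3+ does expose this on POSIX via signal.pthread_sigmask. A rough sketch (not applicable to the Python 2.7 setup in the question) of blocking SIGINT around each write:
import signal
import sys

def write_line_atomically(line):
    # Block SIGINT while the line is written, then restore the previous mask;
    # a Ctrl+C pressed during the write is delivered afterwards instead of mid-write.
    old_mask = signal.pthread_sigmask(signal.SIG_BLOCK, {signal.SIGINT})
    try:
        sys.stdout.write(line + "\n")
        sys.stdout.flush()
    finally:
        signal.pthread_sigmask(signal.SIG_SETMASK, old_mask)

while True:
    write_line_atomically("0123456789")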

IOError Input/Output Error When Printing

I have inherited some code which is periodically (randomly) failing due to an Input/Output error being raised during a call to print. I am trying to determine the cause of the exception being raised (or at least, better understand it) and how to handle it correctly.
When executing the following line of Python (in a 2.6.6 interpreter, running on CentOS 5.5):
print >> sys.stderr, 'Unable to do something: %s' % command
The exception is raised (traceback omitted):
IOError: [Errno 5] Input/output error
For context, this is generally what the larger function is trying to do at the time:
from subprocess import Popen, PIPE
import sys

def run_commands(commands):
    for command in commands:
        try:
            out, err = Popen(command, shell=True, stdout=PIPE, stderr=PIPE).communicate()
            print >> sys.stdout, out
            if err:
                raise Exception('ERROR -- an error occurred when executing this command: %s --- err: %s' % (command, err))
        except:
            print >> sys.stderr, 'Unable to do something: %s' % command

run_commands(["ls", "echo foo"])
The >> syntax is not particularly familiar to me, it's not something I use often, and I understand that it is perhaps the least preferred way of writing to stderr. However I don't believe the alternatives would fix the underlying problem.
From the documentation I have read, IOError 5 is often misused, and somewhat loosely defined, with different operating systems using it to cover different problems. The best I can see in my case is that the python process is no longer attached to the terminal/pty.
As best I can tell nothing is disconnecting the process from the stdout/stderr streams - the terminal is still open for example, and everything 'appears' to be fine. Could it be caused by the child process terminating in an unclean fashion? What else might be a cause of this problem - or what other steps could I introduce to debug it further?
In terms of handling the exception, I can obviously catch it, but I'm assuming this means I won't be able to print to stdout/stderr for the remainder of execution? Can I reattach to these streams somehow - perhaps by resetting sys.stdout to sys.__stdout__ etc.? In this case not being able to write to stdout/stderr is not considered fatal, but if it is an indication of something starting to go wrong I'd rather bail early.
I guess ultimately I'm at a bit of a loss as to where to start debugging this one...
I think it has to do with the terminal the process is attached to. I got this error when I ran a python process in the background and closed the terminal in which I started it:
$ myprogram.py
Ctrl-Z
$ bg
$ exit
The problem was that I had started a non-daemonized process on a remote server and logged out (closing the terminal session). A solution was to start a screen/tmux session on the remote server and start the process within that session. Detaching the session and then logging out keeps a terminal associated with the process. This works at least in the *nix world.
I had a very similar problem. I had a program that was launching several other programs using the subprocess module. Those subprocesses would then print output to the terminal. What I found was that when I closed the main program, it did not terminate the subprocesses automatically (as I had assumed), rather they kept running. So if I terminated both the main program and then the terminal it had been launched from*, the subprocesses no longer had a terminal attached to their stdout, and would throw an IOError. Hope this helps you.
*NB: it must be done in this order. If you just kill the terminal, (for some reason) that would kill both the main program and the subprocesses.
I just got this error because the directory where I was writing files to ran out of memory. Not sure if this is at all applicable to your situation.
I'm new here, so please forgive if I slip up a bit when it comes to the code detail.
Recently I was able to figure out what causes the I/O error of the print statement when the terminal associated with the run of the python script is closed.
It is because the string to be printed to stdout/stderr is too long. In this case, the "out" string is the culprit.
To fix this problem (without having to keep the terminal open while running the python script), simply read the "out" string line by line, and print line by line, until we reach the end of the "out" string. Something like:
# assuming "out" is a file-like object (e.g. a pipe), not the string returned by communicate()
while True:
    ln = out.readline()
    if not ln:
        break
    print ln.strip("\n")  # strip the trailing newline so print does not add a second one
The same problem occurs if you print the entire list of strings out to the screen. Simply print the list one item by one item.
Hope that helps!
The problem is that you've closed the stdout pipe which Python is attempting to write to when print() is called.
This can be caused by running a script in the background using & and then closing the terminal session (i.e. closing stdout):
$ python myscript.py &
$ exit
One solution is to redirect stdout to a file when running in the background.
Example
$ python myscript.py > /var/log/myscript.log 2>&1 &
$ exit
No errors on print()
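A Python-side variant of the same idea (a sketch, not from the original answer): redirect the script's own stdout/stderr to a log file at startup, so later print calls cannot hit a closed terminal. The log path here is just an example.
import sys

# Send everything the script prints to a log file instead of the (possibly closed) terminal
log = open("/var/log/myscript.log", "a", buffering=1)  # line-buffered
sys.stdout = log
sys.stderr = log

print("this goes to the log file, even if the launching terminal is gone")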
It can also happen when your shell crashes while print is trying to write data to it.
In my case, I just restarted the service and the issue disappeared; I don't know why.
My issue was the same OSError Input/Output error, with Odoo.
After I restarted the service, it disappeared.

What is happening to my process?

I'm executing an SSH process like so:
checkIn()
sshproc = subprocess.Popen([command], shell=True)
exit = os.waitpid(sshproc.pid, 0)[1]
checkOut()
It's important that the process performs the checkIn() and checkOut() actions before and after these lines of code. I have a test case in which I exit the SSH session by closing the terminal window manually. Sure enough, my program doesn't operate correctly and checkOut() is never called in this case. Can someone give me a pointer on what to look into to fix this bug?
Let me know if any other information would be helpful.
Thanks!
The Python process would normally execute in the same window as the ssh subprocess, and therefore be terminated just as abruptly when you close that window -- before getting a chance to execute checkOut. To try and ensure that a function gets called at program exit (though for sufficiently-abrupt terminations, depending on your OS, there may be no guarantees), try Python standard library module atexit.
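A rough sketch of that atexit suggestion. Note that atexit handlers run on normal interpreter exit (including sys.exit() and unhandled exceptions) but not when the process dies from an unhandled signal, so on POSIX you may also want to turn SIGHUP (sent when the terminal window closes) into a normal exit. The command string and the checkIn/checkOut bodies here are placeholders:
import atexit
import os
import signal
import subprocess
import sys

def checkIn():
    pass  # placeholder for the real check-in action

def checkOut():
    pass  # placeholder for the real check-out action

atexit.register(checkOut)                               # runs on normal interpreter exit
signal.signal(signal.SIGHUP, lambda s, f: sys.exit(1))  # closing the terminal sends SIGHUP;
                                                        # converting it to sys.exit() lets atexit run

command = "ssh user@host 'some-command'"  # placeholder for the actual ssh command
checkIn()
sshproc = subprocess.Popen([command], shell=True)
exit_status = os.waitpid(sshproc.pid, 0)[1]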
Perhaps all you need is a try ... finally block?
try:
    checkIn()
    sshproc = subprocess.Popen([command], shell=True)
    exit = os.waitpid(sshproc.pid, 0)[1]
finally:
    checkOut()
Unless the system crashes, the process receives SIGKILL, etc., checkOut() should be called.
