I create a FIFO and periodically open it in read-only, non-blocking mode from a.py:
os.mkfifo(fifo, 0777)
io = os.open(fifo, os.O_RDONLY | os.O_NONBLOCK)
buffer = os.read(io, BUFFER_SIZE)
From b.py, open the fifo for writing:
out = open(fifo, 'w')
out.write('sth')
Then a.py will raise an error:
buffer = os.read(io, BUFFER_SIZE)
OSError: [Errno 11] Resource temporarily unavailable
Anyone know what's wrong?
According to the manpage of read(2):
EAGAIN or EWOULDBLOCK
The file descriptor fd refers to a socket and has been marked
nonblocking (O_NONBLOCK), and the read would block.
POSIX.1-2001 allows either error to be returned for this case,
and does not require these constants to have the same value, so
a portable application should check for both possibilities.
So what you're getting is that there is no data available for reading. It is safe to handle the error like this:
try:
    buffer = os.read(io, BUFFER_SIZE)
except OSError as err:
    if err.errno == errno.EAGAIN or err.errno == errno.EWOULDBLOCK:
        buffer = None
    else:
        raise  # something else has happened -- better reraise

if buffer is None:
    # nothing was received -- do something else
    pass
else:
    # buffer contains some received data -- do something with it
    pass
Make sure you have the errno module imported: import errno.
out = open(fifo, 'w')
Who will close it for you?
Replace your open+write with this:
with open(fifo, 'w') as fp:
    fp.write('sth')
UPD:
OK, then just do this:
out = os.open(fifo, os.O_NONBLOCK | os.O_WRONLY)
os.write(out, 'tetet')
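One caveat worth noting (not part of the original answer): opening a FIFO with os.O_WRONLY | os.O_NONBLOCK raises OSError with errno.ENXIO while no process has the read end open, so b.py may need to retry. A minimal sketch, with a hypothetical FIFO path standing in for the one a.py created:
import errno
import os
import time

fifo = '/tmp/cs_cmd_fifo'  # hypothetical path; use the same FIFO a.py created

# open the write end without blocking; retry until a.py has the read end open
while True:
    try:
        out = os.open(fifo, os.O_WRONLY | os.O_NONBLOCK)
        break
    except OSError as err:
        if err.errno == errno.ENXIO:  # no reader has the FIFO open yet
            time.sleep(0.1)
        else:
            raise

os.write(out, 'sth')  # on Python 3 this must be bytes, e.g. b'sth'
os.close(out)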
In my application, I have the following requirements:
1. One thread regularly records logs in a file. The log file is rolled over at certain intervals to keep the individual log files small.
2. Another thread regularly processes these log files, e.g. moves them to another place and parses the content to generate log reports.
But there is one constraint: the second thread must not process the log file that is currently being used to record logs. In code, the pseudocode looks like this:
# code in the second thread to process the log files
for logFile in os.listdir(logFolder):
    if not (file_is_open(logFile) or file_is_in_use(logFile)):
        ProcessLogFile(logFile)  # move the log file to another place, and generate a log report...
So, how do I check whether a file is already open or in use by another process?
I did some research on the internet, and here are some results:
try:
    myfile = open(filename, "r+")  # or "a+", whatever you need
except IOError:
    print "Could not open file! Please close Excel!"
I tried this code, but it doesn't work, no matter whether I use the "r+" or "a+" flag.
try:
    os.remove(filename)  # try to remove it directly
except OSError as e:
    if e.errno == errno.ENOENT:  # file doesn't exist
        break
This code works, but it does not meet my requirement, since I don't want to delete the file just to check whether it is open.
An issue with trying to find out if a file is being used by another process is the possibility of a race condition. You could check a file, decide that it is not in use, then just before you open it another process (or thread) leaps in and grabs it (or even deletes it).
OK, let's say you decide to live with that possibility and hope it does not occur. Checking files in use by other processes is operating-system dependent.
On Linux it is fairly easy, just iterate through the PIDs in /proc. Here is a generator that iterates over files in use for a specific PID:
import os
import re

def iterate_fds(pid):
    dir = '/proc/' + str(pid) + '/fd'
    if not os.access(dir, os.R_OK | os.X_OK):
        return
    for fd in os.listdir(dir):
        full_name = os.path.join(dir, fd)
        try:
            file = os.readlink(full_name)
            if file == '/dev/null' or \
               re.match(r'pipe:\[\d+\]', file) or \
               re.match(r'socket:\[\d+\]', file):
                file = None
        except OSError as err:
            if err.errno == 2:  # ENOENT: the descriptor was closed in the meantime
                file = None
            else:
                raise
        yield (fd, file)
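For example (my own sketch, not part of the original answer), a small helper that walks every numeric entry in /proc and asks iterate_fds whether any process has a given absolute path open could look like this; target_path is a hypothetical absolute path:
def file_in_use(target_path):
    for pid in os.listdir('/proc'):
        if not pid.isdigit():
            continue  # skip /proc entries that are not process directories
        for fd, path in iterate_fds(pid):
            if path == target_path:
                return True
    return False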
On Windows it is not quite so straightforward; the APIs are not published. There is a Sysinternals tool (handle.exe) that can be used, but I recommend the PyPI module psutil, which is portable (i.e., it runs on Linux as well, and probably on other OSes):
import psutil

for proc in psutil.process_iter():
    try:
        # this returns the list of opened files by the current process
        flist = proc.open_files()
        if flist:
            print(proc.pid, proc.name())
            for nt in flist:
                print("\t", nt.path)
    # This catches a race condition where a process ends
    # before we can examine its files
    except psutil.NoSuchProcess as err:
        print("****", err)
I like Daniel's answer, but for Windows users, I realized that it's safer and simpler to rename the file to the name it already has. That solves the problems brought up in the comments to his answer. Here's the code:
import os

f = 'C:/test.xlsx'
if os.path.exists(f):
    try:
        os.rename(f, f)
        print 'Access on file "' + f + '" is available!'
    except OSError as e:
        print 'Access-error on file "' + f + '"! \n' + str(e)
You can check whether a file has a handle on it using the following function (remember to pass the full path to that file):
import psutil

def has_handle(fpath):
    for proc in psutil.process_iter():
        try:
            for item in proc.open_files():
                if fpath == item.path:
                    return True
        except Exception:
            pass
    return False
I know I'm late to the party, but I also had this problem and I used the lsof command to solve it (which I think differs from the approaches mentioned above). With lsof we can basically check for the processes that are using this particular file.
Here is how I did it:
from subprocess import check_output, Popen, PIPE

try:
    lsout = Popen(['lsof', filename], stdout=PIPE, shell=False)
    check_output(["grep", filename], stdin=lsout.stdout, shell=False)
except:
    # check_output will throw an exception here if it doesn't find any process using that file
    pass
Just write your log-processing code in the except block and you are good to go.
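As a rough sketch of that (not the original poster's code): CalledProcessError is what check_output raises when grep finds no match, i.e. when no process holds the file open, and the print stands in for the question's hypothetical ProcessLogFile:
from subprocess import CalledProcessError, Popen, PIPE, check_output

def process_if_unused(filename):
    lsout = Popen(['lsof', filename], stdout=PIPE, shell=False)
    try:
        check_output(['grep', filename], stdin=lsout.stdout, shell=False)
        # grep found the file in lsof's output: some process still has it open
    except CalledProcessError:
        # no process is using the file, so it is safe to process it here,
        # e.g. call the question's ProcessLogFile(filename)
        print("%s is not in use" % filename)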
Instead of using os.remove(), you can use the following workaround on Windows:
import os

file = "D:\\temp\\test.pdf"
if os.path.exists(file):
    try:
        os.rename(file, file + "_")
        print "Access on file \"" + str(file) + "\" is available!"
        os.rename(file + "_", file)
    except OSError as e:
        message = "Access-error on file \"" + str(file) + "\"!!! \n" + str(e)
        print message
You can use inotify to watch for activity in the file system. You can watch for file-close events, which indicate that a roll-over has happened. You could also add an additional condition on file size. Make sure you filter out file-close events caused by the second thread itself.
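A minimal sketch of that idea, assuming the third-party pyinotify package and a hypothetical log directory; IN_CLOSE_WRITE fires when a file that was open for writing is closed, which is the roll-over moment described above:
import pyinotify

logFolder = '/var/log/myapp'  # hypothetical folder from the question

class RolloverHandler(pyinotify.ProcessEvent):
    def process_IN_CLOSE_WRITE(self, event):
        # event.pathname is the log file the writing thread just closed
        print("rolled over:", event.pathname)
        # ProcessLogFile(event.pathname)  # the question's hypothetical processing function

wm = pyinotify.WatchManager()
wm.add_watch(logFolder, pyinotify.IN_CLOSE_WRITE)
pyinotify.Notifier(wm, RolloverHandler()).loop()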
A slightly more polished version of one of the answers from above.
from pathlib import Path

def is_file_in_use(file_path):
    path = Path(file_path)
    if not path.exists():
        raise FileNotFoundError
    try:
        path.rename(path)
    except PermissionError:
        return True
    else:
        return False
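Hypothetical use in the second thread from the question (logFolder is assumed, and the print stands in for the question's ProcessLogFile):
import os

logFolder = '/var/log/myapp'  # hypothetical folder from the question
for name in os.listdir(logFolder):
    full_path = os.path.join(logFolder, name)
    if not is_file_in_use(full_path):
        print("safe to process:", full_path)  # e.g. ProcessLogFile(full_path)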
On Windows, you can also retrieve the information directly by leveraging the NTDLL/KERNEL32 Windows APIs. The following code returns a list of PIDs, in case the file is still opened/used by a process (including your own, if you have an open handle on the file):
import ctypes
from ctypes import wintypes
path = r"C:\temp\test.txt"
# -----------------------------------------------------------------------------
# generic strings and constants
# -----------------------------------------------------------------------------
ntdll = ctypes.WinDLL('ntdll')
kernel32 = ctypes.WinDLL('kernel32', use_last_error=True)
NTSTATUS = wintypes.LONG
INVALID_HANDLE_VALUE = wintypes.HANDLE(-1).value
FILE_READ_ATTRIBUTES = 0x80
FILE_SHARE_READ = 1
OPEN_EXISTING = 3
FILE_FLAG_BACKUP_SEMANTICS = 0x02000000
FILE_INFORMATION_CLASS = wintypes.ULONG
FileProcessIdsUsingFileInformation = 47
LPSECURITY_ATTRIBUTES = wintypes.LPVOID
ULONG_PTR = wintypes.WPARAM
# -----------------------------------------------------------------------------
# create handle on concerned file with dwDesiredAccess == FILE_READ_ATTRIBUTES
# -----------------------------------------------------------------------------
kernel32.CreateFileW.restype = wintypes.HANDLE
kernel32.CreateFileW.argtypes = (
    wintypes.LPCWSTR,       # In     lpFileName
    wintypes.DWORD,         # In     dwDesiredAccess
    wintypes.DWORD,         # In     dwShareMode
    LPSECURITY_ATTRIBUTES,  # In_opt lpSecurityAttributes
    wintypes.DWORD,         # In     dwCreationDisposition
    wintypes.DWORD,         # In     dwFlagsAndAttributes
    wintypes.HANDLE)        # In_opt hTemplateFile
hFile = kernel32.CreateFileW(
    path, FILE_READ_ATTRIBUTES, FILE_SHARE_READ, None, OPEN_EXISTING,
    FILE_FLAG_BACKUP_SEMANTICS, None)
if hFile == INVALID_HANDLE_VALUE:
    raise ctypes.WinError(ctypes.get_last_error())
# -----------------------------------------------------------------------------
# prepare data types for system call
# -----------------------------------------------------------------------------
class IO_STATUS_BLOCK(ctypes.Structure):
    class _STATUS(ctypes.Union):
        _fields_ = (('Status', NTSTATUS),
                    ('Pointer', wintypes.LPVOID))
    _anonymous_ = '_Status',
    _fields_ = (('_Status', _STATUS),
                ('Information', ULONG_PTR))
iosb = IO_STATUS_BLOCK()
class FILE_PROCESS_IDS_USING_FILE_INFORMATION(ctypes.Structure):
    _fields_ = (('NumberOfProcessIdsInList', wintypes.LARGE_INTEGER),
                ('ProcessIdList', wintypes.LARGE_INTEGER * 64))
info = FILE_PROCESS_IDS_USING_FILE_INFORMATION()
PIO_STATUS_BLOCK = ctypes.POINTER(IO_STATUS_BLOCK)
ntdll.NtQueryInformationFile.restype = NTSTATUS
ntdll.NtQueryInformationFile.argtypes = (
    wintypes.HANDLE,        # In  FileHandle
    PIO_STATUS_BLOCK,       # Out IoStatusBlock
    wintypes.LPVOID,        # Out FileInformation
    wintypes.ULONG,         # In  Length
    FILE_INFORMATION_CLASS) # In  FileInformationClass
# -----------------------------------------------------------------------------
# system call to retrieve list of PIDs currently using the file
# -----------------------------------------------------------------------------
status = ntdll.NtQueryInformationFile(hFile, ctypes.byref(iosb),
                                      ctypes.byref(info),
                                      ctypes.sizeof(info),
                                      FileProcessIdsUsingFileInformation)
pidList = info.ProcessIdList[0:info.NumberOfProcessIdsInList]
print(pidList)
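One small addition that is not in the original snippet: the handle opened with CreateFileW should be released once the PID list has been read.
# release the handle opened above
kernel32.CloseHandle(hFile)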
Here is one more solution; please see the following code.
import glob
import os

def isFileinUsed(ifile):
    wildcard = "/proc/*/fd/*"
    lfds = glob.glob(wildcard)
    for fds in lfds:
        try:
            file = os.readlink(fds)
            if file == ifile:
                return True
        except OSError as err:
            if err.errno == 2:  # ENOENT: the descriptor went away in the meantime
                file = None
            else:
                raise
    return False
You can use this function to check whether a file is in use.
Note:
This solution can only be used on Linux systems.
I am loading a volume from a DICOM folder:
import SimpleITK as sitk
reader = sitk.ImageSeriesReader()
dicom_names = reader.GetGDCMSeriesFileNames(input_dir)
reader.SetFileNames(dicom_names)
image = reader.Execute()
and I am getting the following warning. Is it possible to catch this warning?
WARNING: In d:\a\1\work\b\itk-prefix\include\itk-5.1\itkImageSeriesReader.hxx, line 480
ImageSeriesReader (000002C665417450): Non uniform sampling or missing slices detected, maximum nonuniformity:292.521
I have tried the solutions from this question and they do not work. Is it because the warning message is coming from the C++ code?
As a warning generated from C++ code cannot be caught in Python, I came up with a workaround/hack which does not depend on a warning object. The solution is based on redirecting the sys.stderr of the code that can generate a warning to a file and checking that file for the "warning" keyword.
The context-manager code is based on this answer.
import ctypes
import os
import sys
from contextlib import contextmanager

# libc is needed for the fflush call below; loading the C runtime this way may
# need adjusting on Windows (assumption), so fall back gracefully if it fails
try:
    libc = ctypes.CDLL(None)
except OSError:
    libc = None

def flush(stream):
    try:
        if libc is not None:
            libc.fflush(None)
        stream.flush()
    except (AttributeError, ValueError, IOError):
        pass  # unsupported

def fileno(file_or_fd):
    fd = getattr(file_or_fd, 'fileno', lambda: file_or_fd)()
    if not isinstance(fd, int):
        raise ValueError("Expected a file (`.fileno()`) or a file descriptor")
    return fd

@contextmanager
def stdout_redirected(to=os.devnull, stdout=None):
    if stdout is None:
        stdout = sys.stdout
    stdout_fd = fileno(stdout)
    # copy stdout_fd before it is overwritten
    # Note: `copied` is inheritable on Windows when duplicating a standard stream
    with os.fdopen(os.dup(stdout_fd), 'wb') as copied:
        # stdout.flush() does not flush C stdio buffers on Python 3, where I/O is
        # implemented directly on read()/write() system calls. To flush all open
        # C stdio output streams, call libc.fflush(None) explicitly in case some
        # C extension uses stdio-based I/O:
        flush(stdout)
        try:
            os.dup2(fileno(to), stdout_fd)  # $ exec >&to
        except ValueError:  # filename
            with open(to, 'wb') as to_file:
                os.dup2(to_file.fileno(), stdout_fd)  # $ exec > to
        try:
            yield stdout  # allow code to be run with the redirected stdout
        finally:
            # restore stdout to its previous value
            # Note: dup2 makes stdout_fd inheritable unconditionally
            flush(stdout)
            os.dup2(copied.fileno(), stdout_fd)  # $ exec >&copied
Detecting a warning generated by the C++ code:
import SimpleITK as sitk

with open('output.txt', 'w') as f, stdout_redirected(f, stdout=sys.stderr):
    reader = sitk.ImageSeriesReader()
    dicom_names = reader.GetGDCMSeriesFileNames(input_dir)
    reader.SetFileNames(dicom_names)
    image = reader.Execute()

with open('output.txt') as f:
    content = f.read()

if "warning" in content.lower():
    raise RuntimeError('SimpleITK Warning!')
Prior warning: I'm hacking around here out of curiosity. I have no specific reason to do what I'm doing below!
Everything below was done with Python 2.7.13 on macOS 10.12.5.
I was hacking around with Python and thought it would be interesting to see what happened if I made stdout non-blocking:
fcntl.fcntl(sys.stdout.fileno(), fcntl.F_SETFL, os.O_NONBLOCK)
The call to fcntl is definitely successful. I then try to write a large amount of data (bigger than the maximum buffer size of a pipe on OS X, which is 65536 bytes). I do it in a variety of ways and get different outcomes, sometimes an exception, sometimes what seems to be a hard failure.
Case 1
fcntl.fcntl(sys.stdout.fileno(), fcntl.F_SETFL, os.O_NONBLOCK)

try:
    sys.stdout.write("A" * 65537)
except Exception as e:
    time.sleep(1)
    print "Caught: {}".format(e)

# Safety sleep to prevent quick exit
time.sleep(1)
This always throws the exception Caught: [Errno 35] Resource temporarily unavailable. Makes sense, I think: the higher-level file object wrapper is telling me the write call failed.
Case 2
fcntl.fcntl(sys.stdout.fileno(), fcntl.F_SETFL, os.O_NONBLOCK)

try:
    sys.stdout.write("A" * 65537)
except Exception as e:
    print "Caught: {}".format(e)

# Safety sleep to prevent quick exit
time.sleep(1)
This sometimes throws the exception Caught: [Errno 35] Resource temporarily unavailable, and sometimes no exception is caught and I see the following output:
close failed in file object destructor:
sys.excepthook is missing
lost sys.stderr
Case 3
fcntl.fcntl(sys.stdout.fileno(), fcntl.F_SETFL, os.O_NONBLOCK)

try:
    sys.stdout.write("A" * 65537)
except Exception as e:
    print "Caught: {}".format(e)

# Safety sleep to prevent quick exit
time.sleep(1)
print "Slept"
This sometimes throws the exception Caught: [Errno 35] Resource temporarily unavailable, and sometimes no exception is caught and I just see "Slept". It seems that by printing "Slept" I don't get the error message from Case 2.
Case 4
fcntl.fcntl(sys.stdout.fileno(), fcntl.F_SETFL, os.O_NONBLOCK)

try:
    os.write(sys.stdout.fileno(), "A" * 65537)
except Exception as e:
    print "Caught: {}".format(e)

# Safety sleep to prevent quick exit
time.sleep(1)
Always okay!
Case 5
fcntl.fcntl(sys.stdout.fileno(), fcntl.F_SETFL, os.O_NONBLOCK)

try:
    print os.write(sys.stdout.fileno(), "A" * 65537)
except Exception as e:
    print "Caught: {}".format(e)

# Safety sleep to prevent quick exit
time.sleep(1)
This is sometimes okay or sometimes prints the close failed in file object destructor error message.
My question is, why does this fail like this in Python? Am I doing something fundamentally bad here, either with Python or at the system level?
It seems that writing to stdout again too soon after a write has already failed causes the error message. The error doesn't appear to be an exception; I have no idea where it's coming from.
N.B. I can write the equivalent program in C and it works okay:
#include <stdio.h>
#include <stdlib.h>
#include <memory.h>
#include <sys/fcntl.h>
#include <unistd.h>
int main(int argc, const char * argv[])
{
    const size_t NUM_CHARS = 65537;
    char buf[NUM_CHARS];

    // Set stdout non-blocking
    fcntl(fileno(stdout), F_SETFL, O_NONBLOCK);

    // Try to write a large amount of data
    memset(buf, 65, NUM_CHARS);
    size_t written = fwrite(buf, 1, NUM_CHARS, stdout);

    // Wait briefly to give stdout a chance to be read from
    usleep(1000);

    // This will be written correctly
    sprintf(buf, "\nI wrote %zd bytes\n", written);
    fwrite(buf, 1, strlen(buf), stdout);

    return 0;
}
This is interesting. There are a few things that I've found so far:
Case 1
This is because sys.stdout.write will either write all of the string or throw an exception, which isn't the desired behavior when using O_NONBLOCK. When the underlying call to write returns EAGAIN (Errno 35 on OS X), it should be tried again with the data that is remaining. os.write should be used instead, and the return value should be checked to make sure all the data is written.
This code works as expected:
fcntl.fcntl(sys.stdout.fileno(), fcntl.F_SETFL, os.O_NONBLOCK)

def stdout_write(s):
    written = 0
    while written < len(s):
        try:
            written = written + os.write(sys.stdout.fileno(), s[written:])
        except OSError as e:
            pass

stdout_write("A" * 65537)
Case 2
I suspect that this error message is due to
https://bugs.python.org/issue11380:
close failed in file object destructor:
sys.excepthook is missing
lost sys.stderr
I'm not sure why it is only triggered sometimes. It may be because there is a print in the except clause that tries to use the same stdout that a write just failed on.
Case 3
This is similar to Case 1. This code always works for me:
fcntl.fcntl(sys.stdout.fileno(), fcntl.F_SETFL, os.O_NONBLOCK)

def stdout_write(s):
    written = 0
    while written < len(s):
        try:
            written = written + os.write(sys.stdout.fileno(), s[written:])
        except OSError as e:
            pass

stdout_write("A" * 65537)
time.sleep(1)
print "Slept"
Case 4
Make sure you check the return value of os.write; I suspect that the full 65537 bytes are not being written successfully.
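A quick way to check that (my own sketch, not from the original cases): capture the return value of os.write and report it on stderr, which is a separate, still-blocking stream when stdout is redirected to a pipe:
import fcntl
import os
import sys
import time

fcntl.fcntl(sys.stdout.fileno(), fcntl.F_SETFL, os.O_NONBLOCK)

data = "A" * 65537  # Python 2 str, matching the question; use b"A" * 65537 on Python 3
written = os.write(sys.stdout.fileno(), data)
# with an empty 64 KiB pipe buffer, `written` is typically 65536, not 65537
sys.stderr.write("wrote %d of %d bytes\n" % (written, len(data)))

# Safety sleep to prevent quick exit
time.sleep(1)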
Case 5
This is similar to Case 2.
How can I make a FIFO between two Python processes that allows dropping of lines if the reader is not able to handle the input?
If the reader tries to read or readline faster than the writer writes, it should block.
If the reader cannot work as fast as the writer writes, the writer should not block. Lines should not be buffered (except one line at a time) and only the last line written should be received by the reader on its next readline attempt.
Is this possible with a named FIFO, or is there any other simple way of achieving this?
The following code uses a named FIFO to allow communication between two scripts.
If the reader tries to read faster than the writer, it blocks.
If the reader cannot keep up with the writer, the writer does not block.
Operations are buffer-oriented; line-oriented operations are not currently implemented.
This code should be considered a proof-of-concept. The delays and buffer sizes are arbitrary.
Code
import argparse
import errno
import os
from select import select
import time


class OneFifo(object):
    def __init__(self, name):
        self.name = name

    def __enter__(self):
        if os.path.exists(self.name):
            os.unlink(self.name)
        os.mkfifo(self.name)
        return self

    def __exit__(self, exc_type, exc_value, exc_traceback):
        if os.path.exists(self.name):
            os.unlink(self.name)

    def write(self, data):
        print "Waiting for client to open FIFO..."
        try:
            server_file = os.open(self.name, os.O_WRONLY | os.O_NONBLOCK)
        except OSError as exc:
            if exc.errno == errno.ENXIO:
                server_file = None
            else:
                raise
        if server_file is not None:
            print "Writing line to FIFO..."
            try:
                os.write(server_file, data)
                print "Done."
            except OSError as exc:
                if exc.errno == errno.EPIPE:
                    pass
                else:
                    raise
            os.close(server_file)

    def read_nonblocking(self):
        result = None
        try:
            client_file = os.open(self.name, os.O_RDONLY | os.O_NONBLOCK)
        except OSError as exc:
            if exc.errno == errno.ENOENT:
                client_file = None
            else:
                raise
        if client_file is not None:
            try:
                rlist = [client_file]
                wlist = []
                xlist = []
                rlist, wlist, xlist = select(rlist, wlist, xlist, 0.01)
                if client_file in rlist:
                    result = os.read(client_file, 1024)
            except OSError as exc:
                if exc.errno == errno.EAGAIN or exc.errno == errno.EWOULDBLOCK:
                    result = None
                else:
                    raise
            os.close(client_file)
        return result

    def read(self):
        try:
            with open(self.name, 'r') as client_file:
                result = client_file.read()
        except (IOError, OSError) as exc:  # IOError on Python 2, OSError on Python 3
            if exc.errno == errno.ENOENT:
                result = None
            else:
                raise
        if not result:  # a missing FIFO or an empty read both mean "no data"
            result = None
        return result
def parse_argument():
    parser = argparse.ArgumentParser()
    parser.add_argument('-c', '--client', action='store_true',
                        help='Set this flag for the client')
    parser.add_argument('-n', '--non-blocking', action='store_true',
                        help='Set this flag to read without blocking')
    result = parser.parse_args()
    return result


if __name__ == '__main__':
    args = parse_argument()
    if not args.client:
        with OneFifo('known_name') as one_fifo:
            while True:
                one_fifo.write('one line')
                time.sleep(0.1)
    else:
        one_fifo = OneFifo('known_name')
        while True:
            if args.non_blocking:
                result = one_fifo.read_nonblocking()
            else:
                result = one_fifo.read()
            if result is not None:
                print result
The server checks if the client has opened the FIFO. If the client has opened the FIFO, the server writes a line. Otherwise, the server continues running. I have implemented a non-blocking read because the blocking read causes a problem: If the server restarts, most of the time the client stays blocked and never recovers. With a non-blocking client, a server restart is more easily tolerated.
Output
[user@machine:~] python onefifo.py
Waiting for client to open FIFO...
Waiting for client to open FIFO...
Writing line to FIFO...
Done.
Waiting for client to open FIFO...
Writing line to FIFO...
Done.
[user@machine:~] python onefifo.py -c
one line
one line
Notes
On startup, if the server detects that the FIFO already exists, it removes it. This is the easiest way to notify clients that the server has restarted. This notification is usually ignored by the blocking version of the client.
Well, that's not actually a FIFO (queue) as far as I am aware - it's a single variable. I suppose it might be implementable if you set up a queue or pipe with a maximum size of 1, but it seems that it would work better to use a Lock on a single object in one of the processes, which the other process references via a proxy object. The reader would set it to None whenever it reads, and the writer would overwrite the contents every time it writes.
You can get those to the other processes by passing the proxy of the object, and a proxy of the lock, as an argument to all relevant processes. To get it slightly more conveniently, you can use a Manager, which provides a single object with proxy that you can pass in, which contains and provides proxies for whatever other objects (including locks) you want to put in it. This answer provides a useful example of proper use of a Manager to pass objects into a new process.
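A minimal sketch of that arrangement (the names are mine, not from the question): a Manager Namespace holds a single "latest line" slot guarded by a Manager lock; the writer overwrites the slot without ever waiting for the reader, and the reader consumes and clears it, so anything it misses is simply dropped:
import time
from multiprocessing import Manager, Process

def writer(ns, lock):
    for i in range(50):
        with lock:
            ns.latest = "line %d" % i   # overwrite the slot; never waits for the reader
        time.sleep(0.01)

def reader(ns, lock):
    while True:
        with lock:
            line = ns.latest
            ns.latest = None            # consume the slot
        if line is not None:
            print(line)                 # the reader is slower, so intermediate lines are dropped
        time.sleep(0.1)

if __name__ == '__main__':
    manager = Manager()
    ns = manager.Namespace()
    ns.latest = None
    lock = manager.Lock()
    w = Process(target=writer, args=(ns, lock))
    r = Process(target=reader, args=(ns, lock))
    w.start()
    r.start()
    w.join()
    r.terminate()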
I am trying to print a list of tuples, formatted, to my stdout. For this, I use the str.format method. Everything works fine, but when I pipe the output to see the first lines using the head command, an IOError occurs.
Here is my code:
# creating the data
data = []
for i in range(0, 1000):
    pid = 'pid%d' % i
    uid = 'uid%d' % i
    pname = 'pname%d' % i
    data.append( (pid, uid, pname) )

# find the max-length string for each field
pids, uids, pnames = zip(*data)
max_pid = len("%s" % max( pids) )
max_uid = len("%s" % max( uids) )
max_pname = len("%s" % max( pnames) )

# my template for the formatted strings
template = "{0:%d}\t{1:%d}\t{2:%d}" % (max_pid, max_uid, max_pname)

# print the formatted output to stdout
for pid, uid, pname in data:
    print template.format(pid, uid, pname)
And here is the error I get after running the command: python myscript.py | head
Traceback (most recent call last):
File "lala.py", line 16, in <module>
print template.format(pid, uid, pname)
IOError: [Errno 32] Broken pipe
Can anyone help me on this?
I tried to put print in a try-except block to handle the error,
but after that there was another message in the console:
close failed in file object destructor:
sys.excepthook is missing
lost sys.stderr
I also tried to flush the data immediately with two consecutive sys.stdout.write and sys.stdout.flush calls, but nothing happened.
head reads the lines it needs from the pipe and then closes it. That causes print to fail: internally it writes to sys.stdout, whose pipe is now closed.
You can simply catch the IOError and exit silently:
try:
    for pid, uid, pname in data:
        print template.format(pid, uid, pname)
except IOError:
    # stdout is closed, no point in continuing
    # Attempt to close them explicitly to prevent cleanup problems:
    try:
        sys.stdout.close()
    except IOError:
        pass
    try:
        sys.stderr.close()
    except IOError:
        pass
The behavior you are seeing is linked to the buffered output implementation in Python 3. The problem can be avoided by using the -u option or by setting the environment variable PYTHONUNBUFFERED=x. See the man pages for more information on -u.
$ python2.7 testprint.py | echo
Exc: <type 'exceptions.IOError'>
$ python3.5 testprint.py | echo
Exc: <class 'BrokenPipeError'>
Exception ignored in: <_io.TextIOWrapper name='<stdout>' mode='w' encoding='UTF-8'>
BrokenPipeError: [Errno 32] Broken pipe
$ python3.5 -u testprint.py | echo
Exc: <class 'BrokenPipeError'>
$ export PYTHONUNBUFFERED=x
$ python3.5 testprint.py | echo
Exc: <class 'BrokenPipeError'>
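The testprint.py used above is not shown in the answer; a stand-in that produces the same kind of output (my guess, not the original script) could be as simple as:
import sys

try:
    for _ in range(100000):
        print("some output line")
except BaseException as e:
    # report the exception type on stderr, which is still connected to the terminal
    sys.stderr.write("Exc: %s\n" % type(e))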
In general, I try to catch the most specific error I can get away with. In this case it is BrokenPipeError:
try:
    # I usually call a function here that generates all my output:
    for pid, uid, pname in data:
        print(template.format(pid, uid, pname))
except BrokenPipeError as e:
    pass  # Ignore. Something like head is truncating output.
finally:
    sys.stderr.close()
If this is at the end of execution, I find I only need to close sys.stderr. If I don't close sys.stderr, I'll get a BrokenPipeError but without a stack trace.
This seems to be the minimum fix for writing tools that output to pipelines.
I had this problem with Python 3 and debug logging piped into head as well. If your script talks to the network or does file I/O, simply dropping IOErrors is not a good solution. Despite the mentions here, I was not able to catch BrokenPipeError for some reason.
Found a blog post talking about restoring the default signal handler for sigpipe: http://newbebweb.blogspot.com/2012/02/python-head-ioerror-errno-32-broken.html
In short, you add the following to your script before the bulk of the output:
if log.isEnabledFor(logging.DEBUG):  # optional
    # restore the default SIGPIPE behaviour so the process exits quietly on a broken pipe
    from signal import signal, SIGPIPE, SIG_DFL
    signal(SIGPIPE, SIG_DFL)
This seems to happen with head but not with other programs such as grep; as mentioned, head closes the pipe early. If you don't use head with the script often, it may not be worth worrying about.