Get file object from file number - python

Let's say I have a list of the opened files (actually, of the file numbers):

import resource
import fcntl

def get_open_fds():
    fds = []
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    for fd in range(3, soft):
        try:
            flags = fcntl.fcntl(fd, fcntl.F_GETFD)
        except IOError:
            continue
        fds.append(fd)
    return fds
Now I would like to get the names of those files. How can I do this?
EDIT
Just to clarify, for those downvoting this: fd is an integer, a file number. It is NOT a file object. Sorry for confusing you with the name, but the code is self-explanatory.
EDIT2
I am getting flamed about this, I think because of my choice of fd to mean file number. I just checked the documentation:
All functions in this module take a file descriptor fd as their first
argument. This can be an integer file descriptor, such as returned by
sys.stdin.fileno(), or a file object, such as sys.stdin itself, which
provides a fileno() which returns a genuine file descriptor.
So fd is indeed an integer. It can also be a file object but, in the general case, fd has no .name attribute.

As per this answer:

import os

for fd in get_open_fds():
    print fd, os.readlink('/proc/self/fd/%d' % fd)
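Note that /proc/self/fd is Linux-specific. If you need an actual file object rather than just the name, a minimal sketch using os.fdopen (which wraps an existing descriptor; closing the returned object also closes the descriptor):

import os

fd = get_open_fds()[0]   # some descriptor from the list above
f = os.fdopen(fd)        # wrap the raw descriptor in a file object
# f.name will not be a filesystem path here, which is why the
# /proc/self/fd trick above is needed to recover the actual name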

I was in the same boat. What I ended up doing is writing my own open that keeps track of all the files opened. Then in the initial Python file the first thing that happens is the built-in open gets replaced by mine, and then later I can query it for the currently open files. This is what it looks like:
class Open(object):
    builtin_open = open
    _cache = {}

    @classmethod
    def __call__(cls, name, *args):
        file = cls.builtin_open(name, *args)
        cls._cache[name] = file
        return file

    @classmethod
    def active(cls, name):
        cls.open_files()   # prune closed files from the cache first
        try:
            return cls._cache[name]
        except KeyError:
            raise ValueError('%s has been closed' % name)

    @classmethod
    def open_files(cls):
        closed = []
        for name, file in cls._cache.items():
            if file.closed:
                closed.append(name)
        for name in closed:
            cls._cache.pop(name)
        return cls._cache.items()

import __builtin__
__builtin__.open = Open()
then later...
daemon.files_preserve = [open.active('/dev/urandom')]
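A quick sketch of how the replacement behaves (this is Python 2, hence __builtin__; the path is just an example):

f = open('/dev/urandom', 'rb')            # routed through Open.__call__
assert open.active('/dev/urandom') is f   # still open, so the cache returns it
f.close()
open.active('/dev/urandom')               # now raises ValueError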

Related

Decorator to close open files

I would like to write a decorator that closes all open files. For example:

@close_files
def view(request):
    f = open('myfile.txt').read()
    return render('template.html')
Ignoring any threadsafe stuff, how could I write such a decorator to close out any open files after the function has returned? I'm not interested in writing a context manager, but something like this:

def close_files(func):
    @wraps(func)
    def wrapper_close_files(*args, **kwargs):
        return_value = func(*args, **kwargs)
        # close open files here?
        return return_value
    return wrapper_close_files
Unfortunately I cannot use something like this:
with open('myfile.txt') as _f: f = _f.read()
I'm asking how to write a decorator that closes files when we do not have direct access to the variable which references the file (handle).
According to python docs 3.6:
If you’re not using the with keyword, then you should call f.close() to close the file and immediately free up any system resources used by it. If you don’t explicitly close a file, Python’s garbage collector will eventually destroy the object and close the open file for you, but the file may stay open for a while. Another risk is that different Python implementations will do this clean-up at different times.
So if you want to do this, you need a decorator that intercepts the open function in the builtins module. Since we don't know in advance how many files will be opened, contextlib.ExitStack is a good fit: each file handle opened inside the decorated function is entered into the ExitStack and closed when the stack exits.
My version for that is:
import builtins
import contextlib
import functools

def close_opened_files(func):
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        _open = getattr(builtins, 'open')
        try:
            with contextlib.ExitStack() as stack:
                def myopen(*args, **kwargs):
                    f = stack.enter_context(_open(*args, **kwargs))
                    return f
                setattr(builtins, 'open', myopen)
                ret = func(*args, **kwargs)
                return ret
        finally:
            setattr(builtins, 'open', _open)
    return wrapper
At the end of the decorator, the original open function should be restored in the builtins module.
One problem with this approach: any function called inside func that also uses open will have its files closed when the decorator exits, even if the caller intended to keep them open.
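A small illustration of that pitfall (helper and data.txt are hypothetical, and data.txt is assumed to exist):

def helper():
    return open('data.txt')   # open is resolved at call time, so this is the patched version

@close_opened_files
def outer():
    return helper()

fh = outer()   # the ExitStack closes helper's file when outer returns
fh.read()      # raises ValueError: I/O operation on closed file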
Here are some examples:
The following examples show read and write operations, including files that are closed within the function itself.

@close_opened_files
def test_write(fn1, fn2=None):
    f1 = open(fn1, 'w')
    print('Hello world', file=f1)
    if fn2:
        f2 = open(fn2, 'w')
        print('foo-bar', file=f2)
        f2.close()

@close_opened_files
def test_read(fn1, fn2=None):
    f1 = open(fn1, 'r')
    for line in f1: print(line)
    if fn2:
        f2 = open(fn2, 'r')
        for line in f2: print(line)
        f2.close()

# test_exception was not shown in the original; this reconstruction opens a
# file and then raises, matching the RuntimeError caught below
@close_opened_files
def test_exception(fn1):
    f1 = open(fn1, 'r')
    raise RuntimeError('some error')

test_write('file1.txt', 'file2.txt')
test_read('file1.txt', 'file2.txt')

try: test_read('non_exist_filename.txt')
except FileNotFoundError as ex: print(ex)

try: test_exception('file1.txt')
except RuntimeError as ex: print(ex)
These last examples show the decorator's behavior under exceptions: a file that doesn't exist, or any other exception raised inside the function. In the latter case, the file is closed before the exception propagates up.
I believe it would be possible to use ast to traverse the code for file openings, and attempt to close any files that were found to have been opened.

How to differentiate a file like object from a file path like object

Summary:
There is a variety of functions for which it would be very useful to be able to pass in either of two kinds of objects: an object that represents a path (usually a string) and an object that represents a stream of some sort (often something derived from IOBase, but not always). How can such a function differentiate between these two kinds of objects so each can be handled appropriately?
Say I have a function intended to write a file from some kind of object file generator method:

spiff = MySpiffy()

def spiffy_file_makerA(spiffy_obj, file):
    file_str = '\n'.join(spiffy_obj.gen_file())
    file.write(file_str)

with open('spiff.out', 'x') as f:
    spiffy_file_makerA(spiff, f)
    ...do other stuff with f...
This works. Yay. But I'd prefer to not have to worry about opening the file first or passing streams around, at least sometimes... so I refactor with the ability to take a file path like object instead of a file like object, and a return statement:
def spiffy_file_makerB(spiffy_obj, file, mode):
    file_str = '\n'.join(spiffy_obj.gen_file())
    file = open(file, mode)
    file.write(file_str)
    return file

with spiffy_file_makerB(spiff, 'file.out', 'x') as f:
    ...do other stuff with f...
But now it occurs to me that it would be useful to have a third function that combines the other two versions, deciding what to do based on whether file is file-like or file-path-like, and returning the destination file-like object f for use in a context manager. Then I could write code like this:
with spiffy_file_makerAB(spiffy_obj, file_path_like, mode='x') as f:
    ...do other stuff with f...

...but also like this:

file_like_obj = get_some_socket_or_stream()
with spiffy_file_makerAB(spiffy_obj, file_like_obj, mode='x'):
    ...do other stuff with file_like_obj...
    # file_like_obj stream closes when context manager exits
    # unless closefd=False
Note that this will require something a bit different than the simplified versions provided above.
Try as I might, I haven't been able to find an obvious way to do this, and the ways I have found seem pretty contrived and just a potential for problems later. For example:
def spiffy_file_makerAB(spiffy_obj, file, mode, *, closefd=True):
    try:
        # file-like (use the file descriptor to open)
        result_f = open(file.fileno(), mode, closefd=closefd)
    except AttributeError:
        # file-path-like (a path string has no fileno() method)
        result_f = open(file, mode)
    finally:
        file_str = '\n'.join(spiffy_obj.gen_file())
        result_f.write(file_str)
        return result_f
Are there any suggestions for a better way? Am I way off base and need to be handling this completely differently?
For my money, and this is an opinionated answer, checking for the attributes of the file-like object for the operations you will need is a pythonic way to determine an object’s type because that is the nature of pythonic duck tests/duck-typing:
Duck typing is heavily used in Python, with the canonical example being file-like classes (for example, cStringIO allows a Python string to be treated as a file).
Or from the python docs’ definition of duck-typing
A programming style which does not look at an object’s type to determine if it has the right interface; instead, the method or attribute is simply called or used (“If it looks like a duck and quacks like a duck, it must be a duck.”) By emphasizing interfaces rather than specific types, well-designed code improves its flexibility by allowing polymorphic substitution. Duck-typing avoids tests using type() or isinstance(). (Note, however, that duck-typing can be complemented with abstract base classes.) Instead, it typically employs hasattr() tests or EAFP programming.
If you feel very strongly that there is some very good reason that just checking the interface for suitability isn't enough, you can reverse the test and check for basestring or str to determine whether the provided object is path-like. The test differs depending on your version of Python.
is_file_like = not isinstance(fp, basestring) # python 2
is_file_like = not isinstance(fp, str) # python 3
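On Python 3.6+, pathlib.Path objects are also valid paths, so a more general version of this test (a sketch, assuming fp is the object being checked) also accounts for os.PathLike:

import os

is_file_like = not isinstance(fp, (str, bytes, os.PathLike))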
In any case, for your context manager, I would go ahead and make a full-blown object like the below in order to wrap the functionality that you were looking for.
class SpiffyContextGuard(object):
    def __init__(self, spiffy_obj, file, mode, closefd=True):
        self.spiffy_obj = spiffy_obj
        is_file_like = all(hasattr(file, attr) for attr in ('seek', 'close', 'read', 'write'))
        self.fp = file if is_file_like else open(file, mode)
        self.closefd = closefd
    def __enter__(self):
        return self.fp
    def __exit__(self, type_, value, traceback):
        generated = '\n'.join(self.spiffy_obj.gen_file())
        self.fp.write(generated)
        if self.closefd:
            self.fp.close()
And then use it like this:

with SpiffyContextGuard(obj, 'hamlet.txt', 'w', True) as f:
    f.write('Oh that this too too sullied flesh\n')

fp = open('hamlet.txt', 'a')
with SpiffyContextGuard(obj, fp, 'a', False) as f:
    f.write('Would melt, thaw, resolve itself into a dew\n')
with SpiffyContextGuard(obj, fp, 'a', True) as f:
    f.write('Or that the everlasting had not fixed his canon\n')
If you wanted to use try/catch semantics to check for type suitability, you could also wrap the file operations you wanted to expose on your context guard:
class SpiffyContextGuard(object):
    def __init__(self, spiffy_obj, file, mode, closefd=True):
        self.spiffy_obj = spiffy_obj
        self.fp = self.file_or_path = file
        self.mode = mode
        self.closefd = closefd
    def seek(self, offset, *args):
        try:
            self.fp.seek(offset, *args)
        except AttributeError:
            self.fp = open(self.file_or_path, self.mode)
            self.fp.seek(offset, *args)
    # define wrappers for write, read, etc., as well
    def __enter__(self):
        return self
    def __exit__(self, type_, value, traceback):
        generated = '\n'.join(self.spiffy_obj.gen_file())
        self.write(generated)
        if self.closefd:
            self.fp.close()
My suggestion is to pass pathlib.Path objects around. You can simply call .write_bytes(...) or .write_text(...) on these objects.
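A minimal sketch of that approach (spiffy_obj and gen_file() reuse the question's hypothetical API):

from pathlib import Path

def spiffy_file_makerC(spiffy_obj, path):
    # accepts either a str or a pathlib.Path
    Path(path).write_text('\n'.join(spiffy_obj.gen_file()))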
Other than that, you'd have to check the type of your file variable (this is how polymorphism can be done in Python):
from io import IOBase

def some_function(file):
    if isinstance(file, IOBase):
        file.write(...)
    else:
        with open(file, 'w') as file_handler:
            file_handler.write(...)

(I hope io.IOBase is the most basic class to check against...) And you would have to catch possible exceptions around all that.
Probably not the answer you're looking for, but from a taste point of view I think it's better to have functions that only do one thing. Reasoning about them is easier this way.
I'd just have two functions: spiffy_file_makerA(spiffy_obj, file), which handles your first case, and a convenience function that wraps spiffy_file_makerA and creates a file for you.
Another approach to this problem, inspired by this talk from Raymond Hettinger at PyCon 2013, would be to keep the two functions separate as suggested by a couple of the other answers, but to bring the functions together into a class with a number of alternative options for outputting the object.
Continuing with the example I started with, it might look something like this:
class SpiffyFile(object):
    def __init__(self, spiffy_obj, file_path=None, *, mode='w'):
        self.spiffy = spiffy_obj
        self.file_path = file_path
        self.mode = mode
    def to_str(self):
        return '\n'.join(self.spiffy.gen_file())
    def to_stream(self, fstream):
        fstream.write(self.to_str())
    def __enter__(self):
        try:
            # do not override an existing stream
            self.fstream
        except AttributeError:
            # convert self.file_path to str to allow for pathlib.Path objects
            self.fstream = open(str(self.file_path), mode=self.mode)
        return self
    def __exit__(self, exc_t, exc_v, tb):
        self.fstream.close()
        del self.fstream
    def to_file(self, file_path=None, mode=None):
        if mode is None:
            mode = self.mode
        try:
            fstream = self.fstream
        except AttributeError:
            if file_path is None:
                file_path = self.file_path
            # convert file_path to str to allow for pathlib.Path objects
            with open(str(file_path), mode=mode) as fstream:
                self.to_stream(fstream)
        else:
            if mode != fstream.mode:
                raise IOError('Ambiguous stream output mode: '
                              'provided mode and fstream.mode conflict')
            if file_path is not None:
                raise IOError('Ambiguous output destination: '
                              'a file_path was provided with an already active file stream.')
            self.to_stream(fstream)
Now we have lots of different options for exporting a MySpiffy object by using a SpiffyFile object. We can just write it to a file directly:
from pathlib import Path

spiff = MySpiffy()
p = Path('spiffies')/'new_spiff.txt'
SpiffyFile(spiff, p).to_file()
We can override the path, too:
SpiffyFile(spiff).to_file(p.parent/'other_spiff.text')
But we can also use an existing open stream:
SpiffyFile(spiff).to_stream(my_stream)
Or, if we want to edit the string first we could open a new file stream ourselves and write the edited string to it:
my_heading = 'This is a spiffy object\n\n'
with open(str(p), mode='w') as fout:
    spiff_out = SpiffyFile(spiff).to_str()
    fout.write(my_heading + spiff_out)
And finally, we can use the SpiffyFile object directly as a context manager to export to as many different locations or streams as we like (note that we can pass the pathlib.Path object directly without worrying about string conversion, which is nifty):
with SpiffyFile(spiff, p) as spiff_file:
    spiff_file.to_file()
    spiff_file.to_file(p.parent/'new_spiff.txt')
    print(spiff_file.to_str())
    spiff_file.to_stream(my_open_stream)
This approach is more consistent with the mantra: explicit is better than implicit.

Overridden function isn't called when using with statement in python

I'm running code from homework assignments that call open. Trouble is, the students were told not to submit the data they were given and to assume we have it - and open doesn't look in sys.path.
Luckily, I am using Spyder, which allows me to choose a script to be executed when initializing a console. I figured I could override open, so I defined a new open function which calls the original open on the absolute path of the files. But when someone uses with open(...) as ..., it doesn't work.
I know this may not be a good thing to do, but I can't go over every file in every submitted assignment looking for and replacing the call to open...
My code is:

old_open = open

def open(*args, **kwrdargs):
    try:
        res = old_open(*args, **kwrdargs)
        return res
    except:
        args = list(args)
        if 'DS1' in args[0]:
            args[0] = DS1
        elif 'DS2_X' in args[0]:
            args[0] = DS2_X
        elif 'DS2_Y' in args[0]:
            args[0] = DS2_Y
        args = tuple(args)
        res = old_open(*args, **kwrdargs)
        return res
DS1, DS2_X, DS2_Y contain the absolute paths to the files.
When executing:
with open('DS1.data', 'r') as f:
I get the error:
FileNotFoundError: [Errno 2] No such file or directory: 'DS1.data'
while using:
f=open('DS1.data','r')
works.
I debugged the code, and when using with, my open is not called, but when using f=open(...), it is. Why is this happening?
open is supposed to return a file-like object; it is that object (i.e. f in the example below) that is supposed to have __enter__ and __exit__ methods. For example, you could write your with-statement as:

f = open(...)
with f as ...:
    do_something()
If you don't return the object returned by open, but some wrapper of your own around the file object, you have to wrap those methods too. But from your description it looks more like you don't need that; rather, somewhere you didn't return a file. Your open should look something like:

def open(fname, *args, **kwds):
    for p in sys.path:
        fn = build_filename(p, fname)
        try:
            return _orig_open(fn, *args, **kwds)
        except IOError:
            pass
    return _orig_open(fname, *args, **kwds)   # must return a file or raise an exception
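The answer leans on a small helper it doesn't define; a plausible sketch (build_filename is just the name used above, not a library function):

import os.path

def build_filename(p, fname):
    # join a sys.path entry with the submitted data file's name
    return os.path.join(p, fname)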

What is NamedTemporaryFile useful for on Windows?

The Python module tempfile contains both NamedTemporaryFile and TemporaryFile. The documentation for the former says
Whether the name can be used to open the file a second time, while the named temporary file is still open, varies across platforms (it can be so used on Unix; it cannot on Windows NT or later)
What is the point of the file having a name if I can't use that name? If I want the useful (for me) behaviour of Unix on Windows, I've got to make a copy of the code and rip out all the bits that say if _os.name == 'nt' and the like.
What gives? Surely this is useful for something, since it was deliberately coded this way, but what is that something?
The documentation only says that the file can't be opened a second time while it is still open. You can still use the name otherwise; just be sure to pass delete=False when creating the NamedTemporaryFile so that it persists after it is closed.
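A minimal sketch of the delete=False pattern (the file name and contents are examples):

import os
import tempfile

with tempfile.NamedTemporaryFile(mode='w', suffix='.txt', delete=False) as tmp:
    tmp.write('hello')
    name = tmp.name

with open(name) as f:   # reopening by name works once the file is closed
    assert f.read() == 'hello'

os.unlink(name)         # with delete=False, we are responsible for cleanup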
I use this:
import os, tempfile, gc

class TemporaryFile:
    def __init__(self, name, io, delete):
        self.name = name
        self.__io = io
        self.__delete = delete
    def __getattr__(self, k):
        # delegate everything else (read, write, seek, ...) to the wrapped file
        return getattr(self.__io, k)
    def __del__(self):
        if self.__delete:
            try:
                os.unlink(self.name)
            except FileNotFoundError:
                pass

def NamedTemporaryFile(mode='w+b', bufsize=-1, suffix='', prefix='tmp', dir=None, delete=True):
    if not dir:
        dir = tempfile.gettempdir()
    name = os.path.join(dir, prefix + os.urandom(32).hex() + suffix)
    if mode is None:
        # mode=None reserves a name without creating the file
        return TemporaryFile(name, None, delete)
    fh = open(name, "w+b", bufsize)
    if mode != "w+b":
        fh.close()
        fh = open(name, mode)
    return TemporaryFile(name, fh, delete)
def test_ntf_txt():
    x = NamedTemporaryFile("w")
    x.write("hello")
    x.close()
    assert os.path.exists(x.name)
    with open(x.name) as f:
        assert f.read() == "hello"

def test_ntf_name():
    x = NamedTemporaryFile(suffix="s", prefix="p")
    assert os.path.basename(x.name)[0] == 'p'
    assert os.path.basename(x.name)[-1] == 's'
    x.write(b"hello")
    x.seek(0)
    assert x.read() == b"hello"

def test_ntf_del():
    x = NamedTemporaryFile(suffix="s", prefix="p")
    assert os.path.exists(x.name)
    name = x.name
    del x
    gc.collect()
    assert not os.path.exists(name)

def test_ntf_mode_none():
    x = NamedTemporaryFile(suffix="s", prefix="p", mode=None)
    assert not os.path.exists(x.name)
    name = x.name
    f = open(name, "w")
    f.close()
    assert os.path.exists(name)
    del x
    gc.collect()
    assert not os.path.exists(name)
Works on all platforms; you can close it, open it up again, etc.
The mode=None feature is what you want: you can ask for a tempfile, specify mode=None, and get a UUID-styled temp name with the dir/suffix/prefix that you want. The tests above show usage.
It's basically the same as NamedTemporaryFile, except the file gets auto-deleted when the returned object is garbage collected, not on close.
You don't want to "rip out all the bits...". It's coded like that for a reason. It says you can't open it a SECOND time while it's still open. Don't. Just use it once, and throw it away (after all, it is a temporary file). If you want a permanent file, create your own.
"Surely this is useful for something, since it was deliberately coded this way, but what is that something". Well, I've used it to write emails to (in a binary format) before copying them to a location where our Exchange Server picks them up & sends them. I'm sure there are lots of other use cases.
I'm pretty sure the Python library writers didn't just decide to make NamedTemporaryFile behave differently on Windows for laughs. All those _os.name == 'nt' tests will be there because of platform differences between Windows and Unix. So my inference from that documentation is that on Windows a file opened the way NamedTemporaryFile opens it cannot be opened again while NamedTemporaryFile still has it open, and that this is due to the way Windows works.

How do you subclass the file type in Python?

I'm trying to subclass the built-in file class in Python to add some extra features to stdin and stdout. Here's the code I have so far:
import datetime
import sys

class TeeWithTimestamp(file):
    """
    Class used to tee the output of a stream (such as stdout or stderr) into
    another stream, and to add a timestamp to each message printed.
    """
    def __init__(self, file1, file2):
        """Initializes the TeeWithTimestamp"""
        self.file1 = file1
        self.file2 = file2
        self.at_start_of_line = True

    def write(self, text):
        """Writes text to both files, prefixed with a timestamp"""
        if len(text):
            # Add timestamp if at the start of a line; also add [STDERR]
            # for stderr
            if self.at_start_of_line:
                now = datetime.datetime.now()
                prefix = now.strftime('[%H:%M:%S] ')
                if self.file1 == sys.__stderr__:
                    prefix += '[STDERR] '
                text = prefix + text
            self.file1.write(text)
            self.file2.write(text)
            self.at_start_of_line = (text[-1] == '\n')
The purpose is to add a timestamp to the beginning of each message, and to log everything to a log file. However, the problem I run into is that if I do this:
# log_file has already been opened
sys.stdout = TeeWithTimestamp(sys.stdout, log_file)
Then when I try to do print 'foo', I get a ValueError: I/O operation on closed file. I can't meaningfully call file.__init__() in my __init__(), since I don't want to open a new file, and I can't assign self.closed = False either, since it's a read-only attribute.
How can I modify this so that I can do print 'foo', and so that it supports all of the standard file attributes and methods?
Calling file.__init__ is quite feasible (e.g., on '/dev/null') but of no real use, because your attempted override of write doesn't "take" for the purposes of print statements -- the latter internally calls the real file.write when it sees that sys.stdout is an actual instance of file (and by inheriting you've made it so).
print doesn't really need any other method except write, so making your class inherit from object instead of file will work.
If you need other file methods (i.e., print is not all you're doing), you're best advised to implement them yourself.
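A minimal sketch of the object-based approach (TeeWrapper is a made-up name; the __getattr__ delegation is one way to cover "all of the standard file attributes and methods"):

class TeeWrapper(object):
    def __init__(self, stream, log):
        self.stream = stream
        self.log = log
    def write(self, text):
        self.stream.write(text)
        self.log.write(text)
    def __getattr__(self, name):
        # delegate flush(), fileno(), encoding, etc. to the wrapped stream
        return getattr(self.stream, name)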
You can also avoid using super:

class SuperFile(file):
    def __init__(self, *args, **kwargs):
        file.__init__(self, *args, **kwargs)
You'll be able to write with it.
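For example (Python 2 only, since the built-in file type was removed in Python 3; the path is just an example):

f = SuperFile('/tmp/example.txt', 'w')
f.write('data\n')
f.close()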
