I have a huge code base before me, and I have a place where a file with name "foobar" gets written.
I have no clue where this file gets read.
My idea how to solve this:
do monkey patching or mocking. An exceptions should get raised if a file with this name gets opened.
run all tests and see where the exception gets raised.
How to let the interpreter raise an exception if a file with given name gets opened?
I am sure that the place I search is pure python, not a c-extension.
I use Python 2.7
You can override (shadow) builtin open function. Add this in your main module:
import __builtin__
open_file = __builtin__.open
def fake_open(filename, *args, **kwargs):
if filename == 'foobar':
raise Exception('foobar filename')
else:
return open_file(filename, *args, **kwargs)
__builtin__.open = fake_open
Related
I would like to write a decorator that closes all open files. For example:
#close_files
def view(request):
f = open('myfile.txt').read()
return render('template.html')
Ignoring any threadsafe stuff, how could I write such a decorator to close out any open files after the function is returned? I'm not interested in writing a context manager, but something like this:
def close_files(func):
#wraps(func)
def wrapper_close_files(*args):
return_value = func(*args, **kwargs)
# close open files here?
return return_value
return wrapper_close_files
Unfortunately I cannot use something like this:
with open('myfile.txt') as _f: f = _f.read()
I'm asking about how to do a decorator to close files where we do not have direct access to the variable which references the file (handler).
According to python docs 3.6:
If you’re not using the with keyword, then you should call f.close() to close the file and immediately free up any system resources used by it. If you don’t explicitly close a file, Python’s garbage collector will eventually destroy the object and close the open file for you, but the file may stay open for a while. Another risk is that different Python implementations will do this clean-up at different times.
So if you want to do this, you need a decorator who captures the open command from the builtins module. However, since we don't know how many files will be opened, we can use ExitStack. Thus, you can use a function that keeps the file handle within the ExitStack context.
My version for that is:
def close_opened_files(func):
#functools.wraps(func)
def wrapper(*args, **kwargs):
import builtins
_open = getattr(builtins, 'open')
try:
with contextlib.ExitStack() as stack:
def myopen(*args, **kwargs):
f = stack.enter_context(_open(*args, **kwargs))
return f
setattr(builtins, 'open', myopen)
ret = func(*args, **kwargs)
return ret
finally:
setattr(builtins, 'open', _open)
return wrapper
At the end of the decorator, the original open function should be restored in the builtins module.
One of the problems with this type of approach is: the functions that are called inside func and that use the open function, will also have their files closed at the end of decorator, even if it represents an error.
Here are some examples:
The following examples show read and write operations, also with files being closed within the function.
#close_opened_files
def test_write(fn1, fn2=None):
f1 = open(fn1, 'w')
print('Hello world', file=f1)
if fn2:
f2 = open(fn2, 'w')
print('foo-bar', file=f2)
f2.close()
#close_opened_files
def test_read(fn1, fn2=None):
f1 = open(fn1, 'r')
for line in f1: print(line)
if fn2:
f2 = open(fn2, 'r')
for line in f2: print(line)
f2.close()
test_write('file1.txt', 'file2.txt')
test_read('file1.txt', 'file2.txt')
try: test_read('non_exist_filename.txt')
except FileNotFoundError as ex: print(ex)
try: test_exception('file1.txt')
except RuntimeError as ex: print(ex)
These last examples show the decorator under the exceptions: 'no file exists', or any other exception that happens.
The last case, file will be closed before the exception is raised.
I believe it would be possible to use ast to traverse the code for file openings, and attempt to close any files that were found to have been opened.
I'm running code from homework assignments with open in them. Trouble is, the students were told not to submit the data they were given and assume we have it - and open doesn't look in sys.path.
Luckily, I am using Spyder, which allows me to choose a script to be executed when initializing a console. I figured I could override open, so I defined a new open function which calls the original open on the absolute path of the files. But when someone uses with open(...) as ..., it doesn't work.
I know this may not to a good thing to do, but I can't go over every file in every submitted assignment looking for and replacing the call to open...
My code is:
old_open = open
def open(*args, **kwrdargs):
try:
res = old_open(*args,**kwrdargs)
return res
except:
args= list(args)
if ('DS1' in args[0]):
args[0]=DS1
elif ('DS2_X' in args[0]):
args[0] = DS2_X
elif ('DS2_Y' in args[0]):
args[0] = DS2_Y
args = tuple(args)
res = old_open(*args,**kwrdargs)
return res
DS1,DS2_X, DS2_Y contain the absolute path to the files.
When executing:
with open('DS1.data', 'r') as f:
I get the error:
FileNotFoundError: [Errno 2] No such file or directory: 'DS1.data'
while using:
f=open('DS1.data','r')
works.
I debugged the code, and when using with, my open is not called, but when using f=open(...), it is. Why is this happening?
open is supposed to return a file-like object, it is that object (ie f in the below example) that is supposed to have an __enter__ and __exit__. For example you could write your with-statement as:
f = open(...)
with f as ...:
do_something()
If you don't return the object returned by open, but some own wrapper around the file object you have to wrap these too. But from your description it more look like you don't need that, but rather that you've somewhere didn't return a file. Your open should look something like:
def open(fname, *args, **kwds):
for p in sys.path:
fn = build_filename(p, fname)
try:
return _orig_open(fn, *args, **kwds)
except IOerror as e:
pass
return _orig_open(fn, *args, **kwds) # must return file or rais exception
I'd like to embed pylint in a program. The user enters python programs (in Qt, in a QTextEdit, although not relevant) and in the background I call pylint to check the text he enters. Finally, I print the errors in a message box.
There are thus two questions: First, how can I do this without writing the entered text to a temporary file and giving it to pylint ? I suppose at some point pylint (or astroid) handles a stream and not a file anymore.
And, more importantly, is it a good idea ? Would it cause problems for imports or other stuffs ? Intuitively I would say no since it seems to spawn a new process (with epylint) but I'm no python expert so I'm really not sure. And if I use this to launch pylint, is it okay too ?
Edit:
I tried tinkering with pylint's internals, event fought with it, but finally have been stuck at some point.
Here is the code so far:
from astroid.builder import AstroidBuilder
from astroid.exceptions import AstroidBuildingException
from logilab.common.interface import implements
from pylint.interfaces import IRawChecker, ITokenChecker, IAstroidChecker
from pylint.lint import PyLinter
from pylint.reporters.text import TextReporter
from pylint.utils import PyLintASTWalker
class Validator():
def __init__(self):
self._messagesBuffer = InMemoryMessagesBuffer()
self._validator = None
self.initValidator()
def initValidator(self):
self._validator = StringPyLinter(reporter=TextReporter(output=self._messagesBuffer))
self._validator.load_default_plugins()
self._validator.disable('W0704')
self._validator.disable('I0020')
self._validator.disable('I0021')
self._validator.prepare_import_path([])
def destroyValidator(self):
self._validator.cleanup_import_path()
def check(self, string):
return self._validator.check(string)
class InMemoryMessagesBuffer():
def __init__(self):
self.content = []
def write(self, st):
self.content.append(st)
def messages(self):
return self.content
def reset(self):
self.content = []
class StringPyLinter(PyLinter):
"""Does what PyLinter does but sets checkers once
and redefines get_astroid to call build_string"""
def __init__(self, options=(), reporter=None, option_groups=(), pylintrc=None):
super(StringPyLinter, self).__init__(options, reporter, option_groups, pylintrc)
self._walker = None
self._used_checkers = None
self._tokencheckers = None
self._rawcheckers = None
self.initCheckers()
def __del__(self):
self.destroyCheckers()
def initCheckers(self):
self._walker = PyLintASTWalker(self)
self._used_checkers = self.prepare_checkers()
self._tokencheckers = [c for c in self._used_checkers if implements(c, ITokenChecker)
and c is not self]
self._rawcheckers = [c for c in self._used_checkers if implements(c, IRawChecker)]
# notify global begin
for checker in self._used_checkers:
checker.open()
if implements(checker, IAstroidChecker):
self._walker.add_checker(checker)
def destroyCheckers(self):
self._used_checkers.reverse()
for checker in self._used_checkers:
checker.close()
def check(self, string):
modname = "in_memory"
self.set_current_module(modname)
astroid = self.get_astroid(string, modname)
self.check_astroid_module(astroid, self._walker, self._rawcheckers, self._tokencheckers)
self._add_suppression_messages()
self.set_current_module('')
self.stats['statement'] = self._walker.nbstatements
def get_astroid(self, string, modname):
"""return an astroid representation for a module"""
try:
return AstroidBuilder().string_build(string, modname)
except SyntaxError as ex:
self.add_message('E0001', line=ex.lineno, args=ex.msg)
except AstroidBuildingException as ex:
self.add_message('F0010', args=ex)
except Exception as ex:
import traceback
traceback.print_exc()
self.add_message('F0002', args=(ex.__class__, ex))
if __name__ == '__main__':
code = """
a = 1
print(a)
"""
validator = Validator()
print(validator.check(code))
The traceback is the following:
Traceback (most recent call last):
File "validator.py", line 16, in <module>
main()
File "validator.py", line 13, in main
print(validator.check(code))
File "validator.py", line 30, in check
self._validator.check(string)
File "validator.py", line 79, in check
self.check_astroid_module(astroid, self._walker, self._rawcheckers, self._tokencheckers)
File "c:\Python33\lib\site-packages\pylint\lint.py", line 659, in check_astroid_module
tokens = tokenize_module(astroid)
File "c:\Python33\lib\site-packages\pylint\utils.py", line 103, in tokenize_module
print(module.file_stream)
AttributeError: 'NoneType' object has no attribute 'file_stream'
# And sometimes this is added :
File "c:\Python33\lib\site-packages\astroid\scoped_nodes.py", line 251, in file_stream
return open(self.file, 'rb')
OSError: [Errno 22] Invalid argument: '<?>'
I'll continue digging tomorrow. :)
I got it running.
the first one (NoneType …) is really easy and a bug in your code:
Encountering an exception can make get_astroid “fail”, i.e. send one syntax error message and return!
But for the secong one… such bullshit in pylint’s/logilab’s API… Let me explain: Your astroid object here is of type astroid.scoped_nodes.Module.
It’s also created by a factory, AstroidBuilder, which sets astroid.file = '<?>'.
Unfortunately, the Module class has following property:
#property
def file_stream(self):
if self.file is not None:
return open(self.file, 'rb')
return None
And there’s no way to skip that except for subclassing (Which would render us unable to use the magic in AstroidBuilder), so… monkey patching!
We replace the ill-defined property with one that checks an instance for a reference to our code bytes (e.g. astroid._file_bytes) before engaging in above default behavior.
def _monkeypatch_module(module_class):
if module_class.file_stream.fget.__name__ == 'file_stream_patched':
return # only patch if patch isn’t already applied
old_file_stream_fget = module_class.file_stream.fget
def file_stream_patched(self):
if hasattr(self, '_file_bytes'):
return BytesIO(self._file_bytes)
return old_file_stream_fget(self)
module_class.file_stream = property(file_stream_patched)
That monkeypatching can be called just before calling check_astroid_module. But one more thing has to be done. See, there’s more implicit behavior: Some checkers expect and use astroid’s file_encoding field. So we now have this code in the middle of check:
astroid = self.get_astroid(string, modname)
if astroid is not None:
_monkeypatch_module(astroid.__class__)
astroid._file_bytes = string.encode('utf-8')
astroid.file_encoding = 'utf-8'
self.check_astroid_module(astroid, self._walker, self._rawcheckers, self._tokencheckers)
One could say that no amount of linting creates actually good code. Unfortunately pylint unites enormous complexity with a specialization of calling it on files. Really good code has a nice native API and wraps that with a CLI interface. Don’t ask me why file_stream exists if internally, Module gets built from but forgets the source code.
PS: i had to change sth else in your code: load_default_plugins has to come before some other stuff (maybe prepare_checkers, maybe sth. else)
PPS: i suggest subclassing BaseReporter and using that instead of your InMemoryMessagesBuffer
PPPS: this just got pulled (3.2014), and will fix this: https://bitbucket.org/logilab/astroid/pull-request/15/astroidbuilderstring_build-was/diff
4PS: this is now in the official version, so no monkey patching required: astroid.scoped_nodes.Module now has a file_bytes property (without leading underscore).
Working with an unlocatable stream may definitly cause problems in case of relative imports, since the location is then needed to find the actually imported module.
Astroid support building an AST from a stream, but this is not used/exposed through Pylint which is a level higher and designed to work with files. So while you may acheive this it will need a bit of digging into the low-level APIs.
The easiest way is definitly to save the buffer to the file then to use the SA answer to start pylint programmatically if you wish (totally forgot this other account of mine found in other responses ;). Another option being to write a custom reporter to gain more control.
I know how to override an object's getattr() to handle calls to undefined object functions. However, I would like to achieve the same behavior for the builtin getattr() function. For instance, consider code like this:
call_some_undefined_function()
Normally, that simply produces an error:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'call_some_undefined_function' is not defined
I want to override getattr() so that I can intercept the call to "call_some_undefined_function()" and figure out what to do.
Is this possible?
I can only think of a way to do this by calling eval.
class Global(dict):
def undefined(self, *args, **kargs):
return u'ran undefined'
def __getitem__(self, key):
if dict.has_key(self, key):
return dict.__getitem__(self, key)
return self.undefined
src = """
def foo():
return u'ran foo'
print foo()
print callme(1,2)
"""
code = compile(src, '<no file>', 'exec')
globals = Global()
eval(code, globals)
The above outputs
ran foo
ran undefined
You haven't said why you're trying to do this. I had a use case where I wanted to be capable of handling typos that I made during interactive Python sessions, so I put this into my python startup file:
import sys
import re
def nameErrorHandler(type, value, traceback):
if not isinstance(value, NameError):
# Let the normal error handler handle this:
nameErrorHandler.originalExceptHookFunction(type, value, traceback)
name = re.search(r"'(\S+)'", value.message).group(1)
# At this point we know that there was an attempt to use name
# which ended up not being defined anywhere.
# Handle this however you want...
nameErrorHandler.originalExceptHookFunction = sys.excepthook
sys.excepthook = nameErrorHandler
Hopefully this is helpful for anyone in the future who wants to have a special error handler for undefined names... whether this is helpful for the OP or not is unknown since they never actually told us what their intended use-case was.
I want to wrap the default open method with a wrapper that should also catch exceptions. Here's a test example that works:
truemethod = open
def fn(*args, **kwargs):
try:
return truemethod(*args, **kwargs)
except (IOError, OSError):
sys.exit('Can\'t open \'{0}\'. Error #{1[0]}: {1[1]}'.format(args[0], sys.exc_info()[1].args))
open = fn
I want to make a generic method of it:
def wrap(method, exceptions = (OSError, IOError)):
truemethod = method
def fn(*args, **kwargs):
try:
return truemethod(*args, **kwargs)
except exceptions:
sys.exit('Can\'t open \'{0}\'. Error #{1[0]}: {1[1]}'.format(args[0], sys.exc_info()[1].args))
method = fn
But it doesn't work:
>>> wrap(open)
>>> open
<built-in function open>
Apparently, method is a copy of the parameter, not a reference as I expected. Any pythonic workaround?
The problem with your code is that inside wrap, your method = fn statement is simply changing the local value of method, it isn't changing the larger value of open. You'll have to assign to those names yourself:
def wrap(method, exceptions = (OSError, IOError)):
def fn(*args, **kwargs):
try:
return method(*args, **kwargs)
except exceptions:
sys.exit('Can\'t open \'{0}\'. Error #{1[0]}: {1[1]}'.format(args[0], sys.exc_info()[1].args))
return fn
open = wrap(open)
foo = wrap(foo)
Try adding global open. In the general case, you might want to look at this section of the manual:
This module provides direct access to all ‘built-in’ identifiers of Python; for example, __builtin__.open is the full name for the built-in function open(). See chapter Built-in Objects.
This module is not normally accessed explicitly by most applications, but can be useful in modules that provide objects with the same name as a built-in value, but in which the built-in of that name is also needed. For example, in a module that wants to implement an open() function that wraps the built-in open(), this module can be used directly:
import __builtin__
def open(path):
f = __builtin__.open(path, 'r')
return UpperCaser(f)
class UpperCaser:
'''Wrapper around a file that converts output to upper-case.'''
def __init__(self, f):
self._f = f
def read(self, count=-1):
return self._f.read(count).upper()
# ...
CPython implementation detail: Most modules have the name __builtins__ (note the 's') made available as part of their globals. The value of __builtins__ is normally either this module or the value of this modules’s __dict__ attribute. Since this is an implementation detail, it may not be used by alternate implementations of Python.
you can just add return fn at the end of your wrap function and then do:
>>> open = wrap(open)
>>> open('bhla')
Traceback (most recent call last):
File "<pyshell#24>", line 1, in <module>
open('bhla')
File "<pyshell#18>", line 7, in fn
sys.exit('Can\'t open \'{0}\'. Error #{1[0]}: {1[1]}'.format(args[0], sys.exc_info()[1].args))
SystemExit: Can't open 'bhla'. Error #2: No such file or directory