Python warnings stack levels

I am trying to make the warning message not include the source line that generated it, using the warnings stacklevel parameter, but instead of seeing only the message I get one extra line which says:
File "sys", line 1
Is it possible not to get this line?
This is my code:
#! /usr/bin/env python
# -*- coding: utf-8 -*-
import sys
import warnings

def warning_function():
    warnings.warn("Python 3.x is required!", RuntimeWarning, stacklevel=8)

if sys.version_info[0] < 3:
    ...
else:
    warning_function()

Well, that's exactly what you asked for: the stacklevel=8 parameter tells warnings to unwind 7 calls before reporting the current line. Since you don't have that many calls on the stack, you end up at the start of the Python interpreter.
If you want further control over the printed string, you should override the warnings.showwarning function:
old_fw = warnings.showwarning  # store the previous function...

def new_sw(message, category, filename, lineno, file=None, line=None):
    msg = warnings.formatwarning(message, category, filename, lineno,
                                 line).split(':')[-2:]
    sys.stderr.write("Warning (from warnings module):\n{}:{}\n".format(
        msg[0][1:], msg[1]))

warnings.showwarning = new_sw
That way you will not get the File "...", line ... line.
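Alternatively, a minimal sketch (not from the original answer): the default showwarning builds its text via warnings.formatwarning, so replacing that function also drops the file/line parts:

import warnings

def short_format(message, category, filename, lineno, line=None):
    # keep only "Category: message", dropping file, line number and source line
    return '{}: {}\n'.format(category.__name__, message)

warnings.formatwarning = short_format
warnings.warn("Python 3.x is required!", RuntimeWarning)
# stderr now shows just: RuntimeWarning: Python 3.x is required!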

Related

logging.warn() add stacktrace

There are several logging.warn('....') calls in the legacy code base I am working on today.
I want to understand the log output better. Up to now logging.warn() emits only one line, but this single line is not enough to understand the context.
I would like to see the stacktrace of the interpreter.
Since there are a lot of logging.warn('....') lines in my code, I would like to leave them as they are and only modify the logging configuration.
How can I add the interpreter stacktrace to every warn() or error() call automatically?
I know that logging.exception("message") shows the stacktrace, but I would like to leave the logging.warn() lines untouched.
The answer I was looking for was given by @Martijn Pieters in the comments. In Python 3.x,
logger.warning(f'{error_message}', stack_info=True)
does exactly what you need.
Thanks @Martijn Pieters
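For example, a minimal self-contained sketch (logger name and message are illustrative; stack_info requires Python 3.2+), with output roughly like the comments show:

import logging

logging.basicConfig()
logger = logging.getLogger(__name__)

def do_work():
    logger.warning('something looks off', stack_info=True)

do_work()
# WARNING:__main__:something looks off
# Stack (most recent call last):
#   File "example.py", line 9, in <module>
#     do_work()
#   File "example.py", line 7, in do_work
#     logger.warning('something looks off', stack_info=True)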
It is trivial if you are willing to add a log handler:
import logging
import traceback

class WarnWithStackHandler(logging.StreamHandler):
    def emit(self, record):
        if record.levelno == logging.WARNING:
            stack = traceback.extract_stack()
            # skip the logging-internal stack frames
            stack = stack[:-7]
            for line in traceback.format_list(stack):
                print(line, end='')
        super().emit(record)
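Wiring it up might look like this (a sketch; it assumes the WarnWithStackHandler class defined just above):

import logging

logger = logging.getLogger(__name__)
logger.addHandler(WarnWithStackHandler())  # class from the snippet above
logger.setLevel(logging.WARNING)

logger.warning('legacy warning')  # prints the call stack, then the record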
I don't believe a handler is your solution. Go for a filter:
import os.path
import traceback
import logging

_LOGGING_FILE = os.path.normcase(logging.addLevelName.__code__.co_filename)
_CURRENT_FILE = os.path.normcase(__file__)
_ELIMINATE_STACK = (_CURRENT_FILE, _LOGGING_FILE)

class AddStackFilter(logging.Filter):
    def __init__(self, levels=None):
        self.levels = levels or set()

    def get_stack(self):
        # Iterator over file names
        filenames = iter(_ELIMINATE_STACK)
        filename = next(filenames, "")
        frames = traceback.walk_stack(None)

        # Walk up the frames
        for frame, lineno in frames:
            # If the frame is not from this file, continue on to the next file
            while os.path.normcase(frame.f_code.co_filename) != filename:
                filename = next(filenames, None)
                if filename is None:
                    break
            else:
                # It's from the given file, go up a frame
                continue
            # Finished iterating over all files
            break
        # No frames left
        else:
            return None

        info = traceback.format_stack(frame)
        info.insert(0, 'Stack (most recent call last):\n')
        # Remove the last newline
        info[-1] = info[-1].rstrip()
        return "".join(info)

    def filter(self, record):
        if record.levelno in self.levels:
            sinfo = self.get_stack()
            if sinfo is not None:
                record.stack_info = sinfo
        return True
This filter has numerous advantages:
Removes stack frames from the local file and logging's file.
Leaves stack frames in case we come back to the local file after passing through logging. Important if we wish to use the same module for other stuff.
You can attach it to any handler or logger, doesn't bind you to StreamHandler or any other handler.
You can affect multiple handlers using the same filter, or a single handler, your choice.
The levels are given as an __init__ parameter, allowing you to add more levels as needed.
Allows you to add the stack trace to the log, and not just print.
Plays well with the logging module, putting the stack in the correct place, nothing unexpected.
Usage:
>>> import stackfilter
>>> import logging
>>> sfilter = stackfilter.AddStackFilter(levels={logging.WARNING})
>>> logging.basicConfig()
>>> logging.getLogger().addFilter(sfilter)
>>> def testy():
...     logging.warning("asdasd")
...
>>> testy()
WARNING:root:asdasd
Stack (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in testy

How to find a spurious print statement?

I'm debugging a large Python codebase. Somewhere, a piece of code is printing {} to the console; presumably this is some old debugging code that was left in by accident.
As this is the only console output that doesn't go through the logger, is there any way I can find the culprit? Perhaps by redefining what the print statement does, so I can cause an exception?
Try redirecting sys.stdout to a custom stream handler (see Redirect stdout to a file in Python?), where you can override the write() method.
Try something like this:
import io
import sys
import traceback

class TestableIO(io.BytesIO):
    def __init__(self, old_stream, initial_bytes=None):
        super(TestableIO, self).__init__(initial_bytes)
        self.old_stream = old_stream

    def write(self, bytes):
        if 'bb' in bytes:
            traceback.print_stack(file=self.old_stream)
        self.old_stream.write(bytes)

sys.stdout = TestableIO(sys.stdout)
sys.stderr = TestableIO(sys.stderr)

print('aa')
print('bb')
print('cc')
Then you will get a nice traceback:
λ python test.py
aa
  File "test.py", line 22, in <module>
    print('bb')
  File "test.py", line 14, in write
    traceback.print_stack(file=self.old_stream)
bb
cc
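Note that io.BytesIO fits Python 2's byte-oriented stdout; on Python 3, where sys.stdout is a text stream, a variant of the same idea could look like the sketch below (the class name TracingTextIO is made up here, and the marker is the asker's literal {}):

import io
import sys
import traceback

class TracingTextIO(io.TextIOBase):
    """Wrap a text stream; dump a stack trace whenever the marker is written."""
    def __init__(self, old_stream, marker='{}'):
        self.old_stream = old_stream
        self.marker = marker

    def write(self, text):
        if self.marker in text:
            traceback.print_stack(file=self.old_stream)
        return self.old_stream.write(text)

sys.stdout = TracingTextIO(sys.stdout)
print('{}')  # the spurious print is now followed by its traceback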

How to limit python traceback to specific files

I write a lot of Python code that uses external libraries. Frequently I will write a bug, and when I run the code I get a big long traceback in the Python console. 99.999999% of the time it's due to a coding error in my code, not because of a bug in the package. But the traceback goes all the way to the line of error in the package code, and either it takes a lot of scrolling through the traceback to find the code I wrote, or the traceback is so deep into the package that my own code doesn't even appear in the traceback.
Is there a way to "black-box" the package code, or somehow only show traceback lines from my code? I'd like the ability to specify to the system which directories or files I want to see traceback from.
In order to print your own stacktrace, you would need to handle all unhandled exceptions yourself; this is where sys.excepthook comes in handy.
The signature for this function is sys.excepthook(type, value, traceback) and its job is:
This function prints out a given traceback and exception to sys.stderr.
So as long as you can play with the traceback and only extract the portion you care about, you should be fine. Testing frameworks do this very frequently; they have custom assert functions which usually do not appear in the traceback; in other words, they skip the frames that belong to the test framework. Also, in those cases, the tests are usually started by the test framework as well.
You end up with a traceback that looks like this:
[ custom assert code ] + ... [ code under test ] ... + [ test runner code ]
How to identify your code.
You can add a global to your code:
__mycode = True
Then to identify the frames:
def is_mycode(tb):
    globals = tb.tb_frame.f_globals
    return '__mycode' in globals
How to extract your frames.
skip the frames that don't matter to you (e.g. custom assert code)
identify how many frames are part of your code -> length
extract length frames
def mycode_traceback_levels(tb):
    length = 0
    while tb and is_mycode(tb):
        tb = tb.tb_next
        length += 1
    return length
Example handler.
def handle_exception(type, value, tb):
    # 1. skip custom assert code, e.g.
    #    while tb and is_custom_assert_code(tb):
    #        tb = tb.tb_next
    # 2. only display your code
    length = mycode_traceback_levels(tb)
    print(''.join(traceback.format_exception(type, value, tb, length)))
install the handler:
sys.excepthook = handle_exception
What next?
You could adjust length to add one or more levels if you still want some info about where the failure is outside of your own code.
see also https://gist.github.com/dnozay/b599a96dc2d8c69b84c6
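Assembled into one runnable sketch (the json call at the end is just an illustrative failure, not part of the original answer):

import sys
import traceback

__mycode = True  # marker global: put this in every module you own

def is_mycode(tb):
    return '__mycode' in tb.tb_frame.f_globals

def mycode_traceback_levels(tb):
    length = 0
    while tb and is_mycode(tb):
        tb = tb.tb_next
        length += 1
    return length

def handle_exception(type, value, tb):
    # only display the frames that belong to our own code
    length = mycode_traceback_levels(tb)
    print(''.join(traceback.format_exception(type, value, tb, length)))

sys.excepthook = handle_exception

import json
json.loads('not json')  # frames inside the json package are cut off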
As others suggested, you could use sys.excepthook:
This function prints out a given traceback and exception to sys.stderr.
When an exception is raised and uncaught, the interpreter calls sys.excepthook with three arguments, the exception class, exception instance, and a traceback object. In an interactive session this happens just before control is returned to the prompt; in a Python program this happens just before the program exits. The handling of such top-level exceptions can be customized by assigning another three-argument function to sys.excepthook.
(emphasis mine)
It's possible to filter a traceback extracted by extract_tb (or similar functions from the traceback module) based on specified directories.
Two functions that can help:
import sys
from os.path import join, abspath
from traceback import extract_tb, format_list, format_exception_only

def spotlight(*show):
    ''' Return a function to be set as the new sys.excepthook.
    It will SHOW traceback entries for files from these directories. '''
    show = tuple(join(abspath(p), '') for p in show)

    def _check_file(name):
        return name and name.startswith(show)

    def _print(type, value, tb):
        show = (fs for fs in extract_tb(tb) if _check_file(fs.filename))
        fmt = format_list(show) + format_exception_only(type, value)
        print(''.join(fmt), end='', file=sys.stderr)

    return _print

def shadow(*hide):
    ''' Return a function to be set as the new sys.excepthook.
    It will HIDE traceback entries for files from these directories. '''
    hide = tuple(join(abspath(p), '') for p in hide)

    def _check_file(name):
        return name and not name.startswith(hide)

    def _print(type, value, tb):
        show = (fs for fs in extract_tb(tb) if _check_file(fs.filename))
        fmt = format_list(show) + format_exception_only(type, value)
        print(''.join(fmt), end='', file=sys.stderr)

    return _print
They both use the traceback.extract_tb. It returns "a list of “pre-processed” stack trace entries extracted from the traceback object"; all of them are instances of traceback.FrameSummary (a named tuple). Each traceback.FrameSummary object has a filename field which stores the absolute path of the corresponding file. We check if it starts with any of the directory paths provided as separate function arguments to determine if we'll need to exclude the entry (or keep it).
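A quick illustrative snippet of what extract_tb yields:

import sys
import traceback

try:
    1 / 0
except ZeroDivisionError:
    tb = sys.exc_info()[2]
    for fs in traceback.extract_tb(tb):
        # each fs is a FrameSummary
        print(fs.filename, fs.lineno, fs.name, fs.line)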
Here's an example:
The enum module from the standard library doesn't allow reusing keys,
import enum
enum.Enum('Faulty', 'a a', module=__name__)
yields
Traceback (most recent call last):
  File "/home/vaultah/so/shadows/main.py", line 23, in <module>
    enum.Enum('Faulty', 'a a', module=__name__)
  File "/home/vaultah/cpython/Lib/enum.py", line 243, in __call__
    return cls._create_(value, names, module=module, qualname=qualname, type=type, start=start)
  File "/home/vaultah/cpython/Lib/enum.py", line 342, in _create_
    classdict[member_name] = member_value
  File "/home/vaultah/cpython/Lib/enum.py", line 72, in __setitem__
    raise TypeError('Attempted to reuse key: %r' % key)
TypeError: Attempted to reuse key: 'a'
We can restrict stack trace entries to our code (in /home/vaultah/so/shadows/main.py).
import sys, enum
sys.excepthook = spotlight('/home/vaultah/so/shadows')
enum.Enum('Faulty', 'a a', module=__name__)
and
import sys, enum
sys.excepthook = shadow('/home/vaultah/cpython/Lib')
enum.Enum('Faulty', 'a a', module=__name__)
give the same result:
File "/home/vaultah/so/shadows/main.py", line 22, in <module>
enum.Enum('Faulty', 'a a', module=__name__)
TypeError: Attempted to reuse key: 'a'
There's a way to exclude all site directories (where 3rd party packages are installed - see site.getsitepackages)
import sys, site, jinja2
sys.excepthook = shadow(*site.getsitepackages())
jinja2.Template('{%}')
# jinja2.exceptions.TemplateSyntaxError: unexpected '}'
# Generates ~30 lines, but will only display 4
Note: Don't forget to restore sys.excepthook from sys.__excepthook__. Unfortunately, you won't be able to "patch-restore" it using a context manager.
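Restoring is a one-liner (sketch); note it only makes sense on a code path that completes normally, since the hook only fires once an uncaught exception has already propagated:

import sys

sys.excepthook = sys.__excepthook__  # back to the default handler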
traceback.extract_tb(tb) returns a list of error-frame entries in the format (filename, line_no, function_name, text); you can play with that to format the traceback. Also see https://pymotw.com/2/sys/exceptions.html
import sys
import traceback

def handle_exception(ex_type, ex_info, tb):
    print(ex_type, ex_info, traceback.extract_tb(tb))

sys.excepthook = handle_exception

pylint on in-memory file/stream

I'd like to embed pylint in a program. The user enters Python programs (in Qt, in a QTextEdit, although that's not relevant) and in the background I call pylint to check the text he enters. Finally, I print the errors in a message box.
There are thus two questions: First, how can I do this without writing the entered text to a temporary file and giving it to pylint? I suppose at some point pylint (or astroid) handles a stream and not a file anymore.
And, more importantly, is it a good idea? Would it cause problems with imports or other stuff? Intuitively I would say no, since it seems to spawn a new process (with epylint), but I'm no Python expert so I'm really not sure. And if I use this to launch pylint, is it okay too?
Edit:
I tried tinkering with pylint's internals, even fought with it, but finally got stuck at some point.
Here is the code so far:
from astroid.builder import AstroidBuilder
from astroid.exceptions import AstroidBuildingException
from logilab.common.interface import implements
from pylint.interfaces import IRawChecker, ITokenChecker, IAstroidChecker
from pylint.lint import PyLinter
from pylint.reporters.text import TextReporter
from pylint.utils import PyLintASTWalker

class Validator():
    def __init__(self):
        self._messagesBuffer = InMemoryMessagesBuffer()
        self._validator = None
        self.initValidator()

    def initValidator(self):
        self._validator = StringPyLinter(reporter=TextReporter(output=self._messagesBuffer))
        self._validator.load_default_plugins()
        self._validator.disable('W0704')
        self._validator.disable('I0020')
        self._validator.disable('I0021')
        self._validator.prepare_import_path([])

    def destroyValidator(self):
        self._validator.cleanup_import_path()

    def check(self, string):
        return self._validator.check(string)

class InMemoryMessagesBuffer():
    def __init__(self):
        self.content = []

    def write(self, st):
        self.content.append(st)

    def messages(self):
        return self.content

    def reset(self):
        self.content = []

class StringPyLinter(PyLinter):
    """Does what PyLinter does but sets checkers once
    and redefines get_astroid to call build_string"""
    def __init__(self, options=(), reporter=None, option_groups=(), pylintrc=None):
        super(StringPyLinter, self).__init__(options, reporter, option_groups, pylintrc)
        self._walker = None
        self._used_checkers = None
        self._tokencheckers = None
        self._rawcheckers = None
        self.initCheckers()

    def __del__(self):
        self.destroyCheckers()

    def initCheckers(self):
        self._walker = PyLintASTWalker(self)
        self._used_checkers = self.prepare_checkers()
        self._tokencheckers = [c for c in self._used_checkers
                               if implements(c, ITokenChecker) and c is not self]
        self._rawcheckers = [c for c in self._used_checkers
                             if implements(c, IRawChecker)]
        # notify global begin
        for checker in self._used_checkers:
            checker.open()
            if implements(checker, IAstroidChecker):
                self._walker.add_checker(checker)

    def destroyCheckers(self):
        self._used_checkers.reverse()
        for checker in self._used_checkers:
            checker.close()

    def check(self, string):
        modname = "in_memory"
        self.set_current_module(modname)
        astroid = self.get_astroid(string, modname)
        self.check_astroid_module(astroid, self._walker, self._rawcheckers, self._tokencheckers)
        self._add_suppression_messages()
        self.set_current_module('')
        self.stats['statement'] = self._walker.nbstatements

    def get_astroid(self, string, modname):
        """return an astroid representation for a module"""
        try:
            return AstroidBuilder().string_build(string, modname)
        except SyntaxError as ex:
            self.add_message('E0001', line=ex.lineno, args=ex.msg)
        except AstroidBuildingException as ex:
            self.add_message('F0010', args=ex)
        except Exception as ex:
            import traceback
            traceback.print_exc()
            self.add_message('F0002', args=(ex.__class__, ex))

if __name__ == '__main__':
    code = """
a = 1
print(a)
"""
    validator = Validator()
    print(validator.check(code))
The traceback is the following:
Traceback (most recent call last):
  File "validator.py", line 16, in <module>
    main()
  File "validator.py", line 13, in main
    print(validator.check(code))
  File "validator.py", line 30, in check
    self._validator.check(string)
  File "validator.py", line 79, in check
    self.check_astroid_module(astroid, self._walker, self._rawcheckers, self._tokencheckers)
  File "c:\Python33\lib\site-packages\pylint\lint.py", line 659, in check_astroid_module
    tokens = tokenize_module(astroid)
  File "c:\Python33\lib\site-packages\pylint\utils.py", line 103, in tokenize_module
    print(module.file_stream)
AttributeError: 'NoneType' object has no attribute 'file_stream'

# And sometimes this is added:
  File "c:\Python33\lib\site-packages\astroid\scoped_nodes.py", line 251, in file_stream
    return open(self.file, 'rb')
OSError: [Errno 22] Invalid argument: '<?>'
I'll continue digging tomorrow. :)
I got it running.
The first one (NoneType …) is really easy and a bug in your code: encountering an exception can make get_astroid “fail”, i.e. send one syntax error message and return None!
But for the second one… such bullshit in pylint’s/logilab’s API… Let me explain: your astroid object here is of type astroid.scoped_nodes.Module.
It’s also created by a factory, AstroidBuilder, which sets astroid.file = '<?>'.
Unfortunately, the Module class has following property:
@property
def file_stream(self):
    if self.file is not None:
        return open(self.file, 'rb')
    return None
And there’s no way to skip that except for subclassing (Which would render us unable to use the magic in AstroidBuilder), so… monkey patching!
We replace the ill-defined property with one that checks an instance for a reference to our code bytes (e.g. astroid._file_bytes) before engaging in the above default behavior.
from io import BytesIO

def _monkeypatch_module(module_class):
    if module_class.file_stream.fget.__name__ == 'file_stream_patched':
        return  # only patch if the patch isn’t already applied
    old_file_stream_fget = module_class.file_stream.fget

    def file_stream_patched(self):
        if hasattr(self, '_file_bytes'):
            return BytesIO(self._file_bytes)
        return old_file_stream_fget(self)

    module_class.file_stream = property(file_stream_patched)
That monkeypatching can be called just before calling check_astroid_module. But one more thing has to be done. See, there’s more implicit behavior: Some checkers expect and use astroid’s file_encoding field. So we now have this code in the middle of check:
astroid = self.get_astroid(string, modname)
if astroid is not None:
    _monkeypatch_module(astroid.__class__)
    astroid._file_bytes = string.encode('utf-8')
    astroid.file_encoding = 'utf-8'
    self.check_astroid_module(astroid, self._walker, self._rawcheckers, self._tokencheckers)
One could say that no amount of linting creates actually good code. Unfortunately pylint unites enormous complexity with a specialization on being called on files. Really good code has a nice native API and wraps that with a CLI interface. Don’t ask me why file_stream exists if, internally, Module gets built from the source code but then forgets it.
PS: I had to change something else in your code: load_default_plugins has to come before some other stuff (maybe prepare_checkers, maybe something else).
PPS: I suggest subclassing BaseReporter and using that instead of your InMemoryMessagesBuffer.
PPPS: this just got pulled (3.2014) and will fix this: https://bitbucket.org/logilab/astroid/pull-request/15/astroidbuilderstring_build-was/diff
4PS: this is now in the official version, so no monkey patching is required: astroid.scoped_nodes.Module now has a file_bytes property (without a leading underscore).
Working with an unlocatable stream may definitely cause problems in case of relative imports, since the location is then needed to find the actually imported module.
Astroid supports building an AST from a stream, but this is not used/exposed through Pylint, which is a level higher and designed to work with files. So while you may achieve this, it will need a bit of digging into the low-level APIs.
The easiest way is definitely to save the buffer to a file and then use the SO answer to start pylint programmatically if you wish (totally forgot this other account of mine found in other responses ;). Another option is to write a custom reporter to gain more control.
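A rough sketch of that save-then-run route (pylint.epylint existed in older pylint releases and was later removed, so treat the exact API as an assumption to check against your version):

import os
import tempfile
from pylint import epylint  # available in older pylint versions only

def check_source(source):
    # pylint is file-oriented, so write the buffer to a real file first
    with tempfile.NamedTemporaryFile('w', suffix='.py', delete=False) as f:
        f.write(source)
        path = f.name
    try:
        stdout, stderr = epylint.py_run(path, return_std=True)
        return stdout.getvalue()
    finally:
        os.unlink(path)

print(check_source("a = 1\nprint(a)\n"))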

What is the equivalent of Perl's (<>) in Python? fileinput doesn't work as expected

In Perl one uses:
while (<>) {
    # process files given as command line arguments
}
In Python I found:
import fileinput
for line in fileinput.input():
    process(line)
But, what happens when the file given in the command line does NOT exist?
python test.py test1.txt test2.txt filenotexist1.txt filenotexist2.txt test3.txt was given as the command line.
I tried various ways of using try: except: and nextfile, but I couldn't seem to make it work.
For the above command line, the script should process test1-3.txt but just silently move on to the next file when a file is not found.
Perl does this very well. I have searched this all over the net, but I couldn't find the answer to this one anywhere.
import sys
import os

for f in sys.argv[1:]:
    if os.path.exists(f):
        for line in open(f).readlines():
            process(line)
Something like this:
import sys

for f in sys.argv[1:]:
    try:
        data = open(f).readlines()
        process(data)
    except IOError:
        continue
Turning @Brian's answer into a generator, catching IOError rather than testing for existence (which is more Pythonic), and printing a warning to stderr on failure:
import sys

def read_files(files=None):
    if not files:
        files = sys.argv[1:]
    for file in files:
        try:
            for line in open(file):
                yield line
        except IOError as e:
            print('Warning:', e, file=sys.stderr)

for line in read_files():
    print(line, end='')
Output (the file baz does not exist):
$ python read_lines.py foo bar baz
line 1 of foo
line 2 of foo
line 1 of bar
line 2 of bar
Warning: [Errno 2] No such file or directory: 'baz'
You might want to put in a little effort tidying up the error message, but it might not be worth the effort.
You can solve your problem with the fileinput module as follows:
import fileinput

input = fileinput.input()
while True:
    try:
        process(next(input))
    except IOError:
        input.nextfile()
    except StopIteration:
        break

Unfortunately you can't use a for loop because the IOError would break out of it.
I tried to implement @VGE's suggestion, but my attempt turned out not to be too elegant. I'd appreciate any suggestions for how to improve this.
import sys, fileinput, errno, os

class nosuchfile:
    def readlines(foo, bar):
        return []
    def close(arg):
        pass

EXITCODE = 0

def skip_on_error(filename, mode):
    """Function to pass in as the fileinput.input(openhook=...) hook function.

    Instead of giving up on the first error, skip the rest of the file and
    continue with the next file in the input list.

    In case of an error from open(), an error message is printed to standard
    error and the global variable EXITCODE gets overwritten by a nonzero
    value.
    """
    global EXITCODE
    try:
        return open(filename, mode)
    except IOError as e:
        sys.stderr.write("%s: %s: %s\n" % (sys.argv[0], filename, os.strerror(e.errno)))
        EXITCODE = 1
        return nosuchfile()

def main():
    do_stuff(fileinput.input(openhook=skip_on_error))
    return EXITCODE
Both the placeholder dummy filehandle class nosuchfile and the global variable EXITCODE are pretty serious warts. I tried to figure out how to pass in a reference to a locally scoped exitcode variable, but gave up.
This also fails to handle errors which happen while reading, but the majority of error cases seem to happen in open anyway.
Simple, explicit, and silent:
import fileinput
from os.path import exists
import sys
for line in fileinput.input(files=filter(exists, sys.argv[1:])):
    process(line)
Maybe you can play with the openhook parameter to handle files that don't exist, as sketched below.
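For instance, a hedged sketch of such an openhook (simpler than the one above: a fresh io.StringIO() stands in for the nosuchfile dummy, so a missing file just reads as empty):

import fileinput
import io
import sys

def skip_missing(filename, mode):
    # fileinput calls this hook instead of open()
    try:
        return open(filename, mode)
    except IOError as e:
        print('Warning:', e, file=sys.stderr)
        return io.StringIO()  # empty stream: the file is silently skipped

for line in fileinput.input(openhook=skip_missing):
    print(line, end='')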
