Python failure injection

Python failure injection - python

Is there a neat way to inject failures in a Python script? I'd like to avoid sprinkling the source code with stuff like:
failure_ABC = True
failure_XYZ = True
def inject_failure_ABC():
raise Exception('ha! a fake error')
def inject_failure_XYZ():
# delete some critical file
pass
# some real code
if failure_ABC:
inject_failure_ABC()
# some more real code
if failure_XYZ:
inject_failure_XYZ()
# even more real code
Edit:
I have the following idea: insert "failure points" as specially-crafted comments. The write a simple parser that will be called before the Python interpreter, and will produce the actual instrumented Python script with the actual failure code. E.g:
#!/usr/bin/parser_script_producing_actual_code_and_calls python
# some real code
# FAIL_123
if foo():
# FAIL_ABC
execute_some_real_code()
else:
# FAIL_XYZ
execute_some_other_real_code()
Anything starting with FAIL_ is considered as a failure point by the script, and depending on a configuration file the failure is enabled/disabled. What do you think?

You could use mocking libraries, for example unittest.mock, there also exist many third party ones as well. You can then mock some object used by your code such that it throws your exception or behaves in whatever way you want it to.

When testing error handling, the best approach is to isolate the code that can throw errors in a new method which you can override in a test:
class ToTest:
def foo(...):
try:
self.bar() # We want to test the error handling in foo()
except:
....
def bar(self):
... production code ...
In your test case, you can extend ToTest and override bar() with code that throws the exceptions that you want to test.
EDIT You should really consider splitting large methods into smaller ones. It will make the code easier to test, to understand and to maintain. Have a look at Test Driven Development for some ideas how to change your development process.
Regarding your idea to use "Failure Comments". This looks like a good solution. There is one small problem: You will have to write your own Python parser because Python doesn't keep comments when it produces bytecode.
So you can either spend a couple of weeks to write this or a couple of weeks to make your code easier to test.
There is one difference, though: If you don't go all the way, the parser will be useless. Also, the time spent won't have improved one bit of your code. Most of the effort will go into the parser and tools. So after all that time, you will still have to improve the code, add failure comments and write the tests.
With refactoring the code, you can stop whenever you want but the time spent so far will be meaningful and not wasted. Your code will start to get better with the first change you make and it will keep improving.
Writing a complex tool takes time and it will have it's own bugs which need to fix or work around. None of this will improve your situation in the short term and you don't have a guarantee that it will improve the long term.

If you only want to stop your code at some point, and fall back to interactive interpreter, one can use:
assert 1==0
But this only works if you do not run python with -O
Edit
Actually, my first answer was to quick, without really understanding what you want to do, sorry.
Maybe your code becomes already more readable if you do parameterization through parameters, not through variable/function suffices. Something like
failure = {"ABC": False, "XYZ":False}
#Do something, maybe set failure
def inject_failure(failure):
if not any(failure.values()):
return
if failure["ABC"]:
raise Exception('ha! a fake error')
elif failure["XYZ"]:
# delete some critical file
pass
inject_failure(failure)

Related

Refactoring Test Logic in Pytest to Avoid Complex Raises Block

I'm using the flake8-pytest-style plugin and it flags a certain test as violating PT012. This is about having too much logic in the raises() statement.
The code in question is this:
def test_bad_python_version(capsys) -> None:
import platform
from quendor.__main__ import main
with pytest.raises(SystemExit) as pytest_wrapped_e, mock.patch.object(
platform,
"python_version",
) as v_info:
v_info.return_value = "3.5"
main()
terminal_text = capsys.readouterr()
expect(terminal_text.err).to(contain("Quendor requires Python 3.7"))
expect(pytest_wrapped_e.type).to(equal(SystemExit))
expect(pytest_wrapped_e.value.code).to(equal(1))
Basically this is testing the following code:
def main() -> int:
if platform.python_version() < "3.7":
sys.stderr.write("\nQuendor requires Python 3.7 or later.\n")
sys.stderr.write(f"Your current version is {platform.python_version()}\n\n")
sys.exit(1)
What I do is just pass in a version of Python that is less than the required and make sure the error appears as expected. The test itself works perfectly fine. (I realize it can be questionable as to whether this should be a unit test at all since it's really testing more of an aspect of Python than my own code.)
Clearly the lint check is suggesting that my test is a little messy and I can certainly understand that. But it's not clear from the above referenced page what I'm supposed to do about it.
I do realize I could just disable the quality check for this particular test but I'm trying to craft as good of Python code as I can, particularly around tests. And I'm at a loss as to how to refactor this code to meet the criteria.
I know I can create some other test helper function and then have that function called from the raises block. But that strikes me as being less clear overall since now you have to look in two places in order to see what the test is doing.

the lint error is a very good one! in fact in your case because the lint error is not followed you have two lines of unreachable code (!) (the two capsys-related lines) because main() always raises
the lint is suggesting that you only have one line in a raises() block -- the naive refactor from your existing code is:
with mock.patch.object(
platform,
"python_version",
return_value="3.5",
):
with pytest.raises(SystemExit) as pytest_wrapped_e:
main()
terminal_text = capsys.readouterr()
expect(terminal_text.err).to(contain("Quendor requires Python 3.7"))
expect(pytest_wrapped_e.type).to(equal(SystemExit))
expect(pytest_wrapped_e.value.code).to(equal(1))
an aside, you should never use platform.python_version() for version comparisons as it produces incorrect results for python 3.10 -- more on that and a linter for it here

Python REPL: issuing commands in advance to execute after block

This is a bit of an odd question; it came up in the context of a tool that exposes a Python API, which we spend a lot of time querying interactively from the REPL. The particular idiom causing issues is something like this:
for var in slow_generator_of_giant_list():
stats = update(stats, var)
print stats
To enter this at the REPL, I can type this:
>>> for var in slow_generator_of_giant_list():
... stats = update(stats, var)
...
If I now attempt to type the print, I get a syntax error due to improper indentation. (Or else I put the print inside the loop and do it on every iteration.)
But if I hit enter to go to the next line, the loop runs immediately, and I have to wait for it to finish, or type the print command in the face of possible output coming at me, etc.
Obviously I can define a function containing the above, and it might be worth saving into a file anyway, but in the general case we're constructing these on the fly, and it would be nice to have a way to "schedule" a command to run after the end of a loop from the REPL. In a language with block delimiters, I could of course put it after the ending delimiter (and any necessary statement separator). But my coworkers and I were stumped trying to do something similar here.
Is there perhaps an ugly abuse of Pythonic syntax that will do the trick that my coworkers and I couldn't think of? Or a recommended way to avoid the problem while still making it easy to throw together ad hoc interactive queries?
Thanks for any pointers.

Not beautiful, but this should work:
>>> mygen = slow_generator_of_giant_list()
>>> try:
... while True: stats = update(stats, mygen.next())
... except StopIteration:
... print stats
...

I would just say that you would find it easier just to not use the interactive shell for this.
It's not much effort to save a file and run it. You only have to keep it around for as long as you use it.
I actually have found this answering on SO. I keep a file open in my text editor with a terminal in the right directory, and just use it as a scratchpad for mocking up answers in.

How to write test cases for assignment

The part of my assignment is to create tests for each function. This ones kinda long but I am so confused. I put a link below this function so you can see how it looks like
first code is extremely long because.
def load_profiles(profiles_file, person_to_friends, person_to_networks):
'''(file, dict of {str : list of strs}, dict of {str : list of strs}) -> NoneType
Update person to friends and person to networks dictionaries to include
the data in open file.'''
# for updating person_to_friends dict
update_p_to_f(profiles_file, person_to_friends)
update_p_to_n(profiles_file, person_to_networks)
heres the whole code: http://shrib.com/8EF4E8Z3, I tested it through mainblock and it works.
This is the text file(profiles_file) we were provided that we are using to convert them :
http://shrib.com/zI61fmNP
How do I run test cases for this through nose, what kinda of test outcomes are there? Or am I not being specific enough?
import nose
import a3_functions
def test_load_profiles_
if name == 'main':
nose.runmodule()
I went that far then I didn't know what I can test for the function.

Lets assume the code you wrote so far is in a module called "mycode".
Write a new module called testmycode. (i.e. create a python file called testmycode.py)
In there, import the module you want to test (mycode)
Write a function called testupdate().
In that function, first write a text file (with file.write) that you expect to be valid. Then let update_p_to_f update it. Verify that it did what you expect, using assert. This is a test for reading a text file.
Then you can write a second function called testupdate_write(), where you let your code write to a file -- then verify that what it wrote is correct.
To run the tests, use (on the commandline)
nosetests -sx testmycode.py
Which will load testmycode and run all functions it finds there that start with test.

You probably want to test both the overall output of your program is correct, and that individual parts of your program are correct.
#j13r has already covered how to test the overall correctness of your program for a full run.
You mention that you have four helper functions. You can write tests for these separately.
Testing smaller pieces of your code is helpful because you can test each piece in more numerous and more specific ways than if you only test the whole thing.
The unittest module is a framework for performing tests.

When would `if False` execute in Python?

While browsing some code, I came across this line:
if False: #shedskin
I understand that Shedskin is a kind of Python -> C++ compiler, but I can't understand that line.
Shouldn't if False: never execute? What's going on here?
For context:
This is the whole block:
if False: # shedskin
AStar(SQ_MapHandler([1], 1, 1)).findPath(SQ_Location(1,1), SQ_Location(1,1))
More context is on Google Code (scroll down all the way).

It won't execute, because it isn't supposed to. The if False: is there to intentionally prevent the next line from executing, because that code's only purpose is seemingly to help Shed Skin infer type information about the argument to the AStar() function.
You can see another example of this in httplib:
# Useless stuff to help type info
if False :
conn._set_tunnel("example.com")

It will never get executed. It's one way to temporarily disable part of the code.

Theoretically, it could get executed:
True, False = False, True
if False: print 'foo'
But typically this will be used to temporarily disable a code path.

You are correct in assuming that this will never evaluate to true. This is sometimes done when the programmer has a lot of debugging code but does not want to remove the debugging code in a release, so they just put if False: above it all.

not enough reputation to comment yet apparently, but tim stone's answer is correct. suppose we have a function like this:
def blah(a,b):
return a+b
now in order to perform type inference, there has to be at least one call to blah, or it becomes impossible to know the types of the arguments at compile-time.
for a stand-alone program, this not a problem, since everything that has to be compiled for it to run is called indirectly from somewhere..
for an extension module, calls can come from the 'outside', so sometimes we have to add a 'fake' call to a function for type inference to become possible.. hence the 'if False'.
in the shedskin example set there are a few programs that are compiled as extension modules, in order to be combined with for example pygame or multiprocessing.

How can I strip Python logging calls without commenting them out?

Today I was thinking about a Python project I wrote about a year back where I used logging pretty extensively. I remember having to comment out a lot of logging calls in inner-loop-like scenarios (the 90% code) because of the overhead (hotshot indicated it was one of my biggest bottlenecks).
I wonder now if there's some canonical way to programmatically strip out logging calls in Python applications without commenting and uncommenting all the time. I'd think you could use inspection/recompilation or bytecode manipulation to do something like this and target only the code objects that are causing bottlenecks. This way, you could add a manipulator as a post-compilation step and use a centralized configuration file, like so:
[Leave ERROR and above]
my_module.SomeClass.method_with_lots_of_warn_calls
[Leave WARN and above]
my_module.SomeOtherClass.method_with_lots_of_info_calls
[Leave INFO and above]
my_module.SomeWeirdClass.method_with_lots_of_debug_calls
Of course, you'd want to use it sparingly and probably with per-function granularity -- only for code objects that have shown logging to be a bottleneck. Anybody know of anything like this?
Note: There are a few things that make this more difficult to do in a performant manner because of dynamic typing and late binding. For example, any calls to a method named debug may have to be wrapped with an if not isinstance(log, Logger). In any case, I'm assuming all of the minor details can be overcome, either by a gentleman's agreement or some run-time checking. :-)

What about using logging.disable?
I've also found I had to use logging.isEnabledFor if the logging message is expensive to create.

Use pypreprocessor
Which can also be found on PYPI (Python Package Index) and be fetched using pip.
Here's a basic usage example:
from pypreprocessor import pypreprocessor
pypreprocessor.parse()
#define nologging
#ifdef nologging
...logging code you'd usually comment out manually...
#endif
Essentially, the preprocessor comments out code the way you were doing it manually before. It just does it on the fly conditionally depending on what you define.
You can also remove all of the preprocessor directives and commented out code from the postprocessed code by adding 'pypreprocessor.removeMeta = True' between the import and
parse() statements.
The bytecode output (.pyc) file will contain the optimized output.
SideNote: pypreprocessor is compatible with python2x and python3k.
Disclaimer: I'm the author of pypreprocessor.

I've also seen assert used in this fashion.
assert logging.warn('disable me with the -O option') is None
(I'm guessing that warn always returns none.. if not, you'll get an AssertionError
But really that's just a funny way of doing this:
if __debug__: logging.warn('disable me with the -O option')
When you run a script with that line in it with the -O option, the line will be removed from the optimized .pyo code. If, instead, you had your own variable, like in the following, you will have a conditional that is always executed (no matter what value the variable is), although a conditional should execute quicker than a function call:
my_debug = True
...
if my_debug: logging.warn('disable me by setting my_debug = False')
so if my understanding of debug is correct, it seems like a nice way to get rid of unnecessary logging calls. The flipside is that it also disables all of your asserts, so it is a problem if you need the asserts.

As an imperfect shortcut, how about mocking out logging in specific modules using something like MiniMock?
For example, if my_module.py was:
import logging
class C(object):
def __init__(self, *args, **kw):
logging.info("Instantiating")
You would replace your use of my_module with:
from minimock import Mock
import my_module
my_module.logging = Mock('logging')
c = my_module.C()
You'd only have to do this once, before the initial import of the module.
Getting the level specific behaviour would be simple enough by mocking specific methods, or having logging.getLogger return a mock object with some methods impotent and others delegating to the real logging module.
In practice, you'd probably want to replace MiniMock with something simpler and faster; at the very least something which doesn't print usage to stdout! Of course, this doesn't handle the problem of module A importing logging from module B (and hence A also importing the log granularity of B)...
This will never be as fast as not running the log statements at all, but should be much faster than going all the way into the depths of the logging module only to discover this record shouldn't be logged after all.

You could try something like this:
# Create something that accepts anything
class Fake(object):
def __getattr__(self, key):
return self
def __call__(self, *args, **kwargs):
return True
# Replace the logging module
import sys
sys.modules["logging"] = Fake()
It essentially replaces (or initially fills in) the space for the logging module with an instance of Fake which simply takes in anything. You must run the above code (just once!) before the logging module is attempted to be used anywhere. Here is a test:
import logging
logging.basicConfig(level=logging.DEBUG,
format='%(asctime)s %(levelname)-8s %(message)s',
datefmt='%a, %d %b %Y %H:%M:%S',
filename='/temp/myapp.log',
filemode='w')
logging.debug('A debug message')
logging.info('Some information')
logging.warning('A shot across the bows')
With the above, nothing at all was logged, as was to be expected.

I'd use some fancy logging decorator, or a bunch of them:
def doLogging(logTreshold):
def logFunction(aFunc):
def innerFunc(*args, **kwargs):
if LOGLEVEL >= logTreshold:
print ">>Called %s at %s"%(aFunc.__name__, time.strftime("%H:%M:%S"))
print ">>Parameters: ", args, kwargs if kwargs else ""
try:
return aFunc(*args, **kwargs)
finally:
print ">>%s took %s"%(aFunc.__name__, time.strftime("%H:%M:%S"))
return innerFunc
return logFunction
All you need is to declare LOGLEVEL constant in each module (or just globally and just import it in all modules) and then you can use it like this:
#doLogging(2.5)
def myPreciousFunction(one, two, three=4):
print "I'm doing some fancy computations :-)"
return
And if LOGLEVEL is no less than 2.5 you'll get output like this:
>>Called myPreciousFunction at 18:49:13
>>Parameters: (1, 2)
I'm doing some fancy computations :-)
>>myPreciousFunction took 18:49:13
As you can see, some work is needed for better handling of kwargs, so the default values will be printed if they are present, but that's another question.
You should probably use some logger module instead of raw print statements, but I wanted to focus on the decorator idea and avoid making code too long.
Anyway - with such decorator you get function-level logging, arbitrarily many log levels, ease of application to new function, and to disable logging you only need to set LOGLEVEL. And you can define different output streams/files for each function if you wish. You can write doLogging as:
def doLogging(logThreshold, outStream=sys.stdout):
.....
print >>outStream, ">>Called %s at %s" etc.
And utilize log files defined on a per-function basis.

This is an issue in my project as well--logging ends up on profiler reports pretty consistently.
I've used the _ast module before in a fork of PyFlakes (http://github.com/kevinw/pyflakes) ... and it is definitely possible to do what you suggest in your question--to inspect and inject guards before calls to logging methods (with your acknowledged caveat that you'd have to do some runtime type checking). See http://pyside.blogspot.com/2008/03/ast-compilation-from-python.html for a simple example.
Edit: I just noticed MetaPython on my planetpython.org feed--the example use case is removing log statements at import time.
Maybe the best solution would be for someone to reimplement logging as a C module, but I wouldn't be the first to jump at such an...opportunity :p

:-) We used to call that a preprocessor and although C's preprocessor had some of those capablities, the "king of the hill" was the preprocessor for IBM mainframe PL/I. It provided extensive language support in the preprocessor (full assignments, conditionals, looping, etc.) and it was possible to write "programs that wrote programs" using just the PL/I PP.
I wrote many applications with full-blown sophisticated program and data tracing (we didn't have a decent debugger for a back-end process at that time) for use in development and testing which then, when compiled with the appropriate "runtime flag" simply stripped all the tracing code out cleanly without any performance impact.
I think the decorator idea is a good one. You can write a decorator to wrap the functions that need logging. Then, for runtime distribution, the decorator is turned into a "no-op" which eliminates the debugging statements.
Jon R

I am doing a project currently that uses extensive logging for testing logic and execution times for a data analysis API using the Pandas library.
I found this string with a similar concern - e.g. what is the overhead on the logging.debug statements even if the logging.basicConfig level is set to level=logging.WARNING
I have resorted to writing the following script to comment out or uncomment the debug logging prior to deployment:
import os
import fileinput
comment = True
# exclude files or directories matching string
fil_dir_exclude = ["__","_archive",".pyc"]
if comment :
## Variables to comment
source_str = 'logging.debug'
replace_str = '#logging.debug'
else :
## Variables to uncomment
source_str = '#logging.debug'
replace_str = 'logging.debug'
# walk through directories
for root, dirs, files in os.walk('root/directory') :
# where files exist
if files:
# for each file
for file_single in files :
# build full file name
file_name = os.path.join(root,file_single)
# exclude files with matching string
if not any(exclude_str in file_name for exclude_str in fil_dir_exclude) :
# replace string in line
for line in fileinput.input(file_name, inplace=True):
print "%s" % (line.replace(source_str, replace_str)),
This is a file recursion that excludes files based on a list of criteria and performs an in place replace based on an answer found here: Search and replace a line in a file in Python

I like the 'if __debug_' solution except that putting it in front of every call is a bit distracting and ugly. I had this same problem and overcame it by writing a script which automatically parses your source files and replaces logging statements with pass statements (and commented out copies of the logging statements). It can also undo this conversion.
I use it when I deploy new code to a production environment when there are lots of logging statements which I don't need in a production setting and they are affecting performance.
You can find the script here: http://dound.com/2010/02/python-logging-performance/

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.