I have a program that interacts with and changes block devices (/dev/sda and such) on Linux. I'm using various external commands (mostly from the fdisk and GNU fdisk packages) to control the devices. I have made a class that serves as the interface for most of the basic actions on block devices (for information like: What size is it? Where is it mounted? etc.)
Here is one such method querying the size of a partition:
def get_drive_size(device):
    """Returns the maximum size of the drive, in sectors.

    :device the device identifier (/dev/sda and such)"""
    query_proc = subprocess.Popen(["blockdev", "--getsz", device], stdout=subprocess.PIPE)
    # blockdev returns the number of 512B blocks in a drive
    output, error = query_proc.communicate()
    exit_code = query_proc.returncode
    if exit_code != 0:
        raise Exception("Non-zero exit code", str(error, "utf-8"))  # I have custom exceptions, this is slight pseudo-code
    return int(output)  # should always be valid
So this method accepts a block device path, and returns an integer. The tests will run as root, since this entire program will end up having to run as root anyway.
Should I try to test code such as these methods? If so, how? I could try to create and mount image files for each test, but this seems like a lot of overhead, and is probably error-prone itself. The code expects block devices, so I cannot operate directly on image files in the file system.
I could try mocking, as some answers suggest, but this feels inadequate. If I mock the Popen object, it seems like I start to test the implementation of the method rather than its output. Is this a correct assessment of proper unit-testing methodology in this case?
I am using python3 for this project, and I have not yet chosen a unit-testing framework. In the absence of other reasons, I will probably just use the default unittest framework included in Python.
You should look into the mock module (in Python 3 it is part of the standard library as unittest.mock).
It enables you to run tests without depending on any external resources, while giving you control over how the mocks interact with your code.
I would start with the docs on Voidspace.
Here's an example:
import unittest
from unittest import mock   # on Python 2 / older setups: import mock

class GetDriveSizeTestSuite(unittest.TestCase):
    @mock.patch('path.to.original.file.subprocess.Popen')
    def test_a_scenario_with_mock_subprocess(self, mock_popen):
        # get_drive_size() calls int() on the output, so hand back bytes holding a number
        mock_popen.return_value.communicate.return_value = (b'1024', b'')
        mock_popen.return_value.returncode = 0
        self.assertEqual(1024, get_drive_size('some device'))
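The failure path can be covered the same way. A sketch of such a test (the patch target is still a placeholder for wherever get_drive_size is actually defined):
import unittest
from unittest import mock

class GetDriveSizeFailureTests(unittest.TestCase):
    # get_drive_size would be imported from the same module that is patched below
    @mock.patch('path.to.original.file.subprocess.Popen')
    def test_nonzero_exit_code_raises(self, mock_popen):
        mock_popen.return_value.communicate.return_value = (b'', b'blockdev: error')
        mock_popen.return_value.returncode = 1
        with self.assertRaises(Exception):
            get_drive_size('some device')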
I am trying to develop a test framework for our desktop applications which uses click-bot-like functions.
My goal was to enable non-programmers to write small scripts which could work as test scripts. So my idea is to structure the test scripts by files like:
tests-folder
| -> 01-first-test.py
| -> 02-second-test.py
| ... etc
| -> fixture.py
And then just execute these scripts in alphabetical order. However, I also wanted to have fixtures which would define functions, classes and variables and make them available to the different scripts without the scripts having to import that fixture explicitly. If that works, I could also extend that approach to two or more directory levels.
I could get it working-ish with some hacking around, but I am not entirely convinced. I have a test_sequence.py which looks like this:
from pathlib import Path
from copy import deepcopy

from my_module.test import Failure

def run_test_sequence(test_dir: str):
    error_occured = False
    fixture = {
        'some_pre_defined_variable': 'this is available in all scripts and fixtures',
        'directory_name': test_dir,
    }

    # Check if fixture.py exists and load that first
    fixture_file = Path(test_dir) / 'fixture.py'
    if fixture_file.exists():
        with open(fixture_file.absolute(), 'r') as code:
            exec(code.read(), fixture, fixture)

    # Go over all files in the test sequence directory and execute them
    for test_file in sorted(Path(test_dir).iterdir()):
        if test_file.name == 'fixture.py':
            continue

        # Make a deepcopy, so scripts cannot influence one another
        fixture_copy = deepcopy(fixture)
        fixture_copy.update({
            'some_other_variable': 'this is available in all scripts but not in fixture',
        })

        try:
            with open(test_file.absolute(), 'r') as code:
                exec(code.read(), fixture_copy, fixture_copy)
        except Failure:
            error_occured = True

    return error_occured
This iterates over all files in the directory tests-folder and executes them in order (with fixture.py first). It also makes the local variables, functions and classes from fixture.py available to each test-script.
A test script could then just be arbitrary code that will be executed and if it raises my custom Failure exception, this will be noted as a failed test.
The whole sequence is started with a script that does
from my_module.test_sequence import run_test_sequence

if __name__ == '__main__':
    exit(run_test_sequence('tests-folder'))
This mostly works.
What it cannot do, and what leaves me unsatisfied with this approach:
I cannot debug the scripts themselves. Since the code is loaded as a string and then interpreted, breakpoints inside the test scripts are not recognized.
Calling fixture functions behaves weirdly. When I define a function in fixture.py like:
from my_hello_module import print_hello

def printer():
    print_hello()
I will receive a message during execution that print_hello is undefined because the variables/modules/etc. in the scope surrounding printer are lost.
Stack traces are useless. On failure it shows the stack trace, but of course it only shows my line that says exec(...) and the internals of that call, none of the code that was loaded.
I am sure there are other drawbacks, that I have not found yet, but these are the most annoying ones.
I also tried to find a solution through __import__ but I couldn't get it to inject my custom locals or globals into the imported script.
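Roughly, this importlib-based sketch is the kind of thing I was hoping for (untested; the function and variable names are just illustrative). Because the loader compiles the file under its real name, breakpoints and stack traces should point at the test script itself, while fixture names can still be injected before the script runs:
import importlib.util

def run_script_as_module(script_path, injected_names):
    # Build a module object from the file, so the real file name is attached to the code
    spec = importlib.util.spec_from_file_location(script_path.stem, script_path)
    module = importlib.util.module_from_spec(spec)
    # Pre-populate the module globals before the script body executes
    module.__dict__.update(injected_names)
    spec.loader.exec_module(module)
    return module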
Is there a solution that I am too inexperienced to find, or another built-in Python function that actually does what I am trying to do? Or is there no way to achieve this, and should I rather have each test script import the fixture and the file/directory names itself? I want those scripts to have as few dependencies and as little Python-specific code as possible. Ideally they are just:
from my_module.test import *

click(x, y, LEFT)
write('admin')
press('tab')
write('password')
press('enter')

if text_on_screen('Login successful'):
    succeed('Test successful')
else:
    fail('Could not login')
Additional note: I think I had the debugger working when I still used execfile, but that is not available in Python 3.
I'm using py.test for a somewhat unconventional application. Basically, I want to have user interaction in a test via print() and input() (this is Python 3.5). The ultimate goal is to have semi-automated testing for hardware and multi-layered software which cannot be automatically tested even in principle. Some test cases will ask the testing technician to do something (to be confirmed by enter or press any key or similar on the console) or ask them to take a simple measurement or visually confirm something (to be entered on the console).
Example of what I (naively) want to do:
def test_thingie():
    thingie_init('red')
    print('Testing the thingie.')
    # Ask the testing technician to enter info, or confirm that he has set things up physically
    x = int(input('Technician: How many *RED* widgets are on the thingie? Enter integer:'))
    assert x == the_correct_number
This works when the test file is called with pytest -s to prevent stdin and stdout capturing, but the documented means (with capsys.disabled()) in the py.test documentation don't work, since they only affect stdout and stderr.
What's a good way to make this work using code in the py.test module, no command line options, and ideally per-test?
The platform, for what it's worth, is Windows and I'd prefer not to have this clobber or get clobbered by the wrapping of stdin/out/whatever that results from nested shells, uncommon shells, etc.
no command line options
Use a pytest.ini option (for example addopts = -s under the [pytest] section) or the PYTEST_ADDOPTS environment variable to avoid passing the command line option every time.
ideally per-test?
Use a function-scoped fixture to take user input. Example code:
# contents of conftest.py
import pytest

@pytest.fixture(scope='function')
def take_input(request):
    val = input(request.param)
    return val


# contents of test_input.py
import pytest

@pytest.mark.parametrize('take_input', ('Enter value here:',), indirect=True)
def test_input(take_input):
    assert take_input == "expected string"
In Robot Framework, the execution status for each test case can be either PASS or FAIL. But I have a specific requirement to mark a few tests as NOT EXECUTED when they fail due to dependencies.
I'm not sure how to achieve this and would appreciate expert advice on how to move ahead.
Until a SKIP status is implemented, you can use exitonfailure to stop further execution if a critical test fails, and then post-process the output.xml (and the generated report.html and log.html) to show those tests as "NOT_RUN" (gray) rather than "FAIL" (red).
Here's an example (tested on Robot Framework 3.1.1 and Python 3.6):
First create a new class that extends the abstract class ResultVisitor:
from robot.api import ResultVisitor

class ResultSkippedAfterCritical(ResultVisitor):
    def visit_suite(self, suite):
        suite.set_criticality(critical_tags='Critical')
        for test in suite.tests:
            if test.status == 'FAIL' and "Critical failure occurred" in test.message:
                test.status = 'NOT_RUN'
                test.message = 'Skipping test execution after critical failure.'
Assuming you've already created the suite (for example with TestSuiteBuilder()), run it without creating report.html and log.html:
from robot.api import logger

outputDir = suite.name.replace(" ", "_")
outputFile = "output.xml"
logger.info(f"Running Test Suite: {suite.name}", also_console=True)
result = suite.run(output=outputFile, outputdir=outputDir,
                   report=None, log=None, critical='Critical', exitonfailure=True)
Notice that I've used "Critical" as the identifying tag for critical tests, together with the exitonfailure option.
Then, revisit the output.xml, and create report.html and log.html from it:
import os
from robot.api import ExecutionResult, ResultWriter

revisitOutputFile = os.path.join(outputDir, outputFile)
logger.info(f"Checking skipped tests in {revisitOutputFile} due to critical failures", also_console=True)
result = ExecutionResult(revisitOutputFile)
result.visit(ResultSkippedAfterCritical())
result.save(revisitOutputFile)

reportFile = 'report.html'
logFile = 'log.html'
logger.info(f"Generating {reportFile} and {logFile}", also_console=True)
writer = ResultWriter(result)
writer.write_results(outputdir=outputDir, report=reportFile, log=logFile)
It should display all the tests after the critical failure with a grayed-out "NOT_RUN" status.
There is nothing you can do, robot only supports two values for the test status: pass and fail. You can mark a test as non-critical so it won't break the build, but it will still show up in logs and reports as having been run.
The robot core team has said they will not support this feature. See issue 1732 for more information.
Even though robot doesn't support the notion of skipped tests, you have the option to write a script that scans output.xml and removes tests that you somehow marked as skipped (perhaps by adding a tag to the test). You will also have to adjust the counts of the failed tests in the xml. Once you've modified the output.xml file, you can use rebot to regenerate the log and report files.
If you only need the change to be made for your log/report files you should take a look here for implementing a SuiteVisitor for the --prerebotmodifier option. As stated by Bryan Oakley, this might screw up your pass/fail count if you don't keep that in mind.
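As a rough illustration (my own sketch, not from the docs; the 'depends' tag is only an example of how a dependent test might be marked), such a modifier could look like this:
from robot.api import SuiteVisitor

class MarkDependencyFailuresNotRun(SuiteVisitor):
    """Turn failed tests carrying a (hypothetical) 'depends' tag into NOT_RUN."""
    def visit_test(self, test):
        if test.status == 'FAIL' and 'depends' in test.tags:
            test.status = 'NOT_RUN'
            test.message = 'Not executed because a dependency failed.'
It would then be applied with something like rebot --prerebotmodifier MarkDependencyFailuresNotRun.py output.xml.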
Currently it doesn't seem to be possible to actually alter the test status before output.xml is created, but there are plans to implement that in RF 3.0, and there is a discussion about a skip status.
Another more complex solution would be to create your own output file through implementing a listener to use with the --listener option that creates an output file as it fits your needs (possibly alongside with the original output.xml).
There is also the possibility to set tags during test execution, but I'm not familiar with that yet, so I can't say much about it at the moment. It might be another way to account for those dependency failures, as there are options to ignore certain tagged keywords for the log/report generation.
I solved it this way:
Run Keyword If    ${blabla}==${True}    do-this-task    ELSE    Log To Console    ${PREV_TEST_STATUS}${yellow}| NRUN |
The test is then not executed and is marked as NRUN.
Actually, you can set tags to run whichever tests you like (for sanity testing, regression testing, ...).
Just go to your test script configuration and set the tags.
Then, whenever you want to run, go to the Run tab and tick the check-box Only run tests with these tags / Skip tests with these tags.
Click the Start button and Robot Framework will select any test that matches and run it.
I am trying to make a simple command line client for accessing shares via the Python bindings of gio (yes, the main requirement is to use gio).
I can see that, compared with its predecessor gnome-vfs, it provides some means to do authentication (subclassing MountOperation), and even some methods which are quite specific to Samba shares, like set_domain().
But I'm stuck with this code:
import gio
fh = gio.File("smb://server_name/")
If that server needs authentication, I suppose that a call to fh.mount_enclosing_volume() is needed, as this method takes a MountOperation as a parameter. The problem is that calling this method does nothing, and the logical fh.enumerate_children() (to list the available shares) that comes next fails.
Could anybody provide a working example of how this would be done with gio?
The following appears to be the minimum code needed to mount a volume:
def mount(f):
    op = gio.MountOperation()
    op.connect('ask-password', ask_password_cb)
    f.mount_enclosing_volume(op, mount_done_cb)

def ask_password_cb(op, message, default_user, default_domain, flags):
    op.set_username(USERNAME)
    op.set_domain(DOMAIN)
    op.set_password(PASSWORD)
    op.reply(gio.MOUNT_OPERATION_HANDLED)

def mount_done_cb(obj, res):
    obj.mount_enclosing_volume_finish(res)
(Derived from gvfs-mount.)
In addition, you may need a glib.MainLoop running because GIO mount functions are asynchronous. See the gvfs-mount source code for details.
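For instance, a rough, untested sketch of the wiring might look like this; it reuses mount() and ask_password_cb from above, but has the completion callback also stop the loop:
import gio
import glib

loop = glib.MainLoop()

def mount_done_cb(obj, res):
    obj.mount_enclosing_volume_finish(res)
    loop.quit()                  # stop waiting once the mount has finished

f = gio.File("smb://server_name/")
mount(f)                         # mount_enclosing_volume() is asynchronous...
loop.run()                       # ...so block here until mount_done_cb fires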
Today I was thinking about a Python project I wrote about a year back where I used logging pretty extensively. I remember having to comment out a lot of logging calls in inner-loop-like scenarios (the 90% code) because of the overhead (hotshot indicated it was one of my biggest bottlenecks).
I wonder now if there's some canonical way to programmatically strip out logging calls in Python applications without commenting and uncommenting all the time. I'd think you could use inspection/recompilation or bytecode manipulation to do something like this and target only the code objects that are causing bottlenecks. This way, you could add a manipulator as a post-compilation step and use a centralized configuration file, like so:
[Leave ERROR and above]
my_module.SomeClass.method_with_lots_of_warn_calls
[Leave WARN and above]
my_module.SomeOtherClass.method_with_lots_of_info_calls
[Leave INFO and above]
my_module.SomeWeirdClass.method_with_lots_of_debug_calls
Of course, you'd want to use it sparingly and probably with per-function granularity -- only for code objects that have shown logging to be a bottleneck. Anybody know of anything like this?
Note: There are a few things that make this more difficult to do in a performant manner because of dynamic typing and late binding. For example, any calls to a method named debug may have to be wrapped with an if not isinstance(log, Logger). In any case, I'm assuming all of the minor details can be overcome, either by a gentleman's agreement or some run-time checking. :-)
What about using logging.disable?
I've also found I had to use logging.isEnabledFor if the logging message is expensive to create.
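A quick sketch of both ideas (the logger name and the expensive_state_dump() helper are just placeholders):
import logging

log = logging.getLogger('my_module')

# Globally turn off every message at DEBUG level and below
logging.disable(logging.DEBUG)

# Guard expensive message construction explicitly
if log.isEnabledFor(logging.DEBUG):
    log.debug('State dump: %s', expensive_state_dump())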
Use pypreprocessor, which can also be found on PyPI (the Python Package Index) and fetched with pip.
Here's a basic usage example:
from pypreprocessor import pypreprocessor

pypreprocessor.parse()

#define nologging

#ifdef nologging
...logging code you'd usually comment out manually...
#endif
Essentially, the preprocessor comments out code the way you were doing it manually before. It just does it on the fly conditionally depending on what you define.
You can also remove all of the preprocessor directives and commented-out code from the postprocessed code by adding pypreprocessor.removeMeta = True between the import and parse() statements.
The bytecode output (.pyc) file will contain the optimized output.
Side note: pypreprocessor is compatible with both Python 2 and Python 3.
Disclaimer: I'm the author of pypreprocessor.
I've also seen assert used in this fashion.
assert logging.warn('disable me with the -O option') is None
(I'm guessing that warn always returns None; if not, you'll get an AssertionError.)
But really that's just a funny way of doing this:
if __debug__: logging.warn('disable me with the -O option')
When you run a script with that line in it with the -O option, the line will be removed from the optimized .pyo code. If, instead, you had your own variable, like in the following, you will have a conditional that is always executed (no matter what value the variable is), although a conditional should execute quicker than a function call:
my_debug = True
...
if my_debug: logging.warn('disable me by setting my_debug = False')
So if my understanding of __debug__ is correct, it seems like a nice way to get rid of unnecessary logging calls. The flip side is that it also disables all of your asserts, so it is a problem if you need the asserts.
As an imperfect shortcut, how about mocking out logging in specific modules using something like MiniMock?
For example, if my_module.py was:
import logging

class C(object):
    def __init__(self, *args, **kw):
        logging.info("Instantiating")
You would replace your use of my_module with:
from minimock import Mock
import my_module
my_module.logging = Mock('logging')
c = my_module.C()
You'd only have to do this once, before the initial import of the module.
Getting the level specific behaviour would be simple enough by mocking specific methods, or having logging.getLogger return a mock object with some methods impotent and others delegating to the real logging module.
In practice, you'd probably want to replace MiniMock with something simpler and faster; at the very least something which doesn't print usage to stdout! Of course, this doesn't handle the problem of module A importing logging from module B (and hence A also importing the log granularity of B)...
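As one rough sketch of such a "simpler and faster" stand-in (my own illustration, assuming my_module only calls logging.debug/info/warning/error directly):
import logging

class SelectiveLogging(object):
    """Silently drop debug/info, pass warnings and errors to the real module."""
    def __init__(self, real=logging):
        self.warning = real.warning
        self.warn = real.warning
        self.error = real.error
        self.exception = real.exception
    def debug(self, *args, **kwargs):
        pass
    def info(self, *args, **kwargs):
        pass

import my_module
my_module.logging = SelectiveLogging()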
This will never be as fast as not running the log statements at all, but should be much faster than going all the way into the depths of the logging module only to discover this record shouldn't be logged after all.
You could try something like this:
# Create something that accepts anything
class Fake(object):
    def __getattr__(self, key):
        return self
    def __call__(self, *args, **kwargs):
        return True

# Replace the logging module
import sys
sys.modules["logging"] = Fake()
It essentially replaces (or pre-fills) the entry for the logging module in sys.modules with an instance of Fake, which simply accepts anything. You must run the above code (just once!) before the logging module is used anywhere. Here is a test:
import logging

logging.basicConfig(level=logging.DEBUG,
                    format='%(asctime)s %(levelname)-8s %(message)s',
                    datefmt='%a, %d %b %Y %H:%M:%S',
                    filename='/temp/myapp.log',
                    filemode='w')

logging.debug('A debug message')
logging.info('Some information')
logging.warning('A shot across the bows')
With the above, nothing at all was logged, as was to be expected.
I'd use some fancy logging decorator, or a bunch of them:
import time

def doLogging(logThreshold):
    def logFunction(aFunc):
        def innerFunc(*args, **kwargs):
            if LOGLEVEL >= logThreshold:
                print ">>Called %s at %s" % (aFunc.__name__, time.strftime("%H:%M:%S"))
                print ">>Parameters: ", args, kwargs if kwargs else ""
            try:
                return aFunc(*args, **kwargs)
            finally:
                if LOGLEVEL >= logThreshold:
                    print ">>%s took %s" % (aFunc.__name__, time.strftime("%H:%M:%S"))
        return innerFunc
    return logFunction
All you need is to declare a LOGLEVEL constant in each module (or just declare it globally and import it in all modules) and then you can use it like this:
@doLogging(2.5)
def myPreciousFunction(one, two, three=4):
    print "I'm doing some fancy computations :-)"
    return
And if LOGLEVEL is no less than 2.5 you'll get output like this:
>>Called myPreciousFunction at 18:49:13
>>Parameters: (1, 2)
I'm doing some fancy computations :-)
>>myPreciousFunction took 18:49:13
As you can see, some work is needed for better handling of kwargs, so the default values will be printed if they are present, but that's another question.
You should probably use some logger module instead of raw print statements, but I wanted to focus on the decorator idea and avoid making code too long.
Anyway - with such a decorator you get function-level logging, arbitrarily many log levels, easy application to new functions, and to disable logging you only need to set LOGLEVEL. And you can define different output streams/files for each function if you wish. You can write doLogging as:
def doLogging(logThreshold, outStream=sys.stdout):
    .....
    print >>outStream, ">>Called %s at %s" etc.
And utilize log files defined on a per-function basis.
This is an issue in my project as well--logging ends up on profiler reports pretty consistently.
I've used the _ast module before in a fork of PyFlakes (http://github.com/kevinw/pyflakes) ... and it is definitely possible to do what you suggest in your question--to inspect and inject guards before calls to logging methods (with your acknowledged caveat that you'd have to do some runtime type checking). See http://pyside.blogspot.com/2008/03/ast-compilation-from-python.html for a simple example.
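To give a flavor of the idea, here is a minimal sketch of my own (not taken from PyFlakes or the linked post) that uses an ast.NodeTransformer to drop plain logging.debug(...) statements before compiling:
import ast

class StripDebugCalls(ast.NodeTransformer):
    """Remove expression statements of the form logging.debug(...)."""
    def visit_Expr(self, node):
        call = node.value
        if (isinstance(call, ast.Call)
                and isinstance(call.func, ast.Attribute)
                and call.func.attr == 'debug'
                and isinstance(call.func.value, ast.Name)
                and call.func.value.id == 'logging'):
            return None  # returning None removes the statement from the tree
        return node

source = open('my_module.py').read()
tree = StripDebugCalls().visit(ast.parse(source))
ast.fix_missing_locations(tree)
code_object = compile(tree, 'my_module.py', 'exec')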
Edit: I just noticed MetaPython on my planetpython.org feed--the example use case is removing log statements at import time.
Maybe the best solution would be for someone to reimplement logging as a C module, but I wouldn't be the first to jump at such an...opportunity :p
:-) We used to call that a preprocessor, and although C's preprocessor had some of those capabilities, the "king of the hill" was the preprocessor for IBM mainframe PL/I. It provided extensive language support in the preprocessor (full assignments, conditionals, looping, etc.) and it was possible to write "programs that wrote programs" using just the PL/I PP.
I wrote many applications with full-blown sophisticated program and data tracing (we didn't have a decent debugger for a back-end process at that time) for use in development and testing which then, when compiled with the appropriate "runtime flag" simply stripped all the tracing code out cleanly without any performance impact.
I think the decorator idea is a good one. You can write a decorator to wrap the functions that need logging. Then, for runtime distribution, the decorator is turned into a "no-op" which eliminates the debugging statements.
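A bare-bones sketch of that idea (PRODUCTION is an assumed build/deploy-time flag, not anything standard):
import functools
import logging

PRODUCTION = True   # flip this at build/deploy time

def traced(func):
    """Wrap func with tracing while developing; return it untouched in production."""
    if PRODUCTION:
        return func                      # no wrapper at all: zero per-call overhead
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logging.debug('calling %s with %r %r', func.__name__, args, kwargs)
        return func(*args, **kwargs)
    return wrapper

@traced
def inner_loop_work(x):
    return x * x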
I am doing a project currently that uses extensive logging for testing logic and execution times for a data analysis API using the Pandas library.
I found this thread with a similar concern - e.g. what is the overhead of the logging.debug statements even if the logging.basicConfig level is set to level=logging.WARNING?
I have resorted to writing the following script to comment out or uncomment the debug logging prior to deployment:
import os
import fileinput

comment = True

# exclude files or directories matching string
fil_dir_exclude = ["__", "_archive", ".pyc"]

if comment:
    ## Variables to comment
    source_str = 'logging.debug'
    replace_str = '#logging.debug'
else:
    ## Variables to uncomment
    source_str = '#logging.debug'
    replace_str = 'logging.debug'

# walk through directories
for root, dirs, files in os.walk('root/directory'):
    # where files exist
    if files:
        # for each file
        for file_single in files:
            # build full file name
            file_name = os.path.join(root, file_single)
            # exclude files with matching string
            if not any(exclude_str in file_name for exclude_str in fil_dir_exclude):
                # replace string in line
                for line in fileinput.input(file_name, inplace=True):
                    print "%s" % (line.replace(source_str, replace_str)),
This recurses through the file tree, excludes files based on a list of criteria, and performs an in-place replace based on an answer found here: Search and replace a line in a file in Python
I like the if __debug__ solution, except that putting it in front of every call is a bit distracting and ugly. I had this same problem and overcame it by writing a script which automatically parses your source files and replaces logging statements with pass statements (and commented-out copies of the logging statements). It can also undo this conversion.
I use it when I deploy new code to a production environment when there are lots of logging statements which I don't need in a production setting and they are affecting performance.
You can find the script here: http://dound.com/2010/02/python-logging-performance/