I have a Python command line script that connects to a database, generates files from the data and sends e-mails. I already have some unit-tests for the important components. Now I'd like to do tests that use several or all components together, load the test database with sample data and check for the correct output.
Are there Python libraries that support this kind of testing? Specifically, I'm looking for easy ways to
put sample data in the database
check for specific changes in the database
check for existence and content of specific files
Should I even do these tests in Python or should I just write a bunch of shell scripts?
I use nosetests
http://somethingaboutorange.com/mrl/projects/nose/1.0.0/
It fits very well with another tool, coverage, which reports how much of your code is actually exercised by the tests.
Yes. The unittest libraries can be used for this.
Don't be fooled by the name. A "unit" is not always a single, standalone class (or function).
A "unit" can be a composite "unit" of one or more classes or modules or packages.
Related
I have written a custom pylint checker to check for non-i18n'ed strings in python code. I'd like to write some unittests for that code now.
But how? I'd like to write tests that take Python code as a string, pass it through my plugin/checker, and check the output.
I'd rather not write the strings out to temporary files and then invoke the pylint binary on it. Is there a cleaner more robust approach, which only tests my code?
Take a look at pylint's test directory. You will find several examples there. For instance, the unittest_checker_*.py files are each unit tests for a single checker.
You may also be interested in functional tests: put an input file in test/input and the expected message file in test/messages, then python test_func.py should run it. While this is aimed at pylint's core checkers, it can easily be adapted to your own project.
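If you prefer to stay inside your own test suite, pylint also ships pylint.testutils.CheckerTestCase, which runs a checker against an astroid node built from a code string, with no temporary files and no pylint binary involved. A rough sketch follows; my_checker/NonI18nStringChecker are placeholders for your checker, the visit_* method name has to match whatever your checker implements, and the assertAddsMessages side of the API differs somewhat between pylint versions.

import astroid
from pylint.testutils import CheckerTestCase

from my_checker import NonI18nStringChecker   # placeholder: your custom checker

class TestNonI18nStringChecker(CheckerTestCase):
    CHECKER_CLASS = NonI18nStringChecker

    def test_translated_string_is_not_flagged(self):
        # The code under test is just a string, parsed into an astroid node
        node = astroid.extract_node('_("already translated")')
        with self.assertNoMessages():
            self.checker.visit_call(node)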
I have a completely non-interactive python program that takes some command-line options and input files and produces output files. It can be fairly easily tested by choosing simple cases and writing the input and expected output files by hand, then running the program on the input files and comparing output files to the expected ones.
1) What's the name for this type of testing?
2) Is there a python package to do this type of testing?
It's not difficult to set up by hand in the most basic form, and I did that already. But then I ran into cases like output files containing the date and other information that can legitimately change between the runs - I considered writing something that would let me specify which sections of the reference files should be allowed to be different and still have the test pass, and realized I might be getting into "reinventing the wheel" territory.
(I rewrote a good part of unittest functionality before I caught myself last time this happened...)
I guess you're referring to a form of system testing.
No package would know which parts can legitimately change. My suggestion is to mock out the sections of code that result in the changes so that you can be sure that the output is always the same - you can use tools like Mock for that. Comparing two files is pretty straightforward, just dump each to a string and compare strings.
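For example, if the only run-dependent part is a timestamp, you could pin the clock while the file is generated and then compare against the reference file directly. In this sketch, myprog.report.generate is a stand-in for your own code, and the patch target assumes that module does "import datetime".

import datetime
import unittest
from unittest import mock   # on Python 2, use the external mock package

import myprog.report        # stand-in for the module under test

class GoldenFileTest(unittest.TestCase):
    def test_output_matches_reference(self):
        frozen = datetime.datetime(2020, 1, 1, 12, 0, 0)
        # Freeze the clock so the generated file is reproducible
        with mock.patch("myprog.report.datetime") as fake_dt:
            fake_dt.datetime.now.return_value = frozen
            myprog.report.generate("input.txt", "output.txt")
        with open("output.txt") as got, open("expected_output.txt") as want:
            self.assertEqual(got.read(), want.read())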
Functional testing. Or regression testing, if that is its purpose. Or code coverage, if you structure your data to cover all code paths.
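For the run-dependent sections, a lighter-weight alternative to mocking is to normalise both files before comparing, masking only the parts that may legitimately differ. A sketch, assuming timestamps are the only volatile part:

import re
import unittest

TIMESTAMP = re.compile(r"\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")

def normalize(text):
    # Mask the run-dependent parts before comparing against the reference
    return TIMESTAMP.sub("<timestamp>", text)

class RegressionTest(unittest.TestCase):
    def test_output_matches_reference(self):
        with open("output.txt") as got, open("expected_output.txt") as want:
            self.assertEqual(normalize(got.read()), normalize(want.read()))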
Short Question
Is it possible to select, at run time, which unit tests are going to be run when using auto-discovery in Python's unittest module?
Background
I am using the unittest module to run system tests on an external system. See below for an example pseudo-testcase. The unittest module allows me to create an arbitrary number of testcases that I can run using unittest's test runner. I have been using this method in constant use for roughly 6 months and it is working out very well.
At this point in time I want to make this more generic and user friendly. For all of the test suites that I am running now, I have hard-coded which tests must run for every system. This is fine for an untested system, but when a test fails incorrectly (a user connects to the wrong test point, etc.), the user must re-run the entire test suite. As some of the complete suites can take up to 20 minutes, this is nowhere near ideal.
I know it is possible to create custom test suite builders that could define which tests to run. My issue with this is that there are hundreds of testcases that can be run, and maintaining this would become a nightmare if test case names change, etc.
My hope was to use nose, or the built-in unittest module, to achieve this. The discovery part seems to be pretty straightforward for both options, but my issue is that the only way to select a subset of testcases to run is to define a pattern that must exist in the testcase name. This means I would still have to hard-code a list of patterns defining which testcases to run. So if I have to hard-code this list, what is the point of using auto-discovery (please note this is a rhetorical question)?
My end goal is to have a generic way to select which unit tests to run during execution, in the form of check boxes or a text field the user can edit. Ideally the solution would use Python 2.7 and would need to run on Windows, OS X, and Linux.
Edit
To help clarify, I do not want the tool to generate the list of choices or the check boxes. Ideally the tool would return a list of all of the tests in a directory, including the full path and which suite (if any) each testcase belongs to. With this list, I would build the check boxes or a combo box for the user to interact with, and pass the selected tests into a test suite built on the fly to run.
Example Testcase
test_measure_5v_reference
1) Connect to DC power supply via GPIB
2) Set DC voltage to 12.0 V
3) Connect to a Digital Multimeter via GPIB
4) Measure / Record the voltage at a given 5V reference test point
5) Use unittest's assert functions to make sure the value is within tolerance
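As a concrete sketch of the steps above (the driver classes, GPIB addresses, and tolerance are all made up here; real code would go through whatever GPIB/VISA layer your API provides):

import unittest

from instrument_drivers import DCPowerSupply, DigitalMultimeter   # hypothetical wrappers

class ReferenceVoltageTest(unittest.TestCase):
    def test_measure_5v_reference(self):
        supply = DCPowerSupply("GPIB0::5::INSTR")    # 1) connect to DC power supply
        supply.set_voltage(12.0)                     # 2) set DC voltage to 12.0 V
        dmm = DigitalMultimeter("GPIB0::7::INSTR")   # 3) connect to the DMM
        reading = dmm.measure_dc_voltage()           # 4) measure the 5 V reference
        self.assertAlmostEqual(reading, 5.0, delta=0.05)   # 5) assert within tolerance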
Store each subset of tests in its own module. Get a list of module names by having the user select them using, as you stated, checkboxes or text entry. Once you have the list of module names, you can build a corresponding test suite doing something similar to the following.
import unittest

testsuite = unittest.TestSuite()
for module in modules:
    testsuite.addTest(unittest.defaultTestLoader.loadTestsFromModule(module))
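To get the list the edit asks for without hard-coding anything, you can also flatten whatever unittest discovery finds into test ids, show those to the user, and load only the selected ones. Something along these lines, assuming the tests live under a tests/ directory that discovery can import from:

import unittest

def list_test_ids(start_dir="tests"):
    # Flatten the discovered suite into dotted test ids the user can pick from
    def flatten(suite):
        for item in suite:
            if isinstance(item, unittest.TestSuite):
                for test in flatten(item):
                    yield test
            else:
                yield item
    discovered = unittest.defaultTestLoader.discover(start_dir)
    return [test.id() for test in flatten(discovered)]

def build_suite(selected_ids):
    # selected_ids would come from your check boxes or text field
    return unittest.defaultTestLoader.loadTestsFromNames(selected_ids)

if __name__ == "__main__":
    ids = list_test_ids()
    print("\n".join(ids))                      # feed these to the UI
    unittest.TextTestRunner(verbosity=2).run(build_suite(ids[:2]))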
I am about to get a bunch of python scripts from an untrusted source.
I'd like to be sure that no part of the code can hurt my system, meaning:
(1) the code is not allowed to import ANY MODULE
(2) the code is not allowed to read or write any data, connect to the network etc
(the purpose of each script is to loop through a list, compute some data from input given to it and return the computed value)
before I execute such code, I'd like to have a script 'examine' it and make sure that there's nothing dangerous there that could hurt my system.
I thought of using the following approach: check that the word 'import' is not used (so we are guaranteed that no modules are imported)
yet, it would still be possible for the user (if desired) to write code to read/write files etc (say, using open).
Then here comes the question:
(1) where can I get a 'global' list of python methods (like open)?
(2) Is there some code that I could add to each script that is sent to me (at the top) that would make some 'global' methods invalid for that script (for example, any use of the keyword open would lead to an exception)?
I know that there are some solutions of python sandboxing. but please try to answer this question as I feel this is the more relevant approach for my needs.
EDIT: suppose that I make sure that no import is in the file, and that no possible hurtful methods (such as open, eval, etc) are in it. can I conclude that the file is SAFE? (can you think of any other 'dangerous' ways that built-in methods can be run?)
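For what it's worth, the per-script guard described in (2) usually amounts to running the code with a hand-picked __builtins__ dictionary, something like the sketch below. But note the answers that follow: this is not a real sandbox and can be escaped, so treat it as a speed bump at best.

# Run untrusted code with only a whitelisted set of builtins visible.
untrusted_code = """
result = sum(x * x for x in values)
"""
allowed_builtins = {"sum": sum, "len": len, "min": min, "max": max, "range": range}
namespace = {"__builtins__": allowed_builtins, "values": [1, 2, 3]}
exec(untrusted_code, namespace)
print(namespace["result"])   # 14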
This point hasn't been made yet, and should be:
You are not going to be able to secure arbitrary Python code.
A VM is the way to go unless you want security issues up the wazoo.
You can still obfuscate import without using eval:
s = '__imp'
s += 'ort__'
f = globals()['__builtins__'].__dict__[s]   # f is now the real __import__ function
** BOOM **
Built-in functions.
Keywords.
Note that you'll need to do things like look for both "file" and "open", as both can open files.
Also, as others have noted, this isn't 100% certain to stop someone determined to insert malicious code.
An approach that should work better than string matching is to use the ast module: parse the Python code, apply your whitelist filtering on the tree (e.g. allow only basic operations), then compile and run the tree.
See this nice example by Andrew Dalke on manipulating ASTs.
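A rough sketch of that idea: parse first, walk the tree, reject anything outside a small whitelist of node types, and only then compile and exec. The whitelist below is only an illustration (it uses ast.Constant, so Python 3.8+), and note that allowing ast.Call at all would reopen most of the holes discussed in the other answers.

import ast

ALLOWED_NODES = (
    ast.Module, ast.Expr, ast.Assign, ast.AugAssign, ast.Name, ast.Load, ast.Store,
    ast.BinOp, ast.Add, ast.Sub, ast.Mult, ast.Div, ast.Constant,
    ast.For, ast.List, ast.Tuple,
)

def compile_if_safe(source):
    tree = ast.parse(source, mode="exec")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError("disallowed syntax: %s" % type(node).__name__)
    return compile(tree, "<untrusted>", "exec")

ns = {"__builtins__": {}}
exec(compile_if_safe("total = 0\nfor x in [1, 2, 3]:\n    total = total + x"), ns)
print(ns["total"])   # 6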
built in functions/keywords:
eval
exec
__import__
open
file
input
execfile
print can be dangerous if you have one of those dumb shells that execute code on seeing certain output
stdin
__builtins__
globals() and locals() must be blocked otherwise they can be used to bypass your rules
There's probably tons of others that I didn't think about.
Unfortunately, crap like this is possible...
object().__reduce__()[0].__globals__["__builtins__"]["eval"]("open('/tmp/l0l0l0l0l0l0l','w').write('pwnd')")
So it turns out that keywords, import restrictions, and the symbols that are in scope by default are not enough on their own; you need to verify the entire object graph...
Use a Virtual Machine instead of running it on a system that you are concerned about.
Without a sandboxed environment, it is impossible to prevent a Python file from doing harm to your system aside from not running it.
It is easy to create a Cryptominer, delete/encrypt/overwrite files, run shell commands, and do general harm to your system.
If you are on Linux, you should be able to use docker to sandbox your code.
For more information, see this GitHub issue: https://github.com/raxod502/python-in-a-box/issues/2.
I did come across this on GitHub, so something like it could be used, but it has a lot of limitations.
Another approach would be to create another Python file which parses the original one, removes the bad code, and runs the file. However, that would still be hit-and-miss.
I have to test a piece of hardware using its provided Python API.
The hardware has two interfaces, one of which has to be programmed through its API and then checked via the other, known-working interface to see whether values are read/written correctly.
Is there a Python library I can use?
It's something like this:
Test 1
1) Write using the interface under test
2) Check that the value was written correctly via the working interface
Then program the hardware using the working interface, and:
Test 2
1) Write using the interface under test and check again
Also try out a range of values for writing within the test, at various speeds set through the API,
and so on...
A log or results file should be created at the end of this series of tests, detailing which tests ran, whether each passed or failed, and any other results from the tests.
Try the unittest module from the standard library (formerly known as PyUnit).
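A minimal shape for this with unittest, including sending the pass/fail report to a results file instead of the console (the hw_api wrapper classes here are placeholders for whatever the vendor API actually gives you):

import unittest

from hw_api import InterfaceUnderTest, WorkingInterface   # placeholders for the vendor API

class WriteReadbackTest(unittest.TestCase):
    def test_write_readback(self):
        dut = InterfaceUnderTest()
        ref = WorkingInterface()
        dut.write_register(0x10, 0xABCD)                      # write via interface under test
        self.assertEqual(ref.read_register(0x10), 0xABCD)     # verify via working interface

if __name__ == "__main__":
    # Send the usual pass/fail report to a results file instead of the console
    with open("results.txt", "w") as stream:
        runner = unittest.TextTestRunner(stream=stream, verbosity=2)
        unittest.main(testRunner=runner, exit=False)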
I'd recommend py.test. It features auto discovery of tests, is non-invasive and you can easily log test results to a file (though that should be possible with every test framework).
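With py.test, the value/speed sweeps from the question map nicely onto parametrized tests (again, the hw_api classes are placeholders for the vendor API), and running with --junitxml=results.xml gives you a machine-readable results file:

import pytest

from hw_api import InterfaceUnderTest, WorkingInterface   # placeholders for the vendor API

VALUES = [0x0000, 0x0001, 0x7FFF, 0xFFFF]
SPEEDS = ["slow", "fast"]

@pytest.mark.parametrize("value", VALUES)
@pytest.mark.parametrize("speed", SPEEDS)
def test_write_readback(value, speed):
    dut = InterfaceUnderTest(speed=speed)   # assumed: the API lets you pick a speed
    ref = WorkingInterface()
    dut.write_register(0x10, value)
    assert ref.read_register(0x10) == value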
Just to be complete, another of these auto-discovery test frameworks is nose (http://code.google.com/p/python-nose/). I normally use straight-up unittest (http://docs.python.org/library/unittest.html), but I am in a possibly more formal environment.
If you want a simple test library with auto-logging, and the ability to control things like speed and retries within a test, you could try the test_steps package, which can be used on its own or together with py.test / nose.