Repeated single or multiple tests with Nose

Similar to this question, I'd like to have Nose run a test (or all tests) n times -- but not in parallel.
I have a few hundred tests in a project; some are simple unit tests, others are integration tests with some degree of concurrency. Frequently when debugging tests I want to "hit" a test harder; a bash loop works, but makes for a lot of cluttered output -- no more nice single "." for each passing test. Having the ability to beat on the selected tests for some number of trials seems like a natural thing to ask Nose to do, but I haven't found it anywhere in the docs.
What's the simplest way to get Nose to do this (other than a bash loop)?

You can write a nose test as a generator, and nose will then run each function
yielded:
def check_something(arg):
    pass  # some test ...
def test_something():
    for arg in some_sequence:
        yield (check_something, arg)
Using nose-testconfig, you could make the number of test runs a command line argument:
from testconfig import config
# ...
def test_something():
    for n in range(int(config.get("runs", 1))):
        yield (check_something, arg)
Which you'd call from the command line with e.g.
$ nosetests --tc=runs:5
... for more than one run.
Alternatively (but also using nose-testconfig), you could write a decorator:
from functools import wraps
from testconfig import config
def multi(fn):
    @wraps(fn)
    def wrapper():
        for n in range(int(config.get("runs", 1))):
            fn()
    return wrapper
@multi
def test_something():
    pass  # some test ...
And then, if you want to divide your tests into different groups, each with their own command line argument for the number of runs:
from functools import wraps
from testconfig import config
def multi(cmd_line_arg):
    def wrap(fn):
        @wraps(fn)
        def wrapper():
            for n in range(int(config.get(cmd_line_arg, 1))):
                fn()
        return wrapper
    return wrap
@multi("foo")
def test_something():
    pass  # some test ...
@multi("bar")
def test_something_else():
    pass  # some test ...
Which you can call like this:
$ nosetests --tc=foo:3 --tc=bar:7

You'll have to write a script to generate the command, but you can repeat the test names on the command line X times.
nosetests testname testname testname testname testname testname testname
etc.

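The solution I ended up using is a shell script, run_test.sh (it needs bash because of the ((var++)) arithmetic):
#!/bin/bash
var=0
# $1 is the command to run; loop for as long as it exits successfully
while $1; do
    ((var++))
    echo "*** RETRY $var"
done
Usage:
./run_test.sh "nosetests TestName"
It reruns the test indefinitely and stops at the first failure.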

One way is in the test itself:
Change this:
class MyTest(unittest.TestCase):
    def test_once(self):
        ...
To this:
class MyTest(unittest.TestCase):
    def assert_once(self):
        ...
    def test_many(self):
        for _ in range(5):
            self.assert_once()
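A variation on the same idea, assuming Python 3.4+ and plain unittest: wrapping each trial in subTest lets the loop carry on after a failing trial and reports which trial number failed. The TRIALS attribute and the body of assert_once are placeholders.

```python
import unittest


class MyTest(unittest.TestCase):
    TRIALS = 5  # hypothetical knob for the number of repetitions

    def assert_once(self):
        # Placeholder for the real checks.
        self.assertEqual(sum([1, 2, 3]), 6)

    def test_many(self):
        for trial in range(self.TRIALS):
            # Each failing trial is reported separately, tagged with its number.
            with self.subTest(trial=trial):
                self.assert_once()
```

It still counts as one test in the summary, so this helps with diagnosing flaky trials rather than with the output format.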

There should never be a reason to run a test more than once. It's important that your tests are deterministic (i.e. given the same state of the codebase, they always produce the same result.) If this isn't the case, then instead of running tests more than once, you should redesign the tests and/or code so that they are.
For example, one reason why tests fail intermittently is a race condition between the test and the code-under-test (CUT). In this circumstance, a naive response is to add a big 'voodoo sleep' to the test, to 'make sure' that the CUT is finished before the test starts asserting.
This is error-prone though, because if your CUT is slow for any reason (underpowered hardware, loaded box, busy database, etc) then it will fail sporadically. A better solution in this instance is to have your test wait for an event, rather than sleeping.
The event could be anything of your choosing. Sometimes, events you can use are already being generated (e.g. Javascript DOM events, the 'pageRendered' kind of events that Selenium tests can make use of.) Other times, it might be appropriate for you to add code to your CUT which raises an event when it's done (perhaps your architecture involves other components that are interested in events like this.)
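As a rough illustration of waiting on an event rather than sleeping, here is a sketch using threading.Event. Worker is a made-up stand-in for the CUT; the only point is that the test blocks on done.wait() and the timeout just bounds the failure case.

```python
import threading


class Worker:
    """Hypothetical code under test that signals when it has finished."""

    def __init__(self):
        self.done = threading.Event()
        self.result = None

    def start(self):
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        self.result = 42  # stand-in for the real work
        self.done.set()   # raise the "finished" event


def test_worker_finishes():
    worker = Worker()
    worker.start()
    # Returns as soon as the event is set; only waits the full 5s on failure.
    assert worker.done.wait(timeout=5.0), "worker did not finish in time"
    assert worker.result == 42
```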
Often though, you'll need to rewrite the test so that it tries to detect whether your CUT has finished executing (e.g. does the output file exist yet?), and if not, sleeps for 50ms and then tries again. Eventually it should time out and fail, but only after a very long time (e.g. 100 times the expected execution time of your CUT).
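A minimal sketch of that poll-and-retry loop, assuming a helper called wait_until (a hypothetical name) that takes any zero-argument callable as the condition:

```python
import time


def wait_until(condition, timeout=5.0, interval=0.05):
    """Poll condition() until it is truthy or `timeout` seconds have passed.

    Returns True if the condition was met, False if we gave up.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)  # 50ms by default, as suggested above
```

In a test this would look something like assert wait_until(lambda: os.path.exists(path), timeout=10), where path is whatever output your CUT produces.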
Another approach is to design your CUT using 'onion/hexagonal/ports'n'adaptors' principles, which insist that your business logic should be free of all external dependencies. This means that your business logic can be tested using plain old sub-millisecond unit tests, which never touch the network or filesystem. Once this is done, you need far fewer end-to-end system tests, because they now serve just as integration tests and don't need to manipulate every detail and edge case of your business logic through the UI. This approach also yields big benefits in other areas: the CUT design improves (fewer dependencies between components), tests are much easier to write, and the whole test suite takes much less time to run.
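To make the ports-and-adaptors idea a bit more concrete, here is a toy sketch. Everything in it (place_order, InMemoryOrderStore, the order dict) is invented for illustration; the point is only that the business logic talks to an object with a save() method and never to a database or the network directly.

```python
class InMemoryOrderStore:
    """Test adaptor: satisfies the storage port without touching disk or network."""

    def __init__(self):
        self.saved = []

    def save(self, order):
        self.saved.append(order)


def place_order(store, item, quantity):
    """Business logic: depends only on `store` having a save() method (the port)."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    order = {"item": item, "quantity": quantity}
    store.save(order)
    return order


def test_place_order_saves_the_order():
    store = InMemoryOrderStore()
    place_order(store, "widget", 2)
    assert store.saved == [{"item": "widget", "quantity": 2}]
```

In production the same function would be handed a store backed by a real database; the unit test above stays deterministic because nothing in it can be slow or unavailable.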
Using approaches like the above can entirely eliminate the problem of unreliable tests, and I'd recommend doing so, to improve not just your tests, but also your codebase, and your design abilities.

Related

Pytest schedule intervals between groups of tests

Is there any way of telling pytest to run a certain set of tests, then wait for a known amount of time, then run another set of tests? For example, if I have tests with the following requirements:
Each test has 3 parts (3 methods to execute)
Part 2 must not be run for each test until a specific, known amount of time has passed since running part 1.
Part 3 must not be run for each test until a specific, known amount of time has passed since running part 2.
If I stitched parts 1, 2 and 3 together for each test and just used time.sleep(), this would take far too long to execute all tests.
Instead I want to run all of the part 1s back to back, then wait a known amount of time, then run all of the part 2s back to back, then wait a known amount of time, then run all of the part 3s.
It appears that this should be possible to implement using markers https://docs.pytest.org/en/stable/example/markers.html and probably implementing hooks https://docs.pytest.org/en/latest/reference.html#hooks to implement certain behaviour based on the markers used, though I'm not very familiar with pytest hooks.
I also came across pytest-ordering https://pytest-ordering.readthedocs.io/en/develop/ which appears to provide behaviour close to what I'm looking for. I just need a way of waiting between certain groups of tests.

You could put all the part 1 tests in one class and all the part 2 tests in another, and use a class-scoped fixture for the delay, something like this:
import pytest
import time
@pytest.fixture(scope='class')
def delay():
    time.sleep(5)
class TestPart1:
    def test_one_part_1(self):
        assert 1 == 1
    def test_two_part_1(self):
        assert 2 == 2
@pytest.mark.usefixtures("delay")
class TestPart2:
    def test_one_part_2(self):
        assert 1 == 1
    def test_two_part_2(self):
        assert 2 == 2

Call method on many objects in parallel

I wanted to use concurrency in Python for the first time. So I started reading a lot about Python concurrency (GIL, threads vs processes, multiprocessing vs concurrent.futures vs ...) and saw a lot of convoluted examples, even among those using the high-level concurrent.futures library.
So I decided to just start trying stuff and was surprised with the very, very simple code I ended up with:
from concurrent.futures import ThreadPoolExecutor
class WebHostChecker(object):
    def __init__(self, websites):
        self.webhosts = []
        for website in websites:
            self.webhosts.append(WebHost(website))
    def __iter__(self):
        return iter(self.webhosts)
    def check_all(self):
        # sequential:
        #for webhost in self:
        #    webhost.check()
        # threaded:
        with ThreadPoolExecutor(max_workers=10) as executor:
            executor.map(lambda webhost: webhost.check(), self.webhosts)
class WebHost(object):
    def __init__(self, hostname):
        self.hostname = hostname
    def check(self):
        print("Checking {}".format(self.hostname))
        self.check_dns() # only modifies internal state, i.e.: sets self.dns
        self.check_http() # only modifies internal status, i.e.: sets self.http
Using the classes looks like this:
webhostchecker = WebHostChecker(["urla.com", "urlb.com"])
webhostchecker.check_all() # -> this calls .check() on all WebHost instances in parallel
The relevant multiprocessing/threading code is only 3 lines. I barely had to modify my existing code (which I hoped to be able to do when first starting to write the code for sequential execution, but started to doubt after reading the many examples online).
And... it works! :)
It perfectly distributes the IO-waiting among multiple threads and runs in less than 1/3 of the time of the original program.
So, now, my question(s):
What am I missing here?
Could I implement this differently? (Should I?)
Why are other examples so convoluted? (Although I must say I couldn't find an exact example doing a method call on multiple objects)
Will this code get me in trouble when I expand my program with features/code I cannot predict right now?
I think I already know of one potential problem and it would be nice if someone can confirm my reasoning: if WebHost.check() also becomes CPU bound I won't be able to swap ThreadPoolExecutor for ProcessPoolExecutor. Because every process will get cloned versions of the WebHost instances? And I would have to code something to sync those cloned instances back to the original?
Any insights/comments/remarks/improvements/... that can bring me to greater understanding will be much appreciated! :)

Ok, so I'll add my own first gotcha:
If webhost.check() raises an Exception, then the thread just ends and self.dns and/or self.http might NOT have been set. However, with the current code, you won't see the Exception, UNLESS you also access the executor.map() results! Leaving me wondering why some objects raised AttributeErrors after running check_all() :)
This can easily be fixed by just evaluating every result (which is always None, because I'm not letting .check() return anything). You can do it after all threads have run or during. I chose to let Exceptions be raised during (i.e. within the with statement), so the program stops at the first unexpected error:
def check_all(self):
    with ThreadPoolExecutor(max_workers=10) as executor:
        # this alone works, but does not raise any exceptions from the threads:
        #executor.map(lambda webhost: webhost.check(), self.webhosts)
        for i in executor.map(lambda webhost: webhost.check(), self.webhosts):
            pass
I guess I could also use list(executor.map(lambda webhost: webhost.check(), self.webhosts)) but that would unnecessarily use up memory.
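If stopping at the first error is not what you want, a different option is to submit each call and inspect every future, so that all failures are collected. This is a sketch of that alternative, not the code above: check_all here is a free function, and it only assumes each host object has a check() method.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def check_all(webhosts, max_workers=10):
    """Call .check() on every host; return {host: exception} for those that failed."""
    errors = {}
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        # Map each future back to its host so failures can be attributed.
        futures = {executor.submit(host.check): host for host in webhosts}
        for future in as_completed(futures):
            exc = future.exception()  # None if check() returned normally
            if exc is not None:
                errors[futures[future]] = exc
    return errors
```

Hosts whose check() raised end up in the returned dict, which makes it obvious which objects may be missing self.dns or self.http.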

How to ignore tests when session fixture fails in pytest

Let's say I have a test as shown below:
import pytest
import copy
@pytest.fixture(scope='session')
def session_tool(request):
    tool = request.config.tool
    # Build is the critical part and may fail, raising an exception
    tool.build()
    return tool
@pytest.fixture
def tool(session_tool):
    return copy.deepcopy(session_tool)
def test_tool(tool, args):
    assert tool.run(args) == 0
It builds a session-scoped tool and then creates a copy of it for each testcase. But when the build fails, session_tool fixture is executed again for the next testcase, which fails again... until it fails for all testcases. As there are a lot of testcases, it takes some time until the process is finished.
Is there any way to tell pytest to skip all tests that use the session_tool fixture after the first attempt to build fails?

I can think of two approaches:
1) calling pytest.skip() will cause the test to be skipped. This works if it's called from within a fixture as well. In your case, it will cause all the remaining tests to be skipped.
2) calling pytest.exit() will cause your test suite to stop running, as if KeyboardInterrupt was triggered.

Timing a unit test, including the set up

How can you capture the time of an individual unit-test, including the set-up cost?
I've got a test base with a set-up procedure which takes a non-trivial amount of time to complete. I've got several tests which descend from that test base, and I've got a decorator which, in theory, should print out the time it takes to run each test:
import time
import unittest
class TestBase(unittest.TestCase):
    def setUp(self):
        pass  # some setup procedure that takes a long time
def timed_test(decorated_test):
    def run_test(self, *args, **kwargs):
        start = time.time()
        decorated_test(self, *args, **kwargs)
        end = time.time()
        print("test_duration: %s (seconds)" % (end - start))
    return run_test
class TestSomething(TestBase):
    @timed_test
    def test_something_useful(self):
        pass  # some test
Now, when I run these tests it turns out that I'm only printing the time it took for the test to run not including the set-up time. Tangentially, a related question may be: is it best to deal with timing outside of your testing framework?

I would not reinvent the wheel; I would use the nose test runner with the nose-timer plugin:
A timer plugin for nosetests that answers the question: how much time
does every test take?
See more about nose-timer here:
How to benchmark unit tests in Python without adding any code
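If pulling in a plugin is not an option, one stdlib-only sketch is to override TestCase.run(), which wraps setUp, the test method and tearDown, and time that instead of the test method. TimedTestCase and its durations dict are hypothetical names, not part of unittest or nose-timer.

```python
import time
import unittest


class TimedTestCase(unittest.TestCase):
    """Hypothetical base class that times setUp + test + tearDown together."""

    durations = {}  # test id -> seconds, shared by all subclasses

    def run(self, result=None):
        start = time.perf_counter()
        try:
            return super().run(result)
        finally:
            TimedTestCase.durations[self.id()] = time.perf_counter() - start


class TestSomething(TimedTestCase):
    def setUp(self):
        time.sleep(0.05)  # stand-in for a slow set-up procedure

    def test_something_useful(self):
        self.assertTrue(True)
```

After a run, TimedTestCase.durations holds one entry per test, and each figure includes the set-up cost.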

Is there a way to "nice" a method of a Python script

My scripts have multiple components, and only some pieces need to be nice-d, i.e. run at low priority.
Is there a way to nice only one method of a Python script, or do I need to break it down into several processes?
I am using Linux, if that matters.

You could write a decorator that renices the running process on entry and exit:
import os
import functools
def low_priority(f):
    @functools.wraps(f)
    def reniced(*args, **kwargs):
        os.nice(5)  # lower the priority of the whole process
        try:
            return f(*args, **kwargs)  # pass the wrapped function's result through
        finally:
            os.nice(-5)  # restore the previous niceness
    return reniced
Then you can use it this way:
@low_priority
def test():
    pass # Or whatever you want to do.
Disclaimers:
Works on my machine, not sure how universal os.nice is.
As noted below, whether it works or not may depend on your os/distribution, or on being root.
Nice is on a per-process basis. Behaviour with multiple threads per process will likely not be sane, and may crash.
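Given those caveats, the other route the question mentions is to push the low-priority work into its own process, which only ever lowers its priority and so never needs os.nice(-5). A rough, Unix-only sketch: CHILD_CODE and run_niced_child are made-up names, and the child here just reports its niceness instead of doing real work.

```python
import os
import subprocess
import sys

# Hypothetical child snippet: lower own priority by 5, then print the niceness.
CHILD_CODE = "import os; os.nice(5); print(os.nice(0))"


def run_niced_child():
    """Run CHILD_CODE in a separate Python process and return the niceness it reports."""
    completed = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        capture_output=True,
        text=True,
        check=True,
    )
    return int(completed.stdout.strip())
```

The parent's niceness is untouched, since os.nice() in the child only affects the child.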
