Example
Let's say you have a hypothetical API like this:
import foo
account_name = foo.register()
session = foo.login(account_name)
session.do_something()
The key point being that in order to do_something(), you need to be registered and logged in.
Here's an over-simplified, first-pass, suite of unit tests one might write:
# test_foo.py
import foo
def test_registration_should_succeed():
foo.register()
def test_login_should_succeed():
account_name = foo.register()
foo.login(account_name)
def test_do_something_should_succeed():
account_name = foo.register()
session = foo.login(account_name)
session.do_something()
The Problem
When registration fails, all the tests fail and that makes it non-obvious where
the real problem is. It looks like everything's broken, but really only one, crucial, thing is broken. It's hard to find that once crucial thing unless you are familiar with all the tests.
The Question
How do you structure your unit tests so that subsequent tests aren't executed when core functionality on which they depend fails?
Ideas
Here are possible solutions I've thought of.
Manually detect failures in each test and and raise SkipTest. - Works, but a lot of manual, error-prone work.
Leverage generators to not yield subsequent tests when the primary ones fail. - Not sure if this would actually work (because how do I "know" the previously yielded test failed).
Group tests into test classes. E.g., these are all the unit tests that require you to be logged in. - Not sure this actually addresses my issue. Wouldn't there be just as many failures?
Rather than answering the explicit question, I think a better answer is to use mock objects. In general, unit tests should not require accessing external databases (as is presumably required to log in). However, if you want to have some integration tests that do this (which is a good idea), then those tests should test the integration aspects, and your unit tests should tests the unit aspects. I would go so far as to keep the integration tests and the unit tests in separate files, so that you can quickly run the unit tests on a very regular basis, and run the integration tests on a slightly less regular basis (although still at least once a day).
This problem indicates that Foo is doing to much. You need to separate concerns. Then testing will become easy. If you had a Registrator, a LoginHandler and a DoSomething class, plus a central controller class that orchestrated the workflow, then everything could be tested separately.
Related
Question is, is there a way mark unit tests by different categories: unit, integration, "dangerous_integration". Then from say "Team City" setup, auto skip "dangerous_integration"?
For tests which is marked "Dangerous" - they should not be run automatically. Those need run manually, preferably even on debugger with very close watch?
For python, here's an example where a class I have that can send live trades to crypto exchanges. I essentially shunt all test methods with a "return" as first line of test method to avoid situation where we send live trades out to the street unintentionally.
import unittest
class BinanaceConnectorTests(unittest.TestCase):
def setUp(self):
...
def tearDown(self):
...
def testSendRealOrders(self):
return <-- I hardcode return here so nobody accidentally
... actual implementation shunted by above return ...
Is it the proper way of doing this? I am using Visual Studio. I remember in C# with xunit, there's way you can mark unit tests by different categories.
I am thinking there isnt such thing as "dangerous_integration" tests - if you need such thing, separate those tests out to a manual python script and simply don't mark them as unit tests of any kind. That'd be safest.
The scenarios for selective test execution that you have presented (like dangerous_integration) should probably better be addressed in a different way, as was already discussed in the comments. There are, however, situations where you want certain tests to be excluded in the normal case. For example, you might have some long-running tests which you only want to execute occasionally. Or, some tests produce outputs beyond simple 'passed' or 'failed' results that require thorough analysis, which you also only want to run at times.
One approach can be to group these tests into classes of their own and use this to include or exclude them from the respective runs. Or, if it seems more appropriate to have these tests implemented together with other tests, you could use some naming convention that indicates to which category a test belongs. You have the option to selectively run tests that are grouped or named in certain ways as is described in the Python unittest manual: https://docs.python.org/3.7/library/unittest.html. For example, the attribute testNamePatterns of unittest.TestLoader could be used for the selection of tests based on their names.
I have some resource creation and deletion code that needs to run before and after certain tests, which I've put into a fixture using yield in the usual way. However, before running the tests, I want to verify that the resource creation has happened correctly, and likewise after the deletion, I want to verify that it has happened. I can easily stick asserts into the fixtures themselves, but I'm not sure this is good pytest practice, and I'm concerned that it will make debugging and interpreting the logs harder. Is there a better or canonical way to do validation in pytest?
I had encountered something like this recently - although, I was using unittest instead of pytest.
What I ended up doing was something similar to a method level setup/teardown. That way, future test functions would never be affected by past test functions.
For my use-case, I loaded my test fixtures in this setup function, then ran a couple of basic tests against those tests to ensure validity of fixtures (as part of setup itself). This, I realized, added a bit of time to each test in the class, but ensured that all the fixture data was exactly what I expected it to be (we were loading stuff into a dockerized elasticsearch container). I guess time for running tests is something you can make a judgement call about.
So I'm trying to decide the way to plan and organize a testing suite for my python project but I have a doubt of when a unit test is no longer a unit test and I would love to have some feedback from the comunity.
If I understand correctly:
A Unit test test a minimal part of your code, be if a function/method that does one and only one simple thing, even if it has several use cases.
An Integration test tests that two or more units of you code that are executed under the same context, environment, etc (but trying to keep it to a minimmum of units per integration test) work well together and not only by themselves.
My doubt is: say I have a simple function that performs a HTTP request and returns the content of such request, be it HTML, JSON, etc, it doesn't matter, the fact is that the function is very, very simple but requests information from an external source, like:
import requests
def my_function(arg):
# do something very simple with `arg`, like removing spaces or the simplest thing you can imagine
return requests.get('http://www.google.com/' + arg).content
Now this is a very stupid example, but my doubt is:
Given that this function is requesting information from an external source, when you write a test for it, can you still consider such test a Unit test?
UPDATE: The test for my_function() would stub out calls to the external source so that it doesn't depend on network/db/filesystem/etc so that it's isolated. But the fact that the function that is being tested depends on external sources when running, for example, in production.
Thanks in advance!! :)
P.S.: Of course Maybe I'm not understading 100% de purposes of Unit and Integration testing, so, if I'm mistaken, please point me out where, I'll appreciate it.
Based on your update:
The test for my_function() would stub out calls to the external source
so that it doesn't depend on network/db/filesystem/etc so that it's
isolated. But the fact that the function that is being tested depends
on external sources when running, for example, in production.
As long as the external dependencies are stubbed out during your test then yes you can call it a Unit Test. It's a Unit Test based on how the unit under test behaves in your test suite rather than how the unit behaves in production.
Based on your original question:
Given that this function is requesting information from an external
source, when you write a test for it, can you still consider such test
a Unit test?
No, any test for code that touches or depends on things external to the unit under test is an integration test. This includes the existence of any web, file system, and database requests.
If your code is not 100% isolated from it's dependencies and not 100% reproducible without other components then it is an integration test.
For your example code to be properly unit tested you would need to mock out the call to google.com.
With the code calling google.com, your test would fail if google went down or you lost connection to the internet (ie the test is not 100% isolated). Your test would also fail if the behavior of google changed (ie the test is not 100% reproducable).
I don't think this is an integration test since it doesn't make use of different parts of your application. The function does really one thing and tests for it can be called unit.
On the other side, this particular function has an external dependency and you don't want to be dependent on the network in your tests. This is where mocking would really help a lot.
In other words, isolate the function and make unit tests for it.
Also, make integration tests that would use a more high-level approach and would test parts of your application which call my_function().
Whether or not your code is tested in isolation is not a distinguishing criterion whether your test is a unit test or not. For example, you would (except in very rare cases) not stub standard library functions like sin(x). But, if you don't stub sin(x) that does not make your test an integration test.
When is a test an integration test? A test is an integration test if your goal with the test is to find bugs on integration level. That means, with an integration test you want to find out whether the interaction between two (or more) components is based on the same assumptions on both (all) sides.
Mocking, however, is an orthogonal technique that can be combined with almost all kinds of tests. (In an integration test, however, you can not mock the partners of the interaction of which you want to test - but you can mock additional components).
Since mocking typically causes some effort, it must bring a benefit, like:
significantly speeding up your tests
testing although some software part is not ready yet or buggy
testing exceptional cases that are hard to set up in an integrated software
getting rid of nondeterministic behaviour like timing or randomness
...
If, however, mocking does not solve a real problem, you might be better off using the depended-on component directly.
I'm currently writing a set of unit tests for a Python microblogging library, and following advice received here have begun to use mock objects to return data as if from the service (identi.ca in this case).
However, surely by mocking httplib2 - the module I am using to request data - I am tying the unit tests to a specific implementation of my library, and removing the ability for them to function after refactoring (which is obviously one primary benefit of unit testing in the firt place).
Is there a best of both worlds scenario? The only one I can think of is to set up a microblogging server to use only for testing, but this would clearly be a large amount of work.
You are right that if you refactor your library to use something other than httplib2, then your unit tests will break. That isn't such a horrible dependency, since when that time comes it will be a simple matter to change your tests to mock out the new library.
If you want to avoid that, then write a very minimal wrapper around httplib2, and your tests can mock that. Then if you ever shift away from httplib2, you only have to change your wrapper. But notice the number of lines you have to change is the same either way, all that changes is whether they are in "test code" or "non-test code".
Not sure what your problem is. The mock class is part of the tests, conceptually at least. It is ok for the tests to depend on particular behaviour of the mock objects that they inject into the code being tested. Of course the injection itself should be shared across unit tests, so that it is easy to change the mockup implementation.
I'm wondering if there is a test framework that allows for tests to be declared as being dependent on other tests. This would imply that they should not be run, or that their results should not be prominently displayed, if the tests that they depend on do not pass.
The point of such a setup would be to allow the root cause to be more readily determined in a situation where there are many test failures.
As a bonus, it would be great if there some way to use an object created with one test as a fixture for other tests.
Is this feature set provided by any of the Python testing frameworks? Or would such an approach be antithetical to unit testing's underlying philosophy?
Or would such an approach be
antithetical to unit testing's
underlying philosophy?
Yep...if it is a unit test, it should be able to run on its own. Anytime I have found someone wanting to create dependencies on tests was due to the code being structured in a poor manner. I am not saying this is the instance in your case but it can often be a sign of code smell.
Proboscis is a Python test framework that extends Python’s built-in unittest module and Nose with features from TestNG.
Sounds like what you're looking for. Note that it works a bit differently to unittest and Nose, but that page explains how it works pretty well.
This seems to be a recurring question - e.g. #3396055
It most probably isn't a unit-test, because they should be fast (and independent). So running them all isn't a big drag. I can see where this might help in short-circuiting integration/regression runs to save time. If this is a major need for you, I'd tag the setup tests with [Core] or some such attribute.
I then proceed to write a build script which has two tasks
Taskn : run all tests in X,Y,Z dlls marked with tag [Core]
Taskn+1 depends on Taskn: run all tests in X,Y,Z dlls excluding those marked with tag [Core]
(Taskn+1 shouldn't run if Taskn didn't succeed.) It isn't a perfect solution - e.g. it would just bail out if any one [Core] test failed. But I guess you should be fixing the Core ones instead of proceeding with Non-Core tests.
It looks like what you need is not to prevent the execution of your dependent tests but to report the results of your unit test in a more structured way that allows you to identify when an error in a test cascades onto other failed tests.
The test runners py.test, Nosetests and unit2/unittest2 all support the notion of "exiting after the first failure". py.test more generally allows to specify "--maxfail=NUM" to stop running and reporting after NUM failures. This may already help your case especially since maintaining and updating dependencies for tests may not be that interesting a task.