Testing for external resource consistency / skipping Django tests - Python

I'm writing tests for a Django application that uses an external data source. Obviously, I'm using fake data to test all the inner workings of my class but I'd like to have a couple of tests for the actual fetcher as well. One of these will need to verify that the external source is still sending the data in the format my application expects, which will mean making the request to retrieve that information in the test.
Obviously, I don't want our CI to come down when there is a network problem or the data provider has a spot of downtime. In this case, I would like to throw a warning that skips the rest of that test method and doesn't contribute to an overall failure. This way, if the data arrives successfully I can check it for consistency (failing if there is a problem) but if the data cannot be fetched it logs a warning so I (or another developer) know to quickly check that the data source is ok.
Basically, I would like to test my external source without being dependent on it!
Django's test suite uses Python's unittest module (at least, that's how I'm using it), which looks useful given that its documentation describes Skipping tests and expected failures. This feature is apparently 'new in version 2.7', which explains why I can't get it to work - I've checked the version of unittest I have installed from the console and it appears to be 1.63!
I can't find a later version of unittest on PyPI, so I'm wondering where I can get hold of the unittest version described in that document and whether it will work with Django (1.2).
I'm obviously open to recommendations / discussion on whether or not this is the best approach to my problem :)
[EDIT - additional information / clarification]
As I said, I'm obviously mocking the dependency and doing my tests on that. However, I would also like to be able to check that the external resource (which typically is going to be an API) still matches my expected format, without bringing down CI if there is a network problem or their server is temporarily down. I basically just want to check the consistency of the resource.
Consider the following case...
If you have written a Twitter application, you will have tests for all your application's methods and behaviours - these will use fake Twitter data. This gives you a complete, self-contained set of tests for your application. The problem is that this doesn't actually check that the application works because your application inherently depends on the consistency of Twitter's API. If Twitter were to change an API call (perhaps change the URL, the parameters or the response) the application would stop working even though the unit tests would still pass. (Or perhaps if they were to completely switch off basic authentication!)
My use case is simpler - I have a single XML resource that is used to import information. I have faked the resource and tested my import code, but I would like to have a test that checks that the format of that XML resource has not changed.
My question is about skipping tests in Django's test runner so I can throw a warning if the resource is unavailable without the tests failing, specifically getting a version of Python's unittest module that supports this behaviour. I've given this much background information to allow anyone with experience in this area to offer alternative suggestions.
Apologies for the lengthy question, I'm aware most people won't read this now.
I've 'bolded' the important bits to make it easier to read.

I created a separate answer since your edit invalidated my last answer.
I assume you're running on Python version 2.6 - I believe the changes that you're looking for in unittest are available in Python version 2.7. Since unittest is in the standard library, updating to Python 2.7 should make those changes available to you. Is that an option that will work for you?
One other thing I might suggest is to move the "external source format verification" test(s) into a separate test suite and run it separately from the rest of your unit tests. That way your core unit tests stay fast and you don't have to worry about the external dependencies breaking your main test suites. If you're using Hudson, it should be fairly easy to create a separate job that will handle those tests for you. Just a suggestion.

The new features in unittest in 2.7 have been backported to 2.6 as unittest2. You can just pip install unittest2 and substitute it for unittest; your tests will work as they did, plus you get the new features, without upgrading to 2.7.
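For the consistency check itself, something along these lines should work - this is only a rough sketch, and the feed URL and the '<items>' tag are made-up placeholders for whatever your real resource looks like:

import urllib2
import unittest2 as unittest

class ExternalFeedFormatTest(unittest.TestCase):
    # Hypothetical placeholder for the real data source.
    FEED_URL = 'http://example.com/feed.xml'

    def test_feed_format_unchanged(self):
        try:
            body = urllib2.urlopen(self.FEED_URL, timeout=10).read()
        except (urllib2.URLError, IOError) as exc:
            # Network problem or provider downtime: skip instead of failing,
            # so CI stays green but the skip shows up in the report.
            self.skipTest('External feed unavailable: %s' % exc)
        # The data arrived, so check it still has the expected structure.
        self.assertIn('<items>', body)

If the skip fires, it should show up in the test run's summary, so you (or another developer) will know to go and check the data source.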

What are you trying to test? The code in your Django application(s) or the dependency? Can you just Mock whatever that external dependency is? If you're just wanting to test your Django application, then I would say Mock the external dependency, so your tests are not dependent upon that external resource's availability.
If you can post some code of your "actual fetcher" maybe you will get some tips on how you could use mocks.
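In the meantime, here is roughly what that tends to look like with the mock library - every name below (the module path, fetch_items, run_import) is a made-up stand-in for your own code:

import unittest
import mock

class ImporterTest(unittest.TestCase):

    @mock.patch('myapp.importer.fetch_items')
    def test_import_parses_fetched_data(self, fake_fetch):
        # The external call is replaced, so no network access happens here.
        fake_fetch.return_value = '<items><item id="1"/></items>'

        from myapp.importer import run_import
        result = run_import()

        fake_fetch.assert_called_once_with()
        self.assertEqual(len(result), 1)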

Related

How to allow switching to (nullable) infrastructure stubs in Django

Question - What approach or design pattern would make it easy in Django to use stubs for external integrations when running tests? By 'external integrations' I mean a couple of external REST APIs and a NAS file system. The external integrations are already separate modules/classes.
What I do now -
Currently, I disable external dependencies in tests mainly by sprinkling mock.patch() statements across my test code.
But this is getting impractical (it needs to be applied per test and is easy to forget, especially in higher-level tests), and it ties the tests too closely to the internals of certain modules.
Some details of what I am looking for
I like the concept of 'nullable infrastructure' described at
https://www.jamesshore.com/v2/blog/2018/testing-without-mocks#nullable-infrastructure.
I am especially looking for an approach that integrates well with Django, i.e. considering the settings.py file approach, and running tests via python manage.py test.
I would like to be able to easily:
state that all tests should use the nullable counterpart of an infrastructure class or function
override that behaviour per test, or test class, when needed (e.g. when testing the actual external infrastructure).
I tried the approach outlined in https://medium.com/analytics-vidhya/mocking-external-apis-in-django-4a2b1c9e3025, which basically says to create an interface, a real implementation and a stub implementation. The switching is done using a Django settings parameter and a class decorator on the interface class (which returns the chosen class rather than the interface). But it isn't working out very well: the class decorator in my setup does not work with @override_settings (the decorator applies the settings when Django starts, not when the test runs), and there is really a lot of extra code (which also feels un-Pythonic).
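To make it concrete, this is roughly the shape of switch I am after - the implementation class is looked up from a setting lazily, at call time, so that @override_settings could still swap it per test (all names here are invented):

# integrations/factory.py
from django.conf import settings
from django.utils.module_loading import import_string

def get_document_api():
    # Read the setting at call time, not import time, so that
    # @override_settings can change the implementation per test.
    return import_string(settings.DOCUMENT_API_CLASS)()

# tests.py
from django.test import TestCase, override_settings

@override_settings(DOCUMENT_API_CLASS='integrations.document_api.NullDocumentApi')
class ImportTests(TestCase):
    def test_import_uses_the_null_api(self):
        api = get_document_api()  # resolves to NullDocumentApi here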

What's a more Pythonic way to test parts of my code?

I'm on Windows 10, Python 2.7.13 installed via Anaconda. Recently I've been writing a lot of scripts to read/write data from files to other files, move them around, and do some visualizations with matplotlib. My workflow has been having an Anaconda Prompt open next to Sublime Text, and I copy/paste individual lines into my workspace to test something. This doesn't feel like a "best practice", especially because I can't copy/paste multiple lines with indents, so I have to write them out manually twice. I'd really like to find a better way to work on this. What would you recommend changing?
There are several types of software testing that vary in their complexity and in what they test. Generally speaking, it is good practice to leverage what is known as unit testing. Unit testing is the methodology of writing groups of tests where each test is responsible for testing a small "unit" of code. By only testing individual pieces of your project with each test, you get a very granular idea of which parts of your project are working correctly and which are not. It also allows your tests to be repeatable, source controlled, and automated. Typically each "unit" that a test is written for is a single callable item such as a function or a method of a class.
In order to get the most out of unit testing, your functions and methods need to be single-responsibility entities. This means they should perform one task and one task only, which makes them much easier to test. Python's standard library has a built-in package, appropriately named unittest, to perform this type of testing.
I would start by looking at the unittest package's documentation. It provides more explanation of unit testing and how to use the package in your Python code. You can also use the coverage package to determine how much of your code is exercised by unit tests.
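As a tiny, self-contained illustration (count_lines is an invented helper standing in for one of your file-processing functions):

import unittest

def count_lines(text):
    """Return the number of non-empty lines in a block of text."""
    return sum(1 for line in text.splitlines() if line.strip())

class CountLinesTest(unittest.TestCase):

    def test_ignores_blank_lines(self):
        self.assertEqual(count_lines("a\n\nb\n"), 2)

    def test_empty_string_has_no_lines(self):
        self.assertEqual(count_lines(""), 0)

if __name__ == '__main__':
    unittest.main()

You can run that directly from your Anaconda Prompt and get a repeatable pass/fail report instead of re-pasting lines by hand.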
I hope this helps.

Python Twisted - How to use cache() in a non-rpy script

I am still somewhat new to Python, and have begun learning to use the Twisted framework so that I may set up an asynchronous web server. The details about storing stateful information in the Session object are pretty straightforward, but there is something lacking in the documentation that is throwing me off. The first line in the script in this tutorial reads:
cache()
...rest of the script goes here
This is something that only works in what is called an rpy script - more about that here. The problem is, I don't really want to use an rpy script, and it allegedly is not a requirement. The page I referenced describes rpy scripts as being mainly for experimenting with new ideas AND NOT MUCH ELSE.
My issue is that when I try to run a non-rpy version of my script, I get this error:
NameError: name 'cache' is not defined
Some additional research has told me that cache() is part of the globals of every rpy script, so there is no need to import it. However, the documentation doesn't describe how to use cache() in a non-rpy file. So, there is my question - how can I use cache() in a non-rpy file? I am pretty sure it is just a matter of knowing which module to import, which I do not know. Any help will be appreciated.
A distinctive feature of the handling of rpy scripts by Twisted Web is that the source code is re-evaluated on each request.
cache is an API specifically for rpy scripts to tell the runtime not to re-evaluate the source again. If cache is called, the results of evaluating the source are saved and used to satisfy the next request for that resource.
Since this feature is unique to handling of rpy scripts, there is no need or value in using cache when defining resources for Twisted Web in a different way.
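For example, a plain (non-rpy) Twisted Web setup looks roughly like this - HelloPage is a made-up resource name. The module is only evaluated once per process, so any state on the resource object already persists between requests without cache():

from twisted.web.server import Site
from twisted.web.resource import Resource
from twisted.internet import reactor

class HelloPage(Resource):
    isLeaf = True

    def render_GET(self, request):
        return b"Hello from a plain Twisted Web resource"

# The resource is constructed once and reused for every request.
reactor.listenTCP(8080, Site(HelloPage()))
reactor.run()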
Apparently, if you aren't using an rpy file, you simply don't need to use cache(). I removed that line from the code and it seems to be working fine. Any additional input on this is still appreciated, because the documentation is lacking.

Difference between django-webtest and selenium

I have been reading about testing in Django. One thing that was recommended was the use of django-webtest for functional testing. I found a decent article here that teaches how to go about functional testing with Selenium using Python. But people have also recommended django-webtest, the Django extension of Ian Bicking's WebTest, for testing forms in Django. How is testing with WebTest different from testing with Selenium in the context of Django forms?
So, from a functional testing point of view:
How do django-webtest and Selenium compare?
Do we need both of them, or would either one do?
The key difference is that Selenium runs an actual browser, while WebTest hooks into the WSGI interface.
This results in the following differences:
You can't test JS code with WebTest, since there is nothing to run it.
WebTest is much faster since it hooks into WSGI; this also means a smaller memory footprint.
WebTest does not require actually running the server on a port, so it's a bit easier to parallelize.
WebTest does not catch problems that only occur with actual browsers, like browser-version-specific bugs (cough.. Internet Explorer.. cough..)
Bottom line:
PREFER to use WebTest, unless you MUST use Selenium for things that can't be tested with WebTest.
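To give a feel for the WebTest side, a django-webtest test looks roughly like this (the '/contact/' URL and the 'email' field are invented for the example, and page.form assumes the page contains a single form):

from django_webtest import WebTest

class ContactFormTest(WebTest):

    def test_valid_submission_redirects(self):
        page = self.app.get('/contact/')
        form = page.form
        form['email'] = 'user@example.com'
        response = form.submit().follow()
        self.assertEqual(response.status_code, 200)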
The important thing to know about Selenium is that it's primarily built to be a server-agnostic testing framework. It doesn't matter what framework or server-side implementation is used to create the front-end as long as it behaves as expected. Also, while you can (and when possible you probably should) write tests manually in Selenium, many tests are recorded macros of someone going through the motions that are then turned into code automatically.
On the other hand, django-webtest is built to work specifically on Django websites. It's actually a Django-specific extension to WebTest, which is not Django-only, but WSGI-only (and therefore Python-only). Because of that, it can interact with the application with a higher level of awareness of how things work on the server. This can make running tests faster and can also make it easier to write more granular, detailed tests. Also, unlike Selenium, your tests can't be automatically written as recorded macros.
Otherwise, the two tools have generally the same purpose and are intended to test the same kinds of things. That said, I would suggest picking one rather than using both.

Test framework allowing tests to depend on other tests

I'm wondering if there is a test framework that allows for tests to be declared as being dependent on other tests. This would imply that they should not be run, or that their results should not be prominently displayed, if the tests that they depend on do not pass.
The point of such a setup would be to allow the root cause to be more readily determined in a situation where there are many test failures.
As a bonus, it would be great if there were some way to use an object created by one test as a fixture for other tests.
Is this feature set provided by any of the Python testing frameworks? Or would such an approach be antithetical to unit testing's underlying philosophy?
Or would such an approach be antithetical to unit testing's underlying philosophy?
Yep... if it is a unit test, it should be able to run on its own. Any time I have found someone wanting to create dependencies between tests, it was because the code was structured in a poor manner. I am not saying this is the case for you, but it can often be a sign of a code smell.
Proboscis is a Python test framework that extends Python’s built-in unittest module and Nose with features from TestNG.
Sounds like what you're looking for. Note that it works a bit differently to unittest and Nose, but that page explains how it works pretty well.
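A rough sketch of the decorator-driven style Proboscis uses - the group names are arbitrary, and the exact keyword arguments are best checked against the Proboscis docs:

from proboscis import test, TestProgram
from proboscis.asserts import assert_true

@test(groups=['setup'])
def create_test_database():
    # Runs first; anything depending on the 'setup' group waits for it.
    assert_true(True)

@test(depends_on_groups=['setup'])
def test_query_returns_rows():
    # Skipped rather than failed if the 'setup' group did not pass.
    assert_true(True)

if __name__ == '__main__':
    TestProgram().run_and_exit()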
This seems to be a recurring question - e.g. #3396055
These most probably aren't unit tests, because unit tests should be fast (and independent) - so running them all isn't a big drag. I can see where this might help in short-circuiting integration/regression runs to save time. If this is a major need for you, I'd tag the setup tests with [Core] or some such attribute.
I then proceed to write a build script which has two tasks:
Task N: run all tests in the X, Y, Z dlls marked with the [Core] tag
Task N+1 (depends on Task N): run all tests in the X, Y, Z dlls excluding those marked with the [Core] tag
(Task N+1 shouldn't run if Task N didn't succeed.) It isn't a perfect solution - e.g. it would bail out if any one [Core] test failed. But I guess you should be fixing the Core tests instead of proceeding with the non-Core ones.
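In Python, the same two-stage idea can be sketched with py.test markers (the 'core' marker name is arbitrary); run 'py.test -m core' first, and 'py.test -m "not core"' only if the first stage passes:

import pytest

@pytest.mark.core
def test_external_source_is_reachable():
    # Stage 1: the cheap sanity checks everything else depends on.
    assert True

def test_import_handles_empty_feed():
    # Stage 2: only run once the [Core]-equivalent tests have passed.
    assert True

# Note: newer py.test versions expect custom markers to be declared
# (e.g. a "markers = core: ..." entry in pytest.ini) to avoid warnings.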
It looks like what you need is not to prevent the execution of your dependent tests, but to report the results of your unit tests in a more structured way that allows you to identify when an error in one test cascades into other failed tests.
The test runners py.test, Nosetests and unit2/unittest2 all support the notion of "exiting after the first failure". py.test more generally allows you to specify "--maxfail=NUM" to stop running and reporting after NUM failures. This may already help your case, especially since maintaining and updating dependencies between tests may not be that interesting a task.
