How to allow switching to (nullable) infrastructure stubs in Django - python

Question - What approach or design pattern would make it easy in Django to use stubs for external integrations when running tests? By 'external integrations' I mean a couple of external REST APIs and a NAS file system. The external integrations are already separate modules/classes.
What I do now -
Currently, I disable external dependencies in tests mainly by sprinkling mock.patch() statements across my test code.
But this is getting impractical (it needs to be applied per test, and is easy to forget, especially in more high-level tests), and it ties the tests too closely to the internals of certain modules.
Some details of what I am looking for
I like the concept of 'nullable infrastructure' described at
https://www.jamesshore.com/v2/blog/2018/testing-without-mocks#nullable-infrastructure.
I am especially looking for an approach that integrates well with Django, i.e. one that fits the settings.py approach and running tests via python manage.py test.
I would like to be able to easily:
state that all tests should use the nullable counterpart of an infrastructure class or function
override that behaviour per test, or test class, when needed (e.g. when testing the actual external infrastructure).
I tried the approach outlined in https://medium.com/analytics-vidhya/mocking-external-apis-in-django-4a2b1c9e3025, which basically says to create an interface, a real implementation and a stub implementation. The switching is done using a Django settings parameter and a class decorator on the interface class (which returns the chosen class rather than the interface). But it isn't working out very well: the class decorator in my setup does not work with @override_settings (the decorator applies the settings when Django starts, not when the test runs), and there is really a lot of extra code (which also feels un-pythonic).
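For reference, here is a minimal sketch of the direction I am considering instead: resolve the implementation lazily, through a factory that reads the setting at call time rather than at import time, so @override_settings would take effect per test. The setting name NAS_CLIENT_CLASS and the class paths are just placeholders:

# settings.py (test settings would point at the null implementation instead)
NAS_CLIENT_CLASS = "myproject.infrastructure.nas.NasClient"

# myproject/infrastructure/factories.py
from django.conf import settings
from django.utils.module_loading import import_string

def get_nas_client():
    # Resolved on every call, so @override_settings takes effect per test.
    return import_string(settings.NAS_CLIENT_CLASS)()

# tests.py
from django.test import TestCase, override_settings
from myproject.infrastructure.factories import get_nas_client

class NasTests(TestCase):
    @override_settings(NAS_CLIENT_CLASS="myproject.infrastructure.nas.NullNasClient")
    def test_uses_stub(self):
        client = get_nas_client()  # returns the null implementation

Because nothing is resolved at import time, a global default in settings.py plus per-test overrides would give both of the behaviours listed above.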

Related

Is there a way to automatically mock Python classes completely with methods using type hints?

I am writing a program which interacts with many external services, such as Gmail and Discord, through their respective SDKs. The problem I am running into is that the program makes a lot of network calls and runs expensive computations in development, which I would rather avoid by using stub objects. The SDKs I am using expose their functionality through standard Python classes with type hints. At the moment I am creating stubs for them manually, but that will not be feasible in the long run.
Here is a simplified example to illustrate what I am trying to achieve.
from dataclasses import dataclass

@dataclass
class EmailReceipt:
    receiver_email_address: str
    email_text: str
    ...

class GmailService:
    ...
    def send_email(self, receiver_email: str) -> EmailReceipt:
        """A network call is made here to send the email."""
    # More methods follow
    ...

class GmailServiceStub:
    def send_email(self, receiver_email: str) -> EmailReceipt:
        """Instantiates a random EmailReceipt object and returns it."""
    # More stub methods follow
In development I would like to avoid making requests to the mail server, so I am creating a mock class. The codebase uses dependency injection throughout, so it is trivial to swap in different versions of the GmailService class. I am using mocked versions of external services for rapid development, but I think they could also be used for testing.
All I am doing here is implementing the contract (the send_email method returns an instance of EmailReceipt, disregarding any domain logic) so that it can be used downstream by other classes. At the moment it is just 2 services with 10 methods in total, but it is growing, and I would rather have a tool or a library generate it for me.
So I am wondering if there is a tool or a library which could do it or something close to it, ideally with this interface.
mocked_service = mocker.mock_class(service)
# All methods of mocked_service return appropriate objects/types which can be used downstream.
If it is not possible in Python, are there other programming languages where this would be possible?
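For context, this is the kind of thing I have sketched with the standard library so far: unittest.mock.create_autospec enforces the class's method signatures, and typing.get_type_hints supplies the return types. The _dummy helper and the few types it covers are my own simplification, which is why I am hoping for a proper tool:

import dataclasses
import inspect
from typing import get_type_hints
from unittest.mock import create_autospec

def _dummy(tp):
    # Build a placeholder value; handles dataclasses plus a few primitives.
    if dataclasses.is_dataclass(tp):
        hints = get_type_hints(tp)
        return tp(**{f.name: _dummy(hints[f.name]) for f in dataclasses.fields(tp)})
    if tp is str:
        return "dummy"
    return tp()  # assumes a no-argument constructor

def mock_class(service_cls):
    # Autospec keeps call signatures honest; then wire up typed return values.
    stub = create_autospec(service_cls, instance=True)
    for name, method in inspect.getmembers(service_cls, inspect.isfunction):
        return_type = get_type_hints(method).get("return")
        if return_type is not None:
            getattr(stub, name).return_value = _dummy(return_type)
    return stub

# Usage:
# gmail_stub = mock_class(GmailService)
# receipt = gmail_stub.send_email("a@example.com")  # a real EmailReceipt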

Should I take steps to ensure a Django app can scale before writing it?

So, I'm looking at writing an app with Python 2, Django (plus django-rest-framework), Postgres and Angular.
I'm aware there are lots of things that can be done:
multi-server setup behind load balancer
DB replication/sharding?
caching (in various ways)
swapping DRF serialiser for serpy
running on python3
running on pypy
My question is: which of these (or other things) should really be done right at the start of the project?
Write with scalability in mind.
Scalability is not limited to production servers/environments; it also applies to development environments.
Always write with scalability in mind.
At development
Scalability at development lets you develop the product seamlessly.
Structure your repository
Use a git branching model like GitFlow so that developers can work in parallel, or a single developer can switch between different features. Use a bug tracker.
Design your apps.
Before actually writing a single line of code, write down which apps you are going to write. Design apps so as to minimize relations (ManyToMany, ForeignKey, etc.) and imports. Django offers a pluggable app architecture; feel free to use it wisely.
Write your tests first.
This ensures that you can migrate (in production environments), upgrade and downgrade with less pain and hair loss. Trust me, writing tests feels boring, but it is worth it.
Abstract models, managers
Use abstract models and managers; they eliminate boilerplate model code and help you maintain the code. A sketch follows below.
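A minimal sketch of the idea (the model, manager and field names are made up):

from django.db import models

class TimeStampedModel(models.Model):
    # Audit fields that every concrete model inherits.
    created_at = models.DateTimeField(auto_now_add=True)
    updated_at = models.DateTimeField(auto_now=True)

    class Meta:
        abstract = True  # no table is created for this model

class PublishedManager(models.Manager):
    def published(self):
        return self.filter(is_published=True)

class Article(TimeStampedModel):
    title = models.CharField(max_length=200)
    is_published = models.BooleanField(default=False)

    objects = PublishedManager()  # enables Article.objects.published()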
Name variables, classes and methods descriptively.
Name variables, classes and methods descriptively, so that you can tell what they represent without consulting the documentation.
Document code.
Feel free to document classes and methods, so you or other peers who look at the code get an idea of what it is intended for, rather than having to trace through the code to see what a method is doing.
Use debug toolbar
Use django-debug-toolbar in development. As you test your API, use prefetch_related() and select_related() to minimize or eliminate duplicated queries; see the sketch below.
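A minimal illustration of the difference (the models are hypothetical):

# N+1 queries: one extra query per article to fetch its author
for article in Article.objects.all():
    print(article.author.name)

# 1 query: the author rows are joined in up front
for article in Article.objects.select_related("author"):
    print(article.author.name)

# 2 queries total: prefetch_related batches the many-to-many lookup
for article in Article.objects.prefetch_related("tags"):
    print([tag.name for tag in article.tags.all()])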
Modularize code.
Modularize the code. Python and Django in general encourage the use of modules. Modules are easy to manage. Use classes, inheritance and abstract base classes to reuse code.
Use Continuous Integration
Use continuous integration to test your repo and make sure new pushes don't break the system.
At production.
Scalability at production lets you serve the product seamlessly as the user base grows.
For a multi-server setup
Stick with REST design principles.
Eliminate server-side sessions.
Use a distributed cache like Redis; a sample configuration follows below.
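A minimal sketch using the django-redis package (the connection URL is a placeholder). If you cannot eliminate sessions entirely, store them in the shared cache so any server can handle any request:

# settings.py
CACHES = {
    "default": {
        "BACKEND": "django_redis.cache.RedisCache",
        "LOCATION": "redis://127.0.0.1:6379/1",  # placeholder URL
    }
}

# Session state lives in Redis, not on individual app servers.
SESSION_ENGINE = "django.contrib.sessions.backends.cache"
SESSION_CACHE_ALIAS = "default"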
Swapping DRF serializer for serpy
Start with serpy if you need more speed and you are comfortable with it. It is better to start with serpy than to rewrite DRF serializers later, since writing both looks similar; but make sure you are not wasting time optimizing away the last 1 or 2 ms.
Running on python3
Depends on the libraries that you plan to use.
Running on pypy
pypy is faster than the standard implementation, but whether you can use it depends on library compatibility; there is a list of compatible packages and their compatibility status.
Now, the question:
Which of these (or other things) should really be done right at the start of the project?
Ans: Development (points 1-8 above); Production (points 1 and 2).
I don't think you need to start worrying about the setup right away. I would discourage premature optimization. Rather, run the app in production and profile it. See what affects performance when you hit scale; then you will know where the bottleneck is.
The first and main things you have to get right are a clean and correct DB schema, and clear, readable, correctly factored (DRY, unless it's accidental duplication) and decoupled code. If you know how to design a relational DB schema and learn to use Python and Django properly, you shouldn't have many problems so far; and if you get both of these right, it will (well, it should) be easy to scale: by adding cache where needed (Redis, Memcached, or an intermediary NoSQL document database storing "pre-processed" versions of your often-accessed data), adding servers, load balancing, etc., depending on your application's needs. Django is built to scale easily, and unless you do stupid things, it does scale easily.

Unit Test cases for Python Eve Webservices

We have developed APIs using the Python Eve framework. Is there a way we can write unit test cases for the APIs we have developed in Eve? Is there a unit test component bundled into Python Eve? I need to integrate the tests with my Continuous Integration setup.
If yes, please help me with the steps for how to proceed.
You could start by looking at Eve's own test suite. There are 600+ examples in there. There are two base classes that provide a lot of utility methods: TestMinimal and TestBase. Almost all other test classes inherit from one of those. You probably want to use TestMinimal, as it takes care of setting up and dropping the MongoDB connection for you. It also provides helpers like assert200, assert404, etc.
In general, you use the test_client object, as you do with Flask itself. Have a look at Testing Flask Applications and Eve's Running the Tests page too.
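For illustration, a minimal test in the test_client style; it assumes a 'people' resource is defined in your settings.py, so adjust to your own domain:

import unittest
from eve import Eve

class PeopleEndpointTest(unittest.TestCase):
    def setUp(self):
        # Point Eve at your settings file and use Flask's test client.
        self.app = Eve(settings="settings.py")
        self.app.config["TESTING"] = True
        self.client = self.app.test_client()

    def test_get_people(self):
        response = self.client.get("/people")
        self.assertEqual(response.status_code, 200)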

Testing for external resource consistency / skipping django tests

I'm writing tests for a Django application that uses an external data source. Obviously, I'm using fake data to test all the inner workings of my class but I'd like to have a couple of tests for the actual fetcher as well. One of these will need to verify that the external source is still sending the data in the format my application expects, which will mean making the request to retrieve that information in the test.
Obviously, I don't want our CI to come down when there is a network problem or the data provider has a spot of downtime. In this case, I would like to throw a warning that skips the rest of that test method and doesn't contribute to an overall failure. This way, if the data arrives successfully I can check it for consistency (failing if there is a problem) but if the data cannot be fetched it logs a warning so I (or another developer) know to quickly check that the data source is ok.
Basically, I would like to test my external source without being dependant on it!
Django's test suite uses Python's unittest module (at least, that's how I'm using it), which looks useful, given that its documentation describes skipping tests and expected failures. This feature is apparently 'new in version 2.7', which explains why I can't get it to work - I've checked the version of unittest I have installed from the console and it appears to be 1.63!
I can't find a later version of unittest in pypi so I'm wondering where I can get hold of the unittest version described in that document and whether it will work with Django (1.2).
I'm obviously open to recommendations / discussion on whether or not this is the best approach to my problem :)
[EDIT - additional information / clarification]
As I said, I'm obviously mocking the dependency and doing my tests on that. However, I would also like to be able to check that the external resource (which typically is going to be an API) still matches my expected format, without bringing down CI if there is a network problem or their server is temporarily down. I basically just want to check the consistency of the resource.
Consider the following case...
If you have written a Twitter application, you will have tests for all your application's methods and behaviours - these will use fake Twitter data. This gives you a complete, self-contained set of tests for your application. The problem is that this doesn't actually check that the application works because your application inherently depends on the consistency of Twitter's API. If Twitter were to change an API call (perhaps change the URL, the parameters or the response) the application would stop working even though the unit tests would still pass. (Or perhaps if they were to completely switch off basic authentication!)
My use case is simpler - I have a single xml resource that is used to import information. I have faked the resource and tested my import code but I would like to have a test that checks the format of that xml resource has not changed.
My question is about skipping tests in Django's test runner so I can throw a warning if the resource is unavailable without the tests failing, specifically getting a version of Python's unittest module that supports this behaviour. I've given this much background information to allow anyone with experience in this area to offer alternative suggestions.
Apologies for the lengthy question, I'm aware most people won't read this now.
I've 'bolded' the important bits to make it easier to read.
I created a separate answer since your edit invalidated my last answer.
I assume you're running on Python 2.6 - I believe the changes you're looking for in unittest are available in Python 2.7. Since unittest is in the standard library, updating to Python 2.7 should make those changes available to you. Is that an option that will work for you?
One other thing that I might suggest is to maybe separate the "external source format verification" test(s) into a separate test suite and run that separately from the rest of your unit tests. That way your core unit tests are still fast and you don't have to worry about the external dependencies breaking your main test suites. If you're using Hudson, it should be fairly easy to create a separate job that will handle those tests for you. Just a suggestion.
The new features in unittest in 2.7 have been backported to 2.6 as unittest2. You can just pip install unittest2 and substitute it for unittest; your tests will work as they did before, plus you get the new features without upgrading to 2.7. A sketch of the skip pattern follows below.
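For example, a minimal sketch of the skip-on-unavailable pattern, era-appropriate for Python 2.6 (FEED_URL is a placeholder for your xml resource):

import socket
import urllib2
import unittest2 as unittest

FEED_URL = "http://example.com/feed.xml"  # placeholder

class ExternalFormatTest(unittest.TestCase):
    def test_xml_format_unchanged(self):
        try:
            data = urllib2.urlopen(FEED_URL, timeout=5).read()
        except (urllib2.URLError, socket.timeout):
            # Recorded as a skip, not a failure, so CI stays green.
            self.skipTest("external resource unavailable - check it manually")
        self.assertTrue(data.startswith("<?xml"))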
What are you trying to test: the code in your Django application(s) or the dependency? Can you just mock whatever that external dependency is? If you just want to test your Django application, then I would say mock the external dependency, so your tests do not depend on that external resource's availability.
If you post some code of your "actual fetcher", maybe you will get some tips on how you could use mocks.

Popularity of path hooks (PEP 302 custom import)

My project has the ability to run Python functions remotely. Doing so requires transmitting the modules a given function uses. Determining what to send is done via a modified modulefinder.
As I modify the modulefinder to support arbitrary path_hooks, I've started to get the impression that path_hooks are not all that popular. A quick Google Code search seems to show only the ZipImporter using them. I've noticed one minor project using them (and even then, its loader doesn't support the PEP 302 extension get_code, which the modified modulefinder needs).
Has anyone come across or created projects that use custom path_hooks to access source code?
Yes, I've coded some path hooks (for one of the obvious purposes: accessing modules living in other forms of storage besides the filesystem and zipfiles), but never on an open-source project (and I never actually needed to support modulefinder in them). What difficulties are you encountering? While I can't share my original code, I think I can share the know-how developed with it (though offhand I can't recall any special difficulties - it has been a while). As for "popular", I guess they will be popular in direct proportion to the need to site modules "elsewhere" (e.g. in some form of DB), though of course general "usermode file systems" built e.g. with FUSE, MacFUSE and Dokan may also allow this (and offer other advantages in terms of generality - not sure how performance compares).
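To give a flavour of the know-how, here is a minimal sketch of an old-style PEP 302 path hook of that kind, serving modules from an in-memory dict (the dict and the 'dict://modules' path entry are invented for illustration; my original code used a different store). It also implements get_code, the extension that modulefinder-style tools need:

import imp
import sys

MODULES = {"hello": "def greet():\n    return 'hi'\n"}  # hypothetical store

class DictImporter(object):
    def __init__(self, path):
        # sys.path_hooks tries us for every sys.path entry; reject others.
        if path != "dict://modules":
            raise ImportError
    def find_module(self, fullname, path=None):
        return self if fullname in MODULES else None
    def get_code(self, fullname):
        # The PEP 302 extension that modulefinder-style tools rely on.
        return compile(MODULES[fullname], "<dict:%s>" % fullname, "exec")
    def load_module(self, fullname):
        mod = sys.modules.setdefault(fullname, imp.new_module(fullname))
        mod.__loader__ = self
        exec(self.get_code(fullname), mod.__dict__)
        return mod

sys.path_hooks.append(DictImporter)
sys.path.append("dict://modules")

import hello  # resolved through the hook
print(hello.greet())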
