If I have the following architecture...
Please note the edits below. It occurred to me (after some recent refactoring) that there are actually three classes in three different files. Sorry that the file/class names are getting ridiculous. I assure you those are not the real names. :)
main_class.py
from dependent_class import DependentClass

class MainClass(object):
    def do_some_stuff(self):
        dependent_class = DependentClass()
dependent_class.py
from dependent_dependent_class import DependentDependentClass

class DependentClass(object):
    def __init__(self):
        dependent_dependent_class = DependentDependentClass()
        dependent_dependent_class.do_dependent_stuff()
dependent_dependent_class.py
class DependentDependentClass(object):
    def do_dependent_stuff(self):
        print("I'm gonna do production stuff that I want to mock")
        print("Like access a database or interact with a remote server")

class MockDependentDependentClass(object):
    def do_dependent_stuff(self):
        print("Respond as if the production stuff was all successful.")
and I want to call main_class.do_some_stuff during testing, but during its execution have instances of DependentDependentClass replaced with MockDependentDependentClass, how can I do that Pythonically using best practices?
Currently, the best thing I could come up with is to conditionally instantiate one class or the other based on the presence/value of an environment variable. It certainly works but is pretty dirty.
I spent some time reading about the unittest.mock and mock.patch functions and they seem like they might be able to help but each description that I could wrap my head around seemed to be a little different than my actual use case.
The key is that I don't want to define mock return values or attributes; I want the namespace changed, globally I guess, such that when my application thinks it is instantiating DependentDependentClass it is actually instantiating MockDependentDependentClass.
The fact that I can't find any examples of anyone doing exactly this means one of two things:
It's because I'm doing it in a very dumb/naive way.
I'm doing something so genius that no one else has ever encountered it.
... I assume it's number 1...
Full disclosure: unit testing is not something with which I am skilled. It's something my internal tools development team is trying to catch up on to step our game up a bit. It's possible that I'm not thinking about testing correctly.
Any thoughts would be most welcome. Thank you, in advance!
SOLUTION!!!
Thanks to @de1 for the help. Given my clever architecture shown above, the following accomplishes what I want.
The following code is located in main_class.py
from unittest.mock import patch

import dependent_class
from dependent_dependent_class import MockDependentDependentClass

with patch.object(dependent_class, "DependentDependentClass", MockDependentDependentClass):
    main_class = MainClass()
    main_class.do_some_stuff()
The code seems to (and hell if I know how it's doing this) manipulate the namespace within the module dependent_class so that, while inside the with block (that's a context manager for anyone who is hung up on that part) anything referring to the class object DependentDependentClass will actually be referencing MockDependentDependentClass.
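For intuition, here is a rough sketch of the attribute swap that patch.object performs behind the scenes; this is only an illustration (not mock's actual implementation), run from a separate script and assuming the module layout above:

from main_class import MainClass
from dependent_dependent_class import MockDependentDependentClass
import dependent_class

original = dependent_class.DependentDependentClass            # remember the real class
dependent_class.DependentDependentClass = MockDependentDependentClass
try:
    main_class = MainClass()
    main_class.do_some_stuff()                                 # instantiates the mock
finally:
    dependent_class.DependentDependentClass = original         # always restore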
The mock module does indeed seem to be a good fit in this case. You can specify the object to use as the replacement (your own mock) when calling the various patch methods.
If you are importing only the class rather than the module you can patch the imported DependentDependentClass in DependentClass:
from unittest.mock import patch

from . import DependentClass as dependent_class
from .DependentDependentClass import MockDependentDependentClass

with patch.object(dependent_class, 'DependentDependentClass', MockDependentDependentClass):
    ...  # do something while the class is patched
Alternatively:
with patch('yourmodule.DependentClass.DependentDependentClass', MockDependentDependentClass):
    ...  # do something while the class is patched
Or the following, which will only work if you are accessing the class via a module, or you import it after it has been patched:
with patch('yourmodule.DependentDependentClass.DependentDependentClass', MockDependentDependentClass):
    ...  # do something while the class is patched
Just bear in mind which object is being patched, and when.
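To make the "which object, when" point concrete, here is a small hypothetical sketch (module and class names invented for illustration; the producer and consumer files are described in comments) showing that the patch target depends on how the consumer imported the class:

# producer.py (hypothetical)
#     class Thing: ...
#
# consumer_a.py (hypothetical) -- binds the class into its own namespace:
#     from producer import Thing
#     def make():
#         return Thing()
#
# consumer_b.py (hypothetical) -- looks the class up on the module at call time:
#     import producer
#     def make():
#         return producer.Thing()

from unittest.mock import patch

class FakeThing(object):
    pass

# consumer_a copied the name at import time, so patch consumer_a's copy:
with patch('consumer_a.Thing', FakeThing):
    ...

# consumer_b resolves producer.Thing on each call, so patching producer is enough:
with patch('producer.Thing', FakeThing):
    ...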
Note: you might find it less confusing to name your files in lower case, slightly differently from the embedded class(es).
Note 2: If you need to mock a dependency of a dependency of the module under test then it might suggest that you are not testing at the right level.
Related
I am leveraging an existing framework for a tool-building activity based on Python. Let me get straight into my issue:
Let's say the framework I am using has a module named m1.py containing the function below:
def func_should_not_run(*args, **kwargs):
    <doing something>
And there is another module named m2.py containing the class below:
from m1 import func_should_not_run

class JustAClass:
    def __init__(self, *args, **kwargs):
        <All kinds of initialisation..>

    def run_something(self, *args, **kwargs):
        <lots of code before>
        func_should_not_run(*args, **kwargs)
        <lots of code after>
Now my own module, my_mod.py, has the class below, where I create an instance of the framework class above and call m2.JustAClass.run_something inside another method:
class JustAnotherClass:
    def __init__(self, *args, **kwargs):
        <All kinds of initialisation..>
        self.obj1 = JustAClass(*some_args, **some_kwargs)

    def run(self):
        <some code before>
        self.obj1.run_something(*some_other_args, **some_other_kwargs)
        <some code after>
Now, due to an implementation issue with m1.func_should_not_run, which gets called inside m2.JustAClass.run_something, I need to replace it with my own function func_should_run, so that whenever func_should_not_run would be called inside m2.JustAClass.run_something it instead executes func_should_run from my module.
How can I achieve this?
Is there any way I can override the "from m1 import func_should_not_run" statement in m2.py from my_mod.py?
This solution is risky in some respects and could potentially fail because of its side effects, but it is worth mentioning in my opinion.
The idea is to replace (or rather, reload) the module that depends on the module you want to change, after some adjustment. I will start from the code and then show you the problems and limits of this approach:
from m2 import JustAClass

def func_should_run():
    print('This is the function you want to call')

class JustAnotherClass:
    def __init__(self, *args, **kwargs):
        self.obj1 = JustAClass(*args, **kwargs)

    def run(self):
        self.obj1.run_something()

if __name__ == '__main__':
    import importlib
    importlib.import_module('m1').func_should_not_run = func_should_run
    importlib.reload(importlib.import_module('m2'))
    janc = JustAnotherClass()
    janc.run()
Output:
This is the function you want to call
After importing importlib:
importlib.import_module('m1').func_should_not_run = func_should_run: here I import module m1 and rebind the name func_should_not_run to func_should_run. This means that, for all subsequent calls to func_should_not_run, the code executed is that of func_should_run. Obviously, this does not hold for objects that already hold a reference to the old func_should_not_run, such as m2.JustAClass, so:
importlib.reload(importlib.import_module('m2')): here I reload the module m2, which will now use the new version of func_should_not_run, because module m1 is already in the cache (i.e. sys.modules) and therefore will not be reloaded (for this reason transitive reloading cannot occur, unless you trigger it explicitly).
From now on, every instance of JustAnotherClass correctly calls func_should_run.
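To see why the reload is needed at all, here is a minimal sketch (same module names as in the question, and func_should_run is the replacement defined above in my_mod.py) of the two separate bindings involved:

import importlib

m1 = importlib.import_module('m1')
m2 = importlib.import_module('m2')

m1.func_should_not_run = func_should_run   # rebinds the name inside m1 only

# m2 still holds its own reference to the original function, copied into its
# namespace by "from m1 import func_should_not_run" at import time:
assert m2.func_should_not_run is not m1.func_should_not_run

importlib.reload(m2)   # re-executes m2's import line, picking up the patched name
assert m2.func_should_not_run is m1.func_should_not_run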
Should you use importlib.reload() for this?
Typically, reloading a module is useful when you have applied changes to that module and you do not want to restart the whole system to see those changes. In your case, unless you have all the risks of this approach clearly in mind, you are somewhat abusing reload().
What are the main side-effects of this solution?
For a start, reloading has its costs, especially if the module contains initialization code that you do not want to re-execute. This means:
You are inevitably going to execute the module code twice (at least).
Be sure to comment your code explaining that every occurrence of func_should_not_run is actually replaced with func_should_run; even so, this is definitely not good practice and is not maintainable if used in many places.
To conclude, it is a solution as simple as it is risky, one that can be adopted with all the necessary precautions and with the awareness that it is just a hack, not a reasonable design decision.
As I write it, it seems almost surreal to me that I'm actually experiencing this problem.
I have a list of objects. Each of these objects are of instances of an Individual class that I wrote.
Thus, conventional wisdom says that isinstance(myObj, Individual) should return True. However, this was not the case. So I thought that there was a bug in my programming, and printed type(myObj), which to my surprise printed instance and myObj.__class__ gave me Individual!
>>> type(pop[0])
<type 'instance'>
>>> isinstance(pop[0], Individual) # with all the proper imports
False
>>> pop[0].__class__
Genetic.individual.Individual
I'm stumped! What gives?
EDIT: My Individual class
from itertools import count

class Individual:
    ID = count()

    def __init__(self, chromosomes):
        # managed as a list, as order is used to identify chromosomal
        # functions (i.e. chromosome i encodes functionality f)
        self.chromosomes = chromosomes[:]
        self.id = self.ID.next()

    # other methods
This error indicates that the Individual class somehow got created twice. You created pop[0] with one version of Individual, and are checking isinstance against the other one. Although the two are pretty much identical, Python doesn't know that, and isinstance fails. To verify this, check whether pop[0].__class__ is Individual evaluates to False.
Normally classes don't get created twice (unless you use reload) because modules are imported only once, and all class objects effectively remain singletons. However, using packages and relative imports can leave a trap that leads to a module being imported twice. This happens when a script (started with python bla, as opposed to being imported from another module with import bla) contains a relative import. When running the script, Python doesn't know that its imports refer to the Genetic package, so it processes them as absolute, creating a top-level individual module with its own individual.Individual class. Another module correctly imports the Genetic package, which ends up importing Genetic.individual, resulting in the creation of the doppelganger, Genetic.individual.Individual.
To fix the problem, make sure that your script only uses absolute imports, such as import Genetic.individual even if a relative import like import individual appears to work just fine. And if you want to save on typing, use import Genetic.individual as individual. Also note that despite your use of old-style classes, isinstance should still work, since it predates new-style classes. Having said that, it would be highly advisable to switch to new-style classes.
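A quick way to confirm the double-import diagnosis from the interpreter (using the pop and Individual names from the question) is to compare the two class objects and look at sys.modules:

import sys

print(pop[0].__class__)                 # Genetic.individual.Individual
print(Individual.__module__)            # 'individual' if the script imported it top-level
print(pop[0].__class__ is Individual)   # False when two copies of the class exist

# If the module really was imported twice, both names show up in the cache:
print('individual' in sys.modules, 'Genetic.individual' in sys.modules)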
You need to use new-style classes that inherit from
class ClassName(object):
    pass
From your example, you are using old-style classes that inherit from
class Classname:
    pass
EDIT: As @user4815162342 said,
>>> type(pop[0])
<type 'instance'>
is caused by using an old-style class, but this is not the cause of your issues with isinstance. You should instead make sure you don't create the class in more than one place, or if you do, use distinct names. Importing it more than once should not be an issue.
Let's assume that we have a system of modules that exists only in production. At testing time these modules do not exist, but I would still like to write tests for the code that uses them. Let's also assume that I know how to mock all the necessary objects from those modules. The question is: how do I conveniently add module stubs into the current module hierarchy?
Here is a small example. The functionality I want to test is placed in a file called actual.py:
actual.py:
def coolfunc():
    from level1.level2.level3_1 import thing1
    from level1.level2.level3_2 import thing2
    do_something(thing1)
    do_something_else(thing2)
In my test suite I already have everything I need: I have thing1_mock and thing2_mock. Also I have a testing function. What I need is to add level1.level2... into current module system. Like this:
tests.py
import sys
import actual

class SomeTestCase(TestCase):
    thing1_mock = mock1()
    thing2_mock = mock2()

    def setUp(self):
        sys.modules['level1'] = ...  # what should I do here?

    @patch('level1.level2.level3_1.thing1', thing1_mock)
    @patch('level1.level2.level3_2.thing2', thing2_mock)
    def test_some_case(self):
        actual.coolfunc()
I know that I can substitute sys.modules['level1'] with an object containing another object, and so on, but that seems like a lot of code to me. I assume there must be a much simpler and prettier solution; I just cannot find it.
So, no one helped me with my problem and I decided to solve it by myself. Here is a micro-lib called surrogate which allows one to create stubs for non-existing modules.
Lib can be used with mock like this:
from surrogate import surrogate
from mock import patch

@surrogate('this.module.doesnt.exist')
@patch('this.module.doesnt.exist', whatever)
def test_something():
    from this.module.doesnt import exist
    do_something()
Firstly, the @surrogate decorator creates stubs for the non-existing modules, then the @patch decorator can alter them. Just like @patch, @surrogate decorators can be stacked, thus stubbing more than one module path. All stubs exist only for the lifetime of the decorated function.
If anyone gets any use of this lib, that would be great :)
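For anyone who wants to see roughly what such stubbing amounts to without the library, here is a minimal hand-rolled sketch using sys.modules and types.ModuleType; the helper name and the level1.level2.level3_1 path follow the question, and surrogate's real implementation may differ:

import sys
import types
from unittest import TestCase

def stub_module_path(path):
    """Register empty module objects for every package in a dotted path
    and return the names that were added to sys.modules."""
    added = []
    parts = path.split('.')
    for i in range(len(parts)):
        name = '.'.join(parts[:i + 1])
        if name not in sys.modules:
            sys.modules[name] = types.ModuleType(name)
            added.append(name)
        if i:  # attach each child module as an attribute of its parent
            setattr(sys.modules['.'.join(parts[:i])], parts[i], sys.modules[name])
    return added

class SomeTestCase(TestCase):
    def setUp(self):
        self._stubbed = stub_module_path('level1.level2.level3_1')
        sys.modules['level1.level2.level3_1'].thing1 = 'thing1 stub'

    def tearDown(self):
        for name in self._stubbed:
            del sys.modules[name]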
How do people get around this issue:
A controller (say controller.py) imports two models (say a_model.py and b_model.py):
from app.model import a_model
from app.model import b_model
Now let's say that a_model wants to use a function in b_model (say it wants to get something from b_model where the id of the record comes from a query in a_model), so I do this in a_model:
from app.model import b_model
Now since our controller has already imported b_model.py and a_model.py is attempting to do the same, we break the application.
Can someone tell me the best way around this? Maybe use a proxy? Or a library loader?
There's no problem with importing a module from two different modules in Python. Maybe your particular design makes it a problem, but it's not something Python imposes.
Anyhow, you could probably solve the problem by moving common stuff from a_model and b_model to some other module, i.e. model_common, and importing that from both a_model and b_model.
It is fine as long as you don't have circular references, i.e.
main --> model.a_model --> model.b_model
     \-> model.b_model
is OK.
But if you add an import of a_model to b_model.py, things get complicated, as there would be no way to order the loading such that all prerequisites are satisfied for each module.
Python handles this situation less nicely than one would expect: instead of reporting a circular import, it raises an exception about an undefined symbol in one of the modules.
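As a minimal illustration of that failure mode, assume the two model files end up importing names from each other directly (get_record_id and lookup are hypothetical names):

# app/model/a_model.py
from app.model.b_model import lookup        # starts importing b_model

def get_record_id():
    return 42

# app/model/b_model.py
from app.model.a_model import get_record_id
# a_model is only partially initialised at this point, so on recent Python 3
# this raises something like:
# ImportError: cannot import name 'get_record_id' from partially initialized
# module 'app.model.a_model' (most likely due to a circular import)

def lookup(record_id):
    return get_record_id() + record_id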
Is this what you're trying to do?
class A(db.Model):
    b = db.ReferenceProperty(B)

class B(db.Model):
    a = db.ReferenceProperty(A)
If so, the easy fix is probably to turn one of them into a weak reference:
class A(db.Model):
    b = db.ReferenceProperty()

class B(db.Model):
    a = db.ReferenceProperty(A)
This is a crappy solution, no question. I'm not sure if there's a better way to do it.
I am trying to understand, what is monkey patching or a monkey patch?
Is it something like method/operator overloading or delegating?
Does it have anything common with these things?
No, it's not like any of those things. It's simply the dynamic replacement of attributes at runtime.
For instance, consider a class that has a method get_data. This method does an external lookup (on a database or web API, for example), and various other methods in the class call it. However, in a unit test, you don't want to depend on the external data source - so you dynamically replace the get_data method with a stub that returns some fixed data.
Because Python classes are mutable, and methods are just attributes of the class, you can do this as much as you like - and, in fact, you can even replace classes and functions in a module in exactly the same way.
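As a minimal sketch of the get_data scenario described above (the Report class name and the fixed data are invented for illustration):

class Report(object):
    def get_data(self):
        # imagine an external database or web API lookup here
        raise RuntimeError("no external service available in tests")

    def total(self):
        return sum(self.get_data())

def fake_get_data(self):
    return [1, 2, 3]               # fixed data, no external dependency

Report.get_data = fake_get_data    # monkey patch: rebind the attribute on the class

print(Report().total())            # 6, without ever touching the real data source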
But, as a commenter pointed out, use caution when monkeypatching:
If anything else besides your test logic calls get_data as well, it will also call your monkey-patched replacement rather than the original -- which can be good or bad. Just beware.
If some variable or attribute exists that also points to the get_data function by the time you replace it, this alias will not change its meaning and will continue to point to the original get_data. (Why? Python just rebinds the name get_data in your class to some other function object; other name bindings are not impacted at all.)
A MonkeyPatch is a piece of Python code which extends or modifies other code at runtime (typically at startup).
A simple example looks like this:
from SomeOtherProduct.SomeModule import SomeClass

def speak(self):
    return "ook ook eee eee eee!"

SomeClass.speak = speak
Source: MonkeyPatch page on Zope wiki.
What is a monkey patch?
Simply put, monkey patching is making changes to a module or class while the program is running.
Example in usage
There's an example of monkey-patching in the Pandas documentation:
import pandas as pd
def just_foo_cols(self):
    """Get a list of column names containing the string 'foo'."""
    return [x for x in self.columns if 'foo' in x]

pd.DataFrame.just_foo_cols = just_foo_cols  # monkey-patch the DataFrame class
df = pd.DataFrame([list(range(4))], columns=["A", "foo", "foozball", "bar"])
df.just_foo_cols()
del pd.DataFrame.just_foo_cols  # you can also remove the new method
To break this down, first we import our module:
import pandas as pd
Next we create a method definition, which exists unbound and free, outside the scope of any class definition (since the distinction between a function and an unbound method is fairly meaningless, Python 3 does away with the unbound method):
def just_foo_cols(self):
    """Get a list of column names containing the string 'foo'."""
    return [x for x in self.columns if 'foo' in x]
Next we simply attach that method to the class we want to use it on:
pd.DataFrame.just_foo_cols = just_foo_cols # monkey-patch the DataFrame class
And then we can use the method on an instance of the class, and delete the method when we're done:
df = pd.DataFrame([list(range(4))], columns=["A","foo","foozball","bar"])
df.just_foo_cols()
del pd.DataFrame.just_foo_cols # you can also remove the new method
Caveat for name-mangling
If you're using name-mangling (prefixing attributes with a double-underscore, which alters the name, and which I don't recommend) you'll have to name-mangle manually if you do this. Since I don't recommend name-mangling, I will not demonstrate it here.
Testing Example
How can we use this knowledge, for example, in testing?
Say we need to simulate a data retrieval call to an outside data source that results in an error, because we want to ensure correct behavior in such a case. We can monkey patch the data structure to ensure this behavior. (So using a similar method name as suggested by Daniel Roseman:)
import datasource
def get_data(self):
    '''monkey patch datasource.Structure with this to simulate an error'''
    raise datasource.DataRetrievalError
datasource.Structure.get_data = get_data
And when we test it for behavior that relies on this method raising an error, if correctly implemented, we'll get that behavior in the test results.
Just doing the above will alter the Structure object for the life of the process, so you'll want to use setups and teardowns in your unittests to avoid doing that, e.g.:
def setUp(self):
    # retain a pointer to the actual real method:
    self.real_get_data = datasource.Structure.get_data
    # monkey patch it:
    datasource.Structure.get_data = get_data

def tearDown(self):
    # give the real method back to the Structure object:
    datasource.Structure.get_data = self.real_get_data
(While the above is fine, it would probably be a better idea to use the mock library to patch the code. mock's patch decorator would be less error prone than doing the above, which would require more lines of code and thus more opportunities to introduce errors. I have yet to review the code in mock but I imagine it uses monkey-patching in a similar way.)
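For comparison, here is a minimal sketch of the same setup using mock's patch.object, assuming the same hypothetical datasource.Structure as above; patch restores the original method automatically when the test finishes:

from unittest import TestCase
from unittest.mock import patch

import datasource   # hypothetical module from the example above

class StructureErrorTest(TestCase):
    @patch.object(datasource.Structure, 'get_data',
                  side_effect=datasource.DataRetrievalError)
    def test_handles_retrieval_error(self, mock_get_data):
        # exercise code that calls Structure.get_data and expects the error
        with self.assertRaises(datasource.DataRetrievalError):
            datasource.Structure().get_data()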
According to Wikipedia:
In Python, the term monkey patch only refers to dynamic modifications of a class or module at runtime, motivated by the intent to patch existing third-party code as a workaround to a bug or feature which does not act as you desire.
First: monkey patching is an evil hack (in my opinion).
It is often used to replace a method on the module or class level with a custom implementation.
The most common usecase is adding a workaround for a bug in a module or class when you can't replace the original code. In this case you replace the "wrong" code through monkey patching with an implementation inside your own module/package.
Monkey patching can only be done in dynamic languages, of which Python is a good example. Changing a method at runtime instead of updating the object definition is one example; similarly, adding attributes (whether methods or variables) at runtime is considered monkey patching. These things are often done when working with modules you don't have the source for, such that the object definitions can't easily be changed.
This is considered bad because it means that an object's definition does not completely or accurately describe how it actually behaves.
Monkey patching means reopening existing classes or methods in a class at runtime and changing their behavior; it should be used cautiously, and only when you really need to.
As Python is a dynamic programming language, classes are mutable, so you can reopen them and modify or even replace them.
What is monkey patching? Monkey patching is a technique used to dynamically update the behavior of a piece of code at run-time.
Why use monkey patching? It allows us to modify or extend the behavior of libraries, modules, classes or methods at runtime without actually modifying the source code.
Conclusion: Monkey patching is a cool technique, and now we have learned how to do that in Python. However, as we discussed, it has its own drawbacks and should be used carefully.