I am trying to understand, what is monkey patching or a monkey patch?
Is that something like methods/operators overloading or delegating?
Does it have anything common with these things?
No, it's not like any of those things. It's simply the dynamic replacement of attributes at runtime.
For instance, consider a class that has a method get_data. This method does an external lookup (on a database or web API, for example), and various other methods in the class call it. However, in a unit test, you don't want to depend on the external data source - so you dynamically replace the get_data method with a stub that returns some fixed data.
Because Python classes are mutable, and methods are just attributes of the class, you can do this as much as you like - and, in fact, you can even replace classes and functions in a module in exactly the same way.
But, as a commenter pointed out, use caution when monkeypatching:
If anything else besides your test logic calls get_data as well, it will also call your monkey-patched replacement rather than the original -- which can be good or bad. Just beware.
If some variable or attribute exists that also points to the get_data function by the time you replace it, this alias will not change its meaning and will continue to point to the original get_data. (Why? Python just rebinds the name get_data in your class to some other function object; other name bindings are not impacted at all.)
A MonkeyPatch is a piece of Python code which extends or modifies
other code at runtime (typically at startup).
A simple example looks like this:
from SomeOtherProduct.SomeModule import SomeClass
def speak(self):
return "ook ook eee eee eee!"
SomeClass.speak = speak
Source: MonkeyPatch page on Zope wiki.
What is a monkey patch?
Simply put, monkey patching is making changes to a module or class while the program is running.
Example in usage
There's an example of monkey-patching in the Pandas documentation:
import pandas as pd
def just_foo_cols(self):
"""Get a list of column names containing the string 'foo'
"""
return [x for x in self.columns if 'foo' in x]
pd.DataFrame.just_foo_cols = just_foo_cols # monkey-patch the DataFrame class
df = pd.DataFrame([list(range(4))], columns=["A","foo","foozball","bar"])
df.just_foo_cols()
del pd.DataFrame.just_foo_cols # you can also remove the new method
To break this down, first we import our module:
import pandas as pd
Next we create a method definition, which exists unbound and free outside the scope of any class definitions (since the distinction is fairly meaningless between a function and an unbound method, Python 3 does away with the unbound method):
def just_foo_cols(self):
"""Get a list of column names containing the string 'foo'
"""
return [x for x in self.columns if 'foo' in x]
Next we simply attach that method to the class we want to use it on:
pd.DataFrame.just_foo_cols = just_foo_cols # monkey-patch the DataFrame class
And then we can use the method on an instance of the class, and delete the method when we're done:
df = pd.DataFrame([list(range(4))], columns=["A","foo","foozball","bar"])
df.just_foo_cols()
del pd.DataFrame.just_foo_cols # you can also remove the new method
Caveat for name-mangling
If you're using name-mangling (prefixing attributes with a double-underscore, which alters the name, and which I don't recommend) you'll have to name-mangle manually if you do this. Since I don't recommend name-mangling, I will not demonstrate it here.
Testing Example
How can we use this knowledge, for example, in testing?
Say we need to simulate a data retrieval call to an outside data source that results in an error, because we want to ensure correct behavior in such a case. We can monkey patch the data structure to ensure this behavior. (So using a similar method name as suggested by Daniel Roseman:)
import datasource
def get_data(self):
'''monkey patch datasource.Structure with this to simulate error'''
raise datasource.DataRetrievalError
datasource.Structure.get_data = get_data
And when we test it for behavior that relies on this method raising an error, if correctly implemented, we'll get that behavior in the test results.
Just doing the above will alter the Structure object for the life of the process, so you'll want to use setups and teardowns in your unittests to avoid doing that, e.g.:
def setUp(self):
# retain a pointer to the actual real method:
self.real_get_data = datasource.Structure.get_data
# monkey patch it:
datasource.Structure.get_data = get_data
def tearDown(self):
# give the real method back to the Structure object:
datasource.Structure.get_data = self.real_get_data
(While the above is fine, it would probably be a better idea to use the mock library to patch the code. mock's patch decorator would be less error prone than doing the above, which would require more lines of code and thus more opportunities to introduce errors. I have yet to review the code in mock but I imagine it uses monkey-patching in a similar way.)
According to Wikipedia:
In Python, the term monkey patch only
refers to dynamic modifications of a
class or module at runtime, motivated
by the intent to patch existing
third-party code as a workaround to a
bug or feature which does not act as
you desire.
First: monkey patching is an evil hack (in my opinion).
It is often used to replace a method on the module or class level with a custom implementation.
The most common usecase is adding a workaround for a bug in a module or class when you can't replace the original code. In this case you replace the "wrong" code through monkey patching with an implementation inside your own module/package.
Monkey patching can only be done in dynamic languages, of which python is a good example. Changing a method at runtime instead of updating the object definition is one example;similarly, adding attributes (whether methods or variables) at runtime is considered monkey patching. These are often done when working with modules you don't have the source for, such that the object definitions can't be easily changed.
This is considered bad because it means that an object's definition does not completely or accurately describe how it actually behaves.
Monkey patching is reopening the existing classes or methods in class at runtime and changing the behavior, which should be used cautiously, or you should use it only when you really need to.
As Python is a dynamic programming language, Classes are mutable so you can reopen them and modify or even replace them.
What is monkey patching? Monkey patching is a technique used to dynamically update the behavior of a piece of code at run-time.
Why use monkey patching? It allows us to modify or extend the behavior of libraries, modules, classes or methods at runtime without
actually modifying the source code
Conclusion Monkey patching is a cool technique and now we have learned how to do that in Python. However, as we discussed, it has its
own drawbacks and should be used carefully.
Related
I have a complex function that calls many other 3rd party methods. I monkeypatched them out one by one:
import ThirdParty as tp
def my_method():
tp.func_3rd_party_1()
...
tp.func_3rd_party_5()
return "some_value"
In my test:
import pytest
def test_my_method(monkeypatch):
monkeypatch.setattr(ThirdParty, 'func_3rd_party_1', some_mock_1())
...
monkeypatch.setattr(ThirdParty, 'func_3rd_party_5', some_mock_5())
return_value = my_method()
assert return value
This runs just fine but the test feels too implicit for me in this form. I'd like to explicitly state that the monkeypatched methods were indeed called.
For the record, my mocked methods are not using any inbuilt Mock library resource. They are just redefined methods (smart stubs).
Is there any way to assert for that?
So the pytest monkeypatching fixture is specifically provided so you can change some global attributes like environment variables, stuff in third party libraries, etc, to provide some controlled and easy behavior for your test.
The Mock objects, on the other hand, are meant to provide all sorts of tracking and inspection on the object.
The two go hand in hand: You use patching to replace some third party function with a Mock object, then execute your code, and then ask the Mock object if it has indeed been invoked with the right arguments, for the right number of times.
Note that even though the mock module is part of unittest, it works perfectly fine with pytest.
Now as for the patching itself, it's up to your personal preference, and depends a bit on what exactly you want to patch, whether using unittest.mock.patch is more compact or pytest's monkeypatch fixture.
import pytest
from unittest.mock import Mock
def test_my_method(monkeypatch):
# refer to the mock module documentation for more complex
# set ups, where the mock object _also_ exhibits some behavior.
# as is, calling the function doesn't actually _do_ anything.
some_mock_1 = Mock()
...
some_mock_5 = Mock(return_value=66)
monkeypatch.setattr(ThirdParty, 'func_3rd_party_1', some_mock_1)
...
monkeypatch.setattr(ThirdParty, 'func_3rd_party_5', some_mock_5)
some_mock_1.assert_called_once()
some_mock_5.assert_called_with(42)
...
Now a note on this type of testing: Don't go overboard! It can quite easily lead to what's called brittle tests: Tests that break with the slightest change to your code. It can make refactoring an impossible neightmare.
These types of assertions are best when you use them in a message-focused object-oriented approach. If the whole point of the class or method under test is to invoke, in a particular way, the method or class of another object, then Mock away. If the calls to third party functions on the other hand are merely a means to an end, then go a level higher with your test and test for the desired behavior instead.
If I have the following architecture...
Please note the edits below. It occurred to me (after some recent refactoring) that there are actually three classes in three different files. Sorry that the file/class names are getting ridiculous. I assure you those are not the real names. :)
main_class.py
class MainClass(object):
def do_some_stuff(self):
dependent_class = DependentClass()
dependent_class.py
class DependentClass(object):
def __init__():
dependent_dependent_class = DependentDependentClass()
dependent_dependent_class.do_dependent_stuff()
dependent_dependent_class.py
class DependentDependentClass(object):
def do_dependent_stuff(self):
print "I'm gonna do production stuff that I want to mock"
print "Like access a database or interact with a remote server"
class MockDependentDependentClass(object):
def do_dependent_stuff(self):
print "Respond as if the production stuff was all successful."
and I want to call main_class.do_some_stuff during testing but, during its execution, I want instances of DependentDependentClass replaced with MockDependentDependentClass how can I do that pythonically using best practices.
Currently, the best thing I could come up with is to conditionally instantiate one class or the other based on the presence/value of an environment variable. It certainly works but is pretty dirty.
I spent some time reading about the unittest.mock and mock.patch functions and they seem like they might be able to help but each description that I could wrap my head around seemed to be a little different than my actual use case.
The key is that I don't want to define mock return values or attributes but that I want the namespace changed, globally, I guess, such that when my application thinks it is instantiating DependentClass it is actually instantiating MockDependentClass.
The fact that I can't find any examples of anyone doing exactly this means one of two things:
It's because I'm doing it in a very dumb/naive way.
I'm doing something so genius no else has ever encountered it.
... I assume it's number 1...
Full disclosure, unit testing is not something with which I am skilled. It's an effort that my internal tools development team is trying to catch up to step our game up a bit. It's possible that I'm not thinking about testing correctly.
Any thoughts would be most welcome. Thank you, in advance!
SOLUTION!!!
Thanks to #de1 for the help. Given my clever architecture shown above the following accomplishes what I want.
The following code is located in main_class.py
import dependent_class
from dependent_dependent_class import MockDependentDependentClass
with patch.object(dependent_class, "DependentDependentClass", MockDependentDependentClass):
main_class = MainClass()
main_class.do_some_stuff()
The code seems to (and hell if I know how it's doing this) manipulate the namespace within the module dependent_class so that, while inside the with block (that's a context manager for anyone who is hung up on that part) anything referring to the class object DependentDependentClass will actually be referencing MockDependentDependentClass.
The mock module does indeed seem to be a good fit in this case. You can specify the mock (your mock) to use when calling the various patch methods.
If you are importing only the class rather than the module you can patch the imported DependentDependentClass in DependentClass:
import .DependentClass as dependent_class
from .DependentDependentClass import MockDependentDependentClass
with patch.object(dependent_class, 'DependentDependentClass', MockDependentDependentClass):
# do something while class is patched
Alternatively:
with patch('yourmodule.DependentClass.DependentDependentClass', MockDependentDependentClass):
# do something while class is patched
or the following will only work if you are accessing the class via a module or import it after it is being patched:
with patch('yourmodule.DependentDependentClass.DependentDependentClass', MockDependentDependentClass):
# do something while class is patched
Just bare in mind what object is being patched, when.
Note: you might find it less confusing naming your files in lower case, slightly different to the embedded class(es).
Note 2: If you need to mock a dependency of a dependency of the module under test then it might suggest that you are not testing at the right level.
I want to test a function in python, but it relies on a module-level "private" function, that I don't want called, but I'm having trouble overriding/mocking it. Scenario:
module.py
_cmd(command, args):
# do something nasty
function_to_be_tested():
# do cool things
_cmd('rm', '-rf /')
return 1
test_module.py
import module
test_function():
assert module.function_to_be_tested() == 1
Ideally, in this test I dont want to call _cmd. I've looked at some other threads, and I've tried the following with no luck:
test_function():
def _cmd(command, args):
# do nothing
pass
module._cmd = _cmd
although checking module._cmd against _cmd doesn't give the correct reference. Using mock:
from mock import patch
def _cmd_mock(command, args):
# do nothing
pass
#patch('module._cmd', _cmd_mock)
test_function():
...
gives the correct reference when checking module._cmd, although `function_to_be_tested' still uses the original _cmd (as evidenced by it doing nasty things).
This is tricky because _cmd is a module-level function, and I dont want to move it into a module
[Disclaimer]
The synthetic example posted in this question works and the described issue become from specific implementation in production code. Maybe this question should be closed as off topic because the issue is not reproducible.
[Note] For impatient people Solution is at the end of the answer.
Anyway that question given to me a good point to thought: how we can patch a method reference when we cannot access to the variable where the reference is?
Lot of times I found some issue like this. There are lot of ways to meet that case and the commons are
Decorators: the instance we would like replace is passed as decorator argument or used in decorator static implementation
What we would like to patch is a default argument of a method
In both cases maybe refactor the code is the best way to play with that but what about if we are playing with some legacy code or the decorator is a third part decorator?
Ok, we have the back on the wall but we are using python and in python nothing is impossible. What we need is just the reference of the function/method to patch and instead of patching its reference we can patch the __code__: yes I'm speaking about patching the bytecode instead the function.
Get a real example. I'm using default parameter case that is simple, but it works either in decorator case.
def cmd(a):
print("ORIG {}".format(a))
def cmd_fake(a):
print("NEW {}".format(a))
def do_work(a, c=cmd):
c(a)
do_work("a")
cmd=cmd_fake
do_work("b")
Output:
ORIG a
ORIG b
Ok In this case we can test do_work by passing cmd_fake but there some cases where is impossible do it: for instance what about if we need to call something like that:
def what_the_hell():
list(map(lambda a:do_work(a), ["c","d"]))
what we can do is patch cmd.__code__ instead of _cmd by
cmd.__code__ = cmd_fake.__code__
So follow code
do_work("a")
what_the_hell()
cmd.__code__ = cmd_fake.__code__
do_work("b")
what_the_hell()
Give follow output:
ORIG a
ORIG c
ORIG d
NEW b
NEW c
NEW d
Moreover if we want to use a mock we can do it by add follow lines:
from unittest.mock import Mock, call
cmd_mock = Mock()
def cmd_mocker(a):
cmd_mock(a)
cmd.__code__=cmd_mocker.__code__
what_the_hell()
cmd_mock.assert_has_calls([call("c"),call("d")])
print("WORKS")
That print out
WORKS
Maybe I'm done... but OP still wait for a solution of his issue
from mock import patch, Mock
cmd_mock = Mock()
#A closure for grabbing the right function code
def cmd_mocker(a):
cmd_mock(a)
#patch.object(module._cmd,'__code__', new=cmd_mocker.__code__)
test_function():
...
Now I should say never use this trick unless you are with the back on the wall. Test should be simple to understand and to debug ... try to debug something like this and you will become mad!
As I write it, it seems almost surreal to me that I'm actually experiencing this problem.
I have a list of objects. Each of these objects are of instances of an Individual class that I wrote.
Thus, conventional wisdom says that isinstance(myObj, Individual) should return True. However, this was not the case. So I thought that there was a bug in my programming, and printed type(myObj), which to my surprise printed instance and myObj.__class__ gave me Individual!
>>> type(pop[0])
<type 'instance'>
>>> isinstance(pop[0], Individual) # with all the proper imports
False
>>> pop[0].__class__
Genetic.individual.Individual
I'm stumped! What gives?
EDIT: My Individual class
class Individual:
ID = count()
def __init__(self, chromosomes):
self.chromosomes = chromosomes[:] # managed as a list as order is used to identify chromosomal functions (i.e. chromosome i encodes functionality f)
self.id = self.ID.next()
# other methods
This error indicates that the Individual class somehow got created twice. You created pop[0] with one version of Instance, and are checking for instance with the other one. Although they are pretty much identical, Python doesn't know that, and isinstance fails. To verify this, check whether pop[0].__class__ is Individual evaluates to false.
Normally classes don't get created twice (unless you use reload) because modules are imported only once, and all class objects effectively remain singletons. However, using packages and relative imports can leave a trap that leads to a module being imported twice. This happens when a script (started with python bla, as opposed to being imported from another module with import bla) contains a relative import. When running the script, python doesn't know that its imports refer to the Genetic package, so it processes its imports as absolute, creating a top-level individual module with its own individual.Individual class. Another other module correctly imports the Genetic package which ends up importing Genetic.individual, which results in the creation of the doppelganger, Genetic.individual.Individual.
To fix the problem, make sure that your script only uses absolute imports, such as import Genetic.individual even if a relative import like import individual appears to work just fine. And if you want to save on typing, use import Genetic.individual as individual. Also note that despite your use of old-style classes, isinstance should still work, since it predates new-style classes. Having said that, it would be highly advisable to switch to new-style classes.
You need to use new-style classes that inherit from
class ClassName(object):
pass
From your example, you are using old-style classes that inherit from
class Classname:
pass
EDIT: As #user4815162342 said,
>>> type(pop[0])
<type 'instance'>
is caused by using an old-style class, but this is not the cause of your issues with isinstance. You should instead make sure you don't create the class in more than one place, or if you do, use distinct names. Importing it more than once should not be an issue.
I'd like to have several of the doctests in a file share test data and/or functions. Is there a way to do this without locating them in an external file or within the code of the file being tested?
update
"""This is the docstring for the module ``fish``.
I've discovered that I can access the module under test
from within the doctest, and make changes to it, eg
>>> import fish
>>> fish.data = {1: 'red', 2: 'blue'}
"""
def jef():
"""
Modifications made to the module will persist across subsequent tests:
>>> import fish
>>> fish.data[1]
'red'
"""
pass
def seuss():
"""
Although the doctest documentation claims that
"each time doctest finds a docstring to test,
it uses a shallow copy of M‘s globals",
modifications made to the module by a doctest
are not imported into the context of subsequent docstrings:
>>> data
Traceback (most recent call last):
...
NameError: name 'data' is not defined
"""
pass
So I guess that doctest copies the module once, and then copies the copy for each docstring?
In any case, importing the module into each docstring seems usable, if awkward.
I'd prefer to use a separate namespace for this, to avoid accidentally trampling on actual module data that will or will not be imported into subsequent tests in an possibly undocumented manner.
It's occurred to me that it's (theoretically) possible to dynamically create a module in order to contain this namespace. However as yet I've not gotten any direction on how to do that from the question I asked about that a while back. Any information is quite welcome! (as a response to the appropriate question)
In any case I'd prefer to have the changes be propagated directly into the namespace of subsequent docstrings. So my original question still stands, with that as a qualifier.
This is the sort of thing that causes people to turn away from doctests: as your tests grow in complexity, you need real programming tools to be able to engineer your tests just as you would engineer your product code.
I don't think there's a way to include shared data or functions in doctests other than defining them in your product code and then using them in the doctests.
You are going to need to use real code to define some of your test infrastructure. If you like doctests, you can use that infrastructure from your doctests.
This is possible, albeit perhaps not advertised as loudly.
To obtain literate modules with tests that all use a shared execution context (i.e. individual tests can share and re-use their results), one has to look at the relevant part of documentation which says:
... each time doctest finds a docstring to test, it uses a shallow copy of M‘s globals, so that running tests doesn’t change the module’s real globals, and so that one test in M can’t leave behind crumbs that accidentally allow another test to work.
...
You can force use of your own dict as the execution context by passing globs=your_dict to testmod() or testfile() instead.
Given this, I managed to reverse-engineer from doctest module that besides using copies (i.e. the dict's copy() method), it also clears the globals dict (using clear()) after each test.
Thus, one can patch their own globals dictionary with something like:
class Context(dict):
def clear(self):
pass
def copy(self):
return self
and then use it as:
import doctest
from importlib import import_module
module = import_module('some.module')
doctest.testmod(module,
# Make a copy of globals so tests in this
# module don't affect the tests in another
glob=Context(module.__dict__.copy()))