Pytest mocker patch - how to troubleshoot? - python

I am having what I believe is a common issue with mock patching: I cannot figure out the right thing to patch.
I have two questions that I am hoping for help with:
1. Thoughts on how to fix the specific issue in the example below.
2. Possibly more importantly, pro tips/pointers/suggestions on how best to troubleshoot the "which thing do I patch" question. Without a full understanding of how patching works, I don't really know what I should be looking for, and I find myself playing a guessing game.
An example using pyarrow that is currently causing me pain:
mymodule.py
import pyarrow
class HdfsSearch:
    def __init__(self):
        self.fs = self._connect()
    def _connect(self) -> object:
        return pyarrow.hdfs.connect(driver="libhdfs")
    def search(self, path: str):
        return self.fs.ls(path=path)
test_module.py
import pyarrow
import pytest
from mymodule import HdfsSearch
@pytest.fixture()
def hdfs_connection_fixture(mocker):
    mocker.patch("pyarrow.hdfs.connect")
    yield HdfsSearch()
def test_hdfs_connection(hdfs_connection_fixture):
    pyarrow.hdfs.connect.assert_called_once()  # <-- succeeds
def test_hdfs_search(hdfs_connection_fixture):
    hdfs_connection_fixture.search(".")
    pyarrow.hdfs.HadoopFileSystem.ls.assert_called_once()  # <-- fails
pytest output:
$ python -m pytest --verbose test_module.py
=========================================================================================================== test session starts ============================================================================================================
platform linux -- Python 3.7.4, pytest-5.0.1, py-1.8.0, pluggy-0.12.0 -- /home/bbaur/miniconda3/envs/dev/bin/python
cachedir: .pytest_cache
rootdir: /home/user1/work/app
plugins: cov-2.7.1, mock-1.10.4
collected 2 items
test_module.py::test_hdfs_connection PASSED [ 50%]
test_module.py::test_hdfs_search FAILED [100%]
================================================================================================================= FAILURES =================================================================================================================
_____________________________________________________________________________________________________________ test_hdfs_search _____________________________________________________________________________________________________________
hdfs_connection_fixture = <mymodule.HdfsSearch object at 0x7fdb4ec2a610>
def test_hdfs_search(hdfs_connection_fixture):
hdfs_connection_fixture.search(".")
> pyarrow.hdfs.HadoopFileSystem.ls.assert_called_once()
E AttributeError: 'function' object has no attribute 'assert_called_once'
test_module.py:16: AttributeError

You're not calling the assert on the Mock object. This is the correct assert:
hdfs_connection_fixture.fs.ls.assert_called_once()
Explanation:
When you access any attribute on a Mock object, it returns another Mock object. Calling a Mock also returns a Mock (its return_value).
Since you patched "pyarrow.hdfs.connect", you've replaced it with a Mock. Your _connect method calls that Mock and gets back its return_value, which is another Mock; let's call it Mock A. Mock A is what gets assigned to self.fs.
Now let's break down what happens in the search method when you call self.fs.ls.
self.fs returns your Mock A object, then .ls returns a different Mock object; let's call it Mock B. It is this Mock B that gets called with (path=path).
In your assert you're trying to access pyarrow.hdfs.HadoopFileSystem, but it was never patched. You need to do the assert on the Mock B object, which is at hdfs_connection_fixture.fs.ls.
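The chain of mocks described above can be reproduced without pyarrow. This is a minimal sketch using only unittest.mock; the variable names connect and fs are stand-ins chosen for illustration, not anything from pyarrow itself:

```python
from unittest.mock import MagicMock

# Stand-in for the object that mocker.patch("pyarrow.hdfs.connect") installs.
connect = MagicMock(name="connect")

# What HdfsSearch._connect() gets back and stores in self.fs:
# calling a mock returns its return_value, which is another mock.
fs = connect(driver="libhdfs")

# What HdfsSearch.search(".") does: attribute access yields a child mock,
# and the call is recorded on that child.
fs.ls(path=".")

# The assertion has to be made on the child mock that was actually called.
fs.ls.assert_called_once_with(path=".")

is_return_value = fs is connect.return_value
same_child = fs.ls is connect.return_value.ls
```

Because child mocks are created once and then cached, fs.ls always refers to the same object, which is why asserting on hdfs_connection_fixture.fs.ls works in the test.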
What to Patch
If you change your import in mymodule.py to from pyarrow.hdfs import connect, your patch will stop working.
Why is that?
When you patch something, you're changing what a name points to, not the actual object.
Your current patch is patching the name pyarrow.hdfs.connect, and in mymodule you're using the same name pyarrow.hdfs.connect, so everything is fine.
However, if you use from pyarrow.hdfs import connect, mymodule will have imported the real pyarrow.hdfs.connect and created a reference to it under the name mymodule.connect.
So when you call connect inside mymodule, you're accessing the name mymodule.connect, which is not patched.
That is why you would need to patch mymodule.connect when using from ... import.
I'd recommend using from x import y when doing this kind of patching. It makes it more explicit what you're trying to mock, and the patch will be limited to that module only, which can prevent unforeseen side effects.
Source: the "Where to patch" section of the Python unittest.mock documentation.
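To see the name-versus-object distinction in isolation, here is a small self-contained sketch. The modules demo_lib and demo_consumer are hypothetical and built in memory purely for illustration; they stand in for pyarrow.hdfs and mymodule:

```python
import sys
import types
from unittest.mock import patch

# Hypothetical library module exposing connect().
demo_lib = types.ModuleType("demo_lib")
exec("def connect():\n    return 'real'\n", demo_lib.__dict__)
sys.modules["demo_lib"] = demo_lib

# Hypothetical consumer that uses `from demo_lib import connect`,
# which creates its own name demo_consumer.connect.
demo_consumer = types.ModuleType("demo_consumer")
exec(
    "from demo_lib import connect\n"
    "def run():\n"
    "    return connect()\n",
    demo_consumer.__dict__,
)
sys.modules["demo_consumer"] = demo_consumer

# Patching the name in the library does not touch the consumer's own name.
with patch("demo_lib.connect", return_value="fake"):
    via_lib = demo_consumer.run()

# Patching the name where it is looked up takes effect.
with patch("demo_consumer.connect", return_value="fake"):
    via_consumer = demo_consumer.run()
```

In this sketch via_lib ends up as 'real' and via_consumer as 'fake', which mirrors why mymodule.connect would be the target after switching to from pyarrow.hdfs import connect.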

To understand how patching works in Python, let's first understand the import statement.
When we use import pyarrow in a module (mymodule.py in this case), it performs two operations:
1. It searches for the pyarrow module in sys.modules.
2. It binds the result of that search to a name (pyarrow) in the local scope, by doing something like: pyarrow = sys.modules['pyarrow']
NOTE: An import statement does not necessarily execute code; it brings a name into the local scope. The module's code is executed, as a side effect, only when Python can't find the module in sys.modules.
So, to patch pyarrow as imported in mymodule.py, we need to patch the pyarrow name present in the scope of mymodule.py:
patch('mymodule.pyarrow', autospec=True)
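A quick way to check the binding described above is to compare the name inside a module with the entry in sys.modules. The sketch below uses the standard json module in place of pyarrow and a hypothetical in-memory module demo_mymodule in place of mymodule.py, so it runs without any third-party package:

```python
import json
import sys
import types
from unittest.mock import patch

# Hypothetical module that does `import json`, like mymodule does `import pyarrow`.
demo_mymodule = types.ModuleType("demo_mymodule")
exec(
    "import json\n"
    "def dump(obj):\n"
    "    return json.dumps(obj)\n",
    demo_mymodule.__dict__,
)
sys.modules["demo_mymodule"] = demo_mymodule

# The name bound by `import json` is the very object stored in sys.modules.
same_module = demo_mymodule.json is sys.modules["json"]

# Patching an attribute through the module-local name therefore changes
# the attribute on the shared module object for the duration of the patch.
with patch("demo_mymodule.json.dumps", return_value="patched") as mock_dumps:
    result = demo_mymodule.dump({"a": 1})
    also_global = json.dumps is mock_dumps

restored = json.dumps({"a": 1})
```

One consequence worth noting: a target such as mymodule.pyarrow.hdfs.connect replaces the attribute on the shared pyarrow.hdfs module, whereas mymodule.pyarrow replaces only the name inside mymodule.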
test_module.py
import pytest
from mock import Mock, sentinel
from pyarrow import hdfs
from mymodule import HdfsSearch
class TestHdfsSearch(object):
    @pytest.fixture(autouse=True, scope='function')
    def setup(self, mocker):
        self.hdfs_mock = Mock(name='HadoopFileSystem', spec=hdfs.HadoopFileSystem)
        self.connect_mock = mocker.patch("mymodule.pyarrow.hdfs.connect", return_value=self.hdfs_mock)
    def test_initialize_HdfsSearch_should_connect_pyarrow_hdfs_file_system(self):
        HdfsSearch()
        self.connect_mock.assert_called_once_with(driver="libhdfs")
    def test_initialize_HdfsSearch_should_set_pyarrow_hdfs_as_file_system(self):
        hdfs_search = HdfsSearch()
        assert self.hdfs_mock == hdfs_search.fs
    def test_search_should_retrieve_directory_contents(self):
        hdfs_search = HdfsSearch()
        self.hdfs_mock.ls.return_value = sentinel.contents
        result = hdfs_search.search(".")
        self.hdfs_mock.ls.assert_called_once_with(path=".")
        assert sentinel.contents == result
Use context managers to patch built-ins:
import os
from unittest.mock import patch
def test_patch_built_ins():
    with patch('os.curdir') as curdir_mock:  # curdir_mock replaces os.curdir only inside the with block
        assert curdir_mock == os.curdir
    assert os.curdir == '.'  # the original value is restored outside the block

Related

Set a module's class method as an attribute of that module from outside that module

The objective:
I have a package with submodules that I would like to be accessible in the most straightforward way possible. The submodules contain classes to take advantage of the class structure, but don't need to be initialized (as they contain static and class methods). So, ideally, I would like to access them as follows:
from myPackage.subModule import someMethod
print (someMethod)
from myPackage import subModule
print (subModule.someMethod)
import myPackage
print(myPackage.subModule.someMethod)
Here is the package structure:
myPackage ─┐
    __init__.py
    subModule
    subModule2
    etc.
Example of a typical submodule:
# submodule.py
class SomeClass():
    someAttr = list(range(10))
    @classmethod
    def someMethod(cls):
        pass
    @staticmethod
    def someMethod2():
        pass
Here is the code I have in my __init__.py. To achieve the above, it attempts to set each class as an attribute at the package level, and each class's methods as attributes at the submodule level.
# __init__.py
import importlib
import inspect
import os
import sys
def import_submodules(package, filetypes=('py', 'pyc', 'pyd'), ignoreStartingWith='_'):
    '''Import submodules to the given package, expose any classes at the package level
    and their respective class methods at submodule level.
    :Parameters:
        package (str)(obj) = A python package.
        filetypes (str)(tuple) = Filetype extension(s) to include.
        ignoreStartingWith (str)(tuple) = Ignore submodules starting with given chars.
    '''
    if isinstance(package, str):
        package = sys.modules[package]
    if not package:
        return
    pkg_dir = os.path.dirname(os.path.abspath(package.__file__))
    sys.path.append(pkg_dir)  # append this dir to the system path.
    for mod_name in os.listdir(pkg_dir):
        if mod_name.startswith(ignoreStartingWith):
            continue
        elif os.path.isfile(os.path.join(pkg_dir, mod_name)):
            mod_name, *mod_ext = mod_name.rsplit('.', 1)
            if filetypes:
                if not mod_ext or mod_ext[0] not in filetypes:
                    continue
        mod = importlib.import_module(mod_name)
        vars(package)[mod_name] = mod
        classes = inspect.getmembers(mod, inspect.isclass)
        for cls_name, clss in classes:
            vars(package)[cls_name] = clss
            methods = inspect.getmembers(clss, inspect.isfunction)
            for method_name, method in methods:
                vars(mod)[method_name] = method
    del mod_name
import_submodules(__name__)
At issue is this line:
vars(mod)[method_name] = method
Which ultimately results in: (indicating that the attribute was not set)
from myPackage.subModule import someMethod
ImportError: cannot import name 'someMethod' from 'myPackage.subModule'
I am able to set the methods as attributes of the module from within that module, but setting them from outside (i.e., in the package __init__.py) isn't working as written. I understand this isn't ideal to begin with, but my current reasoning is that the ease of use outweighs any perceived issues with namespace pollution. I am, of course, always open to counter-arguments.
I just checked it on my machine.
I created a package myPackage with a module subModule that has a function someMethod.
I ran a Python shell with the working directory set to the directory that contains myPackage. To get these 3 import statements to work:
from myPackage.subModule import someMethod
from myPackage import subModule
import myPackage
All I had to do was to create an __init__.py with this line in it:
from . import subModule
Found a nice "hacky" solution -
subModule.py:
class myClass:
    @staticmethod
    def someMethod():
        print("I have a bad feeling about this")
myInstance = myClass()
someMethod = myInstance.someMethod
__init__.py is empty
I am still scratching my head as to why I am unable to do this from the package __init__.py, but this solution works, with the caveat that it has to be called at the end of each submodule. Perhaps someone can chime in later as to why this wasn't working when completely contained in the __init__.py.
import inspect
import sys
def addMembers(module, ignoreStartingWith='_'):
    '''Expose class members at module level.
    :Parameters:
        module (str)(obj) = A python module.
        ignoreStartingWith (str)(tuple) = Ignore class members starting with given chars.
    ex. call: addMembers(__name__)
    '''
    if isinstance(module, str):
        module = sys.modules[module]
    if not module:
        return
    classes = inspect.getmembers(module, inspect.isclass)
    for cls_name, clss in classes:
        cls_members = [(o, getattr(clss, o)) for o in dir(clss) if not o.startswith(ignoreStartingWith)]
        for name, mem in cls_members:
            vars(module)[name] = mem
This is the solution I ended up going with. It needs to be put at the end of each submodule of your package. But, it is simple and in addition to all the standard ways of importing, allows you to import a method directly:
def __getattr__(attr):
    '''Attempt to get a class attribute.
    :Parameters:
        attr (str): A name of a class attribute.
    :Return:
        (obj) The attribute.
    '''
    try:
        return getattr(Someclass, attr)
    except AttributeError as error:
        raise AttributeError(f'{__file__} in __getattr__\n\t{error} ({type(attr).__name__})')
from somePackage.someModule import someMethod
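The module-level __getattr__ hook used above (PEP 562, available from Python 3.7) can be tried out in a single file. This is only a sketch: the module name demo_submodule, the class SomeClass and the method some_method are made up for illustration, and the module is built in memory instead of living in a package on disk:

```python
import sys
import types

source = '''
class SomeClass:
    @staticmethod
    def some_method():
        return "called"

def __getattr__(name):
    # Reached only when normal module attribute lookup fails.
    try:
        return getattr(SomeClass, name)
    except AttributeError:
        raise AttributeError(
            f"module {__name__!r} has no attribute {name!r}"
        ) from None
'''

# Register a hypothetical module so the import statement below can find it.
demo_submodule = types.ModuleType("demo_submodule")
exec(source, demo_submodule.__dict__)
sys.modules["demo_submodule"] = demo_submodule

# `from module import name` falls back to the module's __getattr__.
from demo_submodule import some_method

result = some_method()
has_missing = hasattr(demo_submodule, "does_not_exist")
```

Unlike copying every class member into the module namespace, this resolves names lazily and leaves vars(module) untouched, so tools that list module contents will not see the forwarded methods.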

Mock a function present inside a list in pytest

I want to mock a function present inside a list and check whether it has been called at least once. Below is a similar implementation I tried:-
In fun_list.py (funA and funB are two functions in other_module)
import other_module
FUN_LIST = [
    other_module.funA,
    other_module.funB,
]
def run_funs():
    for fun in FUN_LIST:
        fun()
In demo.py
from fun_list import run_funs
def run_demo():
    ...
    run_funs()
    ...
In test_demo.py
from unittest.mock import patch
from demo import run_demo
@patch('other_module.funB')
def test_demo_funs(mocked_funB):
    mocked_funB.return_value = {}
    run_demo()
    assert mocked_funB.called
In the above case, I'm trying to mock funB in other_module, but the function doesn't get mocked and execution goes into the actual funB in other_module. Thus, the assertion on mocked_funB.called fails.
Any lead on how I can mock other_module.funB?
I have found a similar question on StackOverflow but that went unanswered, so decided to post my version of it.
Any help will be appreciated, thank you in advance.
You need to mock before importing the module under test. The code at module scope is executed when the module is imported, so FUN_LIST already holds a reference to the real funB. By the time the test case runs, it is too late to mock through the decorator.
E.g.
other_module.py:
def funA():
    pass
def funB():
    pass
fun_list.py:
import other_module
print('execute module scope code')
FUN_LIST = [
    other_module.funA,
    other_module.funB,
]
def run_funs():
    for fun in FUN_LIST:
        fun()
demo.py:
from fun_list import run_funs
def run_demo():
    run_funs()
test_demo.py:
import unittest
from unittest.mock import patch
class TestDemo(unittest.TestCase):
    @patch('other_module.funB')
    def test_demo_funs(self, mocked_funB):
        print('mock before import the module')
        from demo import run_demo
        mocked_funB.return_value = {}
        run_demo()
        assert mocked_funB.called
if __name__ == '__main__':
    unittest.main()
test result:
mock before import the module
execute module scope code
.
----------------------------------------------------------------------
Ran 1 test in 0.002s
OK
Name                                          Stmts   Miss  Cover   Missing
--------------------------------------------------------------------------
src/stackoverflow/67563601/demo.py                3      0   100%
src/stackoverflow/67563601/fun_list.py            6      0   100%
src/stackoverflow/67563601/other_module.py        4      1    75%   6
src/stackoverflow/67563601/test_demo.py          12      0   100%
--------------------------------------------------------------------------
TOTAL                                            25      1    96%
I took the lead from @slideshowp2's answer and modified things a bit differently. In my case I had multiple test functions that mock funB and call run_demo (originally a client.post() call from Django.test). Once an earlier function had called it successfully, the subsequent patches failed (for the same reason stated by @slideshowp2). So, I changed the approach to this:
In fun_list.py (funA and funB are two functions in other_module)
import other_module
FUN_LIST = [
    'funA',
    'funB',
]
def run_funs():
    for fun in FUN_LIST:
        getattr(other_module, fun)()
In demo.py
from fun_list import run_funs
def run_demo():
    ...
    run_funs()
    ...
In test_demo.py
from unittest.mock import patch
from demo import run_demo
@patch('other_module.funB')
def test_demo_funs(mocked_funB):
    mocked_funB.return_value = {}
    run_demo()
    assert mocked_funB.called
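The difference between storing function objects and storing names in FUN_LIST can be shown in one file. This is a sketch under the assumption that a hypothetical in-memory module, demo_other_module, stands in for other_module:

```python
import sys
import types
from unittest.mock import patch

# Hypothetical stand-in for other_module.
demo_other_module = types.ModuleType("demo_other_module")
exec("def funB():\n    return 'real'\n", demo_other_module.__dict__)
sys.modules["demo_other_module"] = demo_other_module

# A list built at import time captures the function object itself.
FUN_LIST = [demo_other_module.funB]

with patch("demo_other_module.funB", return_value="fake") as mocked_funB:
    # The list still points at the original function, so the mock is bypassed.
    from_list = FUN_LIST[0]()
    called_via_list = mocked_funB.called

    # Looking the name up at call time finds the patched attribute.
    by_name = getattr(demo_other_module, "funB")()
    called_via_name = mocked_funB.called
```

Here from_list is 'real' and the mock is untouched until the getattr lookup, which is the behaviour the string-based FUN_LIST relies on.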

How to patch a module that hasn't been imported by parent package's __init__.py

I'm trying to test a tool I'm building which uses some jMetalPy functionality. I had/have a previous version working but I am now trying to refactor out some external dependencies (such as the aforementioned jMetalPy).
Project Code & Structure
Here is a minimalist structure of my project.
MyToolDirectory
¦--/MyTool
¦----/__init__.py
¦----/_jmetal
¦------/__init__.py
¦------/core
¦--------/quality_indicator.py
¦----/core
¦------/__init__.py
¦------/run_manager.py
¦----/tests
¦------/__init__.py
¦------/test_run_manager.py
The _jmetal directory is to remove external dependency on the jMetalPy package - and I have copied only the necessary packages/modules that I need.
Minimal contents of run_manager.py
# MyTool\core\run_manager.py
import jmetal
# from jmetal.core.quality_indicators import HyperVolume # old working version
class RunManager:
    def __init__(self):
        pass
    @staticmethod
    def calculate_hypervolume(front, ref_point):
        if front is None or len(front) < 1:
            return 0.
        hv = jmetal.core.quality_indicator.HyperVolume(ref_point)
        # hv = HyperVolume(ref_point)
        hypervolume = hv.compute(front)
        return hypervolume
Minimal contents of test_run_manager.py
# MyTool\tests\test_run_manager.py
import unittest
from unittest.mock import MagicMock, Mock, patch
from MyTool import core
class RunManagerTest(unittest.TestCase):
    def setUp(self):
        self.rm = core.RunManager()
    def test_calculate_hypervolume(self):
        ref_points = [0.0, 57.5]
        front = [None, None]
        # with patch('MyTool.core.run_manager.HyperVolume') as mock_HV: # old working version
        with patch('MyTool.core.run_manager.jmetal.core.quality_indicator.HyperVolume') as mock_HV:
            mock_HV.return_value = MagicMock()
            res = self.rm.calculate_hypervolume(front, ref_points)
            mock_HV.assert_called_with(ref_points)
            mock_HV().compute.assert_called_with(front)
Main Question
When I run a test with the code as-is, I get this error message:
E ModuleNotFoundError: No module named 'MyTool.core.run_manager.jmetal'; 'MyTool.core.run_manager' is not a package
But when I change it to:
with patch('MyTool.core.run_manager.jmetal.core') as mock_core:
    mock_HV = mock_core.quality_indicator.HyperVolume
    mock_HV.return_value = MagicMock()
    res = self.rm.calculate_hypervolume(front, ref_points)
    mock_HV.assert_called_with(ref_points)
    mock_HV().compute.assert_called_with(front)
... now the test passes. What gives?!
Why can't I (or rather, how can I) surgically patch the exact class I want (i.e., HyperVolume) without patching out an entire sub-package as well? Is there a way around this? There may be code in jmetal.core that needs to run normally.
Is the reason this isn't working only that there is no from . import quality_indicator statement in jMetalPy's jmetal\core\__init__.py?
Because even with patch('MyTool.core.run_manager.jmetal.core.quality_indicator') throws:
E AttributeError: <module 'jmetal.core' from 'path\\to\\venv\\lib\\site-packages\\jmetal\\core\\__init__.py'> does not have the attribute 'quality_indicator'
Or is there something I'm doing wrong?
In the case that it is just about adding those import statements, I could do that in my _jmetal sub-package, but I was hoping to let the user default to their own jMetalPy installation if they already had one by adding this to MyTool\__init__.py:
try:
    import jmetal
except ModuleNotFoundError:
    from . import _jmetal as jmetal
and then replacing all instances of import jmetal with from MyTool import jmetal. However, I'd run into the same problem all over again.
I feel that there is some core concept I am not grasping. Thanks for the help.
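One general Python behaviour that may be relevant here: a submodule only becomes an attribute of its parent package once something has imported it. The sketch below demonstrates that with a throwaway package written to a temporary directory. The names demo_pkg and indicator are invented for the demonstration, and whether jMetalPy's jmetal.core behaves the same way in your environment is an assumption, not something verified here:

```python
import importlib
import sys
import tempfile
from pathlib import Path
from unittest.mock import patch

# Build a throwaway package whose __init__.py does NOT import its submodule.
root = Path(tempfile.mkdtemp())
pkg_dir = root / "demo_pkg"
pkg_dir.mkdir()
(pkg_dir / "__init__.py").write_text("")
(pkg_dir / "indicator.py").write_text(
    "class HyperVolume:\n"
    "    def __init__(self, ref_point):\n"
    "        self.ref_point = ref_point\n"
)
sys.path.insert(0, str(root))
importlib.invalidate_caches()

import demo_pkg

# The submodule is not an attribute of the package until it is imported.
attr_before = hasattr(demo_pkg, "indicator")
importlib.import_module("demo_pkg.indicator")
attr_after = hasattr(demo_pkg, "indicator")

# A patch target written as the class's real dotted module path can be
# imported by patch itself, so only the class is replaced.
with patch("demo_pkg.indicator.HyperVolume") as mock_hv:
    demo_pkg.indicator.HyperVolume([0.0, 57.5])
    mock_hv.assert_called_with([0.0, 57.5])

real_restored = demo_pkg.indicator.HyperVolume.__name__ == "HyperVolume"
```

If the same holds for jMetalPy, a target spelled with the real module path (something like jmetal.core.quality_indicator.HyperVolume, again an assumption) may be worth trying, since run_manager looks the class up through the jmetal module at call time.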

Mock nested import in Python with MagicMock

My file (ensure_path.py):
import os
def ensure_path(path):
    if not os.path.exists(path):
        os.makedirs(path)
    return path
My test:
import unittest
from unittest.mock import patch, MagicMock
from src.util.fs.ensure_path import ensure_path
FAKE_PATH = '/foo/bar'
class EnsurePathSpec(unittest.TestCase):
    @patch('os.path.exists', side_effect=MagicMock(return_value=False))
    @patch('os.makedirs', side_effect=MagicMock(return_value=True))
    def test_path_exists_false(self, _mock_os_path_exists_false, _mock_os_makedirs):
        ensure_path(FAKE_PATH)
        _mock_os_path_exists_false.assert_called_with(FAKE_PATH)
        _mock_os_makedirs.assert_called_with(FAKE_PATH)
    @patch('os.path.exists', side_effect=MagicMock(return_value=True))
    @patch('os.makedirs', side_effect=MagicMock(return_value=True))
    def test_path_exists_true(self, _mock_os_path_exists_true, _mock_os_makedirs):
        ensure_path(FAKE_PATH)
        _mock_os_path_exists_true.assert_called_with(FAKE_PATH)
        _mock_os_makedirs.assert_not_called()
This is giving the failed assertion Expected call: makedirs('/foo/bar') which I think makes sense because I think I'm mocking os.makedirs at the wrong level.
I've tried replacing @patch('os.makedirs', with @patch('src.util.fs.ensure_path.os.makedirs', and a couple of variations of that, but I get
ImportError: No module named 'src.util.fs.ensure_path.os'; 'src.util.fs.ensure_path' is not a package
Here is my __init__.py flow :
Is there an obvious fix I'm missing?
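Your patch arguments need to be in the reverse order of the @patch decorators. Decorators are applied bottom-up, so the mock created by the decorator closest to the function is passed as the first argument.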
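A short sketch of the argument ordering, reusing ensure_path from the question in a single file. The helper name check_missing_path is made up for this example, and plain return_value is used in place of the question's side_effect=MagicMock(...) for brevity:

```python
import os
from unittest.mock import patch


def ensure_path(path):
    if not os.path.exists(path):
        os.makedirs(path)
    return path


@patch("os.path.exists", return_value=False)  # outermost decorator: injected last
@patch("os.makedirs")                         # closest to the def: injected first
def check_missing_path(mock_makedirs, mock_exists):
    ensure_path("/foo/bar")
    mock_exists.assert_called_with("/foo/bar")
    mock_makedirs.assert_called_with("/foo/bar")
    # Confirm each parameter really is the mock installed under that name.
    return mock_makedirs is os.makedirs and mock_exists is os.path.exists


order_ok = check_missing_path()
```

Reading the decorator stack from the bottom up gives the parameter order from left to right.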

How do I change the name of an imported library?

I am using jython with a third party application. The third party application has some builtin libraries foo. To do some (unit) testing we want to run some code outside of the application. Since foo is bound to the application we decided to write our own mock implementation.
However there is one issue, we implemented our mock class in python while their class is in java. Thus to use their code one would do import foo and foo is the mock class afterwards. However if we import the python module like this we get the module attached to the name, thus one has to write foo.foo to get to the class.
For convenience, we would love to be able to write from ourlib.thirdparty import foo to bind foo to the foo class. However, we would like to avoid importing all the classes in ourlib.thirdparty directly, since loading each file takes quite a while.
Is there any way to do this in Python? (I did not get far with import hooks. I tried simply returning the class from load_module, or overwriting what I write to sys.modules; I think both approaches are ugly, particularly the latter.)
edit:
OK, here is what the files in ourlib.thirdparty look like, simplified (without magic):
foo.py:
try:
    import foo
except ImportError:
    class foo:
        ....
Actually they look like this:
foo.py:
class foo:
    ....
__init__.py in ourlib.thirdparty
import sys
import os.path
import imp
#TODO: 3.0 importlib.util abstract base classes could greatly simplify this code or make it prettier.
class Importer(object):
    def __init__(self, path_entry):
        if not path_entry.startswith(os.path.join(os.path.dirname(__file__), 'thirdparty')):
            raise ImportError('Custom importer only for thirdparty objects')
        self._importTuples = {}
    def find_module(self, fullname):
        module = fullname.rpartition('.')[2]
        try:
            if fullname not in self._importTuples:
                fileObj, self._importTuples[fullname] = imp.find_module(module)
                if isinstance(fileObj, file):
                    fileObj.close()
        except:
            print 'backup'
            path = os.path.join(os.path.join(os.path.dirname(__file__), 'thirdparty'), module+'.py')
            if not os.path.isfile(path):
                return None
                raise ImportError("Could not find dummy class for %s (%s)\n(searched:%s)" % (module, fullname, path))
            self._importTuples[fullname] = path, ('.py', 'r', imp.PY_SOURCE)
        return self
    def load_module(self, fullname):
        fp = None
        python = False
        print fullname
        if self._importTuples[fullname][1][2] in (imp.PY_SOURCE, imp.PY_COMPILED, imp.PY_FROZEN):
            fp = open( self._importTuples[fullname][0], self._importTuples[fullname][1][1])
            python = True
        try:
            imp.load_module(fullname, fp, *self._importTuples[fullname])
        finally:
            if python:
                module = fullname.rpartition('.')[2]
                #setattr(sys.modules[fullname], module, getattr(sys.modules[fullname], module))
                #sys.modules[fullname] = getattr(sys.modules[fullname], module)
                if isinstance(fp, file):
                    fp.close()
                return getattr(sys.modules[fullname], module)
sys.path_hooks.append(Importer)
As others have remarked, this is such a plain thing in Python that the import statement itself has syntax for it:
from foo import foo as original_foo, for example -
or even import foo as module_foo
It is interesting to note that the import statement binds a name to the imported module or object in the local context. However, the dictionary sys.modules (in the sys module, of course) is a live reference to all imported modules, using their names as keys. This mechanism plays a key role in preventing Python from re-reading and re-executing an already imported module when running (that is, if several of your modules or sub-modules import the same foo module, it is read just once -- the subsequent imports use the reference stored in sys.modules).
And -- besides the "import...as" syntax, modules in Python are just another object: you can assign any other name to them at run time.
So, the following code would also work perfectly for you:
import foo
original_foo = foo
class foo(Mock):
    ...
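The same rebinding can be tried with a standard-library module. This sketch uses math in place of foo, and the stub class body is invented for illustration:

```python
import sys
import math

original_math = math          # a second name for the same module object


class math:                   # rebinds the name `math` in this namespace only
    pi = 3


stub_pi = math.pi
real_pi = original_math.pi
registry_unchanged = sys.modules["math"] is original_math
```

Rebinding the local name leaves sys.modules untouched, so other modules that import math still get the real module.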
