Annotate a function argument as being a specific module - python
I have a pytest fixture that imports a specific module. This is needed as importing the module is very expensive, so we don't want to do it on import-time (i.e. during pytest test collection). This results in code like this:
@pytest.fixture
def my_module_fix():
    import my_module
    yield my_module

def test_something(my_module_fix):
    assert my_module_fix.my_func() == 5
I am using PyCharm and would like to have type-checking and autocompletion in my tests. To achieve that, I would somehow have to annotate the my_module_fix parameter as having the type of the my_module module.
I have no idea how to achieve that. All I found is that I can annotate my_module_fix as being of type types.ModuleType, but that is not enough: it is not just any module, it is always my_module.
If I understand your question correctly, you have two (or three) separate goals:
Deferred import of slowmodule
Autocomplete to continue to work as if it was a standard import
(Potentially?) typing (e.g. mypy?) to continue to work
I can think of at least five different approaches, though I'll only briefly mention the last because it's insane.
Import the module inside your tests
This is (by far) the most common and IMHO preferred solution.
e.g. instead of
import slowmodule

def test_foo():
    slowmodule.foo()

def test_bar():
    slowmodule.bar()
you'd write:
def test_foo():
    import slowmodule
    slowmodule.foo()

def test_bar():
    import slowmodule
    slowmodule.bar()
[deferred importing] Here, the module will be imported on-demand/lazily. So if you have pytest set up to fail fast, and another test fails before pytest gets to your (test_foo, test_bar) tests, the module will never be imported and you'll never incur the runtime cost.
Because of Python's module cache, subsequent import statements won't actually re-import the module, just grab a reference to the already-imported module.
[autocomplete/typing] Of course, autocomplete will continue to work as you expect in this case. This is a perfectly fine import pattern.
While it does require adding potentially many additional import statements (one inside each test function), it's immediately clear what is going on (regardless of whether it's clear why it's going on).
[3.7+] Proxy your module with module __getattr__
If you create a module (e.g. slowmodule_proxy.py) with contents like:
def __getattr__(name):
    import slowmodule
    return getattr(slowmodule, name)
And in your tests, e.g.
import slowmodule
def test_foo():
    slowmodule.foo()

def test_bar():
    slowmodule.bar()
instead of:
import slowmodule
you write:
import slowmodule_proxy as slowmodule
[deferred import] Thanks to PEP-562, you can "request" any name from slowmodule_proxy and it will fetch and return the corresponding name from slowmodule. Just as above, including the import inside the function will cause slowmodule to be imported only when the function is called and executed instead of on module load. Module caching still applies here of course, so you're only incurring the import penalty once per interpreter session.
[autocomplete] However, while deferred importing will work (and your tests run without issue), this approach (as stated so far) will "break" autocomplete:
Now we're in the realm of PyCharm. Some IDEs will perform "live" analysis of modules and actually load up the module and inspect its members. (PyDev had this option). If PyCharm did this, implementing module.__dir__ (same PEP) or __all__ would allow your proxy module to masquerade as the actual slowmodule and autocomplete would work.† But, PyCharm does not do this.
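For reference, implementing that in slowmodule_proxy.py could look like the sketch below. (It won't change PyCharm's behavior, per the above, but IDEs that do live inspection would pick it up.)

# slowmodule_proxy.py -- a sketch adding module-level __dir__ (also PEP 562, 3.7+)
def __getattr__(name):
    import slowmodule
    return getattr(slowmodule, name)

def __dir__():
    import slowmodule
    return dir(slowmodule)  # advertise the real module's names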
Nonetheless, you can fool PyCharm into giving you autocomplete suggestions:
if False:
    import slowmodule
else:
    import slowmodule_proxy as slowmodule
The interpreter will only execute the else branch, importing the proxy and naming it slowmodule (so your test code can continue to reference slowmodule unchanged).
But PyCharm will now provide autocompletion for the underlying module.
† While live analysis can be incredibly helpful, it also carries a (potential) security concern that static syntax analysis doesn't have. And the maturation of type hinting and stub files has made it even less of an issue.
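As an aside, a stub file next to the proxy is one way stub files could help here; this is only a sketch, and the names foo and bar are assumptions about slowmodule's public API:

# slowmodule_proxy.pyi -- sketch of a stub advertising the proxy's names to static analysis
def foo(*args, **kwargs): ...
def bar(*args, **kwargs): ...

PyCharm (and mypy) will read a .pyi stub placed alongside the .py module it describes.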
Proxy slowmodule explicitly
If you really hated the dynamic proxy approach (or the fact that you have to fool PyCharm in this way), you could proxy the module explicitly.
(You'd likely only want to consider this if the slowmodule API is stable.)
If slowmodule has methods foo and bar you'd create a proxy module like:
def foo(*args, **kwargs):
    import slowmodule
    return slowmodule.foo(*args, **kwargs)

def bar(*args, **kwargs):
    import slowmodule
    return slowmodule.bar(*args, **kwargs)
(Using args and kwargs to pass arguments through to the underlying callables. And you could add type hinting to these functions to mirror the slowmodule functions.)
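For example, a typed variant might look like the sketch below; the parameter names and signatures here are purely illustrative assumptions, since slowmodule's real API isn't shown:

# slowmodule_proxy.py -- typed variant; these signatures are assumed for illustration
def foo(x: int) -> int:
    import slowmodule
    return slowmodule.foo(x)

def bar(name: str) -> str:
    import slowmodule
    return slowmodule.bar(name)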
And in your test,
import slowmodule_proxy as slowmodule
Same as before. Importing inside each function gives you the deferred importing you want, and the module cache takes care of multiple import calls.
And since it's a real module whose contents can be statically analyzed, there's no need to "fool" PyCharm.
So the benefit of this solution is that you don't have a bizarre looking if False in your test imports. This, however, comes at the (substantial) cost of having to maintain a proxy file alongside your module -- which could prove painful in the case that slowmodule's API wasn't stable.
[3.5+] Use importlib's LazyLoader instead of a proxy module
Instead of the proxy module slowmodule_proxy, you could follow a pattern similar to the one shown in the importlib docs:
>>> import importlib.util
>>> import sys
>>> def lazy_import(name):
...     spec = importlib.util.find_spec(name)
...     loader = importlib.util.LazyLoader(spec.loader)
...     spec.loader = loader
...     module = importlib.util.module_from_spec(spec)
...     sys.modules[name] = module
...     loader.exec_module(module)
...     return module
...
>>> lazy_typing = lazy_import("typing")
>>> # lazy_typing is a real module object,
>>> # but it is not loaded in memory yet.
You'd still need to fool PyCharm though, so something like:
if False:
    import slowmodule
else:
    slowmodule = lazy_import('slowmodule')
would be necessary.
Aside from the extra level of indirection on module attribute access (and the slightly wider version availability, 3.5+ vs 3.7+), it's not immediately clear to me what, if anything, this approach gains over the previous proxy-module method, however.
Use importlib's Finder/Loader machinery to hook import (don't do this)
You could create a custom module Finder/Loader that would (only) hook your slowmodule import and instead load, for example, your proxy module.
Then you could just import that "importhook" module before you import slowmodule in your tests, e.g.
import myimporthooks
import slowmodule

def test_foo():
    ...
(Here, myimporthooks would use importlib's finder and loader machinery to do something similar to the importhook package, but intercept and redirect the import attempt rather than just serving as an import callback.)
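For illustration only, a minimal myimporthooks sketch (reusing the slowmodule_proxy idea from above) might look like this; don't take it as an endorsement:

# myimporthooks.py -- sketch: serve slowmodule_proxy under the name "slowmodule"
import importlib.abc
import importlib.util
import sys


class _SlowmoduleRedirector(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    def find_spec(self, name, path=None, target=None):
        if name == "slowmodule":
            # claim this import ourselves so we can hand back the proxy
            return importlib.util.spec_from_loader(name, self)
        return None  # let the normal machinery handle everything else

    def create_module(self, spec):
        import slowmodule_proxy
        return slowmodule_proxy  # registered as sys.modules['slowmodule']

    def exec_module(self, module):
        pass  # the proxy was already executed when it was imported above


# importing this module installs the hook
sys.meta_path.insert(0, _SlowmoduleRedirector())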
But this is crazy. Not only is what you want (seemingly) achievable through (infinitely) more common and supported methods, but it's incredibly fragile, error-prone and, without diving into the internals of PyTest (which may mess with module loaders itself), it's hard to say whether it'd even work.
When Pytest collects files to be tested, modules are only imported once, even if the same import statement appears in multiple files.
To observe when my_module is imported, add a print statement and then use the Pytest -s flag (short for --capture=no), to ensure that all standard output is displayed.
my_module.py
answer: int = 42
print("MODULE IMPORTED: my_module.py")
You could then add your test fixture to a conftest.py file:
conftest.py
import pytest

@pytest.fixture
def my_module_fix():
    import my_module
    yield my_module
Then in your test files, my_module.py may be imported to add type hints:
test_file_01.py
import my_module

def test_something(my_module_fix: my_module):
    assert my_module_fix.answer == 42
test_file_02.py
import my_module

def test_something2(my_module_fix: my_module):
    assert my_module_fix.answer == 42
Then run Pytest to display all standard output and verify that the module is only imported once at runtime.
pytest -s ./
Output from Pytest
platform linux -- Python 3.9.7, pytest-6.2.5
rootdir: /home/your_username/your_repo
collecting ... MODULE IMPORTED: my_module.py <--- Print statement executed once
collected 2 items
test_file_01.py .
test_file_02.py .
This is quick and naive, but could you possibly annotate your tests with the "quote-style" (string) annotation? It is meant for other purposes (forward references), but it may suit here: the import is skipped at runtime, yet it still helps your editor.
def test_something(my_module_fix: "my_module"):
In a quick test, this seems to accomplish it at least for my setup.
Although it might not be considered a 'best practice', to keep your specific use case simple, you could just lazily import the module directly in your test where you need it.
def test_something():
    import my_module
    assert my_module.my_func() == 5
I believe Pytest will only import the module when the applicable tests run. Python's module cache also means that if multiple tests import the module, it is only actually imported once. This may also solve your autocomplete issues for your editor.
Side note: avoid writing code in a specific way just to cater to a specific editor. Keep it simple; not everyone who looks at your code will use PyCharm.
would like to have type-checking and autocompletion in my tests
It sounds like you want Something that fits as the type symbol of your test function:
def test_something(my_module_fix: Something):
    assert my_module_fix.my_func() == 5
... and from this, [hopefully] your IDE can make some inferences about my_module_fix. I'm not a PyCharm user, so I can't speak to what it can tell you from type signatures, but I can say this isn't something that's readily available.
For some intuition: in this example, Something is a ModuleType in the same way that 3 is an int. Analogously, accessing a nonexistent attribute of Something is like doing something not allowed with 3, such as accessing an attribute it doesn't have (3.__name__).
But really, it seems like you're thinking about this from the wrong direction. The question a type signature answers is: what contract must this [these] argument[s] satisfy for this function to use it [them]? Using the example above, a type of 3 is the kind of thing that is too specific to make useful functions:
def add_to(i: 3):
    return i
Perhaps a better name for your type is:
def test_something(my_module_fix: SomethingThatHasMyFuncMethod):
    assert my_module_fix.my_func() == 5
So the type you probably want is something like this:
class SomethingThatHasMyFuncMethod:
    def my_func(self) -> int: ...
You'll need to define it (maybe in a .pyi file). See here for info.
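For instance, one way to spell that contract is a typing.Protocol (Python 3.8+); this is only a sketch, assuming my_func takes no arguments and returns an int as in the question's example:

from typing import Protocol

class SomethingThatHasMyFuncMethod(Protocol):
    # the structural contract: anything with a my_func() -> int satisfies it
    def my_func(self) -> int: ...

def test_something(my_module_fix: SomethingThatHasMyFuncMethod):
    assert my_module_fix.my_func() == 5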
Finally, here's some unsolicited advice regarding:
importing the module is very expensive, so we don't want to do it on import-time
You should probably employ some method of making this module do its thing lazily. Some of the utils in the django framework could serve as a reference point. There's also the descriptor protocol, which is a bit harder to grok but may suit your needs.
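For illustration, here is a minimal sketch of lazy initialization via the descriptor protocol; the names (LazyAttr, Client, connection) and the json stand-in are assumptions, and functools.cached_property gives you much the same behavior out of the box:

class LazyAttr:
    def __init__(self, factory):
        self.factory = factory
        self.name = None

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        value = self.factory()           # the expensive work happens here, once
        obj.__dict__[self.name] = value  # instance dict now shadows the descriptor
        return value


class Client:
    # the heavy import/setup only runs on first attribute access
    connection = LazyAttr(lambda: __import__("json"))  # stand-in for expensive setup


c = Client()
# nothing expensive has happened yet; the first access triggers the factory
assert c.connection is __import__("json")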
You want to add type hinting for the arguments of test_something, in particular to my_module_fix. This question is hard because we can't import my_module at the top.
Well, what is the type of my_module? I'm going to assume that if you do import my_module and then type(my_module) you will get <class 'module'>.
If you're okay with not telling users exactly which module it has to be, then you could try this.
import pytest
from types import ModuleType

@pytest.fixture
def my_module_fix():
    import my_module
    yield my_module

def test_something(my_module_fix: ModuleType):
    assert my_module_fix.my_func() == 5
The downside is that a user might infer that any old module would be suitable for my_module_fix.
You want it to be more specific? There's a cost to that, but we can get around some of it (it's hard to speak on performance in VS Code vs PyCharm vs others).
In that case, let's use TYPE_CHECKING (added to typing in Python 3.5.3). Note that for the bare my_module annotation below to work at runtime, you also need from __future__ import annotations (Python 3.7+), so the annotation isn't evaluated when the function is defined.
from __future__ import annotations  # defer annotation evaluation; my_module only exists for the type checker

from types import ModuleType
from typing import TYPE_CHECKING

import pytest

if TYPE_CHECKING:
    import my_module

@pytest.fixture
def my_module_fix():
    import my_module
    yield my_module

def test_something(my_module_fix: my_module):
    assert my_module_fix.my_func() == 5
You won't pay these costs during normal run time. You will pay the cost of importing my_module when your type checker is doing what it does.
If you're on an earlier version of Python (or prefer not to use the __future__ import), you can quote the annotation instead.
from types import ModuleType
from typing import TYPE_CHECKING

import pytest

if TYPE_CHECKING:
    import my_module

@pytest.fixture
def my_module_fix():
    import my_module
    yield my_module

def test_something(my_module_fix: "my_module"):
    assert my_module_fix.my_func() == 5
The type checker in VS Code is smart enough to know what's going on here. I can't speak for other type checkers.
Related
Testing constants declarations using pytest
We have a Python 3.7 application that has a declared constants.py file of this form:

APP_CONSTANT_1 = os.environ.get('app-constant-1-value')

In a test.py we were hoping to test the setting of these constants using something like this (this is highly simplified but represents the core issue):

class TestConfig:
    """General config tests"""

    @pytest.fixture
    def mock_os_environ(self, monkeypatch):
        """ """
        def mock_get(*args, **kwargs):
            return 'test_config_value'
        monkeypatch.setattr(os.environ, "get", mock_get)

    def test_mock_env_vars(self, mock_os_environ):
        import constants
        assert os.environ.get('app-constant-1-value') == 'test_config_value'  # passes
        assert constants.APP_CONSTANT_1 == 'test_config_value'  # fails

The second assertion fails as constants.APP_CONSTANT_1 is None. It turns out that constants.py seems to be loaded during pytest's 'collecting' phase and thus is already set by the time the test is run. What are we missing here? I feel like there is a simple way to resolve this in pytest but haven't yet discovered the secret. Is there some way to avoid loading the constants file prior to the tests being run? Any ideas are appreciated.
The problem is most likely that constants has been loaded before. To make sure it gets the patched value, you have to reload it:

import os
from importlib import reload

import pytest

import constants


class TestConfig:
    """General config tests"""

    @pytest.fixture
    def mock_os_environ(self, monkeypatch):
        """ """
        monkeypatch.setenv('app-constant-1-value', 'test_config_value')
        reload(constants)

    def test_mock_env_vars(self, mock_os_environ):
        assert os.environ.get('app-constant-1-value') == 'test_config_value'
        assert constants.APP_CONSTANT_1 == 'test_config_value'

Note that I used monkeypatch.setenv to specifically set the variable you need. If you don't need to change all environment variables, this is easier to use.
Erm, I would avoid using constants like this. You can subclass (or wrap) os.environ for a start, and then use a mocked subclass in your unit tests, so you can have my_env.unique_env as a member variable. You can then use e.g. import json to read a JSON configuration file without getting involved with hard-coded Python. The subclass can then hold the relevant variables (or methods, if you prefer). Being able to put a facade in front of os.environ provides you with the abstraction you are looking for, without any of the problems. Even if one is using a legacy/larger project, the advantage of using an adapter for access to the environment should be apparent. Since you are writing unit tests, there is an opportunity to use an adapter class in both the tests and the functions being tested.
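For what it's worth, a minimal sketch of that adapter idea (the names EnvConfig and FakeEnvConfig are illustrative assumptions, not from the original answer):

import os


class EnvConfig:
    """Thin facade over os.environ so application code never touches it directly."""

    def get(self, key, default=None):
        return os.environ.get(key, default)


class FakeEnvConfig(EnvConfig):
    """Drop-in test double that reads from a plain dict instead of the real environment."""

    def __init__(self, values):
        self._values = values

    def get(self, key, default=None):
        return self._values.get(key, default)


def app_constant_1(env: EnvConfig) -> str:
    # production code asks the adapter, not os.environ
    return env.get('app-constant-1-value')


# in a test, no monkeypatching needed:
assert app_constant_1(FakeEnvConfig({'app-constant-1-value': 'test_config_value'})) == 'test_config_value'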
How to mock out a function that is not exposed from within __init__ file?
The code structure is as follows.

service_a/__init__.py:

from .func_a import func_a

service_a/func_a.py:

def func_a():
    ...

service_b/__init__.py:

from .func_b import func_b

service_b/func_b.py:

from service_a import func_a

def func_b():
    func_a()
    ...

Now when I unit test func_b, I'm not sure about how to mock out func_a. My test code is like:

from unittest import mock
from service_b import func_b

# within test method:
with mock.patch('[patch_path_for_func_a]') as mock_func:
    func_b()
    ...

I don't know what to put in [patch_path_for_func_a] because func_a isn't exposed from service_b at runtime.
Well, if you really want to do this there are ways, but probably not a nice one. For example, you might add service_b to sys.path so that module func_b can be imported:

import sys
from unittest import mock

import service_b

sys.path.extend(service_b.__path__)
import func_b

with mock.patch('func_b.func_a') as mock_func:
    func_b.func_b()
    mock_func.assert_called_once_with()

Now, if you think that trying to write this test has been painful, I suggest you take a step back to identify the source of the pain and get rid of it. You are actively hiding the module service_b.func_b so that it can't be easily imported, then your test wants to do exactly that. Different parts of the code have opposite goals, and that is what is causing the pain.

There are several ways in which you can avoid this; they might or might not work well for your specific case (it is hard to give concrete advice about foobar examples):

Do not hide the module
If you do not give the same name to the module and the function, or you somehow reorganize your code so that the module where func_a() is being looked up can be easily imported, the problem vanishes.

Do not mock func_a()
func_b() calling func_a() looks like an implementation detail, so it might make sense if the test does not care about it. Just call func_b() and assert that it returns the expected value.

Make the dependency explicit
If you really feel that the test should call func_b() with an alternate implementation (in this case a mock) of func_a(), that is a hint that other code might want to do the same thing. Making the dependency explicit will make that way easier for any code, including the test, and will avoid the need for monkey-patching and the distraction of having to think where to patch.

def func_b(func_a):
    func_a()
    ...

And then the test is straightforward (no monkey-patching, no strained imports, no irrelevant and distracting details):

func_a = mock.Mock()              # arrange
func_b(func_a)                    # act
func_a.assert_called_once_with()  # assert
Is it good practice to pass a Python module as an argument
I'm pretty new to Python, coming from Javaland. I'm writing a few modules and want to test them independently. Some of these have dependencies on functions defined in other modules. I want to find a light-weight way of injecting a test module when running the code from the test, and use that instead of the real module that defines those functions. I have come up with the pattern below as a means to achieve that.

Say I have somemodule.py that defines a function:

def aFunction():
    return _calculate_real_value_and_do_a_bunch_of_stuff()

In foo.py I have a class that depends on that function:

import somemodule

class Foo:
    def bar(self, somemodule=somemodule):
        return 'bar:' + somemodule.aFunction()

In test_foo.py:

import unittest

import foo
import test_foo

def aFunction():
    return 'test_value'

class FooTest(unittest.TestCase):
    def test_bar(self):
        self.assertEquals('bar:test_value', foo.Foo().bar(test_foo))

This works for injecting a module into Foo.bar, but is it good practice? Are there other, better ways of enabling testing of a module with dependencies? I find that the code is quite readable and I get the added benefit of a dependency list in the arguments to the function. The only downside I see is that I have an explicit dependency on somemodule in foo.py, and from a dependency injection POV it might smell?
The usual way to do this is via monkeypatching. Python lets you do this:

import somemodule
somemodule.aFunction = aFunction

and now from the perspective of foo, somemodule.aFunction is your test function. The mock library has a patch decorator that does much the same thing but wraps it so that the original is restored when the test ends.
How do I mock the hierarchy of non-existing modules?
Let's assume that we have a system of modules that exists only in production. At the moment of testing, these modules do not exist. But still I would like to write tests for the code that uses those modules. Let's also assume that I know how to mock all the necessary objects from those modules. The question is: how do I conveniently add module stubs into the current hierarchy?

Here is a small example. The functionality I want to test is placed in a file called actual.py:

actual.py:

def coolfunc():
    from level1.level2.level3_1 import thing1
    from level1.level2.level3_2 import thing2
    do_something(thing1)
    do_something_else(thing2)

In my test suite I already have everything I need: I have thing1_mock and thing2_mock. Also I have a testing function. What I need is to add level1.level2... into the current module system. Like this:

tests.py:

import sys
import actual

class SomeTestCase(TestCase):
    thing1_mock = mock1()
    thing2_mock = mock2()

    def setUp(self):
        sys.modules['level1'] = what should I do here?

    @patch('level1.level2.level3_1.thing1', thing1_mock)
    @patch('level1.level2.level3_2.thing2', thing2_mock)
    def test_some_case(self):
        actual.coolfunc()

I know that I can substitute sys.modules['level1'] with an object containing another object and so on. But it seems like a lot of code for me. I assume that there must be a much simpler and prettier solution. I just cannot find it.
So, no one helped me with my problem and I decided to solve it by myself. Here is a micro-lib called surrogate which allows one to create stubs for non-existing modules.

The lib can be used with mock like this:

from surrogate import surrogate
from mock import patch

@surrogate('this.module.doesnt.exist')
@patch('this.module.doesnt.exist', whatever)
def test_something():
    from this.module.doesnt import exist
    do_something()

Firstly the @surrogate decorator creates stubs for the non-existing modules, then the @patch decorator can alter them. Just like @patch, @surrogate decorators can be used "in plural", thus stubbing more than one module path. All stubs exist only for the lifetime of the decorated function.

If anyone gets any use of this lib, that would be great :)
How should I perform imports in a python module without polluting its namespace?
I am developing a Python package for dealing with some scientific data. There are multiple frequently-used classes and functions from other modules and packages, including numpy, that I need in virtually every function defined in any module of the package. What would be the Pythonic way to deal with them? I have considered multiple variants, but every one has its own drawbacks.

Import the classes at module-level with from foreignmodule import Class1, Class2, function1, function2. Then the imported functions and classes are easily accessible from every function. On the other hand, they pollute the module namespace, making dir(package.module) and help(package.module) cluttered with imported functions.

Import the classes at function-level with from foreignmodule import Class1, Class2, function1, function2. The functions and classes are easily accessible and do not pollute the module, but imports from up to a dozen modules in every function look like a lot of duplicate code.

Import the modules at module-level with import foreignmodule. Not too much pollution is compensated by the need to prepend the module name to every function or class call.

Use some artificial workaround like using a function body for all these manipulations and returning only the objects to be exported... like this:

def _export():
    from foreignmodule import Class1, Class2, function1, function2

    def myfunc(x):
        return function1(x, function2(x))

    return myfunc

myfunc = _export()
del _export

This manages to solve both problems, module namespace pollution and ease of use for functions... but it seems to be not Pythonic at all.

So what solution is the most Pythonic? Is there another good solution I overlooked?
Go ahead and do your usual from W import X, Y, Z and then use the __all__ special symbol to define what actual symbols you intend people to import from your module:

__all__ = ('MyClass1', 'MyClass2', 'myvar1', …)

This defines the symbols that will be imported into a user's module if they import * from your module.

In general, Python programmers should not be using dir() to figure out how to use your module, and if they are doing so it might indicate a problem somewhere else. They should be reading your documentation or typing help(yourmodule) to figure out how to use your library. Or they could browse the source code themselves, in which case (a) the difference between things you import and things you define is quite clear, and (b) they will see the __all__ declaration and know which toys they should be playing with.

If you try to support dir() in a situation like this for a task for which it was not designed, you will have to place annoying limitations on your own code, as I hope is clear from the other answers here. My advice: don't do it! Take a look at the Standard Library for guidance: it does from … import … whenever code clarity and conciseness require it, and provides (1) informative docstrings, (2) full documentation, and (3) readable code, so that no one ever has to run dir() on a module and try to tell the imports apart from the stuff actually defined in the module.
One technique I've seen used, including in the standard library, is to use import module as _module or from module import var as _var, i.e. assigning imported modules/variables to names starting with an underscore.

The effect is that other code, following the usual Python convention, treats those members as private. This applies even for code that doesn't look at __all__, such as IPython's autocomplete function.

An example from Python 3.3's random module:

from warnings import warn as _warn
from types import MethodType as _MethodType, BuiltinMethodType as _BuiltinMethodType
from math import log as _log, exp as _exp, pi as _pi, e as _e, ceil as _ceil
from math import sqrt as _sqrt, acos as _acos, cos as _cos, sin as _sin
from os import urandom as _urandom
from collections.abc import Set as _Set, Sequence as _Sequence
from hashlib import sha512 as _sha512

Another technique is to perform imports in function scope, so that they become local variables:

"""Some module"""
# imports conventionally go here

def some_function(arg):
    "Do something with arg."
    import re  # Regular expressions solve everything
    ...

The main rationale for doing this is that it is effectively lazy, delaying the importing of a module's dependencies until they are actually used. Suppose one function in the module depends on a particular huge library. Importing the library at the top of the file would mean that importing the module would load the entire library. This way, importing the module can be quick, and only client code that actually calls that function incurs the cost of loading the library. Further, if the dependency library is not available, client code that doesn't need the dependent feature can still import the module and call the other functions. The disadvantage is that using function-level imports obscures what your code's dependencies are.

Example from Python 3.3's os.py:

def get_exec_path(env=None):
    """[...]"""
    # Use a local import instead of a global import to limit the number of
    # modules loaded at startup: the os module is always loaded at startup by
    # Python. It may also avoid a bootstrap issue.
    import warnings
Import the module as a whole: import foreignmodule. What you claim as a drawback is actually a benefit. Namely, prepending the module name makes your code easier to maintain and makes it more self-documenting. Six months from now when you look at a line of code like foo = Bar(baz) you may ask yourself which module Bar came from, but with foo = cleverlib.Bar it is much less of a mystery. Of course, the fewer imports you have, the less of a problem this is. For small programs with few dependencies it really doesn't matter all that much. When you find yourself asking questions like this, ask yourself what makes the code easier to understand, rather than what makes the code easier to write. You write it once but you read it a lot.
For this situation I would go with an all_imports.py file which has all the

from foreignmodule import .....
from another_module import .....

and then in your working modules:

import all_imports as fgn  # or whatever you want to prepend

...
something = fgn.Class1()

Another thing to be aware of:

__all__ = ['func1', 'func2', 'this', 'that']

Now, any functions/classes/variables/etc. that are in your module, but not in your module's __all__, will not show up in help(), and won't be imported by from mymodule import *. See "Making python imports more structured?" for more info.
I would compromise and just pick a short alias for the foreign module:

import foreignmodule as fm

It saves you completely from the pollution (probably the bigger issue) and at least reduces the prepending burden.
I know this is an old question. It may not be 'Pythonic', but the cleanest way I've discovered for exporting only certain module definitions is, really as you've found, to globally wrap the module in a function. But instead of returning them, to export names you can simply globalize them (global thus in essence becomes a kind of 'export' keyword):

def module():
    global MyPublicClass, ExportedModule

    import somemodule as ExportedModule
    import anothermodule as PrivateModule

    class MyPublicClass:
        def __init__(self):
            pass

    class MyPrivateClass:
        def __init__(self):
            pass

module()
del module

I know it's not much different than your original conclusion, but frankly to me this seems to be the cleanest option. The other advantage is, you can group any number of modules written this way into a single file, and their private terms won't overlap:

def module():
    global A
    i, j, k = 1, 2, 3

    class A:
        pass

module()
del module

def module():
    global B
    i, j, k = 7, 8, 9  # doesn't overwrite previous declarations

    class B:
        pass

module()
del module

Though, keep in mind their public definitions will, of course, overlap.