How to fake a module in unittest? - Python

I have code that is based on a configuration file called config.py, which defines a class called Config containing all the configuration options. Since the config file can be located anywhere in the user's storage, I use importlib.util to import it (as specified in this answer). I want to test this functionality with unittest for different configurations. How do I do it? A simple answer could be to make a different file for every possible config I want to test and then pass its path to the config loader, but this is not what I want. What I basically need is to implement the Config class and fake it as if it were the actual config file. How can I achieve this?
EDIT: Here is the code I want to test:
import os
import re
import traceback
import importlib.util
from typing import Any

from blessings import Terminal

term = Terminal()


class UnknownOption(Exception):
    pass


class MissingOption(Exception):
    pass


def perform_checks(config: Any):
    checklist = {
        "required": {
            "root": [
                "flask",
                "react",
                "mysql",
                "MODE",
                "RUN_REACT_IN_DEVELOPMENT",
                "RUN_FLASK_IN_DEVELOPMENT",
            ],
            "flask": ["HOST", "PORT", "config"],
            # More options
        },
        "optional": {
            "react": [
                "HTTPS",
                # More options
            ],
            "mysql": ["AUTH_PLUGIN"],
        },
    }
    # Check for missing required options
    for kind in checklist["required"]:
        prop = config if kind == "root" else getattr(config, kind)
        for val in checklist["required"][kind]:
            if not hasattr(prop, val):
                raise MissingOption(
                    "Error while parsing config: "
                    + f"{prop}.{val} is a required config "
                    + "option but is not specified in the configuration file."
                )

    def unknown_option(option: str):
        raise UnknownOption(
            "Error while parsing config: Found an unknown option: " + option
        )

    # Check for unknown options
    for val in vars(config):
        if not re.match("__[a-zA-Z0-9_]*__", val) and not callable(getattr(config, val)):
            if val in checklist["optional"]:
                section = getattr(config, val)
                for ch_val in vars(section):
                    if not re.match("__[a-zA-Z0-9_]*__", ch_val) and not callable(
                        getattr(section, ch_val)
                    ):
                        if ch_val not in checklist["optional"][val]:
                            unknown_option(f"Config.{val}.{ch_val}")
            else:
                unknown_option(f"Config.{val}")

    # Check for illegal options
    if config.react.HTTPS == "true":
        # HTTPS was set to true but no cert file was specified
        if not hasattr(config.react, "SSL_KEY_FILE") or not hasattr(
            config.react, "SSL_CRT_FILE"
        ):
            raise MissingOption(
                "config.react.HTTPS was set to True without specifying a key file and a crt file, which is illegal"
            )
        else:
            # Files were specified but are non-existent
            if not os.path.exists(config.react.SSL_KEY_FILE):
                raise FileNotFoundError(
                    f"The file at {config.react.SSL_KEY_FILE} was set as the key file "
                    + "in configuration but was not found."
                )
            if not os.path.exists(config.react.SSL_CRT_FILE):
                raise FileNotFoundError(
                    f"The file at {config.react.SSL_CRT_FILE} was set as the certificate file "
                    + "in configuration but was not found."
                )


def load_from_pyfile(root: str = None):
    """
    This loads the configuration from a `config.py` file located in the project root
    """
    PROJECT_ROOT = root or os.path.abspath(
        ".." if os.path.abspath(".").split("/")[-1] == "lib" else "."
    )
    config_file = os.path.join(PROJECT_ROOT, "config.py")
    print(f"Loading config from {term.green(config_file)}")
    # Load the config file
    spec = importlib.util.spec_from_file_location("", config_file)
    config = importlib.util.module_from_spec(spec)
    # Execute the script
    spec.loader.exec_module(config)
    # Not needed anymore
    del spec, config_file
    # Load the mode from environment variable and
    # if it is not specified use development mode
    MODE = int(os.environ.get("PROJECT_MODE", -1))
    conf: Any
    try:
        conf = config.Config()
        conf.load(PROJECT_ROOT, MODE)
    except Exception:
        print(term.red("Fatal: There was an error while parsing the config.py file:"))
        traceback.print_exc()
        print("This error is non-recoverable. Aborting...")
        exit(1)
    print("Validating configuration...")
    perform_checks(conf)
    print(
        "Configuration",
        term.green("OK"),
    )

Without seeing a bit more of your code it's tough to give a terribly direct answer, but most likely you want to use mocks.
In the unit test, you would use a mock to replace the Config class for the caller/consumer of that class. You then configure the mock to give the return values or side effects that are relevant to your test case.
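For instance, a minimal sketch (the names here are assumptions, not a fixed API: my_module stands for wherever load_from_pyfile lives, and the consumer code is left as a placeholder):
from unittest.mock import MagicMock, patch

import my_module  # hypothetical: the module containing load_from_pyfile

def test_consumer_with_fake_config():
    with patch("my_module.load_from_pyfile") as mock_loader:
        fake_config = MagicMock()
        fake_config.react.HTTPS = "false"  # return values relevant to this case
        mock_loader.return_value = fake_config
        # ... call the code that consumes the loaded config and assert on it ...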
Based on what you've posted, you may not need any mocks, just fixtures. That is, examples of Config that exercise a given case. In fact, it would probably be best to do exactly what you suggested originally--just make a few sample configs that exercise all the cases that matter.
It's not clear why that is undesirable--in my experience, it's much easier to read and understand a test with a coherent fixture than it is to deal with mocking and constructing objects in the test class. Also, you'd find this much easier to test if you broke the perform_checks function into parts, e.g., where you have comments.
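A rough sketch of that split (the function names are my own invention, and make_checklist is a hypothetical helper returning the dict from your code):
def check_required(config, checklist):
    ...  # the "Check for missing required options" block

def check_unknown(config, checklist):
    ...  # the "Check for unknown options" block

def check_ssl_files(config):
    ...  # the "Check for illegal options" block

def perform_checks(config):
    checklist = make_checklist()  # hypothetical helper returning the dict above
    check_required(config, checklist)
    check_unknown(config, checklist)
    check_ssl_files(config)
Each small function can then be exercised in isolation with a much smaller fixture.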
However, you can construct the Config objects as you like and pass them to the check function in a unit test. It's a common pattern in Python development to use dict fixtures. Remember that in Python, objects (including modules) have an interface much like a dictionary. Suppose you had a unit test:
from unittest import TestCase

from your_code import perform_checks, MissingOption


class TestConfig(TestCase):
    def test_perform_checks(self):
        dummy_callable = lambda x: x
        config_fixture = {
            'key1': 'string_val',
            'key2': ['string_in_list', 'other_string_in_list'],
            'key3': {'sub_key': 'nested_val_string', 'callable_key': dummy_callable},
            # this is your in-place fixture
            # you make the keys and values that correspond to the feature of the
            # Config file under test
        }
        perform_checks(config_fixture)
        # i would suggest returning True from the function instead, but this
        # will cover the happy path case
        self.assertTrue(True)

    def test_perform_checks_invalid(self):  # must start with "test_" to be discovered
        config_fixture = {}
        with self.assertRaises(MissingOption):
            perform_checks(config_fixture)

    # more tests of more cases
You can also override the setUp() method of the unittest class if you want to share fixtures among tests. One way to do this would be to set up a valid fixture, then make the invalidating changes you want to test in each test method.
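A sketch of that pattern (SimpleNamespace is one convenient way to build attribute-style config objects; perform_checks and MissingOption are assumed importable from the code under test, and the fixture values are illustrative):
from types import SimpleNamespace
from unittest import TestCase

from your_code import perform_checks, MissingOption


class TestConfigFixture(TestCase):
    def setUp(self):
        # A valid baseline fixture, shared by every test.
        self.config = SimpleNamespace(
            flask=SimpleNamespace(HOST="0.0.0.0", PORT=5000, config="dev"),
            react=SimpleNamespace(HTTPS="false"),
            mysql=SimpleNamespace(),
            MODE=1,
            RUN_REACT_IN_DEVELOPMENT=True,
            RUN_FLASK_IN_DEVELOPMENT=True,
        )

    def test_missing_required_option(self):
        del self.config.MODE  # invalidate exactly one thing per test
        with self.assertRaises(MissingOption):
            perform_checks(self.config)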

Related

Is it possible to mock os.scandir and its attributes?

for entry in os.scandir(document_dir):
    if os.path.isdir(entry):
        pass  # some code goes here
    else:
        # else the file needs to be in a folder
        file_path = entry.path.replace(os.sep, '/')
I am having trouble mocking os.scandir and the path attribute within the else statement. I am not able to mock the path property of the mock object I created in my unit tests.
with patch("os.scandir") as mock_scandir:
    # mock_scandir.return_value = ["docs.json", ]
    # mock_scandir.side_effect = ["docs.json", ]
    # mock_scandir.return_value.path = PropertyMock(return_value="docs.json")
These are all the options I've tried. Any help is greatly appreciated.
It depends on what you really need to mock. The problem is that os.scandir returns entries of type os.DirEntry. One possibility is to use your own mock DirEntry and implement only the methods that you need (in your example, only path). For your example, you also have to mock os.path.isdir. Here is a self-contained example of how you can do this:
import os
from unittest.mock import patch


def get_paths(document_dir):
    # example function containing your code
    paths = []
    for entry in os.scandir(document_dir):
        if os.path.isdir(entry):
            pass
        else:
            # else the file needs to be in a folder
            file_path = entry.path.replace(os.sep, '/')
            paths.append(file_path)
    return paths


class DirEntry:
    def __init__(self, path):
        self.path = path


@patch("os.scandir")
@patch("os.path.isdir")
def test_sut(mock_isdir, mock_scandir):
    mock_isdir.return_value = False
    mock_scandir.return_value = [DirEntry("docs.json")]
    assert get_paths("anydir") == ["docs.json"]
Depending on your actual code, you may have to do more.
If you want to patch more file system functions, you may consider using pyfakefs instead, which patches the whole file system. This would be overkill for a single test, but it can be handy for a test suite relying on file system functions.
Disclaimer: I'm a contributor to pyfakefs.
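For comparison, the same test with pyfakefs would look roughly like this (a sketch; it assumes the get_paths example above is importable from a module, here hypothetically called my_module):
from pyfakefs.fake_filesystem_unittest import TestCase

from my_module import get_paths  # assumption: the example function above


class GetPathsTest(TestCase):
    def setUp(self):
        self.setUpPyfakefs()  # from here on, os and open work on a fake filesystem

    def test_get_paths(self):
        self.fs.create_file("/anydir/docs.json")
        self.assertEqual(get_paths("/anydir"), ["/anydir/docs.json"])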

Call a function from different file where the file name and function name are read from a list

I have multiple functions stored in different files, and both the file names and the function names are stored in lists. Is there any option to call the required function without conditional statements?
For example, file1 has functions function11 and function12:
def function11():
    pass

def function12():
    pass
file2 has functions function21 and function22:
def function21():
    pass

def function22():
    pass
and I have the lists:
file_name = ["file1", "file2", "file1"]
function_name = ["function12", "function22", "function12"]
I will get the list index from a different function; based on that, I need to call the corresponding function and get its output.
If the other function will give you a list index directly, then you don't need to deal with the function names as strings. Instead, directly store (without calling) the functions in the list:
import file1, file2
functions = [file1.function12, file2.function22, file1.function12]
And then call them once you have the index:
functions[index]()
There are ways to do what is called "reflection" in Python and get from the string to a matching-named function. But they solve a problem that is more advanced than what you describe, and they are more difficult (especially if you also have to work with the module names).
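For completeness, the reflective version is roughly this (only reasonable if you fully trust the strings; file_name and function_name are the lists from the question):
import importlib

module = importlib.import_module(file_name[index])   # "file1" -> the module object
func = getattr(module, function_name[index])         # "function12" -> the function
result = func()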
If you have a "whitelist" of functions and modules that are allowed to be called from the config file, but still need to find them by string, you can explicitly create the mapping with a dict:
allowed_functions = {
    'file1': {
        'function11': file1.function11,
        'function12': file1.function12,
    },
    'file2': {
        'function21': file2.function21,
        'function22': file2.function22,
    },
}
And then invoke the function:
try:
    func = allowed_functions[module_name][function_name]
except KeyError:
    raise ValueError("this function/module name is not allowed")
else:
    func()
The most advanced approach is if you need to load code from a "plugin" module created by the author. You can use the standard library importlib package to use the string name to find a file to import as a module, and import it dynamically. It looks something like:
import os
from importlib.util import spec_from_file_location, module_from_spec

# Look for the file at the specified path, figure out the module name
# from the base file name, import it and make a module object.
def load_module(path):
    folder, filename = os.path.split(path)
    basename, extension = os.path.splitext(filename)
    spec = spec_from_file_location(basename, path)
    module = module_from_spec(spec)
    spec.loader.exec_module(module)
    assert module.__name__ == basename
    return module
This is still unsafe, in the sense that it can look anywhere on the file system for the module. Better if you specify the folder yourself, and only allow a filename to be used in the config file; but then you still have to protect against hacking the path by using things like ".." and "/" in the "filename".
(I have a project that does something like this. It chooses the paths from a whitelist that is also under the user's control, so I have to warn my users not to trust the path-whitelist file from each other. I also search the directories for modules, and then make a whitelist of plugins that may be used, based only on plugins that are in the directory - so no funny games with "..". And I'm still worried I forgot something.)
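A minimal sketch of that kind of path check (an illustration, not a complete defense):
import os

def safe_plugin_path(plugins_dir, filename):
    # Reject anything that is not a bare file name, e.g. "../evil.py" or "/etc/x.py".
    if os.path.basename(filename) != filename:
        raise ValueError("plugin must be a bare file name")
    path = os.path.join(plugins_dir, filename)
    # Resolve symlinks and ".." and confirm we are still inside the plugin folder.
    if not os.path.realpath(path).startswith(os.path.realpath(plugins_dir) + os.sep):
        raise ValueError("plugin escapes the plugin directory")
    return path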
Once you have a module name, you can get a function from it by name like:
dynamic_module = load_module(some_path)
try:
    func = getattr(dynamic_module, function_name)
except AttributeError:
    raise ValueError("function not in module")
At any rate, there is no reason to eval anything, or generate and import code based on user input. That is most unsafe of all.
Another alternative follows. It is not much safer than an eval(), however: someone with access to the lists you read from the config file could inject malicious code into the lists you import, e.g.
'from subprocess import call; call(["rm", "-rf", "./*"], shell=True)'
Code:
import re

# You must first create a directory named "test_module"
# (you can do this with code if needed).
# Python recognizes a "module" as a module by the existence of an __init__.py
# It will load that __init__.py at the "import" command, and you can access the methods it imports.
m = ["os", "sys", "subprocess"]  # Modules to import from
f = ["getcwd", "exit", "call; call('do', '---terrible-things')"]  # Methods to import

# Create an __init__.py
with open("./test_module/__init__.py", "w") as FH:
    for count in range(len(m)):
        # Writes "from module import method" to __init__.py
        line = "from {} import {}\n".format(m[count], f[count])
        # !!!! SANITIZE THE LINE !!!!
        if not re.match("^from [a-zA-Z0-9._]+ import [a-zA-Z0-9._]+$", line):
            print("The line '{}' is suspicious. Will not be entered into __init__.py!!".format(line))
            continue
        FH.write(line)

import test_module
print(test_module.getcwd())
OUTPUT:
The line 'from subprocess import call; call('do', '---terrible-things')' is suspicious. Will not be entered into __init__.py!!
/home/rightmire/eclipse-workspace/junkcode
I'm not 100% sure I'm understanding the need. Maybe more detail in the question.
Is something like this what you're looking for?
m = ["os"]
f = ["getcwd"]
command = ''.join([m[0], ".", f[0], "()"])
# Put in some minimum sanity checking and sanitization!!!
if ";" in command or <other dangerous string> in command:
print("The line '{}' is suspicious. Will not run".format(command))
sys.exit(1)
print("This will error if the method isnt imported...")
print(eval(''.join([m[0], ".", f[0], "()"])) )
OUTPUT:
This will error if the method isnt imported...
/home/rightmire/eclipse-workspace/junkcode
As pointed out by @KarlKnechtel, having commands come in from an external file is a gargantuan security risk!

How to correctly call jsonnet with imports from Python

I'm using jsonnet to build json objects that will be used by Python code, calling jsonnet from Python using the bindings. I want to set up my directory structure so that the jsonnet files are in a subdirectory or subdirectories relative to where the Python code is run, something like:
foo.py
jsonnet/
jsonnet/bar.jsonnet
jsonnet/baz.libsonnet
Running foo.py should then be able to use _jsonnet.evaluate_snippet() on strings read from files in jsonnet/ that import other files from jsonnet/. What's the best way to do this?
The default importer uses paths relative to the file from which they are imported. In case of evaluate_snippet you need to pass the path manually. This way jsonnet knows where to look for imported files.
If your intention is to process the files you can use a custom importer. (Digression: jsonnet tries to avoid the need to preprocess the source files, so there is probably a better way or a missing feature in jsonnet.)
Below is the complete, working example on how to use custom importers in Python (adjusted to the directory structure provided):
import os
import unittest

import _jsonnet


# Returns content if worked, None if file not found, or throws an exception
def try_path(dir, rel):
    if not rel:
        raise RuntimeError('Got invalid filename (empty string).')
    if rel[0] == '/':
        full_path = rel
    else:
        full_path = dir + rel
    if full_path[-1] == '/':
        raise RuntimeError('Attempted to import a directory')
    if not os.path.isfile(full_path):
        return full_path, None
    with open(full_path) as f:
        return full_path, f.read()


def import_callback(dir, rel):
    full_path, content = try_path(dir, rel)
    if content:
        return full_path, content
    raise RuntimeError('File not found')
class JsonnetTests(unittest.TestCase):
    def setUp(self):
        self.input_filename = os.path.join(
            "jsonnet",
            "bar.jsonnet",
        )
        self.expected_str = '{\n "num": 42,\n "str": "The answer to life ..."\n}\n'
        with open(self.input_filename, "r") as infile:
            self.input_snippet = infile.read()

    def test_evaluate_file(self):
        json_str = _jsonnet.evaluate_file(
            self.input_filename,
            import_callback=import_callback,
        )
        self.assertEqual(json_str, self.expected_str)

    def test_evaluate_snippet(self):
        json_str = _jsonnet.evaluate_snippet(
            "jsonnet/bar.jsonnet",
            self.input_snippet,
            import_callback=import_callback,
        )
        self.assertEqual(json_str, self.expected_str)


if __name__ == '__main__':
    unittest.main()
Note: it's a modified version of an example from the jsonnet repo.
I don't fully get why you would use evaluate_snippet() (maybe to mask the actual filenames by loading them from Python into strings and calling evaluate_snippet("blah", str)?) instead of evaluate_file() - in any case, that structure should just work.
Example:
jsonnet_test.py:
import json

import _jsonnet

jsonnet_file = "jsonnet/bar.jsonnet"
data = json.loads(_jsonnet.evaluate_file(jsonnet_file))
print("{str} => {num}".format(**data))
jsonnet/bar.jsonnet:
local baz = import "baz.libsonnet";
{
  str: "The answer to life ...",
  num: baz.mult(6, 7),
}
jsonnet/baz.libsonnet:
{
  mult(a, b):: (
    a * b
  ),
}
Output:
$ python jsonnet_test.py
The answer to life ... => 42

Python: Creating a mock or fake directory with files for unittesting

I am trying to create a unit test for the following function:
def my_function(path):
    # Search files at the given path
    for file in os.listdir(path):
        if file.endswith(".json"):
            # Search for the file I'm looking for
            if file == "file_im_looking_for.json":
                # Open the file
                os.chdir(path)
                json_file = json.load(open(file))
                print(json_file["name"])
However, I am having trouble successfully creating a fake directory with files in order for the function to work correctly and not throw errors.
Below is what I have so far but it is not working for me, and I'm not sure how to incorporate "file_im_looking_for" as the file in the fake directory.
tmpfilepath = os.path.join(tempfile.gettempdir(), "tmp-testfile")

@mock.patch('my_module.os')
def test_my_function(self):
    # make the file 'exist'
    mock_path.endswith.return_value = True
    file_im_looking_for = [{
        "name": "test_json_file",
        "type": "General"
    }]
    my_module.my_function("tmpfilepath")
Any advice where I'm going wrong or other ideas to approach this problem are appreciated!
First of all, you forgot to pass the mocked object to the test function. The right way to use mock in your test should look like this:
@mock.patch('my_module.os')
def test_my_function(self, mock_path):
Anyway, you shouldn't mock the endswith, but the listdir. The snippet below is an example and may help you.
app.py
import os

def check_files(path):
    files = []
    for _file in os.listdir(path):
        if _file.endswith('.json'):
            files.append(_file)
    return files
test_app.py
import unittest
import mock

from app import check_files


class TestCheckFile(unittest.TestCase):
    @mock.patch('app.os.listdir')
    def test_check_file_should_succeed(self, mock_listdir):
        mock_listdir.return_value = ['a.json', 'b.json', 'c.json', 'd.txt']
        files = check_files('.')
        self.assertEqual(3, len(files))

    @mock.patch('app.os.listdir')
    def test_check_file_should_fail(self, mock_listdir):
        mock_listdir.return_value = ['a.json', 'b.json', 'c.json', 'd.txt']
        files = check_files('.')
        self.assertNotEqual(2, len(files))


if __name__ == '__main__':
    unittest.main()
Edit: Answering your question in the comment, you need to mock the json.loads and the open from your app.
@mock.patch('converter.open')
@mock.patch('converter.json.loads')
@mock.patch('converter.os.listdir')
def test_check_file_load_json_should_succeed(self, mock_listdir, mock_json_loads, mock_open):
    mock_listdir.return_value = ['a.json', 'file_im_looking_for.json', 'd.txt']
    mock_json_loads.return_value = [{"name": "test_json_file", "type": "General"}]
    files = check_files('.')
    self.assertEqual(1, len(files))
But remember! If your test is too broad or hard to maintain, perhaps refactoring your API would be a good idea.
I would suggest using Python's tempfile library, specifically TemporaryDirectory.
The issue with your solution and Mauro Baraldi's is that you have to patch multiple functions. This is a very error-prone approach, since with mock.patch you have to know exactly what you are doing! Otherwise, it may cause unexpected errors and eventually frustration.
Personally, I prefer pytest, since it has (IMO) nicer syntax and better fixtures, but since the question uses unittest, I will stick with it.
I would rewrite your test code like this:
import json
import pathlib
import tempfile
import unittest

import my_module  # assumption: the module that defines my_function

wrong_data = {
    "name": "wrong_json_file",
    "type": "Fake"
}

correct_data = {
    "name": "test_json_file",
    "type": "General"
}


class TestMyFunction(unittest.TestCase):
    def setUp(self):
        """ Called before every test. """
        self._temp_dir = tempfile.TemporaryDirectory()
        temp_path = pathlib.Path(self._temp_dir.name)
        self._create_temporary_file_with_json_data(temp_path / 'wrong_json_file.json', wrong_data)
        self._create_temporary_file_with_json_data(temp_path / 'file_im_looking_for.json', correct_data)

    def tearDown(self):
        """ Called after every test. """
        self._temp_dir.cleanup()

    def _create_temporary_file_with_json_data(self, file_path, json_data):
        with open(file_path, 'w') as ifile:
            ifile.write(json.dumps(json_data))

    def test_my_function(self):
        my_module.my_function(self._temp_dir.name)
You see that your actual test is compressed down to a single line! Admittedly, there is no assert, but if your function returned something, you could assert on the result.
No mocking, because everything exists and will be cleaned up afterwards. And the best thing is that you now can add more tests with a lower barrier of entry.
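Since pytest came up: with its built-in tmp_path fixture the same test shrinks further (a sketch reusing correct_data and my_module from the snippet above; pytest collects this function directly, no class needed):
import json

def test_my_function(tmp_path):
    (tmp_path / "file_im_looking_for.json").write_text(json.dumps(correct_data))
    my_module.my_function(str(tmp_path))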

Configuring Python Application based upon git branch

I've been thinking about ways to automatically set up configuration in my Python applications.
I usually use the following type of approach:
'''config.py'''

class Config(object):
    MAGIC_NUMBER = 44
    DEBUG = True

class Development(Config):
    LOG_LEVEL = 'DEBUG'

class Production(Config):
    DEBUG = False
    REPORT_EMAIL_TO = ["ceo@example.com", "chief_ass_kicker@example.com"]
Typically, when I'm running the app in different ways I could do something like:
from config import Development, Production

def __init__(self, config='Development'):
    if config == "production":
        self.conf = Production
    else:
        self.conf = Development

def do_something(self):
    if self.conf.DEBUG:
        pass
I like working like this because it makes sense, however I'm wondering if I can somehow integrate this into my git workflow too.
A lot of my applications have separate scripts, or modules that can be run alone, thus there isn't always a monolithic application to inherit configurations from some root location.
It would be cool if a lot of these scripts and separate modules could check what branch is currently checked out and make their default configuration decisions based upon that, e.g., by looking for a class in config.py that shares the same name as the name of the currently checked out branch.
Is that possible, and what's the cleanest way to achieve it?
Is it a good/bad idea?
I'd prefer spinlok's method, but yes, you can do pretty much anything you want in your __init__, e.g.:
import inspect, subprocess, sys

def __init__(self, config='via_git'):
    if config == 'via_git':
        gitsays = subprocess.check_output(['git', 'symbolic-ref', 'HEAD'])
        cbranch = gitsays.rstrip('\n').replace('refs/heads/', '', 1)
        # now you know which branch you're on...
        tbranch = cbranch.title()  # foo -> Foo, for class name conventions
        classes = dict(inspect.getmembers(sys.modules[__name__], inspect.isclass))
        if tbranch in classes:
            print 'automatically using', tbranch
            self.conf = classes[tbranch]
        else:
            print 'on branch', cbranch, 'so falling back to Production'
            self.conf = Production
    elif config == 'production':
        self.conf = Production
    else:
        self.conf = Development
This is, um, "slightly tested" (python 2.7). Note that check_output will raise an exception if git can't get a symbolic ref, and this also depends on your working directory. You can of course use other subprocess functions (to provide a different cwd for instance).
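If you are on Python 3, the branch lookup itself can be written a bit more simply with rev-parse, which prints the current branch name directly (a sketch; it still raises if git cannot resolve HEAD and still depends on the working directory):
import subprocess

def current_branch():
    out = subprocess.check_output(["git", "rev-parse", "--abbrev-ref", "HEAD"])
    return out.decode().strip()  # e.g. "master" or "develop"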
