My code is driven by a configuration file called config.py, which defines a class called Config that holds all the configuration options. Because the config file can be located anywhere in the user's storage, I import it with importlib.util (as specified in this answer). I want to test this functionality with unittest for different configurations. How do I do it? A simple answer would be to create a separate file for every config I want to test and pass its path to the config loader, but that is not what I want. What I need is to implement the Config class in the test and present it as if it came from the actual config file. How can I achieve this?
EDIT: Here is the code I want to test:
import os
import re
import traceback
import importlib.util
from typing import Any

from blessings import Terminal

term = Terminal()


class UnknownOption(Exception):
    pass


class MissingOption(Exception):
    pass
def perform_checks(config: Any):
    checklist = {
        "required": {
            "root": [
                "flask",
                "react",
                "mysql",
                "MODE",
                "RUN_REACT_IN_DEVELOPMENT",
                "RUN_FLASK_IN_DEVELOPMENT",
            ],
            "flask": ["HOST", "PORT", "config"],
            # More options
        },
        "optional": {
            "react": [
                "HTTPS",
                # More options
            ],
            "mysql": ["AUTH_PLUGIN"],
        },
    }

    # Check for missing required options
    for kind in checklist["required"]:
        prop = config if kind == "root" else getattr(config, kind)
        # Iterate over the option names, not the characters of the section name
        for val in checklist["required"][kind]:
            if not hasattr(prop, val):
                raise MissingOption(
                    "Error while parsing config: "
                    + f"{prop}.{val} is a required config "
                    + "option but is not specified in the configuration file."
                )

    def unknown_option(option: str):
        raise UnknownOption(
            "Error while parsing config: Found an unknown option: " + option
        )

    # Check for unknown options
    for val in vars(config):
        if not re.match("__[a-zA-Z0-9_]*__", val) and not callable(val):
            if val in checklist["optional"]:
                for ch_val in vars(val):
                    if not re.match("__[a-zA-Z0-9_]*__", ch_val) and not callable(
                        ch_val
                    ):
                        if ch_val not in checklist["optional"][val]:
                            unknown_option(f"Config.{val}.{ch_val}")
            else:
                unknown_option(f"Config.{val}")

    # Check for illegal options
    if config.react.HTTPS == "true":
        # HTTPS was set to true but no cert file was specified
        if not hasattr(config.react, "SSL_KEY_FILE") or not hasattr(
            config.react, "SSL_CRT_FILE"
        ):
            raise MissingOption(
                "config.react.HTTPS was set to True without specifying a key file and a crt file, which is illegal"
            )
        else:
            # Files were specified but are non-existent
            if not os.path.exists(config.react.SSL_KEY_FILE):
                raise FileNotFoundError(
                    f"The file at { config.react.SSL_KEY_FILE } was set as the key file "
                    + "in configuration but was not found."
                )
            if not os.path.exists(config.react.SSL_CRT_FILE):
                raise FileNotFoundError(
                    f"The file at { config.react.SSL_CRT_FILE } was set as the certificate file "
                    + "in configuration but was not found."
                )
def load_from_pyfile(root: str = None):
    """
    This loads the configuration from a `config.py` file located in the project root
    """
    PROJECT_ROOT = root or os.path.abspath(
        ".." if os.path.abspath(".").split("/")[-1] == "lib" else "."
    )
    config_file = os.path.join(PROJECT_ROOT, "config.py")
    print(f"Loading config from {term.green(config_file)}")

    # Load the config file
    spec = importlib.util.spec_from_file_location("", config_file)
    config = importlib.util.module_from_spec(spec)
    # Execute the script
    spec.loader.exec_module(config)
    # Not needed anymore
    del spec, config_file

    # Load the mode from environment variable and
    # if it is not specified use development mode
    MODE = int(os.environ.get("PROJECT_MODE", -1))
    conf: Any
    try:
        conf = config.Config()
        conf.load(PROJECT_ROOT, MODE)
    except Exception:
        print(term.red("Fatal: There was an error while parsing the config.py file:"))
        traceback.print_exc()
        print("This error is non-recoverable. Aborting...")
        exit(1)

    print("Validating configuration...")
    perform_checks(conf)
    print(
        "Configuration",
        term.green("OK"),
    )
Without seeing a bit more of your code, it's tough to give a direct answer, but most likely you want to use mocks.
In the unit test, you would use a mock to replace the Config class for the caller/consumer of that class. You then configure the mock to give the return values or side effects that are relevant to your test case.
Based on what you've posted, you may not need any mocks, just fixtures. That is, examples of Config that exercise a given case. In fact, it would probably be best to do exactly what you suggested originally--just make a few sample configs that exercise all the cases that matter.
It's not clear why that is undesirable--in my experience, it's much easier to read and understand a test with a coherent fixture than it is to deal with mocking and constructing objects in the test class. Also, you'd find this much easier to test if you broke the perform_checks function into parts, e.g., where you have comments.
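If keeping a pile of fixture files around is the objection, one middle ground is to generate the file inside the test itself. The following is only a sketch: `load_config_class` is a made-up stand-in for the loader in the question (same importlib.util recipe, without the terminal output), and the `Config` body and the `MODE` value are invented for illustration.

```python
import importlib.util
import os
import tempfile
import textwrap
import unittest


def load_config_class(project_root):
    """Hypothetical stand-in for the loader: import config.py from a folder."""
    path = os.path.join(project_root, "config.py")
    spec = importlib.util.spec_from_file_location("user_config", path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module.Config


class GeneratedConfigTest(unittest.TestCase):
    def write_config(self, source):
        # Each test gets its own throwaway directory containing a config.py.
        tmp = tempfile.TemporaryDirectory()
        self.addCleanup(tmp.cleanup)
        with open(os.path.join(tmp.name, "config.py"), "w") as handle:
            handle.write(textwrap.dedent(source))
        return tmp.name

    def test_mode_is_read_from_generated_file(self):
        root = self.write_config(
            """
            class Config:
                MODE = "development"
            """
        )
        self.assertEqual(load_config_class(root).MODE, "development")
```

The config under test stays next to the assertion that exercises it, and nothing is left on disk after the run.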
However, you can construct the Config objects as you like and pass them to the check function in a unit test. It's a common pattern in Python development to use dict fixtures. Remembering that Python objects, including modules, have an interface much like a dictionary, suppose you had a unit test:
from unittest import TestCase
from your_code import perform_checks, MissingOption


class TestConfig(TestCase):
    def test_perform_checks(self):
        dummy_callable = lambda x: x
        config_fixture = {
            'key1': 'string_val',
            'key2': ['string_in_list', 'other_string_in_list'],
            'key3': { 'sub_key': 'nested_val_string', 'callable_key': dummy_callable},
            # this is your in-place fixture
            # you make the keys and values that correspond to the feature of the Config file under test.
        }
        perform_checks(config_fixture)
        self.assertTrue(True) # i would suggest returning True on the function instead, but this will cover the happy path case

    # the name must start with "test_" or unittest will not collect it
    def test_perform_checks_invalid(self):
        config_fixture = {}
        with self.assertRaises(MissingOption):
            perform_checks(config_fixture)

    # more tests of more cases
You can also override the setUp() method of the unittest class if you want to share fixtures among tests. One way to do this would be set up a valid fixture, then make the invalidating changes you want to test in each test method.
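One caveat with plain dicts: hasattr() and getattr() look at attributes, not keys, so a checker written like the one in the question will not see dict entries. A fixture built from types.SimpleNamespace behaves more like a real Config object. The sketch below shows the setUp() idea under that assumption; `check_required` is a hypothetical, cut-down stand-in for perform_checks, and the HOST/PORT values are made up.

```python
import copy
import unittest
from types import SimpleNamespace


class MissingOption(Exception):
    pass


def check_required(config, required=("HOST", "PORT")):
    """Hypothetical, cut-down stand-in for perform_checks."""
    for name in required:
        if not hasattr(config.flask, name):
            raise MissingOption(f"flask.{name} is required")


class SharedFixtureTest(unittest.TestCase):
    def setUp(self):
        # A valid baseline; each test copies it and breaks exactly one thing.
        self.valid = SimpleNamespace(
            flask=SimpleNamespace(HOST="localhost", PORT=5000)
        )

    def test_valid_config_passes(self):
        check_required(self.valid)  # should not raise

    def test_missing_port_is_reported(self):
        broken = copy.deepcopy(self.valid)
        del broken.flask.PORT
        with self.assertRaises(MissingOption):
            check_required(broken)
```

Because setUp() runs before every test method, each test starts from a fresh valid fixture.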
I have multiple functions stored in different files. Both the file names and the function names are stored in lists. Is there a way to call the required function without conditional statements?
For example, file1 has functions function11 and function12,
def function11():
    pass
def function12():
    pass
file2 has functions function21 and function22
def function21():
    pass
def function22():
    pass
and I have the lists
file_name = ["file1", "file2", "file1"]
function_name = ["function12", "function22", "function12"]
I will get the list index from a different function; based on that index I need to call the function and get its output.
If the other function will give you a list index directly, then you don't need to deal with the function names as strings. Instead, directly store (without calling) the functions in the list:
import file1, file2
functions = [file1.function12, file2.function22, file1.function12]
And then call them once you have the index:
functions[index]()
There are ways to do what is called "reflection" in Python and get from the string to a matching-named function. But they solve a problem that is more advanced than what you describe, and they are more difficult (especially if you also have to work with the module names).
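For completeness, this is roughly what the reflection route looks like with the standard library. It is a sketch only: `call_by_name` is a made-up helper, and the standard modules math and os.path stand in for file1 and file2 so the snippet can run on its own.

```python
import importlib


def call_by_name(module_name, function_name, *args, **kwargs):
    """Import module_name and call the function named function_name in it."""
    module = importlib.import_module(module_name)
    func = getattr(module, function_name)  # AttributeError if it is missing
    if not callable(func):
        raise TypeError(f"{module_name}.{function_name} is not callable")
    return func(*args, **kwargs)


# Stand-ins for the lists in the question (file1/file2 replaced by stdlib modules)
file_name = ["math", "os.path", "math"]
function_name = ["sqrt", "basename", "floor"]

index = 1
print(call_by_name(file_name[index], function_name[index], "/tmp/report.txt"))
```

Note that this will import and call anything the strings name, which is exactly why the approaches below restrict what may be looked up.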
If you have a "whitelist" of functions and modules that are allowed to be called from the config file, but still need to find them by string, you can explicitly create the mapping with a dict:
allowed_functions = {
    'file1': {
        'function11': file1.function11,
        'function12': file1.function12
    },
    'file2': {
        'function21': file2.function21,
        'function22': file2.function22
    }
}
And then invoke the function:
try:
    func = allowed_functions[module_name][function_name]
except KeyError:
    raise ValueError("this function/module name is not allowed")
else:
    func()
The most advanced approach is if you need to load code from a "plugin" module created by the author. You can use the standard library importlib package to use the string name to find a file to import as a module, and import it dynamically. It looks something like:
import os
from importlib.util import spec_from_file_location, module_from_spec

# Look for the file at the specified path, figure out the module name
# from the base file name, import it and make a module object.
def load_module(path):
    folder, filename = os.path.split(path)
    basename, extension = os.path.splitext(filename)
    spec = spec_from_file_location(basename, path)
    module = module_from_spec(spec)
    spec.loader.exec_module(module)
    assert module.__name__ == basename
    return module
This is still unsafe, in the sense that it can look anywhere on the file system for the module. Better if you specify the folder yourself, and only allow a filename to be used in the config file; but then you still have to protect against hacking the path by using things like ".." and "/" in the "filename".
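A rough sketch of that kind of guard, using pathlib. `resolve_plugin` and its arguments are hypothetical names, and this is not a complete defence; it only covers the ".." and "/" tricks mentioned above plus symlinks that point out of the folder.

```python
from pathlib import Path


def resolve_plugin(plugin_dir, filename):
    """Return the path of filename inside plugin_dir, or raise ValueError."""
    # Reject anything that is not a bare file name ("", "..", "a/b.py", ...).
    if filename in ("", ".", "..") or Path(filename).name != filename:
        raise ValueError(f"not a plain file name: {filename!r}")
    base = Path(plugin_dir).resolve()
    candidate = (base / filename).resolve()
    # After resolving symlinks the file must still sit directly in base.
    if candidate.parent != base:
        raise ValueError(f"{filename!r} escapes {base}")
    return candidate
```

The returned path can then be handed to load_module(); anything that raises ValueError never reaches the import machinery.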
(I have a project that does something like this. It chooses the paths from a whitelist that is also under the user's control, so I have to warn my users not to trust the path-whitelist file from each other. I also search the directories for modules, and then make a whitelist of plugins that may be used, based only on plugins that are in the directory - so no funny games with "..". And I'm still worried I forgot something.)
Once you have a module name, you can get a function from it by name like:
dynamic_module = load_module(some_path)
try:
    func = getattr(dynamic_module, function_name)
except AttributeError:
    raise ValueError("function not in module")
At any rate, there is no reason to eval anything, or generate and import code based on user input. That is most unsafe of all.
Another alternative. This is not much safer than an eval() however.
Someone with access to the lists you read from the config file could inject malicious code in the lists you import.
I.e.
'from subprocess import call; subprocess.call(["rm", "-rf", "./*" stdout=/dev/null, stderr=/dev/null, shell=True)'
Code:
import re
# You must first create a directory named "test_module"
# You can do this with code if needed.
# Python recognizes a "module" as a module by the existence of an __init__.py
# It will load that __init__.py at the "import" command, and you can access the methods it imports
m = ["os", "sys", "subprocess"] # Modules to import from
f = ["getcwd", "exit", "call; call('do', '---terrible-things')"] # Methods to import
# Create an __init__.py
with open("./test_module/__init__.py", "w") as FH:
    for count in range(0, len(m), 1):
        # Writes "from module import method" to __init__.py
        line = "from {} import {}\n".format(m[count], f[count])
        # !!!! SANITIZE THE LINE !!!!!
        if not re.match("^from [a-zA-Z0-9._]+ import [a-zA-Z0-9._]+$", line):
            print("The line '{}' is suspicious. Will not be entered into __init__.py!!".format(line))
            continue
        FH.write(line)
import test_module
print(test_module.getcwd())
OUTPUT:
The line 'from subprocess import call; call('do', '---terrible-things')' is suspicious. Will not be entered into __init__.py!!
/home/rightmire/eclipse-workspace/junkcode
I'm not 100% sure I'm understanding the need. Maybe more detail in the question.
Is something like this what you're looking for?
import os
import sys
m = ["os"]
f = ["getcwd"]
command = ''.join([m[0], ".", f[0], "()"])
# Put in some minimum sanity checking and sanitization!!!
# (extend this test with any other strings you consider dangerous)
if ";" in command:
    print("The line '{}' is suspicious. Will not run".format(command))
    sys.exit(1)
print("This will error if the method isnt imported...")
print(eval(command))
OUTPUT:
This will error if the method isnt imported...
/home/rightmire/eclipse-workspace/junkcode
As pointed out by @KarlKnechtel, having commands come in from an external file is a gargantuan security risk!
I've been away from Python for a while, and I'm having a problem reading a .ini file with interpolated values.
This is my ini file:
[DEFAULT]
home=$HOME
test_home=$home
[test]
test_1=$test_home/foo.csv
test_2=$test_home/bar.csv
These lines
from ConfigParser import SafeConfigParser
parser = SafeConfigParser()
parser.read('config.ini')
print parser.get('test', 'test_1')
output
$test_home/foo.csv
while I'm expecting
/Users/nkint/foo.csv
EDIT:
I supposed that the $ syntax was implicitly included in the so called string interpolation (referring to the manual):
On top of the core functionality, SafeConfigParser supports
interpolation. This means values can contain format strings which
refer to other values in the same section, or values in a special
DEFAULT section.
But I'm wrong. How to handle this case?
First of all, according to the documentation you should use %(test_home)s to interpolate test_home. Moreover, keys are case-insensitive, so you can't use both HOME and home as keys. Finally, you can use SafeConfigParser(os.environ) to take your environment into account.
from ConfigParser import SafeConfigParser
import os
parser = SafeConfigParser(os.environ)
parser.read('config.ini')
Where config.ini is
[DEFAULT]
test_home=%(HOME)s
[test]
test_1=%(test_home)s/foo.csv
test_2=%(test_home)s/bar.csv
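On Python 3 the same idea can be sketched with configparser.ConfigParser and its defaults argument. Two things here are assumptions rather than part of the answer above: the helper name `load_with_env`, and the escaping of "%" in environment values, which is added because a literal "%" in a default value is treated as interpolation syntax. A fixed dict is used in place of os.environ so the result is predictable.

```python
import configparser


def load_with_env(ini_text, env):
    """Parse ini_text with the given environment mapping as DEFAULT values."""
    # Escape literal '%' so environment values are not read as interpolation.
    defaults = {key: value.replace("%", "%%") for key, value in env.items()}
    parser = configparser.ConfigParser(defaults=defaults)
    parser.read_string(ini_text)
    return parser


INI = """
[DEFAULT]
test_home = %(HOME)s

[test]
test_1 = %(test_home)s/foo.csv
test_2 = %(test_home)s/bar.csv
"""

# In real use, pass os.environ instead of this hand-written mapping.
parser = load_with_env(INI, {"HOME": "/Users/nkint", "DISCOUNT": "100%"})
print(parser.get("test", "test_1"))
```

Option names are lower-cased on the way in, which is why %(HOME)s finds the value supplied under the key "HOME".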
You can write custom interpolation in case of Python 3:
import configparser
import os
class EnvInterpolation(configparser.BasicInterpolation):
    """Interpolation which expands environment variables in values."""

    def before_get(self, parser, section, option, value, defaults):
        value = super().before_get(parser, section, option, value, defaults)
        return os.path.expandvars(value)
cfg = """
[section1]
key = value
my_path = $PATH
"""
config = configparser.ConfigParser(interpolation=EnvInterpolation())
config.read_string(cfg)
print(config['section1']['my_path'])
If you want to expand some environment variables, you can do so using os.path.expandvars before parsing a StringIO stream:
import ConfigParser
import os
import StringIO
with open('config.ini', 'r') as cfg_file:
    cfg_txt = os.path.expandvars(cfg_file.read())
config = ConfigParser.ConfigParser()
config.readfp(StringIO.StringIO(cfg_txt))
The trick for proper variable substitution from the environment is to use the ${} syntax for the environment variables:
[DEFAULT]
test_home=${HOME}
[test]
test_1=%(test_home)s/foo.csv
test_2=%(test_home)s/bar.csv
ConfigParser.get values are strings, even if you set values as integer or True. But ConfigParser has getint, getfloat and getboolean.
settings.ini
[default]
home=/home/user/app
tmp=%(home)s/tmp
log=%(home)s/log
sleep=10
debug=True
config reader
>>> from ConfigParser import SafeConfigParser
>>> parser = SafeConfigParser()
>>> parser.read('/home/user/app/settings.ini')
['/home/user/app/settings.ini']
>>> parser.get('default', 'home')
'/home/user/app'
>>> parser.get('default', 'tmp')
'/home/user/app/tmp'
>>> parser.getint('default', 'sleep')
10
>>> parser.getboolean('default', 'debug')
True
Edit
Indeed, you can get values from environment variables if you initialize SafeConfigParser with os.environ. Thanks to Michele for that answer.
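For readers on Python 3, a minimal sketch of the same typed getters with the configparser module. The settings text mirrors the file above; the `retries` option is made up to show the fallback argument.

```python
import configparser

SETTINGS = """
[default]
home = /home/user/app
tmp = %(home)s/tmp
sleep = 10
debug = True
"""

parser = configparser.ConfigParser()
parser.read_string(SETTINGS)
section = parser["default"]

tmp_dir = section["tmp"]                         # plain string, interpolated
sleep_seconds = section.getint("sleep")          # converted to int
debug = section.getboolean("debug")              # converted to bool
retries = section.getint("retries", fallback=3)  # option absent, so fallback

print(tmp_dir, sleep_seconds, debug, retries)
```

Note that the lowercase section name "default" is an ordinary section here; only the uppercase DEFAULT section is special.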
Quite late, but maybe it can help someone else looking for the same answers that I had recently. Also, one of the comments was how to fetch Environment variables and values from other sections. Here is how I deal with both converting environment variables and multi-section tags when reading in from an INI file.
INI FILE:
[PKG]
# <VARIABLE_NAME>=<VAR/PATH>
PKG_TAG = Q1_RC1
[DELIVERY_DIRS]
# <DIR_VARIABLE>=<PATH>
NEW_DELIVERY_DIR=${DEL_PATH}\ProjectName_${PKG:PKG_TAG}_DELIVERY
Python Class that uses the ExtendedInterpolation so that you can use the ${PKG:PKG_TAG} type formatting. I add the ability to convert the windows environment vars when I read in INI to a string using the builtin os.path.expandvars() function such as ${DEL_PATH} above.
import os
import configparser
from datetime import datetime


class ConfigParser(object):
    def __init__(self):
        """
        initialize the file parser with
        ExtendedInterpolation to use ${Section:option} format
        [Section]
        option=variable
        """
        # Use the qualified name: this class shadows configparser.ConfigParser
        self.config_parser = configparser.ConfigParser(
            interpolation=configparser.ExtendedInterpolation()
        )

    def read_ini_file(self, file='./config.ini'):
        """
        Parses in the passed in INI file and converts any Windows environ vars.
        :param file: INI file to parse
        :return: void
        """
        # Expands Windows environment variable paths
        with open(file, 'r') as cfg_file:
            cfg_txt = os.path.expandvars(cfg_file.read())
        # Parses the expanded config string
        self.config_parser.read_string(cfg_txt)

    def get_config_items_by_section(self, section):
        """
        Retrieves the configurations for a particular section
        :param section: INI file section
        :return: a list of name, value pairs for the options in the section
        """
        return self.config_parser.items(section)

    def get_config_val(self, section, option):
        """
        Get an option value for the named section.
        :param section: INI section
        :param option: option tag for desired value
        :return: Value of option tag
        """
        return self.config_parser.get(section, option)

    @staticmethod
    def get_date():
        """
        Sets up a date formatted string.
        :return: Date string
        """
        return datetime.now().strftime("%Y%b%d")

    def prepend_date_to_var(self, sect, option):
        """
        Function that allows the ability to prepend a
        date to a section variable.
        :param sect: INI section to look for variable
        :param option: INI search variable under INI section
        :return: Void - Date is prepended to variable string in INI
        """
        if self.config_parser.get(sect, option):
            var = self.config_parser.get(sect, option)
            var_with_date = var + '_' + self.get_date()
            self.config_parser.set(sect, option, var_with_date)
Based on @alex-markov's answer (and code) and @srand9's comment, the following solution works with environment variables and cross-section references.
Note that the interpolation is now based on ExtendedInterpolation to allow cross-sections references and on before_read instead of before_get.
#!/usr/bin/env python3
import configparser
import os
class EnvInterpolation(configparser.ExtendedInterpolation):
    """Interpolation which expands environment variables in values."""

    def before_read(self, parser, section, option, value):
        value = super().before_read(parser, section, option, value)
        return os.path.expandvars(value)
cfg = """
[paths]
foo : ${HOME}
[section1]
key = value
my_path = ${paths:foo}/path
"""
config = configparser.ConfigParser(interpolation=EnvInterpolation())
config.read_string(cfg)
print(config['section1']['my_path'])
It seems that in the latest version, 3.5.0, ConfigParser does not read environment variables, so I ended up providing a custom interpolation based on BasicInterpolation.
import os
from configparser import (
    MAX_INTERPOLATION_DEPTH,
    BasicInterpolation,
    InterpolationDepthError,
    InterpolationMissingOptionError,
    InterpolationSyntaxError,
)


class EnvInterpolation(BasicInterpolation):
    """Interpolation as implemented in the classic ConfigParser,
    plus it checks if the variable is provided as an environment one in uppercase.
    """

    def _interpolate_some(self, parser, option, accum, rest, section, map,
                          depth):
        rawval = parser.get(section, option, raw=True, fallback=rest)
        if depth > MAX_INTERPOLATION_DEPTH:
            raise InterpolationDepthError(option, section, rawval)
        while rest:
            p = rest.find("%")
            if p < 0:
                accum.append(rest)
                return
            if p > 0:
                accum.append(rest[:p])
                rest = rest[p:]
            # p is no longer used
            c = rest[1:2]
            if c == "%":
                accum.append("%")
                rest = rest[2:]
            elif c == "(":
                m = self._KEYCRE.match(rest)
                if m is None:
                    raise InterpolationSyntaxError(option, section,
                        "bad interpolation variable reference %r" % rest)
                var = parser.optionxform(m.group(1))
                rest = rest[m.end():]
                try:
                    # Look in the environment first, then in the parser's map
                    v = os.environ.get(var.upper())
                    if v is None:
                        v = map[var]
                except KeyError:
                    raise InterpolationMissingOptionError(option, section, rawval, var) from None
                if "%" in v:
                    self._interpolate_some(parser, option, accum, v,
                                           section, map, depth + 1)
                else:
                    accum.append(v)
            else:
                raise InterpolationSyntaxError(
                    option, section,
                    "'%%' must be followed by '%%' or '(', "
                    "found: %r" % (rest,))
The difference between the BasicInterpolation and the EnvInterpolation is in:
v = os.environ.get(var.upper())
if v is None:
    v = map[var]
where I try to find the variable in the environment before checking the map.
Below is a simple solution that:
- can use a default value if no environment variable is provided
- overrides variables with environment variables (if found)
- needs no custom interpolation implementation
Example:
my_config.ini
[DEFAULT]
HOST=http://www.example.com
CONTEXT=${HOST}/auth/
token_url=${CONTEXT}/oauth2/token
ConfigParser:
import os
import configparser
config = configparser.ConfigParser(interpolation=configparser.ExtendedInterpolation())
ini_file = os.path.join(os.path.dirname(__file__), 'my_config.ini')
# replace variables with environment variables(if exists) before loading ini file
with open(ini_file, 'r') as cfg_file:
    cfg_env_txt = os.path.expandvars(cfg_file.read())
config.read_string(cfg_env_txt)
print(config['DEFAULT']['token_url'])
Result:
- If no environment variable $HOST or $CONTEXT is present, the config falls back to the default value from the ini file.
- The user can override the default value by setting the $HOST or $CONTEXT environment variable.
- Works well with Docker containers.
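The fallback works because os.path.expandvars leaves references to unset variables untouched, so ExtendedInterpolation later resolves them from the file itself. A small sketch of both cases follows; the variable name DEMO_HOST and the helper `read_token_url` are made up so the snippet does not depend on anything in your real environment.

```python
import configparser
import os

INI_TEXT = """
[DEFAULT]
DEMO_HOST = http://www.example.com
token_url = ${DEMO_HOST}/oauth2/token
"""


def read_token_url(ini_text):
    # expandvars substitutes variables that exist in the environment and
    # leaves unknown ${NAME} references in place for the parser to resolve.
    expanded = os.path.expandvars(ini_text)
    parser = configparser.ConfigParser(
        interpolation=configparser.ExtendedInterpolation()
    )
    parser.read_string(expanded)
    return parser["DEFAULT"]["token_url"]


os.environ.pop("DEMO_HOST", None)
print(read_token_url(INI_TEXT))   # value comes from the ini file

os.environ["DEMO_HOST"] = "http://localhost:8080"
print(read_token_url(INI_TEXT))   # value comes from the environment
```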
Given a string with a module name, how do you import everything in the module as if you had called:
from module import *
i.e. given string S="module", how does one get the equivalent of the following:
__import__(S, fromlist="*")
This doesn't seem to perform as expected (as it doesn't import anything).
Please reconsider. The only thing worse than import * is magic import *.
If you really want to:
m = __import__(S)
try:
    attrlist = m.__all__
except AttributeError:
    attrlist = dir(m)
for attr in attrlist:
    globals()[attr] = getattr(m, attr)
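A variation on the same idea, sketched with importlib: it copies the names into a dictionary you pass in rather than straight into globals(), and when `__all__` is missing it skips names starting with an underscore, which is the rule `from module import *` itself follows. `star_import` is a made-up helper name, and the standard string module is used only as a convenient example.

```python
import importlib


def star_import(module_name, namespace):
    """Copy the public names of module_name into namespace and return it."""
    module = importlib.import_module(module_name)
    names = getattr(module, "__all__", None)
    if names is None:
        # Same rule "from module import *" applies when __all__ is absent.
        names = [name for name in vars(module) if not name.startswith("_")]
    for name in names:
        namespace[name] = getattr(module, name)
    return namespace


settings = star_import("string", {})
print(settings["ascii_lowercase"])
```

Passing globals() as the namespace reproduces the original behaviour. Unlike `__import__`, importlib.import_module returns the submodule itself for a dotted name such as "package.module".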
Here's my solution for dynamic naming of local settings files for Django. Note the addition below of a check to not include attributes containing '__' from the imported file. The __name__ global was being overwritten with the module name of the local settings file, which caused setup_environ(), used in manage.py, to have problems.
import socket
import sys

try:
    HOSTNAME = socket.gethostname().replace('.', '_')
    # See http://docs.python.org/library/functions.html#__import__
    m = __import__(name="settings_%s" % HOSTNAME, globals=globals(), locals=locals(), fromlist="*")
    try:
        attrlist = m.__all__
    except AttributeError:
        attrlist = dir(m)
    for attr in [a for a in attrlist if '__' not in a]:
        globals()[attr] = getattr(m, attr)
except ImportError:
    sys.stderr.write('Unable to read settings_%s.py\n' % HOSTNAME)
    sys.exit(1)
The underlying problem is that I am developing some Django, but on more than one host (with colleagues), all with different settings. I was hoping to do something like this in the project/settings.py file:
from platform import node
settings_files = { 'BMH.lan': 'settings_bmh', ... }
__import__( settings_files[ node() ] )
It seemed a simple (and thus elegant) solution, but I agree that it has a smell to it, and the simplicity goes out the window when you have to use logic like what John Millikin posted (thanks). Here's essentially the solution I went with:
from platform import node
from settings_global import *
n = node()
if n == 'BMH.lan':
    from settings_bmh import *
# add your own, here...
else:
    raise Exception("No host settings for '%s'. See settings.py." % node())
Which works fine for our purposes.
It appears that you can also use dict.update() on module's dictionaries in your case:
config = [__import__(name) for name in names_list]
options = {}
for conf in config:
    options.update(conf.__dict__)
Update: I think there's a short "functional" version of it:
# dict.update returns None, so merge vars(module) into a new dict instead
options = reduce(lambda acc, mod: dict(acc, **vars(mod)), map(__import__, names_list), {})
I didn't find a good way to do it so I took a simpler but ugly way from http://www.djangosnippets.org/snippets/600/
try:
    import socket
    hostname = socket.gethostname().replace('.', '_')
    exec("from host_settings.%s import *" % hostname)
except ImportError:
    raise