I need to record logs in many places in my Django application, and I want to do it in a standardized way so that when people contribute they don't have to worry too much about it.
So instead of calling:
logger.info(<standardized message here>)
I would like to write a function, or probably a class that inherits from Logger, that selects the correct .info, .warning, .error and so on and outputs the message the way we want.
So in the code I would only call
log(request, 200, "info", "Success message")
And the function I wrote looks like this:
logger = logging.getLogger(__name__)

def log(request=None, status_code=None, level=None, message=None):
    """
    Create a log stream in a standardized format:
    <request_path> <request_method> <status_code>: user <user_id> <message>
    """
    method = request.method
    path = request.path
    user_id = None
    if not request.user.is_anonymous:
        user_id = request.user.id
    log = f"{path} {method} {status_code}: user {user_id} {message}"
    if level == 'info':
        logger_type = logger.info
    elif level == 'warning':
        logger_type = logger.warning
    elif level == 'error':
        logger_type = logger.error
    elif level == 'debug':
        logger_type = logger.debug
    return logger_type(log)
The problem is that our log formatter also records the line in the code where the log call happened, and because the actual call happens inside the log function, the recorded line always reflects return logger_type(log) instead of the place where log() was called.
How can I write a class with a method to do the automatic selection of log level based on the input and still stream the log from the line where it happened?
I want to have it in a standardized way
If you really want to do things in a standardized way, then use what the logging module gives you and don't create your own helper functions.
The logging module already lets you pass in "extra" arguments for the logging message (see the docs):
logger.info("success message", extra={"request": request, "status_code": 200})
You can then put your logic to extract parts of the request into a formatter object.
As well as following the standard, this will give you additional flexibility, as your formatter can do different things based on what extra parameters it gets.
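For illustration, a minimal formatter along these lines could look like the following. This is a sketch, not the only way to do it: it assumes a Django request arrives via extra= and it mutates record.msg, which is acceptable when a single handler formats the record.

import logging

class RequestFormatter(logging.Formatter):
    """Prepend request details when a record carries them via extra=."""
    def format(self, record):
        request = getattr(record, "request", None)
        if request is not None:
            status_code = getattr(record, "status_code", "-")
            user_id = None if request.user.is_anonymous else request.user.id
            record.msg = (f"{request.path} {request.method} {status_code}: "
                          f"user {user_id} {record.msg}")
        return super().format(record)

handler = logging.StreamHandler()
handler.setFormatter(RequestFormatter("%(levelname)s %(message)s"))
logging.getLogger(__name__).addHandler(handler)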
One thing that the Python logging module is missing that other loggers have is the idea of a thread-local "context" object. You could use such an object to preserve the request at the time it's processed, so that it's available in all log messages without being passed in explicitly. You could implement this by adding a request_context dictionary to the current thread, and looking for it in your formatter.
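One thinkable shape for that, as a sketch with invented names (a logging.Filter is used here to attach the context, since filters attached to a handler or logger see every record):

import logging
import threading

_local = threading.local()

def set_request_context(request):
    # call this when request processing starts, e.g. from a middleware
    _local.request = request

class RequestContextFilter(logging.Filter):
    """Copy the current thread's request onto every log record."""
    def filter(self, record):
        record.request = getattr(_local, "request", None)
        return True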
If you would like to stick with your helper function approach, the logging methods accept a stacklevel keyword argument that specifies the number of stack frames to skip when computing the line number and function name (see the reference). The default is 1, so setting it to 2 looks at the frame above your helper function. For example:
import logging

logging.basicConfig(format="%(lineno)d")
logger = logging.getLogger(__name__)

def log():
    logger.error("message", stacklevel=2)

log()
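Folding that back into the helper from the question, a sketch (stacklevel requires Python 3.8+, and getattr replaces the if/elif chain):

import logging

logger = logging.getLogger(__name__)

def log(request, status_code, level, message):
    """<request_path> <request_method> <status_code>: user <user_id> <message>"""
    user_id = None if request.user.is_anonymous else request.user.id
    line = f"{request.path} {request.method} {status_code}: user {user_id} {message}"
    # stacklevel=2 makes %(lineno)d point at the caller of log()
    getattr(logger, level)(line, stacklevel=2)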
I'm trying to figure out how to create unit tests for a function whose behavior depends on a third-party service.
Suppose a function like this:
def sync_check():
    delta_secs = 90
    now = datetime.datetime.now().utcnow()
    res = requests.get('<url>')
    alert = SlackAlert()
    last_value = res[-1]['date']  # Last element of the array is the most recent
    secs = (now - last_value).seconds
    if secs >= delta_secs:
        alert.notify("out of sync. Delay: {} seconds".format(secs))
    else:
        alert.notify('in sync')
What's the best practice for writing unit tests for this function? I need to test both the if and the else branches, but this depends on the third-party service.
The first thing that comes to my mind is to create a fake web server and point to that one (changing the url), but this way the codebase would include testing logic, like:
if test:
    url = <mock_web_server_url>
else:
    url = <third_party_service_url>
Moreover, unit testing would trigger Slack alerts, which shouldn't happen.
So there I should change the codebase again, like:
if secs >= delta_secs:
    if test:
        logging.debug("out of sync alert sent - testing mode")
    else:
        alert.notify("out of sync. Delay: {} seconds".format(secs))
else:
    if test:
        logging.debug("in sync alert sent - testing mode")
    else:
        alert.notify('in sync')
Which I don't really like.
Am I missing any design to solve this problem?
Check out Dependency Injection to test code that depends on third party services, without having to check whether you're running in test mode, like in your example. The basic idea is to have the slack alert service be an argument of your function, so for unit testing you can use a fake service that acts the way you want it to for each test.
Your code would end up looking something like this:
def sync_check(alert):
    delta_secs = 90
    now = datetime.datetime.now().utcnow()
    res = requests.get('<url>')
    last_value = res[-1]['date']  # Last element of the array is the most recent
    secs = (now - last_value).seconds
    if secs >= delta_secs:
        alert.notify("out of sync. Delay: {} seconds".format(secs))
    else:
        alert.notify('in sync')
and in a test case, you could have your alert object be something as simple as:
class TestAlert:
    def __init__(self):
        self.message = None

    def notify(self, message):
        self.message = message
You could then test your function by passing in an instance of your TestAlert class, and check the logged output if you want to by accessing the message attribute. This code would not access any third-party services.
def test_sync_check():
    alert = TestAlert()
    sync_check(alert)
    assert alert.message == 'in sync'
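The same idea covers the HTTP call, so the test never needs a fake web server or a test flag: inject the thing that fetches the data as well. A sketch, reusing TestAlert from above; fetch_events is an invented parameter name:

import datetime

def sync_check(alert, fetch_events):
    delta_secs = 90
    now = datetime.datetime.utcnow()
    res = fetch_events()  # in production e.g. lambda: requests.get('<url>').json()
    last_value = res[-1]['date']  # last element is the most recent
    secs = (now - last_value).seconds
    if secs >= delta_secs:
        alert.notify("out of sync. Delay: {} seconds".format(secs))
    else:
        alert.notify('in sync')

def test_sync_check_out_of_sync():
    alert = TestAlert()
    stale = datetime.datetime.utcnow() - datetime.timedelta(seconds=120)
    sync_check(alert, lambda: [{'date': stale}])
    assert alert.message.startswith('out of sync')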
I'm trying to do a home assignment connected with Python, from Data Manipulation at Scale: Systems and Algorithms at Coursera. Generally I have problems with understanding the base code, which was presented as an example of a MapReduce algorithm. I would be grateful for help with understanding it in 2 places, details below.
I tried to go step by step through the code flow of the two files below after running the command:
python wordcount.py 'data/books.json'
1. The file wordcount.py is opened.
2. mr = MapReduce.MapReduce() - an mr object is created.
3. The def __init__(self): part from MapReduce.py is executed.
4. We come back to wordcount.py.
5. The functions def mapper(record): and def reducer(key, list_of_values): are created, but for the time being without execution.
6. Python goes to if __name__ == '__main__':
7. inputdata = open(sys.argv[1]) - the json file is assigned to a variable.
8. mr.execute(inputdata, mapper, reducer) - a call to the function from MapReduce.py.
And here is my first question: we haven't defined a mapper or reducer variable/object so far. Is it just null/no value passed to this function, or did we somehow define these variables before and I missed it?
Later we move to def execute(self, data, mapper, reducer): in MapReduce.py.
And there we have mapper(record). So this is a reference to a function in wordcount.py, am I right? But if we have a reference to a function in a different file, shouldn't we use import at the beginning of the file and define which file this function came from?
(...) further code execution
wordcount.py file:
import MapReduce
import sys

"""
Word Count Example in the Simple Python MapReduce Framework
"""

mr = MapReduce.MapReduce()

# =============================
# Do not modify above this line

def mapper(record):
    # key: document identifier
    # value: document contents
    key = record[0]
    value = record[1]
    words = value.split()
    for w in words:
        mr.emit_intermediate(w, 1)

def reducer(key, list_of_values):
    # key: word
    # value: list of occurrence counts
    total = 0
    for v in list_of_values:
        total += v
    mr.emit((key, total))

# Do not modify below this line
# =============================
if __name__ == '__main__':
    inputdata = open(sys.argv[1])
    mr.execute(inputdata, mapper, reducer)
MapReduce.py file:
import json

class MapReduce:
    def __init__(self):
        self.intermediate = {}
        self.result = []

    def emit_intermediate(self, key, value):
        self.intermediate.setdefault(key, [])
        self.intermediate[key].append(value)

    def emit(self, value):
        self.result.append(value)

    def execute(self, data, mapper, reducer):
        for line in data:
            record = json.loads(line)
            mapper(record)

        for key in self.intermediate:
            reducer(key, self.intermediate[key])

        #jenc = json.JSONEncoder(encoding='latin-1')
        jenc = json.JSONEncoder()
        for item in self.result:
            print(jenc.encode(item))
Thank you in advance for help with that.
In Python everything is an object, and that includes functions, so you can pass functionA as an argument to another functionB (or to a class, or wherever). If functionB expects you to do that, it will assume you gave it a function with the right signature and proceed as normal.
In your case
mr.execute(inputdata, mapper, reducer)
here mapper and reducer are the functions defined earlier, passed as arguments to the execute method of mr, an instance of the class MapReduce. As you can see, that method uses them as the functions it expects.
Thanks to this you can, as that code shows, write generic code that performs some computation and can be reused in a similar way by many applications, by giving users the option of supplying their own functions.
A much more generic example of this is the built-in function map: it receives a function that does something; map doesn't care what it does or where it comes from, only that it accepts as many arguments as the iterables map itself receives, and that it returns a value used to build a new list with the results.
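A minimal illustration of the same idea, with invented names:

def shout(text):
    return text.upper() + "!"

def greet(style):
    # 'style' is whatever function the caller passed in; no import is
    # needed, because the function object itself travels with the call
    print(style("hello"))

greet(shout)  # prints: HELLO!

This also answers the import question: wordcount.py never needs an extra import, because execute receives the function objects themselves as arguments.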
I'm using ZODB as a persistent storage for objects that are going to be modified through a webservice.
Below is an example to which I reduced the issue.
The increment function is what is called from multiple threads.

My problem is that when increment is called simultaneously from two threads, for different keys, I get a ConflictError.

I imagine it should be possible to resolve this in a proper way, at least as long as different keys are modified?
If so, I didn't manage to find an example of how to... (the ZODB documentation seems to be somewhat spread across different sites :/ )
Glad about any ideas...
import time
import transaction

from ZODB.FileStorage import FileStorage
from ZODB.DB import DB
from ZODB.POSException import ConflictError

def test_db():
    store = FileStorage('zodb_storage.fs')
    return DB(store)

db_test = test_db()

# app here is a flask-app
#app.route('/increment/<string:key>')
def increment(key):
    '''increment the value of a certain key'''
    # open connection
    conn = db_test.open()
    # get the current value:
    root = conn.root()
    val = root.get(key, 0)
    # calculate new value
    # in the real application this might take some seconds
    time.sleep(0.1)
    root[key] = val + 1
    try:
        transaction.commit()
        return '%s = %g' % (key, val)
    except ConflictError:
        transaction.abort()
        return 'ConflictError :-('
You have two options here: implement conflict resolution, or retry the commit with fresh data.
Conflict resolution only applies to custom types you store in the ZODB, and can only be applied if you know how to merge your change into the newly-changed state.
The ZODB looks for a _p_resolveConflict() method on custom types and calls that method with the old state, the saved state you are in conflict with, and the new state you tried to commit; you are supposed to return the merged state. For a simple counter, like in your example, that'd be as simple as updating the saved state with the change between the old and new states:
from persistent import Persistent

class Counter(Persistent):
    def __init__(self, start=0):
        self._count = start

    def increment(self):
        self._count += 1
        return self._count

    def _p_resolveConflict(self, old, saved, new):
        # default __getstate__ returns a dictionary of instance attributes
        saved['_count'] += new['_count'] - old['_count']
        return saved
The other option is to retry the commit. You want to limit the number of retries, and you probably want to encapsulate this in a decorator on your method, but the basic principle is that you loop up to a limit, make your calculations based on ZODB data (which, after a conflict error, will automatically be read fresh where needed), then attempt to commit. If the commit is successful, you are done:
max_retries = 10
retry = 0
conn = db_test.open()
root = conn.root()
while retry < max_retries:
    val = root.get(key, 0)
    time.sleep(0.1)
    root[key] = val + 1
    try:
        transaction.commit()
        return '%s = %g' % (key, val)
    except ConflictError:
        transaction.abort()  # clear the failed transaction before retrying
        retry += 1

raise CustomExceptionIndicatingTooManyRetries
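As mentioned, the retry loop can be lifted into a decorator. A rough sketch (retry_on_conflict is invented here, and the wrapped function must let the ConflictError escape rather than catching it itself):

import functools
import transaction
from ZODB.POSException import ConflictError

def retry_on_conflict(max_retries=10):
    """Re-run the wrapped function when its commit raises a ConflictError."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for _ in range(max_retries):
                try:
                    return func(*args, **kwargs)
                except ConflictError:
                    transaction.abort()  # clear the failed transaction, then retry
            raise RuntimeError('too many conflict retries')
        return wrapper
    return decorator

@retry_on_conflict(max_retries=10)
def increment(key):
    ...  # read, compute and commit, as in the question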
Is there a way in Python unittest to set the order in which test cases are run?
In my current TestCase class, some test cases have side effects that set conditions for the others to run properly. Now, I realize the proper way to do this is to use setUp() for all setup-related things, but I would like to implement a design where each successive test builds slightly more state that the next can use. I find this much more elegant.
class MyTest(TestCase):
    def test_setup(self):
        # Do something
        ...

    def test_thing(self):
        # Do something that depends on test_setup()
        ...
Ideally, I would like the tests to be run in the order they appear in the class. It appears that they run in alphabetical order.
Don't make them independent tests - if you want a monolithic test, write a monolithic test.
class Monolithic(TestCase):
    def step1(self):
        ...

    def step2(self):
        ...

    def _steps(self):
        for name in dir(self):  # dir() result is implicitly sorted
            if name.startswith("step"):
                yield name, getattr(self, name)

    def test_steps(self):
        for name, step in self._steps():
            try:
                step()
            except Exception as e:
                self.fail("{} failed ({}: {})".format(step, type(e), e))
If the test later starts failing and you want information on all failing steps instead of halting the test case at the first failed step, you can use the subtests feature: https://docs.python.org/3/library/unittest.html#distinguishing-test-iterations-using-subtests
(The subtest feature is available via unittest2 for versions prior to Python 3.4: https://pypi.python.org/pypi/unittest2 )
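For illustration, a subtest-based variant of test_steps might look like this (a sketch, not part of the original answer); each failing step is reported individually and the later steps still run:

def test_steps(self):
    for name, step in self._steps():
        with self.subTest(name):
            step()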
It's good practice to always write a monolithic test for such expectations. However, if you are a goofy dude like me, then you could simply write ugly-looking methods in alphabetical order, so that they are sorted from a to z, as mentioned in the Python documentation - unittest — Unit testing framework:
Note that the order in which the various test cases will be run is determined by sorting the test function names with respect to the built-in ordering for strings.
Example
def test_a_first():
    print("1")

def test_b_next():
    print("2")

def test_c_last():
    print("3")
From unittest — Unit testing framework, section Organizing test code:
Note: The order in which the various tests will be run is determined by sorting the test method names with respect to the built-in ordering for strings.
So just make sure test_setup's name has the smallest string value.
Note that you should not rely on this behavior; different test functions are supposed to be independent of the order of execution. See ncoghlan's answer above for a solution if you explicitly need an order.
Another way that I didn't see listed in any related questions: Use a TestSuite.
Another way to accomplish ordering is to add the tests to a unittest.TestSuite. The suite seems to respect the order in which the tests are added to it using suite.addTest(...). To do this:
Create one or more TestCase subclasses,
class FooTestCase(unittest.TestCase):
    def test_ten(self):
        print('Testing ten (10)...')

    def test_eleven(self):
        print('Testing eleven (11)...')

class BarTestCase(unittest.TestCase):
    def test_twelve(self):
        print('Testing twelve (12)...')

    def test_nine(self):
        print('Testing nine (09)...')
Create a callable test-suite generator that adds the tests in your desired order, adapted from the documentation and this question:
def suite():
    suite = unittest.TestSuite()
    suite.addTest(BarTestCase('test_nine'))
    suite.addTest(FooTestCase('test_ten'))
    suite.addTest(FooTestCase('test_eleven'))
    suite.addTest(BarTestCase('test_twelve'))
    return suite
Execute the test-suite, e.g.,
if __name__ == '__main__':
    runner = unittest.TextTestRunner(failfast=True)
    runner.run(suite())
For context, I had a need for this and wasn't satisfied with the other options. I settled on the above way of doing test ordering.
I didn't see this TestSuite method listed in any of the several "unit-test ordering" questions (e.g., this question and others covering execution order, changing the order, or test order).
I ended up with a simple solution that worked for me:
class SequentialTestLoader(unittest.TestLoader):
    def getTestCaseNames(self, testCaseClass):
        test_names = super().getTestCaseNames(testCaseClass)
        testcase_methods = list(testCaseClass.__dict__.keys())
        test_names.sort(key=testcase_methods.index)
        return test_names
And then
unittest.main(testLoader=utils.SequentialTestLoader())
A simple and flexible way is to assign a comparator function to unittest.TestLoader.sortTestMethodsUsing:
Function to be used to compare method names when sorting them in getTestCaseNames() and all the loadTestsFrom*() methods.
Minimal usage:
import unittest

class Test(unittest.TestCase):
    def test_foo(self):
        """ test foo """
        self.assertEqual(1, 1)

    def test_bar(self):
        """ test bar """
        self.assertEqual(1, 1)

if __name__ == "__main__":
    test_order = ["test_foo", "test_bar"]  # could be sys.argv
    loader = unittest.TestLoader()
    loader.sortTestMethodsUsing = lambda x, y: test_order.index(x) - test_order.index(y)
    unittest.main(testLoader=loader, verbosity=2)
Output:
test_foo (__main__.Test)
test foo ... ok
test_bar (__main__.Test)
test bar ... ok
Here's a proof of concept for running tests in source code order instead of the default lexical order (output is as above).
import inspect
import unittest

class Test(unittest.TestCase):
    def test_foo(self):
        """ test foo """
        self.assertEqual(1, 1)

    def test_bar(self):
        """ test bar """
        self.assertEqual(1, 1)

if __name__ == "__main__":
    test_src = inspect.getsource(Test)
    unittest.TestLoader.sortTestMethodsUsing = lambda _, x, y: (
        test_src.index(f"def {x}") - test_src.index(f"def {y}")
    )
    unittest.main(verbosity=2)
I used Python 3.8.0 in this post.
Tests which really depend on each other should be explicitly chained into one test.

Tests which require different levels of setup could also have their corresponding setUp() run enough setup; various ways are thinkable, one is sketched below.

Otherwise unittest handles the test classes and the test methods inside the test classes in alphabetical order by default (even when loader.sortTestMethodsUsing is None), because dir() is used internally, and its result is guaranteed to be sorted.

The latter behavior can be exploited for practicality, e.g. to have the latest-work tests run first and speed up the edit-test cycle. But that behavior should not be used to establish real dependencies. Consider that tests can be run individually via command-line options, etc.
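One way to layer the setup is plain subclassing, so each level's setUp() builds on the previous one. A sketch with hypothetical fixture helpers (create_user and create_order are invented):

import unittest

class WithUserTests(unittest.TestCase):
    def setUp(self):
        self.user = create_user()  # hypothetical fixture helper

class WithOrderTests(WithUserTests):
    def setUp(self):
        super().setUp()  # run the previous level's setup first
        self.order = create_order(self.user)  # hypothetical helper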
One approach can be to let those sub-tests not be treated as tests by the unittest module, by prefixing their names with _, and then building one test case which runs these sub-operations in the right order.
This is better than relying on the sorting order of the unittest module, as that might change tomorrow, and achieving a topological sort on the order would not be very straightforward.
An example of this approach, taken from here (Disclaimer: my own module), is below.
Here, the test case runs independent tests, such as checking for a table parameter not being set (test_table_not_set) or testing the primary key (test_primary_key), still in parallel, but a CRUD test makes sense only if done in the right order and with the state set by previous operations. Hence those tests have been made separate units, but not tests on their own. Another test (test_CRUD) then builds the right order of those operations and tests them.
import os
import sqlite3
import unittest

from sql30 import db

DB_NAME = 'review.db'

class Reviews(db.Model):
    TABLE = 'reviews'
    PKEY = 'rid'
    DB_SCHEMA = {
        'db_name': DB_NAME,
        'tables': [
            {
                'name': TABLE,
                'fields': {
                    'rid': 'uuid',
                    'header': 'text',
                    'rating': 'int',
                    'desc': 'text'
                },
                'primary_key': PKEY
            }]
        }
    VALIDATE_BEFORE_WRITE = True

class ReviewTest(unittest.TestCase):

    def setUp(self):
        if os.path.exists(DB_NAME):
            os.remove(DB_NAME)

    def test_table_not_set(self):
        """
        Tests for raise of assertion when table is not set.
        """
        db = Reviews()
        try:
            db.read()
        except Exception as err:
            self.assertIn('No table set for operation', str(err))

    def test_primary_key(self):
        """
        Ensures the primary key is honored.
        """
        db = Reviews()
        db.table = 'reviews'
        db.write(rid=10, rating=5)
        try:
            db.write(rid=10, rating=4)
        except sqlite3.IntegrityError as err:
            self.assertIn('UNIQUE constraint failed', str(err))

    def _test_CREATE(self):
        db = Reviews()
        db.table = 'reviews'
        # backward compatibility for 'write' API
        db.write(tbl='reviews', rid=1, header='good thing', rating=5)

        # New API with 'create'
        db.create(tbl='reviews', rid=2, header='good thing', rating=5)

        # Backward compatibility for 'write' API, without tbl
        # explicitly passed
        db.write(rid=3, header='good thing', rating=5)

        # New API with 'create', without table name explicitly passed.
        db.create(rid=4, header='good thing', rating=5)

        db.commit()  # Save the work.

    def _test_READ(self):
        db = Reviews()
        db.table = 'reviews'

        rec1 = db.read(tbl='reviews', rid=1, header='good thing', rating=5)
        rec2 = db.read(rid=1, header='good thing')
        rec3 = db.read(rid=1)

        self.assertEqual(rec1, rec2)
        self.assertEqual(rec2, rec3)

        recs = db.read()  # Read all
        self.assertEqual(len(recs), 4)

    def _test_UPDATE(self):
        db = Reviews()
        db.table = 'reviews'
        where = {'rid': 2}
        db.update(condition=where, header='average item', rating=2)
        db.commit()

        rec = db.read(rid=2)[0]
        self.assertIn('average item', rec)

    def _test_DELETE(self):
        db = Reviews()
        db.table = 'reviews'
        db.delete(rid=2)
        db.commit()
        self.assertFalse(db.read(rid=2))

    def test_CRUD(self):
        self._test_CREATE()
        self._test_READ()
        self._test_UPDATE()
        self._test_DELETE()

    def tearDown(self):
        os.remove(DB_NAME)
You can start with:

test_order = ['base']

def index_of(item, lst):
    try:
        return lst.index(item)
    except ValueError:
        return len(lst) + 1
2nd, define the ordering function:
def order_methods(x, y):
    x_rank = index_of(x[5:100], test_order)  # the slice strips the leading 'test_'
    y_rank = index_of(y[5:100], test_order)
    return (x_rank > y_rank) - (x_rank < y_rank)
3rd, set it on the class:
class ClassTests(unittest.TestCase):
    unittest.TestLoader.sortTestMethodsUsing = staticmethod(order_methods)
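Putting it together: methods whose stripped names appear in test_order run first, and unlisted methods keep their alphabetical order because the comparator ranks them all equally. For example, building on the definitions above:

class ClassTests(unittest.TestCase):
    unittest.TestLoader.sortTestMethodsUsing = staticmethod(order_methods)

    def test_base(self):
        pass  # 'base' is in test_order, so this runs first

    def test_aaa(self):
        pass  # unlisted, so it runs after test_base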
ncoghlan's answer was exactly what I was looking for when I came to this question. I ended up modifying it to allow each step-test to run even if a previous step had already thrown an error; this helps me (and maybe you!) discover and plan for the propagation of errors in multi-threaded, database-centric software.
class Monolithic(TestCase):
    def step1_testName1(self):
        ...

    def step2_testName2(self):
        ...

    def steps(self):
        '''
        Generates the step methods from their parent object
        '''
        for name in sorted(dir(self)):
            if name.startswith('step'):
                yield name, getattr(self, name)

    def test_steps(self):
        '''
        Run the individual steps associated with this test
        '''
        # Create a flag that determines whether to raise an error at
        # the end of the test
        failed = False

        # An empty string that will accumulate error messages for
        # each failing step
        fail_message = ''

        for name, step in self.steps():
            try:
                step()
            except Exception as e:
                # A step has failed, the test should continue through
                # the remaining steps, but eventually fail
                failed = True

                # Get the name of the method -- so the fail message is
                # nicer to read :)
                name = name.split('_')[1]
                # Append this step's exception to the fail message
                fail_message += "\n\nFAIL: {}\n    {} failed ({}: {})".format(
                    name, step, type(e), e)

        # Check if any of the steps failed
        if failed is True:
            # Fail the test with the accumulated exception message
            self.fail(fail_message)
I also wanted to specify a particular order of execution for my tests. The main differences to the other answers here are:

- I wanted to preserve more verbose test method names without replacing the whole name with step1, step2, etc.
- I also wanted the printed method execution in the console to have some granularity, as opposed to the monolithic solution in some of the other answers.
So where the execution of a monolithic test method looked like this:
test_booking (__main__.TestBooking) ... ok
I wanted:
test_create_booking__step1 (__main__.TestBooking) ... ok
test_process_booking__step2 (__main__.TestBooking) ... ok
test_delete_booking__step3 (__main__.TestBooking) ... ok
How to achieve this
I suffixed my method names with __step<order>, for example (the order of definition is not important):
def test_create_booking__step1(self):
    [...]

def test_delete_booking__step3(self):
    [...]

def test_process_booking__step2(self):
    [...]
For the test suite, override the __iter__ method, which builds the iterator over the test methods.
class BookingTestSuite(unittest.TestSuite):
    """ Extends the functionality of the standard test suites """

    def __iter__(self):
        for suite in self._tests:
            suite._tests = sorted(
                [x for x in suite._tests if hasattr(x, '_testMethodName')],
                key=lambda x: int(x._testMethodName.split("step")[1])
            )
        return iter(self._tests)
This will sort test methods into order and execute them accordingly.
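A sketch of wiring this suite up and running it (the TestBooking class name is assumed from the console output above):

import unittest

if __name__ == '__main__':
    loader = unittest.TestLoader()
    suite = BookingTestSuite()
    # addTest (not addTests) keeps the class's tests grouped as one
    # sub-suite, which is the shape the __iter__ override expects
    suite.addTest(loader.loadTestsFromTestCase(TestBooking))
    unittest.TextTestRunner(verbosity=2).run(suite)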