How to test complicated functions which use requests? - python

I want to test code that is built on an API created by someone else, but I'm not sure how to do this.
I have written a function that saves the JSON to a file so that I don't need to send requests each time I run the tests. However, I don't know how to make this work when the original function (check) takes an argument (problem_report) that is an instance of a class provided by the API and has a problem_report.get_correction(corr_link) method. I wonder whether being unable to write a test for this is a sign that my code is badly written, or whether I should rewrite this function in my test file, as shown at the end of the code below.
import json
import re
from unittest import mock

import pytest


# I want to test this function
def check(problem_report):
    corrections = {}
    for corr_link, corr_id in problem_report.links.items():
        if re.findall(pattern='detailCorrection', string=corr_link):
            correction = problem_report.get_correction(corr_link)
            corrections.update({corr_id: correction})
    return corrections

# This function loads the JSON from a file; normally it is downloaded by the API from some page.
def load_pr(pr_id):
    print('loading')
    with open('{}{}_view_pr.json'.format(saved_prs_path, pr_id)) as view_pr_file:
        view_pr = json.load(view_pr_file)
    ...
    pr_info = {'view_pr': view_pr, ...}
    return pr_info

# create an instance of class MyPR, which takes the JSON in __init__
@pytest.fixture
def setup_pr():
    print('setup')
    pr = load_pr('123')
    my_pr = MyPR(pr['view_pr'])
    return my_pr

# test function
def test_check(setup_pr):
    pr = setup_pr
    checked_pr = pr.check(setup_rft[1]['problem_report_pr'])
    assert checked_pr

# rewritten check function in test file
@mock.patch('problem_report.get_correction', side_effect=get_corr)
def test_check(problem_report):
    corrections = {}
    for corr_link, corr_id in problem_report.links.items():
        if re.findall(pattern='detailCorrection', string=corr_link):
            correction = problem_report.get_correction(corr_link)
            corrections.update({corr_id: correction})
    return corrections
I'm not sure if I have provided enough code and explanation to make the problem understandable, but I hope so. I would like to know whether it is normal that some functions are just hard to test, and whether it is good practice to rewrite them separately so that I can mock the functions called inside the tested function. I was also thinking that I could write a new class with similar functionality, but the API is very large and that would be a very long process.

I understand your question as follows: You have a function check that you consider hard to test because of its dependency on problem_report. To make it easier to test, you have copied the code into the test file. You will test the copied code because you can modify it to be more easily testable. And you want to know whether this approach makes sense.
The answer is no, this does not make sense. You are not testing the real function, but different code. The copy may start out identical, but before long the copy and the original will diverge, and it will be a maintenance nightmare to ensure that the copy always matches the original. Improving code for testability is a different story: You can make changes to the check function to improve its testability. But then exactly the same resulting function should be used in both the tests and the production code.
How can you better test the function check, then? First, are you sure that the original problem_report objects really cannot be sensibly used in your tests? (Here are some criteria that help you decide: What to mock for python test cases?) Now, let's assume you conclude that you cannot sensibly use the original problem_report.
In that case, the interface here is simple enough to define a mocked problem_report. Keep in mind that Python uses duck typing, so you only have to create a class that has a links member with an items() method. In addition, your mocked problem_report class needs a get_correction() method. Beyond that, your mock does not have to produce types that resemble the types used by the real problem_report. The items() method can simply return a list of lists, like [["a",2],["xxxxdetailCorrectionxxxx",4]]. The same holds for get_correction, which could, for example, simply return its argument or a value derived from it. Note that check passes the link string (not the id) to get_correction, so a derived value could be, say, the length of the link.
For the above example (items() returning [["a",2],["xxxxdetailCorrectionxxxx",4]] and get_correction returning the length of its argument), the expected result would be {4: 24}. There is no need to simulate real correction objects. And you can create your mocked versions of problem_report without reading data from files: the mocks can be set up entirely within the unit-testing code.
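The duck-typing idea can be sketched roughly as follows. The check function is copied from the question only so that the block runs on its own; in a real test it would be imported from the production module. FakeLinks and FakeProblemReport are hypothetical names. Because check passes the link string to get_correction, the fake derives its return value from that string (here, its length).

```python
import re


def check(problem_report):
    # Copied from the question so this sketch is self-contained.
    corrections = {}
    for corr_link, corr_id in problem_report.links.items():
        if re.findall(pattern='detailCorrection', string=corr_link):
            correction = problem_report.get_correction(corr_link)
            corrections.update({corr_id: correction})
    return corrections


class FakeLinks:
    # Only needs an items() method; it does not have to be a real dict.
    def items(self):
        return [["a", 2], ["xxxxdetailCorrectionxxxx", 4]]


class FakeProblemReport:
    # Hypothetical stand-in exposing just what check() touches.
    links = FakeLinks()

    def get_correction(self, corr_link):
        # Any value derived from the argument will do for the test.
        return len(corr_link)


def test_check():
    # Only the link containing 'detailCorrection' is looked up.
    assert check(FakeProblemReport()) == {4: 24}


test_check()
```

No file access or network call is involved; everything the test needs is defined next to it.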

Try patching the problem_report symbol in the module. You should put your tests in a separate class.
@mock.patch('some.module.path.problem_report')
def test_check(problem_report):
    problem_report.side_effect = get_corr
    corrections = {}
    for corr_link, corr_id in problem_report.links.items():
        if re.findall(pattern='detailCorrection', string=corr_link):
            correction = problem_report.get_correction(corr_link)
            corrections.update({corr_id: correction})
    return corrections
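As a possible alternative to patching a module-level name, a unittest.mock.Mock can be handed to the real check function directly. This is only a sketch: check is copied from the question so the block is runnable, and the link strings and the get_corr helper are made up for illustration.

```python
import re
from unittest import mock


def check(problem_report):
    # Copied from the question so this sketch is self-contained.
    corrections = {}
    for corr_link, corr_id in problem_report.links.items():
        if re.findall(pattern='detailCorrection', string=corr_link):
            correction = problem_report.get_correction(corr_link)
            corrections.update({corr_id: correction})
    return corrections


def get_corr(corr_link):
    # Hypothetical replacement for the real correction lookup.
    return {"link": corr_link}


def test_check_with_mock():
    problem_report = mock.Mock()
    # A plain dict is enough for the links attribute.
    problem_report.links = {"a": 2, "xxxxdetailCorrectionxxxx": 4}
    # Calls to get_correction are routed to get_corr.
    problem_report.get_correction.side_effect = get_corr

    result = check(problem_report)

    assert result == {4: {"link": "xxxxdetailCorrectionxxxx"}}
    problem_report.get_correction.assert_called_once_with("xxxxdetailCorrectionxxxx")


test_check_with_mock()
```

The test exercises the production function itself, so there is no copy that can drift away from the original.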

Related

How to assert a method has been called from another complex method in Python?

I am adding some tests to existing code that is not very test-friendly. As the title suggests, I need to test whether a complex method actually calls another method, e.g.:
class SomeView(...):
    def verify_permission(self, ...):
        # some logic to verify permission
        ...

    def get(self, ...):
        # some code here that I am not interested in for this test case
        ...
        if some condition:
            self.verify_permission(...)
        # some other code here that I am not interested in for this test case
        ...
I need to write some test cases to verify that self.verify_permission is called when the condition is met.
Do I need to mock everything up to the point where self.verify_permission is executed? Or do I need to refactor the get() method and abstract out some of the code to make it more test-friendly?
There are a number of points made in the comments that I strongly disagree with, but to your actual question first.
This is a very common scenario. The suggested approach with the standard library's unittest package is to utilize the Mock.assert_called... methods.
I added some fake logic to your example code, just so that we can actually test it.
code.py
class SomeView:
    def verify_permission(self, arg: str) -> None:
        # some logic to verify permission
        print(self, f"verify_permission({arg=})")

    def get(self, arg: int) -> int:
        # some code here that is not relevant to this test case
        ...
        some_condition = arg % 2 == 0
        ...
        if some_condition:
            self.verify_permission(str(arg))
        # some other code here that is not relevant to this test case
        ...
        return arg * 2
test.py
from unittest import TestCase
from unittest.mock import MagicMock, patch

from . import code


class SomeViewTestCase(TestCase):
    def test_verify_permission(self) -> None:
        ...

    @patch.object(code.SomeView, "verify_permission")
    def test_get(self, mock_verify_permission: MagicMock) -> None:
        obj = code.SomeView()
        # Odd `arg`:
        arg, expected_output = 3, 6
        output = obj.get(arg)
        self.assertEqual(expected_output, output)
        mock_verify_permission.assert_not_called()
        # Even `arg`:
        arg, expected_output = 2, 4
        output = obj.get(arg)
        self.assertEqual(expected_output, output)
        mock_verify_permission.assert_called_once_with(str(arg))
You use a patch variant as a decorator to inject a MagicMock instance to replace the actual verify_permission method for the duration of the entire test method. In this example that method has no return value, just a side effect (the print). Thus, we just need to check if it was called under the correct conditions.
In the example, the condition depends directly on the arg passed to get, but this will obviously be different in your actual use case. But this can always be adapted. Since the fake example of get has exactly two branches, the test method calls it twice to traverse both of them.
When writing unit tests, you should always isolate the unit (i.e. function) under test from all your other functions. That means, if your get method calls other methods of SomeView or any other functions you wrote yourself, those should be mocked out during test_get.
You want your test of get to be completely agnostic to the logic inside verify_permission or any other of your functions used inside get. Those are tested separately. You assume they work "as advertised" for the duration of test_get and by replacing them with Mock instances you control exactly how they behave in relation to get.
Note that the point about mocking out "network requests" and the like is completely unrelated. That is an entirely different but equally valid use of mocking.
Basically, you 1.) always mock your own functions and 2.) usually mock external/built-in functions with side effects (like e.g. network or disk I/O). That is it.
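To illustrate both rules in one place, here is a minimal sketch with a made-up Report class (none of these names come from the question). It mocks one of the class's own helpers and one method that would perform I/O, and uses return_value and side_effect to control exactly how they behave from the point of view of the method under test.

```python
import unittest
from unittest.mock import patch


class Report:
    def load_rows(self):
        # Hypothetical disk or network access that must not run in a unit test.
        raise RuntimeError("real I/O must not run in a unit test")

    def is_valid(self, row):
        # Own helper; it gets its own, separate test.
        return row >= 0

    def total(self):
        # The unit under test: sums all rows that pass validation.
        return sum(row for row in self.load_rows() if self.is_valid(row))


class ReportTestCase(unittest.TestCase):
    # Decorators are applied bottom-up, so load_rows is the first mock argument.
    @patch.object(Report, "is_valid")
    @patch.object(Report, "load_rows")
    def test_total(self, mock_load_rows, mock_is_valid):
        mock_load_rows.return_value = [1, 2, 3]
        # One result per call: keep 1, drop 2, keep 3.
        mock_is_valid.side_effect = [True, False, True]

        self.assertEqual(4, Report().total())
        self.assertEqual(3, mock_is_valid.call_count)
```

The test says nothing about how is_valid decides; it only checks that total combines whatever its collaborators return.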
Also, writing tests for existing code absolutely has value. Of course it is better to write tests alongside your code. But sometimes you are just put in charge of maintaining a bunch of existing code that has no tests. If you want/can/are allowed to, you can refactor the existing code and write your tests in sync with that. But if not, it is still better to add tests retroactively than to have no tests at all for that code.
And if you write your unit tests properly, they still do their job, if you or someone else later decides to change something about the code. If the change breaks your tests, you'll notice.
As for the exception hack to interrupt the tested method early... Sure, if you want. It's lazy and calls into question the whole point of writing tests, but you do you.
No, seriously, that is a horrible approach. Why on earth would you test just part of a function? If you are already writing a test for it, you may as well cover it to the end. And if it is so complex that it has dozens of branches and/or calls 10 or 20 other custom functions, then yes, you should definitely refactor it.

Make function return predictable values for testing purposes

I need a function to return predictable values when running tests.
For example, I have a function get_usd_rates(), which is loading USD forex rates from some API.
So far when I was using it inside any other function I was passing rates as an optional argument for testing purposes only. Seems hacky, but it worked. Like this:
def some_other_function(rates=None):
    if rates is None:
        rates = get_usd_rates()
    # do something with rates
But now I am facing a situation where I can't pass an extra argument to a function (a private class method of a Django model, which is called when a model field changes).
Is there a way to make the get_usd_rates() function aware that a test is running, so that it always returns some predefined value, without a noticeable performance impact?
Or what is the best way to deal with this problem?
What you need to do is mock the function. mock is a module in the standard library's unittest package. Try using mock.patch:
from unittest.mock import patch

@patch('path.to.get_usd_rates')
def your_test_function(mock_get_usd_rates):
    mock_get_usd_rates.return_value = "Some predefined value"
    # Rest of your test (anywhere that get_usd_rates is used will now automatically use mock_get_usd_rates)
What happens here is that mock.patch will replace your function get_usd_rates with a mock on which you set what you want the return value to be. There are various ways to do this other than a decorator (context manager for one, etc.) Reference: unittest.mock

Is it appropriate to use a class for the purpose of organizing functions that share inputs?

To provide a bit of context, I am building a risk model that pulls data from various sources. Initially I wrote the model as a single function that, when executed, read in the different data sources as pandas.DataFrame objects and used those objects when necessary. As the model grew in complexity, it quickly became unreadable and I often found myself copying and pasting blocks of code.
To cleanup the code I decided to make a class that when initialized reads, cleans and parses the data. Initialization takes about a minute to run and builds my model in its entirety.
The class also has some additional functionality. There is a generate_email method that sends an email with details about high risk factors and another method append_history that point-in-times the risk model and saves it so I can run time comparisons.
The thing about these two additional methods is that I cannot imagine a scenario where I would call them without first re-calibrating my risk model. So I have considered calling them in __init__() like my other methods. I haven't done so only because I am trying to justify having a class in the first place.
I am consulting this community because my project structure feels clunky and awkward. I am inclined to believe that I should not be using a class at all. Is it frowned upon to create classes merely for the purpose of organization? Also, is it bad practice to call instance methods (that take upwards of a minute to run) within __init__()?
Ultimately, I am looking for reassurance or a better code structure. Any help would be greatly appreciated.
Here is some pseudo code showing my project structure:
import pandas as pd


class RiskModel:
    def __init__(self, data_path_a, data_path_b):
        self.data_path_a = data_path_a
        self.data_path_b = data_path_b
        self.historical_data = None
        self.raw_data = None
        self.lookup_table = None
        self._read_in_data()
        self.risk_breakdown = None
        self._generate_risk_breakdown()
        self.risk_summary = None
        self._generate_risk_summary()

    def _read_in_data(self):
        # read in a .csv
        self.historical_data = pd.read_csv(self.data_path_a)
        # read an excel file containing many sheets into an ordered dictionary
        self.raw_data = pd.read_excel(self.data_path_b, sheet_name=None)
        # store a specific sheet from the excel file that is used by most of
        # my class's methods
        self.lookup_table = self.raw_data["Lookup"]

    def _generate_risk_breakdown(self):
        '''
        A function that creates a DataFrame from self.historical_data,
        self.raw_data, and self.lookup_table and stores it in
        self.risk_breakdown
        '''
        self.risk_breakdown = some_dataframe

    def _generate_risk_summary(self):
        '''
        A function that creates a DataFrame from self.lookup_table and
        self.risk_breakdown and stores it in self.risk_summary
        '''
        self.risk_summary = some_dataframe

    def generate_email(self, recipient):
        '''
        A function that sends an email with details about high risk factors
        '''


if __name__ == "__main__":
    risk_model = RiskModel(data_path_a, data_path_b)
    risk_model.generate_email("recipient@generic.com")
In my opinion it is a good way to organize your project, especially since you mentioned the high rate of re-usability of parts of the code.
One thing though: I wouldn't call the _read_in_data, _generate_risk_breakdown and _generate_risk_summary methods inside __init__, but would instead let the user call these methods after initializing the RiskModel instance.
This way the user would be able to read in data from a different path or only to generate the risk breakdown or summary, without reading in the data once again.
Something like this:
my_risk_model = RiskModel()
my_risk_model.read_in_data(path_a, path_b)
my_risk_model.generate_risk_breakdown(parameters)
my_risk_model.generate_risk_summary(other_parameters)
If there is a risk of the user calling these methods in an order that breaks the logical chain, you could raise an exception if generate_risk_breakdown or generate_risk_summary is called before read_in_data. Of course, you could also move only the generate... methods out, leaving the data import inside __init__.
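A minimal sketch of such a guard, assuming a much simplified RiskModel: the stored dictionaries are placeholders for the real DataFrames, and the helper name _require_data and the threshold parameter are invented for illustration.

```python
class RiskModel:
    def __init__(self):
        self.data = None
        self.risk_breakdown = None

    def read_in_data(self, path_a, path_b):
        # Placeholder: the real method would read the files at these paths.
        self.data = {"a": path_a, "b": path_b}

    def _require_data(self):
        # Fail early with a clear message instead of a confusing error later on.
        if self.data is None:
            raise RuntimeError("call read_in_data() before generating results")

    def generate_risk_breakdown(self, threshold):
        self._require_data()
        # Placeholder result standing in for the real breakdown DataFrame.
        self.risk_breakdown = {"threshold": threshold}
        return self.risk_breakdown


model = RiskModel()
try:
    model.generate_risk_breakdown(50)
except RuntimeError as exc:
    print(exc)

model.read_in_data("a.csv", "b.xlsx")
print(model.generate_risk_breakdown(50))
```

The same check can be reused at the top of every method that depends on the data having been loaded.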
As a further argument for moving the generate... methods out of __init__, consider a scenario where you would like to generate multiple risk summaries while varying some parameters. It would make sense not to create a new RiskModel and read the same data every time, but instead to change the input to the generate_risk_summary method:
my_risk_model = RiskModel()
my_risk_model.read_in_data(path_a, path_b)
for parameter in [50, 60, 80]:
    my_risk_model.generate_risk_summary(parameter)
    my_risk_model.generate_email('test@gmail.com')

How to construct an instance of _pytest.pytester.Testdir

I'm attempting to do some debugging (specifically on pytest/testing/test_doctest.py) and I want to step through some code in IPython. I have experience with pytest, but I never do anything too fancy with it, so I've never delved too deep into the more "magic" things it does.
In the test that I want to step through (potentially introspecting some of the objects), there is an argument called testdir, but nowhere in this file does it reference what testdir is or how I could possibly construct one.
After doing some digging, it seems this is some magic fixture that automatically gets constructed and sent to your function as a parameter when you execute pytest with the pytester plugin. When I tracked down that class, I found it is constructed in turn via some magic request param, and the code is massively unhelpful in telling you what that magic request is or how to make one.
To make this concrete I simply want to take a test like this one:
def test_reportinfo(self, testdir):
    '''
    Test case to make sure that DoctestItem.reportinfo() returns lineno.
    '''
    p = testdir.makepyfile(test_reportinfo="""
        def foo(x):
            '''
                >>> foo('a')
                'b'
            '''
            return 'c'
    """)
    items, reprec = testdir.inline_genitems(p, '--doctest-modules')
    reportinfo = items[0].reportinfo()
    assert reportinfo[1] == 1
and run its logic in IPython. Looking at what the testdir object does, it seems pretty cool. It automatically makes a file for you and runs pytest programmatically instead of via the command line. How can I make one of these? Is there some documentation I missed that makes it clear how to do this and makes it seem less obfuscated?
If I wanted to use something like this in my tests, is there a way I could make what the magic testdir parameter is slightly more explicit, so the next coder who looks at it isn't pulling their hair out like I am?
After much agonizing, I've figured out how to instantiate a fixture value.
import _pytest
config = _pytest.config._prepareconfig(['-s'], plugins=['pytester'])
session = _pytest.main.Session(config)
_pytest.tmpdir.pytest_configure(config)
_pytest.fixtures.pytest_sessionstart(session)
_pytest.runner.pytest_sessionstart(session)
def func(testdir):
    return testdir

parent = _pytest.python.Module('parent', config=config, session=session)
function = _pytest.python.Function(
    'func', parent, callobj=func, config=config, session=session)
_pytest.fixtures.fillfixtures(function)
testdir = function.funcargs['testdir']
The main idea is to create a dummy pytest session. This is a bit tricky. It's critical that ['-s'] is passed into _prepareconfig; otherwise this will not print stdout, or it will crash when run in IPython.
Given a barebones config and session, the next step is to manually load whatever fixture functionality you are going to use. This amounts to manually calling the hooks that pluggy usually takes care of for you. I found these by looking at the attribute errors I got when trying to run the code without them. Usually it's just due to session or config lacking a required attribute. There may be a better way to go about doing this (i.e. automatically via pluggy).
Next, we create a function that requests the specific fixture we are interested in. It's up to you to know what these names are. Finally, we set up a dummy module / function tree structure and call fillfixtures, which does the magic. The funcargs attribute then contains a dictionary of these objects, ready for use. Be careful if you expect some teardown functionality. I'm not sure if this covers that, but I don't really need it for what I'm doing.
Hope this helps someone else. Note: this talk helped me understand what was happening in pytest under the hood a bit better: https://www.youtube.com/watch?v=zZsNPDfOoHU

Computing a function name from another function name

In Python 3.4, I want to build a very simple dispatch table for testing purposes. The idea is to have a dictionary in which each key is a string naming the function to be tested and each value is the corresponding test function.
For example:
myTestList = (
    "myDrawFromTo",
    "myDrawLineDir"
)
myTestDict = {
    "myDrawFromTo": test_myDrawFromTo,
    "myDrawLineDir": test_myDrawLineDir
}
for myTest in myTestList:
    result = myTestDict[myTest]()
The idea is that I have a list of function names someplace. In this example, I manually create a dictionary that maps those names to the names of test functions. The test function names are a simple extension of the function name. I'd like to compute the entire dictionary from the list of function names (here it is myTestList).
Alternatively, if I can do the same thing without the dictionary, that'd be fine, too. I tried just building a new string from the entries in myTestList and then using locals() to set up the call, but didn't have any luck. The dictionary idea came from the Python 3.x documentation.
There are two parts to the problem.
The easy part is just prefixing 'test_' onto each string:
tests = {test: 'test_'+test for test in myTestList}
The harder part is actually looking up the functions by name. That kind of thing is usually a bad idea, but you've hit on one of the cases (generating tests) where it often makes sense. You can do this by looking them up in your module's global dictionary, like this:
tests = {test: globals()['test_'+test] for test in myTestList}
There are variations on the same idea if the tests live somewhere other than the module's global scope. For example, it might be a good idea to make them all methods of a class, in which case you'd do:
tester = TestClass()
tests = {test: getattr(tester, 'test_'+test) for test in myTestList}
(Although more likely that code would be inside TestClass, so it would be using self rather than tester.)
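A small, self-contained sketch of the class-based variant, with the lookup living inside the class and using self. The method bodies and the run_all helper are placeholders invented for illustration; only the naming scheme comes from the question.

```python
class TestClass:
    def test_myDrawFromTo(self):
        # Placeholder body; a real test would exercise myDrawFromTo.
        return "myDrawFromTo ok"

    def test_myDrawLineDir(self):
        # Placeholder body; a real test would exercise myDrawLineDir.
        return "myDrawLineDir ok"

    def run_all(self, names):
        # Compute each test method's name, look it up on self, and call it.
        return {name: getattr(self, 'test_' + name)() for name in names}


myTestList = ("myDrawFromTo", "myDrawLineDir")
results = TestClass().run_all(myTestList)
print(results)
```

If a name in the list has no matching test method, getattr raises AttributeError, which makes a missing test easy to spot.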
If you don't actually need the dict, of course, you can change the comprehension to an explicit for statement:
for test in myTestList:
    globals()['test_'+test]()
One more thing: Before reinventing the wheel, have you looked at the testing frameworks built into the stdlib, or available on PyPI?
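For comparison, here is roughly what the standard library's unittest does with the same naming convention. The double function and its tests are invented for illustration; the point is that the loader finds every method whose name starts with "test" without any hand-written dispatch table.

```python
import unittest


def double(x):
    # Hypothetical function under test.
    return 2 * x


class DoubleTests(unittest.TestCase):
    def test_positive(self):
        self.assertEqual(double(2), 4)

    def test_zero(self):
        self.assertEqual(double(0), 0)


# The loader collects the test_* methods of the class automatically.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(DoubleTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.testsRun, result.wasSuccessful())
```

In a normal project you would not build the suite by hand; running python -m unittest from the command line discovers and runs the tests.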
Abarnert's answer seems to be useful but to answer your original question of how to call all test functions for a list of function names:
def test_f():
    print("testing f...")

def test_g():
    print("testing g...")

myTestList = ['f', 'g']
for funcname in myTestList:
    eval('test_' + funcname + '()')
