How to get DRYer tests for a simple function using Python glob?

I have a function that searches for a file in the current location, then in the Downloads folder, and raises an error if it is not found. Using pytest and pytest-mock, I was able to test it, but the test code is much bigger than the code under test. Is there a way to make this tighter/DRYer?
Tested code:
# cei.py
import glob
import os
import sys

def get_xls_filename() -> str:
    """Returns first xls filename in current folder or Downloads folder."""
    csv_filenames = glob.glob("InfoCEI*.xls")
    if csv_filenames:
        return csv_filenames[0]
    home = os.path.expanduser("~")
    csv_filenames = glob.glob(home + "/Downloads/InfoCEI*.xls")
    if csv_filenames:
        return csv_filenames[0]
    return sys.exit(
        "Error: file not found."
    )
There are three test scenarios here: found in the current folder, found in Downloads, and not found.
Test code:
# test_cei.py
from unittest.mock import Mock

import pytest
from pytest_mock import MockFixture

import cei

@pytest.fixture
def mock_glob_glob_none(mocker: MockFixture) -> Mock:
    """Fixture for mocking glob.glob."""
    mock = mocker.patch("glob.glob")
    mock.return_value = []
    return mock

@pytest.fixture
def mock_os_path_expanduser(mocker: MockFixture) -> Mock:
    """Fixture for mocking os.path.expanduser."""
    mock = mocker.patch("os.path.expanduser")
    mock.return_value = "/home/user"
    return mock

def test_get_xls_filename_not_found(mock_glob_glob_none, mock_os_path_expanduser) -> None:
    with pytest.raises(SystemExit):
        assert cei.get_xls_filename()
    mock_glob_glob_none.assert_called()
    mock_os_path_expanduser.assert_called_once()

@pytest.fixture
def mock_glob_glob_found(mocker: MockFixture) -> Mock:
    """Fixture for mocking glob.glob."""
    mock = mocker.patch("glob.glob")
    mock.return_value = ["/my/path/InfoCEI.xls"]
    return mock

def test_get_xls_filename_current_folder(mock_glob_glob_found) -> None:
    assert cei.get_xls_filename() == "/my/path/InfoCEI.xls"
    mock_glob_glob_found.assert_called_once()

@pytest.fixture
def mock_glob_glob_found_download(mocker: MockFixture) -> Mock:
    """Fixture for mocking glob.glob."""
    values = {
        "InfoCEI*.xls": [],
        "/home/user/Downloads/InfoCEI*.xls": ["/home/user/Downloads/InfoCEI.xls"],
    }

    def side_effect(arg):
        return values[arg]

    mock = mocker.patch("glob.glob")
    mock.side_effect = side_effect
    return mock

def test_get_xls_filename_download_folder(
    mock_glob_glob_found_download, mock_os_path_expanduser
) -> None:
    assert cei.get_xls_filename() == "/home/user/Downloads/InfoCEI.xls"
    mock_os_path_expanduser.assert_called_once()
    mock_glob_glob_found_download.assert_called_with(
        "/home/user/Downloads/InfoCEI*.xls"
    )

This is obviously a bit opinion-based, but I'll try.
First, there is nothing wrong with the tests being larger than the tested code. Depending on the number of tested use cases, this can easily happen, and I wouldn't use this as a criterion for test quality.
That being said, your tests should usually test the API / interface, which in this case is the returned file path under different conditions. Whether os.path.expanduser has been called is part of the internal implementation, which may not be stable, so I would not consider asserting on it a good thing, at least in this case. You already test the most relevant use cases where these internals are used (a test for having files in both locations might be added).
Here is what I would probably do:
import os

import pytest

from cei import get_xls_filename

@pytest.fixture
def cwd(fs, monkeypatch):
    fs.cwd = "/my/path"
    monkeypatch.setenv("HOME", "/home/user")

def test_get_xls_filename_not_found(fs, cwd) -> None:
    with pytest.raises(SystemExit):
        assert get_xls_filename()

def test_get_xls_filename_current_folder(fs, cwd) -> None:
    fs.create_file("/my/path/InfoCEI.xls")
    assert get_xls_filename() == "InfoCEI.xls"  # adapted to your implementation

def test_get_xls_filename_download_folder(fs, cwd) -> None:
    path = os.path.join("/home/user", "Downloads", "InfoCEI.xls")
    fs.create_file(path)
    assert get_xls_filename() == path
Note that I used the pyfakefs fixture fs to mock the fs (I'm a contributor to pyfakefs, so this is what I'm used to, and it makes the code a bit shorter), but this may be overkill for you.
Basically, I try to test only the API, put the common setup (here cwd and home path location) into a fixture (or in a setup method for xUnit-like tests), and add the test-specific setup (creation of the test file) to the test itself.
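If pyfakefs is more than you want, the same "test only the returned path" idea can be sketched as a table of scenarios using nothing but unittest.mock. This is only an illustration, not the approach used above: the function is copied into the block so that it runs on its own, and the names CASES, DOWNLOADS_PATTERN and run_case are made up for the example. With pytest you would probably express the table with pytest.mark.parametrize instead of a loop.

```python
import glob
import os
import sys
from unittest.mock import patch


def get_xls_filename() -> str:
    """Copy of the function under test, repeated so the sketch is standalone."""
    filenames = glob.glob("InfoCEI*.xls")
    if filenames:
        return filenames[0]
    home = os.path.expanduser("~")
    filenames = glob.glob(home + "/Downloads/InfoCEI*.xls")
    if filenames:
        return filenames[0]
    sys.exit("Error: file not found.")


# Hypothetical names used only in this sketch.
DOWNLOADS_PATTERN = "/home/user/Downloads/InfoCEI*.xls"
DOWNLOADS_FILE = "/home/user/Downloads/InfoCEI.xls"

# Each case: (fake glob results keyed by pattern, expected result).
# An expected result of None means "should exit with SystemExit".
CASES = [
    ({"InfoCEI*.xls": ["InfoCEI.xls"]}, "InfoCEI.xls"),
    ({DOWNLOADS_PATTERN: [DOWNLOADS_FILE]}, DOWNLOADS_FILE),
    ({"InfoCEI*.xls": ["InfoCEI.xls"], DOWNLOADS_PATTERN: [DOWNLOADS_FILE]}, "InfoCEI.xls"),
    ({}, None),
]


def run_case(fake_results, expected):
    """Run one scenario and report whether the function behaved as expected."""
    fake_glob = lambda pattern: fake_results.get(pattern, [])
    with patch("glob.glob", side_effect=fake_glob), \
         patch("os.path.expanduser", return_value="/home/user"):
        try:
            return get_xls_filename() == expected
        except SystemExit:
            return expected is None


results = [run_case(fake, expected) for fake, expected in CASES]
```

The third row is the "files in both locations" case mentioned above; adding another scenario is one more row rather than one more fixture.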

Related

How to mock a class method that is called from another class with pytest_mock

In the files below I have InternalDogWebhookResource, which calls VisitOrchestrator.fetch_visit. I am attempting to write a test for InternalDogWebhookResource but mock VisitOrchestrator.fetch_visit, since it is a network call.
I have tried the mock paths:
api.dog.handlers.internal.VisitOrchestrator.fetch_visit
api.dog.handlers.internal.InternalDogWebhookResource.VisitOrchestrator.fetch_visit
api.dog.handlers.internal.InternalDogWebhookResource.fetch_visit
and many others, but I am always getting AssertionError: assert None
I can confirm that the client.post in the test works, because when I remove the mock asserts, I get a response back from the API, which means fetch_visit is called.
How can I find the mocker.patch path?
api/dog/handlers/internal.py
from api.dog.helpers.visits import VisitOrchestrator

@api.route("/internal/dog/webhook")
class InternalDogWebhookResource():
    def post(self) -> JsonResponse:
        if event_type == EventType.CHANGE:
            VisitOrchestrator.fetch_visit(event['visitId'])
        return JsonResponse(status=204)
api/dog/helpers/visits.py
class VisitOrchestrator:
    @classmethod
    def fetch_visit(cls, visit_id: str) -> VisitModel:
        # do stuff
        return visit
tests/v0/dog/handlers/test_webhook.py
import pytest
from pytest_mock import MockerFixture

from api.dog.handlers.internal import InternalDogWebhookResource, EventType
from tests.v0.utils import url_for

def test_webhook_valid(client, config, mocker: MockerFixture):
    visit_id = '1231231'
    mock_object = mocker.patch(
        'api.dog.handlers.internal.VisitOrchestrator.fetch_visit',
        return_value=visit_id,
    )
    res = client.post(
        url_for(InternalDogWebhookResource),
        json={'blag': 'blargh'}
    )
    assert mock_object.assert_called_once()
You're doing the right things: patching the name in the module where it is looked up (api.dog.handlers.internal.VisitOrchestrator.fetch_visit, the first path you tried) is generally the way to go with mocks.
I would start with a minimal test function:
def test_webhook_valid(mocker):
    mock_fetch_visit = mocker.MagicMock(name='fetch_visit')
    mocker.patch('api.dog.handlers.internal.VisitOrchestrator.fetch_visit',
                 new=mock_fetch_visit)
    InternalDogWebhookResource().post()
    assert 1 == mock_fetch_visit.call_count
If this works for you - maybe the problem is with the client or other settings in your test method.
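One more thing worth checking, independent of the patch path: the question's test ends with assert mock_object.assert_called_once(). The assert_called_once() method raises AssertionError itself when the check fails and returns None when it passes, so wrapping it in assert fails either way, which would match the "assert None" message. A small sketch with a plain MagicMock standing in for the patched method (the name mock_fetch_visit is just for illustration):

```python
from unittest.mock import MagicMock

# Stand-in for the object that mocker.patch(...) would return.
mock_fetch_visit = MagicMock(name="fetch_visit")
mock_fetch_visit("1231231")

# The check passes here, but the method returns None on success,
# so `assert mock_fetch_visit.assert_called_once()` would be `assert None`.
outcome = mock_fetch_visit.assert_called_once()

# Either call the method on its own line ...
mock_fetch_visit.assert_called_once_with("1231231")
# ... or assert on a plain attribute, as in the minimal test above.
assert mock_fetch_visit.call_count == 1
```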

Is it possible to mock os.scandir and its attributes?

for entry in os.scandir(document_dir):
    if os.path.isdir(entry):
        # some code goes here
    else:
        # else the file needs to be in a folder
        file_path = entry.path.replace(os.sep, '/')
I am having trouble mocking os.scandir and the path attribute within the else statement. I am not able to mock the mock object's property I created in my unit tests.
with patch("os.scandir") as mock_scandir:
    # mock_scandir.return_value = ["docs.json", ]
    # mock_scandir.side_effect = ["docs.json", ]
    # mock_scandir.return_value.path = PropertyMock(return_value="docs.json")
These are all the options I've tried. Any help is greatly appreciated.
It depends on what you really need to mock. The problem is that os.scandir returns entries of type os.DirEntry. One possibility is to use your own mock DirEntry and implement only the attributes that you need (in your example, only path). For your example, you also have to mock os.path.isdir. Here is a self-contained example of how you can do this:
import os
from unittest.mock import patch

def get_paths(document_dir):
    # example function containing your code
    paths = []
    for entry in os.scandir(document_dir):
        if os.path.isdir(entry):
            pass
        else:
            # else the file needs to be in a folder
            file_path = entry.path.replace(os.sep, '/')
            paths.append(file_path)
    return paths

class DirEntry:
    # minimal stand-in for os.DirEntry: only the path attribute is needed here
    def __init__(self, path):
        self.path = path

# decorators are applied bottom-up, so the isdir mock is the first argument
@patch("os.scandir")
@patch("os.path.isdir")
def test_sut(mock_isdir, mock_scandir):
    mock_isdir.return_value = False
    mock_scandir.return_value = [DirEntry("docs.json")]
    assert get_paths("anydir") == ["docs.json"]
Depending on your actual code, you may have to do more.
If you want to patch more file system functions, you may consider using pyfakefs instead, which patches the whole file system. This will be overkill for a single test, but can be handy for a test suite relying on file system functions.
Disclaimer: I'm a contributor to pyfakefs.
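If you would rather not write a stand-in class, a MagicMock with the path attribute assigned directly should also do. This is a sketch under the same assumptions as the example above (only path is read, and os.path.isdir is patched as well); make_entry is a made-up helper name. Assigning a plain value to entry.path avoids the PropertyMock attempt from the question.

```python
import os
from unittest.mock import MagicMock, patch


def get_paths(document_dir):
    """Same logic as the example function above, repeated to keep this standalone."""
    paths = []
    for entry in os.scandir(document_dir):
        if not os.path.isdir(entry):
            paths.append(entry.path.replace(os.sep, "/"))
    return paths


def make_entry(path):
    """Hypothetical helper: a mock shaped like os.DirEntry with a fixed path."""
    entry = MagicMock(spec=os.DirEntry)
    entry.path = path  # plain attribute assignment, no PropertyMock needed
    return entry


with patch("os.scandir") as mock_scandir, patch("os.path.isdir", return_value=False):
    # The code under test only iterates over the result, so a list is enough.
    mock_scandir.return_value = [make_entry("docs.json"), make_entry("sub/notes.txt")]
    found = get_paths("anydir")
    mock_scandir.assert_called_once_with("anydir")
```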

Testing argument using Python Click

I have a command-line script with Python-click with an argument and option:
# console.py
import click

@click.command()
@click.version_option()
@click.argument("filepath", type=click.Path(exists=True), default=".")
@click.option(
    "-m",
    "--max-size",
    type=int,
    help="Max size in megabytes.",
    default=20,
    show_default=True,
)
def main(filepath: str, max_size: int) -> None:
    max_size_bytes = max_size * 1024 * 1024  # convert megabytes to bytes
    if filepath.endswith(".pdf"):
        print("success")
    else:
        print(max_size_bytes)
Both the argument and the option have default values, and the script behaves as expected on the command line. But when I try testing it following the Click documentation and debug it, it never reaches the first line of main():
# test_console.py
from unittest.mock import Mock

import click.testing
import pytest
from pytest_mock import MockFixture

from pdf_split_tool import console

@pytest.fixture
def runner() -> click.testing.CliRunner:
    """Fixture for invoking command-line interfaces."""
    return click.testing.CliRunner()

@pytest.fixture
def mock_pdf_splitter_pdfsplitter(mocker: MockFixture) -> Mock:
    """Fixture for mocking pdf_splitter.PdfSplitter."""
    return mocker.patch("pdf_split_tool.pdf_splitter.PdfSplitter", autospec=True)

def test_main_uses_specified_filepath(
    runner: click.testing.CliRunner,
    mock_pdf_splitter_pdfsplitter: Mock,
) -> None:
    """It uses the specified filepath."""
    result = runner.invoke(console.main, ["test.pdf"])
    assert result.exit_code == 0
I couldn't see what error it is giving, since the debugger did not enter the first line of function main(). Any ideas of what could be wrong?
The failure is due to the following error:
(pdb)print result.output
"Usage: main [OPTIONS] [FILEPATH]\nTry 'main --help' for help.\n\nError: Invalid value for '[FILEPATH]': Path 'test.pdf' does not exist.\n"
This happens because of the following line in console.py, which checks whether the filepath exists:
@click.argument("filepath", type=click.Path(exists=True), default=".")
One way to test this is to create a temporary file, as in afterburner's code:
# test_console.py
import click.testing

from pdf_split_tool.console import main

def test_main_uses_specified_filepath() -> None:
    runner = click.testing.CliRunner()
    with runner.isolated_filesystem():
        with open('test.pdf', 'w') as f:
            f.write('Hello World!')
        result = runner.invoke(main, ["test.pdf"])
    assert result.exit_code == 0
I've changed your test method to the following. However, this is more an augmentation to apoorva kamath's answer.
def test_main_uses_specified_filepath() -> None:
    runner = click.testing.CliRunner()
    with runner.isolated_filesystem():
        with open('test.pdf', 'w') as f:
            f.write('Hello World!')
        result = runner.invoke(main, ["test.pdf"])
    assert result.exit_code == 0
Simply put, it creates an isolated file system that gets cleaned up after the test is executed. So any files created there are destroyed with it.
For more information, Click's Isolated Filesystem documentation might come in handy.
Alternatively, you can remove the exists=True parameter from your file path argument.

How to check if mock functions has been called?

I'm writing unit tests for a simple function that writes bytes into s3:
import s3fs

def write_bytes_as_csv_to_s3(payload, bucket, key):
    fs = s3fs.S3FileSystem()
    fname = f"{bucket}/{key}"
    print(f"writing {len(payload)} bytes to {fname}")
    with fs.open(fname, "wb") as f:
        f.write(payload)
    return fname

def test_write_bytes_as_csv_to_s3(mocker):
    s3fs_mock = mocker.patch('s3fs.S3FileSystem')
    open_mock = mocker.MagicMock()
    # write_mock = mocker.MagicMock()
    # open_mock.write.return_value = write_mock
    s3fs_mock.open.invoke.return_value = open_mock
    result = write_bytes_as_csv_to_s3('awesome'.encode(), 'random', 'key')
    assert result == 'random/key'
    s3fs_mock.assert_called_once()
    open_mock.assert_called_once()
    # write_mock.assert_called_once()
How can I check that the open and write methods have each been called once? I'm not sure how to set up mocker to cover my case.
The unit test you have written above is fine and covers most of the functionality of the function you want to test.
pytest can produce a coverage report that shows which lines are exercised by the tests.
Install the pytest-cov plugin (if it is not installed yet) and run the following command:
py.test --cov=<filename to cover: unittest> --cov-report=html <testfile>
After that, you should find an HTML report in the htmlcov/ directory under the current location. From it you can see which lines are covered, as well as the coverage percentage.
The issue is understanding how each mock is created and what exactly it mocks. For example mocker.patch('s3fs.S3FileSystem') returns a mock of s3fs.S3FileSystem, not the instance returned by calling s3fs.S3FileSystem(). Then to mock with fs.open(fname, "wb") as f you need to mock what the __enter__ dunder method returns. Hopefully the following code makes the relations clear:
def test_write_bytes_as_csv_to_s3(mocker):
    # Mock of the file object that the context manager yields (the f in "as f")
    open_cm_mock = mocker.MagicMock()
    # Mock of the object returned by fs.open()
    open_mock = mocker.MagicMock()
    open_mock.__enter__.return_value = open_cm_mock
    # Mock of the new instance returned by s3fs.S3FileSystem()
    fs_mock = mocker.MagicMock()
    fs_mock.open.return_value = open_mock
    # Patching of s3fs.S3FileSystem
    mocker.patch('s3fs.S3FileSystem').return_value = fs_mock
    # Running the tested code and making assertions
    result = write_bytes_as_csv_to_s3('awesome'.encode(), 'random', 'key')
    assert result == 'random/key'
    assert open_cm_mock.write.call_count == 1
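The same relations can also be reached by walking return_value on a single MagicMock, since MagicMock creates the intermediate mocks (including __enter__) on demand. The sketch below avoids importing s3fs so that it runs anywhere: write_bytes and its fs_factory parameter are a hypothetical variant of the function in which the class is passed in rather than patched, and fs_class_mock plays the role of what mocker.patch('s3fs.S3FileSystem') returns.

```python
from unittest.mock import MagicMock

# Stand-in for the patched class object.
fs_class_mock = MagicMock(name="S3FileSystem")


def write_bytes(payload, bucket, key, fs_factory=fs_class_mock):
    """Hypothetical variant of the function that receives the filesystem class."""
    fs = fs_factory()
    fname = f"{bucket}/{key}"
    with fs.open(fname, "wb") as f:
        f.write(payload)
    return fname


result = write_bytes(b"awesome", "random", "key")

# Calling the class mock yields the instance mock.
fs_mock = fs_class_mock.return_value
# fs.open(...) yields the context manager; entering it yields the file object.
file_mock = fs_mock.open.return_value.__enter__.return_value

fs_class_mock.assert_called_once_with()
fs_mock.open.assert_called_once_with("random/key", "wb")
file_mock.write.assert_called_once_with(b"awesome")
```

The three assertions at the end check the constructor call, the open call and the write call separately, which is what the question asks for.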

Passing (yield) fixtures as test parameters (with a temp directory)

Question
Is it possible to pass yielding pytest fixtures (for setup and teardown) as parameters to test functions?
Context
I'm testing an object that reads and writes data from/to files in a single directory. The path of that directory is saved as an attribute of the object.
I'm having trouble with the following:
using a temporary directory with my test; and
ensuring that the directory is removed after each test.
Example
Consider the following (test_yieldfixtures.py):
import pytest, tempfile, os, shutil
from contextlib import contextmanager

@contextmanager
def data():
    datadir = tempfile.mkdtemp()  # setup
    yield datadir
    shutil.rmtree(datadir)  # teardown

class Thing:
    def __init__(self, datadir, errorfile):
        self.datadir = datadir
        self.errorfile = errorfile

@pytest.fixture
def thing1():
    with data() as datadir:
        errorfile = os.path.join(datadir, 'testlog1.log')
        yield Thing(datadir=datadir, errorfile=errorfile)

@pytest.fixture
def thing2():
    with data() as datadir:
        errorfile = os.path.join(datadir, 'testlog2.log')
        yield Thing(datadir=datadir, errorfile=errorfile)

@pytest.mark.parametrize('thing', [thing1, thing2])
def test_attr(thing):
    print(thing.datadir)
    assert os.path.exists(thing.datadir)
Running pytest test_yieldfixtures.py outputs the following:
================================== FAILURES ===================================
______________________________ test_attr[thing0] ______________________________
thing = <generator object thing1 at 0x0000017B50C61BF8>
    @pytest.mark.parametrize('thing', [thing1, thing2])
    def test_attr(thing):
>       print(thing.datadir)
E       AttributeError: 'function' object has no attribute 'props'
test_mod.py:39: AttributeError
OK. So fixture functions don't have the properties of my class. Fair enough.
Attempt 1
A function won't have the properties, so I tried calling the functions to actually get the objects:
@pytest.mark.parametrize('thing', [thing1(), thing2()])
def test_attr(thing):
    print(thing.props['datadir'])
    assert os.path.exists(thing.get('datadir'))
Results in:
================================== FAILURES ===================================
______________________________ test_attr[thing0] ______________________________
thing = <generator object thing1 at 0x0000017B50C61BF8>
    @pytest.mark.parametrize('thing', [thing1(), thing2()])
    def test_attr(thing):
>       print(thing.datadir)
E       AttributeError: 'generator' object has no attribute 'props'
test_mod.py:39: AttributeError
Attempt 2
I also tried using return instead of yield in the thing1/2 fixtures, but that kicks me out of the data context manager and removes the directory:
================================== FAILURES ===================================
______________________________ test_attr[thing0] ______________________________
thing = <test_mod.Thing object at 0x000001C528F05358>
    @pytest.mark.parametrize('thing', [thing1(), thing2()])
    def test_attr(thing):
        print(thing.datadir)
>       assert os.path.exists(thing.datadir)
Closing
To restate the question: Is there any way to pass these fixtures as parameters and maintain the cleanup of the temporary directory?
Try making your data function / generator into a fixture. Then use request.getfixturevalue() to dynamically run the named fixture.
import os
import shutil
import tempfile

import pytest

@pytest.fixture  # This works with pytest>=3.0; on pytest<3.0 use yield_fixture
def datadir():
    datadir = tempfile.mkdtemp()  # setup
    yield datadir
    shutil.rmtree(datadir)  # teardown

class Thing:
    def __init__(self, datadir, errorfile):
        self.datadir = datadir
        self.errorfile = errorfile

@pytest.fixture
def thing1(datadir):
    errorfile = os.path.join(datadir, 'testlog1.log')
    yield Thing(datadir=datadir, errorfile=errorfile)

@pytest.fixture
def thing2(datadir):
    errorfile = os.path.join(datadir, 'testlog2.log')
    yield Thing(datadir=datadir, errorfile=errorfile)

# The parametrized argument name must match the test function's parameter
@pytest.mark.parametrize('thing_fixture_name', ['thing1', 'thing2'])
def test_attr(request, thing_fixture_name):
    # This works with pytest>=3.0; on pytest<3.0 use getfuncargvalue
    thing = request.getfixturevalue(thing_fixture_name)
    print(thing.datadir)
    assert os.path.exists(thing.datadir)
Going one step further, you can parametrize the thing fixture like so:
class Thing:
    def __init__(self, datadir, errorfile):
        self.datadir = datadir
        self.errorfile = errorfile

@pytest.fixture(params=['test1.log', 'test2.log'])
def thing(request):
    with tempfile.TemporaryDirectory() as datadir:
        errorfile = os.path.join(datadir, request.param)
        yield Thing(datadir=datadir, errorfile=errorfile)

def test_thing_datadir(thing):
    assert os.path.exists(thing.datadir)
Temporary directories and files are handled by pytest using the built in fixtures tmpdir and tmpdir_factory.
For this usage, tmpdir should be sufficient: https://docs.pytest.org/en/latest/tmpdir.html
Also, parametrized fixtures would work well for this example.
These are documented here: https://docs.pytest.org/en/latest/fixture.html#fixture-parametrize
import os

import pytest

class Thing:
    def __init__(self, datadir, errorfile):
        self.datadir = datadir
        self.errorfile = errorfile

@pytest.fixture(params=(1, 2))
def thing(request, tmpdir):
    errorfile_name = 'testlog{}.log'.format(request.param)
    errorfile = tmpdir.join(errorfile_name)
    return Thing(datadir=str(tmpdir), errorfile=str(errorfile))

def test_attr(request, thing):
    assert os.path.exists(thing.datadir)
BTW, in Python Testing with pytest, parametrized fixtures are covered in ch3; tmpdir and other built-in fixtures are covered in ch4.
I see your problem but I'm not sure about the solution. The problem:
Your functions thing1 and thing2 contain yield statements. When you call a function like that, the returned value is a "generator object." It's an iterator - a sequence of values, which is of course not the same thing as the first value of yield, or any one particular value.
Those are the objects being passed to your test_attr function. The test environment is doing that for you automagically, or at least I think that's how it works.
What you really want is the object created in your yield expression, in other words, Thing(datadir=datadir, errorfile=errorfile). There are three ways to get a generator to emit its individual values: by calling next(iter), by calling iter.__next__() or by using the iterator in a loop with an in expression.
One possibility is to iterate the generator once. Like this:
def test_attr(thing):
    first_thing = next(thing)
    print(first_thing.datadir)
    assert os.path.exists(first_thing.datadir)
first_thing will be the object you want to test, i.e., Thing(datadir=datadir, errorfile=errorfile).
But this is only the first hurdle. The generator function is not finished. Its internal "program counter" is just after the yield statement. So you haven't exited the context manager and haven't deleted your temporary directory yet. To do this you must call next(thing) again and catch a StopIteration exception.
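To make the two-step behaviour concrete, here is a small sketch outside of pytest. The generator data_dir is a made-up example that mirrors the setup/teardown shape of the question's code, with a try/finally added so that the teardown also runs if the generator is closed early.

```python
import os
import shutil
import tempfile


def data_dir():
    """Generator with setup before the yield and teardown after it."""
    path = tempfile.mkdtemp()  # setup
    try:
        yield path
    finally:
        shutil.rmtree(path)  # teardown


gen = data_dir()
path = next(gen)  # runs the setup and pauses at the yield
existed_during = os.path.isdir(path)

try:
    next(gen)  # resumes after the yield, which runs the teardown
except StopIteration:
    pass  # expected: the generator has nothing more to yield

exists_after = os.path.isdir(path)
```

Here existed_during records that the directory was present while the generator was paused, and exists_after records that it is gone once the second next() call has run the teardown.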
Alternatively I think this will work:
def test_attr(thing):
    for a_thing in thing:
        print(a_thing.datadir)
        assert os.path.exists(a_thing.datadir)
The in expression loops through all the items in the iterator (there's only one) and exits gracefully when StopIteration occurs. The function exits from the context manager and your work is done.
To me it's an open question whether this makes your code more or less readable and maintainable. It's a bit clumsy.
