pytest fixtures: testing pandas dataframe - python

I have some scripts in a package directory and some tests in a tests directory, along with a CSV file containing a dataframe that I want to use for testing purposes.
main_directory/
|
|- package/
|  |- foo.py
|  |- bar.py
|
|- tests/
   |- conftest.py
   |- test1.py
   |- test.csv
I am using pytest, and I have defined a conftest.py containing a fixture that I want to use for the whole test session; it should return a pandas test dataframe read from the CSV file, as in the following:
# conftest.py
import pytest
from pandas import read_csv

path = "test.csv"

@pytest.fixture(scope="session")
def test_data():
    return read_csv(path)
I have been trying to use the fixture to supply the test dataframe to the test functions.
The original test functions were a bit more complex, calling pandas groupby on the object returned by the fixture. I kept getting the error 'TestStrataFrame' object has no attribute 'groupby', so I simplified the test to the one below and, as I was still getting errors, I realized that I am probably missing something.
My test is the following:
# test1.py
import unittest
import pytest
import pandas as pd

class TestStrataFrame(unittest.TestCase):
    def test_fixture(test_data):
        assert isinstance(test_data, pd.DataFrame) is True
The above test_fixture returns:
=============================================== FAILURES ================================================
_____________________________________ TestStrataFrame.test_fixture ______________________________________
test_data = <tests.test_data.TestStrataFrame testMethod=test_fixture>
    def test_fixture(test_data):
        ciao=test_data
>       assert isinstance(ciao,pd.DataFrame) is True
E       AssertionError: assert False is True
E        +  where False = isinstance(<tests.test_data.TestStrataFrame testMethod=test_fixture>, <class 'pandas.core.frame.DataFrame'>)
E        +  where <class 'pandas.core.frame.DataFrame'> = pd.DataFrame
tests/test_data.py:23: AssertionError
=========================================== warnings summary ============================================
../../../../../opt/miniconda3/envs/geo/lib/python3.7/importlib/_bootstrap.py:219
/opt/miniconda3/envs/geo/lib/python3.7/importlib/_bootstrap.py:219: RuntimeWarning: numpy.ufunc size changed, may indicate binary incompatibility. Expected 192 from C header, got 216 from PyObject
return f(*args, **kwds)
-- Docs: https://docs.pytest.org/en/stable/warnings.html
======================================== short test summary info ========================================
FAILED tests/test_data.py::TestStrataFrame::test_fixture - AssertionError: assert False is True
================================ 1 failed, 4 passed, 1 warning in 12.82s ================================
How can I do this correctly?
PS: At the moment I would not focus on the RuntimeWarning. I have been getting it since I started trying to solve this issue, but I am quite sure the tests were failing even before that warning appeared, so the two are probably unrelated. I reinstalled the environment and the warning persists; hopefully it will go away once the issue is solved...

This works for me (note that both arguments go to isinstance directly; wrapping both sides in type() makes the check vacuous):
isinstance(my_pd_df, pandas.core.frame.DataFrame)

The error occurs because test_data is not being passed to the test_fixture method. For example, below are two ways you can tweak your class and its methods.
import unittest
import pytest
import pandas as pd

class TestStrataFrame(unittest.TestCase):
    test_data = pd.DataFrame()

    def test_fixture(self):
        test_data = pd.DataFrame()
        assert isinstance(test_data, pd.DataFrame) is True

    def test_fixture_1(self):
        assert isinstance(TestStrataFrame.test_data, pd.DataFrame) is True
and run from the terminal: pytest test_sample.py

This is the expected behavior if you take note of the pytest documentation page on unittest integration, which clearly states:
The following pytest features do not work, and probably never will due to different design philosophies:
1. Fixtures (except for autouse fixtures, see below);
2. Parametrization;
3. Custom hooks;
You can modify your code as follows to make it work.
# conftest.py
from pathlib import Path
import pytest
from pandas import read_csv

CWD = Path(__file__).resolve()
FIN = CWD.parent / "test.csv"

@pytest.fixture(scope="class")
def test_data(request):
    request.cls.test_data = read_csv(FIN)
# test_file.py
import unittest
import pytest
import pandas as pd

@pytest.mark.usefixtures("test_data")
class TestStrataFrame(unittest.TestCase):
    def test_fixture(self):
        assert hasattr(self, "test_data")
        assert isinstance(self.test_data, pd.DataFrame)
$ pytest tests/
============================= test session starts ==============================
platform darwin -- Python 3.9.1, pytest-6.2.2, py-1.10.0, pluggy-0.13.1
rootdir: /Users/***/Desktop/scripts/stackoverflow
collected 1 item
tests/test_file.py . [100%]
============================== 1 passed in 0.03s ===============================
You can see more about mixing fixtures with the unittest framework here.
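As an aside (an addition, not part of the original answer): if the unittest.TestCase base class is not a hard requirement, plain pytest test functions can receive the fixture directly as an argument, which avoids the request.cls indirection entirely. A minimal sketch under that assumption:

# conftest.py -- hypothetical variant that returns the dataframe directly
from pathlib import Path
import pytest
from pandas import read_csv

CSV_PATH = Path(__file__).resolve().parent / "test.csv"

@pytest.fixture(scope="session")
def test_data():
    return read_csv(CSV_PATH)

# test_plain.py -- pytest injects the fixture by matching the parameter name
import pandas as pd

def test_fixture(test_data):
    assert isinstance(test_data, pd.DataFrame)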

Related

Fixture found but not called

I am testing a bot built with python-telegram-bot, using pyrogram to simulate inputs. I have to set a cryptography key in my environment so the code being tested can access it. I tried to accomplish this using the setup/teardown concept inside pytest fixtures.
For that, I created a file where all my fixtures live, and one of them is decorated with @pytest.fixture(autouse=True, scope="session").
The fixture is being found, but not executed. Why?
File structure:
--root
  |- tests/
     |- conftest.py
     |- test_something1.py
     |- test_something2.py
I am executing pytest -s --fixtures from the root folder
Here is conftest.py
# coding:utf-8
import os
import pytest
from pyrogram import Client
from telegram.ext import Application
from config import BOT_ID
from contrib.consts import CRYPT_KEY_EDK
from tests.config import TESTER_USERNAME, API_ID, API_HASH, CRYPT_KEY

@pytest.fixture
async def bot():
    async with Application.builder().token(BOT_ID).build() as bot:
        yield bot

@pytest.fixture
async def pyro_client():
    async with Client(name=TESTER_USERNAME, api_id=API_ID, api_hash=API_HASH) as pyro_client:
        yield pyro_client

@pytest.fixture(autouse=True, scope="session")
def set_crypt_key():
    print("Entered set_crypt_key fixture")
    os.environ[CRYPT_KEY_EDK] = CRYPT_KEY
And here is 'test_can_access_netbox.py':
# coding: utf-8
from unittest import mock
import pynetbox
import pytest
import requests
from pynetbox import RequestError
from telegram.ext import ContextTypes
from contrib.consts import NETBOX_CONFIG_CDK
from contrib.datawrappers.netboxconfig import NetboxConfig
from tests.config import NETBOX_TEST_TOKEN, NETBOX_TEST_URL

chat_data_mock = {
    NETBOX_CONFIG_CDK: NetboxConfig(token=NETBOX_TEST_TOKEN, url=NETBOX_TEST_URL)
}

@mock.patch("telegram.ext.ContextTypes.DEFAULT_TYPE", chat_data=chat_data_mock)
def test_can_access_netbox(context: ContextTypes.DEFAULT_TYPE):
    """TC002"""
    try:
        nb = pynetbox.api(**context.chat_data[NETBOX_CONFIG_CDK].to_dict())
        nb.status()
    except RequestError as e:
        pytest.fail("Error requesting connection to Netbox: {}".format(e))
    except requests.exceptions.ConnectionError as e:
        pytest.fail("Connection error: {}".format(e))

if __name__ == '__main__':
    test_can_access_netbox()
And when I run my tests with pytest -s --fixtures, I get this error:
------------------------------------------------------------- fixtures defined from tests.fixtures -------------------------------------------------------------
bot -- tests/fixtures.py:19
pyro_client -- tests/fixtures.py:29
set_crypt_key [session scope] -- tests/fixtures.py:39
============================================================================ ERRORS ============================================================================
_______________________________________________________ ERROR collecting tests/test_can_access_netbox.py _______________________________________________________
tests/test_can_access_netbox.py:16: in <module>
    NETBOX_CONFIG_CDK: NetboxConfig(token=NETBOX_TEST_TOKEN, url=NETBOX_TEST_URL)
contrib/datawrappers/netboxconfig.py:8: in __init__
    super().__init__()
contrib/classesbehaviors/cryptographs.py:13: in __init__
    self.crypt = Fernet(os.environ[CRYPT_KEY_EDK].encode())
/usr/lib/python3.8/os.py:675: in __getitem__
    raise KeyError(key) from None
E   KeyError: 'TELOPS_CRYPT_KEY'
=================================================================== short test summary info ====================================================================
ERROR tests/test_can_access_netbox.py - KeyError: 'TELOPS_CRYPT_KEY'
======================================================================= 1 error in 0.36s =======================================================================
Notice that my print() did not appear, despite me using -s on the command line.
Libs I am using:
pytest==7.2.1
pytest-asyncio==0.20.3
Python is 3.8.10
Here are the first lines of the command output, i hope they help:
platform linux -- Python 3.8.10, pytest-7.2.1, pluggy-1.0.0
rootdir: /home/joao/Desktop/enterprise/project_name
plugins: asyncio-0.20.3, anyio-3.6.2
asyncio: mode=strict
The error messages you pasted don't quite match the snippets you included, so it is hard to see the precise reason for the failures.
For example, if we look at the conftest.py snippet you included, it will definitely fail, but with a different exception - a NameError, because it will not be able to resolve the names CRYPT_KEY_EDK or CRYPT_KEY.
Secondly, the error message mentions a file test_can_access_netbox.py, which we cannot see here.
This being said, here is a simple example to show how one can use a fixture to set an environment variable:
constants.py
CRYPT_KEY_EDK = "CRYPT-KEY-EDK-VALUE"
CRYPT_KEY = "SOME-CRYPT-KEY-VALUE"
test_fixture_executed.py
import pytest
import os
import constants

@pytest.mark.asyncio
async def test_env_var_is_present():
    assert constants.CRYPT_KEY_EDK in os.environ
conftest.py
import pytest
import os
import constants

@pytest.fixture(autouse=True, scope="session")
def set_crypt_key():
    print("Entered set_crypt_key fixture")
    # qualify the names through the imported module, otherwise this raises a NameError
    os.environ[constants.CRYPT_KEY_EDK] = constants.CRYPT_KEY
Okay, I figured it out (thanks to willcode.io for asking for the test file):
In my test file I had some lines outside of any function. The problem was that they were being executed while pytest was collecting the tests, but these lines needed to run after the fixtures.
Solution: as these floating lines were doing some setup work, I just moved them into a fixture (see the sketch below), and everything is working now.
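To make the failure mode concrete, here is a hypothetical before/after sketch (not the original code): module-level statements run at import/collection time, before any session fixture, so anything that depends on the fixture's side effects has to move inside a fixture or test:

# test_can_access_netbox.py -- hypothetical sketch
import pytest

from contrib.consts import NETBOX_CONFIG_CDK
from contrib.datawrappers.netboxconfig import NetboxConfig
from tests.config import NETBOX_TEST_TOKEN, NETBOX_TEST_URL

# BEFORE (broken): constructed at import time, before set_crypt_key runs,
# so NetboxConfig.__init__ hits os.environ and raises KeyError.
# chat_data_mock = {NETBOX_CONFIG_CDK: NetboxConfig(token=NETBOX_TEST_TOKEN, url=NETBOX_TEST_URL)}

# AFTER: deferred into a fixture, which runs after the autouse session fixture
@pytest.fixture
def chat_data_mock():
    return {
        NETBOX_CONFIG_CDK: NetboxConfig(token=NETBOX_TEST_TOKEN, url=NETBOX_TEST_URL)
    }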

Change XDG_DATA_HOME environment variable in pytest

I have some trouble changing environment variables together with tmp_path in a project, so I tried to write a sample project to debug it. That doesn't work either, and I don't understand why. The project uses a settings.py file to define some constants; module.py imports these constants and does its work.
src
settings.py
import os
from pathlib import Path
XDG_HOME = Path(os.environ.get("XDG_DATA_HOME"))
HOME = XDG_HOME / "home"
module.py
from xdg_and_pytest.settings import HOME
def return_home(default=HOME):
    return default
tests
In my tests, I have a fixture to change the environment variable. The first test to call it puts tmp_path into the $XDG_DATA_HOME environment variable, but the second one gets the same path as the first one...
conftest.py
import pytest

@pytest.fixture
def new_home(tmp_path, monkeypatch):
    monkeypatch.setenv("XDG_DATA_HOME", str(tmp_path))
    return new_home
test_module.py
def test_new_home_first(new_home):
    from xdg_and_pytest.module import return_home
    assert "new_home_first" in str(return_home())

def test_new_home_second(new_home):
    from xdg_and_pytest.module import return_home
    assert "new_home_second" in str(return_home())
command-line result
poetry run pytest
====================== test session starts ======================
platform linux -- Python 3.10.4, pytest-7.1.2, pluggy-1.0.0
collected 2 items
tests/test_module.py .F [100%]
=========================== FAILURES ============================
_____________________ test_new_home_second ______________________
new_home = <function new_home at 0x7f75991b13f0>
    def test_new_home_second(new_home):
        from xdg_and_pytest.module import return_home
>       assert "new_home_second" in str(return_home())
E       AssertionError: assert 'new_home_second' in '/tmp/pytest-of-bisam/pytest-92/test_new_home_first0/home'
E        +  where '/tmp/pytest-of-bisam/pytest-92/test_new_home_first0/home' = str(PosixPath('/tmp/pytest-of-bisam/pytest-92/test_new_home_first0/home'))
E        +  where PosixPath('/tmp/pytest-of-bisam/pytest-92/test_new_home_first0/home') = <function return_home at 0x7f759920bac0>()
tests/test_module.py:10: AssertionError
==================== short test summary info ====================
FAILED tests/test_module.py::test_new_home_second - AssertionE...
================== 1 failed, 1 passed in 0.09s ==================
This is the clearest code I have, but I tried lots of different monkeypatching approaches. Maybe I should abandon the idea of a settings.py file and try something else? I don't want a scope=session solution, because I want to try different kinds of data in $XDG_DATA_HOME.
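The underlying issue, for what it's worth: settings.py evaluates os.environ.get("XDG_DATA_HOME") once, at first import, and Python caches the imported module, so every later test sees the value captured when the first test triggered the import; the default=HOME argument is likewise bound at function definition time. A minimal sketch of one common workaround, reading the environment at call time instead of import time (an assumption about intent, not code from the original post):

# settings.py -- expose functions instead of import-time constants
import os
from pathlib import Path

def xdg_home() -> Path:
    # looked up on every call, so monkeypatch.setenv takes effect per test
    return Path(os.environ["XDG_DATA_HOME"])

def home() -> Path:
    return xdg_home() / "home"

# module.py -- evaluate the default lazily rather than at definition time
from xdg_and_pytest.settings import home

def return_home(default=None):
    return default if default is not None else home()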

Attempted relative import with no known parent package when doing automated testing

My simplified folder structure is:
projectroot/
    __init__.py
    src/
        __init__.py
        util.py
    tests/
        __init__.py
        test_util.py
In util.py I have the following function:
def build_format_string(date: bool = True, time: bool = True) -> str:
    format_str = ""
    if date:
        format_str += "%Y-%m-%d"
    if time:
        if format_str and format_str[-1] != " ":  # guard the empty string to avoid an IndexError when date=False
            format_str += " "
        format_str += "%H:%M:%S"
    return format_str
I have written the corresponding test_build_format_string function inside test_util.py as follows:
from ..src.util import build_format_string
import pytest

@pytest.mark.parametrize('date, time, expected', [(True, True, "%Y-%m-%d %H:%M:%S")])
def test_build_format_string(date, time, expected):
    assert type(date) == bool, f"date arg of build_format_string must be boolean, not {type(date)}!"
    assert type(time) == bool, f"time arg of build_format_string must be boolean, not {type(time)}!"
    assert build_format_string(date, time) == expected, f"""Result of build_format_string when called with date={date} and time={time}
        must be {expected}; got {build_format_string(date, time)} instead."""
When running the automated test from the command line as python -m pytest test_util.py or py.test test_util.py, I get the 'attempted relative import beyond top-level package' error message, while running test_util.py in debug mode in my code editor gives me the similar 'attempted relative import with no known parent package' error.
I have already read through a bunch of SO comments on this very frequent error, but I am now even more confused than before. In many, I read that __init__.py should be placed in every folder and subfolder, but this is exactly what I have here; still, relative importing doesn't work. Without importing the function(s) located in src.util, I can't run my automated tests at all. How can I get this to work?
You are running the test script from inside its directory - Python has no way of knowing it is part of a package. cd to projectroot and run the tests from there:
$ python -m pytest tests/test_util.py
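For completeness (an addition, not part of the original answer, and assuming the layout shown in the question): another common way out is to drop the relative import in the test altogether:

# tests/test_util.py -- sketch: absolute import from the project root
from src.util import build_format_string

# Run from projectroot with:
#   python -m pytest tests/
# Unlike the bare 'pytest' command, 'python -m pytest' prepends the
# current working directory to sys.path, so the 'src' package resolves
# without any relative imports.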

How to integrate checking of readme in pytest

I use pytest in my .travis.yml to check my code.
I would like to check the README.rst, too.
I found readme_renderer via this Stack Overflow answer.
Now I am wondering how to integrate this into my current tests.
The docs of readme_renderer suggest this, but I have no clue how to integrate it into my setup:
python setup.py check -r -s
I think the simplest and most robust option is to write a pytest plugin that replicates what the distutils command you mentioned in your answer does.
That could be as simple as a conftest.py in your test dir. Or, if you want a standalone plugin that is distributable for all of us to benefit from, there's a nice cookiecutter template.
Of course, there's inherently nothing wrong with calling the check manually in your script section after the call to pytest.
I check it like this now:
# -*- coding: utf-8 -*-
from __future__ import absolute_import, division, unicode_literals, print_function
import os
import subx
import unittest

class Test(unittest.TestCase):
    def test_readme_rst_valid(self):
        base_dir = os.path.dirname(os.path.dirname(os.path.dirname(__file__)))
        subx.call(cmd=['python', os.path.join(base_dir, 'setup.py'), 'check', '--metadata', '--restructuredtext', '--strict'])
Source: https://github.com/guettli/reprec/blob/master/reprec/tests/test_setup.py
So I implemented something, but it does require some modifications. You need to modify your setup.py as below:
from distutils.core import setup

setup_info = dict(
    name='so1',
    version='',
    packages=[''],
    url='',
    license='',
    author='tarun.lalwani',
    author_email='',
    description=''
)

if __name__ == "__main__":
    setup(**setup_info)
Then you need to create a symlink so we can import this package in the test
ln -s setup.py setup_mypackage.py
And then you can create a test like the one below:
# -*- coding: utf-8 -*-
from __future__ import absolute_import, division, unicode_literals, print_function
import os
import unittest
from distutils.command.check import check
from distutils.dist import Distribution
import setup_mypackage

class Test(unittest.TestCase):
    def test_readme_rst_valid(self):
        dist = Distribution(setup_mypackage.setup_info)
        test = check(dist)
        test.ensure_finalized()
        test.metadata = True
        test.strict = True
        test.restructuredtext = True

        # collect the warnings instead of letting distutils print them
        issues = []
        test.warn = lambda msg: issues.append(msg)

        test.check_metadata()
        test.check_restructuredtext()
        assert len(issues) == 0, "\n".join(issues)
Running the test, I then get:
...
AssertionError: missing required meta-data: version, url
missing meta-data: if 'author' supplied, 'author_email' must be supplied too
Ran 1 test in 0.067s
FAILED (failures=1)
This is one possible workaround that I can think of.
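A side note not in the original answers: on current packaging toolchains, the setup.py check command is deprecated, and the usual replacement for validating README rendering is twine (which itself uses readme_renderer under the hood), assuming the build and twine packages are installed:

python -m build
twine check dist/*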
Upvoted because checking readme consistency is a nice thing I never integrated into my own projects. Will do from now on!
I think your approach of calling the check command is fine, although it will check more than the readme's markup: check validates the complete metadata of your package, including the readme, if you have readme_renderer installed.
If you want to write a unit test that checks only the markup and nothing else, I'd go with an explicit call to readme_renderer.rst.render:
import pathlib
from readme_renderer.rst import render

def test_markup_is_generated():
    readme = pathlib.Path('README.rst')
    assert render(readme.read_text()) is not None
The None check is the most basic test: if render returns None, it means that the readme contains errors preventing it from being translated to HTML. If you want more fine-grained tests, work with the returned HTML string. For example, I expect the word "extensions" in my readme to be emphasized:
import pathlib
import bs4
from readme_renderer.rst import render

def test_extensions_is_emphasized():
    readme = pathlib.Path('README.rst')
    html = render(readme.read_text())
    soup = bs4.BeautifulSoup(html, 'html.parser')  # explicit parser avoids a bs4 warning
    assert soup.find_all('em', string='extensions')
Edit: If you want to see the printed warnings, use the optional stream argument:
from io import StringIO

def test_markup_is_generated():
    warnings = StringIO()
    with open('README.rst') as f:
        html = render(f.read(), stream=warnings)
    warnings.seek(0)
    assert html is not None, warnings.read()
Sample output:
tests/test_readme.py::test_markup_is_generated FAILED
================ FAILURES ================
________ test_markup_is_generated ________
    def test_markup_is_generated():
        warnings = StringIO()
        with open('README.rst') as f:
            html = render(f.read(), stream=warnings)
        warnings.seek(0)
>       assert html is not None, warnings.read()
E       AssertionError: <string>:54: (WARNING/2) Title overline too short.
E
E       ----
E       fffffff
E       ----
E
E       assert None is not None
tests/test_readme.py:10: AssertionError
======== 1 failed in 0.26 seconds ========

Django and tests in docfiles

I'm having a small problem with my test suite with Django.
I'm working on a Python package that can run in both Django and Plone (http://pypi.python.org/pypi/jquery.pyproxy).
All the tests are written as doctests, either in the Python code or in separate docfiles (for example the README.txt).
I can get those tests running fine, but Django just does not count them:
[vincent ~/buildouts/tests/django_pyproxy]> bin/django test pyproxy
...
Creating test database for alias 'default'...
----------------------------------------------------------------------
Ran 0 tests in 0.000s
OK
But if I have a failing test, the failure does appear correctly:
[vincent ~/buildouts/tests/django_pyproxy]> bin/django test pyproxy
...
Failed example:
    1+1
Expected nothing
Got:
    2
**********************************************************************
1 items had failures:
1 of 44 in README.rst
***Test Failed*** 1 failures.
Creating test database for alias 'default'...
----------------------------------------------------------------------
Ran 0 tests in 0.000s
OK
This is how my test suite is declared right now:
import os
import doctest
from unittest import TestSuite

from jquery.pyproxy import base, utils

OPTIONFLAGS = (doctest.ELLIPSIS |
               doctest.NORMALIZE_WHITESPACE)

__test__ = {
    'base': doctest.testmod(
        m=base,
        optionflags=OPTIONFLAGS),
    'utils': doctest.testmod(
        m=utils,
        optionflags=OPTIONFLAGS),
    'readme': doctest.testfile(
        "../../../README.rst",
        optionflags=OPTIONFLAGS),
    'django': doctest.testfile(
        "django.txt",
        optionflags=OPTIONFLAGS),
}
I guess I'm doing something wrong when declaring the test suite but I don't have a clue what it is exactly.
Thanks for your help,
Vincent
I finally solved the problem with the suite() method:
import os
import doctest
from django.utils import unittest

from jquery.pyproxy import base, utils

OPTIONFLAGS = (doctest.ELLIPSIS |
               doctest.NORMALIZE_WHITESPACE)

testmods = {'base': base,
            'utils': utils}
testfiles = {'readme': '../../../README.rst',
             'django': 'django.txt'}

def suite():
    return unittest.TestSuite(
        [doctest.DocTestSuite(mod, optionflags=OPTIONFLAGS)
         for mod in testmods.values()] +
        [doctest.DocFileSuite(f, optionflags=OPTIONFLAGS)
         for f in testfiles.values()])
Apparently, the problem with calling doctest.testfile or doctest.testmod is that the tests are run directly, at import time.
Using DocTestSuite/DocFileSuite instead builds the suites, and the test runner then runs them.
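For what it's worth (an addition, not from the original answer): the same deferred-execution idea can be expressed with the standard library's load_tests protocol, which unittest-based loaders (including Django's, in later versions) pick up automatically. A minimal sketch, reusing the modules and flags from above:

import doctest

from jquery.pyproxy import base, utils

OPTIONFLAGS = doctest.ELLIPSIS | doctest.NORMALIZE_WHITESPACE

def load_tests(loader, tests, ignore):
    # build suites here; the test runner executes them later,
    # so the tests are counted instead of running at import time
    for mod in (base, utils):
        tests.addTests(doctest.DocTestSuite(mod, optionflags=OPTIONFLAGS))
    tests.addTests(doctest.DocFileSuite("django.txt", optionflags=OPTIONFLAGS))
    return tests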
