Pytest spending most time post-test? - python

I run a single function through pytest that (in the example below) takes 71 seconds to run. However, pytest spends an additional 6-7 minutes doing... something. I can tell from a log file that the intended test is executed at the beginning of the 7 minutes, but I cannot imagine what's going on afterwards (and apparently it's not the teardown, if the "slowest durations" output is to be believed).
The pytest function itself is extremely minimal:
def test_preprocess_and_train_model():
    import my_module.pipeline as pipeline  # noqa
    pipeline.do_pipeline(do_s3_upload=False,
                         debug=True, update_params={'tf_verbosity': 0})
If I run test_preprocess_and_train_model() by hand (e.g., if I invoke the function through an interpreter rather than through pytest), it takes about 70 seconds.
What is happening and how can I speed it up?
▶ pytest --version
pytest 6.2.2
▶ python --version
Python 3.8.5
▶ time pytest -k test_preprocess_and_train_model -vv --durations=0
====================================================== test session starts =======================================================
platform darwin -- Python 3.8.5, pytest-6.2.2, py-1.9.0, pluggy-0.13.1 -- /usr/local/opt/python@3.8/bin/python3.8
cachedir: .pytest_cache
rootdir: /Users/blah_blah_blah/tests
collected 3 items / 2 deselected / 1 selected
test_pipelines.py::test_preprocess_and_train_model PASSED [100%]
======================================================== warnings summary ========================================================
test_pipelines.py::test_preprocess_and_train_model
/Users/your_name_here/Library/Python/3.8/lib/python/site-packages/tensorflow/python/autograph/impl/api.py:22: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
import imp
test_pipelines.py: 3604341 warnings
sys:1: DeprecationWarning: PY_SSIZE_T_CLEAN will be required for '#' formats
test_pipelines.py::test_preprocess_and_train_model
/Users/your_name_here/Library/Python/3.8/lib/python/site-packages/tensorflow/python/keras/engine/training.py:2325: UserWarning: `Model.state_updates` will be removed in a future version. This property should not be used in TensorFlow 2.0, as `updates` are applied automatically.
warnings.warn('`Model.state_updates` will be removed in a future version. '
-- Docs: https://docs.pytest.org/en/stable/warnings.html
======================================================= slowest durations ========================================================
71.88s call test_pipelines.py::test_preprocess_and_train_model
0.00s teardown test_pipelines.py::test_preprocess_and_train_model
0.00s setup test_pipelines.py::test_preprocess_and_train_model
================================= 1 passed, 2 deselected, 3604343 warnings in 422.15s (0:07:02) ==================================
pytest -k test_preprocess_and_train_model -vv --durations=0 305.95s user 158.42s system 104% cpu 7:26.14 total

I thought I'd post the resolution of this in case it helps anyone else.
theY4Kman's suggestion of just profiling the code was excellent. My own code ran in 70 seconds as expected and exited cleanly, but it produced 3.6M warning messages. Each warning message in pytest triggered an os.stat() check (to figure out which line of code had produced the warning), and evaluating 3.6M system calls took 5 minutes or so. If I commented out the heart of warning_record_to_str() in pytest's warnings.py, then pytest took about 140 seconds (i.e., most of the problem was solved).
This also explained why the performance depended on the platform (Mac vs Ubuntu), because the number of error messages differed vastly.
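For anyone who wants to reproduce this kind of diagnosis, profiling the whole pytest run is straightforward; a sketch (the output file name and the sort key are arbitrary choices, not exactly what I used):
python -m cProfile -o pytest_profile.out -m pytest -k test_preprocess_and_train_model
python -c "import pstats; pstats.Stats('pytest_profile.out').sort_stats('cumulative').print_stats(25)"
Sorting by cumulative time is what makes the os.stat() calls described above stand out.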
There seem to be two sad conclusions, if I'm understanding things correctly:
Even if I use the --disable-warnings flag, it appears that pytest still collects all the warnings, stats them, etc.; it just doesn't print them. In my case, that meant pytest spent 85% of its time computing information that it immediately discarded.
The os.stat() check occurs inside Python's linecache module. Since I hit an identical warning 3.6M times, I expected linecache to run os.stat() once and cache the result, but that did not happen. I assume I'm just misunderstanding something basic about the intended use of this module.
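If you run into the same problem, two workarounds that may help (a sketch, not part of my original setup; the filter below just matches the PY_SSIZE_T_CLEAN warning shown earlier, so adjust the message and category for your own warnings):
# pytest.ini -- stop the noisy warning from being recorded at all
[pytest]
filterwarnings =
    ignore:PY_SSIZE_T_CLEAN will be required:DeprecationWarning
Alternatively, pytest -p no:warnings disables the warnings-capture plugin entirely, at the cost of losing the warnings summary for everything else.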

Related

How to view runtime warnings in PyCharm when running tests using pytest?

When running tests in PyCharm 2022.3.2 (Professional Edition) using pytest (6.2.4) and Python 3.9 I get the following result in the PyCharm console window:
D:\cenv\python.exe "D:/Program Files (x86)/JetBrains/PyCharm 2022.3.2/plugins/python/helpers/pycharm/_jb_pytest_runner.py" --path D:\tests\test_k.py
Testing started at 6:49 PM ...
Launching pytest with arguments D:\tests\test_k.py --no-header --no-summary -q in D:\tests
============================= test session starts =============================
collecting ... collected 5 items
test_k.py::test_init
test_k.py::test_1
test_k.py::test_2
test_k.py::test_3
test_k.py::test_4
======================= 5 passed, 278 warnings in 4.50s =======================
Process finished with exit code 0
PASSED [ 20%]PASSED [ 40%]PASSED [ 60%]PASSED [ 80%]PASSED [100%]
So the actual warnings don't show. Only the number of warnings (278) is shown.
I tried:
selecting: Pytest: do not add "--no-header --no-summary -q" in advanced settings
Setting Additional arguments to -Wall in the Run/Debug configurations window
Setting Interpreter options to -Wall in the Run/Debug configurations window
and all permutations, all to no avail. Is there a way to show all runtime warnings when running tests using pytest in PyCharm in the PyCharm Console window?
EDIT:
@Override12
When I select do not add "--no-header --no-summary -q" in advanced settings I get the following output:
D:\Projects\S\SHARK\development_SCE\cenv\python.exe "D:/Program Files (x86)/JetBrains/PyCharm 2020.3.4/plugins/python/helpers/pycharm/_jb_pytest_runner.py" --path D:\Projects\S\SHARK\development_SCE\cenv\Lib\site-packages\vistrails-3.5.0rc0-py3.9.egg\vistrails\packages\SHARK\analysis\tests\test_fairing_1_plus_k.py -- --jb-show-summary
Testing started at 10:07 AM ...
Launching pytest with arguments D:\Projects\S\SHARK\development_SCE\cenv\Lib\site-packages\vistrails-3.5.0rc0-py3.9.egg\vistrails\packages\SHARK\analysis\tests\test_fairing_1_plus_k.py in D:\Projects\S\SHARK\development_SCE\cenv\Lib\site-packages\vistrails-3.5.0rc0-py3.9.egg\vistrails\packages
============================= test session starts =============================
platform win32 -- Python 3.9.7, pytest-6.2.4, py-1.10.0, pluggy-0.13.1 -- D:\Projects\S\SHARK\development_SCE\cenv\python.exe
cachedir: .pytest_cache
rootdir: D:\Projects\S\SHARK\development_SCE\cenv\Lib\site-packages\vistrails-3.5.0rc0-py3.9.egg\vistrails\packages
plugins: pytest_check-1.0.5
collecting ... collected 5 items
SHARK/analysis/tests/test_fairing_1_plus_k.py::test_init
SHARK/analysis/tests/test_fairing_1_plus_k.py::test_without_1_k_fairing
SHARK/analysis/tests/test_fairing_1_plus_k.py::test_1_k_fairing_given
SHARK/analysis/tests/test_fairing_1_plus_k.py::test_without_1_k_fairing_only_3_values_under_threshold
SHARK/analysis/tests/test_fairing_1_plus_k.py::test_1_k_fairing_given_only_3_values_under_threshold
============================== warnings summary ===============================
......\pyreadline\py3k_compat.py:8
D:\Projects\S\SHARK\development_SCE\cenv\lib\site-packages\pyreadline\py3k_compat.py:8: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated since Python 3.3, and in 3.10 it will stop working
return isinstance(x, collections.Callable)
......\nose\importer.py:12
D:\Projects\S\SHARK\development_SCE\cenv\lib\site-packages\nose\importer.py:12: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
from imp import find_module, load_module, acquire_lock, release_lock
SHARK/analysis/tests/test_fairing_1_plus_k.py: 276 warnings
D:\Projects\S\SHARK\development_SCE\cenv\lib\site-packages\pymarin\objects\key.py:1101: UserWarning: siUnits is deprecated, use siUnit
warnings.warn('siUnits is deprecated, use siUnit')
-- Docs: https://docs.pytest.org/en/stable/warnings.html
======================= 5 passed, 278 warnings in 5.79s =======================
Process finished with exit code 0
PASSED [ 20%]PASSED [ 40%]PASSED [ 60%]PASSED [ 80%]PASSED [100%]
So 4 warnings are displayed. However I would like to see all 278 warnings.
When I run pytest from the command line outside PyCharm I get the same result. So it seems to be a pytest problem and it seems that it has nothing to do with PyCharm.
I think you could also try adding verbosity or collecting the warnings programmatically, by changing how you execute the tests.
The solution is a combination of two things:
Setting 'do not add "--no-header --no-summary -q"' in advanced settings, as @Override12 suggested.
When the same warning is issued multiple times, only the first occurrence is displayed. In my case, fixing that first warning reduced the number of warnings from 278 to 2.
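If you also need to track down where one of those repeated warnings is raised, one option (a sketch; the category below is just an example) is to promote it to an error so pytest prints a full traceback at the point it is triggered:
pytest -W error::UserWarning test_k.py
or equivalently in pytest.ini:
[pytest]
filterwarnings =
    error::UserWarning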

pytest --cov is telling me I haven't imported something that I have

Here's what I run, and what I get:
me $ pytest --cov MakeInfo.py
================================================================================ test session starts =================================================================================
platform darwin -- Python 3.7.4, pytest-6.2.5, py-1.10.0, pluggy-0.13.0
rootdir: /Users/me/Documents/workspace-vsc/Pipeline/src/python
plugins: cov-2.12.1, arraydiff-0.3, remotedata-0.3.2, doctestplus-0.4.0, openfiles-0.4.0
collected 42 items
test_MakeInfo.py ............ [ 28%]
test_MakeJSON.py ... [ 35%]
test_convert_refflat_to_bed.py .. [ 40%]
test_generate_igv_samples.py ... [ 47%]
test_sample_info_to_jsons.py .... [ 57%]
read_coverage/test_intervaltree.py ......... [ 78%]
util/test_util.py ......... [100%]Coverage.py warning: Module MakeInfo.py was never imported. (module-not-imported)
Coverage.py warning: No data was collected. (no-data-collected)
WARNING: Failed to generate report: No data to report.
/Users/me/opt/anaconda3/lib/python3.7/site-packages/pytest_cov/plugin.py:271: PytestWarning: Failed to generate report: No data to report.
self.cov_controller.finish()
Here's the top of test_MakeInfo.py:
import pytest
import os
import sys
import json
import MakeInfo
from MakeInfo import main, _getNormalTumorInfo
What I'm looking for: Tell me how much of MakeInfo.py is covered by tests in test_MakeInfo.py
What I'm getting: confused
Is my command wrong for what I want? Nothing calls into MakeInfo.py, it's stand alone and called from the command line. So of course none of the other tests are including it.
Is there a way to tell pytest --cov to look at this test file, and the source file, and ignore everything else?
Try passing just the Python module name MakeInfo instead of the file name MakeInfo.py. If the module lives in a subdirectory, use dotted notation, e.g. some_dir.some_subdir.MakeInfo:
pytest --cov MakeInfo
So, it turns out that in this case the order of arguments is key:
pytest test_MakeInfo.py --cov
Runs pytest on my one test file, and gives me coverage information for the one source file
pytest --cov test_MakeInfo.py
Tries to run against every test file (one of which was written by someone else on my team, and apparently uses a different test tool, so it throws up errors when pytest tries to run it)
pytest --cov MakeInfo
Is part way there: it runs all the tests, including the ones that fail, but then gives me the coverage information I want
So @Niel's answer is what you want if you have multiple test files targeting a single source file.
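Putting the two together (assuming MakeInfo is importable from the rootdir shown above) restricts both the tests that run and the coverage report to the one pairing:
pytest test_MakeInfo.py --cov=MakeInfo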

pytest-cov options for code coverage from external libraries

Currently I'm trying to implement an automation testing tool for python projects. Is it possible to collect the code coverage from external libraries using pytest-cov module? As far as I know only the coverage module will report the code coverage from external libraries!
Example:
import random

def test_rand():
    assert random.randint(0,10) == 5
Using the command coverage run -m --pylib pytest file.py::test_rand we can get the code coverage from external libraries (e.g. random module in our case).
Is it possible to do the same thing using pytest-cov instead?
By default pytest-cov will report coverage for all libraries, including external.
If you run pytest --cov against your code it will produce many lines of coverage including py, pytest, importlib, etc.
To limit the scope of the coverage, i.e. you only want to inspect coverage for random, just pass the module names to the cov option e.g. pytest --cov=random. The coverage report then only considers the named modules. You can also pass multiple modules by specifying multiple cov values, e.g. pytest --cov=random --cov=pytest
Here's an example running your test to produce coverage only against random
$ pytest --cov=random
====== test session starts ======
platform linux -- Python 3.6.12, pytest-6.2.4, py-1.10.0, pluggy-0.13.1
plugins: cov-2.12.1
collected 1 item
test_something.py F
[100%]
=========== FAILURES ============
___________ test_rand ___________
    def test_rand():
        import random
>       assert random.randint(0,10) == 5
E       AssertionError: assert 0 == 5
E        +  where 0 = <bound method Random.randint of <random.Random object at ...>>(0, 10)
test_something.py:6: AssertionError
---------- coverage: platform linux, python 3.6.12-final-0 -----------
Name             Stmts   Miss  Cover
------------------------------------
/.../random.py     350    334     5%
------------------------------------
TOTAL              350    334     5%
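If you also want to see which lines of random were never executed, the term-missing report (a standard pytest-cov option, shown here as a sketch) adds a Missing column to the table above:
$ pytest --cov=random --cov-report=term-missing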

Python coverage report covering only test file

I’m pretty new to contributing to open source projects and am trying to get some coverage reports so I can find out what needs more / better testing. However, I am having trouble getting the full coverage of a test. This is for pytorch
For example, lets say I want to get the coverage report of test_indexing_py.
I run the command:
pytest test_indexing.py --cov=../ --cov-report=html
Resulting in this:
================================================= test session starts =================================================
platform win32 -- Python 3.7.4, pytest-5.2.1, py-1.8.0, pluggy-0.13.0
rootdir: C:\Projects\pytorch
plugins: hypothesis-5.4.1, arraydiff-0.3, cov-2.8.1, doctestplus-0.4.0, openfiles-0.4.0, remotedata-0.3.2
collected 62 items
test_indexing.py ............................s................................. [100%]
----------- coverage: platform win32, python 3.7.4-final-0 -----------
Coverage HTML written to dir htmlcov
=========================================== 61 passed, 1 skipped in 50.43s ============================================
Ok, looks like the tests ran. Now when I check the html coverage report, I only get the coverage for the test file and not for the classes tested (the tests are ordered by coverage percentage).
As you can see, I am getting coverage for only test_indexing.py. How do I get the full coverage report including the classes tested?
Any guidance will be greatly appreciated.
I think it's because you are measuring coverage from the directory where the tests run, i.e. where test_indexing.py is.
A better approach is to run the tests from the project root rather than from the test directory; among other advantages, pytest will then pick up the project's configuration files.
So, regarding your question, try running the tests from the root directory with:
pytest path/to/test/ --cov --cov-report=html
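If running from the root still only reports the test file, you can point --cov explicitly at the package under test; for PyTorch that would be something like the following (a sketch, assuming the code you exercise is importable as torch):
pytest test/test_indexing.py --cov=torch --cov-report=html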

Py.test running very slowly (and not at all with some Python3 statements)

In my code, I have a line that says something like
print("Some string:", end=" ")
When I try to run pytest, I get the following:
ryansmacbook:scripts ryan$ py.test file.py
============================= test session starts ==============================
platform darwin -- Python 2.7.5 -- py-1.4.20 -- pytest-2.5.2
collected 0 items / 1 errors
==================================== ERRORS ====================================
________________________ ERROR collecting cpp-allman.py ________________________
/Library/Python/2.7/site-packages/pytest-2.5.2-py2.7.egg/_pytest/python.py:451: in _importtestmodule
> mod = self.fspath.pyimport(ensuresyspath=True)
/Library/Python/2.7/site-packages/py-1.4.20-py2.7.egg/py/_path/local.py:620: in pyimport
> __import__(modname)
E File "/Users/ryan/path/to/file.py", line 65
E "Some string:", end=" ")
E ^
E SyntaxError: invalid syntax
=========================== 1 error in 0.07 seconds ============================
When I comment out the print statement, testing takes forever. I'm trying to test regexes (Testing regexes in Python using py.test), but this is what happens:
ryansmacbook:scripts ryan$ py.test file.py
============================= test session starts ==============================
platform darwin -- Python 2.7.5 -- py-1.4.20 -- pytest-2.5.2
collecting 0 items^C
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! KeyboardInterrupt !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
/Users/ryan/magzor/scripts/file.py:134: KeyboardInterrupt
============================= in 2397.55 seconds ==============================
Before I implemented that possible solution, testing took between 30 and 60 seconds, which still seems too long. What's going on?
EDIT: I commented out a part of my code that read from one file and wrote to another but was not contained within a test_-prefixed function. Pytest now runs in about 0.03 seconds. Why is that? I thought tests were independent of program function.
You need to have py.test installed in a python3 environment. A #! line with python3 will only be picked up if the script is run by the OS directly (i.e. ./file.py).
To be sure you're invoking the correct version of py.test you could invoke it as python3 -m pytest. Note the second line of py.test's output when running tests where it shows exactly which version of python is being used (2.7.5 in your case).
The speed issue is probably a separate question that is hard to answer without seeing the code involved. Somehow the commented-out code must have been triggered before, most likely at import time, since py.test does not randomly start to run code.
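On the import-time point: a common way to keep that file-reading/writing code from executing while pytest merely imports the module is to guard it, roughly like this (a sketch; the function and file names are made up):
def convert_file(src, dst):
    # the read-one-file / write-another logic lives in a function ...
    with open(src) as fin, open(dst, "w") as fout:
        fout.write(fin.read())

if __name__ == "__main__":
    # ... and is only invoked when the script is run directly,
    # not when pytest imports the module to collect tests
    convert_file("input.txt", "output.txt")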
