How to speed up pytest - python

Is there some way to speed up the repeated execution of pytest? It seems to spend a lot of time collecting tests, even if I specify which files to execute on the command line. I know it isn't a disk speed issue either since running pyflakes across all the .py files is very fast.
The various answers address different ways pytest can be slow. Some helped in my case and others did not. I'm adding one more answer that explains a common cause of slowness, but it isn't possible to pick a single "the" answer here.

Using the norecursedirs option in pytest.ini or tox.ini can save a lot of collection time, depending on what other files you have in your working directory. My collection time is roughly halved for a suite of 300 tests when I have that in place (0.34s vs 0.64s).
If you're already using tox like I am, you just need to add the following in your tox.ini:
[pytest]
norecursedirs = docs *.egg-info .git appdir .tox
You can also add it in a free-standing pytest.ini file.
The pytest documentation has more details on pytest configuration files.

I was having the same problem where I was calling pytest at the root of my project and my tests were three subdirectories down. The collection was taking 6-7 seconds before 0.4 seconds of actual test execution.
My solution initially was to call pytest with the relative path to the tests:
pytest src/www/tests/
If doing that speeds up your collection as well, you can add the relative path to the tests to the end of the addopts setting in your pytest.ini, e.g.:
[pytest]
addopts = --doctest-glob='test_*.md' -x src/www/tests/
This dropped the collection + execution time down to about a second and I could still just call pytest as I was before.

With the pytest-xdist plugin you can parallelize pytest runs. It even allows you to ship tests to remote machines. Depending on your setup, it can speed things up quite a bit :)
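For example, with the plugin installed, a minimal invocation spreads the tests across all available CPU cores:
pip install pytest-xdist
pytest -n auto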

In bash, try { find -name '*_test.py'; find -name 'test_*.py'; } | xargs pytest.
For me, this brings total test time down to a fraction of a second.

For me, adding PYTHONDONTWRITEBYTECODE=1 to my environment variables achieved a massive speedup! Note that I am using network drives which might be a factor.
Windows Batch: set PYTHONDONTWRITEBYTECODE=1
Unix: export PYTHONDONTWRITEBYTECODE=1
subprocess.run: Add keyword env={'PYTHONDONTWRITEBYTECODE': '1'}
PyCharm already set this variable automatically for me.
Note that the first two options only remain active for your current terminal session.
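If you launch pytest from Python via subprocess.run, also note that passing env= replaces the entire environment, so it is usually safer to copy os.environ and add the flag. A minimal sketch (the "tests/" path is a placeholder):
import os
import subprocess

# Copy the current environment and add the flag, rather than replacing everything.
env = dict(os.environ, PYTHONDONTWRITEBYTECODE="1")
subprocess.run(["pytest", "tests/"], env=env)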

In the special case where you are running under cygwin's python, its unix-style file handling is slow. See pytest.py test very slow startup in cygwin for how to speed things up in that special situation.

If you have some antivirus software running, try turning it off. I had this exact same problem. Collecting tests ran incredibly slow. It turned out to be my antivirus software (Avast) that was causing the problem. When I disabled the antivirus software, test collection ran about five times faster. I tested it several times, turning the antivirus on and off, so I have no doubt that was the cause in my case.
Edit: To be clear, I don't think antivirus should be turned off and left off. I just recommend turning it off temporarily to see if it is the source of the slow down. In my case, it was, so I looked for other antivirus solutions that didn't have the same issue.

Pytest imports all modules in the testpaths directories to look for tests. The import itself can be slow. It is the same startup time you'd experience if you ran those tests directly, but since pytest imports all of the files, the total is much longer. It's something of a worst-case scenario.
This doesn't add time to the whole test run though, as it would need to import those files anyway to execute the tests.
If you narrow down the search on the command line, to specific files or directories, it will only import those ones. This can be a significant speedup while running specific tests.
Speeding up those imports means modifying those modules: the size of each module, and its transitive imports, slow down startup. Additionally, look for any code that runs at import time, that is, code outside of any function or class. That also has to execute during the test collection phase.
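As an illustration (a contrived sketch, not taken from the answers above), module-level work runs as soon as pytest imports the file during collection, while work inside a fixture is deferred until a test actually needs it:
# Version 1, slow to collect: this runs the moment pytest imports the module.
# time.sleep stands in for an expensive import or setup call.
#
#     time.sleep(2)
#     BIG_TABLE = list(range(10**6))
#
# Version 2, faster to collect: defer the work into a fixture so it only runs
# when a test actually uses it.
import time
import pytest

@pytest.fixture(scope="module")
def big_table():
    time.sleep(2)                  # simulated expensive setup
    return list(range(10**6))

def test_first_element(big_table):
    assert big_table[0] == 0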

Related

Debug memory usage during py.test run

We have a test which passes if run standalone, but if we run all tests, py.test fails because no memory is left.
My question: how can I display the memory usage of the py.test process before and after each test?
That way we would be able to find the tests which have memory leaks.
Other solutions are welcome, too.
We run Python 2.7 on linux.
The root of the memory problem was solved: Django changed Queryset iteration to load all instances. In my case millions :-) See: https://docs.djangoproject.com/en/1.6/releases/1.6/#queryset-iteration
But I am still interested in the general question.
The pytest-xdist plugin gives you the --boxed option, where each test is run in its own subprocess.
You can use that to work around your problem, and also to track resource usage (though I'm not sure how at the moment).
Finally, it is quite possible that it is the interaction of your tests, and not a single test alone, that piles up memory. You can use the -k selector or the pytest-random plugin's --random flag to check that conjecture.
https://pypi.python.org/pypi/pytest-xdist
https://pypi.python.org/pypi/pytest-random
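To answer the original question directly, here is a minimal sketch of printing the process memory before and after each test, assuming the psutil package is installed (this is not from the answers above). Put it in a conftest.py:
import os

import psutil
import pytest

@pytest.fixture(autouse=True)
def report_memory(request):
    # Measure the resident set size of the test process around each test.
    process = psutil.Process(os.getpid())
    before = process.memory_info().rss
    yield
    after = process.memory_info().rss
    print("%s: %.1f MB -> %.1f MB" % (
        request.node.nodeid, before / 1e6, after / 1e6))
Run pytest with -s so the printed numbers are not swallowed by output capturing.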

What's a good way to find the difference between two nose tests runs?

I am trying to prepare a pull request for my changes to matplotlib here: https://github.com/shmuller/matplotlib.git. After merging with upstream/master (https://github.com/matplotlib/matplotlib.git), I wanted to find out if I broke anything, so I run the test suite (python tests.py -v -a) on upstream/master. I get:
Ran 4688 tests in 555.109s
FAILED(KNOWNFAIL=330, SKIP=9, errors=197, failures=16)
Now on my merged branch:
Ran 4682 tests in 555.070s
FAILED(KNOWNFAIL=330, SKIP=9, errors=200, failures=18)
Darn! Quite close, but not the same! So I did break something that wasn't broken before. Since there are thousands of tests, and lots of errors and failures to begin with, it is not obvious how to find out what I broke.
So my question is: What's a good way to find out which tests broke that weren't broken before?
tests.py essentially does:
import nose
nose.main()
so I am hoping for a feature in nose that helps me figure out what I broke, but couldn't find anything in the help (nosetests --help). I obviously could log and diff the whole output, but I'm hoping for a more elegant solution.
Save the logs to two files, A and B. Then use a diff tool like Meld or Emacs' M-x ediff to see the differences.
If you have a guess about what test(s) are relevant to the code you changed, then you could run
nosetests /path/to/test_file.py
Fix the errors relevant to the code you changed, and then see if outputs are identical (by running diff).
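If you want something slightly more targeted than a raw diff, a rough sketch like the following (file names are placeholders; it assumes verbose nose output where each test line ends with ok, FAIL or ERROR) prints the tests that are broken only in the second run:
def broken_tests(path):
    bad = set()
    with open(path) as f:
        for line in f:
            line = line.strip()
            # Verbose nose output ends each per-test line with ok / FAIL / ERROR.
            if line.endswith("FAIL") or line.endswith("ERROR"):
                bad.add(line.rsplit("...", 1)[0].strip())
    return bad

before = broken_tests("upstream.log")
after = broken_tests("merged.log")
for name in sorted(after - before):
    print(name)   # tests that broke only after the merge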
If you run
nosetests --with-id
then on subsequent runs, adding the --failed flag will cause nosetests to re-run only the failed tests. That may also help you zero-in on the differences.
nosetests --with-id --failed

Python benchmark tool like nosetests?

What I want
I would like to create a set of benchmarks for my Python project. I would like to see the performance of these benchmarks change as I introduce new code. I would like to do this in the same way that I test Python, by running a utility command like nosetests and getting a nicely formatted readout.
What I like about nosetests
The nosetests tool works by searching through my directory structure for any files named test_foo.py and running all functions test_bar() contained within. It runs all of those functions and prints out whether or not they raised an exception.
I'd like something similar that searched for all files bench_foo.py and ran all contained functions bench_bar() and reported their runtimes.
Questions
Does such a tool exist?
If not what are some good starting points? Is some of the nose source appropriate for this?
nosetests can run any type of test, so you can decide if they test functionality, input/output validity etc., or performance or profiling (or anything else you'd like). The Python Profiler is a great tool, and it comes with your Python installation.
import unittest
import cProfile

class ProfileTest(unittest.TestCase):
    def test_run_profiler(self):
        # foo, baz and bar stand in for the code you want to profile.
        cProfile.run('foo(bar)')
        cProfile.run('baz(bar)')
You just add a line to the test, or add a test to the test case for all the calls you want to profile, and your main source is not polluted with test code.
If you only want to time execution and not all the profiling information, timeit is another useful tool.
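For example, a minimal timing check might look like this (my_function and mymodule are placeholders for your own code):
import timeit

# Time 1000 calls of the function under test.
elapsed = timeit.timeit("my_function()",
                        setup="from mymodule import my_function",
                        number=1000)
print("1000 calls took %.3f s" % elapsed)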
The wheezy documentation has a good example on how to do this with nose. The important part if you just want to have the timings is to use options -q for quiet run, -s for not capturing the output (so you will see the output of the report) and -m benchmark to only run the 'timing' tests.
I recommend using py.test for testing instead of nose. To run the example from wheezy with it, change the name of the runTest method to test_bench_run and run only this benchmark with:
py.test -qs -k test_bench benchmark_hello.py
(-q and -s having the same effect as with nose and -k to select the pattern of the test names).
If you put your benchmark tests in a separate file or directory from the normal tests, they are of course easier to select and don't need special names.

Why are there no Makefiles for automation in Python projects?

As a long-time Python programmer, I wonder if a central aspect of Python culture has eluded me all this time: what do we do instead of Makefiles?
Most Ruby projects I've seen (not just Rails) use Rake; shortly after Node.js became popular, there was Cake. In many other (compiled and non-compiled) languages there are classic Makefiles.
But in Python, no one seems to need such infrastructure. I randomly picked Python projects on GitHub, and they had no automation besides the installation provided by setup.py.
What's the reason behind this?
Is there nothing to automate? Do most programmers prefer to run style checks, tests, etc. manually?
Some examples:
dependencies sets up a virtualenv and installs the dependencies
check calls the pep8 and pylint command-line tools.
the test task depends on dependencies, enables the virtualenv, starts selenium-server for the integration tests, and calls nosetests
the coffeescript task compiles all coffeescripts to minified javascript
the runserver task depends on dependencies and coffeescript
the deploy task depends on check and test and deploys the project.
the docs task calls sphinx with the appropriate arguments
Some of them are just one- or two-liners, but IMHO they add up. Thanks to the Makefile, I don't have to remember them.
To clarify: I'm not looking for a Python equivalent of Rake; I'm happy with Paver. I'm looking for the reasons.
Actually, automation is useful to Python developers too!
Invoke is probably the closest tool to what you have in mind, for automation of common repetitive Python tasks: https://github.com/pyinvoke/invoke
With invoke, you can create a tasks.py like this one (borrowed from the invoke docs)
from invoke import run, task

@task
def clean(docs=False, bytecode=False, extra=''):
    patterns = ['build']
    if docs:
        patterns.append('docs/_build')
    if bytecode:
        patterns.append('**/*.pyc')
    if extra:
        patterns.append(extra)
    for pattern in patterns:
        run("rm -rf %s" % pattern)

@task
def build(docs=False):
    run("python setup.py build")
    if docs:
        run("sphinx-build docs docs/_build")
You can then run the tasks at the command line, for example:
$ invoke clean
$ invoke build --docs
Another option is to simply use a Makefile. For example, a Python project's Makefile could look like this:
docs:
	$(MAKE) -C docs clean
	$(MAKE) -C docs html
	open docs/_build/html/index.html

release: clean
	python setup.py sdist upload

sdist: clean
	python setup.py sdist
	ls -l dist
Setuptools can automate a lot of things, and for things that aren't built-in, it's easily extensible.
To run unittests, you can use the setup.py test command after having added a test_suite argument to the setup() call. (documentation)
Dependencies (even if not available on PyPI) can be handled by adding a install_requires/extras_require/dependency_links argument to the setup() call. (documentation)
To create a .deb package, you can use the stdeb module.
For everything else, you can add custom setup.py commands.
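A rough sketch of the test_suite argument and a custom command (the names stylecheck and mypackage, and the pep8 call, are illustrative assumptions, not part of setuptools itself):
from setuptools import setup, Command

class StyleCheck(Command):
    """Run style checks via `python setup.py stylecheck`."""
    description = "run pep8 over the source tree"
    user_options = []          # no command-line options in this sketch

    def initialize_options(self):
        pass

    def finalize_options(self):
        pass

    def run(self):
        import subprocess
        subprocess.call(["pep8", "mypackage"])   # assumes pep8 is installed

setup(
    name="mypackage",
    version="0.1",
    test_suite="tests",                     # used by `python setup.py test`
    cmdclass={"stylecheck": StyleCheck},
)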
But I agree with S.Lott: most of the tasks you'd wish to automate (except maybe dependency handling, which is the only one I find really useful) are tasks you don't run every day, so automating them wouldn't bring any real productivity improvement.
There are a number of options for automation in Python. I don't think there is a culture against automation, there is just not one dominant way of doing it. The common denominator is distutils.
The one closest to your description is buildout. This is mostly used in the Zope/Plone world.
I myself use a combination of the following: Distribute, pip and Fabric. I am mostly developing using Django that has manage.py for automation commands.
It is also being actively worked on in Python 3.3
Any decent test tool has a way of running the entire suite in a single command, and nothing is stopping you from using rake, make, or anything else, really.
There is little reason to invent a new way of doing things when existing methods work perfectly well - why re-invent something just because YOU didn't invent it? (NIH).
The make utility is an optimization tool which reduces the time spent building a software image. The reduction in time is obtained when all of the intermediate materials from a previous build are still available, and only a small change has been made to the inputs (such as source code). In this situation, make is able to perform an "incremental build": rebuild only a subset of the intermediate pieces that are impacted by the change to the inputs.
When a complete build takes place, all that make effectively does is to execute a set of scripting steps. These same steps could just be deposited into a flat script. The -n option of make will in fact print these steps, which makes this possible.
A Makefile isn't "automation"; it's "automation with a view toward optimized incremental rebuilds." Anything scripted with any scripting tool is automation.
So, why would Python projects eschew tools like make? Probably because Python projects don't struggle with long build times that they are eager to optimize. And also, the compilation of a .py to a .pyc file does not have the same web of dependencies as a .c to a .o.
A C source file can #include hundreds of dependent files; a one-character change in any one of these files can mean that the source file must be recompiled. A properly written Makefile will detect when that is or is not the case.
A big C or C++ project without an incremental build system would mean that a developer has to wait hours for an executable image to pop out for testing. Fast, incremental builds are essential.
In the case of Python, probably all you have to worry about is when a .py file is newer than its corresponding .pyc, which can be handled by simple scripting: loop over all the files, and recompile anything newer than its byte code. Moreover, compilation is optional in the first place!
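The standard library already covers that simple loop; for instance (the "src" directory is a placeholder), compileall only rebuilds byte code that is out of date:
import compileall

# Recursively recompile any .py file whose .pyc is missing or stale.
compileall.compile_dir("src", quiet=1)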
So the reason Python projects tend not to use make is that their need to perform incremental rebuild optimization is low, and they use other tools for automation; tools that are more familiar to Python programmers, like Python itself.
The original PEP where this was raised can be found here. Distutils has become the standard method for distributing and installing Python modules.
Why? It just happens that python is a wonderful language to perform the installation of Python modules with.
Here are a few examples of Makefile usage with Python:
https://blog.horejsek.com/makefile-with-python/
https://krzysztofzuraw.com/blog/2016/makefiles-in-python-projects.html
I think most people are simply not aware of the "Makefile for Python" approach. It could be useful, but the "sexiness ratio" is too low for it to spread rapidly (just my personal point of view).
Is there nothing to automate?
Not really. All but two of the examples are one-line commands.
tl;dr Very little of this is really interesting or complex. Very little of this seems to benefit from "automation".
Thanks to documentation, I don't have to remember the commands to do this.
Do most programmers prefer to run style checks, tests, etc. manually?
Yes.
generating documentation
the docs task calls sphinx with the appropriate arguments
It's one line of code. Automation doesn't help much.
sphinx-build -b html source build/html. That's a script. Written in Python.
We do this rarely. A few times a week. After "significant" changes.
running stylechecks (Pylint, Pyflakes and the pep8-cmdtool).
check calls the pep8 and pylint command-line tools
We don't do this. We use unit testing instead of pylint.
You could automate that three-step process.
But I can see how SCons or make might help someone here.
tests
There might be space for "automation" here. It's two lines: the non-Django unit tests (python test/main.py) and the Django tests (manage.py test). Automation could be applied to run both lines.
We do this dozens of times each day. We never knew we needed "automation".
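If you did want to script that two-step run, a minimal sketch could be as small as this (the test paths are the ones quoted above and may differ in your project):
import subprocess

# Run the non-Django unit tests, then the Django test runner.
subprocess.check_call(["python", "test/main.py"])
subprocess.check_call(["python", "manage.py", "test"])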
dependencies sets up a virtualenv and installs the dependencies
Done so rarely that a simple list of steps is all that we've ever needed. We track our dependencies very, very carefully, so there are never any surprises.
We don't do this.
the test task depends on dependencies, enables the virtualenv, starts selenium-server for the integration tests, and calls nosetests
Starting the server and running nosetests as a two-step "automation" makes some sense. It saves you from entering the two shell commands to run both steps.
the coffeescript task compiles all coffeescripts to minified javascript
This is something that's very rare for us. I suppose it's a good example of something to be automated. Automating the one-line script could be helpful.
I can see how SCons or make might help someone here.
the runserver task depends on dependencies and coffeescript
Except that the dependencies change so rarely that this seems like overkill. I suppose it can be a good idea if you're not tracking dependencies well in the first place.
the deploy task depends on check and test and deploys the project.
It's an svn co and python setup.py install on the server, followed by a bunch of customer-specific copies from the subversion area to the customer /www area. That's a script. Written in Python.
It's not a general make or SCons kind of thing. It has only one actor (a sysadmin) and one use case. We wouldn't ever mingle deployment with other development, QA or test tasks.

Speeding up the python "import" loader

I'm getting seriously frustrated at how slow python startup is. Just importing more or less basic modules takes a second, since python runs down the sys.path looking for matching files (and generating 4 stat() calls - ["foo", "foo.py", "foo.pyc", "foo.so"] - for each check). For a complicated project environment, with tons of different directories, this can take around 5 seconds -- all to run a script that might fail instantly.
Do folks have suggestions for how to speed up this process? For instance, one hack I've seen is to set the LD_PRELOAD_32 environment variable to a library that caches the result of ENOENT calls (e.g. failed stat() calls) between runs. Of course, this has all sorts of problems (potentially confusing non-python programs, negative caching, etc.).
zipping up as many pyc files as feasible (with proper directory structure for packages), and putting that zipfile as the very first entry in sys.path (on the best available local disk, ideally) can speed up startup times a lot.
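A rough sketch of the idea (the archive and package names are placeholders; the standard zipimport machinery does the actual loading):
# Bundle the compiled modules into deps.zip first (any zip tool will do),
# preserving the package directory structure inside the archive.
import sys

sys.path.insert(0, "/fast/local/disk/deps.zip")   # make the zip the first entry
import mypackage                                  # now resolved from the zip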
The first things that come to mind are:
Try a smaller path
Make sure your modules are compiled to .pyc files so they'll load faster
Make sure you don't double import, or import too much
Other than that, are you sure that the disk operations are what's bogging you down? Is your disk/operating system really busy or old and slow?
Maybe a defrag is in order?
When trying to speed things up, profiling is key. Otherwise, how will you know which parts of your code are really the slow ones?
A while ago, I created the runtime and import profile visualizer tuna, and I think it may be useful here. Simply create an import profile (with Python 3.7+) and run tuna on it:
python3.7 -X importtime -c "import scipy" 2> scipy.log
tuna scipy.log
If you run out of options, you can create a ramdisk to store your python packages. A ramdisk appears as a directory in your file system, but will actually be mapped directly to your computer's RAM. Here are some instructions for Linux/Redhat.
Beware: A ramdisk is volatile, so you'll also need to keep a backup of your files on your regular hard drive, otherwise you'll lose your data when your computer shuts down.
Something's missing from your premise. I've never seen "more or less basic" modules take over a second to import, and I'm not running Python on what I would call cutting-edge hardware. Either you're running on some seriously old hardware, or on an overloaded machine, or your OS or Python installation is broken in some way. Or you're not really importing "basic" modules.
If it's any of the first three issues, you need to look at the root problem for a solution. If it's the last, we really need to know what the specific packages are to be of any help.
