Identifying which test case is running on pytest-xdist processes

I'm executing Python unit tests in parallel with pytest-xdist and pytest-forked (pytest -r sxX -n auto --cache-clear --max-worker-restart=4 --forked), and there is one test case which takes quite some time and which runs at the end while the other test runners/CPU cores sit idle (presumably because only this one test case is left to complete).
I'd like to know which test case that is (to maybe run it at the beginning or disable it). Note that this is not a matter of finding the longest-running test case, as that may not be the culprit; I'm explicitly looking for a way of knowing which test case is assigned to which pytest runner process. Calling ps shows something like python -u -c import sys; exec(eval(sys.stdin.readline())) (once per CPU core of the machine), which isn't particularly helpful.
Is there a way to set the name of the running test case on the process so it can be retrieved with system tools such as ps? I'm running those test cases on Linux, in case that's relevant.

Since pytest-xdist 2.4, there's a built-in way to show which test case each worker is running. It requires the additional package setproctitle.
Identifying workers from the system environment
New in version 2.4
If the setproctitle package is installed, pytest-xdist will use it to update the process title (command line) on its workers to show their current state. The titles used are [pytest-xdist running] file.py/node::id and [pytest-xdist idle], visible in standard tools like ps and top on Linux, Mac OS X and BSD systems. For Windows, please follow setproctitle’s pointer regarding the Process Explorer tool.
This is intended purely as an UX enhancement, e.g. to track down issues with long-running or CPU intensive tests. Errors in changing the title are ignored silently. Please try not to rely on the title format or title changes in external scripts.
https://pypi.org/project/pytest-xdist/#identifying-workers-from-the-system-environment
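For illustration, the title update that pytest-xdist performs boils down to a call like this (a minimal sketch, assuming the setproctitle package is installed; the nodeid in the string is a hypothetical example):
from setproctitle import setproctitle
# Set the process title that ps/top will display for this worker:
setproctitle('[pytest-xdist running] tests/test_slow.py::test_huge_dataset')
While a worker carries such a title, ps aux | grep pytest-xdist immediately reveals which test each worker is busy with.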

Here is a way to see which test is running while pytest-xdist is processing tests.
Link to docs: https://docs.pytest.org/en/7.1.x/reference/reference.html#_pytest.hookspec.pytest_report_teststatus
Add the following function to your conftest.py file.
# conftest.py
def pytest_report_teststatus(report):
    # Print the nodeid of each test as its status is reported.
    print(report.nodeid)
Example command to start pytest:
python -m pytest -n 3 -s --show-capture=no --disable-pytest-warnings
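With output capture disabled (-s), each worker's print shows up interleaved in the terminal as tests are reported, so when the run stalls at the end, the last nodeids printed identify the stragglers. The output looks roughly like this (hypothetical test names):
tests/test_api.py::test_login
tests/test_api.py::test_logout
tests/test_slow.py::test_huge_dataset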


Checking that PYTHONIOENCODING is always "utf8"

I know unit tests and write them daily.
They get executed during development and CI.
Now I have a check which I would like to enforce on the production system:
PYTHONIOENCODING must be "utf8"
Above I used the verb "test"; what I mean is checking the state. This question is not about how to perform the check itself.
AFAIK the unittest framework can't help me here, since it only gets executed during development and CI.
How can I solve this in the Python world without re-inventing the wheel?
The above is only an example; there are several other things besides PYTHONIOENCODING which I would like to check.
Another use case for these checks: some days ago we had an issue on the production server. The command-line tool convert gets used, and some of its versions are broken and create wrong results. I would like to write a simple check to ensure that the convert tool on the production server is not broken.
Straightforward approach (Checking)
Put this near the start of the code:
import os
if os.environ.get('PYTHONIOENCODING', '').lower() not in {'utf-8', 'utf8'}:
    raise EnvironmentError("Environment variable $PYTHONIOENCODING must be set to 'utf8'")
Alternative solution (Ensuring)
In one of the projects I code for, there's a "startup script", so instead of running python3 main.py, we run this in production:
bash main.sh
whose content is rather simple:
#!/bin/bash
export PYTHONIOENCODING=utf8
exec /usr/bin/env python3 main.py
testinfra
If you want to write and run tests against the deployment infrastructure, you can use the testinfra plugin for pytest. For example, a test for the simple requirement of validating an environment variable on the target machine could look like:
def test_env_var(host):
    assert host.run_expect((0,), 'test "$PYTHONIOENCODING" == "utf8"')
This infrastructure test suite can be developed in a separate project and invoked before the actual deployment takes place (for example, we invoke the infra tests right after the docker image is built; if the tests fail, the image is not uploaded to our private image repository/deployed to prod etc).
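For instance, testinfra's connection backends let the same suite target a freshly built container before it is pushed anywhere (a sketch; the container name app is hypothetical):
py.test --hosts='docker://app' test_env.py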

Using pytest, is it possible for a unit test to know that it is being run with code coverage monitoring on?

I am currently developing some tests using python py.test / unittest that, via subprocess, invoke another python application (so that I can exercise the command line options, and confirm that the tool is installed correctly).
I would like to be able to run the tests in such a way that I can get a view of the code coverage metrics (using coverage.py) for the target application using pytest_cov. By default this does not work as the code coverage instrumentation does not apply to code invoked with subprocess.
Code coverage does work if I update the tests to invoke the entry class of the target application directly (rather than running it via the command line).
Ideally I want to have a single set of code which can be run in two ways:
If code coverage monitoring is not enabled, then exercise the tool via the command line.
Otherwise, execute the main class of the target application directly.
Which leads to my question(s):
Is it possible for a python unit test to determine if it is being run with code coverage enabled?
Otherwise: is there any easy way to pass a command line flag from the pytest invocation that can be used to set the mode within the code.
Coverage.py has a facility to automatically measure coverage in sub-processes that are spawned: http://coverage.readthedocs.io/en/latest/subprocess.html
pytest-cov sets three environment variables when coverage is enabled: COV_CORE_SOURCE, COV_CORE_CONFIG and COV_CORE_DATAFILE.
So you can use a simple if-statement to verify whether the current test is being run with coverage enabled:
import os
if "COV_CORE_SOURCE" in os.environ:
    # do what you need to do when coverage is enabled
    ...
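Building on that, a single test helper can pick the invocation style at runtime (a minimal sketch; mytool and its entry point mytool.cli.main are hypothetical stand-ins for the target application):
import os
import subprocess

def run_tool(args):
    if "COV_CORE_SOURCE" in os.environ:
        # Coverage is active: call the entry point in-process so the
        # target code is instrumented by coverage.py.
        from mytool.cli import main  # hypothetical entry point
        return main(args)
    # No coverage: exercise the installed command line for real.
    return subprocess.call(["mytool"] + list(args))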

Comparing memory usage of shell script and Python script for same task

Assume two scripts: (a) a bash shell script that calls a Java JAR (hereafter my_shell_script), and (b) a Python script that imports functions from other Python packages but does not call any non-Python package or software (hereafter my_python_script). Both scripts have the same purpose: they take the same input (hereafter testinput) and generate roughly the same output.
I would like to measure and compare the memory usage of both scripts as a function of time as they execute.
To that end, I execute each script via valgrind using massif (setting the time_unit to milliseconds), followed by a summary of the massif output via ms_print.
INF=testinput
# Testing Shell script
valgrind --tool=massif --pages-as-heap=yes --time-unit=ms --massif-out-file=${INF}_shell.out bash my_shell_script -f $INF -j my_java_jar
ms_print --threshold=50.0 ${INF}_shell.out > ${INF}_shell.summary
# Testing Python script
valgrind --tool=massif --pages-as-heap=yes --time-unit=ms --massif-out-file=${INF}_Python.out python2 my_python_script $INF
ms_print --threshold=50.0 ${INF}_Python.out > ${INF}_Python.summary
While valgrind/massif record memory usage for my_python_script that is roughly consistent with what I see via htop, this is not the case for my_shell_script: htop indicates about a GB of memory used during its execution, yet valgrind/massif record only a few dozen MB.
Thus, I suspect that valgrind/massif record the memory usage of the bash code itself, but not of the Java JAR that the bash code calls.
How can I measure the memory usage of my_shell_script as a function of time correctly?
The option --trace-children=yes|no tells valgrind whether to trace child processes. The default value is no, which means that only the shell executing the script runs under valgrind, not the launched python or java process.
So, specify --trace-children=yes.
Be sure to use e.g. a %p in the massif out-file argument, otherwise all the processes (e.g. the shell and its children) will write to the same file, which will not work properly.
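Putting both fixes together, the shell-script measurement from the question becomes something like this (%p expands to the PID of each traced process, so every child writes its own massif file):
valgrind --tool=massif --pages-as-heap=yes --time-unit=ms --trace-children=yes --massif-out-file=${INF}_shell_%p.out bash my_shell_script -f $INF -j my_java_jar
ms_print can then be run on each per-process output file separately.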

Restart the process where the error has been detected

In a Django project, it is possible to create unit tests to verify what we have done so far. The principle is simple: we execute the command python3 manage.py test in the shell, and when an error is detected in the program, the shell displays it and stops the process. However, the procedure has a drawback: if we have several errors, we have to correct one and then restart the whole run, which can take several minutes depending on the program. Is there a way to restart the process from where the error was detected instead of restarting the whole procedure?
EDIT :
In fact, another problem I have is keeping the test database instead of recreating it. How could I do such a thing?
If you want to automatically re-run only the failing tests, you need to use a third-party test runner like Nose or create your own. But it's not worth it, because ...
You can specify particular tests to run by supplying any number of "test labels" to ./manage.py test. Each test label can be a full Python dotted path to a package, module, TestCase subclass, or test method. For instance:
# Run just one test method
$ ./manage.py test animals.tests.AnimalTestCase.test_animals_can_speak
Source: https://docs.djangoproject.com/en/1.10/topics/testing/overview/
This approach can be used to re-run only the tests that have failed.
Please note that third-party test runners will probably recreate the database every time you run the tests, even for only the failing ones. The default Django test runner, on the other hand, has the --keepdb option, which allows the database to be reused. For more details see: https://stackoverflow.com/a/37100979/267540
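Combining the two, you can iterate on a single failing test without rebuilding the database each time, e.g.:
$ ./manage.py test animals.tests.AnimalTestCase.test_animals_can_speak --keepdb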

Running Salome script without graphics

I exported a script from Salome (a dump), and I want to run it in Python (I'm doing some geometric operations and I don't need any graphics). So I removed all the graphic commands, but when I try to launch my Python file, Python cannot find the Salome libraries.
I tried exporting the Salome path ('install_path'/appli_V6_5_0p1/bin/salome/) in PYTHONPATH and LD_LIBRARY_PATH, but it still doesn't work.
I would also like to know if it's possible to use only the geompy library without Salome, and if so, how I can install only the geompy library. (I need to launch some geompy scripts on a UAV with only 8 GB of memory, so the less I install, the better.)
I had similar wishes to yours, but after much searching I ended up concluding that what we both want to do is not completely possible.
In order to run a salome script on the command line without the GUI use
salome -t python script.py
or simply
salome -t script.py
In order to run a salome script you must call it using the salome executable. It seems that you cannot use the salome libraries (by importing them into a python script that is then called with python script.py) without the compiled program. The executables that salome uses contain much of what the platform needs to do its job.
This frustrated me for a long time, but I found a workaround; for a simple example, if you have a salome script you can call the salome executable from within another python program with
os.system("salome -t python script.py")
But now you have a problem: salome does not automatically kill the session, so if you run the above command a number of times, your system becomes clogged up with multiple running salome processes. These can be killed manually by running killSalome.py, found in your salome installation folder. But beware! This kills all instances of salome running on your computer, which is a problem if you are running multiple model-generation scripts at once or also have the salome GUI open.
Obviously, a better way is for your script to kill each specific instance of salome after it has been used. The following is one method (the exact paths to the executable etc will need to change depending on your installation):
import subprocess

# Make a subprocess call to the salome executable and store the used port in a text file:
subprocess.call('/salomedirectory/bin/runAppli -t python script.py --ns-port-log=/absolute/path/salomePort.txt', shell=True)

# Read in the port number from the text file:
port_file = open('/absolute/path/salomePort.txt', 'r')
killPort = int(port_file.readline())
port_file.close()

# Kill the session with the specified port:
subprocess.call('/salomedirectory/bin/salome/killSalomeWithPort.py %s' % killPort, shell=True)
EDIT: Typo correction to the python os command.
EDIT2: I recently found that this method runs into problems when the port log file (here "salomePort.txt", but it can be named arbitrarily) is given with only a relative path. It seems that giving its full, absolute path is necessary for this to work.
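Wrapping the above steps into a helper keeps the run-and-kill sequence in one place (a sketch; the installation paths are placeholders, exactly as in the snippet above):
import subprocess

def run_salome_script(script, port_log='/absolute/path/salomePort.txt'):
    # Run the script headless, recording the session's port in port_log.
    subprocess.call('/salomedirectory/bin/runAppli -t python %s --ns-port-log=%s'
                    % (script, port_log), shell=True)
    # Kill only the session we just started, not every salome instance.
    with open(port_log) as f:
        port = int(f.readline())
    subprocess.call('/salomedirectory/bin/salome/killSalomeWithPort.py %s' % port,
                    shell=True)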
If you are working with salome on the Windows platform, use the following:
salome_folder\WORK>run_salome.bat -t script_file.py
According to the Salome FAQ:
To launch SALOME without the GUI, use the "runSalome -t" command: only the necessary servers are launched (without the GUI). To start an interactive Python console then (for example, to be able to load TUI scripts), you will need to use the --pinter option.
To launch Salome with only chosen modules:
To launch a group of chosen SALOME modules, use the command "runSalome --modules=XXX,YYY", where XXX and YYY are the modules' names. You can use the -h option to display the help of the runSalome script.
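Since geompy is the Python interface of Salome's GEOM module, a geometry-only batch run could look like this (a sketch; option spelling per the FAQ above, and module availability depends on your installation):
runSalome -t --modules=GEOM script.py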
