How to implement a global setUp() for multiprocessing using nosetests - python

So far I've used nosetests with just one process and everything works fine.
To ensure my setUp is only executed once, I'm using a boolean var.
def setUp(self):
    if not self.setupOk:
        selTest.setupOk = True
        # start selenium
        # do other stuff which will be needed for all other tests to be able to run
Now I would like to run nosetests with the option --processes=5
How can I ensure that setUp(self) is only executed by one process (while the other processes are waiting)?
I've tried to work with
def setUp(self):
    lock = multiprocessing.Lock()
    lock.acquire()
    if not self.setupOk:
        selTest.setupOk = True
        # start selenium
        # do other stuff which will be needed for all other tests to be able to run
    lock.release()
but this doesn't seem to work.

setUp will be called before every test is run. If you want a method to execute just once, you can use setUpClass:
@classmethod
def setUpClass(cls):
    print("do stuff which needs to be run once")

Related

Is there a way to force a specific fixture to trigger as the last one

I have a fixture that launches a sub-process during testing.
It causes some issues because, apparently, the process is killed prematurely.
from multiprocessing import Process

import pytest
from uvicorn import run

from main import app

# -- snip --

@pytest.fixture(scope="session")
def external_client():
    """Launch application as an independent process"""
    config = dict(app=app, host="127.0.0.1", port=7001, workers=1)
    p = Process(target=run, kwargs=config)
    p.start()
    yield
    p.kill()

# -- snip --
This process creates PostgreSQL connections, which are used in other fixtures for teardown/cleaning the database (yeah... that doesn't sound right on paper).
When I extended the tests I faced errors like the one below, which occurred inconsistently.
sqlalchemy.exc.DatabaseError: (psycopg2.DatabaseError) error with status PGRES_TUPLES_OK and no message from the libpq
I tried to follow the answer here and dispose the engines explicitly, but it did not help.
I am currently trying to solve this, and one of my ideas is to kill the process only in the last fixture, but I do not know how to order them (other than manually tracking the fixture chain and adding it to the last triggered fixture).
So I would like to do something like this:
# -- snip --

@pytest.fixture(scope="session")
def external_client():
    """Launch application as an independent process"""
    config = dict(app=app, host="127.0.0.1", port=7001, workers=1)
    p = Process(target=run, kwargs=config)
    p.start()
    yield p


@pytest.fixture()
def kill_process(external_client):
    yield
    # I want this to be the last code triggered of all test fixtures.
    external_client.kill()

# -- snip --
Is there a way to do it? Or a better way to avoid closing the process prematurely?

How can I run cleanup code only for specific tests with pytest?

With pytest, is there a way to run cleanup code for a specific test function/method alone? I know we can do this for each test function, but here I want to place some cleanup logic specific to a single test function.
I could just put the cleanup code at the end of the test, but if the test fails then the cleanup won't be done.
Create a fixture with your cleanup code and inject it only into the one test by using the fixture as an argument for your test or by explicitly marking the test with the pytest.mark.usefixtures decorator.
import pytest


@pytest.fixture
def my_cleanup_fixture():
    # Startup code
    ...
    yield
    # Cleanup code
    ...


@pytest.mark.usefixtures('my_cleanup_fixture')
def test_with_special_cleanup():
    pass
my_cleanup_fixture has function scope by default, so the startup and cleanup code will run for each function it is injected into.
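For completeness, the other option mentioned above, requesting the fixture directly as a test argument, looks like this:

def test_with_special_cleanup(my_cleanup_fixture):
    # Requesting the fixture as an argument runs its startup and cleanup
    # code around this one test, just like the usefixtures marker.
    pass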

How to ignore tests when session fixture fails in pytest

Let's say I have a test as shown below:
import copy

import pytest


@pytest.fixture(scope='session')
def session_tool(request):
    tool = request.config.tool
    # Build is the critical part and may fail, raising an exception
    tool.build()
    return tool


@pytest.fixture
def tool(session_tool):
    return copy.deepcopy(session_tool)


def test_tool(tool, args):
    assert tool.run(args) == 0
It builds a session-scoped tool and then creates a copy of it for each test case. But when the build fails, the session_tool fixture is executed again for the next test case, which fails again... until it has failed for every test case. As there are a lot of test cases, it takes some time until the whole run finishes.
Is there any way to tell pytest to skip all tests which use the session_tool fixture after the first attempt to build it fails?
I can think of two approaches:
1) Calling pytest.skip() will cause the test to be skipped. This works if it's called from within a fixture as well; in your case, it will cause all the remaining tests to be skipped (see the sketch after this list).
2) Calling pytest.exit() will cause your test suite to stop running, as if KeyboardInterrupt was triggered.
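A minimal sketch of approach 1), reusing the session_tool fixture from the question; the try/except around the build and the skip message are my additions:

import copy

import pytest


@pytest.fixture(scope='session')
def session_tool(request):
    tool = request.config.tool
    try:
        # Build is the critical part and may fail, raising an exception
        tool.build()
    except Exception as exc:
        # Raising pytest.skip() from the fixture skips the requesting test,
        # and, as described above, the remaining tests that depend on this
        # fixture end up skipped as well.
        pytest.skip("tool build failed: %r" % exc)
    return tool


@pytest.fixture
def tool(session_tool):
    return copy.deepcopy(session_tool)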

python running coverage on never ending process

I have a multi-process web server whose processes never end, and I would like to check my code coverage on the whole project in a live environment (not only from tests).
The problem is that, since the processes never end, I don't have a good place to put the cov.start(), cov.stop(), and cov.save() hooks.
Therefore, I thought about spawning a thread that, in an infinite loop, saves and combines the coverage data and then sleeps for some time. However, this approach doesn't work; the coverage report seems to be empty, except for the sleep line.
I would be happy to receive any ideas about how to get the coverage of my code,
or any advice about why my idea doesn't work. Here is a snippet of my code:
import coverage

cov = coverage.Coverage()

import time
import threading
import os


class CoverageThread(threading.Thread):
    _kill_now = False
    _sleep_time = 2

    @classmethod
    def exit_gracefully(cls):
        cls._kill_now = True

    def sleep_some_time(self):
        time.sleep(CoverageThread._sleep_time)

    def run(self):
        while True:
            cov.start()
            self.sleep_some_time()
            cov.stop()
            if os.path.exists('.coverage'):
                cov.combine()
            cov.save()
            if self._kill_now:
                break

        cov.stop()
        if os.path.exists('.coverage'):
            cov.combine()
        cov.save()
        cov.html_report(directory="coverage_report_data.html")
        print("End of the program. I was killed gracefully :)")
Apparently, it is not possible to control coverage very well with multiple threads.
Once different threads are started, stopping the Coverage object stops all coverage, and start only restarts it in the "starting" thread.
So your code basically stops coverage after 2 seconds for every thread other than the CoverageThread.
I played a bit with the API and it is possible to access the measurements without stopping the Coverage object.
So you could launch a thread that saves the coverage data periodically, using the API.
A first implementation would be something like this:
import os
import threading
from time import sleep

from coverage import Coverage
from coverage.data import CoverageData, CoverageDataFiles
from coverage.files import abs_file

cov = Coverage(config_file=True)
cov.start()


def get_data_dict(d):
    """Return a dict like d, but with keys modified by `abs_file` and
    remove the copied elements from d.
    """
    res = {}
    keys = list(d.keys())
    for k in keys:
        a = {}
        lines = list(d[k].keys())
        for l in lines:
            v = d[k].pop(l)
            a[l] = v
        res[abs_file(k)] = a
    return res


class CoverageLoggerThread(threading.Thread):
    _kill_now = False
    _delay = 2

    def __init__(self, main=True):
        self.main = main
        self._data = CoverageData()
        self._fname = cov.config.data_file
        self._suffix = None
        self._data_files = CoverageDataFiles(basename=self._fname,
                                             warn=cov._warn)
        self._pid = os.getpid()
        super(CoverageLoggerThread, self).__init__()

    def shutdown(self):
        self._kill_now = True

    def combine(self):
        aliases = None
        if cov.config.paths:
            from coverage.aliases import PathAliases
            aliases = PathAliases()
            for paths in cov.config.paths.values():
                result = paths[0]
                for pattern in paths[1:]:
                    aliases.add(pattern, result)
        self._data_files.combine_parallel_data(self._data, aliases=aliases)

    def export(self, new=True):
        cov_report = cov
        if new:
            cov_report = Coverage(config_file=True)
            cov_report.load()
        self.combine()
        self._data_files.write(self._data)
        cov_report.data.update(self._data)
        cov_report.html_report(directory="coverage_report_data.html")
        cov_report.report(show_missing=True)

    def _collect_and_export(self):
        new_data = get_data_dict(cov.collector.data)
        if cov.collector.branch:
            self._data.add_arcs(new_data)
        else:
            self._data.add_lines(new_data)
        self._data.add_file_tracers(get_data_dict(cov.collector.file_tracers))
        self._data_files.write(self._data, self._suffix)

        if self.main:
            self.export()

    def run(self):
        while True:
            sleep(CoverageLoggerThread._delay)
            if self._kill_now:
                break
            self._collect_and_export()

        cov.stop()

        if not self.main:
            self._collect_and_export()
            return

        self.export(new=False)
        print("End of the program. I was killed gracefully :)")
A more stable version can be found in this GIST.
This code basically grabs the info collected by the collector without stopping it.
The get_data_dict function takes the dictionary in the Coverage.collector and pops the available data. This should be safe enough that you don't lose any measurements.
The report files get updated every _delay seconds.
But if you have multiple processes running, you need some extra effort to make sure all the processes run the CoverageLoggerThread. This is the patch_multiprocessing function, adapted from the coverage monkey patch...
The code is in the GIST. It basically replaces the original Process with a custom process, which starts the CoverageLoggerThread just before running the run method and joins the thread at the end of the process; a rough sketch of that idea follows below.
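The shape of that patch is roughly the following sketch (the real, more careful version is in the GIST; CoverageLoggerThread is the class defined above, and patching multiprocessing.Process this way is a simplification):

import multiprocessing


class CoverageProcess(multiprocessing.Process):
    """Process that runs a CoverageLoggerThread alongside its target."""

    def run(self):
        thread = CoverageLoggerThread(main=False)
        thread.start()
        try:
            super(CoverageProcess, self).run()
        finally:
            thread.shutdown()
            thread.join()


def patch_multiprocessing():
    # Make code that instantiates multiprocessing.Process pick up the
    # wrapper; how thorough this needs to be depends on the Python version.
    multiprocessing.Process = CoverageProcess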
The script main.py in the GIST can be used to launch different tests with threads and processes.
There are two or three drawbacks to this code that you need to be careful of:
It is a bad idea to use the combine function concurrently, as it performs concurrent read/write/delete access to the .coverage.* files. This means that the function export is not super safe. It should be alright, as the data is replicated multiple times, but I would do some testing before using it in production.
Once the data has been exported, it stays in memory. So if the code base is huge, it could eat some resources. It is possible to dump all the data and reload it, but I assumed that if you want to log every 2 seconds, you do not want to reload all the data every time. If you go with a delay in minutes, I would create a new _data every time, using CoverageData.read_file to reload the previous state of the coverage for this process.
The custom process will wait for _delay before finishing, as we join the CoverageLoggerThread at the end of the process. So if you have a lot of quick processes, you want to increase the granularity of the sleep to be able to detect the end of the process more quickly. It just needs a custom sleep loop that breaks on _kill_now (see the sketch below).
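That finer-grained loop could look something like this (a sketch, replacing the run method shown earlier; the 0.1-second step is arbitrary):

def run(self):
    while True:
        # Sleep in small steps so a shutdown request is noticed quickly,
        # instead of waiting out the full _delay.
        slept = 0.0
        while slept < CoverageLoggerThread._delay and not self._kill_now:
            sleep(0.1)
            slept += 0.1
        if self._kill_now:
            break
        self._collect_and_export()

    cov.stop()
    if not self.main:
        self._collect_and_export()
        return
    self.export(new=False)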
Let me know if this helps you in some way or if it is possible to improve this gist.
EDIT:
It seems you do not need to monkey patch the multiprocessing module to start a logger automatically. Using a .pth file in your Python install, you can use an environment variable to start your logger automatically in new processes:
# Content of coverage.pth in your site-packages folder
import os
if "COVERAGE_LOGGER_START" in os.environ:
    import atexit
    from coverage_logger import CoverageLoggerThread
    thread_cov = CoverageLoggerThread(main=False)
    thread_cov.start()
    def close_cov():
        thread_cov.shutdown()
        thread_cov.join()
    atexit.register(close_cov)
You can then start your coverage logger with COVERAGE_LOGGER_START=1 python main.py
Since you are willing to run your code differently for the test, why not add a way to end the process for the test? That seems like it will be simpler than trying to hack coverage.
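For example, a sketch of that idea, assuming each worker process can be sent SIGTERM during the test run and that you start coverage yourself inside the worker (the names here are placeholders, not from the question):

import signal
import sys

import coverage

# data_suffix=True makes every process write its own .coverage.* file,
# so the per-process results can be merged later with `coverage combine`.
cov = coverage.Coverage(data_suffix=True)
cov.start()


def _stop_and_save(signum, frame):
    # Flush coverage data before the worker exits.
    cov.stop()
    cov.save()
    sys.exit(0)


signal.signal(signal.SIGTERM, _stop_and_save)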
You can use pyrasite directly, with the following two programs.
# start.py
import sys
import coverage
sys.cov = cov = coverage.coverage()
cov.start()
And this one
# stop.py
import sys
sys.cov.stop()
sys.cov.save()
sys.cov.html_report()
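The payloads are then injected into the already-running process by PID, something like pyrasite <pid> start.py to begin collecting and pyrasite <pid> stop.py to write the report (invocation from memory; check the pyrasite documentation for the exact syntax of your version).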
Another way to go would be to trace the program using lptrace; even though it only prints calls, it can be useful.

How can I run a command once in parallel mode in fabric?

I have a fabric script to manage our deployments. I need it to run in parallel mode so it can finish in a reasonable amount of time, but I need one command to run only once, not multiple times as it would in parallel mode.
Don't specify the hosts before executing the function which you would like to execute only once.
In that function, you can then set the env.hosts variable to the machines you want to run on.
For example,
from fabric.api import env, execute, parallel


def task():
    init()
    execute(main_job)


def init():
    # do some initialization
    # set hosts
    env.hosts = ['192.168.5.11', '192.168.5.12']


@parallel
def main_job():
    # main job code...
    pass
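Invoked as fab task, init() then runs once in the local Fabric process, and execute(main_job) fans the job out to the hosts set in env.hosts, with @parallel running them concurrently (this assumes Fabric 1.x's execution model).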
