patch object spawned in sub-process - python

I am using the multiprocessing package for creating sub-processes. I need to handle exceptions from a sub-process: catch, report, terminate and re-spawn the sub-process.
I am struggling to create a test for this.
I would like to patch the object which represents my sub-process and raise an exception from it to see if the handling is correct.
But it looks like the object is only patched in the main process, and the spawned process gets the unchanged version. Any ideas how to accomplish the requested functionality?
Example:
import multiprocessing
import time
from unittest import mock


class SubprocessClass(multiprocessing.Process):
    def __init__(self) -> None:
        super().__init__()

    def simple_method(self):
        return 42

    def run(self):
        try:
            self.simple_method()
        except Exception:
            # ok, exception handled
            pass
        else:
            # I wanted an exception! <- code goes here
            assert False


@mock.patch.object(SubprocessClass, "simple_method")
def test_patch_subprocess(mock_simple_method):
    mock_simple_method.side_effect = Exception("exception from mock")
    subprocess = SubprocessClass()
    subprocess.run()
    subprocess.start()
    time.sleep(0.1)
    subprocess.join()

You can monkey-patch the object before it is started (it is a bit iffy, but you get an actual process running that code):
def _this_always_raises(*args, **kwargs):
    raise RuntimeError("I am overridden")


def test_patch_subprocess():
    subprocess = SubprocessClass()
    subprocess.simple_method = _this_always_raises
    subprocess.start()
    time.sleep(0.1)
    subprocess.join()
    assert subprocess.exitcode == 0
You could also mock multiprocessing to behave like threading, but that is a bit unpredictable.
If you want to do this generically for all objects, you can replace the class with one derived from the original that has only the one method overridden:
class SubprocessClassThatRaisesInSimpleMethod(SubprocessClass):
    def simple_method(self):
        raise RuntimeError("I am overridden")

# then patch the process spawner with unittest.mock to use this class
# instead of SubprocessClass
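A minimal sketch of that last step; make_worker stands in for whatever code actually constructs the process, and the patch target assumes both classes live in the current (test) module:

import time
from unittest import mock


def make_worker():
    # stands in for the real "process spawner"; it looks up SubprocessClass
    # in this module's globals at call time, so mock.patch can swap it
    return SubprocessClass()


def test_spawner_uses_raising_subclass():
    with mock.patch(f"{__name__}.SubprocessClass",
                    SubprocessClassThatRaisesInSimpleMethod):
        proc = make_worker()
        proc.start()
        time.sleep(0.1)
        proc.join()
    # run() catches the RuntimeError, so the child exits cleanly
    assert proc.exitcode == 0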

Related

How to Mock: pytest.raises DID NOT RAISE <class 'subprocess.TimeoutExpired'>

I am using subprocess for a task, and I have a try/except block to catch TimeoutExpired. I try to mock my object using side_effect so I can catch the fake exception with pytest.raises. Whatever I do, I get DID NOT RAISE <class 'subprocess.TimeoutExpired'>.
I have tried many things, and even though I'm not very experienced with mocking, I believe something like this should in principle work:
# my_check.py
from subprocess import TimeoutExpired, PIPE, check_output


class Check:
    def __init__(self, name):
        self.name = name
        self.status = self.get_status()

    def get_status(self):
        try:
            out = check_output(["ls"], universal_newlines=True, stderr=PIPE, timeout=2)
        except TimeoutExpired as e:
            print(f"Command timed out: {e}")
            raise
        if self.name in out:
            return True
        return False


# test_my_check.py
import pytest
from unittest import mock
from subprocess import TimeoutExpired


@mock.patch("src.my_check.Check", autospec=True)
def test_is_installed_exception(check_fake):
    check_fake.get_status.side_effect = TimeoutExpired
    obj_fake = check_fake("random_file.txt")
    with pytest.raises(TimeoutExpired):
        obj_fake.get_status()
For some reason it doesn't work, though, and I can't get my head around what's wrong.
If you want to test the functionality of a class (Check), you cannot mock that class itself. You have to mock instead the calls that you want to change; in this case that is check_output, which you want to raise an exception:
#mock.patch("src.my_check.check_output")
def test_is_installed_exception(mocked_check_output):
mocked_check_output.side_effect = TimeoutExpired("check_output", 1)
with pytest.raises(TimeoutExpired):
Check("random_file.txt")
A few notes:
- You have to patch "src.my_check.check_output", because you import check_output using from subprocess import check_output, so you have to patch the reference in your module.
- You have to construct a valid TimeoutExpired object; it requires two arguments (the command and the timeout), so you have to provide them.
- As get_status is already called in __init__, you have to test the class construction; you can't get a properly constructed instance because of the raised exception.
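For completeness, the happy path can be tested the same way by controlling the mock's return value instead of its side effect (a sketch, assuming the same src.my_check layout and imports as above):

@mock.patch("src.my_check.check_output")
def test_is_installed(mocked_check_output):
    # make the mocked "ls" output contain the file we ask about
    mocked_check_output.return_value = "random_file.txt\nother_file.txt\n"
    check = Check("random_file.txt")
    assert check.status is True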

Using context managers for recovering from celery's SoftTimeLimitExceeded

I am trying to set a maximum run time for my celery jobs.
I am currently recovering from exceptions with a context manager, and I ended up with code very similar to this snippet:
from celery.exceptions import SoftTimeLimitExceeded


class Manager:
    def __enter__(self):
        return self

    def __exit__(self, error_type, error, tb):
        if error_type == SoftTimeLimitExceeded:
            logger.info('job killed.')
            # swallow the exception
            return True


@task
def do_foo():
    with Manager():
        run_task1()
        run_task2()
        run_task3()
What I expected:
If do_foo times out in run_task1, the logger logs, the SoftTimeLimitExceeded exception is swallowed, the rest of the manager's body is skipped, and the job ends without running run_task2 and run_task3.
What I observe:
do_foo times out in run_task1, SoftTimeLimitExceeded is raised, the logger logs, and the SoftTimeLimitExceeded exception is swallowed, but run_task2 and run_task3 run nevertheless.
I am looking for an answer to the following two questions:
Why is run_task2 still executed when SoftTimeLimitExceeded is raised in run_task1 in this setting?
Is there an easy way to transform my code so that it performs as expected?
Cleaning up the code
This code is pretty good; there's not much cleaning up to do.
You shouldn't return self from __enter__ if the context manager isn't designed to be used with the as keyword.
is should be used when checking classes, since they are singletons,
but you should prefer issubclass here, to properly emulate exception handling (an except clause also catches subclasses).
Implementing these changes gives:
from celery.exceptions import SoftTimeLimitExceeded


class Manager:
    def __enter__(self):
        pass

    def __exit__(self, error_type, error, tb):
        # error_type is None when the block exits without an exception
        if error_type is not None and issubclass(error_type, SoftTimeLimitExceeded):
            logger.info('job killed.')
            # swallow the exception
            return True


@task
def do_foo():
    with Manager():
        run_task1()
        run_task2()
        run_task3()
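A quick illustration of why issubclass is the right check here (the subclass is made up for demonstration; an except SoftTimeLimitExceeded clause would also catch it, which is what issubclass mirrors):

class CustomSoftTimeLimit(SoftTimeLimitExceeded):
    pass

print(CustomSoftTimeLimit is SoftTimeLimitExceeded)            # False
print(issubclass(CustomSoftTimeLimit, SoftTimeLimitExceeded))  # True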
Debugging
I created a mock environment for debugging:
class SoftTimeLimitExceeded(Exception):
    pass


class Logger:
    info = print

logger = Logger()
del Logger


def task(f):
    return f


def run_task1():
    print("running task 1")
    raise SoftTimeLimitExceeded


def run_task2():
    print("running task 2")


def run_task3():
    print("running task 3")
Executing this and then your program gives:
>>> do_foo()
running task 1
job killed.
This is the expected behaviour.
Hypotheses
I can think of two possibilities:
Something in the chain, probably run_task1, is asynchronous.
celery is doing something weird.
I'll run with the second hypothesis, because I can't test the former.
I've been bitten before by the obscure behaviour of combinations of context managers, exceptions and coroutines, so I know what sorts of problems they cause. This seems like one of them, but I'll have to look at celery's code before I can go any further.
Edit: I can't make head or tail of celery's code, and searching hasn't turned up the code that raises SoftTimeLimitExceeded to let me trace it backwards. I'll pass this on to somebody more experienced with celery to see if they can work out how it works.

Why does the python threading.Thread object have 'start', but not 'stop'? [duplicate]

This question already has answers here:
Is there any way to kill a Thread?
(31 answers)
Closed 10 years ago.
The python module threading has a Thread object used to run functions in a different thread. This object has a start method, but no stop method. Why can a Thread not be stopped by calling a simple stop method? I can imagine cases where it is inconvenient to use the join method...
start can be generic and make sense because it just fires off the target of the thread, but what would a generic stop do? Depending upon what your thread is doing, you could have to close network connections, release system resources, dump file and other streams, or any number of other custom, non-trivial tasks. Any system that could do even most of these things in a generic way would add so much overhead to each thread that it wouldn't be worth it, and would be so complicated and shot through with special cases that it would be almost impossible to work with.
You can keep track of all created threads without joining them in your main thread, then check their run state and pass them some sort of termination message when the main thread shuts itself down, though (a minimal sketch of that idea follows).
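Here is one way such a termination message could look, using a queue as the thread's inbox (all names here are made up for illustration):

import queue
import threading


def worker(inbox):
    while True:
        msg = inbox.get()
        if msg == 'stop':  # termination message from the main thread
            break          # clean up resources here, then exit voluntarily
        print('working on', msg)


inbox = queue.Queue()
t = threading.Thread(target=worker, args=(inbox,))
t.start()
inbox.put('job 1')
inbox.put('stop')
t.join()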
It is definitely possible to implement a Thread.stop method as shown in the following example code:
import threading
import sys


class StopThread(StopIteration):
    pass

# Shadow SystemExit inside the threading module so its internal
# "except SystemExit" clause silently swallows StopThread as well.
threading.SystemExit = SystemExit, StopThread


class Thread2(threading.Thread):
    def stop(self):
        self.__stop = True

    def _bootstrap(self):
        if threading._trace_hook is not None:
            raise ValueError('Cannot run thread with tracing!')
        self.__stop = False
        sys.settrace(self.__trace)
        super()._bootstrap()

    def __trace(self, frame, event, arg):
        if self.__stop:
            raise StopThread()
        return self.__trace


class Thread3(threading.Thread):
    def _bootstrap(self, stop_thread=False):
        def stop():
            nonlocal stop_thread
            stop_thread = True
        self.stop = stop

        def tracer(*_):
            if stop_thread:
                raise StopThread()
            return tracer
        sys.settrace(tracer)
        super()._bootstrap()

################################################################################

import time


def main():
    test = Thread2(target=printer)
    test.start()
    time.sleep(1)
    test.stop()
    test.join()


def printer():
    while True:
        print(time.time() % 1)
        time.sleep(0.1)

if __name__ == '__main__':
    main()
The Thread3 class appears to run code approximately 33% faster than the Thread2 class.
Addendum:
With sufficient knowledge of Python's C API and the use of the ctypes module, it is possible to write a far more efficient way of stopping a thread when desired. The problem with using sys.settrace is that the tracing function runs after each instruction. If an asynchronous exception is raised instead on the thread that needs to be aborted, no execution speed penalty is incurred. The following code provides some flexibility in this regard:
#! /usr/bin/env python3
import _thread
import ctypes as _ctypes
import threading as _threading

_PyThreadState_SetAsyncExc = _ctypes.pythonapi.PyThreadState_SetAsyncExc
# noinspection SpellCheckingInspection
_PyThreadState_SetAsyncExc.argtypes = _ctypes.c_ulong, _ctypes.py_object
_PyThreadState_SetAsyncExc.restype = _ctypes.c_int

# noinspection PyUnreachableCode
if __debug__:
    # noinspection PyShadowingBuiltins
    def _set_async_exc(id, exc):
        if not isinstance(id, int):
            raise TypeError(f'{id!r} not an int instance')
        if not isinstance(exc, type):
            raise TypeError(f'{exc!r} not a type instance')
        if not issubclass(exc, BaseException):
            raise SystemError(f'{exc!r} not a BaseException subclass')
        return _PyThreadState_SetAsyncExc(id, exc)
else:
    _set_async_exc = _PyThreadState_SetAsyncExc


# noinspection PyShadowingBuiltins
def set_async_exc(id, exc, *args):
    if args:
        class StateInfo(exc):
            def __init__(self):
                super().__init__(*args)

        return _set_async_exc(id, StateInfo)
    return _set_async_exc(id, exc)


def interrupt(ident=None):
    if ident is None:
        _thread.interrupt_main()
    else:
        set_async_exc(ident, KeyboardInterrupt)


# noinspection PyShadowingBuiltins
def exit(ident=None):
    if ident is None:
        _thread.exit()
    else:
        set_async_exc(ident, SystemExit)


class ThreadAbortException(SystemExit):
    pass


class Thread(_threading.Thread):
    def set_async_exc(self, exc, *args):
        return set_async_exc(self.ident, exc, *args)

    def interrupt(self):
        self.set_async_exc(KeyboardInterrupt)

    def exit(self):
        self.set_async_exc(SystemExit)

    def abort(self, *args):
        self.set_async_exc(ThreadAbortException, *args)
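A short usage sketch for the Thread class above (the worker function is made up; note that the asynchronous exception is only delivered once the target thread executes Python bytecode again):

import time


def worker():
    try:
        while True:
            time.sleep(0.1)
    except ThreadAbortException:
        print('worker aborted cleanly')


t = Thread(target=worker)
t.start()
time.sleep(0.5)
t.abort()
t.join()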
Killing threads in a reliable fashion is not very easy. Think of the cleanups required: which locks (that might be shared with other threads!) should automatically be released? Otherwise, you will easily run into a deadlock!
The better way is to implement a proper shutdown yourself, and then set

mythread.shutdown = True
mythread.join()

to stop the thread.
Of course your thread should do something like

while not self.shutdown:
    continueDoingSomething()
releaseThreadSpecificLocksAndResources()

to frequently check the shutdown flag. Alternatively, you can rely on OS-specific signaling mechanisms to interrupt a thread, catch the interrupt, and then clean up.
The cleanup is the most important part! (A concrete version of this shutdown-flag pattern is sketched below.)
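A minimal runnable version of that pattern; threading.Event is a common way to implement the flag (the Worker class here is made up for illustration):

import threading
import time


class Worker(threading.Thread):
    def __init__(self):
        super().__init__()
        self._shutdown = threading.Event()

    def shutdown(self):
        self._shutdown.set()

    def run(self):
        while not self._shutdown.is_set():
            time.sleep(0.1)  # one unit of interruptible work
        # release thread-specific locks and resources here


w = Worker()
w.start()
w.shutdown()
w.join()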
Stopping a thread should be up to the programmer to implement, for example by designing your thread to check whether there are any requests for it to terminate. If python (or any threading language) allowed you to just stop a thread, then you would have code that just stopped. This is bug-prone.
Imagine your thread was writing output to a file when you killed/stopped it. Then the file might be unfinished and corrupt. However, if you simply signaled the thread that you wanted it to stop, it could close the file, delete it, etc. You, the programmer, decide how to handle it. Python can't guess for you.
I'd suggest reading up on multi-threading theory. A decent start: http://en.wikipedia.org/wiki/Multithreading_(software)#Multithreading
On some platforms you can't forcibly "stop" a thread. It's also bad to do so, since then the thread won't be able to clean up allocated resources. And it might happen while the thread is doing something important, like I/O.

Make Python unittest fail on exception from any thread

I am using the unittest framework to automate integration tests of multi-threaded python code, external hardware and embedded C. Despite my blatant abuse of a unit-testing framework for integration testing, it works really well. Except for one problem: I need the test to fail if an exception is raised from any of the spawned threads. Is this possible with the unittest framework?
A simple but non-workable solution would be to either a) refactor the code to avoid multi-threading or b) test each thread separately. I cannot do that because the code interacts asynchronously with the external hardware. I have also considered implementing some kind of message passing to forward the exceptions to the main unittest thread. This would require significant testing-related changes to the code being tested, and I want to avoid that.
Time for an example. Can I modify the test script below to fail on the exception raised in my_thread without modifying the x.ExceptionRaiser class?
import unittest
import x


class Test(unittest.TestCase):
    def test_x(self):
        my_thread = x.ExceptionRaiser()
        # Test case should fail when the thread is started and raises
        # an exception.
        my_thread.start()
        my_thread.join()

if __name__ == '__main__':
    unittest.main()
At first, sys.excepthook looked like a solution. It is a global hook which is called every time an uncaught exception is thrown.
Unfortunately, this does not work. Why? Well, threading wraps your run function in code which prints the lovely tracebacks you see on screen (noticed how it always tells you Exception in thread {name of your thread here}? This is how it's done).
Starting with Python 3.8, there is a function which you can override to make this work: threading.excepthook
... threading.excepthook() can be overridden to control how uncaught exceptions raised by Thread.run() are handled
So what do we do? Replace this function with our logic, and voilà:
For python >= 3.8:

import traceback
import threading
import os


class GlobalExceptionWatcher(object):
    def _store_excepthook(self, args):
        '''
        Used as an exception handler that stores any uncaught exceptions.
        '''
        self.__org_hook(args)
        formatted_exc = traceback.format_exception(args.exc_type, args.exc_value, args.exc_traceback)
        self._exceptions.append('\n'.join(formatted_exc))
        return formatted_exc

    def __enter__(self):
        '''
        Register us to the hook.
        '''
        self._exceptions = []
        self.__org_hook = threading.excepthook
        threading.excepthook = self._store_excepthook

    def __exit__(self, type, value, traceback):
        '''
        Remove us from the hook, assert that no exceptions were thrown.
        '''
        threading.excepthook = self.__org_hook
        if len(self._exceptions) != 0:
            tracebacks = os.linesep.join(self._exceptions)
            raise Exception(f'Exceptions in other threads: {tracebacks}')
For older versions of Python, this is a bit more complicated.
Long story short, it appears that the threading module has an undocumented import which does something along the lines of:

threading._format_exc = traceback.format_exc

Not very surprisingly, this function is only called when an exception is thrown from a thread's run function.
So for python <= 3.7:
import threading
import os


class GlobalExceptionWatcher(object):
    def _store_excepthook(self):
        '''
        Used as an exception handler that stores any uncaught exceptions.
        '''
        formatted_exc = self.__org_hook()
        self._exceptions.append(formatted_exc)
        return formatted_exc

    def __enter__(self):
        '''
        Register us to the hook.
        '''
        self._exceptions = []
        self.__org_hook = threading._format_exc
        threading._format_exc = self._store_excepthook

    def __exit__(self, type, value, traceback):
        '''
        Remove us from the hook, assert that no exceptions were thrown.
        '''
        threading._format_exc = self.__org_hook
        if len(self._exceptions) != 0:
            tracebacks = os.linesep.join(self._exceptions)
            raise Exception('Exceptions in other threads: %s' % tracebacks)
Usage:

my_thread = x.ExceptionRaiser()
# will fail when thread is started and raises an exception.
with GlobalExceptionWatcher():
    my_thread.start()
    my_thread.join()
You still need to join yourself, but upon exit, the with-statement's context manager will check for any exception thrown in other threads, and will raise an exception appropriately.
THE CODE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND,
EXPRESS OR IMPLIED
This is an undocumented, sort-of-horrible hack. I tested it on linux and windows, and it seems to work. Use it at your own risk.
I've come across this problem myself, and the only solution I've been able to come up with is subclassing Thread to include an attribute recording whether or not it terminated without an uncaught exception:

from threading import Thread


class ErrThread(Thread):
    """
    A subclass of Thread that stores the exception if the thread does
    not exit normally.
    """
    def run(self):
        self.err = None
        try:
            Thread.run(self)
        except Exception as e:
            # note: `except Exception as self.err` is Python 2 only syntax;
            # Python 3 requires a plain name as the target
            self.err = e


class TaskQueue(object):
    """
    A utility class to run ErrThread objects in parallel and raise an
    exception in the event that *any* of them fail.
    """
    def __init__(self, *tasks):
        self.threads = []
        for t in tasks:
            try:
                self.threads.append(ErrThread(**t))  # passing in a dict of target and args
            except TypeError:
                self.threads.append(ErrThread(target=t))

    def run(self):
        for t in self.threads:
            t.start()
        for t in self.threads:
            t.join()
            if t.err:
                raise Exception('Thread %s failed with error: %s' % (t.name, t.err))
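A quick usage sketch (the two callables are made up for illustration):

def quiet_task():
    pass


def failing_task():
    raise ValueError('boom')


TaskQueue(quiet_task, failing_task).run()
# raises Exception('Thread <name> failed with error: boom')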
I've been using the accepted answer above for a while now, but since Python 3.8 the solution doesn't work anymore, because the threading module no longer has the _format_exc import.
On the other hand, the threading module now has a nice way to register custom except hooks in Python 3.8, so here is a simple solution to run unit tests which assert that some exceptions are raised inside threads:
def test_in_thread():
    import threading

    exceptions_caught_in_threads = {}

    def custom_excepthook(args):
        thread_name = args.thread.name
        exceptions_caught_in_threads[thread_name] = {
            'thread': args.thread,
            'exception': {
                'type': args.exc_type,
                'value': args.exc_value,
                'traceback': args.exc_traceback
            }
        }

    # Registering our custom excepthook to catch exceptions in threads
    threading.excepthook = custom_excepthook

    # dummy function that raises an exception
    def my_function():
        raise Exception('My Exception')

    # running the function in a thread
    thread_1 = threading.Thread(name='thread_1', target=my_function, args=())
    thread_1.start()
    thread_1.join()

    assert 'thread_1' in exceptions_caught_in_threads  # there was an exception in thread 1
    assert exceptions_caught_in_threads['thread_1']['exception']['type'] == Exception
    assert str(exceptions_caught_in_threads['thread_1']['exception']['value']) == 'My Exception'
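One caveat: the test above installs custom_excepthook globally and never puts the old hook back, which can leak into other tests. A sketch of the middle of the test that restores it:

    original_hook = threading.excepthook
    threading.excepthook = custom_excepthook
    try:
        thread_1 = threading.Thread(name='thread_1', target=my_function)
        thread_1.start()
        thread_1.join()
    finally:
        threading.excepthook = original_hook  # don't leak the hook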

'sys.excepthook' and threading

I am using Python 2.5 and trying to use a self-defined excepthook in my program. In the main thread it works perfectly fine, but in a thread started with the threading module the usual excepthook is called.
Here is an example showing the problem. Uncommenting the commented line shows the desired behaviour.
import threading, sys


def myexcepthook(type, value, tb):
    print 'myexcepthook'


class A(threading.Thread, object):
    def __init__(self):
        threading.Thread.__init__(self, verbose=True)
        # raise Exception('in main')
        self.start()

    def run(self):
        print 'A'
        raise Exception('in thread')

if __name__ == "__main__":
    sys.excepthook = myexcepthook
    A()
So, how can I use my own excepthook in a thread?
It looks like this bug is still present in (at least) 3.4, and one of the workarounds in the discussion Nadia Alramli linked seems to work in Python 3.4 too.
For convenience and documentation's sake, I'll post the code for (in my opinion) the best workaround here. I updated the coding style and comments slightly to make it more PEP 8 and Pythonic.
import sys
import threading


def setup_thread_excepthook():
    """
    Workaround for `sys.excepthook` thread bug from:
    http://bugs.python.org/issue1230540

    Call once from the main thread before creating any threads.
    """
    init_original = threading.Thread.__init__

    def init(self, *args, **kwargs):
        init_original(self, *args, **kwargs)
        run_original = self.run

        def run_with_except_hook(*args2, **kwargs2):
            try:
                run_original(*args2, **kwargs2)
            except Exception:
                sys.excepthook(*sys.exc_info())

        self.run = run_with_except_hook

    threading.Thread.__init__ = init
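A usage sketch, reusing myexcepthook from the question (here written for Python 3):

import sys
import threading


def myexcepthook(exc_type, value, tb):
    print('myexcepthook')


if __name__ == '__main__':
    setup_thread_excepthook()  # must run before any threads are created
    sys.excepthook = myexcepthook

    t = threading.Thread(target=lambda: 1 / 0)
    t.start()
    t.join()  # prints 'myexcepthook' instead of the default traceback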
I just stumbled over this problem, and as it turns out, it was the right time to do so.
New in version 3.8: threading.excepthook
Handle uncaught exception raised by Thread.run().
The args argument has the following attributes:
- exc_type: Exception type.
- exc_value: Exception value, can be None.
- exc_traceback: Exception traceback, can be None.
- thread: Thread which raised the exception, can be None.
I don't know why, but be aware that, unlike sys.excepthook, threading.excepthook receives the arguments as a namedtuple instead of multiple arguments.
It looks like there is a related bug reported here, with workarounds. The suggested hacks basically wrap run in a try/except and then call sys.excepthook(*sys.exc_info()).
