Python Multiprocessing JoinableQueue: clear queue and discard all unfinished tasks

Python Multiprocessing JoinableQueue: clear queue and discard all unfinished tasks - python

I got two processes and in order to do some clean up in case of fatal errors (instead of processes keeping running), I want to remove all remaining tasks en empty the queue (in order to let join() proceed). How can I achieve that (preferably it should be code to apply in both processes, but my code allows the child process to signal the main process of its failure state and instruct main to do the clean up as well)?
I was trying to get a understand it by inspecting the source at:
https://github.com/python/cpython/blob/main/Lib/multiprocessing/queues.py
But I got a little bit lost with code like:
...
self._unfinished_tasks._semlock._is_zero():
...
def __init__(self, maxsize=0, *, ctx):
Queue.__init__(self, maxsize, ctx=ctx)
self._unfinished_tasks = ctx.Semaphore(0)
...
(also where does the _semlock property comes from?)
For example, what is ctx and it appears not be required as I did not use it in my object creation. Digging further, it may have something to do with (a little bit too mysterious or me)
mp.get_context('spawn')
or
#asynccontextmanager
async def ctx():
yield
I need something like mentioned here by V.E.O (which is quite understandable, but that is only a single process as far as I understand):
Clear all items from the queue

I came up with the following code (to be tested):
def clearAndDiscardQueue(self):
try: # cleanup, preferably in the process that is adding to the queue
while True:
self.task_queue.get_nowait()
except Empty:
pass
except ValueError: # in case of closed
pass
self.task_queue.close()
# theoretically a new item could be placed by the
# other process by the time the interpreter is on this line,
# therefore the part above should be run in the process that
# fills (put) the queue when it is in its failure state
# (when the main process fails it should communicate to
# raise an exception in the child process to run the cleanup
# so main process' join will work)
try: # could be one of the processes
while True:
self.task_queue.task_done()
except ValueError: # too many times called, do not care
# since all remaining will not be processed due to failure state
pass
Else I would need to try understanding code like the following. I think messing with the next code, analogous to calling queue.clear() as in a single process queue, would have serious consequences in terms of race conditions when clearing the buffer/pipe myself somehow.
class Queue(object):
def __init__(self, maxsize=0, *, ctx):
…
self._reader, self._writer = connection.Pipe(duplex=False)
…
def put(self, obj, block=True, timeout=None):
…
self._buffer.append(obj) # in case of close() the background thread
# will quit once it has flushed all buffered data to the pipe.
…
def get(self, block=True, timeout=None):
…
res = self._recv_bytes()
…
return _ForkingPickler.loads(res)
…
class JoinableQueue(Queue):
def __init__(self, maxsize=0, *, ctx):
…
self._unfinished_tasks = ctx.Semaphore(0)
…
def task_done(self):
…
if not self._unfinished_tasks._semlock._is_zero():
…
in which _is_zero() is somehow externally defined (see synchronize.py), like mentioned here:
Why doesn't Python's _multiprocessing.SemLock have 'name'?

Related

How to properly encapsulate an asyncio Process

I have a program executed in a subprocess. This program runs forever, reads a line from its stdin, processes it, and outputs a result on stdout. I have encapsulated it as follows:
class BrainProcess:
def __init__(self, filepath):
# starting the program in a subprocess
self._process = asyncio.run(self.create_process(filepath))
# check if the program could not be executed
if self._process.returncode is not None:
raise BrainException(f"Could not start process {filepath}")
#staticmethod
async def create_process(filepath):
process = await sp.create_subprocess_exec(
filepath, stdin=sp.PIPE, stdout=sp.PIPE, stderr=sp.PIPE)
return process
# destructor function
def __del__(self):
self._process.kill() # kill the program, since it never stops
# waiting for the program to terminate
# self._process.wait() is asynchronous so I use async.run() to execute it
asyncio.run(self._process.wait())
async def _send(self, msg):
b = bytes(msg + '\n', "utf-8")
self._process.stdin.write(b)
await self._process.stdin.drain()
async def _readline(self):
return await self._process.stdout.readline()
def send_start_cmd(self, size):
asyncio.run(self._send(f"START {size}"))
line = asyncio.run(self._readline())
print(line)
return line
From my understanding asyncio.run() is used to run asynchronous code in a synchronous context. That is why I use it at the following lines:
# in __init__
self._process = asyncio.run(self.create_process(filepath))
# in send_start_cmd
asyncio.run(self._send(f"START {size}"))
# ...
line = asyncio.run(self._readline())
# in __del__
asyncio.run(self._process.wait())
The first line seems to work properly (the process is created correctly), but the other throw exceptions that look like got Future <Future pending> attached to a different loop.
Code:
brain = BrainProcess("./test")
res = brain.send_start_cmd(20)
print(res)
So my questions are:
What do these errors mean ?
How do I fix them ?
Did I use asyncio.run() correctly ?
Is there a better way to encapsulate the process to send and retrieve data to/from it without making my whole application use async / await ?

asyncio.run is meant to be used for running a body of async code, and producing a well-defined result. The most typical example is running the whole program:
async def main():
# your application here
if __name__ == '__main__':
asyncio.run(main())
Of couurse, asyncio.run is not limited to that usage, it is perfectly possible to call it multiple times - but it will create a fresh event loop each time. This means you won't be able to share async-specific objects (such as futures or objects that refer to them) between invocations - which is precisely what you tried to do. If you want to completely hide the fact that you're using async, why use asyncio.subprocess in the first place, wouldn't the regular subprocess do just as well?
The simplest fix is to avoid asyncio.run and just stick to the same event loop. For example:
_loop = asyncio.get_event_loop()
class BrainProcess:
def __init__(self, filepath):
# starting the program in a subprocess
self._process = _loop.run_until_complete(self.create_process(filepath))
...
...
Is there a better way to encapsulate the process to send and retrieve data to/from it without making my whole application use async / await ?
The idea is precisely for the whole application to use async/await, otherwise you won't be able to take advantage of asyncio - e.g. you won't be able to parallelize your async code.

Python threading - How to wait for data from multiple senders to merge the result?

My ImageStitcher class is receiving multiple image messages from different threads. Another thread will then call the get_stichted_image() and this works. But it doesn't look good and seems kinda slow.
Is there a better way to handle multiple incoming messages from different threads and wait for all queues (or something else) to contain something?
class ImageStitcher:
def __init__(foo):
self.image_storage = {
Position.top: queue.Queue(maxsize=1),
Position.bottom: queue.Queue(maxsize=1)
}
foo.register(image_callback)
# will be called from different threads
def image_callback(self, image_msg):
if self.image_storage[image_msg["position"]].full():
self.image_storage[image_msg["position"].get()
self.image_storage[image_msg["position"].put(image_msg)
def get_stichted_image(self):
try:
# the following code is ugly and seems to be slow
top_image_msg = self.image_storage[Position.top].get(timeout=0.1)
bottom_image_msg = self.image_storage[Position.bottom].get(timeout=0.1)
return self.stitch_images(top_image_msg, bottom_image_msg)
except queue.Empty:
return None

Why does my context manager exit function run before the computation isn't finished?

The exit function of my custom context manager seemingly runs before the computation is done. My context manager is meant to simplify writing concurrent/parallel code. Here is my context manager code:
import time
from multiprocessing.dummy import Pool, cpu_count
class managed_pool:
'''Simple context manager for multiprocessing.dummy.Pool'''
def __init__(self, msg):
self.msg = msg
def __enter__(self):
cores = cpu_count()
print 'start concurrent ({0} cores): {1}'.format(cores, self.msg)
self.start = time.time()
self.pool = Pool(cores)
return self.pool
def __exit__(self, type_, value, traceback):
print 'end concurrent:', self.msg
print 'time:', time.time() - self.start
self.pool.close()
self.pool.join()
I've already tried this script with multiprocessing.Pool instead of multiprocessing.dummy.Pool and it seems to fail all the time.
Here is an example of using the context manager:
def read_engine_files(f):
engine_input = engineInput()
with open(f, 'rb') as f:
engine_input.parse_from_string(f.read())
return engine_input
with managed_pool('load input files') as pool:
data = pool.map(read_engine_files, files)
So, inside of read_engine_files I print the name of the file. You'll notice in the __exit__ function that I also print out when the computation is done and how long it took. But when viewing stdout the __exit__ message appears way before the computation finished. Like, minutes before the computation is done. But htop says all of my cores are still being used. Here's an example of the output
start concurrent (4 cores): load engine input files
file1.pbin
file2.pbin
...
file16.pbin
end concurrent: load engine input files
time: 246.43829298
file17.pbin
...
file45.pbin
Why is __exit__ being called so early?

Are you sure you're just calling pool.map()? That should block until all the items have been mapped.
If you're calling one of the asynchronous methods of Pool, then you should be able to solve the problem by changing the order of things in __exit__(). Just join the pool before doing the summary.
def __exit__(self, type_, value, traceback):
self.pool.close()
self.pool.join()
print 'end concurrent:', self.msg
print 'time:', time.time() - self.start

The most likely explanation is that an exception occurred. The above code sample does not parse the type, value or traceback arguments of the __exit__ statement. Thus, an exception occurs (and is not caught earlier), is handed to the exit statement which in turn does not react to it. The processes (or some of them) continue running.

Python Queues memory leaks when called inside thread

I have python TCP client and need to send media(.mpg) file in a loop to a 'C' TCP server.
I have following code, where in separate thread I am reading the 10K blocks of file and sending it and doing it all over again in loop, I think it is because of my implementation of thread module, or tcp send. I am using Queues to print the logs on my GUI ( Tkinter ) but after some times it goes out of memory..
UPDATE 1 - Added more code as requested
Thread class "Sendmpgthread" used to create thread to send data
.
.
def __init__ ( self, otherparams,MainGUI):
.
.
self.MainGUI = MainGUI
self.lock = threading.Lock()
Thread.__init__(self)
#This is the one causing leak, this is called inside loop
def pushlog(self,msg):
self.MainGUI.queuelog.put(msg)
def send(self, mysocket, block):
size = len(block)
pos = 0;
while size > 0:
try:
curpos = mysocket.send(block[pos:])
except socket.timeout, msg:
if self.over:
self.pushlog(Exit Send)
return False
except socket.error, msg:
print 'Exception'
return False
pos = pos + curpos
size = size - curpos
return True
def run(self):
media_file = None
mysocket = None
try:
mysocket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
mysocket.connect((self.ip, string.atoi(self.port)))
media_file = open(self.file, 'rb')
while not self.over:
chunk = media_file.read(10000)
if not chunk: # EOF Reset it
print 'resetting stream'
media_file.seek(0, 0)
continue
if not self.send(mysocket, chunk): # If some error or thread is killed
break;
#disabling this solves the issue
self.pushlog('print how much data sent')
except socket.error, msg:
print 'print exception'
except Exception, msg:
print 'print exception'
try:
if media_file is not None:
media_file.close()
media_file = None
if mysocket is not None:
mysocket.close()
mysocket = None
finally:
print 'some cleaning'
def kill(self):
self.over = True
I figured out that it is because of wrong implementation of Queue as commenting that piece resolves the issue
UPDATE 2 - MainGUI class which is called from above Thread class
class MainGUI(Frame):
def __init__(self, other args):
#some code
.
.
#from the above thread class used to send data
self.send_mpg_status = Sendmpgthread(params)
self.send_mpg_status.start()
self.after(100, self.updatelog)
self.queuelog = Queue.Queue()
def updatelog(self):
try:
msg = self.queuelog.get_nowait()
while msg is not None:
self.printlog(msg)
msg = self.queuelog.get_nowait()
except Queue.Empty:
pass
if self.send_mpg_status: # only continue when sending
self.after(100, self.updatelog)
def printlog(self,msg):
#print in GUI

Since printlog is adding to a tkinter text control, the memory occupied by that control will grow with each message (it has to store all the log messages in order to display them).
Unless storing all the logs is critical, a common solution is to limit the maximum number of log lines displayed.
A naive implementation is to eliminate extra lines from the begining after the control reaches a maximum number of messages. Add a function to get the number of lines in the control and then, in printlog something similar to:
while getnumlines(self.edit) > self.maxloglines:
self.edit.delete('1.0', '1.end')
(above code not tested)
update: some general guidelines
Keep in mind that what might look like a memory leak does not always mean that a function is wrong, or that the memory is no longer accessible. Many times there is missing cleanup code for a container that is accumulating elements.
A basic general approach for this kind of problems:
form an opinion on what part of the code might be causing the problem
check it by commenting that code out (or keep commenting code until you find a candidate)
look for containers in the responsible code, add code to print their size
decide what elements can be safely removed from that container, and when to do it
test the result

I can't see anything obviously wrong with your code snippet.
To reduce memory usage a bit under Python 2.7, I'd use buffer(block, pos) instead of block[pos:]. Also I'd use mysocket.sendall(block) instead of your send method.
If the ideas above don't solve your problem, then the bug is most probably elsewhere in your code. Could you please post the shortest possible version of the full Python script which still grows out-of-memory (http://sscce.org/)? That increases your change of getting useful help.

Out of memory errors are indicative of data being generated but not consumed or released. Looking through your code I would guess these two areas:
Messages are being pushed onto a Queue.Queue() instance in the pushlog method. Are they being consumed?
The MainGui printlog method may be writing text somewhere. eg. Is it continually writing to some kind of GUI widget without any pruning of messages?
From the code you've posted, here's what I would try:
Put a print statement in updatelog. If this is not being continually called for some reason such as a failed after() call, then the queuelog will continue to grow without bound.
If updatelog is continually being called, then turn your focus to printlog. Comment the contents of this function to see if out of memory errors still occur. If they don't, then something in printlog may be holding on to the logged data, you'll need to dig deeper to find out what.
Apart from this, the code could be cleaned up a bit. self.queuelog is not created until after the thread is started which gives rise to a race condition where the thread may try to write into the queue before it has been created. Creation of queuelog should be moved to somewhere before the thread is started.
updatelog could also be refactored to remove redundancy:
def updatelog(self):
try:
while True:
msg = self.queuelog.get_nowait()
self.printlog(msg)
except Queue.Empty:
pass
And I assume the the kill function is called from the GUI thread. To avoid thread race conditions, the self.over should be a thread safe variable such as a threading.Event object.
def __init__(...):
self.over = threading.Event()
def kill(self):
self.over.set()

There is no data piling up in your TCP sending loop.
Memory error is probably caused by logging queue, as you have not posted complete code try using following class for logging:
from threading import Thread, Event, Lock
from time import sleep, time as now
class LogRecord(object):
__slots__ = ["txt", "params"]
def __init__(self, txt, params):
self.txt, self.params = txt, params
class AsyncLog(Thread):
DEBUGGING_EMULATE_SLOW_IO = True
def __init__(self, queue_max_size=15, queue_min_size=5):
Thread.__init__(self)
self.queue_max_size, self.queue_min_size = queue_max_size, queue_min_size
self._queuelock = Lock()
self._queue = [] # protected by _queuelock
self._discarded_count = 0 # protected by _queuelock
self._pushed_event = Event()
self.setDaemon(True)
self.start()
def log(self, message, **params):
with self._queuelock:
self._queue.append(LogRecord(message, params))
if len(self._queue) > self.queue_max_size:
# empty the queue:
self._discarded_count += len(self._queue) - self.queue_min_size
del self._queue[self.queue_min_size:] # empty the queue instead of creating new list (= [])
self._pushed_event.set()
def run(self):
while 1: # no reason for exit condition here
logs, discarded_count = None, 0
with self._queuelock:
if len(self._queue) > 0:
# select buffered messages for printing, releasing lock ASAP
logs = self._queue[:]
del self._queue[:]
self._pushed_event.clear()
discarded_count = self._discarded_count
self._discarded_count = 0
if not logs:
self._pushed_event.wait()
self._pushed_event.clear()
continue
else:
# print logs
if discarded_count:
print ".. {0} log records missing ..".format(discarded_count)
for log_record in logs:
self.write_line(log_record)
if self.DEBUGGING_EMULATE_SLOW_IO:
sleep(0.5)
def write_line(self, log_record):
print log_record.txt, " ".join(["{0}={1}".format(name, value) for name, value in log_record.params.items()])
if __name__ == "__main__":
class MainGUI:
def __init__(self):
self._async_log = AsyncLog()
self.log = self._async_log.log # stored as bound method
def do_this_test(self):
print "I am about to log 100 times per sec, while text output frequency is 2Hz (twice per second)"
def log_100_records_in_one_second(itteration_index):
for i in xrange(100):
self.log("something happened", timestamp=now(), session=3.1415, itteration=itteration_index)
sleep(0.01)
for iter_index in range(3):
log_100_records_in_one_second(iter_index)
test = MainGUI()
test.do_this_test()
I have noticed that you do not sleep() anywhere in the sending loop, this means data is read as fast as it can and is sent as fast as it can. Note that this is not desirable behavior when playing media files - container time-stamps are there to dictate data-rate.

Why does the python threading.Thread object has 'start', but not 'stop'? [duplicate]

This question already has answers here:
Is there any way to kill a Thread?
(31 answers)
Closed 10 years ago.
The python module threading has an object Thread to be used to run processes and functions in a different thread. This object has a start method, but no stop method. What is the reason a Thread cannot be stopped my calling a simple stop method? I can imagine cases when it is unconvenient to use the join method...

start can be generic and make sense because it just fires off the target of the thread, but what would a generic stop do? Depending upon what your thread is doing, you could have to close network connections, release system resources, dump file and other streams, or any number of other custom, non-trivial tasks. Any system that could do even most of these things in a generic way would add so much overhead to each thread that it wouldn't be worth it, and would be so complicated and shot through with special cases that it would be almost impossible to work with. You can keep track of all created threads without joining them in your main thread, then check their run state and pass them some sort of termination message when the main thread shuts itself down though.

It is definitely possible to implement a Thread.stop method as shown in the following example code:
import threading
import sys
class StopThread(StopIteration): pass
threading.SystemExit = SystemExit, StopThread
class Thread2(threading.Thread):
def stop(self):
self.__stop = True
def _bootstrap(self):
if threading._trace_hook is not None:
raise ValueError('Cannot run thread with tracing!')
self.__stop = False
sys.settrace(self.__trace)
super()._bootstrap()
def __trace(self, frame, event, arg):
if self.__stop:
raise StopThread()
return self.__trace
class Thread3(threading.Thread):
def _bootstrap(self, stop_thread=False):
def stop():
nonlocal stop_thread
stop_thread = True
self.stop = stop
def tracer(*_):
if stop_thread:
raise StopThread()
return tracer
sys.settrace(tracer)
super()._bootstrap()
################################################################################
import time
def main():
test = Thread2(target=printer)
test.start()
time.sleep(1)
test.stop()
test.join()
def printer():
while True:
print(time.time() % 1)
time.sleep(0.1)
if __name__ == '__main__':
main()
The Thread3 class appears to run code approximately 33% faster than the Thread2 class.
Addendum:
With sufficient knowledge of Python's C API and the use of the ctypes module, it is possible to write a far more efficient way of stopping a thread when desired. The problem with using sys.settrace is that the tracing function runs after each instruction. If an asynchronous exception is raised instead on the thread that needs to be aborted, no execution speed penalty is incurred. The following code provides some flexibility in this regard:
#! /usr/bin/env python3
import _thread
import ctypes as _ctypes
import threading as _threading
_PyThreadState_SetAsyncExc = _ctypes.pythonapi.PyThreadState_SetAsyncExc
# noinspection SpellCheckingInspection
_PyThreadState_SetAsyncExc.argtypes = _ctypes.c_ulong, _ctypes.py_object
_PyThreadState_SetAsyncExc.restype = _ctypes.c_int
# noinspection PyUnreachableCode
if __debug__:
# noinspection PyShadowingBuiltins
def _set_async_exc(id, exc):
if not isinstance(id, int):
raise TypeError(f'{id!r} not an int instance')
if not isinstance(exc, type):
raise TypeError(f'{exc!r} not a type instance')
if not issubclass(exc, BaseException):
raise SystemError(f'{exc!r} not a BaseException subclass')
return _PyThreadState_SetAsyncExc(id, exc)
else:
_set_async_exc = _PyThreadState_SetAsyncExc
# noinspection PyShadowingBuiltins
def set_async_exc(id, exc, *args):
if args:
class StateInfo(exc):
def __init__(self):
super().__init__(*args)
return _set_async_exc(id, StateInfo)
return _set_async_exc(id, exc)
def interrupt(ident=None):
if ident is None:
_thread.interrupt_main()
else:
set_async_exc(ident, KeyboardInterrupt)
# noinspection PyShadowingBuiltins
def exit(ident=None):
if ident is None:
_thread.exit()
else:
set_async_exc(ident, SystemExit)
class ThreadAbortException(SystemExit):
pass
class Thread(_threading.Thread):
def set_async_exc(self, exc, *args):
return set_async_exc(self.ident, exc, *args)
def interrupt(self):
self.set_async_exc(KeyboardInterrupt)
def exit(self):
self.set_async_exc(SystemExit)
def abort(self, *args):
self.set_async_exc(ThreadAbortException, *args)

Killing threads in a reliable fashion is not very easy. Think of the cleanups required: which locks (that might be shared with other threads!) should automatically be released? Otherwise, you will easily run into a deadlock!
The better way is to implement a proper shutdown yourself, and then set
mythread.shutdown = True
mythread.join()
to stop the thread.
Of course your thread should do something like
while not this.shutdown:
continueDoingSomething()
releaseThreadSpecificLocksAndResources()
to frequently check for the shutdown flag. Alternatively, you can rely on OS-specific signaling mechanisms to interrupt a thread, catch the interrupt, and then cleanup.
The cleanup is the most important part!

Stopping a thread should be up to the programmer to implement. Such as designing your thread to check it there are any requests for it to terminate immediately. If python (or any threading language) allowed you to just stop a thread then you would have code that just stopped. This is bug prone, etc.
Imagine if your thread as writing output to a file when you killed/stopped it. Then the file might be unfinished and corrupt. However if you simple signaled the thread you wanted it to stop then it could close the file, delete it, etc. You, the programmer, decided how to handle it. Python can't guess for you.
I'd suggest reading up on multi-threading theory. A decent start: http://en.wikipedia.org/wiki/Multithreading_(software)#Multithreading

On some platforms you can't forcibly "stop" a thread. It's also bad to do it since then the thread won't be able to clean up allocated resources. And it might happen when the thread is doing something important, like I/O.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Python Multiprocessing JoinableQueue: clear queue and discard all unfinished tasks - python

Related

How to properly encapsulate an asyncio Process

Python threading - How to wait for data from multiple senders to merge the result?

Why does my context manager exit function run before the computation isn't finished?

Python Queues memory leaks when called inside thread

Why does the python threading.Thread object has 'start', but not 'stop'? [duplicate]

Categories

Resources

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Python Multiprocessing JoinableQueue: clear queue and discard all unfinished tasks - python

Related

How to properly encapsulate an asyncio Process

Python threading - How to wait for data from multiple senders to merge the result?

Why does my context manager __exit__ function run before the computation isn't finished?

Python Queues memory leaks when called inside thread

Why does the python threading.Thread object has 'start', but not 'stop'? [duplicate]

Categories

Resources

Why does my context manager exit function run before the computation isn't finished?