python multiprocessing pool and additional queues

I would like to pass messages out from my function running in a process pool while the function is still running.
My application uses asyncio and multiprocessing queues to receive and distribute messages to a worker pool using asyncio.run_in_executor(). I manually created the pool so I could provide an initializer.
The problem I have is that I would like the functions that are running in the executor pool to be able to send messages out to the asyncio loop. This is how I started my new application process:
self._application = Application(self.outgoing_queue, self.incoming_queue, application_cores, log_level=logging.INFO)
self._application_process = mp.Process(target=self._application.run)
self._application_process.start()
the queues are from:
self.outgoing_queue = mp.Queue()
self.incoming_queue = mp.Queue()
I can't use my asyncio queue or my multiprocessing queue, since those can't be passed to the worker processes using this method:
async def run_operation():
    kwargs = {
        'out_queue': self._work_pool_queue
    }
    func = functools.partial(attribute, *args, **kwargs)
    result = await self._loop.run_in_executor(self._work_pool, func)
    result_msg = common.messages.MessageResult(result, msg.reply_id, msg.cpu_cost)
    await self.outgoing_send(result_msg)

asyncio.create_task(run_operation())
My self._work_pool is created with:
self._work_pool = concurrent.futures.ProcessPoolExecutor(max_workers=self._cores, initializer=_work_pool_init)
Passing the multiprocessing queue in kwargs this way fails with the following traceback:
Task exception was never retrieved
future: <Task finished coro=<Application.message_router.<locals>.run_operation() done, defined at c:\users\brian\gitlab\rf-applications\rfapplications\common\application.py:95> exception=RuntimeError('Queue objects should only be shared between processes through inheritance')>
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File "c:\Program Files\Python37\lib\multiprocessing\queues.py", line 236, in _feed
obj = _ForkingPickler.dumps(obj)
File "c:\Program Files\Python37\lib\multiprocessing\reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
File "c:\Program Files\Python37\lib\multiprocessing\queues.py", line 58, in __getstate__
context.assert_spawning(self)
File "c:\Program Files\Python37\lib\multiprocessing\context.py", line 356, in assert_spawning
' through inheritance' % type(obj).__name__
RuntimeError: Queue objects should only be shared between processes through inheritance
"""
I was looking at using a Manager().Queue() (https://docs.python.org/3/library/multiprocessing.html#managers) since those can be sent to the process pool (see Python multiprocessing Pool Queues communication). However, these queues seem to open up the possibility of remote connections, which I would like to avoid (so far I use secure websockets to communicate between remote machines).
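One possible workaround (a sketch, not the application's actual code; _do_work, work_pool_queue and the example payloads are illustrative names, with only _work_pool_init echoing the question) is to hand the multiprocessing queue to the pool workers through the executor's initializer, so each worker receives it at start-up by inheritance instead of having it pickled with every submitted task:

import concurrent.futures
import multiprocessing as mp

# Filled in once per worker process by the pool initializer.
_out_queue = None


def _work_pool_init(out_queue):
    # Queues may be passed as initializer arguments because they are
    # transferred when the worker process is created (i.e. by inheritance),
    # which is exactly what the RuntimeError above is asking for.
    global _out_queue
    _out_queue = out_queue


def _do_work(x):
    # Illustrative task: it can report progress while still running.
    _out_queue.put(('started', x))
    result = x * x
    _out_queue.put(('finished', x))
    return result


if __name__ == '__main__':
    work_pool_queue = mp.Queue()
    work_pool = concurrent.futures.ProcessPoolExecutor(
        max_workers=4,
        initializer=_work_pool_init,
        initargs=(work_pool_queue,),
    )
    print(work_pool.submit(_do_work, 3).result())   # 9
    print(work_pool_queue.get(), work_pool_queue.get())
    work_pool.shutdown()

The asyncio side can then drain work_pool_queue without blocking the event loop, for example with await loop.run_in_executor(None, work_pool_queue.get), and forward each message wherever it needs to go; this keeps everything on plain multiprocessing queues and avoids a Manager and its extra server process entirely.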

Related

Raise Exception if thread hangs

I have scripts running 24/7 that sometimes get stuck when a thread in concurrent.futures gets no response for a request.
The hanging-threads 2.0.5 module prints out which thread hangs and why.
The print looks something like this:
Thread 139646566659840 "ThreadPoolExecutor-666849_1" hangs -
File "/usr/lib/python3.9/threading.py", line 912, in _bootstrap
self._bootstrap_inner()
File "/usr/lib/python3.9/threading.py", line 954, in _bootstrap_inner
self.run()
File "/usr/lib/python3.9/threading.py", line 892, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.9/concurrent/futures/thread.py", line 77, in _worker
work_item.run()
File "/usr/lib/python3.9/concurrent/futures/thread.py", line 52, in run
result = self.fn(*self.args, **self.kwargs)
How can I, instead of just printing out the hanging threads and files, raise an exception when a thread is not responding in a given time? The script should just restart itself if hanging occurs, instead of waiting for a response.
I have tried with timeout, but concurrent futures can not be cancelled while running.
concurrent futures can not be cancelled while running
This is your problem. A hanging thread is still 'running'. Cancelling it from outside is not possible.
Thus you have two options:
switch to something which can be cancelled, like a ProcessPoolExecutor, or
rewrite the blocking code so it fails.
Since you say 'response to a request': if this is a network request and you are early enough (or frustrated enough) in the dev cycle, I thoroughly recommend switching to a cooperative concurrency framework like asyncio. This is exactly what such frameworks were developed for. In particular, you may be interested in trio's implementation of cancel scopes.
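To make the first option concrete, here is a minimal sketch of the idea with asyncio (slow_request and the timeout value are made up for illustration and stand in for a network call that never answers): asyncio.wait_for lets the caller abandon a pending operation instead of waiting forever.

import asyncio


async def slow_request():
    # Stand-in for an awaitable network call that never responds.
    await asyncio.sleep(3600)
    return 'response'


async def main():
    try:
        # Give up after 5 seconds; wait_for cancels the awaited task
        # and raises TimeoutError in the caller.
        reply = await asyncio.wait_for(slow_request(), timeout=5)
    except asyncio.TimeoutError:
        reply = None  # restart, retry or log here instead of hanging
    print('got:', reply)


if __name__ == '__main__':
    asyncio.run(main())

The same pattern applies per request, so a single hung call no longer forces you to restart the whole script.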

Gracefully terminate multiprocessing based program

I am working on a python service that spawns Process to handle the workload. Since I don't know at the start of the service how many workers I need, I chose to not use Pool. The following is a simplified version:
import multiprocessing as mp
import time
from datetime import datetime


def _print(s):  # just my cheap logging utility
    print(f'{datetime.now()} - {s}')


def run_in_process(q, evt):
    _print(f'starting process job')
    while not evt.is_set():  # True
        try:
            x = q.get(timeout=2)
            _print(f'received {x}')
        except:
            _print(f'timed-out')


if __name__ == '__main__':
    with mp.Manager() as manager:
        q = manager.Queue()
        evt = manager.Event()
        p = mp.Process(target=run_in_process, args=(q, evt))
        p.start()
        time.sleep(2)

        data = 100
        while True:
            try:
                q.put(data)
                time.sleep(0.5)
                data += 1
                if data > 110:
                    break
            except KeyboardInterrupt:
                _print('finishing...')
                # p.terminate()
                break

        time.sleep(3)
        _print('setting event 0')
        evt.set()
        _print('joining process')
        p.join()
        _print('done')
The program works and exits gracefully, without any error messages. However, if I use Ctrl-C before I have all 10 events processed, I get the following error before it exits.
2022-04-01 12:41:06.866484 - received 101
2022-04-01 12:41:07.367628 - received 102
^C2022-04-01 12:41:07.507805 - timed-out
2022-04-01 12:41:07.507886 - finishing...
Process Process-2:
Traceback (most recent call last):
File "/<path-omitted>/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/<path-omitted>/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "mp.py", line 10, in run_in_process
while not evt.is_set(): # True
File "/<path-omitted>/python3.7/multiprocessing/managers.py", line 1088, in is_set
return self._callmethod('is_set')
File "/<path-omitted>/python3.7/multiprocessing/managers.py", line 819, in _callmethod
kind, result = conn.recv()
File "/<path-omitted>/python3.7/multiprocessing/connection.py", line 250, in recv
buf = self._recv_bytes()
File "/<path-omitted>/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/<path-omitted>/python3.7/multiprocessing/connection.py", line 379, in _recv
chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer
2022-04-01 12:41:10.511334 - setting event 0
Traceback (most recent call last):
File "mp.py", line 42, in <module>
evt.set()
File "/<path-omitted>/python3.7/multiprocessing/managers.py", line 1090, in set
return self._callmethod('set')
File "/<path-omitted>/python3.7/multiprocessing/managers.py", line 818, in _callmethod
conn.send((self._id, methodname, args, kwds))
File "/<path-omitted>/python3.7/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/<path-omitted>/python3.7/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header + buf)
File "/<path-omitted>/python3.7/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
A few observations:
The double error message looks exactly the same when I press Ctrl-C with my actual project. I think this is a good representation of my problem.
If I add p.terminate(), it doesn't change the behavior when the program is left to finish by itself. But if I press Ctrl-C halfway, I encounter the error message only once; I guess it's from the main thread/process.
If I change while not evt.is_set(): in run_in_process to an infinite loop, while True:, and let the program finish its course, I continue to see periodic time-out prints, which makes sense. What I don't understand is that if I press Ctrl-C, the terminal starts spewing time-out prints with no time gap between them. What happened?
My ultimate question is: what is the correct way to construct this program so that when Ctrl-C is used (or a termination signal is sent to the program, for that matter), the program stops gracefully?
I found a solution to this problem myself, using the signal module.
The idea is to set up a signal catcher to catch specific signals, such as signal.SIGINT, signal.SIGTERM.
import multiprocessing as mp
from threading import Event
import signal

if __name__ == '__main__':
    main_evt = Event()

    def stop_main_handler(signum, frame):
        if not main_evt.is_set():
            main_evt.set()

    signal.signal(signal.SIGINT, stop_main_handler)

    with mp.Manager() as manager:
        # creating mp queue, event and process
        q = manager.Queue()
        evt = manager.Event()
        p = mp.Process(target=..., args=(q, evt))
        p.start()

        while not main_evt.is_set():
            # processing data
            ...

        # cleanup
        evt.set()
        p.join()
Or you can wrap it in an object-oriented fashion:
import time

class SignalCatcher(object):
    def __init__(self):
        self._main_evt = Event()
        # register the handlers so the catcher actually receives the signals
        signal.signal(signal.SIGINT, self._stop_handler)
        signal.signal(signal.SIGTERM, self._stop_handler)

    def _stop_handler(self, signum, frame):
        if not self._main_evt.is_set():
            self._main_evt.set()

    def block_until_signaled(self):
        while not self._main_evt.is_set():
            time.sleep(2)
Then you can use it as follows:
if __name__ == '__main__':
    sc = SignalCatcher()
    # this has to be outside the with-context. It seems that there is another
    # process created by the multiprocessing library; if you put the sc
    # creation inside the with-context, it would fail to signal each process.
    with mp.Manager() as manager:
        # creating process and starting it
        # ...
        sc.block_until_signaled()
        # cleanup
        # ...

terminate python multiprocessing pool cleanly

I am using multiprocessing.Pool to work with the http server in Python. It works great, but when I terminate I get a slew of errors from all the SpawnPoolWorkers, and I'm just wondering how I can avoid this.
My main code:
def run(self):
    global pool
    port = self.arguments.port
    pool = multiprocessing.Pool(processes=self.arguments.threads)
    with http.server.HTTPServer(("", port), Handler) as daemon:
        print(f"serving on port {port}")
        while True:
            try:
                daemon.handle_request()
            except KeyboardInterrupt:
                print("\nexiting")
                pool.terminate()
                pool.join()
                return 0
I've tried doing nothing to the pool, I've tried pool.close(), and I've tried not joining. But even if I just run that code, without ever accessing the port or submitting anything to the pool, I still get a random list of things like this when I press Ctrl-C:
Process SpawnPoolWorker-8:
Process SpawnPoolWorker-4:
Traceback (most recent call last):
File "/opt/homebrew/Cellar/python@3.10/3.10.1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/opt/homebrew/Cellar/python@3.10/3.10.1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/opt/homebrew/Cellar/python@3.10/3.10.1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/pool.py", line 114, in worker
task = get()
File "/opt/homebrew/Cellar/python@3.10/3.10.1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/queues.py", line 365, in get
with self._rlock:
File "/opt/homebrew/Cellar/python@3.10/3.10.1/F
how do I exit the pool cleanly, with no errors, and with no output?
ok - I'm stupid - the control-c was also interrupting all the child processes. This fixed it:
import signal

def ignore_control_c():
    signal.signal(signal.SIGINT, signal.SIG_IGN)

pool = multiprocessing.Pool(processes=self.arguments.threads, initializer=ignore_control_c)

Python: '_MainThread' object has no attribute '_state'

Hey guys, I am creating an application which takes in a request from the user. The main class on the server side is the Controller. I spawn a thread during __init__, which keeps actively listening for requests from the client (I need to spawn a thread here).
Once I get an request, I look into the type of request and call a function to handle it.
In that function, I want to create multiple processes to utilise my 8 cores effectively.
Here is the code:-
class Controller(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    def __init__(self, *args, **kwargs):
        self.datapaths = {}
        self.monitor_thread = hub.spawn(self._monitor)
        super(Controller, self).__init__(*args, **kwargs)

    def _monitor(self):
        global connstream
        while True:
            # Get Connection from client
            data = connstream.read(15000)
            data = eval(data)
            print "Recieved a request from the client:-", data
            for key, value in data.iteritems():
                type = int(key)
                request = value
                if type == 4:
                    self.get_route(type, request, connstream)

    def get_route(self, type, request, connection):
        global get_route_result
        cities = request['Cities']
        number_of_cities = request['Number_of_Cities']
        city_count = 0
        processes = []
        pool = mp.Pool(processes=8)
        for city, destination_ip in cities.iteritems():
            args = (type, destination_ip)
            processes.append(args)
            city_count = city_count + 1
            if city_count == number_of_cities:
                break
        pool.map(self.get_route_process, processes)

    def get_route_process(self, HOST, destination):
        # Do Something
        pass
But the error I get is:-
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib/python2.7/threading.py", line 810, in __bootstrap_inner
self.run()
File "/usr/lib/python2.7/threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 325, in _handle_workers
while thread._state == RUN or (pool._cache and thread._state != TERMINATE):
AttributeError: '_MainThread' object has no attribute '_state'
So in a nutshell, I create a thread, which tries to create multiple processes, but the code fails.
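For comparison, here is a minimal, self-contained sketch of the described pattern with the pool created in the main thread and a plain module-level function as the map target (all names and payloads are illustrative, and this is not a verified fix for the Ryu/eventlet setup above):

import threading
import multiprocessing as mp


def get_route_for(args):
    # Plain module-level function: this pickles cleanly for pool.map,
    # unlike a bound method on a Python 2 class.
    req_type, destination_ip = args
    return (req_type, destination_ip)  # placeholder for the real routing work


def listener(pool):
    # The background thread only builds the work list and submits it;
    # the pool itself was created in the main thread below.
    work = [(4, '10.0.0.%d' % i) for i in range(8)]
    print(pool.map(get_route_for, work))


if __name__ == '__main__':
    pool = mp.Pool(processes=8)
    t = threading.Thread(target=listener, args=(pool,))
    t.start()
    t.join()
    pool.close()
    pool.join()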

Celery shutting down worker from task_success handler not working

I'm trying to make a worker run only one task at a time, then shutdown. I've got the shutdown part working correctly (some background here: celery trying shutdown worker by raising SystemExit in task_postrun signal but always hangs and the main process never exits), but when it shuts down, I'm getting an error:
[2013-02-13 12:19:05,689: CRITICAL/MainProcess] Couldn't ack 1, reason:AttributeError("'NoneType' object has no attribute 'method_writer'",)
Traceback (most recent call last):
File "/usr/local/lib/python2.7/site-packages/kombu/transport/base.py", line 104, in ack_log_error
self.ack()
File "/usr/local/lib/python2.7/site-packages/kombu/transport/base.py", line 99, in ack
self.channel.basic_ack(self.delivery_tag)
File "/usr/local/lib/python2.7/site-packages/amqplib/client_0_8/channel.py", line 1742, in basic_ack
self._send_method((60, 80), args)
File "/usr/local/lib/python2.7/site-packages/amqplib/client_0_8/abstract_channel.py", line 75, in _send_method
self.connection.method_writer.write_method(self.channel_id,
AttributeError: 'NoneType' object has no attribute 'method_writer'
Why is this happening? Not only does it not ack, but it also purges all of the other tasks that are left in the queue (big problem).
How do I fix this?
UPDATE
Below is the stack trace with everything updated (pip install -U kombu amqp amqplib celery):
[2013-02-13 11:58:05,357: CRITICAL/MainProcess] Internal error: AttributeError("'NoneType' object has no attribute 'method_writer'",)
Traceback (most recent call last):
File "/usr/local/lib/python2.7/dist-packages/celery/worker/__init__.py", line 372, in process_task
req.execute_using_pool(self.pool)
File "/usr/local/lib/python2.7/dist-packages/celery/worker/job.py", line 219, in execute_using_pool
timeout=task.time_limit)
File "/usr/local/lib/python2.7/dist-packages/celery/concurrency/base.py", line 137, in apply_async
**options)
File "/usr/local/lib/python2.7/dist-packages/celery/concurrency/base.py", line 27, in apply_target
callback(target(*args, **kwargs))
File "/usr/local/lib/python2.7/dist-packages/celery/worker/job.py", line 333, in on_success
self.acknowledge()
File "/usr/local/lib/python2.7/dist-packages/celery/worker/job.py", line 439, in acknowledge
self.on_ack(logger, self.connection_errors)
File "/usr/local/lib/python2.7/dist-packages/kombu/transport/base.py", line 98, in ack_log_error
self.ack()
File "/usr/local/lib/python2.7/dist-packages/kombu/transport/base.py", line 93, in ack
self.channel.basic_ack(self.delivery_tag)
File "/usr/local/lib/python2.7/dist-packages/amqp/channel.py", line 1562, in basic_ack
self._send_method((60, 80), args)
File "/usr/local/lib/python2.7/dist-packages/amqp/abstract_channel.py", line 57, in _send_method
self.connection.method_writer.write_method(
AttributeError: 'NoneType' object has no attribute 'method_writer'
Exiting in task_postrun is not recommended as task_postrun is executed outside of the "task body" error handling.
Exactly what happens when a task calls sys.exit is not well defined, and it actually depends on the pool being used.
With multiprocessing the child process will simply be replaced by a new one.
In other pools the worker will shut down, but this is something that is likely to change so that it's consistent with the multiprocessing behavior.
Calling exit outside of the task body is regarded as an internal error (crash).
The "task body" is whatever executes at task.__call__().
I think maybe a better solution for this would be to use a custom execution strategy:
from celery.worker import strategy
from functools import wraps


@staticmethod
def shutdown_after_strategy(task, app, consumer):
    default_handler = strategy.default(task, app, consumer)

    def _shutdown_to_exit_after(fun):
        @wraps(fun)
        def _inner(*args, **kwargs):
            try:
                return fun(*args, **kwargs)
            finally:
                raise SystemExit()
        return _inner

    return _shutdown_to_exit_after(default_handler)


@celery.task(Strategy=shutdown_after_strategy)
def shutdown_after():
    print('will shutdown after this')
This isn't exactly beautiful, but the execution strategy is there to optimize task execution, not to be easily extendable (the worker "precompiles" the execution path for each task type by caching Task.Strategy).
In Celery 3.1 you can extend the worker and consumer using "bootsteps", so there will likely be a prettier solution then.
