I have a listener server that starts a new thread for each client handler. Each handler can use:
proc = subprocess.Popen(argv, executable="./Main.py", stdout=_stdout, stderr=subprocess.STDOUT, close_fds=False)
to run a new process in the background, after which the handler thread exits.
After the background process finishes, it stays in the Z (zombie) state. Is it possible to ask subprocess.Popen() to handle SIGCHLD so that zombies are avoided?
I don't want to poll the process state with proc.wait(), since that would require keeping a list of all running background processes...
UPD
I need to run some processes in the background without leaving zombies, and also run other processes with .communicate() to read their output. In that case, using the signal trick from koblas, I get an error:
File "./PyZWServer.py", line 115, in IsRunning
return (subprocess.Popen(["pgrep", "-c", "-x", name], stdout=subprocess.PIPE).communicate()[0] == "0")
File "/usr/lib/python2.6/subprocess.py", line 698, in communicate
self.wait()
File "/usr/lib/python2.6/subprocess.py", line 1170, in wait
pid, sts = _eintr_retry_call(os.waitpid, self.pid, 0)
File "/usr/lib/python2.6/subprocess.py", line 465, in _eintr_retry_call
return func(*args)
OSError: [Errno 10] No child processes
Error happened during handling of client
If you tell the kernel to ignore SIGCHLD, it will handle the wait/reap work for you.
Specifically, the line:
signal.signal(signal.SIGCHLD, signal.SIG_IGN)
will take care of your zombies.
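Worth noting, given the update above: once SIGCHLD is ignored, the kernel reaps every child itself, so Popen.communicate()/wait() on the children you do care about can no longer collect a status, which is exactly where the OSError: [Errno 10] No child processes in the question comes from. A minimal sketch of the trade-off (the commands are illustrative, not the original server code):
import signal
import subprocess

# Ask the kernel to auto-reap children: fire-and-forget processes no longer
# turn into zombies and no bookkeeping of running children is needed.
signal.signal(signal.SIGCHLD, signal.SIG_IGN)

subprocess.Popen(["sleep", "10"])  # background job, reaped automatically on exit

# ...but the kernel also reaps the children you do want to wait for, so on the
# Python 2.6 from the question communicate() dies with OSError [Errno 10]
# "No child processes" (newer Python versions handle the ECHILD internally and
# then just report a return code of 0).
proc = subprocess.Popen(["pgrep", "-c", "-x", "somename"], stdout=subprocess.PIPE)
output = proc.communicate()[0]
One workaround is to restore the default disposition with signal.signal(signal.SIGCHLD, signal.SIG_DFL) before starting the children you need to communicate() with, or to skip SIG_IGN entirely and reap periodically with os.waitpid(-1, os.WNOHANG).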
Related
I am working on a Python service that spawns a Process to handle the workload. Since I don't know at the start of the service how many workers I need, I chose not to use Pool. The following is a simplified version:
import multiprocessing as mp
import time
from datetime import datetime
def _print(s):  # just my cheap logging utility
    print(f'{datetime.now()} - {s}')

def run_in_process(q, evt):
    _print(f'starting process job')
    while not evt.is_set():  # True
        try:
            x = q.get(timeout=2)
            _print(f'received {x}')
        except:
            _print(f'timed-out')

if __name__ == '__main__':
    with mp.Manager() as manager:
        q = manager.Queue()
        evt = manager.Event()
        p = mp.Process(target=run_in_process, args=(q, evt))
        p.start()
        time.sleep(2)

        data = 100
        while True:
            try:
                q.put(data)
                time.sleep(0.5)
                data += 1
                if data > 110:
                    break
            except KeyboardInterrupt:
                _print('finishing...')
                # p.terminate()
                break
        time.sleep(3)

        _print('setting event 0')
        evt.set()
        _print('joining process')
        p.join()
        _print('done')
The program works and exits gracefully, without any error messages. However, if I use Ctrl-C before I have all 10 events processed, I get the following error before it exits.
2022-04-01 12:41:06.866484 - received 101
2022-04-01 12:41:07.367628 - received 102
^C2022-04-01 12:41:07.507805 - timed-out
2022-04-01 12:41:07.507886 - finishing...
Process Process-2:
Traceback (most recent call last):
File "/<path-omitted>/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/<path-omitted>/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "mp.py", line 10, in run_in_process
while not evt.is_set(): # True
File "/<path-omitted>/python3.7/multiprocessing/managers.py", line 1088, in is_set
return self._callmethod('is_set')
File "/<path-omitted>/python3.7/multiprocessing/managers.py", line 819, in _callmethod
kind, result = conn.recv()
File "/<path-omitted>/python3.7/multiprocessing/connection.py", line 250, in recv
buf = self._recv_bytes()
File "/<path-omitted>/python3.7/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/<path-omitted>/python3.7/multiprocessing/connection.py", line 379, in _recv
chunk = read(handle, remaining)
ConnectionResetError: [Errno 104] Connection reset by peer
2022-04-01 12:41:10.511334 - setting event 0
Traceback (most recent call last):
File "mp.py", line 42, in <module>
evt.set()
File "/<path-omitted>/python3.7/multiprocessing/managers.py", line 1090, in set
return self._callmethod('set')
File "/<path-omitted>/python3.7/multiprocessing/managers.py", line 818, in _callmethod
conn.send((self._id, methodname, args, kwds))
File "/<path-omitted>/python3.7/multiprocessing/connection.py", line 206, in send
self._send_bytes(_ForkingPickler.dumps(obj))
File "/<path-omitted>/python3.7/multiprocessing/connection.py", line 404, in _send_bytes
self._send(header + buf)
File "/<path-omitted>/python3.7/multiprocessing/connection.py", line 368, in _send
n = write(self._handle, buf)
BrokenPipeError: [Errno 32] Broken pipe
A few observations:
The double error message looks exactly the same as what I see when I press Ctrl-C in my actual project, so I think this is a good representation of my problem.
If I add p.terminate(), it doesn't change the behavior when the program is left to finish by itself. But if I press Ctrl-C halfway through, I encounter the error message only once; I guess it comes from the main thread/process.
If I change while not evt.is_set(): in run_in_process to an infinite loop (while True:) and let the program finish its course, I continue to see periodic timed-out prints, which makes sense. What I don't understand is that if I press Ctrl-C, the terminal starts spewing timed-out messages with no time gap between them. What happened?
My ultimate question is: what is the correct way to construct this program so that when Ctrl-C is pressed (or a termination signal is sent to the program, for that matter), it stops gracefully?
I found a solution to this problem myself, using the signal module.
The idea is to set up a signal catcher for specific signals, such as signal.SIGINT and signal.SIGTERM.
import multiprocessing as mp
from threading import Event
import signal
if __name__ == '__main__':
    main_evt = Event()

    def stop_main_handler(signum, frame):
        if not main_evt.is_set():
            main_evt.set()

    signal.signal(signal.SIGINT, stop_main_handler)

    with mp.Manager() as manager:
        # creating mp queue, event and process
        q = manager.Queue()
        evt = manager.Event()
        p = mp.Process(target=..., args=(q, evt))
        p.start()

        while not main_evt.is_set():
            ...  # processing data

        # cleanup
        evt.set()
        p.join()
Or you can wrap it in an object-oriented fashion:
import time  # needed for the sleep in block_until_signaled()

class SignalCatcher(object):
    def __init__(self):
        self._main_evt = Event()
        # install the handlers as soon as the catcher is created
        signal.signal(signal.SIGINT, self._stop_handler)
        signal.signal(signal.SIGTERM, self._stop_handler)

    def _stop_handler(self, signum, frame):
        if not self._main_evt.is_set():
            self._main_evt.set()

    def block_until_signaled(self):
        while not self._main_evt.is_set():
            time.sleep(2)
Then you can use it as follows:
if __name__ == '__main__':
    # This has to be created outside the with-block. The multiprocessing library
    # seems to start an extra (manager) process, and if you create sc inside the
    # with-context, it fails to signal each process correctly.
    sc = SignalCatcher()

    with mp.Manager() as manager:
        # create the process and start it
        # ...
        sc.block_until_signaled()
        # cleanup
        # ...
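For completeness, a hedged end-to-end sketch that wires the SignalCatcher above into the run_in_process worker from the question. It assumes the handlers are registered in __init__ as shown, and the default fork start method on Linux (so children created after the catcher inherit its signal handlers); for brevity it peeks at the catcher's internal event instead of adding an accessor method:
import multiprocessing as mp
import queue
import time

def run_in_process(q, evt):
    # the worker from the question, minus the logging helper
    while not evt.is_set():
        try:
            print('received', q.get(timeout=2))
        except queue.Empty:
            print('timed-out')

if __name__ == '__main__':
    sc = SignalCatcher()  # install handlers before any child process exists
    with mp.Manager() as manager:
        q = manager.Queue()
        evt = manager.Event()
        p = mp.Process(target=run_in_process, args=(q, evt))
        p.start()

        data = 100
        while data <= 110 and not sc._main_evt.is_set():
            q.put(data)
            data += 1
            time.sleep(0.5)

        evt.set()  # ask the worker to leave its loop
        p.join()
On Ctrl-C the handler only sets the event, so the main process finishes the loop, signals the worker through the manager Event, and joins it; neither the worker nor the manager process is killed mid-conversation, which is the kind of shutdown that produced the ConnectionResetError/BrokenPipeError pair in the question.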
I am using multiprocessing.Pool together with an HTTP server in Python. It works great, but when I terminate, I get a slew of errors from all the SpawnPoolWorker processes, and I'm wondering how to avoid this.
My main code:
def run(self):
    global pool
    port = self.arguments.port
    pool = multiprocessing.Pool(processes=self.arguments.threads)
    with http.server.HTTPServer(("", port), Handler) as daemon:
        print(f"serving on port {port}")
        while True:
            try:
                daemon.handle_request()
            except KeyboardInterrupt:
                print("\nexiting")
                pool.terminate()
                pool.join()
                return 0
I've tried doing nothing to the pool, calling pool.close(), and skipping the join. But even if I just run the code above, without ever accessing the port or submitting anything to the pool, I still get a random batch of tracebacks like this when I press Ctrl-C:
Process SpawnPoolWorker-8:
Process SpawnPoolWorker-4:
Traceback (most recent call last):
File "/opt/homebrew/Cellar/python#3.10/3.10.1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/process.py", line 315, in _bootstrap
self.run()
File "/opt/homebrew/Cellar/python#3.10/3.10.1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/opt/homebrew/Cellar/python#3.10/3.10.1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/pool.py", line 114, in worker
task = get()
File "/opt/homebrew/Cellar/python#3.10/3.10.1/Frameworks/Python.framework/Versions/3.10/lib/python3.10/multiprocessing/queues.py", line 365, in get
with self._rlock:
File "/opt/homebrew/Cellar/python#3.10/3.10.1/F
how do I exit the pool cleanly, with no errors, and with no output?
OK, I'm stupid: the Ctrl-C was also interrupting all the child processes. This fixed it:
def ignore_control_c():
    signal.signal(signal.SIGINT, signal.SIG_IGN)

pool = multiprocessing.Pool(processes=self.arguments.threads, initializer=ignore_control_c)
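For context, a hedged sketch of how the fix slots into the run() method from the question; Handler defaults to the standard SimpleHTTPRequestHandler here because the original handler class isn't shown:
import http.server
import multiprocessing
import signal

def ignore_control_c():
    # runs once in every pool worker, so Ctrl-C only interrupts the parent
    signal.signal(signal.SIGINT, signal.SIG_IGN)

def run(port, threads, Handler=http.server.SimpleHTTPRequestHandler):
    pool = multiprocessing.Pool(processes=threads, initializer=ignore_control_c)
    with http.server.HTTPServer(("", port), Handler) as daemon:
        print(f"serving on port {port}")
        while True:
            try:
                daemon.handle_request()
            except KeyboardInterrupt:
                print("\nexiting")
                pool.terminate()
                pool.join()
                return 0

if __name__ == '__main__':
    run(8000, 4)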
I would like to pass messages out from my function running in a process pool while the function is still running.
My application uses asyncio and multiprocessing queues to receive and distribute messages to a worker pool using asyncio.run_in_executor(). I manually created the pool so I could provide an initializer.
The problem I have is that I would like the functions that are running in the executor pool to be able to send messages out to the asyncio loop. This is how I started my new application process:
self._application = Application(self.outgoing_queue, self.incoming_queue, application_cores, log_level=logging.INFO)
self._application_process = mp.Process(target=self._application.run)
self._application_process.start()
the queues are from:
self.outgoing_queue = mp.Queue()
self.incoming_queue = mp.Queue()
I can't use my asyncio queue or the multiprocessing queue, since those can't be passed to the pool processes by this method:
async def run_operation():
    kwargs = {
        'out_queue': self._work_pool_queue
    }
    func = functools.partial(attribute, *args, **kwargs)
    result = await self._loop.run_in_executor(self._work_pool, func)
    result_msg = common.messages.MessageResult(result, msg.reply_id, msg.cpu_cost)
    await self.outgoing_send(result_msg)

asyncio.create_task(run_operation())
My self._work_pool is created with:
self._work_pool = concurrent.futures.ProcessPoolExecutor(max_workers=self._cores, initializer=_work_pool_init)
Trying that produces the following traceback:
Task exception was never retrieved
future: <Task finished coro=<Application.message_router.<locals>.run_operation() done, defined at c:\users\brian\gitlab\rf-applications\rfapplications\common\application.py:95> exception=RuntimeError('Queue objects should only be shared between processes through inheritance')>
concurrent.futures.process._RemoteTraceback:
"""
Traceback (most recent call last):
File "c:\Program Files\Python37\lib\multiprocessing\queues.py", line 236, in _feed
obj = _ForkingPickler.dumps(obj)
File "c:\Program Files\Python37\lib\multiprocessing\reduction.py", line 51, in dumps
cls(buf, protocol).dump(obj)
File "c:\Program Files\Python37\lib\multiprocessing\queues.py", line 58, in __getstate__
context.assert_spawning(self)
File "c:\Program Files\Python37\lib\multiprocessing\context.py", line 356, in assert_spawning
' through inheritance' % type(obj).__name__
RuntimeError: Queue objects should only be shared between processes through inheritance
"""
I was looking at using a Manager().Queue() (https://docs.python.org/3/library/multiprocessing.html#managers), since those can be sent to the process pool (Python multiprocessing Pool Queues communication). However, these queues seem to open up the possibility of remote connections, which I would like to avoid (so far I use secure websockets to communicate between remote machines).
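For what it's worth, a hedged sketch of that Manager().Queue() approach: the queue proxy pickles cleanly, so it can be passed into the executor call as an ordinary argument (the worker function and queue handling below are illustrative stand-ins, not the original application code). As far as I can tell, a manager started this way listens on a local Unix-domain socket or named pipe with a random authkey rather than on a network port, but that is worth verifying against your security requirements.
import asyncio
import concurrent.futures
import functools
import multiprocessing as mp

def worker(x, out_queue):
    # runs in a pool process; the proxy forwards put() back to the manager
    out_queue.put(f'progress: started work on {x}')
    return x * 2

async def main():
    loop = asyncio.get_running_loop()
    with mp.Manager() as manager:
        out_queue = manager.Queue()
        with concurrent.futures.ProcessPoolExecutor(max_workers=2) as pool:
            func = functools.partial(worker, 21, out_queue=out_queue)
            result = await loop.run_in_executor(pool, func)
        print('result:', result)
        print('message:', out_queue.get_nowait())

if __name__ == '__main__':
    asyncio.run(main())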
I'm a novice with Python and these libraries/modules. I'm writing a simple ping-test network scanner as a learning project.
I first developed a script using asyncio to ping addresses on a network
# ip_test.py
import asyncio
import ipaddress

async def ping(addr):
    proc = await asyncio.create_subprocess_exec(
        'ping', '-W', '1', '-c', '3', addr,
        stdout=asyncio.subprocess.PIPE
    )
    await proc.wait()
    return proc.returncode

async def pingMain(net):
    # hosts() returns a list of IPv4Address objects
    result = await asyncio.gather(*(ping(str(addr)) for addr in net.hosts()))
    return result

def getHosts(net_):  # net_ is an IPv4Network object
    return asyncio.run(pingMain(net_))
    # Returns a list of return codes, which I then zip with the list of scanned IPs
When I open Python and run the following, it works as expected:
import ip_test as iptest
import ipaddress
print(iptest.getHosts(ipaddress.ip_network('192.168.1.0/29')))
#prints: [0, 0, 0, 1, 1, 1] as expected on this network
However, the ultimate goal is to take input from the user via form input (the results are recorded to a database; this is a simplified example for illustrative purposes). I collect the input via a Flask route:
@app.route("/newscan", methods=['POST'])
def newScan():
    form = request.form
    networkstring = form.get('network') + "/" + form.get('mask')
    result = iptest.getHosts(ipaddress.ip_network(networkstring))
    return result
When I call the module this way, I get an error: RuntimeError: Cannot add child handler, the child watcher does not have a loop attached.
Why does this work when I import the module and run the function from the command line, but not when I call it with the same input from a Flask route?
EDIT: Traceback:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 2463, in __call__
return self.wsgi_app(environ, start_response)
File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 2449, in wsgi_app
response = self.handle_exception(e)
File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1866, in handle_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
raise value
File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 2446, in wsgi_app
response = self.full_dispatch_request()
File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1951, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1820, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/usr/local/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
raise value
File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1949, in full_dispatch_request
rv = self.dispatch_request()
File "/usr/local/lib/python3.7/site-packages/flask/app.py", line 1935, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/app/app.py", line 41, in newScan
result = iptest.getHosts(ipaddress.ip_network(networkstring))
File "/app/ip_test.py", line 22, in getHosts
res = asyncio.run(pingMain(net_))
File "/usr/local/lib/python3.7/asyncio/runners.py", line 43, in run
return loop.run_until_complete(main)
File "/usr/local/lib/python3.7/asyncio/base_events.py", line 579, in run_until_complete
return future.result()
File "/app/ip_test.py", line 15, in pingMain
result = await asyncio.gather(*(ping(str(addr)) for addr in net.hosts()))
File "/app/ip_test.py", line 7, in ping
stdout=asyncio.subprocess.PIPE
File "/usr/local/lib/python3.7/asyncio/subprocess.py", line 217, in create_subprocess_exec
stderr=stderr, **kwds)
File "/usr/local/lib/python3.7/asyncio/base_events.py", line 1529, in subprocess_exec
bufsize, **kwargs)
File "/usr/local/lib/python3.7/asyncio/unix_events.py", line 193, in _make_subprocess_transport
self._child_watcher_callback, transp)
File "/usr/local/lib/python3.7/asyncio/unix_events.py", line 930, in add_child_handler
"Cannot add child handler, "
RuntimeError: Cannot add child handler, the child watcher does not have a loop attached
You are trying to run your async subprocess from a thread other than the main thread. This requires some initial setup from the main thread, see the Subprocesses and Threads section of the asyncio Subprocesses documentation:
Standard asyncio event loop supports running subprocesses from different threads, but there are limitations:
An event loop must run in the main thread.
The child watcher must be instantiated in the main thread before executing subprocesses from other threads. Call the get_child_watcher() function in the main thread to instantiate the child watcher.
What is happening here is that your WSGI server is using multiple threads to handle incoming requests, so the request handler is not running on the main thread. But your code uses asyncio.run() to start a new event loop, and so your asyncio.create_subprocess_exec() call will fail as there is no child watcher on the main thread.
You'd have to start a loop (and not stop it) from the main thread, and call asyncio.get_child_watcher() on that thread, for your code not to fail:
# to be run on the main thread, set up a subprocess child watcher
import asyncio
import threading

assert threading.current_thread() is threading.main_thread()
asyncio.get_event_loop()
asyncio.get_child_watcher()
Note: this restriction only applies to Python versions up to 3.7; it has been lifted in Python 3.8.
However, just to run a bunch of subprocesses and wait for these to complete, using asyncio is overkill; your OS can run subprocesses in parallel just fine. Just use subprocess.Popen() and check each process via the Popen.poll() method:
import subprocess
import time

def ping_proc(addr):
    return subprocess.Popen(
        ['ping', '-W', '1', '-c', '3', addr],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL
    )

def get_hosts(net):
    # hosts() returns list of Ipv4Address objects
    procs = [ping_proc(str(addr)) for addr in net.hosts()]
    while any(p.poll() is None for p in procs):
        time.sleep(0.1)
    return [p.returncode for p in procs]
Popen.poll() does not block; if Popen.returncode is not yet set it checks for the process status with the OS with waitpid([pid], WNOHANG) and returns either None if the process is still running, or the now-available returncode value. The above just checks for those statuses in a loop with a short sleep in between to avoid thrashing.
The asyncio subprocess wrapper (on POSIX at least) either uses a SIGCHLD signal handler to be notified of child processes exiting or (in Python 3.8) uses a separate thread per child process to use a blocking waitpid() call on each subprocess created. You could implement the same signal handler, but take into account that signal handlers can only be registered on the main thread, so you'd have to jump through several hoops to communicate incoming SIGCHLD signal information to the right thread.
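If you do go the SIGCHLD route, here is a hedged sketch of the reaping side (main thread, POSIX only; the exit_codes dict is illustrative). Note that reaping children yourself means Popen.poll()/wait() can no longer report their real exit status:
import os
import signal

exit_codes = {}  # pid -> raw wait status, filled in by the handler

def _reap_children(signum, frame):
    # reap every child that has exited so far; WNOHANG keeps this non-blocking
    while True:
        try:
            pid, status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            return  # no children left at all
        if pid == 0:
            return  # children exist, but none have exited yet
        exit_codes[pid] = status

signal.signal(signal.SIGCHLD, _reap_children)  # must be registered on the main thread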
I have a coroutine that waits for a rising edge on a signal:
@cocotb.coroutine
def wait_for_rise(self):
    yield RisingEdge(self.dut.mysignal)
I'm launching it in my "main" test function like this:
mythread = cocotb.fork(wait_for_rise())
I want to stop it after a while, even if no rising edge happens. I tried to "kill" it:
mythread.kill()
But an exception happens:
Send raised exception: 'RunningCoroutine' object has no attribute '_join'
File "/opt/cocotb/cocotb/decorators.py", line 121, in send
return self._coro.send(value)
File "/myproject.py", line 206, in i2c_read
wTXDRwthread.kill()
File "/opt/cocotb/cocotb/decorators.py", line 151, in kill
cocotb.scheduler.unschedule(self)
File "/opt/cocotb/cocotb/scheduler.py", line 453, in unschedule
if coro._join in self._trigger2coros:
Is there a way to stop a forked coroutine properly?
This very much looks like it is the same problem as in https://github.com/potentialventures/cocotb/issues/650 - you can subscribe to the issue to be notified when its status changes.
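Until that issue is fixed, one possible workaround, if I remember the yield-style cocotb API correctly (the Timer duration is arbitrary here), is to avoid kill() altogether and let the coroutine race the edge against a timeout, so it ends on its own either way:
import cocotb
from cocotb.triggers import RisingEdge, Timer

@cocotb.coroutine
def wait_for_rise_or_timeout(self, timeout=10000):
    # yielding a list of triggers resumes on whichever fires first and
    # returns the trigger that fired, so there is nothing left to kill()
    fired = yield [RisingEdge(self.dut.mysignal), Timer(timeout)]
    if isinstance(fired, Timer):
        print("timed out waiting for a rising edge")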