I want to add a few tests for how my CLI app handles different signals (SIGTERM, etc.). I am using Click's native testing solution, click.testing.CliRunner, alongside pytest.
The test looks pretty standard and simple:
def test_breaking_process(server, runner):
    address = server.router({'^/$': Page("").exists().slow()})
    runner = CliRunner(mix_stderr=True)
    args = [address, '--no-colors', '--no-progress']

    result = runner.invoke(main, args)
    assert result.exit_code == 0
And here I am stuck: how could I send SIGTERM to the process running inside runner.invoke? I see no problem doing it with e2e tests (calling the executable rather than CliRunner), but I would like to try to implement this with CliRunner, or at least be able to send the signal via os.kill (a sketch of the e2e variant is shown below).
Is there a way to do it?
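For reference, the e2e variant mentioned above could look roughly like this; this is only a sketch, assuming the CLI is installed as a hypothetical mycli executable and reusing the server fixture and Page helper from the test above:

import os
import signal
import subprocess
import time

def test_breaking_process_e2e(server):
    # the same slow endpoint as in the CliRunner test
    url = server.router({'^/$': Page("").exists().slow()})
    # launch the real executable instead of invoking main() in-process
    proc = subprocess.Popen(["mycli", url, "--no-colors", "--no-progress"])
    time.sleep(0.2)                      # give the process a moment to start
    os.kill(proc.pid, signal.SIGTERM)    # or proc.send_signal(signal.SIGTERM)
    proc.wait(timeout=5)
    assert proc.returncode == 0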
So, if you want to test how your Click-powered application handles different signals, you can use the following approach.
def test_breaking_process(server, runner):
    from multiprocessing import Queue, Process
    from threading import Timer
    from time import sleep
    from os import kill, getpid
    from signal import SIGINT

    url = server.router({'^/$': Page("").slow().exists()})
    args = [url, '--no-colors', '--no-progress']

    q = Queue()

    # Run our app in a subprocess and, after a short delay, send it SIGINT.
    # The results are passed back via the queue.
    def background():
        Timer(0.2, lambda: kill(getpid(), SIGINT)).start()
        result = runner.invoke(main, args)
        q.put(('exit_code', result.exit_code))
        q.put(('output', result.output))

    p = Process(target=background)
    p.start()

    results = {}

    while p.is_alive():
        sleep(0.1)

    while not q.empty():
        key, value = q.get()
        results[key] = value

    assert results['exit_code'] == 0
    assert "Results can be inconsistent, as execution was terminated" in results['output']
Related
I have a Python script which launches a series of sub-processes. They need to run "forever", but they occasionally die or get killed. When this happens I need to restart the process using the same arguments as the one which died.
This is a very simplified version:
[edit: this is the less simplified version, which includes "restart" code]
import multiprocessing
import time
import random

def printNumber(number):
    print("starting :", number)
    while random.randint(0, 5) > 0:
        print(number)
        time.sleep(2)

if __name__ == '__main__':
    children = []  # list of child processes
    args = {}      # maps pid -> argument the process was started with

    for processNumber in range(10, 15):
        p = multiprocessing.Process(
            target=printNumber,
            args=(processNumber,)
        )
        children.append(p)
        p.start()
        args[p.pid] = processNumber

    while True:
        time.sleep(1)
        for n, p in enumerate(children):
            if not p.is_alive():
                # get parameters the dead child was started with
                pidArgs = args[p.pid]
                del args[p.pid]
                print("n, args, p: ", n, pidArgs, p)
                children.pop(n)
                # start a new process with the same args
                p = multiprocessing.Process(
                    target=printNumber,
                    args=(pidArgs,)
                )
                children.append(p)
                p.start()
                args[p.pid] = pidArgs
I have updated the example to illustrate how I want the processes to be restarted if one crashes, is killed, etc., keeping track of which PID was started with which args.
Is this the "best" way to do this, or is there a more "pythonic" way of doing it?
I think I would create a separate thread for each Process and use a ProcessPoolExecutor. Executors have a useful method, submit, which returns a Future. You can wait on each Future and re-launch the Executor when the Future is done. The arguments to the function are stored on the thread object, so restarting is just a simple loop.
import threading
from concurrent.futures import ProcessPoolExecutor
import time
import random
import traceback

def printNumber(number):
    print("starting :", number)
    while random.randint(0, 5) > 0:
        print(number)
        time.sleep(2)

class KeepRunning(threading.Thread):
    def __init__(self, func, *args, **kwds):
        self.func = func
        self.args = args
        self.kwds = kwds
        super().__init__()

    def run(self):
        while True:
            with ProcessPoolExecutor(max_workers=1) as pool:
                future = pool.submit(self.func, *self.args, **self.kwds)
                try:
                    future.result()
                except Exception:
                    traceback.print_exc()

if __name__ == '__main__':
    for process_number in range(10, 15):
        keep = KeepRunning(printNumber, process_number)
        keep.start()

    while True:
        time.sleep(1)
At the end of the program is a loop to keep the main thread running. Without that, the program will attempt to exit while your Processes are still running.
For the example you provided I would just remove the exit condition from the while loop and change it to True.
As you said, though, the actual code is more complicated (why didn't you post that?). So if the process gets terminated by, let's say, an exception, just put the code inside a try/except block. You can then put said block in an infinite loop (see the sketch below).
I hope this is what you are looking for, but it seems to be the right way to do it given the goal and the information you provided.
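A minimal sketch of that idea, assuming printNumber from the question is the code that may raise; note this only covers exceptions, not a process killed from outside (e.g. SIGKILL):

import random
import time
import traceback

def printNumber(number):
    while True:                      # infinite outer loop: restart on failure
        try:
            print("starting :", number)
            while True:              # exit condition removed, runs "forever"
                print(number)
                time.sleep(2)
        except Exception:
            traceback.print_exc()    # log the error, then loop around and restart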
Instead of just starting the processes and forgetting about them, you can save the list of processes together with their arguments and keep checking whether they are alive.
For example:
if __name__ == '__main__':
    process_list = []
    for processNumber in range(5):
        process = multiprocessing.Process(
            target=printNumber,
            args=(processNumber,)
        )
        process_list.append((process, processNumber))
        process.start()

    while True:
        for running_process, process_args in list(process_list):
            if not running_process.is_alive():
                new_process = multiprocessing.Process(target=printNumber, args=(process_args,))
                process_list.remove((running_process, process_args))  # remove the terminated process
                process_list.append((new_process, process_args))
                new_process.start()
I must say that I'm not sure Python is the best place to do this; you may want to look at scheduler services like Jenkins or something similar.
Right now, I'm using subprocess to run a long-running job in the background. For multiple reasons (PyInstaller + AWS CLI) I can't use subprocess anymore.
Is there an easy way to achieve the same thing as below? That is, running a long-running Python function in a multiprocessing pool (or something else) while doing real-time processing of its stdout/stderr?
import subprocess

process = subprocess.Popen(
    ["python", "long-job.py"],
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE,
    shell=True,
)

while True:
    out = process.stdout.read(2000).decode()
    if not out:
        err = process.stderr.read().decode()
    else:
        err = ""
    if (out == "" or err == "") and process.poll() is not None:
        break

    live_stdout_process(out)
Thanks
Getting this to work cross-platform is messy: for a start, the Windows implementation of non-blocking pipes is neither user friendly nor portable.
One option is to have your application read its command line arguments and conditionally execute a file; that way you get to keep using subprocess, since you will just be launching yourself with a different argument (a sketch of this follows below).
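A minimal sketch of that self-relaunch idea, purely illustrative; the --run-long-job flag and long_job() are made-up names, and a PyInstaller build would re-launch sys.executable directly rather than a script path:

import subprocess
import sys

def long_job():
    # placeholder for the real long-running work
    print("doing the long-running work...")

if __name__ == "__main__":
    if "--run-long-job" in sys.argv:
        # child mode: we were re-launched just to run the job
        long_job()
    else:
        # parent mode: re-launch this same script with the special flag
        # and stream its output line by line
        proc = subprocess.Popen(
            [sys.executable, sys.argv[0], "--run-long-job"],
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,
            text=True,
        )
        for line in proc.stdout:
            print("child:", line, end="")
        proc.wait()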
But to keep it to multiprocessing:
The output must be logged to queues instead of pipes.
You need the child to execute a Python file; this can be done using runpy to execute the file as __main__.
This runpy call should run inside a multiprocessing child, and that child must first redirect its stdout and stderr in the pool initializer.
When an error happens, your main application must catch it, but if it is busy reading the output it cannot also wait for the error, so a helper thread has to start the process pool and wait for the error.
The main process has to create the queues, launch the helper thread, and read the output.
Putting it all together:
import multiprocessing
from multiprocessing import Queue
import sys
import concurrent.futures
import threading
import traceback
import runpy
import time

class StdoutQueueWrapper:
    def __init__(self, queue: Queue):
        self._queue = queue

    def write(self, text):
        self._queue.put(text)

    def flush(self):
        pass

def function_to_run():
    # runpy.run_path("long-job.py", run_name="__main__")  # run long-job.py
    print("hello")    # print something
    raise ValueError  # error out

def initializer(stdout_queue: Queue, stderr_queue: Queue):
    sys.stdout = StdoutQueueWrapper(stdout_queue)
    sys.stderr = StdoutQueueWrapper(stderr_queue)

def thread_function(child_stdout_queue, child_stderr_queue):
    with concurrent.futures.ProcessPoolExecutor(1, initializer=initializer,
                                                initargs=(child_stdout_queue, child_stderr_queue)) as pool:
        result = pool.submit(function_to_run)
        try:
            result.result()
        except Exception:
            child_stderr_queue.put(traceback.format_exc())

if __name__ == "__main__":
    child_stdout_queue = multiprocessing.Queue()
    child_stderr_queue = multiprocessing.Queue()

    child_thread = threading.Thread(target=thread_function,
                                    args=(child_stdout_queue, child_stderr_queue),
                                    daemon=True)
    child_thread.start()

    while True:
        while not child_stdout_queue.empty():
            var = child_stdout_queue.get()
            print(var, end='')
        while not child_stderr_queue.empty():
            var = child_stderr_queue.get()
            print(var, end='')
        if not child_thread.is_alive():
            break
        time.sleep(0.01)  # check the output every 0.01 seconds
Note that a direct consequence of running this under multiprocessing is that if the child runs into a segmentation fault or some other unrecoverable error, the parent is affected as well, hence launching yourself under subprocess may seem the better option if segfaults are expected.
In this script I was looking to launch a given program and monitor it for as long as the program exists. I ended up using the threading module's Timer for controlling a loop that writes a specific stat of the launched process (in this case, mspaint) to a file and prints it to the console.
The problem arises when I hit CTRL + C in the console or when I close mspaint: the script captures either of the two events only after the time defined for the interval has completely run out. These events make the script stop.
For example, if a 20 second interval is set, once the script has started, if at second 5 I either hit CTRL + C or close mspaint, the script will stop only after the remaining 15 seconds have passed.
I would like the script to stop right away when I either hit CTRL + C or close mspaint (or any other process launched through this script).
The script can be used with the following command, according to the example:
python.exe mon_tool.py -p "C:\Windows\System32\mspaint.exe" -i 20
I'd really appreciate it if you could come up with a working example.
I am using Python 3.10.4 and psutil 5.9.0.
This is the code:
# mon_tool.py
import psutil, sys, os, argparse
from subprocess import Popen
from threading import Timer

debug = False

def parse_args(args):
    parser = argparse.ArgumentParser()
    parser.add_argument("-p", "--path", type=str, required=True)
    parser.add_argument("-i", "--interval", type=float, required=True)
    return parser.parse_args(args)

def exceptionHandler(exception_type, exception, traceback, debug_hook=sys.excepthook):
    '''Print user friendly error messages normally, full traceback if DEBUG on.
    Adapted from http://stackoverflow.com/questions/27674602/hide-traceback-unless-a-debug-flag-is-set
    '''
    if debug:
        print('\n*** Error:')
        debug_hook(exception_type, exception, traceback)
    else:
        print("%s: %s" % (exception_type.__name__, exception))

sys.excepthook = exceptionHandler

def validate(data):
    try:
        if data.interval < 0:
            raise ValueError
    except ValueError:
        raise ValueError(f"Time has a negative value: {data.interval}. Please use a positive value")

def main():
    args = parse_args(sys.argv[1:])
    validate(args)

    # creates the "Process monitor data" folder in the "Documents" folder
    # of the current Windows profile
    default_path: str = f"{os.path.expanduser('~')}\\Documents\\Process monitor data"
    if not os.path.exists(default_path):
        os.makedirs(default_path)
    abs_path: str = f'{default_path}\\data_test.txt'

    print("data_test.txt can be found in: " + default_path)

    # launches the provided process for the path argument, and
    # checks if the process was indeed launched
    p: Popen[bytes] = Popen(args.path)
    PID = p.pid
    isProcess: bool = True
    while isProcess:
        for proc in psutil.process_iter():
            if proc.pid == PID:
                isProcess = False

    process_stats = psutil.Process(PID)

    # creates data_test.txt and erases its content
    with open(abs_path, 'w', newline='', encoding='utf-8') as testfile:
        testfile.write("")

    # loop for writing the handles count to data_test.txt, and
    # for printing out the handles count to the console
    def process_monitor_loop():
        with open(abs_path, 'a', newline='', encoding='utf-8') as testfile:
            testfile.write(f"{process_stats.num_handles()}\n")
        print(process_stats.num_handles())
        Timer(args.interval, process_monitor_loop).start()

    process_monitor_loop()

if __name__ == '__main__':
    main()
Thank you!
I think you could use python-worker (link) as an alternative.
import time
from datetime import datetime
from worker import worker, enableKeyboardInterrupt

# make sure to execute this before running the worker to enable keyboard interrupt
enableKeyboardInterrupt()

# your codes
...

# block lines with periodic check
def block_next_lines(duration):
    t0 = time.time()
    while time.time() - t0 <= duration:
        time.sleep(0.05)  # to reduce resource consumption

def main():
    # your codes
    ...

    @worker(keyboard_interrupt=True)
    def process_monitor_loop():
        while True:
            print("hii", datetime.now().isoformat())
            block_next_lines(3)

    return process_monitor_loop()

if __name__ == '__main__':
    main_worker = main()
    main_worker.wait()
Here your process_monitor_loop will be able to stop even before the full interval (e.g. 20 seconds) has elapsed.
You can try registering a signal handler for SIGINT; that way, whenever the user presses Ctrl+C, your custom handler can clean up all of your dependencies, such as the interval timer, and exit gracefully (a sketch follows below).
See this for a simple implementation.
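A minimal sketch of that idea, assuming a single threading.Timer drives the monitoring loop (the names monitor_tick and handle_sigint are illustrative):

import signal
import sys
import time
from threading import Timer

timer = None  # the currently scheduled Timer, if any

def monitor_tick():
    global timer
    print("collecting stats...")
    timer = Timer(20, monitor_tick)  # reschedule the next tick
    timer.start()

def handle_sigint(sig, frame):
    # cancel the pending Timer so no non-daemon thread keeps the process alive
    if timer is not None:
        timer.cancel()
    print("\nStopped by Ctrl+C")
    sys.exit(0)

if __name__ == "__main__":
    signal.signal(signal.SIGINT, handle_sigint)
    monitor_tick()
    while True:        # keep the main thread alive so the handler can run
        time.sleep(1)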
This is the solution for the second part of the problem, which checks whether the launched process still exists. If it doesn't exist anymore, the script is stopped.
This solution builds on the solution for the first part of the problem, provided above by @danangjoyoo, which deals with stopping the script when CTRL + C is used.
Thank you very much once again, @danangjoyoo! :)
This is the code for the second part of the problem:
import time, psutil, sys, os
from datetime import datetime
from worker import worker, enableKeyboardInterrupt, abort_all_thread, ThreadWorkerManager
from threading import Timer

# make sure to execute this before running the worker to enable keyboard interrupt
enableKeyboardInterrupt()

# block lines with periodic check
def block_next_lines(duration):
    t0 = time.time()
    while time.time() - t0 <= duration:
        time.sleep(0.05)  # to reduce resource consumption

def main():
    # launches mspaint, gets its PID and checks if it was indeed launched
    path = r"C:\Windows\System32\mspaint.exe"
    p = psutil.Popen(path)
    PID = p.pid
    isProcess: bool = True
    while isProcess:
        for proc in psutil.process_iter():
            if proc.pid == PID:
                isProcess = False

    interval = 5

    global counter
    counter = 0

    # allows sub_proccess to run only once
    global run_sub_process_once
    run_sub_process_once = 1

    @worker(keyboard_interrupt=True)
    def process_monitor_loop():
        while True:
            print("hii", datetime.now().isoformat())

            def sub_proccess():
                '''
                Checks every second if the launched process still exists.
                If the process doesn't exist anymore, the script will be stopped.
                '''
                print("Process online:", psutil.pid_exists(PID))
                t = Timer(1, sub_proccess)
                t.start()

                global counter
                counter += 1
                print(counter)

                # Checks if the worker thread is alive.
                # If it is not alive, it will kill the thread spawned by sub_proccess,
                # hence stopping the script.
                for _, key in enumerate(ThreadWorkerManager.allWorkers):
                    w = ThreadWorkerManager.allWorkers[key]
                    if not w.is_alive:
                        t.cancel()

                if not psutil.pid_exists(PID):
                    abort_all_thread()
                    t.cancel()

            global run_sub_process_once
            if run_sub_process_once:
                run_sub_process_once = 0
                sub_proccess()

            block_next_lines(interval)

    return process_monitor_loop()

if __name__ == '__main__':
    main_worker = main()
    main_worker.wait()
Also, I should note that @danangjoyoo's solution serves as a Windows alternative to signal.pause(); it only deals with the CTRL + C part of the problem. signal.pause() works only on Unix systems. This is how it would have been used in my case, had this been a Unix system:
import signal, sys
from threading import Timer

def main():
    def signal_handler(sig, frame):
        print('\nYou pressed Ctrl+C!')
        sys.exit(0)

    signal.signal(signal.SIGINT, signal_handler)
    print('Press Ctrl+C')

    def process_monitor_loop():
        try:
            print("hi")
        except KeyboardInterrupt:
            signal.pause()
        Timer(10, process_monitor_loop).start()

    process_monitor_loop()

if __name__ == '__main__':
    main()
The code above is based on this.
I have a long-running process, and I want to keep track of which state it is currently in. There are N processes running at the same time, hence the multiprocessing issue.
I pass a Queue into the process to report messages about its state, and this Queue is then read (if not empty) in a thread every couple of seconds.
I'm using Spyder on Windows as the environment, and the behavior described below occurs in its console. I did not try it in a different environment.
from multiprocessing import Process, Queue, Lock
import time
from tqdm import tqdm

def test(process_msg: Queue):
    try:
        process_msg.put('Inside process message')
        # process...
        return  # to have exitstate = 0
    except Exception as e:
        process_msg.put(e)

callback_msg = Queue()

if __name__ == '__main__':
    p = Process(target=test,
                args=(callback_msg,))
    p.start()

    time.sleep(5)

    print(p)

    while not callback_msg.empty():
        msg = callback_msg.get()
        if type(msg) != Exception:
            tqdm.write(str(msg))
        else:
            raise msg
The problem is that no matter what I do with the code, it never reads what is inside the Queue (also because the child never puts anything into it). It only works when I switch to the dummy version, which runs similarly to threading on only one CPU: from multiprocessing.dummy import Process, Queue, Lock.
Apparently the test function has to be in a separate file (a sketch of this follows below).
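A minimal sketch of that layout, assuming a hypothetical module name worker_module.py sitting next to the main script, so the spawned child process can import the target function:

# worker_module.py
from multiprocessing import Queue

def test(process_msg: Queue):
    try:
        process_msg.put('Inside process message')
        # process...
        return  # to have exitstate = 0
    except Exception as e:
        process_msg.put(e)

# main.py
from multiprocessing import Process, Queue
import time
from worker_module import test  # importable target, so the child can find it

if __name__ == '__main__':
    callback_msg = Queue()
    p = Process(target=test, args=(callback_msg,))
    p.start()

    time.sleep(5)
    while not callback_msg.empty():
        print(callback_msg.get())
    p.join()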
I have reserved some nodes on a SLURM cluster and want to run a Python script on these nodes.
On one node (the server) a Python script should fill a queue with jobs and dispatch these jobs to the clients.
Most of the time this works fine, but occasionally the script stalls.
When using Ctrl+C it turns out that in those cases one (or sometimes more) nodes seem to be stuck in <Finalize object, dead>:
^Csrun: interrupt (one more within 1 sec to abort)
srun: task 30: running
srun: tasks 0-29,31-39: exited
^Csrun: sending Ctrl-C to job 1075185.14
Exception KeyboardInterrupt: KeyboardInterrupt() in <Finalize object, dead> ignored
srun: Job step aborted: Waiting up to 2 seconds for job step to finish.
slurmd[cluster-112]: *** STEP 1075185.14 KILLED AT 2014-04-03T09:11:23 WITH SIGNAL 9 ***
I have no clue what the reason could be. Maybe it is something related to the garbage collector.
This is the script I run:
#!/usr/bin/env python
import os
import time
import multiprocessing
import multiprocessing.managers
import Queue
import sys
import subprocess
import socket
import errno

class QueueManager(multiprocessing.managers.SyncManager):
    pass

def worker(i, my_slurm_proc_id):
    print 'hello %i (proc=%i)' % (i, my_slurm_proc_id)
    time.sleep(0.1)

def run_server(first_slurm_node, N_procs):
    queue = Queue.Queue()
    barrier = multiprocessing.BoundedSemaphore(N_procs - 1)
    QueueManager.register('get_queue', callable=lambda: queue)
    QueueManager.register('get_barrier', callable=lambda: barrier)

    for i in range(5000):
        queue.put(i)

    m = QueueManager(address=(first_slurm_node, 50000), authkey='abracadabra')
    m.start()

    for i in range(N_procs - 1):
        barrier.acquire(True)

    m.get_queue().join()  # somehow just 'queue.join()' doesn't work here

def run_client(my_slurm_proc_id, first_slurm_node):
    QueueManager.register('get_queue')
    QueueManager.register('get_barrier')
    m = QueueManager(address=(first_slurm_node, 50000), authkey='abracadabra')
    m.connect()

    barrier = m.get_barrier()
    barrier.acquire(True)

    queue = m.get_queue()

    while not queue.empty():
        try:
            data = queue.get_nowait()
        except Queue.Empty:
            break
        worker(data, my_slurm_proc_id)
        queue.task_done()

    queue = None
    barrier.release()
    barrier = None

def main():
    slurm_job_nodelist = subprocess.check_output('scontrol show hostname'.split(' ') + [os.environ['SLURM_JOB_NODELIST']]).split('\n')
    master_node = slurm_job_nodelist[0]
    my_slurm_proc_id = int(os.environ['SLURM_PROCID'])
    N_procs = int(os.environ['SLURM_NPROCS'])

    if my_slurm_proc_id == 0:
        run_server(master_node, N_procs)
    else:
        run_client(my_slurm_proc_id, master_node)

if __name__ == '__main__':
    main()
if __name__ == '__main__':
main()