fabric run fork - python

I want to use fabric.api.run to directly start an application on a remote box. Since the application takes a really long time to finish, I want to fork a child process so that I don't have to wait for it.
The code looks like this:
from fabric.api import run
....
run("python ./myApp.py --fork=True >myApp.log 2>&1")
I used the following code to enable forking inside the application:
if settings.fork:
    child_pid = os.fork()
    if child_pid == 0:
        print "Starting Child Process: PID# %s" % os.getpid()
    else:
        print "Terminating Parent Process: PID# %s" % os.getpid()
        os._exit(0)
The problem is that after I issue the run command and then SSH into the remote box, I find the program has quit for some unknown reason. I checked the log file and there is nothing in it.
Could somebody let me know how I can work around this? Many thanks!
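One common reason a child forked like this dies is that it is still attached to the session and pseudo-terminal that Fabric's run() tears down as soon as the remote command returns. A fuller detach, calling os.setsid() and redirecting stdio to the log file, is often what is missing; below is a minimal sketch of that idea (the maybe_fork helper and the log path are my own illustration, not part of the original code):
import os
import sys

def maybe_fork(log_path="myApp.log"):   # helper name and log path are assumptions
    if os.fork() != 0:
        os._exit(0)          # parent exits immediately, so fabric's run() can return
    os.setsid()              # child starts a new session: no controlling terminal
    log = open(log_path, "a")
    os.dup2(log.fileno(), sys.stdout.fileno())   # keep stdout/stderr pointing at the log
    os.dup2(log.fileno(), sys.stderr.fileno())
    # ... the child carries on with the long-running work here ...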

Speaking of forks, there is a fork of Fabric that enables parallel execution, along with lots of other improvements.
http://tav.espians.com/fabric-python-with-cleaner-api-and-parallel-deployment-support.html
Depending on what you are doing, you may want to consider that.
Apart from that, I think you want to use multiprocessing:
from multiprocessing import Process

def f(name):
    print 'hello', name

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    #p.join()
http://docs.python.org/library/multiprocessing.html

Related

Why don't I see output from this function when calling it with multiprocessing? [duplicate]

A basic example of multiprocessing Process class runs when executed from file, but not from IDLE. Why is that and can it be done?
from multiprocessing import Process

def f(name):
    print('hello', name)

p = Process(target=f, args=('bob',))
p.start()
p.join()
Yes. The following works, in that function f is run in a separate (third) process.
from multiprocessing import Process

def f(name):
    print('hello', name)

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
However, to see the print output, at least on Windows, one must start IDLE from a console like so.
C:\Users\Terry>python -m idlelib
hello bob
(Use idlelib.idle on 2.x.) The reason is that IDLE runs user code in a separate process. Currently the connection between the IDLE process and the user code process is via a socket. The fork done by multiprocessing does not duplicate or inherit the socket connection. When IDLE is started via an icon or Explorer (in Windows), there is nowhere for the print output to go. When started from a console with python (rather than pythonw), output goes to the console, as above.

Robust way to manage and kill any process

I am writing code to run experiments in parallel. I don't have control over what the experiments do; they might use subprocess.Popen or check_output to run one or more additional child processes.
I have two conditions: I want to be able to kill experiments that exceed a timeout, and I want to kill experiments upon KeyboardInterrupt.
Most ways to terminate a process don't make sure that all of its subprocesses are killed as well. This is obviously a problem if hundreds of experiments are run one after the other, but they all spawn child processes that stay alive after the timeout occurred and the experiment was supposedly killed.
The way I am dealing with this now is to store experiment configurations in a database, generate code that loads and runs experiments from the command line, call these commands via subprocess.Popen(cmd, shell=True, start_new_session=True), and kill them using os.killpg on timeout.
My main question then is: calling these experiments via the command line feels cumbersome, so is there a way to call code directly via multiprocessing.Process(target=fn) and achieve the same effect as start_new_session=True + os.killpg upon timeout and KeyboardInterrupt?
<file1>
def run_exp(config):
    do work
    return result

if __name__ == "__main__":
    save_exp(run_exp(load_config(sys.args)))

<file2>
def monitor(queue):
    active = set()  # active process ids
    while True:
        msg = queue.get()
        if msg == "sentinel":
            <loop over active ids and kill them with os.killpg>
        else:
            <add or remove id from active set>

def worker(args):
    id, queue = args
    command = f"python <file1> {id}"
    with subprocess.Popen(command, shell=True, ..., start_new_session=True) as process:
        try:
            queue.put(f"start {process.pid}")
            process.communicate(timeout=timeout)
        except TimeoutExpired:
            os.killpg(process.pid, signal.SIGINT)  # send signal to the process group
            process.communicate()
        finally:
            queue.put(f"done {process.pid}")

def main():
    <save configs => c_ids>
    queue = manager.Queue()
    process = Process(target=monitor, args=(queue,))
    process.start()

    def clean_exit():
        queue.put("sentinel")
        <terminate pool and monitor process>

    r = pool.map_async(worker, [(c_id, queue) for c_id in c_ids])
    atexit.register(clean_exit)
    r.wait()
    <terminate pool and monitor process>
I posted a skeleton of the code that details the approach of starting processes via the command line and killing them. An additional complication of that version of my approach is that when the KeyboardInterrupt arrives, the queue has already been terminated (for lack of a better word) and communicating with the monitor process is impossible (the sentinel message never arrives). Instead I have to resort to writing process ids to a file and reading the file back in the main process to kill the still-running processes. If you know a way to work around this queue issue, I'd be eager to learn about it.
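As an aside on the direct multiprocessing part of the question: one pattern that is sometimes suggested is to have the Process target call os.setsid() first, so that the child becomes its own process group leader and os.killpg can be used on it, which is essentially what start_new_session=True does for subprocess.Popen. A rough sketch (Unix only; run_exp, the config and the timeout value are placeholders):
import os
import signal
import time
from multiprocessing import Process

def run_exp(config):        # stand-in for the real experiment
    time.sleep(60)

def run_detached(config):
    os.setsid()             # new session => own process group, like start_new_session=True
    run_exp(config)

if __name__ == "__main__":
    p = Process(target=run_detached, args=({},))
    p.start()
    p.join(5)               # timeout in seconds (placeholder value)
    if p.is_alive():
        # after os.setsid() in the child, p.pid is also its process group id,
        # so this signals the experiment and any children it spawned
        os.killpg(p.pid, signal.SIGTERM)
        p.join()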
I think the problem is that you are storing the subprocess pid, but to kill the whole group you need the process group id. Also, you used signal.SIGINT, which I think should be signal.SIGTERM. Try this: instead of this line:
os.killpg(process.pid, signal.SIGINT)
use this line:
os.killpg(os.getpgid(process.pid), signal.SIGTERM)
I guess one way to avoid this is to use a try/except block.
Say the KeyboardInterrupt arrives in main(); then you could try this:
def main():
    try:
        <save configs => c_ids>
        queue = manager.Queue()
        process = Process(target=monitor, args=(queue,))
        process.start()

        def clean_exit():
            queue.put("sentinel")
            <terminate pool and monitor process>

        r = pool.map_async(worker, [(c_id, queue) for c_id in c_ids])
        atexit.register(clean_exit)
        r.wait()
        <terminate pool and monitor process>
    except KeyboardInterrupt as e:
        pass
        # write the process you want to keep continuing.
Guess this will be helpful.

why does my print function (in multiprocessing) print nothing?

why does my print function (in python multiprocess) print nothing?
from multiprocessing import Process, Queue
import os, time, random

def write(q):
    print('Process to write: %s' % os.getpid())
    for value in ['A', 'B', 'C']:
        print('Put %s to queue...' % value)
        q.put(value)
        time.sleep(random.random())

def read(q):
    print('Process to read: %s' % os.getpid())
    while True:
        value = q.get(True)
        print('Get %s from queue.' % value)

if __name__ == '__main__':
    q = Queue()
    pw = Process(target=write, args=(q,))
    pr = Process(target=read, args=(q,))
    pw.start()
    print('start')
    pr.start()
    pw.join()
    pr.terminate()
    print('end')
I run it in Spyder (on a Windows 10 system).
My result in the IPython console of Spyder:
runfile('C:/Users/Dust/Desktop/programs/crawl/test.py', wdir='C:/Users/Dust/Desktop/programs/crawl')
start
end
Result in the Python console of Spyder:
>>> runfile('C:/Users/Dust/Desktop/programs/crawl/test.py', wdir='C:/Users/Dust/Desktop/programs/crawl')
start
Process to write: 12824
Put A to queue...
Put B to queue...
Put C to queue...
end
It is really weird. The results are different, and neither is what I want.
Could anyone help me find where the problem is in my program? Thanks a lot.
The multiprocessing module uses fork to spawn child processes for parallelism. Windows does not implement fork, and Python emulation of it on Windows is incomplete. One of the missing effects is that stdout (used by print()) is not inherited by the child processes. Rather, child processes use the default stdout.
This works fine in a command window because it uses default stdout. This shows nothing in Spyder because a different pipe is used for output to the Spyder IPython console window.
See this question ("Simple Multiprocessing function in Spyder doesn't output results") and this answer for additional details and potential mitigations.
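One such mitigation (a common workaround, not taken from the linked answer) is to avoid relying on the children's stdout at all: send whatever you want to display back through the queue and print it from the parent process, which owns the console that Spyder captures. A rough sketch based on the code above:
from multiprocessing import Process, Queue
import os

def write(q):
    # push messages onto the queue instead of printing in the child
    q.put('Process to write: %s' % os.getpid())
    for value in ['A', 'B', 'C']:
        q.put('Put %s to queue...' % value)
    q.put(None)                      # sentinel: tell the parent we are done

if __name__ == '__main__':
    q = Queue()
    pw = Process(target=write, args=(q,))
    pw.start()
    for msg in iter(q.get, None):    # the parent does all the printing
        print(msg)
    pw.join()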

How to Terminate a Python program before its child is finished running?

I have a script that is supposed to run 24/7 unless interrupted. This script is script A.
I want script A to call Script B, and have script A exit while B is running. Is this possible?
This is what I thought would work:
#script_A.py
while True:
    do some stuff
    do even more stuff
    if true:
        os.system("python script_B.py")
        sys.exit(0)

#script_B.py
time.sleep(some_time)
do something
os.system("python script_A.py")
sys.exit(0)
But it seems as if A doesn't actually exit until B has finished executing (which is not what I want to happen).
Is there another way to do this?
What you are describing sounds a lot like a function call:
def doScriptB():
    # do some stuff
    # do some more stuff

def doScriptA():
    while True:
        # do some stuff
        if Your Condition:
            doScriptB()
            return

while True:
    doScriptA()
If this is insufficient for you, then you have to detach the process from your Python process. This normally involves spawning the process in the background, which is done by appending an ampersand to the command in bash:
yes 'This is a background process' &
And detaching said process from the current shell, which in a simple C program is done by forking the process twice. I don't know how to do this in Python off-hand, but would bet that there is a module for it.
This way, when the calling python process exits, it won't terminate the spawned child, since it is now independent.
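For what it's worth, the double fork can be done in Python with nothing but the os module. A minimal sketch (Unix only; launching script_B.py here is just an example):
import os
import sys

def spawn_detached(argv):
    # First fork: the original process returns immediately and is free to exit.
    if os.fork() != 0:
        return
    # The child starts a new session, so it has no controlling terminal.
    os.setsid()
    # Second fork: the session leader exits, so the grandchild can never
    # reacquire a controlling terminal and is now fully detached.
    if os.fork() != 0:
        os._exit(0)
    os.execvp(argv[0], argv)   # replace the grandchild with the target program

spawn_detached([sys.executable, "script_B.py"])
The subprocess module gives much the same effect with less ceremony: Popen(..., start_new_session=True) makes the child call setsid() before executing the command.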
It seems you want to detach the system call so that it runs as a separate process.
script_A.py
import subprocess
import sys

while True:
    do some stuff
    do even more stuff
    if true:
        pid = subprocess.Popen([sys.executable, "script_B.py"])  # launch script_B without waiting for it
        sys.exit(0)
Anyway, that does not seem like good practice at all. Why don't you have script A watch the process list, and stop if it finds script B running? This is another example of how you could do it:
import subprocess
import sys
import psutil

while True:
    # This section queries the currently running processes
    for proc in psutil.process_iter():
        pinfo = proc.as_dict(attrs=['pid', 'name'])
        if pinfo['name'] == "script_B.py":
            sys.exit(0)
    do some stuff
    do even more stuff
    if true:
        pid = subprocess.Popen([sys.executable, "script_B.py"])  # launch script_B without waiting for it
        sys.exit(0)
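One caveat with the example above: the process name psutil reports for "python script_B.py" is usually just "python", so matching on the command line tends to be more reliable. A small variation of the check (assuming a reasonably recent psutil, where process_iter accepts an attrs list):
import sys
import psutil

for proc in psutil.process_iter(attrs=['pid', 'cmdline']):
    # match on the command line rather than the process name
    cmdline = proc.info.get('cmdline') or []
    if any('script_B.py' in part for part in cmdline):
        sys.exit(0)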

