why does my print function (in python multiprocess) print nothing?
from multiprocessing import Process, Queue
import os, time, random

def write(q):
    print('Process to write: %s' % os.getpid())
    for value in ['A', 'B', 'C']:
        print('Put %s to queue...' % value)
        q.put(value)
        time.sleep(random.random())

def read(q):
    print('Process to read: %s' % os.getpid())
    while True:
        value = q.get(True)
        print('Get %s from queue.' % value)

if __name__ == '__main__':
    q = Queue()
    pw = Process(target=write, args=(q,))
    pr = Process(target=read, args=(q,))
    pw.start()
    print('start')
    pr.start()
    pw.join()
    pr.terminate()
    print('end')
I run it in Spyder on Windows 10.
My result in the IPython console of Spyder:
runfile('C:/Users/Dust/Desktop/programs/crawl/test.py', wdir='C:/Users/Dust/Desktop/programs/crawl')
start
end
Result in the Python console of Spyder:
>>> runfile('C:/Users/Dust/Desktop/programs/crawl/test.py', wdir='C:/Users/Dust/Desktop/programs/crawl')
start
Process to write: 12824
Put A to queue...
Put B to queue...
Put C to queue...
end
It is really weird. The results are different, and neither is what I want.
Could anyone help me find the problem in my program? Thanks a lot.
On POSIX systems, the multiprocessing module uses fork to spawn child processes. Windows does not implement fork, and Python's emulation of it on Windows (starting a fresh interpreter) is incomplete. One of the missing effects is that the parent's stdout (used by print()) is not inherited by the child processes; instead, the children use the default stdout.
This works fine in a command window, where the default stdout is the console itself, but it shows nothing in Spyder, because the Spyder IPython console receives its output over a different pipe that the child processes never write to.
See this question ("Simple Multiprocessing function in Spyder doesn't output results") and this answer for additional details and potential mitigations.
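For example, one possible mitigation (a minimal sketch, not part of the original program; the log queue and the None sentinel are illustrative conventions) is to have the children send their messages back to the parent over a second Queue and do all printing in the parent, whose stdout the Spyder console does capture:

from multiprocessing import Process, Queue
import os

def write(log):
    # Send messages to the parent instead of printing directly
    log.put('Process to write: %s' % os.getpid())
    for value in ['A', 'B', 'C']:
        log.put('Put %s to queue...' % value)
    log.put(None)  # sentinel: the writer is done

if __name__ == '__main__':
    log = Queue()
    pw = Process(target=write, args=(log,))
    pw.start()
    # All printing happens in the parent process, so the output
    # appears in whatever console the parent is attached to.
    for msg in iter(log.get, None):
        print(msg)
    pw.join()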
Related
A basic example of the multiprocessing Process class runs when executed from a file, but not from IDLE. Why is that, and can it be made to work?
from multiprocessing import Process

def f(name):
    print('hello', name)

p = Process(target=f, args=('bob',))
p.start()
p.join()
Yes. The following works, in the sense that function f is run in a separate (third) process.
from multiprocessing import Process

def f(name):
    print('hello', name)

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
However, to see the print output, at least on Windows, one must start IDLE from a console like so.
C:\Users\Terry>python -m idlelib
hello bob
(Use idlelib.idle on 2.x.) The reason is that IDLE runs user code in a separate process. Currently the connection between the IDLE process and the user code process is via a socket. The fork done by multiprocessing does not duplicate or inherit the socket connection. When IDLE is started via an icon or Explorer (in Windows), there is nowhere for the print output to go. When started from a console with python (rather than pythonw), output goes to the console, as above.
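As an aside, when starting IDLE from a console is not an option, one workaround (a minimal sketch; the file name child_output.txt is an arbitrary choice) is to have the child write to a file, since its prints cannot reach the IDLE window:

from multiprocessing import Process

def f(name):
    # The child's stdout is not connected to IDLE, so write to a
    # file that can be inspected afterwards instead.
    with open('child_output.txt', 'w') as out:
        print('hello', name, file=out)

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()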
I am trying to start a process using the multiprocessing.Process example from the Python documentation.
Here is the example code:
from multiprocessing import Process
import os

def info(title):
    print(title)
    print('module name:', __name__)
    print('parent process:', os.getppid())
    print('process id:', os.getpid())

def f(name):
    info('function f')
    print('hello', name)

if __name__ == '__main__':
    info('main line')
    p = Process(target=f, args=('bob',))
    p.start()
    p.join()
I would expect the console to show me the output of the function f('bob'), but I only get to see the output of info('main line').
So I think the process doesn't even start?
I have never worked with multiprocessing before; I bet it's a silly mistake I'm making.
I have also tried to set the start method with multiprocessing.set_start_method('spawn') (see here), as 'spawn' seems to be the only valid one on Windows.
But I only get a
RuntimeError: context has already been set
At the moment I think I can't get the process to start.
Any ideas how to solve this?
P.S. I am working on Windows 10 in Spyder 4.2.5 (maybe this is something to do with the IPython console? I have heard it is not a normal Python console).
But I have also tried the same example in the normal Python shell, and it also only showed the output of info('main line').
SOLVED: by running the script from cmd
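For reference, the RuntimeError above is raised because the start method can only be set once per process (and an IDE's IPython kernel may already have set it). A minimal sketch of two documented ways around this, using set_start_method's force parameter or a dedicated context object:

import multiprocessing as mp

def f(name):
    print('hello', name)

if __name__ == '__main__':
    # Option 1: override a start method that was already set
    mp.set_start_method('spawn', force=True)

    # Option 2: use a context object instead of the global start method
    ctx = mp.get_context('spawn')
    p = ctx.Process(target=f, args=('bob',))
    p.start()
    p.join()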
I am running multiple subprocesses in parallel, but I need to lock each process until the subprocess produces its output (via the print function). The subprocesses are running a Python script that has been packaged into an executable.
The code looks like this:
import multiprocessing as mp
import subprocess
import os

def main(args):
    l, inpath = args
    l.acquire()
    print "Running KNN.exe for files in %s" % os.path.normpath(inpath).split('\\')[-1]
    #Run KNN executable as a subprocess
    subprocess.call(os.path.join(os.getcwd(), "KNN.exe"))
    #This is where I want to wait for any output from the subprocess before releasing the lock
    l.release()
    #Here I would like to wait until subprocess is done then print that it is done
    l.acquire()
    print "Done %s" % os.path.normpath(inpath).split('\\')[-1]
    l.release()

if __name__ == "__main__":
    #Set working directory path containing input text file
    os.chdir("C:\Users\Patrick\Google Drive\KNN")
    #Get folder names in directory containing GCM input
    manager = mp.Manager()
    l = manager.Lock()
    gcm_dir = "F:\FIDS_GCM_Data_CMIP5\UTRB\UTRB KNN-CAD\Input"
    paths = [(l, os.path.join(gcm_dir, folder)) for folder in os.listdir(gcm_dir)]
    #Set up multiprocessing pool
    p = mp.Pool(mp.cpu_count())
    #Map function through input paths
    p.map(main, paths)
So the goal is to hold the lock while a subprocess runs, until its output is received; after that the lock can be released and the subprocess can continue until it is complete, at which point I'd like to print that it is done.
My question is: how can I wait for the single (and only) output from the subprocess before releasing the lock held by this process (one of several)?
Additionally, how can I wait for the subprocess to terminate and then print that it is complete?
Your code uses subprocess.call, which already waits for the subprocess to finish (by which point all of its output has been generated). I'm inferring from your question that you'd like to differentiate between when the output is first written and when the subprocess is finished. Below is your code with my recommended modifications inline:
def main(args):
    l, inpath = args
    l.acquire()
    print "Running KNN.exe for files in %s" % os.path.normpath(inpath).split('\\')[-1]
    #Run KNN executable as a subprocess
    #Use the Popen constructor
    proc = subprocess.Popen(os.path.join(os.getcwd(), "KNN.exe"), stdout=subprocess.PIPE)
    #This is where I want to wait for any output from the subprocess before releasing the lock
    #Wait until the subprocess has written at least 1 byte to STDOUT (modify if you need different logic)
    proc.stdout.read(1)
    l.release()
    #Here I would like to wait until subprocess is done then print that it is done
    #proc.wait()
    (proc_output, proc_error) = proc.communicate()
    l.acquire()
    print "Done %s" % os.path.normpath(inpath).split('\\')[-1]
    l.release()
Note that the above doesn't assume you want to do anything with the subprocess's output other than check that it has been generated. If you want to do anything less trivial than the above (consume one byte, then drop it on the floor), proc.stdout (which is a file object) represents everything that the subprocess generates while running.
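If you would rather consume the output as it arrives instead of all at once, here is a minimal sketch in Python 3 syntax (handle_line is a hypothetical placeholder for your per-line processing):

import subprocess

def handle_line(line):
    # hypothetical placeholder: do whatever you need with each line
    print('subprocess said:', line.rstrip())

proc = subprocess.Popen(['KNN.exe'], stdout=subprocess.PIPE,
                        universal_newlines=True)
# Iterating over stdout yields lines as the subprocess flushes them
for line in proc.stdout:
    handle_line(line)
proc.wait()  # reap the subprocess once its stdout reaches EOF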
I want to use fabric.api.run to directly start an application on a remote box. Since the application takes a really long time to finish, I wish to fork a child process so that I don't need to wait for it.
The code is like:
from fabric.api import run
....
run("python ./myApp.py --fork=True >myApp.log 2>&1")
I used the following code to enable forking inside the application:
if settings.fork:
    child_pid = os.fork()
    if child_pid == 0:
        print "Starting Child Process: PID# %s" % os.getpid()
    else:
        print "Terminating Parent Process: PID# %s" % os.getpid()
        os._exit(0)
The problem is that after I issue the run command and then SSH into the remote box, I find that the program has quit for some unknown reason; I checked the log file, and there is nothing there.
Could somebody let me know how I can work around this? Many thanks!
Speaking of forks, there is a fork of Fabric that enables parallel execution, among lots of other improvements.
http://tav.espians.com/fabric-python-with-cleaner-api-and-parallel-deployment-support.html
Depending on what you are doing, you may want to consider that.
Apart from that, I think you want to use multiprocessing:
from multiprocessing import Process

def f(name):
    print 'hello', name

if __name__ == '__main__':
    p = Process(target=f, args=('bob',))
    p.start()
    #p.join()
http://docs.python.org/library/multiprocessing.html
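As an aside, if the underlying goal is to leave myApp.py running on the remote host after Fabric returns, a commonly cited workaround (a sketch, assuming a POSIX remote host and Fabric 1.x's run) is to detach the command with nohup instead of forking inside Python:

from fabric.api import run

# nohup detaches the command from the SSH session so it keeps running
# after Fabric disconnects; pty=False prevents the remote shell from
# sending SIGHUP when the pseudo-terminal closes.
run("nohup python ./myApp.py > myApp.log 2>&1 < /dev/null &", pty=False)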