I am trying to run a separate Python Process and store the result in a queue. I can extract the result in two ways: either call queue.get() just once, or use a while loop and iterate over the queue until it's empty.
In the code below, the first method is used if first=True and the second method is used if first=False.
from multiprocessing import Process, Queue

def foo1(queue):
    queue.put(1)

def main(first=False):
    queue = Queue()
    p = Process(target=foo1, args=(queue,))
    p.start()
    if first:
        a = queue.get()
        print(a)
    else:
        while not queue.empty():
            print(queue.get())
    p.join()

if __name__ == "__main__":
    main()
Question: Why does the first method print 1 correctly while the second does not? Aren't they supposed to be equivalent?
I am using Windows 10. I noticed this behavior in both the interactive console and the shell terminal.
Note: Due to the bug mentioned here, I have to run the code as one script.
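A likely explanation: queue.empty() is not synchronized with the child process, so the parent can reach the while loop before the child has put anything on the queue, in which case the loop body never runs at all. A minimal sketch of one race-free variant (the DONE sentinel is my own convention, not from the original code):

from multiprocessing import Process, Queue

DONE = None  # sentinel marking the end of the child's output

def foo1(queue):
    queue.put(1)
    queue.put(DONE)  # always sent last, so the parent knows when to stop

def main():
    queue = Queue()
    p = Process(target=foo1, args=(queue,))
    p.start()
    while True:
        item = queue.get()  # blocks until the child sends something
        if item is DONE:
            break
        print(item)
    p.join()

if __name__ == "__main__":
    main()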
I'm trying to use "exec" to check some external code snippets for correctness and I wanted to trap infinite loops by spawning a process, waiting for a short period of time, then checking the local variables. I managed to shrink the code to this example:
import multiprocessing

def fHelper(queue, codeIn, globalsParamIn, localsParamIn):
    exec(codeIn, globalsParamIn, localsParamIn)  # Execute code string with limited builtins
    queue.put(localsParamIn['spam'])

def f(codeIn):
    globalsParam = {"float": float, "int": int, "len": len}
    spam = False
    localsParam = {'spam': spam}
    if __name__ == '__main__':
        queue = multiprocessing.Queue()
        p = multiprocessing.Process(target=fHelper, args=(queue, codeIn, globalsParam, localsParam))
        p.start()
        p.join(3)  # Wait for 3 seconds or until process finishes
        if p.is_alive():  # Just in case p hangs
            p.terminate()
            p.join()
        return queue.get(timeout=3)

fOut = f("spam=True")
print(fOut)
# assert fOut
Now the code as-is executes fine, but if you uncomment the last line (or replace it with almost anything else; print(fOut.copy()) will do it), the queue times out. I'm using Python 3.8.2 on Windows.
I would welcome any suggestions on how to fix the bug, or better yet understand what on earth is going on.
Thanks!
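One plausible reading (my interpretation, not confirmed by the original poster): on Windows, multiprocessing spawns the child by re-importing the module, so the module-level call fOut = f("spam=True") also runs in the child. There, __name__ is not '__main__', f returns None, and the uncommented assert fOut raises during that import, killing the child before fHelper ever puts anything on the queue; the parent's queue.get(timeout=3) then times out. A minimal sketch that moves all top-level work under the main guard (and reads from the queue before joining):

import multiprocessing

def fHelper(queue, codeIn, globalsParamIn, localsParamIn):
    exec(codeIn, globalsParamIn, localsParamIn)
    queue.put(localsParamIn['spam'])

def f(codeIn):
    globalsParam = {"float": float, "int": int, "len": len}
    localsParam = {'spam': False}
    queue = multiprocessing.Queue()
    p = multiprocessing.Process(target=fHelper, args=(queue, codeIn, globalsParam, localsParam))
    p.start()
    try:
        result = queue.get(timeout=3)  # read before join so the result can be flushed
    except Exception:
        result = None  # e.g. queue.Empty if the snippet looped forever
    if p.is_alive():  # just in case p hangs
        p.terminate()
    p.join()
    return result

if __name__ == '__main__':  # nothing below runs when the child re-imports this module
    fOut = f("spam=True")
    print(fOut)
    assert fOut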
I would like to run a function using different arguments. For each different argument, I would like to run the function in parallel and then get the output of each run. It seems that the multiprocessing module can help here. I am not sure about the right steps to make this work.
Do I start all the processes, then get all the queues and then join all the processes in this order? Or do I get the results after I have joined? Or do I get the ith result after I have joined the ith process?
from numpy.random import uniform
from multiprocessing import Process, Queue

def function(x, queue):
    queue.put(uniform(0.0, x))  # put the result on the queue instead of returning it

if __name__ == "__main__":
    queue = Queue()
    processes = []
    x_values = [1.0, 10.0, 100.0]

    # Start all processes
    for x in x_values:
        process = Process(target=function, args=(x, queue))
        processes.append(process)
        process.start()

    # Grab one result per process; get() blocks until a result is available
    outputs = [queue.get() for _ in range(len(x_values))]

    # join() waits for each child process to terminate
    for process in processes:
        process.join()
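One caveat with the pattern above: results arrive in completion order, not in the order of x_values. A small sketch that tags each result with its input's index so the outputs line up (the index tagging is my own addition):

from numpy.random import uniform
from multiprocessing import Process, Queue

def function(i, x, queue):
    queue.put((i, uniform(0.0, x)))  # tag the result with its input's index

if __name__ == "__main__":
    queue = Queue()
    x_values = [1.0, 10.0, 100.0]
    processes = [Process(target=function, args=(i, x, queue))
                 for i, x in enumerate(x_values)]
    for p in processes:
        p.start()

    outputs = [None] * len(x_values)
    for _ in x_values:
        i, value = queue.get()  # arrives in completion order...
        outputs[i] = value      # ...but is stored in input order

    for p in processes:
        p.join()  # join after draining the queue
    print(outputs)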
So let's make a simple example of multiprocessing pools, with a function that sleeps for 3 seconds and returns both the value passed to it (your parameter) and the result of the function, which is just that value doubled.
IIRC there's some issue with stopping pools cleanly, so the example below also handles KeyboardInterrupt.
from multiprocessing import Pool
import time

def time_waster(val):
    try:
        time.sleep(3)
        return (val, val*2)  # return a tuple here, but you can use a dict as well with all your parameters
    except KeyboardInterrupt:
        raise  # re-raise so the pool worker exits cleanly

if __name__ == '__main__':
    x = list(range(5))  # values to pass to the function
    results = []
    p = Pool(2)  # I use 2, but you can use as many as you have cores
    try:
        results.append(p.map(time_waster, x))
        p.close()  # no more tasks to submit; this lets join() succeed
    except KeyboardInterrupt:
        p.terminate()
    except Exception:
        p.terminate()
    finally:
        p.join()
    print(results)
As an extra service, I added some KeyboardInterrupt handlers, as IIRC there are some issues with interrupting pools: https://stackoverflow.com/questions/1408356/keyboard-interrupts-with-pythons-multiprocessing-pool
proc.join() blocks until the process has ended. queue.get() blocks until there is something in the queue. Because your processes don't put anything into the queue (in this example), this code will never get beyond the queue.get() part... If your processes put something in the queue at the very end, then it doesn't matter whether you first join() or get(), because they happen at about the same time.
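To make the blocking behavior concrete, here is a tiny toy sketch (my own example) where the result arrives at the very end:

from multiprocessing import Process, Queue
import time

def worker(queue):
    time.sleep(1)      # simulate work
    queue.put("done")  # the result arrives at the very end

if __name__ == "__main__":
    queue = Queue()
    p = Process(target=worker, args=(queue,))
    p.start()
    print(queue.get())  # blocks about a second, until the child puts something
    p.join()            # returns almost immediately; the child is already finishing

One caveat worth knowing: with large results it is safer to get() before join(), because a child process cannot exit until the queue's feeder thread has flushed its data, so joining first can deadlock.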
I'm new to multiprocessing and I'm trying to check that I can run two processes simultaneously with the following code:
import random, time, multiprocessing as mp

def printer():
    """print function"""
    z = random.randint(0, 60)
    for i in range(5):
        print(z)
        wait = 0.2
        wait += random.randint(1, 60) / 100
        time.sleep(wait)
    return

if __name__ == '__main__':
    p1 = mp.Process(target=printer)
    p2 = mp.Process(target=printer)
    p1.start()
    p2.start()
This code does not print anything to the console, although I checked that the processes are running using the is_alive() method.
However, I can print something using :
p1.run()
p2.run()
Question 1: Why doesn't the start() method run the processes?
Question 2: When running the code with the run() method, why do I get a sequence like
25,25,25,25,25,11,11,11,11,11
instead of something like
25,25,11,25,11,11,11,25,11,25 ?
It seems that the processes run one after the other.
Eventually, I would like to use multiprocessing to apply the same conversion function to multiple files in parallel.
I made the script run by adding
from multiprocessing import Process
However, I don't get a random sequence of the two numbers; the pattern is always A,B,A,B... If you know how to show that the two processes run simultaneously, any ideas are welcome!
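One way to convince yourself the two processes really do overlap (my own variation on the code above): print the process name and a timestamp with each line. Overlapping timestamps from both names show true concurrency even if the lines happen to alternate.

import random, time, multiprocessing as mp

def printer():
    z = random.randint(0, 60)
    for i in range(5):
        # current_process().name plus a timestamp makes the interleaving visible
        print("%s %.2f -> %d" % (mp.current_process().name, time.monotonic(), z))
        time.sleep(0.2 + random.randint(1, 60) / 100)

if __name__ == '__main__':
    p1 = mp.Process(target=printer, name="proc-1")
    p2 = mp.Process(target=printer, name="proc-2")
    p1.start()
    p2.start()
    p1.join()
    p2.join()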
So, I'm trying to speed up one routine by using the multiprocessing module in Python. I want to be able to read several .csv files by splitting the job among several cores, and for that I have:
import numpy as np

def csvreader(string):
    time, signal = np.genfromtxt(string, delimiter=',', unpack=True)
    return time, signal
Then I call this function by saying:
if __name__ == '__main__':
    for i in range(0, 2):
        p = multiprocessing.Process(target=CSVReader.csvreader, args=(string_array[i],))
        p.start()
The thing is that this doesn't store any output. I have read all the forums online and seen that there might be a way with multiprocessing.Queue, but I don't understand it very well.
Is there any simple and straightforward method?
Your best bets are multiprocessing.Queue or multiprocessing.Pipe, which are designed exactly for this problem. They allow you to send data between processes in a safe and easy way.
If you'd like to return the output of your csvreader function, you should pass another argument to it: the multiprocessing.Queue through which the data will be sent back to the main process. Instead of returning the values, place them on the queue, and the main process will retrieve them at some point later. If they're not ready when the process tries to get them, by default it will just block (wait) until they are available.
Your function would now look like this:
def csvreader(string, q):
    q.put(np.genfromtxt(string, delimiter=',', unpack=True))
The main routine would be:
if __name__ == '__main__':
    q = multiprocessing.Queue()
    for i in range(2):
        p = multiprocessing.Process(target=csvreader, args=(string_array[i], q,))
        p.start()

    # Do anything else you need in here

    time = np.empty(2, dtype='object')
    signal = np.empty(2, dtype='object')
    for i in range(2):
        time[i], signal[i] = q.get()  # Returns output or blocks until ready

    # Process my output
Note that you have to call Queue.get() for each item you want to return.
Have a look at the documentation on the multiprocessing module for more examples and information.
Using the example from the introduction to the documentation:
from multiprocessing import Pool

if __name__ == '__main__':
    pool = Pool(2)
    results = pool.map(CSVReader.csvreader, string_array[:2])
    print(results)
I'm trying to understand multiprocessing in python.
from multiprocessing import Process

def multiply(a, b):
    print(a*b)
    return a*b

if __name__ == '__main__':
    p = Process(target=multiply, args=(5, 4))
    p.start()
    p.join()
    print("ok.")
In this code block, for example, suppose there were a variable called "result". How can we assign the return value of the multiply function to "result"?
And a little problem with IDLE: when I try to run this sample in the Python Shell, it doesn't work properly. If I double-click the .py file, the output is:
20
ok.
But if I try to run this in IDLE:
ok.
Thanks...
OK, I somehow managed this. I looked at the Python documentation and learned that with the Queue class, we can get return values from a function. The final version of my code looks like this:
from multiprocessing import Process, Queue

def multiply(a, b, que):  # add a queue argument to the function
    que.put(a*b)  # we're putting the return value into the queue

if __name__ == '__main__':
    queue1 = Queue()  # create a queue object
    p = Process(target=multiply, args=(5, 4, queue1))  # we're passing queue1 as the 3rd argument
    p.start()
    print(queue1.get())  # and we're getting the return value: 20
    p.join()
    print("ok.")
And there is also a Pipe() function; I think we could use Pipe(), too. But Queue worked for me for now.
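For completeness, here is a minimal sketch of the same example using Pipe() instead of Queue (my own adaptation, not from the original post):

from multiprocessing import Process, Pipe

def multiply(a, b, conn):
    conn.send(a*b)  # send the result through the pipe's child end
    conn.close()

if __name__ == '__main__':
    parent_conn, child_conn = Pipe()  # a pipe has two connection ends
    p = Process(target=multiply, args=(5, 4, child_conn))
    p.start()
    print(parent_conn.recv())  # blocks until the child sends: 20
    p.join()
    print("ok.")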
Does this help? This takes a list of functions (and their arguments), runs them in parallel, and returns their outputs. (This is old. A much newer version of this is at https://gitlab.com/cpbl/cpblUtilities/blob/master/parallel.py )
def runFunctionsInParallel(listOf_FuncAndArgLists):
    """
    Take a list of lists like [function, arg1, arg2, ...]. Run those functions in parallel,
    wait for them all to finish, and return the list of their return values, in order.
    (This still needs error handling, i.e. to ensure everything returned okay.)
    """
    from multiprocessing import Process, Queue

    def storeOutputFFF(fff, theArgs, que):  # wrapper that puts the return value into a queue
        print('MULTIPROCESSING: Launching %s in parallel' % fff.__name__)
        que.put(fff(*theArgs))

    queues = [Queue() for fff in listOf_FuncAndArgLists]  # create a queue object for each function
    jobs = [Process(target=storeOutputFFF, args=[funcArgs[0], funcArgs[1:], queues[iii]])
            for iii, funcArgs in enumerate(listOf_FuncAndArgLists)]
    for job in jobs:
        job.start()  # Launch them all
    # Collect the outputs before joining: a child cannot exit until its queue
    # has been drained, so get() first avoids a potential deadlock on large results.
    outputs = [queue.get() for queue in queues]
    for job in jobs:
        job.join()  # Wait for them all to finish
    return outputs
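A quick usage sketch (my own example; note that because storeOutputFFF is defined inside runFunctionsInParallel, this relies on the fork start method and will not work as-is on Windows, where process targets must be picklable):

def add(a, b):
    return a + b

def square(x):
    return x * x

if __name__ == '__main__':
    results = runFunctionsInParallel([[add, 2, 3], [square, 7]])
    print(results)  # [5, 49]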