Weird behaviour of Queue.empty in Python


I came across this weird issue with multiprocessing's Queue.empty() in Python. The following code prints True and 20 right after the queue has been filled with elements.
from multiprocessing import Queue
import random
import time
q = Queue()
for _ in range(20):
    q.put(random.randint(0, 2))
# time.sleep(0.01)
print(q.empty())
print(q.qsize())
If I uncomment the sleep, the output is correct: False, 20. How is this possible? This code should run sequentially, which means by the time the q.empty() evaluates, the queue is already filled.

You can't rely on the result from a call to multiprocessing.Queue.empty().
The documentation for .empty() states:
Return True if the queue is empty, False otherwise. Because of multithreading/multiprocessing semantics, this is not reliable.
The documentation also states that a separate thread handles queuing objects, causing the observed behavior:
When an object is put on a queue, the object is pickled and a background thread later flushes the pickled data to an underlying pipe. This has some consequences which are a little surprising, but should not cause any practical difficulties – if they really bother you then you can instead use a queue created with a manager.
After putting an object on an empty queue there may be an infinitesimal delay before the queue’s empty() method returns False and get_nowait() can return without raising queue.Empty.
You have a single process, so use queue.Queue from the queue module, which does not rely on a background thread to add the data to the queue:
from queue import Queue
import random
q = Queue()
for _ in range(20):
    q.put(random.randint(0, 2))
print(q.empty())
print(q.qsize())
If you must use multiple processes, you should try to restructure your code to rely on .empty() as little as possible, because its results are unreliable. For example, instead of using .empty() to check whether there are elements on the queue, you should simply attempt to pop off the queue and block if there aren't any elements.
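As a minimal sketch of the blocking approach, assuming the consumer knows how many items to expect: the helper name take and the 5-second timeout are illustrative choices, not part of the original code.

```python
from multiprocessing import Queue
import random


def take(q, n, timeout=5.0):
    """Pull exactly n items off q, blocking for each one instead of polling empty()."""
    # get() waits until an item is actually readable; the timeout only
    # guards against waiting forever if an item never arrives.
    return [q.get(timeout=timeout) for _ in range(n)]


q = Queue()
sent = [random.randint(0, 2) for _ in range(20)]
for value in sent:
    q.put(value)

received = take(q, len(sent))
# Items enqueued by a single process come back in the order they were put.
print(received == sent)
```

Because each get() blocks, it does not matter whether the background thread has flushed the data yet when the consumer starts reading.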

The output isn't deterministic, with or without the sleep(). The part you see runs sequentially, but, under the covers, q.put(thing) hands thing off to a multiprocessing worker thread to do the actual work of mutating the queue. .put() returns at once then, regardless of whether the worker thread has managed to put thing on the queue yet.
This can burn you "for real"! For example, consider this program:
import multiprocessing as mp
import time
q = mp.Queue()
nums = list(range(20))
q.put(nums)
# time.sleep(2)
del nums[-15:]
print(q.get())
Chances are that it will display:
[0, 1, 2, 3, 4]
This is so even if some other process retrieves from q. q.put(nums) hands off the task of pickling nums, and putting its serialized form on the queue, and there's a race between that and the main program mutating nums.
If you uncomment the sleep(2), then chances are high that it will display the original 20-element nums instead.
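One way to sidestep this race, sketched here under the assumption that a shallow copy is enough for your data, is to hand the queue its own snapshot of the object, so later mutations of the original cannot affect what gets serialized:

```python
import multiprocessing as mp

q = mp.Queue()
nums = list(range(20))

# Queue a private copy: whenever the pickling actually happens,
# it sees the copy, not the list we are about to mutate.
q.put(list(nums))

del nums[-15:]
received = q.get()
print(len(received), len(nums))
```

For nested mutable structures a shallow copy would not be sufficient; copy.deepcopy() would be the analogous choice there.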

Related

Python multiprocessing map using with statement does not stop

I am using multiprocessing python module to run parallel and unrelated jobs with a function similar to the following example:
import numpy as np
from multiprocessing import Pool
def myFunction(arg1):
    name = "file_%s.npy" % arg1
    A = np.load(name)
    A[A<0] = np.nan
    np.save(name, A)
if __name__ == "__main__":
    N = list(range(50))
    with Pool(4) as p:
        p.map_async(myFunction, N)
        p.close() # I tried with and without that statement
        p.join() # I tried with and without that statement
    DoOtherStuff()
My problem is that the function DoOtherStuff is never executed; the process shows up as sleeping in top and I need to kill it with Ctrl+C to stop it.
Any suggestions?

You have at least a couple of problems. First, you are using map_async(), which does not block until the tasks have completed. So you start the tasks with map_async(), but then immediately close and terminate the pool (the with statement calls Pool.terminate() upon exiting).
When you add tasks to a process pool with methods like map_async(), they go onto a task queue. A worker thread takes tasks off that queue and farms them out to worker processes, possibly spawning new processes as needed (actually, a separate thread handles that).
Point being, you have a race condition where you're terminating the Pool likely before any tasks are even started. If you want your script to block until all the tasks are done just use map() instead of map_async(). For example, I rewrote your script like this:
import numpy as np
from multiprocessing import Pool
def myFunction(N):
    A = np.load(f'file_{N:02}.npy')
    A[A<0] = np.nan
    np.save(f'file2_{N:02}.npy', A)
def DoOtherStuff():
    print('done')
if __name__ == "__main__":
    N = range(50)
    with Pool(4) as p:
        p.map(myFunction, N)
    DoOtherStuff()
I don't know what your use case is exactly, but if you do want to use map_async(), so that this task can run in the background while you do other stuff, you have to leave the Pool open, and manage the AsyncResult object returned by map_async():
pool = Pool(4)
result = pool.map_async(myFunction, N)
DoOtherStuff()
# Is my map done yet? If not, we should still block until
# it finishes before ending the process
result.wait()
pool.close()
pool.join()
You can see more examples in the linked documentation.
I don't know why you got a deadlock in your attempt; I was not able to reproduce that. It's possible there was a bug at some point that was then fixed, though you were also possibly invoking undefined behavior with your race condition, as well as calling terminate() on a pool after it had already been joined. As for why your answer did anything at all, it's possible that with the multiple calls to apply_async() you managed to skirt around the race condition somewhat, but this is not at all guaranteed to work.

Can a python Multiprocessing queue be passed to the child process?

I have a big dataset in a data acquisition system I wrote in python that takes infinitely long to pass over a queue from the child process to the parent. I want to save the data acquired at the end of the acquisition and tried this using the queue function in Multiprocessing. Instead of doing it this way I would prefer it if I could instead pass a message over the queue from the parent to the child to save my data before I kill the child process. Is this possible? An example of what I thought it might look like is:
def acquireData(self, var1, queue):
    import h5py
    # Put my acquisition code here
    queue.get()
    if queue == True:
        f = h5py.File("FileName","w")
        f.create_dataset('Data',data=data)
        f.close()
if __name__ == '__main__':
    from multiprocessing import Process, Queue
    queue = Queue()
    inter_thread = Process(target=acquireData, args=(var1,queue))
    queue.put(False)
    inter_thread.start()
    while True:
        if not args.automate:
            # Let c++ threads run for given amount of time
            # Wait for stop from OP GUI
        else:
            queue.put(True)
            break
    print("Acquisition finished, cleaning up...")
    sleep(2)
    inter_thread.terminate()
Is this allowed? If this type of interfacing between processes is allowed then do I have the right notation? For some reference I have on the order of 9e7 data points in the array I'm trying to save and I have 7 arrays which is simply not being passed to my parent process in a timely manner by putting these arrays into the queue. Thank you.

First, yes, passing a queue to a child is not only legal, but the main use case for queues. See the first example in the docs, which does exactly that.
However, you've got some problems with your code:
queue.get()
if queue == True:
First, your queue is never going to be the boolean value True, it's going to be a Queue. You almost never want to check if x == True: in Python; you want to check if x:. For example, if [1, 2]: will pass, while if [1, 2] == True: will not.
Second, your queue isn't even the thing you want to check in the first place. It isn't truthy or falsey (or it isn't relevant whether it is); it's the value the main process put on the queue and you pulled off that's either truthy or falsey. Which you discarded as soon as you retrieved it.
So, do this:
flag = queue.get()
if flag:
Or, more simply:
if queue.get():
I'm not sure whether this is exactly what you want or not. That queue.get() will block forever until the main process puts something there. Is that what you wanted? If so, great; you're done with this part of your code. If not, you need to think about what you wanted instead.
As designed, the parent will always wait 2 seconds, even if the child finished long before that. A better solution is to join the child with a timeout of 2 seconds. Then you can terminate it if it times out.
Plus, are you sure the termination behavior you've designed is what you want? You're doing a "soft kill request" with the queue, then waiting 2 seconds, then doing a "medium-hard kill request" with terminate, and never doing a "hard kill" with kill. That could be a perfectly reasonable design—but if it's not your design, you've implemented the wrong thing.
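A minimal sketch of that escalation, assuming a stand-in child function and a hypothetical helper named stop_child (neither is from the question's code). It joins with a timeout first, and only escalates to terminate() and then kill() if the child is still alive:

```python
import multiprocessing as mp
import time


def child(stop_q):
    """Hypothetical stand-in for the acquisition process."""
    stop_q.get()        # block until the parent asks us to stop
    time.sleep(0.1)     # pretend to save the data


def stop_child(proc, grace=2.0):
    """Wait up to `grace` seconds for proc to exit, then escalate."""
    proc.join(timeout=grace)    # returns early if the child already finished
    if proc.is_alive():
        proc.terminate()        # "medium-hard" request
        proc.join(timeout=grace)
    if proc.is_alive():
        proc.kill()             # hard kill; available since Python 3.7
        proc.join()
    return proc.exitcode


if __name__ == "__main__":
    stop_q = mp.Queue()
    proc = mp.Process(target=child, args=(stop_q,))
    proc.start()
    stop_q.put(True)            # the "soft" stop request
    print("exit code:", stop_child(proc))
```

With this shape the parent waits only as long as the child actually needs, up to the grace period.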

Output Queue of a Python multiprocessing is providing more results than expected

From the following code I would expect the length of the resulting list to be the same as the length of the range of items with which the multiprocess is fed:
import multiprocessing as mp
def worker(working_queue, output_queue):
    while True:
        if working_queue.empty() is True:
            break #this is supposed to end the process.
        else:
            picked = working_queue.get()
            if picked % 2 == 0:
                output_queue.put(picked)
            else:
                working_queue.put(picked+1)
    return
if __name__ == '__main__':
    static_input = range(100)
    working_q = mp.Queue()
    output_q = mp.Queue()
    for i in static_input:
        working_q.put(i)
    processes = [mp.Process(target=worker,args=(working_q, output_q)) for i in range(mp.cpu_count())]
    for proc in processes:
        proc.start()
    for proc in processes:
        proc.join()
    results_bank = []
    while True:
        if output_q.empty() is True:
            break
        else:
            results_bank.append(output_q.get())
    print(len(results_bank)) # length of this list should be equal to static_input, which is the range used to populate the input queue. In other words, this tells whether all the items placed for processing were actually processed.
    results_bank.sort()
    print(results_bank)
Has anyone any idea about how to make this code to run properly?

This code will never stop:
Each worker gets an item from the queue as long as it is not empty:
picked = working_queue.get()
and puts a new one for each that it got:
working_queue.put(picked+1)
As a result the queue will never be empty except when the timing between the process happens to be such that the queue is empty at the moment one of the processes calls empty(). Because the queue length is initially 100 and you have as many processes as cpu_count() I would be surprised if this ever stops on any realistic system.
Well, executing the code with a slight modification proves me wrong: it does stop at some point, which actually surprises me. When the code runs with one process there seems to be a bug, because after some time the process freezes but does not return. With multiple processes the results vary.
Adding a short sleep period in the loop iteration makes the code behave as I expected and explained above. There seems to be some timing issue between Queue.put, Queue.get and Queue.empty, although they are supposed to be thread-safe. Removing the empty test also gives the expected result (without ever getting stuck at an empty queue).
Found the reason for the varying behaviour. The objects put on the queue are not flushed immediately. Therefore empty() might return True although there are items in the queue waiting to be flushed.
From the documentation:
Note: When an object is put on a queue, the object is pickled and a background thread later flushes the pickled data to an underlying pipe. This has some consequences which are a little surprising, but should not cause any practical difficulties – if they really bother you then you can instead use a queue created with a manager.
After putting an object on an empty queue there may be an infinitesimal delay before the queue’s empty() method returns False and get_nowait() can return without raising Queue.Empty.
If multiple processes are enqueuing objects, it is possible for the objects to be received at the other end out-of-order. However, objects enqueued by the same process will always be in the expected order with respect to each other.
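One way to make the worker loop independent of empty(), sketched here with a swapped-in technique rather than the asker's original design: workers block on get() and exit on a None sentinel, and the parent counts results instead of polling. The function name run and its parameters are hypothetical. The approach relies on every input producing exactly one output, which holds for this even/odd example.

```python
import multiprocessing as mp


def worker(working_q, output_q):
    # Blocking get() plus a None sentinel replaces the empty() polling.
    for picked in iter(working_q.get, None):
        if picked % 2 == 0:
            output_q.put(picked)
        else:
            working_q.put(picked + 1)


def run(n_items=100, n_workers=4):
    working_q, output_q = mp.Queue(), mp.Queue()
    for i in range(n_items):
        working_q.put(i)
    procs = [mp.Process(target=worker, args=(working_q, output_q))
             for _ in range(n_workers)]
    for proc in procs:
        proc.start()
    # Every input yields exactly one output, so count the results
    # instead of asking whether the queue looks empty.
    results = [output_q.get() for _ in range(n_items)]
    for _ in procs:
        working_q.put(None)     # one sentinel per worker
    for proc in procs:
        proc.join()
    return sorted(results)


if __name__ == "__main__":
    results = run()
    print(len(results))
```

The sentinels are sent only after all results have arrived, so no re-queued item can still be in flight behind them.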

Return whichever expression returns first

I have two different functions f, and g that compute the same result with different algorithms. Sometimes one or the other takes a long time while the other terminates quickly. I want to create a new function that runs each simultaneously and then returns the result from the first that finishes.
I want to create that function with a higher order function
h = firstresult(f, g)
What is the best way to accomplish this in Python?
I suspect that the solution involves threading. I'd like to avoid discussion of the GIL.

I would simply use a Queue for this. Start the threads and the first one which has a result ready writes to the queue.
Code
from threading import Thread
from time import sleep
from queue import Queue
def firstresult(*functions):
    queue = Queue()
    threads = []
    for f in functions:
        # Bind f as a default argument; a plain closure could see a later loop value
        def thread_main(f=f):
            queue.put(f())
        thread = Thread(target=thread_main)
        threads.append(thread)
        thread.start()
    result = queue.get()
    return result
def slow():
    sleep(1)
    return 42
def fast():
    return 0
if __name__ == '__main__':
    print(firstresult(slow, fast))
Live demo
http://ideone.com/jzzZX2
Notes
Stopping the threads is an entirely different topic. For this you need to add some state variable to the threads which needs to be checked at regular intervals. As I want to keep this example short, I simply skipped that part and assumed that all workers get the time to finish their work even though the result is never read.
Skipping the discussion about the GIL as requested by the questioner. ;-)
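A minimal sketch of such a state variable, assuming each function is written to accept a threading.Event and to check it periodically (that calling convention is an assumption for illustration, not part of the answer above):

```python
import queue
import threading


def firstresult(*functions):
    """Return the first result and signal the remaining threads to stop."""
    results = queue.Queue()
    stop = threading.Event()    # the shared "state variable"
    threads = [threading.Thread(target=lambda f=f: results.put(f(stop)))
               for f in functions]
    for thread in threads:
        thread.start()
    winner = results.get()      # blocks until the first function finishes
    stop.set()                  # ask the others to give up
    for thread in threads:
        thread.join()
    return winner


def slow(stop):
    for _ in range(100):
        # wait() doubles as a sleep and returns True once stop is set
        if stop.wait(0.01):
            return None         # abandoned early
    return 42


def fast(stop):
    return 0


print(firstresult(slow, fast))
```

Cancellation here is cooperative: a function that never checks the event will still run to completion.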

Now - unlike my suggestion on the other answer, this piece of code does exactly what you are requesting:
from multiprocessing import Process, Queue
import random
import time
def firstresult(func1, func2):
    queue = Queue()
    proc1 = Process(target=func1, args=(queue,))
    proc2 = Process(target=func2, args=(queue,))
    proc1.start(); proc2.start()
    result = queue.get()
    proc1.terminate(); proc2.terminate()
    return result
def algo1(queue):
    time.sleep(random.uniform(0,1))
    queue.put("algo 1")
def algo2(queue):
    time.sleep(random.uniform(0,1))
    queue.put("algo 2")
if __name__ == '__main__':
    print(firstresult(algo1, algo2))

Run each function in a new worker thread, the 2 worker threads send the result back to the main thread in a 1 item queue or something similar. When the main thread receives the result from the winner, it kills (do python threads support kill yet? lol.) both worker threads to avoid wasting time (one function may take hours while the other only takes a second).
Replace the word thread with process if you want.

You will need to run each function in another process (with multiprocessing) or in a different thread.
If both are CPU bound, multithreading won't help much, precisely because of the GIL, so multiprocessing is the way to go.
If the return value is a pickleable (serializable) object, I have this decorator I created that simply runs the function in background, in another process:
https://bitbucket.org/jsbueno/lelo/src
It is not exactly what you want, as both calls are non-blocking and start executing right away. The trick with this decorator is that it blocks (and waits for the function to complete) only when you try to use the return value.
But on the other hand - it is just a decorator that does all the work.

Dumping a multiprocessing.Queue into a list

I wish to dump a multiprocessing.Queue into a list. For that task I've written the following function:
from queue import Empty
def dump_queue(queue):
    """
    Empties all pending items in a queue and returns them in a list.
    """
    result = []
    # START DEBUG CODE
    initial_size = queue.qsize()
    print("Queue has %s items initially." % initial_size)
    # END DEBUG CODE
    while True:
        try:
            thing = queue.get(block=False)
            result.append(thing)
        except Empty:
            # START DEBUG CODE
            current_size = queue.qsize()
            total_size = current_size + len(result)
            print("Dumping complete:")
            if current_size == initial_size:
                print("No items were added to the queue.")
            else:
                print("%s items were added to the queue." %
                      (total_size - initial_size))
            print("Extracted %s items from the queue, queue has %s items left"
                  % (len(result), current_size))
            # END DEBUG CODE
            return result
But for some reason it doesn't work.
Observe the following shell session:
>>> import multiprocessing
>>> q = multiprocessing.Queue()
>>> for i in range(100):
...     q.put([range(200) for j in range(100)])
...
>>> q.qsize()
100
>>> l=dump_queue(q)
Queue has 100 items initially.
Dumping complete:
0 items were added to the queue.
Extracted 1 items from the queue, queue has 99 items left
>>> l=dump_queue(q)
Queue has 99 items initially.
Dumping complete:
0 items were added to the queue.
Extracted 3 items from the queue, queue has 96 items left
>>> l=dump_queue(q)
Queue has 96 items initially.
Dumping complete:
0 items were added to the queue.
Extracted 1 items from the queue, queue has 95 items left
>>>
What's happening here? Why aren't all the items being dumped?

Try this:
import time
def dump_queue(queue):
    """
    Empties all pending items in a queue and returns them in a list.
    """
    result = []
    for i in iter(queue.get, 'STOP'):
        result.append(i)
    time.sleep(.1)
    return result
import multiprocessing
q = multiprocessing.Queue()
for i in range(100):
    q.put([range(200) for j in range(100)])
q.put('STOP')
l=dump_queue(q)
print(len(l))
Multiprocessing queues have an internal buffer which has a feeder thread which pulls work off a buffer and flushes it to the pipe. If not all of the objects have been flushed, I could see a case where Empty is raised prematurely. Using a sentinel to indicate the end of the queue is safe (and reliable). Also, using the iter(get, sentinel) idiom is just better than relying on Empty.
I don't like that it could raise empty due to flushing timing (I added the time.sleep(.1) to allow a context switch to the feeder thread, you may not need it, it works without it - it's a habit to release the GIL).

# in theory:
def dump_queue(q):
    q.put(None)
    return list(iter(q.get, None))
# in practice this might be more resilient:
def dump_queue(q):
    q.put(None)
    return list(iter(lambda : q.get(timeout=0.00001), None))
# but neither case handles all the ways things can break
# for that you need 'managers' and 'futures' ... see Commentary
I prefer None for sentinels, but I would tend to agree with jnoller that mp.queue could use a safe and simple sentinel. His comments on risks of getting empty raised early is also valid, see below.
Commentary:
This is old and Python has changed, but this does come up as a hit if you're having issues with lists <-> queue in MP Python. So, let's look a little deeper:
First off, this is not a bug, it's a feature: https://bugs.python.org/issue20147. To save you some time from reading that discussion and more details in the documentation, here are some highlights (kind of philosophical but I think it might help some who are starting with MP/MT in Python):
MP Queues are structures capable of being communicated with from different threads, different processes on the same system, and in fact from different (networked) computers.
In general with parallel/distributed systems, strict synchronization is expensive, so every time you use part of the API for any MP/MT datastructures, you need to look at the documentation to see what it promises to do, or not. Hint: if a function doesn't include the word "lock" or "semaphore" or "barrier" etc, then it will be some mixture of "asynchronous" and "best effort" (approximate), or what you might call "flaky."
Specific to this situation: Python is an interpreted language, with a famous single interpreter thread and its famous "Global Interpreter Lock" (GIL). If your entire program is single-process, single-threaded, then everything is hunky dory. If not (and with MP it's egregiously not), you need to give the interpreter some breathing room. time.sleep() is your friend. In this case, timeouts.
In your solution you are only using flaky functions - get() and qsize(). And the code is in fact worse than you might think - dial up the size of the queue and the size of the objects and you're likely to break things:
Now, you can work with flaky routines, but you need to give them room to maneuver. In your example you're just hammering that queue. All you need to do is change the line thing = queue.get(block=False) to instead be thing = queue.get(block=True,timeout=0.00001) and you should be fine.
The time 0.00001 is chosen carefully (10^-5), it's about the smallest that you can safely make it (this is where art meets science).
Some comments on why you need the timeout: this relates to the internals of how MP queues work. When you 'put' something into an MP queue, it's not actually put into the queue, it's queued up to eventually be there. That's why qsize() happens to give you a correct result - that part of the code knows there's a pile of things "in" the queue. You just need to realize that an object "in" the queue is not the same thing as "I can now read it." Think of MP queues as sending a letter with USPS or FedEx - you might have a receipt and a tracking number showing that "it's in the mail," but the recipient can't open it yet. Now, to be even more specific, in your case you get '0' items accessible right away. That's because the single interpreter thread you're running hasn't had any chance to process stuff that's "queued up", so your first loop just queues up a bunch of stuff for the queue, but you're immediately forcing your single thread to try to do a get() before it's even had a chance to line up even a single object for you.
One might argue that it slows code down to have these timeouts. Not really: MP queues are heavyweight constructs, and you should only be using them to pass pretty heavyweight "things" around, either big chunks of data or at least complex computation. What adding 10^-5 seconds actually does is give the interpreter a chance to do thread scheduling, at which point it will see your backed-up put() operations.
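The change described above can be sketched as follows. The 0.5-second default is a deliberately generous value picked for illustration, not the 10^-5 figure from the text, and the function treats a full timeout of silence as the end of the data, which is an assumption that only holds when no producer is still running:

```python
import multiprocessing
import queue


def dump_queue(q, timeout=0.5):
    """Drain q, treating `timeout` seconds without data as the end."""
    result = []
    while True:
        try:
            # Blocking get with a timeout gives the feeder thread
            # time to flush items that were put() but not yet readable.
            result.append(q.get(block=True, timeout=timeout))
        except queue.Empty:
            return result


q = multiprocessing.Queue()
for i in range(100):
    q.put(i)

items = dump_queue(q)
print(len(items))
```

The cost of this design is that every call ends by waiting out one full timeout before returning.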
Caveat
The above is not completely correct, and this is (arguably) an issue with the design of the get() function. The semantics of setting timeout to non-zero is that the get() function will not block for longer than that before returning Empty. But it might not actually be Empty (yet). So if you know your queue has a bunch of stuff to get, then the second solution above works better, or even with a longer timeout. Personally I think they should have kept the timeout=0 behavior, but had some actual built-in tolerance of 1e-5, because a lot of people will get confused about what can happen around gets and puts to MP constructs.
In your example code, you're not actually spinning up parallel processes. If we were to do that, then you'd start getting some random results - sometimes only some of the queue objects will be removed, sometimes it will hang, sometimes it will crash, sometimes more than one thing will happen. In the below example, one process crashes and the other hangs:
The underlying problem is that when you insert the sentinel, you need to know that the queue is finished. That should be done as part of the logic around the queue - if for example you have a classical master-worker design, then the master would need to push a sentinel (end) when the last task has been added. Otherwise you end up with race conditions.
The "correct" (resilient) approach is to involve managers and futures:
import multiprocessing
import concurrent.futures
def fill_queue(q):
    for i in range(5000):
        q.put([range(200) for j in range(100)])
def dump_queue(q):
    q.put(None)
    return list(iter(q.get, None))
if __name__ == "__main__":
    with multiprocessing.Manager() as manager:
        q = manager.Queue()
        with concurrent.futures.ProcessPoolExecutor() as executor:
            executor.submit(fill_queue, q) # add stuff
            executor.submit(fill_queue, q) # add more stuff
            executor.submit(fill_queue, q) # ... and more
        # 'step out' of the executor
        l = dump_queue(q)
    # 'step out' of the manager
    print(f"Saw {len(l)} items")
Let the manager handle your MP constructs (queues, dictionaries, etc), and within that let the futures handle your processes (and within that, if you want, let another future handle threads). This assures that things are cleaned up as you 'unravel' the work.
