I wrote a simple script, something like this:
import multiprocessing

my_list = []


def run_a_process():
    proc = multiprocessing.Process(target=worker)
    proc.start()
    my_list.append(proc)


def worker():
    # do something here
    ...


def close_done_processes():
    global my_list
    for idx, proc in enumerate(my_list):
        if not proc.is_alive():
            del my_list[idx]


def main():
    while True:
        if len(my_list) <= 10:
            for _ in range(10 - len(my_list)):
                run_a_process()
        if len(my_list):
            close_done_processes()


main()
I don't know about this example, but the real program works just fine with no problems.
But after a few days it freezes without any error or message. The program is still running and the interpreter is working on it, but there are no more logs and no more functionality; even Ctrl+C won't stop it. I think the problem is with the del my_list[idx] part: I suspect it isn't removing the item from memory and it isn't being garbage collected, so entries pile up until the freeze is caused by hitting some memory limit?
I want to know how I can solve this issue.
I want to add items to the list and remove the ones that are already done processing from memory, while keeping the other unprocessed items in the list, without this freeze happening.
You've got a few problems here:
1. As written, on Windows this code should fill your machine with a nigh-infinite number of processes in seconds (at least according to the documentation; you may be avoiding this by luck). You need the if __name__ == '__main__': guard around your invocation of main to prevent it:
if __name__ == '__main__':
    main()
2. Your code for cleaning up the list is broken (mutating a collection while you iterate over it is a bad idea), and it will delete the wrong elements of the list when there are two or more elements to remove (so some might never get deleted if there's always an element before them that's deleted first).
3. You're not actually joining the dead processes, which delays cleanup of Process resources for a potentially significant (indefinite?) period of time.
To fix #2 and #3, the easiest solution is to just build a new list of alive processes to replace the existing list, and join the ones that aren't alive:
def close_done_processes():
    global my_list
    new_list = []
    for proc in my_list:
        if proc.is_alive():
            new_list.append(proc)
        else:
            proc.join()
    my_list = new_list
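Putting it together, a minimal sketch of the corrected script might look like this (the worker body and the sleep at the end of the loop are assumptions added for the sketch, not part of the original code; the sleep just keeps the loop from spinning at 100% CPU):

import multiprocessing
import time

my_list = []


def worker():
    time.sleep(1)  # placeholder for the real work


def run_a_process():
    proc = multiprocessing.Process(target=worker)
    proc.start()
    my_list.append(proc)


def close_done_processes():
    global my_list
    new_list = []
    for proc in my_list:
        if proc.is_alive():
            new_list.append(proc)  # still running: keep it
        else:
            proc.join()            # finished: reap it so its resources are released
    my_list = new_list


def main():
    while True:
        if len(my_list) <= 10:
            for _ in range(10 - len(my_list)):
                run_a_process()
        if my_list:
            close_done_processes()
        time.sleep(0.1)  # assumption: avoid a busy loop


if __name__ == '__main__':
    main()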
Is it possible to set up a loop with live variable changes? I'm using threading, and the variables can change very often in between lines.
I'm looking for something like this:
length = len(some_list)
while length == len(some_list):
    if check_something(some_list):
        # The variable could change right here for
        # example, and the next line would still be called.
        do_something(some_list)
So far I've had no luck; is this something that's possible in Python?
EDIT: What I'm really looking for is a way for the loop to restart if some_list changes.
If it's just a single changing list, you can make a local copy:
def my_worker():
    my_list = some_list[:]
    if check_something(my_list):
        do_something(my_list)
UPDATE
A queue may work for you. The thing that modifies the list needs to post to the queue, so it's not automatic. There is also the risk that the background thread falls behind and processes stale data, or ends up crashing everything if memory is exhausted by the queue.
import threading
import queue
import time


def worker(work_q):
    while True:
        some_list = work_q.get()
        if some_list is None:  # sentinel: time to exit
            print('exiting')
            return
        print(some_list)


work_q = queue.Queue()
work_thread = threading.Thread(target=worker, args=(work_q,))
work_thread.start()

some_list = []
for i in range(10):
    some_list.append(i)
    work_q.put(some_list[:])  # post a snapshot of the list
    time.sleep(.2)

work_q.put(None)
work_thread.join()
I am running multiple processes from a single Python script:
Code Snippet:
while 1:
    if sqsObject.msgCount() > 0:
        ReadyMsg = sqsObject.readM2Q()
        if ReadyMsg == 0:
            continue
        fileName = ReadyMsg['fileName']
        dirName = ReadyMsg['dirName']
        uuid = ReadyMsg['uid']
        guid = ReadyMsg['guid']
        callback = ReadyMsg['callbackurl']

        # print ("Trigger Algorithm Process")
        if (countProcess < maxProcess):
            try:
                retValue = Process(target=dosomething, args=(dirName, uuid, guid, callback))
                processArray.append(retValue)
                retValue.start()
                countProcess = countProcess + 1
            except:
                print "Cannot Run Process"
        else:
            for i in range(len(processArray)):
                if (processArray[i].is_alive() == True):
                    continue
                else:
                    try:
                        # print 'Restart Process'
                        processArray[i] = Process(target=dosomething, args=(dirName, uuid, guid, callback))
                        processArray[i].start()
                    except:
                        print "Cannot Run Process"
    else:  # No more request to service
        for i in range(len(processArray)):
            if (processArray[i].is_alive() == True):
                processRunning = 1
                break
            else:
                continue
        if processRunning == 0:
            countProcess = 0
        else:
            processRunning = 0
Here I am reading messages from the queue and creating a process to run the algorithm on each message. I put an upper limit of maxProcess on the number of processes, and hence after reaching maxProcess I want to reuse the processArray slots which are not alive, by checking is_alive().
This runs fine for a small number of processes; however, for a large number of messages, say 100, memory consumption goes through the roof. I am thinking I have a leak caused by reusing the process slots.
Not sure what is wrong in the process.
Thank you in advance for spotting an error or offering wise advice.
Your code is, in a word, weird :-)
It's not an MCVE, so no one else can test it, but just looking at it, you have this (slightly simplified) structure in the inner loop:
if count < limit:
    ... start a new process, and increment count ...
else:
    do things that can potentially start even more processes
    (but never, ever, decrease count)
which seems unwise at best.
There are no invocations of a process instance's join(), anywhere. (We'll get back to the outer loop and its else case in a bit.)
Let's look more closely at the inner loop's else case code:
for i in range(len(processArray)):
    if (processArray[i].is_alive() == True):
Leaving aside the unnecessary == True test—which is a bit of a risk, since the is_alive() method does not specifically promise to return True and False, just something that works boolean-ly—consider this description from the documentation (this link goes to py2k docs but py3k is the same, and your print statements imply your code is py2k anyway):
is_alive()
Return whether the process is alive.
Roughly, a process object is alive from the moment the start() method returns until the child process terminates.
Since we can't see the code for dosomething, it's hard to say whether these things ever terminate. Probably they do (by exiting), but if they don't, or don't soon enough, we could get problems here, where we just drop the message we pulled off the queue in the outer loop.
If they do terminate, we just drop the process reference from the array, by overwriting it:
processArray[i] = Process(...)
The previous value in processArray[i] is discarded. It's not clear if you may have saved this anywhere else, but if you have not, the Process instance gets discarded, and now it is actually impossible to call its join() method.
Some Python data structures tend to clean themselves up when abandoned (e.g., open streams flush output and close as needed), but the multiprocess code appears not to auto-join() its children. So this could be the, or a, source of the problem.
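For instance (a sketch grafted onto the question's own loop, not code from the question; only the join call is new), the slot-reuse branch could reap the finished child before overwriting it:

# inside the inner else branch, before reusing a dead slot
if not processArray[i].is_alive():
    processArray[i].join()  # reap the finished child so its resources are released
    processArray[i] = Process(target=dosomething,
                              args=(dirName, uuid, guid, callback))
    processArray[i].start()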
Finally, whenever we do get to the else case in the outer loop, we have the same somewhat odd search for any alive processes—which, incidentally, can be written more clearly as:
if any(p.is_alive() for p in processArray):
as long as we don't care about which particular ones are alive, and which are not—and if none report themselves as alive, we reset the count, but never do anything with the variable processArray, so that each processArray[i] still holds the identity of the Process instance. (So at least we could call join on each of these, excluding any lost by overwriting.)
Rather than building your own pool yourself, you are probably better off using multiprocessing.Pool and its apply and apply_async methods, as in miraculixx's answer.
Not sure what is wrong in the process.
It appears you are creating as many processes as there are messages, even when the maxProcess count is reached.
I am thinking I have leak by reusing the process slots.
There is no need to manage the processes yourself. Just use a process pool:
# before your while loop starts
from multiprocessing import Pool
pool = Pool(processes=max_process)

while 1:
    ...
    # instead of creating a new Process
    res = pool.apply_async(dosomething,
                           args=(dirName, uuid, guid, callback))

# after the while loop has finished
# -- wait to finish
pool.close()
pool.join()
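If you also need the return values of dosomething, or want exceptions raised inside the workers to surface, keep the AsyncResult objects returned by apply_async and call get() on them after the loop. A minimal, self-contained sketch (the dosomething stub and its arguments here are placeholders, not the question's real code):

from multiprocessing import Pool


def dosomething(dirName, uuid, guid, callback):
    # stand-in for the real work; just returns something checkable
    return uuid


if __name__ == '__main__':
    pool = Pool(processes=4)
    results = [pool.apply_async(dosomething, args=('dir', i, 'guid', 'cb'))
               for i in range(10)]
    pool.close()
    pool.join()
    values = [r.get() for r in results]  # get() also re-raises worker exceptions
    print(values)                        # [0, 1, ..., 9]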
Ways to submit jobs
Note that the Pool class supports several ways to submit jobs:
apply_async - one message at a time
map_async - a chunk of messages at a time
If messages arrive fast enough it might be better to collect several of them (say 10 or 100 at a time, depending on the actual processing done) and use map_async to submit a "mini-batch" to the target function:
...
while True:
    messages = []
    # build a mini-batch of messages
    while len(messages) < batch_size:
        ...  # get message
        messages.append((dirName, uuid, guid, callback))
    pool.map_async(dosomething, messages)
To avoid memory leaks left by dosomething, you can ask the Pool to restart a worker process after it has consumed some number of messages:

max_tasks = 5  # some sensible number
Pool(processes=max_process, maxtasksperchild=max_tasks)
Going distributed
If memory capacity is still exceeded with this approach, consider going distributed, i.e. adding more machines. Using Celery, that would be pretty straightforward coming from the above:
# tasks.py
@task
def dosomething(...):
    ...  # same code as before

# driver.py
while True:
    ...  # get messages as before
    res = dosomething.apply_async(args=(dirName, uuid, guid, callback))
I wrote a multiprocessing program in Python. It can be illustrated as follows:
import multiprocessing

nodes = multiprocessing.Manager().list()
lock = multiprocessing.Lock()


def get_elems(node):
    # get elements by sending requests
    ...


def worker():
    lock.acquire()
    node = nodes.pop(0)
    lock.release()
    elems = get_elems(node)
    lock.acquire()
    for elem in elems:
        nodes.append(node)
    lock.release()


if __name__ == "__main__":
    node = {"name": "name", "group": 0}
    nodes.append(node)
    processes = [None for i in xrange(10)]
    for i in xrange(10):
        processes[i] = multiprocessing.Process(target=worker)
        processes[i].start()
    for i in xrange(10):
        processes[i].join()
At the beginning of the run everything seems okay, but after running for a while the program slows down. The phenomenon also exists when using multithreading, and since I saw that Python has a Global Interpreter Lock I changed to multiprocessing, but the problem is still there. The complete code is in here. I have tried Cython as well and still see this phenomenon. Is there something wrong in my code, or is there an inherent defect in Python's parallelism?
I'm not sure it's the actual cause, but you are popping from the beginning of an increasingly long list, and that's expensive (pop(0) on a list is O(n)). Try using a collections.deque instead.
Update: Read the linked code. You should use a Queue, as suggested in the comments to this post, and threads.
You do away with the locks by using the Queue.
The workers are I/O bound, so threads are appropriate.
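A minimal sketch of that combination (this is an assumed rewrite, not the original code; get_elems is stubbed out here and stands for the I/O-bound request logic from the question):

import threading
import queue  # named Queue in Python 2


def get_elems(node):
    # placeholder for the real request logic from the question
    return []


work_q = queue.Queue()


def worker():
    while True:
        node = work_q.get()           # blocks until an item is available
        if node is None:              # sentinel tells the worker to exit
            work_q.task_done()
            return
        for elem in get_elems(node):  # the I/O-bound part
            work_q.put(elem)          # Queue is thread-safe; no explicit lock needed
        work_q.task_done()


if __name__ == "__main__":
    work_q.put({"name": "name", "group": 0})
    threads = [threading.Thread(target=worker) for _ in range(10)]
    for t in threads:
        t.start()
    work_q.join()                     # wait until every queued node is processed
    for _ in threads:
        work_q.put(None)              # one sentinel per worker
    for t in threads:
        t.join()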
I have a script in which all my workers update different dictionaries in a Manager list. My question is: when one worker is writing to the list, will the others wait for it, or can all workers update the list at the same time?
Here is my code:
from multiprocessing import Process, Manager


def worker(x, i, *args):
    sub_l = x[i]
    sub_l[i] = i
    x[i] = sub_l


if __name__ == '__main__':
    manager = Manager()
    num = 4
    x = manager.list([{}]*num)
    p = []
    for i in range(num):
        p.append(Process(target=worker, args=(x, i)))
        p[i].start()
    for i in range(num):
        p[i].join()
    print x
I only need all my workers to run separately and update different global variables. I kind of think using manager.list is overkill, but I am not sure if there is another way to do this.
The Manager server that provides access to the Manager.list doesn't do any synchronization when you try to access the objects it's managing. You can basically think of it exactly the same way you would if you were just dealing with threads and an ordinary global variable: because of the GIL, two threads aren't going to be able to actually step on top of each other while doing an atomic bytecode operation, but something like incrementing a variable, which requires multiple bytecode operations, needs to be protected by a lock.
In your case, you'd only be in trouble if multiple workers did certain operations on the same sub-list in parallel. If two workers ran this at the same time:
sub_l = x[i]
sub_l[i] = sub_l[i] + <something unique to the worker>
x[i] = sub_l
With the same i, they could end up stomping on each other; both would store a copy of the same sub_l, and both would increment sub_l[i], and then both would update x[i], but the second one to update x would overwrite the changes done by the first one.
As long as you don't try to do those kind of things in parallel across workers, you should be fine.
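If you ever do need that kind of read-modify-write on a shared slot, a lock from the manager held around the whole sequence avoids the lost update. A sketch (the lock, the delta argument and the increments are assumptions added for illustration, not part of the question's code):

from multiprocessing import Process, Manager


def worker(x, i, lock, delta):
    # hold the lock across the whole read-modify-write so no other
    # worker can interleave between the fetch and the store
    with lock:
        sub_l = x[i]
        sub_l[i] = sub_l.get(i, 0) + delta
        x[i] = sub_l


if __name__ == '__main__':
    manager = Manager()
    x = manager.list([{} for _ in range(4)])
    lock = manager.Lock()
    procs = [Process(target=worker, args=(x, 0, lock, d)) for d in (1, 2, 3)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(list(x))  # x[0][0] == 6: no update was lost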
From the following code I would expect the length of the resulting list to be the same as that of the range of items with which the worker processes are fed:
import multiprocessing as mp


def worker(working_queue, output_queue):
    while True:
        if working_queue.empty() is True:
            break  # this is supposed to end the process.
        else:
            picked = working_queue.get()
            if picked % 2 == 0:
                output_queue.put(picked)
            else:
                working_queue.put(picked+1)
    return


if __name__ == '__main__':
    static_input = xrange(100)
    working_q = mp.Queue()
    output_q = mp.Queue()
    for i in static_input:
        working_q.put(i)
    processes = [mp.Process(target=worker, args=(working_q, output_q)) for i in range(mp.cpu_count())]
    for proc in processes:
        proc.start()
    for proc in processes:
        proc.join()
    results_bank = []
    while True:
        if output_q.empty() is True:
            break
        else:
            results_bank.append(output_q.get())
    print len(results_bank)  # length of this list should be equal to static_input, which is the range used to populate the input queue. In other words, this tells whether all the items placed for processing were actually processed.
    results_bank.sort()
    print results_bank
Does anyone have any idea how to make this code run properly?
This code will never stop:
Each worker gets an item from the queue as long as it is not empty:
picked = working_queue.get()
and, for each odd item it got, puts a new one back:
working_queue.put(picked+1)
As a result the queue will never be empty except when the timing between the process happens to be such that the queue is empty at the moment one of the processes calls empty(). Because the queue length is initially 100 and you have as many processes as cpu_count() I would be surprised if this ever stops on any realistic system.
Well, executing the code with a slight modification proves me wrong; it does stop at some point, which actually surprises me. Executing the code with one process, there seems to be a bug, because after some time the process freezes but does not return. With multiple processes the result varies.
Adding a short sleep period in the loop iteration makes the code behave as I expected and explained above. There seems to be some timing issue between Queue.put, Queue.get and Queue.empty, although they are supposed to be thread-safe. Removing the empty test also gives the expected result (without ever getting stuck at an empty queue).
Found the reason for the varying behaviour: the objects put on the queue are not flushed immediately. Therefore empty() might return True even though there are items that have been put on the queue but not yet flushed to the underlying pipe.
From the documentation:
Note: When an object is put on a queue, the object is pickled and a background thread later flushes the pickled data to an underlying pipe. This has some consequences which are a little surprising, but should not cause any practical difficulties – if they really bother you then you can instead use a queue created with a manager.
After putting an object on an empty queue there may be an infinitesimal delay before the queue’s empty() method returns False and get_nowait() can return without raising Queue.Empty.
If multiple processes are enqueuing objects, it is possible for the objects to be received at the other end out-of-order. However, objects enqueued by the same process will always be in the expected order with respect to each other.
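Since empty() is unreliable here, a more robust pattern is to avoid it entirely: block on get() and use one sentinel per worker to signal the end of the work, with a JoinableQueue so that re-queued items are still tracked. The sketch below is an assumed restructuring of the poster's code, not the original:

import multiprocessing as mp


def worker(working_queue, output_queue):
    while True:
        picked = working_queue.get()       # block instead of polling empty()
        if picked is None:                 # sentinel: no more work
            working_queue.task_done()
            return
        if picked % 2 == 0:
            output_queue.put(picked)
        else:
            working_queue.put(picked + 1)  # re-queued items are tracked by join()
        working_queue.task_done()


if __name__ == '__main__':
    static_input = range(100)
    working_q = mp.JoinableQueue()
    output_q = mp.Queue()
    for i in static_input:
        working_q.put(i)

    processes = [mp.Process(target=worker, args=(working_q, output_q))
                 for _ in range(mp.cpu_count())]
    for proc in processes:
        proc.start()

    working_q.join()                       # every item, including re-queued ones, is done
    for _ in processes:
        working_q.put(None)                # one sentinel per worker

    results_bank = [output_q.get() for _ in static_input]  # exactly one even result per input
    for proc in processes:
        proc.join()
    print(len(results_bank))               # 100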