Python multiprocessing lock mechanism failing when lock is acquired

I am trying to implement a multiprocessing application that accesses a shared data resource. I am using a locking mechanism to make sure the shared resource is accessed safely, but I am hitting errors. Surprisingly, if process 1 acquires the lock first, it services the request and the failure occurs in the next process that tries to acquire the lock. But if some process other than process 1 tries to acquire the lock first, it fails on the very first run. I am new to Python and have been following the documentation to implement this, so I am unaware of any basic safety mechanism I might be missing. Any pointer as to why I am seeing this would be of great help.
PROGRAM:
#!/usr/bin/python
from multiprocessing import Process, Manager, Lock
import os
import Queue
import time
lock = Lock()
def launch_worker(d, l, index):
    global lock
    lock.acquire()
    d[index] = "new"
    print "in process" + str(index)
    print d
    lock.release()
    return None
def dispatcher():
    i = 1
    d = {}
    mp = Manager()
    d = mp.dict()
    d[1] = "a"
    d[2] = "b"
    d[3] = "c"
    d[4] = "d"
    d[5] = "e"
    l = mp.list(range(10))
    for i in range(4):
        p = Process(target=launch_worker, args=(d, l, i))
        i = i + 1
        p.start()
    return None
if __name__ == '__main__':
    dispatcher()
ERROR when process 1 is serviced first
in process0
{0: 'new', 1: 'a', 2: 'b', 3: 'c', 4: 'd', 5: 'e'}
Process Process-3:
Traceback (most recent call last):
File "/usr/lib/python2.6/multiprocessing/process.py", line 232, in _bootstrap
self.run()
File "/usr/lib/python2.6/multiprocessing/process.py", line 88, in run
self._target(*self._args, **self._kwargs)
File "dispatcher.py", line 10, in launch_worker
d[index] = "new"
File "<string>", line 2, in __setitem__
File "/usr/lib/python2.6/multiprocessing/managers.py", line 722, in _callmethod
self._connect()
File "/usr/lib/python2.6/multiprocessing/managers.py", line 709, in _connect
conn = self._Client(self._token.address, authkey=self._authkey)
File "/usr/lib/python2.6/multiprocessing/connection.py", line 143, in Client
c = SocketClient(address)
File "/usr/lib/python2.6/multiprocessing/connection.py", line 263, in SocketClient
s.connect(address)
File "<string>", line 1, in connect
error: [Errno 2] No such file or directory
ERROR when process 2 is serviced first
Process Process-2:
Traceback (most recent call last):
File "/usr/lib/python2.6/multiprocessing/process.py", line 232, in _bootstrap
self.run()
File "/usr/lib/python2.6/multiprocessing/process.py", line 88, in run
self._target(*self._args, **self._kwargs)
File "dispatcher.py", line 10, in launch_worker
d[index] = "new"
File "<string>", line 2, in __setitem__
File "/usr/lib/python2.6/multiprocessing/managers.py", line 722, in _callmethod
self._connect()
File "/usr/lib/python2.6/multiprocessing/managers.py", line 709, in _connect
conn = self._Client(self._token.address, authkey=self._authkey)
File "/usr/lib/python2.6/multiprocessing/connection.py", line 150, in Client
deliver_challenge(c, authkey)
File "/usr/lib/python2.6/multiprocessing/connection.py", line 373, in deliver_challenge
response = connection.recv_bytes(256) # reject large message
IOError: [Errno 104] Connection reset by peer

The dict your workers modify is a shared object managed by the dispatching process; modifications to that object by the workers require that they communicate with the dispatching process. The errors you see come from the fact that your dispatcher isn't waiting for the worker processes after it launches them; it exits too soon, so the manager may no longer exist when the workers need to communicate with it.
The first worker or two that attempts to update the shared dict might succeed, because when they modify the shared dict the process containing the Manager instance might still exist (e.g., it might still be in the process of creating further workers). Thus in your examples you see some successful output. But the managing process soon exits, and the next worker that attempts a modification will fail. (The error messages you see are typical of failed attempts at inter-process communication; you'll probably also see EOF errors if you run your program a few more times.)
What you need to do is call the join method on the Process objects as a way of waiting for each of them to exit. The following modification of your dispatcher shows the basic idea:
def dispatcher():
    mp = Manager()
    d = mp.dict()
    d[1] = "a"
    d[2] = "b"
    d[3] = "c"
    d[4] = "d"
    d[5] = "e"
    l = mp.list(range(10))
    procs = []
    for i in range(4):
        p = Process(target=launch_worker, args=(d, l, i))
        procs.append(p)
        p.start()
    # wait for every worker to finish before the manager goes away
    for p in procs:
        p.join()
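Putting the pieces together, a minimal self-contained version of the fixed program might look like the sketch below. This is my own reconstruction rather than the original poster's code; it also passes the Lock to the workers explicitly instead of relying on a module-level global, which is more portable across start methods.

from multiprocessing import Process, Manager, Lock

def launch_worker(d, lock, index):
    # the lock is passed in as an argument so every worker
    # shares the same Lock object regardless of start method
    lock.acquire()
    try:
        d[index] = "new"
        print("in process " + str(index))
        print(dict(d))
    finally:
        lock.release()

def dispatcher():
    lock = Lock()
    mp = Manager()
    d = mp.dict()
    for key, value in zip(range(1, 6), "abcde"):
        d[key] = value
    procs = [Process(target=launch_worker, args=(d, lock, i)) for i in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()   # keep the manager alive until every worker is done

if __name__ == '__main__':
    dispatcher()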

Related

Using a multiprocessing dictionary with multiple multiprocessing shared objects (dict, list, variables) in Python

I want to share a multiprocessing list, dictionary and variable across processes. I am able to code and use them if I send these three as separate arguments to the function I want to multiprocess. But as soon as I use a multiprocessing dictionary (or a normal dictionary) to store these three shared objects and then send that dictionary as a single argument to the function, I get: RuntimeError: Synchronized objects should only be shared between processes through inheritance.
import multiprocessing as mp
from multiprocessing.sharedctypes import Value

def final(D):
    if 0 not in D["first"][0]:
        D["first"][0].update({0: D["first"][2]})
        num = D["first"][2].value
        D["first"][2].value += 1
        if D["first"][2].value == 3:
            D["first"][2].value = 0
    else:
        num = D["first"][2].value
    print("num is {}".format(num))

if __name__ == '__main__':
    l = mp.Manager().list(range(5))
    v = Value("i", 0)
    d = mp.Manager().dict()
    D = mp.Manager().dict()
    # D = {}
    D.update({"first": [d, l, v]})
    p = mp.Process(target=final, name="process", args=(D,))
    p.start()
    p.join()
This is the error I get:
Process process:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "test.py", line 58, in d_final
D["first"][0].update({0:D["first"][2]})
File "<string>", line 2, in update
File "/usr/lib/python2.7/multiprocessing/managers.py", line 758, in _callmethod
conn.send((self._id, methodname, args, kwds))
File "/usr/lib/python2.7/multiprocessing/sharedctypes.py", line 218, in __reduce__
assert_spawning(self)
File "/usr/lib/python2.7/multiprocessing/forking.py", line 52, in assert_spawning
' through inheritance' % type(self).__name__
RuntimeError: Synchronized objects should only be shared between processes through inheritance
How can I do the above, given that I need to store multiple multiprocessing dictionaries, lists and variables in a dictionary that I can share among processes?
The dictionary should look like:
D = { "first": [d, l, v], "second": [d, l, v], "third": [d, l, v], ... }
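This question has no answer in the thread, but one hedged workaround (my own suggestion, not from the thread) is to create the Value through the same Manager, so that every element of the nested structure is a picklable proxy rather than a raw sharedctypes object:

import multiprocessing as mp

def final(D):
    # every object reachable from D is a manager proxy, so it can be
    # sent to the worker as an ordinary argument
    D["first"][2].value += 1
    print("num is {}".format(D["first"][2].value))

if __name__ == '__main__':
    manager = mp.Manager()
    d = manager.dict()
    l = manager.list(range(5))
    v = manager.Value("i", 0)  # Manager-backed Value instead of sharedctypes.Value
    D = manager.dict()
    D["first"] = [d, l, v]
    p = mp.Process(target=final, args=(D,))
    p.start()
    p.join()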

Python multiprocessing pool map_async freezes

I have a list of 80,000 strings that I am running through a discourse parser, and in order to increase the speed of this process I have been trying to use the Python multiprocessing package.
The parser code requires Python 2.7 and I am currently running it on a 2-core Ubuntu machine using a subset of the strings. For short lists, e.g. 20 strings, the process runs without an issue on both cores, but if I run a list of about 100 strings, both workers freeze at different points (so in some cases worker 1 won't stop until a few minutes after worker 2). This happens before all the strings are finished and before anything is returned. Each time the cores stop at the same point when the same mapping function is used, but these points differ if I try a different mapping function, i.e. map vs map_async vs imap.
I have tried removing the strings at those indices, which did not have any effect, and those strings run fine in a shorter list. Based on print statements I included, when the process appears to freeze, the current iteration seems to finish for the current string and it just does not move on to the next string. It takes about an hour of run time to reach the point where both workers have frozen, and I have not been able to reproduce the issue in less time. The code involving the multiprocessing commands is:
def main(initial_file, chunksize=2):
    entered_file = pd.read_csv(initial_file)
    entered_file = entered_file.ix[:, 0].tolist()

    pool = multiprocessing.Pool()
    result = pool.map_async(discourse_process, entered_file, chunksize=chunksize)
    pool.close()
    pool.join()

    with open("final_results.csv", 'w') as file:
        writer = csv.writer(file)
        for listitem in result.get():
            writer.writerow([listitem[0], listitem[1]])

if __name__ == '__main__':
    main(sys.argv[1])
When I stop the process with Ctrl-C (which does not always work), the error message I receive is:
^CTraceback (most recent call last):
File "Combined_Script.py", line 94, in <module>
main(sys.argv[1])
File "Combined_Script.py", line 85, in main
pool.join()
File "/usr/lib/python2.7/multiprocessing/pool.py", line 474, in join
p.join()
File "/usr/lib/python2.7/multiprocessing/process.py", line 145, in join
res = self._popen.wait(timeout)
File "/usr/lib/python2.7/multiprocessing/forking.py", line 154, in wait
return self.poll(0)
File "/usr/lib/python2.7/multiprocessing/forking.py", line 135, in poll
pid, sts = os.waitpid(self.pid, flag)
KeyboardInterrupt
Process PoolWorker-1:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 117, in worker
put((job, i, result))
File "/usr/lib/python2.7/multiprocessing/queues.py", line 390, in put
wacquire()
KeyboardInterrupt
^CProcess PoolWorker-2:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 117, in worker
put((job, i, result))
File "/usr/lib/python2.7/multiprocessing/queues.py", line 392, in put
return send(obj)
KeyboardInterrupt
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/usr/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "/usr/lib/python2.7/multiprocessing/util.py", line 305, in _exit_function
_run_finalizers(0)
File "/usr/lib/python2.7/multiprocessing/util.py", line 274, in _run_finalizers
finalizer()
File "/usr/lib/python2.7/multiprocessing/util.py", line 207, in __call__
res = self._callback(*self._args, **self._kwargs)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 500, in _terminate_pool
outqueue.put(None) # sentinel
File "/usr/lib/python2.7/multiprocessing/queues.py", line 390, in put
wacquire()
KeyboardInterrupt
Error in sys.exitfunc:
Traceback (most recent call last):
File "/usr/lib/python2.7/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "/usr/lib/python2.7/multiprocessing/util.py", line 305, in _exit_function
_run_finalizers(0)
File "/usr/lib/python2.7/multiprocessing/util.py", line 274, in _run_finalizers
finalizer()
File "/usr/lib/python2.7/multiprocessing/util.py", line 207, in __call__
res = self._callback(*self._args, **self._kwargs)
File "/usr/lib/python2.7/multiprocessing/pool.py", line 500, in _terminate_pool
outqueue.put(None) # sentinel
File "/usr/lib/python2.7/multiprocessing/queues.py", line 390, in put
wacquire()
KeyboardInterrupt
When I look at the memory in another command window using htop, memory usage is below 3% once the workers freeze. This is my first attempt at parallel processing, and I am not sure what else I might be missing.
I was not able to solve the issue with the multiprocessing pool, but I came across the loky package and was able to run my code with the following lines:
import loky

executor = loky.get_reusable_executor(timeout=200, kill_workers=True)
results = executor.map(discourse_process, entered_file)
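For context, a self-contained sketch of that approach might look like the following. This is my own illustration; the discourse_process body is a placeholder, not the parser from the question.

import loky

def discourse_process(text):
    # placeholder for the real parser; returns something picklable
    return (text, len(text))

if __name__ == '__main__':
    entered_file = ["string %d" % i for i in range(100)]
    executor = loky.get_reusable_executor(max_workers=2, timeout=200)
    results = list(executor.map(discourse_process, entered_file))
    print(results[:3])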
You could define a timeout within which your process must return a result; otherwise an error is raised:
try:
    result.get(timeout=1)
except multiprocessing.TimeoutError:
    print("Error while retrieving the result")
You could also verify whether your process was successful with:
import time

while True:
    try:
        result.successful()   # raises if the result is not ready yet
        break
    except Exception:
        print("Result is not yet successful")
        time.sleep(1)
Finally, checking out https://docs.python.org/2/library/multiprocessing.html is helpful.
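As a hedged illustration of how a bounded wait could replace the unconditional pool.join() in the question's main(), consider the sketch below. This is my own combination of the suggestions above; the discourse_process body is a placeholder.

import multiprocessing

def discourse_process(text):
    # placeholder for the real parser from the question
    return (text, len(text))

def main(entered_file, chunksize=2):
    pool = multiprocessing.Pool()
    result = pool.map_async(discourse_process, entered_file, chunksize=chunksize)
    pool.close()
    try:
        # wait a bounded amount of time instead of joining unconditionally
        rows = result.get(timeout=3600)
    except multiprocessing.TimeoutError:
        pool.terminate()
        raise
    finally:
        pool.join()
    return rows

if __name__ == '__main__':
    print(main(["a", "bb", "ccc"]))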

Multiprocessing simple function doesn't work but why

I am trying to multiprocess system commands, but can't get it to work with a simple program. The function runit(cmd) works fine though...
#!/usr/bin/python3
from subprocess import call, run, PIPE, Popen
from multiprocessing import Pool
import os

pool = Pool()

def runit(cmd):
    proc = Popen(cmd, shell=True, stdout=PIPE, stderr=PIPE, universal_newlines=True)
    return proc.stdout.read()

#print(runit('ls -l'))

it = []
for i in range(1, 3):
    it.append('ls -l')

results = pool.map(runit, it)
It outputs:
Process ForkPoolWorker-1:
Process ForkPoolWorker-2:
Traceback (most recent call last):
Traceback (most recent call last):
File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
self.run()
File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.5/multiprocessing/pool.py", line 108, in worker
task = get()
File "/usr/lib/python3.5/multiprocessing/queues.py", line 345, in get
return ForkingPickler.loads(res)
AttributeError: Can't get attribute 'runit' on <module '__main__' from './syscall.py'>
File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
self.run()
File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.5/multiprocessing/pool.py", line 108, in worker
task = get()
File "/usr/lib/python3.5/multiprocessing/queues.py", line 345, in get
return ForkingPickler.loads(res)
AttributeError: Can't get attribute 'runit' on <module '__main__' from './syscall.py'>
Then it somehow waits and does nothing, and when I press Ctrl+C a few times it spits out:
^CProcess ForkPoolWorker-4:
Process ForkPoolWorker-6:
Traceback (most recent call last):
File "./syscall.py", line 17, in <module>
Process ForkPoolWorker-5:
results = pool.map(runit, it)
File "/usr/lib/python3.5/multiprocessing/pool.py", line 260, in map
...
buf = self._recv(4)
File "/usr/lib/python3.5/multiprocessing/connection.py", line 379, in _recv
chunk = read(handle, remaining)
KeyboardInterrupt
I'm not sure, since the issue I know of is Windows-related (and I don't have access to a Linux box to reproduce), but in order to be portable you have to wrap your multiprocessing-dependent commands in if __name__ == "__main__", or it conflicts with the way Python spawns the processes. This fixed example runs fine on Windows (and should work on other platforms as well):
from subprocess import Popen, PIPE
from multiprocessing import Pool
import os

def runit(cmd):
    proc = Popen(cmd, shell=True, stdout=PIPE, stderr=PIPE, universal_newlines=True)
    return proc.stdout.read()

#print(runit('ls -l'))

it = []
for i in range(1, 3):
    it.append('ls -l')

if __name__ == "__main__":
    # all calls to the multiprocessing module are "protected" by this directive
    pool = Pool()
    results = pool.map(runit, it)
    print(results)
(Studying the error messages more closely, I'm now pretty sure that just moving pool = Pool() after the declaration of runit would fix it as well on Linux, but wrapping everything in the __main__ guard fixes it and makes it portable.)
That said, note that your multiprocessing setup just creates new processes whose only job is to spawn other processes, so you'd be better off with a thread pool (see "Threading pool similar to the multiprocessing Pool?"): threads that create processes, like this:
from subprocess import Popen, PIPE
from multiprocessing.pool import ThreadPool  # uses threads, not processes
import os

def runit(cmd):
    proc = Popen(cmd, shell=True, stdout=PIPE, stderr=PIPE, universal_newlines=True)
    return proc.stdout.read()

it = []
for i in range(1, 3):
    it.append('ls -l')

if __name__ == "__main__":
    pool = ThreadPool()  # ThreadPool instead of Pool
    results = pool.map(runit, it)
    print(results)
The latter solution is more lightweight and less error-prone (multiprocessing is a delicate module to handle). You'll be able to work with objects, shared data, etc. without needing a Manager object, among other advantages.
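To illustrate that last point, here is a small sketch of mine (not the answerer's code) showing threads sharing an ordinary dict directly, which would require a Manager if Pool were used instead:

from multiprocessing.pool import ThreadPool
from subprocess import Popen, PIPE
import threading

results = {}               # plain dict, shared by all threads in this process
results_lock = threading.Lock()

def runit(cmd):
    proc = Popen(cmd, shell=True, stdout=PIPE, stderr=PIPE, universal_newlines=True)
    out = proc.stdout.read()
    with results_lock:     # guard concurrent writes to the shared dict
        results[cmd] = out
    return out

if __name__ == "__main__":
    pool = ThreadPool(2)
    pool.map(runit, ['ls -l', 'ls -a'])
    pool.close()
    pool.join()
    print(sorted(results))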

Iterate over list from leading and trailing with multiprocessing

I want to iterate over a list with two functions using multiprocessing: one function iterates over the main list from the front, the other from the back. I want each function, as it iterates over the sample list (g), to put the element into the main list until one of them finds a duplicate in the list; at that point I want to terminate both processes and return the seen elements.
I expect that the first process return :
['a', 'b', 'c', 'd', 'e', 'f']
And the second return :
['l', 'k', 'j', 'i', 'h', 'g']
This is my code, which returns an error:
from multiprocessing import Process, Manager

manager = Manager()
d = manager.list()

# Fn definitions and such
def a(main_path, g, l=[]):
    for i in g:
        l.append(i)
        print 'a'
        if i in main_path:
            return l
        main_path.append(i)

def b(main_path, g, l=[]):
    for i in g:
        l.append(i)
        print 'b'
        if i in main_path:
            return l
        main_path.append(i)

g = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l']
g2 = g[::-1]

p1 = Process(target=a, args=(d, g))
p2 = Process(target=b, args=(d, g2))
p1.start()
p2.start()
And this is the Traceback:
a
Process Process-2:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/home/bluebird/Desktop/persiantext.py", line 17, in a
if i in main_path:
File "<string>", line 2, in __contains__
File "/usr/lib/python2.7/multiprocessing/managers.py", line 755, in _callmethod
self._connect()
File "/usr/lib/python2.7/multiprocessing/managers.py", line 742, in _connect
conn = self._Client(self._token.address, authkey=self._authkey)
File "/usr/lib/python2.7/multiprocessing/connection.py", line 169, in Client
b
c = SocketClient(address)
File "/usr/lib/python2.7/multiprocessing/connection.py", line 304, in SocketClient
s.connect(address)
File "/usr/lib/python2.7/socket.py", line 224, in meth
return getattr(self._sock,name)(*args)
error: [Errno 2] No such file or directory
Process Process-3:
Traceback (most recent call last):
File "/usr/lib/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/home/bluebird/Desktop/persiantext.py", line 27, in b
if i in main_path:
File "<string>", line 2, in __contains__
File "/usr/lib/python2.7/multiprocessing/managers.py", line 755, in _callmethod
self._connect()
File "/usr/lib/python2.7/multiprocessing/managers.py", line 742, in _connect
conn = self._Client(self._token.address, authkey=self._authkey)
File "/usr/lib/python2.7/multiprocessing/connection.py", line 169, in Client
c = SocketClient(address)
File "/usr/lib/python2.7/multiprocessing/connection.py", line 304, in SocketClient
s.connect(address)
File "/usr/lib/python2.7/socket.py", line 224, in meth
return getattr(self._sock,name)(*args)
error: [Errno 2] No such file or directory
Note that I have no idea how to terminate both processes once one of them finds a duplicate element!
There are all kinds of other problems in your code, but since I already explained them on your other question, I won't get into them here.
The new problem is that you're not joining your child processes. In your threaded version, this wasn't an issue just because your main thread accidentally had a "block forever" before the end. But here, you don't have that, so the main process reaches the end of the script while the background processes are still running.
When this happens, it's not entirely defined what your code will do.* But basically, you're destroying the manager object, which shuts down the manager server while the background processes are still using it, so they're going to raise exceptions the next time they try to access a managed object.
The solution is to add p1.join() and p2.join() to the end of your script.
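Concretely, the tail of the script would look roughly like this (a sketch on my part, reusing a, b, d, g and g2 as defined in the question, with the usual __main__ guard added for portability):

from multiprocessing import Process

if __name__ == '__main__':
    p1 = Process(target=a, args=(d, g))
    p2 = Process(target=b, args=(d, g2))
    p1.start()
    p2.start()
    # keep the main process (and therefore the Manager) alive
    # until both children have finished
    p1.join()
    p2.join()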
But that really only gets you back to the same situation as your threaded code (except not blocking forever at the end). You've still got code that's completely serialized, and a big race condition, and so on.
If you're curious why this happens:
At the end of the script, all of your module's globals go out of scope.** Since those variables are the only reference you have to the manager and process objects, those objects get garbage-collected, and their destructors get called.
For a manager object, the destructor shuts down the server.
For a process object, I'm not entirely sure, but I think the destructor does nothing (rather than join it and/or interrupt it). Instead, there's an atexit function, that runs after all of the destructors, that joins any still-running processes.***
So, first the manager goes away, then the main process starts waiting for the children to finish; the next time each one tries to access a managed object, it fails and exits. Once all of them do that, the main process finishes waiting and exits.
* The multiprocessing changes in 3.2 and the shutdown changes in 3.4 make things a lot cleaner, so if we weren't talking about 2.7, there would be less "here's what usually happens but not always" and "here's what happens in one particular implementation on one particular platform".
** This isn't actually guaranteed by 2.7, and garbage-collecting all of the modules' globals doesn't always happen. But in this particular simple case, I'm pretty sure it will always work this way, at least in CPython, although I don't want to try to explain why.
*** That's definitely how it works with threads, at least on CPython 2.7 on Unix… again, this isn't at all documented in 2.x, so you can only tell by reading the source or experimenting on the platforms/implementations/versions that matter to you… And I don't want to track this through the source unless there's likely to be something puzzling or interesting to find.

How can I catch a memory error in a spawned thread?

I've never used the multiprocessing library before, so all advice is welcome..
I've got a Python program that uses the multiprocessing library to do some memory-intensive tasks in multiple processes, and it occasionally runs out of memory (I'm working on optimizations, but that's not what this question is about). Sometimes an out-of-memory error gets thrown in a way that I can't seem to catch (output below), and then the program hangs on pool.join() (I'm using multiprocessing.Pool). How can I make the program do something other than wait indefinitely when this problem occurs?
Ideally, the memory error would be propagated back to the main process, which would then die.
Here's the memory error:
Exception in thread Thread-1:
Traceback (most recent call last):
File "/usr/lib64/python2.7/threading.py", line 811, in __bootstrap_inner
self.run()
File "/usr/lib64/python2.7/threading.py", line 764, in run
self.__target(*self.__args, **self.__kwargs)
File "/usr/lib64/python2.7/multiprocessing/pool.py", line 325, in _handle_workers
pool._maintain_pool()
File "/usr/lib64/python2.7/multiprocessing/pool.py", line 229, in _maintain_pool
self._repopulate_pool()
File "/usr/lib64/python2.7/multiprocessing/pool.py", line 222, in _repopulate_pool
w.start()
File "/usr/lib64/python2.7/multiprocessing/process.py", line 130, in start
self._popen = Popen(self)
File "/usr/lib64/python2.7/multiprocessing/forking.py", line 121, in __init__
self.pid = os.fork()
OSError: [Errno 12] Cannot allocate memory
And here's where I manage the multiprocessing:
mp_pool = mp.Pool(processes=num_processes)
mp_results = list()
for datum in input_data:
    data_args = {
        'value': 0  # actually some other simple dict key/values
    }
    mp_results.append(mp_pool.apply_async(_process_data, args=(common_args, data_args)))

mp_pool.close()
mp_pool.join()  # hangs here when that thread dies..

for result_async in mp_results:
    result = result_async.get()
    # do stuff to collect results

# rest of the code
When I interrupt the hanging program, I get:
Process process_003:
Traceback (most recent call last):
File "/opt/rh/python27/root/usr/lib64/python2.7/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/opt/rh/python27/root/usr/lib64/python2.7/multiprocessing/process.py", line 114, in run
self._target(*self._args, **self._kwargs)
File "/opt/rh/python27/root/usr/lib64/python2.7/multiprocessing/pool.py", line 102, in worker
task = get()
File "/opt/rh/python27/root/usr/lib64/python2.7/multiprocessing/queues.py", line 374, in get
return recv()
racquire()
KeyboardInterrupt
This is actually a known bug in Python's multiprocessing module, fixed in Python 3 (here's a summarizing blog post I found). There's a patch attached to Python issue 22393, but it hasn't been officially applied.
Basically, if one of a multiprocess pool's sub-processes die unexpectedly (out of memory, killed externally, etc.), the pool will wait indefinitely.
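If moving to Python 3 is an option, one way to get the "propagate and die" behaviour the question asks for is concurrent.futures.ProcessPoolExecutor, which raises BrokenProcessPool when a worker dies unexpectedly. A minimal sketch of mine (not from the answer; the worker body and the Unix-only SIGKILL are only there to simulate a dying process):

from concurrent.futures import ProcessPoolExecutor
from concurrent.futures.process import BrokenProcessPool
import os
import signal

def _process_data(value):
    # placeholder for the real memory-intensive work
    if value == 2:
        os.kill(os.getpid(), signal.SIGKILL)  # simulate a worker dying abruptly (Unix-only)
    return value * value

if __name__ == '__main__':
    try:
        with ProcessPoolExecutor(max_workers=2) as pool:
            results = list(pool.map(_process_data, range(5)))
        print(results)
    except BrokenProcessPool:
        print("a worker died; aborting instead of hanging")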
