I wrote the following sample program. It creates 8 threads and spawns a process in each one:
import threading
from multiprocessing import Process

def fast_function():
    pass

def thread_function():
    process_number = 1
    print 'start %s processes' % process_number
    for i in range(process_number):
        p = Process(target=fast_function, args=())
        p.start()
        p.join()

def main():
    threads_number = 8
    print 'start %s threads' % threads_number
    threads = [threading.Thread(target=thread_function, args=())
               for i in range(threads_number)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()

if __name__ == '__main__':
    main()
It crashes with several exceptions like this:
Exception in thread Thread-3:
Traceback (most recent call last):
File "/usr/lib/python2.6/threading.py", line 532, in __bootstrap_inner
self.run()
File "/usr/lib/python2.6/threading.py", line 484, in run
self.__target(*self.__args, **self.__kwargs)
File "./repeat_multiprocessing_bug.py", line 15, in thread_function
p.start()
File "/usr/lib/python2.6/multiprocessing/process.py", line 99, in start
_cleanup()
File "/usr/lib/python2.6/multiprocessing/process.py", line 53, in _cleanup
if p._popen.poll() is not None:
File "/usr/lib/python2.6/multiprocessing/forking.py", line 106, in poll
pid, sts = os.waitpid(self.pid, flag)
OSError: [Errno 10] No child processes
Python version 2.6.5. Can somebody explain what I'm doing wrong?
You're probably trying to run it from the interactive interpreter. Try writing your code to a file and running it as a Python script; it works on my machine.
See the explanation and examples at the Python multiprocessing docs.
The multiprocessing module has a thread-safety issue in 2.6.5. Your best bet is to update to a newer Python, or to apply this patch to 2.6.5: http://hg.python.org/cpython/rev/41aef062d529/
The bug is described in more detail in the following links:
http://bugs.python.org/issue11891
http://bugs.python.org/issue1731717
Related
Due to production needs, we updated our project to Python 3.9, but it stopped working because of RuntimeError: can't register atexit after shutdown, which did not occur with Python 3.7. Our project has many threads, and each thread might spawn sub-threads. We used threading.Thread at the higher levels and concurrent.futures.ThreadPoolExecutor at the bottom level. For example, the following code works on 3.7 but not on 3.9:
from threading import Thread
import concurrent.futures

def func1():
    print("func1 start")

def func2():
    print("func2 start")

def func3():
    with concurrent.futures.ThreadPoolExecutor() as executor:
        print("func3 start")
        future1 = executor.submit(func1)
        future2 = executor.submit(func2)
        concurrent.futures.wait([future1, future2])
        print("func3 end")

thread1 = Thread(target=func1)
thread3 = Thread(target=func3)
thread1.start()
thread3.start()
thread1.join()
thread3.join()
with the following error in 3.9:
func1 start
Exception in thread Thread-2:
Traceback (most recent call last):
File "C:\my_project\lib\threading.py", line 973, in _bootstrap_inner
self.run()
File "C:\my_project\lib\threading.py", line 910, in run
self._target(*self._args, **self._kwargs)
File "C:\my_projectr\tests\thread_test.py", line 13, in func2
with concurrent.futures.ThreadPoolExecutor() as executor:
File "C:\my_project\lib\concurrent\futures\__init__.py", line 49, in __getattr__
from .thread import ThreadPoolExecutor as te
File "C:\my_project\lib\concurrent\futures\thread.py", line 37, in <module>
threading._register_atexit(_python_exit)
File "C:\my_project\lib\threading.py", line 1407, in _register_atexit
raise RuntimeError("can't register atexit after shutdown")
RuntimeError: can't register atexit after shutdown
After some experimenting, I realized that in Python 3.9 a ThreadPoolExecutor cannot be used under a Thread, while a Thread can still be used under a ThreadPoolExecutor.
My questions are:
Is this behaviour change intended? Why?
What would be a proper way to use multi-level threading in Python 3.9?
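For reference, the direction that does still work can be sketched like this: spawning plain Thread objects from inside executor workers, rather than creating an executor inside a Thread. The function names and doubling logic here are made up for illustration:

```python
from threading import Thread
from concurrent.futures import ThreadPoolExecutor

results = []

def leaf(n):
    # illustrative leaf task run in a plain Thread
    results.append(n * 2)

def worker(n):
    # a Thread started from inside an executor worker: allowed in 3.9
    t = Thread(target=leaf, args=(n,))
    t.start()
    t.join()
    return n

with ThreadPoolExecutor(max_workers=2) as executor:
    nums = list(executor.map(worker, [1, 2, 3]))

print(sorted(results))  # [2, 4, 6]
print(nums)             # [1, 2, 3]
```

Each worker joins its child thread before returning, so all leaf results are in place when the executor shuts down.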
I've been exploring queues, pipes, etc. for a project.
The following code was to learn how queues operate:
from multiprocessing import Process, Queue

def words(liss, q):
    for i in liss:
        q.put(str(i) + "flag")

def reading(q):
    while not q.empty():
        print(q.get())

if __name__ == '__main__':
    q = Queue()
    p1 = Process(target=words, args=([23, "Hello", "Hey", 78], q))
    p2 = Process(target=reading, args=(q,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
I've tried changing what I put in the queue and running the program with higher permissions. Both ran into this error:
Process Process-1:
Traceback (most recent call last):
File "C:\Users\Jonat\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\process.py", line 297, in _bootstrap
self.run()
File "C:\Users\Jonat\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\Jonat\PycharmProjects\Giraffe\Small-Tests.py", line 8, in words
q.put(liss)
File "C:\Users\Jonat\AppData\Local\Programs\Python\Python37-32\lib\multiprocessing\queues.py", line 82, in put
if not self._sem.acquire(block, timeout):
PermissionError: [WinError 5] Access is denied
Process finished with exit code 0
I've found one other post regarding this error, but I didn't quite understand it. Regardless, I'll link it here: What is the reason of this error: "PermissionError: [WinError 5] Access is denied"
Solved by #DipenShah.
This is a known issue with Python 3.7.2; upgrading or downgrading to a different version should fix it.
I am trying to multiprocess system commands, but can't get it to work with a simple program. The function runit(cmd) works fine on its own, though...
#!/usr/bin/python3
from subprocess import call, run, PIPE, Popen
from multiprocessing import Pool
import os

pool = Pool()

def runit(cmd):
    proc = Popen(cmd, shell=True, stdout=PIPE, stderr=PIPE, universal_newlines=True)
    return proc.stdout.read()

#print(runit('ls -l'))
it = []
for i in range(1, 3):
    it.append('ls -l')

results = pool.map(runit, it)
It outputs:
Process ForkPoolWorker-1:
Process ForkPoolWorker-2:
Traceback (most recent call last):
Traceback (most recent call last):
File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
self.run()
File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.5/multiprocessing/pool.py", line 108, in worker
task = get()
File "/usr/lib/python3.5/multiprocessing/queues.py", line 345, in get
return ForkingPickler.loads(res)
AttributeError: Can't get attribute 'runit' on <module '__main__' from './syscall.py'>
File "/usr/lib/python3.5/multiprocessing/process.py", line 249, in _bootstrap
self.run()
File "/usr/lib/python3.5/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.5/multiprocessing/pool.py", line 108, in worker
task = get()
File "/usr/lib/python3.5/multiprocessing/queues.py", line 345, in get
return ForkingPickler.loads(res)
AttributeError: Can't get attribute 'runit' on <module '__main__' from './syscall.py'>
Then it waits and does nothing, and when I press Ctrl+C a few times it spits out:
^CProcess ForkPoolWorker-4:
Process ForkPoolWorker-6:
Traceback (most recent call last):
File "./syscall.py", line 17, in <module>
Process ForkPoolWorker-5:
results = pool.map(runit, it)
File "/usr/lib/python3.5/multiprocessing/pool.py", line 260, in map
...
buf = self._recv(4)
File "/usr/lib/python3.5/multiprocessing/connection.py", line 379, in _recv
chunk = read(handle, remaining)
KeyboardInterrupt
I'm not sure, since the issue I know of is Windows-related (and I don't have a Linux box handy to reproduce it on), but to be portable you have to wrap your multiprocessing-dependent commands in an if __name__ == "__main__" guard, or it conflicts with the way Python spawns the worker processes. This fixed example runs fine on Windows (and should work on other platforms as well):
from subprocess import Popen, PIPE
from multiprocessing import Pool
import os

def runit(cmd):
    proc = Popen(cmd, shell=True, stdout=PIPE, stderr=PIPE, universal_newlines=True)
    return proc.stdout.read()

#print(runit('ls -l'))
it = []
for i in range(1, 3):
    it.append('ls -l')

if __name__ == "__main__":
    # all calls into the multiprocessing module are "protected" by this guard
    pool = Pool()
    results = pool.map(runit, it)
    print(results)
(Studying the error messages more closely, I'm now fairly sure that just moving pool = Pool() after the definition of runit would also fix it on Linux, but wrapping it in the __main__ guard both fixes it and makes it portable.)
That said, note that your multiprocessing here just creates new processes to run commands that are themselves processes, so you'd be better off with a thread pool (see Threading pool similar to the multiprocessing Pool?): threads that create processes, like this:
from subprocess import Popen, PIPE
from multiprocessing.pool import ThreadPool  # uses threads, not processes
import os

def runit(cmd):
    proc = Popen(cmd, shell=True, stdout=PIPE, stderr=PIPE, universal_newlines=True)
    return proc.stdout.read()

it = []
for i in range(1, 3):
    it.append('ls -l')

if __name__ == "__main__":
    pool = ThreadPool()  # ThreadPool instead of Pool
    results = pool.map(runit, it)
    print(results)
The latter solution is more lightweight and less issue-prone (multiprocessing is a delicate module to handle): you can work with objects, shared data, and so on without needing a Manager object, among other advantages.
I am getting the following error
Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
    self.run()
  File "python/file_download.py", line 30, in run
    self._downloadFile(host[0], host[1])
  File "python/file_download.py", line 57, in _downloadFile
    th.exit()
AttributeError: 'Thread' object has no attribute 'exit'
from the code below:
th = threading.Thread(
    target=self._fileWriteToDisk,
    args=(saveTo, u, file_name),
    name="fileWrite_Child_of_%s" % self.getName(),
)
th.setDaemon(False)
th.start()
print "Writing to disk using child: %s " % th.name
th.exit()
There is no need to kill the thread; Python threads end on their own once their target function returns.
Abruptly killing a thread is bad practice anyway: the thread could be holding resources that then aren't released properly. A stop flag is a better way to end a thread; see this for an example.
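A minimal sketch of the stop-flag pattern, using threading.Event (the names and the counter here are illustrative, not from the asker's code):

```python
import threading
import time

stop_event = threading.Event()
counter = 0

def worker():
    global counter
    # loop until the main thread signals us to stop
    while not stop_event.is_set():
        counter += 1
        time.sleep(0.01)

t = threading.Thread(target=worker)
t.start()
time.sleep(0.1)     # let the worker run for a while
stop_event.set()    # ask the thread to finish its current iteration and exit
t.join()
print(counter > 0)  # True
```

The thread exits cleanly at a point of its own choosing, so any resources it holds can be released properly.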
Also, you should be using the following instead of what you have:
th.daemon = False
as th.setDaemon(False) is deprecated.
My coworker asked for my help with a problem he was having with a daemon script he is working on. He was having a strange error involving a multiprocessing.Manager, which I managed to reproduce with the following five lines:
import multiprocessing, os, sys

mgr = multiprocessing.Manager()
pid = os.fork()
if pid > 0:
    sys.exit(0)
When run on CentOS 6 Linux and Python 2.6, I get the following error:
Traceback (most recent call last):
File "/usr/lib64/python2.6/multiprocessing/util.py", line 235, in _run_finalizers
finalizer()
File "/usr/lib64/python2.6/multiprocessing/util.py", line 174, in __call__
res = self._callback(*self._args, **self._kwargs)
File "/usr/lib64/python2.6/multiprocessing/managers.py", line 576, in _finalize_manager
if process.is_alive():
File "/usr/lib64/python2.6/multiprocessing/process.py", line 129, in is_alive
assert self._parent_pid == os.getpid(), 'can only test a child process'
AssertionError: can only test a child process
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
File "/usr/lib64/python2.6/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "/usr/lib64/python2.6/multiprocessing/util.py", line 269, in _exit_function
p.join()
File "/usr/lib64/python2.6/multiprocessing/process.py", line 117, in join
assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process
Error in sys.exitfunc:
Traceback (most recent call last):
File "/usr/lib64/python2.6/atexit.py", line 24, in _run_exitfuncs
func(*targs, **kargs)
File "/usr/lib64/python2.6/multiprocessing/util.py", line 269, in _exit_function
p.join()
File "/usr/lib64/python2.6/multiprocessing/process.py", line 117, in join
assert self._parent_pid == os.getpid(), 'can only join a child process'
AssertionError: can only join a child process
I suspect the error is due to some interaction between os.fork and the multiprocessing.Manager, and that he should use the multiprocessing module to create new processes instead of os.fork. Can anyone confirm this and/or explain what is going on? If my hunch is correct, why is this the wrong place to use os.fork?
The issue is that Manager creates a helper process and registers a finalizer to stop it at interpreter exit. Since the process's memory is copied (lazily) during the fork, both the parent and the child try to stop that helper process and wait for it to stop; however, as the exception says, only the parent process is allowed to do that. If you use multiprocessing.Process instead of os.fork, the spawned process won't try to shut down the Manager at exit.