I'm trying to use a multiprocessing.Array in two separate processes in Python 3.7.4 (macOS 10.14.6). I start off by creating a new process using the spawn context, passing an Array object to it as an argument:
import multiprocessing, time, ctypes

def fn(c):
    time.sleep(1)
    print("value:", c.value)

def main():
    ctx = multiprocessing.get_context("spawn")
    arr = multiprocessing.Array(ctypes.c_char, 32)
    p = ctx.Process(target=fn, args=(arr,))
    p.start()
    arr.value = b"hello"
    p.join()

if __name__ == "__main__":
    main()
However, when I try to read it, I get the following error:
Process SpawnProcess-1:
Traceback (most recent call last):
File "/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
self.run()
File "/usr/local/Cellar/python/3.7.4/Frameworks/Python.framework/Versions/3.7/lib/python3.7/multiprocessing/process.py", line 99, in run
self._target(*self._args, **self._kwargs)
File "/Users/federico/Workspace/test/test.py", line 6, in fn
print("value:", c.value)
File "<string>", line 3, in getvalue
OSError: [Errno 9] Bad file descriptor
The expected output, however, is value: hello. Anyone know what could be going wrong here? Thanks.
The array also needs to be created from the same context that you use for the process, like so:
import multiprocessing, time
import ctypes

def fn(arr):
    time.sleep(1)
    print("value:", arr.value)

def main():
    ctx = multiprocessing.get_context("spawn")
    arr = ctx.Array(ctypes.c_char, 32)
    p = ctx.Process(target=fn, args=(arr,))
    p.start()
    arr.value = b'hello'
    p.join()

if __name__ == "__main__":
    main()
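As far as I can tell, the reason is that on Python 3.7 the default start method on macOS is still fork, so the bare multiprocessing.Array call creates its underlying lock under fork semantics, which assume the child inherits the parent's handles. A child started from a spawn context inherits nothing, so the descriptor it unpickles is invalid, which is what surfaces as the Errno 9. Creating the array from the same spawn context keeps the two consistent.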
Hi, I'm referencing the following question because it's similar to what I'm trying to achieve. However, I'm getting an error that I can't seem to figure out, so I'm looking for some help:
Combining multithreading and multiprocessing with concurrent.futures
Here's my test code:
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
import numpy as np
from os import cpu_count
from functools import partial

num_list = range(0, 1000)

def test(x):
    x**2

def multithread(f, lst):
    print('Thread running')
    with ThreadPoolExecutor() as thread_executor:
        thread_executor.map(f, lst)

def multiprocesser(lst, f, n_processors=cpu_count()//2):
    chunks = np.array_split(lst, n_processors)
    with ProcessPoolExecutor(max_workers=n_processors) as mp_executor:
        mp_executor.map(partial(multithread, f), chunks)

if __name__ == '__main__':
    multiprocesser(num_list, test)
Process SpawnProcess-31:
Traceback (most recent call last):
File "C:\Users\Test_user\Anaconda3\envs\test_env\lib\multiprocessing\process.py", line 315, in _bootstrap
self.run()
File "C:\Users\Test_user\Anaconda3\envs\test_env\lib\multiprocessing\process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\Test_user\Anaconda3\envs\test_env\lib\concurrent\futures\process.py", line 237, in _process_worker
call_item = call_queue.get(block=True)
File "C:\Users\Test_user\Anaconda3\envs\test_env\lib\multiprocessing\queues.py", line 122, in get
return _ForkingPickler.loads(res)
AttributeError: Can't get attribute 'multithread' on <module '__main__' (built-in)>
Process SpawnProcess-32:
Traceback (most recent call last):
File "C:\Users\Test_user\Anaconda3\envs\test_env\lib\multiprocessing\process.py", line 315, in _bootstrap
self.run()
File "C:\Users\Test_user\Anaconda3\envs\test_env\lib\multiprocessing\process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "C:\Users\Test_user\Anaconda3\envs\test_env\lib\concurrent\futures\process.py", line 237, in _process_worker
call_item = call_queue.get(block=True)
File "C:\Users\Test_user\Anaconda3\envs\test_env\lib\multiprocessing\queues.py", line 122, in get
return _ForkingPickler.loads(res)
AttributeError: Can't get attribute 'multithread' on <module '__main__' (built-in)>
So I didn't specify the number of threads (I don't see a reason to for the thread pool executor). I'm having trouble understanding what the error actually means and how I can fix it. Any help would be appreciated.
The error probably stems from the fact that multithread() is being called incorrectly.
Try this:
from concurrent.futures import ThreadPoolExecutor, ProcessPoolExecutor
import numpy as np
from os import cpu_count
from functools import partial

num_list = range(0, 1000)

def test(x):
    x**2

def multithread(f, lst):
    print('Thread running')
    with ThreadPoolExecutor() as thread_executor:
        thread_executor.map(f, lst)

def multiprocesser(lst, f, n_processors=cpu_count()//2):
    chunks = np.array_split(lst, n_processors)
    with ProcessPoolExecutor(max_workers=n_processors) as mp_executor:
        mp_executor.map(partial(multithread, f), chunks)

if __name__ == '__main__':
    multiprocesser(num_list, test)
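One caveat worth adding here (a general concurrent.futures point, not something specific to this thread): Executor.map returns a lazy iterator, and an exception raised inside a worker only resurfaces when you consume the corresponding result. Since the results above are discarded, worker failures can pass silently; forcing the iterator makes them visible:

with ProcessPoolExecutor(max_workers=n_processors) as mp_executor:
    # Consuming the iterator re-raises any exception thrown in a worker.
    results = list(mp_executor.map(partial(multithread, f), chunks))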
Missing if __name__ == '__main__'
if __name__ == '__main__':
    multiprocesser(num_list, test)
Unintended recursion
If you don't guard the call to multiprocesser(), you get recursion when the subprocess imports and runs the Python script.
Safe importing of main module
The following is an example of the same type of problem, from the multiprocessing docs:
https://docs.python.org/3/library/multiprocessing.html?highlight=multiprocess#the-spawn-and-forkserver-start-methods
Make sure that the main module can be safely imported by a new Python
interpreter without causing unintended side effects (such as starting a
new process).
For example, using the spawn or forkserver start method running the
following module would fail with a RuntimeError:
from multiprocessing import Process

def foo():
    print('hello')

p = Process(target=foo)
p.start()
Instead one should protect the
“entry point” of the program by using if __name__ == '__main__': as
follows:
from multiprocessing import Process, freeze_support, set_start_method

def foo():
    print('hello')

if __name__ == '__main__':
    freeze_support()
    set_start_method('spawn')
    p = Process(target=foo)
    p.start()
I have a Mac (macOS 11.1, Python 3.8.2) and need to use multiprocessing, but the procedure doesn't work.
import multiprocessing

def func(index: int):
    print(index)

manager = multiprocessing.Manager()

processes = []
for i in range(-1, 10):
    p = multiprocessing.Process(target=func, args=(i,))
    processes.append(p)
    p.start()

for process in processes:
    process.join()
However, on my Intel-based Mac, it works fine.
What I expect is
-1
0
1
2
3
4
5
6
7
8
9
But instead, I got an error:
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
Traceback (most recent call last):
File "/Users/lance/Documents/Carleton Master Projects/Carleton-Master-Thesis/experiment.py", line 7, in <module>
manager = multiprocessing.Manager()
File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/multiprocessing/context.py", line 57, in Manager
m.start()
File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/multiprocessing/managers.py", line 583, in start
self._address = reader.recv()
File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/multiprocessing/connection.py", line 250, in recv
buf = self._recv_bytes()
File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/multiprocessing/connection.py", line 414, in _recv_bytes
buf = self._recv(4)
File "/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.8/lib/python3.8/multiprocessing/connection.py", line 383, in _recv
raise EOFError
EOFError
Is there a similar (and still simple) way to parallelize on an M1-based Mac?
I'm not sure why this works on an Intel machine, but the problem is that all the MP-related code should be inside if __name__ == '__main__':
import multiprocessing

def func(index: int):
    print(index)

if __name__ == '__main__':
    manager = multiprocessing.Manager()
    processes = []
    for i in range(-1, 10):
        p = multiprocessing.Process(target=func, args=(i,))
        processes.append(p)
        p.start()
    for process in processes:
        process.join()
Since MP actually starts a new process, the new process needs to import the func function from this module to execute its task. If you don't protect the code that starts new processes, it will be executed during this import.
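To see why, note that since Python 3.8 the default start method on macOS is spawn (it used to be fork), and spawn re-imports the main module in every child. Here is a minimal sketch of my own to make that visible; any top-level statement runs again in each child:

import multiprocessing

print("top-level code running")  # executed by the parent AND by every spawned child

def func(index: int):
    print("child got", index)

if __name__ == '__main__':
    multiprocessing.set_start_method("spawn", force=True)
    p = multiprocessing.Process(target=func, args=(0,))
    p.start()
    p.join()

Running this prints "top-level code running" twice: once in the parent and once when the child re-imports the module. An unguarded Process(...).start() at top level would therefore try to start processes again during that re-import, which is exactly what the RuntimeError complains about.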
I have written the following code to understand sys.excepthook in a multiprocessing environment. I am using Python 3. I have created two processes which print and then wait for Ctrl+C.
from multiprocessing import Process
import multiprocessing
import sys
from time import sleep

class foo:
    def f(self, name):
        try:
            raise ValueError("test value error")
        except ValueError as e:
            print(e)
        print('hello', name)
        while True:
            pass

def myexcepthook(exctype, value, traceback):
    print("Value: {}".format(value))
    for p in multiprocessing.active_children():
        p.terminate()

def main(name):
    a = foo()
    a.f(name)

sys.excepthook = myexcepthook

if __name__ == '__main__':
    for i in range(2):
        p = Process(target=main, args=('bob', ))
        p.start()
I was expecting the following result when I press Ctrl+C:
python /home/test/test.py
test value error
hello bob
test value error
hello bob
Value: <KeyboardInterrupt>
But unfortunately, I got the following result.
/home/test/venvPython3/bin/python /home/test/test.py
test value error
hello bob
test value error
hello bob
Error in atexit._run_exitfuncs:
Traceback (most recent call last):
Process Process-1:
File "/usr/lib/python3.6/multiprocessing/popen_fork.py", line 28, in poll
pid, sts = os.waitpid(self.pid, flag)
KeyboardInterrupt
Traceback (most recent call last):
File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/test/test.py", line 26, in main
a.f(name)
File "/home/test/test.py", line 15, in f
pass
KeyboardInterrupt
Process Process-2:
Traceback (most recent call last):
File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/home/test/test.py", line 26, in main
a.f(name)
File "/home/test/test.py", line 15, in f
pass
KeyboardInterrupt
Process finished with exit code 0
It would be a great help if somebody could point out what I am doing wrong. Also, please let me know how to get the expected output.
You almost had it.
First, use exctype in the print:
def myexcepthook(exctype, value, traceback):
    print("Value: {}".format(exctype))
    for p in multiprocessing.active_children():
        p.terminate()
Then join() the created processes to prevent a premature exit:
if __name__ == '__main__':
    pr = []
    for i in range(2):
        p = Process(target=main, args=('bob', ))
        p.start()
        pr.append(p)
    for p in pr:
        p.join()
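With both changes applied, pressing Ctrl+C while the parent is blocked in join() should raise KeyboardInterrupt in the main process, so the hook fires, prints something like Value: <class 'KeyboardInterrupt'>, and terminates the children. (Ctrl+C is delivered to the whole foreground process group, so the children may still print their own KeyboardInterrupt tracebacks if the signal reaches them before terminate() does; that is a timing race, not a bug in the hook.)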
I have code similar to what is shown below, and I want the proper solution to run it without errors. I got a shared memory error, and the GUI also opens many times.
import multiprocessing

class ProcessFiles:
    def __init__(self):
        self.lock = multiprocessing.Lock()

    def getPdfInfo(self, file):
        # READ FILE DATA AND DO SOME STUFF
        self.lock.acquire()
        # INSERT DATA INTO DATABASE
        self.lock.release()

mainapp = ProcessFiles()
p = multiprocessing.Pool()
p.map(mainapp.getPdfInfo, Files_list)
p.close()
and this is the error message:
TypeError: can't pickle sqlite3.Connection objects
I also tried with multiprocessing.Manager() and got errors as well. Code shown below:
import multiprocessing

class ProcessFiles:
    def __init__(self):
        m = multiprocessing.Manager()
        self.lock = m.Lock()

    def getPdfInfo(self, file):
        # READ FILE DATA AND DO SOME STUFF
        self.lock.acquire()
        # INSERT DATA INTO DATABASE
        self.lock.release()

mainapp = ProcessFiles()
p = multiprocessing.Pool()
p.map(mainapp.getPdfInfo, Files_list)
p.close()
and that's the error message:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Python37\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
...
_check_not_importing_main()
File "C:\Python37\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
is not going to be frozen to produce an executable.''')
RuntimeError:
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
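Both error messages point at the same two issues: the Pool is created at import time (no if __name__ == '__main__' guard, hence the RuntimeError), and p.map(mainapp.getPdfInfo, ...) has to pickle the bound method together with the whole instance, presumably including the lock and the sqlite3 connection the class holds (hence the TypeError). A minimal sketch of the usual pattern, with the file and database work left as placeholders since the question only hints at them: guard the entry point, and hand the lock to workers through the Pool initializer instead of an instance attribute.

import multiprocessing

lock = None  # set in each worker process by init_worker

def init_worker(l):
    # Runs once in every worker; stores the shared lock in a global.
    global lock
    lock = l

def getPdfInfo(file):
    # READ FILE DATA AND DO SOME STUFF (placeholder)
    with lock:
        # INSERT DATA INTO DATABASE (placeholder - open the sqlite3
        # connection here, inside the worker, since connections
        # cannot be pickled and sent between processes)
        pass

if __name__ == '__main__':
    files_list = ["a.pdf", "b.pdf"]  # placeholder for the question's Files_list
    shared_lock = multiprocessing.Lock()
    with multiprocessing.Pool(initializer=init_worker,
                              initargs=(shared_lock,)) as p:
        p.map(getPdfInfo, files_list)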
I'm experimenting with Python's multiprocessing. I struggled with a bug in my code and managed to narrow it down. However, I still don't know why this happens. What I'm posting is just sample code. If I import the tempfile module and change tempdir, the code crashes at pool creation. I'm using Python 2.7.5.
Here's the code:
from multiprocessing import Pool
import tempfile

tempfile.tempdir = "R:/"  # REMOVING THIS LINE FIXES THE ERROR

def f(x):
    return x*x

if __name__ == '__main__':
    pool = Pool(processes=4)            # start 4 worker processes
    result = pool.apply_async(f, [10])  # evaluate "f(10)" asynchronously
    print result.get(timeout=1)         # prints "100" unless your computer is *very* slow
    print pool.map(f, range(10))        # prints "[0, 1, 4,..., 81]"
Here's the error:
R:\>mp_pool_test.py
Traceback (most recent call last):
File "R:\mp_pool_test.py", line 11, in <module>
pool = Pool(processes=4) # start 4 worker processes
File "C:\Python27\lib\multiprocessing\__init__.py", line 232, in Pool
return Pool(processes, initializer, initargs, maxtasksperchild)
File "C:\Python27\lib\multiprocessing\pool.py", line 138, in __init__
self._setup_queues()
File "C:\Python27\lib\multiprocessing\pool.py", line 233, in _setup_queues
self._inqueue = SimpleQueue()
File "C:\Python27\lib\multiprocessing\queues.py", line 351, in __init__
self._reader, self._writer = Pipe(duplex=False)
File "C:\Python27\lib\multiprocessing\__init__.py", line 107, in Pipe
return Pipe(duplex)
File "C:\Python27\lib\multiprocessing\connection.py", line 223, in Pipe
1, obsize, ibsize, win32.NMPWAIT_WAIT_FOREVER, win32.NULL
WindowsError: [Error 123] The filename, directory name, or volume label syntax is incorrect
This code works fine.
from multiprocessing import Pool
import tempfile as TF

TF.tempdir = "R:/"

def f(x):
    return x*x

if __name__ == '__main__':
    print("test")
The bizarre thing is that in both versions I never actually use the tempdir I set, yet only the one that creates a Pool fails.
Interesting: it looks like you have a name collision, from what I can see in
"C:\Program Files\PYTHON\Lib\multiprocessing\connection.py".
It seems that multiprocessing is using tempfile as well.
That behavior should not happen, but it looks to me like the problem is at line 66 of connection.py:
elif family == 'AF_PIPE':
    return tempfile.mktemp(prefix=r'\\.\pipe\pyc-%d-%d-' %
                           (os.getpid(), _mmap_counter.next()))
I am still poking at this. I looked at globals() after importing tempfile and then tempfile as TF; different names exist, but now I am wondering about references, so I am trying to figure out whether they point to the same thing.
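For what it's worth, here is my reading of what happens, with a small sketch (mine, not from the thread). import tempfile as TF binds the same module object under a different name, so TF.tempdir = "R:/" mutates state that multiprocessing's connection.py also reads. mktemp() with no dir argument joins gettempdir() onto the prefix, so the named-pipe address stops being a bare \\.\pipe\... name:

import os
import tempfile
import tempfile as TF  # same module object, just another name for it

print(tempfile is TF)  # True: assigning TF.tempdir mutates tempfile too

tempfile.tempdir = "R:/"
# connection.py builds its pipe address roughly like this (the counter
# argument here is a stand-in for multiprocessing's internal one).
# With tempdir set, on Python 2.7 the join produces something like
# 'R:/\\.\pipe\pyc-1234-0-...', which Windows rejects as a pipe name.
print(tempfile.mktemp(prefix=r'\\.\pipe\pyc-%d-%d-' % (os.getpid(), 0)))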