any code with multiprocessing in python not working - python

I have a simple multiprocessing code
from multiprocessing import Pool
def f(x):
return x*x
if __name__ == '__main__':
with Pool(5) as p:
print(p.map(f, [1, 2, 3]))
This code gives same error repeated over(as given below) . Not only this, any library/function which internally calls multiprocessing module(eg-reverse_geocoder, ProcessPoolExecutor in concurrent.futures ) shows same error while ThreadPoolExecutor runs just fine!!! I am trying to run this in Pycharm, also, same code runs fine on another Pc in Pycharm
Please help.
Error
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Program Files\Python37\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\Program Files\Python37\lib\multiprocessing\spawn.py", line 114, in _main
prepare(preparation_data)
File "C:\Program Files\Python37\lib\multiprocessing\spawn.py", line 225, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Program Files\Python37\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
run_name="__mp_main__")
File "C:\Program Files\Python37\lib\runpy.py", line 261, in run_path
code, fname = _get_code_from_file(run_name, path_name)
File "C:\Program Files\Python37\lib\runpy.py", line 231, in _get_code_from_file
with open(fname, "rb") as f:
OSError: [Errno 22] Invalid argument: 'D:\\p\\r\\i\\b\\<input>'

Related

why multiprocess works differently in ubuntu vs macOS?

I have this simple code to make things simpler.
from multiprocessing import Process
def abc():
print("HI")
print(a)
a = 4
p = Process(target = abc)
p.start()
It works perfectly fine in ubuntu (Python 3.8.5) and provides the output:
HI
4
However, it fails in spyder (Python 3.9.5) "AttributeError: Can't get attribute 'abc' on <module 'main' (built-in)>" and macOS (Python 3.8.10, I tried other versions as well and failed) CLI "RuntimeError".
Spyder error:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "multiprocessing/spawn.pyc", line 116, in spawn_main
File "multiprocessing/spawn.pyc", line 126, in _main
AttributeError: Can't get attribute 'abc' on <module '__main__' (built-in)>
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "multiprocessing/spawn.pyc", line 116, in spawn_main
File "multiprocessing/spawn.pyc", line 126, in _main
AttributeError: Can't get attribute 'abc' on <module '__main__' (built-in)>
MacOS-BigSur error:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "/Users/asavas/opt/anaconda3/lib/python3.8/multiprocessing/spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "/Users/asavas/opt/anaconda3/lib/python3.8/multiprocessing/spawn.py", line 125, in _main
prepare(preparation_data)
File "/Users/asavas/opt/anaconda3/lib/python3.8/multiprocessing/spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "/Users/asavas/opt/anaconda3/lib/python3.8/multiprocessing/spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "/Users/asavas/opt/anaconda3/lib/python3.8/runpy.py", line 265, in run_path
return _run_module_code(code, init_globals, run_name,
File "/Users/asavas/opt/anaconda3/lib/python3.8/runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "/Users/asavas/opt/anaconda3/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/Users/asavas/delete.py", line 8, in <module>
p.start()
File "/Users/asavas/opt/anaconda3/lib/python3.8/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/Users/asavas/opt/anaconda3/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/Users/asavas/opt/anaconda3/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
return Popen(process_obj)
File "/Users/asavas/opt/anaconda3/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/Users/asavas/opt/anaconda3/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/Users/asavas/opt/anaconda3/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 42, in _launch
prep_data = spawn.get_preparation_data(process_obj._name)
File "/Users/asavas/opt/anaconda3/lib/python3.8/multiprocessing/spawn.py", line 154, in get_preparation_data
_check_not_importing_main()
File "/Users/asavas/opt/anaconda3/lib/python3.8/multiprocessing/spawn.py", line 134, in _check_not_importing_main
raise RuntimeError('''
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
Trying to understand why and how can I resolve this issue?
Thanks
It appears that this version of macOS has switched over to using the spawn rather than fork method of creating new processes. This means that when a new process is created, a new, empty address space is created, a new Python interpreter is launched and the source file is re-executed from the top. Consequently, any code that is at global scope and not within a block that begins if __name__ == '__main__': will get executed. That is why any code that creates processes must be in such a block or you will get in a recursive, process-creating loop that generates the errors you see. You simply need:
from multiprocessing import Process
def abc():
print("HI")
print(a)
# this will get re-executed in the new subprocess:
a = 4
if __name__ == '__main__':
p = Process(target=abc)
p.start()
p.join() # explicitly wait for process to complete

Pandas Dedupe not working . Multiprocessing and Permission error

I was trying to clean up duplicates in an excel file using dedupe.
The code worked fine at first and the code itself is simple. But whenever I run the code I get the below error. The code works fine if I delete all the temp files, restart pycharm or restart my computer and it won't run for the second time.
The data file is a csv file with a list of random similar name in column A with header as 'Name'. Please help tp resolve. Thank you.
Code
import pandas as pd
import pandas_dedupe
#loading data
df = pd.read_csv('duplicate.csv')
#deduplication process
df_final = pandas_dedupe.dedupe_dataframe(df,['Name'])
#save to csv
df_final.to_csv('cleansed_output.csv')
Getting error below
C:\Users\Username\AppData\Roaming\Python\Python38\site-packages\pandas_dedupe\utility_functions.py:17: FutureWarning: The default value of regex will change from True to False in a future version.
df[i] = df[i].str.replace('[^\w\s\.\-\(\)\,\:\/\\\\]','')
Reading from dedupe_dataframe_learned_settings
Clustering...
Traceback (most recent call last):
File "C:\Users\Username\AppData\Roaming\Python\Python38\site-packages\dedupe\api.py", line 103, in score
matches = core.scoreDuplicates(pairs,
File "C:\Users\Username\AppData\Roaming\Python\Python38\site-packages\dedupe\core.py", line 244, in scoreDuplicates
process.start()
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\multiprocessing\process.py", line 121, in start
self._popen = self._Popen(self)
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\multiprocessing\context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\multiprocessing\context.py", line 327, in _Popen
return Popen(process_obj)
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\multiprocessing\popen_spawn_win32.py", line 45, in __init__
prep_data = spawn.get_preparation_data(process_obj._name)
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\multiprocessing\spawn.py", line 154, in get_preparation_data
_check_not_importing_main()
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\multiprocessing\spawn.py", line 134, in _check_not_importing_main
raise RuntimeError('''
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\multiprocessing\spawn.py", line 116, in spawn_main
exitcode = _main(fd, parent_sentinel)
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\multiprocessing\spawn.py", line 125, in _main
prepare(preparation_data)
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\multiprocessing\spawn.py", line 236, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\multiprocessing\spawn.py", line 287, in _fixup_main_from_path
main_content = runpy.run_path(main_path,
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 265, in run_path
return _run_module_code(code, init_globals, run_name,
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 97, in _run_module_code
_run_code(code, mod_globals, init_globals,
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\runpy.py", line 87, in _run_code
exec(code, run_globals)
File "C:\Users\Username\PycharmProjects\Duplicate\main.py", line 10, in <module>
df_final = pandas_dedupe.dedupe_dataframe(df,['Name'])
File "C:\Users\Username\AppData\Roaming\Python\Python38\site-packages\pandas_dedupe\dedupe_dataframe.py", line 249, in dedupe_dataframe
clustered_df = _cluster(deduper, data_d, threshold, canonicalize)
File "C:\Users\Username\AppData\Roaming\Python\Python38\site-packages\pandas_dedupe\dedupe_dataframe.py", line 143, in _cluster
clustered_dupes = deduper.partition(data, threshold)
File "C:\Users\Username\AppData\Roaming\Python\Python38\site-packages\dedupe\api.py", line 170, in partition
pair_scores = self.score(pairs)
File "C:\Users\Username\AppData\Roaming\Python\Python38\site-packages\dedupe\api.py", line 108, in score
raise RuntimeError('''
RuntimeError:
You need to either turn off multiprocessing or protect
the calls to the Dedupe methods with a
`if __name__ == '__main__'` in your main module, see
https://docs.python.org/3/library/multiprocessing.html#the-spawn-and-forkserver-start-methods
Traceback (most recent call last):
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\shutil.py", line 616, in _rmtree_unsafe
os.unlink(fullname)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\USERNAME~1.KAB\\AppData\\Local\\Temp\\tmpp9123_pc\\blocks.db'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\tempfile.py", line 802, in onerror
_os.unlink(path)
PermissionError: [WinError 32] The process cannot access the file because it is being used by another process: 'C:\\Users\\USERNAME~1.KAB\\AppData\\Local\\Temp\\tmpp9123_pc\\blocks.db'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\weakref.py", line 642, in _exitfunc
f()
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\weakref.py", line 566, in __call__
return info.func(*info.args, **(info.kwargs or {}))
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\tempfile.py", line 817, in _cleanup
cls._rmtree(name)
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\tempfile.py", line 813, in _rmtree
_shutil.rmtree(name, onerror=onerror)
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\shutil.py", line 740, in rmtree
return _rmtree_unsafe(path, onerror)
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\shutil.py", line 618, in _rmtree_unsafe
onerror(os.unlink, fullname, sys.exc_info())
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\tempfile.py", line 805, in onerror
cls._rmtree(path)
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\tempfile.py", line 813, in _rmtree
_shutil.rmtree(name, onerror=onerror)
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\shutil.py", line 740, in rmtree
return _rmtree_unsafe(path, onerror)
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\shutil.py", line 599, in _rmtree_unsafe
onerror(os.scandir, path, sys.exc_info())
File "C:\Users\Username\AppData\Local\Programs\Python\Python38\lib\shutil.py", line 596, in _rmtree_unsafe
with os.scandir(path) as scandir_it:
NotADirectoryError: [WinError 267] The directory name is invalid: 'C:\\Users\\USERNAME~1.KAB\\AppData\\Local\\Temp\\tmpp9123_pc\\blocks.db'
Process finished with exit code -1
The answer is in the error:
You need to either turn off multiprocessing or protect the calls to the Dedupe methods with a if __name__ == '__main__' in your main module
Change your code to the following, and try again:
import pandas as pd
import pandas_dedupe
if __name__ == "__main__":
#loading data
df = pd.read_csv('duplicate.csv')
#deduplication process
df_final = pandas_dedupe.dedupe_dataframe(df,['Name'])
#save to csv
df_final.to_csv('cleansed_output.csv')

Problems with multiprocessing

I'm trying to implement a python script which reads the content of a pdf file and move that file to a specific directory.
On my Debian machine it works without any problem. But on my Xubuntu system I"m getting the following error:
Traceback (most recent call last):
File "/usr/lib/python3.6/threading.py", line 916, in _bootstrap_inner
self.run()
File "/usr/lib/python3.6/threading.py", line 864, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.6/multiprocessing/pool.py", line 463, in _handle_results
task = get()
File "/usr/lib/python3.6/multiprocessing/connection.py", line 251, in recv
return _ForkingPickler.loads(buf.getbuffer())
TypeError: __init__() takes 1 positional argument but 2 were given
At this point, the script halts until I cancel it with KeyboardInerrupt, which gives me the rest of the error:
Process ForkPoolWorker-5:
Process ForkPoolWorker-6:
Process ForkPoolWorker-3:
Traceback (most recent call last):
Traceback (most recent call last):
File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.6/multiprocessing/pool.py", line 108, in worker
task = get()
File "/usr/lib/python3.6/multiprocessing/pool.py", line 108, in worker
task = get()
Traceback (most recent call last):
File "/usr/lib/python3.6/multiprocessing/queues.py", line 334, in get
with self._rlock:
File "/usr/lib/python3.6/multiprocessing/queues.py", line 334, in get
with self._rlock:
File "/usr/lib/python3.6/multiprocessing/synchronize.py", line 95, in __enter__
return self._semlock.__enter__()
File "/usr/lib/python3.6/multiprocessing/synchronize.py", line 95, in __enter__
return self._semlock.__enter__()
KeyboardInterrupt
KeyboardInterrupt
Process ForkPoolWorker-1:
File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.6/multiprocessing/pool.py", line 108, in worker
task = get()
File "/usr/lib/python3.6/multiprocessing/queues.py", line 334, in get
with self._rlock:
File "/usr/lib/python3.6/multiprocessing/synchronize.py", line 95, in __enter__
return self._semlock.__enter__()
KeyboardInterrupt
Traceback (most recent call last):
File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.6/multiprocessing/pool.py", line 108, in worker
task = get()
File "/usr/lib/python3.6/multiprocessing/queues.py", line 335, in get
res = self._reader.recv_bytes()
File "/usr/lib/python3.6/multiprocessing/connection.py", line 216, in recv_bytes
buf = self._recv_bytes(maxlength)
File "/usr/lib/python3.6/multiprocessing/connection.py", line 407, in _recv_bytes
buf = self._recv(4)
File "/usr/lib/python3.6/multiprocessing/connection.py", line 379, in _recv
chunk = read(handle, remaining)
KeyboardInterrupt
Traceback (most recent call last):
File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
self.run()
File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
self._target(*self._args, **self._kwargs)
File "/usr/lib/python3.6/multiprocessing/pool.py", line 108, in worker
task = get()
File "/usr/lib/python3.6/multiprocessing/queues.py", line 334, in get
with self._rlock:
File "/usr/lib/python3.6/multiprocessing/synchronize.py", line 95, in __enter__
return self._semlock.__enter__()
KeyboardInterrupt
I don't know how to fix this issue. Hope you guys can give a hint.
Thank y'all so far!
EDIT The code of the script:
from datetime import date
from multiprocessing import Pool
from pdf2image import convert_from_path
from os import listdir, remove
from os.path import isfile, join, abspath, split, exists
import pytesseract
import sys
import os
import re
import tempfile
tmp_path = tempfile.gettempdir() # replace with given output directory
def run(path):
PDF_file = abspath(path) # use absolute path of pdf file
pages = convert_from_path(PDF_file, 500)
page = pages[0]
imgFile = abspath(join(tmp_path, "document"+str(date.today())+".jpg"))
# save image to temp path
page.save(imgFile, 'JPEG')
# get text from image of page 1
text = str(((pytesseract.image_to_string(Image.open(imgFile)))))
if exists(imgFile):
os.remove(imgFile)
match = re.search(r"(Vertragsnummer\:\s)(\d+)\w+", text)
if match == None:
print("Could not find contract id")
exit(1)
else:
f = split(PDF_file)
d = join(tmp_path, match.group(2))
if not exists(d):
os.mkdir(d)
PDF_file_new = join(d, f[1])
print("New file: "+PDF_file_new)
os.rename(PDF_file, PDF_file_new)
def run_in_dir(directory):
files = [join(directory, f)
for f in listdir(directory) if isfile(join(directory, f))]
with Pool() as p:
p.map_async(run, files)
p.close()
p.join()
if __name__ == "__main__":
import argparse
import cProfile
parser = argparse.ArgumentParser(description="")
parser.add_argument("-p", "--path", help="Path to specific PDF file.")
parser.add_argument("-d", "--directory",
help="Path to folder containing PDF files.")
args = parser.parse_args()
# run(args.path)
print(cProfile.run("run_in_dir(args.directory)"))
Try running the script without multiprocessing. In my case I found that
pytesseract.pytesseract.TesseractNotFoundError: tesseract is not installed or it's not in your path
Here's how to install it.
I have no idea why the error message with multiprocessing is so unclear.
Also, remove exit(1) since it's intended for interactive shells not scripts.

Python inputs library - 'NoneType' object has no attribute 'terminate'

I am trying to use the inputs library to get user input from mice, gamepads, and keyboards.
I tried the following code which is supposed to read events from all devices:
import inputs
while True:
for device in inputs.devices:
for event in device.read():
print(event)
There is a problem when I run the code - I get the following error: AttributeError: 'NoneType' object has no attribute 'terminate'
I have also tried to read a single event:
import inputs
while True:
for device in inputs.devices:
event = device.read()
print(event)
This gives me the same error.
I am using Python3.6 and inputs==0.4 from pip
Does anyone know how to fix this error?
Full traceback:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\python36\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\python36\lib\multiprocessing\spawn.py", line 114, in _main
prepare(preparation_data)
File "C:\python36\lib\multiprocessing\spawn.py", line 225, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\python36\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
run_name="__mp_main__")
File "C:\python36\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "C:\python36\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "C:\python36\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\David\Documents\GitHub\Bubbles\testing.py", line 5, in <module>
event = device.read()
File "C:\python36\lib\site-packages\inputs.py", line 2313, in read
return next(iter(self))
File "C:\python36\lib\site-packages\inputs.py", line 2273, in __iter__
event = self._do_iter()
File "C:\python36\lib\site-packages\inputs.py", line 2292, in _do_iter
data = self._get_data(read_size)
File "C:\python36\lib\site-packages\inputs.py", line 2365, in _get_data
return self._pipe.recv_bytes()
File "C:\python36\lib\site-packages\inputs.py", line 2330, in _pipe
self._listener.start()
File "C:\python36\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\python36\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\python36\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\python36\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
prep_data = spawn.get_preparation_data(process_obj._name)
File "C:\python36\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
_check_not_importing_main()
File "C:\python36\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
is not going to be frozen to produce an executable.''')
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
Exception ignored in: <bound method InputDevice.__del__ of inputs.Keyboard("/dev/input/by-id/usb-A_Nice_Keyboard-event-kbd")>
Traceback (most recent call last):
File "C:\python36\lib\site-packages\inputs.py", line 2337, in __del__
File "C:\python36\lib\multiprocessing\process.py", line 116, in terminate
AttributeError: 'NoneType' object has no attribute 'terminate'
Define your steps under a method like:
def func():
while True:
for device in inputs.devices:
event = device.read()
print(event)
And then call your function using:
if __name__ == '__main__':
func()

ThreadPool is OK, Pool raises RuntimeError

i'm new to Python and i'm trying to organize some kind of timeout, when process hangs. Following code works fine:
from multiprocessing.pool import ThreadPool
pool = ThreadPool(processes=1)
file_list = []
while not file_list:
async_result = pool.apply_async(list_retriever,)
try:
file_list = async_result.get(15)
except:
print('We\'ve got timeout!\n')
I've found out that ThreadPool is not documented properly, and I've decided to switch to the Pool instead. Following code raises RuntimeError:
from multiprocessing.pool import Pool
pool = Pool(processes=1)
file_list = []
while not file_list:
async_result = pool.apply_async(list_retriever,)
try:
file_list = async_result.get(15)
except:
print('We\'ve got timeout!\n')
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Users\Bodnya\AppData\Local\Programs\Python\Python35\lib\multiprocessing\spawn.py", line 106, in spawn_main
exitcode = _main(fd)
File "C:\Users\Bodnya\AppData\Local\Programs\Python\Python35\lib\multiprocessing\spawn.py", line 115, in _main
prepare(preparation_data)
File "C:\Users\Bodnya\AppData\Local\Programs\Python\Python35\lib\multiprocessing\spawn.py", line 226, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\Users\Bodnya\AppData\Local\Programs\Python\Python35\lib\multiprocessing\spawn.py", line 278, in _fixup_main_from_path
run_name="__mp_main__")
File "C:\Users\Bodnya\AppData\Local\Programs\Python\Python35\lib\runpy.py", line 254, in run_path
pkg_name=pkg_name, script_name=fname)
File "C:\Users\Bodnya\AppData\Local\Programs\Python\Python35\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "C:\Users\Bodnya\AppData\Local\Programs\Python\Python35\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "C:\Users\Bodnya\PycharmProjects\zuf-test-branch\zuf-test\auto-deploy.py", line 69, in <module>
pool = Pool(processes=1)
File "C:\Users\Bodnya\AppData\Local\Programs\Python\Python35\lib\multiprocessing\pool.py", line 168, in __init__
self._repopulate_pool()
File "C:\Users\Bodnya\AppData\Local\Programs\Python\Python35\lib\multiprocessing\pool.py", line 233, in _repopulate_pool
w.start()
File "C:\Users\Bodnya\AppData\Local\Programs\Python\Python35\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Users\Bodnya\AppData\Local\Programs\Python\Python35\lib\multiprocessing\context.py", line 313, in _Popen
return Popen(process_obj)
File "C:\Users\Bodnya\AppData\Local\Programs\Python\Python35\lib\multiprocessing\popen_spawn_win32.py", line 34, in __init__
prep_data = spawn.get_preparation_data(process_obj._name)
File "C:\Users\Bodnya\AppData\Local\Programs\Python\Python35\lib\multiprocessing\spawn.py", line 144, in get_preparation_data
_check_not_importing_main()
File "C:\Users\Bodnya\AppData\Local\Programs\Python\Python35\lib\multiprocessing\spawn.py", line 137, in _check_not_importing_main
is not going to be frozen to produce an executable.''')
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
And this error message keeps showing in the endless loop. Can you please tell me what I'm doing wrong?
UPDATE
Apparently I did not protect the main code like this to avoid creating subprocesses recursively, but why it worked in the first example?
Here's working solution:
def function():
from multiprocessing.pool import ThreadPool
pool = ThreadPool(processes=3)
file_list = []
for i in range(3)
async_result = pool.apply_async(list_retriever,)
try:
file_list = async_result.get(15)
except:
print('We\'ve got timeout!\n')
continue
if __name__ == '__main__':
function()
However, still, it soent explain, why it was working in the first place.

Categories

Resources