Python 3.x multiprocessing tkinter mainloop

How do you use multiprocessing on root.mainloop? I am using Python 3.6. I need to run more lines of code after it, some of which require the root object.
I do not want to create a second object, as some of the other answers to this question suggest.
Here is a little code snippet (sett being a dict loaded from JSON):
from multiprocessing import Process
from tkinter import Tk

# sett is the dict loaded from JSON elsewhere in the program

def check():
    try:
        sett['setup']
    except KeyError:
        sett['troubleshoot_file'] = None
        check()
    else:
        if sett['setup'] is True:
            return
        elif type(sett['setup']) is not bool:
            raise TypeError('sett[\'setup\'] is not a type of boolean (\'bool\')')
    root = Tk()
    root['bg'] = 'blue'
    mainloop = Process(target=root.mainloop)
    mainloop.start()
    mainloop.join()

check()
However, I get this traceback:
Traceback (most recent call last):
File "(directory)/main.py", line 41, in <module>
check()
File "(directory)/main.py", line 39, in check
mainloop.start()
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "C:\Program Files (x86)\Python36-32\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle _tkinter.tkapp objects
I have tried running:
from queue import Queue
from tkinter import Tk
from multiprocessing import Process
p=Process(target=q.get())
The interpreter then completely crashes.

You cannot use any tkinter objects across multiple processes or threads. If you need to share data between the GUI and other processes, you will need to set up a queue and poll that queue from the GUI.
The reason for this is that tkinter is a wrapper around a Tcl interpreter that knows nothing about Python threads or processes.
The queue module is documented at:
docs.python.org/3.6/library/queue.html
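For illustration, here is a minimal sketch of that pattern (the names are just for this example): a worker process puts data on a multiprocessing.Queue, and the GUI polls the queue with after() so tkinter is only ever touched from the main process.

from multiprocessing import Process, Queue
from queue import Empty
from tkinter import Tk, Label

def worker(q):
    # runs in a separate process; it never touches any tkinter object
    q.put('hello from the worker')

def poll(root, label, q):
    # runs in the GUI (main) process; checks the queue without blocking
    try:
        label['text'] = q.get_nowait()
    except Empty:
        pass
    root.after(100, poll, root, label, q)

if __name__ == '__main__':
    q = Queue()
    Process(target=worker, args=(q,)).start()
    root = Tk()
    label = Label(root, text='waiting...')
    label.pack()
    poll(root, label, q)
    root.mainloop()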

Related

How to fix multiprocessing problems in Python on Windows 10

I'm trying to use this tutorial to train my own car model recognition model: https://github.com/Helias/Car-Model-Recognition. I want to use CUDA and my GPU to speed up training (the preprocessing step completed without any errors). But when I try to train my model, I get the following errors:
######### ERROR #######
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
######### batch #######
Traceback (most recent call last):
File "D:\Car-Model-Recognition\main.py", line 78, in train_model
######### ERROR #######
[Errno 32] Broken pipe
for i, batch in enumerate(loaders[mode]):
######### batch ####### File "C:\Program Files\Python37\lib\site-packages\torch\utils\data\dataloader.py", line 279, in __iter__
return _MultiProcessingDataLoaderIter(self)
Traceback (most recent call last):
File "C:\Program Files\Python37\lib\site-packages\torch\utils\data\dataloader.py", line 719, in __init__
File "main.py", line 78, in train_model
w.start()
File "C:\Program Files\Python37\lib\multiprocessing\process.py", line 112, in start
for i, batch in enumerate(loaders[mode]):
File "C:\Program Files\Python37\lib\site-packages\torch\utils\data\dataloader.py", line 279, in __iter__
self._popen = self._Popen(self)
File "C:\Program Files\Python37\lib\multiprocessing\context.py", line 223, in _Popen
return _MultiProcessingDataLoaderIter(self)
File "C:\Program Files\Python37\lib\site-packages\torch\utils\data\dataloader.py", line 719, in __init__
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Program Files\Python37\lib\multiprocessing\context.py", line 322, in _Popen
w.start()
return Popen(process_obj)
File "C:\Program Files\Python37\lib\multiprocessing\popen_spawn_win32.py", line 46, in __init__
File "C:\Program Files\Python37\lib\multiprocessing\process.py", line 112, in start
prep_data = spawn.get_preparation_data(process_obj._name)
File "C:\Program Files\Python37\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
self._popen = self._Popen(self)
File "C:\Program Files\Python37\lib\multiprocessing\context.py", line 223, in _Popen
_check_not_importing_main()
File "C:\Program Files\Python37\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Program Files\Python37\lib\multiprocessing\context.py", line 322, in _Popen
is not going to be frozen to produce an executable.''')
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
return Popen(process_obj)
I have used the exact code from the given link, and if I start my code using WSL everything is OK, but I can't use my GPU from WSL. Where should I insert this __name__ == '__main__' check to prevent such a mistake, or how can I disable this multiprocessing?
Looking at main.py, you run a lot of code at the module level. On Windows, python's multiprocessing module will start a new python interpreter, import your modules, unpickle a snapshot of your parent context and then call your worker function. The problem is that all of that module level code executes merely by import and you essentially run a new copy of your program instead of building a context for your worker.
The solution is two-fold. First, move all of the module-level code into functions. You want to be able to import your module without side effects. Second, call the function(s) that start your program from a conditional:
def main():
    # the stuff you were doing at module level
    ...

if __name__ == "__main__":
    main()
The reason this works is in the module name. When you run the top-level script of a Python program (e.g., python main.py), it's a script called "__main__", not a module. If a different program imports main, it's a module called "main" (or whatever you named your script). That 'if' stops your main code from executing if it's imported by some other Python code, such as the multiprocessing module.
It's okay to have some executable code at the module level, especially if you are setting up defaults and such. But don't do anything at the module level that you wouldn't want done if some other code imports your script.
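For example, a sketch of what the restructured main.py might look like (function and variable names here are illustrative, not the repository's actual code):

import torch
from torch.utils.data import DataLoader, TensorDataset

def build_loaders():
    # build datasets and DataLoaders inside a function, not at module level
    data = TensorDataset(torch.randn(64, 3), torch.randint(0, 2, (64,)))
    return {'train': DataLoader(data, batch_size=8, num_workers=2)}

def train_model(loaders):
    for i, (inputs, labels) in enumerate(loaders['train']):
        pass  # training step goes here

def main():
    train_model(build_loaders())

if __name__ == '__main__':
    main()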

How to run decorated function in a separate and terminatable process?

I am dealing with an existing test suite, where we want to implement a timeout functionality, which will cause a hanging test to time out and then move on with its regular teardown/cleanup.
I am toying with the idea of running each test in a process, which I can terminate after e.g. a timeout of 3 seconds. Ideally, I don't want to modify the test cases and instead just add a decorator indicating the test is affected by this timeout behavior.
This is what I have, a minimal example:
import multiprocessing
import sys
from time import sleep

def timeout(func):
    def wrapper():
        proc = multiprocessing.Process(target=func)
        proc.start()
        sleep(3)
        proc.terminate()
    return wrapper

@timeout
def my_test():
    while True:
        sleep(1)

if __name__ == "__main__":
    my_test()
But for some reason, it seems pickle cannot deal with this and the decorator somehow messes up the reference to the function, as this error is hit:
$ python multiproc.py
2019-11-07 07:34:37.098 | DEBUG | __main__:wrapper:13 - In wrapper
Traceback (most recent call last):
File "multiproc.py", line 30, in <module>
my_test()
File "multiproc.py", line 15, in wrapper
proc.start()
File "C:\Python36\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Python36\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Python36\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Python36\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "C:\Python36\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function my_test at 0x000001C0A7E87400>: it's not the same object as __main__.my_test
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\Python36\lib\multiprocessing\spawn.py", line 99, in spawn_main
new_handle = reduction.steal_handle(parent_pid, pipe_handle)
File "C:\Python36\lib\multiprocessing\reduction.py", line 82, in steal_handle
_winapi.PROCESS_DUP_HANDLE, False, source_pid)
OSError: [WinError 87] The parameter is incorrect
Does anyone have an idea if this can be solved without modifying the existing test case?
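For what it's worth, the mismatch can be reproduced without multiprocessing at all, since pickle serializes a function by its module-level name and the decorator has rebound that name (a hedged illustration of the cause, not a fix for the test suite):

import pickle

def timeout(func):
    def wrapper():
        # 'func' is the original function object, but the module-level name
        # 'my_test' now refers to this wrapper, so pickling by reference fails
        print(func is globals()['my_test'])   # False
        try:
            pickle.dumps(func)
        except pickle.PicklingError as exc:
            print(exc)   # ... it's not the same object as __main__.my_test
    return wrapper

@timeout
def my_test():
    pass

if __name__ == "__main__":
    my_test()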

Error pickling a `matlab` object in joblib `Parallel` context

I'm running some Matlab code in parallel from inside a Python context (I know, but that's what's going on), and I'm hitting an import error involving matlab.double. The same code works fine in a multiprocessing.Pool, so I am having trouble figuring out what the problem is. Here's a minimal reproducing test case.
import matlab
from multiprocessing import Pool
from joblib import Parallel, delayed

# A global object that I would like to be available in the parallel subroutine
x = matlab.double([[0.0]])

def f(i):
    print(i, x)

with Pool(4) as p:
    p.map(f, range(10))
# This prints 1, [[0.0]]\n2, [[0.0]]\n... as expected

for _ in Parallel(4, backend='multiprocessing')(delayed(f)(i) for i in range(10)):
    pass
# This also prints 1, [[0.0]]\n2, [[0.0]]\n... as expected

# Now run with default `backend='loky'`
for _ in Parallel(4)(delayed(f)(i) for i in range(10)):
    pass
# ^ this crashes.
So, the only problematic one is the one using the 'loky' backend.
The full traceback is:
exception calling callback for <Future at 0x7f63b5a57358 state=finished raised BrokenProcessPool>
joblib.externals.loky.process_executor._RemoteTraceback:
'''
Traceback (most recent call last):
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 391, in _process_worker
call_item = call_queue.get(block=True, timeout=timeout)
File "~/miniconda3/envs/myenv/lib/python3.6/multiprocessing/queues.py", line 113, in get
return _ForkingPickler.loads(res)
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/matlab/mlarray.py", line 31, in <module>
from _internal.mlarray_sequence import _MLArrayMetaClass
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/matlab/_internal/mlarray_sequence.py", line 3, in <module>
from _internal.mlarray_utils import _get_strides, _get_size, \
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/matlab/_internal/mlarray_utils.py", line 4, in <module>
import matlab
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/matlab/__init__.py", line 24, in <module>
from mlarray import double, single, uint8, int8, uint16, \
ImportError: cannot import name 'double'
'''
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/externals/loky/_base.py", line 625, in _invoke_callbacks
callback(self)
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/parallel.py", line 309, in __call__
self.parallel.dispatch_next()
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/parallel.py", line 731, in dispatch_next
if not self.dispatch_one_batch(self._original_iterator):
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
self._dispatch(tasks)
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/parallel.py", line 716, in _dispatch
job = self._backend.apply_async(batch, callback=cb)
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 510, in apply_async
future = self._workers.submit(SafeFunction(func))
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/externals/loky/reusable_executor.py", line 151, in submit
fn, *args, **kwargs)
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 1022, in submit
raise self._flags.broken
joblib.externals.loky.process_executor.BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.
joblib.externals.loky.process_executor._RemoteTraceback:
'''
Traceback (most recent call last):
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 391, in _process_worker
call_item = call_queue.get(block=True, timeout=timeout)
File "~/miniconda3/envs/myenv/lib/python3.6/multiprocessing/queues.py", line 113, in get
return _ForkingPickler.loads(res)
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/matlab/mlarray.py", line 31, in <module>
from _internal.mlarray_sequence import _MLArrayMetaClass
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/matlab/_internal/mlarray_sequence.py", line 3, in <module>
from _internal.mlarray_utils import _get_strides, _get_size, \
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/matlab/_internal/mlarray_utils.py", line 4, in <module>
import matlab
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/matlab/__init__.py", line 24, in <module>
from mlarray import double, single, uint8, int8, uint16, \
ImportError: cannot import name 'double'
'''
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "test.py", line 20, in <module>
for _ in Parallel(4)(delayed(f)(i) for i in range(10)):
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/parallel.py", line 934, in __call__
self.retrieve()
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/parallel.py", line 833, in retrieve
self._output.extend(job.get(timeout=self.timeout))
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 521, in wrap_future_result
return future.result(timeout=timeout)
File "~/miniconda3/envs/myenv/lib/python3.6/concurrent/futures/_base.py", line 432, in result
return self.__get_result()
File "~/miniconda3/envs/myenv/lib/python3.6/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/externals/loky/_base.py", line 625, in _invoke_callbacks
callback(self)
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/parallel.py", line 309, in __call__
self.parallel.dispatch_next()
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/parallel.py", line 731, in dispatch_next
if not self.dispatch_one_batch(self._original_iterator):
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/parallel.py", line 759, in dispatch_one_batch
self._dispatch(tasks)
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/parallel.py", line 716, in _dispatch
job = self._backend.apply_async(batch, callback=cb)
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/_parallel_backends.py", line 510, in apply_async
future = self._workers.submit(SafeFunction(func))
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/externals/loky/reusable_executor.py", line 151, in submit
fn, *args, **kwargs)
File "~/miniconda3/envs/myenv/lib/python3.6/site-packages/joblib/externals/loky/process_executor.py", line 1022, in submit
raise self._flags.broken
joblib.externals.loky.process_executor.BrokenProcessPool: A task has failed to un-serialize. Please ensure that the arguments of the function are all picklable.
Looking at the traceback, it seems like the root cause is an issue importing the matlab package in the child process.
It's probably worth noting that this all runs just fine if instead I had defined x = np.array([[0.0]]) (after importing numpy as np). And of course the main process has no problem with any matlab imports, so I am not sure why the child process would.
I'm not sure if this error has anything in particular to do with the matlab package, or if it's something to do with global variables and cloudpickle or loky. In my application it would help to stick with loky, so I'd appreciate any insight!
I should also note that I'm using the official Matlab engine for Python: https://www.mathworks.com/help/matlab/matlab-engine-for-python.html. I suppose that might make it hard for others to try out the test cases, so I wish I could reproduce this error with a type other than matlab.double, but I haven't found another yet.
Digging around more, I've noticed that the process of importing the matlab package is more circular than I would expect, and I'm speculating that this could be part of the problem? The issue is that when import matlab is run by loky's _ForkingPickler, first some file matlab/mlarray.py is imported, which imports some other files, one of which contains import matlab, and this causes matlab/__init__.py to be run, which internally has from mlarray import double, single, uint8, ... which is the line that causes the crash.
Could this circularity be the issue? If so, why can I import this module in the main process but not in the loky backend?
The error is caused by incorrect loading order of global objects in the child processes. It can be seen clearly in the traceback
_ForkingPickler.loads(res) -> ... -> import matlab -> from mlarray import ...
that matlab is not yet imported when the global variable x is loaded by cloudpickle.
joblib with loky seems to treat modules as normal global objects and send them dynamically to the child processes. joblib doesn't record the order in which those objects/modules were defined. Therefore they are loaded (initialized) in a random order in the child processes.
A simple workaround is to manually pickle the matlab object and load it after importing matlab inside your function.
import matlab
import pickle

px = pickle.dumps(matlab.double([[0.0]]))

def f(i):
    import matlab
    x = pickle.loads(px)
    print(i, x)
Of course you can also use the joblib.dumps and loads to serialize the objects.
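As a quick usage sketch (just the workaround above plus the driving loop from the question, nothing more):

import pickle
import matlab
from joblib import Parallel, delayed

px = pickle.dumps(matlab.double([[0.0]]))

def f(i):
    import matlab           # make sure the package is fully initialized first
    x = pickle.loads(px)    # then rebuild the matlab object from the pickle
    print(i, x)

if __name__ == '__main__':
    # default loky backend; this should no longer hit the ImportError
    for _ in Parallel(4)(delayed(f)(i) for i in range(10)):
        pass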
Use an initializer
Thanks to the suggestion of @Aaron, you can also use an initializer (for loky) to import matlab before loading x.
Currently there's no simple API to specify an initializer, so I wrote a simple function:
def with_initializer(self, f_init):
    # Overwrite initializer hook in the Loky ProcessPoolExecutor
    # https://github.com/tomMoral/loky/blob/f4739e123acb711781e46581d5ed31ed8201c7a9/loky/process_executor.py#L850
    hasattr(self._backend, '_workers') or self.__enter__()
    origin_init = self._backend._workers._initializer
    def new_init():
        origin_init()
        f_init()
    self._backend._workers._initializer = new_init if callable(origin_init) else f_init
    return self
It is a little bit hacky but works well with the current version of joblib and loky.
Then you can use it like:
import matlab
from joblib import Parallel, delayed

x = matlab.double([[0.0]])

def f(i):
    print(i, x)

def _init_matlab():
    import matlab

with Parallel(4) as p:
    for _ in with_initializer(p, _init_matlab)(delayed(f)(i) for i in range(10)):
        pass
I hope the developers of joblib will add an initializer argument to the constructor of Parallel in the future.

multiprocessing PicklingError in Python 3.6.5 that doesn't occur in 3.6.1

The following code works in Python 3.6.1 ... but returns this error in Python 3.6.5:
Traceback (most recent call last):
File "G:/GOOD/Coding/Deepthroat/Deepthroat2/Bin/Testing/gh.py", line 36, in <module>
loops.start()
File "C:\Program Files\Python36\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\Program Files\Python36\lib\multiprocessing\context.py", line 223, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Program Files\Python36\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\Program Files\Python36\lib\multiprocessing\popen_spawn_win32.py", line 65, in __init__
reduction.dump(process_obj, to_child)
File "C:\Program Files\Python36\lib\multiprocessing\reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
_pickle.PicklingError: Can't pickle <function loops at 0x0000019D3E30B598>: it's not the same object as __main__.loops
Here is the code:
from multiprocessing import Process
import time

def timeout(mm):
    for i in range(6):
        time.sleep(0.1)
    print('stop all bla loops now')
    mm.terminate()
    return

def loops():
    for i in range(5):
        time.sleep(0.1)
        print('loop1')
    billy = 'loop1 done'
    for i in range(5):
        time.sleep(0.1)
        print('loop2')
    billy = 'loop2 done'
    for i in range(5):
        time.sleep(0.1)
        print('loop3')
    billy = 'loop3 done'
    billy = 'cool'

loops = Process(target=loops)
loops.start()
timeout = Process(target=timeout(loops))
timeout.start()
This is actually just an error on your part, because you are overwriting the name loops, which means the function can no longer be pickled (pickle just stores a reference to the function by name). If this error didn't occur in Python 3.6.1, then there are probably tests missing in Python 3.6.1.
If the target and the variable have the same name, as in loops = Process(target=loops), this error is raised.
I was able to solve my problem using the explanation by Medalng and Rhys's comments on the answer above. I'm restating it here to help others, since it took me a few reads to catch Rhys's comment.
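As a hedged sketch of that fix: give the Process object a different name from the target function, and (swapping in a simpler design than the question's second watchdog process) enforce the timeout with join() in the parent rather than by calling the target when constructing the Process.

from multiprocessing import Process
import time

def loops():
    for i in range(5):
        time.sleep(0.1)
        print('loop1')

if __name__ == '__main__':
    loops_proc = Process(target=loops)   # different name, so 'loops' still refers to the function
    loops_proc.start()
    loops_proc.join(0.6)                 # wait up to 0.6 s in the parent process
    if loops_proc.is_alive():
        print('stop all bla loops now')
        loops_proc.terminate()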

RuntimeError with multiprocessing module when trying to recursively compare lists

I'm generating a list filled with sublists of randomly generated 0s and 1s, and then trying to compare each list with every other list to determine their similarity, efficiently.
I know that my code works with a single process (i.e. without involving multiprocessing), but once I start involving multiprocessing.Pool() or multiprocessing.Process(), everything starts to break.
I want to compare how long a single process would take compared to multiple processes. I've tried this with threading, but a single process actually ended up taking less time, probably due to the Global Interpreter Lock.
Here's my code:
import difflib
import secrets
import timeit
import multiprocessing
import numpy

random_lists = [[secrets.randbelow(2) for _ in range(500)] for _ in range(500)]
random_lists_split = numpy.array_split(numpy.array(random_lists), 5)

def get_similarity_value(lists_to_check, sublists_to_check) -> list:
    ratios = []
    matcher = difflib.SequenceMatcher()
    for sublist_major in sublists_to_check:
        try:
            sublist_major = sublist_major.tolist()
        except AttributeError:
            pass
        for sublist_minor in lists_to_check:
            if sublist_major == sublist_minor or [lists_to_check.index(sublist_major), lists_to_check.index(sublist_minor)] in [ratios[i][1] for i in range(len(ratios))] or [lists_to_check.index(sublist_minor), lists_to_check.index(sublist_major)] in [ratios[i][1] for i in range(len(ratios))]:  # or lists_to_check.index(sublist_major.tolist()) > lists_to_check.index(sublist_minor):
                pass
            else:
                matcher.set_seqs(sublist_major, sublist_minor)
                ratios.append([matcher.ratio(), sorted([lists_to_check.index(sublist_major), lists_to_check.index(sublist_minor)])])
    return ratios

def start():
    test = multiprocessing.Pool(4)
    data = [(random_lists, random_lists_split[i]) for i in range(len(random_lists_split))]
    print(test.map(get_similarity_value, data))

statement = timeit.Timer(start)
print(statement.timeit(1))

statement2 = timeit.Timer(lambda: get_similarity_value(random_lists, random_lists))
print(statement2.timeit(1))
And here's the error:
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\ProgramData\Anaconda3\envs\Computing Coursework\lib\multiprocessing\spawn.py", line 105, in spawn_main
exitcode = _main(fd)
File "C:\ProgramData\Anaconda3\envs\Computing Coursework\lib\multiprocessing\spawn.py", line 114, in _main
prepare(preparation_data)
File "C:\ProgramData\Anaconda3\envs\Computing Coursework\lib\multiprocessing\spawn.py", line 225, in prepare
_fixup_main_from_path(data['init_main_from_path'])
File "C:\ProgramData\Anaconda3\envs\Computing Coursework\lib\multiprocessing\spawn.py", line 277, in _fixup_main_from_path
run_name="__mp_main__")
File "C:\ProgramData\Anaconda3\envs\Computing Coursework\lib\runpy.py", line 263, in run_path
pkg_name=pkg_name, script_name=fname)
File "C:\ProgramData\Anaconda3\envs\Computing Coursework\lib\runpy.py", line 96, in _run_module_code
mod_name, mod_spec, pkg_name, script_name)
File "C:\ProgramData\Anaconda3\envs\Computing Coursework\lib\runpy.py", line 85, in _run_code
exec(code, run_globals)
File "timings.py", line 38, in <module>
print(statement.timeit(1))
File "C:\ProgramData\Anaconda3\envs\Computing Coursework\lib\timeit.py", line 178, in timeit
timing = self.inner(it, self.timer)
File "<timeit-src>", line 6, in inner
File "timings.py", line 32, in start
test = multiprocessing.Pool(4)
File "C:\ProgramData\Anaconda3\envs\Computing Coursework\lib\multiprocessing\context.py", line 119, in Pool
context=self.get_context())
File "C:\ProgramData\Anaconda3\envs\Computing Coursework\lib\multiprocessing\pool.py", line 174, in __init__
self._repopulate_pool()
File "C:\ProgramData\Anaconda3\envs\Computing Coursework\lib\multiprocessing\pool.py", line 239, in _repopulate_pool
w.start()
File "C:\ProgramData\Anaconda3\envs\Computing Coursework\lib\multiprocessing\process.py", line 105, in start
self._popen = self._Popen(self)
File "C:\ProgramData\Anaconda3\envs\Computing Coursework\lib\multiprocessing\context.py", line 322, in _Popen
return Popen(process_obj)
File "C:\ProgramData\Anaconda3\envs\Computing Coursework\lib\multiprocessing\popen_spawn_win32.py", line 33, in __init__
prep_data = spawn.get_preparation_data(process_obj._name)
File "C:\ProgramData\Anaconda3\envs\Computing Coursework\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
_check_not_importing_main()
File "C:\ProgramData\Anaconda3\envs\Computing Coursework\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
is not going to be frozen to produce an executable.''')
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
N.B. I have tried using multiprocessing.freeze_support() but it results in the same error. The code also seems to be attempting to run indefinitely, as the error appears over and over again.
Thanks!
The problem is that your top-level code, including the code that creates the Pool of worker processes, is not protected from being run in the child processes.
As the docs explain, if you're not using the fork start method (and since you're on Windows, you're not):
Make sure that the main module can be safely imported by a new Python interpreter without causing unintended side effects (such as starting a new process).
In fact, your code is nearly identical to the example that follows that warning. You're launching a whole pool of children instead of just one, but it's the same problem. Every child in the pool tries to launch a new pool, and fortunately multiprocessing figures out that something bad is going on and fails with a RuntimeError instead of spawning processes exponentially until Windows refuses to spawn any more or its scheduler just falls down.
As the docs say:
Instead one should protect the “entry point” of the program by using if __name__ == '__main__':
In your case, that means this part:
if __name__ == '__main__':
    statement = timeit.Timer(start)
    print(statement.timeit(1))

    statement2 = timeit.Timer(lambda: get_similarity_value(random_lists, random_lists))
    print(statement2.timeit(1))
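More generally, here is a stripped-down sketch of the pattern (not your full comparison code): create the Pool only inside functions, and call those functions only from the guarded entry point.

import multiprocessing
import timeit

def work(x):
    return x * x

def start():
    # the Pool is created inside a function, never at import time
    with multiprocessing.Pool(4) as pool:
        return pool.map(work, range(10))

if __name__ == '__main__':
    # everything that spawns processes runs only below this guard
    print(timeit.Timer(start).timeit(1))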
