When I run GridSearchCV() or RandomizedSearchCV() in parallel (with the n_jobs>1 or n_jobs=-1 option set), it shows this message:
ImportError: [joblib] Attempting to do parallel computing without protecting your import on a system that does not support forking. To use parallel-computing in a script, you must protect your main loop using "if __name__ == '__main__'". Please see the joblib documentation on Parallel for more information
I put the code in a class in a .py file and call it under if __name__ == '__main__' in another .py file, but it still shows this message.
It works fine when n_jobs=1.
import platform; print(platform.platform())
Windows-10-10.0.10586-SP0
import numpy; print("NumPy", numpy.__version__)
NumPy 1.13.1
import scipy; print("SciPy", scipy.__version__)
SciPy 0.19.1
import sklearn; print("Scikit-Learn", sklearn.__version__)
Scikit-Learn 0.19.0
UPDATE
I tried this code, but it still gives me the same error:
import numpy as np
import pandas as pd
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

class Test():
    def __init__(self):
        attributes = [..]
        dataset = pd.read_csv("..")
        X = dataset[[..]]
        Y = dataset[...]
        model = DecisionTreeClassifier()
        model = RandomizedSearchCV(....)
        model.fit(X, Y)

if __name__ == '__main__':
    Test()
joblib is known for this behaviour and is rather explicit in documenting it:
Warning
Under Windows, it is important to protect the main loop of code to avoid recursive spawning of subprocesses when using joblib.Parallel. In other words, you should be writing code like this:
import ....

def function1(...):
    ...

def function2(...):
    ...

...
if __name__ == '__main__':
    # do stuff with imports and functions defined above
    ...
No code should run outside of the “if __name__ == ‘__main__’” blocks, only imports and definitions.
So, refactor your code to meet this well-defined requirement, and it will start to benefit from joblib's parallel tools.
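For the asker's concrete case, a minimal sketch of such a refactor could look like this (the dataset path, column names and parameter distribution below are placeholders I made up, not the asker's actual values):

import pandas as pd
from sklearn.model_selection import RandomizedSearchCV
from sklearn.tree import DecisionTreeClassifier

def run_search():
    # Placeholder data loading; substitute the real file and columns.
    dataset = pd.read_csv("data.csv")
    X = dataset.drop(columns=["target"])
    Y = dataset["target"]
    search = RandomizedSearchCV(
        DecisionTreeClassifier(),
        param_distributions={"max_depth": [3, 5, 10, None]},
        n_iter=4,
        n_jobs=-1,  # the parallel workers are spawned here
    )
    search.fit(X, Y)
    return search

if __name__ == '__main__':
    # Nothing runs at import time, so the worker processes that joblib
    # spawns on Windows can safely re-import this module.
    run_search()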
I imagine this won't be the most useful answer, but you could always parallelize the process manually. https://docs.python.org/2/library/multiprocessing.html
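For illustration, a rough sketch of what a hand-rolled search over a few candidates could look like with a plain multiprocessing.Pool (the toy dataset and parameter values are my own assumptions):

import multiprocessing
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

def evaluate(max_depth):
    # Score one candidate; each worker process runs this independently.
    X, y = load_iris(return_X_y=True)
    model = DecisionTreeClassifier(max_depth=max_depth)
    return max_depth, cross_val_score(model, X, y, cv=5).mean()

if __name__ == '__main__':
    with multiprocessing.Pool(processes=2) as pool:
        results = pool.map(evaluate, [2, 4, 8, None])
    # Pick the candidate with the best mean CV score.
    print(max(results, key=lambda r: r[1]))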
Related
I am trying to use a TensorRT engine for inference in a Python class that inherits from multiprocessing.Process. The engine works in a standalone Python script on my system, but now, while integrating it into the codebase, the multiprocessing used in the class seems to be causing problems.
I am not getting any errors. It just skips everything after the line self.runtime = trt.Runtime(self.trt_logger). My VS Code debugger does not step into the function either.
The docs mention the following, which I do not fully understand:
The TensorRT builder may only be used by one thread at a time. If you
need to run multiple builds simultaneously, you will need to create
multiple builders. The TensorRT runtime can be used by multiple
threads simultaneously, so long as each object uses a different
execution context.
The following parts of my code are started, joined and terminated from another file:
# more imports
import logging
import multiprocessing
import os

import tensorrt as trt
import pycuda.driver as cuda
import pycuda.autoinit

class MyClass(multiprocessing.Process):
    def __init__(self, messages):
        multiprocessing.Process.__init__(self)
        # other stuff
        self.exit = multiprocessing.Event()

    def load_tensorrt_model(self, config):
        '''Load tensorrt model with engine'''
        logging.debug('Start')

        # Reading the config parameters related to the engine
        engine_file = config['trt_engine']['trt_folder'] + os.path.sep + config['trt_engine']['engine_file']
        class_names_file = config['trt_engine']['trt_folder'] + os.path.sep + config['trt_engine']['class_names_file']

        # Verify if all the necessary files are present, if so load the detection network
        if os.path.exists(engine_file) and os.path.exists(class_names_file):
            try:
                logging.debug('In try statement')
                self.trt_logger = trt.Logger()
                f = open(engine_file, 'rb')
                logging.debug('I can get here, but no further')
                self.runtime = trt.Runtime(self.trt_logger)
                logging.debug('Cannot get here')
                self.engine = self.runtime.deserialize_cuda_engine(f.read())
                # More stuff
I have found someone with a similar multithreading problem, but so far I have been unable to use that to solve my problem.
Any help is appreciated.
System specs:
Python 3.6.9
Jetson NX
Jetpack 4.4.1
L4T 32.4.4
TensorRT 7.1.3.0-1
CUDA 10.2
Ubuntu 18.04
I had the same problem. It seems pycuda.autoinit does not work well in a multiprocess scenario.
Try replacing import pycuda.autoinit with:
cuda.init()
self.cuda_context = cuda.Device(0).make_context()
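For example, a rough sketch of how that might look inside the process class from the question (treat this as an assumption about where the context should live, not a verified fix):

import multiprocessing
import pycuda.driver as cuda  # note: no pycuda.autoinit import
import tensorrt as trt

class MyClass(multiprocessing.Process):
    def run(self):
        # Create the CUDA context in the child process itself instead of
        # relying on pycuda.autoinit, which initializes in the parent.
        cuda.init()
        self.cuda_context = cuda.Device(0).make_context()
        try:
            self.trt_logger = trt.Logger()
            self.runtime = trt.Runtime(self.trt_logger)
            # ... deserialize the engine and run inference here ...
        finally:
            # Release the context when this process is done with it.
            self.cuda_context.pop()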
I need to implement a multiprocessing pool that utilizes arbitrary packages for calculations. For this, I'm using Python and joblib 0.9.0. This code is basically the structure I want.
import numpy as np
from joblib import pool

def someComputation(x):
    return np.interp(x, [-1, 1], [-1, 1])

if __name__ == '__main__':
    some_set_of_numbers = [-1, -0.5, 0, 0.5, 1]
    the_pool = pool.Pool(processes=2)
    solutions = [the_pool.apply_async(someComputation, (x,)) for x in some_set_of_numbers]
    print(solutions[0].get())
On both Windows 10 and Red Hat Enterprise Linux running Anaconda 4.3.1 with Python 3.6.0 (as well as 3.5 and 3.4 in virtual envs), I find that 'np' is never made available inside the someComputation() function, raising the error
File "C:\Anaconda3\lib\site-packages\multiprocessing_on_dill\pool.py", line 608, in get
raise self._value
NameError: name 'np' is not defined
However, on my Mac OS X 10.11.6 running Python 3.5 and the same joblib version, I get the expected output of '-1' with the exact same code. This question is essentially the same, but it dealt with pathos and not joblib. The general answer was to include the numpy import statement inside the function:
from joblib import pool

def someComputation(x):
    import numpy as np
    return np.interp(x, [-1, 1], [-1, 1])

if __name__ == '__main__':
    some_set_of_numbers = [-1, -0.5, 0, 0.5, 1]
    the_pool = pool.Pool(processes=2)
    solutions = [the_pool.apply_async(someComputation, (x,)) for x in some_set_of_numbers]
    print(solutions[0].get())
This solves the issue on the Windows and Linux machines, where they now output '-1' as expected, but this solution seems clunky. Is there any reason why the first bit of code would work on a Mac, but not Windows or Linux? I ultimately need to run this code on the Linux machine, so is there any fix that doesn't involve putting the import statement inside the function?
Edit:
After investigating a bit further, I found an old workaround I put in years ago that looks like it is causing the issue. In joblib/pool.py, I changed line 44 from
from multiprocessing.pool import Pool
to
from multiprocessing_on_dill.pool import Pool
to support pickling of arbitrary functions. For some reason, this change is what really causes the issue on Windows and Linux, but the Mac machine runs just fine. Using multiprocessing instead of multiprocessing_on_dill solves the above issue, but the code doesn't work for the majority of my cases since they can't be pickled.
I am not sure what the exact issue is, but it appears that there is some problem with transferring the global scope over to the subprocesses that run the task. You can potentially avoid name errors by binding the name np as a function parameter:
def someComputation(x, np=np):
    return np.interp(x, [-1, 1], [-1, 1])
This has the advantage of not requiring a call to the import machinery every time the function is run. The name np will be bound to the function when it is first evaluated during module loading.
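For reference, a self-contained sketch of that approach (shown here with the standard multiprocessing pool rather than the asker's dill-patched joblib, which is an assumption on my part):

import multiprocessing
import numpy as np

def someComputation(x, np=np):
    # np is captured as a default argument when the def statement is
    # evaluated at module load time, so the worker does not depend on
    # the parent's globals being transferred.
    return np.interp(x, [-1, 1], [-1, 1])

if __name__ == '__main__':
    with multiprocessing.Pool(processes=2) as pool:
        print(pool.map(someComputation, [-1, -0.5, 0, 0.5, 1]))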
I'm certainly missing something very obvious here, but why does this work:
a = [0.2635,0.654654,0.365,0.4545,1.5465,3.545]
import statsmodels.robust as rb
print rb.scale.mad(a)
0.356309343367
but this doesn't:
import statsmodels as sm
print sm.robust.scale.mad(a)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-5-1ce0c872b0be> in <module>()
----> 1 print statsmodels.robust.scale.mad(a)
AttributeError: 'module' object has no attribute 'robust'
For the long answer, see http://www.statsmodels.org/stable/importpaths.html
Statsmodels intentionally has mostly empty __init__.py files, but provides a parallel import collection through api.py.
The recommended import for interactive work, import statsmodels.api as sm, imports almost all of statsmodels, numpy, pandas and patsy, and large parts of scipy. This is slow on a cold start.
If we want to import just a specific part of statsmodels, then we don't need to import all these extras. Having empty __init__.py files means that we can import just a single module (which of course imports the dependencies of that module).
e.g. from statsmodels.robust.scale import mad or
import statsmodels.robust.scale as smscale
smscale.mad(...)
(Small caveat: Some of the very low level imports might not always remain backwards compatible if the internal structure changes. However, the general policy is to deprecate functions over one or two releases while maintaining the old access structure.)
You can, you just have to import robust as well:
import statsmodels as sm
import statsmodels.robust
Then:
>>> sm.robust.scale.mad(a)
0.35630934336679576
robust is a subpackage of statsmodels, and importing a package does not in general automatically import subpackages (unless the package is written to do so explicitly).
So this sounds a bit convoluted, but I've had a problem with joblib lately where it will create a bunch of processes and then just hang there (i.e., each process takes up memory but uses no CPU time).
Here is the simplest code I've got that will reproduce the problem:
from sklearn import linear_model
import numpy as np
from sklearn import cross_validation as cval
from joblib import Parallel, delayed

def fit_hanging_model(n=10000, nx=10, ny=32, ndelay=10,
                      n_cvs=5, n_jobs=None):
    # Create data
    X = np.random.randn(n, ny*ndelay)
    y = np.random.randn(n, nx)

    # Create model + CV
    model = linear_model.Ridge(alpha=1000.)
    cvest = cval.KFold(n, n_folds=n_cvs, shuffle=True)

    # Fit model
    par = Parallel(n_jobs=n_jobs, verbose=10)
    parfunc = delayed(_fit_model_cvs)
    par(parfunc(X, y, train, test, model)
        for i, (train, test) in enumerate(cvest))

def _fit_model_cvs(X, Y, train, test, model):
    model.fit(X, Y)

n = 10
a = np.random.randn(n, 32)
b = np.random.randn(32, 10)

##########
c = np.dot(a, b)
##########

fit_hanging_model(n_jobs=3)
Here is what happens:
If I run all of the code above, then it spawns off three processes and hangs
If I run all of the code above, but use n_jobs=1, then it works fine
If I run all of the code above a second time, after running it once with n_jobs=1, then it works fine no matter how many jobs I use.
If I run all of the code above EXCEPT for the code between the ######, then it runs fine.
However, if I then run the code between the ######, and try to run fit_hanging_model with n_jobs > 1, then it hangs
This is with joblib = 0.8.0, and sklearn 0.15-git.
Note, this bug is on CentOS on linux. I have not been able to reproduce this bug on another machine, so it may be hard to reproduce.
Does anyone have any idea why this might be going on? It seems like that dot product is doing something strange, but I have no idea what it could be...I'm at the end of my rope...
Figured it out. Apparently this was an issue with Joblib creating multiple python processes, while MKL was simultaneously trying to do threading. See the issue and the solution (which involves setting environment variables) here:
https://github.com/joblib/joblib/issues/138
In my experience, joblib hangs when the function it calls performs another level of parallelization.
Following the solution in the answer by @choldgraf, you need to disable the inner level of parallelization done by MKL:
import os
os.environ['MKL_NUM_THREADS'] = '1'
os.environ['OMP_NUM_THREADS'] = '1'
os.environ['MKL_DYNAMIC'] = 'FALSE'
This applies to other parallel computing libraries such as OpenMP.
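As far as I know, these variables generally need to be set before NumPy/MKL is first loaded, so a safe pattern is to put them at the very top of the script, e.g.:

import os
# Limit MKL/OpenMP threading *before* numpy/scipy/sklearn are imported,
# so the processes spawned by joblib do not oversubscribe the CPUs.
os.environ['MKL_NUM_THREADS'] = '1'
os.environ['OMP_NUM_THREADS'] = '1'
os.environ['MKL_DYNAMIC'] = 'FALSE'

import numpy as np
from joblib import Parallel, delayed

def square(i):
    a = np.random.randn(32, 32)
    return np.dot(a, a)  # this BLAS call now runs single-threaded per worker

if __name__ == '__main__':
    results = Parallel(n_jobs=3, verbose=10)(delayed(square)(i) for i in range(5))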
I am trying to calculate and generate plots using multiprocessing. On Linux the code below runs correctly; however, on the Mac (Mountain Lion) it doesn't, and it gives the error shown below:
import multiprocessing
import matplotlib.pyplot as plt
import numpy as np
import rpy2.robjects as robjects

def main():
    pool = multiprocessing.Pool()
    num_figs = 2

    # generate some random numbers
    input = zip(np.random.randint(10, 1000, num_figs),
                range(num_figs))
    pool.map(plot, input)

def plot(args):
    num, i = args
    fig = plt.figure()
    data = np.random.randn(num).cumsum()
    plt.plot(data)

main()
The rpy2 version is 2.3.1 and R is 2.13.2 (I could not install R 3.0 and the latest rpy2 on any Mac without getting a segmentation fault).
The error is:
The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec().
Break on __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__() to debug.
The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec().
I have tried everything to understand what the problem is with no luck. My configuration is:
Danials-MacBook-Pro:~ danialt$ brew --config
HOMEBREW_VERSION: 0.9.4
ORIGIN: https://github.com/mxcl/homebrew
HEAD: 705b5e133d8334cae66710fac1c14ed8f8713d6b
HOMEBREW_PREFIX: /usr/local
HOMEBREW_CELLAR: /usr/local/Cellar
CPU: dual-core 64-bit penryn
OS X: 10.8.3-x86_64
Xcode: 4.6.2
CLT: 4.6.0.0.1.1365549073
GCC-4.2: build 5666
LLVM-GCC: build 2336
Clang: 4.2 build 425
X11: 2.7.4 => /opt/X11
System Ruby: 1.8.7-358
Perl: /usr/bin/perl
Python: /usr/local/bin/python => /usr/local/Cellar/python/2.7.4/Frameworks/Python.framework/Versions/2.7/bin/python2.7
Ruby: /usr/bin/ruby => /System/Library/Frameworks/Ruby.framework/Versions/1.8/usr/bin/ruby
Any ideas?
This error occurs on Mac OS X when you perform a GUI operation outside the main thread, which is exactly what you are doing by shifting your plot function to the multiprocessing.Pool (I imagine it will not work on Windows either, for the same reason, since Windows has the same requirement). The only way I can imagine it working is to use the pool to generate the data, then have your main thread wait in a loop for the data that's returned (a queue is the way I usually handle it).
Here is an example (recognizing that this may not do what you want, i.e. plot all the figures "simultaneously"). plt.show() blocks, so only one figure is drawn at a time; I note that you do not have it in your sample code, but without it I don't see anything on my screen. However, if I take it out, there is no blocking and no error, because all the GUI functions happen in the main thread:
import multiprocessing
import matplotlib.pyplot as plt
import numpy as np
import rpy2.robjects as robjects

data_queue = multiprocessing.Queue()

def main():
    pool = multiprocessing.Pool()
    num_figs = 10

    # generate some random numbers
    input = zip(np.random.randint(10, 10000, num_figs), range(num_figs))
    pool.map(worker, input)

    figs_complete = 0
    while figs_complete < num_figs:
        data = data_queue.get()
        plt.figure()
        plt.plot(data)
        plt.show()
        figs_complete += 1

def worker(args):
    num, i = args
    data = np.random.randn(num).cumsum()
    data_queue.put(data)
    print('done ', i)

main()
Hope this helps.
I had a similar issue with my worker, which was loading some data, generating a plot, and saving it to a file. Note that this is slightly different from the OP's case, which seems to be oriented around interactive plotting. Still, I think it's relevant.
A simplified version of my code:
import multiprocessing

def worker(id):
    data = load_data(id)
    plot_data_to_file(data)  # Generates a plot and saves it to a file.

def plot_something_parallel(ids):
    pool = multiprocessing.Pool()
    pool.map(worker, ids)

plot_something_parallel(ids=[1, 2, 3])
This caused the same error others mention:
The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec().
Break on __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__() to debug.
Following @bbbruce's train of thought, I solved my problem by switching the matplotlib backend from TkAgg to the default. Specifically, I commented out the following line in my matplotlibrc file:
#backend : TkAgg
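If editing matplotlibrc is not convenient, another option (assuming, as in my case, that the workers only save figures to files rather than showing them) is to select a non-interactive backend in code before pyplot is imported, roughly like this:

import matplotlib
matplotlib.use('Agg')  # non-interactive backend, no GUI work in the workers
import matplotlib.pyplot as plt

def plot_data_to_file(data, path):
    # Hypothetical helper mirroring the one in my snippet above.
    fig = plt.figure()
    plt.plot(data)
    fig.savefig(path)
    plt.close(fig)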
This might be rpy2-specific.
There are reports of a similar problem with OS X and multiprocessing here and there.
I think that using an initializer that imports the packages needed to run the code in plot could solve the problem (see the multiprocessing docs).
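Something along these lines, as an untested sketch (the Agg backend choice and saving to files are assumptions I added so the workers avoid GUI calls entirely):

import multiprocessing

def init_worker():
    # Runs once in each worker process: import and configure the
    # packages that plot() needs, inside the child rather than the parent.
    global np, plt
    import numpy as np
    import matplotlib
    matplotlib.use('Agg')
    import matplotlib.pyplot as plt

def plot(args):
    num, i = args
    data = np.random.randn(num).cumsum()
    plt.figure()
    plt.plot(data)
    plt.savefig('fig_%d.png' % i)
    plt.close()

if __name__ == '__main__':
    pool = multiprocessing.Pool(initializer=init_worker)
    pool.map(plot, [(100, 0), (200, 1)])
    pool.close()
    pool.join()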
I had a similar issue and found that setting the start method in multiprocessing to forkserver works, as long as it comes right after your if __name__ == '__main__': statement.
if __name__ == '__main__':
    multiprocessing.set_start_method('forkserver')
    first_process = multiprocessing.Process(target=targetOne)
    second_process = multiprocessing.Process(target=targetTwo)
    first_process.start()
    second_process.start()
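Applied to a pool-based workflow like the one in the question, that would look roughly like this (a sketch, not tested on the asker's exact setup):

import multiprocessing

def plot(args):
    # Import matplotlib inside the worker so no GUI state is set up
    # in the parent before the start method takes effect.
    import numpy as np
    import matplotlib.pyplot as plt
    num, i = args
    data = np.random.randn(num).cumsum()
    plt.figure()
    plt.plot(data)
    plt.savefig('fig_%d.png' % i)

if __name__ == '__main__':
    # The start method must be set inside the __main__ guard,
    # before any pool or process is created.
    multiprocessing.set_start_method('forkserver')
    pool = multiprocessing.Pool()
    pool.map(plot, [(100, 0), (200, 1)])
    pool.close()
    pool.join()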
Try to upgrade matplotlib to 3.0.3:
pip3 install matplotlib --upgrade
Then everything goes fine.
=======================================================================
No need to read below anymore.
Yesterday, my multiprocess plotting worked on my MacBook Air, but this morning the same code did not work on my MacBook Pro, displaying many of these:
The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec().
Break on __THE_PROCESS_HAS_FORKED_AND_YOU_CANNOT_USE_THIS_COREFOUNDATION_FUNCTIONALITY___YOU_MUST_EXEC__() to debug.
The process has forked and you cannot use this CoreFoundation functionality safely. You MUST exec().
Both machines use 4th-gen Intel Core CPUs (an i5-4xxx in the Air and an i7-4xxx in the Pro), so if there is no difference in hardware, it must be the software.
So I just tried updating matplotlib to 3.0.3 on the MacBook Pro (it was 3.0.1), and everything works fine.
Also, there is no need to use pool.apply_async anymore.