I am building and fitting an hdbscan model on my data. When I run the script directly from within the file it works well and quickly, but when I import the file and run it from 'outside', it goes into a strange loop that I don't understand, and I get the following error:
ImportError: [joblib] Attempting to do parallel computing without protecting your import on a system that does not support forking. To use parallel-computing in a script, you must protect your main loop using "if __name__ == '__main__'". Please see the joblib documentation on Parallel for more information
Here is an excerpt of the code:
df_pos_raw, df_pos_training = pre_process_data(df_pos)
df_pos_training_std = standardize_df(df_pos_training) # Standardized data, column-wise
print "generating model"
pos_cls = hdbscan.HDBSCAN(min_cluster_size=10, prediction_data=True)
print "fitting model to data"
pos_cls.fit(df_pos_training_std)
print 'done fitting model'
# sns.distplot(pos_cls.labels_, bins=len(set(pos_cls.labels_)))
df_filtered = filter_cons_types(df, [3, 5])
print "Done. returning variables"
return pos_cls, df_filtered
Here is the output when running from 'outside' the file:
Traceback (most recent call last):
File "<string>", line 1, in <module>
generating model
File "C:\ProgramData\Anaconda2\Lib\multiprocessing\forking.py", line 380, in main
fitting model to data
prepare(preparation_data)
File "C:\ProgramData\Anaconda2\Lib\multiprocessing\forking.py", line 510, in prepare
'__parents_main__', file, path_name, etc
File "C:\Users\sareetn\PycharmProjects\Arad\DataImputation\ClusteringExtrapolation\Dev\run_clustering_based_prediction.py", line 4, in <module>
model, raw_df = clustering()
File "C:\Users\sareetn\PycharmProjects\Arad\DataImputation\ClusteringExtrapolation\Dev\clustering_model_constype_3_5.py", line 86, in main
pos_cls.fit(df_pos_training_std)
File "C:\Users\sareetn\PycharmProjects\Arad\venv\lib\site-packages\hdbscan\hdbscan_.py", line 816, in fit
self._min_spanning_tree) = hdbscan(X, **kwargs)
File "C:\Users\sareetn\PycharmProjects\Arad\venv\lib\site-packages\hdbscan\hdbscan_.py", line 543, in hdbscan
core_dist_n_jobs, **kwargs)
File "C:\Users\sareetn\PycharmProjects\Arad\venv\lib\site-packages\sklearn\externals\joblib\memory.py", line 362, in __call__
return self.func(*args, **kwargs)
File "C:\Users\sareetn\PycharmProjects\Arad\venv\lib\site-packages\hdbscan\hdbscan_.py", line 239, in _hdbscan_boruvka_kdtree
n_jobs=core_dist_n_jobs, **kwargs)
File "hdbscan/_hdbscan_boruvka.pyx", line 375, in hdbscan._hdbscan_boruvka.KDTreeBoruvkaAlgorithm.__init__ (hdbscan/_hdbscan_boruvka.c:5195)
File "hdbscan/_hdbscan_boruvka.pyx", line 411, in hdbscan._hdbscan_boruvka.KDTreeBoruvkaAlgorithm._compute_bounds (hdbscan/_hdbscan_boruvka.c:5915)
File "C:\Users\sareetn\PycharmProjects\Arad\venv\lib\site-packages\sklearn\externals\joblib\parallel.py", line 749, in __call__
n_jobs = self._initialize_backend()
File "C:\Users\sareetn\PycharmProjects\Arad\venv\lib\site-packages\sklearn\externals\joblib\parallel.py", line 547, in _initialize_backend
**self._backend_args)
File "C:\Users\sareetn\PycharmProjects\Arad\venv\lib\site-packages\sklearn\externals\joblib\_parallel_backends.py", line 305, in configure
'[joblib] Attempting to do parallel computing '
ImportError: [joblib] Attempting to do parallel computing without protecting your import on a system that does not support forking. To use parallel-computing in a script, you must protect your main loop using "if __name__ == '__main__'". Please see the joblib documentation on Parallel for more information
generating model
fitting model to data
generating model
fitting model to data
generating model
fitting model to data
Thank you very much in advance!!
A friend helped me figure it out:
Clustering uses a library called joblib that splits the job into parallel processes. When running such functions on a Windows machine, care needs to be taken to make sure we use
if __name__ == '__main__'
in order to protect the code and allow the parallel processing to work.
After adding
if __name__ == '__main__'
and placing all of the code under it, the clustering ran smoothly and quickly.
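For reference, here is a minimal sketch of what the calling script ends up looking like (the module and function names are taken from the traceback above; the exact import will depend on how the code is actually organised):
# run_clustering_based_prediction.py (sketch)
from clustering_model_constype_3_5 import clustering

if __name__ == '__main__':
    # On Windows there is no fork, so joblib spawns fresh interpreters that
    # re-import this module. The guard keeps the import free of side effects,
    # which stops each worker from re-running the clustering itself.
    model, raw_df = clustering()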
Related
I am trying to use this tutorial to train my own car model recognition model: https://github.com/Helias/Car-Model-Recognition. I want to use CUDA and my GPU to speed up training (the preprocessing step completed without any errors). But when I try to train my model, I get the following errors:
######### ERROR #######
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
######### batch #######
Traceback (most recent call last):
File "D:\Car-Model-Recognition\main.py", line 78, in train_model
######### ERROR #######
[Errno 32] Broken pipe
for i, batch in enumerate(loaders[mode]):
######### batch ####### File "C:\Program Files\Python37\lib\site-packages\torch\utils\data\dataloader.py", line 279, in __iter__
return _MultiProcessingDataLoaderIter(self)
Traceback (most recent call last):
File "C:\Program Files\Python37\lib\site-packages\torch\utils\data\dataloader.py", line 719, in __init__
File "main.py", line 78, in train_model
w.start()
File "C:\Program Files\Python37\lib\multiprocessing\process.py", line 112, in start
for i, batch in enumerate(loaders[mode]):
File "C:\Program Files\Python37\lib\site-packages\torch\utils\data\dataloader.py", line 279, in __iter__
self._popen = self._Popen(self)
File "C:\Program Files\Python37\lib\multiprocessing\context.py", line 223, in _Popen
return _MultiProcessingDataLoaderIter(self)
File "C:\Program Files\Python37\lib\site-packages\torch\utils\data\dataloader.py", line 719, in __init__
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Program Files\Python37\lib\multiprocessing\context.py", line 322, in _Popen
w.start()
return Popen(process_obj)
File "C:\Program Files\Python37\lib\multiprocessing\popen_spawn_win32.py", line 46, in __init__
File "C:\Program Files\Python37\lib\multiprocessing\process.py", line 112, in start
prep_data = spawn.get_preparation_data(process_obj._name)
File "C:\Program Files\Python37\lib\multiprocessing\spawn.py", line 143, in get_preparation_data
self._popen = self._Popen(self)
File "C:\Program Files\Python37\lib\multiprocessing\context.py", line 223, in _Popen
_check_not_importing_main()
File "C:\Program Files\Python37\lib\multiprocessing\spawn.py", line 136, in _check_not_importing_main
return _default_context.get_context().Process._Popen(process_obj)
File "C:\Program Files\Python37\lib\multiprocessing\context.py", line 322, in _Popen
is not going to be frozen to produce an executable.''')
RuntimeError:
An attempt has been made to start a new process before the
current process has finished its bootstrapping phase.
This probably means that you are not using fork to start your
child processes and you have forgotten to use the proper idiom
in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce an executable.
return Popen(process_obj)
I have used the exact code from the given link, and if I start my code using WSL everything is OK, but I can't use my GPU from WSL. Where should I insert this __name__ == '__main__' check to prevent such a mistake, or how can I disable this multiprocessing?
Looking at main.py, you run a lot of code at the module level. On Windows, Python's multiprocessing module will start a new Python interpreter, import your modules, unpickle a snapshot of your parent context and then call your worker function. The problem is that all of that module-level code executes merely on import, so you essentially run a new copy of your program instead of building a context for your worker.
The solution is two-fold. First, move all of the module-level code into functions. You want to be able to import your module without side effects. Second, call the function(s) that start your program from a conditional:
def main():
    # the stuff you were doing at module level
    ...

if __name__ == "__main__":
    main()
The reason this works is in the module name. When you run the top-level script of a Python program (e.g., python main.py), it is a script called "__main__", not a module. If a different program imports main, it is a module called "main" (or whatever you named your script). That 'if' stops your main code from executing when it is imported by some other Python code, such as the multiprocessing module.
It's okay to have some executable code at the module level, especially if you are setting up defaults and such. But don't do anything at the module level that you wouldn't want done if some other code imports your script.
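For example, a restructured main.py might look roughly like this (a sketch only; the dataset and training step are placeholders, not the actual code from the repository):
import torch
from torch.utils.data import DataLoader, TensorDataset

def build_loaders():
    # placeholder dataset standing in for the real car-image dataset
    dataset = TensorDataset(torch.randn(100, 3), torch.randint(0, 2, (100,)))
    return {"train": DataLoader(dataset, batch_size=16, num_workers=2)}

def train_model(loaders):
    for i, batch in enumerate(loaders["train"]):  # workers are spawned here
        pass  # the real training step goes here

def main():
    loaders = build_loaders()
    train_model(loaders)

if __name__ == "__main__":
    # On Windows the DataLoader workers are started with spawn, which re-imports
    # this module; the guard prevents that import from re-running main().
    main()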
I am trying to run a linear regression in parallel over 10,000,000 data points (4 features, 1 target variable), randomly generated from a normal distribution, using Python's Scoop library. Here is the code:
import pandas as pd
import numpy as np
import random
from scoop import futures
import statsmodels.api as sm
from time import time

def linreg(vals):
    global model
    model = sm.OLS(y_vals, X_vals).fit()
    return model
    print(model.summary())

if __name__ == '__main__':
    random.seed(42)
    vals = pd.DataFrame(np.random.normal(loc=3, scale=100, size=(10000000, 5)))
    vals.columns = ['dep', 'ind1', 'ind2', 'ind3', 'ind4']
    y_vals = vals['dep']
    X_vals = vals[['ind1', 'ind2', 'ind3', 'ind4']]
    bt = time()
    model_vals = list(map(linreg, [1,2,3]))
    mval = model_vals[0]
    print(mval.summary())
    serial_time = time() - bt
    bt1 = time()
    model_vals_1 = list(futures.map(linreg, [1,2,3]))
    mval_1 = model_vals_1[0]
    print(mval_1.summary())
    parallel_time = time() - bt1
    print(serial_time, parallel_time)
However, after the regression summary is correctly produced in serial (via Python's standard map function), the following error:
Traceback (most recent call last):
  File "C:\Users\niccolo.gentile\AppData\Local\Continuum\anaconda3\envs\tensorenviron\lib\runpy.py", line 193, in _run_module_as_main
    "main", mod_spec)
  File "C:\Users\niccolo.gentile\AppData\Local\Continuum\anaconda3\envs\tensorenviron\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "C:\Users\niccolo.gentile\AppData\Local\Continuum\anaconda3\envs\tensorenviron\lib\site-packages\scoop\bootstrap\__main__.py", line 302, in <module>
    b.main()
  File "C:\Users\niccolo.gentile\AppData\Local\Continuum\anaconda3\envs\tensorenviron\lib\site-packages\scoop\bootstrap\__main__.py", line 92, in main
    self.run()
  File "C:\Users\niccolo.gentile\AppData\Local\Continuum\anaconda3\envs\tensorenviron\lib\site-packages\scoop\bootstrap\__main__.py", line 290, in run
    futures_startup()
  File "C:\Users\niccolo.gentile\AppData\Local\Continuum\anaconda3\envs\tensorenviron\lib\site-packages\scoop\bootstrap\__main__.py", line 271, in futures_startup
    run_name="main"
  File "C:\Users\niccolo.gentile\AppData\Local\Continuum\anaconda3\envs\tensorenviron\lib\site-packages\scoop\futures.py", line 64, in _startup
    result = _controller.switch(rootFuture, *args, **kargs)
  File "C:\Users\niccolo.gentile\AppData\Local\Continuum\anaconda3\envs\tensorenviron\lib\site-packages\scoop\_control.py", line 253, in runController
    raise future.exceptionValue
  File "C:\Users\niccolo.gentile\AppData\Local\Continuum\anaconda3\envs\tensorenviron\lib\site-packages\scoop\_control.py", line 127, in runFuture
    future.resultValue = future.callable(*future.args, **future.kargs)
  File "C:\Users\niccolo.gentile\AppData\Local\Continuum\anaconda3\envs\tensorenviron\lib\runpy.py", line 263, in run_path
    pkg_name=pkg_name, script_name=fname)
  File "C:\Users\niccolo.gentile\AppData\Local\Continuum\anaconda3\envs\tensorenviron\lib\runpy.py", line 96, in _run_module_code
    mod_name, mod_spec, pkg_name, script_name)
  File "C:\Users\niccolo.gentile\AppData\Local\Continuum\anaconda3\envs\tensorenviron\lib\runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "Scoop_map_linear_regression1.py", line 33, in <module>
    model_vals_1 = list(futures.map(linreg, [1,2,3]))
  File "C:\Users\niccolo.gentile\AppData\Local\Continuum\anaconda3\envs\tensorenviron\lib\site-packages\scoop\futures.py", line 102, in _mapGenerator
    for future in _waitAll(*futures):
  File "C:\Users\niccolo.gentile\AppData\Local\Continuum\anaconda3\envs\tensorenviron\lib\site-packages\scoop\futures.py", line 358, in _waitAll
    for f in _waitAny(future):
  File "C:\Users\niccolo.gentile\AppData\Local\Continuum\anaconda3\envs\tensorenviron\lib\site-packages\scoop\futures.py", line 335, in _waitAny
    raise childFuture.exceptionValue
NameError: name 'y_vals' is not defined
is produced afterwards. This means that the code stops at model_vals_1 = list(futures.map(linreg, [1,2,3]))
Please carefully note that in order to be able to run the code in parallel, it has to be launched from the command line specifying the -m scoop parameter, like this:
python -m scoop Scoop_map_linear_regression1.py
Indeed, if it is launched without the -m scoop parameter, it is not parallelized and does actually run, but futures.map is simply replaced by the built-in map (as reported in the warnings), so the regression just runs two times in serial. The goal, instead, is to run it in parallel using futures.map.
This clarification is meant to avoid answers that 'solve' the problem by simply launching the code without the -m scoop parameter, as has already happened here:
Python Parallel Computing - Scoop
where, as a consequence, the question was wrongly put on hold as off-topic because it was no longer reproducible.
Many thanks in advance and any comment is highly appreciated and welcome.
The solution is to pass only [1] as the second argument of futures.map (though not necessarily of map).
Indeed, even though the linreg function doesn't use the second argument passed to map, that argument still determines how many times linreg will be run. As an illustration, consider the following basic example:
def welcome(x):
    print('Hello world!')

if __name__ == '__main__':
    a = list(map(welcome, [1,2]))
The function welcome doesn't actually need any argument, but still the output will be
Hello world!
Hello world!
repeated two times, that is, the length of the list passed as the second argument.
In this specific case, this implies that map will run the linear regression 3 times, even though the regression output appears just once, since the summary is printed outside the map.
With futures.map, instead, it is not possible to run the linear regression multiple times. The problem, apparently, is that after the first run it deletes the datasets that were used, which makes it impossible to continue with the second and third runs and leads to the
NameError: name 'y_vals' is not defined
thrown at the end of the trace. This should be visible by navigating through the scoop.futures source code.
I didn't go over all of it, but I guess the problem is related to the greenlet switchers.
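A stripped-down illustration of the suggested change, with a dummy task standing in for linreg (still to be launched with python -m scoop, as described above):
from scoop import futures

def task(_):
    # stand-in for linreg: like linreg, it ignores the value it receives
    return "fitted"

if __name__ == '__main__':
    # a single-element list, so the function is executed exactly once
    results = list(futures.map(task, [1]))
    print(results[0])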
I have a keras_yolo Python implementation, and I am trying to get the training to work across multiple GPUs; the multi_gpu_model utility sounds like a good place to start.
However, my problem is that the same code works just fine in a single CPU/GPU setup but fails with NameError: name 'yolo_head' is not defined when running as a multi_gpu_model model. The full stack:
parallel_model = multi_gpu_model(model, cpu_relocation=True)
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/keras/utils/multi_gpu_utils.py", line 200, in multi_gpu_model
model = clone_model(model)
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/keras/models.py", line 251, in clone_model
return _clone_functional_model(model, input_tensors=input_tensors)
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/keras/models.py", line 152, in _clone_functional_model
layer(computed_tensors, **kwargs))
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/keras/engine/base_layer.py", line 457, in __call__
output = self.call(inputs, **kwargs)
File "/home/ubuntu/anaconda3/envs/tensorflow_p36/lib/python3.6/site-packages/keras/layers/core.py", line 687, in call
return self.function(inputs, **arguments)
File "/mnt/data/DeepLeague/YAD2K/yad2k/models/keras_yolo.py", line 199, in yolo_loss
pred_xy, pred_wh, pred_confidence, pred_class_prob = yolo_head(
Here is a link to the definition of yolo_head: https://github.com/farzaa/DeepLeague/blob/c87fcd89d9f9e81421609eb397bf95433270f0e2/YAD2K/yad2k/models/keras_yolo.py#L66
I've not yet dived into the multi_gpu_model code to understand how the copying works under the hood, and I was hoping to avoid needing to do that.
The issue is that custom names used inside a Keras Lambda layer must be imported explicitly within the function that refers to them.
E.g. in this case yolo_head must be 're-imported' at the function level of yolo_loss, like this:
def yolo_loss(args, anchors, num_classes, rescore_confidence=False, print_loss=False):
    from yad2k.models.keras_yolo import yolo_head
    # ... the rest of the loss computation can now resolve yolo_head
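The same pattern in a generic form, for anyone hitting this with their own Lambda layer (a sketch; my_activation is just an illustrative name):
from keras.layers import Lambda
from keras.models import Sequential

def my_activation(x):
    # import inside the function, so that when clone_model / multi_gpu_model
    # rebuilds the Lambda layer the name is resolved again in the new scope
    from keras import backend as K
    return K.relu(x)

model = Sequential([Lambda(my_activation, input_shape=(10,))])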
I am getting random failures to create the callback model-weights file when running Keras (with TensorFlow) models in a loop with changing model parameters and input data. I thought it might have something to do with the length of my directory name, but if so it seems like a bug, because it only happens sometimes: prior to the following error it wrote multiple files into directories of the same length. I am using long directory names to make it easier to distinguish runs in TensorBoard.
I will show my basic code setup as pseudo code and then the random error that I am getting. I have a nested for loop that changes model parameters as well as input data. The loop will work fine for hours and then randomly fail with the same error at some point. I would like to know if I am doing something wrong in my file name that is causing this. I would also like a workaround so that when it fails I can keep running and move on to the next file, skipping the one that failed - some type of try/except, but I don't know enough about h5py to know how to code that.
I am running on Windows 10 (conda env), tensorflow-gpu 1.6.0, Keras 2.1.5, h5py 2.7.1, tensorboard 1.6.0. I also set up Windows 10 to handle long file names. This error seems to be coming directly from h5py (h5py\h5f.pyx). Also, the file actually gets created and written: I can load it using h5py.File(), and it is the correct size and has the same groups and objects.
Update: I included the os.makedirs() line in my code that I did not show before. I also added a check on the directory creation and ran the code again. It still failed in the same way and it never triggered the isdir() check.
Update 2: I want to point out that I am using multiprocessing because of memory leaks when using Keras with TensorFlow. This happens regardless of having K.clear_session() and tf.reset_default_graph(). I now believe this random error is related to multiprocessing, as I have not yet observed this error when I eliminate the pooling process.
def main():
    for input_data in input_data_list:
        for model_parameters in model_parameters_list:
            # run model with different parameters on all data
            pool = multiprocessing.Pool(1)
            pool.apply(run_function, run_parameters..., model_func_name,
                       model_func_dict)
            pool.close()

def run_function(run_parameters..., model_func_name, model_func_dict, ...):
    # code to extract x_train, y_train, x_val, y_val etc not shown
    # model_def = long string representing model parameters, example below
    # model_def =
    # 'basic_ff_nn4_mse_dr50_Nadam_LeakyReLU_kr_l2_ar_off_ns_0_BCtoA_all_2_2'

    # build and compile model
    model = model_func_name(**model_func_dict)

    # set up callbacks
    os.makedirs(models_dir + "{}_{}_{}_{}/".format(model_def, set_name,
                                                   fold, set_num), exist_ok=True)
    tmp_path = models_dir + "{}_{}_{}_{}/".format(model_def, set_name, fold,
                                                  set_num)
    best_weights_file = models_dir + "{}_{}_{}_{}/best_weights.hdf5".format(
        model_def, set_name, fold, set_num)
    best_model_weights = callbacks.ModelCheckpoint(best_weights_file,
                                                   save_best_only=True,
                                                   save_weights_only=True)
    log_dir = 'output/{}_{}/tf_logs/{}/{}/{}'.format(model_type, cur_time,
                                                     model_def, set_name,
                                                     'f' + str(fold))
    tensorboard = callbacks.TensorBoard(log_dir=log_dir,
                                        histogram_freq=0, write_graph=False,
                                        write_images=False,
                                        write_grads=False)

    if not os.path.isdir(tmp_path):
        print('path not created = ', tmp_path)

    model_history = model.fit(x=x_train, y=y_train,
                              verbose=0,
                              batch_size=size_batches,
                              epochs=num_epochs,
                              validation_data=[x_val, y_val],
                              callbacks=[best_model_weights, tensorboard],
                              )
    K.clear_session()
    tf.reset_default_graph()
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\ProgramData\Miniconda3\envs\tflow_g\lib\multiprocessing\pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "C:\Users\xxxxx\Dropbox (test)\Lab\VLL models\zakworkspace\cps\cps_main.py", line 1042, in run_joint_ff
callbacks=[best_model_weights, tensorboard],
File "C:\ProgramData\Miniconda3\envs\tflow_g\lib\site-packages\keras\models.py", line 963, in fit
validation_steps=validation_steps)
File "C:\ProgramData\Miniconda3\envs\tflow_g\lib\site-packages\keras\engine\training.py", line 1705, in fit
validation_steps=validation_steps)
File "C:\ProgramData\Miniconda3\envs\tflow_g\lib\site-packages\keras\engine\training.py", line 1255, in _fit_loop
callbacks.on_epoch_end(epoch, epoch_logs)
File "C:\ProgramData\Miniconda3\envs\tflow_g\lib\site-packages\keras\callbacks.py", line 77, in on_epoch_end
callback.on_epoch_end(epoch, logs)
File "C:\ProgramData\Miniconda3\envs\tflow_g\lib\site-packages\keras\callbacks.py", line 445, in on_epoch_end
self.model.save_weights(filepath, overwrite=True)
File "C:\ProgramData\Miniconda3\envs\tflow_g\lib\site-packages\keras\models.py", line 754, in save_weights
with h5py.File(filepath, 'w') as f:
File "C:\ProgramData\Miniconda3\envs\tflow_g\lib\site-packages\h5py\_hl\files.py", line 269, in __init__
fid = make_fid(name, mode, userblock_size, fapl, swmr=swmr)
File "C:\ProgramData\Miniconda3\envs\tflow_g\lib\site-packages\h5py\_hl\files.py", line 105, in make_fid
fid = h5f.create(name, h5f.ACC_TRUNC, fapl=fapl, fcpl=fcpl)
File "h5py\_objects.pyx", line 54, in h5py._objects.with_phil.wrapper
File "h5py\_objects.pyx", line 55, in h5py._objects.with_phil.wrapper
File "h5py\h5f.pyx", line 98, in h5py.h5f.create
OSError: Unable to create file (unable to open file: name = 'output/TB_runs_03122018-031837/dump/models/basic_ff_nn4_mse_dr50_Nadam_LeakyReLU_kr_l2_ar_off_ns_0_BCtoA_all_2_2/best_weights.hdf5', errno = 22, error message = 'Invalid argument', flags = 13, o_flags = 302)
"""
I know this post is rather old but I just ran into this problem myself and managed to find a solution. Perhaps it is also useful for others.
This error occurs because the HDF5 file has been locked. You can turn off file locking by typing into the terminal:
export HDF5_USE_FILE_LOCKING=FALSE
see more explanation on NetCDF / HDF5 file locking here.
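If you would rather not rely on a shell export (e.g. on Windows), a minimal sketch of setting the same variable from within the script - the key assumption being that it must be set before h5py/HDF5 is first initialised:
import os

# HDF5 reads this environment variable when the library is initialised,
# so it has to be set before the first import of h5py (or of Keras, which
# imports h5py internally).
os.environ["HDF5_USE_FILE_LOCKING"] = "FALSE"

import h5py  # deliberately imported after the environment variable is set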
I'm going to dump the error I got while trying to run a Python script:
Preprocess validation data upfront
Using gpu device 0: Tesla K20c
Traceback (most recent call last):
File "<string>", line 1, in <module>
File "C:\SciSoft\WinPython-64bit-2.7.6.4\python-2.7.6.amd64\lib\multiprocessing\forking.py", line 380, in main
prepare(preparation_data)
File "C:\SciSoft\WinPython-64bit-2.7.6.4\python-2.7.6.amd64\lib\multiprocessing\forking.py", line 495, in prepare
'__parents_main__', file, path_name, etc
File "C:\Users\Administrator\Desktop\Galaxy Data\kaggle-galaxies-master\kaggle-galaxies-master\try_convnet_cc_multirotflip_3x69r45_maxout2048_extradense.py", line 133, in <module>
for data, length in create_valid_gen():
File "load_data.py", line 572, in buffered_gen_mp
process.start()
`File "C:\SciSoft\WinPython-64bit-2.7.6.4\python-2.7.6.amd64\lib\multiprocessing\process.py", line 130, in start
self._popen = Popen(self)
File "C:\SciSoft\WinPython-64bit-2.7.6.4\python-2.7.6.amd64\lib\multiprocessing\forking.py", line 258, in init
cmd = get_command_line() + [rhandle]
File "C:\SciSoft\WinPython-64bit-2.7.6.4\python-2.7.6.amd64\lib\multiprocessing\forking.py", line 358, in get_command_line`
is not going to be frozen to produce a Windows executable.''')
RuntimeError:
Attempt to start a new process before the current process
has finished its bootstrapping phase.
This probably means that you are on Windows and you have
forgotten to use the proper idiom in the main module:
if __name__ == '__main__':
freeze_support()
...
The "freeze_support()" line can be omitted if the program
is not going to be frozen to produce a Windows executable.
As I understand it, I have to insert the line
if __name__ == '__main__':
somewhere to get it to work.
Can anyone tell me in which file I should insert it? The affected files from the initial error log are:
https://github.com/benanne/kaggle-galaxies/blob/master/try_convnet_cc_multirotflip_3x69r45_maxout2048_extradense.py
Lines 131-134
and
https://github.com/benanne/kaggle-galaxies/blob/master/load_data.py
line 572
The Python documentation is quite clear in this case.
The important part is Safe importing of main module.
Your try_convnet_cc_multirotflip_3x69r45_maxout2048_extradense.py script does a lot of things at module level. Without reading it in detail, I can already say that you should wrap the workflow in a function and use it like this:
if __name__ == '__main__':
    freeze_support()  # optional, under circumstances described in the docs
    your_workflow_function()
Besides the problem you have, it's a good habit not to surprise possible users of your script with side effects if they just want to import it and reuse some of its functionality.
So don't put your code at module level. It's OK to have constants at module level, but the workflow should live in functions and classes.
If a Python module is intended to be used as a script (like in your case), you simply put the if __name__ == '__main__' check at the very end of the module, calling your_workflow_function() only if the module is the entry point for the interpreter, i.e. the so-called main module.
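A minimal sketch of the resulting structure (your_workflow_function and the generator are placeholders; in the real script the loop at lines 131-134 would go inside the function):
from multiprocessing import freeze_support

def create_valid_gen():
    # placeholder for the generator in load_data.py (buffered_gen_mp), which
    # starts worker processes via multiprocessing
    yield [1, 2, 3], 3

def your_workflow_function():
    # everything that used to run at module level goes here, including the
    # loop that consumes the multiprocessing-backed generator
    for data, length in create_valid_gen():
        print(length)

if __name__ == '__main__':
    freeze_support()  # only needed if the script will be frozen into an .exe
    your_workflow_function()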