I'm trying to implement LSH (locality-sensitive hashing) in PySpark. My implementation works perfectly for small sets of documents, but when the set is large I get this error:
AttributeError: Can't pickle local object '__hash_family__.<locals>.hash_member'
And then:
19/11/21 17:59:40 ERROR TaskSetManager: Task 0 in stage 3.0 failed 1 times; aborting job
Traceback (most recent call last):
File "/Users/<my_home_dir>/PycharmProjects/data_mining/hw_2/ex_3/main_kijiji.py", line 62, in <module>
lsh = signatures.reduce(lambda x, y: __update_hash_table__(x[0], x[1], lsh_b, lsh_r) +
File "/Library/Python/3.7/site-packages/pyspark/rdd.py", line 844, in reduce
vals = self.mapPartitions(func).collect()
File "/Library/Python/3.7/site-packages/pyspark/rdd.py", line 816, in collect
sock_info = self.ctx._jvm.PythonRDD.collectAndServe(self._jrdd.rdd())
File "/Library/Python/3.7/site-packages/py4j/java_gateway.py", line 1257, in __call__
answer, self.gateway_client, self.target_id, self.name)
File "/Library/Python/3.7/site-packages/pyspark/sql/utils.py", line 63, in deco
return f(*a, **kw)
File "/Library/Python/3.7/site-packages/py4j/protocol.py", line 328, in get_return_value
format(target_id, ".", name), value)
The error is generated by this line of code:
lsh = signatures.reduce(lambda x, y: __update_hash_table__(x[0], x[1], hash_tables, lsh_b, lsh_r) +
__update_hash_table__(y[0], y[1], hash_tables, lsh_b, lsh_r)).cache()
where hash_tables is a list generated in this way:
hash_tables = [[__hash_family__(i, lsh_num_hashes), {}] for i in range(lsh_b)]
The function hash_family is the following:
def __hash_family__(i, resultSize=20):
    maxLen = 10  # how long can our i be (in decimal)
    salt = str(i).zfill(maxLen)[-maxLen:]

    def hash_member(x):
        return hashlib.sha1((x + salt).encode()).digest()[-resultSize:]

    return hash_member
And this is the function update_hash_table:
def __update_hash_table__(doc_id, sig, hash_tables, lsh_b, lsh_r):
    for b in range(lsh_b):
        start_row = b * lsh_r
        end_row = start_row + lsh_r
        band = str(sig[start_row:end_row])
        bucket_idx = hash_tables[b][0](''.join(band))
        try:
            hash_tables[b][1][bucket_idx].append(doc_id)
        except KeyError:
            hash_tables[b][1][bucket_idx] = [doc_id]
    return hash_tables
I even tried generating hash_tables directly in the file that contains the definition of update_hash_table, and also generating the tables inside the function itself, but I always get the pickling error. How can I rewrite my code so that the result of that reduce operation ends up in the variable lsh?
I know I could collect signatures and turn the RDD into a list, but that would be very expensive. Can I still perform this operation without increasing the execution time too much?
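One common workaround (a minimal sketch, not from the original post) is to replace the closure returned by __hash_family__ with a module-level callable class: the standard pickler, which PySpark uses to serialize RDD data such as the value returned by reduce, can serialize instances of top-level classes but cannot serialize locally defined functions like hash_member.
import hashlib

class HashMember:
    # Picklable replacement for the hash_member closure: the salt and the
    # result size are stored as instance state instead of captured variables.
    def __init__(self, i, result_size=20, max_len=10):
        self.salt = str(i).zfill(max_len)[-max_len:]
        self.result_size = result_size

    def __call__(self, x):
        return hashlib.sha1((x + self.salt).encode()).digest()[-self.result_size:]

def __hash_family__(i, resultSize=20):
    return HashMember(i, resultSize)
With this change hash_tables is built exactly as before, but its hash functions can travel through pickle as ordinary objects.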
Hello fellow programmers!
I am trying to implement multiprocessing in a class, to reduce processing time of a program.
This is an abbreviation of the program:
import multiprocessing as mp
from functools import partial
class PlanningMachines():
    def __init__(self, machines, number_of_objectives, topology=False, episodes=None):
        ....

    def calculate_total_node_THD_func_real_data_with_topo(self):
        self.consider_topology = True
        func_part = partial(self.worker_function, consider_topology=self.consider_topology,
                            list_of_machines=self.list_of_machines, next_state=self.next_state, phase=phase, grid_topo=self.grid_topo,
                            total_THD_for_all_timesteps_with_topo=total_THD_for_all_timesteps_with_topo,
                            smallest_harmonic=smallest_harmonic, pol2cart=self.pol2cart, cart2pol=self.cart2pol,
                            total_THD_for_all_timesteps=total_THD_for_all_timesteps, harmonics_state_phase=harmonics_state_phase,
                            episode=self.episode, episodes=self.episodes, time_=self.time_, steplength=self.steplength,
                            longest_measurement=longest_measurement)
        with mp.Pool() as mpool:
            mpool.map(func_part, range(0, longest_measurement))

    def worker_function(measurement=None, consider_topology=None, list_of_machines=None, next_state=None, phase=None,
                        grid_topo=None, total_THD_for_all_timesteps_with_topo=None, smallest_harmonic=None, pol2cart=None,
                        cart2pol=None, total_THD_for_all_timesteps=None, harmonics_state_phase=None, episode=None,
                        episodes=None, time_=None, steplength=None, longest_measurement=None):
        .....
As you might know, one way of implementing parallel processing is using multiprocessing.Pool().map:
with mp.Pool() as mpool:
    mpool.map(func_part, range(0, longest_measurement))
This function requires a worker_function which can be "packed" with functools.partial:
func_part = partial(self.worker_function, consider_topology=self.consider_topology,
                    list_of_machines=self.list_of_machines, next_state=self.next_state, phase=phase, grid_topo=self.grid_topo,
                    total_THD_for_all_timesteps_with_topo=total_THD_for_all_timesteps_with_topo,
                    smallest_harmonic=smallest_harmonic, pol2cart=self.pol2cart, cart2pol=self.cart2pol,
                    total_THD_for_all_timesteps=total_THD_for_all_timesteps, harmonics_state_phase=harmonics_state_phase,
                    episode=self.episode, episodes=self.episodes, time_=self.time_, steplength=self.steplength,
                    longest_measurement=longest_measurement)
The error is thrown when I try to execute mpool.map(func_part, range(0, longest_measurement)):
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\Users\Artur\Anaconda\lib\multiprocessing\pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "C:\Users\Artur\Anaconda\lib\multiprocessing\pool.py", line 44, in mapstar
return list(map(*args))
TypeError: worker_function() got multiple values for argument 'consider_topology'
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "C:/Users/Artur/Desktop/RL_framework/train.py", line 87, in <module>
main()
File "C:/Users/Artur/Desktop/RL_framework/train.py", line 77, in main
duration = cf.training(episodes, env, agent, filename, topology=topology, multi_processing=multi_processing, CPUs_used=CPUs_used)
File "C:\Users\Artur\Desktop\RL_framework\help_functions\custom_functions.py", line 166, in training
save_interval = parallel_training(range(episodes), env, agent, log_data_qvalues, log_data, filename, CPUs_used)
File "C:\Users\Artur\Desktop\RL_framework\help_functions\custom_functions.py", line 54, in paral
lel_training
next_state, reward = env.step(action, state) # given the action, the environment gives back the next_state and the reward for the transaction for all objectives seperately
File "C:\Users\Artur\Desktop\RL_framework\help_functions\environment_machines.py", line 127, in step
self.calculate_total_node_THD_func_real_data_with_topo() # THD_plant calculation with considering grid topo
File "C:\Users\Artur\Desktop\RL_framework\help_functions\environment_machines.py", line 430, in calculate_total_node_THD_func_real_data_with_topo
mpool.map(func_part, range(longest_measurement))
File "C:\Users\Artur\Anaconda\lib\multiprocessing\pool.py", line 268, in map
return self._map_async(func, iterable, mapstar, chunksize).get()
File "C:\Users\Artur\Anaconda\lib\multiprocessing\pool.py", line 657, in get
raise self._value
TypeError: worker_function() got multiple values for argument 'consider_topology'
Process finished with exit code 1
How can consider_topology have multiple values if it is assigned only once, right before worker_function is called:
self.consider_topology = True
I hope I have described my issue well enough for you to understand. Thank you in advance.
I think the problem is that your worker_function should be a static method (or should declare self as its first parameter).
What happens now is that you provide all values except the measurement variable in the partial call; I'm guessing you do this because measurement is the one value you are changing.
However, since worker_function is an instance method, the class instance is automatically supplied as its first argument. You did not define self as the first parameter of worker_function, so the instance lands in your measurement parameter. The value from range(0, longest_measurement) that map supplies is then inserted as the second positional argument, which is consider_topology. The function therefore sees two values for consider_topology: the one from the partial call and the one from the map call.
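A minimal sketch of the static-method variant, with the long keyword list abbreviated to **other_kwargs (an abbreviation, not part of the original code):
class PlanningMachines:
    @staticmethod
    def worker_function(measurement=None, consider_topology=None, list_of_machines=None,
                        next_state=None, **other_kwargs):
        # With @staticmethod, self.worker_function no longer injects the instance,
        # so the value supplied by map() lands in `measurement` as intended.
        ...
Alternatively, keep it as a normal method but declare self explicitly (def worker_function(self, measurement=None, ...)); the instance is then consumed by self and measurement again receives the mapped value.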
I've been wrestling with getting the below multiprocessing code to work (to append nearest stores to a customer file using co-ordinates).
I believe a pandas issue is causing the problem, possibly something to do with passing the dataframe into parallelize_dataframe(), where it is split into different numpy arrays (that's just a guess). Oddly, when I run on the full postcodes file (rather than the test customer file), it doesn't crash (it ran for 15 minutes until I stopped it). However, as postcodes is 2.6m records long, I don't know whether it simply hadn't reached the point where it would crash, or whether I'm introducing the problem when I create the test files.
It's a long process that utilises most of my CPU, so I want to prove it works on the test files first before letting it run for a long time on the full file.
Either way, it persistently throws a label-indexing TypeError (shown at the end of the post).
Any help with this appreciated.
import multiprocess as mp #pip install multiprocess
import pandas as pd
import numpy as np
import functools
postcodes = pd.read_csv('national_postcode_stats_file.csv')
customers = postcodes.sample(n = 10000, random_state=1) # customers test file
stores = postcodes.sample(n = 100, random_state=1) # store test file
stores.reset_index(inplace=True)
cores = mp.cpu_count() # 8 CPUs
partitions = cores
def parallelize_dataframe(data, func):
    data_split = np.array_split(data, partitions)
    pool = mp.Pool(cores)
    data = pd.concat(pool.map(func, data_split))
    pool.close()
    pool.join()
    return data
def dist_func(stores, data):
    # Re-import libraries (each parallel process runs in a fresh interpreter)
    import pandas as pd
    import numpy as np

    def nearest(inlat1, inlon1, inlat2, inlon2, store, postcode):
        lat1 = np.radians(inlat1)
        lat2 = np.radians(inlat2)
        longdif = np.radians(inlon2 - inlon1)
        r = 6371.1009  # gives d in kilometers
        d = np.arccos(np.sin(lat1)*np.sin(lat2) + np.cos(lat1)*np.cos(lat2) * np.cos(longdif)) * r
        near = pd.DataFrame({'store': store, 'postcode': postcode, 'distance': d})
        near_min = near.loc[near['distance'].idxmin()]
        x = str(near_min['store']) + '~' + str(near_min['postcode']) + '~' + str(near_min['distance'])
        return x

    data['appended'] = data['lat'].apply(nearest, args=(data['long'], stores['lat'], stores['long'], stores['index'], stores['pcds']))
    data[['store','store_postcode','distance_km']] = data['appended'].str.split("~",expand=True)
    return data
dist_func_with_stores = functools.partial(dist_func, stores) # Needed to pass stores to parallelize_dataframe
dist = parallelize_dataframe(customers, dist_func_with_stores)
And the full error:
---------------------------------------------------------------------------
RemoteTraceback Traceback (most recent call last)
RemoteTraceback:
"""
Traceback (most recent call last):
File "C:\Users\AppData\Local\Continuum\anaconda3\lib\site-packages\multiprocess\pool.py", line 121, in worker
result = (True, func(*args, **kwds))
File "C:\Users\AppData\Local\Continuum\anaconda3\lib\site-packages\multiprocess\pool.py", line 44, in mapstar
return list(map(*args))
File "<ipython-input-34-7a1b788055e2>", line 41, in dist_func
File "C:\Users\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\series.py", line 3591, in apply
mapped = lib.map_infer(values, f, convert=convert_dtype)
File "pandas/_libs/lib.pyx", line 2217, in pandas._libs.lib.map_infer
File "C:\Users\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\series.py", line 3578, in f
return func(x, *args, **kwds)
File "<ipython-input-34-7a1b788055e2>", line 37, in nearest
File "C:\Users\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexing.py", line 1500, in __getitem__
return self._getitem_axis(maybe_callable, axis=axis)
File "C:\Users\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexing.py", line 1912, in _getitem_axis
self._validate_key(key, axis)
File "C:\Users\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexing.py", line 1799, in _validate_key
self._convert_scalar_indexer(key, axis)
File "C:\Users\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexing.py", line 262, in _convert_scalar_indexer
return ax._convert_scalar_indexer(key, kind=self.name)
File "C:\Users\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexes\numeric.py", line 211, in _convert_scalar_indexer
._convert_scalar_indexer(key, kind=kind))
File "C:\Users\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 2877, in _convert_scalar_indexer
return self._invalid_indexer('label', key)
File "C:\Users\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\indexes\base.py", line 3067, in _invalid_indexer
kind=type(key)))
TypeError: cannot do label indexing on <class 'pandas.core.indexes.numeric.Int64Index'> with these indexers [nan] of <class 'float'>
"""
The above exception was the direct cause of the following exception:
TypeError Traceback (most recent call last)
<ipython-input-34-7a1b788055e2> in <module>
45 dist_func_with_stores = functools.partial(dist_func, stores) # Needed to pass stores to parrellise_dataframe
46
---> 47 dist = parallelize_dataframe(customers, dist_func_with_stores)
<ipython-input-34-7a1b788055e2> in parallelize_dataframe(data, func)
16 data_split = np.array_split(data, partitions)
17 pool = mp.Pool(cores)
---> 18 data = pd.concat(pool.map(func, data_split))
19 pool.close()
20 pool.join()
~\AppData\Local\Continuum\anaconda3\lib\site-packages\multiprocess\pool.py in map(self, func, iterable, chunksize)
266 in a list that is returned.
267 '''
--> 268 return self._map_async(func, iterable, mapstar, chunksize).get()
269
270 def starmap(self, func, iterable, chunksize=None):
~\AppData\Local\Continuum\anaconda3\lib\site-packages\multiprocess\pool.py in get(self, timeout)
655 return self._value
656 else:
--> 657 raise self._value
658
659 def _set(self, i, obj):
TypeError: cannot do label indexing on <class 'pandas.core.indexes.numeric.Int64Index'> with these indexers [nan] of <class 'float'>
Solved it - changed the apply to a row-wise lambda function and it works fine now. The original data['lat'].apply call passed the whole data['long'] Series as inlon1 rather than a single value, which appears to be what produced the NaN indexer; applying over rows with axis=1 passes both coordinates as scalars.
data['appended'] = data.apply(lambda row: nearest(row['lat'], row['long'], stores['lat'], stores['long'], stores['index'], stores['pcds']),axis=1)
opt = SolverFactory("glpk")
opt.options["mipgap"] = 0.05
opt.options["FeasibilityTol"] = 1e-05
solver_manager = SolverManagerFactory("serial")
# results = solver_manager.solve(instance, opt=opt, tee=True,timelimit=None, mipgap=0.1)
results = solver_manager.solve(model, opt=opt, tee=True, timelimit=None)
# sends results to stdout
# results.write()
def pyomo_save_results(options=None, instance=None, results=None):
    OUTPUT = open(r'Results_generic_hub.txt', 'w')
    print(results, file=OUTPUT)
    OUTPUT.close()
It generates the following error. GLPK is installed, and glpsol --help works from any directory. Is this a problem with the GLPK module, or with the model itself? Environment: Conda, Mac OS Yosemite.
File "<ipython-input-7-ba156f9322b2>", line 7, in <module>
results = solver_manager.solve(model, opt=opt, tee=True,timelimit=None)
File "/anaconda/lib/python3.6/site-
packages/pyomo/opt/parallel/async_solver.py", line 34, in solve
return self.execute(*args, **kwds)
File "/anaconda/lib/python3.6/site-
packages/pyomo/opt/parallel/manager.py", line 107, in execute
ah = self.queue(*args, **kwds)
File "/anaconda/lib/python3.6/site-
packages/pyomo/opt/parallel/manager.py", line 122, in queue
return self._perform_queue(ah, *args, **kwds)
File "/anaconda/lib/python3.6/site-
packages/pyomo/opt/parallel/local.py", line 59, in _perform_queue
results = opt.solve(*args, **kwds)
File "/anaconda/lib/python3.6/site-packages/pyomo/opt/base/solvers.py", line 582, in solve
self._presolve(*args, **kwds)
File "/anaconda/lib/python3.6/site-packages/pyomo/opt/solver/shellcmd.py", line 196, in _presolve
OptSolver._presolve(self, *args, **kwds)
File "/anaconda/lib/python3.6/site-packages/pyomo/opt/base/solvers.py", line 661, in _presolve
**kwds)
File "/anaconda/lib/python3.6/site-packages/pyomo/opt/base/solvers.py", line 729, in _convert_problem
**kwds)
File "/anaconda/lib/python3.6/site-packages/pyomo/opt/base/convert.py", line 110, in convert_problem
problem_files, symbol_map = converter.apply(*tmp, **tmpkw)
File "/anaconda/lib/python3.6/site-packages/pyomo/solvers/plugins/converter/model.py", line 86, in apply
io_options=io_options)
File "/anaconda/lib/python3.6/site-packages/pyomo/core/base/block.py", line 1646, in write
io_options)
File "/anaconda/lib/python3.6/site-packages/pyomo/repn/plugins/cpxlp.py", line 163, in __call__
include_all_variable_bounds=include_all_variable_bounds)
File "/anaconda/lib/python3.6/site-packages/pyomo/repn/plugins/cpxlp.py", line 575, in _print_model_LP
" cannot write legal LP file" % str(model.name))
ValueError: ERROR: No objectives defined for input model 'unknown'; cannot write legal LP file
The error you are seeing:
"ERROR: No objectives defined for input model 'unknown'; cannot write legal LP file"
indicates that Pyomo cannot find an active Objective component on your model (either you never added one to the model, or the Objective component(s) were all deactivated). Either way, valid LP files (which is how Pyomo interfaces with GLPK) require an objective. Fixing your model by adding an Objective should resolve this error.
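For illustration only (a minimal sketch with made-up variables and a made-up cost expression, since the original model is not shown), an active Objective looks like this:
from pyomo.environ import ConcreteModel, Var, Objective, Constraint, NonNegativeReals, minimize

model = ConcreteModel()
model.x = Var(domain=NonNegativeReals)
model.y = Var(domain=NonNegativeReals)

# The objective component Pyomo needs before it can write a legal LP file for GLPK.
model.cost = Objective(expr=2 * model.x + 3 * model.y, sense=minimize)

model.demand = Constraint(expr=model.x + model.y >= 10)
Any Objective works as long as it is present and active (not deactivated) when the solve is issued.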
Try this code at the end of the script:
instance = model.create()
instance.pprint()
opt = SolverFactory("glpk")
results = opt.solve(instance)
print(results)
I keep getting
Exception: 'numpy.float64' object is not callable
when trying to minimize a function.
I can call the function I'm trying to minimize as
def testLLCalc():
    mmc = MortalityModelCalibrator()
    a = mmc.log_likelihood(2000, np.array([[0.6, 0.2, 0.8]]))
but when I try and minimize it by doing
x0 = np.array([0, 0, 0])
res = minimize(-a[0], x0)
I get the exception above. Any help would be appreciated. Full traceback is:
Error
Traceback (most recent call last):
File "C:\Program Files (x86)\JetBrains\WinPython-64bit-3.5.3.0Qt5\python-3.5.3.amd64\lib\unittest\case.py", line 59, in testPartExecutor
yield
File "C:\Program Files (x86)\JetBrains\WinPython-64bit-3.5.3.0Qt5\python-3.5.3.amd64\lib\unittest\case.py", line 601, in run
testMethod()
File "C:\Program Files (x86)\JetBrains\WinPython-64bit-3.5.3.0Qt5\python-3.5.3.amd64\lib\site-packages\nose\case.py", line 198, in runTest
self.test(*self.arg)
File "C:\Users\Matt\Documents\PyCharmProjects\Mortality\src\PennanenMortalityModel_test.py", line 57, in testLLCalc
res = minimize(-a[0], x0)
File "C:\Program Files (x86)\JetBrains\WinPython-64bit-3.5.3.0Qt5\python-3.5.3.amd64\lib\site-packages\scipy\optimize\_minimize.py", line 444, in minimize
return _minimize_bfgs(fun, x0, args, jac, callback, **options)
File "C:\Program Files (x86)\JetBrains\WinPython-64bit-3.5.3.0Qt5\python-3.5.3.amd64\lib\site-packages\scipy\optimize\optimize.py", line 913, in _minimize_bfgs
gfk = myfprime(x0)
File "C:\Program Files (x86)\JetBrains\WinPython-64bit-3.5.3.0Qt5\python-3.5.3.amd64\lib\site-packages\scipy\optimize\optimize.py", line 292, in function_wrapper
return function(*(wrapper_args + args))
File "C:\Program Files (x86)\JetBrains\WinPython-64bit-3.5.3.0Qt5\python-3.5.3.amd64\lib\site-packages\scipy\optimize\optimize.py", line 688, in approx_fprime
return _approx_fprime_helper(xk, f, epsilon, args=args)
File "C:\Program Files (x86)\JetBrains\WinPython-64bit-3.5.3.0Qt5\python-3.5.3.amd64\lib\site-packages\scipy\optimize\optimize.py", line 622, in _approx_fprime_helper
f0 = f(*((xk,) + args))
File "C:\Program Files (x86)\JetBrains\WinPython-64bit-3.5.3.0Qt5\python-3.5.3.amd64\lib\site-packages\scipy\optimize\optimize.py", line 292, in function_wrapper
return function(*(wrapper_args + args))
Exception: 'numpy.float64' object is not callable
scipy's minimize expects a callable function as first argument.
As you did not show your complete code this is partly guesswork, but this
res = minimize(-a[0], x0)
has to mean that the first element of a should be a function.
Seeing this line:
a = mmc.log_likelihood(2000, np.array([[0.6, 0.2, 0.8]]))
it does not look like that is the case: a scalar is probably being returned.
The effect is simple: scipy wants to call the given function with some argument (x0 at the start), but in your case it ends up calling a numpy scalar value with that argument, which of course is not valid.
Review the docs:
minimize(fun, x0, args=(),...

fun : callable
    Objective function.
x0 : ndarray
    Initial guess.
args : tuple, optional
    Extra arguments passed to the objective function and its derivatives.
Do you know what a 'callable' is? It's a function (or equivalent): something that can be called as fun(x0, arg0, arg1, ...).
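A minimal sketch of a callable objective built from the names in the question (the exact signature and return shape of log_likelihood are assumptions):
import numpy as np
from scipy.optimize import minimize

mmc = MortalityModelCalibrator()  # class from the question's own module

def neg_log_likelihood(params):
    # Wrap the model call in a function so scipy has something it can call repeatedly.
    a = mmc.log_likelihood(2000, params.reshape(1, -1))
    return -a[0]

x0 = np.array([0.0, 0.0, 0.0])
res = minimize(neg_log_likelihood, x0)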
The error tells us that -a[0] is an element of a numpy array a (a numpy.float64), not a function.
It's not clear whether you are trying to minimize this function, or whether this is part of using minimize. It can't be the source of a, because it doesn't return anything.
def testLLCalc():
    mmc = MortalityModelCalibrator()
    a = mmc.log_likelihood(2000, np.array([[0.6, 0.2, 0.8]]))
    # return a ????
So - review your understanding of basic Python, especially the idea of a 'callable'. And run some of the minimize examples to get a better feel for how to use this function.
I have a Python object, a list of dictionaries, and I want to fill each of those dicts with key-value pairs, in parallel across multiple processors using Python's multiprocessing module. For that purpose I am using a Manager to store the object. Here is the code:
from pylab import *
from numpy.random import *
import multiprocessing
import threading
import random
def tasks_start(id, global_lists):
    counter_lock = threading.Lock()
    with counter_lock:
        num = int(10*random.random())
        global_lists[num] = {'1':'Random'}
    print("Id: ", id)
    print(global_lists[0])

if __name__ == '__main__':
    numProcessors = 6
    pool = multiprocessing.Pool(numProcessors)
    global_list = multiprocessing.Manager().list(range(100))
    for idx in range(100):
        global_list[idx] = multiprocessing.Manager().dict()
    tasks = []
    for id in range(10):
        tasks.append((id, global_list))
    pool.starmap(tasks_start, tasks)
    pool.close()
    pool.join()
What I am doing here is creating a list of dictionaries stored as global_list and then calling the tasks_start() function 10 times using Pool.starmap() (just so that I can later extend it to multiple arguments) to fill the list of dictionaries. As a simple test case, I just use the random generator to pick one dictionary from the list each time and fill it with some value. When I run the program, the following error occurs:
multiprocessing.pool.RemoteTraceback:
"""
Traceback (most recent call last):
File "/usr/lib/python3.4/multiprocessing/pool.py", line 119, in worker
result = (True, func(*args, **kwds))
File "/usr/lib/python3.4/multiprocessing/pool.py", line 47, in starmapstar
return list(itertools.starmap(args[0], args[1]))
File "/home/cysis/inhibition_soum/motif_temporal_patterns/code_versions/2016/09/09_08/parallel_test/test_error_manager.py", line 14, in tasks_start
print(global_lists[0])
File "<string>", line 2, in __getitem__
File "/usr/lib/python3.4/multiprocessing/managers.py", line 732, in _callmethod
kind, result = conn.recv()
File "/usr/lib/python3.4/multiprocessing/connection.py", line 251, in recv
return ForkingPickler.loads(buf.getbuffer())
File "/usr/lib/python3.4/multiprocessing/managers.py", line 852, in RebuildProxy
return func(token, serializer, incref=incref, **kwds)
File "/usr/lib/python3.4/multiprocessing/managers.py", line 706, in __init__
self._incref()
File "/usr/lib/python3.4/multiprocessing/managers.py", line 756, in _incref
conn = self._Client(self._token.address, authkey=self._authkey)
File "/usr/lib/python3.4/multiprocessing/connection.py", line 495, in Client
c = SocketClient(address)
File "/usr/lib/python3.4/multiprocessing/connection.py", line 624, in SocketClient
s.connect(address)
FileNotFoundError: [Errno 2] No such file or directory
"""
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
File "/home/cysis/inhibition_soum/motif_temporal_patterns/code_versions/2016/09/09_08/parallel_test/test_error_manager.py", line 29, in <module>
pool.starmap(tasks_start, tasks)
File "/usr/lib/python3.4/multiprocessing/pool.py", line 268, in starmap
return self._map_async(func, iterable, starmapstar, chunksize).get()
File "/usr/lib/python3.4/multiprocessing/pool.py", line 599, in get
raise self._value
FileNotFoundError: [Errno 2] No such file or directory
My guess is that before the last print(global_lists[0]) is executed, the Manager exits and is therefore no longer able to find global_lists[0]. Can anybody shed some light on this?
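A likely explanation consistent with the traceback: each multiprocessing.Manager() call starts its own server process, and the temporary managers created inside the loop (multiprocessing.Manager().dict()) are never kept referenced, so they can be garbage-collected and their servers shut down while the stored proxies still point at them; when a worker later unpickles such a proxy it tries to connect to a socket that no longer exists, hence the FileNotFoundError. A minimal sketch of the usual remedy (not from the original post), creating one Manager and keeping it alive for the whole run:
import multiprocessing
import random

def tasks_start(id, global_lists):
    num = int(10 * random.random())
    global_lists[num]['1'] = 'Random'   # mutate the shared dict proxy in place
    print("Id: ", id)
    print(global_lists[0])

if __name__ == '__main__':
    manager = multiprocessing.Manager()  # one manager, kept referenced until the end
    global_list = manager.list([manager.dict() for _ in range(100)])

    with multiprocessing.Pool(6) as pool:
        pool.starmap(tasks_start, [(id, global_list) for id in range(10)])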