Use multiprocessing for each recursion step - python

My requirement is to generate a list of permissible combinations. The code below is a simplified version that meets my need.
def getChild(tupF):
    if len(tupF) <= 60:
        for val in range(1, 10):  # in the actual requirement this is not a fixed range, but some complex processing that determines the list of values to append
            t = list(tupF)  # I convert the tuple to a list, append to it, and convert back to a tuple; handling it purely as a list somehow didn't work
            t.append(val)
            getChild(tuple(t))
            t = []
    else:
        print(tupF)

tup = ()
getChild(tup)
But, as the number of levels is high (60) and each of my combinations is completely independent of the others, I would like to make this code a multiprocess one.
I tried adding
    t.append(val)
    tmpLst.append(tuple(t))
    t = []

if __name__ == '__main__':
    pool = Pool(processes=3)
    pool.map(getChild, tmpLst)
But this didn't work, as my worker process tries to sub-divide further. In my case, I don't think the number of sub-processes would explode, because once the parent process has spawned a set of child processes, I am OK with terminating the parent process, since all the desired information is in the tuple I am passing to each child process.
Please let me know whether this problem is a right candidate for multiprocessing; if yes, please provide some guidance on how to make it multiprocess so that I can reduce the computation time. I have no prior experience writing multiprocessing code, so if you can point to a relevant example, that would be great. Thanks.
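For reference, a minimal sketch of the approach described above: build the first-level tuples sequentially, then let a pool of workers recurse over each branch independently. getChild here is a lightly condensed version of the function in the question, and the worker count of 3 is arbitrary; this is only an illustration, not a tested solution.

from multiprocessing import Pool

def getChild(tupF):
    if len(tupF) <= 60:
        for val in range(1, 10):
            getChild(tupF + (val,))   # append one value and recurse
    else:
        print(tupF)

if __name__ == '__main__':
    tmpLst = [(val,) for val in range(1, 10)]   # first recursion level, built in the parent
    pool = Pool(processes=3)
    pool.map(getChild, tmpLst)   # each worker expands one branch on its own
    pool.close()
    pool.join()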

Related

How to have a multiprocessing function return and store values in Python?

I have a function which I will run using multiprocessing. However, the function returns a value, and I do not know how to store that value once it's done.
I read somewhere online about using a queue but I don't know how to implement it or if that'd even work.
cores = []
for i in range(os.cpu_count()):
    cores.append(Process(target=processImages, args=(dataSets[i],)))

for core in cores:
    core.start()

for core in cores:
    core.join()
Where the function 'processImages' returns a value. How do I save the returned value?
In your code fragment you have input dataSets which is a list of some unspecified size. You have a function processImages which takes a dataSet element and apparently returns a value you want to capture.
cpu_count == dataset length ?
The first problem I notice is that os.cpu_count() drives the range of values i which then determines which datasets you process. I'm going to assume you would prefer these two things to be independent. That is, you want to be able to crunch some X number of datasets and you want it to work on any machine, having anywhere from 1 - 1000 (or more...) cores.
An aside about CPU-bound work
I'm also going to assume that you have already determined that the task really is CPU-bound, and thus it makes sense to split the work by core. If, instead, your task is disk I/O-bound, you would want more workers. You could also be memory-bound or cache-bound. If optimal parallelization is important to you, you should consider doing some trials to see which number of workers really gives you maximum performance.
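If you do want to run such a trial, a rough timing harness might look like the sketch below; process_one is just a hypothetical CPU-bound stand-in for your real work, not anything from the question.

import os
import time
from multiprocessing import Pool

def process_one(n):
    # Hypothetical stand-in for real CPU-bound work
    return sum(i * i for i in range(n))

def benchmark(worker_count, work):
    # Time one full pass of the workload with a given pool size
    start = time.perf_counter()
    with Pool(worker_count) as pool:
        pool.map(process_one, work)
    return time.perf_counter() - start

if __name__ == '__main__':
    work = [200000] * 32
    for n in (1, 2, os.cpu_count(), 2 * os.cpu_count()):
        print(n, "workers:", round(benchmark(n, work), 3))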
Here's more reading if you like
Pool class
Anyway, as mentioned by Michael Butscher, the Pool class simplifies this for you. Yours is a standard use case. You have a set of work to be done (your list of datasets to be processed) and a number of workers to do it (in your code fragment, your number of cores).
TLDR
Use those simple multiprocessing concepts like this:
import os
from multiprocessing import Pool
# Renaming this variable just for clarity of the example here
work_queue = datasets
# This is the number you might want to find experimentally. Or just run with cpu_count()
worker_count = os.cpu_count()
# This will create processes (fork) and join all for you behind the scenes
worker_pool = Pool(worker_count)
# Farm out the work, gather the results. Does not care whether dataset count equals cpu count
processed_work = worker_pool.map(processImages, work_queue)
# Do something with the result
print(processed_work)
You cannot return a variable from another process. The recommended way is to create a Queue (multiprocessing.Queue), have your subprocess put the results into that queue, and once it's done, read them back -- this works if you have a lot of results.
If you just need a single number -- using Value or Array could be easier.
Just remember, you cannot use a simple variable for that; it has to be wrapped with the above-mentioned classes from the multiprocessing lib.
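A minimal sketch of the Queue approach just described, loosely modelled on the processImages setup from the question (the processImages body and the dataSets contents here are made up for illustration):

from multiprocessing import Process, Queue

def processImages(data):
    # Stand-in for the real work; returns one value per dataset
    return len(data)

def worker(data, queue):
    # Each subprocess puts its result onto the shared queue
    queue.put(processImages(data))

if __name__ == '__main__':
    dataSets = [[1, 2, 3], [4, 5], [6]]
    queue = Queue()
    cores = [Process(target=worker, args=(d, queue)) for d in dataSets]
    for core in cores:
        core.start()
    results = [queue.get() for _ in dataSets]  # drain the queue before joining
    for core in cores:
        core.join()
    print(results)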
If you want to use the result object returned by multiprocessing, try this:
from multiprocessing.pool import ThreadPool

def fun(fun_argument1, ... , fun_argumentn):
    <blabla>
    return object_1, object_2

pool = ThreadPool(processes=number_of_your_process)
async_num1 = pool.apply_async(fun, (fun_argument1, ... , fun_argumentn))
object_1, object_2 = async_num1.get()
then you can do whatever you want.

python for large data processing

I'm relatively new to Python, and have been able to answer most of my questions based on similar problems answered on forums, but I'm at a point where I'm stuck and could use some help.
I have a simple nested for loop script that generates an output of strings. What I need to do next is have each grouping go through a simulation, based on numerical values that the strings will be matched to.
Really, my question is how do I go about this in the best way? I'm not sure if multithreading will work, since the strings are generated and then need to undergo the simulation one set at a time. I was reading about queues and wasn't sure whether the combinations could be passed into a queue for storage and then undergo the simulation in the same order they entered the queue.
Regardless of the research I've done, I'm open to any suggestion anyone can make on the matter.
thanks!
edit: I'm not looking for an answer on how to do the simulation, but rather a way to store the combinations while the simulations are being computed.
example
X = ["a","b"]
Y = ["c","d","e"]
Z= ["f","g"]
for A in itertools.combinations(X,1):
for B in itertools.combinations(Y,2):
for C in itertools.combinations(Z, 2):
D = A + B + C
print(D)
As was hinted at in the comments, the multiprocessing module is what you're looking for. Threading won't help you because of the Global Interpreter Lock (GIL), which limits execution to one Python thread at a time. In particular, I would look at multiprocessing pools. These objects give you an interface to have a pool of subprocesses do work for you in parallel with the main process, and you can go back and check on the results later.
Your example snippet could look something like this:
import itertools
import multiprocessing

X = ["a", "b"]
Y = ["c", "d", "e"]
Z = ["f", "g"]

# by default, this will create a number of worker processes equal to
# the number of CPU cores you have available
pool = multiprocessing.Pool()

combination_list = []  # create a list to store the combinations
for A in itertools.combinations(X, 1):
    for B in itertools.combinations(Y, 2):
        for C in itertools.combinations(Z, 2):
            D = A + B + C
            combination_list.append(D)  # append this combination to the list

# simulation_function is the function you're using to actually run your
# simulation - assuming it only takes one parameter: the combination
results = pool.map(simulation_function, combination_list)
The call to pool.map is blocking - meaning that once you call it, execution in the main process will halt until all the simulations are complete, but it is running them in parallel. When they complete, whatever your simulation function returns will be available in results, in the same order that the input arguments were in the combination_list.
If you don't want to wait for them, you can also use apply_async on your pool and store the result to look at later:
import itertools
import multiprocessing

X = ["a", "b"]
Y = ["c", "d", "e"]
Z = ["f", "g"]

pool = multiprocessing.Pool()

result_list = []  # create a list to store the simulation results
for A in itertools.combinations(X, 1):
    for B in itertools.combinations(Y, 2):
        for C in itertools.combinations(Z, 2):
            D = A + B + C
            result_list.append(pool.apply_async(
                simulation_function,
                args=(D,)))  # note the extra comma - args must be a tuple

# do other stuff
# now iterate over result_list to check the results when they're ready
If you use this structure, result_list will be full of multiprocessing.AsyncResult objects, which allow you to check whether they are ready with result.ready() and, if so, retrieve the value with result.get(). This approach kicks off each simulation as soon as its combination is calculated, instead of waiting until all of them have been calculated before processing begins. The downside is that it's a little more complicated to manage and retrieve the results: for example, you have to make sure a result is ready before calling get() (or be prepared to catch an exception), you need to handle exceptions that may have been raised in the worker function, and so on. The caveats are explained pretty well in the documentation.
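For instance, the check-and-collect loop hinted at by the last comment in that snippet could look roughly like this sketch (it assumes the result_list built above and simply polls until everything has finished):

import time

pending = list(result_list)
finished = []
while pending:
    still_pending = []
    for result in pending:
        if result.ready():
            try:
                finished.append(result.get())  # get() re-raises anything the worker raised
            except Exception as exc:
                print("simulation failed:", exc)
        else:
            still_pending.append(result)
    pending = still_pending
    if pending:
        time.sleep(0.1)  # avoid busy-waiting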
If calculating the combinations doesn't actually take very long and you don't mind your main process halting until they're all ready, I suggest the pool.map approach.

Multiprocessing, pooling and randomness

I am experiencing a strange thing: I wrote a program to simulate economies. Instead of running this simulation one by one on one CPU core, I want to use multiprocessing to make things faster. So I run my code (fine), and I want to get some stats from the simulations I am doing. Then arises one surprise: all the simulations done at the same time yield the very same result! Is there some strange relationship between Pool() and random.seed()?
To be much clearer, here is what the code can be summarized as:
from multiprocessing import Pool

class Economy(object):
    def __init__(self, i):
        self.run_number = i
        self.Statistics = Statistics()
        self.process()

def run_and_return(i):
    eco = Economy(i)
    return eco

collection = []

def get_result(x):
    collection.append(x)

if __name__ == '__main__':
    pool = Pool(processes=4)
    for i in range(NRUN):
        pool.apply_async(run_and_return, (i,), callback=get_result)
    pool.close()
    pool.join()
process() is the function that goes through every step of the simulation (i steps). Basically I simulate NRUN Economies, from which I get the Statistics that I put in the list collection.
Now the strange thing is that the output of this is exactly the same for the first 4 runs: during the same "wave" of simulations, I get the very same output. Once I get to the second wave, then I get a different output for the next 4 simulations!
All these simulations run well if I use the same program with processes=1: I get different results when I only work on one core, taking simulations one by one... I have tried a few things, but can't get my head around this, hence my post...
Thank you very much for taking the time to read this long post, do not hesitate to ask for more precisions!
All the best,
If you are on Linux then each pool process is made by forking the parent process. This means the process is literally duplicated - this includes the seed any random object may be using.
The random module selects the seed for its default functions on import. Meaning the seed has already been selected before you create the Pool.
To get around this you must use an initialiser for each pool process that sets the random seed to something unique.
A decent way to seed random would be to use the process id and the current time. The process id is bound to be unique on a single run of your program. Whilst using the time will ensure uniqueness over multiple runs in case the same process id is produced. Passing process id and time through as a string will mean that the digest of the string is also used to seed the random number generator -- meaning two similar strings will produce substantially different seeds. Alternatively, you could use the uuid module to generate seeds.
def proc_init():
    random.seed(str(os.getpid()) + str(time.time()))

pool = Pool(num_procs, initializer=proc_init)
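A sketch of the uuid alternative mentioned above; because uuid4() draws on the operating system's randomness, each worker ends up with its own seed even after a fork:

import random
import uuid

def proc_init():
    # Seed each worker from a freshly generated random UUID instead of pid + time
    random.seed(uuid.uuid4().int)

pool = Pool(num_procs, initializer=proc_init)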

Is this usage of Python threading safe/good?

I've got an application which gets some results from some urls and then has to take a decision based on the results (i.e.: pick the best result and display it to the user). Since I want to check several urls this was the first time that multithreading is pretty much needed.
So with the help of some examples I cooked up the following test code:
import threading
import urllib2

threadsList = []
theResultList = []

def get_url(url):
    result = urllib2.urlopen(url).read()
    theResultList.append(result[0:10])

theUrls = ['http://google.com', 'http://yahoo.com']

for u in theUrls:
    t = threading.Thread(target=get_url, args=(u,))
    threadsList.append(t)
    t.start()
    t.join()

print theResultList
This seems to work, but I'm really insecure here because I really have virtually no experience with multithreading. I always hear these terms like "thread safe" and "race condition".
Of course I read about these things, but since this is my first time using something like this, my question is: is it ok to do it like this? Are there any negative or unexpected effects which I overlook? Are there ways to improve this?
All tips are welcome!
You have to worry about race conditions when you have multiple threads modifying the same object. In your case you have this exact condition - all threads are modifying theResultList.
However, Python's lists are thread safe - read more here. Therefore appends to a list from multiple threads will not somehow corrupt the list structure - you still have to take care to protect concurrent modifications to individual list elements however. For example:
# not thread safe code! - all threads modifying the same element
def get_url(url):
    result = urllib2.urlopen(url).read()
    # in this example, theResultList is a list of integers
    theResultList[0] += 1
In your case, you aren't doing something like this, so your code is fine.
Side note:
The reason incrementing an integer isn't thread safe, is because it's actually two operations - one operation to read the value, and one operation to increment the value. A thread can be interrupted between these two steps (by another thread that also wants to increment the same variable) - this means that when the thread finally does increment in the second step, it could be incrementing an out of date value.
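If you ever do need to share a counter like that between threads, the usual fix is to hold a lock around the read-and-increment, as in this generic sketch (not something the question's code requires):

import threading

counter = [0]                      # shared mutable state
counter_lock = threading.Lock()

def increment():
    for _ in range(100000):
        with counter_lock:         # read + increment happen atomically under the lock
            counter[0] += 1

threads = [threading.Thread(target=increment) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter[0])                  # always 400000 with the lock; unpredictable without it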

python - multiprocessing - static tree traversal - performance gain?

I have a node tree where every node has an id (node number), a list of children and a depth indicator. I am then given a list of nodes whose depth I am to find. To do this I use a recursive function.
This is all fine and dandy, but I want to speed the process up. I've been looking into multiprocessing, but every time I try it, the calculation time goes up (the higher the process count, the longer the runtime) compared to using no other processes at all.
My code looks like junk from trying to understand a lot of different examples, so I'll post this pseudocode instead.
class Node:
    id = int
    children = int[]
    depth = int

function makeNodeTree() ...

function find(x, node):
    for c in node.children:
        if c.id == x:
            return c
        else:
            result = find(x, c)
            if result != None:
                return result
    return None

function main():
    search = [nodeid, nodeid, nodeid...]

    timerstart
    for x in search:
        find(x, rootNode)
    timerstop

    timerstart
    <split list over number of processes>
    <do some multiprocess magic>
    <get results>
    timerstop

    compare the two
I've tried all kinds of tree sizes to see if there is any gain at all, but I have yet to find such a case, which leads me to think I'm doing something wrong. I guess what I'm asking for is an example/way of doing this traversal with a performance gain, using multiprocessing.
I know there are plenty of ways to organize the nodes to make this task easy, but I want to check the possible(?) performance boost, if it is possible at all.
Multiprocessing has overhead because every time you add a process it takes time to set it up. Also, if you are using standard Python threads you are unlikely to get any speedup, because all the threads will still run on one processor. So, three thoughts: (1) is your tree really so big that you need to speed this up? (2) spawn subprocesses; (3) don't use parallelism at each node, just at the top few levels, to minimize overhead.
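A rough illustration of point (3), using a Node class guessed from the pseudocode above: each worker searches one top-level subtree in full, so the parallelism sits only at the root rather than inside the recursion. Names and tree layout here are invented for the sketch.

from multiprocessing import Pool

class Node:
    def __init__(self, id, children=None, depth=0):
        self.id = id
        self.children = children or []
        self.depth = depth

def find(x, node):
    # Plain recursive search, unchanged from the sequential version
    if node.id == x:
        return node
    for c in node.children:
        result = find(x, c)
        if result is not None:
            return result
    return None

def search_subtree(args):
    subtree, targets = args
    # One worker handles one top-level subtree for every target id
    return [find(x, subtree) is not None for x in targets]

if __name__ == '__main__':
    root = Node(0, [Node(1, [Node(3), Node(4)]), Node(2, [Node(5)])])
    search = [3, 5]
    with Pool() as pool:
        hits = pool.map(search_subtree, [(c, search) for c in root.children])
    print(hits)  # one list of hit/miss flags per top-level subtree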
