What does the delayed() function do (when used with joblib in Python)

I've read through the documentation, but I don't understand what is meant by:
The delayed function is a simple trick to be able to create a tuple (function, args, kwargs) with a function-call syntax.
I'm using it to iterate over the list I want to operate on (allImages) as follows:
def joblib_loop():
    Parallel(n_jobs=8)(delayed(getHog)(i) for i in allImages)
This returns my HOG features, like I want (and with the speed gain using all my 8 cores), but I'm just not sure what it is actually doing.
My Python knowledge is alright at best, and it's very possible that I'm missing something basic. Any pointers in the right direction would be most appreciated.

Perhaps things become clearer if we look at what would happen if instead we simply wrote
Parallel(n_jobs=8)(getHog(i) for i in allImages)
which, in this context, could be expressed more naturally as:
Create a Parallel instance with n_jobs=8
Create a generator that yields getHog(i) for each image in allImages
Pass that generator to the Parallel instance
What's the problem? Each getHog(i) call runs in the main process, sequentially, as the generator is consumed, so by the time anything reaches the Parallel workers the result has already been computed. There is nothing left for Parallel to execute; all the work was done up front in the main process.
What we actually want is to tell Python what functions we want to call with what arguments, without actually calling them - in other words, we want to delay the execution.
This is what delayed conveniently allows us to do, with clear syntax. If we want to tell Python that we'd like to call foo(2, g=3) sometime later, we can simply write delayed(foo)(2, g=3). Returned is (essentially) the tuple (foo, (2,), {'g': 3}) (illustrated in the short sketch after this list), containing:
a reference to the function we want to call, e.g. foo
all positional arguments (short "args"), e.g. 2
all keyword arguments (short "kwargs"), e.g. g=3
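Here is a minimal sketch of that idea (foo and its argument values are just placeholders; in current joblib versions the positional arguments come back as a tuple):

from joblib import delayed

def foo(x, g=0):
    return x + g

task = delayed(foo)(2, g=3)   # nothing is called yet
print(task)                   # roughly (foo, (2,), {'g': 3})
func, args, kwargs = task
print(func(*args, **kwargs))  # 5, the call we actually intended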
So, by writing Parallel(n_jobs=8)(delayed(getHog)(i) for i in allImages), instead of the above sequence, now the following happens:
A Parallel instance with n_jobs=8 gets created
The list
[delayed(getHog)(i) for i in allImages]
gets created, evaluating to
[(getHog, (img1,), {}), (getHog, (img2,), {}), ... ]
That list is passed to the Parallel instance
The Parallel instance spins up 8 workers (processes or threads, depending on the backend) and distributes the tuples from the list to them
Finally, each worker executes the tuples it receives, i.e., it calls the first element with the second and third elements unpacked as arguments, tup[0](*tup[1], **tup[2]), turning the tuple back into the call we actually intended to make: getHog(img2).
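As a rough mental model (not joblib's actual implementation), the workers end up doing the equivalent of the following, shown here sequentially with a placeholder getHog:

from joblib import delayed

def getHog(image):
    # placeholder for the real HOG feature extraction
    return len(image)

allImages = ['img1.png', 'img2.png', 'img3.png']

tasks = [delayed(getHog)(i) for i in allImages]
# each task is a (function, args, kwargs) tuple; "executing" it means:
results = [func(*args, **kwargs) for func, args, kwargs in tasks]
print(results)  # [8, 8, 8]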

We need a loop to test a list of different model configurations. This is the main function that drives the grid search process and calls the score_model() function for each model configuration. We can dramatically speed up the grid search by evaluating model configurations in parallel. One way to do that is to use the Joblib library. We can define a Parallel object with the number of cores to use, setting it to the number of cores detected in your hardware:
# define executor
executor = Parallel(n_jobs=cpu_count(), backend='multiprocessing')
Then create a list of tasks to execute in parallel, which will be one call to the score_model() function for each model configuration we have. Suppose we have:
def score_model(data, n_test, cfg):
    ...
# define list of tasks
tasks = (delayed(score_model)(data, n_test, cfg) for cfg in cfg_list)
Finally, we can use the Parallel object to execute the list of tasks in parallel:
scores = executor(tasks)
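Putting those pieces together, a self-contained sketch might look like this (score_model here is a stand-in with dummy scoring, and data and cfg_list are made up; adapt them to your own grid search). With the multiprocessing backend, the parallel call should sit under an if __name__ == '__main__': guard on platforms that spawn new processes:

from multiprocessing import cpu_count
from joblib import Parallel, delayed

def score_model(data, n_test, cfg):
    # placeholder scoring logic; replace with real walk-forward validation
    return (cfg, sum(data[:n_test]) * cfg)

if __name__ == '__main__':
    data = list(range(100))
    n_test = 10
    cfg_list = [0.1, 0.5, 1.0, 2.0]

    executor = Parallel(n_jobs=cpu_count(), backend='multiprocessing')
    tasks = (delayed(score_model)(data, n_test, cfg) for cfg in cfg_list)
    scores = executor(tasks)
    print(scores)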

So what you want to be able to do is pile up a set of function calls and their arguments in such a way that you can pass them out efficiently to a scheduler/executor. delayed wraps a function so that calling the wrapper packages the function and its arguments into an object (a tuple) that can be put in a list and popped off as needed, instead of executing the call immediately. Dask has the same concept, which it uses in part to feed its graph scheduler.

From the reference https://wiki.python.org/moin/ParallelProcessing:
The Parallel object creates a multiprocessing pool that forks the Python interpreter in multiple processes to execute each of the items of the list. The delayed function is a simple trick to be able to create a tuple (function, args, kwargs) with a function-call syntax.
Another thing I would like to suggest: instead of explicitly defining the number of cores, we can generalize it like this:
import multiprocessing
num_core = multiprocessing.cpu_count()
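Joblib also accepts n_jobs=-1 as shorthand for "use all available cores", so the explicit count can be skipped entirely. A tiny self-contained example (math.sqrt stands in for whatever function you want to parallelize):

import math
from joblib import Parallel, delayed

# n_jobs=-1 tells joblib to use every available core
results = Parallel(n_jobs=-1)(delayed(math.sqrt)(i) for i in range(10))
print(results)  # [0.0, 1.0, 1.414..., ...]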

Related

What is this joblib Parallel syntax doing? So many parentheses

Scikit-learn often uses joblib to parallelize with calls like Parallel(n_jobs=n_jobs)(delayed(function)(param_to_function) for param_to_function in iterable).
This helpful question and answer indicate that this double-parenthesis business means the second set is passed to whatever is returned by the call involving the first set, which makes a lot of sense if the thing returned is a callable.
But here the thing returned by Parallel(n_jobs=n_jobs) should be a Parallel object, right? And then we're passing it a generator object given by the loop in the second parenthetical. You shouldn't be able to directly pass a generator to a class after construction like that. There should be some function call between the object and the input. Or in Python is there __some_special_function__ that works with this syntax?
What is this syntax doing, exactly?
The "special function" is probably just a __call__ method. An instance of a class with that method can be called just like a function. In this case, Parallel presumably defines __call__ to accept a generator.
(Note, that's not to say it's a good idea to write code like your example. It's needlessly confusing.)
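To see why the double-parenthesis syntax works, here is a tiny stand-in (not joblib's real Parallel, just an illustration of a class whose __call__ accepts a generator of (function, args, kwargs) tuples):

class FakeParallel:
    def __init__(self, n_jobs=1):
        self.n_jobs = n_jobs

    def __call__(self, iterable):
        # run the (function, args, kwargs) tuples, here simply one after another
        return [func(*args, **kwargs) for func, args, kwargs in iterable]

def fake_delayed(func):
    return lambda *args, **kwargs: (func, args, kwargs)

squares = FakeParallel(n_jobs=1)(fake_delayed(pow)(x, 2) for x in range(5))
print(squares)  # [0, 1, 4, 9, 16]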

How to pass list of PyHandles to function, making separate thread for every handle

I need every directory handle from the list to be addressed in a different thread.
My code:
while len(handles_list) != 0:
    threading.Thread(target=handle_thread, args=handles_list,).start()
    handles_list.pop(0)

def handle_thread(handle):
    # do stuff with handle
Written like this, it gives an error that the function takes only one argument while two are given (or more, depending on the list's contents). So how do I start 3 different threads, giving them args handles_list[0], handles_list[1], etc.?
The thread object expects the target function to have as many arguments as are contained in that list. You are passing the entire list of handles, but the target function expects a single handle.
So, you need to pass a single handle to each thread you create. But the args parameter needs to be an iterable whose length is the number of arguments that the target function expects. So you write it like this:
for handle in handles_list:
    threading.Thread(target=handle_thread, args=[handle]).start()
Or if you'd rather use a tuple than a list:
for handle in handles_list:
    threading.Thread(target=handle_thread, args=(handle,)).start()
This article gives a concise introduction to argument passing to the thread class: Python Threading Arguments, Andrew Ippoliti.
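For completeness, a self-contained version of that pattern might look like this (handle_thread just prints and the handles are placeholder strings; a real handler would work on each handle, and joining the threads waits for them all to finish):

import threading

def handle_thread(handle):
    # do stuff with a single handle
    print('processing', handle)

handles_list = ['handle-a', 'handle-b', 'handle-c']

threads = [threading.Thread(target=handle_thread, args=(handle,))
           for handle in handles_list]
for t in threads:
    t.start()
for t in threads:
    t.join()  # wait for all workers to finish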

How do Python Generators know who is calling?

This question is making me pull my hair out.
if I do:
def mygen():
    for i in range(100):
        yield i
and call it from one thousand threads, how does the generator know what to send next for each thread?
Every time I call it, does the generator save a table with the counter and the caller reference or something like that?
It's weird.
Please, clarify my mind on that one.
mygen does not have to remember anything. Every call to mygen() returns an independent iterable. These iterables, on the other hand, have state: Every time next() is called on one, it jumps to the correct place in the generator code -- when a yield is encountered, control is handed back to the caller. The actual implementation is rather messy, but in principle you can imagine that such an iterator stores the local variables, the bytecode, and the current position in the bytecode (a.k.a. instruction pointer). There is nothing special about threads here.
A function like this, when called, will return a generator object. If you have separate threads calling next() on the same generator object, they will interfere with each other. That is to say, 5 threads calling next() 10 times each will get 50 different yields.
If two threads each create a generator by calling mygen() within the thread, they will have separate generator objects.
A generator is an object, and its state will be stored in memory, so two threads that each create a mygen() will refer to separate objects. It'd be no different than two threads creating an object from a class, they'll each have a different object, even though the class is the same.
if you're coming at this from a C background, this is not the same thing as a function with static variables. The state is maintained in an object, not statically in the variables contained in the function.
It might be clearer if you look at it this way. Instead of:
for i in mygen():
    . . .
use:
gen_obj = mygen()
for i in gen_obj:
    . . .
then you can see that mygen() is only called once, and it creates a new object, and it is that object that gets iterated. You could create two sequences in the same thread, if you wanted:
gen1 = mygen()
gen2 = mygen()
print(gen1.__next__(), gen2.__next__(), gen1.__next__(), gen2.__next__())
This will print 0 0 1 1.
You could access the same iterator from two threads if you like, just store the generator object in a global:
global_gen = mygen()
Thread 1:
for i in global_gen:
    . . .
Thread 2:
for i in global_gen:
    . . .
This would probably cause all kinds of havoc. :-)
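If you really do want several threads to pull values from one shared generator, a common way to keep it safe (a sketch, not the only option) is to guard each next() call with a lock; without one, CPython can raise "ValueError: generator already executing" when two threads hit the generator at the same time:

import threading

def mygen():
    for i in range(100):
        yield i

shared_gen = mygen()
lock = threading.Lock()
results = []

def worker():
    while True:
        with lock:  # only one thread may advance the generator at a time
            value = next(shared_gen, None)
        if value is None:
            break
        results.append(value)

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(len(results))  # 100, each value was yielded exactly once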

Handling and passing large group of variables to functions\objects in python

I find myself in need of working with functions and objects that take a large number of variables.
For a specific case, consider a function from a separate module which takes N different variables and then passes them on to newly instantiated objects:
def Function(Variables):
    # do something with some of the variables
    object1 = someobject(some of the variables)
    object2 = anotherobject(some of the variables, not necessarily as in object1)
While I can just pass a long list of variables, from time to time I find myself making changes to one function, which requires making changes in other functions it might call, or objects it might create. Sometimes the list of variables changes a little.
Is there a nice elegant way to pass a large group of variables and maintain flexibility?
I tried using kwargs in the following way:
def Function(**kwargs):
    # rest of the function
and calling Function(**somedict), where somedict is a dictionary that has keys and values for all the variables I need to pass to Function (and maybe some more). But I get an error about undefined global variables.
Edit1:
I will post the piece of code later since I am not at home or at the lab now. Till then I will try to explain the situation better.
I have a molecular dynamics simulation which takes a few dozen parameters. A few of the parameters (like the temperature, for example) need to be iterated over. To make good use of the quad-core processor I run the different iterations in parallel. So the code starts with a loop over the different iterations, and at each pass sends the parameters of that iteration to a pool of workers (using the multiprocessing module). It goes something like:
P = mp.Pool(number of workers)  # if I remember correctly this line
for iteration in Iterations:
    # assign values to parameters
    P.apply_async(run, (list of parameters), callback=some post processing)
P.close()
P.join()
The function run takes the list of parameters and generates the simulation objects, each of which takes some of the parameters as its attributes.
Edit2:
Here is a version of the problematic function. **kwargs contains all the parameters needed by 'sim', 'lattice' and 'adatom'.
def run(**kwargs):
    """'run' runs a single simulation process.
    j is the index number of the simulation run.
    The code generates an independent random seed for the initial conditions."""
    scipy.random.seed()
    sim = MDF.Simulation(tstep, temp, time, writeout, boundaryxy, boundaryz, relax, insert, lat, savetemp)
    lattice = MDF.Lattice(tstep, temp, time, writeout, boundaryxy, boundaryz, relax, insert, lat, kb, ks, kbs, a, p, q, massL, randinit, initvel, parangle, scaletemp, savetemp, freeze)
    adatom = MDF.Adatom(tstep, temp, time, writeout, boundaryxy, boundaryz, relax, insert, lat, ra, massa, amorse, bmorse, r0, z0, name, lattice, samplerate, savetemp, adatomrelax)
    bad = 1
    print 'Starting simulation run number %g\nrun' % (j + 1)
    while bad is 1:
        # If the simulation did not complete successfully, run it again.
        bad = sim.timeloop(lattice, adatom1, j)
    print 'Starting post processing'
    # Return the temperature and the adatom's trajectory and velocity
    List = [j, lattice.temp, adatom1.traj, adatom1.velocity, lattice.Toptemp, lattice.Bottomtemp, lattice.middletemp, lattice.latticetop]
    return List
The cleanest solution is to not use that many parameters in a function at all.
You could use setter methods or properties to set each variable separately, storing them as class members so that the functions inside that class can use them.
Those functions fill the private variables, and getter methods can be used to retrieve them.
An alternative is to use structures (or classes without functions) to create a named group of variables.
Putting *args and/or **kwargs as the last items in your function definition’s argument list allows that function to accept an arbitrary number of anonymous and/or keyword arguments.
You would use *args when you're not sure how many arguments might be passed to your function.
You can also group parameters in tuples to make a structure.
This is not as elegant as using structures or get/set methods, but it can usually be applied to existing applications without too much rework.
Of course, only related parameters should be grouped in a tuple.
E.g. you could have a function called as
value = function_call((car_model, car_type), age, (owner.name, owner.address, owner.telephone))
This does not reduce the number of parameters but adds a bit more structure.
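As a concrete sketch of the "structure" idea (all names here are invented for illustration), related parameters can be grouped into namedtuples and passed around as single objects:

from collections import namedtuple

# hypothetical groupings of related parameters
Car = namedtuple('Car', ['model', 'type'])
Owner = namedtuple('Owner', ['name', 'address', 'telephone'])

def function_call(car, age, owner):
    # three structured arguments instead of six loose ones
    return '%s (%s), %d years old, owned by %s' % (car.model, car.type, age, owner.name)

car = Car(model='Model T', type='sedan')
owner = Owner(name='Alice', address='Main St 1', telephone='555-0100')
print(function_call(car, 12, owner))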

python how to memoize a method

Say I have a method that creates a dictionary from the given parameters:
def newDict(a, b, c, d):  # in reality this method is a bit more complex, I've just shortened it for the sake of simplicity
    return {"x": a,
            "y": b,
            "z": c,
            "t": d}
And I have another method that calls the newDict method each time it is executed. Therefore, at the end, when I look at my cProfile output I see something like this:
17874 calls (17868 primitive) 0.076 CPU seconds
and of course, my newDict method is called 1785 times. Now, my question is whether I can memorize the newDict method so that I reduce the call times? (Just to make sure, the variables change in almost every call, though I'm not sure if that has an effect on memorizing the function)
Sub Question: I believe that 17k calls are too much, and the code is not efficient. But by looking at the stats can you also please state whether this is a normal result or I have too many calls and the code is slow?
You mean memoize not memorize.
If the values are almost always different, memoizing won't help, it will slow things down.
Without seeing your full code, and knowing what it's supposed to do, how can we know if 17k calls is a lot or a little?
If by memorizing you mean memoizing, use functools.lru_cache. It's a function decorator.
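A minimal usage sketch (expensive_square is just a placeholder; note that lru_cache only works when all arguments are hashable):

from functools import lru_cache

@lru_cache(maxsize=None)
def expensive_square(x):
    print('computing', x)   # only printed on a cache miss
    return x * x

print(expensive_square(4))  # computes and caches the result
print(expensive_square(4))  # served from the cache, no 'computing' line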
The purpose of memoizing is to save a result of an operation that was expensive to perform so that it can be provided a second, third, etc., time without having to repeat the operation and repeatedly incur the expense.
Memoizing is normally applied to a function that (a) performs an expensive operation, (b) always produces the same result given the same arguments, and (c) has no side effects on the program state.
Memoizing is typically implemented within such a function by 'saving' the result along with the values of the arguments that produced that result. This is a special form of the general concept of a cache. Each time the function is called, the function checks its memo cache to see if it has already determined the result that is appropriate for the current values of the arguments. If the cache contains the result, it can be returned without the need to recompute it.
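For illustration, a hand-rolled version of that idea might look like this (slow_add is a made-up stand-in for an expensive, side-effect-free function):

def memoize(func):
    cache = {}
    def wrapper(*args):
        if args not in cache:      # compute only on the first call with these args
            cache[args] = func(*args)
        return cache[args]
    return wrapper

@memoize
def slow_add(a, b):
    # imagine an expensive computation here
    return a + b

print(slow_add(2, 3))  # computed
print(slow_add(2, 3))  # returned straight from the cache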
Your function appears to be intended to create a new dict each time it is called. There does not appear to be a sensible way to memoize this function: you always want a new dict returned to the caller so that its use of the dict it receives does not interfere with some other call to the function.
The only way I can visualize using memoizing would be if (1) the computation of one or more of the values placed into the result are expensive (in which case I would probably define a function that computes the value and memoize that function) or (2) the newDict function is intended to return the same collection of values given a particular set of argument values. In the latter case I would not use a dict but would instead use a non-modifiable object (e.g., a class like a dict but with protections against modifying its contents).
Regarding your subquestion, the questions you need to ask are (1) is the number of times newDict is being called appropriate and (2) can the execution time of each execution of newDict be reduced. These are two separate and independent questions that need to be individually addressed as appropriate.
BTW your function definition has a typo in it -- the return should not have a 'd' between the return keyword and the open brace.
