Running a bunch of Python methods on a single piece of data - python

I would like to run a set of methods on some data, and I was wondering how I can choose which methods get run (or remove some from the run). I would like to group them within a larger method so I can call it once, along the lines of a test case.
In code: these are the methods that process the data. Sometimes I may want to run all three, and sometimes only a subset, to collect information on a data set.
def one(self):
    pass

def two(self):
    pass

def three(self):
    pass
I would like to be able to invoke all of these methods with one call so I don't have to type out "run this; run this". I am looking for an elegant way to run a bunch of methods through one call, while being able to pick and choose which get run.
Desired result
def run_methods(self, variables):
    # runs all three, or a subset
I hope I have been clear in my question. I am just looking for an elegant way to do this, like reflection in Java.
Please and thanks.

Send the methods you want to run as a parameter:
def runmethods(self, variables, methods):
    for method in methods:
        method(variables)
then call something like:
self.runmethods(variables, (method1, method2))
This is the nice thing about having functions as first-class objects in Python.
For the OP's follow-up question in the comments (different parameters for the functions), a dirty solution (sorry for that):
def rest(a, b):
    print(a - b)

def sum(a, b):  # note: this shadows the built-in sum()
    print(a + b)

def run(adictio):
    for method, (a, b) in adictio.items():
        method(a, b)

mydictio = {rest: (3, 2), sum: (4, 5)}
run(mydictio)
You could use other containers to send methods together with their variables, but it is nice to see a function as the key of a dictionary.
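For example, a list of (function, args) tuples works too, and additionally allows the same function to appear more than once; a sketch reusing rest and sum from above:

jobs = [(rest, (3, 2)), (sum, (4, 5)), (rest, (10, 1))]
for method, (a, b) in jobs:
    method(a, b)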
If your methods/functions take different numbers of parameters, you cannot use

for method, (a, b) in adictio.items():

because it expects the same number of parameters for all methods. In this case you can use *args:
def rest(*args):
    a, b = args
    print(a - b)

def sum(*args):
    a, b, c, d, e = args
    print(a + b + c + d + e)

def run(adictio):
    for method, params in adictio.items():
        method(*params)

mydictio = {rest: (3, 2), sum: (4, 5, 6, 7, 8)}
run(mydictio)
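Since the question mentions Java-style reflection: if you only have the method names as strings, the Python analogue is getattr. A minimal sketch (the class and method names here are hypothetical, not from the question):

class Processor:
    def one(self, data):
        pass  # process data somehow

    def two(self, data):
        pass

    def run_methods(self, data, names):
        # look each method up by its string name at runtime
        for name in names:
            getattr(self, name)(data)

p = Processor()
p.run_methods({'x': 1}, ['one', 'two'])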

If you normally run all the functions but sometimes have exceptions, it is useful to have them run by default but be optionally disabled, like this:
def doWalkDog():
    pass

def doFeedKid():
    pass

def doTakeOutTrash():
    pass

def doChores(walkDog=True, feedKid=True, takeOutTrash=True):
    if walkDog: doWalkDog()
    if feedKid: doFeedKid()
    if takeOutTrash: doTakeOutTrash()

# if the kid is at grandma's...
# we still walk the dog and take out the trash
doChores(feedKid=False)

To answer the question in the comment regarding passing arbitrary values:
def runmethods(self, methods):
    for method, (args, kwargs) in methods.items():
        method(*args, **kwargs)

self.runmethods({methodA: ([arg1, arg2], {'kwarg1': 'one', 'kwarg2': 'two'}),
                 methodB: ([arg1], {'kwarg1': 'one'})})
But at this point, it's looking like more code than it's worth!

Related

Function with two stages to define parameters

I need to define functions that get called in two stages.
The first stage only sets a subset of parameters and the second stage runs the function with some additional parameters.
I didn't know how to word the title of this question.
Currently I'm doing this by defining three functions.
The first function returns a new function that has the parameters set.
After that, calling the returned function will actually call the final function.
The current solution looks like this:
def my_function(stage1_param1, stage1_param2):
    return lambda stage2_param1, stage2_param2: my_function_op(
        stage1_param1, stage1_param2, stage2_param1, stage2_param2)

def my_function_op(stage1_param1, stage1_param2, stage2_param1, stage2_param2):
    ...  # do stuff
I would like to reduce boilerplate here and have a more compact version.
Is there a better / shorter solution?
I think you are just looking for partial function application; your my_function is in some sense a specialized implementation of functools.partial.
from functools import partial

def my_function_op(p1, p2, p3, p4):
    ...  # do stuff

f = partial(my_function_op, a1, a2)
x = f(a3, a4)  # same as x = my_function_op(a1, a2, a3, a4)
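A concrete, runnable toy version of the same idea (the function and values below are hypothetical, not from the question):

from functools import partial

def add4(p1, p2, p3, p4):
    return p1 + p2 + p3 + p4

f = partial(add4, 1, 2)  # stage 1 fixes p1 and p2
print(f(3, 4))           # stage 2 supplies p3 and p4; prints 10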
You can just define a function within a function:
def my_function(stage1_param1, stage1_param2):
    def stage2(stage2_param1, stage2_param2):
        ...  # do stuff
    return stage2
You can then use my_function like this:
>>> f = my_function("foo", "bar")
>>> f("ham", "spam")
Inside the inner function, all the arguments are bound to the correct values.
This can be nested as far as needed, just remember to return the defined functions.
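For instance, a three-level sketch (with hypothetical parameters):

def stage1(a):
    def stage2(b):
        def stage3(c):
            return a + b + c  # all outer arguments are in scope here
        return stage3
    return stage2

print(stage1(1)(2)(3))  # prints 6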

Can I change function parameters when passing them as variables?

Excuse my poor wording in the title, but here's a longer explanation:
I have a function which takes as arguments some functions that determine which data to retrieve from a database, as such:
def customer_data(customer_name, *args):
    # initialize dictionary with ids
    codata = dict([(data.__name__, []) for data in args])
    codata['customer_observer_id'] = _customer_observer_ids(customer_name)
    # add values to dictionary using function name as key
    for data in args:
        for coid in codata['customer_observer_id']:
            codata[data.__name__].append(data(coid))
    return codata
Which makes the call to the function look something like this:
customer_data('customername', target_parts, source_group, ...)
One of these functions is defined with an extra parameter:
def polarization_value(customer_observer_id, timespan='day')
What I would like is a way to change the timespan variable in a clever way. One obvious way is to include a keyword argument in customer_data and add an exception when the function being called is named 'polarization_value', but I have a feeling there is a better way to do this.
You can use functools.partial and pass polarization_value as:
functools.partial(polarization_value, timespan='day')
Example:
>>> import functools
>>> def func(x, y=1):
...     print(x, y)
...
>>> new_func = functools.partial(func, y=20)
>>> new_func(100)
100 20
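One caveat if you combine this with the customer_data function above: a partial object has no __name__ attribute, which customer_data relies on, so you would need to copy it over, e.g. with functools.update_wrapper. A sketch:

from functools import partial, update_wrapper

pv_week = partial(polarization_value, timespan='week')
update_wrapper(pv_week, polarization_value)  # copies __name__ (and more) onto the partial
customer_data('customername', target_parts, pv_week)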
You may also find this helpful: Python: Why is functools.partial necessary?

Generate python function with different arguments

Background
I have a function that takes a number of parameters and returns an error measure which I then want to minimize (using scipy.optimize.leastsq, but that is beside the point right now).
As a toy example, let's assume my function to optimize takes the four parameters a, b, c, d:
def f(a, b, c, d):
    err = a*b - c*d
    return err
The optimizer then wants a function with the signature func(x, *args), where x is the parameter vector.
That is, my function is currently written like:
def f_opt(x, *args):
    a, b, c, d = x
    err = a*b - c*d
    return err
But, now I want to do a number of experiments where I fix some parameters while keeping some parameters free in the optimization step.
I could of course do something like:
def f_ad_free(x, b, c):
    a, d = x
    return f(a, b, c, d)
But this will be cumbersome, since I have over 10 parameters, which means the number of combinations of free vs. fixed parameters could be quite large.
First approach using dicts
One solution I had was to write my inner function f with keyword args instead of positional args and then wrap it like this:
def generate(func, all_param, fixed_param):
    param_dict = {k: None for k in all_param}
    free_param = [param for param in all_param if param not in fixed_param]

    def wrapped(x, *args):
        param_dict.update({k: v for k, v in zip(fixed_param, args)})
        param_dict.update({k: v for k, v in zip(free_param, x)})
        return func(**param_dict)

    return wrapped
Creating a function that fixes 'b' and 'c' then turns into the following:
all_params = ['a', 'b', 'c', 'd']
f_bc_fixed = generate(f, all_params, ['b', 'c'])

a = 1
b = 2
c = 3
d = 4
f_bc_fixed((a, d), b, c)
Question time!
My question is whether anyone can think of a neater way to solve this. Since the final function is going to be run in an optimization step, I can't accept too much overhead per function call.
The time it takes to generate the optimization function is irrelevant.
I can think of several ways to avoid using a closure as you do above, though after doing some testing, I'm not sure either of these will be faster. One approach might be to skip the wrapper and just write a function that accepts
A vector
A list of free names
A dictionary mapping names to values.
Then do something very like what you do above, but in the function itself:
def f(free_vals, free_names, params):
    params.update(zip(free_names, free_vals))
    err = params['a'] * params['b'] - params['c'] * params['d']
    return err
For code that uses variable names multiple times, make vars local up front, e.g.
a = params['a']
b = params['b']
and so on. This might seem cumbersome, but it has the advantage of making everything explicit, avoiding the kinds of namespace searches that could make closures slow.
Then pass a list of free names and a dictionary of fixed params via the args parameter to optimize.leastsq. (Note that the params dictionary is mutable, which means that there could be side effects in theory; but in this case it shouldn't matter because only the free params are being overwritten by update, so I omitted the copy step for the sake of speed.)
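As a sketch of what that call shape might look like (assuming f is rewritten to return a vector of residuals, since leastsq expects one residual per data point; the fixed values and residuals below are hypothetical):

import numpy as np
from scipy.optimize import leastsq

def f(free_vals, free_names, params):
    params.update(zip(free_names, free_vals))
    # hypothetical residual vector; leastsq minimizes its sum of squares
    return np.array([params['a'] * params['b'] - params['c'] * params['d'],
                     params['a'] - params['d']])

fixed = {'b': 2.0, 'c': 3.0}  # fixed parameters
x0 = [1.0, 4.0]               # initial guesses for the free parameters 'a' and 'd'
best, flag = leastsq(f, x0, args=(['a', 'd'], fixed))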
The main downsides of this approach are that it shifts some complexity into the call to optimize.leastsq, and it makes your code less reusable. A second approach avoids those problems though it might not be quite as fast: using a callable class.
class OptWrapper(object):
    def __init__(self, func, free_names, **fixed_params):
        self.func = func
        self.free_names = free_names
        self.params = fixed_params

    def __call__(self, x, *args):
        self.params.update(zip(self.free_names, x))
        return self.func(**self.params)
You can see that I simplified the parameter structure for __init__: the fixed params are passed here as keyword arguments, and the user must ensure that free_names and fixed_params don't have overlapping names. I think the simplicity is worth the tradeoff, but you can easily enforce the separation between the two just as you did in your wrapper code.
I like this second approach best; it has the flexibility of your closure-based approach, but I find it more readable. All the names are in (or can be accessed through) the local namespace, which I thought would speed things up -- but after some testing I think there's reason to believe that the closure approach will still be faster: accessing the __call__ method seems to add about 100 ns of overhead per call. I would strongly recommend testing if performance is a real issue.
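Usage of OptWrapper might look like this (a sketch; it assumes f accepts keyword arguments, as in the question's setup):

wrapper = OptWrapper(f, ['a', 'd'], b=2, c=3)
print(wrapper([1, 4]))  # equivalent to f(a=1, b=2, c=3, d=4)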
Your generate function is basically the same as functools.partial, which is what I would use here.
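A sketch of the partial version; it fixes b and c, but still needs a small shim to give the optimizer its vector signature:

from functools import partial

f_bc_fixed = partial(f, b=2, c=3)

def f_opt(x, *args):
    a, d = x
    return f_bc_fixed(a=a, d=d)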

Python: Best way to deal with functions with long list of arguments?

I've found various detailed explanations of how to pass long lists of arguments into a function, but I still kinda doubt if that's the proper way to do it.
In other words, I suspect that I'm doing it wrong, but I can't see how to do it right.
The problem: I have a (not very long) recursive function which uses quite a number of variables and needs to modify some content in at least some of them.
What I end up with is something like this:
def myFunction(alpha, beta, gamma, zeta, alphaList, betaList, gammaList, zetaList):
    <some operations>
    myFunction(alpha, beta, modGamma, zeta, modAlphaList, betaList, gammaList, modZetaList)
...and I want to see the changes I made to the original variables (in C I would just pass a reference, but I hear that in Python it's always a copy?).
Sorry if noob, I don't know how to phrase this question so I can find relevant answers.
You could wrap up all your parameters in a class, like this:
class FooParameters:
    alpha = 1.0
    beta = 1.0
    gamma = 1.0
    zeta = 1.0
    alphaList = []  # note: mutable class attributes are shared between instances
    betaList = []
    gammaList = []
    zetaList = []
and then your function takes a single parameter instance:
from math import exp

def myFunction(params):
    omega = params.alpha * params.beta + exp(params.gamma)
    # more magic...
calling like:
testParams = FooParameters()
testParams.gamma = 2.3
myFunction(testParams)
print(testParams.zetaList)
Because the params instance is passed by reference, changes in the function are preserved.
This is commonly used in matplotlib, for example. They pass the long list of arguments using * or **, like:
def function(*args, **kwargs):
    # do something
    pass
Calling the function:
function(1, 2, 3, 4, 5, a=1, b=2, c=3)
Here 1, 2, 3, 4, 5 will go into args and a=1, b=2, c=3 will go into kwargs as a dictionary, so they arrive in your function as:

args = (1, 2, 3, 4, 5)
kwargs = {'a': 1, 'b': 2, 'c': 3}
And you can treat them in the way you want.
I don't know where you got the idea that Python copies values when passing into a function. That is not at all true.
On the contrary: each parameter in a function is an additional name referring to the original object. If you change the value of that object in some way - for example, if it's a list and you change one of its members - then the original will also see that change. But if you rebind the name to something else - say by doing alpha = my_completely_new_value - then the original remains unchanged.
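A quick illustration of the difference:

def mutate(lst):
    lst.append(4)    # changes the object the caller also sees

def rebind(lst):
    lst = [9, 9, 9]  # rebinds the local name only; the caller is unaffected

items = [1, 2, 3]
mutate(items)
print(items)  # [1, 2, 3, 4]
rebind(items)
print(items)  # still [1, 2, 3, 4]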
You may be tempted to do something akin to this:

def myFunction(*args):
    var_names = ['alpha', 'beta', 'gamma', 'zeta']
    locals().update(zip(var_names, args))

myFunction(alpha, beta, gamma, zeta)
However, this often won't work: updates to locals() inside a function are not guaranteed to actually change the local variables. I suggest introducing another namespace:
from collections import OrderedDict

def myFunction(*args):
    var_names = ['alpha', 'beta', 'gamma', 'zeta']
    vars = OrderedDict(zip(var_names, args))
    # get them all via vars[var_name]

myFunction(*vars.values())  # since we used an OrderedDict we can simply unpack .values()

(On Python 3.7+, a plain dict preserves insertion order as well.)
You can capture the non-modified values in a closure:

def myFunction(alpha, beta, gamma, zeta, alphaList, betaList, gammaList, zetaList):
    def myInner(g=gamma, al=alphaList, zl=zetaList):
        <some operations>
        myInner(modGamma, modAlphaList, modZetaList)
    myInner()
(BTW, this is about the only way to write a truly recursive function in Python.)
You could pass in a dictionary and return a new dictionary. Or put your method in a class and have alpha, beta etc. be attributes.
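A minimal sketch of the dictionary version (the keys and the update are hypothetical):

def myFunction(state):
    new_state = dict(state)                  # shallow copy, so the input stays intact
    new_state['gamma'] = state['gamma'] + 1  # some modification
    return new_state

state = {'alpha': 1, 'beta': 2, 'gamma': 3}
state = myFunction(state)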
You should put myFunction in a class. Set up the class with the appropriate attributes and call the appropriate functions. The state is then well contained in the class.

Deal with undefined arguments more elegantly

The accepted paradigm to deal with mutable default arguments is:
def func(self, a=None):
    if a is None:
        a = <some_initialisation>
    self.a = a
As I might have to do this for several arguments, I would need to write those three very similar lines over and over again. I find this an un-Pythonic amount of text to read for a very, very standard thing to do when initialising class instances or functions.
Isn't there an elegant one-liner to replace those 3 lines dealing with the potentially undefined argument and the standard required copying to the class instance variables?
If a "falsy" value (0, empty string, list, dict, etc.) is not a valid value for a, then you can cut down the initialization to one line:
a = a or <initialize_object>
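To see the caveat in action: if 0 (or another falsy value) is a legitimate input, this idiom silently replaces it:

a = 0
a = a or 42
print(a)  # 42, even though 0 was deliberately passed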
Another way of doing the same thing is as follows:
def func(self, **kwargs):
    self.a = kwargs.get('a', <a_initialization>)
    ...
This has the added bonus that the value of a passed to the function could be None and the initialization won't overwrite it. The disadvantage is that a user using the builtin help function won't be able to tell what keywords your function is looking for unless you spell it out explicitly in the docstring.
EDIT
One other comment: the user could call the above function with keywords which are not pulled out of the kwargs dictionary. In some cases this is good (if you want to pass the keywords on to another function, for instance); in other cases it is not what you want. If you want to raise an error when the user provides an unknown keyword, you can do the following:
def func(self, **kwargs):
    self.a = kwargs.pop('a', "Default_a")
    self.b = kwargs.pop('b', "Default_b")
    if kwargs:
        # raise some appropriate exception, naming the unexpected keywords
        raise TypeError("unexpected keyword arguments: %s" % ", ".join(kwargs.keys()))
You could do this
def func(self, a=None):
    self.a = <some_initialisation> if a is None else a
But why the obsession with one-liners? I would usually use the three-line version, even if it gets repeated all over the place, because it makes your code very easy for experienced Python programmers to read.
Just a little solution I came up with, using an extra function; it can be improved, of course:
defaultargs.py:
def doInit(var, default_value, condition):
    if condition:
        var = default_value
    return var

def func(a=None, b=None, c=None):
    a = doInit(a, 5, (a is None or not isinstance(a, int)))
    b = doInit(b, 10.0, (b is None or not isinstance(b, float)))
    c = doInit(c, "whatever", (c is None or not isinstance(c, str)))
    print(a)
    print(b)
    print(c)

if __name__ == "__main__":
    func(10)
    func(None, 12341.12)
    func("foo", None, "whowho")
output:
10
10.0
whatever
5
12341.12
whatever
5
10.0
whowho
I like your question. :)
Edit: if you don't care about the variable's type, don't use isinstance().
