Background
I have a function that takes a number of parameters and returns an error measure which I then want to minimize (using scipy.optimize.leastsq, but that is beside the point right now).
As a toy example, let's assume my function to optimize takes the four parameters a, b, c, d:
def f(a, b, c, d):
    err = a*b - c*d
    return err
The optimizer then wants a function with the signature func(x, *args) where x is the parameter vector.
That is, my function is currently written like:
def f_opt(x, *args):
    a, b, c, d = x
    err = a*b - c*d
    return err
But now I want to run a number of experiments where I fix some parameters while keeping the others free in the optimization step.
I could of course do something like:
def f_ad_free(x, b, c):
    a, d = x
    return f(a, b, c, d)
But this will be cumbersome: since I have over 10 parameters, the number of possible free-vs-fixed combinations is potentially quite large.
First approach using dicts
One solution I had was to write my inner function f with keyword args instead of positional args and then wrap it like this:
def generate(func, all_param, fixed_param):
    param_dict = {k: None for k in all_param}
    free_param = [param for param in all_param if param not in fixed_param]
    def wrapped(x, *args):
        param_dict.update({k: v for k, v in zip(fixed_param, args)})
        param_dict.update({k: v for k, v in zip(free_param, x)})
        return func(**param_dict)
    return wrapped
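(For reference, a keyword-argument version of the inner function might look like this; a sketch, since the original f_inner isn't shown:)

def f_inner(a=None, b=None, c=None, d=None):
    return a * b - c * d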
Creating a function that fixes 'b' and 'c' then turns into the following:
all_params = ['a', 'b', 'c', 'd']
f_bc_fixed = generate(f_inner, all_params, ['b', 'c'])
a = 1
b = 2
c = 3
d = 4
f_bc_fixed((a, d), b, c)
Question time!
My question is whether anyone can think of a neater way to solve this. Since the final function is going to be run in an optimization step I can't accept too much overhead for each function call.
The time it takes to generate the optimization function is irrelevant.
I can think of a couple of ways to avoid using a closure as you do above, though after doing some testing, I'm not sure either of them will be faster. One approach might be to skip the wrapper and just write a function that accepts:
A vector
A list of free names
A dictionary mapping names to values.
Then do something very like what you do above, but in the function itself:
def f(free_vals, free_names, params):
    params.update(zip(free_names, free_vals))
    err = params['a'] * params['b'] - params['c'] * params['d']
    return err
For code that uses variable names multiple times, make vars local up front, e.g.
a = params['a']
b = params['b']
and so on. This might seem cumbersome, but it has the advantage of making everything explicit, avoiding the kinds of namespace searches that could make closures slow.
Then pass a list of free names and a dictionary of fixed params via the args parameter to optimize.leastsq. (Note that the params dictionary is mutable, which means that there could be side effects in theory; but in this case it shouldn't matter because only the free params are being overwritten by update, so I omitted the copy step for the sake of speed.)
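For example, the call might be wired up like this (a sketch; note that leastsq also expects the function to return a vector of residuals at least as long as x0, which the scalar toy function above doesn't satisfy):

from scipy.optimize import leastsq

free_names = ['a', 'd']
params = {'b': 2.0, 'c': 3.0}  # fixed values; free entries are filled in by update
x0 = [1.0, 4.0]                # initial guesses for a and d
result = leastsq(f, x0, args=(free_names, params))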
The main downsides of this approach are that it shifts some complexity into the call to optimize.leastsq, and it makes your code less reusable. A second approach avoids those problems though it might not be quite as fast: using a callable class.
class OptWrapper(object):
    def __init__(self, func, free_names, **fixed_params):
        self.func = func
        self.free_names = free_names
        self.params = fixed_params

    def __call__(self, x, *args):
        self.params.update(zip(self.free_names, x))
        return self.func(**self.params)
You can see that I simplified the parameter structure for __init__; the fixed params are passed here as keyword arguments, and the user must ensure that free_names and fixed_params don't have overlapping names. I think the simplicity is worth the tradeoff but you can easily enforce the separation between the two just as you did in your wrapper code.
I like this second approach best; it has the flexibility of your closure-based approach, but I find it more readable. All the names are in (or can be accessed through) the local namespace, which I thought would speed things up -- but after some testing I think there's reason to believe that the closure approach will still be faster than this; accessing the __call__ method seems to add about 100 ns of overhead per call. I would strongly recommend testing if performance is a real issue.
Your generate function is basically the same as functools.partial, which is what I would use here.
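For instance, a sketch assuming the keyword-argument f_inner from above:

from functools import partial

f_bc_fixed = partial(f_inner, b=2, c=3)

# a thin adapter is still needed to unpack the optimizer's parameter vector
def f_opt(x, *args):
    a, d = x
    return f_bc_fixed(a=a, d=d)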
Related
I have a rather easy problem but wish to find an elegant solution that does not come to mind. Let's say I have a function that takes some arguments and performs calculations:
def f(a, b, c):
    # preprocessing of some sort
    d_discarded = ...  # initialized and maybe some computations as well
    for i in range(1000):
        d_discarded = ...
        final_value_update = ...
    return final_value_update
On user request, I would also like to iteratively store and return the value of d_discarded -- but only when asked, so not necessarily. How could I envision an efficient way to do so?
A naive solution would be adding if statements and an additional argument like:
def f(a, b, c, keep_d=False):
    # preprocessing of some sort
    d_discarded = ...  # initialized and maybe some computations as well
    if keep_d:
        l_discarded = []
        l_discarded.append(d_discarded)
    for i in range(1000):
        d_discarded = ...
        final_value_update = ...
        if keep_d:
            l_discarded.append(d_discarded)
    if keep_d:
        return final_value_update, l_discarded
    return final_value_update
But this is not efficient, nor elegant, as the if statement is evaluated 1002 times per call. I surely can do this, but wish to learn a cleverer approach.
Any consideration is appreciated. I can understand the problem is rather broad but I chose to leave it as it is because it is indeed suitable for any setting.
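One branch-free alternative, as a sketch (the ... placeholders again stand in for the real computation): choose the per-iteration action once, before the loop, so the loop body never tests keep_d:

def f(a, b, c, keep_d=False):
    l_discarded = []
    record = l_discarded.append if keep_d else (lambda v: None)  # no-op sink when not keeping
    d_discarded = ...
    record(d_discarded)
    for i in range(1000):
        d_discarded = ...
        final_value_update = ...
        record(d_discarded)
    if keep_d:
        return final_value_update, l_discarded
    return final_value_update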
I've got a quite extensive simulation tool written in Python which requires the user to call setup functions in a strict order, since np.ndarrays are created first (and changed by appending etc.) and memory views to specific cells of these arrays are defined afterwards.
Currently each part of the environment requires around 4 different function calls to be set up, with easily >> 100 parts.
Thus I need to combine each part's function calls by syntactically (not based on timers) postponing the execution of some functions until all preceding functions have been executed, while still maintaining the strict order to be able to use memory views.
Furthermore, all functions to be called by the user use PEP 3102 style keyword-only arguments to reduce the probability of input errors, and all are instance methods with self as the first parameter, with self holding the references to the arrays that the memory views are constructed on.
My current implementation uses a list to store the functions and a dict of each function's keyword arguments. This is shown here, omitting the class and self parameters to keep it short:
import numpy as np

def fun1(*, x, y):  # easy minimal example function 1
    print(x * y)

def fun2(*, x, y, z):  # easy minimal example function 2
    print((x + y) / z)

fun_list = []  # list to store the functions and kwargs
fun_list.append([fun1, {'x': 3.4, 'y': 7.0}])  # add functions and kwargs
fun_list.append([fun2, {'x': 1., 'y': 12.8, 'z': np.pi}])
fun_list.append([fun2, {'x': 0.3, 'y': 2.4, 'z': 1.}])

for fun in fun_list:
    fun[0](**fun[1])
What I'd like to implement is using a decorator to postpone the function execution by adding a generator, to be able to pass all arguments to the functions as they are called, but not execute them, as shown below:
def postpone(myfun):  # define generator decorator
    def inner_fun(*args, **kwargs):
        yield myfun(*args, **kwargs)
    return inner_fun

fun_list_dec = []  # list to store the decorated functions
fun_list_dec.append(postpone(fun1)(x=3.4, y=7.0))  # add decorated functions
fun_list_dec.append(postpone(fun2)(x=1., y=12.8, z=np.pi))
fun_list_dec.append(postpone(fun2)(x=0.3, y=2.4, z=1.))

for fun in fun_list_dec:  # execute functions
    next(fun)
What is the best (most Pythonic) method to do so? Are there any drawbacks?
And most important: will my references to the np.ndarrays passed to the functions within self still be references, so that the memory addresses of these arrays are still correct when the functions are executed, even if the memory addresses change between saving the function calls to a list (or decorating them) and executing them?
Execution speed does not matter here.
Using generators here doesn't make much sense. You are essentially simulating partial application, so this seems like a use case for functools.partial. Since you are sticking with keyword-only arguments, this will work just fine:
In [1]: def fun1(*, x, y):  # easy minimal example function 1
   ...:     print(x * y)
   ...: def fun2(*, x, y, z):  # easy minimal example function 2
   ...:     print((x + y) / z)
   ...:

In [2]: from functools import partial

In [3]: fun_list = []

In [4]: fun_list.append(partial(fun1, x=3.4, y=7.0))

In [5]: fun_list.append(partial(fun2, x=1., y=12.8, z=3.14))

In [6]: fun_list.append(partial(fun2, x=0.3, y=2.4, z=1.))

In [7]: for f in fun_list:
   ...:     f()
   ...:
23.8
4.3949044585987265
2.6999999999999997
You don't have to use functools.partial either; you can do the partial application "manually", just to demonstrate:
In [8]: fun_list.append(lambda: fun1(x=5.4, y=8.7))

In [9]: fun_list[-1]()
46.98
Since this would be too complicated to fit in a comment, and it is based on juanpa.arrivillaga's answer, I'll add a full post with a short explanation of what I mean by updating the references to the arrays:
def fun1(*, x, y):  # easy minimal example function 1
    print(x * y)

arr = np.random.rand(5)
f1_lam = lambda: fun1(x=arr, y=5.)
f1_par = partial(fun1, x=arr, y=5.)

f1_lam()  # Out[01]: [0.55561103 0.9962626 3.60992174 2.55491852 3.9402079 ]
f1_par()  # Out[02]: [0.55561103 0.9962626 3.60992174 2.55491852 3.9402079 ]

# manipulate the array so that the memory address changes and
# passing by reference is "complicated":
arr = np.append(arr, np.ones((2, 1)))

f1_lam()  # Out[03]: [0.55561103 0.9962626 3.60992174 2.55491852 3.9402079 5. 5.]
f1_par()  # Out[04]: [0.55561103 0.9962626 3.60992174 2.55491852 3.9402079 ]
The behaviour of lambda is exactly what I was looking for in this question.
My examples with dict and decorators don't work, and neither does functools.partial. Any idea why lambda works? And, just out of interest: would there be any way to update the references to the arrays in the dict so that it would also work this way?
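(A hedged note on the likely cause: a lambda body looks up the name arr each time it is called, whereas partial stores a reference to whatever object arr pointed to at creation time; np.append returns a new array and rebinds the name. A minimal sketch of late vs. early binding:)

from functools import partial

x = [1, 2, 3]
late = lambda: print(x)    # looks up the name x at call time
early = partial(print, x)  # captures the current object now

x = [1, 2, 3, 4]           # rebind the name to a new object
late()   # [1, 2, 3, 4] -- sees the rebinding
early()  # [1, 2, 3]    -- still holds the old object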
On Codewars.com I encountered the following task:
Create a function add that adds numbers together when called in succession. So add(1) should return 1, add(1)(2) should return 1+2, ...
While I'm familiar with the basics of Python, I've never encountered a function that is able to be called in such succession, i.e. a function f(x) that can be called as f(x)(y)(z).... Thus far, I'm not even sure how to interpret this notation.
As a mathematician, I'd suspect that f(x)(y) is a function that assigns to every x a function g_{x} and then returns g_{x}(y) and likewise for f(x)(y)(z).
Should this interpretation be correct, Python would allow me to dynamically create functions which seems very interesting to me. I've searched the web for the past hour, but wasn't able to find a lead in the right direction. Since I don't know how this programming concept is called, however, this may not be too surprising.
How do you call this concept and where can I read more about it?
I don't know whether this is function chaining so much as callable chaining, but, since functions are callables, I guess there's no harm done. Either way, there are two ways I can think of doing this:
Sub-classing int and defining __call__:
The first way would be with a custom int subclass that defines __call__ which returns a new instance of itself with the updated value:
class CustomInt(int):
    def __call__(self, v):
        return CustomInt(self + v)
Function add can now be defined to return a CustomInt instance, which, as a callable that returns an updated value of itself, can be called in succession:
>>> def add(v):
...     return CustomInt(v)
>>> add(1)
1
>>> add(1)(2)
3
>>> add(1)(2)(3)(44) # and so on..
50
In addition, as an int subclass, the returned value retains the __repr__ and __str__ behavior of ints. For more complex operations though, you should define other dunders appropriately.
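For instance, a sketch: overriding __add__ as well keeps results of + chainable, since int's own __add__ would return a plain int:

class CustomInt(int):
    def __call__(self, v):
        return CustomInt(self + v)

    def __add__(self, other):
        return CustomInt(int(self) + other)  # stay a CustomInt, so the result is still callable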
As #Caridorc noted in a comment, add could also be simply written as:
add = CustomInt
Renaming the class to add instead of CustomInt also works similarly.
Define a closure, requires extra call to yield value:
The only other way I can think of involves a nested function that requires an extra empty argument call in order to return the result. I'm not using nonlocal and opt for attaching attributes to the function objects to make it portable between Pythons:
def add(v):
    def _inner_adder(val=None):
        """
        if val is None we return _inner_adder.v
        else we increment and return ourselves
        """
        if val is None:
            return _inner_adder.v
        _inner_adder.v += val
        return _inner_adder
    _inner_adder.v = v  # save value
    return _inner_adder
This continuously returns itself (_inner_adder) which, if a val is supplied, increments it (_inner_adder.v += val) and, if not, returns the value as it is. As mentioned, it requires an extra () call in order to return the incremented value:
>>> add(1)(2)()
3
>>> add(1)(2)(3)() # and so on..
6
You can hate me, but here is a one-liner :)
add = lambda v: type("", (int,), {"__call__": lambda self, v: self.__class__(self + v)})(v)
Edit: OK, how does this work? The code is identical to #Jim's answer, but everything happens on a single line.
type can be used to construct new types: type(name, bases, dict) -> a new type. For name we provide an empty string, as a name is not really needed in this case. For bases (a tuple) we provide (int,), which is identical to inheriting from int. dict holds the class attributes, where we attach the __call__ lambda.
self.__class__(self + v) is identical to return CustomInt(self + v)
The new type is constructed and returned within the outer lambda.
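Expanded over several lines, the one-liner is roughly equivalent to this sketch:

def add(v):
    cls = type("", (int,), {
        "__call__": lambda self, v: self.__class__(self + v),
    })
    return cls(v)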
If you want to define a function that can be called multiple times in succession, you need it to return a callable object each time (for example a function); otherwise you have to create your own object that defines a __call__ method, in order for it to be callable.
The next point is that you need to preserve all the arguments, which in this case means you might want to use coroutines or a recursive function. But note that coroutines are more flexible than recursive functions for tasks like this.
Here is a sample function using coroutines that preserves its latest state. Note that it can't be called in succession, since the yielded value is a plain integer, which is not callable -- but you might think about turning this into your expected object ;-).
def add():
    current = yield
    while True:
        value = yield current
        current = value + current

it = add()
next(it)
print(it.send(10))
print(it.send(2))
print(it.send(4))

10
12
16
Simply:
class add(int):
    def __call__(self, n):
        return add(self + n)
If you are willing to accept an additional () in order to retrieve the result you can use functools.partial:
from functools import partial

def add(*args, result=0):
    return partial(add, result=sum(args) + result) if args else result
For example:
>>> add(1)
functools.partial(<function add at 0x7ffbcf3ff430>, result=1)
>>> add(1)(2)
functools.partial(<function add at 0x7ffbcf3ff430>, result=3)
>>> add(1)(2)()
3
This also allows specifying multiple numbers at once:
>>> add(1, 2, 3)(4, 5)(6)()
21
If you want to restrict it to a single number you can do the following:
def add(x=None, *, result=0):
    return partial(add, result=x + result) if x is not None else result
If you want add(x)(y)(z) to readily return the result and be further callable then sub-classing int is the way to go.
The pythonic way to do this would be to use dynamic arguments:
def add(*args):
    return sum(args)
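For example:

>>> add(1, 2, 3)
6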
This is not the answer you're looking for, and you may know this, but I thought I would give it anyway: if someone is wondering about doing this not out of curiosity but for work, they should probably have the "right thing to do" answer.
I've found various detailed explanations of how to pass long lists of arguments into a function, but I still doubt whether that's the proper way to do it.
In other words, I suspect that I'm doing it wrong, but I can't see how to do it right.
The problem: I have a (not very long) recursive function which uses quite a number of variables and needs to modify the contents of at least some of them.
What I end up with is something like this:
def myFunction(alpha, beta, gamma, zeta, alphaList, betaList, gammaList, zetaList):
    <some operations>
    myFunction(alpha, beta, modGamma, zeta, modAlphaList, betaList, gammaList, modZetaList)
...and I want to see the changes I made to the original variables (in C I would just pass a reference, but I hear that in Python it's always a copy?).
Sorry if noob, I don't know how to phrase this question so I can find relevant answers.
You could wrap up all your parameters in a class, like this:
class FooParameters:
    alpha = 1.0
    beta = 1.0
    gamma = 1.0
    zeta = 1.0
    alphaList = []
    betaList = []
    gammaList = []
    zetaList = []
and then your function takes a single parameter instance:
def myFunction(params):
    omega = params.alpha * params.beta + exp(params.gamma)
    # more magic...
calling like:
testParams = FooParameters()
testParams.gamma = 2.3
myFunction(testParams)
print(testParams.zetaList)
Because the params instance is passed by reference, changes in the function are preserved.
This is commonly used in matplotlib, for example. They pass the long list of arguments using * or **, like:
def function(*args, **kwargs):
    # do something
Calling function:
function(1, 2, 3, 4, 5, a=1, b=2, c=3)
Here 1, 2, 3, 4, 5 will go to args, and a=1, b=2, c=3 will go to kwargs as a dictionary, so that they arrive at your function like:
args = (1, 2, 3, 4, 5)
kwargs = {'a': 1, 'b': 2, 'c': 3}
And you can treat them in the way you want.
I don't know where you got the idea that Python copies values when passing into a function. That is not at all true.
On the contrary: each parameter in a function is an additional name referring to the original object. If you change the value of that object in some way - for example, if it's a list and you change one of its members - then the original will also see that change. But if you rebind the name to something else - say by doing alpha = my_completely_new_value - then the original remains unchanged.
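A quick sketch of the distinction:

def mutate(lst):
    lst.append(4)  # changes the object; the caller sees this

def rebind(lst):
    lst = [0]      # rebinds the local name only; the caller is unaffected

vals = [1, 2, 3]
mutate(vals)
print(vals)  # [1, 2, 3, 4]
rebind(vals)
print(vals)  # [1, 2, 3, 4] -- unchanged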
You may be tempted to do something akin to this:
def myFunction(*args):
    var_names = ['alpha', 'beta', 'gamma', 'zeta']
    locals().update(zip(var_names, args))
    myFunction(alpha, beta, gamma, zeta)
However, this 'often' won't work. I suggest introducing another namespace:
from collections import OrderedDict

def myFunction(*args):
    var_names = ['alpha', 'beta', 'gamma', 'zeta']
    vars = OrderedDict(zip(var_names, args))
    # get them all via vars[var_name]
    myFunction(*vars.values())  # since we used an OrderedDict we can simply do *.values()
you can capture the non-modified values in a closure:
def myFunction(alpha, beta, gamma, zeta, alphaList, betaList, gammaList, zetaList):
    def myInner(g=gamma, al=alphaList, zl=zetaList):
        <some operations>
        myInner(modGamma, modAlphaList, modZetaList)
    myInner(al=alphaList, zl=zetaList)
(BTW, this is about the only way in Python to recurse without explicitly re-passing the arguments that don't change.)
You could pass in a dictionary and return a new dictionary. Or put your method in a class and have alpha, beta etc. be attributes.
You should put myFunction in a class. Set up the class with the appropriate attributes and call the appropriate functions. The state is then well contained in the class.
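A minimal sketch of that idea (all names here are illustrative, not from the question):

class Simulation:
    def __init__(self, alpha, beta, gamma, zeta):
        self.alpha = alpha
        self.beta = beta
        self.gamma = gamma
        self.zeta = zeta
        self.alphaList = []
        self.zetaList = []

    def step(self):
        # mutate state in place instead of threading it through parameters
        self.gamma = self.alpha * self.beta
        self.alphaList.append(self.gamma)
        self.zetaList.append(self.zeta)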
I would like to run a set of methods given some data. I was wondering how I can remove or choose different methods to run. I would like to group them within a larger method so I can call it, and it will work along the lines of a test case.
In code: these are the methods that process the data. I may sometimes want to run all three, or a subset of them, to collect information on this data set.
def one(self):
    pass

def two(self):
    pass

def three(self):
    pass
I would like to be able to call these methods with a single call so I don't have to type out "run this; run this". I am looking for an elegant way to run a bunch of methods through one call, so I can pick and choose which get run.
Desired result
def run_methods(self, variables):
    # runs all three or a subset of them
I hope I have been clear in my question. I am just looking for an elegant way to do this, like reflection in Java.
Please and thanks.
Send the methods you want to run as a parameter:
def runmethods(self, variables, methods):
    for method in methods:
        method(variables)
then call something like:
self.runmethods(variables, (method1, method2))
This is the nice thing about having functions as first-class objects in Python.
For the question of the OP in the comment (different parameters for the functions), a dirty solution (sorry for that):
def rest(a, b):
    print(a - b)

def sum(a, b):
    print(a + b)

def run(adictio):
    for method, (a, b) in adictio.items():
        method(a, b)

mydictio = {rest: (3, 2), sum: (4, 5)}
run(mydictio)
You could use other containers to send methods together with their variables, but it is nice to see a function as the key of a dictionary.
if your methods/functions use different numbers of parameters you cannot use
for method, (a, b) in adictio.items():
because it expects the same number of parameters for all methods. In this case you can use *args:
def rest(*args):
    a, b = args
    print(a - b)

def sum(*args):
    a, b, c, d, e = args
    print(a + b + c + d + e)

def run(adictio):
    for method, params in adictio.items():
        method(*params)

mydictio = {rest: (3, 2), sum: (4, 5, 6, 7, 8)}
run(mydictio)
If you normally do all the functions but sometimes have exceptions, then it would be useful to have them done by default, but optionally disable them like this:
def doWalkDog():
    pass

def doFeedKid():
    pass

def doTakeOutTrash():
    pass

def doChores(walkDog=True, feedKid=True, takeOutTrash=True):
    if walkDog: doWalkDog()
    if feedKid: doFeedKid()
    if takeOutTrash: doTakeOutTrash()

# if the kid is at grandma's...
# we still walk the dog and take out the trash
doChores(feedKid=False)
To answer the question in the comment regarding passing arbitrary values:
def runmethods(self, methods):
    for method, args in methods.items():
        method(*args[0], **args[1])

runmethods({methodA: ([arg1, arg2], {'kwarg1': 'one', 'kwarg2': 'two'}),
            methodB: ([arg1], {'kwarg1': 'one'})})
But at this point, it's looking like more code than it's worth!