I've got a fairly extensive simulation tool written in Python which requires the user to call functions in a strict order to set up the environment: first np.ndarrays are created (and changed by appending etc.), and afterwards memory views to specific cells of these arrays are defined.
Currently each part of the environment requires around 4 different function calls to be set up, with easily >> 100 parts.
Thus I need to combine each part's function calls by syntactically (not based on timers) postponing the execution of some functions until all preceding functions have been executed, while still maintaining the strict order to be able to use memory views.
Furthermore, all functions to be called by the user use PEP 3102 style keyword-only arguments to reduce the probability of input errors, and all are instance methods with self as the first parameter, self holding the references to the arrays from which the memory views are constructed.
My current implementation uses a list to store the functions and a dict for each function's keyword arguments. This is shown here, omitting the class and self parameters for brevity:
import numpy as np

def fun1(*, x, y):  # easy minimal example function 1
    print(x * y)

def fun2(*, x, y, z):  # easy minimal example function 2
    print((x + y) / z)

fun_list = []  # list to store the functions and kwargs
fun_list.append([fun1, {'x': 3.4, 'y': 7.0}])  # add functions and kwargs
fun_list.append([fun2, {'x': 1., 'y': 12.8, 'z': np.pi}])
fun_list.append([fun2, {'x': 0.3, 'y': 2.4, 'z': 1.}])

for fun in fun_list:
    fun[0](**fun[1])
What I'd like to implement instead is a decorator that postpones execution by wrapping the function in a generator, so that all arguments can be passed when the functions are registered without executing them, as shown below:
def postpone(myfun):  # define generator decorator
    def inner_fun(*args, **kwargs):
        yield myfun(*args, **kwargs)
    return inner_fun

fun_list_dec = []  # list to store the decorated functions
fun_list_dec.append(postpone(fun1)(x=3.4, y=7.0))  # add decorated functions
fun_list_dec.append(postpone(fun2)(x=1., y=12.8, z=np.pi))
fun_list_dec.append(postpone(fun2)(x=0.3, y=2.4, z=1.))

for fun in fun_list_dec:  # execute functions
    next(fun)
Which is the best (most Pythonic) way to do this? Are there any drawbacks?
And most importantly: will the references to np.ndarrays passed to the functions within self still be live references, so that the memory addresses of these arrays are still correct when the functions are finally executed, even if the addresses change between storing the function calls in a list (or decorating them) and executing them?
Execution speed does not matter here.
Using a generator here doesn't make much sense. You are essentially simulating partial application, so this seems like a use case for functools.partial. Since you are sticking with keyword-only arguments, this will work just fine:
In [1]: def fun1(*, x, y):  # easy minimal example function 1
   ...:     print(x * y)
   ...: def fun2(*, x, y, z):  # easy minimal example function 2
   ...:     print((x + y) / z)
   ...:
In [2]: from functools import partial
In [3]: fun_list = []
In [4]: fun_list.append(partial(fun1, x=3.4, y=7.0))
In [5]: fun_list.append(partial(fun2, x=1., y=12.8, z=3.14))
In [6]: fun_list.append(partial(fun2, x=0.3, y=2.4, z=1.))
In [7]: for f in fun_list:
   ...:     f()
   ...:
23.8
4.3949044585987265
2.6999999999999997
You don't have to use functools.partial either; you can do the partial application "manually", just to demonstrate:
In [8]: fun_list.append(lambda:fun1(x=5.4, y=8.7))
In [9]: fun_list[-1]()
46.98
Since code would be too complicated for a comment, and since this builds on juanpa.arrivillaga's answer, I'll add a full post with a short explanation of what I mean by updating the references to the arrays:
import numpy as np
from functools import partial

def fun1(*, x, y):  # easy minimal example function 1
    print(x * y)

arr = np.random.rand(5)
f1_lam = lambda: fun1(x=arr, y=5.)
f1_par = partial(fun1, x=arr, y=5.)

f1_lam()  # Out[01]: [0.55561103 0.9962626  3.60992174 2.55491852 3.9402079 ]
f1_par()  # Out[02]: [0.55561103 0.9962626  3.60992174 2.55491852 3.9402079 ]

# manipulate the array so that the memory address changes and
# passing by reference gets "complicated":
arr = np.append(arr, np.ones((2, 1)))

f1_lam()  # Out[03]: [0.55561103 0.9962626  3.60992174 2.55491852 3.9402079  5. 5.]
f1_par()  # Out[04]: [0.55561103 0.9962626  3.60992174 2.55491852 3.9402079 ]
The behaviour of the lambda is exactly what I was looking for in this question.
My examples with the dict and with decorators don't show this behaviour, and neither does functools.partial. Any idea why the lambda works? And just out of interest: would there be any way to update the references to the arrays in the dict so that it also works this way?
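For what it's worth, the difference comes down to when the name arr is resolved: the lambda body looks up arr again every time it is called, while partial (and a kwargs dict) stores the object that arr referred to at creation time. Below is a minimal sketch of how the dict variant could be made to follow the rebinding as well; the shared ns dict and the string-key convention are made up purely for illustration and are not part of the original code:
import numpy as np

def fun1(*, x, y):
    print(x * y)

ns = {'arr': np.random.rand(5)}   # stands in for self / self.__dict__

# store the key 'arr' instead of the array object itself
fun_list = [[fun1, {'x': 'arr', 'y': 5.}]]

ns['arr'] = np.append(ns['arr'], np.ones(2))   # the name is rebound later on

for fun, kwargs in fun_list:
    # resolve string keys against the namespace only at execution time
    resolved = {k: ns[v] if isinstance(v, str) and v in ns else v
                for k, v in kwargs.items()}
    fun(**resolved)   # sees the updated 7-element array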
I want to optimize a function f(x,y,z) over x with sp.optimize.minimize. The Jacobian only depends on x and y, and it is the function J(x,y). (this is just a toy example)
If I try:
import numpy as np
import scipy as sp
import scipy.optimize  # make sure the optimize submodule is loaded

def f(x, y, z):
    return x**2 + x*y**3 + z

def J(x, y):
    return 2*x + y**3

x0, y, z = 0, 1, 4
sp.optimize.minimize(f, x0, args=(y, z), jac=J)
I get an error "J() takes 2 positional arguments but 3 were given", because optimize passes y and z to J.
Is any way to define the arguments I want to pass to f, and the ones I want to pass to J?
(one option is to define f and J such that they have the same arguments and just ignore the ones not needed by the function, but I hope there is a more elegant way)
As per the manual, the Jacobian is a callable with signature
J(x, *args)
where args are exactly the fixed parameters args=(y, z) from your example. So, in general, no. On the other hand, nothing prevents you from writing:
def J(x, y, z):
    return 2*x + y**3
and I do not see anything "inelegant" here. In general we write
df(x, y, z)/dx = f'(x, y, z)
anyway, and this notation still holds when f' happens to be independent of one of the variables - we do not know that in advance, and no one frowns on this sort of writing.
If you really want you could have:
def J(x, *args):
    return 2*x + args[0]**3
to hide the extra variables. I would not call this more elegant though.
I apologize in advance if there is an obvious solution to this question or it is a duplicate.
I have a class as follows:
import numpy as np
from scipy import integrate

class Kernel(object):
    """ creates kernels with the necessary input data """
    def __init__(self, Amplitude, random=None):
        self.Amplitude = Amplitude
        self.random = random
        if random is not None:
            self.dims = list(random.shape)

    def Gaussian(self, X, Y, sigmaX, sigmaY, muX=0.0, muY=0.0):
        """ return a 2 dimensional Gaussian kernel """
        kernel = np.zeros([X, Y])
        theta = [self.Amplitude, muX, muY, sigmaX, sigmaY]
        for i in range(X):
            for j in range(Y):
                # G2 is the 2D Gaussian integrand, defined elsewhere
                kernel[i][j] = integrate.dblquad(
                    lambda x, y: G2(x + float(i) - (X - 1.0) / 2.0,
                                    y + float(j) - (Y - 1.0) / 2.0, theta),
                    -0.5, 0.5, lambda y: -0.5, lambda y: 0.5)[0]
        return kernel
It just basically creates a bunch of convolution kernels (I've only included the first).
I want to add a method to this class so that I can use something like
conv = Kernel(1.5)
conv.Gaussian(9, 9, 2, 2).kershow()
and have the array pop up using Matplotlib. I know how to write such a method and plot the result with Matplotlib, but I don't know how to write the class so that, for each method I want to have this additional ability for (i.e. .kershow()), I can call it in this manner.
I think I could use decorators? But I've never used them before. How can I do this?
The name of the thing you're looking for is function or method chaining.
Strings are a really good example of this in Python. Because a string is immutable, each string method returns a new string. So you can call string methods on the return values, rather than storing the intermediate value. For example:
lower = ' THIS IS MY NAME: WAYNE '.lower()
without_left_padding = lower.lstrip()
without_right_padding = without_left_padding.rstrip()
title_cased = without_right_padding.title()
Instead you could write:
title_cased = ' THIS IS MY NAME: WAYNE '.lower().lstrip().rstrip().title()
Of course really you'd just do .strip().title(), but this is an example.
So if you want a .kernshow() option, then you'll need to include that method on whatever you return. In your case, numpy arrays don't have a .kernshow method, so you'll need to return something that does.
Your options are mostly:
A subclass of numpy arrays
A class that wraps the numpy array
I'm not sure what is involved with subclassing the numpy array, so I'll stick with the latter as an example. Either you can use the kernel class, or create a second class.
Alex provided an example of using your kernel class, but alternatively you could have another class like this:
class KernelPlotter(object):
    def __init__(self, kernel):
        self.kernel = kernel

    def kernshow(self):
        # do the plotting here
        pass
Then you would pretty much follow your existing code, but rather than return kernel you would do return KernelPlotter(kernel).
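For illustration, here is one way the plotting method might be filled in, assuming matplotlib is the intended backend (the imshow call and colorbar are just guesses at what "do the plotting here" would contain):
import matplotlib.pyplot as plt

class KernelPlotter(object):
    def __init__(self, kernel):
        self.kernel = kernel

    def kernshow(self):
        # display the 2D kernel as an image
        plt.imshow(self.kernel, interpolation='nearest')
        plt.colorbar()
        plt.show()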
Which option you choose really depends on what makes sense for your particular problem domain.
There's a sister to function chaining called a fluent interface, which is basically function chaining with the goal of making the interface read like English. For example you might have something like:
Kernel(with_amplitude=1.5).create_gaussian(with_x=9, and_y=9, and_sigma_x=2, and_sigma_y=2).show_plot()
Though obviously there can be some problems when writing your code this way.
Here's how I would do it:
class Kernel(object):
    def __init__ ...

    def Gaussian(...):
        self.kernel = ...
        ...
        return self  # not kernel

    def kershow(self):
        do_stuff_with(self.kernel)
Basically the Gaussian method doesn't return a numpy array; it just stores it in the Kernel object to be used elsewhere in the class. In particular, kershow can now use it. The return self is optional but allows the kind of interface you wanted, where you write
conv.Gaussian(9, 9, 2, 2).kershow()
instead of
conv.Gaussian(9, 9, 2, 2)
conv.kershow()
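To make the idea concrete, here is a minimal runnable sketch of this approach. Note that the Gaussian body below is a simplified grid-evaluated Gaussian standing in for the original dblquad integration, and the matplotlib display is just one possible choice:
import numpy as np
import matplotlib.pyplot as plt

class Kernel(object):
    def __init__(self, Amplitude):
        self.Amplitude = Amplitude
        self.kernel = None

    def Gaussian(self, X, Y, sigmaX, sigmaY, muX=0.0, muY=0.0):
        # simplified stand-in for the dblquad-based kernel computation
        x = np.arange(X) - (X - 1) / 2.0
        y = np.arange(Y) - (Y - 1) / 2.0
        gx = np.exp(-(x - muX) ** 2 / (2 * sigmaX ** 2))
        gy = np.exp(-(y - muY) ** 2 / (2 * sigmaY ** 2))
        self.kernel = self.Amplitude * np.outer(gx, gy)
        return self                      # not the array, so calls can be chained

    def kershow(self):
        plt.imshow(self.kernel, interpolation='nearest')
        plt.colorbar()
        plt.show()

conv = Kernel(1.5)
conv.Gaussian(9, 9, 2, 2).kershow()      # chained call works now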
I am a C++ guy learning about lambda functions in Python, and I want to understand them inside out. I did some searches before posting here. Anyway, this piece of code came up:
<1> I don't quite understand the purpose of the lambda function here. Are we trying to get something like a function template? If so, why don't we just set up two parameters in the function's input?
<2> Also, make_incrementor(42) at this moment is equivalent to return x + 42, and x is the 0 and 1 in f(0) and f(1)?
<3> For f(0), does it not have the same effect as >>> f = make_incrementor(42)? For f(0), what are the values of x and n respectively?
Any comments are welcome! Thanks.
>>> def make_incrementor(n):
... return lambda x: x + n
...
>>> f = make_incrementor(42)
>>> f(0)
42
>>> f(1)
43
Yes, this is similar to a C++ int template. However, instead of at compile time (yes, Python (at least for CPython) is "compiled"), the function is created at run time. Why the lambda is used in this specific case is unclear, probably only for demonstration that functions can be returned from other functions rather than practical use. Sometimes, however, statements like this may be necessary if you need a function taking a specified number of arguments (e.g. for map, the function must take the same number of arguments as the number of iterables given to map) but the behaviour of the function should depend on other arguments.
make_incrementor returns a function that adds n (here, 42) to any x passed to that function. In your case the x values you tried are 0 and 1.
f = make_incrementor(42) sets f to a function that returns x + 42, whereas f(0) returns 0 + 42, which is 42 - the returned types and values differ (a function vs. an integer), so the two expressions don't have the same effect.
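For reference, the lambda is just a compact way of writing a nested def that closes over n; this equivalent version may make the values of x and n easier to see:
def make_incrementor(n):
    # the inner function "closes over" n, i.e. it remembers n's value
    def add_n(x):
        return x + n
    return add_n

f = make_incrementor(42)   # n is fixed to 42 from here on
print(f(0))                # 42  (x=0, n=42)
print(f(1))                # 43  (x=1, n=42)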
The purpose is to show a toy example of returning a lambda. It lets you create a function with data baked in. I have used this less trivial example of a similar technique:
def startsWithFunc(testString):
    return lambda x: x.find(testString) == 0
Then when I am parsing, I create some functions:
startsDescription = startsWithFunc("!Sample_description")
startMatrix = startsWithFunc("!series_matrix_table_begin")
Then in code I use:
while line:
    # .... other stuff
    if startsDescription(line):
        # do description work
    if startMatrix(line):
        # do matrix start work
    # other stuff ... increment line ... etc
Still perhaps trivial, but it shows creating general functions with data baked in.
Background
I have a function that takes a number of parameters and returns an error measure which I then want to minimize (using scipy.optimize.leastsq, but that is beside the point right now).
As a toy example, let's assume my function to optimize takes the four parameters a, b, c, d:
def f(a, b, c, d):
    err = a*b - c*d
    return err
The optimizer then wants a function with the signature func(x, *args) where x is the parameter vector.
That is, my function is currently written like:
def f_opt(x, *args):
    a, b, c, d = x
    err = a*b - c*d
    return err
But, now I want to do a number of experiments where I fix some parameters while keeping some parameters free in the optimization step.
I could of course do something like:
def f_ad_free(x, b, c):
    a, d = x
    return f(a, b, c, d)
But this will be cumbersome since I have over 10 parameters which means the combinations of different numbers of free-vs-fixed parameters will potentially be quite large.
First approach using dicts
One solution I had was to write my inner function f with keyword args instead of positional args and then wrap the solution like this:
def generate(func, all_param, fixed_param):
    param_dict = {k: None for k in all_param}
    free_param = [param for param in all_param if param not in fixed_param]

    def wrapped(x, *args):
        param_dict.update({k: v for k, v in zip(fixed_param, args)})
        param_dict.update({k: v for k, v in zip(free_param, x)})
        return func(**param_dict)
    return wrapped
Creating a function that fixes 'b' and 'c' then turns into the following:
all_params = ['a', 'b', 'c', 'd']
# f_inner is the keyword-argument version of f described above
f_bc_fixed = generate(f_inner, all_params, ['b', 'c'])
a = 1
b = 2
c = 3
d = 4
f_bc_fixed((a, d), b, c)
Question time!
My question is whether anyone can think of a neater way to solve this. Since the final function is going to be run in an optimization step, I can't accept too much overhead for each function call.
The time it takes to generate the optimization function is irrelevant.
I can think of several ways to avoid using a closure as you do above, though after doing some testing, I'm not sure either of these will be faster. One approach might be to skip the wrapper and just write a function that accepts
A vector
A list of free names
A dictionary mapping names to values.
Then do something very like what you do above, but in the function itself:
def f(free_vals, free_names, params):
    params.update(zip(free_names, free_vals))
    err = params['a'] * params['b'] - params['c'] * params['d']
    return err
For code that uses variable names multiple times, make vars local up front, e.g.
a = params['a']
b = params['b']
and so on. This might seem cumbersome, but it has the advantage of making everything explicit, avoiding the kinds of namespace searches that could make closures slow.
Then pass a list of free names and a dictionary of fixed params via the args parameter to optimize.leastsq. (Note that the params dictionary is mutable, which means that there could be side effects in theory; but in this case it shouldn't matter because only the free params are being overwritten by update, so I omitted the copy step for the sake of speed.)
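As a rough sketch of the wiring described above (not the original code: the fixed values, the single free parameter, and the np.atleast_1d wrapping needed because leastsq expects a residual vector are all illustrative assumptions):
import numpy as np
from scipy import optimize

def f(free_vals, free_names, params):
    params.update(zip(free_names, free_vals))
    # wrap the scalar error so leastsq sees a residual vector
    return np.atleast_1d(params['a'] * params['b'] - params['c'] * params['d'])

fixed = {'b': 2.0, 'c': 3.0, 'd': 4.0}   # fixed parameters
free_names = ['a']                       # names of the free parameters
x0 = [1.0]                               # initial guess for the free parameters

sol, ier = optimize.leastsq(f, x0, args=(free_names, fixed))
print(sol)   # roughly [6.], since 6*2 - 3*4 = 0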
The main downsides of this approach are that it shifts some complexity into the call to optimize.leastsq, and it makes your code less reusable. A second approach avoids those problems though it might not be quite as fast: using a callable class.
class OptWrapper(object):
    def __init__(self, func, free_names, **fixed_params):
        self.func = func
        self.free_names = free_names
        self.params = fixed_params

    def __call__(self, x, *args):
        self.params.update(zip(self.free_names, x))
        return self.func(**self.params)
You can see that I simplified the parameter structure for __init__; the fixed params are passed here as keyword arguments, and the user must ensure that free_names and fixed_params don't have overlapping names. I think the simplicity is worth the tradeoff but you can easily enforce the separation between the two just as you did in your wrapper code.
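A small usage sketch of the class above; f_inner here stands for a keyword-argument version of f, as in the question:
def f_inner(**params):
    return params['a'] * params['b'] - params['c'] * params['d']

opt_f = OptWrapper(f_inner, ['a', 'd'], b=2, c=3)   # b and c are fixed
print(opt_f([1, 4]))   # 1*2 - 3*4 = -10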
I like this second approach best; it has the flexibility of your closure-based approach, but I find it more readable. All the names are in (or can be accessed through) the local namespace, which I thought would speed things up -- but after some testing I think there's reason to believe that the closure approach will still be faster than this; accessing the __call__ method seems to add about 100 ns of overhead per call. I would strongly recommend testing if performance is a real issue.
Your generate function is basically the same as functools.partial, which is what I would use here.
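For instance, a rough sketch of how partial could take over the fixed-parameter part, with a thin wrapper still mapping the optimizer's vector x onto the free names (f_inner again being a keyword-argument version of f, not code from the question):
from functools import partial

def f_inner(*, a, b, c, d):
    return a * b - c * d

f_bc = partial(f_inner, b=2, c=3)                  # fix b and c up front
f_bc_opt = lambda x, *args: f_bc(a=x[0], d=x[1])   # map the vector onto the free names

print(f_bc_opt((1, 4)))   # 1*2 - 3*4 = -10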
I've found various detailed explanations of how to pass long lists of arguments into a function, but I still doubt whether that's the proper way to do it.
In other words, I suspect that I'm doing it wrong, but I can't see how to do it right.
The problem: I have a (not very long) recursive function which uses quite a number of variables and needs to modify the contents of at least some of them.
What I end up with is something like this:
def myFunction(alpha, beta, gamma, zeta, alphaList, betaList, gammaList, zetaList):
    <some operations>
    myFunction(alpha, beta, modGamma, zeta, modAlphaList, betaList, gammaList, modZetaList)
...and I want to see the changes I made to the original variables (in C I would just pass a reference, but I hear that in Python it's always a copy?).
Sorry if this is a noob question; I don't know how to phrase it well enough to find relevant answers.
You could wrap up all your parameters in a class, like this:
class FooParameters:
    alpha = 1.0
    beta = 1.0
    gamma = 1.0
    zeta = 1.0
    alphaList = []
    betaList = []
    gammaList = []
    zetaList = []
and then your function takes a single parameter instance:
from math import exp

def myFunction(params):
    omega = params.alpha * params.beta + exp(params.gamma)
    # more magic...
calling like:
testParams = FooParameters()
testParams.gamma = 2.3
myFunction(testParams)
print(testParams.zetaList)
Because the params instance is passed by reference, changes in the function are preserved.
This is commonly used in matplotlib, for example. They pass the long list of arguments using * or **, like:
def function(*args, **kwargs):
    # do something
Calling function:
function(1, 2, 3, 4, 5, a=1, b=2, c=3)
Here 1,2,3,4,5 will go to args and a=1, b=2, c=3 will go to kwargs, as a dictionary. So that they arrive at your function like:
args = (1, 2, 3, 4, 5)
kwargs = {'a': 1, 'b': 2, 'c': 3}
And you can treat them in the way you want.
I don't know where you got the idea that Python copies values when passing into a function. That is not at all true.
On the contrary: each parameter in a function is an additional name referring to the original object. If you change the value of that object in some way - for example, if it's a list and you change one of its members - then the original will also see that change. But if you rebind the name to something else - say by doing alpha = my_completely_new_value - then the original remains unchanged.
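A short demonstration of the difference between mutating an object and rebinding a name (the names here are just illustrative):
def modify(alpha, alpha_list):
    alpha_list.append(99)   # mutates the original list: the caller sees this
    alpha = 1000            # rebinds the local name only: the caller does not

a = 1
a_list = [1, 2, 3]
modify(a, a_list)
print(a)        # 1
print(a_list)   # [1, 2, 3, 99]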
You may be tempted to do something akin to this:
def myFunction(*args):
    var_names = ['alpha', 'beta', 'gamma', 'zeta']
    locals().update(zip(var_names, args))

myFunction(alpha, beta, gamma, zeta)
However, this 'often' won't work. I suggest introducing another namespace:
from collections import OrderedDict

def myFunction(*args):
    var_names = ['alpha', 'beta', 'gamma', 'zeta']
    vars = OrderedDict(zip(var_names, args))
    # get them all via vars[var_name]
    myFunction(*vars.values())  # since we used an OrderedDict we can simply do *.values()
You can capture the non-modified values in a closure:
def myFunction(alpha, beta, gamma, zeta, alphaList, betaList, gammaList, zetaList):
    def myInner(g=gamma, al=alphaList, zl=zetaList):
        <some operations>
        myInner(modGamma, modAlphaList, modZetaList)
    myInner(al=alphaList, zl=zetaList)
(BTW, this is about the only way to write a truly recursive function in Python.)
You could pass in a dictionary and return a new dictionary. Or put your method in a class and have alpha, beta etc. be attributes.
You should put myFunction in a class. Set up the class with the appropriate attributes and call the appropriate functions. The state is then well contained in the class.
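A minimal sketch of what that could look like, using made-up attribute names in the spirit of the question:
class Recurrence:
    """Holds the state that myFunction would otherwise pass around explicitly."""

    def __init__(self, alpha, beta, gamma, zeta):
        self.alpha, self.beta, self.gamma, self.zeta = alpha, beta, gamma, zeta
        self.gamma_list = []

    def my_function(self, depth=0):
        # mutate the attributes in place; the recursive call sees the same state
        self.gamma *= self.beta
        self.gamma_list.append(self.gamma)
        if depth < 3:                    # arbitrary stopping rule for the sketch
            self.my_function(depth + 1)

r = Recurrence(1.0, 0.5, 2.0, 0.1)
r.my_function()
print(r.gamma, r.gamma_list)             # the state survives the recursion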