I have a simple function
def square(x, a=1):
    return [x**2 + a, 2*x]
I want to minimize it over x, for several parameters a. I currently have loops that, in spirit, do something like this:
In [89]: from scipy import optimize
In [90]: res = optimize.minimize(square, 25, method='BFGS', jac=True)
In [91]: [res.x, res.fun]
Out[91]: [array([ 0.]), 1.0]
In [92]: l = lambda x: square(x, 2)
In [93]: res = optimize.minimize(l, 25, method='BFGS', jac=True)
In [94]: [res.x, res.fun]
Out[94]: [array([ 0.]), 2.0]
Now, the function is already vectorized
In [98]: square(array([2,3]))
Out[98]: [array([ 5, 10]), array([4, 6])]
In [99]: square(array([2,3]), array([2,3]))
Out[99]: [array([ 6, 12]), array([4, 6])]
Which means it would probably be much faster to run all the optimizations in parallel rather than looping. Is that something that's easily do-able with SciPy? Or any other 3rd party tool?
Here's another try, based on my original answer and the discussion that followed.
As far as I know, the scipy.optimize module is for functions with scalar or vector inputs and a scalar output, or "cost".
Since you're treating each equation as independent of the others, my best idea is to use the multiprocessing module to do the work in parallel. If the functions you're minimizing are as simple as the ones in your question, I'd say it's not worth the effort.
If the functions are more complex, and you'd like to divide the work up, try something like:
import numpy as np
from scipy import optimize
from multiprocessing import Pool
def square(x, a=1):
    return [np.sum(x**2 + a), 2*x]

def minimize(args):
    f, x, a = args
    res = optimize.minimize(f, x, method='BFGS', jac=True, args=(a,))
    return res.x
# your a values
a = np.arange(1,11)
# initial guess for all the x values
x = np.empty(len(a))
x[:] = 25
args = [(square, x[i], a[i]) for i in range(len(a))]
p = Pool(4)
print(p.map(minimize, args))
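One caveat, not from the original answer: on platforms that spawn rather than fork worker processes (e.g. Windows), the Pool creation and the map call should live under a main guard, so the last two lines become:

if __name__ == '__main__':
    p = Pool(4)
    print(p.map(minimize, args))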
I am a bit late to the party, but this may be interesting for people who want to reduce minimization time by parallel computing:
We implemented a parallel version of scipy.optimize.minimize(method='L-BFGS-B') in the package optimparallel available on PyPI. It can speed up the optimization by evaluating the objective function and the (approximate) gradient in parallel. Here is an example:
from optimparallel import minimize_parallel
def my_square(x, a=1):
    return (x - a)**2
minimize_parallel(fun=my_square, x0=1, args=11)
Note that the parallel implementation only reduces the optimization time for objective functions with a long evaluation time (say, longer than 0.1 seconds). (The original answer includes a figure illustrating the possible parallel scaling.)
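As a rough sketch of the kind of objective where the parallelism pays off (my example, not from the package documentation; the slow_square name and the 0.2-second sleep are made up to stand in for an expensive model evaluation):

import time
import numpy as np
from optimparallel import minimize_parallel

def slow_square(x, a=1):
    time.sleep(0.2)  # simulate an expensive objective evaluation
    return np.sum((x - a)**2)

if __name__ == '__main__':
    # objective and approximate gradient are evaluated in parallel workers
    res = minimize_parallel(fun=slow_square, x0=np.zeros(3), args=(11,))
    print(res.x)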
If I understand your intent, you can pass numpy arrays for both x and a, so you can optimize for all your a parameters at once.
Try something like:
def square(x, a=1):
    return [np.sum(x**2 + a), 2*x]
# your a values
a = np.arange(1,11)
# initial guess for all the x values
x = np.empty(len(a))
x[:] = 25
# extra arguments to pass to the objective function, in this case, your a values
args = (a,)
res = optimize.minimize(square, x, method='BFGS', jac=True, args=args)
This appears to give the correct results: each component of x is driven to 0, where the summed cost is 1 + 2 + ... + 10 = 55.
>>> res.x
[ -8.88178420e-16 -8.88178420e-16 -8.88178420e-16 -8.88178420e-16
-8.88178420e-16 -8.88178420e-16 -8.88178420e-16 -8.88178420e-16
-8.88178420e-16 -8.88178420e-16]
>>> res.fun
55.0
Related
If I have an array of x values and want to multiply each x value by different coefficients and sum the results, how can I do this by passing a function that handles the weighting and summation? For example, suppose I have x, coeffs, and a function custom_weight(x, a, b, c):
x = numpy.array([1, 2, 3, 4, 5, 6])
coeffs = numpy.array([[0.1, 0.2, 3.2], [4.5, 4.0, 0.005]])

def custom_weight(x, a, b, c):
    return a*x**2 + (x+b)**3 + x*c
I want x to be broadcast against each inner array of coeffs. In this case the final result should be an array with the shape (6, 2). The first call of the custom_weight function should look like custom_weight(x[0], *(coeffs[0])) == custom_weight(1, 0.1, 0.2, 3.2); the same happens for all the other x values (2-6), and then this repeats with the second set of coefficients.
I do realize that I could do this manually or with numpy.vectorize in a certain way, but I specifically want to use a function in that form. What I want is some function that would look like so:
numpy.the_function(x, coeffs, axis=0, func=custom_weight)
# the_function should take each x value and pass it to custom_weight as the first arg,
# then pass the column of coeffs (because axis=0) to custom_weight,
# unpacking the column into the args a, b, and c
The problem is more that your custom_weight function is not designed to be vectorized. You are looking for something like this:
def custom_weight(x, coeffs):
    # rows of x**2, x**3, x**1, combined linearly by each row of coeffs
    return coeffs @ x**np.array([[2, 3, 1]]).T
Output:
array([[ 3.5 , 8.4 , 15.9 , 27.2 , 43.5 , 66. ],
[ 8.505, 50.01 , 148.515, 328.02 , 612.525, 1026.03 ]])
So after messing around, one solution I found was vectorizing: transpose the coefficients when passing the arguments to custom_weight, then unpack them; broadcasting and np.vectorize take care of the rest.
import numpy as np

def custom_weight(x, a, b):
    return a*x**2 + b

x = np.linspace(-1, 1, 100)
coeffs = np.array([[0.2, 0.6],
                   [1.2, 0.1]])

vec_custom_weight = np.vectorize(custom_weight)
results = vec_custom_weight(x[:, np.newaxis], *coeffs.T).T
I am using a custom metric function with scipy's cdist function.
The custom function is something like
def cust_metric(u, v):
    dist = np.cumsum(np.gcd(u, v) * k)
    return dist
where k is an arbitrary coefficient.
Ideally, I was hoping to pass k as an argument when calling cdist like so:
d_ar = scipy.spatial.distance.cdist(arr1, arr2, metric=cust_metric(k=7))
However, this throws an error.
I was wondering if there is a simple solution that I may be missing?
A quick but non-elegant fix is to declare k as a global variable and adjust it when needed.
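For what it's worth, that workaround would look something like the following sketch (note I use np.sum rather than np.cumsum so the metric returns the scalar that cdist expects, and cast to int because np.gcd requires integers):

import numpy as np
from scipy.spatial import distance

k = 7  # module-level coefficient, adjusted before each cdist call

def cust_metric(u, v):
    # k is read from the enclosing module scope
    return np.sum(np.gcd(u.astype(int), v.astype(int)) * k)

arr1 = np.array([[5, 7], [6, 1]])
arr2 = np.array([[6, 7], [6, 1]])
d_ar = distance.cdist(arr1, arr2, metric=cust_metric)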
According to its documentation, the value for metric should be a callable (or a string for a particular fixed collection). In your case you could obtain that through
def cust_metric(k):
    return lambda u, v: np.cumsum(np.gcd(u, v) * k)
I do imagine your actual callable would look somewhat different, since np.cumsum returns an array when u and v are arrays, while the callable is supposed to produce a scalar. For example:
In [25]: arr1 = np.array([[5, 7], [6, 1]])
In [26]: arr2 = np.array([[6, 7], [6, 1]])
In [28]: def cust_metric(k):
...: return lambda u, v: np.sqrt(np.sum((k*u - v)**2))
...:
In [29]: scipy.spatial.distance.cdist(arr1, arr2, metric=cust_metric(k=7))
Out[29]:
array([[51.03920062, 56.08029957],
[36. , 36.49657518]])
In [30]: scipy.spatial.distance.cdist(arr1, arr2, metric=cust_metric(k=1))
Out[30]:
array([[1. , 6.08276253],
[6. , 0. ]])
Given the two matrices f and x:
def f11(x): return 1
def f12(x): return x+1
def f21(x): return np.log(x)
def f22(x): return np.exp(x)
f = np.matrix([[f11,f12],[f21,f22]])
x = np.matrix([[10,5],[3,8]])
How can I apply element-wise the matrix operator f to x (considering that the functions may be more complex, so it's just an example)?
Matrices are basically not designed to support such functionality. Instead you can use one function that accepts an array and returns the expected result. The reason to use array instead of matrix is that arrays are more flexible and work better with Python operations, like, in this case, in-place unpacking.
In [41]: def apply_f(matrix):
...: ((x, y), (z, t)) = matrix
...: return np.array([[1, y +1], [np.log(z), np.exp(t)]])
...:
In [42]: x = np.array([[3, 5], [10, 8]])
In [43]: apply_f(x)
Out[43]:
array([[1.00000000e+00, 6.00000000e+00],
[2.30258509e+00, 2.98095799e+03]])
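If the grid of functions is not fixed, a more generic sketch (my own variation, assuming the functions are stored in an object array) simply loops over them:

import numpy as np

def f11(x): return 1
def f12(x): return x + 1
def f21(x): return np.log(x)
def f22(x): return np.exp(x)

f = np.array([[f11, f12], [f21, f22]], dtype=object)
x = np.array([[10, 5], [3, 8]])

# apply f[i, j] to x[i, j] element-wise
result = np.array([[f[i, j](x[i, j]) for j in range(f.shape[1])]
                   for i in range(f.shape[0])])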
I have a lot of data to integrate over and would like to find a way of doing it all with just matrices, and would be willing to compromise on accuracy for a performance boost. What I have in mind is something like this:
import numpy as np
from scipy.integrate import quad

a = np.array([1, 2, 3])

def func(x):
    return x**2 + x

def func2(x):
    global a
    return a*x

def integrand(x):
    return func(x)*func2(x)

integrated = quad(integrand, 0, 1)
So I am trying to integrate each element in the array that comes out of integrand.
I'm aware that there is a possibility of using numpy.vectorize() like this:
integrated = numpy.vectorize(scipy.integrate.quad)(integrand, 0, 1)
but I can't get that working. Is there a way to do this in python?
Solution
Well, now that I've learnt a bit more Python, I can answer this question in case anyone happens to stumble upon it with the same question. The way to do it is to write the functions as though they are going to take scalar values, not vectors, as inputs. So, following from my code above, what we would have is something like:
import numpy as np
import scipy.integrate

a = np.array([1, 2, 3])  # arbitrary array, can be any size

def func(x):
    return x**2 + x

def func2(x, a):
    return a*x

def integrand(x, a):
    return func(x)*func2(x, a)

def integrated(a):
    result, abserr = scipy.integrate.quad(integrand, 0, 1, args=(a,))
    return result

def vectorizeInt():
    integrateArray = []
    for i in range(len(a)):
        integrate = integrated(a[i])
        integrateArray.append(integrate)
    return integrateArray
Note that the variable you are integrating over must be the first input to the function; this is required by scipy.integrate.quad. (If you are integrating over a method, it is the second argument, after the usual self, i.e. x is integrated in def integrand(self, x, a).) Also, args=(a,) is necessary to tell quad the value of a in the function integrand. If integrand has many arguments, say def integrand(x, a, b, c, d), you simply put the arguments in order in args, i.e. args=(a, b, c, d).
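For example, with a hypothetical four-parameter integrand (names made up for illustration):

import scipy.integrate

def integrand(x, a, b, c, d):
    # x, the integration variable, must come first
    return a*x**3 + b*x**2 + c*x + d

value, abserr = scipy.integrate.quad(integrand, 0, 1, args=(1, 2, 3, 4))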
vectorize won't help improve the performance of code that uses quad. To use quad, you'll have to call it separately for each component of the value returned by integrand.
For a vectorized but less accurate approximation, you can use numpy.trapz or scipy.integrate.simps.
Your function definition (at least the one shown in the question) is implemented using numpy functions that all support broadcasting, so given a grid of x values on [0, 1], you can do this:
In [270]: x = np.linspace(0.0, 1.0, 9).reshape(-1,1)
In [271]: x
Out[271]:
array([[ 0. ],
[ 0.125],
[ 0.25 ],
[ 0.375],
[ 0.5 ],
[ 0.625],
[ 0.75 ],
[ 0.875],
[ 1. ]])
In [272]: integrand(x)
Out[272]:
array([[ 0. , 0. , 0. ],
[ 0.01757812, 0.03515625, 0.05273438],
[ 0.078125 , 0.15625 , 0.234375 ],
[ 0.19335938, 0.38671875, 0.58007812],
[ 0.375 , 0.75 , 1.125 ],
[ 0.63476562, 1.26953125, 1.90429688],
[ 0.984375 , 1.96875 , 2.953125 ],
[ 1.43554688, 2.87109375, 4.30664062],
[ 2. , 4. , 6. ]])
That is, by making x an array with shape (n, 1), the value returned by integrand(x) has shape (n, 3). There is one column for each value in a.
You can pass that value to numpy.trapz() or scipy.integrate.simps(), using axis=0, to get the three approximations of the integrals. You'll probably want a finer grid:
In [292]: x = np.linspace(0.0, 1.0, 101).reshape(-1,1)
In [293]: np.trapz(integrand(x), x, axis=0)
Out[293]: array([ 0.583375, 1.16675 , 1.750125])
In [294]: simps(integrand(x), x, axis=0)
Out[294]: array([ 0.58333333, 1.16666667, 1.75 ])
Compare that to repeated calls to quad:
In [296]: np.array([quad(lambda t: integrand(t)[k], 0, 1)[0] for k in range(len(a))])
Out[296]: array([ 0.58333333, 1.16666667, 1.75 ])
Your function integrand (which I assume is just an example) is a cubic polynomial, for which Simpson's rule gives the exact result. In general, don't expect simps to give such an accurate answer.
quadpy (a project of mine) is fully vectorized. Install with
pip install quadpy
and then do
import numpy
import quadpy
def integrand(x):
    return [numpy.sin(x), numpy.exp(x)]  # , ...
res, err = quadpy.quad(integrand, 0, 1)
print(res)
print(err)
[0.45969769 1.71828183]
[1.30995437e-20 1.14828375e-19]
OK, so basically my problem is shifting frame of mind from solving math problems "on paper" to solving them by programming. Let me explain: I want to know whether it is possible to perform operations on a variable before assigning it a value. For instance, if I have something like (1-x)**n, can I first assign n a value, then expand the expression into the form specific to that degree, and only then give x a value or values? If I wasn't clear enough: if n=2, can I first turn the expression into the form 1 - 2x + x**2 and then, in the next step, take care of the value of x?
I want to write code for calculating and drawing an n-th degree Bezier curve. I am using Bernstein polynomials for this, and I realized that the equation consists of three parts. The first part is the polynomial coefficients, which all come from Pascal's triangle; I am calculating those and putting them in one list. The second part is the coordinates of the control points, which are also a kind of coefficient; those go in a separate list. Now comes the hard part: the part of the equation that contains the variable. Bernstein polynomials work with barycentric coordinates (meaning u and 1-u). The n-th degree formula for this part of the equation is
u**i * (1-u)**(n-i)
where n is the curve degree, i goes from 0 to n, and u is the variable. u is actually a normalized variable, meaning its value runs from 0 to 1, and I want to iterate it later in a certain number of steps (like 1000). The problem is that if I try to use the equation above directly, I keep getting an error, because Python doesn't know what to do with u. I thought about nested loops, in which the first would iterate the value of u from 0 to 1 and the second would take care of the equation from 0 to n, but I'm not sure that is the right solution, and I have no idea how to check the results. What do you think?
PS: I have not uploaded the code because I cannot even start the part I'm having problems with, and I think (but could be wrong) that it is separate from the rest of the code; but if you think it can help solve the problem, I can upload it.
You can do this with higher-order functions, that is, functions that return functions, as in
def Bernstein(n, i):
    def f(t):
        return t**i * (1.0-t)**(n-i)
    return f
that you could use like this
b52 = Bernstein(5,2)
val = b52(0.74)
but instead you'll rather use lists
Bernstein_ni = [Bernstein(n,i) for i in range(n+1)]
to be used in a higher order function to build the Bezier curve function
def mk_bezier(Px, Py):
    "Input: lists of control points; output: a function of t that returns (x, y)"
    n = len(Px)
    binomials = {0: [1], 1: [1, 1], 2: [1, 2, 1],
                 3: [1, 3, 3, 1], 4: [1, 4, 6, 4, 1], 5: [1, 5, 10, 10, 5, 1]}
    binomial = binomials[n-1]
    bPx = [b*x for b, x in zip(binomial, Px)]
    bPy = [b*y for b, y in zip(binomial, Py)]
    bns = [Bernstein(n-1, i) for i in range(n)]
    def f(t):
        x = 0; y = 0
        for i in range(n):
            berns = bns[i](t)
            x = x + bPx[i]*berns
            y = y + bPy[i]*berns
        return x, y
    return f
Eventually, in your program, you can use the function factory like this:
linear = mk_bezier([0.0, 1.0], [1.0, 0.0])
quadra = mk_bezier([0.0, 1.0, 2.0], [1.0, 3.0, 1.0])
for t in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0):
    l = linear(t); q = quadra(t)
    print("%3.1f (%6.4f,%6.4f) (%6.4f,%6.4f)" % (t, l[0], l[1], q[0], q[1]))
and this is the testing output
0.0 (0.0000,1.0000) (0.0000,1.0000)
0.1 (0.1000,0.9000) (0.2000,1.3600)
0.2 (0.2000,0.8000) (0.4000,1.6400)
0.3 (0.3000,0.7000) (0.6000,1.8400)
0.4 (0.4000,0.6000) (0.8000,1.9600)
0.5 (0.5000,0.5000) (1.0000,2.0000)
0.6 (0.6000,0.4000) (1.2000,1.9600)
0.7 (0.7000,0.3000) (1.4000,1.8400)
0.8 (0.8000,0.2000) (1.6000,1.6400)
0.9 (0.9000,0.1000) (1.8000,1.3600)
1.0 (1.0000,0.0000) (2.0000,1.0000)
Edit
I think that the right way to do it is at the module level, with a top-level sort-of-defaultdict that memoizes all the different lists required to perform the actual computations; but defaultdict doesn't pass a variable to its default_factory, and I don't feel like subclassing dict (not now) for the sake of this answer, the main reason being that I've never subclassed before...
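A minimal sketch of that memoization idea with a plain dict instead of a defaultdict subclass (the names here are mine, not from the code above):

_binomial_cache = {0: [1]}

def binomial_row(n):
    # compute and cache rows of Pascal's triangle on demand
    if n not in _binomial_cache:
        prev = binomial_row(n - 1)
        _binomial_cache[n] = [1] + [prev[i] + prev[i + 1]
                                    for i in range(n - 1)] + [1]
    return _binomial_cache[n]

mk_bezier could then call binomial_row(n-1) instead of relying on the hard-coded binomials table.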
In response to OP comment
You say that the function degree is the main parameter? But it is implicitly defined by the length of the list of control points...
N = user_input()
P0x = user_input()
P0y = user_input()
PNx = user_input()
PNy = user_input()
# code that computes P1, ..., PNminus1
orderN = mk_bezier([P0x,P1x,...,PNminus1x,PNx],
[P0y,P1y,...,PNminus1y,PNy])
x077, y077 = orderN(0.77)
But the customer is always right, so I'll never try again to convince you that my solution works for you if you state that it does things differently from your expectations.
There are Python packages for doing symbolic math, but it might be easier to use some of the polynomial functions available in Numpy. These functions use the convention that a polynomial is represented as an array of coefficients, starting with the lowest order coefficient. So a polynomial a*x^2 + b*x + c would be represented as array([c, b, a]).
Some examples:
In [49]: import numpy.polynomial.polynomial as poly
In [50]: p = [-1, 1] # -x + 1
In [51]: p = poly.polypow(p, 2)
In [52]: p # should be 1 - 2x + x^2
Out[52]: array([ 1., -2., 1.])
In [53]: x = np.arange(10)
In [54]: poly.polyval(x, p) # evaluate polynomial at points x
Out[54]: array([ 1., 0., 1., 4., 9., 16., 25., 36., 49., 64.])
And you could calculate your Bernstein polynomial in a way similar to this (there is still a binomial coefficient missing):
In [55]: def Bernstein(n, i):
...: part1 = poly.polypow([0, 1], i) # (0 + u)**i
...: part2 = poly.polypow([1, -1], n - i) # (1 - u)**(n - i)
...: return poly.polymul(part1, part2)
In [56]: p = Bernstein(3, 2)
In [57]: p
Out[57]: array([ 0., 0., 1., -1.])
In [58]: poly.polyval(x, p) # evaluate polynomial at points x
Out[58]: array([ 0., 0., -4., -18., ..., -448., -648.])
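For completeness, the symbolic route mentioned at the start of this answer would look something like this with sympy (assuming it is installed):

import sympy as sp

x = sp.symbols('x')
n = 2
expanded = sp.expand((1 - x)**n)  # -> x**2 - 2*x + 1
value = expanded.subs(x, 0.5)     # substitute a value for x only afterwards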