Related
I have an array of x values and want to multiply each x value by different coefficients and sum the results. However, I want this to happen by passing a function that handles the weighting and summation. For example, suppose I have x, coeffs and a function custom_weight(x, a, b, c):
x = numpy.array([1, 2, 3, 4, 5, 6])
coeffs = numpy.array([[0.1, 0.2, 3.2], [4.5, 4.0, 0.005]])
def custom_weight(x, a, b, c):
    return a*x**2 + (x+b)**3 + x*c
I want x to be broadcast against each inner array of coeffs, so in this case the final result should be an array with shape (6, 2). The first call should look like custom_weight(x[0], *coeffs[0]) == custom_weight(1, 0.1, 0.2, 3.2), and the same happens for the other x values 2-6. Then the whole thing is repeated over the x values using the second set of coefficients.
I realize that I could do this manually, or with numpy.vectorize in some way, but I specifically want to use a function of that form. What I want is some function that would look like this:
numpy.the_function(x, coeffs, axis=0, custom_weight)
# the_function should take each x value and pass it to custom_weight as the first arg.
# then pass the column of coeffs (because axis=0)
# to custom_weight but it should do this by unpacking the column into the args a, b, and c
The problem is mostly that your custom_weight function is not designed to be vectorized. You are looking for something like this:
def custom_weight(x, coeffs):
    return coeffs @ x**np.array([[2,3,1]]).T
Output:
array([[ 3.5 , 8.4 , 15.9 , 27.2 , 43.5 , 66. ],
[ 8.505, 50.01 , 148.515, 328.02 , 612.525, 1026.03 ]])
After some experimenting, one solution I found uses np.vectorize: transpose the coefficients and unpack them when calling custom_weight, so that each row of coeffs.T becomes one argument. Broadcasting against x[:, np.newaxis] and np.vectorize take care of the rest.
import numpy as np
def custom_weight(x, a, b):
    return a*x**2 + b
x = np.linspace(-1, 1, 100)
coeffs = np.array([[0.2, 0.6],
                   [1.2, 0.1]])
vec_custom_weight = np.vectorize(custom_weight)
# x[:, np.newaxis] has shape (100, 1); each unpacked row of coeffs.T has shape (2,)
results = vec_custom_weight(x[:, np.newaxis], *coeffs.T).T  # shape (2, 100)
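If custom_weight is built only from arithmetic that NumPy can broadcast, as the original three-coefficient version is, the same unpacking trick may work without np.vectorize at all. A minimal sketch, assuming the x and coeffs from the question (the name result is just for illustration):

```python
import numpy as np

def custom_weight(x, a, b, c):
    # Plain arithmetic, so NumPy broadcasting applies to every argument
    return a * x**2 + (x + b)**3 + x * c

x = np.array([1, 2, 3, 4, 5, 6])
coeffs = np.array([[0.1, 0.2, 3.2], [4.5, 4.0, 0.005]])

# coeffs.T has one row per parameter; each of a, b, c has shape (2,)
a, b, c = coeffs.T

# x[:, np.newaxis] has shape (6, 1); broadcasting against (2,) gives (6, 2)
result = custom_weight(x[:, np.newaxis], a, b, c)
```

Here result[i, k] corresponds to custom_weight(x[i], *coeffs[k]), which is the (6, 2) layout asked for in the question.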
I have a lot of data to integrate over and would like to find a way of doing it all with just matrices, and would be willing to compromise on accuracy for a performance boost. What I have in mind is something like this:
import numpy as np
from scipy.integrate import quad
a = np.array([1,2,3])
def func(x):
    return x**2 + x
def func2(x):
    global a
    return a*x
def integrand(x):
    return func(x)*func2(x)
integrated = quad(integrand, 0, 1)
So I am trying to integrate each element in the array that comes out of integrand.
I'm aware that there is a possibility of using numpy.vectorize() like this:
integrated = numpy.vectorize(scipy.integrate.quad)(integrand, 0, 1)
but I can't get that working. Is there a way to do this in python?
Solution
Well, now that I have learnt a bit more Python, I can answer this question in case anyone happens to stumble upon it with the same problem. The way to do it is to write the functions as though they take scalar values, not vectors, as inputs. So, following on from my code above, we would have something like
import numpy as np
import scipy.integrate
a = np.array([1, 2, 3]) # arbitrary array, can be any size
def func(x):
    return x**2 + x
def func2(x, a):
    return a*x
def integrand(x, a):
    return func(x)*func2(x, a)
def integrated(a):
    # quad returns (value, error estimate); args must be a tuple
    integrated, tmp = scipy.integrate.quad(integrand, 0, 1, args=(a,))
    return integrated
def vectorizeInt():
    global a
    integrateArray = []
    for i in range(len(a)):
        integrate = integrated(a[i])
        integrateArray.append(integrate)
    return integrateArray
Note that the variable you are integrating over must be the first input to the function. This is required by scipy.integrate.quad. If you are integrating over a method, it is the second argument after the usual self (i.e. x is integrated in def integrand(self, x, a):). Also, args = (a,) (a one-element tuple) is necessary to tell quad the value of a in the function integrand. If integrand has many arguments, say def integrand(x, a, b, c, d):, you simply put the extra arguments in order in args, so that would be args = (a, b, c, d).
vectorize won't help improve the performance of code that uses quad. To use quad, you'll have to call it separately for each component of the value returned by integrand.
For a vectorized but less accurate approximation, you can use numpy.trapz or scipy.integrate.simps.
Your function definition (at least the one shown in the question) is implemented using numpy functions that all support broadcasting, so given a grid of x values on [0, 1], you can do this:
In [270]: x = np.linspace(0.0, 1.0, 9).reshape(-1,1)
In [271]: x
Out[271]:
array([[ 0. ],
[ 0.125],
[ 0.25 ],
[ 0.375],
[ 0.5 ],
[ 0.625],
[ 0.75 ],
[ 0.875],
[ 1. ]])
In [272]: integrand(x)
Out[272]:
array([[ 0. , 0. , 0. ],
[ 0.01757812, 0.03515625, 0.05273438],
[ 0.078125 , 0.15625 , 0.234375 ],
[ 0.19335938, 0.38671875, 0.58007812],
[ 0.375 , 0.75 , 1.125 ],
[ 0.63476562, 1.26953125, 1.90429688],
[ 0.984375 , 1.96875 , 2.953125 ],
[ 1.43554688, 2.87109375, 4.30664062],
[ 2. , 4. , 6. ]])
That is, by making x an array with shape (n, 1), the value returned by integrand(x) has shape (n, 3). There is one column for each value in a.
You can pass that value to numpy.trapz() or scipy.integrate.simps(), using axis=0, to get the three approximations of the integrals. You'll probably want a finer grid:
In [292]: x = np.linspace(0.0, 1.0, 101).reshape(-1,1)
In [293]: np.trapz(integrand(x), x, axis=0)
Out[293]: array([ 0.583375, 1.16675 , 1.750125])
In [294]: simps(integrand(x), x, axis=0)
Out[294]: array([ 0.58333333, 1.16666667, 1.75 ])
Compare that to repeated calls to quad:
In [296]: np.array([quad(lambda t: integrand(t)[k], 0, 1)[0] for k in range(len(a))])
Out[296]: array([ 0.58333333, 1.16666667, 1.75 ])
Your function integrand (which I assume is just an example) is a cubic polynomial in x, for which Simpson's rule gives the exact result. In general, don't expect simps to give such an accurate answer.
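The grid-based approach above can be put together into one script. This is only a sketch: it writes the trapezoidal and Simpson weights out by hand in plain NumPy, so it does not depend on which NumPy or SciPy release provides trapz and simps (newer releases call them trapezoid and simpson), and it passes a explicitly instead of using a global. The names w_trap, w_simp, approx_trap and approx_simp are illustrative.

```python
import numpy as np

a = np.array([1, 2, 3])

def integrand(x, a):
    # Same integrand as in the question: (x**2 + x) * (a * x)
    return (x**2 + x) * (a * x)

n_intervals = 100                      # must be even for Simpson's rule
xs = np.linspace(0.0, 1.0, n_intervals + 1)
h = xs[1] - xs[0]
ys = integrand(xs[:, np.newaxis], a)   # shape (101, 3), one column per a

# Composite trapezoidal weights: h * [1/2, 1, ..., 1, 1/2]
w_trap = np.full(xs.size, h)
w_trap[[0, -1]] = h / 2

# Composite Simpson weights: h/3 * [1, 4, 2, 4, ..., 2, 4, 1]
w_simp = np.full(xs.size, 2.0)
w_simp[1::2] = 4.0
w_simp[[0, -1]] = 1.0
w_simp *= h / 3

# One weighted sum per column gives all three integrals at once
approx_trap = w_trap @ ys
approx_simp = w_simp @ ys
```

Both results are arrays of shape (3,), one integral per entry of a, and no Python-level loop over a is needed.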
quadpy (a project of mine) is fully vectorized. Install with
pip install quadpy
and then do
import numpy
import quadpy
def integrand(x):
    return [numpy.sin(x), numpy.exp(x)] # ,...
res, err = quadpy.quad(integrand, 0, 1)
print(res)
print(err)
[0.45969769 1.71828183]
[1.30995437e-20 1.14828375e-19]
I have two numpy arrays
import numpy as np
x = np.linspace(1e10, 1e12, num=50) # 50 values
y = np.linspace(1e5, 1e7, num=50) # 50 values
x.shape # output is (50,)
y.shape # output is (50,)
I would like to create a function which returns an array shaped (50,50) such that the first x value x0 is evaluated for all y values, etc.
The current function I am using is fairly complicated, so let's use an easier example. Let's say the function is
def func(x,y):
    return x**2 + y**2
How do I shape this to be a (50,50) array? At the moment, it will output 50 values. Would you use a for loop inside an array?
Something like:
np.array([[func(x,y) for i in x] for j in y])
but without using two for loops. This takes forever to run.
EDIT: It has been requested I share my "complicated" function. Here it goes:
There is a data vector which is a 1D numpy array of 4000 measurements. There is also a "normalized_matrix", which is shaped (4000,4000); it is nothing special, just a matrix whose entries are real numbers between 0 and 1, e.g. 0.5567878. These are the two "given" inputs.
My function returns the matrix multiplication product of transpose(datavector) * matrix * datavector, which is a single value.
Now, as you can see in the code, I have initialized two arrays, x and y, which pass through a series of "x parameters" and "y parameters". That is, what does func(x,y) return for value x1 and value y1, i.e. func(x1,y1)?
The shape of matrix1 is (50, 4000, 4000). The shape of matrix2 is (50, 4000, 4000). Ditto for total_matrix.
normalized_matrix is shape (4000,4000) and id_mat is shaped (4000,4000).
normalized_matrix
print(normalized_matrix.shape) #output (4000,4000)
data_vector = datarr
print(datarr.shape) #output (4000,)
def func(x, y):
    matrix1 = x [:, None, None] * normalized_matrix[None, :, :]
    matrix2 = y[:, None, None] * id_mat[None, :, :]
    total_matrix = matrix1 + matrix2
    # transpose(datavector) * matrix * datavector
    # by matrix multiplication, equals single value
    return np.array([ np.dot(datarr.T, np.dot(total_matrix, datarr) ) ])
If I try to use np.meshgrid(), that is, if I try
x = np.linspace(1e10, 1e12, num=50) # 50 values
y = np.linspace(1e5, 1e7, num=50) # 50 values
X, Y = np.meshgrid(x,y)
z = func(X, Y)
I get the following value error: ValueError: operands could not be broadcast together with shapes (50,1,1,50) (1,4000,4000).
reshape in numpy has a different meaning. When you start with a (100,) array and change it to a (5,20) or (10,10) 2d array, that is a 'reshape'. There is a numpy function to do that.
You want to take two 1d arrays and use them to generate a 2d array from a function. This is like taking an outer product of the two, passing all combinations of their values through your function.
Some sort of double loop is one way of doing this, whether it is with an explicit loop, or list comprehension. But speeding this up depends on that function.
For that x**2+y**2 example, it can be 'vectorized' quite easily:
In [40]: x=np.linspace(1e10,1e12,num=10)
In [45]: y=np.linspace(1e5,1e7,num=5)
In [46]: z = x[:,None]**2 + y[None,:]**2
In [47]: z.shape
Out[47]: (10, 5)
This takes advantage of numpy broadcasting. With the None, x is reshaped to (10,1) and y to (1,5), and the + takes an outer sum.
X,Y=np.meshgrid(x,y,indexing='ij') produces two (10,5) arrays that can be used the same way. Look at its docs for other parameters.
So if your more complex function can be written in a way that takes 2d arrays like this, it is easy to 'vectorize'.
But if that function must take 2 scalars, and return another scalar, then you are stuck with some sort of double loop.
A list comprehension form of the double loop is:
np.array([[x1**2+y1**2 for y1 in y] for x1 in x])
Another is:
z=np.empty((10,5))
for i in range(10):
    for j in range(5):
        z[i,j] = x[i]**2 + y[j]**2
This double loop can be sped up somewhat by using np.vectorize. This takes a user defined function, and returns one that can take broadcastable arrays:
In [65]: vprod=np.vectorize(lambda x,y: x**2+y**2)
In [66]: vprod(x[:,None],y[None,:]).shape
Out[66]: (10, 5)
Tests that I've done in the past show that vectorize can improve on the list comprehension route by something like 20%, but the improvement is nothing like what you get from writing your function to work with 2d arrays in the first place.
By the way, this sort of 'vectorization' question has been asked many times on SO numpy. Beyond these broad examples, we can't help you without knowing more about that more complicated function. As long as it is a black box that takes scalars, the best we can help you with is np.vectorize. And you still need to understand broadcasting (with or without meshgrid help).
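The two array-based routes described above (explicit broadcasting with None, and meshgrid with indexing='ij') can be checked against each other on the 50-point arrays from the question. A short sketch, with z_broadcast and z_meshgrid as illustrative names:

```python
import numpy as np

def func(x, y):
    # Works unchanged on scalars and on broadcastable arrays
    return x**2 + y**2

x = np.linspace(1e10, 1e12, num=50)
y = np.linspace(1e5, 1e7, num=50)

# Route 1: add singleton axes so (50, 1) and (1, 50) broadcast to (50, 50)
z_broadcast = func(x[:, None], y[None, :])

# Route 2: build full (50, 50) coordinate arrays first
X, Y = np.meshgrid(x, y, indexing='ij')
z_meshgrid = func(X, Y)
```

With indexing='ij', entry [i, j] of either result is func(x[i], y[j]); the default indexing='xy' would give the transposed layout.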
I think there is a better way, it is right on the tip of my tongue, but as an interim measure:
You are operating on 1x2 windows of a meshgrid. You can use as_strided from numpy.lib.stride_tricks to rearrange the meshgrid into two-element windows, then apply your function to the resultant array. I like to use a generic nd solution, sliding_windows (http://www.johnvinyard.com/blog/?p=268) (Not mine) to transform the array.
import numpy as np
a = np.array([1,2,3])
b = np.array([.1, .2, .3])
z= np.array(np.meshgrid(a,b))
def foo(xy):
    x, y = xy  # unpack explicitly; tuple parameters are not valid in Python 3
    return x+y
>>> z.shape
(2, 3, 3)
>>> t = sliding_window(z, (2,1,1))
>>> t
array([[ 1. , 0.1],
[ 2. , 0.1],
[ 3. , 0.1],
[ 1. , 0.2],
[ 2. , 0.2],
[ 3. , 0.2],
[ 1. , 0.3],
[ 2. , 0.3],
[ 3. , 0.3]])
>>> v = np.apply_along_axis(foo, 1, t)
>>> v
array([ 1.1, 2.1, 3.1, 1.2, 2.2, 3.2, 1.3, 2.3, 3.3])
>>> v.reshape((len(a), len(b)))
array([[ 1.1, 2.1, 3.1],
[ 1.2, 2.2, 3.2],
[ 1.3, 2.3, 3.3]])
>>>
This should be somewhat faster.
You may need to modify your function's argument signature.
If the link to the johnvinyard.com blog breaks, I've posted the sliding_window implementation in other SO answers - https://stackoverflow.com/a/22749434/2823755
Search around and you'll find many other tricky as_strided solutions.
In response to your edited question:
normalized_matrix
print(normalized_matrix.shape) #output (4000,4000)
data_vector = datarr
print(datarr.shape) #output (4000,)
def func(x, y):
    matrix1 = x [:, None, None] * normalized_matrix[None, :, :]
    matrix2 = y[:, None, None] * id_mat[None, :, :]
    total_matrix = matrix1 + matrix2
    # transpose(datavector) * matrix * datavector
    # by matrix multiplication, equals single value
    # return np.array([ np.dot(datarr.T, np.dot(total_matrix, datarr))])
    return np.einsum('j,ijk,k->i',datarr,total_matrix,datarr)
Since datarr is shape (4000,), transpose does nothing. I believe you want the result of the 2 dots to be shape (50,). I'm suggesting using einsum. But it can be done with tensordot, or I think even np.dot(np.dot(total_matrix, datarr),datarr). Test the expression with smaller arrays, focusing on getting the shapes right.
x = np.linspace(1e10, 1e12, num=50) # 50 values
y = np.linspace(1e5, 1e7, num=50) # 50 values
z = func(x,y)
# X, Y = np.meshgrid(x,y)
# z = func(X, Y)
Passing X,Y is wrong. func takes x and y that are 1d. Notice how you expand the dimensions with [:, None, None]. Also, you aren't creating a 2d array from an outer combination of x and y. None of your arrays in func is (50,50) or (50,50,...). The higher dimensions are provided by normalized_matrix and id_mat.
When showing us the ValueError you should also indicate where in your code that occurred. Otherwise we have to guess, or recreate the code ourselves.
In fact when I run my edited func(X,Y), I get this error:
----> 2 matrix1 = x [:, None, None] * normalized_matrix[None, :, :]
3 matrix2 = y[:, None, None] * id_mat[None, :, :]
4 total_matrix = matrix1 + matrix2
5 # transpose(datavector) * matrix * datavector
ValueError: operands could not be broadcast together with shapes (50,1,1,50) (1,400,400)
See, the error occurs right at the start. normalized_matrix is expanded to (1,400,400) [I'm using smaller examples]. The (50,50) X is expanded to (50,1,1,50). x expands to (50,1,1), which broadcasts just fine.
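One further observation, offered as a sketch rather than as part of the answer above: because the quadratic form is linear in the matrix, d^T (x*N + y*I) d = x*(d^T N d) + y*(d^T I d). If a full grid over all x and y combinations is what is wanted, it may be enough to compute the two scalar quadratic forms once and combine them by broadcasting, without building any (50, 4000, 4000) arrays. The small random arrays below are stand-ins for the real data, and q_norm, q_id, grid and check are illustrative names.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6                                   # stand-in for 4000
datarr = rng.normal(size=n)
normalized_matrix = rng.random((n, n))  # entries between 0 and 1
id_mat = np.eye(n)                      # assumed identity; any (n, n) matrix works

x = np.linspace(1.0, 2.0, 4)
y = np.linspace(10.0, 20.0, 5)

# Two scalar quadratic forms, each computed once
q_norm = datarr @ normalized_matrix @ datarr
q_id = datarr @ id_mat @ datarr

# Outer combination by broadcasting: grid[i, j] uses x[i] and y[j]
grid = x[:, None] * q_norm + y[None, :] * q_id

# Slow reference: build each total matrix explicitly
check = np.array([[datarr @ (xi * normalized_matrix + yj * id_mat) @ datarr
                   for yj in y] for xi in x])
```

The reference double loop is there only to confirm the shortcut on small inputs; for the real sizes only the two quadratic forms and the final broadcast would be needed.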
To address the edit and the broadcasting error in the edit:
Inside your function you are adding dimensions to arrays to try to get them to broadcast.
matrix1 = x [:, None, None] * normalized_matrix[None, :, :]
This expression looks like you want to broadcast a 1d array with a 2d array.
The results of your meshgrid are two 2d arrays:
X,Y = np.meshgrid(x,y)
>>> X.shape, Y.shape
((50, 50), (50, 50))
>>>
When you try to use X in your broadcasting expression the dimensions don't line up; that is what causes the ValueError. Refer to the General Broadcasting Rules:
>>> x1 = X[:, np.newaxis, np.newaxis]
>>> nm = normalized_matrix[np.newaxis, :, :]
>>> x1.shape
(50, 1, 1, 50)
>>> nm.shape
(1, 4000, 4000)
>>>
You're on the right track with your list comprehension, you just need to add in an extra level of iteration:
np.array([[func(i,j) for i in x] for j in y])
OK, so basically my problem is shifting my frame of mind from solving math problems "on paper" to solving them by programming. Let me explain: I want to know whether it is possible to perform operations on a variable before assigning it a value. For example, if I have something like (1-x)**n, can I first assign n a value, then turn the expression into the form specific to that degree, and only then give x a value or values? In case that wasn't clear enough: if n=2, can I first expand the expression into 1-2x+x**2 and then, in the next step, take care of the value of x?
I want to write code for calculating and drawing an n-th degree Bezier curve. I am using Bernstein polynomials for this, and I realized that the equation consists of three parts. The first part is the polynomial coefficients, which are all entries of Pascal's triangle; I calculate those and put them in one list. The second part is the coordinates of the control points, which are also a kind of coefficient; I put them in a separate list. Now comes the hard part: the part of the equation that contains the variable. Bernstein polynomials work with barycentric coordinates (meaning u and 1-u). The n-th degree formula for this part of the equation is:
u**i *(1-u)**(n-i)
where n is the curve degree, i goes from 0 to n and u is the variable. u is actually a normalized variable, meaning that its value ranges from 0 to 1, and I want to iterate over it later in a certain number of steps (like 1000). The problem is that if I try to use the equation above I keep getting an error, because Python doesn't know what to do with u. I thought about nested loops in which the outer one would iterate u from 0 to 1 and the inner one would evaluate the equation for i from 0 to n, but I am not sure whether that is the right solution, and I have no idea how to check the results. What do you think?
PS: I have not uploaded the code because I cannot even get started on the part I am having problems with, and I think (but could be wrong) that it is separate from the rest of the code; if you think it would help solve the problem I can upload it.
You can do with higher-order functions, that is functions that return functions, like in
def Bernstein(n,i):
    def f(t):
        return t**i*(1.0-t)**(n-i)
    return f
that you could use like this
b52 = Bernstein(5,2)
val = b52(0.74)
but instead you'll rather use lists
Bernstein_ni = [Bernstein(n,i) for i in range(n+1)]
to be used in a higher order function to build the Bezier curve function
def mk_bezier(Px,Py):
    "Input, lists of control points, output a function of t that returns (x,y)"
    n = len(Px)
    binomials = {0:[1], 1:[1,1], 2:[1,2,1],
                 3:[1,3,3,1], 4:[1,4,6,4,1], 5:[1,5,10,10,5,1]}
    binomial = binomials[n-1]
    bPx = [b*x for b,x in zip(binomial,Px)]
    bPy = [b*y for b,y in zip(binomial,Py)]
    bns = [Bernstein(n-1,i) for i in range(n)]
    def f(t):
        x = 0 ; y = 0
        for i in range(n):
            berns = bns[i](t)
            x = x + bPx[i]*berns
            y = y + bPy[i]*berns
        return x, y
    return f
eventually, in your program, you can use the function factory like this
linear = mk_bezier([0.0,1.0],[1.0,0.0])
quadra = mk_bezier([0.0,1.0,2.0],[1.0,3.0,1.0])
for t in (0.0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0):
    l = linear(t) ; q = quadra(t)
    print("%3.1f (%6.4f,%6.4f) (%6.4f,%6.4f)" % (t, l[0],l[1], q[0],q[1]))
and this is the testing output
0.0 (0.0000,1.0000) (0.0000,1.0000)
0.1 (0.1000,0.9000) (0.2000,1.3600)
0.2 (0.2000,0.8000) (0.4000,1.6400)
0.3 (0.3000,0.7000) (0.6000,1.8400)
0.4 (0.4000,0.6000) (0.8000,1.9600)
0.5 (0.5000,0.5000) (1.0000,2.0000)
0.6 (0.6000,0.4000) (1.2000,1.9600)
0.7 (0.7000,0.3000) (1.4000,1.8400)
0.8 (0.8000,0.2000) (1.6000,1.6400)
0.9 (0.9000,0.1000) (1.8000,1.3600)
1.0 (1.0000,0.0000) (2.0000,1.0000)
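The hard-coded binomials table above limits the factory to degree 5. A possible variant, sketched here under the assumption that Python 3.8 or later is available (for math.comb), computes the binomial coefficient inside the Bernstein factory so that any degree works. The lower-case names bernstein and the generator-based sums are choices made for this sketch, not part of the answer above.

```python
from math import comb

def bernstein(n, i):
    """Return the i-th Bernstein basis polynomial of degree n as a function of t."""
    coeff = comb(n, i)  # binomial coefficient, replaces the lookup table
    def f(t):
        return coeff * t**i * (1.0 - t)**(n - i)
    return f

def mk_bezier(Px, Py):
    """Input: lists of control point coordinates. Output: function t -> (x, y)."""
    n = len(Px) - 1  # curve degree is implied by the number of control points
    basis = [bernstein(n, i) for i in range(n + 1)]
    def f(t):
        weights = [b(t) for b in basis]
        x = sum(w * px for w, px in zip(weights, Px))
        y = sum(w * py for w, py in zip(weights, Py))
        return x, y
    return f

linear = mk_bezier([0.0, 1.0], [1.0, 0.0])
quadra = mk_bezier([0.0, 1.0, 2.0], [1.0, 3.0, 1.0])
```

Calling linear(t) and quadra(t) for the same t values should reproduce the table above, for example quadra(0.5) gives the point (1.0, 2.0).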
Edit
I think that the right way to do it is at the module level, with a top level sort-of-defaultdictionary that memoizes all the different lists required to perform the actual computations, but defaultdict doesn't pass a variable to its default_factory and I don't feel like subclassing dict (not now) for the sake of this answer, the main reason being that I've never subclassed before...
In response to OP comment
You say that the function degree is the main parameter? But it is implicitly defined by the length of the list of control points...
N = user_input()
P0x = user_input()
P0y = user_input()
PNx = user_input()
PNy = user_input()
# code that computes P1, ..., PNminus1
orderN = mk_bezier([P0x,P1x,...,PNminus1x,PNx],
[P0y,P1y,...,PNminus1y,PNy])
x077, y077 = orderN(0.77)
But the customer is always right, so I'll never try again to convince you that my solution works for you if you state that it does things differently from your expectations.
There are Python packages for doing symbolic math, but it might be easier to use some of the polynomial functions available in Numpy. These functions use the convention that a polynomial is represented as an array of coefficients, starting with the lowest order coefficient. So a polynomial a*x^2 + b*x + c would be represented as array([c, b, a]).
Some examples:
In [49]: import numpy.polynomial.polynomial as poly
In [50]: p = [-1, 1] # -x + 1
In [51]: p = poly.polypow(p, 2)
In [52]: p # should be 1 - 2x + x^2
Out[52]: array([ 1., -2., 1.])
In [53]: x = np.arange(10)
In [54]: poly.polyval(x, p) # evaluate polynomial at points x
Out[54]: array([ 1., 0., 1., 4., 9., 16., 25., 36., 49., 64.])
And you could calculate your Bernstein polynomial in a way similar to this (there is still a binomial coefficient missing):
In [55]: def Bernstein(n, i):
...: part1 = poly.polypow([0, 1], i) # (0 + u)**i
...: part2 = poly.polypow([1, -1], n - i) # (1 - u)**(n - i)
...: return poly.polymul(part1, part2)
In [56]: p = Bernstein(3, 2)
In [57]: p
Out[57]: array([ 0., 0., 1., -1.])
In [58]: poly.polyval(x, p) # evaluate polynomial at points x
Out[58]: array([ 0., 0., -4., -18., ..., -448., -648.])
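The missing binomial coefficient mentioned above can be supplied with math.comb (Python 3.8+, an assumption of this sketch). The function name bernstein_coeffs and the check on [0, 1] are illustrative; the polynomial calls are the same numpy.polynomial.polynomial functions used above.

```python
from math import comb

import numpy as np
import numpy.polynomial.polynomial as poly

def bernstein_coeffs(n, i):
    """Coefficients (lowest order first) of C(n, i) * u**i * (1 - u)**(n - i)."""
    part1 = poly.polypow([0, 1], i)        # (0 + u)**i
    part2 = poly.polypow([1, -1], n - i)   # (1 - u)**(n - i)
    return comb(n, i) * poly.polymul(part1, part2)

# Evaluate all four cubic basis polynomials on a few points in [0, 1]
u = np.linspace(0.0, 1.0, 5)
basis_values = np.array([poly.polyval(u, bernstein_coeffs(3, i))
                         for i in range(4)])
```

A quick sanity check is that the complete basis sums to 1 at every u, which is the property that makes the curve a weighted average of its control points.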
I am given a list of lambda expressions (by Sympy's lambdify), some explicitly depending on a variable x, some constant. I would like to evaluate them consistently with Numpy arrays.
When evaluating a lambda expression, e.g., lambda x: 1.0 + x**2, with a Numpy array x, the result will have the same shape as the array. If the expression happens to not explicitly contain x though, e.g., g = lambda x: 1.0, only a scalar is returned.
import numpy
f = [lambda x: 1.0 + x**2, lambda x: 1.0]
X = numpy.array([1, 2, 3])
print(f[0](X))
print(f[1](X))
returns
[ 2. 5. 10.]
1.0
Is there a way to get the shapes of the output arguments consistent?
You could use ones_like:
>>> X = numpy.array([1, 2, 3])
>>> def g(x): return numpy.ones_like(x)
>>> g(X)
array([1, 1, 1])
Note that this returns integers, not floats, because that was the input dtype; you could specify dtype=float or multiply by 1.0 if you prefer to always get floats out.
PS: It's a little odd to use a lambda and then immediately give it a name. It's like wearing a mask but handing out business cards.
PPS: back before ones_like I tended to use x*0+1 when I wanted something appropriately shaped.
I don't see the problem, just do:
import numpy as np
X = np.array([1, 2, 3])
f = lambda x: 1.0 + x**2
print(f(X))
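g = lambda x: np.ones(shape=(len(x),))
print(g(X))
Which prints:
[ 2. 5. 10.]
[ 1. 1. 1.]
Notice that, for a 1-D input, np.ones(shape=(len(x),)) has the same shape as np.ones_like(x); the difference is that ones_like also copies the input's dtype, so an integer input gives integer ones.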
Use ones_like:
g = lambda x: np.ones_like(x) * 1.0
There's also this slightly hackier solution:
g = lambda x: 1.0 + (x*0)
You seem to want an array of ones:
>>> import numpy
>>> numpy.ones(3)
array([ 1., 1., 1.])
If you want to set scalars, it's easy to do so
g = lambda x: numpy.ones(shape=x.shape) * 2
g(X)
returns
array([ 2., 2., 2.])
So for an arbitrary array:
g = lambda x: numpy.ones(shape=x.shape) * 1
n = numpy.array([1,2,3,4,5])
g(n) is
array([ 1., 1., 1., 1., 1.])
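Since the lambdas in the question come from lambdify and cannot easily be rewritten by hand, another option is to wrap each one so that its output is broadcast to the input's shape. This is a sketch; shape_like_input is a hypothetical helper name, not a NumPy or Sympy function.

```python
import numpy as np

def shape_like_input(f):
    """Wrap f so that its result always has the same shape as its argument."""
    def wrapped(x):
        x = np.asarray(x)
        # broadcast_to returns a read-only view; np.array makes a writable copy
        return np.array(np.broadcast_to(f(x), x.shape))
    return wrapped

f = [lambda x: 1.0 + x**2, lambda x: 1.0]
X = np.array([1, 2, 3])

outputs = [shape_like_input(fi)(X) for fi in f]
```

Both entries of outputs then have shape (3,), whether or not the underlying expression actually depends on x.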