Given a function func(x,y,z), I want to provide a function
def integral_over_z(func,x,y,zmin=0,zmax=1,n=16):
    lambda_func = lambda z,x,y: ???
    return scipy.integrate.fixed_quad(lambda_func,a=zmin,b=zmax,args=(x,y),n=n)
that computes its integral over z for user-provided (x,y) inputs using scipy.integrate.fixed_quad. The inputs x and y can each be a single float or an array of floats (when both are arrays, their shapes are identical).
scipy.integrate.fixed_quad supports integrating vector-valued functions. To this end, the function func must return a corresponding array of higher dimension: "If integrating a vector-valued function, the returned array must have shape (..., len(x))" (from the docs).
My question therefore is how to generate the corresponding output array of the lambda_func (which may be implemented using a special-purpose class).
EDIT: to help understand my question, here is an implementation that works, but is not vectorized over z (and hence doesn't use scipy.integrate.fixed_quad).
import numpy as np
import scipy.special

def integral_over_z(func,x,y,zmin,zmax,n=16):
    z,w = scipy.special.roots_legendre(n)  # Gauss-Legendre nodes and weights on [-1, 1]
    dz = 0.5*(zmax-zmin)
    z = zmin + (np.real(z)+1) * dz  # map the nodes to [zmin, zmax]
    w = np.real(w) * dz
    result = w[0] * func(x,y,z[0])
    for i in range(1,len(z)):
        result += w[i] * func(x,y,z[i])
    return result
The problem is: how to vectorize it, such that it works for any valid input (x and/or y floats or arrays).
ANOTHER EDIT:
For the implementation via scipy.integrate.fixed_quad, the integrand must take a 1-D array z of shape (nz,). The inputs x and y must broadcast together, and their broadcast shape could be anything, say (n0,n1,..,nk). The return value of func must then have shape (n0,n1,..,nk,nz). How do I generate that?
It seems that for a vector-valued function, the vector components must occupy the leading dimension(s) and the integration variable (in your case z) must come last. That is what the docs mean by (..., len(x)); their x is your z. I think this follows from the broadcasting rules. The following example worked fine for me. The key is that x and y must have the right shape for the broadcasting to work:
import numpy as np
import scipy.integrate
def integral_over_z(func,x,y,n=16):
    # the last dimension of (x,y) needs to be size 1, but you can have as many leading dimensions as you want
    lambda_func = lambda z, x, y: func(x[..., None],y[..., None],z)
    return scipy.integrate.fixed_quad(lambda_func,a=0,b=1,args=(x,y),n=n)
func = lambda x,y,z: 1 + 0*x + 0*y + 0*z # make sure that the output has the right (broadcast) shape
x = np.zeros((5,))
y = np.arange(5)
print(integral_over_z(func, x, y, 2))
After the (incomplete) answer by flawr and reading about numpy broadcasting, I found a solution. I'd be happy to learn whether this can still be improved and/or whether it is really correct, i.e. works for any valid input (it does for my tests so far).
The important point is to adapt the shapes of x and y such that
func(x,y,z) works just fine, i.e. x, y, and z are jointly broadcastable;
after summing the output of func over the last (z) dimension, the result has the joint broadcasted shape of x and y.
Here is my solution:
def integral_over_z(func,x,y,zmin=0,zmax=1,n=16):
    xe = x
    ye = y
    if type(xe) is np.ndarray or type(ye) is np.ndarray:
        xe,ye = np.broadcast_arrays(x,y)  # replace x,y by their joint broadcast
        xe = np.expand_dims(xe, xe.ndim)  # expand by an extra dimension for z
        ye = np.expand_dims(ye, ye.ndim)  # expand by an extra dimension for z
    return scipy.integrate.fixed_quad(lambda z : func(xe,ye,z), a=zmin, b=zmax, n=n)
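To check the shape reasoning above, here is a minimal sketch, not a definitive implementation. It assumes a test integrand x + y*z**2 (my own choice, not from the question), whose integral over z in [0, 1] is x + y/3, and it unpacks the (value, None) tuple that fixed_quad returns. Unlike the solution above, it broadcasts unconditionally, which also works for plain floats because np.broadcast_arrays turns them into 0-d arrays.

```python
import numpy as np
import scipy.integrate

def integral_over_z(func, x, y, zmin=0.0, zmax=1.0, n=16):
    # Broadcast x and y against each other, then append a length-1 axis
    # so that the z nodes (shape (n,)) broadcast along the last dimension.
    xe, ye = np.broadcast_arrays(x, y)
    xe = xe[..., np.newaxis]
    ye = ye[..., np.newaxis]
    # fixed_quad returns (value, None); only the value is of interest here
    value, _ = scipy.integrate.fixed_quad(lambda z: func(xe, ye, z), zmin, zmax, n=n)
    return value

# Assumed test integrand: its integral over z in [0, 1] is x + y/3
func = lambda x, y, z: x + y * z**2

scalar = integral_over_z(func, 2.0, 3.0)   # two floats
x = np.arange(3.0)[:, None]                # shape (3, 1)
y = np.arange(4.0)                         # shape (4,)
grid = integral_over_z(func, x, y)         # broadcast shape (3, 4)
print(scalar, grid.shape)
```

The result has the joint broadcast shape of x and y because fixed_quad sums over the last axis only.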
Let's consider the following data:
import numpy as np
from sklearn.linear_model import LogisticRegression
x=np.linspace(0,2*np.pi,80)
x = x.reshape(-1,1)
y = np.sin(x)+np.random.normal(0,0.4,80)
y[y<1/2] = 0
y[y>1/2] = 1
clf=LogisticRegression(solver="saga", max_iter = 1000)
I want to fit a logistic regression where y is the dependent variable and x is the independent variable. But when I call:
clf.fit(x,y)
I get the error
'y should be a 1d array, got an array of shape (80, 80) instead'.
I tried to reshape data by using
y=y.reshape(-1,1)
But I end up with an array of length 6400! (How come?)
Could you please give me a hand with performing this regression ?
Change the order of your operations:
First generate x and y as 1-D arrays:
x = np.linspace(0, 2*np.pi, 8)
y = np.sin(x) + np.random.normal(0, 0.4, 8)
Then (after y was generated) reshape x:
x = x.reshape(-1, 1)
Edit following a comment as of 2022-02-20
The source of the problem in the original code is this:
x = np.linspace(0,2*np.pi,80) generates a 1-D array.
x = x.reshape(-1,1) reshapes it into a 2-D array with one column and as many rows as needed.
y = np.sin(x) + np.random.normal(0,0.4,80) operates on a column array of shape (80, 1) and a 1-D array of shape (80,), which is treated here as a single row. Broadcasting makes y a 2-D array of shape (80, 80).
The attempt to reshape y then gives a single-column array with 6400 rows.
The proper solution is to create both x and y as 1-D arrays, which is what my code does.
After that, x can be reshaped into a column, while y should stay 1-D because that is what fit expects.
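The shape arithmetic described above can be reproduced with NumPy alone. This is a small sketch; the names x_col, y_bad and X are mine, and the thresholding uses a single comparison instead of the two assignments in the question.

```python
import numpy as np

noise = np.random.normal(0, 0.4, 80)                    # shape (80,)

# Original order: x is reshaped to a column before y is computed
x_col = np.linspace(0, 2 * np.pi, 80).reshape(-1, 1)    # shape (80, 1)
y_bad = np.sin(x_col) + noise                           # (80, 1) + (80,) broadcasts to (80, 80)
print(y_bad.shape, y_bad.reshape(-1, 1).shape)          # 80 * 80 = 6400 rows after reshaping

# Corrected order: compute y from the 1-D x, then reshape only x
x = np.linspace(0, 2 * np.pi, 80)
y = np.sin(x) + noise                                   # shape (80,)
y = (y > 0.5).astype(int)                               # binary labels, still 1-D
X = x.reshape(-1, 1)                                    # shape (80, 1), the feature matrix
print(X.shape, y.shape)
```

With these shapes, X is the 2-D feature matrix and y the 1-D label vector.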
I encountered this error and tried to solve it with reshape, but that didn't work:
ValueError: y should be a 1d array, got an array of shape () instead.
This was actually happening because the [] brackets around np.argmax were in the wrong place. Below are the wrong code and the correct one; notice the position of [] around np.argmax in the two snippets.
Wrong Code
ax[i,j].set_title("Predicted Watch : "+str(le.inverse_transform([pred_digits[prop_class[count]]])) +"\n"+"Actual Watch : "+str(le.inverse_transform(np.argmax([y_test[prop_class[count]]])).reshape(-1,1)))
Correct Code
ax[i,j].set_title("Predicted Watch :"+str(le.inverse_transform([pred_digits[prop_class[count]]]))+"\n"+"Actual Watch : "+str(le.inverse_transform([np.argmax(y_test[prop_class[count]])])))
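The difference between the two snippets comes down to the shape of what np.argmax returns. A minimal NumPy-only sketch, where probs is a made-up row standing in for y_test[prop_class[count]]:

```python
import numpy as np

probs = np.array([0.1, 0.7, 0.2])     # hypothetical one-hot-like row
label = np.argmax(probs)              # a scalar index

print(np.shape(label))                # (): zero-dimensional, hence "shape ()" in the error
print(np.asarray([label]).shape)      # (1,): wrapping the scalar in [] gives a 1-D array
```

Putting the brackets inside np.argmax instead, as in the wrong code, still yields a scalar, so the call that expects a 1-D array still fails.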
I'm quite new to Python and Numpy, so I apologize if I'm missing something obvious here.
I have a function that solves a system of 2 differential equations :
import numpy as np
import numpy.linalg as la
def solve_ode(x0, a0, beta, t):
    At = np.array([[0.23*t, (-10**5)*t], [0, -beta*t]], dtype=np.float32)
    # get eigenvalues and eigenvectors
    evals, V = la.eig(At)
    Vi = la.inv(V)
    # get e^At coeff
    eAt = V # np.exp(evals) # Vi
    xt = eAt*x0
    return xt
However, running it with this code :
import matplotlib.pyplot as plt
# initial values
x0 = 10**6
a0 = 2.5
beta = 0.05
t = np.linspace(0, 3600, 360)
plt.semilogy(t, solve_ode(x0, a0, beta, t))
... throws this error :
ValueError: setting an array element with a sequence.
At this line :
At = np.array([[0.23*t, (-10**5)*t], [0, -beta*t]], dtype=np.float32)
Note that t and beta are supposed to be floats. I think Python might not be able to infer this but I don't know how I could do this...
Thx in advance for your help.
You are supplying t as a numpy array of shape (360,) from linspace, not as a single float. The At array you are trying to create is then ill-formed, because all entries must have the same length. In Python there is an important difference between lists and numpy arrays. For example, you could do what you have here as a list of lists, e.g.
At = [[0.23*t, (-10**5)*t], [0, -beta*t]]
whose first row holds two length-360 arrays and whose second row holds the scalar 0 and one length-360 array.
Alternatively, if all elements of At are the length of t the array would work,
At = np.array([[0.23*t, (-10**5)*t], [t, -beta*t]], dtype=np.float32)
with shape [2, 2, 360].
When you give a list, a list of lists, or in this case a list of lists of lists, all of them should have the same length so that numpy can automatically infer the dimensions (shape) of the resulting array.
In your example everything is set up correctly except for the 0 you put in one entry, I guess. I'm not sure what to call the result, though, since your expected output is a cube, I suppose.
You can fix it by giving the correct number of zeros, as below:
At = np.array([[0.23*t, (-10**5)*t], [np.zeros(len(t)), -beta*t]], dtype=np.float32)
But check the .shape of the resulting array, and make sure it's what you want.
As others note the problem is the 0 in the inner list. It doesn't match the 360 length arrays generated by the other expressions. np.array can make an object dtype array from that (2x2), but can't make a float one.
At = np.array([[0.23*t, (-10**5)*t], [0*t, -beta*t]])
produces a (2,2,360) array. But I suspect the rest of that function is built around the assumption that At is (2,2) - a 2d square array with eig, inv etc.
What is the return xt supposed to be?
Does this work?
S = np.array([solve_ode(x0, a0, beta, i) for i in t])
giving a 1d array with the same number of values as in t?
I'm not suggesting this is the fastest way of solving the problem, but it's the simplest, especially if you are only generating 360 values.
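To make the shapes in the answers above concrete, here is a hedged sketch. The np.moveaxis step and the batched np.linalg.eig call are my own addition, not something the answers propose. They rely on np.linalg.eig accepting stacks of matrices of shape (..., M, M).

```python
import numpy as np

beta = 0.05
t = np.linspace(0, 3600, 360)

# Replacing the scalar 0 by 0*t gives every entry length 360,
# so NumPy can build a regular float array.
At = np.array([[0.23 * t, (-10**5) * t],
               [0 * t, -beta * t]])
print(At.shape)                        # (2, 2, 360)

# eig treats the last two axes as the matrix, so the time axis has to come first
At_stack = np.moveaxis(At, -1, 0)      # shape (360, 2, 2): one 2x2 matrix per time step
evals, V = np.linalg.eig(At_stack)
print(evals.shape, V.shape)            # (360, 2) and (360, 2, 2)
```

Whether the batched form is preferable to the list comprehension over t depends on what the rest of solve_ode is meant to return.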
I'm writing some python + numpy + cython code, and am trying to find the most elegant and efficient way of doing the following kind of iteration over an array:
Let's say I have a function f(x, y) that takes a vector x of shape (3,) and a vector y of shape (10,) and returns a vector of shape (10,). Now I have two arrays X and Y of shape sx + (3,) and sy + (10,), where the sx and sy are two shapes that can be broadcast together (i.e. either sx == sy, or when an axis differs, one of the two has length 1, in which case it will be repeated). I want to produce an array Z that has the shape zs + (10,), where zs is the shape of the broadcasting of sx with sy. Each 10 dimensional vector in Z is equal to f(x, y) of the vectors x and y at the corresponding locations in X and Y.
I looked into np.nditer and while it plays nice with cython (see bottom of linked page), it doesn't seem to allow iterating over vectors from a multidimensional array, instead of elements. I also looked at index grids, but the problem there is that cython indexing is only fast when the number of indexes is equal to the dimensionality of the array, and are stored as cython integers instead of python tuples.
Any help is greatly appreciated!
You are describing what NumPy calls a Generalized Universal FUNCtion, or gufunc. As its name suggests, it is an extension of ufuncs. You probably want to start by reading these two pages:
Writing your own ufunc
Building a ufunc from scratch
The second example uses Cython and has some material on gufuncs. To fully go down the gufunc road, you will need to read the corresponding section in the numpy C API documentation:
Generalized Universal Function API
I do not know of any example of gufuncs being coded in Cython, although it shouldn't be too hard to do following the examples above. If you want to look at gufuncs coded in C, you can take a look at the source code for np.linalg here, although that can be a daunting experience. A while back I bored my local Python User Group to death giving a talk on extending numpy with C, which was mostly about writing gufuncs in C, the slides of that talk and a sample Python module providing a new gufunc can be found here.
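Before going down the C or Cython road, the gufunc semantics can be prototyped in pure Python with the signature argument of np.vectorize. This is only a sketch: the core function f below is a made-up placeholder, and np.vectorize loops in Python, so it is useful for checking shapes but not for speed.

```python
import numpy as np

def f(x, y):
    # Hypothetical core function: x has shape (3,), y has shape (10,),
    # and the result has shape (10,).
    return y * x.sum()

# "(n),(m)->(m)" declares the core dimensions; all leading dimensions
# of the inputs are broadcast against each other.
F = np.vectorize(f, signature="(n),(m)->(m)")

X = np.ones((2, 1, 3))     # sx = (2, 1)
Y = np.ones((1, 4, 10))    # sy = (1, 4)
Z = F(X, Y)
print(Z.shape)             # broadcast of sx and sy, plus the core dimension
```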
If you want to stick with nditer, here's a way using your example dimensions. It's pure Python here, but shouldn't be hard to implement with cython (though it still has the tuple iterator). I'm borrowing ideas from ndindex as described in shallow iteration with nditer.
The idea is to find the common broadcasting shape, sz, and construct a multi_index iterator over it.
I'm using as_strided to expand X and Y to usable views, and passing the appropriate vectors (actually (1,n) arrays) to the f(x,y) function.
import numpy as np
from numpy.lib.stride_tricks import as_strided

def f(x,y):
    # sample that takes (1,10) and (1,3) arrays, and returns a (1,10) array
    assert x.shape==(1,10), x.shape
    assert y.shape==(1,3), y.shape
    z = x*10 + y.mean()
    return z

def brdcast(X, X1):
    # broadcast X to shape of X1 (keep last dim of X)
    # modeled on np.broadcast_arrays
    shape = X1.shape + (X.shape[-1],)
    strides = X1.strides + (X.strides[-1],)
    X1 = as_strided(X, shape=shape, strides=strides)
    return X1

def F(X, Y):
    X1, Y1 = np.broadcast_arrays(X[...,0], Y[...,0])
    Z = np.zeros(X1.shape + (10,))
    it = np.nditer(X1, flags=['multi_index'])
    X1 = brdcast(X, X1)
    Y1 = brdcast(Y, Y1)
    while not it.finished:
        I = it.multi_index + (None,)
        Z[I] = f(X1[I], Y1[I])
        it.iternext()
    return Z

sx = (2,3) # works with (2,1)
sy = (1,3)
# X, Y = np.ones(sx+(10,)), np.ones(sy+(3,))
X = np.repeat(np.arange(np.prod(sx)).reshape(sx)[...,None], 10, axis=-1)
Y = np.repeat(np.arange(np.prod(sy)).reshape(sy)[...,None], 3, axis=-1)
Z = F(X,Y)
print(Z.shape)
print(Z[...,0])
I'm creating a code to run a perceptron algorithm and I can't create a random matrix the way I need it:
from random import choice
from numpy import array, dot, random
unit_step = lambda x: -1 if x < 0 else 1
import numpy as np
m=3 #this will be the number of rows
allys=[]
for j in range(m):
    aa=np.random.rand(1,3)
    tt=np.random.rand(3)
    yy=dot(aa,tt)
    ally = [aa, yy]
    allys.append(ally)
print("allys", allys)
w = random.rand(3)
errors = []
eta = 0.2
n = 10
x=[1,3]
for i in range(n):
    print(i)
    x, expected = choice(allys)
    print("x", x)
And I get the problem here:
    result = dot(x,w)
    error = expected - unit_step(result)
    errors.append(error)
    w += eta * error * x
    print(x, expected, w, error, result, errors)
The log says
w += eta * error * x
ValueError: non-broadcastable output operand with shape (3,) doesn't match the broadcast shape (1,3)
The idea is to get result looping randomly over the "table" allys.
How can I solve this? What is shape(3,)?
Thanks!
The error message actually tells you what is wrong. The result of your multiplication is a (1,3) array (2D array, one row, three columns), whereas you try to add it into a 3-element vector.
Both arrays have three elements in a row, so if you do this:
w = w + eta * error * x
there will be no error on that line, but the resulting vector will actually be a (1,3) array. That is unwanted, because then your dot does not work.
There are several ways to fix the problem, but possibly the easiest to read is to reshape x for the calculation to be a 3-element vector (1D array):
w += eta * error * x.reshape(3,)
Possibly a cleaner solution would be to define w as a (1,3) 2D array, as well, and then transpose w for the dot. This is really a matter of taste.
For numpy arrays, the shape attribute returns the array's dimensions. In your case, w.shape is (3,), which means that w is a one-dimensional array with 3 elements. In turn, x.shape is (1,3), which means that x is a two-dimensional array with one row and 3 columns. You are getting an error because the in-place += has to store its result in w, and the broadcast result of shape (1,3) does not fit into an array of shape (3,). I am not sure what you are trying to do, so it's hard to suggest a solution, but you might want to try reshaping one of the arrays, for example x = x.reshape((3,)) to adapt the shape of x to w.
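The behaviour described in both answers can be reproduced in a few lines. This is a minimal sketch with made-up values for eta and error. It uses x.ravel(), which for a (1,3) array has the same effect as the x.reshape(3,) suggested above.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.random(3)           # shape (3,)
x = rng.random((1, 3))      # shape (1, 3)
eta, error = 0.2, 1.0       # placeholder values

# In-place addition must write into w, but the broadcast result has shape (1, 3)
try:
    w += eta * error * x
    inplace_failed = False
except ValueError as exc:
    inplace_failed = True
    print(exc)

# The out-of-place form works, but silently yields a (1, 3) array
w_2d = w + eta * error * x
print(w_2d.shape)

# Flattening x keeps w one-dimensional
w += eta * error * x.ravel()
print(w.shape)
```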