Python: Gradient of matrix function

I want to calculate the gradient of the function h(x) = 0.5 * x.T * A * x + b.T * x.
For now I set A to be just a (2, 2) matrix.
def function(x):
    return 0.5 * np.dot(np.dot(np.transpose(x), A), x) + np.dot(np.transpose(b), x)
where
A = np.zeros((2, 2))
n = A.shape[0]
A[range(n), range(n)] = 1
i.e. a (2, 2) matrix with ones on the main diagonal, and
b = np.ones(2)
For a given point x = (1, 1), numpy.gradient returns an empty list.
x = np.ones(2)
result = np.gradient(function(x))
However, shouldn't I get something like this: grad(h((1, 1))) = (x1 + 1, x2 + 1) = (2, 2)?
Appreciate any help.

It seems like you want to perform symbolic differentiation or automatic differentiation, which np.gradient does not do: np.gradient computes finite differences over the values of an array, and function(x) here returns a single scalar, so there is nothing to take differences over. sympy is a package for symbolic math and autograd is a package for automatic differentiation for numpy. For example, to do this with autograd:
import autograd.numpy as np
from autograd import grad

def function(x):
    return 0.5 * np.dot(np.dot(np.transpose(x), A), x) + np.dot(np.transpose(b), x)

A = np.zeros((2, 2))
n = A.shape[0]
A[range(n), range(n)] = 1
b = np.ones(2)
x = np.ones(2)
grad(function)(x)
Outputs:
array([2., 2.])
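For comparison, a minimal sketch of the symbolic route with sympy, written out for this particular A (the 2x2 identity) and b = (1, 1); the symbols x1 and x2 are introduced here just for illustration:
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
# h(x) = 0.5 * x.T * A * x + b.T * x with A = I and b = (1, 1)
h = sp.Rational(1, 2) * (x1**2 + x2**2) + x1 + x2
gradient = [sp.diff(h, v) for v in (x1, x2)]       # [x1 + 1, x2 + 1]
print([g.subs({x1: 1, x2: 1}) for g in gradient])  # [2, 2]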

Related

How to fit a rotated and translated hyperbola to a set of x,y points in Python

I want to fit a set of data points in the xy plane to the general case of a rotated and translated hyperbola, to back out the coefficients of the general equation of a conic.
I've tried the methodology proposed here, but so far I cannot make it work.
When fitting to a set of points known to be a hyperbola I get quite different outputs.
What am I doing wrong in the code below?
Or is there any other way to solve this problem?
import numpy as np
from sympy import plot_implicit, Eq
from sympy.abc import x, y

def fit_hyperbola(x, y):
    D1 = np.vstack([x**2, x*y, y**2]).T
    D2 = np.vstack([x, y, np.ones(len(x))]).T
    S1 = D1.T @ D1
    S2 = D1.T @ D2
    S3 = D2.T @ D2
    # define the constraint matrix and its inverse
    C = np.array(((0, 0, -2), (0, 1, 0), (-2, 0, 0)), dtype=float)
    Ci = np.linalg.inv(C)
    # set up and solve the generalized eigenvector problem
    T = np.linalg.inv(S3) @ S2.T
    S = Ci @ (S1 - S2 @ T)
    eigval, eigvec = np.linalg.eig(S)
    # evaluate and sort resulting constraint values
    cond = eigvec[1]**2 - 4*eigvec[0]*eigvec[2]
    # [condVals index] = sort(cond)
    idx = np.argsort(cond)
    condVals = cond[idx]
    possibleHs = condVals[1:] + condVals[0]
    minDiffAt = np.argmin(abs(possibleHs))
    # minDiffVal = possibleHs[minDiffAt]
    alpha1 = eigvec[:, idx[minDiffAt + 1]]
    alpha2 = T @ alpha1
    return np.concatenate((alpha1, alpha2)).ravel()
if __name__ == '__main__':
    # known hyperbola coefficients
    coeffs = [1., 6., -2., 3., 0., 0.]
    # hyperbola points
    x_ = np.array([1.56011303e+00, 1.38439984e+00, 1.22595618e+00, 1.08313085e+00,
                   9.54435408e-01, 8.38528681e-01, 7.34202759e-01, 6.40370424e-01,
                   5.56053814e-01, 4.80374235e-01, 4.12543002e-01, 3.51853222e-01,
                   2.97672424e-01, 2.49435970e-01, 2.06641170e-01, 1.68842044e-01,
                   1.35644673e-01, 1.06703097e-01, 8.17157025e-02, 6.04220884e-02,
                   4.26003457e-02, 2.80647476e-02, 1.66638132e-02, 8.27872926e-03,
                   2.82211172e-03, 2.37095181e-04, 4.96740239e-04, 3.60375275e-03,
                   9.59051203e-03, 1.85194083e-02, 3.04834928e-02, 4.56074477e-02,
                   6.40488853e-02, 8.59999904e-02, 1.11689524e-01, 1.41385205e-01,
                   1.75396504e-01, 2.14077865e-01, 2.57832401e-01, 3.07116093e-01,
                   3.62442545e-01, 4.24388335e-01, 4.93599021e-01, 5.70795874e-01,
                   6.56783391e-01, 7.52457678e-01, 8.58815793e-01, 9.76966133e-01,
                   1.10813998e+00, 1.25370436e+00])
    y_ = np.array([-0.66541515, -0.6339625, -0.60485332, -0.57778425, -0.5524732,
                   -0.52865638, -0.50608561, -0.48452564, -0.46375182, -0.44354763,
                   -0.42370253, -0.4040097, -0.38426392, -0.3642594, -0.34378769,
                   -0.32263542, -0.30058217, -0.27739811, -0.25284163, -0.22665682,
                   -0.19857079, -0.16829086, -0.13550147, -0.0998609, -0.06099773,
                   -0.01850695, 0.02805425, 0.07917109, 0.13537629, 0.19725559,
                   0.26545384, 0.34068177, 0.42372336, 0.51544401, 0.61679957,
                   0.72884632, 0.85275192, 0.98980766, 1.14144182, 1.30923466,
                   1.49493479, 1.70047747, 1.92800474, 2.17988774, 2.45875143,
                   2.76750196, 3.10935692, 3.48787892, 3.90701266, 4.3711261])
    plot_implicit(Eq(coeffs[0]*x**2 + coeffs[1]*x*y + coeffs[2]*y**2 + coeffs[3]*x + coeffs[4]*y, -coeffs[5]))
    coeffs_fit = fit_hyperbola(x_, y_)
    plot_implicit(Eq(coeffs_fit[0]*x**2 + coeffs_fit[1]*x*y + coeffs_fit[2]*y**2 + coeffs_fit[3]*x + coeffs_fit[4]*y, -coeffs_fit[5]))
The general equation of a hyperbola is defined by 5 independent coefficients (not 6). If the model equation includes dependent coefficients (which is the case with 6 coefficients), the numerical regression can run into trouble.
That is why the equation A * x * x + B * x * y + C * y * y + D * x + F * y = 1 is used in the calculation below. The fit is very good.
One can then go back to the standard equation a * x * x + 2 * b * x * y + c * y * y + 2 * d * x + 2 * f * y + g = 0 by setting a value for g (for example g = -1).
Formulas for the coordinates of the center, the equations of the asymptotes, and the equations of the axes are given in these references:
https://mathworld.wolfram.com/ConicSection.html
https://en.wikipedia.org/wiki/Conic_section
https://en.wikipedia.org/wiki/Hyperbola
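A minimal sketch of that 5-coefficient fit, assuming plain linear least squares on A*x^2 + B*x*y + C*y^2 + D*x + F*y = 1 (the function name fit_conic_5 and the variable names are illustrative, not from the answer):
import numpy as np

def fit_conic_5(x, y):
    # least-squares fit of A*x^2 + B*x*y + C*y^2 + D*x + F*y = 1
    x = np.asarray(x)
    y = np.asarray(y)
    M = np.column_stack([x**2, x*y, y**2, x, y])
    coeffs, *_ = np.linalg.lstsq(M, np.ones(len(x)), rcond=None)
    return coeffs  # (A, B, C, D, F); setting g = -1 recovers the 6-coefficient form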

Minimize a system of nonlinear equations (integral on exponent)

General:
I am using maximum entropy to find a distribution over positive integer vectors. I can estimate the mean and variance, and I have three equations; I am trying to find a, b, and c.
The equations:
integral(exp(a*x^2 + b*x + c), from 0 to infinity) - 1
integral(x*exp(a*x^2 + b*x + c), from 0 to infinity) - mean
integral(x^2*exp(a*x^2 + b*x + c), from 0 to infinity) - mean^2 - var
(all integrals over [0, ∞))
The problem:
I am trying to use a numerical solver; I used scipy's fsolve with sympy for the integrals.
But I guess I am missing some knowledge.
My code:
import numpy as np
import sympy as sym
from scipy.optimize import *
def myFunction(x,*data):
y = sym.symbols('y')
m,v=data
F = [0]*3
x[0] = - abs(x[0])
print(x)
F[0] = (sym.integrate(sym.exp(x[0] * y ** 2 + x[1] * y + x[2]), (y, 0,sym.oo)) -1).evalf()
F[1] = (sym.integrate(y*sym.exp(x[0] * y ** 2 + x[1] * y + x[2]), (y, 0,sym.oo))-m).evalf()
F[2] = (sym.integrate((y**2)*sym.exp(x[0] * y ** 2 + x[1] * y + x[2]), (y,0,sym.oo)) -v-m).evalf()
print(F)
return F
data = (10,3.5) # mean and var for example
xGuess = [1, 1, 1]
z = fsolve(myFunction,xGuess,args = data)
print(z)
My results are not that accurate; is there a better way to solve it?
integral(exp(a*x^2 + b*x + c)) - 1 = 5.67659292676884
integral(x*exp(a*x^2 + b*x + c)) - mean = -1.32123173796713
integral(x^2*exp(a*x^2 + b*x + c)) - mean^2 - var = -2.20825624606312
Thanks
I have rewritten the problem, replacing sympy with numpy and lambdas (inline functions).
Also note that in your problem statement you subtract mean^2 in the third equation, but in your code you only subtract mean.
import numpy as np
from scipy.optimize import minimize
from scipy.integrate import quad

def myFunction(x, data):
    m, v = data
    F = np.zeros(3)  # use a numpy array
    # use scipy.integrate.quad to integrate the lambda functions;
    # quad returns (result, error), so we select the result value with [0]
    F[0] = quad(lambda y: np.exp(x[0]*y**2 + x[1]*y + x[2]), 0, np.inf)[0] - 1
    F[1] = quad(lambda y: y*np.exp(x[0]*y**2 + x[1]*y + x[2]), 0, np.inf)[0] - m
    F[2] = quad(lambda y: (y**2)*np.exp(x[0]*y**2 + x[1]*y + x[2]), 0, np.inf)[0] - v - m**2
    # minimize the squared error
    return np.sum(F**2)

data = (10, 3.5)  # mean and var for example
xGuess = [-1, 1, 1]
z = minimize(lambda x: myFunction(x, data), x0=xGuess,
             bounds=((None, 0), (None, None), (None, None)))  # bound the first coefficient to be negative
print(z)
# x: array([-0.99899311, 2.18819689, 1.85313181])
Does this seem more reasonable?
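One possible refinement, not part of the answer above: for a < 0 the first integral has a closed form via the complementary error function, which could replace the repeated quadrature. A quick sketch checking the formula against quad (the values are illustrative):
import numpy as np
from scipy.integrate import quad
from scipy.special import erfc

a, b, c = -1.0, 2.0, 1.0  # any values with a < 0
alpha = -a
# integral of exp(a*y^2 + b*y + c) over [0, inf) by completing the square
closed = np.exp(c) * 0.5 * np.sqrt(np.pi / alpha) \
         * np.exp(b**2 / (4 * alpha)) * erfc(-b / (2 * np.sqrt(alpha)))
numeric = quad(lambda y: np.exp(a*y**2 + b*y + c), 0, np.inf)[0]
print(np.isclose(closed, numeric))  # True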

Conjugate Gradient implementation in Python

I implemented conjugate gradient in Python following the Wikipedia reference: https://en.wikipedia.org/wiki/Conjugate_gradient_method
The implementation should solve
ax = b
My application's inputs are as follows:
a = <400x400 sparse matrix of type '<class 'numpy.float64'>'
with 1920 stored elements in Compressed Sparse Row format>
b = vector of shape (400, ) and dtype = float64
x = vector of random numbers of shape (400, )
Here is my implementation:
def ConjGrad(a, b, x):
    r = b - np.dot(np.array(a), x)
    p = r
    rsold = np.dot(r.T, r)
    for i in range(len(b)):
        a_p = np.dot(a, p)
        alpha = rsold / np.dot(p.T, a_p)
        x = x + (alpha * p)
        r = r - (alpha * a_p)
        rsnew = np.dot(r.T, r)
        if np.sqrt(rsnew) < 10 ** -5:
            break
        p = r + ((rsnew / rsold) * p)
        rsold = rsnew
    return p
When I call the above CG function, I get an error within the function at the line
r = b - np.dot(np.array(a), x)
The error is:
NotImplementedError: subtracting a sparse matrix from a nonzero scalar is not supported
At run time, these are the properties of the variables within the CG function:
np.dot(np.array(a), x).shape
(400,)
b.shape
(400,)
I wonder why the subtraction is not happening.
I tested the same function with the sample input arguments below and it worked fine.
a = np.array([[3, 2, -1], [2, -1, 1], [-1, 1, -1]]) # 3X3 symmetric matrix
b = (np.array([1, -2, 0])[np.newaxis]).T # 3X1 matrix
x = (np.array([0, 1, 2])[np.newaxis]).T
Can someone please tell me why its not working for a sparse matrix?
When multiplying a sparse matrix by an array, you should not use np.dot(np.array(a), x) but a.dot(x). See the documentation:
https://docs.scipy.org/doc/scipy/reference/sparse.html
A correct routine follows:
def conjGrad(A, x, b, tol, N):
    r = b - A.dot(x)
    p = r.copy()
    for i in range(N):
        Ap = A.dot(p)
        alpha = np.dot(p, r) / np.dot(p, Ap)
        x = x + alpha * p
        r = b - A.dot(x)
        if np.sqrt(np.sum((r**2))) < tol:
            print('Itr:', i)
            break
        else:
            beta = -np.dot(r, Ap) / np.dot(p, Ap)
            p = r + beta * p
    return x
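A hypothetical usage sketch for the routine above (the matrix is an illustrative symmetric positive-definite example, not the asker's data):
import numpy as np
from scipy import sparse

n = 400
A = (2.0 * sparse.identity(n)).tocsr()  # simple SPD matrix, so CG converges
b = np.ones(n)
x0 = np.random.rand(n)
x = conjGrad(A, x0, b, tol=1e-5, N=n)
print(np.allclose(A.dot(x), b))  # True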

Is there a faster way of repeating a chunk of code x times and taking an average?

Starting with:
a,b=np.ogrid[0:n+1:1,0:n+1:1]
B=np.exp(1j*(np.pi/3)*np.abs(a-b))
B[z,b] = np.exp(1j * (np.pi/3) * np.abs(z - b +x))
B[a,z] = np.exp(1j * (np.pi/3) * np.abs(a - z +x))
B[diag,diag]=1-1j/np.sqrt(3)
this produces an (n+1) x (n+1) grid that acts as a matrix.
n is just a number chosen to set the index range, i.e. an a x b matrix where a and b both go up to n.
z is a constant I choose; its row and column are replaced with the B[z,b] and B[a,z] formulas. (Essentially the same formula, but with a small number added to np.abs(a - b).)
The diagonal of the matrix is given by the bottom line:
B[diag,diag]=1-1j/np.sqrt(3)
where,
diag=np.arange(n+1)
I would like to repeat this code 50 times, where the only thing that changes is x, so I end up with 50 versions of the array B. x is a randomly generated number between -0.8 and 0.8 each time.
x=np.random.uniform(-0.8,0.8)
I want to generate 50 versions of B with random values of x each time and take a geometric average of the 50 versions of B using the definition:
def geo_mean(y):
    y = np.asarray(y)
    return np.prod(y ** (1.0 / y.shape[0]), axis=-1)
I have tried to set B as a function of some index and then use a for _ in range(): loop, but this doesn't work. Aside from copying and pasting the block 50 times and denoting each one B1, B2, B3, etc., I can't think of another way of working this out.
EDIT:
I'm now using part of a given solution in order to show clearly what I am looking for:
#A matrix with 50 random values between -0.8 and 0.8 to be used in the loop
X=np.random.uniform(-0.8,0.8, (50,1))
#constructing the base array before modification by random x values in position z
a,b = np.ogrid[0:n+1:1,0:n+1:1]
B = np.exp(1j * ( np.pi / 3) * np.abs( a - b ))
B[diag,diag] = 1 - 1j / np.sqrt(3)
#list to store all modified arrays
randomarrays = []
for i in range(0, 50):
    # copy array and modify it
    Bnew = np.copy(B)
    Bnew[z, b] = np.exp(1j * (np.pi/3) * np.abs(z - b + X[i]))
    Bnew[a, z] = np.exp(1j * (np.pi/3) * np.abs(a - z + X[i]))
    randomarrays.append(Bnew)
Bstack = np.dstack(randomarrays)
#calculate the geometric mean value along the axis that was the row in 2D arrays
B0 = geo_mean(Bstack)
From this example, every iteration of i seems to use the same value of X; I can't find a way to make each new iteration of i use the next value in the matrix X. I know the ++ operator does not exist in Python; I just don't know how to write its equivalent. I want one loop iteration to use a value of X, the next iteration to use the next value, and so on, so I can dstack all the matrices at the end and find a geo_mean for each element in the stacked matrices.
One pedestrian way would be to use a list comprehension or generator expression:
>>> def f(n, z, x):
...     diag = np.arange(n+1)
...     a, b = np.ogrid[0:n+1:1, 0:n+1:1]
...     B = np.exp(1j*(np.pi/3)*np.abs(a-b))
...     B[z, b] = np.exp(1j * (np.pi/3) * np.abs(z - b + x))
...     B[a, z] = np.exp(1j * (np.pi/3) * np.abs(a - z + x))
...     B[diag, diag] = 1 - 1j/np.sqrt(3)
...     return B
...
>>> X = np.random.uniform(-0.8, 0.8, (10,))
>>> np.prod((*map(np.power, map(f, 10*(4,), 10*(2,), X), 10 * (1/10,)),), axis=0)
But in your concrete example we can do much better than that:
using the identity exp(a) * exp(b) = exp(a + b), we can convert the geometric mean after exponentiation into an arithmetic mean before exponentiation. A bit of care is required because of the multivaluedness of the complex n-th root which occurs in the geometric mean. In the code below we normalize the occurring angles to the range [-pi, pi], so as to always hit the same branch as the n-th root.
Please also note that the geo_mean function you provide is definitely wrong. It fails the basic sanity check that taking the average of copies of the same thing should return the same thing. I've provided a better version. It is still not perfect, but I think there actually is no perfect solution, because of the nonuniqueness of the complex root.
Because of this I recommend taking the average before exponentiating. As long as your random spread is less than pi, this allows a well-defined averaging procedure with an average that is actually close to the samples.
import numpy as np

def f(n, z, X, do_it_pps_way=True):
    X = np.asanyarray(X)
    diag = np.arange(n+1)
    a, b = np.ogrid[0:n+1:1, 0:n+1:1]
    B = np.exp(1j*(np.pi/3)*np.abs(a-b))
    X = X.reshape(-1, 1, 1)
    if do_it_pps_way:
        zbx = np.mean(np.abs(z-b+X), axis=0)
        azx = np.mean(np.abs(a-z+X), axis=0)
    else:
        zbx = np.mean((np.abs(z-b+X)+3) % 6 - 3, axis=0)
        azx = np.mean((np.abs(a-z+X)+3) % 6 - 3, axis=0)
    B[z, b] = np.exp(1j * (np.pi/3) * zbx)
    B[a, z] = np.exp(1j * (np.pi/3) * azx)
    B[diag, diag] = 1 - 1j/np.sqrt(3)
    return B

def geo_mean(y):
    y = np.asarray(y)
    dim = len(y.shape)
    y = np.atleast_2d(y)
    v = np.prod(y, axis=0) ** (1.0 / y.shape[0])
    return v[0] if dim == 1 else v

def geo_mean_correct(y):
    y = np.asarray(y)
    return np.prod(y ** (1.0 / y.shape[0]), axis=0)

# demo that the original geo_mean is wrong
B = np.exp(1j * np.random.random((5, 5)))
# the mean of four copies of the same thing should be the same thing:
if not np.allclose(B, geo_mean([B, B, B, B])):
    print('geo_mean failed')
if np.allclose(B, geo_mean_correct([B, B, B, B])):
    print('but geo_mean_correct works')

n, z, m = 10, 3, 50
X = np.random.uniform(-0.8, 0.8, (m,))

B0 = f(n, z, X, do_it_pps_way=False)
B1 = np.prod((*map(np.power, map(f, m*(n,), m*(z,), X), m * (1/m,)),), axis=0)
B2 = geo_mean_correct([f(n, z, x) for x in X])

# This is the recommended way:
B_recommended = f(n, z, X, do_it_pps_way=True)

print()
print(np.allclose(B1, B0))
print(np.allclose(B2, B1))
I think you should rely more on numpy functionality when approaching your problem. I am not a numpy expert myself, so there is surely room for improvement:
from scipy.stats import gmean

n = 2
z = 1
a = np.arange(n + 1).reshape(1, n + 1)

# constructing the base array before modification by random x values in position z
B = np.exp(1j * (np.pi / 3) * np.abs(a - a.T))
B[a, a] = 1 - 1j / np.sqrt(3)

# list to store all modified arrays
random_arrays = []
for _ in range(50):
    # generate random x value
    x = np.random.uniform(-0.8, 0.8)
    # copy array and modify it
    B_new = np.copy(B)
    B_new[z, a] = np.exp(1j * (np.pi / 3) * np.abs(z - a + x))
    B_new[a, z] = np.exp(1j * (np.pi / 3) * np.abs(a - z + x))
    random_arrays.append(B_new)

# store all B arrays as a 3D array, stacked along a new first axis
B_stack = np.stack(random_arrays)

# calculate the geometric mean across the 50 stacked arrays (axis 0 is the stack axis)
geom_mean_B = gmean(B_stack, axis=0)
It uses the geometric mean function from the scipy.stats module for a vectorised approach to this calculation.
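As a quick sanity check (an addition, assuming your scipy version's gmean accepts complex input): the geometric mean of identical stacked arrays should reproduce the array itself.
same = np.stack([B, B, B])
print(np.allclose(gmean(same, axis=0), B))  # True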

Fast math operations on an array in python

I have a fairly simple math operation I'd like to perform on an array. Let me write out the example:
A = numpy.ndarray((255, 255, 3), dtype=numpy.single)
# ..
for i in range(A.shape[0]):
    for j in range(A.shape[1]):
        x = simple_func1(i)
        y = simple_func2(j)
        A[i, j] = (alpha * x * y + beta * x**2 + gamma * y**2, 1, 0)
So basically, there's a mapping between (i, j) and the 3 values at that position (this is for visualization).
I'd like to roll this up and somehow vectorize it, but I'm not sure how to, or if I can. Thanks.
Here is the vectorized version (assuming import numpy as np):
i = np.arange(255)
j = np.arange(255)
x = simple_func1(i)
y = simple_func2(j)
y = y.reshape(-1, 1)
A = alpha * x * y + beta * x**2 + gamma * y**2  # broadcasting is your friend here
If you want to fill the last coordinates with 1 and 0:
B = np.empty(A.shape + (3,))
B[:, :, 0] = A
B[:, :, 1] = 1  # broadcasting again
B[:, :, 2] = 0
You have to change simple_funcN so that they take arrays as input and produce arrays as output. After that, you could look into numpy.meshgrid() or the cartesian() function mentioned here to build coordinate arrays. You should then be able to use the coordinate array(s) to fill A with a one-liner, as in the sketch below.
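A minimal sketch of that meshgrid approach; simple_func1, simple_func2, and the coefficients are stand-ins for whatever vectorized functions and constants you actually use:
import numpy as np

alpha, beta, gamma = 1.0, 2.0, 3.0
simple_func1 = np.sin  # placeholder; must accept array input
simple_func2 = np.cos  # placeholder

ii, jj = np.meshgrid(np.arange(255), np.arange(255), indexing='ij')
x = simple_func1(ii)
y = simple_func2(jj)
A = np.empty((255, 255, 3), dtype=np.single)
A[..., 0] = alpha * x * y + beta * x**2 + gamma * y**2
A[..., 1] = 1
A[..., 2] = 0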
