Hessian in theano with respect to matrix input - python

I'm trying to get a vectorized version of the theano gradient and hessian, i.e. I want to compute the gradient and Hessian at several points at once, with the points given as the rows of a matrix, as shown below:
I have a function:
f(x_1, x_2, ..., x_n) = exp(x_1^2 + x_2^2 + ... + x_n^2)
and I want to compute its gradient at multiple points with one command. I can do this like so:
import theano
import theano.tensor as T

x = T.matrix('x')
y = T.diag(T.exp(T.dot(x, x.T)))        # y[i] = exp(sum_j x[i,j]**2)
J = theano.grad(cost=y.sum(), wrt=x)    # each row of J is the gradient at one point
f = theano.function(inputs=[x], outputs=J)
f([[1, 2], [3, 4]])
It returns a matrix whose rows are the gradients computed at the points (1,2) and (3,4). I want to get the same result for the Hessian (in this case a 3-dimensional tensor rather than a matrix, but the same idea). The following code:
H = theano.gradient.hessian(cost = y.sum(), wrt = x)
returns an error:
AssertionError: tensor.hessian expects a (list of) 1 dimensional variable as `wrt`
I was able to achieve the appropriate result with the following code:
J = theano.grad(cost=y.sum(), wrt=x)
H = theano.gradient.jacobian(expression=J.flatten(), wrt=x)  # Hessian as the Jacobian of the flattened gradient
g = theano.function(inputs=[x], outputs=H)
g([[1, 2], [3, 4]])
but it produces a lot of unnecessary zeros (all the mixed second derivatives between different points, which are zero by construction) and seems like an inefficient and "ugly" way of obtaining the desired result. Has anyone had a similar problem, or can you suggest anything?
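One possible alternative, not from the original post, is a sketch that computes one Hessian per point with theano.scan, so that each step differentiates a scalar with respect to a 1-dimensional variable, which is exactly what tensor.hessian expects (this assumes scan iterates over the rows of a matrix sequence and that hessian may be called inside the scanned function):

import theano
import theano.tensor as T

x = T.matrix('x')

def point_hessian(xi):
    # xi is one row of x, i.e. a single 1-D point; yi = f(xi) is a scalar
    yi = T.exp(T.dot(xi, xi))
    return theano.gradient.hessian(cost=yi, wrt=xi)

H, updates = theano.scan(point_hessian, sequences=[x])
g = theano.function(inputs=[x], outputs=H)
g([[1, 2], [3, 4]])   # shape (2, 2, 2): one 2x2 Hessian per input point

This would avoid materializing the zero cross-point blocks entirely.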

Related

How to use scipy.integrate.fixed_quad for computing many integrals at once?

Given a function func(x,y,z), I want to provide a function
def integral_over_z(func, x, y, zmin=0, zmax=1, n=16):
    lambda_func = lambda z, x, y: ???
    return scipy.integrate.fixed_quad(lambda_func, a=zmin, b=zmax, args=(x, y), n=n)
that computes its integral over z for user-provided (x,y) inputs using scipy.integrate.fixed_quad. The inputs x and y can each be a single float or an array of floats (when both are arrays, their shapes are identical).
scipy.integrate.fixed_quad supports integrating vector-valued functions. To this end, the function func must return a corresponding array of higher dimension: "If integrating a vector-valued function, the returned array must have shape (..., len(x))" (from the docs).
My question therefore is how to generate the corresponding output array of the lambda_func (which may be implemented using a special-purpose class).
EDIT: to help understand my question, here is an implementation that works, but is not vectorized over z (and hence doesn't use scipy.integrate.fixed_quad):
def integral_over_z(func, x, y, zmin, zmax, n=16):
    z, w = scipy.special.roots_legendre(n)   # Gauss-Legendre nodes and weights on [-1, 1]
    dz = 0.5 * (zmax - zmin)
    z = zmin + (np.real(z) + 1) * dz         # map the nodes to [zmin, zmax]
    w = np.real(w) * dz                      # rescale the weights accordingly
    result = w[0] * func(x, y, z[0])
    for i in range(1, len(z)):
        result += w[i] * func(x, y, z[i])
    return result
The problem is: how to vectorize it, such that it works for any valid input (x and/or y floats or arrays).
ANOTHER EDIT:
For the implementation via scipy.integrate.fixed_quad, the integrand function must take a 1D array of z values of shape (nz,). The inputs x and y must broadcast together, and their broadcast shape could be anything, say (n0, n1, ..., nk). The return from func must then have shape (n0, n1, ..., nk, nz) -- how do I generate that?
It seems that for a vector-valued function the vector values must occupy the leading dimensions, and the integration argument (in your case z) must come last (that is what they mean by (..., len(x)); their x is your z). I think this follows from the broadcasting rules. The following example worked fine for me; the key is that x and y must have the right shape for the broadcasting to work:
import numpy as np
import scipy.integrate

def integral_over_z(func, x, y, n=16):
    # the last dimension of x and y needs to be size 1, but you can have
    # as many leading dimensions as you want
    lambda_func = lambda z, x, y: func(x[..., None], y[..., None], z)
    return scipy.integrate.fixed_quad(lambda_func, a=0, b=1, args=(x, y), n=n)

func = lambda x, y, z: 1 + 0*x + 0*y + 0*z  # make sure the output has the right (broadcast) shape
x = np.zeros((5,))
y = np.arange(5)
print(integral_over_z(func, x, y, 2))
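Since fixed_quad returns a (value, None) pair, this should print (array([1., 1., 1., 1., 1.]), None): the integral of the constant 1 over [0, 1] is 1 at each of the five broadcast points.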
After the (incomplete) answer by flawr and after reading about numpy broadcasting, I found a solution. I'd be happy to learn whether this can still be improved and/or whether it is really correct, i.e. works for any valid input (it does for my tests so far).
The important point is to adapt the shapes of x and y such that:
1. func(x,y,z) works just fine, i.e. x, y, and z are jointly broadcastable;
2. after summing the output of func over the last (z) dimension, the result has the joint broadcast shape of x and y.
Here is my solution:
def integral_over_z(func, x, y, zmin=0, zmax=1, n=16):
    xe = x
    ye = y
    if type(xe) is np.ndarray or type(ye) is np.ndarray:
        xe, ye = np.broadcast_arrays(x, y)  # replace x, y by their joint broadcast
        xe = np.expand_dims(xe, xe.ndim)    # append an extra dimension for z
        ye = np.expand_dims(ye, ye.ndim)    # append an extra dimension for z
    return scipy.integrate.fixed_quad(lambda z: func(xe, ye, z), a=zmin, b=zmax, n=n)
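A quick sanity check of this solution with a hypothetical integrand f(x,y,z) = x*y*z**2, whose exact integral over z in [0, 1] is x*y/3 (fixed_quad returns a (value, None) tuple, hence the [0]):

f = lambda x, y, z: x * y * z**2

print(integral_over_z(f, 2.0, 3.0)[0])                   # both scalars: 2.0
print(integral_over_z(f, np.arange(3), 5.0)[0])          # array x, scalar y: [0., 5/3, 10/3]
print(integral_over_z(f, np.arange(3), np.arange(3))[0]) # both arrays: [0., 1/3, 4/3]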

Inverse of covariance matrix in time series

I'm trying to convert a method based on the Mahalanobis distance, which works on images, to my code, which has to process time series. This is the Matlab code, where the user passes an image as input, reshapes it, and then calculates the mean, the covariance matrix, and its inverse (the comments show the array sizes at each step):
function out = rxd(X)
% X input size = (126, 150, 204)
sizes = size(X);
X = reshape(X, [sizes(1)*sizes(2), sizes(3)]);
% X reshaped size = (18900, 204)
M = mean(X);
% M size = (1, 204)
C = cov(X);
% C size = (204, 204)
Q = inv(C);
% Q size = (204, 204)
This is my code, where I implemented this first part in Python. I do not have an image but a time series whose shape is (24230, 30), which is why I skipped the reshaping step:
import os
import numpy as np

X = np.load('dataset.npy')
# dataset shape: (24230, 30)

# 1. Calculate the mean of the matrix
M = np.mean(X, axis=0)  # shape = (30,)
# 2. Calculate the covariance matrix
C = np.cov(X)           # shape = (24230, 24230)
# 3. Calculate the inverse of the covariance matrix
Q = np.linalg.inv(C)    # Error
If I try to run it I get the error:
LinAlgError: Singular matrix
What could be the problem? I noticed that the only difference from the Matlab outputs is in the shape of the mean, but I can't tell whether my conversion is wrong.
Ignoring everything about MATLAB: your covariance matrix C can't be inverted; by definition this means C is "singular", hence the Singular matrix error. (So it's not about the code, but about the data.)
If you wish to calculate an inverse of this matrix anyway, the pseudo-inverse function np.linalg.pinv can do this; but do be sure to understand why you are doing it.
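That said, there is also a conversion point worth checking (an addition, not part of the original answer): MATLAB's cov(X) treats rows as observations and columns as variables, while numpy's np.cov defaults to rowvar=True, i.e. rows are variables, which is why C comes out as (24230, 24230) instead of (30, 30). A sketch of the matching call:

import numpy as np

X = np.load('dataset.npy')       # shape (24230, 30), as in the question

# match MATLAB's cov: columns are variables, rows are observations
C = np.cov(X, rowvar=False)      # shape (30, 30)
Q = np.linalg.inv(C)             # typically invertible: 24230 observations, 30 variables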

Solving Linear Systems of equations with SVD Decomposition

I want to write a function that uses SVD decomposition to solve a system of equations ax=b, where a is a square matrix and b is a vector of values. The scipy function scipy.linalg.svd() should turn a into the matrices U, W, V. For U and V, I can simply take their transposes to find their inverses. But for W the function gives me a 1-D array of values that I need to put down the diagonal of a matrix, with each value replaced by its reciprocal.
def solveSVD(a, b):
    U, s, V = sp.svd(a, compute_uv=True)
    Ui = np.transpose(a)
    Vi = np.transpose(V)
    W = np.diag(s)
    Wi = np.empty(np.shape(W)[0], np.shape(W)[1])
    for i in range(np.shape(Wi)[0]):
        if W[i, i] != 0:
            Wi[i, i] = 1/W[i, i]
    ai = np.matmul(Ui, np.matmul(Wi, Vi))
    x = np.matmul(ai, b)
    return x
However, I get a "TypeError: data type not understood" error. I think part of the issue is that
W=np.diag(s)
is not producing a square diagonal matrix.
This is my first time working with this library so apologies if I've done something very stupid, but I cannot work out why this line hasn't worked. Thanks all!
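A direct note on the error before the full solution: np.empty takes the shape as a single tuple, and its second positional argument is the dtype, so np.empty(n, m) is parsed as np.empty(shape=n, dtype=m) and raises "TypeError: data type not understood". A minimal illustration:

import numpy as np

n = 4
# Wi = np.empty(n, n)   # TypeError: data type not understood
Wi = np.empty((n, n))   # pass the shape as one tuple instead

(Also note that Ui = np.transpose(a) was presumably meant to be np.transpose(U).)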
In short, the singular value decomposition lets you replace your initial problem A x = b with U diag(s) Vh x = b. A bit of algebra on the latter gives you the following three-step function, which is really easy to read:
import numpy as np
from scipy.linalg import svd

def solve_svd(A, b):
    # compute the SVD of A
    U, s, Vh = svd(A)
    # U diag(s) Vh x = b  <=>  diag(s) Vh x = U.T b = c
    c = np.dot(U.T, b)
    # diag(s) Vh x = c  <=>  Vh x = diag(1/s) c = w  (trivial inversion of a diagonal matrix)
    w = np.dot(np.diag(1/s), c)
    # Vh x = w  <=>  x = Vh.H w  (where .H stands for hermitian = conjugate transpose)
    x = np.dot(Vh.conj().T, w)
    return x
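(A side note on the implementation, not part of the original answer: np.diag(1/s) materializes a full n-by-n matrix just to scale the rows of c; the elementwise form w = c / s[:, None], for a column-vector b as in the test below, should produce the same w without building that matrix.)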
Now, let's test it with
A = np.random.random((100, 100))
b = np.random.random((100, 1))
and compare it with the LU-decomposition-based np.linalg.solve:
x_svd = solve_svd(A, b)
x_lu = np.linalg.solve(A, b)
which gives
np.allclose(x_lu, x_svd)
>>> True
Please feel free to ask for more explanation in the comments if needed. Hope this helps.

Reducing two tensors in Tensorflow

I have two tensors.
A tensor of shape (1,N)
A tensor of shape (N,T)
What I want to calculate is the following scalar (the formula image is missing; this is reconstructed from the answer below, for a given t):
sum over i = 1..N of ( z[i, t] / l[i] + (1 - z[i, t]) / (1 - l[i]) )
tf.reduce_sum seemed helpful, but I couldn't get my head around combining the two tensors and the reduce functions to get what I want. Can someone help me write the above equation in tensorflow?
Does this work?
import tensorflow as tf
import numpy as np

N = 10
T = 20
l = tf.constant(np.random.randn(1, N), dtype=tf.float32)
z = tf.constant(np.random.randn(N, T), dtype=tf.float32)

with tf.Session() as sess:
    # swap axis for broadcasting to work
    l = tf.transpose(l, [1, 0])
    z_div_l = tf.divide(z, l)
    z_div_l_2 = tf.divide(1.0 - z, 1.0 - l)
    result = tf.reduce_sum(tf.add(z_div_l, z_div_l_2), axis=0)
    eval_result = sess.run(result)
    print('{}\n{}'.format(eval_result.shape, eval_result))
This calculates the above expression for every t from 0 to T-1, so it is not a scalar but a vector of size (T,). Your question mentions you want to compute just one scalar, but the sum is only over N and not over T, so I assumed you just want this expression to be evaluated for every t.
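If a single scalar is really what's wanted (i.e. summing over t as well), one more reduction would do it; a sketch under that assumption, reusing the tensors above:

# reduce over all axes (both N and T) to get a single scalar
scalar_result = tf.reduce_sum(tf.add(z_div_l, z_div_l_2))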

Matlab to Python: Solving the system using SVD

I am trying to convert Matlab code to Python code.
I am stuck with
x = A\b;
where A is a 2D array (2257x456) and b is a column vector (2257x1).
The x output by Matlab is a (456x1) array.
Also there is a comment in Matlab code which says: %Solve the system using SVD
So how can I do this in Python?
I tried the following, with no success:
x = np.linalg.lstsq(A,b)
x = np.linalg.lstsq(A.T, b.T)[1].T
x = A :\\ b # found this [here][1]
x = np.linalg.solve(A,b)
[1]: https://docs.scipy.org/doc/numpy-dev/user/numpy-for-matlab-users.html
Update:
Errors and results produced:
x = np.linalg.solve(A, b) : LinAlgError: Last 2 dimensions of the array must be square
x = np.linalg.lstsq(A, b) : x is not the expected result; it is a 3D array (4x456x1)
x = np.linalg.lstsq(A.T, b.T)[1].T : LinAlgError: Incompatible dimensions
You want np.linalg.lstsq(A,b). Take another look at the docstring, and note that it returns four values. So to use it, you would write
x, residuals, rank, s = np.linalg.lstsq(A,b)
Or, if you want to ignore everything except x,
x = np.linalg.lstsq(A,b)[0]
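Since the MATLAB comment specifically says %Solve the system using SVD, it may help to know that np.linalg.lstsq is itself SVD-based (it uses LAPACK's gelsd routine). If an explicit SVD is wanted anyway, here is a minimal sketch, assuming A has full column rank (lstsq additionally handles rank deficiency via a singular-value cutoff):

import numpy as np

def solve_via_svd(A, b):
    # economy-size SVD: A = U @ diag(s) @ Vh, with U (m, k), s (k,), Vh (k, n)
    U, s, Vh = np.linalg.svd(A, full_matrices=False)
    # least-squares solution: x = V @ diag(1/s) @ U.T @ b
    return Vh.T @ ((U.T @ np.ravel(b)) / s)

For A of shape (2257, 456) and b of shape (2257, 1) this returns x with shape (456,), matching lstsq's solution up to floating-point error.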
