Conjugate Gradient implementation in Python

I implemented Conjugate Gradient in Python by following the Wikipedia reference: https://en.wikipedia.org/wiki/Conjugate_gradient_method
The implementation should solve
ax = b
My application inputs are as follows:
a = <400x400 sparse matrix of type '<class 'numpy.float64'>'
    with 1920 stored elements in Compressed Sparse Row format>
b = vector of shape (400,) with dtype = float64
x = vector of random numbers of shape (400,)
Here is my implementation:
def ConjGrad(a, b, x):
    r = b - np.dot(np.array(a), x)
    p = r
    rsold = np.dot(r.T, r)
    for i in range(len(b)):
        a_p = np.dot(a, p)
        alpha = rsold / np.dot(p.T, a_p)
        x = x + (alpha * p)
        r = r - (alpha * a_p)
        rsnew = np.dot(r.T, r)
        if np.sqrt(rsnew) < 10 ** -5:
            break
        p = r + ((rsnew / rsold) * p)
        rsold = rsnew
    return p
When I call the above CG function, I get an error within the function at the line
r = b - np.dot(np.array(a), x)
The error is:
NotImplementedError: subtracting a sparse matrix from a nonzero scalar is
not supported
At run time, the variables inside the CG function have these properties:
np.dot(np.array(a), x).shape
(400,)
b.shape
(400,)
I wonder why the subtraction is not happening.
I tested the same function with the sample input arguments below and it worked fine.
a = np.array([[3, 2, -1], [2, -1, 1], [-1, 1, -1]])  # 3x3 symmetric matrix
b = (np.array([1, -2, 0])[np.newaxis]).T  # 3x1 matrix
x = (np.array([0, 1, 2])[np.newaxis]).T
Can someone please tell me why it's not working for a sparse matrix?

When multiplying a sparse matrix by an array, you should not use np.dot(np.array(a), x) but a.dot(x). See the documentation:
https://docs.scipy.org/doc/scipy/reference/sparse.html
A correct routine follows:
def conjGrad(A, x, b, tol, N):
    r = b - A.dot(x)
    p = r.copy()
    for i in range(N):
        Ap = A.dot(p)
        alpha = np.dot(p, r) / np.dot(p, Ap)
        x = x + alpha * p
        r = b - A.dot(x)
        if np.sqrt(np.sum((r**2))) < tol:
            print('Itr:', i)
            break
        else:
            beta = -np.dot(r, Ap) / np.dot(p, Ap)
            p = r + beta * p
    return x
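For completeness, here is a minimal usage sketch with an illustrative random sparse SPD system (the construction below is an assumption for demonstration, not the asker's actual matrix):
import numpy as np
import scipy.sparse as sp

# Illustrative sparse symmetric positive-definite system (not the asker's data).
n = 400
rng = np.random.default_rng(0)
M = sp.random(n, n, density=0.01, format='csr', random_state=0)
A = M @ M.T + sp.identity(n, format='csr')  # SPD by construction
b = rng.standard_normal(n)
x0 = rng.standard_normal(n)  # random initial guess

x = conjGrad(A, x0, b, tol=1e-5, N=n)
print(np.linalg.norm(A.dot(x) - b))  # residual should be small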

Related

Trying to convert MATLAB to Python: (https://en.wikipedia.org/wiki/Laplacian_matrix#Example_of_the_operator_on_a_grid)

I'm trying to get my head around the example code on the Wikipedia page for Laplacian matrices.
It is written in MATLAB and I only have access to open source tools.
I think most of it is fairly straightforward, but I'm having a bit of trouble with this one line.
C0V = V'*C0; % Transform the initial condition into the coordinate system
% of the eigenvectors
My maths isn't up to scratch to really understand what the comment means. Judging by the MATLAB website, this appears to be a transposed matrix multiplied (inner product) by another matrix, where the left matrix (after transposing) is (m x p) and the right is (p x n).
The matrix V is produced by a call to eig a few lines above, which from this answer I am substituting scipy.linalg.eig.
The problem is that C0 is clearly defined as an N x N matrix (ndarray), but in my code V is an N**2 x N**2 matrix. I have no way of knowing what shape V is in the original code.
For reference the wikipedia code is below, and below that is my attempt to rewrite it in python (using scipy and numpy), followed by the error attributed to the line described above.
MATLAB code from Wikipedia
N = 20; % The number of pixels along a dimension of the image
A = zeros(N, N); % The image
Adj = zeros(N * N, N * N); % The adjacency matrix
% Use 8 neighbors, and fill in the adjacency matrix
dx = [- 1, 0, 1, - 1, 1, - 1, 0, 1];
dy = [- 1, - 1, - 1, 0, 0, 1, 1, 1];
for x = 1:N
    for y = 1:N
        index = (x - 1) * N + y;
        for ne = 1:length(dx)
            newx = x + dx(ne);
            newy = y + dy(ne);
            if newx > 0 && newx <= N && newy > 0 && newy <= N
                index2 = (newx - 1) * N + newy;
                Adj(index, index2) = 1;
            end
        end
    end
end
% BELOW IS THE KEY CODE THAT COMPUTES THE SOLUTION TO THE DIFFERENTIAL EQUATION
Deg = diag(sum(Adj, 2)); % Compute the degree matrix
L = Deg - Adj; % Compute the laplacian matrix in terms of the degree and adjacency matrices
[V, D] = eig(L); % Compute the eigenvalues/vectors of the laplacian matrix
D = diag(D);
% Initial condition (place a few large positive values around and
% make everything else zero)
C0 = zeros(N, N);
C0(2:5, 2:5) = 5;
C0(10:15, 10:15) = 10;
C0(2:5, 8:13) = 7;
C0 = C0(:);
C0V = V' * C0; % Transform the initial condition into the coordinate system
               % of the eigenvectors
for t = 0:0.05:5
    % Loop through times and decay each initial component
    Phi = C0V .* exp(- D * t); % Exponential decay for each component
    Phi = V * Phi; % Transform from eigenvector coordinate system to original coordinate system
    Phi = reshape(Phi, N, N);
    % Display the results and write to GIF file
    imagesc(Phi);
    caxis([0, 10]);
    title(sprintf('Diffusion t = %3f', t));
    frame = getframe(1);
    im = frame2im(frame);
    [imind, cm] = rgb2ind(im, 256);
    if t == 0
        imwrite(imind, cm, 'out.gif', 'gif', 'Loopcount', inf, 'DelayTime', 0.1);
    else
        imwrite(imind, cm, 'out.gif', 'gif', 'WriteMode', 'append', 'DelayTime', 0.1);
    end
end
My attempted translation
import numpy as np
import matplotlib.pyplot as plt
import scipy.linalg as la

N = 20  # The number of pixels along a dimension of the image
A = np.zeros((N, N))  # The image
Adj = np.zeros((N**2, N**2))  # The adjacency matrix
# Use 8 neighbors, and fill in the adjacency matrix
dx = [- 1, 0, 1, - 1, 1, - 1, 0, 1]
dy = [- 1, - 1, - 1, 0, 0, 1, 1, 1]
for x in range(N):
    for y in range(N):
        index = x * N + y
        for ne in range(len(dx)):
            newx = x + dx[ne]
            newy = y + dy[ne]
            if (newx >= 0 and newx < N
                    and newy >= 0 and newy < N):
                index2 = newx * N + newy
                Adj[index, index2] = 1
# BELOW IS THE KEY CODE THAT COMPUTES THE SOLUTION TO THE DIFFERENTIAL EQUATION
Deg = np.diag(np.sum(Adj, 1))  # Compute the degree matrix
L = Deg - Adj  # Compute the laplacian matrix in terms of the degree and adjacency matrices
D, V = la.eig(L)  # Compute the eigenvalues/vectors of the laplacian matrix
D = np.diag(D)
# Initial condition (place a few large positive values around and
# make everything else zero)
C0 = np.zeros((N, N))
C0[1:4, 1:4] = 5
C0[9:14, 9:14] = 10
C0[1:5, 7:12] = 7
#C0 = C0(:) # This doesn't seem to do anything?
# matlab: C0V = V'*C0; % Transform the initial condition into the coordinate
#                      % system of the eigenvectors
C0V = V.T * C0  # ???
Error
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-10-c3014dc90c9f> in <module>
      2 # of the eigenvectors
      3
----> 4 C0V = V.T * C0
ValueError: operands could not be broadcast together with shapes (400,400) (20,20)
Edit
It appears @hpaulj has identified my problem. The line I omitted is a ravel operation, i.e. C0 = C0(:) takes a (20, 20) matrix to a (400, 1) vector. So I need
C0 = C0.ravel()
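For reference, a minimal sketch of the corrected steps, assuming L and C0 as built above. Besides the ravel, note that MATLAB's V'*C0 is a matrix product, so the elementwise V.T * C0 likely also needs to become V.T.dot(C0); and since L is symmetric, la.eigh is a natural substitute that returns the eigenvalues directly as a 1-D array:
import scipy.linalg as la

D, V = la.eigh(L)   # eigenvalues come back as a 1-D array (like MATLAB's D = diag(D))
C0 = C0.ravel()     # MATLAB's C0 = C0(:): (20, 20) -> (400,)
C0V = V.T.dot(C0)   # MATLAB's V' * C0: a matrix product, not elementwise '*'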

Projection of matrix onto a simplex

I have a problem understanding this piece of code, which, based on the output, I guess computes the eigenvectors of the matrix.
def simplexProj(y):
    """
    Given y, computes its projection x* onto the simplex
          Delta = { x | x >= 0 and sum(x) <= 1 },
    that is, x* = argmin_x ||x-y||_2 such that x in Delta.
    x = SimplexProj(y)
    ****** Input ******
    y : input vector.
    ****** Output ******
    x : projection of y onto Delta.
    """
    if len(y.shape) == 1:  # Reshape to (1,-1) if y is a vector.
        y = y.reshape(1, -1)  # row vector
    x = y.copy()
    x[x < 0] = 0  # negative entries are replaced with 0
    K = np.flatnonzero(np.sum(x, 0) > 1)  # indices of columns whose column sum > 1
    x[:, K] = blockSimplexProj(y[:, K])
    return x

def blockSimplexProj(y):
    """ Same as function SimplexProj except that sum(max(Y,0)) > 1. """
    r, c = y.shape
    ys = -np.sort(-y, axis=0)  # sort each column, biggest entry on the first row
    mu = np.zeros(c, dtype=float)
    S = np.zeros((r, c), dtype=float)
    for i in range(1, r):  # 1st to (r-1)th row
        S[i, :] = np.sum(ys[:i, :] - ys[i, :], 0)
        print(S)
        colInd_ge1 = np.flatnonzero(S[i, :] >= 1)
        colInd_lt1 = np.flatnonzero(S[i, :] < 1)
        if len(colInd_ge1) > 0:
            mu[colInd_ge1] = (1 - S[i - 1, colInd_ge1]) / i - ys[i - 1, colInd_ge1]
        if i == r:
            mu[colInd_lt1] = (1 - S[r, colInd_lt1]) / (r + 1) - ys[r, colInd_lt1]
    x = y + mu
    x[x < 0] = 0
    return x
I'm a bit puzzled by the step computing the matrix S because, according to the code, the first row of S should be all 0. Take for example the matrix A = np.array([[25,70,39,10,80],[12,45,32,89,43],[67,24,84,39,21],[0.1,0.2,0.3,0.035,0.06]]). The 3 iterations (i = 1, 2, 3) are computed as expected, but then there is an extra step which seemingly gives back S as a basis of eigenvectors. It would be great if somebody could help me understand this problem. Also, I'm not sure what the name of this algorithm is (how S is computed).
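For what it's worth, this is not an eigenvector computation: S accumulates the partial sums used by the standard sort-based Euclidean projection onto the simplex, vectorized per column (the algorithm is often attributed to Held, Wolfe & Crowder, 1974; see also Duchi et al., 2008). A single-vector sketch of the same idea, assuming Delta = { x | x >= 0 and sum(x) <= 1 } as in the docstring:
import numpy as np

def project_simplex(y):
    """Euclidean projection of a 1-D vector y onto
    Delta = { x | x >= 0 and sum(x) <= 1 } (illustrative sketch)."""
    x = np.maximum(y, 0.0)
    if x.sum() <= 1.0:
        return x  # already feasible after clipping negatives
    # Otherwise the projection lies on the face sum(x) = 1; use the sort-based rule.
    u = np.sort(y)[::-1]                       # sort descending
    css = np.cumsum(u)
    k = np.arange(1, len(y) + 1)
    rho = np.nonzero(u + (1.0 - css) / k > 0)[0][-1]
    theta = (1.0 - css[rho]) / (rho + 1.0)     # common shift applied to all entries
    return np.maximum(y + theta, 0.0)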

Numpy broadcasting elementwise product on all pairs of rows?

I have a 1d ndarray A of shape (n,) and a 2d ndarray E of shape (n, m). I am trying to perform the following calculation (the circle-dot denotes elementwise multiplication):
R = sum over all pairs i < j of (A_i * E_i) ⊙ (A_j * E_j), where E_i is the i-th row of E.
I have written it with a for loop, but this block of code is called thousands of times, and I was hoping there was a way to accomplish this with broadcasting or numpy functions. The following is my for-loop solution I'm trying to rewrite:
def fun(E, A):
    X = E * A[:, np.newaxis]
    R = np.zeros(E.shape[-1])
    for ii in range(len(E) - 1):
        for jj in range(ii + 1, len(E)):
            R += X[ii] * X[jj]
    return R
Any help would be appreciated.
Current approach, but still not working:
def fun1(E, A):
    X = E * A[:, np.newaxis]
    R = np.zeros(E.shape[-1])
    for ii in range(len(E) - 1):
        for jj in range(ii + 1, len(E)):
            R += X[ii] * X[jj]
    return R

def fun2(E, A):
    n = E.shape[0]
    m = E.shape[1]
    A_ = np.triu(A[1:] * A[:-1].reshape(-1, 1))
    E_ = E[1:] * E[:-1]
    R = np.sum((A_.reshape(n - 1, 1, n - 1) * E_.T).transpose(0, 2, 1).reshape(n - 1 * n - 1, m), axis=0)
    return R

A = np.arange(4, 9)
E = np.arange(20).reshape((5, 4))
print(fun1(E, A))
print(fun2(E, A))
Now, this should work:
def fun3(E, A):
    n, m = E.shape
    n_ = n - 1
    X = E * A[:, np.newaxis]
    a = (X[:-1].reshape(n_, 1, m) * X[1:])
    b = np.tril(np.ones((m, n_, n_))).T
    R = np.sum((a * b).reshape(n_ * n_, m), axis=0)
    return R
The last function was based only on the given formula; this one is instead based on fun and tested with your added test case.
Hope this works for you!
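As a further sketch, the pairwise sum also collapses algebraically: elementwise, sum_{i<j} X_i ⊙ X_j = ((sum_i X_i)^2 - sum_i X_i^2) / 2, which avoids the (n_, n_, m) intermediate entirely:
import numpy as np

def fun_identity(E, A):
    # sum over i < j of X[i] * X[j] equals
    # ((sum_i X[i])**2 - sum_i X[i]**2) / 2, elementwise per column.
    X = E * A[:, np.newaxis]
    return 0.5 * (X.sum(axis=0)**2 - (X**2).sum(axis=0))

A = np.arange(4, 9)
E = np.arange(20).reshape((5, 4))
print(np.allclose(fun_identity(E, A), fun1(E, A)))  # expect True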

Transform matrix from scipy.spatial.procrustes [duplicate]

Is there something like Matlab's procrustes function in NumPy/SciPy or related libraries?
For reference. Procrustes analysis aims to align 2 sets of points (in other words, 2 shapes) to minimize square distance between them by removing scale, translation and rotation warp components.
Example in Matlab:
X = [0 1; 2 3; 4 5; 6 7; 8 9]; % first shape
R = [1 2; 2 1]; % rotation matrix
t = [3 5]; % translation vector
Y = X * R + repmat(t, 5, 1); % warped shape, no scale and no distortion
[d Z] = procrustes(X, Y); % Z is Y aligned back to X
Z
Z =
0.0000 1.0000
2.0000 3.0000
4.0000 5.0000
6.0000 7.0000
8.0000 9.0000
Same task in NumPy:
X = arange(10).reshape((5, 2))
R = array([[1, 2], [2, 1]])
t = array([3, 5])
Y = dot(X, R) + t
Z = ???
Note: I'm only interested in aligned shape, since square error (variable d in Matlab code) is easily computed from 2 shapes.
I'm not aware of any pre-existing implementation in Python, but it's easy to take a look at the MATLAB code using edit procrustes.m and port it to Numpy:
def procrustes(X, Y, scaling=True, reflection='best'):
    """
    A port of MATLAB's `procrustes` function to Numpy.

    Procrustes analysis determines a linear transformation (translation,
    reflection, orthogonal rotation and scaling) of the points in Y to best
    conform them to the points in matrix X, using the sum of squared errors
    as the goodness-of-fit criterion.

        d, Z, [tform] = procrustes(X, Y)

    Inputs:
    ------------
    X, Y
        matrices of target and input coordinates. They must have equal
        numbers of points (rows), but Y may have fewer dimensions
        (columns) than X.
    scaling
        if False, the scaling component of the transformation is forced
        to 1
    reflection
        if 'best' (default), the transformation solution may or may not
        include a reflection component, depending on which fits the data
        best. Setting reflection to True or False forces a solution with
        reflection or no reflection, respectively.

    Outputs
    ------------
    d
        the residual sum of squared errors, normalized according to a
        measure of the scale of X, ((X - X.mean(0))**2).sum()
    Z
        the matrix of transformed Y-values
    tform
        a dict specifying the rotation, translation and scaling that
        maps Y --> X
    """
    n, m = X.shape
    ny, my = Y.shape

    muX = X.mean(0)
    muY = Y.mean(0)

    X0 = X - muX
    Y0 = Y - muY

    ssX = (X0**2.).sum()
    ssY = (Y0**2.).sum()

    # centred Frobenius norm
    normX = np.sqrt(ssX)
    normY = np.sqrt(ssY)

    # scale to equal (unit) norm
    X0 /= normX
    Y0 /= normY

    if my < m:
        Y0 = np.concatenate((Y0, np.zeros((n, m - my))), 0)

    # optimum rotation matrix of Y
    A = np.dot(X0.T, Y0)
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    V = Vt.T
    T = np.dot(V, U.T)

    if reflection != 'best':
        # does the current solution use a reflection?
        have_reflection = np.linalg.det(T) < 0
        # if that's not what was specified, force another reflection
        if reflection != have_reflection:
            V[:, -1] *= -1
            s[-1] *= -1
            T = np.dot(V, U.T)

    traceTA = s.sum()

    if scaling:
        # optimum scaling of Y
        b = traceTA * normX / normY
        # standardised distance between X and b*Y*T + c
        d = 1 - traceTA**2
        # transformed coords
        Z = normX * traceTA * np.dot(Y0, T) + muX
    else:
        b = 1
        d = 1 + ssY / ssX - 2 * traceTA * normY / normX
        Z = normY * np.dot(Y0, T) + muX

    # transformation matrix
    if my < m:
        T = T[:my, :]
    c = muX - b * np.dot(muY, T)

    # transformation values
    tform = {'rotation': T, 'scale': b, 'translation': c}

    return d, Z, tform
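A short usage sketch with the question's NumPy example, assuming the port above:
import numpy as np

X = np.arange(10).reshape((5, 2))
R = np.array([[1, 2], [2, 1]])
t = np.array([3, 5])
Y = np.dot(X, R) + t

d, Z, tform = procrustes(X, Y)
print(Z)  # should recover X up to numerical precision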
There is a SciPy function for it: scipy.spatial.procrustes
I'm just posting its example here:
>>> import numpy as np
>>> from scipy.spatial import procrustes
>>> a = np.array([[1, 3], [1, 2], [1, 1], [2, 1]], 'd')
>>> b = np.array([[4, -2], [4, -4], [4, -6], [2, -6]], 'd')
>>> mtx1, mtx2, disparity = procrustes(a, b)
>>> round(disparity)
0.0
You can have both Ordinary Procrustes Analysis and Generalized Procrustes Analysis in Python with something like this:
import numpy as np

def opa(a, b):
    aT = a.mean(0)
    bT = b.mean(0)
    A = a - aT
    B = b - bT
    aS = np.sum(A * A)**.5
    bS = np.sum(B * B)**.5
    A /= aS
    B /= bS
    U, _, V = np.linalg.svd(np.dot(B.T, A))
    aR = np.dot(U, V)
    if np.linalg.det(aR) < 0:
        V[1] *= -1
        aR = np.dot(U, V)
    aS = aS / bS
    aT -= (bT.dot(aR) * aS)
    aD = (np.sum((A - B.dot(aR))**2) / len(a))**.5
    return aR, aS, aT, aD

def gpa(v, n=-1):
    if n < 0:
        p = avg(v)
    else:
        p = v[n]
    l = len(v)
    r, s, t, d = np.ndarray((4, l), object)
    for i in range(l):
        r[i], s[i], t[i], d[i] = opa(p, v[i])
    return r, s, t, d

def avg(v):
    v_ = np.copy(v)
    l = len(v_)
    R, S, T = [list(np.zeros(l)) for _ in range(3)]
    for i, j in np.ndindex(l, l):
        r, s, t, _ = opa(v_[i], v_[j])
        R[j] += np.arccos(min(1, max(-1, np.trace(r[:1])))) * np.sign(r[1][0])
        S[j] += s
        T[j] += t
    for i in range(l):
        a = R[i] / l
        r = [np.cos(a), -np.sin(a)], [np.sin(a), np.cos(a)]
        v_[i] = v_[i].dot(r) * (S[i] / l) + (T[i] / l)
    return v_.mean(0)
For testing purposes, the output of each algorithm can be visualized as follows:
import matplotlib.pyplot as p
p.rcParams['toolbar'] = 'None'

def plt(o, e, b):
    p.figure(figsize=(10, 10), dpi=72, facecolor='w').add_axes([0.05, 0.05, 0.9, 0.9], aspect='equal')
    p.plot(0, 0, marker='x', mew=1, ms=10, c='g', zorder=2, clip_on=False)
    p.gcf().canvas.set_window_title('%f' % e)
    x = np.ravel(o[0].T[0])
    y = np.ravel(o[0].T[1])
    p.xlim(min(x), max(x))
    p.ylim(min(y), max(y))
    a = []
    for i, j in np.ndindex(len(o), 2):
        a.append(o[i].T[j])
    O = p.plot(*a, marker='x', mew=1, ms=10, lw=.25, c='b', zorder=0, clip_on=False)
    O[0].set(c='r', zorder=1)
    if not b:
        O[2].set_color('b')
        O[2].set_alpha(0.4)
    p.axis('off')
    p.show()
# Fly wings example (Klingenberg, 2015 | https://en.wikipedia.org/wiki/Procrustes_analysis)
arr1 = np.array([[588.0, 443.0], [178.0, 443.0], [56.0, 436.0], [50.0, 376.0], [129.0, 360.0], [15.0, 342.0], [92.0, 293.0], [79.0, 269.0], [276.0, 295.0], [281.0, 331.0], [785.0, 260.0], [754.0, 174.0], [405.0, 233.0], [386.0, 167.0], [466.0, 59.0]])
arr2 = np.array([[477.0, 557.0], [130.129, 374.307], [52.0, 334.0], [67.662, 306.953], [111.916, 323.0], [55.119, 275.854], [107.935, 277.723], [101.899, 259.73], [175.0, 329.0], [171.0, 345.0], [589.0, 527.0], [591.0, 468.0], [299.0, 363.0], [306.0, 317.0], [406.0, 288.0]])

def opa_out(a):
    r, s, t, d = opa(a[0], a[1])
    a[1] = a[1].dot(r) * s + t
    return a, d, False

plt(*opa_out([arr1, arr2, np.matrix.copy(arr2)]))

def gpa_out(a):
    g = gpa(a, -1)
    D = [avg(a)]
    for i in range(len(a)):
        D.append(a[i].dot(g[0][i]) * g[1][i] + g[2][i])
    return D, sum(g[3]) / len(a), True

plt(*gpa_out([arr1, arr2]))
You may also want to try this package, which offers various flavors of Procrustes methods: https://github.com/theochem/procrustes

Python: Gradient of matrix function

I want to calculate the gradient of the following function: h(x) = 0.5 * x.T * A * x + b.T * x.
For now I set A to be just a (2, 2) matrix.
def function(x):
    return 0.5 * np.dot(np.dot(np.transpose(x), A), x) + np.dot(np.transpose(b), x)
where
A = np.zeros((2, 2))
n = A.shape[0]
A[range(n), range(n)] = 1
a (2, 2) matrix with ones on the main diagonal, and
b = np.ones(2)
For a given point x = (1, 1), numpy.gradient returns an empty list.
x = np.ones(2)
result = np.gradient(function(x))
However, shouldn't I get something like this: grad(h((1, 1))) = (x1 + 1, x2 + 1) = (2, 2)?
Appreciate any help.
It seems like you want to perform symbolic differentiation or automatic differentiation, which np.gradient does not do (np.gradient computes numerical differences over an array of sampled values). sympy is a package for symbolic math and autograd is a package for automatic differentiation for numpy. For example, to do this with autograd:
import autograd.numpy as np
from autograd import grad

def function(x):
    return 0.5 * np.dot(np.dot(np.transpose(x), A), x) + np.dot(np.transpose(b), x)

A = np.zeros((2, 2))
n = A.shape[0]
A[range(n), range(n)] = 1
b = np.ones(2)
x = np.ones(2)
grad(function)(x)
Outputs:
array([2., 2.])
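As a sanity check, for symmetric A the gradient has the closed form A x + b, so the autograd result can be verified directly:
# For symmetric A, grad h(x) = A @ x + b in closed form.
analytic = np.dot(A, x) + b
print(analytic)  # array([2., 2.]) here, matching grad(function)(x)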
