Solve overdetermined system with QR decomposition in Python

I'm trying to solve an overdetermined system with QR decomposition and linalg.solve but the error I get is
LinAlgError: Last 2 dimensions of the array must be square.
This happens when the R array is not square, right? The code looks like this
import numpy as np
import math as ma
A = np.random.rand(2,3)
b = np.random.rand(2,1)
Q, R = np.linalg.qr(A)
Qb = np.matmul(Q.T,b)
x_qr = np.linalg.solve(R,Qb)
Is there a way to write this in a more efficient way for arbitrary A dimensions? If not, how do I make this code snippet work?

The reason is indeed that the matrix R is not square, because your A is not square. You can use np.linalg.lstsq instead, which finds the solution that minimizes the squared error (and returns the exact solution if one exists).
import numpy as np
A = np.random.rand(2, 3)
b = np.random.rand(2, 1)
x_qr = np.linalg.lstsq(A, b, rcond=None)[0]  # [0] picks out the solution array

Make sure QR is called with mode='reduced' (the economic factorization). With mode='complete', Q and R are returned as M x M and M x N, so if M is greater than N your matrix R will be non-square. In reduced mode they are M x N and N x N respectively, in which case the solve routine will work fine.
However, you also have equations/unknowns backwards for an overdetermined system. Your code snippet should be
import numpy as np
A = np.random.rand(3, 2)   # 3 equations, 2 unknowns: overdetermined
b = np.random.rand(3, 1)
Q, R = np.linalg.qr(A, mode='reduced')
# print(Q.shape, R.shape)  # (3, 2) and (2, 2)
Qb = np.matmul(Q.T, b)
x_qr = np.linalg.solve(R, Qb)
As noted by other contributors, you could also call lstsq directly, but sometimes it is more convenient to have Q and R at hand (e.g. if you are also planning on computing the projection matrix).
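A minimal sketch of that use, assuming the reduced Q from the snippet above: the orthogonal projector onto the column space of A is Q @ Q.T, and projecting b with it reproduces the fitted values A @ x_qr.
import numpy as np
A = np.random.rand(3, 2)
b = np.random.rand(3, 1)
Q, R = np.linalg.qr(A, mode='reduced')
P = Q @ Q.T                            # orthogonal projector onto the column space of A
x_qr = np.linalg.solve(R, Q.T @ b)
print(np.allclose(P @ b, A @ x_qr))    # True: both are the projection of b onto col(A)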

As shown in the documentation of numpy.linalg.solve:
Computes the “exact” solution, x, of the well-determined, i.e., full rank, linear matrix equation ax = b.
Your system of equations is underdetermined, not overdetermined. Notice that it has 3 unknowns and only 2 equations, i.e. fewer equations than unknowns.
Also notice that it mentions that in numpy.linalg.solve(a, b), a must be a square (M x M) matrix. The reason behind this is that solving the system Ax = b this way amounts to inverting A, and only square full-rank matrices are invertible.
In these cases a common approach is to take the Moore-Penrose pseudoinverse, which computes a best-fit (least-squares) solution of the system. So instead of trying to solve for an exact solution, use numpy.linalg.lstsq:
x_qr = np.linalg.lstsq(R, Qb, rcond=None)[0]
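A minimal sketch of that pseudoinverse route, using the original 2x3 A (not part of the answer above): np.linalg.pinv computes the Moore-Penrose pseudoinverse explicitly, and pinv(A) @ b matches the lstsq solution.
import numpy as np
A = np.random.rand(2, 3)   # the original underdetermined shape
b = np.random.rand(2, 1)
x_pinv = np.linalg.pinv(A) @ b                      # Moore-Penrose pseudoinverse solution
x_lstsq = np.linalg.lstsq(A, b, rcond=None)[0]      # least-squares / minimum-norm solution
print(np.allclose(x_pinv, x_lstsq))                 # True: both give the minimum-norm solution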

Related

In which scenario would one use another matrix than the identity matrix for finding eigenvalues?

The scipy.linalg.eigh function can take two matrices as arguments: the matrix a, whose eigenvalues and eigenvectors we want, and an optional matrix b, which defaults to the identity matrix when omitted.
In what scenario would someone like to use this b matrix?
Some more context: I am trying to use Xdawn covariances from the pyRiemann package. This uses scipy.linalg.eigh with a covariance matrix a and a baseline covariance matrix b. You can find the implementation here. This raises an error, as the b matrix in my case is not positive definite and thus not usable in scipy.linalg.eigh. Removing this matrix and just using the identity matrix, however, solves the problem and yields relatively nice results... The problem is that I do not really understand what I changed, and maybe I am doing something I should not be doing.
This is the code from the pyRiemann package I am using (modified to avoid using functions defined in other parts of the package):
# X are samples (EEG data), y are labels
# shape of X is (1000, 64, 2459)
# shape of y is (1000,)
from scipy.linalg import eigh
Ne, Ns, Nt = X.shape
tmp = X.transpose((1, 2, 0))
b = np.matrix(sklearn.covariance.empirical_covariance(tmp.reshape(Ne, Ns * Nt).T))
for c in self.classes_:
    # Prototyped response for each class
    P = np.mean(X[y == c, :, :], axis=0)
    # Covariance matrix of the prototyped response & signal
    a = np.matrix(sklearn.covariance.empirical_covariance(P.T))
    # Spatial filters
    evals, evecs = eigh(a, b)
    # and I am now using the following, disregarding the b matrix:
    # evals, evecs = eigh(a)
If A and B are both symmetric matrices, that does not necessarily imply that inv(A)*B is symmetric. So, if I had to solve a generalised eigenvalue problem Ax = lambda Bx, I would use eig(A, B) rather than eig(inv(A)*B), so that the symmetry isn't lost.
One practical application is in finding the natural frequencies of a dynamic mechanical system from differential equations of the form M (d²x/dt²) = -Kx, where M is a positive definite matrix known as the mass matrix, K is the stiffness matrix, x is the displacement vector and d²x/dt² is its second derivative, the acceleration vector. To find the natural frequencies, substitute x = x0 sin(ωt), where ω is the natural frequency; the equation then reduces to K x0 = ω² M x0. One could use eig(inv(K)*M), but that might break the symmetry of the resultant matrix, so I would use eig(K, M) instead.
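A minimal Python sketch of that generalised problem, with made-up positive definite K and M purely for illustration: scipy.linalg.eigh(K, M) solves K x = ω² M x directly, without forming inv(K) @ M.
import numpy as np
from scipy.linalg import eigh
rng = np.random.default_rng(0)
n = 5
K = rng.standard_normal((n, n)); K = K @ K.T + n * np.eye(n)   # stiffness matrix (symmetric positive definite)
M = rng.standard_normal((n, n)); M = M @ M.T + n * np.eye(n)   # mass matrix (symmetric positive definite)
omega_sq, modes = eigh(K, M)     # generalised eigenvalue problem K x = omega^2 M x
print(np.sqrt(omega_sq))         # natural frequencies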
In the generalised problem (A - lambda B)x = 0, the eigenvectors x are not expressed in the same basis as the covariance matrix.
If B is not positive definite, it means there are vectors that can be flipped by your B.
I hope this is helpful.

How to remove discontinuities from complex angle of NumPy eigenvector components?

I am using NumPy's linalg.eig on square matrices. My square matrices are a function of a 2D domain, and I am looking at its eigenvectors' complex angles along a parameterized circle on this domain. As long as the path I am considering is smooth, I expect the complex angles of each eigenvector's components to be smooth. However, for some cases, this is not the case with Python (although it is with other programming languages). For the parameter M=0 (some argument in my matrix that appears on its diagonal), I have components that look like:
when they should ideally look like (M=0.1):
What I have tried:
I verified that the matrices are Hermitian in both cases.
When I use linalg.eigh, M=0.1 becomes discontinuous while M=0 sometimes becomes continuous.
Using np.unwrap did nothing.
The difference between component phases (i.e. np.angle(v1-v2) for eigenvector v=[[v1],[v2]]) is smooth/continuous, but this is not what I want.
Fixing the NumPy seed before solving did nothing for different values of the seed. For example: np.random.seed(1).
What else can I do? I am trying to use Sympy's eigenvects just because I am running out of options, and I asked another question about another potential approach here: How do I force first component of NumPy eigenvectors to be real?. But I do not know what else I can try.
Here is a minimal working example that works nicely in a Jupyter notebook:
import numpy as np
from numpy import linalg as LA
import matplotlib.pyplot as plt

M = 0.01  # nonzero M is okay
M = 0.0   # M=0 causes problems

def matrix_generator(kx, ky, M):
    a = 2.46; t = 1; k = np.array((kx, ky))
    d1 = (a/2)*np.array((1, np.sqrt(3))); d2 = (a/2)*np.array((1, -np.sqrt(3))); d3 = -a*np.array((1, 0))
    sx = np.matrix([[0, 1], [1, 0]]); sy = np.matrix([[0, -1j], [1j, 0]]); sz = np.matrix([[1, 0], [0, -1]])
    hx = np.cos(k@d1) + np.cos(k@d2) + np.cos(k@d3); hy = np.sin(k@d1) + np.sin(k@d2) + np.sin(k@d3)
    return -t*(hx*sx - hy*sy + M*sz)

n_segs = 200  # number of segments in (kx,ky) loop
evecs_along_loop = np.zeros((n_segs, 2, 2), dtype=float)

# parameterize circular loop
kx0 = 0.5; ky0 = 1; r1 = 0.2; r2 = 0.2
a = np.linspace(0.0, 2*np.pi, num=n_segs+2)
kloop = np.zeros((n_segs+2, 2))
for i in range(n_segs+2):
    kloop[i, :] = np.array([kx0 + r1*np.cos(a[i]), ky0 + r2*np.sin(a[i])])

# assign eigenvector complex angles
for j in np.arange(n_segs):
    np.random.seed(2)
    H = matrix_generator(kloop[j][0], kloop[j][1], M)
    eval0, psi0 = LA.eig(H)
    evecs_along_loop[j, :, :] = np.angle(psi0)

# plot eigenvector complex angles
for p in np.arange(2):
    for q in np.arange(2):
        print(f"Phase for eigenvector element {p},{q}:")
        fig = plt.figure()
        ax = plt.axes()
        ax.plot(evecs_along_loop[:, p, q])
        plt.show()
Clarification for anon01's comment:
For M=0, a sample matrix at some value of (kx,ky) would look like:
a = np.matrix([[0.+0.j, 0.99286437+1.03026667j],
[0.99286437-1.03026667j, 0.+0.j]])
For M =/= 0, the diagonal will be non-zero (but real).
I think that in general this is a tough problem. The fundamental issue is that eigenvectors (unlike eigenvalues) are not unambiguously defined. An eigenvector v of M with eigenvalue c is any non-zero vector for which
M*v = c*v
In particular, for any non-zero scalar s, multiplying an eigenvector by s yields another eigenvector, and even if you demand (as usual) that eigenvectors have length 1, we are still free to multiply by any scalar of absolute value 1. Even worse, if v1, ..., vd are orthogonal eigenvectors for c, then any non-zero linear combination of the v's is also an eigenvector for c.
Different eigendecomposition routines might well, therefore, come up with very different eigenvectors and still be doing their job. Moreover some routines might produce eigenvectors that are far apart for matrices that are close together.
A simple tractable case is where you know that all your eigenvalues are non-degenerate (i.e. each eigenspace is of dimension 1) and you happen to know that for a particular i, the i'th component of each eigenvector will be non zero. Then you could multiply the eigenvector v by a scalar, of absolute value 1, chosen so that after the multiplication v[i] is a positive real number. In C
s = conj(v[i])/cabs(v[i])
where
conj(z) is the complex conjugate of the complex number z,
and cabs(z) is the absolute value of the complex number z
Note that the above supposes that we are using the same index for every eigenvector, though the factor s varies from eigenvector to eigenvector.
This would impose a uniqueness on the eigenvectors, and, one would hope, mean that they varied continuously with the parameters of your matrix.
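Here is a short Python sketch of that phase-fixing recipe, a translation of the C expression above; it assumes non-degenerate eigenvalues and that component i of every eigenvector is non-zero.
import numpy as np

def fix_phase(evecs, i=0):
    # Rescale each eigenvector (column) so that its i-th component is real and positive,
    # i.e. multiply each column v by s = conj(v[i]) / abs(v[i]).
    v_i = evecs[i, :]
    s = np.conj(v_i) / np.abs(v_i)
    return evecs * s            # broadcasts s over the columns

H = np.array([[0, 1 - 1j], [1 + 1j, 0]])    # example Hermitian matrix
_, psi = np.linalg.eig(H)
psi_fixed = fix_phase(psi, i=0)
print(np.angle(psi_fixed[0, :]))            # zeros: the first components are now real and positive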

Lanczos algorithm for finding top eigenvalues of a matrix sum

I am trying to find the top k leading eigenvalues of a numpy matrix (using python dot product notation)
L@L + a*Y@Y.T, where L and Y are a symmetric nxn and an nxd matrix, respectively.
According to the below text from this paper, I should be able to calculate these leading eigenvalues with L@(L@v) + a*X@(X.T@v), where I guess v is an arbitrary vector. The Lanczos paper they cite is here.
I'm not quite sure where to start. I know that scipy has scipy.sparse.linalg.eigsh here, and from the notes it looks like it uses the Lanczos algorithm - but I am at a loss as to whether it's possible to use sparse.linalg.eigsh for my specific use case. I googled around and didn't find a Python implementation for this very quickly -- does anybody know if I can use sparse.linalg.eigsh to calculate this somehow? I definitely don't want to write this algorithm myself.
I also wasn't sure whether to post this in math.stackexchange or here, since it's a question about the Python implementation of a very mathy thing.
You could check scipy.sparse.linalg.eigsh.
import numpy as np
from scipy.sparse.linalg import eigsh
from numpy.linalg import eigh

a = 1.4
n = 20
d = 7

# random symmetric n x n matrix
L = np.random.randn(n, n)
L = L + L.T

# random n x d matrix
Y = np.random.randn(n, d)

A = L @ L.T + a * Y @ Y.T  # your equation
eigsh expects a real symmetric (or complex Hermitian) matrix, which A is; it is also positive semidefinite whenever a > 0, since both L @ L.T and Y @ Y.T are.
You could check the four largest eigenvalues as follows
eigsh(A, 4)[0]
For reference you can compare against numpy.linalg.eigh, which computes all the eigenvalues. Sort them and take the last four elements of the sorted array; the results should be close.
np.sort(eigh(A)[0])[-4:]
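To use the matrix-free form quoted in the question, eigsh also accepts a scipy.sparse.linalg.LinearOperator whose matvec evaluates L @ (L @ v) + a * Y @ (Y.T @ v), so the n x n product never has to be formed. A hedged sketch, assuming the same random L and Y as above:
import numpy as np
from scipy.sparse.linalg import eigsh, LinearOperator

a, n, d, k = 1.4, 20, 7, 4
L = np.random.randn(n, n); L = L + L.T    # random symmetric n x n matrix
Y = np.random.randn(n, d)                 # random n x d matrix

def matvec(v):
    # applies (L @ L + a * Y @ Y.T) to v without building the n x n matrix
    return L @ (L @ v) + a * (Y @ (Y.T @ v))

A_op = LinearOperator((n, n), matvec=matvec, dtype=float)
top_k = eigsh(A_op, k=k, which='LA', return_eigenvectors=False)
print(np.sort(top_k))                     # the k largest (algebraic) eigenvalues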

Multivariate Multiple Linear regression using only numpy

I am looking to build a multivariate, multiple linear regression model with N dependent variables and M independent variables. I was looking around and cannot seem to find an implementation. I did some research and found some notes here: http://users.stat.umn.edu/~helwig/notes/mvlr-Notes.pdf on slide 51. This seems very simple to implement:
import numpy as np
M = 10
N = 3
p = 15
Y = np.random.rand(p,N)
X = np.random.rand(p,M)
A = np.dot(np.transpose(X), X)   # X^T X
B = np.dot(np.transpose(X), Y)   # X^T Y
sol = np.linalg.solve(A, B)      # solves the normal equations (X^T X) sol = X^T Y
where sol outputs the matrix of coefficients. I eventually will be scaling this up to extremely large datasets. My main concern is the accuracy in this method. To be quite honest it seems all too simple. Can someone weigh in on whether this is sufficient in a multivariate, multiple regression or is there some package or anything else that I can use that is better?
Thank you
There's a lot more going on behind the scenes than you might think. What you are looking at is the least-squares solution to an overconstrained linear system of equations. Let me explain it like this: you have p equations and q unknowns.
Case 1: p < q. Infinitely many solutions. The p x q matrix A is not invertible, so infinitely many solutions exist. They can be found by finding a particular solution and a basis for the null space. np.linalg.solve can't be used to solve such a system, as it only accepts full-rank square matrices. You can use np.linalg.lstsq instead.
Case 2: p = q. Unique solution. This means that the p x q matrix is invertible and you can solve Ax = b via x = A^(-1) b. In effect, when you call np.linalg.solve, this is what happens.
Case 3: p > q. In general, no exact solution exists, but we can approximate one by projecting the vector b orthogonally onto the column space of A. That is, we look for b_hat such that b_hat + w = b, where w is perpendicular to the column space of A and b_hat lies in that column space. Then there is some x_hat with A * x_hat = b_hat, and A^(T) w = 0 (because w lies in the left null space of A). Since w = b - b_hat, we have A^(T) b - A^(T) (A * x_hat) = 0, and hence x_hat = (A^(T) A)^(-1) A^(T) b. This x_hat is the best approximate (least-squares) solution to the system with p > q. Again, np.linalg.solve can't be used to solve this type of linear system, but np.linalg.lstsq can.
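A small sketch checking that equivalence on the question's shapes (p = 15 rows, M = 10 predictors, N = 3 responses; my own check, not from the answer above): the normal-equations solution matches np.linalg.lstsq.
import numpy as np
p, M, N = 15, 10, 3
X = np.random.rand(p, M)
Y = np.random.rand(p, N)
sol_normal = np.linalg.solve(X.T @ X, X.T @ Y)      # normal equations (X^T X) sol = X^T Y
sol_lstsq = np.linalg.lstsq(X, Y, rcond=None)[0]    # SVD-based least squares
print(np.allclose(sol_normal, sol_lstsq))           # True for a full-rank, well-conditioned X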

Scipy Newton Krylov Expects Square Matrix

I am trying to use scipy.optimize.newton_krylov() to solve a least-squares optimization problem, i.e. finding x such that (Ax - b)**2 = 0. My understanding is that A has to be mxn with m>n, b has to be mx1, and x will be nx1. When I try to run the optimization, I get an error:
ValueError: expected square matrix, but got shape=(40, 6)
Presumably this error concerns the computation of the Jacobian and not my input matrix A? But if so, how can I change the values I am providing to the functions to resolve this problem? Any advice would be appreciated.
The following code reproduces the error:
import numpy as np
from scipy.optimize import newton_krylov
A = np.random.uniform(0, 1, (40,6))
b = np.arange(40)
x0 = np.ones(6)
def F(x):
    return (A.dot(x) - b)**2
x = newton_krylov(F, np.ones(6))
As the docstring of newton_krylov explains, it finds a root of a function F(x). The function F must accept a one-dimensional array, and return a one-dimensional array of the same size as the input. If, for example, x has length 3, F(x) must return an array with length 3. In that case, newton_krylov attempts to solve F(x) = [0, 0, 0].
The error that you got is the result of newton_krylov attempting to use the numerically computed Jacobian matrix of F with a function that expects the matrix to be square. Your function F has a Jacobian matrix with shape (40, 6), because the input has length 6 and the output has length 40.
By itself, newton_krylov is not the right function to use for solving a least-squares problem. A least-squares problem is a minimization problem, not a root-finding problem. (A solver such as newton_krylov might be used to implement a minimization algorithm, but I assume you are interested in using an existing solution rather than writing your own.)
You say you want to solve a least-squares problem, but then you say "i.e. finding x such that (Ax - b)**2 = 0." I assume that was just a bit of sloppiness in your description, because that is not the least-squares problem. The least-squares problem is to find x such that sum((Ax - b)**2) is minimized. (In general, there won't be an x that makes the sum of squares equal to zero.)
So, assuming you really want to find x such that sum((Ax - b)**2) is minimized, you can use scipy.linalg.lstsq.
For example:
In [54]: from scipy.linalg import lstsq
In [55]: A = np.random.uniform(0, 1, (40,6))
In [56]: b = np.arange(40)
In [57]: x, res, rank, s = lstsq(A, b)
In [58]: x
Out[58]:
array([ 5.07513787, 1.83858547, 18.07818853, 9.28805475,
6.13019155, -0.7045539 ])
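As a quick hedged check of that result (not part of the original answer): at the least-squares solution the residual is orthogonal to the columns of A, so A.T @ (A @ x - b) should be numerically zero.
import numpy as np
from scipy.linalg import lstsq

A = np.random.uniform(0, 1, (40, 6))
b = np.arange(40)
x, res, rank, s = lstsq(A, b)
print(np.allclose(A.T @ (A @ x - b), 0))   # True: the gradient of sum((Ax - b)**2) vanishes at x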
The Krylov method requires the Jacobian of the first argument (your function F(x)) to be a square matrix.
This seems like a homework question, but the answer would be to adjust matrix A so that it is square. Examples: https://docs.scipy.org/doc/scipy-0.14.0/reference/tutorial/optimize.html#kk
