Every matrix can be written in upper or lower triangular form simply by rotating the basis. Is there a simple routine in Python (NumPy) to do this? I was unable to find one, and I can't believe no such thing exists. To illustrate:
matrix = numpy.array([[a, b, c],
                      [d, e, f],
                      [g, h, i]])
to
matrix2 = numpy.array([[z, 0, 0],
                       [y, x, 0],
                       [v, u, t]])
The letters are floats. How do I make this change, not by simply zeroing the entries b, c and f, but by a correct rotation of the basis, in the simplest possible way?
Thank you!
You are looking for the Schur decomposition. It decomposes a matrix A as A = Q U Q^H, where U is an upper triangular matrix, Q is a unitary matrix (which effects the basis rotation), and Q^H is the Hermitian adjoint of Q.
import numpy as np
from scipy.linalg import schur

a = np.array([[1., 2., 3.], [4., 5., 6.], [7., 8., 9.]])
u, q = schur(a)  # u is upper triangular, q is the unitary matrix
print(repr(u))
# array([[ 1.61168440e+01,  4.89897949e+00,  1.58820582e-15],
#        [ 0.00000000e+00, -1.11684397e+00, -1.11643184e-15],
#        [ 0.00000000e+00,  0.00000000e+00, -1.30367773e-15]])
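As a quick sanity check, you can verify that the factors reconstruct the original matrix (a one-line check using the u and q from above):

print(np.allclose(q @ u @ q.conj().T, a))  # True: a == Q U Q^H up to round-off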
I would like to produce a 4D array from a 2D one by periodic shifts, in a way that can be summarized by the following:
uuvv[kx,ky,qx,qy] = uu[kx+qx,ky+qy]
This is easiest to illustrate with a "2D from 1D" MWE:
import numpy as np

Npts = 4

def pc(idx):
    # periodic (wrap-around) index, equivalent to idx % Npts
    return idx - Npts*int(idx/Npts)

uu = np.square(np.arange(Npts))
uv = np.zeros((Npts, Npts))
for kx in np.arange(Npts):
    for qx in np.arange(Npts):
        uv[kx, qx] = uu[pc(kx+qx)]
Here, the periodicity condition pc just brings the index back into the allowed range. The output for Npts=4 is:
array([[0., 1., 4., 9.],
       [1., 4., 9., 0.],
       [4., 9., 0., 1.],
       [9., 0., 1., 4.]])
So each row is the 1D array cyclically shifted by one more step. For the "4D from 2D" case, I could obviously use:
def pbc(idx):
    return idx - Npts*int(idx/Npts)

uv = np.zeros((Npts, Npts, Npts, Npts))
for kx in np.arange(Npts):
    for ky in np.arange(Npts):
        for qx in np.arange(Npts):
            for qy in np.arange(Npts):
                uv[kx, ky, qx, qy] = uu[pbc(kx+qx), pbc(ky+qy)]
However, using four loops is going to be slow, as I will be doing this multiple times for much larger arrays. How can I do this more efficiently?
Please note that, although the MWE's output could be reproduced by applying the square function to a suitable 2D array, that would not be a helpful solution. Using the MWE to illustrate: the goal is to apply the function as few times as possible (i.e. only on the 1D array) and then to build the 2D array without for loops. Ultimately, I will need to do this to generate a 4D array from a 2D array. How can I do this?
You can replicate the 2D array and then extract the shifted 2D sub-arrays (avoiding modulus and conditionals). Here is how to do that:
uuRep = np.tile(uu, (2, 2))
uv = np.zeros((Npts, Npts, Npts, Npts))
for kx in np.arange(Npts):
    for ky in np.arange(Npts):
        uv[kx, ky, :, :] = uuRep[kx:kx+Npts, ky:ky+Npts]
With Npts=64, this solution is about 1000 times faster.
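If you want to remove the remaining two loops as well, here is a fully vectorized sketch using fancy indexing (the index-table construction is my own illustration, not part of the answer above): build a table of wrapped index sums once, then broadcast it into both axis pairs.

ix = (np.arange(Npts)[:, None] + np.arange(Npts)[None, :]) % Npts  # ix[k, q] = (k+q) % Npts
uv = uu[ix[:, None, :, None], ix[None, :, None, :]]  # uv[kx,ky,qx,qy] = uu[ix[kx,qx], ix[ky,qy]]

Broadcasting the index table into the (kx, qx) and (ky, qy) axis pairs builds the whole 4D array in a single indexing operation.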
I am using Python 2.7 with SciPy to calculate a distance matrix for an array.
I don't understand how to find the distance values I want in the returned condensed matrix.
See this example:
from scipy.spatial.distance import pdist
import numpy as np
a = np.array([[1],[4],[0],[5]])
print a
print pdist(a)
will print
[ 3. 1. 4. 4. 1. 5.]
I found here that the ij entry in the condensed matrix should store the distance between the i-th and j-th entries, where i < j, which left me wondering whether "ij" means i*j or the concatenation of i and j, e.g. 1,2 -> 2 or 12.
I can't find a consistent way to determine the index I want.
In my example, if the first option were valid, you would expect all of the distances from entry 0 to every other entry to be stored at index 0.
Can anyone shed some light on how I can extract the distance from entry x to entry y? Which index am I looking for?
Thanks!
This vector is in condensed form. It enumerates all pairs of indices i < j in a natural order (in your example 0,1 0,2 0,3 1,2 1,3 2,3) and stores the distance between the elements at those entries.
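If you want the condensed index directly, the standard formula for n points is shown below (a small helper sketch; condensed_index is my own name, not a SciPy function):

def condensed_index(n, i, j):
    # position of the pair (i, j), with i < j, in pdist's condensed vector
    return n*i - i*(i + 1)//2 + (j - i - 1)

d = pdist(a)
print d[condensed_index(4, 0, 2)]  # 1.0, the distance between entries 0 and 2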
There is also the squareform function, which transforms the condensed form into a square matrix (and vice versa). The square matrix form is exactly what you expect: entry ij (row i, column j) stores the distance between the i-th and j-th entries. For example, if you add print squareform(pdist(a)) at the end of your code (importing squareform from scipy.spatial.distance), the output will be:
array([[ 0.,  3.,  1.,  4.],
       [ 3.,  0.,  4.,  1.],
       [ 1.,  4.,  0.,  5.],
       [ 4.,  1.,  5.,  0.]])
I know that my code is wrong because np.sum(abs(X), axis=1) also includes the diagonal value in each row sum, so my code will always report 'NOT diagonally dominant'. I have tried adding '- np.diag(X)', but I get an error message. Thank you in advance!
import numpy as np

A = np.array([[40.,  7.,  5.],
              [ 5., 90.,  7.],
              [20.,  7., 50.]])

def dd(X):
    Sum_values_in_given_row = np.sum(abs(X), axis=1)
    if np.all(abs(np.diag(X)) >= Sum_values_in_given_row):
        print 'matrix is diagonally dominant'
    else:
        print 'NOT diagonally dominant'
    return

dd(A)
To determine if a matrix is diagonally dominant, you have to check, for each row, whether the absolute value of the diagonal coefficient is at least the sum of the absolute values of the other coefficients in that row. Your check includes the diagonal coefficient in the row sum. As you mentioned, you need to subtract the diagonal coefficient from each row sum to make the check correct:
def dd(X):
    D = np.diag(np.abs(X))             # find the diagonal coefficients
    S = np.sum(np.abs(X), axis=1) - D  # find each row sum without the diagonal
    if np.all(D >= S):
        print 'matrix is diagonally dominant'
    else:
        print 'NOT diagonally dominant'
    return
Note that the code uses elementwise NumPy operations to subtract the corresponding diagonal coefficient from each row sum, so no explicit loop is needed.
The matrix A is diagonally dominant if |A_ii| ≥ Σ_{j≠i} |A_ij|, or equivalently, 2·|A_ii| ≥ Σ_j |A_ij|.
def is_diagonally_dominant(x):
    abs_x = np.abs(x)
    return np.all(2*np.diag(abs_x) >= np.sum(abs_x, axis=1))
    #             ^^ the factor of 2 adds |A_ii| back to the left-hand side
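A quick usage sketch on the matrix from the question (my own check; it prints True):

A = np.array([[40.,  7.,  5.],
              [ 5., 90.,  7.],
              [20.,  7., 50.]])
print(is_diagonally_dominant(A))  # True: 2*|diag| dominates every row sum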
What's wrong with
matrix = [[40.,  7.,  5.],
          [ 5., 90.,  7.],
          [20.,  7., 50.]]

def dd(mat):
    for numb, i in enumerate(mat):
        # diagonal entry vs. the rest of its row
        # (note: abs() is omitted, so this assumes non-negative entries)
        if mat[numb][numb] < sum(i) - mat[numb][numb]:
            return False
    return True

print(dd(matrix))
Is there a way in Python to do an efficient incremental update of a sparse matrix?

from scipy.sparse import lil_matrix

H = lil_matrix((n, m))
for (i, j) in zip(A, B):
    H[i, j] += compute_something

It seems that building a sparse matrix this way is quite slow (lil_matrix is the fastest sparse matrix type for it).
Is there a way (like using a dict of dicts or some other approach) to efficiently build the sparse matrix H?
In https://stackoverflow.com/a/27771335/901925 I explore incremental matrix assignment.
lil and dok are the recommended formats if you want to change values. csr will give you an efficiency warning, and coo does not allow indexing.
But I also found that dok indexing is slow compared to regular dictionary indexing. So for many changes it is better to build a plain dictionary (with the same tuple indexing), and build the dok matrix from that.
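A minimal sketch of that dict-first approach (my own illustration; the accumulation step is a placeholder for compute_something):

from collections import defaultdict
from scipy import sparse

d = defaultdict(float)
for (i, j) in zip(A, B):
    d[(i, j)] += 1.0              # placeholder for compute_something

rows, cols = zip(*d.keys())       # keys() and values() iterate in matching order
H = sparse.coo_matrix((list(d.values()), (rows, cols)), shape=(n, m)).todok()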
But if you can calculate the H data values with a fast numpy vector operation, as opposed to iteration, it is best to do so, and construct the sparse matrix from that (e.g. coo format). In fact even with iteration this would be faster:
import numpy as np
from scipy import sparse

h = np.zeros(A.shape)
for k, (i, j) in enumerate(zip(A, B)):
    h[k] = compute_something
H = sparse.coo_matrix((h, (A, B)), shape=(n, m))
e.g.
In [780]: A=np.array([0,1,1,2]); B=np.array([0,2,2,1])
In [781]: h=np.zeros(A.shape)
In [782]: for k, (i,j) in enumerate(zip(A,B)):
   .....:     h[k] = i+j+k
   .....:
In [783]: h
Out[783]: array([ 0., 4., 5., 6.])
In [784]: M=sparse.coo_matrix((h,(A,B)),shape=(4,4))
In [785]: M
Out[785]:
<4x4 sparse matrix of type '<class 'numpy.float64'>'
with 4 stored elements in COOrdinate format>
In [786]: M.A
Out[786]:
array([[ 0., 0., 0., 0.],
[ 0., 0., 9., 0.],
[ 0., 6., 0., 0.],
[ 0., 0., 0., 0.]])
Note that the (1,2) value is the sum 4+5; duplicate coordinates are summed like this as part of the coo to csr (or dense) conversion.
In this case I could have calculated h with:
In [791]: A+B+np.arange(A.shape[0])
Out[791]: array([0, 4, 5, 6])
so there's no need for iteration.
No, do not use csr_matrix or csc_matrix: they are going to be even slower than lil_matrix if you construct them incrementally. The Dictionary Of Keys based sparse matrix (dok_matrix) is exactly what you are looking for:
import numpy as np
from scipy.sparse import dok_matrix

S = dok_matrix((5, 5), dtype=np.float32)
for i in range(5):
    for j in range(5):
        S[i, j] = i + j  # update elements one at a time
A faster way would be:

from scipy.sparse import coo_matrix

H_ij = compute_something_vectorized()
H = coo_matrix((H_ij, (A, B))).tocsr()
The data for duplicate coordinates are then summed, see the docs for coo_matrix.
I wish to initialize a symmetric matrix in Python and populate it with zeros.
At the moment, I have initialized an array of known dimensions, but this is unsuitable for subsequent input into R as a distance matrix.
Are there any 'simple' methods in numpy to create a symmetric matrix?
Edit
I should clarify: creating the 'symmetric' matrix is fine. However, I am interested in generating only the lower triangular form, i.e.:
ar = numpy.zeros((3, 3))
array([[ 0.,  0.,  0.],
       [ 0.,  0.,  0.],
       [ 0.,  0.,  0.]])
I want:
array([[ 0],
       [ 0,  0],
       [ 0.,  0.,  0.]])
Is this possible?
I don't think it's feasible to try to work with that kind of triangular array.
So here is, for example, a straightforward implementation of (squared) pairwise Euclidean distances:
import numpy as np

def pdista(X):
    """Squared pairwise distances between all columns of X."""
    B = np.dot(X.T, X)         # Gram matrix of the columns
    q = np.diag(B)[:, None]    # squared norms of the columns
    return q + q.T - 2*B
Performance-wise it's hard to beat (at the Python level). What would be the main advantage of not using this approach?
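As a quick usage sketch (the random test data is my own), you can check pdista against SciPy's pdist/squareform:

from scipy.spatial.distance import pdist, squareform

rng = np.random.default_rng(0)
X = rng.random((3, 5))  # 5 points in 3 dimensions, one point per column
D1 = pdista(X)
D2 = squareform(pdist(X.T, metric='sqeuclidean'))  # pdist expects points as rows
print(np.allclose(D1, D2))  # True, up to floating-point round-off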