I'm working on a project and i'm doing additions on diagonal submatrixes of a square matrix of big size (180x180) with a "kernel" square matrix of size around 40*40.
size1 = matrix1.shape[0] #square matrix shape 180x180
sizeKernel = kernelMatrix.shape[0] # square matrix shape 40x40
newMatrix1 = matrix1.copy()
np.fill_diagonal(newMatrix1, 0)
for i in range(size1):
newMatrix1[i:i+sizeKernel,i:i+sizeKernel] += matrix1[i][i] * kernelMatrix[:size1-i, :size1-i]
The goal is to distribute diagonal coefficients with the kernel matrix to other coefficients (a kind of partial convolution).
As an example, let's say the matrix is size 4x4 and the kernel is size 2x2:
a = [[1,2,1,0],
[3,-1,0,0],
[0,1,1,1],
[0,2,0,2]]
kern = [[1,2],
[0,1]]
we create the matrix res by copying every coef of a but the diagonal coefs:
res = [[0,2,1,0],
[3,0,0,0],
[0,1,0,1],
[0,2,0,0]]
we distribute every coefficient at diagonal to the bottom right submatrix of same size as kernel (if not possible we just take the remaining and cut the kernel matrix)
For the first coefficient of diagonal it gives:
res[0:2, 0:2] = res[0:2,0:2] + a[0,0]*kern
res[0:2, 0:2] = [[0,2] + 1 * [[1,2],
[3,0]] [0,1]]
res = [[1,4,1,0],
[3,1,0,0],
[0,1,0,1],
[0,2,0,0]]
Then we do the same idea for the other coefs. For the last coef of diagonal, instead of taking the whole kernel, we take apply the same thing extending the elements of the matrix with 0s.
As it is not vectorized, it takes quite a long time to compute this.
Another way to compute this would be to use directly the scipy method fftconvolve or convolve2d only on the diagonal matrix extracted, and add the result to the matrix with 0s on the diagonal:
newMatrix = matrix1.copy()
matrix1Size = matrix1.shape[0]
np.fill_diagonal(newMatrix , 0)
newMatrix += scipy.signal.fftconvolve(matrix1, kernelMatrix)[:matrix1Size,:matrixSize]
but it takes approximately as much time as the first method (of course their are more calculation if we do the full convolution of a whole matrix)
Is there any method to vectorize this calculation on submatrixes?
Thank you in advance!
I have a 2D means matrix in size n*m, where n is number of samples and m is the dimension of the data.
I have as well n matrices of m*m, namely sigma is my variance matrix in shape n*m*m.
I wish to sample n samples from a the distributions above, such that x_i~N(mean[i], sigma[i]).
Any way to do that in numpy or any other standard lib w/o running with a for loop?
The only option I thought was using np.random.multivariate_normal() by flatting the means matrix to one vector, and flatten the 3D sigma to a 2D blocks-diagonal matrix. And of course reshape afterwards. But that means we are going the sample with sigma in shape (n*m)*(n*m) which can easily be ridiculously huge, and only computing and allocating that matrix (if possible) can take longer than running in a for loop.
In my specific task, right now Sigma is the same matrix for all the samples, means I can express Sigma in m*m, and it is the same one for all n points. But I am interested in a general solution.
Appreciate your help.
Difficult to tell without testable code, but this should be close:
A = numpy.linalg.cholesky(sigma) # => shape (n, m, m), same as sigma
Z = np.random.normal(size = (n, m)) # shape (n, m)
X = np.einsum('ijk, ik -> ij', A, Z) + mean # shape (n, m)
What's going on:
We're manually sampling multivariate normal distributions according to the standard Cholesky decomposition method outlined here. A is built such that A#A.T = sigma. Then X (the multivariate normal) can be formed by the dot product of A and a univariate normal N(0, 1) vector Z, plus the mean.
You keep the extraneous dimension throughout the calculation in the first (index = 0, 'i' in the einsum) axis, while contracting the last ('k') axis, forming the dot product.
I am trying to implement a multivariate Gaussian Mixture Model and am trying to calculate the probability distribution function using tensors. There are n data points, k clusters, and d dimensions. So far, I have two tensors. One is a (n,k,d) tensor of centered data points and the other is a kxdxd tensor of covariance matricies. I can compute an nxk matrix of probabilities by doing
centered = np.repeat(points[:,np.newaxis,:],K,axis=1) - mu[np.newaxis,:] # KxNxD
prob = np.zeros(n,k)
constant = 1/2/np.pow(np.pi, d/2)
for n in range(centered.shape[1]):
for k in range(centered.shape[0]):
p = centered[n,k,:][np.newaxis] # 1xN
power = -1/2*(p # np.linalg.inv(sigma[k,:,:]) # p.T)
prob[n,k] = constant * np.linalg.det(sigma[k,:,:]) * np.exp(power)
where sigma is the triangularized kxdxd matrix of covariances and centered are mypoints. What is a more pythonic way of doing this using numpy's tensor capabilites?
Just a couple of quick observations:
I don't see you using p in the loop; is this a mistake? Using n instead?
The T in centered[n,k,:].T does nothing; with that index the array is 1d
I'm not sure if np.linal.inv can handle batches of arrays, allowing np.linalg.inv(sigma).
# allows batches, just so long as the last 2 dim are the ones entering into the dot (with the usual last of A, 2nd to the last of B rule; einsum can also be used.
again does np.linalg.det handle batches?
I am trying to convert a RGB image to Grayscale using the following paper.
The main algorithm using in the paper is this:
Novel PCA based algorithm to convert images to grayscale
However, when I am trying to extract eigen vectors from the image I am getting 500 eigen values, instead of 3 as required. As far as I know, a NxN matrix usually gives N Eigen vectors, but I am not really sure what I should be doing here to get only 3 Eigen vectors.
Any help as to what I should do? Here's my code so far:
import numpy as np
import cv2
def pca_rgb2gray(img):
"""
NOVEL PCA-BASED COLOR-TO-GRAY IMAGE CONVERSION
Authors:
-Ja-Won Seo
-Seong Dae Kim
2013 IEEE International Conference on Image Processing
"""
I_re = cv2.resize(img, (500,500))
Iycc = cv2.cvtColor(I_re, cv2.COLOR_BGR2YCrCb)
Izycc = Iycc - Iycc.mean()
eigvals = []
eigvecs = []
final_im = []
for i in range(3):
res = np.linalg.eig(Izycc[:,:,i])
eigvals.append(res[0])
eigvecs.append(res[1])
eignorm = np.linalg.norm(eigvals)
for i in range(3):
eigvals[i]/=eignorm
eigvecs[i]/=np.linalg.norm(eigvecs[i])
temp = eigvals[i] * np.dot(eigvecs[i], Izycc[:,:,i])
final_im.append(temp)
final_im = final_im[0] + final_im[1] + final_im[2]
return final_im
if __name__ == '__main__':
img = cv2.imread('image.png')
gray = pca_rgb2gray(img)
The accepted answer by Ahmed unfortunately has the PCA math wrong, leading to the a result quite different to the manuscript. Here are the images screen captured from the manuscript.
The mean centring and SVD should be done along the other dimension, with the channels treated as the different samples. The mean centring is aimed at getting an average pixel response of zero, not an average channel response of zero.
The linked algorithm also clearly states that the projection of the PCA model involves multiplication of the image by the scores first and this product by the eigenvalues, not the other way round as in the other answer.
For further info on the math see my PCA math answer here
The difference in the code can be seen in the outputs. Since the manuscript did not provide an example output (that I found) there may be subtle differences between the results as the manuscript ones are captured screenshots.
For comparison, the downloaded colour file, which is a little more contrasted than the screenshot, so one would expect the same from the output greyscale.
First the result from Ahmed's code:
Then the result from the updated code:
The corrected code (based on Ahmed's for ease of comparison) is
import numpy as np
import cv2
from numpy.linalg import svd, norm
# Read input image
Ibgr = cv2.imread('path/peppers.jpg')
#Convert to YCrCb
Iycc = cv2.cvtColor(Ibgr, cv2.COLOR_BGR2YCR_CB)
# Reshape the H by W by 3 array to a 3 by N array (N = W * H)
Izycc = Iycc.reshape([-1, 3]).T
# Remove mean along Y, Cr, and Cb *separately*!
Izycc = Izycc - Izycc.mean(0) #(1)[:, np.newaxis]
# Mean across channels is required (separate means for each channel is not a
# mathematically sensible idea) - each pixel's variation should centre around 0
# Make sure we're dealing with zero-mean data here: the mean for Y, Cr, and Cb
# should separately be zero. Recall: Izycc is 3 by N array.
# Original assertion was based on a false presmise. Mean value for each pixel should be 0
assert(np.allclose(np.mean(Izycc, 0), 0.0))
# Compute data array's SVD. Ignore the 3rd return value: unimportant in this context.
(U, S, L) = svd(Izycc, full_matrices=False)
# Square the data's singular vectors to get the eigenvalues. Then, normalize
# the three eigenvalues to unit norm and finally, make a diagonal matrix out of
# them.
eigvals = np.diag(S**2 / norm(S**2))
# Eigenvectors are just the right-singular vectors.
eigvecs = U;
# Project the YCrCb data onto the principal components and reshape to W by H
# array.
# This was performed incorrectly, the published algorithm shows that the eigenvectors
# are multiplied by the flattened image then scaled by eigenvalues
Igray = np.dot(eigvecs.T, np.dot(eigvals, Izycc)).sum(0).reshape(Iycc.shape[:2])
Igray2 = np.dot(eigvals, np.dot(eigvecs, Izycc)).sum(0).reshape(Iycc.shape[:2])
eigvals3 = eigvals*[1,-1,1]
Igray3 = np.dot(eigvals3, np.dot(eigvecs, Izycc)).sum(0).reshape(Iycc.shape[:2])
eigvals4 = eigvals*[1,-1,-1]
Igray4 = np.dot(eigvals4, np.dot(eigvecs, Izycc)).sum(0).reshape(Iycc.shape[:2])
# Rescale Igray to [0, 255]. This is a fancy way to do this.
from scipy.interpolate import interp1d
Igray = np.floor((interp1d([Igray.min(), Igray.max()],
[0.0, 256.0 - 1e-4]))(Igray))
Igray2 = np.floor((interp1d([Igray2.min(), Igray2.max()],
[0.0, 256.0 - 1e-4]))(Igray2))
Igray3 = np.floor((interp1d([Igray3.min(), Igray3.max()],
[0.0, 256.0 - 1e-4]))(Igray3))
Igray4 = np.floor((interp1d([Igray4.min(), Igray4.max()],
[0.0, 256.0 - 1e-4]))(Igray4))
# Make sure we don't accidentally produce a photographic negative (flip image
# intensities). N.B.: `norm` is often expensive; in real life, try to see if
# there's a more efficient way to do this.
if norm(Iycc[:,:,0] - Igray) > norm(Iycc[:,:,0] - (255.0 - Igray)):
Igray = 255 - Igray
if norm(Iycc[:,:,0] - Igray2) > norm(Iycc[:,:,0] - (255.0 - Igray2)):
Igray2 = 255 - Igray2
if norm(Iycc[:,:,0] - Igray3) > norm(Iycc[:,:,0] - (255.0 - Igray3)):
Igray3 = 255 - Igray3
if norm(Iycc[:,:,0] - Igray4) > norm(Iycc[:,:,0] - (255.0 - Igray4)):
Igray4 = 255 - Igray4
# Display result
if True:
import pylab
pylab.ion()
fGray = pylab.imshow(Igray, cmap='gray')
# Save result
cv2.imwrite('peppers-gray.png', Igray.astype(np.uint8))
fGray2 = pylab.imshow(Igray2, cmap='gray')
# Save result
cv2.imwrite('peppers-gray2.png', Igray2.astype(np.uint8))
fGray3 =pylab.imshow(Igray3, cmap='gray')
# Save result
cv2.imwrite('peppers-gray3.png', Igray3.astype(np.uint8))
fGray4 =pylab.imshow(Igray4, cmap='gray')
# Save result
cv2.imwrite('peppers-gray4.png', Igray4.astype(np.uint8))
****EDIT*****
Following Nazlok's query about the instability of eigenvector direction (which direction any one eigenvectors is oriented in is arbitrary, so there is not guarantee that different algorithms (or single algorithms without a reproducible standardisation step for orientation) would give the same result. I have now added in two extra examples, where I have simply switched the sign of the eigenvectors (number 2 and numbers 2 and 3). The results are again different, with the switching of only PC2 giving a much lighter tone, while switching 2 and 3 is similar (not surprising as the exponential scaling relegates the influence of PC3 to very little). I'll leave that last one for people bothered to run the code.
Conclusion
Without clear additional steps taken to provide a repeatable and reproducible orientation of PCs this algorithm is unstable and I personally would not be comfortable employing it as is. Nazlok's suggestion of using the balance of positive and negative intensities could provide a rule but would need validated so is out of scope of this answer. Such a rule however would not guarantee a 'best' solution, just a stable one. Eigenvectors are unit vectors, so are balanced in variance (square of intensity). Which side of zero has the largest sum of magnitudes is only telling us which side has individual pixels contributing larger variances which I suspect is generally not very informative.
Background
When Seo and Kim ask for lambda_i, v_i <- PCA(Iycc), for i = 1, 2, 3, they want:
from numpy.linalg import eig
lambdas, vs = eig(np.dot(Izycc, Izycc.T))
for a 3×N array Izycc. That is, they want the three eigenvalues and eigenvectors of the 3×3 covariance matrix of Izycc, the 3×N array (for you, N = 500*500).
However, you almost never want to compute the covariance matrix, then find its eigendecomposition, because of numerical instability. There is a much better way to get the same lambdas, vs, using the singular value decomposition (SVD) of Izycc directly (see this answer). The code below shows you how to do this.
Just show me the code
First download http://cadik.posvete.cz/color_to_gray_evaluation/img/155_5572_jpg/155_5572_jpg.jpg and save it as peppers.jpg.
Then, run the following:
import numpy as np
import cv2
from numpy.linalg import svd, norm
# Read input image
Ibgr = cv2.imread('peppers.jpg')
# Convert to YCrCb
Iycc = cv2.cvtColor(Ibgr, cv2.COLOR_BGR2YCR_CB)
# Reshape the H by W by 3 array to a 3 by N array (N = W * H)
Izycc = Iycc.reshape([-1, 3]).T
# Remove mean along Y, Cr, and Cb *separately*!
Izycc = Izycc - Izycc.mean(1)[:, np.newaxis]
# Make sure we're dealing with zero-mean data here: the mean for Y, Cr, and Cb
# should separately be zero. Recall: Izycc is 3 by N array.
assert(np.allclose(np.mean(Izycc, 1), 0.0))
# Compute data array's SVD. Ignore the 3rd return value: unimportant.
(U, S) = svd(Izycc, full_matrices=False)[:2]
# Square the data's singular vectors to get the eigenvalues. Then, normalize
# the three eigenvalues to unit norm and finally, make a diagonal matrix out of
# them. N.B.: the scaling factor of `norm(S**2)` is, I believe, arbitrary: the
# rest of the algorithm doesn't really care if/how the eigenvalues are scaled,
# since we will rescale the grayscale values to [0, 255] anyway.
eigvals = np.diag(S**2 / norm(S**2))
# Eigenvectors are just the left-singular vectors.
eigvecs = U;
# Project the YCrCb data onto the principal components and reshape to W by H
# array.
Igray = np.dot(eigvecs.T, np.dot(eigvals, Izycc)).sum(0).reshape(Iycc.shape[:2])
# Rescale Igray to [0, 255]. This is a fancy way to do this.
from scipy.interpolate import interp1d
Igray = np.floor((interp1d([Igray.min(), Igray.max()],
[0.0, 256.0 - 1e-4]))(Igray))
# Make sure we don't accidentally produce a photographic negative (flip image
# intensities). N.B.: `norm` is often expensive; in real life, try to see if
# there's a more efficient way to do this.
if norm(Iycc[:,:,0] - Igray) > norm(Iycc[:,:,0] - (255.0 - Igray)):
Igray = 255 - Igray
# Display result
if True:
import pylab
pylab.ion()
pylab.imshow(Igray, cmap='gray')
# Save result
cv2.imwrite('peppers-gray.png', Igray.astype(np.uint8))
This produces the following grayscale image, which seems to match the result in Figure 4 of the paper (though see caveat at the bottom of this answer!):
Errors in your implementation
Izycc = Iycc - Iycc.mean() WRONG. Iycc.mean() flattens the image and computes the mean. You want Izycc such that the Y channel, Cr channel, and Cb channel all have zero-mean. You could do this in a for dim in range(3)-loop, but I did it above with array broadcasting. I also have an assert above to make sure this condition holds. The trick where you get the eigendecomposition of the covariance matrix from the SVD of the data array requires zero-mean Y/Cr/Cb channels.
np.linalg.eig(Izycc[:,:,i]) WRONG. The contribution of this paper is to use principal components to convert color to grayscale. This means you have to combine the colors. The processing you were doing above was on a channel-by-channel basis—no combination of colors. Moreover, it was totally wrong to decompose the 500×500 array: the width/height of the array don’t matter, only pixels. For this reason, I reshape the three channels of the input into 3×whatever and operate on that matrix. Make sure you understand what’s happening after BGR-to-YCrCb conversion and before the SVD.
Not so much an error but a caution: when calling numpy.linalg.svd, the full_matrices=False keyword is important: this makes the “economy-size” SVD, calculating just three left/right singular vectors and just three singular values. The full-sized SVD will attempt to make an N×N array of right-singular vectors: with N = 114270 pixels (293 by 390 image), an N×N array of float64 will be N ** 2 * 8 / 1024 ** 3 or 97 gigabytes.
Final note
The magic of this algorithm is really in a single line from my code:
Igray = np.dot(eigvecs.T, np.dot(eigvals, Izycc)).sum(0) # .reshape...
This is where The Math is thickest, so let’s break it down.
Izycc is a 3×N array whose rows are zero-mean;
eigvals is a 3×3 diagonal array containing the eigenvalues of the covariance matrix dot(Izycc, Izycc.T) (as mentioned above, computed via a shortcut, using SVD of Izycc),
eigvecs is a 3×3 orthonormal matrix whose columns are the eigenvectors corresponding to those eigenvalues of that covariance.
Because these are Numpy arrays and not matrixes, we have to use dot(x,y) for matrix-matrix-multiplication, and then we use sum, and both of these obscure the linear algebra. You can check for yourself but the above calculation (before the .reshape() call) is equivalent to
np.ones([1, 3]) · eigvecs.T · eigvals · Izycc = dot([[-0.79463857, -0.18382267, 0.11589724]], Izycc)
where · is true matrix-matrix-multiplication, and the sum is replaced by pre-multiplying by a row-vector of ones. Those three numbers,
-0.79463857 multiplying each pixels’s Y-channel (luma),
-0.18382267 multiplying Cr (red-difference), and
0.11589724 multiplying Cb (blue-difference),
specify the “perfect” weighted average, for this particular image: each pixel’s Y/Cr/Cb channels are being aligned with the image’s covariance matrix and summed. Numerically speaking, each pixel’s Y-value is slightly attenuated, its Cr-value is significantly attenuated, and its Cb-value is even more attenuated but with an opposite sign—this makes sense, we expect the luma to be most informative for a grayscale so its contribution is the highest.
Minor caveat
I’m not really sure where OpenCV’s RGB to YCrCb conversion comes from. The documentation for cvtColor, specifically the section on RGB ↔︎ YCrCb JPEG doesn’t seem to correspond to any of the transforms specified on Wikipedia. When I use, say, the Colorspace Transformations Matlab package to just do the RGB to YCrCb conversion (which cites the Wikipedia entry), I get a nicer grayscale image which appears to be more similar to the paper’s Figure 4:
I’m totally out of my depth when it comes to these color transformations—if someone can explain how to get Wikipedia or Matlab’s Colorspace Transformations equivalents in Python/OpenCV, that’d be very kind. Nonetheless, this caveat is about preparing the data. After you make Izycc, the 3×N zero-mean data array, the above code fully-specifies the remaining processing.