Matrix similarity along one axis - python

I have 2D matrices and want to calculate a measure of similarity along the Y axis.
For example, the following matrix should yield 0:
[1, 0, 0, 0]
[0, 1, 0, 0]
[0, 0, 1, 0]
[0, 0, 0, 1]
While this one should yield 1:
[0, 1, 1, 0]
[0, 1, 1, 0]
[0, 1, 1, 0]
[0, 1, 1, 0]
In these examples I used binary values in the matrices, but in reality they are floats between 0 and 1. The matrices are much bigger and there is noise - the calculation has to be very fast as I have a large number of matrices to calculate for every experiment.
Right now I'm doing a Random PCA, keeping the first component as the measure of similarity. However, it is somewhat slow and I have the feeling that it is overkill. Any suggestions welcome!

The real problem here is how to define similarity.
I assume you define similarity as proportion of equal rows. That is, if you randomly take two different rows, what is the probability that those two rows are equal? This definition is the simplest I can think of that fits your example desired results.
If that's indeed what you want, it is easily computed as follows (MATLAB syntax), where A denotes the data matrix:
%// test all pairs of rows for equality
d = squeeze(all(bsxfun(@eq, A, permute(A, [3 2 1])), 2));
%// compute average, but removing similarity of each row with itself
result = (sum(d(:))-size(d,1))/(numel(d)-size(d,1));
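Since the question is tagged python, here is a rough NumPy translation of the MATLAB snippet above. It is only a sketch under the same assumption (similarity defined as the proportion of equal row pairs). The function name row_pair_similarity is made up for illustration. Note that exact equality will rarely hold for noisy floats, so a tolerance-based comparison such as np.isclose may be needed in practice, and that the broadcast builds an array of shape (rows, rows, columns), which can be large.

```python
import numpy as np

def row_pair_similarity(A):
    """Fraction of ordered pairs of distinct rows that are exactly equal.

    Hypothetical helper name; mirrors the MATLAB computation above.
    """
    A = np.asarray(A)
    # d[i, j] is True when row i equals row j in every column
    d = (A[:, None, :] == A[None, :, :]).all(axis=2)
    n = d.shape[0]
    # drop the n trivial matches of each row with itself
    return (d.sum() - n) / (d.size - n)

identity = np.eye(4)                      # first example in the question
stripes = np.tile([0, 1, 1, 0], (4, 1))   # second example in the question
print(row_pair_similarity(identity))  # 0.0
print(row_pair_similarity(stripes))   # 1.0
```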

Use all with axis=0 to get a boolean mask of the columns whose entries are all nonzero, then multiply the matrix by that mask:
Example:
mx
matrix([[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1]])
mx1
matrix([[0, 1, 1, 0],
[0, 1, 1, 0],
[0, 1, 1, 0]])
To apply:
# use .A to get plain arrays so that * is an elementwise (broadcast) product, not a matrix product
np.matrix(mx.A * mx.all(axis=0).A)
matrix([[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]])
The same for mx1:
np.matrix(mx1.A * mx1.all(axis=0).A)
matrix([[0, 1, 1, 0],
[0, 1, 1, 0],
[0, 1, 1, 0]])
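The same masking idea can be written with plain arrays, which avoids the .A conversions (the NumPy documentation no longer recommends np.matrix). This is a minimal sketch; the helper name keep_all_nonzero_columns is hypothetical, and the inputs are rebuilt here only so the snippet runs on its own.

```python
import numpy as np

mx = np.eye(4, dtype=int)
mx1 = np.tile([0, 1, 1, 0], (3, 1))

def keep_all_nonzero_columns(m):
    # mask[j] is True when every entry of column j is nonzero
    mask = m.all(axis=0)
    # broadcasting multiplies each row elementwise by the mask
    return m * mask

print(keep_all_nonzero_columns(mx))   # all zeros
print(keep_all_nonzero_columns(mx1))  # unchanged
```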

Related

Reshaping a matrix (numpy array) when matrices are multiplied on individual elements

I am trying to multiply each element of a 2x2 matrix, say [[1,1],[1,1]], by a 2x2 identity matrix. The problem is that numpy puts each identity matrix into a separate sub-array, which is not the result I need for further evaluation. I want the result to have 4 rows and 4 columns, but when I reshape it to (4,4) the values are offset and I get [1,0,1,0] on each row (see the image for the required and obtained results).
Thank you!
Image here
EDIT:
Thanks for the response and code.
I made a mistake formulating my question so I'll try one more time.
I have a matrix
I = [[1,0],[0,1]]
A = [
[4*I, 0*I],
[1*(-I), 1*I]
]
This should generate the result:
A = [
[4, 0, 0, 0],
[0, 4, 0, 0],
[-1, 0, 1, 0],
[0, -1, 0, 1]
]
Looks like you want the Kronecker product.
In [585]: np.kron(np.ones((2,2),int), np.eye(2,dtype=int))
Out[585]:
array([[1, 0, 1, 0],
[0, 1, 0, 1],
[1, 0, 1, 0],
[0, 1, 0, 1]])
You were trying to build the array from repeated uses of the identity matrix:
In [590]: np.array([I,I,I])
Out[590]:
array([[[1, 0],
[0, 1]],
[[1, 0],
[0, 1]],
[[1, 0],
[0, 1]]])
This is a (3,2,2) array: np.array joins the identity matrices along a new leading axis.
It is possible to transpose/reshape the (2,2,2,2) array produced by np.array([[I,I],[I,I]]), but I'll leave you with kron.
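For the matrix in the question's EDIT, a sketch along the same lines: the scalar coefficients of the blocks go into a small array (called coeffs here, a name chosen for illustration) and kron expands each one into a multiple of I. np.block and the transpose/reshape route mentioned above are shown next to it for comparison.

```python
import numpy as np

I = np.eye(2, dtype=int)

# scalar coefficient of each 2x2 block, read off the question's EDIT
coeffs = np.array([[4, 0],
                   [-1, 1]])

# Kronecker product: block (i, j) of the result is coeffs[i, j] * I
A_kron = np.kron(coeffs, I)

# np.block assembles the blocks as they are written in the question
A_block = np.block([[4 * I, 0 * I],
                    [-I, I]])

# transpose/reshape route: the stacked array has axes
# (block row, block col, row, col); swap the middle two before reshaping
stacked = np.array([[4 * I, 0 * I],
                    [-I, I]])            # shape (2, 2, 2, 2)
A_reshaped = stacked.transpose(0, 2, 1, 3).reshape(4, 4)

print(A_kron)
```

Reshaping the (2,2,2,2) array directly to (4,4) without the transpose is what interleaves the values incorrectly.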

How do I make a mask of diagonal matrix, but starting from the 2nd column?

So here is what I can get with torch.eye(3,4) now
The matrix I get:
[[1, 0, 0, 0],
[0, 1, 0, 0],
[0, 0, 1, 0]]
Is there any (easy)way to transform it, or make such a mask in this format:
The matrix I want:
[[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1]]
You can do it by using torch.diag and specifying the diagonal you want:
>>> torch.diag(torch.tensor([1,1,1]), diagonal=1)[:-1]
tensor([[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1]])
The diagonal argument selects which diagonal is filled:
If diagonal = 0, it is the main diagonal.
If diagonal > 0, it is above the main diagonal.
If diagonal < 0, it is below the main diagonal.
Here is another solution using torch.diagflat(), and using a positive offset for shifting/moving the diagonal above the main diagonal.
# diagonal values to fill
In [253]: diagonal_vals = torch.ones(3, dtype=torch.long)
# desired tensor but ...
In [254]: torch.diagflat(diagonal_vals, offset=1)
Out[254]:
tensor([[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1],
[0, 0, 0, 0]])
The above operation gives us a square matrix; however, we need a non-square matrix of shape (3,4). So, we'll just ignore the last row with simple indexing:
# shape (3, 4) with 1's above the main diagonal
In [255]: torch.diagflat(diagonal_vals, offset=1)[:-1]
Out[255]:
tensor([[0, 1, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 1]])
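Another possibility, sketched here as an alternative rather than as part of the answers above, is to start from the torch.eye(3, 4) in the question and shift its columns with torch.roll. This only works because the last column of torch.eye(3, 4) is all zeros, so nothing meaningful wraps around.

```python
import torch

eye = torch.eye(3, 4, dtype=torch.long)

# shift every column one position to the right; the last column wraps
# around to the front, which is harmless here because it is all zeros
mask = torch.roll(eye, shifts=1, dims=1)
print(mask)
```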

Changing the order of a matrix in numpy

I have a matrix
test = np.array([[0,1,0,0],[1,0,1,1],[0,1,0,1],[0,1,1,0]])
How do I reorder the columns so that they are like this matrix? (Basically the last row becomes the first row in reverse order and so on...)
np.array([[0,1,1,0],[1,0,1,0],[1,1,0,1],[0,0,1,0]])
Just reverse both axes:
test[::-1,::-1]
array([[0, 1, 1, 0],
[1, 0, 1, 0],
[1, 1, 0, 1],
[0, 0, 1, 0]])
Update (ahh... Okay, I think I understand now.)
You can use a negative step in both the row slice and the column slice.
test[::-1, ::-1]
Output:
array([[0, 1, 1, 0],
[1, 0, 1, 0],
[1, 1, 0, 1],
[0, 0, 1, 0]])
To reverse both the rows and the columns you can also use np.flip with both axes. In your case:
test = np.array([[0,1,0,0],[1,0,1,1],[0,1,0,1],[0,1,1,0]])
# named flipped to avoid shadowing the built-in reversed
flipped = np.flip(test, axis=(0, 1))
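A quick check that the slicing and np.flip approaches agree with the matrix requested in the question. np.rot90 with two quarter turns is included as a third equivalent, since reversing both axes is a 180 degree rotation. The variable names are illustrative only.

```python
import numpy as np

test = np.array([[0, 1, 0, 0],
                 [1, 0, 1, 1],
                 [0, 1, 0, 1],
                 [0, 1, 1, 0]])
expected = np.array([[0, 1, 1, 0],
                     [1, 0, 1, 0],
                     [1, 1, 0, 1],
                     [0, 0, 1, 0]])

by_slicing = test[::-1, ::-1]           # negative step on both axes
by_flip = np.flip(test, axis=(0, 1))    # same thing via np.flip
by_rot = np.rot90(test, 2)              # two quarter turns = 180 degrees

print(np.array_equal(by_slicing, expected))  # True
```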

numpy roll along a single axis

I have a numpy array with binary values that I need to change in the following way: The value of every element must be shifted one column to the left but only within the same row. As an example, I have the following array:
>>> arr = np.array([[0,0,1,0],[1,0,0,0],[0,0,1,1]])
>>> arr
array([[0, 0, 1, 0],
[1, 0, 0, 0],
[0, 0, 1, 1]])
And it needs to be transformed to:
>>> arr
array([[0, 1, 0, 0],
[0, 0, 0, 1],
[0, 1, 1, 0]])
I know that np.roll(arr,-1) would roll the values one cell to the left, but it doesn't seem to be able to roll them within the rows they belong to (i.e. the element on cell [1,0] goes to [0,3] instead of the desired [1,3]). Is there a way of doing this?
Thanks in advance.
roll accepts an axis parameter:
np.roll(arr,-1, axis=1)
array([[0, 1, 0, 0],
[0, 0, 0, 1],
[0, 1, 1, 0]])
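To see why the axis argument matters, the sketch below compares both calls on the array from the question. Without axis, np.roll flattens the array, shifts it, and restores the shape, so values cross row boundaries.

```python
import numpy as np

arr = np.array([[0, 0, 1, 0],
                [1, 0, 0, 0],
                [0, 0, 1, 1]])

flat_roll = np.roll(arr, -1)          # shifts the flattened array
row_roll = np.roll(arr, -1, axis=1)   # shifts within each row

print(flat_roll)
# [[0 1 0 1]
#  [0 0 0 0]
#  [0 1 1 0]]
print(row_roll)
# [[0 1 0 0]
#  [0 0 0 1]
#  [0 1 1 0]]
```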

How to toggle theano matrix based on vector of int position

Using theano tensor operations, how can I toggle one cell on each row of a matrix based on an integer position indicator at the corresponding row index of a vector (i.e. |v| = rows of the matrix)? For example, given a 100x5 matrix of zeros
M = [
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
...
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]
] # |M| = 100x5
and a 100-element vector of integers in the range [0, 4],
V = [2, 4, ..., 0, 2] # |V| = 100, max(V) = 4, min(V) = 0
update (or create another) matrix M to
M = [
[0, 0, 1, 0, 0],
[0, 0, 0, 0, 1],
...
[1, 0, 0, 0, 0],
[0, 0, 1, 0, 0]
] # |M| = 100x5
(I know how to do this iteratively using conventional code, but I want to run it as part of an algorithm on the GPU without complicating my input, which is currently the vector V, so a direct theano implementation would be great.)
I figured out the answer myself. This operation is known as one-hot and it is supported as the "to_one_hot" in Theano's extra_ops package. Code:
M_one_hot = theano.tensor.extra_ops.to_one_hot(V, 5, dtype='int32')
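To illustrate what the one-hot operation computes, here is a small NumPy sketch. It is not Theano code and runs on the CPU; it only shows the expected result. The four-element V is a shortened stand-in for the 100-element vector in the question.

```python
import numpy as np

V = np.array([2, 4, 0, 2])   # short stand-in for the 100-element vector
n_classes = 5

# explicit version: set M[i, V[i]] = 1 for every row i
M = np.zeros((len(V), n_classes), dtype='int32')
M[np.arange(len(V)), V] = 1

# compact version: pick row V[i] of the identity matrix for every i
M_eye = np.eye(n_classes, dtype='int32')[V]

print(M)
# [[0 0 1 0 0]
#  [0 0 0 0 1]
#  [1 0 0 0 0]
#  [0 0 1 0 0]]
```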
