Different operations for different columns in a numpy matrix? - python

I have a square numpy matrix containing only 0s and 1s, and I have to apply a different operation depending on the column.
If a column contains only 0s, I have to replace those 0s with 1/number_of_columns (I use matrix.shape[1]); otherwise (if the column is not all 0s) I have to divide each element by the sum of the column.
In essence, after these operations the sum of each column must be 1.
I tried the following, but I get an error on the third line: the index returns a 3-dimensional structure.
a=numpy.nonzero(out_degree)
b=numpy.where(out_degree==0)
graph[:,b]=1/graph.shape[0]
graph[:,a]=graph/out_degree
graph is the numpy matrix, and out_degree is a vector that contains the sum of each column.
I have to use numpy without loops to save time.

A start would be:
import numpy as np
np.random.seed(1)
M, N = 5, 4
a = np.random.choice([0, 1, 2], size=(M, N), p=[0.6, 0.2, 0.2]).astype(float)
print(a)
a_inds = np.where(~a.any(axis=0))[0]
b_inds = np.setdiff1d(np.arange(N), a_inds, assume_unique=True)
b_col_sums = np.sum(a[:, b_inds], axis=0)
a[:, a_inds] = 1 / N
a[:, b_inds] /= b_col_sums
print(a)
Output:
[[ 0. 1. 0. 0.]
[ 0. 0. 0. 0.]
[ 0. 0. 0. 1.]
[ 0. 2. 0. 1.]
[ 0. 0. 0. 0.]]
[[ 0.25 0.33333333 0.25 0. ]
[ 0.25 0. 0.25 0. ]
[ 0.25 0. 0.25 0.5 ]
[ 0.25 0.66666667 0.25 0.5 ]
[ 0.25 0. 0.25 0. ]]
This should be easy to read and of medium performance. It's probably not the fastest because of a lot of fancy-indexing.
It also does not check the problematic cases of divide by zero (not part of your specification)!
Edit: OP is only interested in square-arrays, so the following is to be ignored!
You state "In essence, after these operations the sum of each column must be 1" and give the operation "replace these 0 with 1/number_of_the_columns", which is a contradiction for non-square arrays. Maybe you need to replace N with M in a[:, a_inds] = 1 / N.
Then you obtain:
[[ 0.2 0.33333333 0.2 0. ]
[ 0.2 0. 0.2 0. ]
[ 0.2 0. 0.2 0.5 ]
[ 0.2 0.66666667 0.2 0.5 ]
[ 0.2 0. 0.2 0. ]]

You can check each column for nonzero elements: if it has any, divide the column by its sum; otherwise fill it with the constant.
for col in range(a.shape[1]):
    if np.any(a[:, col]):
        a[:, col] /= np.sum(a[:, col])
    else:
        a[:, col] = 1/a.shape[1]
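For comparison, here is a minimal loop-free sketch of the same idea. It assumes the entries are non-negative (so a column sum of 0 means the column is all zeros), and normalize_columns is a hypothetical helper name, not something from the question. The empty columns are filled with 1 / number of rows so that every column sums to 1; for the square matrix in the question this equals 1 / number of columns.

```python
import numpy as np

def normalize_columns(graph):
    """Return a float copy whose columns each sum to 1 (hypothetical helper)."""
    graph = np.asarray(graph, dtype=float)
    out_degree = graph.sum(axis=0)   # column sums
    empty = out_degree == 0          # all-zero columns (entries assumed >= 0)
    # Divide by 1 instead of 0 in the empty columns to avoid divide-by-zero warnings.
    result = graph / np.where(empty, 1.0, out_degree)
    # Fill the empty columns with 1 / number of rows so they also sum to 1.
    result[:, empty] = 1.0 / graph.shape[0]
    return result

graph = np.array([[0, 1, 0],
                  [0, 0, 0],
                  [0, 1, 1]])
normalized = normalize_columns(graph)
print(normalized)
```

This uses one boolean mask instead of two index arrays, which sidesteps the 3-dimensional indexing error from the question.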

Related

How to populate an 2D array from a list left to right with the diagonal being all zeroes

I have a list:
idmatrixlist=[0.61, 0.63, 0.54, 0.82, 0.58, 0.57]
I need to populate an array from left to right while maintaining the zeroes on the diagonal so that the resulting array looks like.
I have tried the following code but it results in the wrong ordering of the entries.
lowertriangleidmatrix = np.zeros((4,4))
indexer = np.tril_indices(4,k=-1)
lowertriangleidmatrix[indexer] = idmatrixlist
print(lowertriangleidmatrix)
result:
[[0. 0. 0. 0. ]
[0.61 0. 0. 0. ]
[0.63 0.54 0. 0. ]
[0.82 0.58 0.57 0. ]]
How can this be re-ordered?
You can use triu_indices and invert the x/y:
lowertriangleidmatrix = np.zeros((4, 4))
indexer = np.triu_indices(4, k=1)[::-1]
# (array([1, 2, 3, 2, 3, 3]), array([0, 0, 0, 1, 1, 2]))
lowertriangleidmatrix[indexer] = idmatrixlist
print(lowertriangleidmatrix)
Output:
[[0. 0. 0. 0. ]
[0.61 0. 0. 0. ]
[0.63 0.82 0. 0. ]
[0.54 0.58 0.57 0. ]]
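The swap works because np.triu_indices lists the upper triangle row by row; exchanging the two index arrays turns that into the lower triangle listed column by column. A small sketch that wraps this up (fill_lower_by_columns is a hypothetical helper name):

```python
import numpy as np

def fill_lower_by_columns(values, n):
    """Hypothetical helper: fill the strict lower triangle column by column."""
    out = np.zeros((n, n))
    # triu_indices yields (row, col) pairs with row < col, in row-major order.
    upper_rows, upper_cols = np.triu_indices(n, k=1)
    # Swapping the roles gives lower-triangle positions in column-major order.
    out[upper_cols, upper_rows] = values
    return out

idmatrixlist = [0.61, 0.63, 0.54, 0.82, 0.58, 0.57]
filled = fill_lower_by_columns(idmatrixlist, 4)
print(filled)
```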

Modifying (keras/tensorflow) Tensors using numpy methods

I want to perform a specific operation. Namely, from a matrix:
A = np.array([[1,2],
[3,4]])
To the following
B = np.array([[1, 0, 0, 2, 0, 0],
[0, 1, 0, 0, 2, 0],
[0, 0, 1, 0, 0, 2],
[3, 0, 0, 4, 0, 0],
[0, 3, 0, 0, 4, 0],
[0, 0, 3, 0, 0, 4]])
Or in words: multiply every entry by the identity matrix and keep the same order.
Now I have accomplished this using numpy with the following code. Here N is the dimension of the starting matrix and M is the dimension of the identity matrix.
N = 2
M = 3
A = np.reshape(np.arange(1, 1+N ** 2), (N, N))
B = np.array([i * np.eye(M) for i in A.flatten()])
C = B.reshape(N, N, M, M).reshape(N, N * M, M).transpose([0, 2, 1]).reshape((N * M, N * M))
where C has my desired properties.
But now I want to do this modification in Keras/Tensorflow, where the matrix A is the outcome of one of my layers.
However, I am not sure yet if I will be able to properly create matrix B. Especially when batches are involved, I think I will somehow mess up the dimensions of my problem.
Can anyone with more Keras/Tensorflow experience comment on this 'reshape' and how he/she sees this happening within Keras/Tensorflow?
Here is a way to do that with TensorFlow:
import tensorflow as tf
data = tf.placeholder(tf.float32, [None, None])
n = tf.placeholder(tf.int32, [])
eye = tf.eye(n)
mult = data[:, tf.newaxis, :, tf.newaxis] * eye[tf.newaxis, :, tf.newaxis, :]
result = tf.reshape(mult, n * tf.shape(data))
with tf.Session() as sess:
    a = sess.run(result, feed_dict={data: [[1, 2], [3, 4]], n: 3})
    print(a)
Output:
[[1. 0. 0. 2. 0. 0.]
[0. 1. 0. 0. 2. 0.]
[0. 0. 1. 0. 0. 2.]
[3. 0. 0. 4. 0. 0.]
[0. 3. 0. 0. 4. 0.]
[0. 0. 3. 0. 0. 4.]]
By the way, you can do basically the same in NumPy, which should be faster than your current solution:
import numpy as np
data = np.array([[1, 2], [3, 4]])
n = 3
eye = np.eye(n)
mult = data[:, np.newaxis, :, np.newaxis] * eye[np.newaxis, :, np.newaxis, :]
result = np.reshape(mult, (n * data.shape[0], n * data.shape[1]))
print(result)
# The output is the same as above
EDIT:
I'll try to give some intuition about why/how this works, sorry if it's too long. It is not that hard but I think it's sort of tricky to explain. Maybe it is easier to see how the following multiplication works
import numpy as np
data = np.array([[1, 2], [3, 4]])
n = 3
eye = np.eye(n)
mult1 = data[:, :, np.newaxis, np.newaxis] * eye[np.newaxis, np.newaxis, :, :]
Now, mult1 is a sort of "matrix of matrices". If I give two indices, I will get the diagonal matrix for the corresponding element in the original one:
print(mult1[0, 0])
# [[1. 0. 0.]
# [0. 1. 0.]
# [0. 0. 1.]]
So you could say this matrix could be visualized like this:
| 1 0 0 | | 2 0 0 |
| 0 1 0 | | 0 2 0 |
| 0 0 1 | | 0 0 2 |
| 3 0 0 | | 4 0 0 |
| 0 3 0 | | 0 4 0 |
| 0 0 3 | | 0 0 4 |
However this is deceiving, because if you try to reshape this to the final shape the result is not the right one:
print(np.reshape(mult1, (n * data.shape[0], n * data.shape[1])))
# [[1. 0. 0. 0. 1. 0.]
# [0. 0. 1. 2. 0. 0.]
# [0. 2. 0. 0. 0. 2.]
# [3. 0. 0. 0. 3. 0.]
# [0. 0. 3. 4. 0. 0.]
# [0. 4. 0. 0. 0. 4.]]
The reason is that reshaping (conceptually) "flattens" the array first and then gives the new shape. But the flattened array in this case is not what you need:
print(mult1.ravel())
# [1. 0. 0. 0. 1. 0. 0. 0. 1. 2. 0. 0. 0. 2. 0. ...
You see, it first traverses the first submatrix, then the second, etc. What you want though is for it to traverse first the first row of the first submatrix, then the first row of the second submatrix, then second row of first submatrix, etc. So basically you want something like:
Take the first two submatrices (the ones with 1 and 2)
Take all the first rows ([1, 0, 0] and [2, 0, 0]).
Take the first of these ([1, 0, 0])
Take each of its elements (1, 0 and 0).
And then continue for the rest. So if you think about it, we are traversing first axis 0 (rows of the "matrix of matrices"), then 2 (rows of each submatrix), then 1 (columns of the "matrix of matrices") and finally 3 (columns of the submatrices). So we can just reorder the axes to do that:
mult2 = mult1.transpose((0, 2, 1, 3))
print(np.reshape(mult2, (n * data.shape[0], n * data.shape[1])))
# [[1. 0. 0. 2. 0. 0.]
# [0. 1. 0. 0. 2. 0.]
# [0. 0. 1. 0. 0. 2.]
# [3. 0. 0. 4. 0. 0.]
# [0. 3. 0. 0. 4. 0.]
# [0. 0. 3. 0. 0. 4.]]
And it works! So in the solution I posted, to avoid the transposing, I just set up the multiplication so that the order of the axes is exactly that:
mult = data[
    :,           # Matrix-of-matrices rows
    np.newaxis,  # Submatrix rows
    :,           # Matrix-of-matrices columns
    np.newaxis   # Submatrix columns
] * eye[
    np.newaxis,  # Matrix-of-matrices rows
    :,           # Submatrix rows
    np.newaxis,  # Matrix-of-matrices columns
    :            # Submatrix columns
]
I hope that makes it slightly clearer. To be honest, in this particular case I could come up with the solution quickly because I had to solve a similar problem not too long ago, and I guess you end up building an intuition for these things.
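As a cross-check, the block structure built above is what NumPy calls the Kronecker product of the data matrix with the identity, so np.kron should give the same array. A short sketch comparing the two (the variable names broadcast and kron are just for illustration):

```python
import numpy as np

data = np.array([[1, 2], [3, 4]])
n = 3
eye = np.eye(n)

# Broadcasting version from the answer: axes are
# (block row, row inside block, block column, column inside block).
mult = data[:, np.newaxis, :, np.newaxis] * eye[np.newaxis, :, np.newaxis, :]
broadcast = mult.reshape(n * data.shape[0], n * data.shape[1])

# Kronecker product: block (i, j) of the result is data[i, j] * eye.
kron = np.kron(data, eye)

print(np.array_equal(broadcast, kron))
```

np.kron is handy for checking results in NumPy; the broadcasting form is the one that carries over directly to TensorFlow ops.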
Another way to achieve the same effect in numpy is to use the following:
A = np.array([[1,2],
[3,4]])
B = np.repeat(np.repeat(A, 3, axis=0), 3, axis=1) * np.tile(np.eye(3), (2,2))
Then, to replicate it in TensorFlow, we can use tf.tile. At the time of writing there was no tf.repeat (newer TensorFlow releases do provide one), but someone has provided this function on the TensorFlow issue tracker:
def tf_repeat(tensor, repeats):
    """
    Args:
        tensor: A Tensor. 1-D or higher.
        repeats: A list. Number of repeats for each dimension, length must be the same as the number of dimensions in tensor
    Returns:
        A Tensor. Has the same type as tensor. Has the shape of tensor.shape * repeats
    """
    with tf.variable_scope("repeat"):
        expanded_tensor = tf.expand_dims(tensor, -1)
        multiples = [1] + list(repeats)
        tiled_tensor = tf.tile(expanded_tensor, multiples=multiples)
        repeated_tensor = tf.reshape(tiled_tensor, tf.shape(tensor) * repeats)
    return repeated_tensor
and thus the tensorflow implementation will look like the following. Here I also consider that the first dimension represents batches, and thus we do not operate on it.
N = 2
M = 3
nbatch = 2
Ain = np.reshape(np.arange(1, 1 + N*N*nbatch), (nbatch, N, N))
A = tf.placeholder(tf.float32, shape=(nbatch, N, N))
B = tf.tile(tf.eye(M), [N, N]) * tf_repeat(A, [1, M, M])
with tf.Session() as sess:
    print(sess.run(B, feed_dict={A: Ain}))
and the result:
[[[1. 0. 0. 2. 0. 0.]
[0. 1. 0. 0. 2. 0.]
[0. 0. 1. 0. 0. 2.]
[3. 0. 0. 4. 0. 0.]
[0. 3. 0. 0. 4. 0.]
[0. 0. 3. 0. 0. 4.]]
[[5. 0. 0. 6. 0. 0.]
[0. 5. 0. 0. 6. 0.]
[0. 0. 5. 0. 0. 6.]
[7. 0. 0. 8. 0. 0.]
[0. 7. 0. 0. 8. 0.]
[0. 0. 7. 0. 0. 8.]]]
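The batched result can also be reproduced in plain NumPy by adding a leading batch axis to the broadcasting trick, which is a convenient way to verify a TensorFlow implementation. A minimal sketch, reusing the names N, M and nbatch from above (blocks and C are illustrative names):

```python
import numpy as np

N, M, nbatch = 2, 3, 2
A = np.arange(1, 1 + N * N * nbatch).reshape(nbatch, N, N)
eye = np.eye(M)

# Axes: (batch, block row, row inside block, block column, column inside block).
blocks = A[:, :, np.newaxis, :, np.newaxis] * eye[np.newaxis, np.newaxis, :, np.newaxis, :]
# Merge each (block, inside-block) pair of axes; the batch axis is left untouched.
C = blocks.reshape(nbatch, N * M, N * M)
print(C)
```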

create 3D binary image

I have a 2D array, a, comprising a set of 100 x,y,z coordinates:
[[ 0.81 0.23 0.52]
[ 0.63 0.45 0.13]
...
[ 0.51 0.41 0.65]]
I would like to create a 3D binary image, b, with 101 pixels in each of the x,y,z dimensions, of coordinates ranging between 0.00 and 1.00.
Pixels at locations defined by a should take on a value of 1, all other pixels should have a value of 0.
I can create an array of zeros of the right shape with b = np.zeros((101,101,101)), but how do I assign coordinate and slice into it to create the ones using a?
First, start off by safely rounding your floats to ints. In context, see this question.
a_indices = np.rint(a * 100).astype(int)
Next, assign those indices in b to 1. Be careful to index with a tuple of the coordinate columns rather than with the 2D array itself, or else you'll trigger index-array indexing along the first axis only. (Older NumPy versions also accepted a list of arrays here; recent versions no longer treat a list as a multi-dimensional index, so use tuple.) It seems as though performance of this method is comparable to that of alternatives (thanks @Divakar! :-)
b[tuple(a_indices.T)] = 1
I created a small example with size 10 instead of 100, and 2 dimensions instead of 3, to illustrate:
>>> a = np.array([[0.8, 0.2], [0.6, 0.4], [0.5, 0.6]])
>>> a_indices = np.rint(a * 10).astype(int)
>>> b = np.zeros((10, 10))
>>> b[tuple(a_indices.T)] = 1
>>> print(b)
[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]
[ 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
You could do something like this -
# Get the XYZ indices
idx = np.round(100 * a).astype(int)
# Initialize o/p array
b = np.zeros((101,101,101))
# Assign into o/p array based on linear index equivalents from indices array
np.put(b,np.ravel_multi_index(idx.T,b.shape),1)
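To see that the tuple-indexing and linear-index approaches agree, here is a self-contained sketch on made-up coordinates (the random data and the names b_tuple and b_put are assumptions for illustration only):

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up coordinates on a 0.01 grid in [0, 1], standing in for the real array a.
a = rng.integers(0, 101, size=(100, 3)) / 100.0
idx = np.rint(a * 100).astype(int)

# Variant 1: multi-dimensional indexing with a tuple of index arrays.
b_tuple = np.zeros((101, 101, 101))
b_tuple[tuple(idx.T)] = 1

# Variant 2: convert (x, y, z) to flat indices and write with np.put.
b_put = np.zeros((101, 101, 101))
np.put(b_put, np.ravel_multi_index(tuple(idx.T), b_put.shape), 1)

print(np.array_equal(b_tuple, b_put))
```

Duplicate coordinates simply write the same voxel twice, so the number of ones can be smaller than the number of input points.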
Runtime on the assignment part -
Let's use a bigger grid for timing purposes.
In [82]: # Setup input and get indices array
...: a = np.random.randint(0,401,(100000,3))/400.0
...: idx = np.round(400 * a).astype(int)
...:
In [83]: b = np.zeros((401,401,401))
In [84]: %timeit b[list(idx.T)] = 1 # @Praveen soln
The slowest run took 42.16 times longer than the fastest. This could mean that an intermediate result is being cached.
1 loop, best of 3: 6.28 ms per loop
In [85]: b = np.zeros((401,401,401))
In [86]: %timeit np.put(b,np.ravel_multi_index(idx.T,b.shape),1) # From this post
The slowest run took 45.34 times longer than the fastest. This could mean that an intermediate result is being cached.
1 loop, best of 3: 5.71 ms per loop
In [87]: b = np.zeros((401,401,401))
In [88]: %timeit b[idx[:,0],idx[:,1],idx[:,2]] = 1 #Subscripted indexing
The slowest run took 40.48 times longer than the fastest. This could mean that an intermediate result is being cached.
1 loop, best of 3: 6.38 ms per loop

Numpy - Modal matrix and diagonal Eigenvalues

I wrote some simple linear algebra code in Python with NumPy to compute the diagonal matrix of eigenvalues as $M^{-1} A M$ (where M is the modal matrix), and it behaves strangely.
Here's the code:
import numpy as np
array = np.arange(16)
array = array.reshape(4, -1)
print(array)
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]
[12 13 14 15]]
eigenvalues, eigenvectors = np.linalg.eig(array)
print(eigenvalues)
[ 3.24642492e+01 -2.46424920e+00 1.92979794e-15 -4.09576009e-16]
print(eigenvectors)
[[-0.11417645 -0.7327781 0.54500164 0.00135151]
[-0.3300046 -0.28974835 -0.68602671 0.40644504]
[-0.54583275 0.15328139 -0.2629515 -0.8169446 ]
[-0.76166089 0.59631113 0.40397657 0.40914805]]
inverseEigenVectors = np.linalg.inv(eigenvectors) #M^(-1)
diagonal= inverseEigenVectors.dot(array).dot(eigenvectors) #M^(-1).A.M
print(diagonal)
[[ 3.24642492e+01 -1.06581410e-14 5.32907052e-15 0.00000000e+00]
[ 7.54951657e-15 -2.46424920e+00 -1.72084569e-15 -2.22044605e-16]
[ -2.80737213e-15 1.46768503e-15 2.33547852e-16 7.25592561e-16]
[ -6.22319863e-15 -9.69656080e-16 -1.38050658e-30 1.97215226e-31]]
The final 'diagonal' matrix should be a diagonal matrix with the eigenvalues on the main diagonal and zeros elsewhere, but it's not. The first two main-diagonal values are eigenvalues, but the last two aren't (although, just like the last two eigenvalues, they are nearly zero).
By the way, a number like $-1.06581410e-14$ is effectively zero, so how can I make numpy show it as zero?
What am I doing wrong?
Thanks...
Just round the final result to the desired number of digits:
print(diagonal.round(5))
array([[ 32.46425, 0. , 0. , 0. ],
[ 0. , -2.46425, 0. , 0. ],
[ 0. , 0. , 0. , 0. ],
[ 0. , 0. , 0. , 0. ]])
Don't confuse precision of computation and printing policies.
>>> diagonal[np.abs(diagonal)<0.0000000001]=0
>>> print(diagonal)
[[ 32.4642492 0. 0. 0. ]
[ 0. -2.4642492 0. 0. ]
[ 0. 0. 0. 0. ]
[ 0. 0. 0. 0. ]]
>>>
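Both answers boil down to treating round-off noise with a tolerance. The sketch below shows three ways to do that on a small made-up symmetric matrix (not the 4x4 matrix from the question): comparing with np.allclose, zeroing tiny entries in a copy, and changing only the print format. The tolerance 1e-10 is an arbitrary choice for illustration.

```python
import numpy as np

# Made-up symmetric example with eigenvalues 1 and 3.
A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigenvalues, M = np.linalg.eig(A)
diagonal = np.linalg.inv(M) @ A @ M   # M^(-1) A M

# 1) Compare with a tolerance instead of expecting exact zeros.
is_diagonalized = np.allclose(diagonal, np.diag(eigenvalues))
print(is_diagonalized)

# 2) Zero out round-off noise in a copy of the result.
cleaned = np.where(np.abs(diagonal) < 1e-10, 0.0, diagonal)
print(cleaned)

# 3) Leave the data alone and only suppress tiny values when printing.
with np.printoptions(precision=5, suppress=True):
    print(diagonal)
```

The third option keeps full precision in the array, which matters if the result feeds into further computation.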

Build diagonal matrix without using for loop

I am trying to build the following matrix in Python without using a for loop:
A
[[ 0.1 0.2 0. 0. 0. ]
[ 1. 2. 3. 0. 0. ]
[ 0. 1. 2. 3. 0. ]
[ 0. 0. 1. 2. 3. ]
[ 0. 0. 0. 4. 5. ]]
I tried the fill_diagonal method in NumPy (see matrix B below) but it does not give me the same matrix as shown in matrix A:
B
[[ 1. 0.2 0. 0. 0. ]
[ 0. 2. 0. 0. 0. ]
[ 0. 0. 3. 0. 0. ]
[ 0. 0. 0. 1. 0. ]
[ 0. 0. 0. 4. 5. ]]
Here is the Python code that I used to construct the matrices:
import numpy as np
import scipy.linalg as sp # maybe use scipy to build diagonal matrix?
#---- build diagonal square array using "for" loop
m = 5
A = np.zeros((m, m))
A[0, 0] = 0.1
A[0, 1] = 0.2
for i in range(1, m-1):
    A[i, i-1] = 1 # m-1
    A[i, i] = 2 # m
    A[i, i+1] = 3 # m+1
A[m-1, m-2] = 4
A[m-1, m-1] = 5
print('A \n', A)
#---- build diagonal square array without loop
B = np.zeros((m, m))
B[0, 0] = 0.1
B[0, 1] = 0.2
np.fill_diagonal(B, [1, 2, 3])
B[m-1, m-2] = 4
B[m-1, m-1] = 5
print('B \n', B)
So is there a way to construct a diagonal matrix like the one shown by matrix A without using a for loop?
There are functions for this in scipy.sparse, e.g.:
from scipy.sparse import diags
C = diags([1,2,3], [-1,0,1], shape=(5,5), dtype=float)
C = C.toarray()
C[0, 0] = 0.1
C[0, 1] = 0.2
C[-1, -2] = 4
C[-1, -1] = 5
Diagonal matrices are generally very sparse, so you could also keep it as a sparse matrix. This could even have large efficiency benefits, depending on the application.
The efficiency gains sparse matrices can give you depend very much on matrix size. For a 5x5 array it is hardly worth the bother. But for larger matrices, creating the array can be a lot faster with sparse matrices, as illustrated by the following example with an identity matrix (here sparse refers to scipy.sparse):
%timeit np.eye(3000)
# 100 loops, best of 3: 3.12 ms per loop
%timeit sparse.eye(3000)
# 10000 loops, best of 3: 79.5 µs per loop
But the real strength of the sparse matrix data type is shown when you need to do mathematical operations on arrays that are sparse:
%timeit np.eye(3000).dot(np.eye(3000))
# 1 loops, best of 3: 2.8 s per loop
%timeit sparse.eye(3000).dot(sparse.eye(3000))
# 1000 loops, best of 3: 1.11 ms per loop
Or when you need to work with some very large but sparse array:
np.eye(1E6)
# ValueError: array is too big.
sparse.eye(1E6)
# <1000000x1000000 sparse matrix of type '<type 'numpy.float64'>'
# with 1000000 stored elements (1 diagonals) in DIAgonal format>
Notice that, in the flattened matrix, the number of zeros between consecutive runs of nonzero values is always 3 (or some other constant, whenever you want a banded matrix like this):
In [10]:
import numpy as np
A1=[0.1, 0.2]
A2=[1,2,3]
A3=[4,5]
SPC=[0,0,0] # spacing zeros (or use np.zeros(3))
np.hstack((A1,SPC,A2,SPC,A2,SPC,A2,SPC,A3)).reshape(5,5)
Out[10]:
array([[ 0.1, 0.2, 0. , 0. , 0. ],
[ 1. , 2. , 3. , 0. , 0. ],
[ 0. , 1. , 2. , 3. , 0. ],
[ 0. , 0. , 1. , 2. , 3. ],
[ 0. , 0. , 0. , 4. , 5. ]])
In [11]:
import itertools #A more general way of doing it
np.hstack(list(itertools.chain(*[(item, SPC) for item in [A1, A2, A2, A2, A3]]))[:-1]).reshape(5,5)
Out[11]:
array([[ 0.1, 0.2, 0. , 0. , 0. ],
[ 1. , 2. , 3. , 0. , 0. ],
[ 0. , 1. , 2. , 3. , 0. ],
[ 0. , 0. , 1. , 2. , 3. ],
[ 0. , 0. , 0. , 4. , 5. ]])
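If SciPy is not available, a similar loop-free construction can be sketched with np.diag and its offset argument k, summing one array per band and then patching the first and last rows. This is an alternative to the answers above, not a restatement of them:

```python
import numpy as np

m = 5
# One array per band: sub-diagonal (k=-1), main diagonal (k=0), super-diagonal (k=1).
A = (np.diag(np.full(m - 1, 1.0), k=-1)
     + np.diag(np.full(m, 2.0))
     + np.diag(np.full(m - 1, 3.0), k=1))

# Overwrite the boundary rows, which differ from the interior pattern.
A[0, :2] = [0.1, 0.2]
A[-1, -2:] = [4, 5]
print(A)
```

This builds dense intermediate arrays, so for large m the scipy.sparse approach is likely the better fit.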
