I have a 2D array, a, comprising a set of 100 x,y,z coordinates:
[[ 0.81 0.23 0.52]
[ 0.63 0.45 0.13]
...
[ 0.51 0.41 0.65]]
I would like to create a 3D binary image, b, with 101 pixels in each of the x,y,z dimensions, corresponding to coordinates ranging from 0.00 to 1.00.
Pixels at locations defined by a should take on a value of 1, all other pixels should have a value of 0.
I can create an array of zeros of the right shape with b = np.zeros((101,101,101)), but how do I index into it using a to set the corresponding pixels to 1?
First, start off by safely rounding your floats to ints:
a_indices = np.rint(a * 100).astype(int)
Next, set those indices in b to 1. Be careful to pass the per-axis indices as a tuple rather than as a single 2D array; otherwise NumPy treats the whole array as one index array along the first axis. Performance of this method seems comparable to that of the alternatives (thanks @Divakar! :-)
b[tuple(a_indices.T)] = 1
I created a small example with size 10 instead of 100, and 2 dimensions instead of 3, to illustrate:
>>> a = np.array([[0.8, 0.2], [0.6, 0.4], [0.5, 0.6]])
>>> a_indices = np.rint(a * 10).astype(int)
>>> b = np.zeros((10, 10))
>>> b[tuple(a_indices.T)] = 1
>>> print(b)
[[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 1. 0. 0. 0.]
[ 0. 0. 0. 0. 1. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 1. 0. 0. 0. 0. 0. 0. 0.]
[ 0. 0. 0. 0. 0. 0. 0. 0. 0. 0.]]
You could do something like this -
# Get the XYZ indices
idx = np.round(100 * a).astype(int)
# Initialize o/p array
b = np.zeros((101,101,101))
# Assign into o/p array based on linear index equivalents from indices array
np.put(b,np.ravel_multi_index(idx.T,b.shape),1)
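For intuition (an added aside, not part of the original answer): np.ravel_multi_index converts per-axis indices into flat positions in the C-ordered array, which np.put then fills. A tiny example:
import numpy as np
# Per-axis indices for the points (1, 0) and (2, 3) in a 4x5 array:
multi_idx = np.array([[1, 2],   # axis-0 indices
                      [0, 3]])  # axis-1 indices
print(np.ravel_multi_index(multi_idx, (4, 5)))  # [ 5 13]  -> 1*5 + 0 and 2*5 + 3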
Runtime on the assignment part -
Let's use a bigger grid for timing purposes.
In [82]: # Setup input and get indices array
...: a = np.random.randint(0,401,(100000,3))/400.0
...: idx = np.round(400 * a).astype(int)
...:
In [83]: b = np.zeros((401,401,401))
In [84]: %timeit b[list(idx.T)] = 1 ##Praveen soln
The slowest run took 42.16 times longer than the fastest. This could mean that an intermediate result is being cached.
1 loop, best of 3: 6.28 ms per loop
In [85]: b = np.zeros((401,401,401))
In [86]: %timeit np.put(b,np.ravel_multi_index(idx.T,b.shape),1) # From this post
The slowest run took 45.34 times longer than the fastest. This could mean that an intermediate result is being cached.
1 loop, best of 3: 5.71 ms per loop
In [87]: b = np.zeros((401,401,401))
In [88]: %timeit b[idx[:,0],idx[:,1],idx[:,2]] = 1 #Subscripted indexing
The slowest run took 40.48 times longer than the fastest. This could mean that an intermediate result is being cached.
1 loop, best of 3: 6.38 ms per loop
I have a training data like this:
x_train = np.random.randint(100, size=(1000, 25))
where each row is a sample and thus we have 1000 samples.
Now I need the training data to be such that each sample/row has at most 3 non-zero elements out of the 25.
Can you all please suggest how I can implement that? Thanks!
I am assuming that you want to turn a majority of your data into zeros, except that 0 to 3 non-zero elements are retained (randomly) for each row. If this is the case, a possible way to do this is as follows.
Code
import numpy as np
max_ = 3
nrows = 1000
ncols = 25
np.random.seed(7)
X = np.zeros((nrows,ncols))
data = np.random.randint(100, size=(nrows, ncols))
# max number of non-zeros to be generated for each row
vmax = np.random.randint(low=0, high=4, size=(nrows,))
for i in range(nrows):
    if vmax[i] > 0:
        # column indices at which to place the non-zeros
        col = np.random.randint(low=0, high=ncols, size=(1, vmax[i]))
        # copy the non-zero elements over from data
        X[i][col] = data[i][col]
print(X)
Output
[[ 0. 68. 25. ... 0. 0. 0.]
[ 0. 0. 0. ... 0. 0. 0.]
[ 0. 0. 0. ... 0. 0. 0.]
...
[ 0. 0. 0. ... 0. 0. 0.]
[88. 0. 0. ... 0. 0. 0.]
[ 0. 0. 0. ... 0. 0. 0.]]
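As a side note (an addition, not part of the original answer), the per-row loop can also be vectorized with fancy indexing. A minimal sketch under the same assumptions (at most max_ non-zeros per row; duplicate column picks simply reduce a row's count, as in the loop version):
import numpy as np

rng = np.random.default_rng(7)
nrows, ncols, max_ = 1000, 25, 3

data = rng.integers(100, size=(nrows, ncols))
X = np.zeros((nrows, ncols))

counts = rng.integers(0, max_ + 1, size=nrows)       # how many non-zeros to keep per row
cols = rng.integers(0, ncols, size=(nrows, max_))    # candidate columns (may repeat)
keep = np.arange(max_) < counts[:, None]             # mask out the unused picks per row

rows = np.broadcast_to(np.arange(nrows)[:, None], (nrows, max_))
X[rows[keep], cols[keep]] = data[rows[keep], cols[keep]]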
I have the following array:
np.array([[0.07704314, 0.46752589, 0.39533099, 0.35752864],
[0.45813299, 0.02914078, 0.65307364, 0.58732429],
[0.32757561, 0.32946822, 0.59821108, 0.45585825],
[0.49054429, 0.68553148, 0.26657932, 0.38495586]])
I want to keep only the minimum value in each row of the array and set all other entries to zero. How can I achieve this?
Expected answer:
[[0.07704314 0. 0. 0. ]
[0. 0.02914078 0. 0. ]
[0.32757561 0. 0. 0. ]
[0. 0. 0.26657932 0. ]]
You can use np.where like so:
np.where(a.argmin(1)[:,None]==np.arange(a.shape[1]), a, 0)
Or (more lines but potentially more efficient):
out = np.zeros_like(a)
idx = a.argmin(1)[:, None]
np.put_along_axis(out, idx, np.take_along_axis(a, idx, 1), 1)
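For reference (an added check, not part of the original answer), here is a self-contained comparison of both variants on the array from the question:
import numpy as np

a = np.array([[0.07704314, 0.46752589, 0.39533099, 0.35752864],
              [0.45813299, 0.02914078, 0.65307364, 0.58732429],
              [0.32757561, 0.32946822, 0.59821108, 0.45585825],
              [0.49054429, 0.68553148, 0.26657932, 0.38495586]])

# Variant 1: keep each row's minimum, zero everything else.
masked = np.where(a.argmin(1)[:, None] == np.arange(a.shape[1]), a, 0)

# Variant 2: write the row minima into a fresh zero array.
out = np.zeros_like(a)
idx = a.argmin(1)[:, None]
np.put_along_axis(out, idx, np.take_along_axis(a, idx, 1), 1)

assert np.allclose(masked, out)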
If I understand correctly: first find the minimum value of each row, then compare it against the original array to get a boolean mask of where each row attains its minimum, and multiply the array by that mask elementwise to get the result:
np.multiply(a,a==np.min(a,1)[:,None])
Out[225]:
array([[0.07704314, 0. , 0. , 0. ],
[0. , 0.02914078, 0. , 0. ],
[0.32757561, 0. , 0. , 0. ],
[0. , 0. , 0.26657932, 0. ]])
If you only need the row minima themselves (rather than the masked matrix), np.amin(a, axis=1) gives them directly, where a is your NumPy array.
I want to perform a specific operation. Namely, from a matrix:
A = np.array([[1,2],
[3,4]])
To the following
B = np.array([[1, 0, 0, 2, 0, 0],
[0, 1, 0, 0, 2, 0],
[0, 0, 1, 0, 0, 2],
[3, 0, 0, 4, 0, 0],
[0, 3, 0, 0, 4, 0],
[0, 0, 3, 0, 0, 4]])
Or in words: multiply every entry by the identity matrix and keep the same order.
Now I have accomplished this using NumPy with the following code, where N is the dimension of the starting matrix and M is the dimension of the identity matrix.
import numpy as np

N = 2
M = 3
A = np.reshape(np.arange(1, 1 + N ** 2), (N, N))
B = np.array([i * np.eye(M) for i in A.flatten()])
C = B.reshape(N, N, M, M).reshape(N, N * M, M).transpose([0, 2, 1]).reshape((N * M, N * M))
where C has my desired properties.
But now I want to do this modification in Keras/TensorFlow, where the matrix A is the output of one of my layers.
However, I am not sure yet if I will be able to properly create matrix B. Especially when batches are involved, I think I will somehow mess up the dimensions of my problem.
Can anyone with more Keras/TensorFlow experience comment on this 'reshape' and how they see this happening within Keras/TensorFlow?
Here is a way to do that with TensorFlow:
import tensorflow as tf
data = tf.placeholder(tf.float32, [None, None])
n = tf.placeholder(tf.int32, [])
eye = tf.eye(n)
mult = data[:, tf.newaxis, :, tf.newaxis] * eye[tf.newaxis, :, tf.newaxis, :]
result = tf.reshape(mult, n * tf.shape(data))
with tf.Session() as sess:
    a = sess.run(result, feed_dict={data: [[1, 2], [3, 4]], n: 3})
    print(a)
Output:
[[1. 0. 0. 2. 0. 0.]
[0. 1. 0. 0. 2. 0.]
[0. 0. 1. 0. 0. 2.]
[3. 0. 0. 4. 0. 0.]
[0. 3. 0. 0. 4. 0.]
[0. 0. 3. 0. 0. 4.]]
By the way, you can do basically the same in NumPy, which should be faster than your current solution:
import numpy as np
data = np.array([[1, 2], [3, 4]])
n = 3
eye = np.eye(n)
mult = data[:, np.newaxis, :, np.newaxis] * eye[np.newaxis, :, np.newaxis, :]
result = np.reshape(mult, (n * data.shape[0], n * data.shape[1]))
print(result)
# The output is the same as above
EDIT:
I'll try to give some intuition about why and how this works; sorry if it's too long. It is not that hard, but I think it's sort of tricky to explain. Maybe it is easier to first see how the following multiplication works:
import numpy as np
data = np.array([[1, 2], [3, 4]])
n = 3
eye = np.eye(n)
mult1 = data[:, :, np.newaxis, np.newaxis] * eye[np.newaxis, np.newaxis, :, :]
Now, mult1 is a sort of "matrix of matrices". If I give two indices, I will get the diagonal matrix for the corresponding element in the original one:
print(mult1[0, 0])
# [[1. 0. 0.]
# [0. 1. 0.]
# [0. 0. 1.]]
So you could say this matrix can be visualized like this:
| 1 0 0 | | 2 0 0 |
| 0 1 0 | | 0 2 0 |
| 0 0 1 | | 0 0 2 |
| 3 0 0 | | 4 0 0 |
| 0 3 0 | | 0 4 0 |
| 0 0 3 | | 0 0 4 |
However, this is deceiving, because if you try to reshape this to the final shape, the result is not the right one:
print(np.reshape(mult1, (n * data.shape[0], n * data.shape[1])))
# [[1. 0. 0. 0. 1. 0.]
# [0. 0. 1. 2. 0. 0.]
# [0. 2. 0. 0. 0. 2.]
# [3. 0. 0. 0. 3. 0.]
# [0. 0. 3. 4. 0. 0.]
# [0. 4. 0. 0. 0. 4.]]
The reason is that reshaping (conceptually) "flattens" the array first and then gives the new shape. But the flattened array in this case is not what you need:
print(mult1.ravel())
# [1. 0. 0. 0. 1. 0. 0. 0. 1. 2. 0. 0. 0. 2. 0. ...
You see, it first traverses the first submatrix, then the second, etc. What you want, though, is for it to traverse first the first row of the first submatrix, then the first row of the second submatrix, then the second row of the first submatrix, etc. So basically you want something like:
Take the first two submatrices (the ones with 1 and 2)
Take all the first rows ([1, 0, 0] and [2, 0, 0]).
Take the first of these ([1, 0, 0])
Take each of its elements (1, 0 and 0).
And then continue for the rest. So if you think about it, we traverse first axis 0 (rows of the "matrix of matrices"), then axis 2 (rows of each submatrix), then axis 1 (columns of the "matrix of matrices") and finally axis 3 (columns of the submatrices). So we can just reorder the axes to do that:
mult2 = mult1.transpose((0, 2, 1, 3))
print(np.reshape(mult2, (n * data.shape[0], n * data.shape[1])))
# [[1. 0. 0. 2. 0. 0.]
# [0. 1. 0. 0. 2. 0.]
# [0. 0. 1. 0. 0. 2.]
# [3. 0. 0. 4. 0. 0.]
# [0. 3. 0. 0. 4. 0.]
# [0. 0. 3. 0. 0. 4.]]
And it works! So in the solution I posted, to avoid the transposing, I just set up the multiplication so the order of the axes is exactly that:
mult = data[
    :,           # Matrix-of-matrices rows
    np.newaxis,  # Submatrix rows
    :,           # Matrix-of-matrices columns
    np.newaxis   # Submatrix columns
] * eye[
    np.newaxis,  # Matrix-of-matrices rows
    :,           # Submatrix rows
    np.newaxis,  # Matrix-of-matrices columns
    :            # Submatrix columns
]
I hope that makes it slightly clearer. To be honest, in this particular case I could come up with the solution quickly because I had to solve a similar problem not too long ago, and I guess you end up building an intuition for these things.
Another way to achieve the same effect in numpy is to use the following:
A = np.array([[1,2],
[3,4]])
B = np.repeat(np.repeat(A, 3, axis=0), 3, axis=1) * np.tile(np.eye(3), (2,2))
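Incidentally (an added aside, not from the original answer), this block pattern is exactly the Kronecker product of A with the identity, so NumPy can also build it in one call:
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.kron(A, np.eye(3))   # same 6x6 block matrix as above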
Then, to replicate the repeat/tile approach in TensorFlow, we can use tf.tile, but there is no tf.repeat; however, someone has provided such a function on the TensorFlow issue tracker.
def tf_repeat(tensor, repeats):
    """
    Args:
        tensor: A Tensor. 1-D or higher.
        repeats: A list. Number of repeats for each dimension; its length must
            match the number of dimensions in tensor.
    Returns:
        A Tensor. Has the same type as tensor. Has the shape of tensor.shape * repeats.
    """
    with tf.variable_scope("repeat"):
        expanded_tensor = tf.expand_dims(tensor, -1)
        multiples = [1] + list(repeats)
        tiled_tensor = tf.tile(expanded_tensor, multiples=multiples)
        repeated_tensor = tf.reshape(tiled_tensor, tf.shape(tensor) * repeats)
    return repeated_tensor
The TensorFlow implementation then looks like the following. Here I also assume that the first dimension represents batches, so we do not operate on it.
N = 2
M = 3
nbatch = 2
Ain = np.reshape(np.arange(1, 1 + N*N*nbatch), (nbatch, N, N))
A = tf.placeholder(tf.float32, shape=(nbatch, N, N))
B = tf.tile(tf.eye(M), [N, N]) * tf_repeat(A, [1, M, M])
with tf.Session() as sess:
    print(sess.run(B, feed_dict={A: Ain}))
and the result:
[[[1. 0. 0. 2. 0. 0.]
[0. 1. 0. 0. 2. 0.]
[0. 0. 1. 0. 0. 2.]
[3. 0. 0. 4. 0. 0.]
[0. 3. 0. 0. 4. 0.]
[0. 0. 3. 0. 0. 4.]]
[[5. 0. 0. 6. 0. 0.]
[0. 5. 0. 0. 6. 0.]
[0. 0. 5. 0. 0. 6.]
[7. 0. 0. 8. 0. 0.]
[0. 7. 0. 0. 8. 0.]
[0. 0. 7. 0. 0. 8.]]]
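As a side note (an addition, not part of the original answer), recent TensorFlow 2.x releases provide tf.repeat directly, so the same construction could be sketched in eager mode without the helper, roughly like this:
import tensorflow as tf

N, M = 2, 3
A = tf.constant([[[1., 2.], [3., 4.]],
                 [[5., 6.], [7., 8.]]])                  # shape (batch, N, N)

tiled_eye = tf.tile(tf.eye(M), [N, N])                   # (N*M, N*M) tiled identity
blocks = tf.repeat(tf.repeat(A, M, axis=1), M, axis=2)   # each entry expanded to an MxM block
B = tiled_eye * blocks                                   # broadcasts over the batch dimension
print(B.numpy())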
I have the following code where I have been trying to create a tridiagonal matrix x using if-conditions.
#!/usr/bin/env python
# import useful modules
import numpy as np
N=5
x=np.identity(N)
#x=np.zeros((N,N))
print x
# Construct NxN matrix
for i in range(N):
    for j in range(N):
        if i-j==1:
            x[i][j]=1
        elif j-1==1:
            x[i][j]=-1
        else:
            x[i][j]=0
        print "i= ",i," j= ",j
print x
I desire to get
[[ 0. -1. 0. 0. 0.]
[ 1. 0. -1. 0. 0.]
[ 0. 1. 0. -1. 0.]
[ 0. 0. 1. 0. -1.]
[ 0. 0. 0. 1. 0.]]
However, I obtain
[[ 0. 0. -1. 0. 0.]
[ 1. 0. -1. 0. 0.]
[ 0. 1. -1. 0. 0.]
[ 0. 0. 1. 0. 0.]
[ 0. 0. -1. 1. 0.]]
What's going wrong?
Bonus question: Can I forcefully index from 1 to 5 instead of 0 to 4 in this example, or does Python never allow that?
elif j-1==1: should be elif j-i==1:.
And no, lists/arrays etc. are always indexed from 0.
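As an aside (an addition, not from either answer), the desired matrix can also be built without loops using the offset form of np.eye:
import numpy as np

N = 5
x = np.eye(N, k=-1) - np.eye(N, k=1)   # +1 on the subdiagonal, -1 on the superdiagonal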
As for the bonus question, the first element of a sequence in Python always has index 0. However, if for some particular reason (for example, to prevent off-by-one errors) you wish to count the elements of a sequence from a value other than 0, you can use the built-in function enumerate() and set the optional start parameter to fit your needs:
>>> seq = ['a', 'b', 'c']
>>> for count, item in enumerate(seq, start=1):
... print(count, item)
...
1 a
2 b
3 c
I am trying to build the following matrix in Python without using a for loop:
A
[[ 0.1 0.2 0. 0. 0. ]
[ 1. 2. 3. 0. 0. ]
[ 0. 1. 2. 3. 0. ]
[ 0. 0. 1. 2. 3. ]
[ 0. 0. 0. 4. 5. ]]
I tried the fill_diagonal method in NumPy (see matrix B below) but it does not give me the same matrix as shown in matrix A:
B
[[ 1. 0.2 0. 0. 0. ]
[ 0. 2. 0. 0. 0. ]
[ 0. 0. 3. 0. 0. ]
[ 0. 0. 0. 1. 0. ]
[ 0. 0. 0. 4. 5. ]]
Here is the Python code that I used to construct the matrices:
import numpy as np
import scipy.linalg as sp # maybe use scipy to build diagonal matrix?
#---- build diagonal square array using "for" loop
m = 5
A = np.zeros((m, m))
A[0, 0] = 0.1
A[0, 1] = 0.2
for i in range(1, m-1):
    A[i, i-1] = 1  # m-1
    A[i, i] = 2    # m
    A[i, i+1] = 3  # m+1
A[m-1, m-2] = 4
A[m-1, m-1] = 5
print('A \n', A)
#---- build diagonal square array without loop
B = np.zeros((m, m))
B[0, 0] = 0.1
B[0, 1] = 0.2
np.fill_diagonal(B, [1, 2, 3])
B[m-1, m-2] = 4
B[m-1, m-1] = 5
print('B \n', B)
So is there a way to construct a diagonal matrix like the one shown by matrix A without using a for loop?
There are functions for this in scipy.sparse, e.g.:
from scipy.sparse import diags
C = diags([1,2,3], [-1,0,1], shape=(5,5), dtype=float)
C = C.toarray()
C[0, 0] = 0.1
C[0, 1] = 0.2
C[-1, -2] = 4
C[-1, -1] = 5
Banded matrices like this are generally very sparse, so you could also keep the result as a sparse matrix. Depending on the application, this can even bring large efficiency benefits.
The efficiency gains sparse matrices could give you depend very much on matrix size. For a 5x5 array you can't really be bothered I guess. But for larger matrices creating the array could be a lot faster with sparse matrices, illustrated by the following example with an identity matrix:
%timeit np.eye(3000)
# 100 loops, best of 3: 3.12 ms per loop
%timeit sparse.eye(3000)
# 10000 loops, best of 3: 79.5 µs per loop
But the real strength of the sparse matrix data type is shown when you need to do mathematical operations on arrays that are sparse:
%timeit np.eye(3000).dot(np.eye(3000))
# 1 loops, best of 3: 2.8 s per loop
%timeit sparse.eye(3000).dot(sparse.eye(3000))
# 1000 loops, best of 3: 1.11 ms per loop
Or when you need to work with some very large but sparse array:
np.eye(1E6)
# ValueError: array is too big.
sparse.eye(1E6)
# <1000000x1000000 sparse matrix of type '<type 'numpy.float64'>'
# with 1000000 stored elements (1 diagonals) in DIAgonal format>
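For comparison (an added alternative, not part of the original answer), the same matrix can also be assembled in plain NumPy with np.diag, if a dense array is acceptable:
import numpy as np

m = 5
D = (np.diag(np.full(m - 1, 1.0), -1)
     + np.diag(np.full(m, 2.0))
     + np.diag(np.full(m - 1, 3.0), 1))
D[0, 0] = 0.1
D[0, 1] = 0.2
D[-1, -2] = 4
D[-1, -1] = 5
print(D)   # same as matrix A in the question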
Notice that the number of zeros between consecutive groups of non-zero values is always 3 (or some other constant, whenever you want a banded matrix like this):
In [10]:
import numpy as np
A1=[0.1, 0.2]
A2=[1,2,3]
A3=[4,5]
SPC=[0,0,0]  # spacing zeros (or use np.zeros(3))
np.hstack((A1,SPC,A2,SPC,A2,SPC,A2,SPC,A3)).reshape(5,5)
Out[10]:
array([[ 0.1, 0.2, 0. , 0. , 0. ],
[ 1. , 2. , 3. , 0. , 0. ],
[ 0. , 1. , 2. , 3. , 0. ],
[ 0. , 0. , 1. , 2. , 3. ],
[ 0. , 0. , 0. , 4. , 5. ]])
In [11]:
import itertools  # A more general way of doing it
np.hstack(list(itertools.chain(*[(item, SPC) for item in [A1, A2, A2, A2, A3]]))[:-1]).reshape(5,5)
Out[11]:
array([[ 0.1, 0.2, 0. , 0. , 0. ],
[ 1. , 2. , 3. , 0. , 0. ],
[ 0. , 1. , 2. , 3. , 0. ],
[ 0. , 0. , 1. , 2. , 3. ],
[ 0. , 0. , 0. , 4. , 5. ]])