How to implement Numpy where index in TensorFlow?


I have the following operation, which uses numpy.where:
mat = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.int32)
index = np.array([[1,0,0],[0,1,0],[0,0,1]])
mat[np.where(index>0)] = 100
print(mat)
How can I implement the equivalent in TensorFlow?
mat = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.int32)
index = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
tf_mat = tf.constant(mat)
tf_index = tf.constant(index)
indi = tf.where(tf_index>0)
tf_mat[indi] = -1  # <===== not allowed: tensors do not support item assignment

Assuming that what you want is to create a new tensor with some replaced elements, and not update a variable, you could do something like this:
import numpy as np
import tensorflow as tf
mat = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=np.int32)
index = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
tf_mat = tf.constant(mat)
tf_index = tf.constant(index)
tf_mat = tf.where(tf_index > 0, -tf.ones_like(tf_mat), tf_mat)
with tf.Session() as sess:  # tf.Session is the TensorFlow 1.x API
    print(sess.run(tf_mat))
Output:
[[-1  2  3]
 [ 4 -1  6]
 [ 7  8 -1]]
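
For readers on newer releases, here is a minimal sketch of the same three-argument tf.where idea, assuming TensorFlow 2.x with eager execution (so no session is needed). The value 100 is taken from the NumPy snippet in the question; the name `replaced` is only illustrative.

```python
import tensorflow as tf  # assumption: TensorFlow 2.x, eager execution enabled

mat = tf.constant([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=tf.int32)
index = tf.constant([[1, 0, 0], [0, 1, 0], [0, 0, 1]])

# With three arguments, tf.where takes elements from the second argument
# where the condition is True and from the third argument elsewhere.
replaced = tf.where(index > 0, tf.ones_like(mat) * 100, mat)
print(replaced.numpy())
```

This builds a new tensor; `mat` itself is left unchanged.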

You can get the indices with tf.where. You can then evaluate the indices directly, use tf.gather to collect data from the original array, or use tf.scatter_update to update the original data (tf.scatter_nd_update for multi-dimensional updates).
mat = tf.Variable([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=tf.int32)
index = tf.Variable([[1, 0, 0], [0, 1, 0], [0, 0, 1]])
idx = tf.where(index > 0)
tf.scatter_nd_update(mat, idx, values)  # values: the values you want to write
Note that the update values must have the same first-dimension size as idx.
See https://www.tensorflow.org/api_guides/python
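
As a rough sketch of the scatter approach on a plain tensor, assuming TensorFlow 2.x: tf.tensor_scatter_nd_update returns a new tensor instead of modifying a variable. The value 100 comes from the NumPy snippet in the question; `updates` and `updated` are illustrative names.

```python
import tensorflow as tf  # assumption: TensorFlow 2.x

mat = tf.constant([[1, 2, 3], [4, 5, 6], [7, 8, 9]], dtype=tf.int32)
index = tf.constant([[1, 0, 0], [0, 1, 0], [0, 0, 1]])

# One (row, col) pair per element where the condition holds: shape (3, 2).
idx = tf.where(index > 0)

# One update value per row of idx, so the first dimensions match.
updates = tf.fill([tf.shape(idx)[0]], 100)

# Returns a copy of mat with the selected positions overwritten.
updated = tf.tensor_scatter_nd_update(mat, idx, updates)
print(updated.numpy())
```

Here every selected position gets the same value, but `updates` could hold a different value for each index row.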

Related

How to generate values from a diagonal to fill matrix

I have the following diagonal matrix:
a = array([[1, 0, 0, 0],
           [0, 2, 0, 0],
           [0, 0, 3, 0],
           [0, 0, 0, 4]])
And the desired outcome is the following:
array([[1, 3, 4, 5],
       [3, 2, 5, 6],
       [4, 5, 3, 7],
       [5, 6, 7, 4]])
Each off-diagonal element at row i, column j is the sum of the diagonal entries a[i, i] and a[j, j]; the diagonal itself is unchanged.
Thanks a lot

Try:
>>> np.diag(a) + np.diag(a)[:, None] - a
array([[1, 3, 4, 5],
[3, 2, 5, 6],
[4, 5, 3, 7],
[5, 6, 7, 4]])
Addendum
What if a is a DataFrame?
Then: np.diag(a) + np.diag(a)[:, None] - a is also a DataFrame (with same index and columns as a).
What if a is a numpy array, but I want a DataFrame result?
Then use: pd.DataFrame(...) instead.

You can use:
# get diagonal
diag = np.diag(a)
# outer sum
out = diag+diag[:,None]
# or, equivalently (np.outer would give products, not sums)
# out = np.add.outer(diag, diag)
# reset diagonal
np.fill_diagonal(out, diag)
print(out)
output:
[[1 3 4 5]
[3 2 5 6]
[4 5 3 7]
[5 6 7 4]]
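
To check that the two answers agree, here is a small sketch that builds the result with np.add.outer and compares it with the one-line expression from the first answer. The names `out` and `one_liner` are only for illustration.

```python
import numpy as np

a = np.diag([1, 2, 3, 4])
diag = np.diag(a)  # the 1-D diagonal: [1, 2, 3, 4]

# Pairwise sums: out[i, j] = diag[i] + diag[j]
out = np.add.outer(diag, diag)

# The diagonal currently holds 2 * diag[i]; restore the original values.
np.fill_diagonal(out, diag)

# One-line form from the first answer: subtracting a removes the extra diag[i].
one_liner = diag + diag[:, None] - a

print(out)
print(np.array_equal(out, one_liner))
```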

Take values from row and column of numpy array based on another array

I have a np array of weights
mat = np.array([
[1, 2, 3],
[4, 5, 6],
[7, 8, 9]
])
I have another numpy array containing the row and column indices to extract from the weight matrix.
row_col = np.array([
[1, 1], # row 1, col 1
[2, 2], # row 2, col 2
[0, 2], # row 0, col 2
[1, 0] # row 1, col 0
])
How do I get the output:
[5, 9, 3, 4]

Try this:
mat[row_col[:, 0], row_col[:, 1]]
Output: array([5, 9, 3, 4])
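
The same lookup can be written by unpacking the index pairs first, which some find easier to read. This is just a sketch of NumPy integer-array indexing; `rows`, `cols` and `picked` are illustrative names.

```python
import numpy as np

mat = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])
row_col = np.array([[1, 1],
                    [2, 2],
                    [0, 2],
                    [1, 0]])

# Transposing gives one array of row indices and one of column indices.
rows, cols = row_col.T

# Two index arrays of equal length are paired element by element.
picked = mat[rows, cols]

# Equivalent spelling: a tuple of index arrays, one per axis.
picked_tuple = mat[tuple(row_col.T)]
print(picked)
```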

How to repeat a numpy array along a new dimension with padding?

Given two arrays, an input array and a repeat array, I would like to get an array in which each input value is repeated along a new dimension the specified number of times, with each row padded to the end.
to_repeat = np.array([1, 2, 3, 4, 5, 6])
repeats = np.array([1, 2, 2, 3, 3, 1])
# I want final array to look like the following:
#[[1, 0, 0],
# [2, 2, 0],
# [3, 3, 0],
# [4, 4, 4],
# [5, 5, 5],
# [6, 0, 0]]
The issue is that I'm operating with large datasets (10M or so) so a list comprehension is too slow - what is a fast way to achieve this?

Here's one with masking based on this idea -
m = repeats[:,None] > np.arange(repeats.max())
out = np.zeros(m.shape,dtype=to_repeat.dtype)
out[m] = np.repeat(to_repeat,repeats)
Sample output -
In [44]: out
Out[44]:
array([[1, 0, 0],
[2, 2, 0],
[3, 3, 0],
[4, 4, 4],
[5, 5, 5],
[6, 0, 0]])
Or with broadcasted-multiplication -
In [67]: m*to_repeat[:,None]
Out[67]:
array([[1, 0, 0],
[2, 2, 0],
[3, 3, 0],
[4, 4, 4],
[5, 5, 5],
[6, 0, 0]])
For large datasets, we can use multiple cores and be more memory-efficient by evaluating that broadcasted multiplication with the numexpr module -
In [64]: import numexpr as ne
# Re-using mask `m` from previous method
In [65]: ne.evaluate('m*R',{'m':m,'R':to_repeat[:,None]})
Out[65]:
array([[1, 0, 0],
[2, 2, 0],
[3, 3, 0],
[4, 4, 4],
[5, 5, 5],
[6, 0, 0]])
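
The masking approach can be wrapped in a small helper. This is a sketch only: the function name `repeat_with_padding` and the `fill` parameter are made up for illustration, and it assumes `repeats` holds non-negative integers with one entry per value.

```python
import numpy as np

def repeat_with_padding(values, repeats, fill=0):
    """Repeat values[i] repeats[i] times along a new axis, padding with fill."""
    values = np.asarray(values)
    repeats = np.asarray(repeats)
    width = repeats.max() if repeats.size else 0
    # mask[i, k] is True for the first repeats[i] columns of row i.
    mask = repeats[:, None] > np.arange(width)
    out = np.full(mask.shape, fill, dtype=values.dtype)
    # Boolean assignment fills True cells in row-major order, which matches
    # the order in which np.repeat emits the repeated values.
    out[mask] = np.repeat(values, repeats)
    return out

to_repeat = np.array([1, 2, 3, 4, 5, 6])
repeats = np.array([1, 2, 2, 3, 3, 1])
print(repeat_with_padding(to_repeat, repeats))
```

Unlike the broadcasted multiplication, this variant also works with a padding value other than zero.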

Replicating elements in numpy array

I have a numpy array say
a = array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
I have an array 'replication' of the same size, where replication[i,j] (>= 0) denotes how many times a[i][j] should be repeated along the row. Obviously, the replication array follows the invariant that np.sum(replication[i]) has the same value for all i.
For example, if
replication = array([[1, 2, 1],
[1, 1, 2],
[2, 1, 1]])
then the final array after replicating is:
new_a = array([[1, 2, 2, 3],
[4, 5, 6, 6],
[7, 7, 8, 9]])
Presently, I am doing this to create new_a:
## allocate new_a
h = a.shape[0]
w = a.shape[1]
for row in range(h):
    ll = [[a[row][j]] * replication[row][j] for j in range(w)]
    new_a[row] = np.array([item for sublist in ll for item in sublist])
However, this seems to be too slow as it involves using lists. Can I do this entirely in numpy, without the use of Python lists?

You can flatten out your replication array, then use the .repeat() method of a:
import numpy as np
a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
replication = np.array([[1, 2, 1],
                        [1, 1, 2],
                        [2, 1, 1]])
# repeat() flattens a, so the counts are flattened to match
new_a = a.repeat(replication.ravel()).reshape(a.shape[0], -1)
print(repr(new_a))
# array([[1, 2, 2, 3],
#        [4, 5, 6, 6],
#        [7, 7, 8, 9]])
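
The reshape only works when every row of `replication` sums to the same total, as the question assumes. A hedged sketch that checks this invariant first might look like the following; the function name `replicate_rows` is hypothetical.

```python
import numpy as np

def replicate_rows(a, replication):
    """Repeat a[i, j] replication[i, j] times along each row."""
    a = np.asarray(a)
    replication = np.asarray(replication)
    row_totals = replication.sum(axis=1)
    # Every output row must have the same length, otherwise the flat
    # result cannot be reshaped into a rectangular array.
    if not np.all(row_totals == row_totals[0]):
        raise ValueError("each row of replication must sum to the same value")
    return np.repeat(a.ravel(), replication.ravel()).reshape(a.shape[0], -1)

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
replication = np.array([[1, 2, 1], [1, 1, 2], [2, 1, 1]])
print(replicate_rows(a, replication))
```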

Inserting rows and columns into a numpy array

I would like to insert multiple rows and columns into a NumPy array.
If I have a square array of length n_a, e.g.: n_a = 3
a = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
and I would like to get a new array with size n_b, which contains array a and zeros (or any other 1D array of length n_b) on certain rows and columns with indices, e.g.
index = [1, 3]
so n_b = n_a + len(index). Then the new array is:
b = np.array([[1, 0, 2, 0, 3],
[0, 0, 0, 0, 0],
[4, 0, 5, 0, 6],
[0, 0, 0, 0, 0],
[7, 0, 8, 0, 9]])
My question is how to do this efficiently, assuming that for larger arrays n_a is much larger than len(index).
EDIT
The results for:
import numpy as np
import random
n_a = 5000
n_index = 100
a=np.random.rand(n_a, n_a)
index = random.sample(range(n_a), n_index)
Warren Weckesser's solution: 0.208 s
wim's solution: 0.980 s
Ashwini Chaudhary's solution: 0.955 s
Thank you to all!

Here's one way to do it. It has some overlap with @wim's answer, but it uses index broadcasting to copy a into b with a single assignment.
import numpy as np
a = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
index = [1, 3]
n_b = a.shape[0] + len(index)
not_index = np.array([k for k in range(n_b) if k not in index])
b = np.zeros((n_b, n_b), dtype=a.dtype)
b[not_index.reshape(-1,1), not_index] = a
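
A variant of the same single-assignment idea, sketched with np.setdiff1d to find the kept positions without a Python loop and np.ix_ to build the broadcasting index. The function name `insert_zero_rows_cols` is made up, and the sketch assumes `a` is square and `index` holds unique positions in the output array.

```python
import numpy as np

def insert_zero_rows_cols(a, index):
    """Return a larger square array with zero rows/columns at `index`."""
    n_b = a.shape[0] + len(index)
    # Output positions that are NOT in index, in sorted order.
    keep = np.setdiff1d(np.arange(n_b), index)
    b = np.zeros((n_b, n_b), dtype=a.dtype)
    # np.ix_ builds the open mesh (column vector, row vector) so that
    # the whole of a is copied with one assignment.
    b[np.ix_(keep, keep)] = a
    return b

a = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
b = insert_zero_rows_cols(a, [1, 3])
print(b)
```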

You can do this by applying two numpy.insert calls on a:
>>> a = np.array([[1, 2, 3],
...               [4, 5, 6],
...               [7, 8, 9]])
>>> indices = np.array([1, 3])
>>> i = indices - np.arange(len(indices))
>>> np.insert(np.insert(a, i, 0, axis=1), i, 0, axis=0)
array([[1, 0, 2, 0, 3],
       [0, 0, 0, 0, 0],
       [4, 0, 5, 0, 6],
       [0, 0, 0, 0, 0],
       [7, 0, 8, 0, 9]])

Since fancy indexing returns a copy instead of a view,
I can only think how to do it in a two-step process. Maybe a numpy wizard knows a better way...
Here you go:
import numpy as np
a = np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]])
index = [1, 3]
n = a.shape[0]
N = n + len(index)
non_index = [x for x in range(N) if x not in index]  # range replaces Python 2's xrange
b = np.zeros((N,n), a.dtype)
b[non_index] = a
a = np.zeros((N,N), a.dtype)
a[:, non_index] = b

Why can't you just slice? This needs no loops or for statements.
xlen = a.shape[1]
ylen = a.shape[0]
# accommodates both odd and even shapes; only covers the case where the
# zeros go on every other row and column, as in the example
b = np.zeros((ylen * 2 - ylen % 2, xlen * 2 - xlen % 2))
b[0::2,0::2] = a
