There are two arrays, a and indices.
a's shape is (g,N), which means there are g groups, each with N samples.
indices' shape is (q,g), which means there are q classes, each of which contains one index per group for accessing a's values.
For example,
a = [[1 3 7 8]
[2 4 5 6]] # shape:(2,4), 2 groups with 4 samples
indices = [[0 1]
[2 2]] # shape:(2,2), 2 classes with indices to access a for the two groups.
I tried np.take(a, indices, axis=1) and got
result = [[[1 3]
[7 7]]
[[2 4]
[5 5]]]
but that isn't what I want.
The result I want to get is:
result = [[1,4]
[7,5]]
because
indices[0] = [0,1] # class 0's indices for the two groups
a[0,0] = 1
a[1,1] = 4
indices[1] = [2,2] # class 1's indices for the two groups
a[0,2] = 7
a[1,2] = 5
Could anyone help? thanks!
Use take_along_axis:
np.take_along_axis(a.T,indices,0)
# array([[1, 4],
# [7, 5]])
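As a minimal sketch of why this works, using the arrays from the question: np.take_along_axis(a.T, indices, 0) looks up a[j, indices[i, j]] for every class i and group j. The same lookup can also be written with integer array indexing; the names via_take and via_fancy below are only illustrative.

```python
import numpy as np

a = np.array([[1, 3, 7, 8],
              [2, 4, 5, 6]])      # shape (g, N) = (2, 4)
indices = np.array([[0, 1],
                    [2, 2]])      # shape (q, g) = (2, 2)

# take_along_axis on a.T: out[i, j] = a.T[indices[i, j], j] = a[j, indices[i, j]]
via_take = np.take_along_axis(a.T, indices, axis=0)

# Same lookup with integer array indexing: the group index j (shape (g,))
# broadcasts against the (q, g) index array.
via_fancy = a[np.arange(a.shape[0]), indices]

print(via_take)
print(via_fancy)
```

Both expressions return an array of shape (q, g), one value per class and group.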
Suppose we have:
row vector V of shape (F,1), and
4-D tensor T of shape (N, F, X, Y).
As a concrete example, let N, F, X, Y = 2, 3, 2, 2. Let V = [v0, v1,v2].
Then, I want to element-wise add v0 to the inner 2x2 matrix T[0,0], v1 to T[0,1], and v2 to T[0,2]. Similarly, I want to add v0 to T[1,0], v1 to T[1,1], and v2 to T[1,2].
So at the "innermost" level, the addition between the 2x2 matrix and a scalar, e.g. T[0,0] + v0, uses broadcasting to element-wise add v0. Then what I'm trying to do is apply that more generally to each inner 2x2.
I've tried using np.einsum() and np.tensordot(), but I couldn't figure out what each of those functions was actually doing on a more fundamental level, so I wanted to ask for a more step-by-step explanation of how this computation might be done.
Thanks
To multiply: you can translate your description directly into einsum index names, and einsum will take care of the broadcasting:
TV = np.einsum('ijkl,j->ijkl',T,V)
To add: append two new axes to V with None so that it lines up with the last two dimensions of T, and broadcasting will take care of the rest (this assumes V is 1-D with shape (F,)):
TV = T + V[:,None,None]
Example input and output for the addition:
T:
[[[[7 4]
[5 9]]
[[0 3]
[2 6]]
[[7 6]
[1 1]]]
[[[8 0]
[8 7]]
[[2 6]
[9 2]]
[[8 6]
[4 9]]]]
V:
[0 1 2]
TV:
[[[[ 7 4]
[ 5 9]]
[[ 1 4]
[ 3 7]]
[[ 9 8]
[ 3 3]]]
[[[ 8 0]
[ 8 7]]
[[ 3 7]
[10 3]]
[[10 8]
[ 6 11]]]]
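The example above can be reproduced with a short sketch. It assumes V is 1-D with shape (F,); if V really has shape (F, 1) as in the question, V[:, :, None] gives the same (F, 1, 1) shape. The explicit loop is only there to check the broadcasting result against the step-by-step description.

```python
import numpy as np

T = np.array([[[[7, 4], [5, 9]], [[0, 3], [2, 6]], [[7, 6], [1, 1]]],
              [[[8, 0], [8, 7]], [[2, 6], [9, 2]], [[8, 6], [4, 9]]]])  # (N, F, X, Y)
V = np.array([0, 1, 2])                                                  # (F,)

# V[:, None, None] has shape (F, 1, 1); broadcasting stretches it over N, X and Y.
TV_add = T + V[:, None, None]

# Step-by-step version: add the scalar V[f] to every inner X-by-Y matrix T[n, f].
TV_loop = np.empty_like(T)
for n in range(T.shape[0]):
    for f in range(T.shape[1]):
        TV_loop[n, f] = T[n, f] + V[f]

# The einsum from the answer multiplies instead of adding; it matches plain broadcasting.
TV_mul = np.einsum('ijkl,j->ijkl', T, V)

# Same addition when V is stored as a column of shape (F, 1).
V_col = V.reshape(-1, 1)
TV_add_col = T + V_col[:, :, None]
```

Note that einsum expresses products and sums of products, which is why the addition is done with plain broadcasting rather than einsum.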
I am trying to select values by index from an np.array or pd.DataFrame. Suppose the raw values have shape [x,y] and my indices are an array of shape [x,z]. I want to apply the indices to each column separately, so that each column is expanded into z columns. I tried using take directly, but it does not give what I want, so I had to loop over the columns instead. My code is as follows:
import numpy as np
arr = np.asarray([[0, 1, 2, 4], [1, 2, 3, 4], [2, 3, 4, 5]])
print(f"input array:\n{arr}")
indices = np.asarray([[-1, 0, 1], [-1, -1, 0]]).T
print(f"indices:\n{indices}")
res_0 = arr.take(indices)
print(f"take directly:\n{res_0}")
result_list = []
for i in range(arr.shape[1]):
    result_list.append(arr[:, i].take(indices))
res_1 = np.concatenate(result_list, axis=-1)
print(f"expected result:\n{res_1}")
The output of the code is as follows:
input array:
[[0 1 2 4]
[1 2 3 4]
[2 3 4 5]]
indices:
[[-1 -1]
[ 0 -1]
[ 1 0]]
take directly:
[[5 5]
[0 5]
[1 0]]
expected result:
[[2 2 3 3 4 4 5 5]
[0 2 1 3 2 4 4 5]
[1 0 2 1 3 2 4 4]]
For each column of arr, selecting with indices generates two new columns (one for each column of indices). Thus, we end up with a new array of shape [3, 4*2].
Using take directly does not achieve this, while looping over the columns is not very neat.
Is there a more efficient way to implement this?
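One possible loop-free version, as a sketch rather than a definitive answer: index whole rows with arr[indices], then reorder the axes so that the two selections for each original column sit next to each other. The names loop_result and vectorized are only illustrative.

```python
import numpy as np

arr = np.asarray([[0, 1, 2, 4], [1, 2, 3, 4], [2, 3, 4, 5]])
indices = np.asarray([[-1, 0, 1], [-1, -1, 0]]).T  # shape (3, 2)

# Loop version from the question, kept for comparison.
loop_result = np.concatenate(
    [arr[:, i].take(indices) for i in range(arr.shape[1])], axis=-1)

# arr[indices] picks whole rows: shape (3, 2, 4), with
# entry [r, c, i] = arr[indices[r, c], i].
# Moving the column axis i in front of c and flattening the last two axes
# gives the (3, 4*2) layout of the loop version.
vectorized = arr[indices].transpose(0, 2, 1).reshape(indices.shape[0], -1)

print(vectorized)
```

For a pd.DataFrame, the same idea should apply to the underlying array obtained with to_numpy().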
I have a 2-D array for example:
p = np.array([[21,2,3,1,12,13],
[4,5,6,14,15,16],
[7,8,9,17,18,19]])
b = np.argpartition(p, np.argmin(p, axis=1))[:, -2:]
com = np.ones([3,6],dtype=int)  # np.int was removed in NumPy 1.24; the builtin int is equivalent
com[np.arange(com.shape[0])[:,None],b] = 0
print(com)
b holds the column indices of the top 2 values in each row of p:
b = [[0 5]
[4 5]
[4 5]]
com is a matrix of ones with the same shape as p; the elements at the positions given by b are set to 0.
So the result is:
com = [[0 1 1 1 1 0]
[1 1 1 1 0 0]
[1 1 1 1 0 0]]
Now I have one more constraint:
p[0:2,0:2]
The numbers in this area should not be considered when picking the top 2 values,
so the result should be:
b = [[4 5]
[4 5]
[4 5]]
How can I do this?
Thanks in advance!
Make sure your question is clear. Not sure I understand your constraints. Here's a take:
# the data
p = np.array([[21, 2, 3, 1, 12, 13],
[4, 5, 6, 14, 15, 16],
[7, 8, 9, 17, 18, 19]])
# not sure if this is what you mean by constraint,
# but let's ignore the values in the first two rows and columns
p[0:2, 0:2] = 0
# return the idx of highest values
b = np.argpartition(p, -2)[:, -2:]
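Note that the snippet above overwrites part of p, and zeroing only works because every other value is positive. A hedged alternative sketch: mask a copy with -inf instead, then build com as in the question. The name masked is only illustrative, and the final np.sort just makes the order of the two indices in each row deterministic.

```python
import numpy as np

p = np.array([[21, 2, 3, 1, 12, 13],
              [4, 5, 6, 14, 15, 16],
              [7, 8, 9, 17, 18, 19]])

# Work on a float copy so the original p is left untouched.
masked = p.astype(float)
masked[0:2, 0:2] = -np.inf  # excluded cells can never be among the top values

# Column indices of the two largest remaining values in each row.
b = np.sort(np.argpartition(masked, -2, axis=1)[:, -2:], axis=1)

# Matrix of ones with zeros at the selected positions, as in the question.
com = np.ones(p.shape, dtype=int)
com[np.arange(p.shape[0])[:, None], b] = 0

print(b)
print(com)
```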
I have a vector and I want to get its "sorted index function".
What I mean is this: if you have a vector v with k=length(v) and you sort it with
sort_v=tf.nn.top_k(v,k)
then I would like to have the "sorted index function" psi such that
v(psi(i))=sort_v(i)
How do I get this function (as a tensor) in TensorFlow?
According to the documentation, tf.nn.top_k returns both the values and the indices of the sorted tensor, so you can simply unpack them into two variables, one for the values and one for the indices:
a_sorted_val, a_sorted_ind = tf.nn.top_k(a, 2)
a_sorted_ind is the function psi, expressed as a tensor.
Example:
import tensorflow as tf
import numpy as np
with tf.Session():
    a = tf.convert_to_tensor([[4, 3, 2, 1], [5, 6, 7, 8]])
    a_sort_val, a_sort_ind = tf.nn.top_k(a, 4)
    values = a_sort_val.eval()
    indices = a_sort_ind.eval()
    unsorted_a = a.eval()
    print(unsorted_a)
    print(values)
    print(indices)
type(a_sort_ind)
[[4 3 2 1] <-- unsorted
[5 6 7 8]]
[[4 3 2 1] <-- sorted tensor
[8 7 6 5]]
[[0 1 2 3] <-- indices of sorted tensor
[3 2 1 0]]
tensorflow.python.framework.ops.Tensor
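The relation v(psi(i)) = sort_v(i) can also be checked outside TensorFlow. The sketch below uses NumPy's np.argsort as a stand-in for the index output of tf.nn.top_k (an assumption made only for illustration; it is not the TensorFlow API itself), sorting in descending order as top_k does.

```python
import numpy as np

v = np.array([4, 1, 3, 2])

# Indices that sort v in descending order; this plays the role of psi.
psi = np.argsort(-v, kind="stable")

# Indexing v with psi yields the sorted values, i.e. v[psi[i]] == sort_v[i].
sort_v = v[psi]

print(psi)
print(sort_v)
```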
Given a base array X of shape (2, 3, 4), which can be interpreted as two sets of 3 elements each, where every element is 4-dimensional, I want to sample from X in the following way.
From each of the 2 sets I want to pick 2 subsets, each defined by a binary array of length 3; elements that are not selected are set to 0. So the sampling process is defined by an array of shape (2, 2, 3), and the result of the sampling should have shape (2, 2, 3, 4).
Here's code that does what I need, but I wonder if it could be rewritten more efficiently using numpy indexing.
import numpy as np
np.random.seed(3)
sets = np.random.randint(0, 10, [2, 3, 4])
subset_masks = np.random.randint(0, 2, [2, 2, 3])
print('Base set\n', sets, '\n')
print('Subset masks\n', subset_masks, '\n')
result = np.empty([2, 2, 3, 4])
for set_index in range(sets.shape[0]):
    for subset_index, subset in enumerate(subset_masks[set_index]):
        print('----')
        picked_subset = subset.reshape(3, 1) * sets[set_index]
        result[set_index][subset_index] = picked_subset
        print('Picking subset ', subset, 'from set #', set_index)
        print(picked_subset, '\n')
Output
Base set
[[[8 9 3 8]
[8 0 5 3]
[9 9 5 7]]
[[6 0 4 7]
[8 1 6 2]
[2 1 3 5]]]
Subset masks
[[[0 0 1]
[1 0 0]]
[[1 0 1]
[0 1 1]]]
----
Picking subset [0 0 1] from set # 0
[[0 0 0 0]
[0 0 0 0]
[9 9 5 7]]
----
Picking subset [1 0 0] from set # 0
[[8 9 3 8]
[0 0 0 0]
[0 0 0 0]]
----
Picking subset [1 0 1] from set # 1
[[6 0 4 7]
[0 0 0 0]
[2 1 3 5]]
----
Picking subset [0 1 1] from set # 1
[[0 0 0 0]
[8 1 6 2]
[2 1 3 5]]
Extend both arrays to 4D: add a new axis to subset_masks as its last axis, and a new axis to sets as its second axis. Those new axes can be added with None/np.newaxis. Then NumPy broadcasting performs the element-wise multiplication, like so:
subset_masks[...,None]*sets[:,None]
Just for kicks, we can also use np.einsum:
np.einsum('ijk,ilj->iljk',sets,subset_masks)
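As a quick sanity check, the sketch below compares both one-liners against the nested loop from the question. It uses np.random.default_rng with an arbitrary seed instead of the question's np.random.seed(3), so the random values differ from the printed output above; only the equality of the three results matters here.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, chosen only for reproducibility
sets = rng.integers(0, 10, size=(2, 3, 4))
subset_masks = rng.integers(0, 2, size=(2, 2, 3))

# Reference result: the nested loop from the question, without the printing.
expected = np.empty((2, 2, 3, 4), dtype=sets.dtype)
for set_index in range(sets.shape[0]):
    for subset_index, subset in enumerate(subset_masks[set_index]):
        expected[set_index, subset_index] = subset.reshape(3, 1) * sets[set_index]

# Broadcasting: (2, 2, 3, 1) * (2, 1, 3, 4) -> (2, 2, 3, 4)
via_broadcast = subset_masks[..., None] * sets[:, None]

# einsum: i = set, l = subset, j = element, k = feature
via_einsum = np.einsum('ijk,ilj->iljk', sets, subset_masks)

print(np.array_equal(via_broadcast, expected), np.array_equal(via_einsum, expected))
```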