numpy equivalent of tf.math.segment_sum - python

What is the equivalent of tf.math.segment_sum in numpy?
So basically I would like to rewrite the exact same code from TensorFlow in NumPy, where I am using segment_sum to group certain elements together using a segment_ids array and sum those segments. What is the equivalent code in NumPy? I have an array and the segment_ids array, and I would like to perform segment_sum in NumPy.

You can create something pretty close to tf.math.segment_sum with the method numpy.add.at, which is the at method of the add ufunc:
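import numpy as np

def segment_sum(data, segment_ids):
    data = np.asarray(data)
    # One output slot per segment id, with the trailing shape of data.
    s = np.zeros((np.max(segment_ids) + 1,) + data.shape[1:], dtype=data.dtype)
    # np.add.at accumulates every occurrence of a repeated index.
    np.add.at(s, segment_ids, data)
    return s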
For example,
In [53]: c = np.array([[1, 2, 3, 4], [4, 3, 2, 1], [5, 6, 7, 8]])
In [54]: ids = [0, 0, 1]
In [55]: segment_sum(c, ids)
Out[55]:
array([[5, 5, 5, 5],
[5, 6, 7, 8]])
In [56]: x = [10, 20, 20, 30, 10, 0, 1, 2]
In [57]: xids = [1, 1, 0, 0, 2, 2, 2, 3]
In [58]: segment_sum(x, xids)
Out[58]: array([50, 30, 11, 2])
In [59]: w = np.arange(72).reshape(6, 2, 6) % 5
In [60]: w
Out[60]:
array([[[0, 1, 2, 3, 4, 0],
[1, 2, 3, 4, 0, 1]],
[[2, 3, 4, 0, 1, 2],
[3, 4, 0, 1, 2, 3]],
[[4, 0, 1, 2, 3, 4],
[0, 1, 2, 3, 4, 0]],
[[1, 2, 3, 4, 0, 1],
[2, 3, 4, 0, 1, 2]],
[[3, 4, 0, 1, 2, 3],
[4, 0, 1, 2, 3, 4]],
[[0, 1, 2, 3, 4, 0],
[1, 2, 3, 4, 0, 1]]])
In [61]: wids = [0, 0, 1, 2, 2, 2]
In [62]: segment_sum(w, wids)
Out[62]:
array([[[2, 4, 6, 3, 5, 2],
[4, 6, 3, 5, 2, 4]],
[[4, 0, 1, 2, 3, 4],
[0, 1, 2, 3, 4, 0]],
[[4, 7, 5, 8, 6, 4],
[7, 5, 8, 6, 4, 7]]])
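As a small illustration of why np.add.at is used here rather than a plain `s[segment_ids] += data`, the sketch below contrasts the two on the 1-D example above (truncated to four elements). The variable names `buffered` and `unbuffered` are only illustrative. With ordinary fancy-index assignment, a repeated index is written once per assignment rather than accumulated, so the sums come out wrong.

```python
import numpy as np

data = np.array([10, 20, 20, 30])
ids = np.array([1, 1, 0, 0])

# Plain fancy-index "+=": repeated indices are NOT accumulated.
buffered = np.zeros(2, dtype=data.dtype)
buffered[ids] += data

# np.add.at: every occurrence of a repeated index contributes to the sum.
unbuffered = np.zeros(2, dtype=data.dtype)
np.add.at(unbuffered, ids, data)

print(unbuffered)  # segment 0 -> 20 + 30, segment 1 -> 10 + 20
```

Only the np.add.at result matches what a segment sum should return.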

Related

Expand a multidimensional array with another array of different shape

I have the following arrays:
A = np.array([
[[[0, 1, 2, 3],
[3, 0, 1, 2],
[2, 3, 0, 1],
[1, 3, 2, 1],
[1, 2, 3, 0]]],
[[[9, 8, 7, 6],
[5, 4, 3, 2],
[0, 9, 8, 3],
[1, 9, 2, 3],
[1, 0, -1, 2]]],
[[[0, 7, 1, 2],
[1, 2, 1, 0],
[0, 2, 0, 7],
[-1, 3, 0, 1],
[1, 0, 1, 0]]]
])
A.shape
(3,1,5,4)
B = np.array([
[[[1, 0],
[-1, 2],
[9, 1],
[8, 2],
[7, 0]]],
[[[9, 6],
[5, 2],
[0, 3],
[1, 9],
[1, 0]]],
[[[0, 7],
[1, 0],
[0, 7],
[-1, 1],
[0, 0]]]
])
B.shape
(3,1,5,2)
I want to extend array A with B along the last dimension of A, so that the result X is:
X = np.array([
[[[0, 1, 2, 3, 1, 0],
[3, 0, 1, 2,-1, 2],
[2, 3, 0, 1, 9, 1],
[1, 3, 2, 1, 8, 2],
[1, 2, 3, 0, 7, 0]]],
[[[9, 8, 7, 6, 9, 6],
[5, 4, 3, 2, 5, 2],
[0, 9, 8, 3, 0, 3],
[1, 9, 2, 3, 1, 9],
[1, 0,-1, 2, 1, 0]]],
[[[0, 7, 1, 2, 0, 7],
[1, 2, 1, 0, 1, 0],
[0, 2, 0, 7, 0, 7],
[-1,3, 0, 1,-1, 1],
[1, 0, 1, 0, 0, 0]]]
])
X.shape
(3,1,5,6)
You have to concatenate the 2 arrays together along the axis you need:
C = np.concatenate((A, B), axis=3)
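A minimal check of the shapes involved, using placeholder arrays of zeros and ones in place of the A and B from the question (the values are an assumption for brevity; only the shapes matter). Since the arrays are 4-D, `axis=-1` refers to the same axis as `axis=3`.

```python
import numpy as np

# Placeholder data with the same shapes as A and B in the question.
A = np.zeros((3, 1, 5, 4), dtype=int)
B = np.ones((3, 1, 5, 2), dtype=int)

# All dimensions except the concatenation axis must match.
X = np.concatenate((A, B), axis=-1)

print(X.shape)
```

The first four entries along the last axis of X come from A and the last two from B.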

selecting certain indices in Numpy ndarray using another array

I'm trying to mark the value and indices of max values in a 3D array, getting the max in the third axis.
Now this would have been obvious in a lower dimension:
argmaxes=np.argmax(array)
maximums=array[argmaxes]
but NumPy doesn't understand the second syntax properly for higher than 1D.
Let's say my 3D array has shape (8,8,250). argmaxes=np.argmax(array,axis=-1) would return an (8,8) array with numbers from 0 to 249. My expected output is an (8,8) array containing the maximum along the third dimension. I can achieve this with maxes=np.max(array,axis=-1), but that repeats the same calculation (I need both the values and the indices for later calculations).
I can also just do a crude nested loop:
for i in range(8):
    for j in range(8):
        maxes[i, j] = array[i, j, argmaxes[i, j]]
But is there a nicer way to do this?
You can use advanced indexing. This is a simpler case when shape is (8,8,3):
arr = np.random.randint(99, size=(8,8,3))
x, y = np.indices(arr.shape[:-1])
arr[x, y, np.argmax(arr, axis=-1)]
Sample run:
>>> x
array([[0, 0, 0, 0, 0, 0, 0, 0],
[1, 1, 1, 1, 1, 1, 1, 1],
[2, 2, 2, 2, 2, 2, 2, 2],
[3, 3, 3, 3, 3, 3, 3, 3],
[4, 4, 4, 4, 4, 4, 4, 4],
[5, 5, 5, 5, 5, 5, 5, 5],
[6, 6, 6, 6, 6, 6, 6, 6],
[7, 7, 7, 7, 7, 7, 7, 7]])
>>> y
array([[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7],
[0, 1, 2, 3, 4, 5, 6, 7]])
>>> np.argmax(arr,axis=-1)
array([[2, 1, 1, 2, 0, 0, 0, 1],
[2, 2, 2, 1, 0, 0, 1, 0],
[1, 2, 0, 1, 1, 1, 2, 0],
[1, 0, 0, 0, 2, 1, 1, 0],
[2, 0, 1, 2, 2, 2, 1, 0],
[2, 2, 0, 1, 1, 0, 2, 2],
[1, 1, 0, 1, 1, 2, 1, 0],
[2, 1, 1, 1, 0, 0, 2, 1]], dtype=int64)
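An alternative sketch, assuming NumPy 1.15 or later, uses np.take_along_axis to gather the maxima from the argmax result without building the x and y index grids. The random input and the seed are placeholders, not data from the question.

```python
import numpy as np

rng = np.random.default_rng(0)
arr = rng.integers(99, size=(8, 8, 250))

# Indices of the maxima along the last axis, shape (8, 8).
argmaxes = np.argmax(arr, axis=-1)

# take_along_axis needs indices with the same number of dimensions as arr,
# so add a trailing axis, gather, then drop it again.
maxes = np.take_along_axis(arr, argmaxes[..., np.newaxis], axis=-1).squeeze(axis=-1)

print(maxes.shape)
```

This computes the argmax once and reuses it to look up the values, which is what the nested loop in the question does element by element.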

How to return indices from sorting a 2d numpy array row-by-row?

Input: A 2D numpy array
Output: An array of indices that will sort the array row by row (or column by column)
E.g.: Say the function is get_sorted_indices(array, axis=0)
a = np.array([[1,2,3,4,5]
,[2,3,4,5,6]
,[1,2,3,4,5]
,[2,3,4,6,6]
,[2,3,4,5,6]])
ind = get_sorted_indices(a, axis=0)
Then we will get
>>> ind
[0, 2, 1, 4, 3]
>>> a[ind] # should be equal to np.sort(a, axis=0)
array([[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5],
[2, 3, 4, 5, 6],
[2, 3, 4, 5, 6],
[2, 3, 4, 6, 6]])
>>> np.sort(a, axis=0)
array([[1, 2, 3, 4, 5],
[1, 2, 3, 4, 5],
[2, 3, 4, 5, 6],
[2, 3, 4, 5, 6],
[2, 3, 4, 6, 6]])
I've looked at argsort but I don't understand its output and reading the documentation doesn't help:
>>> a.argsort()
array([[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]])
>>> a.argsort(axis=0)
array([[0, 0, 0, 0, 0],
[2, 2, 2, 2, 2],
[1, 1, 1, 1, 1],
[3, 3, 3, 4, 3],
[4, 4, 4, 3, 4]])
I can do this manually but I think I'm misunderstanding argsort or I'm missing something from numpy.
Is there a standard way to do this, or do I have no choice but to do this manually?
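One possible approach, offered as a sketch rather than a definitive answer: a.argsort(axis=0) sorts each column independently, which is why it returns a 2D array of indices instead of a single row order. To order whole rows lexicographically, np.lexsort can be used. The helper name `get_sorted_row_indices` is hypothetical and only covers the row case from the example.

```python
import numpy as np

def get_sorted_row_indices(a):
    """Hypothetical helper: indices that sort the rows of a 2D array lexicographically."""
    a = np.asarray(a)
    # np.lexsort treats the LAST key as the primary one, so the columns are
    # passed in reverse order to make column 0 the primary sort key.
    return np.lexsort(a.T[::-1])

a = np.array([[1, 2, 3, 4, 5],
              [2, 3, 4, 5, 6],
              [1, 2, 3, 4, 5],
              [2, 3, 4, 6, 6],
              [2, 3, 4, 5, 6]])

ind = get_sorted_row_indices(a)
print(ind)
print(a[ind])
```

For sorting columns instead of rows, the same idea should apply to the transposed array.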

Advanced List Coding using multiple lists

So we are given two lists.
groups = [[0,1],[2],[3,4,5],[6,7,8,9]]
A = [[[0, 1, 6, 7, 8, 9], [0, 1, 6, 7, 8, 9]], [[2]], [[3, 4, 5, 6, 7, 8, 9], [3, 4, 5, 6, 7, 8, 9], [3, 4, 5, 6, 7, 8, 9]], [[0, 1, 3, 4, 5, 6, 8, 9], [0, 1, 3, 4, 5, 7, 8, 9], [0, 1, 3, 4, 5, 6, 7, 8, 9], [0, 1, 3, 4, 5, 6, 7, 8, 9]]]
How do we replace the elements in A with the index of their group in groups? That is, replace the 0 and 1 in A with 0, the 2 in A with 1, the 3, 4 and 5 with 2, and so on.
Output:
A = [[[0, 0, 3, 3, 3, 3], [0, 0, 3, 3, 3, 3]], [[1]], [[2, 2, 2, 3, 3, 3, 3], [2, 2, 2, 3, 3, 3, 3], [2, 2, 2, 3, 3, 3, 3]], [[0, 0, 2, 2, 2, 3, 3, 3], [0, 0, 2, 2, 2, 3, 3, 3], [0, 0, 2, 2, 2, 3, 3, 3, 3], [0, 0, 2, 2, 2, 3, 3, 3, 3]]]
Create a dictionary that stores the group index for each number, then replace each number in A with its group index:
groups = [[0,1],[2],[3,4,5],[6,7,8,9]]
A = [[[0, 1, 6, 7, 8, 9], [0, 1, 6, 7, 8, 9]], [[2]], [[3, 4, 5, 6, 7, 8, 9], [3, 4, 5, 6, 7, 8, 9], [3, 4, 5, 6, 7, 8, 9]], [[0, 1, 3, 4, 5, 6, 8, 9], [0, 1, 3, 4, 5, 7, 8, 9], [0, 1, 3, 4, 5, 6, 7, 8, 9], [0, 1, 3, 4, 5, 6, 7, 8, 9]]]
from collections import defaultdict
dic = defaultdict(int)
for i in range(len(groups)):
    for j in groups[i]:
        dic[j] = i
for i in A:
    for j in i:
        for l in range(len(j)):
            j[l] = dic[j[l]]
output
[[[0, 0, 3, 3, 3, 3], [0, 0, 3, 3, 3, 3]],
[[1]],
[[2, 2, 2, 3, 3, 3, 3], [2, 2, 2, 3, 3, 3, 3], [2, 2, 2, 3, 3, 3, 3]],
[[0, 0, 2, 2, 2, 3, 3, 3],
[0, 0, 2, 2, 2, 3, 3, 3],
[0, 0, 2, 2, 2, 3, 3, 3, 3],
[0, 0, 2, 2, 2, 3, 3, 3, 3]]]
Even though there is no attempt from your side, here you go :
def f(l, i):
    for k in l:
        if i in k:
            return l.index(k)
output_ = [[[f(groups, n) for n in a0] for a0 in a] for a in A]
Output :
[[[0, 0, 3, 3, 3, 3], [0, 0, 3, 3, 3, 3]], [[1]], [[2, 2, 2, 3, 3, 3, 3], [2, 2, 2, 3, 3, 3, 3], [2, 2, 2, 3, 3, 3, 3]], [[0, 0, 2, 2, 2, 3, 3, 3], [0, 0, 2, 2, 2, 3, 3, 3], [0, 0, 2, 2, 2, 3, 3, 3, 3], [0, 0, 2, 2, 2, 3, 3, 3, 3]]]
try this:
def replace_items(i, inner_list, *lists):
    for l in lists:
        for item in l:
            if item in inner_list:
                index = l.index(item)
                l[index] = i
for i, inner_list in enumerate(groups):
    for lists in A:
        replace_items(i, inner_list, *lists)
print(A)
If you convert your groups into a dictionary, it will be easy to process the 3 level list using a list comprehension:
groupDict = { v:i for i,g in enumerate(groups) for v in g }
A = [ [ [ groupDict[z] for z in yz ] for yz in xyz] for xyz in A ]

Efficiently define an implicit Numpy array

A and B are Numpy arrays of common shape [n1,n2,n3]. The values of B are all integers in [0,n3). I want A to "invert" B in the sense that each value of A satisfies A[i,j,B[i,j,k]]=k for all i,j,k in the appropriate ranges. While it's obvious how to do this with for loops, I suspect that there is a clever one-liner using fancy indexing. Does anyone see it?
Here are two methods.
The first method is a one-liner: A = B.argsort(axis=-1)
Here's an example. B has shape (3, 5, 7) and for each fixed i and j, B[i,j,:] is a permutation of range(B.shape[2]).
In [386]: B
Out[386]:
array([[[1, 5, 4, 6, 2, 3, 0],
[6, 5, 3, 4, 2, 1, 0],
[4, 5, 0, 3, 1, 2, 6],
[0, 5, 6, 3, 2, 1, 4],
[4, 1, 5, 2, 6, 3, 0]],
[[2, 6, 0, 1, 5, 4, 3],
[3, 2, 4, 0, 1, 5, 6],
[3, 4, 6, 5, 1, 2, 0],
[4, 6, 3, 0, 2, 5, 1],
[0, 3, 1, 6, 4, 5, 2]],
[[0, 3, 6, 2, 1, 5, 4],
[3, 1, 2, 4, 6, 0, 5],
[1, 3, 5, 6, 4, 0, 2],
[4, 1, 6, 0, 2, 3, 5],
[6, 4, 5, 1, 0, 3, 2]]])
In [387]: A = B.argsort(axis=-1)
In [388]: A
Out[388]:
array([[[6, 0, 4, 5, 2, 1, 3],
[6, 5, 4, 2, 3, 1, 0],
[2, 4, 5, 3, 0, 1, 6],
[0, 5, 4, 3, 6, 1, 2],
[6, 1, 3, 5, 0, 2, 4]],
[[2, 3, 0, 6, 5, 4, 1],
[3, 4, 1, 0, 2, 5, 6],
[6, 4, 5, 0, 1, 3, 2],
[3, 6, 4, 2, 0, 5, 1],
[0, 2, 6, 1, 4, 5, 3]],
[[0, 4, 3, 1, 6, 5, 2],
[5, 1, 2, 0, 3, 6, 4],
[5, 0, 6, 1, 4, 2, 3],
[3, 1, 4, 5, 0, 6, 2],
[4, 3, 6, 5, 1, 2, 0]]])
Verify the desired property by sampling a few values.
In [389]: A[0, 0, B[0, 0, 0]]
Out[389]: 0
In [390]: A[0, 0, B[0, 0, 1]]
Out[390]: 1
In [391]: A[0, 0, B[0, 0, :]]
Out[391]: array([0, 1, 2, 3, 4, 5, 6])
In [392]: A[2, 3, B[2, 3, :]]
Out[392]: array([0, 1, 2, 3, 4, 5, 6])
The second method has a lower time complexity than using argsort (a single pass over the elements instead of an O(n log n) sort along the last axis), but it is a three-liner rather than a one-liner. I'll use the same B as above.
Create A, but with no values assigned yet.
In [393]: A = np.empty_like(B)
Create index arrays for each dimension of B.
In [394]: i, j, k = np.ogrid[tuple(slice(n) for n in B.shape)] # or np.ix_(*[range(n) for n in B.shape])
This is the cool part. Do the assignment exactly as you wrote it in the question.
In [395]: A[i, j, B[i, j, k]] = k
Verify that we have the same A as above.
In [396]: A
Out[396]:
array([[[6, 0, 4, 5, 2, 1, 3],
[6, 5, 4, 2, 3, 1, 0],
[2, 4, 5, 3, 0, 1, 6],
[0, 5, 4, 3, 6, 1, 2],
[6, 1, 3, 5, 0, 2, 4]],
[[2, 3, 0, 6, 5, 4, 1],
[3, 4, 1, 0, 2, 5, 6],
[6, 4, 5, 0, 1, 3, 2],
[3, 6, 4, 2, 0, 5, 1],
[0, 2, 6, 1, 4, 5, 3]],
[[0, 4, 3, 1, 6, 5, 2],
[5, 1, 2, 0, 3, 6, 4],
[5, 0, 6, 1, 4, 2, 3],
[3, 1, 4, 5, 0, 6, 2],
[4, 3, 6, 5, 1, 2, 0]]])
After poking around some more on SO, I see that both these methods appear in answers to the question "How to invert a permutation array in numpy". The only thing really new here is doing the inversion along one axis of a three-dimensional array.
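For completeness, here is a self-contained sketch that checks the two methods against each other on a randomly generated B. It swaps in np.put_along_axis (available since NumPy 1.15) for the explicit index grids; that is a substitution for illustration, not the method shown above, and the shapes and seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n1, n2, n3 = 3, 5, 7

# Build B so that each B[i, j, :] is a permutation of range(n3):
# the argsort of random values along the last axis is always a permutation.
B = rng.random((n1, n2, n3)).argsort(axis=-1)

# Method 1: invert each permutation by sorting.
A_sort = B.argsort(axis=-1)

# Method 2 (variant): scatter k into position B[i, j, k] along the last axis.
A_put = np.empty_like(B)
np.put_along_axis(A_put, B, np.arange(n3), axis=-1)

# Both should satisfy A[i, j, B[i, j, k]] == k for all i, j, k.
check = np.take_along_axis(A_put, B, axis=-1)
print(np.array_equal(A_sort, A_put))
```

Both approaches rely on each B[i, j, :] being a permutation; if values repeat, some positions of A are never written by the scatter version.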
