How can I implement the TensorFlow function tf.nn.top_k with NumPy? Suppose the input is an ndarray in height x width x channel format.
You can use the answer here with Numpy 1.8 and up.
I spent more time on this than I wanted, because the other answers treated the whole multidimensional array as a single search, whereas top_k only looks at the last dimension. There's more information here, where the partition is used to specifically sort a given axis.
To summarize, based upon the tensorflow signature (without name):
import numpy as np
def top_k(input, k=1, sorted=True):
    """Top k max pooling
    Args:
        input(ndarray): convolutional feature in height x width x channel format
        k(int): if k==1, it is equal to normal max pooling
        sorted(bool): whether to return the array sorted by channel value
    Returns:
        ndarray: k x (height x width)
        ndarray: k
    """
    # Indices of the k largest entries along the last axis (unordered)
    ind = np.argpartition(input, -k)[..., -k:]
    def get_entries(input, ind, sorted):
        if len(ind.shape) == 1:
            if sorted:
                # Order the k entries from largest to smallest
                ind = ind[np.argsort(-input[ind])]
            return input[ind], ind
        # Recurse over the leading axis until a 1-D slice is reached
        output, ind = zip(*[get_entries(inp, id, sorted) for inp, id in zip(input, ind)])
        return np.array(output), np.array(ind)
    return get_entries(input, ind, sorted)
Keep in mind, for your answer, you tested with
arr = np.random.rand(3, 3, 3)
arr1, ind1 = top_k(arr)
arr2 = np.max(arr, axis=(0,1))
arr3, ind3 = tf.nn.top_k(arr)
but arr2.shape is (3,) and arr3.numpy().shape is (3, 3, 1).
If you really want tf.nn.top_k like functionality, you should use np.array_equal(arr3, np.max(arr, axis=-1, keepdims=True)) as the test. I ran this with tf.enable_eager_execution() executed, hence the .numpy() instead of .eval().
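For comparison, here is a shorter sketch of the same last-axis behaviour without the recursion. It assumes NumPy 1.15 or newer for np.take_along_axis, and the name top_k_last_axis is my own, not something from TensorFlow or the answer above.

```python
import numpy as np

def top_k_last_axis(a, k=1):
    """Return the k largest values along the last axis (descending) and their indices."""
    a = np.asarray(a)
    # Unordered indices of the k largest entries along the last axis.
    part = np.argpartition(a, -k, axis=-1)[..., -k:]
    part_vals = np.take_along_axis(a, part, axis=-1)
    # Order those k entries from largest to smallest.
    order = np.argsort(-part_vals, axis=-1)
    ind = np.take_along_axis(part, order, axis=-1)
    return np.take_along_axis(a, ind, axis=-1), ind

np.random.seed(0)
arr = np.random.rand(3, 4, 5)
values, indices = top_k_last_axis(arr, k=2)
print(values.shape, indices.shape)
```

With k=1 this should agree with np.max(arr, axis=-1, keepdims=True), which is the same check suggested above for tf.nn.top_k.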
import numpy as np
def top_k(input, k=1):
    """Top k max pooling
    Args:
        input(ndarray): convolutional feature in height x width x channel format
        k(int): if k==1, it is equal to normal max pooling
    Returns:
        ndarray: k x (height x width)
    """
    # Flatten the spatial axes, then sort each channel in descending order
    input = np.reshape(input, [-1, input.shape[-1]])
    input = np.sort(input, axis=0)[::-1, :][:k, :]
    return input
arr = np.random.rand(3, 3, 3)
arr1 = top_k(arr)
arr2 = np.max(arr, axis=(0,1))
assert np.array_equal(top_k(arr)[0], np.max(arr, axis=(0,1)))
First things first: I'm relatively new to TensorFlow.
I'm trying to implement a custom layer in tensorflow.keras and I'm having a relatively hard time achieving the following:
I've got 3 Tensors (x,y,z) of shape (?,49,3,3,32) [where ? is the batch size]
On each Tensor I compute the sum over the 3rd and 4th axes [thus I end up with 3 Tensors of shape (?,49,32)]
By doing an argmax (A) on the above 3 Tensors (?,49,32) I get a single (?,49,32) Tensor
Now I want to use this tensor to select slices from the initial x,y,z Tensors in the following form:
Each element in the last dimension of A corresponds to the selected Tensor.
(aka: 0 = X, 1 = Y, 2 = Z)
The index of the last dimension of A corresponds to the slice that I would like to extract from the Tensor last dimension.
I've tried to achieve the above using tf.gather but I had no luck. Then I tried using a series of tf.map_fn, which is ugly and computationally costly.
To simplify the above:
let's say we've got an A array of shape (3,3,3,32). Then the numpy equivalent of what I try to achieve is this:
import numpy as np
x = np.random.rand(3,3,32)
y = np.random.rand(3,3,32)
z = np.random.rand(3,3,32)
x_sums = np.sum(np.sum(x,axis=0),0)
y_sums = np.sum(np.sum(y,axis=0),0)
z_sums = np.sum(np.sum(z,axis=0),0)
max_sums = np.argmax([x_sums,y_sums,z_sums],0)
A = np.array([x,y,z])
tmp = []
for i in range(0,len(max_sums)):
    # Take channel i from whichever of x, y, z has the largest sum there
    tmp.append(A[max_sums[i],:,:,i])
output = np.transpose(np.stack(tmp))
Any suggestions?
ps: I tried tf.gather_nd but I had no luck
This is how you can do something like that with tf.gather_nd:
import tensorflow as tf
# Make example data
b = 10 # Batch size
x = tf.random.uniform((b, 49, 3, 3, 32))
y = tf.random.uniform((b, 49, 3, 3, 32))
z = tf.random.uniform((b, 49, 3, 3, 32))
# Stack tensors together
data = tf.stack([x, y, z], axis=2)
# Put reduction axes last
data_t = tf.transpose(data, (0, 1, 5, 2, 3, 4))
# Reduce
s = tf.reduce_sum(data_t, axis=(4, 5))
# Find largest sums
idx = tf.argmax(s, 3)
# Make gather indices
data_shape = tf.shape(data_t, idx.dtype)
bb, ii, jj = tf.meshgrid(*(tf.range(data_shape[i]) for i in range(3)), indexing='ij')
# Gather result
output_t = tf.gather_nd(data_t, tf.stack([bb, ii, jj, idx], axis=-1))
# Reorder axes
output = tf.transpose(output_t, (0, 1, 3, 4, 2))
print(output.shape)
# TensorShape([10, 49, 3, 3, 32])
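To sanity-check the selection logic without TensorFlow, here is a rough NumPy sketch for the simplified (3,3,32) case from the question. It assumes NumPy 1.15 or newer for np.take_along_axis; the variable names sums and winner are mine.

```python
import numpy as np

np.random.seed(0)
x, y, z = (np.random.rand(3, 3, 32) for _ in range(3))

# Stack the candidates on a new leading axis: shape (3, 3, 3, 32).
A = np.stack([x, y, z])
# Sum over the two spatial axes: shape (3, 32).
sums = A.sum(axis=(1, 2))
# Index of the winning candidate for each channel: shape (32,).
winner = np.argmax(sums, axis=0)
# Select along axis 0; the index array broadcasts over the spatial axes.
output = np.take_along_axis(A, winner[None, None, None, :], axis=0)[0]
print(output.shape)
```

Each output[:, :, c] should then be the 3x3 slice of whichever of x, y, z had the largest sum in channel c.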
I want to vectorize the following code:
def style_noise(self, y, style):
    n = torch.randn(y.shape)
    for i in range(n.shape[0]):
        n[i] = (n[i] - n.mean(dim=(1, 2, 3))[i]) * style.std(dim=(1, 2, 3))[i] / n.std(dim=(1, 2, 3))[i] + style.mean(dim=(1, 2, 3))[i]
    noise = Variable(n, requires_grad=False).to(y.device)
    return noise
I didn't find a nice way of doing so.
y and style are 4d tensors, say style.shape = y.shape = [64, 3, 128, 128].
I want to return the noise tensor, noise.shape = [64, 3, 128, 128].
Please let me know in the comments if the question is not clear.
Your use case is exactly why the .mean and .std methods come with a keepdim parameter. You can make use of this to enable broadcasting semantics to vectorize things for you:
def style_noise(self, y, style):
    n = torch.randn(y.shape)
    n_mean = n.mean(dim=(1, 2, 3), keepdim=True)
    n_std = n.std(dim=(1, 2, 3), keepdim=True)
    style_mean = style.mean(dim=(1, 2, 3), keepdim=True)
    style_std = style.std(dim=(1, 2, 3), keepdim=True)
    n = (n - n_mean) * style_std / n_std + style_mean
    noise = Variable(n, requires_grad=False).to(y.device)
    return noise
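If you want to convince yourself that the broadcasting does what the loop did, here is a NumPy analogue you can run without PyTorch. It is only a sketch: style_noise_np is a hypothetical name, and ddof=1 is used on the assumption that it matches the default (unbiased) behaviour of torch.std.

```python
import numpy as np

def style_noise_np(style, seed=0):
    """Draw Gaussian noise and match each sample's mean and std to `style`."""
    rng = np.random.RandomState(seed)
    n = rng.randn(*style.shape)
    axes = (1, 2, 3)
    # keepdims=True leaves shape (N, 1, 1, 1), which broadcasts against (N, C, H, W).
    n_mean = n.mean(axis=axes, keepdims=True)
    n_std = n.std(axis=axes, keepdims=True, ddof=1)
    style_mean = style.mean(axis=axes, keepdims=True)
    style_std = style.std(axis=axes, keepdims=True, ddof=1)
    return (n - n_mean) * style_std / n_std + style_mean

style = np.random.RandomState(1).rand(4, 3, 8, 8) * 5.0 + 2.0
noise = style_noise_np(style)
print(noise.shape)
```

After the rescaling, each sample of noise should have the same mean and standard deviation as the corresponding sample of style.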
To calculate the mean and std for the whole tensor, call the methods with no arguments:
m = t.mean(); print(m) # if you don't set the dim for the whole tensor
s = t.std(); print(s) # if you don't set the dim for the whole tensor
Then, if your shape is 2,2,2 for instance, create tensors of that shape so the subtraction and division broadcast:
ss = torch.empty(2,2,2).fill_(s)
mm = torch.empty(2,2,2).fill_(m)
At the moment keepdim is not working as expected when you don't set the dim.
m = t.mean(); print(m) # for the whole tensor
s = t.std(); print(s) # for the whole tensor
m = t.mean(dim=0); print(m) # 0 means columns mean
s = t.std(dim=0); print(s) # 0 means columns mean
m = t.mean(dim=1); print(m) # 1 means rows mean
s = t.std(dim=1); print(s) # 1 means rows mean
m = t.mean(keepdim=True); print(m) # will not work
s = t.std(keepdim=True); print(s) # will not work
If you set dim to a tuple, the mean is computed over just the axes you asked for, not over the whole tensor.
I have a numpy array my_array of size 100x20. I want to create a function that receives as input a 2d numpy array my_arr and an index x, and returns two arrays: test_arr of size 1x20 and train_arr of size 99x20. test_arr corresponds to the row of my_arr with index x, and train_arr contains the remaining rows. I tried to follow a solution using masking:
def split_train_test(my_arr, x):
    a = np.ma.array(my_arr, mask=False)
    a.mask[x, :] = True
    a = np.array(a.compressed())
    return a
Apparently this is not working as I wanted. How can I return a numpy array as the result, with the train and test arrays split properly?
You can use simple index and numpy.delete for this:
def split_train_test(my_arr, x):
    return np.delete(my_arr, x, 0), my_arr[x:x+1]
my_arr = np.arange(10).reshape(5,2)
train, test = split_train_test(my_arr, 2)
#array([[0, 1],
# [2, 3],
# [6, 7],
# [8, 9]])
#array([[4, 5]])
You can also use a boolean index as the mask:
def split_train_test(my_arr, x):
    # define mask
    mask = np.zeros(my_arr.shape[0], dtype=bool)
    mask[x] = True # True only at index x, False elsewhere
    return my_arr[mask, :], my_arr[~mask, :]
Sample run:
test_arr, train_arr = split_train_test(np.random.rand(100, 20), x=10)
print(test_arr.shape, train_arr.shape)
((1L, 20L), (99L, 20L))
If someone is looking for the general case where more than one element needs to be allocated to the test array (say 80%-20% split), x can also accept an array:
my_arr = np.random.rand(100, 20)
x = np.random.choice(np.arange(my_arr.shape[0]), int(my_arr.shape[0]*0.8), replace=False)
test_arr, train_arr = split_train_test(my_arr, x)
print(test_arr.shape, train_arr.shape)
((80L, 20L), (20L, 20L))
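As a quick cross-check, the np.delete approach and the boolean-mask approach should give the same split for a single index. The sketch below uses hypothetical function names (split_with_delete, split_with_mask) and returns (train, test) in that order from both, which differs from the return order of the mask version above.

```python
import numpy as np

def split_with_delete(my_arr, x):
    # Rows not listed in x, then the rows listed in x (kept 2-D).
    return np.delete(my_arr, x, axis=0), my_arr[np.atleast_1d(x)]

def split_with_mask(my_arr, x):
    # Boolean mask that is True only for the rows listed in x.
    mask = np.zeros(my_arr.shape[0], dtype=bool)
    mask[x] = True
    return my_arr[~mask], my_arr[mask]

my_arr = np.arange(2000).reshape(100, 20)
train_d, test_d = split_with_delete(my_arr, 10)
train_m, test_m = split_with_mask(my_arr, 10)
print(train_d.shape, test_d.shape)
```

One difference to keep in mind when x is an array: the mask version returns the selected rows in their original order, while fancy indexing returns them in the order given in x.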
In short: I am looking for a simple numpy (maybe one-liner) implementation of MaxPool, i.e. the maximum over a window of a numpy.ndarray, for all locations of the window across dimensions.
In more detail: I am implementing a convolutional neural network ("CNN"), and one of the typical layers in such a network is the MaxPool layer (look for example here). Writing
y = MaxPool(x, S), where x is an input ndarray and S is a parameter, the output of MaxPool is given in pseudocode by:
y[b,h,w,c] = max(x[b, s*h + i, s*w + j, c]) over i = 0,..., S-1; j = 0,...,S-1.
That is, y is an ndarray where the value at indices b,h,w,c equals the maximum taken over the window of size S x S along the second and third dimensions of the input x, with the window "corner" placed at indices b,h,w,c.
Some additional details: The network is implemented using numpy. A CNN has many "layers", where the output of one layer is the input to the next layer. The inputs to a layer are numpy.ndarrays called "tensors". In my case the tensors are 4-dimensional numpy.ndarrays, x. That is, x.shape is a tuple (B,H,W,C). The size of each dimension changes after the tensor is processed by a layer; for example, the input to layer i = 4 can have size B = 10, H = 24, W = 24, C = 3, while the output, aka the input to layer i+1, has B = 10, H = 12, W = 12, C = 5. As indicated in the comments, the size after application of MaxPool is (B, H - S + 1, W - S + 1, C).
For a concreteness: if I use
import numpy as np
y = np.amax(x, axis = (1,2))
where x.shape is say (2,3,3,4) this will give me what I want but for a degenerate case where the window I am maximizing over is of the size 3 x 3, the size of the second and third dimension of x, which is not exactly what I want.
Here's a solution using np.lib.stride_tricks.as_strided to create sliding windows, resulting in a 6D array of shape (B,H-S+1,W-S+1,S,S,C), and then simply taking the max along the fourth and fifth axes, which gives an output array of shape (B,H-S+1,W-S+1,C). The intermediate 6D array is a view into the input array and as such won't occupy any more memory. The subsequent max, being a reduction, would efficiently utilize the sliding views.
Thus, an implementation would be -
# Based on
def patchify(img, patch_shape):
    a, X, Y, b = img.shape
    x, y = patch_shape
    shape = (a, X - x + 1, Y - y + 1, x, y, b)
    a_str, X_str, Y_str, b_str = img.strides
    strides = (a_str, X_str, Y_str, X_str, Y_str, b_str)
    return np.lib.stride_tricks.as_strided(img, shape=shape, strides=strides)
out = patchify(x, (S,S)).max(axis=(3,4))
Sample run -
In [224]: x = np.random.randint(0,9,(10,24,24,3))
In [225]: S = 5
In [226]: np.may_share_memory(patchify(x, (S,S)), x)
Out[226]: True
In [227]: patchify(x, (S,S)).shape
Out[227]: (10, 20, 20, 5, 5, 3)
In [228]: patchify(x, (S,S)).max(axis=(3,4)).shape
Out[228]: (10, 20, 20, 3)
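On newer NumPy (1.20 and up, as far as I know) the same stride-1 windows can be built with sliding_window_view instead of computing strides by hand. This is a sketch under that assumption, and maxpool_stride1 is a hypothetical name.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def maxpool_stride1(x, S):
    """Max over every S x S window along axes 1 and 2 of a (B, H, W, C) array."""
    # The window axes are appended at the end: (B, H-S+1, W-S+1, C, S, S).
    windows = sliding_window_view(x, (S, S), axis=(1, 2))
    return windows.max(axis=(-2, -1))

np.random.seed(0)
x = np.random.randint(0, 9, (2, 6, 7, 3))
out = maxpool_stride1(x, 3)
print(out.shape)
```

Note that the window axes land last here, rather than in positions 3 and 4 as in patchify, so the reduction is taken over the last two axes.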
The general solution to this question is being worked on in this github issue, but I was wondering if there are workarounds using tf.gather (or something else) to achieve array indexing using a multi-index. One solution I came up with was to broadcast multiply each index in the multi-idx with the cumulative product of the tensor shape, which produces indices suitable for indexing the flattened tensor:
import tensorflow as tf
import numpy as np
def __cumprod(l):
    # Get the length and make a copy
    ll = len(l)
    l = [v for v in l]
    # Reverse cumulative product
    for i in range(ll-1):
        l[ll-i-2] *= l[ll-i-1]
    return l

def ravel_multi_index(tensor, multi_idx):
    """
    Returns a tensor suitable for use as the index
    on a gather operation on argument tensor.
    """
    if not isinstance(tensor, (tf.Variable, tf.Tensor)):
        raise TypeError('tensor should be a tf.Variable or tf.Tensor')
    if not isinstance(multi_idx, list):
        multi_idx = [multi_idx]
    # Shape of the tensor in ints
    shape = [i.value for i in tensor.get_shape()]
    if len(shape) != len(multi_idx):
        raise ValueError("Tensor rank is different "
                         "from the multi_idx length.")
    # Work out the shape of each tensor in the multi_idx
    idx_shape = [tuple(j.value for j in i.get_shape()) for i in multi_idx]
    # Ensure that each multi_idx tensor is length 1
    assert all(len(i) == 1 for i in idx_shape)
    # Create a list of reshaped indices. New shape will be
    # [1, 1, dim[0], 1] for the 3rd index in multi_idx
    # for example.
    reshaped_idx = [tf.reshape(idx, [1 if i !=j else dim[0]
                                     for j in range(len(shape))])
                    for i, (idx, dim)
                    in enumerate(zip(multi_idx, idx_shape))]
    # Figure out the base indices for each dimension
    base = __cumprod(shape)
    # Now multiply base indices by each reshaped index
    # to produce the flat index
    return (sum(b*s for b, s in zip(base[1:], reshaped_idx[:-1]))
            + reshaped_idx[-1])
# Shape and slice starts and sizes
shape = (Z, Y, X) = 4, 5, 6
Z0, Y0, X0 = 1, 1, 1
ZS, YS, XS = 3, 3, 4
# Numpy matrix and index
M = np.random.random(size=shape)
idx = [
    np.arange(Z0, Z0+ZS).reshape(ZS,1,1),
    np.arange(Y0, Y0+YS).reshape(1,YS,1),
    np.arange(X0, X0+XS).reshape(1,1,XS),
]
# Tensorflow matrix and indices
TM = tf.Variable(M)
TF_flat_idx = ravel_multi_index(TM, [
tf.range(Z0, Z0+ZS),
tf.range(Y0, Y0+YS),
tf.range(X0, X0+XS)])
TF_data = tf.gather(tf.reshape(TM,[-1]), TF_flat_idx)
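with tf.Session() as S:
    # TM is a tf.Variable, so it has to be initialized first
    S.run(tf.global_variables_initializer())
    # Obtain data via flat indexing
    data = S.run(TF_data)
    # Check that it agrees with data obtained
    # by numpy smart indexing
    assert np.all(data == M[tuple(idx)])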
However, this only works on tensors of rank 3 because of this (current) limitation, which restricts broadcasts to tensors of rank 3.
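For reference, the flat-index arithmetic itself can be checked in plain NumPy, independent of any TensorFlow limitation. This is just a sketch of the idea for the rank-3 case, reusing the shapes and slice bounds from the example; the names iz, iy, ix and flat_idx are mine.

```python
import numpy as np

Z, Y, X = 4, 5, 6
Z0, Y0, X0 = 1, 1, 1
ZS, YS, XS = 3, 3, 4

M = np.random.random(size=(Z, Y, X))

# One index vector per axis, reshaped so that the three broadcast to (ZS, YS, XS).
iz = np.arange(Z0, Z0 + ZS).reshape(ZS, 1, 1)
iy = np.arange(Y0, Y0 + YS).reshape(1, YS, 1)
ix = np.arange(X0, X0 + XS).reshape(1, 1, XS)

# Row-major flat index: each axis index is scaled by the product of the trailing dimensions.
flat_idx = iz * (Y * X) + iy * X + ix
data = M.reshape(-1)[flat_idx]
print(data.shape)
```

The result should match the ordinary slice M[Z0:Z0+ZS, Y0:Y0+YS, X0:X0+XS].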
At the moment I can only think of doing a chained gather, transpose, gather, transpose, gather, but this is unlikely to be efficient. e.g.
shape = (8, 9, 10)
A = tf.random_normal(shape)
data = tf.gather(tf.transpose(tf.gather(A, [1, 3]), [1,0,2]), ...)
Any ideas?
It sounds like you want gather_nd.
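To illustrate what gather_nd does with its indices, here is a rough NumPy emulation: the last axis of the index array addresses the leading axes of the data. gather_nd_np is a hypothetical helper for illustration only, not part of TensorFlow or NumPy.

```python
import numpy as np

def gather_nd_np(params, indices):
    """NumPy sketch of gather_nd: the last axis of `indices` indexes the leading axes of `params`."""
    indices = np.asarray(indices)
    # Split the last axis into one index array per leading axis of params.
    return params[tuple(np.moveaxis(indices, -1, 0))]

shape = (4, 5, 6)
M = np.arange(np.prod(shape)).reshape(shape)
# Full index grid for the block starting at (1, 1, 1) with size (3, 3, 4).
zz, yy, xx = np.meshgrid(np.arange(1, 4), np.arange(1, 4), np.arange(1, 5), indexing='ij')
indices = np.stack([zz, yy, xx], axis=-1)  # shape (3, 3, 4, 3)
data = gather_nd_np(M, indices)
print(data.shape)
```

Because the indices are stacked along a trailing axis, no flattening of the tensor is needed, and nothing in this approach is specific to rank 3.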