Tensorflow: Keep 10% of the largest entries of a tensor - python

I want to filter a tensor by keeping only its largest 10% of entries. Is there a TensorFlow function to do that? What would a possible implementation look like? I am looking for something that can handle tensors of shape [N,W,H,C] and [N,W*H*C].
By filter I mean that the shape of the tensor remains the same, but only the largest 10% of entries are kept; all other entries become zero.
Is that possible?

The correct way of doing this would be to compute the 90th percentile, for example with tf.contrib.distributions.percentile:
import tensorflow as tf
images = ... # [N, W, H, C]
n = tf.shape(images)[0]
images_flat = tf.reshape(images, [n, -1])
p = tf.contrib.distributions.percentile(images_flat, 90, axis=1, interpolation='higher')
images_top10 = tf.where(images >= tf.reshape(p, [n, 1, 1, 1]),
                        images, tf.zeros_like(images))
If you want to be ready for TensorFlow 2.x, where tf.contrib will be removed, you can instead use TensorFlow Probability, which is where the percentile function lives going forward (as tfp.stats.percentile).
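For example, a minimal sketch of the same computation with TensorFlow Probability (assuming the tensorflow_probability package is installed; tfp.stats.percentile takes the same arguments as the tf.contrib version used above, and the random tensor is just a stand-in for your images):
import tensorflow as tf
import tensorflow_probability as tfp
images = tf.random.uniform((2, 8, 8, 3))  # stand-in for a [N, W, H, C] batch
n = tf.shape(images)[0]
images_flat = tf.reshape(images, [n, -1])
# 90th percentile per batch element; 'higher' rounds up between data points
p = tfp.stats.percentile(images_flat, 90, axis=1, interpolation='higher')
images_top10 = tf.where(images >= tf.reshape(p, [n, 1, 1, 1]),
                        images, tf.zeros_like(images))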
EDIT: If you want to do the filtering per channel, you can modify the code slightly like this:
import tensorflow as tf
images = ... # [N, W, H, C]
shape = tf.shape(images)
n, c = shape[0], shape[3]
images_flat = tf.reshape(images, [n, -1, c])
p = tf.contrib.distributions.percentile(images_flat, 90, axis=1, interpolation='higher')
images_top10 = tf.where(images >= tf.reshape(p, [n, 1, 1, c]),
                        images, tf.zeros_like(images))

I've not found any built-in method yet. Try this workaround:
import numpy as np
import tensorflow as tf
def filter(tensor, ratio):
    num_entries = tf.reduce_prod(tensor.shape)
    num_to_keep = tf.cast(tf.multiply(ratio, tf.cast(num_entries, tf.float32)), tf.int32)
    # Calculate threshold
    x = tf.contrib.framework.sort(tf.reshape(tensor, [num_entries]))
    threshold = x[-num_to_keep]
    # Filter the tensor
    mask = tf.cast(tf.greater_equal(tensor, threshold), tf.float32)
    return tf.multiply(tensor, mask)
tensor = tf.constant(np.arange(40).reshape(2, 4, 5), dtype=tf.float32)
filtered_tensor = filter(tensor, 0.1)
# Print result
tf.InteractiveSession()
print(tensor.eval())
print(filtered_tensor.eval())
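If you are already on TensorFlow 2.x, where tf.contrib and tf.contrib.framework.sort are gone, here is a sketch of the same idea built on tf.math.top_k (the function name keep_top_fraction is mine):
import tensorflow as tf
def keep_top_fraction(tensor, ratio=0.1):
    # Number of entries to keep (at least one)
    flat = tf.reshape(tensor, [-1])
    k = tf.maximum(1, tf.cast(
        tf.round(ratio * tf.cast(tf.size(flat), tf.float32)), tf.int32))
    # The k-th largest value is the threshold
    threshold = tf.math.top_k(flat, k).values[-1]
    return tf.where(tensor >= threshold, tensor, tf.zeros_like(tensor))
tensor = tf.reshape(tf.range(40, dtype=tf.float32), (2, 4, 5))
print(keep_top_fraction(tensor, 0.1))  # only the 4 largest entries survive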

Related

Tensor slicing: tensorflow vs pytorch

I was testing this simple slicing operation in TF and PyTorch, which should give matching results in both:
import tensorflow as tf
import numpy as np
import torch
tf_x = tf.random.uniform((4, 64, 64, 3))
pt_x = torch.Tensor(tf_x.numpy())
pt_x = pt_x.permute(0, 3, 1, 2)
# slicing operation
print(np.any(pt_x[:, :, 1:].permute(0, 2, 3, 1).numpy() - tf_x[:, 1:].numpy()))
# > False
pt_x = torch.Tensor(tf_x.numpy())
b, h, w, c = pt_x.shape
pt_x = pt_x.reshape((b, c, h, w))
print(np.any(pt_x.view(b, h, w, c).numpy() - tf_x.numpy())) # False
print(np.any(pt_x[:, :, 1:].reshape(4, 63, 64, 3).numpy() - tf_x[:, 1:].numpy()))
# > True
The problem lies in the last line. PyTorch and TF should produce the same values, but they don't. Is this discrepancy caused by my attempt to reshape the tensor?
On one hand, pt_x is equal to tf_x; use np.isclose to verify:
>>> np.isclose(pt_x.view(b, h, w, c).numpy(), tf_x.numpy()).all()
True
On the other hand, you are slicing the two tensors differently: pt_x[:, :, 1:] removes the first element along axis=2, while tf_x[:, 1:] removes the first element along axis=1. Therefore you end up comparing elements taken from different positions, such as tf_x[:, 1:][0,-1,-1,-1] and pt_x[0,-1,-1,-1].
Also keep in mind that tensor layouts differ between TensorFlow and PyTorch: the former uses a channels-last layout, while the latter uses channels-first. The operation needed to convert between the two is a permutation (not a reshape).
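To see why a reshape cannot substitute for a permute, here is a minimal PyTorch illustration (toy shapes of my choosing):
import torch
x = torch.arange(2 * 2 * 3).reshape(1, 2, 2, 3)  # NHWC-style toy tensor
perm = x.permute(0, 3, 1, 2)    # real layout change: moves elements around
resh = x.reshape(1, 3, 2, 2)    # same flat element order, new shape labels
print(torch.equal(perm, resh))  # False: the channels end up scrambled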

Multidimensional Tensor slicing

First things first: I'm relatively new to TensorFlow.
I'm trying to implement a custom layer in tensorflow.keras and I'm having a relatively hard time trying to achieve the following:
I've got 3 Tensors (x,y,z) of shape (?,49,3,3,32) [where ? is the batch size]
On each Tensor I compute the sum over the 3rd and 4th axes [thus I end up with 3 Tensors of shape (?,49,32)]
By doing an argmax (A) on the above 3 tensors of shape (?,49,32), I get a single (?,49,32) tensor
Now I want to use this tensor to select slices from the initial x,y,z Tensors in the following form:
Each element of A indicates which tensor is selected
(i.e.: 0 = x, 1 = y, 2 = z).
The position along the last dimension of A corresponds to the slice that I would like to extract from the selected tensor's last dimension.
I've tried to achieve the above using tf.gather but I had no luck. Then I tried using a series of tf.map_fn, which is ugly and computationally costly.
To simplify the above:
let's say we've got an array A of shape (3,3,3,32). Then the numpy equivalent of what I am trying to achieve is this:
import numpy as np
x = np.random.rand(3, 3, 32)
y = np.random.rand(3, 3, 32)
z = np.random.rand(3, 3, 32)
x_sums = np.sum(np.sum(x, axis=0), 0)
y_sums = np.sum(np.sum(y, axis=0), 0)
z_sums = np.sum(np.sum(z, axis=0), 0)
max_sums = np.argmax([x_sums, y_sums, z_sums], 0)
A = np.array([x, y, z])
tmp = []
for i in range(0, len(max_sums)):
    tmp.append(A[max_sums[i], :, :, i])
output = np.transpose(np.stack(tmp))
Any suggestions?
PS: I tried tf.gather_nd, but I had no luck.
This is how you can do something like that with tf.gather_nd:
import tensorflow as tf
# Make example data
tf.random.set_seed(0)
b = 10 # Batch size
x = tf.random.uniform((b, 49, 3, 3, 32))
y = tf.random.uniform((b, 49, 3, 3, 32))
z = tf.random.uniform((b, 49, 3, 3, 32))
# Stack tensors together
data = tf.stack([x, y, z], axis=2)
# Put reduction axes last
data_t = tf.transpose(data, (0, 1, 5, 2, 3, 4))
# Reduce
s = tf.reduce_sum(data_t, axis=(4, 5))
# Find largest sums
idx = tf.argmax(s, 3)
# Make gather indices
data_shape = tf.shape(data_t, idx.dtype)
bb, ii, jj = tf.meshgrid(*(tf.range(data_shape[i]) for i in range(3)), indexing='ij')
# Gather result
output_t = tf.gather_nd(data_t, tf.stack([bb, ii, jj, idx], axis=-1))
# Reorder axes
output = tf.transpose(output_t, (0, 1, 3, 4, 2))
print(output.shape)
# TensorShape([10, 49, 3, 3, 32])
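As a quick sanity check (my own addition, reusing the tensors from above), the gathered slices should come from whichever of x, y, z has the largest sum at each position and channel:
stacked = tf.stack([x, y, z], axis=0)            # (3, b, 49, 3, 3, 32)
sums = tf.reduce_sum(stacked, axis=(3, 4))       # (3, b, 49, 32)
best = tf.argmax(sums, axis=0)                   # (b, 49, 32), should equal idx
print(bool(tf.reduce_all(tf.equal(best, idx))))  # True (barring exact ties)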

Local reduce with specified slices over a single axis in tensorflow

I am trying to perform a local reduce with specified slices over a single axis on a 2D array.
I achieved this using numpy's numpy.ufunc.reduceat or numpy.add.reduceat, but I would like to do the same in TensorFlow, as the input to this reduce operation is an output from a TensorFlow convolution.
I came across tf.math.reduce_sum, but I am not sure how it can be used in my case.
It would be great if I could do the reduceat operation in TensorFlow, so I can take advantage of a GPU.
You can do almost the same using tf.math.segment_sum:
import tensorflow as tf
import numpy as np
def add_reduceat_tf(a, indices, axis=0):
    a = tf.convert_to_tensor(a)
    indices = tf.convert_to_tensor(indices)
    # Transpose if necessary
    transpose = not (isinstance(axis, int) and axis == 0)
    if transpose:
        axis = tf.convert_to_tensor(axis)
        ndims = tf.cast(tf.rank(a), axis.dtype)
        a = tf.transpose(a, tf.concat([[axis], tf.range(axis),
                                       tf.range(axis + 1, ndims)], axis=0))
    # Make segment ids
    r = tf.range(tf.shape(a, out_type=indices.dtype)[0])
    segments = tf.searchsorted(indices, r, side='right')
    # Compute segmented sum and discard first unused segment
    out = tf.math.segment_sum(a, segments)[1:]
    # Transpose back if necessary
    if transpose:
        out = tf.transpose(out, tf.concat([tf.range(1, axis + 1), [0],
                                           tf.range(axis + 1, ndims)], axis=0))
    return out
# Test
np.random.seed(0)
a = np.random.rand(5, 10).astype(np.float32)
indices = [2, 4, 7]
axis = 1
# NumPy computation
out_np = np.add.reduceat(a, indices, axis=axis)
# TF computation
with tf.Graph().as_default(), tf.Session() as sess:
    out = add_reduceat_tf(a, indices, axis=axis)
    out_tf = sess.run(out)
# Check result
print(np.allclose(out_np, out_tf))
# True
You can replace tf.math.segment_sum above with whatever reduction function you want to use. The only difference between this and the actual np.ufunc.reduceat is the special case where indices[i] >= indices[i + 1]. The posted function requires indices to be sorted, and if there were a case where indices[i] == indices[i + 1], the corresponding position i in the output would be zero, not a[indices[i]].
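For instance, a usage sketch under TensorFlow 2.x eager execution (an assumption; the test above uses a TF 1.x Session), where only the segment function would need to change for other reductions, e.g. tf.math.segment_max:
import numpy as np
import tensorflow as tf  # assumes TF 2.x, eager mode
a = np.random.rand(5, 10).astype(np.float32)
out = add_reduceat_tf(a, [2, 4, 7], axis=1)
print(np.allclose(np.add.reduceat(a, [2, 4, 7], axis=1), out.numpy()))
# True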

How to implement maxpool: taking a maximum on sliding window on image or tensor

In short: I am looking for a simple numpy (maybe a one-liner) implementation of Maxpool: the maximum over a window on a numpy.ndarray, for all locations of the window across the dimensions.
In more detail: I am implementing a convolutional neural network ("CNN"); one of the typical layers in such a network is a MaxPool layer (look for example here). Writing
y = MaxPool(x, S), where x is an input ndarray and S is a parameter, the output of MaxPool in pseudocode is given by:
y[b,h,w,c] = max(x[b, h + i, w + j, c]) over i = 0,..., S-1; j = 0,..., S-1.
That is, y is an ndarray where the value at indexes b,h,w,c equals the maximum taken over a window of size S x S along the second and third dimensions of the input x, with the window "corner" placed at the indexes b,h,w,c.
Some additional details: The network is implemented using numpy. The CNN has many "layers" where the output of one layer is the input to the next layer. The inputs to the layers are numpy.ndarrays called "tensors". In my case the tensors are 4-dimensional numpy.ndarrays, x. That is, x.shape is a tuple (B,H,W,C). The size of each dimension changes after the tensor is processed by a layer; for example, the input to layer i = 4 can have size B = 10, H = 24, W = 24, C = 3, while the output, aka the input to layer i+1, has B = 10, H = 12, W = 12, C = 5. As indicated in the comments, the size after application of MaxPool is (B, H - S + 1, W - S + 1, C).
For concreteness: if I use
import numpy as np
y = np.amax(x, axis = (1,2))
where x.shape is, say, (2,3,3,4), this will give me what I want, but only for the degenerate case where the window I am maximizing over has size 3 x 3, the full size of the second and third dimensions of x, which is not exactly what I want.
Here's a solution using np.lib.stride_tricks.as_strided to create sliding windows, resulting in a 6D array of shape (B,H-S+1,W-S+1,S,S,C), and then simply performing max along the fourth and fifth axes, resulting in an output array of shape (B,H-S+1,W-S+1,C). The intermediate 6D array is a view into the input array and as such won't occupy any more memory. The subsequent max operation, being a reduction, efficiently utilizes the sliding views.
Thus, an implementation would be -
import numpy as np

# Based on http://stackoverflow.com/a/41850409/3293881
def patchify(img, patch_shape):
    a, X, Y, b = img.shape
    x, y = patch_shape
    shape = (a, X - x + 1, Y - y + 1, x, y, b)
    a_str, X_str, Y_str, b_str = img.strides
    strides = (a_str, X_str, Y_str, X_str, Y_str, b_str)
    return np.lib.stride_tricks.as_strided(img, shape=shape, strides=strides)
out = patchify(x, (S,S)).max(axis=(3,4))
Sample run -
In [224]: x = np.random.randint(0,9,(10,24,24,3))
In [225]: S = 5
In [226]: np.may_share_memory(patchify(x, (S,S)), x)
Out[226]: True
In [227]: patchify(x, (S,S)).shape
Out[227]: (10, 20, 20, 5, 5, 3)
In [228]: patchify(x, (S,S)).max(axis=(3,4)).shape
Out[228]: (10, 20, 20, 3)
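On NumPy 1.20 or newer (an assumption; older versions lack this helper), np.lib.stride_tricks.sliding_window_view builds the same windowed view without manual stride arithmetic:
import numpy as np
x = np.random.randint(0, 9, (10, 24, 24, 3))
S = 5
# View of shape (B, H-S+1, W-S+1, C, S, S); no data is copied
windows = np.lib.stride_tricks.sliding_window_view(x, (S, S), axis=(1, 2))
out = windows.max(axis=(-2, -1))  # (B, H-S+1, W-S+1, C)
print(windows.shape, out.shape)
# (10, 20, 20, 3, 5, 5) (10, 20, 20, 3)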

Batched 4D tensor Tensorflow indexing

Given
batch_images: 4D tensor of shape (B, H, W, C)
x: 3D tensor of shape (B, H, W)
y: 3D tensor of shape (B, H, W)
Goal
How can I index into batch_images using the x and y coordinates to obtain a 4D tensor of shape (B, H, W, C)? That is, for each batch element and for each pair (x, y), I want to obtain a tensor of shape (C,).
In numpy, this would be achieved using input_img[np.arange(B)[:,None,None], y, x], for example, but I can't seem to make it work in tensorflow.
My attempt so far
def get_pixel_value(img, x, y):
    """
    Utility function to get pixel value for
    coordinate vectors x and y from a 4D tensor image.
    """
    H = tf.shape(img)[1]
    W = tf.shape(img)[2]
    C = tf.shape(img)[3]
    # flatten image
    img_flat = tf.reshape(img, [-1, C])
    # flatten idx
    idx_flat = (x*W) + y
    return tf.gather(img_flat, idx_flat)
which is returning an incorrect tensor of shape (B, H, W).
It should be possible to do it by flattening the tensor as you've done, but the batch dimension has to be taken into account in the index calculation.
In order to do this, you'll have to make an additional dummy batch index tensor with the same shape as x and y that always contains the index of the current batch.
This is basically the np.arange(B) from your numpy example, which is missing from your TensorFlow code.
You can also simplify things a bit by using tf.gather_nd, which does the index calculations for you.
Here's an example:
import numpy as np
import tensorflow as tf
# Example tensors
M = np.random.uniform(size=(3, 4, 5, 6))
# Cast to int32 so the coordinates stack cleanly with the int32 batch indices below
x = np.random.randint(0, 5, size=(3, 4, 5)).astype(np.int32)
y = np.random.randint(0, 4, size=(3, 4, 5)).astype(np.int32)
def get_pixel_value(img, x, y):
    """
    Utility function that composes a new image, with pixels taken
    from the coordinates given in x and y.
    The shapes of x and y have to match.
    The batch order is preserved.
    """
    # We assume that x and y have the same shape.
    shape = tf.shape(x)
    batch_size = shape[0]
    height = shape[1]
    width = shape[2]
    # Create a tensor that indexes into the same batch.
    # This is needed for gather_nd to work.
    batch_idx = tf.range(0, batch_size)
    batch_idx = tf.reshape(batch_idx, (batch_size, 1, 1))
    b = tf.tile(batch_idx, (1, height, width))
    # tf.pack was renamed to tf.stack in TF 1.0
    indices = tf.stack([b, y, x], 3)
    return tf.gather_nd(img, indices)
s = tf.Session()
print(s.run(get_pixel_value(M, x, y)).shape)
# Should print (3, 4, 5, 6).
# We've composed a new image of the same size from randomly picked x and y
# coordinates of each original image.
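In newer TensorFlow versions (an assumption: TF 1.14+ or 2.x eager mode), tf.gather_nd accepts a batch_dims argument, which removes the need for the tiled batch index entirely:
import numpy as np
import tensorflow as tf
M = np.random.uniform(size=(3, 4, 5, 6))
x = np.random.randint(0, 5, size=(3, 4, 5))
y = np.random.randint(0, 4, size=(3, 4, 5))
indices = tf.stack([y, x], axis=-1)           # (B, H, W, 2)
out = tf.gather_nd(M, indices, batch_dims=1)  # (B, H, W, C)
print(out.shape)  # (3, 4, 5, 6)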
