Related
I have two placholders with the following dimensions
x_ph = tf.placeholder(tf.float32,[None, 4])
and
y_ph = tf.placeholder(tf.float32,[None, 4, 3])
I want to multiply each element of x with one row of y_ph to get an output of shape (None, 4, 3).
Example of output I am looking,
x = np.random.uniform(-1,1, (2,2))
z = np.random.uniform(-1,1, (2,2,3))
x, z
([[ 0.27083503, -0.13795923],[ 0.8436118 , 0.00771057]])
([[[ 0.51905276, 0.01173655, -0.57335926],
[ 0.42347431, -0.05438272, 0.21042366]]
[[ 0.91347706, -0.28086164, 0.54952429],
[ 0.41551953, -0.6207727 , 0.32066292]]]))
I want to do following operation:
result = np.zeros((2,3))
for i in range(2):
for j in range(2):
result[i] += x[i,j]*z[i,j,:]
print(result)
[[ 0.08215548 0.01068127 -0.18431566]
[ 0.77382391 -0.2417247 0.46605767]]
Any way to do it in tensorflow?
Add one dimension at the end of x_ph so you can use broadcasting to multiply both tensors:
import tensorflow as tf
x_ph = tf.placeholder(tf.float32,[None, 4])
y_ph = tf.placeholder(tf.float32,[None, 4, 3])
result = tf.expand_dims(x_ph, -1) * y_ph
First things first: I'm relatively new to TensorFlow.
I'm trying to implement a custom layer in tensorflow.keras and I'm having relatively hard time when I try to achieve the following:
I've got 3 Tensors (x,y,z) of shape (?,49,3,3,32) [where ? is the batch size]
On each Tensor I compute the sum over the 3rd and 4th axes [thus I end up with 3 Tensors of shape (?,49,32)]
By doing an argmax (A)on the above 3 Tensors (?,49,32) I get a single (?,49,32) Tensor
Now I want to use this tensor to select slices from the initial x,y,z Tensors in the following form:
Each element in the last dimension of A corresponds to the selected Tensor.
(aka: 0 = X, 1 = Y, 2 = Z)
The index of the last dimension of A corresponds to the slice that I would like to extract from the Tensor last dimension.
I've tried to achieve the above using tf.gather but I had no luck. Then I tried using a series of tf.map_fn, which is ugly and computationally costly.
To simplify the above:
let's say we've got an A array of shape (3,3,3,32). Then the numpy equivalent of what I try to achieve is this:
import numpy as np
x = np.random.rand(3,3,32)
y = np.random.rand(3,3,32)
z = np.random.rand(3,3,32)
x_sums = np.sum(np.sum(x,axis=0),0);
y_sums = np.sum(np.sum(y,axis=0),0);
z_sums = np.sum(np.sum(z,axis=0),0);
max_sums = np.argmax([x_sums,y_sums,z_sums],0)
A = np.array([x,y,z])
tmp = []
for i in range(0,len(max_sums)):
tmp.append(A[max_sums[i],:,:,i)
output = np.transpose(np.stack(tmp))
Any suggestions?
ps: I tried tf.gather_nd but I had no luck
This is how you can do something like that with tf.gather_nd:
import tensorflow as tf
# Make example data
tf.random.set_seed(0)
b = 10 # Batch size
x = tf.random.uniform((b, 49, 3, 3, 32))
y = tf.random.uniform((b, 49, 3, 3, 32))
z = tf.random.uniform((b, 49, 3, 3, 32))
# Stack tensors together
data = tf.stack([x, y, z], axis=2)
# Put reduction axes last
data_t = tf.transpose(data, (0, 1, 5, 2, 3, 4))
# Reduce
s = tf.reduce_sum(data_t, axis=(4, 5))
# Find largest sums
idx = tf.argmax(s, 3)
# Make gather indices
data_shape = tf.shape(data_t, idx.dtype)
bb, ii, jj = tf.meshgrid(*(tf.range(data_shape[i]) for i in range(3)), indexing='ij')
# Gather result
output_t = tf.gather_nd(data_t, tf.stack([bb, ii, jj, idx], axis=-1))
# Reorder axes
output = tf.transpose(output_t, (0, 1, 3, 4, 2))
print(output.shape)
# TensorShape([10, 49, 3, 3, 32])
Consider I have a tensor called x which has the shape of [1, batch_size]. I want to change the rows of another tensor called my_tensor with the shape of [batch_size, seq_length] if the respected value in x is less than or equal to zero.
I suppose I can explain better by representing a code:
import tensorflow as tf
batch_size = 3
seq_length = 5
x = tf.constant([-1, 4, 0]) # size is [1, batch_size]
# select the indices of the rows to be changed
candidate_rows = tf.where(tf.less_equal(x, 0))
my_tensor = tf.random.uniform(shape=(batch_size, seq_length), minval=10, maxval=30, seed=123)
sess = tf.InteractiveSession()
print(sess.run(candidate_rows))
print(sess.run(my_tensor))
which will produce:
candidate_rows =
[[0]
[2]]
my_tensor =
[[10.816193 14.168425 11.83606 24.044014 24.146267]
[17.929298 11.330187 15.837727 10.592653 29.098463]
[10.122135 16.338099 24.35467 15.236387 10.991222]]
and I would like to change rows [0] and [2] in my tensor to another value, say all equal to 1.
[[1 1 1 1]
[17.929298 11.330187 15.837727 10.592653 29.098463]
[1 1 1 1 1]]
Perhaps all the problem arises when I use the tf.where. I appreciate any assistance :)
One solution to your problem is to use tf.where to select between elements of two tensors.
t = tf.ones(shape=my_tensor.shape, dtype=my_tensor.dtype)
my_tensor = tf.where(x > 0, my_tensor, t)
I have a tensor of shape (16, 4096, 3). I have another tensor of indices of shape (16, 32768, 3). I am trying to collect the values along dim=1. This was initially done in pytorch using gather function as shown below-
# a.shape (16L, 4096L, 3L)
# idx.shape (16L, 32768L, 3L)
b = a.gather(1, idx)
# b.shape (16L, 32768L, 3L)
Please note that the size of output b is the same as that of idx. However, when I apply gather function of tensorflow, I get a completely different output. The output dimension was found mismatching as shown below-
b = tf.gather(a, idx, axis=1)
# b.shape (16, 16, 32768, 3, 3)
I also tried using tf.gather_nd but got in vain. See below-
b = tf.gather_nd(a, idx)
# b.shape (16, 32768)
Why am I getting different shapes of tensors? I want to get the tensor of the same shape as calculated by pytorch.
In other words, I want to know the tensorflow equivalent of torch.gather.
For 2D case,there is a method to do it:
# a.shape (16L, 10L)
# idx.shape (16L,1)
idx = tf.stack([tf.range(tf.shape(idx)[0]),idx[:,0]],axis=-1)
b = tf.gather_nd(a,idx)
However,For ND case,this method maybe very complex
This "should" be a general solution using tf.gather_nd (I've only tested for rank 2 and 3 tensors along the last axis):
def torch_gather(x, indices, gather_axis):
# if pytorch gather indices are
# [[[0, 10, 20], [0, 10, 20], [0, 10, 20]],
# [[0, 10, 20], [0, 10, 20], [0, 10, 20]]]
# tf nd_gather needs to be
# [[0,0,0], [0,0,10], [0,0,20], [0,1,0], [0,1,10], [0,1,20], [0,2,0], [0,2,10], [0,2,20],
# [1,0,0], [1,0,10], [1,0,20], [1,1,0], [1,1,10], [1,1,20], [1,2,0], [1,2,10], [1,2,20]]
# create a tensor containing indices of each element
all_indices = tf.where(tf.fill(indices.shape, True))
gather_locations = tf.reshape(indices, [indices.shape.num_elements()])
# splice in our pytorch style index at the correct axis
gather_indices = []
for axis in range(len(indices.shape)):
if axis == gather_axis:
gather_indices.append(gather_locations)
else:
gather_indices.append(all_indices[:, axis])
gather_indices = tf.stack(gather_indices, axis=-1)
gathered = tf.gather_nd(x, gather_indices)
reshaped = tf.reshape(gathered, indices.shape)
return reshaped
For the last-axis gathering, we can use the 2D-reshape trick for general ND cases, and then employ #LiShaoyuan 2D code above
# last-axis gathering only - use 2D-reshape-trick for Torch's style nD gathering
def torch_gather(param, id_tensor):
# 2d-gather torch equivalent from #LiShaoyuan above
def gather2d(target, id_tensor):
idx = tf.stack([tf.range(tf.shape(id_tensor)[0]),id_tensor[:,0]],axis=-1)
result = tf.gather_nd(target,idx)
return tf.expand_dims(result,axis=-1)
target = tf.reshape(param, (-1, param.shape[-1])) # reshape 2D
target_shape = id_tensor.shape
id_tensor = tf.reshape(id_tensor, (-1, 1)) # also 2D-index
result = gather2d(target, id_tensor)
return tf.reshape(result, target_shape)
I have a placeholder tensor with shape: [batch_size, sentence_length, word_dim] and a list of indices with shape=[batch_size, num_indices]. Indices are on the second axes and are indices of words in the sentence. Batch_size & sentence_length are only known at runtime.
How do I extract a tensor with shape [batch_size, len(indices), word_dim]?
I was reading about tensorflow.gather but it seems like gather only slices along the first axes. Am I correct?
Edit: I managed to get it work with constant
def tile_repeat(n, repTime):
'''
create something like 111..122..2333..33 ..... n..nn
one particular number appears repTime consecutively.
This is for flattening the indices.
'''
print n, repTime
idx = tf.range(n)
idx = tf.reshape(idx, [-1, 1]) # Convert to a n x 1 matrix.
idx = tf.tile(idx, [1, int(repTime)]) # Create multiple columns, each column has one number repeats repTime
y = tf.reshape(idx, [-1])
return y
def gather_along_second_axis(x, idx):
'''
x has shape: [batch_size, sentence_length, word_dim]
idx has shape: [batch_size, num_indices]
Basically, in each batch, get words from sentence having index specified in idx
However, since tensorflow does not fully support indexing,
gather only work for the first axis. We have to reshape the input data, gather then reshape again
'''
reshapedIdx = tf.reshape(idx, [-1]) # [batch_size*num_indices]
idx_flattened = tile_repeat(tf.shape(x)[0], tf.shape(x)[1]) * tf.shape(x)[1] + reshapedIdx
y = tf.gather(tf.reshape(x, [-1,int(tf.shape(x)[2])]), # flatten input
idx_flattened)
y = tf.reshape(y, tf.shape(x))
return y
x = tf.constant([
[[1,2,3],[3,5,6]],
[[7,8,9],[10,11,12]],
[[13,14,15],[16,17,18]]
])
idx=tf.constant([[0,1],[1,0],[1,1]])
y = gather_along_second_axis(x, idx)
with tf.Session(''):
print y.eval()
print tf.Tensor.get_shape(y)
And the output is:
[[[ 1 2 3]
[ 3 5 6]]
[[10 11 12]
[ 7 8 9]]
[[16 17 18]
[16 17 18]]]
shape: (3, 2, 3)
However, when inputs are placeholder it does not work return error:
idx = tf.tile(idx, [1, int(repTime)])
TypeError: int() argument must be a string or a number, not 'Tensor'
Python 2.7, tensorflow 0.12
Thank you in advance.
Thank to #AllenLavoie's comments, I could eventually come up with the solution:
def tile_repeat(n, repTime):
'''
create something like 111..122..2333..33 ..... n..nn
one particular number appears repTime consecutively.
This is for flattening the indices.
'''
print n, repTime
idx = tf.range(n)
idx = tf.reshape(idx, [-1, 1]) # Convert to a n x 1 matrix.
idx = tf.tile(idx, [1, repTime]) # Create multiple columns, each column has one number repeats repTime
y = tf.reshape(idx, [-1])
return y
def gather_along_second_axis(x, idx):
'''
x has shape: [batch_size, sentence_length, word_dim]
idx has shape: [batch_size, num_indices]
Basically, in each batch, get words from sentence having index specified in idx
However, since tensorflow does not fully support indexing,
gather only work for the first axis. We have to reshape the input data, gather then reshape again
'''
reshapedIdx = tf.reshape(idx, [-1]) # [batch_size*num_indices]
idx_flattened = tile_repeat(tf.shape(x)[0], tf.shape(x)[1]) * tf.shape(x)[1] + reshapedIdx
y = tf.gather(tf.reshape(x, [-1,tf.shape(x)[2]]), # flatten input
idx_flattened)
y = tf.reshape(y, tf.shape(x))
return y
x = tf.constant([
[[1,2,3],[3,5,6]],
[[7,8,9],[10,11,12]],
[[13,14,15],[16,17,18]]
])
idx=tf.constant([[0,1],[1,0],[1,1]])
y = gather_along_second_axis(x, idx)
with tf.Session(''):
print y.eval()
print tf.Tensor.get_shape(y)
#Hoa Vu's answer was very helpful. The code works with the example x and idx which is sentence_length == len(indices), but it gives an error when sentence_length != len(indices).
I slightly changed the code and now it works when sentence_length >= len(indices).
I tested with new x and idx on Python 3.x.
def tile_repeat(n, repTime):
'''
create something like 111..122..2333..33 ..... n..nn
one particular number appears repTime consecutively.
This is for flattening the indices.
'''
idx = tf.range(n)
idx = tf.reshape(idx, [-1, 1]) # Convert to a n x 1 matrix.
idx = tf.tile(idx, [1, repTime]) # Create multiple columns, each column has one number repeats repTime
y = tf.reshape(idx, [-1])
return y
def gather_along_second_axis(x, idx):
'''
x has shape: [batch_size, sentence_length, word_dim]
idx has shape: [batch_size, num_indices]
Basically, in each batch, get words from sentence having index specified in idx
However, since tensorflow does not fully support indexing,
gather only work for the first axis. We have to reshape the input data, gather then reshape again
'''
reshapedIdx = tf.reshape(idx, [-1]) # [batch_size*num_indices]
idx_flattened = tile_repeat(tf.shape(x)[0], tf.shape(idx)[1]) * tf.shape(x)[1] + reshapedIdx
y = tf.gather(tf.reshape(x, [-1,tf.shape(x)[2]]), # flatten input
idx_flattened)
y = tf.reshape(y, [tf.shape(x)[0],tf.shape(idx)[1],tf.shape(x)[2]])
return y
x = tf.constant([
[[1,2,3],[1,2,3],[3,5,6],[3,5,6]],
[[7,8,9],[7,8,9],[10,11,12],[10,11,12]],
[[13,14,15],[13,14,15],[16,17,18],[16,17,18]]
])
idx=tf.constant([[0,1],[1,2],[0,3]])
y = gather_along_second_axis(x, idx)
with tf.Session(''):
print(y.eval())