tensorflow efficient way for tensor multiplication - python

I have two tensors in tensorflow, the first tensor is 3-D, and the second is 2D. And I want to multiply them like this:
x = tf.placeholder(tf.float32, shape=[sequence_length, batch_size, hidden_num])
w = tf.get_variable("w", [hidden_num, 50])
b = tf.get_variable("b", [50])
output_list = []
for step_index in range(sequence_length):
    output = tf.matmul(x[step_index, :, :], w) + b
    output_list.append(output)
output = tf.pack(output_list)  # tf.stack in TF 1.0+
I use a loop to do the multiplication, but I think it is too slow. What would be the best way to make this as simple and clean as possible?

You could use batch_matmul. Unfortunately, batch_matmul does not seem to support broadcasting along the batch dimension, so you have to tile your w matrix. This uses more memory, but all operations stay in TensorFlow:
a = tf.ones((5, 2, 3))
b = tf.ones((3, 1))
b = tf.reshape(b, (1, 3, 1))
b = tf.tile(b, [5, 1, 1])
c = tf.batch_matmul(a, b) # use tf.matmul in TF 1.0
sess = tf.InteractiveSession()
sess.run(tf.shape(c))
This gives
array([5, 2, 1], dtype=int32)
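As a rough check of the tiling idea, here is a minimal NumPy sketch (NumPy rather than TensorFlow, so it runs without a session). The sizes and the name out_dim are made up for illustration. It compares the per-step loop from the question with a single batched matmul against a tiled copy of w.

```python
import numpy as np

rng = np.random.default_rng(0)
# Illustrative sizes only; out_dim is a hypothetical name for the 50 in the question.
sequence_length, batch_size, hidden_num, out_dim = 5, 2, 3, 4
x = rng.standard_normal((sequence_length, batch_size, hidden_num))
w = rng.standard_normal((hidden_num, out_dim))
b = rng.standard_normal(out_dim)

# Reference: one 2-D matmul per time step, as in the question's loop.
looped = np.stack([x[t] @ w + b for t in range(sequence_length)])

# Tiled version: copy w along a new leading axis, then do one batched matmul.
w_tiled = np.tile(w[np.newaxis, :, :], (sequence_length, 1, 1))
batched = np.matmul(x, w_tiled) + b

print(looped.shape, np.allclose(looped, batched))  # (5, 2, 4) True
```

The tiled copy costs sequence_length times the memory of w, which is the trade-off the answer mentions.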

You could use map_fn, which scans a function along the first dimension.
x = tf.placeholder(tf.float32, shape=[sequence_length, batch_size, hidden_num])
w = tf.get_variable("w", [hidden_num, 50])
b = tf.get_variable("b", [50])
def mul_fn(current_input):
    return tf.matmul(current_input, w) + b
output = tf.map_fn(mul_fn, x)
I used this at one point to implement a softmax scan along a sequence.
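Another option, not taken from the answers above, is to merge the first two axes, do a single 2-D matmul, and reshape back. The following is only a NumPy sketch of that idea with made-up sizes; it checks the result against a per-slice computation, which is what map_fn performs.

```python
import numpy as np

rng = np.random.default_rng(1)
# Illustrative sizes; out_dim stands in for the 50 used in the question.
sequence_length, batch_size, hidden_num, out_dim = 4, 3, 6, 50
x = rng.standard_normal((sequence_length, batch_size, hidden_num))
w = rng.standard_normal((hidden_num, out_dim))
b = rng.standard_normal(out_dim)

# What a map over the first axis computes: the same function on each slice x[t].
mapped = np.stack([x[t] @ w + b for t in range(sequence_length)])

# Alternative: flatten (sequence, batch) into one axis, multiply once, restore the axes.
flat = x.reshape(-1, hidden_num) @ w + b
reshaped = flat.reshape(sequence_length, batch_size, out_dim)

print(mapped.shape, np.allclose(mapped, reshaped))  # (4, 3, 50) True
```

In TensorFlow the same steps would presumably be written with tf.reshape and tf.matmul; that translation is an assumption and is not shown in the thread.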


return the top_k masked softmax of each row for a 2D tensor

For any 2D tensor like
[[2, 5, 4, 7], [7, 5, 6, 8]]
I want to apply softmax to the top k elements in each row and then construct a new tensor in which all the other elements are replaced by 0.
The first step should give the softmax of the top k (here k=2) elements of each row, [[7, 5], [8, 7]],
which is
[[0.880797, 0.11920291], [0.7310586, 0.26894143]]
and then, placing these values back at the positions of the top k elements in the original tensor, the final result should be
[[0, 0.11920291, 0, 0.880797], [0.26894143, 0, 0, 0.7310586]]
Is it possible to implement this kind of masked softmax in tensorflow? Many thanks in advance!
Here is how you can do that:
import tensorflow as tf
# Input data
a = tf.placeholder(tf.float32, [None, None])
num_top = tf.placeholder(tf.int32, [])
# Find top elements
a_top, a_top_idx = tf.nn.top_k(a, num_top, sorted=False)
# Apply softmax
a_top_sm = tf.nn.softmax(a_top)
# Reconstruct into original shape
a_shape = tf.shape(a)
a_row_idx = tf.tile(tf.range(a_shape[0])[:, tf.newaxis], (1, num_top))
scatter_idx = tf.stack([a_row_idx, a_top_idx], axis=-1)
result = tf.scatter_nd(scatter_idx, a_top_sm, a_shape)
# Test
with tf.Session() as sess:
    result_val = sess.run(result, feed_dict={a: [[2, 5, 4, 7], [7, 5, 6, 8]], num_top: 2})
    print(result_val)
Output:
[[0.         0.11920291 0.         0.880797  ]
 [0.26894143 0.         0.         0.7310586 ]]
Actually, there is a function that does more closely what you intend, tf.sparse.softmax. However, it requires a SparseTensor as input, and I'm not sure it would be faster, since it has to figure out which sparse values go together in the softmax. The good thing about this function is that you could have a different number of elements to softmax in each row, but in your case that does not seem to be important. Anyway, here is an implementation with that, in case you find it useful.
import tensorflow as tf
a = tf.placeholder(tf.float32, [None, None])
num_top = tf.placeholder(tf.int32, [])
# Find top elements
a_top, a_top_idx = tf.nn.top_k(a, num_top, sorted=False)
# Flatten values
sparse_values = tf.reshape(a_top, [-1])
# Make sparse indices
shape = tf.cast(tf.shape(a), tf.int64)
a_row_idx = tf.tile(tf.range(shape[0])[:, tf.newaxis], (1, num_top))
sparse_idx = tf.stack([a_row_idx, tf.cast(a_top_idx, tf.int64)], axis=-1)
sparse_idx = tf.reshape(sparse_idx, [-1, 2])
# Make sparse tensor
a_top_sparse = tf.SparseTensor(sparse_idx, sparse_values, shape)
# Reorder sparse tensor
a_top_sparse = tf.sparse.reorder(a_top_sparse)
# Softmax
result_sparse = tf.sparse.softmax(a_top_sparse)
# Convert back to dense (or you can keep working with the sparse tensor)
result = tf.sparse.to_dense(result_sparse)
# Test
with tf.Session() as sess:
    result_val = sess.run(result, feed_dict={a: [[2, 5, 4, 7], [7, 5, 6, 8]], num_top: 2})
    print(result_val)
# Same as before
Let's say you have a weights tensor w with shape (None, N).
Find the minimum value among the top k elements of each row:
top_kw = tf.math.top_k(w, k=10, sorted=False)[0]
min_w = tf.reduce_min(top_kw, axis=1, keepdims=True)
Generate a mask for the weights tensor that keeps only the values at or above that minimum:
mask_w = tf.greater_equal(w, min_w)
mask_w = tf.cast(mask_w, tf.float32)
Compute the custom softmax using the mask:
w = tf.multiply(tf.exp(w), mask_w) / tf.reduce_sum(tf.multiply(tf.exp(w), mask_w), axis=1, keepdims=True)
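To see the threshold-and-mask idea on the example from the question, here is a small NumPy sketch. The function name topk_masked_softmax is hypothetical, and the row-max subtraction is an added numerical-stability step that the answer above does not include.

```python
import numpy as np

def topk_masked_softmax(a, k):
    """Softmax over the k largest entries of each row; every other entry becomes 0."""
    a = np.asarray(a, dtype=np.float64)
    # k-th largest value of each row, used as the threshold.
    kth = np.sort(a, axis=1)[:, -k][:, np.newaxis]
    mask = a >= kth
    # Subtract the row maximum before exponentiating (stability; not in the answer).
    e = np.exp(a - a.max(axis=1, keepdims=True)) * mask
    return e / e.sum(axis=1, keepdims=True)

result = topk_masked_softmax([[2, 5, 4, 7], [7, 5, 6, 8]], k=2)
print(np.round(result, 4))
```

One caveat of a threshold mask: if several entries tie at the k-th largest value, all of them pass the mask, so a row can end up with more than k non-zero entries. The scatter-based answers do not have this behaviour.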

How to vectorize indexing operation in tensorflow

I have a tensor A of shape (2, 4, 2) and a tensor B of shape (4, 4); all values are integers. Entries in A range from 0 to 3.
I want to create a tensor C of shape (2, 4, 2).
The for-loop code looks like this:
for i in range(2):
    for j in range(2):
        for k in range(4):
            C[i][k][j] = B[k][A[i][k][j]]
How can I create such tensor C in tensorflow?
Here is how you can do it with tf.gather_nd:
import tensorflow as tf
# Input values
A = tf.placeholder(tf.int32, [None, None, None])
B = tf.placeholder(tf.int32, [None, None])
# Make indices for first dimension of B
idx = tf.range(tf.shape(B)[0], dtype=A.dtype)[tf.newaxis, :, tf.newaxis]
# Tile first dimension indices to match the size of A
idx = tf.tile(idx, (tf.shape(A)[0], 1, tf.shape(A)[2]))
# Stack first dimension indices with A to complete index tensor
idx = tf.stack([idx, A], axis=-1)
# Make result gathering from B
C = tf.gather_nd(B, idx)
Here is an example, testing that the result matches your code:
import tensorflow as tf
import numpy as np
# Non-TensorFlow implementation for result comparison
A_value = np.random.randint(0, 4, size=(2, 4, 2))
B_value = np.random.randint(100, size=(4, 4))
C_value = np.empty(A_value.shape, dtype=B_value.dtype)
for i in range(A_value.shape[0]):
    for j in range(A_value.shape[2]):
        for k in range(A_value.shape[1]):
            C_value[i][k][j] = B_value[k][A_value[i][k][j]]
# TensorFlow implementation
A = tf.placeholder(tf.int32, [None, None, None])
B = tf.placeholder(tf.int32, [None, None])
idx = tf.range(tf.shape(B)[0], dtype=A.dtype)[tf.newaxis, :, tf.newaxis]
idx = tf.tile(idx, (tf.shape(A)[0], 1, tf.shape(A)[2]))
idx = tf.stack([idx, A], axis=-1)
C = tf.gather_nd(B, idx)
# Check result
with tf.Session() as sess:
    C_value_tf = sess.run(C, feed_dict={A: A_value, B: B_value})
print(np.all(np.equal(C_value_tf, C_value)))
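The index construction can also be checked in plain NumPy. This sketch mirrors the gather_nd steps (row indices broadcast to the shape of A, then stacked with A) and uses NumPy integer-array indexing in place of tf.gather_nd; the random inputs are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.integers(0, 4, size=(2, 4, 2))
B = rng.integers(0, 100, size=(4, 4))

# Reference: the triple loop from the question.
C_loop = np.empty(A.shape, dtype=B.dtype)
for i in range(A.shape[0]):
    for k in range(A.shape[1]):
        for j in range(A.shape[2]):
            C_loop[i, k, j] = B[k, A[i, k, j]]

# Row index k for every position, shaped (1, 4, 1) so it broadcasts against A.
rows = np.arange(B.shape[0])[np.newaxis, :, np.newaxis]
# Pair each row index with the column index stored in A: shape (2, 4, 2, 2).
idx = np.stack([np.broadcast_to(rows, A.shape), A], axis=-1)
# Gather B[row, col] for every pair.
C_vec = B[idx[..., 0], idx[..., 1]]

print(idx.shape, np.array_equal(C_loop, C_vec))  # (2, 4, 2, 2) True
```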

linear model in tensor flow

I was trying to generate a simple linear model in Tensorflow. Here is the code ...
N = 400
features = 100
nSteps = 1000
data = (np.random.randn(N, features), np.random.randint(0, 2, N))
W = tf.placeholder(tf.float32, shape=(features,1), name='W')
b = tf.placeholder(tf.float32, shape=(features,1), name='b')
d = tf.constant(data[0], dtype=tf.float32)
result = tf.add( tf.matmul(d, W), b)
The error seems to be related to the dimensions of b, but as far as I can tell they are all OK.
I am not sure why this throws an error. Can someone please help?
result = tf.matmul(d, W)
This works.
I have checked the shape of the result, and it looks the same as that of b, so I am not sure what the problem might be.
In a linear model (i.e. one unit in the output layer), b should be a scalar.
Mathematically, for a single observation, you have: result = WX + b, where dimensions W [1 x features], X [features x 1]. Then, WX is scalar. Thus b should be a scalar.
So you should change b to the following, to get the correct linear model and make the dimensions work out:
b = tf.placeholder(tf.float32, shape=(1,1), name='b')
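The shape mismatch can be reproduced with NumPy, which follows the same broadcasting rules for addition. In this sketch (zeros are placeholder values) the product has shape (N, 1) = (400, 1), so a bias of shape (features, 1) = (100, 1) cannot be broadcast against it, while a (1, 1) bias can.

```python
import numpy as np

N, features = 400, 100
d = np.random.randn(N, features)
W = np.zeros((features, 1))

xw = d @ W  # shape (400, 1): one output per observation

# Bias shaped like W, as in the question: (100, 1) does not broadcast with (400, 1).
b_bad = np.zeros((features, 1))
try:
    xw + b_bad
    broadcast_ok = True
except ValueError:
    broadcast_ok = False

# Bias with a single entry, as in the answer: broadcasts to every row.
b_good = np.zeros((1, 1))
result = xw + b_good

print(xw.shape, broadcast_ok, result.shape)  # (400, 1) False (400, 1)
```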

Tensor multiplication in Tensorflow

I am trying to carry out tensor multiplication in NumPy/Tensorflow.
I have 3 tensors- A (M X h), B (h X N X s), C (s X T).
I believe that A X B X C should produce a tensor D (M X N X T).
Here's the code (using both numpy and tensorflow).
M = 5
N = 2
T = 3
h = 2
s = 3
A_np = np.random.randn(M, h)
C_np = np.random.randn(s, T)
B_np = np.random.randn(h, N, s)
A_tf = tf.Variable(A_np)
C_tf = tf.Variable(C_np)
B_tf = tf.Variable(B_np)
# Tensorflow
with tf.Session() as sess:
    print(sess.run(A_tf))
    p = tf.matmul(A_tf, B_tf)
This returns the following error:
ValueError: Shape must be rank 2 but is rank 3 for 'MatMul_2' (op: 'MatMul') with input shapes: [5,2], [2,2,3].
If we try the multiplication with NumPy arrays only, we get the following error:
np.multiply(A_np, B_np)
ValueError: operands could not be broadcast together with shapes (5,2) (2,2,3)
However, we can use np.tensordot as follows:
np.tensordot(np.tensordot(A_np, B_np, axes=1), C_np, axes=1)
Is there an equivalent operation in TensorFlow?
In numpy, we would do as follows:
ABC_np = np.tensordot(np.tensordot(A_np, B_np, axes=1), C_np, axes=1)
In tensorflow, we would do as follows:
AB_tf = tf.tensordot(A_tf, B_tf,axes = [[1], [0]])
AB_tf_C_tf = tf.tensordot(AB_tf, C_tf, axes=[[2], [0]])
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # A_tf, B_tf and C_tf are tf.Variable objects
    ABC_tf = sess.run(AB_tf_C_tf)
np.allclose(ABC_np, ABC_tf) returns True.
Try tf.tensordot(A_tf, B_tf, axes=[[1], [0]]).
For example:
x = tf.tensordot(A_tf, B_tf, axes=[[1], [0]])
x.get_shape()
TensorShape([Dimension(5), Dimension(2), Dimension(3)])
The tensordot documentation and the relevant GitHub repository have more details.
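The same double contraction can also be written as a single einsum, which makes the index bookkeeping explicit. This is a NumPy sketch using the shapes from the question; TensorFlow has a tf.einsum as well, but whether it accepts this exact three-operand expression in a given version is an assumption to verify.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, T, h, s = 5, 2, 3, 2, 3
A_np = rng.standard_normal((M, h))
B_np = rng.standard_normal((h, N, s))
C_np = rng.standard_normal((s, T))

# Two chained contractions, as in the answer above.
ABC_tensordot = np.tensordot(np.tensordot(A_np, B_np, axes=1), C_np, axes=1)

# One expression: sum over h (shared by A and B) and over s (shared by B and C).
ABC_einsum = np.einsum('mh,hns,st->mnt', A_np, B_np, C_np)

print(ABC_einsum.shape, np.allclose(ABC_tensordot, ABC_einsum))  # (5, 2, 3) True
```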

Tensorflow: Convolutions with different filter for each sample in the mini-batch

I would like to apply a 2D convolution in TensorFlow whose filter depends on the sample in the mini-batch. Any ideas how one could do that, especially if the number of samples per mini-batch is not known?
Concretely, I have input data inp of the form MB x H x W x Channels, and I have filters F of the form MB x fh x fw x Channels x OutChannels.
It is assumed that
inp = tf.placeholder('float', [None, H, W, channels_img], name='img_input').
I would like to do tf.nn.conv2d(inp, F, strides = [1,1,1,1]), but this is not allowed because F cannot have a mini-batch dimension. Any idea how to solve this problem?
I think the proposed trick is actually not right. What happens with a tf.nn.conv3d layer is that the input gets convolved along the depth (= actual batch) dimension AND then summed along the resulting feature maps. With padding='SAME' the resulting number of outputs then happens to be the same as the batch size, so one gets fooled!
EDIT: I think a possible way to do a convolution with different filters for the different mini-batch elements involves 'hacking' a depthwise convolution. Assuming batch size MB is known:
inp = tf.placeholder(tf.float32, [MB, H, W, channels_img])
# F has shape (MB, fh, fw, channels, out_channels)
# REM: with the notation in the question, we need: channels_img==channels
F = tf.transpose(F, [1, 2, 0, 3, 4])
F = tf.reshape(F, [fh, fw, channels*MB, out_channels])
inp_r = tf.transpose(inp, [1, 2, 0, 3]) # shape (H, W, MB, channels_img)
inp_r = tf.reshape(inp_r, [1, H, W, MB*channels_img])
out = tf.nn.depthwise_conv2d(
    inp_r,
    filter=F,
    strides=[1, 1, 1, 1],
    padding='VALID') # here no requirement about padding being 'VALID', use whatever you want.
# Now out shape is (1, H, W, MB*channels*out_channels)
# (the spatial sizes H, W here and below hold for 'SAME' padding; 'VALID' shrinks them to H-fh+1, W-fw+1)
out = tf.reshape(out, [H, W, MB, channels, out_channels]) # careful about the order of depthwise conv out_channels!
out = tf.transpose(out, [2, 0, 1, 3, 4])
out = tf.reduce_sum(out, axis=3)
# out shape is now (MB, H, W, out_channels)
In case MB is unknown, it should be possible to determine it dynamically using tf.shape() (I think)
You could use tf.map_fn as follows:
inp = tf.placeholder(tf.float32, [None, h, w, c_in])
def single_conv(tupl):
    x, kernel = tupl
    return tf.nn.conv2d(x, kernel, strides=(1, 1, 1, 1), padding='VALID')
# Assume kernels shape is [tf.shape(inp)[0], fh, fw, c_in, c_out]
batch_wise_conv = tf.squeeze(tf.map_fn(
    single_conv, (tf.expand_dims(inp, 1), kernels), dtype=tf.float32),
    axis=1)
It is important to specify dtype for map_fn. Basically, this solution defines batch_dim_size 2D convolution operations.
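For checking any of these approaches on small inputs, a slow reference implementation can help. The following NumPy sketch is a naive loop with 'VALID' padding and stride 1 that applies filter n only to sample n; the function name per_sample_conv2d_valid is hypothetical. Like tf.nn.conv2d, it computes a cross-correlation (no kernel flip).

```python
import numpy as np

def per_sample_conv2d_valid(inp, filters):
    """inp: (MB, H, W, C); filters: (MB, fh, fw, C, O). Returns (MB, H-fh+1, W-fw+1, O)."""
    MB, H, W, C = inp.shape
    _, fh, fw, _, O = filters.shape
    out = np.zeros((MB, H - fh + 1, W - fw + 1, O))
    for n in range(MB):                      # one filter bank per sample
        for i in range(H - fh + 1):
            for j in range(W - fw + 1):
                patch = inp[n, i:i + fh, j:j + fw, :]           # (fh, fw, C)
                # Contract the patch with the first three axes of this sample's filter.
                out[n, i, j, :] = np.tensordot(patch, filters[n], axes=3)
    return out

inp = np.ones((2, 5, 5, 3))
filters = np.ones((2, 3, 3, 3, 4))
filters[1] *= 2.0                            # sample 1 gets a different filter
out = per_sample_conv2d_valid(inp, filters)
print(out.shape, out[0, 0, 0, 0], out[1, 0, 0, 0])  # (2, 3, 3, 4) 27.0 54.0
```

With all-ones inputs each output equals the sum of the filter entries over one patch (3*3*3 = 27 for sample 0, twice that for sample 1), which confirms that each sample only sees its own filter.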
The accepted answer is slightly wrong in how it treats the dimensions, as they are changed by padding = "VALID" (he treats them as if padding = "SAME"). Hence in the general case, the code will crash, due to this mismatch. I attach his corrected code, with both scenarios correctly treated.
inp = tf.placeholder(tf.float32, [MB, H, W, channels_img])
# F has shape (MB, fh, fw, channels, out_channels)
# REM: with the notation in the question, we need: channels_img==channels
F = tf.transpose(F, [1, 2, 0, 3, 4])
F = tf.reshape(F, [fh, fw, channels*MB, out_channels])
inp_r = tf.transpose(inp, [1, 2, 0, 3]) # shape (H, W, MB, channels_img)
inp_r = tf.reshape(inp_r, [1, H, W, MB*channels_img])
padding = "VALID" #or "SAME"
out = tf.nn.depthwise_conv2d(
    inp_r,
    filter=F,
    strides=[1, 1, 1, 1],
    padding=padding) # here no requirement about padding being 'VALID', use whatever you want.
# Now out shape is (1, H-fh+1, W-fw+1, MB*channels*out_channels), because we used "VALID"
if padding == "SAME":
    out = tf.reshape(out, [H, W, MB, channels, out_channels])
if padding == "VALID":
    out = tf.reshape(out, [H-fh+1, W-fw+1, MB, channels, out_channels])
out = tf.transpose(out, [2, 0, 1, 3, 4])
out = tf.reduce_sum(out, axis=3)
# out shape is now (MB, H-fh+1, W-fw+1, out_channels)
The way to get around it is to add an extra dimension using
tf.expand_dims(inp, 0)
to create a 'fake' batch size. Then use the
tf.nn.conv3d
operation, where the filter depth matches the batch size. This will result in each filter convolving with only one sample in each batch.
Sadly, this does not solve the variable batch size problem, only the convolutions.

