Get the last output of a dynamic_rnn in TensorFlow - python

I have a 3-D tensor of shape [batch, None, dim] where the second dimension, i.e. the timesteps, is unknown. I use dynamic_rnn to process such input, like in the following snippet:
import numpy as np
import tensorflow as tf
batch = 2
dim = 3
hidden = 4
lengths = tf.placeholder(dtype=tf.int32, shape=[batch])
inputs = tf.placeholder(dtype=tf.float32, shape=[batch, None, dim])
cell = tf.nn.rnn_cell.GRUCell(hidden)
cell_state = cell.zero_state(batch, tf.float32)
output, _ = tf.nn.dynamic_rnn(cell, inputs, lengths, initial_state=cell_state)
Running this snippet with some actual numbers gives reasonable results:
inputs_ = np.asarray([[[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]],
                      [[6, 6, 6], [7, 7, 7], [8, 8, 8], [9, 9, 9]]],
                     dtype=np.int32)
lengths_ = np.asarray([3, 1], dtype=np.int32)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    output_ = sess.run(output, {inputs: inputs_, lengths: lengths_})
    print(output_)
And the output is:
[[[ 0. 0. 0. 0. ]
[ 0.02188676 -0.01294564 0.05340237 -0.47148666]
[ 0.0343586 -0.02243731 0.0870839 -0.89869428]
[ 0. 0. 0. 0. ]]
[[ 0.00284752 -0.00315077 0.00108094 -0.99883419]
[ 0. 0. 0. 0. ]
[ 0. 0. 0. 0. ]
[ 0. 0. 0. 0. ]]]
Is there a way to get a 3-D tensor of shape [batch, 1, hidden] with the last relevant output of the dynamic RNN? Thanks!

This is what gather_nd is for!
def extract_axis_1(data, ind):
    """
    Get specified elements along axis 1 of a tensor.
    :param data: Tensorflow tensor that will be subsetted.
    :param ind: Indices to take (one for each element along axis 0 of data).
    :return: Subsetted tensor.
    """
    batch_range = tf.range(tf.shape(data)[0])
    # Pair each batch index with the time index to pick: [[0, i0], [1, i1], ...]
    indices = tf.stack([batch_range, ind], axis=1)
    res = tf.gather_nd(data, indices)
    return res
In your case:
output = extract_axis_1(output, lengths - 1)
Now output is a tensor of shape [batch_size, num_cells]. If you need the [batch, 1, hidden] shape from the question, apply tf.expand_dims(output, 1).
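For readers who want to check the indexing without building a graph, the same gather can be sketched in NumPy. This is only an illustrative stand-in for tf.gather_nd, using the padded output printed in the question; the names last and last_3d are made up for this sketch. The last line adds the singleton time axis that the question asks for.

```python
import numpy as np

# Padded RNN output copied from the question: shape [batch=2, time=4, hidden=4].
output = np.array([
    [[0., 0., 0., 0.],
     [0.02188676, -0.01294564, 0.05340237, -0.47148666],
     [0.0343586, -0.02243731, 0.0870839, -0.89869428],
     [0., 0., 0., 0.]],
    [[0.00284752, -0.00315077, 0.00108094, -0.99883419],
     [0., 0., 0., 0.],
     [0., 0., 0., 0.],
     [0., 0., 0., 0.]],
])
lengths = np.array([3, 1])

# Same index pairs that extract_axis_1 builds: (batch index, last valid step).
batch_range = np.arange(output.shape[0])
indices = np.stack([batch_range, lengths - 1], axis=1)

# NumPy equivalent of gather_nd for 2-column indices.
last = output[indices[:, 0], indices[:, 1]]   # shape (2, 4)

# Add the time axis back to get [batch, 1, hidden].
last_3d = last[:, np.newaxis, :]              # shape (2, 1, 4)
print(last_3d)
```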

From the following two sources,
http://www.wildml.com/2016/08/rnns-in-tensorflow-a-practical-guide-and-undocumented-features/
outputs, last_states = tf.nn.dynamic_rnn(
    cell=cell,
    dtype=tf.float64,
    sequence_length=X_lengths,
    inputs=X)
or https://github.com/ageron/handson-ml/blob/master/14_recurrent_neural_networks.ipynb,
it is clear that last_states can be taken directly from the SECOND output of the dynamic_rnn call. It gives you the last states across all layers (for an LSTM, each one is an LSTMStateTuple), while outputs contains the states at every time step for the last layer only.

Okay, so it looks like there actually is an easier solution. As @Shao Tang and @Rahul mentioned, the preferred way to do this is to access the final cell state. Here’s why:
If you look at the GRUCell source code (below), you’ll see that the “state” the cell maintains is the hidden state itself (the cell’s output activations, not its weights). So, when tf.nn.dynamic_rnn returns the final state, it is returning the last relevant output that you are interested in. To demonstrate this, I tweaked your setup and got the results below.
GRUCell Call (rnn_cell_impl.py):
def call(self, inputs, state):
    """Gated recurrent unit (GRU) with nunits cells."""
    if self._gate_linear is None:
        bias_ones = self._bias_initializer
        if self._bias_initializer is None:
            bias_ones = init_ops.constant_initializer(1.0, dtype=inputs.dtype)
        with vs.variable_scope("gates"):  # Reset gate and update gate.
            self._gate_linear = _Linear(
                [inputs, state],
                2 * self._num_units,
                True,
                bias_initializer=bias_ones,
                kernel_initializer=self._kernel_initializer)
    value = math_ops.sigmoid(self._gate_linear([inputs, state]))
    r, u = array_ops.split(value=value, num_or_size_splits=2, axis=1)
    r_state = r * state
    if self._candidate_linear is None:
        with vs.variable_scope("candidate"):
            self._candidate_linear = _Linear(
                [inputs, r_state],
                self._num_units,
                True,
                bias_initializer=self._bias_initializer,
                kernel_initializer=self._kernel_initializer)
    c = self._activation(self._candidate_linear([inputs, r_state]))
    new_h = u * state + (1 - u) * c
    return new_h, new_h
Solution:
import numpy as np
import tensorflow as tf
batch = 2
dim = 3
hidden = 4
lengths = tf.placeholder(dtype=tf.int32, shape=[batch])
inputs = tf.placeholder(dtype=tf.float32, shape=[batch, None, dim])
cell = tf.nn.rnn_cell.GRUCell(hidden)
cell_state = cell.zero_state(batch, tf.float32)
output, state = tf.nn.dynamic_rnn(cell, inputs, lengths, initial_state=cell_state)
inputs_ = np.asarray([[[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]],
                      [[6, 6, 6], [7, 7, 7], [8, 8, 8], [9, 9, 9]]],
                     dtype=np.int32)
lengths_ = np.asarray([3, 1], dtype=np.int32)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    output_, state_ = sess.run([output, state], {inputs: inputs_, lengths: lengths_})
    print(output_)
    print(state_)
Output:
[[[ 0. 0. 0. 0. ]
[-0.24305521 -0.15512943 0.06614969 0.16873555]
[-0.62767833 -0.30741733 0.14819752 0.44313088]
[ 0. 0. 0. 0. ]]
[[-0.99152333 -0.1006391 0.28767768 0.76360202]
[ 0. 0. 0. 0. ]
[ 0. 0. 0. 0. ]
[ 0. 0. 0. 0. ]]]
[[-0.62767833 -0.30741733 0.14819752 0.44313088]
[-0.99152333 -0.1006391 0.28767768 0.76360202]]
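The reason the final state matches the last relevant output can be sketched in plain NumPy. The loop below is a simplified imitation of how dynamic_rnn handles sequence lengths (outputs past the end are zeroed, the state is copied through), not TensorFlow's actual implementation; the weights are random and every name here is made up for illustration.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, w_gates, b_gates, w_cand, b_cand):
    """One GRU step following the call() shown above; returns (output, state)."""
    gates = sigmoid(np.concatenate([x, h], axis=1) @ w_gates + b_gates)
    r, u = np.split(gates, 2, axis=1)            # reset gate, update gate
    c = np.tanh(np.concatenate([x, r * h], axis=1) @ w_cand + b_cand)
    new_h = u * h + (1 - u) * c
    return new_h, new_h                          # output and state are the same array

batch, steps, dim, hidden = 2, 4, 3, 4
rng = np.random.RandomState(0)                   # arbitrary weights, illustration only
w_gates = rng.normal(size=(dim + hidden, 2 * hidden))
b_gates = np.ones(2 * hidden)
w_cand = rng.normal(size=(dim + hidden, hidden))
b_cand = np.zeros(hidden)
inputs = rng.normal(size=(batch, steps, dim))
lengths = np.array([3, 1])

state = np.zeros((batch, hidden))
outputs = np.zeros((batch, steps, hidden))
for t in range(steps):
    out, new_state = gru_step(inputs[:, t], state, w_gates, b_gates, w_cand, b_cand)
    active = (t < lengths)[:, np.newaxis]        # True while a sequence is still running
    outputs[:, t] = np.where(active, out, 0.0)   # outputs past the end are zeroed
    state = np.where(active, new_state, state)   # finished sequences keep their state

# The final state equals the output at step lengths - 1 for every example.
last_output = outputs[np.arange(batch), lengths - 1]
print(np.allclose(state, last_output))
```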
For other readers who are working with the LSTMCell (another popular option), things work a little differently. The LSTMCell maintains its state differently: the state is either a tuple or a concatenation of the actual cell state and the hidden state. So, to access the final hidden state, set state_is_tuple=True when constructing the cell, and the final state will be a tuple: (final cell state, final hidden state). In this case,
_, (_, h) = tf.nn.dynamic_rnn(cell, inputs, lengths, initial_state=cell_state)
will give you the final hidden state.
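The nested unpacking works because the LSTM state is a named tuple with the fields (c, h). A minimal sketch in plain Python, where fake_dynamic_rnn is a hypothetical stub standing in for the real call and the numbers are invented:

```python
import collections

# Stand-in with the same field order as tf.nn.rnn_cell.LSTMStateTuple: (c, h).
LSTMStateTuple = collections.namedtuple("LSTMStateTuple", ("c", "h"))

def fake_dynamic_rnn():
    """Hypothetical stub returning (outputs, final_state) like dynamic_rnn does."""
    outputs = [[[0.1, 0.2], [0.3, 0.4]]]          # [batch=1, time=2, hidden=2]
    final_state = LSTMStateTuple(c=[[1.5, -0.7]], h=[[0.3, 0.4]])
    return outputs, final_state

# Positional unpacking, as in the line above.
_, (_, h) = fake_dynamic_rnn()

# Equivalent access by field name, which is harder to get wrong.
outputs, state = fake_dynamic_rnn()
print(h == state.h)
```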
References:
c_state and m_state in Tensorflow LSTM
https://github.com/tensorflow/tensorflow/blob/438604fc885208ee05f9eef2d0f2c630e1360a83/tensorflow/python/ops/rnn_cell_impl.py#L308
https://github.com/tensorflow/tensorflow/blob/438604fc885208ee05f9eef2d0f2c630e1360a83/tensorflow/python/ops/rnn_cell_impl.py#L415

Actually, the solution was not that hard. I implemented the following code:
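slices = []
for index, l in enumerate(tf.unstack(lengths)):
    # Output at the last valid time step of example `index`, all hidden units.
    last_step = tf.slice(rnn_out, begin=[index, l - 1, 0], size=[1, 1, hidden])
    slices.append(last_step)
last = tf.concat(slices, 0)  # TF >= 1.0 argument order: (values, axis)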
So, the full snippet would be the following:
import numpy as np
import tensorflow as tf
batch = 2
dim = 3
hidden = 4
lengths = tf.placeholder(dtype=tf.int32, shape=[batch])
inputs = tf.placeholder(dtype=tf.float32, shape=[batch, None, dim])
cell = tf.nn.rnn_cell.GRUCell(hidden)
cell_state = cell.zero_state(batch, tf.float32)
output, _ = tf.nn.dynamic_rnn(cell, inputs, lengths, initial_state=cell_state)
inputs_ = np.asarray([[[0, 0, 0], [1, 1, 1], [2, 2, 2], [3, 3, 3]],
[[6, 6, 6], [7, 7, 7], [8, 8, 8], [9, 9, 9]]],
dtype=np.int32)
lengths_ = np.asarray([3, 1], dtype=np.int32)
slices = []
for index, l in enumerate(tf.unstack(lengths)):
    # Output at the last valid time step of example `index`, all hidden units.
    last_step = tf.slice(output, begin=[index, l - 1, 0], size=[1, 1, hidden])
    slices.append(last_step)
last = tf.concat(slices, 0)  # TF >= 1.0 argument order: (values, axis)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    outputs = sess.run([output, last], {inputs: inputs_, lengths: lengths_})
    print('RNN output:')
    print(outputs[0])
    print()
    print('last relevant output:')
    print(outputs[1])
And the output:
RNN output:
[[[ 0. 0. 0. 0. ]
[-0.06667092 -0.09284072 0.01098599 -0.03676109]
[-0.09101103 -0.19828682 0.03546784 -0.08721405]
[ 0. 0. 0. 0. ]]
[[-0.00025157 -0.05704876 0.05527233 -0.03741353]
[ 0. 0. 0. 0. ]
[ 0. 0. 0. 0. ]
[ 0. 0. 0. 0. ]]]
last relevant output:
[[[-0.09101103 -0.19828682 0.03546784 -0.08721405]]
[[-0.00025157 -0.05704876 0.05527233 -0.03741353]]]

Related

Assign zeros to minimum values in numpy 3d array

I have a numpy array of shape (100, 100, 20) (in python 3)
I want to find for each 'pixel' the 15 channels with minimum values, and make them zeros (meaning: make the array sparse, keep only the 5 highest values).
Example:
input: array = [[1,2,3], [7,6,9], [12,71,3]], num_channels_to_zero = 2
output: [[0,0,3], [0,0,9], [0,71,0]]
How can I do it?
What I have so far:
array = numpy.random.rand(100, 100, 20)
inds = numpy.argsort(array, axis=-1) # also shape (100, 100, 20)
I want to do something like
array[..., inds[..., :15]] = 0
but it doesn't give me what I want
np.argsort outputs indices suitable for the [...]_along_axis functions of numpy. This includes np.put_along_axis:
import numpy as np
array = np.random.rand(100, 100, 20)
print(array[0,0])
#[0.44116124 0.94656705 0.20833932 0.29239585 0.33001399 0.82396784
# 0.35841905 0.20670957 0.41473762 0.01568006 0.1435386 0.75231818
# 0.5532527 0.69366173 0.17247832 0.28939985 0.95098187 0.63648877
# 0.90629116 0.35841627]
inds = np.argsort(array, axis=-1)
np.put_along_axis(array, inds[..., :15], 0, axis=-1)
print(array[0,0])
#[0. 0.94656705 0. 0. 0. 0.82396784
# 0. 0. 0. 0. 0. 0.75231818
# 0. 0. 0. 0. 0.95098187 0.
# 0.90629116 0. ]
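The same two calls can be checked against the small example from the question. The variable name num_channels_to_zero is taken from the question; everything else is as in the snippet above.

```python
import numpy as np

array = np.array([[1, 2, 3], [7, 6, 9], [12, 71, 3]])
num_channels_to_zero = 2

# Indices that would sort each row in ascending order.
inds = np.argsort(array, axis=-1)

# Zero the entries at the first `num_channels_to_zero` sorted positions,
# i.e. the smallest values of each row (modifies `array` in place).
np.put_along_axis(array, inds[..., :num_channels_to_zero], 0, axis=-1)
print(array)
```

The result matches the expected output given in the question: [[0, 0, 3], [0, 0, 9], [0, 71, 0]].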
As mentioned in the NumPy documentation:
From each row, a specific element should be selected. The row index is just [0, 1, 2] and the column index specifies the element to choose for the corresponding row, here [0, 1, 0]. Using both together the task can be solved using advanced indexing:
>>> x = np.array([[1, 2], [3, 4], [5, 6]])
>>> x[[0, 1, 2], [0, 1, 0]]
array([1, 4, 5])
So, for your example, setting the largest element of each row to zero:
a = np.array([[1,2,3], [7,6,9], [12,71,3]])
amax = a.argmax(axis=-1)
a[np.arange(a.shape[0]), amax] = 0
a
array([[ 1, 2, 0],
[ 7, 6, 0],
[12, 0, 3]])
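The snippet above zeroes one element per row. To zero the k smallest elements per row, as the question asks, the same advanced indexing can be used with a column of row indices that broadcasts against a [rows, k] array of column indices. This is a sketch of that extension, with k and the helper names chosen for illustration.

```python
import numpy as np

a = np.array([[1, 2, 3], [7, 6, 9], [12, 71, 3]])
k = 2  # number of smallest elements to zero in each row

# Column indices of the k smallest values in each row: shape (3, k).
smallest = np.argsort(a, axis=-1)[:, :k]

# Row indices as a column vector, shape (3, 1); it broadcasts against
# `smallest` so every (row, column) pair is addressed.
rows = np.arange(a.shape[0])[:, np.newaxis]

a[rows, smallest] = 0
print(a)
```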

mask 0 values during normalization

I am normalizing datasets, but the data contains a lot of zeros because of padding.
I can mask them during model training, but these zeros are affected when I apply normalization.
I am currently using the sklearn library to do the normalization:
from sklearn.preprocessing import StandardScaler,MinMaxScaler
For example, take a 3D array with shape (4, 3, 5) as (batch, step, features).
The amount of zero-padding varies from sample to sample, because these are features I extracted from audio files of varying lengths using a fixed window size.
[[[0 0 0 0 0],
[0 0 0 0 0],
[0 0 0 0 0]],
[[1 2 3 4 5],
[4 5 6 7 8],
[9 10 11 12 13]],
[[14 15 16 17 18],
[0 0 0 0 0],
[24 25 26 27 28]],
[[0 0 0 0 0],
[423 2 230 60 70],
[0 0 0 0 0]]
]
I want to normalize by column (feature), so I do:
scaler = MinMaxScaler()
X_train = scaler.fit_transform(X_train.reshape(-1,X_train.shape[-1])).reshape(X_train.shape)
X_test = scaler.transform(X_test.reshape(-1,X_test.shape[-1])).reshape(X_test.shape)
However, in this case, zeros are treated as real values. For example, the minimum value of the first column should be 1 instead of 0.
Further, the zeros are also changed by the scalers, but I want to keep them as zeros so that I can mask them during training:
model.add(tf.keras.layers.Masking(mask_value=0.0, input_shape=(X_train.shape[1], X_train.shape[2])))
Is there any way to mask them during normalization, so that only the non-padded steps in this example are used?
In addition, the actual array in my project is bigger, with shape (2000, 50, 68),
and the values of the 68 features can differ by a large amount. I tried normalizing by dividing each element by the biggest element in its row to avoid the impact of the zeros, but this did not work out well.
Masking for MinMaxScaler() alone can be handled with the code below.
Every other operation needs its own way of handling. If you list all the operations that need masking, we can solve them one by one and I'll extend my answer. For example, Keras layers can be masked with the tf.keras.layers.Masking() layer, as you mentioned.
The following code min/max-scales only the time steps that contain at least one non-zero feature; the rest remain zeros.
import numpy as np
from sklearn.preprocessing import MinMaxScaler
X = np.array([
[[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0],
[0, 0, 0, 0, 0]],
[[1, 2, 3, 4, 5],
[4, 5, 6, 7, 8],
[9, 10, 11, 12, 13]],
[[14, 15, 16, 17, 18],
[0, 0, 0, 0, 0],
[24, 25, 26, 27, 28]],
[[0, 0, 0, 0, 0],
[423, 2, 230, 60, 70],
[0, 0, 0, 0, 0]]
], dtype = np.float64)
nz = np.any(X, -1)
X[nz] = MinMaxScaler().fit_transform(X[nz])
print(X)
Output:
[[[0. 0. 0. 0. 0. ]
[0. 0. 0. 0. 0. ]
[0. 0. 0. 0. 0. ]]
[[0. 0. 0. 0. 0. ]
[0.007109 0.13043478 0.01321586 0.05357143 0.04615385]
[0.01895735 0.34782609 0.03524229 0.14285714 0.12307692]]
[[0.03080569 0.56521739 0.05726872 0.23214286 0.2 ]
[0. 0. 0. 0. 0. ]
[0.05450237 1. 0.10132159 0.41071429 0.35384615]]
[[0. 0. 0. 0. 0. ]
[1. 0. 1. 1. 1. ]
[0. 0. 0. 0. 0. ]]]
If you need to fit MinMaxScaler() on one dataset and apply it to others later, you can do the following:
scaler = MinMaxScaler().fit(X[np.any(X, -1)])
X[np.any(X, -1)] = scaler.transform(X[np.any(X, -1)])
Y[np.any(Y, -1)] = scaler.transform(Y[np.any(Y, -1)])
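To make the fit/transform split concrete without depending on sklearn, here is a minimal NumPy sketch that applies the min-max formula by hand. The helper names fit_masked_minmax and transform_masked_minmax are hypothetical, and the code assumes that an all-zero time step always means padding.

```python
import numpy as np

def fit_masked_minmax(X):
    """Per-feature min and max over the time steps that are not all-zero padding."""
    valid = X[np.any(X, axis=-1)]            # shape [n_valid_steps, features]
    return valid.min(axis=0), valid.max(axis=0)

def transform_masked_minmax(X, lo, hi):
    """Scale non-padded steps with (x - lo) / (hi - lo); padded steps stay zero."""
    scale = np.where(hi > lo, hi - lo, 1.0)  # avoid dividing by zero
    out = X.astype(np.float64)               # astype returns a copy
    nz = np.any(X, axis=-1)
    out[nz] = (X[nz] - lo) / scale
    return out

X_train = np.array([
    [[0, 0, 0, 0, 0], [0, 0, 0, 0, 0], [0, 0, 0, 0, 0]],
    [[1, 2, 3, 4, 5], [4, 5, 6, 7, 8], [9, 10, 11, 12, 13]],
    [[14, 15, 16, 17, 18], [0, 0, 0, 0, 0], [24, 25, 26, 27, 28]],
    [[0, 0, 0, 0, 0], [423, 2, 230, 60, 70], [0, 0, 0, 0, 0]],
], dtype=np.float64)

lo, hi = fit_masked_minmax(X_train)
X_scaled = transform_masked_minmax(X_train, lo, hi)
print(X_scaled[1])
```

One thing to watch for: a real time step that holds the minimum of every feature, such as [1, 2, 3, 4, 5] here, is scaled to all zeros and then looks like padding to a Masking layer. If that matters, it may be safer to keep the np.any(X, -1) mask computed before scaling and pass it along explicitly.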

How can I add a small image to a bigger one in Tensorflow?

I want to overlay a smaller image onto a larger one.
I have tried adding to a slice but couldn't get it to work.
So, as a simple example, how do I perform this NumPy operation in Tensorflow:
a = np.array([1, 1, 1, 1])
b = np.array([5, 5])
c = a
c[1:3] = c[1:3] + b
print(c)
# => [1 6 6 1]
This is one possible implementation:
import tensorflow as tf
# i and j are the first row and column of the patch within the image
# alpha (0..1) is the intensity of the overlay
def overlay_patch(img, patch, i, j, alpha=0.5):
    img_shape = tf.shape(img)
    img_rows, img_cols = img_shape[0], img_shape[1]
    patch_shape = tf.shape(patch)
    patch_rows, patch_cols = patch_shape[0], patch_shape[1]
    i_end = i + patch_rows
    j_end = j + patch_cols
    # Mix patch: alpha from patch, minus alpha from image
    overlay = alpha * (patch - img[i:i_end, j:j_end])
    # Pad patch with zeros up to the image size
    overlay_pad = tf.pad(overlay, [[i, img_rows - i_end], [j, img_cols - j_end], [0, 0]])
    # Make final image
    img_overlay = img + overlay_pad
    return img_overlay
Test:
img = tf.placeholder(tf.float32, [None, None, None])
patch = tf.placeholder(tf.float32, [None, None, None])
i = tf.placeholder(tf.int32, [])
j = tf.placeholder(tf.int32, [])
alpha = tf.placeholder(tf.float32, [])
img_overlay = overlay_patch(img, patch, i, j, alpha)
with tf.Session() as sess:
    result = sess.run(img_overlay, feed_dict={
        img: [[[ 1], [ 2], [ 3], [ 4]],
              [[ 5], [ 6], [ 7], [ 8]],
              [[ 9], [10], [11], [12]],
              [[13], [14], [15], [16]]],
        patch: [[[10], [20], [30]],
                [[40], [50], [60]]],
        i: 2, j: 1, alpha: 0.5
    })
    print(result[..., 0])
Output:
[[ 1. 2. 3. 4. ]
[ 5. 6. 7. 8. ]
[ 9. 10. 15.5 21. ]
[13. 27. 32.5 38. ]]
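The pad-then-add idea is easiest to see on the 1-D example from the question. The sketch below uses NumPy rather than TensorFlow so it can be run directly; np.pad plays the role of tf.pad, and start is a name chosen here for the offset.

```python
import numpy as np

a = np.array([1, 1, 1, 1])
b = np.array([5, 5])
start = 1  # position in `a` where `b` should be added

# Zero-pad `b` on both sides so it has the same length as `a`.
b_padded = np.pad(b, (start, len(a) - start - len(b)), mode="constant")

# Adding the padded array leaves `a` untouched outside the overlay region.
c = a + b_padded
print(c)
```

Unlike the slice assignment in the question, this builds a new array instead of modifying one in place, which is why it carries over to immutable tensors.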

Assigning zero to tensor at indices specified in a list

I have two tensors, for example
A = tf.Tensor(
[[1.0986123 0.6931472 0. 0.6931472 0. ]
[0. 0. 0. 0. 0. ]
[3.7376697 3.7612002 3.7841897 3.8066626 3.8286414]], shape=(3, 5), dtype=float32)
B = tf.Tensor(
[[2 1]
[2 2]], shape=(2, 2), dtype=int64)
Tensor B holds indices in tensor A. I want to update every value in tensor A to zero that is listed in the index list B.
So, the expected result would be
tf.Tensor(
[[1.0986123 0.6931472 0. 0.6931472 0. ]
[0. 0. 0. 0. 0. ]
[3.7376697 0 0 3.8066626 3.8286414]], shape=(3, 5), dtype=float32)
So the entries at index [2,1] and [2, 2] are set to 0.
I looked at tf.assign, but it can only be used with tf.Variable objects. tf.boolean_mask would be a nice way to do it, but I do not know, and could not find out, how to create a boolean mask from a list of indices.
I looked at the TensorFlow functions I could find and at related Stack Overflow answers, but couldn't find a satisfying solution.
You can use tf.scatter_nd_update for this. For example:
A = tf.Variable(
[[1.0986123, 0.6931472, 0. , 0.6931472, 0. ],
[0. , 0. , 0. , 0. , 0. ],
[3.7376697, 3.7612002, 3.7841897, 3.8066626, 3.8286414]], dtype=tf.float32)
B = tf.Variable(
[[2, 1],
[2, 2]], dtype=tf.int64)
C = tf.scatter_nd_update(A, B, tf.zeros(shape=tf.shape(B)[0]))
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(C))
or
A = tf.constant(
[[1.0986123, 0.6931472, 0. , 0.6931472, 0. ],
[0. , 0. , 0. , 0. , 0. ],
[3.7376697, 3.7612002, 3.7841897, 3.8066626, 3.8286414]], dtype=tf.float32)
B = tf.constant(
[[2, 1],
[2, 2]], dtype=tf.int64)
AV = tf.Variable(A)
C = tf.scatter_nd_update(AV, B, tf.zeros(shape=tf.shape(B)[0]))
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(sess.run(C))
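The question also asks how a boolean mask could be built from the index list. That idea is sketched below in NumPy as an illustration of the logic only; it is not the TensorFlow code from the answer, and the names keep and C are chosen for this example.

```python
import numpy as np

A = np.array(
    [[1.0986123, 0.6931472, 0.,        0.6931472, 0.       ],
     [0.,        0.,        0.,        0.,        0.       ],
     [3.7376697, 3.7612002, 3.7841897, 3.8066626, 3.8286414]], dtype=np.float32)
B = np.array([[2, 1],
              [2, 2]])

# Start with "keep everything", then switch off the listed (row, col) positions.
keep = np.ones(A.shape, dtype=bool)
keep[B[:, 0], B[:, 1]] = False

# Take A where the mask is True and zero elsewhere; A itself is not modified.
C = np.where(keep, A, np.zeros_like(A))
print(C)
```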

Convert output of tf.nn.top_k into a sparse matrix

As the title states, I'm trying to extract the highest n elements per row from a matrix in tensorflow, and store the result in a sparse Tensor.
I've been able to extract the indices and values with tf.nn.top_k, but the indices don't follow the convention required by tf.SparseTensor.
Specifically, tf.nn.top_k returns a matrix of col indices with the same shape as the resulting value matrix (Rows x n), whereas tf.SparseTensor wants a (# non-zero x 2) matrix with 1 row per non-zero element and the columns holding the row and col indices.
The values have an analogous problem: a flat list of non-zero elements is expected instead of a matrix of values.
How can I quickly convert between these indexing notation schemes?
This is doable with a bit of modular arithmetic. Here's an example that works on matrices, although it would be possible to loop over more axes.
import tensorflow as tf
def slices_to_dims(slice_indices):
    """
    Args:
      slice_indices: An [N, k] Tensor mapping to column indices.
    Returns:
      An index Tensor with shape [N * k, 2], corresponding to indices suitable for
      passing to SparseTensor.
    """
    slice_indices = tf.cast(slice_indices, tf.int64)
    num_rows = tf.shape(slice_indices, out_type=tf.int64)[0]
    row_range = tf.range(num_rows)
    item_numbers = slice_indices * num_rows + tf.expand_dims(row_range, axis=1)
    item_numbers_flat = tf.reshape(item_numbers, [-1])
    return tf.stack([item_numbers_flat % num_rows,
                     item_numbers_flat // num_rows], axis=1)
Example usage:
dense_shape = [5, 7]
dense_matrix = tf.random_normal(shape=dense_shape)
top_values, top_indices = tf.nn.top_k(dense_matrix, k=2)
sparse_indices = slices_to_dims(top_indices)
sparse_tensor = tf.sparse_reorder(tf.SparseTensor(
    indices=sparse_indices,
    values=tf.reshape(top_values, [-1]),
    dense_shape=dense_shape))
densified_top = tf.sparse_tensor_to_dense(sparse_tensor)
with tf.Session() as session:
    sparse_top, dense_original, dense_selected = session.run(
        [sparse_tensor, dense_matrix, densified_top])
    print(dense_original)
    print(dense_selected)
    print(sparse_top)
Prints:
[[ 1.44056129 -1.01790774 -0.2795608 2.34854746 -2.27528405 -0.62035948
3.36598897]
[ 0.7114948 -0.42564821 -0.93446779 -0.25373486 -0.51730365 0.72331643
-0.75625718]
[-0.6501748 -0.92748415 -0.95409006 -0.07157528 0.80637723 -0.32177576
-1.4516511 ]
[-1.081038 -0.67226124 -1.19455576 0.44537872 -0.69019234 -0.61539739
0.15328468]
[ 0.43032476 -0.11295394 0.83491379 -0.67906654 0.20325914 -0.0155068
0.52107805]]
[[ 0. 0. 0. 2.34854746 0. 0.
3.36598897]
[ 0.7114948 0. 0. 0. 0. 0.72331643
0. ]
[ 0. 0. 0. -0.07157528 0.80637723 0. 0. ]
[ 0. 0. 0. 0.44537872 0. 0.
0.15328468]
[ 0. 0. 0.83491379 0. 0. 0.
0.52107805]]
SparseTensorValue(indices=array([[0, 3],
[0, 6],
[1, 0],
[1, 5],
[2, 3],
[2, 4],
[3, 3],
[3, 6],
[4, 2],
[4, 6]]), values=array([ 2.34854746, 3.36598897, 0.7114948 , 0.72331643, -0.07157528,
0.80637723, 0.44537872, 0.15328468, 0.83491379, 0.52107805], dtype=float32), dense_shape=array([5, 7]))
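The modular arithmetic in slices_to_dims can be traced in plain Python. Each column index is encoded together with its row number as col * num_rows + row, and the pair is then recovered with % and //. The sketch below mirrors that logic on nested lists; top_indices holds the column positions of the two largest values per row, read off the printed matrix above, and sorted() stands in for tf.sparse_reorder.

```python
def slices_to_dims(slice_indices):
    """Turn an [N, k] nested list of column indices into [N * k] (row, col) pairs."""
    num_rows = len(slice_indices)
    pairs = []
    for row, cols in enumerate(slice_indices):
        for col in cols:
            # Encode (row, col) as one number, as the TensorFlow version does.
            item_number = col * num_rows + row
            # Decode: the remainder is the row, the quotient is the column.
            pairs.append([item_number % num_rows, item_number // num_rows])
    return pairs

# Column indices of the top-2 values in each of the 5 rows printed above.
top_indices = [[6, 3], [5, 0], [4, 3], [3, 6], [2, 6]]

pairs = slices_to_dims(top_indices)
print(sorted(pairs))
```

The sorted pairs are the same as the indices in the SparseTensorValue printed above.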
