Broadcast SparseTensor in tensorflow - python

I want to multiply, elementwise, a dense tensor of shape [n, n, k] with a sparse tensor of shape [n, n, 1]. I want the values from the sparse tensor to repeat along the axis of size k, as they would if I used a dense tensor instead and relied on implicit broadcasting.
However, the SparseTensor.__mul__ operation does not support broadcasting the sparse operand, and I didn't find an operator to explicitly broadcast the sparse tensor. How could I achieve this?

If you do not want to just convert the sparse tensor to dense, you can select the right values from the dense tensor and build a sparse result directly, something like this:
import tensorflow as tf
import numpy as np
with tf.Graph().as_default(), tf.Session() as sess:
    # Input data
    x = tf.placeholder(tf.float32, shape=[None, None, None])
    y = tf.sparse.placeholder(tf.float32, shape=[None, None, 1])
    # Indices of sparse tensor without third index coordinate
    indices2 = y.indices[:, :-1]
    # Values of dense tensor corresponding to sparse tensor values
    x_sp = tf.gather_nd(x, indices2)
    # Values of the resulting sparse tensor
    res_vals = tf.reshape(x_sp * tf.expand_dims(y.values, 1), [-1])
    # Shape of the resulting sparse tensor
    res_shape = tf.shape(x, out_type=tf.int64)
    # Make sparse tensor indices
    k = res_shape[2]
    v = tf.size(y.values)
    # Add third coordinate to existing sparse tensor coordinates
    idx1 = tf.tile(tf.expand_dims(indices2, 1), [1, k, 1])
    idx2 = tf.tile(tf.range(k), [v])
    res_idx = tf.concat([tf.reshape(idx1, [-1, 2]), tf.expand_dims(idx2, 1)], axis=1)
    # Make sparse result
    res = tf.SparseTensor(res_idx, res_vals, res_shape)
    # Dense value for testing
    res_dense = tf.sparse.to_dense(res)
    # Dense operation for testing
    res_dense2 = x * tf.sparse.to_dense(y)
    # Test
    x_val = np.arange(48).reshape(4, 4, 3)
    y_val = tf.SparseTensorValue([[0, 0, 0], [2, 3, 0], [3, 1, 0]], [1, 2, 3], [4, 4, 1])
    res_dense_val, res_dense2_val = sess.run((res_dense, res_dense2),
                                             feed_dict={x: x_val, y: y_val})
    print(np.allclose(res_dense_val, res_dense2_val))
    # True
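The index arithmetic in the answer above can be checked without a TensorFlow session. The following is a minimal NumPy sketch of the same idea, not the answer's own code: the helper name `sparse_broadcast_mul` is hypothetical, and the sparse tensor is assumed to be given as plain COO index and value arrays.

```python
import numpy as np

def sparse_broadcast_mul(x, indices, values):
    """Multiply dense x of shape [n, n, k] by a sparse [n, n, 1] tensor in COO form.

    Returns (res_indices, res_values) describing a sparse result of shape x.shape.
    Hypothetical helper, written only to illustrate the index construction.
    """
    x = np.asarray(x)
    indices = np.asarray(indices, dtype=np.int64)
    values = np.asarray(values, dtype=x.dtype)
    k = x.shape[2]
    # Drop the third coordinate of the sparse indices (it is always 0)
    idx2d = indices[:, :2]
    # Rows of x addressed by the sparse entries: shape [v, k]
    x_sel = x[idx2d[:, 0], idx2d[:, 1]]
    # Broadcast each sparse value along the last axis, then flatten
    res_values = (x_sel * values[:, None]).reshape(-1)
    # Repeat every 2D index k times and append the third coordinate 0..k-1
    repeated = np.repeat(idx2d, k, axis=0)
    third = np.tile(np.arange(k, dtype=np.int64), len(values))
    res_indices = np.concatenate([repeated, third[:, None]], axis=1)
    return res_indices, res_values

x = np.arange(48, dtype=np.float32).reshape(4, 4, 3)
indices = [[0, 0, 0], [2, 3, 0], [3, 1, 0]]
values = [1, 2, 3]
res_idx, res_vals = sparse_broadcast_mul(x, indices, values)

# Compare against the dense, implicitly broadcast product
y_dense = np.zeros((4, 4, 1), dtype=np.float32)
y_dense[tuple(np.array(indices).T)] = values
res_dense = np.zeros_like(x)
res_dense[tuple(res_idx.T)] = res_vals
print(np.allclose(res_dense, x * y_dense))
```

The result holds k entries per non-zero of the sparse operand, so it stays sparse as long as the [n, n, 1] tensor is sparse.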

Related

Tensorflow custom layer: Creating a sparse matrix with trainable parameters

A model that I am working on should be predicting quite a lot of variables simultaneously (>1000). Therefore I would like to have a small neural network at the end of the network for each output.
In order to do this compactly, I would like to find a way to create a sparse trainable connection between two layers in the neural network within the Tensorflow framework.
Only a small portion of the connection matrix should be trainable: It is only the parameters that are part of the block-diagonal.
For example:
The connection matrix is the following:
The trainable parameters should be in the place of the 1's.
I have written exactly such a layer:
https://github.com/ArnovanHilten/GenNet/blob/master/GenNet_utils/LocallyDirectedConnected_tf2.py
It takes a sparse matrix as an input and lets you decide how to connect between layers. The layer uses sparse tensors and matrix multiplications.
Edit: a comment asked, "Is this a trainable object though?"
The answer: no. You currently cannot make a sparse matrix trainable. Instead, you can use a mask matrix (see the end of this answer).
But if you need to use a sparse matrix, you just have to use tf.sparse.sparse_dense_matmul() or tf.sparse_tensor_to_dense() where your sparse matrix interacts with a dense matrix. I have taken a simple XOR example from here and replaced a dense matrix with a sparse one:
#Declaring necessary modules
import tensorflow as tf
import numpy as np
"""
A simple numpy implementation of a XOR gate to understand the backpropagation
algorithm
"""
x = tf.placeholder(tf.float32,shape = [4,2],name = "x")
#declaring a place holder for input x
y = tf.placeholder(tf.float32,shape = [4,1],name = "y")
#declaring a place holder for desired output y
m = np.shape(x)[0]#number of training examples
n = np.shape(x)[1]#number of features
hidden_s = 2 #number of nodes in the hidden layer
l_r = 1#learning rate initialization
theta1 = tf.SparseTensor(indices=[[0, 0],[0, 1], [1, 1]], values=[0.1, 0.2, 0.1], dense_shape=[3, 2])
#theta1 = tf.cast(tf.Variable(tf.random_normal([3,hidden_s]),name = "theta1"),tf.float64)
theta2 = tf.cast(tf.Variable(tf.random_normal([hidden_s+1,1]),name = "theta2"),tf.float32)
#conducting forward propagation
a1 = tf.concat([np.c_[np.ones(x.shape[0])],x],1)
#the weights of the first layer are multiplied by the input of the first layer
#z1 = tf.sparse_tensor_dense_matmul(theta1, a1)
z1 = tf.matmul(a1,tf.sparse_tensor_to_dense(theta1))
#the input of the second layer is the output of the first layer, passed through the
a2 = tf.concat([np.c_[np.ones(x.shape[0])],tf.sigmoid(z1)],1)
#the input of the second layer is multiplied by the weights
z3 = tf.matmul(a2,theta2)
#the output is passed through the activation function to obtain the final probability
h3 = tf.sigmoid(z3)
cost_func = -tf.reduce_sum(y*tf.log(h3)+(1-y)*tf.log(1-h3),axis = 1)
#built in tensorflow optimizer that conducts gradient descent using specified
optimiser = tf.train.GradientDescentOptimizer(learning_rate = l_r).minimize(cost_func)
#setting required X and Y values to perform XOR operation
X = [[0,0],[0,1],[1,0],[1,1]]
Y = [[0],[1],[1],[0]]
#initializing all variables, creating a session and running a tensorflow session
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)
#running gradient descent for each iteration
for i in range(200):
    sess.run(optimiser, feed_dict = {x:X,y:Y})#setting place holder values using feed_dict
    if i%100==0:
        print("Epoch:",i)
        print(sess.run(theta1))
and the output is:
Epoch: 0
SparseTensorValue(indices=array([[0, 0],
[0, 1],
[1, 1]]), values=array([0.1, 0.2, 0.1], dtype=float32), dense_shape=array([3, 2]))
Epoch: 100
SparseTensorValue(indices=array([[0, 0],
[0, 1],
[1, 1]]), values=array([0.1, 0.2, 0.1], dtype=float32), dense_shape=array([3, 2]))
So the only way is to use a mask matrix. You can apply it by element-wise multiplication or with tf.where.
1) Multiplication: create a mask matrix of the desired shape and multiply it element-wise with your weight matrix:
mask = tf.Variable([[1.,0.,0.],[0.,1.,0.],[0.,0.,1.]],name ='mask', trainable=False)
weight = tf.cast(tf.Variable(tf.random_normal([3,3])),tf.float32)
desired_tensor = tf.multiply(weight, mask)
2) tf.where: keep the weight where the mask is set and use zero elsewhere:
mask = tf.Variable([[1,0,0],[0,1,0],[0,0,1]],name ='mask', trainable=False)
weight = tf.cast(tf.Variable(tf.random_normal([3,3])),tf.float32)
desired_tensor = tf.where(mask > 0, weight, tf.zeros_like(weight))
Hope it helps
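To make the masking idea concrete for the block-diagonal case from the question, here is a small NumPy sketch. The block sizes, the random data and the sum-of-outputs loss are assumptions chosen only for illustration. It shows that an element-wise mask zeroes the weights outside the blocks, and that the gradient reaching those positions is zero too, so they never receive updates.

```python
import numpy as np

rng = np.random.default_rng(0)

# Block-diagonal mask: 3 blocks, each connecting 2 inputs to 1 output (assumed sizes)
mask = np.kron(np.eye(3), np.ones((2, 1)))      # shape (6, 3)
weight = rng.normal(size=mask.shape)

# Element-wise masking: entries outside the blocks are forced to zero
masked_weight = weight * mask

# For loss = sum(x @ (weight * mask)), the gradient with respect to
# weight[i, j] is sum_b x[b, i] * mask[i, j], so it is zero outside the blocks
x = rng.normal(size=(4, 6))
grad = np.outer(x.sum(axis=0), np.ones(mask.shape[1])) * mask

print(mask.astype(int))
print(np.count_nonzero(masked_weight), "non-zero weights out of", weight.size)
```

Only the 6 entries inside the blocks carry weights and gradients; the full dense matrix is still stored, which is the memory cost of the mask approach compared with a true sparse representation.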
You can do that by using sparse tensors like so:
tf.SparseTensor(indices=[[0, 0], [1, 2]], values=[1, 2], dense_shape=[3, 4])
which represents the dense tensor:
[[1, 0, 0, 0]
[0, 0, 2, 0]
[0, 0, 0, 0]]
You can read more in the SparseTensor documentation here:
https://www.tensorflow.org/api_docs/python/tf/sparse/SparseTensor
Hope it helps!

return the top_k masked softmax of each row for a 2D tensor

For any 2D tensor like
[[2,5,4,7],
[7,5,6,8]],
I want to apply softmax to the top k elements in each row and then construct a new tensor in which all the other elements are replaced with 0.
The result should be to get the softmax of top k (here k=2) elements for each row [[7,5],[8,7]],
which is thus
[[0.880797,0.11920291],
[0.7310586,0.26894143]]
and then reconstruct a new tensor according to the index of the top k elements in the original tensor, the final result should be
[[0,0.11920291,0,0.880797],
[0.26894143,0,0,0.7310586]].
Is it possible to implement this kind of masked softmax in tensorflow? Many thanks in advance!
Here is how you can do that:
import tensorflow as tf
# Input data
a = tf.placeholder(tf.float32, [None, None])
num_top = tf.placeholder(tf.int32, [])
# Find top elements
a_top, a_top_idx = tf.nn.top_k(a, num_top, sorted=False)
# Apply softmax
a_top_sm = tf.nn.softmax(a_top)
# Reconstruct into original shape
a_shape = tf.shape(a)
a_row_idx = tf.tile(tf.range(a_shape[0])[:, tf.newaxis], (1, num_top))
scatter_idx = tf.stack([a_row_idx, a_top_idx], axis=-1)
result = tf.scatter_nd(scatter_idx, a_top_sm, a_shape)
# Test
with tf.Session() as sess:
    result_val = sess.run(result, feed_dict={a: [[2, 5, 4, 7], [7, 5, 6, 8]], num_top: 2})
    print(result_val)
Output:
[[0. 0.11920291 0. 0.880797 ]
[0.26894143 0. 0. 0.7310586 ]]
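The same select, softmax and scatter steps can be reproduced in NumPy to check the numbers from the question. This is a sketch rather than a translation of the TensorFlow code: the function name `topk_masked_softmax` is made up for the example, and `np.argpartition` stands in for `tf.nn.top_k`.

```python
import numpy as np

def topk_masked_softmax(a, k):
    """Softmax over the k largest entries of each row; every other entry becomes 0."""
    a = np.asarray(a, dtype=np.float64)
    # Column indices of the k largest entries per row (in no particular order)
    top_idx = np.argpartition(a, -k, axis=1)[:, -k:]
    top_vals = np.take_along_axis(a, top_idx, axis=1)
    # Numerically stable softmax over the selected values only
    e = np.exp(top_vals - top_vals.max(axis=1, keepdims=True))
    sm = e / e.sum(axis=1, keepdims=True)
    # Scatter the softmax values back to their original columns
    out = np.zeros_like(a)
    np.put_along_axis(out, top_idx, sm, axis=1)
    return out

result = topk_masked_softmax([[2, 5, 4, 7], [7, 5, 6, 8]], k=2)
print(result)
```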
EDIT:
Actually, there is a function that more closely does what you intend, tf.sparse.softmax. However, it requires a SparseTensor as input, and I'm not sure it should be faster since it has to figure out which sparse values go together in the softmax. The good thing about this function is that you could have different number of elements to softmax in each row, but in your case that does not seem to be important. Anyway, here is an implementation with that, in case you find it useful.
import tensorflow as tf
a = tf.placeholder(tf.float32, [None, None])
num_top = tf.placeholder(tf.int32, [])
# Find top elements
a_top, a_top_idx = tf.nn.top_k(a, num_top, sorted=False)
# Flatten values
sparse_values = tf.reshape(a_top, [-1])
# Make sparse indices
shape = tf.cast(tf.shape(a), tf.int64)
a_row_idx = tf.tile(tf.range(shape[0])[:, tf.newaxis], (1, num_top))
sparse_idx = tf.stack([a_row_idx, tf.cast(a_top_idx, tf.int64)], axis=-1)
sparse_idx = tf.reshape(sparse_idx, [-1, 2])
# Make sparse tensor
a_top_sparse = tf.SparseTensor(sparse_idx, sparse_values, shape)
# Reorder sparse tensor
a_top_sparse = tf.sparse.reorder(a_top_sparse)
# Softmax
result_sparse = tf.sparse.softmax(a_top_sparse)
# Convert back to dense (or you can keep working with the sparse tensor)
result = tf.sparse.to_dense(result_sparse)
# Test
with tf.Session() as sess:
    result_val = sess.run(result, feed_dict={a: [[2, 5, 4, 7], [7, 5, 6, 8]], num_top: 2})
    print(result_val)
    # Same as before
Let's say you have a weights tensor w with shape (None, N).
Find the minimum value of the top k elements:
top_kw = tf.math.top_k(w, k=10, sorted=False)[0]
min_w = tf.reduce_min(top_kw, axis=1, keepdims=True)
Generate a boolean mask for the weights tensor:
mask_w = tf.greater_equal(w, min_w)
mask_w = tf.cast(mask_w, tf.float32)
Compute the custom softmax using the mask:
w = tf.multiply(tf.exp(w), mask_w) / tf.reduce_sum(tf.multiply(tf.exp(w), mask_w), axis=1, keepdims=True)
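A NumPy sketch of this threshold-mask variant makes one property easy to see: because the mask keeps everything greater than or equal to the k-th largest value, a tie at the threshold keeps more than k entries. The function name is hypothetical, and subtracting the row maximum before `exp` is an addition here for numerical stability, not part of the answer above.

```python
import numpy as np

def threshold_masked_softmax(w, k):
    """Softmax restricted to entries >= the k-th largest value of each row."""
    w = np.asarray(w, dtype=np.float64)
    # k-th largest value per row acts as the threshold
    kth = np.sort(w, axis=1)[:, -k][:, None]
    mask = (w >= kth).astype(w.dtype)
    # Subtracting the row maximum leaves the result unchanged but avoids overflow
    e = np.exp(w - w.max(axis=1, keepdims=True)) * mask
    return e / e.sum(axis=1, keepdims=True)

no_ties = threshold_masked_softmax([[2, 5, 4, 7], [7, 5, 6, 8]], k=2)
print(no_ties)

# Three entries share the threshold value 3, so all three survive with k=2
tied = threshold_masked_softmax([[1, 3, 3, 3]], k=2)
print(np.count_nonzero(tied))
```

If exactly k non-zero outputs per row are required, an index-based scatter is the safer choice; otherwise the mask version is shorter and avoids building indices.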

Custom Layer behaves differently when inside keras model

I'm working on a Permutational Equivariant Layer for Keras based on this paper https://arxiv.org/pdf/1612.04530.pdf and previous work by Josef Ondrej found here.
The layer itself is a Keras Model consisting of multiple layers:
from keras import backend as K
from keras import losses
from keras.layers import Average, Add, Concatenate, Maximum, Input, Dense, Lambda
from keras.models import Model
from keras.engine.topology import Layer
def PermutationEquivariant(input_shape, layer_size, tuple_dim = 2, reduce_fun = "sum", dense_params = {}):
    """
    Implements a permutation equivariant layer.
    Each batch in our data consists of `input_shape[0]` observations
    each with `input_shape[1]` features.
    Args:
        input_shape -- A pair of `int` - (number of observations in one batch x
            number of features of each observation). The batch dimension is not included.
        layer_size -- `int`. Size of dense layer applied to each tuple of observations.
        tuple_dim -- A `int`, how many observations to put in one tuple.
        reduce_fun -- A `string`, type of function to "average" over all tuples starting with the same index.
    Returns:
        g -- A keras Model - the permutation equivariant layer.
            It consists of one tuple layer that creates all possible `tuple_dim`-tuples
            of observations, sorted on an axis along which the first index is constant.
            The same dense layer is applied on every tuple and then some symmetric pooling function is applied
            across all tuples with the same first index (for example mean or maximum).
    """
    inputs = Input(shape=input_shape)## input_shape: batch_size x row x col
    ## SeperatedTuple layer
    x = SeperatedTuples(tuple_dim, input_shape = input_shape)(inputs)## out_shape: batch_size x row x row ** (tuple_dim-1) x tuple_dim*col
    ## Dense layer -- implemented with a conv layer
    # Use the same dense layer for each tuple
    dense_input_shape = (tuple_dim*input_shape[1], ) # batch_size x tuple_dim*col
    dense_layer = Dense(input_shape = dense_input_shape, units=layer_size, **dense_params)
    # iterate through rows
    x_i_list = []
    for i in range(input_shape[0]):
        xi_j_list = []
        # applying the dense layer to each tuple where first index equals i
        # here we could also use a 1x1 convolution. Instead of reusing
        # the dense layer for each tuple, we would be reusing the kernels
        for j in range(input_shape[0] ** (tuple_dim-1)):
            input_ij = Lambda(lambda x : x[:,i,j,:], output_shape=(tuple_dim*input_shape[-1],))(x) ##out_shape: batch_size x tuple_dim * col
            xi_j_list += [dense_layer(input_ij)] ## xi_j_list-shape: row x batch_size x layer_size
        ## Pooling layer
        # Pooling the list of the dense outputs of all the tuples where first index equals i to out_shape: batch_size x layer_size
        # note that axis=0 because in previous step row-axis comes before batch_size-axis
        # Use Lambda Wrapper to preserve the output being a Keras Tensor
        if reduce_fun == "mean":
            pooling_layer = Average(axis=1)
            #pooling_layer = Lambda(lambda x : K.mean(x, axis = 0))
        elif reduce_fun == "max":
            pooling_layer = Maximum()
            #pooling_layer = Lambda(lambda x : K.max(x, axis = 0))
        elif reduce_fun == "sum":
            pooling_layer = Add()
            #pooling_layer = Lambda(lambda x : K.sum(x, axis = 0))
        else:
            raise ValueError("Invalid value for argument `reduce_fun` provided. ")
        xi = pooling_layer(xi_j_list) ## xi-shape: batch_size x layer_size
        x_i_list += [xi]
    # x_i_list-shape:
    # Concatenate the results of each row
    x = Lambda(lambda x : K.stack(x, axis=1), output_shape = (input_shape[0], layer_size))(x_i_list) ## out_shape: batch_size x row x layer_size
    model = Model(inputs=inputs, outputs=x)
    return model
class SeperatedTuples(Layer):
    """
    Creates all possible tuples of rows of 2D tensor, with an additional axis
    along which the first elements are constant.
    In the case of tuple_dim = 2, from one input batch:
        x_1,
        x_2,
        ...
        x_n,
    where x_i are rows of the tensor, it creates 3D output tensor:
        [[x_1 | x_1, x_1 | x_2 ... x_1 | x_n],
         [x_2 | x_1, x_2 | x_2 ... x_2 | x_n],
         ...
         ... x_n | x_n]]
    Args:
        tuple_dim -- A `int`. Dimension of one tuple (i.e. how many rows from the input
            tensor to combine to create a row in output tensor)
        input_shape -- A `tuple` of `int`. In the most frequent case where our data
            has shape (batch_size x num_rows x num_cols) this should be (num_rows x num_cols).
    """
    def __init__(self, tuple_dim = 2, **kwargs):
        self.tuple_dim = tuple_dim
        super(SeperatedTuples, self).__init__(**kwargs)
    def create_indices(self, n, k = 2):
        """
        Creates all integer valued coordinate k-tuples in k dimensional hypercube with edge size n.
        for example n = 4, k = 2
        returns [[0, 0], [0, 1], [0, 2], [0, 3],
                 [1, 0], [1, 1], [1, 2], [1, 3],
                 ...
                 [3, 0], [3, 1], [3, 2], [3, 3]]
        Args:
            n -- A `int`, edge size of the hypercube.
            k -- A `int`, dimension of the hypercube.
        Returns:
            indices_n_k -- A `list` of `list` of `int`. Each inner list represents coordinates of one integer point
                in the hypercube.
        """
        if k == 0:
            indices_n_k = [[]]
        else:
            indices_n_k_minus_1 = self.create_indices(n, k-1)
            indices_n_k = [[i] + indices_n_k_minus_1[c] for i in range(n) for c in range(n**(k-1))]
        return indices_n_k
    def create_seperated_indices(self, n, k = 2):
        """
        Same as create_indices, just that there is an additional axis along which the first value of the tuples is constant
        for example n = 4, k = 2
        returns [[[0, 0], [0, 1], [0, 2], [0, 3]],
                 [[1, 0], [1, 1], [1, 2], [1, 3]],
                 ...
                 [[3, 0], [3, 1], [3, 2], [3, 3]]]
        shape: row x row x k
        """
        indices = self.create_indices(n,k)
        seperated_indices = [indices[i:i + n] for i in range(0, len(indices), n)]
        return seperated_indices
    def build(self, input_shape):
        # Create indexing tuple
        self.gathering_indices = self.create_seperated_indices(input_shape[-2], self.tuple_dim)
        super(SeperatedTuples, self).build(input_shape) # Be sure to call this somewhere!
    def call(self, x):
        """
        input_dim : batch_size x rows x cols
        output_dim : batch_size x rows x rows ** (tuple_dim-1) x cols * tuple_dim
        """
        stacks_of_tuples = K.map_fn(
            fn = lambda z : ## z shape: row x col
                K.stack(
                    [K.concatenate(
                        [K.reshape(
                            K.gather(z, i), ## shape: tuple_dim x col
                            shape = (1,-1)
                        ) ## shape: 1 x tuple_dim*col
                        for i in indices # i-dim: tuple_dim, indices-shape: row x tuple_dim
                        ], ## shape: row x 1 x tuple_dim*col
                        axis = 0
                    ) ## shape: row x tuple_dim*col
                    for indices in self.gathering_indices # gathering_indices-shape: row x row x tuple_dim
                    ],
                    axis=0), ## shape: row x row x tuple_dim*col
            elems = x ## shape: batch_size x row x col
        ) ## shape: batch_size x row x row x tuple_dim*col
        return stacks_of_tuples
    def compute_output_shape(self, input_shape):
        """
        input_shape: batch_size x rows x cols
        output_shape: batch_size x rows x rows ** (tuple_dim-1) x cols * tuple_dim
        """
        output_shape = list(input_shape)
        output_shape[-1] = output_shape[-1] * self.tuple_dim
        output_shape[-2] = output_shape[-2] ** self.tuple_dim
        return tuple(output_shape)
When testing the PermutationEquivariant layer all alone, everything seems to work fine (run 1). However, when I try to incorporate it in a larger model, the outputs just repeat themselves (run 2).
from keras.models import Model
from keras.layers import Input, Lambda
import numpy as np
# parameters for Permutational Equivariant layer
input_shape = (2,5)
dense_params = {'kernel_initializer': 'glorot_normal', 'bias_initializer': 'glorot_normal', 'activation': 'tanh'}
sample = np.random.random((1,) + input_shape)
# run 1: Using only the PermutationEquivariant layer as a model by itself seems to work
model_1 = PermutationEquivariant(input_shape=input_shape, layer_size=10, tuple_dim=2, reduce_fun="sum", dense_params = dense_params)
model_1.compile(optimizer='sgd', loss='categorical_crossentropy')
print("model_1: \n", model_1.predict(sample))
#model_1:
#[[[-1.0494264 -1.6808903 1.2861781 -0.90004706 1.6178854
# 1.6686234 -1.5724193 1.2454509 0.3730019 -1.4580158 ]
# [-1.3904197 -1.467866 1.0848606 -1.2094728 1.6304723
# 1.6369174 -1.4074551 0.58116794 0.292305 -1.7162979 ]]]
# run 2: Incorporating the PermutationEquivariant layer inside another model makes the output constant along the first axis
inputs = Input(shape=input_shape)
x = PermutationEquivariant(input_shape=input_shape, layer_size=10, tuple_dim=2, reduce_fun="sum", dense_params = dense_params)(inputs)
model_2 = Model(inputs=inputs,outputs = x)
model_2.compile(optimizer='sgd', loss='categorical_crossentropy')
print("model_2: \n", model_2.predict(sample))
#model_2:
# [[[ 0.72823656 1.2213255 -0.28404936 1.4711846 -0.49544945
# 1.7930243 -0.7502286 1.892496 -1.675402 -0.2252224 ]
# [ 0.72823656 1.2213255 -0.28404936 1.4711846 -0.49544945
# 1.7930243 -0.7502286 1.892496 -1.675402 -0.2252224 ]]]
I have tried theano and tensorflow as backends, both with the same result. Does anybody have an idea why it behaves differently when inside another model / what am I missing? I appreciate any help!
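One possible cause, offered here as an assumption rather than a confirmed diagnosis, is Python's late binding of loop variables inside lambdas. The slicing lambda `lambda x : x[:,i,j,:]` looks up `i` and `j` when it is called, not when it is defined. If the Lambda functions are executed again when the model is called on a new input (as in run 2), every slice would use the final loop values, which would match the identical rows in the output. The plain-Python sketch below shows only the binding behaviour; the helper names are made up for the example.

```python
def make_late(n):
    """Each lambda reads the loop variable `i` when it is called."""
    funcs = []
    for i in range(n):
        funcs.append(lambda row: row[i])
    return funcs

def make_bound(n):
    """Each lambda stores the current `i` as a default argument."""
    funcs = []
    for i in range(n):
        funcs.append(lambda row, i=i: row[i])
    return funcs

row = [10, 20, 30]
late_out = [f(row) for f in make_late(3)]
bound_out = [f(row) for f in make_bound(3)]
print(late_out)   # [30, 30, 30]
print(bound_out)  # [10, 20, 30]
```

Under that assumption, binding the indices at definition time in the layer, for example `lambda x, i=i, j=j: x[:, i, j, :]`, would be the corresponding change to try.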

Converting Tensor to a SparseTensor for ctc_loss [duplicate]

This question already has answers here:
Sparse Tensor (matrix) from a dense Tensor Tensorflow
(5 answers)
Closed 5 years ago.
Is there a way to convert a dense tensor into a sparse tensor? Apparently, Tensorflow's Estimator.fit doesn't accept SparseTensors as labels. One reason I would like to pass SparseTensors into Tensorflow's Estimator.fit is to be able to use tensorflow ctc_loss. Here's the code:
import dataset_utils
import tensorflow as tf
import numpy as np
from tensorflow.contrib import grid_rnn, learn, layers, framework
def grid_rnn_fn(features, labels, mode):
    input_layer = tf.reshape(features["x"], [-1, 48, 1596])
    indices = tf.where(tf.not_equal(labels, tf.constant(0, dtype=tf.int32)))
    values = tf.gather_nd(labels, indices)
    sparse_labels = tf.SparseTensor(indices, values, dense_shape=tf.shape(labels, out_type=tf.int64))
    cell_fw = grid_rnn.Grid2LSTMCell(num_units=128)
    cell_bw = grid_rnn.Grid2LSTMCell(num_units=128)
    bidirectional_grid_rnn = tf.nn.bidirectional_dynamic_rnn(cell_fw, cell_bw, input_layer, dtype=tf.float32)
    outputs = tf.reshape(bidirectional_grid_rnn[0], [-1, 256])
    W = tf.Variable(tf.truncated_normal([256, 80], stddev=0.1, dtype=tf.float32), name='W')
    b = tf.Variable(tf.constant(0., dtype=tf.float32, shape=[80], name='b'))
    logits = tf.matmul(outputs, W) + b
    logits = tf.reshape(logits, [tf.shape(input_layer)[0], -1, 80])
    logits = tf.transpose(logits, (1, 0, 2))
    loss = None
    train_op = None
    if mode != learn.ModeKeys.INFER:
        #Error occurs here
        loss = tf.nn.ctc_loss(inputs=logits, labels=sparse_labels, sequence_length=320)
    ... # returning ModelFnOps
def main(_):
    image_paths, labels = dataset_utils.read_dataset_list('../test/dummy_labels_file.txt')
    data_dir = "../test/dummy_data/"
    images = dataset_utils.read_images(data_dir=data_dir, image_paths=image_paths, image_extension='png')
    print('Done reading images')
    images = dataset_utils.resize(images, (1596, 48))
    images = dataset_utils.transpose(images)
    labels = dataset_utils.encode(labels)
    x_train, x_test, y_train, y_test = dataset_utils.split(features=images, test_size=0.5, labels=labels)
    train_input_fn = tf.estimator.inputs.numpy_input_fn(
        x={"x": np.array(x_train)},
        y=np.array(y_train),
        num_epochs=1,
        shuffle=True,
        batch_size=1
    )
    classifier = learn.Estimator(model_fn=grid_rnn_fn, model_dir="/tmp/grid_rnn_ocr_model")
    classifier.fit(input_fn=train_input_fn)
UPDATE:
It turns out, this solution from here converts the dense tensor into a sparse one:
indices = tf.where(tf.not_equal(labels, tf.constant(0, dtype=tf.int32)))
values = tf.gather_nd(labels, indices)
sparse_labels = tf.SparseTensor(indices, values, dense_shape=tf.shape(labels, out_type=tf.int64))
However, I encounter this error now raised by ctc_loss:
ValueError: Shape must be rank 1 but is rank 0 for 'CTCLoss' (op: 'CTCLoss') with input shapes: [?,?,80], [?,2], [?], [].
I have this code that converts dense labels to sparse:
def convert_to_sparse(labels, dtype=np.int32):
    indices = []
    values = []
    for n, seq in enumerate(labels):
        indices.extend(zip([n] * len(seq), range(len(seq))))
        values.extend(seq)
    indices = np.asarray(indices, dtype=dtype)
    values = np.asarray(values, dtype=dtype)
    shape = np.asarray([len(labels), np.asarray(indices).max(0)[1] + 1], dtype=dtype)
    return indices, values, shape
I converted y_train to sparse labels, and placed the values inside a SparseTensor:
sparse_y_train = convert_to_sparse(y_train)
print(tf.SparseTensor(
    indices=sparse_y_train[0],
    values=sparse_y_train[1],
    dense_shape=sparse_y_train[2]
))
And compared it to the SparseTensor created inside the grid_rnn_fn:
indices = tf.where(tf.not_equal(labels, tf.constant(0, dtype=tf.int32)))
values = tf.gather_nd(labels, indices)
sparse_labels = tf.SparseTensor(indices, values, dense_shape=tf.shape(labels, out_type=tf.int64))
Here's what I got:
For sparse_y_train:
SparseTensor(indices=Tensor("SparseTensor/indices:0", shape=(33, 2), dtype=int64), values=Tensor("SparseTensor/values:0", shape=(33,), dtype=int32), dense_shape=Tensor("SparseTensor/dense_shape:0", shape=(2,), dtype=int64))
For sparse_labels:
SparseTensor(indices=Tensor("Where:0", shape=(?, 2), dtype=int64), values=Tensor("GatherNd:0", shape=(?,), dtype=int32), dense_shape=Tensor("Shape:0", shape=(2,), dtype=int64))
Which leads me to think that ctc_loss can't seem to handle SparseTensors as labels with dynamic shapes.
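Reading the shapes listed in the error message, [?,?,80] would be the logits, [?,2] and [?] the indices and values of the sparse labels, and [] the sequence_length argument. That suggests the rank-0 input is the scalar 320 rather than the labels: tf.nn.ctc_loss documents sequence_length as a 1-D int32 vector with one entry per batch element. The NumPy sketch below only illustrates the difference in rank; the batch size is a made-up value, and 320 time steps for every example is an assumption taken from the scalar in the question.

```python
import numpy as np

batch_size = 4      # hypothetical batch size, for illustration only
time_steps = 320    # value passed as a scalar in the question

# What the question passes: a rank-0 value
scalar_length = np.asarray(time_steps, dtype=np.int32)

# What a [batch_size] vector of lengths looks like
vector_length = np.full([batch_size], time_steps, dtype=np.int32)

print(scalar_length.ndim, scalar_length.shape)   # 0 ()
print(vector_length.ndim, vector_length.shape)   # 1 (4,)

# Inside the graph, a comparable vector could be built with
# tf.fill([tf.shape(input_layer)[0]], 320)
```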
Yes. It is possible to convert a tensor to a sparse tensor and back:
Let sparse be a sparse tensor and dense be a dense tensor.
From sparse to dense:
dense = tf.sparse_to_dense(sparse.indices, sparse.dense_shape, sparse.values)
From dense to sparse:
zero = tf.constant(0, dtype=tf.float32)
where = tf.not_equal(dense, zero)
indices = tf.where(where)
values = tf.gather_nd(dense, indices)
sparse = tf.SparseTensor(indices, values, dense.shape)
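The dense-to-sparse recipe can be mirrored in NumPy to see what each step produces. This is a sketch with made-up data: `np.argwhere` plays the role of `tf.where` and fancy indexing the role of `tf.gather_nd`.

```python
import numpy as np

dense = np.array([[0, 3, 0],
                  [4, 0, 5]], dtype=np.float32)

# Dense -> sparse (COO): coordinates and values of the non-zero entries
indices = np.argwhere(dense != 0)        # like tf.where(tf.not_equal(dense, 0))
values = dense[tuple(indices.T)]         # like tf.gather_nd(dense, indices)
dense_shape = dense.shape

# Sparse -> dense: scatter the values back into a zero-filled array
restored = np.zeros(dense_shape, dtype=dense.dtype)
restored[tuple(indices.T)] = values

print(indices.tolist())
print(values.tolist())
print(np.array_equal(restored, dense))
```

Note that this treats every 0 as "missing". If 0 is also a valid label, as it can be for CTC class ids, those entries are dropped, so a different padding value (an assumption about the data, for example -1) would be needed to tell padding from real labels.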

tensorflow matrix multiplication

I want to multiply a matrix with a matrix. When I multiply an array with a matrix, it works:
import tensorflow as tf
x = tf.placeholder(tf.float32, [None, 3])
W = tf.Variable(tf.ones([3, 3]))
y = tf.matmul(x, W)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    curr_y = sess.run(y, feed_dict={x: [[1,2,3],[0,4,5]]})
    print(curr_y)
Here the input has batch size 2 and each item is an array of length 3, so each array can be multiplied with the matrix of shape 3x3.
But when each item in the batch is itself a matrix instead of an array, again with batch size 2 and the same 3x3 matrix, it doesn't work:
import tensorflow as tf
x = tf.placeholder(tf.float32, [None, 3,3])
W = tf.Variable(tf.ones([3, 3]))
y = tf.matmul(x, W)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    curr_y = sess.run(y, feed_dict={x: [[[1,2,3],[1,2,3]],[[1,1,4],[0,4,5]]]})
    print(curr_y)
ValueError: Shape must be rank 2 but is rank 3 for 'MatMul' (op:
'MatMul') with input shapes: [?,3,3], [3,3].
########EDIT
Sorry, what I want to do is matmul a matrix with a batch of matrices or arrays. So I don't want to do
y = tf.matmul(x, W)
actually, I want to do
y = tf.matmul(W, x)
Your input to tensor 'x' has shape (2, 2, 3).
You're trying to do matrix multiplication of (2, 2, 3) and (3, 3). They don't have the same rank, and that's the reason for the error.
from Tensorflow official site:
https://www.tensorflow.org/api_docs/python/tf/matmul
Args:
a: Tensor of type float16, float32, float64, int32, complex64, complex128 and rank > 1.
b: Tensor with same type and rank as a.
When you do matrix multiplication, the shapes of the matrices need to follow the rule
(a, b) * (b, c) = (a, c)
Keep in mind the shape of W as you defined is (3, 3).
This feed_dict={x: [[1,2,3],[0,4,5]]} is a 2D array, the shape of it is (2, 3)
In [67]: x = [[1, 2, 3], [0, 4, 5]]
In [68]: x = np.array(x)
In [69]: x.shape
Out[69]: (2, 3)
It follows the rule (2, 3) * (3, 3) => (2, 3)
But in your second example, the shapes don't follow the rule of multiplication. The shape of your input is (2, 2, 3), which does not even have the same rank as your W, so it won't work.
In [70]: foo = [[[1,2,3],[1,2,3]],[[1,1,4],[0,4,5]]]
In [71]: foo = np.array(foo)
In [72]: foo.shape
Out[72]: (2, 2, 3)
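For the batched case in the question, one common workaround is to fold the batch dimension into the rows, do an ordinary rank-2 multiplication, and unfold again. The NumPy sketch below shows that idea with the data from the question. It is an illustration, not the answer's code, and the transposed input used for the `W @ x` direction is an assumption made so that the shapes line up.

```python
import numpy as np

W = np.ones((3, 3), dtype=np.float32)
x = np.array([[[1, 2, 3], [1, 2, 3]],
              [[1, 1, 4], [0, 4, 5]]], dtype=np.float32)   # shape (2, 2, 3)

# x_b @ W for every batch item: fold the batch into rows, multiply, unfold
batch, rows, cols = x.shape
y_flat = x.reshape(batch * rows, cols) @ W       # shape (4, 3), plain rank-2 matmul
y = y_flat.reshape(batch, rows, W.shape[1])      # shape (2, 2, 3)

# W @ x_b for every batch item requires each x_b to have 3 rows,
# so the items are transposed to shape (3, 2) here (assumption)
x_t = np.transpose(x, (0, 2, 1))                 # shape (2, 3, 2)
z = np.einsum('ij,bjk->bik', W, x_t)             # shape (2, 3, 2)

print(y.shape, z.shape)
```

TensorFlow offers the same building blocks (`tf.reshape`, `tf.einsum`, `tf.tensordot`), so the pattern should carry over, although which forms `tf.matmul` accepts directly depends on the TensorFlow version.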
