I'm building an LSTM RNN with TensorFlow that performs pixel-wise classification (or, maybe a better way to put it is, pixel-wise prediction?).
Bear with me as I explain the title.
The network looks like the following drawing...
The idea goes like this: an input image of size (200,200) is fed into an LSTM RNN of size (200,200,200). Each sequence output from the LSTM (the pink boxes in the drawing) is fed into an MLP, and the MLP then makes a single output prediction -- ergo pixel-wise prediction (you can see how one input pixel generates one output "pixel").
The code looks like this (not all of the code, just parts that are needed):
...
n_input_x = 200
n_input_y = 200
x = tf.placeholder("float", [None, n_input_x, n_input_y])
y = tf.placeholder("float", [None, n_input_x, n_input_y])
def RNN(x):
    x = tf.transpose(x, [1, 0, 2])
    x = tf.reshape(x, [-1, n_input_x])
    x = tf.split(0, n_steps, x)

    lstm_cell = rnn_cell.BasicLSTMCell(n_hidden, forget_bias=1.0, state_is_tuple=True)
    outputs, states = rnn.rnn(lstm_cell, x, dtype=tf.float32)

    output_matrix = []
    for i in xrange(200):
        temp_vector = []
        for j in xrange(200):
            lstm_vector = outputs[j]
            pixel_pred = multilayer_perceptron(lstm_vector, mlp_weights, mlp_biases)
            temp_vector.append(pixel_pred)
        output_matrix.append(temp_vector)
        print i
    return output_matrix
temp = RNN(x)
pred = tf.placeholder(temp, [None, n_input_x, n_input_y])
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(pred, y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
...
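For reference, the transpose/reshape/split at the top of RNN is pure shape bookkeeping. It can be sanity-checked in plain NumPy (the batch size of 3 here is a made-up value):

```python
import numpy as np

batch = 3          # hypothetical batch size
n_input_x = 200
n_input_y = 200

x = np.zeros((batch, n_input_x, n_input_y), dtype=np.float32)

# tf.transpose(x, [1, 0, 2]) -> time-major: (n_steps, batch, n_input)
x_t = np.transpose(x, (1, 0, 2))
assert x_t.shape == (n_input_x, batch, n_input_y)

# tf.reshape(x, [-1, n_input_x]) -> stack all steps: (n_steps * batch, n_input)
x_r = x_t.reshape(-1, n_input_y)
assert x_r.shape == (n_input_x * batch, n_input_y)

# tf.split(0, n_steps, x) -> list of n_steps tensors of shape (batch, n_input)
steps = np.split(x_r, n_input_x, axis=0)
assert len(steps) == n_input_x
assert steps[0].shape == (batch, n_input_y)
```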
I have confirmed that the output of RNN -- that is, what is stored in temp -- is a 200x200 array of <tf.Tensor 'Softmax_39999:0' shape=(?, 1) dtype=float32>
As you can see, I place temp in a tf.placeholder of the same shape (None for the batch size ... or do I need this?), and the program just exits as if it had finished running. Ideally, when I debug and print pred, I want to see something like <tf.Tensor shape=(200,200)>.
When I debug, the first time I execute pred = tf.placeholder(temp, [None, n_input_x, n_input_y]) I get TypeError: TypeErro...32>]].",), and when I run it again it says Exception AttributeError: "'NoneType' object has no attribute 'path'" in <function _remove at 0x7f1ab77c26e0> ignored
EDIT I also now realize that I need to place the lines
lstm_cell = rnn_cell.BasicLSTMCell(n_hidden, forget_bias=1.0, state_is_tuple=True)
outputs, states = rnn.rnn(lstm_cell, x, dtype=tf.float32)
inside the first loop, so that a new 2D LSTM RNN is generated each time. However, I'm now getting an error about variable reuse: ValueError: Variable RNN/BasicLSTMCell/Linear/Matrix does not exist, disallowed. Did you mean to set reuse=None in VarScope?
So, in other words, is it not auto-incrementing the RNN tensor names?
A more convenient way to report shapes is with tf.shape(). In your case:
size1 = tf.shape(temp)
sess = tf.Session()
size1_fetched = sess.run(size1, feed_dict = your_feed_dict)
That way, size1_fetched is something like what you would get from NumPy: the actual sizes for that particular feed_dict. For example, your [None, 200, 200] Tensor would come back as [64, 200, 200].
Another question: why do you have a placeholder in the middle of your flow graph? Will you later feed in pre-defined image feature maps?
Related
Lets say that my input data, x, has the shape (2000, 2) where 2000 is the number of samples and 2 is the number of features.
So for this input data, I can set up a placeholder like this:
x = tf.placeholder(tf.float32, shape=[None, 2], name='features')
My question is, if I transpose my input data, x, so that the shape is now (2, 2000) where 2000 is still the number of samples, how would I change the "shape" parameter in tf.placeholder?
I have tried setting shape=[2, None], but I just get an error. Does the 1st element in the "shape" parameter always have to be "None"?
Here the error I get: "ValueError: The last dimension of the inputs to Dense should be defined. Found None."
import numpy as np
import tensorflow as tf

# Binary Classifier Implementation

# Training data
x_train = np.transpose(X) # shape=(2, 2000)
y_train = np.hstack((np.zeros((1, 1000)), np.zeros((1, 1000)) + 1)) # shape=(1, 2000)

# Variables
x = tf.placeholder(tf.float32, shape=[2, None], name='features')
y_ = tf.placeholder(tf.int64, shape=[1, None], name='labels')

h1 = tf.layers.dense(inputs=x, units=50, activation=tf.nn.relu) # one hidden layer with 50 neurons
y = tf.layers.dense(inputs=h1, units=1, activation=tf.nn.sigmoid) # one output layer with 1 neuron

# Functions
# loss
cross_entropy = tf.losses.sigmoid_cross_entropy(multi_class_labels=y_, logits=y)
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(cross_entropy)

# Initializer
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    for i in range(1000):
        sess.run([cross_entropy], feed_dict={x: x_train, y_: y_train})
I think you might be having trouble with shape inconsistency, not with transposing or having an undefined dimension size somewhere other than the first dim.
The following things work fine for me:
import tensorflow as tf
x = tf.placeholder(tf.float32, shape=[None, 2])
x_t = tf.transpose(x)
y = tf.placeholder(tf.float32, shape=[2, None])
print(x_t.shape)
>> TensorShape([Dimension(2), Dimension(None)])
print(y.shape)
>> TensorShape([Dimension(2), Dimension(None)])
I'm assuming the first dimension might be your batch-size? Are you maybe not being consistent with this or feeding the wrong shaped data?
Or maybe you're trying to manually change the shape of the tensor? (If so then you don't need to. tf takes care of that as apparent in my example).
That's about as much as anyone can help without an idea of the error you're getting, a code snippet, or anything more than a vague and general description, but I hope it helps nonetheless.
Good luck!
Update due to updated question:
The error you're getting is telling you exactly what the problem is. You can transpose whatever you want, but the dense layer specifically cannot accept a tensor whose last dimension is None. This makes sense; you cannot create a fully connected layer without knowing the sizes of the weight matrices, for instance.
If you're transposing x, then you're saying you have 2000 features (and 2 data-points) and the NN needs to know this in order to create the parameters you're going to train.
If you still consider yourself to have 2 features and however-many examples, then you shouldn't be working with a shape of (2, 2000) in the first place!
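You can check the shape logic in plain NumPy: with the samples-by-features convention, the first dense layer is essentially a (2, 50) weight matrix, which only multiplies cleanly against (N, 2) inputs (the sizes here are illustrative):

```python
import numpy as np

x = np.random.rand(2000, 2)      # 2000 samples, 2 features
W = np.random.rand(2, 50)        # dense layer: in_features x units
h = x @ W                        # works: (2000, 2) @ (2, 50) -> (2000, 50)
assert h.shape == (2000, 50)

# With the transposed layout the feature axis moves, and the
# multiplication no longer lines up:
x_t = x.T                        # (2, 2000)
try:
    x_t @ W
except ValueError:
    pass  # shapes (2, 2000) and (2, 50) are incompatible
```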
Try the following:
# Training data
x_train = X # shape=(2000, 2)
y_train = np.vstack((np.zeros((1000, 1)), np.zeros((1000, 1)) + 1)) # shape=(2000, 1)

# Variables
x = tf.placeholder(tf.float32, shape=[None, 2], name='features')
y_ = tf.placeholder(tf.int64, shape=[None, 1], name='labels')

h1 = tf.layers.dense(inputs=x, units=50, activation=tf.nn.relu) # one hidden layer with 50 neurons
y = tf.layers.dense(inputs=h1, units=1) # one output neuron; sigmoid_cross_entropy applies the sigmoid itself

# Functions
# loss
cross_entropy = tf.losses.sigmoid_cross_entropy(multi_class_labels=y_, logits=y)
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(cross_entropy)

# Initializer
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)
    for i in range(1000):
        sess.run([train_step, cross_entropy], feed_dict={x: x_train, y_: y_train})
Hope this helps.
On a completely different and unrelated note: I hope you know what you're doing when embedding 2 features in a 50 dimensional space that way.
Here is what I have tried:
tf.reset_default_graph()

X = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
y = tf.placeholder(tf.float32, [None, n_outputs])

layers = [tf.contrib.rnn.LSTMCell(num_units=n_neurons,
                                  activation=tf.nn.leaky_relu, use_peepholes=True)
          for layer in range(n_layers)]
multi_layer_cell = tf.contrib.rnn.MultiRNNCell(layers)
rnn_outputs, states = tf.nn.dynamic_rnn(multi_layer_cell, X, dtype=tf.float32)

tf.summary.histogram("outputs", rnn_outputs)
tf.summary.image("RNN", rnn_outputs)
I am getting the following error:
InvalidArgumentError: Tensor must be 4-D with last dim 1, 3, or 4, not [55413,4,100]
[[Node: RNN_1 = ImageSummary[T=DT_FLOAT, bad_color=Tensor<type: uint8 shape: [4] values: 255 0 0...>, max_images=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"](RNN_1/tag, rnn/transpose_1)]]
Kindly help me visualize the RNN inside the LSTM model that I am trying to run. It will help me understand more accurately what the LSTM is doing.
You can plot each RNN output as an image, with one axis being the time and the other axis being the output. Here is a small example:
import tensorflow as tf
import numpy as np

n_steps = 100
n_inputs = 10
n_neurons = 10
n_layers = 3

x = tf.placeholder(tf.float32, [None, n_steps, n_inputs])
layers = [tf.contrib.rnn.LSTMCell(num_units=n_neurons,
                                  activation=tf.nn.leaky_relu, use_peepholes=True)
          for layer in range(n_layers)]
multi_layer_cell = tf.contrib.rnn.MultiRNNCell(layers)
rnn_outputs, states = tf.nn.dynamic_rnn(multi_layer_cell, x, dtype=tf.float32)

# Time steps in horizontal axis, outputs in vertical axis, add last dimension for channel
rnn_out_imgs = tf.transpose(rnn_outputs, (0, 2, 1))[..., tf.newaxis]
out_img_sum = tf.summary.image("RNN", rnn_out_imgs, max_outputs=10)

init_op = tf.global_variables_initializer()
with tf.Session() as sess, tf.summary.FileWriter('log') as fw:
    sess.run(init_op)
    fw.add_summary(sess.run(out_img_sum,
                            feed_dict={x: np.random.rand(10, n_steps, n_inputs)}))
You would get a visualization that could look like this:
Here the brighter pixels would represent a stronger activation, so even if it is hard to tell what exactly is causing what you can at least see if any meaningful patterns arise.
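If you want to convince yourself of the shape bookkeeping, the transpose-plus-channel trick can be checked in plain NumPy (using the toy sizes from above):

```python
import numpy as np

batch, n_steps, n_neurons = 10, 100, 10   # toy sizes from the example

rnn_outputs = np.random.rand(batch, n_steps, n_neurons)

# transpose(0, 2, 1): outputs on the vertical axis, time on the horizontal,
# then add a trailing channel dimension
imgs = np.transpose(rnn_outputs, (0, 2, 1))[..., np.newaxis]

# tf.summary.image wants [batch, height, width, channels]
assert imgs.shape == (batch, n_neurons, n_steps, 1)
```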
Your RNN output has the wrong shape for tf.summary.image. The tensor should be four-dimensional with the dimensions' sizes given by [batch_size, height, width, channels].
In your code, you're calling tf.summary.image with rnn_outputs, which has shape [55413, 4, 100]. Assuming your images are 55413-by-100 pixels in size and that each pixel contains 4 channels (RGBA), I'd use tf.reshape to reshape rnn_outputs to [1, 55413, 100, 4]. Then you should be able to call tf.summary.image without error.
I don't think I can help you visualize the RNN's operation, but when I was learning about RNNs and LSTMs, I found this article very helpful.
I am trying to use an LSTM with inputs that have different time steps (different numbers of frames). The input to rnn.static_rnn should be a sequence of tensors (not a single tensor!), so I need to convert my input into a sequence. I tried tf.unstack and tf.split, but both need to know the exact size of the input, while one dimension of my input (the number of time steps) changes from input to input. The following is part of my code:
n_input = 256*256 # data input (img shape: 256*256)
n_steps = None # timesteps
batch_size = 1

# tf Graph input
x = tf.placeholder("float", [batch_size, n_input, n_steps])
y = tf.placeholder("float", [batch_size, n_classes])

# Permuting batch_size and n_steps
x1 = tf.transpose(x, [2, 1, 0])
x1 = tf.transpose(x1, [0, 2, 1])
x3 = tf.unstack(x1, axis=0)
# or x3 = tf.split(x2, ?, 0)

# Define a lstm cell with tensorflow
lstm_cell = rnn.BasicLSTMCell(num_units=n_hidden, forget_bias=1.0)

# Get lstm cell output
outputs, states = rnn.static_rnn(lstm_cell, x3, dtype=tf.float32, sequence_length=None)
I get the following error when using tf.unstack:
ValueError: Cannot infer num from shape (?, 1, 65536)
Also, there are some discussions here and here, but none of them were useful for me. Any help is appreciated.
As explained here, tf.unstack does not work if the size along the given axis is unspecified and cannot be inferred.
In your code, after the transpositions, x1 has the shape [n_steps, batch_size, n_input], and its size at axis=0 is None.
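One workaround (a sketch, not part of the original answer): since static unrolling needs a known step count, you can zero-pad every clip to a fixed maximum number of frames and pass the true lengths through the sequence_length argument of rnn.static_rnn. The padding itself is plain NumPy:

```python
import numpy as np

n_input = 4              # toy feature size standing in for 256*256
max_steps = 5            # fixed maximum number of frames

clips = [np.random.rand(3, n_input),   # a clip with 3 frames
         np.random.rand(5, n_input)]   # a clip with 5 frames

lengths = np.array([c.shape[0] for c in clips])   # true frame counts
padded = np.zeros((len(clips), max_steps, n_input), dtype=np.float32)
for i, c in enumerate(clips):
    padded[i, :c.shape[0]] = c       # zero-pad the short clips at the end

assert padded.shape == (2, max_steps, n_input)
assert lengths.tolist() == [3, 5]
```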
I'm trying to define an operation for a NN I'm implementing, but to do so I need to iterate over the dimension of a tensor. I have a small working example below.
X = tf.placeholder(tf.float32, shape=[None, 10])
idx = [[i] for i in tf.range(X.get_shape()[0])]
This produces an error stating
ValueError: Cannot convert an unknown Dimension to a Tensor: ?
Using the same code but with tf.shape instead, so that the code becomes
X = tf.placeholder(tf.float32, shape=[None, 10])
idx = [[i] for i in tf.range(tf.shape(X)[0])]
Gives the following error
TypeError: 'Tensor' object is not iterable.
The way that I'm implementing this NN, the batch_size isn't defined until the training function, which is at the end of the code. This is just where I'm building the graph itself, so the batch_size isn't known by this point, and it can't be fixed as the training batch_size and the test set batch_sizes are different.
What is the best way to fix this? This is the last thing keeping my code from running, as I got it to run with a fixed batch_size, though those results aren't useful. I've been poring over the TensorFlow API documentation and Stack Overflow for weeks to no avail.
I've also tried feeding a placeholder into the range, so that when I run the test/training set the code would be the following
X = tf.placeholder(tf.float32, shape=[None, 10])
bs = tf.placeholder(tf.int32)
def My_Function(X):
    # Do some stuff to X
    idx = [[i] for i in tf.range(bs)]
    # return some tensor
A = tf.nn.relu(My_Function(X))
However, this gives the same error as above
TypeError: 'Tensor' object is not iterable.
I think you should use tf.shape(x) instead.
x = tf.placeholder(..., shape=[None, ...])
batch_size = tf.shape(x)[0] # Returns a scalar `tf.Tensor`
print x.get_shape()[0] # ==> "?"
# You can use `batch_size` as an argument to other operators.
some_other_tensor = ...
some_other_tensor_reshaped = tf.reshape(some_other_tensor, [batch_size, 32, 32])
# To get the value, however, you need to call `Session.run()`.
sess = tf.Session()
x_val = np.random.rand(37, 100, 100)
batch_size_val = sess.run(batch_size, {x: x_val})
print batch_size_val # ==> "37"
See : get the size of a variable batch dimension
You can't operate on tensors that way. You need to use tf.map_fn as user1735003 mentioned.
Here is an example where I used tf.map_fn in order to pass the output of an LSTM at each timestep into a linear layer, defined by weights['out'] and biases['out'].
x = tf.placeholder("float", [features_dimension, None, n_timesteps])
weights = {'out': tf.Variable(tf.zeros([N_HIDDEN_LSTM, labels_dimension]))}
biases = {'out': tf.Variable(tf.zeros([labels_dimension]))}

def LSTM_model(x, weights, biases):
    lstm_cell = rnn.LSTMCell(N_HIDDEN_LSTM)
    # outputs is a Tensor of shape (n_timesteps, n_observations, N_HIDDEN_LSTM)
    outputs, states = tf.nn.dynamic_rnn(lstm_cell, x, dtype=tf.float32, time_major=True)
    # Linear activation
    def pred_fn(current_output):
        return tf.matmul(current_output, weights['out']) + biases['out']
    # Use tf.map_fn to apply pred_fn to each tensor in outputs, along
    # dimension 0 (timestep dimension)
    pred = tf.map_fn(pred_fn, outputs)
    return pred
Could tf.map_fn be what you are looking for?
x = tf.placeholder(tf.float32, shape=[None, 10])
f = tf.map_fn(lambda y: y, x) # or perhaps something more useful than identity
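Conceptually, tf.map_fn applies the function to each slice along axis 0 and stacks the results, much like this NumPy loop (with a toy stand-in for the mapped function):

```python
import numpy as np

x = np.random.rand(7, 10)
f = lambda row: row * 2.0            # toy stand-in for the mapped function

# What tf.map_fn(f, x) computes: f applied row by row, results stacked
mapped = np.stack([f(row) for row in x])
assert mapped.shape == x.shape
assert np.allclose(mapped, x * 2.0)
```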
EDIT
Now that I understand better, I think the problem is that you are trying to get the range while the graph is created, as opposed to when the graph is run.
Also, you need to use tf.shape to query the shape at run time, and feed its result to tf.range.
In [2]: import numpy as np
...: import tensorflow as tf
...: x = tf.placeholder(tf.float32, shape=[None, 10])
...: sess = tf.InteractiveSession()
...: sess.run(tf.range(tf.shape(x)[0]), {x: np.zeros((7,10))})
Out[2]: array([0, 1, 2, 3, 4, 5, 6])
There's a small trick you can use if you are using tensorflow >= 1.13, although it is not very efficient, since it relies on sorting.
x = tf.placeholder(dtype=tf.float32, shape=[None])
xargs = tf.argsort(x)
rng = tf.sort(xargs)  # renamed to avoid shadowing the built-in `range`
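Why this works: argsort returns a permutation of 0..n-1, and sorting any such permutation gives back exactly 0..n-1, so the result is a runtime-sized range. The same idea in NumPy:

```python
import numpy as np

x = np.array([3.0, 1.0, 2.0])

# argsort is a permutation of 0..n-1; sorting it recovers the range itself
rng = np.sort(np.argsort(x))
assert rng.tolist() == [0, 1, 2]
assert (rng == np.arange(len(x))).all()
```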
I'm trying to build a pixel-wise classification LSTM RNN using TensorFlow. My model is shown in the picture below. The problem I'm having is building a 3D LSTM RNN. The code I have builds a 2D LSTM RNN, so I placed the code inside a loop, but now I get the following error:
ValueError: Variable RNN/BasicLSTMCell/Linear/Matrix does not exist, disallowed. Did you mean to set reuse=None in VarScope?
So here's the network:
The idea goes like this: an input image of size (200,200) is fed into an LSTM RNN of size (200,200,200). Each sequence output from the LSTM (the pink boxes in the drawing) is fed into an MLP, and the MLP then makes a single output prediction -- ergo pixel-wise prediction (you can see how one input pixel generates one output "pixel").
So here's my code:
...
n_input_x = 200
n_input_y = 200
x = tf.placeholder("float", [None, n_input_x, n_input_y])
y = tf.placeholder("float", [None, n_input_x, n_input_y])
def RNN(x):
    x = tf.transpose(x, [1, 0, 2])
    x = tf.reshape(x, [-1, n_input_x])
    x = tf.split(0, n_steps, x)

    output_matrix = []
    for i in xrange(200):
        temp_vector = []
        lstm_cell = rnn_cell.BasicLSTMCell(n_hidden, forget_bias=1.0, state_is_tuple=True)
        outputs, states = rnn.rnn(lstm_cell, x, dtype=tf.float32)
        for j in xrange(200):
            lstm_vector = outputs[j]
            pixel_pred = multilayer_perceptron(lstm_vector, mlp_weights, mlp_biases)
            temp_vector.append(pixel_pred)
        output_matrix.append(temp_vector)
        print i
    return output_matrix
temp = RNN(x)
pred = tf.placeholder(temp, [None, n_input_x, n_input_y])
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(pred, y))
optimizer = tf.train.AdamOptimizer(learning_rate=learning_rate).minimize(cost)
...
You can see that I placed the cell construction inside the first loop. In this way, I generate a new RNN every time. I know TensorFlow auto-increments the names of other Tensors.
While debugging I have
(Pdb) lstm_cell
<tensorflow.python.ops.rnn_cell.BasicLSTMCell object at 0x7f9d26956850>
and then for outputs I have a list of 200 LSTM output tensors
(Pdb) len(outputs)
200
...
<tf.Tensor 'RNN_2/BasicLSTMCell_199/mul_2:0' shape=(?, 200) dtype=float32>]
So ideally, I want the second call to RNN to generate LSTM tensors with indexes 200-399
I tried this, but it won't construct an RNN because the dimension 40000 and the shape of x (after the split) don't line up.
x = tf.reshape(x, [-1, n_input_x])
# Split to get a list of 'n_steps' tensors of shape (batch_size, n_hidden)
# This input shape is required by `rnn` function
x = tf.split(0, n_input_y, x)

lstm_cell = rnn_cell.BasicLSTMCell(40000, forget_bias=1.0, state_is_tuple=True)
outputs, states = rnn.rnn(lstm_cell, x, dtype=tf.float32)

output_matrix = []
for i in xrange(200):
    temp_vector = []
    for j in xrange(200):
        lstm_vector = outputs[i*j]
So then I also tried to get rid of the split, but then it complains that the input must be a list. I then tried reshaping with x = tf.reshape(x, [n_input_x * n_input_y]), but it still says the input must be a list.