I'm building an encoder-decoder model in TensorFlow 1.0.1 using the legacy sequence-to-sequence framework. Everything works as it should when I have one layer of LSTMs in the encoder and decoder. However, when I try with >1 layers of LSTMs wrapped in a MultiRNNCell, I get an error when calling tf.contrib.legacy_seq2seq.rnn_decoder.
The full error is at the end of this post, but in brief, it's caused by the line
(c_prev, m_prev) = state
in TensorFlow that throws TypeError: 'Tensor' object is not iterable. I'm confused by this, since the initial state I'm passing to rnn_decoder is indeed a tuple, as it should be. As far as I can tell, the only difference between using 1 or >1 layers is that the latter involves using MultiRNNCell. Are there some API quirks that I should know about when using this?
This is my code (based on the example in this GitHub repo). Apologies for its length; this is as minimal as I could make it while still being complete and verifiable.
import tensorflow as tf
import tensorflow.contrib.legacy_seq2seq as seq2seq
import tensorflow.contrib.rnn as rnn
seq_len = 50
input_dim = 300
output_dim = 12
num_layers = 2
hidden_units = 100
sess = tf.Session()
encoder_inputs = []
decoder_inputs = []
for i in range(seq_len):
    encoder_inputs.append(tf.placeholder(tf.float32, shape=(None, input_dim),
                                         name="encoder_{0}".format(i)))
for i in range(seq_len + 1):
    decoder_inputs.append(tf.placeholder(tf.float32, shape=(None, output_dim),
                                         name="decoder_{0}".format(i)))
if num_layers > 1:
    # Encoder cells (bidirectional)
    # Forward
    enc_cells_fw = [rnn.LSTMCell(hidden_units)
                    for _ in range(num_layers)]
    enc_cell_fw = rnn.MultiRNNCell(enc_cells_fw)
    # Backward
    enc_cells_bw = [rnn.LSTMCell(hidden_units)
                    for _ in range(num_layers)]
    enc_cell_bw = rnn.MultiRNNCell(enc_cells_bw)
    # Decoder cell
    dec_cells = [rnn.LSTMCell(2*hidden_units)
                 for _ in range(num_layers)]
    dec_cell = rnn.MultiRNNCell(dec_cells)
else:
    # Encoder
    enc_cell_fw = rnn.LSTMCell(hidden_units)
    enc_cell_bw = rnn.LSTMCell(hidden_units)
    # Decoder
    dec_cell = rnn.LSTMCell(2*hidden_units)
# Make sure input and output are the correct dimensions
enc_cell_fw = rnn.InputProjectionWrapper(enc_cell_fw, input_dim)
enc_cell_bw = rnn.InputProjectionWrapper(enc_cell_bw, input_dim)
dec_cell = rnn.OutputProjectionWrapper(dec_cell, output_dim)
_, final_fw_state, final_bw_state = \
    rnn.static_bidirectional_rnn(enc_cell_fw,
                                 enc_cell_bw,
                                 encoder_inputs,
                                 dtype=tf.float32)
# Concatenate forward and backward cell states
# (The state is a tuple of previous output and cell state)
if num_layers == 1:
    initial_dec_state = tuple([tf.concat([final_fw_state[i],
                                          final_bw_state[i]], 1)
                               for i in range(2)])
else:
    initial_dec_state = tuple([tf.concat([final_fw_state[-1][i],
                                          final_bw_state[-1][i]], 1)
                               for i in range(2)])
decoder = seq2seq.rnn_decoder(decoder_inputs, initial_dec_state, dec_cell)
tf.global_variables_initializer().run(session=sess)
And this is the error:
Traceback (most recent call last):
File "example.py", line 67, in <module>
decoder = seq2seq.rnn_decoder(decoder_inputs, initial_dec_state, dec_cell)
File "/home/tao/.virtualenvs/example/lib/python2.7/site-packages/tensorflow/contrib/legacy_seq2seq/python/ops/seq2seq.py", line 150, in rnn_decoder
output, state = cell(inp, state)
File "/home/tao/.virtualenvs/example/lib/python2.7/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py", line 426, in __call__
output, res_state = self._cell(inputs, state)
File "/home/tao/.virtualenvs/example/lib/python2.7/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py", line 655, in __call__
cur_inp, new_state = cell(cur_inp, cur_state)
File "/home/tao/.virtualenvs/example/lib/python2.7/site-packages/tensorflow/contrib/rnn/python/ops/core_rnn_cell_impl.py", line 321, in __call__
(c_prev, m_prev) = state
File "/home/tao/.virtualenvs/example/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 502, in __iter__
raise TypeError("'Tensor' object is not iterable.")
TypeError: 'Tensor' object is not iterable.
Thank you!
The problem is in the format of the initial state (initial_dec_state) passed to seq2seq.rnn_decoder.
When you use rnn.MultiRNNCell, you're building a multilayer recurrent network, so you need to provide an initial state for each of these layers.
Hence, you should provide a list of tuples as the initial state, where each element of the list is the previous state coming from the corresponding layer of the recurrent network.
So your initial_dec_state, initialized like this:
initial_dec_state = tuple([tf.concat([final_fw_state[-1][i],
                                      final_bw_state[-1][i]], 1)
                           for i in range(2)])
instead should be like this:
initial_dec_state = [
    tuple([tf.concat([final_fw_state[j][i], final_bw_state[j][i]], 1)
           for i in range(2)])
    for j in range(len(final_fw_state))
]
which creates a list of tuples in the format:
[(state_c1, state_m1), (state_c2, state_m2) ...]
In more detail, the 'Tensor' object is not iterable error happens because seq2seq.rnn_decoder internally calls your rnn.MultiRNNCell (dec_cell), passing the initial state (initial_dec_state) to it.
rnn.MultiRNNCell.__call__ iterates through the list of initial states and passes each one to the corresponding layer's cell, which extracts the tuple (c_prev, m_prev) (in the statement (c_prev, m_prev) = state).
So if you pass just a single tuple, rnn.MultiRNNCell.__call__ will iterate over its elements, and as soon as a layer reaches (c_prev, m_prev) = state it will find a tensor (where a tuple is expected) as state and will throw the 'Tensor' object is not iterable error.
A good way to find out which initial state format seq2seq.rnn_decoder expects is to call dec_cell.zero_state(batch_size, dtype=tf.float32). This method returns zero-filled state tensor(s) in the exact format needed to initialize the recurrent module that you're using.
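The per-layer structure can be illustrated without TensorFlow. The sketch below is a plain-Python stand-in, not the library's code: FakeTensor, lstm_layer and multi_layer_cell are hypothetical names that only mimic how each layer unpacks its own (c, m) pair.

```python
class FakeTensor(object):
    """Hypothetical stand-in for tf.Tensor: it cannot be iterated or unpacked."""

    def __init__(self, name):
        self.name = name


def lstm_layer(inputs, state):
    # Mirrors the unpacking statement shown in the traceback.
    (c_prev, m_prev) = state
    return inputs, (c_prev, m_prev)


def multi_layer_cell(inputs, state, num_layers):
    # A multi-layer cell hands one state entry to each layer.
    new_states = []
    for layer in range(num_layers):
        inputs, new_state = lstm_layer(inputs, state[layer])
        new_states.append(new_state)
    return inputs, new_states


x = FakeTensor("x")
c, m = FakeTensor("c"), FakeTensor("m")

# Correct: a list with one (c, m) tuple per layer.
good_state = [(c, m), (c, m)]
_, out_state = multi_layer_cell(x, good_state, num_layers=2)

# Wrong: a single (c, m) tuple, so layer 0 receives a bare tensor as its state.
bad_state = (c, m)
try:
    multi_layer_cell(x, bad_state, num_layers=2)
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

With the single tuple, state[0] is a tensor rather than a (c, m) pair, which is the same situation that triggers the TypeError in the question.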
Traceback (most recent call last):
File "c:/Users/levin/Desktop/programming/nn.py", line 208, in <module>
agent.train(BATCHSIZE)
File "c:/Users/levin/Desktop/programming/nn.py", line 147, in train
output = F.mse_loss(prediction, target)
File "C:\Users\levin\Anaconda3\lib\site-packages\torch\nn\functional.py", line 2203, in mse_loss
if not (target.size() == input.size()):
AttributeError: 'NoneType' object has no attribute 'size'
The above is the error that I'm constantly getting, and I really don't know how to fix it.
This is some code that might be important:
def train(self, BATCHSIZE):
    trainsample = random.sample(self.memory, BATCHSIZE)
    for state, action, reward, new_state, gameovertemp in trainsample:
        if gameovertemp:
            target = torch.tensor(reward).grad_fn
        else:
            target = reward + self.gamma * torch.max(self.dqn.forward(new_state))
        self.dqn.zero_grad()
        prediction = torch.max(self.dqn.forward(state))
        #print(prediction, "prediction")
        #print(target, "target")
        output = F.mse_loss(prediction, target)
        output.backward()
        self.optimizer.step()
As stated in a comment, the error is due to either target or input being None, and is not related to the size() attribute.
The problem is probably at this line:
target = torch.tensor(reward).grad_fn
Here you convert reward to a new Tensor. However, a Tensor created by the user always has a grad_fn equal to None (as explained in Pytorch Autograd).
To have a grad_fn, a Tensor must be the result of some computation, not a static value.
The thing is that mse_loss does not expect target to be differentiable; as the name suggests, it is just the value to be compared against.
Try removing .grad_fn from this line; the raw Tensor should be sufficient.
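A quick way to check this behaviour is the sketch below. It assumes PyTorch is installed, and the values of reward and w are made up for illustration; they are not taken from the question's agent.

```python
import torch
import torch.nn.functional as F

reward = 1.0  # made-up value for illustration

# A tensor created directly from data is a leaf: it has no grad_fn.
leaf = torch.tensor(reward)
leaf_grad_fn = leaf.grad_fn  # None

# A tensor produced by a computation on a tensor that requires grad has one.
w = torch.tensor(2.0, requires_grad=True)
prediction = w * 3.0
has_grad_fn = prediction.grad_fn is not None

# The raw tensor works as the target; leaf.grad_fn (None) would not.
loss = F.mse_loss(prediction, leaf)
loss.backward()
```

Only prediction needs to be connected to the graph for backward() to work; the target is treated as a constant.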
I am new to TensorFlow. I have a class that describes my neural network and all the methods I need for the network. I have a specific method to feed a value to a neuron. I initially declared the neuron as a placeholder, hoping to use the get_neuron method to feed the value to the neuron when it becomes available. When I run it, I get a runtime error saying RuntimeError: The Session graph is empty. Add operations to the graph before calling run(). Can anyone help me solve it, and explain what exactly it means? I don't understand what is meant by "add operations", as I have been using feed_dict and running the session. Also, is there any disadvantage to running a session for every computation on my tensors, or should the session only be run at the end, when computing the final output? I am asking because I wonder whether this affects the effectiveness of TensorFlow, since it follows the graph to optimize when using an optimizer.
def get_neuron(self, a, b):
    #'while a being the name of the neuron and b being the value of that neuron'
    with tf.Session() as sess_n:
        sess_n.run(a, feed_dict={a: b})
    return
EDIT:
This is how I am calling the function above:
knowledge_out = knowledge.run(carollis_inp)
knowledge is an object created from the class knowledge_transfer, which has a method run, and the very first line of that method is
self.get_neuron(self.neuron_input, carollis_input)
The exact error displayed is
[ERROR] [1566241992.292524, 15.300000]: bad callback: <function joint_callback at 0x7f42221982f0>
Traceback (most recent call last):
File "/opt/ros/melodic/lib/python2.7/dist-packages/rospy/topics.py", line 748, in _invoke_callback
cb(msg, cb_args)
File "/home/microbot/catkin_ws/src/spider/spider_control/control1.py", line 60, in joint_callback
knowledge_out = knowledge.run(carollis_inp)
File "/home/microbot/catkin_ws/src/spider/spider_control/knowledge_transfer.py", line 99, in run
self.get_neuron(self.neuron_input, carollis_input)
File "/home/microbot/catkin_ws/src/spider/spider_control/knowledge_transfer.py", line 81, in get_neuron
sess_n.run(a, feed_dict={a: b})
File "/home/microbot/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 950, in run
run_metadata_ptr)
File "/home/microbot/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1098, in _run
raise RuntimeError('The Session graph is empty. Add operations to the '
RuntimeError: The Session graph is empty. Add operations to the graph before calling run().
EDIT2:
I have a neural network named knowledge_transfer. It takes 8 inputs and gives 4 outputs, which are then passed through a softmax, much like a multi-class classification problem. The class method run accepts an input array of size 8 and returns an output array of size 4. Inside run, the given input is fed to the input placeholder of size 8, processed through the layers and weights, and the final output is returned.
Example:
I want to create a class method that takes two arguments and feeds the second argument to the first via feed_dict, given that the first argument is a placeholder and both arguments have the same size.
EDIT3:
I feed an input to the input layer and use an input function to calculate the output of the input layer. That output is fed to the input function of the hidden layer, whose result is passed through a leaky rectified linear unit. The output of that is passed to the input function of the output layer, and the result goes through a softmax function; the softmax output is the output of the NN.
The code is as follows:
self.neuron_input = tf.compat.v1.placeholder(tf.float32, shape=(self.neurons, 1))
self.weight_in = tf.get_variable(name="Weight_input", dtype=tf.float32, shape=[self.neurons, 1], initializer=self.weight_initer)
self.neuron_hid = tf.compat.v1.placeholder(tf.float32, shape=(int(self.neurons/2), 1))
self.weight_initer1 = tf.truncated_normal_initializer(mean=1.0, stddev=0.01)
self.weight_hid = tf.get_variable(name="Weight_hidden", dtype=tf.float32, shape=[self.neurons, 1], initializer=self.weight_initer1)
self.neuron_out = tf.compat.v1.placeholder(tf.float32, shape=(4, 1))
self.weight_initer2 = tf.truncated_normal_initializer(mean=2.0, stddev=0.01)
self.weight_out = tf.get_variable(name="Weight_output", dtype=tf.float32, shape=[4, 2], initializer=self.weight_initer2)
self.bias_initer =tf.truncated_normal_initializer(mean=0-1, stddev=0.01)
self.bias_in =tf.get_variable(name="Bias_input", dtype=tf.float32, shape=[self.neurons, 1], initializer=self.bias_initer)
self.bias_initer1 =tf.truncated_normal_initializer(mean=0-2, stddev=0.01)
self.bias_hid = tf.get_variable(name="Bias_hidden", dtype=tf.float32, shape=[self.neurons, 1], initializer=self.bias_initer1)
self.bias_initer2 =tf.truncated_normal_initializer(mean=0-3, stddev=0.01)
self.bias_out = tf.get_variable(name="Bias_output", dtype=tf.float32, shape=[4, 1], initializer=self.bias_initer2)
and then the run() function is as follows:
def run(self, carollis_input):
    self.normalization(carollis_input)
    #'finding the output of the input layer'
    knowledge_input = tf.add(tf.multiply(self.neuron_input, self.weight_in), self.bias_in)
    #'calculating the input for the hidden layer'
    knowledge_hidden = tf.add(tf.multiply(knowledge_input, self.weight_in), self.bias_hid)
    #'calculating the output of hidden layer'
    knowledge_hidden_output = 3.14*(tf.add(tf.multiply(knowledge_hidden, self.weight_hid), self.bias_hid))#input function of hidden layer
    knowledge_hidden_out = tf.nn.leaky_relu(self.neuron_hid, alpha=0.01, name='leaky_relu')
    with tf.Session() as sess1_2:
        sess1_2.run(knowledge_hidden_out, feed_dict={self.neuron_input: carollis_input, self.neuron_hid: knowledge_hidden_output})
    #'calculating the input of output layer'
    tf.reshape(knowledge_hidden_out, [4, 2])#for quadrant method
    in_out = tf.add(tf.multiply(knowledge_hidden_out, self.weight_out), self.bias_out)
    with tf.Session as s:
        s.run(in_out)
    #'finding the softmax output of the neurons'
    softmax_output = np.array(4)
    softmax_output = self.out_softmax(in_out) # this gives the softmax output and stores it in the newly created array
    return softmax_output
The error is as follows :
sess1_2.run(knowledge_hidden_out, feed_dict={self.neuron_input: carollis_input, self.neuron_hid: knowledge_hidden_output})
File "/home/microbot/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 950, in run
run_metadata_ptr)
File "/home/microbot/.local/lib/python3.6/site-packages/tensorflow/python/client/session.py", line 1098, in _run
raise RuntimeError('The Session graph is empty. Add operations to the '
RuntimeError: The Session graph is empty. Add operations to the graph before calling run().
I am trying to see what my data looks like after going through a convolutional layer in my Keras model. I am using the Theano backend. The code I have has been cobbled together from the Keras GitHub:
def get_layer0_outputs(model, test_data):
    output = model.layers[0].output
    inputs = [K.learning_phase()] + model.inputs
    func = K.function(inputs, [output])
    return func([0] + [test_data])
What I'm trying to do here is compile a function for the first layer (a Conv2D layer) in my network. The test_data argument is an np.ndarray. My model is loaded correctly, and I have already trained it with decent accuracy.
However, when I call this function, I get a cryptic stacktrace:
Traceback (most recent call last):
File "/usr/local/lib/python3.5/dist-packages/theano/compile/function_module.py", line 884, in __call__
self.fn() if output_subset is None else\
File "/usr/local/lib/python3.5/dist-packages/theano/gof/op.py", line 872, in rval
r = p(n, [x[0] for x in i], o)
File "/usr/local/lib/python3.5/dist-packages/theano/tensor/nnet/abstract_conv.py", line 1626, in perform
conv_out = self.conv(img, kern, mode="valid", dilation=self.filter_dilation)
File "/usr/local/lib/python3.5/dist-packages/theano/tensor/nnet/abstract_conv.py", line 1531, in conv
dilated_kern[n, im0, ...],
IndexError: index 1 is out of bounds for axis 1 with size 1
What does this mean? Am I calling my function incorrectly?
Your function works for me using the following model:
a = Input(shape=(224,224,3))
b = Conv2D(8, 3, strides=(2,2))(a)
model = Model(inputs=a, outputs=b)
model.compile(optimizer='sgd', loss='mse')
def get_layer0_outputs(model, test_data):
    output = model.layers[0].output
    inputs = [K.learning_phase()] + model.inputs
    func = K.function(inputs, [output])
    return func([0] + [test_data])
print(get_layer0_outputs(model, np.zeros((1, 224, 224, 3)))[0].shape)
Note that layer 0 is an Input layer, not a Conv2D, but the code also works for layer 1. I'm using the TensorFlow backend, so I don't know whether the difference is your model or the Theano backend.
I want to use pretrained models' convolutional feature maps as input features for a master model.
inputs = layers.Input(shape=(100, 100, 12))
sub_models = get_model_ensemble(inputs)
sub_models_outputs = [m.layers[-1] for m in sub_models]
inputs_augmented = layers.concatenate([inputs] + sub_models_outputs, axis=-1)
Here is the key part of what I do in get_model_ensemble():
for i in range(len(models)):
    model = models[i]
    for lay in model.layers:
        lay.name = lay.name + "_" + str(i)
    # Remove the last classification layer to rather get the underlying convolutional embeddings
    model.layers.pop()
    # while "conv2d" not in model.layers[-1].name.lower():
    #     model.layers.pop()
    model.layers[0] = new_input_layer
return models
All this gives:
Traceback (most recent call last):
File "model_ensemble.py", line 151, in <module>
model = get_mini_ensemble_net()
File "model_ensemble.py", line 116, in get_mini_ensemble_net
inputs_augmented = layers.concatenate([inputs] + sub_models_outputs, axis=-1)
File "/usr/local/lib/python3.4/dist-packages/keras/layers/merge.py", line 508, in concatenate
return Concatenate(axis=axis, **kwargs)(inputs)
File "/usr/local/lib/python3.4/dist-packages/keras/engine/topology.py", line 549, in __call__
input_shapes.append(K.int_shape(x_elem))
File "/usr/local/lib/python3.4/dist-packages/keras/backend/tensorflow_backend.py", line 451, in int_shape
shape = x.get_shape()
AttributeError: 'BatchNormalization' object has no attribute 'get_shape'
Here is type info:
print(type(inputs))
print(type(sub_models[0]))
print(type(sub_models_outputs[0]))
<class 'tensorflow.python.framework.ops.Tensor'>
<class 'keras.engine.training.Model'>
<class 'keras.layers.normalization.BatchNormalization'>
Note: the models I get from get_model_ensemble() have already had their compile() function called. So, how should I concatenate my models properly? Why won't it work? I guess it may have something to do with how the inputs would be fed to the sub-models and how I hot-swapped their input layers.
Thanks for the help!
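It works if we do:
sub_models_outputs = [m(inputs) for m in sub_models]
rather than:
sub_models_outputs = [m.layers[-1] for m in sub_models]
TL;DR: a model needs to be called like a layer. m.layers[-1] is a layer object (here a BatchNormalization), not a tensor, so it cannot be concatenated.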
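A minimal sketch of the working pattern, assuming the tf.keras API rather than the standalone Keras from the question. The helper make_sub_model is hypothetical: it builds two tiny untrained models that stand in for the pretrained sub-models.

```python
from tensorflow import keras
from tensorflow.keras import layers


def make_sub_model(filters):
    # Hypothetical stand-in for a pretrained sub-model with its own input.
    inp = layers.Input(shape=(100, 100, 12))
    out = layers.Conv2D(filters, 3, padding="same")(inp)
    return keras.Model(inputs=inp, outputs=out)


sub_models = [make_sub_model(4), make_sub_model(4)]

inputs = layers.Input(shape=(100, 100, 12))
# m.layers[-1] is a Layer object; m(inputs) is the output tensor for these inputs.
sub_models_outputs = [m(inputs) for m in sub_models]
inputs_augmented = layers.concatenate([inputs] + sub_models_outputs, axis=-1)
```

Calling each model on the shared inputs tensor also removes the need to swap the sub-models' input layers by hand.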
I'm trying to implement the code from this page, but I can't work out how to format the data (training set / testing set) correctly. My code:
numpy_rng = numpy.random.RandomState(123)
dbn = DBN(numpy_rng=numpy_rng, n_ins=2,hidden_layers_sizes=[50, 50, 50],n_outs=1)
train_set_x = [
([1,2],[2,]), #first element in the tuple is the input, the second is the output
([4,5],[5,])
]
testing_set_x = [
([6,1],[3,]), #same format as the training set
]
#when I looked at the load_data function found elsewhere in the tutorial (I'll show the code they used at the bottom for ease) I found it rather confusing, but this was my first attempt at recreating what they did
train_set_xPrime = [theano.shared(numpy.asarray(train_set_x[0][0],dtype=theano.config.floatX),borrow=True),theano.shared(numpy.asarray(train_set_x[0][1],dtype=theano.config.floatX),borrow=True)]
pretraining_fns = dbn.pretraining_functions(train_set_x=train_set_xPrime,batch_size=10,k=1)
which produced this error:
Traceback (most recent call last):
File "/Users/spudzee1111/Desktop/Code/NNChatbot/DeepBeliefScratch.command", line 837, in <module>
pretraining_fns = dbn.pretraining_functions(train_set_x=train_set_xPrime,batch_size=10,k=1)
File "/Users/spudzee1111/Desktop/Code/NNChatbot/DeepBeliefScratch.command", line 532, in pretraining_functions
n_batches = train_set_x.get_value(borrow=True).shape[0] / batch_size
AttributeError: 'list' object has no attribute 'get_value'
I can't work out how the input is supposed to be formatted. I tried using theano.shared on the list, so that it would be:
train_set_xPrime = theano.shared([theano.shared(numpy.asarray(train_set_x[0][0],dtype=theano.config.floatX),borrow=True),theano.shared(numpy.asarray(train_set_x[0][1],dtype=theano.config.floatX),borrow=True)],borrow=True)
but then it said:
Traceback (most recent call last):
File "/Users/spudzee1111/Desktop/Code/NNChatbot/DeepBeliefScratch.command", line 834, in <module>
train_set_xPrime = theano.shared([theano.shared(numpy.asarray(train_set_x[0][0],dtype=theano.config.floatX),borrow=True),theano.shared(numpy.asarray(train_set_x[0][1],dtype=theano.config.floatX),borrow=True)],borrow=True) #,borrow=True),numpy.asarray(train_set_x[0][1],dtype=theano.config.floatX),borrow=True))
File "/Library/Frameworks/Python.framework/Versions/3.4/lib/python3.4/site-packages/theano/compile/sharedvalue.py", line 228, in shared
(value, kwargs))
TypeError: No suitable SharedVariable constructor could be found. Are you sure all kwargs are supported? We do not support the parameter dtype or type. value="[<TensorType(float64, vector)>, <TensorType(float64, vector)>]". parameters="{'borrow': True}"
I tried other combinations but none of them worked.
This should work:
numpy_rng = numpy.random.RandomState(123)
dbn = DBN(numpy_rng=numpy_rng, n_ins=2, hidden_layers_sizes=[50, 50, 50], n_outs=1)
train_set = [
([1,2],[2,]),
([4,5],[5,])
]
train_set_x = [train_set[i][0] for i in range(len(train_set))]
nparray = numpy.asarray(train_set_x, dtype=theano.config.floatX)
train_set_x = theano.shared(nparray, borrow=True)
pretraining_fns = dbn.pretraining_functions(train_set_x=train_set_x, batch_size=10, k=1)
The method pretraining_functions expects as input a shared variable of shape (number of samples, dimension of inputs). You can check this by looking at the shape of the MNIST dataset, the standard input for this example.
It doesn't take a list as input because this method only builds the pre-training functions. DBNs are pre-trained with an unsupervised learning algorithm, so it doesn't make sense to use the labels.
Furthermore, the input list used to make your numpy array doesn't make sense. train_set_x[0][0] yields only the first training example, whereas you want train_set_xPrime to hold all training examples. Even if you used train_set_x[0], you would get the first training example together with its label.
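To make the expected shape concrete, here is a small NumPy-only sketch. It deliberately leaves out theano.shared and uses float32 as an assumed stand-in for theano.config.floatX; it splits the (input, label) pairs and exposes the shape of the input array.

```python
import numpy

train_set = [
    ([1, 2], [2]),
    ([4, 5], [5]),
]

# Keep only the inputs: pre-training is unsupervised, so labels are not used.
train_inputs = numpy.asarray([x for x, _ in train_set], dtype="float32")
train_labels = numpy.asarray([y for _, y in train_set], dtype="float32")

# Rows are samples, columns are input features.
n_samples, n_ins = train_inputs.shape
```

Here n_ins matches the n_ins=2 passed to DBN, and train_inputs is the array that would be wrapped in theano.shared.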