Keras Error when checking input - python

I want to explore an intermediate layer of a TensorFlow model defined with Keras:
from keras.layers import Input, Dense
from keras.models import Model
from keras import regularizers

input_dim = 30
input_layer = Input(shape=(input_dim, ))
encoder = Dense(encoding_dim, activation="tanh",
                activity_regularizer=regularizers.l1(10e-5))(input_layer)
encoder = Dense(int(encoding_dim / 2), activation="relu")(encoder)
decoder = Dense(int(encoding_dim / 2), activation='tanh')(encoder)
decoder = Dense(input_dim, activation='relu')(decoder)
autoencoder = Model(inputs=input_layer, outputs=decoder)

#### TRAINING....

# inspect layer 1
intermediate_layer_model = Model(inputs=autoencoder.layers[0].input,
                                 outputs=autoencoder.layers[1].output)
xtest = # array of dim (30,)
intermediate_output = intermediate_layer_model.predict(xtest)
print(intermediate_output)
However, I get a dimension error when I inspect:
/usr/local/lib/python2.7/site-packages/keras/engine/training_utils.pyc in standardize_input_data(data, names, shapes, check_batch_axis, exception_prefix)
134 ': expected ' + names[i] + ' to have shape ' +
135 str(shape) + ' but got array with shape ' +
--> 136 str(data_shape))
137 return data
138
ValueError: Error when checking input: expected input_4 to have shape (30,) but got array with shape (1,)
Any help is appreciated.

From the Keras docs:
shape: A shape tuple (integer), not including the batch size. For instance, shape=(32,) indicates that the expected input will be batches of 32-dimensional vectors.
When specifying the model you do not need to provide a batch dimension, but model.predict() does expect your array to include one.
Reshape your xtest to contain a batch dimension, xtest = np.reshape(xtest, (1, -1)), and set the batch_size argument of model.predict() to 1.
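Applied to the snippet above, the fix looks like this (a minimal sketch, assuming xtest starts as a 1-D NumPy array of length 30):
import numpy as np

xtest = np.reshape(xtest, (1, -1))  # (30,) -> (1, 30): one sample in a batch of 1
intermediate_output = intermediate_layer_model.predict(xtest, batch_size=1)
print(intermediate_output)  # shape (1, encoding_dim)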

Related

input must have 3 dimensions, got 2: error when creating an LSTM classifier

The structure of the network must be as follows:
(lstm): LSTM(1, 64, batch_first=True)
(fc1): Linear(in_features=64, out_features=32, bias=True)
(relu): ReLU()
(fc2): Linear(in_features=32, out_features=5, bias=True)
I wrote this code:
import torch
import torch.nn as nn
import torch.nn.functional as F

class LSTMClassifier(nn.Module):
    def __init__(self):
        super(LSTMClassifier, self).__init__()
        self.lstm = nn.LSTM(1, 64, batch_first=True)
        self.fc1 = nn.Linear(in_features=64, out_features=32, bias=True)
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(in_features=32, out_features=5, bias=True)

    def forward(self, x):
        x = torch.tanh(self.lstm(x)[0])
        x = self.fc1(x)
        x = F.relu(x)
        x = self.fc2(x)
        return x
This is for testing:
(batch_data, batch_label) = next(iter(train_loader))
model = LSTMClassifier().to(device)
output = model(batch_data.to(device)).cpu()
assert output.shape == (batch_size, 5)
print("passed")
The error is:
----> 3 output = model (batch_data.to(device)).cpu()
5 frames
/usr/local/lib/python3.7/dist-packages/torch/nn/modules/rnn.py in check_input(self, input, batch_sizes)
201 raise RuntimeError(
202 'input must have {} dimensions, got {}'.format(
--> 203 expected_input_dim, input.dim()))
204 if self.input_size != input.size(-1):
205 raise RuntimeError(
RuntimeError: input must have 3 dimensions, got 2
What is my problem?
LSTMs expect 3-dimensional input (sample, time-step, feature). You need to transform your input from 2D to 3D. To do so, you can:
Use the reshape function
First, get the shape of your 2D input with batch_data.shape. Let's assume your 2D input has shape (15, 4).
Now, to reshape the input from 2D to 3D, use a reshape function. Since batch_data is a torch tensor, tensor.reshape(new_shape) keeps it a tensor (np.reshape would return a NumPy array, which has no .to(device)):
(batch_data, batch_label) = next(iter(train_loader))
batch_data = batch_data.reshape(15, 4, 1)  # line to add: (15, 4) -> (15, 4, 1)
model = LSTMClassifier().to(device)
output = model(batch_data.to(device)).cpu()
assert output.shape == (batch_size, 5)
print("passed")
Later on, you will also need to reshape your test data from 2D to 3D.
Add a RepeatVector layer
This layer is implemented in Keras; I'm not sure whether an equivalent is available in PyTorch, which is what you are using.
It adds an extra dimension to your data by repeating the input n times. For example, you can convert a 2D input (batch size, input size) into a 3D input (batch size, sequence length, input size).
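A shape-agnostic alternative to hard-coding (15, 4, 1) is to append the feature dimension with unsqueeze. This sketch assumes batch_data is a 2-D tensor of shape (batch, seq_len):
(batch_data, batch_label) = next(iter(train_loader))
batch_data = batch_data.unsqueeze(-1)  # (batch, seq_len) -> (batch, seq_len, 1)
model = LSTMClassifier().to(device)
output = model(batch_data.to(device)).cpu()  # (batch, seq_len, 5) with the forward above
Note that the forward above returns per-timestep outputs, so you would still need to select the last step (e.g. output[:, -1, :]) before the assert against (batch_size, 5).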

Behavior difference when building TF Keras RNN with two different methods

I am building an RNN text generator, mostly following the TensorFlow docs here.
My question: I have defined the model in two ways:
Method (1):
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embed_dim,
                              batch_input_shape=[batch_size, None]),
    tf.keras.layers.GRU(rnn_units, return_sequences=True, stateful=True,
                        recurrent_initializer='glorot_uniform'),
    tf.keras.layers.Dense(vocab_size)
])
Method (2):
model = tf.keras.Sequential()
model.add(tf.keras.layers.Embedding(vocab_size, embed_dim,
                                    batch_input_shape=[BATCH_SIZE, None]))
model.add(tf.keras.layers.GRU(rnn_units, return_sequences=True,
                              stateful=True,
                              recurrent_initializer='glorot_uniform'))
model.add(tf.keras.layers.Dense(vocab_size))
In my mind, these both do the same thing. However, when generating text with:
def generate_text(model, start_string, length=1000):
    # converting start string to numbers (vectorisation)
    input_eval = [char2idx[s] for s in start_string]
    input_eval = tf.expand_dims(input_eval, 0)
    # initialise empty list to store results
    text = []
    model.reset_states()
    for i in range(length):
        predictions = model(input_eval)
        # remove batch dimension
        predictions = tf.squeeze(predictions, 0)
        # use categorical distribution to predict character returned by model
        predicted_id = tf.random.categorical(predictions, num_samples=1)[-1, 0].numpy()
        # we pass the predicted character as the next input to the model
        # along with previous hidden state
        input_eval = tf.expand_dims([predicted_id], 0)
        # append predicted character
        text.append(idx2char[predicted_id])
    return (start_string + ''.join(text))
which I call with:
print(generate_text(model, start_string=u'From '))
Method (1) works perfectly, but method (2) throws the following error:
WARNING:tensorflow:Model was constructed with shape Tensor("embedding_1_input:0", shape=(64, None), dtype=float32) for input (64, None), but it was re-called on a Tensor with incompatible shape (1, 5).
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-33-eb814780c9fe> in <module>()
----> 1 print(generate_text(model, start_string=u'From ', length=PRINT))
14 frames
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py in set_shape(self, shape)
1086 raise ValueError(
1087 "Tensor's shape %s is not compatible with supplied shape %s" %
-> 1088 (self.shape, shape))
1089
1090 # Methods not supported / implemented for Eager Tensors.
ValueError: Tensor's shape (5, 64, 1024) is not compatible with supplied shape [5, 1, 1024]
If anyone could help me understand the difference between these two methods, that would be amazing. Thank you!
Edit:
Including model saving and loading code. I use this to save the model (with a batch size of 64) and then load it with a batch size of 1 for text generation.
Saving weights:
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath='./training_checkpoints/ckpt_{epoch}',
    save_weights_only=True
)
Loading weights into new model (batch size = 1):
model = build_model(len(vocab_char), EMBED_DIM, UNITS, 1)
model.load_weights(tf.train.latest_checkpoint('./training_checkpoints'))
model.build(tf.TensorShape([1, None]))
model.summary()
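For reference, build_model is not shown in the question. In the TensorFlow tutorial the question follows, it is a factory in the style of method (1), roughly like this sketch (signature assumed from that tutorial):
def build_model(vocab_size, embedding_dim, rnn_units, batch_size):
    # Same layers as method (1), but with the batch size passed in,
    # so the model can be rebuilt with batch_size=1 for generation.
    return tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embedding_dim,
                                  batch_input_shape=[batch_size, None]),
        tf.keras.layers.GRU(rnn_units, return_sequences=True, stateful=True,
                            recurrent_initializer='glorot_uniform'),
        tf.keras.layers.Dense(vocab_size)
    ])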

ValueError: Error when checking target: expected dense_2 to have shape (1,) but got array with shape (27,)

I am writing a program to predict a random name based on names it has been trained on (character-level encoding). My output shape is (44, 27) and my dense layer is set to give 27 softmax classes. I still get a ValueError.
I have tried adding an axis to the output (Y_train_oh = np.expand_dims(Y_train_oh, axis=2)).
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

def model1(vocab_len):
    model = Sequential()
    model.add(LSTM(128, input_shape=(buff_length, vocab_len)))
    model.add(Dense(units=60, activation='relu'))
    model.add(Dense(units=vocab_len, activation='softmax'))
    model.summary()
    return model

def one_hot(Y, char2idx, vocablen):
    Ty = len(Y)
    Yoh = np.zeros((Ty, vocablen))
    for idx in range(Ty):
        Yoh[idx, char2idx[Y[idx]]] = 1
    return Yoh

def trainer(X, vocab, char2idx, no_epochs=1, batch_size=10):
    model = model1(len(vocab))
    model.compile(optimizer='Adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
    for epn in range(no_epochs):
        np.random.seed(1 + epn)
        Tx = len(X)
        indices = np.random.randint(0, Tx, batch_size)
        X_train = []
        Y_train = []
        for index in indices:
            name = str(X[index])
            for chIndex in range(len(name) - 1):
                if chIndex >= buff_length - 1:
                    X_train.append(name[chIndex - buff_length + 1: chIndex + 1])
                    Y_train.append(name[chIndex + 1])
        for i in range(len(X_train)):
            print((X_train[i] + ' : ' + Y_train[i]))
        X_train_oh = np.copy(one_hot_buffer(X_train, char2idx, len(vocab)))
        Y_train_oh = np.copy(one_hot(Y_train, char2idx, len(vocab)))
        print(X_train_oh.shape, ':', Y_train_oh.shape)
        model.fit(x=X_train_oh, y=Y_train_oh)
    model.save('name_model.h5')
Error message:
ValueError: Error when checking target: expected dense_10 to have shape (1,) but got array with shape (27,)
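The mismatch is consistent with the compiled loss: in Keras, sparse_categorical_crossentropy expects integer class indices of shape (1,) per sample, while categorical_crossentropy expects one-hot vectors like the 27-wide targets produced by one_hot above. A minimal sketch of the two consistent pairings, reusing the question's variables:
# Option 1: keep the one-hot targets of shape (batch, 27)
model.compile(optimizer='Adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(x=X_train_oh, y=Y_train_oh)

# Option 2: keep the sparse loss and feed integer class indices of shape (batch,)
Y_train_ids = np.array([char2idx[ch] for ch in Y_train])
model.compile(optimizer='Adam', loss='sparse_categorical_crossentropy', metrics=['accuracy'])
model.fit(x=X_train_oh, y=Y_train_ids)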

Shapes of logits and labels are incompatible

The full error message is like this:
ValueError: Shapes (2, 1) and (50, 1) are incompatible
It occurs when my model is trained. The mistake is either in my input_fn:
train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"x": training_data},
    y=training_labels,
    batch_size=50,
    num_epochs=None,
    shuffle=True)
in my logits and loss function:
dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)
dropout = tf.layers.dropout(inputs=dense, rate=0.4, training=(mode == tf.estimator.ModeKeys.TRAIN))
logits = tf.layers.dense(inputs=dropout, units=1)
loss = tf.losses.softmax_cross_entropy(labels=labels, logits=logits)
or in my dataset. I can only print out the shapes of my dataset for you to take a look at:
#shape of the dataset
train_data.shape
(1196,2,1)
train_data[0].shape
(2,1)
#this is the data
train_data[0][0].shape
(1,)
train_data[0][0][0].shape
(20,50,50)
#this is the labels
train_data[0][1].shape
(1,)
The problem seems to be the shape of the logits. They are supposed to be [batch_size, num_classes], in this case [50, 1], but are [2, 1]. The shape of the labels is correctly [50, 1].
I have made a github gist if you want to take a look at the whole code.
https://gist.github.com/hjkhjk1999/38f358a53da84a94bf5a59f44050aad5
In your code you state that the inputs to your model will be fed in batches of 50 samples per batch with one variable, but it looks like you are actually feeding a batch of 2 samples with 1 variable (shape = [2, 1]) despite feeding labels with shape [50, 1].
That's the problem: you are giving two 'questions' and 50 'answers'.
Also, your dataset is shaped in a really weird way. I see you named your GitHub gist 3D Conv. If you are indeed trying to do a 3D convolution, you might want to reshape your dataset into a tensor (NumPy array) of shape [samples, width, height, depth].
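For illustration, here is a sketch of that unpacking, assuming (as the printed shapes suggest) that each row of train_data holds a features entry of shape (20, 50, 50) and a label entry of shape (1,):
import numpy as np

# Hypothetical unpacking based on the shapes printed above.
training_data = np.stack([row[0][0] for row in train_data])    # (1196, 20, 50, 50)
training_labels = np.stack([row[1] for row in train_data])     # (1196, 1)

# For a 3D convolution, a trailing channels axis is usually needed as well:
training_data = training_data[..., np.newaxis]                 # (1196, 20, 50, 50, 1)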

How to multiply list of tensors by single tensor on TensorFlow?

I am implementing an RNN and, contrary to the examples I have found, which minimize the cost only for the output at the last step, I want to minimize the cost for the outputs at all timesteps:
x = tf.placeholder("float", [features_dimension, None, n_timesteps])
y = tf.placeholder("float", [labels_dimension, None, n_timesteps])

# Define weights
weights = {'out': tf.Variable(tf.random_normal([N_HIDDEN, labels_dimension]))}
biases = {'out': tf.Variable(tf.random_normal([labels_dimension]))}

def RNN(x, weights, biases):
    # Prepare data shape to match `rnn` function requirements
    # Current data input shape: (features_dimension, BATCH_SIZE, n_timesteps)
    # Required shape: `n_timesteps` tensors list of shape (BATCH_SIZE, features_dimension)
    # We make a division of the data to split it in individual vectors that
    # will be fed for each timestep

    # Permuting features_dimension and n_timesteps
    # Shape will be (n_timesteps, BATCH_SIZE, features_dimension)
    x = tf.transpose(x, [2, 1, 0])
    # Reshaping to (BATCH_SIZE*n_timesteps, features_dimension) (we are removing the depth dimension with this)
    x = tf.reshape(x, [BATCH_SIZE * n_timesteps, features_dimension])
    # Split the previous 2D tensor to get a list of `n_timesteps` tensors of
    # shape (batch_size, features_dimension).
    x = tf.split(x, n_timesteps, 0)
    # Define a lstm cell with tensorflow
    lstm_cell = rnn.BasicLSTMCell(N_HIDDEN, forget_bias=1.0)
    # Get lstm cell output
    outputs, states = rnn.static_rnn(lstm_cell, x, dtype=tf.float32)
    # Linear activation; outputs contains the array of outputs for all the
    # timesteps
    pred = tf.matmul(outputs, weights['out']) + biases['out']
    return pred
However, the object outputs is a list of n_timesteps tensors, so pred = tf.matmul(outputs, weights['out']) + biases['out'] throws the error:
ValueError: Shape must be rank 2 but is rank 3 for 'MatMul' (op:
'MatMul') with input shapes: [100,128,16], [16,1].
How can I do this multiplication?
The solution is to tf.stack the list of tensors into a 3d tensor and then use tf.map_fn to apply the multiplication operation on each 2d tensor along dimension 0:
# Transform the list into a 3D tensor with dimensions (n_timesteps, batch_size, N_HIDDEN)
outputs = tf.stack(outputs)

def pred_fn(current_output):
    return tf.matmul(current_output, weights['out']) + biases['out']

# Use tf.map_fn to apply pred_fn to each tensor in outputs, along dimension 0 (timestep dimension)
pred = tf.map_fn(pred_fn, outputs)
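If you prefer to avoid map_fn, the same product can be written as a single contraction with tf.tensordot (a sketch under the same shape assumptions):
outputs = tf.stack(outputs)  # (n_timesteps, batch_size, N_HIDDEN)
# Contract the N_HIDDEN axis against the first axis of the (N_HIDDEN, labels_dimension) weights
pred = tf.tensordot(outputs, weights['out'], axes=[[2], [0]]) + biases['out']
# pred has shape (n_timesteps, batch_size, labels_dimension)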
