Problem
I am using the Model API to create a Keras network that takes two inputs and produces one output. When training the network I get the following error:
Error when checking model input: the list of Numpy arrays that you
are passing to your model is not the size the model expected. Expected
to see 2 array(s), but instead got the following list of 1 arrays:
Despite this error, the input X array has a shape of (2,8), and the output y array has a shape of (1,4).
Things already tried
There are a number of similar questions on SO; however, their solutions largely revolve around ensuring X and y are Numpy arrays. As seen in my implementation, I have already done that, so I do not believe this is a duplicate question.
Implementation
I have defined the model as follows:
from keras.layers import Input, Dense
from keras.models import Model
from keras.optimizers import Adam
from keras import backend as K

opt = Adam(lr=alpha)
input = Input(shape=(input_dim_,))
delta = Input(shape=[1])
l1 = Dense(units=1024, input_dim=input_dim_, activation="relu")(input)
l2 = Dense(units=512, activation="relu")(l1)

def loss_function(y, y_pred):
    y_pred = K.clip(y_pred, 1e-8, 1 - 1e-8)
    return K.sum(-y * K.log(y_pred) * delta)

if model_type == "actor":
    out = Dense(units=output_dim_, activation="softmax")(l2)
    model = Model(inputs=[input, delta], outputs=[out])
    model.compile(loss=loss_function, optimizer=opt)
And train the model by doing the following:
X = [s_t,delta]
X = np.array(X)
actor.fit(X,y,verbose=0)
You are not passing the data correctly in fit:
actor.fit(X,y,verbose=0)
Here X should be a list containing two numpy arrays, where each numpy array corresponds to one of your inputs (your model has two inputs). So it should be more like this:
X = [np.array(s_t), np.array(delta)]
actor.fit(X, y, verbose=0)
Then it should work.
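For completeness, here is a minimal sketch of the corrected call; the sizes below are made up purely for illustration. Note that wrapping the list in np.array(), as in the original code, collapses the two inputs into a single array, which is exactly why Keras reports one array instead of two:

import numpy as np

# Hypothetical sizes for illustration only.
batch, input_dim_, output_dim_ = 8, 4, 2
s_t = np.random.random((batch, input_dim_))   # one state row per sample
delta = np.random.random((batch, 1))          # one delta per sample
y = np.random.random((batch, output_dim_))

# One array per Input layer, in the order given to Model(inputs=[...]):
actor.fit([s_t, delta], y, verbose=0)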
Related
I followed the code examples for structured data classification at keras.io to build a model for classifying a rather simple dataset, similar to the one in the example.
I wanted to extend the model to handle a second output, but I cannot train this model. The dataset is generated as in the example (but with two result columns):
res1 = dataframe.pop("result1")
res2 = dataframe.pop("result2")
ds = tf.data.Dataset.from_tensor_slices((dict(dataframe), (res1, res2)))
The model is also similar to the example but using a two-dimensional output:
x = layers.Dense(32, activation="relu")(all_features)
x = layers.Dropout(0.5)(x)
output = layers.Dense(2, activation="sigmoid")(x)
model = keras.Model(all_inputs, output)
model.compile("adam", "binary_crossentropy", metrics=["accuracy"])
It compiles, but when I try to run fit...
model.fit(train_ds, epochs=30)
I get an error message:
ValueError: logits and labels must have the same shape ((None, 2) vs (None, 1))
How can I prepare the dataset to meet the shape constraints?
I believe you should stack the two label columns into a single (N, 2) array, for example with zip() (wrapped in list() so from_tensor_slices() can convert it to a tensor):
ds = tf.data.Dataset.from_tensor_slices((dict(dataframe), list(zip(res1, res2))))
This way, from_tensor_slices() slices the labels into rows of shape (2,), matching the model's (None, 2) output, instead of treating res1 and res2 as two separate label vectors of shape (N, 1).
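A small self-contained sketch of this; the dataframe and label values here are toy data, not from the question:

import pandas as pd
import tensorflow as tf

# Toy data for illustration only.
dataframe = pd.DataFrame({"feat": [0.1, 0.2, 0.3, 0.4]})
res1 = pd.Series([0, 1, 0, 1])
res2 = pd.Series([1, 1, 0, 0])

# list(zip(...)) materializes the pairs, so the labels become one (N, 2) tensor.
ds = tf.data.Dataset.from_tensor_slices((dict(dataframe), list(zip(res1, res2))))
print(ds.element_spec)  # label spec now has shape (2,), matching the (None, 2) output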
I have a Keras LSTM model that contains multiple outputs.
The model is defined as follows:
outputs = []
main_input = Input(shape=(seq_length, feature_cnt), name='main_input')
lstm = LSTM(32, return_sequences=True)(main_input)
for _ in range(output_branches):  # output_branches is the number of output branches of the model
    prediction = LSTM(8, return_sequences=False)(lstm)
    out = Dense(1)(prediction)
    outputs.append(out)
model = Model(inputs=main_input, outputs=outputs)
model.compile(optimizer='rmsprop', loss='mse')
I have a problem when reshaping the output data.
The code for reshaping the output data is:
y = y.reshape((len(y), output_branches, 1))
I got the following error:
ValueError: Error when checking model target: the list of Numpy arrays
that you are passing to your model is not the size the model expected.
Expected to see 5 array(s), but instead got the following list of 1
arrays: [array([[[0.29670931],
[0.16652206],
[0.25114482],
[0.36952324],
[0.09429612]],
[[0.16652206],
[0.25114482],
[0.36952324],
[0.09429612],...
How can I correctly reshape the output data?
It depends on how y is structured initially. Here I assume that y is a single-valued label for each sequence in the batch.
When there are multiple inputs/outputs, model.fit() expects a corresponding list of inputs/outputs to be given. np.split(y, output_branches, axis=-1) in the following fully reproducible example does exactly this: for each batch it splits a single array of outputs into a list of separate outputs, where each output (in this case) is a 1-element list:
import tensorflow as tf
import numpy as np

tf.enable_eager_execution()  # only needed on TF 1.x

batch_size = 100
seq_length = 10
feature_cnt = 5
output_branches = 3

# Say we've got:
# - a 100-element batch
# - of 10-element sequences
# - where each element of a sequence is a vector describing 5 features.
X = np.random.random_sample([batch_size, seq_length, feature_cnt])

# Every sequence of a batch is labelled with `output_branches` labels.
y = np.random.random_sample([batch_size, output_branches])
# Here y.shape == (100, 3)

# Split the last axis of y into `output_branches` separate arrays.
y = np.split(y, output_branches, axis=-1)
# Now y is not a numpy matrix anymore, but a list of matrices.
# E.g. y[0].shape == (100, 1); y[1].shape == (100, 1) etc...

outputs = []
main_input = tf.keras.layers.Input(shape=(seq_length, feature_cnt), name='main_input')
lstm = tf.keras.layers.LSTM(32, return_sequences=True)(main_input)
for _ in range(output_branches):
    prediction = tf.keras.layers.LSTM(8, return_sequences=False)(lstm)
    out = tf.keras.layers.Dense(1)(prediction)
    outputs.append(out)
model = tf.keras.models.Model(inputs=main_input, outputs=outputs)
model.compile(optimizer='rmsprop', loss='mse')

model.fit(X, y)
You might need to play around with the axes, as you didn't specify exactly what your data looks like.
EDIT:
As the author is looking for an answer drawing on official sources, it's mentioned here (not explicitly, though; it only describes what the Dataset should yield, and hence what input structure model.fit() expects):
When calling fit with a Dataset object, it should yield either a tuple of lists like ([title_data, body_data, tags_data], [priority_targets, dept_targets]) or a tuple of dictionaries like ({'title': title_data, 'body': body_data, 'tags': tags_data}, {'priority': priority_targets, 'department': dept_targets}).
Since you have a number of outputs equal to output_branches, your output data must be a list with the same number of arrays.
Basically, if the outputs sit along the middle dimension, as your reshape suggests:
y = [y[:, i] for i in range(output_branches)]
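A quick shape check of that comprehension, assuming y has the (len(y), output_branches, 1) layout produced by the reshape in the question:

import numpy as np

output_branches = 5  # matches the "Expected to see 5 array(s)" error above
y = np.random.random((100, output_branches, 1))
y = [y[:, i] for i in range(output_branches)]
print(len(y), y[0].shape)  # 5 (100, 1) -- one (N, 1) target array per output branch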
I have a trained keras model which takes inputs of size (batchSize,2). This works well and gives good results.
My main problem is to build a model which takes as input a tensor of size (batchSize, 2, 16), slices it inside the model into 16 vectors of size (batchSize, 2), runs each through the trained model, and concatenates the outputs together.
I have used the following code for this:
y = layers.Input(shape=(2, 16,))
model_x = load_model('saved_model')

for i in range(16):
    x_input = Lambda(lambda x: x[:, :, i])(y)
    if i == 0:
        x_output = model_x(x_input)
    else:
        x_output = layers.concatenate([x_output, model_x(x_input)])

x_output = Lambda(lambda x: x[:, :tf.cast(N, tf.int32)])(x_output)
final_model = Model(y, x_output)
Although the saved model gives me good performance, this code does not train well and doesn't give the intended performance.
What can I do to get better results?
I can't say anything about the bad performance of your final model because it might be due to various reasons and this is not readily evident from the content of your question. But to answer your original question: yes, you can use for loops that way, because you are essentially creating layers/tensors and connecting them to each other (i.e. building the graph of the model). So it's a valid thing to do. The problem might be somewhere else, e.g. a wrong indexing, a wrong loss function, etc.
Further, you can build your final model with a much simpler approach. You already have a trained model which takes inputs of shape (batch_size, 2) and gives outputs of shape (batch_size, 8). Now you want to build a model which takes inputs of shape (batch_size, 2, 16), applies the already trained model on each of the 16 (batch_size, 2) segments and then concatenates the results. You can easily do that with a TimeDistributed wrapper:
# load your already trained model
model_x = load_model('saved_model')
inp = layers.Input(shape=(2,16))
# this makes the input shape as `(16,2)`
x = layers.Permute((2,1))(inp)
# this would apply `model_x` on each of the 16 segments; the output shape would be (None, 16, 8)
x = layers.TimeDistributed(model_x)(x)
# flatten to make it have a shape of (None, 128)
out = layers.Flatten()(x)
final_model = Model(inp, out)
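A quick sanity check on dummy data, assuming (as above) that the loaded model maps (batch, 2) to (batch, 8):

import numpy as np

dummy = np.random.random((4, 2, 16))
# 16 segments, 8 outputs each, flattened: expect (4, 128)
print(final_model.predict(dummy).shape)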
I am using model.predict() on a test tensor that has the same shape as the input used for training, (N_tr*70, 1025, 11, 3).
The model is trained for regression, with three ground-truth outputs, each of shape (N_te*70, 1025).
For information, when testing the model N_te = 180.
According to the documentation, the output of model.predict() should be a numpy array; instead I get a list of three elements, each with shape (N_te*70, 1025).
I am afraid that the output might have been somehow shuffled (which would explain my unexpected results).
Do you have any advice to get a numpy array which is compatible to the one I used as ground-truth? If not, do you know any other work-around?
EDIT: added the neural network code
input_img = Input(shape=(1025, 11, 3))
x = Flatten()(input_img)
for i in range(0, 4):
    x = Dense(1024 * 3)(x)
    x = BatchNormalization()(x)
    x = LeakyReLU()(x)
o0 = Dense(1025, activation='sigmoid')(x)
o1 = Dense(1025, activation='sigmoid')(x)
o2 = Dense(1025, activation='sigmoid')(x)
model = Model(inputs=input_img, outputs=[o0, o1, o2])
Model prediction:
output = model.predict(X_in, batch_size=batch_size, verbose=1)
It is expected that in a multi-output model, predict returns a list of numpy arrays, with each element being the corresponding output. Remember that the loss is computed individually between each output and its ground truth, so this format is already ideal for that purpose. Also, predict does not shuffle samples: each returned array preserves the row order of X_in.
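If you still want a single numpy array laid out like your ground truth, you can stack the list yourself; a sketch, assuming output holds the three predictions in the order [o0, o1, o2]:

import numpy as np

# `output` is a list of three arrays, each of shape (N_te*70, 1025).
y_pred = np.stack(output, axis=0)  # shape (3, N_te*70, 1025)
# Or compare each output against its own ground truth directly, e.g.:
# mse_0 = np.mean((output[0] - y_true_0) ** 2)  # y_true_0 is hypothetical here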
I want to train a binary classifier using Keras. My training data is of shape (2000, 2, 128), with labels of shape (2000,), both as Numpy arrays.
The idea is that each training sample packs two embeddings into a single array; the sample is labelled 0 if the embeddings are the same and 1 if they are different.
The training data looks like:
[[[0 1 2 ....128][129.....256]][[1 2 3 ...128][9 9 3 5...]].....]
and the labels look like [1 1 0 0 1 1 0 0 ...].
Here is the code:
import keras
from keras.layers import Input, Dense
from keras.models import Model
frst_input = Input(shape=(128,), name='frst_input')
scnd_input = Input(shape=(128,), name='scnd_input')
x = keras.layers.concatenate([frst_input, scnd_input])
x = Dense(128, activation='relu')(x)
x = Dense(1, activation='softmax')(x)
model = Model(inputs=[frst_input, scnd_input], outputs=[x])
model.compile(optimizer='rmsprop', loss='binary_crossentropy',
              loss_weights=[0.2], metrics=['accuracy'])
I am getting the following error while running this code:
ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 2 array(s), but instead got the following list of 1 arrays: [array([[[ 0.07124118, -0.02316936, -0.12737238, ..., 0.15822273,
0.00129827, -0.02457245],
[ 0.15869428, -0.0570458 , -0.10459555, ..., 0.0968155 ,
0.0183982 , -0.077924...
How can I resolve this issue? Is my code correct for training a classifier with two inputs?
Well, you have two options here:
1) Reshape the training data to (2000, 128*2) and define only one input layer:
X_train = X_train.reshape(-1, 128*2)

inp = Input(shape=(128*2,))
x = Dense(128, activation='relu')(inp)
x = Dense(1, activation='sigmoid')(x)
model = Model(inputs=[inp], outputs=[x])
2) Define two input layers, as you have already done, and pass a list of two input arrays when calling the fit method:
# assuming X_train has a shape of `(2000, 2, 128)` as you suggested
model.fit([X_train[:,0], X_train[:,1]], y_train, ...)
Further, since you are doing binary classification here, you need to use sigmoid as the activation of the last layer (using softmax on a single unit would always output 1, since softmax normalizes the outputs so that they sum to one).
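A tiny numeric check of that last point: softmax over a single logit is exp(x)/exp(x), i.e. always exactly 1, no matter what the network computes:

import numpy as np

def softmax(v):
    e = np.exp(v - np.max(v))
    return e / e.sum()

print(softmax(np.array([2.7])))    # [1.]
print(softmax(np.array([-40.0])))  # [1.] -- the logit's value is irrelevant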