I'm trying to construct a (very simple) Keras model as a baseline for a project. I have a list of 3459 numpy arrays of shape (2, 6, 15) as input, and a list of target values (ints as numpy arrays with shape ()). When I try to train the model I get this error:
"ValueError: Number of samples 2 is less than samples required for specified batch_size 32 and steps 108."
The model so far is extremely simple, but I'm having no luck getting it to train:
input = Input(shape=(2, 6, 15))
x = Dense(64, activation='relu')(input)
x = Dense(64, activation='relu')(x)
output = Dense(1)(x)
model = Model(inputs=input, outputs=output)
model.compile(optimizer='adam', loss='mean_squared_error',
              metrics=['accuracy'])
hist = model.fit(
    X_train,
    y_train,
    batch_size=32,
    epochs=10,
    validation_data=(X_test, y_test),
    steps_per_epoch=(len(X_train) // 32),
    validation_steps=(len(X_test) // 32))
I'm currently loading the data from pickle files, and I suspect that the issue might be the array structure of the individual training cases. When looking at one of the arrays in the X_train it has a structure [[[...]...], [[...]...]], and I suspect the code is confusing the outer brackets as the batch container, so it's reading a batch size of 2 as input instead. Just a theory, but I don't know how to address that to check for myself.
The error is indeed due to the way your data is being generated/loaded; there are no errors if you train the model on random tensors with the specified shapes.
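A minimal sketch of one way to check the theory from the question, assuming the pickle files give you a plain Python list of per-sample arrays (the random data below is only a stand-in for them). Keras may interpret a list of arrays as several separate inputs rather than as one batch, so stacking the list into a single array with the sample axis first is usually the safer form:

```python
import numpy as np

# Stand-in for the unpickled data: a Python list of per-sample arrays.
rng = np.random.default_rng(0)
X_list = [rng.random((2, 6, 15)) for _ in range(3459)]
y_list = [np.array(rng.integers(0, 10)) for _ in range(3459)]  # 0-d arrays

# Stack along a new leading axis so that axis 0 is the sample axis.
X_train = np.stack(X_list).astype("float32")
y_train = np.asarray(y_list, dtype="float32")

print(X_train.shape)  # (3459, 2, 6, 15)
print(y_train.shape)  # (3459,)
```

With in-memory arrays like these, steps_per_epoch and validation_steps can normally be left out of fit(), since batch_size alone determines the number of steps.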
I'm trying to do binary classification. Each entry of my data consists of a list of features (previously generated BERT features, saved to a CSV with tensor.numpy().flatten() and loaded as a pandas DataFrame) and a label (0 or 1).
Then I create a naive model (just to test) with a Dense(16) ReLU layer and a Dense(1) sigmoid layer. I compile the model with BinaryCrossentropy loss and the Adam optimizer.
When I run model.fit, I get this error:
ValueError: Dimension 0 in both shapes must be equal, but are 2 and 1. Shapes are [2] and [1]. for '{{node AssignAddVariableOp_8}} = AssignAddVariableOp[dtype=DT_FLOAT](AssignAddVariableOp_8/resource, Sum_7)' with input shapes: [], [1]
I've already tried to reshape y, but got the same error. If I use a 2-unit softmax layer (with BinaryCrossentropy or sparse_categorical_crossentropy), I get another error:
ValueError: logits and labels must have the same shape ((None, 2) vs (None, 1))
I've looked at many solutions here on Stack Overflow, but couldn't solve my problem. Could anyone please help me?
Tensorflow and keras versions: 2.5.0
train = pd.read_csv("../../../bert_features/panglee/panglee_bert_features_train.csv", sep=",")
valid = pd.read_csv("../../../bert_features/panglee/panglee_bert_features_valid.csv", sep=",")
test = pd.read_csv("../../../bert_features/panglee/panglee_bert_features_test.csv", sep=",")
x_train = train['bert_feats'].apply(np.vectorize(tf.constant))
y_train = train['label'].apply(np.vectorize(tf.constant))
x_valid = valid['bert_feats'].apply(np.vectorize(tf.constant))
y_valid = valid['label'].apply(np.vectorize(tf.constant))
x_test = test['bert_feats'].apply(np.vectorize(tf.constant))
y_test = test['label'].apply(np.vectorize(tf.constant))
y_train = np.asarray(y_train).astype('float32').reshape((len(y_train),1))
y_valid = np.asarray(y_valid).astype('float32').reshape((len(y_valid),1))
y_test = np.asarray(y_test).astype('float32').reshape((len(y_test),1))
model = keras.Sequential()
model.add(keras.layers.Dense(16, activation='relu'))
model.add(keras.layers.Dense(1, activation='sigmoid'))
model.compile(optimizer='adam',
              loss=keras.losses.BinaryCrossentropy(),
              metrics=['accuracy',
                       tf.metrics.Precision(),
                       tf.metrics.Recall(),
                       tfa.metrics.F1Score(num_classes=2, average='macro'),
                       tf.metrics.AUC(curve='PR'),
                       tf.metrics.AUC(curve='ROC')])
model.fit(x_train, y_train, epochs=10, validation_data=(x_valid, y_valid))
Answering here for the benefit of the community, since the issue was solved as mentioned in the comments.
Problem solved. I used to_categorical() and it worked fine.
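For readers wondering why that helps: one plausible reading (an assumption, since the thread does not spell it out) is that the function meant is keras.utils.to_categorical, and that tfa.metrics.F1Score(num_classes=2) keeps one accumulator per class and therefore wants labels of shape (batch, 2) rather than (batch, 1). The NumPy sketch below mimics that one-hot conversion; one_hot is a hypothetical helper, not a Keras function:

```python
import numpy as np

def one_hot(labels, num_classes):
    """Hypothetical helper: turn integer labels into one-hot rows."""
    labels = np.asarray(labels).astype("int64").reshape(-1)
    return np.eye(num_classes, dtype="float32")[labels]

y = np.array([0, 1, 1, 0])
y_onehot = one_hot(y, num_classes=2)

print(y.reshape(-1, 1).shape)  # (4, 1)
print(y_onehot.shape)          # (4, 2)
```

If the labels are one-hot encoded this way, the output layer would presumably also need two units (for example Dense(2, activation='softmax')) so that predictions and labels have the same shape.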
I'm trying to implement the sliding windows approach and use DNN for the forecasting part. The window length = 24
What I did:
I have x (input) and y (output) in the data set. I kept the "y" value as it is (single array). And on the x-value:
def generate_input(data, sequence_length=1):
    x_data = []
    for i in range(len(data) - sequence_length + 1):
        a = data[i:(i + sequence_length)]
        x_data.append(a)
    return np.array(x_data)
sequence_length = 24
x_train = generate_input(train, sequence_length)
#Shape of X train: (201389, 24)
#Shape of y train: (201412,)
model = Sequential()
model.add(Dense(30,input_shape= (x_train.shape[1],)))
model.add(Dense(20))
model.add(Dropout(0.2))
model.compile(loss="mse", optimizer='rmsprop')
model.summary()
model.fit(x_train, y_train, batch_size=batch_size, epochs=epochs,
          validation_split=0.1)
The error message I'm receiving:
Error when checking target: expected dropout_5 to have shape (20,) but got
array with shape (1,)
One more question, how can I use the same approach for multivariate time series? I want to use sequences as input to predict y.
I changed the slicing part to:
x_data.append(data[i:i+sequence_length])
But I received an error:
cannot copy sequence with size 24 to array axis with dimension 4
model.summary() should show you that the output layer in your model is the Dropout layer, with a shape of (None, 20). That is probably not what you want. It seems that you are trying to predict a single value, so you need to add a Dense(1) layer after it. It is also highly unusual to have dropout as the output layer.
Also, x_train and y_train should have the same shape[0]. Right now they have 201389 and 201412 rows, so y_train has to be trimmed to the 201389 targets that actually correspond to a window.
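A rough sketch of how the windows and targets could be lined up, including the multivariate case from the question. make_windows is a hypothetical helper, and pairing each window with the target at the window's last time step is an assumption; a forecasting setup might pair it with the next step instead:

```python
import numpy as np

def make_windows(data, targets, window):
    """Build sliding windows and keep one target per window.

    Assumption: each window is paired with the target at its last step.
    """
    data = np.asarray(data)
    targets = np.asarray(targets)
    n = len(data) - window + 1
    X = np.stack([data[i:i + window] for i in range(n)])
    y = targets[window - 1:]
    return X, y

# Univariate toy series: 30 steps, window of 24.
series = np.arange(30, dtype="float32")
X_uni, y_uni = make_windows(series, series, window=24)
print(X_uni.shape, y_uni.shape)  # (7, 24) (7,)

# Multivariate toy series: 30 steps, 4 features per step.
multi = np.arange(120, dtype="float32").reshape(30, 4)
X_multi, y_multi = make_windows(multi, series, window=24)
print(X_multi.shape)  # (7, 24, 4)

# A Dense-only model needs a flat feature vector per sample.
X_flat = X_multi.reshape(len(X_multi), -1)
print(X_flat.shape)  # (7, 96)
```

With the real data, this gives 201412 - 24 + 1 = 201389 windows and the same number of targets.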
I am trying to run a neural network on text inputs. This is a binary classification. Here is my working code so far:
df = pd.read_csv(pathname, encoding = "ISO-8859-1")
df = df[['content_cleaned', 'meaningful']] #Content cleaned: text, meaningful: label
X = df['content_cleaned']
y = df['meaningful']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=21)
tokenizer = Tokenizer(num_words=100)
tokenizer.fit_on_texts(X_train)
X_train_encoded = tokenizer.texts_to_sequences(X_train)
X_test_encoded = tokenizer.texts_to_sequences(X_test)
max_len = 100
X_train = pad_sequences(X_train_encoded, maxlen=max_len)
X_test = pad_sequences(X_test_encoded, maxlen=max_len)
batch_size = 100
max_words = 100
input_dim = X_train.shape[1] # Number of features
model = Sequential()
model.add(layers.Dense(10, activation='relu', input_shape=X_train.shape[1:]))
model.add(layers.Dense(1, activation='sigmoid'))
model.compile(loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy'])
history = model.fit(X_train, X_test,
batch_size=batch_size,
epochs=5,
verbose=1,
validation_split=0.1)
My question is two parts. First is with the input_shape when creating the layers. I am confused as to the syntax of declaring this. When running this command:
print(X_train.shape)
I am getting this shape: (3609, 100).
From my understanding, this is telling me that there are 3609 instances. From viewing other examples, my naive assumption was to use the 100 as there are 100 types (may be understanding this incorrectly) corresponding to the max_words that I initialized. I believe that I may have done the syntax incorrectly when initializing the input_shape.
The second question is with an error message when running all of this (most likely with the incorrect input_shape). The error message highlights this line of code:
validation_split=0.1)
The error message is:
ValueError: Error when checking target: expected dense_2 to have shape (None, 1) but got array with shape (1547, 1
Am I going about this problem incorrectly? I am very new to Deep Learning.
The input_shape argument specifies the shape of one training sample. Therefore, you need to set it to X_train.shape[1:] (i.e. leave out the samples, or batch, axis):
model.add(layers.Dense(10, activation='relu', input_shape=X_train.shape[1:]))
Further, pass X_train and y_train to fit (instead of X_train and X_test). The second argument must be the labels, not the test inputs, which is why the target shape check fails.
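To make both points concrete, here is a small sketch using zero-filled arrays with the shapes from the question (the 1547 rows come from the error message, while the 100 columns of X_test are assumed from max_len). per_sample_shape is a hypothetical helper, not part of Keras:

```python
import numpy as np

def per_sample_shape(X, y):
    """Hypothetical check: inputs and targets need the same number of rows.

    Returns the shape of one sample, which is what input_shape expects.
    """
    X = np.asarray(X)
    y = np.asarray(y)
    if X.shape[0] != y.shape[0]:
        raise ValueError(
            f"inputs have {X.shape[0]} samples but targets have {y.shape[0]}")
    return X.shape[1:]

X_train = np.zeros((3609, 100))
y_train = np.zeros(3609)
X_test = np.zeros((1547, 100))

print(per_sample_shape(X_train, y_train))  # (100,)

try:
    per_sample_shape(X_train, X_test)  # the mistake from the question
except ValueError as err:
    print(err)
```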
You are missing two closing parentheses ) on the line where you define the input of your model. Also make sure that you provide an activation function.
Change your code as below:
model.add(layers.Dense(10, activation='relu', input_shape=(X_train.shape[1],)))
EDIT:
For your last error, just change your input_shape to input_shape=(X_train.shape[1],).
I am trying to train an LSTM recurrent neural network for sequence classification.
My data has the following format:
Input: [1,5,2,3,6,2, ...] -> Output: 1
Input: [2,10,4,6,12,4, ...] -> Output: 1
Input: [4,1,7,1,9,2, ...] -> Output: 2
Input: [1,3,5,9,10,20, ...] -> Output: 3
.
.
.
So basically I want to provide a sequence as an input and get an integer as an output.
Each input sequence consists of 2000 floats, and I have about 1485 samples for training.
The output is just an integer from 1 to 10
This is what I tried to do:
# Get the training numpy 2D array for the input (1485X 2000).
# Each element is an input sequence of length 2000
# eg: [ [1,2,3...], [4,5,6...], ... ]
x_train = get_training_x()
# Get the training numpy 2D array for the outputs (1485 X 1).
# Each element is an integer output for the corresponding input from x_train
# eg: [ 1, 2, 3, ...]
y_train = get_training_y()
# Create the model
model = Sequential()
model.add(LSTM(100, input_shape=(x_train.shape)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
print(model.summary())
model.fit(x_train, y_train, nb_epoch=3, batch_size=64)
I get the following error:
Error when checking input: expected lstm_1_input to have 3 dimensions, but got array with shape (1485, 2000)
I tried using this instead:
model.add(LSTM(100, input_shape=(1485, 1, 2000)))
But I got another error this time:
ValueError: Input 0 is incompatible with layer lstm_1: expected ndim=3, found ndim=4
Can anyone explain what my input shape should be, and what I am doing wrong?
Thanks
Try reshaping your training data to:
x_train=x_train.reshape(x_train.shape[0], 1, x_train.shape[1])
Use input_shape=(x_train.shape[1], 1), where x_train.shape[1] is the length of each sequence and 1 is the number of features per time step. The batch size is not part of input_shape; Keras adds that axis itself.
Then reshape your data with x_train = x_train.reshape(-1, x_train.shape[1], 1).
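The two suggestions above lead to different layouts, so it may help to compare them side by side. This is only a shape sketch on random stand-in data (8 samples instead of 1485), not a recommendation of one layout over the other:

```python
import numpy as np

# Stand-in for the real data: 8 samples, each a sequence of 2000 floats.
x_train = np.random.rand(8, 2000).astype("float32")

# Layout A: 2000 time steps with 1 feature each.
as_long_sequence = x_train.reshape(x_train.shape[0], x_train.shape[1], 1)

# Layout B: 1 time step with 2000 features.
as_single_step = x_train.reshape(x_train.shape[0], 1, x_train.shape[1])

print(as_long_sequence.shape)  # (8, 2000, 1)
print(as_single_step.shape)    # (8, 1, 2000)

# input_shape is the per-sample shape, i.e. everything after the batch axis.
print(as_long_sequence.shape[1:])  # (2000, 1)
print(as_single_step.shape[1:])    # (1, 2000)
```

Layout A lets the LSTM step through the sequence, while layout B hands it the whole sequence as a single step, so the recurrence does very little work.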
Given the format of your input and output, you can use parts of the approach taken by one of the official Keras examples. More specifically, since you are not creating a binary classifier, but rather predicting an integer, you can use one-hot encoding to encode y_train using to_categorical().
# Number of elements in each sample
num_vals = x_train.shape[1]
# Convert all samples in y_train to one-hot encoding
y_train = to_categorical(y_train)
# Get number of possible values for model inputs and outputs
num_x_tokens = np.amax(x_train) + 1
num_y_tokens = y_train.shape[1]
model = Sequential()
model.add(Embedding(num_x_tokens, 100))
model.add(LSTM(100))
model.add(Dense(num_y_tokens, activation='softmax'))
# softmax with categorical_crossentropy is the usual pairing for one-hot, multi-class targets
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(x_train, y_train,
          batch_size=64,
          epochs=3)
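In the code above, num_x_tokens is the largest value that appears in your input samples, plus one (e.g. if you have two samples [1, 7, 2] and [3, 5, 4], the largest value is 7 and num_x_tokens is 8). The extra one is needed because the first argument of Embedding must be larger than the largest index it will see. With numpy you can find the maximum with np.amax(x_train). Note that this only works for integer inputs. Similarly, num_y_tokens is the number of categories you have in y_train.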
After training, you can run predictions using the code below. Using np.argmax effectively reverses to_categorical in this configuration.
model_out = model.predict(x_test)
model_out = np.argmax(model_out, axis=1)
You can import to_categorical using from keras.utils import to_categorical, Embedding using from keras.layers import Embedding, and numpy using import numpy as np.
Also, you don't have to do print(model.summary()). model.summary() is enough to print out the summary.
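One detail worth checking with the labels from the question (integers from 1 to 10): to_categorical sizes its output from the largest label plus one when num_classes is not given, so there is an unused column 0. The NumPy sketch below reproduces that behaviour by hand, as an illustration rather than the Keras implementation, and shows the np.argmax round trip mentioned above:

```python
import numpy as np

y = np.array([1, 2, 3, 10])

# Same sizing rule as to_categorical's default: largest label + 1.
num_classes = int(y.max()) + 1

one_hot = np.zeros((len(y), num_classes), dtype="float32")
one_hot[np.arange(len(y)), y] = 1.0
print(one_hot.shape)  # (4, 11)

# argmax along the class axis recovers the original integer labels.
decoded = np.argmax(one_hot, axis=1)
print(decoded)  # [ 1  2  3 10]
```

Because the decoded values are the original labels, no offset is needed after np.argmax on the model output in this configuration.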
EDIT
If it is the case that the input is of the form [[0.12, 0.31, ...], [0.22, 0.95, ...], ...] (say, generated with x_train = np.random.rand(num_samples, num_vals)) then you can use x_train = np.reshape(x_train, (num_samples, num_vals, 1)) to change the shape of the array to input it into the LSTM layer. The code to train the model in that case would be:
num_samples = x_train.shape[0]
num_vals = x_train.shape[1] # Number of elements in each sample
# Reshape for what LSTM expects
x_train = np.reshape(x_train, (num_samples, num_vals, 1))
y_train = to_categorical(y_train)
# Get number of possible values for model outputs
num_y_tokens = y_train.shape[1]
model = Sequential()
model.add(LSTM(100, input_shape=(num_vals, 1)))
model.add(Dense(num_y_tokens, activation='softmax'))
# softmax with categorical_crossentropy is the usual pairing for one-hot, multi-class targets
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
model.fit(x_train, y_train,
          batch_size=64,
          epochs=3)
num_vals is the length of each sample array in x_train. np.reshape(x_train, (num_samples, num_vals, 1)) changes each sample from the form [0.12, 0.31, ...] to the form [[0.12], [0.31], ...], which is the shape the LSTM then takes (input_shape=(num_vals, 1)). The extra 1 seems strange in this case, but the LSTM expects each sample to have exactly two dimensions, typically called (timesteps, data_dim), or in this case (num_vals, 1), so the extra dimension has to be added.
To see how else LSTMs are used in Keras you can refer to:
Keras Sequential model guide (has several LSTM examples)
Keras examples (look for *.py files with lstm in their name)
I don't quite understand how to create a simple Sequential model for my data.
The data has the following dimensions:
X_train.shape
(2369, 12)
y_train.shape
(2369,)
X_test.shape
(592, 12)
y_test.shape
(592,)
This is how I create the model:
batch_size = 128
nb_epoch = 20
in_out_neurons = X_train.shape[1]
dimof_middle = 100
model = Sequential()
model.add(Dense(batch_size, batch_input_shape=(None, in_out_neurons)))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(batch_size))
model.add(Activation('relu'))
model.add(Dropout(0.2))
model.add(Dense(in_out_neurons))
model.add(Activation('linear'))
# I am solving the regression problem, not the classification one
model.compile(loss="mean_squared_error", optimizer="rmsprop")
history = model.fit(X_train, y_train,
batch_size=batch_size, nb_epoch=nb_epoch,
verbose=1, validation_data=(X_test, y_test))
The error message:
Exception: Error when checking model input: expected dense_input_14 to
have shape (None, 1) but got array with shape (2369, 12)
The error is:
Error when checking model target: expected activation_42 to have shape
(None, 12) but got array with shape (2369, 1)
This error occurs at line:
model.add(Dense(in_out_neurons))
How to change Dense to make it work?
Another question is how to add a simple autoencoder in order to initialize weights of ANN?
One of your problems is that you seem to misunderstand what a batch is.
The batch size is the number of training samples processed at a time, so instead of computing one training sample from X_train at a time you use, for example, 100 at a time. The important bit here is that this has nothing to do with the layer sizes of your model.
So when you write
model.add(Dense(batch_size, batch_input_shape=(None, in_out_neurons)))
then you create a fully connected layer with an output size of one batch. That does not make a lot of sense.
Another problem is that your model's output is 12 neurons while your Y is only one value/neuron. Your model looks like this:
|
v
[128]
[128]
[ 12]
|
v
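fit() then feeds a matrix of shape (128, 12), i.e. (batch size, X_train.shape[1]), into the model and tries to compare the last layer's output of shape (128, 12) with the corresponding Y values of the batch, which have shape (128, 1). Those shapes do not match, which is what the second error message reports. Since each sample has a single target value, the last layer should be Dense(1).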
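The shape bookkeeping can be traced with plain NumPy. The dense function below is a toy stand-in for a Dense layer (random weights, no bias or activation), and the hidden width of 100 borrows dimof_middle from the question; both are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def dense(x, units):
    """Toy stand-in for a Dense layer: random weights, no bias or activation."""
    w = rng.normal(size=(x.shape[-1], units))
    return x @ w

batch_size = 128   # how many samples are processed at a time
n_features = 12    # X_train.shape[1]
dimof_middle = 100 # hidden width, independent of batch_size

batch = rng.normal(size=(batch_size, n_features))
hidden = dense(dense(batch, dimof_middle), dimof_middle)

out_wrong = dense(hidden, n_features)  # last layer as in the question
out_right = dense(hidden, 1)           # one regression value per sample

print(hidden.shape)     # (128, 100)
print(out_wrong.shape)  # (128, 12)
print(out_right.shape)  # (128, 1)
```

Only the last variant matches targets of shape (128, 1), and changing batch_size alters the first axis without touching any layer size.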