I built a Keras classification model. My inputs have different lengths, so I'm using train_on_batch, but I'm getting this error:
ValueError: Shapes (1, 5) and (1, 36329, 5) are incompatible
Each input is a set of 2D points.
X_train.shape >>> (2680,)
X_train[0].shape >>> (36329, 2)
X_train[5].shape >>> (40233, 2)
For the output shape :
y_train.shape >>> (2680, 5)
# y_train[0] >>> array([0, 0, 0, 1, 0])
The full code:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(array, classes_bi, test_size=0.33, random_state=69)
from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(10000, activation='LeakyReLU'))
model.add(Dense(1000, activation='LeakyReLU'))
model.add(Dense(100, activation='LeakyReLU'))
model.add(Dense(5, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
import numpy as np
epochs = 5
for epoch in range(epochs):
    for diag, output in zip(X_train, y_train):
        diag = np.expand_dims(diag, axis=0)      # add the batch dimension (batch size = 1)
        output = np.expand_dims(output, axis=0)  # add the batch dimension (batch size = 1)
        # print(diag.shape)   >>> (1, 36329, 2)
        # print(output.shape) >>> (1, 5)
        model.train_on_batch(diag, output)
Error :
ValueError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_9248/1106303834.py in <module>
6 print(diag.shape)
7 print(output.shape)
----> 8 model.train_on_batch(diag,output)
...
ValueError: in user code:
...
ValueError: Shapes (1, 5) and (1, 36329, 5) are incompatible
I tried expanding the dimensions of output twice to get a shape of (1, 1, 5) to match (1, 36329, 5), but it didn't work.
For the model you are building, the dimensions of your training data need to be constant: they cannot vary from one training example to the next.
When you create a model with Sequential(), its input shape is fixed the first time you train it by calling model.fit or model.train_on_batch.
For example, if your first training batch has shape (36329, 2), your model will assume that each training example has shape (2,) and that this particular batch contains 36329 training examples, so you need 36329 labels.
The batch size can change from batch to batch, but the per-example shape (2,) must stay the same.
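To see where the (1, 36329, 5) in the error message comes from, here is a minimal NumPy sketch of what a Dense layer does to an input of shape (1, 36329, 2). The weights W and b are hypothetical stand-ins, not the real model's parameters, and the hidden layers are left out because they treat the axes the same way.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for one example after np.expand_dims: (1, 36329, 2)
diag = rng.random((1, 36329, 2))

# Hypothetical weights for a Dense(5) layer that receives 2 features.
W = rng.random((2, 5))
b = np.zeros(5)

# A Dense layer only transforms the last axis; all other axes are kept.
out = diag @ W + b
print(out.shape)  # (1, 36329, 5)
```

The output keeps the 36329 axis, so a target of shape (1, 5) cannot be matched against it.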
This might be your case:
I think your problem is that each batch of 2D points contains thousands of examples but only one label.
If each of the 36329 training examples in X_train[0] corresponds to the same label y_train[0] >>> array([0, 0, 0, 1, 0]), all you need to do is broadcast the label so that there is one copy of it per training example in the input.
for epoch in range(5):
    for diag, output in zip(X_train, y_train):
        training_examples = diag.shape[0]
        broadcast_arr = np.ones((training_examples, 1))
        output = output * broadcast_arr
        model.train_on_batch(diag, output)
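As a quick check of the broadcast, here is a small sketch that uses 4 points in place of 36329. The name n_points is only for illustration, and np.tile is shown as one equivalent way to repeat the label.

```python
import numpy as np

output = np.array([0, 0, 0, 1, 0])  # one label, shape (5,)
n_points = 4                        # small stand-in for diag.shape[0]

# Multiplying by a column of ones repeats the label once per point.
labels_a = output * np.ones((n_points, 1))

# np.tile gives the same values and keeps the integer dtype.
labels_b = np.tile(output, (n_points, 1))

print(labels_a.shape)  # (4, 5)
```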
I am writing a neural network to take the Mel frequency coefficients as inputs and then run the model. My dataset contains 100 samples - each sample is an array of 12 values corresponding to the coefficients. After splitting this data into train and test sets, I have created the X input corresponding to the array and the y input corresponding to the label.
Data array containing the coefficients
Here is a small sample of my data containing 5 elements in the X_train array:
['[107.59366 -14.153783 24.799461 -8.244417 20.95272\n -4.375943 12.77285 -0.92922235 3.9418116 7.3581047\n -0.30066165 5.441765 ]'
'[ 96.49664 2.0689797 21.557552 -32.827045 7.348135 -23.513977\n 7.9406714 -16.218931 10.594619 -21.4381 0.5903044 -10.569035 ]'
'[105.98041 -2.0483367 12.276348 -27.334534 6.8239 -23.019623\n 7.5176797 -21.884727 11.349695 -22.734652 3.0335162 -11.142375 ]'
'[ 7.73094559e+01 1.91073620e+00 6.72225571e+00 -2.74525508e-02\n 6.60858107e+00 5.99264860e-01 1.96265772e-01 -3.94772577e+00\n 7.46383286e+00 5.42239428e+00 1.21432066e-01 2.44894314e+00]']
When I create the Neural network, I want to use the 12 coefficients as an input for the network. In order to do this, I need to use each row of my X_train dataset that contains these arrays as the input. However, when I try to consider the array index as an input it gives me shape errors when trying to fit the model. My model is as follows:
def build_model_graph():
    model = Sequential()
    model.add(Input(shape=(12,)))
    model.add(Dense(12))
    model.add(Activation('relu'))
    model.add(Dense(10))
    model.add(Activation('relu'))
    model.add(Dense(num_labels))
    model.add(Activation('softmax'))
    # Compile the model
    model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
    return model
Here, I want to use every row of the X_train array as an input which would correspond to the shape(12,). When I use something like this:
num_epochs = 50
num_batch_size = 32
model.fit(x_train, y_train, batch_size=num_batch_size, epochs=num_epochs,
validation_data=(x_test, y_test), verbose=1)
I get an error for the shape which makes sense to me.
For reference, the error is as follows:
ValueError: Exception encountered when calling layer "sequential_20" (type Sequential).
Input 0 of layer "dense_54" is incompatible with the layer: expected min_ndim=2, found ndim=1. Full shape received: (None,)
But I am not exactly sure how I can extract the array of 12 coefficients present at each index of the X_train and then use it in the model input. Indexing the x_train and y_train did not work either. If anyone could point me in a relevant direction, it would be extremely helpful. Thanks!
Edit: My code for the dataframe is as follows:
clapdf = pd.read_csv("clapsdf.csv")
clapdf.drop('Unnamed: 0', inplace=True, axis=1)
clapdf.head()
nonclapdf = pd.read_csv("nonclapsdf.csv")
nonclapdf.drop('Unnamed: 0', inplace=True, axis=1)
sound_df = clapdf.append(nonclapdf)
sound_df.head()
d=sound_data.tolist()
df=pd.DataFrame(data=d)
data = df[0].to_numpy()
print("Before-->", data.shape)
dat = np.array([np.array(d) for d in data])
print('After-->', dat.shape)
Here, the shape remains the same, because the values of each of the 80 samples are not in a comma-separated format but in the form of a series.
If your data looks like this:
samples = 2
features = 12
x_train = tf.random.normal((samples, 1, features))
tf.Tensor(
[[[-2.5988803 -0.629626 -0.8306641 -0.78226614 0.88989156
-0.3851106 -0.66053045 1.0571191 -0.59061646 -1.1602987
0.69124466 -0.04354193]]
[[-0.86917496 2.2923143 -0.05498986 -0.09578358 0.85037625
-0.54679644 -1.2213608 -1.3766612 0.35416105 -0.57801914
-0.3699728 0.7884727 ]]], shape=(2, 1, 12), dtype=float32)
You will have to reshape it to (2, 12) in order to fit your model with the input shape (batch_size, 12):
import tensorflow as tf
def build_model_graph():
    model = tf.keras.Sequential()
    model.add(tf.keras.layers.Input(shape=(12,)))
    model.add(tf.keras.layers.Dense(12))
    model.add(tf.keras.layers.Activation('relu'))
    model.add(tf.keras.layers.Dense(10))
    model.add(tf.keras.layers.Activation('relu'))
    model.add(tf.keras.layers.Dense(2))
    model.add(tf.keras.layers.Activation('softmax'))
    # Compile the model
    model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam')
    return model
model = build_model_graph()
samples = 2
features = 12
x_train = tf.random.normal((samples, 1, features))
x_train = tf.reshape(x_train, (samples, features))
y = tf.random.uniform((samples, 1), maxval=2, dtype=tf.int32)
y_train = tf.keras.utils.to_categorical(y, 2)
model.fit(x_train, y_train, batch_size=1, epochs=2)
Also, you usually need to convert your labels to one-hot encoded vectors if you plan to use categorical_crossentropy.
y_train looks like this:
[[0. 1.]
[1. 0.]]
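For reference, the same one-hot encoding can be sketched in plain NumPy. The helper one_hot is a hypothetical name for illustration, not part of Keras, and it assumes the labels are integers from 0 to num_classes - 1.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows of length num_classes."""
    labels = np.asarray(labels).ravel()  # (samples, 1) -> (samples,)
    # Row i of the identity matrix is the one-hot vector for class i.
    return np.eye(num_classes, dtype=np.float32)[labels]

y = np.array([[1], [0]])
y_one_hot = one_hot(y, 2)
print(y_one_hot)
```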
Update 1:
If your data is coming from a dataframe, try something like this:
import numpy as np
import pandas as pd
d = {'features': [[0.18525402, 0.92130125, 0.2296906, 0.75818471, 0.69813222, 0.47147329,
0.03560711, 0.06583931, 0.90921289, 0.76002148, 0.50413995, 0.36099004],
[0.18525402, 0.92130125, 0.2296906, 0.75818471, 0.69813222, 0.47147329,
0.03560711, 0.06583931, 0.90921289, 0.76002148, 0.50413995, 0.36099004]]}
df = pd.DataFrame(data=d)
data = df['features'].to_numpy()
print('Before -->', data.shape)
data = np.array([np.array(d) for d in data])
print('After -->', data.shape)
Before --> (2,)
After --> (2, 12)
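If each entry is a string such as '[107.59366 -14.153783 ...]' rather than a list, as the sample in the question suggests, the conversion above leaves the shape at (n,). A minimal sketch under that assumption parses each string first. The helper parse_row is a hypothetical name, and the two rows are copied from the sample in the question.

```python
import numpy as np

def parse_row(text):
    """Parse a string like '[1.0 2.0\n 3.0]' into a 1D float array."""
    # Drop the surrounding brackets, then split on any whitespace (including newlines).
    tokens = text.strip().strip('[]').split()
    return np.array([float(t) for t in tokens], dtype=np.float32)

rows = [
    '[107.59366 -14.153783 24.799461 -8.244417 20.95272\n -4.375943 12.77285 -0.92922235 3.9418116 7.3581047\n -0.30066165 5.441765 ]',
    '[ 7.73094559e+01 1.91073620e+00 6.72225571e+00 -2.74525508e-02\n 6.60858107e+00 5.99264860e-01 1.96265772e-01 -3.94772577e+00\n 7.46383286e+00 5.42239428e+00 1.21432066e-01 2.44894314e+00]',
]

# Stack the parsed rows into a 2D array the model can consume.
x = np.stack([parse_row(r) for r in rows])
print(x.shape)  # (2, 12)
```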
I am facing some problems in training the following GRU model, which has to be stateful and output the hidden state.
import numpy as np
import tensorflow as tf #2.1.0
from tensorflow import keras
BATCH_SIZE = 1
nfeatures = 3
history = 30 # shapes input array
horizon = 5 # shapes output array
nodes = 32
input_layer = tf.keras.layers.Input(batch_shape=(1,30,3),name="INPUT")
output, state_h = tf.keras.layers.GRU(nodes,
return_sequences=True,
stateful=True,
return_state=True,
batch_input_shape=(1,history,3), name='GRU1')(input_layer)
output_layer = tf.keras.layers.GRU(nodes, activation='tanh', name='GRU2')(output, state_h)
output_dense = tf.keras.layers.Dense(5, name='DENSE')(output_layer)
model = tf.keras.Model(input_layer, [output_dense, state_h])
model.compile(optimizer=tf.keras.optimizers.Adam(clipvalue=2.0),
loss='mse',
metrics=['mean_absolute_error', 'mean_squared_error'])
As I need the model to output the hidden state, I do not use a Sequential model. (I had no problems training a stateful sequential model.)
The features fed to network are of shape np.shape(x)=(30,3) and the target np.shape(y)=(5,).
If I call model.predict(x), where x is a numpy array with the shape mentioned above, it throws an error, as expected, because the input shape doesn't match the expected input. Therefore, I reshape the input array to have an input shape of (1,30,3) by calling np.expand_dims(x,axis=0). After that, it works fine, i.e. I get an output.
The issues I am facing are when I try to train the model. Calling
model.fit(x, y,epochs=1,steps_per_epoch=STEPS_PER_EPOCH)
throws the same error, about the shape of the data
ValueError: Error when checking input: expected input to have 3 dimensions, but got array with shape (30, 3)
Reshaping the data as I did for the prediction didn't help:
model.fit(np.expand_dims(x,axis=0), np.expand_dims(y,axis=0),epochs=1,steps_per_epoch=STEPS_PER_EPOCH)
ValueError: The number of samples 1 is not divisible by steps 30. Please change the number of steps to a value that can consume all the samples.
This was a new error. Setting steps_per_epoch=1 threw yet another one:
ValueError: Error when checking model target: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 2 array(s), for inputs ['DENSE', 'GRU1'] but instead got the following list of 1 arrays: [array([[0.5124772 , 0.51047856, 0.509669 , 0.50830126, 0.5070507 ]],
dtype=float32)]...
Is the format of my data wrong or is the architecture of my layers missing something? I tried adding a Flatten layer after the input, but it didn't make much sense (in my head) and it didn't work either.
Thanks in advance.
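The problem here is that the number of nodes should be equal to the output shape. The model has two outputs (DENSE and the hidden state of GRU1), and the single target y of shape (1, 5) is compared against both, as the DENSE_loss and GRU1_loss entries in the training output below show. Changing the value of nodes from 32 to 5, along with other minor changes, fixes the error.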
Complete working code is shown below:
import numpy as np
import tensorflow as tf #2.1.0
from tensorflow import keras
BATCH_SIZE = 1
nfeatures = 3
history = 30 # shapes input array
horizon = 5 # shapes output array
nodes = 5
x = np.ones(shape = (30,3))
x = np.expand_dims(x, axis = 0)
y = np.ones(shape = (5,))
y = np.expand_dims(y, axis = 0)
print(x.shape) #(1, 30, 3)
print(y.shape) #(1, 5)
input_layer = tf.keras.layers.Input(batch_shape=(1,30,3),name="INPUT")
output, state_h = tf.keras.layers.GRU(nodes,
return_sequences=True,
stateful=True,
return_state=True,
batch_input_shape=(1,history,3), name='GRU1')(input_layer)
output_layer = tf.keras.layers.GRU(nodes, activation='tanh', name='GRU2')(output, state_h)
output_dense = tf.keras.layers.Dense(5, name='DENSE')(output_layer)
model = tf.keras.Model(input_layer, [output_dense, state_h])
model.compile(optimizer=tf.keras.optimizers.Adam(clipvalue=2.0),
loss='mse',
metrics=['mean_absolute_error', 'mean_squared_error'])
STEPS_PER_EPOCH = 1
model.fit(x, y,epochs=1,steps_per_epoch=STEPS_PER_EPOCH)
Output of the above code is:
(1, 30, 3)
(1, 5)
1/1 [==============================] - 0s 3ms/step - loss: 1.8172 - DENSE_loss: 1.1737 - GRU1_loss: 0.6435 - DENSE_mean_absolute_error: 1.0498 - DENSE_mean_squared_error: 1.1737 - GRU1_mean_absolute_error: 0.7157 - GRU1_mean_squared_error: 0.6435
<tensorflow.python.keras.callbacks.History at 0x7f698bf8ac50>
Hope this helps. Happy Learning!
I compared the results, obtained via model.evaluate(...) and the ones via numpy. As you can see, they differ a lot. The kernel has just been restarted. Cannot find where the problem is.
import numpy as np
import keras
from keras.layers import Dense
from keras.models import Sequential
import keras.backend as K
X = np.random.rand(10000)
Y = X + np.random.rand(10000) / 5
X_train, X_valid = X[:8000], X[8000:]
Y_train, Y_valid = Y[:8000], Y[8000:]
model = Sequential([
Dense(1, input_shape=(1,), activation='linear'),
])
model.compile('adam', 'mae')
model.fit(X_train, Y_train, epochs=1, batch_size=2000, validation_data=(X_valid, Y_valid))
print(model.evaluate(X_valid, Y_valid))
>>> 0.15643194556236267
preds = model.predict(X_valid)
np.abs(Y_valid - preds).mean()
>>> 0.34461398701699736
Versions: keras = '2.3.1', tensorflow = '2.1.0'.
It's because the output shape of model.predict is not the same as the shape of Y_valid. If you transpose the predictions, you will get almost the same loss.
>>> Y_valid.shape
(2000,)
>>> preds.shape
(2000, 1)
>>> np.abs(Y_valid - np.transpose(preds)).mean()
This is a tricky one, but actually simple to fix:
Your targets Y_valid have shape (2000,), i.e. just an array of 2000 numbers. The network outputs however, have shape (2000, 1). The expression Y_valid - preds then tries to subtract a shape (2000, 1) from a shape (2000,)... The two are not compatible, and need to be broadcast. Standard broadcasting rules will proceed as follows:
1. Align the shapes on the right:
(      2000,)
(2000,    1)
2. Add an extra dimension in front:
(1,    2000)
(2000,    1)
3. Broadcast to make them compatible:
(2000, 2000)
(2000, 2000)
...and so you are actually subtracting two arrays of size (2000, 2000) from each other. You are basically computing the difference between each prediction and all targets instead of just the corresponding one. Obviously, the mean of this will be much larger.
tl;dr: model.evaluate is correct. The manual computation is incorrect due to funny broadcasting. You can fix it by reshaping the predictions to (2000,) (or the targets to (2000, 1)):
preds = model.predict(X_valid)[:, 0]
np.abs(Y_valid - preds).mean()
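The effect can be reproduced without a model. In this sketch the "predictions" are synthetic: each one is simply its target plus 0.1, so the true mean absolute error is 0.1 by construction.

```python
import numpy as np

rng = np.random.default_rng(0)
y_true = rng.random(2000)             # shape (2000,)
preds = y_true.reshape(-1, 1) + 0.1   # shape (2000, 1), like model.predict output

# Mismatched shapes broadcast to a full pairwise difference matrix.
pairwise = y_true - preds
print(pairwise.shape)  # (2000, 2000)

# Dropping the trailing axis compares each prediction with its own target.
elementwise = y_true - preds[:, 0]
print(elementwise.shape)  # (2000,)

wrong = np.abs(pairwise).mean()      # averages over all 2000 * 2000 pairs
right = np.abs(elementwise).mean()   # the intended mean absolute error
print(wrong, right)
```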
I'm attempting to follow along on what I'm thinking is the 5th or 6th simple introductory tutorial for keras that almost but never quite works.
Stripping everything out, I appear to come down to a problem with the format of my input. I read in an array of images, and extract two types, images of sign language ones and images of sign language zeros. I then set up an array of ones and zeros to correspond to what the images actually are, then make sure of sizes and types.
import numpy as np
from subprocess import check_output
print(check_output(["ls", "../data/keras/"]).decode("utf8"))
## load dataset of images of sign language numbers
x = np.load('../data/keras/npy_dataset/X.npy')
# Get the zeros and ones, construct a list of known values (Y)
X = np.concatenate((x[204:409], x[822:1027] ), axis=0) # from 0 to 204 is zero sign and from 205 to 410 is one sign
Y = np.concatenate((np.zeros(205), np.ones(205)), axis=0).reshape(X.shape[0],1)
# test shape and type
print("X shape: " , X.shape)
print("X class: " , type(X))
print("Y shape: " , Y.shape)
print("Y type: " , type(Y))
This gives me:
X shape: (410, 64, 64)
X class: <class 'numpy.ndarray'>
Y shape: (410, 1)
Y type: <class 'numpy.ndarray'>
which is all good. I then load the relevant bits from Keras, using Tensorflow as the backend and try to construct a classifier.
# get the relevant keras bits.
from keras.models import Sequential
from keras.layers import Convolution2D
# construct a classifier
classifier = Sequential() # initialize neural network
classifier.add(Convolution2D(32, (3, 3), input_shape=(410, 64, 64), activation="relu", data_format="channels_last"))
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
classifier.fit(X, Y, batch_size=32, epochs=10, verbose=1)
This results in:
ValueError: Error when checking input: expected conv2d_1_input to have 4 dimensions, but got array with shape (410, 64, 64)
This SO question, I think, suggests that my input shape needs to be altered to have a 4th dimension added to it. It also says that it's the output shape that needs to be altered, but I haven't been able to find anywhere to specify an output shape, so I'm assuming it means that I should alter the input shape to input_shape=(1, 64, 64, 1).
If I change my input shape, however, then I immediately get this:
ValueError: Input 0 is incompatible with layer conv2d_1: expected ndim=4, found ndim=5
Which this github issue suggests is because I no longer need to specify the number of samples. So I'm left with the situation of using one input shape and getting one error, or changing it and getting another error.
Reading this and this made me think I might need to reshape my data to include information about the channels in X, but if I add in
X = X.reshape(X.shape[0], 64, 64, 1)
print(X.shape)
Then I get
ValueError: Error when checking target: expected conv2d_1 to have 4 dimensions, but got array with shape (410, 1)
If I change the reshape to anything else, i.e.
X = X.reshape(X.shape[0], 64, 64, 2)
Then I get a message saying it's unable to reshape the data, so I'm obviously doing something wrong with that, if that is, indeed, the problem.
I have read the suggested Conv2d docs which shed exactly zero light on the matter for me. Anyone else able to?
At first I used the following data sets (similar to your case):
import numpy as np
import keras
X = np.random.randint(256, size=(410, 64, 64))
Y = np.random.randint(10, size=(410, 1))
x_train = X[:, :, :, np.newaxis]
y_train = keras.utils.to_categorical(Y, num_classes=10)
And then modified your code as follows to work:
from keras.models import Sequential
from keras.layers import Convolution2D, Flatten, Dense
classifier = Sequential() # initialize neural network
classifier.add(Convolution2D(32, (3, 3), input_shape=(64, 64, 1), activation="relu", data_format="channels_last"))
classifier.add(Flatten())
classifier.add(Dense(10, activation='softmax'))
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
classifier.fit(x_train, y_train, batch_size=32, epochs=10, verbose=1)
Changed the shape of X from 410 x 64 x 64 to 410 x 64 x 64 x 1 (adding a channel dimension of size 1).
input_shape should be the shape of a single sample, that is, 64 x 64 x 1, without the number of samples.
Changed Y to a one-hot encoding using keras.utils.to_categorical() (with num_classes=10).
Added Flatten() and Dense() before compiling, because categorical_crossentropy needs a 2D output of shape (batch_size, num_classes) to compare with the one-hot targets.
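The shape changes can be checked in NumPy alone. This sketch uses a zero array in place of the images, and it also shows why the reshape to (410, 64, 64, 2) from the question fails: a reshape has to keep the total number of elements, and that target shape would need twice as many.

```python
import numpy as np

X = np.zeros((410, 64, 64))  # stand-in for the grayscale images

# Add a trailing channel axis of size 1; the element count is unchanged.
x_train = X[..., np.newaxis]
print(x_train.shape)  # (410, 64, 64, 1)

# Asking for 2 channels would double the element count, so NumPy refuses.
try:
    X.reshape(X.shape[0], 64, 64, 2)
    reshape_ok = True
except ValueError:
    reshape_ok = False
print(reshape_ok)  # False
```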
I have inputs that look like this:
[
[1, 2, 3]
[4, 5, 6]
[7, 8, 9]
...]
of shape (1, num_samples, num_features), and labels that look like this:
[
[0, 1]
[1, 0]
[1, 0]
...]
of shape (1, num_samples, 2).
However, when I try to run the following Keras code, I get this error:
ValueError: Error when checking model target: expected dense_1 to have 2 dimensions, but got array with shape (1, 8038, 2).
From what I've read, this appears to stem from the fact that my labels are 2D, and not simply integers. Is this correct, and if so, how can I use one-hot labels with Keras?
Here's the code:
num_features = 463
trX = np.random.rand(8038, num_features)
trY = # one-hot array of shape (8038, 2) as described above
def keras_builder(): # generator to build the inputs
    while True:
        x = np.reshape(trX, (1,) + np.shape(trX))
        y = np.reshape(trY, (1,) + np.shape(trY))
        print(np.shape(x)) # (1, 8038, 463)
        print(np.shape(y)) # (1, 8038, 2)
        yield x, y
model = Sequential()
model.add(LSTM(100, input_dim = num_features))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit_generator(keras_builder(), samples_per_epoch = 1, nb_epoch=3, verbose = 2, nb_worker = 1)
Which promptly throws the error above:
Traceback (most recent call last):
File "file.py", line 35, in <module>
model.fit_generator(keras_builder(), samples_per_epoch = 1, nb_epoch=3, verbose = 2, nb_worker = 1)
...
ValueError: Error when checking model target: expected dense_1 to have 2 dimensions, but got array with shape (1, 8038, 2)
Thank you!
There are a lot of things that do not add up.
I assume that you are trying to solve a sequential classification task, i.e. your data is shaped as (<batch size>, <sequence length>, <feature length>).
In your batch generator you create a batch consisting of one sequence of length 8038 and 463 features per sequence element. You create a matching Y batch to compare against, consisting of one sequence with 8038 elements, each of size 2.
Your problem is that Y does not match up with the output of the last layer. Your Y is 3-dimensional while the output of your model is only 2-dimensional: Y.shape = (1, 8038, 2) does not match dense_1.shape = (1,1). This explains the error message you get.
The solution to this: you need to enable return_sequences=True in the LSTM layer to return a sequence instead of only the last element (effectively removing the time-dimension). This would give an output shape of (1, 8038, 100) at the LSTM layer. Since the Dense layer is not able to handle sequential data you need to apply it to each sequence element individually which is done by wrapping it in a TimeDistributed wrapper. This then gives your model the output shape (1, 8038, 1).
Your model should look like this:
from keras.layers.wrappers import TimeDistributed
model = Sequential()
model.add(LSTM(100, input_dim=num_features, return_sequences=True))
model.add(TimeDistributed(Dense(1, activation='sigmoid')))
This can be easily spotted when examining the summary of the model:
print(model.summary())
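The shapes discussed above can also be traced with a rough NumPy sketch. Here lstm_out is a random stand-in for what the LSTM would return with return_sequences=True, and W and b are hypothetical weights for Dense(1); none of this runs the actual Keras layers.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the LSTM output with return_sequences=True: (batch, time, units)
lstm_out = rng.random((1, 8038, 100))

# Hypothetical weights of a Dense(1) layer.
W = rng.random((100, 1))
b = np.zeros(1)

# Without return_sequences, only the last time step is passed on.
last_step = lstm_out[:, -1, :]    # (1, 100)
out_last = last_step @ W + b      # (1, 1)

# Applying the same Dense to every time step keeps the time axis.
out_seq = lstm_out @ W + b        # (1, 8038, 1)

print(out_last.shape, out_seq.shape)
```

Note that the last axis of the output is still 1 while the labels are one-hot with width 2. To compare against labels of shape (1, 8038, 2), the Dense layer would presumably need 2 units, or the labels would need to be reduced to a single 0/1 column.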