Loss function in Keras does not match analogical function - python

I compared the results obtained via model.evaluate(...) with the same metric computed via numpy. As you can see, they differ a lot. The kernel has just been restarted, and I cannot find where the problem is.
import numpy as np
import keras
from keras.layers import Dense
from keras.models import Sequential
import keras.backend as K
X = np.random.rand(10000)
Y = X + np.random.rand(10000) / 5
X_train, X_valid = X[:8000], X[8000:]
Y_train, Y_valid = Y[:8000], Y[8000:]
model = Sequential([
Dense(1, input_shape=(1,), activation='linear'),
])
model.compile('adam', 'mae')
model.fit(X_train, Y_train, epochs=1, batch_size=2000, validation_data=(X_valid, Y_valid))
print(model.evaluate(X_valid, Y_valid))
>>> 0.15643194556236267
preds = model.predict(X_valid)
np.abs(Y_valid - preds).mean()
>>> 0.34461398701699736
Versions: keras = '2.3.1', tensorflow = '2.1.0'.

It's because the shape of the model.predict output is not the same as the shape of Y_valid. If you take the transpose of the predictions, you get almost the same loss.
>>> Y_valid.shape
(2000,)
>>> preds.shape
(2000, 1)
>>> np.abs(Y_valid - np.transpose(preds)).mean()

This is a tricky one, but actually simple to fix:
Your targets Y_valid have shape (2000,), i.e. just an array of 2000 numbers. The network outputs, however, have shape (2000, 1). The expression Y_valid - preds then subtracts a shape-(2000, 1) array from a shape-(2000,) array. The two shapes are not identical, so NumPy broadcasts them. Standard broadcasting rules proceed as follows:
1. Align the shapes to the right:
(2000,)
(2000, 1)
2. Prepend an extra dimension to the shorter shape:
(1, 2000)
(2000, 1)
3. Broadcast each size-1 dimension to make the shapes compatible:
(2000, 2000)
(2000, 2000)
...and so you are actually subtracting two arrays of shape (2000, 2000) from each other. You are basically computing the difference between each prediction and every target, instead of just the corresponding one. Obviously, the mean of this is much larger.
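You can reproduce the blow-up in isolation (a minimal sketch, independent of the model):
import numpy as np
a = np.random.rand(2000)     # same shape as Y_valid: (2000,)
b = np.random.rand(2000, 1)  # same shape as preds: (2000, 1)
print((a - b).shape)         # (2000, 2000): every target minus every prediction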
tl;dr: model.evaluate is correct. The manual computation is incorrect due to funny broadcasting. You can fix it by reshaping the predictions to (2000,) (or the targets to (2000, 1)):
preds = model.predict(X_valid)[:, 0]
np.abs(Y_valid - preds).mean()

Related

Keras, shapes are incompatible

I made a Keras classification model, and I have inputs with different lengths, so I'm using train_on_batch. I'm getting a
ValueError: Shapes (1, 5) and (1, 36329, 5) are incompatible.
Each input is a set of 2D points.
X_train.shape >>> (2680,)
X_train[0].shape >>> (36329, 2)
X_train[5].shape >>> (40233, 2)
For the output shape :
y_train.shape >>> (2680, 5)
# y_train[0] >>> array([0, 0, 0, 1, 0])
The full code:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(array, classes_bi, test_size=0.33, random_state=69)
from keras.models import Sequential
from keras.layers import Dense
model = Sequential()
model.add(Dense(10000, activation='LeakyReLU'))
model.add(Dense(1000, activation='LeakyReLU'))
model.add(Dense(100, activation='LeakyReLU'))
model.add(Dense(5, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
epochs=5
for epoch in range(5):
    for diag, output in zip(X_train, y_train):
        diag = np.expand_dims(diag, axis=0)      # add the batch size = 1
        output = np.expand_dims(output, axis=0)  # add batch size = 1
        # print(diag.shape)   >>> (1, 36329, 2)
        # print(output.shape) >>> (1, 5)
        model.train_on_batch(diag, output)
Error :
ValueError Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_9248/1106303834.py in <module>
6 print(diag.shape)
7 print(output.shape)
----> 8 model.train_on_batch(diag,output)
...
ValueError: in user code:
...
ValueError: Shapes (1, 5) and (1, 36329, 5) are incompatible
I tried to expand the dimension of output twice to get a shape of (1, 1, 5) against the (1, 36329, 5), but it didn't work.
For the model you are building, the dimensions of your training examples need to be constant: they cannot vary from one training example to the next.
When you create a model with Sequential(), the input shape of your model is defined the first time you train it by calling model.fit or model.train_on_batch.
For example, if your first training batch has dimension (36329, 2), your model will assume that each training example has dimension (2,) and that this particular batch contains 36329 training examples, so you need 36329 labels.
The batch size can change from batch to batch, but the dimension (2,) must stay the same.
This might be your case:
I think your problem arises because your batch of 2D points contains thousands of points but only 1 label per batch.
If each of the 36329 training examples in X_train[0] corresponds to the same label y_train[0] >>> array([0, 0, 0, 1, 0]), all you need to do is broadcast the label so it has the same number of rows as the input:
for epoch in range(5):
    for diag, output in zip(X_train, y_train):
        training_examples = diag.shape[0]
        broadcast_arr = np.ones((training_examples, 1))
        output = output * broadcast_arr
        model.train_on_batch(diag, output)
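For clarity, here is what that broadcasting step does to a single label (a standalone sketch; the small n stands in for 36329):
import numpy as np
label = np.array([0, 0, 0, 1, 0])  # one label, shape (5,)
n = 4                              # stands in for the 36329 points
tiled = label * np.ones((n, 1))    # shape (n, 5): the same label repeated per point
print(tiled.shape)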
PS: I wanted to ask a couple of questions by commenting, but I don't have enough reputation to do so, that's why I'm posting this answer as my best understanding of your question.

Create a tensorflow dataset based on a "multi-input"

Problem
Create a tf.data.Dataset object from numpy arrays when the model takes multiple X arrays.
Explanation
This is the model that I'm using (some layers omitted to keep the diagram small):
As you can see, the model takes two different inputs:
The data itself (shape [Batch, 730, 1]) (from now called x_train)
The timestamp (shape [Batch, 730, 3]) (from now called ts_train)
The problem that I'm aiming to solve is a time-series forecast.
The x_train contains a single feature.
The ts_train contains three features that represent the Year, Month, and Day of the measurement.
I can fit/evaluate/predict the model without any particular problem.
Example of fit:
model.fit(
    [x_train, ts_train],
    y_train,
    batch_size=1024,
    epochs=2000,
    validation_data=([x_test, ts_test], y_test),
    callbacks=callbacks,
)
Example of predict:
model.predict([x_test[0].reshape(1, window, 1), ts_test[0].reshape(1, window, 3)])
However, I can't understand how to cast the numpy arrays that make up my dataset into a tensorflow dataset.
Using the following code:
tf.data.Dataset.from_tensor_slices([x_train, ts_train], y_train)
I receive the following error:
ValueError: Can't convert non-rectangular Python sequence to Tensor.
How can I cast my 2 X -> 1 y into a tf.data.Dataset?
Maybe try using tuples like this:
import numpy as np
import tensorflow as tf
x_train = np.random.random((50, 730, 1))
ts_train = np.random.random((50, 730, 3))
y_train = np.random.random((50, 5))
ds = tf.data.Dataset.from_tensor_slices(((x_train, ts_train), y_train))
for (x, t), y in ds.take(1):
    print(x.shape, t.shape, y.shape)
(730, 1) (730, 3) (5,)
And here is an example model:
input1 = tf.keras.layers.Input((730, 1))
input2 = tf.keras.layers.Input((730, 3))
x = tf.keras.layers.Flatten()(input1)
y = tf.keras.layers.Flatten()(input2)
outputs = tf.keras.layers.Concatenate()([x, y])
outputs = tf.keras.layers.Dense(5)(outputs)
model = tf.keras.Model([input1, input2], outputs)
model.compile(optimizer='adam', loss='mse')
model.fit(ds.batch(10), epochs=5)
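If you also need a validation set, the same nested-tuple structure works; a sketch assuming x_test, ts_test and y_test arrays shaped like their training counterparts:
val_ds = tf.data.Dataset.from_tensor_slices(((x_test, ts_test), y_test))
model.fit(ds.batch(10), validation_data=val_ds.batch(10), epochs=5)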

Keras Normalization for a 2d input array

I am new to machine learning and trying to apply it to my problem.
I have a training dataset with 44000 rows of features, each with shape (6, 25). I want to build a sequential model. I was wondering if there is a way to use the features without flattening them. Currently, I flatten the features to a 1d array and normalize them for training (see the code below). I could not find a way to normalize the 2d features.
dataset2d = dataset2d.reshape(dataset2d.shape[0],
dataset2d.shape[1]*dataset2d.shape[2])
normalizer = preprocessing.Normalization()
normalizer.adapt(dataset2d)
print(normalizer.mean.numpy())
x_train, x_test, y_train, y_test = train_test_split(dataset2d, flux_val,
test_size=0.2)
# %% DNN regression multiple parameter
def build_and_compile_model(norm):
    inputs = Input(shape=(x_test.shape[1],))
    x = norm(inputs)
    x = layers.Dense(128, activation="selu")(x)
    x = layers.Dense(64, activation="relu")(x)
    x = layers.Dense(32, activation="relu")(x)
    x = layers.Dense(1, activation="linear")(x)
    model = Model(inputs, x)
    model.compile(loss='mean_squared_error',
                  optimizer=keras.optimizers.Adam(learning_rate=1e-3))
    return model
dnn_model = build_and_compile_model(normalizer)
dnn_model.summary()
# interrupt training when the model is no longer improving
path_checkpoint = "model_checkpoint.h5"
modelckpt_callback = keras.callbacks.ModelCheckpoint(monitor="val_loss",
                                                     filepath=path_checkpoint,
                                                     verbose=1,
                                                     save_weights_only=True,
                                                     save_best_only=True)
es_callback = keras.callbacks.EarlyStopping(monitor="val_loss",
                                            min_delta=0, patience=10)
history = dnn_model.fit(x_train, y_train, validation_split=0.2,
                        epochs=120, callbacks=[es_callback, modelckpt_callback])
I also tried to modify my model's input layer to the following, so that I do not need to reshape my input:
inputs = Input(shape=(x_test.shape[-1], x_test.shape[-2], ))
and to modify the normalization to the following:
normalizer = preprocessing.Normalization(axis=1)
normalizer.adapt(dataset2d)
print(normalizer.mean.numpy())
But this does not seem to help. The normalization adapts to a 1d array of length 6, while I want it to adapt to a 2d array of shape (25, 6).
Sorry for the long question. Your help will be much appreciated.
I'm not sure if I understood your issue. A normalization layer can take an N-D tensor and produce an output with the same shape. For example:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import numpy as np
t = tf.constant(np.arange(2 * 3 * 4).reshape(2, 3, 4), dtype=tf.float32)
tf.print("\n",t)
normalizer_layer = tf.keras.layers.LayerNormalization(axis=1)
output = normalizer_layer(t)
tf.print("\n",output)

Keras sequential model with multiple inputs, Tensorflow 1.9.0

I'm trying to create a neural network that has two inputs, each of a particular size (here four), and one output of the same size (so also four). Unfortunately, I always get this error when running my code:
ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not
the size the model expected. Expected to see 2 array(s), but instead got the following list of 1 arrays:
[array([[[-1.07920336, 1.16782929, 1.40131554, -0.30052492],
[-0.50067655, 0.54517916, -0.87033621, -0.22922157]],
[[-0.53766128, -0.03527806, -0.14637072, 2.32319071],
[ 0...
I think the problem lies in the fact that once I pass the data for training, the input shape is either incorrect or I have a datatype issue; hence the extra list bracket around the array.
I'm using Tensorflow 1.9.0 (due to project restrictions). I already checked the search function and tried the solutions provided here. The following is example code reproducing my error:
import numpy as np
import tensorflow as tf
from tensorflow import keras
import keras.backend as K
from tensorflow.keras import layers, models
def main():
    ip1 = keras.layers.Input(shape=(4,))
    ip2 = keras.layers.Input(shape=(4,))
    dense = layers.Dense(3, activation='sigmoid', input_dim=4)  # Passing the value in a weighted manner
    merge_layer = layers.Concatenate()([ip1, ip2])  # Concatenating the outputs of the first network
    y = layers.Dense(6, activation='sigmoid')(merge_layer)  # Fully connected layers
    y = layers.Dense(4, activation='sigmoid')(y)
    model = keras.Model(inputs=[ip1, ip2], outputs=y)
    model.compile(optimizer='adam',
                  loss='mean_squared_error')
    model.summary()
    # dataset shape: 800 samples, 2 inputs for the model, 4 input size
    X_train = np.random.randn(800, 2, 4)
    y_train = np.random.randn(800, 4)
    X_test = np.random.randn(200, 2, 4)
    y_test = np.random.randn(200, 4)
    history = model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=1000, batch_size=32)

if __name__ == '__main__':
    main()
When there are multiple inputs, Keras expects a list of arrays, one per input. The size of the list corresponds to the number of inputs the model has.
So you basically need to pass a list of 2 arrays, each with shape (N, 4):
X_train1 = np.random.randn(800, 4)
X_train2 = np.random.randn(800, 4)
y_train = np.random.randn(800, 4)
X_test1 = np.random.randn(200, 4)
X_test2 = np.random.randn(200, 4)
y_test = np.random.randn(200, 4)
history = model.fit([X_train1, X_train2], y_train, validation_data=([X_test1, X_test2], y_test), epochs=1000, batch_size=32)
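If your data is already stacked in a single (800, 2, 4) array as in the question, you can equivalently slice the two inputs out of it (a sketch reusing X_train/X_test from the question):
history = model.fit([X_train[:, 0, :], X_train[:, 1, :]], y_train,
                    validation_data=([X_test[:, 0, :], X_test[:, 1, :]], y_test),
                    epochs=1000, batch_size=32)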

Incorrect number of dimensions in Keras input

I'm attempting to follow along with what I think is the 5th or 6th simple introductory tutorial for keras, each of which almost but never quite works.
Stripping everything out, I appear to have come down to a problem with the format of my input. I read in an array of images and extract two types: images of sign language ones and images of sign language zeros. I then set up an array of ones and zeros corresponding to what the images actually are, and check the sizes and types.
import numpy as np
from subprocess import check_output
print(check_output(["ls", "../data/keras/"]).decode("utf8"))
## load dataset of images of sign language numbers
x = np.load('../data/keras/npy_dataset/X.npy')
# Get the zeros and ones, construct a list of known values (Y)
X = np.concatenate((x[204:409], x[822:1027] ), axis=0) # from 0 to 204 is zero sign and from 205 to 410 is one sign
Y = np.concatenate((np.zeros(205), np.ones(205)), axis=0).reshape(X.shape[0],1)
# test shape and type
print("X shape: " , X.shape)
print("X class: " , type(X))
print("Y shape: " , Y.shape)
print("Y type: " , type(Y))
This gives me:
X shape: (410, 64, 64)
X class: <class 'numpy.ndarray'>
Y shape: (410, 1)
Y type: <class 'numpy.ndarray'>
which is all good. I then load the relevant bits from Keras, using Tensorflow as the backend and try to construct a classifier.
# get the relevant keras bits.
from keras.models import Sequential
from keras.layers import Convolution2D
# construct a classifier
classifier = Sequential() # initialize neural network
classifier.add(Convolution2D(32, (3, 3), input_shape=(410, 64, 64), activation="relu", data_format="channels_last"))
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
classifier.fit(X, Y, batch_size=32, epochs=10, verbose=1)
This results in:
ValueError: Error when checking input: expected conv2d_1_input to have 4 dimensions, but got array with shape (410, 64, 64)
This SO question, I think, suggests that my input shape needs to be altered to have a 4th dimension added to it - though it also says it's the output shape that needs to be altered. I haven't been able to find anywhere to specify an output shape, so I'm assuming it means I should alter the input shape to input_shape=(1, 64, 64, 1).
If I change my input shape, however, then I immediately get this:
ValueError: Input 0 is incompatible with layer conv2d_1: expected ndim=4, found ndim=5
Which this github issue suggests is because I no longer need to specify the number of samples. So I'm left with the situation of using one input shape and getting one error, or changing it and getting another error.
Reading this and this made me think I might need to reshape my data to include information about the channels in X, but if I add in
X = X.reshape(X.shape[0], 64, 64, 1)
print(X.shape)
Then I get
ValueError: Error when checking target: expected conv2d_1 to have 4 dimensions, but got array with shape (410, 1)
If I change the reshape to anything else, e.g.
X = X.reshape(X.shape[0], 64, 64, 2)
Then I get a message saying it's unable to reshape the data, so I'm obviously doing something wrong with that, if that is, indeed, the problem.
I have read the suggested Conv2D docs, which shed exactly zero light on the matter for me. Is anyone else able to?
At first I used the following data sets (similar to your case):
import numpy as np
import keras
X = np.random.randint(256, size=(410, 64, 64))
Y = np.random.randint(10, size=(410, 1))
x_train = X[:, :, :, np.newaxis]
y_train = keras.utils.to_categorical(Y, num_classes=10)
And then modified your code as follows to make it work:
from keras.models import Sequential
from keras.layers import Convolution2D, Flatten, Dense
classifier = Sequential() # initialize neural network
classifier.add(Convolution2D(32, (3, 3), input_shape=(64, 64, 1), activation="relu", data_format="channels_last"))
classifier.add(Flatten())
classifier.add(Dense(10, activation='softmax'))
classifier.compile(optimizer = 'adam', loss = 'categorical_crossentropy', metrics = ['accuracy'])
classifier.fit(x_train, y_train, batch_size=32, epochs=10, verbose=1)
1. Changed the shape of X from 410 x 64 x 64 to 410 x 64 x 64 x 1 (a single channel).
2. input_shape should be the shape of a single sample, that is, 64 x 64 x 1.
3. Changed the shape of Y using keras.utils.to_categorical() (one-hot encoding with num_classes=10).
4. Added Flatten() and Dense() before compiling, because categorical_crossentropy needs the model to output one probability per class.
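For the original zero/one labels specifically, the same pattern works with two classes instead of ten; a minimal sketch (random pixels standing in for the sign images):
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Convolution2D, Flatten, Dense
X = np.random.randint(256, size=(410, 64, 64))          # stand-in for the images
Y = np.concatenate((np.zeros(205), np.ones(205)))       # 205 zeros, 205 ones
x_train = X[:, :, :, np.newaxis]                        # add the channel dimension
y_train = keras.utils.to_categorical(Y, num_classes=2)  # shape (410, 2)
classifier = Sequential()
classifier.add(Convolution2D(32, (3, 3), input_shape=(64, 64, 1), activation='relu'))
classifier.add(Flatten())
classifier.add(Dense(2, activation='softmax'))
classifier.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
classifier.fit(x_train, y_train, batch_size=32, epochs=10, verbose=1)
(Equivalently, a single sigmoid output unit with binary_crossentropy works for a two-class problem.)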
