Using KFold cross validation to get MAE for each data split - python

I want to get mean absolute error (MAE) for each split of data using 5-fold cross validation. I have built a custom model using Xception.
Hence, to try this, I coded the following:
# Data Generators:
train_gen = flow_from_dataframe(core_idg, train_df,
                                path_col = 'path',
                                y_col = 'boneage_zscore',
                                target_size = IMG_SIZE,
                                color_mode = 'rgb',
                                batch_size = 32,
                                shuffle = True)
X_train, Y_train = next(train_gen)
#-----------------------------------------------------------------------
# Custom Model initiation:
base_model = Xception(input_shape = X_train.shape[1:], include_top = False, weights = 'imagenet')
base_model.trainable = True
model = Sequential()
model.add(base_model)
model.add(GlobalMaxPooling2D())
model.add(Flatten())
model.add(Dense(16, activation = 'relu'))
model.add(Dense(1, activation = 'linear'))
def mae_months(in_gt, in_pred):
    return mean_absolute_error(boneage_div * in_gt, boneage_div * in_pred)
# Compile model
adam = Adam(learning_rate = 0.0005)
model.compile(loss = 'mse', optimizer = adam, metrics = [mae_months])
#-----------------------------------------------------------------------
# KFold
n_splits = 5
kf = KFold(n_splits = n_splits, shuffle = True, random_state = 42)
I have coded up to the KFold step, but now I am stuck on how to proceed with the cross-validation loop itself to get the MAE for each data split.
A post here suggests a for loop over the KFold splits, but that example uses a simple model such as DecisionTreeRegressor(); does the same approach carry over to a custom Xception-based model like mine?
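For context, adapting that post's loop to my Keras model, I imagine it would look roughly like the sketch below, but I am not sure this is right. Here X and Y stand for the full image array and targets (not just one batch from the generator), and build_model() is just the model definition above wrapped in a function so that every fold starts from fresh weights; both names are placeholders of mine, not from the linked post.

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.metrics import mean_absolute_error as sk_mae
from tensorflow.keras.applications import Xception
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import GlobalMaxPooling2D, Flatten, Dense
from tensorflow.keras.optimizers import Adam

def build_model():
    # placeholder: recreate and recompile the Xception-based model from above
    base = Xception(input_shape = X.shape[1:], include_top = False, weights = 'imagenet')
    base.trainable = True
    m = Sequential([base, GlobalMaxPooling2D(), Flatten(),
                    Dense(16, activation = 'relu'), Dense(1, activation = 'linear')])
    m.compile(loss = 'mse', optimizer = Adam(learning_rate = 0.0005), metrics = [mae_months])
    return m

kf = KFold(n_splits = 5, shuffle = True, random_state = 42)
fold_mae = []

for train_idx, test_idx in kf.split(X, Y):
    model = build_model()                            # fresh model for every fold
    model.fit(X[train_idx], Y[train_idx], epochs = 5, batch_size = 16, verbose = 1)

    pred = model.predict(X[test_idx]).ravel()        # flatten (n, 1) predictions to (n,)
    fold_mae.append(sk_mae(Y[test_idx], pred))       # one scalar MAE per split

print(fold_mae)
```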
UPDATE
After referring to the suggestion below, I applied the following code after setting up the KFold:
# Data Generators:
train_gen = flow_from_dataframe(core_idg, train_df,
                                path_col = 'path',
                                y_col = 'boneage_zscore',
                                target_size = IMG_SIZE,
                                color_mode = 'rgb',
                                batch_size = 1024,
                                shuffle = True)
...
...
...
mae_list = []
n_splits = 5
kf = KFold(n_splits = n_splits, shuffle = True, random_state = 42)
split = kf.split(X_train, Y_train)   # X_train, Y_train = next(train_gen) from above

for train, test in split:
    x_train, x_test, y_train, y_test = X_train[train], X_train[test], Y_train[train], Y_train[test]
    history = model.fit(x_train, y_train, validation_data = (x_test, y_test), batch_size = 16)
    pred = model.predict(x_test, batch_size = 8)
    err = mean_absolute_error(y_test, pred)
    mae_list.append(err)
I first set the batch size of train_gen to 1024 and then ran the code above; however, I get the following error:
52/52 [==============================] - 16s 200ms/step - loss: 0.9926 - mae_months: 31.5353 - val_loss: 4.4153 - val_mae_months: 81.5463
52/52 [==============================] - 9s 172ms/step - loss: 0.4185 - mae_months: 21.4242 - val_loss: 0.7401 - val_mae_months: 29.3815
52/52 [==============================] - 9s 172ms/step - loss: 0.2930 - mae_months: 17.3729 - val_loss: 0.5628 - val_mae_months: 23.9055
9/52 [====>.........................] - ETA: 7s - loss: 0.2355 - mae_months: 16.7444
ResourceExhaustedError Traceback (most recent call last)
Input In [11], in <cell line: 9>()
10 x_train, x_test, y_train, y_test = X_train[train], X_train[test], Y_train[train], Y_train[test]
11 # model = boneage_model()
12 # history = model.fit(train_gen, validation_data = (x_test, y_test))
---> 13 history = model.fit(x_train, y_train, validation_data = (x_test, y_test), batch_size = 16)
14 pred = model.predict(x_test, batch_size = 8)
15 err = mean_absolute_error(y_test, pred)
ResourceExhaustedError: Graph execution error:
....
....
....
Node: 'gradient_tape/sequential/xception/block14_sepconv2/separable_conv2d/Conv2DBackpropFilter'
OOM when allocating tensor with shape[2048,1536,1,1] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
[[{{node gradient_tape/sequential/xception/block14_sepconv2/separable_conv2d/Conv2DBackpropFilter}}]]
The memory allocation reported in the console looks like this (hopefully this makes sense):
total_region_allocated_bytes_: 5769199616
memory_limit_: 5769199616
available bytes: 0
curr_region_allocation_bytes_: 8589934592
Stats:
Limit: 5769199616
InUse: 5762760448
MaxInUse: 5769190400
NumAllocs: 192519
MaxAllocSize: 2470510592
Reserved: 0
PeakReserved: 0
LargestFreeBlock: 0
Is it because my GPU cannot handle this batch size?
UPDATE 2
I have decreased the batch_size of train_gen to 32 and removed the batch_size argument from the fit() and predict() calls. Is this the right way to determine the MAE for each data split?
Code:
# Data Generators:
train_gen = flow_from_dataframe(core_idg, train_df,
                                path_col = 'path',
                                y_col = 'boneage_zscore',
                                target_size = IMG_SIZE,
                                color_mode = 'rgb',
                                batch_size = 32,
                                shuffle = True)
X_train, Y_train = next(train_gen)
...
...
...
mae_list = []
n_splits = 5
kf = KFold(n_splits = n_splits, shuffle = True, random_state = 42)
split = kf.split(X_train, Y_train)   # X_train, Y_train = next(train_gen) from above

for train, test in split:
    x_train, x_test, y_train, y_test = X_train[train], X_train[test], Y_train[train], Y_train[test]
    history = model.fit(x_train, y_train, validation_data = (x_test, y_test))
    pred = model.predict(x_test)
    err = mean_absolute_error(y_test, pred)
    mae_list.append(err)
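One thing I am unsure about in this loop: model is created and compiled once outside it, so the weights trained in fold 1 carry over as the starting point for fold 2 and so on. I assume the folds should be independent, which would mean rebuilding the model inside the loop, roughly like this (build_model() being a hypothetical helper that wraps the model definition from the top of the question):

```python
for train, test in kf.split(X_train, Y_train):
    x_tr, x_te = X_train[train], X_train[test]
    y_tr, y_te = Y_train[train], Y_train[test]

    model = build_model()                      # hypothetical helper: fresh, freshly compiled model per fold
    model.fit(x_tr, y_tr, validation_data = (x_te, y_te))

    pred = model.predict(x_te)
    mae_list.append(mean_absolute_error(y_te, pred))
```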
UPDATE 3
According to the suggestions from the comments:
Changed the batch_size of train_gen to 64.
Added valid_gen so that X_valid and y_valid can be used as the validation data for the fit() method.
Used x_test for the predict() method.
Added a snippet to enable GPU memory growth.
Code:
# Checking the GPU availability
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)
...
...
...
# Data Generators:
train_gen = flow_from_dataframe(core_idg, train_df,
                                path_col = 'path',
                                y_col = 'boneage_zscore',
                                target_size = IMG_SIZE,
                                color_mode = 'rgb',
                                batch_size = 64,
                                shuffle = True)
X_train, Y_train = next(train_gen)

valid_gen = flow_from_dataframe(core_valid, valid_df,
                                path_col = 'path',
                                y_col = 'boneage_zscore',
                                target_size = IMG_SIZE,
                                color_mode = 'rgb',
                                batch_size = 64,
                                shuffle = True)
X_valid, y_valid = next(valid_gen)
# Getting MAE for each data split using 5-fold (KFold)
cv_mae = []
n_splits = 5
kf = KFold(n_splits = n_splits, shuffle = True, random_state = 42)
split = kf.split(X_train, Y_train)

for train, test in split:
    x_train, x_test, y_train, y_test = X_train[train], X_train[test], Y_train[train], Y_train[test]
    history = model.fit(x_train, y_train, validation_data = (X_valid, y_valid))
    pred = model.predict(x_test)
    err = mean_absolute_error(y_test, pred)
    cv_mae.append(err)

cv_mae
The output:
2/2 [==============================] - 8s 2s/step - loss: 3.6179 - mae_months: 66.8136 - val_loss: 2.1544 - val_mae_months: 47.2171
2/2 [==============================] - 1s 394ms/step - loss: 1.0826 - mae_months: 36.3370 - val_loss: 1.6431 - val_mae_months: 40.9770
2/2 [==============================] - 1s 344ms/step - loss: 0.6129 - mae_months: 23.0258 - val_loss: 1.8911 - val_mae_months: 45.6456
2/2 [==============================] - 1s 360ms/step - loss: 0.4500 - mae_months: 22.6450 - val_loss: 1.3592 - val_mae_months: 36.7073
2/2 [==============================] - 1s 1s/step - loss: 0.4222 - mae_months: 20.2543 - val_loss: 1.1010 - val_mae_months: 32.8488
[<tf.Tensor: shape=(13,), dtype=float32, numpy=
array([1.4442804, 1.3981661, 1.5037801, 2.2199252, 1.7645894, 1.4836203,
1.7916738, 1.3967942, 1.4069557, 2.516875 , 1.4077926, 1.4342965,
1.9279695], dtype=float32)>,
<tf.Tensor: shape=(13,), dtype=float32, numpy=
array([1.8153722, 1.9236553, 1.3917867, 1.5313213, 1.387209 , 1.3831038,
1.4519565, 1.4680854, 1.7810788, 2.5733376, 1.4269204, 1.3751 ,
1.446231 ], dtype=float32)>,
<tf.Tensor: shape=(13,), dtype=float32, numpy=
array([1.6616 , 1.6529323, 1.9181525, 2.536807 , 1.6306267, 2.856683 ,
2.113724 , 1.5543866, 1.9128528, 3.218016 , 1.4112593, 1.4043481,
3.229338 ], dtype=float32)>,
<tf.Tensor: shape=(13,), dtype=float32, numpy=
array([2.1295295, 1.8527019, 1.9779519, 3.1390932, 1.5525225, 2.0811615,
1.6279813, 1.87973 , 1.5029857, 1.6502519, 2.3677726, 1.8570358,
1.7251074], dtype=float32)>,
<tf.Tensor: shape=(12,), dtype=float32, numpy=
array([1.3926607, 1.7088655, 1.7379242, 3.5756006, 1.5988973, 1.3926607,
1.4928951, 1.4665956, 1.3926607, 1.4575896, 3.146022 , 1.3926607],
dtype=float32)>]
Does this mean that I have MAEs for the 5 data splits? (where it says numpy=array([...]) in the output?)
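From what I can tell, there is one entry per fold, but each entry is still a tensor with one value per test sample rather than a single number — presumably because the Keras mean_absolute_error only reduces over the last axis, and y_test with shape (13,) broadcasts against pred with shape (13, 1). I think one scalar per split could be obtained like this instead (using the sklearn metric, which returns a plain float):

```python
import numpy as np
from sklearn.metrics import mean_absolute_error as sk_mae

# inside the fold loop
pred = model.predict(x_test).ravel()     # flatten (n, 1) predictions to (n,)
err = sk_mae(y_test, pred)               # a single float for this fold
cv_mae.append(err)

# after the loop
print(np.mean(cv_mae), np.std(cv_mae))   # average MAE across the 5 splits
```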

Ideally, you would take the train and test sets together from the KFold split, but it does not matter as long as you use the same seed. kf.split() just returns indices for selecting the train and test elements, so you need to apply those indices to the original dataset.
Answer based on OP comment and question:
from sklearn.model_selection import StratifiedKFold as kfold

x, y = # images, labels

cvscores = []
kf = kfold(n_splits = n_splits, shuffle = True, random_state = 42)
split = kf.split(x, y)

for train, test in split:
    x_train, x_test, y_train, y_test = x[train], x[test], y[train], y[test]
    model = # do model stuff
    _ = model.fit()
    result = model.evaluate()
    # depending on how you want to handle the results
    cvscores.append(result)

# do stuff with cvscores
I'm not sure if that would work with an object from flow_from_dataframe(), because that wouldn't be an array or array-like, although you should be able to get the arrays out of it.
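If it helps, one rough way to do that is to accumulate a few batches from the generator into plain NumPy arrays and then run the split over those (a sketch, assuming the generator yields (x_batch, y_batch) pairs like a standard Keras generator):

```python
import numpy as np

n_batches = 20                 # however many batches fit in memory
xs, ys = [], []
for _ in range(n_batches):
    x_batch, y_batch = next(train_gen)
    xs.append(x_batch)
    ys.append(y_batch)

X = np.concatenate(xs)         # shape: (n_batches * batch_size, H, W, 3)
Y = np.concatenate(ys)

# X and Y can now be indexed with the train/test indices from kf.split(X, Y)
```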

Related

Using sklearn and keras to check how reliable a cumulative sum is for the long term

Part of my CSV file:
cumulative_sum
0.2244
0.75735
1.74845
1.93545
2.15985
2.8611
1.8611
2.2538
2.88025
3.83395
2.83395
5.0312
5.4426000000000005
5.5735
4.5735
3.5735
2.5735
3.38695
2.38695
1.3869500000000001
1.7048500000000002
2.47155
3.6309500000000003
4.1078
4.640750000000001
3.6407500000000006
4.3420000000000005
3.3420000000000005
3.6318500000000005
3.8282000000000003
4.08065
3.0806500000000003
3.7725500000000003
4.2868
3.2868000000000004
3.9974000000000003
4.7454
3.7454
4.7178
5.129200000000001
5.297500000000001
4.297500000000001
4.708900000000002
3.7089000000000016
I want to analyze how reliable the growth of this cumulative sum is over the long term:
import numpy as np
import pandas as pd
from sklearn.preprocessing import MinMaxScaler
from keras.models import Sequential
from keras.layers import LSTM, Dense
df = pd.read_csv("gradual_cumsum.csv")
scaler = MinMaxScaler(feature_range=(0, 1))
data = scaler.fit_transform(df)
train_size = int(len(data) * 0.8)
test_size = len(data) - train_size
train, test = data[0:train_size,:], data[train_size:len(data),:]
X_train = train[:, 0:1]
y_train = train[:, 1:2]
X_test = test[:, 0:1]
y_test = test[:, 1:2]
X_train = np.reshape(X_train, (X_train.shape[0], 1, X_train.shape[1]))
X_test = np.reshape(X_test, (X_test.shape[0], 1, X_test.shape[1]))
model = Sequential()
model.add(LSTM(50, input_shape=(X_train.shape[1], X_train.shape[2])))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, y_train, epochs=100, batch_size=1, verbose=1)
test_loss = model.evaluate(X_test, y_test, verbose=0)
print(f'Test Loss: {test_loss}')
But every epoch produces loss: nan:
Epoch 1/100
849/849 [==============================] - 2s 1ms/step - loss: nan
And obviously the end result is:
Test Loss: nan
What am I missing that is causing this failure?
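One detail worth checking: the CSV shown has a single cumulative_sum column, so data has shape (n, 1), and train[:, 1:2] selects a zero-width slice — the targets end up with shape (n, 0), and the mean inside the MSE loss over an empty axis comes out as nan. If the intent is next-step prediction from the single series, the (X, y) pairs would need to be built explicitly, for example by shifting the series (a minimal sketch under that assumption):

```python
import numpy as np

series = data[:, 0]                     # the single scaled column
X = series[:-1].reshape(-1, 1, 1)       # value at step t, shaped (samples, timesteps, features)
y = series[1:].reshape(-1, 1)           # value at step t + 1

split = int(len(X) * 0.8)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]
```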

Model training with tf.data.Dataset and NumPy arrays yields different results

I use the Keras model training API and observed differences when training the model with NumPy arrays (x_train and y_train) and with tf.data.Dataset.from_tensor_slices((x_train, y_train)). A minimal working example is shown below:
import numpy as np
import tensorflow as tf

tf.keras.utils.set_random_seed(0)

n_examples, n_dims = (100, 10)
raw_dataset = np.random.randn(n_examples, n_dims)

model = tf.keras.models.Sequential(
    [
        tf.keras.layers.Dense(1024, activation="relu", use_bias=True),
        tf.keras.layers.Dense(1, activation="linear", use_bias=True),
    ]
)
model.compile(
    optimizer=tf.keras.optimizers.Adam(),
    loss="mse",
)

x_train = raw_dataset[:, :-1]
y_train = raw_dataset[:, -1]
dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))

n_epochs = 10
batch_size = 16
use_dataset = True

if use_dataset:
    model.fit(
        dataset.batch(batch_size=batch_size),
        epochs=n_epochs,
    )
else:
    model.fit(
        x=x_train,
        y=y_train,
        batch_size=batch_size,
        epochs=n_epochs,
    )

print("Evaluation:")
model.evaluate(x_train, y_train)
model.evaluate(dataset.batch(batch_size=batch_size))
If I run this code with use_dataset = True, the final performance is:
Evaluation:
4/4 [==============================] - 0s 825us/step - loss: 0.4132
7/7 [==============================] - 0s 701us/step - loss: 0.4132
If I run it with use_dataset = False, I get:
Evaluation:
4/4 [==============================] - 0s 855us/step - loss: 0.4219
7/7 [==============================] - 0s 808us/step - loss: 0.4219
I expected that the two training loops would perform identically. Interestingly, the model performance is identical if I set batch_size = n_examples. The difference seems to be related to the way batches are handled internally. Why is this happening? Is it a bug or a feature?
The behavior is due to the default parameter shuffle=True in model.fit(*) and not a bug. According to the docs regarding shuffle:
Boolean (whether to shuffle the training data before each epoch) or str (for 'batch'). This argument is ignored when x is a generator or an object of tf.data.Dataset. 'batch' is a special option for dealing with the limitations of HDF5 data; it shuffles in batch-sized chunks. Has no effect when steps_per_epoch is not None.
So this parameter is ignored when a tf.data.Dataset is passed, and the data is not reshuffled after each epoch as in the other approach with arrays.
Here is the code to get the same results for both methods:
import numpy as np
import tensorflow as tf
tf.keras.utils.set_random_seed(0)
n_examples, n_dims = (100, 10)
raw_dataset = np.random.randn(n_examples, n_dims)
model = tf.keras.models.Sequential(
[
tf.keras.layers.Dense(
1024, activation="relu", use_bias=True
),
tf.keras.layers.Dense(
1, activation="linear", use_bias=True
),
]
)
model.compile(
optimizer=tf.keras.optimizers.Adam(),
loss="mse",
)
x_train = raw_dataset[:, :-1]
y_train = raw_dataset[:, -1]
dataset = tf.data.Dataset.from_tensor_slices((x_train, y_train))
n_epochs = 10
batch_size = 16
use_dataset = False
if use_dataset:
model.fit(
dataset.batch(batch_size=batch_size),
epochs=n_epochs,
)
else:
model.fit(
x=x_train,
y=y_train,
batch_size=batch_size,
shuffle=False,
epochs=n_epochs,
)
print("Evaluation:")
model.evaluate(x_train, y_train)
model.evaluate(dataset.batch(batch_size=batch_size))
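Conversely, if the goal is to keep per-epoch shuffling while using the tf.data pipeline, the dataset itself can be asked to reshuffle (a sketch; the buffer size here is chosen to cover the whole small dataset, and the exact batch order will still differ from the NumPy path):

```python
shuffled_ds = (
    dataset
    .shuffle(buffer_size=n_examples, reshuffle_each_iteration=True)  # reshuffle before every epoch
    .batch(batch_size=batch_size)
)
model.fit(shuffled_ds, epochs=n_epochs)
```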

Training GCN regression model but getting bad accuracy and prediction results

I am trying to build a model that can read my graph data and use the node features with the weighted adjacency data to predict specific targets.
I started with 21 sample nodes, each having 16801 features, along with their indices (used to select the training, validation, and testing nodes during training) and the adjacency matrix holding the corresponding weighted edge values.
x_features #shape=(1, 21, 16801) dtype=float32
x_indices #shape=(1, None) dtype=int32
x_adjacency #shape=(1, 21, 21) dtype=float32
The prediction targets are saved in separated target lists:
y_train = np.expand_dims(train_targets, 0).astype(np.float32)
y_val = np.expand_dims(val_targets, 0).astype(np.float32)
y_test = np.expand_dims(test_targets, 0).astype(np.float32)
y_train #array([[[32.],[31.],[27.],[29.],[28.],[35.],[35.],[27.],[33.],[26.]]], dtype=float32)
The model is like the following:
x = Dropout(0.5)(x_features)
x = GraphConvolution(32, activation='relu',
                     use_bias=True,
                     kernel_initializer=kernel_initializer,
                     bias_initializer=bias_initializer)([x, x_adjacency])
x = Dropout(0.5)(x)
x = GraphConvolution(16, activation='relu',
                     use_bias=True,
                     kernel_initializer=kernel_initializer,
                     bias_initializer=bias_initializer)([x, x_adjacency])
x = GatherIndices(batch_dims=1)([x, x_indices])
output = Dense(1, activation='linear')(x)

model = Model(inputs=[x_features, x_indices, x_adjacency], outputs=output)
model.summary()
(model summary output omitted)
model.compile(
    optimizer=SGD(learning_rate=0.1, momentum=0.9),
    loss='mean_squared_error',
    metrics=["acc"],
)
history = model.fit(
    x = [features_input, train_indices, A_input],  # features_input.shape: (1, 21, 16801); train_indices.shape: (1, 10); A_input.shape: (1, 21, 21)
    y = y_train,                                   # y_train.shape: (1, 10, 1)
    batch_size = 32,
    epochs = 200,
    validation_data = ([features_input, val_indices, A_input], y_val),
    verbose = 1,
    shuffle = False,
)
I reach the last epoch with:
Epoch 200/200
1/1 [==============================] - 0s 31ms/step - loss: nan - acc: 0.0000e+00 - val_loss: nan - val_acc: 0.0000e+00
test_preds = model.predict([features_input, test_indices, A_input])
print('test_preds:\n' , test_preds,'\n\n y_test:\n', y_test)
outputs:
test_preds: [ [ [nan][nan][nan][nan][nan][nan] ] ]
y_test: [ [ [28.][32.][30.][34.][32.][35.] ] ]
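No definitive answer is implied here, but two things that often lead to nan in a setup like this are the unscaled 16801-dimensional node features combined with SGD(learning_rate=0.1), and the "acc" metric, which is not meaningful for a regression target. A minimal sketch of the kind of changes usually tried first, leaving the graph layers untouched (the feature-scaling step is an assumption about the data, not something from the question):

```python
import numpy as np
from tensorflow.keras.optimizers import Adam

# standardise node features per feature column
# (assumes features_input is a NumPy array of shape (1, 21, 16801))
feats = features_input[0]
feats = (feats - feats.mean(axis=0)) / (feats.std(axis=0) + 1e-8)
features_input = feats[np.newaxis, ...]

model.compile(
    optimizer=Adam(learning_rate=1e-3),   # smaller, adaptive steps instead of SGD with lr=0.1
    loss='mean_squared_error',
    metrics=['mae'],                      # accuracy stays at 0 for continuous targets
)
```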

Check the predictions of my Keras model using a test dataset

I want to check if the predictions my model is computing are correct.
I have already trained my model and saved it. My dataset is created using tf.keras.preprocessing.image_dataset_from_directory as follows:
ds_train = tf.keras.preprocessing.image_dataset_from_directory(
    'data/',
    labels = 'inferred',
    label_mode = 'categorical',
    color_mode = 'rgb',
    batch_size = batch_size,
    image_size = (img_height, img_width),
    shuffle = True,
    seed = 123,
    validation_split = 0.1,
    subset = 'training',
)

ds_validation = tf.keras.preprocessing.image_dataset_from_directory(
    'data/',
    labels = 'inferred',
    label_mode = 'categorical',
    color_mode = 'rgb',
    batch_size = batch_size,
    image_size = (img_height, img_width),
    shuffle = True,
    seed = 123,
    validation_split = 0.1,
    subset = 'validation',
)
My model works and evaluates with an accuracy of 95.65%:
Found 921 files belonging to 2 classes.
Using 829 files for training.
Found 921 files belonging to 2 classes.
Using 92 files for validation.
Epoch 1/5
104/104 - 12s - loss: 0.1335 - accuracy: 0.9867
Epoch 2/5
104/104 - 5s - loss: 0.1017 - accuracy: 0.9879
Epoch 3/5
104/104 - 5s - loss: 0.0207 - accuracy: 0.9952
Epoch 4/5
104/104 - 5s - loss: 0.0603 - accuracy: 0.9879
Epoch 5/5
104/104 - 5s - loss: 0.0127 - accuracy: 0.9964
12/12 - 0s - loss: 0.2097 - accuracy: 0.9565
I have created a test_dataset the same way as I created my training and validation datasets.
ds_test = tf.keras.preprocessing.image_dataset_from_directory(
    'test/',
    labels = 'inferred',
    label_mode = 'categorical',
    color_mode = 'rgb',
    batch_size = batch_size,
    image_size = (img_height, img_width),
    shuffle = True,
    seed = 123,
)
I now want to predict the entire test_dataset using model.predict().
My data is stored in subfolders as follows:
---test/
   ---label1/
   ---label2/
I manage to recover the predictions for all my images, but I am not able to check whether they are correct. Is there any way to store the label of each predicted image in a list?
I would like to use something like this :
for i in test_dataset:
    list_labels = test_dataset.class_names(i)

predictions = model.predict(test_dataset)
predictions = np.argmax(predictions, axis=1)
So I can check the predictions output and link the predictions with labels.
Furthermore, I don't know why, but I am not able to get the labels in the way something like (x_train, y_train) = mnist.load_data() allows:
# It is not working
(x_test, y_test) = test_dataset
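One pattern that should work here is to iterate over the batched test dataset once and collect labels and predictions together, so that the ordering cannot drift (with shuffle = True the dataset may be reshuffled between separate passes). A sketch:

```python
import numpy as np

y_true, y_pred = [], []
for images, labels in ds_test:                        # one pass over the batched test set
    preds = model.predict(images, verbose=0)
    y_pred.append(np.argmax(preds, axis=1))
    y_true.append(np.argmax(labels.numpy(), axis=1))  # labels are one-hot because label_mode='categorical'

y_true = np.concatenate(y_true)
y_pred = np.concatenate(y_pred)

class_names = ds_test.class_names                     # maps class indices back to folder names
accuracy = np.mean(y_true == y_pred)
```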

How to fix a constant validation accuracy in machine learning?

I'm trying to do image classification on DICOM images with balanced classes, using the pre-trained InceptionV3 model.
def convertDCM(PathDCM):
    data = []
    for dirName, subdir, files in os.walk(PathDCM):
        for filename in sorted(files):
            ds = pydicom.dcmread(PathDCM + '/' + filename)
            im = fromarray(ds.pixel_array)
            im = keras.preprocessing.image.img_to_array(im)
            im = cv2.resize(im, (299, 299))
            data.append(im)
    return data
PathDCM = '/home/Desktop/FULL_BALANCED_COLOURED/'
data = convertDCM(PathDCM)
#scale the raw pixel intensities to the range [0,1]
data = np.array(data, dtype="float")/255.0
labels = np.array(labels,dtype ="int")
#splitting data into training and testing
#test_size is percentage to split into test/train data
(trainX, testX, trainY, testY) = train_test_split(
    data, labels,
    test_size=0.2,
    random_state=42)
img_width, img_height = 299, 299 #InceptionV3 size
train_samples = 300
validation_samples = 50
epochs = 25
batch_size = 15
base_model = keras.applications.InceptionV3(
    weights='imagenet',
    include_top=False,
    input_shape=(img_width, img_height, 3))
model_top = keras.models.Sequential()
model_top.add(keras.layers.GlobalAveragePooling2D(input_shape=base_model.output_shape[1:], data_format=None)),
model_top.add(keras.layers.Dense(300,activation='relu'))
model_top.add(keras.layers.Dropout(0.5))
model_top.add(keras.layers.Dense(1, activation = 'sigmoid'))
model = keras.models.Model(inputs = base_model.input, outputs = model_top(base_model.output))
#Compiling model
model.compile(optimizer=keras.optimizers.Adam(lr=0.0001),
              loss='binary_crossentropy',
              metrics=['accuracy'])
#Image Processing and Augmentation
train_datagen = keras.preprocessing.image.ImageDataGenerator(
    rescale=1./255,
    zoom_range=0.1,
    width_shift_range=0.2,
    height_shift_range=0.2,
    horizontal_flip=True,
    fill_mode='nearest')
val_datagen = keras.preprocessing.image.ImageDataGenerator()
train_generator = train_datagen.flow(
    trainX,
    trainY,
    batch_size=batch_size,
    shuffle=True)

validation_generator = train_datagen.flow(
    testX,
    testY,
    batch_size=batch_size,
    shuffle=True)
When I train the model, I always get a constant validation accuracy of 0.3889 with the validation loss fluctuating.
#Training the model
history = model.fit_generator(
    train_generator,
    steps_per_epoch = train_samples//batch_size,
    epochs = epochs,
    validation_data = validation_generator,
    validation_steps = validation_samples//batch_size)
Epoch 1/25
20/20 [==============================] - 195s 49s/step - loss: 0.7677 - acc: 0.4020 - val_loss: 0.7784 - val_acc: 0.3889
Epoch 2/25
20/20 [==============================] - 187s 47s/step - loss: 0.7016 - acc: 0.4848 - val_loss: 0.7531 - val_acc: 0.3889
Epoch 3/25
20/20 [==============================] - 191s 48s/step - loss: 0.6566 - acc: 0.6304 - val_loss: 0.7492 - val_acc: 0.3889
Epoch 4/25
20/20 [==============================] - 175s 44s/step - loss: 0.6533 - acc: 0.5529 - val_loss: 0.7575 - val_acc: 0.3889
predictions= model.predict(testX)
print(predictions)
Calling predict on the model also returns only a single value per image:
[[0.457804 ]
[0.45051473]
[0.48343503]
[0.49180537]...
Why is it that the model only predicts one of the two classes? Does this have to do with the constant val accuracy or possibly overfitting?
If you have two classes, every image belongs to one or the other, so the probabilities for one class are enough to determine everything, because the two probabilities for each image are supposed to sum to 1. So if you have the probability p for one class, the probability of the other one is 1 - p.
If you want the possibility of classifying images that belong to neither of those two classes, then you should add a third one.
Also, this line:
model_top.add(keras.layers.Dense(1, activation = 'sigmoid'))
means that the output is a vector of shape (nb_samples, 1), which has the same shape as your training labels.
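In practice, turning those single sigmoid outputs into hard class decisions is just a threshold at 0.5, for example:

```python
import numpy as np

probs = model.predict(testX)                      # shape (n_samples, 1): probability of class 1
pred_classes = (probs > 0.5).astype(int).ravel()  # 1 if p > 0.5, else 0
# the probability of class 0 is simply 1 - probs
```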
