ValueError when using ModelCheckpoint in Keras - python

I'm creating an Ensemble of Vgg19, DenseNet, and EfficientNetB1.
The code is as follows:
IMAGE_SIZE = (224,224,3)
import tensorflow as tf
vgg19 = tf.keras.applications.vgg19.VGG19(
input_shape=IMAGE_SIZE, weights='imagenet', include_top=False)
for layer in vgg19.layers:
layer._name = layer._name + str('_19')
layer.trainable = False
effnetb1 =tf.keras.applications.efficientnet.EfficientNetB1(
include_top=False, weights='imagenet', input_shape=IMAGE_SIZE)
for layer in effnetb1.layers:
layer._name = layer._name + str('_B1')
layer.trainable=False
densenet=tf.keras.applications.densenet.DenseNet121(
include_top=False, weights="imagenet", input_shape=IMAGE_SIZE)
for layer in densenet.layers:
layer._name = layer._name + str('_Dense')
layer.trainable=False
from keras.layers import Input, Flatten, Concatenate, Dense, Average, Dropout
inp = Input(IMAGE_SIZE)
vgg19_x = Flatten()(vgg19(inp))
vgg19_x = Dense(256, activation='relu')(vgg19_x)
effnet_x = Flatten()(effnetb1(inp))
effnet_x = Dense(256, activation='relu')(effnet_x)
densenet_x = Flatten()(densenet(inp))
densenet_x = Dense(256, activation='relu')(densenet_x)
from keras.models import Model
x = Concatenate()([vgg19_x, effnet_x, densenet_x])
x = Dense(128, activation='relu')(x)
x = Dropout(0.30)(x)
x = Dense(64, activation='relu')(x)
out = Dense(2, activation='softmax')(x)
model = Model(inputs = inp, outputs = out)
model.compile(
loss='categorical_crossentropy',
optimizer=tf.keras.optimizers.Adam(
learning_rate=0.0005,
name="Adam"),
metrics=['accuracy']
)
model.summary()
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
checkpointer = ModelCheckpoint(filepath="/content/drive/MyDrive/ensemble/ensemble-weights.hdf5", verbose=1, save_best_only=True)
r = model.fit(
training_set,
validation_data=test_set,
epochs=30,
steps_per_epoch=len(training_set),
validation_steps=len(test_set),
callbacks = [checkpointer]
)
The code runs fine and the training is successfully taking place when I'm not using the callback. But when I use a ModelCheckpoint, I get the following error after 1st epoch:
ValueError: The target structure is of type `<class 'keras.engine.keras_tensor.KerasTensor'>`
KerasTensor(type_spec=TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name='input_5'), name=...
However, the input structure is a sequence (<class 'list'>) of length 0.
[]
nest cannot guarantee that it is safe to map one to the other.
Can anyone tell me what's wrong here? Also, is it because I'm concatenating three models?
Your help will be appreciated. Thank you!

I also ran into this issue while trying to implement a nested model (which is what would be constructed here after you create the concatenated model).
The issue seems to be that Keras cannot handle the inputs and outputs of nested models in newer tensorflow versions(tf 2.0 and above). Depending on the version you are on, you might want to either explicitly refer the input/output of the nested model you are using. In tf2.6, what seems to work is to define separate models for each part - ie - the common layers added after concatenation should also be wrapped in a model like below (taken from here):
#Make GradCAM heatmap following the Keras tutorial.
last_conv_layer = model.layers[-4].layers[-1]
last_conv_layer_model = keras.Model(model.layers[-4].inputs, last_conv_layer.output)
# Second, we create a model that maps the activations of the last conv
# layer to the final class predictions
classifier_input = keras.Input(shape=last_conv_layer.output.shape[1:])
x = classifier_input
for layer in model.layers[-3:]:
x = layer(x)
classifier_model = keras.Model(classifier_input, x)
#Preparing the image with the preprocessing layers
preprocess_layers = keras.Model(model.inputs, model.layers[-5].output)
img_array = preprocess_layers(prepared_image)
# Then, we compute the gradient of the top predicted class for our input image
# with respect to the activations of the last conv layer
with tf.GradientTape() as tape:
# Compute activations of the last conv layer and make the tape watch it
last_conv_layer_output = last_conv_layer_model(img_array)
tape.watch(last_conv_layer_output)
# Compute class predictions
preds = classifier_model(last_conv_layer_output)
top_pred_index = tf.argmax(preds[0])
top_class_channel = preds[:, top_pred_index]
# This is the gradient of the top predicted class with regard to
# the output feature map of the last conv layer
grads = tape.gradient(top_class_channel, last_conv_layer_output)
You can also check the following github issues (they are not very related, but deal with a similar problem) - issue1, issue2, issue3

Related

Can I use prediction function inside custom loss function in TensorFlow?

I have a NN with this structure:
I need to make it predict ~ 0, when input -> ∞
To implement this, I decided to add model prediction with large input to the loss function (If there is another way, will be happy to hear)
But I don't see the way to predict values inside my loss function
Network code:
from keras.models import Model
from keras.layers import Input, Dense, Add
input_array = []
output_array = []
for i in range(14):
input_layer = Input(shape=(1,))
hidden1 = Dense(64, activation='relu')(input_layer)
hidden2 = Dense(64, activation='relu')(hidden1)
output_layer = Dense(1, activation='linear')(hidden2)
input_array.append(input_layer)
output_array.append(output_layer)
# merge input models
summation = Add()(output_array)
model = Model(inputs=input_array, outputs=summation)

How to connect input layer to an extra layer in Tensorflow

I was wondering if it is possible to create a customized network structure where the input layer has an extra connection to a hidden layer that is not adjacent to the input layer by using tensorflow. As an example, suppose I have a simple network structure as shown below.
import numpy as np
import random
import tensorflow as tf
from tensorflow import keras
m = 200
n = 5
my_input= np.random.random([m,n])
my_output = np.random.random([m,1])
my_model = tf.keras.Sequential([
tf.keras.layers.Flatten(input_shape=(my_input.shape[1],)),
tf.keras.layers.Dense(32, activation='softmax'),
tf.keras.layers.Dense(32, activation='tanh'),
tf.keras.layers.Dense(32, activation='relu'),
tf.keras.layers.Dense(1)
])
my_model.compile(loss='mse',optimizer = tf.keras.optimizers.Adam(learning_rate=0.001))
res = my_model.fit(my_input, my_output, epochs=50, batch_size=1,verbose=0)
Is there a way that the first layer having the input values can have an extra connection to the third layer that has the ReLU activation? While doing so, I'd like to have different constraints in each connection. For example, for the connection coming from the previous layer, I'd like to use GlorotNormal as my weight initialization. As for the extra connection coming from the input layer, I'd like to use HeUniform initialization.
I tried to visualize what I have in mind below.
Use the Keras functional API and tf.concat:
import numpy as np
import random
import tensorflow as tf
from tensorflow import keras
m = 200
n = 5
my_input= np.random.random([m,n])
my_output = np.random.random([m,1])
inputs = tf.keras.layers.Input((my_input.shape[1],))
x = tf.keras.layers.Flatten()(inputs)
x = tf.keras.layers.Dense(32, activation='softmax')(x)
x = tf.keras.layers.Dense(32, activation='tanh', kernel_initializer=tf.keras.initializers.GlorotNormal())(x)
y = tf.keras.layers.Dense(my_input.shape[1], kernel_initializer=tf.keras.initializers.HeUniform())(inputs)
x = tf.keras.layers.Dense(32, activation='relu')(tf.concat([x, y], axis=1))
outputs = tf.keras.layers.Dense(1)(x)
my_model = tf.keras.Model(inputs, outputs)
dot_img_file = 'model_1.png'
tf.keras.utils.plot_model(my_model, to_file=dot_img_file, show_shapes=True)
my_model.compile(loss='mse',optimizer = tf.keras.optimizers.Adam(learning_rate=0.001))
res = my_model.fit(my_input, my_output, epochs=50, batch_size=1,verbose=0)

How to fit basic custom built model in tensorflow

I am used to working in PyTorch but now have to learn Tensorflow for my job. I am trying to get up to speed by creating a simple dense network and training it on the MNIST dataset, but I cannot get it to train. My super simple code:
import tensorflow as tf
from tensorflow.keras import Input
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.utils import to_categorical
# Load mnist data from keras
(train_data, train_label), (test_data, test_label) = tf.keras.datasets.mnist.load_data(path="mnist.npz")
train_label, test_label = to_categorical(train_label), to_categorical(test_label)
train_data, train_label, test_data, test_label = Flatten()(train_data), Flatten()(train_label), Flatten()(test_data), Flatten()(test_label)
# Create generic SGD optimizer (no learning schedule)
optimizer = SGD(learning_rate = 0.01)
# Define function to build and compile model
def build_mnist_model(input_shape, batch_size = 30):
input_img = Input(shape = input_shape, batch_size = batch_size)
# Pass through dense layer
x = Dense(200, activation = 'relu', use_bias = True)(input_img)
x = Dense(400, activation = 'relu', use_bias = True)(x)
scores = Dense(10, activation = 'softmax', use_bias = True)(x)
# Create and compile tf model
mnist_model = Model(input_img, scores)
mnist_model.compile(optimizer = optimizer, loss = 'categorical_crossentropy')
return mnist_model
# Build the model
mnist_model = build_mnist_model(train_data[0].shape)
# Train the model
mnist_model.fit(
x = train_data,
y = train_label,
batch_size = 30,
epochs = 20,
verbose = 2,
shuffle = True,
# steps_per_epoch = 200
)
When I run this I get
ValueError: When using data tensors as input to a model, you should specify the `steps_per_epoch` argument.
This does not really make sense to me because my train_data and train_label are just regular tensors and per the Tensorflow documentation in this case it should default to the number of samples in the dataset divided by the batch size (which would be 200 in my case).
At any rate, I tried specifying steps_per_epoch = 200 when I call mnist_model.fit() but then I get a different error:
InvalidArgumentError: Incompatible shapes: [60000,10] vs. [30,1]
[[{{node training_4/SGD/gradients/gradients/loss_5/dense_17_loss/softmax_cross_entropy_with_logits_grad/mul}}]]
I can't seem to discern where a size mismatch would come from. In PyTorch, I am used to manually creating batches (by subindexing my data and label tensors) but in Tensorflow this seems to happen automatically. As such, this leaves me quite confused about what batch has the wrong size, how it got the wrong size, etc. I hope this simple model is way easier than I am making it and I just do not know the Tensorflow tricks yet.
Thanks for the help.

Does my Keras multiple output NN not converge because i have not sufficient layers?

I am trying to build a neural network to predict 3 output values out of 63 inputs. I have a dataset containing two numpy arrays with the shape of [8100, 63] and [8100, 3] but when I try to feed them to Keras the model does not converge and the mean squared error is in the area of 10^11.
The function i used to calculate the Data does not have any non-linear properties so i first thought that one or two layers should be enough. with three layers the MSE is still in the area of 10^10 and I am not sure what I am doing wrong.
The regression should return three absolute Values which can be bigger than 1 - this is the reason why I didn't use softmax layers.
I would be really grateful for any input or help!
import numpy as np
from keras.models import *
from keras.layers import Dense
from keras import optimizers
from keras.utils import plot_model
np.random.seed(7)
#Define Input
tf_features_64 = np.load("IN.npy")
tf_labels_64 = np.load("OUT.npy")
tf_features_32 = tf_features_64.astype(np.float32)
tf_labels_32 = tf_labels_64.astype(np.float32)
X = tf_features_32
Y = tf_labels_32
#create Layers
visible = Input(shape=(63,))
x = Dense(100, activation='relu')(visible)
x = Dense(100, activation='relu')(x)
x = Dense(100, activation='relu')(x)
x = Dense(70, activation='relu')(x)
x = Dense(30, activation='relu')(x)
output = Dense(3)(x)
Optimizer = optimizers.adam(lr=0.001)
model = Model(inputs=visible, outputs = output)
model.compile(optimizer=Optimizer,
loss='categorical_crossentropy',
metrics=['mse']
)
model.fit(X, Y, epochs=400, batch_size=300, shuffle=True)
print(model.summary)
When we are using neural networks for classification we should use softmax at last layer with categorical_crossentropy loss.
output = Dense(3, activation='softmax')(x)
model.compile(optimizer=Optimizer,
loss='categorical_crossentropy')
For regression we should use linear output with mse loss
output = Dense(3)(x)
model.compile(optimizer=Optimizer,
loss='mse')
You are using categorical_crossentropy as a loss function and mse as a metric
model.compile(optimizer=Optimizer,
loss='categorical_crossentropy',
metrics=['mse']
)
Change loss function to mse
model.compile(optimizer=Optimizer,
loss='mse')

Adding fully connected layers after pretrained model

I'm new to ConvNets and Python and want to implement the following:
I want to use the pretrained vgg16 model and add 3 fully connected layers after it with an L2-Normalization at the end.
So Data->VGG16->FC (1x4096)->FC (1x4096)->FC (1x3)->L2-Norm->Output
The first and second FC get an array 1x4096 the last FC gets an array 1x3 where the L2-Norm is performed.
Can anyone give me a hint how to do that ?
I found that I can load the model like that :
model_vgg19 = models.vgg19(pretrained=True)
But how can I add the FCs and the L2-Norm after that ? And how can I get Test-Data through the model ?
I'm quoting an example mentioned in Keras#3465
In Keras framework, if you mention include_top = False while loading your pre-trained model it will not include the final classification layer. You can add your custom FC layers at the end as shown in the example below:
#load vgg16 without dense layer and with theano dim ordering
base_model = VGG16(weights = 'imagenet', include_top = False, input_shape = (3,224,224))
#number of classes in your dataset e.g. 20
num_classes = 20
x = Flatten()(base_model.output)
x = Dense(4096, activation='relu')(x)
x = Dropout(0.5)(x)
x = BatchNormalization()(x)
predictions = Dense(num_classes, activation = 'softmax')(x)
#create graph of your new model
head_model = Model(input = base_model.input, output = predictions)
#compile the model
head_model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
head_model.summary()
.
.
.
#train your model on data
head_model.fit(x, y, batch_size = batch_size, verbose = 1)

Categories

Resources