How can I know if weights were loaded in Keras Model? - python

I am using keras for tensorflow 1.4. For some unknown reason, I continue failing when I try to download the weights (vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5') using:
base_model = VGG16(weights='imagenet', include_top=False)
so I manually downloaded the weights from the link cited in the script.
https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
Since the VGG16's weight cannot be set to the path of the weights (it has to be imagenet or None), I used the code lines:
base_model = VGG16(weights=None, include_top=False)
path = os.path.join('weights','vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5')
base_model.load_weights(path)
It is the same than base_model = VGG16(weights='imagenet', include_top=False) ?
How can I check that the weights and outputs are correct?

You can access the weights in an .h5 file as:
w1 = h5py.File('someFile.h5')
allKeys = w1.keys()
first_layer_wts = w1[allKeys[0]][:] # assuming first layer has weights
After you load the weights in a Keras model, then you can access them as:
w2 = model.get_weights()
Now check if first_layer_wts and the corresponding weights in w2 are equal. This may need some tweaking. For example, if the layer is convolutional, then you have two weight arrays (one for kernel and another for bias). You can access them both as:
first_layer_wts_kernel = w1[allKeys[0]]['kernel'][:]
first_layer_wts_bias = w1[allKeys[0]]['bias'][:]
You can figure this out by always checking the available keys.

Related

Accessing intermediate tensors of a Keras Model that were not explicitly exposed as layers in TF 2.0

Is it possible to access pre-activation tensors in a Keras Model? For example, given this model:
import tensorflow as tf
image_ = tf.keras.Input(shape=[224, 224, 3], batch_size=1)
vgg19 = tf.keras.applications.VGG19(include_top=False, weights='imagenet', input_tensor=image_, input_shape=image_.shape[1:], pooling=None)
the usual way to access layers is:
intermediate_layer_model = tf.keras.models.Model(inputs=image_, outputs=[vgg19.get_layer('block1_conv2').output])
intermediate_layer_model.summary()
This gives the ReLU outputs for a layer, while I would like the ReLU inputs. I tried doing this:
graph = tf.function(vgg19, [tf.TensorSpec.from_tensor(image_)]).get_concrete_function().graph
outputs = [graph.get_tensor_by_name(tname) for tname in [
'vgg19/block4_conv3/BiasAdd:0',
'vgg19/block4_conv4/BiasAdd:0',
'vgg19/block5_conv1/BiasAdd:0'
]]
intermediate_layer_model = tf.keras.models.Model(inputs=image_, outputs=outputs)
intermediate_layer_model.summary()
but I get the error
ValueError: Unknown graph. Aborting.
The only workaround I've found is to edit the model file to manually expose the intermediates, turning every layer like this:
x = layers.Conv2D(256, (3, 3), activation="relu", padding="same", name="block3_conv1")(x)
into 2 layers where the 1st one can be accessed before activations:
x = layers.Conv2D(256, (3, 3), activation=None, padding="same", name="block3_conv1")(x)
x = layers.ReLU(name="block3_conv1_relu")(x)
Is there a way to acces pre-activation tensors in a Model without essentially editing Tensorflow 2 source code, or reverting to Tensorflow 1 which had full flexibility accessing intermediates?
There is a way to access pre-activation layers for pretrained Keras models using TF version 2.7.0. Here's how to access two intermediate pre-activation outputs from VGG19 in a single forward pass.
Initialize VGG19 model. We can omit top layers to avoid loading unnecessary parameters into memory.
vgg19 = tf.keras.applications.VGG19(
include_top=False,
weights="imagenet"
)
This is the important part: Create a deepcopy of the intermediate layer form which you like to have the features, change the activation of the conv layers to linear (i.e. no activation), rename the layer (otherwise two layers in the model will have the same name which will raise errors) and finally pass the output of the previous through the copied conv layer.
# for more intermediate features wrap a loop around it to avoid copy paste
b5c4_layer = deepcopy(vgg19.get_layer("block5_conv4"))
b5c4_layer.activation = tf.keras.activations.linear
b5c4_layer._name = b5c4_layer.name + str("_preact")
b5c4_preact_output = b5c4_layer(vgg19.get_layer("block5_conv3").output)
b2c2_layer = deepcopy(vgg19.get_layer("block2_conv2"))
b2c2_layer.activation = tf.keras.activations.linear
b2c2_layer._name = b2c2_layer.name + str("_preact")
b2c2_preact_output = b2c2_layer(vgg19.get_layer("block2_conv1").output)
Finally, get the outputs and check if they equal post-activation outputs when we apply ReLU-activation.
vgg19_features = Model(vgg19.input, [b2c2_preact_output, b5c4_preact_output])
vgg19_features_control = Model(vgg19.input, [vgg19.get_layer("block2_conv2").output, vgg19.get_layer("block5_conv4").output])
b2c2_preact, b5c4_preact = vgg19_features(tf.keras.applications.vgg19.preprocess_input(img))
b2c2, b5c4 = vgg19_features_control(tf.keras.applications.vgg19.preprocess_input(img))
print(np.allclose(tf.keras.activations.relu(b2c2_preact).numpy(),b2c2.numpy()))
print(np.allclose(tf.keras.activations.relu(b5c4_preact).numpy(),b5c4.numpy()))
True
True
Here's a visualization similar to Fig. 6 of Wang et al. to see the effect in the feature space.
Input image
To get output of each layer. You have to define a keras function and evaluate it for each layer.
Please refer the code as shown below
from tensorflow.keras import backend as K
inp = model.input # input
outputs = [layer.output for layer in model.layers] # all layer outputs
functors = [K.function([inp], [out]) for out in outputs] # evaluation functions
For more details on this please refer SO Answer.

Extracting features from EfficientNet Tensorflow

I have a CNN model trained using EfficientNetB6.
My task is to extract the features of this trained model by removing the last dense layer and then using those weights to train a boosting model.
i did this using Pytorch earlier and was able to extract the weights from the layers i was interested and predicted on my validation set and then boosted.
I am doing this now in tensorflow but currently stuck.
Below is my model structure and I have tried using the code on the website but did not had any luck.
I want to remove the last dense layer and predict on the validation set using the remaining layers.
I tried using :
layer_name = 'efficientnet-b6'
intermediate_layer_model = tf.keras.Model(inputs = model.input, outputs = model.get_layer(layer_name).output)
but i get an error "
ValueError: Graph disconnected: cannot obtain value for tensor Tensor("input_1:0", shape=(None, 760, 760, 3), dtype=float32) at layer "input_1". The following previous layers were accessed without issue: []"
Any way to resolve this?
Sorry my bad.
I simply added a GlobalAveragePooling2D layer after the efficientnet layer and i am able to extract the weights and continue :)
just for reference:
def build_model(dim=CFG['net_size'], ef=0):
inp = tf.keras.layers.Input(shape=(dim,dim,3))
base = EFNS[ef](input_shape=(dim,dim,3),weights='imagenet',include_top=False)
x = base(inp)
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dense(1,activation='sigmoid')(x)
model = tf.keras.Model(inputs=inp,outputs=x)
opt = tf.keras.optimizers.Adam(learning_rate=0.001)
loss = tf.keras.losses.BinaryCrossentropy(label_smoothing=0.05)
model.compile(optimizer=CFG['optimizer'],loss=loss,metrics=[tf.keras.metrics.AUC(name='auc')])
print(model.summary())
return model

How to move a tensorflow.keras model to GPU

Let's say I have a keras model like this:
with tf.device("/CPU"):
model = tf.keras.Sequential([
# Adds a densely-connected layer with 64 units to the model:
tf.keras.layers.Dense(64, activation='relu', input_shape=(32,)),
# Add another:
tf.keras.layers.Dense(64, activation='relu'),
# Add a softmax layer with 10 output units:
tf.keras.layers.Dense(10, activation='softmax')])
I would like to move this model to GPU.
I tried doing this:
with tf.device("/GPU:0"):
gpu_model = tf.keras.models.clone_model(model)
But the problem with this is that, the variable names change. For example:
The first layer's weight's name of model is: Got from model.layers[0].weights[0].name
'dense/kernel:0'
But the first layer's weight's name of gpu_model is: Got from gpu_model.layers[0].weights[0].name
'dense_3/kernel:0'
How can I do this GPU transformation while also preserving the names of the variables?
I don't want to save the model to disk and load again
I am answering my own question. If someone has a better solution. Kindly post it
This is a work around I found:
Create a state_dict like PyTorch
Get the model architecture as JSON
Clear the Keras session and delete the model instance
Create a new model from the JSON within tf.device context
Load the previous weights from state_dict
state_dict = {}
for layer in model.layers:
for weight in layer.weights:
state_dict[weight.name] = weight.numpy()
model_json_config = model.to_json()
tf.keras.backend.clear_session() # this is crucial to get previous names again
del model
with tf.device("/GPU:0"):
new_model = tf.keras.models.model_from_json(model_json_config)
for layer in new_model.layers:
current_layer_weights = []
for weight in layer.weights:
current_layer_weights.append(state_dict[weight.name])
layer.set_weights(current_layer_weights)

different prediction after load a model in keras

I have a Sequential Model built in Keras and after trained it give me good prediction but when i save and then load the model i don't obtain the same prediction on the same dataset. Why?
Note that I checked the weight of the model and they are the same as well as the architecture of the model, checked with model.summary() and model.getWeights(). This is very strange in my opinion and I have no idea how to deal with this problem.
I don't have any error but the prediction are different
I tried to use model.save() and load_model()
I tried to use model.save_weights() and after that re-built the model and then load the model
I have the same problem with both options.
def Classifier(input_shape, word_to_vec_map, word_to_index, emb_dim, num_activation):
sentence_indices = Input(shape=input_shape, dtype=np.int32)
emb_dim = 300 # embedding di 300 parole in italiano
embedding_layer = pretrained_embedding_layer(word_to_vec_map, word_to_index, emb_dim)
embeddings = embedding_layer(sentence_indices)
X = LSTM(256, return_sequences=True)(embeddings)
X = Dropout(0.15)(X)
X = LSTM(128)(X)
X = Dropout(0.15)(X)
X = Dense(num_activation, activation='softmax')(X)
model = Model(sentence_indices, X)
sequentialModel = Sequential(model.layers)
return sequentialModel
model = Classifier((maxLen,), word_to_vec_map, word_to_index, maxLen, num_activation)
...
model.fit(Y_train_indices, Z_train_oh, epochs=30, batch_size=32, shuffle=True)
# attempt 1
model.save('classificationTest.h5', True, True)
modelRNN = load_model(r'C:\Users\Alessio\classificationTest.h5')
# attempt 2
model.save_weights("myWeight.h5")
model = Classifier((maxLen,), word_to_vec_map, word_to_index, maxLen, num_activation)
model.load_weights(r'C:\Users\Alessio\myWeight.h5')
# PREDICTION TEST
code_train, category_train, category_code_train, text_train = read_csv_for_email(r'C:\Users\Alessio\Desktop\6Febbraio\2test.csv')
categories, code_categories = get_categories(r'C:\Users\Alessio\Desktop\6Febbraio\2test.csv')
X_my_sentences = text_train
Y_my_labels = category_code_train
X_test_indices = sentences_to_indices(X_my_sentences, word_to_index, maxLen)
pred = model.predict(X_test_indices)
def codeToCategory(categories, code_categories, current_code):
i = 0;
for code in code_categories:
if code == current_code:
return categories[i]
i = i + 1
return "no_one_find"
# result
for i in range(len(Y_my_labels)):
num = np.argmax(pred[i])
# Pretrained embedding layer
def pretrained_embedding_layer(word_to_vec_map, word_to_index, emb_dim):
"""
Creates a Keras Embedding() layer and loads in pre-trained GloVe 50-dimensional vectors.
Arguments:
word_to_vec_map -- dictionary mapping words to their GloVe vector representation.
word_to_index -- dictionary mapping from words to their indices in the vocabulary (400,001 words)
Returns:
embedding_layer -- pretrained layer Keras instance
"""
vocab_len = len(word_to_index) + 1 # adding 1 to fit Keras embedding (requirement)
### START CODE HERE ###
# Initialize the embedding matrix as a numpy array of zeros of shape (vocab_len, dimensions of word vectors = emb_dim)
emb_matrix = np.zeros((vocab_len, emb_dim))
# Set each row "index" of the embedding matrix to be the word vector representation of the "index"th word of the vocabulary
for word, index in word_to_index.items():
emb_matrix[index, :] = word_to_vec_map[word]
# Define Keras embedding layer with the correct output/input sizes, make it trainable. Use Embedding(...). Make sure to set trainable=False.
embedding_layer = Embedding(vocab_len, emb_dim)
### END CODE HERE ###
# Build the embedding layer, it is required before setting the weights of the embedding layer. Do not modify the "None".
embedding_layer.build((None,))
# Set the weights of the embedding layer to the embedding matrix. Your layer is now pretrained.
embedding_layer.set_weights([emb_matrix])
return embedding_layer
Do you have any kind of suggestion?
Thanks in Advance.
Edit1: if use the code of saving and loading in the same "page" (I'm using notebook jupyter) it works fine. If I change "page" it doesn't work. Could it be that there is something related with the tensorflow session?
Edit2: my final goal is to load a model, trained in Keras, with Deeplearning4J in java. So if you know a solution for "transforming" the keras model in something else readable in DL4J it will help anyway.
Edit3: add function pretrained_embedding_layer()
Edit4: dictionaries from word2Vec model read with gensim
from gensim.models import Word2Vec
model = Word2Vec.load('C:/Users/Alessio/Desktop/emoji_ita/embedding/glove_WIKI')
def getMyModels (model):
word_to_index = dict({})
index_to_word = dict({})
word_to_vec_map = dict({})
for idx, key in enumerate(model.wv.vocab):
word_to_index[key] = idx
index_to_word[idx] = key
word_to_vec_map[key] = model.wv[key]
return word_to_index, index_to_word, word_to_vec_map
Are you pre-processing your data in the same way when you load your model ?
And if yes, did you set the seed of your pre-processing functions ?
If you build a dictionnary with keras, are the sentences coming in the same order ?
I had the same problem before, so here is how you solve it. After making sure that the weights and summary are the same, try to print your random seed and check. If its value is changing from a session to another and if you tried tensorflow's seed, it means you need to disable the PYTHONHASHSEED environment variable. You can read more about it here:
https://docs.python.org/3/using/cmdline.html#envvar-PYTHONHASHSEED
To disable it, go to your system environment variables and add PYTHONHASHSEED as a new variable if it doesn't exist. Then, set its value to 0 to disable it. Please note that it was done in this way because it has to be disabled before running the interpreter.

Error when trying to output previous to last layer in Keras

I am trying to output the previous to last Dense layer in a keras model. I first load the model architecture and the weights:
base_model = applications.ResNet50(weights = None,
include_top = False,
input_shape = (image_size[0], image_size[1], nb_channels))
top_model = Sequential()
top_model.add(Flatten(input_shape=base_model.output_shape[1:]))
top_model.add(Dense(1024, init = 'glorot_uniform', activation='relu', name = 'last_layer_1024'))
top_model.add(Dropout(0.5))
top_model.add(Dense(nb_classes, activation = 'softmax', name = 'softmax_layer'))
top_model_tensor = top_model(base_model.output)
model = Model(inputs = base_model.input, outputs = top_model_tensor)
model.load_weights(weights_path)
Then I remove the last Dense layer by doing this:
model.layers[-1].pop()
#model.outputs = [model.layers[-1].layers[-1].output]
#model.layers[-1].layers[-1].outbound_nodes = []
If I uncomment the commented lines, I get this error: InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'flatten_1_input' with dtype float. If I keep them commented, the last dense layer is NOT effectively removed (by that I mean that when I call predict on model, I still get the output of the last dense layer). How can I solve this issue?
Also, if there is a different method to get the model to output the previous to last dense layer, I can take that as an answer too (instead of trying to fix this way of doing it).
Another solution that does not work is to just cut the long model after you load weights by simply doing this:
short_top_model = Model(top_model.input, top_model.get_layer('last_layer_1024').output)
You get the following error:
RuntimeError: Graph disconnected: cannot obtain value for tensor Tensor("flatten_1_input:0", shape=(?, 1, 1, 2048), dtype=float32, device=/device:GPU:2) at layer "flatten_1_input". The following previous layers were accessed without issue: []
Trying to cut models, change their inputs/outputs etc. sounds not what keras expects from users.
You should just create another model that follows the same path but ends earlier:
#do this "before" creating "top_model_tensor".
short_top = Model(
top_model.input,
top_model.get_layer('last_layer_1024').output
)
top_model_out = top_model(base_model.output)
short_top_out = short_top(base_model.output)
model = Model(base_model.input,top_model_out)
short_model = Model(base_model.input,short_top_out)
Choose which one to use depending on the expected results. Training one updates the other.
A shorter version of the above answer.
#again create connection between two model
feature_vec_model = Model(
top_model.input,
top_model.get_layer('last_layer_1024').output
)
feature_vec_model_output = feature_vec_model(base_model.output)
#Connection created
# Define final connected model & load pretrained weights
complete_feature_vec_model = Model(base_model.input,feature_vec_model_output)
complete_feature_vec_model.load_weights("path_to_model")

Categories

Resources