I was wondering whether TensorFlow lets me build a customized network structure in which the input layer has an extra connection to a hidden layer that is not adjacent to it. As an example, suppose I have the simple network shown below.
import numpy as np
import random
import tensorflow as tf
from tensorflow import keras
m = 200
n = 5
my_input= np.random.random([m,n])
my_output = np.random.random([m,1])
my_model = tf.keras.Sequential([
tf.keras.layers.Flatten(input_shape=(my_input.shape[1],)),
tf.keras.layers.Dense(32, activation='softmax'),
tf.keras.layers.Dense(32, activation='tanh'),
tf.keras.layers.Dense(32, activation='relu'),
tf.keras.layers.Dense(1)
])
my_model.compile(loss='mse',optimizer = tf.keras.optimizers.Adam(learning_rate=0.001))
res = my_model.fit(my_input, my_output, epochs=50, batch_size=1,verbose=0)
Is there a way for the input layer to have an extra connection to the third hidden layer, the one with the ReLU activation? I would also like each connection to have its own constraints. For example, for the connection coming from the previous layer, I'd like to use GlorotNormal weight initialization, and for the extra connection coming from the input layer, I'd like to use HeUniform initialization.
I tried to visualize what I have in mind below.
Use the Keras functional API and tf.concat:
import numpy as np
import random
import tensorflow as tf
from tensorflow import keras
m = 200
n = 5
my_input= np.random.random([m,n])
my_output = np.random.random([m,1])
inputs = tf.keras.layers.Input((my_input.shape[1],))
x = tf.keras.layers.Flatten()(inputs)
x = tf.keras.layers.Dense(32, activation='softmax')(x)
x = tf.keras.layers.Dense(32, activation='tanh', kernel_initializer=tf.keras.initializers.GlorotNormal())(x)
y = tf.keras.layers.Dense(my_input.shape[1], kernel_initializer=tf.keras.initializers.HeUniform())(inputs)
x = tf.keras.layers.Dense(32, activation='relu')(tf.concat([x, y], axis=1))
outputs = tf.keras.layers.Dense(1)(x)
my_model = tf.keras.Model(inputs, outputs)
dot_img_file = 'model_1.png'
tf.keras.utils.plot_model(my_model, to_file=dot_img_file, show_shapes=True)
my_model.compile(loss='mse',optimizer = tf.keras.optimizers.Adam(learning_rate=0.001))
res = my_model.fit(my_input, my_output, epochs=50, batch_size=1,verbose=0)
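If the goal is for the ReLU layer itself to receive two differently initialized sets of incoming weights, one possible reading of the question is to split that layer into two linear projections and sum them before the activation. The sketch below assumes that reading; the names `from_prev`, `from_input`, and `skip_model` are illustrative and do not come from the original post, and the epoch count and batch size are shortened only to keep the example quick.

```python
import numpy as np
import tensorflow as tf

m, n = 200, 5
my_input = np.random.random([m, n]).astype("float32")
my_output = np.random.random([m, 1]).astype("float32")

inputs = tf.keras.layers.Input(shape=(n,))
x = tf.keras.layers.Dense(32, activation="softmax")(inputs)
x = tf.keras.layers.Dense(32, activation="tanh")(x)

# Weights on the connection coming from the previous (tanh) layer.
from_prev = tf.keras.layers.Dense(
    32, kernel_initializer=tf.keras.initializers.GlorotNormal())(x)
# Weights on the extra connection coming straight from the input layer.
# The bias is dropped here so the combined layer keeps a single bias term.
from_input = tf.keras.layers.Dense(
    32, use_bias=False,
    kernel_initializer=tf.keras.initializers.HeUniform())(inputs)

# Sum both pre-activations, then apply the ReLU once.
x = tf.keras.layers.Add()([from_prev, from_input])
x = tf.keras.layers.Activation("relu")(x)
outputs = tf.keras.layers.Dense(1)(x)

skip_model = tf.keras.Model(inputs, outputs)
skip_model.compile(loss="mse",
                   optimizer=tf.keras.optimizers.Adam(learning_rate=0.001))
history = skip_model.fit(my_input, my_output, epochs=2, batch_size=32, verbose=0)
```

Summing the two projections is mathematically the same as one dense layer applied to the concatenation of both sources, except that each block of the weight matrix gets its own initializer.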
Related
I have a NN with this structure:
I need it to predict approximately 0 as the input goes to ∞.
To achieve this, I decided to add the model's prediction for a large input to the loss function (if there is another way, I'd be happy to hear it).
But I don't see a way to compute predictions inside my loss function.
Network code:
from keras.models import Model
from keras.layers import Input, Dense, Add
input_array = []
output_array = []
for i in range(14):
    input_layer = Input(shape=(1,))
    hidden1 = Dense(64, activation='relu')(input_layer)
    hidden2 = Dense(64, activation='relu')(hidden1)
    output_layer = Dense(1, activation='linear')(hidden2)
    input_array.append(input_layer)
    output_array.append(output_layer)
# merge input models
summation = Add()(output_array)
model = Model(inputs=input_array, outputs=summation)
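One way to make the model's own prediction available while the loss is computed is a custom training step with `tf.GradientTape`, where the model can be called a second time on a large input. This is only a sketch under assumptions: `LARGE_VALUE`, `PENALTY_WEIGHT`, the MSE data term, and the random batch are placeholders chosen for illustration, not values from the question.

```python
import numpy as np
import tensorflow as tf

n_inputs = 14
inputs, outputs = [], []
for _ in range(n_inputs):
    inp = tf.keras.layers.Input(shape=(1,))
    h = tf.keras.layers.Dense(64, activation="relu")(inp)
    h = tf.keras.layers.Dense(64, activation="relu")(h)
    outputs.append(tf.keras.layers.Dense(1, activation="linear")(h))
    inputs.append(inp)
model = tf.keras.Model(inputs=inputs, outputs=tf.keras.layers.Add()(outputs))

LARGE_VALUE = 1e3     # assumed stand-in for "input -> infinity"
PENALTY_WEIGHT = 0.1  # assumed weight of the extra loss term
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)
mse = tf.keras.losses.MeanSquaredError()

def train_step(x_list, y):
    # One synthetic sample in which every input takes the large value.
    far_inputs = [tf.fill([1, 1], LARGE_VALUE) for _ in x_list]
    with tf.GradientTape() as tape:
        pred = model(x_list, training=True)
        far_pred = model(far_inputs, training=True)
        # Data term plus a penalty pushing the far-field prediction toward 0.
        loss = mse(y, pred) + PENALTY_WEIGHT * tf.reduce_mean(tf.square(far_pred))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

x_batch = [tf.constant(np.random.rand(8, 1), dtype=tf.float32)
           for _ in range(n_inputs)]
y_batch = tf.constant(np.random.rand(8, 1), dtype=tf.float32)
loss_value = train_step(x_batch, y_batch)
```

Penalizing a single large point only encourages small outputs near that point, so in practice several large values may need to be sampled.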
I'm creating an Ensemble of Vgg19, DenseNet, and EfficientNetB1.
The code is as follows:
IMAGE_SIZE = (224,224,3)
import tensorflow as tf
vgg19 = tf.keras.applications.vgg19.VGG19(
    input_shape=IMAGE_SIZE, weights='imagenet', include_top=False)
for layer in vgg19.layers:
    layer._name = layer._name + str('_19')
    layer.trainable = False
effnetb1 = tf.keras.applications.efficientnet.EfficientNetB1(
    include_top=False, weights='imagenet', input_shape=IMAGE_SIZE)
for layer in effnetb1.layers:
    layer._name = layer._name + str('_B1')
    layer.trainable = False
densenet = tf.keras.applications.densenet.DenseNet121(
    include_top=False, weights="imagenet", input_shape=IMAGE_SIZE)
for layer in densenet.layers:
    layer._name = layer._name + str('_Dense')
    layer.trainable = False
from keras.layers import Input, Flatten, Concatenate, Dense, Average, Dropout
inp = Input(IMAGE_SIZE)
vgg19_x = Flatten()(vgg19(inp))
vgg19_x = Dense(256, activation='relu')(vgg19_x)
effnet_x = Flatten()(effnetb1(inp))
effnet_x = Dense(256, activation='relu')(effnet_x)
densenet_x = Flatten()(densenet(inp))
densenet_x = Dense(256, activation='relu')(densenet_x)
from keras.models import Model
x = Concatenate()([vgg19_x, effnet_x, densenet_x])
x = Dense(128, activation='relu')(x)
x = Dropout(0.30)(x)
x = Dense(64, activation='relu')(x)
out = Dense(2, activation='softmax')(x)
model = Model(inputs = inp, outputs = out)
model.compile(
loss='categorical_crossentropy',
optimizer=tf.keras.optimizers.Adam(
learning_rate=0.0005,
name="Adam"),
metrics=['accuracy']
)
model.summary()
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
checkpointer = ModelCheckpoint(filepath="/content/drive/MyDrive/ensemble/ensemble-weights.hdf5", verbose=1, save_best_only=True)
r = model.fit(
training_set,
validation_data=test_set,
epochs=30,
steps_per_epoch=len(training_set),
validation_steps=len(test_set),
callbacks = [checkpointer]
)
The code runs fine and training completes successfully when I'm not using the callback. But when I use a ModelCheckpoint, I get the following error after the first epoch:
ValueError: The target structure is of type `<class 'keras.engine.keras_tensor.KerasTensor'>`
KerasTensor(type_spec=TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name='input_5'), name=...
However, the input structure is a sequence (<class 'list'>) of length 0.
[]
nest cannot guarantee that it is safe to map one to the other.
Can anyone tell me what's wrong here? Also, is it because I'm concatenating three models?
Your help will be appreciated. Thank you!
I also ran into this issue while trying to implement a nested model (which is what you end up with here once you build the concatenated model).
The issue seems to be that Keras cannot handle the inputs and outputs of nested models in newer TensorFlow versions (TF 2.0 and above). Depending on the version you are on, you may need to refer explicitly to the inputs/outputs of the nested model you are using. In TF 2.6, what seems to work is to define a separate model for each part, i.e. the common layers added after the concatenation should also be wrapped in a model, as below (taken from here):
#Make GradCAM heatmap following the Keras tutorial.
last_conv_layer = model.layers[-4].layers[-1]
last_conv_layer_model = keras.Model(model.layers[-4].inputs, last_conv_layer.output)
# Second, we create a model that maps the activations of the last conv
# layer to the final class predictions
classifier_input = keras.Input(shape=last_conv_layer.output.shape[1:])
x = classifier_input
for layer in model.layers[-3:]:
    x = layer(x)
classifier_model = keras.Model(classifier_input, x)
#Preparing the image with the preprocessing layers
preprocess_layers = keras.Model(model.inputs, model.layers[-5].output)
img_array = preprocess_layers(prepared_image)
# Then, we compute the gradient of the top predicted class for our input image
# with respect to the activations of the last conv layer
with tf.GradientTape() as tape:
    # Compute activations of the last conv layer and make the tape watch it
    last_conv_layer_output = last_conv_layer_model(img_array)
    tape.watch(last_conv_layer_output)
    # Compute class predictions
    preds = classifier_model(last_conv_layer_output)
    top_pred_index = tf.argmax(preds[0])
    top_class_channel = preds[:, top_pred_index]
# This is the gradient of the top predicted class with regard to
# the output feature map of the last conv layer
grads = tape.gradient(top_class_channel, last_conv_layer_output)
You can also check the following github issues (they are not very related, but deal with a similar problem) - issue1, issue2, issue3
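Since the error only appears once ModelCheckpoint is added, another workaround worth trying is to checkpoint only the weights, which avoids serializing the nested architecture at all. This is an assumption rather than something confirmed in this thread; the sketch uses a tiny stand-in model (`tiny_model`) and a temporary path instead of the ensemble and the Drive path from the question.

```python
import os
import tempfile

import numpy as np
import tensorflow as tf

# Tiny stand-in model; in the question this would be the ensemble `model`.
inp = tf.keras.layers.Input(shape=(4,))
out = tf.keras.layers.Dense(2, activation="softmax")(inp)
tiny_model = tf.keras.Model(inp, out)
tiny_model.compile(loss="categorical_crossentropy", optimizer="adam")

x = np.random.random((16, 4)).astype("float32")
y = tf.keras.utils.to_categorical(np.random.randint(2, size=16), num_classes=2)

# Hypothetical path; the ".weights.h5" suffix is chosen so that newer Keras
# releases also accept the file name.
ckpt_path = os.path.join(tempfile.mkdtemp(), "ensemble.weights.h5")
checkpointer = tf.keras.callbacks.ModelCheckpoint(
    filepath=ckpt_path,
    save_weights_only=True,  # store weights only, not the model config
    save_best_only=True,
    verbose=0,
)

tiny_model.fit(x, y, validation_data=(x, y), epochs=1, verbose=0,
               callbacks=[checkpointer])

# To restore later, rebuild the same architecture in code, then load weights.
tiny_model.load_weights(ckpt_path)
```

The trade-off is that the architecture must be rebuilt in code before the weights can be loaded.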
I am trying to implement a network in Keras for a symmetric problem: a model that predicts the distance between inputs a and b.
I used two official references (1 and 2) to create the following simple implementation:
from __future__ import absolute_import
from __future__ import print_function
import keras
from keras.models import Model
from keras.layers import Input, Flatten, Dense, Dropout
import numpy as np
def create_base_network(input_shape):
    '''Base network to be shared (eq. to feature extraction).
    '''
    input = Input(shape=input_shape)
    x = Flatten()(input)
    x = Dense(128, activation='relu')(x)
    x = Dropout(0.1)(x)
    x = Dense(128, activation='relu')(x)
    x = Dropout(0.1)(x)
    x = Dense(128, activation='relu')(x)
    return Model(input, x)
num_of_features = 256
num_of_samples = 280
data_a = np.random.random((num_of_samples, 1, num_of_features))
data_b = np.random.random((num_of_samples, 1, num_of_features))
# binary label
labels = np.random.randint(2, size=num_of_samples)
input_shape= (1, num_of_features)
# network definition
base_network = create_base_network(input_shape)
input_a = Input(shape=input_shape)
input_b = Input(shape=input_shape)
# because we re-use the same instance `base_network`, the weights of the network
# will be shared across the two branches
encoded_a = base_network(input_a)
encoded_b = base_network(input_b)
# We can then concatenate the two vectors:
merged_vector = keras.layers.concatenate([encoded_a, encoded_b], axis=-1)
# And add a logistic regression on top
predictions = Dense(1, activation='sigmoid')(merged_vector)
# We define a trainable model linking the inputs to the predictions
model = Model(inputs=[input_a, input_b], outputs=predictions)
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
model.fit([data_a, data_b], labels, epochs=10)
Yet, when I use model evaluation, I get a different metric value over the test set when I switch between a and b:
data_a_test = np.random.random((num_of_samples, 1, num_of_features))
data_b_test = np.random.random((num_of_samples, 1, num_of_features))
labels_test = np.random.randint(2, size=num_of_samples)
loss_ab, metric_ab = model.evaluate([data_a_test, data_b_test], labels_test, batch_size=32, verbose=2)
loss_ba, metric_ba = model.evaluate([data_b_test, data_a_test], labels_test, batch_size=32, verbose=2)
loss_ab: 0.9805058070591518 metric_ab: 0.48928571343421934
loss_ba: 1.0541694641113282 metric_ba: 0.5
What am I missing here?
Any input would be appreciated.
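A possible explanation, offered here as a hedged sketch and not as a confirmed diagnosis: sharing `base_network` makes the two encodings symmetric, but the dense head applied to the concatenation has separate weights for the first and second half, so swapping a and b changes the result. The NumPy example below uses random stand-in encodings and head weights (`w_concat` and `w_sym` are made up for illustration) to compare concatenation with an order-invariant merge such as the absolute difference.

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the shared encoder's outputs on inputs a and b.
enc_a = rng.random((4, 8))
enc_b = rng.random((4, 8))

# Random head weights: one head for concatenation, one for |a - b|.
w_concat = rng.normal(size=(16, 1))
w_sym = rng.normal(size=(8, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def head_concat(u, v):
    # Dense layer on [u, v]: the first and second half use different weights.
    return sigmoid(np.concatenate([u, v], axis=-1) @ w_concat)

def head_symmetric(u, v):
    # Dense layer on |u - v|: swapping u and v leaves the input unchanged.
    return sigmoid(np.abs(u - v) @ w_sym)

concat_is_symmetric = np.allclose(head_concat(enc_a, enc_b),
                                  head_concat(enc_b, enc_a))
abs_diff_is_symmetric = np.allclose(head_symmetric(enc_a, enc_b),
                                    head_symmetric(enc_b, enc_a))
print(concat_is_symmetric, abs_diff_is_symmetric)
```

In Keras the same idea would replace the concatenation with a merge that does not depend on argument order, for example an element-wise absolute difference of `encoded_a` and `encoded_b`.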
I was wondering if it would be possible to extract the last cell state of an LSTM in Keras after training the model. For example, in this simple LSTM model:
number_of_dimensions = 128
number_of_examples = 123456
input_ = Input(shape = (10,100,))
lstm, hidden, cell = CuDNNLSTM(units = number_of_dimensions, return_state=True)(input_)
dense = Dense(num_of_classes, activation='softmax')(lstm)
model = Model(inputs = input_, outputs = dense)
parallel_model = multi_gpu_model(model, gpus=2)
parallel_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['acc'])
# fit the model
parallel_model.fit(X1, onehot_encoded, epochs=100, verbose=1, batch_size = 128, validation_split = 0.2)
I tried printing 'cell' but the result was
tf.Tensor 'cu_dnnlstm_2/strided_slice_17:0' shape=(?, 128) dtype=float32
I would like to get the cell state as a NumPy array of shape (number_of_examples, number_of_dimensions), i.e. (123456, 128). Is it possible to do this in Keras?
Thank you!
Assuming that you are using TensorFlow as a backend (with the TensorFlow 1.x-style session API that K.get_session() relies on), you can evaluate the cell tensor directly within the TensorFlow session. For example:
from keras.layers import LSTM, Input, Dense
from keras.models import Model
import keras.backend as K
import numpy as np
number_of_dimensions = 128
number_of_examples = 123456
input_ = Input(shape=(10, 100,))
lstm, hidden, cell = LSTM(units=number_of_dimensions, return_state=True)(input_)
dense = Dense(10, activation='softmax')(lstm)
model = Model(inputs=input_, outputs=dense)
with K.get_session() as sess:
    x = np.zeros((number_of_examples, 10, 100))
    cell_state = sess.run(cell, feed_dict={input_: x})
    print(cell_state.shape)
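On TensorFlow 2.x, where sessions are no longer the default, a common alternative is to build a second model that shares the same layers but outputs the cell state, and call `predict` on it. The sketch below assumes `tf.keras`, a plain `LSTM` layer, and a small zero-filled batch of 32 examples in place of the real data; `state_model` is an illustrative name.

```python
import numpy as np
import tensorflow as tf

number_of_dimensions = 128

inputs = tf.keras.layers.Input(shape=(10, 100))
lstm_out, hidden_state, cell_state = tf.keras.layers.LSTM(
    units=number_of_dimensions, return_state=True)(inputs)
outputs = tf.keras.layers.Dense(10, activation="softmax")(lstm_out)

# Model that would be compiled and trained as usual.
model = tf.keras.Model(inputs=inputs, outputs=outputs)

# Second model sharing the same layers and weights, but returning the
# last cell state instead of the class probabilities.
state_model = tf.keras.Model(inputs=inputs, outputs=cell_state)

x = np.zeros((32, 10, 100), dtype="float32")
cell_values = state_model.predict(x, verbose=0)
print(cell_values.shape)  # (32, 128)
```

Because both models share layers, training `model` also updates what `state_model` returns.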
An option that you might be interested in is to save model weights to hdf5 file:
model.save_weights('my_model_weights.h5')
(ref: https://keras.io/getting-started/faq/#how-can-i-save-a-keras-model)
Then use an HDF viewer such as the Java HDFView package: https://support.hdfgroup.org/products/java/hdfview/
I believe you can then export the data to CSV and import it into NumPy, for example.
I'm new to ConvNets and Python and want to implement the following:
I want to use the pretrained vgg16 model and add 3 fully connected layers after it with an L2-Normalization at the end.
So Data->VGG16->FC (1x4096)->FC (1x4096)->FC (1x3)->L2-Norm->Output
The first and second FC layers produce a 1x4096 array, and the last FC layer produces a 1x3 array, on which the L2 normalization is performed.
Can anyone give me a hint on how to do that?
I found that I can load the model like this:
model_vgg19 = models.vgg19(pretrained=True)
But how can I add the FC layers and the L2 normalization after that? And how can I pass test data through the model?
I'm quoting an example mentioned in Keras#3465.
In Keras, if you pass include_top=False when loading a pre-trained model, the final classification layers are not included. You can then add your custom FC layers at the end, as shown in the example below:
# imports assumed by this example
from keras.applications.vgg16 import VGG16
from keras.layers import Flatten, Dense, Dropout, BatchNormalization
from keras.models import Model
# load VGG16 without its dense layers (channels-last dim ordering)
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))
# number of classes in your dataset, e.g. 20
num_classes = 20
x = Flatten()(base_model.output)
x = Dense(4096, activation='relu')(x)
x = Dropout(0.5)(x)
x = BatchNormalization()(x)
predictions = Dense(num_classes, activation='softmax')(x)
# create the graph of your new model
head_model = Model(inputs=base_model.input, outputs=predictions)
# compile the model
head_model.compile(optimizer='rmsprop', loss='categorical_crossentropy', metrics=['accuracy'])
head_model.summary()
.
.
.
#train your model on data
head_model.fit(x, y, batch_size = batch_size, verbose = 1)
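For the layout asked about in the question (two 4096-unit FC layers, a 3-unit FC layer, then L2 normalization), a Keras version might look like the sketch below. It is written under assumptions: `weights=None` and a 64x64 input keep the example light and avoid downloading pretrained weights, the ReLU activations are a guess, and the L2 step uses `tf.math.l2_normalize` inside a `Lambda` layer; `l2_model` is an illustrative name.

```python
import numpy as np
import tensorflow as tf

# Assumption: weights=None and a small input keep this sketch light;
# use weights='imagenet' and 224x224 inputs for the pretrained network.
base = tf.keras.applications.VGG16(
    weights=None, include_top=False, input_shape=(64, 64, 3))

x = tf.keras.layers.Flatten()(base.output)
x = tf.keras.layers.Dense(4096, activation="relu")(x)  # FC (1x4096)
x = tf.keras.layers.Dense(4096, activation="relu")(x)  # FC (1x4096)
x = tf.keras.layers.Dense(3)(x)                        # FC (1x3)
# Scale each 3-vector to unit Euclidean length.
outputs = tf.keras.layers.Lambda(
    lambda t: tf.math.l2_normalize(t, axis=-1))(x)

l2_model = tf.keras.Model(inputs=base.input, outputs=outputs)

# Passing test data through the model: predict on a batch of images.
batch = np.random.random((2, 64, 64, 3)).astype("float32")
embeddings = l2_model.predict(batch, verbose=0)
norms = np.linalg.norm(embeddings, axis=1)
print(embeddings.shape, norms)
```

Each row of `embeddings` should have a norm close to 1, apart from the degenerate case of an all-zero vector.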