I am trying to visualize the architecture of my neural network (see the code below). I would like a diagram similar to the example image, but I have not managed to produce one. Which package should I use, or can anyone illustrate what my network looks like?
This is the code for my network:
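from tensorflow.keras.layers import (Input, BatchNormalization, Flatten, Dense,
                                     LSTM, TimeDistributed, concatenate)
from tensorflow.keras.models import Model
new_img_size = 128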
nbr_img = 3
delta_t = 10
min_pred = 10
image1 = Input(shape=(new_img_size, new_img_size, 3))
image2 = Input(shape=(new_img_size, new_img_size, 3))
y1 = BatchNormalization()(image1)
y1 = Flatten()(y1)
y1 = Dense(1024, activation='relu')(y1)
cnn1 = Model(inputs=image1, outputs=y1)
input_sequence1 = Input(shape=(nbr_img, new_img_size, new_img_size, 3))
lstm1 = TimeDistributed(cnn1)(input_sequence1)
lstm1 = LSTM(1024, activation='relu', return_sequences=False)(lstm1)
lstm1 = Dense(48, activation='relu')(lstm1)
y2 = BatchNormalization()(image2)
y2 = Flatten()(y2)
y2 = Dense(1024, activation='relu')(y2)
cnn2 = Model(inputs=image2, outputs=y2)
input_sequence2 = Input(shape=(nbr_img, new_img_size, new_img_size, 3))
lstm2 = TimeDistributed(cnn2)(input_sequence2)
lstm2 = LSTM(1024, activation='relu', return_sequences=False)(lstm2)
lstm2 = Dense(48, activation='relu')(lstm2)
merged = concatenate([lstm1, lstm2])
mlp = Dense(96, activation='relu')(merged)
mlp = Dense(48, activation='relu')(mlp)  # chain onto the previous Dense layer instead of reusing `merged`
mlp = Dense(int(min_pred/delta_t), activation='linear')(mlp)
model = Model(inputs=[input_sequence1, input_sequence2], outputs=mlp)
model.compile(optimizer="Adam", loss='mse', metrics=['mae'])
Thank you for your help
EDIT
I have tried both tf.keras.utils.plot_model and Netron, and both give me this.
I do find this useful, but since my model has a TimeDistributed layer, I would like to see it in my plot as well. I don't want to see just the name "TimeDistributed"; I would like to see how this layer creates separate CNN layers for each input image.
You can use plot_model to programmatically visualize the architecture https://www.tensorflow.org/api_docs/python/tf/keras/utils/plot_model
tf.keras.utils.plot_model(
model, to_file='model.png', show_shapes=False, show_dtype=False,
show_layer_names=True, rankdir='TB', expand_nested=False, dpi=96,
layer_range=None
)
Or you can use Netron to visualize the model, including both its architecture and its weights.
ref: https://github.com/lutzroeder/netron
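If neither tool draws the per-image copies that TimeDistributed implies, one possible workaround is to describe the unrolled diagram by hand in Graphviz DOT. The sketch below is only an illustration built from the layer sizes in the question: the helper names (unrolled_branch, model_dot) and all node labels are made up here, and the graph is written out manually rather than read from the Keras model.

```python
def unrolled_branch(branch, steps=3):
    """DOT lines for one TimeDistributed(CNN) -> LSTM -> Dense branch."""
    lines = []
    for t in range(steps):
        img, cnn = f"{branch}_img{t}", f"{branch}_cnn{t}"
        lines.append(f'  {img} [label="image t={t}\\n(128, 128, 3)"];')
        # Every time step passes through the same sub-model, so weights are shared.
        lines.append(f'  {cnn} [label="BatchNorm, Flatten, Dense 1024\\n(shared weights)"];')
        lines.append(f"  {img} -> {cnn};")
        lines.append(f"  {cnn} -> {branch}_lstm;")
    lines.append(f'  {branch}_lstm [label="LSTM 1024"];')
    lines.append(f'  {branch}_dense [label="Dense 48"];')
    lines.append(f"  {branch}_lstm -> {branch}_dense;")
    return lines


def model_dot(steps=3):
    """Return DOT source for the two-branch model with TimeDistributed unrolled."""
    lines = ["digraph model {", "  rankdir=TB;", "  node [shape=box];"]
    for branch in ("b1", "b2"):
        lines += unrolled_branch(branch, steps)
        lines.append(f"  {branch}_dense -> concat;")
    lines += [
        '  concat [label="concatenate (96)"];',
        '  mlp [label="Dense layers (MLP head)"];',
        '  out [label="Dense 1 (linear)"];',
        "  concat -> mlp;",
        "  mlp -> out;",
        "}",
    ]
    return "\n".join(lines)


dot_source = model_dot(steps=3)
print(dot_source)
```

Saving the printed text to a file such as model.dot and running the Graphviz command `dot -Tpng model.dot -o model.png` should render one CNN box per input image feeding each LSTM.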
Related
I have been struggling to create an automatic speech recognition neural network using TensorFlow, trained on the Hugging Face Mozilla Common Voice 11 dataset. The model seems to train well for around 100 batches before the loss suddenly goes to infinity.
Here is the code for the data preprocessing:
dataset = datasets.load_dataset("mozilla-foundation/common_voice_11_0", "en")
dataset = dataset.remove_columns(['client_id', 'audio', 'up_votes', 'down_votes', 'age', 'gender', 'accent', 'locale', 'segment'])
def prepare_dataset(batch):
    wav_file = batch['path']
    # Remove file name
    split = wav_file.split("\\")
    joined = "\\".join(split[:-1]) + "\\"
    # Get the train number
    complete_path = glob.glob(joined + "*")
    # Combine all the parts
    file = complete_path[0] + "\\" + split[-1]
    batch['path'] = file
    return batch
train_dataset = dataset['train'].map(prepare_dataset).shuffle(len(dataset['train']))
val_dataset = dataset['validation'].map(prepare_dataset).shuffle(len(dataset['validation']))
frame_length = 256
frame_step = 160
fft_length = 384
def load_mp3(wav_file):
    audio = tfio.audio.AudioIOTensor(wav_file, dtype=tf.float32)
    sample_rate = tf.cast(audio.rate, dtype=tf.int64)
    audio = tf.squeeze(audio.to_tensor())
    audio = tfio.audio.resample(audio, rate_in=sample_rate, rate_out=8000)
    audio = tfio.audio.fade(audio, fade_in=1000, fade_out=2000, mode="logarithmic")
    return audio
def convert_to_spect(audio):
    spectrogram = tf.signal.stft(
        audio, frame_length=frame_length, frame_step=frame_step, fft_length=fft_length
    )
    spectrogram = tf.abs(spectrogram)
    spectrogram = tf.math.pow(spectrogram, 0.5)
    spectrogram = tfio.audio.freq_mask(spectrogram, param=25)
    spectrogram = tfio.audio.time_mask(spectrogram, param=25)
    spectrogram = tfio.audio.freq_mask(spectrogram, param=25)
    spectrogram = tfio.audio.time_mask(spectrogram, param=25)
    means = tf.math.reduce_mean(spectrogram, 1, keepdims=True)
    stddevs = tf.math.reduce_std(spectrogram, 1, keepdims=True)
    spectrogram = (spectrogram - means) / (stddevs + 1e-10)
    return spectrogram
def process_text(label):
    label = tf.strings.lower(label)
    label = tf.strings.unicode_split(label, input_encoding="UTF-8")
    label = char_to_num(label)
    return label
def encode_mozilla_sample(wav_file, label):
    audio = load_mp3(wav_file)
    spectrogram = convert_to_spect(audio)
    label = process_text(label)
    return spectrogram, label
And here is the code for the model:
def CTCLoss(y_true, y_pred):
    # Compute the training-time loss value
    batch_len = tf.cast(tf.shape(y_true)[0], dtype="int64")
    input_length = tf.cast(tf.shape(y_pred)[1], dtype="int64")
    label_length = tf.cast(tf.shape(y_true)[1], dtype="int64")
    input_length = input_length * tf.ones(shape=(batch_len, 1), dtype="int64")
    label_length = label_length * tf.ones(shape=(batch_len, 1), dtype="int64")
    loss = tf.keras.backend.ctc_batch_cost(y_true, y_pred, input_length, label_length)
    return loss
def build_model(input_dim, output_dim, rnn_layers=5, conv_units=128, rnn_units=128, dropout=0.5):
    input_spectrogram = tf.keras.layers.Input((None, input_dim), name="input")
    x = tf.keras.layers.Reshape((-1, input_dim, 1), name="expand_dim")(input_spectrogram)
    # Conv layers
    x = tf.keras.layers.Conv2D(
        filters=conv_units,
        kernel_size=[11, 41],
        strides=[2, 2],
        padding="same",
        use_bias=False,
        name="conv_1",
    )(x)
    x = tf.keras.layers.BatchNormalization(name="conv_1_bn")(x)
    x = tf.keras.layers.ReLU(name="conv_1_relu")(x)
    x = tf.keras.layers.Conv2D(
        filters=conv_units,
        kernel_size=[11, 21],
        strides=[1, 2],
        padding="same",
        use_bias=False,
        name="conv_2",
    )(x)
    x = tf.keras.layers.BatchNormalization(name="conv_2_bn")(x)
    x = tf.keras.layers.ReLU(name="conv_2_relu")(x)
    x = tf.keras.layers.Reshape((-1, x.shape[-2] * x.shape[-1]))(x)
    # RNN layers
    for i in range(1, rnn_layers + 1):
        recurrent = tf.keras.layers.GRU(
            units=rnn_units,
            activation="tanh",
            recurrent_activation="sigmoid",
            use_bias=True,
            return_sequences=True,
            reset_after=True,
            name=f"gru_{i}",
        )
        x = tf.keras.layers.Bidirectional(
            recurrent, name=f"bidirectional_{i}", merge_mode="concat"
        )(x)
        x = tf.keras.layers.BatchNormalization(name=f"rnn_{i}_bn")(x)
        if i < rnn_layers:
            x = tf.keras.layers.Dropout(rate=dropout)(x)
    # Dense layer
    x = tf.keras.layers.Dense(units=rnn_units * 2, activation="gelu", name="dense_1")(x)
    x = tf.keras.layers.Dropout(rate=dropout)(x)
    # Classification layer
    output = tf.keras.layers.Dense(units=output_dim + 1, activation="softmax", name="output_layer")(x)
    # Model
    model = tf.keras.Model(input_spectrogram, output, name="DeepSpeech_2")
    # Optimizer
    opt = tf.keras.optimizers.Adam(learning_rate=0.01)
    # Compile the model and return
    model.compile(optimizer=opt, loss=CTCLoss)
    return model
# Get the model
model = build_model(
    input_dim=fft_length // 2 + 1,
    output_dim=char_to_num.vocabulary_size(),
    rnn_units=32,
    conv_units=32,
    rnn_layers=5,
    dropout=0.5
)
Versions:
tensorflow: 2.10.1
python: 3.9.12
gpu: Nvidia GeForce RTX 3080
OS: Windows 11
cuDNN: 8.1
CUDA: 11.2
I have tried increasing the batch size, expecting the model to generalize better, but any batch size of 256 or higher caused the GPU to run out of memory. The infinite loss occurs with any batch size of 128 or less. I have also tried increasing the batch size while using less data, but the result is the same. I thought that reducing the neural network size would help solve the problem, but no matter what, the loss goes to infinity after reaching a value of around 200. A few other changes I have tried are activation functions (ReLU, LeakyReLU, GELU), optimizers (SGD, Adam, AdamW), and the number of RNN/conv layers.
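One general property of CTC that may be worth ruling out here (it is not confirmed as the cause in this thread) is that the loss becomes infinite whenever a sample has fewer output time steps than its label needs, because no valid alignment exists. The sketch below estimates that with plain arithmetic, assuming the question's STFT settings (frame_length=256, frame_step=160, no end padding) and a total time stride of 2 from conv_1; the function names are hypothetical.

```python
import math


def stft_frames(num_samples, frame_length=256, frame_step=160):
    """Frames produced by an STFT without end padding."""
    if num_samples < frame_length:
        return 0
    return 1 + (num_samples - frame_length) // frame_step


def min_ctc_steps(label):
    """CTC needs one step per symbol plus a blank between repeated symbols."""
    repeats = sum(1 for a, b in zip(label, label[1:]) if a == b)
    return len(label) + repeats


def ctc_feasible(num_samples, label, time_stride=2):
    """True if the model emits enough time steps to align the label."""
    # "same" padding with stride s yields ceil(frames / s) steps.
    steps = math.ceil(stft_frames(num_samples) / time_stride)
    return steps >= min_ctc_steps(label)


# One second of 8 kHz audio against a short and a long transcript.
print(ctc_feasible(8000, "hello"))
print(ctc_feasible(8000, "abcdefghijklmnopqrstuvwxyzabcd"))
```

Running a check like this over the dataset and dropping the clips that fail it is one way to test whether short clips with long transcripts are involved.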
Note: I have considered using a pretrained model but I have always wanted to successfully create ASR from scratch using tensorflow. Will it even be possible to get even moderately acceptable results using my GPU and data or will I have to resort to using wav2vec?
Another note: I was first inspired to create this project after watching the video https://www.youtube.com/watch?v=YereI6Gn3bM by "The A.I. Hacker - Michael Phi", who first convinced me that this was possible. Before that, I had thought that my computer would not be able to handle this task, but after seeing him do it with PyTorch, similar computer specs, and the same data, I thought that I would be able to do so as well.
Update:
I have recently tried replacing the 2D Conv layers with a single 1D Conv layer, making the GRU layer not bidirectional, and going back to the AdamW optimizer but nothing has changed.
Thanks for the solution. I just changed the number of neurons in the second to last dense layer to 512 and the model is currently running without error. Now I am just going to have to figure out how to improve the model so I can finally wrap up this project.
I want to make a model like the below picture. (simplified)
So, practically, I want the weights with the same names to always have the same values during training. What I did was the code below:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
example_train_features = np.arange(12000).reshape(1000, 12)
example_lanbels = np.random.randint(2, size=1000) #these data are just for illustration purposes
train_ds = tf.data.Dataset.from_tensor_slices((example_train_features, example_lanbels)).shuffle(buffer_size = 1000).batch(32)
dense1 = layers.Dense(1, activation="relu") #input shape:4
dense2 = layers.Dense(2, activation="relu") #input shape:1
dense3 = layers.Dense(1, activation="sigmoid") #input shape:6
feature_input = keras.Input(shape=(12,), name="features")
nodes_list = []
for i in range(3):
    first_lvl_input = feature_input[i :: 4] ######## marked line
    out1 = dense1(first_lvl_input)
    out2 = dense2(out1)
    nodes_list.append(out2)
joined = layers.concatenate(nodes_list)
final_output = dense3(joined)
model = keras.Model(inputs = feature_input, outputs = final_output, name="extrema_model")
compile_and_fit(model, train_ds, val_ds, patience=4)
model.compile(loss = tf.keras.losses.BinaryCrossentropy(),
optimizer = tf.keras.optimizers.RMSprop(),
metrics=keras.metrics.BinaryAccuracy())
history = model.fit(train_ds, epochs=10, validation_data=val_ds)
But when I try to run this code I get this error:
MklConcatOp : Dimensions of inputs should match: shape[0][0]= 71 vs. shape[18][0] = 70
[[node extrema_model/concatenate_2/concat (defined at <ipython-input-373-5efb41d312df>:398) ]] [Op:__inference_train_function_15338]
(Please ignore the exact numbers; they come from my real code.) I think this happens because the model receives the whole dataset, including the labels, as input, but shouldn't Keras feed only the features? Anyway, if I write the marked line as below:
first_lvl_input = feature_input[i :12: 4]
it no longer gives me the above error. But then I get another error; I know why it happens, but I don't know how to resolve it.
InvalidArgumentError: Incompatible shapes: [4,1] vs. [32,1]
[[node gradient_tape/binary_crossentropy/logistic_loss/mul/BroadcastGradientArgs
(defined at <ipython-input-1-b82546367b3c>:398) ]] [Op:__inference_train_function_6098]
This is because Keras is again feeding the whole batch array. The Keras documentation says you should not specify the batch dimension because Keras handles it itself, so I expected Keras to feed the samples one by one, which is what my code needs in order to work. I would appreciate any ideas on how to resolve this, or on how to write code that does what I want. Thanks.
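A note on the slicing in the marked line: a Keras input tensor keeps the batch as its first axis, so an index such as [i::4] selects samples, not features. The NumPy sketch below (an illustration with a made-up batch of 32 samples, not the asker's data) shows the difference; the analogous feature slice in the model would presumably be feature_input[:, i::4].

```python
import numpy as np

# A stand-in batch: 32 samples with 12 features each.
batch = np.arange(32 * 12).reshape(32, 12)

rows = batch[0::4]        # slices axis 0: every 4th SAMPLE
few_rows = batch[0:12:4]  # still axis 0: samples 0, 4 and 8
cols = batch[:, 0::4]     # slices axis 1: every 4th FEATURE of each sample

print(rows.shape, few_rows.shape, cols.shape)
```

Only the last form keeps all 32 samples, which is what the loss needs in order to match the 32 labels in the batch.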
You can wrap the dense layers in a TimeDistributed wrapper and reshape your data to three dimensions, (1000, 3, 4), i.e. (batch, sequence, feature). Each of the 3 time steps then replaces one iteration of your for loop, and the four features at every step are multiplied by the same weights.
example_train_features = np.arange(12000).reshape(1000, 3, 4 )
example_lanbels = np.random.randint(2, size=1000) #these data are just for illustration purposes
train_ds = tf.data.Dataset.from_tensor_slices((example_train_features, example_lanbels)).shuffle(buffer_size = 1000).batch(32)
dense1 = layers.TimeDistributed(layers.Dense(1, activation="relu")) #input shape:4
dense2 =layers.TimeDistributed(layers.Dense(2, activation="relu")) #input shape:1
dense3 = layers.Dense(1, activation="sigmoid") #input shape:6
feature_input = keras.Input(shape=(3,4), name="features")
out1 = dense1(feature_input)
out2 = dense2(out1)
z = layers.Flatten()(out2)
final_output = dense3(z)
model = keras.Model(inputs = feature_input, outputs = final_output, name="extrema_model")
model.compile(loss = tf.keras.losses.BinaryCrossentropy(),
optimizer = tf.keras.optimizers.RMSprop(),
metrics=keras.metrics.BinaryAccuracy())
history = model.fit(train_ds, epochs=10)
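To see why this shares the weights, the NumPy sketch below compares the two views on random data: one Dense(1) kernel applied to the whole (batch, 3, 4) array at once, and the same kernel applied step by step in a loop. The array sizes and the variable names are assumptions chosen for illustration; it mimics what TimeDistributed(Dense(1, activation="relu")) computes rather than calling Keras.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 3, 4))   # (batch, steps, features)
w = rng.normal(size=(4, 1))      # a single Dense(1) kernel
b = rng.normal(size=(1,))        # and its bias


def relu(v):
    return np.maximum(v, 0.0)


# TimeDistributed view: the same kernel is broadcast over the step axis.
td = relu(x @ w + b)             # shape (5, 3, 1)

# Loop view: apply the same kernel to each step separately, then stack.
loop = np.stack([relu(x[:, t, :] @ w + b) for t in range(3)], axis=1)

print(td.shape, np.allclose(td, loop))
```

Because both views use the single pair (w, b), any gradient update changes all three "copies" together, which is the behaviour the question asks for.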
EDIT: More clarification -
I have a pre-trained model file which I can load and pull model.layers and model.weights from. This model may have a complex set of interconnected layers.
I want to be able to use the model.layers or the model() file directly to append it to a layer in another neural network.
#Dummy model - this function is not available to me; only the model file
def model1():
    inp = layers.Input((3,))
    x = layers.Dense(4, activation='relu')(inp)
    out = layers.Dense(2, activation='softmax')(x)
    model = Model(inp, out)
    return model
pretrained_model = model1() #I have THIS only!
L = pretrained_model.layers
print(L)
[<tensorflow.python.keras.engine.input_layer.InputLayer at 0x7f915d6778b0>,
<tensorflow.python.keras.layers.core.Dense at 0x7f915d643790>,
<tensorflow.python.keras.layers.core.Dense at 0x7f915e124e50>]
I want to take the Dense layers L[1:] and add them to another architecture (not the weights, just the layers), something like the code below, as @Anton has described in his solution.
inp = layers.Input((3,))
x = Dense(3, activation='relu')(inp)
m0 = get_layers(pretrained_model)(x) #<---
out = layers.Dense(2)(m0)
This should give me a model.summary() with 5 layers - inp, x, L[1], L[2], out
But I am unable to use the list of layers directly.
I can come up with a function that recreates a partial computation graph based on these layers but I am looking for something simpler.
I have already tried modifying the model1() function as below, which serves my purpose. However, if I only get a model file, and one with a massive number of layers, this will not be possible.
def model1(layer):
    #inp = layers.Input((3,))
    x = layers.Dense(4, activation='relu')(layer)
    out = layers.Dense(2, activation='softmax')(x)
    model = Model(inp, out)
    return model.output
How can I use a model generator inside another model
We can use the model generated by model1() and replace
inp = layers.Input((3,))
x = Dense(3, activation='relu')(inp)
m0 = get_layers(pretrained_model)(x) # <---
out = layers.Dense(2)(m0)
with
inp = layers.Input((3,))
x = layers.Dense(3, activation='relu')(inp)
m0 = pretrained_model(x) # <---
out = layers.Dense(2)(m0)
and if we want a new model generator model2() that does that as a function
def model2(pretrained_model):
    inp = layers.Input((3,), name='model2_input')
    x = layers.Dense(3, activation='relu', name='model2_x')(inp)
    m0 = pretrained_model(x)
    out = layers.Dense(2, name='model2_out')(m0)
    model = Model(inp, out, name='model2')
    return model
second_model = model2(pretrained_model)
If we look at the graph of second_model we can see that indeed it contains the layers of model1
We can generate the above image using
tf.keras.utils.plot_model(second_model, show_shapes=True, expand_nested=True)
I am having trouble finding a generator architecture for a simple array.
The idea is the following:
The generator takes the encoded input from the auto-encoder (trained earlier), which outputs an array of shape (200,).
My current generator Model:
def Generator():
    input_layer = Input(shape=[200]) #(bs, 200)
    d1 = Dense(165, activation = 'relu')(input_layer)
    d2 = Dense(140, activation = 'relu')(d1)
    d3 = Dense(100, activation = 'relu')(d2)
    d4 = Dense(50, activation = 'relu')(d3)
    d5 = Dense(25, activation = 'relu')(d4) #(bs, 25)
    up1 = Dense(50, activation = 'relu')(d5)
    # skip connections: join each up layer with the down layer of the same width
    cat = concatenate([up1, d4])
    up2 = Dense(100, activation = 'relu')(cat)
    cat = concatenate([up2, d3])
    up3 = Dense(140, activation = 'relu')(cat)
    cat = concatenate([up3, d2])
    up4 = Dense(165, activation='relu')(cat)
    cat = concatenate([up4, d1])
    last = Dense(200, activation = 'relu')(cat)
    return tf.keras.models.Model(inputs = input_layer, outputs=last)
The training steps and losses are the same as in the TensorFlow CycleGAN documentation.
Since I am using this generator for a CycleGAN project, I thought I would get good results after 100 epochs, but it was the opposite.
I also tried a generator without skip connections, but the results were much worse than with the one above.
I looked into the ResNet and U-Net architectures, but since they require convolutional layers, I haven't tried them yet.
My Questions:
What recommendations do you have?
What type of skip connections should I use (add or concatenate)?
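One concrete difference between the two kinds of skip connection is the width of the tensor each decoder layer receives. The arithmetic sketch below uses the layer sizes from the Generator above and assumes each skip joins layers of equal width; the function names are hypothetical and nothing here says which variant trains better.

```python
DOWN = [165, 140, 100, 50, 25]     # d1 .. d5
UP = [50, 100, 140, 165, 200]      # up1 .. up4, last
SKIPS = [None, 50, 100, 140, 165]  # d4, d3, d2, d1 joined before up2 .. last


def decoder_input_widths(mode):
    """Input width of each decoder Dense layer for 'concat' or 'add' skips."""
    widths, prev = [], DOWN[-1]
    for out_width, skip in zip(UP, SKIPS):
        if skip is None:
            in_width = prev
        elif mode == "concat":
            in_width = prev + skip   # concatenation stacks the features
        else:
            assert prev == skip      # addition requires equal widths
            in_width = prev
        widths.append(in_width)
        prev = out_width
    return widths


def decoder_params(mode):
    """Weights plus biases of the decoder Dense layers."""
    return sum(w * o + o for w, o in zip(decoder_input_widths(mode), UP))


for mode in ("concat", "add"):
    print(mode, decoder_input_widths(mode), decoder_params(mode))
```

Concatenation roughly doubles the input width of every decoder layer after the first, and therefore its parameter count, while addition keeps the widths unchanged but only works where the two tensors have the same size.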
I am trying to follow this Keras blog post about fine-tuning image classifiers. I would like to use the Inception v3 model found in one of fchollet's repos.
Inception is a Model (functional API), so I can't just do model.add(top_model), which is reserved for Sequential.
How can I combine two functional Models? Let's say I have
inputs = Input(shape=input_shape)
x = Flatten()(inputs)
predictions = Dense(4, name='final1')(x)
model1 = Model(inputs=inputs, outputs=predictions)
for the first model and
inputs_2 = Input(shape=(4,))
y = Dense(5)(inputs_2)
y = Dense(2, name='final2')(y)
predictions_2 = Dense(29)(y)
model2 = Model(inputs=inputs_2, outputs=predictions_2)
for the second. I now want an end-to-end model that goes from inputs to predictions_2 and links predictions to inputs_2.
I tried using model1.get_layer('final1').output but I had a mismatch with types and I couldn't make it work.
I haven't tried this but according to the documentation functional models are callable, so you can do something like:
y = model2(model1(x))
where x is the data that goes to inputs and y is the result of predictions_2
I ran into this problem as well while fine tuning VGG16. Here's what worked for me and I imagine a similar approach can be taken for Inception V3. Tested on Keras 2.0.5 with Tensorflow 1.2 backend.
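# Imports assumed for standalone Keras 2.x (not shown in the original answer)
from keras import applications, optimizers
from keras.layers import Input, Flatten, Dense
from keras.models import Model
# NOTE: define the following variables
# top_model_weights_path
# num_classes
# dense_layer_1 = 4096
# dense_layer_2 = 4096
vgg16 = applications.VGG16(
    include_top=False,
    weights='imagenet',
    input_shape=(224, 224, 3))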
# Inspect the model
vgg16.summary()
# This shape has to match the last layer in VGG16 (without top)
dense_input = Input(shape=(7, 7, 512))
dense_output = Flatten(name='flatten')(dense_input)
dense_output = Dense(dense_layer_1, activation='relu', name='fc1')(dense_output)
dense_output = Dense(dense_layer_2, activation='relu', name='fc2')(dense_output)
dense_output = Dense(num_classes, activation='softmax', name='predictions')(dense_output)
top_model = Model(inputs=dense_input, outputs=dense_output, name='top_model')
# from: https://blog.keras.io/building-powerful-image-classification-models-using-very-little-data.html
# note that it is necessary to start with a fully-trained
# classifier, including the top classifier,
# in order to successfully do fine-tuning
top_model.load_weights(top_model_weights_path)
block5_pool = vgg16.get_layer('block5_pool').output
# Now combine the two models
full_output = top_model(block5_pool)
full_model = Model(inputs=vgg16.input, outputs=full_output)
# set the first 15 layers (up to the last conv block)
# to non-trainable (weights will not be updated)
# WARNING: this may not be applicable for Inception V3
for layer in full_model.layers[:15]:
    layer.trainable = False
# Verify things look as expected
full_model.summary()
# compile the model with a SGD/momentum optimizer
# and a very slow learning rate.
full_model.compile(
loss='binary_crossentropy',
optimizer=optimizers.SGD(lr=5e-5, momentum=0.9),
metrics=['accuracy'])
# Train the model...
I think there are 2 options depending on what you need:
(a) Both predictions_1 and predictions_2 matter to you. In this case, you can train a network with 2 outputs. Here is an example derived from your post:
input_shape = [3, 20]
inputs = Input(shape=input_shape)
x = Flatten()(inputs)
predictions_1 = Dense(4, name='predictions_1')(x)
# here the predictions_1 just corresponds to your next layer's input
y = Dense(5)(predictions_1)
y = Dense(2)(y)
predictions_2 = Dense(29, name='predictions_2')(y)
# you specify here that you have 2 outputs
model = Model(inputs=inputs, outputs=[predictions_1, predictions_2])
For the .fit and .predict, you can find a lot of details in https://keras.io/getting-started/functional-api-guide/, section: Multi-input and multi-output models.
(b) you are only interested in predictions_2. In this case, you can just do:
input_shape = [3, 20]
inputs = Input(shape=input_shape)
x = Flatten()(inputs)
predictions_1 = Dense(4, name='predictions_1')(x)
# here the predictions_1 just corresponds to your next layer's input
y = Dense(5)(predictions_1)
y = Dense(2)(y)
predictions_2 = Dense(29, name='predictions_2')(y)
# you specify here that your only output is predictions_2
model = Model(inputs=inputs, outputs=predictions_2)
Now, regarding inception_v3: you can define the architecture yourself and modify the deep layers according to your needs (giving these layers specific names so that Keras does not name them automatically).
After that, compile your model and load the weights (as in https://keras.io/models/about-keras-models/, see the function load_weights(..., by_name=True)).
# you can load weights for only the part that corresponds to the true
# inception_v3 architecture. The other part will be initialized
# randomly
model.load_weights("inception_v3.hdf5", by_name=True)
This should solve your problem. By the way, you can find extra information here: https://www.gradientzoo.com. The doc. explains several saving / loading / fine-tuning routines ;)
Update: if you do not want to redefine your model from scratch you can do the following:
input_shape = [3, 20]
# define model1 and model2 as you want
inputs1 = Input(shape=input_shape)
x = Flatten()(inputs1)
predictions_1 = Dense(4, name='predictions_1')(x)
model1 = Model(inputs=inputs1, outputs=predictions_1)
inputs2 = Input(shape=(4,))
y = Dense(5)(inputs2)
y = Dense(2)(y)
predictions_2 = Dense(29, name='predictions_2')(y)
model2 = Model(inputs=inputs2, outputs=predictions_2)
# then define functions returning the image of an input through model1 or model2
def give_model1():
    def f(x):
        return model1(x)
    return f
def give_model2():
    def g(x):
        return model2(x)
    return g
# now you can create a global model as follows:
inputs = Input(shape=input_shape)
x = model1(inputs)
predictions = model2(x)
model = Model(inputs=inputs, outputs=predictions)
Drawing from filitchp's answer above, assuming the output dimensions of model1 match the input dimensions of model2, this worked for me:
model12 = Model(inputs=inputs, outputs=model2(model1.output))