I am trying to recreate a CNN in Keras to classify point cloud data. The CNN is described in this paper.
Network Design
This is my current implementation:
inputs = Input(shape=(None, 3))
x = Conv1D(filters=64, kernel_size=1, activation='relu')(inputs)
x = BatchNormalization()(x)
x = Conv1D(filters=64, kernel_size=1, activation='relu')(x)
x = BatchNormalization()(x)
y = Conv1D(filters=64, kernel_size=1, activation='relu')(x)
y = BatchNormalization()(y)
y = Conv1D(filters=128, kernel_size=1, activation='relu')(y)
y = BatchNormalization()(y)
y = Conv1D(filters=2048, kernel_size=1, activation='relu')(y)
y = MaxPooling1D(1)(y)
z = keras.layers.concatenate([x, y], axis=2)
z = Conv1D(filters=512, kernel_size=1, activation='relu')(z)
z = BatchNormalization()(z)
z = Conv1D(filters=512, kernel_size=1, activation='relu')(z)
z = BatchNormalization()(z)
z = Conv1D(filters=512, kernel_size=1, activation='relu')(z)
z = BatchNormalization()(z)
z = Dense(9, activation='softmax')(z)
model = Model(inputs=inputs, outputs=z)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
The problem is that the network predicts the same class for all input data. This may be caused by a mistake in my implementation of the network, overfitting or insufficient training data. Can someone spot a mistake in my implementation?
Yousefhussien, M., Kelbe, D. J., Ientilucci, E. J., & Salvaggio, C. (2017). A Fully Convolutional Network for Semantic Labeling of 3D Point Clouds. arXiv preprint arXiv:1710.01408.
Predicting the same class for every input typically indicates a network that has only just been initialized, meaning that the trained weights were not loaded. Did the same thing happen during training? Another possible cause is bad pre-processing. I also noticed that the paper describes a "1D-fully convolutional network", so your final dense layer is a convolutional layer in the paper.
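To make the "fully convolutional" point concrete, here is a minimal pure-Python sketch (not Keras code) of what a convolution with kernel_size=1 computes: the same weight matrix is applied to every point independently, so the output keeps one row per point. The helper name pointwise_conv and the toy weights are assumptions made up for illustration.

```python
def pointwise_conv(points, weights, bias):
    """Apply the same linear map to every point (a kernel_size=1 convolution).

    points:  list of N points, each with C_in features
    weights: C_in x C_out matrix as nested lists
    bias:    list of C_out values
    """
    c_out = len(bias)
    output = []
    for point in points:
        row = [
            sum(point[i] * weights[i][j] for i in range(len(point))) + bias[j]
            for j in range(c_out)
        ]
        output.append(row)
    return output


# Three toy (x, y, z) points mapped to two output channels.
points = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
weights = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
bias = [0.5, -0.5]

features = pointwise_conv(points, weights, bias)
```

The number of output rows always equals the number of input points, which is what lets such a network label every point of a cloud of arbitrary size.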
I believe that the mistake is not in the implementation. Most probably the problem is that you have an insufficient amount of data. Also, if the network predicts the same class for all input data, it usually means that you lack regularization. Try adding some Dropout layers with a rate of 0.2 to 0.5 and see whether the results improve.
Also, I don't think that
x = Conv1D(filters=64, kernel_size=1, activation='relu')(inputs)
x = BatchNormalization()(x)
is the same as
x = Conv1D(filters=64, kernel_size=1)(inputs)
x = BatchNormalization()(x)
x = ReLU()(x)
and I think you need the latter.
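The difference between the two orderings can be checked with a small pure-Python sketch. It assumes a simplified batch normalization with no learned scale or shift, and the helper names are made up for illustration; it is not the Keras implementation.

```python
from statistics import fmean, pstdev


def batch_norm(values, eps=1e-3):
    """Normalize to zero mean and (roughly) unit variance; no learned scale/shift."""
    mean = fmean(values)
    variance = pstdev(values) ** 2
    return [(v - mean) / (variance + eps) ** 0.5 for v in values]


def relu(values):
    return [max(0.0, v) for v in values]


pre_activations = [-2.0, -1.0, 0.5, 3.0]

# Conv1D(..., activation='relu') followed by BatchNormalization
relu_then_bn = batch_norm(relu(pre_activations))

# Conv1D(...) followed by BatchNormalization, then ReLU
bn_then_relu = relu(batch_norm(pre_activations))
```

With ReLU first, the normalized output is centered around zero and contains negative values again; with batch normalization first, the final output is non-negative. The two stacks are therefore not interchangeable.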
Another thing for you to try is LeakyReLU as it usually gives better results than plain ReLU.
The network is fixed as it provides the expected predictions now. Thanks for the help!
Based on the answers I changed the following things:
The order of the activation and the batch normalization.
The last layer from a dense to a convolutional layer.
I also added the training=True argument to the batch normalization layers.
The code of the correct implementation:
inputs = Input(shape=(None, 3))
x = Conv1D(filters=64, kernel_size=1, input_shape=(None, 4))(inputs)
x = BatchNormalization()(x, training=True)
x = Activation('relu')(x)
x = Conv1D(filters=64, kernel_size=1, use_bias=False)(x)
x = BatchNormalization()(x, training=True)
x = Activation('relu')(x)
y = Conv1D(filters=64, kernel_size=1)(x)
y = BatchNormalization()(y, training=True)
y = Activation('relu')(y)
y = Conv1D(filters=128, kernel_size=1)(y)
y = BatchNormalization()(y, training=True)
y = Activation('relu')(y)
y = Conv1D(filters=2048, kernel_size=1)(y)
y = BatchNormalization()(y, training=True)
y = Activation('relu')(y)
y = MaxPooling1D(1)(y)
z = keras.layers.concatenate([x, y], axis=2)
z = Conv1D(filters=512, kernel_size=1)(z)
z = BatchNormalization()(z, training=True)
z = Activation('relu')(z)
z = Conv1D(filters=512, kernel_size=1)(z)
z = BatchNormalization()(z, training=True)
z = Activation('relu')(z)
z = Conv1D(filters=512, kernel_size=1)(z)
z = BatchNormalization()(z, training=True)
z = Activation('relu')(z)
z = Conv1D(filters=2, kernel_size=1, activation='softmax')(z)
model = Model(inputs=inputs, outputs=z)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
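One consequence of calling the batch normalization layers with training=True is worth knowing: the layer then normalizes with the statistics of the batch it is given, also at prediction time, instead of the moving averages collected during training. The sketch below illustrates the difference in plain Python. The moving statistics are hypothetical values, and the learned scale and shift are left out.

```python
from statistics import fmean, pvariance


def normalize(values, mean, variance, eps=1e-3):
    return [(v - mean) / (variance + eps) ** 0.5 for v in values]


# Hypothetical moving statistics accumulated during training.
moving_mean, moving_variance = 0.0, 1.0

batch = [10.0, 12.0, 14.0]

# training=True: statistics come from the batch being processed.
with_batch_stats = normalize(batch, fmean(batch), pvariance(batch))

# Default inference behaviour: the stored moving statistics are used instead.
with_moving_stats = normalize(batch, moving_mean, moving_variance)
```

With batch statistics, the prediction for one sample depends on the other samples in the same batch, so results can change with batch composition.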
Related
I am trying to finish a project on variational autoencoders (VAE).
I have tried many approaches and a lot of help from the internet, but every time I run into a different problem.
Most often they are shape problems, but sometimes it is an error like the one below.
After I gave up, I took ready-made code
(from here: https://www.youtube.com/watch?v=8wrLjnQ7EWQ&ab_channel=DigitalSreeni)
and copied it line by line.
Even then it gave me an error (which I had seen before as well) that I could not find an answer to online:
TypeError: You are passing KerasTensor(type_spec=TensorSpec(shape=(), dtype=tf.float32, name=None), name='tf.math.reduce_sum_1/Sum:0', description="created by layer 'tf.math.reduce_sum_1'"), an intermediate Keras symbolic input/output, to a TF API that does not allow registering custom dispatchers, such as tf.cond, tf.function, gradient tapes, or tf.map_fn. Keras Functional model construction only supports TF API calls that do support dispatching, such as tf.math.add or tf.reshape. Other APIs cannot be called directly on symbolic Kerasinputs/outputs. You can work around this limitation by putting the operation in a custom Keras layer call and calling that layer on this symbolic input/output.
You can see the full code in the description of the YouTube video linked above.
Does anyone know what I should do?
This is the part of the code that I think causes the problem:
latent_dim = 2
input_img = Input(shape=input_shape, name='encoder_input')
x = Conv2D(32, 3, padding='same', activation='relu')(input_img)
x = Conv2D(64, 3, padding='same', activation='relu',strides=(2, 2))(x)
x = Conv2D(64, 3, padding='same', activation='relu')(x)
x = Conv2D(64, 3, padding='same', activation='relu')(x)
conv_shape = K.int_shape(x) #Shape of conv to be provided to decoder
x = Flatten()(x)
x = Dense(32, activation='relu')(x)
z_mu = Dense(latent_dim, name='latent_mu')(x) #Mean values of encoded input
z_sigma = Dense(latent_dim, name='latent_sigma')(x) #Std dev. (variance) of encoded input
def sample_z(args):
    z_mu, z_sigma = args
    eps = K.random_normal(shape=(K.shape(z_mu)[0], K.int_shape(z_mu)[1]))
    return z_mu + K.exp(z_sigma / 2) * eps
z = Lambda(sample_z, output_shape=(latent_dim, ), name='z')([z_mu, z_sigma])
encoder = Model(input_img, [z_mu, z_sigma, z], name='encoder')
decoder_input = Input(shape=(latent_dim, ), name='decoder_input')
x = Dense(conv_shape[1]*conv_shape[2]*conv_shape[3], activation='relu')(decoder_input)
x = Reshape((conv_shape[1], conv_shape[2], conv_shape[3]))(x)
x = Conv2DTranspose(32, 3, padding='same', activation='relu',strides=(2, 2))(x)
x = Conv2DTranspose(num_channels, 3, padding='same', activation='sigmoid', name='decoder_output')(x)
decoder = Model(decoder_input, x, name='decoder')
z_decoded = decoder(z)
class CustomLayer(keras.layers.Layer):
    def vae_loss(self, x, z_decoded):
        x = K.flatten(x)
        z_decoded = K.flatten(z_decoded)
        # Reconstruction loss (as we used sigmoid activation we can use binarycrossentropy)
        recon_loss = keras.metrics.binary_crossentropy(x, z_decoded)
        # KL divergence
        kl_loss = -5e-4 * K.mean(1 + z_sigma - K.square(z_mu) - K.exp(z_sigma), axis=-1)
        return K.mean(recon_loss + kl_loss)

    # add custom loss to the class
    def call(self, inputs):
        x = inputs[0]
        z_decoded = inputs[1]
        loss = self.vae_loss(x, z_decoded)
        self.add_loss(loss, inputs=inputs)
        return x
y = CustomLayer()([input_img, z_decoded])
vae = Model(input_img, y, name='vae')
vae.compile(optimizer='adam', loss=None)
vae.summary()
vae.fit(x_train, None, epochs = 10, batch_size = 32, validation_split = 0.2)
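The sampling step and the KL term in the code above can be checked numerically outside Keras. The following is a minimal pure-Python sketch for a single sample. It assumes, as the expression exp(z_sigma / 2) suggests, that z_sigma holds the log-variance; the function names and the explicit rng argument are made up for illustration.

```python
import math
import random


def sample_z(z_mu, z_log_var, rng):
    """Reparameterization trick: z = mu + exp(log_var / 2) * eps, eps ~ N(0, 1)."""
    return [
        mu + math.exp(log_var / 2) * rng.gauss(0.0, 1.0)
        for mu, log_var in zip(z_mu, z_log_var)
    ]


def kl_term(z_mu, z_log_var, weight=5e-4):
    """Same expression as kl_loss in the question, evaluated for one sample."""
    per_dim = [
        1 + log_var - mu ** 2 - math.exp(log_var)
        for mu, log_var in zip(z_mu, z_log_var)
    ]
    return -weight * sum(per_dim) / len(per_dim)


rng = random.Random(0)
z = sample_z([0.0, 1.0], [0.0, 0.0], rng)

kl_at_prior = kl_term([0.0, 0.0], [0.0, 0.0])   # mean 0, log-variance 0
kl_shifted = kl_term([1.0, 0.0], [0.0, 0.0])    # mean moved away from 0
```

The term vanishes when the encoder output matches a standard normal and grows as the mean moves away from zero. One thing to check in the question's code, offered as an assumption rather than a confirmed fix: vae_loss reads z_mu and z_sigma from the enclosing scope instead of receiving them through inputs, which looks like the use of symbolic tensors outside a layer call that the error message describes.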
I've been trying to train an audio classification model. When I used SGD with learning_rate=0.01, momentum=0.0 and nesterov=False, I got the following loss and accuracy graphs:
I can't figure out what causes the sudden decrease in loss at around epoch 750. I tried different learning rates, momentum values and combinations of them, different batch sizes, initial layer weights, etc. to get a more reasonable graph, but with no luck at all. If you know what causes this, please let me know.
The code I used for this training is below:
# MFCCs Model
x = tf.keras.layers.Dense(units=512, activation="sigmoid")(mfcc_inputs)
x = tf.keras.layers.Dropout(0.5)(x)
x = tf.keras.layers.Dense(units=256, activation="sigmoid")(x)
x = tf.keras.layers.Dropout(0.5)(x)
# Spectrograms Model
y = tf.keras.layers.Conv2D(32, kernel_size=(3,3), strides=(2,2))(spec_inputs)
y = tf.keras.layers.AveragePooling2D(pool_size=(2,2), strides=(2,2))(y)
y = tf.keras.layers.BatchNormalization()(y)
y = tf.keras.layers.Activation("sigmoid")(y)
y = tf.keras.layers.Conv2D(64, kernel_size=(3,3), strides=(1,1), padding="same")(y)
y = tf.keras.layers.AveragePooling2D(pool_size=(2,2), strides=(2,2))(y)
y = tf.keras.layers.BatchNormalization()(y)
y = tf.keras.layers.Activation("sigmoid")(y)
y = tf.keras.layers.Conv2D(64, kernel_size=(3,3), strides=(1,1), padding="same")(y)
y = tf.keras.layers.AveragePooling2D(pool_size=(2,2), strides=(2,2))(y)
y = tf.keras.layers.BatchNormalization()(y)
y = tf.keras.layers.Activation("sigmoid")(y)
y = tf.keras.layers.Flatten()(y)
y = tf.keras.layers.Dense(units=256, activation="sigmoid")(y)
y = tf.keras.layers.Dropout(0.5)(y)
# Chroma Model
t = tf.keras.layers.Dense(units=512, activation="sigmoid")(chroma_inputs)
t = tf.keras.layers.Dropout(0.5)(t)
t = tf.keras.layers.Dense(units=256, activation="sigmoid")(t)
t = tf.keras.layers.Dropout(0.5)(t)
# Merge Models
concated = tf.keras.layers.concatenate([x, y, t])
# Dense and Output Layers
z = tf.keras.layers.Dense(64, activation="sigmoid")(concated)
z = tf.keras.layers.Dropout(0.5)(z)
z = tf.keras.layers.Dense(64, activation="sigmoid")(z)
z = tf.keras.layers.Dropout(0.5)(z)
z = tf.keras.layers.Dense(1, activation="sigmoid")(z)
mdl = tf.keras.Model(inputs=[mfcc_inputs, spec_inputs, chroma_inputs], outputs=z)
mdl.compile(optimizer=SGD(), loss="binary_crossentropy", metrics=["accuracy"])
mdl.fit([M_train, X_train, C_train], y_train, batch_size=8, epochs=1000, validation_data=([M_val, X_val, C_val], y_val), callbacks=[tensorboard_cb])
I'm not too sure myself, but as Frightera said, sigmoid activations in hidden layers can cause trouble because they are more sensitive to weight initialization, and if the weights aren't well initialized, the gradients can become very small. Perhaps the model eventually works its way out of the small-gradient regime and the loss finally drops around epoch 750, but that is just my hypothesis. If ReLU doesn't work, try LeakyReLU, since it doesn't have the dead-neuron effect that ReLU does.
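The small-gradient argument can be made concrete with a few lines of plain Python. The sketch only looks at the derivative of the sigmoid itself and ignores the weights; the depth of 6 is an assumed value for illustration, not a measurement of the model in the question.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    s = sigmoid(x)
    return s * (1.0 - s)


# The derivative peaks at x = 0 and is much smaller for saturated inputs.
peak = sigmoid_grad(0.0)
saturated = sigmoid_grad(6.0)

# Upper bound on the product of activation derivatives through `depth`
# stacked sigmoid layers (assumed depth, for illustration only).
depth = 6
upper_bound = peak ** depth
```

Because the derivative never exceeds 0.25, the product over several sigmoid layers shrinks quickly, which fits the hypothesis of a long plateau before the loss starts to move.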
I am new to CNNs, so I am guessing I am making an elementary error here. I am trying to do age estimation and gender classification on the UTKFace dataset. I have made a dataframe which looks like this:
I've split the data using Sklearn train_test_split
train_validation, test_df = train_test_split(df, test_size=0.25)
train_df, validation_df = train_test_split(train_validation, test_size=0.3333)
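As a quick check of what these two calls produce, the resulting fractions can be worked out by hand. This is only arithmetic on the test_size values; the actual row counts are rounded by train_test_split.

```python
test_fraction = 0.25
validation_fraction_of_rest = 0.3333

# The second split is applied to what is left after the test set is removed.
remaining = 1.0 - test_fraction
validation_fraction = remaining * validation_fraction_of_rest
train_fraction = remaining - validation_fraction

print(
    f"train={train_fraction:.4f}, "
    f"validation={validation_fraction:.4f}, "
    f"test={test_fraction:.4f}"
)
```

The overall split is therefore approximately 50% training, 25% validation and 25% test.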
I have written the following code to do some data augmentation:
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(
rescale=1./255,
rotation_range = 40,
width_shift_range = 0.2,
height_shift_range = 0.2,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip = True,
fill_mode = 'nearest')
test_datagen = ImageDataGenerator(rescale=1./255)
batch_size = 32
train_generator = train_datagen.flow_from_dataframe(
train_df,
x_col="file",
y_col=["age","gender"],
batch_size=batch_size,
class_mode='multi_output')
val_generator = test_datagen.flow_from_dataframe(
validation_df,
x_col="file",
y_col=["age","gender"],
batch_size=batch_size,
class_mode='multi_output')
Then I edited the model from this post (https://towardsdatascience.com/building-a-multi-output-convolutional-neural-network-with-keras-ed24c7bc1178) to just have the age and gender branches of the model:
class UtkMultiOutputModel():
    def make_default_hidden_layers(self, inputs):
        """
        Used to generate a default set of hidden layers. The structure used in this network is defined as:
        Conv2D -> BatchNormalization -> Pooling -> Dropout
        """
        x = Conv2D(16, (3, 3), padding="same")(inputs)
        x = Activation("relu")(x)
        x = BatchNormalization(axis=-1)(x)
        x = MaxPooling2D(pool_size=(3, 3))(x)
        x = Dropout(0.25)(x)
        x = Conv2D(32, (3, 3), padding="same")(x)
        x = Activation("relu")(x)
        x = BatchNormalization(axis=-1)(x)
        x = MaxPooling2D(pool_size=(2, 2))(x)
        x = Dropout(0.25)(x)
        x = Conv2D(32, (3, 3), padding="same")(x)
        x = Activation("relu")(x)
        x = BatchNormalization(axis=-1)(x)
        x = MaxPooling2D(pool_size=(2, 2))(x)
        x = Dropout(0.25)(x)
        return x

    def build_gender_branch(self, inputs, num_genders=2):
        """
        Used to build the gender branch of our face recognition network.
        This branch is composed of three Conv -> BN -> Pool -> Dropout blocks,
        followed by the Dense output layer.
        """
        x = Lambda(lambda c: tf.image.rgb_to_grayscale(c))(inputs)
        x = self.make_default_hidden_layers(inputs)
        x = Flatten()(x)
        x = Dense(128)(x)
        x = Activation("relu")(x)
        x = BatchNormalization()(x)
        x = Dropout(0.5)(x)
        x = Dense(num_genders)(x)
        x = Activation("sigmoid", name="gender_output")(x)
        return x

    def build_age_branch(self, inputs):
        """
        Used to build the age branch of our face recognition network.
        This branch is composed of three Conv -> BN -> Pool -> Dropout blocks,
        followed by the Dense output layer.
        """
        x = self.make_default_hidden_layers(inputs)
        x = Flatten()(x)
        x = Dense(128)(x)
        x = Activation("relu")(x)
        x = BatchNormalization()(x)
        x = Dropout(0.5)(x)
        x = Dense(1)(x)
        x = Activation("linear", name="age_output")(x)
        return x

    def assemble_full_model(self, width, height):
        """
        Used to assemble our multi-output model CNN.
        """
        input_shape = (height, width, 3)
        inputs = Input(shape=input_shape)
        age_branch = self.build_age_branch(inputs)
        gender_branch = self.build_gender_branch(inputs)
        model = Model(inputs=inputs,
                      outputs=[age_branch, gender_branch],
                      name="modelA")
        return model
modelA = UtkMultiOutputModel().assemble_full_model(IM_WIDTH, IM_HEIGHT)
modelA.summary()
Then I train and compile:
from keras.optimizers import Adam
init_lr = 1e-4
epochs = 100
opt = Adam(lr=init_lr, decay=init_lr / epochs)
modelA.compile(optimizer=opt,
loss={
'age_output': 'mean_squared_error',
'gender_output': 'binary_crossentropy'},
loss_weights={
'age_output': 4.,
'gender_output': 0.1},
metrics={
'age_output': 'mean_absolute_error',
'gender_output': 'accuracy'})
batch_size = 32
history = modelA.fit_generator(train_generator,
steps_per_epoch=len(train_df)//batch_size,
epochs=20,
validation_data=val_generator,
validation_steps=len(validation_df)//batch_size)
I get the following error, which I am struggling to understand.
ValueError: logits and labels must have the same shape ((None, 2) vs (None, 1))
From reading other posts on here, it looks like this could be due to the labels having the wrong dimension. I don't understand how this could be the case when I use Keras flow_from_dataframe with a dataframe formatted as mine is. Can anyone help?
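The gender output has two units, so the logits have shape (None, 2) and the loss expects one-hot encoded labels. However, your gender column contains only single 0/1 values, which gives labels of shape (None, 1). Try one-hot encoding them.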
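As a minimal illustration of what one-hot encoding does to the label shape, here is a plain-Python sketch. The helper name one_hot and the toy labels are assumptions; in practice a library utility would normally be used.

```python
def one_hot(labels, num_classes):
    """Turn integer class labels into one row of 0/1 values per label."""
    return [
        [1.0 if index == label else 0.0 for index in range(num_classes)]
        for label in labels
    ]


gender_labels = [0, 1, 1]             # one value per sample, like the gender column
encoded = one_hot(gender_labels, 2)   # two values per sample, matching a 2-unit output
```

Each label now has two entries, which matches the (None, 2) output shape reported in the error. An alternative that keeps the 0/1 labels as they are is a single-unit sigmoid output.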
I want to pass three differently formatted inputs (e.g., one is a number, one is a one-hot encoded array, and the other is an embedding input) into my Keras functional model and get a number out (regression problem). I've only worked with sequential models in the past, so I'm having trouble understanding how this works.
Here's my current model architecture (input_a is one hot, input_b is a number from 0 to 1, and input_c is another one hot, different size):
However, I'm not sure this is the 'correct' way to structure my model for what I intend, because when I swap my third input for an embedding input, Keras complains that the input shape is incorrect. That dataset should be an array of integers, so I set the input size to 1, but the embedding layer says it is getting an array of size 129. That is roughly the length of my dense layer above it, so I think it's receiving the dense layer's output and not my input.
Here's my model formatted the same, but with embedding (it's failing):
input_a = Input(shape=genres.shape[1:], name='input_a')
x = Dense(128, activation='relu')(input_a)
output_a = Dense(128, activation='relu')(x)
input_b = Input(shape=(1,), name='input_b')
x = keras.layers.concatenate([output_a, input_b])
x = Dense(128, activation='relu')(x)
output_b = Dense(128, activation='relu')(x)
input_c = Input(shape=(1,), name='input_c')
x = keras.layers.concatenate([output_b, input_c])
x = Embedding(max(directors) + 1, 16, input_length=1)(x)
x = Flatten()(x)
x = Dense(128, activation='relu')(x)
x = Dense(128, activation='relu')(x)
output_c = Dense(1)(x)
model = Model(inputs=[input_a, input_b, input_c], outputs=output_c)
But I get "input_length" is 1, but received input has shape (None, 129)
How can I make input_c receive the actual third input and not the output from the above layers?
input_c = Input(shape=(10,))
x = Embedding(1024, 16, input_length=10)(input_c)
output_c = Flatten()(x)
input_a = Input(shape=(10,))
x = Dense(128, activation='relu')(input_a)
output_a = Dense(128, activation='relu')(x)
input_b = Input(shape=(1,))
x = concatenate([output_a, input_b])
x = Dense(128, activation='relu')(x)
output_b = Dense(128, activation='relu')(x)
x = concatenate([output_b, output_c], axis=1)
output_c = Dense(1)(x)
model = Model([input_a, input_b, input_c], output_c)
model.predict([np.random.randn(1000,10),
np.random.randn(1000,1),
np.random.randint(0,1023,size=(1000,10))])
An Embedding layer takes integer indices (for example, word indices) as input, so do not concatenate input_c with the other inputs before it. Instead, pass input_c to the Embedding layer on its own and then concatenate the flattened embeddings with the other branch.
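Conceptually, an embedding is just a table lookup by integer index, which is why it has to see the raw indices rather than dense activations. The sketch below mimics the "embed first, concatenate afterwards" order in plain Python; the table values and variable names are hypothetical.

```python
def embed(indices, table):
    """Look up one embedding vector per integer index."""
    return [table[i] for i in indices]


def flatten(vectors):
    return [value for vector in vectors for value in vector]


# Hypothetical 4-entry vocabulary with 2-dimensional embeddings.
embedding_table = [[0.0, 0.1], [1.0, 1.1], [2.0, 2.1], [3.0, 3.1]]

dense_features = [0.5, 0.25, 0.75]   # stands in for output_b
category_indices = [2]               # stands in for input_c (one integer id)

embedded = flatten(embed(category_indices, embedding_table))
merged = dense_features + embedded   # concatenate only after the lookup
```

Feeding the 128 dense activations plus the index into the lookup, as in the failing model, would hand the embedding 129 values where it expects a single index.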
I have a problem which deals with predicting two outputs when given a vector of predictors.
Assume that a predictor vector looks like x1, y1, att1, att2, ..., attn, where x1, y1 are coordinates and the att's are the other attributes attached to the occurrence of the x1, y1 coordinates. Based on this predictor set I want to predict x2, y2. This is a time series problem, which I am trying to solve using multiple regression.
My question is: how do I set up a Keras model that gives me 2 outputs in the final layer?
from keras.models import Model
from keras.layers import *
#inp is a "tensor", that can be passed when calling other layers to produce an output
inp = Input((10,)) #supposing you have ten numeric values as input
#here, SomeLayer() is defining a layer,
#and calling it with (inp) produces the output tensor x
x = SomeLayer(blablabla)(inp)
x = SomeOtherLayer(blablabla)(x) #here, I just replace x, because this intermediate output is not interesting to keep
#here, I want to keep the two different outputs for defining the model
#notice that both left and right are called with the same input x, creating a fork
out1 = LeftSideLastLayer(balbalba)(x)
out2 = RightSideLastLayer(banblabala)(x)
#here, you define which path you will follow in the graph you've drawn with layers
#notice the two outputs passed in a list, telling the model I want it to have two outputs.
model = Model(inp, [out1,out2])
model.compile(optimizer = ...., loss = ....) #loss can be one for both sides or a list with different loss functions for out1 and out2
model.fit(inputData,[outputYLeft, outputYRight], epochs=..., batch_size=...)
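When a model has several outputs, Keras trains on the (optionally weighted) sum of the per-output losses. The plain-Python sketch below shows that combination for two regression outputs; the function names and toy numbers are assumptions for illustration.

```python
def mean_squared_error(y_true, y_pred):
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def total_loss(losses, weights=None):
    """Weighted sum of per-output losses (weights default to 1.0 each)."""
    if weights is None:
        weights = [1.0] * len(losses)
    return sum(w * loss for w, loss in zip(weights, losses))


# Toy targets and predictions for the two outputs (x2 and y2).
x2_true, x2_pred = [1.0, 2.0], [1.5, 2.0]
y2_true, y2_pred = [0.0, 1.0], [0.0, 3.0]

loss_x2 = mean_squared_error(x2_true, x2_pred)
loss_y2 = mean_squared_error(y2_true, y2_pred)
combined = total_loss([loss_x2, loss_y2])
```

If one output dominates the sum, as y2 does here, per-output weights can rebalance the two terms.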
You can make a model with multiple outputs either with the Functional API or by subclassing tf.keras.Model.
Here's an example of dual outputs (regression and classification) on the Iris Dataset, using the Functional API:
from sklearn.datasets import load_iris
from tensorflow.keras.layers import Dense
from tensorflow.keras import Input, Model
import tensorflow as tf
data, target = load_iris(return_X_y=True)
X = data[:, (0, 1, 2)]
Y = data[:, 3]
Z = target
inputs = Input(shape=(3,), name='input')
x = Dense(16, activation='relu', name='16')(inputs)
x = Dense(32, activation='relu', name='32')(x)
output1 = Dense(1, name='cont_out')(x)
output2 = Dense(3, activation='softmax', name='cat_out')(x)
model = Model(inputs=inputs, outputs=[output1, output2])
model.compile(loss={'cont_out': 'mean_absolute_error',
'cat_out': 'sparse_categorical_crossentropy'},
optimizer='adam',
metrics={'cat_out': tf.metrics.SparseCategoricalAccuracy(name='acc')})
history = model.fit(X, {'cont_out': Y, 'cat_out': Z}, epochs=10, batch_size=8)
Here's a simplified version:
from sklearn.datasets import load_iris
from tensorflow.keras.layers import Dense
from tensorflow.keras import Input, Model
data, target = load_iris(return_X_y=True)
X = data[:, (0, 1, 2)]
Y = data[:, 3]
Z = target
inputs = Input(shape=(3,))
x = Dense(16, activation='relu')(inputs)
x = Dense(32, activation='relu')(x)
output1 = Dense(1)(x)
output2 = Dense(3, activation='softmax')(x)
model = Model(inputs=inputs, outputs=[output1, output2])
model.compile(loss=['mae', 'sparse_categorical_crossentropy'], optimizer='adam')
history = model.fit(X, [Y, Z], epochs=10, batch_size=8)
Here's the same example, subclassing tf.keras.Model and with a custom training loop:
import tensorflow as tf
from tensorflow.keras.layers import Dense
from tensorflow.keras import Model
from sklearn.datasets import load_iris
tf.keras.backend.set_floatx('float64')
iris, target = load_iris(return_X_y=True)
X = iris[:, :3]
y = iris[:, 3]
z = target
ds = tf.data.Dataset.from_tensor_slices((X, y, z)).shuffle(150).batch(8)
class MyModel(Model):
    def __init__(self):
        super(MyModel, self).__init__()
        self.d0 = Dense(16, activation='relu')
        self.d1 = Dense(32, activation='relu')
        self.d2 = Dense(1)
        self.d3 = Dense(3, activation='softmax')

    def call(self, x, training=None, **kwargs):
        x = self.d0(x)
        x = self.d1(x)
        a = self.d2(x)
        b = self.d3(x)
        return a, b

model = MyModel()
loss_obj_reg = tf.keras.losses.MeanAbsoluteError()
loss_obj_cat = tf.keras.losses.SparseCategoricalCrossentropy()
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)
loss_reg = tf.keras.metrics.Mean(name='regression loss')
loss_cat = tf.keras.metrics.Mean(name='categorical loss')
error_reg = tf.keras.metrics.MeanAbsoluteError()
error_cat = tf.keras.metrics.SparseCategoricalAccuracy()

#tf.function
def train_step(inputs, y_reg, y_cat):
    with tf.GradientTape() as tape:
        pred_reg, pred_cat = model(inputs)
        reg_loss = loss_obj_reg(y_reg, pred_reg)
        cat_loss = loss_obj_cat(y_cat, pred_cat)
    gradients = tape.gradient([reg_loss, cat_loss], model.trainable_variables)
    optimizer.apply_gradients(zip(gradients, model.trainable_variables))
    loss_reg(reg_loss)
    loss_cat(cat_loss)
    error_reg(y_reg, pred_reg)
    error_cat(y_cat, pred_cat)

for epoch in range(50):
    for xx, yy, zz in ds:
        train_step(xx, yy, zz)

    template = 'Epoch {:>2}, SCCE: {:>5.2f},' \
               ' MAE: {:>4.2f}, SAcc: {:>5.1%}'
    print(template.format(epoch+1,
                          loss_cat.result(),
                          error_reg.result(),
                          error_cat.result()))

    loss_reg.reset_states()
    loss_cat.reset_states()
    error_reg.reset_states()
    error_cat.reset_states()
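The metric objects in the loop above accumulate values across calls, which is why they are reset at the end of every epoch. The class below is a simplified plain-Python stand-in written for illustration, not the tf.keras.metrics.Mean implementation.

```python
class RunningMean:
    """Accumulates a running average over all values seen since the last reset."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def update(self, value):
        self.total += value
        self.count += 1

    def result(self):
        return self.total / self.count if self.count else 0.0

    def reset(self):
        self.total = 0.0
        self.count = 0


metric = RunningMean()
epoch_means = []
for batch_losses in ([4.0, 2.0], [1.0, 1.0]):   # two toy epochs
    for loss in batch_losses:
        metric.update(loss)
    epoch_means.append(metric.result())
    metric.reset()   # without this, epoch 2 would average over both epochs
```

Skipping the reset would make each reported value an average over all epochs so far, hiding recent improvement.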