I've looked at a few similar questions but I still don't understand how to solve my problem.
I am trying to build a CNN that estimates how many particles hit a detector, based on what's essentially an oscilloscope trace of the energy released in the detector over time.
I have 100,000 events of 1024 time samples, which I split 80/20 as train/test, like so:
import tensorflow as tf
from sklearn.model_selection import train_test_split
train_to_test_ratio = 0.8  # proportion of the dataset to include in the train split
X_train, X_test, Y_train, Y_test = train_test_split(NormSignals, labels, train_size=train_to_test_ratio)
no_outputs = 14 # maximum number of particles expected
# force the labels to have 14 binary digits, one for each of the possible outputs
Y_train=tf.one_hot(Y_train,no_outputs)
Y_test=tf.one_hot(Y_test,no_outputs)
When I try to define the input shape for the network I do so like this (full CNN code below):
# Define input to neural network (tensors of 1024 time samples x 1 amplitude per sample)
inputs = keras.Input(shape=(1024,1))
But it gives me the error: "Input 0 of layer Conv_1 is incompatible with the layer: expected ndim=4, found ndim=3. Full shape received: [None, 1024, 1]"
I thought the input shape was as simple as the shape of the data arrays being passed to the network. Can someone please explain what the correct shape of my data should be?
Thank you very much in advance!
Full CNN:
import numpy as np
from tensorflow import keras
# Following the architecture of the CNN from the image recognition lab (14/5/2020):
# Simple CNN:
class noiseLayer(keras.layers.Layer):
    def __init__(self, mean):
        super(noiseLayer, self).__init__()
        self.mean = mean
    def call(self, input):
        mean = self.mean
        return input + (np.random.poisson(mean))/mean
# Add data augmentation to produce a random flip of the data (the ECal is symmetrical)
# and add poissonian noise to all of the crystals - using large N and dividing by N normalises
# the noise to be approximately continuous between 0 and 1
data_augmentation = keras.Sequential([
    noiseLayer(mean=1000)
], name='DataAugm')
# Define input to neural network (tensors of 1024 time samples x 1 amplitude per sample)
inputs = keras.Input(shape=(1024,1))
#x=inputs
x = data_augmentation(inputs)
# first convolutional block
x = keras.layers.Conv2D(16, kernel_size=(3,3), name='Conv_1')(x)
x = keras.layers.LeakyReLU(0.1)(x)
x = keras.layers.MaxPool2D((2,2), name='MaxPool_1')(x)
# second convolutional block
x = keras.layers.Conv2D(16, kernel_size=(3,3), name='Conv_2')(x)
x = keras.layers.LeakyReLU(0.1)(x)
x = keras.layers.MaxPool2D((2,2), name='MaxPool_2')(x)
# third convolutional block
x = keras.layers.Conv2D(32, kernel_size=(3,3), name='Conv_3')(x)
x = keras.layers.LeakyReLU(0.1)(x)
x = keras.layers.MaxPool2D((2,2), name='MaxPool_3')(x)
# Flatten output tensor of the last convolutional layer so it can be used as
# input to the dense layers
x = keras.layers.Flatten(name='Flatten')(x)
# dense network: 2 dense hidden layers with 64 neurons each, with ReLU activation
# Classifier
x = keras.layers.Dense(64, name='Dense_1')(x)
x = keras.layers.ReLU(name='ReLU_dense_1')(x)
#x = keras.layers.Dropout(0.2)(x)
x = keras.layers.Dense(64, name='Dense_2')(x)
x = keras.layers.ReLU(name='ReLU_dense_2')(x)
outputs = keras.layers.Dense(no_outputs, activation='softmax', name='Output')(x)
# Model definition
model = keras.Model(inputs=inputs, outputs=outputs, name='VGGlike_CNN')
# Print model summary
model.summary()
# Show model structure
keras.utils.plot_model(model, show_shapes=True)
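The problem was that I was using 2D layers to solve a 1D problem. Conv2D expects a 4D input of shape (batch, height, width, channels), whereas Input(shape=(1024,1)) produces a 3D tensor of shape (batch, 1024, 1), which is what Conv1D expects.
After changing all the 2D layers to their 1D equivalents, the model builds without errors: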
x = keras.layers.Conv1D(16, kernel_size=3, name='Conv_1')(x)
x = keras.layers.LeakyReLU(0.1)(x)
x = keras.layers.MaxPool1D(2, name='MaxPool_1')(x)
# second convolutional block
x = keras.layers.Conv1D(16, kernel_size=3, name='Conv_2')(x)
x = keras.layers.LeakyReLU(0.1)(x)
x = keras.layers.MaxPool1D(2, name='MaxPool_2')(x)
# third convolutional block
x = keras.layers.Conv1D(32, kernel_size=3, name='Conv_3')(x)
x = keras.layers.LeakyReLU(0.1)(x)
x = keras.layers.MaxPool1D(2, name='MaxPool_3')(x)
# Flatten output tensor of the last convolutional layer so it can be used as
# input to the dense layers
x = keras.layers.Flatten(name='Flatten')(x)
# dense network: 2 dense hidden layers with 64 neurons each, with ReLU activation
# Classifier
x = keras.layers.Dense(64, name='Dense_1')(x)
x = keras.layers.ReLU(name='ReLU_dense_1')(x)
#x = keras.layers.Dropout(0.2)(x)
x = keras.layers.Dense(64, name='Dense_2')(x)
x = keras.layers.ReLU(name='ReLU_dense_2')(x)
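To see where the shapes in the 1D version come from, here is a small framework-free sketch (plain Python, not Keras) of the length arithmetic for the three Conv1D/MaxPool1D blocks. It assumes the Keras defaults used in the code: padding='valid', stride 1 for the convolutions, and a pooling stride equal to the pool size. The helper names are made up for illustration.

```python
def conv1d_valid_len(steps, kernel_size, stride=1):
    """Output length of a 1D convolution with padding='valid'."""
    return (steps - kernel_size) // stride + 1


def maxpool1d_valid_len(steps, pool_size):
    """Output length of 1D max pooling with stride == pool_size, padding='valid'."""
    return (steps - pool_size) // pool_size + 1


# Input(shape=(1024, 1)): 1024 time samples, 1 channel (batch axis left out).
steps, channels = 1024, 1
shapes = []
for filters in (16, 16, 32):  # Conv_1, Conv_2, Conv_3
    steps = conv1d_valid_len(steps, kernel_size=3)
    steps = maxpool1d_valid_len(steps, pool_size=2)
    channels = filters
    shapes.append((steps, channels))

flattened = steps * channels  # size of the vector that Flatten hands to Dense_1
print(shapes)     # [(511, 16), (254, 16), (126, 32)]
print(flattened)  # 4032
```

These numbers should match the output shapes that model.summary() reports for the MaxPool and Flatten layers, with a leading None for the batch dimension.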
Related
I have the following neural net model. Its input is a sequence of ints. Two sub-networks start from the same input layer, and their outputs are concatenated to produce the final output of the model. If I specify main_input as the input of the model, and the entity_extraction and relation_extraction networks also start from main_input, does that mean this model has 3 inputs? What is the underlying input/output mechanism in this model?
main_input = Input(shape=(MAX_SEQUENCE_LENGTH,), dtype='int32', name='main_input')
x = embedding_layer(main_input)
x = CuDNNLSTM(KG_EMBEDDING_DIM, return_sequences=True)(x)
x = Avg(x)
x = Dense(KG_EMBEDDING_DIM)(x)
x = Activation('relu')(x)
# relation_extraction = Reshape([KG_EMBEDDING_DIM])(x)
relation_extraction = Transpose(x)
x = embedding_layer(main_input)
x = CuDNNLSTM(KG_EMBEDDING_DIM, return_sequences=True)(x)
x = Avg(x)
x = Dense(KG_EMBEDDING_DIM)(x)
x = Activation('relu')(x)
# entity_extraction = Reshape([KG_EMBEDDING_DIM])(x)
entity_extraction = Transpose(x)
final_output = Dense(units=20, activation='softmax')(Concatenate(axis=0)([entity_extraction,relation_extraction]))
m = Model(inputs=[main_input], outputs=[final_output])
main_input is the only input to this model. The relation_extraction and entity_extraction branches both start from that same input. The outputs of the two LSTM branches are transposed, concatenated, and passed through a Dense layer to produce the final output.
I had a similar problem to the one given in the following link:
ValueError: `decode_predictions` expects a batch of predictions (i.e. a 2D array of shape (samples, 1000)). Found array with shape: (1, 7)
I was able to run my code successfully and got the probabilities for each class. My concern is the order of the output probabilities: how do we know which probability belongs to which class, given that we assign or map them manually? Any assistance would be really appreciated!
Code:
class_list=style_info['articleType'].unique()
base_model = ResNet50(weights='imagenet',include_top=False,input_shape=(80, 60, 3))
img=glob.glob('images\\'+str(style_info.iloc[55,0])+'.jpg')
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 7 classes
predictions = Dense(no_of_classes, activation='softmax')(x)
res_model = Model(inputs=base_model.input, outputs=predictions)
image1 = image.load_img(img[0], target_size=(80,60))
x = image.img_to_array(image1)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
preds = res_model.predict(x)
# decode the results into a list of tuples (class, description, probability)
# (one such list for each sample in the batch)
#print('Predicted:', decode_predictions(preds, top=5)[0])
# Predicted: [(u'n02504013', u'Indian_elephant', 0.82658225), (u'n01871265', u'tusker', 0.1122357), (u'n02504458', u'African_elephant', 0.061040461)]
class_list[np.argmax(preds[0])]
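One way to think about the ordering: Keras does not know the class names, so column i of the softmax output corresponds to whatever integer i meant when the labels were encoded for training. The sketch below uses NumPy only and made-up data; the label values and the preds array are hypothetical stand-ins for style_info['articleType'] and res_model.predict(x). It fixes the class order once and reuses the same list for encoding and decoding. It assumes the new Dense head is trained with labels encoded this way; the snippet above does not show a training step.

```python
import numpy as np

# Hypothetical label column standing in for style_info['articleType'].
article_types = ["Shirts", "Jeans", "Shirts", "Watches", "Jeans"]

# Fix the class order once (order of first appearance, as pandas' unique() does).
class_list = list(dict.fromkeys(article_types))
class_to_index = {name: i for i, name in enumerate(class_list)}

# Integer and one-hot targets for training: column i now means class_list[i].
y_int = np.array([class_to_index[name] for name in article_types])
y_one_hot = np.eye(len(class_list))[y_int]

# Made-up prediction of shape (1, n_classes), standing in for res_model.predict(x).
preds = np.array([[0.1, 0.7, 0.2]])
predicted_class = class_list[int(np.argmax(preds[0]))]
print(predicted_class)  # Jeans
```

Saving class_list alongside the trained model keeps the mapping stable even if the rows of the label table are later reordered.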
I am looking for a way to use the TensorFlow API to create a neural network whose number of layers and hidden units are defined by the user.
Let's say I have a neural network like this:
hidden1 = tf.layers.dense(inp, units=32, kernel_initializer=tf.initializers.he_uniform(),activation=tf.nn.relu, name="hidden1")
bn1 = tf.layers.batch_normalization(inputs=hidden1, name="bn1")
hidden2 = tf.layers.dense(bn1, units=16, kernel_initializer=tf.initializers.he_uniform(),activation=tf.nn.relu, name="hidden2")
bn2 = tf.layers.batch_normalization(inputs=hidden2, name="bn2")
hidden3 = tf.layers.dense(bn2, units=8 , kernel_initializer=tf.initializers.he_uniform(),activation=tf.nn.relu, name="hidden3")
bn3 = tf.layers.batch_normalization(inputs=hidden3, name="bn3")
out = tf.layers.dense(bn3, units=1, kernel_initializer=tf.initializers.he_uniform(), activation=None, name="out")
In the snippet above, if I want 3 layers I have to repeat the code 3 times.
I am looking for a way to define this block with a for loop. For example, if the number of layers is 3, the loop should iterate and assign the units and activation for each layer according to the user's settings.
# pseudocode
for i in range(number_of_layer):
    hidden_(i) = tf.layers.dense(inp, units=32, kernel_initializer=tf.initializers.he_uniform(),activation=tf.nn.relu, name="hidden_(i)")
    bn_(i) = tf.layers.batch_normalization(inputs=hidden_(i), name="bn_(i)")
You can do it like this:
from keras.layers import Input, Dense, BatchNormalization, Dropout, ReLU
from keras.models import Model
# Define the number of units per hidden layer
layer_widths = [128, 64, 32]
# Set up input layer
input_layer = Input(...) # change according to your input
x = input_layer  # Input() already returns a tensor
# Iteratively add the hidden layers
for n_neurons in layer_widths:
    x = Dense(n_neurons)(x)
    x = ReLU()(x)
    x = BatchNormalization()(x)
    x = Dropout(0.5)(x)
# Add the output layer
output = Dense(16, activation='softmax')(x) # change according to your output
# Stack the model together
model = Model(input_layer, output)
Using the TensorFlow API:
inp = tf.placeholder("float", [None,2],name="inp")
units = [32, 16, 8]
x = inp  # keep the placeholder in inp so it can still be fed later
for i, n_units in enumerate(units, start=1):
    x = tf.layers.dense(x, units=n_units, kernel_initializer=tf.initializers.he_uniform(),activation=tf.nn.relu,name="hidden" + str(i))
    x = tf.layers.batch_normalization(inputs=x, name="bn" + str(i))
out = tf.layers.dense(x, units=1, kernel_initializer=tf.initializers.he_uniform(), activation=None, name="out")
I was trying to implement a ResNet model for CIFAR-10 on Google Colab, using the code from https://github.com/jzuern/cifar-classifier.
Instead of ReLU activation I'm using my own custom activation function. Here is the code:
def fonlaaf(x):
    return x/(1-tf.exp(-x))
def resnet_layer(inputs,
                 num_filters=16,
                 kernel_size=3,
                 strides=1,
                 activation='fonlaaf',
                 batch_normalization=True,
                 conv_first=True):
    """2D Convolution-Batch Normalization-Activation stack builder
    # Arguments
        inputs (tensor): input tensor from input image or previous layer
        num_filters (int): Conv2D number of filters
        kernel_size (int): Conv2D square kernel dimensions
        strides (int): Conv2D square stride dimensions
        activation (string): activation name
        batch_normalization (bool): whether to include batch normalization
        conv_first (bool): conv-bn-activation (True) or
            bn-activation-conv (False)
    # Returns
        x (tensor): tensor as input to the next layer
    """
    conv = Conv2D(num_filters,
                  kernel_size=kernel_size,
                  strides=strides,
                  padding='same',
                  kernel_initializer='he_normal',
                  kernel_regularizer=tf.keras.regularizers.l2(1e-4))
    x = inputs
    if conv_first:
        x = conv(x)
        if batch_normalization:
            x = BatchNormalization()(x)
        if activation is not None:
            x = fonlaaf(x)
    else:
        if batch_normalization:
            x = BatchNormalization()(x)
        if activation is not None:
            x = fonlaaf(x)
        x = conv(x)
    return x
def resnet_v2(input_shape, depth=20, num_classes=10):
    """ResNet Version 2 Model builder [b]
    Stacks of (1 x 1)-(3 x 3)-(1 x 1) BN-ReLU-Conv2D or also known as
    bottleneck layer
    First shortcut connection per layer is 1 x 1 Conv2D.
    Second and onwards shortcut connection is identity.
    At the beginning of each stage, the feature map size is halved (downsampled)
    by a convolutional layer with strides=2, while the number of filter maps is
    doubled. Within each stage, the layers have the same number filters and the
    same filter map sizes.
    Features maps sizes:
    conv1  : 32x32,  16
    stage 0: 32x32,  64
    stage 1: 16x16, 128
    stage 2:  8x8,  256
    # Arguments
        input_shape (tensor): shape of input image tensor
        depth (int): number of core convolutional layers
        num_classes (int): number of classes (CIFAR10 has 10)
    # Returns
        model (Model): Keras model instance
    """
    if (depth - 2) % 9 != 0:
        raise ValueError('depth should be 9n+2 (eg 56 or 110 in [b])')
    # Start model definition.
    num_filters_in = 16
    num_res_blocks = int((depth - 2) / 9)
    inputs = Input(shape=input_shape)
    # v2 performs Conv2D with BN-ReLU on input before splitting into 2 paths
    x = resnet_layer(inputs=inputs,
                     num_filters=num_filters_in,
                     conv_first=True)
    # Instantiate the stack of residual units
    for stage in range(3):
        for res_block in range(num_res_blocks):
            activation = 'relu'
            batch_normalization = True
            strides = 1
            if stage == 0:
                num_filters_out = num_filters_in * 4
                if res_block == 0:  # first layer and first stage
                    activation = None
                    batch_normalization = False
            else:
                num_filters_out = num_filters_in * 2
                if res_block == 0:  # first layer but not first stage
                    strides = 2  # downsample
            # bottleneck residual unit
            y = resnet_layer(inputs=x,
                             num_filters=num_filters_in,
                             kernel_size=1,
                             strides=strides,
                             activation=activation,
                             batch_normalization=batch_normalization,
                             conv_first=False)
            y = resnet_layer(inputs=y,
                             num_filters=num_filters_in,
                             conv_first=False)
            y = resnet_layer(inputs=y,
                             num_filters=num_filters_out,
                             kernel_size=1,
                             conv_first=False)
            if res_block == 0:
                # linear projection residual shortcut connection to match
                # changed dims
                x = resnet_layer(inputs=x,
                                 num_filters=num_filters_out,
                                 kernel_size=1,
                                 strides=strides,
                                 activation=None,
                                 batch_normalization=False)
            x = tf.keras.layers.add([x, y])
        num_filters_in = num_filters_out
    # Add classifier on top.
    # v2 has BN-ReLU before Pooling
    x = BatchNormalization()(x)
    x = fonlaaf(x)
    x = AveragePooling2D(pool_size=8)(x)
    y = Flatten()(x)
    outputs = Dense(num_classes,
                    activation='softmax',
                    kernel_initializer='he_normal')(y)
    # Instantiate model.
    model = Model(inputs=inputs, outputs=outputs)
    model.compile(loss='sparse_categorical_crossentropy',
                  optimizer=tf.keras.optimizers.Adam(lr=hparams.learning_rate),
                  metrics=['accuracy'])
    return model
tf.logging.set_verbosity(tf.logging.DEBUG)
resnet_model = resnet_v2((32, 32, 3), depth=56, num_classes=hparams.n_classes)
# Download and extract CIFAR-10 data
maybe_download_and_extract()
# training data
x_train, y_train = load_training_data()
# Validation data
x_val, y_val = load_validation_data()
# Testing data
x_test, y_test = load_testing_data()
# Define callbacks
callbacks = [
tf.keras.callbacks.TensorBoard(log_dir=hparams.checkpoint_dir)
]
# This will do preprocessing and realtime data augmentation:
datagen = ImageDataGenerator(
featurewise_center=False, # set input mean to 0 over the dataset
samplewise_center=False, # set each sample mean to 0
featurewise_std_normalization=False, # divide inputs by std of the dataset
samplewise_std_normalization=False, # divide each input by its std
zca_whitening=False, # apply ZCA whitening
zca_epsilon=1e-06, # epsilon for ZCA whitening
rotation_range=0, # randomly rotate images in the range (degrees, 0 to 180)
# randomly shift images horizontally (fraction of total width)
width_shift_range=0.1,
# randomly shift images vertically (fraction of total height)
height_shift_range=0.1,
# set mode for filling points outside the input boundaries
fill_mode='nearest',
cval=0., # value used for fill_mode = "constant"
horizontal_flip=True, # randomly flip images
vertical_flip=False)
# Compute quantities required for feature-wise normalization
# (std, mean, and principal components if ZCA whitening is applied).
datagen.fit(x_train)
# Fit the model on the batches generated by datagen.flow().
resnet_model.fit_generator(
datagen.flow(x_train, y_train,batch_size=hparams.train_batch_size),
epochs=hparams.n_epochs,
validation_data=(x_val, y_val),
workers=4,
callbacks=callbacks)
I got the following error: ValueError: Output tensors to a Model must be the output of a TensorFlow Layer (thus holding past layer metadata). Found: Tensor("dense/Softmax:0", shape=(?, 10), dtype=float32)
Most of the existing answers about this error didn't work for me. What am I missing here?
As the error states, every tensor on the path to the model output has to be produced by a Layer. Since fonlaaf() is a stateless activation function, you can wrap it in a Lambda layer.
Replace
def fonlaaf(x):
    return x/(1-tf.exp(-x))
with
def fonlaaf(x):
    return tf.keras.layers.Lambda(lambda x: x/(1-tf.exp(-x)))(x)
https://www.tensorflow.org/guide/keras/#custom_layers
https://www.tensorflow.org/api_docs/python/tf/keras/layers/Lambda
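A side note on the activation itself, separate from the Layer error: x/(1-exp(-x)) evaluates to 0/0 at exactly x = 0, although its limit there is 1. The NumPy sketch below is not part of the original answer and the name fonlaaf_np is made up. It shows one possible way to guard that point, using expm1 for accuracy near zero and the first two terms of the Taylor series (1 + x/2) for very small inputs. The same idea could be carried over to the TensorFlow version if NaNs show up during training.

```python
import numpy as np


def fonlaaf_np(x):
    """x / (1 - exp(-x)) with the removable singularity at x = 0 filled in."""
    x = np.asarray(x, dtype=np.float64)
    near_zero = np.abs(x) < 1e-8
    # Swap in a harmless value where x is ~0 so the division never hits 0/0.
    safe_x = np.where(near_zero, 1.0, x)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for small |x|.
    exact = safe_x / -np.expm1(-safe_x)
    # Near zero, use the series x / (1 - exp(-x)) = 1 + x/2 + ...
    return np.where(near_zero, 1.0 + x / 2.0, exact)


values = fonlaaf_np([-2.0, 0.0, 1.0, 50.0])
print(values)
```

For inputs away from zero this agrees with the direct formula; at zero it returns 1 instead of NaN.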
I am trying to develop a 1D convolutional neural network with residual connections and batch-normalization based on the paper Cardiologist-Level Arrhythmia Detection with Convolutional Neural Networks, using keras.
This is the code so far:
# define model
x = Input(shape=(time_steps, n_features))
# First Conv / BN / ReLU layer
y = Conv1D(filters=n_filters, kernel_size=n_kernel, strides=n_strides, padding='same')(x)
y = BatchNormalization()(y)
y = ReLU()(y)
shortcut = MaxPooling1D(pool_size = n_pool)(y)
# First Residual block
y = Conv1D(filters=n_filters, kernel_size=n_kernel, strides=n_strides, padding='same')(y)
y = BatchNormalization()(y)
y = ReLU()(y)
y = Dropout(rate=drop_rate)(y)
y = Conv1D(filters=n_filters, kernel_size=n_kernel, strides=n_strides, padding='same')(y)
# Add Residual (shortcut)
y = add([shortcut, y])
# Repeated Residual blocks
for k in range(2, 3): # smaller network for testing
    shortcut = MaxPooling1D(pool_size = n_pool)(y)
    y = BatchNormalization()(y)
    y = ReLU()(y)
    y = Dropout(rate=drop_rate)(y)
    y = Conv1D(filters=n_filters * k, kernel_size=n_kernel, strides=n_strides, padding='same')(y)
    y = BatchNormalization()(y)
    y = ReLU()(y)
    y = Dropout(rate=drop_rate)(y)
    y = Conv1D(filters=n_filters * k, kernel_size=n_kernel, strides=n_strides, padding='same')(y)
    y = add([shortcut, y])
z = BatchNormalization()(y)
z = ReLU()(z)
z = Flatten()(z)
z = Dense(64, activation='relu')(z)
predictions = Dense(classes, activation='softmax')(z)
model = Model(inputs=x, outputs=predictions)
# Compiling
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['categorical_accuracy'])
# Fitting
model.fit(train_x, train_y, epochs=n_epochs, batch_size=n_batch)
And this is the graph of a simplified model of what I am trying to build.
The model described in the paper uses an incrementing number of filters:
The network consists of 16 residual blocks with 2 convolutional layers per block. The convolutional layers all have a filter length of 16 and have 64k filters, where k starts out as 1 and is incremented every 4-th residual block. Every alternate residual block subsamples its inputs by a factor of 2, thus the original input is ultimately subsampled by a factor of 2^8. When a residual block subsamples the input, the corresponding shortcut connections also subsample their input using a Max Pooling operation with the same subsample factor.
But I can only make it work if I use the same number of filters in every Conv1D layer (k=1), with strides=1 and padding='same', and without any MaxPooling1D. Any change to these parameters causes a tensor size mismatch and a failure to compile, with the following error:
ValueError: Operands could not be broadcast together with shapes (70, 64) (70, 128)
Does anyone have any idea on how to fix this size mismatch and make it work?
In addition, if the input has more than one channel (or feature), the mismatch is even worse! Is there a way to deal with more than one channel?
The tensor shape mismatch arises in the add([shortcut, y]) layer. MaxPooling1D shortens the time axis of the shortcut (it halves the time steps with the default pool_size=2, which you can change through the pool_size parameter), while the residual branch, with strides=1 and padding='same', keeps the number of time steps unchanged. Apply strides=2 with padding='same' in one of the Conv1D layers of the residual branch (preferably the last one) so that both branches are reduced by the same factor before shortcut and y are added.
For reference, you can check out the ResNet code here: Keras-applications-github
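The add layer needs both operands to have identical shapes, so the time axis and the channel axis both have to agree. The error quoted in the question, (70, 64) versus (70, 128), is a case where the time steps already match but the channels do not. Below is a plain-Python sketch of the shape bookkeeping; the helper names and the (140, 64) block input are hypothetical. It also shows a kernel_size=1 convolution on the shortcut as one possible way to match the channels, similar to the linear projection shortcut in the 2D ResNet code earlier on this page. This is not necessarily what the paper does.

```python
import math


def conv1d_same(shape, filters, strides=1):
    """(steps, channels) after a Conv1D with padding='same'."""
    steps, _ = shape
    return (math.ceil(steps / strides), filters)


def maxpool1d(shape, pool_size=2):
    """(steps, channels) after MaxPooling1D with strides=pool_size, padding='valid'."""
    steps, channels = shape
    return ((steps - pool_size) // pool_size + 1, channels)


block_input = (140, 64)  # hypothetical (time_steps, channels) entering a block

# Shortcut as in the question: max pooling only, channels unchanged.
shortcut = maxpool1d(block_input, pool_size=2)

# Residual branch: twice the filters, strides=2 on the last convolution.
y = conv1d_same(block_input, filters=128, strides=1)
y = conv1d_same(y, filters=128, strides=2)

print(shortcut, y, shortcut == y)  # (70, 64) (70, 128) False

# A kernel_size=1 convolution on the shortcut brings its channels to 128 too.
shortcut_projected = conv1d_same(shortcut, filters=128)
print(shortcut_projected == y)  # True
```

The same bookkeeping applies to multi-channel inputs: the first Conv1D maps n_features channels to n_filters, so only tensors taken after a convolution with the same number of filters can be added together.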