I'm building a Keras model to categorise data into one of 9 categories. The issue is that it only works with a sigmoid activation, which is designed for binary outputs; other activations result in 0 accuracy. What would I need to change for it to classify into each of the labels?
#Reshape data to add new dimension
X_train = X_train.reshape((100, 150, 1))
Y_train = Y_train.reshape((100, 1, 1))
model = Sequential()
model.add(Conv1D(1, kernel_size=3, activation='relu', input_shape=(None, 1)))
model.add(Dense(1))
model.add(Activation('sigmoid'))
model.compile(loss='categorical_hinge', optimizer='adam', metrics=['accuracy'])
model.fit(x=X_train,y=Y_train, epochs=200, batch_size=20)
A single-unit dense layer is not what we use in the case of multi-class classification; you should first ensure that your Y data are one-hot encoded - if not, you can make them so using Keras utility functions:
num_classes=9
Y_train = keras.utils.to_categorical(Y_train, num_classes)
and then change your last layer to:
model.add(Dense(num_classes))
model.add(Activation('softmax'))
Also, if you don't have any specific reason to use the categorical hinge loss, I would suggest starting with loss='categorical_crossentropy' when compiling your model.
That said, your model seems too simple, and you may want to try adding some more layers...
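For example, a minimal sketch of a working version might look like this (the 150-step input shape comes from your reshape; the 16 convolution filters and the Flatten layer are just illustrative choices):
import keras
from keras.models import Sequential
from keras.layers import Conv1D, Flatten, Dense, Activation

num_classes = 9
X_train = X_train.reshape((100, 150, 1))
Y_train = keras.utils.to_categorical(Y_train, num_classes)  # one-hot labels, shape (100, 9)

model = Sequential()
model.add(Conv1D(16, kernel_size=3, activation='relu', input_shape=(150, 1)))
model.add(Flatten())  # collapse the time dimension before the classifier
model.add(Dense(num_classes))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(x=X_train, y=Y_train, epochs=200, batch_size=20)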
Related
I was wondering whether creating the model by passing activity_regularizer='l1_l2' as an argument to Conv2D()
model = keras.Sequential()
model.add(Conv2D(filters=16, kernel_size=(6, 6), strides=3, padding='valid', activation='relu',
activity_regularizer='l1_l2', input_shape=X_train[0].shape))
model.add(Dropout(0.2))
model.add(MaxPooling2D(pool_size=(3, 1), strides=3, padding='valid'))
model.add(Dropout(0.3))
model.add(Flatten())
model.add(Dense(10, activation='softmax'))
model.compile(optimizer=Adam(learning_rate = 0.001), loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])
model.summary()
history = model.fit(X_train, y_train, epochs = 10, validation_data = (X_val, y_val), verbose=0)
will mathematically make a difference compared to creating the model by adding model.add(ActivityRegularization(l1=..., l2=...)) separately?
model = keras.Sequential()
model.add(Conv2D(filters=16, kernel_size=(6, 6), strides=3, padding='valid', activation='relu',
input_shape=X_train[0].shape))
model.add(Dropout(0.2))
model.add(ActivityRegularization(l1=some_number, l2=some_number))
model.add(MaxPooling2D(pool_size=(3, 1), strides=3, padding='valid'))
model.add(Dropout(0.3))
model.add(Flatten())
model.add(Dense(10, activation='softmax'))
model.compile(optimizer=Adam(learning_rate = 0.001), loss = 'sparse_categorical_crossentropy', metrics = ['accuracy'])
model.summary()
history = model.fit(X_train, y_train, epochs = 10, validation_data = (X_val, y_val), verbose=0)
For me, it is hard to tell, as training always involves some randomness. But the results seem similar.
One additional question I have is: I accidentally passed the activity_regularizer='l1_l2' argument to the MaxPooling2D() layer before, and the code ran. How can that be, considering that activity_regularizer is not listed as a possible argument for MaxPooling2D() in the TensorFlow documentation?
Technically, if you are not applying any other constraint to the layer output, applying the activity regularizer inside the convolution layer and applying it outside the layer are the same. However, applying it outside the convolution layer gives the user more flexibility. For instance, the user might want to regularize the output units after the skip connections are set up instead of right after the convolution. It is just like having an activation function inside the convolution layer versus using keras.activations to apply the activation after the convolution layer; sometimes this is done after batch normalization.
For your second question, the MaxPool2D layer does accept the activity_regularizer argument. Even though this is not mentioned in its documentation, it makes sense intuitively, since the user might want to regularize the outputs after max-pooling. You can check that activity_regularizer works not only with the MaxPool2D layer but also with other layers, such as the BatchNormalization layer, for the same reason.
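For example, this runs without an error because activity_regularizer is accepted by the base Layer class that MaxPooling2D (and BatchNormalization) inherit from; the pool size here is just a placeholder:
import tensorflow as tf
pool = tf.keras.layers.MaxPooling2D(pool_size=(3, 1), activity_regularizer='l1_l2')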
Take this code,
tf.keras.layers.Conv2D(64, 2, padding='same', activity_regularizer='l1_l2')
and this code,
tf.keras.layers.Conv2D(64, 2, padding='same')
tf.keras.layers.ActivityRegularization()
They both do the same job; applying the regularization inside or outside the layer has the same impact. Moreover, TensorFlow builds a graph on the backend that first applies the conv layer and then the activity regularization, so in both cases the computation is done in the same way, with no difference.
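If you want to check this yourself, here is one quick sketch (toy shapes and values, not from the original post) that builds an identity layer both ways and compares the regularization losses Keras records:
import numpy as np
import tensorflow as tf

x = np.array([[1.0, -2.0, 3.0]], dtype='float32')

# activity regularizer passed inside the layer
inside = tf.keras.Sequential([
    tf.keras.layers.Dense(3, use_bias=False, kernel_initializer='identity',
                          activity_regularizer=tf.keras.regularizers.l1_l2(l1=0.01, l2=0.01),
                          input_shape=(3,)),
])

# the same penalty added by a separate ActivityRegularization layer
outside = tf.keras.Sequential([
    tf.keras.layers.Dense(3, use_bias=False, kernel_initializer='identity',
                          input_shape=(3,)),
    tf.keras.layers.ActivityRegularization(l1=0.01, l2=0.01),
])

_ = inside(x)   # the forward pass populates model.losses
_ = outside(x)
print(float(sum(inside.losses)), float(sum(outside.losses)))
# both print 0.2: 0.01*sum(|x|) + 0.01*sum(x**2) with an identity weight matrix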
I am working on predicting a time series in 3 dimensions and would like to know if it is possible to configure a model to output a matrix in Keras.
Currently, I have 3 regression models I train one after the other, one for predicting each output dimension. With a prediction horizon of 10 samples for example, each model is outputting a 10x1 vector. However, it seems like this could be done much more efficiently with a single model.
Thank you
I have found a much better way to do this using the Keras core layer Reshape. For an output of size (prediction horizon × predicted variables), add a Reshape layer with that shape after the final Dense layer:
from keras.models import Sequential
from keras.layers import Dense, Reshape, LSTM
model = Sequential()
model.add(LSTM(100, activation='relu', return_sequences=True, input_shape=(n_steps_in, n_features)))
model.add(LSTM(100, activation='relu'))
model.add(Dense(n_steps_out*n_features))
model.add(Reshape((n_steps_out,n_features)))
model.compile(optimizer='adam', loss='mse')
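As a quick sanity check (assuming, for illustration only, that the model above was built with n_steps_in=20, n_steps_out=10 and n_features=3), it returns one matrix per sample:
import numpy as np
X = np.random.rand(8, 20, 3)   # 8 dummy input windows: 20 steps, 3 features
y = np.random.rand(8, 10, 3)   # 8 dummy target matrices: horizon 10, 3 variables
model.fit(X, y, epochs=2, verbose=0)
print(model.predict(X).shape)  # (8, 10, 3)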
I figured out a pretty easy workaround. I just reshape the targets on the way in and reshape the predictions on the way out.
input_data = input_data.reshape((num_simulations,input_samples*3))
target_data = target_data.reshape((num_simulations,horizon*3))
model.fit(input_data, target_data, validation_split=0.2, epochs=epochs,
batch_size=batch_size, verbose=0, shuffle=True)
prediction = model.predict(input_data, batch_size=batch_size)
prediction = prediction.reshape((num_simulations,horizon,3))
I have 2 datasets. For the first dataset I want to apply a convolution and keep the result of the Flatten layer, then concatenate it with the other dataset and do a simple feed-forward pass. Is this possible with Keras?
def build_model(x_train,y_train):
np.random.seed(7)
left = Sequential()
left.add(Conv1D(nb_filter= 6, filter_length=3, input_shape= (48,1),activation = 'relu', kernel_initializer='glorot_uniform'))
left.add(Conv1D(nb_filter= 6, filter_length=3, activation= 'relu'))
#model.add(MaxPooling1D())
print model
#model.add(Dropout(0.2))
# flatten layer
#https://www.quora.com/What-is-the-meaning-of-flattening-step-in-a-convolutional-neural-network
left.add(Flatten())
left.add(Reshape((48,1)))
right = Sequential()
#model.add(Reshape((48,1)))
# Compile model
model.add(Merge([left, right], mode='sum'))
model.add(Dense(10, 10))
epochs = 100
lrate = 0.01
decay = lrate/epochs
sgd = SGD(lr=lrate, momentum=0.9, decay=decay, nesterov=False)
#clipvalue=0.5)
model.compile(loss='mean_squared_error', optimizer='Adam')
model.fit(x_train,y_train, nb_epoch =epochs, batch_size=10, verbose=1)
#model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'] , )
return model
You need to look at the functional API. The sequential model you are using is not designed to take multiple network inputs.
Follow the "Multi-input and multi-output models" example and you will have it working in no time!
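A minimal sketch of that idea (the 48-step sequence shape comes from your snippet; the 10 extra features, the layer sizes and the single regression target are placeholders) could look like:
from keras.models import Model
from keras.layers import Input, Conv1D, Flatten, Dense, concatenate

in_seq = Input(shape=(48, 1))                 # first dataset: sequences
x = Conv1D(6, 3, activation='relu')(in_seq)
x = Conv1D(6, 3, activation='relu')(x)
x = Flatten()(x)

in_feat = Input(shape=(10,))                  # second dataset: plain features
merged = concatenate([x, in_feat])            # join the flattened conv output with the second dataset

out = Dense(10, activation='relu')(merged)
out = Dense(1)(out)                           # single regression target (assumed)

model = Model(inputs=[in_seq, in_feat], outputs=out)
model.compile(loss='mean_squared_error', optimizer='adam')
# model.fit([x_train_seq, x_train_feat], y_train, epochs=100, batch_size=10)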
I am using a very simple MLP with just 1 hidden layer to estimate option prices.
In addition to the actual output of the neural network, I would also like to know the partial derivative of the output value (for each row of the data sample) with respect to one of the 6 input parameters, so that the resulting value can be interpreted as the percentage change of the output with respect to a change in that input parameter.
As I am pretty new to Keras and Neural Networks in general I was not able to come up with a solution for the problem myself.
# Create Model
model = Sequential()
model.add(Dense(6, input_dim=6)) #input layer
model.add(Dense(10, activation='relu')) #hidden layer
model.add(Dense(1, activation='linear')) #output layer
# Compile Model
model.compile(loss='mse', optimizer='adam', metrics=['mae'])
# Train model
model.fit(X_train, Y_train, epochs=50, batch_size=10, verbose=2, validation_split=0.2)
# Predict Values
Y_pred = model.predict(X_test, batch_size=10)
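One way to get those derivatives, as a sketch assuming a TensorFlow 2 / tf.keras setup and an X_test array of shape (n_samples, 6), is a GradientTape over the inputs; note that a literal percentage change would additionally require scaling by input/output, which is not shown here:
import tensorflow as tf
x = tf.convert_to_tensor(X_test, dtype=tf.float32)
with tf.GradientTape() as tape:
    tape.watch(x)            # track the inputs, not just the trainable weights
    y = model(x)             # predicted prices, shape (n_samples, 1)
grads = tape.gradient(y, x)  # shape (n_samples, 6): dY/dX_i for every row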
How do I train 1 model multiple times and combine them at the output layer?
For example:
model_one = Sequential() #model 1
model_one.add(Convolution2D(32, 3, 3, activation='relu', input_shape=(1,28,28)))
model_one.add(Flatten())
model_one.add(Dense(128, activation='relu'))
model_two = Sequential() #model 2
model_two.add(Dense(128, activation='relu', input_shape=(784,)))
model_two.add(Dense(128, activation='relu'))
model_???.add(Dense(10, activation='softmax')) #combine them here
model.compile(loss='categorical_crossentropy', # continue together
optimizer='adam',
metrics=['accuracy'])
model.fit(X_train, Y_train, # continue together somehow, even though this would never work because X_train and Y_train have the wrong formats
batch_size=32, nb_epoch=10, verbose=1)
I've heard I can do this through a graph model but I can't find any documentation on it.
EDIT: in reply to the suggestion below:
A1 = Conv2D(20,kernel_size=(5,5),activation='relu',input_shape=( 28, 28, 1))
---> B1 = MaxPooling2D(pool_size=(2,2))(A1)
throws this error:
AttributeError: 'Conv2D' object has no attribute 'get_shape'
Graph notation would do it for you. Essentially you give every layer a unique handle then link back to the previous layer using the handle in brackets at the end:
layer_handle = Layer(params)(prev_layer_handle)
Note that the first layer must be an Input(shape=(x,y)) with no prior connection.
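Applied to the snippet from your edit, each layer object has to be called on the previous layer's output tensor, which is why the bare Conv2D instance raised the AttributeError:
from keras.layers import Input, Conv2D, MaxPooling2D
inp = Input(shape=(28, 28, 1))
A1 = Conv2D(20, kernel_size=(5, 5), activation='relu')(inp)
B1 = MaxPooling2D(pool_size=(2, 2))(A1)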
Then when you make your model you need to tell it that it expects multiple inputs with a list:
model = Model(inputs=[in_layer1, in_layer2, ..], outputs=[out_layer1, out_layer2, ..])
Finally when you train it you also need to provide a list of input and output data that corresponds with your definition:
model.fit([x_train1, x_train2, ..], [y_train1, y_train2, ..])
Everything else stays the same, so you just need to combine the above to get the network layout that you want:
from keras.models import Model
from keras.layers import Input, Convolution2D, Flatten, Dense, Concatenate
# Note Keras 2.02, channel last dimension ordering
# Model 1
in1 = Input(shape=(28,28,1))
model_one_conv_1 = Convolution2D(32, (3, 3), activation='relu')(in1)
model_one_flat_1 = Flatten()(model_one_conv_1)
model_one_dense_1 = Dense(128, activation='relu')(model_one_flat_1)
# Model 2
in2 = Input(shape=(784, ))
model_two_dense_1 = Dense(128, activation='relu')(in2)
model_two_dense_2 = Dense(128, activation='relu')(model_two_dense_1)
# Model Final
model_final_concat = Concatenate(axis=-1)([model_one_dense_1, model_two_dense_2])
model_final_dense_1 = Dense(10, activation='softmax')(model_final_concat)
model = Model(inputs=[in1, in2], outputs=model_final_dense_1)
model.compile(loss='categorical_crossentropy', # continue together
optimizer='adam',
metrics=['accuracy'])
model.fit([X_train_one, X_train_two], Y_train,
batch_size=32, epochs=10, verbose=1)
Documentation can be found in the Functional Model API. I'd recommend reading around other questions or checking out Keras' repo as well since the documentation currently doesn't have many examples.