I am trying to implement a CNN with Theano, using the Keras library. My data set is 55 alphabet images, each 28x28.
In the last part I get this error:
train_acc=hist.history['acc']
KeyError: 'acc'
Any help would be much appreciated. Thanks.
This is part of my code:
from keras.models import Sequential
from keras.models import Model
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.optimizers import SGD, RMSprop, adam
from keras.utils import np_utils
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from urllib.request import urlretrieve
import pickle
import os
import gzip
import numpy as np
import theano
import lasagne
from lasagne import layers
from lasagne.updates import nesterov_momentum
from nolearn.lasagne import NeuralNet
from nolearn.lasagne import visualize
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from PIL import Image
import PIL.Image
#from Image import *
import webbrowser
from numpy import *
from sklearn.utils import shuffle
from sklearn.cross_validation import train_test_split
from tkinter import *
from tkinter.ttk import *
import tkinter
from keras import backend as K
K.set_image_dim_ordering('th')
%%%%%%%%%%
batch_size = 10
# number of output classes
nb_classes = 6
# number of epochs to train
nb_epoch = 5
# input image dimensions
img_rows, img_cols = 28, 28
# number of channels
img_channels = 3
# number of convolutional filters to use
nb_filters = 32
# size of pooling area for max pooling
nb_pool = 2
# convolution kernel size
nb_conv = 3
%%%%%%%%
model = Sequential()
model.add(Convolution2D(nb_filters, nb_conv, nb_conv,
                        border_mode='valid',
                        input_shape=(1, img_rows, img_cols)))
convout1 = Activation('relu')
model.add(convout1)
model.add(Convolution2D(nb_filters, nb_conv, nb_conv))
convout2 = Activation('relu')
model.add(convout2)
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adadelta')
%%%%%%%%%%%%
hist = model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
                 show_accuracy=True, verbose=1, validation_data=(X_test, Y_test))
hist = model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
                 show_accuracy=True, verbose=1, validation_split=0.2)
%%%%%%%%%%%%%%
train_loss=hist.history['loss']
val_loss=hist.history['val_loss']
train_acc=hist.history['acc']
val_acc=hist.history['val_acc']
xc=range(nb_epoch)
#xc=range(on_epoch_end)
plt.figure(1,figsize=(7,5))
plt.plot(xc,train_loss)
plt.plot(xc,val_loss)
plt.xlabel('num of Epochs')
plt.ylabel('loss')
plt.title('train_loss vs val_loss')
plt.grid(True)
plt.legend(['train','val'])
print (plt.style.available) # use bmh, classic,ggplot for big pictures
plt.style.use(['classic'])
plt.figure(2,figsize=(7,5))
plt.plot(xc,train_acc)
plt.plot(xc,val_acc)
plt.xlabel('num of Epochs')
plt.ylabel('accuracy')
plt.title('train_acc vs val_acc')
plt.grid(True)
plt.legend(['train','val'],loc=4)
#print plt.style.available # use bmh, classic,ggplot for big pictures
plt.style.use(['classic'])
In a not-so-common case (which I ran into after some TensorFlow updates), I still got the same error despite passing metrics=["accuracy"] in the model definition.
The solution was to replace metrics=["acc"] with metrics=["accuracy"] everywhere. In my case, I was unable to plot the history of my training until I replaced
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
to
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
Your log variable will be consistent with the metrics when you compile your model.
For example, the following code
model.compile(loss="mean_squared_error", optimizer=optimizer)
model.fit_generator(gen,epochs=50,callbacks=ModelCheckpoint("model_{acc}.hdf5")])
will gives a KeyError: 'acc' because you didn't set metrics=["accuracy"] in model.compile.
This error also happens when metrics are not matched. For example
model.compile(loss="mean_squared_error",optimizer=optimizer, metrics="binary_accuracy"])
model.fit_generator(gen,epochs=50,callbacks=ModelCheckpoint("model_{acc}.hdf5")])
still gives a KeyError: 'acc' because you set a binary_accuracy metric but asking for accuracy later.
If you change the above code to
model.compile(loss="mean_squared_error",optimizer=optimizer, metrics="binary_accuracy"])
model.fit_generator(gen,epochs=50,callbacks=ModelCheckpoint("model_{binary_accuracy}.hdf5")])
it will work.
You can use print(history.history.keys()) to find out what metrics you have and what they are called. In my case also, it was called "accuracy", not "acc"
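For example, a small sketch (reusing the same history variable) that works across versions by picking whichever key is actually present:
print(history.history.keys())
# 'acc' on older Keras, 'accuracy' on newer TensorFlow/Keras.
acc_key = 'acc' if 'acc' in history.history else 'accuracy'
train_acc = history.history[acc_key]
val_acc = history.history['val_' + acc_key]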
In my case switching from
metrics=["accuracy"]
to
metrics=["acc"]
was the solution.
From the Keras source:
warnings.warn('The "show_accuracy" argument is deprecated, '
              'instead you should pass the "accuracy" metric to '
              'the model at compile time:\n'
              '`model.compile(optimizer, loss, '
              'metrics=["accuracy"])`')
The right way to get the accuracy is indeed to compile your model like this:
model.compile(loss='categorical_crossentropy', optimizer='adadelta', metrics=["accuracy"])
does it work?
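For the original code, a minimal sketch of the corrected flow (reusing the variable names from the question; the history key names depend on the Keras/TensorFlow version, see the breaking-change note below):
# Compile with an accuracy metric instead of the deprecated show_accuracy argument.
model.compile(loss='categorical_crossentropy',
              optimizer='adadelta',
              metrics=['accuracy'])

hist = model.fit(X_train, Y_train,
                 batch_size=batch_size,
                 nb_epoch=nb_epoch,              # 'epochs' on Keras >= 2.0
                 verbose=1,
                 validation_data=(X_test, Y_test))

train_acc = hist.history['acc']                  # 'accuracy' on newer versions
val_acc = hist.history['val_acc']                # 'val_accuracy' on newer versions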
Make sure to check this "breaking change":
Metrics and losses are now reported under the exact name specified by the user (e.g. if you pass metrics=['acc'], your metric will be reported under the string "acc", not "accuracy", and conversely metrics=['accuracy'] will be reported under the string "accuracy").
If you are using TensorFlow 2.3, you can specify it like this:
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss=tf.keras.losses.CategoricalCrossentropy(),
              metrics=[tf.keras.metrics.CategoricalAccuracy(name="acc")])
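With the metric explicitly named "acc" as above, the old key names should be available again. A small sketch, assuming hypothetical x_train/y_train arrays:
history = model.fit(x_train, y_train, validation_split=0.2, epochs=5)
print(history.history.keys())   # expect 'acc' and 'val_acc' because of name="acc"
train_acc = history.history['acc']
val_acc = history.history['val_acc']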
In newer versions of TensorFlow some of these names have changed, so we have to replace it with:
acc = history.history['accuracy']
print(history.history.keys())
Output:
dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])
so you need to change "acc" to "accuracy" and "val_acc" to "val_accuracy"
For the practice notebook 3.5-classifying-movie-reviews.ipynb, change
acc = history.history['acc']
val_acc = history.history['val_acc']
To
acc = history.history['binary_accuracy']
val_acc = history.history['val_binary_accuracy']
&
Change
acc_values = history_dict['acc']
val_acc_values = history_dict['val_acc']
To
acc_values = history_dict['binary_accuracy']
val_acc_values = history_dict['val_binary_accuracy']
For the practice notebook 3.6-classifying-newswires.ipynb, change
acc = history.history['acc']
val_acc = history.history['val_acc']
To
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
I've been trying to create a model that recognizes different singing techniques. I have got good results, but I want to run different tests with different optimizers, layers, etc. However, I can't get reproducible results. Running this model training twice:
num_epochs = 100
batch_size = 128
history = modelo.fit(X_train_f, Y_train, validation_data=(X_test_f,Y_test), epochs=num_epochs, batch_size=batch_size, verbose=2)
I can get 25% accuracy on the first run and 34% on the second. If I then change the optimizer from "sgd" to "adam", I get 99%. If I go back to the "sgd" optimizer that gave me 34% on the second run, I get 100% or something equally crazy. I don't understand why.
I've tried many things I've read in similar questions. The following lines show how I am trying to make my code reproducible; they are actually the first lines of my whole script:
import numpy as np
import tensorflow as tf
import random as rn
import os
#https://stackoverflow.com/questions/57305909/tensorflow-keras-reproducibility-problem-on-google-colab
os.environ['PYTHONHASHSEED']=str(5)
np.random.seed(5)
rn.seed(12345)
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1,
                                        inter_op_parallelism_threads=1)
tf.compat.v1.set_random_seed(1234)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
tf.compat.v1.keras.backend.set_session(sess)
The question is: what am I doing wrong in the code above that makes it not work (as I mentioned)?
Here's where I create the training sets:
from keras.datasets import mnist
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers.convolutional import Conv1D, MaxPooling1D
from keras.layers.core import Dense, Flatten
from keras.layers import BatchNormalization,Activation
from keras.optimizers import SGD, Adam
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size=0.2, random_state=2)
My model:
from tensorflow.keras import layers
from tensorflow.keras import initializers
input_dim = X_train_f.shape[1]
output_dim = Y_train.shape[1]
modelo = Sequential()
modelo.add(Conv1D(filters=6, kernel_initializer=initializers.glorot_uniform(seed=5), kernel_size=5, activation='relu', input_shape=(40, 1))) # 6
modelo.add(MaxPooling1D(pool_size=2))
modelo.add(Conv1D(filters=16, kernel_initializer=initializers.glorot_uniform(seed=5), kernel_size=5, activation='relu')) # 16
modelo.add(MaxPooling1D(pool_size=2))
modelo.add(Flatten())
modelo.add(Dense(120, kernel_initializer=initializers.glorot_uniform(seed=5), activation='relu')) # 120
modelo.add(Dense(84, kernel_initializer=initializers.glorot_uniform(seed=5), activation='relu')) # 84
modelo.add(Dense(nclases, kernel_initializer=initializers.glorot_uniform(seed=5), activation='softmax'))
sgd = SGD(lr=0.1)
#modelo.compile(loss='categorical_crossentropy',
#               optimizer='adam',
#               metrics=['accuracy'])
modelo.compile(loss='categorical_crossentropy',
               optimizer=sgd,
               metrics=['accuracy'])
modelo.summary()
modelo.input_shape
This is a normal situation. The Adam optimizer is much more powerful compared to SGD: Adam implicitly performs coordinate-wise gradient clipping and can therefore, unlike SGD, handle heavy-tailed gradient noise.
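As a side note on the seeding boilerplate in the question: on recent TensorFlow releases (roughly 2.7+ for the seeding helper and 2.9+ for op determinism, as far as I know), that block can be replaced by a much shorter sketch:
import tensorflow as tf

# Seeds Python's random, NumPy and TensorFlow RNGs in one call
# (only on recent TF releases; otherwise keep the manual seeding above).
tf.keras.utils.set_random_seed(5)

# Optional: make ops deterministic at some performance cost (newer TF only).
tf.config.experimental.enable_op_determinism()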
I have a folder called train, and inside it three separate folders for "Covid", "Pneumonia" and "Healthy". I have created a VGG19 model using transfer learning, and I want help creating a confusion matrix for my test data. I have split my training data 75%/25%. I have tried everything to create the confusion matrix and calculate the F1 score. It would be really great if someone could provide the code, since I am new to this.
#import needed packages
import numpy as np
import tensorflow as tf
import keras
from keras import backend as K
from keras.optimizers import Adam
from keras.metrics import categorical_crossentropy
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D, Dropout, SeparableConv2D, BatchNormalization, Activation
from keras.applications.vgg19 import VGG19
from keras.applications.resnet import ResNet101
from keras.optimizers import Adam
from keras.layers import Input, Lambda, Dense, Flatten
import matplotlib.pyplot as plt
# dataset has 3 classes
num_class = 3
# Base model without Fully connected Layers
base_model = VGG19(include_top=False, weights='imagenet', input_shape=(224,224,3))
# don't train existing weights
for layer in base_model.layers:
    layer.trainable = False
x=base_model.output
x = Flatten()(base_model.output)
## 3 classes: COVID, PNEUMONIA, NORMAL
preds=Dense(num_class, activation='softmax')(x) #final layer with softmax activation
model=Model(inputs=base_model.input,outputs=preds)
##check the model
model.summary()
# tell the model what cost and optimization method to use
model.compile(
    loss='categorical_crossentropy',
    optimizer='adam',
    metrics=['accuracy']
)
##Preparing Data
train_datagen = ImageDataGenerator(preprocessing_function=keras.applications.vgg16.preprocess_input,
                                   validation_split=0.25)
train_generator = train_datagen.flow_from_directory('C:/Users/prani/OneDrive/Desktop/Masters project/chest_xray/Train-250/',
                                                    target_size=(224,224),
                                                    batch_size=64,
                                                    class_mode='categorical',
                                                    subset='training')
validation_generator = train_datagen.flow_from_directory(
    'C:/Users/prani/OneDrive/Desktop/Masters project/chest_xray/Train-250/',  # same directory as training data
    target_size=(224,224),
    batch_size=64,
    class_mode='categorical',
    subset='validation')  # set as validation data
##Training
## Setting hyperparameters
epochs = 50
learning_rate = 0.0005
decay_rate = learning_rate / epochs
opt = Adam(lr=learning_rate, beta_1=0.9, beta_2=0.999, decay=decay_rate, amsgrad=False)
model.compile(optimizer=opt,loss='categorical_crossentropy',metrics=['accuracy'])
##Train
step_size_train = train_generator.n/train_generator.batch_size
step_size_val = validation_generator.samples // validation_generator.batch_size
r = model.fit_generator(generator=train_generator,
                        steps_per_epoch=step_size_train,
                        validation_data=validation_generator,
                        validation_steps=step_size_val,
                        epochs=50)
## Confusion matrix (sample code, but it is not working: it produces neither an error nor results)
from sklearn.metrics import classification_report, confusion_matrix
batch_size=64
Y_pred = model.predict_generator(validation_generator,186)
y_pred = np.argmax(Y_pred, axis=1)
print('Confusion Matrix')
print(confusion_matrix(validation_generator.classes, y_pred))
print('Classification Report')
target_names = ['COVID','NORMAL','PNEUMONIA']
print(classification_report(validation_generator.classes, y_pred, target_names=target_names))
You could look at this page for information on how to code a confusion matrix and other metrics:
https://www.tensorflow.org/tutorials/structured_data/imbalanced_data
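If it helps, here is a rough sketch of the confusion-matrix part (untested on your data, reusing the directory path and generator settings from your code). The key detail is that the generator used for prediction must not shuffle, otherwise validation_generator.classes will not line up with the predicted rows:
from sklearn.metrics import classification_report, confusion_matrix
import numpy as np

# A separate, non-shuffled generator so .classes stays aligned with the predictions.
eval_generator = train_datagen.flow_from_directory(
    'C:/Users/prani/OneDrive/Desktop/Masters project/chest_xray/Train-250/',
    target_size=(224, 224),
    batch_size=64,
    class_mode='categorical',
    subset='validation',
    shuffle=False)

steps = int(np.ceil(eval_generator.samples / eval_generator.batch_size))
Y_pred = model.predict_generator(eval_generator, steps)
y_pred = np.argmax(Y_pred, axis=1)

print('Confusion Matrix')
print(confusion_matrix(eval_generator.classes, y_pred))
print('Classification Report')  # includes per-class precision, recall and F1
print(classification_report(eval_generator.classes, y_pred,
                            target_names=list(eval_generator.class_indices.keys())))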
I'm trying to train a deep classifier in Keras, both with and without pretraining of the hidden layers via stacked autoencoders. My problem is that the pretraining seems to drastically degrade performance (i.e. if pretrain is set to False in the code below, the training error of the final classification layer converges much faster). This seems completely outrageous to me, given that pretraining should only initialize the weights of the hidden layers, and I don't see how that could completely kill the model's performance even if that initialization does not work very well. I cannot include the specific dataset I used, but the effect should occur for any appropriate dataset (e.g. MNIST). What is going on here and how can I fix it?
EDIT: code is now reproducible with the MNIST data, final line prints change in loss function, which is significantly lower with pre-training.
I have also slightly modified the code and added sample learning curves below:
from functools import partial
import matplotlib.pyplot as plt
from keras.datasets import mnist
from keras.layers import Dense
from keras.models import Sequential
from keras.optimizers import SGD
from keras.regularizers import l2
from keras.utils import to_categorical
(inputs_train, targets_train), _ = mnist.load_data()
inputs_train = inputs_train[:1000].reshape(1000, 784)
targets_train = to_categorical(targets_train[:1000])
hidden_nodes = [256] * 4
learning_rate = 0.01
regularization = 1e-6
epochs = 30
def train_model(pretrain):
    model = Sequential()
    layer = partial(Dense,
                    activation='sigmoid',
                    kernel_initializer='random_normal',
                    kernel_regularizer=l2(regularization))
    for i, hn in enumerate(hidden_nodes):
        kwargs = dict(units=hn, name='hidden_{}'.format(i + 1))
        if i == 0:
            kwargs['input_dim'] = inputs_train.shape[1]
        model.add(layer(**kwargs))
    if pretrain:
        # train autoencoders
        inputs_train_ = inputs_train.copy()
        for i, hn in enumerate(hidden_nodes):
            autoencoder = Sequential()
            autoencoder.add(layer(units=hn,
                                  input_dim=inputs_train_.shape[1],
                                  name='hidden'))
            autoencoder.add(layer(units=inputs_train_.shape[1],
                                  name='decode'))
            autoencoder.compile(optimizer=SGD(lr=learning_rate, momentum=0.9),
                                loss='binary_crossentropy')
            autoencoder.fit(
                inputs_train_,
                inputs_train_,
                batch_size=32,
                epochs=epochs,
                verbose=0)
            autoencoder.pop()
            model.layers[i].set_weights(autoencoder.layers[0].get_weights())
            inputs_train_ = autoencoder.predict(inputs_train_)
    num_classes = targets_train.shape[1]
    model.add(Dense(units=num_classes,
                    activation='softmax',
                    name='classify'))
    model.compile(optimizer=SGD(lr=learning_rate, momentum=0.9),
                  loss='categorical_crossentropy')
    h = model.fit(
        inputs_train,
        targets_train,
        batch_size=32,
        epochs=epochs,
        verbose=0)
    return h.history['loss']
plt.plot(train_model(pretrain=False), label="Without Pre-Training")
plt.plot(train_model(pretrain=True), label="With Pre-Training")
plt.xlabel("Epoch")
plt.ylabel("Cross-Entropy")
plt.legend()
plt.show()
I want to build a non-linear regression model using Keras to predict a positive continuous variable.
For the below model how do I select the following hyperparameters?
Number of Hidden layers and Neurons
Dropout ratio
Use BatchNormalization or not
Activation function out of linear, relu, tanh, sigmoid
Best optimizer to use among adam, rmsprog, sgd
Code
def dnn_reg():
    model = Sequential()
    # layer 1
    model.add(Dense(40, input_dim=13, kernel_initializer='normal'))
    model.add(Activation('tanh'))
    model.add(Dropout(0.2))
    # layer 2
    model.add(Dense(30, kernel_initializer='normal'))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(Dropout(0.4))
    # layer 3
    model.add(Dense(5, kernel_initializer='normal'))
    model.add(BatchNormalization())
    model.add(Activation('relu'))
    model.add(Dropout(0.4))
    model.add(Dense(1, kernel_initializer='normal'))
    model.add(Activation('relu'))
    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
I have considered random grid search, but I would rather use hyperopt, which I believe will be faster. I initially implemented the tuning using https://github.com/maxpumperla/hyperas, but Hyperas is not working with the latest version of Keras. I suspect that Keras is evolving fast and it's difficult for the maintainer to keep it compatible, so I think using hyperopt directly will be the better option.
PS: I am new to Bayesian optimization for hyperparameter tuning and to hyperopt.
I've had a lot of success with Hyperas. The following are the things I've learned to make it work.
1) Run it as a python script from the terminal (not from an Ipython notebook)
2) Make sure that you do not have any comments in your code (Hyperas doesn't like comments!)
3) Encapsulate your data and model in a function as described in the hyperas readme.
Below is an example of a Hyperas script that worked for me (following the instructions above).
from __future__ import print_function
from hyperopt import Trials, STATUS_OK, tpe
from keras.datasets import mnist
from keras.layers.core import Dense, Dropout, Activation
from keras.models import Sequential
from keras.utils import np_utils
import numpy as np
from hyperas import optim
from keras.models import model_from_json
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.optimizers import SGD, Adam
from keras import backend as K
import tensorflow as tf
from hyperas.distributions import choice, uniform, conditional
__author__ = 'JOnathan Hilgart'
def data():
    """
    Data providing function:
    This function is separated from model() so that hyperopt
    won't reload data for each evaluation run.
    """
    import numpy as np
    x = np.load('training_x.npy')
    y = np.load('training_y.npy')
    x_train = x[:15000, :]
    y_train = y[:15000, :]
    x_test = x[15000:, :]
    y_test = y[15000:, :]
    return x_train, y_train, x_test, y_test
def model(x_train, y_train, x_test, y_test):
    """
    Model providing function:
    Create Keras model with double curly brackets dropped-in as needed.
    Return value has to be a valid python dictionary with two customary keys:
    - loss: Specify a numeric evaluation metric to be minimized
    - status: Just use STATUS_OK and see hyperopt documentation if not feasible
    The last one is optional, though recommended, namely:
    - model: specify the model just created so that we can later use it again.
    """
    model_mlp = Sequential()
    model_mlp.add(Dense({{choice([32, 64, 126, 256, 512, 1024])}},
                        activation='relu', input_shape=(2,)))
    model_mlp.add(Dropout({{uniform(0, .5)}}))
    model_mlp.add(Dense({{choice([32, 64, 126, 256, 512, 1024])}}))
    model_mlp.add(Activation({{choice(['relu', 'sigmoid'])}}))
    model_mlp.add(Dropout({{uniform(0, .5)}}))
    model_mlp.add(Dense({{choice([32, 64, 126, 256, 512, 1024])}}))
    model_mlp.add(Activation({{choice(['relu', 'sigmoid'])}}))
    model_mlp.add(Dropout({{uniform(0, .5)}}))
    model_mlp.add(Dense({{choice([32, 64, 126, 256, 512, 1024])}}))
    model_mlp.add(Activation({{choice(['relu', 'sigmoid'])}}))
    model_mlp.add(Dropout({{uniform(0, .5)}}))
    model_mlp.add(Dense(9))
    model_mlp.add(Activation({{choice(['softmax', 'linear'])}}))
    model_mlp.compile(loss={{choice(['categorical_crossentropy', 'mse'])}},
                      metrics=['accuracy'],
                      optimizer={{choice(['rmsprop', 'adam', 'sgd'])}})
    model_mlp.fit(x_train, y_train,
                  batch_size={{choice([16, 32, 64, 128])}},
                  epochs=50,
                  verbose=2,
                  validation_data=(x_test, y_test))
    score, acc = model_mlp.evaluate(x_test, y_test, verbose=0)
    print('Test accuracy:', acc)
    return {'loss': -acc, 'status': STATUS_OK, 'model': model_mlp}
if __name__ == '__main__':
    import gc; gc.collect()
    with K.get_session():  ## TF session
        best_run, best_model = optim.minimize(model=model,
                                              data=data,
                                              algo=tpe.suggest,
                                              max_evals=2,
                                              trials=Trials())
        X_train, Y_train, X_test, Y_test = data()
        print("Evaluation of best performing model:")
        print(best_model.evaluate(X_test, Y_test))
        print("Best performing model chosen hyper-parameters:")
        print(best_run)
This is caused by a different garbage-collection order: if Python collects the session first, the program exits successfully; if Python collects the swig-wrapped memory (tf_session) first, the program exits with a failure.
You can force Python to delete the session with:
del session
or, if you are using Keras and can't get hold of the session instance, you can run the following code at the end of your script:
import gc; gc.collect()
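For the Keras case, a minimal sketch of the same idea, assuming the standard keras.backend API:
from keras import backend as K
import gc

try:
    # ... build, tune and evaluate models here ...
    pass
finally:
    # Release the TensorFlow session before interpreter shutdown so the
    # swig-wrapped session object is not collected in the wrong order.
    K.clear_session()
    gc.collect()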
This can also be another approach:
from hyperopt import fmin, tpe, hp, STATUS_OK, Trials
from sklearn.metrics import roc_auc_score
import sys
X = []
y = []
X_val = []
y_val = []
space = {'choice': hp.choice('num_layers',
                             [{'layers': 'two', },
                              {'layers': 'three',
                               'units3': hp.uniform('units3', 64, 1024),
                               'dropout3': hp.uniform('dropout3', .25, .75)}
                              ]),
         'units1': hp.uniform('units1', 64, 1024),
         'units2': hp.uniform('units2', 64, 1024),
         'dropout1': hp.uniform('dropout1', .25, .75),
         'dropout2': hp.uniform('dropout2', .25, .75),
         'batch_size': hp.uniform('batch_size', 28, 128),
         'nb_epochs': 100,
         'optimizer': hp.choice('optimizer', ['adadelta', 'adam', 'rmsprop']),
         'activation': 'relu'
         }
def f_nn(params):
    from keras.models import Sequential
    from keras.layers.core import Dense, Dropout, Activation
    from keras.optimizers import Adadelta, Adam, rmsprop
    print('Params testing: ', params)
    model = Sequential()
    model.add(Dense(output_dim=params['units1'], input_dim=X.shape[1]))
    model.add(Activation(params['activation']))
    model.add(Dropout(params['dropout1']))
    model.add(Dense(output_dim=params['units2'], init="glorot_uniform"))
    model.add(Activation(params['activation']))
    model.add(Dropout(params['dropout2']))
    if params['choice']['layers'] == 'three':
        model.add(Dense(output_dim=params['choice']['units3'], init="glorot_uniform"))
        model.add(Activation(params['activation']))
        model.add(Dropout(params['choice']['dropout3']))
    model.add(Dense(1))
    model.add(Activation('sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer=params['optimizer'])
    model.fit(X, y, nb_epoch=params['nb_epochs'], batch_size=params['batch_size'], verbose=0)
    pred_auc = model.predict_proba(X_val, batch_size=128, verbose=0)
    acc = roc_auc_score(y_val, pred_auc)
    print('AUC:', acc)
    sys.stdout.flush()
    return {'loss': -acc, 'status': STATUS_OK}

trials = Trials()
best = fmin(f_nn, space, algo=tpe.suggest, max_evals=50, trials=trials)
print('best: ', best)
Source
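One caveat with the search space above (my own observation, not part of the sourced answer): hp.uniform returns floats, so values used as layer sizes or batch sizes need an int(...) cast inside f_nn, or can be drawn with hp.quniform instead, e.g.:
from hyperopt import hp

# Quantized draws that land on whole numbers; still cast with int(...) when used.
space['units1'] = hp.quniform('units1', 64, 1024, 1)
space['batch_size'] = hp.quniform('batch_size', 28, 128, 1)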
I'm using a combination of sklearn and Keras running with Theano as its back-end. I'm using the following code:
import numpy as np
import pandas as pd
from pandas import Series, DataFrame
import keras
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.constraints import maxnorm
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import SGD
from keras.wrappers.scikit_learn import KerasClassifier
from keras.constraints import maxnorm
from keras.utils.np_utils import to_categorical
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
from sklearn.model_selection import train_test_split
from datetime import datetime
import time
from datetime import timedelta
from __future__ import division
seed = 7
np.random.seed(seed)
Y = data['Genre']
del data['Genre']
X = data
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
X = X.as_matrix().astype("float")
calls=[EarlyStopping(monitor='acc', patience=10), ModelCheckpoint('C:/Users/1383921/Documents/NNs/model', monitor='acc', save_best_only=True, mode='auto', period=1)]
def create_baseline():
    # create model
    model = Sequential()
    model.add(Dense(18, input_dim=9, init='normal', activation='relu'))
    model.add(Dense(9, init='normal', activation='relu'))
    model.add(Dense(12, init='normal', activation='softmax'))
    # Compile model
    sgd = SGD(lr=0.01, momentum=0.8, decay=0.0, nesterov=False)
    model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
    return model
np.random.seed(seed)
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasClassifier(build_fn=create_baseline, nb_epoch=300, batch_size=16, verbose=2)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(pipeline, X, encoded_Y, cv=kfold, fit_params={'mlp__callbacks':calls})
print("Baseline: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))
The result when I start running this last part is:
Epoch 1/10
...
Epoch 2/10
etc.
It's supposed to be Epoch 1/300, and it works just fine when I run it in a different notebook.
What do you guys think is happening? nb_epoch=300...
What Keras version is this? If it's greater than 2.0, then nb_epoch was changed to just epochs; otherwise it defaults to 10.
In Keras 2.0 the nb_epoch parameter was renamed to epochs so when you set epochs=300 it runs 300 epochs. If you use nb_epoch=300 it will default to 10 instead.
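For example, with the scikit-learn wrapper from the question, a sketch of the renamed argument would be:
# Keras >= 2.0 expects 'epochs'; with 'nb_epoch' the wrapper falls back to the
# default of 10 epochs, which is what the question observed.
estimators.append(('mlp', KerasClassifier(build_fn=create_baseline,
                                          epochs=300,
                                          batch_size=16,
                                          verbose=2)))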
Another solution to your problem: Forget about nb_epoch (doesn't work). Pass epochs inside fit_params:
results = cross_val_score(pipeline, X, encoded_Y, cv=kfold,
                          fit_params={'epochs': 300, 'mlp__callbacks': calls})
And that would work: fit_params goes straight into the fit method, so it will pick up the right number of epochs.
The parameter name in your function should be epochs instead of nb_epochs.
Be very careful though. For example, I trained my ANN the old-fashioned way of declaring the parameters (nb_epochs = number) and it worked (the IPython console only showed me some warnings), but when I plugged the same parameter names into the cross_val_score function, it did not work.
I think that what sklearn calls "Epoch" here is one step of your cross-validation, so it does 300 epochs of training 10 times :-) Is that possible? Try with verbose=1.