I want the classifier to run faster and stop early if the patience reaches the number I set. In the following code it does 10 iterations of fitting the model.
import numpy
import pandas
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from keras.wrappers.scikit_learn import KerasClassifier
from keras.callbacks import EarlyStopping, ModelCheckpoint
from keras.constraints import maxnorm
from keras.optimizers import SGD
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import StratifiedKFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
# fix random seed for reproducibility
seed = 7
numpy.random.seed(seed)
# load dataset
dataframe = pandas.read_csv("sonar.csv", header=None)
dataset = dataframe.values
# split into input (X) and output (Y) variables
X = dataset[:,0:60].astype(float)
Y = dataset[:,60]
# encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
calls=[EarlyStopping(monitor='acc', patience=10), ModelCheckpoint('C:/Users/Nick/Data Science/model', monitor='acc', save_best_only=True, mode='auto', period=1)]
def create_baseline():
    # create model
    model = Sequential()
    model.add(Dropout(0.2, input_shape=(33,)))
    model.add(Dense(33, init='normal', activation='relu', W_constraint=maxnorm(3)))
    model.add(Dense(16, init='normal', activation='relu', W_constraint=maxnorm(3)))
    model.add(Dense(122, init='normal', activation='softmax'))
    # Compile model
    sgd = SGD(lr=0.1, momentum=0.8, decay=0.0, nesterov=False)
    model.compile(loss='categorical_crossentropy', optimizer=sgd, metrics=['accuracy'])
    return model
numpy.random.seed(seed)
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasClassifier(build_fn=create_baseline, nb_epoch=300, batch_size=16, verbose=0, callbacks=calls)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(pipeline, X, encoded_Y, cv=kfold)
print("Baseline: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))
Here is the resulting error-
RuntimeError: Cannot clone object <keras.wrappers.scikit_learn.KerasClassifier object at 0x000000001D691438>, as the constructor does not seem to set parameter callbacks
I changed the cross_val_score in the following-
numpy.random.seed(seed)
estimators = []
estimators.append(('standardize', StandardScaler()))
estimators.append(('mlp', KerasClassifier(build_fn=create_baseline, nb_epoch=300, batch_size=16, verbose=0, callbacks=calls)))
pipeline = Pipeline(estimators)
kfold = StratifiedKFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(pipeline, X, encoded_Y, cv=kfold, fit_params={'callbacks':calls})
print("Baseline: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))
and now I get this error-
ValueError: need more than 1 value to unpack
This code came from here. The code is by far the most accurate I've used so far. The problem is that there is no explicit model.fit() call anywhere in the code, and it takes forever to fit. The fit() happens inside results = cross_val_score(...), and there are no parameters there for passing in a callback.
How do I go about doing this?
Also, how do I run the model trained on a test set?
I need to be able to save the trained model for later use...
Reading from here, which is the source code of KerasClassifier, you can pass it the arguments of fit and they should be used.
I don't have your dataset so I cannot test it, but you can tell me if this works and if not I will try and adapt the solution. Change this line :
estimators.append(('mlp', KerasClassifier(build_fn=create_baseline, nb_epoch=300, batch_size=16, verbose=0, callbacks=[...your_callbacks...])))
A short explanation of what's happening: KerasClassifier accepts all the possible arguments for fit, predict and score, and uses them accordingly when each method is called. It has a function that filters the arguments and routes them to whichever of those methods the pipeline calls.
I guess there are several fit and predict calls inside the StratifiedKFold step, to train on a different split every time.
The reason it takes forever and fits 10 times is that a single fit runs 300 epochs, as you asked. The KFold then repeats this step over the different folds:
it calls fit with all the parameters given to KerasClassifier (300 epochs and batch size = 16), training on 9/10 of your data and using the remaining 1/10 as validation.
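To give a feel for how patience-based stopping could shorten each of those 300-epoch fits, here is a small pure-Python sketch of the rule. It is a simplified imitation, not Keras's EarlyStopping code; stopping_epoch and the accuracy values are made up for the illustration.

```python
def stopping_epoch(history, patience, mode="max"):
    """Return the 1-based epoch at which training would stop.

    If the metric keeps improving often enough, the full length of
    `history` is returned (training runs to the end).
    """
    best = None
    wait = 0
    for epoch, value in enumerate(history, start=1):
        if mode == "max":
            improved = best is None or value > best
        else:
            improved = best is None or value < best
        if improved:
            best = value
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return len(history)


# Made-up accuracy per epoch: it plateaus after the third epoch.
accuracy = [0.50, 0.60, 0.70, 0.70, 0.69, 0.70, 0.71, 0.72]
stopped_at = stopping_epoch(accuracy, patience=3)
print(stopped_at)  # 6
```

With patience=3 the run ends after three epochs without a new best value, even though later epochs would have improved again, so a larger patience trades time for a lower risk of stopping too soon.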
EDIT :
Ok, so I took the time to download the dataset and try your code... First of all, you need to correct a "few" things in your network:
Your input has 60 features. You clearly show it in your data prep:
X = dataset[:,:60].astype(float)
so why would you have this :
model.add(Dropout(0.2, input_shape=(33,)))
please change to :
model.add(Dropout(0.2, input_shape=(60,)))
About your targets/labels. You changed the objective from the original code (binary_crossentropy) to categorical_crossentropy. But you didn't change your Y array. So either do this in your data preparation :
from keras.utils.np_utils import to_categorical
encoded_Y = to_categorical(encoder.transform(Y))
or change your objective back to binary_crossentropy.
Now the network's output size: 122 on the last dense layer? Your dataset obviously has 2 categories, so why are you trying to output 122 classes? It won't match the target. Please change your last layer back to:
model.add(Dense(2, init='normal', activation='softmax'))
if you choose to use categorical_crossentropy, or
model.add(Dense(1, init='normal', activation='sigmoid'))
if you go back to binary_crossentropy.
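The mismatch between loss, target shape and output layer is easier to see on a toy example. The sketch below uses plain Python and four made-up labels (the label strings are an assumption) to show the two target layouts: a single integer column, which pairs with Dense(1) + sigmoid and binary_crossentropy, and a one-hot matrix, which pairs with Dense(2) + softmax and categorical_crossentropy.

```python
# Four made-up labels standing in for the last column of the dataset.
labels = ["R", "M", "M", "R"]

# Integer encoding: classes are sorted, then numbered from 0.
classes = sorted(set(labels))
index = {name: i for i, name in enumerate(classes)}
encoded = [index[name] for name in labels]
print(encoded)  # [1, 0, 0, 1]

# One-hot encoding: one column per class, a single 1 per row.
one_hot = [[1 if i == code else 0 for i in range(len(classes))]
           for code in encoded]
print(one_hot)  # [[0, 1], [1, 0], [1, 0], [0, 1]]
```

The width of each one-hot row (2 here) is what the size of a softmax output layer has to match.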
So now that your network compiles, I could start to troubleshoot.
Here is your solution
So now I could get the real error message. It turns out that when you feed fit_params=whatever in the cross_val_score() function, you are feeding those parameters to a pipeline. In order to know to which part of the pipeline you want to send those parameters you have to specify it like this :
fit_params={'mlp__callbacks':calls}
Your error was saying that the process couldn't unpack 'callbacks'.split('__', 1) into 2 values. It was actually looking for the name of the pipeline's step to apply this to.
It should be working now :)
results = cross_val_score(pipeline, X, encoded_Y, cv=kfold, fit_params={'mlp__callbacks':calls})
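To see why the un-prefixed key fails, the routing can be imitated in a few lines of plain Python. This is only a sketch of the step__parameter naming convention, not scikit-learn's actual implementation; route_fit_params and the placeholder callback strings are hypothetical names introduced for the illustration.

```python
def route_fit_params(step_names, fit_params):
    """Group 'step__param' keys by pipeline step (simplified imitation)."""
    routed = {name: {} for name in step_names}
    for key, value in fit_params.items():
        # A key without '__' splits into a single piece, so this
        # two-name unpacking raises ValueError.
        step, param = key.split("__", 1)
        routed[step][param] = value
    return routed


steps = ["standardize", "mlp"]
calls = ["early_stopping", "checkpoint"]  # placeholders for real callbacks

routed = route_fit_params(steps, {"mlp__callbacks": calls})
print(routed["mlp"])  # {'callbacks': ['early_stopping', 'checkpoint']}

try:
    route_fit_params(steps, {"callbacks": calls})
    unprefixed_error = None
except ValueError as exc:
    unprefixed_error = exc
print(type(unprefixed_error).__name__)  # ValueError
```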
BUT, you should be aware of what's happening here... the cross-validation actually calls the create_baseline() function to recreate the model from scratch 10 times, and trains it 10 times on different parts of the dataset. So it's not doing 10 iterations of one fit, as you were saying; it's doing 300 epochs 10 times.
Another consequence of using this tool: since the model is rebuilt each time, the fit() method is applied 10 times to different models. The callbacks are therefore also applied 10 separate times, the files saved by ModelCheckpoint() get overwritten, and you end up with only the best model of the last run.
This is intrinsic to the tools you use; I don't see any way around it. It is a consequence of combining general-purpose tools that weren't specifically designed to be used together in every possible configuration.
Try:
estimators.append(('mlp',
                   KerasClassifier(build_fn=create_model2,
                                   nb_epoch=300,
                                   batch_size=16,
                                   verbose=0,
                                   callbacks=list_of_callbacks)))
where list_of_callbacks is a list of the callbacks you want to apply. You can find details here. It's mentioned there that parameters passed to KerasClassifier may be any legal fitting parameters.
It's also worth mentioning that if you are doing multiple runs on GPUs there might be a problem, due to several reported memory leaks, especially when you are using Theano. I also noticed that running multiple fits consecutively may give results that do not seem to be independent when using the sklearn API.
Edit:
Try also:
results = cross_val_score(pipeline, X, encoded_Y, cv=kfold, fit_params = {'mlp__callbacks': calls})
instead of putting the callbacks list in the wrapper instantiation.
This is what I have done
results = cross_val_score(estimator, X, Y, cv=kfold,
fit_params = {'callbacks': [checkpointer,plateau]})
and has worked so far
Despite the TensorFlow, Keras & SciKeras documentation suggesting you can define training callbacks via the fit method, for my setup it turns out (like #NassimBen suggests) you should do it through the model constructor instead.
Rather than this:
model = KerasClassifier(..).fit(X, y, callbacks=[<HERE>])
Try this:
model = KerasClassifier(callbacks=[<HERE>]).fit(X, y)
I came across a weird difference between keras model.fit() and sklearn model.fit() functions. When model.fit() is called inside a loop I get inconsistent predictions using a Keras sequential model. This is not the case when using an sklearn model. See sample code to reproduce the phenomenon.
from numpy.random import seed
seed(1337)
import tensorflow as tf
tf.random.set_seed(1337)
from sklearn.linear_model import LogisticRegression
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.layers import InputLayer
from sklearn.datasets import make_blobs
from sklearn.preprocessing import MinMaxScaler
import numpy as np
def get_sequential_dnn(NUM_COLS, NUM_ROWS):
    # code for model
if __name__ == "__main__":
    input_size = 10
    X, y = make_blobs(n_samples=100, centers=2, n_features=input_size,
                      random_state=1
                      )
    scalar = MinMaxScaler()
    scalar.fit(X)
    X = scalar.transform(X)
    model = get_sequential_dnn(X.shape[1], X.shape[0])
    # print(model.summary())
    # model = LogisticRegression()
    for i in range(2):
        model.fit(X, y, epochs=100, verbose=0, shuffle=False)
        # model.fit(X, y)
        Xnew, _ = make_blobs(n_samples=3, centers=2, n_features=10, random_state=1)
        Xnew = scalar.transform(Xnew)
        # make a prediction
        # ynew = model.predict_proba(Xnew)[:, 1]
        ynew = model.predict_proba(Xnew)
        ynew = np.array(ynew)
        # show the inputs and predicted outputs
        print('--------------')
        for i in range(len(Xnew)):
            print("X=%s \n Predicted=%s" % (Xnew[i], ynew[i]))
The output of this is
--------------
X=[0.32799209 0.32682211 0.62699485 0.89987274 0.59894281 0.94662653
0.77125788 0.73345369 0.2153754 0.35317172]
Predicted=[0.9931685]
X=[0.60876924 0.33208319 0.24770841 0.11435312 0.66211608 0.17361879
0.12891829 0.25729677 0.69975833 0.73165292]
Predicted=[0.35249507]
X=[0.65154993 0.26153846 0.2416324 0.11793901 0.7047334 0.17706289
0.07761879 0.45189967 0.8481064 0.85092378]
Predicted=[0.35249507]
--------------
X=[0.32799209 0.32682211 0.62699485 0.89987274 0.59894281 0.94662653
0.77125788 0.73345369 0.2153754 0.35317172]
Predicted=[1.]
X=[0.60876924 0.33208319 0.24770841 0.11435312 0.66211608 0.17361879
0.12891829 0.25729677 0.69975833 0.73165292]
Predicted=[0.17942095]
X=[0.65154993 0.26153846 0.2416324 0.11793901 0.7047334 0.17706289
0.07761879 0.45189967 0.8481064 0.85092378]
Predicted=[0.17942095]
While if I use a Logistic Regression (un-comment the commented lines) the predictions are consistent:
--------------
X=[0.32799209 0.32682211 0.62699485 0.89987274 0.59894281 0.94662653
0.77125788 0.73345369 0.2153754 0.35317172]
Predicted=0.929209043999009
X=[0.60876924 0.33208319 0.24770841 0.11435312 0.66211608 0.17361879
0.12891829 0.25729677 0.69975833 0.73165292]
Predicted=0.04643513037543502
X=[0.65154993 0.26153846 0.2416324 0.11793901 0.7047334 0.17706289
0.07761879 0.45189967 0.8481064 0.85092378]
Predicted=0.038716408758471876
--------------
X=[0.32799209 0.32682211 0.62699485 0.89987274 0.59894281 0.94662653
0.77125788 0.73345369 0.2153754 0.35317172]
Predicted=0.929209043999009
X=[0.60876924 0.33208319 0.24770841 0.11435312 0.66211608 0.17361879
0.12891829 0.25729677 0.69975833 0.73165292]
Predicted=0.04643513037543502
X=[0.65154993 0.26153846 0.2416324 0.11793901 0.7047334 0.17706289
0.07761879 0.45189967 0.8481064 0.85092378]
Predicted=0.038716408758471876
I get that the obvious solution is to fit the model once, before the loop, and that there is probably a lot of randomness in how Keras models fit the data to the labels. But there are a couple of cases where you need a loop to get prediction scores, for example if you want to perform 10-fold cross-validation to get the AUC, sensitivity and specificity values on training data. In these situations this randomness is unacceptable.
What is causing this inconsistency and what is the solution to it?
There are a couple of issues with the way you are trying to get reproducible results with Keras.
1. You are calling fit (when i==1) on the model that was already fitted (when i==0). So the optimizer starts from a different set of initial weights in the two cases, and you end up with two different models. Solution: get a fresh model every time. This is not the case with sklearn, which starts from freshly initialized weights every time fit is called.
2. model.fit may internally use the current state of the random number generator. You seeded it outside the loop, so the state will be different when fit is called the second time. Solution: seed inside the loop.
Sample code with issue
# Issue 2 here
tf.random.set_seed(1337)
def get_model():
    model = Sequential()
    model.add(Dense(4, input_dim=8, activation='relu'))
    model.add(Dense(1, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam')
    return model
X = np.random.randn(10,8)
y = np.random.randn(10,1)
# Issue 1 here
model = get_model()
results = []
for i in range(10):
    model.fit(X, y, epochs=5, verbose=0, shuffle=False)
    results.append(np.sum(model.predict(X)))
assert np.all(np.isclose(results, results[0]))
As you can see the assert fails
Corrected code
results = []
for i in range(10):
    tf.random.set_seed(1337)
    model = get_model()
    model.fit(X, y, epochs=5, verbose=0, shuffle=False)
    results.append(np.sum(model.predict(X)))
assert np.all(np.isclose(results, results[0]))
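The seeding point can be illustrated without Keras at all. In this sketch Python's random module stands in for the framework's generator, and init_weights is a hypothetical stand-in for building a fresh model: seeding once lets the generator state drift between runs, while reseeding before each run does not.

```python
import random


def init_weights(n):
    """Hypothetical stand-in for a model's random weight initialisation."""
    return [random.gauss(0.0, 1.0) for _ in range(n)]


# Seed once, outside the loop: the second run starts from a different
# generator state, so the "weights" differ.
random.seed(1337)
seeded_once = [init_weights(3) for _ in range(2)]
print(seeded_once[0] == seeded_once[1])  # False

# Reseed inside the loop: every run starts from the same state.
reseeded = []
for _ in range(2):
    random.seed(1337)
    reseeded.append(init_weights(3))
print(reseeded[0] == reseeded[1])  # True
```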
I am an electrical engineer and I am looking for a way to calculate the DC current of a permanent magnet synchronous motor, so I decided to look into ANN solutions with Keras. Long story short, I'll show you a screenshot of some measured signals.
The first 5 signals are the measured signals. The last one is the DC current, which I want to estimate. Here the value was recorded with the help of a current clamp. I started building a model in Python and tried a few things that I assumed would increase the accuracy of the model. But after all that, I am not getting very good results, and my hope is that I am simply choosing the wrong parameters or a model that is not ideal for this purpose.
Here is my code:
import numpy as np
from keras.layers import Dense, LSTM
from keras.models import Sequential
from keras.callbacks import EarlyStopping
import pandas as pd
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score
from matplotlib import pyplot as plt
import seaborn as sns
# Import input (x) and output (y) data into a DataFrame
df = pd.read_csv('train_data.csv')
df = df[['rpm','iq','uq','udc','idc']]
X = df[df.columns[:-1]]
Y = df.idc
plt.figure()
sns.heatmap(df.corr(),annot=True)
plt.show()
# Split the data into input (x) training and testing data, and output (y) training and testing data,
# with training data being 80% of the data, and testing data being the remaining 20% of the data
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2)#, shuffle=True)
# Scale both training and testing input data
X_train = preprocessing.maxabs_scale(X_train)
X_test = preprocessing.maxabs_scale(X_test)
model = Sequential()
model.add(Dense(4, input_shape=(4,)))
model.add(Dense(4, input_shape=(4,)))
model.add(Dense(1, input_shape=(4,)))
model.compile(optimizer="adam", loss="msle", metrics=['mean_squared_logarithmic_error','accuracy'])
# Pass several parameters to 'EarlyStopping' function and assign it to 'earlystopper'
earlystopper = EarlyStopping(monitor='val_loss', min_delta=0, patience=15, verbose=1, mode='auto')
model.summary()
history = model.fit(X_train, y_train, epochs = 2000, validation_split = 0.3, verbose = 2, callbacks = [earlystopper])
# Runs model (the one with the activation function, although this doesn't really matter as they perform the same)
# with its current weights on the training and testing data
y_train_pred = model.predict(X_train)
y_test_pred = model.predict(X_test)
# Calculates and prints r2 score of training and testing data
print("The R2 score on the Train set is:\t{:0.3f}".format(r2_score(y_train, y_train_pred)))
print("The R2 score on the Test set is:\t{:0.3f}".format(r2_score(y_test, y_test_pred)))
df = pd.read_csv('test_two_data.csv')
df = df[['rpm','iq','uq','udc','idc']]
X = df[df.columns[:-1]]
Y = df.idc
X_validate = preprocessing.maxabs_scale(X)
y_pred = model.predict(X_validate)
plt.plot(Y)
plt.plot(y_pred)
plt.show()
(weight_0,bias_0) = model.layers[0].get_weights()
(weight_1,bias_1) = model.layers[1].get_weights()
One limitation is that I can't use LSTM layers or other complex algorithms because I need to implement the trained model in a microcontroller on a motor application later.
I hope you can give me some advice on making my model a little more accurate.
Finally, here is a figure showing the worst prediction performance. Orange is the prediction and blue is the measured current.
The training dataset was this one.
The correlation between the individual values can be found here. Since the values of id and ud have no correlation to idc, I decided to delete them.
The most important thing to keep in mind when trying to improve a model's accuracy is to ALWAYS normalise the input data, which basically means rescaling real-valued numeric attributes into the range 0 to 1. I don't understand how you are feeding the training data to the model; could you please explain that? It would help in understanding the setup and identifying where accuracy could be improved.
As for the parameters, I would suggest adding a tuning algorithm to find an optimized value for each parameter.
It is always good practice to include hidden layers with a non-linear activation (your Dense layers currently have none, so the stack stays linear), which can provide better feature extraction.
I'm trying to use a NN model to predict with new data. However, the predicted data is not on the correct scale (I obtain values around 1e-10 when they should be around 0.3, etc.).
In my model I've used MinMaxScaler on the x and y data. The model gave me an R2 value of 0.9 when using the train/test split method, and an MSE of 0.01% using a pipeline method and also the cross-val method. So I believe the model I've created is OK.
Here is the model I've made.
data=pd.read_csv(r'''F:\DataforANNfromIESFebAugPowerValues.csv''')
data.dropna(axis=0,how='all')
x=data[['Dry-bulb_temperature_C','Wind_speed_m/s','Cloud_cover_oktas','External_relative_humidity_%','Starrag1250','StarragEcospeed2538','StarragS191','StarragLX051','DoosanCNC6700','MakinoG7','HermleC52MT','WFL_Millturn','Hofler1350','MoriNT4250','MoriNT5400','NMV8000','MoriNT6600','MoriNVL1350','HermleC42','CFV550','MoriDura635','DMGUltrasonic10']]
y=data[['Process_heat_output_waste_kW','Heating_plant_sensible_load_kW','Cooling_plant_sensible_load_kW','Relative_humidity_%','Air_temperature_C','Total_electricity_kW','Chillers_energy_kW','Boilers_energy_kW']]
epochs=150
learning_rate=0.001
decay_rate=learning_rate/epochs
optimiser=keras.optimizers.Nadam(lr=learning_rate, schedule_decay=decay_rate)
def create_model():
    model=Sequential()
    model.add(Dense(21, input_dim=22, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(19, activation='relu')) #hidden layer 2
    model.add(Dropout(0.2))
    model.add(Dense(8, activation='sigmoid')) #output layer
    model.compile(loss='mean_squared_error', optimizer=optimiser,metrics=['accuracy','mse'])
    return model
scaler=MinMaxScaler()
x=MinMaxScaler().fit_transform(x)
print(x)
y=MinMaxScaler().fit_transform(y)
model=KerasRegressor(build_fn=create_model, verbose=0,epochs=150, batch_size=70)
model.fit(x, y, epochs=150, batch_size=70)
##SET UP NEW DATA FOR PREDICTIONS
xnewdata=pd.read_csv(r'''F:\newdatapowervalues.csv''')
xnewdata.dropna(axis=0,how='all')
xnew=xnewdata[['Dry-bulb_temperature_C','Wind_speed_m/s','Cloud_cover_oktas','External_relative_humidity_%','Starrag1250','StarragEcospeed2538','StarragS191','StarragLX051','DoosanCNC6700','MakinoG7','HermleC52MT','WFL_Millturn','Hofler1350','MoriNT4250','MoriNT5400','NMV8000','MoriNT6600','MoriNVL1350','HermleC42','CFV550','MoriDura635','DMGUltrasonic10']]
xnew=MinMaxScaler().fit_transform(xnew)
ynew=model.predict(xnew)
ynewdata=pd.DataFrame(data=ynew)
ynewdata.to_csv(r'''F:\KerasIESPowerYPredict.csv''',header=['Process_heat_output_waste_kW','Heating_plant_sensible_load_kW','Cooling_plant_sensible_load_kW','Relative_humidity_%','Air_temperature_C','Total_electricity_kW','Chillers_energy_kW','Boilers_energy_kW'])
Seeing as I've used the scaler for the initial training, I thought I would also need to apply it to the new data. I've tried doing
scaler.inverse_transform(ynew)
after model.predict(xnew); however, I get the error that the MinMaxScaler instance isn't fitted yet.
Therefore, I tried using the pipeline method.
estimators = []
estimators.append(('standardize', MinMaxScaler()))
estimators.append(('mlp', KerasRegressor(build_fn=create_model, epochs=150, batch_size=70, verbose=0)))
pipeline = Pipeline(estimators)
pipeline.fit(x,y)
for the initial training model instead of
x=MinMaxScaler().fit_transform(x)
y=MinMaxScaler().fit_transform(y)
model=KerasRegressor(build_fn=create_model, verbose=0,epochs=150, batch_size=70)
model.fit(x, y, epochs=150, batch_size=70)
I then used
ynew=pipeline.predict(xnew)
however, this gave me data consisting mainly of 1's!
Any idea how I can predict correctly on this new data? I'm unsure which data to scale and which not to, as I believe that using pipeline.predict would include scaling for x and y. Do I therefore need some sort of inverse pipeline scaler after making these predictions?
Many thanks for your help.
There is one minor and one major problem with your approach.
Minor one: scaling the target variable is not needed in general. Bear in mind, though, that your output layer uses a sigmoid, which confines predictions to the range (0, 1); if you leave y unscaled, switch that layer to a linear activation.
Major one: you fit the scaler again on the data on which you want to run the prediction. By doing this you completely skew the relationships in the data, and hence the predicted output is on a very different scale. Also, you define scaler and then never use it. Let's fix it.
(...)
scaler=MinMaxScaler()
x=scaler.fit_transform(x)
model=KerasRegressor(build_fn=create_model, verbose=0,epochs=150, batch_size=70)
model.fit(x, y, epochs=150, batch_size=70)
##SET UP NEW DATA FOR PREDICTIONS
xnewdata=pd.read_csv(r'''F:\newdatapowervalues.csv''')
xnewdata.dropna(axis=0,how='all')
xnew=xnewdata[['Dry-bulb_temperature_C','Wind_speed_m/s','Cloud_cover_oktas','External_relative_humidity_%','Starrag1250','StarragEcospeed2538','StarragS191','StarragLX051','DoosanCNC6700','MakinoG7','HermleC52MT','WFL_Millturn','Hofler1350','MoriNT4250','MoriNT5400','NMV8000','MoriNT6600','MoriNVL1350','HermleC42','CFV550','MoriDura635','DMGUltrasonic10']]
xnew=scaler.transform(xnew)
ynew=model.predict(xnew)
ynewdata=pd.DataFrame(data=ynew)
As you can see, we first used the scaler to learn the proper normalisation factors (fit_transform on the training data) and then applied it (transform) to the new data on which we run predict.
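A dependency-free sketch of the same idea: MinMax below is a hypothetical single-column stand-in for MinMaxScaler, and the numbers are made up. It shows that refitting on the new data maps it to a different range than reusing the scaler fitted on the training data, and that a fitted scaler can map scaled values back with inverse_transform, which is what you would need for y if you chose to keep scaling it.

```python
class MinMax:
    """Tiny single-column stand-in for a min-max scaler (illustration only)."""

    def fit(self, values):
        self.lo, self.hi = min(values), max(values)
        return self

    def transform(self, values):
        span = self.hi - self.lo
        return [(v - self.lo) / span for v in values]

    def inverse_transform(self, scaled):
        span = self.hi - self.lo
        return [s * span + self.lo for s in scaled]


train = [10.0, 20.0, 30.0, 40.0, 50.0]  # made-up training column
new = [20.0, 25.0, 30.0]                # made-up new data

scaler = MinMax().fit(train)

# Correct: reuse the scaler fitted on the training data.
reused = scaler.transform(new)
print(reused)  # [0.25, 0.375, 0.5]

# Wrong: refitting on the new data stretches it over the whole 0..1 range.
refit = MinMax().fit(new).transform(new)
print(refit)  # [0.0, 0.5, 1.0]

# A fitted scaler can undo its own scaling.
restored = scaler.inverse_transform(reused)
print(restored)  # [20.0, 25.0, 30.0]
```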
I am quite new to Keras so apologies in advance for any stupid mistakes. I am currently attempting to try out some good old cross-domain transfer learning between two datasets. I have a model here that is trained and executed on a voice recognition dataset that I have generated (code is at the bottom of this question because it's quite long)
If I were to train a new model, say model_2 on a different dataset, then I'd get a baseline from the initial random distribution of weights.
I wonder, is it possible to train model_1 and model_2, then, and this is the bit I don't know how to do; can I take the two 256 and 128 dense layers from model_1 (with trained weights) and use them as starting points for a model_3 - which is dataset 2 with the initial weight distribution from model_1?
So, in the end, I have the following:
Model_1 which starts from a random distribution and trains on dataset 1
Model_2 which starts from a random distribution and trains on dataset 2
Model_3 which starts from the distribution trained in Model_1 and trains on dataset 2.
My question is, how would I go about doing step 3 in the above? I don't want to freeze the weights, I just want an initial distribution for training from a past experiment
Any help would be greatly appreciated. Thank you! Apologies if I didn't make it quite clear enough what I'm going for
My code to train Model_1 is as follows:
import numpy
import pandas
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import EarlyStopping
from keras.wrappers.scikit_learn import KerasClassifier
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import Pipeline
import matplotlib.pyplot as plt
from keras.utils import np_utils
from keras.layers.normalization import BatchNormalization
import time
start = time.clock()
# fix random seed for reproducibility
seed = 1
numpy.random.seed(seed)
# load dataset
dataframe = pandas.read_csv("voice.csv", header=None)
dataset = dataframe.values
# split into input (X) and output (Y) variables
numVars = len(dataframe.columns) - 1
numClasses = dataframe[numVars].nunique()
X = dataset[:,0:numVars].astype(float)
Y = dataset[:,numVars]
print("THERE ARE " + str(numVars) + " ATTRIBUTES")
print("THERE ARE " + str(numClasses) + " UNIQUE CLASSES")
# encode class values as integers
encoder = LabelEncoder()
encoder.fit(Y)
encoded_Y = encoder.transform(Y)
# convert integers to dummy variables (i.e. one hot encoded)
dummy_y = np_utils.to_categorical(encoded_Y)
calls = [EarlyStopping(monitor='acc', min_delta=0.0001, patience=100, verbose=2, mode='max', restore_best_weights=True)]
# define baseline model
def baseline_model():
    # create model
    model = Sequential()
    model.add(BatchNormalization())
    model.add(Dense(256, input_dim=numVars, activation='sigmoid'))
    model.add(Dense(128, activation='sigmoid'))
    model.add(Dense(numClasses, activation='softmax'))
    # Compile model
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
estimator = KerasClassifier(build_fn=baseline_model, epochs=2000, batch_size=1000, verbose=1)
kfold = KFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(estimator, X, dummy_y, cv=kfold, fit_params={'callbacks':calls})
print("Baseline: %.2f%% (%.2f%%)" % (results.mean()*100, results.std()*100))
#your code here
print (time.clock() - start)
PS: Input attributes and outputs will all be the same between the two datasets, all that will change are attribute values. I am curious, can this be done if the two datasets have different numbers of output classes?
In short, to fine-tune Model_3 from Model_1, just call model.load_weights('/path/to/model_1.h5', by_name=True) after model.compile(...). Of course, you must have saved the trained Model_1 first.
If I understood correctly, you have the same number of features and classes in the two datasets, so you do not even need to redesign your model. If you had a different set of classes, you would have to give different names to the last layers of Model_1 and Model_3:
model.add(Dense(numClasses, activation='softmax', name='some_unique_name'))
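As a rough picture of what name-based loading does, the sketch below uses plain dictionaries as hypothetical stand-ins for the saved weights and the new model (it is not Keras code, and the layer names and numbers are invented). Layers whose names match receive the saved values; a layer with a different name, such as a renamed output layer, keeps its fresh initialisation.

```python
def load_by_name(model_weights, saved_weights):
    """Copy saved weights into layers with matching names; return those names."""
    loaded = []
    for name, weights in saved_weights.items():
        if name in model_weights:
            model_weights[name] = list(weights)
            loaded.append(name)
    return loaded


# "Model_1" after training on dataset 1 (values are invented).
saved = {
    "dense_256": [0.4, -0.2],
    "dense_128": [0.1, 0.9],
    "output_ds1": [0.7],
}

# "Model_3" before training on dataset 2: same hidden layers, renamed output.
model_3 = {
    "dense_256": [0.0, 0.0],
    "dense_128": [0.0, 0.0],
    "output_ds2": [0.0, 0.0, 0.0],
}

loaded = load_by_name(model_3, saved)
print(loaded)                 # ['dense_256', 'dense_128']
print(model_3["output_ds2"])  # [0.0, 0.0, 0.0]
```

Training Model_3 afterwards would still update every layer, since nothing is frozen; the loaded values only serve as the starting point.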
Lately, I have been working on applying grid search cross validation (sklearn GridSearchCV) for hyper-parameter tuning in Keras with Tensorflow backend. As soon as my model is tuned,
I try to save the GridSearchCV object for later use, without success.
The hyper-parameter tuning is done as follows:
x_train, x_val, y_train, y_val = train_test_split(NN_input, NN_target, train_size = 0.85, random_state = 4)
history = History()
kfold = 10
regressor = KerasRegressor(build_fn = create_keras_model, epochs = 100, batch_size=1000, verbose=1)
neurons = np.arange(10,101,10)
hidden_layers = [1,2]
optimizer = ['adam','sgd']
activation = ['relu']
dropout = [0.1]
parameters = dict(neurons = neurons,
                  hidden_layers = hidden_layers,
                  optimizer = optimizer,
                  activation = activation,
                  dropout = dropout)
gs = GridSearchCV(estimator = regressor,
                  param_grid = parameters,
                  scoring='mean_squared_error',
                  n_jobs = 1,
                  cv = kfold,
                  verbose = 3,
                  return_train_score=True)
grid_result = gs.fit(NN_input,
                     NN_target,
                     callbacks=[history],
                     verbose=1,
                     validation_data=(x_val, y_val))
Remark: create_keras_model function initializes and compiles a Keras Sequential model.
After the cross validation is performed I am trying to save the grid search object (gs) with the following code:
from sklearn.externals import joblib
joblib.dump(gs, 'GS_obj.pkl')
The error I am getting is the following:
TypeError: can't pickle _thread.RLock objects
Could you please let me know what might be the reason for this error?
Thank you!
P.S.: the joblib.dump method works well for saving GridSearchCV objects that are used for training MLPRegressors from sklearn.
Use
import joblib
directly (sklearn.externals.joblib was deprecated in scikit-learn 0.21 and removed in 0.23), instead of
from sklearn.externals import joblib
Save objects or results with:
joblib.dump(gs, 'model_file_name.pkl')
and load your results using:
joblib.load("model_file_name.pkl")
Here is a simple working example:
import joblib
#save your model or results
joblib.dump(gs, 'model_file_name.pkl')
#load your model for further usage
joblib.load("model_file_name.pkl")
Try this:
import joblib
joblib.dump(gs.best_estimator_, 'filename.pkl')
If you want to dump your object into one file - use:
joblib.dump(gs.best_estimator_, 'filename.pkl', compress = 1)
Simple Example:
from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV
import joblib
iris = datasets.load_iris()
parameters = {'kernel':('linear', 'rbf'), 'C':[1, 10]}
svc = svm.SVC()
gs = GridSearchCV(svc, parameters)
gs.fit(iris.data, iris.target)
joblib.dump(gs.best_estimator_, 'filename.pkl')
#['filename.pkl']
EDIT 1:
you can also save the whole object:
joblib.dump(gs, 'gs_object.pkl')
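If dumping the whole object still raises the error from the question, one plausible cause (an assumption, not something confirmed in this thread) is that something reachable from gs, such as the Keras model or a callback, holds a thread lock, which pickle cannot serialise. The standard-library sketch below reproduces that failure with a hypothetical FakeSearch class and shows the fallback of saving only plain results; the parameter values are invented.

```python
import pickle
import threading


class FakeSearch:
    """Hypothetical stand-in for a fitted search object."""

    def __init__(self):
        self.best_params_ = {"neurons": 50, "optimizer": "adam"}  # invented
        self.best_score_ = -0.12                                  # invented
        self._lock = threading.RLock()  # the member pickle cannot handle


def try_pickle(obj):
    """Return the pickled bytes, or the TypeError if pickling fails."""
    try:
        return pickle.dumps(obj)
    except TypeError as exc:
        return exc


search = FakeSearch()

# Pickling the whole object fails because of the lock it carries.
failure = try_pickle(search)
print(type(failure).__name__)  # TypeError

# Pickling only plain Python results works.
summary = {"best_params": search.best_params_,
           "best_score": search.best_score_}
restored = pickle.loads(try_pickle(summary))
print(restored == summary)  # True
```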
Subclass the sklearn.model_selection._search.BaseSearchCV class. Override the fit(self, X, y=None, groups=None, **fit_params) method, and modify its internal evaluate_candidates(candidate_params) function. Instead of immediately returning the results dictionary from evaluate_candidates(candidate_params), perform your serialization here (or in the _run_search method depending on your use case). With some additional modifications, this approach has the added benefit of allowing you to execute the grid search sequentially (see the comment in the source code here: _search.py). Note that the results dictionary returned by evaluate_candidates(candidate_params) is the same as the cv_results dictionary. This approach worked for me, but I was also attempting to add save-and-restore functionality for interrupted grid search executions.