Using fit_generator in Keras Model - python

I'm trying to train a neural network using Keras with the TensorFlow backend. My X consists of text descriptions which I have processed and transformed into sequences. My y is a sparse matrix, since this is a multi-label classification problem and I have many output classes.
>>> y
<30405x3387 sparse matrix of type '<type 'numpy.int64'>'
with 54971 stored elements in Compressed Sparse Row format>
To train the model, I tried defining a batch generator:
import numpy as np

def batch_generator(x, y, batch_size=32):
    n_batches_per_epoch = x.shape[0] // batch_size
    for i in range(n_batches_per_epoch):
        index_batch = range(x.shape[0])[batch_size * i:batch_size * (i + 1)]
        x_batch = x[index_batch, :]
        y_batch = y[index_batch, :].todense()  # densify only the current batch
        yield x_batch, np.array(y_batch)
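Note that fit_generator expects the generator to loop over its data indefinitely, so a common pattern is to wrap the batching loop in while True; a minimal sketch reusing the same batching logic:

def batch_generator(x, y, batch_size=32):
    n_batches_per_epoch = x.shape[0] // batch_size
    while True:  # yield batches forever; steps_per_epoch decides when an epoch ends
        for i in range(n_batches_per_epoch):
            index_batch = range(x.shape[0])[batch_size * i:batch_size * (i + 1)]
            yield x[index_batch, :], np.array(y[index_batch, :].todense())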
I've divided my data as:
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
I define my model as:
model = Sequential()
# Create architecture, add some layers.
model.add(Dense(num_classes))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
And I'm training my model as:
model.fit_generator(generator=batch_generator(x_train, y_train), steps_per_epoch=(x_train[0]/32), epochs=200, callbacks=the_callbacks)
But my model starts with around 55% accuracy and it quickly (in 2 or 3 steps) becomes 99.95%, which makes no sense at all. Am I doing something wrong?

You'll need to switch your loss to "categorical_crossentropy" or change your metric to "crossentropy" for multiclass classification.
The "accuracy" metric is actually ambiguous behind the scenes in Keras: it picks binary or categorical accuracy based on the loss function used.
https://github.com/keras-team/keras/blob/master/keras/engine/training.py#L375
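For example, you can sidestep the ambiguity by naming the metric explicitly instead of passing 'accuracy'; a minimal sketch, with the rest of the model as defined in the question:

# Explicit metric, so Keras does not infer it from the loss:
model.compile(loss='binary_crossentropy',
              optimizer='adam',
              metrics=['binary_accuracy'])

# Or, for a single-label (mutually exclusive) target:
# model.compile(loss='categorical_crossentropy',
#               optimizer='adam',
#               metrics=['categorical_accuracy'])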

If you have two classes, you can use a sigmoid activation in the last layer with the binary cross-entropy loss function. But if you have more than two classes, you have to replace sigmoid with softmax and binary cross-entropy with categorical cross-entropy.
There could be multiple other reasons for the abrupt change in accuracy, depending on your data distribution, model configuration, etc.
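A minimal sketch of the two set-ups described above (model, Dense and num_classes as defined in the question; only one of the two blocks applies at a time):

# Two classes: sigmoid output with binary cross-entropy
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# More than two mutually exclusive classes: softmax output with categorical cross-entropy
# model.add(Dense(num_classes, activation='softmax'))
# model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])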

Related

LSTM Predict number from sequence of numbers

I am trying to train an LSTM to predict some numbers from a sequence of numbers. My X dataset has 33 features and my Y dataset has 4 variables that I have to predict for each X sample. For example:
(Samples of Xdf and Ydf omitted; download links are given below.)
After I turn X and y into numpy arrays and reshape the data, I build my model.
I get results with huge loss and clear overfitting. Any ideas what I am doing wrong and why I am getting these results? Thanks.
Xdf: https://drive.google.com/file/d/1wH56E0M3ok1MGWzU6FGKgDJF7EZfL_7f/view?usp=sharing
ydf: https://drive.google.com/file/d/1RkjWl1FIiQDyjkRvl7ZQKTOXIE8FtBXA/view?usp=sharing
UPDATE
I edited my code the following way and it seems to be working a tiny bit better. There is still huge loss and overfitting, but after some epochs it seemed to work fine for a while. Any suggestions on how to reduce the loss and overfitting would be appreciated.
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.callbacks import EarlyStopping

X = X.reshape((X.shape[0], 33, 1)) # (5850, 33, 1)
model = tf.keras.Sequential()
model.add(layers.LSTM(50, activation='relu', input_shape=(33, 1)))
model.add(layers.Dense(4))
model.add(layers.Dense(4))
model.add(layers.Dense(4))
early_stopping = EarlyStopping(monitor='val_loss', patience=42)
model.compile(optimizer=tf.keras.optimizers.Adam(0.01), loss=tf.keras.losses.MeanSquaredError(), metrics=['accuracy'])
model.fit(X, y, epochs=200, verbose=1, validation_split = 0.2)
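One thing that stands out in the snippet above: early_stopping is defined but never passed to fit, so it currently has no effect. A minimal sketch of wiring it in, with the same data and model as above:

model.fit(X, y, epochs=200, verbose=1, validation_split=0.2,
          callbacks=[early_stopping])  # stop when val_loss stops improving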

Train many neural networks and pick best one

I'm working on a classification task, trying to reconstruct a network from a paper. In that paper, they do a train/test split 300 times, train the network each time, and afterwards take the mean of all the networks' predictions for specific input data.
So here's the question: what is the best way to do that? I've already reconstructed their network and I'm thinking about using a for loop and saving the outputs of each network in a data frame, but I can't get it right.
Here's the code:
import tensorflow as tf
from sklearn.model_selection import train_test_split

# Set X and Y for training
X = dum_bll_fsrq.drop(['type2', 'name', 'Type_is_bll', 'Type_is_fsrq'], axis = 1)
Y = dum_bll_fsrq.iloc[:,-2:]
# Train test split
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.3, stratify = Y)
# Create model
model_two_neuron = tf.keras.Sequential([
    tf.keras.layers.Dense(40, input_shape=(15,)),  # input shape required
    tf.keras.layers.Dense(2, activation=tf.nn.sigmoid)
])
model_two_neuron.compile(optimizer=tf.keras.optimizers.Adam(),
                         loss=tf.keras.losses.MeanSquaredError(),
                         metrics=[tf.keras.metrics.Precision()])
# Train
model_two_neuron.fit(X_train, y_train, epochs=20)
You can use callbacks to save the best weights for each of your models, then evaluate the best results saved by callbacks after training.
Here is a basic example from the documentation:
model.compile(loss=..., optimizer=...,
              metrics=['accuracy'])
EPOCHS = 10
checkpoint_filepath = '/tmp/checkpoint'
model_checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath=checkpoint_filepath,
    save_weights_only=True,
    monitor='val_accuracy',
    mode='max',
    save_best_only=True)
# Model weights are saved at the end of every epoch, if it's the best seen
# so far.
model.fit(epochs=EPOCHS, callbacks=[model_checkpoint_callback])
# The model weights (that are considered the best) are loaded into the model.
model.load_weights(checkpoint_filepath)
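The question itself asks about repeating the split-and-train loop 300 times and averaging the predictions; a minimal sketch of that, reusing X and Y from the question (build_model is a hypothetical helper that rebuilds the network above, and X_fixed stands for whatever specific inputs you want the averaged prediction for):

import numpy as np
from sklearn.model_selection import train_test_split

n_runs = 300          # number of repeated splits described in the paper
all_preds = []

for seed in range(n_runs):
    X_train, X_test, y_train, y_test = train_test_split(
        X, Y, test_size=0.3, stratify=Y, random_state=seed)
    model = build_model()                      # hypothetical: rebuild the two-neuron network
    model.fit(X_train, y_train, epochs=20, verbose=0)
    all_preds.append(model.predict(X_fixed))   # X_fixed: the inputs you average over

# Mean prediction across all trained networks, one row per input in X_fixed
mean_pred = np.mean(np.stack(all_preds), axis=0)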

Inverse scale of predicted data in Keras

I'm trying to use a NN model to predict on new data. However, the predicted data is not on the correct scale (I get values around 1e-10 when they should be around 0.3, etc.).
In my model I've used MinMaxScaler on the x and y data. The model gave me an R2 value of 0.9 when using the train/test split method, and an MSE of 0.01% using a pipeline method and also the cross-validation method. So I believe the model I've created is OK.
Here is the model I've made:
import keras
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras.wrappers.scikit_learn import KerasRegressor
from sklearn.preprocessing import MinMaxScaler
from sklearn.pipeline import Pipeline

data=pd.read_csv(r'''F:\DataforANNfromIESFebAugPowerValues.csv''')
data.dropna(axis=0,how='all')
x=data[['Dry-bulb_temperature_C','Wind_speed_m/s','Cloud_cover_oktas','External_relative_humidity_%','Starrag1250','StarragEcospeed2538','StarragS191','StarragLX051','DoosanCNC6700','MakinoG7','HermleC52MT','WFL_Millturn','Hofler1350','MoriNT4250','MoriNT5400','NMV8000','MoriNT6600','MoriNVL1350','HermleC42','CFV550','MoriDura635','DMGUltrasonic10']]
y=data[['Process_heat_output_waste_kW','Heating_plant_sensible_load_kW','Cooling_plant_sensible_load_kW','Relative_humidity_%','Air_temperature_C','Total_electricity_kW','Chillers_energy_kW','Boilers_energy_kW']]
epochs=150
learning_rate=0.001
decay_rate=learning_rate/epochs
optimiser=keras.optimizers.Nadam(lr=learning_rate, schedule_decay=decay_rate)
def create_model():
    model=Sequential()
    model.add(Dense(21, input_dim=22, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(19, activation='relu')) #hidden layer 2
    model.add(Dropout(0.2))
    model.add(Dense(8, activation='sigmoid')) #output layer
    model.compile(loss='mean_squared_error', optimizer=optimiser, metrics=['accuracy','mse'])
    return model
scaler=MinMaxScaler()
x=MinMaxScaler().fit_transform(x)
print(x)
y=MinMaxScaler().fit_transform(y)
model=KerasRegressor(build_fn=create_model, verbose=0,epochs=150, batch_size=70)
model.fit(x, y, epochs=150, batch_size=70)
##SET UP NEW DATA FOR PREDICTIONS
xnewdata=pd.read_csv(r'''F:\newdatapowervalues.csv''')
xnewdata.dropna(axis=0,how='all')
xnew=xnewdata[['Dry-bulb_temperature_C','Wind_speed_m/s','Cloud_cover_oktas','External_relative_humidity_%','Starrag1250','StarragEcospeed2538','StarragS191','StarragLX051','DoosanCNC6700','MakinoG7','HermleC52MT','WFL_Millturn','Hofler1350','MoriNT4250','MoriNT5400','NMV8000','MoriNT6600','MoriNVL1350','HermleC42','CFV550','MoriDura635','DMGUltrasonic10']]
xnew=MinMaxScaler().fit_transform(xnew)
ynew=model.predict(xnew)
ynewdata=pd.DataFrame(data=ynew)
ynewdata.to_csv(r'''F:\KerasIESPowerYPredict.csv''',header=['Process_heat_output_waste_kW','Heating_plant_sensible_load_kW','Cooling_plant_sensible_load_kW','Relative_humidity_%','Air_temperature_C','Total_electricity_kW','Chillers_energy_kW','Boilers_energy_kW'])
Seeing as I've used the scaler on the initial training model, I thought I would also need to do this for the new data. I've tried doing
scaler.inverse_transform(ynew)
after model.predict(xnew), however I get the error that the MinMaxScaler instance isn't fitted to y yet.
Therefore, I tried using the pipeline method.
estimators = []
estimators.append(('standardize', MinMaxScaler()))
estimators.append(('mlp', KerasRegressor(build_fn=create_model, epochs=150, batch_size=70, verbose=0)))
pipeline = Pipeline(estimators)
pipeline.fit(x,y)
for the initial training model, instead of
x=MinMaxScaler().fit_transform(x)
y=MinMaxScaler().fit_transform(y)
model=KerasRegressor(build_fn=create_model, verbose=0,epochs=150, batch_size=70)
model.fit(x, y, epochs=150, batch_size=70)
I then used
ynew=pipeline.predict(xnew)
However, this gave me data consisting mainly of 1's!
Any idea how I can predict correctly on this new data? I'm unsure which data to scale and which not to, as I believe that pipeline.predict would include scaling for x and y. Therefore, do I need some sort of inverse pipeline scaler after making these predictions?
Many thanks for your help.
There is one minor and one major problem with your approach.
Minor one: there's no need to scale your target variable; it does not affect your optimisation function.
Major one: you fit the scaler again on the data on which you want to run the prediction. By doing this, you completely skew the relations in the data, and hence the predicted output is on a very different scale. Also, you define scaler but never use it. Let's fix that.
(...)
scaler=MinMaxScaler()
x=scaler.fit_transform(x)
model=KerasRegressor(build_fn=create_model, verbose=0,epochs=150, batch_size=70)
model.fit(x, y, epochs=150, batch_size=70)
##SET UP NEW DATA FOR PREDICTIONS
xnewdata=pd.read_csv(r'''F:\newdatapowervalues.csv''')
xnewdata.dropna(axis=0,how='all')
xnew=xnewdata[['Dry-bulb_temperature_C','Wind_speed_m/s','Cloud_cover_oktas','External_relative_humidity_%','Starrag1250','StarragEcospeed2538','StarragS191','StarragLX051','DoosanCNC6700','MakinoG7','HermleC52MT','WFL_Millturn','Hofler1350','MoriNT4250','MoriNT5400','NMV8000','MoriNT6600','MoriNVL1350','HermleC42','CFV550','MoriDura635','DMGUltrasonic10']]
xnew=scaler.transform(xnew)
ynew=model.predict(xnew)
ynewdata=pd.DataFrame(data=ynew)
As you can see, we used the scaler first to learn the proper normalisation factors (fit_transform) and then applied it (transform) to the new data on which we run predict.
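If you do decide to scale y as well, the "instance isn't fitted yet" error goes away once you keep a dedicated scaler for y and reuse it for the inverse transform; a minimal sketch, using the variable names from the question:

y_scaler = MinMaxScaler()
y_scaled = y_scaler.fit_transform(y)            # fit once, on the training targets

model.fit(x, y_scaled, epochs=150, batch_size=70)

ynew_scaled = model.predict(xnew)
ynew = y_scaler.inverse_transform(ynew_scaled)  # back to the original units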

Keras regression prediction is not same dimension as output dimension

Hello, I'm trying to do energy disaggregation (predicting the energy use of individual appliances given the total energy consumption of a household).
Now I have an input dimension of 2 because of 2 main energy measurements.
The output dimension of the Keras Sequential model should be 18 because I have 18 appliances I would like to make a prediction for.
I have enough data using the REDD dataset (this is no problem).
I have trained the model and obtained reasonable loss and accuracy.
But when I make a prediction on some test data, the prediction consists of values in a 1-dimensional array, even though the output should be 18-dimensional.
How is this possible or am I trying something that isn't really viable?
Some code:
from keras.models import Sequential
from keras.layers import Dense, Activation

model = Sequential()
model.add(Dense(HIDDEN_LAYER_NEURONS,input_dim=2))
model.add(Activation('relu'))
model.add(Dense(18))
model.compile(loss=LOSS,
              optimizer=OPTIMIZER,
              metrics=['accuracy'])
model.fit(X_train, y_train, epochs=EPOCHS, batch_size=BATCH_SIZE,
          verbose=1, validation_split=VALIDATION_SPLIT)
pred = model.predict(X_test).reshape(-1)
pred.shape # prints the following 1 dimensional array: (xxxxx,) dimensional
The ALL_CAPS variables are constants.
X_train is 2-dim
y_train is 18-dim
Any help is appreciated!
Well you are reshaping the predictions and flattening them here:
pred = model.predict(X_test).reshape(-1)
The reshape(-1) effectively makes the array one-dimensional. Just take the predictions directly:
pred = model.predict(X_test)
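Without the reshape, the prediction keeps one column per appliance; a quick check, assuming X_test is as in the question:

pred = model.predict(X_test)
print(pred.shape)  # (number_of_test_samples, 18) -- one column per appliance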

Using `predict` in Keras to predict an 1D array in the same order as given

I am doing regression in Keras, with a neural network with 1 input, 10 hidden units and 1 output. I fit the model, as usual:
model.fit(x_train, y_train, nb_epoch=15, batch_size=32)
Now I want to predict for an xtest that is (like x_train and y_train) a very big 1-dimensional numpy array. In the Keras documentation you can find:
predict(self, x, batch_size=32, verbose=0)
so I understand you have to do:
model.predict(xtest, batch_size=32)
I am confused by the batch_size argument. Does it mean that predict takes the values of xtest in a random order?
Because what I need is for predict to generate the outputs in exactly the same order as given by xtest: first the output predicted for xtest[0], then the output predicted for xtest[1], then xtest[2], and so on. With that predicted array I want to do some comparisons against the actual ytest I have and compute some statistics, so the order is essential. How can I do it?
Thank you in advance.
The predict method preserves the order of the examples. The batch size matters when your data is big and you simply cannot load all of it into memory at once; the data is then loaded and evaluated batch by batch, in the order of the original set.
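So the i-th prediction lines up with xtest[i] and can be compared directly with ytest[i]; a minimal sketch, assuming numpy is imported as np and ytest is a 1-D array:

preds = model.predict(xtest, batch_size=32).reshape(-1)  # same order as xtest
errors = preds - ytest                                   # rows are already aligned
print("MSE:", np.mean(errors ** 2))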
