Hello, I'm trying to do energy disaggregation (predicting the energy use of individual appliances given the total energy consumption of a household).
Now I have an input dimension of 2 because of 2 main energy measurements.
The output dimension of the Keras Sequential model should be 18 because I have 18 appliances I would like to make a prediction for.
I have enough data using the REDD dataset (this is no problem).
I have trained the model and obtained reasonable loss and accuracy.
But when I make a prediction for some test data, the prediction consists of values in a 1-dimensional array, even though the outputs should be 18-dimensional.
How is this possible, or am I trying something that isn't really viable?
Some code:
model = Sequential()
model.add(Dense(HIDDEN_LAYER_NEURONS,input_dim=2))
model.add(Activation('relu'))
model.add(Dense(18))
model.compile(loss=LOSS,
              optimizer=OPTIMIZER,
              metrics=['accuracy'])
model.fit(X_train, y_train, epochs=EPOCHS, batch_size=BATCH_SIZE,
          verbose=1, validation_split=VALIDATION_SPLIT)
pred = model.predict(X_test).reshape(-1)
pred.shape # prints a 1-dimensional shape: (xxxxx,)
The ALL_CAPS variables are constants.
X_train is 2-dim
y_train is 18-dim
Any help is appreciated!
Well you are reshaping the predictions and flattening them here:
pred = model.predict(X_test).reshape(-1)
The reshape(-1) effectively makes the array one-dimensional. Just take the predictions directly:
pred = model.predict(X_test)
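To see the difference, here is a quick numpy sketch with random stand-in values for the predictions (the shapes mirror the question: some test samples, 18 appliances):

```python
import numpy as np

# Stand-in for model.predict output: 5 test samples, 18 appliances.
pred = np.random.rand(5, 18)

flat = pred.reshape(-1)  # flattens to a single dimension
print(flat.shape)        # (90,)
print(pred.shape)        # (5, 18) - one column per appliance, keep this
```

Keeping the (samples, 18) shape lets you index `pred[:, k]` to get the prediction series for the k-th appliance.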
Related
I am trying to train an LSTM to predict some numbers from a sequence of numbers. My X dataset has 33 features and my Y dataset has 4 variables that I have to predict for each X sample. For example:
Xdf:
Ydf:
After I turn X, y to numpy arrays and reshape the data I build the below model:
I get the following results, with huge loss and clear overfitting. Any ideas what I am doing wrong to get these results? Thanks.
Xdf: https://drive.google.com/file/d/1wH56E0M3ok1MGWzU6FGKgDJF7EZfL_7f/view?usp=sharing
ydf: https://drive.google.com/file/d/1RkjWl1FIiQDyjkRvl7ZQKTOXIE8FtBXA/view?usp=sharing
UPDATE
I edited my code the following way and it seems to be working a tiny bit better. There is still huge loss and overfitting, but after some epochs it seemed to work fine for a while. Any suggestions on how to reduce the loss and overfitting would be appreciated.
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.callbacks import EarlyStopping

X = X.reshape((X.shape[0], 33, 1)) # (5850, 33, 1)
model = tf.keras.Sequential()
model.add(layers.LSTM(50, activation='relu', input_shape=(33, 1)))
model.add(layers.Dense(4))
model.add(layers.Dense(4))
model.add(layers.Dense(4))
early_stopping = EarlyStopping(monitor='val_loss', patience=42)
model.compile(optimizer=tf.keras.optimizers.Adam(0.01), loss=tf.keras.losses.MeanSquaredError(), metrics=['accuracy'])
model.fit(X, y, epochs=200, verbose=1, validation_split = 0.2)
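For reference, the reshape above turns each 33-feature row into a sequence of 33 single-value timesteps, which is the (samples, timesteps, features) layout an LSTM layer expects. A minimal numpy sketch with random stand-in data of the same shape:

```python
import numpy as np

# Hypothetical data with the same shape as in the question: 5850 samples, 33 features.
X = np.random.rand(5850, 33)
X = X.reshape((X.shape[0], 33, 1))  # LSTM input layout: (samples, timesteps, features)
print(X.shape)  # (5850, 33, 1)
```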
I'm trying to use a NN model to predict with new data. However, the predicted data is not on the correct scale (I obtain values around 1e-10 when they should be around 0.3).
In my model I've used MinMaxScaler on the x and y data. The model gave me an R2 value of 0.9 when using the train/test split method, and an MSE of 0.01% using both a pipeline method and cross-validation. So I believe the model I've created is OK.
Here is the model I've made:
data=pd.read_csv(r'''F:\DataforANNfromIESFebAugPowerValues.csv''')
data = data.dropna(axis=0, how='all')
x=data[['Dry-bulb_temperature_C','Wind_speed_m/s','Cloud_cover_oktas','External_relative_humidity_%','Starrag1250','StarragEcospeed2538','StarragS191','StarragLX051','DoosanCNC6700','MakinoG7','HermleC52MT','WFL_Millturn','Hofler1350','MoriNT4250','MoriNT5400','NMV8000','MoriNT6600','MoriNVL1350','HermleC42','CFV550','MoriDura635','DMGUltrasonic10']]
y=data[['Process_heat_output_waste_kW','Heating_plant_sensible_load_kW','Cooling_plant_sensible_load_kW','Relative_humidity_%','Air_temperature_C','Total_electricity_kW','Chillers_energy_kW','Boilers_energy_kW']]
epochs=150
learning_rate=0.001
decay_rate=learning_rate/epochs
optimiser=keras.optimizers.Nadam(lr=learning_rate, schedule_decay=decay_rate)
def create_model():
    model=Sequential()
    model.add(Dense(21, input_dim=22, activation='relu'))
    model.add(Dropout(0.2))
    model.add(Dense(19, activation='relu')) #hidden layer 2
    model.add(Dropout(0.2))
    model.add(Dense(8, activation='sigmoid')) #output layer
    model.compile(loss='mean_squared_error', optimizer=optimiser,metrics=['accuracy','mse'])
    return model
scaler=MinMaxScaler()
x=MinMaxScaler().fit_transform(x)
print(x)
y=MinMaxScaler().fit_transform(y)
model=KerasRegressor(build_fn=create_model, verbose=0,epochs=150, batch_size=70)
model.fit(x, y, epochs=150, batch_size=70)
##SET UP NEW DATA FOR PREDICTIONS
xnewdata=pd.read_csv(r'''F:\newdatapowervalues.csv''')
xnewdata = xnewdata.dropna(axis=0, how='all')
xnew=xnewdata[['Dry-bulb_temperature_C','Wind_speed_m/s','Cloud_cover_oktas','External_relative_humidity_%','Starrag1250','StarragEcospeed2538','StarragS191','StarragLX051','DoosanCNC6700','MakinoG7','HermleC52MT','WFL_Millturn','Hofler1350','MoriNT4250','MoriNT5400','NMV8000','MoriNT6600','MoriNVL1350','HermleC42','CFV550','MoriDura635','DMGUltrasonic10']]
xnew=MinMaxScaler().fit_transform(xnew)
ynew=model.predict(xnew)
ynewdata=pd.DataFrame(data=ynew)
ynewdata.to_csv(r'''F:\KerasIESPowerYPredict.csv''',header=['Process_heat_output_waste_kW','Heating_plant_sensible_load_kW','Cooling_plant_sensible_load_kW','Relative_humidity_%','Air_temperature_C','Total_electricity_kW','Chillers_energy_kW','Boilers_energy_kW'])
Seeing as I've used the scaler on the initial training data, I thought I would also need to do this to the new data. I've tried doing
scaler.inverse_transform(ynew)
after model.predict(xnew); however, I get the error that the MinMaxScaler instance isn't fitted to y yet.
Therefore, I tried using the pipeline method.
estimators = []
estimators.append(('standardize', MinMaxScaler()))
estimators.append(('mlp', KerasRegressor(build_fn=create_model, epochs=150, batch_size=70, verbose=0)))
pipeline = Pipeline(estimators)
pipeline.fit(x,y)
for the inital training model instead of
x=MinMaxScaler().fit_transform(x)
y=MinMaxScaler().fit_transform(y)
model=KerasRegressor(build_fn=create_model, verbose=0,epochs=150, batch_size=70)
model.fit(x, y, epochs=150, batch_size=70)
i then used
ynew=pipeline.predict(xnew)
However, this gave me data consisting mainly of 1's!
Any idea how I can predict correctly on this new data? I'm unsure which data to scale and which not to, as I believe pipeline.predict would include scaling for x and y. Therefore, do I need some sort of inverse pipeline scaler after making these predictions?
Many thanks for your help.
There is one minor and one major problem with your approach.
Minor one: there's no need to scale your target variable; it does not affect your optimisation function.
Major one: you fit the scaler again on the data on which you want to run the prediction. By doing this, you completely skew the relations you have in the data, and hence the predicted output is on a very different scale. Also, you define a scaler and later don't use it. Let's fix that:
(...)
scaler=MinMaxScaler()
x=scaler.fit_transform(x)
model=KerasRegressor(build_fn=create_model, verbose=0,epochs=150, batch_size=70)
model.fit(x, y, epochs=150, batch_size=70)
##SET UP NEW DATA FOR PREDICTIONS
xnewdata=pd.read_csv(r'''F:\newdatapowervalues.csv''')
xnewdata = xnewdata.dropna(axis=0, how='all')
xnew=xnewdata[['Dry-bulb_temperature_C','Wind_speed_m/s','Cloud_cover_oktas','External_relative_humidity_%','Starrag1250','StarragEcospeed2538','StarragS191','StarragLX051','DoosanCNC6700','MakinoG7','HermleC52MT','WFL_Millturn','Hofler1350','MoriNT4250','MoriNT5400','NMV8000','MoriNT6600','MoriNVL1350','HermleC42','CFV550','MoriDura635','DMGUltrasonic10']]
xnew=scaler.transform(xnew)
ynew=model.predict(xnew)
ynewdata=pd.DataFrame(data=ynew)
As you can see, we use the scaler first to learn the proper normalisation factors (fit_transform) and then apply it (transform) to the new data on which we run predict.
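The effect of re-fitting is easy to see without Keras. Below is a minimal numpy sketch of min-max scaling (a hand-rolled stand-in for sklearn's MinMaxScaler) on made-up numbers: fitting once on the training data keeps new data on a consistent scale, while re-fitting on the new data skews it.

```python
import numpy as np

# Hand-rolled min-max scaling, standing in for sklearn's MinMaxScaler.
def minmax_fit(x):
    return x.min(axis=0), x.max(axis=0)

def minmax_transform(x, lo, hi):
    return (x - lo) / (hi - lo)

train = np.array([[0.0], [10.0], [20.0]])
new = np.array([[5.0], [6.0]])         # hypothetical new data, same units as train

lo, hi = minmax_fit(train)             # fit once, on the training data
ok = minmax_transform(new, lo, hi)     # [[0.25], [0.3]] - consistent with training scale

lo2, hi2 = minmax_fit(new)             # re-fitting on the new data instead...
bad = minmax_transform(new, lo2, hi2)  # [[0.0], [1.0]] - the scale is completely skewed
```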
I'm trying to train a neural network using Keras and Tensorflow backend. My X is text descriptions which I have processed and transformed into sequences. Now, my y is a sparse matrix since it's a multi-label classification and I have many output classes.
>>> y
<30405x3387 sparse matrix of type '<type 'numpy.int64'>'
with 54971 stored elements in Compressed Sparse Row format>
To train the model, I tried defining a batch generator:
def batch_generator(x, y, batch_size=32):
    n_batches_per_epoch = x.shape[0]//batch_size
    for i in range(n_batches_per_epoch):
        index_batch = range(x.shape[0])[batch_size*i:batch_size*(i+1)]
        x_batch = x[index_batch,:]
        y_batch = y[index_batch,:].todense()
        yield x_batch, np.array(y_batch)
I've divided my data as:
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
I define my model as:
model = Sequential()
# Create architecture, add some layers.
model.add(Dense(num_classes))
model.add(Activation('sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
And I'm training my model as:
model.fit_generator(generator=batch_generator(x_train, y_train), steps_per_epoch=x_train.shape[0] // 32, epochs=200, callbacks=the_callbacks)
But my model starts with around 55% accuracy and it quickly (in 2 or 3 steps) becomes 99.95%, which makes no sense at all. Am I doing something wrong?
You'll need to switch your loss to "categorical_crossentropy" or change your metric to "crossentropy" for multiclass classification.
The "accuracy" metric is actually ambiguous behind the scenes in Keras: it picks binary or categorical accuracy based on the loss function used.
https://github.com/keras-team/keras/blob/master/keras/engine/training.py#L375
If you have two classes, you can use a sigmoid activation in the last layer and the binary cross-entropy loss function. But if you have more than two classes, then you have to replace sigmoid with softmax and binary with categorical cross-entropy.
There could be multiple other reasons for the abrupt change in accuracy, depending on your data distribution, model configuration, etc.
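One concrete way to see why element-wise accuracy can look near-perfect on a sparse multi-label target: with only a couple of positive labels per row, a model that predicts all zeros already scores very high. A small numpy sketch with made-up dimensions (100 classes, exactly 2 positives per row, loosely mirroring the density of the question's sparse matrix):

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_classes = 1000, 100

# Sparse multi-label target: exactly 2 positive labels per row.
y_true = np.zeros((n_samples, n_classes), dtype=int)
for row in y_true:
    row[rng.choice(n_classes, size=2, replace=False)] = 1

y_pred = np.zeros_like(y_true)  # a "model" that never predicts any label at all
accuracy = (y_pred == y_true).mean()
print(accuracy)  # 0.98 - dominated by the correctly-predicted zeros
```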
I am trying to predict neutron widths from resonance energies using a neural network (I'm quite new to Keras/NNs in general, so apologies in advance).
There is said to be a link between resonance energies and neutron widths, and since energy increases monotonically, this can be modelled similarly to a time-series problem.
In essence I have 2 columns of data, with the first column being resonance energy and the other column containing the respective neutron width on each row. I have decided to use an LSTM layer to help the network's predictions by utilising previous computations.
From various tutorials and other answers, it seems common to use a "look_back" argument to allow the network to use previous timesteps to help predict the current timestep when creating the dataset, e.g.
trainX, trainY = create_dataset(train, look_back)
I would like to ask regarding forming the NN:
1) Given my particular application, do I need to explicitly map each resonance energy to its corresponding neutron width on the same row?
2) look_back indicates how many previous values the NN can use to help predict the current value, but how is it incorporated with the LSTM layer? I.e. I don't quite understand how both can be used.
3) At which point do I inverse the MinMaxScaler?
Those are the main queries; for 1) I have assumed it's okay not to, and for 2) I believe it is possible but I don't really understand how. I can't quite work out what I have done wrong in the code; ideally I would like to plot the relative deviation of predicted to reference values in the train and test data once the code works. Any advice would be much appreciated:
import numpy
import matplotlib.pyplot as plt
import pandas
import math
from keras.models import Sequential
from keras.layers import Dense, LSTM, Dropout
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_squared_error
# convert an array of values into a dataset matrix
def create_dataset(dataset, look_back=1):
    dataX, dataY = [], []
    for i in range(len(dataset) - look_back - 1):
        a = dataset[i:(i + look_back), 0]
        dataX.append(a)
        dataY.append(dataset[i + look_back, 1])
    return numpy.array(dataX), numpy.array(dataY)
# fix random seed for reproducibility
numpy.random.seed(7)
# load the dataset
dataframe = pandas.read_csv('CSVDataFe56Energyneutron.csv', engine='python')
dataset = dataframe.values
print("dataset")
print(dataset.shape)
print(dataset)
# normalize the dataset
scaler = MinMaxScaler(feature_range=(0, 1))
dataset = scaler.fit_transform(dataset)
print(dataset)
# split into train and test sets
train_size = int(len(dataset) * 0.67)
test_size = len(dataset) - train_size
train, test = dataset[0:train_size, :], dataset[train_size:len(dataset), :]
# reshape into X=t and Y=t+1
look_back = 3
trainX, trainY = create_dataset(train, look_back)
testX, testY = create_dataset(test, look_back)
# reshape input to be [samples, time steps, features]
trainX = numpy.reshape(trainX, (trainX.shape[0], look_back, 1))
testX = numpy.reshape(testX, (testX.shape[0],look_back, 1))
# # create and fit the LSTM network
#
number_of_hidden_layers=16
model = Sequential()
model.add(LSTM(6, input_shape=(look_back,1)))
for x in range(0, number_of_hidden_layers):
model.add(Dense(50, activation='relu'))
model.add(Dense(1))
model.compile(loss='mean_squared_error', optimizer='adam')
history = model.fit(trainX, trainY, epochs=200, batch_size=32)
trainPredict = model.predict(trainX)
testPredict = model.predict(testX)
trainScore = model.evaluate(trainX, trainY, verbose=0)
print('Train Score: %.2f MSE (%.2f RMSE)' % (trainScore, math.sqrt(trainScore)))
testScore = model.evaluate(testX, testY, verbose=0)
print('Test Score: %.2f MSE (%.2f RMSE)' % (testScore, math.sqrt(testScore)))
1) Given my particular application do I need to explicitly map each
resonance energy to its corresponding neutron width on the same row?
Yes, you have to do that. Basically your data has to be in the shape of:
X=[timestep, timestep,...] y=[label, label,...]
2) Look_back indicates how many previous values the NN can use to help
predict the current value, but how is it incorporated with the LSTM
layer? I.e I dont quite understand how both can be used?
An LSTM is a sequence-aware layer. You can think of it as something like a hidden Markov model: it takes the first timestep, computes something, and in the next timestep that previous computation is taken into account. look_back, which is usually called sequence_length, is just the maximum number of timesteps.
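For illustration, here is a small numpy sketch of how look_back windows are built from a single series (make_windows is a hypothetical helper, not the create_dataset from the question, which additionally pairs two columns):

```python
import numpy as np

def make_windows(series, look_back):
    """Slide a window of `look_back` steps over the series; the target is the next value."""
    X = np.array([series[i:i + look_back] for i in range(len(series) - look_back)])
    y = series[look_back:]
    return X, y

series = np.arange(10, dtype=float)
X, y = make_windows(series, look_back=3)
print(X.shape, y.shape)   # (7, 3) (7,)
print(X[0], y[0])         # [0. 1. 2.] 3.0 - three past values predict the next one

# An LSTM then consumes each window in (samples, timesteps, features) layout:
X = X.reshape((X.shape[0], 3, 1))
```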
3) At which point do I inverse the MinMaxScaler?
Why should you do that? Furthermore, you don't need to scale your input.
It seems like you have a general misconception in your model. If you have input_shape=(look_back, 1), you don't need LSTMs at all. If your sequence is just a sequence of single values, it might be better to avoid LSTMs. Furthermore, fitting your model should include validation after each epoch to track the loss and validation performance:
model.fit(x_train, y_train,
          batch_size=32,
          epochs=200,
          validation_data=(x_test, y_test),
          verbose=1)
I am doing regression in Keras, with a neural network with 1 input, 10 hidden units and 1 output. I fit the model as usual:
model.fit(x_train, y_train, epochs=15, batch_size=32)
Now I want to predict for an xtest that is (like x_train and y_train) a very big 1-dimensional numpy array. In the Keras documentation you can find:
predict(self, x, batch_size=32, verbose=0)
so I understand you have to do:
model.predict(xtest, batch_size=32)
I am confused by the batch_size argument. Does it mean that predict takes the values of xtest in a random order?
Because what I need is for predict to generate the outputs in exactly the same order as given by xtest. I mean, first the output predicted for xtest[0], then the output predicted for xtest[1], then the output predicted for xtest[2], and so on. With that predicted array I want to do some comparisons with an actual ytest that I have and compute some statistics. So the order is essential. How can I do it?
Thank you in advance.
The predict method preserves the order of the examples. Batch size is essential when your data is big and you simply cannot load all the examples into memory at once; the data is then loaded and evaluated batch by batch, in the order of the original set.
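You can convince yourself of this without Keras: applying a deterministic function batch by batch, in order, and concatenating the results gives exactly the same ordered output as applying it to the whole array at once. A sketch with fake_model as a stand-in for model.predict:

```python
import numpy as np

def fake_model(batch):
    # Stand-in for model.predict on one batch: any deterministic function works.
    return batch * 2.0

x_test = np.arange(10, dtype=float)
batch_size = 4

# What predict() does internally: evaluate batch by batch, in order, then concatenate.
parts = [fake_model(x_test[i:i + batch_size]) for i in range(0, len(x_test), batch_size)]
batched = np.concatenate(parts)

print(np.array_equal(batched, fake_model(x_test)))  # True - same values, same order
```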