My main goal is to use data from 2018 to predict values for 2019. I'm using a GRU model and I have the following code. I have a few issues: I'm not sure whether the code is actually correct or whether I am missing something, and for model.fit I don't know whether I should use validation_split=0.1 or validation_data=(X_test, y_test), since I'm using a different dataframe for testing.
Also, the accuracy is very small and doesn't make any sense, and I have no idea why.
import pandas as pd
import tensorflow as tf
from keras.layers import Dense, GRU
from keras.models import Sequential
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from tensorboardcolab import TensorBoardColab, TensorBoardColabCallback
df = pd.read_csv('IF 10 PERCENT.csv',index_col=None)
#Loading Second Dataframe
df2 = pd.read_csv('2019 10minutes IF 10 PERCENT.csv',index_col=None)
tbc=TensorBoardColab() # Tensorboard
X_train= df[['WindSpeed_mps','AmbTemp_DegC','RotorSpeed_rpm','RotorSpeedAve','NacelleOrientation_Deg','MeasuredYawError','Pitch_Deg','WindSpeed1','WindSpeed2','WindSpeed3','GeneratorTemperature_DegC','GearBoxTemperature_DegC']]
X_train=X_train.values
y_train= df['Power_kW']
y_train=y_train.values
X_test= df2[['WindSpeed_mps','AmbTemp_DegC','RotorSpeed_rpm','RotorSpeedAve','NacelleOrientation_Deg','MeasuredYawError','Pitch_Deg','WindSpeed1','WindSpeed2','WindSpeed3','GeneratorTemperature_DegC','GearBoxTemperature_DegC']]
X_test=X_test.values
y_test= df2['Power_kW']
y_test=y_test.values
# conversion to numpy array
# scaling values for model
x_scale = MinMaxScaler()
y_scale = MinMaxScaler()
X_train= x_scale.fit_transform(X_train)
y_train= y_scale.fit_transform(y_train.reshape(-1,1))
X_test = x_scale.transform(X_test)  # transform only: reuse the scaling fitted on the training data
y_test = y_scale.transform(y_test.reshape(-1,1))
X_train = X_train.reshape((-1,1,12))
X_test = X_test.reshape((-1,1,12))
# creating model using Keras
model = Sequential()
model.add(GRU(units=512, return_sequences=True, input_shape=(1,12)))
model.add(GRU(units=256, return_sequences=True))
model.add(GRU(units=256))
model.add(Dense(units=1, activation='sigmoid'))
model.compile(loss='mse', optimizer='adam', metrics=['accuracy'])
model.summary()
#model.fit(X_train, y_train, batch_size=250, epochs=10, validation_split=0.1, verbose=1, callbacks=[TensorBoardColabCallback(tbc)])
model.fit(X_train, y_train, batch_size=250, epochs=10, validation_data=(X_test,y_test), verbose=1, callbacks=[TensorBoardColabCallback(tbc)])
score = model.evaluate(X_test, y_test)  # returns [loss, accuracy]
print('Score: {}'.format(score[0]))
print('Accuracy: {}'.format(score[1]))
y_predicted = model.predict(X_test)
y_predicted = y_scale.inverse_transform(y_predicted)
y_test = y_scale.inverse_transform(y_test)
plt.plot(y_predicted, label='Predicted')
plt.plot(y_test, label='Measurements')
plt.legend()
plt.show()
Thank you
It sounds to me like you are trying to solve a regression problem here. If so, it does not make sense to measure accuracy as a metric, since accuracy measures exact label matches. MSE should be fine for regression.
I am training a neural network on a simple dataset. I have tried different combinations of parameters, optimizers, and learning rates, but even after 20 epochs the network is still not learning anything.
I wonder where in the following code the problem lies?
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Dense
from tensorflow import keras
from livelossplot import PlotLossesKeras
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import numpy as np
import pandas as pd
seed = 42
X, y = make_classification(n_samples=100000, n_features=2, n_redundant=0,
n_informative=2, random_state=seed)
print(f"Number of features: {X.shape[1]}")
print(f"Number of samples: {X.shape[0]}")
df = pd.DataFrame(np.concatenate((X,y.reshape(-1,1)), axis=1))
df.set_axis([*df.columns[:-1], 'Class'], axis=1, inplace=True)
df['Class'] = df['Class'].astype('int')
X = df.drop('Class', axis=1)
y = df['Class']
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=42)
print(f"Train set: {X_train.shape}")
print(f"Validation set: {X_val.shape}")
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train.astype(np.float64))
X_val_scaled = scaler.transform(X_val.astype(np.float64))
inputs = Input(shape=X_train_scaled.shape[1:])
h0 = Dense(5, activation='relu')(inputs)
h1 = Dense(5, activation='relu')(h0)
preds = Dense(1, activation = 'sigmoid')(h1)
model = Model(inputs=inputs, outputs=preds)
opt = keras.optimizers.Adam(learning_rate=0.0001)
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
history = model.fit(X_train_scaled, y_train, batch_size=128, epochs=20, verbose=0,
validation_data=(X_val_scaled, y_val),
callbacks=[PlotLossesKeras()])
score_train = model.evaluate(X_train_scaled, y_train, verbose=0)
score_test = model.evaluate(X_val_scaled, y_val, verbose=0)
print('Train score:', score_train[0])
print('Train accuracy:', score_train[1])
print('Test score:', score_test[0])
print('Test accuracy:', score_test[1])
The code produces training curves that stay essentially flat across all 20 epochs (plot omitted).
You have used the wrong loss function. Change this line
model.compile(optimizer=opt, loss='categorical_crossentropy', metrics=['accuracy'])
to, for example,
model.compile(optimizer=opt, loss='mse', metrics=['accuracy'])
Categorical cross-entropy needs a one-hot encoded y, which means you have to have a 0 or a 1 for every class. MSE is just mean squared error, so it will also work here. But for a single sigmoid output like yours, binary_crossentropy is the natural choice; you might try other losses as well.
your y:
[1,0,1]
one-hot encoded y:
[[0,1], [1,0], [0,1]]
The training and validations datasets I am using are shared here for the sake of reproducibility.
The validation_dataset.csv is the ground truth of training_dataset.csv.
What I am doing below is feeding the datasets into a simple CNN layer that extracts the useful features of the images and feeds them as 1D sequences into the LSTM network for classification.
from keras.models import Sequential
from keras.layers import Dense, Flatten, Conv1D, MaxPooling1D, LSTM, TimeDistributed, Dropout
from keras import optimizers
from keras.callbacks import EarlyStopping
import pandas as pd
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
from numpy import genfromtxt
df_train = genfromtxt('data/train/training_dataset.csv', delimiter=',')
df_validation = genfromtxt('data/validation/validation_dataset.csv', delimiter=',')
#train,test = train_test_split(df_train, test_size=0.20, random_state=0)
df_train = df_train[..., None]
df_validation = df_validation[..., None]
batch_size=8
epochs=5
model = Sequential()
model.add(Conv1D(filters=5, kernel_size=3, activation='relu', padding='same'))
model.add(MaxPooling1D(pool_size=2))
#model.add(TimeDistributed(Flatten()))
model.add(LSTM(50, return_sequences=True, recurrent_dropout=0.2))
model.add(Dropout(0.2))
model.add(LSTM(10))
model.add(Dropout(0.2))
model.add(Dense(1, activation='sigmoid'))
adam = optimizers.Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0)
model.compile(optimizer="rmsprop", loss='mse', metrics=['accuracy'])
callbacks = [EarlyStopping('val_loss', patience=3)]
model.fit(df_train, df_validation, batch_size=batch_size)
print(model.summary())
scores = model.evaluate(df_train, df_validation, verbose=0)
print("Accuracy: %.2f%%" % (scores[1]*100))
I want to split the training and validation datasets into (X_train, y_train), (X_test, y_test) so that I can use both for training and testing. I tried scikit-learn's split function, train,test = train_test_split(df_train, test_size=0.20, random_state=0), but it gives the following error after model.fit() is invoked.
ValueError: Data cardinality is ambiguous:
x sizes: 14384
y sizes: 3596
Please provide data which shares the same first dimension.
How can we split the dataset into (X_train, y_train), (X_test, y_test) sharing the same dimension?
One way is to have X and Y sets. Here, I assume the column name for Y is 'target'.
target = df_train['target']
df_train = df_train.drop(columns=['target'])
X_train, X_test, y_train, y_test = train_test_split(df_train, target, test_size=0.20, random_state=0)
EDIT:
It seems that I initially misunderstood your problem, and "validation_dataset.csv" is your label data. I apologize for not reading correctly.
In this case, you do not need a "target" variable, as that is what df_validation provides. Therefore, the following should work:
X_train, X_test, y_train, y_test = train_test_split(df_train, df_validation, test_size=0.20, random_state=0)
So I have two dataframes that I'm using with a GRU model.
I had to reshape them in order to apply the model, but at the end of the code I wanted to plot y_predicted against y_test. It works, but what used to be the x-axis is now the y-axis and vice versa. I thought it was because the data was reshaped, so I applied .reshape(1,-1), but every time I do I run out of memory (25GB) and I really don't know how to fix this problem.
EDIT:
So I have done everything using sample data and discovered that reshaping y_predict doesn't take memory at all, but this line uses all 25GB of memory for some reason: y_test = y_scale.inverse_transform(y_test)
import pandas as pd
import tensorflow as tf
from keras.layers import Dense, GRU
from keras.models import Sequential
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from tensorboardcolab import TensorBoardColab, TensorBoardColabCallback
# Loading First Dataframe
df = pd.read_csv('IF 10 PERCENT.csv',index_col=None)
df2 = pd.read_csv('2019 10minutes IF 10 PERCENT.csv',index_col=None)
tbc=TensorBoardColab() # Tensorboard
X_train= df[['WindSpeed_mps','AmbTemp_DegC','RotorSpeed_rpm','RotorSpeedAve','NacelleOrientation_Deg','MeasuredYawError','Pitch_Deg','WindSpeed1','WindSpeed2','WindSpeed3','GeneratorTemperature_DegC','GearBoxTemperature_DegC']]
X_train=X_train.values
y_train= df['Power_kW']
y_train=y_train.values
X_test= df2[['WindSpeed_mps','AmbTemp_DegC','RotorSpeed_rpm','RotorSpeedAve','NacelleOrientation_Deg','MeasuredYawError','Pitch_Deg','WindSpeed1','WindSpeed2','WindSpeed3','GeneratorTemperature_DegC','GearBoxTemperature_DegC']]
X_test=X_test.values
y_test= df2['Power_kW']
y_test=y_test.values
# conversion to numpy array
# scaling values for model
x_scale = MinMaxScaler()
y_scale = MinMaxScaler()
X_train= x_scale.fit_transform(X_train)
y_train= y_scale.fit_transform(y_train.reshape(-1,1))
X_test = x_scale.transform(X_test)  # transform only: reuse the scaling fitted on the training data
y_test = y_scale.transform(y_test.reshape(-1,1))
X_train = X_train.reshape((-1,1,12))
X_test = X_test.reshape((-1,1,12))
# creating model using Keras
model = Sequential()
model.add(GRU(units=512, return_sequences=True, input_shape=(1,12)))
model.add(GRU(units=256, return_sequences=True))
model.add(GRU(units=256))
model.add(Dense(units=1, activation='sigmoid'))
model.compile(loss='mse', optimizer='adam', metrics=['mae'])
model.summary()
history = model.fit(X_train, y_train, batch_size=256, epochs=1, validation_data=(X_test, y_test), verbose=1, callbacks=[TensorBoardColabCallback(tbc)])  # validation_data makes validation_split redundant
score = model.evaluate(X_test, y_test)
print('Score: {}'.format(score))
y_predicted = model.predict(X_test)
y_predicted = y_scale.inverse_transform(y_predicted)
y_test = y_scale.inverse_transform(y_test)
plt.plot( y_predicted, label='Predicted')
plt.plot( y_test, label='Measurements')
plt.legend()
plt.show()
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('model loss')
plt.ylabel('loss')
plt.xlabel('epoch')
plt.legend(['train', 'test'], loc='upper left')
plt.show()
Thank you.
I have a regression problem and I am using a Keras fully connected network to model it. I am using cross_val_score, and my question is: how can I extract the model and the training history of each train/validation combination that cross_val_score runs?
Assuming this example:
from sklearn import datasets
from sklearn.model_selection import cross_val_score, KFold
from keras.models import Sequential
from keras.layers import Dense
from keras.wrappers.scikit_learn import KerasRegressor
seed = 1
diabetes = datasets.load_diabetes()
X = diabetes.data[:150]
y = diabetes.target[:150]
def baseline_model():
    model = Sequential()
    model.add(Dense(10, input_dim=10, activation='relu'))
    model.add(Dense(1))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model
estimator = KerasRegressor(build_fn=baseline_model, epochs=100, batch_size=100, verbose=False)
kfold = KFold(n_splits=10, shuffle=True, random_state=seed)
results = cross_val_score(estimator, X, y, cv=kfold)
print("Results: %.2f (%.2f) MSE" % (results.mean(), results.std()))
My understanding is that I only get the overall MSE for each fold, so to speak.
But I want to compare the train to validation mse over the epochs of the model for each fold, i.e. for 10 in this case.
When not using kfold, but simple train/validation split, then one can do:
hist = model.fit(X_tr, y_tr, validation_data=val_data,
                 epochs=100, batch_size=100,
                 verbose=1)
plt.plot(hist.history['loss'])
plt.plot(hist.history['val_loss'])
This would return a plot representing the evolution of the mse w.r.t. to the epochs for the train and validation datasets, allowing to spot over/underfitting.
How to do this for each fold when using cross validation?
You can go for a "manual" CV procedure, and plot the loss (or any other available metric you might want to use) for each fold, i.e. something like this:
import matplotlib.pyplot as plt
from sklearn.metrics import mean_squared_error

cv_mse = []
for train_index, val_index in kfold.split(X):
    history = estimator.fit(X[train_index], y[train_index])
    pred = estimator.predict(X[val_index])
    err = mean_squared_error(y[val_index], pred)
    cv_mse.append(err)
    plt.plot(history.history['loss'])
In which case, the cv_mse list will contain the final MSE for each fold, and you also get the respective plots for its evolution per epoch for each fold.
I am new to the ML field and still learning. I made a model by following a tutorial, but the resulting accuracy always jumps to 100% very quickly. I searched online, and as far as I understand the model is overfitting. The dataset I used is pretty small: the Indian Liver Patient Dataset from the UCI site, with only around 600 observations.
My question is how I could overcome this overfitting. Any help will be appreciated, thanks.
import pandas as pd
from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report
import scikitplot as skplt
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns
df = pd.read_csv("C:/TF/TEST/ILDP.csv")
df["ag_ratio"].fillna("0.6", inplace=True)
df.isnull().sum()
print(df.head())
LD, NLD = df['is_patient'].value_counts()
df_sex = pd.get_dummies(df['gender'])
df_new = pd.concat([df, df_sex], axis=1)
Droop_gender = df_new.drop(labels=['gender'], axis=1)
Droop_gender.columns = ['age', 'tot_bilirubin', 'direct_bilirubin', 'tot_proteins', 'albumin', 'ag_ratio',
'sgpt', 'sgot', 'alkphos', 'Female', 'Male', 'is_patient']
X = Droop_gender.drop('is_patient', axis=1)
y = Droop_gender['is_patient']
print(X.shape)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)
classifier = Sequential() # Initialising the ANN
classifier.add(Dense(units=16, kernel_initializer='uniform', activation='relu', input_dim=11))
classifier.add(Dense(units=8, kernel_initializer='uniform', activation='relu'))
classifier.add(Dense(units=6, kernel_initializer='uniform', activation='relu'))
classifier.add(Dense(units=1, kernel_initializer='uniform', activation='sigmoid'))
# compile ANN
classifier.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
# Fitting the data
history = classifier.fit(X_train, y_train, batch_size=20, epochs=50)
y_pred = classifier.predict(X_test)
y_pred = [1 if y >= 0.5 else 0 for y in y_pred]
print(classification_report(y_test, y_pred))
That your model is overfitting is encouraging because it means your model has the capacity to learn. Now you have to gradually reduce the capacity of your model to make it generalize better. My recommendation is to add regularization.
Add dropout layers between some of your fully connected layers:
from keras.layers import Dropout

classifier.add(Dense(units=16, kernel_initializer='uniform', activation='relu', input_dim=11))
classifier.add(Dropout(0.5))
classifier.add(Dense(units=8, kernel_initializer='uniform', activation='relu'))
You can add these dropout layers between any layers, but adding between layers with more neurons is better.
If that doesn't work well you can try weight decay. Here is an example from the documentation:
from keras import regularizers
model.add(Dense(64, input_dim=64,
                kernel_regularizer=regularizers.l2(0.01),
                activity_regularizer=regularizers.l1(0.01)))
Although, try either kernel_regularizer or activity_regularizer first. They differ slightly (kernel_regularizer penalizes the layer's weights, activity_regularizer its outputs), but both push the model toward simpler solutions. Try tuning and see how different parameters change the results. In the end it's a lot of black magic, so you'll have to experiment a bit. Good luck!