How to see why a Keras / TensorFlow model is getting stuck?
My code is:
from keras.models import Sequential
from keras.layers import Dense
import numpy
import pandas as pd
X = pd.read_csv(
"data/train.csv", usecols=['Type', 'Age', 'Breed1', 'Breed2', 'Gender', 'Color1', 'Color2', 'Color3', 'MaturitySize', 'FurLength', 'Vaccinated', 'Dewormed', 'Sterilized', 'Health', 'Quantity', 'Fee', 'VideoAmt', 'PhotoAmt'])
Y = pd.read_csv(
"data/train.csv", usecols=['AdoptionSpeed'])
model = Sequential()
model.add(Dense(18, input_dim=18, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
optimizer='adam', metrics=['accuracy'])
model.fit(X, Y, epochs=150, batch_size=100)
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
I am trying to train the model to see how the various factors (type, age, etc.) affect the AdoptionSpeed. However, the accuracy gets stuck at 20.6% and doesn't really move from there.
Epoch 2/150
14993/14993 [==============================] - 0s 9us/step - loss: -24.1539 - acc: 0.2061
Epoch 3/150
14993/14993 [==============================] - 0s 9us/step - loss: -24.1591 - acc: 0.2061
Epoch 4/150
14993/14993 [==============================] - 0s 9us/step - loss: -24.1626 - acc: 0.2061
...
Epoch 150/150
14993/14993 [==============================] - 0s 9us/step - loss: -24.1757 - acc: 0.2061
14993/14993 [==============================] - 0s 11us/step
acc: 20.61%
Is there anything I can do to nudge the model and get it unstuck?
Judging by the loss values (they are negative), it seems your true data is not in the same range as the model's output (sigmoid).
A sigmoid outputs values between 0 and 1 only, so you should normalize your data to lie between 0 and 1. One possibility is to simply divide Y by Y.max().
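A minimal sketch of that rescaling against the code above (my assumption: AdoptionSpeed is a non-negative integer label, so dividing by its maximum maps it into [0, 1]):

Y = Y / Y.max()  # bring targets into [0, 1], the range a sigmoid can produce
model.fit(X, Y, epochs=150, batch_size=100)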
Or you can try other possibilities, considering these output ranges (a sketch follows the list):
sigmoid: between 0 and 1
tanh: between -1 and 1
relu: 0 to infinity
linear: -inf to +inf
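Alternatively, a sketch of matching the model to the raw targets instead of rescaling them. Note that pairing the linear output with an 'mse' loss is my assumption (binary cross-entropy makes no sense outside [0, 1]), not something this answer spells out:

model = Sequential()
model.add(Dense(18, input_dim=18, activation='relu'))
model.add(Dense(1, activation='linear'))  # linear spans -inf to +inf, so unscaled targets fit
model.compile(loss='mse', optimizer='adam', metrics=['mae'])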
Related
ANN regression problem with high loss - Python Pandas
I am trying to run an artificial neural network with 2 input parameters that should give me the value of the command. An example of the dataset in the CSV file:

P1,P2,S 7.03,3.36,787.75 6.11,3.31,491.06 5.92,3.34,480.4 5.0,3.39,469.77 5.09,3.36,481.14 5.05,3.35,502.2 4.97,3.38,200.75 5.01,3.34,464.36 5.0,3.42,475.1 4.94,3.36,448.8 4.97,3.37,750.3 5.1,3.39,344.93 5.03,3.41,199.75 5.03,3.39,484.35 5.0,3.47,483.17 4.91,3.42,485.29 3.65,3.51,513.81 5.08,3.47,443.94 5.06,3.4,473.77 5.0,3.42,535.78 3.45,3.44,483.23 4.94,3.45,449.49 4.94,3.51,345.14 5.05,3.48,2829.14 5.01,3.45,1465.58 4.96,3.45,1404.53 3.35,3.58,453.09 5.09,3.47,488.02 5.12,3.52,451.12 5.15,3.54,457.48 5.07,3.53,458.07 5.11,3.5,458.69 5.11,3.47,448.13 5.01,3.42,474.44 4.92,3.44,443.44 5.08,3.53,476.89 5.01,3.49,505.67 5.01,3.47,451.82 4.95,3.49,460.96 5.14,3.42,422.13 5.14,3.42,431.44 5.03,3.46,476.09 4.95,3.53,486.88 5.03,3.42,489.81 5.07,3.45,544.39 5.01,3.52,630.21 5.16,3.49,484.47 5.03,3.52,450.83 5.12,3.48,505.6 5.13,3.54,8400.34 4.99,3.49,615.57 5.13,3.46,673.72, 5.19,3.52,522.31 5.11,3.52,417.29 5.15,3.49,454.97 4.96,3.55,3224.72 5.12,3.54,418.85 5.06,3.53,489.87 5.05,3.45,433.04, 5.0,3.46,491.56 12.93,3.48,3280.98 5.66,3.5,428.5 4.98,3.59,586.43 4.96,3.51,427.67 5.06,3.54,508.53 4.88,3.49,1040.43 5.11,3.52,467.79 5.18,3.54,512.79 5.11,3.52,560.05 5.08,3.53,913.69 5.12,3.53,521.1 5.15,3.52,419.24 5.12,3.56,527.72 5.03,3.52,478.1 5.1,3.55,450.32 5.08,3.53,451.12 4.89,3.53,514.78 4.92,3.46,469.23 5.03,3.53,507.8 4.96,3.56,2580.22 4.99,3.52,516.24 5.0,3.55,525.96 3.66,3.61,450.69 4.91,3.53,487.98 4.97,3.54,443.86 3.53,3.57,628.8 5.02,3.51,466.91 6.41,3.46,430.19 5.0,3.58,589.98 5.06,3.55,711.22 5.26,3.55,2167.16 6.59,3.53,380.59 6.12,3.47,723.56 6.08,3.47,404.59 6.09,3.49,509.5 5.75,3.52,560.21 5.11,3.58,414.83 5.56,3.17,411.22 6.66,3.26,219.38 5.52,3.2,422.13 7.91,3.22,464.87 7.14,3.2,594.18 6.9,3.21,491.0 6.98,3.28,642.09 6.39,3.22,394.49 5.82,3.19,616.82 5.71,3.13,479.6 5.31,3.1,430.6 6.19,3.34,435.42 4.88,3.42,518.14 4.88,3.36,370.93 4.88,3.4,193.36 5.11,3.47,430.06 4.77,3.46,379.38 5.34,3.39,465.39 6.27,3.29,413.8 6.22,3.19,633.28 5.22,3.45,444.14 4.08,3.42,499.91 3.57,3.48,534.41 4.1,3.48,373.8 4.13,3.49,443.57 4.07,3.48,463.74 4.13,3.46,419.92 4.21,3.44,457.76 4.13,3.41,339.31 4.23,3.51,893.39 4.11,3.45,392.54 4.99,3.44,472.96 4.96,3.45,192.54 5.0,3.48,191.22 5.25,3.43,425.64 5.11,3.41,191.12 5.06,3.44,422.32 5.08,3.44,973.29 5.23,3.43,400.67 5.15,3.44,404.2 6.23,3.46,383.07 6.07,3.37,484.3 6.17,3.44,549.94 4.7,3.45,373.43 5.56,3.41,379.33 5.12,3.45,357.51 5.87,3.42,349.89 5.49,3.44,374.4 5.14,3.44,361.11 6.09,3.46,521.23 5.68,3.5,392.98 5.04,3.44,406.9 5.07,3.42,360.8 5.14,3.38,406.48 4.14,3.56,362.45 4.09,3.48,421.83 4.1,3.48,473.64 4.04,3.53,378.35 4.16,3.47,424.59 4.07,3.47,366.27 3.53,3.59,484.37 4.07,3.51,417.12 4.21,3.49,2521.87 4.15,3.5,458.69 4.08,3.52,402.48 4.2,3.47,373.26 3.69,3.5,486.62 4.24,3.51,402.12 4.19,3.5,414.79 4.13,3.55,390.08 4.2,3.5,452.96 4.06,3.52,524.97 4.22,3.47,442.46 4.07,3.5,403.13 4.07,3.51,404.54 4.17,3.46,393.33 4.1,3.4,430.81 4.05,3.41,365.2 4.11,3.47,412.8 4.13,3.49,431.14 4.03,3.51,417.5 3.9,3.48,386.62 4.16,3.49,351.71 5.18,3.48,351.43 4.49,3.5,336.33 3.7,3.51,551.8 6.39,3.44,369.79 6.74,3.35,408.57 6.0,3.38,2924.54 6.61,3.36,449.27 4.91,3.42,361.8 5.81,3.43,470.62 5.8,3.48,389.52 4.81,3.45,403.57 5.75,3.43,570.8 5.68,3.42,405.9 5.9,3.4,458.53 6.51,3.45,374.3 6.63,3.38,406.68 6.85,3.35,382.9 6.8,3.46,398.47 4.81,3.47,398.39 8.3,3.48,538.2

The code:

import pandas as pd
import matplotlib.pyplot as plt

plt.style.use('ggplot')

concatenation = pd.read_csv('concatenation.csv')
X = concatenation.iloc[:, :2].values  # 2 columns
y = concatenation.iloc[:, 2].values   # 1 column

from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

from sklearn.preprocessing import StandardScaler
sc = StandardScaler()
X_train = sc.fit_transform(X_train)
X_test = sc.transform(X_test)

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense

model = Sequential()
model.add(Dense(units=128, activation='relu'))
model.add(Dense(units=64, activation='relu'))
model.add(Dense(units=1, activation='linear'))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, y_train, epochs=1000)

But I have a problem during training: the loss is very high and I cannot understand why.

Epoch 1/1000
10/10 [==============================] - 1s 22ms/step - loss: 407736.7188 - mae: 431.3878 - val_loss: 269746.6875 - val_mae: 380.4598
Epoch 2/1000
10/10 [==============================] - 0s 7ms/step - loss: 407391.1875 - mae: 431.0146 - val_loss: 269452.0625 - val_mae: 380.0934
Epoch 3/1000
10/10 [==============================] - 0s 8ms/step - loss: 407016.3750 - mae: 430.5912 - val_loss: 269062.3125 - val_mae: 379.6077
Epoch 4/1000
10/10 [==============================] - 0s 7ms/step - loss: 406472.7188 - mae: 430.0183 - val_loss: 268508.0312 - val_mae: 378.9190
Epoch 5/1000
10/10 [==============================] - 0s 9ms/step - loss: 405686.1562 - mae: 429.1566 - val_loss: 267709.7812 - val_mae: 377.9213
...

I checked that I don't have any null values, I standardized my X_train, and I didn't touch the outputs. I am indeed doing regression with the right optimizer and the right loss function... so I can't understand why.
Trying to predict numbers in an LSTM and having extremely high loss (even with MinMaxScaler and Dropout)
from numpy import array
from keras.models import Sequential
from keras.layers import LSTM, Dropout
from keras.layers import Dense
from sklearn.preprocessing import MinMaxScaler  # missing from the original snippet, needed below

def split_univariate_sequence(sequence, n_steps_in, n_steps_out):
    X, y = list(), list()
    for i in range(len(sequence)):
        # find the end of this pattern
        end_ix = i + n_steps_in
        out_end_ix = end_ix + n_steps_out
        # check if we are beyond the sequence
        if out_end_ix > len(sequence):
            break
        # gather input and output parts of the pattern
        seq_x, seq_y = sequence[i:end_ix], sequence[end_ix:out_end_ix]
        X.append(seq_x)
        y.append(seq_y)
    return array(X), array(y)

n_steps_in, n_steps_out = 30, 30
X1, y1 = split_univariate_sequence(sumpred, n_steps_in, n_steps_out)

transformer = MinMaxScaler()
X1_transformed = transformer.fit_transform(X1)

n_features = 1
X1_transformed = X1_transformed.reshape((X1_transformed.shape[0], X1_transformed.shape[1], n_features))

model = Sequential()
model.add(LSTM(150, activation='relu', return_sequences=True, input_shape=(n_steps_in, n_features)))
model.add(Dropout(0.3))
model.add(LSTM(50, activation='relu'))
model.add(Dropout(0.3))
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')
model.fit(X1_transformed, y1, epochs=1000, verbose=1)

# demonstrate prediction
x_input = sumpred[-30:].reshape(1, -1)
x_input = transformer.transform(x_input)
x_input = x_input.reshape((1, n_steps_in, n_features))
yhat = model.predict(x_input, verbose=1)
yhat_inverse = transformer.inverse_transform(yhat)

sumpred is a float32 array of shape (144,) with values between 390.624 and 347471. I'm trying to predict the next 30 numbers based on the last 30 sumpred values. When I train the model, I get results like this:

Epoch 990/1000
85/85 [==============================] - 0s 2ms/step - loss: 1031220211.9529
Epoch 991/1000
85/85 [==============================] - 0s 2ms/step - loss: 1087168440.4706
Epoch 992/1000
85/85 [==============================] - 0s 2ms/step - loss: 1011368153.6000
Epoch 993/1000
85/85 [==============================] - 0s 2ms/step - loss: 1104842800.1882
Epoch 994/1000
85/85 [==============================] - 0s 2ms/step - loss: 1086514331.1059
Epoch 995/1000
85/85 [==============================] - 0s 2ms/step - loss: 1050088100.8941
Epoch 996/1000
85/85 [==============================] - 0s 2ms/step - loss: 1003426751.2471
Epoch 997/1000
85/85 [==============================] - 0s 2ms/step - loss: 1139417025.5059
Epoch 998/1000
85/85 [==============================] - 0s 2ms/step - loss: 1129283814.4000
Epoch 999/1000
85/85 [==============================] - 0s 2ms/step - loss: 1107968009.0353
Epoch 1000/1000
85/85 [==============================] - 0s 2ms/step - loss: 1651960831.6235

The values in yhat_inverse are far beyond what I expect. It was not better with other losses, like mean squared logarithmic error. Even with the data transformation (MinMaxScaler) and the Dropout layers, I'm still having this issue. Does anyone have a clue how to improve my model's performance?
Your model is not able to learn, so first increase the size of the network. Given the magnitude of the loss, the values being predicted are quite large and you are not giving the neural network enough capacity to learn the data. Remove the dropouts first and just increase the layers, keeping them all at 150 units or more. Dropout is usually added towards the end, when you see overfitting; your model has not even started learning.
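A minimal sketch of that suggestion, reusing the question's variables (the two-layer, 150-unit shape is one reading of "increase the layers and keep them all at 150 or more"):

from keras.models import Sequential
from keras.layers import LSTM, Dense

model = Sequential()
model.add(LSTM(150, activation='relu', return_sequences=True, input_shape=(n_steps_in, n_features)))
model.add(LSTM(150, activation='relu'))  # dropout layers removed until the model actually starts learning
model.add(Dense(n_steps_out))
model.compile(optimizer='adam', loss='mse')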
Keras Multi-layer Neural Network Accuracy
I've built a simplistic multi-layer NN using Keras with precipitation data from Australia. The code takes 4 input columns: ['MinTemp', 'MaxTemp', 'Rainfall', 'WindGustSpeed'] and trains against the RainTomorrow output. I've partitioned the data into training/test buckets and transformed all values into 0 <= n <= 1. When I try to run model.fit, my loss values steady at ~13.2, but my accuracy is always 0.0. An example of logged fitting intervals:

...
Epoch 37/200
113754/113754 [==============================] - 0s 2us/step - loss: -13.1274 - acc: 0.0000e+00 - val_loss: -16.1168 - val_acc: 0.0000e+00
Epoch 38/200
113754/113754 [==============================] - 0s 2us/step - loss: -13.1457 - acc: 0.0000e+00 - val_loss: -16.1168 - val_acc: 0.0000e+00
Epoch 39/200
113754/113754 [==============================] - 0s 2us/step - loss: -13.1315 - acc: 0.0000e+00 - val_loss: -16.1168 - val_acc: 0.0000e+00
Epoch 40/200
113754/113754 [==============================] - 0s 2us/step - loss: -13.1797 - acc: 0.0000e+00 - val_loss: -16.1168 - val_acc: 0.0000e+00
Epoch 41/200
113754/113754 [==============================] - 0s 2us/step - loss: -13.1844 - acc: 0.0000e+00 - val_loss: -16.1169 - val_acc: 0.0000e+00
Epoch 42/200
113754/113754 [==============================] - 0s 2us/step - loss: -13.2205 - acc: 0.0000e+00 - val_loss: -16.1169 - val_acc: 0.0000e+00
Epoch 43/200
...

How can I amend the following script so that my accuracy grows and my prediction output returns a value between 0 and 1 (0: no rain, 1: rain)?

import keras
import sklearn.model_selection
import numpy as np
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.preprocessing import MinMaxScaler

labelencoder = LabelEncoder()

# read data, replace NaN with 0.0
csv_data = pd.read_csv('weatherAUS.csv', header=0)
csv_data = csv_data.replace(np.nan, 0.0, regex=True)

# Input/output columns scaled to 0<=n<=1
x = csv_data.loc[:, ['MinTemp', 'MaxTemp', 'Rainfall', 'WindGustSpeed']]
y = labelencoder.fit_transform(csv_data['RainTomorrow'])
scaler_x = MinMaxScaler(feature_range=(-1, 1))
x = scaler_x.fit_transform(x)
scaler_y = MinMaxScaler(feature_range=(-1, 1))
y = scaler_y.fit_transform([y])[0]

# Partitioned data for training/testing
x_train, x_test, y_train, y_test = sklearn.model_selection.train_test_split(x, y, test_size=0.2)

# model
model = keras.models.Sequential()
model.add(keras.layers.normalization.BatchNormalization(input_shape=tuple([x_train.shape[1]])))
model.add(keras.layers.core.Dense(4, activation='relu'))
model.add(keras.layers.core.Dropout(rate=0.5))
model.add(keras.layers.normalization.BatchNormalization())
model.add(keras.layers.core.Dense(4, activation='relu'))
model.add(keras.layers.core.Dropout(rate=0.5))
model.add(keras.layers.normalization.BatchNormalization())
model.add(keras.layers.core.Dense(4, activation='relu'))
model.add(keras.layers.core.Dropout(rate=0.5))
model.add(keras.layers.core.Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='rmsprop', metrics=["accuracy"])

callback_early_stopping = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10, verbose=0, mode='auto')

model.fit(x_train, y_train, batch_size=1024, epochs=200, validation_data=(x_test, y_test), verbose=1,
          callbacks=[callback_early_stopping])

y_test = model.predict(x_test.values)
As you can see, the sigmoid activation function that you are using in your neural network's output (the last layer) ranges from 0 to 1. Note that your label (y) is rescaled to -1 to 1. I suggest you change the y range to 0 to 1 and keep the sigmoid output.
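A minimal sketch of that change, reusing the question's variable names:

scaler_y = MinMaxScaler(feature_range=(0, 1))  # match the sigmoid's [0, 1] output range
y = scaler_y.fit_transform([y])[0]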
So the sigmoid ranges from 0 to 1, while your MinMaxScaler scales the data to -1 to 1. You can fix this by replacing 'sigmoid' in the output layer with 'tanh', as tanh outputs values ranging from -1 to 1.
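The corresponding one-line change to the question's output layer, as a sketch:

model.add(keras.layers.core.Dense(1, activation='tanh'))  # tanh covers the [-1, 1] target range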
Both of the other answers can be used to address the fact that your network output is not in the same range as your y vector values: either adjust your final layer to a tanh activation, or change the y-vector range to [0, 1]. However, your network's loss function and metric are defined for classification purposes, whereas you are attempting regression (continuous values between [-1, 1]). The most common choices there are mean squared error and mean absolute error, so I suggest you change the following:

model.compile(loss='mse', optimizer='rmsprop', metrics=['mse', 'mae'])
Keras model.predict always predicts 1
I'm working on an artificial intelligence project and I want to predict the bitcoin trend, but while using the model.predict function from Keras with my test_set, the prediction is always equal to 1, and the line in my diagram is therefore always straight.

import csv
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from cryptory import Cryptory
from keras.models import Sequential, Model, InputLayer
from keras.layers import LSTM, Dropout, Dense
from sklearn.preprocessing import MinMaxScaler

def format_to_3d(df_to_reshape):
    reshaped_df = np.array(df_to_reshape)
    return np.reshape(reshaped_df, (reshaped_df.shape[0], 1, reshaped_df.shape[1]))

crypto_data = Cryptory(from_date="2014-01-01")
bitcoin_data = crypto_data.extract_coinmarketcap("bitcoin")

sc = MinMaxScaler()

for col in bitcoin_data.columns:
    if col != "open":
        del bitcoin_data[col]

training_set = bitcoin_data
training_set = sc.fit_transform(training_set)

# Split the data into train, validate and test
train_data = training_set[365:]

# Split the data into x and y
x_train, y_train = train_data[:len(train_data)-1], train_data[1:]

model = Sequential()
model.add(LSTM(units=4, input_shape=(None, 1)))  # 128 -- neurons**?
# model.add(Dropout(0.2))
model.add(Dense(units=1, activation="softmax"))  # activation function could be different
model.compile(optimizer="adam", loss="mean_squared_error")  # mse could be used for loss, look into optimiser
model.fit(format_to_3d(x_train), y_train, batch_size=32, epochs=15)

test_set = bitcoin_data
test_set = sc.transform(test_set)
test_data = test_set[:364]

input = test_data
input = sc.inverse_transform(input)
input = np.reshape(input, (364, 1, 1))

predicted_result = model.predict(input)
print(predicted_result)

real_value = sc.inverse_transform(input)

plt.plot(real_value, color='pink', label='Real Price')
plt.plot(predicted_result, color='blue', label='Predicted Price')
plt.title('Bitcoin Prediction')
plt.xlabel('Time')
plt.ylabel('Prices')
plt.legend()
plt.show()

The training set performance looks like this:

1566/1566 [==============================] - 3s 2ms/step - loss: 0.8572
Epoch 2/15
1566/1566 [==============================] - 1s 406us/step - loss: 0.8572
Epoch 3/15
1566/1566 [==============================] - 1s 388us/step - loss: 0.8572
Epoch 4/15
1566/1566 [==============================] - 1s 388us/step - loss: 0.8572
Epoch 5/15
1566/1566 [==============================] - 1s 389us/step - loss: 0.8572
Epoch 6/15
1566/1566 [==============================] - 1s 392us/step - loss: 0.8572
Epoch 7/15
1566/1566 [==============================] - 1s 408us/step - loss: 0.8572
Epoch 8/15
1566/1566 [==============================] - 1s 459us/step - loss: 0.8572
Epoch 9/15
1566/1566 [==============================] - 1s 400us/step - loss: 0.8572
Epoch 10/15
1566/1566 [==============================] - 1s 410us/step - loss: 0.8572
Epoch 11/15
1566/1566 [==============================] - 1s 395us/step - loss: 0.8572
Epoch 12/15
1566/1566 [==============================] - 1s 386us/step - loss: 0.8572
Epoch 13/15
1566/1566 [==============================] - 1s 385us/step - loss: 0.8572
Epoch 14/15
1566/1566 [==============================] - 1s 393us/step - loss: 0.8572
Epoch 15/15
1566/1566 [==============================] - 1s 397us/step - loss: 0.8572

I'm supposed to print a plot with the real price and the predicted price. The real price is displayed properly, but the predicted price is only a straight line, because model.predict only ever returns the value 1. Thanks in advance!
You're trying to predict a price value, that is, you're aiming at solving a regression problem and not a classification problem. However, in the last layer of your network (model.add(Dense(units=1, activation="softmax"))), you have a single neuron (which would be adequate for a regression problem), but you've chosen to use a softmax activation function. The softmax function is used in multi-class classification problems to normalize the outputs into a probability distribution. If you have a single output neuron and you apply softmax, the final result will always be 1.0, as it is the only parameter of the probability distribution. In summary, for regression problems you do not use an activation function on the output (equivalently, you use a linear one), as the network is intended to output the predicted value directly.
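As a sketch, the question's output layer without an activation (Keras defaults to a linear activation when none is given):

model.add(Dense(units=1))  # linear output: the network emits the predicted price directly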
Keras network fit: loss is 'nan', accuracy doesn't change
I am trying to fit a Keras network, but in each epoch the loss is 'nan' and the accuracy doesn't change... I tried changing the number of epochs, the layer count, the neuron count, the learning rate and the optimizers; I checked for nan values in the datasets and normalized the data in different ways, but the problem was not solved. Thanks for your help.

np.random.seed(1337)

# example of input vector: [-1.459746, 0.2694708, ... 0.90043]
# example of output vector: [1, 0] or [0, 1]

model = Sequential()
model.add(Dense(1000, activation='tanh', init='normal', input_dim=503))
model.add(Dense(2, init='normal', activation='softmax'))

opt = optimizers.sgd(lr=0.01)
model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=['accuracy'])
print(model.summary())

model.fit(x_train, y_train, batch_size=1000, nb_epoch=100, verbose=1)

99804/99804 [==============================] - 5s 52us/step - loss: nan - acc: 0.4938
Epoch 1/100
99804/99804 [==============================] - 5s 49us/step - loss: nan - acc: 0.4938
Epoch 2/100
99804/99804 [==============================] - 5s 51us/step - loss: nan - acc: 0.4938
Epoch 3/100
99804/99804 [==============================] - 5s 52us/step - loss: nan - acc: 0.4938
Epoch 4/100
99804/99804 [==============================] - 5s 52us/step - loss: nan - acc: 0.4938
Epoch 5/100
99804/99804 [==============================] - 5s 51us/step - loss: nan - acc: 0.4938
...
Oh, the problem has been found! After normalization, a NaN appeared in one of the input-vector features.
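A quick sketch of a check that catches this kind of bad input before training (assuming x_train and y_train are NumPy arrays):

import numpy as np

assert not np.isnan(x_train).any(), "NaN found in input vectors"
assert not np.isnan(y_train).any(), "NaN found in target vectors"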
First convert your output to categorical, as described in the Keras documentation: "Note: when using the categorical_crossentropy loss, your targets should be in categorical format. In order to convert integer targets into categorical targets, you can use the Keras utility to_categorical":

from keras.utils import to_categorical
categorical_labels = to_categorical(int_labels, num_classes=None)
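A tiny usage sketch of that utility:

from keras.utils import to_categorical

print(to_categorical([0, 1, 1]))
# [[1. 0.]
#  [0. 1.]
#  [0. 1.]]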