Regression model output looks inconsistent - python
Some context about my project: I intend to study how various bullet parameters affect the ballistic coefficient (i.e. bullet performance) of the projectile. The parameters include weight, caliber, sectional density, etc. I suspect I have done this all wrong, though; I am just reading through tutorials and applying whatever seems useful and relevant to my project.
The output of my regression model looks a bit off to me: the model reports an MSE of 0.0201 in every epoch of the model.fit() part of my program.
Also, model.predict(X) seems to have an accuracy of 100%, which does not seem right. I borrowed some code from a tutorial on Keras models to print the model output next to the expected output.
This is the program that constructs and trains the model:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.utils import shuffle
import tensorflow as tf
from tensorflow.keras.callbacks import TensorBoard
from pandas.plotting import scatter_matrix
import time
name = 'Bullet Database Analysis v2-{}'.format(int(time.time()))
tensorboard = TensorBoard(log_dir='logs/{}'.format(name))
physical_devices = tf.config.list_physical_devices('GPU')
tf.config.experimental.set_memory_growth(physical_devices[0], True)
df = pd.read_csv('Bullet Optimization\ShootForum Bullet DB_2.csv')
from sklearn.model_selection import train_test_split
from sklearn import preprocessing
dataset = df.values
X = dataset[:,0:12]
X = np.asarray(X).astype(np.float32)
y = dataset[:,13]
y = np.asarray(y).astype(np.float32)
X_train, X_val_and_test, y_train, y_val_and_test = train_test_split(X, y, test_size=0.3, shuffle=True)
X_val, X_test, y_val, y_test = train_test_split(X_val_and_test, y_val_and_test, test_size=0.5)
from keras.models import Sequential
from keras.layers import Dense, BatchNormalization
model = Sequential(
    [
        # 2430 is the shape of X_train
        # BatchNormalization(axis=-1, momentum = 0.1),
        Dense(2430, activation='relu'),
        Dense(32, activation='relu'),
        Dense(1),
    ]
)
model.compile(loss='mse', metrics=['mse'])
history = model.fit(X_train, y_train,
                    batch_size=64,
                    epochs=20,
                    validation_data=(X_val, y_val),
                    # callbacks = [tensorboard]
                    )
# plt.plot(history.history['loss'],'r')
# plt.plot(history.history['val_loss'],'m')
plt.plot(history.history['mse'],'b')
plt.show()
model.summary()
model.save("Bullet Optimization\Bullet Database Analysis.h5")
Here is my code that loads the previously trained model from the .h5 file:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from keras.models import load_model
import pandas as pd
df = pd.read_csv('Bullet Optimization\ShootForum Bullet DB_2.csv')
model = load_model('Bullet Optimization\Bullet Database Analysis.h5')
dataset = df.values
X = dataset[:,0:12]
y = dataset[:,13]
model.fit(X,y, epochs=10)
#predictions = np.argmax(model.predict(X), axis=-1)
predictions = model.predict(X)
# summarize the first 5 cases
for i in range(5):
    print('%s => %d (expected %d)' % (X[i].tolist(), predictions[i], y[i]))
This is the output
Epoch 1/10
2021-03-09 10:38:06.372303: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublas64_11.dll
2021-03-09 10:38:07.747241: I tensorflow/stream_executor/platform/default/dso_loader.cc:49] Successfully opened dynamic library cublasLt64_11.dll
109/109 [==============================] - 2s 4ms/step - loss: 0.0201 - mse: 0.0201
Epoch 2/10
109/109 [==============================] - 1s 5ms/step - loss: 0.0201 - mse: 0.0201
Epoch 3/10
109/109 [==============================] - 0s 4ms/step - loss: 0.0201 - mse: 0.0201
Epoch 4/10
109/109 [==============================] - 0s 5ms/step - loss: 0.0201 - mse: 0.0201
Epoch 5/10
109/109 [==============================] - 1s 5ms/step - loss: 0.0201 - mse: 0.0201
Epoch 6/10
109/109 [==============================] - 1s 5ms/step - loss: 0.0201 - mse: 0.0201
Epoch 7/10
109/109 [==============================] - 1s 5ms/step - loss: 0.0201 - mse: 0.0201
Epoch 8/10
109/109 [==============================] - 0s 4ms/step - loss: 0.0201 - mse: 0.0201
Epoch 9/10
109/109 [==============================] - 1s 5ms/step - loss: 0.0201 - mse: 0.0201
Epoch 10/10
109/109 [==============================] - 0s 4ms/step - loss: 0.0201 - mse: 0.0201
[0.314, 7.9756, 100.0, 100.0, 31.4, 0.00314, 318.4713376, 6.480041472000001, 0.51, 12.95400001, 4.067556004, 0.145] => 0 (expected 0)
[0.358, 9.0932, 148.0, 148.0, 52.983999999999995, 0.002418919, 413.4078212, 9.590461379, 0.635, 16.12900002, 5.774182006, 0.165] => 0 (expected 0)
[0.313, 7.9502, 83.0, 83.0, 25.979, 0.003771084, 265.1757188, 5.378434422000001, 0.504, 12.80160001, 4.006900804, 0.121] => 0 (expected 0)
[0.251, 6.3754, 50.0, 50.0, 12.55, 0.00502, 199.20318730000002, 3.2400207360000004, 0.4, 10.16000001, 2.5501600030000002, 0.113] => 0 (expected 0)
[0.251, 6.3754, 50.0, 50.0, 12.55, 0.00502, 199.20318730000002, 3.2400207360000004, 0.41, 10.41400001, 2.613914003, 0.113] => 0 (expected 0)
Here is a link to my training dataset. Within my code, I used train_test_split to create both the test and train dataset.
Lastly, is there a way within TensorBoard to visualize how the model fits the dataset? I really feel that although my model is training, it is not fitting the data in any significant way, even though the MSE is reduced.
This happens because you have NaN values in your dataset. You can check for them before splitting with df.isna().sum(). NaNs can have a negative impact on your network. Here I simply dropped the affected rows (df.dropna(inplace = True, axis = 0)), but you could also use an imputation technique to replace them.
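As a rough illustration of the NaN check and the two ways of handling missing values mentioned above, here is a minimal sketch. The small DataFrame and its column names (caliber, weight, bc) are made up for the example and are not the real bullet database; median imputation is just one of several possible strategies.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the bullet CSV; the column names are invented.
df = pd.DataFrame({
    "caliber": [0.314, 0.358, np.nan, 0.251],
    "weight": [100.0, 148.0, 83.0, np.nan],
    "bc": [0.145, 0.165, 0.121, 0.113],
})

# Count missing values per column before splitting into train/val/test.
nan_counts = df.isna().sum()
print(nan_counts)

# Option 1: drop every row that contains at least one NaN.
dropped = df.dropna(axis=0)

# Option 2: impute, here with the per-column median (an assumed choice).
imputed = df.fillna(df.median())

print(len(df), len(dropped), int(imputed.isna().sum().sum()))
```

Dropping rows is the simplest option but discards data; imputation keeps every row at the cost of introducing estimated values.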
Also, 2430 neurons is probably overkill for this data; start with fewer neurons.
model = tf.keras.models.Sequential(
[
tf.keras.layers.Dense(512, activation='relu'),
tf.keras.layers.Dense(32, activation='relu'),
tf.keras.layers.Dense(1),
]
)
Here is the last epoch:
Epoch 20/20
27/27 [==============================] - 0s 8ms/step - loss: 8.2077e-04 - mse: 8.2077e-04 - val_loss: 8.5023e-04 - val_mse: 8.5023e-04
For a regression problem, computing accuracy directly is not meaningful. You can use model.evaluate(X_test, y_test), or take the predictions from model.predict and compute other regression metrics to measure how close they are to the targets.
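To make the idea of regression metrics concrete, here is a small NumPy-only sketch computing MAE, RMSE and R² by hand. The target and prediction values are hypothetical. It also shows one likely reason the printed predictions in the question all read "0 (expected 0)": the %d format truncates floats to integers, so values below 1 all print as 0.

```python
import numpy as np

# Hypothetical targets and predictions, chosen only for illustration.
y_true = np.array([0.145, 0.165, 0.121, 0.113])
y_pred = np.array([0.150, 0.160, 0.130, 0.110])

# Mean absolute error and root mean squared error.
mae = np.mean(np.abs(y_true - y_pred))
rmse = np.sqrt(np.mean((y_true - y_pred) ** 2))

# Coefficient of determination (R^2).
ss_res = np.sum((y_true - y_pred) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2 = 1.0 - ss_res / ss_tot

# '%d' truncates floats to integers, so small values all print as 0;
# a float format such as '%.3f' shows the actual numbers.
as_int = '%d (expected %d)' % (y_pred[0], y_true[0])
as_float = '%.3f (expected %.3f)' % (y_pred[0], y_true[0])

print(mae, rmse, r2)
print(as_int)
print(as_float)
```

With a float format, the difference between prediction and target becomes visible, which is what the metrics summarize over the whole test set.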
Related
Evaluation accuracy stays the same while test accuracy increases just fine
I have tried over and over with different approaches to building this model; however, I continue to run into this issue where my training accuracy steadily increases just fine but my validation and evaluation accuracy remain very low (55% - 65%).
Epoch 95/100
119/119 [==============================] - 0s 2ms/step - loss: 0.6326 - accuracy: 0.8057 - val_loss: 2.0461 - val_accuracy: 0.5985
Epoch 96/100
119/119 [==============================] - 0s 2ms/step - loss: 0.6485 - accuracy: 0.7990 - val_loss: 1.9512 - val_accuracy: 0.5909
Epoch 97/100
119/119 [==============================] - 0s 2ms/step - loss: 0.6263 - accuracy: 0.8032 - val_loss: 2.0344 - val_accuracy: 0.5682
Epoch 98/100
119/119 [==============================] - 0s 2ms/step - loss: 0.6249 - accuracy: 0.7990 - val_loss: 2.0183 - val_accuracy: 0.5682
Epoch 99/100
119/119 [==============================] - 0s 2ms/step - loss: 0.6189 - accuracy: 0.8007 - val_loss: 2.0818 - val_accuracy: 0.5758
Epoch 100/100
119/119 [==============================] - 0s 2ms/step - loss: 0.6261 - accuracy: 0.8024 - val_loss: 2.0591 - val_accuracy: 0.5833
18/18 [==============================] - 0s 1ms/step - loss: 2.2385 - accuracy: 0.5628
EVAL: [2.238506317138672, 0.5628318786621094]
The entire script is as follows:
import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from keras import Sequential
from keras.layers import Dense, Dropout
from keras.optimizers import Adam
import numpy as np
from keras.utils import to_categorical
def count_classes(y: list):
    counts = {}
    for i in y:
        if i in counts.keys():
            counts[i] += 1
        else:
            counts[i] = 1
    return counts
dataset = pd.read_csv('Dementia-data.csv')
X= dataset.iloc[:,1:]
y= dataset.iloc[:,0]
# roughly 2562 input variables
X.head(2)
#standardizing the input feature
X_train, X_test, y_train, y_test = train_test_split(X, y, shuffle = True, random_state = 0, test_size=0.2)
print("TRAIN:")
print(count_classes(list(y_train)))
print("TEST:")
print(count_classes(list(y_test)))
sc = StandardScaler()
scaler = sc.fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
optimizer = Adam(lr=0.00005)
classifier = Sequential()
#First Hidden Layer
classifier.add(Dense(32, activation='relu', kernel_initializer='random_normal', kernel_regularizer=regularizers.l2(0.005), input_dim=2562))
#Second Hidden Layer
classifier.add(Dropout(0.2))
classifier.add(Dense(64, activation='relu', kernel_initializer='random_normal', kernel_regularizer=regularizers.l2(0.005)))
#Output Layer
classifier.add(Dropout(0.2))
classifier.add(Dense(64, activation='relu', kernel_initializer='random_normal'))
classifier.add(Dense(7, activation='softmax', kernel_initializer='random_normal'))
#Compiling the neural network
classifier.compile(optimizer =optimizer,loss='sparse_categorical_crossentropy', metrics =['accuracy'])
#Fitting the data to the training dataset
classifier.fit(X_train,y_train, batch_size=20, epochs=200, validation_split=0.1, shuffle=True)
eval_model=classifier.evaluate(X_test, y_test)
print("EVAL:")
print(eval_model)
I have tried all sorts of things to combat overfitting, such as regulators, dropouts, and data splitting, because that seems to be the leading cause of problems like this one. I do not know if I am missing something. The class distribution for the test and train data is as follows; it doesn't look like there are any inconsistencies between the 2 datasets in terms of ratios of classes:
TRAIN: {2: 822, 4: 136, 6: 229, 5: 184, 3: 76, 1: 57}
TEST: {2: 199, 6: 59, 5: 45, 1: 26, 3: 15, 4: 33}
The plotted accuracy for training and testing:
The plotted loss for training and testing, the loss clearly increases for the testing data:
I have been struggling with this for weeks now and I'd truly appreciate any help to get this working. I have also tried using learning rates between 0.05 and 0.00005 with little to no improvement.
ANN regression problem with high loss - Python Pandas
I try to run an artificial neural network with 2 parameters in input that can give me the value of the command. An example of the dataset in CSV file: P1,P2,S 7.03,3.36,787.75 6.11,3.31,491.06 5.92,3.34,480.4 5.0,3.39,469.77 5.09,3.36,481.14 5.05,3.35,502.2 4.97,3.38,200.75 5.01,3.34,464.36 5.0,3.42,475.1 4.94,3.36,448.8 4.97,3.37,750.3 5.1,3.39,344.93 5.03,3.41,199.75 5.03,3.39,484.35 5.0,3.47,483.17 4.91,3.42,485.29 3.65,3.51,513.81 5.08,3.47,443.94 5.06,3.4,473.77 5.0,3.42,535.78 3.45,3.44,483.23 4.94,3.45,449.49 4.94,3.51,345.14 5.05,3.48,2829.14 5.01,3.45,1465.58 4.96,3.45,1404.53 3.35,3.58,453.09 5.09,3.47,488.02 5.12,3.52,451.12 5.15,3.54,457.48 5.07,3.53,458.07 5.11,3.5,458.69 5.11,3.47,448.13 5.01,3.42,474.44 4.92,3.44,443.44 5.08,3.53,476.89 5.01,3.49,505.67 5.01,3.47,451.82 4.95,3.49,460.96 5.14,3.42,422.13 5.14,3.42,431.44 5.03,3.46,476.09 4.95,3.53,486.88 5.03,3.42,489.81 5.07,3.45,544.39 5.01,3.52,630.21 5.16,3.49,484.47 5.03,3.52,450.83 5.12,3.48,505.6 5.13,3.54,8400.34 4.99,3.49,615.57 5.13,3.46,673.72, 5.19,3.52,522.31 5.11,3.52,417.29 5.15,3.49,454.97 4.96,3.55,3224.72 5.12,3.54,418.85 5.06,3.53,489.87 5.05,3.45,433.04, 5.0,3.46,491.56 12.93,3.48,3280.98 5.66,3.5,428.5 4.98,3.59,586.43 4.96,3.51,427.67 5.06,3.54,508.53 4.88,3.49,1040.43 5.11,3.52,467.79 5.18,3.54,512.79 5.11,3.52,560.05 5.08,3.53,913.69 5.12,3.53,521.1 5.15,3.52,419.24 5.12,3.56,527.72 5.03,3.52,478.1 5.1,3.55,450.32 5.08,3.53,451.12 4.89,3.53,514.78 4.92,3.46,469.23 5.03,3.53,507.8 4.96,3.56,2580.22 4.99,3.52,516.24 5.0,3.55,525.96 3.66,3.61,450.69 4.91,3.53,487.98 4.97,3.54,443.86 3.53,3.57,628.8 5.02,3.51,466.91 6.41,3.46,430.19 5.0,3.58,589.98 5.06,3.55,711.22 5.26,3.55,2167.16 6.59,3.53,380.59 6.12,3.47,723.56 6.08,3.47,404.59 6.09,3.49,509.5 5.75,3.52,560.21 5.11,3.58,414.83 5.56,3.17,411.22 6.66,3.26,219.38 5.52,3.2,422.13 7.91,3.22,464.87 7.14,3.2,594.18 6.9,3.21,491.0 6.98,3.28,642.09 6.39,3.22,394.49 5.82,3.19,616.82 5.71,3.13,479.6 5.31,3.1,430.6 6.19,3.34,435.42 
4.88,3.42,518.14 4.88,3.36,370.93 4.88,3.4,193.36 5.11,3.47,430.06 4.77,3.46,379.38 5.34,3.39,465.39 6.27,3.29,413.8 6.22,3.19,633.28 5.22,3.45,444.14 4.08,3.42,499.91 3.57,3.48,534.41 4.1,3.48,373.8 4.13,3.49,443.57 4.07,3.48,463.74 4.13,3.46,419.92 4.21,3.44,457.76 4.13,3.41,339.31 4.23,3.51,893.39 4.11,3.45,392.54 4.99,3.44,472.96 4.96,3.45,192.54 5.0,3.48,191.22 5.25,3.43,425.64 5.11,3.41,191.12 5.06,3.44,422.32 5.08,3.44,973.29 5.23,3.43,400.67 5.15,3.44,404.2 6.23,3.46,383.07 6.07,3.37,484.3 6.17,3.44,549.94 4.7,3.45,373.43 5.56,3.41,379.33 5.12,3.45,357.51 5.87,3.42,349.89 5.49,3.44,374.4 5.14,3.44,361.11 6.09,3.46,521.23 5.68,3.5,392.98 5.04,3.44,406.9 5.07,3.42,360.8 5.14,3.38,406.48 4.14,3.56,362.45 4.09,3.48,421.83 4.1,3.48,473.64 4.04,3.53,378.35 4.16,3.47,424.59 4.07,3.47,366.27 3.53,3.59,484.37 4.07,3.51,417.12 4.21,3.49,2521.87 4.15,3.5,458.69 4.08,3.52,402.48 4.2,3.47,373.26 3.69,3.5,486.62 4.24,3.51,402.12 4.19,3.5,414.79 4.13,3.55,390.08 4.2,3.5,452.96 4.06,3.52,524.97 4.22,3.47,442.46 4.07,3.5,403.13 4.07,3.51,404.54 4.17,3.46,393.33 4.1,3.4,430.81 4.05,3.41,365.2 4.11,3.47,412.8 4.13,3.49,431.14 4.03,3.51,417.5 3.9,3.48,386.62 4.16,3.49,351.71 5.18,3.48,351.43 4.49,3.5,336.33 3.7,3.51,551.8 6.39,3.44,369.79 6.74,3.35,408.57 6.0,3.38,2924.54 6.61,3.36,449.27 4.91,3.42,361.8 5.81,3.43,470.62 5.8,3.48,389.52 4.81,3.45,403.57 5.75,3.43,570.8 5.68,3.42,405.9 5.9,3.4,458.53 6.51,3.45,374.3 6.63,3.38,406.68 6.85,3.35,382.9 6.8,3.46,398.47 4.81,3.47,398.39 8.3,3.48,538.2 The code : import pandas as pd import matplotlib.pyplot as plt plt.style.use('ggplot') concatenation = pd.read_csv('concatenation.csv') X = concatenation.iloc[:, :2].values # 2 columns y = concatenation.iloc[:, 2].values # 1 column from sklearn.model_selection import train_test_split X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.3, random_state = 0) from sklearn.preprocessing import StandardScaler sc = StandardScaler() X_train = sc.fit_transform(X_train) X_test 
= sc.transform(X_test)
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense
model = Sequential()
model.add(Dense(units=128, activation='relu'))
model.add(Dense(units=64, activation='relu'))
model.add(Dense(units=1, activation='linear'))
model.compile(loss='mean_squared_error', optimizer='adam')
model.fit(X_train, y_train, epochs= 1000)
But I have a problem during the training, I have high loss, I can not understand why?
Epoch 1/1000
10/10 [==============================] - 1s 22ms/step - loss: 407736.7188 - mae: 431.3878 - val_loss: 269746.6875 - val_mae: 380.4598
Epoch 2/1000
10/10 [==============================] - 0s 7ms/step - loss: 407391.1875 - mae: 431.0146 - val_loss: 269452.0625 - val_mae: 380.0934
Epoch 3/1000
10/10 [==============================] - 0s 8ms/step - loss: 407016.3750 - mae: 430.5912 - val_loss: 269062.3125 - val_mae: 379.6077
Epoch 4/1000
10/10 [==============================] - 0s 7ms/step - loss: 406472.7188 - mae: 430.0183 - val_loss: 268508.0312 - val_mae: 378.9190
Epoch 5/1000
10/10 [==============================] - 0s 9ms/step - loss: 405686.1562 - mae: 429.1566 - val_loss: 267709.7812 - val_mae: 377.9213
...
I checked that I didn't have a null value, I standardized my X_train I didn't touch the outputs and I am well in case of regression with the right optimizer and the right loss function... so I can't understand why
Deep learning train large numbers
When I try to create a model to predict this data, I can't get a good loss. How can I optimize it?
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from sklearn import preprocessing
from sklearn.model_selection import train_test_split
X = np.array([32878265.2, 39635188.8, 738222697.41, 33921812.23, 39364408, 50854015, 50938146.63, 54062184.4, 32977734, 27267164, 30673902.72])
y = np.array([80712, 111654, 127836.61, 128710, 147907, 152862, 154962, 138503, 140238, 105121, 113211.8])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.1)
scaler = preprocessing.StandardScaler().fit(X_train.reshape(-1, 1))
X_scaled = scaler.transform(X_train.reshape(-1, 1))
tf.random.set_seed(42)
model = tf.keras.Sequential([
    # tf.keras.layers.Dense(10),
    tf.keras.layers.Dense(1, input_shape=[1]),
    tf.keras.layers.Dense(1),
])
model.compile(loss=tf.keras.losses.mae, optimizer=tf.keras.optimizers.Adam(), metrics=["mae"])
model.fit(X_scaled, y_train, epochs=5000, validation_data=(X_test, y_test))
Epoch 2466/5000
1/1 [==============================] - 0s 29ms/step - loss: 38000588.0000 - mae: 38000588.0000 - val_loss: 28384532.0000 - val_mae: 28384532.0000
Epoch 2467/5000
1/1 [==============================] - 0s 31ms/step - loss: 38000588.0000 - mae: 38000588.0000 - val_loss: 28384536.0000 - val_mae: 28384536.0000
Epoch 2468/5000
1/1 [==============================] - 0s 41ms/step - loss: 38000588.0000 - mae: 38000588.0000 - val_loss: 28384540.0000 - val_mae: 28384540.0000
Epoch 2469/5000
1/1 [==============================] - 0s 41ms/step - loss: 38000588.0000 - mae: 38000588.0000 - val_loss: 28384536.0000 - val_mae: 28384536.0000
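Your NN model is just a linear regression, since neither Dense layer has a nonlinear activation. If you plot the data, you can see an outlier (the third X value, 738222697.41, is more than ten times larger than the others), which is the main obstacle to a good prediction. My guess is that you typed one digit too many.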
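The outlier can also be found without plotting. The sketch below uses the X values from the question and flags points with a modified z-score based on the median absolute deviation; the 3.5 cutoff is a common rule of thumb and an assumption here, not something stated in the answer.

```python
import numpy as np

# X values copied from the question.
X = np.array([32878265.2, 39635188.8, 738222697.41, 33921812.23, 39364408,
              50854015, 50938146.63, 54062184.4, 32977734, 27267164,
              30673902.72])

# Robust spread estimate: median absolute deviation (MAD).
median = np.median(X)
mad = np.median(np.abs(X - median))

# Modified z-score; 0.6745 rescales the MAD to be comparable to a
# standard deviation under normality. The 3.5 threshold is an assumed
# rule of thumb.
modified_z = 0.6745 * (X - median) / mad
mask = np.abs(modified_z) > 3.5

outliers = X[mask]
print(outliers)
```

A median-based rule is used instead of the mean and standard deviation because a single extreme value inflates both of those, which can hide the very point being looked for.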
Keras model.predict always predicts 1
I'm working on an Artificial Intelligence project and I want to predict the bitcoin trend, but when I use the model.predict function from Keras with my test_set, the prediction is always equal to 1 and the line in my diagram is therefore always straight.
import csv
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from cryptory import Cryptory
from keras.models import Sequential, Model, InputLayer
from keras.layers import LSTM, Dropout, Dense
from sklearn.preprocessing import MinMaxScaler
def format_to_3d(df_to_reshape):
    reshaped_df = np.array(df_to_reshape)
    return np.reshape(reshaped_df, (reshaped_df.shape[0], 1, reshaped_df.shape[1]))
crypto_data = Cryptory(from_date = "2014-01-01")
bitcoin_data = crypto_data.extract_coinmarketcap("bitcoin")
sc = MinMaxScaler()
for col in bitcoin_data.columns:
    if col != "open":
        del bitcoin_data[col]
training_set = bitcoin_data;
training_set = sc.fit_transform(training_set)
# Split the data into train, validate and test
train_data = training_set[365:]
# Split the data into x and y
x_train, y_train = train_data[:len(train_data)-1], train_data[1:]
model = Sequential()
model.add(LSTM(units=4, input_shape=(None, 1))) # 128 -- neurons**?
# model.add(Dropout(0.2))
model.add(Dense(units=1, activation="softmax")) # activation function could be different
model.compile(optimizer="adam", loss="mean_squared_error") # mse could be used for loss, look into optimiser
model.fit(format_to_3d(x_train), y_train, batch_size=32, epochs=15)
test_set = bitcoin_data
test_set = sc.transform(test_set)
test_data = test_set[:364]
input = test_data
input = sc.inverse_transform(input)
input = np.reshape(input, (364, 1, 1))
predicted_result = model.predict(input)
print(predicted_result)
real_value = sc.inverse_transform(input)
plt.plot(real_value, color='pink', label='Real Price')
plt.plot(predicted_result, color='blue', label='Predicted Price')
plt.title('Bitcoin Prediction')
plt.xlabel('Time')
plt.ylabel('Prices')
plt.legend()
plt.show()
The training set performance looks like this:
1566/1566 [==============================] - 3s 2ms/step - loss: 0.8572
Epoch 2/15
1566/1566 [==============================] - 1s 406us/step - loss: 0.8572
Epoch 3/15
1566/1566 [==============================] - 1s 388us/step - loss: 0.8572
Epoch 4/15
1566/1566 [==============================] - 1s 388us/step - loss: 0.8572
Epoch 5/15
1566/1566 [==============================] - 1s 389us/step - loss: 0.8572
Epoch 6/15
1566/1566 [==============================] - 1s 392us/step - loss: 0.8572
Epoch 7/15
1566/1566 [==============================] - 1s 408us/step - loss: 0.8572
Epoch 8/15
1566/1566 [==============================] - 1s 459us/step - loss: 0.8572
Epoch 9/15
1566/1566 [==============================] - 1s 400us/step - loss: 0.8572
Epoch 10/15
1566/1566 [==============================] - 1s 410us/step - loss: 0.8572
Epoch 11/15
1566/1566 [==============================] - 1s 395us/step - loss: 0.8572
Epoch 12/15
1566/1566 [==============================] - 1s 386us/step - loss: 0.8572
Epoch 13/15
1566/1566 [==============================] - 1s 385us/step - loss: 0.8572
Epoch 14/15
1566/1566 [==============================] - 1s 393us/step - loss: 0.8572
Epoch 15/15
1566/1566 [==============================] - 1s 397us/step - loss: 0.8572
I'm supposed to print a plot with the Real Price and the Predicted Price. The Real Price is displayed properly, but the Predicted Price is only a straight line because model.predict only returns the value 1. Thanks in advance!
You're trying to predict a price value, that is, you're aiming at solving a regression problem and not a classification problem. However, in the last layer of your network (model.add(Dense(units=1, activation="softmax"))), you have a single neuron (which would be adequate for a regression problem), but you've chosen to use a softmax activation function. The softmax function is used in multi-class classification problems to normalize the outputs into a probability distribution. If you have a single output neuron and you apply softmax, the final result will always be 1.0, as it is the only element of the probability distribution. In summary, for regression problems you do not use an activation function on the output layer, as the network is intended to output the predicted value directly.
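The effect of softmax on a single output unit can be checked numerically. This is a plain NumPy sketch of the softmax formula, not the Keras implementation; the input values are arbitrary.

```python
import numpy as np

def softmax(z):
    """Softmax over the last axis, shifted by the max for numerical stability."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Three samples, one output unit each: every result is exactly 1.0,
# whatever the pre-activation value was.
single = softmax(np.array([[-3.2], [0.0], [57.1]]))

# With several units the outputs form a distribution that sums to 1.
multi = softmax(np.array([[1.0, 2.0, 3.0]]))

print(single.ravel())
print(multi, multi.sum())
```

Because the single-unit output is constant, the gradient with respect to the weights is zero, which is consistent with the loss staying fixed at 0.8572 in every epoch.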
How to see why a keras / tensorflow model is getting stuck?
My code is:
from keras.models import Sequential
from keras.layers import Dense
import numpy
import pandas as pd
X = pd.read_csv(
    "data/train.csv", usecols=['Type', 'Age', 'Breed1', 'Breed2', 'Gender', 'Color1', 'Color2', 'Color3', 'MaturitySize', 'FurLength', 'Vaccinated', 'Dewormed', 'Sterilized', 'Health', 'Quantity', 'Fee', 'VideoAmt', 'PhotoAmt'])
Y = pd.read_csv(
    "data/train.csv", usecols=['AdoptionSpeed'])
model = Sequential()
model.add(Dense(18, input_dim=18, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
model.fit(X, Y, epochs=150, batch_size=100)
scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
I am trying to train to see how the various factors (type, age, etc) affect the AdoptionSpeed. However, the accuracy gets stuck at 20.6% and doesn't really move from there.
Epoch 2/150
14993/14993 [==============================] - 0s 9us/step - loss: -24.1539 - acc: 0.2061
Epoch 3/150
14993/14993 [==============================] - 0s 9us/step - loss: -24.1591 - acc: 0.2061
Epoch 4/150
14993/14993 [==============================] - 0s 9us/step - loss: -24.1626 - acc: 0.2061
...
Epoch 150/150
14993/14993 [==============================] - 0s 9us/step - loss: -24.1757 - acc: 0.2061
14993/14993 [==============================] - 0s 11us/step
acc: 20.61%
Is there anything I can do to nudge to get unstuck?
Judging by the values of the loss, it seems your true data is not in the same range as the model's output (sigmoid). Sigmoid outputs values between 0 and 1 only, so you should normalize your targets to lie between 0 and 1. One possibility is simply to divide y by y.max(). Alternatively, you can try other output activations, keeping their ranges in mind:
sigmoid: between 0 and 1
tanh: between -1 and 1
relu: 0 to infinity
linear: -inf to +inf
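The negative loss is a direct consequence of feeding targets larger than 1 into binary cross-entropy. The NumPy sketch below reimplements the formula by hand (it is not the Keras function) with hypothetical labels 0 to 4 and a model that predicts values close to 1, then repeats the calculation after dividing y by y.max().

```python
import numpy as np

def binary_crossentropy(y_true, p, eps=1e-7):
    """Mean binary cross-entropy; p is clipped away from 0 and 1."""
    p = np.clip(p, eps, 1 - eps)
    return np.mean(-(y_true * np.log(p) + (1 - y_true) * np.log(1 - p)))

# Hypothetical integer labels outside the [0, 1] range of a sigmoid.
y = np.array([0.0, 1.0, 2.0, 3.0, 4.0])

# A model that pushes every output towards 1.
p = np.full_like(y, 0.999)

# With targets above 1, the (1 - y) term turns negative and the loss can
# decrease without bound as p approaches 1.
raw_loss = binary_crossentropy(y, p)

# After scaling the targets into [0, 1], every term is non-negative.
scaled_loss = binary_crossentropy(y / y.max(), p)

print(raw_loss, scaled_loss)
```

Scaling makes the loss well behaved, though whether a sigmoid output with binary cross-entropy is the right choice for several ordered classes is a separate modelling decision.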