Keras: Binary_crossentropy has negative values - python

I'm following this tutorial (section 6: Tying it All Together), with my own dataset. I can get the example in the tutorial working, no problem, with the sample dataset provided.
I'm getting a binary cross-entropy loss that is negative, with no improvement as the epochs progress. I'm pretty sure binary cross-entropy should always be positive, and I should see some improvement in the loss. I've truncated the sample output (and the code call) below to 5 epochs. Others sometimes seem to run into similar problems when training CNNs, but I didn't see a clear solution for my case. Does anyone know why this is happening?
Sample output:
Creating TensorFlow device (/gpu:2) -> (device: 2, name: GeForce GTX TITAN Black, pci bus id: 0000:84:00.0)
Epoch 1/5
10240/10240 [==============================] - 2s - loss: -5.5378 - acc: 0.5000 - val_loss: -7.9712 - val_acc: 0.5000
Epoch 2/5
10240/10240 [==============================] - 0s - loss: -7.9712 - acc: 0.5000 - val_loss: -7.9712 - val_acc: 0.5000
Epoch 3/5
10240/10240 [==============================] - 0s - loss: -7.9712 - acc: 0.5000 - val_loss: -7.9712 - val_acc: 0.5000
Epoch 4/5
10240/10240 [==============================] - 0s - loss: -7.9712 - acc: 0.5000 - val_loss: -7.9712 - val_acc: 0.5000
Epoch 5/5
10240/10240 [==============================] - 0s - loss: -7.9712 - acc: 0.5000 - val_loss: -7.9712 - val_acc: 0.5000
My code:
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.callbacks import History
history = History()
seed = 7
np.random.seed(seed)
dataset = np.loadtxt('train_rows.csv', delimiter=",")
#print dataset.shape (10240, 64)
# split into input (X) and output (Y) variables
X = dataset[:, 0:(dataset.shape[1]-2)] # columns 0..61 (62 of the 64 columns)
Y = dataset[:, dataset.shape[1]-1] # last column (index 63)
#print X.shape (10240, 62)
#print Y.shape (10240,)
testset = np.loadtxt('test_rows.csv', delimiter=",")
#print testset.shape (2560, 64)
X_test = testset[:,0:(testset.shape[1]-2)]
Y_test = testset[:,testset.shape[1]-1]
#print X_test.shape (2560, 62)
#print Y_test.shape (2560,)
num_units_per_layer = [100, 50]
### create model
model = Sequential()
model.add(Dense(100, input_dim=(dataset.shape[1]-2), init='uniform', activation='relu'))
model.add(Dense(50, init='uniform', activation='relu'))
model.add(Dense(1, init='uniform', activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
## Fit the model
model.fit(X, Y, validation_data=(X_test, Y_test), nb_epoch=5, batch_size=128)

I should have printed out my response variable. The categories were labelled as 1 and 2 instead of 0 and 1, which confused the classifier.
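For anyone who hits the same thing, here is a minimal sketch of the fix, assuming the labels are a NumPy array containing 1s and 2s as in my dataset (binary_crossentropy with a sigmoid output expects targets of 0 or 1, so a target of 2 can drive the loss negative):

import numpy as np

# labels as loaded from the CSV: values are 1 and 2
Y = dataset[:, dataset.shape[1]-1]
Y_test = testset[:, testset.shape[1]-1]

# remap {1, 2} -> {0, 1} so they are valid targets for binary_crossentropy
Y = (Y - 1).astype(int)
Y_test = (Y_test - 1).astype(int)

print(np.unique(Y)) # should now print [0 1]

With targets in {0, 1}, binary cross-entropy is guaranteed to be non-negative.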

Related

Accuracy and val_accuracy don't change while training

I tried to train my convolutional neural network using the TensorFlow and Keras libraries, but the values of accuracy and val_accuracy didn't change the whole time. Here is my neural network code:
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten, Conv2D, MaxPooling2D
import pickle
X = pickle.load(open("X.pickle", "rb"))
y = pickle.load(open("y.pickle", "rb"))
X = X/255.0
model = Sequential()
model.add(Conv2D(64, (3, 3), input_shape=X.shape[1:]))
model.add(Activation("relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation="relu"))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64, activation="relu"))
model.add(Dense(1, activation="sigmoid"))
model.compile(loss="binary_crossentropy",
              optimizer="adam",
              metrics=["accuracy"])
model.fit(X, y, batch_size=10, epochs=10, validation_split=0.1)
Here is the creation of the training data, features and labels (X = features, y = labels):
def create_training_data():
    for category in CATEGORIES:
        path = os.path.join(DATADIR, category)
        class_num = CATEGORIES.index(category)
        for img in os.listdir(path):
            try:
                img_array = cv2.imread(os.path.join(path, img), cv2.IMREAD_GRAYSCALE)
                new_array = cv2.resize(img_array, (IMG_SIZE, IMG_SIZE))
                training_data.append([new_array, class_num])
            except Exception as e:
                pass
create_training_data()
random.shuffle(training_data)
X = []
y = []
for features, label in training_data:
    X.append(features)
    y.append(label)
X = np.array(X).reshape(-1, IMG_SIZE, IMG_SIZE, 1)
y = np.array(y)
And this is the log of training:
2023-01-15 00:36:42.368335: I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations: AVX
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.
Epoch 1/10
70/70 [==============================] - 45s 619ms/step - loss: 0.3039 - accuracy: 0.9627 - val_loss: 0.1211 - val_accuracy: 0.9744
Epoch 2/10
70/70 [==============================] - 42s 600ms/step - loss: 0.1524 - accuracy: 0.9670 - val_loss: 0.1189 - val_accuracy: 0.9744
Epoch 3/10
70/70 [==============================] - 42s 600ms/step - loss: 0.1537 - accuracy: 0.9670 - val_loss: 0.1622 - val_accuracy: 0.9744
Epoch 4/10
70/70 [==============================] - 44s 627ms/step - loss: 0.1563 - accuracy: 0.9670 - val_loss: 0.1464 - val_accuracy: 0.9744
Epoch 5/10
70/70 [==============================] - 42s 604ms/step - loss: 0.1591 - accuracy: 0.9670 - val_loss: 0.1185 - val_accuracy: 0.9744
Epoch 6/10
70/70 [==============================] - 42s 605ms/step - loss: 0.1511 - accuracy: 0.9670 - val_loss: 0.1338 - val_accuracy: 0.9744
Epoch 7/10
70/70 [==============================] - 49s 698ms/step - loss: 0.1623 - accuracy: 0.9670 - val_loss: 0.1188 - val_accuracy: 0.9744
Epoch 8/10
70/70 [==============================] - 50s 709ms/step - loss: 0.1480 - accuracy: 0.9670 - val_loss: 0.1397 - val_accuracy: 0.9744
Epoch 9/10
70/70 [==============================] - 45s 637ms/step - loss: 0.1508 - accuracy: 0.9670 - val_loss: 0.1203 - val_accuracy: 0.9744
Epoch 10/10
70/70 [==============================] - 47s 665ms/step - loss: 0.1716 - accuracy: 0.9670 - val_loss: 0.1238 - val_accuracy: 0.9744
Process finished with exit code 0
What should I do to fix this problem?
There are a couple of potential reasons why you are facing this:
Your dataset is far too small. If your validation set is tiny, there is a high probability that your model will get the same percentage of predictions correct/incorrect every epoch.
There is a great imbalance in your dataset. If one class heavily outweighs the other, your model will favor the majority class and predict it no matter what, as that is what yields the best accuracy for the model.
From what I see, there is nothing wrong with your code; rather, modifications need to be made to the dataset itself.
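A quick way to check the imbalance point is to count the labels and, if one class dominates, pass class weights to fit. A rough sketch (the names X and y follow the question's code; the weighting formula is the usual inverse-frequency heuristic, not something tuned for this dataset):

import numpy as np

# inspect the class balance of the labels
classes, counts = np.unique(y, return_counts=True)
print(dict(zip(classes, counts)))

# if one class dominates, weight the loss so mistakes on the rare class cost more
total = counts.sum()
class_weight = {int(c): total / (len(classes) * n) for c, n in zip(classes, counts)}

model.fit(X, y, batch_size=10, epochs=10, validation_split=0.1,
          class_weight=class_weight)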
Hmm, accuracy and validation accuracy are high even on the first epoch. Try using a lower learning rate in the Adam optimizer, say 0.0002. On the first epoch, pay attention to the loss and validation loss as the batches are processed; it should start low and gradually increase during the epoch.
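For reference, a minimal sketch of what that lower learning rate looks like in the compile call (the 0.0002 value is the suggestion above, not a tuned number):

from keras.optimizers import Adam

model.compile(loss="binary_crossentropy",
              optimizer=Adam(learning_rate=0.0002),
              metrics=["accuracy"])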

Model Validation Accuracy is always 1 during Training (Keras)

I have an imbalanced time-series dataset on which I have to perform binary classification. I cannot split the training and test sets randomly, or even stratify them. The issue is that while training, the model's validation accuracy is always 1. I realize this has something to do with the train-test split, but I may be wrong.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.8, random_state=None, shuffle=False)
from collections import Counter
print(Counter(y))
print(Counter(y_train))
print(Counter(y_test))
Counter({0.0: 55534, 1.0: 10000})
Counter({0.0: 9995, 1.0: 3111})
Counter({0.0: 45539, 1.0: 6889})
model = Sequential()
#First Hidden Layer
model.add(Dense(128, activation='relu', kernel_initializer='random_normal', input_dim=19))
#model.add(Dropout(0.3))
#Second Hidden Layer
model.add(Dense(64, activation='relu', kernel_initializer='random_normal'))
#Output Layer
model.add(Dense(1, activation='sigmoid', kernel_initializer='random_normal'))
history = model.fit(X_train,y_train, batch_size=128, validation_split=0.1, epochs=50)
Train on 11795 samples, validate on 1311 samples
Epoch 1/50
11795/11795 [==============================] - 0s 34us/step - loss: 1.1359 - accuracy: 0.8719 - val_loss: 4.2016e-18 - val_accuracy: 1.0000
Epoch 2/50
11795/11795 [==============================] - 0s 12us/step - loss: 0.1247 - accuracy: 0.9442 - val_loss: 1.0255e-19 - val_accuracy: 1.0000
Epoch 3/50
11795/11795 [==============================] - 0s 13us/step - loss: 0.1177 - accuracy: 0.9462 - val_loss: 3.2516e-23 - val_accuracy: 1.0000
Epoch 4/50
11795/11795 [==============================] - 0s 12us/step - loss: 0.1103 - accuracy: 0.9519 - val_loss: 1.1607e-23 - val_accuracy: 1.0000
Epoch 5/50
11795/11795 [==============================] - 0s 13us/step - loss: 0.0805 - accuracy: 0.9739 - val_loss: 6.2345e-26 - val_accuracy: 1.0000
Appreciate any help on this problem. Thanks.
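One thing worth checking (a diagnostic sketch, not a full answer): Keras selects the validation_split slice from the last samples of the training data, in order and before any shuffling, so with an ordered, imbalanced time series that slice can easily contain only one class, which would make a constant validation accuracy of 1 unsurprising:

from collections import Counter

# validation_split=0.1 uses the tail of the training data, in order
val_size = int(len(y_train) * 0.1)
print(Counter(y_train[-val_size:])) # if this shows a single class, val_accuracy = 1.0 is expected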

Keras LSTM Model not learning

I wrote this code a few days ago and had a few bugs, but with some help I was able to fix them. The model is not learning. I tried different batch sizes, different numbers of epochs, and different activation functions, and I checked my data a few times for flaws but wasn't able to find any. It is due in a week or so for a school project. Any help will be very much valued.
Here is the code.
from keras.layers import Dense, Input, Concatenate, Dropout
from sklearn.preprocessing import MinMaxScaler
from keras.models import Model
from keras.layers import LSTM
import tensorflow as tf
import NetworkRequest as NR
import ParseNetworkRequest as PNR
import numpy as np
def buildModel():
    _Price = Input(shape=(1, 1))
    _Volume = Input(shape=(1, 1))
    PriceLayer = LSTM(128)(_Price)
    VolumeLayer = LSTM(128)(_Volume)
    merged = Concatenate(axis=1)([PriceLayer, VolumeLayer])
    Dropout(0.2)
    dense1 = Dense(128, input_dim=2, activation='relu', use_bias=True)(merged)
    Dropout(0.2)
    dense2 = Dense(64, input_dim=2, activation='relu', use_bias=True)(dense1)
    Dropout(0.2)
    output = Dense(1, activation='softmax', use_bias=True)(dense2)
    opt = tf.keras.optimizers.Adam(learning_rate=1e-3, decay=1e-6)
    _Model = Model(inputs=[_Price, _Volume], output=output)
    _Model.compile(optimizer=opt, loss='mse', metrics=['accuracy'])
    return _Model
if __name__ == '__main__':
    api_key = "47BGPYJPFN4CEC20"
    stock = "DJI"
    Index = ['4. close', '5. volume']
    RawData = NR.Initial_Network_Request(api_key, stock)
    Closing = PNR.Parse_Network_Request(RawData, Index[0])
    Volume = PNR.Parse_Network_Request(RawData, Index[1])
    Length = len(Closing)
    scalar = MinMaxScaler(feature_range=(0, 1))
    Closing_scaled = scalar.fit_transform(np.reshape(Closing[:-1], (-1, 1)))
    Volume_scaled = scalar.fit_transform(np.reshape(Volume[:-1], (-1, 1)))
    Labels_scaled = scalar.fit_transform(np.reshape(Closing[1:], (-1, 1)))
    Train_Closing = Closing_scaled[:int(0.9 * Length)]
    Train_Closing = np.reshape(Train_Closing, (Train_Closing.shape[0], 1, 1))
    Train_Volume = Volume_scaled[:int(0.9 * Length)]
    Train_Volume = np.reshape(Train_Volume, (Train_Volume.shape[0], 1, 1))
    Train_Labels = Labels_scaled[:int((0.9 * Length))]
    Train_Labels = np.reshape(Train_Labels, (Train_Labels.shape[0], 1))
    # -------------------------------------------------------------------------------------------#
    Test_Closing = Closing_scaled[int(0.9 * Length):(Length - 1)]
    Test_Closing = np.reshape(Test_Closing, (Test_Closing.shape[0], 1, 1))
    Test_Volume = Volume_scaled[int(0.9 * Length):(Length - 1)]
    Test_Volume = np.reshape(Test_Volume, (Test_Volume.shape[0], 1, 1))
    Test_Labels = Labels_scaled[int(0.9 * Length):(Length - 1)]
    Test_Labels = np.reshape(Test_Labels, (Test_Labels.shape[0], 1))
    Predict_Closing = Closing_scaled[-1]
    Predict_Closing = np.reshape(Predict_Closing, (Predict_Closing.shape[0], 1, 1))
    Predict_Volume = Volume_scaled[-1]
    Predict_Volume = np.reshape(Predict_Volume, (Predict_Volume.shape[0], 1, 1))
    Predict_Label = Labels_scaled[-1]
    Predict_Label = np.reshape(Predict_Label, (Predict_Label.shape[0], 1))
    model = buildModel()
    model.fit(
        [
            Train_Closing,
            Train_Volume
        ],
        [
            Train_Labels
        ],
        validation_data=(
            [
                Test_Closing,
                Test_Volume
            ],
            [
                Test_Labels
            ]
        ),
        epochs=10,
        batch_size=Length
    )
This is the output when I run it.
Using TensorFlow backend.
2020-01-01 16:31:47.905012: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2199985000 Hz
2020-01-01 16:31:47.906105: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x49214f0 executing computations on platform Host. Devices:
2020-01-01 16:31:47.906137: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): Host, Default Version
/home/martin/PycharmProjects/MarketPredictor/Model.py:26: UserWarning: Update your `Model` call to the Keras 2 API: `Model(inputs=[<tf.Tenso..., outputs=Tensor("de...)`
_Model = Model(inputs=[_Price, _Volume], output=output)
Train on 4527 samples, validate on 503 samples
Epoch 1/10
4527/4527 [==============================] - 1s 179us/step - loss: 0.4716 - accuracy: 2.2090e-04 - val_loss: 0.6772 - val_accuracy: 0.0000e+00
Epoch 2/10
4527/4527 [==============================] - 0s 41us/step - loss: 0.4716 - accuracy: 2.2090e-04 - val_loss: 0.6772 - val_accuracy: 0.0000e+00
Epoch 3/10
4527/4527 [==============================] - 0s 42us/step - loss: 0.4716 - accuracy: 2.2090e-04 - val_loss: 0.6772 - val_accuracy: 0.0000e+00
Epoch 4/10
4527/4527 [==============================] - 0s 42us/step - loss: 0.4716 - accuracy: 2.2090e-04 - val_loss: 0.6772 - val_accuracy: 0.0000e+00
Epoch 5/10
4527/4527 [==============================] - 0s 43us/step - loss: 0.4716 - accuracy: 2.2090e-04 - val_loss: 0.6772 - val_accuracy: 0.0000e+00
Epoch 6/10
4527/4527 [==============================] - 0s 39us/step - loss: 0.4716 - accuracy: 2.2090e-04 - val_loss: 0.6772 - val_accuracy: 0.0000e+00
Epoch 7/10
4527/4527 [==============================] - 0s 42us/step - loss: 0.4716 - accuracy: 2.2090e-04 - val_loss: 0.6772 - val_accuracy: 0.0000e+00
Epoch 8/10
4527/4527 [==============================] - 0s 39us/step - loss: 0.4716 - accuracy: 2.2090e-04 - val_loss: 0.6772 - val_accuracy: 0.0000e+00
Epoch 9/10
4527/4527 [==============================] - 0s 42us/step - loss: 0.4716 - accuracy: 2.2090e-04 - val_loss: 0.6772 - val_accuracy: 0.0000e+00
Epoch 10/10
4527/4527 [==============================] - 0s 38us/step - loss: 0.4716 - accuracy: 2.2090e-04 - val_loss: 0.6772 - val_accuracy: 0.0000e+00
Process finished with exit code 0
The loss is high, and the accuracy is 0.
Please help.
You're using activation functions and metrics made for a classification task, not a stock forecasting task (with a continuous target).
For continuous targets, your final activation layer should be linear. Metrics should be mse or mae, not accuracy.
Accuracy would only be satisfied if the DJI prediction is exactly equal to the actual price. Since the DJI has at least 7 digits, that is nearly impossible.
Here are my suggestions:
Use a simpler network: Not sure how big your dataset is, but sometimes stacking dense layers isn't helpful. It looks like the weights of the intermediate layers are not changing at all. Try the model with just one dense layer.
Reduce dropout: Try using a single dropout layer with Dropout(0.1).
Adam defaults: Start with the Adam optimizer using its default parameters.
Metric selection: As mentioned in Nicolas's answer, use a regression metric instead of accuracy. A rough sketch putting these points together is below.
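A possible shape for such a model (illustrative only; the layer sizes and the single-Dense structure are assumptions, not a tested configuration):

from keras.layers import Dense, Input, Concatenate, LSTM
from keras.models import Model

_Price = Input(shape=(1, 1))
_Volume = Input(shape=(1, 1))
merged = Concatenate(axis=1)([LSTM(128)(_Price), LSTM(128)(_Volume)])

dense1 = Dense(64, activation='relu')(merged)
output = Dense(1, activation='linear')(dense1)  # linear output for a continuous target

model = Model(inputs=[_Price, _Volume], outputs=output)
model.compile(optimizer='adam',  # Adam with its default parameters
              loss='mse',
              metrics=['mae'])   # regression metric instead of accuracy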

Keras network fit: loss is 'nan', accuracy doesn't change

I'm trying to fit a Keras network, but in each epoch the loss is 'nan' and the accuracy doesn't change... I tried changing the number of epochs, the number of layers, the number of neurons, the learning rate, and the optimizer; I checked for NaN data in the datasets and normalized the data in different ways, but the problem was not solved. Thanks for your help.
np.random.seed(1337)
# example of input vector: [-1.459746, 0.2694708, ... 0.90043]
# example of output vector: [1, 0] or [0, 1]
model = Sequential()
model.add(Dense(1000, activation='tanh', init='normal', input_dim=503))
model.add(Dense(2, init='normal', activation='softmax'))
opt = optimizers.sgd(lr=0.01)
model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=['accuracy'])
print(model.summary())
model.fit(x_train, y_train, batch_size=1000, nb_epoch=100, verbose=1)
99804/99804 [==============================] - 5s 52us/step - loss: nan - acc: 0.4938
Epoch 1/100
99804/99804 [==============================] - 5s 49us/step - loss: nan - acc: 0.4938
Epoch 2/100
99804/99804 [==============================] - 5s 51us/step - loss: nan - acc: 0.4938
Epoch 3/100
99804/99804 [==============================] - 5s 52us/step - loss: nan - acc: 0.4938
Epoch 4/100
99804/99804 [==============================] - 5s 52us/step - loss: nan - acc: 0.4938
Epoch 5/100
99804/99804 [==============================] - 5s 51us/step - loss: nan - acc: 0.4938
...
Oh, the problem has been found! After normalization, one NaN value appeared in the input vector.
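For anyone else debugging this, a small sketch of the check that would have caught it (assuming x_train is a NumPy array, as in the question):

import numpy as np

# a single NaN anywhere in the inputs is enough to turn the loss into nan
print(np.isnan(x_train).any())         # True means a NaN survived normalization
print(np.argwhere(np.isnan(x_train)))  # row/column indices of the offending values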
First convert your output to categorical, as described in Keras documentation:
Note: when using the categorical_crossentropy loss, your targets should be in categorical format. In order to convert integer targets into categorical targets, you can use the Keras utility to_categorical:
from keras.utils import to_categorical
categorical_labels = to_categorical(int_labels, num_classes=None)
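For example (a tiny illustration with made-up labels):

import numpy as np
from keras.utils import to_categorical

int_labels = np.array([0, 1, 1, 0])
categorical_labels = to_categorical(int_labels, num_classes=2)
# [[1. 0.]
#  [0. 1.]
#  [0. 1.]
#  [1. 0.]]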

Re-fitting a model with partial data results in lower accuracy than fitting the model with full data

My dataset is large and cannot fit into RAM. My hypothesis is that I can divide the dataset into large chunks, train the model iteratively on each chunk, and see accuracy similar to training on the non-divided dataset (as described in #107). So, I tested it in two steps with a sample set of 146244 elements.
First, I split the data into three chunks (50000, 50000 and 46244) and fit the model with each chunk. To be on the safe side, I load the model saved in the previous iteration.
Second, I fit the model with the full data (146244 elements).
Here's the code for creating and fitting the model:
model = Sequential()
model.add(Flatten(input_shape=train_data.shape[1:]))
model.add(Dense(256, activation='relu'))
model.add(BatchNormalization())
model.add(Dropout(0.5))
model.add(Dense(num_classes, activation='sigmoid'))
optimizer = optimizers.adam(lr=0.0001)
model.compile(optimizer=optimizer,
              loss='categorical_crossentropy',
              metrics=['accuracy'])
model.save(top_model_path)
for train_data, train_labels in hdf5_generator('D:/_g/kaggle/cdisco/files/features_xception_1_100_90x90_tr1.hdf5', 'train', train_labels):
    model = load_model(top_model_path)
    model.fit(train_data, train_labels, epochs=2, batch_size=batch_size)
    model.save(top_model_path)
    (eval_loss, eval_accuracy) = model.evaluate(validation_data, validation_labels, batch_size=batch_size, verbose=1)
    print("Accuracy: {:.2f}%".format(eval_accuracy * 100))
    print("Loss: {}\n".format(eval_loss))
And, here is the code for the generator:
def hdf5_generator(file_name, dataset, labels):
    hf = h5py.File(file_name, 'r')
    i = 0
    batch_size = 50000
    while 1:
        start = i * batch_size
        end = ((i + 1) * batch_size)
        print("start: %d -- end: %d" % (start, end))
        yield (hf[dataset][start:end], labels[start:end])
        if end >= len(hf[dataset]): break
        i += 1
The first try, with chunked data, results in unstable accuracy between steps (22.90%, 59.93%, 51.17%):
start: 0 -- end: 50000
Epoch 1/2
50000/50000 [==============================] - 39s - loss: 2.4969 - acc: 0.5143
Epoch 2/2
50000/50000 [==============================] - 38s - loss: 1.5667 - acc: 0.6201
16156/16156 [==============================] - 33s
Accuracy: 22.90%
Loss: 5.976185762991436
start: 50000 -- end: 100000
Epoch 1/2
50000/50000 [==============================] - 38s - loss: 1.3759 - acc: 0.7211
Epoch 2/2
50000/50000 [==============================] - 39s - loss: 0.5446 - acc: 0.8621
16156/16156 [==============================] - 34s
Accuracy: 59.93%
Loss: 2.540657121840312
start: 100000 -- end: 150000
Epoch 1/2
46244/46244 [==============================] - 36s - loss: 0.2640 - acc: 0.9531
Epoch 2/2
46244/46244 [==============================] - 36s - loss: 0.1283 - acc: 0.9672
16156/16156 [==============================] - 34s
Accuracy: 51.17%
Loss: 3.8107337748964336
The second try (with batch_size=146244 in hdf5_generator) results in 77.49% accuracy:
start: 0 -- end: 146244
Epoch 1/2
146244/146244 [==============================] - 112s - loss: 1.8089 - acc: 0.6123
Epoch 2/2
146244/146244 [==============================] - 113s - loss: 1.0966 - acc: 0.7265
16156/16156 [==============================] - 39s
Accuracy: 77.49%
Loss: 0.8401890944788202
I expected to see similar accuracy results. However, the results were different, and the first run looks as if the model loses its parameters between iterations. How can I get results from re-fitting on a chunked dataset that are similar to a single fit on the full data?
I tried using HDF5Matrix, but it resulted in very low performance.
