Keras network doesn't train - Python

This is the first time I've tried to build the simplest possible net. I'm training it on XOR and it doesn't work at all. I've tried everything: different activation functions, numbers of layers, neurons, epochs, batches, optimizers... Every time the result is 1, 1, 1, 1 (accuracy = 0.5). Please help! What am I doing wrong?
from keras.models import Sequential
from keras.layers import Dense
from tensorflow import keras
import numpy as np

X = np.array([[0, 0],
              [0, 1],
              [1, 0],
              [1, 1]])
Y = np.array([[1, 0, 0, 1]]).T

model = Sequential()
model.add(Dense(10, input_dim=2, activation='relu'))
model.add(Dense(10, activation='relu'))
model.add(Dense(1, activation='softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])

# Training the model
model.fit(X, Y, epochs=100, batch_size=len(X))

# Prediction
predictions = model.predict(X)
print(predictions)
I noticed that there is always 1/1 on the left side of the output, but I guess there should be something like 4/4. Maybe that's the reason? I can't figure out how to fix it...
Tail of output:
...
...
Epoch 97/100
1/1 [==============================] - 0s 1ms/step - loss: 0.0000e+00 - accuracy: 0.5000
Epoch 98/100
1/1 [==============================] - 0s 1ms/step - loss: 0.0000e+00 - accuracy: 0.5000
Epoch 99/100
1/1 [==============================] - 0s 1ms/step - loss: 0.0000e+00 - accuracy: 0.5000
Epoch 100/100
1/1 [==============================] - 0s 1ms/step - loss: 0.0000e+00 - accuracy: 0.5000
1/1 [==============================] - 0s 165ms/step - loss: 0.0000e+00 - accuracy: 0.5000
[0.0, 0.5]
[[1.]
[1.]
[1.]
[1.]]

Thank you very much to all!
Here is the working net below. It's strange that it takes so long to train! I remember doing the same task without Keras several years ago, and training was almost instant (without any GPU, of course). But here, with "Adam optimization" (and with the "fast" ReLU I only managed to get a 4-layer net working), it seems those functions have the opposite effect on such a simple task.
from keras.models import Sequential
from keras import initializers
from keras.layers import Dense
from tensorflow import keras
import numpy as np

X = np.array([0, 0,
              0, 1,
              1, 0,
              1, 1])
X = X.reshape(4, 2).astype("float32")
Y = np.array([1,
              0,
              0,
              1])
Y = Y.reshape(4, 1).astype("float32")

init_2 = initializers.TruncatedNormal(mean=0.0, stddev=0.05, seed=12345)

model = Sequential()
model.add(Dense(4, input_dim=2, activation='sigmoid', kernel_initializer=init_2, bias_initializer=init_2))
model.add(Dense(1, activation='sigmoid', kernel_initializer=init_2, bias_initializer=init_2))
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])

# Training the model
model.fit(X, Y, epochs=7000, batch_size=4, verbose=0)
scores = model.evaluate(X, Y)
print(scores)

# Prediction
predictions = model.predict(X)
print(predictions)
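A likely reason the fit above needs 7000 epochs is Adam's default learning rate (0.001), which is quite conservative for a 4-sample problem. Below is a minimal sketch, assuming the default learning rate is the bottleneck and reusing the model, X and Y defined above; the values 0.05 and 500 are illustrative, not tuned, and exact convergence depends on the seed.
from tensorflow import keras

# Assumption: a larger learning rate speeds up this tiny problem; 0.05 is illustrative.
opt = keras.optimizers.Adam(learning_rate=0.05)
model.compile(loss='binary_crossentropy', optimizer=opt, metrics=['accuracy'])
model.fit(X, Y, epochs=500, batch_size=4, verbose=0)
print(model.predict(X))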

Related

Why doesn't my CNN's accuracy/loss change during training?

My goal is to train a convolutional neural network to recognise the images present in the MNIST sign language dataset. Here is my attempt to process the data and train the model:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import os
import cv2
import random
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Activation, Dropout, Flatten, Dense
import keras
import sys
import tensorflow as tf
from keras import optimizers
import json

train_df = pd.read_csv("data/sign_mnist_train.csv")
test_df = pd.read_csv("data/sign_mnist_test.csv")
X = np.array(train_df.drop(["label"], axis=1))
y = np.array(train_df[["label"]])
X = X.reshape(-1, 28, 28, 1)
X = tf.cast(X, tf.float32)

model = Sequential()
model.add(Conv2D(28, (3, 3), activation='relu'))
model.add(MaxPooling2D((2, 2)))
model.add(Flatten())
model.add(Dense(24, activation='softmax'))
model.compile(optimizer='RMSprop',
              loss='binary_crossentropy',
              metrics=['accuracy'])
model.fit(X, y, epochs=10, validation_split=0.2)
and after running this I get this result:
Epoch 1/10
687/687 [==============================] - 4s 6ms/step - loss: 174.9729 - accuracy: 0.0438 - val_loss: 174.6281 - val_accuracy: 0.0382
Epoch 2/10
687/687 [==============================] - 2s 3ms/step - loss: 174.9779 - accuracy: 0.0433 - val_loss: 174.6281 - val_accuracy: 0.0382
Epoch 3/10
687/687 [==============================] - 2s 3ms/step - loss: 174.9777 - accuracy: 0.0433 - val_loss: 174.6281 - val_accuracy: 0.0382
and this continues for the remaining 7 epochs. My actual model is slightly different from what I have provided (for brevity), but this sequential model has the same issue, which makes me suspect that the problem must come before the model = Sequential() line. Furthermore, I have tried countless combinations of optimizers/losses, and all those do is make the accuracy/loss converge to slightly different numbers, so I doubt that's the problem.
One potential issue is that you use loss='binary_crossentropy' rather than a categorical loss: with 24 classes and a softmax output you want loss='categorical_crossentropy' (with one-hot labels) or loss='sparse_categorical_crossentropy' (with integer labels).
Besides, you already loaded separate training and testing datasets, but then in model.fit(X, y, epochs=10, validation_split=0.2) you split the training data again, using 20% for validation and 80% for training.
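A minimal sketch of the loss change, assuming y holds integer class ids (as in the CSV loaded above) and that the final Dense layer has one unit per class id:
# Assumption: y contains integer class ids, so the sparse variant applies directly
# and avoids one-hot encoding the labels.
model.compile(optimizer='RMSprop',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(X, y, epochs=10, validation_split=0.2)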

Keras Neural Network Accuracy is always 0 While Training

I'm making a simple classification algo with a Keras neural network. The goal is to take 3 data points on weather and decide whether or not there's a wildfire. Here's an image of the .csv dataset that I'm using to train the model (this image is only the top few lines and isn't the entire thing):
wildfire weather dataset
As you can see, there are 4 columns with the fourth being either a "1" which means "fire", or a "0" which means "no fire". I want the algo to predict either a 1 or a 0. This is the code that I wrote:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import keras
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Dropout
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
import csv

# THIS IS USED TO TRAIN THE MODEL
# Importing the dataset
dataset = pd.read_csv('Fire_Weather.csv')
dataset.head()

X = dataset.iloc[:, 0:3]
Y = dataset.iloc[:, 3]
X.head()

obj = StandardScaler()
X = obj.fit_transform(X)

X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.25)
print(X_train.shape)
print(X_test.shape)
print(y_train.shape)
print(y_test.shape)

classifier = Sequential()
# Adding the input layer and the first hidden layer
classifier.add(Dense(units=6, kernel_initializer='uniform', activation='relu', input_dim=3))
# classifier.add(Dropout(p = 0.1))
# Adding the second hidden layer
classifier.add(Dense(units=6, kernel_initializer='uniform', activation='relu'))
# classifier.add(Dropout(p = 0.1))
# Adding the output layer
classifier.add(Dense(units=1, kernel_initializer='uniform', activation='sigmoid'))

# Compiling the ANN
classifier.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

classifier.fit(X_train, y_train, batch_size=3, epochs=10)
y_pred = classifier.predict(X_test)
y_pred = (y_pred > 0.5)
print(y_pred)

classifier.save("weather_model.h5")
The problem is that whenever I run this, my accuracy is always "0.0000e+00" and my training output looks like this:
Epoch 1/10
2146/2146 [==============================] - 2s 758us/step - loss: nan - accuracy: 0.0238
Epoch 2/10
2146/2146 [==============================] - 1s 625us/step - loss: nan - accuracy: 0.0000e+00
Epoch 3/10
2146/2146 [==============================] - 1s 604us/step - loss: nan - accuracy: 0.0000e+00
Epoch 4/10
2146/2146 [==============================] - 1s 609us/step - loss: nan - accuracy: 0.0000e+00
Epoch 5/10
2146/2146 [==============================] - 1s 624us/step - loss: nan - accuracy: 0.0000e+00
Epoch 6/10
2146/2146 [==============================] - 1s 633us/step - loss: nan - accuracy: 0.0000e+00
Epoch 7/10
2146/2146 [==============================] - 1s 481us/step - loss: nan - accuracy: 0.0000e+00
Epoch 8/10
2146/2146 [==============================] - 1s 476us/step - loss: nan - accuracy: 0.0000e+00
Epoch 9/10
2146/2146 [==============================] - 1s 474us/step - loss: nan - accuracy: 0.0000e+00
Epoch 10/10
2146/2146 [==============================] - 1s 474us/step - loss: nan - accuracy: 0.0000e+00
Does anyone know why this is happening and what I could do to my code to fix this?
Thank You!
EDIT: I realized that my earlier response was highly misleading, which was thankfully pointed out by @xdurch0 and @Timbus Calin. Here is an edited answer.
Check that all your input values are valid. Are there any nan or inf values in your training data?
Try using different activation functions. ReLU is good, but it is prone to what is known as the dying ReLU problem, where parts of the network effectively stop learning because no updates are made to their weights. One possibility is to use Leaky ReLU or PReLU.
Try using gradient clipping, a technique for tackling vanishing or exploding gradients (which is likely what is happening in your case, given the nan loss). Keras lets you configure the clipnorm or clipvalue arguments on optimizers.
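A minimal sketch of the input check and the clipping option, assuming X_train, y_train and the classifier from the question; the clipping thresholds are illustrative, not tuned:
import numpy as np
from keras.optimizers import Adam

# Check the scaled inputs for invalid values before training.
print(np.isnan(X_train).any(), np.isinf(X_train).any())

# Assumption: clipnorm=1.0 is an illustrative threshold.
opt = Adam(clipnorm=1.0)       # rescale gradients whose L2 norm exceeds 1.0
# opt = Adam(clipvalue=0.5)    # or clip each gradient component to [-0.5, 0.5]
classifier.compile(optimizer=opt, loss='binary_crossentropy', metrics=['accuracy'])
classifier.fit(X_train, y_train, batch_size=3, epochs=10)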
There are posts on SO that report similar problems, such as this one, which might also be of interest to you.

Loss not decreasing and is very high in Keras

I'm learning deep learning in Keras and I have a problem.
The loss isn't decreasing and it's very high, about 650.
I'm working on the MNIST dataset from tensorflow.keras.datasets.mnist.
There is no error; my NN just isn't learning.
Here is my model:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
import tensorflow.nn as tfnn
inputdim = 28 * 28
model = Sequential()
model.add(Flatten())
model.add(Dense(inputdim, activation = tfnn.relu))
model.add(Dense(128, activation = tfnn.relu))
model.add(Dense(10, activation = tfnn.softmax))
model.compile(loss = 'categorical_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
model.fit(X_train, Y_train, epochs = 4)
and my output:
Epoch 1/4
60000/60000 [==============================] - 32s 527us/sample - loss: 646.0926 - acc: 6.6667e-05
Epoch 2/4
60000/60000 [==============================] - 39s 652us/sample - loss: 646.1003 - acc: 0.0000e+00
Epoch 3/4
60000/60000 [==============================] - 35s 590us/sample - loss: 646.1003 - acc: 0.0000e+00
Epoch 4/4
60000/60000 [==============================] - 33s 544us/sample - loss: 646.1003 - acc: 0.0000e+00
OK, I added BatchNormalization between the layers and changed the loss function to 'sparse_categorical_crossentropy'. This is what my NN looks like now:
model = Sequential()
model.add(Flatten())
model.add(BatchNormalization(axis = 1, momentum = 0.99))
model.add(Dense(inputdim, activation = tfnn.relu))
model.add(BatchNormalization(axis = 1, momentum = 0.99))
model.add(Dense(128, activation = tfnn.relu))
model.add(BatchNormalization(axis = 1, momentum = 0.99))
model.add(Dense(10, activation = tfnn.softmax))
model.compile(loss = 'sparse_categorical_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
and these are the results:
Epoch 1/4
60000/60000 [==============================] - 68s 1ms/sample - loss: 0.2045 - acc: 0.9374
Epoch 2/4
60000/60000 [==============================] - 55s 916us/sample - loss: 0.1007 - acc: 0.9689
Thanks for your help!
You may try the sparse_categorical_crossentropy loss function. Also, what is your batch size? And, as has already been suggested, you may want to increase the number of epochs.

How is the Keras accuracy shown in the progress bar calculated? From which inputs is it calculated? How can I replicate it?

I am trying to understand what the accuracy "acc" shown in the Keras progress bar at the end of an epoch is:
13/13 [==============================] - 0s 76us/step - loss: 0.7100 - acc: 0.4615
At the end of an epoch it should be the accuracy of the model's predictions on all training samples. However, when the model is evaluated on those same training samples, the actual accuracy can be very different.
Below is an adapted version of the MLP-for-binary-classification example from the Keras webpage. A simple sequential neural net does binary classification of randomly generated numbers. The batch size is the same as the number of training examples (13), so every epoch contains only one step. Since the loss is set to binary_crossentropy, the binary_accuracy function defined in metrics.py is used for the accuracy calculation. The MyEval class defines a callback which is called at the end of each epoch. It uses two ways of calculating the accuracy on the training data: a) model.evaluate, and b) model.predict to get predictions, followed by almost the same code as the Keras binary_accuracy function. These two accuracies are consistent with each other, but most of the time they differ from the one in the progress bar. Why are they different? Is it possible to calculate the same accuracy as in the progress bar? Or have I made a mistake in my assumptions?
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras import callbacks

np.random.seed(1)  # fix random seed for reproducibility

# Generate dummy data
x_train = np.random.random((13, 20))
y_train = np.random.randint(2, size=(13, 1))

model = Sequential()
model.add(Dense(64, input_dim=20, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

class MyEval(callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        my_accuracy_1 = self.model.evaluate(x_train, y_train, verbose=0)[1]
        y_pred = self.model.predict(x_train)
        my_accuracy_2 = np.mean(np.equal(y_train, np.round(y_pred)))
        print("my accuracy 1: {}".format(my_accuracy_1))
        print("my accuracy 2: {}".format(my_accuracy_2))

my_eval = MyEval()
model.fit(x_train, y_train,
          epochs=5,
          batch_size=13,
          callbacks=[my_eval],
          shuffle=False)
The output of the above code:
13/13 [==============================] - 0s 25ms/step - loss: 0.7303 - acc: 0.5385
my accuracy 1: 0.5384615659713745
my accuracy 2: 0.5384615384615384
Epoch 2/5
13/13 [==============================] - 0s 95us/step - loss: 0.7412 - acc: 0.4615
my accuracy 1: 0.9230769276618958
my accuracy 2: 0.9230769230769231
Epoch 3/5
13/13 [==============================] - 0s 77us/step - loss: 0.7324 - acc: 0.3846
my accuracy 1: 0.9230769276618958
my accuracy 2: 0.9230769230769231
Epoch 4/5
13/13 [==============================] - 0s 72us/step - loss: 0.6543 - acc: 0.5385
my accuracy 1: 0.9230769276618958
my accuracy 2: 0.9230769230769231
Epoch 5/5
13/13 [==============================] - 0s 76us/step - loss: 0.6459 - acc: 0.6923
my accuracy 1: 0.8461538553237915
my accuracy 2: 0.8461538461538461
Using: Python 3.5.2, tensorflow-gpu==1.14.0, Keras==2.2.4, numpy==1.15.2
I think it has to do with the usage of Dropout. Dropout is only enabled during training, but not during evaluation or prediction. Hence the discrepancy between the accuracies during training and during evaluation/prediction.
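A minimal sketch of that difference, assuming a TF 2.x eager setup (the question uses Keras 2.2.4, where the same training/inference behaviour holds but the call syntax differs):
import numpy as np
import tensorflow as tf

# A standalone Dropout layer: it drops units only when called with training=True.
drop = tf.keras.layers.Dropout(0.5)
x = np.ones((1, 8), dtype="float32")
print(drop(x, training=True).numpy())   # roughly half the entries zeroed, the rest scaled by 1/(1 - rate)
print(drop(x, training=False).numpy())  # identical to the input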
Moreover, the training accuracy displayed in the bar is the accuracy averaged over the training epoch, i.e. averaged over the batch accuracies calculated after each batch. Keep in mind that the model parameters are updated after each batch, so the accuracy shown in the bar at the end does not exactly match the accuracy of a validation run after the epoch is finished (the training accuracy is calculated with different model parameters per batch, while the validation accuracy is calculated with the same parameters for all batches).
This is your example, with more data (and therefore more than one batch per epoch), and without dropout:
import numpy as np
from keras.models import Sequential
from keras.layers import Dense, Dropout
from keras import callbacks

np.random.seed(1)  # fix random seed for reproducibility

# Generate dummy data
x_train = np.random.random((200, 20))
y_train = np.random.randint(2, size=(200, 1))

model = Sequential()
model.add(Dense(64, input_dim=20, activation='relu'))
# model.add(Dropout(0.5))
model.add(Dense(64, activation='relu'))
# model.add(Dropout(0.5))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

class MyEval(callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        my_accuracy_1 = self.model.evaluate(x_train, y_train, verbose=0)[1]
        y_pred = self.model.predict(x_train)
        my_accuracy_2 = np.mean(np.equal(y_train, np.round(y_pred)))
        print("my accuracy 1 after epoch {}: {}".format(epoch + 1, my_accuracy_1))
        print("my accuracy 2 after epoch {}: {}".format(epoch + 1, my_accuracy_2))

my_eval = MyEval()
model.fit(x_train, y_train,
          epochs=5,
          batch_size=13,
          callbacks=[my_eval],
          shuffle=False)
The output reads:
Train on 200 samples
Epoch 1/5
my accuracy 1 after epoch 1: 0.5450000166893005
my accuracy 2 after epoch 1: 0.545
200/200 [==============================] - 0s 2ms/sample - loss: 0.6978 - accuracy: 0.5350
Epoch 2/5
my accuracy 1 after epoch 2: 0.5600000023841858
my accuracy 2 after epoch 2: 0.56
200/200 [==============================] - 0s 383us/sample - loss: 0.6892 - accuracy: 0.5550
Epoch 3/5
my accuracy 1 after epoch 3: 0.5799999833106995
my accuracy 2 after epoch 3: 0.58
200/200 [==============================] - 0s 496us/sample - loss: 0.6844 - accuracy: 0.5800
Epoch 4/5
my accuracy 1 after epoch 4: 0.6000000238418579
my accuracy 2 after epoch 4: 0.6
200/200 [==============================] - 0s 364us/sample - loss: 0.6801 - accuracy: 0.6150
Epoch 5/5
my accuracy 1 after epoch 5: 0.6050000190734863
my accuracy 2 after epoch 5: 0.605
200/200 [==============================] - 0s 393us/sample - loss: 0.6756 - accuracy: 0.6200
The validation accuracy after the epoch pretty much resembles the averaged training accuracy at the end of the epoch now.

How to see why a keras / tensorflow model is getting stuck?

My code is:
from keras.models import Sequential
from keras.layers import Dense
import numpy
import pandas as pd

X = pd.read_csv("data/train.csv",
                usecols=['Type', 'Age', 'Breed1', 'Breed2', 'Gender', 'Color1', 'Color2', 'Color3',
                         'MaturitySize', 'FurLength', 'Vaccinated', 'Dewormed', 'Sterilized', 'Health',
                         'Quantity', 'Fee', 'VideoAmt', 'PhotoAmt'])
Y = pd.read_csv("data/train.csv", usecols=['AdoptionSpeed'])

model = Sequential()
model.add(Dense(18, input_dim=18, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy',
              optimizer='adam', metrics=['accuracy'])
model.fit(X, Y, epochs=150, batch_size=100)

scores = model.evaluate(X, Y)
print("\n%s: %.2f%%" % (model.metrics_names[1], scores[1]*100))
I am trying to train the model to see how the various factors (type, age, etc.) affect AdoptionSpeed. However, the accuracy gets stuck at 20.6% and doesn't really move from there.
Epoch 2/150
14993/14993 [==============================] - 0s 9us/step - loss: -24.1539 - acc: 0.2061
Epoch 3/150
14993/14993 [==============================] - 0s 9us/step - loss: -24.1591 - acc: 0.2061
Epoch 4/150
14993/14993 [==============================] - 0s 9us/step - loss: -24.1626 - acc: 0.2061
...
Epoch 150/150
14993/14993 [==============================] - 0s 9us/step - loss: -24.1757 - acc: 0.2061
14993/14993 [==============================] - 0s 11us/step
acc: 20.61%
Is there anything I can do to nudge it and get it unstuck?
By the values of the loss, it seems your true data is not in the same range as the model's output (sigmoid).
A sigmoid outputs values between 0 and 1 only, so you should normalize your data so that it lies between 0 and 1. One possibility is to simply divide y by y.max() (see the sketch after this list).
Or you can try other possibilities, considering:
sigmoid: between 0 and 1
tanh: between -1 and 1
relu: 0 to infinity
linear: -inf to +inf
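A minimal sketch of the y.max() option, assuming Y is the AdoptionSpeed DataFrame loaded in the question and holds small non-negative integers:
# Assumption: Y holds non-negative integers, so dividing by the maximum maps it into [0, 1].
y_max = Y.values.max()
Y_scaled = Y / y_max
model.fit(X, Y_scaled, epochs=150, batch_size=100)

# Predictions come back on the [0, 1] scale; multiply by y_max to return to the original scale.
# predictions = model.predict(X) * y_max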
