Keras autoencoder accuracy/loss doesn't change - python

Here is my code:
AE_0 = Sequential()
encoder = Sequential([Dense(output_dim=100, input_dim=256, activation='sigmoid')])
decoder = Sequential([Dense(output_dim=256, input_dim=100, activation='linear')])
AE_0.add(AutoEncoder(encoder=encoder, decoder=decoder, output_reconstruction=True))
AE_0.compile(loss='mse', optimizer=SGD(lr=0.03, momentum=0.9, decay=0.001, nesterov=True))
AE_0.fit(X, X, batch_size=21, nb_epoch=500, show_accuracy=True)
X has shape (537621, 256). I'm trying to compress the 256-dimensional vectors to 100, then to 70, then to 50 dimensions. I have done this in Lasagne, but autoencoders seem easier to work with in Keras.
Here is the output:
Epoch 1/500
537621/537621 [==============================] - 27s - loss: 0.1339 - acc: 0.0036
Epoch 2/500
537621/537621 [==============================] - 32s - loss: 0.1339 - acc: 0.0036
Epoch 3/500
252336/537621 [=============>................] - ETA: 14s - loss: 0.1339 - acc: 0.0035
And it continues like this, on and on.

It's now fixed on master :) Opening an issue is sometimes the best choice:
https://github.com/fchollet/keras/issues/1604

Related

Logits and labels must have same shape for Keras model

I am new to Keras and have been practicing with resources from the web. Unfortunately, I cannot build a model without it throwing the following error:
ValueError: logits and labels must have the same shape, received ((None, 10) vs (None, 1)).
I have attempted the following:
DF = pd.read_csv("https://raw.githubusercontent.com/EpistasisLab/tpot/master/tutorials/MAGIC%20Gamma%20Telescope/MAGIC%20Gamma%20Telescope%20Data.csv")
X = DF.iloc[:,0:-1]
y = DF.iloc[:,-1]
yBin = np.array([1 if x == 'g' else 0 for x in y ])
scaler = StandardScaler()
X1 = scaler.fit_transform(X)
X_train, X_test, y_train, y_test = train_test_split(X1, yBin, test_size=0.25, random_state=2018)
print(X_train.__class__,X_test.__class__,y_train.__class__,y_test.__class__ )
model=Sequential()
model.add(Dense(6,activation="relu", input_shape=(10,)))
model.add(Dense(10,activation="softmax"))
model.build(input_shape=(None,1))
model.summary()
model.compile(optimizer='rmsprop',
loss='binary_crossentropy',
metrics=['accuracy'])
model.fit(x=X_train,
y=y_train,
epochs=600,
validation_data=(X_test, y_test), verbose=1
)
I have read that my model is likely wrong in terms of its input parameters. What is the correct approach?
When I look at the shape of your data
print(X_train.shape,X_test.shape,y_train.shape,y_test.shape)
I see that X is 10-dimensional and y is 1-dimensional.
Therefore, you need a 10-dimensional input
model.build(input_shape=(None,10))
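and a 1-dimensional output in the last dense layer (with a sigmoid rather than a softmax activation, because a softmax over a single unit always outputs 1)
model.add(Dense(1,activation="sigmoid"))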
The target variable yBin/y_train/y_test is a 1D array (it has shape (None, 1) for a given batch).
Your logits come from the last Dense layer, which has 10 neurons with softmax activation, so it gives 10 outputs for each input, i.e. (batch_size, 10) for each batch. This is represented formally as (None, 10).
To resolve the shape mismatch in question, change the neuron count of the last dense layer to 1 and set the activation function to "sigmoid".
model.add(Dense(1,activation="sigmoid"))
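To see why the single output unit needs a sigmoid rather than a softmax, here is a minimal NumPy sketch. The helper functions are written out for illustration and are not Keras APIs, and the logits and labels are made up. A softmax normalizes across the units of a layer, so with a single unit it returns 1.0 whatever the logit is, while a sigmoid maps the logit to a probability that binary cross-entropy can compare with a 0/1 label.

```python
import numpy as np

def softmax(logits):
    # Normalize across the last axis (the units of the layer).
    shifted = logits - logits.max(axis=-1, keepdims=True)
    exps = np.exp(shifted)
    return exps / exps.sum(axis=-1, keepdims=True)

def sigmoid(logits):
    return 1.0 / (1.0 + np.exp(-logits))

def binary_crossentropy(y_true, y_prob, eps=1e-7):
    # Mean binary cross-entropy; eps guards against log(0).
    y_prob = np.clip(y_prob, eps, 1.0 - eps)
    losses = -(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))
    return float(np.mean(losses))

# Hypothetical logits from a Dense(1) layer for a batch of 4 samples, shape (4, 1).
logits = np.array([[-2.0], [0.0], [0.5], [3.0]])
labels = np.array([[0.0], [0.0], [1.0], [1.0]])

softmax_out = softmax(logits)   # every entry is 1.0: a single unit cannot express class 0
sigmoid_out = sigmoid(logits)   # values strictly between 0 and 1

print(softmax_out.ravel())
print(sigmoid_out.ravel())
print(binary_crossentropy(labels, sigmoid_out))
```

With the softmax output stuck at 1.0, the loss on every sample labelled 0 stays large no matter how the weights change, so training cannot make progress.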
As correctly mentioned by @MSS, you need to use a sigmoid activation function with 1 neuron in the last dense layer, so that the logits match the binary labels (1, 0) of your dataset.
Fixed code:
model=Sequential()
model.add(Dense(6,activation="relu", input_shape=(10,)))
model.add(Dense(1,activation="sigmoid"))
#model.build(input_shape=(None,1))
model.summary()
model.compile(optimizer='rmsprop',loss='binary_crossentropy',metrics=['accuracy'])
model.fit(x=X_train,y=y_train,epochs=10,validation_data=(X_test, y_test),verbose=1)
Output:
Epoch 1/10
446/446 [==============================] - 3s 4ms/step - loss: 0.5400 - accuracy: 0.7449 - val_loss: 0.4769 - val_accuracy: 0.7800
Epoch 2/10
446/446 [==============================] - 2s 4ms/step - loss: 0.4425 - accuracy: 0.7987 - val_loss: 0.4241 - val_accuracy: 0.8095
Epoch 3/10
446/446 [==============================] - 2s 3ms/step - loss: 0.4082 - accuracy: 0.8175 - val_loss: 0.4034 - val_accuracy: 0.8242
Epoch 4/10
446/446 [==============================] - 2s 3ms/step - loss: 0.3934 - accuracy: 0.8286 - val_loss: 0.3927 - val_accuracy: 0.8313
Epoch 5/10
446/446 [==============================] - 2s 4ms/step - loss: 0.3854 - accuracy: 0.8347 - val_loss: 0.3866 - val_accuracy: 0.8320
Epoch 6/10
446/446 [==============================] - 2s 4ms/step - loss: 0.3800 - accuracy: 0.8397 - val_loss: 0.3827 - val_accuracy: 0.8364
Epoch 7/10
446/446 [==============================] - 2s 4ms/step - loss: 0.3762 - accuracy: 0.8411 - val_loss: 0.3786 - val_accuracy: 0.8387
Epoch 8/10
446/446 [==============================] - 2s 3ms/step - loss: 0.3726 - accuracy: 0.8432 - val_loss: 0.3764 - val_accuracy: 0.8404
Epoch 9/10
446/446 [==============================] - 2s 3ms/step - loss: 0.3695 - accuracy: 0.8466 - val_loss: 0.3724 - val_accuracy: 0.8408
Epoch 10/10
446/446 [==============================] - 2s 4ms/step - loss: 0.3665 - accuracy: 0.8478 - val_loss: 0.3698 - val_accuracy: 0.8454
<keras.callbacks.History at 0x7f68ca30f670>

Simple RNN(LSTM) model doesn't make progress

I have a corpus np_final_x of shape (71520, 2, 50) and np_final_y of shape (71520, 1, 50):
https://www.dropbox.com/s/k15dtcak78jaf34/np_final_x_len_2.npy?dl=0
https://www.dropbox.com/s/555lhbdnkl6gmrq/np_final_y_len_2.npy?dl=0
The task is to predict the next word from two words, like this:
I use -> this
you give -> me
Each word is encoded as a 50-dimensional vector.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM,Dropout
from tensorflow.keras.layers import Activation, Dense
from tensorflow.keras.optimizers import Adam
import numpy as np
final_x = np.load('np_final_x_len_2.npy')
final_y = np.load('np_final_y_len_2.npy')
in_out_neurons = 50
n_hidden = 512 # not so much change
#n_hidden = 1 # not so much change
model = Sequential()
model.add(LSTM(n_hidden, batch_input_shape=(None, 2, in_out_neurons), return_sequences=True))
model.add(Dense(in_out_neurons,activation="tanh")) #not so much change
#model.add(Dense(in_out_neurons,activation="sigmoid")) #not so much change
#model.add(Dense(in_out_neurons, activation="relu")) #not so much change
optimizer = Adam(learning_rate=0.0001)
model.compile(loss="mean_squared_error", optimizer=optimizer)
model.summary()
model.fit(
final_x,final_y,
batch_size=400,
epochs=10,
validation_split=0.1
)
However, the loss stays around 0.017~0.019 without much progress, whichever parameters I change.
Moreover, even with n_hidden = 1, the result doesn't change.
So I guess something is fundamentally wrong.
I'd appreciate any help or hints. Thank you.
Epoch 1/10
161/161 [==============================] - 9s 46ms/step - loss: 0.0195 - val_loss: 0.0192
Epoch 2/10
161/161 [==============================] - 8s 49ms/step - loss: 0.0191 - val_loss: 0.0188
Epoch 3/10
161/161 [==============================] - 8s 52ms/step - loss: 0.0187 - val_loss: 0.0186
Epoch 4/10
161/161 [==============================] - 11s 68ms/step - loss: 0.0185 - val_loss: 0.0184
Epoch 5/10
161/161 [==============================] - 12s 77ms/step - loss: 0.0184 - val_loss: 0.0183
Epoch 6/10
161/161 [==============================] - 13s 83ms/step - loss: 0.0183 - val_loss: 0.0183
Epoch 7/10
161/161 [==============================] - 14s 85ms/step - loss: 0.0183 - val_loss: 0.0182
You want to predict the next word from two word embeddings, so it doesn't make sense to set return_sequences to True in the LSTM: that makes the layer emit one output per input step instead of a single prediction. I played a bit with your data and model; my best validation MSE is 0.015. The limitation comes from the data and from the simplistic setup of predicting the next word from two consecutive words (would a human be able to do that?). Also, how do you get the 50-dimensional word embeddings? It would be interesting to be able to go back to the words and see what kind of prediction the model produces.
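A minimal NumPy sketch of the shape problem, using random stand-in arrays instead of the real corpus (the variable names and the batch size of 4 are made up for illustration). With return_sequences=True the model outputs one 50-dimensional vector per input step, shape (batch, 2, 50), while the targets have shape (batch, 1, 50). The subtraction inside a mean squared error then broadcasts in NumPy, and apparently also in the question's Keras run since fit did not raise, so both time steps are scored against the same target word. With return_sequences=False the LSTM returns only its last output, and the prediction has shape (batch, 50), which matches the targets once their length-1 axis is squeezed out.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, steps, dim = 4, 2, 50

# Stand-ins for model outputs and targets (random, not the real corpus).
pred_seq = rng.normal(size=(batch, steps, dim))   # return_sequences=True
pred_last = pred_seq[:, -1, :]                    # return_sequences=False keeps only the last step
target = rng.normal(size=(batch, 1, dim))         # same layout as np_final_y

# (batch, 2, 50) - (batch, 1, 50) broadcasts: each step is scored against the same target.
mse_seq = np.mean((pred_seq - target) ** 2)

# Squeezing the length-1 axis gives targets of shape (batch, 50), matching pred_last.
target_2d = target.squeeze(axis=1)
mse_last = np.mean((pred_last - target_2d) ** 2)

print((pred_seq - target).shape)   # (4, 2, 50)
print(pred_last.shape, target_2d.shape)
```

In the question's code this would roughly correspond to dropping return_sequences=True from the LSTM layer and training on final_y.squeeze(axis=1).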

Validation accuracy (val_acc) does not change over the epochs

Value of val_acc does not change over the epochs.
Summary:
I'm using a pre-trained (ImageNet) VGG16 from Keras;
from keras.applications import VGG16
conv_base = VGG16(weights='imagenet', include_top=True, input_shape=(224, 224, 3))
The dataset is from ISBI 2016 (ISIC): a set of 900 skin lesion images used for binary classification (malignant or benign) for training and validation, plus 379 images for testing;
I use the top dense layers of VGG16 except the last one (which classifies over 1000 classes), and use a binary output with a sigmoid activation;
conv_base.layers.pop() # Remove last one
conv_base.trainable = False
model = models.Sequential()
model.add(conv_base)
model.add(layers.Dense(1, activation='sigmoid'))
Unlock the dense layers by setting them to trainable;
Fetch the data, which is in two different folders, one named "malignant" and the other "benign", within the "training data" folder;
from keras.preprocessing.image import ImageDataGenerator
from keras import optimizers
folder = 'ISBI2016_ISIC_Part3_Training_Data'
batch_size = 20
full_datagen = ImageDataGenerator(
rescale=1./255,
#rotation_range=40,
width_shift_range=0.2,
height_shift_range=0.2,
shear_range=0.2,
zoom_range=0.2,
validation_split = 0.2, # 20% validation
horizontal_flip=True)
train_generator = full_datagen.flow_from_directory( # Found 721 images belonging to 2 classes.
folder,
target_size=(224, 224),
batch_size=batch_size,
subset = 'training',
class_mode='binary')
validation_generator = full_datagen.flow_from_directory( # Found 179 images belonging to 2 classes.
folder,
target_size=(224, 224),
batch_size=batch_size,
subset = 'validation',
shuffle=False,
class_mode='binary')
model.compile(loss='binary_crossentropy',
optimizer=optimizers.SGD(lr=0.001), # High learning rate
metrics=['accuracy'])
history = model.fit_generator(
train_generator,
steps_per_epoch=721 // batch_size+1,
epochs=20,
validation_data=validation_generator,
validation_steps=180 // batch_size+1,
)
Then I fine-tune it with 100 more epochs and lower learning rate, setting the last convolutional layer to trainable.
I've tried many things such as:
Changing the optimizer (RMSprop, Adam and SGD);
Removing the top dense layers of the pre-trained VGG16 and adding mine;
model.add(layers.Flatten())
model.add(layers.Dense(128, activation='relu'))
model.add(layers.Dense(64, activation='relu'))
model.add(layers.Dense(1, activation='sigmoid'))
Shuffle=True in validation_generator;
Changing batch size;
Varying the learning rate (0.001, 0.0001, 2e-5).
The results are similar to the following:
Epoch 1/100
37/37 [==============================] - 33s 900ms/step - loss: 0.6394 - acc: 0.7857 - val_loss: 0.6343 - val_acc: 0.8101
Epoch 2/100
37/37 [==============================] - 30s 819ms/step - loss: 0.6342 - acc: 0.8107 - val_loss: 0.6342 - val_acc: 0.8101
Epoch 3/100
37/37 [==============================] - 30s 822ms/step - loss: 0.6324 - acc: 0.8188 - val_loss: 0.6341 - val_acc: 0.8101
Epoch 4/100
37/37 [==============================] - 31s 840ms/step - loss: 0.6346 - acc: 0.8080 - val_loss: 0.6341 - val_acc: 0.8101
Epoch 5/100
37/37 [==============================] - 31s 833ms/step - loss: 0.6395 - acc: 0.7843 - val_loss: 0.6341 - val_acc: 0.8101
Epoch 6/100
37/37 [==============================] - 31s 829ms/step - loss: 0.6334 - acc: 0.8134 - val_loss: 0.6340 - val_acc: 0.8101
Epoch 7/100
37/37 [==============================] - 31s 834ms/step - loss: 0.6334 - acc: 0.8134 - val_loss: 0.6340 - val_acc: 0.8101
Epoch 8/100
37/37 [==============================] - 31s 829ms/step - loss: 0.6342 - acc: 0.8093 - val_loss: 0.6339 - val_acc: 0.8101
Epoch 9/100
37/37 [==============================] - 31s 849ms/step - loss: 0.6330 - acc: 0.8147 - val_loss: 0.6339 - val_acc: 0.8101
Epoch 10/100
37/37 [==============================] - 30s 812ms/step - loss: 0.6332 - acc: 0.8134 - val_loss: 0.6338 - val_acc: 0.8101
Epoch 11/100
37/37 [==============================] - 31s 839ms/step - loss: 0.6338 - acc: 0.8107 - val_loss: 0.6338 - val_acc: 0.8101
Epoch 12/100
37/37 [==============================] - 30s 807ms/step - loss: 0.6334 - acc: 0.8120 - val_loss: 0.6337 - val_acc: 0.8101
Epoch 13/100
37/37 [==============================] - 32s 852ms/step - loss: 0.6334 - acc: 0.8120 - val_loss: 0.6337 - val_acc: 0.8101
Epoch 14/100
37/37 [==============================] - 31s 826ms/step - loss: 0.6330 - acc: 0.8134 - val_loss: 0.6336 - val_acc: 0.8101
Epoch 15/100
37/37 [==============================] - 32s 854ms/step - loss: 0.6335 - acc: 0.8107 - val_loss: 0.6336 - val_acc: 0.8101
And goes on the same way, with constant val_acc = 0.8101.
When I use the test set after finishing training, the confusion matrix gives me 100% correct on benign lesions (304) and 0% on malignant, like so:
Confusion Matrix
[[304 0]
[ 75 0]]
What could I be doing wrong?
Thank you.
VGG16 was trained on centered RGB data. Your ImageDataGenerator does not enable featurewise_center, however, so you're feeding your net uncentered data (only rescaled to [0, 1] by rescale=1./255). The VGG convolutional base can't extract any meaningful information from this, so your net ends up universally guessing the more common class.
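As a rough illustration of what "centered" input means here, the sketch below reimplements in NumPy the preprocessing that, to my understanding, keras.applications.vgg16.preprocess_input applies: reorder RGB to BGR and subtract the per-channel ImageNet means, with no rescaling to [0, 1]. The function name vgg16_style_preprocess and the random test images are made up for illustration.

```python
import numpy as np

# Per-channel ImageNet means in BGR order, as used by the original VGG models.
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68])

def vgg16_style_preprocess(rgb_batch):
    """Hypothetical helper: RGB images with values in [0, 255] -> mean-centered BGR."""
    bgr = rgb_batch[..., ::-1].astype("float64")  # reverse the channel axis: RGB -> BGR
    return bgr - IMAGENET_BGR_MEAN

# Random stand-in for a batch of two 224x224 RGB images.
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(2, 224, 224, 3))

centered = vgg16_style_preprocess(images)
rescaled = images / 255.0  # what rescale=1./255 produces

print(centered.min(), centered.max())  # spans negative and positive values
print(rescaled.min(), rescaled.max())  # stays within [0, 1]
```

In a real pipeline one would more likely pass Keras' own preprocess_input as the preprocessing_function argument of ImageDataGenerator, in place of rescale=1./255, rather than hand-roll the transformation.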
In general, when you see this type of problem (your net exclusively guessing the most common class), it means that there's something wrong with your data, not with the net. It can be caused by a preprocessing step like this or by a significant portion of "poisoned" anomalous training data that actively harms the training process.
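The confusion matrix from the question shows this pattern directly. A small NumPy sketch follows; it assumes that rows are true classes and columns are predicted classes, with benign first, which is consistent with the counts of 304 and 75 quoted in the question.

```python
import numpy as np

# Confusion matrix from the question: rows = true class, columns = predicted class
# (assumed order: benign, malignant).
cm = np.array([[304, 0],
               [75, 0]])

accuracy = np.trace(cm) / cm.sum()
recall_per_class = np.diag(cm) / cm.sum(axis=1)
majority_baseline = cm.sum(axis=1).max() / cm.sum()

print(f"accuracy          = {accuracy:.4f}")
print(f"recall per class  = {recall_per_class}")
print(f"majority baseline = {majority_baseline:.4f}")
```

Accuracy equals the majority-class baseline and the recall for the malignant class is 0, so the accuracy figure alone says nothing about whether the network has learned anything. The constant val_acc of 0.8101 is likewise consistent with always predicting one class on the 179 validation images (145/179 ≈ 0.8101), although the question does not state the validation class counts.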

How do I train a neural network with an array of list using keras in python

I'm trying to train a neural network using tensorflow.keras, but I don't understand how to train it with a NumPy array of lists (in Python 3).
I have tried changing the input shape of the layers, but I don't really understand how it works.
import tensorflow as tf
from tensorflow import keras
import numpy as np
# Create the array of data
train_data = [[1.0,2.0,3.0],[4.0,5.0,6.0]]
train_data_np = np.asarray(train_data)
train_label = [[1,2,3],[4,5,6]]
train_label_np = np.asarray(train_data)
### Build the model
model = keras.Sequential([
keras.layers.Dense(3,input_shape =(3,2)),
keras.layers.Dense(3,activation=tf.nn.sigmoid)
])
model.compile(optimizer='sgd',loss='sparse_categorical_crossentropy',metrics=['accuracy'])
#Train the model
model.fit(train_data_np,train_label_np,epochs=10)
The error is "Error when checking input: expected dense_input to have 3 dimensions, but got array with shape (2, 3)" when model.fit is called.
When defining a Keras model, you have to provide the input shape to the first layer of the model.
For example, if your training data has n rows and m features, i.e. shape (n, m), you have to set the input_shape of the first Dense layer to (m,), i.e. the model should expect m features coming into it.
Now, coming to your toy data:
train_data = [[1.0,2.0,3.0],[4.0,5.0,6.0]]
train_data_np = np.asarray(train_data)
train_label = [[1,2,3],[4,5,6]]
train_label_np = np.asarray(train_label)
Here, train_data_np.shape is (2, 3), i.e. 2 rows and 3 features, so you have to define the model like this:
model = keras.Sequential([
keras.layers.Dense(3,input_shape =(3, )),
keras.layers.Dense(3,activation=tf.nn.sigmoid)
])
Now, your labels are [[1,2,3],[4,5,6]]. In a normal 3-class classification task these would be one-hot vectors of 1s and 0s. But let's leave that aside, as this is a toy example to check Keras.
If the target label, i.e. y_train, is one-hot, then you have to use the categorical_crossentropy loss instead of sparse_categorical_crossentropy.
So you can compile and train the model like this:
model.compile(optimizer='sgd',loss='categorical_crossentropy',metrics=['accuracy'])
#Train the model
model.fit(train_data_np,train_label_np,epochs=10)
Epoch 1/10
2/2 [==============================] - 0s 61ms/step - loss: 11.5406 - acc: 0.0000e+00
Epoch 2/10
2/2 [==============================] - 0s 0us/step - loss: 11.4970 - acc: 0.5000
Epoch 3/10
2/2 [==============================] - 0s 0us/step - loss: 11.4664 - acc: 0.5000
Epoch 4/10
2/2 [==============================] - 0s 498us/step - loss: 11.4430 - acc: 0.5000
Epoch 5/10
2/2 [==============================] - 0s 496us/step - loss: 11.4243 - acc: 1.0000
Epoch 6/10
2/2 [==============================] - 0s 483us/step - loss: 11.4087 - acc: 1.0000
Epoch 7/10
2/2 [==============================] - 0s 1ms/step - loss: 11.3954 - acc: 1.0000
Epoch 8/10
2/2 [==============================] - 0s 997us/step - loss: 11.3840 - acc: 1.0000
Epoch 9/10
2/2 [==============================] - 0s 1ms/step - loss: 11.3740 - acc: 1.0000
Epoch 10/10
2/2 [==============================] - 0s 995us/step - loss: 11.3653 - acc: 1.0000
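The difference between the two losses mentioned above is only in how the labels are encoded. A minimal NumPy sketch with made-up probabilities and labels (none of these numbers come from the question): sparse_categorical_crossentropy expects integer class indices of shape (n,), categorical_crossentropy expects one-hot rows of shape (n, classes), and for the same predictions both encodings give the same per-sample loss.

```python
import numpy as np

# Made-up predicted class probabilities for 2 samples and 3 classes (rows sum to 1).
probs = np.array([[0.7, 0.2, 0.1],
                  [0.1, 0.3, 0.6]])

sparse_labels = np.array([0, 2])            # integer class indices, shape (2,)
one_hot_labels = np.eye(3)[sparse_labels]   # one-hot rows, shape (2, 3)

# Sparse form: pick the probability of the true class by index.
sparse_ce = -np.log(probs[np.arange(len(sparse_labels)), sparse_labels])

# One-hot form: the sum over classes keeps only the true-class term.
categorical_ce = -np.sum(one_hot_labels * np.log(probs), axis=1)

print(one_hot_labels)
print(sparse_ce, categorical_ce)
```

The toy labels [[1,2,3],[4,5,6]] fit neither encoding, which is probably why the loss in the log above stays large even though the reported accuracy reaches 1.0.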

Keras poor performance (Loss and optimization functions?)

I've spent the last 2 weeks struggling with my NN. The aim is to predict the trip durations of taxi rides based on several
numerical variables (latitudes and longitudes)
categorical variables (numerically encoded) (hour of the day, day of the week, etc)
Here is the simplest version
X_train = trainData.as_matrix(columns=["fareDistance","hour","day","pickup_longitude","pickup_latitude","dropoff_longitude","dropoff_latitude"])
Y_train = np.array(trainData["trip_duration"])
model = Sequential()
model.add(Dense(32, input_dim=7, activation='linear'))
model.add(Dense(12, activation='linear'))
model.add(Dense(1, activation='linear'))
model.compile(loss='mean_absolute_percentage_error', optimizer='adagrad', metrics=['accuracy'])
model.summary()
model.fit(X_train, Y_train, epochs=10, validation_split=0.2)
I also tried merging two different models, one for the numerical variables and one for the categorical ones, but it didn't change a thing. Depending on the combination of loss and optimization functions, either the loss and accuracy stay about the same (acc. 0.0016) or the accuracy is not even non-zero.
A friend of mine replicated the NN in pure TensorFlow and got the same kind of results:
Train on 233383 samples, validate on 58346 samples
Epoch 1/20 233383/233383 [==============================] - 15s - loss: 45.9550 - acc: 0.0016 - val_loss: 46.2514 - val_acc: 0.0014
Epoch 2/20 233383/233383 [==============================] - 15s - loss: 45.8675 - acc: 0.0014 - val_loss: 46.2675 - val_acc: 0.0015
Epoch 3/20 233383/233383 [==============================] - 15s - loss: 45.8465 - acc: 0.0015 - val_loss: 46.2131 - val_acc: 0.0013
Epoch 4/20 233383/233383 [==============================] - 15s - loss: 45.8283 - acc: 0.0014 - val_loss: 46.2478 - val_acc: 0.0016
Epoch 5/20 233383/233383 [==============================] - 15s - loss: 45.8214 - acc: 0.0015 - val_loss: 46.2043 - val_acc: 0.0013
Epoch 6/20 233383/233383 [==============================] - 14s - loss: 45.8122 - acc: 0.0014 - val_loss: 46.2526 - val_acc: 0.0014
Epoch 7/20 233383/233383 [==============================] - 12s - loss: 45.7990 - acc: 0.0015 - val_loss: 46.1821 - val_acc: 0.0014
Epoch 8/20 233383/233383 [==============================] - 12s - loss: 45.7964 - acc: 0.0016 - val_loss: 46.1761 - val_acc: 0.0013
Epoch 9/20 233383/233383 [==============================] - 11s - loss: 45.7898 - acc: 0.0015 - val_loss: 46.1804 - val_acc: 0.0016
Am I missing something -- like something big, obvious -- which would explain why any attempt to change activation, loss or optimization function ends up doing the same?
Thanks in advance
D.
Try this:
X_train = trainData[["fareDistance","hour","day","pickup_longitude","pickup_latitude","dropoff_longitude","dropoff_latitude"]].values
Y_train = np.array(trainData["trip_duration"])
model = Sequential()
model.add(Dense(32, input_dim=7, activation='elu'))
model.add(Dense(12, activation='elu'))
model.add(Dense(1, kernel_initializer='normal'))
model.compile(loss='mean_absolute_percentage_error', optimizer='rmsprop')
model.summary()
model.fit(X_train, Y_train, epochs=10, validation_split=0.2)
You can also try the Adam optimizer:
model.compile(loss='mean_absolute_percentage_error', optimizer='adam')
Update:
If the code above doesn't help, it means your input data is either not normalized or very dirty.
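A minimal sketch of what normalizing the inputs could look like, using NumPy and a few made-up rows in place of trainData (the values below are invented; only the column meanings follow the question). Each feature column is shifted to zero mean and scaled to unit variance, using statistics computed on the training rows only.

```python
import numpy as np

# Made-up rows: fareDistance, hour, day, pickup_longitude, pickup_latitude
X_train = np.array([
    [1.2,  8.0, 1.0, -73.98, 40.75],
    [5.4, 17.0, 3.0, -73.95, 40.78],
    [0.8, 23.0, 6.0, -74.00, 40.72],
    [9.1, 12.0, 0.0, -73.87, 40.77],
])

# Column statistics from the training data only.
mean = X_train.mean(axis=0)
std = X_train.std(axis=0)
std[std == 0] = 1.0  # avoid dividing by zero for constant columns

X_train_scaled = (X_train - mean) / std

# New data must be transformed with the same training statistics.
X_new = np.array([[2.0, 9.0, 2.0, -73.96, 40.76]])
X_new_scaled = (X_new - mean) / std

print(X_train_scaled.mean(axis=0).round(6))
print(X_train_scaled.std(axis=0).round(6))
```

This is what StandardScaler from scikit-learn, used in one of the other questions on this page, does through fit_transform and transform. Without it, columns such as longitude (around -74) and hour (0 to 23) reach the first Dense layer on very different scales.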
