I am doing multi-class classification using a Bi-LSTM and GloVe embeddings. When I train my model and call model.predict, I get results that don't look right to me (below); the values don't seem to be between 0 and 1. Can anyone help me, please?
I also one-hot encoded the labels.
3916/3916 [==============================] - 17s 4ms/step
[[9.9723792e-01 1.6954101e-03 1.0665554e-03]
[1.6794224e-01 8.6485274e-02 7.4557245e-01]
[9.4370516e-03 1.0848863e-03 9.8947805e-01]
...
[1.3264662e-02 9.7078091e-01 1.5954463e-02]
[1.2019513e-02 9.8711687e-01 8.6356810e-04]
[8.1863362e-01 1.5828104e-01 2.3085352e-02]]
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM,Dense, Dropout,Bidirectional
from tensorflow.keras.layers import SpatialDropout1D
from tensorflow.keras.layers import Embedding
from tensorflow.keras.preprocessing.text import Tokenizer
embedding_vector_length = 100
model_2 = Sequential()
model_2.add(Embedding(len(tokenizer.word_index) + 1, embedding_vector_length,
                      input_length=409, name="Bi-LSTM"))
model_2.add(SpatialDropout1D(0.3))
model_2.add(Bidirectional(LSTM(64, return_sequences=False, recurrent_dropout=0.4)))
model_2.add(Dropout(0.5))
model_2.add(Dense(3,activation='softmax'))
model_2.compile(loss='categorical_crossentropy', optimizer='adam',
                metrics=['accuracy'])
print(model_2.summary())
model_2.layers[0].set_weights([embedding_matrix])
model_2.layers[0].trainable = False
print(model_2.summary())
from keras.callbacks import EarlyStopping
es = EarlyStopping(monitor='val_loss', mode='min', verbose=1, patience=3)
history_2 = model_2.fit(x_train, y_train,
                        batch_size=400,
                        epochs=30,
                        validation_data=(x_val, y_val),
                        callbacks=[es])
# We save this model so that we can use it in our own web app
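# (A sketch of the save call implied by the comment above; the file name is only an example.)
model_2.save('model_2.h5')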
If the printed matrix is your model.predict() result, then the values are between 0 and 1; you just need to take the exponent into account, since the numbers are printed in scientific notation: 9.9723792e-01 means 0.99723792.
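For example, 9.9723792e-01 is 0.99723792 and 1.6954101e-03 is 0.0016954101, so each row sums to roughly 1. A quick sanity check, and the usual way to turn the probabilities into class labels (a sketch; x_test stands for whatever array you called predict on):
import numpy as np

preds = model_2.predict(x_test)          # rows are softmax outputs over the 3 classes
print(preds[0].sum())                    # ~1.0: each row already sums to one
pred_classes = np.argmax(preds, axis=1)  # index of the most probable class per sample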
I've been trying to create a model that recognizes different singing techniques. I have gotten good results, but I want to run different tests with different optimizers, layers, etc. However, I can't get reproducible results. Running this model training twice:
num_epochs = 100
batch_size = 128
history = modelo.fit(X_train_f, Y_train, validation_data=(X_test_f,Y_test), epochs=num_epochs, batch_size=batch_size, verbose=2)
I can get 25% accuracy on the first run and then 34% on the second. Then if I change the optimizer from "sgd" to "adam", I get 99%. If I go back to the previous "sgd" optimizer that got me 34% on the second run, I get 100% or something crazy like that. I don't understand why.
I've tried many things I've read in similar questions. The following lines show how I am trying to make my code reproducible; they are actually the first lines of my whole code:
import numpy as np
import tensorflow as tf
import random as rn
import os
#https://stackoverflow.com/questions/57305909/tensorflow-keras-reproducibility-problem-on-google-colab
os.environ['PYTHONHASHSEED']=str(5)
np.random.seed(5)
rn.seed(12345)
session_conf = tf.compat.v1.ConfigProto(intra_op_parallelism_threads=1,
                                        inter_op_parallelism_threads=1)
tf.compat.v1.set_random_seed(1234)
sess = tf.compat.v1.Session(graph=tf.compat.v1.get_default_graph(), config=session_conf)
tf.compat.v1.keras.backend.set_session(sess)
The question is: what am I doing wrong in the code above, such that it is not working (as I mentioned)?
Here's where I create the training sets:
from keras.datasets import mnist
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers.convolutional import Conv1D, MaxPooling1D
from keras.layers.core import Dense, Flatten
from keras.layers import BatchNormalization,Activation
from keras.optimizers import SGD, Adam
from sklearn.model_selection import train_test_split
X_train, X_test, Y_train, Y_test = train_test_split(X,Y, test_size=0.2, random_state=2)
My model:
from tensorflow.keras import layers
from tensorflow.keras import initializers
input_dim = X_train_f.shape[1]
output_dim = Y_train.shape[1]
modelo = Sequential()
modelo.add(Conv1D(filters=6, kernel_initializer=initializers.glorot_uniform(seed=5), kernel_size=5, activation='relu', input_shape=(40, 1))) # 6
modelo.add(MaxPooling1D(pool_size=2))
modelo.add(Conv1D(filters=16, kernel_initializer=initializers.glorot_uniform(seed=5), kernel_size=5, activation='relu')) # 16
modelo.add(MaxPooling1D(pool_size=2))
modelo.add(Flatten())
modelo.add(Dense(120, kernel_initializer=initializers.glorot_uniform(seed=5), activation='relu')) # 120
modelo.add(Dense(84, kernel_initializer=initializers.glorot_uniform(seed=5), activation='relu')) # 84
modelo.add(Dense(nclases, kernel_initializer=initializers.glorot_uniform(seed=5), activation='softmax'))
sgd = SGD(lr=0.1)
#modelo.compile(loss='categorical_crossentropy',
#               optimizer='adam',
#               metrics=['accuracy'])
modelo.compile(loss='categorical_crossentropy',
               optimizer=sgd,
               metrics=['accuracy'])
modelo.summary()
modelo.input_shape
This is a normal situation. Adam is a much more powerful optimizer than SGD: it implicitly performs coordinate-wise gradient clipping and can therefore, unlike SGD, tackle heavy-tailed gradient noise.
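If you want to see how much of the difference comes from clipping alone, Keras optimizers accept a clipnorm argument; here is a minimal sketch against the model above (the clip value 1.0 is just an example):
from keras.optimizers import SGD

# SGD with global gradient-norm clipping; 1.0 is an arbitrary example value
sgd_clipped = SGD(lr=0.1, clipnorm=1.0)
modelo.compile(loss='categorical_crossentropy',
               optimizer=sgd_clipped,
               metrics=['accuracy'])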
I have a folder called train, and in it three separate folders: "Covid", "Pneumonia", and "Healthy". I have created a VGG19 model using transfer learning, and I would like help creating a confusion matrix for my test data. I split my training data 75%/25%. I have tried everything to create the confusion matrix and calculate the F1 score, with no luck. It would be really great if someone could provide the code, since I am new to this.
#import needed packages
import numpy as np
import tensorflow as tf
import keras
from keras import backend as K
from keras.optimizers import Adam
from keras.metrics import categorical_crossentropy
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Model
from keras.layers import Dense, GlobalAveragePooling2D, Dropout, SeparableConv2D, BatchNormalization, Activation
from keras.applications.vgg19 import VGG19
from keras.applications.resnet import ResNet101
from keras.layers import Input, Lambda, Flatten
import matplotlib.pyplot as plt
# dataset has 3 classes
num_class = 3
# Base model without Fully connected Layers
base_model = VGG19(include_top=False, weights='imagenet', input_shape=(224,224,3))
# don't train existing weights
for layer in base_model.layers:
    layer.trainable = False
x=base_model.output
x = Flatten()(base_model.output)
## 3 classes: COVID, PNEUMONIA, NORMAL
preds=Dense(num_class, activation='softmax')(x) #final layer with softmax activation
model=Model(inputs=base_model.input,outputs=preds)
##check the model
model.summary()
# tell the model what cost and optimization method to use
model.compile(
    loss='categorical_crossentropy',
    optimizer='adam',
    metrics=['accuracy']
)
##Preparing Data
train_datagen = ImageDataGenerator(preprocessing_function=keras.applications.vgg16.preprocess_input,
                                   validation_split=0.25)
train_generator = train_datagen.flow_from_directory('C:/Users/prani/OneDrive/Desktop/Masters project/chest_xray/Train-250/',
                                                    target_size=(224,224),
                                                    batch_size=64,
                                                    class_mode='categorical',
                                                    subset='training')
validation_generator = train_datagen.flow_from_directory(
    'C:/Users/prani/OneDrive/Desktop/Masters project/chest_xray/Train-250/',  # same directory as training data
    target_size=(224,224),
    batch_size=64,
    class_mode='categorical',
    subset='validation')  # set as validation data
##Training
## Setting hyperparameters
epochs = 50
learning_rate = 0.0005
decay_rate = learning_rate / epochs
opt = Adam(lr=learning_rate, beta_1=0.9, beta_2=0.999, decay=decay_rate, amsgrad=False)
model.compile(optimizer=opt,loss='categorical_crossentropy',metrics=['accuracy'])
##Train
step_size_train = train_generator.n/train_generator.batch_size
step_size_val = validation_generator.samples // validation_generator.batch_size
r = model.fit_generator(generator=train_generator,
                        steps_per_epoch=step_size_train,
                        validation_data=validation_generator,
                        validation_steps=step_size_val,
                        epochs=50)
## Confusion matrix (sample code, but it is not working: it produces neither an error nor results)
from sklearn.metrics import classification_report, confusion_matrix
batch_size=64
Y_pred = model.predict_generator(validation_generator,186)
y_pred = np.argmax(Y_pred, axis=1)
print('Confusion Matrix')
print(confusion_matrix(validation_generator.classes, y_pred))
print('Classification Report')
target_names = ['COVID','NORMAL','PNEUMONIA']
print(classification_report(validation_generator.classes, y_pred, target_names=target_names))
You could look at this page for information on how to code a confusion matrix and other metrics:
https://www.tensorflow.org/tutorials/structured_data/imbalanced_data
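As a sketch of the usual scikit-learn route with the generators above: the catch is that validation_generator.classes lists labels in file order, which only matches the prediction order if the generator does not shuffle, so a separate non-shuffling generator is used here (path and batch size taken from your question):
import numpy as np
from sklearn.metrics import classification_report, confusion_matrix

# Non-shuffling generator so .classes lines up with the prediction order
eval_generator = train_datagen.flow_from_directory(
    'C:/Users/prani/OneDrive/Desktop/Masters project/chest_xray/Train-250/',
    target_size=(224, 224),
    batch_size=64,
    class_mode='categorical',
    subset='validation',
    shuffle=False)

steps = int(np.ceil(eval_generator.samples / 64))
Y_pred = model.predict_generator(eval_generator, steps=steps)
y_pred = np.argmax(Y_pred, axis=1)

print(confusion_matrix(eval_generator.classes, y_pred))
# classification_report includes per-class precision, recall and F1
print(classification_report(eval_generator.classes, y_pred,
                            target_names=list(eval_generator.class_indices.keys())))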
I am building an LSTM for text classification with Keras and am playing around with different input sentences to get a sense of what is happening, but I'm getting strange outputs. For example:
Sentence 1 = "On Tuesday, Ms. [Mary] Barra, 51, completed a remarkable personal odyssey when she was named as the next chief executive of G.M.--and the first woman to ascend to the top job at a major auto company."
Sentence 2 = "On Tuesday, Ms. [Mary] Barra, 51, was named as the next chief executive of G.M.--and the first woman to ascend to the top job at a major auto company."
The model predicts the class "objective" (0) with output 0.4242 when Sentence 2 is the only element in the input array. It predicts "subjective" (1) with output 0.9061 for Sentence 1. If both are fed (as separate strings) in the same input array, both are classified as "subjective" (1), but Sentence 1 outputs 0.8689 and Sentence 2 outputs 0.5607. It seems as though they are affecting each other's outputs. It does not matter which index in the input array each sentence has.
Here is the code:
max_length = 500
from keras.preprocessing.sequence import pad_sequences
from keras.preprocessing.text import Tokenizer
tokenizer = Tokenizer(num_words=5000, lower=True,split=' ')
tokenizer.fit_on_texts(dataset["sentence"].values)
#print(tokenizer.word_index) # To see the dictionary
X = tokenizer.texts_to_sequences(dataset["sentence"].values)
X = pad_sequences(X, maxlen=max_length)
y = np.array(dataset["label"])
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.1, random_state=0)
import numpy
from keras.datasets import imdb
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import LSTM
from keras.layers.embeddings import Embedding
from keras.preprocessing import sequence
# fix random seed for reproducibility
numpy.random.seed(7)
X_train = sequence.pad_sequences(X_train, maxlen=max_length)
X_test = sequence.pad_sequences(X_test, maxlen=max_length)
embedding_vector_length = 32
###LSTM
from keras.layers.convolutional import Conv1D
from keras.layers.convolutional import MaxPooling1D
model = Sequential()
model.add(Embedding(5000, embedding_vector_length, input_length=max_length))
model.add(Conv1D(filters=32, kernel_size=3, padding='same', activation='sigmoid'))
model.add(MaxPooling1D(pool_size=2))
model.add(LSTM(100))
model.add(Dense(1, activation='sigmoid'))
from keras import optimizers
sgd = optimizers.SGD(lr=0.9)
model.compile(loss='binary_crossentropy', optimizer=sgd, metrics=['accuracy'])
print(model.summary())
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=3, batch_size=64)
# save model
model.save('LSTM.h5')
I then reloaded the model in a separate script and am feeding it hard-coded sentences:
model = load_model('LSTM.h5')
max_length = 500
from keras.preprocessing.sequence import pad_sequences
from keras.preprocessing.text import Tokenizer
tokenizer = Tokenizer(num_words=5000, lower=True,split=' ')
tokenizer.fit_on_texts(article_sentences)
#print(tokenizer.word_index) # To see the dictionary
X = tokenizer.texts_to_sequences(article_sentences)
X = pad_sequences(X, maxlen=max_length)
prediction = model.predict(X)
print(prediction)
for i in range(len(X)):
    print('%s\nLabel:%d' % (article_sentences[i], prediction[i]))
I set the random seed before training the model and in the script where I load the model. Am I missing something when loading the model? Should I be arranging my data differently?
When predicting, you always fit the tokenizer on the prediction text. Print out X; you may spot the difference between the two prediction runs. The tokenizer used for prediction should be the same as the one used in training, but in your code you fit the tokenizer on the prediction text.
You're fitting a new tokenizer, which is different from the tokenizer your model was trained on. So the first thing you need to do is pickle the tokenizer at training time, and at prediction time don't fit it again; load the pickled tokenizer and use it directly, as in X = tokenizer.texts_to_sequences(article_sentences). There is no need to fit again. I hope it helps!
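A minimal sketch of that save/load flow (the file name tokenizer.pickle is just an example):
import pickle
from keras.preprocessing.sequence import pad_sequences

# In the training script, right after tokenizer.fit_on_texts(...):
with open('tokenizer.pickle', 'wb') as f:
    pickle.dump(tokenizer, f)

# In the prediction script, load the saved tokenizer instead of fitting a new one:
with open('tokenizer.pickle', 'rb') as f:
    tokenizer = pickle.load(f)
X = tokenizer.texts_to_sequences(article_sentences)
X = pad_sequences(X, maxlen=max_length)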
I'm trying to train a deep classifier in Keras, both with and without pretraining of the hidden layers via stacked autoencoders. My problem is that the pretraining seems to drastically degrade performance (i.e. if pretrain is set to False in the code below, the training error of the final classification layer converges much faster). This seems completely outrageous to me, given that pretraining should only initialize the weights of the hidden layers; I don't see how that could completely kill the model's performance even if the initialization does not work very well. I cannot include the specific dataset I used, but the effect should occur for any appropriate dataset (e.g. MNIST). What is going on here and how can I fix it?
EDIT: the code is now reproducible with the MNIST data; the final lines plot the loss curves, and the drop in the loss is significantly smaller with pre-training.
I have also slightly modified the code and added sample learning curves below:
from functools import partial
import matplotlib.pyplot as plt
from keras.datasets import mnist
from keras.layers import Dense
from keras.models import Sequential
from keras.optimizers import SGD
from keras.regularizers import l2
from keras.utils import to_categorical
(inputs_train, targets_train), _ = mnist.load_data()
inputs_train = inputs_train[:1000].reshape(1000, 784)
targets_train = to_categorical(targets_train[:1000])
hidden_nodes = [256] * 4
learning_rate = 0.01
regularization = 1e-6
epochs = 30
def train_model(pretrain):
    model = Sequential()
    layer = partial(Dense,
                    activation='sigmoid',
                    kernel_initializer='random_normal',
                    kernel_regularizer=l2(regularization))
    for i, hn in enumerate(hidden_nodes):
        kwargs = dict(units=hn, name='hidden_{}'.format(i + 1))
        if i == 0:
            kwargs['input_dim'] = inputs_train.shape[1]
        model.add(layer(**kwargs))
    if pretrain:
        # train autoencoders
        inputs_train_ = inputs_train.copy()
        for i, hn in enumerate(hidden_nodes):
            autoencoder = Sequential()
            autoencoder.add(layer(units=hn,
                                  input_dim=inputs_train_.shape[1],
                                  name='hidden'))
            autoencoder.add(layer(units=inputs_train_.shape[1],
                                  name='decode'))
            autoencoder.compile(optimizer=SGD(lr=learning_rate, momentum=0.9),
                                loss='binary_crossentropy')
            autoencoder.fit(inputs_train_,
                            inputs_train_,
                            batch_size=32,
                            epochs=epochs,
                            verbose=0)
            autoencoder.pop()
            model.layers[i].set_weights(autoencoder.layers[0].get_weights())
            inputs_train_ = autoencoder.predict(inputs_train_)
    num_classes = targets_train.shape[1]
    model.add(Dense(units=num_classes,
                    activation='softmax',
                    name='classify'))
    model.compile(optimizer=SGD(lr=learning_rate, momentum=0.9),
                  loss='categorical_crossentropy')
    h = model.fit(inputs_train,
                  targets_train,
                  batch_size=32,
                  epochs=epochs,
                  verbose=0)
    return h.history['loss']
plt.plot(train_model(pretrain=False), label="Without Pre-Training")
plt.plot(train_model(pretrain=True), label="With Pre-Training")
plt.xlabel("Epoch")
plt.ylabel("Cross-Entropy")
plt.legend()
plt.show()
I am trying to implement a CNN with Theano, using the Keras library. My data set is 55 alphabet images, 28x28.
In the last part I get this error:
train_acc=hist.history['acc']
KeyError: 'acc'
Any help would be much appreciated. Thanks.
This is part of my code:
from keras.models import Sequential
from keras.models import Model
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.optimizers import SGD, RMSprop, adam
from keras.utils import np_utils
import matplotlib
import matplotlib.pyplot as plt
import matplotlib.cm as cm
from urllib.request import urlretrieve
import pickle
import os
import gzip
import numpy as np
import theano
import lasagne
from lasagne import layers
from lasagne.updates import nesterov_momentum
from nolearn.lasagne import NeuralNet
from nolearn.lasagne import visualize
from sklearn.metrics import classification_report
from sklearn.metrics import confusion_matrix
from PIL import Image
import PIL.Image
#from Image import *
import webbrowser
from numpy import *
from sklearn.utils import shuffle
from sklearn.cross_validation import train_test_split
from tkinter import *
from tkinter.ttk import *
import tkinter
from keras import backend as K
K.set_image_dim_ordering('th')
%%%%%%%%%%
batch_size = 10
# number of output classes
nb_classes = 6
# number of epochs to train
nb_epoch = 5
# input image dimensions
img_rows, img_cols = 28, 28
# number of channels
img_channels = 3
# number of convolutional filters to use
nb_filters = 32
# size of pooling area for max pooling
nb_pool = 2
# convolution kernel size
nb_conv = 3
%%%%%%%%
model = Sequential()
model.add(Convolution2D(nb_filters, nb_conv, nb_conv,
                        border_mode='valid',
                        input_shape=(1, img_rows, img_cols)))
convout1 = Activation('relu')
model.add(convout1)
model.add(Convolution2D(nb_filters, nb_conv, nb_conv))
convout2 = Activation('relu')
model.add(convout2)
model.add(MaxPooling2D(pool_size=(nb_pool, nb_pool)))
model.add(Dropout(0.5))
model.add(Flatten())
model.add(Dense(128))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes))
model.add(Activation('softmax'))
model.compile(loss='categorical_crossentropy', optimizer='adadelta')
%%%%%%%%%%%%
hist = model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
                 show_accuracy=True, verbose=1, validation_data=(X_test, Y_test))
hist = model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch,
                 show_accuracy=True, verbose=1, validation_split=0.2)
%%%%%%%%%%%%%%
train_loss=hist.history['loss']
val_loss=hist.history['val_loss']
train_acc=hist.history['acc']
val_acc=hist.history['val_acc']
xc=range(nb_epoch)
#xc=range(on_epoch_end)
plt.figure(1,figsize=(7,5))
plt.plot(xc,train_loss)
plt.plot(xc,val_loss)
plt.xlabel('num of Epochs')
plt.ylabel('loss')
plt.title('train_loss vs val_loss')
plt.grid(True)
plt.legend(['train','val'])
print (plt.style.available) # use bmh, classic,ggplot for big pictures
plt.style.use(['classic'])
plt.figure(2,figsize=(7,5))
plt.plot(xc,train_acc)
plt.plot(xc,val_acc)
plt.xlabel('num of Epochs')
plt.ylabel('accuracy')
plt.title('train_acc vs val_acc')
plt.grid(True)
plt.legend(['train','val'],loc=4)
#print plt.style.available # use bmh, classic,ggplot for big pictures
plt.style.use(['classic'])
In a not-so-common case (which, as I expected, arose after some TensorFlow updates), despite choosing metrics=["accuracy"] in the model definition, I still got the same error.
The solution was replacing metrics=["acc"] with metrics=["accuracy"] everywhere. In my case, I was unable to plot the parameters of my training history. I had to replace
acc = history.history['acc']
val_acc = history.history['val_acc']
loss = history.history['loss']
val_loss = history.history['val_loss']
to
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
Your log variable will be consistent with the metrics when you compile your model.
For example, the following code
model.compile(loss="mean_squared_error", optimizer=optimizer)
model.fit_generator(gen,epochs=50,callbacks=ModelCheckpoint("model_{acc}.hdf5")])
will give a KeyError: 'acc' because you didn't set metrics=["accuracy"] in model.compile.
This error also happens when the metric names don't match. For example,
model.compile(loss="mean_squared_error", optimizer=optimizer, metrics=["binary_accuracy"])
model.fit_generator(gen, epochs=50, callbacks=[ModelCheckpoint("model_{acc}.hdf5")])
still gives a KeyError: 'acc' because you set a binary_accuracy metric but ask for acc later.
If you change the above code to
model.compile(loss="mean_squared_error", optimizer=optimizer, metrics=["binary_accuracy"])
model.fit_generator(gen, epochs=50, callbacks=[ModelCheckpoint("model_{binary_accuracy}.hdf5")])
it will work.
You can use print(history.history.keys()) to find out what metrics you have and what they are called. In my case, too, it was called "accuracy", not "acc".
In my case switching from
metrics=["accuracy"]
to
metrics=["acc"]
was the solution.
From the Keras source:
warnings.warn('The "show_accuracy" argument is deprecated, '
              'instead you should pass the "accuracy" metric to '
              'the model at compile time:\n'
              '`model.compile(optimizer, loss, '
              'metrics=["accuracy"])`')
The right way to get the accuracy is indeed to compile your model like this:
model.compile(loss='categorical_crossentropy', optimizer='adadelta', metrics=["accuracy"])
Does it work?
Make sure to check this "breaking change":
Metrics and losses are now reported under the exact name specified by the user (e.g. if you pass metrics=['acc'], your metric will be reported under the string "acc", not "accuracy", and inversely metrics=['accuracy'] will be reported under the string "accuracy".
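In other words, the history key mirrors exactly the string you pass at compile time; a quick illustration (a sketch, with whatever model and data you have at hand):
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['acc'])
history = model.fit(X_train, y_train, epochs=3)
print(history.history.keys())  # dict_keys(['loss', 'acc']): the key is 'acc', not 'accuracy'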
If you are using TensorFlow 2.3, you can specify it like this:
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
              loss=tf.keras.losses.CategoricalCrossentropy(),
              metrics=[tf.keras.metrics.CategoricalAccuracy(name="acc")])
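With the explicit name="acc", the logs are keyed 'acc' and 'val_acc' again, so older code such as history.history['acc'] keeps working; a quick check (sketch):
history = model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=5)
print(history.history.keys())  # dict_keys(['loss', 'acc', 'val_loss', 'val_acc'])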
In the new version of TensorFlow, some things have changed, so we have to replace it with:
acc = history.history['accuracy']
print(history.history.keys())
Output:
dict_keys(['loss', 'accuracy', 'val_loss', 'val_accuracy'])
So you need to change "acc" to "accuracy" and "val_acc" to "val_accuracy".
For the practice notebook 3.5-classifying-movie-reviews.ipynb, change
acc = history.history['acc']
val_acc = history.history['val_acc']
To
acc = history.history['binary_accuracy']
val_acc = history.history['val_binary_accuracy']
&
Change
acc_values = history_dict['acc']
val_acc_values = history_dict['val_acc']
To
acc_values = history_dict['binary_accuracy']
val_acc_values = history_dict['val_binary_accuracy']
For the practice notebook 3.6-classifying-newswires.ipynb, change
acc = history.history['acc']
val_acc = history.history['val_acc']
To
acc = history.history['accuracy']
val_acc = history.history['val_accuracy']