I've been using Keras with Tensorflow to classify a normalized 60x60 grayscale image of an arrow into 4 categories, its orient, up, down, left, right. I have created a dataset of about ~1800 images, almost equally distributed into said categories.
However, there's a problem with classification. From the source where I have created the dataset, there are two types of arrows, arrow shape 1,
and arrow shape 2.
The accuracy is okay for arrows which are shaped like 1(about ~70% validation accuracy), but for arrows like number 2, terrible.
I've went through my dataset, and about 90% of the dataset images are arrow shape 1.
Does that mean that the lack of traning data for arrow shape 2 is the reason that it cannot classify them as well as shape 1, and therefore increasing the dataset for shape 2 resolve this issue?
If true, doesn't that mean that my model has failed to generalize?
Also, if the arrow colors are inverted, will the network be affected by this?
Here is the source I'm using to train data:
# -*- coding:utf-8 -*-
import cv2
import numpy as np
import os
from random import shuffle
import glob
train_dir = "images\\cropped\\traindata"
test_dir = "images\\cropped\\testdata"
MODEL_NAME = "ARROWS.model"
img_size = 60
# Importing the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Activation
from keras.layers import BatchNormalization
from keras.preprocessing.image import ImageDataGenerator
from keras.optimizers import adam
from keras.callbacks import TensorBoard
from keras import backend as K
from tensorflow import Session, ConfigProto, GPUOptions
gpuoptions = GPUOptions(allow_growth=True)
session = Session(config=ConfigProto(gpu_options=gpuoptions))
K.set_session(session)
classifier = Sequential()
classifier.add(Conv2D(32, (3,3), input_shape=(img_size, img_size, 1)))
classifier.add(BatchNormalization())
classifier.add(Activation("relu"))
classifier.add(Conv2D(32, (3,3)))
classifier.add(BatchNormalization())
classifier.add(Activation("relu"))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Dropout(0.25))
#classifier.add(Dropout(0.25))
classifier.add(Conv2D(64, (3,3), padding='same'))
classifier.add(BatchNormalization())
classifier.add(Activation("relu"))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Dropout(0.25))
#classifier.add(Dropout(0.25))
classifier.add(Flatten())
classifier.add(Dense(128))
classifier.add(BatchNormalization())
classifier.add(Activation("relu"))
classifier.add(Dropout(0.5))
classifier.add(Dense(4))
classifier.add(BatchNormalization())
classifier.add(Activation("softmax"))
classifier.compile(optimizer = adam(lr=1e-6), loss = 'categorical_crossentropy', metrics = ['accuracy'])
train_datagen = ImageDataGenerator(rotation_range=12)
test_datagen = ImageDataGenerator(rotation_range=12)
training_set = train_datagen.flow_from_directory('images/cropped/traindata',
color_mode="grayscale",
target_size = (img_size, img_size),
batch_size = 32,
class_mode = 'categorical', shuffle=True)
test_set = test_datagen.flow_from_directory('images/cropped/testdata',
color_mode="grayscale",
target_size = (img_size, img_size),
batch_size = 32,
class_mode = 'categorical', shuffle=True)
with open("class_indices.txt", "w") as indices_fine: # Log debug data to file
indices_fine.write(str(classifier.summary()))
indices_fine.write("\n")
indices_fine.write("training_set indices:\n"+str(training_set.class_indices))
indices_fine.write("test_set indices:\n"+str(test_set.class_indices))
tbCallBack = TensorBoard(log_dir='./log', histogram_freq=0, write_graph=True, write_images=True)
classifier.fit_generator(training_set,steps_per_epoch = 8000,epochs = 15,validation_data = test_set,validation_steps = 2000, shuffle=True, callbacks=[tbCallBack])
classifier.save("arrow_classifier_keras_gray.h5")
Does that mean that the lack of traning data for arrow shape 2 is the
reason that it cannot classify them as well as shape 1, and therefore
increasing the dataset for shape 2 resolve this issue?
Your dataset distribution is very important and can cause bias toward a particular class and not performing as you expect. In your case, the number of cases of shape 2 is much smaller than your shape 1, hence creating bias in your Deep Learning model to somehow assuming all of the down arrows must be like shape 1, and not shape 2. Solution? You already know the answer: Increasing the dataset for shape 2 or make shape 1 and shape 2 equally distributed in down arrow class.
If true, doesn't that mean that my model has failed to generalize?
Your dataset distribution of images caused the model to fail generalizing well on that particular class (down arrow). If your model works well on other classes, the problem is not your model, but your dataset for down arrow class.
Just imagine your first image is a tiny cat and a second is a fat cat, like Garfield. The distribution of cats is something we cannot change, but we need to detect all the cats (even when the cats are inverted or sprayed in pink).
What I would do is for instance if I have 1000 of tiny cats, I may filter some of the images add some distortions, and effect to make the training set bigger. This is called data augmentation.
You don't need per purpose to make the number of fat cat images equal the number of tiny cats images if at the end you recognize them all well and you have trained your image classifier with let's say ~98% accuracy on this dataset.
It is important to test.
NOTE: CNN should be good at detecting images with inverted colors. It is because they use the convolution technique.
Related
I am trying to classify which are defected solar cells. I have a huge dataset of both defected plates and non-defected solar cells. As per a few suggestions from research papers I have been using the VGG 16 model for the training purpose. But even after 3 epochs, it is showing 100 % accuracy. I don't why it is coming like this. Is there any other way to solve this problem, any other Algorithm.
I am uploading some of the defected cells which I have in my dataset
]2]2]3
from keras.layers import Input, Lambda, Dense, Flatten
from keras.models import Model
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
import numpy as np
from glob import glob
import matplotlib.pyplot as plt
# re-size all the images to this
IMAGE_SIZE = [224, 224]
train_path = 'Datasets/Train'
valid_path = 'Datasets/Test'
# add preprocessing layer to the front of VGG
vgg = VGG16(input_shape=IMAGE_SIZE + [3], weights='imagenet', include_top=False)
# don't train existing weights
for layer in vgg.layers:
layer.trainable = False
# useful for getting number of classes
folders = glob('Datasets/Train/*')
# our layers - you can add more if you want
x = Flatten()(vgg.output)
# x = Dense(1000, activation='relu')(x)
prediction = Dense(len(folders), activation='softmax')(x)
# create a model object
model = Model(inputs=vgg.input, outputs=prediction)
# view the structure of the model
model.summary()
# tell the model what cost and optimization method to use
model.compile(
loss='categorical_crossentropy',
optimizer='adam',
metrics=['accuracy']
)
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale = 1./255,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('Datasets/Train',
target_size = (224, 224),
batch_size = 32,
class_mode = 'categorical')
test_set = test_datagen.flow_from_directory('Datasets/Test',
target_size = (224, 224),
batch_size = 32,
class_mode = 'categorical')
# fit the model
r = model.fit_generator(
training_set,
validation_data=test_set,
epochs=10,
steps_per_epoch=len(training_set),
validation_steps=len(test_set)
)
# loss
plt.plot(r.history['loss'], label='train loss')
plt.plot(r.history['val_loss'], label='val loss')
plt.legend()
plt.show()
plt.savefig('LossVal_loss')
# accuracies
plt.plot(r.history['acc'], label='train acc')
plt.plot(r.history['val_acc'], label='val acc')
plt.legend()
plt.show()
plt.savefig('AccVal_acc')
import tensorflow as tf
from keras.models import load_model
model.save('defect_features_new_model.h5')
It feels like the images are too simple and do not contain any complicated data. Hence they tend to overfit the VGG16 model. VGG16 models are generally trained on much more complicated data. You can try defining your own convolutional neural network. Since you are already using keras you can create sequential model to define your own model with minimum as 2 or 3 convolutional layersPlease refer the link for keras sequential model
Your data for the network looks very simple so it is not unexpected that you achieve high accuracy. Remember the network trains in batches so for every 32 images it goes through a back propagation cycle and updates the weights accordingly. So if you have a lot of images which you say you do then you are executing a lot of iterations on the weights.
I do not see a problem here. You are not over training in the sense that your validation accuracy is 100%. Certainly you could get good results with a simpler model but why bother your results are what you would want.
I'm doing my first steps into machine learning and trying to do a sign-language machine learning project using the Kaggle dataset. It is supposed to be able to predict characters in ASL. Here's the data presented by Kaggle.
Image of Dataset here.
My current issue is that I can achieve moderate accuracy that fits the data given by Kaggle using their testing data, but if I try to predict a single image, say a random letter of the alphabet, it will be consistently wrong. Here's my code.
from keras.models import Sequential, load_model
from keras.preprocessing.image import load_img, img_to_array
from keras.layers import Dense, Dropout, Flatten, BatchNormalization, Activation
from keras.layers.convolutional import Conv2D, MaxPooling2D
from keras.optimizers import SGD
import numpy as np
from pandas import read_csv
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, LabelBinarizer
import matplotlib.pyplot as plt
trainer = read_csv("sign_mnist_train.csv")
labels = trainer["label"].values
trainer = trainer.drop(["label"], axis=1) #
tester = read_csv("sign_mnist_test.csv")
testlabels = tester["label"].values
tester = tester.drop(["label"], axis=1)
def preProcessing(raw, classes):
OH = OneHotEncoder(sparse=False) # One hot's the labels, can be replaced with LabelBinarizer
binary = classes.reshape(len(classes), 1)
binary = OH.fit_transform(binary)
images = raw.values
for c, i in enumerate(images, 0):
image = np.reshape(i, (28, 28))
image = image.flatten()
images[c] = np.array(image)
return images, binary
def defineModel(): # Builds the layers for our model
model = Sequential()
model.add(Conv2D(64, (3, 3), input_shape=(x_test.shape[1:]), activation='relu', padding='same'))
model.add(Dropout(0.2))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3), activation='relu',padding='same'))
model.add(Dropout(0.2))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(y_train.shape[1], activation='softmax'))
opt = SGD(lr=0.001, momentum=0.9)
model.compile(optimizer=opt, loss="categorical_crossentropy", metrics=["categorical_accuracy"])
return model
def testModel(): #Test's a single image, predicting the class.
model = load_model("my_model.hl5")
img = load_img("C.jpg", color_mode="grayscale", target_size=(28, 28))
img = img_to_array(img)
img = np.reshape(img, (-1, 28, 28, 1))
test = model.predict_classes(img)
print(test)
test_test = model.predict_proba(img)[0]
test_test = "%.2f" % (test_test[test]*100)
print(test_test)
if __name__ == "__main__":
data, labels = preProcessing(trainer, labels)
x_train, x_test, y_train, y_test = train_test_split(data, labels, test_size=0.33, random_state=42)
x_train = x_train.astype('float32')
x_train = x_train/255.0
x_train = np.reshape(x_train, (x_train.shape[0], 28, 28, 1))
x_test = x_test.astype('float32')
x_test = x_test/255.0
x_test = np.reshape(x_test, (x_test.shape[0], 28, 28, 1))
model = defineModel()
history = model.fit(x_train, y_train, validation_data = (x_test, y_test), epochs=40, verbose=1, batch_size=128)
model.evaluate(x_test, y_test)
model.save("my_model.hl5")
Apologies for the messy code, but essentially I try to break the data into usable parts using Panda, then using Keras/Sklearn to fit the data. I wanted to look deeper and was advised to use accuracy_score in the Sklearn library.
testStuff, testlabels = preProcessing(tester, testlabels)
testStuff = testStuff.reshape(testStuff.shape[0], 28, 28, 1)
pred = model.predict(testStuff).round()
print(accuracy_score(testlabels, pred))
This showed that my accuracy was only around 70% compared to the 99% model.evaluate posed. Regardless, I still a very low accuracy on random predictions, some of my individual tests were snipped straight from the Kaggle example images. From there, I tried removing layers, increasing/reducing filters on the Conv2d layers to see what happens, but nothing seems to make a difference. I picked up Pyplot to display the graph and I get this. I don't see a problematic trend, but I may be looking in the wrong area.
Is it because of overfitting/underfitting? I feel that I am getting something wrong at a fundamental level and could use some tips. Looking at similar questions, they point toward possible indexing issues and otherwise mismanagement of the dataset, I am unsure how to test if these issues are present in my code. This is my first time using StackOverflow to ask a question so feel free to ask anything since I understand that reading my rambling code/question is confusing.
Summary: Okay accuracy, bad predictions, why?
In general this behaviour often occurs due to overfitting:
Try to tweak your network to have fewer parameters and try to add some regularizations.
Further it could be that your test set only contains a part of the planned real world domain. Meaning that your training set is far away from reality, which also could lead to bad predictions.
A way to tweak your dataset could be data augmentation, I assume it could work very well on this ASL DataSet - but I did not had a deep look.
Data Augmentation is basically an artifical way to increase the size of your dataset, reducing overfitting as well and improves on slight rotations of your hand or other "random" distortions, like different background or different clothing.
A great article about data augmentation can be found here:
https://towardsdatascience.com/data-augmentation-for-deep-learning-4fe21d1a4eb9
I have a video of droplets. I have taken the first frame and manually marked the edges. I split up the image into smaller images. I then tried to train a keras model of the small unmarked images against the small marked images.
I've tried using "dense" layers. The model trains, but the loss does not improve. When I try using the model, it just gives me a black image output.
Marked split images
Input image (frame 1)
Model summary
#################### IMPORT AND SPLIT
from cam_img_split import cam_img_split
import cv2
img_tr_in=cv2.imread('frame 1.png')
img_tr_out=cv2.imread('frame 1 so far.png')
seg_shape=[32,32]
tr_in=cam_img_split(img_tr_in,seg_shape)
tr_out=cam_img_split(img_tr_out,seg_shape)
pl=[4,20] #images selected for training
##################### NEURAL NETWORK
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import adam
b_sha=np.prod(tr_in.shape[2:5]) #batch shape
model = Sequential()
model.add(Dense(b_sha, activation='relu'))
model.add(Dense(3072, activation='softmax'))
model.add(Dense(3072, activation='softmax'))
model.add(Dense(3072, activation='softmax'))
model.add(Dense(np.prod(tr_out.shape[2:5]), activation='softmax'))
model.compile(optimizer=adam(lr=0.1), loss='mean_squared_error', metrics=['accuracy'])
tr_in_sel=tr_in[0:pl[0],0:pl[1],:,:,:]
tr_out_sel=tr_out[0:pl[0],0:pl[1],:,:,:]
tr_in_sel_flat=tr_in_sel.reshape([np.prod(pl),b_sha]) #Flattening
tr_out_sel_flat=tr_in_sel.reshape([np.prod(pl),b_sha])
tr_in_sel_flat_norm=tr_in_sel_flat/255
tr_out_sel_flat_norm=tr_out_sel_flat/255
model.fit(tr_in_sel_flat_norm, tr_out_sel_flat_norm, epochs=10, batch_size=pl[0])
I expected an output matching the images with the marked edges. Instead I got a black image output.
I solved this problem by using 7x7 sections of the image to classify the central pixel of that section as either oil or water (1 or 0). I then used the binary_crossentropy loss function to train the model.
By having a 7x7 section moving one pixel at a time across the main image, I can have a lot more training data than just segmenting the main image.
I previously tried to get a 7x7 image from another 7x7 image, which made the problem more difficult.
#IMPORT AND SPLIT
from cam_img_split import cam_img_split
from cam_pad import cam_pad
from cam_img_bow import cam_img_bow
import cv2
import numpy as np
img_tr_in=cv2.imread('frame 1.png',0)[0:767,0:767]/255
img_tr_out=cv2.imread('frame 1 so far bnw 2.png',0)[0:767,0:767]/255
img_tr_out=(cam_img_bow(img_tr_out,0.5)).astype(np.uint8)
seg_shape=[15,15] #needs to be odd and equal to each other
pl_max=img_tr_in.shape[0:2]
pl=np.array([0.15*pl_max[0],pl_max[1]]).astype(np.uint32)
pad_in=int(np.floor(seg_shape[0]/2))
img_tr_in_pad=cam_pad(img_tr_in,pad_in)
tr_in=np.zeros([pl[0],pl[1],seg_shape[0],seg_shape[1]])
for n1 in range(0,pl[0]):
for n2 in range(0,pl[1]):
tr_in[n1,n2]=img_tr_in_pad[n1:n1+seg_shape[0],n2:n2+seg_shape[1]]
##################### NEURAL NETWORK
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense,Dropout,Conv2D, MaxPooling2D, Flatten
from keras.optimizers import adam
from keras.utils import to_categorical
import matplotlib.pyplot as plt
pad=4
input_shape=(seg_shape[0]+2*pad,seg_shape[1]+2*pad,1)
output_shape=(1,1,1)
model = Sequential()
model.add(Conv2D(32, (3, 3),input_shape=input_shape, activation='relu'))
model.add(Conv2D(64,(3, 3), activation='relu'))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(units=2, activation='softmax'))
model.compile(optimizer=adam(lr=0.001), loss='binary_crossentropy', metrics=['accuracy'])
##################### FITTING THE MODEL
tr_in_flat=tr_in.reshape([pl[0]*pl[1],seg_shape[0],seg_shape[1],1])
tr_out_flat=img_tr_out.reshape([pl_max[0]*pl_max[1]])
tr_in_flat_pad=np.zeros(tr_in_flat.shape+np.array([0,2*pad,2*pad,0]))
for n3 in range(0,tr_in_flat.shape[0]):
tr_in_flat_pad[n3,:,:,0]=cam_pad(tr_in_flat[n3,:,:,0], pad)
model.fit(tr_in_flat_pad, to_categorical(tr_out_flat[0:pl[0]*pl[1]]), epochs=5, batch_size=int(16*pl[0]),shuffle=True)
##################### PLOTTING PREDICTIONS
tr_in_full=np.zeros([pl_max[0],pl_max[1],seg_shape[0]+2*pad,seg_shape[1]+2*pad])
for n1 in range(0,pl_max[0]):
for n2 in range(0,pl_max[1]):
tr_in_full[n1,n2]=cam_pad(img_tr_in_pad[n1:n1+seg_shape[0],n2:n2+seg_shape[1]],pad)
tr_in_full_flat=tr_in_full.reshape([pl_max[0]*pl_max[1],seg_shape[0]+2*pad,seg_shape[1]+2*pad,1])
pred = model.predict(tr_in_full_flat)
pred_img=np.zeros(pred.shape[0])
for n1 in range(0,pred.shape[0]):
pred_img[n1]=round(pred[n1,0])
pred_img_out=(pred_img.reshape([pl_max[0],pl_max[1]]))
plt.subplot(1,2,1)
plt.imshow(pred_img_out)
plt.subplot(1,2,2)
plt.imshow(img_tr_in)
plt.show()
You are using the wrong loss/metric combo. Is your problem classification, or regression? MSE is for regression, while categorical_crossentropy (or sparse, or binary) is for classification.
I'm currently building a convolutional neural network to distinguish clear ECG images from ECG images with noise.
With Noise :
Without Noise :
My Problem
So I did build a convnet using keras above tensorflow and trained it several times but all the time, it has like 99% of Training Accuracy, 99% Validation Accuracy and 98% of Testing accuracy. But when I predict an image, it always give me [0].
Most of the times, my model early stops at epoch 3 or 4 with 99% of accuracy in both training and validation. It almost all the time given 98% or 99% accuracy in first epoch or second epoch.
My Model
from keras.models import Sequential
from keras.datasets import mnist
from keras.layers import Conv2D, MaxPooling2D
from keras.layers import Activation,Dropout,Flatten,Dense
from keras.preprocessing.image import ImageDataGenerator
from keras.callbacks import TensorBoard
from keras.layers import ZeroPadding2D
from keras.optimizers import Adam
from keras.callbacks import EarlyStopping
tensorboard = TensorBoard(log_dir="./logs",histogram_freq=0,write_graph=True,write_images=True)
earlystop = EarlyStopping(monitor='val_loss',patience=2,verbose=1)
# Variables
batchSize = 15
num_of_samples = 15000
num_of_testing_samples = 3750
num_of_val_samples = 2000
training_imGenProp = ImageDataGenerator(rescale = 1./255,
width_shift_range=0.02,
height_shift_range=0.02,
horizontal_flip=False,
fill_mode='nearest'
)
testing_imGenProp = ImageDataGenerator(
rotation_range=5,
horizontal_flip=False,
fill_mode='nearest'
)
val_imGenProp = ImageDataGenerator(rescale = 1./255,
rotation_range=5,
zoom_range=0.2,
horizontal_flip=False,
fill_mode='nearest'
)
# Create the model
classifier = Sequential()
classifier.add(ZeroPadding2D(padding=(374,0),input_shape=(74,448,3)))
classifier.add(Conv2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
classifier.add(Conv2D(32, (3, 3), activation = 'relu'))
classifier.add(MaxPooling2D(pool_size = (2, 2)))
classifier.add(Flatten())
classifier.add(Dense(units = 128, activation = 'relu'))
classifier.add(Dropout(0.8))
classifier.add(Dense(units = 1, activation = 'sigmoid'))
classifier.summary()
adam = Adam(lr=0.00005)
classifier.compile(loss='binary_crossentropy',optimizer=adam,metrics=['accuracy'])
training_imGen = training_imGenProp.flow_from_directory(
'Directory\Training',
target_size=(74,448),
batch_size=batchSize,
class_mode='binary',
)
testing_imGen = testing_imGenProp.flow_from_directory(
'Directory\Testing',
target_size=(74,448),
batch_size=batchSize,
class_mode='binary',
)
val_imGen = testing_imGenProp.flow_from_directory(
'Directory\Validation',
target_size=(74,448),
batch_size=batchSize,
class_mode='binary',
)
classifier.fit_generator(
training_imGen,
callbacks = [tensorboard,earlystop],
steps_per_epoch=num_of_samples // batchSize,
epochs=30,
validation_data = val_imGen,
validation_steps = num_of_val_samples // batchSize
)
score, acc = classifier.evaluate_generator(
testing_imGen,
num_of_testing_samples // batchSize,
verbose = 0
)
print('Test score:', score)
print('Test accuracy:', acc)
classifier.save('Directory\Config_10_Model.h5')
Notes
I used 0.0005 Learning rate to stop this model being early stopped at 2nd or 3rd epoch. Also I've separated images for training, testing and validation under three folders and have 1020,375,200 images respectively for training,testing and validation (Which means Training folder alone has 2040 images since I have two classes. Each class have same number of images). So no images will be reused under any circumstances.
Also, before I'm rescaling images by 1./255 in ImageDataGenerator, My model had 50% of accuracy in training, validation and 54% in testing. But after using rescaling, this early stopping happened frequently and accuracy was 99% almost all the time.
I didn't use rescaling for test images purposely. But still receive 98% accuracy and yet fails desperately at predicting. Since I've with noise and without noise folders under training folder, My output class should be with noise or without noise. Since with Noise comes first in alphabetical order, I believe [0] class says With Noise and [1] should be for Without Noise. But if I input without noise image to model, it still gives me [0].
Below is the code I use to predict trained model.
from keras.models import load_model
import numpy as np
from keras.preprocessing import image
model = load_model('Directory\Config_10_Model.h5')
test_image = image.load_img('Path_to_Without_Noise_Image\image3452.png', target_size = (74, 448))
test_image = image.img_to_array(test_image)
test_image = test_image/255
test_image = np.expand_dims(test_image, axis = 0)
result = model.predict(test_image)
y_classes = result.argmax(axis=-1)
print(y_classes)
I don't know why this happenes even though I never used same images for testing, validation or training. Can someone help me with this? I tried everything and trained model with different hyper parameters but everytime this model output [0].
You are doing binary classification. result has shape [batch_size,1]. So if you are doing argmax() you will always get 0.
>>> import numpy as np
>>> result = np.random.rand(5,1)
>>> result
array([[ 0.54719484],
[ 0.31675804],
[ 0.55151251],
[ 0.25014937],
[ 0.00724972]])
>>> result.argmax(axis=-1)
array([0, 0, 0, 0, 0])
>>> (result > 0.5).astype(int)
array([[1],
[0],
[1],
[0],
[0]])
>>>
I am pretty new to machine learning so I am playing around with examples and such.
The image size specified in the code is (28,28)
But for some reason I keep getting the same ValueError
I cant figure out why this is happening.
Here's the code:
import pandas as pd
import numpy as np
np.random.seed(1337) # for reproducibility
from keras.models import Sequential
from keras.layers.core import Dense, Dropout, Activation, Flatten
from keras.layers.convolutional import Convolution2D, MaxPooling2D
from keras.utils import np_utils
# input image dimensions
img_rows, img_cols = 28, 28
batch_size = 128 # Number of images used in each optimization step
nb_classes = 10 # One class per digit
nb_epoch = 35 # Number of times the whole data is used to learn
# Read the train and test datasets
train = pd.read_csv("../input/train.csv").values
test = pd.read_csv("../input/test.csv").values
# Reshape the data to be used by a Theano CNN. Shape is
# (nb_of_samples, nb_of_color_channels, img_width, img_heigh)
X_train = train[:, 1:].reshape(train.shape[0], 1, img_rows, img_cols)
X_test = test.reshape(test.shape[0], 1, img_rows, img_cols)
y_train = train[:, 0] # First data is label (already removed from X_train)
# Make the value floats in [0;1] instead of int in [0;255]
X_train = X_train.astype('float32')
X_test = X_test.astype('float32')
X_train /= 255
X_test /= 255
# convert class vectors to binary class matrices (ie one-hot vectors)
Y_train = np_utils.to_categorical(y_train, nb_classes)
#Display the shapes to check if everything's ok
print('X_train shape:', X_train.shape)
print('Y_train shape:', Y_train.shape)
print('X_test shape:', X_test.shape)
model = Sequential()
# For an explanation on conv layers see http://cs231n.github.io/convolutional-networks/#conv
# By default the stride/subsample is 1
# border_mode "valid" means no zero-padding.
# If you want zero-padding add a ZeroPadding layer or, if stride is 1 use border_mode="same"
model.add(Convolution2D(12, 5, 5, border_mode='valid',input_shape=(1,img_rows, img_cols)))
model.add(Activation('relu'))
# For an explanation on pooling layers see http://cs231n.github.io/convolutional-networks/#pool
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.15))
model.add(Convolution2D(24, 5, 5))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.15))
# Flatten the 3D output to 1D tensor for a fully connected layer to accept the input
model.add(Flatten())
model.add(Dense(180))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(100))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(nb_classes)) #Last layer with one output per class
model.add(Activation('softmax')) #We want a score simlar to a probability for each class
# The function to optimize is the cross entropy between the true label and the output (softmax) of the model
# We will use adadelta to do the gradient descent see http://cs231n.github.io/neural-networks-3/#ada
model.compile(loss='categorical_crossentropy', optimizer='adadelta', metrics=["accuracy"])
# Make the model learn
model.fit(X_train, Y_train, batch_size=batch_size, nb_epoch=nb_epoch, verbose=1)
# Predict the label for X_test
yPred = model.predict_classes(X_test)
# Save prediction in file for Kaggle submission
np.savetxt('mnist-pred.csv', np.c_[range(1,len(yPred)+1),yPred], delimiter=',', header = 'ImageId,Label', comments = '', fmt='%d')
So the problem is with the convolution sizes used. Convolution operations usually reduce dimension of the image. Similarly - each pooling operation reduces the size. You have very small images yet applied model architecture which has been designed for a bigger ones, thus at some point, after one of the convolutions/poolings you actually have a smaller outputed image than a following filter size, and this is an ill-defined operation.
To temporarly fix the problem - remove second convolution and maxpooling layers, since these operations (with parameters provided) cannot be performed on such small data. In general you should first understand how convolution works, and not apply someone elses model, since the parameters are crucial for good performance - if you apply transformations which reduce resolution to much - you will be unable to learn anything. Thus once you have some intuition how convolution works you can go back and try different architectures, but there is no one, "magical" equation to figure out the architecture, thus I cannot provide you with parameters that will "just work" - start with removing this additional convolution and pooling, and than go back and try other possibilities once you have better understanding of your data and model.