SKLearn Multiclass Classifier - python

I have written the following code to import data vectors from a file and test the performance of an SVM classifier (using sklearn and Python).
However, the classifier performs worse than every other classifier I have tried (a neural net, for example, gives 98% accuracy on the test data, while this gives 92% at best). In my experience SVM should produce better results for this kind of data.
Am I possibly doing something wrong?
import numpy as np

def buildData(featureCols, testRatio):
    f = open("car-eval-data-1.csv")
    data = np.loadtxt(fname = f, delimiter = ',')

    X = data[:, :featureCols]  # select columns 0:featureCols-1
    y = data[:, featureCols]   # select column featureCols

    n_points = y.size
    print "Imported " + str(n_points) + " lines."

    ### split into train/test sets
    split = int((1-testRatio) * n_points)
    X_train = X[0:split,:]
    X_test  = X[split:,:]
    y_train = y[0:split]
    y_test  = y[split:]

    return X_train, y_train, X_test, y_test

def buildClassifier(features_train, labels_train):
    from sklearn import svm
    #clf = svm.SVC(kernel='linear', C=1.0, gamma=0.1)
    #clf = svm.SVC(kernel='poly', degree=3, C=1.0, gamma=0.1)
    clf = svm.SVC(kernel='rbf', C=1.0, gamma=0.1)
    clf.fit(features_train, labels_train)
    return clf

def checkAccuracy(clf, features, labels):
    from sklearn.metrics import accuracy_score
    pred = clf.predict(features)
    accuracy = accuracy_score(pred, labels)
    return accuracy

features_train, labels_train, features_test, labels_test = buildData(6, 0.3)
clf = buildClassifier(features_train, labels_train)
trainAccuracy = checkAccuracy(clf, features_train, labels_train)
testAccuracy = checkAccuracy(clf, features_test, labels_test)

print "Training Items: " + str(labels_train.size) + ", Test Items: " + str(labels_test.size)
print "Training Accuracy: " + str(trainAccuracy)
print "Test Accuracy: " + str(testAccuracy)

i = 0
while i < labels_test.size:
    pred = clf.predict(features_test[i])
    print "F(" + str(i) + ") : " + str(features_test[i]) + " label= " + str(labels_test[i]) + " pred= " + str(pred)
    i = i + 1
How is it possible to do multi-class classification if it does not do it by default?
p.s. my data is of the following format (last column is the class):
2,2,2,2,2,1,0
2,2,2,2,1,2,0
0,2,2,5,2,2,3
2,2,2,4,2,2,1
2,2,2,4,2,0,0
2,2,2,4,2,1,1
2,2,2,4,1,2,1
0,2,2,5,2,2,3

I found the problem after a long time and I am posting it in case someone needs it.
The problem was that the data import function didn't shuffle the data. If the data is sorted in some way, you risk training the classifier on one part of the distribution and testing it on a totally different part. In the neural net case, Matlab was used, which automatically shuffles the input data.
def buildData(filename, featureCols, testRatio):
    f = open(filename)
    data = np.loadtxt(fname = f, delimiter = ',')
    np.random.shuffle(data)  # randomize the order

    X = data[:, :featureCols]  # select columns 0:featureCols-1
    y = data[:, featureCols]   # select column featureCols

    n_points = y.size
    print "Imported " + str(n_points) + " lines."

    ### split into train/test sets
    split = int((1-testRatio) * n_points)
    X_train = X[0:split,:]
    X_test  = X[split:,:]
    y_train = y[0:split]
    y_test  = y[split:]

    return X_train, y_train, X_test, y_test
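For reference, shuffling and splitting can also be delegated to scikit-learn. The following is a minimal sketch under the same assumptions as the question (the car-eval CSV with 6 feature columns), not part of the original answer:

import numpy as np
from sklearn.model_selection import train_test_split

# Load the data as before; the file name and column count are the question's.
data = np.loadtxt("car-eval-data-1.csv", delimiter=",")
X, y = data[:, :6], data[:, 6]

# shuffle=True is the default; stratify=y keeps the class balance similar
# between the train and test sets, which also helps with sorted files.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, shuffle=True, stratify=y, random_state=42)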

Related

precision-recall auc from cross_val_score in python

cross_val_score in python enables one to generate a variety of convenient model performance metrics. This is what I use to get ROC-AUC and Recall for a binary classification model.
import sklearn
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn import metrics
log = LogisticRegression(class_weight='balanced')
auc = cross_val_score(log, X, y, scoring='roc_auc')
print ("ROC-AUC (Mean->): " + str(round(100*auc.mean(), 2)) + "%" + " (Standard Deviation->): " + str(round(100*auc.std(), 2)) + "%")
recall = cross_val_score(log, X, y, scoring='recall')
print ("RECALL (Mean->): " + str(round(100*recall.mean(), 2)) + "%"+ " (Standard Deviation->): " + str(round(100*recall.std(), 2)) + "%")
For the same binary classification model, how can one incorporate a metric for calculating precision-recall AUC within cross_val_score?
I think you should look into the function precision_recall_curve();
it computes precision-recall pairs for different probability thresholds.
Try the following approach:
from sklearn.model_selection import KFold
from sklearn.metrics import precision_recall_curve

FOLDS = 6
k_fold = KFold(n_splits=FOLDS, shuffle=True, random_state=42)

# logistic_model is the LogisticRegression instance from the question (`log`)
for i, (train_index, test_index) in enumerate(k_fold.split(X)):
    Xtrain, Xtest = X[train_index], X[test_index]
    ytrain, ytest = y[train_index], y[test_index]
    logistic_model.fit(Xtrain, ytrain)
    pred_proba = logistic_model.predict_proba(Xtest)
    precision, recall, _ = precision_recall_curve(ytest, pred_proba[:, 1])
    ...
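If a single precision-recall-AUC number is wanted directly from cross_val_score, the built-in 'average_precision' scorer is the usual stand-in for the area under the precision-recall curve; alternatively, the precision/recall arrays from the loop above can be integrated with sklearn.metrics.auc. A short sketch reusing the question's log, X and y:

from sklearn.metrics import auc
from sklearn.model_selection import cross_val_score

# Average precision summarizes the precision-recall curve and is the
# scorer scikit-learn exposes for this purpose.
pr_auc = cross_val_score(log, X, y, scoring='average_precision')
print("PR-AUC (Mean): " + str(round(100 * pr_auc.mean(), 2)) + "%")

# Or, inside the KFold loop above, integrate the curve explicitly:
# pr_auc_fold = auc(recall, precision)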

List wrong predictions in the test data! - Python

I'm trying to list all the wrong predictions in a test set, but I am not sure how to do it. I searched Stack Overflow, but might have searched for the wrong "problem". I have text files from a folder, containing emails. The problem is that my predictions aren't doing too well, and I want to inspect the emails that are predicted wrong. Currently a snippet of my code looks something like this:
import os
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score, precision_score, f1_score

no_head_train_path_0 = 'folder_name'
no_head_train_path_1 = 'folder_name'

def get_data(path):
    text_list = list()
    files = os.listdir(path)
    for text_file in files:
        file_path = os.path.join(path, text_file)
        read_file = open(file_path, 'r+')
        read_text = read_file.read()
        read_file.close()
        cleaned_text = clean_text(read_text)  # clean_text is defined elsewhere
        text_list.append(cleaned_text)
    return text_list, files

no_head_train_0, temp = get_data(no_head_train_path_0)
no_head_train_1, temp1 = get_data(no_head_train_path_1)

no_head_train = no_head_train_0 + no_head_train_1
no_head_labels_train = ([0] * len(no_head_train_0)) + ([1] * len(no_head_train_1))

def vocabularymat(TEXTFILES, VOC, PLAY, METHOD):
    from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
    if (METHOD == "TDM"):
        voc = CountVectorizer()
        voc.fit(VOC)
        if (PLAY == "TRAIN"):
            TrainMat = voc.transform(TEXTFILES)
            return TrainMat
        if (PLAY == "TEST"):
            TestMat = voc.transform(TEXTFILES)
            return TestMat

TrainMat = vocabularymat(no_head_train, no_head_train, PLAY="TRAIN", METHOD="TDM")

X_train = Featurelearning(Traindata, Method="NMF")  # Featurelearning/Traindata defined elsewhere
y_train = datalabel

X_train, X_test, y_train, y_test = train_test_split(data, datalabel, test_size=0.33,
                                                    random_state=42)

model = LogisticRegression()
model.fit(X_train, y_train)

expected = y_test
predicted = model.predict(X_test)
proba = model.predict_proba(X_test)

accuracy = accuracy_score(expected, predicted)
recall = recall_score(expected, predicted, average="binary")
precision = precision_score(expected, predicted, average="binary")
f1 = f1_score(expected, predicted, average="binary")
Is it possible to find the emails/filenames that are predicted wrong, so I can inspect them manually? (Sorry for the long code.)
You can use NumPy to create a Boolean vector indicating which predictions are wrong, and then use that vector to index your array of file names. For example:
import numpy as np
# mock data
files = np.array(['mail1.txt', 'mail2.txt', 'mail3.txt', 'mail4.txt'])
y_test = np.array([0, 0, 1, 1])
predicted = np.array([0, 1, 0, 1])
# create a Boolean index for the wrong classifications
classification_is_wrong = y_test != predicted
# print the file names of the wrongly classified mails
print(files[classification_is_wrong])
Output:
['mail2.txt' 'mail3.txt']
# find the wrong predictions
prediction = model.predict(X_test)

# save the indices of the wrongly predicted values
wrong_predict = []
for order, value in enumerate(y_test):
    if value != prediction[order]:  # predict() already returns class labels
        wrong_predict.append(order)
print(wrong_predict)
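If the goal is to recover the file names of the misclassified emails, one option is to pass the file names through train_test_split so they stay aligned with the test rows. A sketch, assuming the question's temp/temp1 file lists and its data/datalabel feature matrix and labels:

import numpy as np
from sklearn.model_selection import train_test_split

all_files = np.array(temp + temp1)  # file names in the same order as the texts

# train_test_split accepts any number of arrays and splits them consistently.
X_train, X_test, y_train, y_test, files_train, files_test = train_test_split(
    data, datalabel, all_files, test_size=0.33, random_state=42)

model.fit(X_train, y_train)
predicted = model.predict(X_test)

# file names of the misclassified emails
print(files_test[np.asarray(y_test) != predicted])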

Python sklearn: GMM model produces bad results

I am trying to use a GMM model to classify the Iris data set. My model seems to produce inconsistent results: some runs have an accuracy of 90% and some of 33% (some even going all the way to 0%). I am not sure whether the mistake occurs at the classification stage, or whether my preprocessing or accuracy print statements are incorrect.
I have tried changing the number of iterations and the init_params when defining the classifier. I am using scikit-learn version 0.21.3 and Python 3.7.0
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.mixture import GaussianMixture as GMM

def data_processor(input_file):
    data = []
    with open(input_file, 'r') as f:
        for line in f:
            line = line.strip().split(',')
            if line[-1] == 'Iris-setosa':
                line[-1] = 0
            elif line[-1] == 'Iris-versicolor':
                line[-1] = 1
            elif line[-1] == 'Iris-virginica':
                line[-1] = 2
            data.append(line)
    return np.array(data, dtype=float)

x, y = data_processor('iris.txt')[:, :-1], data_processor('iris.txt')[:, -1]

# split data into 5 chunks, 80% used for training and 20% for testing
skf = StratifiedKFold(n_splits=5)
train_index, test_index = next(iter(skf.split(x, y)))
x_train, y_train = x[train_index], y[train_index]
x_test, y_test = x[test_index], y[test_index]

# calculate number of components in data set using number of classes
num_classes = len(np.unique(y_train))

# build classifier and fit model
classifier = GMM(n_components=num_classes, covariance_type='full',
                 max_iter=200)
classifier.fit(x_train)

# Make predictions and print accuracy
y_train_pred = classifier.predict(x_train)
accuracy_training = np.mean(y_train_pred.ravel() == y_train.ravel()) * 100
print('Accuracy on training data =', accuracy_training)

y_test_pred = classifier.predict(x_test)
accuracy_testing = np.mean(y_test_pred.ravel() == y_test.ravel()) * 100
print('Accuracy on testing data =', accuracy_testing)
The data set has 150 rows, 50 per class. Previously, with different models on the same data set, I was getting an accuracy of around 90-93%.
The input file contains data in this format: 5.9,3.0,5.1,1.8,Iris-virginica
I have also tried to scale the data using:
x = preprocessing.scale(x)
The data now looks as follows:
x[0] = [-0.90068117 1.03205722 -1.3412724 -1.31297673]
y[0] = 0.0
However, this did not affect the accuracy.
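One likely explanation for the run-to-run variation: GaussianMixture is an unsupervised model, so the indices of the components it fits have no fixed relationship to the class labels 0/1/2; a run can separate the species perfectly and still score 33% (or 0%) against the labels. A hedged sketch of one common fix, initializing each component at the mean of its class so the component indices usually line up with the labels (reusing x_train, y_train, x_test, y_test and num_classes from the question):

import numpy as np
from sklearn.mixture import GaussianMixture

# Tie each mixture component to one class by starting it at that class's mean.
means_init = np.array([x_train[y_train == i].mean(axis=0)
                       for i in range(num_classes)])

classifier = GaussianMixture(n_components=num_classes, covariance_type='full',
                             max_iter=200, means_init=means_init)
classifier.fit(x_train)

y_test_pred = classifier.predict(x_test)
print('Accuracy on testing data =', np.mean(y_test_pred == y_test) * 100)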

Saving and loading neupy algorithm with dill library can return different predictions for the same time period?

First of all thank you for reading this, and thank you in advance if you can help.
This is the algorithm that I'm using for supervised learning:
# Define neural network
cgnet = algorithms.LevenbergMarquardt(
    connection=[
        layers.Input(XTrain.shape[1]),
        layers.Relu(6),
        layers.Linear(1)
    ],
    mu_update_factor=2,
    mu=0.1,
    shuffle_data=True,
    verbose=True,
    decay_rate=0.1,
    addons=[algorithms.WeightElimination]
)
Cross validation results are good (k=10):
[0.16767815652364237, 0.13396493112368024, 0.19033966833586402, 0.12023567250054788, 0.11826824035439124, 0.13115856672872392, 0.14250003819441104, 0.12729442202775898, 0.31073760721487326, 0.19299511349686768]
[0.9395976956178138, 0.9727526340820827, 0.9410503161549465, 0.9740922179654977, 0.9764171089773663, 0.9707258917808179, 0.9688830174583372, 0.973160633351555, 0.8551738446276884, 0.936661707991699]
MAE: 0.16 (+/- 0.11)
R2: 0.95 (+/- 0.07)
After training I have saved the algorithm with dill:
with open('network-storage.dill', 'wb') as f:
    dill.dump(cgnet, f)
Then, if I load the network with dill and feed in the X values of the entire training set, I get the same R2 (0.9691); up to this point everything is ok.
If I do the same thing with only the last few years [2018-2022], the prediction of y for 2018-2022 comes out different from the prediction I get for those same years when the X input covers 1992-2022.
Why do I get different predictions for the same period when I load a different range of X values?
X input from 1992 to 2022: the y prediction for 1992 to 2022 is ok.
X input from 2018 to 2022: the y prediction for 2018 to 2022 is not ok.
This is the code:
import numpy as np
import pandas as pd
import datetime as dt
import matplotlib.pyplot as plt
import dill
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn import metrics
from sklearn.model_selection import KFold
from scipy.interpolate import Rbf
from scipy import stats
from neupy import layers, environment, algorithms
from neupy import plots
# Import data
data = pd.read_excel('DataAL_Incremento.xlsx', index_col=0, header=1).iloc[:,[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,-1]]
data.columns = ['PPO4L(in)','PPO4(in)','NH4L(in)','NH4(in)','NO3L(in)','NNO3(in)','CBOL(in)', 'CBO(in)','Temp(In)','Temp(alb)','Tair ','Tdew',
'Wvel','Cl_aL(in)','Cl_a(in)','ODL(in)','OD(in)','Qin(in)','ODalb','PPO4(alb)','NNO3(alb)']
# Add filtered data
tmp0 = data.iloc[:,[9, 6, 14]].rolling(9, center=False, axis=0).mean()
tmp0.columns = ['Temp(alb)_09','CBOL(in)_09','Cl_a(in)_09']
tmp1 = data.iloc[:,[9, 6, 14]].rolling(15, center=False, axis=0).mean()
tmp1.columns = ['Temp(alb)_15', 'CBOL(in)_15','Cl_a(in)_15']
tmp2 = data.iloc[:,[9, 6, 14]].rolling(31, center=False, axis=0).mean()
tmp2.columns = ['Temp(alb)_31', 'CBOL(in)_31','Cl_a(in)_31']
data = pd.concat((data, tmp0, tmp1, tmp2), axis=1)
# Drop empty records
data = data.dropna()
# Define data
X = data.loc[:, ['CBOL(in)', 'CBO(in)','Temp(In)','Temp(alb)','Tair ','Cl_aL(in)','Cl_a(in)','OD(in)','Temp(alb)_31', 'CBOL(in)_31','Cl_a(in)_31']]
y = data.loc[:, ['ODalb']]
years = data.index.year
yearsTrain = range(1992,2022)
yearsTest = 2019,2020,2021
#yearsTrain, yearsTest = train_test_split(np.unique(years), test_size=0.2, train_size=0.8, random_state=None)
XTrain = X.query('@years in @yearsTrain')
yTrain = y.query('@years in @yearsTrain').values.ravel()
XTest = X.query('@years in @yearsTest')
yTest = y.query('@years in @yearsTest').values.ravel()
results = y.query('@years in @yearsTest')
#===============================================================================
# Neural network
#===============================================================================
# Define neural network
cgnet = algorithms.LevenbergMarquardt(
    connection=[
        layers.Input(XTrain.shape[1]),
        layers.Relu(6),
        layers.Linear(1)
    ],
    mu_update_factor=2,
    mu=0.1,
    shuffle_data=True,
    verbose=True,
    decay_rate=0.1,
    addons=[algorithms.WeightElimination]
)
# Scale
XScaler = StandardScaler()
XScaler.fit(XTrain)
XTrainScaled = XScaler.transform(XTrain)
XTestScaled = XScaler.transform(XTest)
yScaler = StandardScaler()
yScaler.fit(yTrain.reshape(-1, 1))
yTrainScaled = yScaler.transform(yTrain.reshape(-1, 1)).ravel()
yTestScaled = yScaler.transform(yTest.reshape(-1, 1)).ravel()
# Train
cgnet.train(XTrainScaled, yTrainScaled, XTestScaled, yTestScaled, epochs=30)
yEstTrain = yScaler.inverse_transform(cgnet.predict(XTrainScaled).reshape(-1, 1)).ravel()
mae = np.mean(np.abs(yTrain-yEstTrain))
results['ANN'] = yScaler.inverse_transform(cgnet.predict(XTestScaled).reshape(-1, 1)).ravel()
# Metrics
mse = np.mean((yTrain-yEstTrain)**2)
mseTes = np.mean((yTest-results['ANN'])**2)
maeTes = np.mean(np.abs(yTest-results['ANN']))
meantrain = np.mean(yTrain)
ssTest = (yTrain-meantrain)**2
r2=(1-(mse/(np.mean(ssTest))))
meantest = np.mean(yTest)
ssTrain = (yTest-meantest)**2
r2Tes=(1-(mseTes/(np.mean(ssTrain))))
# Plot results
print("NN MAE: %f (All), %f (Test) " % (mae, maeTes))
print ("NN MSE: %f (All), %f (Test) " % (mse, mseTes))
print ("NN R2: %f (All), %f (Test) " % (r2, r2Tes))
results.plot()
plt.show(block=True)
plots.error_plot(cgnet)
plt.show(block=True)
plt.scatter(yTest,results['ANN'])
plt.xlabel('True Values')
plt.ylabel('Predictions')
plt.show(block=True)
#===============================================================================
# Save algorithms - Neural network
#===============================================================================
with open('network-storage.dill', 'wb') as f:
    dill.dump(cgnet, f)
#===============================================================================
# Load algorithms - Neural network
#===============================================================================
#Prepare data
dataVal = pd.read_excel('DataAL_IncrementoTeste.xlsx', index_col=0, header=1).iloc[:,[0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,-1]]
dataVal.columns = ['PPO4L(in)','PPO4(in)','NH4L(in)','NH4(in)','NO3L(in)','NNO3(in)','CBOL(in)', 'CBO(in)','Temp(In)','Temp(alb)','Tair ','Tdew',
'Wvel','Cl_aL(in)','Cl_a(in)','ODL(in)','OD(in)','Qin(in)','ODalb','PPO4(alb)','NNO3(alb)']
# Add filtered data
tmp0 = dataVal.iloc[:,[9, 6, 14]].rolling(9, center=False, axis=0).mean()
tmp0.columns = ['Temp(alb)_09','CBOL(in)_09','Cl_a(in)_09']
tmp1 = dataVal.iloc[:,[9, 6, 14]].rolling(15, center=False, axis=0).mean()
tmp1.columns = ['Temp(alb)_15', 'CBOL(in)_15','Cl_a(in)_15']
tmp2 = dataVal.iloc[:,[9, 6, 14]].rolling(31, center=False, axis=0).mean()
tmp2.columns = ['Temp(alb)_31', 'CBOL(in)_31','Cl_a(in)_31']
dataVal = pd.concat((dataVal, tmp0, tmp1, tmp2), axis=1)
# Drop empty records (removes adjacent columns)
dataVal = dataVal.dropna()
# Define data
Xval = dataVal.loc[:, ['CBOL(in)', 'CBO(in)','Temp(In)','Temp(alb)','Tair ','Cl_aL(in)','Cl_a(in)','OD(in)','Temp(alb)_31', 'CBOL(in)_31','Cl_a(in)_31']]
yval = dataVal.loc[:, ['ODalb']]
years = dataVal.index.year
yearsTrain = range(2018,2022)
XFinalVal = Xval.query('@years in @yearsTrain')
yFinalVal = yval.query('@years in @yearsTrain').values.ravel()
resultsVal = yval.query('@years in @yearsTrain')
# Load algorithms
with open('network-storage.dill', 'rb') as f:
    cgnet = dill.load(f)
# Scale X
XScaler = StandardScaler()
XScaler.fit(XFinalVal)
XFinalScaled = XScaler.transform(XFinalVal)
# Scale y
yScaler = StandardScaler()
yScaler.fit(yFinalVal.reshape(-1, 1))
yTrainScaled = yScaler.transform(yFinalVal.reshape(-1, 1)).ravel()
# Predict
y_predicted = yScaler.inverse_transform(cgnet.predict(XFinalScaled).reshape(-1, 1)).ravel()
resultsVal['ANN'] = y_predicted
scoreMean = metrics.mean_absolute_error(yFinalVal, y_predicted)
scoreR2 = metrics.r2_score(yFinalVal, y_predicted)
print(scoreMean)
print(scoreR2)
plt.scatter(yFinalVal,y_predicted)
plt.xlabel('True Values')
plt.ylabel('Predictions')
plt.show(block=True)
resultsVal.plot()
plt.show(block=True)
#===============================================================================
# Cross validation - Neural network
#===============================================================================
XScaler = StandardScaler()
XScaler.fit(XTrain)
XTrainScaled = XScaler.transform(XTrain)
XTestScaled = XScaler.transform(XTest)
yScaler = StandardScaler()
yScaler.fit(yTrain.reshape(-1, 1))
yTrainScaled = yScaler.transform(yTrain.reshape(-1, 1)).ravel()
yTestScaled = yScaler.transform(yTest.reshape(-1, 1)).ravel()
kfold = KFold(n_splits=10, shuffle=True, random_state=None)
scoresMean = []
scoresR2 = []
for train, test in kfold.split(XTrainScaled):
    x_train, x_test = XTrainScaled[train], XTrainScaled[test]
    y_train, y_test = yTrainScaled[train], yTrainScaled[test]

    cgnet = algorithms.LevenbergMarquardt(
        connection=[
            layers.Input(XTrain.shape[1]),
            layers.Relu(6),
            layers.Linear(1)
        ],
        mu_update_factor=2,
        mu=0.1,
        shuffle_data=True,
        verbose=True,
        decay_rate=0.1,
        addons=[algorithms.WeightElimination]
    )

    cgnet.train(x_train, y_train, epochs=100)
    y_predicted = cgnet.predict(x_test)

    scoreMean = metrics.mean_absolute_error(y_test, y_predicted)
    scoreR2 = metrics.r2_score(y_test, y_predicted)
    scoresMean.append(scoreMean)
    scoresR2.append(scoreR2)
print(scoresMean)
print(scoresR2)
scoresMean = np.array(scoresMean)
scoresR2 = np.array(scoresR2)
print("MEA: %0.2f (+/- %0.2f)" % (scoresMean.mean(), scoresMean.std() * 2))
print("R2: %0.2f (+/- %0.2f)" % (scoresR2.mean(), scoresR2.std() * 2))
I think one of the problems might be the scaling that you apply before training. In the training stage you fit the scaler using the training data:
XScaler = StandardScaler()
XScaler.fit(XTrain)
But after you load the network with dill, you fit the scaler with different data (specifically, the validation data):
XScaler = StandardScaler()
XScaler.fit(XFinalVal)
In the second case you use a different scaling for the prediction, one the network hasn't seen during training. The new scaling can produce a different distribution of the samples compared to the one the network expects.
In order to make the effect of training reproducible, you also need to save XScaler and load it at the same time you load the network.
Everything I've described is also true for yScaler.
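A minimal sketch of that advice, assuming the variable names used above: serialize the fitted scalers together with the network, and only transform (never refit) new data at prediction time.

import dill

# Save the trained network together with the scalers fitted on the training data.
with open('network-storage.dill', 'wb') as f:
    dill.dump({'model': cgnet, 'XScaler': XScaler, 'yScaler': yScaler}, f)

# Later: load everything back and reuse the training-time scaling.
with open('network-storage.dill', 'rb') as f:
    saved = dill.load(f)

cgnet, XScaler, yScaler = saved['model'], saved['XScaler'], saved['yScaler']

XFinalScaled = XScaler.transform(XFinalVal)  # transform only, no fit() here
y_predicted = yScaler.inverse_transform(
    cgnet.predict(XFinalScaled).reshape(-1, 1)).ravel()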

training accuracy drops in tensorflow

I was trying to create a model for character recognition.
The model was working fine with a 28*28 dataset and characters from 0-9, but its training accuracy drops when I change to 64*64 images and characters ranging over 0-9, a-z, A-Z.
During training the accuracy climbs to about 0.3 and then stays there. I tried training with a different dataset as well, but the same thing happens.
Changing the learning rate to 0.001 also does not help.
Can anyone tell me what the issue is?
import numpy as np
from PIL import Image
import matplotlib.pyplot as plt
import random as ran
import os
import tensorflow as tf

def TRAIN_SIZE(num):
    images = np.load("data/train/images64.npy").reshape([2852, 4096])
    labels = np.load("data/train/labels.npy")
    print('Total Training Images in Dataset = ' + str(images.shape))
    print('--------------------------------------------------')
    x_train = images[:num, :]
    print('x_train Examples Loaded = ' + str(x_train.shape))
    y_train = labels[:num, :]
    print('y_train Examples Loaded = ' + str(y_train.shape))
    print('')
    return x_train, y_train

def TEST_SIZE(num):
    images = np.load("data/test/images64.npy").reshape([558, 4096])
    labels = np.load("data/test/labels.npy")
    print('Total testing Images in Dataset = ' + str(images.shape))
    print('--------------------------------------------------')
    x_test = images[:num, :]
    print('x_test Examples Loaded = ' + str(x_test.shape))
    y_test = labels[:num, :]
    print('y_test Examples Loaded = ' + str(y_test.shape))
    print('')
    return x_test, y_test

def display_digit(num):
    # print(y_train[num])
    label = y_train[num].argmax(axis=0)
    image = x_train[num].reshape([64, 64])
    # plt.axis("off")
    plt.title('Example: %d Label: %d' % (num, label))
    plt.imshow(image, cmap=plt.get_cmap('gray_r'))
    plt.show()

def display_mult_flat(start, stop):
    images = x_train[start].reshape([1, 4096])
    for i in range(start + 1, stop):
        images = np.concatenate((images, x_train[i].reshape([1, 4096])))
    plt.imshow(images, cmap=plt.get_cmap('gray_r'))
    plt.show()

def get_char(a):
    if (a < 10):
        return a
    elif (a >= 10 and a < 36):
        return chr(a + 55)
    else:
        return chr(a + 61)

x_train, y_train = TRAIN_SIZE(2850)
x_test, y_test = TRAIN_SIZE(1900)

x = tf.placeholder(tf.float32, shape=[None, 4096])
y_ = tf.placeholder(tf.float32, shape=[None, 62])
W = tf.Variable(tf.zeros([4096, 62]))
b = tf.Variable(tf.zeros([62]))
y = tf.nn.softmax(tf.matmul(x, W) + b)

with tf.Session() as sess:
    # x_test = x_test[1400:,:]
    # y_test = y_test[1400:,:]
    x_test, y_test = TEST_SIZE(400)
    LEARNING_RATE = 0.2
    TRAIN_STEPS = 1000

    sess.run(tf.global_variables_initializer())

    cross_entropy = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
    training = tf.train.GradientDescentOptimizer(LEARNING_RATE).minimize(cross_entropy)
    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    for i in range(TRAIN_STEPS + 1):
        sess.run(training, feed_dict={x: x_train, y_: y_train})
        if i % 100 == 0:
            print('Training Step:' + str(i) +
                  ' Accuracy = ' + str(sess.run(accuracy, feed_dict={x: x_test, y_: y_test})) +
                  ' Loss = ' + str(sess.run(cross_entropy, {x: x_train, y_: y_train})))

    savedPath = tf.train.Saver().save(sess, "/tmp/model.ckpt")
    print("Model saved at: ", savedPath)
You are trying to classify 62 different digits and characters, but you use a single fully connected layer to do it. Your model simply does not have enough parameters for that task; in other words, you are underfitting the data. So either expand your network by adding parameters (layers), and/or use CNNs, which generally perform well on image classification tasks.
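A rough, hedged sketch of the first suggestion (none of this is from the thread; the hidden size of 512 and the Adam learning rate are arbitrary choices, and it assumes a TF 1.x version that provides softmax_cross_entropy_with_logits_v2): replace the single softmax layer with a small hidden layer and let the loss work on logits.

import tensorflow as tf

x = tf.placeholder(tf.float32, shape=[None, 4096])
y_ = tf.placeholder(tf.float32, shape=[None, 62])

# Hidden layer: random initialization matters here; all-zero weights would
# make every hidden unit learn the same thing.
W1 = tf.Variable(tf.truncated_normal([4096, 512], stddev=0.05))
b1 = tf.Variable(tf.zeros([512]))
h1 = tf.nn.relu(tf.matmul(x, W1) + b1)

# Output layer over the 62 classes.
W2 = tf.Variable(tf.truncated_normal([512, 62], stddev=0.05))
b2 = tf.Variable(tf.zeros([62]))
logits = tf.matmul(h1, W2) + b2

# Numerically more stable than computing log(softmax(...)) by hand.
cross_entropy = tf.reduce_mean(
    tf.nn.softmax_cross_entropy_with_logits_v2(labels=y_, logits=logits))
training = tf.train.AdamOptimizer(1e-3).minimize(cross_entropy)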
Try a different CNN model, for example Inception v1/v2/v3, AlexNet, etc.
