I am trying to train AlexNet on data that I collected.
The data contains images converted to grayscale and the associated key presses.
This is a program to simulate a self-driving car.
The keys are:
w = [1,0,0]
a = [0,1,0]
d = [0,0,1]
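In other words, each recorded frame is labeled with the one-hot vector of the key pressed; as a plain dict (just a restatement of the list above) this is:
key_map = {'w': [1, 0, 0], 'a': [0, 1, 0], 'd': [0, 0, 1]}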
This is my code:
import numpy as np
from alexnet import alexnet
WIDTH = 100
HEIGHT = 80
LR = 1e-3
EPOCHS = 8
MODEL_NAME = 'Udacity Model Car NN'
model = alexnet(WIDTH,HEIGHT,LR)
train_data = np.load('data.npy',encoding="bytes")
train = train_data[:-200]
test = train_data[-200:]
X = np.array([i[0] for i in train]).reshape(-1,WIDTH,HEIGHT,1)
Y = [i[1] for i in train]
test_X = np.array([i[0] for i in test]).reshape(-1,WIDTH,HEIGHT,1)
test_Y = [i[1] for i in test]
model.fit({'input':X},{'targets':Y},n_epoch=EPOCHS,validation_set=({'input':test_X},{'targets:test_y'}),snapshot_step=500,show_metric=True,run_id=MODEL_NAME)
model.save(MODEL_NAME)
But after every epoch, both the validation accuracy and the validation loss remain 0.
Training Step: 104 | total loss: 1.31713 | time: 119.279s| Momentum |epoch: 008 | loss: 1.31713 - acc: 0.3878 | val_loss: 0.00000 - val_acc: 0.0000 -- iter: 801/801
This is probably a typo; look at what you are passing as the validation set:
{'input':test_X},{'targets:test_y'}
The first argument is a correct dict, but the second is a set containing a single string! It should be:
{'input':test_X},{'targets':test_Y}
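For reference, the corrected fit call would look like this (a minimal sketch based on the code above; note that Python is case-sensitive, so the variable must be test_Y, matching the definition earlier):
model.fit(
    {'input': X}, {'targets': Y},
    n_epoch=EPOCHS,
    validation_set=({'input': test_X}, {'targets': test_Y}),  # a dict, not a set containing a string
    snapshot_step=500,
    show_metric=True,
    run_id=MODEL_NAME)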
Related
I have a discrepancy between the val_loss produced by model.fit and the one produced by model.test_on_batch.
For model.fit, after 1 epoch of batch size 4 and 50k training set size, this is the output
50000/50000 [==============================] - 508s 10ms/step - loss: 1.5587 - acc: 0.9442 - val_loss: 0.6883 - val_acc: 0.9721
Notice that val_loss = 0.6883.
I then stopped the training, and trained the model with model.train_on_batch, validating every 1k batches. I did not reset the model, so the weights are not changed. After 1k batches, I get this output:
Batch 1139: Train[0.539348,0.977112] ; Val[146.972092,0.972529] ; Duration=0.040436 s
Notice that here the validation loss is 146.97.... How is that possible? Does model.fit do some post-processing to the validation loss?
model.fit code
batch_size = 4
epochs = 300
myhist = model.fit(x_test,y_test,batch_size=batch_size,epochs=epochs,shuffle=True,validation_data=(val_x[:1000,],val_y[:1000,]),callbacks=[plot_losses])
model.train_on_batch iteration
n_batches = 500000
batch_size = 4
val_size = 1000
val_freq = 1000
val_loss, val_acc = 0, 0
model_check = '17102019_1.hd5'
val_loss_min = 1000000

for ib in range(n_batches):
    batch_init = time.time()
    batch_x, batch_y = generate_mini_batch(batch_size, x_test, y_test, linear_comb=False, trans=False)
    train_loss, train_acc = model.train_on_batch(batch_x, batch_y)
    batch_end = time.time() - batch_init
    clear_output(wait=True)
    if (ib % val_freq == 0) & (ib > 0):
        val_loss, val_acc = model.test_on_batch(val_x[:val_size,], val_y[:val_size,])
        if val_loss < val_loss_min:
            model.save(model_check)
            val_loss_min = val_loss
    print('Batch %i: Train[%f,%f] ; Val[%f,%f] ; Duration=%f s' % (ib, train_loss, train_acc, val_loss, val_acc, batch_end))
It seems like model.test_on_batch returns the sum of losses of the batch entries, while model.train_on_batch returns the average loss, so that solves the issue.
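If you want to verify that behaviour yourself, a quick, hedged sanity check (assuming the same model, val_x and val_y are still in scope) is to compare the loss test_on_batch reports for a slice against the mean of the per-sample losses:
import numpy as np

n = 100
batch_loss, batch_acc = model.test_on_batch(val_x[:n], val_y[:n])
# per-sample losses for the same slice
per_sample = [model.test_on_batch(val_x[i:i+1], val_y[i:i+1])[0] for i in range(n)]
# if batch_loss is roughly n times the mean, the batch loss is being summed rather than averaged
print(batch_loss, np.mean(per_sample))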
I have to build a CNN to detect diabetic retinopathy at stage 4 (it only has to detect whether stage-4 DR is present or not; it doesn't need to detect the other levels). The input will be images like this: https://i.imgur.com/DsU06Xv.jpg
To make classification easier, I'm preprocessing my images: https://i.imgur.com/X1p9G1c.png
So, I have a database with 700 images of retinas at level 0 and 700 retinas at level 4.
The problem is that none of the models I've tried work; they generally end up overfitting.
I've already tried the Sequential model and the Functional API, and in one question I asked here, a user recommended using VGG16 >> question: https://datascience.stackexchange.com/questions/60706/how-do-i-handle-with-my-keras-cnn-overfitting
Now I'm trying to use VGG16, but it still doesn't work: all my predictions are 0 and I have no idea how to handle it.
This is my train.py:
import cv2
import os
import numpy as np
from keras.layers.core import Flatten, Dense, Dropout, Reshape
from keras.layers.normalization import BatchNormalization
from keras.layers.convolutional import Conv2D
from keras.layers.pooling import MaxPooling2D
from keras import regularizers
from keras.models import Model
from keras.layers import Input, ZeroPadding2D, Dropout
from keras import optimizers
from keras.optimizers import SGD
from keras.preprocessing.image import ImageDataGenerator
from keras.models import Sequential
from keras.utils import to_categorical
from keras.applications.vgg16 import VGG16
# example of using a pre-trained model as a classifier
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.applications.vgg16 import preprocess_input
from keras.applications.vgg16 import decode_predictions
TRAIN_DIR = 'train/'
TEST_DIR = 'test/'
v = 'v/'
BATCH_SIZE = 32
NUM_EPOCHS = 5
def ReadImages(Path):
    LabelList = list()
    ImageCV = list()
    classes = ["nonPdr", "pdr"]

    # Get all subdirectories
    FolderList = [f for f in os.listdir(Path) if not f.startswith('.')]

    # Loop over each directory
    for File in FolderList:
        for index, Image in enumerate(os.listdir(os.path.join(Path, File))):
            # Read and resize the image file
            ImageCV.append(cv2.resize(cv2.imread(os.path.join(Path, File) + os.path.sep + Image), (224, 224)))
            #ImageCV[index] = np.array(ImageCV[index]) / 255.0
            LabelList.append(classes.index(os.path.splitext(File)[0]))
            ImageCV[index] = cv2.addWeighted(ImageCV[index], 4, cv2.GaussianBlur(ImageCV[index], (0, 0), 224 / 30), -4, 128)
    return ImageCV, LabelList
data, labels = ReadImages(TRAIN_DIR)
valid, vlabels = ReadImages(TEST_DIR)
vgg16_model = VGG16(weights="imagenet", include_top=True)
# (1) visualize layers
print("VGG16 model layers")
for i, layer in enumerate(vgg16_model.layers):
    print(i, layer.name, layer.output_shape)
# (2) remove the top layer
base_model = Model(input=vgg16_model.input,
output=vgg16_model.get_layer("block5_pool").output)
# (3) attach a new top layer
base_out = base_model.output
base_out = Reshape((25088,))(base_out)
top_fc1 = Dropout(0.5)(base_out)
# output layer: (None, 5)
top_preds = Dense(1, activation="sigmoid")(top_fc1)
# (4) freeze weights until the last but one convolution layer (block4_pool)
for layer in base_model.layers[0:14]:
    layer.trainable = False
# (5) create new hybrid model
model = Model(input=base_model.input, output=top_preds)
# (6) compile and train the model
sgd = SGD(lr=1e-4, momentum=0.9)
model.compile(optimizer=sgd, loss="binary_crossentropy", metrics=["accuracy"])
datagen = ImageDataGenerator(
featurewise_center=True,
featurewise_std_normalization=True,
rotation_range=20,
width_shift_range=0.2,
height_shift_range=0.2,
horizontal_flip=True)
# compute quantities required for featurewise normalization
# (std, mean, and principal components if ZCA whitening is applied)
datagen.fit(data)
# fits the model on batches with real-time data augmentation:
model.fit_generator(datagen.flow(np.array(data), np.array(labels), batch_size=32),
steps_per_epoch=len(np.array(data)) / 32, epochs=5)
#history = model.fit([data], [labels], nb_epoch=NUM_EPOCHS,
# batch_size=BATCH_SIZE, validation_split=0.1)
# evaluate final model
#vlabels = model.predict(np.array(valid))
model.save('model.h5')
When I run it, it reports an accuracy of ~0.99 to 1.0 with a minimal loss of ~0.01.
This is my predict.py:
from keras.models import load_model
import cv2
import os
import json
import h5py
import numpy as np
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
TEST_DIR = 'v/'
def fix_layer0(filename, batch_input_shape, dtype):
    with h5py.File(filename, 'r+') as f:
        model_config = json.loads(f.attrs['model_config'].decode('utf-8'))
        layer0 = model_config['config']['layers'][0]['config']
        layer0['batch_input_shape'] = batch_input_shape
        layer0['dtype'] = dtype
        f.attrs['model_config'] = json.dumps(model_config).encode('utf-8')
fix_layer0('model.h5', [None, 224, 224, 3], 'float32')
model = load_model('model.h5')
for filename in os.listdir(r'v/'):
    if filename.endswith(".jpg") or filename.endswith(".ppm") or filename.endswith(".jpeg") or filename.endswith(".png"):
        ImageCV = cv2.resize(cv2.imread(os.path.join(TEST_DIR) + filename), (224, 224))
        x = image.img_to_array(ImageCV)
        x = np.expand_dims(x, axis=0)
        x = preprocess_input(x)
        print(np.argmax(model.predict(x)))
When I run it, all my predictions are 0, and if I drop the np.argmax and run only model.predict, it returns the following results:
[[0.03993018]]
[[0.9984968]]
[[1.]]
[[1.]]
[[0.]]
[[0.9999999]]
[[0.8691623]]
[[1.01611796e-07]]
[[1.]]
[[0.]]
[[1.]]
[[0.17786741]]
Considering that the first 2 images are class 0 and the others are class 1 (level 4), these results do not correspond to an accuracy of 0.99 or 1.0.
What should I do? I would really appreciate any help!
UPDATE
I've updated my code as #Manoj suggested; I've added validation and early stopping:
es = EarlyStopping(monitor='val_loss', verbose=1)
# fits the model on batches with real-time data augmentation:
model.fit_generator(datagen.flow(np.array(data), np.array(labels), batch_size=32),
steps_per_epoch=len(np.array(data)) / 32, epochs=5,
validation_data=(np.array(valid), np.array(vlabels)),
nb_val_samples=72, callbacks=[es])
And it returns these numbers:
Epoch 1/5
44/43 [==============================] - 452s 10s/step - loss: 0.2377 - acc: 0.9162 - val_loss: 1.9521 - val_acc: 0.8472
Epoch 2/5
44/43 [==============================] - 445s 10s/step - loss: 0.0229 - acc: 0.9991 - val_loss: 1.8908 - val_acc: 0.8611
Epoch 3/5
44/43 [==============================] - 447s 10s/step - loss: 0.0107 - acc: 0.9993 - val_loss: 1.7658 - val_acc: 0.8611
Epoch 4/5
44/43 [==============================] - 458s 10s/step - loss: 0.0090 - acc: 0.9993 - val_loss: 1.6805 - val_acc: 0.8750
Epoch 5/5
44/43 [==============================] - 463s 11s/step - loss: 0.0052 - acc: 0.9993 - val_loss: 1.6730 - val_acc: 0.8750
But after that, my predictions (which were 7/12 correct before) are now only 5/12 correct.
What can I do to fix this?
UPDATE 2
I've put this code in my train.py:
mean = datagen.mean
std = datagen.std
print(mean, "mean")
print(std, "std")
and I inserted the values returned by these prints into predict.py:
def normalize(x, mean, std):
    x[..., 0] -= mean[0]
    x[..., 1] -= mean[1]
    x[..., 2] -= mean[2]
    x[..., 0] /= std[0]
    x[..., 1] /= std[1]
    x[..., 2] /= std[2]
    return x
for filename in os.listdir(r'v/'):
    if filename.endswith(".jpg") or filename.endswith(".ppm") or filename.endswith(".jpeg") or filename.endswith(".png"):
        ImageCV = cv2.resize(cv2.imread(os.path.join(TEST_DIR) + filename), (224, 224))
        x = image.img_to_array(ImageCV)
        x = np.expand_dims(x, axis=0)
        x = normalize(x, [59.5105, 61.141457, 61.141457], [60.26705, 61.85445, 63.139835])
        prob = model.predict(x)
        if prob < 0.5:
            print("nonPDR")
        else:
            print("PDR")
        print(filename)
and now all my predictions are class 1 (PDR)... Have I done something wrong?
UPDATE 3
I've dropped the GaussianBlur I was using in ReadImages and included the following:
data = np.asarray(data)
valid = np.asarray(valid)
data = data.astype('float32')
valid = valid.astype('float32')
data /= 255
valid /= 255
And after running my train.py:
Epoch 1/15
44/43 [==============================] - 476s 11s/step - loss: 0.7153 - acc: 0.5788 - val_loss: 0.6937 - val_acc: 0.5556
Epoch 2/15
44/43 [==============================] - 468s 11s/step - loss: 0.5526 - acc: 0.7275 - val_loss: 0.6838 - val_acc: 0.5833
Epoch 3/15
44/43 [==============================] - 474s 11s/step - loss: 0.5068 - acc: 0.7595 - val_loss: 0.6927 - val_acc: 0.5694
Epoch 00003: early stopping
Afterwards, I updated the mean and std in predict.py:
pdr = 0      # counters (initialized here so the totals below are defined)
nonPdr = 0

for filename in os.listdir(r'v/'):
    if filename.endswith(".jpg") or filename.endswith(".ppm") or filename.endswith(".jpeg") or filename.endswith(".png"):
        ImageCV = cv2.resize(cv2.imread(os.path.join(TEST_DIR) + filename), (224, 224))
        ImageCV = np.asarray(ImageCV)
        ImageCV = ImageCV.astype('float32')
        ImageCV /= 255
        x = ImageCV
        x = np.expand_dims(x, axis=0)
        x = normalize(x, [0.12810835, 0.17897758, 0.23883381], [0.14304605, 0.18229756, 0.2362126])
        prob = model.predict(x)
        if prob <= 0.70:  # I CHANGED THE THRESHOLD TO 0.7
            print("nonPDR >>>", filename)
            nonPdr += 1
        else:
            print("PDR >>>", filename)
            pdr += 1
        print(prob)

print("Number of retinas with PDR: ", pdr)
print("Number of retinas without PDR: ", nonPdr)
And after running this code, I'm getting roughly 75% accuracy on my test dir.
So, can I improve anything, or is this the best I can get with such a small number of images?
The preprocessing steps done on the data should be the same for training and test. I see at least two inconsistencies.
First, on the train data, GaussianBlur is applied to all the images. Usually, such transformations are used as data augmentation strategies and not applied to the entire training set.
Second, the normalization used for training and test should be the same. In the code snippets above, for predictions the vgg16.preprocess_input is applied, which uses the mean/variance of ImageNet data, while during training the mean/variance is calculated from the training data itself. What you can do is take the datagen.mean and datagen.std values after calling datagen.fit and use them during predictions for normalizing the data, instead of preprocess_input.
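A minimal sketch of that idea (hedged; the numeric values below are placeholders for whatever datagen.mean and datagen.std actually print after datagen.fit(data) in train.py, and this mirrors what the normalize function in UPDATE 2 above already does):
# train.py, after datagen.fit(data): record the training statistics
print(datagen.mean, datagen.std)

# predict.py: reuse the training statistics instead of vgg16's preprocess_input
mean = [59.5, 61.1, 61.1]   # placeholders; copy the printed numbers here
std = [60.3, 61.9, 63.1]
x = image.img_to_array(ImageCV)
x = np.expand_dims(x, axis=0).astype('float32')
x = (x - mean) / std        # broadcasts over the channel axis
prob = model.predict(x)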
You don't have a validation generator defined. When training, you use a training set & validation set, and stop the training when the validation loss doesn't improve. Otherwise, the model will overfit to the training dataset.
https://gist.github.com/fchollet/7eb39b44eb9e16e59632d25fb3119975
https://keras.io/callbacks/#earlystopping
Since the final layer of your network is a sigmoid like this
top_preds = Dense(1, activation="sigmoid")(top_fc1)
there is only one output and it's a probability value from 0 to 1.
np.argmax is not relevant here.
np.argmax is used when the last layer uses softmax activation with
two outputs whose probabilities sum to 1 and the index with the higher
probability is chosen as the result.
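To see why, note that np.argmax over a single-element output always returns index 0, whatever the probability is, for example:
import numpy as np
print(np.argmax([[0.9984968]]))  # prints 0, even though the probability is close to 1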
Coming back to the results you obtain with sigmoid, usually a threshold
is chosen to decide whether to classify it as class 0 or class 1. The
default threshold is 0.5. A ROC curve can be created using the
probabilities to come up with the optimal threshold.
https://scikit-learn.org/stable/modules/generated/sklearn.metrics.roc_curve.html
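For example, a hedged sketch using scikit-learn (y_true being the ground-truth 0/1 labels and y_prob the sigmoid outputs from model.predict):
import numpy as np
from sklearn.metrics import roc_curve

fpr, tpr, thresholds = roc_curve(y_true, y_prob)
# Youden's J statistic (tpr - fpr) is one simple way to pick a threshold from the curve
best_threshold = thresholds[np.argmax(tpr - fpr)]
print("chosen threshold:", best_threshold)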
Using threshold of 0.5,
prob = model.predict(x)
if prob < 0.5:
    output = 0
else:
    output = 1
[[0.03993018]] => < 0.5, class 0 correct
[[0.9984968]] => > 0.5, class 1 incorrect
[[1.]] => > 0.5, class 1 correct
[[1.]] => > 0.5, class 1 correct
[[0.]] => < 0.5, class 0 incorrect
[[0.9999999]] => > 0.5, class 1 correct
[[0.8691623]] => > 0.5, class 1 correct
[[1.01611796e-07]] => < 0.5, class 0 incorrect
[[1.]] => > 0.5, class 1 correct
[[0.]] => < 0.5, class 0 incorrect
[[1.]] => > 0.5, class 1 correct
[[0.17786741]] => < 0.5, class 0 incorrect
Accuracy = 7/12 = 58%
Objective
I would like to predict the class when I give only partial input to the model.
(Working with sequence data, using Keras LSTMs.)
What I have done
I have implemented my model based on the answer I got here from #Kbrose.
That is, I should train on variable-length sequences, each of which corresponds to a particular class.
Here, I would like to clarify some queries related to fit_generator, batch sizes, validation_steps, and my model's results.
Data
X_train.shape = (243, 100, 4)     # Samples * Time steps * Features
Y_train.shape = (243,)            # either 0 or 1 for each sample
X_validate.shape = (31, 100, 4)   # Samples * Time steps * Features
Y_validate.shape = (31,)          # either 0 or 1 for each sample
X_test.shape = (28, 100, 4)       # Samples * Time steps * Features
Y_test.shape = (28,)              # either 0 or 1 for each sample
Objective:
1. Train the model with random time length batches
2. Predict the class when random time length batches are provided as input to the model
Code
input_ = Input(shape=(None,4))
x = LSTM(16, return_sequences=True)(input_)
x = LSTM(8, return_sequences=True)(x)
output = TimeDistributed(Dense(2, activation='sigmoid'))(x)
# Model
model = Model(inputs=input_, outputs=output)
print(model.summary())
# Model Compile
model.compile(
loss='binary_crossentropy',
optimizer=Adam(lr=1e-4),
metrics=['accuracy']
)
def common_generator(X, Y):
    while True:
        sequence_length = random.randrange(60, 100, 5)
        # I want my model to be trained with random time lengths between 60 and 100 in multiples of 5
        x_train = X[:, :sequence_length, :]

        y = to_categorical(Y)
        y_train = np.repeat(y[:, np.newaxis], sequence_length, axis=1)
        # For my convenience I changed my Y_train shape from (243,) to (243, sequence_length, 2)
        # Refer to the picture below for a better understanding
        yield (x_train, y_train)
trainGen = common_generator(X_train,Y_train)
ValGen = common_generator(X_validate, Y_validate)
H = model.fit_generator(trainGen, steps_per_epoch=30, validation_data=ValGen, validation_steps=3, epochs=100)
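For the second objective (predicting from partial input), a hedged sketch of how the trained model above could be used, reading the class from the last observed time step of the TimeDistributed output:
import numpy as np

partial = X_test[0:1, :60, :]              # e.g. only the first 60 time steps of one sample
probs = model.predict(partial)             # shape (1, 60, 2)
predicted_class = np.argmax(probs[0, -1])  # class probabilities at the last observed step
print(predicted_class)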
Question 1: The model is trained with steps_per_epoch=30. What will the batch size be for one step? Does it take the default value of 32, or is it number of samples // steps_per_epoch (in my case 243 // 30 = 8)?
Question 2: How do I choose validation_steps? What is the correct value? (In my case, there are 31 validation samples.)
Training & Results
After training for almost 100 epochs
Epoch 99/100
30/30 [==============================] - 6s 200ms/step - loss: 0.3055 - acc: 0.8789 - val_loss: 0.4075 - val_acc: 0.8259
Epoch 100/100
30/30 [==============================] - 6s 201ms/step - loss: 0.3051 - acc: 0.8791 - val_loss: 0.4051 - val_acc: 0.8260
Question 3: What can I understand from the picture? From the graph, I believe the model is overfitting. How can I improve it?
Is my model being trained properly or not? Kindly comment.
I have a question concerning my recent project.
I have been trying to use PyTorch for my multiclass classification work. I have 3 labels (namely, 0 -> none, 1 -> left, 2 -> right) in my image dataset. I am using nn.CrossEntropyLoss() as my loss function and Adam as the optimizer. However, the training result looks like this: the accuracy does not change at all.
==> Building new CNN model ...
==> Initialize CUDA support for CNN model ...
==> Preparing RcCar Image dataset ...
==> Start training ...
Iteration: 1 | Loss: 1.3453235626220703 | Training accuracy: 70% | Test accuracy: 43%
==> Saving model ...
/usr/local/lib/python3.6/dist-packages/torch/serialization.py:251: UserWarning: Couldn't retrieve source code for container of type SimpleCNN. It won't be checked for correctness upon loading.
"type " + obj.__name__ + ". It won't be checked "
Iteration: 2 | Loss: 0.9048898816108704 | Training accuracy: 70% | Test accuracy: 43%
Iteration: 3 | Loss: 0.873579740524292 | Training accuracy: 70% | Test accuracy: 43%
Iteration: 4 | Loss: 0.8702362179756165 | Training accuracy: 70% | Test accuracy: 43%
Iteration: 5 | Loss: 0.8713874220848083 | Training accuracy: 70% | Test accuracy: 43%
Iteration: 6 | Loss: 0.8639134168624878 | Training accuracy: 70% | Test accuracy: 43%
Iteration: 7 | Loss: 0.8590883612632751 | Training accuracy: 70% | Test accuracy: 43%
Iteration: 8 | Loss: 0.8576076626777649 | Training accuracy: 70% | Test accuracy: 43%
Iteration: 9 | Loss: 0.8523686528205872 | Training accuracy: 70% | Test accuracy: 43%
Iteration: 10 | Loss: 0.8462777137756348 | Training accuracy: 70% | Test accuracy: 43%
I am wondering whether this is because the loss function I chose is not suitable, or whether I have to one-hot encode the labels into
[
[0,0,1],
[0,1,0],
...
]
like this.
I have enclosed my custom Dataset code below. Please help me with this. Thank you!
# imports added so this snippet is self-contained
import os
import numpy as np
import pandas as pd
import torch
from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms


def RcCarImageLoader(root, batch_size_train, batch_size_test):
    """
    RC Car Image Loader.

    Args:
        root:
        batch_size_train:
        batch_size_test:

    Return:
        train_loader:
        test_loader:
    """
    # Normalize training set together with augmentation
    transform_train = transforms.Compose([
        transforms.RandomResizedCrop(64),
        transforms.RandomRotation(10),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225])
    ])

    # Normalize test set same as training set, without augmentation
    transform_test = transforms.Compose([
        transforms.Resize(64),
        transforms.CenterCrop(64),
        transforms.ToTensor(),
        transforms.Normalize(mean=[0.485, 0.456, 0.406],
                             std=[0.229, 0.224, 0.225])
    ])

    # Loading the RcCar Image dataset
    print("==> Preparing RcCar Image dataset ...")

    train_set = ImageLoader(csv_filename="./train.csv", transform=transform_train)
    train_loader = torch.utils.data.DataLoader(
        train_set, batch_size=batch_size_train, num_workers=2)

    test_set = ImageLoader(csv_filename="./test.csv", transform=transform_test, train=False)
    test_loader = torch.utils.data.DataLoader(
        test_set, batch_size=batch_size_test, num_workers=2)

    return train_loader, test_loader


def image_loader(path):
    """Image Loader helper function."""
    return Image.open(path.rstrip("\n")).convert('RGB')


class ImageLoader(Dataset):
    """Image Loader for the RC Car image dataset."""

    def __init__(self, csv_filename, transform=None, train=True, loader=image_loader):
        """
        Image Loader Builder.

        Args:
            csv_filename: CSV file with one image filename and label per row
            transform: torchvision.transforms
            train: whether to load the training or the test split
            loader: loader for each image
        """
        self.transform = transform
        self.loader = loader
        self.train_flag = train

        # load training data
        if self.train_flag:
            train_data = []
            csv_file = pd.read_csv(csv_filename)
            self.train_label = np.asarray(csv_file.iloc[:, 1])
            train_img_names = np.asarray(csv_file.iloc[:, 0])
            for train_img_name in train_img_names:
                train_img = self.loader(os.path.join("./train/", train_img_name))
                train_data.append(train_img)
            self.train_data = train_data

            # train_label_one_hot = [[0 for _ in range(3)] for _ in range(len(train_label))]
            # for i, row in enumerate(train_label_one_hot):
            #     row[train_label[i]] = 1
            #
            # self.train_label = np.asarray(train_label_one_hot)

        # load test data
        else:
            test_data = []
            csv_file = pd.read_csv(csv_filename)
            self.test_label = np.asarray(csv_file.iloc[:, 1])
            test_img_names = np.asarray(csv_file.iloc[:, 0])
            for test_img_name in test_img_names:
                test_img = self.loader(os.path.join("./test/", test_img_name))
                test_data.append(test_img)
            self.test_data = test_data

            # test_label_one_hot = [[0 for _ in range(3)] for _ in range(len(test_label))]
            # for i, row in enumerate(test_label_one_hot):
            #     row[test_label[i]] = 1
            #
            # self.test_label = np.asarray(test_label_one_hot)

    def __getitem__(self, index):
        """Get image and label in dataset."""
        # get training images
        if self.train_flag:
            img = self.train_data[index]
            label = self.train_label[index]
            if self.transform is not None:
                img = self.transform(img)
            return (img, label)
        else:
            img = self.test_data[index]
            label = self.test_label[index]
            if self.transform is not None:
                img = self.transform(img)
            return (img, label)

    def __len__(self):
        if self.train_flag:
            return len(self.train_label)
        else:
            return len(self.test_label)
Your speculations about the loss function and one-hot encoding are correct. Do the one-hot encoding, use BCELoss, and let me know.
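A hedged sketch of what that change could look like inside the training loop (outputs being the model's raw outputs for a batch, and labels the integer class indices returned by the loader above; BCEWithLogitsLoss is the logits variant of the suggested BCELoss):
import torch
import torch.nn as nn
import torch.nn.functional as F

criterion = nn.BCEWithLogitsLoss()   # use nn.BCELoss() instead if the model already ends in a sigmoid

# labels: integer class indices of shape (batch,), as produced by ImageLoader
labels_one_hot = F.one_hot(labels, num_classes=3).float()   # shape (batch, 3)
loss = criterion(outputs, labels_one_hot)                    # outputs: shape (batch, 3)
loss.backward()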
I am a novice in TensorFlow but have a fair understanding of ML algorithms. I have a project to model the error characteristic of time-of-flight (ToF) cameras. As ground truth, I have acquired a set of depth images (512x424 pixels) from a stereo setup. The range images (512x424 pixels) from the ToF camera need to be compared with these reference depth images. In order to learn the error characteristic, I am implementing a deep neural network with the reference image pixels as training input data (median-filtered pixels as features) and the difference between the reference image and the range camera image as the training output value. There are 3 pairs of images to train on and 1 pair of images to test on. I have flattened the image matrices, so the training input data is a 3-element list of arrays of size 217088.
My code works without any errors but the results are ugly:
The cost reduces nicely after the first epoch but does not change much after the second epoch.
The accuracy in the test phase is horrendous.
The code is extremely slow; it takes almost 2 hours for a complete run. Maybe it has to do with the hardware, as I am running it on a Core i3.
My code:
import tensorflow as tf
import numpy
import cv2
import matplotlib.pyplot as plt
import glob

refDepthImgLoc = r'M:\Internship\Scan\png\scan_dist*.png'
tofDepthImgLoc = r'M:\Internship\Scan\png\kinect_distance*.png'

numImg = 4
refDepthImg = []
tofDepthImg = []

refLoc = glob.glob(refDepthImgLoc)
tofLoc = glob.glob(tofDepthImgLoc)

for refImg, tofImg in zip(refLoc, tofLoc):
    img1 = cv2.imread(refImg, 0)
    refDepthImg.append(img1)
    img2 = cv2.imread(tofImg, 0)
    tofDepthImg.append(img2)

trainData_median = []
trainLabel = []

for i in range(len(refDepthImg)):
    tempData = cv2.medianBlur(refDepthImg[i], 3)
    trainData_median.append(tempData.ravel())
    tempLabel = refDepthImg[i] - tofDepthImg[i]
    trainLabel.append(tempLabel.ravel())

n_nodes_hl1 = 100
n_nodes_hl2 = 100
n_nodes_hl3 = 100

n_input = 1
n_output = 1
learning_rate = 0.01

x = tf.placeholder('float')
y = tf.placeholder('float')

def neural_network_model(data):
    hidden_1_layer = {'weights': tf.Variable(tf.random_normal([n_input, n_nodes_hl1])),
                      'biases': tf.Variable(tf.random_normal([n_nodes_hl1]))}
    hidden_2_layer = {'weights': tf.Variable(tf.random_normal([n_nodes_hl1, n_nodes_hl2])),
                      'biases': tf.Variable(tf.random_normal([n_nodes_hl2]))}
    hidden_3_layer = {'weights': tf.Variable(tf.random_normal([n_nodes_hl2, n_nodes_hl3])),
                      'biases': tf.Variable(tf.random_normal([n_nodes_hl3]))}

    l1 = tf.add(tf.matmul(data, hidden_1_layer['weights']), hidden_1_layer['biases'])
    l1 = tf.nn.relu(l1)
    l2 = tf.add(tf.matmul(l1, hidden_2_layer['weights']), hidden_2_layer['biases'])
    l2 = tf.nn.relu(l2)
    l3 = tf.add(tf.matmul(l2, hidden_3_layer['weights']), hidden_3_layer['biases'])
    l3 = tf.nn.relu(l3)

    # collapse the last hidden layer to a single scalar prediction
    output = tf.reduce_sum(l3)
    return output

def train_neural_network(x):
    prediction = neural_network_model(x)
    cost = tf.reduce_sum(tf.square(prediction - y)) / ((numImg - 1) * len(trainLabel[0]))
    optimizer = tf.train.GradientDescentOptimizer(learning_rate).minimize(cost)

    hm_epochs = 10
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())

        for epoch in range(hm_epochs):
            tempLoss = 0
            for i in range(numImg - 1):
                # feed one pixel (and its target error) at a time
                for (X, Y) in zip(trainData_median[i], trainLabel[i]):
                    _, c = sess.run([optimizer, cost], feed_dict={x: [[X]], y: [[Y]]})
                    tempLoss += c
            print('Epoch', (epoch + 1), 'completed out of', hm_epochs, 'loss:', tempLoss)

        print("Testing starts now")
        test = tf.abs(prediction - y)
        i = 0
        pred = numpy.zeros(len(trainLabel[0]))
        result = numpy.zeros(len(trainLabel[0]))
        for (X, Y) in zip(trainData_median[numImg - 1], trainLabel[numImg - 1]):
            correct, pred[i] = sess.run([test, prediction], feed_dict={x: [[X]], y: [[Y]]})
            if (correct < 0.5):
                result[i] = 1
            i += 1

        accuracy = tf.reduce_mean(tf.cast(result, 'float'))
        print('Accuracy:', accuracy.eval())

train_neural_network(x)
The output:
Epoch 1 completed out of 10 loss: 204681865.46
Epoch 2 completed out of 10 loss: 3188.81297796
Epoch 3 completed out of 10 loss: 3183.35926716
Epoch 4 completed out of 10 loss: 3181.37895241
Epoch 5 completed out of 10 loss: 3179.95276242
Epoch 6 completed out of 10 loss: 3178.51366003
Epoch 7 completed out of 10 loss: 3177.6227609
Epoch 8 completed out of 10 loss: 3176.69995104
Epoch 9 completed out of 10 loss: 3176.85162593
Epoch 10 completed out of 10 loss: 3177.04338937
Testing starts now
Accuracy: 0.00301721
Please comment if there is any inherent logical error in the code, or if the entire approach to the problem is incorrect. Should I try implementing it using CNNs? Please help me make this work, and let me know if any further information is required.
Thanks.