I have a video of droplets. I took the first frame and manually marked the edges, then split the image into smaller tiles. I then tried to train a Keras model to map the small unmarked tiles to the corresponding small marked tiles.
I've tried using "dense" layers. The model trains, but the loss does not improve. When I try using the model, it just gives me a black image output.
[Images: marked split images; input image (frame 1); model summary]
#################### IMPORT AND SPLIT
from cam_img_split import cam_img_split
import cv2
import numpy as np  # needed for np.prod below
img_tr_in=cv2.imread('frame 1.png')
img_tr_out=cv2.imread('frame 1 so far.png')
seg_shape=[32,32]
tr_in=cam_img_split(img_tr_in,seg_shape)
tr_out=cam_img_split(img_tr_out,seg_shape)
pl=[4,20] #images selected for training
##################### NEURAL NETWORK
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense
from keras.optimizers import adam
b_sha=np.prod(tr_in.shape[2:5]) #batch shape
model = Sequential()
model.add(Dense(b_sha, activation='relu'))
model.add(Dense(3072, activation='softmax'))
model.add(Dense(3072, activation='softmax'))
model.add(Dense(3072, activation='softmax'))
model.add(Dense(np.prod(tr_out.shape[2:5]), activation='softmax'))
model.compile(optimizer=adam(lr=0.1), loss='mean_squared_error', metrics=['accuracy'])
tr_in_sel=tr_in[0:pl[0],0:pl[1],:,:,:]
tr_out_sel=tr_out[0:pl[0],0:pl[1],:,:,:]
tr_in_sel_flat=tr_in_sel.reshape([np.prod(pl),b_sha]) #Flattening
tr_out_sel_flat=tr_in_sel.reshape([np.prod(pl),b_sha])
tr_in_sel_flat_norm=tr_in_sel_flat/255
tr_out_sel_flat_norm=tr_out_sel_flat/255
model.fit(tr_in_sel_flat_norm, tr_out_sel_flat_norm, epochs=10, batch_size=pl[0])
I expected an output matching the images with the marked edges. Instead I got a black image output.
I solved this problem by using 7x7 sections of the image to classify the central pixel of that section as either oil or water (1 or 0). I then used the binary_crossentropy loss function to train the model.
By having a 7x7 section moving one pixel at a time across the main image, I can have a lot more training data than just segmenting the main image.
I previously tried to get a 7x7 image from another 7x7 image, which made the problem more difficult.
#IMPORT AND SPLIT
from cam_img_split import cam_img_split
from cam_pad import cam_pad
from cam_img_bow import cam_img_bow
import cv2
import numpy as np
img_tr_in=cv2.imread('frame 1.png',0)[0:767,0:767]/255
img_tr_out=cv2.imread('frame 1 so far bnw 2.png',0)[0:767,0:767]/255
img_tr_out=(cam_img_bow(img_tr_out,0.5)).astype(np.uint8)
seg_shape=[15,15] #needs to be odd and equal to each other
pl_max=img_tr_in.shape[0:2]
pl=np.array([0.15*pl_max[0],pl_max[1]]).astype(np.uint32)
pad_in=int(np.floor(seg_shape[0]/2))
img_tr_in_pad=cam_pad(img_tr_in,pad_in)
tr_in=np.zeros([pl[0],pl[1],seg_shape[0],seg_shape[1]])
for n1 in range(0,pl[0]):
    for n2 in range(0,pl[1]):
        tr_in[n1,n2]=img_tr_in_pad[n1:n1+seg_shape[0],n2:n2+seg_shape[1]]
##################### NEURAL NETWORK
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Dense,Dropout,Conv2D, MaxPooling2D, Flatten
from keras.optimizers import adam
from keras.utils import to_categorical
import matplotlib.pyplot as plt
pad=4
input_shape=(seg_shape[0]+2*pad,seg_shape[1]+2*pad,1)
output_shape=(1,1,1)
model = Sequential()
model.add(Conv2D(32, (3, 3),input_shape=input_shape, activation='relu'))
model.add(Conv2D(64,(3, 3), activation='relu'))
model.add(Dropout(0.2))
model.add(Flatten())
model.add(Dense(units=2, activation='softmax'))
model.compile(optimizer=adam(lr=0.001), loss='binary_crossentropy', metrics=['accuracy'])
##################### FITTING THE MODEL
tr_in_flat=tr_in.reshape([pl[0]*pl[1],seg_shape[0],seg_shape[1],1])
tr_out_flat=img_tr_out.reshape([pl_max[0]*pl_max[1]])
tr_in_flat_pad=np.zeros(tr_in_flat.shape+np.array([0,2*pad,2*pad,0]))
for n3 in range(0,tr_in_flat.shape[0]):
    tr_in_flat_pad[n3,:,:,0]=cam_pad(tr_in_flat[n3,:,:,0], pad)
model.fit(tr_in_flat_pad, to_categorical(tr_out_flat[0:pl[0]*pl[1]]), epochs=5, batch_size=int(16*pl[0]),shuffle=True)
##################### PLOTTING PREDICTIONS
tr_in_full=np.zeros([pl_max[0],pl_max[1],seg_shape[0]+2*pad,seg_shape[1]+2*pad])
for n1 in range(0,pl_max[0]):
    for n2 in range(0,pl_max[1]):
        tr_in_full[n1,n2]=cam_pad(img_tr_in_pad[n1:n1+seg_shape[0],n2:n2+seg_shape[1]],pad)
tr_in_full_flat=tr_in_full.reshape([pl_max[0]*pl_max[1],seg_shape[0]+2*pad,seg_shape[1]+2*pad,1])
pred = model.predict(tr_in_full_flat)
pred_img=np.zeros(pred.shape[0])
for n1 in range(0,pred.shape[0]):
    pred_img[n1]=round(pred[n1,0])
pred_img_out=(pred_img.reshape([pl_max[0],pl_max[1]]))
plt.subplot(1,2,1)
plt.imshow(pred_img_out)
plt.subplot(1,2,2)
plt.imshow(img_tr_in)
plt.show()
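The sliding-window idea described above can also be written without explicit Python loops. The following is a minimal sketch, not the code used in this answer: it assumes NumPy 1.20 or later (for `sliding_window_view`), zero padding at the borders instead of the custom `cam_pad`, and a hypothetical helper name `extract_patches`.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def extract_patches(image, size):
    """Return one size x size patch per pixel, centred on that pixel.

    Hypothetical helper, not part of the original code. Borders are
    zero-padded, which is an assumption (the original uses cam_pad).
    """
    if size % 2 == 0:
        raise ValueError("size must be odd so each patch has a central pixel")
    half = size // 2
    padded = np.pad(image, half, mode="constant", constant_values=0)
    # Shape (rows, cols, size, size); this is a view, no data is copied yet.
    windows = sliding_window_view(padded, (size, size))
    # Flatten the pixel grid into a batch axis and add a channel axis.
    return windows.reshape(-1, size, size, 1)

# Toy 5x5 "image" with pixel values 0..24.
image = np.arange(25, dtype=np.float32).reshape(5, 5)
patches = extract_patches(image, 3)

# Stand-in per-pixel labels (1 = "oil", 0 = "water"), one per patch.
labels = (image > 12).astype(np.uint8).reshape(-1)

print(patches.shape)  # one patch per pixel
print(labels.shape)   # one label per patch
```

Because every pixel contributes its own patch, a single frame yields as many training samples as it has pixels, which is where the extra training data comes from.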
You are using the wrong loss/metric combo. Is your problem classification, or regression? MSE is for regression, while categorical_crossentropy (or sparse, or binary) is for classification.
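A related point about the original Dense model: a softmax layer forces its outputs to sum to 1, so spreading that total over 3072 output pixels leaves each value close to zero. This may be enough on its own to explain a black output image. The check below is a small sketch that uses random numbers as a stand-in for the network's pre-activation outputs; it is not taken from the question's code.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
logits = rng.normal(size=(1, 3072))  # stand-in for one flattened 32x32x3 tile
out = softmax(logits)

print(out.sum())   # softmax outputs sum to 1 across the 3072 values
print(out.max())   # so even the largest single value is tiny

# Rescaled to 0..255 for display, the tile is at most a few grey levels
# above zero, i.e. visually black.
as_pixels = np.round(out * 255).astype(np.uint8)
print(as_pixels.max())
```

If each output pixel should independently lie in [0, 1], a sigmoid output with binary_crossentropy is the usual pairing, which is in line with the per-pixel classification approach described above.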
Related
I have a dataset of size 273985 x 5 that I'm training as a path prediction problem. I chose an LSTM inspired by this paper: https://ieeexplore.ieee.org/abstract/document/9225479
I have a baseline implementation as such:
# lstm autoencoder recreate sequence
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM
from keras.layers import Dense
from keras.layers import RepeatVector
from keras.callbacks import EarlyStopping
from keras.layers import TimeDistributed
from keras.utils import plot_model
# define input sequence
my_sequence = np.array(sample)
# reshape input into [samples, timesteps, features]
n_in = len(my_sequence)
my_sequence = my_sequence.reshape((1, n_in, 5))
# define model
model = Sequential()
model.add(LSTM(10, activation='sigmoid', input_shape=(n_in,5)))
model.add(RepeatVector(n_in))
model.add(LSTM(10, activation='sigmoid', return_sequences=True))
model.add(TimeDistributed(Dense(5)))
model.compile(optimizer='adam', loss='mse')
# fit model
model.fit(my_sequence, my_sequence, epochs=300, verbose=0)
# structure of the model and the layers
plot_model(model, show_shapes=True, to_file=path)
# demonstrate recreation
predicted = model.predict(my_sequence, verbose=0)
print(predicted)
print(my_sequence)
Right now, I am choosing my training sample by hand but I want to train my entire dataset much like bootstrapping where I train 1-50, predict the next 50; train 2-50, predict the next 50… until the end of the test set then compare my prediction against the actual values.
Would this be done via batching the data or k-fold validation? Also, how would one go about it or calculate the appropriate evaluation metric?
Thank you!
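The scheme described above (train on one window, predict the rows that follow, slide forward, repeat) is usually called walk-forward or rolling-origin evaluation rather than batching or k-fold validation; ordinary k-fold would let the model train on rows that come after the ones it is tested on. A minimal sketch of the index bookkeeping follows. The helper names `walk_forward_windows` and `rmse`, the toy data, and the placeholder "model" are all assumptions for illustration; a fitted LSTM would replace the placeholder.

```python
import numpy as np

def walk_forward_windows(n_samples, train_len=50, horizon=50, step=1):
    """Yield (train_idx, test_idx) pairs for rolling-origin evaluation.

    Hypothetical helper: each window trains on `train_len` consecutive
    rows and is scored on the `horizon` rows that follow.
    """
    start = 0
    while start + train_len + horizon <= n_samples:
        train_idx = np.arange(start, start + train_len)
        test_idx = np.arange(start + train_len, start + train_len + horizon)
        yield train_idx, test_idx
        start += step

def rmse(y_true, y_pred):
    """Root mean squared error over all rows and features."""
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Toy stand-in for the 273985 x 5 dataset.
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 5)).cumsum(axis=0)

scores = []
for train_idx, test_idx in walk_forward_windows(len(data), step=50):
    # Placeholder "model": repeat the last training row over the horizon.
    # Predictions from a model fitted on data[train_idx] would go here.
    last_row = data[train_idx[-1]]
    y_pred = np.tile(last_row, (len(test_idx), 1))
    scores.append(rmse(data[test_idx], y_pred))

print(len(scores), np.mean(scores))
```

Averaging a per-window error such as RMSE gives one overall score, and the per-window values show whether accuracy drifts over time. scikit-learn's `TimeSeriesSplit` offers a similar splitter if a ready-made one is preferred.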
I've been working on this CNN. In the test() function it always predicts the same digit (for example, it always outputs 8 even when the input looks nothing like an 8). I've tried training the model for longer in case it simply was not good enough. Here is my code:
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras.layers import Dense, Conv2D, Dropout, MaxPooling2D
from tensorflow.keras.callbacks import TensorBoard
from tensorflow.keras.utils import to_categorical
from matplotlib import pyplot as plt
(Train_Data, Train_Labels), (Test_Data, Test_Labels) = tf.keras.datasets.mnist.load_data()
Train_Data = Train_Data.reshape(60000,28,28,1)
Test_Data = Test_Data.reshape(10000,28,28,1)
Train_Data = Train_Data / 255 - 0.5
Test_Data = Test_Data / 255 - 0.5
def load(name):
    net = keras.models.load_model(name)
    return net
def save(name):
    model.save(name)
    print("""
###:::SAVING MODEL:::###
""")
def makeCNN():
    model = keras.Sequential()
    model.add(Conv2D(32, kernel_size=3, activation='relu'))
    model.add(MaxPooling2D(pool_size=(3,3)))
    model.add(keras.layers.Flatten())
    model.add(Dense(9, activation='relu'))
    model.add(Dense(10, activation='softmax'))
    model.compile(optimzer='adam', loss="mse", metrics=['accuracy'])
    return model
def train(epochs):
    for i in range(epochs):
        print(i+1)
        model.fit(Train_Data, Train_Labels)
    save('CNN.h5')
def test():
    validCorrect = 0
    validTotal = 0
    print(Test_Data.shape)
    for i in range(1000):
        data = Test_Data[i]
        data = data.reshape(1,28,28,1)
        prediction = model.predict(data)
        validTotal +=1
        if np.argmax(prediction) == Test_Labels[i]:
            validCorrect+=1
        print(f"""
TOTAL:{validTotal}
ACCURACY:{(validCorrect/validTotal)*100}
CORRECT:{validCorrect}
""")
        print(f"GUESS:{np.argmax(prediction)}\nREALITY{Test_Labels[i]}")
model = makeCNN()
train(80)
test()
Any help is appreciated. Thanks! I'm pretty new to machine learning (especially CNNs).
Firstly, you should use categorical_crossentropy as your loss. It's tempting to use MSE, since we're dealing with digits after all, but this is a classification task and the model doesn't know about the supposed ordinality of the different digits. It just knows them as "ten different classes of image". For example, is a 7 more similar to a 2 or an 8? In terms of ordinality, it's closer to 8. But the digit looks rather more like a 2, doesn't it?
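To make the ordinality point concrete, here is a small NumPy sketch (the helper names and the two example predictions are made up for illustration). For a true label of 7, cross-entropy penalises a confident "8" and a confident "2" identically, because both are simply the wrong class.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows."""
    return np.eye(num_classes)[labels]

def categorical_crossentropy(y_true, y_prob, eps=1e-12):
    """Mean cross-entropy between one-hot targets and predicted probabilities."""
    return float(-np.mean(np.sum(y_true * np.log(y_prob + eps), axis=1)))

labels = np.array([7, 2])
targets = one_hot(labels, 10)

# Two hypothetical predictions for the digit 7: one puts its mass on 8
# (numerically close to 7), the other on 2 (numerically far from 7).
pred_8 = np.full(10, 0.01)
pred_8[8] = 0.91
pred_2 = np.full(10, 0.01)
pred_2[2] = 0.91

loss_8 = categorical_crossentropy(targets[:1], pred_8[None, :])
loss_2 = categorical_crossentropy(targets[:1], pred_2[None, :])
print(loss_8, loss_2)  # the same value for both wrong answers
```

Note that categorical_crossentropy expects one-hot targets like `targets` above. The labels returned by `mnist.load_data()` are plain integers, so either convert them with `to_categorical` (already imported in the question) or use sparse_categorical_crossentropy, which accepts integer labels directly.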
Also, I'm guessing that your model is likely to under-fit quite severely, because it is not deep enough. You can try adding some more convolutional layers to your network. You could draw inspiration from this example in the Keras documentation (also on the MNIST dataset): https://keras.io/examples/mnist_cnn/ where they achieve >99% on this problem with just a couple of extra convolutional layers and some techniques to reduce overfitting, such as dropout.
I have built a CNN that classifies input images into 5 classes. During training the model seems fine, i.e. it neither overfits nor underfits. Yet after saving and loading the model, I always get the same output regardless of the input image: the final prediction array contains 0 for every class.
So I am not sure whether the model predicts nothing at all or always produces the same result.
The model below is the one I settled on after using TensorBoard to find the best fit.
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout, Activation, Flatten, Conv2D, MaxPooling2D
from tensorflow.keras.callbacks import TensorBoard
import pickle
import time
X=pickle.load(open("X.pickle","rb"))
y=pickle.load(open("y.pickle","rb"))
X=X/255.0
dense_layers=[0]
layer_sizes=[64]
conv_layers=[3]
for dense_layer in dense_layers:
    for layer_size in layer_sizes:
        for conv_layer in conv_layers:
            NAME="{}-conv-{}-nodes-{}-dense-{}".format(conv_layer,layer_size,dense_layer,int(time.time()))
            print(NAME)
            tensorboard=TensorBoard(log_dir='logs\{}'.format(NAME))
            model = Sequential()
            model.add(Conv2D(layer_size, (3,3), input_shape=X.shape[1:]))
            model.add(Activation('relu'))
            model.add(MaxPooling2D(pool_size=(2,2)))
            for l in range(conv_layer-1):
                model.add(Conv2D(layer_size, (3,3)))
                model.add(Activation('relu'))
                model.add(MaxPooling2D(pool_size=(2,2)))
            model.add(Flatten())
            for l in range(dense_layer):
                model.add(Dense(layer_size))
                model.add(Activation('relu'))
            model.add(Dense(5))
            model.add(Activation('sigmoid'))
            model.compile(loss='sparse_categorical_crossentropy',
                          optimizer='adam',
                          metrics=['accuracy'])
            model.fit(X,y,batch_size=32,epochs=10,validation_split=0.3,callbacks=[tensorboard])
model.save('0x64x3-CNN-latest.model')
The loading model snippet is as below,
import cv2
import tensorflow as tf
CATEGORIES= ["fifty","hundred","ten","thousand","twenty"]
def prepare(filepath):
    IMG_SIZE=100
    img_array=cv2.imread(filepath)
    new_array=cv2.resize(img_array,(IMG_SIZE,IMG_SIZE))
    return new_array.reshape(-1,IMG_SIZE,IMG_SIZE,3)
model=tf.keras.models.load_model("0x64x3-CNN-latest.model")
prediction=model.predict([prepare('30.jpg')])
print(prediction)
The output is always [[0. 0. 0. 0. 0.]].
On converting to categories, it always results in fifty.
My dataset contains almost 2200 images with an average of 350-500 images for each class.
Can someone help out with this..?
I see that when you train, you normalize your images:
X = X/255.0
but at prediction time you only read and resize the image, without normalizing it. Try:
def prepare(filepath):
    IMG_SIZE=100
    img_array=cv2.imread(filepath)
    img_array = img_array/255.0
    new_array=cv2.resize(img_array,(IMG_SIZE,IMG_SIZE))
    return new_array.reshape(-1,IMG_SIZE,IMG_SIZE,3)
and also, your prepare function returns your image in 4 dimensions (including the batch dimension), so when you call predict, you do not have to give the input as a list. Instead of:
prediction=model.predict([prepare('30.jpg')])
you should do:
prediction=model.predict(prepare('30.jpg'))
Hope it helps.
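One way to keep training and prediction from drifting apart is to route both through a single preprocessing function. This is only a sketch: `preprocess` is a hypothetical helper, and a random array stands in for an image that has already been read and resized with cv2.

```python
import numpy as np

IMG_SIZE = 100

def preprocess(img_array):
    """Scale uint8 pixels to [0, 1] and add a batch axis if needed.

    Hypothetical helper: call it both when building X for training and
    inside prepare() before predict(), so both paths scale identically.
    """
    scaled = np.asarray(img_array, dtype=np.float32) / 255.0
    if scaled.ndim == 3:  # single image -> batch of one
        scaled = scaled[np.newaxis, ...]
    return scaled

# Stand-in for an already-resized 100x100 3-channel image from cv2.imread.
rng = np.random.default_rng(0)
fake_image = rng.integers(0, 256, size=(IMG_SIZE, IMG_SIZE, 3), dtype=np.uint8)

batch = preprocess(fake_image)
print(batch.shape, batch.dtype, batch.min() >= 0.0, batch.max() <= 1.0)
```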
Keras prediction returns the same value everywhere.
I have some xyz data that I want to predict in a regular grid, using keras ML.
I am doing something wrong and can't figure out what.
import numpy as np
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.optimizers import Adadelta, Adam
m=int(1e5) # np.random.rand and batch_size need an integer, not a float
data=np.random.rand(m,3) # let's generate some random data (i do actually have real data that make sense)
dx=0.05
xmin=np.min(data[:,0])
xmax=np.max(data[:,0])
ymin=np.min(data[:,1])
ymax=np.max(data[:,1])
xs=np.arange(xmin,xmax+dx,dx)
ys=np.arange(ymin,ymax+dx,dx)
xg,yg=np.meshgrid(xs,ys)
shape = (len(ys), len(xs))
activation='sigmoid'
hidden_layer_sizes=[128, 64, 32, 16]
keras_model = Sequential()
keras_model.add(Dense(hidden_layer_sizes[0], activation=activation, input_shape=(2, )))
for hl_size in hidden_layer_sizes[1: ]:
    keras_model.add(Dense(hl_size, activation=activation))
keras_model.add(Dense(1))
keras_model.compile(loss='mean_squared_error', optimizer=Adam())
keras_model.save_weights('cache.h5')
keras_model.summary()
keras_model.load_weights('cache.h5') # re-initialize Keras model weights
keras_history = keras_model.fit(data[:,:2], data[:,2], batch_size=m, epochs=20000, verbose=1)
X_test = np.vstack((xg.flatten(), yg.flatten())).T
res_keras=keras_model.predict(X_test).reshape(shape)
I am expecting some values "close" to an interpolation function.
Where is the mistake in my code?
Change your activation from sigmoid to relu by setting:
activation='relu'
I've been using Keras with TensorFlow to classify normalized 60x60 grayscale images of an arrow into 4 categories by orientation: up, down, left, and right. I have created a dataset of about 1800 images, almost equally distributed across those categories.
However, there's a problem with classification. The source I built the dataset from contains two types of arrows, arrow shape 1 and arrow shape 2.
The accuracy is okay for arrows shaped like 1 (about 70% validation accuracy), but terrible for arrows shaped like 2.
I've gone through my dataset, and about 90% of the images are arrow shape 1.
Does that mean that the lack of training data for arrow shape 2 is the reason the model cannot classify those arrows as well as shape 1, and that increasing the amount of shape 2 data would resolve the issue?
If true, doesn't that mean that my model has failed to generalize?
Also, if the arrow colors are inverted, will the network be affected by this?
Here is the code I'm using for training:
# -*- coding:utf-8 -*-
import cv2
import numpy as np
import os
from random import shuffle
import glob
train_dir = "images\\cropped\\traindata"
test_dir = "images\\cropped\\testdata"
MODEL_NAME = "ARROWS.model"
img_size = 60
# Importing the Keras libraries and packages
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
from keras.layers import Dropout
from keras.layers import Activation
from keras.layers import BatchNormalization
from keras.preprocessing.image import ImageDataGenerator
from keras.optimizers import adam
from keras.callbacks import TensorBoard
from keras import backend as K
from tensorflow import Session, ConfigProto, GPUOptions
gpuoptions = GPUOptions(allow_growth=True)
session = Session(config=ConfigProto(gpu_options=gpuoptions))
K.set_session(session)
classifier = Sequential()
classifier.add(Conv2D(32, (3,3), input_shape=(img_size, img_size, 1)))
classifier.add(BatchNormalization())
classifier.add(Activation("relu"))
classifier.add(Conv2D(32, (3,3)))
classifier.add(BatchNormalization())
classifier.add(Activation("relu"))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Dropout(0.25))
#classifier.add(Dropout(0.25))
classifier.add(Conv2D(64, (3,3), padding='same'))
classifier.add(BatchNormalization())
classifier.add(Activation("relu"))
classifier.add(MaxPooling2D(pool_size=(2, 2)))
classifier.add(Dropout(0.25))
#classifier.add(Dropout(0.25))
classifier.add(Flatten())
classifier.add(Dense(128))
classifier.add(BatchNormalization())
classifier.add(Activation("relu"))
classifier.add(Dropout(0.5))
classifier.add(Dense(4))
classifier.add(BatchNormalization())
classifier.add(Activation("softmax"))
classifier.compile(optimizer = adam(lr=1e-6), loss = 'categorical_crossentropy', metrics = ['accuracy'])
train_datagen = ImageDataGenerator(rotation_range=12)
test_datagen = ImageDataGenerator(rotation_range=12)
training_set = train_datagen.flow_from_directory('images/cropped/traindata',
                                                 color_mode="grayscale",
                                                 target_size = (img_size, img_size),
                                                 batch_size = 32,
                                                 class_mode = 'categorical', shuffle=True)
test_set = test_datagen.flow_from_directory('images/cropped/testdata',
                                            color_mode="grayscale",
                                            target_size = (img_size, img_size),
                                            batch_size = 32,
                                            class_mode = 'categorical', shuffle=True)
with open("class_indices.txt", "w") as indices_fine: # Log debug data to file
    indices_fine.write(str(classifier.summary()))
    indices_fine.write("\n")
    indices_fine.write("training_set indices:\n"+str(training_set.class_indices))
    indices_fine.write("test_set indices:\n"+str(test_set.class_indices))
tbCallBack = TensorBoard(log_dir='./log', histogram_freq=0, write_graph=True, write_images=True)
classifier.fit_generator(training_set,steps_per_epoch = 8000,epochs = 15,validation_data = test_set,validation_steps = 2000, shuffle=True, callbacks=[tbCallBack])
classifier.save("arrow_classifier_keras_gray.h5")
Does that mean that the lack of training data for arrow shape 2 is the reason that it cannot classify them as well as shape 1, and therefore increasing the dataset for shape 2 resolve this issue?
Your dataset distribution is very important: an imbalance can bias the model toward the over-represented cases, so that it does not perform as you expect. In your case, there are far fewer examples of shape 2 than of shape 1, which biases your deep learning model toward assuming that all down arrows look like shape 1 and not shape 2. The solution is the one you already suggested: add more shape 2 images, or make shape 1 and shape 2 equally represented within the down arrow class.
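When collecting more shape 2 images is not practical, one stopgap is to resample the existing ones until both shapes are equally represented. The sketch below assumes you can tag each training image with a shape id (0 for shape 1, 1 for shape 2); the helper name and the random stand-in data are illustrative only.

```python
import numpy as np

def oversample_to_balance(images, groups, rng):
    """Resample with replacement so every group matches the largest one.

    `groups` holds one integer id per image (e.g. 0 = shape 1, 1 = shape 2).
    Hypothetical helper for illustration only.
    """
    ids, counts = np.unique(groups, return_counts=True)
    target = counts.max()
    chosen = []
    for gid, count in zip(ids, counts):
        members = np.flatnonzero(groups == gid)
        idx = members
        if count < target:
            # Keep every original image and draw the shortfall at random.
            extra = rng.choice(members, size=target - count, replace=True)
            idx = np.concatenate([members, extra])
        chosen.append(idx)
    order = rng.permutation(np.concatenate(chosen))
    return images[order], groups[order]

rng = np.random.default_rng(0)
images = rng.random((100, 60, 60, 1))    # stand-in for the arrow images
groups = np.array([0] * 90 + [1] * 10)   # 90% shape 1, 10% shape 2

bal_images, bal_groups = oversample_to_balance(images, groups, rng)
print(np.bincount(bal_groups))  # both shapes now appear equally often
```

Repeated copies add no new information, so this works best together with augmentation, and it should be applied to the training set only, never to the test set.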
If true, doesn't that mean that my model has failed to generalize?
Your dataset's image distribution caused the model to generalize poorly on that particular class (down arrow). If your model works well on the other classes, the problem is not your model but your data for the down arrow class.
Just imagine your first image is a tiny cat and a second is a fat cat, like Garfield. The distribution of cats is something we cannot change, but we need to detect all the cats (even when the cats are inverted or sprayed in pink).
What I would do, for instance, if I had 1000 images of tiny cats, is filter some of the images and add distortions and effects to make the training set bigger. This is called data augmentation.
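For the arrow images in the question, augmentation has to preserve the label, so flips and large rotations are out (the question already uses a small rotation_range for this reason). A minimal NumPy sketch of two label-safe distortions, small shifts and mild noise, is shown here; the function name, parameter values, and toy image are assumptions.

```python
import numpy as np

def augment(image, rng, max_shift=3, noise_std=0.02):
    """Return a randomly shifted, slightly noisy copy of a (H, W) image in [0, 1].

    Shifts and noise keep an arrow's direction intact; flips or large
    rotations would change the label and are deliberately avoided.
    """
    dy, dx = rng.integers(-max_shift, max_shift + 1, size=2)
    # np.roll wraps pixels around the edges, acceptable for small shifts
    # when the arrow sits away from the border.
    shifted = np.roll(image, shift=(int(dy), int(dx)), axis=(0, 1))
    noisy = shifted + rng.normal(0.0, noise_std, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)

rng = np.random.default_rng(0)
arrow = np.zeros((60, 60))
arrow[20:40, 28:32] = 1.0  # crude stand-in for an arrow shaft

augmented = np.stack([augment(arrow, rng) for _ in range(8)])
print(augmented.shape, augmented.min(), augmented.max())
```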
You don't need to deliberately make the number of fat cat images equal to the number of tiny cat images if, in the end, the classifier recognizes them all well, say with ~98% accuracy on this dataset.
It is important to test.
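NOTE: A CNN is not automatically insensitive to inverted colors. Convolution filters respond to local contrast patterns, and inverting an image flips the sign of that contrast, so if inverted arrows can occur, include inverted copies in the training set as part of the augmentation.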