I have fine-tuned the Keras VGG16 model, but I'm unsure about the preprocessing during the training phase.
I create a train generator as follows:
train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_folder,
    target_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=train_batchsize,
    class_mode="categorical"
)
Is the rescaling enough, or do I have to apply other preprocessing functions?
When I use the network to classify an image I use this code:
from keras.models import load_model
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
import numpy as np
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)
preds = model.predict(x)
I think this is the correct preprocessing, and I should apply it during training as well.
Thanks for your help.
ImageDataGenerator has a preprocessing_function argument which allows you to pass the same preprocess_input function that you are using during inference. This function does the rescaling for you, so you can omit the rescale argument:
from keras.applications.vgg16 import preprocess_input
train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
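For completeness, the rest of the pipeline stays exactly as in your question (a sketch reusing your train_folder, IMAGE_SIZE, and train_batchsize variables); the only difference is that the rescale argument is dropped:
from keras.preprocessing.image import ImageDataGenerator
from keras.applications.vgg16 import preprocess_input

train_datagen = ImageDataGenerator(preprocessing_function=preprocess_input)
train_generator = train_datagen.flow_from_directory(
    train_folder,
    target_size=(IMAGE_SIZE, IMAGE_SIZE),
    batch_size=train_batchsize,
    class_mode="categorical"
)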
Most of the pretrained models in keras_applications use the same preprocessing function. You can inspect the docstring to see what it does:
def preprocess_input(x, data_format=None, mode='caffe', **kwargs):
    """Preprocesses a tensor or Numpy array encoding a batch of images.

    # Arguments
        x: Input Numpy or symbolic tensor, 3D or 4D.
            The preprocessed data is written over the input data
            if the data types are compatible. To avoid this
            behaviour, `numpy.copy(x)` can be used.
        data_format: Data format of the image tensor/array.
        mode: One of "caffe", "tf" or "torch".
            - caffe: will convert the images from RGB to BGR,
                then will zero-center each color channel with
                respect to the ImageNet dataset,
                without scaling.
            - tf: will scale pixels between -1 and 1,
                sample-wise.
            - torch: will scale pixels between 0 and 1 and then
                will normalize each channel with respect to the
                ImageNet dataset.

    # Returns
        Preprocessed tensor or Numpy array.
    """
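As a quick sanity check (my own sketch, not from the docstring), you can see the caffe behaviour on a dummy batch: the channels are flipped to BGR and the ImageNet channel means are subtracted, with no scaling to [0, 1]:
import numpy as np
from keras.applications.vgg16 import preprocess_input

dummy = np.zeros((1, 224, 224, 3), dtype='float32')  # one all-black image
out = preprocess_input(dummy)
# each BGR channel is zero-centered w.r.t. the ImageNet means,
# so a zero input becomes the negated means:
print(out[0, 0, 0])  # approx. [-103.939, -116.779, -123.68]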
How can I do this for multiple images (within a folder) and put them into a Dataframe?
This is the code for analysing one image:
import numpy as np
from keras.preprocessing import image
from keras.applications import resnet50
import warnings
warnings.filterwarnings('ignore')
# Load Keras' ResNet50 model that was pre-trained against the ImageNet database
model = resnet50.ResNet50()
# Load the image file, resizing it to 224x224 pixels (required by this model)
img = image.load_img("rgotunechair10.jpg", target_size=(224, 224))
# Convert the image to a numpy array
x = image.img_to_array(img)
# Add a fourth dimension since Keras expects a list of images
x = np.expand_dims(x, axis=0)
# Scale the input image to the range used in the trained network
x = resnet50.preprocess_input(x)
# Run the image through the deep neural network to make a prediction
predictions = model.predict(x)
# Look up the names of the predicted classes. Index zero is the results for the first image.
predicted_classes = resnet50.decode_predictions(predictions, top=9)
image_components = []
for x, y, z in predicted_classes[0]:
    image_components.append(y)
print(image_components)
This is the output:
['desktop_computer', 'desk', 'monitor', 'space_bar', 'computer_keyboard', 'typewriter_keyboard', 'screen', 'notebook', 'television']
First of all, move the code for analyzing the image into a function, and return the result instead of printing it:
import numpy as np
from keras.preprocessing import image
from keras.applications import resnet50
import warnings
warnings.filterwarnings('ignore')

def run_resnet50(image_name):
    # Load Keras' ResNet50 model that was pre-trained against the ImageNet database
    model = resnet50.ResNet50()
    # Load the image file, resizing it to 224x224 pixels (required by this model)
    img = image.load_img(image_name, target_size=(224, 224))
    # Convert the image to a numpy array
    x = image.img_to_array(img)
    # Add a fourth dimension since Keras expects a list of images
    x = np.expand_dims(x, axis=0)
    # Scale the input image to the range used in the trained network
    x = resnet50.preprocess_input(x)
    # Run the image through the deep neural network to make a prediction
    predictions = model.predict(x)
    # Look up the names of the predicted classes. Index zero is the results for the first image.
    predicted_classes = resnet50.decode_predictions(predictions, top=9)
    image_components = []
    for x, y, z in predicted_classes[0]:
        image_components.append(y)
    return image_components
Then, get all images inside the desired folder (for instance, the current directory):
import os

images_path = '.'
images = [f for f in os.listdir(images_path) if f.endswith('.jpg')]
Run the function on all images, get the result:
result = [run_resnet50(img_name) for img_name in images]
This result will be a list of lists. Then you could just move it to a DataFrame. If you want to keep the image name for each result, use a dictionary instead.
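For example, with pandas (a sketch using the images and result lists from above; the column layout is up to you):
import pandas as pd

# one row per image, one column per predicted class name
df = pd.DataFrame(result, index=images)
print(df)
Using the file names as the index keeps each image associated with its row, which avoids the need for a separate dictionary.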
I have a tensorflow keras model trained with tensorflow 2.3. The model takes an image as input; however, the model was trained with scaled inputs, so we have to scale the images by 1/255 (i.e. divide by 255) before feeding them to the model.
As we use this model across a variety of platforms, I am trying to simplify this by modifying the model to insert a rescaling layer at the start (i.e. immediately after the input), so that any future consumer of the model can simply pass an image without having to scale it.
I am having a lot of trouble getting this to work. I understand I need to use the following function to create a rescaling layer:
tf.keras.layers.experimental.preprocessing.Rescaling(255, 0.0, "rescaling")
But I am unsure how to insert this at the start of the model.
Thank you in advance.
You can insert this layer at the top of your trained model. Below is an example where we first train a model while manually scaling the input, and then reuse the same trained model with a Rescaling layer added on top:
import numpy as np
from tensorflow.keras.layers import Input, Conv2D, Flatten, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.layers.experimental.preprocessing import Rescaling

# generate dummy data
input_dim = (28,28,3)
n_sample = 10
X = np.random.randint(0,255, (n_sample,)+input_dim)
y = np.random.uniform(0,1, (n_sample,))
# create base model
inp = Input(input_dim)
x = Conv2D(8, (3,3))(inp)
x = Flatten()(x)
out = Dense(1)(x)
# fit base model with manual scaling
model = Model(inp, out)
model.compile('adam', 'mse')
model.fit(X/255, y, epochs=3)
# create new model with pretrained weight + rescaling at the top
inp = Input(input_dim)
scaled_input = Rescaling(1/255, 0.0, "rescaling")(inp)
out = model(scaled_input)
scaled_model = Model(inp, out)
# compare prediction with manual scaling vs layer scaling
pred = model.predict(X/255)
pred_scaled = scaled_model.predict(X)
(pred.round(5) == pred_scaled.round(5)).all() # True
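Once the predictions match, you could save scaled_model so that every consumer of the model gets the rescaling baked in (a sketch; the path is just an example):
# save in the SavedModel format; loading it back gives a model
# that accepts raw 0-255 images directly
scaled_model.save('scaled_model')
# loaded = tf.keras.models.load_model('scaled_model')
# loaded.predict(X)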
Rescaling the images is part of data preprocessing; it is also called image normalization. This process is useful for bringing the dataset's numerical values onto a uniform scale before building your model. In Keras you can do this in several ways, depending on your goal:
If you are training an artificial neural network model, you can use a Batch Normalization layer, a Layer Normalization layer, or the rescale method of Keras you mentioned. You can look at this resource for more information about normalization:
https://machinelearningknowledge.ai/keras-normalization-layers-explained-for-beginners-batch-normalization-vs-layer-normalization/
To use the rescale method you mentioned:
# import your libraries first
import tensorflow as tf
from tensorflow.keras.layers import BatchNormalization
# if you are using a dataset from a directory
import pathlib
Then import your dataset:
Dataset_Dir = '/Dataset/ path'
image_size = (256, 256)    # the image size in your dataset
image_shape = (96, 96, 3)  # the shape you wish for your images in your network
Then split your dataset into training and test sets (I use a 70/30 split):
Training_set = tf.keras.preprocessing.image_dataset_from_directory(
    Dataset_Dir, batch_size=32, image_size=image_size,
    validation_split=0.3, subset="training", seed=123)
Test set:
Testing_set = tf.keras.preprocessing.image_dataset_from_directory(
    Dataset_Dir, image_size=image_size,
    validation_split=0.3, subset="validation", seed=123)
Normalization layer:
normalization_layer = tf.keras.layers.experimental.preprocessing.Rescaling(1./255)
normalized_training_set = Training_set.map(lambda x, y: (normalization_layer(x), y))
training_image_batch,training_labels_batch = next(iter(normalized_training_set))
For more about this method, look at the TensorFlow tutorial:
https://www.tensorflow.org/tutorials/images/classification
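Alternatively (a minimal sketch with an arbitrary tiny architecture, not a tuned model), you can make the Rescaling layer the first layer of the model itself, so the dataset does not need to be mapped; the input shape here matches the image_size used above:
model = tf.keras.Sequential([
    tf.keras.layers.experimental.preprocessing.Rescaling(1./255, input_shape=(256, 256, 3)),
    tf.keras.layers.Conv2D(16, 3, activation='relu'),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(10)
])
# Training_set can now be passed to model.fit() without the map() step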
I am new to TensorFlow. I am trying to run a pre-trained NN for number recognition, 'wide_resnet_28_10', from GitHub: https://github.com/Curt-Park/handwritten_digit_recognition. When I try to predict an image, it says the input is expected to have 4 dimensions. This is what I tried:
from tensorflow.keras.models import load_model
import tensorflow as tf
import cv2
import numpy as np
model = load_model(r'C:\Users\sesha\Desktop\python\Deep learning NN\handwritten_digit_recognition-master\models\WideResNet28_10.h5')
image = cv2.imread(r'C:\Users\sesha\Desktop\python\Deep learning NN\test_org01.png')
img = tf.convert_to_tensor(image)
predictions = model.predict([img])
print(np.argmax(predictions))
Most tutorials are vague; I did try np.reshape(1, X, X, -1), but that didn't work.
For the 4D input, it expects batches of data. You can make it a 4D tensor by doing:
predictions = model.predict(tf.expand_dims(img, 0))
If this does not work, try predict_on_batch instead of predict.
Also: I don't think your image reading is correct as written. cv2.imread returns a BGR array (and None if the path is wrong), while the model most likely expects RGB input.
This should work:
path = tf.constant(img_path)
image = tf.io.read_file(path)
image = tf.io.decode_image(image)
image = tf.image.resize(image, (X, Y)) # if necessary
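Putting it together (a sketch; the 28x28 grayscale size is an assumption based on the repository training on MNIST-style digits, so adjust it to whatever the model summary reports):
import tensorflow as tf
import numpy as np

path = tf.constant(r'C:\Users\sesha\Desktop\python\Deep learning NN\test_org01.png')
image = tf.io.read_file(path)
image = tf.io.decode_png(image, channels=1)  # grayscale, like MNIST
image = tf.image.resize(image, (28, 28))     # resize to the expected input size
image = tf.expand_dims(image, 0)             # add the batch dimension -> 4D
predictions = model.predict(image)
print(np.argmax(predictions))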
Background:
I am trying to create a basic neural network to recognize hand-drawn images using the MNIST dataset. I have things working when training/predicting against the MNIST data.
Goal:
I would like to start applying the model to non-MNIST images (i.e. hand-drawn images that I create myself).
Problem:
Every prediction of hand-drawn images that I create has ended up being incorrect (which is odd because predictions against MNIST images are 95% accurate).
Code:
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np
import cv2
mnist = tf.keras.datasets.mnist # 28x28 images of handwritten digits (0-9)
(x_train, y_train), (x_test, y_test) = mnist.load_data()
x_train = tf.keras.utils.normalize(x_train, axis=1)
x_test = tf.keras.utils.normalize(x_test, axis=1)
model = tf.keras.models.Sequential()
model.add(tf.keras.layers.Flatten())
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(128, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(10, activation=tf.nn.softmax))
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(x_train, y_train, epochs=3)
val_loss, val_acc = model.evaluate(x_test, y_test)
print(val_loss, val_acc)
# prediction from MNIST dataset
index_of_mnist_img = 0
predictionsA = model.predict([x_test])
print(np.argmax(predictionsA[index_of_mnist_img]))
plt.imshow(x_test[index_of_mnist_img], cmap = plt.cm.binary)
plt.show()
# prediction from my own hand-drawn image (THIS IS WHERE THINGS START GOING WRONG)
img = cv2.imread('4.png')
img = cv2.resize(img, (28,28))
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img = np.reshape(img, [1,28,28])
predictionsB = model.predict(img)
print(np.argmax(predictionsB[0]))
plt.imshow(predictionsB[0])
plt.show()
Any ideas?
I believe that you need to invert the colormap for your new (hand-drawn) image.
When I look at MNIST example images, I see something like this:
# show mnist image
index_of_mnist_img = 0
plt.imshow(x_test[index_of_mnist_img], cmap = plt.cm.binary)
plt.show()
However, if I make an example hand-written digit, and read it in as you have, I see an inverted image.
img = cv2.imread("4.png")
img = cv2.resize(img, (28,28))
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
plt.imshow(img, cmap = plt.cm.binary)
You can invert the image with OpenCV by adding one line, cv2.bitwise_not().
img = cv2.imread(r"4.png")
img = cv2.resize(img, (28,28))
img = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
img = cv2.bitwise_not(img)  # invert image
plt.imshow(img, cmap = plt.cm.binary)
When I invert the image, then I get correct predictions from the neural network you have trained above.
img = np.reshape(img, (1, 28, 28))  # add the batch dimension the model expects
predictionsB = model.predict(img)
print(np.argmax(predictionsB[0]))
4
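One further note (my addition, not part of the original answer): the training data was scaled with tf.keras.utils.normalize(..., axis=1), so applying the same call to the reshaped hand-drawn image keeps both inputs on the same scale:
import tensorflow as tf

img = tf.keras.utils.normalize(img, axis=1)  # match the training normalization
predictionsB = model.predict(img)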
You also need to consider how the train/test split was created. The model can be overfitting: with that, you get good accuracy on the training data, but it does not hold up on test data.
It is also possible to use image augmentation to increase your dataset, because MNIST is a relatively small dataset, the digits are centered, the images have no noise, and so on.
Other concepts can be used too, like Dropout: the idea behind Dropout is that it removes a random number of neurons from your neural network during training. This works well because neighboring neurons often end up with similar weights, which can lead to overfitting; dropping some out at random counteracts this.
Another option is to use pooling layers to reduce the size of the representation, which helps the network detect more general features. A rough sketch of the first two ideas follows.
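As a sketch of both ideas (the dropout rate and augmentation ranges here are arbitrary, not tuned values):
import tensorflow as tf
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Dropout: randomly zero 20% of the units during each training step
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(input_shape=(28, 28)),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dropout(0.2),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Augmentation: small shifts/rotations make the model less sensitive
# to digits that are not perfectly centered
datagen = ImageDataGenerator(rotation_range=10,
                             width_shift_range=0.1,
                             height_shift_range=0.1)
# datagen.flow() expects a channels axis, e.g. x_train[..., None]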
I have just begun learning Machine learning and am using Tensorflow 1.14. I have just created my first model using tensorflow.keras using the inbuilt tensorflow.keras.datasets.mnist dataset. Here is the code for my model:
import tensorflow as tf
from tensorflow import keras
mnist = keras.datasets.mnist
(x_train, y_train), (x_test, y_test) = mnist.load_data()
class Stopper(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, log={}):
        if log.get('acc') >= 0.99:
            self.model.stop_training = True
            print('\nReached 99% Accuracy. Stopping Training...')
model = keras.Sequential([
    keras.layers.Flatten(),
    keras.layers.Dense(1024, activation=tf.nn.relu),
    keras.layers.Dense(10, activation=tf.nn.softmax)])
model.compile(
    optimizer=tf.train.AdamOptimizer(),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'])
x_train, x_test = x_train / 255, x_test / 255
model.fit(x_train, y_train, epochs=10, callbacks=[Stopper()])
Now that the model is trained, I can feed the x_test images into model.predict() and that works fine. But I was wondering how to feed my own images (JPG and PNG) into my model's predict() method?
I have looked at the documentation and their method results in an error for me. In particular I tried the following:
img_raw = tf.read_file(<my file path>)
img_tensor = tf.image.decode_image(img_raw)
img_final = tf.image.resize(img_tensor, [192, 192])
^^^ This line throws error 'ValueError: 'images' contains no shape.'
Please provide a step by step guide for getting an image (JPG and PNG) into my model for a prediction. Thank you very much.
import numpy as np
from PIL import Image
img = Image.open("image_file_path").convert('L').resize((28, 28), Image.ANTIALIAS)
img = np.array(img)
model.predict(img[None,:,:])
You have trained your model with images of size (28 x 28), so you have to resize your image to the same size; you cannot use images of a different dimension.
predict requires a batch of images, but since you want to make a prediction on a single image, you have to add an additional batch dimension for this single image. This is done with expand_dims, reshape, or img[None,:,:]:
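All three forms add the same leading batch axis; for example, with the 28x28 array from above:
x = img[None, :, :]              # shape (1, 28, 28)
x = np.expand_dims(img, axis=0)  # equivalent
x = img.reshape(1, 28, 28)       # equivalent
model.predict(x)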
Every image is fundamentally made of pixels; you can pass these pixel values to your neural network.
To convert the image into an array of pixels you can use libraries like skimage as follows.
from skimage.io import imread
imagedata=imread(imagepath)
#you can pass this image to the model
To read a group of images, loop over them and store the data in an array.
You will also have to resize the images so that they all have the same dimensions before loading them into your NN. Since imread returns a numpy array rather than a PIL image, use skimage.transform.resize for this:
from skimage.transform import resize
resized_image = resize(imagedata, (preferred_height, preferred_width))
You can also choose to convert the image to black and white to reduce the number of computations. I am using the Pillow library, a common image preprocessing library, to apply the black and white filter:
from PIL import Image
# load the image
image = Image.open('opera_house.jpg')
# convert the image to grayscale
gs_image = image.convert(mode='L')
The order of preprocessing can be:
1. load the image (with imread or Image.open)
2. convert it to black and white
3. resize it
4. convert it into a numpy array
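Put together, the whole pipeline could look like this (a sketch; 'opera_house.jpg' and the 28x28 target size are placeholders for your own file and your network's input size):
import numpy as np
from PIL import Image

image = Image.open('opera_house.jpg')   # 1. load the image
gs_image = image.convert(mode='L')      # 2. convert it to black and white
resized = gs_image.resize((28, 28))     # 3. resize it
pixels = np.array(resized)              # 4. convert it to a numpy array
model.predict(pixels[None, :, :])       # add the batch dimension and predict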