I am trying to match images with their entries in a list of predictions called classes:
classes = model.predict(test_set)
but matching each result with its picture is proving a little troublesome. Below is what I have done to try to show the images together with the results.
location = 2 # Index into the list of predictions; change this to step through them.
print(categories[np.argmax(classes[location])]) #Shows result from prediction
plt.figure()
plt.imshow(test_set[location]) # Shows image
The error I get doing this is:
"could not broadcast input array from shape (32,224,224,3) into shape (32,)"
if I do
plt.imshow(classes[location]) # Shows image
then I get: TypeError: Invalid shape (2,) for image data
Your image data may need to be converted to the appropriate format. Try this:
plt.imshow(test_set[location].numpy().astype("uint8"))
Also, classes[location] is not an image, so you should not pass it to plt.imshow. It is the prediction result; given its shape of (2,), I suspect it is a softmax probability vector with two elements (perhaps you are attempting binary classification).
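Putting the two together, here is a minimal sketch (assuming, as in your code, that categories is the ordered list of class names and that test_set supports indexing):

import numpy as np
import matplotlib.pyplot as plt

location = 2 # index of the prediction/image pair to inspect
predicted = categories[np.argmax(classes[location])] # most probable class name
plt.figure()
plt.imshow(test_set[location].numpy().astype("uint8")) # show the image
plt.title(predicted) # label the plot with the predicted class
plt.show()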
Recently I have been learning TensorFlow, and I have written a few machine learning programs. However, I am wondering how I can test the model on a single input and receive its prediction, rather than just evaluating the model's accuracy on a lot of data as you would when using model.fit(). I am also wondering how I can then use the model in a script that, for example, gathers data automatically, feeds it into the model to obtain predictions, and then plots the results on a graph.
Thanks in advance.
To use your trained model on a single input, let's call it y, you must process y into the same data format your model was trained on. For example, assume you trained a model on images of cats and dogs. If your model trained properly, you should be able to submit a picture of a cat or a dog to it and have it tell you which it is.
If images were the input used to train the model, they had a certain shape (height, width) and a certain channel format, for example RGB or grayscale. So the image y you want to predict on must have the same height and width the model was trained on, and if the model was trained on RGB images, y must be an RGB image. One more thing: model.predict requires the first dimension of its input to be the batch size. For a single image the batch size is 1, so you need to expand the dimensions of y to include it. An image y has shape (height, width, channels); it has no batch dimension, so you add one with
y = np.expand_dims(y, axis=0), which gives y the shape (1, height, width, channels). For example, assume you trained your model on RGB images of shape (224,224,3). You have an image y you want to classify, and say it is in a directory my_pics. The code below shows how to run a prediction on y. Somewhere in your training code you need an ordered list called classes; for the cat/dog example the index for cat might be 0 and the index for dog 1, so classes=['cat', 'dog'].
import cv2
import numpy as np
import tensorflow as tf

model = tf.keras.models.load_model('path/to/saved/model') # load the trained model
image_path = r'my_pics' # path to image y
y = cv2.imread(image_path) # note: cv2 reads images in BGR order
y = cv2.resize(y, (224, 224)) # give y the same shape as the training images
y = cv2.cvtColor(y, cv2.COLOR_BGR2RGB) # convert from BGR to RGB
y = np.expand_dims(y, axis=0) # y now has shape (1,224,224,3)
prediction = model.predict(y) # make a prediction on y
print(prediction) # an array with a probability value for each class
class_index = np.argmax(prediction) # index of the entry with the highest probability
klass = classes[class_index] # select the class name from the ordered list of classes
print(klass)
I'm setting up an image data pipeline on TensorFlow 2.1. I'm using a dataset with RGB images of variable shapes (h, w, 3) and I can't find a way to make it work. I get the following error when I call tf.data.Dataset.batch():
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot batch tensors with different shapes in component 0. First element had shape [256,384,3] and element 3 had shape [160,240,3]
I found the padded_batch method but I don't want my images to be padded to the same shape.
EDIT:
I think I found a little workaround by using the function tf.data.experimental.dense_to_ragged_batch (which converts the dense tensor representation to a ragged one):
Unlike tf.data.Dataset.batch, the input elements to be batched may have different shapes, and each batch will be encoded as a tf.RaggedTensor.
But then I have another problem. My dataset contains images and their corresponding labels. When I use the function like this:
ds = ds.map(
lambda x: tf.data.experimental.dense_to_ragged_batch(batch_size)
)
I get the following error, because it tries to map the function to the entire dataset element (thus to both images and labels), which is not possible because it can only be applied to a single tensor (not two).
TypeError: <lambda>() takes 1 positional argument but 2 were given
Is there a way to specify which of the two elements I want the transformation to be applied to?
I just hit the same problem. The solution turned out to be loading the data as two datasets and then using tf.data.Dataset.zip() to merge them.
dataset_images = dataset.map(parse_images, num_parallel_calls=tf.data.experimental.AUTOTUNE)
dataset_images = dataset_images.apply(
    tf.data.experimental.dense_to_ragged_batch(batch_size=batch_size, drop_remainder=True))
dataset_total_cost = dataset.map(get_total_cost)
dataset_total_cost = dataset_total_cost.batch(batch_size, drop_remainder=True)
dataset = tf.data.Dataset.zip((dataset_images, dataset_total_cost))
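Iterating over the zipped dataset then yields pairs of a ragged image batch and a cost batch; a quick check might look like this:

for image_batch, cost_batch in dataset.take(1):
    print(type(image_batch)) # a tf.RaggedTensor holding variable-shape images
    print(cost_batch.shape) # (batch_size,)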
If you do not want to resize your images, you can only use a batch size of 1, training your model one image at a time. The error you report says that you are using a batch size bigger than 1 and trying to put two images of different shapes in the same batch. You could either resize your images to a fixed shape (or pad them), or use a batch size of 1 as follows:
my_data = tf.data.Dataset.from_generator(....) # or however you construct your dataset of variable-size images
my_data = my_data.batch(1)
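Assuming my_data yields (image, label) pairs, you can then train as usual; each training step will process a single image:

model.fit(my_data, epochs=10) # the batch size of 1 is already baked into the dataset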
I've taken a quick course in neural networks to better understand them, and now I'm trying them out for myself in R. I'm following this Keras documentation.
The way I understand what is happening:
We are inputting a series of images and transforming them into numerical matrices based on the arrangement of the pixels and the colors in those pixels. We then build a neural network model to learn the pattern of these arrangements, depending on the classification (0 to 9). We then use the model to predict which class an image belongs to. I'll be honest and admit I'm not entirely sure what y_train and x_train are. I simply see them as one training and one validation set, so I'm not sure what the difference between x and y is.
My question:
I've followed the steps to a T, the model runs fine, and the predictions look like they do in the documentation. Ultimately, the prediction looks like this:
I take this to mean that observation 1 in x_test is predicted to be category 7.
However, looking at x_test, it looks like this:
There is a 0 in every column and row, even if I scroll further down. This is where I get confused. I'm also not sure how to view the original images to see for myself how well the model is predicting them. I would eventually like to draw a number myself in Paint or similar and then see if the model can predict it, but for that I first need to understand what is going on. I feel I am close; I just need a little nudge!
I think it would help to read more about the dimensions of the input and output layers.
In your example:
Input layer:
A single training image has two dimensions, 28*28, which is flattened into a single vector of dimension 784. This acts as the input layer for the neural network.
So for m training examples, your input layer will have dimensions (m, 784). By analogy with traditional ML systems, you can imagine that each pixel of an image is converted into a feature (x1, x2, ... x784), and your training set is a dataframe with m rows and 784 columns, which is then fed into the neural network to compute y_hat = f(x1,x2,x3,...x784).
Output layer:
As the output of our neural network, we want it to predict which number the image shows, from 0 to 9. So for a single training example the output layer has dimension 10, representing each number from 0 to 9, and for n testing examples the output layer is a matrix with dimension n*10.
Our y is a vector of length n containing the true value for each testing example, something like [1,7,8,2,.....]. But to match the dimension of the output layer, the y vector is converted using one-hot encoding: imagine a length-10 vector representing the number 7 by putting a 1 at index 7 (counting from 0) and zeros everywhere else, i.e. [0,0,0,0,0,0,0,1,0,0].
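In Python Keras terms (this question uses R, so treat this purely as an illustration of the shapes involved), the flattening and one-hot steps might look like:

import numpy as np
from tensorflow.keras.utils import to_categorical

# x_train: (m, 28, 28) images -> (m, 784) feature matrix
x_train = x_train.reshape(-1, 28 * 28).astype("float32") / 255.0
# y_train: (m,) integer labels 0-9 -> (m, 10) one-hot matrix,
# e.g. the label 7 becomes [0,0,0,0,0,0,0,1,0,0]
y_train = to_categorical(y_train, num_classes=10)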
So in your question, if you wish to see an original image, you should be able to view it before reshaping the training examples, with something like image(mnist$test$x[1, , ]).
Hope this helps!!
y_train contains the labels and x_train contains the training data, i.e. the images in this example. You need to use some kind of plotting library to plot the x values. In this example you are probably not expected to input your own drawings; if you want to, you would need to preprocess them in the same way as MNIST and pass them to the model.
I'm getting an error when attempting to load the Caltech TensorFlow dataset. I'm using the standard code found in the tensorflow-datasets GitHub repository.
The error is this:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot batch tensors with different shapes in component 0. First element had shape [204,300,3] and element 1 had shape [153,300,3]. [Op:IteratorGetNextSync]
The error points to the line for features in ds_train.take(1)
Code:
ds_train, ds_test = tfds.load(name="caltech101", split=["train", "test"])
ds_train = ds_train.shuffle(1000).batch(128).prefetch(10)
for features in ds_train.take(1):
    image, label = features["image"], features["label"]
The issue comes from the fact that the dataset contains variable-sized images (see the dataset description here). TensorFlow can only batch together tensors with the same shape, so you first need to either resize the images to a common shape (e.g., the input shape of your network) or pad them accordingly.
If you want to resize, use tf.image.resize_images:
def preprocess(features):
    # features is the dict yielded by the dataset, with "image" and "label" keys
    features['image'] = tf.image.resize_images(features['image'], YOUR_TARGET_SIZE)
    # Other transformations may be needed (e.g., converting to float, normalizing to [0,1])
    return features
If, instead, you want to pad, use tf.image.pad_to_bounding_box (just replace it in the above preprocess function and adapt the parameters as needed).
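For example, a padding version might look like this sketch, where MAX_HEIGHT and MAX_WIDTH are placeholders for the largest image dimensions you expect in the dataset:

def preprocess_pad(features):
    # place the image in the top-left corner of a fixed-size canvas
    features['image'] = tf.image.pad_to_bounding_box(features['image'], 0, 0, MAX_HEIGHT, MAX_WIDTH)
    return features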
Normally, for most of the networks I'm aware of, resizing is used.
Finally, map the function on your dataset:
ds_train = (ds_train
.map(preprocess)
.shuffle(1000)
.batch(128)
.prefetch(10))
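After mapping, batching should succeed; a quick sanity check (the height and width below depend on your YOUR_TARGET_SIZE) might be:

for features in ds_train.take(1):
    print(features["image"].shape) # (128, target_height, target_width, 3)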
Note: the specific shapes reported in the error message vary from run to run because of the shuffle call.
I can't figure out how to correctly feed training data to a functional Keras model. I have two input types: image data and float numbers, each number belonging to one image. This data is classified into 6 classes. How do I need to format my input data, and how do I need to define it in my Keras network?
The image data is analyzed by a CNN and should then be concatenated with the float numbers. Afterwards, three dense layers are used for classification. There doesn't seem to be an example or tutorial that is similar to my problem.
Two separate inputs:
imageInput = Input(image_shape) #often, image_shape is (pixelsX, pixelsY, channels)
floatInput = Input(float_shape) #if one number per image, shape is: (1,)
The convolutional part:
convOut = SomeConvLayer(...)(imageInput)
convOut = SomeConvLayer(...)(convOut)
#...
convOut = SomeConvLayer(...)(convOut)
If necessary, do something similar with the other input.
Joining the two branches:
#Please make sure you use compatible shapes
#You should probably not have spatial dimensions anymore at this point
#Probably some kind of GlobalPooling:
convOut = GlobalMaxPooling2D()(convOut)
#concatenate the values:
joinedOut = Concatenate()([convOut,floatInput])
#or some floatOut if there were previous layers in the float side
Do more stuff with your joined output:
joinedOut = SomeStuff(...)(joinedOut)
joinedOut = Dense(6, ...)(joinedOut)
Create the model with two inputs:
model = Model([imageInput,floatInput], joinedOut)
Train with:
model.fit([X_images, X_floats], classes, ...)
Where classes is a "one-hot encoded" tensor containing the correct class(es) for each image.
There isn't "one correct solution", though. You could try a lot of different things, such as "adding the number" somewhere in the middle of the convolutions, or multiplying it, or creating more convolutions after you manage to concatenate the values somehow.... this is art.
The input data
The input and output data should be numpy arrays.
The arrays should be shaped as:
- Image input: `(number_of_images, side1, side2, channels)`
- Floats input: `(number_of_images, number_of_floats_per_image)`
- Outputs: `(number_of_images, number_of_classes)`
Keras will know everything necessary from these shapes; row 0 in all arrays will be image 0, row 1 will be image 1, and so on.
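For concreteness, here is a minimal end-to-end sketch under these assumptions (the layer sizes and the 64x64 image shape are illustrative, not prescriptive):

import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Input, Conv2D, GlobalMaxPooling2D, Concatenate, Dense
from tensorflow.keras.models import Model

image_input = Input(shape=(64, 64, 3)) # (pixelsX, pixelsY, channels)
float_input = Input(shape=(1,)) # one float per image

conv_out = Conv2D(32, 3, activation='relu')(image_input)
conv_out = Conv2D(64, 3, activation='relu')(conv_out)
conv_out = GlobalMaxPooling2D()(conv_out) # remove the spatial dimensions

joined_out = Concatenate()([conv_out, float_input])
joined_out = Dense(64, activation='relu')(joined_out)
joined_out = Dense(6, activation='softmax')(joined_out) # 6 classes

model = Model([image_input, float_input], joined_out)
model.compile(optimizer='adam', loss='categorical_crossentropy')

# dummy data shaped as described above
X_images = np.random.rand(20, 64, 64, 3)
X_floats = np.random.rand(20, 1)
classes = tf.keras.utils.to_categorical(np.random.randint(0, 6, 20), 6)
model.fit([X_images, X_floats], classes, epochs=1)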