Error when getting features from tensorflow-dataset - python

I'm getting an error when attempting to load the Caltech tensorflow-dataset. I'm using the standard code found in the tensorflow-datasets GitHub.
The error is this:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot batch tensors with different shapes in component 0. First element had shape [204,300,3] and element 1 had shape [153,300,3]. [Op:IteratorGetNextSync]
The error points to the line `for features in ds_train.take(1):`
Code:
ds_train, ds_test = tfds.load(name="caltech101", split=["train", "test"])
ds_train = ds_train.shuffle(1000).batch(128).prefetch(10)
for features in ds_train.take(1):
    image, label = features["image"], features["label"]

The issue comes from the fact that the dataset contains variable-sized images (see the dataset description here). Tensorflow can only batch together things with the same shape, so you first need to either reshape the images to a common shape (e.g., the input shape of your network) or pad them accordingly.
If you want to resize, use tf.image.resize_images:
def preprocess(features, label):
    features['image'] = tf.image.resize_images(features['image'], YOUR_TARGET_SIZE)
    # Other possible transformations needed (e.g., converting to float, normalizing to [0,1])
    return features, label
If, instead, you want to pad, use tf.image.pad_to_bounding_box (just replace it in the above preprocess function and adapt the parameters as needed).
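If you go the padding route, a minimal sketch could look like this (TARGET_HEIGHT and TARGET_WIDTH are placeholder names and must be at least as large as every image in the dataset):
def preprocess_pad(features, label):
    # pad with zeros up to a fixed height/width instead of resizing
    features['image'] = tf.image.pad_to_bounding_box(
        features['image'], 0, 0, TARGET_HEIGHT, TARGET_WIDTH)
    return features, label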
Normally, for most of the networks I'm aware of, resizing is used.
Finally, map the function on your dataset:
ds_train = (ds_train
            .map(preprocess)
            .shuffle(1000)
            .batch(128)
            .prefetch(10))
Note: The specific shapes quoted in the error message vary between runs because of the shuffle call.


How to show picture from predictions? (CNN AI predictions)

I am trying to get the images from a list of predictions called 'classes'
classes = model.predict(test_set)
and matching each result with its picture is proving a little troublesome.
Below is what I have done to try and show the images with the result.
location = 2 #Iterate through list of predictions.
print(categories[np.argmax(classes[location])]) #Shows result from prediction
plt.figure()
plt.imshow(test_set[location]) # Shows image
The error I get doing this is:
"could not broadcast input array from shape (32,224,224,3) into shape (32,)"
If I do
plt.imshow(classes[location]) # Shows image
then I get: TypeError: Invalid shape (2,) for image data
Your image data may need to be processed to the appropriate format (see article). Try this:
plt.imshow(test_set[location].numpy().astype("uint8"))
Also, classes[location] is not an image, so you should not be passing it to plt.imshow; it is the prediction result. I suspect it is a softmax probability vector with two elements (perhaps you are attempting binary classification).
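If test_set is a batched tf.data.Dataset of (image, label) pairs (an assumption, since the question does not show how it was built), a sketch for displaying an image next to its predicted class could look like this:
import numpy as np
import matplotlib.pyplot as plt

images, labels = next(iter(test_set))   # one batch of images and labels
preds = model.predict(images)           # shape (batch_size, num_classes)

location = 2
plt.imshow(images[location].numpy().astype("uint8"))
plt.title(categories[np.argmax(preds[location])])   # predicted class name
plt.show()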

How to configure a tf.data.Dataset for variable size images?

I'm setting up an image data pipeline on TensorFlow 2.1. I'm using a dataset with RGB images of variable shapes (h, w, 3) and I can't find a way to make it work. I get the following error when I call tf.data.Dataset.batch():
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot batch tensors with different shapes in component 0. First element had shape [256,384,3] and element 3 had shape [160,240,3]
I found the padded_batch method but I don't want my images to be padded to the same shape.
EDIT:
I think that I found a little workaround to this by using the function tf.data.experimental.dense_to_ragged_batch (which converts the dense tensor representation to a ragged one):
Unlike tf.data.Dataset.batch, the input elements to be batched may have different shapes, and each batch will be encoded as a tf.RaggedTensor
But then I have another problem. My dataset contains images and their corresponding labels. When I use the function like this:
ds = ds.map(
    lambda x: tf.data.experimental.dense_to_ragged_batch(batch_size)
)
I get the following error because it tries to map the function to the entire dataset element (thus to both images and labels), which is not possible because it can only be applied to a single tensor (not two).
TypeError: <lambda>() takes 1 positional argument but 2 were given
Is there a way to specify which of the two elements I want the transformation to be applied to?
I just hit the same problem. The solution turned out to be loading the data as two datasets and then using Dataset.zip() to merge them.
dataset_images = dataset.map(parse_images, num_parallel_calls=tf.data.experimental.AUTOTUNE)
dataset_images = dataset_images.apply(
    tf.data.experimental.dense_to_ragged_batch(batch_size=batch_size, drop_remainder=True))
dataset_total_cost = dataset.map(get_total_cost)
dataset_total_cost = dataset_total_cost.batch(batch_size, drop_remainder=True)
dataset = tf.data.Dataset.zip((dataset_images, dataset_total_cost))
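As a hedged usage sketch, each element of the zipped dataset should now be a pair of a ragged image batch and a cost batch:
for ragged_images, costs in dataset.take(1):
    print(ragged_images.shape)   # e.g. (batch_size, None, None, 3)
    print(costs.shape)           # depends on what get_total_cost returns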
If you do not want to resize your images, you can only use a batch size of 1, training your model one image at a time. The error you reported clearly says that you are using a batch size bigger than 1 and trying to put two images of different shapes into one batch. You could either resize your images to a fixed shape (or pad them), or use a batch size of 1 as follows:
my_data = tf.data.Dataset(....) # with whatever arguments you use here
my_data = my_data.batch(1)

Tensorflow applying operations inside a model: FailedPreconditionError

Say I have a CNN model that outputs N probability maps as masks, each the same size as the input image, in a U-Net-like fashion. I would then want to apply, for example, a least-squares fit on top of each mask to get coefficients for functions as output instead, and use these to calculate my model's loss.
def unet_model(...):
    # init unet model
    ...
    ...
    # final layer
    mask_out = layers.Conv2D(output_channels, (1,1), activation='softmax')(conv9)
    # start applying e.g. least squares fit here
    eq_list = tf.Variable((x_map, y_map, mask_out))
    transp = tf.transpose(a)
The transp line raises the following error when I initialize the model. I have tested the least-squares fit operations elsewhere.
FailedPreconditionError: Error while reading resource variable _AnonymousVar1423 from Container: localhost. This could mean that the variable was uninitialized. Not found: Resource localhost/_AnonymousVar1423/N10tensorflow3VarE does not exist. name: transpose/
My assumption is that transpose cannot deal with the placeholder axis used for the batch size, but I am generally clueless about this.
Before combining the variables, I needed to make sure that x_map and y_map are also batched by expanding their dims with axis -1.
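A minimal sketch of that fix (the axis choice depends on your actual tensor shapes; x_map and y_map are assumed here to be missing the trailing axis):
# give x_map and y_map an extra trailing axis so they line up with mask_out
x_map = tf.expand_dims(x_map, axis=-1)
y_map = tf.expand_dims(y_map, axis=-1)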

Format multiple inputs with multiple categories for a functional Keras model and feed them to the model

I can't figure out how to correctly feed training data to a functional keras model. I have two input types: Image data and float numbers, each number belonging to one image. This data is classified into 6 classes. How do I need to format my input data and how do I need to define it in my keras network?
The image data is analyzed by a CNN and should then be concatenated with the float numbers. Afterwards, three dense layers are used for classification. There doesn't seem to be an example or tutorial that is similar to my problem.
Two separate inputs:
imageInput = Input(image_shape) #often, image_shape is (pixelsX, pixelsY, channels)
floatInput = Input(float_shape) #if one number per image, shape is: (1,)
The convolutional part:
convOut = SomeConvLayer(...)(imageInput)
convOut = SomeConvLayer(...)(convOut)
#...
convOut = SomeConvLayer(...)(convOut)
If necessary, do something similar with the other input.
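For instance, a small dense branch on the float input might look like this (the layer size is made up):
#optional: give the float input its own small branch before joining
floatOut = Dense(8, activation='relu')(floatInput)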
Joining the two branches:
#Please make sure you use compatible shapes
#You should probably not have spatial dimensions anymore at this point
#Probably some kind of GlobalPooling:
convOut = GlobalMaxPooling2D()(convOut)
#concatenate the values:
joinedOut = Concatenate()([convOut,floatInput])
#or some floatOut if there were previous layers in the float side
Do more stuff with your joined output:
joinedOut = SomeStuff(...)(joinedOut)
joinedOut = Dense(6, ...)(joinedOut)
Create the model with two inputs:
model = Model([imageInput,floatInput], joinedOut)
Train with:
model.fit([X_images, X_floats], classes, ...)
Where classes is a "one-hot encoded" tensor containing the correct class(es) for each image.
There isn't "one correct solution", though. You could try a lot of different things, such as "adding the number" somewhere in the middle of the convolutions, or multiplying it, or creating more convolutions after you manage to concatenate the values somehow.... this is art.
The input data
The input and output data should be numpy arrays.
The arrays should be shaped as:
- Image input: `(number_of_images, side1, side2, channels)`
- Floats input: `(number_of_images, number_of_floats_per_image)`
- Outputs: `(number_of_images, number_of_classes)`
Keras will infer everything necessary from these shapes; row 0 in all arrays will be image 0, row 1 will be image 1, and so on.
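For illustration only, with made-up sizes the arrays might look like this:
import numpy as np

X_images = np.zeros((500, 64, 64, 3), dtype="float32")  # 500 RGB images of 64x64
X_floats = np.zeros((500, 1), dtype="float32")           # one float per image
classes  = np.zeros((500, 6), dtype="float32")           # one-hot labels over 6 classes

model.fit([X_images, X_floats], classes, epochs=10, batch_size=32)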

Tensorflow doesn't allow for training data to be used in (features,targets) style?

I have data that is an ndarray containing features and targets, each with different dimensions. This seems to give TensorFlow problems.
If I have a function:
def cost(self, data):
    return self.sess.run(self.cost_function, feed_dict={self.input: data[:,0], self.targets: data[:,1]})
This results in ValueError: setting an array element with a sequence.
It seems to be because feed_dict is not recognizing my inputs as numpy arrays. I think this is because features and targets have different dimensions and ndarray has problems with that; if I have data with 100 pairs then its shape is (100,2). If I then slice it: data[:,0].shape=(100,) and data[:,1].shape=(100,), so the length of the feature/target vectors is not recognized even after slicing.
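A small sketch of what is going on (the construction of data is an assumption about how it was built; the key point is that mixed-length rows force dtype=object):
import numpy as np

data = np.empty((100, 2), dtype=object)
for i in range(100):
    data[i, 0] = np.zeros(39)    # feature vector
    data[i, 1] = np.zeros(949)   # target vector

print(data[:, 0].shape)            # (100,) -- an object array, not (100, 39)
print(np.stack(data[:, 0]).shape)  # (100, 39) -- np.stack recovers the 2-D shape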
I've got around the problem by splitting data up beforehand into feats and targs, whose shapes are returned correctly.
My question is: is this normal, is it supposed to be like this, or am I just doing something wrong? It would be nicer to just work with data instead of passing two variables around all the time.
edit:
self.input = tf.placeholder("float",shape=[None,39])
self.targets = tf.placeholder("float",shape=[None,949])
The dimensions of data should be self-explanatory.
