TF Keras how to get expected input shape when loading a model? - python

Is it possible to get the expected input shape from a 'model.h5' file?
I have two models for the same dataset but with different options and shapes. The first one expects an input shape of (None, 64, 48, 1) and the second model needs an input shape of (None, 128, 96, 3). (Note: the width and height are not fixed and can change when I train again.)
The channels problem was easy to "fix" (or rather bypass) by just using try: and except:, because there are only two options (1 for a grayscale image and 3 for an RGB image):
channels = self.df["channels"][0]
file = ""
try:
    images, src_images, data = self.get_images()
    images = self.preprocess_data(images, channels)
    predictions, file = self.load_model(images, file)
    self.predict_data(src_images, predictions, data)
except:
    if channels == 1:
        print("Except channels =", channels)
        channels = 3
        images, src_images, data = self.get_images()
        images = self.preprocess_data(images, channels)
        predictions = self.load_model(images, file)
        self.predict_data(src_images, predictions, data)
    else:
        channels = 1
        print("Except channels =", channels)
        images, src_images, data = self.get_images()
        images = self.preprocess_data(images, channels)
        predictions = self.load_model(images, file)
        self.predict_data(src_images, predictions, data)
This workaround, however, cannot be used for the width and height of an image, because there is basically an unlimited number of options.
Besides that, it is rather slow, because I read and preprocess all the data twice for no reason.
Is there a way to load the model.h5 file and print the expected input shape in a form like this?:
[None, 128, 96, 3]

I finally found the answer myself.
config = model.get_config()  # Returns pretty much all the information about your model
print(config["layers"][0]["config"]["batch_input_shape"])  # Returns a tuple like (batch size, height, width, channels)
This will output the following:
(None, 128, 96, 3)

I found the answer from here to be more concise:
model.layers[0].input_shape[0]
The way I understand it, it should be easier to deal with multiple inputs that way too.
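For completeness, a minimal sketch of how the two answers above fit together when starting from a saved model.h5 file (the file name is just an assumption, and depending on the Keras version layers[0].input_shape is either a tuple or a list of tuples, hence the [0] used above):
import tensorflow as tf

# Load the saved model from disk (hypothetical file name).
model = tf.keras.models.load_model("model.h5")

# Option 1: read the stored batch input shape from the model config.
config = model.get_config()
print(config["layers"][0]["config"]["batch_input_shape"])  # e.g. (None, 128, 96, 3)

# Option 2: ask the first layer directly.
print(model.layers[0].input_shape)  # e.g. (None, 128, 96, 3) or [(None, 128, 96, 3)]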

Related

Resizing a numpy array to 224x224 for VGG16 Model

I am solving a multiview classification problem using a pretrained VGG16 model. In my case, I have 4 views as inputs and they are of size (64, 64, 3), but VGG16 uses an input size of (224, 224, 3).
For solving the problem, I am supposed to create my own data loader instead of using quick built-in methods like Keras load_img() or OpenCV imread(), so I am doing all of this with plain numpy arrays.
I am trying to resize my input from 64x64 to 224x224, but I am unable to do it; it keeps throwing one error or another. This is my code for the data loader:
def data_loader(dataframe, classDict, basePath, batch_size=16):
    while True:
        x_batch = np.zeros((batch_size, 4, 64, 64, 3))  # Create a zeros array for images
        y_batch = np.zeros((batch_size, 20))  # Create a zeros array for classes
        for i in range(0, batch_size):
            rndNumber = np.random.randint(len(dataframe))
            *images, class_id = dataframe.iloc[rndNumber]
            for j in range(4):
                x_batch[i, j] = plt.imread(os.path.join(basePath, images[j])) / 255.
                # x_batch[i, j] = x_batch[i, j].resize(1, 224, 224, 3)  # <--- Try(1)
            class_id = classDict[class_id]
            y_batch[i, class_id] = 1.0
        # yield {'image1': np.resize(x_batch[:, 0], (batch_size, 224, 224, 3)),  # <--- Try(2)
        #        'image2': np.resize(x_batch[:, 1], (1, 224, 224, 3)),
        #        'image3': np.resize(x_batch[:, 2], (1, 224, 224, 3)),
        #        'image4': np.resize(x_batch[:, 3], (1, 224, 224, 3))}, {'class_out': y_batch}
        # 'yield' is a keyword that is used like return, except the function will return a generator
        yield {'image1': x_batch[:, 0],
               'image2': x_batch[:, 1],
               'image3': x_batch[:, 2],
               'image4': x_batch[:, 3]}, {'class_out': y_batch}

## Testing the data loader
example, lbl = next(data_loader(df_train, classDictTrain, basePath))
print(example['image1'].shape)  # example['image1'][0].shape
print(lbl['class_out'].shape)
I have made several attempts at resizing the images. I am listing them below with the error message I receive for each try:
Try (1): Using x_batch[i,j] = x_batch[i,j].resize(1, 224, 224, 3) >> Error: ValueError: cannot resize this array: it does not own its data
Try (2): Using yield {'image1': np.resize(x_batch[:, 0], (batch_size, 224, 224, 3)), ....... } >> The output shape is (16, 224, 224, 3), which seems fine, but when I plot it the result is an image like this,
where I need the original image just bigger in size, like this.
Please tell me what am I doing wrong and how can I fix it?
If I understand your problem correctly, you have an image which is 64x64, and you want to upscale it to a resolution of 224x224. Notice that the latter resolution contains many more pixels, so you cannot simply force a reshape, because the original image has far fewer pixels.
You have to upsample the image, generating the missing pixels. A tool you can try is PIL's resize function, which can be used with different resampling filters.
As far as I know, numpy does not easily support upscaling filters. Check out this post to understand how to convert a PIL image to a numpy array and you are ready to go.
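A minimal sketch of that idea, applied to a single view from the loader above (the BILINEAR filter and the float conversion are assumptions; any PIL resampling filter can be used):
import numpy as np
from PIL import Image

def upscale_to_224(img_64):
    # Upsample a (64, 64, 3) float array in [0, 1] to (224, 224, 3).
    pil_img = Image.fromarray((img_64 * 255).astype(np.uint8))  # PIL expects uint8 pixels
    pil_img = pil_img.resize((224, 224), resample=Image.BILINEAR)  # generate the missing pixels
    return np.asarray(pil_img).astype(np.float32) / 255.

# In the loader, x_batch would then be allocated as np.zeros((batch_size, 4, 224, 224, 3))
# and filled with x_batch[i, j] = upscale_to_224(plt.imread(os.path.join(basePath, images[j])) / 255.)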

The well-defined dimension of a tf.tensor is inexplicably `None`

The example below is extracted from the official TensorFlow tutorial on data pipelines. Basically, one resizes a bunch of JPGs to be (128, 128, 3). For some reason, when applying the map() operation, the colour dimension, namely 3, is turned into None when examining the shape of the dataset. Why is that third dimension singled out? (I checked to see if there were any images that weren't (128, 128, 3) but didn't find any.)
If anything, None should only show up for the very first dimension, i.e. the one that counts the number of examples, and should not affect the individual dimensions of the examples, since, as nested structures, they are supposed to have the same shape anyway so that they can be stored as tf.data.Datasets.
The code in TensorFlow 2.1 is
import pathlib
import tensorflow as tf

# Download the files.
flowers_root = tf.keras.utils.get_file(
    'flower_photos',
    'https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz',
    untar=True)
flowers_root = pathlib.Path(flowers_root)

# Compile the list of files.
list_ds = tf.data.Dataset.list_files(str(flowers_root/'*/*'))

# Reshape the images.
# Reads an image from a file, decodes it into a dense tensor, and resizes it
# to a fixed shape.
def parse_image(filename):
    parts = tf.strings.split(file_path, '\\')  # Use the forward slash on Linux
    label = parts[-2]
    image = tf.io.read_file(filename)
    image = tf.image.decode_jpeg(image)
    image = tf.image.convert_image_dtype(image, tf.float32)
    image = tf.image.resize(image, [128, 128])
    print("Image shape:", image.shape)
    return image, label

print("Map the parse_image() on the first image only:")
file_path = next(iter(list_ds))
image, label = parse_image(file_path)

print("Map the parse_image() on the whole dataset:")
images_ds = list_ds.map(parse_image)
and yields
Map the parse_image() on the first image only:
Image shape: (128, 128, 3)
Map the parse_image() on the whole dataset:
Image shape: (128, 128, None)
Why None in that last line?
From the tutorial you are missing this part:
for image, label in images_ds.take(5):
    show(image, label)
The line
images_ds = list_ds.map(parse_image)
only creates a placeholder, and no image is passed to the function; if you add prints, the file_path is blank. But if you use
for image, label in images_ds.take(5):
it iterates over each image, passing it through the parse_image() function.
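A small sketch of that check (take(2) and the print format are only for illustration):
# map() only builds the graph; shapes become concrete once the dataset is iterated.
for image, label in images_ds.take(2):
    print(label.numpy(), image.shape)  # e.g. b'roses' (128, 128, 3)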

What's the cleanest and most efficient way to pass two stereo images to a loss function in Keras?

First off, why am I using Keras? I'm trying to stay as high level as possible, which doesn't mean I'm scared of low-level Tensorflow; I just want to see how far I can go while keeping my code as simple and readable as possible.
I need my Keras model (custom-built using the Keras functional API) to read the left image from a stereo pair and minimize a loss function that needs to access both the right and left images. I want to store the data in a tf.data.Dataset.
What I tried:
1. Reading the dataset as (left image, right image), i.e. as tensors with shape ((W, H, 3), (W, H, 3)), then using a function closure: define a keras_loss(left_images) that returns a loss(y_true, y_pred), with y_true being a tf.Tensor that holds the right image. The problem with this approach is that left_images is a tf.data.Dataset and TensorFlow complains (rightly so) that I'm trying to operate on a dataset instead of a tensor.
2. Reading the dataset as (left image, (left image, right image)), which should make y_true a tf.Tensor with shape ((W, H, 3), (W, H, 3)) that holds both the right and left images. The problem with this approach is that it... does not work and raises the following error:
ValueError: Error when checking model target: the list of Numpy arrays
that you are passing to your model is not the size the model expected.
Expected to see 1 array(s), for inputs ['tf_op_layer_resize/ResizeBilinear']
but instead got the following list of 2 arrays: [<tf.Tensor 'args_1:0'
shape=(None, 512, 256, 3) dtype=float32>, <tf.Tensor 'args_2:0'
shape=(None, 512, 256, 3) dtype=float32>]...
So, is there anything I did not consider? I read the documentation and found nothing about what gets considered as y_pred and what as y_true, nor about how to convert a dataset into a tensor smartly and without loading it all in memory.
My model is designed as such:
def my_model(input_shape):
    width = input_shape[0]
    height = input_shape[1]
    inputs = tf.keras.Input(shape=input_shape)
    # < a few more layers >
    outputs = tf.image.resize(
        tf.nn.sigmoid(tf.slice(disp6, [0, 0, 0, 0], [-1, -1, -1, 2])),
        tf.Variable([width, height]))
    model = tf.keras.Model(inputs=inputs, outputs=outputs)
    return model
And my dataset is built as follows (this is case 2; in case 1 only the function read_stereo_pair_from_line() changes):
def read_img_from_file(file_name):
    img = tf.io.read_file(file_name)
    # Convert the compressed string to a 3D uint8 tensor.
    img = tf.image.decode_png(img, channels=3)
    # Use `convert_image_dtype` to convert to floats in the [0, 1] range.
    img = tf.image.convert_image_dtype(img, tf.float32)
    # Resize the image to the desired size.
    return tf.image.resize(img, [args.input_width, args.input_height])

def read_stereo_pair_from_line(line):
    split_line = tf.strings.split(line, ' ')
    return read_img_from_file(split_line[0]), (read_img_from_file(split_line[0]),
                                               read_img_from_file(split_line[1]))

# Dataset loading
list_ds = tf.data.TextLineDataset('test/files.txt')
images_ds = list_ds.map(lambda x: read_stereo_pair_from_line(x))
images_ds = images_ds.batch(1)
Solved. I just needed to read the dataset as (left image, [left image, right image]) instead of (left image, (left image, right image)), i.e. make the second item a list and not a tuple. I can then access the images as input_r = y_true[:, 1, :, :] and input_l = y_true[:, 0, :, :].
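A minimal sketch of that change, based on the loader above (the loss body is only a placeholder to show where the two slices are used):
def read_stereo_pair_from_line(line):
    split_line = tf.strings.split(line, ' ')
    left = read_img_from_file(split_line[0])
    right = read_img_from_file(split_line[1])
    # The second item is a list, not a tuple, so Keras treats it as a single stacked target.
    return left, [left, right]

def stereo_loss(y_true, y_pred):
    input_l = y_true[:, 0, :, :]  # left image
    input_r = y_true[:, 1, :, :]  # right image
    # Placeholder: compute the real reconstruction loss from y_pred, input_l and input_r here.
    return tf.reduce_mean(tf.square(y_pred))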

Display output of vgg19 layer as image

I was reading this paper: Neural Style Transfer. In this paper the author reconstructs images from the outputs of the layers of VGG19. I am using Keras. The size of the output of the block1_conv1 layer is (1, 400, 533, 64). Here 1 is the number of images in the input, 400 is the number of rows, 533 the number of columns and 64 the number of channels. When I try to reconstruct it as an image, I get an error because the size of the image is 13644800, which is not a multiple of 3, so I can't display the image in three channels. How can I reconstruct this image?
I want to reconstruct images from layers as shown below:
Below is the code for the same:
from keras.preprocessing.image import load_img, img_to_array
from scipy.misc import imsave
import numpy as np
import os
from keras.applications import vgg19
from keras import backend as K

CONTENT_IMAGE_FN = ...  # store the path to the input image here

def preprocess_image(image_path):
    img = load_img(image_path, target_size=(img_nrows, img_ncols))
    img = img_to_array(img)
    img = np.expand_dims(img, axis=0)
    img = vgg19.preprocess_input(img)
    return img

width, height = load_img(CONTENT_IMAGE_FN).size
img_nrows = 400
img_ncols = int(width * img_nrows / height)
base_image = K.variable(preprocess_image(CONTENT_IMAGE_FN))

RESULT_DIR = "generated/"
RESULT_PREFIX = RESULT_DIR + "gen"
if not os.path.exists(RESULT_DIR):
    os.makedirs(RESULT_DIR)
result_prefix = RESULT_PREFIX

# this will contain our generated image
if K.image_data_format() == 'channels_first':
    combination_image = K.placeholder((1, 3, img_nrows, img_ncols))
else:
    combination_image = K.placeholder((1, img_nrows, img_ncols, 3))

x = preprocess_image(CONTENT_IMAGE_FN)

# `model` is assumed to be the VGG19 network, e.g. model = vgg19.VGG19(weights='imagenet', include_top=False)
outputs_dict = dict([(layer.name, layer.output) for layer in model.layers])
feature_layers = ['block1_conv1', 'block2_conv1',
                  'block3_conv1', 'block4_conv1',
                  'block5_conv1']
outputs = []
for layer_name in feature_layers:
    outputs.append(outputs_dict[layer_name])
functor = K.function([combination_image], outputs)  # evaluation function

# Testing
test = x
layer_outs = functor([test])
print(layer_outs)
layer_outs[0].reshape(400, -1, 3)  # getting the error here
I am getting following error:
ValueError: cannot reshape array of size 13644800 into shape (400,newaxis,3)
You wrote:
"The size of output of block1_conv1 layer is (1, 400, 533, 64). Here 1
is number of images as input, 400 is number of rows, 533 number of
columns and 64 number of channels"
But this is not correct. In the block1_conv1 output, the 1 corresponds to the channel dimension (channels first), 400 * 533 to the image dimensions, and 64 to the number of filters.
The error occurs because you try to reshape the VGG19 output for a 1-channel image input (400 * 533 * 64 = 13644800 values) into a shape that corresponds to a 3-channel output.
Furthermore, you have to pass a 3-channel input.
From the VGG19 code:
input_shape: optional shape tuple, only to be specified
if include_top is False (otherwise the input shape
has to be (224, 224, 3)
(with channels_last data format)
or (3, 224, 224) (with channels_first data format).
It should have exactly 3 inputs channels,
and width and height should be no smaller than 32.
E.g. (200, 200, 3) would be one valid value.
Thus your input images have to have 3 channels. If you still want to feed 1-channel (grayscale) images to VGG19, you should do the following. If channels first:
X = np.repeat(X, 3, axis=0)
or, if channels last without a batch dimension:
X = np.repeat(X, 3, axis=2)
or, with a batch dimension:
X = np.repeat(X, 3, axis=3)
If you provide more information about the actual dimensions of your input image matrices and their type (grayscale, RGB), I can help you further if needed.
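A short sketch of the channels-last case with a batch dimension (the shapes and the random data are only placeholders matching the question):
import numpy as np
from keras.applications import vgg19

# A grayscale batch with shape (1, 400, 533, 1), channels last.
X = np.random.rand(1, 400, 533, 1).astype('float32') * 255.

# Repeat the single channel three times -> (1, 400, 533, 3).
X_rgb = np.repeat(X, 3, axis=3)

# Now the array has the 3 channels VGG19 expects and can be preprocessed and fed in.
X_rgb = vgg19.preprocess_input(X_rgb)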

Keras wrong image size

I want to test the accuracy of my CNN model on the test images. The following is the code for converting ground-truth images from mha format to png format.
def save_labels(fns):
    '''
    INPUT list 'fns': filepaths to all labels
    '''
    progress.currval = 0
    for label_idx in progress(xrange(len(fns))):
        slices = io.imread(fns[label_idx], plugin='simpleitk')
        for slice_idx in xrange(len(slices)):
            '''
            commented code in order to reshape the image slices. I tried reshaping but it did not work
            strip = slices[slice_idx].reshape(1200, 240)
            if np.max(strip) != 0:
                strip /= np.max(strip)
            if np.min(strip) <= -1:
                strip /= abs(np.min(strip))
            '''
            io.imsave('Labels2/{}_{}L.png'.format(label_idx, slice_idx), slices[slice_idx])
This code produces 240 x 240 images in png format. However, most of them are low contrast or completely blackened. Moving on, I now pass these images to a function for predicting the class of each labelled image.
def predict_image(self, test_img, show=False):
    '''
    predicts classes of input image
    INPUT   (1) str 'test_image': filepath to image to predict on
            (2) bool 'show': True to show the results of prediction, False to return prediction
    OUTPUT  (1) if show == False: array of predicted pixel classes for the center 208 x 208 pixels
            (2) if show == True: displays segmentation results
    '''
    imgs = io.imread(test_img, plugin='simpleitk').astype('float').reshape(5, 240, 240)
    plist = []
    # create patches from an entire slice
    for img in imgs[:-1]:
        if np.max(img) != 0:
            img /= np.max(img)
        p = extract_patches_2d(img, (33, 33))
        plist.append(p)
    patches = np.array(zip(np.array(plist[0]), np.array(plist[1]), np.array(plist[2]), np.array(plist[3])))
    # predict classes of each pixel based on model
    full_pred = keras.utils.np_utils.probas_to_classes(self.model_comp.predict(patches))
    fp1 = full_pred.reshape(208, 208)
    if show:
        io.imshow(fp1)
        plt.show()
    else:
        return fp1
I am getting ValueError: cannot reshape array of size 172800 into shape (5,240,240). I changed 5 to 3 so that 3 x 240 x 240 = 172800, but then there is a new problem: ValueError: Error when checking : expected convolution2d_input_1 to have 4 dimensions, but got array with shape (43264, 33, 33).
My model looks like this:
single = Sequential()
single.add(Convolution2D(self.n_filters[0], self.k_dims[0], self.k_dims[0], border_mode='valid', W_regularizer=l1l2(l1=self.w_reg, l2=self.w_reg), input_shape=(self.n_chan,33,33)))
single.add(Activation(self.activation))
single.add(BatchNormalization(mode=0, axis=1))
single.add(MaxPooling2D(pool_size=(2,2), strides=(1,1)))
single.add(Dropout(0.5))
single.add(Convolution2D(self.n_filters[1], self.k_dims[1], self.k_dims[1], activation=self.activation, border_mode='valid', W_regularizer=l1l2(l1=self.w_reg, l2=self.w_reg)))
single.add(BatchNormalization(mode=0, axis=1))
single.add(MaxPooling2D(pool_size=(2,2), strides=(1,1)))
single.add(Dropout(0.5))
single.add(Convolution2D(self.n_filters[2], self.k_dims[2], self.k_dims[2], activation=self.activation, border_mode='valid', W_regularizer=l1l2(l1=self.w_reg, l2=self.w_reg)))
single.add(BatchNormalization(mode=0, axis=1))
single.add(MaxPooling2D(pool_size=(2,2), strides=(1,1)))
single.add(Dropout(0.5))
single.add(Convolution2D(self.n_filters[3], self.k_dims[3], self.k_dims[3], activation=self.activation, border_mode='valid', W_regularizer=l1l2(l1=self.w_reg, l2=self.w_reg)))
single.add(Dropout(0.25))
single.add(Flatten())
single.add(Dense(5))
single.add(Activation('softmax'))
sgd = SGD(lr=0.001, decay=0.01, momentum=0.9)
single.compile(loss='categorical_crossentropy', optimizer='sgd')
print 'Done.'
return single
I am using Keras 1.2.2. Please refer here and here (is it due to this change in full_predict in the above code?) for my previous posts for background information. Please refer to this to see why these specific sizes like 33, 33 are used.
You should check the shape of the patches array. It should have 4 dimensions (nrBatches, nrChannels, Width, Height). According to your error message there are only 3 dimensions, so it seems like you merged your channel dimension with your batch dimension.
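A minimal sketch of one way to build that 4-dimensional array from the four per-slice patch lists in the code above (the shapes are assumptions based on the error message):
import numpy as np

# plist holds four arrays of shape (nrPatches, 33, 33), one per channel.
# Stacking along axis=1 gives (nrPatches, 4, 33, 33): batch, channels, height, width.
patches = np.stack([plist[0], plist[1], plist[2], plist[3]], axis=1)
print(patches.shape)  # should print (43264, 4, 33, 33) before calling self.model_comp.predict(patches)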
