I am running a CNN for classification of medical scans using Keras and transfer learning with imagenet and InceptionV3. I am building the model with some practice data of size X_train = (624, 128, 128, 1) and Y_train = (624, 2).
I am trying to resize the input_tensor to suit the shape of my images (128 x 128 x 1) using the code below.
input_tensor = Input(shape=(128, 128, 1))
base_model = InceptionV3(input_tensor=input_tensor,weights='imagenet',include_top=False)
Doing this I get a value error:
ValueError: Dimension 0 in both shapes must be equal, but are 3 and 32. Shapes
are [3,3,1,32] and [32,3,3,3]. for 'Assign_753' (op: 'Assign') with input
shapes: [3,3,1,32], [32,3,3,3]
Is there a way to allow this model to accept my images in their format?
Edit:
For what it's worth, here is the code that generates the training data.
X = []
Y = []
for subj, subj_slice in slices.items():
# X.extend([s[:, :, np.newaxis, np.newaxis] for s in slice])
subj_slice_norm = [((imageArray - np.min(imageArray)) / np.ptp(imageArray)) for imageArray in subj_slice]
X.extend([s[ :, :, np.newaxis] for s in subj_slice_norm])
subj_status = labels_df['deadstatus.event'][labels_df['PatientID'] == subj]
subj_status = np.asanyarray(subj_status)
#print(subj_status)
Y.extend([subj_status] * len(subj_slice))
X = np.stack(X, axis=0)
Y = to_categorical(np.stack(Y, axis=0))
n_samp_train = int(X.shape[0]*0.8)
X_train, Y_train = X[:n_samp_train], Y[:n_samp_train]
Edit2:
I think the other alternative would be to take my X, which has shape (780, 128, 128, 1), and duplicate the single channel of each of the 780 images twice so that every image ends up with three identical channels, resulting in (780, 128, 128, 3). Is this possible?
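A minimal NumPy sketch of that idea (just to illustrate what I mean) would be:
import numpy as np
# X has shape (780, 128, 128, 1); repeating the grey channel three times
# along the last axis yields (780, 128, 128, 3).
X_rgb = np.repeat(X, repeats=3, axis=-1)
print(X_rgb.shape)  # (780, 128, 128, 3)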
You can use existing Keras layers to convert your images to the shape the pre-trained model expects, rather than replicating the channels with NumPy. Replicating the channels before training may consume 3x the memory, whereas doing this processing at runtime inside the model avoids that cost.
You can proceed as follows.
Step 1: Create a Keras Model that converts your input images to the shape expected as the input of the base_model, as follows:
from keras.models import Model
from keras.layers import RepeatVector, Input, Reshape
inputs = Input(shape=(128, 128, 1))
reshaped1 = Reshape(target_shape=((128 * 128 * 1,)))(inputs)
repeated = RepeatVector(n=3)(reshaped1)
reshaped2 = Reshape(target_shape=(3, 128, 128))(repeated)
input_model = Model(inputs=inputs, outputs=reshaped2)
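Note that this adapter outputs a channels-first tensor of shape (3, 128, 128). If your Keras image_data_format is the default channels_last, you may want to add a Permute layer (my own addition, not part of the original answer) so the channel axis ends up last:
from keras.layers import Permute
# Hypothetical extra step for a channels_last backend:
# (3, 128, 128) -> (128, 128, 3)
permuted = Permute((2, 3, 1))(reshaped2)
input_model = Model(inputs=inputs, outputs=permuted)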
Step 2: Define the pre-trained InceptionV3 model as follows:
base_model = InceptionV3(input_tensor=input_model.output, weights='imagenet', include_top=False)
Step 3: Combine the two models as follows:
combined_model = Model(inputs=input_model.input, outputs=base_model.output)
The advantage of this method is that the Keras model itself takes care of image processing such as channel replication at runtime. There is no need to replicate the image channels ourselves with NumPy, and the result is memory efficient.
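A simpler alternative adapter (again my own suggestion, not part of the original answer) is to concatenate the single channel with itself three times along the channel axis, which keeps channels_last ordering and avoids the reshape/repeat steps:
from keras.models import Model
from keras.layers import Input, Concatenate
from keras.applications.inception_v3 import InceptionV3
inputs = Input(shape=(128, 128, 1))
# Tile the grey channel -> (128, 128, 3), already in channels_last order.
tiled = Concatenate(axis=-1)([inputs, inputs, inputs])
base_model = InceptionV3(input_tensor=tiled, weights='imagenet', include_top=False)
combined_model = Model(inputs=inputs, outputs=base_model.output)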
Related
I am trying to augment (random crop) images while loading them using a tensorflow Dataset.
I am getting this error when I call the method tf.image.random_crop in the mapped function:
ValueError: Dimensions must be equal, but are 4 and 3 for '{{node random_crop/GreaterEqual}} = GreaterEqual[T=DT_INT32](random_crop/Shape, random_crop/size)' with input shapes: [4], [3].
In order to reproduce the error, just place some png images in the directory:
./img/class0/
Then run this code:
import os
import tensorflow as tf
train_set_raw = tf.keras.preprocessing.image_dataset_from_directory('./img',label_mode=None,validation_split=None,batch_size=32)
def augment(tensor):
tensor = tf.cast(x=tensor, dtype=tf.float32)
tensor = tf.divide(x=tensor, y=tf.constant(255.))
tensor = tf.image.random_crop(value=tensor, size=(256, 256, 3))
return tensor
train_set_raw = train_set_raw.map(augment).batch(32)
If I specify the batch size explicitly,
tensor = tf.image.random_crop(value=tensor, size=(32,256, 256, 3))
the error can be sorted. However, if you try to fit a model with a dataset created with a fixed batch size, you will get an error:
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [Need value.shape >= size, got ] [1 256 256 3] [32 256 256 3]
[[{{node random_crop/Assert/Assert}}]]
Try using a batch size of 1:
tensor = tf.image.random_crop(value=tensor, size=(1,256, 256, 3))
But I don't think you should mix high-level data loaders with a lower-level tf.data.Dataset pipeline. Try using only the latter.
import tensorflow as tf
image_dir = r'C:\Users\user\Pictures'
files = tf.data.Dataset.list_files(image_dir + '\\*jpg')
def load(filepath):
image = tf.io.read_file(filepath)
image = tf.image.decode_image(image)
return image
ds = files.map(load)
def augment(tensor):
tensor = tf.cast(x=tensor, dtype=tf.float32)
tensor = tf.divide(x=tensor, y=tf.constant(255.))
tensor = tf.image.random_crop(value=tensor, size=(100, 100, 3))
random_target = tf.random.uniform((1,), dtype=tf.int32, maxval=2)
return tensor, random_target
train_set_raw = ds.map(augment).batch(32)
model = tf.keras.Sequential([
tf.keras.layers.Flatten(),
tf.keras.layers.Dense(8, activation='relu'),
tf.keras.layers.Dense(1, activation='sigmoid')
])
model.compile(loss='binary_crossentropy', optimizer='adam')
history = model.fit(train_set_raw)
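If you do want to keep image_dataset_from_directory, one option (my own suggestion, not part of the original answer) is to unbatch the dataset before mapping, so that augment() receives single 3-D images and a 3-element crop size is valid. Note that image_dataset_from_directory resizes images to 256x256 by default, so the crop here is set to 224x224 to leave room for actual cropping:
import tensorflow as tf
train_set_raw = tf.keras.preprocessing.image_dataset_from_directory(
    './img', label_mode=None, validation_split=None, batch_size=32)
def augment(tensor):
    tensor = tf.cast(x=tensor, dtype=tf.float32)
    tensor = tf.divide(x=tensor, y=tf.constant(255.))
    # Each element is now a single (256, 256, 3) image, so a 3-element size works.
    tensor = tf.image.random_crop(value=tensor, size=(224, 224, 3))
    return tensor
# unbatch() turns the pre-batched dataset back into individual images
# before mapping; we re-batch afterwards.
train_set_raw = train_set_raw.unbatch().map(augment).batch(32)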
I'm trying to reshape a Tensorflow model's input along the batch dimension. I want to combine some of the batch samples into a time-series so I can feed it into an LSTM layer.
Specifically, I have 1024 samples and I'd like to put them into groups of 64 timesteps with the result being 16 batches of 64 timesteps, each timestep having the original 24 features.
#input tensor is (1024, 24)
inputLayer = Input(shape=(24,))
#I want it to be (16, 64, 24)
reshapedLayer = layers.Reshape([64, 24])(inputLayer)
lstmLayer = layers.LSTM(128, activation='relu')(reshapedLayer)
This compiles but throws a runtime error
tensorflow.python.framework.errors_impl.InvalidArgumentError:
Input to reshape is a tensor with 24576 values, but the requested shape has 1572864
I understand what the error is telling me, but I'm not sure the right way to go about fixing it.
Perhaps this could work for you:
import tensorflow as tf
inputs = tf.keras.layers.Input(shape=(24,))
x = tf.reshape(inputs, (16, 64, 24))
x = tf.keras.layers.LSTM(128, activation='relu')(x)
model = tf.keras.Model(inputs=inputs, outputs=x)
# dummy data
inputs = tf.random.uniform(shape=(1024, 24))
outputs = model(inputs)
The key change is replacing the Reshape layer with tf.reshape: a Keras Reshape layer only reshapes each sample and cannot fold the batch dimension into timesteps, whereas tf.reshape can.
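If you'd rather not hard-code the number of groups, a variant (my own tweak, not from the answer above) is to let TensorFlow infer it with -1:
import tensorflow as tf
inputs = tf.keras.layers.Input(shape=(24,))
# -1 lets tf.reshape infer the number of 64-step groups from however many
# rows come in, instead of fixing it at 16.
x = tf.reshape(inputs, (-1, 64, 24))
x = tf.keras.layers.LSTM(128, activation='relu')(x)
model = tf.keras.Model(inputs=inputs, outputs=x)
outputs = model(tf.random.uniform(shape=(1024, 24)))  # -> shape (16, 128)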
I'm trying to make a CNN (still a beginner). When trying to fit the model I am getting this error:
ValueError: A target array with shape (10000, 10) was passed for output of shape (None, 6, 6, 10) while using as loss categorical_crossentropy. This loss expects targets to have the same shape as the output.
The shape of the labels = (10000, 10)
The shape of the image data = (10000, 32, 32, 3)
Code:
import pickle
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Dense, Dropout, Activation, Flatten,
Conv2D, MaxPooling2D)
from tensorflow.keras.callbacks import TensorBoard
from keras.utils import to_categorical
import numpy as np
import time
MODEL_NAME = f"_________{int(time.time())}"
BATCH_SIZE = 64
class ConvolutionalNetwork():
'''
A convolutional neural network to be used to classify images
from the CIFAR-10 dataset.
'''
def __init__(self):
'''
self.training_images -- a 10000x3072 numpy array of uint8s. Each
row of the array stores a 32x32 colour image.
The first 1024 entries contain the red channel
values, the next 1024 the green, and the final
1024 the blue. The image is stored in row-major
order, so that the first 32 entries of the array are the red channel values of the first row of the image.
self.training_labels -- a list of 10000 numbers in the range 0-9.
The number at index I indicates the label
of the ith image in the array data.
'''
# List of image categories
self.label_names = (self.unpickle("cifar-10-batches-py/batches.meta",
encoding='utf-8')['label_names'])
self.training_data = self.unpickle("cifar-10-batches-py/data_batch_1")
self.training_images = self.training_data[b'data']
self.training_labels = self.training_data[b'labels']
# Reshaping the images + scaling
self.shape_images()
# Converts labels to one-hot
self.training_labels = np.array(to_categorical(self.training_labels))
self.create_model()
self.tensorboard = TensorBoard(log_dir=f'logs/{MODEL_NAME}')
def unpickle(self, file, encoding='bytes'):
'''
Unpickles the dataset files.
'''
with open(file, 'rb') as fo:
training_dict = pickle.load(fo, encoding=encoding)
return training_dict
def shape_images(self):
'''
Reshapes the images and scales by 255.
'''
images = list()
for d in self.training_images:
image = np.zeros((32,32,3), dtype=np.uint8)
image[...,0] = np.reshape(d[:1024], (32,32)) # Red channel
image[...,1] = np.reshape(d[1024:2048], (32,32)) # Green channel
image[...,2] = np.reshape(d[2048:], (32,32)) # Blue channel
images.append(image)
for i in range(len(images)):
images[i] = images[i]/255
images = np.array(images)
self.training_images = images
print(self.training_images.shape)
def create_model(self):
'''
Creating the ConvNet model.
'''
self.model = Sequential()
self.model.add(Conv2D(64, (3, 3), input_shape=self.training_images.shape[1:]))
self.model.add(Activation("relu"))
self.model.add(MaxPooling2D(pool_size=(2,2)))
self.model.add(Conv2D(64, (3,3)))
self.model.add(Activation("relu"))
self.model.add(MaxPooling2D(pool_size=(2,2)))
# self.model.add(Flatten())
# self.model.add(Dense(64))
# self.model.add(Activation('relu'))
self.model.add(Dense(10))
self.model.add(Activation(activation='softmax'))
self.model.compile(loss="categorical_crossentropy", optimizer="adam",
metrics=['accuracy'])
def train(self):
'''
Fits the model.
'''
print(self.training_images.shape)
print(self.training_labels.shape)
self.model.fit(self.training_images, self.training_labels, batch_size=BATCH_SIZE,
validation_split=0.1, epochs=5, callbacks=[self.tensorboard])
network = ConvolutionalNetwork()
network.train()
I would appreciate the help; I have been trying to fix this for an hour.
You need to uncomment the Flatten layer when creating your model. Essentially what this layer does is that it takes a 4D input (batch_size, height, width, num_filters) and unrolls it into a 2D one (batch_size, height * width * num_filters). This is needed to get the output shape you want.
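For illustration (my own sketch, using the layer sizes from the question's model): after the two Conv2D/MaxPooling2D blocks the feature maps are (None, 6, 6, 64), and the corrected tail would look like this:
self.model.add(Flatten())               # (None, 6, 6, 64) -> (None, 2304)
self.model.add(Dense(10))               # (None, 2304)     -> (None, 10)
self.model.add(Activation('softmax'))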
Un-comment the Flatten layer before your output layer in create_model(self). The convolution and pooling layers output 4D tensors, and a Dense layer applied directly to them keeps those spatial dimensions, so to get an output layer of the right shape you need a Flatten() layer right before your output layer, like this:
def create_model(self):
'''
Creating the ConvNet model.
'''
self.model = Sequential()
self.model.add(Conv2D(64, (3, 3), activation='relu', input_shape=self.training_images.shape[1:]))
#self.model.add(Activation("relu"))
self.model.add(MaxPooling2D(pool_size=(2,2)))
self.model.add(Conv2D(64, (3,3), activation='relu'))
#self.model.add(Activation("relu"))
self.model.add(MaxPooling2D(pool_size=(2,2)))
# self.model.add(Dense(64))
# self.model.add(Activation('relu'))
self.model.add(Flatten())
self.model.add(Dense(10, activation='softmax'))
#self.model.add(Activation(activation='softmax'))
self.model.compile(loss="categorical_crossentropy", optimizer="adam",
metrics=['accuracy'])
print('model output shape:', self.model.output_shape)  # prints the output shape of your model
The code above will give you a model with an output shape of (None, 10).
Also, please pass activation as a layer parameter in the future (as done above), rather than adding separate Activation layers.
Use model.summary() to inspect the output shapes of your model. Without the commented out Flatten() layer the shapes of your layers retain the original dimensions of the image and the shape of the output layer is (None, 6, 6, 10).
What you want to do here is roughly:
start with a shape of (batch_size, img width, img height, channels)
use convolutions to detect patterns through the image by applying a filter
reduce the img width and height with max pooling
then Flatten() the dimensions of the image so that instead of (width, height, features) you end up with just a set of features.
match against your classes.
The commented out code does step 4; when you remove the Flatten() layer you end up with the wrong set of dimensions at the end.
You have to get your model output into the same shape as your labels.
Perhaps the simplest solution would be to ensure the model ends with these layers:
model.add(Flatten())
## possibly an extra dense layer or 2 with 'relu' activation
model.add(Dense(10, activation='softmax'))
This is amongst the most common 'endings' to a categorisation model and is arguably the most straightforward to understand.
It's not clear why you commented out this section:
# self.model.add(Flatten())
# self.model.add(Dense(64))
# self.model.add(Activation('relu'))
which would appear to give you the required output shape?
This is my piece of code for a GAN where the model is being initialized. Everything is working; only the code relevant to the problem is shown here:
z = Input(shape=(100+384,))
img = self.generator(z)
print("before: ",img) #128x128x3 shape, dtype=tf.float32
temp = tf.get_variable("temp", [1, 128, 3],dtype=tf.float32)
img=tf.concat(img,temp)
print("after: ",img) #error ValueError: Incompatible type conversion requested to type 'int32' for variable of type 'float32_ref'
valid = self.discriminator(img)
self.combined = Model(z, valid)
I have 128x128x3 images to generate. What I want to do is give 129x128x3 images to the discriminator, where a 1x128x3 text-embedding matrix is concatenated with the image during training. But I have to specify at the start the tensor shapes and input values that each model, i.e. GEN and DISC, will get. GEN takes the 100-dim noise plus the 384-dim embedding and generates a 128x128x3 image, which is then combined with another embedding, i.e. 1x128x3, and fed to DISC. So my question is whether this approach is correct or not. Also, if it is correct or makes sense, how can I specify what is needed at the start so that it does not give me errors like incompatible shapes? At the start I have to add these lines:
z = Input(shape=(100+384,))
img = self.generator(z) #128x128x3
valid = self.discriminator(img) #should be 129x128x3
self.combined = Model(z, valid)
But img is 128x128x3, and during training it is changed to 129x128x3 by concatenating the embedding matrix. So how can I change "img" from (128, 128, 3) to (129, 128, 3) in the above code, either by padding, by appending another tensor, or by simply reshaping (which of course is not possible here)? Any help will be much appreciated. Thanks.
The first argument of tf.concat should be the list of tensors, while the second is the axis along which to concatenate. You could concatenate the img and temp tensors as follows:
import tensorflow as tf
img = tf.ones(shape=(128, 128, 3))
temp = tf.get_variable("temp", [1, 128, 3], dtype=tf.float32)
img = tf.concat([img, temp], axis=0)
with tf.Session() as sess:
print(sess.run(tf.shape(img)))
UPDATE: Here you have a minimal example showing why you get the error "AttributeError: 'Tensor' object has no attribute '_keras_history'". This error pops up in the following snippet:
from keras.layers import Input, Lambda, Dense
from keras.models import Model
import tensorflow as tf
img = Input(shape=(128, 128, 3)) # Shape=(batch_size, 128, 128, 3)
temp = Input(shape=(1, 128, 3)) # Shape=(batch_size, 1, 128, 3)
concat = tf.concat([img, temp], axis=1)
print(concat.get_shape())
dense = Dense(1)(concat)
model = Model(inputs=[img, temp], outputs=dense)
This happens because the tensor concat is not a Keras tensor, and therefore some of the typical Keras tensor attributes (such as _keras_history) are missing. To overcome this problem, you need to encapsulate all TensorFlow tensors into a Keras Lambda layer:
from keras.layers import Input, Lambda, Dense
from keras.models import Model
import tensorflow as tf
img = Input(shape=(128, 128, 3)) # Shape=(batch_size, 128, 128, 3)
temp = Input(shape=(1, 128, 3)) # Shape=(batch_size, 1, 128, 3)
concat = Lambda(lambda x: tf.concat([x[0], x[1]], axis=1))([img, temp])
print(concat.get_shape())
dense = Dense(1)(concat)
model = Model(inputs=[img, temp], outputs=dense)
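As a hypothetical sketch (my own illustration, reusing the generator/discriminator attributes from the question and assuming the discriminator is built for 129x128x3 inputs), the combined model could then be wired like this:
from keras.layers import Input, Lambda
from keras.models import Model
import tensorflow as tf
z = Input(shape=(100 + 384,))
emb = Input(shape=(1, 128, 3))                 # text-embedding matrix
img = self.generator(z)                        # (batch, 128, 128, 3)
# Concatenate along the height axis -> (batch, 129, 128, 3)
img_emb = Lambda(lambda x: tf.concat([x[0], x[1]], axis=1))([img, emb])
valid = self.discriminator(img_emb)
self.combined = Model(inputs=[z, emb], outputs=valid)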
I'm fairly new to TensorFlow and image classification, so I may be missing key knowledge, which is probably why I'm facing this issue.
I've built a ResNet50 model in TensorFlow for image classification of dog breeds using ImageNet weights, and I have successfully trained a neural network which can detect various dog breeds.
I'm now at the point where I would like to pass a random image of a dog to my model and have it output what it thinks the dog breed is. However, when I run this function, dog_breed_predictor("<file path to image>"), I get the error expected global_average_pooling2d_1_input to have shape (1, 1, 2048) but got array with shape (7, 7, 2048) when it tries to execute the line Resnet50_model.predict(bottleneck_feature), and I don't know how to get around this.
Here's the code. I've provided all that I feel is relevant to the problem.
import cv2
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from tqdm import tqdm
from sklearn.datasets import load_files
np_utils = tf.keras.utils
# define function to load train, test, and validation datasets
def load_dataset(path):
data = load_files(path)
dog_files = np.array(data['filenames'])
dog_targets = np_utils.to_categorical(np.array(data['target']), 133)
return dog_files, dog_targets
# load train, test, and validation datasets
train_files, train_targets = load_dataset('dogImages/dogImages/train')
valid_files, valid_targets = load_dataset('dogImages/dogImages/valid')
test_files, test_targets = load_dataset('dogImages/dogImages/test')
#define Resnet50 model
Resnet50_model = ResNet50(weights="imagenet")
def path_to_tensor(img_path):
#loads RGB image as PIL.Image.Image type
img = image.load_img(img_path, target_size=(224, 224))
#convert PIL.Image.Image type to 3D tensor with shape (224, 224, 3)
x = image.img_to_array(img)
#convert 3D tensor into 4D tensor with shape (1, 224, 224, 3)
return np.expand_dims(x, axis=0)
from keras.applications.resnet50 import preprocess_input, decode_predictions
def ResNet50_predict_labels(img_path):
#returns prediction vector for image located at img_path
img = preprocess_input(path_to_tensor(img_path))
return np.argmax(Resnet50_model.predict(img))
###returns True if a dog is detected in the image stored at img_path
def dog_detector(img_path):
prediction = ResNet50_predict_labels(img_path)
return ((prediction <= 268) & (prediction >= 151))
###Obtain bottleneck features from another pre-trained CNN
bottleneck_features = np.load("bottleneck_features/DogResnet50Data.npz")
train_DogResnet50 = bottleneck_features["train"]
valid_DogResnet50 = bottleneck_features["valid"]
test_DogResnet50 = bottleneck_features["test"]
###Define your architecture
Resnet50_model = tf.keras.Sequential()
Resnet50_model.add(tf.keras.layers.GlobalAveragePooling2D(input_shape=train_DogResnet50.shape[1:]))
Resnet50_model.add(tf.contrib.keras.layers.Dense(133, activation="softmax"))
Resnet50_model.summary()
###Compile the model
Resnet50_model.compile(loss="categorical_crossentropy", optimizer="rmsprop", metrics=["accuracy"])
###Train the model
checkpointer = tf.keras.callbacks.ModelCheckpoint(filepath="saved_models/weights.best.ResNet50.hdf5",
verbose=1, save_best_only=True)
Resnet50_model.fit(train_DogResnet50, train_targets,
validation_data=(valid_DogResnet50, valid_targets),
epochs=20, batch_size=20, callbacks=[checkpointer])
###Load the model weights with the best validation loss.
Resnet50_model.load_weights("saved_models/weights.best.ResNet50.hdf5")
###Calculate classification accuracy on the test dataset
Resnet50_predictions = [np.argmax(Resnet50_model.predict(np.expand_dims(feature, axis=0))) for feature in test_DogResnet50]
#Report test accuracy
test_accuracy = 100*np.sum(np.array(Resnet50_predictions)==np.argmax(test_targets, axis=1))/len(Resnet50_predictions)
print("Test accuracy: %.4f%%" % test_accuracy)
def extract_Resnet50(tensor):
from keras.applications.resnet50 import ResNet50, preprocess_input
return ResNet50(weights='imagenet', include_top=False).predict(preprocess_input(tensor))
def dog_breed(img_path):
#extract bottleneck features
bottleneck_feature = extract_Resnet50(path_to_tensor(img_path))
#obtain predicted vector
predicted_vector = Resnet50_model.predict(bottleneck_feature) #shape error occurs here
#return dog breed that is predicted by the model
return dog_names[np.argmax(predicted_vector)]
def dog_breed_predictor(img_path):
#determine the predicted dog breed
breed = dog_breed(img_path)
#display the image
img = cv2.imread(img_path)
cv_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
plt.imshow(cv_rgb)
plt.show()
#display relevant predictor result
if dog_detector(img_path):
print("This is a dog and its breed is: " + str(breed))
elif face_detector(img_path):
print("This is a human but it looks like a: " + str(breed))
else:
print("I don't know what this is.")
dog_breed_predictor("dogImages/dogImages/train/016.Beagle/Beagle_01126.jpg")
The image I'm feeding into my function is from the same dataset that was used to train the model - I wanted to see myself if the model is working as intended - so this error makes it extra confusing. What could I be doing wrong?
Thanks to nessuno's assistance, I figured out the issue. The problem was indeed with the pooling layer of ResNet50.
The following code in my script above:
return ResNet50(weights='imagenet',
include_top=False).predict(preprocess_input(tensor))
returns a shape of (1, 7, 7, 2048) (admittedly, though, I do not fully understand why). To get around this, I added the parameter pooling="avg", like so:
return ResNet50(weights='imagenet',
include_top=False,
pooling="avg").predict(preprocess_input(tensor))
This instead returns a shape of (1, 2048) (again, admittedly, I do not know why.)
However, the model still expects a 4-D shape. To get around this I added in the following code in my dog_breed() function:
print(bottleneck_feature.shape) #returns (1, 2048)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
print(bottleneck_feature.shape) #returns (1, 1, 1, 1, 2048) - yes a 5D shape, not 4.
and this returns a shape of (1, 1, 1, 1, 2048). For some reason, the model still complained it was a 3D shape when I only added 2 more dimensions, but stopped when I added a 3rd (this is peculiar, and I would like to find out more about why this is.).
So overall, my dog_breed() function went from:
def dog_breed(img_path):
#extract bottleneck features
bottleneck_feature = extract_Resnet50(path_to_tensor(img_path))
#obtain predicted vector
predicted_vector = Resnet50_model.predict(bottleneck_feature) #shape error occurs here
#return dog breed that is predicted by the model
return dog_names[np.argmax(predicted_vector)]
to this:
def dog_breed(img_path):
#extract bottleneck features
bottleneck_feature = extract_Resnet50(path_to_tensor(img_path))
print(bottleneck_feature.shape) #returns (1, 2048)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
print(bottleneck_feature.shape) #returns (1, 1, 1, 1, 2048) - yes a 5D shape, not 4.
#obtain predicted vector
predicted_vector = Resnet50_model.predict(bottleneck_feature) #shape error occurs here
#return dog breed that is predicted by the model
return dog_names[np.argmax(predicted_vector)]
whilst ensuring the parameter pooling="avg" is added to my call to ResNet50.
The documentation of ResNet50 says something about the constructor parameter input_shape (emphasis is mine):
input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) (with 'channels_last' data format) or (3, 224, 224) (with 'channels_first' data format). It should have exactly 3 inputs channels, and width and height should be no smaller than 197. E.g. (200, 200, 3) would be one valid value.
My guess is that since you specified include_top to False the network definition pads the input to a bigger shape than 224x224, so when you extract the features you end up with a feature map and not with a feature vector (and that's the cause of your error).
Just try to specify an input_shape in this way:
return ResNet50(weights='imagenet',
include_top=False,
input_shape=(224, 224, 3)).predict(preprocess_input(tensor))