I am trying to extract features from a convolution layer of the VGGFace model, using TensorFlow & Keras.
This is my code:
# Layer Features
layer_name = 'conv1_2' # Edit this line
vgg_model = VGGFace() # Pooling: None, avg or max
out = vgg_model.get_layer(layer_name).output
vgg_model_new = Model(vgg_model.input, out)
def main():
    img = image.load_img('myimage.jpg', target_size=(224, 224))
    x = image.img_to_array(img)
    x = np.expand_dims(x, axis=0)
    x = utils.preprocess_input(x, version=1)
    preds = vgg_model_new.predict(x)
    print('Predicted:', utils.decode_predictions(preds))
    exit(0)
However, at the print('Predicted:', utils.decode_predictions(preds)) line I am getting the following error:
Message=decode_predictions expects a batch of predictions (i.e. a
2D array of shape (samples, 2622)) for V1 or (samples, 8631) for
V2. Found array with shape: (1, 224, 224, 64)
I just want to extract features; I don't need to classify my images at this point. This code is based on https://github.com/rcmalli/keras-vggface
You shouldn't use utils.decode_predictions(preds) there because it's only for classification. You can see the definition of the function here https://github.com/rcmalli/keras-vggface/blob/master/keras_vggface/utils.py#L66
If you want to print the features, use print('Predicted:',preds)
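For example, a minimal sketch of the feature-extraction flow (assuming the same myimage.jpg and the truncated vgg_model_new built above) would just inspect or save the raw activations instead of decoding them:

import numpy as np
from keras.preprocessing import image
from keras_vggface import utils

# Build vgg_model_new up to the desired layer as above, then:
img = image.load_img('myimage.jpg', target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = utils.preprocess_input(x, version=1)

features = vgg_model_new.predict(x)
print('Feature map shape:', features.shape)  # e.g. (1, 224, 224, 64) for conv1_2

# Optionally persist the features for later use
np.save('myimage_conv1_2_features.npy', features)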
Related
I am using transfer learning for recognizing objects. I used a pre-trained VGG16 model as the base model and added my classifier on top of it using Keras. I then trained the model on my data, and the model works well. I want to see the features generated by the intermediate layers of the model for the given data. I used the following code for this purpose:
def ModeloutputAtthisLayer(model, layernme, imgnme, width, height):
    layer_name = layernme
    intermediate_layer_model = Model(inputs=model.input,
                                     outputs=model.get_layer(layer_name).output)
    img = image.load_img(imgnme, target_size=(width, height))
    imageArray = image.img_to_array(img)
    image_batch = np.expand_dims(imageArray, axis=0)
    processed_image = preprocess_input(image_batch.copy())
    intermediate_output = intermediate_layer_model.predict(processed_image)
    print("output shape of", layernme, "is", intermediate_output.shape)
In the code, I used np.expand_dims to add one extra dimension for the batch, as the input to the network should be of the form (batchsize, height, width, channels). This code works fine. The shape of the resulting feature map is (1, 224, 224, 64).
Now I wish to display this as an image. I understand that an extra dimension was added for the batch, so it should be removed first. I therefore used the following lines of code:
imge = np.squeeze(intermediate_output, axis=0)
plt.imshow(imge)
However it throws an error:
"Invalid dimensions for image data"
I wonder how I can display the extracted feature map as an image. Any suggestions, please.
Your feature shape is (1, 224, 224, 64); you cannot directly plot a 64-channel image. What you can do is plot the individual channels independently, like the following:
import math
import matplotlib.pyplot as plt

imge = np.squeeze(intermediate_output, axis=0)
filters = imge.shape[2]

plt.figure(1, figsize=(32, 32))  # figure of size 32x32 inches
n_columns = 8
n_rows = math.ceil(filters / n_columns)
for i in range(filters):
    plt.subplot(n_rows, n_columns, i + 1)
    plt.title('Filter ' + str(i))
    plt.imshow(imge[:, :, i], interpolation="nearest", cmap="gray")
plt.show()
This will plot 64 images in 8 rows and 8 columns.
A possible approach is to combine the 64 channels into a single-channel image through a weighted sum like this:
weighted_imge = np.sum(imge*weights, axis=-1)
where weights is an array with 64 weighting coefficients.
If you wish to give all the channels the same weight you could simply compute the average:
weighted_imge = np.mean(imge, axis=-1)
Demo
import numpy as np
import matplotlib.pyplot as plt
intermediate_output = np.random.randint(size=(1, 224, 224, 64),
                                        low=0, high=2**8, dtype=np.uint8)
imge = np.squeeze(intermediate_output, axis=0)
weights = np.random.random(size=(imge.shape[-1],))
weighted_imge = np.sum(imge*weights, axis=-1)
plt.imshow(weighted_imge)
plt.colorbar()
In [33]: intermediate_output.shape
Out[33]: (1, 224, 224, 64)
In [34]: imge.shape
Out[34]: (224, 224, 64)
In [35]: weights.shape
Out[35]: (64,)
In [36]: weighted_imge.shape
Out[36]: (224, 224)
This is my piece of code for a GAN where the model is being initialized. Everything is working; only the code relevant to the problem is shown here:
z = Input(shape=(100+384,))
img = self.generator(z)
print("before: ",img) #128x128x3 shape, dtype=tf.float32
temp = tf.get_variable("temp", [1, 128, 3],dtype=tf.float32)
img = tf.concat(img, temp)
print("after: ",img) #error ValueError: Incompatible type conversion requested to type 'int32' for variable of type 'float32_ref'
valid = self.discriminator(img)
self.combined = Model(z, valid)
I have 128x128x3 images to generate. What I want to do is give 129x128x3 images to the discriminator: the 1x128x3 text-embedding matrix is concatenated with the image during training. But I have to specify at the start the shape of the tensors and inputs that each model, i.e. GEN and DISC, will get. GEN takes the 100-dim noise plus the 384-dim embedding matrix and generates a 128x128x3 image, which is again combined with a 1x128x3 embedding and fed to DISC. So my question is whether this approach is correct or not. Also, if it is correct or makes sense, how can I specify the required shapes at the start so that I do not get errors like "incompatible shape"? At the start I have to add these lines:
z = Input(shape=(100+384,))
img = self.generator(z) #128x128x3
valid = self.discriminator(img) #should be 129x128x3
self.combined = Model(z, valid)
But img is 128x128x3 and is later changed to 129x128x3 during training by concatenating the embedding matrix. So how can I change img from (128, 128, 3) to (129, 128, 3) in the above code, either by padding, appending another tensor, or simply reshaping (which of course is not possible)? Any help will be much appreciated. Thanks.
The first argument of tf.concat should be the list of tensors, while the second is the axis along which to concatenate. You could concatenate the img and temp tensors as follows:
import tensorflow as tf
img = tf.ones(shape=(128, 128, 3))
temp = tf.get_variable("temp", [1, 128, 3], dtype=tf.float32)
img = tf.concat([img, temp], axis=0)
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # initialize the temp variable
    print(sess.run(tf.shape(img)))
UPDATE: Here is a minimal example showing why you get the error "AttributeError: 'Tensor' object has no attribute '_keras_history'". This error pops up in the following snippet:
from keras.layers import Input, Lambda, Dense
from keras.models import Model
import tensorflow as tf
img = Input(shape=(128, 128, 3)) # Shape=(batch_size, 128, 128, 3)
temp = Input(shape=(1, 128, 3)) # Shape=(batch_size, 1, 128, 3)
concat = tf.concat([img, temp], axis=1)
print(concat.get_shape())
dense = Dense(1)(concat)
model = Model(inputs=[img, temp], outputs=dense)
This happens because the tensor concat is not a Keras tensor, and therefore some of the typical Keras tensor attributes (such as _keras_history) are missing. To overcome this problem, you need to wrap all TensorFlow operations in a Keras Lambda layer:
from keras.layers import Input, Lambda, Dense
from keras.models import Model
import tensorflow as tf
img = Input(shape=(128, 128, 3)) # Shape=(batch_size, 128, 128, 3)
temp = Input(shape=(1, 128, 3)) # Shape=(batch_size, 1, 128, 3)
concat = Lambda(lambda x: tf.concat([x[0], x[1]], axis=1))([img, temp])
print(concat.get_shape())
dense = Dense(1)(concat)
model = Model(inputs=[img, temp], outputs=dense)
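As a side note, for this particular case the same result can also be obtained without a Lambda layer by using Keras's built-in Concatenate layer. A minimal sketch with the same img and temp inputs as above:

from keras.layers import Input, Dense, Concatenate
from keras.models import Model

img = Input(shape=(128, 128, 3))   # Shape=(batch_size, 128, 128, 3)
temp = Input(shape=(1, 128, 3))    # Shape=(batch_size, 1, 128, 3)

# Concatenate along the height axis (axis=1), producing shape (batch_size, 129, 128, 3)
concat = Concatenate(axis=1)([img, temp])
dense = Dense(1)(concat)
model = Model(inputs=[img, temp], outputs=dense)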
I'm fairly new to TensorFlow and image classification, so I may be missing key knowledge, which is probably why I'm facing this issue.
I've built a ResNet50 model in TensorFlow for the purpose of image classification of Dog Breeds, using ImageNet weights, and I have successfully trained a neural network which can detect various Dog Breeds.
I'm now at the point in which I would like to pass a random image of a dog to my model for it to spit out an output on what it thinks the dog breed is. However, when I run this function, dog_breed_predictor("<file path to image>"), I get the error expected global_average_pooling2d_1_input to have shape (1, 1, 2048) but got array with shape (7, 7, 2048) when it tries to execute the line Resnet50_model.predict(bottleneck_feature) and I don't know how to get around this.
Here's the code. I've provided all that I feel is relevant to the problem.
import cv2
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
from keras.applications.resnet50 import ResNet50
from keras.preprocessing import image
from tqdm import tqdm
from sklearn.datasets import load_files
np_utils = tf.keras.utils
# define function to load train, test, and validation datasets
def load_dataset(path):
    data = load_files(path)
    dog_files = np.array(data['filenames'])
    dog_targets = np_utils.to_categorical(np.array(data['target']), 133)
    return dog_files, dog_targets
# load train, test, and validation datasets
train_files, train_targets = load_dataset('dogImages/dogImages/train')
valid_files, valid_targets = load_dataset('dogImages/dogImages/valid')
test_files, test_targets = load_dataset('dogImages/dogImages/test')
#define Resnet50 model
Resnet50_model = ResNet50(weights="imagenet")
def path_to_tensor(img_path):
    # loads RGB image as PIL.Image.Image type
    img = image.load_img(img_path, target_size=(224, 224))
    # convert PIL.Image.Image type to 3D tensor with shape (224, 224, 3)
    x = image.img_to_array(img)
    # convert 3D tensor into 4D tensor with shape (1, 224, 224, 3)
    return np.expand_dims(x, axis=0)
from keras.applications.resnet50 import preprocess_input, decode_predictions
def ResNet50_predict_labels(img_path):
    # returns prediction vector for image located at img_path
    img = preprocess_input(path_to_tensor(img_path))
    return np.argmax(Resnet50_model.predict(img))

### returns True if a dog is detected in the image stored at img_path
def dog_detector(img_path):
    prediction = ResNet50_predict_labels(img_path)
    return ((prediction <= 268) & (prediction >= 151))
###Obtain bottleneck features from another pre-trained CNN
bottleneck_features = np.load("bottleneck_features/DogResnet50Data.npz")
train_DogResnet50 = bottleneck_features["train"]
valid_DogResnet50 = bottleneck_features["valid"]
test_DogResnet50 = bottleneck_features["test"]
###Define your architecture
Resnet50_model = tf.keras.Sequential()
Resnet50_model.add(tf.keras.layers.GlobalAveragePooling2D(input_shape=train_DogResnet50.shape[1:]))
Resnet50_model.add(tf.contrib.keras.layers.Dense(133, activation="softmax"))
Resnet50_model.summary()
###Compile the model
Resnet50_model.compile(loss="categorical_crossentropy", optimizer="rmsprop", metrics=["accuracy"])
###Train the model
checkpointer = tf.keras.callbacks.ModelCheckpoint(filepath="saved_models/weights.best.ResNet50.hdf5",
                                                  verbose=1, save_best_only=True)

Resnet50_model.fit(train_DogResnet50, train_targets,
                   validation_data=(valid_DogResnet50, valid_targets),
                   epochs=20, batch_size=20, callbacks=[checkpointer])
###Load the model weights with the best validation loss.
Resnet50_model.load_weights("saved_models/weights.best.ResNet50.hdf5")
###Calculate classification accuracy on the test dataset
Resnet50_predictions = [np.argmax(Resnet50_model.predict(np.expand_dims(feature, axis=0))) for feature in test_DogResnet50]
#Report test accuracy
test_accuracy = 100*np.sum(np.array(Resnet50_predictions)==np.argmax(test_targets, axis=1))/len(Resnet50_predictions)
print("Test accuracy: %.4f%%" % test_accuracy)
def extract_Resnet50(tensor):
    from keras.applications.resnet50 import ResNet50, preprocess_input
    return ResNet50(weights='imagenet', include_top=False).predict(preprocess_input(tensor))
def dog_breed(img_path):
    # extract bottleneck features
    bottleneck_feature = extract_Resnet50(path_to_tensor(img_path))
    # obtain predicted vector
    predicted_vector = Resnet50_model.predict(bottleneck_feature)  # shape error occurs here
    # return dog breed that is predicted by the model
    return dog_names[np.argmax(predicted_vector)]

def dog_breed_predictor(img_path):
    # determine the predicted dog breed
    breed = dog_breed(img_path)
    # display the image
    img = cv2.imread(img_path)
    cv_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
    plt.imshow(cv_rgb)
    plt.show()
    # display relevant predictor result
    if dog_detector(img_path):
        print("This is a dog and its breed is: " + str(breed))
    elif face_detector(img_path):
        print("This is a human but it looks like a: " + str(breed))
    else:
        print("I don't know what this is.")
dog_breed_predictor("dogImages/dogImages/train/016.Beagle/Beagle_01126.jpg")
The image I'm feeding into my function is from the same dataset that was used to train the model - I wanted to see for myself if the model is working as intended - so this error makes it extra confusing. What could I be doing wrong?
Thanks to nessuno's assistance, I figured out the issue. The problem was indeed with the pooling layer of ResNet50.
The following code in my script above:
return ResNet50(weights='imagenet',
                include_top=False).predict(preprocess_input(tensor))
returns a shape of (1, 7, 7, 2048) (admittedly, I do not fully understand why). To get around this, I added the parameter pooling="avg", like so:
return ResNet50(weights='imagenet',
                include_top=False,
                pooling="avg").predict(preprocess_input(tensor))
This instead returns a shape of (1, 2048) (again, admittedly, I do not know why.)
However, the model still expects a 4-D shape. To get around this I added in the following code in my dog_breed() function:
print(bottleneck_feature.shape) #returns (1, 2048)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
print(bottleneck_feature.shape) #returns (1, 1, 1, 1, 2048) - yes a 5D shape, not 4.
and this returns a shape of (1, 1, 1, 1, 2048). For some reason, the model still complained that it was a 3D shape when I added only 2 more dimensions, but stopped complaining when I added a 3rd (this is peculiar, and I would like to find out more about why this is).
So overall, my dog_breed() function went from:
def dog_breed(img_path):
    # extract bottleneck features
    bottleneck_feature = extract_Resnet50(path_to_tensor(img_path))
    # obtain predicted vector
    predicted_vector = Resnet50_model.predict(bottleneck_feature)  # shape error occurs here
    # return dog breed that is predicted by the model
    return dog_names[np.argmax(predicted_vector)]
to this:
def dog_breed(img_path):
    # extract bottleneck features
    bottleneck_feature = extract_Resnet50(path_to_tensor(img_path))
    print(bottleneck_feature.shape)  # returns (1, 2048)
    bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
    bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
    bottleneck_feature = np.expand_dims(bottleneck_feature, axis=0)
    print(bottleneck_feature.shape)  # returns (1, 1, 1, 1, 2048) - yes, a 5D shape, not 4
    # obtain predicted vector
    predicted_vector = Resnet50_model.predict(bottleneck_feature)  # shape error occurred here
    # return dog breed that is predicted by the model
    return dog_names[np.argmax(predicted_vector)]
whilst ensuring the parameter pooling="avg" is added to my call to ResNet50.
The documentation of ResNet50 says something about the constructor parameter input_shape (emphasis is mine):
input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) (with 'channels_last' data format) or (3, 224, 224) (with 'channels_first' data format). It should have exactly 3 inputs channels, and width and height should be no smaller than 197. E.g. (200, 200, 3) would be one valid value.
My guess is that since you specified include_top to False the network definition pads the input to a bigger shape than 224x224, so when you extract the features you end up with a feature map and not with a feature vector (and that's the cause of your error).
Just try to specify an input_shape in this way:
return ResNet50(weights='imagenet',
                include_top=False,
                input_shape=(224, 224, 3)).predict(preprocess_input(tensor))
I am running a CNN for classification of medical scans using Keras and transfer learning with imagenet and InceptionV3. I am building the model with some practice data of size X_train = (624, 128, 128, 1) and Y_train = (624, 2).
I am trying to resize the input_tensor to suit the shape of my images (128 x 128 x 1) using the below code.
input_tensor = Input(shape=(128, 128, 1))
base_model = InceptionV3(input_tensor=input_tensor,weights='imagenet',include_top=False)
Doing this I get a value error:
ValueError: Dimension 0 in both shapes must be equal, but are 3 and 32. Shapes
are [3,3,1,32] and [32,3,3,3]. for 'Assign_753' (op: 'Assign') with input
shapes: [3,3,1,32], [32,3,3,3]
Is there a way to allow this model to accept my images in their format?
Edit:
For what it's worth, here is the code to generate the training data.
X = []
Y = []

for subj, subj_slice in slices.items():
    # X.extend([s[:, :, np.newaxis, np.newaxis] for s in slice])
    subj_slice_norm = [((imageArray - np.min(imageArray)) / np.ptp(imageArray)) for imageArray in subj_slice]
    X.extend([s[:, :, np.newaxis] for s in subj_slice_norm])
    subj_status = labels_df['deadstatus.event'][labels_df['PatientID'] == subj]
    subj_status = np.asanyarray(subj_status)
    #print(subj_status)
    Y.extend([subj_status] * len(subj_slice))

X = np.stack(X, axis=0)
Y = to_categorical(np.stack(Y, axis=0))

n_samp_train = int(X.shape[0]*0.8)
X_train, Y_train = X[:n_samp_train], Y[:n_samp_train]
Edit2:
I think the other alternative would be to take my X, which has shape (780, 128, 128, 1), and clone the single channel of each of the 780 images twice as dummies, resulting in (780, 128, 128, 3). Is this possible?
We can use the existing Keras layers to convert the input image shape to the shape expected by the pre-trained model, rather than using numpy to replicate the channels. Replicating the channels before training may consume 3x the memory, whereas integrating this processing at runtime saves a lot of memory.
You can proceed this way.
Step 1: Create a Keras Model that converts your input images to the shape that can be fed as the input for the base_model as follows:
from keras.models import Model
from keras.layers import RepeatVector, Input, Reshape
inputs = Input(shape=(128, 128, 1))
reshaped1 = Reshape(target_shape=((128 * 128 * 1,)))(inputs)
repeated = RepeatVector(n=3)(reshaped1)
reshaped2 = Reshape(target_shape=(3, 128, 128))(repeated)
input_model = Model(inputs=inputs, outputs=reshaped2)
Step 2: Define pre-trained model InceptionV3 as follows:
base_model = InceptionV3(input_tensor=input_model.output, weights='imagenet', include_top=False)
Step 3: Combine both the models as follows:
combined_model = Model(inputs=input_model.input, outputs=base_model.output)
The advantage of this method is that the Keras model itself takes care of image processing such as channel replication at runtime. Thus, we need not replicate the image channels ourselves with numpy, and the result is memory efficient.
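If you prefer the numpy route mentioned in Edit2 (at the cost of roughly 3x the memory), a minimal sketch would simply repeat the single channel three times along the last axis:

import numpy as np

# X has shape (780, 128, 128, 1); repeating the single channel
# three times along the channel axis gives (780, 128, 128, 3)
X_rgb = np.repeat(X, 3, axis=-1)
print(X_rgb.shape)  # (780, 128, 128, 3)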
I am using VGG16 with Keras for transfer learning (I have 7 classes in my new model), and as such I want to use the built-in decode_predictions method to output the predictions of my model. However, using the following code:
preds = model.predict(img)
decode_predictions(preds, top=3)[0]
I receive the following error message:
ValueError: decode_predictions expects a batch of predictions (i.e. a 2D array of shape (samples, 1000)). Found array with shape: (1, 7)
Now I wonder why it expects 1000 when I only have 7 classes in my retrained model.
A similar question I found here on Stack Overflow (Keras: ValueError: decode_predictions expects a batch of predictions) suggests including include_top=True in the model definition to solve this problem:
model = VGG16(weights='imagenet', include_top=True)
I have tried this, however it is still not working - giving me the same error as before. Any hint or suggestion on how to solve this issue is highly appreciated.
I suspect you are using some pre-trained model, say for instance ResNet50, and you are importing decode_predictions like this:
from keras.applications.resnet50 import decode_predictions
decode_predictions transforms an array of (num_samples, 1000) probabilities into the class names of the original ImageNet classes.
If you want to do transfer learning and classify between 7 different classes, you need to do it like this:
base_model = ResNet50(weights='imagenet', include_top=False)
# add a global spatial average pooling layer
x = base_model.output
x = GlobalAveragePooling2D()(x)
# add a fully-connected layer
x = Dense(1024, activation='relu')(x)
# and a logistic layer -- let's say we have 7 classes
predictions = Dense(7, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=predictions)
...
After fitting the model and calculating predictions, you have to manually map each output index to its class name instead of using the imported decode_predictions.
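For example, a minimal sketch of that manual mapping (assuming a hypothetical class_names list holding your 7 labels in the same order as the one-hot training targets):

import numpy as np

# Hypothetical list of your 7 class names, ordered like the training targets
class_names = ['class_0', 'class_1', 'class_2', 'class_3', 'class_4', 'class_5', 'class_6']

preds = model.predict(img)              # shape (1, 7)
top3 = np.argsort(preds[0])[-3:][::-1]  # indices of the 3 highest probabilities
for i in top3:
    print(class_names[i], preds[0][i])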
Overload the decode_predictions function and comment out the 1000-classes constraint of the original function:
import json

# Assumes CLASS_INDEX_PATH, get_submodules_from_kwargs and the
# keras_modules_injection decorator are available, as in the original
# keras_applications / keras.applications utils modules.
CLASS_INDEX = None

#keras_modules_injection
def test_my_decode_predictions(*args, **kwargs):
    return my_decode_predictions(*args, **kwargs)


def my_decode_predictions(preds, top=5, **kwargs):
    global CLASS_INDEX
    backend, _, _, keras_utils = get_submodules_from_kwargs(kwargs)

    # if len(preds.shape) != 2 or preds.shape[1] != 1000:
    #     raise ValueError('`decode_predictions` expects '
    #                      'a batch of predictions '
    #                      '(i.e. a 2D array of shape (samples, 1000)). '
    #                      'Found array with shape: ' + str(preds.shape))
    if CLASS_INDEX is None:
        fpath = keras_utils.get_file(
            'imagenet_class_index.json',
            CLASS_INDEX_PATH,
            cache_subdir='models',
            file_hash='c2c37ea517e94d9795004a39431a14cb')
        with open(fpath) as f:
            CLASS_INDEX = json.load(f)
    results = []
    for pred in preds:
        top_indices = pred.argsort()[-top:][::-1]
        result = [tuple(CLASS_INDEX[str(i)]) + (pred[i],) for i in top_indices]
        result.sort(key=lambda x: x[2], reverse=True)
        results.append(result)
    return results


print('Predicted: ', test_my_decode_predictions(pred, top=10))