I was reading the Neural Style Transfer paper, in which the author reconstructs an image from the outputs of the layers of VGG19. I am using Keras. The output of the block1_conv1 layer has shape (1, 400, 533, 64), where 1 is the number of input images, 400 the number of rows, 533 the number of columns, and 64 the number of channels. When I try to reconstruct this as an image, I get an error because the output has 13,644,800 values, which is not a multiple of 3, so I can't display the image in three channels. How can I reconstruct this image?
I want to reconstruct images from layers as shown below:
Below is the code for the same:
from keras.preprocessing.image import load_img, img_to_array
from scipy.misc import imsave
import numpy as np
import os
from keras.applications import vgg19
from keras import backend as K

CONTENT_IMAGE_FN = "..."  # path to the input image goes here

def preprocess_image(image_path):
    img = load_img(image_path, target_size=(img_nrows, img_ncols))
    img = img_to_array(img)
    img = np.expand_dims(img, axis=0)
    img = vgg19.preprocess_input(img)
    return img
width, height = load_img(CONTENT_IMAGE_FN).size
img_nrows = 400
img_ncols = int(width * img_nrows / height)

base_image = K.variable(preprocess_image(CONTENT_IMAGE_FN))

RESULT_DIR = "generated/"
RESULT_PREFIX = RESULT_DIR + "gen"
if not os.path.exists(RESULT_DIR):
    os.makedirs(RESULT_DIR)
result_prefix = RESULT_PREFIX

# this will contain our generated image
if K.image_data_format() == 'channels_first':
    combination_image = K.placeholder((1, 3, img_nrows, img_ncols))
else:
    combination_image = K.placeholder((1, img_nrows, img_ncols, 3))
x = preprocess_image(CONTENT_IMAGE_FN)

# model is the VGG19 model built on combination_image (its construction is not shown in this snippet)
outputs_dict = dict([(layer.name, layer.output) for layer in model.layers])

feature_layers = ['block1_conv1', 'block2_conv1',
                  'block3_conv1', 'block4_conv1',
                  'block5_conv1']
outputs = []
for layer_name in feature_layers:
    outputs.append(outputs_dict[layer_name])

functor = K.function([combination_image], outputs)  # evaluation function

# Testing
test = x
layer_outs = functor([test])
print(layer_outs)
layer_outs[0].reshape(400, -1, 3)  # getting error here
I am getting following error:
ValueError: cannot reshape array of size 13644800 into shape (400,newaxis,3)
You wrote:
"The size of output of block1_conv1 layer is (1, 400, 533, 64). Here 1
is number of images as input, 400 is number of rows, 533 number of
columns and 64 number of channels"
But this is not quite right: the 64 values along the last axis are not displayable color channels but the feature maps produced by the 64 filters of block1_conv1, and the leading 1 is the batch dimension. The error occurs because you try to reshape the 400 * 533 * 64 = 13,644,800 values of that output into a 3-channel image, which is not possible since 13,644,800 is not divisible by 400 * 3.
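What you can do instead is look at the 64 feature maps one at a time. A minimal sketch (assuming layer_outs[0] is the block1_conv1 output from your functor and matplotlib is available):
import matplotlib.pyplot as plt

feat = layer_outs[0][0]                  # drop the batch dimension -> (400, 533, 64)
plt.imshow(feat[:, :, 0], cmap='gray')   # display the first of the 64 feature maps
plt.show()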
Furthermore, you have to pass a 3-channel input:
From the VGG19 code:
input_shape: optional shape tuple, only to be specified
if include_top is False (otherwise the input shape
has to be (224, 224, 3)
(with channels_last data format)
or (3, 224, 224) (with channels_first data format).
It should have exactly 3 inputs channels,
and width and height should be no smaller than 32.
E.g. (200, 200, 3) would be one valid value.
Thus your input images have to have 3 channels. If you want to feed 1-channel (grayscale) images to VGG19, you should repeat the single channel. If channels first:
X = np.repeat(X, 3, axis=0)
or
X = np.repeat(X, 3, axis=2)
if channels last without a batch dimension, or
X = np.repeat(X, 3, axis=3)
if channels last with a batch dimension.
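Putting the channels-last case together (just a sketch; gray here is a placeholder standing in for your grayscale image array):
import numpy as np
from keras.applications import vgg19

gray = np.random.rand(400, 533).astype("float32") * 255.0   # placeholder grayscale image
x = gray[np.newaxis, :, :, np.newaxis]   # add batch and channel axes -> (1, 400, 533, 1)
x = np.repeat(x, 3, axis=3)              # repeat the channel -> (1, 400, 533, 3)
x = vgg19.preprocess_input(x)            # now a valid 3-channel VGG19 input
print(x.shape)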
If you provide more information about the actual dimensions of your input images and their type (grayscale or RGB), I can help further.
Related
I have to apply tf.image.crop_and_resize on my images and want to generate 5 boxes from each image. I have written the code below, which works fine:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
import matplotlib.pyplot as plt
import numpy as np
# Load the pre-trained Xception model to be used as the base encoder.
xception = keras.applications.Xception(
    include_top=False, weights="imagenet", pooling="avg"
)
# Set the trainability of the base encoder.
for layer in xception.layers:
    layer.trainable = False
# Receive the images as inputs.
inputs = layers.Input(shape=(299, 299, 3), name="image_input")
input ='/content/1.png'
input = tf.keras.preprocessing.image.load_img(input,target_size=(299,299,3))
image = tf.expand_dims(np.asarray(input)/255, axis=0)
BATCH_SIZE = 1
NUM_BOXES = 5
IMAGE_HEIGHT = 256
IMAGE_WIDTH = 256
CHANNELS = 3
CROP_SIZE = (24, 24)
boxes = tf.random.uniform(shape=(NUM_BOXES, 4))
box_indices = tf.random.uniform(shape=(NUM_BOXES,), minval=0, maxval=BATCH_SIZE, dtype=tf.int32)
output = tf.image.crop_and_resize(image, boxes, box_indices, CROP_SIZE)
xception_input = tf.keras.applications.xception.preprocess_input(output)
The above code works fine; however, when I want to display these boxes I run the code below:
for i in range(5):
    # define subplot
    plt.subplot(330 + 1 + i)
    # generate batch of images
    batch = xception_input.next()
    # convert to unsigned integers for viewing
    image = batch[0].astype('uint8')
    image = np.reshape(24,24,3)
    # plot raw pixel data
    plt.imshow(image)
# show the figure
plt.show()
But it generates this error: AttributeError: 'tensorflow.python.framework.ops.EagerTensor' object has no attribute 'next'.
You have to use [i] instead of .next().
There is also a problem with converting it to uint8 (but it doesn't need a reshape):
for i in range(5):
    plt.subplot(331 + i)
    tensor = xception_input[i]
    #print(tensor)
    tensor = tensor*255
    image = np.array(tensor, dtype=np.uint8)
    #print(image)
    plt.imshow(image)
or use a for loop to get the items:
for i, tensor in enumerate(xception_input):
    #print(tensor)
    plt.subplot(331 + i)
    tensor = tensor*255
    image = np.array(tensor, dtype=np.uint8)
    #print(image)
    plt.imshow(image)
I don't know what your code is supposed to produce, but this gives me empty images, because the tensor has values like -0.9 and casting them to uint8 turns them all into 0.
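If those dark images come from xception.preprocess_input (which scales pixels to the range [-1, 1]), one way to get viewable crops is to map the values back to [0, 255] before casting. A minimal sketch under that assumption:
import numpy as np
import matplotlib.pyplot as plt

for i, tensor in enumerate(xception_input):
    plt.subplot(331 + i)
    pixels = (tensor.numpy() + 1.0) * 127.5   # map [-1, 1] back to [0, 255]
    plt.imshow(pixels.astype(np.uint8))
plt.show()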
I'm trying to load a dataset into TensorFlow, preprocess it, and then create batches to feed to a GAN, but for some reason some of the images have 4 channels: Cannot add tensor to the batch: number of elements does not match. Shapes are: [tensor]: [128,128,4], [batch]: [128,128,3] [Op:IteratorGetNext]
This is the function that preprocesses the data and then adds it to a batch:
BATCH_SIZE = 32

def map_images(file):
    img = tf.io.decode_jpeg(tf.io.read_file(file))
    img = tf.dtypes.cast(img, tf.float32)
    img = tf.image.resize(img, size=[128, 128])
    img = img / 255.0
    reimg = tf.reshape(img, [128, 128, 3])
    return reimg

# create training batches
filename_dataset = tf.data.Dataset.list_files("/content/drive/MyDrive/Dataset/Damage type dataset/Damage type/Broken Glass/*.JPG")
image_dataset = filename_dataset.map(map_images).batch(BATCH_SIZE)
I solved it by setting the channels parameter to 3 in tf.io.decode_jpeg; its default is 0 (use the number of channels in the JPEG file), so setting it to 3 forces all the images to be decoded with exactly 3 channels.
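For reference, a minimal sketch of the same mapping function with that fix applied (the dataset path is a placeholder):
import tensorflow as tf

def map_images(file):
    # channels=3 forces RGB decoding, so any alpha channel is dropped
    img = tf.io.decode_jpeg(tf.io.read_file(file), channels=3)
    img = tf.cast(img, tf.float32)
    img = tf.image.resize(img, size=[128, 128])
    return img / 255.0

filename_dataset = tf.data.Dataset.list_files("path/to/images/*.JPG")
image_dataset = filename_dataset.map(map_images).batch(32)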
I am using transfer learning for recognizing objects. I used a pretrained VGG16 model as the base model and added my classifier on top of it using Keras. I then trained the model on my data, and the model works well. I want to see the features generated by the intermediate layers of the model for the given data. I used the following code for this purpose:
def ModeloutputAtthisLayer(model, layernme, imgnme, width, height):
    layer_name = layernme
    intermediate_layer_model = Model(inputs=model.input,
                                     outputs=model.get_layer(layer_name).output)
    img = image.load_img(imgnme, target_size=(width, height))
    imageArray = image.img_to_array(img)
    image_batch = np.expand_dims(imageArray, axis=0)
    processed_image = preprocess_input(image_batch.copy())
    intermediate_output = intermediate_layer_model.predict(processed_image)
    print("outshape of ", layernme, "is ", intermediate_output.shape)
In the code, I used np.expand_dims to add an extra dimension for the batch, as the input to the network should be of the form (batchsize, height, width, channels). This code works fine. The shape of the feature vector is (1, 224, 224, 64).
Now I wish to display this as an image. I understand that an additional batch dimension was added, so I should remove it. Following this, I used the following lines of code:
imge = np.squeeze(intermediate_output, axis=0)
plt.imshow(imge)
However it throws an error:
"Invalid dimensions for image data"
I wonder how I can display the extracted feature vector as an image. Any suggestions, please?
Your feature shape is (1, 224, 224, 64); you cannot directly plot a 64-channel image. What you can do is plot the individual channels independently, like the following:
import math
import matplotlib.pyplot as plt

imge = np.squeeze(intermediate_output, axis=0)
filters = imge.shape[2]

plt.figure(1, figsize=(32, 32))  # plot image of size (32x32)
n_columns = 8
n_rows = math.ceil(filters / n_columns) + 1
for i in range(filters):
    plt.subplot(n_rows, n_columns, i+1)
    plt.title('Filter ' + str(i))
    plt.imshow(imge[:,:,i], interpolation="nearest", cmap="gray")
This will plot 64 images in 8 rows and 8 columns.
A possible way to go is to combine the 64 channels into a single-channel image through a weighted sum like this:
weighted_imge = np.sum(imge*weights, axis=-1)
where weights is an array with 64 weighting coefficients.
If you wish to give all the channels the same weight you could simply compute the average:
weighted_imge = np.mean(imge, axis=-1)
Demo
import numpy as np
import matplotlib.pyplot as plt

intermediate_output = np.random.randint(size=(1, 224, 224, 64),
                                        low=0, high=2**8, dtype=np.uint8)
imge = np.squeeze(intermediate_output, axis=0)
weights = np.random.random(size=(imge.shape[-1],))
weighted_imge = np.sum(imge*weights, axis=-1)

plt.imshow(weighted_imge)
plt.colorbar()
In [33]: intermediate_output.shape
Out[33]: (1, 224, 224, 64)
In [34]: imge.shape
Out[34]: (224, 224, 64)
In [35]: weights.shape
Out[35]: (64,)
In [36]: weighted_imge.shape
Out[36]: (224, 224)
I have a 4-dimensional tensor of image pixel data (Red (height, width), Green (height, width), Blue (height, width), 14000 examples) and a CSV file containing the coordinates of the bounding box for each image, i.e. (Image name, X1, Y1, X2, Y2); it also has 14000 rows, one for each example.
How do I feed this data to my neural network? Currently, if I try feeding the tensor, it passes the entire array of 14000 examples against one row of (X1, Y1, X2, Y2), whereas it should pass one image array per row of (X1, Y1, X2, Y2).
Any idea how to fix this?
Here's the code and the associated error:
import numpy as np
import pandas as pd
from os import listdir
from numpy import array
from PIL import Image
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Flatten, Dense, Dropout

train_csv = pd.read_csv('datasets/training.csv').values
test_csv = pd.read_csv('datasets/test.csv').values
y_train = train_csv[:,[1,2,3,4]] #done
x_train_names = train_csv[:,0] #obtained names of images in array

#### load images into an array ####
X_train = []
path = "datasets/images/images/"
imagelist = listdir(path)
for i in range(len(x_train_names)):
    img_name = x_train_names[i]
    img = Image.open(path + str(img_name))
    arr = array(img)
    X_train.append(arr)

#### building a very basic classifier, just to get some result ####
classifier = Sequential()
classifier.add(Convolution2D(64, (3,3), input_shape=(64,64,3), activation='relu'))
classifier.add(Dropout(0.2))
classifier.add(MaxPooling2D((4,4)))
classifier.add(Convolution2D(32, (2,2), activation='relu'))
classifier.add(MaxPooling2D((2,2)))
classifier.add(Flatten())
classifier.add(Dense(16, activation='relu'))
classifier.add(Dropout(0.5))
classifier.add(Dense(4))
classifier.compile('adam', 'binary_crossentropy', ['accuracy'])
classifier.fit(x=X_train, y=y_train, steps_per_epoch=80, batch_size=32, epochs=25)
Error:
ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 array(s), but instead got the following list of 14000 arrays:
[array([[[141, 154, 144],
[141, 154, 144],
[141, 154, 144],
...,
[149, 159, 150],
[150, 160, 151],
[150, 160, 151]],
[[140, 153, 143],
[…
EDIT: I converted all my images to grayscale so that I don't get a memory error. This means that my X_train should have 1 channel dimension (earlier it was RGB). Here's my edited code:
y_train = train_csv[:,[1,2,3,4]] #done
x_train_names = train_csv[:,0] #obtained names of images in array

# load images into an array
path = "datasets/images/images/"
imagelist = listdir(path)
img_name = x_train_names[0]
X_train = np.ndarray((14000, img.height, img.width, 1))
for i in range(len(x_train_names)):
    img_name = x_train_names[i]
    img = Image.open(path + str(img_name)).convert('L')
    # converting the image to grayscale because I get a memory error otherwise
    X_train[i,:,:,:] = np.asarray(img)
ValueError: could not broadcast input array from shape (480,640) into shape (480,640,1)
(at the X_train[i,:,:,:] = np.asarray(img) line)
The first step is always to find out which input shape your first convolution layer expects. The documentation of tf.nn.conv2d states that the expected shape of the 4D input tensor is [batch, in_height, in_width, in_channels].
To load the data we can use a numpy ndarray. For that we should know the number of images you want to load, as well as the dimensions of the images:
path = "datasets/images/images/"
imagelist = listdir(path)
img_name = x_train_names[0]
img = Image.open(path + str(img_name))
X_train = np.ndarray((len(imagelist),img.height,img.width,3))
for i in range(len(x_train_names)):
img_name = x_train_names[i]
img = Image.open(path + str(img_name))
X_train[i,:,:,:] = np.asarray(img)
The shape property of your X_train tensor should then give you:
print(X_train.shape)
> (len(x_train_names), img.height, img.width, 3)
EDIT:
To load the images in multiple batches you could do something like this:
#### Build and compile your classifier up here ####
num_batches = 5
len_batch = np.floor(len(x_train_names)/num_batches).astype(int)
X_train = np.ndarray((len_batch, img.height, img.width, 3))

for batch_idx in range(num_batches):
    idx_start = batch_idx*len_batch
    idx_end = (batch_idx+1)*len_batch
    x_train_names_batch = x_train_names[idx_start:idx_end]
    for i in range(len(x_train_names_batch)):
        img_name = x_train_names_batch[i]
        img = Image.open(path + str(img_name))
        X_train[i,:,:,:] = np.asarray(img)
    # slice y_train so the labels match the images of the current batch
    classifier.fit(x=X_train, y=y_train[idx_start:idx_end], steps_per_epoch=num_batches, batch_size=len(x_train_names_batch), epochs=2)
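For the grayscale version in your EDIT, the broadcast error comes from assigning a (480, 640) array into a (480, 640, 1) slot. A minimal sketch of one way around it (assuming the images are 480x640 after .convert('L')):
import numpy as np
from PIL import Image

X_train = np.ndarray((14000, 480, 640, 1))
for i in range(len(x_train_names)):
    img = Image.open(path + str(x_train_names[i])).convert('L')
    # add the missing channel axis so the shapes match
    X_train[i, :, :, :] = np.asarray(img)[:, :, np.newaxis]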
I have the same question: Input to reshape is a tensor with 37632 values, but the requested shape has 150528.
writer = tf.python_io.TFRecordWriter("/home/henson/Desktop/vgg/test.tfrecords")  # the file to generate

for index, name in enumerate(classes):
    class_path = cwd + name + '/'
    for img_name in os.listdir(class_path):
        img_path = class_path + img_name  # path of each image
        img = Image.open(img_path)
        img = img.resize((224, 224))
        img_raw = img.tobytes()  # convert the image to bytes
        example = tf.train.Example(features=tf.train.Features(feature={
            "label": tf.train.Feature(int64_list=tf.train.Int64List(value=[index])),
            'img_raw': tf.train.Feature(bytes_list=tf.train.BytesList(value=[img_raw]))
        }))  # the Example object wraps the label and the image data
        writer.write(example.SerializeToString())  # serialize to a string
writer.close()


def read_and_decode(filename):  # read in the .tfrecords file
    filename_queue = tf.train.string_input_producer([filename])  # create a queue
    reader = tf.TFRecordReader()
    _, serialized_example = reader.read(filename_queue)  # returns the file name and the file
    features = tf.parse_single_example(serialized_example,
                                       features={
                                           'label': tf.FixedLenFeature([], tf.int64),
                                           'img_raw': tf.FixedLenFeature([], tf.string),
                                       })  # extract the image data and the label
    img = tf.decode_raw(features['img_raw'], tf.uint8)
    img = tf.reshape(img, [224, 224, 3])  # reshape into a 224*224 3-channel image
    img = tf.cast(img, tf.float32) * (1. / 255) - 0.5  # emit the img tensor into the pipeline
    label = tf.cast(features['label'], tf.int32)  # emit the label tensor into the pipeline
    print(img, label)
    return img, label


images, labels = read_and_decode("/home/henson/Desktop/vgg/TFrecord.tfrecords")
print(images, labels)

images, labels = tf.train.shuffle_batch([images, labels], batch_size=20, capacity=16*20, min_after_dequeue=8*20)
I thought I had resized the img to 224*224 and reshaped it to [224,224,3], but it doesn't work. How can I make it work?
The problem is basically related to the shapes in the CNN architecture. Say I defined the architecture shown in the picture; in the code we defined the weights and biases in the following way.
If we look at the weights, let's start with:
wc1: in this layer I defined 32 filters of size 3x3
wc2: in this layer I defined 64 filters of size 3x3
wc3: in this layer I defined 128 filters of size 3x3
wd1: 38*38*128 is the interesting one (where does it come from?)
In the architecture we also apply max pooling after each convolution. See the architecture picture at every step:
1. Let's say your input image is 300 x 300 x 1 (in the picture it is 28 x 28 x 1).
2. (With convolution stride 1) each of the 32 filters of size 3x3 is applied to the 300 x 300 x 1 picture, so after wc1 we have 32 feature maps of 300 x 300, i.e. an output of 300 x 300 x 32.
3. After max pooling (with stride 2, which is the usual choice) the size changes from 300 x 300 x 32 to 150 x 150 x 32.
4. (With convolution stride 1) each of the 64 filters of wc2 sees the 150 x 150 x 32 input, so after wc2 we have 64 feature maps of 150 x 150, i.e. 150 x 150 x 64.
5. After max pooling (stride 2) the size changes from 150 x 150 x 64 to 75 x 75 x 64.
6. (With convolution stride 1) each of the 128 filters of wc3 sees the 75 x 75 x 64 input, so after wc3 we have 128 feature maps of 75 x 75, i.e. 75 x 75 x 128.
7. Before the last max pooling, 75 x 75 is an odd size, so with padding='SAME' it is first padded to 76 x 76 (even); max pooling with stride 2 then changes the size from 76 x 76 x 128 to 38 x 38 x 128.
Now look at 'wd1' in the code picture: this is where 38*38*128 comes from.
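As a quick sanity check (a sketch of the same arithmetic, not from the original post), you can trace the spatial size through the three conv/pool blocks in a few lines of Python:
import math

size, channels = 300, 1
for filters in (32, 64, 128):
    # a 3x3 convolution with stride 1 and 'SAME' padding keeps the spatial size
    channels = filters
    # 2x2 max pooling with stride 2 and 'SAME' padding halves it, rounding up
    size = math.ceil(size / 2)
    print(size, size, channels)

print("wd1 input size:", size * size * channels)   # 38 * 38 * 128 = 184832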
I had the same error. The image was written with .tobytes() as uint8 data, so decoding it with tf.decode_raw(..., tf.float32) reads 4 bytes per value and yields only a quarter of the expected elements (37632 instead of 224*224*3 = 150528). So I changed my code from this:
image = tf.decode_raw(image_raw, tf.float32)
image = tf.reshape(image, [img_width, img_height, 3])
to this:
image = tf.decode_raw(image_raw, tf.uint8)
image = tf.reshape(image, [img_width, img_height, 3])
# The type is now uint8 but we need it to be float.
image = tf.cast(image, tf.float32)
In my case it was because there was a mismatch in my generate_tf_record data format: I serialized the image to a string instead of a byte list. I notice the difference between you and me is that you converted your image to bytes. Here's how I write my image to the tfrecord:
file_path, label = sample
image = Image.open(file_path)
image = image.resize((224, 224))
image_raw = np.array(image).tostring()

features = {
    'label': _int64_feature(class_map[label]),
    'text_label': _bytes_feature(bytes(label, encoding='utf-8')),
    'image': _bytes_feature(image_raw)
}
example = tf.train.Example(features=tf.train.Features(feature=features))
writer.write(example.SerializeToString())
Hope it helps.
I had the same error, and I found the reason behind it. It happens when the image stored with .tostring() is in float32 format, but you then decode the tfrecord with decode_raw(tf.uint8), which causes the mismatch error.
I solved it by changing the code to:
image = tf.decode_raw(image_raw, tf.float32)
or:
image = tf.image.decode_jpeg(image_raw, channels=3)
if your image_raw was originally in JPEG format.
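To illustrate the general point (a sketch, not from the original answers; the file path is a placeholder): the dtype used when writing the raw bytes must match the dtype passed to decode_raw when reading, otherwise the element count is off by the byte-width ratio (here 150528 uint8 bytes read as float32 give only 37632 values).
import numpy as np
import tensorflow as tf
from PIL import Image

# Writing: pick a dtype and stick to it
arr = np.array(Image.open("img.jpg").resize((224, 224)), dtype=np.uint8)
img_raw = arr.tobytes()                      # 224*224*3 = 150528 bytes

# Reading: decode with the SAME dtype, then cast if needed
img = tf.decode_raw(img_raw, tf.uint8)       # 150528 elements
img = tf.reshape(img, [224, 224, 3])
img = tf.cast(img, tf.float32) / 255.0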