I'm doing a deep learning project in which I'll train a classifier on the SVHN dataset.
Initially, I had train_images and test_images with shapes (73257, 32, 32, 3) and (26032, 32, 32, 3), respectively, and they display normally using plt.imshow().
I converted the images to greyscale using the following code:
f = lambda img: tf.expand_dims(np.average(img, axis=-1), axis=-1).numpy()
train_images = f(train_images)
test_images = f(test_images)
I tried, as well:
train_images = tf.image.rgb_to_grayscale(train_images).numpy()
test_images = tf.image.rgb_to_grayscale(test_images).numpy()
In both cases, each item of train_images and test_images ends up with shape (32, 32, 1), so the new array shapes are (73257, 32, 32, 1) and (26032, 32, 32, 1), respectively.
But, for some reason, when I try to display these new greyscale images, I get the following error: TypeError: Invalid dimensions for image data.
I have no idea why.
I solved the problem by using the following function for transforming to grayscale:
f = lambda img: np.mean(img, axis=-1, keepdims=True)
Then displayed the image using:
img = np.squeeze(array)
plt.imshow(img, cmap='Greys')
where array is any item of train_images or test_images. The np.squeeze call matters because plt.imshow accepts arrays of shape (M, N) or (M, N, 3)/(M, N, 4), but not (M, N, 1), which is why the unsqueezed grayscale images raised the TypeError above.
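For completeness, here is a minimal, self-contained sketch (the dummy array stands in for train_images) showing that either grayscale conversion displays fine once the trailing channel axis is squeezed away:
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

# dummy RGB batch standing in for train_images: (N, 32, 32, 3)
train_images = np.random.randint(0, 256, size=(8, 32, 32, 3), dtype=np.uint8)

# either conversion gives items of shape (32, 32, 1)
gray = tf.image.rgb_to_grayscale(train_images).numpy()
# gray = np.mean(train_images, axis=-1, keepdims=True)  # unweighted alternative, as in the question

# drop the trailing channel axis before plotting
plt.imshow(np.squeeze(gray[0]), cmap='gray')
plt.show()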
Related
I found a method for resizing the MNIST training dataset from (60000, 28, 28) to (60000, 14, 14).
This is the code and results:
import tensorflow as tf
import numpy as np
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train[..., np.newaxis], x_test[..., np.newaxis]
x_train_small = tf.image.resize(x_train, (14,14)).numpy()
x_test_small = tf.image.resize(x_test, (14,14)).numpy()
print(x_train.shape)
print(x_test.shape)
print(x_train_small.shape)
print(x_test_small.shape)
>>>(60000, 28, 28, 1)
>>>(10000, 28, 28, 1)
>>>(60000, 14, 14, 1)
>>>(10000, 14, 14, 1)
I'm confused about why I have to add a new axis to get the shape I want.
I would like to know whether there is another way to do the resizing without adding a new axis.
This is all described in the docs:
The first argument of resize is images: "4-D Tensor of shape [batch, height, width, channels] or 3-D Tensor of shape [height, width, channels]."
The second is size: "A 1-D int32 Tensor of 2 elements: new_height, new_width. The new size for the images."
Conclusion: You need the fourth dimension because that is the channels dimension, which tf.image.resize expects in any case. Its size is 1 because the MNIST images are grayscale.
Of course you could use some other library to resize, but personally I would avoid unnecessary dependencies, just for the sake of cleanliness.
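If you really want to end up with the original 3-D shape, one option (a small sketch based on the code above) is to add the channel axis just for the resize call and squeeze it away afterwards:
import numpy as np
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()   # x_train: (60000, 28, 28)
x_train_small = tf.image.resize(x_train[..., np.newaxis], (14, 14)).numpy()  # (60000, 14, 14, 1)
x_train_small = np.squeeze(x_train_small, axis=-1)                           # back to (60000, 14, 14)
print(x_train_small.shape)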
I want to predict from an image URL. In the past, I used the ImageDataGenerator().flow_from_directory() method, but now I have only one image, so I want to predict from this single image.
I tried the code below, but it failed with a dimension error.
url = "http://3.36.149.28/uploads/WEBUPLOADprofile.png"
img = Image.open(requests.get(url, stream=True).raw)
img = img_to_array(img)
img = img/255.
#Predict
pred = model.predict(img)
So I tried reshaping and retrying, but that failed as well (cannot reshape array of size 1048576 into shape (28,28,1)):
img = img.reshape(-1, 28, 28, 1)
img = img/255.
#Predict
pred = model.predict(img)
What can I do to reshape the input correctly and get a color prediction image? Please help.
Additional info: I trained an SRCNN model, and its inputs are:
inputs = Input((None, None, 3), dtype='float')
I resolved this problem.
First, my URL image has shape (None, None, 4) (four channels), but my model was trained with input shape (None, None, 3).
So I tried another JPG image with shape (None, None, 3) and expanded its dimensions via NumPy,
and the resulting shape is (1, None, None, 3):
image = np.expand_dims(image, axis=0)
model.predict(image)
(The np.expand_dims trick came from a linked answer.)
Now I can get the predicted image successfully.
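For reference, a minimal sketch of the whole pipeline (the URL is a placeholder, and model is assumed to be the trained SRCNN with Input((None, None, 3))), including dropping the alpha channel so a four-channel PNG also works:
import numpy as np
import requests
from PIL import Image

url = "http://example.com/some_image.png"  # placeholder URL
img = Image.open(requests.get(url, stream=True).raw).convert('RGB')  # drop the alpha channel if present
img = np.asarray(img, dtype='float32') / 255.
img = np.expand_dims(img, axis=0)  # (height, width, 3) -> (1, height, width, 3)
pred = model.predict(img)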
I am using transfer learning for object recognition. I used a pretrained VGG16 model as the base model and added my classifier on top of it using Keras. I then trained the model on my data, and it works well. Now I want to see the features generated by the intermediate layers of the model for given data. I used the following code for this purpose:
def ModeloutputAtthisLayer(model, layernme, imgnme, width, height):
    layer_name = layernme
    intermediate_layer_model = Model(inputs=model.input,
                                     outputs=model.get_layer(layer_name).output)
    img = image.load_img(imgnme, target_size=(width, height))
    imageArray = image.img_to_array(img)
    image_batch = np.expand_dims(imageArray, axis=0)
    processed_image = preprocess_input(image_batch.copy())
    intermediate_output = intermediate_layer_model.predict(processed_image)
    print("output shape of", layernme, "is", intermediate_output.shape)
In the code, I used np.expand_dims to add one extra dimension for the batch, as the input to the network should be of the form (batch_size, height, width, channels). This code works fine. The shape of the extracted features is (1, 224, 224, 64).
Now I wish to display this as an image. I understand an additional batch dimension was added, so I should remove it. To do this, I used the following lines of code:
imge = np.squeeze(intermediate_output, axis=0)
plt.imshow(imge)
However, it throws an error:
"Invalid dimensions for image data"
How can I display the extracted features as an image? Any suggestions, please.
Your feature shape is (1, 224, 224, 64); you cannot directly plot a 64-channel image. What you can do is plot the individual channels independently, like the following:
import math
import matplotlib.pyplot as plt

imge = np.squeeze(intermediate_output, axis=0)
filters = imge.shape[2]
plt.figure(1, figsize=(32, 32))  # figure large enough to hold all channels
n_columns = 8
n_rows = math.ceil(filters / n_columns) + 1
for i in range(filters):
    plt.subplot(n_rows, n_columns, i+1)
    plt.title('Filter ' + str(i))
    plt.imshow(imge[:, :, i], interpolation="nearest", cmap="gray")
This will plot 64 images in 8 rows and 8 columns.
A possible approach is to combine the 64 channels into a single-channel image through a weighted sum, like this:
weighted_imge = np.sum(imge*weights, axis=-1)
where weights is an array with 64 weighting coefficients.
If you wish to give all the channels the same weight you could simply compute the average:
weighted_imge = np.mean(imge, axis=-1)
Demo
import numpy as np
import matplotlib.pyplot as plt
intermediate_output = np.random.randint(size=(1, 224, 224, 64),
                                        low=0, high=2**8, dtype=np.uint8)
imge = np.squeeze(intermediate_output, axis=0)
weights = np.random.random(size=(imge.shape[-1],))
weighted_imge = np.sum(imge*weights, axis=-1)
plt.imshow(weighted_imge)
plt.colorbar()
In [33]: intermediate_output.shape
Out[33]: (1, 224, 224, 64)
In [34]: imge.shape
Out[34]: (224, 224, 64)
In [35]: weights.shape
Out[35]: (64,)
In [36]: weighted_imge.shape
Out[36]: (224, 224)
I have a 4-dimensional tensor of image pixel data (14000 examples, each with red, green, and blue channels of shape (height, width)) and a CSV file containing the bounding-box coordinates of each image, i.e. (Image name, X1, Y1, X2, Y2); it also has 14000 rows, one per example.
How do I feed this data to my neural network? Currently, if I try feeding the tensor, it matches the entire array of 14000 examples against a single row of (X1, Y1, X2, Y2), whereas it should pass one image array per row of (X1, Y1, X2, Y2).
Any idea how to fix this?
Here's the code and the associated error:
train_csv = pd.read_csv('datasets/training.csv').values
test_csv = pd.read_csv('datasets/test.csv').values
y_train = train_csv[:,[1,2,3,4]] #done
x_train_names = train_csv[:,0] #obtained names of images in array
#### load images into an array ####
X_train = []
path = "datasets/images/images/"
imagelist = listdir(path)
for i in range(len(x_train_names)):
    img_name = x_train_names[i]
    img = Image.open(path + str(img_name))
    arr = array(img)
    X_train.append(arr)
#### building a very basic classifier, just to get some result ####
classifier = Sequential()
classifier.add(Convolution2D(64, (3,3), input_shape=(64,64,3), activation='relu'))
classifier.add(Dropout(0.2))
classifier.add(MaxPooling2D((4,4)))
classifier.add(Convolution2D(32,(2,2), activation = 'relu'))
classifier.add(MaxPooling2D((2,2)))
classifier.add(Flatten())
classifier.add(Dense(16, activation = 'relu'))
classifier.add(Dropout(0.5))
classifier.add(Dense(4))
classifier.compile('adam','binary_crossentropy',['accuracy'])
classifier.fit(x=X_train, y=y_train, steps_per_epoch=80, batch_size=32, epochs=25)
Error:
ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 array(s), but instead got the following list of 14000 arrays:
[array([[[141, 154, 144],
[141, 154, 144],
[141, 154, 144],
...,
[149, 159, 150],
[150, 160, 151],
[150, 160, 151]],
[[140, 153, 143],
[…
EDIT: I converted all my images to grayscale so I don't get a memory error. This means that my X_train should have a single channel instead of the earlier three (RGB). Here's my edited code:
y_train = train_csv[:,[1,2,3,4]] #done
x_train_names = train_csv[:,0] #obtained names of images in array
# load images into an array
path = "datasets/images/images/"
imagelist = listdir(path)
img_name = x_train_names[0]
X_train = np.ndarray((14000,img.height,img.width,1))
for i in range(len(x_train_names)):
    img_name = x_train_names[i]
    img = Image.open(path + str(img_name)).convert('L')
    # converting the image to grayscale because I get a memory error otherwise
    X_train[i,:,:,:] = np.asarray(img)
ValueError: could not broadcast input array from shape (480,640) into shape (480,640,1)
(At X_train[i,:,:,:] = np.asarray(img) line)
The first step is always to find out which input shape your first convolution layer expects. The documentation of tf.nn.conv2d states that the expected shape of the 4D input tensor is [batch, in_height, in_width, in_channels].
To load the data we can use a NumPy ndarray. For that we need to know how many images to load, as well as their dimensions:
path = "datasets/images/images/"
imagelist = listdir(path)
img_name = x_train_names[0]
img = Image.open(path + str(img_name))
X_train = np.ndarray((len(imagelist),img.height,img.width,3))
for i in range(len(x_train_names)):
    img_name = x_train_names[i]
    img = Image.open(path + str(img_name))
    X_train[i,:,:,:] = np.asarray(img)
The shape property of your X_train array should then give you:
print(X_train.shape)
> (len(x_train_names), img.height, img.width, 3)
EDIT:
To load the images in multiple batches you could do something like this:
#### Build and compile your classifier up here ####
num_batches = 5
len_batch = np.floor(len(x_train_names)/num_batches).astype(int)
X_train = np.ndarray((len_batch, img.height, img.width, 3))

for batch_idx in range(num_batches):
    idx_start = batch_idx*len_batch
    idx_end = (batch_idx+1)*len_batch
    x_train_names_batch = x_train_names[idx_start:idx_end]
    for i in range(len(x_train_names_batch)):
        img_name = x_train_names_batch[i]
        img = Image.open(path + str(img_name))
        X_train[i,:,:,:] = np.asarray(img)
    # fit on this batch, using the matching slice of the labels
    classifier.fit(x=X_train, y=y_train[idx_start:idx_end], steps_per_epoch=num_batches, batch_size=len(x_train_names_batch), epochs=2)
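As a side note on the broadcast error from the grayscale EDIT: one possible fix (a sketch, assuming the same loading loop) is to add the missing channel axis explicitly when assigning, since np.asarray on an 'L'-mode image gives shape (height, width):
img = Image.open(path + str(img_name)).convert('L')   # grayscale image, shape (height, width)
X_train[i,:,:,:] = np.asarray(img)[..., np.newaxis]   # add channel axis -> (height, width, 1)
# or equivalently: X_train[i,:,:,0] = np.asarray(img)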
I'm loading the MNIST dataset as follows:
(X_train, y_train), (X_test, y_test) = mnist.load_data()
However, since I need to load and train on my own dataset, I wrote the little script below, which gives the exact train and test values:
def load_train(path):
    X_train = []
    y_train = []
    print('Read train images')
    for j in range(10):
        files = glob(path + "*.jpeg")
        for fl in files:
            img = get_im(fl)
            print(fl)
            X_train.append(img)
            y_train.append(j)
    return np.asarray(X_train), np.asarray(y_train)
The pretrained model generates a NumPy array of shape (64, 28, 28, 1) while training. I'm concatenating the image_batch with the generated images as follows:
X = np.concatenate((image_batch, generated_images))
However, I'm getting the following error:
ValueError: all the input arrays must have same number of dimensions
img_batch is of size (64, 28, 28)
generated_images is of size (64, 28, 28, 1)
How do I expand the dimensions of the img_batch in X_train so that I can concatenate it with generated_images? Or is there any other way to load the custom images in place of mnist.load_data()?
NumPy has a function called np.expand_dims() which can expand the dimensions of any array along the axis provided as an argument. In your case, use img_batch = np.expand_dims(img_batch, axis=3).
Another approach would be to use the reshape function, as suggested by @Ioannis Nasios: img_batch = img_batch.reshape(64, 28, 28, 1)
image_batch = image_batch.reshape(64, 28, 28, 1)
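A minimal sketch (random arrays stand in for the real image_batch and generated_images) showing that the concatenation works once the dimensions match:
import numpy as np

image_batch = np.random.rand(64, 28, 28)           # placeholder for the loaded image batch
generated_images = np.random.rand(64, 28, 28, 1)   # placeholder for the generator output

image_batch = np.expand_dims(image_batch, axis=3)  # (64, 28, 28) -> (64, 28, 28, 1)
X = np.concatenate((image_batch, generated_images))
print(X.shape)  # (128, 28, 28, 1)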