I'm loading the MNIST dataset as follows:
(X_train, y_train), (X_test, y_test) = mnist.load_data()
However, since I need to load and train on my own dataset, I wrote the small script below, which gives the exact train and test values:
import numpy as np
from glob import glob

def load_train(path):
    X_train = []
    y_train = []
    print('Read train images')
    for j in range(10):
        # NOTE: presumably this should glob a per-class folder such as
        # path + str(j) + "/*.jpeg"; as written it re-reads the same files for every label j
        files = glob(path + "*.jpeg")
        for fl in files:
            img = get_im(fl)   # get_im is the asker's image-reading helper
            print(fl)
            X_train.append(img)
            y_train.append(j)
    return np.asarray(X_train), np.asarray(y_train)
The pretrained model generates a NumPy array of shape (64, 28, 28, 1) while training. I'm concatenating the image batch with the generated images as follows:
X = np.concatenate((image_batch, generated_images))
However, I'm getting the following error:
ValueError: all the input arrays must have same number of dimensions
image_batch is of shape (64, 28, 28)
generated_images is of shape (64, 28, 28, 1)
How do I expand the dimensions of image_batch (taken from X_train) so that it can be concatenated with generated_images? Or is there another way to load custom images in place of mnist.load_data()?
NumPy provides np.expand_dims(), which inserts a new axis into an array at the position given by its axis argument. In your case, use image_batch = np.expand_dims(image_batch, axis=3).
Another approach is to use reshape, as suggested by @Ioannis Nasios:
image_batch = image_batch.reshape(64, 28, 28, 1)
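As a minimal sketch of the whole fix (the random arrays below are only stand-ins for a batch drawn from X_train and for the generator output), either call gives both arrays four dimensions before concatenating:

import numpy as np

image_batch = np.random.rand(64, 28, 28)           # stand-in for a batch from X_train
generated_images = np.random.rand(64, 28, 28, 1)   # stand-in for the generator output

# add a trailing channel axis so both arrays are 4-D
image_batch = np.expand_dims(image_batch, axis=-1)     # (64, 28, 28, 1)

X = np.concatenate((image_batch, generated_images))    # (128, 28, 28, 1)
print(X.shape)

Alternatively, you can add the channel dimension already inside load_train by returning np.asarray(X_train)[..., np.newaxis], so every batch drawn from X_train is 4-D from the start.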
I found a method for resizing the MNIST training dataset from (60000, 28, 28) to (60000, 14, 14).
This is the code and results:
import tensorflow as tf
import numpy as np
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train, x_test = x_train[..., np.newaxis], x_test[..., np.newaxis]
x_train_small = tf.image.resize(x_train, (14,14)).numpy()
x_test_small = tf.image.resize(x_test, (14,14)).numpy()
print(x_train.shape)
print(x_test.shape)
print(x_train_small.shape)
print(x_test_small.shape)
>>>(60000, 28, 28, 1)
>>>(10000, 28, 28, 1)
>>>(60000, 14, 14, 1)
>>>(10000, 14, 14, 1)
I'm confused about why a new axis has to be added to get the shape I want.
I would like to know whether there is another way to do the resizing without adding a new axis.
This is all described in the docs:
The first argument of resize is images: "4-D Tensor of shape [batch, height, width, channels] or 3-D Tensor of shape [height, width, channels]."
The second is size: "A 1-D int32 Tensor of 2 elements: new_height, new_width. The new size for the images."
Conclusion: you need the fourth dimension because it is the channel dimension, which tf.image.resize expects no matter what. Its size is 1 because the MNIST images are grayscale.
Of course you could use some other library to resize, but personally I would avoid unnecessary dependencies, just for the sake of cleanliness.
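If the extra axis bothers you, a minimal sketch (assuming TF 2.x and the same MNIST arrays as above) is to add the channel axis only for the resize call and squeeze it away again afterwards:

import numpy as np
import tensorflow as tf

(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# resize needs a channel axis, so add it temporarily ...
x_train_small = tf.image.resize(x_train[..., np.newaxis], (14, 14)).numpy()
# ... and drop it again to get back to a 3-D array
x_train_small = np.squeeze(x_train_small, axis=-1)

print(x_train_small.shape)   # (60000, 14, 14)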
I'm doing a project in deep learning where I'll train a classifier using the SVHN dataset.
Initially, I had train_images and test_images with shapes (73257, 32, 32, 3) and (26032, 32, 32, 3), respectively, and they are displayed normally using plt.imshow().
I converted the images to greyscale using the following code:
f = lambda img: tf.expand_dims(np.average(img, axis=-1), axis=-1).numpy()
train_images = f(train_images)
test_images = f(test_images)
I tried, as well:
train_images = tf.image.rgb_to_grayscale(train_images).numpy()
test_images = tf.image.rgb_to_grayscale(test_images).numpy()
In both cases, the output gives images of shape (32, 32, 1) as items in train_images and test_images - the new shapes are, therefore, (73257, 32, 32, 1) and (26032, 32, 32, 1), respectively.
But, for some reason, when I try to display these new greyscale images, I get the following error: TypeError: Invalid dimensions for image data.
I have no idea why.
I solved the problem by using the following function for transforming to grayscale:
f = lambda img: np.mean(img, axis=-1, keepdims=True)
Then displayed the image using:
img = np.squeeze(array)
plt.imshow(img, cmap='Greys')
Where array is any item in train_images or test_images.
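Putting this together as a minimal runnable sketch (a random array stands in for the SVHN data from the question):

import numpy as np
import matplotlib.pyplot as plt

# stand-in for the (N, 32, 32, 3) SVHN array from the question
train_images = np.random.randint(0, 256, size=(8, 32, 32, 3)).astype(np.float32)

# convert to grayscale while keeping the trailing channel axis
train_images = np.mean(train_images, axis=-1, keepdims=True)   # (8, 32, 32, 1)

# imshow expects (H, W) or (H, W, 3), so squeeze away the single channel first
plt.imshow(np.squeeze(train_images[0]), cmap='Greys')
plt.show()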
I have a 4-dimensional tensor of image pixel data (14000 examples, each with red, green, and blue channels of shape (height, width)) and a CSV file containing the coordinates of the bounding box for each image, i.e. (image name, X1, Y1, X2, Y2); it also has 14000 rows, one per example.
How do I feed this data to my neural network? Currently, if I try feeding the tensor, it passes the entire array of 14000 examples against a single row of (X1, Y1, X2, Y2), whereas it should pass one image array per row of (X1, Y1, X2, Y2).
Any idea how to fix this?
Here's the code and the associated error:
import numpy as np
import pandas as pd
from os import listdir
from PIL import Image
from keras.models import Sequential
from keras.layers import Convolution2D, MaxPooling2D, Dropout, Flatten, Dense

train_csv = pd.read_csv('datasets/training.csv').values
test_csv = pd.read_csv('datasets/test.csv').values

y_train = train_csv[:,[1,2,3,4]]   # bounding-box targets
x_train_names = train_csv[:,0]     # names of the images
#### load images into an array ####
X_train = []
path = "datasets/images/images/"
imagelist = listdir(path)
for i in range(len(x_train_names)):
    img_name = x_train_names[i]
    img = Image.open(path + str(img_name))
    arr = np.array(img)
    X_train.append(arr)
#### building a very basic classifier, just to get some result ####
classifier = Sequential()
classifier.add(Convolution2D(64, (3,3), input_shape=(64,64,3), activation='relu'))
classifier.add(Dropout(0.2))
classifier.add(MaxPooling2D((4,4)))
classifier.add(Convolution2D(32, (2,2), activation='relu'))
classifier.add(MaxPooling2D((2,2)))
classifier.add(Flatten())
classifier.add(Dense(16, activation='relu'))
classifier.add(Dropout(0.5))
classifier.add(Dense(4))
classifier.compile('adam', 'binary_crossentropy', ['accuracy'])
classifier.fit(x=X_train, y=y_train, steps_per_epoch=80, batch_size=32, epochs=25)
Error:
ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 array(s), but instead got the following list of 14000 arrays:
[array([[[141, 154, 144],
[141, 154, 144],
[141, 154, 144],
...,
[149, 159, 150],
[150, 160, 151],
[150, 160, 151]],
[[140, 153, 143],
[…
EDIT: I converted all my images to grayscale so I don't get a memory error. This means that my X_train should have a single channel dimension instead of the earlier three (RGB). Here's my edited code:
y_train = train_csv[:,[1,2,3,4]]   # bounding-box targets
x_train_names = train_csv[:,0]     # names of the images

# load images into an array
path = "datasets/images/images/"
imagelist = listdir(path)
img_name = x_train_names[0]
img = Image.open(path + str(img_name))   # open one image to read its dimensions
X_train = np.ndarray((14000, img.height, img.width, 1))
for i in range(len(x_train_names)):
    img_name = x_train_names[i]
    # converting to grayscale because I get a memory error otherwise
    img = Image.open(path + str(img_name)).convert('L')
    X_train[i,:,:,:] = np.asarray(img)
ValueError: could not broadcast input array from shape (480,640) into shape (480,640,1)
(at the line X_train[i,:,:,:] = np.asarray(img))
The first step is always to find out which input shape your first convolution layer expects. The documentation of tf.nn.conv2d states that the expected shape of the 4D input tensor is [batch, in_height, in_width, in_channels].
To load the data we can use a NumPy ndarray. For that we need to know the number of images you want to load, as well as the dimensions of the images:
path = "datasets/images/images/"
imagelist = listdir(path)
img_name = x_train_names[0]
img = Image.open(path + str(img_name))
X_train = np.ndarray((len(imagelist),img.height,img.width,3))
for i in range(len(x_train_names)):
img_name = x_train_names[i]
img = Image.open(path + str(img_name))
X_train[i,:,:,:] = np.asarray(img)
The shape attribute of your X_train array should then give you:
print(X_train.shape)
> (len(x_train_names), img.height, img.width, 3)
EDIT:
To load the images in multiple batches you could do something like this:
#### Build and compile your classifier up here ####

num_batches = 5
len_batch = np.floor(len(x_train_names)/num_batches).astype(int)
X_train = np.ndarray((len_batch, img.height, img.width, 3))

for batch_idx in range(num_batches):
    idx_start = batch_idx * len_batch
    idx_end = (batch_idx + 1) * len_batch   # slice ends are exclusive, so no -1 here
    x_train_names_batch = x_train_names[idx_start:idx_end]
    for i in range(len(x_train_names_batch)):
        img_name = x_train_names_batch[i]
        img = Image.open(path + str(img_name))
        X_train[i,:,:,:] = np.asarray(img)
    # fit on this batch of images together with the matching target rows
    classifier.fit(x=X_train, y=y_train[idx_start:idx_end], batch_size=32, epochs=2)
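Regarding the broadcast error from the question's edit: a grayscale PIL image converts to a 2-D (height, width) array, which cannot be assigned into a (height, width, 1) slot as-is. A minimal sketch of two equivalent fixes (small zero arrays stand in for the preallocated X_train and for np.asarray(img) of a .convert('L') image):

import numpy as np

X_train = np.zeros((4, 480, 640, 1))   # small stand-in for the (14000, 480, 640, 1) array
gray = np.zeros((480, 640))            # stand-in for np.asarray(img) of a grayscale image

# write the 2-D image into the single channel ...
X_train[0, :, :, 0] = gray
# ... or add the channel axis to the image before assigning
X_train[0, :, :, :] = gray[:, :, np.newaxis]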
I want to reshape the MNIST dataset from shape (70000, 784) to (70000, 28, 28). I tried the following code, but it raises a TypeError:
TypeError: only integer scalar arrays can be converted to a scalar index
df = pd.read_csv('images.csv', sep=',', header=None)
x_data = np.array(df)
x_data = x_data.reshape(x_data[0], 28, 28)
This works, but is slow:
data = np.array(df)
x_data = []
for d in data:
    x_data.append(d.reshape(28,28))
x_data = np.array(x_data)
How can this be done with numpy.reshape() and without looping?
Many thanks!
I think the problem with the second approach is that using a for loop takes more time. So I would suggest you try this:
import tensorflow as tf
#load the data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data', validation_size=0)
#considering only first 2 data points
img = mnist.train.images[:2]
x = tf.reshape(img, shape=[-1, 28, 28, 1])   # -1 lets TensorFlow infer the batch dimension from the remaining 28*28*1 elements
Ideally, I got the shape of x as (2, 28, 28, 1). Hope this helps!
For the MNIST dataset, you may use the following to reshape your data into image form (with a channel axis),
train = pd.read_csv("images.csv")
data = train.values.reshape(-1, 28, 28, 1)
assuming you have the data as a pandas DataFrame and the label column has already been dropped.
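For completeness, a minimal sketch of what likely caused the original TypeError (assuming images.csv from the question): x_data[0] is the whole first row, an array, whereas reshape needs the integer x_data.shape[0] (or simply -1):

import numpy as np
import pandas as pd

df = pd.read_csv('images.csv', sep=',', header=None)
x_data = np.array(df)

# x_data.reshape(x_data[0], 28, 28) fails: x_data[0] is a row, not an integer
x_data = x_data.reshape(x_data.shape[0], 28, 28)   # or: x_data.reshape(-1, 28, 28)
print(x_data.shape)   # (70000, 28, 28)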
sklearn's datasets.fetch_openml returns a pair containing the features and the targets of the MNIST data.
Then we reshape a single row of features into a (28, 28) 2-D array.
Since these features are pixel intensities, we can plot the 2-D array to visualise it.
from sklearn import datasets
import matplotlib.pyplot as plt

pixel_values, targets = datasets.fetch_openml(
    'mnist_784',
    version=1,
    return_X_y=True
)
single_image = pixel_values[1:2].values.reshape(28, 28)
plt.imshow(single_image, cmap='gray')
I am going to make a GAN-model tester using tf.keras with the MNIST hand-written digits dataset. Because my model is going to be used on 128x128 images, I resized the MNIST dataset to 128x128x1. But the program produces some errors that I have never seen.
(x_train, _), (_, _) = mnist.load_data()
x_train = (x_train.astype(np.float32) - 127.5) / 127.5
x_train = tf.image.resize_images(x_train, [128, 128])
idx = np.random.randint(0, x_train.shape[0], batch_size) # picks some data, count is batch_size=32.
imgs = x_train[idx] # This line made errors
The last line made two errors:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Shape must be rank 1 but is rank 2 for 'strided_slice_1' (op: 'StridedSlice') with input shapes: [60000,128,128,1], [1,32], [1,32], [1].
and,
ValueError: Shape must be rank 1 but is rank 2 for 'strided_slice_1' (op: 'StridedSlice') with input shapes: [60000,128,128,1], [1,32], [1,32], [1].
I think the number '32' refers to batch_size (=32).
I tried to find out more about this error, but I couldn't find anything like it.
I don't have any ideas for solving this problem (I started using Keras a week ago; before that I used PyTorch).
There is more than one issue with your code, but the primary cause of the error is that TensorFlow doesn't support NumPy-style advanced indexing.
The error message appears because TensorFlow tries to interpret your index array as one of its strided slices:
An example for the strided-slices:
foo[5:,:,:3] on a 7x8x9 tensor is equivalent to foo[5:7,0:8,0:3].
foo[::-1] reverses a tensor with shape 8.
Unfortunately, only basic indexing is currently available in TensorFlow; advanced indexing is under development.
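As a side note, if you do need to pick rows of a tensor with an index array, tf.gather is the usual way to do it; a minimal sketch:

import numpy as np
import tensorflow as tf

x = tf.constant(np.arange(10 * 4, dtype=np.float32).reshape(10, 4))
idx = np.random.randint(0, 10, size=3)

batch = tf.gather(x, idx)   # selects rows of x along axis 0
print(batch.shape)          # (3, 4)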
The secondary problem is that your resizing was not done properly.
tf.image.resize_images() expects a 3-D [height, width, channels] or 4-D [batch, height, width, channels] input; you passed it a batch of 2-D images without a channel dimension, so it does not produce the required new image dimensions. We therefore have to reshape the raw images first:
x_train = x_train.reshape((-1, x_train.shape[1], x_train.shape[1], 1))
Only then can we pass them to:
x_train = tf.image.resize_images(x_train, [128, 128])
which will then return the proper dimensions:
print(x_train.shape)
Out:
(60000, 128, 128, 1)
So, summarizing the whole solution, you can currently do it as follows:
import numpy as np
import tensorflow as tf
batch_size = 32
mnist = tf.keras.datasets.mnist
(x_train, _), (_, _) = mnist.load_data()
x_train = x_train.reshape((-1, x_train.shape[1], x_train.shape[1], 1))
x_train = (x_train.astype(np.float32) - 127.5) / 127.5
x_train = tf.image.resize_images(x_train, [128, 128])
idx = np.random.randint(0, x_train.shape[0], batch_size)
imgs = [x_train[i,:,:,:] for i in idx]
which is a pretty messy "solution".
The other, actually proper solution is to rearrange the original code so that the NumPy indexing happens before the data is converted to a tensor; this works around the TensorFlow indexing issue:
import numpy as np
import tensorflow as tf
batch_size = 32
mnist = tf.keras.datasets.mnist
(x_train, _), (_, _) = mnist.load_data()
x_train = (x_train.astype(np.float32) - 127.5) / 127.5
idx = np.random.randint(0, x_train.shape[0], batch_size)
x_train = x_train[idx]
x_train = x_train.reshape((-1, x_train.shape[1], x_train.shape[1], 1))
x_train = tf.image.resize_images(x_train, [128, 128])
print(x_train.shape)
Out:
(32, 128, 128, 1)
That's it!
Alternatively, instead of tf.image.resize_images() you can use an additional image tool such as skimage.transform.resize() from scikit-image, which returns NumPy array data.
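A minimal sketch of that alternative (assuming scikit-image is installed; skimage resizes one image at a time, so the selected batch is processed in a list comprehension):

import numpy as np
import tensorflow as tf
from skimage.transform import resize

(x_train, _), (_, _) = tf.keras.datasets.mnist.load_data()
x_train = (x_train.astype(np.float32) - 127.5) / 127.5

idx = np.random.randint(0, x_train.shape[0], 32)
imgs = np.stack([resize(img, (128, 128)) for img in x_train[idx]])[..., np.newaxis]

print(imgs.shape)   # (32, 128, 128, 1)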