How to increase the dimensions of the MNIST dataset? - python

I am trying to use the MNIST dataset for AlexNet with Keras, so I need to change the image dimensions (MNIST is gray-scale, while AlexNet expects RGB images of size 227*227). So far I get numpy_imgs with shape (10, 227, 227, 1), but I need (10, 227, 227, 3). You can see what I did so far in my code below,
thank you.
import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
batch = mnist.train.next_batch(10)
X_batch = batch[0]
batch_tensor = tf.reshape(X_batch, [10, 28, 28, 1])
resized_images = tf.image.resize_images(batch_tensor, [227, 227])

with tf.Session() as sess:
    numpy_imgs = resized_images.eval(session=sess)  # mnist images converted to numpy array

r2 = []
t = list(numpy_imgs)
dim = np.zeros((227, 227))
for i in range(0, 10):
    R = np.stack((t[i], dim, dim), axis=2)
    R = list(R)
    r2.append(R)
y3 = np.asarray(r2)
I tried the approach above but got an error like "ValueError: all input arrays must have the same shape". How can I fix it?

Take a look at tf.tile, which repeats a tensor along one of its dimensions:
y3 = tf.tile(numpy_imgs, (1, 1, 1, 3))
If you want to complete it with zeroed tensors instead, you should use tf.concat (or np.concatenate) rather than np.stack:
dim = np.zeros((227, 227, 2))
for i in range(0, 10):
    R = np.concatenate((t[i], dim), axis=2)
    ...
You can even do it more concisely, treating the whole batch at once:
dim = np.zeros((10, 227, 227, 2))
y3 = np.concatenate((numpy_imgs, dim), axis=3)
Here is a more general example:
import numpy as np

def main():
    i = np.random.random((10, 227, 227, 1))
    dim = np.zeros((10, 227, 227, 2))
    print(i.shape)
    print(dim.shape)
    print(np.concatenate((i, dim), axis=3).shape)

if __name__ == '__main__':
    main()

This prints:
(10, 227, 227, 1)
(10, 227, 227, 2)
(10, 227, 227, 3)
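Putting the pieces together, here is a minimal end-to-end sketch (assuming the same TF 1.x input_data pipeline as the question). Instead of padding with zeros, it simply copies the gray channel three times with np.repeat, which gives a grayscale-looking RGB batch of shape (10, 227, 227, 3):

import tensorflow as tf
import numpy as np
from tensorflow.examples.tutorials.mnist import input_data

mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)
X_batch, _ = mnist.train.next_batch(10)                        # (10, 784)

batch_tensor = tf.reshape(X_batch, [10, 28, 28, 1])
resized_images = tf.image.resize_images(batch_tensor, [227, 227])

with tf.Session() as sess:
    numpy_imgs = sess.run(resized_images)                      # (10, 227, 227, 1)

rgb_imgs = np.repeat(numpy_imgs, 3, axis=3)                    # copy the gray channel into R, G and B
print(rgb_imgs.shape)                                          # (10, 227, 227, 3)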

Related

ImageDataGenerator that outputs patches instead of full images

I have a big dataset that I want to use to train a CNN with Keras (too big to load into memory). I always train using ImageDataGenerator.flow_from_dataframe, as my images are spread across different directories, as shown below.
datagen = ImageDataGenerator(
    rescale=1./255.
)
train_gen = datagen.flow_from_dataframe(
    dataframe=train_df,
    x_col="filepath",
    class_mode="input",
    shuffle=True,
    seed=1)
However, this time I don't want to use my full images, but random patches of the images instead, i.e., I want to choose a random image and take a random patch of 32x32 of that image each time. How can I do this?
I thought of using tf.extract_image_patches and sklearn.feature_extraction.image.extract_patches_2d, but I don't know if it is possible to integrate these with flow_from_dataframe.
Any help would be appreciated.
You could try using a preprocessing function in your ImageDataGenerator combined with tf.image.extract_patches:
import tensorflow as tf
import matplotlib.pyplot as plt

BATCH_SIZE = 32

def get_patches():
    def _get_patches(image):
        image = tf.expand_dims(image, 0)
        patches = tf.image.extract_patches(images=image,
                                           sizes=[1, 32, 32, 1],
                                           strides=[1, 32, 32, 1],
                                           rates=[1, 1, 1, 1],
                                           padding='VALID')
        patches = tf.reshape(patches, (1, 256, 256, 3))
        return patches
    return _get_patches

def reshape_data(images, labels):
    ta = tf.TensorArray(tf.float32, size=0, dynamic_size=True)
    for b in tf.range(BATCH_SIZE):
        i = tf.random.uniform((), maxval=int(256/32), dtype=tf.int32)
        j = tf.random.uniform((), maxval=int(256/32), dtype=tf.int32)
        patched_image = tf.reshape(images[b], (8, 8, 3072))
        ta = ta.write(ta.size(), tf.reshape(patched_image[i, j], shape=(32, 32, 3)))
    return ta.stack(), labels

preprocessing = get_patches()

flowers = tf.keras.utils.get_file(
    'flower_photos',
    'https://storage.googleapis.com/download.tensorflow.org/example_images/flower_photos.tgz',
    untar=True)

img_gen = tf.keras.preprocessing.image.ImageDataGenerator(rescale=1./255, rotation_range=20, preprocessing_function=preprocessing)
ds = tf.data.Dataset.from_generator(
    lambda: img_gen.flow_from_directory(flowers, batch_size=BATCH_SIZE, shuffle=True),
    output_types=(tf.float32, tf.float32))
ds = ds.map(reshape_data)

images, _ = next(iter(ds.take(1)))
image = images[0]  # (32, 32, 3)
plt.imshow(image.numpy())
The problem is that the preprocessing_function of the ImageDataGenerator expects the same output shape as the input shape. I therefore first create the patches and reassemble them into the same shape as the original image. Later, in the method reshape_data, I reshape the images from (256, 256, 3) to (8, 8, 3072), extract a random patch and then return it with the shape (32, 32, 3).
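To see why this round trip works, here is a small NumPy-only sketch (the array contents are made up, only the shapes matter): reshaping the patch grid to (256, 256, 3) and back is lossless because reshape only reinterprets memory order, so indexing [i, j] after the second reshape recovers one flattened 32x32x3 patch.

import numpy as np

# A hypothetical "patch grid": 8 x 8 patches, each flattened to 32*32*3 = 3072 values,
# mimicking the output of tf.image.extract_patches for a 256x256x3 image.
patch_grid = np.arange(8 * 8 * 3072, dtype=np.float32).reshape(8, 8, 3072)

# The preprocessing function reshapes this grid to the original image shape (256, 256, 3)
# only so that ImageDataGenerator accepts it; it is not a viewable image.
fake_image = patch_grid.reshape(256, 256, 3)

# reshape_data reverses that reshape and picks one patch back out.
recovered_grid = fake_image.reshape(8, 8, 3072)
patch = recovered_grid[2, 5].reshape(32, 32, 3)

# The round trip is exact, and the extracted patch has the expected shape.
assert np.array_equal(recovered_grid, patch_grid)
print(patch.shape)  # (32, 32, 3)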

How can I convert MNIST data to RGB format?

I am trying to convert the MNIST dataset to RGB format; the actual shape of each image is (28, 28), but I need (28, 28, 3).
import numpy as np
import tensorflow as tf
mnist = tf.keras.datasets.mnist
(x_train, _), (x_test, _) = mnist.load_data()
X = np.concatenate([x_train, x_test])
X = X / 127.5 - 1
X.reshape((70000, 28, 28, 1))
tf.image.grayscale_to_rgb(
    X,
    name=None
)
But i get the following error:
ValueError: Dimension 1 in both shapes must be equal, but are 84 and 3. Shapes are [28,84] and [28,3].
You should assign the reshaped [28x28x1] images back to an array (reshape does not modify X in place):
X = X.reshape((70000, 28, 28, 1))
When converting, assign the return value of tf.image.grayscale_to_rgb() to another array:
X3 = tf.image.grayscale_to_rgb(
X,
name=None
)
Finally, to plot one example from the resulting tensor with matplotlib and a tf.Session():
import matplotlib.pyplot as plt

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    image_to_plot = sess.run(image)

plt.figure()
plt.imshow(image_to_plot)
plt.grid(False)
The complete code:
import numpy as np
import tensorflow as tf

mnist = tf.keras.datasets.mnist
(x_train, _), (x_test, _) = mnist.load_data()

X = np.concatenate([x_train, x_test])
X = X / 127.5 - 1

# Set reshaped array to X
X = X.reshape((70000, 28, 28, 1))

# Convert images and store them in X3
X3 = tf.image.grayscale_to_rgb(
    X,
    name=None
)

# Get one image from the 3D image array to var. image
image = X3[0, :, :, :]

# Plot it out with matplotlib.pyplot
import matplotlib.pyplot as plt

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    image_to_plot = sess.run(image)

plt.figure()
plt.imshow(image_to_plot)
plt.grid(False)
If you print the shape of X before tf.image.grayscale_to_rgb you will see the output dimension is (70000, 28, 28). Inputs to tf.image.grayscale_to_rgb must have size 1 as their final dimension.
Expand the final dimension of X to make it compatible with the function:
tf.image.grayscale_to_rgb(tf.expand_dims(X, axis=3))
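As a quick shape check, here is a minimal sketch; the zeros array is only a small stand-in for the (70000, 28, 28) MNIST array X:

import numpy as np
import tensorflow as tf

X = np.zeros((10, 28, 28), dtype=np.float32)   # small stand-in for the (70000, 28, 28) MNIST array
X3 = tf.image.grayscale_to_rgb(tf.expand_dims(X, axis=3))
print(X3.shape)  # (10, 28, 28, 3)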
In addition to @DMolony's and @Aqwis01's answers, another simple solution is to use the numpy.repeat method to duplicate the last dimension of your array several times:
X = X.reshape((70000, 28, 28, 1))
X = X.repeat(3, -1) # repeat the last (-1) dimension three times
X_t = tf.convert_to_tensor(X)
assert X_t.shape == (70000, 28, 28, 3)

Convert a tensor from (128, 128, 3) to (129, 128, 3), where the (1, 128, 3) values padded onto that tensor arrive later

This is the part of my GAN code where the model is being initialized; everything is working, and only the code relevant to the problem is shown here:
z = Input(shape=(100+384,))
img = self.generator(z)
print("before: ",img) #128x128x3 shape, dtype=tf.float32
temp = tf.get_variable("temp", [1, 128, 3],dtype=tf.float32)
img=tf.concat(img,temp)
print("after: ",img) #error ValueError: Incompatible type conversion requested to type 'int32' for variable of type 'float32_ref'
valid = self.discriminator(img)
self.combined = Model(z, valid)
I generate 128x128x3 images, and what I want to do is give 129x128x3 images to the discriminator: the 1x128x3 text-embedding matrix is concatenated with the image during training. But I have to specify at the start the tensor shapes and input values that each model, i.e. GEN and DISC, will get. GEN takes a 100-noise + 384-embedding vector and generates a 128x128x3 image, which is then extended by another embedding, i.e. 1x128x3, and fed to DISC. So my question is whether this approach is correct or not. Also, if it is correct or makes sense, how can I specify what is needed at the start so that it does not give me errors like incompatible shapes? Because at the start I have to add these lines:
z = Input(shape=(100+384,))
img = self.generator(z) #128x128x3
valid = self.discriminator(img) #should be 129x128x3
self.combined = Model(z, valid)
But img is 128x128x3 and is later, during training, changed to 129x128x3 by concatenating the embedding matrix. So how can I change "img" from 128x128x3 to 129x128x3 in the above code, either by padding, by appending another tensor, or by simply reshaping (which of course is not possible)? Any help will be much appreciated. Thanks.
The first argument of tf.concat should be the list of tensors, while the second is the axis along which to concatenate. You could concatenate the img and temp tensors as follows:
import tensorflow as tf

img = tf.ones(shape=(128, 128, 3))
temp = tf.get_variable("temp", [1, 128, 3], dtype=tf.float32)
img = tf.concat([img, temp], axis=0)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # initialize the variable before running the graph
    print(sess.run(tf.shape(img)))
UPDATE: Here is a minimal example showing why you get the error "AttributeError: 'Tensor' object has no attribute '_keras_history'". This error pops up in the following snippet:
from keras.layers import Input, Lambda, Dense
from keras.models import Model
import tensorflow as tf
img = Input(shape=(128, 128, 3)) # Shape=(batch_size, 128, 128, 3)
temp = Input(shape=(1, 128, 3)) # Shape=(batch_size, 1, 128, 3)
concat = tf.concat([img, temp], axis=1)
print(concat.get_shape())
dense = Dense(1)(concat)
model = Model(inputs=[img, temp], outputs=dense)
This happens because the tensor concat is not a Keras tensor, and therefore some of the typical Keras tensor attributes (such as _keras_history) are missing. To overcome this problem, you need to encapsulate all TensorFlow tensors into a Keras Lambda layer:
from keras.layers import Input, Lambda, Dense
from keras.models import Model
import tensorflow as tf
img = Input(shape=(128, 128, 3)) # Shape=(batch_size, 128, 128, 3)
temp = Input(shape=(1, 128, 3)) # Shape=(batch_size, 1, 128, 3)
concat = Lambda(lambda x: tf.concat([x[0], x[1]], axis=1))([img, temp])
print(concat.get_shape())
dense = Dense(1)(concat)
model = Model(inputs=[img, temp], outputs=dense)

TensorFlow Tensor handled differently in numpy argmax vs keras argmax

Why does a TensorFlow tensor behave differently in math functions in Numpy than it behaves in math functions in Keras?
Numpy arrays seem to act normally when put in the same situation as the TensorFlow Tensor.
This example shows that a numpy matrix is handled correctly under numpy functions and keras functions.
import numpy as np
from keras import backend as K
arr = np.random.rand(19, 19, 5, 80)
np_argmax = np.argmax(arr, axis=-1)
np_max = np.max(arr, axis=-1)
k_argmax = K.argmax(arr, axis=-1)
k_max = K.max(arr, axis=-1)
print('np_argmax shape: ', np_argmax.shape)
print('np_max shape: ', np_max.shape)
print('k_argmax shape: ', k_argmax.shape)
print('k_max shape: ', k_max.shape)
This outputs the following (as expected):
np_argmax shape: (19, 19, 5)
np_max shape: (19, 19, 5)
k_argmax shape: (19, 19, 5)
k_max shape: (19, 19, 5)
As opposed to this example:
import numpy as np
from keras import backend as K
import tensorflow as tf
arr = tf.constant(np.random.rand(19, 19, 5, 80))
np_argmax = np.argmax(arr, axis=-1)
np_max = np.max(arr, axis=-1)
k_argmax = K.argmax(arr, axis=-1)
k_max = K.max(arr, axis=-1)
print('np_argmax shape: ', np_argmax.shape)
print('np_max shape: ', np_max.shape)
print('k_argmax shape: ', k_argmax.shape)
print('k_max shape: ', k_max.shape)
which outputs
np_argmax shape: ()
np_max shape: (19, 19, 5, 80)
k_argmax shape: (19, 19, 5)
k_max shape: (19, 19, 5)
You need to execute/run code (say under a TF session) to have tensors evaluated. Until then, the shapes of tensors are not evaluated.
TF docs say:
Each element in the Tensor has the same data type, and the data type is always known. The shape (that is, the number of dimensions it has and the size of each dimension) might be only partially known. Most operations produce tensors of fully-known shapes if the shapes of their inputs are also fully known, but in some cases it's only possible to find the shape of a tensor at graph execution time.
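For instance (a small illustration in the same TF 1.x style as the code above; the placeholder is only for demonstration), a tensor's static shape can be only partially known until the graph is actually run:

import numpy as np
import tensorflow as tf

# The batch dimension is unknown until we feed data at run time.
x = tf.placeholder(tf.float32, shape=[None, 19, 19, 5, 80])
print(x.shape)            # (?, 19, 19, 5, 80) -- static shape, partially known

with tf.Session() as sess:
    val = sess.run(x, feed_dict={x: np.zeros((2, 19, 19, 5, 80), dtype=np.float32)})
    print(val.shape)      # (2, 19, 19, 5, 80) -- fully known after execution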
Why don't you try the following code for the 2nd example:
import numpy as np
from keras import backend as K
import tensorflow as tf

arr = tf.constant(np.random.rand(19, 19, 5, 80))

with tf.Session() as sess:
    arr = sess.run(arr)

np_argmax = np.argmax(arr, axis=-1)
np_max = np.max(arr, axis=-1)
k_argmax = K.argmax(arr, axis=-1)
k_max = K.max(arr, axis=-1)

print('np_argmax shape: ', np_argmax.shape)
print('np_max shape: ', np_max.shape)
print('k_argmax shape: ', k_argmax.shape)
print('k_max shape: ', k_max.shape)
After arr = tf.constant(np.random.rand(19, 19, 5, 80)), the type of arr is tf.Tensor, but after running arr = sess.run(arr) its type changes to numpy.ndarray.
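As a quick sanity check (a minimal sketch under the same TF 1.x setup), you can print the types before and after evaluation:

import numpy as np
import tensorflow as tf

arr = tf.constant(np.random.rand(19, 19, 5, 80))
print(type(arr))                        # a tf.Tensor

with tf.Session() as sess:
    arr = sess.run(arr)

print(type(arr))                        # <class 'numpy.ndarray'>
print(np.argmax(arr, axis=-1).shape)    # (19, 19, 5) -- numpy now sees a real array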

'int' object has no attribute '__getitem__' error when training model with tflearn?

I'm getting this error when I try to train a custom grayscale image dataset (using 2 images only) with the example MNIST code from tflearn. The images are all of different sizes, around the range of (3000, 3000) height and width. This is my error:
Run id: convnet_images
Log directory: /tmp/tflearn_logs/
Exception in thread Thread-14:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/usr/local/lib/python2.7/dist-packages/tflearn/data_flow.py", line 186, in fill_feed_dict_queue
    data = self.retrieve_data(batch_ids)
  File "/usr/local/lib/python2.7/dist-packages/tflearn/data_flow.py", line 221, in retrieve_data
    utils.slice_array(self.feed_dict[key], batch_ids)
  File "/usr/local/lib/python2.7/dist-packages/tflearn/utils.py", line 187, in slice_array
    return X[start]
TypeError: 'int' object has no attribute '__getitem__'
and this is my code:
import tensorflow as tf
import tflearn
from scipy.misc import imread
import numpy as np
np.set_printoptions(threshold=np.nan)
image = imread('image.jpg')
image2 = imread('image2.jpg')
image3 = imread('image3.jpg')
image4 = imread('image4.jpg')
image = np.resize(image, (256, 256, 1))
image2 = np.resize(image2, (256, 256, 1))
image3 = np.resize(image3, (256, 256, 1))
image4 = np.resize(image4, (256, 256,1 ))
image_train = np.stack((image, image2), axis = 0)
image_test = np.stack((image3, image4), axis = 0)
# # build the neural net
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.normalization import local_response_normalization
from tflearn.layers.estimator import regression
network = input_data(shape = [None, 256, 256, 1], name = 'input')
network = conv_2d(network, 32, 3, activation = 'relu', regularizer = 'L2')
network = max_pool_2d(network, 2)
network = local_response_normalization(network)
# network = conv_2d(network, 64, 3, activation = 'relu', regularizer = 'L2')
# network = max_pool_2d(network, 2)
# network = local_response_normalization(network)
network = fully_connected(network, 128, activation = 'tanh')
network = dropout(network, 0.8)
network = fully_connected(network, 1, activation = 'softmax')
network = regression(network, optimizer = 'adam', learning_rate = '0.001', name = 'target')
#Training
model = tflearn.DNN(network, tensorboard_verbose = 3)
model.fit({'input': image_train}, {'target': 0}, n_epoch = 20, batch_size = 1,
          validation_set = ({'input': image_test}, {'target': 0}),
          snapshot_step = 100, show_metric = True, run_id = 'convnet_images')
I highly suspect the error comes from the fact that my grayscale pixel intensity value is a np.uint8 and not an np array. That is, when I print type(image_train[0, 0, 0, 0]), which gives the pixel value, it is a np.uint8, which probably means the code in tflearn cannot access the value using the index selector __getitem__ (from what I've read so far). But how can I get the pixel value to become an np array? Is np.resize the correct way to handle grayscale images? Ideally this should also work for colored images, which means the 4th dimension (the RGB channel values) would have to be an np array that holds the 3 pixel values (and so, understandably, the tflearn code probably tries to access the pixel values using __getitem__). But this is only my guess; I'm still unsure of how to go about this.
OP self-answered in a comment:
I have discovered an error in my code, that is my Y value isn't formatted as a 2 by 1 np array and is instead just a float. Upon fixing this error, the index error is now gone.
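Based on that comment, here is a rough sketch of the fix. It reuses model, image_train and image_test from the question's code, and the label values are made up (only their shape matters): pass the targets as a (2, 1) NumPy array, one entry per image, instead of a plain int.

import numpy as np

# Hypothetical labels for (image, image2) and (image3, image4); shape (2, 1), not a scalar.
Y_train = np.array([[0.], [1.]])
Y_test = np.array([[0.], [1.]])

model.fit({'input': image_train}, {'target': Y_train}, n_epoch=20, batch_size=1,
          validation_set=({'input': image_test}, {'target': Y_test}),
          snapshot_step=100, show_metric=True, run_id='convnet_images')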
