Get output from Lasagne (Python deep neural network framework)

I loaded the mnist_conv.py example from the official Lasagne GitHub repository.
At the end, I would like to predict my own example. From the official documentation I saw that lasagne.layers.get_output() should handle numpy arrays, but it doesn't work and I cannot figure out how to do it.
Here's my code:
if __name__ == '__main__':
    output_layer = main()  # the output layer from the net
    exampleChar = np.zeros((28, 28))  # the example I would predict
    outputValue = lasagne.layers.get_output(output_layer, exampleChar)
    print(outputValue.eval())
but it gives me:
TypeError: ConvOp (make_node) requires input be a 4D tensor; received "TensorConstant{(28, 28) of 0.0}" (2 dims)
I understand that it expects a 4D tensor, but I don't have any idea how to correct it.
Can you help me? Thanks

First, you are passing a single "image" into your network, so it has shape (256, 256).
But the network needs a list of 3-dimensional items, i.e. images, which in Theano is implemented as a 4D tensor.
I don't see your full code or how you intend to use Lasagne's interface, but if your code is written properly, then from what I have seen so far you should first convert your (256, 256) data into a single-channel image of shape (1, 256, 256), and then build a list: either put several (1, 256, 256) arrays into a list, e.g. [(1,256,256), (1,256,256), (1,256,256)], or make a list from this single example, i.e. [(1,256,256)].
The former gives you a (3, 1, 256, 256) tensor, the latter a (1, 1, 256, 256) 4D tensor, both of which will be accepted by Lasagne's interface.
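A minimal numpy sketch of both options (the 256x256 size and float32 dtype are just assumptions carried over from this answer):
import numpy as np

img = np.zeros((256, 256), dtype=np.float32)         # a single 2D "image"
single = img[np.newaxis, ...]                        # add a channel axis -> (1, 256, 256)

batch_of_three = np.stack([single, single, single])  # (3, 1, 256, 256)
batch_of_one = single[np.newaxis, ...]               # (1, 1, 256, 256)
print(batch_of_three.shape, batch_of_one.shape)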

As written in your error message, the input is expected to be a 4D tensor of shape (n_samples, n_channels, width, height). In the MNIST case, n_channels is 1, and width and height are both 28.
But you are inputting a 2D tensor, of shape (28, 28). You need to add new axes, which you can do with exampleChar = exampleChar[None, None, :, :]
exampleChar = np.zeros((28, 28))
print(exampleChar.shape)
exampleChar = exampleChar[None, None, :, :]
print(exampleChar.shape)
outputs
(28, 28)
(1, 1, 28, 28)
Note: I think you can use np.newaxis instead of None to add an axis. And exampleChar = exampleChar[None, None] should work too.
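Putting that back into the original snippet, a rough sketch could look like this (the float32 cast is an assumption, since Theano usually expects float32 inputs; main() is the function from the question):
output_layer = main()  # the output layer from the net, as in the question
exampleChar = np.zeros((28, 28), dtype=np.float32)
exampleChar = exampleChar[None, None, :, :]  # now shape (1, 1, 28, 28)
outputValue = lasagne.layers.get_output(output_layer, exampleChar)
print(outputValue.eval())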

Related

Tensorflow Keras Conv2D error with 2D numpy array input

I would like to train a CNN using a 2D numpy array as input, but I am receiving this error: ValueError: Error when checking input: expected conv2d_input to have 4 dimensions, but got array with shape (21, 21).
My input is indeed a 21x21 numpy array of floats. The first layer of the network is defined as Conv2D(32, (3, 3), input_shape=(21, 21, 1)) to match the shape of the input array.
I have found some similar questions, but none pertaining to a 2D input array; they mostly deal with images. According to the documentation, Conv2D expects a 4D input tensor of (samples, channels, rows, cols), but I cannot find any documentation explaining the meaning of these values. Similar questions about image inputs suggest reshaping the input array using np.ndarray.reshape(), but when I try that I receive an input error.
How can I train a CNN on such an input array? Should input_shape be a different size tuple?
Your current numpy array has dimensions (21, 21). However, TensorFlow expects input tensors to have dimensions in the format (batch_size, height, width, channels), or BHWC, which means you need to convert your 2-dimensional numpy input array to 4 dimensions. One way to do so is as follows:
input = np.expand_dims(input, axis=0)
input = np.expand_dims(input, axis=-1)
Now, the numpy input array has dimensions: (1, 21, 21, 1) which can be passed to a TF Conv2D operation.
Hope this helps! :)
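As a rough end-to-end sketch (the Sequential model and the random 21x21 array below are illustrative assumptions, not the asker's actual code):
import numpy as np
import tensorflow as tf

x = np.random.rand(21, 21).astype(np.float32)  # original 2D input
x = np.expand_dims(x, axis=0)                  # add batch dimension   -> (1, 21, 21)
x = np.expand_dims(x, axis=-1)                 # add channel dimension -> (1, 21, 21, 1)

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), input_shape=(21, 21, 1)),
])
print(model.predict(x).shape)  # (1, 19, 19, 32) with the default 'valid' padding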

How to modify flow_from_directory or preprocessing_function in ImageDataGenerator?

According to the keras docs:
preprocessing_function: function that will be applied on each input. The function will run after the image is resized and augmented. The function should take one argument: one image (Numpy tensor with rank 3), and should output a Numpy tensor with the same shape.
My numpy tensor is of rank 5 because my input images have 3 dimensions (height, width, depth).
from keras.preprocessing.image import ImageDataGenerator
label_datagen = ImageDataGenerator()
train_label_generator = label_datagen.flow_from_directory(
    directory="some_directory",
    target_size=(32, 32, 32),
    color_mode='grayscale',
    class_mode=None,
    batch_size=4)
When I check the first batch, I get my 5D numpy tensor:
first_item = train_image_generator.__getitem__(0)
>>> print(first_item.shape)
(4, 32, 32, 32, 1)
Now I first want to do a simple operation on every input image; I also check the input shape by printing it:
def some_function(arr):
    print(arr.shape)
    arr += 1
    return arr
Here I add this function to my ImageDataGenerator:
label_datagen = FixedImageDataGenerator(preprocessing_function=some_function)
This is what I get as input shape
(32, 32, 1)
which means that it really is limited to rank 3. Any idea how I can modify this so that the input shape is (32, 32, 32)?
My goal is to use the to_categorical function on every input in the ImageDataGenerator. I cannot simply say class_mode="categorical" as I am doing semantic segmentation (not image classification). I know that I could write some custom code for generators for that purpose but I want to know if it would be difficult to modify the keras ImageDataGenerator.
You can use ImageDataGenerator like you would normally, but at the last step, instead of passing a preprocessing_function, wrap your generator in a generator of your own. By doing this, you get full control over the preprocessing function. This means its output no longer has to have the same shape as the input. Be aware that this wrapper function gets fed batches, not single images.
For example:
def preprocess(generator):
    for batch in generator:
        yield batch[:, 1:-1, 1:-1]  # example: crop 1 px off each border
Now use preprocess(label_datagen) instead. I hope you can use this to circumvent the limitations.
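Since the stated goal is to apply to_categorical to every label batch, a hedged sketch of that idea using the same wrapper pattern (NUM_CLASSES and the generator arguments are assumptions for illustration):
from keras.utils import to_categorical

NUM_CLASSES = 4  # assumption: number of segmentation classes

def one_hot_labels(generator):
    # yield one-hot encoded label batches instead of raw label maps
    for batch in generator:
        yield to_categorical(batch, num_classes=NUM_CLASSES)

train_label_generator = one_hot_labels(label_datagen.flow_from_directory(
    directory="some_directory",
    color_mode='grayscale',
    class_mode=None,
    batch_size=4))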
ImageDataGenerator is a generator for images.
This means that, in order for it to work, your data should be images with 1 channel (grayscale) or 3 channels (RGB). I don't think it will work with your 4D images (unless the depth equals 1 or 3).

Keras predict getting incorrect shape?

I'm new to Keras and am trying to test out a model I've just trained.
I'm using Tensorflow backend and Python 3.
However, the shape my input has and the shape Keras says it has in an error are completely different. Here's my code:
testnote = np.zeros((3,))
testnote[0] = 70
testnote[1] = 70
print(testnote.shape)
pred = model.predict(testnote)
print(pred)
My consistent output is "(3,)" for the shape of testnote and then an error for my predict line: "ValueError: Error when checking input: expected dense_1_input to have shape (3,) but got array with shape (1,)"
How is it that Keras reads testnote as having shape (1,) when I've just confirmed that the shape is (3,)? Is it using some sort of different standard for what "shape" means? I've tried reshaping and adding brackets and a bunch of other things, but I don't really know what the problem is.
For additional context, the model takes in an array with 3 scalar inputs (representing pitch, velocity, and instrument class) and outputs an array with 1025 scalar outputs. I am carefully not using the word "dimension" since I think this is where I'm getting confused, and technically both are only 1-dimensional. I'm sure there are many problems with my model which I will have to fix after this. However, I'd like to just get this prediction function working so I can understand what my output looks like.
Thanks in advance for any help.
A Keras Model implicitly expects that your data (passed as a numpy array) has a dimension for the batch size. Currently, your model is interpreting testnote as 3 examples of shape (1,). Try adding the batch dimension to testnote as follows:
testnote = testnote.reshape(1,-1)
This will reshape testnote to shape (1, 3), so that you explicitly define the batch size to be 1.
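A short sketch of the fix (the toy Dense model below is an assumption matching the 3-input / 1025-output description, not the asker's actual model):
import numpy as np
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.layers.Dense(64, activation='relu', input_shape=(3,)),
    tf.keras.layers.Dense(1025),
])

testnote = np.zeros((3,))
testnote[0] = 70
testnote[1] = 70

pred = model.predict(testnote.reshape(1, -1))  # batch of one example, shape (1, 3)
print(pred.shape)  # (1, 1025)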

tensorflow resize image by random factor

I am trying to resize an image by a factor during graph construction when the image size is unknown:
H, W, _ = img.get_shape()
scale = tf.random_uniform([1], minval=1, maxval=1.5, dtype=tf.float32, seed=None, name=None)
Out of these I need to magically compute a size which would translate to the following:
tf.image.resize_images(tf.expand_dims(img, 0), [H*scale, W*scale])
Which returns:
ValueError: 'size' must be a 1-D int32 Tensor
Any help is greatly appreciated. Thank you.
Your scale has shape (1,), so [H*scale, W*scale] has shape (1, 2). To fix it, just generate a scalar random instead of a 1-element vector:
scale = tf.random_uniform([], minval=1, maxval=1.5, dtype=tf.float32, seed=None, name=None)
Note the shape is an empty list, meaning you want a scalar.
In addition to that, you have to fix the data type of the size parameter, like this:
tf.image.resize_images(tf.expand_dims(img, 0), tf.cast([H*scale, W*scale], tf.int32))
Your problem is that you're mixing the Python list [H*scale, W*scale] with tensors. By default, TensorFlow will convert such lists to constants as appropriate, but in this case your list contains tensors, so you end up with a nested set of tensors that isn't 1D.
To avoid confusion in cases like this, stop using Python constructs such as lists; convert your height and width into TensorFlow constructs explicitly and check their shapes before proceeding.
x = tf.concat((scale*W, scale*H), axis=0)
print(x)
Tensor("concat_3:0", shape=(2,), dtype=float32)
Doing so shows us that we now have a 1D tensor as required. But it's float32 (at least it was in the simple test case I set up), so let's cast that to an int:
x = tf.cast(x, tf.int32)
Now you're ready to go
tf.image.resize_images(tf.expand_dims(img, 0), size=x)
No error should occur there.
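Putting both answers together, a rough TF 1.x-style sketch (it uses tf.shape instead of get_shape() so the height and width are dynamic tensors; img is assumed to be a 3D (H, W, C) image tensor):
import tensorflow as tf

H = tf.cast(tf.shape(img)[0], tf.float32)
W = tf.cast(tf.shape(img)[1], tf.float32)

scale = tf.random_uniform([], minval=1.0, maxval=1.5, dtype=tf.float32)  # scalar factor

new_size = tf.cast([H * scale, W * scale], tf.int32)  # 1-D int32 tensor of length 2

resized = tf.image.resize_images(tf.expand_dims(img, 0), new_size)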

TFLearn - What is input_data

I came across the following statement:
convnet = input_data(shape=[None,img_size,img_size,1], name='input')
I tried looking for a description, but couldn't find a clear explanation.
My main question here is what is the function input_data mainly doing? Is it like a place holder for our input data?
Regarding the shape, what is None at the beginning, and 1 at the end?
Thanks.
The input_data is a layer that will be used as the input layer to your network. Before adding any of the usual layers to your sequential model, you need to specify what your input looks like. For example, in the MNIST dataset you have 784-element arrays representing 28x28 images.
In your example the network wants an input with the shape (None, img_size, img_size, 1), meaning in human language:
None - any number of images of
img_size X img_size - dimensions of the image
1 - with one color channel
If the MNIST dataset were in full RGB color, the input data would be of shape (None, 28, 28, 3).
Usually you can think of the None as the batch_size.
To be even more explicit, if you had a batch_size of 1 then you would need as input, in our MNIST RGB example, three 28x28 matrices: one representing the R pixels, another the G pixels, and lastly one for the B pixels of the image. This is just one entry. In this case the None value would be 1, but usually it is whatever you decide the batch_size is. You get the picture from here.
Hope it clears things up.
Cheers,
Gabriel
De Santa's answer is right: input_data is a placeholder for the input features. The array you mention holds first None (always), then the image width and height (the image seems to be square, since width = height), and the number of channels (in this case 1; e.g. for RGB you would get 3 channels). This way the net knows the dimensions of the input features.
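A minimal TFLearn sketch of how input_data is typically used (the layer sizes and img_size = 28 are illustrative assumptions):
import tflearn
from tflearn.layers.core import input_data, fully_connected
from tflearn.layers.conv import conv_2d, max_pool_2d

img_size = 28

# input layer: (batch_size, height, width, channels); None = any batch size
convnet = input_data(shape=[None, img_size, img_size, 1], name='input')
convnet = conv_2d(convnet, 32, 3, activation='relu')
convnet = max_pool_2d(convnet, 2)
convnet = fully_connected(convnet, 10, activation='softmax')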
