I am working on image classification tasks and decided to use Lasagne + Nolearn for neural networks prototype.
All standard examples like MNIST numbers classification run well, but problems appear when I try to work with my own images.
I want to use 3-channel images, not grayscale.
And there is the code where I'm trying to get arrays from images:
img = Image.open(item)
img = ImageOps.fit(img, (256, 256), Image.ANTIALIAS)
img = np.asarray(img, dtype = 'float64') / 255.
img = img.transpose(2,0,1).reshape(3, 256, 256)
X.append(img)
Here is the code of NN and its fitting:
X, y = simple_load("new")
X = np.array(X)
y = np.array(y)
net1 = NeuralNet(
layers=[ # three layers: one hidden layer
('input', layers.InputLayer),
('hidden', layers.DenseLayer),
('output', layers.DenseLayer),
],
# layer parameters:
input_shape=(None, 65536), # 96x96 input pixels per batch
hidden_num_units=100, # number of units in hidden layer
output_nonlinearity=None, # output layer uses identity function
output_num_units=len(y), # 30 target values
# optimization method:
update=nesterov_momentum,
update_learning_rate=0.01,
update_momentum=0.9,
regression=True, # flag to indicate we're dealing with regression problem
max_epochs=400, # we want to train this many epochs
verbose=1,
)
net1.fit(X, y)
I recieve exceptions like this one:
Traceback (most recent call last):
File "las_mnist.py", line 39, in <module>
net1.fit(X[i], y[i])
File "/usr/local/lib/python2.7/dist-packages/nolearn/lasagne.py", line 266, in fit
self.train_loop(X, y)
File "/usr/local/lib/python2.7/dist-packages/nolearn/lasagne.py", line 273, in train_loop
X, y, self.eval_size)
File "/usr/local/lib/python2.7/dist-packages/nolearn/lasagne.py", line 377, in train_test_split
kf = KFold(y.shape[0], round(1. / eval_size))
IndexError: tuple index out of range
So, in which format do you "feed" your networks with image data?
Thanks for answers or any tips!
If you're doing classification you need to modify a couple of things:
In your code you have set regression = True. To do classification remove this line.
Ensure that your input shape matches the shape of X if want to input 3 distinct channels
Because you are doing classification you need the output to use a softmax nonlinearity (at the moment you have the identity which will not help you with classification)
X, y = simple_load("new")
X = np.array(X)
y = np.array(y)
net1 = NeuralNet(
layers=[ # three layers: one hidden layer
('input', layers.InputLayer),
('hidden', layers.DenseLayer),
('output', layers.DenseLayer),
],
# layer parameters:
input_shape=(None, 3, 256, 256), # TODO: change this
hidden_num_units=100, # number of units in hidden layer
output_nonlinearity=lasagne.nonlinearities.softmax, # TODO: change this
output_num_units=len(y), # 30 target values
# optimization method:
update=nesterov_momentum,
update_learning_rate=0.01,
update_momentum=0.9,
max_epochs=400, # we want to train this many epochs
verbose=1,
)
I also asked it in lasagne-users forum and Oliver Duerr helped me a lot with code sample:
https://groups.google.com/forum/#!topic/lasagne-users/8ZA7hr2wKfM
Related
How to solve this error?
Preprocessing of image:
def PreprocessData(img, mask, target_shape_img, target_shape_mask, path1, path2):
"""
Processes the images and mask present in the shared list and path
Returns a NumPy dataset with images as 3-D arrays of desired size
"""
# Pull the relevant dimensions for image and mask
m = len(img) # number of images
i_h,i_w,i_c = target_shape_img # pull height, width, and channels of image
m_h,m_w,m_c = target_shape_mask # pull height, width, and channels of mask
# Define X and Y as number of images along with shape of one image
X = np.zeros((m,i_h,i_w,1), dtype=np.float32)
y = np.zeros((m,m_h,m_w,1), dtype=np.int32)
# RGBA image has 4 channels.
#255 will make the pixel completely opaque,
#value 0 fully transparent,
#values in between will make the pixels partly transparent
# Resize images and masks
for file in img:
# convert image into an array of desired shape (3 channels)
index = img.index(file)
path = os.path.join(path1, file)
single_img = np.asarray(Image.open(path).resize((i_h,i_w))) # (0.21, 0.75, 0.04)
#single_img = np.reshape(single_img,(i_h,i_w,i_c))
single_img = single_img/255.
X[index] = single_img[..., None] #X (dims: # images, img height, img width, img channels)
# convert mask into an array of desired shape 4 channel
single_mask_ind = mask[index]
path = os.path.join(path2, single_mask_ind)
single_mask = np.asarray(Image.open(path).resize((i_h,i_w)))
single_mask = single_mask > 0 #binarizing of targets
# single_mask = single_mask - 1 ### single_mask = single_mask/256???
y[index] = single_mask[..., None] #y (dims: # masks, mask height, mask width, mask channels)
return X, y
Encoder:
def EncoderMiniBlock(inputs, n_filters=32, dropout_prob=0.3, max_pooling=True):
"""
This block uses multiple convolution layers, max pool, relu activation to create an architecture for learning.
Dropout can be added for regularization to prevent overfitting.
The block returns the activation values for next layer along with a skip connection which will be used in the decoder
"""
# Add 2 Conv Layers with relu activation and HeNormal initialization using TensorFlow
# Proper initialization prevents from the problem of exploding and vanishing gradients
# 'Same' padding will pad the input to conv layer such that the output has the same height and width (hence, is not reduced in size)
conv = Conv2D(n_filters,
3, # Kernel size
activation='relu',
padding='same',
kernel_initializer='HeNormal')(inputs)
conv = Conv2D(n_filters,
3, # Kernel size
activation='relu',
padding='same',
kernel_initializer='HeNormal')(conv)
# Batch Normalization will normalize the output of the last layer based on the batch's mean and standard deviation
conv = BatchNormalization()(conv, training=False)
# In case of overfitting, dropout will regularize the loss and gradient computation to shrink the influence of weights on output
if dropout_prob > 0:
conv = tf.keras.layers.Dropout(dropout_prob)(conv)
# Pooling reduces the size of the image while keeping the number of channels same
# Pooling has been kept as optional as the last encoder layer does not use pooling (hence, makes the encoder block flexible to use)
# Below, Max pooling considers the maximum of the input slice for output computation and uses stride of 2 to traverse across input image
if max_pooling:
next_layer = tf.keras.layers.MaxPooling2D(pool_size = (2,2))(conv)
else:
next_layer = conv
# skip connection (without max pooling) will be input to the decoder layer to prevent information loss during transpose convolutions
skip_connection = conv
return next_layer, skip_connection
Decoder:
def DecoderMiniBlock(prev_layer_input, skip_layer_input, n_filters=32):
"""
Decoder Block first uses transpose convolution to upscale the image to a bigger size and then,
merges the result with skip layer results from encoder block
Adding 2 convolutions with 'same' padding helps further increase the depth of the network for better predictions
The function returns the decoded layer output
"""
# Start with a transpose convolution layer to first increase the size of the image
up = Conv2DTranspose(
n_filters,
(3,3), # Kernel size
strides=(2,2),
padding='same')(prev_layer_input)
# Merge the skip connection from previous block to prevent information loss
merge = concatenate([up, skip_layer_input], axis=3)
# Add 2 Conv Layers with relu activation and HeNormal initialization for further processing
# The parameters for the function are similar to encoder
conv = Conv2D(n_filters,
3, # Kernel size
activation='relu',
padding='same',
kernel_initializer='HeNormal')(merge)
conv = Conv2D(n_filters,
3, # Kernel size
activation='relu',
padding='same',
kernel_initializer='HeNormal')(conv)
return conv
U-Net compilation
def UNetCompiled(input_size=(128, 128, 3), n_filters=32, n_classes=3):
"""
Combine both encoder and decoder blocks according to the U-Net research paper
Return the model as output
"""
# Input size represent the size of 1 image (the size used for pre-processing)
inputs = Input(input_size)
# Encoder includes multiple convolutional mini blocks with different maxpooling, dropout and filter parameters
# Observe that the filters are increasing as we go deeper into the network which will increase the # channels of the image
cblock1 = EncoderMiniBlock(inputs, n_filters,dropout_prob=0, max_pooling=True)
cblock2 = EncoderMiniBlock(cblock1[0],n_filters*2,dropout_prob=0, max_pooling=True)
cblock3 = EncoderMiniBlock(cblock2[0], n_filters*4,dropout_prob=0, max_pooling=True)
cblock4 = EncoderMiniBlock(cblock3[0], n_filters*8,dropout_prob=0.3, max_pooling=True)
cblock5 = EncoderMiniBlock(cblock4[0], n_filters*16, dropout_prob=0.3, max_pooling=False)
# Decoder includes multiple mini blocks with decreasing number of filters
# Observe the skip connections from the encoder are given as input to the decoder
# Recall the 2nd output of encoder block was skip connection, hence cblockn[1] is used
ublock6 = DecoderMiniBlock(cblock5[0], cblock4[1], n_filters * 8)
ublock7 = DecoderMiniBlock(ublock6, cblock3[1], n_filters * 4)
ublock8 = DecoderMiniBlock(ublock7, cblock2[1], n_filters * 2)
ublock9 = DecoderMiniBlock(ublock8, cblock1[1], n_filters)
# Complete the model with 1 3x3 convolution layer (Same as the prev Conv Layers)
# Followed by a 1x1 Conv layer to get the image to the desired size.
# Observe the number of channels will be equal to number of output classes
conv9 = Conv2D(n_filters,
3,
activation='relu',
padding='same',
kernel_initializer='he_normal')(ublock9)
conv10 = Conv2D(n_classes, 1, padding='same')(conv9)
# Define the model
model = tf.keras.Model(inputs=inputs, outputs=conv10)
return model
Define the desired shape
target_shape_img = [128, 128, 3]
target_shape_mask = [128, 128,1]
Process data using apt helper function
X, y = PreprocessData(img, mask, target_shape_img, target_shape_mask, path1, path2)
I am not able to understand what is wrong because I am getting this error:
ValueError: in user code:
File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py",
line 1021, in train_function *
return step_function(self, iterator)
File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py",
line 1010, in step_function **
outputs = model.distribute_strategy.run(run_step, args=(data,))
File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py",
line 1000, in run_step **
outputs = model.train_step(data)
File "/usr/local/lib/python3.7/dist-packages/keras/engine/training.py",
line 859, in train_step
y_pred = self(x, training=True)
File "/usr/local/lib/python3.7/dist-packages/keras/utils/traceback_utils.py",
line 67, in error_handler
raise e.with_traceback(filtered_tb) from None
File "/usr/local/lib/python3.7/dist-packages/keras/engine/input_spec.py",
line 249, in assert_input_compatibility
f'Input {input_index} of layer "{layer_name}" is '
ValueError: Exception encountered when calling layer "model" (type Functional).
Input 0 of layer "conv2d" is incompatible with the layer: expected axis -1 of input shape to have value 3, but received input with shape None, 128, 128, 1)
Call arguments received:
• inputs=tf.Tensor(shape=(None, 128, 128, 1), dtype=float32)
• training=True
• mask=None
You seem to have defined a model that takes inputs of shape (128,128,3) and are inputting shape (128,128,1). If you change the input shape when you define the UNetCompiled function, it should solve the issue.
def UNetCompiled(input_size=(128, 128, 1), n_filters=32, n_classes=3):
Or you could change the input shape in the PreprocessData function if the images are colour and not greyscale images
You have defined the images as having 1 channel
# Define X and Y as number of images along with shape of one image
X = np.zeros((m,i_h,i_w,1), dtype=np.float32)
y = np.zeros((m,m_h,m_w,1), dtype=np.int32)
but in the next line have written # RGBA image has 4 channels.
If your input image has 4 channels, both the images and the model input_shape needs to reflect this
I'm trying to make a CNN (still a beginner). When trying to fit the model I am getting this error:
ValueError: A target array with shape (10000, 10) was passed for output of shape (None, 6, 6, 10) while using as loss categorical_crossentropy. This loss expects targets to have the same shape as the output.
The shape of labels = (10000, 10)
the shape of the image data = (10000, 32, 32, 3)
Code:
import pickle
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Dense, Dropout, Activation, Flatten,
Conv2D, MaxPooling2D)
from tensorflow.keras.callbacks import TensorBoard
from keras.utils import to_categorical
import numpy as np
import time
MODEL_NAME = f"_________{int(time.time())}"
BATCH_SIZE = 64
class ConvolutionalNetwork():
'''
A convolutional neural network to be used to classify images
from the CIFAR-10 dataset.
'''
def __init__(self):
'''
self.training_images -- a 10000x3072 numpy array of uint8s. Each
a row of the array stores a 32x32 colour image.
The first 1024 entries contain the red channel
values, the next 1024 the green, and the final
1024 the blue. The image is stored in row-major
order, so that the first 32 entries of the array are the red channel values of the first row of the image.
self.training_labels -- a list of 10000 numbers in the range 0-9.
The number at index I indicates the label
of the ith image in the array data.
'''
# List of image categories
self.label_names = (self.unpickle("cifar-10-batches-py/batches.meta",
encoding='utf-8')['label_names'])
self.training_data = self.unpickle("cifar-10-batches-py/data_batch_1")
self.training_images = self.training_data[b'data']
self.training_labels = self.training_data[b'labels']
# Reshaping the images + scaling
self.shape_images()
# Converts labels to one-hot
self.training_labels = np.array(to_categorical(self.training_labels))
self.create_model()
self.tensorboard = TensorBoard(log_dir=f'logs/{MODEL_NAME}')
def unpickle(self, file, encoding='bytes'):
'''
Unpickles the dataset files.
'''
with open(file, 'rb') as fo:
training_dict = pickle.load(fo, encoding=encoding)
return training_dict
def shape_images(self):
'''
Reshapes the images and scales by 255.
'''
images = list()
for d in self.training_images:
image = np.zeros((32,32,3), dtype=np.uint8)
image[...,0] = np.reshape(d[:1024], (32,32)) # Red channel
image[...,1] = np.reshape(d[1024:2048], (32,32)) # Green channel
image[...,2] = np.reshape(d[2048:], (32,32)) # Blue channel
images.append(image)
for i in range(len(images)):
images[i] = images[i]/255
images = np.array(images)
self.training_images = images
print(self.training_images.shape)
def create_model(self):
'''
Creating the ConvNet model.
'''
self.model = Sequential()
self.model.add(Conv2D(64, (3, 3), input_shape=self.training_images.shape[1:]))
self.model.add(Activation("relu"))
self.model.add(MaxPooling2D(pool_size=(2,2)))
self.model.add(Conv2D(64, (3,3)))
self.model.add(Activation("relu"))
self.model.add(MaxPooling2D(pool_size=(2,2)))
# self.model.add(Flatten())
# self.model.add(Dense(64))
# self.model.add(Activation('relu'))
self.model.add(Dense(10))
self.model.add(Activation(activation='softmax'))
self.model.compile(loss="categorical_crossentropy", optimizer="adam",
metrics=['accuracy'])
def train(self):
'''
Fits the model.
'''
print(self.training_images.shape)
print(self.training_labels.shape)
self.model.fit(self.training_images, self.training_labels, batch_size=BATCH_SIZE,
validation_split=0.1, epochs=5, callbacks=[self.tensorboard])
network = ConvolutionalNetwork()
network.train()
Would appreciate the help, have been trying to fix for an hour.
You need to uncomment the Flatten layer when creating your model. Essentially what this layer does is that it takes a 4D input (batch_size, height, width, num_filters) and unrolls it into a 2D one (batch_size, height * width * num_filters). This is needed to get the output shape you want.
Un-comment the flatten layer before your output layer in create_model(self), conv layers don't work with 1D tensors/arrays, and so for you to get the output layer of the right shape to add a Flatten() layer right before your output layer, like this:
def create_model(self):
'''
Creating the ConvNet model.
'''
self.model = Sequential()
self.model.add(Conv2D(64, (3, 3), input_shape=self.training_images.shape[1:]), activation='relu')
#self.model.add(Activation("relu"))
self.model.add(MaxPooling2D(pool_size=(2,2)))
self.model.add(Conv2D(64, (3,3), activation='relu'))
#self.model.add(Activation("relu"))
self.model.add(MaxPooling2D(pool_size=(2,2)))
# self.model.add(Dense(64))
# self.model.add(Activation('relu'))
self.model.add(Flatten())
self.model.add(Dense(10, activation='softmax'))
#self.model.add(Activation(activation='softmax'))
self.model.compile(loss="categorical_crossentropy", optimizer="adam",
metrics=['accuracy'])
print ('model output shape:', self.model.output_shape)#prints out the output shape of your model
The code above will give you a model with an output shape of (None, 10).
Also please use activation as a layer parameter in the future.
Use model.summary() to inspect the output shapes of your model. Without the commented out Flatten() layer the shapes of your layers retain the original dimensions of the image and the shape of the output layer is (None, 6, 6, 10).
What you want to do here is roughly:
start with a shape of (batch_size, img width, img heigh, channels)
use convolutions to detect patterns through the image by applying a filter
reduce the img width and height with max pooling
then Flatten() the dimensions of the image so that instead of (width, heigh, features) you end up with just a set of features.
match against your classes.
The commented out code does step 4; when you remove the Flatten() layer you end up with the wrong set of dimensions at the end.
You have to get your model output into the same shape as your labels.
Perhaps the simplest solution would be to ensure the model ends with these layers:
model.add(Flatten())
## possibly an extra dense layer or 2 with 'relu' activation
model.add(Dense(10, activation=`softmax`))
This is amongst the most common 'endings' to a categorisation model and is arguably the most straightforward to understand.
It's not clear why you commented out this section:
# self.model.add(Flatten())
# self.model.add(Dense(64))
# self.model.add(Activation('relu'))
which would appear to give you the required output shape?
I am trying to extract features from a convolution layer of the VGGFace model, using TensorFlow & Keras.
This is my code:
# Layer Features
layer_name = 'conv1_2' # Edit this line
vgg_model = VGGFace() # Pooling: None, avg or max
out = vgg_model.get_layer(layer_name).output
vgg_model_new = Model(vgg_model.input, out)
def main():
img = image.load_img('myimage.jpg', target_size=(224, 224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = utils.preprocess_input(x, version=1)
preds = vgg_model_new.predict(x)
print('Predicted:', utils.decode_predictions(preds))
exit(0)
However, at the print('Predicted:', utils.decode_predictions(preds)) line I am getting the following error:
Message=decode_predictions expects a batch of predictions (i.e. a
2D array of shape (samples, 2622)) for V1 or (samples, 8631) for
V2.Found array with shape: (1, 224, 224, 64)
I just want to extract features, I don't need to classify my images at this point. This code is based on https://github.com/rcmalli/keras-vggface
You shouldn't use utils.decode_predictions(preds) there because it's only for classification. You can see the definition of the function here https://github.com/rcmalli/keras-vggface/blob/master/keras_vggface/utils.py#L66
If you want to print the features, use print('Predicted:',preds)
I am trying to develop a 1D convolutional neural network with residual connections and batch-normalization based on the paper Cardiologist-Level Arrhythmia Detection with Convolutional Neural Networks, using keras.
This is the code so far:
# define model
x = Input(shape=(time_steps, n_features))
# First Conv / BN / ReLU layer
y = Conv1D(filters=n_filters, kernel_size=n_kernel, strides=n_strides, padding='same')(x)
y = BatchNormalization()(y)
y = ReLU()(y)
shortcut = MaxPooling1D(pool_size = n_pool)(y)
# First Residual block
y = Conv1D(filters=n_filters, kernel_size=n_kernel, strides=n_strides, padding='same')(y)
y = BatchNormalization()(y)
y = ReLU()(y)
y = Dropout(rate=drop_rate)(y)
y = Conv1D(filters=n_filters, kernel_size=n_kernel, strides=n_strides, padding='same')(y)
# Add Residual (shortcut)
y = add([shortcut, y])
# Repeated Residual blocks
for k in range (2,3): # smaller network for testing
shortcut = MaxPooling1D(pool_size = n_pool)(y)
y = BatchNormalization()(y)
y = ReLU()(y)
y = Dropout(rate=drop_rate)(y)
y = Conv1D(filters=n_filters * k, kernel_size=n_kernel, strides=n_strides, padding='same')(y)
y = BatchNormalization()(y)
y = ReLU()(y)
y = Dropout(rate=drop_rate)(y)
y = Conv1D(filters=n_filters * k, kernel_size=n_kernel, strides=n_strides, padding='same')(y)
y = add([shortcut, y])
z = BatchNormalization()(y)
z = ReLU()(z)
z = Flatten()(z)
z = Dense(64, activation='relu')(z)
predictions = Dense(classes, activation='softmax')(z)
model = Model(inputs=x, outputs=predictions)
# Compiling
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['categorical_accuracy'])
# Fitting
model.fit(train_x, train_y, epochs=n_epochs, batch_size=n_batch)
And this is the graph of a simplified model of what I am trying to build.
The model described in the paper uses an incrementing number of filters:
The network consists of 16 residual blocks with 2 convolutional layers per block. The convolutional layers all have a filter length of 16 and have 64k filters, where k starts out as 1 and is incremented every 4-th residual block. Every alternate residual block subsamples its inputs by a factor of 2, thus the original input is ultimately subsampled by a factor of 2^8. When a residual block subsamples the input, the corresponding shortcut connections also subsample their input using a Max Pooling operation with the same subsample factor.
But I can only make it work if I use the same number of filters in every Conv1D layer, with k=1, strides=1 and padding=same, without applying any MaxPooling1D. Any changes in these parameters causes a tensor size mismatch and failure to compile with the following error:
ValueError: Operands could not be broadcast together with shapes (70, 64) (70, 128)
Does anyone have any idea on how to fix this size mismatch and make it work?
In addition, if the input has more than one channel (or features) the mismatch is even worst! Is there a way to deal with more than one channel?
The issue of tensor shape mismatch should be happening in add([y, shortcut]) layer. Because of the fact that you are using MaxPooling1D layer, this halves your time-steps by default, which you can change it by using the pool_size parameter. On the other hand, your residual portion is not reducing the time-steps by same amount. You should apply stride=2 with padding='same' before adding shortcut and y in any one of Conv1D layer (preferably the last one).
For reference, you can check out the Resnet code here Keras-applications-github
The full error message is like this:
ValueError: Shapes (2, 1) and (50, 1) are incompatible
It occurs when my model is trained. The mistake either is in my input_fn:
train_input_fn = tf.estimator.inputs.numpy_input_fn(
x = {"x" : training_data},
y = training_labels,
batch_size = 50,
num_epochs = None,
shuffle = True)
in my logits and loss function:
dense = tf.layers.dense(inputs = pool2_flat, units = 1024, activation = tf.nn.relu)
dropout = tf.layers.dropout(inputs = dense, rate = 0.4, training = mode == tf.estimator.ModeKeys.TRAIN)
logits = tf.layers.dense(inputs = dropout, units = 1)
loss = tf.losses.softmax_cross_entropy(labels = labels, logits = logits)
or in my dataset. I can only print out the shape of my dataset for you to take a look at it.
#shape of the dataset
train_data.shape
(1196,2,1)
train_data[0].shape
(2,1)
#this is the data
train_data[0][0].shape
(1,)
train_data[0][0][0].shape
(20,50,50)
#this is the labels
train_data[0][1].shape
(1,)
The problem seems to be the shape of the logits. They are supposed to be [batch_size, num_classes] in this case [50,1] but are [2,1]. The shape of the labels is correctly [50,1]
I have made a github gist if you want to take a look at the whole code.
https://gist.github.com/hjkhjk1999/38f358a53da84a94bf5a59f44050aad5
In your code, you are stating that the inputs to your model will be feed in batches of 50 samples per batch with one variable. But it looks like your are feeding actually a batch of 2 samples with 1 variable (shape=[2, 1]) despite feeding labels with shape [50, 1].
That's the problem, you are giving 50 'questions' and two 'answers'.
Also, your dataset is shaped in a really weird way. I see you named your github gist 3D Conv. If you are indeed trying to do a 3D convolution you might want to reshape your dataset into a tensor (numpy array) of this shape shape = [samples, width, height, deepth]