Creating a Keras CNN for image alteration - python

I'm working on a problem that involves computationally evaluating three-dimensional data of the shape (32, 16, 5) and providing a corrected form of this data also in the shape of (32, 16, 5). The problem is relatively specific to my field, but it can be viewed as analogous to processing color images (just with five color channels instead of three). If it helps, this could be thought of as a color correction model.
In my initial efforts, I created a random forest model using XGBoost for each of these output parameters. I had good results, but found that the sheer number of output parameters (32*16*5 = 2560) made the runtime of this approach too long, so I am looking for an alternative.
I'm looking at using Keras to solve this, using a convolutional neural network approach, since the adjacent 'pixels' in my data should have some useful information about their neighbors. Note that 'adjacency' here is both spatial and in the color channels. So far, I am doing alright in creating a simple model that I believe has inputs/outputs of the correct shape, but I am running into an issue when I try to train the model on some dummy images:
#!/usr/bin/env python3
import tensorflow as tf
import pandas as pd
import numpy as np
def create_model(image_shape, batch_size = 10):
    width, height, channels = image_shape
    conv_shape = (batch_size, width, height, channels)
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Conv3D(filters = channels, kernel_size = 3, input_shape = conv_shape, padding = "same"))
    model.add(tf.keras.layers.Dense(channels, activation = "relu"))
    return model

if __name__ == "__main__":
    image_shape = (32, 16, 5)
    # Create test input/output data sets:
    input_img = np.random.rand(*image_shape) # Create one dummy input image
    output_img = np.random.rand(*image_shape) # Create one dummy output image
    # Create a bogus 'training set' by copying the input/output images into lists many times
    inputs = [input_img]*500
    outputs = [output_img]*500
    # Create the model and fit it to the dummy data
    model = create_model(image_shape)
    model.summary()
    model.compile(loss = "mean_squared_error", optimizer = "adam", metrics = ["accuracy"])
    model.fit(input_img, output_img)
However, when I run this code, I get the following error:
ValueError: Input 0 of layer sequential is incompatible with the layer: : expected min_ndim=5, found ndim=3. Full shape received: [32, 16, 5]
I am not really sure what the other two expected dimensions are for the data passed into model.fit(). I suspect this is a problem with the way that I am formatting my input data. Even if I have a list of input/output images, that will only bring the ndim of my data to 4, not 5.
I have been trying to find similar examples in the documentation and around the web to see what I'm doing incorrectly, but 3D convolution on a non-classifier network seems a bit off the beaten path, and I'm not having much luck (or just don't know the name of what I should search for).
I have also tried passing the dummy training set to model.fit instead of the two individual images. Fitting with model.fit(inputs, outputs), I get:
ValueError: Layer sequential expects 1 inputs, but it received 500 input tensors.
It seems that passing a list of tensors isn't correct here. If I convert the list of input images to numpy arrays with:
inputs = np.array(inputs)
outputs = np.array(outputs)
This does bring the number of dimensions in my input data up to 4, but Keras is still expecting 5. The error I get in this case is very similar to the first:
ValueError: Input 0 of layer sequential is incompatible with the layer: : expected min_ndim=5, found ndim=4. Full shape received: [None, 32, 16, 5]
I'm definitely not understanding something here, and any help would be appreciated.

I think you made two mistakes in your code:

- Instead of using Conv3D, you need to use Conv2D.
- model.fit(input_img, output_img) should be model.fit(inputs, outputs).
The reason you need Conv2D is that each of your samples has shape (width, height, channels); there is no extra depth dimension, so Conv3D (which expects 5D batched input) doesn't apply. A Conv2D kernel already spans all input channels at each spatial position, which covers the channel 'adjacency' you mentioned.
Try the script below:
#!/usr/bin/env python3
import tensorflow as tf
import pandas as pd
import numpy as np
def create_model(image_shape, batch_size = 10):
    width, height, channels = image_shape
    conv_shape = (width, height, channels)
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Conv2D(filters = channels, kernel_size = 3, input_shape = conv_shape, padding = "same"))
    model.add(tf.keras.layers.Dense(channels, activation = "relu"))
    return model

if __name__ == "__main__":
    image_shape = (32, 16, 5)
    # Create test input/output data sets:
    input_img = np.random.rand(*image_shape) # Create one dummy input image
    output_img = np.random.rand(*image_shape) # Create one dummy output image
    # Create a bogus 'training set' by copying the input/output images into lists many times
    inputs = np.array([input_img]*500)
    outputs = np.array([output_img]*500)
    # Create the model and fit it to the dummy data
    model = create_model(image_shape)
    model.summary()
    model.compile(loss = "mean_squared_error", optimizer = "adam", metrics = ["accuracy"])
    model.fit(inputs, outputs)
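For completeness: Conv3D would only be the right choice if each sample itself were three-dimensional plus a channel axis, making the batched input 5D; that is where the "expected min_ndim=5" in the original error came from. A minimal sketch of such a case (the depth of 8 is just an illustrative assumption):
model = tf.keras.models.Sequential()
# 5D batched input: (batch, depth, height, width, channels); input_shape
# excludes only the batch dimension, so Conv3D sees the extra depth axis
model.add(tf.keras.layers.Conv3D(filters = 5, kernel_size = 3, input_shape = (8, 32, 16, 5), padding = "same"))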

Related

Python CNN Implementation: logits and labels must have the same shape ((4, 2) vs (4, 1))

I have been working on building a CNN with multiple inputs for a binary classification problem in Python using the Keras module. It appears that I have built all of the input layers and hidden layers correctly, but a ValueError occurs after the first epoch saying that my logits and labels do not have the same shape (see title). The code for my output layer is:
num_classes = 2
### code for other layers ###
# fully connected layers for output
fc1 = Dense(128,activation='relu')(flatten)
fc2 = Dense(num_classes,activation='softmax')(fc1)
# complete the model
model = Model(inputs=[m1.input,m2.input,m3.input,m4.input],outputs=fc2)
After this step, I go on to compile the model and begin to fit it with my data and the training/validation labels. This is where the error occurs, and I am not sure what to do. Similar questions have suggested reshaping the labels with
train_labels = train_labels.reshape(-1,1)
val_labels = val_labels.reshape(-1,1)
but this resulted in the same error. Any thoughts would be welcome on how I could fix this.
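A likely cause, assuming integer labels of shape (N, 1) as in the reshaping attempt above: with Dense(num_classes) and softmax, the model emits (N, 2) predictions, so the labels must be one-hot encoded to match; alternatively, a single sigmoid unit keeps (N, 1) labels valid. A sketch of both options (the loss settings are the standard Keras pairings, not taken from the question):
from keras.utils import to_categorical
from keras.layers import Dense
# Option 1: keep Dense(2, softmax) and one-hot the labels to shape (N, 2)
train_labels = to_categorical(train_labels, num_classes=2)
val_labels = to_categorical(val_labels, num_classes=2)
# ...then compile with loss='categorical_crossentropy'
# Option 2: a single sigmoid unit, so the existing (N, 1) labels already match
fc2 = Dense(1, activation='sigmoid')(fc1)
# ...then compile with loss='binary_crossentropy'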

How to print out the tensor values of a specific layer

I wish to examine the values of a tensor after a mask is applied to it.
Here is a truncated part of the model. I set temp = x so that I can later print temp to check the exact values.
This is a 4-class classification model using acoustic features. Assume I have data of shape (1000, 50, 136) as (batch, timesteps, features).
The objective is to check whether the model is learning the features timestep by timestep. In other words, we wish to confirm that the model is learning from a slice of all features at one timestep (the red rectangle in the picture). Logically, that is how a Keras LSTM layer should work, but the confusion matrix produced is quite different when a parameter changes (e.g. the number of Dense units). The validation accuracy stays at 45%, so we would like to visualize what the model is doing.
The proposed idea is to print out the first step of the first batch and compare it with the input to the model. If they are the same, then the model is learning in the right way (all (136, 1) features at once) instead of over (50, 1) timesteps of a single feature at a time.
input_feature = Input(shape=(X_train.shape[1],X_train.shape[2]))
x = Masking(mask_value=0)(input_feature)
temp = x
x = Dense(Dense_unit,kernel_regularizer=l2(dense_reg), activation='relu')(x)
I have tried tf.print(), which gave me AttributeError: 'Tensor' object has no attribute '_datatype_enum'.
Following Get output from a non final keras model layer, as suggested by Lescurel, I tried:
model2 = Model(inputs=[input_attention, input_feature], outputs=model.get_layer('masking')).output
print(model2.predict(X_test))
AttributeError: 'Masking' object has no attribute 'op'
You want the output after the mask is applied.
Lescurel's link in the comment shows how to do that, and so does this link to GitHub.
You need to make a new model that:

- takes as inputs the input from your model
- takes as outputs the output from the layer
I tested it with some made-up code derived from your snippets.
import numpy as np
from keras import Input
from keras.layers import Masking, Dense
from keras.regularizers import l2
from keras.models import Sequential, Model
X_train = np.random.rand(4,3,2)
Dense_unit = 1
dense_reg = 0.01
mdl = Sequential()
mdl.add(Input(shape=(X_train.shape[1],X_train.shape[2]),name='input_feature'))
mdl.add(Masking(mask_value=0,name='masking'))
mdl.add(Dense(Dense_unit,kernel_regularizer=l2(dense_reg),activation='relu',name='output_feature'))
mdl.summary()
mdl2mask = Model(inputs=mdl.input,outputs=mdl.get_layer("masking").output)
maskoutput = mdl2mask.predict(X_train)
mdloutput = mdl.predict(X_train)
maskoutput # print output after/of masking
mdloutput # print output of mdl
maskoutput.shape #(4, 3, 2): masking has the shape of the layer before (input here)
mdloutput.shape #(4, 3, 1): shape of the output of dense

Transfer learning with Keras: Input shape mismatch

I'm running a classification and prediction neural network algorithm using a pre-trained model with Keras.
Now, I know the expected input shape for Keras' VGG16 is (224, 224, 3), but my input has shape (180, 200, 20), and I get the following error:
ValueError: Dimension 0 in both shapes must be equal, but are 3 and 64. Shapes are [3,3,20,64] and [64,3,3,3]. for 'Assign_32' (op: 'Assign') with input shapes: [3,3,20,64], [64,3,3,3].
and here is the code:
from keras import applications
from keras.layers import Input
input_tensor = Input(shape = (180, 200, 20))
vgg_model = applications.VGG16(weights = 'imagenet', include_top = False, input_tensor = input_tensor)
vgg_model.summary()
Any idea how to get around this? Thank you
From the documentation:

input_shape: optional shape tuple, only to be specified if include_top is False (otherwise the input shape has to be (224, 224, 3) (with 'channels_last' data format) or (3, 224, 224) (with 'channels_first' data format). It should have exactly 3 inputs channels, and width and height should be no smaller than 32. E.g. (200, 200, 3) would be one valid value.
You can try to create a VGG16 from scratch using this link: VGG16 model for Keras.
You need to resize your input image:
from keras.preprocessing import image
img = image.load_img("image1.jpeg",target_size=(224,224))
If you want to learn how to do transfer learning from scratch in Keras, you can read this article, which has a step-by-step implementation:
https://medium.com/@1297rohit/transfer-learning-from-scratch-using-keras-339834b153b9
In your case, since you are not dealing with images of the right size (or number of channels), you may want to cut out large parts of the VGG network to still preserve the information contained in the middle layers, but I am not sure how effective it would be.
You would need to remove the first convolution layer, and all the dense layers at the end, replacing them with your own layers. You would certainly need to retrain the whole network, so rather than transfer learning you would be doing very smart initialization.
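For the cutting-out approach, a rough sketch (the 64-filter replacement conv and the layer-reuse pattern are my assumptions about how to wire this up, not something the question or the VGG16 API prescribes):
from keras import applications
from keras.layers import Input, Conv2D
from keras.models import Model

base = applications.VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

inp = Input(shape=(180, 200, 20))
# Replace block1_conv1 with a fresh conv mapping 20 channels to the 64
# channels the rest of VGG16 expects; this layer starts untrained
x = Conv2D(64, (3, 3), padding='same', activation='relu', name='replacement_conv1')(inp)
# Reapply the remaining pretrained layers (skip the InputLayer and block1_conv1)
for layer in base.layers[2:]:
    x = layer(x)
model = Model(inp, x)
model.summary()
Everything after the replacement layer keeps its ImageNet weights, but as noted above, the whole network would still need retraining.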

Keras: Understanding the role of Embedding layer in a Conditional GAN

I am working to understand Erik Linder-Norén's implementation of the Categorical GAN model, and am confused by the generator in that model:
def build_generator(self):
    model = Sequential()
    # ...some lines removed...
    model.add(Dense(np.prod(self.img_shape), activation='tanh'))
    model.add(Reshape(self.img_shape))
    model.summary()
    noise = Input(shape=(self.latent_dim,))
    label = Input(shape=(1,), dtype='int32')
    label_embedding = Flatten()(Embedding(self.num_classes, self.latent_dim)(label))
    model_input = multiply([noise, label_embedding])
    img = model(model_input)
    return Model([noise, label], img)
My question is: How does the Embedding() layer work here?
I know that noise is a vector that has length 100, and label is an integer, but I don't understand what the label_embedding object contains or how it functions here.
I tried printing the shape of label_embedding to try and figure out what's going on in that Embedding() line but that returns (?,?).
If anyone could help me understand how the Embedding() lines here work, I'd be very grateful for their assistance!
To see why an embedding is used here at all: the alternative is to concatenate the noise with the conditioned class, which may cause the generator to completely ignore the noise values and generate data with high similarity within each class (or even just one output per class).
From the documentation, https://keras.io/layers/embeddings/#embedding,
Turns positive integers (indexes) into dense vectors of fixed size.
eg. [[4], [20]] -> [[0.25, 0.1], [0.6, -0.2]]
In the GAN model, the input integer (0-9) is converted to a vector of shape 100. With this short code snippet, we can feed some test input to check the output shape of the Embedding layer.
from keras.layers import Input, Embedding
from keras.models import Model
import numpy as np
latent_dim = 100
num_classes = 10
label = Input(shape=(1,), dtype='int32')
label_embedding = Embedding(num_classes, latent_dim)(label)
mod = Model(label, label_embedding)
test_input = np.zeros((1))
print(f'output shape is {mod.predict(test_input).shape}')
mod.summary()
output shape is (1, 1, 100)
From the model summary, the output shape for the embedding layer is (1, 100), which is the same as the output of predict.
embedding_1 (Embedding) (None, 1, 100) 1000
One additional point: in the output shape (1, 1, 100), the leftmost 1 is the batch size and the middle 1 is the input length. In this case, we provided an input of length 1.
The embedding stores the per label state. If I read the code correctly, each label corresponds to a digit; i.e. there is an embedding that captures how to generate a 0, 1, ... 9.
This code takes some random noise and multiplies it to this per label state. The result should be a vector that leads the generator to display the digit corresponding to the label (i.e. 0..9).
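A small probe model (assembled here purely for illustration, reusing the names from the snippets above) makes the per-label mixing concrete:
from keras.layers import Input, Embedding, Flatten, multiply
from keras.models import Model
import numpy as np

latent_dim = 100
num_classes = 10
noise = Input(shape=(latent_dim,))
label = Input(shape=(1,), dtype='int32')
# (None, 1) -> (None, 1, 100) -> (None, 100): one learned vector per digit
label_embedding = Flatten()(Embedding(num_classes, latent_dim)(label))
# element-wise product: the noise is modulated by the label's vector
model_input = multiply([noise, label_embedding])
probe = Model([noise, label], model_input)

z = np.random.randn(2, latent_dim)
y = np.array([[3], [7]])
print(probe.predict([z, y]).shape) # (2, 100)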

Keras and input shape to Conv1D issues

First off, I am very new to Neural Nets and Keras.
I am trying to create a simple Neural Network using Keras where the input is a time series and the output is another time series of same length (1 dimensional vectors).
I made dummy code to create random input and output time series using a Conv1D layer. The Conv1D layer then outputs 6 different time series (because I have 6 filters), and the next layer I define adds all 6 of those outputs into one, which is the output of the entire network.
import numpy as np
import tensorflow as tf
from tensorflow.python.keras.models import Model
from tensorflow.python.keras.layers import Conv1D, Input, Lambda
def summation(x):
    y = tf.reduce_sum(x, 0)
    return y
time_len = 100 # total length of time series
num_filters = 6 # number of filters/outputs to Conv1D layer
kernel_len = 10 # length of kernel (memory size of convolution)
# create random input and output time series
X = np.random.randn(time_len)
Y = np.random.randn(time_len)
# Create neural network architecture
input_layer = Input(shape = X.shape)
conv_layer = Conv1D(filters = num_filters, kernel_size = kernel_len, padding = 'same')(input_layer)
summation_layer = Lambda(summation)(conv_layer)
model = Model(inputs = input_layer, outputs = summation_layer)
model.compile(loss = 'mse', optimizer = 'adam', metrics = ['mae'])
model.fit(X,Y,epochs = 1, metrics = ['mae'])
The error I get is:
ValueError: Input 0 of layer conv1d_1 is incompatible with the layer: expected ndim=3, found ndim=2. Full shape received: [None, 100]
Looking at the Keras documentation for Conv1D, the input shape is supposed to be a 3D tensor of shape (batch, steps, channels) which I don't understand if we are working with 1 dimensional data.
Can you explain the meaning of each of the items: batch, steps, and channels? And how should I shape my 1D vectors to allow my network to run?
What is a (training) sample?
The (training) data may consist of tens, hundreds or thousands of samples. For example, each image in an image dataset like Cifar-10 or ImageNet is a sample. As another example, for a timeseries dataset which consists of weather statistics recorded over 10 years, each training sample may be the timeseries of one day. If we have recorded 100 measurements during the day and each measurement consists of temperature and humidity (i.e. we have two features per measurement), then the shape of our dataset is roughly (10x365, 100, 2).
What is batch size?
The batch size is simply the number of samples that can be processed by the model at a single time. We can set the batch size using the batch_size argument of the fit method in Keras. Common values are 16, 32, 64, 128, 256, etc. (though you must choose a number such that your machine has enough RAM to allocate the required resources).
Further, the "steps" (also called "sequence length") and "channels" (also called "feature size") are the number of measurements and the size of each measurement, respectively. For example in our weather example above, we have steps=100 and channels=2.
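For instance, the weather dataset described above would be stored as a single array whose shape encodes all three quantities:
import numpy as np
# 10 years of daily series, 100 measurements per day, 2 features per measurement
weather_data = np.random.randn(10 * 365, 100, 2) # (samples, steps, channels)
print(weather_data.shape) # (3650, 100, 2)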
To resolve the issue with your code you need to define your training data (i.e. X) such that it has a shape of (num_samples, steps or time_len, channels or feat_size):
n_samples = 1000 # we have 1000 samples in our training data
n_channels = 1 # each measurement has one feature
X = np.random.randn(n_samples, time_len, n_channels)
# if you want to predict one value for each measurement
Y = np.random.randn(n_samples, time_len)
# or if you want to predict one value for each sample
Y = np.random.randn(n_samples)
Edit:
One more thing: you should pass the shape of one sample as the input shape of the model. Therefore, the input shape of the Input layer must be passed as shape=X.shape[1:].
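Putting the reshaped data and the corrected input shape together, a minimal end-to-end version of the script might look like the sketch below. Note that the sum is taken over the last (filter) axis so the output shape matches Y; the original reduce_sum(x, 0) would collapse the batch dimension instead:
import numpy as np
import tensorflow as tf
from tensorflow.python.keras.models import Model
from tensorflow.python.keras.layers import Conv1D, Input, Lambda

time_len = 100 # total length of time series
num_filters = 6 # number of filters/outputs to Conv1D layer
kernel_len = 10 # length of kernel (memory size of convolution)
n_samples = 1000 # number of training samples
n_channels = 1 # each measurement has one feature

X = np.random.randn(n_samples, time_len, n_channels)
Y = np.random.randn(n_samples, time_len)

input_layer = Input(shape = X.shape[1:]) # (time_len, n_channels)
conv_layer = Conv1D(filters = num_filters, kernel_size = kernel_len, padding = 'same')(input_layer)
# sum the six filter outputs over the channel axis -> (batch, time_len)
summation_layer = Lambda(lambda x: tf.reduce_sum(x, axis = -1))(conv_layer)

model = Model(inputs = input_layer, outputs = summation_layer)
model.compile(loss = 'mse', optimizer = 'adam', metrics = ['mae'])
model.fit(X, Y, epochs = 1)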
