Keras: feed images into CNN and get image output - python

So far, I've been practicing neural networks on numerical datasets in pandas, but now I need to create a model that will take an image as input and output a binary mask of that image.
I have my training data as numpy arrays of shape (602, 2048, 2048, 1). 602 images of dimensions 2048x2048 with one channel. The array of output masks have the same dimensions.
What I can't figure out is how to define the first layer or how to correctly feed the data into the model. I would greatly appreciate your help on this issue.

Well, this is not a "rule", but probably you will be using mostly 2D conv and related layers.
You feed everything as numpy arrays, as usual, maybe normalizing the values. Common options are:
Between 0 and 1 (just divide by 255.)
Between -1 and 1 (divide by 255., multiply by 2, subtract 1)
Caffe style: subtract from each channel a specific value to "center" the values based on their usual mean without rescaling them.
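The three scalings listed above can be sketched with NumPy as follows. This assumes 8-bit pixel values (0 to 255) and uses a small random array as a stand-in for the real (602, 2048, 2048, 1) data. The per-channel mean in the Caffe-style variant is computed from the array itself purely for illustration; in practice it would usually be a fixed dataset statistic.

```python
import numpy as np

# Stand-in for the real data: 4 single-channel 8-bit images (hypothetical values).
rng = np.random.default_rng(0)
images = rng.integers(0, 256, size=(4, 16, 16, 1), dtype=np.uint8)

# Work in float so the divisions below are not integer operations.
x = images.astype(np.float32)

# Option 1: scale to [0, 1].
x01 = x / 255.0

# Option 2: scale to [-1, 1].
x11 = x / 255.0 * 2.0 - 1.0

# Option 3 ("Caffe style"): subtract a per-channel mean, no rescaling.
channel_mean = x.mean(axis=(0, 1, 2), keepdims=True)  # shape (1, 1, 1, 1)
x_centered = x - channel_mean
```

Whichever option is chosen, the same transformation should be applied to the images at prediction time.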
Your model should start with something like:
inputTensor = Input((2048,2048,1))
output = Conv2D(filters, kernel_size, .....)(inputTensor)
Or, in sequential models: model.add(Conv2D(...., input_shape=(2048,2048,1)))
Later, it's up to you to decide which layers to use.
Conv2D
MaxPooling2D
UpSampling2D
Whether you're going to create a linear model or if you're going to divide branches, join branches, etc. is also your call.
Models in a U-Net style should be a good start for you.
What you can't do:
Don't use Flatten layers (actually you can, if you later reshape the output for having image dimensions... but why?)
Don't use Global Pooling layers (you don't want to sacrifice your spatial dimensions)
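As a rough illustration of the advice above, here is a minimal sketch of a very small U-Net-style model built only from Conv2D, MaxPooling2D and UpSampling2D layers. The function name build_mask_model, the filter counts, the single skip connection and the binary cross-entropy loss are all assumptions chosen for the example, not something prescribed by the answer. It assumes TensorFlow's bundled Keras is installed.

```python
from tensorflow.keras import layers, Model

def build_mask_model(size=2048, base_filters=8):
    """Tiny encoder-decoder: image in, same-sized one-channel mask out."""
    inputs = layers.Input((size, size, 1))

    # Encoder: convolve, then halve the spatial resolution.
    c1 = layers.Conv2D(base_filters, 3, padding="same", activation="relu")(inputs)
    p1 = layers.MaxPooling2D(2)(c1)

    # Bottleneck at half resolution.
    c2 = layers.Conv2D(base_filters * 2, 3, padding="same", activation="relu")(p1)

    # Decoder: upsample back and join with the encoder branch (U-Net style skip).
    u1 = layers.UpSampling2D(2)(c2)
    m1 = layers.Concatenate()([u1, c1])
    c3 = layers.Conv2D(base_filters, 3, padding="same", activation="relu")(m1)

    # One sigmoid output per pixel gives a mask probability in [0, 1].
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(c3)
    return Model(inputs, outputs)

model = build_mask_model()
# Binary cross-entropy is an assumed choice for a 0/1 mask target.
model.compile(optimizer="adam", loss="binary_crossentropy")
```

Because every layer keeps or restores the spatial dimensions, the output has the same height and width as the input. Training would then be a call along the lines of model.fit(images, masks, batch_size=...), most likely with a small batch size given how large 2048x2048 images are.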

Related

Mask layer is not working with MLPs, how to add a custom layer with masking?

I'm using MLPs to forecast a time series, and my code contains a masking layer so that the model skips the masked values.
For instance, my time series has a lot of NaN values, which I fill with the value -999. I don't want to remove them; I want Keras masking to skip them gracefully.
My code is as follows:
model = Sequential()
model.add(Masking(mask_value=-999, input_shape=(n_steps_in, )))
model.add(Dense(1024, activation='relu'))
model.add(Dense(n_steps_out))
I read an answer that said it is impossible to make masking work with MLPs.
How can I add a mask layer for MLPs, or a custom mask layer, to solve this problem?
What you're doing has a fundamental flaw.
The most important rule of masking is that
the dimension(s) you apply the mask on must stay unchanged until the final prediction of the model.
If the dimension(s) the mask is applied on change, there's no way to propagate the mask forward. In your model, the first Dense layer mixes all n_steps_in inputs into 1024 units, so a per-input mask cannot survive it.
Some scenarios where masking will work:
If you want to mask specific batch items in an MLP. For example, in your MLP, you can mask a whole feature vector, but not a single feature in that vector. Here your mask would be a tensor of size [batch size], and mask_value would have the same size as the feature vector.
If you want to mask a specific (batch item, time step) combination in an LSTM model, the mask is a tensor of size [batch size, time steps], and mask_value has the same size as the feature vector of a time step.
So in summary, you can mask only whole items, not specific features within an item.
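The first scenario (masking whole batch items) can be sketched in plain NumPy. This is only an illustration: row_mask and masked_mse are hypothetical helper names, and the rule used here (a row is dropped only when every feature equals the sentinel) mirrors how the Keras Masking layer treats a time step.

```python
import numpy as np

MASK_VALUE = -999.0  # sentinel used in the question

def row_mask(x, mask_value=MASK_VALUE):
    """True for rows to keep: rows where NOT every feature equals mask_value."""
    return ~np.all(x == mask_value, axis=-1)

def masked_mse(y_true, y_pred, keep):
    """Mean squared error averaged over the kept rows only."""
    per_row = np.mean((y_true - y_pred) ** 2, axis=-1)
    weights = keep.astype(per_row.dtype)
    return float(np.sum(per_row * weights) / max(np.sum(weights), 1.0))

x = np.array([[1.0, 2.0, 3.0],
              [MASK_VALUE, MASK_VALUE, MASK_VALUE],  # fully masked row
              [4.0, MASK_VALUE, 6.0]])               # partially filled: NOT masked
keep = row_mask(x)

y_true = np.array([[1.0], [0.0], [2.0]])
y_pred = np.array([[1.0], [5.0], [4.0]])
loss = masked_mse(y_true, y_pred, keep)  # the masked row's error is ignored
```

Note that the third row still contains a -999 feature and is kept, which is exactly the limitation described above. In Keras, a per-sample array like keep could be passed as sample_weight to model.fit to get a similar effect.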

Can 2D convolutional neural network be converted into 1D convolutional neural network?

I have designed a neural network using 2D convolutional and max-pooling layers. The input is one-hot encoded sequences stored as a 2D array, which is reshaped before being fed to the model.
data = np.zeros( (100, 21 * 1000), dtype=np.float32 )
#reshape
x_data = tf.reshape( data, [-1, 1, 1000, 21] )
However, I also trained on the same dataset with 1D convolutional layers, changing the model and passing the input array without the extra reshape, since it is already suited to 1D layers:
data = np.zeros( (100, 1000,21), dtype=np.float32 )
In the end, the 1D convolutional model performed well with 96% accuracy, while the 2D CNN gave 93%. Can someone explain to me what actually happens there to increase the accuracy?
Can someone explain to me what actually happens there to increase the accuracy?
That's hard to tell and depends on your specific dataset, network, hyperparameters etc.
Generally, in a Conv2D layer the filter slides along two axes (height and width). In a Conv1D layer the filter slides along a single axis only (the steps axis of the sequence).
So which one is the best? That depends on your problem. For time series conv1D could be better and for images conv2D could be the better choice.
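To make the sliding behaviour concrete, here is a small NumPy sketch. It assumes the flat array stores the 21 one-hot channels of each position contiguously, so that both reshapes from the question hold the same values in the same order. conv1d_valid is a hypothetical, unoptimized helper that mimics a 1D convolution without padding; it is not the Keras implementation.

```python
import numpy as np

# Two fake samples instead of 100, filled with distinct values to compare layouts.
data = np.arange(2 * 1000 * 21, dtype=np.float32).reshape(2, 21 * 1000)

x_2d = data.reshape(-1, 1, 1000, 21)  # layout fed to the 2D model (height = 1)
x_1d = data.reshape(-1, 1000, 21)     # layout fed to the 1D model

# With height 1, the 2D input is the 1D input with an extra singleton axis.
same_values = np.array_equal(x_2d[:, 0], x_1d)

def conv1d_valid(x, w):
    """Slide a filter bank along the steps axis only.

    x: (steps, channels), w: (kernel_size, channels, filters).
    """
    k = w.shape[0]
    steps_out = x.shape[0] - k + 1
    out = np.empty((steps_out, w.shape[2]), dtype=x.dtype)
    for t in range(steps_out):
        # Each output step sums over a window of k steps and all channels.
        out[t] = np.tensordot(x[t:t + k], w, axes=([0, 1], [0, 1]))
    return out

rng = np.random.default_rng(0)
w = rng.standard_normal((3, 21, 8)).astype(np.float32)
out = conv1d_valid(x_1d[0], w)  # 1000 steps, kernel 3 -> 998 output steps
```

Since the 2D input here has a height of 1, a 2D filter has no room to move vertically either, so any accuracy gap between the two models is more likely to come from differences in kernel shapes, pooling or other hyperparameters than from the data layout itself.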

What input shape should I take in first layer of Sequential model when the dimensions of the images are (2048*1536)

I have an image dataset in which each image has dimensions (2048,1536). In ImageDataGenerator, to fetch data from the directory, I used the same target size, i.e. (2048,1536). But when building the first layer of the Sequential model, what input shape should I use? Must it be the same (2048,1536), or can I pick any shape, like (224,224)?
You should probably flatten your input data into a vector of size 3145728 (2048 * 1536). If your data is in a numpy array, you can use the array's flatten() method (ndarray.flatten()).
Then your first layer can have the same shape as this vector.
I would first resize the images with cv2.resize(). Do you really need all the information from such a big image?
For a Sequential model, it looks like this, for example:
model = models.Sequential()
model.add(layers.Conv2D(32,(3,3), activation='relu', input_shape = (height,width, ndim)))
...,
where height and width denote your input image dimensions and ndim = 1 for greyscale and ndim = 3 for colored images.
For a fully connected network, the first (i.e. input) layer is supposed to match the number of features in your dataset. For images, each pixel is considered a feature. Hence in your case, with image dimensions (2048,1536), you need to flatten the image to get the total number of pixels (i.e. features). For a greyscale image that is 2048*1536*1; for a colour image it is 2048*1536*3.
Alternatively, you can use the code below from the TensorFlow/Keras API when creating the Sequential model, and it will take care of the input layer size:
import tensorflow as tf
model = tf.keras.models.Sequential([
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation=tf.nn.relu),   # 1st hidden layer
    tf.keras.layers.Dense(128, activation=tf.nn.relu),   # 2nd hidden layer
    tf.keras.layers.Dense(2, activation=tf.nn.softmax),  # output layer
])
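To put the two approaches from these answers side by side, the following sketch simply counts trainable parameters in the first layer. It assumes a greyscale image (one channel); dense_params and conv2d_params are hypothetical helpers that apply the usual weights-plus-biases formulas.

```python
def dense_params(n_inputs, units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * units + units

def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus biases of a 2D convolution layer."""
    return kernel_h * kernel_w * in_channels * filters + filters

height, width, channels = 2048, 1536, 1  # greyscale is an assumption here

# Flatten + Dense(128): every pixel connects to every unit.
n_pixels = height * width * channels
flat_dense = dense_params(n_pixels, 128)

# Conv2D(32, (3, 3)): the count does not depend on the image size at all.
first_conv = conv2d_params(3, 3, channels, 32)

print(n_pixels)    # 3145728
print(flat_dense)  # 402653312
print(first_conv)  # 320
```

The roughly 400 million parameters of the flattened first layer are one reason to consider resizing the images or starting with convolutional layers.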

What is the difference between Convolution 1D and 2D in Keras layers

I'm discovering the Keras library and I can't tell what the dimension means in Keras layers or how to choose it (model.add(Convolution2D(...)) or model.add(Convolution1D(...))).
For example, I have a set of 9000 training traces and 1000 test traces, and each trace has 1000 samples, so I created the arrays X_train with size 9000*1000, X_test with size 1000*1000, Y_train with size 9000, and Y_test with size 1000.
My question is: how do I choose the first layer's dimensions?
I tried adapting the MNIST example like this:
model.add(Convolution2D(9000, (1, 1), activation='relu', input_shape(1,9000000,1),dim_ordering='th'))
but it didn't work. I don't even know what I should put in each argument of the convolution function.
The choice of dimension (1D, 2D, etc.) depends on the dimensions of your input. For example, since you're using the MNIST dataset, you would use 2D layers since your input is an image with height and width (two dimensions). Alternatively, if you were using text data, you might use a 1D layer because sentences are linear lists of words (one dimension).
I would suggest looking at Francois Chollet's example of a convolutional neural net with MNIST: https://github.com/fchollet/keras/blob/master/examples/mnist_cnn.py. (Note: Conv2D is the same as Convolution2D.)
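For the traces described in the question, which are one-dimensional signals, a 1D layer is the natural fit. Keras Conv1D expects each sample to have shape (steps, channels), so the 2D arrays need a trailing channel axis. The sketch below only prepares the arrays; zeros stand in for the real traces, and treating each trace as a single-channel signal is an assumption.

```python
import numpy as np

# Stand-ins with the shapes given in the question.
X_train = np.zeros((9000, 1000), dtype=np.float32)
X_test = np.zeros((1000, 1000), dtype=np.float32)

# Add a channel axis: (n_traces, 1000) -> (n_traces, 1000, 1).
X_train_1d = X_train[..., np.newaxis]
X_test_1d = X_test[..., np.newaxis]

# Per-sample shape to give the first layer (the batch axis is left out).
input_shape = X_train_1d.shape[1:]
```

The first layer would then look something like Conv1D(filters, kernel_size, activation='relu', input_shape=input_shape), where the number of filters and the kernel size are free choices and not tied to the number of traces.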

Giving input of different sizes to LSTM model in keras

I have a multi-class (4-class) classification model in Keras (the linked model diagram is not included here).
While training, the model expects the input shape to be (None,None,300). That is, if there are 'n' different input sequences, the input shape should be (n,None,300). In my case, the size of each input sequence is different.
Say the input sequences have shapes (1000,300), (1500,300), (1200,300) and (2000,300). Now I need to put them together as (4,None,300). I tried using a numpy array, but a numpy array won't give a shape of (4,None,300); instead it will be (4L,).
Now I want to know how to train my model. Is it possible with numpy arrays, or is a different data structure available?
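Since your sequences have different durations, you may consider padding them with zeros (adjusting the loss/labels accordingly), for example:
import numpy as np
# seq is assumed to be a list of your four arrays, each of shape (length, 300)
max_duration = 2000
in_ = np.zeros((4, max_duration, 300), dtype='f4')
for i in range(4):
    # copy each sequence into the start of its row; the rest stays zero
    in_[i, :len(seq[i]), :] = seq[i]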
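The padding idea can be generalized a little so that it also records which time steps are real, which is what "adjusting the loss/labels accordingly" needs. This is a minimal sketch; pad_batch is a hypothetical helper, and arrays of ones stand in for the real sequences.

```python
import numpy as np

def pad_batch(seqs, dtype="float32"):
    """Zero-pad (length, features) arrays to a common length.

    Returns the padded batch and a boolean mask marking real time steps.
    """
    n_features = seqs[0].shape[1]
    max_len = max(len(s) for s in seqs)
    padded = np.zeros((len(seqs), max_len, n_features), dtype=dtype)
    mask = np.zeros((len(seqs), max_len), dtype=bool)
    for i, s in enumerate(seqs):
        padded[i, :len(s)] = s    # real data at the start of the row
        mask[i, :len(s)] = True   # everything after stays False (padding)
    return padded, mask

# Sequence lengths from the question.
lengths = [1000, 1500, 1200, 2000]
seqs = [np.ones((n, 300), dtype=np.float32) for n in lengths]
padded, mask = pad_batch(seqs)
```

With zero padding like this, one option in Keras is to put a Masking(mask_value=0.0) layer in front of the LSTM so that the padded steps are skipped, provided no real time step consists entirely of zeros.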
