Input/output shapes of GANs for sequential data - python

I am trying to do time series prediction using GANs with MXNet/Gluon. I have sequential data of size (N, 1), which I have transformed into (N-stepsize, stepsize). I am having a hard time understanding the input and output shapes of the network. Here is the code for the Generator and Discriminator networks.
netG = nn.Sequential()
with netG.name_scope():
    netG.add(nn.Dense(20))
    netG.add(nn.BatchNorm(momentum = 0.8))
    netG.add(nn.Dropout(0.5))
    netG.add(nn.Dense(15))
    netG.add(nn.BatchNorm(momentum = 0.8))
    netG.add(nn.Dropout(0.5))
    netG.add(nn.Dense(20))
    netG.add(nn.BatchNorm(momentum = 0.8))
    netG.add(nn.Dropout(0.5))
    netG.add(nn.Dense(step_size, activation = "tanh"))
#300, 50, 2
#input shape is inferred
netD = nn.Sequential()
with netD.name_scope():
    netD.add(nn.Dense(20))
    netG.add(nn.BatchNorm(momentum = 0.8))
    netD.add(nn.Dense(15, activation='tanh'))
    netG.add(nn.BatchNorm(momentum = 0.8))
    netD.add(nn.Dense(20, activation='tanh'))
    netD.add(nn.Dense(step_size))
Thanks in advance.

You can check the tensor shapes with the following code:
print(mx.viz.print_summary(netG(mx.sym.var('data')), shape={'data':(1,100,10)}))
I am assuming here that N-stepsize is equal to 100 and stepsize is equal to 10.
You have 2 errors in the discriminator: both BatchNorm layers are added to netG instead of netD.
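For the (N, 1) to (N-stepsize, stepsize) transformation mentioned in the question, here is a minimal pure-Python sketch of one way such windows could be built. The helper name make_windows and the toy series are made up for illustration, and the asker's actual preprocessing may differ.

```python
def make_windows(series, step_size):
    """Turn a flat series of length N into N - step_size overlapping windows."""
    n = len(series)
    if step_size <= 0 or step_size >= n:
        raise ValueError("step_size must be between 1 and len(series) - 1")
    # Window i holds series[i], ..., series[i + step_size - 1].
    return [series[i:i + step_size] for i in range(n - step_size)]


series = [float(t) for t in range(110)]        # toy data, N = 110
windows = make_windows(series, step_size=10)   # 100 rows of 10 values each
print(len(windows), len(windows[0]))
```

With N = 110 and stepsize = 10 this gives 100 rows of 10 values, matching the sizes assumed in the answer above.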

Related

kernel size of CNN model

I was wondering if you could help me with an error I am getting in the code I am writing.
I have 2 datasets as inputs and 1 other as the target dataset. All datasets are sets of images with dimensions (17, 20, 1).
My code is:
from tensorflow.keras.layers import Concatenate, Conv2D, Input
from tensorflow.keras import Model, Sequential
# define two sets of inputs
inputA = Input(shape=(17, 20, 1))
inputB = Input(shape=(17, 20, 1))
# the first branch operates on the first input
x = Sequential()(inputA)
x = Conv2D(filters=64, kernel_size=(3,3), activation='relu')(x)
x = Model(inputs=inputA, outputs=x)
# the second branch operates on the second input
y = Sequential()(inputB)
y = Conv2D(filters=64, kernel_size=(3,3), activation='relu')(y)
y = Model(inputs=inputB, outputs=y)
# combine the output of the two branches
combined = Concatenate()([x.output, y.output])
# apply a FC layer and then a regression prediction on the
# combined outputs
z = Sequential()(combined)
z = Conv2D(filters=64, kernel_size=(3,3), activation='relu')(z)
# our model will accept the inputs of the two branches and
# then output a single value
model = Model(inputs=[x.input, y.input], outputs=z)
model.compile(loss="mean_absolute_percentage_error", optimizer='adam', metrics=['accuracy'])
history = model.fit([input1_train, input2_train], target_train, validation_data=([input1_test, input2_test], target_test), epochs=100, verbose=0)
Then I get this error:
ValueError: Dimensions must be equal, but are 17 and 13 for '{{node mean_absolute_percentage_error/sub}} = Sub[T=DT_FLOAT](IteratorGetNext:1, model_18/conv2d_18/Relu)' with input shapes: [?,17,20,1], [?,13,16,64].
I also tested this code with images of shape (20, 20, 1) and got this error:
ValueError: Dimensions must be equal, but are 20 and 16 for '{{node mean_absolute_percentage_error/sub}} = Sub[T=DT_FLOAT](IteratorGetNext:1, model_15/conv2d_15/Relu)' with input shapes: [?,20,20,1], [?,16,16,64].
But when I set the kernel size to (1,1), the code runs with no problem.
Does anyone know where the problem comes from, and what I should do?
I would be very thankful if anyone could help me fix it.
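The shapes in the two error messages are consistent with each input passing through two 3x3 convolutions with the Keras default padding='valid', each of which trims kernel_size - 1 pixels per axis. A small sketch of that arithmetic follows; the helper names conv_output_size and stack_output_shape are made up for illustration, and this is a shape calculation only, not a Keras model.

```python
import math


def conv_output_size(size, kernel, stride=1, padding="valid"):
    """Output length of one convolution along one axis (Keras conventions)."""
    if padding == "same":
        return math.ceil(size / stride)
    if padding == "valid":
        return (size - kernel) // stride + 1
    raise ValueError(f"unknown padding: {padding!r}")


def stack_output_shape(height, width, kernels, filters, padding="valid"):
    """Shape after applying square kernels one after another."""
    for k in kernels:
        height = conv_output_size(height, k, padding=padding)
        width = conv_output_size(width, k, padding=padding)
    return (height, width, filters)


print(stack_output_shape(17, 20, kernels=[3, 3], filters=64))  # (13, 16, 64)
print(stack_output_shape(20, 20, kernels=[3, 3], filters=64))  # (16, 16, 64)
print(stack_output_shape(17, 20, kernels=[1, 1], filters=64))  # (17, 20, 64)
print(stack_output_shape(17, 20, kernels=[3, 3], filters=1, padding="same"))  # (17, 20, 1)
```

With (1,1) kernels the spatial size is unchanged, and the remaining 64-versus-1 channel difference is presumably absorbed by broadcasting in the loss, which would explain why that variant runs. One option, stated here as an assumption rather than a tested fix, is to use padding='same' in every Conv2D and give the last layer filters=1 so the output matches the (17, 20, 1) target.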

LSTM layers accepting any input shape when expecting them not to

I'm building a neural network using keras and I'm a little lost on the LSTM layer input shape. Below is an image of the relevant part.
Both towers are similar with the only difference that the left accepts sequences of any length and the right only accepts sequences of length 5. This results in their LSTM layers receiving an ambiguous sequence length and a sequence length of 4 respectively, both with 8 features per timestep. I'd thus expect both LSTM layers should have an input_shape of (1,8).
My confusion now comes from the fact that both LSTM layers will accept any input shape without a problem, which is why I think this might not work the way I think it does. I'd expect the right LSTM layer to require an input shape with the first dimension either 1, 2 or 4 as only these sizes would be able to divide the input sequence of 4. Further, I'd expect both to require the second dimension to always be 8.
Could someone explain why the LSTM layers can accept any input shape, and whether they process the sequences correctly with input_shape=(1,8)? Below is the relevant code.
# Tower 1
inp_sentence1 = Input(shape=(None, 300, 1))
conv11 = Conv2D(32, (2, 300))(inp_sentence1)
reshape11 = K.squeeze(conv11, 2)
maxpl11 = MaxPooling1D(4, data_format='channels_first')(reshape11)
lstm11 = LSTM(units=6, input_shape=(1,8))(maxpl11)
# Tower 2
inp_sentence2 = Input(shape=(5, 300, 1))
conv21 = Conv2D(32, (2, 300))(inp_sentence2)
reshape21 = Reshape((4,32))(conv21)
maxpl21 = MaxPooling1D(4, data_format='channels_first')(reshape21)
lstm21 = LSTM(units=6, input_shape=(1,8))(maxpl21)
EDIT: Short reproduction of problem on dummy data:
# Tower 1
inp_sentence1 = Input(shape=(None, 300, 1))
conv11 = Conv2D(32, (2, 300))(inp_sentence1)
reshape11 = K.squeeze(conv11, 2)
maxpl11 = MaxPooling1D(4, data_format='channels_first')(reshape11)
lstm11 = LSTM(units=6, input_shape=(1,8))(maxpl11)
# Tower 2
inp_sentence2 = Input(shape=(5, 300, 1))
conv21 = Conv2D(32, (2, 300))(inp_sentence2)
reshape21 = Reshape((4,32))(conv21)
maxpl21 = MaxPooling1D(4, data_format='channels_first')(reshape21)
lstm21 = LSTM(units=6, input_shape=(1,8))(maxpl21)
# Combine towers
substract = Subtract()([lstm11, lstm21])
dense = Dense(16, activation='relu')(substract)
final = Dense(1, activation='sigmoid')(dense)
# Build model
model = Model([inp_sentence1, inp_sentence2], final)
model.compile(optimizer='adam',
              loss='binary_crossentropy',
              metrics=['accuracy'])
# Create data
random_length = random.randint(2, 10)
x1 = numpy.random.random((100, random_length, 300))
x2 = numpy.random.random((100, 5, 300))
y = numpy.random.randint(2, size=100)
# Train and predict on data
model.fit([x1, x2], y, epochs=10, batch_size=5)
prediction = model.predict([x1, x2])
prediction = [round(x) for [x] in prediction]
classification = prediction == y
print("accuracy:", sum(classification)/len(prediction))
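To see what the LSTM layers actually receive, the shapes can be traced by hand through each tower. The sketch below does only that arithmetic in plain Python; the helper names are made up for illustration. It assumes the Keras defaults (padding='valid', stride 1 for Conv2D, strides equal to pool_size for MaxPooling1D) and that data_format='channels_first' makes the pooling layer read its input as (channels, steps).

```python
def conv2d_valid(shape, filters, kernel):
    """Output (height, width, channels) of a Conv2D with padding='valid', stride 1."""
    height, width, _ = shape
    k_h, k_w = kernel
    return (height - k_h + 1, width - k_w + 1, filters)


def max_pool1d_channels_first(shape, pool_size):
    """With data_format='channels_first' the shape is read as (channels, steps)."""
    channels, steps = shape
    return (channels, (steps - pool_size) // pool_size + 1)


def trace_tower(sentence_length):
    """Shape handed to the LSTM for a sentence of the given length."""
    shape = conv2d_valid((sentence_length, 300, 1), filters=32, kernel=(2, 300))
    shape = (shape[0], shape[2])  # drop the width axis of size 1 (squeeze / Reshape)
    return max_pool1d_channels_first(shape, pool_size=4)


print(trace_tower(5))   # (4, 8)
print(trace_tower(10))  # (9, 8)
```

Under this reading each LSTM receives a (timesteps, 8) tensor, with 4 timesteps in the fixed-length tower, whatever is passed as input_shape=(1,8). That would be consistent with the layers taking their shape from the incoming tensor, but it is an interpretation of the shapes, not a confirmed explanation.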

Target array shape different to expected output using Tensorflow

I'm trying to make a CNN (still a beginner). When trying to fit the model I am getting this error:
ValueError: A target array with shape (10000, 10) was passed for output of shape (None, 6, 6, 10) while using as loss categorical_crossentropy. This loss expects targets to have the same shape as the output.
The shape of labels = (10000, 10)
the shape of the image data = (10000, 32, 32, 3)
Code:
import pickle
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Dense, Dropout, Activation, Flatten,
                                     Conv2D, MaxPooling2D)
from tensorflow.keras.callbacks import TensorBoard
from keras.utils import to_categorical
import numpy as np
import time
MODEL_NAME = f"_________{int(time.time())}"
BATCH_SIZE = 64
class ConvolutionalNetwork():
    '''
    A convolutional neural network to be used to classify images
    from the CIFAR-10 dataset.
    '''
    def __init__(self):
        '''
        self.training_images -- a 10000x3072 numpy array of uint8s. Each
                                row of the array stores a 32x32 colour image.
                                The first 1024 entries contain the red channel
                                values, the next 1024 the green, and the final
                                1024 the blue. The image is stored in row-major
                                order, so that the first 32 entries of the array
                                are the red channel values of the first row of
                                the image.
        self.training_labels -- a list of 10000 numbers in the range 0-9.
                                The number at index i indicates the label
                                of the ith image in the array data.
        '''
        # List of image categories
        self.label_names = (self.unpickle("cifar-10-batches-py/batches.meta",
                                          encoding='utf-8')['label_names'])
        self.training_data = self.unpickle("cifar-10-batches-py/data_batch_1")
        self.training_images = self.training_data[b'data']
        self.training_labels = self.training_data[b'labels']
        # Reshaping the images + scaling
        self.shape_images()
        # Converts labels to one-hot
        self.training_labels = np.array(to_categorical(self.training_labels))
        self.create_model()
        self.tensorboard = TensorBoard(log_dir=f'logs/{MODEL_NAME}')
    def unpickle(self, file, encoding='bytes'):
        '''
        Unpickles the dataset files.
        '''
        with open(file, 'rb') as fo:
            training_dict = pickle.load(fo, encoding=encoding)
        return training_dict
    def shape_images(self):
        '''
        Reshapes the images and scales by 255.
        '''
        images = list()
        for d in self.training_images:
            image = np.zeros((32,32,3), dtype=np.uint8)
            image[...,0] = np.reshape(d[:1024], (32,32)) # Red channel
            image[...,1] = np.reshape(d[1024:2048], (32,32)) # Green channel
            image[...,2] = np.reshape(d[2048:], (32,32)) # Blue channel
            images.append(image)
        for i in range(len(images)):
            images[i] = images[i]/255
        images = np.array(images)
        self.training_images = images
        print(self.training_images.shape)
    def create_model(self):
        '''
        Creating the ConvNet model.
        '''
        self.model = Sequential()
        self.model.add(Conv2D(64, (3, 3), input_shape=self.training_images.shape[1:]))
        self.model.add(Activation("relu"))
        self.model.add(MaxPooling2D(pool_size=(2,2)))
        self.model.add(Conv2D(64, (3,3)))
        self.model.add(Activation("relu"))
        self.model.add(MaxPooling2D(pool_size=(2,2)))
        # self.model.add(Flatten())
        # self.model.add(Dense(64))
        # self.model.add(Activation('relu'))
        self.model.add(Dense(10))
        self.model.add(Activation(activation='softmax'))
        self.model.compile(loss="categorical_crossentropy", optimizer="adam",
                           metrics=['accuracy'])
    def train(self):
        '''
        Fits the model.
        '''
        print(self.training_images.shape)
        print(self.training_labels.shape)
        self.model.fit(self.training_images, self.training_labels, batch_size=BATCH_SIZE,
                       validation_split=0.1, epochs=5, callbacks=[self.tensorboard])
network = ConvolutionalNetwork()
network.train()
Would appreciate the help, have been trying to fix for an hour.
You need to uncomment the Flatten layer when creating your model. Essentially what this layer does is that it takes a 4D input (batch_size, height, width, num_filters) and unrolls it into a 2D one (batch_size, height * width * num_filters). This is needed to get the output shape you want.
Un-comment the Flatten layer before your output layer in create_model(self). Conv layers work on multi-dimensional feature maps rather than 1D arrays, so to get an output layer of the right shape, add a Flatten() layer right before it, like this:
def create_model(self):
    '''
    Creating the ConvNet model.
    '''
    self.model = Sequential()
    self.model.add(Conv2D(64, (3, 3), activation='relu', input_shape=self.training_images.shape[1:]))
    #self.model.add(Activation("relu"))
    self.model.add(MaxPooling2D(pool_size=(2,2)))
    self.model.add(Conv2D(64, (3,3), activation='relu'))
    #self.model.add(Activation("relu"))
    self.model.add(MaxPooling2D(pool_size=(2,2)))
    # self.model.add(Dense(64))
    # self.model.add(Activation('relu'))
    self.model.add(Flatten())
    self.model.add(Dense(10, activation='softmax'))
    #self.model.add(Activation(activation='softmax'))
    self.model.compile(loss="categorical_crossentropy", optimizer="adam",
                       metrics=['accuracy'])
    print('model output shape:', self.model.output_shape)  # prints out the output shape of your model
The code above will give you a model with an output shape of (None, 10).
Also please use activation as a layer parameter in the future.
Use model.summary() to inspect the output shapes of your model. Without the commented out Flatten() layer the shapes of your layers retain the original dimensions of the image and the shape of the output layer is (None, 6, 6, 10).
What you want to do here is roughly:
start with a shape of (batch_size, img width, img height, channels)
use convolutions to detect patterns through the image by applying a filter
reduce the img width and height with max pooling
then Flatten() the dimensions of the image so that instead of (width, height, features) you end up with just a set of features.
match against your classes.
The commented out code does step 4; when you remove the Flatten() layer you end up with the wrong set of dimensions at the end.
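The (None, 6, 6, 10) in the error message can be reproduced by hand from the layer settings in the question. Here is a plain-Python sketch of that arithmetic, assuming the Keras defaults (padding='valid' for Conv2D, stride equal to pool_size for MaxPooling2D); the helper names are made up for illustration.

```python
def conv_valid(size, kernel):
    """One axis after a stride-1 convolution with padding='valid'."""
    return size - kernel + 1


def max_pool(size, pool_size):
    """One axis after max pooling with stride equal to pool_size."""
    return size // pool_size


filters = 64
size = 32                                # CIFAR-10 images are 32x32
size = max_pool(conv_valid(size, 3), 2)  # 32 -> 30 -> 15
size = max_pool(conv_valid(size, 3), 2)  # 15 -> 13 -> 6

# Without Flatten, Dense(10) only transforms the last axis.
without_flatten = (size, size, 10)
# With Flatten, the feature maps become one vector per image first.
flattened_features = size * size * filters
with_flatten = (10,)

print(without_flatten)       # (6, 6, 10)
print(flattened_features)    # 2304
print(with_flatten)          # (10,)
```

So the one-hot labels of shape (10,) per image only line up with the model output once the 6 x 6 x 64 feature maps are flattened into 2304 features.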
You have to get your model output into the same shape as your labels.
Perhaps the simplest solution would be to ensure the model ends with these layers:
model.add(Flatten())
## possibly an extra dense layer or 2 with 'relu' activation
model.add(Dense(10, activation='softmax'))
This is amongst the most common 'endings' to a categorisation model and is arguably the most straightforward to understand.
It's not clear why you commented out this section:
# self.model.add(Flatten())
# self.model.add(Dense(64))
# self.model.add(Activation('relu'))
which would appear to give you the required output shape?

How to implement a 1D convolutional neural network with residual connections and batch-normalization in Keras?

I am trying to develop a 1D convolutional neural network with residual connections and batch-normalization based on the paper Cardiologist-Level Arrhythmia Detection with Convolutional Neural Networks, using keras.
This is the code so far:
# define model
x = Input(shape=(time_steps, n_features))
# First Conv / BN / ReLU layer
y = Conv1D(filters=n_filters, kernel_size=n_kernel, strides=n_strides, padding='same')(x)
y = BatchNormalization()(y)
y = ReLU()(y)
shortcut = MaxPooling1D(pool_size = n_pool)(y)
# First Residual block
y = Conv1D(filters=n_filters, kernel_size=n_kernel, strides=n_strides, padding='same')(y)
y = BatchNormalization()(y)
y = ReLU()(y)
y = Dropout(rate=drop_rate)(y)
y = Conv1D(filters=n_filters, kernel_size=n_kernel, strides=n_strides, padding='same')(y)
# Add Residual (shortcut)
y = add([shortcut, y])
# Repeated Residual blocks
for k in range (2,3): # smaller network for testing
    shortcut = MaxPooling1D(pool_size = n_pool)(y)
    y = BatchNormalization()(y)
    y = ReLU()(y)
    y = Dropout(rate=drop_rate)(y)
    y = Conv1D(filters=n_filters * k, kernel_size=n_kernel, strides=n_strides, padding='same')(y)
    y = BatchNormalization()(y)
    y = ReLU()(y)
    y = Dropout(rate=drop_rate)(y)
    y = Conv1D(filters=n_filters * k, kernel_size=n_kernel, strides=n_strides, padding='same')(y)
    y = add([shortcut, y])
z = BatchNormalization()(y)
z = ReLU()(z)
z = Flatten()(z)
z = Dense(64, activation='relu')(z)
predictions = Dense(classes, activation='softmax')(z)
model = Model(inputs=x, outputs=predictions)
# Compiling
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['categorical_accuracy'])
# Fitting
model.fit(train_x, train_y, epochs=n_epochs, batch_size=n_batch)
And this is the graph of a simplified model of what I am trying to build.
The model described in the paper uses an incrementing number of filters:
The network consists of 16 residual blocks with 2 convolutional layers per block. The convolutional layers all have a filter length of 16 and have 64k filters, where k starts out as 1 and is incremented every 4-th residual block. Every alternate residual block subsamples its inputs by a factor of 2, thus the original input is ultimately subsampled by a factor of 2^8. When a residual block subsamples the input, the corresponding shortcut connections also subsample their input using a Max Pooling operation with the same subsample factor.
But I can only make it work if I use the same number of filters in every Conv1D layer, with k=1, strides=1 and padding=same, without applying any MaxPooling1D. Any change in these parameters causes a tensor size mismatch and a failure to compile with the following error:
ValueError: Operands could not be broadcast together with shapes (70, 64) (70, 128)
Does anyone have any idea on how to fix this size mismatch and make it work?
In addition, if the input has more than one channel (or feature) the mismatch is even worse! Is there a way to deal with more than one channel?
The tensor shape mismatch should be happening in the add([shortcut, y]) layer. The MaxPooling1D layer halves your time-steps by default (pool_size=2), which you can change with the pool_size parameter. Your residual branch, on the other hand, does not reduce the time-steps by the same amount. You should apply strides=2 with padding='same' in one of the Conv1D layers (preferably the last one) before adding shortcut and y.
For reference, you can check out the ResNet code in the keras-applications repository on GitHub.
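A quick way to check whether the two branches will line up is to compute both shapes before building the model. The sketch below does this in plain Python under some assumptions: Conv1D with padding='same', MaxPooling1D with its default padding='valid' and stride equal to pool_size, and a hypothetical input of 140 time-steps chosen so that the numbers match the error in the question. The helper names and the project_shortcut flag are made up for illustration.

```python
import math


def conv1d_same(steps, strides):
    """Time-steps after a Conv1D with padding='same'."""
    return math.ceil(steps / strides)


def max_pool1d(steps, pool_size):
    """Time-steps after MaxPooling1D with padding='valid', stride = pool_size."""
    return (steps - pool_size) // pool_size + 1


def branch_shapes(steps, in_filters, out_filters, subsample, project_shortcut=False):
    """Return the (steps, channels) shapes of the shortcut and residual branches."""
    shortcut_steps = max_pool1d(steps, subsample) if subsample > 1 else steps
    # A pooling-only shortcut keeps the incoming channel count; projecting it
    # (for example with a 1x1 convolution) would change it to out_filters.
    shortcut = (shortcut_steps, out_filters if project_shortcut else in_filters)
    # Residual branch: first conv with stride 1, second conv carries the stride.
    residual = (conv1d_same(conv1d_same(steps, 1), subsample), out_filters)
    return shortcut, residual


print(branch_shapes(140, 64, 64, subsample=2))    # ((70, 64), (70, 64))
print(branch_shapes(140, 64, 128, subsample=2))   # ((70, 64), (70, 128))
print(branch_shapes(140, 64, 128, subsample=2, project_shortcut=True))
```

The second call reproduces the (70, 64) versus (70, 128) pair from the error, where the time-steps agree and only the channel axis differs. Matching the strides fixes the time axis; the channel axis needs the shortcut to end up with the same number of filters, and a 1x1 Conv1D on the shortcut is one common ResNet-style option. Whether the paper handles it that way is not stated here.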

keras / tensorflow requires unnecessary values fed to placeholders

I'm using Keras with TF backend. Recently, when using the functional API to make "hybrid" models, it seemed to me that Keras requires me to feed values that it shouldn't need.
As a background, I am trying to implement a conditional GAN in Keras. My implementation has a generator and a discriminator. As an example, the generator accepts (20, 20, 1) inputs and returns (20, 20, 1) outputs. These are stacked by channel to produce a (20, 20, 2) input to the discriminator. The discriminator is supposed to decide whether it is seeing a ground-truth translation of the original (20, 20, 1) image or a translation by the generator. This is represented by 0=fake, 1=real.
By itself, the discriminator is just a CNN for binary classification. Therefore, it can be trained by feeding data points with inputs of shape (20, 20, 2) and outputs in {0,1}. Therefore, if I write something like:
# <disc> is the discriminator
arbitrary_input = np.full(shape=(5, 20, 20, 2), fill_value=0.5)
arbitrary_labels = np.array([1, 1, 0, 0, 1])
disc.fit(arbitrary_input, arbitrary_labels, epochs=5)
training will proceed without errors (obviously this is a useless dataset, though).
However, when I insert the discriminator into the generator-discriminator stack:
# <disc> is the discriminator, <gen> is the generator
input = Input(shape=(20, 20, 1), name='stack_input')
gen_output = gen(input)
pair = Concatenate(axis=FEATURES_AXIS)([input, gen_output])
disc_output = disc(gen_output)
stack = Model(input, disc_output)
stack.compile(optimizer='adam', loss='binary_crossentropy')
arbitrary_input = np.full(shape=(5, 20, 20, 2), fill_value=0.5)
arbitrary_labels = np.array([1, 1, 0, 0, 1])
disc.fit(arbitrary_input, arbitrary_labels, epochs=5)
suddenly I need to feed an extra placeholder. I get this error message on disc.fit():
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'stack_input' with dtype float
[[Node: stack_input = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
As you can see by the name, this is the input to the hybrid/stacked model. I haven't changed the discriminator at all, I have only included it in another model. Therefore disc.fit() should still work, right?
There's a workaround available by freezing the weights of the generator and using fit() on the full stack, I think, but I do not understand why the method above doesn't work.
Is it perhaps some issue with scoping?
Edit: The discriminator is really just a simple CNN. It is initialized with disc = pix2pix_discriminator(input_shape=(20, 20, 2), n_filters=(32, 64)). The function in question is:
def pix2pix_discriminator(input_shape, n_filters, kernel_size=4, strides=2, padding='same', alpha=0.2):
    x = Input(shape=input_shape, name='disc_input')
    # first layer
    h = Conv2D(filters=n_filters[0],
               kernel_size=kernel_size,
               strides=strides,
               padding=padding,
               data_format=DATA_FORMAT)(x)
    # no BatchNorm
    h = LeakyReLU(alpha=alpha)(h)
    for i in range(1, len(n_filters)):
        h = Conv2D(filters=n_filters[i],
                   kernel_size=kernel_size,
                   strides=strides,
                   padding=padding,
                   data_format=DATA_FORMAT)(h)
        h = BatchNorm(axis=FEATURES_AXIS)(h)
        h = LeakyReLU(alpha=alpha)(h)
    h_flatten = Flatten()(h) # required for the upcoming Dense layer
    y_pred = Dense(units=1, activation='sigmoid')(h_flatten) # binary output
    discriminator = Model(inputs=x, outputs=y_pred)
    discriminator.compile(optimizer='adam',
                          loss='binary_crossentropy',
                          metrics=['accuracy'])
    return discriminator
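For reference, the shapes inside this discriminator can be worked out by hand. The sketch below is plain Python and assumes DATA_FORMAT is 'channels_last' (it is not shown in the question) together with the padding='same', strides=2 defaults from the function signature; disc_shapes is a made-up helper name.

```python
import math


def disc_shapes(input_shape, n_filters, strides=2):
    """Feature-map shapes after each strided 'same' convolution, plus the flattened size."""
    height, width, _ = input_shape
    shapes = []
    for filters in n_filters:
        height = math.ceil(height / strides)
        width = math.ceil(width / strides)
        shapes.append((height, width, filters))
    return shapes, height * width * n_filters[-1]


shapes, flat = disc_shapes((20, 20, 2), n_filters=(32, 64))
print(shapes)  # [(10, 10, 32), (5, 5, 64)]
print(flat)    # 1600
```

Note that the discriminator is built for (20, 20, 2) inputs, while the stack in the question calls disc(gen_output) on the (20, 20, 1) generator output and leaves pair unused. Whether that is related to the placeholder error is not established here.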
