I am going to perform pixel-based classification on an image. Here is the code I used for training the NN:
net = input_data(shape=[None, 1,4])
net = tflearn.lstm(net, 128, return_seq=True)
net = tflearn.lstm(net, 128)
net = tflearn.fully_connected(net, 1, activation='softmax')
net = tflearn.regression(net, optimizer='adam',
loss='categorical_crossentropy')
model = tflearn.DNN(net, tensorboard_verbose=2, checkpoint_path='model.tfl.ckpt')
X_train = np.expand_dims(X_train, axis=1)
model.fit(X_train, y_train, n_epoch=1, validation_set=0.1, show_metric=True,snapshot_step=100)
The problem is that after training the model, the result of np.array(model.predict(x_test)) is always 1, although I expected it to be either 2 or 3. In one example I had 4 classes of objects and expected the result of that command to be a label between 2 and 5 (note: y_train has int values between 2 and 5), but again the output of the prediction function is 1. Could this be a problem with the training phase?
The None parameter is used to denote different training examples. In your case, each image has a total of 28*28*4 values, due to the custom four-channel dataset you are using.
To make this LSTM work, try the following:
X = np.reshape(X, (-1, 28, 28, 4))
testX = np.reshape(testX, (-1, 28, 28, 4))
net = tflearn.input_data(shape=[None, 28, 28, 4])
Of course (this is very important), make sure that reshape() puts the four different channels corresponding to a single pixel in the last dimension of the numpy array, and that the 28, 28 correspond to the pixels of a single image.
In case your images don't have dimension 28*28, adjust those parameters accordingly.
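As a quick sanity check after reshaping, you can print one pixel and compare it against the raw channel values you know from your data (just a sketch, assuming X is your training array):

import numpy as np

X = np.reshape(X, (-1, 28, 28, 4))
print(X.shape)        # expected: (num_images, 28, 28, 4)
print(X[0, 0, 0, :])  # the 4 channel values of the top-left pixel of image 0;
                      # compare them against the raw values you loaded for that pixel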
I have just preprocessed my category data into one-hot encoding and used tf.argmax(). This way, I got it into a range of numbers from 1 to 34; thus, there are sequences of, say, [7, 4, 28, 14, 5, 15, 22, 22].
My question is about how to do the further dataset preparation. I intend to use 3 numbers in a sequence to predict the next 2. So, do I need to map the dataset into features being the first 3 and labels being the last 2? Should I batch it with buffer_size 5 to specify the sequence length? Lastly, do I keep the one-hot representation over its transformation into a simpler number?
No one answered in time, so I found the solution myself some days ago.
Just a note that I'm now using 8 numbers to predict the next 2.
First, to predict the 2 next steps, I decided that I can predict the label for the 8 initial steps and then make another prediction with the last 7 time steps from the initial steps plus the predicted value as the 8th input. Second, my teacher told me it was better for the RNN to use one-hot, so I mapped the dataset into features of 8 one-hots and a label of 1 one-hot. Third, what impressed me is that I was in fact able to use batching as a way of grouping the split data sequence, thereby indicating that the last one-hot is my label.
So here is the code:
INPUT_WIDTH = 8
LABEL_WIDTH = 1
shift = 1
INPUT_SLICE = slice(0, INPUT_WIDTH)
total_window_size = INPUT_WIDTH + shift
label_start = total_window_size - LABEL_WIDTH
LABELS_SLICE = slice(label_start, None)
BATCH_SIZE = INPUT_WIDTH + LABEL_WIDTH
Above are some constants that I took from [1]. The only one that I couldn't fully understand is the shift variable, but set it to 1 and you're fine.
Below, the split function:
def split_window(features):
    inputs = features[INPUT_SLICE]
    labels = features[LABELS_SLICE]
    return inputs, labels
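For illustration only, here is roughly what the slices select on a dummy batched window of shape (9, 53) (this snippet is just a sketch, not part of my pipeline):

import tensorflow as tf

dummy_window = tf.random.uniform((BATCH_SIZE, 53))  # BATCH_SIZE = 9, as defined above
inputs, labels = split_window(dummy_window)
print(inputs.shape)  # the first INPUT_WIDTH rows -> (8, 53)
print(labels.shape)  # the row(s) from label_start on, used as the label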
Simple and cute, isn't it? But to be clear, it was a pain to adapt this function from [1] and make it compatible with my input shape.
Now the dataset:
def create_seq_dataset(data):
    ds = tf.data.Dataset.from_tensor_slices(data)
    # At this point, my input was a single array of 3644 string categories to be turned
    # into one-hot. The function below just one-hot encodes the data and casts it to float32,
    # as required by the rnn, so I won't cover it in detail.
    ds = ds.map(get_seq_label)
    # As I'm using the from_tensor_slices() function, the 3644 number disappears from the shape.
    # Now, from shape (1), I've got shape (53) after one-hotting it, 53 being the number of
    # possible categories that I'm working with.
    # Here I batch the one-hot data
    ds = ds.batch(BATCH_SIZE, drop_remainder=True)
    # Shape (53) => (9, 53)
    # Without using drop_remainder, the rnn will complain that it can't predict "empty" data.
    ds = ds.map(split_window)
    # Got features shape (8, 53) and labels shape (53,)
    ds = ds.batch(16, drop_remainder=True)
    # Got features shape (16, 8, 53) and labels shape (16, 53)
    return ds
train_ds = create_seq_dataset(train_df)
for features_batch, labels_batch in train_ds:
    print(features_batch.shape)
    print(labels_batch.shape)
    break
What I got from the print was:
(16, 8, 53) and
(16, 53)
Lastly, the LSTM:
from tensorflow.keras import layers
from tensorflow.keras.models import Model

inputs = layers.Input(name="sequence", shape=(INPUT_WIDTH, 53), dtype='float32')
# Reminder that the batch size is implicit when the input is a dataset, unless the LSTM
# is stateful.

def create_rnn_model():
    # From [1]: Shape [batch, time, features] => [batch, time, lstm_units]
    # In my case [16, 8, 53] => [16, 8, 100], but as the 16 is the implicit batch size, it would
    # appear as None if you use rnn_model.summary() after declaring the model
    x = layers.LSTM(100, activation='tanh', stateful=False, return_sequences=False, name='LSTM_1')(inputs)
    # Got shape after this layer as [16, 100], or [None, 100].
    x = layers.Dense(32, activation='relu')(x)
    output = layers.Dense(NUM_CLASSES, activation='softmax')(x)
    model = Model(inputs=inputs, outputs=output)
    optimizer = tf.keras.optimizers.Adam(learning_rate=0.001)
    model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
    return model
rnn_model = create_rnn_model()
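With the dataset and model above, training is then just a plain fit() call (a sketch; the epoch count here is arbitrary):

history = rnn_model.fit(train_ds, epochs=20)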
[1]. https://www.tensorflow.org/tutorials/structured_data/time_series#4_create_tfdatadatasets
I am trying to code a ResNet CNN architecture based on the paper, using Python 3, TensorFlow 2 and the CIFAR-10 dataset. You can access the Jupyter notebook here.
While training the model using model.fit(), after just one epoch of training, I get the following error:
ValueError: Input 0 is incompatible with layer model: expected
shape=(None, 32, 32, 3), found shape=(32, 32, 3)
The training images are batched using batch_size = 128, hence the training loop yields the following 4-D tensor, which TF Conv2D expects: (128, 32, 32, 3).
What's the source of this error?
OK, I found a small issue in your code. The problem occurs in the test dataset: you forgot to transform it properly. So currently you have this:
images, labels = next(iter(test_dataset))
images.shape, labels.shape
(TensorShape([32, 32, 3]), TensorShape([10]))
You need to do the same transformation on the test set as you did on the train set, but of course keep in mind: no shuffling, no augmentation.
def testaugmentation(x, y):
    x = tf.image.resize_with_crop_or_pad(x, HEIGHT + 8, WIDTH + 8)
    x = tf.image.random_crop(x, [HEIGHT, WIDTH, NUM_CHANNELS])
    return x, y

def normalize(x, y):
    x = tf.image.per_image_standardization(x)
    return x, y

test_dataset = (test_dataset
                .map(testaugmentation)
                .map(normalize)
                .batch(batch_size=batch_size, drop_remainder=True))
images, labels = next(iter(test_dataset))
images.shape, labels.shape
(TensorShape([128, 32, 32, 3]), TensorShape([128, 10]))
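With the test set batched like this, passing it as validation data no longer trips the shape check (a sketch; I'm assuming your training pipeline is called train_dataset and that the test set is used for validation):

model.fit(train_dataset, validation_data=test_dataset, epochs=1)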
I'm following the TensorFlow Keras tutorial for text generation. The training part works perfectly, but when I try to predict the next token, I get an error.
Here's all the important code:
Making the vocabulary and dataset.
vocab = sorted(set(text))
char2index = { c:i for i, c in enumerate(vocab) }
index2char = np.array(vocab)
chars_to_int = np.array([char2index[c] for c in text])
char_dataset = tf.data.Dataset.from_tensor_slices(chars_to_int)
sequences = char_dataset.batch(seq_length + 1, drop_remainder=True)
def split_input_and_target(sequence):
    input_ = sequence[:-1]
    target_ = sequence[1:]
    return input_, target_
dataset = sequences.map(split_input_and_target)
dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
Building the model
(important part here is that BATCH_SIZE = 64):
model = tf.keras.Sequential()
model.add(tf.keras.layers.Embedding(len(vocab), EMBEDDING_DIM,
batch_input_shape=[BATCH_SIZE, None]))
# here are a few more layers
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.fit(dataset, epochs=EPOCHS)
Actually trying to generate text (this one was copied almost directly from the tutorial after I started getting desperate):
num_tokens = 100
seed = "some text"
input_eval = [char2index[c] for c in seed]
input_eval = tf.expand_dims(input_eval, 0)
text_generated = []
model.reset_states()
for i in range(num_tokens):
    predictions = model(input_eval)
    predictions = tf.squeeze(predictions, 0)
    # more stuff
Then, I first get a warning:
WARNING:tensorflow:Model was constructed with shape (64, None) for input Tensor("embedding_14_input:0", shape=(64, None), dtype=float32), but it was called on an input with incompatible shape (1, 9).
Then it gives me an error:
---->3 predictions = model(input_eval)
...
ValueError: Tensor's shape (9, 64, 256) is not compatible with supplied shape [9, 1, 256]
The second number, 64, is my batch size. If I change BATCH_SIZE to 1, everything works and all is fine, but this is obviously not the solution I am hoping for.
(I somehow managed to miss a step in the tutorial despite reading it several times over the past few hours.)
Here's the relevant passage:
To keep this prediction step simple, use a batch size of 1.
Because of the way the RNN state is passed from timestep to timestep, the model only accepts a fixed batch size once built.
To run the model with a different batch_size, we need to rebuild the model and restore the weights from the checkpoint.
tf.train.latest_checkpoint(checkpoint_dir)
model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)
model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
model.build(tf.TensorShape([1, None]))
I hope my silly mistake will help somebody to remember to reload the model in the future!
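For completeness, with the rebuilt batch-size-1 model the sampling loop from the tutorial then works as expected (a sketch adapted from the tutorial; it assumes the final Dense layer outputs logits, as in the tutorial):

input_eval = tf.expand_dims([char2index[c] for c in seed], 0)
text_generated = []
model.reset_states()
for i in range(num_tokens):
    predictions = model(input_eval)                  # shape (1, len(input), vocab_size)
    predictions = tf.squeeze(predictions, 0)         # shape (len(input), vocab_size)
    predicted_id = tf.random.categorical(predictions, num_samples=1)[-1, 0].numpy()
    input_eval = tf.expand_dims([predicted_id], 0)   # feed the prediction back in as the next input
    text_generated.append(index2char[predicted_id])
print(seed + "".join(text_generated))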
I am currently trying to get familiar with the TensorFlow library, and I have a rather fundamental question that bugs me.
While building a convolutional neural network for MNIST classification, I tried to use my own model_fn, in which the following line usually occurs to reshape the input features.
x = tf.reshape(x, shape=[-1, 28, 28, 1]), with the -1 referring to the input batch size.
Since I use this node as input to my convolutional layer,
x = tf.reshape(x, shape=[-1, 28, 28, 1])
conv1 = tf.layers.conv2d(x, 32, 5, activation=tf.nn.relu)
does this mean that the sizes of all my network's layers are dependent on the batch size?
I tried freezing and running the graph on a single test input, which will only work if I provide n=batch_size test images.
Can you give me a hint on how to make my network run on any input batch size while predicting?
Also, I guess using the tf.reshape node (see the first node in cnn_layout) in the network definition is not the best input for serving.
I will append my network layout and the model_fn:
def cnn_layout(features, reuse, is_training):
    with tf.variable_scope('cnn', reuse=reuse):
        # resize input to [batchsize, height, width, channel]
        x = tf.reshape(features['x'], shape=[-1, 30, 30, 1], name='input_placeholder')
        # conv1, 32 filters, 5 kernel
        conv1 = tf.layers.conv2d(x, 32, 5, activation=tf.nn.relu, name='conv1')
        # pool1, 2 stride, 2 kernel
        pool1 = tf.layers.max_pooling2d(conv1, 2, 2, name='pool1')
        # conv2, 64 filters, 3 kernel
        conv2 = tf.layers.conv2d(pool1, 64, 3, activation=tf.nn.relu, name='conv2')
        # pool2, 2 stride, 2 kernel
        pool2 = tf.layers.max_pooling2d(conv2, 2, 2, name='pool2')
        # flatten pool2
        flatten = tf.contrib.layers.flatten(pool2)
        # fc1 with 1024 neurons
        fc1 = tf.layers.dense(flatten, 1024, name='fc1')
        # 75% dropout
        drop = tf.layers.dropout(fc1, rate=0.75, training=is_training, name='dropout')
        # output logits
        output = tf.layers.dense(drop, 1, name='output_logits')
        return output
def model_fn(features, labels, mode):
    # set up two networks, one for training and one for prediction, while sharing weights
    logits_train = cnn_layout(features=features, reuse=False, is_training=True)
    logits_test = cnn_layout(features=features, reuse=True, is_training=False)
    # predictions
    predictions = tf.round(tf.sigmoid(logits_test), name='predictions')
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions=predictions)
    # define loss and optimizer
    loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=logits_train, labels=labels), name='loss')
    optimizer = tf.train.AdamOptimizer(learning_rate=LEARNING_RATE, name='optimizer')
    train = optimizer.minimize(loss, global_step=tf.train.get_global_step(), name='train')
    # accuracy for evaluation
    accuracy = tf.metrics.accuracy(labels=labels, predictions=predictions, name='accuracy')
    # summaries for tensorboard
    tf.summary.scalar('loss', loss)
    # return training and evaluation spec
    return tf.estimator.EstimatorSpec(
        mode=mode,
        predictions=predictions,
        loss=loss,
        train_op=train,
        eval_metric_ops={'accuracy': accuracy}
    )
Thanks!
In the typical scenario, the rank of features['x'] is already going to be 4, with the outer dimension being the actual batch size, so there's no need to resize it.
Let me try to explain.
You haven't shown your serving_input_receiver_fn yet and there are several ways to do that, although in the end the principle is similar across them all. If you're using TensorFlow Serving, then you probably use build_parsing_serving_input_receiver_fn. It's informative to look at the source code:
def build_parsing_serving_input_receiver_fn(feature_spec,
                                            default_batch_size=None):
  serialized_tf_example = array_ops.placeholder(
      dtype=dtypes.string,
      shape=[default_batch_size],
      name='input_example_tensor')
  receiver_tensors = {'examples': serialized_tf_example}
  features = parsing_ops.parse_example(serialized_tf_example, feature_spec)
  return ServingInputReceiver(features, receiver_tensors)
So in your client, you're going to prepare a request that has one or more Examples in it (let's say the length is N). The server treats the serialized examples as a list of strings which get "fed" into the input_example_tensor placeholder. The shape (which is None) dynamically gets filled in to be the size of the list (N).
Then the parse_example op parses each item in the placeholder and out pops a Tensor for each feature whose outer dimension is N. In your case, you'll have x with shape=[N, 30, 30, 1].
(Note that other serving systems, such as CloudML Engine, do not operate on Example objects, but the principles are the same).
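For example (a sketch only; the feature name and shape are assumptions based on your 30x30 single-channel input, and estimator/model_dir stand in for your own objects):

feature_spec = {
    'x': tf.FixedLenFeature(shape=[30, 30, 1], dtype=tf.float32),
}
serving_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(feature_spec)
estimator.export_savedmodel(export_dir_base=model_dir,
                            serving_input_receiver_fn=serving_input_fn)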
I just want to briefly provide the solution I found, since I did not want to build a scalable, production-grade model, but a simple model runner in Python to execute my CNN locally.
To export the model I used:
input_size = 900

def serving_input_receiver_fn():
    inputs = {"x": tf.placeholder(shape=[None, input_size], dtype=tf.float32)}
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)

model.export_savedmodel(
    export_dir_base=model_dir,
    serving_input_receiver_fn=serving_input_receiver_fn)
To load and run it (without needing the model definition again), I used the TensorFlow predictor class.
import numpy as np
from tensorflow.contrib import predictor

class TFRunner:
    """ runs a frozen cnn graph """
    def __init__(self, model_dir):
        self.predictor = predictor.from_saved_model(model_dir)

    def run(self, input_list):
        """ runs the input list through the graph, returns output """
        if len(input_list) > 1:
            inputs = np.vstack(input_list)
            predictions = self.predictor({"x": inputs})
        elif len(input_list) == 1:
            predictions = self.predictor({"x": input_list[0]})
        else:
            predictions = []
        return predictions
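Usage then looks like this (the path and the dummy data are just placeholders; export_savedmodel writes the model into a timestamped subdirectory of model_dir):

runner = TFRunner("/path/to/export/dir/1234567890")
dummy_inputs = [np.random.rand(1, 900).astype(np.float32) for _ in range(3)]  # three flattened 30*30 images
print(runner.run(dummy_inputs))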
I'm using Keras with TF backend. Recently, when using the functional API to make "hybrid" models, it seemed to me that Keras requires me to feed values that it shouldn't need.
As a background, I am trying to implement a conditional GAN in Keras. My implementation has a generator and a discriminator. As an example, the generator accepts (20, 20, 1) inputs and returns (20, 20, 1) outputs. These are stacked by channel to produce a (20, 20, 2) input to the discriminator. The discriminator is supposed to decide whether it is seeing a ground-truth translation of the original (20, 20, 1) image or a translation by the generator. This is represented by 0=fake, 1=real.
By itself, the discriminator is just a CNN for binary classification. Therefore, it can be trained by feeding data points with inputs of shape (20, 20, 2) and outputs in {0,1}. Therefore, if I write something like:
# <disc> is the discriminator
arbitrary_input = np.full(shape=(5, 20, 20, 2), fill_value=0.5)
arbitrary_labels = np.array([1, 1, 0, 0, 1])
disc.fit(arbitrary_input, arbitrary_labels, epochs=5)
training will proceed without errors (obviously this is a useless dataset, though).
However, when I insert the discriminator into the generator-discriminator stack:
# <disc> is the discriminator, <gen> is the generator
input = Input(shape=(20, 20, 1), name='stack_input')
gen_output = gen(input)
pair = Concatenate(axis=FEATURES_AXIS)([input, gen_output])
disc_output = disc(pair)
stack = Model(input, disc_output)
stack.compile(optimizer='adam', loss='binary_crossentropy')
arbitrary_input = np.full(shape=(5, 20, 20, 2), fill_value=0.5)
arbitrary_labels = np.array([1, 1, 0, 0, 1])
disc.fit(arbitrary_input, arbitrary_labels, epochs=5)
suddenly I need to feed an extra placeholder. I get this error message on disc.fit():
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'stack_input' with dtype float
[[Node: stack_input = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
As you can see by the name, this is the input to the hybrid/stacked model. I haven't changed the discriminator at all; I have only included it in another model. Therefore disc.fit() should still work, right?
There's a workaround available by freezing the weights of the generator and using fit() on the full stack, I think, but I do not understand why the method above doesn't work.
Is it perhaps some issue with scoping?
Edit: The discriminator is really just a simple CNN. It is initialized with disc = pix2pix_discriminator(input_shape=(20, 20, 2), n_filters=(32, 64)). The function in question is:
def pix2pix_discriminator(input_shape, n_filters, kernel_size=4, strides=2, padding='same', alpha=0.2):
    x = Input(shape=input_shape, name='disc_input')

    # first layer
    h = Conv2D(filters=n_filters[0],
               kernel_size=kernel_size,
               strides=strides,
               padding=padding,
               data_format=DATA_FORMAT)(x)
    # no BatchNorm
    h = LeakyReLU(alpha=alpha)(h)

    for i in range(1, len(n_filters)):
        h = Conv2D(filters=n_filters[i],
                   kernel_size=kernel_size,
                   strides=strides,
                   padding=padding,
                   data_format=DATA_FORMAT)(h)
        h = BatchNorm(axis=FEATURES_AXIS)(h)
        h = LeakyReLU(alpha=alpha)(h)

    h_flatten = Flatten()(h)  # required for the upcoming Dense layer
    y_pred = Dense(units=1, activation='sigmoid')(h_flatten)  # binary output

    discriminator = Model(inputs=x, outputs=y_pred)
    discriminator.compile(optimizer='adam',
                          loss='binary_crossentropy',
                          metrics=['accuracy'])
    return discriminator