keras / tensorflow requires unnecessary values fed to placeholders - python

I'm using Keras with TF backend. Recently, when using the functional API to make "hybrid" models, it seemed to me that Keras requires me to feed values that it shouldn't need.
As a background, I am trying to implement a conditional GAN in Keras. My implementation has a generator and a discriminator. As an example, the generator accepts (20, 20, 1) inputs and returns (20, 20, 1) outputs. These are stacked by channel to produce a (20, 20, 2) input to the discriminator. The discriminator is supposed to decide whether it is seeing a ground-truth translation of the original (20, 20, 1) image or a translation by the generator. This is represented by 0=fake, 1=real.
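To make the shapes above concrete, here is a minimal NumPy sketch of the channel stacking. It is only an illustration, not the Keras model: the batch size of 5 and the random arrays are made up, and the data is assumed to be channels-last.

```python
import numpy as np

# Hypothetical batch of 5 source images and 5 generator outputs (channels last).
source = np.random.rand(5, 20, 20, 1)
translated = np.random.rand(5, 20, 20, 1)

# Stack along the channel (last) axis to build the discriminator input.
pair = np.concatenate([source, translated], axis=-1)
print(pair.shape)  # (5, 20, 20, 2)

# One label per pair: 0 = fake (generator output), 1 = real (ground truth).
labels = np.zeros(len(pair), dtype=np.int64)
```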
By itself, the discriminator is just a CNN for binary classification, so it can be trained by feeding data points with inputs of shape (20, 20, 2) and outputs in {0,1}. If I write something like:
# <disc> is the discriminator
arbitrary_input = np.full(shape=(5, 20, 20, 2), fill_value=0.5)
arbitrary_labels = np.array([1, 1, 0, 0, 1])
disc.fit(arbitrary_input, arbitrary_labels, epochs=5)
training will proceed without errors (obviously this is a useless dataset, though).
However, when I insert the discriminator into the generator-discriminator stack:
# <disc> is the discriminator, <gen> is the generator
input = Input(shape=(20, 20, 1), name='stack_input')
gen_output = gen(input)
pair = Concatenate(axis=FEATURES_AXIS)([input, gen_output])
disc_output = disc(pair)
stack = Model(input, disc_output)
stack.compile(optimizer='adam', loss='binary_crossentropy')
arbitrary_input = np.full(shape=(5, 20, 20, 2), fill_value=0.5)
arbitrary_labels = np.array([1, 1, 0, 0, 1])
disc.fit(arbitrary_input, arbitrary_labels, epochs=5)
suddenly I need to feed an extra placeholder. I get this error message on disc.fit():
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'stack_input' with dtype float
[[Node: stack_input = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
As you can see from the name, this is the input to the hybrid/stacked model. I haven't changed the discriminator at all; I have only included it in another model. Therefore it should still work, right?
There's a workaround available by freezing the weights of the generator and using fit() on the full stack, I think, but I do not understand why the method above doesn't work.
Is it perhaps some issue with scoping?
Edit: The discriminator is really just a simple CNN. It is initialized with disc = pix2pix_discriminator(input_shape=(20, 20, 2), n_filters=(32, 64)). The function in question is:
def pix2pix_discriminator(input_shape, n_filters, kernel_size=4, strides=2, padding='same', alpha=0.2):
    x = Input(shape=input_shape, name='disc_input')
    # first layer (arguments after `filters` were cut off in the post; assumed from the signature)
    h = Conv2D(filters=n_filters[0], kernel_size=kernel_size,
               strides=strides, padding=padding)(x)
    # no BatchNorm
    h = LeakyReLU(alpha=alpha)(h)
    for i in range(1, len(n_filters)):
        h = Conv2D(filters=n_filters[i], kernel_size=kernel_size,
                   strides=strides, padding=padding)(h)
        h = BatchNorm(axis=FEATURES_AXIS)(h)
        h = LeakyReLU(alpha=alpha)(h)
    h_flatten = Flatten()(h)  # required for the upcoming Dense layer
    y_pred = Dense(units=1, activation='sigmoid')(h_flatten)  # binary output
    discriminator = Model(inputs=x, outputs=y_pred)
    return discriminator


tensorflow.python.framework.errors_impl.InvalidArgumentError: cannot convert a Tensor of dtype resource to a numpy array

I am trying to design a GAN using tensorflow.keras models and layers classes. I made a discriminator that takes in a list of 2 pictures and outputs a Dense sigmoid activated percentage of similarity:
prediction = Dense(1, activation = "sigmoid")(Flatten()(conv4))
model = Model(inputs = [firstImage, secondImage], outputs = prediction)
Then a generator that takes in a random one-dimensional vector and returns a picture from it:
generated = Conv2D(3, kernel_size = (4, 4), padding = "same",
kernel_initializer = kernelInit, activation = "sigmoid")(conv5) # output shape (256, 256, 3)
model = Model(inputs = noise, outputs = generated)
I made a custom generator using a keras.ImageDataGenerator.flow_from_directory() to load in pictures:
def loadRealImages(batch):
    for gen in pixGen.flow_from_directory(picturesPath, target_size = (256, 256),
                                          batch_size = batch, class_mode = "binary"):
        yield gen
I didn't have any trouble compiling either of these two, but when I try to link them together into an adversarial model with this code:
inNoise = Input(shape = (generatorInNoise,))
fake = generator(inNoise) # get one fake
real = np.array(next(loadRealImages(1))[0], dtype = np.float32) # get one real image
discriminator.trainable = False # lock discriminator weights
prediction = discriminator([real, fake]) # check similarity
adversarial = Model(inputs = inNoise, outputs = [fake, prediction]) # set adversarial model
I get this error on the last line:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot convert a Tensor of dtype resource to a NumPy array.
I ascertained the shape of inNoise, fake and prediction:
<class 'tensorflow.python.framework.ops.Tensor'> (None, 16) Tensor("input_4:0", shape=(None, 16), dtype=float32)
<class 'tensorflow.python.framework.ops.Tensor'> (None, 256, 256, 3) Tensor("model_1/Identity:0", shape=(None, 256, 256, 3), dtype=float32)
<class 'tensorflow.python.framework.ops.Tensor'> (1, 1) Tensor("dense_2/Identity:0", shape=(1, 1), dtype=float32)
But I still can't figure out what is raising the error and looking it up on google didn't really give me any pointers either. Can anyone help with this?
At the core, the issue here is that you're trying to make a NumPy array a part of the computation graph. This can lead to undefined behaviour depending on how you use it. Some minor changes to your code can help:
inNoise = Input(shape = (generatorInNoise,))
fake = generator(inNoise) # get one fake
real = Input(shape = real_image_shape) # the real image is now a model input, fed at run time
discriminator.trainable = False # lock discriminator weights
prediction = discriminator([real, fake]) # check similarity
adversarial = Model(inputs = [inNoise, real], outputs = [fake, prediction]) # set adversarial model
As you can see, the real image needs to be provided as an input to the model, not derived as a part of it.

Keras: Trying to model.predict() gives "ValueError: Tensor's shape is not compatible with supplied shape"

I'm following the TensorFlow Keras tutorial for text generation. The training part works perfectly, but when I try to predict the next token, I get an error.
Here's all the important code:
Making the vocabulary and dataset.
vocab = sorted(set(text))
char2index = { c:i for i, c in enumerate(vocab) }
index2char = np.array(vocab)
chars_to_int = np.array([char2index[c] for c in text])
char_dataset = tf.data.Dataset.from_tensor_slices(chars_to_int)
sequences = char_dataset.batch(seq_length + 1, drop_remainder=True)
def split_input_and_target(sequence):
    input_ = sequence[:-1]
    target_ = sequence[1:]
    return input_, target_
dataset = sequences.map(split_input_and_target)
dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
Building the model
(important part here is that BATCH_SIZE = 64):
model = tf.keras.Sequential()
model.add(tf.keras.layers.Embedding(len(vocab), EMBEDDING_DIM,
batch_input_shape=[BATCH_SIZE, None]))
# here are a few more layers
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.fit(dataset, epochs=EPOCHS)
Actually trying to generate text (this one was copied almost directly from the tutorial after I started getting desperate):
num_tokens = 100
seed = "some text"
input_eval = [char2index[c] for c in seed]
input_eval = tf.expand_dims(input_eval, 0)
text_generated = []
for i in range(num_tokens):
    predictions = model(input_eval)
    predictions = tf.squeeze(predictions, 0)
    # more stuff
Then, I first get a warning:
WARNING:tensorflow:Model was constructed with shape (64, None) for input Tensor("embedding_14_input:0", shape=(64, None), dtype=float32), but it was called on an input with incompatible shape (1, 9).
Then it gives me an error:
---->3 predictions = model(input_eval)
ValueError: Tensor's shape (9, 64, 256) is not compatible with supplied shape [9, 1, 256]
The second number, 64, is my batch size. If I change BATCH_SIZE to 1, everything works and all is fine, but this is obviously not the solution I am hoping for.
(I somehow managed to miss a step in the tutorial despite reading it several times over the past few hours.)
Here's the relevant passage:
To keep this prediction step simple, use a batch size of 1.
Because of the way the RNN state is passed from timestep to timestep, the model only accepts a fixed batch size once built.
To run the model with a different batch_size, we need to rebuild the model and restore the weights from the checkpoint.
model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)
model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
model.build(tf.TensorShape([1, None]))
I hope my silly mistake will help somebody to remember to reload the model in the future!
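The reason rebuilding works can be sketched with a toy NumPy stand-in for a stateful RNN. This is not the Keras implementation, and the class name and sizes are invented for illustration. The carried-over state has a fixed batch dimension, but no weight shape involves the batch size, so the weights can be copied into a model built with batch size 1.

```python
import numpy as np

class ToyStatefulRNN:
    """NumPy stand-in for a stateful RNN layer (hypothetical, not Keras code)."""

    def __init__(self, input_dim, units, batch_size, seed=0):
        rng = np.random.default_rng(seed)
        # Weight shapes depend only on input_dim and units, never on batch_size.
        self.weights = {
            "W_x": rng.normal(size=(input_dim, units)),
            "W_h": rng.normal(size=(units, units)),
        }
        # The carried-over state is allocated once with a fixed batch dimension.
        self.state = np.zeros((batch_size, units))

    def __call__(self, x):
        if x.shape[0] != self.state.shape[0]:
            raise ValueError(
                f"batch of {x.shape[0]} does not match state built for {self.state.shape[0]}"
            )
        self.state = np.tanh(x @ self.weights["W_x"] + self.state @ self.weights["W_h"])
        return self.state


trained = ToyStatefulRNN(input_dim=8, units=4, batch_size=64)

# "Rebuild" with batch_size=1 and copy the weights across.
predictor = ToyStatefulRNN(input_dim=8, units=4, batch_size=1)
predictor.weights = {k: v.copy() for k, v in trained.weights.items()}

out = predictor(np.ones((1, 8)))
print(out.shape)  # (1, 4)
```

Calling trained(np.ones((1, 8))) instead raises a ValueError, which mirrors the shape mismatch in the question.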

Input/output shapes of GANs for sequential data

I am trying to do time series prediction using GANs, with MXNet/Gluon. I have sequential data of shape (N, 1), which I have transformed into shape (N-stepsize, stepsize). I am having a hard time understanding the input and output shapes of the network. Here is the code for the Generator and Discriminator networks:
netG = nn.Sequential()
with netG.name_scope():
    netG.add(nn.BatchNorm(momentum = 0.8))
    netG.add(nn.BatchNorm(momentum = 0.8))
    netG.add(nn.BatchNorm(momentum = 0.8))
    netG.add(nn.Dense(step_size, activation = "tanh"))
#300, 50, 2
#input shape is inferred
netD = nn.Sequential()
with netD.name_scope():
    netG.add(nn.BatchNorm(momentum = 0.8))
    netD.add(nn.Dense(15, activation='tanh'))
    netG.add(nn.BatchNorm(momentum = 0.8))
    netD.add(nn.Dense(20, activation='tanh'))
Thanks in advance.
You can check the tensor shapes with the following code:
print(mx.viz.print_summary(netG(mx.sym.var('data')), shape={'data':(1,100,10)}))
I am assuming here that N-stepsize is equal to 100 and stepsize is equal to 10.
You also have two errors in the discriminator: both BatchNorm layers are added to netG instead of netD.
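For the data side, here is a minimal NumPy sketch of how a (N, 1) series could become a (N-stepsize, stepsize) array. The question does not say exactly how the transformation was done, so the stride-1 sliding window, the helper name make_windows, and N = 110 are all assumptions chosen to match the (100, 10) shape used above.

```python
import numpy as np

def make_windows(series, step_size):
    """Turn a (N, 1) series into (N - step_size, step_size) overlapping windows."""
    flat = np.asarray(series).reshape(-1)
    n_windows = len(flat) - step_size
    return np.stack([flat[i:i + step_size] for i in range(n_windows)])

series = np.arange(110, dtype=np.float32).reshape(-1, 1)  # assumed N = 110
windows = make_windows(series, step_size=10)
print(windows.shape)  # (100, 10)
```

With this layout, each row is one sample of length stepsize. That matches a generator whose final Dense layer has step_size units.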

How to use TensorFlow Dataset API in combination with dense layers

I am trying out the Dataset API for my input pipeline shown in the TensorFlow documentation and use almost the same code:
tr_data = Dataset.from_tensor_slices((train_images, train_labels))
tr_data = tr_data.map(input_parser, NUM_CORES, output_buffer_size=2000)
tr_data = tr_data.batch(BATCH_SIZE)
tr_data = tr_data.repeat(EPOCHS)
iterator = tr_data.make_one_shot_iterator()
next_example, next_label = iterator.get_next()
# Script throws error here
loss = model_function(next_example, next_label)
with tf.Session(...) as sess:
    while True:
        try:
            train_loss = sess.run(loss)
        except tf.errors.OutOfRangeError:
            print("End of training dataset.")
            break
This should be faster since it avoids the slow feed_dicts. But I can't make it work with my model, which is a simplified LeNet architecture. The problem is the tf.layers.dense in my model_function(), which expects a known input shape (I guess because it has to know the number of weights beforehand). But next_example and next_label only get their shape when they are run in the session. Before they are evaluated, is their shape simply undefined?
Declaring the model_function() throws this error:
ValueError: The last dimension of the inputs to Dense should be
defined. Found None.
Right now, I don't know if I am using this Dataset API in the intended way or if there is a workaround.
Thanks in advance!
Edit 1:
Below is my model and it throws the error at the first dense layer
def conv_relu(input, kernel_shape):
    # Create variable named "weights".
    # (the initializer arguments were cut off in the post; the defaults are used here)
    weights = tf.get_variable("weights", kernel_shape)
    # Create variable named "biases".
    biases = tf.get_variable("biases", kernel_shape[3])
    conv = tf.nn.conv2d(input, weights,
                        strides=[1, 1, 1, 1], padding='VALID')
    return tf.nn.relu(conv + biases)
def fully(input, output_dim):
    assert len(input.get_shape())==2, 'Wrong input shape, need flattened tensor as input'
    input_dim = input.get_shape()[1]
    weight = tf.get_variable("weight", [input_dim, output_dim])
    bias = tf.get_variable('bias', [output_dim])
    fully = tf.nn.bias_add(tf.matmul(input, weight), bias)
    return fully
def simple_model(x):
    with tf.variable_scope('conv1'):
        conv1 = conv_relu(x, [3,3,1,10])
        conv1 = tf.nn.max_pool(conv1,[1,2,2,1],[1,2,2,1],'SAME')
    with tf.variable_scope('conv2'):
        conv2 = conv_relu(conv1, [3,3,10,10])
        conv2 = tf.nn.max_pool(conv2,[1,2,2,1],[1,2,2,1],'SAME')
    with tf.variable_scope('conv3'):
        conv3 = conv_relu(conv2, [3,3,10,10])
        conv3 = tf.nn.max_pool(conv3,[1,2,2,1],[1,2,2,1],'SAME')
    flat = tf.contrib.layers.flatten(conv3)
    with tf.variable_scope('fully1'):
        fully1 = tf.layers.dense(flat, 1000)
        fully1 = tf.nn.relu(fully1)
    with tf.variable_scope('fully2'):
        fully2 = tf.layers.dense(fully1, 100)
        fully2 = tf.nn.relu(fully2)
    with tf.variable_scope('output'):
        output = tf.layers.dense(fully2, 4)
        fully1 = tf.nn.relu(output)
    return output
Edit 2:
Here you see the print of the tensors. Notice that next_example does not have a shape
next_example: Tensor("IteratorGetNext:0", dtype=float32)
next_label: Tensor("IteratorGetNext:1", shape=(?, 4), dtype=float32)
I found the answer myself.
Following this thread the easy fix is to just set the shape with tf.Tensor.set_shape if you know your image sizes beforehand.
def input_parser(img_path, label):
    # read the img from file
    img_file = tf.read_file(img_path)
    img_decoded = tf.image.decode_image(img_file, channels=1)
    img_decoded = tf.image.convert_image_dtype(img_decoded, dtype=tf.float32)
    img_decoded.set_shape([90,160,1]) # This line was missing
    return img_decoded, label
It would have been nice if the tensorflow documentation included this line.
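To see why the static shape matters, here is a small pure-Python sketch of the shape arithmetic for the model in the question. It is a simplification: it treats only height and width as unknown, whereas the tensor in the question had no static shape at all. The helper names are invented for illustration. The layer settings follow the posted code: three blocks of a 3x3 'VALID' convolution with stride 1, then a 2x2 'SAME' max pool, with 10 channels.

```python
import math

def conv_valid(size, kernel):
    """Output length of a stride-1 'VALID' convolution; None stays None."""
    return None if size is None else size - kernel + 1

def pool_same(size, stride):
    """Output length of a 'SAME' pooling op; None stays None."""
    return None if size is None else math.ceil(size / stride)

def flattened_dim(height, width, channels=10, n_blocks=3):
    """Static size of the flattened features after n_blocks of conv(3x3) + pool(2x2)."""
    for _ in range(n_blocks):
        height, width = conv_valid(height, 3), conv_valid(width, 3)
        height, width = pool_same(height, 2), pool_same(width, 2)
    if height is None or width is None:
        return None
    return height * width * channels

print(flattened_dim(None, None))  # None: a dense layer cannot size its weights
print(flattened_dim(90, 160))     # 1900: the first dense kernel would be (1900, 1000)
```

Once set_shape supplies 90 and 160, every downstream dimension becomes known at graph-construction time, and the dense layer can allocate its kernel.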

Tensorflow: Layer size dependent on batch size?

I am currently trying to get familiar with the Tensorflow library and I have a rather fundamental question that bugs me.
While building a convolutional neural network for MNIST classification, I tried to use my own model_fn, in which the following line usually appears to reshape the input features:
x = tf.reshape(x, shape=[-1, 28, 28, 1]), with the -1 referring to the input batch size.
Since I use this node as input to my convolutional layer,
x = tf.reshape(x, shape=[-1, 28, 28, 1])
conv1 = tf.layers.conv2d(x, 32, 5, activation=tf.nn.relu)
does this mean that the sizes of all my network's layers depend on the batch size?
I tried freezing and running the graph on a single test input, which will only work if I provide n=batch_size test images.
Can you give me a hint on how to make my network run on any input batchsize while predicting?
Also I guess using the tf.reshape node (see first node in cnn_layout) in the network definition is not the best input for serving.
I will append my network layout and the model_fn:
def cnn_layout(features,reuse,is_training):
    with tf.variable_scope('cnn',reuse=reuse):
        # resize input to [batchsize,height,width,channel]
        x = tf.reshape(features['x'], shape=[-1,30,30,1], name='input_placeholder')
        # conv1, 32 filter, 5 kernel
        conv1 = tf.layers.conv2d(x, 32, 5, activation=tf.nn.relu, name='conv1')
        # pool1, 2 stride, 2 kernel
        pool1 = tf.layers.max_pooling2d(conv1, 2, 2, name='pool1')
        # conv2, 64 filter, 3 kernel
        conv2 = tf.layers.conv2d(pool1, 64, 3, activation=tf.nn.relu, name='conv2')
        # pool2, 2 stride, 2 kernel
        pool2 = tf.layers.max_pooling2d(conv2, 2, 2, name='pool2')
        # flatten pool2
        flatten = tf.contrib.layers.flatten(pool2)
        # fc1 with 1024 neurons
        fc1 = tf.layers.dense(flatten, 1024, name='fc1')
        # 75% dropout
        drop = tf.layers.dropout(fc1, rate=0.75, training=is_training, name='dropout')
        # output logits
        output = tf.layers.dense(drop, 1, name='output_logits')
        return output
def model_fn(features, labels, mode):
    # set up two networks, one for training and one for prediction, sharing weights
    logits_train = cnn_layout(features=features,reuse=False,is_training=True)
    logits_test = cnn_layout(features=features,reuse=True,is_training=False)
    # predictions
    predictions = tf.round(tf.sigmoid(logits_test),name='predictions')
    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(mode, predictions=predictions)
    # define loss and optimizer
    loss = tf.reduce_mean(tf.nn.sigmoid_cross_entropy_with_logits(logits=logits_train,labels=labels),name='loss')
    optimizer = tf.train.AdamOptimizer(learning_rate=LEARNING_RATE, name='optimizer')
    train = optimizer.minimize(loss, global_step=tf.train.get_global_step(),name='train')
    # accuracy for evaluation
    accuracy = tf.metrics.accuracy(labels=labels,predictions=predictions,name='accuracy')
    # summaries for tensorboard
    # return training and evaluation spec
    return tf.estimator.EstimatorSpec(
In the typical scenario, the rank of features['x'] is already going to be 4, with the outer dimension being the actual batch size, so there's no need to resize it.
Let me try to explain.
You haven't shown your serving_input_receiver_fn yet and there are several ways to do that, although in the end the principle is similar across them all. If you're using TensorFlow Serving, then you probably use build_parsing_serving_input_receiver_fn. It's informative to look at the source code:
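def build_parsing_serving_input_receiver_fn(feature_spec,
                                            default_batch_size=None):
    def serving_input_receiver_fn():
        serialized_tf_example = array_ops.placeholder(dtype=dtypes.string,
                                                      shape=[default_batch_size],
                                                      name='input_example_tensor')
        receiver_tensors = {'examples': serialized_tf_example}
        features = parsing_ops.parse_example(serialized_tf_example, feature_spec)
        return ServingInputReceiver(features, receiver_tensors)
    return serving_input_receiver_fn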
So in your client, you're going to prepare a request that has one or more Examples in it (let's say the length is N). The server treats the serialized examples as a list of strings which get "fed" into the input_example_tensor placeholder. The shape (which is None) dynamically gets filled in to be the size of the list (N).
Then the parse_example op parses each item in the placeholder and out pops a Tensor for each feature whose outer dimension is N. In your case, you'll have x with shape=[N, 30, 30, 1].
(Note that other serving systems, such as CloudML Engine, do not operate on Example objects, but the principles are the same).
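As a rough illustration of the point that the outer dimension is simply however many examples arrive, here is a NumPy sketch (not TensorFlow code). The helper name to_image_batch and the (N, 900) flat layout are assumptions for the example. The same reshape handles any N, and the kernel shape of the first convolution contains no batch dimension at all.

```python
import numpy as np

def to_image_batch(flat_features):
    """Reshape (N, 900) rows into (N, 30, 30, 1); -1 lets NumPy infer N."""
    return np.reshape(flat_features, (-1, 30, 30, 1))

for n in (1, 5, 64):
    batch = to_image_batch(np.zeros((n, 900), dtype=np.float32))
    print(n, batch.shape)

# Kernel shape for 32 filters of size 5x5 over 1 input channel: no N anywhere.
conv1_kernel_shape = (5, 5, 1, 32)
```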
I just want to briefly share the solution I found. I did not want to build a scalable, production-grade model, just a simple model runner in Python to execute my CNN locally.
To export the model I used,
input_size = 900
def serving_input_receiver_fn():
    inputs = {"x": tf.placeholder(shape=[None, input_size], dtype=tf.float32)}
    return tf.estimator.export.ServingInputReceiver(inputs, inputs)
To load and run it (without needing the model definition again) I used the tensorflow predictor class.
import numpy as np
from tensorflow.contrib import predictor
class TFRunner:
    """ runs a frozen cnn graph """
    def __init__(self,model_dir):
        self.predictor = predictor.from_saved_model(model_dir)
    def run(self, input_list):
        """ runs the input list through the graph, returns output """
        if len(input_list) > 1:
            inputs = np.vstack(input_list)
            predictions = self.predictor({"x": inputs})
        elif len(input_list) == 1:
            predictions = self.predictor({"x": input_list[0]})
        else:
            predictions = []
        return predictions
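The batching step in run() can be tried without a saved model. The sketch below swaps the real predictor for a hypothetical stand-in, fake_predictor, whose "predictions" output key and thresholding rule are invented for illustration. It shows np.vstack turning a list of (1, 900) inputs into one (N, 900) batch.

```python
import numpy as np

def fake_predictor(feed):
    """Hypothetical stand-in for the loaded predictor: one 0/1 output per input row."""
    x = feed["x"]
    return {"predictions": (x.mean(axis=1, keepdims=True) > 0.5).astype(np.float32)}

def run(predict_fn, input_list):
    """Stack a list of (1, 900) inputs into one (N, 900) batch and predict once."""
    if not input_list:
        return []
    return predict_fn({"x": np.vstack(input_list)})

inputs = [np.zeros((1, 900), dtype=np.float32), np.ones((1, 900), dtype=np.float32)]
result = run(fake_predictor, inputs)
print(result["predictions"].shape)  # (2, 1)
```

Since np.vstack also accepts a one-element list, a single input of shape (1, 900) goes through the same path in this sketch and needs no separate branch.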

