Behavior difference when building TF Keras RNN with two different methods - python

I am building an RNN text generator, mostly following the TensorFlow docs here.
My question: I have defined the model in two ways.
Method (1):
model = tf.keras.Sequential([
    tf.keras.layers.Embedding(vocab_size, embed_dim,
                              batch_input_shape=[batch_size, None]),
    tf.keras.layers.GRU(rnn_units, return_sequences=True, stateful=True,
                        recurrent_initializer='glorot_uniform'),
    tf.keras.layers.Dense(vocab_size)
])
Method (2):
model = tf.keras.Sequential()
model.add(tf.keras.layers.Embedding(vocab_size, embed_dim,
                                    batch_input_shape=[BATCH_SIZE, None]))
model.add(tf.keras.layers.GRU(rnn_units, return_sequences=True,
                              stateful=True,
                              recurrent_initializer='glorot_uniform'))
model.add(tf.keras.layers.Dense(vocab_size))
In my mind, these both do the same thing. However, when generating text with:
def generate_text(model, start_string, length=1000):
    # converting start string to numbers (vectorisation)
    input_eval = [char2idx[s] for s in start_string]
    input_eval = tf.expand_dims(input_eval, 0)
    # initialise empty list to store results
    text = []
    model.reset_states()
    for i in range(length):
        predictions = model(input_eval)
        # remove batch dimension
        predictions = tf.squeeze(predictions, 0)
        # use a categorical distribution to predict the character returned by the model
        predicted_id = tf.random.categorical(predictions, num_samples=1)[-1, 0].numpy()
        # pass the predicted character as the next input to the model,
        # along with the previous hidden state
        input_eval = tf.expand_dims([predicted_id], 0)
        # append predicted character
        text.append(idx2char[predicted_id])
    return (start_string + ''.join(text))
Which I pass:
print(generate_text(model, start_string=u'From '))
Method (1) works perfectly, but method (2) throws the following error:
WARNING:tensorflow:Model was constructed with shape Tensor("embedding_1_input:0", shape=(64, None), dtype=float32) for input (64, None), but it was re-called on a Tensor with incompatible shape (1, 5).
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-33-eb814780c9fe> in <module>()
----> 1 print(generate_text(model, start_string=u'From ', length=PRINT))
14 frames
/usr/local/lib/python3.6/dist-packages/tensorflow_core/python/framework/ops.py in set_shape(self, shape)
1086 raise ValueError(
1087 "Tensor's shape %s is not compatible with supplied shape %s" %
-> 1088 (self.shape, shape))
1089
1090 # Methods not supported / implemented for Eager Tensors.
ValueError: Tensor's shape (5, 64, 1024) is not compatible with supplied shape [5, 1, 1024]
If anyone could help me understand what the difference is between these two methods, that would be amazing. Thank you!
Edit:
Including my model saving and loading code. I use this to save the model (with a batch size of 64) and then load it with a batch size of 1 for text generation.
Saving weights:
checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
    filepath='./training_checkpoints/ckpt_{epoch}',
    save_weights_only=True
)
Loading weights into new model (batch size = 1):
model = build_model(len(vocab_char), EMBED_DIM, UNITS, 1)
model.load_weights(tf.train.latest_checkpoint('./training_checkpoints'))
model.build(tf.TensorShape([1, None]))
model.summary()
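(For reference, build_model is assumed to wrap the same stack as Method (1), parameterized by batch size; the original helper isn't shown, so this is a sketch matching the tutorial's pattern:)
def build_model(vocab_size, embed_dim, rnn_units, batch_size):
    # Same layers as Method (1), with batch_size passed in so the model
    # can be rebuilt with batch_size=1 for generation.
    return tf.keras.Sequential([
        tf.keras.layers.Embedding(vocab_size, embed_dim,
                                  batch_input_shape=[batch_size, None]),
        tf.keras.layers.GRU(rnn_units, return_sequences=True, stateful=True,
                            recurrent_initializer='glorot_uniform'),
        tf.keras.layers.Dense(vocab_size)
    ])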

Related

ValueError: Error when checking input: expected dense_input to have 2 dimensions, but got array with shape (1, 1, 15)

I am trying to make a custom Gym environment so that I can use it in a Keras network, but a problem occurs when I try to fit the neural network:
ValueError: Error when checking input: expected dense_6_input to have 2 dimensions, but got array with shape (1, 1, 15)
What I understand from this error is that the states (the inputs the network receives) are structured as a 3-dimensional array, but I don't know why.
Here is my init method in the class that defines the environment:
def __init__(self):
    self.action_space = Discrete(4)
    self.observation_space = Box(low=0, high=100, shape=(15,))
    self.state = np.array([1,2,0,3,2,0,4,0,0,1,3,0,0,0,0], dtype=float)
    #self.state = state
    self.length = 15
    self.index = 0
Afterwards, I initialize two variables that store the shapes of the states and the actions, so we can define the model:
states = env.observation_space.shape
actions = env.action_space.n
def build_model(states, actions):
    model = Sequential()
    model.add(Dense(24, activation='relu', input_shape=states))
    model.add(Dense(24, activation='relu'))
    model.add(Dense(actions, activation='linear'))
    return model
The summary of the model:
Layer (type)      Output Shape    Param #
dense_6 (Dense)   (None, 24)      384
dense_7 (Dense)   (None, 24)      600
dense_8 (Dense)   (None, 4)       100
The last step before the error is when I build the agent. After that, we call the fit method and the problem occurs:
def build_agent(model, actions):
    policy = BoltzmannQPolicy()
    memory = SequentialMemory(limit=50000, window_length=1)
    dqn = DQNAgent(model=model, memory=memory, policy=policy,
                   nb_actions=actions, nb_steps_warmup=10, target_model_update=1e-2)
    return dqn
dqn = build_agent(model, actions)
dqn.compile(Adam(lr=1e-3), metrics=['mae'])
dqn.fit(env, nb_steps=50000, visualize=False, verbose=1)
I tried changing the input_shape of the first layer to (1, 1, 15), but that doesn't seem to work. Maybe the problem is related to the definition of the environment (the observation space), or to how the environment provides the information to the network. I don't know...
I hope you can help me; let me know if you need any more information to handle the error.
Thank you so much!
The input shape should be (1, 15) in your case. This is because the actual input shape gains one extra dimension on top of what you specify in shape: Keras processes inputs in batches, with the batch size as the first axis.
The error message is telling you that inputs of shape (1, 1, 15) are being passed to the model (i.e. a batch of 1 with observations of shape (1, 15), so ndims=3), but your model only accepts ndims=2. Despite you passing only (15,) as the input shape, ndims is 2, because Keras adds an additional dimension for the batch.
So to fix it, set shape=(1, 15), which will then be processed as (1, 1, 15), which is what Keras expects.
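A minimal sketch of a model that accepts those (1, 1, 15) batches, assuming keras-rl's SequentialMemory(window_length=1) as in the question; the Flatten layer (a common companion to this setup, not part of the answer above) collapses the window axis before the Dense layers:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten

def build_model(states, actions):
    # states == (15,); keras-rl prepends a window_length axis, so batches
    # arrive with shape (batch, 1, 15).
    model = Sequential()
    model.add(Flatten(input_shape=(1,) + states))  # (1, 15) -> (15,)
    model.add(Dense(24, activation='relu'))
    model.add(Dense(24, activation='relu'))
    model.add(Dense(actions, activation='linear'))
    return model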

tensorflow.python.framework.errors_impl.InvalidArgumentError: cannot convert a Tensor of dtype resource to a numpy array

I am trying to design a GAN using tensorflow.keras models and layers classes. I made a discriminator that takes in a list of 2 pictures and outputs a Dense sigmoid activated percentage of similarity:
prediction = Dense(1, activation = "sigmoid")(Flatten()(conv4))
model = Model(inputs = [firstImage, secondImage], outputs = prediction)
Then a generator that takes in a random one dimension vector and returns a picture out of it:
generated = Conv2D(3, kernel_size = (4, 4), padding = "same",
                   kernel_initializer = kernelInit, activation = "sigmoid")(conv5)  # output shape (256, 256, 3)
model = Model(inputs = noise, outputs = generated)
I made a custom generator using a keras.ImageDataGenerator.flow_from_directory() to load in pictures:
def loadRealImages(batch):
    for gen in pixGen.flow_from_directory(picturesPath, target_size = (256, 256),
                                          batch_size = batch, class_mode = "binary"):
        yield gen
I didn't have any trouble compiling either of these two, but when I try to link them together into an adversarial model with this code:
inNoise = Input(shape = (generatorInNoise,))
fake = generator(inNoise) # get one fake
real = np.array(next(loadRealImages(1))[0], dtype = np.float32) # get one real image
discriminator.trainable = False # lock discriminator weights
prediction = discriminator([real, fake]) # check similarity
adversarial = Model(inputs = inNoise, outputs = [fake, prediction]) # set adversarial model
I get this error on the last line:
tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot convert a Tensor of dtype resource to a NumPy array.
I ascertained the shape of inNoise, fake and prediction:
<class 'tensorflow.python.framework.ops.Tensor'> (None, 16) Tensor("input_4:0", shape=(None, 16), dtype=float32)
<class 'tensorflow.python.framework.ops.Tensor'> (None, 256, 256, 3) Tensor("model_1/Identity:0", shape=(None, 256, 256, 3), dtype=float32)
<class 'tensorflow.python.framework.ops.Tensor'> (1, 1) Tensor("dense_2/Identity:0", shape=(1, 1), dtype=float32)
But I still can't figure out what is raising the error and looking it up on google didn't really give me any pointers either. Can anyone help with this?
At its core, the issue here is that you're trying to make a NumPy array part of the computation graph. This can lead to undefined behaviour depending on how you use it. Some minor changes to your code can help:
inNoise = Input(shape = (generatorInNoise,))
fake = generator(inNoise) # get one fake
real = Input((real_image_shape)) # get one real image
discriminator.trainable = False # lock discriminator weights
prediction = discriminator([real, fake]) # check similarity
adversarial = Model(inputs = [inNoise, real], outputs = [fake, prediction]) # set adversarial model
As you can see, the real image needs to be provided as an input to the model, not derived as a part of it.
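A hypothetical usage sketch under this fix (loadRealImages and generatorInNoise come from the question; the call just demonstrates that the real image is now fed at call time rather than baked into the graph):
import numpy as np

noise_batch = np.random.normal(size=(1, generatorInNoise))              # one noise vector
real_batch = np.array(next(loadRealImages(1))[0], dtype=np.float32)     # one batch of real images
fake_out, similarity = adversarial.predict([noise_batch, real_batch])  # both inputs supplied per call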

Keras: Trying to model.predict() gives "ValueError: Tensor's shape is not compatible with supplied shape"

I'm following the TensorFlow Keras tutorial for text generation. The training part works perfectly, but when I try to predict the next token, I get an error.
Here's all the important code:
Making the vocabulary and dataset.
vocab = sorted(set(text))
char2index = { c:i for i, c in enumerate(vocab) }
index2char = np.array(vocab)
chars_to_int = np.array([char2index[c] for c in text])
char_dataset = tf.data.Dataset.from_tensor_slices(chars_to_int)
sequences = char_dataset.batch(seq_length + 1, drop_remainder=True)
def split_input_and_target(sequence):
    input_ = sequence[:-1]
    target_ = sequence[1:]
    return input_, target_
dataset = sequences.map(split_input_and_target)
dataset = dataset.shuffle(BUFFER_SIZE).batch(BATCH_SIZE, drop_remainder=True)
Building the model
(important part here is that BATCH_SIZE = 64):
model = tf.keras.Sequential()
model.add(tf.keras.layers.Embedding(len(vocab), EMBEDDING_DIM,
                                    batch_input_shape=[BATCH_SIZE, None]))
# here are a few more layers
model.compile(loss="sparse_categorical_crossentropy", optimizer="adam")
model.fit(dataset, epochs=EPOCHS)
Actually trying to generate text (this one was copied almost directly from the tutorial after I started getting desperate):
num_tokens = 100
seed = "some text"
input_eval = [char2index[c] for c in seed]
input_eval = tf.expand_dims(input_eval, 0)
text_generated = []
model.reset_states()
for i in range(num_tokens):
    predictions = model(input_eval)
    predictions = tf.squeeze(predictions, 0)
    # more stuff
Then, I first get a warning:
WARNING:tensorflow:Model was constructed with shape (64, None) for input Tensor("embedding_14_input:0", shape=(64, None), dtype=float32), but it was called on an input with incompatible shape (1, 9).
Then it gives me an error:
---->3 predictions = model(input_eval)
...
ValueError: Tensor's shape (9, 64, 256) is not compatible with supplied shape [9, 1, 256]
The second number, 64, is my batch size. If I change BATCH_SIZE to 1, everything works and all is fine, but this is obviously not the solution I am hoping for.
(I somehow managed to miss a step in the tutorial despite reading it several times over the past few hours.)
Here's the relevant passage:
To keep this prediction step simple, use a batch size of 1.
Because of the way the RNN state is passed from timestep to timestep, the model only accepts a fixed batch size once built.
To run the model with a different batch_size, we need to rebuild the model and restore the weights from the checkpoint.
tf.train.latest_checkpoint(checkpoint_dir)
model = build_model(vocab_size, embedding_dim, rnn_units, batch_size=1)
model.load_weights(tf.train.latest_checkpoint(checkpoint_dir))
model.build(tf.TensorShape([1, None]))
I hope my silly mistake will help somebody to remember to reload the model in the future!

keras custom loss function call hidden layer dense operations

I am trying to define a custom loss function in Keras that takes an intermediate layer's output, manipulates it (let's say multiplies it by 2), and then feeds it back into the model to produce the final output. So, assuming a model:
input_dim = X_train.shape[1]
encoding_dim = 14
#encoder
input_tensor = Input(shape=(input_dim, ))
encoderOut = Dense(encoding_dim, activation="tanh",
                   activity_regularizer=regularizers.l1(10e-5))(input_tensor)
encoderOut = Dense(int(encoding_dim / 2), activation="relu")(encoderOut)
encoder = Model(input_tensor, encoderOut)
#decoder
decoder_input = Input(shape=(int(encoding_dim / 2),))
decoderOut = Dense(int(encoding_dim / 2), activation='tanh',name='decoder_input')(decoder_input)
decoderOut = Dense(input_dim, activation='relu',name='decoder_output')(decoderOut)
decoder = Model(decoder_input, decoderOut)
#autoencoder
autoInput = Input(shape=(input_dim, ))
encoderOut = encoder(autoInput)
decoderOut = decoder(encoderOut)
autoencoder = Model(inputs=autoInput, outputs=decoderOut)
My loss function is
def L2Loss(y_true, y_pred):
    get_layer_output_enc = K.function([encoder.layers[0].input, K.learning_phase()],
                                      [encoder.layers[2].output])
    out = get_layer_output_enc([y_true])[0] * 10
Unfortunately when I run it I got:
517 None, None,
518 compat.as_text(c_api.TF_Message(self.status.status)),
--> 519 c_api.TF_GetCode(self.status.status))
520 # Delete the underlying status object from memory otherwise it stays alive
521 # as there is a reference to status from this from the traceback due to
InvalidArgumentError: You must feed a value for placeholder tensor 'model_89_target_28' with dtype float and shape [?,?]
[[Node: model_89_target_28 = Placeholder[dtype=DT_FLOAT, shape=[?,?], _device="/job:localhost/replica:0/task:0/device:CPU:0"]()]]
Alternatively, I tried to reproduce the dense layer operation by extracting the weights:
layer_output_enc = encoder.layers[2].output  # get_layer_output_enc([y_true])[0]*10
w_dec0 = decoder.layers[1].get_weights()[0]
b_dec0 = decoder.layers[1].get_weights()[1]
print(type(layer_output_enc), '--', layer_output_enc.shape)
layer_output_enc = K.cast(layer_output_enc, 'float64')  # tf.convert_to_tensor(layer_output_enc)
out_dec0 = K.dot(layer_output_enc, w_dec0) + b_dec0
print(out_dec0.shape)
out2 = K.tanh(out_dec0)
But again I got the error:
AttributeError: 'numpy.ndarray' object has no attribute 'get_shape'
which is weird, because I checked the type of 'layer_output_enc' above.
Any help appreciated.
You can't call your model within the loss function of a Keras model; you can only use the input tensors y_true and y_pred, so the loss function cannot access intermediate layers. I had the same need, and the tricky solution I found was to concatenate the output tensor with the intermediate layer as a new output of the model. It may be much simpler to work directly with TensorFlow, though.
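A minimal sketch of that trick, reusing the tensors from the question (the Concatenate output lets the loss slice the intermediate activations back out; note the targets you pass to fit() must then be padded to the concatenated width):
from keras.layers import Concatenate
from keras.models import Model
import keras.backend as K

# Expose the decoder output and the encoder output as one tensor.
merged = Concatenate(axis=-1)([decoderOut, encoderOut])
autoencoder = Model(inputs=autoInput, outputs=merged)

def L2Loss(y_true, y_pred):
    reconstruction = y_pred[:, :input_dim]  # decoder part
    encoded = y_pred[:, input_dim:]         # intermediate layer, now accessible
    manipulated = encoded * 2               # e.g. the "multiply by 2" from the question
    return (K.mean(K.square(y_true[:, :input_dim] - reconstruction))
            + K.mean(K.square(manipulated)))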

MobileNet ValueError: Error when checking target: expected dense_1 to have 4 dimensions, but got array with shape (24, 2)

I am trying to implement a number of networks using Keras applications. Here I am attaching a piece of code; it works fine for ResNet50 and VGG16, but with MobileNet it generates the error:
ValueError: Error when checking target: expected dense_1 to have 4 dimensions, but got array with shape (24, 2)
I am working with 224x224 images with 3 channels and a batch size of 24, trying to classify them into 2 classes, so the 24 mentioned in the error is the batch size, but I am not sure about the 2; probably it is the number of classes.
Does anyone know why I am receiving this error for keras.applications.mobilenet?
# basic_model = ResNet50()
# basic_model = VGG16()
basic_model = MobileNet()
classes = list(iter(train_generator.class_indices))
basic_model.layers.pop()
for layer in basic_model.layers[:25]:
    layer.trainable = False
last = basic_model.layers[-1].output
temp = Dense(len(classes), activation="softmax")(last)
fineTuned_model = Model(basic_model.input, temp)
fineTuned_model.classes = classes
fineTuned_model.compile(optimizer=Adam(lr=0.0001), loss='categorical_crossentropy', metrics=['accuracy'])
fineTuned_model.fit_generator(
    train_generator,
    steps_per_epoch=3764 // batch_size,
    epochs=100,
    validation_data=validation_generator,
    validation_steps=900 // batch_size)
fineTuned_model.save('mobile_model.h5')
From the source code, we can see that you're popping a Reshape() layer: exactly the one that transforms the convolution's output (4D) into a class tensor (2D).
Source code:
if include_top:
    if K.image_data_format() == 'channels_first':
        shape = (int(1024 * alpha), 1, 1)
    else:
        shape = (1, 1, int(1024 * alpha))

    x = GlobalAveragePooling2D()(x)
    x = Reshape(shape, name='reshape_1')(x)
    x = Dropout(dropout, name='dropout')(x)
    x = Conv2D(classes, (1, 1),
               padding='same', name='conv_preds')(x)
    x = Activation('softmax', name='act_softmax')(x)
    x = Reshape((classes,), name='reshape_2')(x)
But all the Keras convolutional models are meant to be used in a different way. If you want your own number of classes, you should create these models with include_top=False. This way, the final part of the model (the classes part) simply won't exist, and you just add your own layers:
basic_model = MobileNet(include_top=False)
for layer in basic_model.layers:
    layer.trainable = False
furtherOutputs = YourOwnLayers()(basic_model.outputs)
You should probably try to copy that final part shown in the Keras source, changing classes to your own number of classes. Or maybe try popping three layers from the complete model (the Reshape, the Activation and the Conv2D) and replacing them with your own.
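A minimal sketch of the include_top=False route (the pooling-plus-Dense head is one common choice, not the only one; num_classes stands in for len(classes) from the question):
from keras.applications import MobileNet
from keras.layers import GlobalAveragePooling2D, Dense
from keras.models import Model

num_classes = 2  # your own number of classes

basic_model = MobileNet(include_top=False, input_shape=(224, 224, 3))
for layer in basic_model.layers:
    layer.trainable = False

# Replace the missing top with a pooled feature vector and a softmax head.
x = GlobalAveragePooling2D()(basic_model.output)
predictions = Dense(num_classes, activation='softmax')(x)

fineTuned_model = Model(basic_model.input, predictions)
fineTuned_model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])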
