How to get predictions on X_test given the DNN? - python

I finished building the DNN model for the Titanic Dataset. Given that, how do I make predictions on the X_test? My code can be accessed through my github:
https://github.com/isaac-altair/Titanic-Dataset
Thanks

When you trained your model, you asked TensorFlow to evaluate your train_op. Your train_op is the op returned by your optimizer's minimize call, e.g.:
train_op = tf.train.AdamOptimizer(...).minimize(cost)
You ran something like this to train the model:
sess.run([train_op], feed_dict={x:data, y:labels})
The train_op depends on things like the gradients and the operations that update the weights, so all of these things happened when you ran the train_op.
At inference time you simply ask TensorFlow to perform different calculations. You can still have the optimizer defined, but if you don't ask it to run the optimizer, it won't perform any of the operations the optimizer depends on. You probably have an output of the network called logits (you could call it anything, but logits is the most common name and the one used in most tutorials). You might also have defined an op called accuracy which computes the accuracy of the batch. You can get the values of those with a similar request to TensorFlow:
sess.run([logits, accuracy], feed_dict={x:data, y:labels})
Almost any tutorial will demonstrate this. My favorite tutorials are here: https://github.com/aymericdamien/TensorFlow-Examples
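To answer the original question directly, here is a minimal sketch of getting class predictions for X_test, assuming the input placeholder is named x and the network output is named logits (adjust the names to whatever the notebook actually uses):

# Hypothetical names: `x` is the input placeholder, `logits` the network output.
# tf.argmax picks the highest-scoring class for each row of X_test.
predicted_classes = sess.run(tf.argmax(logits, axis=1),
                             feed_dict={x: X_test})

No labels are fed because nothing in this run depends on y, and no weights change because train_op is never requested.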

Related

Keras loss and metrics values do not match with same function in each

I am using keras with a custom loss function like below:
def custom_fn(y_true, y_pred):
    # changing y_true, y_pred values systematically
    return mean_absolute_percentage_error(y_true, y_pred)
Then I am calling model.compile(loss=custom_fn) and model.fit(X, y,..validation_data=(X_val, y_val)..)
Keras is then saving loss and val_loss in model history. As a sanity check, when the model finishes training, I am using model.predict(X_val) so I can calculate validation loss manually with my custom_fn using the trained model.
I am saving the model with the best epoch using this callback:
callbacks.append(ModelCheckpoint(path, save_best_only=True, monitor='val_loss', mode='min'))
So after calculating this, the validation loss should match Keras' val_loss value for the best epoch. But this is not happening.
As another attempt to figure this issue out, I am also doing this:
model.compile(loss=custom_fn, metrics=[custom_fn])
And to my surprise, val_loss and val_custom_fn do not match (nor do loss and the custom_fn metric, for that matter).
This is really strange: my custom_fn is essentially Keras' built-in mape with y_true and y_pred slightly manipulated. What is going on here?
PS: the layers I am using are LSTM layers and a final Dense layer, but I think this information is not relevant to the problem. I am also using regularisation as a hyperparameter, but not dropout.
Update
Even removing custom_fn and using keras' built in mape as a loss function and metric like so:
model.compile(loss='mape', metrics=['mape'])
and, for simplicity, removing the ModelCheckpoint callback, has the same effect; val_loss and val_mape are not equivalent in each epoch. This is extremely strange to me. I am either missing something or there is a bug in the Keras code; the former is probably more realistic.
This blog post suggests that Keras adds any regularisation penalty used during training to the validation loss, while no regularisation is applied when calculating the metric. This is why the mismatch occurs with any loss function of choice, as stated in the question.
This is something I could not find documented anywhere for Keras. However, it seems to hold up: when I remove all regularisation hyperparameters, val_loss and val_custom_fn match exactly in each epoch.
An easy workaround is either to use custom_fn as a metric and save the best model based on that metric (val_custom_fn) rather than on val_loss (a sketch of this follows below), or to loop through the epochs manually and calculate the correct val_loss yourself after training each epoch. The latter seems to make more sense, since there is no reason to include custom_fn both as a metric and as a loss function.
If anyone can find any evidence of this in the Keras documentation, that would be helpful.
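For the first workaround, a minimal sketch, assuming the custom_fn and callbacks list from the question; Keras names a custom metric after the function, so the validation metric shows up as val_custom_fn (computed without the regularisation penalty):

from keras.callbacks import ModelCheckpoint

# Checkpoint on the metric, which is not polluted by the regularisation term.
callbacks.append(ModelCheckpoint(path,
                                 save_best_only=True,
                                 monitor='val_custom_fn',
                                 mode='min'))
model.compile(loss=custom_fn, metrics=[custom_fn])
model.fit(X, y, validation_data=(X_val, y_val), callbacks=callbacks)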

Different behaviors between train_on_batch and GradientTape

I'm trying to train a simple keras model on some data using
approach 1
model.train_on_batch(x, y)
and approach 2
with tf.GradientTape() as g:
    g.watch(model.variables)
    loss = my_loss(
        y_true=y,
        y_pred=model(x)
    )
gradients = g.gradient(loss, model.variables)
opt.apply_gradients(
    zip(gradients, model.variables)
)
Even if the optimizer (Adam with some fixed learning rate) and the loss are the same, I do not get exactly the same behavior. Is this expected? (i.e. does train_on_batch do some additional work?)
In approach 1, train_on_batch trains on a single batch, once: it runs a single gradient update on that batch of data. The idea of using train_on_batch is usually that you want to do more things between batches.
In approach 2, the gradient update depends on how often optimizer.apply_gradients(zip(grads, model.trainable_weights)) is called in the training loop.
Since the gradient updates happen differently in the two cases, there can be differences in model behavior.
Hope this answers your question. Happy Learning.
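As a side note, a common source of residual differences in practice is the training flag: train_on_batch calls the model with training=True, while model(x) defaults to inference mode, which matters if the model contains dropout or batch normalisation layers. A minimal sketch of a custom step meant to mirror what train_on_batch does for one batch, assuming the model, opt and my_loss from the question (and using trainable_variables so non-trainable weights are left alone):

import tensorflow as tf

@tf.function
def train_step(x, y):
    # One gradient update per call, like train_on_batch.
    with tf.GradientTape() as g:
        # training=True so dropout / batch norm behave as during training.
        loss = my_loss(y_true=y, y_pred=model(x, training=True))
    gradients = g.gradient(loss, model.trainable_variables)
    opt.apply_gradients(zip(gradients, model.trainable_variables))
    return loss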

Keras: how to reset optimizer state?

How to reset optimizer state in keras?
Looking at Optimizer class I can't see such a method:
https://github.com/keras-team/keras/blob/613aeff37a721450d94906df1a3f3cc51e2299d4/keras/optimizers.py#L60
Also, what are self.updates and self.weights actually?
There isn't an "easy" way to reset the "states", but you can always simply recompile your model with a new optimizer (model's weights are preserved):
newOptimizer = Adadelta()
model.compile(optimizer=newOptimizer)
You can also use the method set_weights(weightsListInNumpy) (not recommended) from the base class Optimizer, but this would be rather cumbersome, as you would need to know all the initial values and shapes, which sometimes may not be trivial zeros.
Now, the property self.weights doesn't do much, but the functions that save and load optimizers will save and load this property. It's a list of tensors and should not be changed directly. At most, use K.set_value(...) on each entry of the list (a sketch of this follows below). You can see how the weights are saved in the _serialize_model method.
self.updates is something a little more complex to understand. It stores the variables that will be updated with every batch that is processed by the model in training. But it's a symbolic graph variable.
The self.updates, as you can see in the code, is always appended with a K.update(var, value) or K.update_add(var, value). This is the correct way to tell the graph that these values should be updated every iteration.
Usually, the updated vars are iterations, params (the model's weights), moments, accumulators, etc.
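Building on the K.set_value suggestion above, a rough sketch of zeroing the optimizer state in place, assuming a compiled model whose optimizer has already created its state tensors (i.e. after at least one batch of training):

import numpy as np
from keras import backend as K

# Overwrite every optimizer state tensor (iterations, moments, accumulators, ...)
# with zeros of the same shape. The model's weights are not touched.
for w in model.optimizer.weights:
    K.set_value(w, np.zeros_like(K.get_value(w)))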
I don't think there is a universal method for this, but you should be able to reset the state of your optimizer by initializing the variables holding it. This would need to be done with the TensorFlow API, though. The state variables depend on the specific kind of optimizer. For example, if you have an Adam optimizer (source), you could do the following:
import tensorflow as tf
from keras.optimizers import Adam
from keras import backend as K

optimizer = Adam(...)

# These depend on the optimizer class
optimizer_state = [optimizer.iterations, optimizer.lr, optimizer.beta_1,
                   optimizer.beta_2, optimizer.decay]
optimizer_reset = tf.variables_initializer(optimizer_state)

# Later, when you want to reset the optimizer
K.get_session().run(optimizer_reset)
The optimizer is just adjusting the weights of your model, thus the information is stored in the model, not in the optimizer.
That means you can't reset an optimizer in the way you might think. You would need to reset (or, maybe easier, recreate) your model.
That also means you can optimize your model with an optimizer A, stop after some epochs, and continue optimizing your model with optimizer B without losing the progress optimizer A already made.
I don't know exactly what self.updates and self.weights are there for. But because those are internal variables of the class, someone needs to know/read about the optimizer class itself and understand its code. Here we need to wait for someone who has dived deeper into the source code of Keras.
EDIT
You can just recreate your optimizer for example:
model = Sequential()
...
...
...
model.compile(optimizer=keras.optimizers.Adadelta(lr=5), loss='mean_squared_error')
model.fit(X, y, epochs=10)
model.compile(optimizer=keras.optimizers.Adadelta(lr=0.5), loss='mean_squared_error')
model.fit(X, y, epochs=10)
With the above code you train 10 epochs with learning rate 5, compile your model with a new optimizer, and continue for another 10 epochs with learning rate 0.5. The weights, which you could also call your training progress, are not lost when you compile your model again.

Convolutional Neural Network using TensorFlow

I'm building a CNN model using TensorFlow, without the use of any frontend APIs such as Keras. I'm creating a VGG-16 model with pre-trained weights, and want to fine-tune the last layers to serve my purpose.
Following the tutorial here, http://cv-tricks.com/tensorflow-tutorial/training-convolutional-neural-network-for-image-classification/
I re-created the training script and modified it as per my requirements. However, training does not happen: the training accuracy is stuck at 50.00% and the validation accuracy keeps repeating the same few values in a pattern.
Attached is the screenshot of the same.
I have been stuck on this for days now and can't seem to find the error. Any help is appreciated.
The code is pretty long, so here is the gist file for it
Your cross entropy is wrong: you are comparing your logits with the softmax of your logits.
This:
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=layer_fc2,
                                                        labels=y_pred)
Should be:
cross_entropy = tf.nn.softmax_cross_entropy_with_logits(logits=layer_fc2,
                                                        labels=y_true)
Some things to note: I would not train on a data point and then evaluate on that same data point; your accuracy estimate is probably going to be biased by doing so. Another point to note is that tf.argmax(tf.nn.softmax(logits)) is the same as tf.argmax(logits).
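To make the second point concrete, a small sketch using the names from the answer above (layer_fc2 as the raw logits, y_true as the one-hot labels):

# argmax is invariant under softmax, so both lines give identical predictions.
y_pred_cls = tf.argmax(layer_fc2, axis=1)
# y_pred_cls = tf.argmax(tf.nn.softmax(layer_fc2), axis=1)  # same result, extra work

correct = tf.equal(y_pred_cls, tf.argmax(y_true, axis=1))
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))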

TensorFlow FullyConnected Tutorial: How are the trained weights used for Eval and Test?

I've been looking through the TensorFlow FullyConnected tutorial. This also uses the helper code mnist.py
I understand the code except for one nagging piece. After training the neural net, the weights obtained from training should be used to evaluate the precision of the model on the validation (and test) data. However, I don't see that being done anywhere.
In fact, this is the only thing I see in fully_connected_feed.py:
# Evaluate against the validation set.
print('Validation Data Eval:')
do_eval(sess,
        eval_correct,
        images_placeholder,
        labels_placeholder,
        data_sets.validation)

# Evaluate against the test set.
print('Test Data Eval:')
do_eval(sess,
        eval_correct,
        images_placeholder,
        labels_placeholder,
        data_sets.test)
The do_eval() function is passed a parameter eval_correct which seems to recalculate the logits on this new data. I've been playing around with TF for a while now, but I'm baffled by this code. Any thoughts would be great.
TensorFlow creates a graph that contains the weights and biases. Roughly speaking, while you train this neural net the weights and biases get changed so that it produces the expected outputs. Line 131 in fully_connected_feed.py (with tf.Graph().as_default():) tells TensorFlow to use the default graph, so every line in the training loop, including the calls to the do_eval() function, uses that same graph. Since the weights obtained from training are not reset before evaluation, they are used for it.
eval_correct is the operation run instead of the training operation, so the neural net is only evaluated, not trained. This is important because otherwise the net would also be trained on the validation and test data, which would result in distorted (too good) results.
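A rough sketch of why this works, with names modelled on the tutorial's mnist.py helpers (hidden1_units, hidden2_units, learning_rate and the feed dicts are placeholder names, not the exact tutorial code): the evaluation op is built from the same logits tensor as the training op, so both read the same variables.

# Build the graph once; `logits` is backed by the trainable variables.
logits = mnist.inference(images_placeholder, hidden1_units, hidden2_units)
loss = mnist.loss(logits, labels_placeholder)
train_op = mnist.training(loss, learning_rate)
eval_correct = mnist.evaluation(logits, labels_placeholder)  # reuses `logits`

# train_op updates the variables; eval_correct only reads them, so after
# training it automatically uses the learned weights.
sess.run(train_op, feed_dict=train_feed)
sess.run(eval_correct, feed_dict=eval_feed)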
