Saving model and initialization in Keras - python

I have created a model in Keras, which I then initialised by calling
session=tf.Session()
session.run(tf.global_variables_initializer())
After training, I tried to save the model by running
saver = tf.train.Saver()
saver.save(session, "action_inference_cart_pole_plan16_5000episode.ckpt")
However, it keeps returning this error
FailedPreconditionError: Attempting to use uninitialized value dense_241/kernel
[[Node: dense_241/kernel/_21554 = _Send[T=DT_FLOAT, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_1854_dense_241/kernel", _device="/job:localhost/replica:0/task:0/gpu:0"](dense_241/kernel)]]
[[Node: dense_284/bias/_21741 = _Recv[_start_time=0, client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_1947_dense_284/bias", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"](^_arg_save_15/Const_0_0, ^save_15/SaveV2/tensor_names, ^save_15/SaveV2/shape_and_slices)]]
I have tried to manually initialize the variables that failed, and that worked once before. However, now there are different variables, and I can't even find them. I would like to understand why this is happening.
Here is the full code

Keras has its own built-in methods for saving and loading models. When training Keras models, you should
use them instead of the TF saver: Keras keeps its own computation graph and session, so a separately created tf.Session will probably not see the model's variables as initialized.
Here is an example (copied from the Keras documentation) of how to save and load a Keras model:
from keras.models import load_model
model.save('my_model.h5') # creates a HDF5 file 'my_model.h5'
del model # deletes the existing model
# returns a compiled model
# identical to the previous one
model = load_model('my_model.h5')


Keras: How to save models or weights?

I am sorry if this question seems pretty straightforward, but after reading the Keras save and restore help page:
https://www.tensorflow.org/beta/tutorials/keras/save_and_restore_models
I do not understand how to use ModelCheckpoint for saving during training. The help page says it should produce three files, but I see only one, MODEL.ckpt.
Here is my code:
checkpoint_dir = FolderName + "/tmp/model.ckpt"
cp_callback = k.callbacks.ModelCheckpoint(checkpoint_dir,verbose=1,save_weights_only=True)
parallel_model.compile(optimizer=tf.keras.optimizers.Adam(lr=learning_rate), loss=my_cost_MSE, metrics=['accuracy'])
parallel_model.fit(image, annotation, epochs=epoch,
                   batch_size=batch_size, steps_per_epoch=10,
                   validation_data=(image_val, annotation_val), validation_steps=num_batch_val, callbacks=callbacks_list)
Also, when I want to load the weights after training with:
model = k.models.load_model(file_checkpoint)
I get the error:
"raise ValueError('Unknown ' + printable_module_name + ':' + object_name)
ValueError: Unknown loss function:my_cost_MSE"
my_cost_MSE is the cost function that I use in training.
First of all, it looks like you are using the tf.keras (from tensorflow) implementation rather than keras (from the keras-team/keras repo). In this case, as stated in the tf.keras guide:
When saving a model's weights, tf.keras defaults to the checkpoint
format. Pass save_format='h5' to use HDF5.
On the other hand, note that adding the ModelCheckpoint callback is usually roughly equivalent to calling model.save(...) at the end of each epoch, which is why you would expect three files to be saved (according to the checkpoint format).
It does not do so here because, with the option save_weights_only=True, you are saving just the weights. This is roughly equivalent to replacing the call to model.save with model.save_weights at the end of each epoch. Hence, the only file being saved is the one with the weights.
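To make the "roughly equivalent" description above concrete, here is a minimal pure-Python sketch. ToyModel, ToyCheckpoint and toy_fit are hypothetical stand-ins invented for illustration; they are not the Keras classes and only record which save method would be called at the end of each epoch.

```python
class ToyModel:
    """Hypothetical stand-in that only records which save method was called."""

    def __init__(self):
        self.calls = []

    def save(self, path):
        self.calls.append(("save", path))

    def save_weights(self, path):
        self.calls.append(("save_weights", path))


class ToyCheckpoint:
    """Rough imitation of the behaviour described above, not the Keras class."""

    def __init__(self, filepath, save_weights_only=False):
        self.filepath = filepath
        self.save_weights_only = save_weights_only

    def on_epoch_end(self, model):
        # The flag decides between weights-only and whole-model saving.
        if self.save_weights_only:
            model.save_weights(self.filepath)
        else:
            model.save(self.filepath)


def toy_fit(model, epochs, callback):
    """Pretend training loop: the callback fires once per epoch."""
    for _ in range(epochs):
        # ... one epoch of training would happen here ...
        callback.on_epoch_end(model)


weights_only = ToyModel()
toy_fit(weights_only, 2, ToyCheckpoint("model.ckpt", save_weights_only=True))

full = ToyModel()
toy_fit(full, 2, ToyCheckpoint("model.ckpt"))
```

With save_weights_only=True the toy model only ever sees save_weights calls, which mirrors why only the weights file shows up on disk.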
From here, you can proceed in two different ways:
Storing just the weights
You need your model (the structure, let's say) to be loaded beforehand and then call model.load_weights instead of keras.models.load_model:
model = MyModel(...) # Your model definition as used in training
model.load_weights(file_checkpoint)
Note that in this case, you won't have problems with custom definitions (my_cost_MSE) since you are just loading model weights.
Storing the whole model
Another way to proceed is to store the whole model and load it accordingly:
cp_callback = k.callbacks.ModelCheckpoint(
    checkpoint_dir, verbose=1,
    save_weights_only=False
)
parallel_model.compile(
    optimizer=tf.keras.optimizers.Adam(lr=learning_rate),
    loss=my_cost_MSE,
    metrics=['accuracy']
)
model.fit(..., callbacks=[cp_callback])
Then you could load it by:
model = k.models.load_model(file_checkpoint, custom_objects={"my_cost_MSE": my_cost_MSE})
Note that in this latter case you need to specify custom_objects, since the definition of my_cost_MSE is needed to deserialize the model.
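As a rough illustration of why the custom definition has to be passed in, here is a simplified pure-Python sketch of name-based deserialization. It is not Keras's actual code: BUILTIN_LOSSES, save_config and load_loss are hypothetical helpers, and the sketch assumes that a saved model records a custom loss only by name, so the loader needs a mapping from that name back to the function.

```python
def mean_squared_error(y_true, y_pred):
    """Built-in loss that the toy loader already knows about."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Names the toy loader can resolve without any help.
BUILTIN_LOSSES = {"mean_squared_error": mean_squared_error}


def my_cost_MSE(y_true, y_pred):
    """User-defined loss; the loader has no way to know about it."""
    return 0.5 * mean_squared_error(y_true, y_pred)


def save_config(loss_fn):
    """Saving records only the function's name, never its code."""
    return {"loss": loss_fn.__name__}


def load_loss(config, custom_objects=None):
    """Resolve the stored name, letting custom_objects extend the built-ins."""
    lookup = dict(BUILTIN_LOSSES)
    lookup.update(custom_objects or {})
    name = config["loss"]
    if name not in lookup:
        raise ValueError("Unknown loss function:" + name)
    return lookup[name]


config = save_config(my_cost_MSE)
restored = load_loss(config, custom_objects={"my_cost_MSE": my_cost_MSE})
```

Calling load_loss(config) without custom_objects raises a ValueError in this sketch, which is the same kind of failure as the one reported in the question.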
Keras has a save method. It saves all the details needed to rebuild the model.
(from the keras docs)
from keras.models import load_model
model.save('my_model.h5') # creates a HDF5 file 'my_model.h5'
del model # deletes the existing model
# returns an identical compiled model
model = load_model('my_model.h5')

The name 'Tensor' refers to a Tensor which does not exist. The operation, 'Tensor', does not exist in the graph

I know, this question has already been asked but none of the answers worked for me.
I'm trying to restore a model that I saved before. I used both the classical TF saver and the SavedModel. But the problem when loading remains the same. I get the error message as in the header:
"The name 'train_op:0' refers to a Tensor which does not exist. The operation, 'train_op', does not exist in the graph."
Well I get it that train_op is somehow not in the restored model graph but I don't know why.
I'm creating the train_op variable and assigning a name to it: train_op = optimizer.apply_gradients(grads_and_vars, global_step=global_step, name="train_op"). I'm saving the model using simple_save:
tf.saved_model.simple_save(sess,
                           model_file,
                           inputs={"x": X, "y": y},
                           outputs={"z": loss})
This is how I am loading the model:
tf.saved_model.loader.load(sess, [tf.saved_model.tag_constants.SERVING], model_file)
This is the line of code where the error appears:
_, summary = sess.run(["train_op:0", "summary_op:0"], feed_dict={"X-input:0": X_batch, "y-input:0": y_batch}, run_metadata=run_metadata)
I really don't understand the issue. I get the same problem for the other variables so it's not specific to train_op.
Thanks!

"tensor not found in the checkpoint" when evaluating the re-tuned inception-v3 model using TF-slim

When I evaluated the re-tuned inception-v3 model with my dataset using eval_image_classifier.py in TF-slim, I got an error:
NotFoundError (see above for traceback): Key InceptionV3/AuxLogits/Conv2d_2a_3x3/weights not found in checkpoint
[[Node: save/RestoreV2_7 = RestoreV2[dtypes=[DT_FLOAT], _device="/job:localhost/replica:0/task:0/device:CPU:0"](_arg_save/Const_0_0, save/RestoreV2_7/tensor_names, save/RestoreV2_7/shape_and_slices)]]
Then I printed all the tensor names in the original inception-v3 checkpoint and in the re-tuned checkpoint and compared them with the tensors in inception_v3.arg_scope. I found that some tensors in the checkpoints differ from those in inception_v3.arg_scope:
"InceptionV3/AuxLogits/Conv2d_2a_3x3" in the inception_v3.arg_scope
"InceptionV3/AuxLogits/Conv2d_2a_5x5" in the checkpoints
Has the architecture of inception_v3 changed? How can I solve this problem? Should I retrain the whole inception-v3 model with inception_v3.arg_scope rather than re-tuning?
Adding tf.reset_default_graph() to your Python script should fix issues like this.

How to use numpy functions on a keras tensor in the loss function?

I'm using Keras with the TensorFlow backend to build and run a neural network. I need to use a numpy function on my output tensor in the loss function. More specifically, my loss function involves finding nearest neighbors, and I need to use cKDTree (from SciPy) for this purpose. I have tried converting my output tensor to a numpy array using K.eval(). However, this throws an InvalidArgumentError when I try to compile the model, I believe because you can't run eval() on a symbolic variable.
Here's a toy code snippet that reproduces this error.
import numpy as np
from keras import backend as K
from keras.models import Sequential
from keras.layers.core import Flatten, Dense, Reshape
from keras.optimizers import Adam
def loss(y_true, y_pred):
    y_pred_numpy = K.eval(y_pred)
    # perform some numpy operations on y_pred_numpy
    return K.constant(0)
''' Model '''
input_shape = (10,10,10,3)
train_images = np.zeros((1,10,10,10,3))
train_labels = np.zeros((1,1,1,1,3))
model = Sequential()
model.add(Flatten(input_shape=input_shape))
model.add(Dense(3000, use_bias=True, bias_initializer='zeros'))
model.add(Reshape((10,10,10,3)))
model.summary()
opt = Adam(lr=1E-4)
model.compile(optimizer=opt, loss=loss)
The above gives the following error:
InvalidArgumentError (see above for traceback): You must feed a value for placeholder tensor 'flatten_3_input' with dtype float
[[Node: flatten_3_input = Placeholder[dtype=DT_FLOAT, shape=[], _device="/job:localhost/replica:0/task:0/gpu:0"]()]]
[[Node: reshape_3/Reshape/_11 = _Recv[client_terminated=false, recv_device="/job:localhost/replica:0/task:0/cpu:0", send_device="/job:localhost/replica:0/task:0/gpu:0", send_device_incarnation=1, tensor_name="edge_20_reshape_3/Reshape", tensor_type=DT_FLOAT, _device="/job:localhost/replica:0/task:0/cpu:0"]()]]
How then do I work with Keras tensors without having to rewrite (complex) numpy functionality using Keras?
Using this numpy function directly is impossible, as it is implemented in neither TensorFlow nor Theano. Moreover, there is no direct correspondence between tensors and arrays. Tensors should be understood as algebraic variables, whereas numpy arrays are concrete numbers. A tensor is an abstract object, and applying a numpy function to it is usually impossible.
But you could still try to reimplement this function yourself using keras.backend functions. You would then be using valid tensor operations, and no error should be raised.
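One possible way to approach such a reimplementation is to first prototype the computation with array operations that have direct keras.backend counterparts. The sketch below is an assumption-laden illustration, not the asker's actual loss: nearest_neighbor_loss is a hypothetical brute-force replacement for a tree-based lookup, it ignores the batch dimension that a real Keras loss receives, and it uses only expand_dims, square, sum, min and mean, for which K.expand_dims, K.square, K.sum, K.min and K.mean exist in keras.backend.

```python
import numpy as np


def nearest_neighbor_loss(y_true, y_pred):
    """Mean squared distance from each predicted point to its nearest true point.

    y_true has shape (n_true, dim) and y_pred has shape (n_pred, dim).
    """
    # Pairwise differences via broadcasting: shape (n_pred, n_true, dim).
    diff = np.expand_dims(y_pred, 1) - np.expand_dims(y_true, 0)
    # Squared Euclidean distances: shape (n_pred, n_true).
    sq_dist = np.sum(np.square(diff), axis=-1)
    # Squared distance to the nearest true point for every predicted point.
    nearest = np.min(sq_dist, axis=1)
    return np.mean(nearest)


true_points = np.array([[0.0, 0.0], [1.0, 0.0]])
pred_points = np.array([[0.0, 1.0], [1.0, 0.0], [3.0, 0.0]])
loss_value = nearest_neighbor_loss(true_points, pred_points)
```

Unlike a k-d tree, this brute-force form builds the full pairwise distance matrix, so memory grows with n_pred * n_true; that trade-off may or may not be acceptable for a given problem size.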

In TensorFlow, when I use DNNLinearCombinedClassifier [wide_n_deep_tutorial.py], how do I restore a previously trained model and continue training from it?

I am learning the Google wide and deep model in TensorFlow.
My code is a modified version of the TF example.
My question is:
How do I load a previously trained model and continue training it on new samples? I have searched the web a lot, but most methods look like the following code.
Example of a saver:
v1 = tf.Variable(..., name="v1")
v2 = tf.Variable(..., name="v2")
...
init_op = tf.initialize_all_variables()
saver = tf.train.Saver()
with tf.Session() as sess:
    sess.run(init_op)
    ..
    save_path = saver.save(sess, "/tmp/model.ckpt")
But the wide_n_deep training code defines a method build_estimator, which returns:
m = tf.contrib.learn.DNNLinearCombinedClassifier.
When I use the method above to define a saver, it gives the error "No variables to save", and the returned m has no save or restore method.
I have also tried model.load in tflearn, but that doesn't work either.
The second question is related to the first: how do I save the model, or how do I use the model that the fit method saved in /tmp/ (checkpoint)?
My question can be summarized as:
How do I save a model and restore a previously trained model when I use DNNLinearCombinedClassifier in TensorFlow?
There is an argument called model_dir in DNNLinearCombinedClassifier: https://github.com/tensorflow/tensorflow/blob/master/tensorflow/contrib/learn/python/learn/estimators/dnn_linear_combined.py#L476. Just point it to the directory where you store your model checkpoint.
