I have have trained binary classifier model. The model class contains self.cost, self.initial_state, self.final_state and self.logits params. It is saved simply with tf.train.Saver:
saver = tf.train.Saver(tf.global_variables(), max_to_keep=1)
saver.save(session, 'model.ckpt')
After the model was trained I load it as:
with tf.variable_scope("Model", reuse=False):
model = MODEL(config, is_training=False)
with tf.Session() as session:
saver = tf.train.Saver(tf.global_variables())
saver.restore(session, 'model.ckpt')
However, my model.run function returns cross-entropy loss which is the last op in the graph. I don't need loss, I need the model predictions for each batch element
logits = tf.sigmoid(tf.nn.xw_plus_b(last_layer, self.output_w, self.output_b))
where last_layer is a 800x1 matrix which I then later reshape into 32x25x1 (batch_size, sequence_length, 1) matrix. It is this matrix that contains the model prediction values in [0-1] range.
So, how can I use this model to make a prediction for single element matrix 1x1x1?
Add the OPs necessary to compute accuracy, something like what I have copied below (simply copied out of the closest model I had at hand).
self.logits_flat = tf.argmax(logits, axis=1, output_type=tf.int32)
labels_flat = tf.argmax(labels, axis=1, output_type=tf.int32)
accuracy = tf.cast(tf.equal(self.logits_flat, labels_flat), tf.float32, name='accuracy')
Now when you run the model (either during test or training time) add accuracy to the sess.run call as:
sess.run([train_op, accuracy], feed_dict=...)
or
sess.run([accuracy, logits], feed_dict=...)
All you're doing when you call sess.run is to tell tensorflow to compute the value of whatever you ask for. You need to pass it in any data it needs to perform those computations. Tensorflow is lazy, it won't perform any computations that aren't explicitly necessary to produce the results you request. E.g. if you run the second version of sess.run listed above the optimizer will not be run and hence your weights will not be updated.
Note that you can add the OPs after the network was trained because none of them actually add any variables so they won't affect the save/restore process any.
Related
I want to train Googles VGGish network (Hershey et al 2017) from scratch to predict classes specific to my own audio files.
For this I am using the vggish_train_demo.py script available on their github repo which uses tensorflow. I've been able to modify the script to extract melspec features from my own audio by changing the _get_examples_batch() function, and, then train the model on the output of this function. This runs to completetion and prints the loss at each epoch.
However, I've been unable to figure out how to get this trained model to generate predictions from new data. Can this be done with changes to the vggish_train_demo.py script?
For anyone who stumbles across this in the future, I wrote this script which does the job. You must save logmel specs for train and test data in the arrays: X_train, y_train, X_test, y_test. The X_train/test are arrays of the (n, 96,64) features and the y_train/test are arrays of shape (n, _NUM_CLASSES) for two classes, where n = the number of 0.96s audio segments and _NUM_CLASSES = the number of classes used.
See the function definition statement for more info and the vggish github in my original post:
### Run the network and save the predictions and accuracy at each epoch
### Train NN, output results
r"""This uses the VGGish model definition within a larger model which adds two
layers on top, and then trains this larger model.
We input log-mel spectrograms (X_train) calculated above with associated labels
(y_train), and feed the batches into the model. Once the model is trained, it
is then executed on the test log-mel spectrograms (X_test), and the accuracy is
ouput, alongside a .csv file with the predictions for each 0.96s chunk and their
true class."""
def main(X):
with tf.Graph().as_default(), tf.Session() as sess:
# Define VGGish.
embeddings = vggish_slim.define_vggish_slim(training=FLAGS.train_vggish)
# Define a shallow classification model and associated training ops on top
# of VGGish.
with tf.variable_scope('mymodel'):
# Add a fully connected layer with 100 units. Add an activation function
# to the embeddings since they are pre-activation.
num_units = 100
fc = slim.fully_connected(tf.nn.relu(embeddings), num_units)
# Add a classifier layer at the end, consisting of parallel logistic
# classifiers, one per class. This allows for multi-class tasks.
logits = slim.fully_connected(
fc, _NUM_CLASSES, activation_fn=None, scope='logits')
tf.sigmoid(logits, name='prediction')
linear_out= slim.fully_connected(
fc, _NUM_CLASSES, activation_fn=None, scope='linear_out')
logits = tf.sigmoid(linear_out, name='logits')
# Add training ops.
with tf.variable_scope('train'):
global_step = tf.train.create_global_step()
# Labels are assumed to be fed as a batch multi-hot vectors, with
# a 1 in the position of each positive class label, and 0 elsewhere.
labels_input = tf.placeholder(
tf.float32, shape=(None, _NUM_CLASSES), name='labels')
# Cross-entropy label loss.
xent = tf.nn.sigmoid_cross_entropy_with_logits(
logits=logits, labels=labels_input, name='xent')
loss = tf.reduce_mean(xent, name='loss_op')
tf.summary.scalar('loss', loss)
# We use the same optimizer and hyperparameters as used to train VGGish.
optimizer = tf.train.AdamOptimizer(
learning_rate=vggish_params.LEARNING_RATE,
epsilon=vggish_params.ADAM_EPSILON)
train_op = optimizer.minimize(loss, global_step=global_step)
# Initialize all variables in the model, and then load the pre-trained
# VGGish checkpoint.
sess.run(tf.global_variables_initializer())
vggish_slim.load_vggish_slim_checkpoint(sess, FLAGS.checkpoint)
# The training loop.
features_input = sess.graph.get_tensor_by_name(
vggish_params.INPUT_TENSOR_NAME)
accuracy_scores = []
for epoch in range(num_epochs):#FLAGS.num_batches):
epoch_loss = 0
i=0
while i < len(X_train):
start = i
end = i+batch_size
batch_x = np.array(X_train[start:end])
batch_y = np.array(y_train[start:end])
_, c = sess.run([train_op, loss], feed_dict={features_input: batch_x, labels_input: batch_y})
epoch_loss += c
i+=batch_size
#print no. of epochs and loss
print('Epoch', epoch+1, 'completed out of', num_epochs,', loss:',epoch_loss) #FLAGS.num_batches,', loss:',epoch_loss)
#If these lines are left here, it will evaluate on the test data every iteration and print accuracy
#note this adds a small computational cost
correct = tf.equal(tf.argmax(logits, 1), tf.argmax(labels_input, 1)) #This line returns the max value of each array, which we want to be the same (think the prediction/logits is value given to each class with the highest value being the best match)
accuracy = tf.reduce_mean(tf.cast(correct, 'float')) #changes correct to type: float
accuracy1 = accuracy.eval({features_input:X_test, labels_input:y_test})
accuracy_scores.append(accuracy1)
print('Accuracy:', accuracy1)#TF is smart so just knows to feed it through the model without us seeming to tell it to.
#Save predictions for test data
predictions_sigm = logits.eval(feed_dict = {features_input:X_test}) #not really _sigm, change back later
#print(predictions_sigm) #shows table of predictions, meaningless if saving at each epoch
test_preds = pd.DataFrame(predictions_sigm, columns = col_names) #converts predictions to df
true_class = np.argmax(y_test, axis = 1) #This saves the true class
test_preds['True class'] = true_class #This adds true class to the df
#Saves csv file of table of predictions for test data. NB. header will not save when using np.text for some reason
np.savetxt("/content/drive/MyDrive/..."+"Epoch_"+str(epoch+1)+"_Accuracy_"+str(accuracy1), test_preds.values, delimiter=",")
if __name__ == '__main__':
tf.app.run()
#'An exception has occurred, use %tb to see the full traceback.' error will occur, fear not, this just means its finished (perhaps as its exited the tensorflow session?)
I am having different results when I run model.evaluate in Tensorflow more than once in the same validation set.
The model includes data augmentation layers, EfficientNetB0 baseline, and a GlobalAveragePooling layer (see below). I am loading the validation dataset using tf.data pipeline from tensor slices from a dataframe, and it is not being shuffled, so that the order is always the same.
def get_custom_model(input_shape, saved_model_path=None, training_base_model=True):
input_layer = Input(shape=input_shape)
data_augmentation = RandomFlip('horizontal')(input_layer, training=False)
data_augmentation = RandomRotation(factor=(-0.2, 0.2))(data_augmentation, training=False)
data_augmentation = RandomZoom(height_factor=(-0.2, 0.2))(data_augmentation, training=False)
data_augmentation = RandomCrop(width = input_shape[0], height = input_shape[1](data_augmentation, training=False)
baseline_model = EfficientNetB0(include_top=False, weights='imagenet')
baseline_model.trainable = training_base_model # Added for bsg hypertuning
baseline_output = baseline_model(data_augmentation, training=training_base_model)
baseline_output = GlobalAveragePooling2D()(baseline_output)
attributes_output = Dense(units=228, activation='sigmoid', name='attributes_output')(baseline_output)
model = Model(inputs=[input_layer], outputs=[attributes_output])
# Load weights
if saved_model_path != None:
model.load_weights(saved_model_path)#.expect_partial()
return model
I am aware if I trained the model again, indeed the results might be different because some layers are initialized with random weights, but I expected the evaluation on the same model to be equal. I am running the method get_custom_model with the same saved_model_path so that every time the model loads the same weights (that were previously saved).
The metrics I am using to compare and that are different are loss, Precision, and Recall, in case they can be relevant. The optimizer is rmsprop and the loss BinaryCrossentropy. Also, I have tried changing training_base_model to False and the metrics are much poorer (almost like random weights).
PS: Also during the training, I was using the same validation set to have the validation metrics and save the best weights from them, but when I load the best weights again the results are not the same. For instance, I can get a Precision of 81.28% during the validation in a training epoch and then 57% when loading those weights and doing model.evaluate().
I have a Keras pre-trained model "model_keras" and I want to use it in a loss function. The input of model "model_keras" is an output of another Tensorflow model "model_tf" (a generative model). I'm trying to update the weights of "model_tf" by minimizing the loss. During the optimization, "model_kears" is only used for inference and will not get updated. My problem is that I'm not able to get the correct inference result from "model_keras", due to this issue, I'm not able to update the "model_tf" correctly. The code is shown below:
loss_func(input, target, model_keras): # the input is an output of another Tensorflow model.
inference_res = model_keras(input)
loss = tf.reduce_mean(inference_res-target)
return loss
train_phase = tf.placeholder(tf.bool)
z = tf.placeholder(tf.float32, [None, 128])
y = tf.placeholder(tf.int32, [None])
t = tf.placeholder(tf.float32, [None, 10])
model_tf = Generator("generator") # Building the Tensorflow model "model_tf"
fake_img = model_tf(z, train_phase, y, NUMS_CLASS) # fake_img is the output of "model_tf" and will be served as the input of "model_keras"
model_keras = MyKerasModel("Vgg19") # Loading the pretrained Keras model
G_loss = loss_func(fake_img, t, model_keras)
G_opt = tf.train.AdamOptimizer(4e-4, beta1=0., beta2=0.9).minimize(G_loss, var_list=model_tf.var_list())
sess = tf.Session()
sess.run(tf.global_variables_initializer())
sess.run(G_opt, feed_dict={z: Z, train_phase: True, y: Y, t: target}) # Z, Y and target are numpy arrays.
I also tried to use model.predict(input) but got the ValueError: "When feeding symbolic tensors to a model, we expect the tensors to have a static batch size". Reason behind is that model.predict() expects the input to be real data tensor instead of a symbolic tensor. However, since I want to update the weights of "model_tf", I need to make the loss function differentiable and compute the gradients. Therefore, I can not just pass a numpy array to "model_keras".
How can I get the correct output(inference_res) of "model_keras" in this case? The Tensorflow and Keras version I'm using is 1.15 and 2.2.5, respectively.
If I understood your question, here is an idea. You can pass your input to model_keras and lets name the output keras_y. Then freeze the model_keras and add the model to the end of model_tf so you have a big model which is sequence of model_tf and then model_keras (which the second part has been freezed). Next give the inputs to your model and name the output as model_y. Now you can compute the loss as loss_func(keras_y, model_y)
I am training a deep neural network multi-class classifier using TensorFlow. The network outputs the linear values from the final layer, which the tf.nn.softmax_cross_entropy_with_logits cost function takes as input. However, I don't really care about that linear output per se - I want to know what it looks like when the softmax function is applied to it.
Below the relevant parts of my code:
def train_network(x, num_hidden_layers):
prediction = neural_network_model(x, num_hidden_layers)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=prediction, labels=y))
optimizer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(cost)
with tf.Session() as sess:
sess.run(tf.global_variables_initializer())
# train the network
...
# get the network output; x_test is my test data (len=663)
output = sess.run(prediction,feed_dict={x: x_test})
# get softmax values of output
for i in range(len(x_test)):
softm = sess.run(tf.nn.softmax(output[i]))
pred_class = sess.run(tf.argmax(softm))
print(pred_class)
...
Now, that final for-loop in which I calculate the softmax values is extremely slow. Why is that, and how do I do this properly?
I would like to train the weights of a model based on the sum of the loss value of several batches. However it seems that once you run the graph for each of the individual batches, the object that is returned is just a regular numpy array. So when you try and use an optimizer like GradientDescentOptimizer, it no longer has information about the variables that were used to calculate the sum of the losses, so it can't find the gradients of the weights that what help minimize the loss. Here's an example tensorflow script to illustrate what I'm talking about:
weights = tf.Variable(tf.ones([num_feature_values], tf.float32))
feature_values = tf.placeholder(tf.int32, shape=[num_feature_values])
labels = tf.placeholder(tf.int32, shape=[1])
loss_op = some_loss_function(weights, feature_values, labels)
with tf.Session() as sess:
for batch in batches:
feed_dict = fill_feature_values_and_labels(batch)
#Calculates loss for one batch
loss = sess.run(loss_op, feed_dict=feed_dict)
#Adds it to total loss
total_loss += loss
# Want to train weights to minimize total_loss, however this
# doesn't work because the graph has already been run.
optimizer = tf.train.GradientDescentOptimizer(1.0).minimize(total_loss)
with tf.Session() as sess:
for step in xrange(num_steps):
sess.run(optimizer)
The total_loss is a numpy array and thus cannot be used in the optimizer. Does anyone know a way around the problem, where I want to use information across many batches but still need the graph intact in order to preserve the fact that the total_loss is a function of the weights?
The thing you optimize in any of the trainers must be a part of the graph, here what you train on is the actual realized result, so it won't work.
I think the way you should probably do this is to construct your input as a batch of batches e.g.
intput = tf.placeholder("float", (number_of_batches, batch_size, input_size)
Then have your target also be a 3d tensor which can be trained on.