So I have a CNN implemented, and I have made custom callbacks that are confirmed to work, but I have an issue.
This is a sample output.
Example of iteration 5 (batch-size of 10,000 for simplicity)
50000/60000 [========================>.....] - ETA: 10s ('new lr:', 0.01)
('accuracy:', 0.70)
I have 2 callbacks (tested to work as shown in the output):
(1) Changes the learning rate at each iteration. (2) Prints the accuracy at each iteration.
I have an external script that determines the learning rate by taking in the accuracy.
Question:
How do I make the accuracy at each iteration available so that an external script can access it? In essence, I need an accessible variable at each iteration. I'm able to access it only once the process is over, via AccuracyCallback.accuracy.
Problem
I can pass in a changing learning rate, but how do I get the accuracy back out in the form of an accessible variable at each iteration?
Example
My external script determines the learning rate at iteration 1: 0.01. How do I get the accuracy as an accessible variable in my external script at iteration 1 instead of a print statement?
You can create your own callback
import keras

class AccCallback(keras.callbacks.Callback):
    def on_batch_end(self, batch, logs={}):
        accuracy = logs.get('acc')
        # pass accuracy to your 'external' script and set the new lr here
In order for logs.get('acc') to work, you have to tell Keras to monitor it:
model.compile(optimizer='...', loss='...', metrics=['accuracy'])
Lastly, note that the type of accuracy is ndarray here. Should it cause you any issue, I suggest wrapping it: float(accuracy).
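To make the latest accuracy readable outside the callback while training is still running, one option is to store it on the callback instance and update the optimizer from there. A minimal sketch follows, assuming your external logic runs in the same process; compute_lr is a hypothetical stand-in for whatever your external script computes:
import keras
import keras.backend as K

def compute_lr(accuracy):
    # hypothetical stand-in for the external script's logic
    return 0.01 if accuracy < 0.8 else 0.001

class AccCallback(keras.callbacks.Callback):
    def __init__(self):
        super(AccCallback, self).__init__()
        self.accuracy = None                         # readable from outside at any time

    def on_batch_end(self, batch, logs={}):
        acc = logs.get('acc')
        if acc is None:
            return
        self.accuracy = float(acc)                   # latest batch accuracy
        new_lr = compute_lr(self.accuracy)           # hand it to the external logic
        K.set_value(self.model.optimizer.lr, new_lr)

acc_cb = AccCallback()
# model.fit(..., callbacks=[acc_cb]); acc_cb.accuracy holds the most recent value
If the external script is a separate process, the same on_batch_end hook is the place to write the value to a small file or a queue instead of an attribute.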
Related
I am trying to build my own Keras layer by inheriting tf.keras.layers.Layer and I don't really understand what the call() method is doing. I have set my call() method to:
def call(self,inputs):
print('call')
return inputs
When I run the network, I would expect 'call' to be printed many times (with a training set of 100 examples and 10 epochs I would expect this to be printed 1000 times). However, 'call' is printed once when the model is built, then 3 times during the first epoch and then never again. Is my network not using this layer in the subsequent epochs? Why is it only being called 3 times in the first epoch despite there being 100 training examples?
The call method is automatically decorated with @tf.function. This means that Keras builds a dataflow graph on the first call and runs that graph on subsequent calls.
Python side effects such as print() therefore only happen during the first (tracing) call. See the details here: https://www.tensorflow.org/guide/function#debugging.
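If you just want to see that the layer really runs every step, two common options (a sketch, assuming TF 2.x; PassThrough is just an example layer name) are to use tf.print, which is part of the traced graph and therefore executes on every call, or to disable graph compilation entirely:
import tensorflow as tf

class PassThrough(tf.keras.layers.Layer):
    def call(self, inputs):
        # tf.print runs on every forward pass because it is part of the graph,
        # unlike the plain Python print, which only runs while the graph is traced
        tf.print("call executed")
        return inputs

# Alternative: run the whole model eagerly so Python side effects happen each step
# model.compile(optimizer="adam", loss="mse", run_eagerly=True)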
I want to fine-tune my model in Keras: I want to switch to different training data and a different learning rate once training reaches epoch 10. How do I get a callback when the specified epoch number is over?
You need to write your own Callback subclass.
https://keras.io/callbacks/ (general information)
https://github.com/keras-team/keras/blob/master/keras/callbacks.py#L275 (source code for the Callback base class)
Your Callback subclass should define an on_epoch_end() method, which accepts the epoch number as an argument.
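For the learning-rate part, a minimal sketch of such a subclass could look like this (the epoch index is zero-based, and the new learning rate of 1e-4 is just an example value); swapping the training data mid-fit is harder to do from a callback, which is one reason the alternative described next is often preferable:
import keras
import keras.backend as K

class FineTuneAt10(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        # epoch is zero-based, so epoch == 9 means 10 epochs have finished
        if epoch == 9:
            K.set_value(self.model.optimizer.lr, 1e-4)
            print('epoch 10 finished, learning rate lowered')

# model.fit(x_train, y_train, epochs=20, callbacks=[FineTuneAt10()])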
Actually, given the way Keras works, this is probably not the best way to go; it would be much better to treat this as fine-tuning, meaning that you finish the 10 epochs, save the model, and then load the model (from another script) and continue training with the learning rate and data you fancy (see the sketch below).
There are several reasons for this.
It is much clearer and easier to debug: you check your model properly after the 10 epochs, verify that it works, and carry on.
It is much better for running several experiments this way, all starting from epoch 10.
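A minimal sketch of that save-and-reload workflow (the file name and data variables are placeholders):
# script 1: train for 10 epochs and save everything, including the optimizer state
model.fit(x_train_a, y_train_a, epochs=10)
model.save('model_after_10_epochs.h5')

# script 2: reload, change the learning rate and the data, and keep training
from keras.models import load_model
import keras.backend as K

model = load_model('model_after_10_epochs.h5')
K.set_value(model.optimizer.lr, 1e-4)
model.fit(x_train_b, y_train_b, epochs=10)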
Good luck!
I am working on an audio dataset to train a neural network using the TensorFlow library, but there is a weird issue that I can't figure out. I am following this blog, Urban Sound Classification; the only difference is that I have my own dataset.
Everything works fine when I have a small dataset of about 30 audio files or so, but when I use the complete dataset my training code simply runs a couple of iterations, outputs the cost, and then that is about it: no error, exception, or warning is thrown, and the TensorFlow session simply doesn't give any further results. Here is the code for a better explanation:
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        # one full-batch training step per epoch
        _, cost = sess.run([optimizer, cost_function],
                           feed_dict={X: tr_features, Y: tr_labels})
        cost_history = np.append(cost_history, cost)

    # evaluate on the test set once training is done
    y_pred = sess.run(tf.argmax(y_, 1), feed_dict={X: ts_features})
    y_true = sess.run(tf.argmax(ts_labels, 1))
    print("Test accuracy: ", round(sess.run(accuracy,
          feed_dict={X: ts_features, Y: ts_labels}), 3))
So when I run the above code to train on the complete data (about 9,000 files), it generates cost history for about 2 epochs and then stops generating it, but the code keeps executing like normal; the sess.run() calls simply stop producing output. My guess is that the session stops because of some exception, but how do I debug this? I have nothing to go on. Can anyone advise on this?
Note: I am not sure if this is the right forum, but point me in the right direction and I will move the question if need be.
UPDATE 01:
So I have figured out some correlation between the amount of data, the learning rate, and the error. Here is my understanding of what is happening. While I was coding I used a subset of my original data, about 10-15 files, for training, the learning rate was 0.01, and it worked well (as in, it completed all its epochs).
When I used 500 files for training, it repeated the same behavior as described in the original question (it would output 2 iterations and then produce no more output, with no exception or error). I noticed that the cost was increasing in those iterations, so I tried lowering the learning rate, and voilà, it worked like a charm with a new learning rate of 0.001 (again, all epochs ran and successfully output their results).
Finally, when I ran the training on all of my data, about 9,000 files, I observed the same behavior as previously discussed. So my question now is: how much should I lower the learning rate? What is the correlation between the learning rate and the amount of data?
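One way to make the failure visible instead of silent (a sketch built on the loop from the question, not a fix for the divergence itself) is to check the cost for NaN/inf each epoch and stop as soon as it blows up, which is also the signal that the learning rate is still too high for this amount of data:
import numpy as np

with tf.Session() as sess:
    sess.run(init)
    for epoch in range(training_epochs):
        _, cost = sess.run([optimizer, cost_function],
                           feed_dict={X: tr_features, Y: tr_labels})
        # stop loudly instead of silently producing garbage once the loss diverges
        if np.isnan(cost) or np.isinf(cost):
            print("Cost diverged at epoch", epoch, "- try a lower learning rate")
            break
        cost_history = np.append(cost_history, cost)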
For the past few days, I have been trying to figure out the flow of execution in the code at https://github.com/tensorflow/models/blob/master/tutorials/embedding/word2vec.py#L28.
I understand the logic behind negative sampling and the loss function, but I am getting very confused about the flow of execution inside the train function, especially when it comes to the _train_thread_body function. I am confused about the while and if loops (what is their impact) and the concurrency-related parts. It would be great if someone could give a decent explanation.
This sample code is called "Multi-threaded word2vec mini-batched skip-gram model", that's why it uses several independent threads for training. Word2Vec can be trained with a single thread as well, but this tutorial shows that word2vec is faster to compute when done in parallel.
The input, label and epoch tensors are provided by the native word2vec.skipgram_word2vec function, which is implemented in tutorials/embedding/word2vec_kernels.cc file. There you can see that current_epoch is a tensor updated once the whole corpus of sentences is processed.
The method you're asking about is actually pretty simple:
def _train_thread_body(self):
    initial_epoch, = self._session.run([self._epoch])
    while True:
        _, epoch = self._session.run([self._train, self._epoch])
        if epoch != initial_epoch:
            break
First, it records the current epoch, then it keeps invoking training steps until the epoch counter increases. This means that every thread running this method will perform exactly one epoch of training, each doing one step at a time in parallel with the others.
self._train is an op that optimizes the loss function (see the optimize method), which is computed from the current examples and labels (see the build_graph method). The exact values of these tensors come from native code again, namely from NextExample. Essentially, each call of word2vec.skipgram_word2vec extracts a set of examples and labels, which form the input to the optimization function. Hopefully that makes it clearer now.
By the way, this model uses NCE loss in training, not negative sampling.
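For orientation, this is roughly how the surrounding train() method drives _train_thread_body (a simplified sketch, not the exact source): it starts several worker threads, each of which runs the loop above for exactly one epoch, and then waits for all of them.
import threading

def train(self):
    workers = []
    for _ in range(concurrent_steps):      # number of threads, taken from the script's options
        t = threading.Thread(target=self._train_thread_body)
        t.start()
        workers.append(t)

    for t in workers:                      # block until every worker has finished its epoch
        t.join()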
I'm using Keras to predict a time series. As standard, I'm using 20 epochs, and I want to know what my neural network predicted after each one of the 20 epochs.
By using model.predict I get only one set of predictions out of all the epochs (I'm not sure how Keras selects it). I want all the predictions, or at least the 10 best.
According to a previous answer I got, I should compute the predictions after each training epoch by implementing an appropriate callback: subclassing Callback() and calling predict on the model inside the on_epoch_end function.
Well, the theory seems in shape, but I'm having trouble coding it. Would anyone be able to give a code example of that?
I'm not sure how to implement the Callback() subclassing, nor how to combine that with model.predict inside an on_epoch_end function.
Your help will be highly appreciated :)
EDIT
Well, I've made a little progress.
I found out how to create the subclass and how to link it to model.predict.
However, I'm racking my brain over how to create a list with all the predictions. Below is my current code:
#Creating a Callback subclass that stores each epoch prediction
class prediction_history(Callback):
    def on_epoch_end(self, epoch, logs={}):
        self.predhis = (model.predict(predictor_train))

#Calling the subclass
predictions = prediction_history()

#Executing the model.fit of the neural network
model.fit(X=predictor_train, y=target_train, nb_epoch=2, batch_size=batch,
          validation_split=0.1, callbacks=[predictions])

#Printing the prediction history
print predictions.predhis
However, all I'm getting with that is a list of predictions of the last epoch (same effect as printing model.predict(predictor_train)).
The question now is: How do I adapt my code so it adds to predhis the predictions of each one of the epochs?
You are overwriting the prediction at each epoch; that is why it doesn't work. I would do it like this:
class prediction_history(Callback):
    def __init__(self):
        self.predhis = []

    def on_epoch_end(self, epoch, logs={}):
        self.predhis.append(model.predict(predictor_train))
This way self.predhis is now a list and each prediction is appended to the list at the end of each epoch.
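For completeness, a usage sketch that keeps the same (older) Keras fit signature used in the question; after training, predhis holds one array of predictions per epoch:
predictions = prediction_history()
model.fit(X=predictor_train, y=target_train, nb_epoch=20,
          batch_size=batch, validation_split=0.1, callbacks=[predictions])

first_epoch_preds = predictions.predhis[0]     # predictions after epoch 1
last_epoch_preds = predictions.predhis[-1]     # predictions after the final epoch
print(len(predictions.predhis))                # equals the number of epochs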