I'm using Keras to train a CNN using the fit_generator function.
It seems to be a known issue that TensorBoard doesn't show histograms and distributions in this setup.
Did anybody figure out a way to make it work anyway?
There is no easy way to just plug it in with one line of code; you have to write your summaries by hand.
The good news is that it's not difficult and you can use the TensorBoard callback code in Keras as a reference.
(There is also a version 2 ready for TensorFlow 2.x.)
Basically, write a function, e.g. write_summaries(model), and call it whenever you want to write your summaries (e.g. just after fit_generator()).
Inside your write_summaries(model) function, use tf.summary.histogram and the other tf.summary functions to log the data you want to see on TensorBoard.
If you don't know exactly how, check the official summaries tutorial and this great example of MNIST with summaries.
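For illustration, here is a minimal TF 1.x-style sketch of such a write_summaries() helper (the './logs' directory, the step argument and the name handling are my own assumptions; adapt them to your setup):
import tensorflow as tf
import keras.backend as K

# Rough sketch only: writes a histogram summary for every weight tensor in the model.
writer = tf.summary.FileWriter('./logs')

def write_summaries(model, step):
    sess = K.get_session()
    # One histogram op per weight tensor; ideally build these once and reuse them,
    # rather than rebuilding them on every call, to avoid growing the graph.
    ops = [tf.summary.histogram(w.name.replace(':', '_'), w) for w in model.weights]
    merged = tf.summary.merge(ops)
    writer.add_summary(sess.run(merged), global_step=step)
    writer.flush()
You could call it once per epoch, for example from a keras.callbacks.LambdaCallback(on_epoch_end=...), or once after fit_generator() returns.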
I believe bartgras's explanation is superseded in more recent versions of Keras (I'm using Keras 2.2.2). To get histograms in TensorBoard, all I did was the following (where bg is a data-wrangling class that exposes a generator via bg.training_batch(); bg.validation_batch(), however, is NOT a generator):
from datetime import datetime
import keras

NAME = "Foo_{}".format(datetime.now().isoformat(timespec='seconds')).replace(':', '-')

# histogram_freq=1 is what enables the histogram/distribution tabs;
# it requires validation data that is not a generator.
tensorboard = keras.callbacks.TensorBoard(
    log_dir="logs/{}".format(NAME),
    histogram_freq=1,
    write_images=True)

callbacks = [
    tensorboard
]

history = model.fit_generator(
    bg.training_batch(),
    validation_data=bg.validation_batch(),
    epochs=EPOCHS,
    steps_per_epoch=bg.steps_per_epoch,
    validation_steps=bg.validation_steps,
    verbose=1,
    shuffle=False,
    callbacks=callbacks)
I built my own sklearn-like estimator that trains a PyTorch model on the GPU (CUDA), and it works fine with RandomizedSearchCV when n_jobs == 1. When n_jobs > 1, I get the following error:
PicklingError: Can't pickle <class '__main__.LSTM'>: attribute lookup LSTM on __main__ failed
This is the piece of code giving me the error:
from numpy import random, arange
from sklearn.model_selection import RandomizedSearchCV, ShuffleSplit
from sklearn.metrics import make_scorer, mean_squared_error

model = my_model(input_size=1, hidden_layer_size=80, n_lstm_units=3, bidirectional=False,
                 output_size=1, training_batch_size=60, epochs=7500, device=device)
model.to(device)

hidden_layer_size = random.uniform(40, 200, 20).astype("int")
n_lstm_units = arange(1, 4)
parametros = {'hidden_layer_size': hidden_layer_size, 'n_lstm_units': n_lstm_units}

splitter = ShuffleSplit()
regressor = model
cv_search = \
    RandomizedSearchCV(estimator=regressor, cv=splitter,
                       param_distributions=parametros,
                       refit=True,
                       n_iter=4,
                       verbose=1,
                       n_jobs=2,
                       scoring=make_scorer(mean_squared_error,
                                           greater_is_better=False,
                                           needs_proba=False))
cv_search = MetaSKLearnWrapper(cv_search)
cv_search.fit(X, y)
Using the Neuraxle wrapper leads to exactly the same error; it changes nothing.
I found the closest solution here, but I still don't know how to use RandomizedSearchCV within Neuraxle. It is a brand new project, so I couldn't find an answer in their docs or community examples. If anyone can give me an example or a good pointer, it will save my life. Thank you.
P.S.: Any way to run RandomizedSearchCV with my PyTorch model on the GPU without Neuraxle also helps; I just need n_jobs > 1.
P.S. 2: My model has a fit() method that creates and moves tensors to the GPU; it already works and has been tested.
There are multiple criteria that must be respected here for your code to work:
You need to use Neuraxle's RandomSearch instead of sklearn's random search for this to work. Use Neuraxle's base classes when possible.
Make sure that you use a Neuraxle BaseStep for your PyTorch model, instead of an sklearn base class.
Also, you should create your PyTorch code only in the setup() method or later. You can't create a PyTorch model in the __init__ of the BaseStep that contains PyTorch code (see the sketch at the end of this answer). You will want to read this page.
You will probably have to create a Saver for your BaseStep that contains PyTorch code if you want to serialize and then load your trained pipeline again. You can see how we created our TensorFlow Saver for our TensorFlow BaseStep and do something similar. Your saver will probably be much simpler than ours due to the more eager nature of PyTorch. For instance, you could have self.model inside your extension of the BaseStep class. The role of the saver would be to save and strip away this simple variable from self, and to be able to reload it whenever needed.
To sum up: you'd need to create two classes, and your two classes should look very similar to our two TensorFlow step and saver classes here, with the exception that your PyTorch model is in a self.model variable of your step.
I'd be glad to see your implementation of your PyTorch base step and of your PyTorch saver!
You could then also even use the AutoML class (see AutoML example here) to save experiments in a Hyperparameter Repository as seen in the example.
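To make this more concrete, here is a rough, untested sketch of what such a PyTorch BaseStep could look like (the class name, hyperparameters and hook signatures are my own assumptions and may differ between Neuraxle versions; check the Neuraxle docs):
import torch
from neuraxle.base import BaseStep

class PyTorchLSTMStep(BaseStep):
    def __init__(self, hidden_layer_size=80, n_lstm_units=3, device='cuda'):
        super().__init__()
        # Only plain hyperparameters here; no torch objects in __init__,
        # so the step stays picklable for parallel search.
        self.hidden_layer_size = hidden_layer_size
        self.n_lstm_units = n_lstm_units
        self.device = device
        self.model = None

    def setup(self):
        # PyTorch objects are created lazily, in setup() or later.
        if self.model is None:
            self.model = torch.nn.LSTM(input_size=1,
                                       hidden_size=self.hidden_layer_size,
                                       num_layers=self.n_lstm_units).to(self.device)
        return self

    def fit(self, data_inputs, expected_outputs=None):
        self.setup()
        # ... your training loop over data_inputs / expected_outputs ...
        return self

    def transform(self, data_inputs):
        # ... forward pass returning predictions ...
        return data_inputs
A matching Saver would then strip self.model from the step before serialization and restore it on load, as described above.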
A few weeks ago, I was working on a project and installed an older version of TensorFlow to try to fix a problem I was having. It didn't work as I had hoped, and I pip-installed the newest version of TensorFlow, but now I'm regularly getting error messages related to TensorFlow being out of date. They don't stop program execution, but they are there. As far as I know, I have the most recent version installed, but I think I must be missing something.
This is an example of one of the errors I'm getting:
WARNING:tensorflow:Can save best model only with val_loss available, skipping.
This happens when I try to save a Keras model using ModelCheckpoint. I get a different message when I use model.save(). It seems the issues arise whenever I try to save any model in any way. If anyone has any advice, I would love it.
I'm using Python on Google Colab. Please let me know if you need more info from me.
Edit: Adding code for ModelCheckpoint:
save=ModelCheckpoint("/content/drive/My Drive/Colab Notebooks/cavity data/Frequency Model.h5", save_best_only=True, verbose=1)
It was then passed to model.fit() like this:
model.fit(X_train, Y_train, epochs=500, callbacks=[save, stop], verbose=1)
The default monitor for ModelCheckpoint is the validation loss or "val_loss".
As the warning suggests, the key "val_loss" is missing because you didn't use validation data in model.fit().
Either specify the validation split or validation data in model.fit() or just use training loss or accuracy as a monitor for ModelCheckpoint as in my example below.
monitor = "accuracy" # or "loss"
save = ModelCheckpoint("/content/drive/My Drive/Colab Notebooks/cavity data/Frequency Model.h5", monitor=monitor, save_best_only=True, verbose=1)
model.fit(X_train, Y_train, epochs=500, callbacks=[save, stop], verbose=1)
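Alternatively, if you prefer to keep monitoring "val_loss", give model.fit() some validation data; a sketch reusing the variables from the question (the 0.2 split is an arbitrary choice):
# Providing a validation split makes "val_loss" available to ModelCheckpoint.
save = ModelCheckpoint("/content/drive/My Drive/Colab Notebooks/cavity data/Frequency Model.h5",
                       monitor="val_loss", save_best_only=True, verbose=1)
model.fit(X_train, Y_train, validation_split=0.2, epochs=500, callbacks=[save, stop], verbose=1)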
I'm using TensorFlow 2.0 with the standard DNNClassifier estimator. It seems that the optimizers in tf.optimizers are not instances of the Optimizer class the classifier expects.
I have already tried tf.optimizers and the new Keras implementation at tf.keras.optimizers, and I get the same error.
Even passing the optimizer as a lambda in the estimator call still returns the same message.
import tensorflow as tf

lr = tf.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.1,
    decay_steps=10000,
    decay_rate=0.96)
optimizer = tf.optimizers.Adam(learning_rate=lr)

classifier = tf.estimator.DNNClassifier([20, 45, 15],
                                        feature_columns,
                                        n_classes=df.target.nunique(),
                                        optimizer=optimizer)

classifier.train(input_fn=lambda: df_to_dataset(train, batch_size=batch_size),
                 steps=10000)
Every time I execute it I get the same message, no matter which optimizer I choose:
The given object is not an Optimizer instance. Given: <tensorflow.python.keras.optimizer_v2.adam.Adam object at 0x7f7f186dada0>
Use tf.compat.v1.train.AdamOptimizer() instead of tf.optimizers.Adam().
OK, for whoever finds this:
This issue is a blocker for the upcoming TensorFlow 2.0 release.
I found a similar issue today on the github repository:
https://github.com/tensorflow/tensorflow/issues/20812
As of today, there's no workaround but to use tailor-made estimators. (I'm not sure whether even that works.)
Edit: I found a workaround thanks to @Vlad. He suggested using the tf.compat.v1 API to build the optimizer, so I did.
Here is how the code ended up looking:
optimizer = tf.compat.v1.train.AdamOptimizer(
learning_rate=tf.compat.v1.train.exponential_decay(
global_step=0,
learning_rate=0.1,
decay_steps=10000,
decay_rate=0.96, staircase=True))
The reason is that you are using the tensorflow.python.keras API for the model and layers and keras.optimizers for SGD. They are two different Keras versions: TensorFlow's and the pure Keras package. They cannot work together. You have to change everything to one version.
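For example, a sketch of what "everything from one version" looks like using tf.keras only (the layer sizes are placeholders, not your actual model):
# All imports come from tf.keras; nothing is imported from the standalone keras package.
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import SGD

model = Sequential([Dense(10, activation='relu', input_shape=(4,)),
                    Dense(1)])
model.compile(optimizer=SGD(0.01), loss='mse')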
After this, my code works well :P
Hope this helps.
I have been working with Keras for a week or so. I know that Keras can use either TensorFlow or Theano as a backend. In my case, I am using TensorFlow.
So I'm wondering: is there a way to write a NN in Keras, and then print out the equivalent version in TensorFlow?
MVE
For instance suppose I write
from keras.models import Sequential
from keras.layers import Dense

# create sequential model
model = Sequential()
# add layers
model.add(Dense(100, input_dim=10, activation='relu'))
model.add(Dense(1, activation='linear'))
# compile model
model.compile(optimizer='adam', loss='mse')
# fit
model.fit(Xtrain, ytrain, epochs=100, batch_size=32)
# predict
ypred = model.predict(Xtest, batch_size=32)
# evaluate
result = model.evaluate(Xtest, ytest)
This code might be wrong, since I just started, but I think you get the idea.
What I want to do is write down this code, run it (or not even, maybe!) and then have a function or something that will produce the TensorFlow code that Keras has written to do all these calculations.
First, let's clarify some of the language in the question. TensorFlow (and Theano) use computational graphs to perform tensor computations. So, when you ask if there is a way to "print out the equivalent version" in Tensorflow, or "produce TensorFlow code," what you're really asking is, how do you export a TensorFlow graph from a Keras model?
As the Keras author states in this thread,
When you are using the TensorFlow backend, your Keras code is actually building a TF graph. You can just grab this graph.
Keras only uses one graph and one session.
However, he links to a tutorial whose details are now outdated. But the basic concept has not changed.
We just need to:
Get the TensorFlow session
Export the computation graph from the TensorFlow session
Do it with Keras
The keras_to_tensorflow repository contains a short example, in an iPython notebook, of how to export a model from Keras for use in TensorFlow; under the hood it basically just uses TensorFlow directly. It isn't a clearly-written example, but I'm throwing it out there as a resource.
Do it with TensorFlow
It turns out we can actually get the TensorFlow session that Keras is using from TensorFlow itself, using the tf.contrib.keras.backend.get_session() function. It's pretty simple to do - just import and call. This returns the TensorFlow session.
Once you have the TensorFlow session variable, you can use the SavedModelBuilder to save your computational graph (guide + example to using SavedModelBuilder in the TensorFlow docs). If you're wondering how the SavedModelBuilder works and what it actually gives you, the SavedModelBuilder Readme in the Github repo is a good guide.
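Put together, the two steps look roughly like this with the TF 1.x API the answer describes (the './exported_model' directory is a placeholder):
import tensorflow as tf

# Grab the session Keras has been building its graph in...
sess = tf.contrib.keras.backend.get_session()

# ...and export the graph plus variables with SavedModelBuilder.
builder = tf.saved_model.builder.SavedModelBuilder('./exported_model')
builder.add_meta_graph_and_variables(sess, [tf.saved_model.tag_constants.SERVING])
builder.save()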
P.S. - If you are planning on heavy usage of TensorFlow + Keras in combination, have a look at the other modules available in tf.contrib.keras
So you want to use a different function than WX+b for your neurons. Well, in TensorFlow you calculate this product explicitly, so for example you do
y_ = tf.matmul(X, W)
so you simply have to write your own formula instead and let the network learn it. It should not be difficult to implement.
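For instance, a minimal TF 1.x sketch where the usual pre-activation is swapped for a formula of your own (the shapes and the sin() choice are arbitrary placeholders):
import tensorflow as tf

X = tf.placeholder(tf.float32, [None, 10])
W = tf.Variable(tf.random_normal([10, 5]))
b = tf.Variable(tf.zeros([5]))

y_linear = tf.matmul(X, W) + b          # the standard WX + b
y_custom = tf.sin(tf.matmul(X, W) + b)  # any differentiable formula of your own
As long as the formula is differentiable, the optimizer can backpropagate through it.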
In addition, what you are trying to do (according to the paper you link) is called batch normalization and is relatively standard. The idea is that you normalize your intermediate steps (in the different layers). Check for example https://arxiv.org/abs/1502.03167 or https://bcourses.berkeley.edu/files/66022277/download?download_frd=1&verifier=oaU8pqXDDwZ1zidoDBTgLzR8CPSkWe6MCBKUYan7
Hope that helps,
Umberto
I'm using Keras to predict a time series. As standard, I'm using 20 epochs. I want to know what my neural network predicted after each of the 20 epochs.
By using model.predict I only get the last prediction, but I want all the predictions, or at least the last 10 (which have acceptable error levels).
To access those, I'm trying the ModelCheckpoint function from Keras; however, I'm having trouble accessing it afterwards. I'm using the following code:
from keras.models import Sequential
from keras.layers.core import Dense, Activation
from keras.layers.recurrent import GRU
from keras.callbacks import ModelCheckpoint

model = Sequential()
model.add(GRU(input_dim=col, init='uniform', output_dim=20))
model.add(Dense(10))
model.add(Dense(5))
model.add(Activation("softmax"))
model.add(Dense(1))
model.compile(loss="mae", optimizer="RMSprop")

checkpoint = ModelCheckpoint(filepath='/Users/Alex/checkpoint.hdf5')

model.fit(X=predictor_train, y=target_train, nb_epoch=20, batch_size=batch, validation_split=0.1)  # best validation split at 0.1
model.evaluate(X=predictor_train, y=target_train, batch_size=batch, show_accuracy=True)

print checkpoint
Objectively, my questions are:
I expected that after running the code I would find a file named checkpoint.hdf5 inside the folder /Users/Alex; however, I didn't. What am I missing?
When I print checkpoint, what I get is <keras.callbacks.ModelCheckpoint object at 0x117471290>. Is there a way to print what I want? What would the code look like?
Your help is very much appreciated :)
There are two problems in this code:
You are not passing the callback to the model's fit method. This is done with the keyword argument "callbacks".
The filepath should contain placeholders (like "{epoch:02d}-{val_loss:.2f}") that are used with str.format by Keras in order to save each epoch to a different file.
So the correct version should be something like:
checkpoint = ModelCheckpoint(filepath='/Users/Alex/checkpoint-{epoch:02d}-{val_loss:.2f}.hdf5')
model.fit(X=predictor_train, y=target_train, nb_epoch=20,
batch_size=batch,validation_split=0.1, callbacks=[checkpoint])
You can also add other kinds of callbacks in the list that is assigned to that keyword.
Unfortunately the callback object doesn't store the history information so it cannot be recovered from it.
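For the original goal of getting a prediction per epoch: once each epoch is saved to its own file as above, an untested sketch of collecting the predictions afterwards (reusing predictor_train and batch from the question):
import glob
from keras.models import load_model

predictions_per_epoch = {}
for path in sorted(glob.glob('/Users/Alex/checkpoint-*.hdf5')):
    m = load_model(path)  # reload the model as it was after that epoch
    predictions_per_epoch[path] = m.predict(predictor_train, batch_size=batch)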