Using Keras serialized model with dropout in pyspark - python

I have several neural networks built with Keras that I have so far used mostly in Jupyter. I often save scikit-learn models with joblib and Keras models as json + hdf5, and use them in other notebooks without issue.
I made a Python Spark application that makes use of those serialized models in cluster mode. The joblib models work fine; however, I ran into an issue with Keras.
Here is the model used in both the notebook and PySpark:
from keras.models import Sequential
from keras.layers import Embedding, GRU, Dense

def build_gru_model():
    model = Sequential()
    model.add(Embedding(max_nb_words, 128, input_length=max_sequence_length, dropout=0.2))
    model.add(GRU(128, dropout_W=0.2, dropout_U=0.2))
    model.add(Dense(2, activation='softmax'))
    model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model
Both are called the same way:
preds = model.predict_proba(data, verbose=0)
However, only in Spark do I get the error:
MissingInputError: ("An input of the graph, used to compute DimShuffle{x,x,x,x}(keras_learning_phase), was not provided and not given a value.Use the Theano flag exception_verbosity='high',for more information on this error.", keras_learning_phase)
I've done the mandatory search and found: https://github.com/fchollet/keras/issues/2430 which points to https://keras.io/getting-started/faq/
If I remove dropout from my model, it does indeed work. However, I fail to understand how to implement something that would let me keep dropout during the training phase, as described in the FAQ.
Based on the model code, how would one accomplish this?

You can try putting the following before your prediction:
import keras.backend as K
K.set_learning_phase(0)
It should set your learning phase to 0 (test time).
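For reference, here is a minimal sketch of where that call could go in the Spark job, assuming the model is rebuilt from the saved json + hdf5 (the file names are placeholders); the important part is that the learning phase is set before the model graph is constructed:
import keras.backend as K
from keras.models import model_from_json

K.set_learning_phase(0)  # 0 = test time; set this before the model is (re)built

# rebuild the serialized model (placeholder file names)
with open('model.json') as f:
    model = model_from_json(f.read())
model.load_weights('weights.h5')

preds = model.predict_proba(data, verbose=0)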

Related

Keras load model after saving makes random predictions in a new python session

I am using tensorflow version '2.0.0' and keras version '2.3.0' to develop the model. Here is how I set up the seeds before saving the model:
import random
import numpy as np
import tensorflow as tf

seed = 1234
random.seed(seed)
np.random.seed(seed)
tf.compat.v1.random.set_random_seed(seed)
I then save the entire model as instructed here:
model.save('some_model_name.h5')
I am getting an accuracy of about 95% during training. When I load the model from a different python session, like:
# Recreate the exact same model
new_model = load_model('some_model_name.h5', custom_objects={'SeqSelfAttention': SeqSelfAttention})
score = new_model.evaluate([x_img_train, x_txt_train], y_train, verbose=2)
print("%s: %.2f%%" % (new_model.metrics_names[1], score[1]*100))
The accuracy now is about 4%. Please note that I have batch norm and dropout layers. How can I make the predictions of my model consistent across different sessions?
Firstly, I have downgraded the TensorFlow version to 1.13.1, owing to stability issues of 2.0.0.
Secondly, I had to ensure a few things before I could achieve some level of reproducibility:
Using the Adagrad optimizer instead of Adam gave me performance comparable to the training session; with Adam, every time I loaded the model I got high variance in the predictions.
Loading the architecture from JSON and then loading the model weights gave me different results compared to saving and loading the weights only; the former approach seemed to produce performance comparable to training (a sketch of this approach follows below).
Training with a tf.Session, saving it, and reloading that session in a new Python session did the trick.
There was no variation in the results with or without dropout or batch norm.
Please note that following these steps gave me some level of consistency, although it's not 100% reproducible. If you're facing a similar issue, perhaps these insights could help.
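For illustration, here is a minimal sketch of the architecture-from-JSON plus weights approach mentioned above, assuming a trained Keras model named model; the file names, compile arguments, and the SeqSelfAttention custom object are illustrative and would need to match your own setup:
from tensorflow.keras.models import model_from_json

# save architecture and weights separately (placeholder file names)
with open('model_architecture.json', 'w') as f:
    f.write(model.to_json())
model.save_weights('model_weights.h5')

# in the new python session: rebuild, reload the weights, then recompile
with open('model_architecture.json') as f:
    new_model = model_from_json(f.read(), custom_objects={'SeqSelfAttention': SeqSelfAttention})
new_model.load_weights('model_weights.h5')
new_model.compile(loss='categorical_crossentropy', optimizer='adagrad', metrics=['accuracy'])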
After loading the model in a new kernel instance, make sure to configure the losses and metrics again with .compile(), in the same way you did before saving.
For example:
old_model = tf.keras.Sequential([ ... ])
old_model.compile(loss = 'mean_squared_error', optimizer = 'sgd', metrics = ['accuracy'])
old_model.fit(train_ds, validation_data=valid_ds, epochs=3)
old_model.evaluate(test_ds)
old_model.save('some_model_name.h5')
Then in the new kernel:
from tensorflow.keras.models import load_model
new_model = load_model("some_model_name.h5")
new_model.compile(loss = 'mean_squared_error', optimizer = 'sgd', metrics = ['accuracy'])
new_model.evaluate(test_ds) # should be the same now

Is there a lightweight Python module to load pre-fitted ML models and perform prediction?

I am implementing a machine learning module that should run on a Raspberry Pi which is currently shared among different services.
My idea is to store on the device only the code in charge of retrieving the inputs of the ML module and performing the prediction, together with the file containing the neural network model already fitted using Keras.
In other words, I would like to avoid installing all the Keras/TensorFlow packages and dependencies if my purpose is only to perform prediction with a trained model, not to train a new one.
Is there a way to do that? Are there any lightweight libraries that allow loading a neural network model (with all its weights and biases) and performing a prediction, given the inputs?
What I am able to do now is load a ".h5" file containing the model, weights and biases on the Raspberry Pi, but I still have to declare the model-building function through Keras.
from tensorflow.keras.models import load_model
from tensorflow.keras.wrappers.scikit_learn import KerasRegressor
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

def NN_model():
    '''
    Definition of the Neural Network model
    '''
    model = Sequential()
    model.add(Dense(7, input_dim=6, kernel_initializer='normal', activation='relu'))
    model.add(Dense(15, kernel_initializer='normal', activation='relu'))
    model.add(Dense(24, kernel_initializer='normal'))
    # Compile model
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

'''
Load NN model and use it to predict the radiation values
for the next 24 hours, hour by hour
'''
regr = KerasRegressor(build_fn=NN_model, epochs=1000, batch_size=5, verbose=0)
regr.model = load_model('saved_model.h5')
pred = regr.predict(input_row)
Since a fitted neural network is just a matter of weights and biases (and activation functions), I would expect that, once these parameters are determined, I wouldn't need the whole TensorFlow and Keras environment to map the inputs I give to the NN to an output.
What I would like to have is just something like:
import lightweight_module as lm
regression_model = lm.load_model('saved_model.h5')
prediction=regression_model.predict(inputs)
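As a side note on the snippet above: if the whole model was saved with model.save(), Keras can already get close to this without the build function or the KerasRegressor wrapper, although it still requires TensorFlow/Keras to be installed on the device. A minimal sketch:
from tensorflow.keras.models import load_model

# restores the architecture and weights (and, if present, the compile state)
model = load_model('saved_model.h5')
pred = model.predict(input_row)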
One thing you can do is prune your neural network while retaining the same accuracy. Pruning removes the connections between neurons that do not learn anything significant. It not only reduces the complexity of your NN, it also drastically reduces the storage space required and the inference time. I don't know of any such module in Keras (though I think people have made their own versions), but frameworks like PyTorch and Caffe have implementations for AlexNet and VGGNet that can reduce the size of your NN model by as much as 49x. You can find one such implementation here:
https://github.com/felzek/AlexNet-A-Practical-Implementation/blob/master/testModel.py

TypeError: can't pickle NotImplementedType objects (in keras, python)

I was doing deep learning using Keras.
However, the following error occurred while saving the model after training.
TypeError: can't pickle NotImplementedType objects
I had no problem when I ran the same code in another directory.
The code below is the portion that causes the error.
....
model.add(Dense(2, activation='relu'))
model.add(Dense(1, activation='sigmoid'))
model = multi_gpu_model(model, gpus=4)
model.compile(loss='binary_crossentropy', optimizer = 'adam', metrics = ['accuracy'])
model.fit(x_train,y_train,epochs = 3, batch_size =500)
scores = model.evaluate(x_test,y_test)
#print("%s:.2f%%"%(model.metrics_names[1], scores[1]*100))
model.save('/disk3/seaice/seaice_keras_model2.h5')
Is the pickle TypeError coming from the save method inside Keras?
It is the same environment, so I don't understand why it behaves differently in different directories.
I'd appreciate it if you could provide me with a solution to this problem.
When saving a multi-gpu model, the Keras documentation recommends that you call the save(fname) or save_weights(fname) methods of the base model rather than those of the multi_gpu_model (see here, at the very bottom of the page).
I would assign your multi_gpu_model to a new variable rather than reassigning model. That way you'll have an easy reference to your base model that you can use to save weights.
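A minimal sketch of that pattern, based on the snippet in the question (variable names are illustrative):
from keras.utils import multi_gpu_model

# keep the base model under its own name instead of reassigning it
model.add(Dense(2, activation='relu'))
model.add(Dense(1, activation='sigmoid'))

parallel_model = multi_gpu_model(model, gpus=4)   # new variable, base `model` untouched
parallel_model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
parallel_model.fit(x_train, y_train, epochs=3, batch_size=500)

# save the base model; its weights are shared with the parallel wrapper
model.save('/disk3/seaice/seaice_keras_model2.h5')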

How to obtain the Tensorflow code version of a NN built in Keras?

I have been working with Keras for a week or so. I know that Keras can use either TensorFlow or Theano as a backend. In my case, I am using TensorFlow.
So I'm wondering: is there a way to write a NN in Keras, and then print out the equivalent version in TensorFlow?
MVE
For instance suppose I write
#create seq model
model = Sequential()
# add layers
model.add(Dense(100, input_dim = (10,), activation = 'relu'))
model.add(Dense(1, activation = 'linear'))
# compile model
model.compile(optimizer = 'adam', loss = 'mse')
# fit
model.fit(Xtrain, ytrain, epochs = 100, batch_size = 32)
# predict
ypred = model.predict(Xtest, batch_size = 32)
# evaluate
result = model.evaluate(Xtest)
This code might be wrong, since I just started, but I think you get the idea.
What I want to do is write down this code, run it (or not even, maybe!) and then have a function or something that will produce the TensorFlow code that Keras has written to do all these calculations.
First, let's clarify some of the language in the question. TensorFlow (and Theano) use computational graphs to perform tensor computations. So, when you ask if there is a way to "print out the equivalent version" in Tensorflow, or "produce TensorFlow code," what you're really asking is, how do you export a TensorFlow graph from a Keras model?
As the Keras author states in this thread,
When you are using the TensorFlow backend, your Keras code is actually building a TF graph. You can just grab this graph.
Keras only uses one graph and one session.
However, he links to a tutorial whose details are now outdated. But the basic concept has not changed.
We just need to:
Get the TensorFlow session
Export the computation graph from the TensorFlow session
Do it with Keras
The keras_to_tensorflow repository contains a short example of how to export a model from Keras for use in TensorFlow, in an iPython notebook. This is basically using TensorFlow. It isn't a clearly written example, but I'm throwing it out there as a resource.
Do it with TensorFlow
It turns out we can actually get the TensorFlow session that Keras is using from TensorFlow itself, using the tf.contrib.keras.backend.get_session() function. It's pretty simple to do - just import and call. This returns the TensorFlow session.
Once you have the TensorFlow session variable, you can use the SavedModelBuilder to save your computational graph (guide + example to using SavedModelBuilder in the TensorFlow docs). If you're wondering how the SavedModelBuilder works and what it actually gives you, the SavedModelBuilder Readme in the Github repo is a good guide.
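Putting those two steps together, a minimal TF1-era sketch might look like this (the export directory and tag are illustrative):
import keras.backend as K
import tensorflow as tf

# build and train your Keras model first, then grab the session Keras is using
sess = K.get_session()

# export the graph and variables with SavedModelBuilder
builder = tf.saved_model.builder.SavedModelBuilder('./exported_model')
builder.add_meta_graph_and_variables(sess, [tf.saved_model.tag_constants.SERVING])
builder.save()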
P.S. - If you are planning on heavy usage of TensorFlow + Keras in combination, have a look at the other modules available in tf.contrib.keras
So you want to use a different function for your neurons instead of WX+b. Well, in TensorFlow you calculate this product explicitly, so for example you do
y_ = tf.matmul(X, W)
You simply have to write your formula and let the network learn. It should not be difficult to implement.
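For example, a TF1-style sketch of spelling out the layer formula yourself (the shapes and the activation are only illustrative):
import tensorflow as tf

X = tf.placeholder(tf.float32, [None, 10])
W = tf.Variable(tf.random_normal([10, 1]))
b = tf.Variable(tf.zeros([1]))

# replace this line with whatever function of X, W and b you want the layer to compute
y_ = tf.nn.relu(tf.matmul(X, W) + b)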
In addition, what you are trying to do (according to the paper you link) is called batch normalization and is relatively standard. The idea is that you normalize your intermediate activations (in the different layers). See for example https://arxiv.org/abs/1502.03167 or https://bcourses.berkeley.edu/files/66022277/download?download_frd=1&verifier=oaU8pqXDDwZ1zidoDBTgLzR8CPSkWe6MCBKUYan7
Hope that helps,
Umberto

Feature Importance Chart in neural network using Keras in Python

I am using Python (3.6), Anaconda (64 bit), and Spyder (3.1.2). I have already set up a neural network model using Keras (2.0.6) for a regression problem (one response, 10 variables). I was wondering how I can generate a feature importance chart for the following model:
def base_model():
    model = Sequential()
    model.add(Dense(200, input_dim=10, kernel_initializer='normal', activation='relu'))
    model.add(Dense(1, kernel_initializer='normal'))
    model.compile(loss='mean_squared_error', optimizer='adam')
    return model

clf = KerasRegressor(build_fn=base_model, epochs=100, batch_size=5, verbose=0)
clf.fit(X_train, Y_train)
I was recently looking for the answer to this question and found something that was useful for what I was doing and thought it would be helpful to share. I ended up using a permutation importance module from the eli5 package. It most easily works with a scikit-learn model. Luckily, Keras provides a wrapper for sequential models. As shown in the code below, using it is very straightforward.
from keras.wrappers.scikit_learn import KerasClassifier, KerasRegressor
import eli5
from eli5.sklearn import PermutationImportance

def base_model():
    model = Sequential()
    ...
    return model

X = ...
y = ...

my_model = KerasRegressor(build_fn=base_model, **sk_params)
my_model.fit(X, y)

perm = PermutationImportance(my_model, random_state=1).fit(X, y)
eli5.show_weights(perm, feature_names=X.columns.tolist())
This is a relatively old post with relatively old answers, so I would like to offer another suggestion of using SHAP to determine feature importance for your Keras models. SHAP offers support for both 2d and 3d arrays compared to eli5 which currently only supports 2d arrays (so if your model uses layers which require 3d input like LSTM or GRU, eli5 will not work).
Here is the link to an example of how SHAP can plot the feature importance for your Keras models, but in case it ever becomes broken some sample code and plots are provided below as well (taken from said link):
import shap
# load your data here, e.g. X and y
# create and fit your model here
# load JS visualization code to notebook
shap.initjs()
# explain the model's predictions using SHAP
# (same syntax works for LightGBM, CatBoost, scikit-learn and spark models)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)
# visualize the first prediction's explanation (use matplotlib=True to avoid Javascript)
shap.force_plot(explainer.expected_value, shap_values[0,:], X.iloc[0,:])
shap.summary_plot(shap_values, X, plot_type="bar")
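Note that TreeExplainer in the quoted snippet targets tree-based models; for a Keras network you would typically use something like shap.DeepExplainer (or shap.KernelExplainer) instead. A minimal sketch, assuming X is a numpy array and model is the fitted Keras model:
import numpy as np
import shap

# use a small background sample for the explainer to integrate over
background = X[np.random.choice(X.shape[0], 100, replace=False)]

explainer = shap.DeepExplainer(model, background)
shap_values = explainer.shap_values(X[:10])

# bar chart of average feature impact for the explained samples
shap.summary_plot(shap_values, X[:10], plot_type="bar")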
At the moment Keras doesn't provide any functionality to extract the feature importance.
You can check this previous question:
Keras: Any way to get variable importance?
or the related GoogleGroup: Feature importance
Spoiler: in the Google Group someone announced an open source project to solve this issue.
