When training and saving an xgboost model in Python using the base API (i.e., xgboost.train(args)), we can then save the parameters using .save_model():
import xgboost
model = xgboost.train(args) # Learning API
model.save_model(args)
loaded_model = xgboost.XGBRegressor() # Scikit-Learn API
loaded_model.load_model(args)
How can we load this trained model into the xgboost sklearn API? My goal is to load a trained xgboost model (trained using the Learning API) into the xgboost Scikit-Learn API as a fitted model, so that I can then leverage other sklearn functions.
The approach I included in the code above does not enable the loaded model to work with other sklearn functions, and I get a NotFittedError when I try to use other sklearn functions on the model.
Here is a link to the Python API for the model I am using: https://xgboost.readthedocs.io/en/latest/python/python_api.html
I am training the model using the 'Learning API' and trying to load the model into the 'Scikit-Learn API'.
Assuming you have used one of the standard classifiers or models from the scikit-learn package, you can save and load your models using pickle:
import pickle
model.fit(X, y)  # scikit-learn estimators are trained with fit(), not train()
saved_model = pickle.dumps(model)
# Load the pickled model
loaded_model = pickle.loads(saved_model)
# Using the loaded model to predict new data
loaded_model.predict(X_test)
You can also save the model to a file and load it afterwards:
import pickle
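model.fit(X, y)  # scikit-learn estimators are trained with fit(), not train()
# Pickle writes bytes, so the file must be opened in binary mode
with open('model.obj', 'wb') as file_pi:
    pickle.dump(model, file_pi)
# Load the pickled model
with open('model.obj', 'rb') as filehandler:
    loaded_model = pickle.load(filehandler)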
# Using the loaded model to predict new data
loaded_model.predict(X_test)
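A minimal, self-contained sketch of the file round trip, using only the standard library. The "model" here is a hypothetical stand-in (a plain dict of learned parameters with a toy `predict` helper), not a scikit-learn or xgboost object; the file name and values are illustrative assumptions. It shows that pickle files are opened in binary mode and that `with` blocks close the handles:

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a fitted model: learned parameters in a plain dict.
model = {"weights": [2.0, -1.0], "bias": 0.5}


def predict(params, rows):
    """Toy linear prediction from the stored parameters."""
    return [
        sum(w * x for w, x in zip(params["weights"], row)) + params["bias"]
        for row in rows
    ]


# Illustrative path inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "model.obj")

# Pickle writes bytes, so open the file in binary write mode.
with open(path, "wb") as f:
    pickle.dump(model, f)

# Read it back in binary read mode.
with open(path, "rb") as f:
    loaded_model = pickle.load(f)

X_test = [[1.0, 2.0], [3.0, 0.0]]
predictions = predict(loaded_model, X_test)
```

Keep in mind that unpickling a real estimator generally requires the same library (and ideally the same version) to be importable in the loading process.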
Related
I have my cognitive vision API model trained and have exported it (tried two formats: TensorFlow and SavedModel).
Now I would love to load this exported model in a Python script, ideally using Keras rather than native Tensorflow. I would like to print out the summary() of the model and copy the layers to retrain it in a custom Python script.
However, I don't seem to get this to work:
Loading this using the SavedModel format
With the following code:
import tensorflow as tf
loaded = tf.saved_model.load(export_dir='mydir/savedmodel')
loaded.summary()
I get the following exception: 'AutoTrackable' object has no attribute 'summary'. It seems that the load method returned an AutoTrackable rather than a Model.
Using GraphDef
The following code, taken from this link, creates a TensorFlow-specific type that I don't know how to convert into a Keras model.
import tensorflow as tf
import os
graph_def = tf.compat.v1.GraphDef()
labels = []
# These are set to the default names from exported models, update as needed.
filename = 'mydir/tf/model.pb'
labels_filename = "mydir/tf/labels.txt"
# Import the TF graph
with tf.io.gfile.GFile(filename, 'rb') as f:
    graph_def.ParseFromString(f.read())
    tf.import_graph_def(graph_def, name='')
# Create a list of labels.
with open(labels_filename, 'rt') as lf:
    for l in lf:
        labels.append(l.strip())
Use the TensorFlow SavedModel format to load the model at run time:
https://www.tensorflow.org/guide/saved_model#building_a_savedmodel
Exporting the model can be done using a Python script that loads the model, creates a signature and then saves the model in the saved model format.
The model needs to be persisted so that it can be uploaded and read for registration. If you are using ./outputs to send files to the run output, then inside your train.py script you just need to do something like this:
#persist the model to the local machine
tf.saved_model.save(model,'./outputs/model/')
#register the model with run object
run.register_model(model_name,'./outputs/model/')
I found the link below, which shows that I can export the estimator as a tf.saved_model.
https://guillaumegenthial.github.io/serving-tensorflow-estimator.html
Below is how I trained an XGBClassifier and saved it:
import pickle
from xgboost import XGBClassifier
# train
model = XGBClassifier()
model.fit(X, y)
# export
pickle.dump(model, open('model.pickle', 'wb'))
This is how I loaded the model and made predictions
loaded_model = pickle.load(open('model.pickle', 'rb'))
y_pred = loaded_model.predict(X)
The model predictions are OK if the model was loaded from within the same python process where the training was performed, but the predictions are not OK (random) if the model was loaded from a different python process than the one used for training.
Note that I have the same problem if model.save_model and model.load_model are used instead of pickle.
The simple checks I did show that the model was saved and loaded properly; the dumps of model._Booster (acquired via model._Booster.dump_model(some_file)) and loaded_model._Booster are identical.
Python version: 3.7.5
xgboost version: tried both 0.80 and 0.90
Any suggestion is appreciated.
In my case, I had changed the column order when predicting, which led to different results. The column order of the training data and the prediction data must be the same.
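To illustrate the column-order pitfall, here is a minimal sketch using only the standard library. The feature names, weights, and toy `predict` function are hypothetical; the point is that a model that consumes features by position gives a different answer when the same record arrives with its columns in another order. With a pandas DataFrame, the analogous fix would be to select the training columns in their original order before predicting.

```python
# Hypothetical feature names, in the order used at training time.
training_columns = ["age", "income", "tenure"]

# Toy positional model: each weight is tied to a column position, not a name.
weights = [1.0, 0.5, 4.0]


def predict(row_values):
    """Weighted sum over features, matched purely by position."""
    return sum(w * v for w, v in zip(weights, row_values))


# The same record, but with its columns in a different order than at training.
record = {"income": 100.0, "tenure": 2.0, "age": 30.0}

# Passing values in arrival order silently pairs weights with the wrong columns.
misordered = list(record.values())

# Reordering by the training column names restores the expected pairing.
aligned = [record[name] for name in training_columns]

wrong_prediction = predict(misordered)
right_prediction = predict(aligned)
```

Storing the training column list alongside the saved model makes this reordering step easy to apply in the loading process.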
I followed the steps on the Predicting Movie Reviews with BERT on TF Hub here.
At the end, how do I export the model to be used/loaded as a classifier later?
I found a link that shows that I can export the estimator as a tf.saved_model. However, I got stuck on creating the 'serving_input_receiver_fn()'.
Use pickle to save the model
import pickle
# save the model to disk
filename = 'finalized_model.sav'
pickle.dump(model, open(filename, 'wb'))
#Load the model
loaded_model = pickle.load(open(filename, 'rb'))
My impression is that it only saves the model's architecture, so I should be able to call it before I start training? And then save_weights() saves the weights I need to restore the model? Any more details on this?
At what stage can I call to_json()? I.e. do I have to call compile() first? Can it be before fit() ?
As mentioned in Keras docs it only saves the architecture of the model:
Saving/loading only a model's architecture
If you only need to save the architecture of a model, and not its
weights or its training configuration, you can do:
# save as JSON
json_string = model.to_json()
# save as YAML
yaml_string = model.to_yaml()
The generated JSON / YAML files are human-readable and can be manually
edited if needed.
You can then build a fresh model from this data:
# model reconstruction from JSON:
from keras.models import model_from_json
model = model_from_json(json_string)
# model reconstruction from YAML
from keras.models import model_from_yaml
model = model_from_yaml(yaml_string)
I have trained a model using CNTK; let's call it simple.dnn.
For the testing phase I do not want to install CNTK on Windows, but I still want to use the trained model with Python. How can I use the trained model (weights, ...) for testing from Python?
You can use the load_model function, see https://www.cntk.ai/pythondocs/cntk.html?highlight=load_model#cntk.persist.load_model. The basic flow should look like this:
from cntk import load_model
loaded_model = load_model("yourModel.model", 'float')
output = loaded_model.eval(arguments)  # evaluate the loaded model, not an undefined `model`