I am using save_model and load_model, but they don't work.
I get this error: AttributeError: 'GridSearchCV' object has no attribute 'get_config'
I don't know whether I am using these functions correctly. Here is my code as an example:
gridSearch = GridSearchCV(estimator = classifier,
                          param_grid = parameters,
                          scoring = "accuracy",
                          cv = 10)
gridSearch.fit(X_train, y_train)
save_model(gridSearch, filepath = 'monModele.h5')
The result is the AttributeError above. Can you help me find a solution to this problem, or another way to save and load a Keras model?
That is because GridSearchCV is not a Keras model but a class from sklearn that also has a fit method with a similar API.
In order to use save_model and load_model you need the actual Keras model; my guess is that it is your classifier. Specifically, you need an instance of the Model class from Keras.
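To illustrate the distinction, here is a minimal sketch that uses only scikit-learn. LogisticRegression is a stand-in for the Keras wrapper (an assumption, since the original classifier is not shown). After fitting, the refitted best model is available as best_estimator_. With a Keras scikit-learn wrapper, the underlying Keras model is usually an attribute of that object (I believe model in the legacy keras.wrappers.scikit_learn wrapper and model_ in SciKeras), and that inner object is what save_model expects.

```python
import os
import tempfile

import joblib
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# Stand-in estimator; in the question this would be the Keras wrapper.
grid_search = GridSearchCV(
    estimator=LogisticRegression(max_iter=1000),
    param_grid={"C": [0.1, 1.0]},
    scoring="accuracy",
    cv=3,
)
grid_search.fit(X, y)

# The best model, refitted on all the data (refit=True is the default).
best_model = grid_search.best_estimator_

# The search object itself can be persisted with joblib and loaded back.
path = os.path.join(tempfile.mkdtemp(), "grid_search.joblib")
joblib.dump(grid_search, path)
restored = joblib.load(path)
```

joblib works for plain scikit-learn estimators; whether it also works for a Keras wrapper depends on the wrapper, so treat that part as an assumption and prefer save_model on the inner Keras model.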
Related
I use LinearSVC for a multi-label classification problem. Since LinearSVC does not provide a predict_proba method, I decided to use CalibratedClassifierCV to scale the decision function into [0, 1] probabilities.
from sklearn.svm import LinearSVC
from sklearn.calibration import CalibratedClassifierCV
classifier = CalibratedClassifierCV(LinearSVC(class_weight = 'balanced', max_iter = 100000))
classifier.fit(X_train, y_train)
However, I also need to access the weights coef_, but classifier.base_estimator.coef_ raises the following error:
AttributeError: 'LinearSVC' object has no attribute 'coef_'
I thought classifier.base_estimator returned the calibrated classifier and gave access to all its attributes. Thanks in advance for explaining what I misunderstood.
I have several models and I am trying to fit them all.
At the moment I have tried putting them in a dictionary and fitting them:
dictionary_of_models = {'catboost':CatBoostClassifier(random_state=0,), 'logistic_regression':LogisticRegression(random_state=0)}
for model in dictionary_of_models.keys():
    print(model)
    dictionary_of_models[model]=model.fit(X_train, y_train)
But even though the model name is printed, I receive this error:
model.fit(X_train, y_train)
AttributeError: 'str' object has no attribute 'fit'
What's wrong with the code?
I think that a string is being passed to the fit function instead of a model object, but I don't know how to create a model from a dictionary other than the way I tried.
The problem is that you tried to apply fit to the name you gave the model. You have to fit the model, not its name.
dictionary_of_models = {'catboost':CatBoostClassifier(random_state=0,),
                        'logistic_regression':LogisticRegression(random_state=0)}
for name, model in dictionary_of_models.items():
    print(name)
    # index by the name (the key), not by the model object
    dictionary_of_models[name] = model.fit(X_train, y_train)
Complementing Prune's answer, I believe you can avoid items() by iterating over the keys directly. Whichever loop you use, index the dictionary with its key (the name), not with the model object: using the model itself as the index does not update the existing entry.
Please try:
for model in dictionary_of_models:
    print(model)
    dictionary_of_models[model] = dictionary_of_models[model].fit(X_train, y_train)
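For completeness, a self-contained sketch of the same idea. The iris data and the DecisionTreeClassifier are stand-ins chosen so the snippet runs without CatBoost (they are not from the question). Since fit() returns the fitted estimator itself, a dict comprehension keyed by name is enough.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier

# Stand-in training data (assumption; replace with your own X_train, y_train).
X_train, y_train = load_iris(return_X_y=True)

models = {
    "decision_tree": DecisionTreeClassifier(random_state=0),
    "logistic_regression": LogisticRegression(random_state=0, max_iter=1000),
}

# fit() returns the estimator, so the fitted models can be collected by name
# without assigning into the dictionary that is being iterated over.
fitted_models = {
    name: model.fit(X_train, y_train) for name, model in models.items()
}

for name, model in fitted_models.items():
    print(name, model.score(X_train, y_train))
```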
I'm using sklearn's linear SVM classifier, LinearSVC.
I don't use it directly; I wrap it in CalibratedClassifierCV to get probabilities at prediction time, like this:
model = CalibratedClassifierCV(LinearSVC(random_state=0))
After fitting the model, I tried to get coef_ to print the top features, following the post "Visualising Top Features in Linear SVM with Scikit Learn and Matplotlib", but I got this error:
coef = classifier.coef_.ravel()
AttributeError: 'CalibratedClassifierCV' object has no attribute 'coef_'
How can I get coef_ when the classifier is wrapped in a calibrator? I'm not tied to this approach, so if there is another way to get the feature importances, it would be welcome.
coef_ is not an attribute of CalibratedClassifierCV; however, it is an attribute of the base estimator, which is a LinearSVC in your case. You can access the fitted base estimators via calibrated_classifiers_, which is a list of the fitted models (its length depends on your cv value). Here is some sample code you can adapt to your needs.
from sklearn import svm, datasets
from sklearn.model_selection import GridSearchCV
from sklearn.calibration import CalibratedClassifierCV
from sklearn.svm import LinearSVC
iris = datasets.load_iris()
model = CalibratedClassifierCV(LinearSVC(random_state=0))
model.fit(iris.data, iris.target)
model.calibrated_classifiers_
[<sklearn.calibration._CalibratedClassifier at 0x7f15d0c57550>,
<sklearn.calibration._CalibratedClassifier at 0x7f15d0c57c18>,
<sklearn.calibration._CalibratedClassifier at 0x7f15d0aec080>]
In this case cv is three, so three models were built; I simply loop through them and take the average.
coef_avg = 0
for i in model.calibrated_classifiers_:
    coef_avg = coef_avg + i.base_estimator.coef_
coef_avg = coef_avg/len(model.calibrated_classifiers_)
array([[ 0.16464871, 0.45680981, -0.77801375, -0.4170196 ],
[ 0.1238834 , -0.89117967, 0.35451826, -0.89231957],
[-0.83826029, -0.9237139 , 1.30772955, 1.67592916]])
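A variant of the averaging loop that should also run on recent scikit-learn releases. As far as I know, the inner attribute was renamed from base_estimator to estimator around version 1.2, so this sketch tries both names. The helper fitted_svc is a name I made up for illustration, and cv=3 is set explicitly because the default number of folds has changed between versions.

```python
import numpy as np
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import load_iris
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)
model = CalibratedClassifierCV(LinearSVC(random_state=0, max_iter=10000), cv=3)
model.fit(X, y)


def fitted_svc(calibrated):
    # Hypothetical helper: newer releases expose `estimator`,
    # older ones expose `base_estimator`.
    inner = getattr(calibrated, "estimator", None)
    if inner is None:
        inner = calibrated.base_estimator
    return inner


# One coefficient matrix per fold, averaged element-wise.
coefs = [fitted_svc(c).coef_ for c in model.calibrated_classifiers_]
coef_avg = np.mean(coefs, axis=0)
print(coef_avg.shape)
```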
Note: Starting from sklearn version 0.24, CalibratedClassifierCV constructor exposes an ensemble argument, that, if set to False (assuming cv is not set to "prefit"), makes CalibratedClassifierCV expose only one calibrated classifier trained using all training data. This means we no longer need to loop over all calibrated_classifiers_ at prediction time:
model = CalibratedClassifierCV(LinearSVC(random_state=0), ensemble=False)
model.fit(iris.data, iris.target)
model.calibrated_classifiers_
# Returns a list with one element, [<sklearn.calibration._CalibratedClassifier at 0x7f15d0c57550>]
(using an example above, given by Parthasarathy)
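With ensemble=False the coefficients can be read from the single fitted estimator, with no averaging. A small sketch, assuming sklearn 0.24 or later; the getattr fallback covers what I understand to be the rename from base_estimator to estimator in newer releases.

```python
from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import load_iris
from sklearn.svm import LinearSVC

X, y = load_iris(return_X_y=True)
model = CalibratedClassifierCV(
    LinearSVC(random_state=0, max_iter=10000), ensemble=False
)
model.fit(X, y)

# Exactly one calibrated classifier, trained on all the data.
(calibrated,) = model.calibrated_classifiers_

# Attribute name differs between releases (assumption: one of the two exists).
inner = getattr(calibrated, "estimator", None)
if inner is None:
    inner = calibrated.base_estimator

coef = inner.coef_
print(coef.shape)
```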
I am using Python with sklearn and would like to get a list of the available hyperparameters for a model. How can this be done? Thanks
This needs to happen before I initialize the model. When I try to use
model.get_params()
I get this
TypeError: get_params() missing 1 required positional argument: 'self'
This should do it: estimator.get_params(), where estimator is an instance of your model.
To use it on a model you can do the following:
from sklearn.ensemble import RandomForestRegressor
reg = RandomForestRegressor()
params = reg.get_params()
# do something...
reg.set_params(**params)  # set_params expects keyword arguments
reg.fit(X, y)
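A short runnable sketch of the round trip, to show what get_params returns and why the dictionary has to be unpacked when it is passed back. The value 50 for n_estimators is an arbitrary choice for illustration.

```python
from sklearn.ensemble import RandomForestRegressor

reg = RandomForestRegressor()

# get_params returns a plain dict mapping constructor argument names to values.
params = reg.get_params()
params["n_estimators"] = 50  # arbitrary example value

# set_params only accepts keyword arguments, so unpack the dict with **.
reg.set_params(**params)

# Passing the dict positionally is rejected.
try:
    reg.set_params(params)
except TypeError as exc:
    print("TypeError:", exc)
```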
EDIT:
To get the model hyperparameters before you instantiate the class:
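import inspect
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
models = [RandomForestRegressor, LinearRegression]
for m in models:
    # originally inspect.getargspec(m.__init__).args; getargspec was removed in Python 3.11
    spec = inspect.getfullargspec(m.__init__)
    hyperparams = spec.args + spec.kwonlyargs  # includes 'self'
    print(hyperparams) # Do something with them here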
The model hyperparameters are passed to the constructor in sklearn, so we can use the inspect module to see which constructor parameters are available, and thus the hyperparameters. You may need to filter out some arguments that aren't specific to the model, such as self and n_jobs.
As of May 2021:
(Building on sudo's answer)
# To get the model hyperparameters before you instantiate the class
import inspect
from sklearn.linear_model import LinearRegression
models = [LinearRegression]
for m in models:
    hyperparams = inspect.signature(m.__init__)
    print(hyperparams)
#>>> (self, *, fit_intercept=True, normalize=False, copy_X=True, n_jobs=None)
Using inspect.getargspec(m.__init__).args, as suggested by sudo in the accepted answer, generated the following warning:
DeprecationWarning: inspect.getargspec() is deprecated since Python 3.0,
use inspect.signature() or inspect.getfullargspec()
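If you want the names and defaults as data rather than a printed signature, something like the following should work. constructor_params is a helper name I made up for illustration; it only relies on inspect.signature and on sklearn estimators declaring their hyperparameters explicitly in __init__.

```python
import inspect

from sklearn.linear_model import LinearRegression


def constructor_params(cls):
    """Map each constructor argument (except self) to its default value."""
    signature = inspect.signature(cls.__init__)
    return {
        name: parameter.default
        for name, parameter in signature.parameters.items()
        if name != "self"
    }


# No instance is created here; only the class is inspected.
params = constructor_params(LinearRegression)
print(sorted(params))
```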
If you happen to be looking at CatBoost, try .get_all_params() instead of get_params().
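estimator._get_param_names() returns all available hyperparameters for a given estimator (model). It is a class method, so it can be called on the class before the model is instantiated. Note that the leading underscore marks it as private API.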
from sklearn.svm import SVR
from sklearn.ensemble import RandomForestRegressor
SVR._get_param_names()
['C',
'cache_size',
'coef0',
'degree',
'epsilon',
'gamma',
'kernel',
'max_iter',
'shrinking',
'tol',
'verbose']
RandomForestRegressor._get_param_names()
['bootstrap',
'ccp_alpha',
'criterion',
'max_depth',
'max_features',
'max_leaf_nodes',
'max_samples',
'min_impurity_decrease',
'min_samples_leaf',
'min_samples_split',
'min_weight_fraction_leaf',
'n_estimators',
'n_jobs',
'oob_score',
'random_state',
'verbose',
'warm_start']
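If you would rather stay on the public API, a default instance gives the same names through get_params. This is a sketch that assumes the estimator can be constructed with no arguments, which is the case for SVR; the instance is created but never fitted.

```python
from sklearn.svm import SVR

# deep=False keeps only the estimator's own constructor arguments.
param_names = sorted(SVR().get_params(deep=False))
print(param_names)
```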
I have a tensorflow contrib.learn.DNNRegressor that I have trained as part of the following code snippet:
regressor = tf.contrib.learn.DNNRegressor(
    feature_columns=fc,
    hidden_units=hu_array,
    optimizer=tf.train.AdamOptimizer(
        learning_rate=0.001,
    ),
    enable_centered_bias=False,
    activation_fn=tf.tanh,
    model_dir="./models/my_model/",
)
regressor.fit(x=training_features, y=training_labels, steps=10000)
The trained network performs quite well, and I'd like to use it as a part of some other code, on another machine. I have tried copying over the models/my_model directory, and constructing a new DNNRegressor pointing just at the model_dir, but it requires that I supply feature_columns and hidden_units definitions. Shouldn't that information be available via the snapshots stored in model_dir? Is there a better way to save/recover a trained model which is performing well, to be used as a predictor, without having to separately save the feature_columns and hidden_units?
I came up with something workable. It's not ideal, but it gets the job done. If anyone has a better idea, I am all ears.
I converted my kwargs for DNNRegressor into a dict and used the ** operator. Then I was able to pickle the kwargs dict and reconstruct the DNNRegressor from it. E.g.:
import pickle
reg_args = {'feature_columns': fc, 'hidden_units': hu_array, ...}
regressor = tf.contrib.learn.DNNRegressor(**reg_args)
with open('reg_args.pkl', 'wb') as f:
    pickle.dump(reg_args, f)
Later on, I reconstruct via:
with open('reg_args.pkl', 'rb') as f:
    reg_args = pickle.load(f)
# On another machine and so my model dir path changed:
reg_args['model_dir'] = NEW_MODEL_DIR
regressor = tf.contrib.learn.DNNRegressor(**reg_args)
It worked well. I'm sure there must be a better way but for now if someone is trying to figure out a workaround for tf.contrib.learn, this is a solution.
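The pattern itself does not depend on TensorFlow, so here is a runnable sketch with a made-up stand-in class in place of DNNRegressor. StandInRegressor, the hidden unit sizes, and both paths are hypothetical and only serve to show the save and rebuild steps.

```python
import os
import pickle
import tempfile


class StandInRegressor:
    """Hypothetical stand-in for an estimator that needs its constructor args."""

    def __init__(self, hidden_units, model_dir):
        self.hidden_units = hidden_units
        self.model_dir = model_dir


# Keep the constructor arguments in one dict so they can be saved as a unit.
reg_args = {"hidden_units": [64, 32], "model_dir": "./models/my_model/"}
regressor = StandInRegressor(**reg_args)

args_path = os.path.join(tempfile.mkdtemp(), "reg_args.pkl")
with open(args_path, "wb") as f:
    pickle.dump(reg_args, f)

# Later, possibly on another machine: load the args, fix the path, rebuild.
with open(args_path, "rb") as f:
    loaded_args = pickle.load(f)
loaded_args["model_dir"] = "/new/location/my_model/"  # hypothetical path
restored = StandInRegressor(**loaded_args)
```

This only works if everything in the dict can be pickled, which may not hold for every feature column or optimizer object.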
When training
You call DNNRegressor(..., model_dir) and then call the fit() and evaluate() methods.
When testing
You call DNNRegressor(..., model_dir) and can then call the predict() method. The estimator will find the trained model in model_dir and load its parameters.
Reference
Issue #3340 of TF