I trained my data with TPOT, and then I wrote a function to evaluate the best pipeline with different metrics. I want to automate the whole procedure: TPOT returns the best pipeline, and then the different metrics are calculated. The problem is that sometimes the optimal model doesn't have a predict_proba method (for example ElasticNetCV or AdaBoostRegressor), so in my evaluation function I have to handle the two different prediction methods separately. Something like:
if trained_model has predict_proba:
    # do something
else:  # trained_model does not have predict_proba
    # do something else
Imagine code like this, where trained_model is the best pipeline from TPOT:
prob_test = trained_model.predict_proba(xtest)
If the trained model is an AdaBoostRegressor, this returns the error:
'AdaBoostRegressor' object has no attribute 'predict_proba'
which is true. I tried with while and with if, but neither of them works. I might be forgetting something here.
You'll want to use hasattr to check if the function is there:
if hasattr(trained_model, 'predict_proba'):
return trained_model.predict_proba(...)
# otherwise do something else...
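For instance, a minimal sketch of an evaluation helper built around that check (the dictionary keys and the metrics you actually compute are placeholders, not part of the original question):

def evaluate(trained_model, xtest, ytest):
    # Point predictions work for every fitted pipeline.
    results = {'predictions': trained_model.predict(xtest)}
    if hasattr(trained_model, 'predict_proba'):
        # Probability-based metrics (log loss, ROC AUC, ...) only when available.
        results['probabilities'] = trained_model.predict_proba(xtest)
    # ... compute whatever metrics you need from results and ytest ...
    return results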
I need to set the attribute out_activation_ = 'logistic' in an MLPRegressor from sklearn. This attribute is supposed to take the names of the relevant activation functions ('relu', 'logistic', 'tanh', etc.). The problem is that I cannot find a way to control this attribute and set it to the preferred function. Please, if someone has faced this problem before or knows something more, I would appreciate some help.
I have tried to set the attribute on MLPRegressor(): error. I have tried the set_params() method: error. I have tried to change it manually through the Variable Explorer: error. Finally, I used MLPName.out_activation_ = 'logistic', but again, when I called the fit() method it changed back to 'identity'.
CODE:

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
from sklearn.neural_network import MLPRegressor

X_train2, X_test2, y_train2, y_test2 = train_test_split(
    signals_final, masks, test_size=0.05, random_state=17)

scaler2 = MinMaxScaler()
X_train2 = scaler2.fit_transform(X_train2)
X_test2 = scaler2.transform(X_test2)

MatchingNetwork = MLPRegressor(alpha=1e-15, hidden_layer_sizes=(300,),
                               random_state=1, max_iter=20000,
                               activation='logistic', batch_size=64)

MLPRegressor().out_activation_ = 'logistic'
You cannot. The output activation is determined by the problem type at fit time. For regression, the identity activation is used; see the User Guide.
Here is the relevant bit of source code. You might be able to hack it by fitting one iteration, changing the attribute, then using partial_fit, since then this _initialize method won't be called again; but it's likely to break when back-propagating.
Generally I think the sklearn neural networks aren't designed to be super flexible: there are other packages that play that role, are more efficient (use GPUs), etc.
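If you do want to try that hack, here is a rough, untested sketch of the idea on stand-in data (fit for one iteration so _initialize runs, overwrite the attribute, then continue with partial_fit, which does not re-initialize); keep in mind the caveat above that back-propagation still assumes the identity output:

import numpy as np
from sklearn.neural_network import MLPRegressor

X, y = np.random.rand(200, 10), np.random.rand(200)  # stand-in data

reg = MLPRegressor(hidden_layer_sizes=(300,), max_iter=1)
reg.fit(X, y)                     # one iteration; _initialize() sets out_activation_
reg.out_activation_ = 'logistic'  # overwrite the output activation by hand
reg.partial_fit(X, y)             # _initialize() is not called again here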
As the title states, I'm wondering how I can access the privileged 'training' argument when I'm using the functional API.
So if I use subclassing, I can write something like:
class MyLayer(tf.keras.layers.Layer):
    def __init__(self):
        ...
        self.BN = tf.keras.layers.BatchNormalization()

    def call(self, inputs, training=None):
        return self.BN(inputs, training=training)
So I can control how my batch norm behaves during training and prediction. But if I want to use the functional API:
input = tf.keras.Input(someshape)
normalized = tf.keras.layers.BatchNormalization()(input)
model = tf.keras.Model(inputs=input, outputs=normalized)
Now I can't really set the privileged 'training' argument for my batch norm anymore. I love the functional API, it's just really fun to use, but having to build around this is quite often a dealbreaker. I feel like I must be missing some important idea on how to solve this.
I'm aware that I could create a tf.keras.Input which could hold the 'training' argument, but this would change it from a keyword argument to some element of a list, which creates very inconsistent code. Any smarter solution to this?
Edit: I should make it clear that I'm looking for a general idea that can be used for the 'training' argument, not just a way to tackle BatchNormalization in particular.
When you instantiate the model with model = tf.keras.Model(inputs=input, outputs=normalized), the model has not yet been built. You need to call the build method, which usually happens either when you do everything by hand with a gradient tape or when you first call the fit method; at that point the weights are initialized. From then on, using the fit method or calling the model with output_tensors = mymodel(input_tensors, training=True) sets the training flag to True, and conversely the predict method or output_tensors = mymodel(input_tensors, training=False) sets it to False (which is explicit when you call the model directly).
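As a minimal sketch with made-up shapes, the flag is simply forwarded when you call the functional model directly:

import tensorflow as tf

inputs = tf.keras.Input(shape=(16,))
outputs = tf.keras.layers.BatchNormalization()(inputs)
model = tf.keras.Model(inputs=inputs, outputs=outputs)

x = tf.random.normal((4, 16))
y_train_mode = model(x, training=True)   # batch statistics are used and updated
y_infer_mode = model(x, training=False)  # stored moving averages are used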
I train my classifier using DeepPavlov, and when I then call the trained model on some sample, the function returns only one class label, but I want to get the probabilities of every class. I did not find any function parameters that would let me get probabilities.
Has anyone encountered such a problem? Thanks!
from deeppavlov import configs, train_model
model = train_model(configs.classifiers.intents_snips)
model(['Some sentence'])
I want the output to be something like an np.array whose length equals the number of classes, but the current output is a single label like ['PlayMusic'].
You can change the chainer.out parameter of your config to ["y_pred_probas"] before inferring, but it will most likely also require you to update train.metrics if you want to train your model with the same config.
Alternatively you can call your model like
model.compute(['Some sentence'], targets=["y_pred_probas"])
And to get the class indices you can run
dict(model['classes_vocab'])
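Putting the two together, a small sketch (the exact structure of the returned objects may vary between DeepPavlov versions):

from deeppavlov import configs, train_model

model = train_model(configs.classifiers.intents_snips)

# Ask the pipeline for the probability vector instead of the argmax label.
probas = model.compute(['Some sentence'], targets=["y_pred_probas"])

# Map positions in that vector back to class names via the vocabulary.
classes = dict(model['classes_vocab'])
print(probas, classes)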
I cannot find an answer to this question in the TensorFlow documentation. I once read that one should add losses from tf.nn functions, but that this isn't necessary for functions from tf.losses. Therefore:
When should I use tf.losses.add_loss()?
Example:
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=ground_truth, logits=predictions))
tf.losses.add_loss(loss)  # <-- when is this required?
Thank you.
One would use this method to register a loss defined by the user.
Namely, if you have created a tensor that defines your loss, for example my_loss = tf.reduce_mean(output), you can use this method to add it to the loss collection. You might want to do that if you are not tracking all your losses manually, for example if you are using a method like tf.losses.get_total_loss().
The implementation of tf.losses.add_loss is quite straightforward:
def add_loss(loss, loss_collection=ops.GraphKeys.LOSSES):
if loss_collection and not context.executing_eagerly():
ops.add_to_collection(loss_collection, loss)
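As a usage sketch (assuming TF 1.x graph mode, with hypothetical placeholder shapes): losses built with tf.nn are plain tensors and are not registered anywhere, so you add them yourself if you rely on get_total_loss(), whereas losses created through tf.losses.* register themselves:

import tensorflow as tf  # TF 1.x

labels = tf.placeholder(tf.int64, shape=[None])
logits = tf.placeholder(tf.float32, shape=[None, 10])

# A tf.nn loss is a plain tensor; it is not added to GraphKeys.LOSSES.
ce = tf.reduce_mean(
    tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits))
tf.losses.add_loss(ce)

# Now get_total_loss() returns ce plus any regularization losses.
total_loss = tf.losses.get_total_loss()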
Once a scikit-learn classifier is trained:
import sklearn.cluster
clf = sklearn.cluster.KMeans()
clf.fit(X)
there are (at least) two options to obtain values of its parameters. Specifically,
By referring to a parameter name with a trailing underscore:
clf.n_clusters_
From a dictionary obtained with get_params():
ps = clf.get_params()
ps['n_clusters']
Which of these approaches is the preferred one?
I would say clf.get_params(), because you don't always know which parameters are available for a given estimator, and this method returns everything unless you know exactly what you are looking for. It also has a deep argument which, when set to True, "...will return the parameters for this estimator and contained subobjects that are estimators".
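For example, a small sketch on made-up data contrasting the two (get_params returns the constructor hyperparameters, while the trailing-underscore attributes are set during fit):

import numpy as np
import sklearn.cluster

X = np.random.rand(100, 2)
clf = sklearn.cluster.KMeans(n_clusters=3).fit(X)

# Constructor hyperparameters, available before and after fitting:
params = clf.get_params(deep=True)
print(params['n_clusters'])        # 3

# Attributes learned during fit carry the trailing underscore:
print(clf.cluster_centers_.shape)  # (3, 2)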