What loss function does DNNRegressor use? - python

I am using DNNRegressor to train my model. I searched the documentation for the loss function used by this wrapper, but I could not find it. Also, is it possible to change that loss function?
Thank you for your suggestions.

It uses L2 loss (mean squared error) as defined in target_column.py:
def regression_target(label_name=None,
                      weight_column_name=None,
                      target_dimension=1):
  """Creates a _TargetColumn for linear regression.

  Args:
    label_name: String, name of the key in label dict. Can be null if label
        is a tensor (single headed models).
    weight_column_name: A string defining feature column name representing
        weights. It is used to down weight or boost examples during training. It
        will be multiplied by the loss of the example.
    target_dimension: dimension of the target for multilabels.

  Returns:
    An instance of _TargetColumn
  """
  return _RegressionTargetColumn(loss_fn=_mean_squared_loss,
                                 label_name=label_name,
                                 weight_column_name=weight_column_name,
                                 target_dimension=target_dimension)
Currently the API does not support changing this. However, since the library is open source, you can always modify the constructor to call a different function internally, with a different loss.
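As a rough illustration of what an L2 loss with a per-example weight computes, here is a minimal plain-Python sketch. The function name weighted_mean_squared_loss and the choice to average over the number of examples are assumptions made for illustration; this is not the TensorFlow implementation of _mean_squared_loss.

```python
def weighted_mean_squared_loss(predictions, labels, weights=None):
    """Average of weight * (prediction - label)**2 over all examples.

    Hypothetical helper for illustration only; the weights play the role
    described for weight_column_name (each one multiplies its example's loss).
    """
    if weights is None:
        weights = [1.0] * len(labels)
    per_example = [
        w * (p - y) ** 2 for p, y, w in zip(predictions, labels, weights)
    ]
    return sum(per_example) / len(per_example)


# Unweighted: squared errors are 1.0 and 4.0.
loss = weighted_mean_squared_loss([2.0, 4.0], [1.0, 2.0])
# Down-weighting the second example halves its contribution.
weighted = weighted_mean_squared_loss([2.0, 4.0], [1.0, 2.0], weights=[1.0, 0.5])
```

Down-weighting an example shrinks its share of the total, which is the effect the docstring describes for the weight column.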

Related

Custom reduction of losses within each batch in Keras

I am using keras for tensorflow in Python. I have a custom loss function that returns a single number for each sample in a batch (so a vector with length = batch size). How can I also specify a custom reduction method to aggregate these sample losses into a single loss for the entire batch? Is it acceptable to include this reduction within the custom loss function and have this function return just a single scalar rather than a vector of losses?
It really depends on your application and goal. A very common approach is to apply reduce_mean to the per-sample losses of the batch. Some also use reduce_sum, which of course makes the loss value depend on the batch size. A general (and maybe unnecessarily complicated) approach is to pass the per-sample losses to a separate function that reduces them to a single value. Let's call it reducer. In your loss function, you can call it in the last line, right before returning:
class my_loss(keras.losses.Loss):
    def __init__(self, inputs):
        super().__init__()
        # a bunch of assignments
        # _get_reducer_function is a placeholder; a plain mean function works too
        self.reducer = self._get_reducer_function(inputs)

    def call(self, y_true, y_pred):
        y_batch = ...  # per-sample losses computed from y_true and y_pred
        return self.reducer(y_batch)

    def get_config(self):
        return {'input': 1}
Of course, you don't need to write anything this complicated, but it should give you an idea of how to do it. Also, you can simply add sample weights if you need them.
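To make the difference between the two reductions concrete, here is a small framework-free sketch. The names reduce_mean, reduce_sum and batch_loss are hypothetical stand-ins written in plain Python (they are not the TensorFlow ops), and the absolute error is just an example per-sample loss.

```python
def reduce_mean(per_sample_losses):
    # Average over the batch: independent of the batch size.
    return sum(per_sample_losses) / len(per_sample_losses)


def reduce_sum(per_sample_losses):
    # Sum over the batch: grows with the batch size.
    return sum(per_sample_losses)


def batch_loss(y_true, y_pred, reducer=reduce_mean):
    # One loss value per sample (absolute error here), then a single scalar.
    per_sample = [abs(t - p) for t, p in zip(y_true, y_pred)]
    return reducer(per_sample)


y_true, y_pred = [1.0, 2.0], [2.0, 4.0]
mean_small = batch_loss(y_true, y_pred)
sum_small = batch_loss(y_true, y_pred, reducer=reduce_sum)
# Repeating the same samples doubles the batch size.
mean_big = batch_loss(y_true * 2, y_pred * 2)
sum_big = batch_loss(y_true * 2, y_pred * 2, reducer=reduce_sum)
```

Doubling the batch leaves the mean-reduced loss unchanged and doubles the sum-reduced one, which matters when the learning rate was tuned for a particular batch size.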

What are all the metrics that can be used in compiling tf.keras models?

To compile a tf.keras model one would go:
model.compile(
    optimizer='sgd',  # or 'adam', etc.
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy', 'mae']
)
What other metrics can be used as values for the 'metrics' parameter? I looked for them in the documentation without success. I'm looking for the entire list.
Edit: I know one could use the explicit form, e.g.
metrics=[tf.keras.metrics.MeanAbsoluteError()]  # instead of "mae"
but the short tags are more useful later for plotting:
lossplot = df.plot(y="mae", title="Loss vs Epochs", legend=False)
lossplot.set(xlabel="Epochs", ylabel="Loss")
From Keras Model training APIs page:
metrics: List of metrics to be evaluated by the model during training
and testing. Each of this can be a string (name of a built-in
function), function or a tf.keras.metrics.Metric instance.[...]
and from Keras Metrics page:
All built-in metrics may also be passed via their string identifier
(in this case, default constructor argument values are used, including
a default metric name)
You can find the entire list of built-in metrics in the documentation link you found, under the section Functions. The string identifier is basically the function name as a string.

Setting up an optimization solver on top of a neural network model

I have a trained neural network model developed using the Keras framework in a Jupyter notebook. It is a regression problem, where I am trying to predict an output variable using some 14 input variables or features.
As a next step, I would like to minimize my output and want to determine what configuration/values these 14 inputs would take to get to the minimal value of the output.
So, essentially, I would like to pass the trained model object as my objective function in a solver, and also a bunch of constraints on the input variables to optimize/minimize the objective.
What is the best Python solver that can help me get there?
Thanks in advance!
So you already have your trained model, which we can think of as f(x) = y.
The standard SciPy method to minimize this is appropriately named scipy.optimize.minimize.
To use it, you just need to adapt your f(x) = y function to fit the API that SciPy uses. That is, the first function argument is the list of params to optimize over. Any further arguments are optional and receive the values that are fixed for the entire optimization (i.e. your trained model); SciPy unpacks the args tuple into them.
def score_trained_model(params, model):
    # SciPy calls this as fun(params, *args), so the model passed via
    # args=(trained_model,) arrives here as the second argument.
    # Run the model on the params, return the output as a single number.
    # model_predict is a placeholder for your own prediction wrapper.
    return model_predict(model, params)
With this, plus an initial guess, you can use the minimize function now:
import scipy.optimize

# Nelder-Mead is my go-to to start with.
# But it doesn't take advantage of the gradient.
# Something that does, e.g. BFGS, may perform better for your case.
method = 'Nelder-Mead'
# All zeros is fine, but improving this initial guess can help.
guess_params = [0] * 14
# Given a trained model, optimize the inputs to minimize the output.
# The result object holds the optimized inputs in optim_params.x.
optim_params = scipy.optimize.minimize(
    score_trained_model,
    guess_params,
    args=(trained_model,),
    method=method,
)
It is possible to supply constraints and bounds to some of the optimization methods. For Nelder-Mead that is not supported, but you can just return a very large objective value when constraints are violated.
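The penalty idea can be sketched as a small wrapper around the objective. Everything here is illustrative: make_penalized_objective, toy_objective, the bounds and the penalty value of 1e12 are assumptions, and toy_objective merely stands in for a call to the trained model.

```python
def make_penalized_objective(objective, bounds, penalty=1e12):
    """Wrap objective so that out-of-bounds inputs score a huge value."""
    def penalized(params, *args):
        for value, (low, high) in zip(params, bounds):
            if value < low or value > high:
                # Constraint violated: steer the optimizer away.
                return penalty
        return objective(params, *args)
    return penalized


def toy_objective(params):
    # Stand-in for the model prediction; its unconstrained minimum is at 3.0.
    return sum((p - 3.0) ** 2 for p in params)


bounds = [(0.0, 2.0)] * 2
penalized = make_penalized_objective(toy_objective, bounds)
inside = penalized([1.0, 2.0])    # within bounds: the real objective value
outside = penalized([3.0, 3.0])   # outside bounds: the penalty value
```

The wrapped function can be passed to scipy.optimize.minimize in place of the original objective. A hard penalty like this makes the objective discontinuous, so it suits derivative-free methods such as Nelder-Mead better than gradient-based ones.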
Older answer (kept for reference; the OP wants to optimize the inputs, x, not the hyperparameters):
It sounds like you want to do hyperparameter optimization. My Python library of choice is hyperopt: https://github.com/hyperopt/hyperopt
Given that you already have some training and scoring code, for example:
def train_and_score(args):
    # Unpack args and train your model.
    model = make_model(**args)
    trained = train_model(model, **args)
    # Return the output you want to minimize.
    return score_model(trained)
You can easily use hyperopt to tune parameters like the learning rate, dropout, or choice of activations:
import numpy as np
from hyperopt import Trials, fmin, hp, tpe, space_eval

space = {
    'lr': hp.loguniform('lr', np.log(0.01), np.log(0.5)),
    'dropout': hp.uniform('dropout', 0, 1),
    'activation': hp.choice('activation', ['relu', 'sigmoid']),
}
# Minimize the training score over the space.
trials = Trials()
best = fmin(train_and_score, space, trials=trials, algo=tpe.suggest, max_evals=100)
# Print details about the best results and hyperparameters.
print(best)
print(space_eval(space, best))
There are also libraries that will help you directly integrate this with Keras. A popular choice is hyperas: https://github.com/maxpumperla/hyperas

When should tf.losses.add_loss() be used in TensorFlow?

I cannot find an answer to this question in the TensorFlow documentation. I once read that one should add losses from tf.nn functions but it isn't necessary for functions from tf.losses. Therefore:
When should I use tf.losses.add_loss()?
Example:
loss = tf.reduce_mean(tf.nn.sparse_softmax_cross_entropy_with_logits(
    labels=ground_truth, logits=predictions))
tf.losses.add_loss(loss)  # <-- when is this required?
Thank you.
One would use this method to register a user-defined loss.
Namely, if you have created a tensor that defines your loss, for example my_loss = tf.reduce_mean(output), you can use this method to add it to the loss collection. You might want to do that if you are not tracking all your losses manually, for example if you are using a method like tf.losses.get_total_loss().
The implementation of tf.losses.add_loss is very straightforward:
def add_loss(loss, loss_collection=ops.GraphKeys.LOSSES):
    if loss_collection and not context.executing_eagerly():
        ops.add_to_collection(loss_collection, loss)

Alternatives for loss functions in python CNTK

I have created a sequential model in CNTK and pass this model into a loss function like the following:
ce = cross_entropy_with_softmax(model, labels)
As mentioned here, and since I have a multilabel classifier, I want to use a proper loss function. The problem is that I cannot find any documentation for these loss functions in Python. Is there any suggestion or sample code for this requirement?
I should note that I found these alternatives (logistic and weighted logistic) in the BrainScript language, but not in Python.
"my data has more than one label (three label) and each label has more than two values (30 different values)"
Do I understand right, you have 3 network outputs and associated labels, and each one is a 1-in-30 classifier? Then it seems you can just add three cross_entropy_with_softmax() values. Is that what you want?
E.g. if the model function returns a triple (ending in something like return combine([z1, z2, z3])), then your criterion function that you pass to Trainer could look like this (if you don't use Python 3, the syntax is a little different):
from cntk import Function, Trainer, cross_entropy_with_softmax
from cntk.layers.typing import Tensor, SparseTensor

@Function
def my_criterion(input : Tensor[input_dim], labels1 : SparseTensor[30],
                 labels2 : SparseTensor[30], labels3 : SparseTensor[30]):
    z1, z2, z3 = my_model(input).outputs
    loss = cross_entropy_with_softmax(z1, labels1) + \
           cross_entropy_with_softmax(z2, labels2) + \
           cross_entropy_with_softmax(z3, labels3)
    return loss

learner = ...
trainer = Trainer(None, my_criterion, learner)
# in MB loop:
input_mb, L1_mb, L2_mb, L3_mb = my_next_minibatch()
trainer.train_minibatch(my_criterion.argument_map(input_mb, L1_mb, L2_mb, L3_mb))
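To show numerically what adding three softmax cross-entropy terms amounts to, here is a plain-Python sketch that does not use CNTK. The helper names softmax_cross_entropy and three_head_loss are hypothetical, and labels are given as class indices rather than sparse one-hot tensors.

```python
import math


def softmax_cross_entropy(logits, label_index):
    # -log(softmax(logits)[label_index]), computed via a stable log-sum-exp.
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label_index]


def three_head_loss(outputs, labels):
    # One 1-in-30 classifier per head; the total loss is the plain sum.
    return sum(softmax_cross_entropy(z, y) for z, y in zip(outputs, labels))


# Three heads with uniform (all-zero) logits over 30 classes.
uniform = [0.0] * 30
total = three_head_loss([uniform, uniform, uniform], [0, 7, 29])
```

With uniform logits each head contributes log(30), so the total is 3 * log(30); a head that predicts its label confidently contributes a value close to zero.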
Update (based on comments below): If you are using a sequential model then you are probably interested in taking a sum over all positions in the sequence of the loss at each position. cross_entropy_with_softmax is appropriate for the per-position loss and CNTK will automatically compute the sum of the loss values over all positions in the sequence.
Note that the terminology multilabel is non-standard here as it is typically referring to problems with multiple binary labels. The wiki page you link to refers to that case which is different from what you are doing.
Original answer (valid for the actual multilabel case): You will want to use binary_cross_entropy or weighted_binary_cross_entropy. (We decided to rename Logistic when porting this to Python). At the time of this writing these operations only support {0,1} labels. If your labels are in (0,1) then you will need to define your loss like this
import cntk as C
# Negated so that the result is a loss to minimize, not a log-likelihood.
my_bce = -(label*C.log(model) + (1-label)*C.log(1-model))
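For reference, the binary cross-entropy with labels anywhere in [0, 1] can be checked on plain numbers. This is a minimal sketch in pure Python, not CNTK code; the function name binary_cross_entropy and the eps clamp are assumptions added for illustration. As a loss to be minimized it is the negative of the log-likelihood.

```python
import math


def binary_cross_entropy(prediction, label, eps=1e-12):
    # Clamp the prediction so that log() never receives exactly 0.
    p = min(max(prediction, eps), 1.0 - eps)
    return -(label * math.log(p) + (1.0 - label) * math.log(1.0 - p))


hard = binary_cross_entropy(0.5, 1.0)      # hard label in {0, 1}
soft = binary_cross_entropy(0.5, 0.3)      # soft label in (0, 1)
good = binary_cross_entropy(0.9, 1.0)      # confident and correct
bad = binary_cross_entropy(0.1, 1.0)       # confident and wrong
```

A confident correct prediction gives a small loss and a confident wrong one a large loss; at a prediction of 0.5 the loss equals log(2) regardless of the label.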
Currently, most operators are in the cntk.ops package and documented here. The only exception is the sequence-related operators, which reside in cntk.ops.sequence.
We have plans to restructure the operator space (without breaking backwards compatibility) to increase discoverability.
For your particular case, cross_entropy_with_softmax seems to be a reasonable choice, and you can find its documentation with examples here. Please also check out this Jupyter Notebook for a complete example.
