I want to create a custom loss function in Keras, where the loss of the current prediction y_pred depends on the prediction for the previous training sample (XXX) and on other parameters h, b, K. The loss function looks like the following, and I don't know how to reference the previous prediction (i.e. replace XXX) in Keras during training.
(for context: it's a loss function for quantile regression + fixed costs)
def custom_loss(y_true, y_pred, h, b, K, XXX):
    if y_pred > XXX:
        F = K
    else:
        F = 0
    loss = h * max(0, y_pred - y_true) + b * max(0, y_true - y_pred) + F
    return loss
Thanks for your help!
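One possible direction (a minimal, untested sketch, not from the original post): keep the previous batch's predictions in a non-trainable tf.Variable and update it inside the loss. The values of h, b, fixed_cost and batch_size below are placeholders, and the sketch assumes a fixed batch size so the assignment shapes match:

import tensorflow as tf

# Placeholder values; h, b and the fixed cost K are assumed given.
h, b, fixed_cost = 1.0, 2.0, 0.5
batch_size = 32  # must stay fixed for the assign below to work

# Non-trainable variable carrying the previous batch's predictions.
prev_pred = tf.Variable(tf.zeros((batch_size, 1)), trainable=False)

def custom_loss(y_true, y_pred):
    # F = K where the prediction exceeds the previous one, else 0.
    F = tf.where(y_pred > prev_pred, fixed_cost, 0.0)
    loss = (h * tf.maximum(0.0, y_pred - y_true)
            + b * tf.maximum(0.0, y_true - y_pred)
            + F)
    # Remember the current predictions for the next batch.
    prev_pred.assign(y_pred)
    return tf.reduce_mean(loss)

Note that F is piecewise constant in y_pred, so it shifts the loss value but contributes no gradient; only the two quantile terms drive the weight updates.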
Related
I am trying to implement logistic regression with Tensorflow. I assume that I have the labels in the form of {-1, 1}. So, I have implemented the decision function and loss function
def cross_entropy(y_pred, y_true):
    return tf.reduce_mean(tf.math.log(1 + tf.math.exp(-y_true * y_pred[:, 0]))) + tf.nn.l2_loss(W)
def logistic_regression(x):
    return tf.matmul(x, W) + b
Is this correct? The loss is nan.
This is an option:
def logistic_regression(x):
    # Apply softmax to normalize the logits to a probability distribution.
    return tf.nn.softmax(tf.matmul(x, W) + b)

def cross_entropy(y_pred, y_true):
    # Encode the label as a one-hot vector.
    y_true = tf.one_hot(y_true, depth=num_classes)
    # Clip prediction values to avoid log(0).
    y_pred = tf.clip_by_value(y_pred, 1e-9, 1.)
    # Compute the cross-entropy.
    return tf.reduce_mean(-tf.reduce_sum(y_true * tf.math.log(y_pred)))
Take a look at this full implementation
https://builtin.com/data-science/guide-logistic-regression-tensorflow-20
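If you want to keep the {-1, 1} label formulation from the question, the nan most likely comes from tf.math.exp overflowing for large margins. A numerically stable equivalent (a sketch; W is the weight matrix from the question) uses tf.nn.softplus, since softplus(x) = log(1 + exp(x)):

def cross_entropy(y_pred, y_true):
    # log(1 + exp(-y * f(x))) == softplus(-y * f(x)), computed without overflow.
    margin = y_true * y_pred[:, 0]
    return tf.reduce_mean(tf.nn.softplus(-margin)) + tf.nn.l2_loss(W)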
I have an image segmentation problem I have to solve in TensorFlow 2.
In particular, I have a training set composed of aerial images paired with their respective masks. In a mask the terrain is colored black and the buildings are colored white. The goal is to predict the masks for the images in the test set.
I use a UNet whose final layer is a Conv2DTranspose with 1 filter and a sigmoid activation. The prediction is made from the output of that sigmoid layer as follows: if y_pred > 0.5 the pixel is a building, otherwise it is background.
I want to implement a dice loss, so I wrote the following function
def dice_loss(y_true, y_pred):
    print("[dice_loss] y_pred=", y_pred, "y_true=", y_true)
    y_pred = tf.cast(y_pred > 0.5, tf.float32)
    y_true = tf.cast(y_true, tf.float32)
    numerator = 2 * tf.reduce_sum(y_true * y_pred)
    denominator = tf.reduce_sum(y_true + y_pred)
    return 1 - numerator / denominator
which I pass to TensorFlow in the following way:
loss = dice_loss
optimizer = tf.keras.optimizers.Adam(learning_rate=config.learning_rate)
metrics = [my_IoU, 'acc']
model.compile(optimizer=optimizer, loss=loss, metrics=metrics)
but at training time TensorFlow throws the following error:
ValueError: No gradients provided for any variable:
The problem is in your loss function (obviously). In particular, it is the following operation:

y_pred = tf.cast(y_pred > 0.5, tf.float32)

This is not a differentiable operation, which results in the gradients being None. Change your loss function to the following and it will work:
def dice_loss(y_true, y_pred):
    print("[dice_loss] y_pred=", y_pred, "y_true=", y_true)
    y_true = tf.cast(y_true, tf.float32)
    numerator = 2 * tf.reduce_sum(y_true * y_pred)
    denominator = tf.reduce_sum(y_true + y_pred)
    return 1 - numerator / denominator
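One caveat worth adding (my note, not part of the original answer): if a batch contains no positive pixels and the network predicts all zeros, the denominator can be zero. A common variant adds a smoothing constant to both terms:

def dice_loss(y_true, y_pred, smooth=1.0):
    y_true = tf.cast(y_true, tf.float32)
    numerator = 2 * tf.reduce_sum(y_true * y_pred) + smooth
    denominator = tf.reduce_sum(y_true + y_pred) + smooth
    return 1 - numerator / denominator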
I have implemented a custom mean absolute error (MAE) loss in LightGBM. The gradient is nonzero, but the loss stays constant. How can that be?
My implementation:
def abs_obj(preds, dtrain):
    y_true = dtrain.get_label()
    a = preds - y_true
    grad = np.sign(a)
    hess = np.zeros(len(a))
    return grad, hess

def abs_eval(preds, dtrain):
    y_true = dtrain.get_label()
    loss = np.abs(preds - y_true).sum()
    return "error", loss, False
A minimal reproducible example: the loss stays constant.
import numpy as np
import pandas as pd
import lightgbm as lgb

dtrain = pd.DataFrame({'x': np.random.rand(100),
                       'y': np.random.rand(100)})
ytrain = dtrain.x + 2 * dtrain.y
dval = dtrain
yval = ytrain
lgb_train = lgb.Dataset(dtrain, ytrain)
lgb_valid = lgb.Dataset(dval, yval)
params = {'objective': None,
          'learning_rate': 30,
          'num_leaves': 33}
clf = lgb.train(params,
                lgb_train,
                valid_sets=[lgb_valid],
                num_boost_round=10,
                verbose_eval=1,
                fobj=abs_obj,
                feval=abs_eval)
For a custom loss in lightgbm, you need a twice differentiable function with a positive second derivative.
To speed up training, LightGBM uses Newton's approximation to find the optimal leaf value:

y = -L' / L''

(see this blog post for details). When the second derivative is zero, or the function is not twice differentiable, this approximation is very wrong. LightGBM does ship built-in objectives that fail this criterion, such as MAE, but for those it uses separate, special implementations.
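One common workaround (my suggestion, not part of the original answer) is to train on a smooth surrogate such as the pseudo-Huber loss, whose second derivative is strictly positive, so Newton's step is well defined:

import numpy as np

def pseudo_huber_obj(preds, dtrain, delta=1.0):
    # Pseudo-Huber: delta^2 * (sqrt(1 + (a/delta)^2) - 1), a smooth MAE proxy.
    y_true = dtrain.get_label()
    a = preds - y_true
    scale = 1.0 + (a / delta) ** 2
    grad = a / np.sqrt(scale)              # first derivative
    hess = 1.0 / (scale * np.sqrt(scale))  # second derivative, always > 0
    return grad, hess

You would pass it through fobj exactly as abs_obj is passed in the question.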
I am currently experimenting with generative adversarial networks in Keras.
As proposed in this paper, I want to use the historical averaging loss function; that is, I want to penalize the change of the network weights.
I am not sure how to implement it in a clever way.
I was implementing the custom loss function according to the answer to this post.
def historical_averaging_wrapper(current_weights, prev_weights):
    def historical_averaging(y_true, y_pred):
        diff = 0
        for i in range(len(current_weights)):
            diff += abs(np.sum(current_weights[i]) - np.sum(prev_weights[i]))
        return K.binary_crossentropy(y_true, y_pred) + diff
    return historical_averaging
The network weights are penalized, but they change after each batch of data.
My first idea was to update the loss function after each batch.
Roughly like this:
prev_weights = model.get_weights()
for i in range(len(data) // batch_size):
    current_weights = model.get_weights()
    model.compile(loss=historical_averaging_wrapper(current_weights, prev_weights),
                  optimizer='adam')
    model.fit(training_data[i*batch_size:(i+1)*batch_size],
              training_labels[i*batch_size:(i+1)*batch_size],
              epochs=1, batch_size=batch_size)
    prev_weights = current_weights
Is this reasonable? That approach seems to be a bit "messy" in my opinion.
Is there another possibility to do this in a "smarter" way?
Maybe by updating the loss function in a data generator and using fit_generator()?
Thanks in advance.
Loss functions are operations on the graph using tensors.
You can define additional tensors in the loss function to hold previous values. This is an example:
import numpy as np
import tensorflow as tf
import tensorflow.keras.backend as K

keras = tf.keras

class HistoricalAvgLoss(object):
    def __init__(self, model):
        self.model = model
        # Create variables (initialized to zero) to hold the previous
        # values of the weights.
        self.prev_weights = []
        for w in model.get_weights():
            self.prev_weights.append(K.variable(np.zeros(w.shape)))

    def loss(self, y_true, y_pred):
        err = keras.losses.mean_squared_error(y_true, y_pred)
        werr = [K.mean(K.abs(c - p))
                for c, p in zip(self.model.trainable_weights, self.prev_weights)]
        # Update the stored weights, but only in the training phase.
        self.prev_weights = K.in_train_phase(
            [K.update(p, c)
             for c, p in zip(self.model.trainable_weights, self.prev_weights)],
            self.prev_weights)
        return K.in_train_phase(err + K.sum(werr), err)
The variable prev_weights holds the previous values. Note that we added a K.update operation after the weight errors are calculated.
A sample model for testing:
model = keras.models.Sequential([
    keras.layers.Dense(8, input_shape=(4,)),
    keras.layers.Dense(4),
    keras.layers.Dense(1),
])

loss_obj = HistoricalAvgLoss(model)
model.compile('adam', loss_obj.loss)
model.summary()
Some test data and objective function:
import numpy as np

def test_fn(x):
    return x[0]*x[1] + 2.0 * x[1]**2 + x[2]/x[3] + 3.0 * x[3]

X = np.random.rand(1000, 4)
y = np.apply_along_axis(test_fn, 1, X)

hist = model.fit(X, y, validation_split=0.25, epochs=10)
In my test, the model loss decreased over time.
Assume that y_true and y_pred are in [-1, 1]. I want a weighted mean-squared-error loss function in which samples that are positive in y_true and negative in y_pred, or vice versa, are weighted by exp(alpha). Here is my code:
import tensorflow as tf
import keras.backend as K

alpha = 1.0

def custom_loss(y_true, y_pred):
    se = K.square(y_pred - y_true)
    true_label = K.less_equal(y_true, 0.0)
    pred_label = K.less_equal(y_pred, 0.0)
    return K.mean(se * K.exp(alpha * K.cast(K.not_equal(true_label, pred_label), tf.float32)))
And here is a plot of this loss function; the different curves are for different values of y_true.
I want to know:
Is this a valid loss function, given that it is not differentiable at 0?
Is my code correct?
I suggest this type of loss function to handle an imbalanced dataset:
import tensorflow as tf
import keras.backend as K

def focal_loss(y_true, y_pred):
    gamma = 2.0
    alpha = 0.25
    pt_1 = tf.where(tf.equal(y_true, 1), y_pred, tf.ones_like(y_pred))
    pt_0 = tf.where(tf.equal(y_true, 0), y_pred, tf.zeros_like(y_pred))
    # K.epsilon() guards against log(0).
    return (-K.sum(alpha * K.pow(1. - pt_1, gamma) * K.log(pt_1 + K.epsilon()))
            - K.sum((1 - alpha) * K.pow(pt_0, gamma) * K.log(1. - pt_0 + K.epsilon())))
from this source
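A trivial usage sketch (my addition, not from the answer; it assumes a binary classifier with a sigmoid output):

model.compile(optimizer='adam', loss=focal_loss, metrics=['accuracy'])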