What's the correct and optimized way to use multiple loss functions with Python and Keras?
I'm playing with this article:
https://medium.com/#polanitzer/predicting-the-israeli-lottery-results-for-the-november-29-2022-game-using-an-artificial-191489eb2c10
The author uses the standard 'mse' loss, and I want to add some custom functions. Those functions are not mathematically rigorous; they are just the result of observations.
Let's say I have 5 situations:
1. Previous results must not happen again.
2. The sum of the results must be between V and W.
3. In a game with numbers from 00 to 99, the predicted numbers in the range 00 to 49 cannot occur more than X times.
4. Likewise, the predicted numbers in the range 50 to 99 cannot occur more than Y times.
5. I want to keep the default mse function.
My approach started from:
Situation 1
def fn_never_repeat(y_true, y_pred, previous_data):
    # Compute the difference score between the predicted outputs and the previous data
    diff = K.mean(K.square(y_pred - previous_data), axis=-1)
    # Return the weighted sum of the difference scores
    return diff
Situation 2
def fn_sum_values(y_true, y_pred):
    # Calculate the sum of the predicted numbers
    predicted_sum = K.sum(y_pred)
    # Set the minimum value to 133 and maximum value to 249
    X = 133
    Y = 249
    # Calculate the loss based on the deviation from the desired range (X, Y)
    loss = K.maximum(X - predicted_sum, 0) + K.maximum(predicted_sum - Y, 0)
    return loss
Situation 3 and 4
(using 10 as an example for both ranges)
def fn_quadrant(y_true, y_pred):
    count_0_to_49 = K.sum(K.cast(K.less(y_pred, 50), 'float'))
    count_50_to_99 = K.sum(K.cast(K.greater_equal(y_pred, 49), 'float'))
    penalty = 0
    if count_0_to_49 > 10:
        penalty += K.square(count_0_to_49 - 10)
    if count_50_to_99 > 10:
        penalty += K.square(count_50_to_99 - 10)
    return K.mean(K.square(y_true - y_pred)) + penalty
Situation 5
def fn_combined_loss(y_true, y_pred):
    fn_never_repeat = fn_never_repeat(y_true, y_pred, previous_data)
    fn_sum_values = fn_sum_values(y_true, y_pred)
    fn_quadrant = fn_quadrant(y_true, y_pred)
    mse = K.mean(K.square(y_true - y_pred))
    return 0.25 * fn_never_repeat + 0.25 * fn_sum_values + 0.25 * fn_quadrant + 0.25 * mse
Calling it with:
model.compile(optimizer=Adam(learning_rate=0.0001), loss='fn_combined_loss', metrics=['accuracy'], custom_objects={'fn_combined_loss': fn_combined_loss})
And an error occurs after:
model.fit(x=x_train, y=y_train, batch_size=number_of_batch, epochs=Nepochs, verbose=1, callbacks=[callbacks_list])
I'm stuck on this error:
ValueError: Unknown loss function: combined_loss_fn. Please ensure
this object is passed to the custom_objects argument. See
https://www.tensorflow.org/guide/keras/save_and_serialize#registering_the_custom_object
for details.
The parameters (the limits such as X, and how many times something can or cannot happen) will come from a stored procedure in a database. Since the values are based on registered results, they will be supplied dynamically, even though they are not absolute mathematical statistics.
I'm not sure whether the functions are correct, since the model isn't compiling yet; please point out any mistakes if you see something wrong.
Thanks in advance!
With these few changes, I think your training should work.
Compile your model with the actual loss function, not as a string:
model.compile(optimizer=Adam(learning_rate=0.0001), loss=fn_combined_loss, metrics=['accuracy'])
In the combined loss, rename the local variables; do not reuse the names of the loss functions:
def fn_combined_loss(y_true, y_pred):
    fn_never_repeat_ = fn_never_repeat(y_true, y_pred)
    fn_sum_values_ = fn_sum_values(y_true, y_pred)
    fn_quadrant_ = fn_quadrant(y_true, y_pred)
    mse = K.mean(K.square(y_true - y_pred))
    return 0.25 * fn_never_repeat_ + 0.25 * fn_sum_values_ + 0.25 * fn_quadrant_ + 0.25 * mse
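Note that fn_never_repeat as defined in the question takes a third argument, previous_data, which Keras will not supply (a loss only receives y_true and y_pred). One common workaround, shown here only as a sketch and assuming previous_data is a constant array known before compiling, is to wrap it in a closure:

def make_never_repeat_loss(previous_data):
    # previous_data: array of past results; its shape must broadcast against y_pred
    prev = K.constant(previous_data)
    def fn_never_repeat(y_true, y_pred):
        # Same computation as in the question, with previous_data baked in
        return K.mean(K.square(y_pred - prev), axis=-1)
    return fn_never_repeat

You would then build the function once, e.g. never_repeat = make_never_repeat_loss(previous_data), and call never_repeat(y_true, y_pred) inside fn_combined_loss.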
In fn_quadrant, change the constants to floats:
def fn_quadrant(y_true, y_pred):
    count_0_to_49 = K.sum(K.cast(K.less(y_pred, 50), 'float'))
    count_50_to_99 = K.sum(K.cast(K.greater_equal(y_pred, 49), 'float'))
    penalty = 0.
    if count_0_to_49 > 10:
        penalty += K.square(count_0_to_49 - 10.)
    if count_50_to_99 > 10:
        penalty += K.square(count_50_to_99 - 10.)
    return K.mean(K.square(y_true - y_pred)) + penalty
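One more caveat, offered as a sketch rather than a guaranteed fix: a plain Python if on a symbolic Keras tensor may not behave as intended in graph mode, so the same penalty can be written with tensor ops only. K.relu(x) equals max(x, 0), so the penalty stays at zero until a count exceeds 10; I also use 50.0 as the boundary of the upper range, matching the ranges described in the question:

def fn_quadrant(y_true, y_pred):
    count_0_to_49 = K.sum(K.cast(K.less(y_pred, 50.0), 'float32'))
    count_50_to_99 = K.sum(K.cast(K.greater_equal(y_pred, 50.0), 'float32'))
    # Quadratic penalty that is zero while each count stays at or below 10
    penalty = K.square(K.relu(count_0_to_49 - 10.0)) + K.square(K.relu(count_50_to_99 - 10.0))
    return K.mean(K.square(y_true - y_pred)) + penalty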
I tried to implement logistic regression with only numpy in Python, but the result is not satisfying. The predictions seem incorrect and the loss is not improving, so there is probably something wrong with the code. Does anyone know what could fix it? Thank you very much!
Here is the algorithm:
import numpy as np
# training data and labels
X = np.concatenate((np.random.normal(0.25, 0.1, 50), np.random.normal(0.75, 0.1, 50)), axis=None)
Y = np.concatenate((np.zeros((50,), dtype=np.int32), np.ones((50,), dtype=np.int32)), axis=None)
def logistic_sigmoid(a):
    return 1 / (1 + np.exp(-a))

# forward pass
def forward_pass(w, x):
    return logistic_sigmoid(w * x)

# gradient computation
def backward_pass(x, y, y_real):
    return np.sum((y - y_real) * x)

# computing loss
def loss(y, y_real):
    return -np.sum(y_real * np.log(y) + (1 - y_real) * np.log(1 - y))

# training
def train():
    w = 0.0
    learning_rate = 0.01
    i = 200
    test_number = 0.3
    for epoch in range(i):
        y = forward_pass(w, X)
        gradient = backward_pass(X, y, Y)
        w = w - learning_rate * gradient
        print(f'epoch {epoch + 1}, x = {test_number}, y = {forward_pass(w, test_number):.3f}, loss = {loss(y, Y):.3f}')

train()
At first glance, you are missing your intercept term (typically called b_0, or bias) and its gradient update. Also, in the backward_pass and loss calculations you are not dividing by the number of data samples.
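A minimal sketch of those two changes, reusing X, Y and logistic_sigmoid from your code (the learning rate here is just an example):

def forward_pass(w, b, x):
    return logistic_sigmoid(w * x + b)

def train():
    w, b = 0.0, 0.0
    learning_rate = 0.1
    n = len(X)
    for epoch in range(200):
        y = forward_pass(w, b, X)
        grad_w = np.sum((y - Y) * X) / n  # gradient w.r.t. the weight, averaged over samples
        grad_b = np.sum(y - Y) / n        # gradient w.r.t. the bias, averaged over samples
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b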
You can see two examples of how to implement it from scratch here:
1: Example based on Andrew Ng explanations in the Machine Learning course in Coursera
2: Implementation of Jason Brownlee from Machine Learning mastery website
I'm trying to run a UNet model in PyTorch for multi-class image segmentation. I found an architecture for the model online that is apparently working ... I have 100 classes, my input corresponds to a tensor of size [8, 3, 32, 32], my label is [8, 32, 32], and as expected my output is [8, 100, 32, 32].
I want to compute the accuracy for every iteration, so I followed this code for computing the accuracy:
def multi_acc(pred, label):
    probs = torch.log_softmax(pred, dim = 1)
    _, tags = torch.max(probs, dim = 1)
    corrects = (tags == label).float()
    acc = corrects.sum()/len(corrects)
    acc = torch.round(acc)*100
    return acc
But then when I'm running the training I get an accuracy that is always the same:
Epoch : [2] [1/38311] Loss : 0.3168763518333435
Acc: 102400.0
Epoch : [2] [2/38311] Loss : 0.31527179479599
Acc: 102400.0
Epoch : [2] [3/38311] Loss : 0.2920961081981659
Acc: 102400.0
And it keeps going like this...
If anyone has an idea to help me understand this better, that would be super great!
Thanks for the answers.
You calculate the accuracy with:
acc = corrects.sum()/len(corrects)
corrects has a size of torch.Size([8, 32, 32]), taking the sum with corrects.sum() gives you the number of correctly classified pixels, and there are a total of 8 * 32 * 32 = 8192. The accuracy should be num_correct / num_total, but you're dividing it by len(corrects) == 8. To get the total number of elements you can use torch.numel.
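For example:

import torch

corrects = torch.ones(8, 32, 32)
print(len(corrects))     # 8, the size of the first dimension only
print(corrects.numel())  # 8192, the total number of elements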
Another problem is that you're rounding your accuracy:
acc = torch.round(acc)*100
The accuracy is a value between 0 and 1. By rounding it, you'll get 0 for everything below 0.5 and 1 for everything else. That means you would only determine whether you've achieved over 50% accuracy. You need to remove the rounding entirely.
Applying these changes, you get the following function. I also removed the log_softmax, which leaves the order unchanged (larger values have larger probabilities). Since you're not using the probabilities, it has no effect:
def multi_acc(pred, label):
    _, tags = torch.max(pred, dim = 1)
    corrects = (tags == label).float()
    acc = corrects.sum() / corrects.numel()
    acc = acc * 100
    return acc
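For example, with random tensors of the shapes from your question:

pred = torch.randn(8, 100, 32, 32)          # model output: [batch, classes, height, width]
label = torch.randint(0, 100, (8, 32, 32))  # ground truth: [batch, height, width]
print(multi_acc(pred, label))               # a value between 0 and 100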
corrects is a 3-dimensional tensor (batch, width, height) or something like that. When you call acc = corrects.sum() / len(corrects), len returns the size of the first dimension of the tensor, in this case 8 I think. Instead, use .numel() to return the total number of elements in the 3-dimensional tensor. I also recommend using torch.eq(). And don't round at the end: acc should be between 0 and 1 before rounding, so if you round it you'll always get either 0 or 1, which corresponds to 0% or 100% accuracy after converting to a percentage. Leave your accuracy metric unrounded and round it when you print it.
def multi_acc(pred, label):
    probs = torch.log_softmax(pred, dim = 1)
    _, tags = torch.max(probs, dim = 1)
    corrects = torch.eq(tags, label).int()
    acc = corrects.sum() / corrects.numel()
    return acc
I am trying to regress two variables (mean and std) and then trying to optimize log(gaussian_distribution) = log(std) + (target - mean)^2 / (2 * std^2).
Note that on the same data, if I change the loss to MSE on the first variable (or even just use (target - mean)^2 in my defined function), I am able to get a very good estimate of the mean.
I am unable to figure out the mistake in the code or figure out a reason why it should not work.
Here is the code
def gaussian_loss(y2, y1):
    std = K.exp(y1[:,1])
    mean = y1[:,0]
    return K.mean(K.log(std) + K.square(mean - y2[:,0]) / (2 * K.square(std)), axis = -1)
I have implemented this in Keras. A related blog post: https://engineering.taboola.com/predicting-probability-distributions/
I have implemented this loss; your implementation was my first try, and as you say it just does not work. I am not sure why, but here is a correct way to implement this loss:
def regression_nll_loss(sigma_sq, epsilon = 1e-6):
    def nll_loss(y_true, y_pred):
        return 0.5 * K.mean(K.log(sigma_sq + epsilon) + K.square(y_true - y_pred) / (sigma_sq + epsilon))
    return nll_loss
As you see this loss only takes supervised labels for the mean. The variance has to be passed directly to the loss as a tensor:
inp = Input(shape=(1,))
x = Dense(10, activation="relu")(inp)
x = Dense(20, activation="relu")(x)
x = Dense(30, activation="relu")(x)
mean = Dense(1, activation="linear")(x)
var = Dense(1, activation="softplus")(x)
train_model = Model(inp, mean)
pred_model = Model(inp, [mean, var])
train_model.compile(loss=regression_nll_loss(var), optimizer="adam")
Then you can use train_model for training normally with model.fit, and pred_model to make predictions.
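For example, with hypothetical array names:

train_model.fit(x_train, y_train, epochs=100, batch_size=32)
mean_pred, var_pred = pred_model.predict(x_test)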
You can see a full example using my library at: https://github.com/mvaldenegro/keras-uncertainty/blob/master/examples/regression_deep-ensemble.py
I think it is also possible to implement this loss using the add_loss API, but I have not tried.
I am confused about the loss in Keras when using train_on_batch.
See my model below (an implementation of a triplet network).
Sometimes my model outputs NaN, yet the loss still shows something like 0.4.
And after some iterations, the encoder always produces the same output regardless of what the input is.
In that situation the loss of the triplet model should be 0.5 (see the function bpr_triplet_loss below), but Keras reports a loss of something like 0.1 or 0.2, sometimes even greater than 1 or smaller than 0. Why?
Is it because an exploding-gradient problem occurs and the model breaks?
ter_input = Input(shape=self.img_shape)
ter_input_pos = Input(shape=self.img_shape)
ter_input_neg = Input(shape=self.img_shape)
ed = self.encoder()
ter_encoded = ed(ter_input)
pos_encoded = ed(ter_input_pos)
neg_encoded = ed(ter_input_neg)
def bpr_triplet_loss(X):
    positive_item_latent, negative_item_latent, user_latent = X
    # BPR loss
    loss = 1.0 - K.sigmoid(
        K.sum(user_latent * positive_item_latent, axis=-1, keepdims=True) -
        K.sum(user_latent * negative_item_latent, axis=-1, keepdims=True))
    return loss

loss_layer = Lambda(bpr_triplet_loss, output_shape=(1,))([pos_encoded, neg_encoded, ter_encoded])

def identity_loss(y_true, y_pred):
    return K.mean(y_pred - 0 * y_true)

self.triplet_net = Model(
    input=[ter_input, ter_input_pos, ter_input_neg],
    output=loss_layer)
self.triplet_net.compile(loss=identity_loss, optimizer=Adam(0.0001, beta_1=0, beta_2=0.99, decay=0.00001))  # Adam()
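For reference, I train it roughly like this (anchor_batch, pos_batch, neg_batch and batch_size are placeholder names, with numpy imported as np; since identity_loss ignores y_true, a dummy target of matching batch size is passed):

dummy_y = np.zeros((batch_size, 1))
loss_value = self.triplet_net.train_on_batch([anchor_batch, pos_batch, neg_batch], dummy_y)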
Assume that y_true and y_pred are in [-1, 1]. I want a weighted mean-square-error loss function in which the loss for samples that are positive in y_true and negative in y_pred, or vice versa, is weighted by exp(alpha). Here is my code:
import tensorflow as tf
import keras.backend as K

alpha = 1.0

def custom_loss(y_true, y_pred):
    se = K.square(y_pred - y_true)
    true_label = K.less_equal(y_true, 0.0)
    pred_label = K.less_equal(y_pred, 0.0)
    return K.mean(se * K.exp(alpha * K.cast(K.not_equal(true_label, pred_label), tf.float32)))
And here is a plot of this loss function. Different curves are for different values of y_true.
I want to know:
Is this a valid loss function, given that it is not differentiable at 0?
Is my code correct?
I suggest this type of loss function to handle an imbalanced dataset:
def focal_loss(y_true, y_pred):
    gamma = 2.0
    alpha = 0.25
    pt_1 = tf.where(tf.equal(y_true, 1), y_pred, tf.ones_like(y_pred))
    pt_0 = tf.where(tf.equal(y_true, 0), y_pred, tf.zeros_like(y_pred))
    return -K.sum(alpha * K.pow(1. - pt_1, gamma) * K.log(pt_1)) - K.sum((1 - alpha) * K.pow(pt_0, gamma) * K.log(1. - pt_0))
from this source
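A usage sketch (with a hypothetical, already-built model): pass the function object directly when compiling, as with any custom Keras loss:

model.compile(optimizer='adam', loss=focal_loss, metrics=['accuracy'])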