Weird behaviour in tensorflow metric - python

I have created a tensorflow metric as seen below:
def AttackAcc(y_true, y_pred):
r = tf.random.uniform(shape=(), minval=0, maxval=11, dtype=tf.int32)
if tf.math.greater(r,tf.constant(5) ):
return tf.math.equal( tf.constant(0.6) , tf.constant(0.2) )
else:
return tf.math.equal( tf.constant(0.6) , tf.constant(0.6) )
The metric is added to the model.compile as :
metrics=[AttackAcc]
This should return 0 half of the time and 1 in the other half. SO while training my model i should see a value for this metric of around 0.5.
However it is always 0.
Any ideas about why?

It looks like you are comparing two constants and they will always not be equal. Try BinaryAccuracy and use your input variables to update the state.
def AttackAcc(y_true, y_pred):
r = tf.random.uniform(shape=(), minval=0, maxval=11, dtype=tf.int32)
acc_metric = tf.keras.metrics.BinaryAccuracy()
acc_metric.update_state(y_true, y_pred)
if tf.math.greater(r, tf.constant(5)):
return acc_metric.result()
else:
return 1 - acc_metric.result()

Related

Using a custom loss function that references some of the inputs of the NN

I'm having trouble implementing a custom loss function into a Neural Network I'm building in TensorFlow. I want use one of my features as part of the loss function, so I've tried using model.add_loss instead of giving loss a value in the model.compile function.
My data looks like this:
import tensorflow as tf
import numpy as np
from tensorflow.keras import layers
feature_df = np.array([600,9])
training, test, = feature_df[:350,:], feature_df[350:,:]
x_train = training[:,[0,1,2,3,4,5,6]]
y_train = training[:,8]
loss_inp_train = training[:,[6]]
x_test = test[:,[0,1,2,3,4,5,6]]
y_test = test[:,8]
loss_inp_test = test[:,[6]]
I want to use a custom loss function because its not necessarily the mse I'm interested in minimizing, I want to minimize the profitability of this model, which depends if y_true and y_pred fall above or below loss_inp_train
I've tried creating a loss function that looks like this
def custom_loss(y_pred, y_true,inp):
loss = 0
if (y_pred < inp):
if y_true < inp:
loss = loss + .9
else:
loss = loss - 1
else:
if y_true > inp:
loss = loss + .9
else:
loss = loss - 1
loss = loss*-1
return(loss)
And the Model
model = tf.keras.Sequential([
normalize,
layers.Dense(18),
layers.Dense(1)
])
model.add_loss(profit_loss(y_pred,y_train,loss_inp_train))
model.compile(loss = None,
optimizer = tf.optimizers.Adam())
I'm having trouble feeding the loss function the output of the model. I'm still new to TensorFlow, whenever I've accessed predicted values its after the training using model.predict, but obviously I don't have a fitted model yet. How do I reference both a feature of the training data and y_true, y_pred in a function?
Probably the best way to do this is to define a custom loss. Unfortunately I'm not sure how to handle nested if statements like you have. Probably with a combination of K.switch. I can try to give you a partial solutions taking in consideration only the presence of a single if statement. Let's take the following simplified code:
loss = 0
if (y_pred < inp):
loss = # assignment 1
else:
loss = # assignment 2
In this case the loss function could be converted into this:
def profit_loss(inp):
def loss_function(y_true, y_pred):
loss = 0
condition = K.greater(y_pred - inp, 0)
loss1 = # assignment 1 if y_pred < inp
loss2 = # assignment 2 if y_pred >= inp
loss = K.switch(condition, loss2, loss1)
return - K.sum(loss)
return loss_function
model.compile(optimizer = tf.optimizers.Adam(), loss=profit_loss(inp))
This way y_true and y_pred are automatically handled and you just have to feed the inp argument.
Hope this helps getting you closer to solving the problem.

Tensorflow custom loss for multiple outputs model

I am working on a image segmentation and classification model which take one 3D image (64x64x64) and return two outputs (3D mask 64x64x64 and one-hot encoded category). Two outputs has been defined like this:
seg_final_stage = Conv3D(seg_classes, kernel_size=3, strides=1, padding="same", name="Seg_Final")(decoder1)
output_seg = Activation('sigmoid', name = "Seg_Final_Sigmoid")(seg_final_stage)
class_final = Dense(texture_classes, name="Class_Final")(class_dense3)
output_class = Activation('softmax', name = "Class_Final_SoftMax")(class_final)
model = Model(input, [output_seg, output_class], name = Config["Model_Name"])
There are two different method that I have tried but both failed, it seems like each output will return its own loss instead of just one loss. Following are the loss function I have now.
# Custom loss function
def dice_coef(y_true, y_pred):
smooth = 1.
y_true_f = tf.reshape(y_true, [-1])
y_pred_f = tf.reshape(y_pred, [-1])
intersection = tf.reduce_sum(y_true_f * y_pred_f)
score = (2. * intersection + smooth) / (tf.reduce_sum(y_true_f) + tf.reduce_sum(y_pred_f) + smooth)
return score
def dice_loss(y_true, y_pred):
return 1 - dice_coef(y_true, y_pred)
def CCE(y_true, y_pred):
class_loss = CategoricalCrossentropy(reduction=tf.keras.losses.Reduction.NONE)
return class_loss(y_true, y_pred)
def hybrid(y_true, y_pred):
mask_weight = 0.8
class_weight = 0.2
mask_loss = dice_loss(y_true[0], y_pred[0])
class_loss = CCE(y_true[1], y_pred[1])
return mask_weight * mask_loss + class_weight * class_loss
The first try:
I use single hybrid loss function that try to sum the dice loss and CCE together with weights. However, the model returns 2 losses like output_seg_hybrid and output_class_hybrid. I thought the y_true and y_pred are formed as array so that I can that the first item to calculate dice and the second item for CCE.
model.compile(loss=hybrid,
optimizer=Nadam(learning_rate=Config["learning_rate"], beta_1=0.9, beta_2=0.999, epsilon=1e-07, name="Nadam"),
metrics=[hybrid])
The second try:
I applied 2 losses like below to ensure the output has corresponding loss so it can return 2 loss and sum-up with weights. What I get is actually 4 losses like Seg_Final_Sigmoid_dice_loss, Class_Final_SoftMax_dice_loss, Seg_Final_Sigmoid_CCE and Class_Final_SoftMax_CCE.
model.compile(loss={'Seg_Final_Sigmoid':dice_loss, 'Class_Final_SoftMax':CCE},
optimizer=model.optimizer, metrics=[dice_loss, CCE], loss_weights = [0.8,0.2])
What should I do if I just want to sum-up these two losses as one loss?

Use if/else logic in tensorflow to either add an element to one tensor or another

I am building a custom loss function that needs to know whether the truth and the prediction have N pixels above a threshold. This is because the logic breaks if I supply an np.where() array which is empty. I can get around this issue by using try/else to return a 'flagged constant' in the case that the function fails on the empty set, but I'd like to do something different. Here is my current method.
def some_loss(cutoff=20, min_pix=10):
def gen_loss(y_true, y_pred):
trues = tf.map_fn(fn = lambda x: x, elems = y_true)
preds = tf.map_fn(fn = lambda x: x, elems = y_pred)
for idx in tf.range(tf.shape(y_true)[0]):
# binarize both by cutoff
true = y_true[idx]
pred = y_pred[idx]
true = tf.where(true < cutoff, 0.0, 1.0)
pred = tf.where(pred < cutoff, 0.0, 1.0)
# now I sum each to get the number of pixels above threshold
n_true, n_pred = tf.reduce_sum(true), tf.reduce_sum(pred)
# then I create a switch using tf.conditional
switch = tf.cond(tf.logical_or(n_true < min_pix, n_pred < min_pix), lambda: tf.zeros_like(true), lambda: tf.ones_like(true))
# this essentially allows me to turn off the loss if either condition is met
# so I then run the function
loss = get_loss(true, pred) # returns random constant if either is below threshold
loss += tf.reduce_sum(tf.math.multiply(loss, switch))
return loss
return gen_loss
This may work, it compiles and trains a convolutional model. However, I don't like that there are random constants wandering about my loss function, and I'd rather only operate the function get_loss() if both true and pred meet the minimum conditions.
I'd prefer to make two tensors, one with samples not meeting the condition, the other with samples meeting the condition.
Separately, I've tried to use tf.conditional to test for each case and call a separate loss function in either case. The code is repeated below.
def avgMED(scaler, cutoff=20, min_N=30,c=3):
def AVGmed(y_true, y_pred):
const = tf.constant([c],tf.float32) # constant c, multiplied by MED (
batch_size = tf.cast(tf.shape(y_true)[0], tf.float32)
MSE = tf.reduce_mean(tf.square(y_true-y_pred))
y_true = tf.reshape(y_true, shape=(tf.shape(y_true)[0], -1))
y_pred = tf.reshape(y_pred, shape=(tf.shape(y_pred)[0], -1))
loss, loss_med = tf.cast(0,dtype=tf.float32), tf.cast(0,dtype=tf.float32)
# rescale
y_true = y_true*scaler.scale_
y_true = y_true+scaler.mean_
y_pred = y_pred*scaler.scale_
y_pred = y_pred+scaler.mean_
trues = tf.map_fn(fn = lambda x: x, elems=y_true)
preds = tf.map_fn(fn = lambda x: x, elems=y_pred)
min_nonzero_pixels = tf.reduce_sum(tf.constant(min_N, dtype=tf.float32))
for idx in tf.range(batch_size):
idx = tf.cast(idx, tf.int32)
true = trues[idx]
pred = preds[idx]
MSE = tf.reduce_mean(tfm.square(tfm.subtract(true,pred)))
true = tf.where(true<cutoff,0.0,1.0)
pred = tf.where(pred<cutoff,0.0,1.0)
n_true = tf.reduce_sum(true)
n_pred = tf.reduce_sum(pred)
loss_TA = tf.cond(tf.logical_or(n_true < min_nonzero_pixels, n_pred < min_nonzero_pixels), get_zero(true,pred), get_MED(true,pred))
loss_med += loss_TA.read(0)
loss += loss_med + MSE # do we benefit from reducing across the batch dimension? we should be able to look at familiar batches and see the little increase due to the distance component
tf.print(n_true,n_pred)
tf.print(loss_med)
return loss # this is essentially MSE given c ~ 0. Thus, this will show if there are some weird gradients flowing through that are preventing the model from learning
return AVGmed
def get_MED(A,B):
# takes in binary tensors
indices_A, indices_B = tf.where(A), tf.where(B)
coordX_A_TA, coordY_A_TA = find_coord(indices_A) # finds x,y coordinates and returns tensor array
coordX_B_TA, coordY_B_TA = find_coord(indices_B)
mindists_AB_TA = find_min_distances(coordX_A_TA, coordY_A_TA, coordX_B_TA, coordY_B_TA)
mindists_BA_TA = find_min_distances(coordX_B_TA, coordY_B_TA, coordX_A_TA, coordY_A_TA)
# MED = mean error distance =
med_AB = tf.reduce_mean(mindists_AB_TA.read(0))
med_BA = tf.reduce_mean(mindists_BA_TA.read(0))
avg_med = tfm.divide(tfm.add(med_AB,med_BA),tf.constant(0.5))
loss_TA = tf.TensorArray(dtype=tf.float32, size=0, dynamic_size=True)
loss_TA.write(loss_TA.size(), avg_med)
return loss_TA
def get_zero(A,B):
loss_TA = tf.TensorArray(dtype=tf.float32, size=0, dynamic_size=True)
loss_TA.write(loss_TA.size(), 0)
return loss_TA
However, with this framework I am now getting new errors about my generator not having enough data, which is absurd given the batch size I test with is 10 and 1 step_per_epoch on a train size of 100. Got a warning about not closing the TensorArray, which I expect happens whether the conditional is true or false. Inching closer to a solution but could use some guidance on how problematic my tensorflow logic is.

How to initialise the Weights for specific task and Backpropagation modification

My model is used to predict values based on an minimising a loss function L. But, the loss function doesn’t have a single global minima value, but rather a large number of places where it achieves global minima.
So, the model is based like this:
Model Input is [nXn] tensor (let’s say: inp=[ [i_11, i_12, i_13, ..., i_1n],[i_21, i_22, ..., i_2n],...,[i_n1,i_n2, ..., i_nn] ]) and model output is [nX1] tensor (let’s say: out1=[o_1, o_2,..., o_n ])
Output tensor is out1 is passed in a function f to get out2 (let’s say: f(o_1, o_2, o_3,..., o_n)=[O_1, O_2, O_3, ..., O_n] )
These 2 values (i.e., out1 and out2) are minimised using MSELoss i.e., Loss = ||out1 - out2||
Now, there are a lot of values for [o_1, o_2, ..., o_n] for which the Loss goes to minimum.
But, I want the values of [o_1, o_2, ..., o_n] for which |o_1| + |o_2| + |o_3| + ... + |o_n| is maximum
Right now, the weights are initialised randomly:
self.weight = torch.nn.parameter.Parameter(torch.FloatTensor(in_features, out_features)) for some value of in_features and out_features
But by doing this, I am getting the values of [o_1, o_2, ..., o_n] for which |o_1| + |o_2| + |o_3| + ... + |o_n| is minimum.
I know this problem can be solved by without using deep-learning, but I am trying to get the results like this for some task computation.
Is there a way to change this to get the largest values predicted at the output of the neural net?
Or is there any other technique (backpropagation change) to change it to get the desired largest valued output?
Thanks in advance.
EDIT 1:
Based on the answer, out1=[o_1, o_2,..., o_n ] is tending to zero-valued tensor. In the initial epochs, out2=[O_1, O_2, O_3, ..., O_n] takes very large values, but subsequently comes down to lower values.
A snippet of code below will give the idea:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
import numpy as np
class Model(nn.Module):
def __init__(self, inp_l, hid_l, out_l=1):
super(Model, self).__init__()
self.lay1 = nn.Linear(inp_l ,hid_l)
self.lay2 = nn.Linear(hid_l ,out_l)
self.dp = nn.Dropout(p=0.5)
def forward(self, inp):
self.out1= torch.tensor([]).float()
for row in range(x.shape[0]):
y = self.lay1(inp[row])
y = F.relu(y)
y = self.dp(y.float())
y = self.lay2(y)
y = F.relu(y)
self.out1= torch.cat((self.out1, y))
return self.out1.view(inp.shape[0],-1)
def function_f(inp, out1):
'''
Some functional computation is done to return out2.
'''
return out2
def train_model(epoch):
model.train()
t = time.time()
optimizer.zero_grad()
out1 = model(inp)
out2 = function_f(inp, out1)
loss1 = ((out1-out2)**2).mean()
loss2 = -out1.abs().mean()
loss_train = loss1 + loss2
loss_train.backward(retain_graph=True)
optimizer.step()
if epoch%40==0:
print('Epoch: {:04d}'.format(epoch+1),
'loss_train: {:.4f}'.format(loss_train.item()),
'time: {:.4f}s'.format(time.time() - t))
model= Model(inp_l=10, hid_l=5, out_l=1)
optimizer = optim.Adam(model.parameters(), lr=0.001)
inp = torch.randint(100, (10, 10))
for ep in range(100):
train_model(ep)
But, out1 value goes to trivial solution i.e., zero-valued tensor which is the minimum valued solution. As mentioned before EDIT, I want to get the max-valued solution.
Thank you.
I am not sure I understand what you want.
Your weight initialization is overly complicated as well, you may just do:
self.weight = torch.nn.Linear(in_features, out_featues)
If you want to have the largest value of a batch of inputs you may simply do:
y = self.weight(x)
return y.max(dim=0)[0]
But I am not entirely sure that is what you meant with your question.
EDIT:
It seems you have two objectives. The first thing I would try is to convert both of them in losses to be minimized by the optimizer.
loss1 = MSE(out1, out2)
loss2 = - out1.abs().mean()
loss = loss1 + loss2
minimizing loss will simutaneously minimize the MSE between out1 and out2 and maximize the absolute values of out1. (minimizing - out1.abs().mean() is the same as maximizing out1.abs().mean()).
Notice that it is possible your neural net will just create large biases and zero the weights as a lazy solution for the objective. You may turn of biases to avoid the problem, but I would still expect some other training problems.

How does loss extracly work in keras when using train_on_batch?

I am confused about loss in keras when using 'train_on_batch'.
See my model below (a implementation of triplet)
Sometimes my model outputs Nan, and the loss still shows something like 0.4.
And after some iterations, the encoder always shows the same output regardless what the input is.
And in that situation the loss of the triplet model should be 0.5 (see function bpr_triplet_loss below), but keras tells the loss is something like 0.1 or 0.2, even sometimes greater than 1 or smaller than 0. Why?
Is it because gradient exploding problem occurs and then the model is broken?
ter_input = Input(shape=self.img_shape)
ter_input_pos = Input(shape=self.img_shape)
ter_input_neg = Input(shape=self.img_shape)
ed = self.encoder()
ter_encoded = ed(ter_input)
pos_encoded = ed(ter_input_pos)
neg_encoded = ed(ter_input_neg)
def bpr_triplet_loss(X):
positive_item_latent, negative_item_latent, user_latent = X
# BPR loss
loss = 1.0 - K.sigmoid(
K.sum(user_latent * positive_item_latent, axis=-1, keepdims=True) -
K.sum(user_latent * negative_item_latent, axis=-1, keepdims=True))
return loss
loss_layer = Lambda(bpr_triplet_loss, output_shape=(1,))([pos_encoded, neg_encoded, ter_encoded])
def identity_loss(y_true, y_pred):
return K.mean(y_pred - 0 * y_true)
self.triplet_net = Model(
input=[ter_input, ter_input_pos, ter_input_neg],
output=loss_layer)
self.triplet_net.compile(loss=identity_loss, optimizer=Adam(0.0001, beta_1=0, beta_2=0.99, decay=0.00001)) # Adam()

Categories

Resources