I want to use the MeanIoU metric in keras (doc link). But I don't really understand how it could be integrated with the keras api. In the example, the prediction and the ground truth are given as binary values but with keras we should get probabilities, especially because the loss is mse...
We should have something like:
m = tf.keras.metrics.MeanIoU(num_classes=2)
m.update_state([0, 0, 1, 1], [0.3, 0.6, 0.2, 0.9])
But now the result isn't the same, we have:
# <tf.Variable 'UnreadVariable' shape=(2, 2) dtype=float64, numpy=array([[2., 0.],
# [2., 0.]])>
m.result().numpy() # 0.25
So my question is how should we use this metric if the output of the model is probabilities? binary or even in a multi-class setting (one hot)?
For the Accuracy there is a distinction between BinaryAccuracy and CategoricalAccuracy and they both take probabilities in y_pred. Shouldn't it be the same for MeanIoU?
I am having similar issues. Despite looking for examples online, all demonstrations happens after applying argmax on the model's output.
The workaround I have for now is to subclass tf.keras.metrics.MeanIoU:
class MyMeanIOU(tf.keras.metrics.MeanIoU):
def update_state(self, y_true, y_pred, sample_weight=None):
return super().update_state(tf.argmax(y_true, axis=-1), tf.argmax(y_pred, axis=-1), sample_weight)
It is also possible to create your own function, but it is recommended to subclass tf.keras.metrics.Metric if you wish to benefit from the extra features such as distributed strategies.
I am still looking for cleaner solutions.
i have the same problem, and i look into the source code.
In tf2.0, at the end of the update_state function, there is :
current_cm = confusion_matrix.confusion_matrix(
y_true,
y_pred,
self.num_classes,
weights=sample_weight,
dtype=dtypes.float64)
and i look into confusion_matrix function,
with ops.name_scope(name, 'confusion_matrix',
(predictions, labels, num_classes, weights)) as name:
labels, predictions = remove_squeezable_dimensions(
ops.convert_to_tensor(labels, name='labels'),
ops.convert_to_tensor(
predictions, name='predictions'))
predictions = math_ops.cast(predictions, dtypes.int64)
labels = math_ops.cast(labels, dtypes.int64)
# Sanity checks - underflow or overflow can cause memory corruption.
labels = control_flow_ops.with_dependencies(
[check_ops.assert_non_negative(
labels, message='`labels` contains negative values')],
labels)
predictions = control_flow_ops.with_dependencies(
[check_ops.assert_non_negative(
predictions, message='`predictions` contains negative values')],
predictions)
if num_classes is None:
num_classes = math_ops.maximum(math_ops.reduce_max(predictions),
math_ops.reduce_max(labels)) + 1
else:
num_classes_int64 = math_ops.cast(num_classes, dtypes.int64)
labels = control_flow_ops.with_dependencies(
[check_ops.assert_less(
labels, num_classes_int64, message='`labels` out of bound')],
labels)
predictions = control_flow_ops.with_dependencies(
[check_ops.assert_less(
predictions, num_classes_int64,
message='`predictions` out of bound')],
predictions)
if weights is not None:
weights = ops.convert_to_tensor(weights, name='weights')
predictions.get_shape().assert_is_compatible_with(weights.get_shape())
weights = math_ops.cast(weights, dtype)
shape = array_ops.stack([num_classes, num_classes])
indices = array_ops.stack([labels, predictions], axis=1)
values = (array_ops.ones_like(predictions, dtype)
if weights is None else weights)
cm_sparse = sparse_tensor.SparseTensor(
indices=indices,
values=values,
dense_shape=math_ops.cast(shape, dtypes.int64))
zero_matrix = array_ops.zeros(math_ops.cast(shape, dtypes.int32), dtype)
return sparse_ops.sparse_add(zero_matrix, cm_sparse)
the trick is at 6th line of the code, tf use math_ops.cast cast the predictions to int64, so when you send [0.3, 0.6, 0.2, 0.9] into cast function, it returns [0, 0, 0, 0].
So, that's why you got a confusion maxtrix
[[2., 0.],
[2., 0.]]
Related
My neural network in Keras learns a representation of my original data. In order to see exactly how it learns I thought it would be interesting to plot the data for every training batch (or epoch alternatively) and convert the plots into a video.
I'm stuck on how to get the outputs of my model during the training phase.
I thought about doing something like this (pseudo code):
epochs = 200
plt_outputs = []
for i in range(epochs):
model.fit(x_train,y_train, epochs = 1)
plt_outputs.append(output_layer(x_test))
where output_layer is the layer in my neural network I'm interested in. Afterwards I would use plot_data to generate each plot and turn it into a video. (That part I'm not concerned about yet..)
But that doesn't strike me as a good solution, plus I don't know how get the output for every batch. Any thoughts on this?
You can customize what happens in the test step, much like this official tutorial:
import tensorflow as tf
import numpy as np
class CustomModel(tf.keras.Model):
def test_step(self, data):
# Unpack the data
x, y = data
# Compute predictions
y_pred = self(x, training=False)
test_outputs.append(y_pred) # ADD THIS HERE
# Updates the metrics tracking the loss
self.compiled_loss(y, y_pred, regularization_losses=self.losses)
# Update the metrics.
self.compiled_metrics.update_state(y, y_pred)
# Return a dict mapping metric names to current value.
# Note that it will include the loss (tracked in self.metrics).
return {m.name: m.result() for m in self.metrics}
# Construct an instance of CustomModel
inputs = tf.keras.Input(shape=(8,))
x = tf.keras.layers.Dense(8, activation='relu')(inputs)
outputs = tf.keras.layers.Dense(1)(x)
model = CustomModel(inputs, outputs)
model.compile(loss="mse", metrics=["mae"], run_eagerly=True)
test_outputs = list() # ADD THIS HERE
# Evaluate with our custom test_step
x = np.random.random((1000, 8))
y = np.random.random((1000, 1))
model.evaluate(x, y)
I added a list, and now in the test step, it will append this list with the output. You will need to add run_eagerly=True in model.compile() for this to work. This will output a list of such outputs:
<tf.Tensor: shape=(32, 1), dtype=float32, numpy=
array([[ 0.10866462],
[ 0.2749035 ],
[ 0.08196291],
[ 0.25862294],
[ 0.30985728],
[ 0.20230596],
...
[ 0.17108777],
[ 0.29692617],
[-0.03684975],
[ 0.03525433],
[ 0.26774448],
[ 0.21728781],
[ 0.0840873 ]], dtype=float32)>
When I am trying to import a metric from sklearn, I get the following error:
from sklearn.metrics import mean_absolute_percentage_error
ImportError: cannot import name 'mean_absolute_percentage_error' from 'sklearn.metrics'
/Users/carter/opt/anaconda3/lib/python3.8/site-packages/sklearn/metrics/__init__.py)
I have used conda update all, and reinstalled scikit-learn to no avail. Any other reasons this might happen and solutions?
The function mean_absolute_percentage_error is new in scikit-learn version 0.24 as noted in the documentation.
As of December 2020, the latest version of scikit-learn available from Anaconda is v0.23.2, so that's why you're not able to import mean_absolute_percentage_error.
You could try installing the latest version from source instead, or implement the function you need yourself. The source is available here if you'd like to take a look.
The answer above is the right one. For those who cannot upgrade/install from source, below is the required code.
The function itself relies on other functions - one defined in the same module and others is from sklearn.utils.validation.
Here is the required code I pulled from the source - if anyone needs it (and I hope I am not violating any license):
from sklearn.utils.validation import check_consistent_length, check_array
def mean_absolute_percentage_error(y_true, y_pred,
sample_weight=None,
multioutput='uniform_average'):
"""Mean absolute percentage error regression loss.
Note here that we do not represent the output as a percentage in range
[0, 100]. Instead, we represent it in range [0, 1/eps]. Read more in the
:ref:`User Guide <mean_absolute_percentage_error>`.
.. versionadded:: 0.24
Parameters
----------
y_true : array-like of shape (n_samples,) or (n_samples, n_outputs)
Ground truth (correct) target values.
y_pred : array-like of shape (n_samples,) or (n_samples, n_outputs)
Estimated target values.
sample_weight : array-like of shape (n_samples,), default=None
Sample weights.
multioutput : {'raw_values', 'uniform_average'} or array-like
Defines aggregating of multiple output values.
Array-like value defines weights used to average errors.
If input is list then the shape must be (n_outputs,).
'raw_values' :
Returns a full set of errors in case of multioutput input.
'uniform_average' :
Errors of all outputs are averaged with uniform weight.
Returns
-------
loss : float or ndarray of floats in the range [0, 1/eps]
If multioutput is 'raw_values', then mean absolute percentage error
is returned for each output separately.
If multioutput is 'uniform_average' or an ndarray of weights, then the
weighted average of all output errors is returned.
MAPE output is non-negative floating point. The best value is 0.0.
But note the fact that bad predictions can lead to arbitarily large
MAPE values, especially if some y_true values are very close to zero.
Note that we return a large value instead of `inf` when y_true is zero.
Examples
--------
>>> from sklearn.metrics import mean_absolute_percentage_error
>>> y_true = [3, -0.5, 2, 7]
>>> y_pred = [2.5, 0.0, 2, 8]
>>> mean_absolute_percentage_error(y_true, y_pred)
0.3273...
>>> y_true = [[0.5, 1], [-1, 1], [7, -6]]
>>> y_pred = [[0, 2], [-1, 2], [8, -5]]
>>> mean_absolute_percentage_error(y_true, y_pred)
0.5515...
>>> mean_absolute_percentage_error(y_true, y_pred, multioutput=[0.3, 0.7])
0.6198...
"""
y_type, y_true, y_pred, multioutput = _check_reg_targets(
y_true, y_pred, multioutput)
check_consistent_length(y_true, y_pred, sample_weight)
epsilon = np.finfo(np.float64).eps
mape = np.abs(y_pred - y_true) / np.maximum(np.abs(y_true), epsilon)
output_errors = np.average(mape,
weights=sample_weight, axis=0)
if isinstance(multioutput, str):
if multioutput == 'raw_values':
return output_errors
elif multioutput == 'uniform_average':
# pass None as weights to np.average: uniform mean
multioutput = None
return np.average(output_errors, weights=multioutput)
def _check_reg_targets(y_true, y_pred, multioutput, dtype="numeric"):
"""Check that y_true and y_pred belong to the same regression task.
Parameters
----------
y_true : array-like
y_pred : array-like
multioutput : array-like or string in ['raw_values', uniform_average',
'variance_weighted'] or None
None is accepted due to backward compatibility of r2_score().
Returns
-------
type_true : one of {'continuous', continuous-multioutput'}
The type of the true target data, as output by
'utils.multiclass.type_of_target'.
y_true : array-like of shape (n_samples, n_outputs)
Ground truth (correct) target values.
y_pred : array-like of shape (n_samples, n_outputs)
Estimated target values.
multioutput : array-like of shape (n_outputs) or string in ['raw_values',
uniform_average', 'variance_weighted'] or None
Custom output weights if ``multioutput`` is array-like or
just the corresponding argument if ``multioutput`` is a
correct keyword.
dtype : str or list, default="numeric"
the dtype argument passed to check_array.
"""
check_consistent_length(y_true, y_pred)
y_true = check_array(y_true, ensure_2d=False, dtype=dtype)
y_pred = check_array(y_pred, ensure_2d=False, dtype=dtype)
if y_true.ndim == 1:
y_true = y_true.reshape((-1, 1))
if y_pred.ndim == 1:
y_pred = y_pred.reshape((-1, 1))
if y_true.shape[1] != y_pred.shape[1]:
raise ValueError("y_true and y_pred have different number of output "
"({0}!={1})".format(y_true.shape[1], y_pred.shape[1]))
n_outputs = y_true.shape[1]
allowed_multioutput_str = ('raw_values', 'uniform_average',
'variance_weighted')
if isinstance(multioutput, str):
if multioutput not in allowed_multioutput_str:
raise ValueError("Allowed 'multioutput' string values are {}. "
"You provided multioutput={!r}".format(
allowed_multioutput_str,
multioutput))
elif multioutput is not None:
multioutput = check_array(multioutput, ensure_2d=False)
if n_outputs == 1:
raise ValueError("Custom weights are useful only in "
"multi-output cases.")
elif n_outputs != len(multioutput):
raise ValueError(("There must be equally many custom weights "
"(%d) as outputs (%d).") %
(len(multioutput), n_outputs))
y_type = 'continuous' if n_outputs == 1 else 'continuous-multioutput'
return y_type, y_true, y_pred, multioutput
You can go with one of these two solutions:
Upgrade your sklearn version
!pip install scikit-learn==0.24
Then,
from sklearn.metrics import mean_absolute_percentage_error
Build your own function to calculate MAPE
def MAPE(y_true, y_pred):
y_true, y_pred = np.array(y_true), np.array(y_pred)
return np.mean(np.abs((y_true - y_pred) / y_true)) * 100
But the problem with the above function is that when you have (0) true value your MAPE will go (inf). So, to solve this problem we use,
def MAPE(y_true, y_pred):
y_true, y_pred = np.array(y_true), np.array(y_pred)
return np.mean(np.abs((y_true - y_pred) / np.maximum(np.ones(len(y_true)), np.abs(y_true))))*100
Just had the same issue. Accessing Anaconda Prompt for the environment that one is working on, and running
pip install scikit-learn
Solved the problem.
It updated scikit-learn's version (at this precise moment it was upgraded to version 1.0.2, but it is present in versions starting at 0.24.), and now one is able to import and run sklearn.metrics.mean_absolute_percentage_error.
Here is an example (Source):
>>> from sklearn.metrics import mean_absolute_percentage_error
>>> y_true = [3, -0.5, 2, 7]
>>> y_pred = [2.5, 0.0, 2, 8]
>>> mean_absolute_percentage_error(y_true, y_pred)
0.3273...
Note: You may want to keep in mind that the MAPE can be problematic, as it may result in divisions by zero (see my answer here).
I find that the gradients computed depend on the interplay of tf.function decorators in the following way.
First I create some synthetic data for a binary classification
tf.random.set_seed(42)
np.random.seed(42)
x=tf.random.normal((2,1))
y=tf.constant(np.random.choice([0,1],2))
Then I define two loss functions that differ only in the tf.function decorator
weights=tf.constant([1.,.1])[tf.newaxis,...]
def customloss1(y_true,y_pred,sample_weight=None):
y_true_one_hot=tf.one_hot(tf.cast(y_true,tf.uint8),2)
y_true_scale=tf.multiply(weights,y_true_one_hot)
return tf.reduce_mean(tf.keras.losses.categorical_crossentropy(y_true_scale,y_pred))
#tf.function
def customloss2(y_true,y_pred,sample_weight=None):
y_true_one_hot=tf.one_hot(tf.cast(y_true,tf.uint8),2)
y_true_scale=tf.multiply(weights,y_true_one_hot)
return tf.reduce_mean(tf.keras.losses.categorical_crossentropy(y_true_scale,y_pred))
Then I make a very simple logistic regression model with all the bells and whistles removed to keep it simple
tf.random.set_seed(42)
np.random.seed(42)
model=tf.keras.Sequential([
tf.keras.layers.Dense(2,use_bias=False,activation='softmax',input_shape=[1,])
])
and finally define two functions to calculate the gradients of the aforementioned loss functions with one being decorated by tf.function and the other not being decorated by it
def get_gradients1(x,y):
with tf.GradientTape() as tape1:
p1=model(x)
l1=customloss1(y,p1)
with tf.GradientTape() as tape2:
p2=model(x)
l2=customloss2(y,p2)
gradients1=tape1.gradient(l1,model.trainable_variables)
gradients2=tape2.gradient(l2,model.trainable_variables)
return gradients1, gradients2
#tf.function
def get_gradients2(x,y):
with tf.GradientTape() as tape1:
p1=model(x)
l1=customloss1(y,p1)
with tf.GradientTape() as tape2:
p2=model(x)
l2=customloss2(y,p2)
gradients1=tape1.gradient(l1,model.trainable_variables)
gradients2=tape2.gradient(l2,model.trainable_variables)
return gradients1, gradients2
Now when I run
get_gradients1(x,y)
I get
([<tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[ 0.11473544, -0.11473544]], dtype=float32)>],
[<tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[ 0.11473544, -0.11473544]], dtype=float32)>])
and the gradients are equal as expected. However when I run
get_gradients2(x,y)
I get
([<tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[ 0.02213785, -0.5065186 ]], dtype=float32)>],
[<tf.Tensor: shape=(1, 2), dtype=float32, numpy=array([[ 0.11473544, -0.11473544]], dtype=float32)>])
where only the second answer is correct. Thus, when my outer function is decorated I only get the correct answer from the inner function that is decorated as well. I was under the impression that decorating the outer one (which is the training loop in many applications) is sufficient but here we see its not. I want to understand why and also then how deep does one have to go to decorate the functions being used?
Added some debugging info
I added some debugging info and I show the code only for customloss2 (the other is identical)
#tf.function
def customloss2(y_true,y_pred,sample_weight=None):
y_true_one_hot=tf.one_hot(tf.cast(y_true,tf.uint8),2)
y_true_scale=tf.multiply(weights,y_true_one_hot)
tf.print('customloss2',type(y_true_scale),type(y_pred))
tf.print('y_true_scale','\n',y_true_scale)
tf.print('y_pred','\n',y_pred)
return tf.reduce_mean(tf.keras.losses.categorical_crossentropy(y_true_scale,y_pred))
and on running get_gradients1 I get
customloss1 <type 'EagerTensor'> <type 'EagerTensor'>
y_true_scale
[[1 0]
[0 0.1]]
y_pred
[[0.510775387 0.489224613]
[0.529191136 0.470808864]]
customloss2 <class 'tensorflow.python.framework.ops.Tensor'> <class 'tensorflow.python.framework.ops.Tensor'>
y_true_scale
[[1 0]
[0 0.1]]
y_pred
[[0.510775387 0.489224613]
[0.529191136 0.470808864]]
we see that the tensors for customloss1 are Eager but for customloss2 are Tensor and yet we get same value for gradients.
On the other hand when I run it on get_gradients2
customloss1 <class 'tensorflow.python.framework.ops.Tensor'> <class 'tensorflow.python.framework.ops.Tensor'>
y_true_scale
[[1 0]
[0 0.1]]
y_pred
[[0.510775387 0.489224613]
[0.529191136 0.470808864]]
customloss2 <class 'tensorflow.python.framework.ops.Tensor'> <class 'tensorflow.python.framework.ops.Tensor'>
y_true_scale
[[1 0]
[0 0.1]]
y_pred
[[0.510775387 0.489224613]
[0.529191136 0.470808864]]
we see everything is identical with no tensors being Eager and yet I get different gradients!
This is a somewhat complicated issue, but it has an explanation. The problem lies within the function tf.keras.backend.categorical_crossentropy, which has a different behavior depending on whether you are running on eager or graph (tf.function) mode.
The function considers three possible situations. The first one is that you pass from_logits=True, in which case it just calls tf.nn.softmax_cross_entropy_with_logits:
if from_logits:
return nn.softmax_cross_entropy_with_logits_v2(
labels=target, logits=output, axis=axis)
If you give from_logits=False, which is the most common in Keras, since the output layer for categorical classification is generally a softmax, then it considers two possibilities. The first is that, if the given output value comes from a softmax operation, then it can just use the input to that operation and call tf.nn.softmax_cross_entropy_with_logits, which is preferred to compute the actual cross entropy with the softmax values because it prevents "saturated" results. However, this can only be done in graph mode, because eager mode tensors do not keep track of the operation it produced them, nevermind the inputs to that operation.
if not isinstance(output, (ops.EagerTensor, variables_module.Variable)):
output = _backtrack_identity(output)
if output.op.type == 'Softmax':
# When softmax activation function is used for output operation, we
# use logits from the softmax function directly to compute loss in order
# to prevent collapsing zero when training.
# See b/117284466
assert len(output.op.inputs) == 1
output = output.op.inputs[0]
return nn.softmax_cross_entropy_with_logits_v2(
labels=target, logits=output, axis=axis)
The last case is when you have given from_logits=False and either you are in eager mode or the given output tensor does not directly come from a softmax operation, in which case the only option is to compute the cross entropy from the softmax value.
# scale preds so that the class probas of each sample sum to 1
output = output / math_ops.reduce_sum(output, axis, True)
# Compute cross entropy from probabilities.
epsilon_ = _constant_to_tensor(epsilon(), output.dtype.base_dtype)
output = clip_ops.clip_by_value(output, epsilon_, 1. - epsilon_)
return -math_ops.reduce_sum(target * math_ops.log(output), axis)
The problem is that, even though these are mathematically equivalent ways to compute the cross entropy, they do not have the same precision. They are pretty much the same when logits are small, but if they get big they can diverge a lot. Here is a simple test:
import tensorflow as tf
#tf.function
def test_keras_xent(y, p, from_logits=False, mask_op=False):
# p is always logits
if not from_logits:
# Compute softmax if not using logits
p = tf.nn.softmax(p)
if mask_op:
# A dummy addition prevents Keras from detecting that
# the value comes from a softmax operation
p = p + tf.constant(0, p.dtype)
return tf.keras.backend.categorical_crossentropy(y, p, from_logits=from_logits)
# Test
tf.random.set_seed(0)
y = tf.constant([1., 0., 0., 0.])
# Logits in [0, 1)
p = tf.random.uniform([4], minval=0, maxval=1)
tf.print(test_keras_xent(y, p, from_logits=True))
# 1.50469065
tf.print(test_keras_xent(y, p, from_logits=False, mask_op=False))
# 1.50469065
tf.print(test_keras_xent(y, p, from_logits=False, mask_op=True))
# 1.50469065
# Logits in [0, 10)
p = tf.random.uniform([4], minval=0, maxval=10)
tf.print(test_keras_xent(y, p, from_logits=True))
# 3.47569656
tf.print(test_keras_xent(y, p, from_logits=False, mask_op=False))
# 3.47569656
tf.print(test_keras_xent(y, p, from_logits=False, mask_op=True))
# 3.47569656
# Logits in [0, 100)
p = tf.random.uniform([4], minval=0, maxval=100)
tf.print(test_keras_xent(y, p, from_logits=True))
# 68.0106506
tf.print(test_keras_xent(y, p, from_logits=False, mask_op=False))
# 68.0106506
tf.print(test_keras_xent(y, p, from_logits=False, mask_op=True))
# 16.1180954
Taking your example:
import tensorflow as tf
tf.random.set_seed(42)
x = tf.random.normal((2, 1))
y = tf.constant(np.random.choice([0, 1], 2))
y1h = tf.one_hot(y, 2, dtype=x.dtype)
model = tf.keras.Sequential([
# Linear activation because we want the logits for testing
tf.keras.layers.Dense(2, use_bias=False, activation='linear', input_shape=[1,])
])
p = model(x)
tf.print(test_keras_xent(y1h, p, from_logits=True))
# [0.603375256 0.964639068]
tf.print(test_keras_xent(y1h, p, from_logits=False, mask_op=False))
# [0.603375256 0.964639068]
tf.print(test_keras_xent(y1h, p, from_logits=False, mask_op=True))
# [0.603375256 0.964638948]
The results here are almost identical, but you can see there is a small difference in the second value. This has in turn an effect (probably in amplified) in the computed gradients, which of course are as well "equivalent" mathematical expression but with different precision properties.
It turns out this is a bug and I have raised it here.
I wanted to use a FCN (kind of U-Net) in order to make some semantic segmentation.
I performed it using Python & Keras based on Tensorflow backend. Now I have good results, I'm trying to improve them, and I think one way to do such a thing is by improving my loss computation.
I know that in my output, the several classes are imbalanced, and using the default categorical_crossentropy function can be a problem.
My model inputs and outputs are both in the float32 format, input are channel_first and output and channel_last (permutation done at the end of the model)
In the binary case, when I only want to segment one class, I have change the loss function in this way so it can add the weights case by case depending on the content of the output :
def weighted_loss(y_true, y_pred):
def weighted_binary_cross_entropy(y_true, y_pred):
w = tf.reduce_sum(y_true)/tf_cast(tf_size(y_true), tf_float32)
real_th = 0.5-th
tf_th = tf.fill(tf.shape(y_pred), real_th)
tf_zeros = tf.fill(tf.shape(y_pred), 0.)
return (1.0 - w) * y_true * - tf.log(tf.maximum(tf.zeros, tf.sigmoid(y_pred) + tf_th)) +
(1- y_true) * w * -tf.log(1 - tf.maximum(tf_zeros, tf.sigmoid(y_pred) + tf_th))
return weighted_binary_coss_entropy
Note that th is the activation threshold which by default is 1/nClasses and which I have changed in order to see what value gives me the best results
What do you think about it?
What about change it so it will be able to compute the weighted categorical cross entropy (in the case of multi-class)
Your implementation will work for binary classes , for multi class it will just be
-y_true * tf.log(tf.sigmoid(y_pred))
and use inbuilt tensorflow method for calculating categorical entropy as it avoids overflow for y_pred<0
you can view this answer Unbalanced data and weighted cross entropy ,it explains weighted categorical cross entropy implementation.
The only change for categorical_crossentropy would be
def weighted_loss(y_true, y_pred):
def weighted_categorical_cross_entropy(y_true, y_pred):
w = tf.reduce_sum(y_true)/tf_cast(tf_size(y_true), tf_float32)
loss = w * tf.nn.softmax_cross_entropy_with_logits(onehot_labels, logits)
return loss
return weighted_categorical_cross_entropy
extracting prediction for individual class
def loss(y_true, y_pred):
s = tf.shape(y_true)
# if number of output classes is at last
number_classses = s[-1]
# this will give you one hot code for your prediction
clf_pred = tf.one_hot(tf.argmax(y_pred, axis=-1), depth=number_classses, axis=-1)
# extract the values of y_pred where y_pred is max among the classes
prediction = tf.where(tf.equal(clf_pred, 1), y_pred, tf.zeros_like(y_pred))
# if one hotcode == 1 then class1_prediction == y_pred else class1_prediction ==0
class1_prediction = prediction[:, :, :, 0:1]
# you can compute your loss here on individual class and return the loss ,just for simplicity i am returning the class1_prediction
return class1_prediction
output from model
y_pred = [[[[0.5, 0.3, 0.7],
[0.6, 0.3, 0.2]]
,
[[0.7, 0.9, 0.6],
[0.3 ,0.9, 0.3]]]]
corresponding ground truth
y_true = [[[[0, 1, 0],
[1 ,0, 0]]
,
[[1,0 , 0],
[0,1, 0]]]]
prediction for class 1
prediction = loss(y_true, y_pred)
# prediction = [[[[0. ],[0.6]],[0. ],[0. ]]]]
Any advice is welcome as this is an ambitious second coding project. :)
Specifically, I'm having two different issues with this DNN.
I can only seem to get it to run 1 of 100 evaluation steps and,
Trouble getting meaningful predictions.
At some point it was running all 100 steps of evaluation. I cannot seem to replicate that now for anything. What am I missing?
The data set is for a dice game. The predictions I'm looking for would be in an array of the same shape as the features and labels with a binary prediction for each position in the array.
I have tried different array shapes and depths to the point that I'm all turned around. Perhaps a different estimator is the solution? It throws a features dictionary '1' not found if I try to feed one feature/label combination to the predictor; it demands the same set size as the training and test sets.
Is there a way to return predictions in this way?
Example:
predict_feature = {'0': [1, 2, 5, 1, 4, 3]} #1's and 5's would be 'keepers'
predict_label = np.array([1, 0, 1, 1, 0, 0])
desired output = np.array[.91, .12, .89, .92, .06, .15]
The features are generated randomly and labels are created via scoring algorithm from the game. They are passed through the below to create the features dictionary and put labels into an array. Similar functions create the evaluation and prediction sets.
def train_evaluation_set(features, labels):
"""Creates training input set"""
feature = {}
features = [[digit for digit in features[x]] for x in range(len(features))]
for x in range(len(features)):
feature.update({"{}".format(x): features[x]})
label = np.array(labels)
return feature, label
Tensors are then created.
def train_input_fn(feature, label, batch_size):
"""Input function for training"""
dataset = tf.data.Dataset.from_tensor_slices((dict(feature), label))
dataset = dataset.shuffle(shuffle_x).repeat().batch(100)
iterator = dataset.make_one_shot_iterator()
feature, label = iterator.get_next()
return feature, label
The estimator is set up thusly:
def main(main=None, argv=None):
# Set feature columns.
my_feature_columns = []
for key in feature.keys():
my_feature_columns.append(tf.feature_column.numeric_column(key=key))
# Instantiate estimator.
classifier = tf.estimator.DNNClassifier(
feature_columns=my_feature_columns,
hidden_units=[100, 100, 100],
n_classes=2)
# Train the Model.
classifier.train(
input_fn=lambda: train_input_fn(feature, label, batch_size),
steps=train_steps)
# Evaluate the model.
eval_result = classifier.evaluate(
input_fn=lambda: eval_input_fn(test_feature, test_label, batch_size),
steps=200)
print('\nTest set accuracy: {accuracy:0.3f}\n'.format(**eval_result))
# Generate predictions from the model
predictions = classifier.predict(
input_fn=lambda: predict_input_fn(predict_feature, predict_label[0]))
pp.pprint(next(predictions))
From here the training runs smoothly and one evaluation step is completed.
INFO:tensorflow:Loss for final step: 0.00292182.
WARNING:tensorflow:Casting <dtype: 'float32'> labels to bool.
WARNING:tensorflow:Casting <dtype: 'float32'> labels to bool.
INFO:tensorflow:Starting evaluation at 2018-02-20-09:06:14
INFO:tensorflow:Restoring parameters from C:\Users\Paul\AppData\Local\Temp\tmp97u0tbvx\model.ckpt-1000
INFO:tensorflow:Evaluation [1/200]
INFO:tensorflow:Finished evaluation at 2018-02-20-09:06:19
INFO:tensorflow:Saving dict for global step 1000: accuracy = 0.666667, accuracy_baseline = 0.833333, auc = 0.8, auc_precision_recall = 0.25, average_loss = 0.623973, global_step = 1000, label/mean = 0.166667, loss = 3.74384, prediction/mean = 0.216801
Test set accuracy: 0.667
I have a suspicion that the WARNING steps are where my problem with the prediction lies even though the labels have already been booled, but have no clue what to do about it.
And, finally, pretty print gives me:
{'class_ids': array([1], dtype=int64),
'classes': array([b'1'], dtype=object),
'logistic': array([ 0.70525986], dtype=float32),
'logits': array([ 0.87247205], dtype=float32),
'probabilities': array([ 0.2947402 , 0.70525986], dtype=float32)}
Full code can be found at https://github.com/llpk79/DNNTenThousand