I want to make a weighted metric and print it out as Keras trains on my data, but I have failed to find any working examples of how to do this.
When running:
metrics = [MyClass.MyWeightedMetric]
model.compile(optimizer=RMSprop, loss="mean_squared_error", metrics=metrics)
where
class MyClass:
    @staticmethod
    def MyWeightedMetric(y_true, y_pred, sample_weight=None):
        print(sample_weight)
        # do stuff that doesn't even use sample_weight for now
then it prints None all the time. If I change my compile call to
model.compile(optimizer=RMSprop, loss="mean_squared_error", weighted_metrics=metrics)
then I get the errors:
(0) Invalid argument: Can not squeeze dim[0], expected a dimension of 1, got 16384
when calling model.fit(). It's not clear to me what I'm doing wrong or what I should be doing instead. I tried making a subclass as in this example https://www.tensorflow.org/api_docs/python/tf/keras/metrics/Metric but that came with its own set of errors that I could not resolve.
Is there a working example somewhere of weighted metrics in Keras that I can provide at the model compile stage? I find many examples of unweighted usage, but a working weighted-metric example seems impossible to find at the moment. For what it's worth, I'm using TensorFlow 2.0.
When you pass weighted_metrics to model.compile, the model expects a sample_weight column vector in order to compute the weighted metric. In your case, it seems like you are passing a flattened tensor; the shape should be (16384, 1) instead of (16384,). Further information:
https://www.tensorflow.org/guide/keras/train_and_evaluate#using_a_validation_dataset
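For illustration, a minimal sketch of that setup (the toy model and data are made up; the custom metric just computes a per-sample squared error): the metric function only receives y_true and y_pred, Keras applies the sample weights to it because it is listed under weighted_metrics, and the weights are supplied as a column vector alongside the data.

import numpy as np
import tensorflow as tf

def my_weighted_metric(y_true, y_pred):
    # Returns a per-sample value; because this function is listed under
    # weighted_metrics, Keras applies sample_weight when averaging it.
    return tf.reduce_mean(tf.square(y_true - y_pred), axis=-1)

model = tf.keras.Sequential([tf.keras.layers.Dense(1, input_shape=(4,))])
model.compile(optimizer="rmsprop",
              loss="mean_squared_error",
              weighted_metrics=[my_weighted_metric])

x = np.random.rand(16384, 4).astype("float32")
y = np.random.rand(16384, 1).astype("float32")
w = np.random.rand(16384, 1).astype("float32")  # column vector, not (16384,)

dataset = tf.data.Dataset.from_tensor_slices((x, y, w)).batch(128)
model.fit(dataset, epochs=1)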
I'm running a sequence-to-sequence model in TensorFlow, for which I need to pad my input data extensively (the samples' lengths vary). As a result, any metric calculated is heavily biased: if the true sample is ~10% of the input to the model, the errors appearing there are somewhat hidden by "correct" predictions on the padded part.
Thus, I'd like to calculate "true" metrics (accuracy, AUC or whatever) that take only the real sample into account. In numpy-ish code, I'd like to do something like this:
def adjusted_metrics(y_true, y_pred):
    # y is padded with 0 and there is another value at the end of the real y
    last_index = np.nonzero(y_true)[0][-1]
    return AUC(y_true[:last_index], y_pred[:last_index])
But, I'm pretty new to tensorflow and:
I can't do that in TensorFlow code. In particular, I'm not able to find the index of the last nonzero element of y_true when it is a Tensor. I tried casting to numpy using tensorflow.experimental.numpy (no effect, actually; it still appears as a Tensor?) and calling .numpy() on the tensor (not working, despite the fact that I don't have eager execution disabled). I also tried masking, but it's hard for me to find the mask dimensions, partly because of the following point:
All my attempts also seem inappropriate in the context of batches: y_true and y_pred are of shape (None, max_length). I suppose the batching of the calculation is governed by my model, but I have no idea how (and whether it's possible) to change the metric calculation to be done per sample while keeping the whole learning process in batches.
Any advice? :)
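For what it's worth, a minimal sketch of one possible approach (my own assumptions: labels are padded with 0 after the real sequence, the real part ends just before the last nonzero entry as in the numpy snippet above, and y_true / y_pred have shape (batch, max_length)): instead of slicing the padding off, give the padded positions zero weight, which works per sample while training stays batched.

import tensorflow as tf

class MaskedAUC(tf.keras.metrics.AUC):
    # AUC that ignores the padded tail of each sequence by giving it weight 0.
    def update_state(self, y_true, y_pred, sample_weight=None):
        max_length = tf.shape(y_true)[1]
        nonzero = tf.cast(tf.not_equal(y_true, 0), tf.int32)
        # Per-row index of the last nonzero label (the last_index above).
        last_index = tf.reduce_max(tf.range(max_length) * nonzero, axis=1)
        # Weight 1.0 for positions before last_index (mirrors y_true[:last_index]),
        # 0.0 for the end marker and the padding.
        mask = tf.cast(tf.sequence_mask(last_index, maxlen=max_length), self.dtype)
        return super().update_state(y_true, y_pred, sample_weight=mask)

# Usage: model.compile(..., metrics=[MaskedAUC(name="masked_auc")])

The same idea (building a mask and passing it as sample_weight) should carry over to other built-in metrics such as accuracy.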
In the current version of TF (2.2.0) there is an option to do multi-class classification (i.e., more than two classes) by changing n_classes to the relevant number in the estimator params.
However, all the previous examples I have seen, for example the official one here:
https://www.tensorflow.org/tutorials/estimator/boosted_trees_model_understanding
present binary classification, so I'm not sure what to do with the target (class) vector.
If I keep it in the range [0, ..., num_classes-1], then when I try to train the model I get the error (it comes from TF's gradients.py file): "'int' object has no attribute 'is_compatible_with'". It feels like a dimension/shape error with respect to the class vector, but I couldn't find the default loss function or what this model expects to get. I don't think I need to convert the class vector to a binary matrix (one-hot encoding). Appreciate any help!
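For reference, a rough sketch of the kind of setup being described (the toy data and parameter values here are made up; as the answer below notes, this combination hit a TF bug at the time):

import numpy as np
import tensorflow as tf

NUM_CLASSES = 3  # more than two classes

def train_input_fn():
    features = {
        "f1": np.random.rand(256).astype(np.float32),
        "f2": np.random.rand(256).astype(np.float32),
    }
    # Integer class labels kept in [0, NUM_CLASSES - 1], no one-hot encoding.
    labels = np.random.randint(0, NUM_CLASSES, size=(256,)).astype(np.int32)
    return tf.data.Dataset.from_tensor_slices((features, labels)).batch(256)

feature_columns = [tf.feature_column.numeric_column(name) for name in ("f1", "f2")]

est = tf.estimator.BoostedTreesClassifier(
    feature_columns=feature_columns,
    n_batches_per_layer=1,
    n_classes=NUM_CLASSES)  # the estimator param changed for multi-class

est.train(train_input_fn, max_steps=10)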
Indeed, when I changed the TF code manually, everything worked.
I then found out that there is a bug report on the issue here:
https://github.com/tensorflow/tensorflow/issues/40063
I'm using Keras to solve a multi-class problem. My data is very unbalanced, so I'm trying to create something similar to a confusion matrix. My dataset is very large and saved as HDF5, so I use HDF5Matrix to fetch X and Y, which makes scikit-learn's confusion matrix irrelevant (as far as I know).
I've seen that it is possible to save the predictions and true labels, or to output the error per label; however, a more elegant solution would be to create a multi-dimensional metric that accumulates the (predicted, true) label pairs (sort of like a confusion matrix).
I have used the following callback to try to peek into what's going on per batch / epoch:
from keras.callbacks import LambdaCallback

batch_print_callback = LambdaCallback(
    on_batch_end=lambda batch, logs: print(logs),
    on_epoch_end=lambda epoch, logs: print(logs))
but it only accumulates a single value (usually an average of sorts).
I've also tried to see whether it's possible to return y_pred / y_true as follows (to see if I can print a multi-dimensional value in the logs):
def pred(y_true, y_pred):
    return y_pred

def true(y_true, y_pred):
    return y_true
However, it doesn't return a multi-dimensional value as I expected.
So basically, my question is: can I use Keras to accumulate a multi-dimensional metric?
Well, to the best of my knowledge it is not possible, since K.mean is applied before the value of the tensor is returned. I posted an issue about this on the Keras GitHub.
The best design I came up with is a metric for each cell in the confusion matrix, plus a callback that collects them, inspired by the thread mentioned in the question.
A sort-of working solution can be found here.
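For illustration, a rough sketch of that design (the names and details here are mine, not taken from the linked solution; it assumes one-hot encoded labels): one scalar metric per (true, predicted) cell that returns the per-batch count, plus a callback that collects those values from the logs at the end of every epoch.

import keras.backend as K
from keras.callbacks import Callback

NUM_CLASSES = 3  # example value

def make_cell_metric(true_class, pred_class):
    # One metric per confusion-matrix cell: counts samples in the batch whose
    # true label is `true_class` and whose predicted label is `pred_class`.
    def cell(y_true, y_pred):
        t = K.cast(K.equal(K.argmax(y_true, axis=-1), true_class), K.floatx())
        p = K.cast(K.equal(K.argmax(y_pred, axis=-1), pred_class), K.floatx())
        return K.sum(t * p)
    cell.__name__ = 'cm_%d_%d' % (true_class, pred_class)
    return cell

cell_metrics = [make_cell_metric(i, j)
                for i in range(NUM_CLASSES) for j in range(NUM_CLASSES)]

class ConfusionMatrixCollector(Callback):
    # Pulls the per-cell values out of the logs at the end of each epoch.
    # Keras averages metric values over batches, so these are batch-averaged
    # counts rather than exact totals.
    def on_epoch_end(self, epoch, logs=None):
        logs = logs or {}
        matrix = [[logs.get('cm_%d_%d' % (i, j), 0.0)
                   for j in range(NUM_CLASSES)]
                  for i in range(NUM_CLASSES)]
        print('confusion matrix:', matrix)

# Usage: model.compile(..., metrics=cell_metrics)
#        model.fit(..., callbacks=[ConfusionMatrixCollector()])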
The first time I used TensorFlow on the MNIST dataset, I had a really simple bug where I forgot to take the mean of my error values before passing them to the optimizer.
In other words, instead of
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_))
I accidentally used
loss = tf.nn.softmax_cross_entropy_with_logits(logits=y, labels=y_)
Not taking the mean or sum of the error values threw no errors when training the network, however. This got me thinking: is there actually a case where someone would need to pass multiple loss values into an optimizer? And what was happening when I passed a Tensor that was not of size [1] into minimize()?
They are being added up. This is a side-product of TensorFlow using reverse-mode automatic differentiation, which requires the loss to be a scalar.
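A quick way to see this (a TF 2.x eager sketch of the same behaviour): differentiating a non-scalar "loss" gives the same gradients as differentiating its sum.

import tensorflow as tf

w = tf.Variable([1.0, 2.0, 3.0])

with tf.GradientTape() as tape_vec:
    vector_loss = w * w                 # shape (3,), not a scalar

with tf.GradientTape() as tape_sum:
    scalar_loss = tf.reduce_sum(w * w)  # explicit scalar

# The per-element gradients are implicitly added up, so both prints
# show [2. 4. 6.].
print(tape_vec.gradient(vector_loss, w).numpy())
print(tape_sum.gradient(scalar_loss, w).numpy())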
I'm trying to write something similar to Google's wide and deep learning model after running into difficulties doing multi-class classification (12 classes) with the sklearn API. I've tried to follow the advice in a couple of posts and used tf.group(logistic_regression_optimizer, deep_model_optimizer). It seems to work, but I'm trying to figure out how to get predictions out of this model. I'm hoping that with the tf.group operator the model is learning to weight the logistic and deep models differently, but I don't know how to get these weights out so I can get the right combination of the two models' predictions. Thanks in advance for any help.
https://groups.google.com/a/tensorflow.org/forum/#!topic/discuss/Cs0R75AGi8A
How to set layer-wise learning rate in Tensorflow?
tf.group() creates a node that forces a list of other nodes to run using control dependencies. It's really just a handy way to package up logic that says "run this set of nodes, and I don't care about their output". In the discussion you point to, it's just a convenient way to create a single train_op from a pair of training ops.
If you're interested in the value of a Tensor (e.g., weights), you should pass it to session.run() explicitly, either in the same call as the training step, or in a separate session.run() invocation. You can pass a list of values to session.run(), for example, your tf.group() expression, as well as a Tensor whose value you would like to compute.
Hope that helps!
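For illustration, a minimal sketch of both points (written against tf.compat.v1 in TF 1.x graph/session style; the two-optimizer setup is a made-up stand-in for the wide and deep parts):

import tensorflow.compat.v1 as tf
tf.disable_eager_execution()

x = tf.placeholder(tf.float32, shape=[None, 1])
y = tf.placeholder(tf.float32, shape=[None, 1])

w_wide = tf.Variable(tf.zeros([1, 1]), name="wide_weights")
w_deep = tf.Variable(tf.zeros([1, 1]), name="deep_weights")
pred = tf.matmul(x, w_wide) + tf.matmul(x, w_deep)
loss = tf.reduce_mean(tf.square(pred - y))

wide_op = tf.train.FtrlOptimizer(0.1).minimize(loss, var_list=[w_wide])
deep_op = tf.train.AdamOptimizer(0.01).minimize(loss, var_list=[w_deep])
train_op = tf.group(wide_op, deep_op)   # one node that runs both updates

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # Run the grouped training step and fetch the weight tensors in one call.
    _, wide_value, deep_value = sess.run(
        [train_op, w_wide, w_deep],
        feed_dict={x: [[1.0], [2.0]], y: [[2.0], [4.0]]})
    print(wide_value, deep_value)

Note that tf.group() itself returns no value, which is why the weights are fetched as separate items in the session.run() list.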