Balanced Error Rate as metric function - python

I am trying to solve a binary classification problem with the Sequential model from Keras
and have to meet a given Balanced Error Rate (BER).
So I thought it would be a good idea to use the BER instead of accuracy as a metric.
My custom metric implementation for BER looks like this:
def balanced_error_rate(y_true, y_pred):
    labels = theano.shared(np.asmatrix([[0, 1]], dtype='int8'))
    label_matrix = K.repeat_elements(labels, K.shape(y_true)[0], axis=1)
    true_matrix = K.repeat_elements(y_true, K.shape(labels)[0], axis=1)
    pred_matrix = K.repeat_elements(K.round(y_pred), K.shape(labels)[0], axis=1)
    class_lens = K.sum(K.equal(label_matrix, true_matrix), axis=1)
    return K.sum(K.sum(class_lens - K.sum(K.equal(label_matrix, K.not_equal(true_matrix,pred_matrix)), axis=1), axis=0)/class_lens, axis=0)/2
The idea is to create a matrix from the available labels and compare it to the input data (then sum the ones) to get the number of elements of this label....
My problem is that:
> K.shape(y_true)
Shape.0
> Typeinfo:
> type(y_true)
<class 'theano.tensor.var.TensorVariable'>
> type(K.shape(y_true))
<class 'theano.tensor.var.TensorVariable'>
...and I can't find out why.
I am now looking for:
A way to get the array dimensions / an explanation why shape acts like it does / the reason why y_true seems to have 0 dimensions
or
A method to create a tensor matrix with a given width/height by repeating a given row/column vector.
or
A smarter solution to calculate the BER using tensor functions.

A way to get the array dimensions / an explanation why shape acts like it does / the reason why y_true seems to have 0 dimensions
The deal with print and abstraction libraries like Theano is that you usually do not get the values but a representation of the value. So if you do
print(foo.shape)
You won't get the actual shape but a representation of the operation that is done at runtime. Since this is all computed on an external device the computation is not run immediately but only after creating a function with appropriate inputs (or calling foo.shape.eval()).
Another way to print the value is to use theano.printing.Print when using the value, e.g.:
shape = theano.printing.Print('shape of foo')(foo.shape)
# use shape (not foo.shape!)
A method to create a tensor matrix with a given width/height by repeating a given row/column vector.
See theano.tensor.repeat for that. Example in numpy (usage is quite similar):
>>> x
array([[1, 2, 3]])
>>> x.repeat(3, axis=0)
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])
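A smarter solution to calculate the BER using tensor functions.
One possible approach is to skip the label matrix entirely and average the per-class error rates. The sketch below is written in NumPy so it can be run directly; the function body and the sample data are assumptions made for illustration, not the asker's code. It only uses elementwise comparison, rounding and means, which have counterparts such as K.equal, K.round and K.sum in the Keras backend. It assumes labels in {0, 1} and that both classes occur in the batch.

```python
import numpy as np

def balanced_error_rate(y_true, y_pred):
    """Average of the per-class error rates for binary labels in {0, 1}."""
    y_true = np.asarray(y_true).ravel()
    # Turn probabilities into hard 0/1 predictions.
    y_hat = np.round(np.asarray(y_pred, dtype=float)).ravel()
    errors = []
    for label in (0, 1):
        in_class = y_true == label
        # Fraction of the samples of this class that were misclassified.
        # Assumption: the class is present, otherwise this is nan.
        errors.append(np.mean(y_hat[in_class] != label))
    return float(np.mean(errors))

# Hypothetical data: 4 negatives (1 misclassified), 2 positives (1 misclassified).
y_true = np.array([0, 0, 0, 0, 1, 1])
y_pred = np.array([0.1, 0.2, 0.7, 0.4, 0.9, 0.3])
ber = balanced_error_rate(y_true, y_pred)
print(ber)  # 0.375 = (1/4 + 1/2) / 2
```

Averaging per-class rates is what makes the metric insensitive to class imbalance, which plain accuracy is not.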

How can I apply np.apply_along_axis to combination of two arrays?

Arrays of object labels and of distances to those objects are given. I want to apply kNN to predict the label, and I want to use np.bincount for that. However, I don't understand how to use it.
Here is an example:
labels = [[1,1,2,0,0,3,3,3,5,1,3],
          [1,1,2,0,0,3,3,3,5,1,3]]
weights= [[0,0,0,0,0,0,0,0,1,0,0],
          [0,0,0,0,0,0,0,0,1,0,0]]
Imagine 10 nearest neighbors for 2 objects are given and their labels and distances are given above. So I want the output as [5,5], because only neighbours with that label have nonzero weight. I am doing the next thing:
eps = 1e-5
lab_weight = np.array(list(zip(labels, weights)))
predict = np.apply_along_axis(lambda x: np.bincount(x[0], weights=x[1]).argmax(), 2, lab_weight)
I expect that x will correspond to [[1,1,2,0,0,3,3,3,5,1,3], [0,0,0,0,0,0,0,0,1,0,0]], but it doesn't. Other axis values do not work either. How can I achieve the goal? I want to use numpy functions and avoid Python loops.
The following solution gives me the desired result:
labels = [[1,1,2,0,0,3,3,3,5,1,3],
[1,1,2,0,0,3,3,3,5,1,3]]
weights= [[0,0,0,0,0,0,0,0,1,0,0],
[0,0,0,0,0,0,0,0,1,0,0]]
length = len(labels[0])
lab_weight = np.hstack((labels, weights))
predict = np.apply_along_axis(lambda x: np.bincount(x[:length], weights=x[length:]).argmax(), 1, lab_weight)
The problem with your code is that you attempt to apply your
function to 2-D slices of your array, whereas apply_along_axis
applies the given function to 1-D slices.
So your code generates an exception: ValueError: object of too small
depth for desired array.
To apply your function to 2-D slices, use a list comprehension based on
np.rollaxis and then create a Numpy array from it:
result = np.array([ np.bincount(x[0], weights=x[1]).argmax()
                    for x in np.rollaxis(lab_weight, 2) ])
The result, for your array, is:
array([1, 1, 2, 0, 0, 3, 3, 3, 5, 1, 3], dtype=int64)
To trace, for each iteration, the source array, intermediate results
and the final result, run:
i = 0
for x in np.rollaxis(lab_weight, 2):
    print(f' i: {i}\n{x}'); i += 1
    bc = np.bincount(x[0], weights=x[1])
    bcm = bc.argmax()
    print(bc, bcm)
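If the goal is one prediction per object ([5, 5] for the data in the question), one possible loop-free approach is to shift each row's labels into its own block of bins and call np.bincount once on the flattened data. This is a minimal sketch rather than part of the answers above: the helper name weighted_vote is hypothetical, and it assumes non-negative integer labels and a rectangular (objects x neighbors) layout.

```python
import numpy as np

labels = np.array([[1, 1, 2, 0, 0, 3, 3, 3, 5, 1, 3],
                   [1, 1, 2, 0, 0, 3, 3, 3, 5, 1, 3]])
weights = np.array([[0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
                    [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0]], dtype=float)

def weighted_vote(labels, weights):
    """Return, for every row, the label with the largest total weight."""
    n_rows = labels.shape[0]
    n_classes = labels.max() + 1
    # Give every row its own block of n_classes bins.
    offsets = np.arange(n_rows)[:, None] * n_classes
    counts = np.bincount((labels + offsets).ravel(),
                         weights=weights.ravel(),
                         minlength=n_rows * n_classes)
    # One row of summed weights per object, then pick the heaviest label.
    return counts.reshape(n_rows, n_classes).argmax(axis=1)

predict = weighted_vote(labels, weights)
print(predict)  # [5 5]
```

The single bincount call replaces the per-row lambda, so no apply_along_axis or Python loop is needed.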

How to implement tf.argmax on our own?

I want a function that takes a tensor as input and returns the index of the largest value along an axis of the tensor. I know there is a function tf.argmax() that does exactly that, but how do I implement it on my own (this may be necessary when implementing some custom function)?
Let us suppose for now the function takes as input only 1D tensor. So, the function needs to be of following signature:
argmax(
    input, #input is a 1D tensor
    name=None
)
I tried implementing it this way:
def argmax(input, name=None):
    maxValue=0
    maxIndex=0
    for i in range(input.get_shape()[0]):
        if input[i]>maxValue:
            maxValue=input[i]
            maxIndex=i
    return maxIndex
However, this does not work: during the graph construction phase the values are not yet initialized, so I cannot compare two values as I did in the code above. So, is there a way to write our own custom functions like tf.argmax, tf.equal, etc.?
Well, one simple way would be this:
idx = tf.where(tf.equal(input, tf.reduce_max(input)))[0, 0]
Example:
import tensorflow as tf
with tf.Session() as sess:
    input = tf.constant([1, 3, 4, 2, 1, 2])
    idx = tf.where(tf.equal(input, tf.reduce_max(input)))[0, 0]
    print(sess.run(idx))
Output:
2
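The same "compare against the maximum, then take the first match" idea can be checked without a session. The following is a NumPy analogue given only as an illustration: the function name argmax_1d is hypothetical, and np.flatnonzero stands in for the tf.where(...)[0, 0] step above.

```python
import numpy as np

def argmax_1d(values):
    """Index of the first occurrence of the largest value in a 1-D array."""
    values = np.asarray(values)
    # Boolean mask of positions equal to the maximum ...
    is_max = values == values.max()
    # ... and the index of the first True entry.
    return int(np.flatnonzero(is_max)[0])

idx = argmax_1d([1, 3, 4, 2, 1, 2])
print(idx)  # 2
```

Because it is built only from whole-array operations (max, equality, index lookup), nothing has to be compared element by element in Python, which is what made the loop version fail at graph construction time.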

Numpy [...,None]

I have found myself needing to add features to existing numpy arrays which has led to a question around what the last portion of the following code is actually doing:
np.ones(shape=feature_set.shape)[...,None]
Set-up
As an example, let's say I wish to solve for linear regression parameter estimates by using numpy and solving:
Assume I have a feature set shape (50,1), a target variable of shape (50,), and I wish to use the shape of my target variable to add a column for intercept values.
It would look something like this:
# Create random target & feature set
y_train = np.random.randint(0,100, size = (50,))
feature_set = np.random.randint(0,100,size=(50,1))
# Build a set of 1s after shape of target variable
int_train = np.ones(shape=y_train.shape)[...,None]
# Able to then add int_train to feature set
X = np.concatenate((int_train, feature_set),1)
What I Think I Know
I see the difference in output when I include [...,None] vs when I leave it off. Here it is:
The second version returns an error around input arrays needing the same number of dimensions, and eventually I stumbled on the solution to use [...,None].
Main Question
While I see the output of [...,None] gives me what I want, I am struggling to find any information on what it is actually supposed to do. Can anybody walk me through what this code actually means, what the None argument is doing, etc?
Thank you!
The slice of [..., None] consists of two "shortcuts":
The ellipsis literal component:
The dots (...) represent as many colons as needed to produce a complete indexing tuple. For example, if x is a rank 5 array (i.e., it has 5 axes), then
x[1,2,...] is equivalent to x[1,2,:,:,:],
x[...,3] to x[:,:,:,:,3] and
x[4,...,5,:] to x[4,:,:,5,:].
(Source)
The None component:
numpy.newaxis
The newaxis object can be used in all slicing operations to create an axis of length one. newaxis is an alias for ‘None’, and ‘None’ can be used in place of this with the same result.
(Source)
So, arr[..., None] takes an array of dimension N and "adds" a dimension "at the end" for a resulting array of dimension N+1.
Example:
import numpy as np
x = np.array([[1,2,3],[4,5,6]])
print(x.shape) # (2, 3)
y = x[...,None]
print(y.shape) # (2, 3, 1)
z = x[:,:,np.newaxis]
print(z.shape) # (2, 3, 1)
a = np.expand_dims(x, axis=-1)
print(a.shape) # (2, 3, 1)
print((y == z).all()) # True
print((y == a).all()) # True
Consider this code:
np.ones(shape=(2,3))[...,None].shape
As you can see, indexing with None turns the (2,3) matrix into a (2,3,1) array: the new axis of length one is added in the LAST position.
If you use
np.ones(shape=(2,3))[None, ...].shape
the new axis is added in the FIRST position, giving shape (1,2,3).
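To tie this back to the intercept-column use case from the question, here is a small sketch (the toy arrays are assumptions, smaller than the question's 50 rows) showing why the trailing axis is needed before concatenating.

```python
import numpy as np

y_train = np.arange(5)                    # shape (5,)
feature_set = np.arange(5).reshape(5, 1)  # shape (5, 1)

ones_1d = np.ones(shape=y_train.shape)    # shape (5,): 1-D, cannot be joined column-wise
ones_col = ones_1d[..., None]             # shape (5, 1): new axis appended at the end
ones_row = ones_1d[None, ...]             # shape (1, 5): new axis inserted at the front

# Both inputs are now 2-D, so they can be concatenated along axis 1.
X = np.concatenate((ones_col, feature_set), axis=1)

print(ones_1d.shape, ones_col.shape, ones_row.shape, X.shape)
# (5,) (5, 1) (1, 5) (5, 2)
```

Without [..., None], np.concatenate receives a 1-D and a 2-D array and raises the "same number of dimensions" error mentioned in the question.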

boolean_mask or sparse dot product in tensorflow

tl;dr what is the most efficient way to dynamically choose some entries of a tensor.
I am trying to implement a syntactic GCN in TensorFlow. Basically, I need a different weight matrix for every label (let's ignore biases for this question) and must choose at each run the relevant entries to use. Those would be chosen by a sparse matrix (for each entry there is at most one label in one direction, and mostly no edge, so not even that).
More concretely, when I have a sparse (zero-one) matrix of labeled edges, is it better to use it as a mask, in a sparse-dense tensor multiplication, or maybe just in a normal multiplication? (I guess not the latter, but for simplicity I use it in the example.)
example:
units = 6 # output size
x = ops.convert_to_tensor(inputs[0], dtype=self.dtype)
labeled_edges = ops.convert_to_tensor(inputs[1], dtype=self.dtype)
edges_shape = labeled_edges.get_shape().as_list()
labeled_edges = expand_dims(labeled_edges, -2)
labeled_edges = tile(
    labeled_edges, [1] * (len(edges_shape) - 1) + [units, 1])
graph_kernel = math_ops.multiply(self.kernel, labeled_edges) # here is the question basically
outputs = standard_ops.tensordot(x, graph_kernel, [[1], [0]])
outputs = math_ops.reduce_sum(outputs, [-1])
To answer your tl;dr question, you can try using either of the following:
tf.nn.embedding_lookup : typical usage is tf.nn.embedding_lookup(params, ids). It returns a Tensor whose entries along axis 0 are a subset of those of the Tensor params. The indices of the kept entries are given by the Tensor ids.
tf.nn.embedding_lookup_sparse : is the same as tf.nn.embedding_lookup but takes ids as a SparseTensor.
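For intuition, selecting entries along axis 0 by index behaves like integer-array indexing in NumPy. The sketch below is only an analogy under that assumption, not TensorFlow code; params and ids are hypothetical toy values.

```python
import numpy as np

# Hypothetical "params": 4 entries along axis 0, each a vector of length 3.
params = np.arange(12).reshape(4, 3)
# Hypothetical "ids": which entries to keep (repeats are allowed).
ids = np.array([2, 0, 2])

# Gather the chosen rows; the result has one row per id.
selected = params[ids]
print(selected)
# [[6 7 8]
#  [0 1 2]
#  [6 7 8]]
```

If params held one weight matrix per label, the same indexing pattern would pick the matrix for each edge's label instead of multiplying by a mostly-zero mask.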

How to use numpy to calculate mean and standard deviation of an irregular shaped array

I have a numpy array that contains many samples of varying length:
Samples = np.array([[1001, 1002, 1003],
... ,
[1001, 1002]])
I want to (elementwise) subtract the mean of the array then divide by the standard deviation of the array. Something like:
newSamples = (Samples-np.mean(Samples))/np.std(Samples)
Except that doesn't work for irregular shaped arrays,
np.mean(Samples) causes
unsupported operand type(s) for /: 'list' and 'int'
due to what I assume to be it having set a static size for each axis and then when it encounters a different sized sample it can't handle it. What is an approach to solve this using numpy?
example input:
Sample = np.array([[1, 2, 3],
[1, 2]])
After subtracting by the mean and then dividing by standard deviation:
Sample = array([[-1.06904497, 0.26726124, 1.60356745],
[-1.06904497, 0.26726124]])
Don't make ragged arrays. Just don't. Numpy can't do much with them, and any code you might make for them will always be unreliable and slow because numpy doesn't work that way. It turns them into object dtypes:
Sample
array([[1, 2, 3], [1, 2]], dtype=object)
Which almost no numpy functions work on. In this case those objects are list objects, which makes your code even more confusing, as you either have to switch between list and ndarray methods or stick to list-safe numpy methods. This is a recipe for disaster, as anyone noodling around with the code later (even yourself, if you forget) will be dancing in a minefield.
There are two things you can do with your data to make things work better:
First method is to index and flatten.
i = np.cumsum(np.array([len(x) for x in Sample]))
flat_sample = np.hstack(Sample)
This preserves the index of the end of each sample in i, while keeping the sample as a 1D array
The other method is to pad one dimension with np.nan and use nan-safe functions
m = np.array([len(x) for x in Sample]).max()
nan_sample = np.array([x + [np.nan] * (m - len(x)) for x in Sample])
So to do your calculations, you can use flat_sample and do similar to above:
new_flat_sample = (flat_sample - np.mean(flat_sample)) / np.std(flat_sample)
and use i to recreate your original array (or a list of arrays, which I recommend; see np.split).
new_list_sample = np.split(new_flat_sample, i[:-1])
[array([-1.06904497, 0.26726124, 1.60356745]),
array([-1.06904497, 0.26726124])]
Or use nan_sample, but you will need to replace np.mean and np.std with np.nanmean and np.nanstd
new_nan_sample = (nan_sample - np.nanmean(nan_sample)) / np.nanstd(nan_sample)
array([[-1.06904497, 0.26726124, 1.60356745],
[-1.06904497, 0.26726124, nan]])
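Put together, the flatten-and-split route can be wrapped in one helper that never builds a ragged ndarray in the first place. This is a minimal sketch under the assumption that the samples arrive as a plain list of lists; the name normalize_ragged is hypothetical.

```python
import numpy as np

def normalize_ragged(samples):
    """Standardize all values with the global mean/std, keeping sample boundaries."""
    lengths = [len(s) for s in samples]
    # One flat float array holding every value of every sample.
    flat = np.concatenate([np.asarray(s, dtype=float) for s in samples])
    flat = (flat - flat.mean()) / flat.std()
    # Cut the flat array back into pieces at the end of each sample.
    return np.split(flat, np.cumsum(lengths)[:-1])

normalized = normalize_ragged([[1, 2, 3], [1, 2]])
for piece in normalized:
    print(piece)
```

The result is a list of 1-D arrays with the original lengths (3 and 2 here), so downstream code can keep using ordinary ndarray methods on each piece.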
@MichaelHackman (following up on the comment):
That's weird, because when I compute the overall std and mean and then apply them, I obtain a different result (see the code below).
import numpy as np
# dtype=object is required for ragged input on recent NumPy versions
Samples = np.array([[1, 2, 3],
                    [1, 2]], dtype=object)
c = np.hstack(Samples) # gives [1,2,3,1,2]
mean, std = np.mean(c), np.std(c)
newSamples = np.asarray([(np.array(xi)-mean)/std for xi in Samples], dtype=object)
print(newSamples)
# [array([-1.06904497, 0.26726124, 1.60356745]), array([-1.06904497, 0.26726124])]
edit: Add np.asarray(), put mean,std computation outside loop following Imanol Luengo's excellent comments (Thanks!)
