I want to use a function which takes a tensor as input and returns the index with the largest value across the axes of the tensor. I know there already exists a function, tf.argmax(), that does exactly this, but how do I implement it on my own (this may be necessary in case of implementing some custom function)?
Let us suppose for now the function takes as input only 1D tensor. So, the function needs to be of following signature:
argmax(
    input,  # input is a 1D tensor
    name=None
)
I tried implementing it this way:
def argmax(input, name=None):
    maxValue = 0
    maxIndex = 0
    for i in range(input.get_shape()[0]):
        if input[i] > maxValue:
            maxValue = input[i]
            maxIndex = i
    return maxIndex
However, this does not work, since during the construction phase the values are not yet initialized, and hence I cannot compare two values as I did in the above code. So, is there a way to write our own custom functions like tf.argmax, tf.equal, etc.?
Well, one simple way would be this:
idx = tf.where(tf.equal(input, tf.reduce_max(input)))[0, 0]
Example:
import tensorflow as tf

with tf.Session() as sess:
    input = tf.constant([1, 3, 4, 2, 1, 2])
    idx = tf.where(tf.equal(input, tf.reduce_max(input)))[0, 0]
    print(sess.run(idx))
Output:
2
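If you want this wrapped in the signature from the question, a minimal sketch could look like the following (the use of tf.name_scope to handle the name argument is my assumption, not something the question required):

import tensorflow as tf

def argmax(input, name=None):
    # Index of the first occurrence of the maximum value in a 1D tensor,
    # built only from TensorFlow ops (no Python-level comparisons).
    with tf.name_scope(name, "argmax", [input]):
        return tf.where(tf.equal(input, tf.reduce_max(input)))[0, 0]

x = tf.constant([1, 3, 4, 2, 1, 2])
with tf.Session() as sess:
    print(sess.run(argmax(x)))  # 2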
My goal is to convert a tensor into an ndarray without 'run' or 'eval'.
I want to perform the same operation as in this example:
A = tf.constant(5)
B = tf.constant([[A, 1], [0,0]])
However, an ndarray can be placed inside tf.constant, but a tensor cannot.
Therefore, I tried to perform the operation using the following example, but tf.make_ndarray does not work.
A = tf.constant(5)
C = tf.make_ndarray(A)
B = tf.constant([[C, 1], [0,0]])
https://github.com/tensorflow/tensorflow/issues/28840#issuecomment-509551333
As mentioned in the github link above, tf.make_ndarray does not work.
To be precise, an error occurs because tf.make_ndarray looks for a 'tensor_shape' attribute that the tensor does not have, instead of the 'shape' attribute that it does have.
How can I run the code in this situation?
tf.make_ndarray is used to convert TensorProto values into NumPy arrays. These values are generally the constants used in a graph. For example, when you use tf.constant, you create a Const operation with an attribute value holding the constant value that the operation will produce. That attribute is stored as a TensorProto. Hence, you can "extract" the value of a Const operation as a NumPy array like this:
import tensorflow as tf
A = tf.constant(5)
C = tf.make_ndarray(A.op.get_attr('value'))
print(C, type(C))
# 5 <class 'numpy.ndarray'>
In general, though, you cannot convert arbitrary tensors into NumPy arrays, as their values will depend on the values of the variables and the fed inputs within a particular session.
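Applied to the original goal, a rough sketch would be to extract the value from the Const op and embed it in the new constant (this assumes A really is a Const op, as in the question; for any other op the 'value' attribute does not exist):

import tensorflow as tf

A = tf.constant(5)
C = tf.make_ndarray(A.op.get_attr('value'))  # NumPy value stored in the Const op
B = tf.constant([[C, 1], [0, 0]])            # the NumPy value can be nested in a new constant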
Hi, TensorFlow beginner here.
I want to remove any numpy code from my implementation and only use tensorflow functions. Currently I'm trying to filter out background bounding boxes and boxes with a low confidence score. For that I want an index called keep that I can use to keep track of which boxes to keep:
# Filter out background boxes
keep = np.where(class_ids > 0)[0]
# Filter out low confidence boxes
if config.DETECTION_MIN_CONFIDENCE:
    keep = np.intersect1d(
        keep, np.where(class_scores >= config.DETECTION_MIN_CONFIDENCE)[0])
class_ids is a tensor of shape (1000,) where each entry is a number between 0 and 80 depending on the class (81 classes in total).
class_scores is a tensor of shape (1000,) where each entry is a probability for the class of the corresponding bounding box.
I know that np.where() is easily changed to tf.where but how can I get the same functionality as np.intersect1d() with tensorflow?
Thanks for the help.
This seems to duplicate the numpy.intersect1d example.
import tensorflow as tf

a = tf.constant([3, 1, 2, 1])
b = tf.constant([1, 3, 4, 3])

# This set appears to be sorted, but that is not documented behavior.
s = tf.sets.set_intersection(a[None, :], b[None, :])
fsort = tf.contrib.framework.sort(s.values)

with tf.Session() as sess:
    print(sess.run(s).values)
    print(sess.run(fsort))
This outputs
[1 3]
[1 3]
With a few test examples, the set function seems to give ordered results, but I could not verify that it will always do that. So, you might want to use the contrib function just to be sure.
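Applied back to the filtering problem from the question, a rough sketch might look like the following (the placeholders and the min_confidence value are stand-ins I made up for the real tensors and for config.DETECTION_MIN_CONFIDENCE):

import tensorflow as tf

# Stand-ins for the real tensors described in the question.
class_ids = tf.placeholder(tf.int64, shape=(1000,))
class_scores = tf.placeholder(tf.float32, shape=(1000,))
min_confidence = 0.7  # stand-in for config.DETECTION_MIN_CONFIDENCE

keep = tf.where(class_ids > 0)[:, 0]                        # like np.where(...)[0]
conf_keep = tf.where(class_scores >= min_confidence)[:, 0]  # low-confidence filter

# Intersection of the two index sets, analogous to np.intersect1d.
keep = tf.sets.set_intersection(keep[None, :], conf_keep[None, :]).values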
I'm new to TensorFlow (version 1.2), but not to Python or Numpy. I am building a model to predict the shape of a protein molecule. I need to wrap TensorFlow's standard tf.losses.cosine_distance function in some extra code, because I need to stop the propagation of some NaN values into the loss calculation.
I know exactly which cells will be NaN. Whatever my machine learning system predicts for those cells does not count. I plan to turn the NaN part of the output of tf.losses.cosine_distance into zeros before summing up the loss function.
Here's a snippet of working code, using tf.scatter_nd_update for the element assignment:
def custom_distance(predict, actual):
    with tf.name_scope("CustomDistance"):
        loss = tf.losses.cosine_distance(predict, actual, -1,
                                         reduction=tf.losses.Reduction.NONE)
        loss = tf.Variable(loss)  # individual elements can be modified
        indices = tf.constant([[0,0,0],[29,1,0],[29,2,0]])
        updates = tf.constant([0., 0., 0.])
        loss = tf.scatter_nd_update(loss, indices, updates)
        return loss
But that only works for the one protein I have, which is 30 amino acids long. What if I have a protein of a different length? I will have many.
In Numpy, I would just use Python's negative indexing, and substitute -1's for the two 29's on the indices line. Tensorflow will not accept that. If I make that substitution, I get a long traceback, but I think that the most important part of it is this:
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid indices: [0,1] = [-1, 1, 0] is not in [0, 30)
(I could also modify the predict Tensor so that the cells in question exactly match the actual Tensor before calculating the loss, but in each case the challenge is the same: to assign the values of individual elements in a TensorFlow object.)
Should I just forget about negative indexing in TensorFlow? I am poring through the TensorFlow docs to understand the correct approach to this problem. I assume that I can retrieve the length of my input Tensors along the primary axis and use that. But after seeing the strong parallels between TensorFlow and Numpy, I have to wonder whether that's clunky.
Thanks for your suggestions.
Negative indexing can be used through TensorFlow's bindings to Python's slicing operators. So, for example, loss[-1] is a valid slicing of loss.
In your case, if you have only three slices, you could assign them individually:
update_op0 = loss[0, 0, 0].assign(updates[0])
update_op1 = loss[-1, 1, 0].assign(updates[1])
update_op2 = loss[-1, 2, 0].assign(updates[2])
If you have more slices than that, or a variable number of slices, the previous approach is not practical. You can rather write a small helper function like this to convert "positive or negative indices" to "positive only indices":
def to_pos_idx(idx, x):
    # todo: shape & bound checking
    idx = tf.convert_to_tensor(idx)
    s = tf.shape(x)[:tf.size(idx)]
    idx = tf.where(idx < 0, s + idx, idx)
    return idx
and modify your code like this:
indices = tf.constant([[0,0,0],[-1,1,0],[-1,2,0]])
indices = tf.map_fn(lambda i: to_pos_idx(i, loss), indices) # transform indices here
updates = tf.constant([0., 0., 0.])
loss = tf.scatter_nd_update(loss, indices, updates)
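For completeness, a rough sketch of the question's custom_distance with that substitution (same assumptions as the original snippet, using the to_pos_idx helper above):

def custom_distance(predict, actual):
    with tf.name_scope("CustomDistance"):
        loss = tf.losses.cosine_distance(predict, actual, -1,
                                         reduction=tf.losses.Reduction.NONE)
        loss = tf.Variable(loss)  # individual elements can be modified
        # Negative indices are remapped to positive ones before the scatter update.
        indices = tf.constant([[0, 0, 0], [-1, 1, 0], [-1, 2, 0]])
        indices = tf.map_fn(lambda i: to_pos_idx(i, loss), indices)
        updates = tf.constant([0., 0., 0.])
        return tf.scatter_nd_update(loss, indices, updates)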
I am trying to solve a binary classification problem with the sequential model from Keras, and I have to meet a given Balanced Error Rate (BER).
So I thought it would be a good idea to use the BER instead of accuracy as a metric.
My custom metric implementation for BER looks like this:
def balanced_error_rate(y_true, y_pred):
    labels = theano.shared(np.asmatrix([[0, 1]], dtype='int8'))
    label_matrix = K.repeat_elements(labels, K.shape(y_true)[0], axis=1)
    true_matrix = K.repeat_elements(y_true, K.shape(labels)[0], axis=1)
    pred_matrix = K.repeat_elements(K.round(y_pred), K.shape(labels)[0], axis=1)
    class_lens = K.sum(K.equal(label_matrix, true_matrix), axis=1)
    return K.sum(K.sum(class_lens - K.sum(K.equal(label_matrix, K.not_equal(true_matrix, pred_matrix)), axis=1), axis=0) / class_lens, axis=0) / 2
The idea is to create a matrix from the available labels and compare it to the input data (then sum the resulting ones) to get the number of elements of each label.
My problem is that:
> K.shape(y_true)
Shape.0
> Typeinfo:
> type(y_true)
<class 'theano.tensor.var.TensorVariable'>
> type(K.shape(y_true))
<class 'theano.tensor.var.TensorVariable'>
...and I can't find out why.
I am now looking for:
A way to get the array dimensions / an explanation why shape acts like it does / the reason why y_true seems to have 0 dimensions
or
A method to create a tensor matrix with a given width/height by repeating a given row/column vector.
or
A smarter solution to calculate the BER using tensor functions.
A way to get the array dimensions / an explanation why shape acts like it does / the reason why y_true seems to have 0 dimensions
The deal with printing in abstraction libraries like Theano is that you usually do not get the actual values but a representation of the value. So if you do
print(foo.shape)
You won't get the actual shape but a representation of the operation that is done at runtime. Since this is all computed on an external device, the computation is not run immediately but only after creating a function with appropriate inputs (or calling foo.shape.eval()).
Another way to print the value is to use theano.printing.Print when using the value, e.g.:
shape = theano.printing.Print('shape of foo')(foo.shape)
# use shape (not foo.shape!)
A method to create a tensor matrix with a given width/height by repeating a given row/column vector.
See theano.tensor.repeat for that. Example in numpy (usage is quite similar):
>>> x
array([[1, 2, 3]])
>>> x.repeat(3, axis=0)
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])
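For reference, a minimal sketch of the same thing in Theano's symbolic API (the variable names are mine):

import numpy as np
import theano
import theano.tensor as T

x = T.dmatrix('x')              # a symbolic 1xN row vector
tiled = T.repeat(x, 3, axis=0)  # repeat the row 3 times along axis 0
f = theano.function([x], tiled)
print(f(np.array([[1., 2., 3.]])))
# [[ 1.  2.  3.]
#  [ 1.  2.  3.]
#  [ 1.  2.  3.]]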
In the MNIST beginner tutorial, there is the statement
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
tf.cast basically changes the data type of the tensor, but what is the difference between tf.reduce_mean and np.mean?
Here is the doc on tf.reduce_mean:
reduce_mean(input_tensor, reduction_indices=None, keep_dims=False, name=None)
input_tensor: The tensor to reduce. Should have numeric type.
reduction_indices: The dimensions to reduce. If None (the default), reduces all dimensions.
# 'x' is [[1., 1.],
#         [2., 2.]]
tf.reduce_mean(x) ==> 1.5
tf.reduce_mean(x, 0) ==> [1.5, 1.5]
tf.reduce_mean(x, 1) ==> [1., 2.]
For a 1D vector, it looks like np.mean == tf.reduce_mean, but I don't understand what's happening in tf.reduce_mean(x, 1) ==> [1., 2.]. tf.reduce_mean(x, 0) ==> [1.5, 1.5] kind of makes sense, since mean of [1, 2] and [1, 2] is [1.5, 1.5], but what's going on with tf.reduce_mean(x, 1)?
The functionality of numpy.mean and tensorflow.reduce_mean is the same. They do the same thing. You can see that from the documentation for numpy and tensorflow. Let's look at an example:
import numpy as np
import tensorflow as tf

c = np.array([[3., 4], [5., 6], [6., 7]])
print(np.mean(c, 1))

Mean = tf.reduce_mean(c, 1)
with tf.Session() as sess:
    result = sess.run(Mean)
    print(result)
Output
[ 3.5 5.5 6.5]
[ 3.5 5.5 6.5]
Here you can see that when axis (numpy) or reduction_indices (tensorflow) is 1, it computes the mean across (3, 4), (5, 6) and (6, 7), so 1 defines the axis across which the mean is computed. When it is 0, the mean is computed across (3, 5, 6) and (4, 6, 7), and so on. I hope you get the idea.
Now what are the differences between them?
You can compute a numpy operation anywhere in Python. But in order to do a tensorflow operation, it must be done inside a tensorflow Session. You can read more about it here. So when you need to perform any computation for your tensorflow graph (or structure, if you will), it must be done inside a tensorflow Session.
Let's look at another example.
npMean = np.mean(c)
print(npMean + 1)

tfMean = tf.reduce_mean(c)
Add = tfMean + 1
with tf.Session() as sess:
    result = sess.run(Add)
    print(result)
We can increase the mean by 1 in numpy just as you naturally would, but to do it in tensorflow, you need to perform it in a Session; without a Session you can't do it. In other words, when you write tfMean = tf.reduce_mean(c), tensorflow doesn't compute it at that point. It only computes it in a Session. numpy, by contrast, computes the result instantly when you call np.mean().
I hope it makes sense.
The key here is the word reduce, a concept from functional programming, which makes it possible for reduce_mean in TensorFlow to keep a running average of the results of computations from a batch of inputs.
If you are not familiar with functional programming, this can seem mysterious. So first let us see what reduce does. If you were given a list like [1,2,5,4] and were told to compute the mean, that is easy - just pass the whole array to np.mean and you get the mean. However what if you had to compute the mean of a stream of numbers? In that case, you would have to first assemble the array by reading from the stream and then call np.mean on the resulting array - you would have to write some more code.
An alternative is to use the reduce paradigm. As an example, look at how we can use reduce in python to calculate the sum of numbers:
reduce(lambda x, y: x + y, [1, 2, 5, 4])
It works like this:
Step 1: Read 2 digits from the list - 1,2. Evaluate lambda 1,2. reduce stores the result 3. Note - this is the only step where 2 digits are read off the list
Step 2: Read the next digit from the list - 5. Evaluate lambda 5, 3 (3 being the result from step 1, that reduce stored). reduce stores the result 8.
Step 3: Read the next digit from the list - 4. Evaluate lambda 8,4 (8 being the result of step 2, that reduce stored). reduce stores the result 12
Step 4: Read the next digit from the list - there are none, so return the stored result of 12.
Read more here Functional Programming in Python
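For a concrete, runnable version of the steps above (note that in Python 3, reduce has to be imported from functools):

from functools import reduce  # built in for Python 2, in functools for Python 3

values = [1, 2, 5, 4]
total = reduce(lambda x, y: x + y, values)
print(total)                # 12
print(total / len(values))  # 3.0 -- divide by the count to get the mean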
To see how this applies to TensorFlow, look at the following block of code, which defines a simple graph, that takes in a float and computes the mean. The input to the graph however is not a single float but an array of floats. The reduce_mean computes the mean value over all those floats.
import tensorflow as tf

inp = tf.placeholder(tf.float32)
mean = tf.reduce_mean(inp)

x = [1, 2, 3, 4, 5]

with tf.Session() as sess:
    print(mean.eval(feed_dict={inp: x}))
This pattern comes in handy when computing values over batches of images. Look at The Deep MNIST Example where you see code like:
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
The new documentation states that tf.reduce_mean() produces the same results as np.mean:
Equivalent to np.mean
It also has absolutely the same parameters as np.mean. But here is an important difference: they produce the same results only on float values:
import tensorflow as tf
import numpy as np
from random import randint

num_dims = 10
rand_dim = randint(0, num_dims - 1)
c = np.random.randint(50, size=tuple([5] * num_dims)).astype(float)

with tf.Session() as sess:
    r1 = sess.run(tf.reduce_mean(c, rand_dim))
r2 = np.mean(c, rand_dim)

is_equal = np.array_equal(r1, r2)
print(is_equal)
if not is_equal:
    print(r1)
    print(r2)
If you remove the type conversion (the .astype(float)), you will see different results.
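A tiny illustration of that integer case:

import numpy as np
import tensorflow as tf

c = np.array([1, 2], dtype=np.int32)
print(np.mean(c))  # 1.5 -- numpy promotes integers to float
with tf.Session() as sess:
    print(sess.run(tf.reduce_mean(c)))  # 1 -- tf keeps the int32 dtype, so the mean is truncated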
In addition to this, many other tf.reduce_ functions such as reduce_all, reduce_any, reduce_min, reduce_max, and reduce_prod produce the same values as their numpy analogs. Clearly, because they are operations, they can only be executed from inside a session.