I'm new to TensorFlow (version 1.2), but not to Python or Numpy. I am building a model to predict the shape of a protein molecule, and I need to wrap TensorFlow's standard tf.losses.cosine_distance function in some extra code to stop some NaN values from propagating into the loss calculation.
I know exactly which cells will be NaN. Whatever my machine learning system predicts for those cells does not count. I plan to turn the NaN part of the output of tf.losses.cosine_distance into zeros before summing up the loss function.
Here's a snippet of working code, using tf.scatter_nd_update for the element assignment:
def custom_distance(predict, actual):
    with tf.name_scope("CustomDistance"):
        loss = tf.losses.cosine_distance(predict, actual, -1,
                                         reduction=tf.losses.Reduction.NONE)
        loss = tf.Variable(loss)  # individual elements can be modified
        indices = tf.constant([[0,0,0],[29,1,0],[29,2,0]])
        updates = tf.constant([0., 0., 0.])
        loss = tf.scatter_nd_update(loss, indices, updates)
        return loss
But that only works for the one protein I have that is 30 amino acids long. What about proteins of other lengths? I will have many.
In Numpy, I would just use Python's negative indexing and substitute -1 for the two 29's on the indices line. TensorFlow will not accept that. If I make that substitution, I get a long traceback, but I think the most important part of it is this:
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/framework/errors_impl.py", line 466, in raise_exception_on_not_ok_status
pywrap_tensorflow.TF_GetCode(status))
tensorflow.python.framework.errors_impl.InvalidArgumentError: Invalid indices: [0,1] = [-1, 1, 0] is not in [0, 30)
(I could also modify the predict Tensor so that the cells in question exactly match the actual Tensor before calculating the loss, but in each case the challenge is the same: to assign the values of individual elements in a TensorFlow object.)
Should I just forget about negative indexing in TensorFlow? I am poring through the TensorFlow docs to understand the correct approach to this problem. I assume that I can retrieve the length of my input Tensors along the primary axis and use that. But after seeing the strong parallels between TensorFlow and Numpy, I have to wonder whether that's clunky.
Thanks for your suggestions.
A tf.Variable can be used with TensorFlow's bindings to Python's slicing operators. So, for example, loss[-1] is a valid slicing of loss.
In your case, if you have only three slices, you could assign them individually:
update_op0 = loss[0,0,0].assign(updates[0])
update_op1 = loss[-1,1,0].assign(updates[1])
update_op2 = loss[-1,2,0].assign(updates[2])
If you have more slices than that, or a variable number of slices, the previous approach is not practical. You can instead write a small helper function like this to convert "positive or negative indices" to "positive-only indices":
def to_pos_idx(idx, x):
    # TODO: shape & bound checking
    idx = tf.convert_to_tensor(idx)
    s = tf.shape(x)[:tf.size(idx)]  # sizes of the dimensions that idx indexes into
    idx = tf.where(idx < 0, s + idx, idx)  # map negative indices to their positive equivalents
    return idx
and modify your code like this:
indices = tf.constant([[0,0,0],[-1,1,0],[-1,2,0]])
indices = tf.map_fn(lambda i: to_pos_idx(i, loss), indices) # transform indices here
updates = tf.constant([0., 0., 0.])
loss = tf.scatter_nd_update(loss, indices, updates)
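Putting the pieces together, a length-independent version of your function might look like this (a minimal sketch that just combines the snippets above, assuming to_pos_idx is defined as shown):

def custom_distance(predict, actual):
    with tf.name_scope("CustomDistance"):
        loss = tf.losses.cosine_distance(predict, actual, -1,
                                         reduction=tf.losses.Reduction.NONE)
        loss = tf.Variable(loss)  # individual elements can be modified
        indices = tf.constant([[0, 0, 0], [-1, 1, 0], [-1, 2, 0]])
        indices = tf.map_fn(lambda i: to_pos_idx(i, loss), indices)  # negative -> positive
        updates = tf.constant([0., 0., 0.])
        loss = tf.scatter_nd_update(loss, indices, updates)
        return loss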
I know PyTorch doesn't have a map-like function to apply a function to each element of a tensor. So, could I do something like the following without a map-like function in PyTorch?
if tensor_a * tensor_b.matmul(tensor_c) < 1:
return -tensor_a*tensor_b
else:
return 0
This would work if the tensors were 1D. However, I need this to work when tensor_b is 2D (tensor_a needs to be unsqueezed in the return statement). This means a 2D tensor should be returned where some of the rows will be 0 vectors.
Happy to use the latest features of the most recent Python version.
If I understand correctly, you are looking to return a tensor either way (hence the mapping) but by checking the condition element-wise. Assuming the shapes of tensor_a, tensor_b, and tensor_c are all two dimensional, as in "simple matrices", here is a possible solution.
What you're looking for is probably torch.where. It's fairly close to a mapping: based on a condition, it will return one value or another, element-wise.
It works like torch.where(condition, value_if, value_else), where all three tensors have the same shape (value_if and value_else can actually be floats, which will be cast to tensors filled with that value). Also, condition is a bool tensor that defines which value to assign to each element of the output tensor: it's a boolean mask.
For the purpose of this example, I have used random tensors:
>>> a = torch.rand(2, 2, dtype=float)*100
>>> b = torch.rand(2, 2, dtype=float)*0.01
>>> c = torch.rand(2, 2, dtype=float)*10
>>> torch.where(a*(b@c) < 1, -a*b, 0.)
tensor([[ 0.0000,  0.0000],
        [ 0.0000, -0.0183]], dtype=torch.float64)
More generally though, this will work if tensor_a and tensor_b have a shape of (m, n) and tensor_c has a shape of (n, m), because of the operation's constraints. In your experiment, I'm guessing you only had column vectors.
tl;dr: what is the most efficient way to dynamically choose some entries of a tensor?
I am trying to implement syntactic GCN in Tensorflow. Basically, I need to have a different weight matrix for every label (let's ignore biases for this question) and choose at each run the relevant entries to use; those are chosen by a sparse matrix (for each entry there is at most one label in one direction, and mostly no edge, so not even that).
More concretely, when I have a sparse matrix of labeled edges (zero-one), is it better to use it as a mask, in a sparse-dense tensor multiplication, or maybe just in normal multiplication (I guess not the latter, but for simplicity I use it in the example)?
example:
units = 6 # output size
x = ops.convert_to_tensor(inputs[0], dtype=self.dtype)
labeled_edges = ops.convert_to_tensor(inputs[1], dtype=self.dtype)
edges_shape = labeled_edges.get_shape().as_list()
labeled_edges = expand_dims(labeled_edges, -2)
labeled_edges = tile(
    labeled_edges, [1] * (len(edges_shape) - 1) + [units, 1])
graph_kernel = math_ops.multiply(self.kernel, labeled_edges) # here is the question basically
outputs = standard_ops.tensordot(x, graph_kernel, [[1], [0]])
outputs = math_ops.reduce_sum(outputs, [-1])
To answer your tl;dr question, you can try using either of the following:
tf.nn.embedding_lookup: typical usage is tf.nn.embedding_lookup(params, ids). It returns a Tensor whose 0-axis entries are a subset of the entries of Tensor params. The indices of the kept entries are defined by Tensor ids.
tf.nn.embedding_lookup_sparse: the same as tf.nn.embedding_lookup, but takes ids as a SparseTensor.
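For instance, here is a minimal sketch of tf.nn.embedding_lookup (the weight values are made up):

import tensorflow as tf

# a 4x3 weight matrix; each row plays the role of the parameters for one label
params = tf.constant([[0., 1., 2.],
                      [3., 4., 5.],
                      [6., 7., 8.],
                      [9., 10., 11.]])
ids = tf.constant([2, 0])  # keep rows 2 and 0, in that order
picked = tf.nn.embedding_lookup(params, ids)
with tf.Session() as sess:
    print(sess.run(picked))  # [[6. 7. 8.], [0. 1. 2.]]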
I am trying to solve a binary classification problem with the Sequential model from Keras and have to meet a given Balanced Error Rate (BER), so I thought it would be a good idea to use the BER instead of accuracy as a metric.
My custom metric implementation for BER looks like this:
def balanced_error_rate(y_true, y_pred):
    labels = theano.shared(np.asmatrix([[0, 1]], dtype='int8'))
    label_matrix = K.repeat_elements(labels, K.shape(y_true)[0], axis=1)
    true_matrix = K.repeat_elements(y_true, K.shape(labels)[0], axis=1)
    pred_matrix = K.repeat_elements(K.round(y_pred), K.shape(labels)[0], axis=1)
    class_lens = K.sum(K.equal(label_matrix, true_matrix), axis=1)
    return K.sum(K.sum(class_lens - K.sum(K.equal(label_matrix, K.not_equal(true_matrix, pred_matrix)), axis=1), axis=0)/class_lens, axis=0)/2
The idea is to create a matrix from the available labels and compare it to the input data (then sum the ones) to get the number of elements of each label...
My problem is that:
> K.shape(y_true)
Shape.0
Type info:
> type(y_true)
<class 'theano.tensor.var.TensorVariable'>
> type(K.shape(y_true))
<class 'theano.tensor.var.TensorVariable'>
...and I can't find out why.
I am now looking for:
A way to get the array dimensions / an explanation why shape acts like it does / the reason why y_true seems to have 0 dimensions
or
A method to create a tensor matrix with a given width/height by repeating a given row/column vector.
or
A smarter solution to calculate the BER using tensor functions.
A way to get the array dimensions / an explanation why shape acts like it does / the reason why y_true seems to have 0 dimensions
The deal with print and abstraction libraries like Theano is that you usually do not get the values but a representation of the values. So if you do
print(foo.shape)
You won't get the actual shape but a representation of the operation that is performed at runtime. Since this is all computed on an external device, the computation is not run immediately but only after creating a function with appropriate inputs (or after calling foo.shape.eval()).
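A minimal sketch of that behaviour (the variable name foo is made up):

import numpy as np
import theano
import theano.tensor as T

foo = T.matrix('foo')
print(foo.shape)  # prints a symbolic node such as Shape.0, not numbers
f = theano.function([foo], foo.shape)
print(f(np.ones((3, 4), dtype=theano.config.floatX)))  # prints the concrete shape: [3 4]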
Another way to print the value is to use theano.printing.Print when using the value, e.g.:
shape = theano.printing.Print('shape of foo')(foo.shape)
# use shape (not foo.shape!)
A method to create a tensor matrix with a given width/height by repeating a given row/column vector.
See theano.tensor.repeat for that. Example in numpy (usage is quite similar):
>>> x
array([[1, 2, 3]])
>>> x.repeat(3, axis=0)
array([[1, 2, 3],
       [1, 2, 3],
       [1, 2, 3]])
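The Theano version is almost identical (a minimal sketch; the variable names are made up):

import numpy as np
import theano
import theano.tensor as T

row = T.matrix('row')             # a symbolic (1, n) row vector
tiled = T.repeat(row, 3, axis=0)  # stack 3 copies along axis 0
f = theano.function([row], tiled)
print(f(np.array([[1., 2., 3.]], dtype=theano.config.floatX)))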
I am trying to vectorize an operation with numpy. I profiled the Python script it belongs to, found this operation to be the bottleneck, and so need to optimize it, since I will run it many times.
The operation is on a data set of two parts. First, a large set (n) of 1D vectors of different lengths (with maximum length Lmax) whose elements are integers from 1 to maxvalue. The set of vectors is arranged in a 2D array, data, of size (num_samples, Lmax), with the trailing elements in each row zeroed. The second part is a set of scalar floats, one associated with each vector, that I have computed and which depend on its length and the integer value at each position. The set of scalars is made into a 1D array, Y, of size num_samples.
The desired operation is to form the average of Y over the n samples, as a function of (value,position along length,length).
This entire operation can be vectorized in MATLAB with the accumarray function, by using 3 2D arrays of the same size as data, whose elements are the corresponding value, position, and length indices of the desired final array:
sz_Y = num_samples;
sz_len = Lmax
sz_pos = Lmax
sz_val = maxvalue
ind_len = repmat( 1:sz_len ,1 ,sz_samples);
ind_pos = repmat( 1:sz_pos ,sz_samples,1 );
ind_val = data
ind_Y = repmat((1:sz_Y)',1 ,Lmax );
copiedY=Y(ind_Y);
mask = data>0;
finalarr=accumarray({ind_val(mask),ind_pos(mask),ind_len(mask)},copiedY(mask), [sz_val sz_pos sz_len])/sz_val;
I was hoping to emulate this implementation with np.bincount. However, np.bincount differs from accumarray in two relevant ways:
both arguments must be of same 1D size, and
there is no option to choose the shape of the output array.
In the above usage of accumarray, the list of indices, {ind_val(mask),ind_pos(mask),ind_len(mask)}, is a 1D cell array of 1x3 arrays used as index tuples, while in np.bincount it must be a single 1D array, as far as I understand. I expect np.ravel may be useful but am not sure how to use it here to do what I want. I am coming to python from matlab and some things do not translate directly, e.g. the colon operator, which ravels in the opposite order to ravel. So my question is how I might use np.bincount or any other numpy method to achieve an efficient python implementation of this operation.
EDIT: To avoid wasting time: for these multi-dimensional index problems with complicated index manipulation, is the recommended route to just use Cython to implement the loops explicitly?
EDIT2: Alternative Python implementation I just came up with.
Here is a RAM-heavy solution:
First precalculate:
Using index units for length (i.e., length 1 = 0), make a 4D bool array of size (num_samples, Lmax+1, Lmax+1, maxvalue+1), holding where the conditions are satisfied for each value in Y.
ALLcond = np.zeros((num_samples, Lmax+1, Lmax+1, maxvalue+1), dtype='bool')
for l in range(Lmax+1):
    for i in range(Lmax+1):
        for v in range(maxvalue+1):
            ALLcond[:, l, i, v] = (data[:, i] == v) & (Lvec == l)
Where Lvec=[len(row) for row in data]. Then get the indices for these using np.where and initialize a 4D float array into which you will assign the values of Y:
ind_Y, ind_len, ind_pos, ind_val = np.where(ALLcond)
Yval = np.zeros(np.shape(ALLcond), dtype='float')
Now in the loop in which I have to perform the operation, I compute it with the two lines:
Yval[ind_Y, ind_len, ind_pos, ind_val] = Y[ind_Y]
Y_avg = Yval.sum(axis=0) / num_samples
This gives a factor of 4 or so speedup over the direct loop implementation. I was expecting more. Perhaps this is a more tangible implementation for Python heads to digest. Any faster suggestions are welcome :)
One way is to convert the 3 "indices" to a linear index and then apply bincount. Numpy's ravel_multi_index is essentially the same as MATLAB's sub2ind. So the ported code could be something like:
shape = (Lmax+1, Lmax+1, maxvalue+1)
posvec = np.arange(1, Lmax+1)
ind_len = np.tile(Lvec[:,None], [1, Lmax])
ind_pos = np.tile(posvec, [n, 1])
ind_val = data
Y_copied = np.tile(Y[:,None], [1, Lmax])
mask = posvec <= Lvec[:,None] # fill-value independent
lin_idx = np.ravel_multi_index((ind_len[mask], ind_pos[mask], ind_val[mask]), shape)
Y_avg = np.bincount(lin_idx, weights=Y_copied[mask], minlength=np.prod(shape)) / n
Y_avg.shape = shape
This is assuming data has shape (n, Lmax), Lvec is a Numpy array, etc. You may need to adapt the code a little to get rid of off-by-one errors.
One could argue that the tile operations are not very efficient and not very "numpythonic". Something with broadcast_arrays could be nice, but I think I prefer this way:
shape = (Lmax+1, Lmax+1, maxvalue+1)
posvec = np.arange(1, Lmax+1)
mask = posvec <= Lvec[:,None]  # fill-value independent; defined before it is used below
len_idx = np.repeat(Lvec, Lvec)
pos_idx = np.broadcast_to(posvec, data.shape)[mask]
val_idx = data[mask]
Y_copied = np.repeat(Y, Lvec)
lin_idx = np.ravel_multi_index((len_idx, pos_idx, val_idx), shape)
Y_avg = np.bincount(lin_idx, weights=Y_copied, minlength=np.prod(shape)) / n
Y_avg.shape = shape
Note broadcast_to was added in Numpy 1.10.0.
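For reference, a tiny illustration of np.broadcast_to (the values here are made up):

import numpy as np

posvec = np.arange(1, 4)                # [1 2 3]
print(np.broadcast_to(posvec, (2, 3)))  # [[1 2 3], [1 2 3]], a read-only view, no copy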
In the MNIST beginner tutorial, there is the statement
accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
tf.cast basically changes the tensor's type, but what is the difference between tf.reduce_mean and np.mean?
Here is the doc on tf.reduce_mean:
reduce_mean(input_tensor, reduction_indices=None, keep_dims=False, name=None)
input_tensor: The tensor to reduce. Should have numeric type.
reduction_indices: The dimensions to reduce. If None (the default), reduces all dimensions.
# 'x' is [[1., 1.],
#         [2., 2.]]
tf.reduce_mean(x) ==> 1.5
tf.reduce_mean(x, 0) ==> [1.5, 1.5]
tf.reduce_mean(x, 1) ==> [1., 2.]
For a 1D vector, it looks like np.mean == tf.reduce_mean, but I don't understand what's happening in tf.reduce_mean(x, 1) ==> [1., 2.]. tf.reduce_mean(x, 0) ==> [1.5, 1.5] kind of makes sense, since mean of [1, 2] and [1, 2] is [1.5, 1.5], but what's going on with tf.reduce_mean(x, 1)?
The functionality of numpy.mean and tensorflow.reduce_mean is the same. They do the same thing, as you can see from the documentation for numpy and tensorflow. Let's look at an example:
c = np.array([[3.,4], [5.,6], [6.,7]])
print(np.mean(c,1))
Mean = tf.reduce_mean(c,1)
with tf.Session() as sess:
    result = sess.run(Mean)
    print(result)
Output
[ 3.5 5.5 6.5]
[ 3.5 5.5 6.5]
Here you can see that when axis (numpy) or reduction_indices (tensorflow) is 1, it computes the mean across (3,4), (5,6), and (6,7), so 1 defines the axis across which the mean is computed. When it is 0, the mean is computed across (3,5,6) and (4,6,7), and so on. I hope you get the idea.
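To make the axis-0 case concrete as well, using the same c as above:

print(np.mean(c, 0))  # [4.6667 5.6667], the means of (3,5,6) and (4,6,7)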
Now what are the differences between them?
You can compute a numpy operation anywhere in Python. But a tensorflow operation must be done inside a tensorflow Session. You can read more about it here. So when you need to perform any computation for your tensorflow graph (or structure, if you will), it must be done inside a tensorflow Session.
Let's look at another example.
npMean = np.mean(c)
print(npMean+1)
tfMean = tf.reduce_mean(c)
Add = tfMean + 1
with tf.Session() as sess:
    result = sess.run(Add)
    print(result)
We could increase the mean by 1 in numpy as you naturally would, but in order to do it in tensorflow, you need to perform it in a Session; without a Session you can't do that. In other words, when you write tfMean = tf.reduce_mean(c), tensorflow doesn't compute it then. It only computes it in a Session. But numpy computes it instantly, when you write np.mean().
I hope it makes sense.
The key here is the word reduce, a concept from functional programming, which makes it possible for reduce_mean in TensorFlow to keep a running average of the results of computations from a batch of inputs.
If you are not familiar with functional programming, this can seem mysterious. So first let us see what reduce does. If you were given a list like [1,2,5,4] and were told to compute the mean, that is easy - just pass the whole array to np.mean and you get the mean. However, what if you had to compute the mean of a stream of numbers? In that case, you would have to first assemble the array by reading from the stream and then call np.mean on the resulting array - you would have to write some more code.
An alternative is to use the reduce paradigm. As an example, look at how we can use reduce in python to calculate the sum of numbers:
from functools import reduce  # a built-in in Python 2; in functools since Python 3
reduce(lambda x, y: x + y, [1, 2, 5, 4])
It works like this:
Step 1: Read the first 2 numbers from the list - 1,2. Evaluate lambda 1,2. reduce stores the result 3. Note - this is the only step where 2 numbers are read off the list.
Step 2: Read the next number from the list - 5. Evaluate lambda 3,5 (3 being the result from step 1, that reduce stored). reduce stores the result 8.
Step 3: Read the next number from the list - 4. Evaluate lambda 8,4 (8 being the result of step 2, that reduce stored). reduce stores the result 12.
Step 4: Read the next number from the list - there are none, so return the stored result of 12.
Read more here: Functional Programming in Python.
To see how this applies to TensorFlow, look at the following block of code, which defines a simple graph that takes in a float and computes the mean. The input to the graph, however, is not a single float but an array of floats. The reduce_mean computes the mean value over all those floats.
import tensorflow as tf
inp = tf.placeholder(tf.float32)
mean = tf.reduce_mean(inp)
x = [1,2,3,4,5]
with tf.Session() as sess:
    print(mean.eval(feed_dict={inp: x}))
This pattern comes in handy when computing values over batches of images. Look at The Deep MNIST Example where you see code like:
correct_prediction = tf.equal(tf.argmax(y,1), tf.argmax(y_,1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
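A minimal, self-contained sketch of that pattern (the logits and labels here are made up):

import tensorflow as tf

logits = tf.constant([[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]])  # fake batch of predictions
labels = tf.constant([[0., 1.], [1., 0.], [1., 0.]])        # fake one-hot labels
correct_prediction = tf.equal(tf.argmax(logits, 1), tf.argmax(labels, 1))
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
with tf.Session() as sess:
    print(sess.run(accuracy))  # 2 of 3 predictions match -> 0.6666667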
The new documentation states that tf.reduce_mean() produces the same results as np.mean:
Equivalent to np.mean
It also has essentially the same parameters as np.mean. But here is an important difference: they produce the same results only on float values:
import tensorflow as tf
import numpy as np
from random import randint

num_dims = 10
rand_dim = randint(0, num_dims - 1)
c = np.random.randint(50, size=tuple([5] * num_dims)).astype(float)

with tf.Session() as sess:
    r1 = sess.run(tf.reduce_mean(c, rand_dim))
r2 = np.mean(c, rand_dim)

is_equal = np.array_equal(r1, r2)
print(is_equal)
if not is_equal:
    print(r1)
    print(r2)
If you remove the type conversion, you will see different results.
In addition to this, many other tf.reduce_ functions, such as reduce_all, reduce_any, reduce_min, reduce_max, and reduce_prod, produce the same values as their numpy analogs. Because they are TensorFlow operations, they can be executed only from inside a session.
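For example, a minimal sketch comparing one of those reductions with its numpy analog (the array values are made up):

import numpy as np
import tensorflow as tf

c = np.array([[3., 4.], [5., 6.]])
with tf.Session() as sess:
    print(sess.run(tf.reduce_max(c, 0)))  # [5. 6.], computed inside the session
print(np.max(c, 0))                       # [5. 6.], computed immediately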