tensorflow conditional on axis - python

I am trying to gather specific sub-tensors (vectors or matrices) from a tensor in Keras, so I tried to use tf.where to get the indices to pass to tf.gather.
However, when testing for equality, tf.where returns element-wise indices of the matching values. I would like to find the indices of the rows (vectors) that are equal to a given vector.
This is especially useful for finding the one-hot vectors within a tensor which match a set of one-hot vectors of interest.
I have some code to illustrate the shortcoming so far:
# standard
import tensorflow as tf
import numpy as np
from sklearn.preprocessing import LabelBinarizer
sess = tf.Session()
# one-hot vector encoding labels
l = LabelBinarizer()
l.fit(['a','b','c'])
# input tensor
t = tf.constant(l.transform(['a','a','c','b', 'a']))
# find the indices where 'c' is label
# ***THIS WORKS***
np.all(t.eval(session = sess) == l.transform(['c']), axis = 1)
# We need to do everything in tensorflow and then wrap in Lambda layer for keras so...
from keras import backend as K
# ***THIS DOES NOT WORK***
K.all(t.eval(session = sess) == l.transform(['c']), axis = 1)
# go on from here to get a smaller subset of vectors from another tensor with the indices passed to `tf.gather`
As the code above shows, the conditional along an axis works fine in numpy, but it does not port directly to tensorflow.
Is there a better way to do this?

Similarly to what you do, we can use tf.reduce_all which is the tensorflow equivalent of np.all:
tf.reduce_all(t.eval(session = sess) == l.transform(['c']), axis = 1)
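To stay entirely inside TensorFlow, without evaluating to NumPy first, the same row-wise test can be sketched roughly as follows. This assumes TensorFlow 2.x with eager execution, unlike the Session-based code in the question; the tensors `t`, `target`, and `other` are made-up stand-ins for the one-hot labels and the tensor to gather from.

```python
import tensorflow as tf

# One-hot rows for the labels ['a', 'a', 'c', 'b', 'a'] (assumed column order: a, b, c).
t = tf.constant([[1, 0, 0],
                 [1, 0, 0],
                 [0, 0, 1],
                 [0, 1, 0],
                 [1, 0, 0]])
target = tf.constant([0, 0, 1])  # one-hot vector for 'c'

# Element-wise comparison (broadcast over rows), then reduce along axis 1
# so a row counts as a match only if every element matches.
row_mask = tf.reduce_all(tf.equal(t, target), axis=1)

# tf.where on a boolean vector returns indices of shape (num_matches, 1).
row_idx = tf.squeeze(tf.where(row_mask), axis=1)

# Hypothetical second tensor whose rows are selected with those indices.
other = tf.constant([[10.0], [11.0], [12.0], [13.0], [14.0]])
selected = tf.gather(other, row_idx)
```

Because only TensorFlow ops are involved, the same three calls could presumably be placed inside a Keras Lambda layer.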

Related

Custom Pytorch layer to apply LSTM on each group

I have an N × F tensor with features and an N × 1 tensor with group indices. I want to design a custom PyTorch layer that applies an LSTM to each group with sorted features. The LSTM is only an example; in principle it could be any module that supports variable-length or sequence input. Please refer to the image below for visual interpretation of the problem.
The obvious approach would be to call an LSTM layer for each unique group, but that would be inefficient. Is there a better way to do it?
You can certainly parallelize the LSTM application -- the problem is indexing the feature tensor efficiently.
The best thing I could come up with (I use something similar for my own stuff) is a list comprehension over the unique group ids that builds a list of variable-length tensors, which are then padded to a common length so the LSTM can run on top.
In code:
import torch
from torch import Tensor
from torch.nn.utils.rnn import pad_sequence
n = 13
f = 77
n_groups = 3
xs = torch.rand(n, f)
ids = torch.randint(low=0, high=n_groups, size=(n,))
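def groupbyid(xs: Tensor, ids: Tensor, batch_first: bool = True,
              padding_value: float = 0.0) -> Tensor:
    # Collect the rows of each group, then pad every group to the longest one.
    return pad_sequence([xs[ids == idx] for idx in ids.unique()],
                        batch_first=batch_first,
                        padding_value=padding_value)
grouped = groupbyid(xs, ids)
print(grouped.shape)
# e.g. torch.Size([3, 5, 77]): (groups, longest group, f); the middle size depends on the random ids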
You can then apply your LSTM in parallel over the n_groups dimension on the grouped Tensor.
Note that you will also need to inspect the content of ids.unique() to assign each LSTM output to its corresponding group id, but this is easy to write and depends on your application.
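A rough sketch of that last step, under some assumptions: the hidden size of 8 is arbitrary, the final hidden state is taken as the per-group output, and `group_to_state` is a hypothetical name. It uses pack_padded_sequence so that the LSTM skips the padded steps, which goes slightly beyond the padding-only approach described above.

```python
import torch
from torch import nn
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

torch.manual_seed(0)
n, f, n_groups, hidden = 13, 77, 3, 8  # hidden size is an arbitrary choice
xs = torch.rand(n, f)
ids = torch.randint(low=0, high=n_groups, size=(n,))

# unique() returns the ids in sorted order; counts holds each group's true length.
unique_ids, counts = ids.unique(return_counts=True)
grouped = pad_sequence([xs[ids == idx] for idx in unique_ids], batch_first=True)

# Packing tells the LSTM to ignore the padded time steps.
packed = pack_padded_sequence(grouped, counts, batch_first=True,
                              enforce_sorted=False)
lstm = nn.LSTM(input_size=f, hidden_size=hidden, batch_first=True)
_, (h_n, _) = lstm(packed)

# h_n[-1] has one row per group, in the same order as unique_ids.
group_to_state = {int(g): h_n[-1][i] for i, g in enumerate(unique_ids)}
```

Keeping `unique_ids` around is what makes the mapping back to group ids a one-liner.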

TF2.x Custom loss function, numpy

I'm trying to use a custom loss function. I'm now using TF 2.x where eager execution is turned on by default. I gave this a go with TF 1.x, but ran into too many problems. Is there any alternative to wrapping my function with tf.py_function()? If not, how would I wrap this?
General purpose: Autoencoder with a custom loss function built around unusual ranked differences. For now I'm just using scipy stats rankdata, but that will change in the future.
Tensor shape: n, x, x, 1
n images, each of dim x, x.
Therefore, I want to run this custom loss function on each pair of orig, pred for all n images.
General algorithm:
import scipy.stats as ss
def rank_loss(orig, pred):
    orig_arr = orig.numpy()  # want x,x,1
    pred_arr = pred.numpy()
    orig_rank = ss.rankdata(orig_arr)  # returns a flat array with one rank per element
    pred_rank = ss.rankdata(pred_arr)
    distance_diff = 0
    for i in range(len(orig_rank)):  # accumulate the sum of absolute rank differences
        distance_diff += abs(orig_rank[i] - pred_rank[i])
    return distance_diff
If I can't do this, am I limited to the available tf.<funcs> or how can I pull out the tensor as some form of an array so I can run comparison computations across the two tensors?
I also looked at tf.make_ndarray, but that doesn't seem applicable.

How do I get back the error values of a keras loss function (tensor)

I would like to plot all the different loss functions available in Keras. Therefore I have created a dataframe and invoke the loss function. But how can I get back the values from the tensor?
import numpy as np
import pandas as pd
from keras import losses
points = 100
df = pd.DataFrame({"error": np.linspace(-3,3,points)})
df["mean_squared_error"] = losses.mean_squared_error(np.zeros(points), df["error"])
df.plot(x="error")
The loss functions in Keras return a Tensor object. To get its actual value, evaluate it with the backend's eval() function. Also, if you look at the definition of a Keras loss function such as mean_squared_error(), you will see a K.mean() operation that averages over the last axis, which is the output axis (not the batch or sample axis). You therefore need to pass the true and predicted values with shape (n_samples, n_outputs), hence the reshapes in the following code:
import numpy as np
import pandas as pd
from keras import losses
from keras import backend as K
points = 100
df = pd.DataFrame({"error": np.linspace(-3,3,points)})
mse_loss = losses.mean_squared_error(np.zeros((points,1)), df["error"].values.reshape(-1,1))
df["mean_squared_error"] = K.eval(mse_loss)
df.plot(x="error")
Here is the output plot:
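As a side check of the shape argument, the effect of averaging over the last axis can be reproduced in plain NumPy. The helper `mse_last_axis` below is a hypothetical stand-in that mimics the reduction described above; it is not the Keras implementation.

```python
import numpy as np

points = 100
error = np.linspace(-3, 3, points)

def mse_last_axis(y_true, y_pred):
    # Mean of the squared error over the last axis, as described above.
    return np.mean(np.square(y_pred - y_true), axis=-1)

# 1-D inputs: the last axis is the only axis, so everything collapses to one number.
flat = mse_last_axis(np.zeros(points), error)

# (n_samples, 1) inputs: the mean runs over the single output, leaving one value per sample.
per_sample = mse_last_axis(np.zeros((points, 1)), error.reshape(-1, 1))
```

With the reshaped inputs, `per_sample` holds one squared error per point, which is what the plot needs.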

Keras model.predict for multinomial logistic regression

I'm training a model whose output is a softmax layer of size 19. When I try model.predict(x), for each input, I get what appears to be a probability distribution across the 19 classes. I tried model.predict_classes, and got a numpy array of the size of x, with each output equal to 0. How can I get one hot vectors for the output?
The documentation of predict_classes is somewhat misleading: if you check its implementation carefully, you'll find that it works only for binary classification. To solve your problem you can use numpy's argmax function as follows:
import numpy as np
classes = np.argmax(model.predict(x), axis = 1)
This gives an array with a class number for each example. To get one-hot vectors, you can use the Keras built-in function to_categorical as follows:
import numpy as np
from keras.utils.np_utils import to_categorical
classes_one_hot = to_categorical(np.argmax(model.predict(x), axis = 1))
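If a Keras import is not wanted, the same conversion can be sketched in NumPy alone. The array `probs` below is a made-up stand-in for the output of model.predict(x), with 4 classes instead of 19 to keep it short.

```python
import numpy as np

# Stand-in for model.predict(x): 3 samples, 4 classes (made-up numbers).
probs = np.array([[0.1, 0.6, 0.2, 0.1],
                  [0.7, 0.1, 0.1, 0.1],
                  [0.2, 0.2, 0.1, 0.5]])

classes = np.argmax(probs, axis=1)      # class index per sample
num_classes = probs.shape[1]            # take the width from the softmax output
one_hot = np.eye(num_classes)[classes]  # row i of the identity matrix is the one-hot vector for class i
```

Taking `num_classes` from the prediction width keeps the one-hot vectors at full length even when the highest class never appears in a batch; to_categorical infers the width from the largest label unless num_classes is passed.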

Slice 3-d tensor of Theano

I'm building a character-based RNN model using Keras (Theano backend). One thing to note is that I don't want to use a prebuilt loss function. Instead, I want to calculate the loss for only some data points. Here's what I mean.
The vectorized training set and its labels look like this:
X_train = np.array([[0,1,2,3,4]])
y_train = np.array([[1,2,3,4,5]])
However, I replaced the first k elements of y_train with 0. For example, with k = 2 the new y_train is
y_train = np.array([[0,0,3,4,5]])
The reason why I set the first two elements to 0 is I don't want to include them when computing loss. In other words, I want to calculate the loss between X_train[2:] and y_train[2:].
Here's my try.
import numpy as np
np.random.seed(0) # for reproducibility
from keras.preprocessing import sequence
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Embedding
from keras.layers import LSTM
from keras.layers.wrappers import TimeDistributed
X_train = np.array([[0,1,2,3,4]])
y_train = np.array([[0,0,3,4,5]])
y_3d = np.zeros((y_train.shape[0], y_train.shape[1], 6))
for i in range(y_train.shape[0]):
    for j in range(y_train.shape[1]):
        y_3d[i, j, y_train[i,j]] = 1
model = Sequential()
model.add(Embedding(6, 5, input_length=5, dropout=0.2))
model.add(LSTM(5, input_shape=(5, 12), return_sequences=True) )
model.add(TimeDistributed(Dense(6))) #output classes =6
model.add(Activation('softmax'))
from keras import backend as K
import theano.tensor as T
def custom_objective(y_true,y_pred):
    # Find the last index of minimum value in y_true, axis=-1
    # For example, y_train = np.array([[0,0,3,4,5]]) in my example, and
    # I'd like to calculate the loss only between X_train[2:] and y_train[2:] because the values
    # in y_train[:2] (i.e. 0) are dummies. The following is pseudo code if y_true is 1-d numpy array, which is not true.
    def rindex(y_true):
        for i in range(len(y_true), -1, -1):
            if y_true(i) == 0:
                return i
    starting_point = rindex(y_true)
    return K.categorical_crossentropy(y_pred[starting_point:], y_true[starting_point:])
model.compile(loss=custom_objective,
              optimizer='adam',
              metrics=['accuracy'])
model.fit(X_train, y_t, batch_size=batch_size, nb_epoch=1)
Apart from minor errors, like the wrong parenthesis in line 35 and a wrong variable name in the last line, there are two problems with your code.
First, the model you defined will return a matrix of probability distributions (due to the softmax activation) for classes at each timestep.
But in custom_objective you are treating the output as vectors. You are already correctly transforming y_train to a matrix above.
So you would first have to get the actual predictions. The simplest approach is to assign the class with the highest probability, i.e.:
y_pred = y_pred.argmax(axis=2)
y_true = y_true.argmax(axis=2) # this reconstructs y_train resp. a subset thereof
The second problem is that you are treating these like real variables (numpy arrays).
However, y_true and y_pred are symbolic tensors. The error you get clearly states one of the resulting problems:
TypeError: object of type 'TensorVariable' has no len()
TensorVariables have no length, because it is simply not known before real values are inserted! This also makes iteration the way you implemented it impossible.
By the way, when you do iterate backward over real vectors, you might want to use range(len(y_true)-1, -1, -1) to avoid going out of bounds, or even for val in y_true[::-1].
To achieve what you want, you need to treat the corresponding variables as what they are and use methods supplied for tensors.
The core of this calculation is the argmin function, which finds the minimum. By default it returns the first occurrence of the minimum.
Since you want the last occurrence, we need to apply it to the reversed tensor and convert the result back to an index into the original vector.
starting_point = y_true.shape[0] - y_true[::-1].argmin() - 1
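The index arithmetic can be checked on a concrete NumPy vector. This is only an illustration with the y_train values from the question; the symbolic Theano expression above follows the same steps.

```python
import numpy as np

y_true = np.array([0, 0, 3, 4, 5])

# argmin on the reversed vector finds the last minimum, counted from the end...
pos_from_end = y_true[::-1].argmin()

# ...so convert it back to an index into the original vector.
starting_point = y_true.shape[0] - pos_from_end - 1

# starting_point is the index of the last dummy value,
# so the real targets begin one position later.
real_targets = y_true[starting_point + 1:]
```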
Possibly, there might be an even simpler solution to your problem as it looks like you are trying to implement something like masking.
You might want to take a look at the mask_zero=True flag for Embedding layers. This would work on the input side, though.
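On the output side, the idea behind masking can be sketched in NumPy: compute the loss at every time step and then zero out the steps whose target is the dummy value. This illustrates the concept only and is not how Keras implements mask_zero; the uniform `y_pred` is made up.

```python
import numpy as np

y_train = np.array([[0, 0, 3, 4, 5]])   # 0 marks a dummy target
# Made-up predictions: a uniform distribution over 6 classes at every time step.
y_pred = np.full((1, 5, 6), 1.0 / 6.0)

mask = (y_train != 0).astype(float)     # 1.0 where the target is real

# Per-step cross-entropy: -log of the probability assigned to the true class.
true_prob = np.take_along_axis(y_pred, y_train[..., None], axis=-1)[..., 0]
per_step = -np.log(true_prob)

# Average only over the unmasked steps.
masked_loss = (per_step * mask).sum() / mask.sum()
```

Multiplying by a mask avoids data-dependent slicing, which is what makes the approach workable with symbolic tensors.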
