I've been trying to do something seemingly simple, with no success.
I have a (?, 4) tensor, where each row contains 4 floats between 0 and 1.
I want to replace this with a new tensor where each row has only the top 2 entries and zeros everywhere else.
Example with a (2, 4):
source = [ [0.1, 0.2, 0.5, 0.6],
           [0.8, 0.7, 0.2, 0.1] ]
result = [ [0.0, 0.0, 0.5, 0.6],
           [0.8, 0.7, 0.0, 0.0] ]
I tried using top_k on the source and then using scatter_nd with the indices returned by top_k, but it has literally been 4 hours of mismatched shapes and rank errors in scatter_nd.
I'm ready to give up, but I thought I would ask for help here first.
I've found a couple of questions here closely related, but I'm failing to generalize the info in there for my case.
Another approach I just tried is this:
tensor = tf.constant([[0.1, 0.2, 0.8], [0.1, 0.2, 0.7]])
values, indices = tf.nn.top_k(tensor, 1)
elems = (tensor, values)
masked_a = tf.map_fn(
    lambda a: tf.where(tf.greater_equal(a[0], a[1]), a[0],
                       tf.zeros_like(a[0])),
    elems)
but this one gives me the following error:
ValueError: The two structures don't have the same number of elements.
First structure (2 elements): (tf.float32, tf.float32)
Second structure (1 elements): Tensor("map/while/Select:0", shape=(3,), dtype=float32)
I'm relatively new with TensorFlow, so apologies if I'm missing something simple or being unclear.
Thanks!
You can do it with tf.scatter_nd by pairing each column index returned by top_k with its row index.
import tensorflow as tf
source = tf.constant([
    [0.1, 0.2, 0.5, 0.6],
    [0.8, 0.7, 0.2, 0.1]])
# get indices of top k
k = 2
top_k, top_k_inds = tf.nn.top_k(source, k)
# the indices are only columns; we will stack them
# with the row indices so scatter_nd gets full
# (row, col) coordinates. First make a tensor of
# row numbers, i.e.
# [[0, 0],
#  [1, 1],
#  ...
num_rows = tf.shape(source)[0]
row_range = tf.range(num_rows)
row_tensor = tf.tile(row_range[:,None], (1, k))
# stack along the final dimension, as this is what
# scatter_nd uses as the indices
top_k_row_col_indices = tf.stack([row_tensor, top_k_inds], axis=2)
# to mask off everything but the top k, we will
# multiply the source by a 0/1 mask, so all the
# scattered updates are just 1
updates = tf.ones([num_rows, k], dtype=tf.float32)
# build the mask
zero_mask = tf.scatter_nd(top_k_row_col_indices, updates, [num_rows, 4])
with tf.Session() as sess:
    zeroed = source * zero_mask
    print(zeroed.eval())
This should print
[[0. 0. 0.5 0.6]
[0.8 0.7 0. 0. ]]
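As an aside, your map_fn attempt fails only because, when elems is a tuple, tf.map_fn assumes fn returns that same tuple structure unless an output dtype is given explicitly. Passing one should make it run (though it still loops row by row, unlike the single scatter_nd above):
masked_a = tf.map_fn(
    lambda a: tf.where(tf.greater_equal(a[0], a[1]), a[0],
                       tf.zeros_like(a[0])),
    elems, dtype=tf.float32)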
Just paste some lines of code :)
import tensorflow as tf
def attach_indice(tensor, top_k = None):
    # Pair every element of `tensor` with the indices of its leading
    # dimensions, so that when `tensor` holds top_k column indices the
    # rows form full coordinates suitable for tf.scatter_nd.
    flatty = tf.reshape(tensor, [-1])
    orig_shape = tf.shape(tensor)
    length = tf.shape(flatty)[0]
    if top_k is not None:
        orig_shape = orig_shape[:-1] # drop the top_k dim
        length //= top_k
    indice = tf.unravel_index(tf.range(length), orig_shape)
    indice = tf.transpose(indice)
    if indice.dtype != tensor.dtype:
        indice = tf.cast(indice, tensor.dtype)
    if top_k is not None:
        _dims = len(tensor.shape) - 1 # rank of the index part
        shape = [1 for _ in range(_dims)]
        shape[-1] *= top_k
        indice = tf.reshape(tf.tile(indice, shape), [-1, _dims])
    return tf.concat([indice, flatty[:, None]], -1)
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
# tf.enable_eager_execution()
from time import time
top_k = 3
shape = [50, 40, 100]
q = tf.random_uniform(shape)
# fast: 4.376221179962158 (GPU) / 2.483684778213501 (CPU)
v, k = tf.nn.top_k(q, top_k)
k = attach_indice(k, top_k)
s = tf.scatter_nd(k, tf.reshape(v, [-1]), shape)
# very slow: 281.82796931266785 (GPU) / 35.163344860076904 (CPU)
# s = tf.map_fn(lambda v__k__: tf.map_fn(lambda v_k_: tf.scatter_nd(v_k_[1][:, None], v_k_[0], [shape[-1]]), v__k__, q.dtype), tf.nn.top_k(q, top_k), q.dtype)
start = time()
with tf.Session() as sess:
    for _ in range(1000):
        sess.run(s)
print('time', time() - start)
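For reference, s is the 3-D generalization of the original question's goal: q with everything except the top 3 entries along the last axis zeroed out. The commented-out map_fn version is slow because tf.map_fn builds a while_loop and calls scatter_nd once per inner row, whereas the version above uses a handful of vectorized ops.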
I have a 3D tensor with batch, sequence and feature dimensions (N, s, e). It is a sequence of probability distributions, and I want to order them according to the integer corresponding to the highest prediction. So say
x_probabs = [[[0.5, 0.1, 0.4], [0.3, 0.3, 0.4], [0.1, 0.8, 0.1]]]; # shape (N, s, e)
x = tf.argmax(x_probabs, axis=-1) = [[0, 2, 1]]; # shape (N, s)
or another example would be
x_probabs = [[[0.6, 0.1, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.1, 0.6], [0.1, 0.1, 0.1, 0.6, 0.1]]];
x = [[0, 4, 3]];
If I wanted to order x, I could do ordered_x = tf.sort(x, axis=-1), and to get the ordering I could do indices_sorted_x = tf.argsort(x, axis=-1). I want the same ordering applied to x_probabs, and I am confused about how to do that. I have tried sorted_x_probabs = tf.gather(x_probabs, indices_sorted_x), but it doesn't work because the indices are for a 2D tensor, not a 3D one. I'm stuck here.
The following is what it would look like for the first example
sorted_x = [[0,1,2]];
sorted_x_probabs = [[[0.5, 0.1, 0.4], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]];
This would be for the 2nd example
sorted_x = [[0,3,4]];
sorted_x_probabs = [[[0.6, 0.1, 0.1, 0.1, 0.1],[0.1,0.1,0.1,0.6,0.1],[0.1,0.1,0.1,0.1,0.6]]];
Thank you very much in advance.
You can add the batch_dims argument to tf.gather so the gathering starts below the batch dimension:
sorted_x_probabs = tf.gather(x_probabs, indices_sorted_x, batch_dims=1)
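For instance, with the first example from the question (this needs a recent TF version for tf.argsort and the batch_dims argument):
x_probabs = tf.constant([[[0.5, 0.1, 0.4],
                          [0.3, 0.3, 0.4],
                          [0.1, 0.8, 0.1]]])       # shape (1, 3, 3)
x = tf.argmax(x_probabs, axis=-1)                  # [[0, 2, 1]]
indices_sorted_x = tf.argsort(x, axis=-1)          # [[0, 2, 1]]
sorted_x_probabs = tf.gather(x_probabs, indices_sorted_x, batch_dims=1)
# [[[0.5, 0.1, 0.4], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]]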
I have a Numpy array, and I need to find the N maximum product subarrays of M elements. For example, I have the array p = [0.1, 0.2, 0.8, 0.5, 0.7, 0.9, 0.3, 0.5] and I want to find the 5 highest product subarrays of 3 elements. Is there a "fast" way to do that?
Here is another quick way to do it:
import numpy as np
p = [0.1, 0.2, 0.8, 0.5, 0.7, 0.9, 0.3, 0.5]
n = 5
m = 3
# Cumulative product (starting with 1)
pc = np.cumprod(np.r_[1, p])
# Cumulative product of each window
w = pc[m:] / pc[:-m]
# Indices of the first elements of the top N windows.
# Note: kth must be -n (not n) so the last n positions
# are guaranteed to hold the n largest products
idx = np.argpartition(w, -n)[-n:]
print(idx)
# e.g. [1 2 5 4 3] (order within the top n is arbitrary)
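Why this works: pc[i] is the product of the first i elements of p, so pc[i + m] / pc[i] is exactly the product of the window p[i:i+m]. One caveat worth adding: the division breaks down if p contains zeros, so this trick assumes strictly positive entries.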
Approach #1
We can create sliding windows, perform a prod reduction, and finally use np.argpartition to get the top N among them -
import numpy as np
from skimage.util.shape import view_as_windows
def topN_windowed_prod(a, W, N):
    w = view_as_windows(a, W)
    return w[w.prod(1).argpartition(-N)[-N:]]
Sample run -
In [2]: p = np.array([0.1, 0.2, 0.8, 0.5, 0.7, 0.9, 0.3, 0.5])
In [3]: topN_windowed_prod(p, W=3, N=2)
Out[3]:
array([[0.8, 0.5, 0.7],
[0.5, 0.7, 0.9]])
Note that the order is not maintained with np.argpartition. So, if we need the top N in descending order of prod values, we can re-sort the N selected windows afterwards, as sketched below.
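For instance, a small variant (the helper name topN_windowed_prod_sorted is mine) that returns the windows with the largest product first:
def topN_windowed_prod_sorted(a, W, N):
    w = view_as_windows(a, W)
    prods = w.prod(1)
    idx = prods.argpartition(-N)[-N:]           # top N, arbitrary order
    return w[idx[prods[idx].argsort()[::-1]]]   # largest product first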
Approach #2
For smaller window lengths, we can simply slice and get our desired result, like so -
def topN_windowed_prod_with_slicing(a, W, N):
    w = view_as_windows(a, W)
    L = len(a) - W + 1
    acc = a[:L].copy()
    for i in range(1, W):
        acc *= a[i:i+L]
    idx = acc.argpartition(-N)[-N:]
    return w[idx]
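A quick sanity check on the same sample input, which should return the same two windows as Approach #1:
print(topN_windowed_prod_with_slicing(p, W=3, N=2))
# expected:
# [[0.8 0.5 0.7]
#  [0.5 0.7 0.9]]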
I'm implementing the backward HMM algorithm in PyTorch, using this link as a reference. The link contains the results of the numerical example used (I am attempting to implement that and compare my generated results to it). On page 3, section 2, "Backward probability", there is a table containing the calculated results.
Here is my code:
import numpy as np
import torch

# Initial transition matrix as shown on page 2 of the above link
A = np.array([[0.6, 0.4], [0.3, 0.7]])
A = torch.from_numpy(A)
# Initial State Probability (page 2)
pi = np.array([0.8, 0.2])
pi = torch.from_numpy(pi)
# Output probabilities (page 2)
emission_matrix = np.array([[0.3, 0.4, 0.3, 0.3], [0.4, 0.3, 0.3, 0.3]])
emission_matrix = torch.from_numpy(emission_matrix)
# Initialize empty 2x4 matrix (dimensions of emission matrix)
backward = torch.zeros(emission_matrix.shape, dtype=torch.float64)
# Backward algorithm
def _backward(emission_matrix):
    # Initialization: A(i, j) * B(T, i) * B(Ot+1, j), where B(Ot+1, j) = 1
    backward[:, -1] = torch.matmul(A, emission_matrix[:, -1])
    # I reversed the emission matrix (minus its last column) so as to start
    # from the last column
    rev_emission_mat = torch.flip(emission_matrix[:, :-1], [1])
    # I transposed the reversed emission matrix such that each iterable in
    # the for loop is one time step's observation probabilities
    T_rev_emission_mat = torch.transpose(rev_emission_mat, 1, 0)
    # This step assigns a reverse index enumeration to each iterable, so the
    # emission matrix is traversed from time T to 0 rather than the opposite
    zipped_cols = list(zip(range(len(T_rev_emission_mat) - 1, -1, -1), T_rev_emission_mat))
    for i, obs_prob in zipped_cols:
        # Induction: Σ A(i, j) * B(j)(Ot+1) * β(t+1, j)
        if i != 0:
            backward[:, i] = torch.matmul(A * obs_prob, backward[:, i + 1])
    # Termination: Σ π(i) * b_i(O_1) * β(1, i)
    backward[:, 0] = torch.matmul(pi * obs_prob, backward[:, 1])

# run the backward algorithm
_backward(emission_matrix)
# check the results (backward is the all-zero matrix initialized above)
print(backward)
>>> tensor([[0.0102, 0.0324, 0.0900, 0.3000],
[0.0102, 0.0297, 0.0900, 0.3000]], dtype=torch.float64)
As you can see, the 0-th index does not match the result in page 3 of the previous link. What did I do wrong? If there is anything I can clarify, please let me know. Thanks in advance!
The termination step should be an element-wise product, not a matmul: pi * obs_prob and backward[:, 1] both have shape (2,), so torch.matmul reduces them to a single dot product (0.0102), which then gets broadcast to both states of backward[:, 0]. Replace the last line of the function with
backward[:, 0] = pi * obs_prob * backward[:, 1]
which, for your inputs, works out to approximately [0.0078, 0.0024].
I have the following code:
import numpy as np
import tensorflow as tf
a = np.array([0.5, 0.5])
b = np.array([0.2, 0.2, 0.0, 0.0])
non_zeros = ~tf.equal(b, 0.)
cast_op = tf.cast(non_zeros, tf.float64)
new_vec = tf.multiply(a, cast_op) # won't work
# the required output is [0.5, 0.5, 0.0, 0.0]
I am trying to obtain the vector [0.5, 0.5, 0.0, 0.0] as explained in the code. Does anyone know how to do this? I also looked at tf.fill but that takes a scalar value, so won't work for me.
You get an error because tf.multiply expects tensors with broadcast-compatible shapes, and (2,) against (4,) is not. What you could do, however, is simply this:
a = np.array([0.5, 0.5])
b = np.array([0.2, 0.2, 0.0, 0.0])
b = np.logical_and(b, np.ones(b.shape)).astype(float)
a = np.concatenate((a, np.zeros(b.shape[0] - a.shape[0])))
new_vec = a * b
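If it helps, the same mask can also be built a bit more directly (my own phrasing of the same idea, starting again from the original a and b):
a = np.array([0.5, 0.5])
b = np.array([0.2, 0.2, 0.0, 0.0])
mask = (b != 0).astype(float)                                    # [1., 1., 0., 0.]
new_vec = np.concatenate((a, np.zeros(b.size - a.size))) * mask  # [0.5, 0.5, 0., 0.]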
You can exploit the broadcasting capability of the tf.multiply op.
I've added, next to every line, the shape of the tensor: please note the usage of tf.expand_dims to add a dimension of size 1 to the a tensor in order to get, after the multiplication, a tensor with shape (2, 4).
That tensor has repeated values (2 equal rows of 4 columns), hence we can just take the first row.
import numpy as np
import tensorflow as tf
a = np.array([0.5, 0.5]) #(2)
b = np.array([0.2, 0.2, 0.0, 0.0]) #(4)
non_zeros = ~tf.equal(b, 0.) #(4)
cast_op = tf.cast(non_zeros, tf.float64) # (4)
new_vec = tf.multiply(tf.expand_dims(a, axis=1),
                      cast_op) # (2, 1) * (4) = (2, 4)
new_vec = new_vec[0, :] # (4)
print(new_vec)
sess = tf.InteractiveSession()
print(sess.run(new_vec))
This code produces [0.5 0.5 0. 0.]
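If you'd rather avoid the (2, 4) intermediate entirely, an alternative sketch (my variation, not part of the answer above) pads a with zeros up to b's length and multiplies element-wise:
a_padded = tf.concat([a, tf.zeros(tf.shape(cast_op) - tf.shape(a),
                                  dtype=tf.float64)], axis=0) # (4)
new_vec = a_padded * cast_op # [0.5, 0.5, 0., 0.]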
I have labels, to be one-hot encoded, in the form of examples = tf.placeholder(tf.int32, [batch_size]), where each example is an int in the range 0:ohe_size.
My output is in the form of a softmax probability distribution with a shape [batch_size, ohe_size]
I'm trying to work out how to create a mask that will give me just the probability distribution for each example. e.g.
probs = [[0.1, 0.6, 0.3],
         [0.2, 0.1, 0.7],
         [0.9, 0.1, 0.0]]
examples = [2, 2, 0]
some_mask_func(probs, examples) # <- Need this function
> [0.3, 0.7, 0.9]
If I understood your example correctly, you need tf.gather_nd
range = tf.range(tf.shape(examples)[0])
indices = tf.pack([range, examples], axis=1)
result = tf.gather_nd(probs, indices)
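A quick worked check with the numbers from the question (note that tf.pack was renamed tf.stack in TF 1.0, which is what I use here):
probs = tf.constant([[0.1, 0.6, 0.3],
                     [0.2, 0.1, 0.7],
                     [0.9, 0.1, 0.0]])
examples = tf.constant([2, 2, 0])
rng = tf.range(tf.shape(examples)[0])         # [0, 1, 2]
indices = tf.stack([rng, examples], axis=1)   # [[0, 2], [1, 2], [2, 0]]
result = tf.gather_nd(probs, indices)
with tf.Session() as sess:
    print(sess.run(result))                   # [0.3 0.7 0.9]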