Creating a new tensor based on the old

I have the following code:
import numpy as np
import tensorflow as tf
a = np.array([0.5, 0.5])
b = np.array([0.2, 0.2, 0.0, 0.0])
non_zeros = ~tf.equal(b, 0.)
cast_op = tf.cast(non_zeros, tf.float64)
new_vec = tf.multiply(a, cast_op) # won't work
# the required output is [0.5, 0.5, 0.0, 0.0]
I am trying to obtain the vector [0.5, 0.5, 0.0, 0.0] as explained in the code. Does anyone know how to do this? I also looked at tf.fill but that takes a scalar value, so won't work for me.

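You get an error because tf.multiply needs operands whose shapes are equal or broadcastable, and shapes (2,) and (4,) are neither. What you could do instead is pad a with zeros to the length of b and multiply by a 0/1 mask of b:
import numpy as np
a = np.array([0.5, 0.5])
b = np.array([0.2, 0.2, 0.0, 0.0])
b = np.logical_and(b, np.ones(b.shape)).astype(float)  # 1.0 where b is non-zero, else 0.0
a = np.concatenate((a, np.zeros(b.shape[0] - a.shape[0])))  # pad a with zeros to len(b)
new_vec = a * b  # [0.5, 0.5, 0.0, 0.0]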
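If the goal is to place the values of a at the positions where b is non-zero (an assumption about the intent; the question only gives one example), boolean-mask assignment in NumPy is a possible alternative. A minimal sketch, assuming the number of non-zero entries in b equals len(a):

```python
import numpy as np

a = np.array([0.5, 0.5])
b = np.array([0.2, 0.2, 0.0, 0.0])

# Start from zeros and write a into the slots where b is non-zero.
# Assumption: np.count_nonzero(b) == len(a), otherwise NumPy raises an error.
new_vec = np.zeros_like(b)
new_vec[b != 0] = a
print(new_vec)
```

Unlike zero-padding, this also works when the non-zero entries of b are not at the front.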

You can exploit the broadcasting capability of the tf.multiply op.
I've added the shape of each tensor as a comment next to every line. Note the use of tf.expand_dims to add a dimension of size 1 to the a tensor, so that the multiplication yields a tensor with shape (2, 4).
Because both entries of a are equal, the two rows of this tensor are identical, hence we can just take the first row:
import numpy as np
import tensorflow as tf
a = np.array([0.5, 0.5]) #(2)
b = np.array([0.2, 0.2, 0.0, 0.0]) #(4)
non_zeros = ~tf.equal(b, 0.) #(4)
cast_op = tf.cast(non_zeros, tf.float64) # (4)
new_vec = tf.multiply(tf.expand_dims(a, axis=1), cast_op) # (2, 1) * (4) = (2, 4)
new_vec = new_vec[0, :] # (4)
print(new_vec)
sess = tf.InteractiveSession()
print(sess.run(new_vec))
This code produces [0.5 0.5 0. 0.]
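The shape arithmetic can be checked without a TensorFlow session. The following is a NumPy sketch of the same broadcasting steps (NumPy is swapped in here for illustration; it is not the TensorFlow code above):

```python
import numpy as np

a = np.array([0.5, 0.5])            # shape (2,)
b = np.array([0.2, 0.2, 0.0, 0.0])  # shape (4,)

mask = (b != 0).astype(np.float64)  # shape (4,): 1.0 where b is non-zero
product = a[:, None] * mask         # (2, 1) * (4,) broadcasts to (2, 4)
new_vec = product[0]                # first row, shape (4,)

print(product.shape)
print(new_vec)
```

Taking the first row gives the wanted vector only because the entries of a are identical; with different entries each row would carry a different value.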

Related

Tensorflow 2: Sort a 3D tensor according to a 2D tensor

I have a 3D tensor with batch, sequence and feature dimensions (N, s, e). It is a sequence of probability distributions. I want to order the distributions according to the index of the highest prediction in each. So say
x_probabs = [[[0.5, 0.1, 0.4], [0.3, 0.3, 0.4], [0.1, 0.8, 0.1]]]  # shape (N, s, e)
x = tf.argmax(x_probabs, axis=-1)  # [[0, 2, 1]], shape (N, s)
or another example would be
x_probabs = [[[0.6, 0.1, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.1, 0.6], [0.1, 0.1, 0.1, 0.6, 0.1]]]
x = [[0, 4, 3]]
If I wanted to order x I can do ordered_x = tf.sort(x, axis=-1), and to get the ordering I can do indices_sorted_x = tf.argsort(x, axis=-1). I want the same ordering applied to x_probabs and I am confused about how to do that. I have tried sorted_x_probabs = tf.gather(x_probabs, indices_sorted_x), but it doesn't work because the indices are for a 2D tensor and not a 3D one. I'm stuck here.
The following is what it would look like for the first example:
sorted_x = [[0, 1, 2]]
sorted_x_probabs = [[[0.5, 0.1, 0.4], [0.1, 0.8, 0.1], [0.3, 0.3, 0.4]]]
This would be for the second example:
sorted_x = [[0, 3, 4]]
sorted_x_probabs = [[[0.6, 0.1, 0.1, 0.1, 0.1], [0.1, 0.1, 0.1, 0.6, 0.1], [0.1, 0.1, 0.1, 0.1, 0.6]]]
Thank you very much in advance.

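You can pass the batch_dims argument so that tf.gather treats the first dimension as a batch dimension and gathers along the sequence axis within each batch element. Use the argsort indices (not the argmax values themselves, which can exceed the sequence length) as the indices:
sorted_x_probabs = tf.gather(x_probabs, indices_sorted_x, batch_dims=1)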
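The same per-batch reordering can be sketched in NumPy with np.take_along_axis, which is used here as a stand-in for a batched gather (an analogue for illustration, not the TensorFlow call). The data is the first example from the question:

```python
import numpy as np

# Shape (N, s, e) = (1, 3, 3)
x_probabs = np.array([[[0.5, 0.1, 0.4],
                       [0.3, 0.3, 0.4],
                       [0.1, 0.8, 0.1]]])

x = np.argmax(x_probabs, axis=-1)        # (N, s): index of the top class
indices_sorted_x = np.argsort(x, axis=-1)  # (N, s): order that sorts x

# Add a trailing axis so the (N, s, 1) indices broadcast over the feature axis,
# then pick whole rows along the sequence axis (axis=1).
sorted_x_probabs = np.take_along_axis(
    x_probabs, indices_sorted_x[..., None], axis=1)

print(np.sort(x, axis=-1))
print(sorted_x_probabs)
```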

Summing three consecutive numbers when equal to or greater than 0 - Python

I am using numpy in Python
I have an array of numbers, for example:
arr = np.array([0.1, 1, 1.2, 0.5, -0.3, -0.2, 0.1, 0.5, 1])
If i is a position in the array, I want to create a function which creates a running sum of i and the two previous numbers, but only accumulating the number if it is equal to or greater than 0.
In other words, negative numbers in the array become equal to 0 when calculating the three number running sum.
For example, the answer I would be looking for here is
2.3, 2.7, 1.7, 0.5, 0.1, 0.6, 1.6
The new array has two fewer elements than the original array, as the calculation can't be completed for the first two numbers.
Thank you !

As Dani Mesejo answered, you can use stride tricks. You can use either clip or boolean indexing to handle the negative elements. Here is how the stride tricks work:
arr[arr<0]=0 sets all elements below 0 to 0.
as_strided takes the array, the expected shape of the view, (7,3), and the strides along the respective axes, (8,8). A stride is the number of bytes you have to move along an axis to reach the next element; here each float64 element occupies 8 bytes. For example, to step two elements at a time along axis 0 you would set the strides to (16,8): you would then move 16 bytes for each step along axis 0 (0.1->1.2->0->0.1->...) and 8 bytes for each step along axis 1 (0.1->1->1.2, up to a length of 3). The first dimension of the view would then have to shrink so that it stays within the array.
Use this function with caution! Always derive the strides parameter from x.strides, because as_strided does not check bounds and a wrong value makes the view read memory outside the array.
Lastly, sum this array view over axis=1 to get your rolling sum.
arr = np.array([0.1, 1, 1.2, 0.5, -0.3, -0.2, 0.1, 0.5, 1])
w = 3 #rolling window
arr[arr<0]=0
shape = arr.shape[0]-w+1, w #Expected shape of view (7,3)
strides = arr.strides[0], arr.strides[0] #Strides (8,8) bytes
rolling = np.lib.stride_tricks.as_strided(arr, shape=shape, strides=strides)
rolling_sum = np.sum(rolling, axis=1)
rolling_sum
array([2.3, 2.7, 1.7, 0.5, 0.1, 0.6, 1.6])
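If you would rather not compute strides by hand, NumPy 1.20 and later provide sliding_window_view, which builds the same kind of windowed view with bounds handled for you. A minimal sketch, assuming NumPy >= 1.20 is available:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20

arr = np.array([0.1, 1, 1.2, 0.5, -0.3, -0.2, 0.1, 0.5, 1])
w = 3  # rolling window length

clipped = np.clip(arr, 0, None)            # copy with negatives set to 0; arr is untouched
windows = sliding_window_view(clipped, w)  # read-only view of shape (7, 3)
rolling_sum = windows.sum(axis=1)

print(rolling_sum)
```

Clipping into a new array, rather than assigning arr[arr<0]=0, leaves the original data unmodified.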

You could clip, roll and sum:
import numpy as np
def rolling_window(a, window):
    """Recipe from https://stackoverflow.com/q/6811183/4001592"""
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)
a = np.array([0.1, 1, 1.2, 0.5, -0.3, -0.2, 0.1, 0.5, 1])
res = rolling_window(np.clip(a, 0, a.max()), 3).sum(axis=1)
print(res)
Output
[2.3 2.7 1.7 0.5 0.1 0.6 1.6]

You may use np.correlate to sweep an array of 3 ones over the clipped arr to get the desired output:
In [20]: np.correlate(arr.clip(0), np.ones(3), mode='valid')
Out[20]: array([2.3, 2.7, 1.7, 0.5, 0.1, 0.6, 1.6])

arr = np.array([0.1, 1, 1.2, 0.5, -0.3, -0.2, 0.1, 0.5, 1])
def sum_3(x):
    collector = []
    for i in range(len(x) - 2):
        window = x[i:i+3]
        collector.append(sum(window[window > 0]))  # add only the positive values
    return collector
sum_3(arr)
# output (up to floating-point rounding)
[2.3, 2.7, 1.7, 0.5, 0.1, 0.6, 1.6]
This is the easiest and most comprehensible way. In each window of 3 consecutive numbers, the boolean mask keeps only the positive values, so negative numbers contribute 0 to the sum.
The function above is specific to windows of 3, but you can generalise it to any window length n:
def sum_any(x, n):
    collector = []
    for i in range(len(x) - (n - 1)):
        window = x[i:i+n]
        collector.append(sum(window[window > 0]))
    return collector

Masked arrays and view_as_windows (which uses numpy strides under the hood) are built for this purpose:
from skimage.util import view_as_windows
arr = view_as_windows(arr, 3)
arr2 = np.ma.masked_array(arr, arr<0).sum(-1)
output:
[2.3 2.7 1.7 0.5 0.1 0.6 1.6]

How to find N maximum product subarrays of M elements of a Numpy array?

I have a Numpy array, and I need to find the N maximum product subarrays of M elements. For example, I have the array p = [0.1, 0.2, 0.8, 0.5, 0.7, 0.9, 0.3, 0.5] and I want to find the 5 highest product subarrays of 3 elements. Is there a "fast" way to do that?

Here is another quick way to do it (it divides cumulative products, so it assumes p contains no zeros):
import numpy as np
p = [0.1, 0.2, 0.8, 0.5, 0.7, 0.9, 0.3, 0.5]
n = 5
m = 3
# Cumulative product (starting with 1)
pc = np.cumprod(np.r_[1, p])
# Product of each window of m elements
w = pc[m:] / pc[:-m]
# Indices of the first element of the top n windows (in no particular order)
idx = np.argpartition(w, -n)[-n:]
print(idx)
# e.g. [1 2 5 4 3]
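If the windows are needed in descending order of product, a full argsort is a simple option. The sketch below swaps in sliding_window_view (assuming NumPy >= 1.20) instead of the cumulative-product trick, so it also tolerates zeros in p:

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view  # NumPy >= 1.20

p = np.array([0.1, 0.2, 0.8, 0.5, 0.7, 0.9, 0.3, 0.5])
n, m = 5, 3  # top n windows of m elements

windows = sliding_window_view(p, m)  # shape (len(p) - m + 1, m)
prods = windows.prod(axis=1)         # product of each window

top = np.argsort(prods)[::-1][:n]    # start indices, largest product first
top_windows = windows[top]

print(top)
print(top_windows)
```

The argsort costs O(L log L) for L windows, whereas argpartition is O(L); for large inputs it may be cheaper to partition first and sort only the n selected entries.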

Approach #1
We can create sliding windows and then perform prod reduction and finally np.argpartition to get top N ones among them -
from skimage.util.shape import view_as_windows
def topN_windowed_prod(a, W, N):
    w = view_as_windows(a,W)
    return w[w.prod(1).argpartition(-N)[-N:]]
Sample run -
In [2]: p = np.array([0.1, 0.2, 0.8, 0.5, 0.7, 0.9, 0.3, 0.5])
In [3]: topN_windowed_prod(p, W=3, N=2)
Out[3]:
array([[0.8, 0.5, 0.7],
[0.5, 0.7, 0.9]])
Note that the order is not maintained with np.argpartition. So, if we need the top N in descending order of prod values, pass range(N) as the kth argument to argpartition on the negated products and take the first N indices.
Approach #2
For smaller window lengths, we can simply slice and get our desired result, like so -
def topN_windowed_prod_with_slicing(a, W, N):
    w = view_as_windows(a,W)
    L = len(a)-W+1
    acc = a[:L].copy()
    for i in range(1,W):
        acc *= a[i:i+L]
    idx = acc.argpartition(-N)[-N:]
    return w[idx]

Using scatter_nd with top_k output

I've been trying to do something seemingly simple, with no success.
I have a (?,4) tensor, where each row will be 4 floats between 0 and 1.
I want to replace this with a new tensor where each row has only the top 2 entries and zeros everywhere else.
Example with a (2, 4):
source = [ [0.1, 0.2, 0.5, 0.6],
[0.8, 0.7, 0.2, 0.1] ]
result = [ [0.0, 0.0, 0.5, 0.6],
[0.8, 0.7, 0.0, 0.0] ]
I tried using top_k on the source and then using scatter_nd with the indices returned by top_k, but it has literally been 4 hours of mismatched shapes and rank errors in scatter_nd.
I'm ready to give up, but I thought I would ask for help here first.
I've found a couple of questions here closely related, but I'm failing to generalize the info in there for my case.
Another approach I just tried is this:
tensor = tf.constant( [ [0.1, 0.2, 0.8], [0.1, 0.2, 0.7] ])
values, indices = tf.nn.top_k(tensor, 1)
elems = (tensor, values)
masked_a = tf.map_fn(
    lambda a: tf.where(tf.greater_equal(a[0], a[1]), a[0], tf.zeros_like(a[0])),
    elems)
but this one gives me the following error:
ValueError: The two structures don't have the same number of elements.
First structure (2 elements): (tf.float32, tf.float32)
Second structure (1 elements): Tensor("map/while/Select:0", shape=(3,), dtype=float32)
I'm relatively new with TensorFlow, so apologies if I'm missing something simple or being unclear.
Thanks!

You can do it with tf.scatter_nd by pairing each column index returned by top_k with its row index.
import tensorflow as tf
source = tf.constant([
[0.1, 0.2, 0.5, 0.6],
[0.8, 0.7, 0.2, 0.1]])
# get indices of top k
k = 2
top_k, top_k_inds = tf.nn.top_k(source, k)
# top_k_inds holds only column indices, so build a tensor
# of row numbers with the same shape, i.e.
# [[0, 0],
#  [1, 1],
#  ...
num_rows = tf.shape(source)[0]
row_range = tf.range(num_rows)
row_tensor = tf.tile(row_range[:,None], (1, k))
# stack along the final dimension, as this is what
# scatter_nd uses as the indices
top_k_row_col_indices = tf.stack([row_tensor, top_k_inds], axis=2)
# the mask should be 1 at every top-k position,
# so all the updates are just 1
updates = tf.ones([num_rows, k], dtype=tf.float32)
# build the mask (4 is the number of columns in source)
zero_mask = tf.scatter_nd(top_k_row_col_indices, updates, [num_rows, 4])
with tf.Session() as sess:
    zeroed = source*zero_mask
    print(zeroed.eval())
This should print
[[0. 0. 0.5 0.6]
[0.8 0.7 0. 0. ]]
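The row/column pairing is easier to see in NumPy, where a column of row numbers broadcasts against the top-k column indices. This is an analogue for illustration only (np.argpartition stands in for tf.nn.top_k and fancy-index assignment for tf.scatter_nd):

```python
import numpy as np

source = np.array([[0.1, 0.2, 0.5, 0.6],
                   [0.8, 0.7, 0.2, 0.1]])
k = 2

# Column indices of the k largest entries in each row, shape (2, k).
cols = np.argpartition(source, -k, axis=1)[:, -k:]
# Row numbers as a column, shape (2, 1); broadcasts against cols.
rows = np.arange(source.shape[0])[:, None]

# Scatter ones at every (row, col) pair, then mask the source.
mask = np.zeros_like(source)
mask[rows, cols] = 1.0
result = source * mask

print(result)
```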

Just paste some lines of code :)
import tensorflow as tf
def attach_indice(tensor, top_k = None):
    flatty = tf.reshape(tensor, [-1])
    orig_shape = tf.shape(tensor)
    length = tf.shape(flatty)[0]
    if top_k is not None:
        orig_shape = orig_shape[:-1] # dim for top_k
        length //= top_k
    indice = tf.unravel_index(tf.range(length), orig_shape)
    indice = tf.transpose(indice)
    if indice.dtype != tensor.dtype:
        indice = tf.cast(indice, tensor.dtype)
    if top_k is not None:
        _dims = len(tensor.shape) - 1 # indice of indice
        shape = [1 for _ in range(_dims)]
        shape[-1] *= top_k
        indice = tf.reshape(tf.tile(indice, shape), [-1, _dims])
    return tf.concat([indice, flatty[:, None]], -1)
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'
# tf.enable_eager_execution()
from time import time
top_k = 3
shape = [50, 40, 100]
q = tf.random_uniform(shape)
# fast: 4.376221179962158 (GPU) / 2.483684778213501 (CPU)
v, k = tf.nn.top_k(q, top_k)
k = attach_indice(k, top_k)
s = tf.scatter_nd(k, tf.reshape(v, [-1]), shape)
# very slow: 281.82796931266785 (GPU) / 35.163344860076904 (CPU)
# s = tf.map_fn(lambda v__k__: tf.map_fn(lambda v_k_: tf.scatter_nd(v_k_[1][:, None], v_k_[0], [shape[-1]]), v__k__, q.dtype), tf.nn.top_k(q, top_k), q.dtype)
start = time()
with tf.Session() as sess:
    for _ in range(1000):
        sess.run(s)
    print('time', time() - start)

Python mask matrix for selecting a list of vertices

I have a numpy matrix of booleans, whose shape is (N,N), e.g.:
[[True False False True]
[...]
[True True True False]]
and a numpy array of vertices, whose shape is (N,3), e.g:
[[0.1, 0.2, 0.3]
[0.4, 0.5, 0.6]
[0.7, 0.8, 0.9]
[1.0, 1.1, 1.2]]
I would like to compute a matrix, with shape (N, varying), in which each row is a list of vertices selected with each line of the boolean matrix.
From the examples above:
[[[0.1, 0.2, 0.3], [1.0, 1.1, 1.2]]
[...]
[[0.1, 0.2, 0.3],[0.4, 0.5, 0.6],[0.7, 0.8, 0.9]]]
Is it possible ?
Thanks in advance

Here's one approach after extracting rows, columns from the mask -
r,c = np.where(mask)
start = np.r_[0,np.flatnonzero(r[1:] != r[:-1])+1]
stop = np.r_[start[1:], r.size]
data_rep = data[c]
out = [data_rep[start[i]:stop[i]] for i in range(len(start))]
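For reference, here is the snippet above wrapped in a function and checked against a plain per-row loop. The function name extract_rows_using_mask and the two middle rows of the mask are assumptions made for this illustration (the question elides those rows):

```python
import numpy as np

def extract_rows_using_mask(data, mask):
    """Group the rows of data selected by each row of a boolean mask."""
    r, c = np.where(mask)  # row and column of every True entry, row-major
    start = np.r_[0, np.flatnonzero(r[1:] != r[:-1]) + 1]  # where each group starts
    stop = np.r_[start[1:], r.size]                        # where each group ends
    data_rep = data[c]
    return [data_rep[start[i]:stop[i]] for i in range(len(start))]

# Hypothetical mask: only the first and last rows come from the question.
mask = np.array([[True, False, False, True],
                 [False, True, False, False],
                 [False, False, True, True],
                 [True, True, True, False]])
data = np.array([[0.1, 0.2, 0.3],
                 [0.4, 0.5, 0.6],
                 [0.7, 0.8, 0.9],
                 [1.0, 1.1, 1.2]])

groups = extract_rows_using_mask(data, mask)
loop_groups = [data[mask[i]] for i in range(len(data))]
print(groups[0])
```

One difference to keep in mind: a mask row with no True entries produces no group in this version, while the loop yields an empty array for that row.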

Thanks Divakar !!
I tried your solution and it works fine.
However, I also tried a solution with a loop:
result = []
for i in range(len(data)):
    result.append(data[mask[i]])
and it's faster than doing:
result = extract_rows_using_mask(data, mask)
Weird isn't it ?
