TensorFlow - dense vector to one-hot - python

Suppose I have the following tensor:
T = [[0.1, 0.3, 0.7],
[0.2, 0.5, 0.3],
[0.1, 0.1, 0.8]]
I want to transform this into a one-hot tensor, such that in each row (that is, along axis 1) the index with the maximum value gets set to 1 and all the others get set to zero, like this:
T_onehot = [[0, 0, 1],
[0, 1, 0],
[0, 0, 1]]
I know there's tf.argmax to get the indices of the largest elements in the tensor, but is there any method which allows me to do what I want to do in one step?

I don't know of a way to do this in a single step, but TensorFlow has a one_hot function that you can combine with tf.argmax:
import tensorflow as tf
T = tf.constant([[0.1, 0.3, 0.7], [0.2, 0.5, 0.3], [0.1, 0.1, 0.8]])
T_onehot = tf.one_hot(tf.argmax(T, 1), T.shape[1])
tf.InteractiveSession()
print(T_onehot.eval())
# [[ 0. 0. 1.]
# [ 0. 1. 0.]
# [ 0. 0. 1.]]
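As a quick sanity check of the argmax-then-one-hot idea outside a TensorFlow session, here is a minimal NumPy sketch. The names `winners` and `T_onehot` are only illustrative, and indexing an identity matrix is just one assumed way to build the one-hot rows.

```python
import numpy as np

T = np.array([[0.1, 0.3, 0.7],
              [0.2, 0.5, 0.3],
              [0.1, 0.1, 0.8]])

# Column index of the largest value in each row (axis=1 walks along a row)
winners = np.argmax(T, axis=1)

# Row k of the identity matrix is the one-hot vector for class k,
# so indexing it with `winners` yields one one-hot row per input row
T_onehot = np.eye(T.shape[1], dtype=int)[winners]
print(T_onehot)
```

Note that the reduction runs over axis 1, which matches the desired output in the question.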

Related

How to mask row in Tensorflow without for loop

I want to create a custom Layer for a Tensorflow model but the logic I have uses a for loop, which Tensorflow doesn't like. How can I modify my code to remove the for loop but still achieve the same result?
class CustomMask(tf.keras.layers.Layer):
    def call(self, inputs):
        mask = tf.where(inputs[:, 0] < 0.5, 1, 0)
        for i,m in enumerate(mask):
            if m:
                inputs = inputs[i, 1:].assign(tf.zeros(4, dtype=tf.float32))
            else:
                first = tf.where(inputs[:, 1] >= 0.5, 0, 1)
                assign = tf.multiply(tf.cast(first, tf.float32), inputs[:, 2])
                inputs = inputs[:, 2].assign(assign)
                third = tf.where(inputs[:, 1] >= 0.5, 1, 0)
                assign = tf.multiply(tf.cast(third, tf.float32), inputs[:, 1])
                inputs = inputs[:, 1].assign(assign)
        return inputs
Example input Tensor:
<tf.Variable 'Variable:0' shape=(3, 5) dtype=float32, numpy=
array([[0.8, 0.7, 0.2, 0.6, 0.9],
[0.8, 0.4, 0.8, 0.3, 0.7],
[0.3, 0.2, 0.4, 0.3, 0.8]], dtype=float32)>
Corresponding output:
<tf.Variable 'UnreadVariable' shape=(3, 5) dtype=float32, numpy=
array([[0.8, 0.7, 0. , 0.6, 0.9],
[0.8, 0. , 0.8, 0.3, 0.7],
[0.3, 0. , 0. , 0. , 0. ]], dtype=float32)>
EDIT:
The layer should take an array of shape (batch_size, 5). If the first value of a row is less than 0.5, set the rest of that row to 0. Otherwise, if the 2nd element is above 0.5, set the 3rd element to 0, and if the 3rd element is above 0.5, set the 2nd element to 0.
Here is a version without any for loop (tested in Colab). Ask in the comments if it doesn't solve your issue.
import tensorflow as tf

# Per-row masks: a 1.0 marks a position that a rule is allowed to zero out
mask1 = tf.convert_to_tensor([0.0,1.0,1.0,1.0,1.0])  # everything after the 1st element
mask2 = tf.convert_to_tensor([0.0,0.0,1.0,0.0,0.0])  # the 3rd element
mask3 = tf.convert_to_tensor([0.0,1.0,0.0,0.0,0.0])  # the 2nd element

def masking(x):
    # x is a single row of the batch
    mask = tf.ones(x.shape, tf.float32)
    cond1 = tf.cast(x[0] < 0.5, tf.float32)
    x = tf.multiply(x, tf.subtract(mask, tf.multiply(mask1, cond1)))
    cond2 = tf.cast(x[1] > 0.5, tf.float32)
    x = tf.multiply(x, tf.subtract(mask, tf.multiply(mask2, cond2)))
    cond3 = tf.cast(x[2] > 0.5, tf.float32)
    x = tf.multiply(x, tf.subtract(mask, tf.multiply(mask3, cond3)))
    return x

inputs = tf.convert_to_tensor([[0.8, 0.7, 0.2, 0.6, 0.9],
                               [0.8, 0.4, 0.8, 0.3, 0.7],
                               [0.3, 0.2, 0.4, 0.3, 0.8]])
res = tf.vectorized_map(masking, inputs)
print(res)
tf.Tensor(
[[0.8 0.7 0. 0.6 0.9]
[0.8 0. 0.8 0.3 0.7]
[0.3 0. 0. 0. 0. ]], shape=(3, 5), dtype=float32)
I tested it with
%timeit tf.map_fn(masking, inputs)
%timeit tf.vectorized_map(masking, inputs)
and tf.vectorized_map(masking, inputs) gets faster relative to tf.map_fn as the batch size increases.
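The three rules can also be applied to the whole batch at once with boolean row masks. The sketch below uses NumPy rather than TensorFlow so that it runs anywhere; `mask_rows` is a hypothetical helper name, and it applies the rules in the same order as the masking function above.

```python
import numpy as np

def mask_rows(x):
    """Apply the three masking rules to every row of a (batch_size, 5) array."""
    out = np.array(x, dtype=np.float32)  # work on a copy, leave the input untouched

    # Rule 1: first element below 0.5 -> zero the rest of the row
    low_first = out[:, 0] < 0.5
    out[low_first, 1:] = 0.0

    # Rule 2: 2nd element above 0.5 -> zero the 3rd element
    second_high = out[:, 1] > 0.5
    out[second_high, 2] = 0.0

    # Rule 3: 3rd element above 0.5 -> zero the 2nd element
    third_high = out[:, 2] > 0.5
    out[third_high, 1] = 0.0
    return out

inputs = [[0.8, 0.7, 0.2, 0.6, 0.9],
          [0.8, 0.4, 0.8, 0.3, 0.7],
          [0.3, 0.2, 0.4, 0.3, 0.8]]
print(mask_rows(inputs))
```

Rows zeroed by rule 1 cannot trigger rules 2 or 3 afterwards, because their 2nd and 3rd elements are already 0.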

Extract arrays based on positions indicated in another array

I have the data below as an example:
import numpy as np
data=[np.array([[0.9,0.6,0.5,0.4,0.7],[0.8,0.0,0.0,0.8,0.2],
[0.9,0.0,0.4,0.4,0.3],[0.9,0.6,0.3,0.2,0.5],[0.8,0.0,0.3,0.1,0.5]]),
np.array([[0.9,0.0,0.2,0.4,0.3],[0.0,0.2,0.4,0.0,0.0],
[0.0,0.0,0.0,0.2,0.0],[0.5,0.0,0.3,0.6,0.8],[0.5,0.6,0.9,0.0,0.0]])]
and I want to extract the relevant data based on these positions below:
positions_non_zero=[np.array([2,3,4]),np.array([1,4])]
the desired output should be this:
[array([[0.9, 0. , 0.4, 0.4, 0.3],
[0.9, 0.6, 0.3, 0.2, 0.5],
[0.8, 0. , 0.3, 0.1, 0.5]]),
array([[0. , 0.2, 0.4, 0. , 0. ],
[0.5, 0.6, 0.9, 0. , 0. ]])]
The reason is this:
The problem with my code is that only np.array([1,4]) is taken into consideration.
My code:
df_class11=[]
for n in data:
    def data_target(df_class_target):
        for z in df_class_target:
            x_classA=[n[i] for i in z]
            x_classA=np.vstack(x_classA)
        return x_classA
    df_class11.append(data_target(positions_non_zero))
df_class11
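In the code above the inner loop runs over every entry of positions_non_zero for each array, so only the last entry, np.array([1,4]), survives. One possible fix, sketched here on the assumption that the i-th positions array belongs to the i-th data array, is to pair them with zip and let NumPy integer-array indexing pick the rows. The name `result` is only illustrative.

```python
import numpy as np

data = [np.array([[0.9, 0.6, 0.5, 0.4, 0.7], [0.8, 0.0, 0.0, 0.8, 0.2],
                  [0.9, 0.0, 0.4, 0.4, 0.3], [0.9, 0.6, 0.3, 0.2, 0.5],
                  [0.8, 0.0, 0.3, 0.1, 0.5]]),
        np.array([[0.9, 0.0, 0.2, 0.4, 0.3], [0.0, 0.2, 0.4, 0.0, 0.0],
                  [0.0, 0.0, 0.0, 0.2, 0.0], [0.5, 0.0, 0.3, 0.6, 0.8],
                  [0.5, 0.6, 0.9, 0.0, 0.0]])]
positions_non_zero = [np.array([2, 3, 4]), np.array([1, 4])]

# Pair each array with its own positions; arr[pos] selects those rows
result = [arr[pos] for arr, pos in zip(data, positions_non_zero)]
print(result)
```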

Index a torch tensor with an array

I have the following torch tensor:
tensor([[-0.2, 0.3],
[-0.5, 0.1],
[-0.4, 0.2]])
and the following numpy array: (I can convert it to something else if necessary)
[1 0 1]
I want to get the following tensor:
tensor([0.3, -0.5, 0.2])
i.e. I want the numpy array to index each sub-element of my tensor. Preferably without using a loop.
Thanks in advance
You may want to use torch.gather - "Gathers values along an axis specified by dim."
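import numpy as np
import torch

t = torch.tensor([[-0.2, 0.3],
                  [-0.5, 0.1],
                  [-0.4, 0.2]])
idxs = np.array([1,0,1])
# gather needs a Long index tensor with the same number of dimensions as t
idxs = torch.from_numpy(idxs).long().unsqueeze(1)
# or torch.from_numpy(idxs).long().view(-1,1)
t.gather(1, idxs)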
tensor([[ 0.3000],
[-0.5000],
[ 0.2000]])
Here, your index is a NumPy array, so you have to convert it to a LongTensor.
Put simply, index the first dimension with range(len(c)) and the second with your index list c.
import torch
a = torch.tensor([[-0.2, 0.3],
                  [-0.5, 0.1],
                  [-0.4, 0.2]])
c = [1, 0, 1]
# one row index per entry of c, paired element-wise with the column index
b = a[range(len(c)), c]
print(b)
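Since the index starts out as a NumPy array, it may help to see both approaches from the answers above side by side in plain NumPy. This is only a sketch of the same two ideas, not the PyTorch API: `np.take_along_axis` stands in for gather, and the variable names are illustrative.

```python
import numpy as np

t = np.array([[-0.2, 0.3],
              [-0.5, 0.1],
              [-0.4, 0.2]])
idx = np.array([1, 0, 1])

# Paired row/column indexing: row i is read at column idx[i]
picked_fancy = t[np.arange(len(idx)), idx]

# Gather-style: the index needs a trailing axis, and the result keeps it
picked_gather = np.take_along_axis(t, idx[:, None], axis=1).squeeze(1)

print(picked_fancy)
print(picked_gather)
```

Both results are 1-D, which is the shape asked for in the question. The gather-style result needs the final squeeze to drop its trailing axis.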

Tensorflow sparse tensor with vector value to dense tensor

I have some sparse indices:
[[0 0]
[0 1]
[1 0]
[1 1]
[1 2]
[2 0]]
The corresponding value of each index is:
[[0.1 0.2 0.3]
[0.4 0.5 0.6]
[0.7 0.8 0.9]
[1.0 1.1 1.2]
[1.3 1.4 1.5]
[1.6 1.7 1.8]]
How can I convert the 6x3 value tensor to a 3x3x3 dense tensor in TensorFlow? The value for any index not listed in indices is the zero vector [0. 0. 0.]. The dense tensor should look like this:
[[[0.1 0.2 0.3]
[0.4 0.5 0.6]
[0.0 0.0 0.0]]
[[0.7 0.8 0.9]
[1.0 1.1 1.2]
[1.3 1.4 1.5]]
[[1.6 1.7 1.8]
[0.0 0.0 0.0]
[0.0 0.0 0.0]]]
You can do that with tf.scatter_nd:
import tensorflow as tf
with tf.Graph().as_default(), tf.Session() as sess:
    indices = tf.constant(
        [[0, 0],
         [0, 1],
         [1, 0],
         [1, 1],
         [1, 2],
         [2, 0]])
    values = tf.constant(
        [[0.1, 0.2, 0.3],
         [0.4, 0.5, 0.6],
         [0.7, 0.8, 0.9],
         [1.0, 1.1, 1.2],
         [1.3, 1.4, 1.5],
         [1.6, 1.7, 1.8]])
    out = tf.scatter_nd(indices, values, [3, 3, 3])
    print(sess.run(out))
Output:
[[[0.1 0.2 0.3]
[0.4 0.5 0.6]
[0. 0. 0. ]]
[[0.7 0.8 0.9]
[1. 1.1 1.2]
[1.3 1.4 1.5]]
[[1.6 1.7 1.8]
[0. 0. 0. ]
[0. 0. 0. ]]]
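To see what the scatter does without running a session, the same placement can be sketched in NumPy: start from zeros and assign each value vector at its (row, column) position. This is an illustration of the idea rather than a drop-in replacement. In particular, tf.scatter_nd sums values at duplicate indices, while the NumPy assignment below would keep only the last one.

```python
import numpy as np

indices = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [1, 2], [2, 0]])
values = np.array([[0.1, 0.2, 0.3],
                   [0.4, 0.5, 0.6],
                   [0.7, 0.8, 0.9],
                   [1.0, 1.1, 1.2],
                   [1.3, 1.4, 1.5],
                   [1.6, 1.7, 1.8]])

dense = np.zeros((3, 3, 3))
# Each (row, col) pair selects a length-3 slot; unlisted slots stay zero
dense[indices[:, 0], indices[:, 1]] = values
print(dense)
```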
There is no definite way to do it in Tensorflow using any reshape kind of function. I could only think about an iterative solution by creating a list and converting it back to a Tensor. This is perhaps not the most efficient solution, but this might work for your code.
import numpy as np
import tensorflow as tf

# list of indices
idx=[[0,0],[0,1], [1,0],[1,1], [1,2], [2,0]]
# Original Tensor to reshape
dense_tensor=tf.Variable([[0.1, 0.2 ,0.3],[0.4, 0.5, 0.6], [0.7, 0.8, 0.9], [1.0,1.1,1.2],[1.3,1.4,1.5], [1.6,1.7,1.8]])
# creating a temporary list to later convert to Tensor
c=np.zeros([3,3,3]).tolist()
# running position in dense_tensor; it must not be reset for each row,
# and it assumes idx is listed in row-major order
count=0
for i in range(3):
    for j in range(3):
        if([i,j] in idx):
            c[i][j]=dense_tensor[count]
            count=count+1
        else:
            c[i][j]=tf.Variable([0,0,0], dtype=tf.float32)
# Convert obtained list to Tensor
converted_tensor = tf.convert_to_tensor(c, dtype=tf.float32)
You can define the ranges depending upon the size of Tensor you want. For your case, I have chosen 3 as you wanted a 3x3x3 Tensor. I hope this helps!

Numpy array of distances to list of (row,col,distance)

I have an nd array that looks as follows:
[[ 0. 1.73205081 6.40312424 7.21110255 2.44948974]
[ 1.73205081 0. 5.09901951 5.91607978 1. ]
[ 6.40312424 5.09901951 0. 1. 4.35889894]
[ 7.21110255 5.91607978 1. 0. 5.09901951]
[ 2.44948974 1. 4.35889894 5.09901951 0. ]]
Each element in this array is a distance and I need to turn this into a list with the row,col,distance as follows:
l = [(0,0,0),(0,1, 1.73205081),(0,2, 6.40312424),...,(1,0, 1.73205081),(1,1,0),...,(4,4,0)]
Additionally, it would be cool to remove the diagonal elements and also the elements (j,i) as (i,j) are already there. Essentially, is it possible to take just the top triangular matrix of this?
Is this possible to do efficiently (without a lot of loops)? I had created this array with squareform, but couldn't find any docs to do this.
squareform does all this. Read the docs and experiment. It works in both directions. If you give it a matrix it returns the upper triangle values (condensed form). If you give it those values, it returns the matrix.
In [668]: M
Out[668]:
array([[ 0. , 0.1, 0.5, 0.2],
[ 0.1, 0. , 2. , 0.3],
[ 0.5, 2. , 0. , 0.2],
[ 0.2, 0.3, 0.2, 0. ]])
In [669]: spatial.distance.squareform(M)
Out[669]: array([ 0.1, 0.5, 0.2, 2. , 0.3, 0.2])
In [670]: v=spatial.distance.squareform(M)
In [671]: v
Out[671]: array([ 0.1, 0.5, 0.2, 2. , 0.3, 0.2])
In [672]: spatial.distance.squareform(v)
Out[672]:
array([[ 0. , 0.1, 0.5, 0.2],
[ 0.1, 0. , 2. , 0.3],
[ 0.5, 2. , 0. , 0.2],
[ 0.2, 0.3, 0.2, 0. ]])
You can also specify a force and checks parameter, but without those it just goes by the shape.
The indices can come from np.triu_indices:
In [677]: np.triu_indices(4,1)
Out[677]:
(array([0, 0, 0, 1, 1, 2], dtype=int32),
array([1, 2, 3, 2, 3, 3], dtype=int32))
In [680]: np.vstack((np.triu_indices(4,1),v)).T
Out[680]:
array([[ 0. , 1. , 0.1],
[ 0. , 2. , 0.5],
[ 0. , 3. , 0.2],
[ 1. , 2. , 2. ],
[ 1. , 3. , 0.3],
[ 2. , 3. , 0.2]])
Just to check, we can fill in a 4x4 matrix with these values
In [686]: A=np.vstack((np.triu_indices(4,1),v)).T
In [687]: MM = np.zeros((4,4))
In [688]: MM[A[:,0].astype(int),A[:,1].astype(int)]=A[:,2]
In [689]: MM
Out[689]:
array([[ 0. , 0.1, 0.5, 0.2],
[ 0. , 0. , 2. , 0.3],
[ 0. , 0. , 0. , 0.2],
[ 0. , 0. , 0. , 0. ]])
Those triu indices can also fetch the values from M:
In [693]: I,J = np.triu_indices(4,1)
In [694]: M[I,J]
Out[694]: array([ 0.1, 0.5, 0.2, 2. , 0.3, 0.2])
squareform uses compiled code in spatial.distance._distance_wrap, so I expect it will be quite fast for large arrays. The only problem is that it returns just the condensed-form values, not the indices. But given the shape, the indices can always be calculated. They don't need to be stored with the values.
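Putting the pieces above together, the (row, col, distance) list for the upper triangle can be built by zipping the triu indices with the values they select. A minimal sketch using the small 4x4 matrix M from this answer (the name `triples` is illustrative):

```python
import numpy as np

M = np.array([[0.0, 0.1, 0.5, 0.2],
              [0.1, 0.0, 2.0, 0.3],
              [0.5, 2.0, 0.0, 0.2],
              [0.2, 0.3, 0.2, 0.0]])

# k=1 skips the diagonal, so only pairs with row < col are produced
I, J = np.triu_indices(M.shape[0], k=1)

# tolist() turns NumPy scalars into plain Python ints and floats
triples = list(zip(I.tolist(), J.tolist(), M[I, J].tolist()))
print(triples)
```

Unlike the vstack version, this keeps the row and column as integers instead of promoting everything to float.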
If your input is x, first generate the indices:
i0,i1 = np.indices(x.shape)
Then:
np.concatenate((i1,i0,x)).reshape(3,5,5).T
That gives you the first result--for the entire matrix.
As for taking only the upper triangle, you might consider trying np.triu(), but I'm not sure exactly what result you're looking for. You can probably figure out how to mask the parts you don't want now, though.
you can try this,
print([(x,y, value) for (x,y), value in np.ndenumerate(numpymatrixarray)])
output [(0, 0, 0.0), (0, 1, 1.7320508100000001), (0, 2, 6.4031242400000004), (0, 3, 7.2111025499999997), (0, 4, 2.4494897400000002), (1, 0, 1.7320508100000001), (1, 1, 0.0), (1, 2, 5.0990195099999998), (1, 3, 5.9160797799999996), (1, 4, 1.0), (2, 0, 6.4031242400000004), (2, 1, 5.0990195099999998), (2, 2, 0.0), (2, 3, 1.0), (2, 4, 4.3588989400000004), (3, 0, 7.2111025499999997), (3, 1, 5.9160797799999996), (3, 2, 1.0), (3, 3, 0.0), (3, 4, 5.0990195099999998), (4, 0, 2.4494897400000002), (4, 1, 1.0), (4, 2, 4.3588989400000004), (4, 3, 5.0990195099999998), (4, 4, 0.0)]
Do you really want the top triangular matrix for an [nxm] matrix where n>m? That will give you (nxn-n)/2 elements and lose all the data where m⊖n.
What you probably want is the lower triangular matrix:
def tri_reduce(m):
    n=m.shape
    if n[0]>n[1]:
        i=np.tril_indices(n[0],1,n[1])
    else:
        i=np.triu_indices(n[0],1,n[1])
    return np.vstack((i,m[i])).T
Rebuilding it into a list of tuples would require a loop though I believe. list(tri_reduce(m)) would give a list of nd arrays.
