Creating a tensor of ordered integers of shape (None, 1)

Given an input batch of size (None, 1), is it possible to create a tensor of ordered integers that is the same shape?
ex:
input = [3, 2, 3, 7], output = [0, 1, 2, 3]
ex:
input = [9, 3, 12, 4, 34 .....], output = [0, 1, 2, 3, ....]

tf.range() does what you need; you just provide a size derived from the size of your input tensor. Since other answers already cover that, I will show another approach:
tf.cumsum() over a vector of ones:
import tensorflow as tf

x = tf.placeholder(tf.int32, shape=(None,))
y = tf.cumsum(tf.ones_like(x)) - 1  # ones -> [1, 2, 3, ...] -> [0, 1, 2, ...]
with tf.Session() as sess:
    print(sess.run(y, {x: [4, 3, 2, 6, 3]}))  # [0 1 2 3 4]

You could try this:
import numpy as np
import tensorflow as tf

x = tf.placeholder(tf.float32, shape=(None, 1))
op = tf.range(tf.size(x))[:, tf.newaxis]  # shape (None, 1)

# test with different sizes
with tf.Session() as sess:
    print(sess.run(op, {x: np.expand_dims(range(10), axis=-1)}))
    print(sess.run(op, {x: np.expand_dims(range(3), axis=-1)}))
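For reference, in TensorFlow 2.x, where eager execution is the default, the same idea needs no placeholders or sessions. A minimal sketch:

import tensorflow as tf

x = tf.constant([3, 2, 3, 7])
y = tf.range(tf.size(x))      # [0, 1, 2, 3]
y_2d = y[:, tf.newaxis]       # shape (4, 1), matching a (None, 1) input
print(y_2d.numpy())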

Related

Splitting a 4-dimensional tensor into odd and even lines

I want to split the odd and even rows of a 4-dimensional input tensor into separate variables.
e.g.:
tensor = [[[-1.8391453 ],
           [ 1.9224693 ]],
          [[-0.7931502 ],
           [-0.16963768]]]
tensorodd = [[[-1.8391453], [-0.7931502]]]
tensoreven = [[[ 1.9224693], [-0.16963768]]]
I couldn't extend this to four dimensions, and I'm not sure what I wrote is correct.
This is not exactly what I want: I want rows 1, 3, 5 in one variable and rows 0, 2, 4, 6 in another. What I actually want to do is apply the MAE formula to the tensor, so I want to separate the rows, take them as y1 and y2, and apply the formula.
I'm not sure I've understood what you want to get.
I'm assuming that you have a tensor of shape (3, 128, 128, 3) and you want to take:
The even rows of that tensor, leaving you with a tensor of this shape (2, 128, 128, 3)
The odd rows of that tensor, leaving you with a tensor of this shape (1, 128, 128, 3)
Then you could work with indices:
import tensorflow as tf
import numpy as np
X = tf.convert_to_tensor(np.ones((3, 128, 128, 3)))
# creating a list of indices for 0 axis, in this case [0, 1, 2]
indices = tf.range(start=0, limit=tf.shape(X)[0], dtype=tf.int32)
# separating the even and odd indices (iterating over a tensor like this requires eager execution, i.e. TF 2.x)
even_indices = [i for i in indices if i % 2 == 0]
odd_indices = [i for i in indices if i % 2 != 0]
even_X = tf.gather(X, even_indices)
odd_X = tf.gather(X, odd_indices)
print('Even tensors', even_X.shape) # prints (2, 128, 128, 3)
print('Odd tensors', odd_X.shape) # prints (1, 128, 128, 3)
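A simpler alternative, not part of the original answer, is strided slicing along axis 0; unlike the list comprehension, it also works on symbolic (non-eager) tensors:

even_X = X[::2]   # rows 0, 2, 4, ... -> shape (2, 128, 128, 3)
odd_X = X[1::2]   # rows 1, 3, 5, ... -> shape (1, 128, 128, 3)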
Updated answer after new info:
# your input tensor has shape (3, 2, 3)
tensor = tf.constant([[[0, 0, 0],
                       [1, 0, 0]],
                      [[2, 0, 0],
                       [3, 1, 1]],
                      [[4, 1, 1],
                       [5, 1, 1]]])
even_tensor = [x[0] for x in tensor]
# even_tensor = [<tf.Tensor: [0, 0, 0]>, <tf.Tensor: [2, 0, 0]>, <tf.Tensor: [4, 1, 1]>]
odd_tensor = [x[1] for x in tensor]
# odd_tensor = [<tf.Tensor: [1, 0, 0]>, <tf.Tensor: [3, 1, 1]>, <tf.Tensor: [5, 1, 1]>]
mae = tf.keras.losses.MeanAbsoluteError()
result = mae(even_tensor, odd_tensor).numpy()
Instead of working with tensors you can convert to lists, e.g.:
odd_tensor = [list(x[1].numpy()) for x in tensor]
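The same result can also be obtained without Python lists by slicing along axis 1. A sketch using the (3, 2, 3) tensor above, cast to float since the loss averages the values:

t = tf.cast(tensor, tf.float32)
result = mae(t[:, 0], t[:, 1]).numpy()  # rows x[0] vs. rows x[1]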

How to mask padded 0s in pytorch for RNN model training?

I have some time series data padded with 0s, in the shape (batch, length, features). In more detail: I extracted MFCCs from audio files, giving (60, 40) per file, i.e. 60 frames with 40 MFCCs each.
In TensorFlow I used to apply the Masking layer with the value I wished to mask.
I am trying to do the same thing in PyTorch. I have done some research and found people mentioning pack_padded_sequence from torch.nn.utils.rnn.
However, it appears that I have to prepare a list containing the valid length of each input.
Even if I create such a list of lengths, can it deal with the random state in train_test_split?
Updated on 27 July 2022:
It appears that pack_padded_sequence is the only way to do masking for a PyTorch RNN.
I have rewritten the dataset preparation code and created a list containing all the 2D array data. The list has length 12746, and each 2D array inside has shape (x, 40), where x can be any number up to 60. So I am going to prepare training data of shape (12746, 60, 40).
How should I proceed, given that a packed sequence cannot be created as a PyTorch dataset?
class mydata(Dataset):
    def __init__(self, X, y):
        self.X = torch.FloatTensor(X)
        self.y = torch.FloatTensor(y)

    def __len__(self):
        return len(self.X)

    def __getitem__(self, index):
        y = self.y[index]
        X = self.X[index]
        return X, y

padded = pad_sequence(data, batch_first=True, padding_value=0.0)
lengths = torch.tensor([len(t) for t in data])
packed = torch.nn.utils.rnn.pack_padded_sequence(padded, lengths.to('cpu'),
                                                 batch_first=True, enforce_sorted=False)
# split into a 0.7 proportion; this was done with train_test_split
X_train = packed[0:8900]
y_train = y[:8900]
X_valid = packed[8900:]
y_valid = y[8900:]
train_dataset = mytools.mydata(X_train, y_train)
valid_dataset = mytools.mydata(X_valid, y_valid)
trainloader = DataLoader(train_dataset, batch_size=256, shuffle=True, num_workers=0)
validloader = DataLoader(valid_dataset, batch_size=256, shuffle=False, num_workers=0)
You can do this with the utilities in torch.nn.utils.rnn: pad_sequence, pack_padded_sequence, and pad_packed_sequence. Here is an example.
Note that batch_first is False by default. First, the batch_first=False case:
import torch
from torch.nn.utils.rnn import pad_sequence

a = torch.tensor([1, 2, 3])
b = torch.tensor([0, 1])
c = torch.tensor([0, 5, 6, 8])
padded = pad_sequence([a, b, c], padding_value=0.0)
print('#padded', padded)
lengths = torch.tensor([len(t) for t in [a, b, c]])
packed = torch.nn.utils.rnn.pack_padded_sequence(padded, lengths.to('cpu'), enforce_sorted=False)
print('#packed', packed)
output, lengths = torch.nn.utils.rnn.pad_packed_sequence(packed)
print('output', output, '\n lengths', lengths)
Output:
#padded tensor([[1, 0, 0],
[2, 1, 5],
[3, 0, 6],
[0, 0, 8]])
#packed PackedSequence(data=tensor([0, 1, 0, 5, 2, 1, 6, 3, 8]), batch_sizes=tensor([3, 3, 2, 1]), sorted_indices=tensor([2, 0, 1]), unsorted_indices=tensor([1, 2, 0]))
output tensor([[1, 0, 0],
[2, 1, 5],
[3, 0, 6],
[0, 0, 8]])
lengths tensor([3, 2, 4])
and here is for batch_first=True:
padded = pad_sequence([a, b, c], batch_first=True, padding_value=0.0)
print('#padded', padded)
lengths = torch.tensor([len(t) for t in [a, b, c]])
packed = torch.nn.utils.rnn.pack_padded_sequence(padded, lengths.to('cpu'), batch_first=True, enforce_sorted=False)
print('#packed', packed)
output, lengths = torch.nn.utils.rnn.pad_packed_sequence(packed, batch_first=True)
print('output', output, '\n lengths', lengths)
Output:
#padded tensor([[1, 2, 3, 0],
[0, 1, 0, 0],
[0, 5, 6, 8]])
#packed PackedSequence(data=tensor([0, 1, 0, 5, 2, 1, 6, 3, 8]), batch_sizes=tensor([3, 3, 2, 1]), sorted_indices=tensor([2, 0, 1]), unsorted_indices=tensor([1, 2, 0]))
output tensor([[1, 2, 3, 0],
[0, 1, 0, 0],
[0, 5, 6, 8]])
lengths tensor([3, 2, 4])
Then you can work with the packed values, e.g. packed_output, (hidden, cell) = your_lstm(packed). You can use hidden as usual; packed_output is still packed, but you can convert it back to padded form with torch.nn.utils.rnn.pad_packed_sequence. Remember that whatever you choose for batch_first, you must set the same value in your LSTM.
If you have problems with splitting, you can do it manually as follows:
import numpy as np

seq = [a, b, c]
lengths = [len(t) for t in seq]
idx = np.arange(len(seq))
np.random.shuffle(idx)
seq_shuffled = [seq[i] for i in idx]
lengths_shuffled = [lengths[i] for i in idx]
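Regarding the Dataset question: a common pattern, not from the original answer, is to keep the raw variable-length sequences in the Dataset and do the padding and packing per batch in a collate_fn. A minimal sketch, where data (the list of (x, 40) arrays) and labels (the matching targets) are placeholder names:

import torch
from torch.utils.data import Dataset, DataLoader
from torch.nn.utils.rnn import pad_sequence, pack_padded_sequence

class VarLenDataset(Dataset):
    # stores the raw variable-length sequences; padding happens per batch
    def __init__(self, X, y):
        self.X = [torch.as_tensor(x, dtype=torch.float32) for x in X]
        self.y = torch.as_tensor(y, dtype=torch.float32)

    def __len__(self):
        return len(self.X)

    def __getitem__(self, index):
        return self.X[index], self.y[index]

def collate(batch):
    xs, ys = zip(*batch)
    lengths = torch.tensor([len(x) for x in xs])  # CPU tensor, as pack_padded_sequence expects
    padded = pad_sequence(xs, batch_first=True, padding_value=0.0)
    packed = pack_padded_sequence(padded, lengths, batch_first=True, enforce_sorted=False)
    return packed, torch.stack(ys)

# `data` and `labels` are the hypothetical raw lists described in the question
loader = DataLoader(VarLenDataset(data, labels), batch_size=256,
                    shuffle=True, collate_fn=collate)

Because the split happens on the raw lists before anything is packed, train_test_split (and its random state) can be applied to data and labels as usual before building the datasets.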

Why does Conv2D seem to only perform in 1D?

I am trying to understand how TensorFlow works.
I have this code that tries to perform a 2D convolution on an input image of size (5, 5), but it seems to be performing the convolution in 1D instead of 2D.
Here is the code:
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

iX = iY = 5
kX = kY = 3
image = np.ones((iY, iX)).reshape((iY, iX, 1))
kernel = np.ones((kY, kX)).reshape((kY, kX, 1, 1))
bias = np.array([0.0])
i = layers.Input(shape=(iY, iX, 1))
x = layers.Conv2D(1, (kY, kX), strides=1, padding="same", activation='linear',
                  weights=[kernel, bias])(i)
model = keras.Model(i, x)
model(image).numpy()
Actual output:
[2, 3, 3, 3, 2]
[2, 3, 3, 3, 2]
[2, 3, 3, 3, 2]
[2, 3, 3, 3, 2]
[2, 3, 3, 3, 2]
Expected output:
[4, 6, 6, 6, 4]
[6, 9, 9, 9, 6]
[6, 9, 9, 9, 6]
[6, 9, 9, 9, 6]
[4, 6, 6, 6, 4]
What am I doing wrong?
You have specified padding='same'; that is why you get an output with the same spatial shape as the input image, 5x5x1x1, in the format height x width x channels x number of filters. The padding='same' setting adds padding values around the image so the output keeps the input's spatial shape; you can use padding='valid' to avoid padding.
You have also missed the batch dimension when providing the image to the model:
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

iX = iY = 9
kX = kY = 3
no_of_filter = 2
image = np.ones((iY, iX)).reshape((1, iY, iX, 1))  # note the leading batch dimension
kernel = np.ones((kY, kX, no_of_filter)).reshape((kY, kX, 1, no_of_filter))
bias = np.array([0.0 for _ in range(no_of_filter)])
i = layers.Input(shape=(iY, iX, 1))
x = layers.Conv2D(no_of_filter, (kY, kX), strides=1, padding="valid",
                  activation='linear', weights=[kernel, bias])(i)
model = keras.Model(i, x)
model.summary()
model(image).numpy().shape  # (1, 7, 7, 2)
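To tie this back to the original 5x5 example: keeping padding='same' and only adding the batch dimension already yields the expected 2D result. A minimal sketch of the corrected setup:

import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

iX = iY = 5
kX = kY = 3
image = np.ones((iY, iX)).reshape((1, iY, iX, 1))  # leading batch dimension added
kernel = np.ones((kY, kX)).reshape((kY, kX, 1, 1))
bias = np.array([0.0])
i = layers.Input(shape=(iY, iX, 1))
x = layers.Conv2D(1, (kY, kX), strides=1, padding="same",
                  activation="linear", weights=[kernel, bias])(i)
model = keras.Model(i, x)
print(model(image).numpy().squeeze())
# [[4. 6. 6. 6. 4.]
#  [6. 9. 9. 9. 6.]
#  [6. 9. 9. 9. 6.]
#  [6. 9. 9. 9. 6.]
#  [4. 6. 6. 6. 4.]]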

How to gather a tensor with unknown first (batch) dimension?

I have a tensor of shape (?, 3, 2, 5). I want to supply pairs of indices that select from the two dimensions of sizes 3 and 2 (the first and second dimensions after the batch dimension).
If I supply 4 such pairs, I would expect the resulting shape to be (?, 4, 5). I thought this is what batch_gather is for: to "broadcast" gathering indices over the first (batch) dimension. But this is not what it does:
import tensorflow as tf

data = tf.placeholder(tf.float32, (None, 3, 2, 5))
indices = tf.constant([
    [2, 1],
    [2, 0],
    [1, 1],
    [0, 1]
], tf.int32)
tf.batch_gather(data, indices)
Which results in <tf.Tensor 'Reshape_3:0' shape=(4, 2, 2, 5) dtype=float32> instead of the shape that I was expecting.
How can I do what I want without explicitly indexing the batches (which have an unknown size)?
I wanted to avoid transpose and Python loops, and I think this works. This was the setup:
import numpy as np
import tensorflow as tf
shape = None, 3, 2, 5
data = tf.placeholder(tf.int32, shape)
idxs_list = [
    [2, 1],
    [2, 0],
    [1, 1],
    [0, 1]
]
idxs = tf.constant(idxs_list, tf.int32)
This allows us to gather the results:
batch_size, num_idxs, num_channels = tf.shape(data)[0], tf.shape(idxs)[0], shape[-1]
batch_idxs = tf.math.floordiv(tf.range(0, batch_size * num_idxs), num_idxs)[:, None]
nd_idxs = tf.concat([batch_idxs, tf.tile(idxs, (batch_size, 1))], axis=1)
gathered = tf.reshape(tf.gather_nd(data, nd_idxs), (batch_size, num_idxs, num_channels))
When we run with a batch size of 4, we get a result with shape (4, 4, 5), which is (batch_size, num_idxs, num_channels).
vals_shape = 4, *shape[1:]
vals = np.arange(int(np.prod(vals_shape))).reshape(vals_shape)
with tf.Session() as sess:
    result = gathered.eval(feed_dict={data: vals})
Which ties out with numpy indexing:
x, y = zip(*idxs_list)
assert np.array_equal(result, vals[:, x, y])
Essentially, gather_nd wants batch indices in the first dimension, and those have to be repeated once for each index pair (i.e., [0, 0, 0, 0, 1, 1, 1, 1, 2, ...] if there are 4 index pairs).
Since there doesn't seem to be a tf.repeat in this version, I used range and floordiv, and then concatenated the batch indices with the desired (x, y) indices (which are themselves tiled batch_size times).
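As an aside, newer TensorFlow releases (2.x) do provide tf.repeat, which expresses the batch-index construction directly; a sketch under the same setup:

# repeats each batch index num_idxs times, i.e. [0, 0, 0, 0, 1, 1, 1, 1, ...]
batch_idxs = tf.repeat(tf.range(batch_size), num_idxs)[:, None]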
With tf.batch_gather, the leading dimensions of the data tensor's shape must match the leading dimensions of the indices tensor's shape.
import tensorflow as tf

data = tf.placeholder(tf.float32, (2, 3, 2, 5))
print(data.shape)  # (2, 3, 2, 5)
# shape of indices: (2, 3)
indices = tf.constant([
    [1, 1, 1],
    [0, 0, 1]
])
print(tf.batch_gather(data, indices).shape)  # (2, 3, 2, 5)
# if the shape of indices were (2, 3, 1), the output shape would be (2, 3, 1, 5)
What you want instead is tf.gather_nd, moving the two indexed dimensions to the front first:
data_transpose = tf.transpose(data, perm=[1, 2, 0, 3])  # shape (3, 2, ?, 5)
t_transpose = tf.gather_nd(data_transpose, indices)     # `indices` are the question's 4 pairs, shape (4, 2)
t = tf.transpose(t_transpose, perm=[1, 0, 2])
print(t.shape)  # (?, 4, 5)

How to count elements in tensorflow tensor?

I have a tensor for example : X = [1, 1, 0, 0, 1, 2, 2, 0, 1, 2].
And what I want is to reduce this tensor X to a tensor such as Y = [3, 4, 3], where position 0 of Y is the count of 0s in X, position 1 the count of 1s, and so on.
What I'm doing right now is iterating through this tensor using the tf.where function, but this doesn't seem elegant; there must be a better way to do it.
Thanks.
You are looking for tf.unique_with_counts.
import tensorflow as tf
X = tf.constant([1, 1, 0, 0, 1, 2, 2, 0, 1, 2])
op = tf.unique_with_counts(X)
sess = tf.InteractiveSession()
res = sess.run(op)
print(res.count)
# [4 3 3]  (counts are in first-appearance order; the unique values are res.y = [1, 0, 2])
Beware that tf.bincount only handles non-negative integers. If your input tensor is not of integer type, or contains negative values, you must use tf.unique_with_counts. Otherwise bincount is fine and to the point.
I think you are looking for Y = tf.bincount(X):
X = tf.constant([1, 1, 0, 0, 1, 2, 2, 0, 1, 2])
Y = tf.bincount(X)
sess = tf.InteractiveSession()
tf.global_variables_initializer().run()
Y.eval()
# output: [3, 4, 3]
For negative integers you can shift the values to be non-negative first (position i of the result then counts the value i + min):
tf.bincount(X + tf.abs(tf.reduce_min(X)))
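For reference, in TensorFlow 2.x the op lives at tf.math.bincount and runs eagerly. A minimal sketch:

import tensorflow as tf

X = tf.constant([1, 1, 0, 0, 1, 2, 2, 0, 1, 2])
print(tf.math.bincount(X).numpy())  # [3 4 3]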
