Sample rows from tensor in every 3 rows in Python


How can I sample a tensor in Python?
I want to take one frame out of every 3 frames of a video. The tensor shape is [color, frames, height, width].
So the sampled tensor shape will be [color, frames / 3, height, width].
For example, assume a tensor of size [3, 300, 10, 10].
After sampling every 3rd row along the second dimension, the tensor will have size [3, 100, 10, 10].
Another example:
tensor = [[1,2,3,4,5,6,7,8,9,10],[1,2,3,4,5,6,7,8,9,10],[1,2,3,4,5,6,7,8,9,10]].
After sampling every 3rd row along the second dimension, the tensor will be [[1,4,7,10],[1,4,7,10],[1,4,7,10]].

Let N be the size of the dimension you want to sample, and say you want every kth row.
Assuming you want to sample along dimension 1 (the second dimension) of a 4-dimensional tensor, you can do:
new_tensor = tensor[:, torch.arange(0, N, k), :, :]
You may skip slicing the last two dimensions and the result won't change:
new_tensor = tensor[:, torch.arange(0, N, k)]
More specifically, for the 2D tensor in the question, you can use this code:
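import torch
tensor = torch.tensor([
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
])
# Pick columns 0, 3, 6, 9, so each row becomes [1, 4, 7, 10]
new_tensor = tensor[:, torch.arange(0, 10, 3)]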
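As an alternative sketch, a plain strided slice should select the same entries without building an index tensor. The helper name `sample_every_k` is hypothetical, and NumPy arrays are used here only to keep the example light; the same basic slice syntax is expected to work on a PyTorch tensor as well.

```python
import numpy as np

def sample_every_k(x, k):
    """Keep every k-th entry along dimension 1 (hypothetical helper)."""
    return x[:, ::k]

# Stand-in for a video tensor: [color, frames, height, width]
video = np.zeros((3, 300, 10, 10))
sampled = sample_every_k(video, 3)
print(sampled.shape)  # (3, 100, 10, 10)

# The 2D example from the question: three rows of 1..10
rows = np.tile(np.arange(1, 11), (3, 1))
print(sample_every_k(rows, 3))  # each row keeps the values 1, 4, 7, 10

# The strided slice matches indexing with an explicit arange of positions
same = np.array_equal(sampled, video[:, np.arange(0, 300, 3)])
```

If the dimension size is not a multiple of k, the slice still starts at index 0, so the result has ceil(N / k) entries along that dimension (4 entries for N = 10, k = 3).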

Related

Exploding tensor after using Dataset and .batch

I have a numpy array of shape (100,4,30). This represents 100 samples of 4 samples of encodings of length 30. The 4 samples, per row, are related.
I want to get a TensorFlow dataset, batched, where related samples are in the same batch.
I'm trying to do:
first, use np.vsplit to get a list of length 100, where each element in the list is a list of the 4 related samples.
Now if I call tf.data.Dataset.from_tensor_slices(...).batch(1) on this list of lists, I get a batch that contains a tensor of shape (4,1,30).
I want this batch to contain 4 tensors of shape (1,30).
How can I achieve this?

I may have misunderstood you, but if you just leave out the "vsplit":
import numpy as np
import tensorflow as tf
data = np.zeros((100, 4, 30))
data_ds = tf.data.Dataset.from_tensor_slices(data).batch(1)
for element in data_ds.take(1):
    print(element.shape)
you will get:
(1, 4, 30)
(so one batch contains all 4 related encodings).
If you really want the dimensions inside a batch to be 4 times (1, 30) you can do:
data = np.expand_dims(data, axis=2)
before dataset creation.
EDIT:
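I think I just understood your question. You want every batch to have 4 elements, and those should be the related encodings? Reshaping in the default row-major order already keeps the 4 encodings of each sample in consecutive rows, so no axis swap is needed (swapping axes 0 and 1 first would instead group encodings from 4 different samples):
data = np.reshape(data, (100*4, -1))
data_ds = tf.data.Dataset.from_tensor_slices(data).batch(4)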
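To check which rows end up together in a batch of 4, the grouping can be reproduced with NumPy alone. This is only a sketch: the labelling scheme (each value equals 10 * sample index + encoding index) is made up for illustration, and taking the first 4 rows stands in for the first batch that `.batch(4)` would emit.

```python
import numpy as np

# Hypothetical labelled data: every value in data[i, j] equals 10 * i + j,
# so a label tells us which sample (i) and which encoding (j) a row came from.
i, j = np.meshgrid(np.arange(100), np.arange(4), indexing="ij")
data = np.repeat((10 * i + j)[:, :, np.newaxis], 30, axis=2)  # shape (100, 4, 30)

# Plain row-major reshape: row i*4 + j of `flat` is data[i, j]
flat = data.reshape(100 * 4, -1)
first_batch = flat[:4, 0]
print(first_batch)  # labels 0, 1, 2, 3: the 4 encodings of sample 0

# Swapping axes first: row j*100 + i of `swapped` is data[i, j]
swapped = np.swapaxes(data, 0, 1).reshape(100 * 4, -1)
first_batch_swapped = swapped[:4, 0]
print(first_batch_swapped)  # labels 0, 10, 20, 30: encoding 0 of samples 0..3
```

In this sketch, reshaping without the axis swap keeps the 4 related encodings of one sample together, while swapping axes first groups encoding 0 of four different samples.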

Select tensor slice along a dimension based on index

I have a PyTorch tensor of the following shape: (100, 5, 100). I need to convert it into a tensor of shape (100, 100) by selecting from each row only one item in the second dimension, meaning that of those 5 elements I only need one, with its corresponding 100 elements.
To do this operation I have a second tensor of shape (100,) with the indices that specify which of those 5 items should be selected in each row.
Is there a simple way to perform this selection without having to mess with the dimensions too much?

Suppose the tensor with the indices is called idx and has shape (100,), and the tensor with the values is called source. Then, to select:
result = source[torch.arange(100), idx]
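The same pairing of a row index with a per-row choice can be sketched with NumPy, whose integer-array indexing follows the same pattern. The random data and the loop-based cross-check below are illustrative assumptions, not part of the original answer.

```python
import numpy as np

rng = np.random.default_rng(0)
source = rng.normal(size=(100, 5, 100))  # values
idx = rng.integers(0, 5, size=100)       # one index in [0, 5) per row

# Row r is paired with idx[r], giving source[r, idx[r], :] for every r
result = source[np.arange(100), idx]
print(result.shape)  # (100, 100)

# The same selection written as an explicit loop, for comparison
expected = np.stack([source[r, idx[r]] for r in range(100)])
print(np.array_equal(result, expected))  # True
```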

iterating a numpy matrix, and assign its rows with information from other dataframe and numpy array

I have a matrix, e.g., defined as x_matrix = np.zeros((200, 16)). Iterating over the rows, I need to fill each row of this matrix from two component vectors: a1 is an array with 10 elements, and a2 is the corresponding row of a pandas dataframe y_dataframe, which has shape (200, 6).
I can iterate over the matrix as follows, but I also need the row number of x_matrix to retrieve the corresponding row in y_dataframe. Are there other ways to iterate over the matrix rows and compose each row from the component vectors described above?
for row in x_matrix:

You can do this without iteration if you wish using np.repeat and np.hstack:
# assuming `a1` is shaped (10,) i.e. 1D array
a1_repeated = np.repeat(a1[np.newaxis, :], 200, axis=0)
x_matrix = np.hstack((a1_repeated, y_dataframe))
where we first convert a1 into a row vector of shape (1, 10) via [np.newaxis, :], then repeat it 200 times row-wise (axis=0). Lastly, we horizontally stack this (200, 10) shaped a1_repeated and y_dataframe to get an array of shape (200, 16).
But if you want to iterate, enumerate gives you the index you are at:
for row_number, row in enumerate(x_matrix):
    x_matrix[row_number] = [*a1, *y_dataframe.iloc[row_number]]
where y_dataframe.iloc[row_number] is the a2 you mention, i.e. a row of the dataframe.
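A small self-contained check that the vectorized and looped versions agree. To avoid depending on pandas, `y_values` is a made-up NumPy stand-in for what `y_dataframe.to_numpy()` would return, and the values of `a1` are arbitrary.

```python
import numpy as np

a1 = np.arange(10, dtype=float)                           # shape (10,)
y_values = np.arange(200 * 6, dtype=float).reshape(200, 6)  # stand-in for the dataframe values

# Vectorized: (10,) -> (1, 10) -> (200, 10), then stack with the (200, 6) block
a1_repeated = np.repeat(a1[np.newaxis, :], 200, axis=0)
x_matrix = np.hstack((a1_repeated, y_values))
print(x_matrix.shape)  # (200, 16)

# Loop version: fill each row from a1 and the matching row of y_values
x_loop = np.zeros((200, 16))
for row_number in range(200):
    x_loop[row_number] = np.concatenate((a1, y_values[row_number]))

print(np.array_equal(x_matrix, x_loop))  # True
```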

ValueError: cannot reshape array of size 74404 into shape (6764,1691,1)

My df shape is 2D (6764, 11).
I want to reshape it into 3D with 1691 time steps (i.e., 1/4 of 6764)
df = df.values.reshape((df.shape[0], 1691, df.shape[1]))
I get the error: ValueError: cannot reshape array of size 74404 into shape (6764,1691,11)
Why do I get size 74404? I understand it is 6764*11, but why is it doing this multiplication?
Edit:
I actually want to reshape my data into [6764, 1691, 11], which is the dimension required for an LSTM model. This dimension stands for [Samples, TimeSteps, Features], where samples is the number of data points, time steps is the number of data points I want to analyse/predict, and 11 is the number of inputs (columns) I am using. Any advice on how to achieve this shape without getting the error? My reference is this

From the 2D dataframe you get an array of 6764 x 11 = 74404 values.
The multiplication gives the total number of values in the array/dataframe, and a reshape must keep that total unchanged.
The target shape in your code, (df.shape[0], 1691, df.shape[1]), would need 6764 x 1691 x 11 = 125817164 values, which does not match the number of values in the input array. That is why you are getting the error.
Considering you want 1691 series, you can reshape your data into (1691 x 4 x 11):
df = df.values.reshape((1691, 4, df.shape[1]))
If you only need one column (6764 values) reshaped, use the code below, which generates a 2D array of shape (1691, 4):
df = df['column_name'].values.reshape((1691, 4))
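The element-count rule can be checked directly with NumPy. In this sketch, `values` is a made-up stand-in for `df.values` with the same (6764, 11) shape.

```python
import numpy as np

values = np.arange(6764 * 11).reshape(6764, 11)  # stand-in for df.values
print(values.size)  # 74404

# A reshape must preserve the total number of elements, so this fails
try:
    values.reshape((6764, 1691, 11))
    failed = False
except ValueError as err:
    failed = True
    print(err)

# 1691 * 4 * 11 == 74404, so this shape is valid
windows = values.reshape((1691, 4, 11))
print(windows.shape)  # (1691, 4, 11)

# Each block of 4 holds consecutive original rows; block 0 is rows 0..3
first_block_matches = np.array_equal(windows[0], values[:4])
```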

Create a matrix with dimension [N,D] from tensor with dimension [N]

I have a tensor with dimension N and I would like to replicate it to create a tensor with dimension NxD, where each column is the initial vector.
Thank you

You first want to expand/reshape your tensor to an N x 1 shape, before tiling it D times in the 2nd dimension:
tensor_N_x_1 = tf.expand_dims(tensor, 1) # Expand by adding a dim in position 1
tensor_N_x_D = tf.tile(tensor_N_x_1, [1, D]) # Tile 1 time in the 1st dim, D times in the 2nd
Documentation:
tf.expand_dims
tf.tile
