I have a list of 63 matrices, each supposed to be of size (32, 1, 600, 600), which I want to stack into a (63, 32, 1, 600, 600) array. When I call torch.stack(matrices).cpu().detach().numpy(), it raises the error:
"stack expects each tensor to be equal size, but got [32, 1, 600, 600] at entry 0 and [16, 1, 600, 600] at entry 62". I tried resizing, but it did not work. I appreciate any recommendations.
If I understand correctly, what you're trying to do is stack the outputted mini-batches into a single batch. My bet is that your last batch is only partially filled (it has 16 elements instead of 32).
Instead of using torch.stack (which creates a new axis), I would simply concatenate with torch.cat along the batch axis (axis=0), assuming matrices is a list of torch.Tensors:
torch.cat(matrices).cpu().detach().numpy()
torch.cat concatenates along axis=0 by default.
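For illustration, a minimal sketch with scaled-down spatial dimensions (8x8 instead of 600x600) and a hypothetical list of mini-batches where the last one is partially filled:
import torch
# Scaled-down stand-ins for the real mini-batches; the last one only has 16 samples.
batches = [torch.randn(32, 1, 8, 8) for _ in range(62)] + [torch.randn(16, 1, 8, 8)]
merged = torch.cat(batches)               # concatenates along dim=0 by default
print(merged.shape)                       # torch.Size([2000, 1, 8, 8]); 62*32 + 16 = 2000
merged_np = merged.cpu().detach().numpy() # same conversion as in the question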
When we have tensors that differ in size only along the first dimension, as of PyTorch v1.7.0 we can use torch.vstack() to stack them along axis 0. torch.stack() fails here because it expects all tensors to be of the same shape.
Here is a reproducible illustration matching your problem description:
# sample tensors (as per your size)
In [65]: t1 = torch.randn([32, 1, 600, 600])
In [66]: t2 = torch.randn([16, 1, 600, 600])
# vertical stacking (i.e., stacking along axis 0)
In [67]: stacked = torch.vstack([t1, t2])
# check shape of output
In [68]: stacked.shape
Out[68]: torch.Size([48, 1, 600, 600])
We get 48 (32 + 16) as the size of the first dimension in the result because we're stacking the tensors along that dimension.
Note:
You can also preallocate the result tensor, say stacked, by explicitly calculating the output shape and passing that tensor to the out= kwarg of torch.vstack() if you want to write the result to a specific tensor, for instance to update the values of an existing tensor of the same shape. However, this is optional.
# calculate new shape of stacking
In [80]: newshape = (t1.shape[0] + t2.shape[0], *t1.shape[1:])
# allocate an empty tensor, filled with garbage values
In [81]: stacked = torch.empty(newshape)
# stack it along axis 0 and write the result to `stacked`
In [83]: torch.vstack([t1, t2], out=stacked)
# check shape/size
In [84]: stacked.shape
Out[84]: torch.Size([48, 1, 600, 600])
Related
I have a PyTorch tensor of the following shape: (100, 5, 100). I need to convert it into a tensor of shape (100, 100) by selecting from each row only one item in the second dimension, meaning that of those 5 elements I only need one, with its corresponding 100 elements.
To do this operation I have a second tensor of shape (100,) with the indices that specify which of those 5 items should be selected in each row.
Is there a simple way to perform this selection without having to mess with the dimensions too much?
Suppose the tensor with indices is called idx and has shape (100,), and the tensor with values is called source. Then to select:
result = source[torch.arange(100), idx]
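A minimal sketch of that indexing, with the shapes from the question and a random idx just for illustration:
import torch
source = torch.randn(100, 5, 100)        # values
idx = torch.randint(0, 5, (100,))        # which of the 5 items to keep in each row
result = source[torch.arange(100), idx]  # pairs row i with column idx[i] along dim 1
print(result.shape)                      # torch.Size([100, 100])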
I'm trying to essentially create a 3-D tensor from the indexed rows of a 2-D tensor. For example, assuming I have:
A = tensor(shape=[200, 256]) # 2-D Tensor.
Aidx = tensor(shape=[1000, 10]) # 2-D Tensor holding row indices of A for each of 1000 batches.
I wish to create:
B = tensor(shape=[1000, 10, 256]) # 3-D Tensor with each batch being of dims (10, 256) selected from A.
Right now, I'm doing this in a memory-inefficient manner with tf.broadcast_to() followed by tf.gather(). This is very fast, but also takes up a lot of RAM:
A = tf.broadcast_to(A, [1000, A.shape[0], A.shape[1]])
A = tf.gather(A, Aidx, axis=1, batch_dims=1)
Is there a more memory efficient way of doing the above operation? Naively, one can make use of a for loop, but that is very compute inefficient for my use case. Thanks in advance!
You have to extract 10,000 rows, correct? (10 rows, 1000 different times.)
Make the [1000, 10] index array into a 1-dimensional array of shape [10000] with reshape.
See this answer:
How to fetch specific rows from a tensor in Tensorflow?
This will give you an output of shape [10000, 256].
Then reshape that output into your final form, [1000, 10, 256].
I haven't tried it.
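For what it's worth, a rough sketch of that reshape-then-gather idea, with random data standing in for A and Aidx (not tested against the original setup):
import tensorflow as tf
A = tf.random.normal([200, 256])                              # source rows
Aidx = tf.random.uniform([1000, 10], 0, 200, dtype=tf.int32)  # row indices per batch
flat_idx = tf.reshape(Aidx, [-1])      # [10000]
rows = tf.gather(A, flat_idx)          # [10000, 256]
B = tf.reshape(rows, [1000, 10, 256])  # [1000, 10, 256]
Since tf.gather also accepts multi-dimensional indices, tf.gather(A, Aidx) should give the [1000, 10, 256] result directly, without the intermediate broadcast.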
I have a 2D tensor my_tensor of size [50, 50] and dtype int32, and I need to increment the value at one specific location. The indices of the location to be updated are given by two integer tensors, which give the location along axis 0 and axis 1, respectively:
idx_0 is:
tf.Tensor([27], shape=(1,), dtype=int32)
idx_1 is:
tf.Tensor([26], shape=(1,), dtype=int32)
TensorFlow's tensor_scatter_nd_add seems to be the solution. The code works if I define the indices manually, but if I try to use idx_0 and idx_1, every implementation gives some index/dimension mismatch error.
This works, incrementing location (27,26):
tf.tensor_scatter_nd_add(reversals_count, [[27, 26]], [1])
but this raises an error:
tf.tensor_scatter_nd_add(reversals_count, [[idx_0, idx_1]], [1])
with the error message
{InvalidArgumentError}Outer dimensions of indices and update must match. Indices shape: [1,2,1], updates shape:[1] [Op:TensorScatterAdd]
How can I use the idx_0 and idx_1 tensors in place of [[27, 26]]? Other syntaxes I've tried similarly do not produce the correct dimensions:
[[idx_0], [idx_1]]
tf.concat([idx_0, idx_1], axis=0)
I have created a (5x5) matrix of ones for the sake of simplicity and am updating it at indices (0,0), (1,1), (2,2), (3,3) (i.e., the first four diagonal elements). First define the indices as a tensor, then the values that will be added at the respective indices, then update using tf.tensor_scatter_nd_add. You can do the same for a (50x50) matrix. Thanks!
import tensorflow as tf
indices = tf.constant([[0,0], [1,1], [2,2], [3,3]])  # updating at diagonal index elements, you can see the change
updates = tf.constant([9, 10, 11, 12])  # values that will be added at the respective indices
print("original tensor is ")
tensor = tf.ones([5,5], dtype=tf.int32)
print(tensor)
print("updated tensor is ")
updated = tf.tensor_scatter_nd_add(tensor, indices, updates)
print(updated)
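As a side note on the original question's idx_0 and idx_1: the literal [[27, 26]] that works has shape [1, 2], so one arrangement that should reproduce that shape is stacking the two index tensors along the last axis. A sketch under that assumption, with a zero tensor standing in for reversals_count:
import tensorflow as tf
reversals_count = tf.zeros([50, 50], dtype=tf.int32)  # stand-in for the real tensor
idx_0 = tf.constant([27], dtype=tf.int32)  # axis-0 index
idx_1 = tf.constant([26], dtype=tf.int32)  # axis-1 index
indices = tf.stack([idx_0, idx_1], axis=-1)  # shape [1, 2], like [[27, 26]]
updated = tf.tensor_scatter_nd_add(reversals_count, indices, tf.constant([1], dtype=tf.int32))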
I have the following situation. I have an array of size (3, 128, n), where n is large. (This array represents a picture.) I have a super-resolution deep learning model that takes as input a (3, 128, 128) picture and gives it back in better quality. I want to apply my model to the whole picture.
My existing solution
My first solution to this problem is to split my array into arrays of size (3, 128, 128). I then have a list of square images, and I can apply my model to each of these squares and then concatenate all the results to get a new (3, 128, n) image. The problem with this method is that the model does not perform as well on the edges of each square.
My desired solution
To get around this problem, I have thought of an alternative solution. Instead of considering non-overlapping square images, I can consider all square images that can be extracted from my original image and pass all of them to my model. Then, to reconstruct a point with coordinates (a, b, c), I will consider all reconstructed square pictures that contain c and take an average of them. I want this average to give more weight to the squares where c is near the center.
To be more specific :
I start with a 3*128*n array (let's call it A). I pad it on the left and on the right, which gives me a new array (let's call it A_pad) of size 3*128*(n+2*127).
For i in range(0, n+127), let A_i = A_pad[:, :, i:i+128]; A_i is of size (3*128*128) and can be fed to my model, which creates a new array B_i of the same size.
Now I want a new array B of the same size as A, defined like this: for each (x, y, z), B[x, y, z] is the weighted mean of the 128 values B_i[x, y, z+127-i] such that z <= i < z+128, with weight 1 + min(z + 127 - i, i - z). That corresponds to taking the mean of all the windows that contain z, with a weight proportional to the distance to the closest edge.
My question is based on the computation of B. Given what I've described, I could write multiple for loops that would yield the correct results, but I'm afraid it would be slow. I'm looking for a solution using numpy that is as fast as possible.
This is an example implementation that follows the steps you outlined in the section "My desired solution". It makes extensive use of np.lib.stride_tricks.as_strided, which at first glance might not seem obvious at all; I added detailed comments to each usage for clarification. Also note that in your description you use z to denote the column position within images, while in the comments I use the term n-position in order to comply with the shape specification via n.
Regarding efficiency, it's not obvious whether this is a winner or not. Computation happens entirely in numpy, but the expression sliding_128 * weights builds a large array (128x the size of the original image) before reducing it along the frame dimension. This definitely comes at a cost; memory might even be an issue. A loop might come in handy at that point (see the sketch after the code below).
Lines which contain a comment prefixed with # [TEST] were added for testing purposes. Concretely this means we're overwriting the weights for the final sum of frames with 1 / 128 in order to eventually recover the original image (since no ML model transformation is applied either).
import numpy as np
n = 640 # For example.
image = np.random.randint(0, 256, size=(3, 128, n))
print('image.shape: ', image.shape) # (3, 128, 640)
padded = np.pad(image, ((0, 0), (0, 0), (127, 127)), mode='edge')
print('padded.shape: ', padded.shape) # (3, 128, 894)
sliding = np.lib.stride_tricks.as_strided(
padded,
# Frames stored along first dimension; sliding across last dimension of `padded`.
shape=(padded.shape[-1]-128+1, 3, 128, 128),
# First dimension: Moving one frame ahead -> move across last dimension of `padded`.
# Remaining three dimensions: Move as within `padded`.
strides=(padded.strides[-1:] + padded.strides)
)
print('sliding.shape: ', sliding.shape) # (767, 3, 128, 128)
# Now at this part we would feed the frames `sliding` to the ML model,
# where the first dimension is the batch size.
# Assume the output is assigned to `sliding` again.
# Since we're not using an ML model here, we create a copy instead
# in order to update the strides of `sliding` with its actual shape (as defined above).
sliding = sliding.copy()
sliding_128 = np.lib.stride_tricks.as_strided(
# Reverse last dimension since we want the last column from the first frame.
# Need to copy again because `[::-1]` creates a view with negative stride,
# but we want actual reversal to work with the strides below.
# (There's perhaps a smart way of adjusting the strides below in order to not make a copy here.)
sliding[:, :, :, ::-1].copy(),
# Second dimension corresponds to the 128 consecutive frames.
# Previous last dimension is dropped since we're selecting the
# column that corresponds to the current n-position.
shape=(128, n, 3, 128),
# First dimension (frame position): Move one frame and one column ahead
# (actually want to move one column less in `sliding` but since we reverted order of columns
# we need to move one ahead now) -> move across first dimension of `sliding` + last dimension of `sliding`.
# Second dimension (n-position): Moving one frame ahead -> move across first dimension of `sliding`.
# Remaining two dimensions: Move within frames (channel and row dimensions).
strides=((sliding.strides[0] + sliding.strides[-1],) + sliding.strides[:1] + sliding.strides[1:3])
)
print('sliding_128.shape: ', sliding_128.shape) # (128, 640, 3, 128)
# Weights are independent of the n-position -> we can precompute.
weights = 1 + np.concatenate([np.arange(64), np.arange(64)[::-1]])
weights = np.ones(shape=128) # [TEST] Assign weights for testing -> want to obtain the original image back.
weights = weights.astype(float) / weights.sum() # Normalize?
weights = weights[:, None, None, None] # Prepare for broadcasting.
weighted_image = np.moveaxis(np.sum(sliding_128 * weights, axis=0), 0, 2)
print('weighted_image.shape: ', weighted_image.shape) # (3, 128, 640)
assert np.array_equal(image, weighted_image.astype(int)) # [TEST]
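Regarding the memory concern mentioned above, here is a rough sketch of the loop variant: it accumulates one frame offset at a time, so the full (128, n, 3, 128) product array is never materialized. It assumes sliding_128 and the broadcast-ready weights from the snippet above.
# Accumulate frame by frame instead of building the full weighted stack.
acc = np.zeros(sliding_128.shape[1:], dtype=float)    # (n, 3, 128)
for f in range(128):
    acc += weights[f] * sliding_128[f]                # weights[f] broadcasts over (n, 3, 128)
weighted_image_loop = np.moveaxis(acc, 0, 2)          # back to (3, 128, n)
print('weighted_image_loop.shape: ', weighted_image_loop.shape)  # (3, 128, 640)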
I am looking for an elegant way to flatten an array of arbitrary shape to a matrix based on a single parameter that specifies the dimension to retain. For illustration, I would like
def my_func(input, dim):
# code to compute output
return output
Given, for example, an input array of shape 2x3x4, the output should be, for dim=0, an array of shape 12x2; for dim=1, an array of shape 8x3; for dim=2, an array of shape 6x4. If I only want to retain the last dimension, this is easily accomplished by
input.reshape(-1, input.shape[-1])
But I would like to add the functionality of choosing dim (elegantly, without going through all possible cases and checking with if conditions, etc.). It might be possible by first swapping dimensions so that the dimension of interest is trailing and then applying the operation above.
Any help?
We can permute axes and reshape -
# a is input array; axis is input axis/dim
np.moveaxis(a,axis,-1).reshape(-1,a.shape[axis])
Functionally, it pushes the specified axis to the back and then reshapes: that axis's length forms the second axis of the output, while the remaining axes are merged to form the first axis.
Sample runs -
In [32]: a = np.random.rand(2,3,4)
In [33]: axis = 0
In [34]: np.moveaxis(a,axis,-1).reshape(-1,a.shape[axis]).shape
Out[34]: (12, 2)
In [35]: axis = 1
In [36]: np.moveaxis(a,axis,-1).reshape(-1,a.shape[axis]).shape
Out[36]: (8, 3)
In [37]: axis = 2
In [38]: np.moveaxis(a,axis,-1).reshape(-1,a.shape[axis]).shape
Out[38]: (6, 4)
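Wrapping this up in the form the question asked for (my_func and its parameter names simply mirror the question; this is a small sketch, not a library function):
import numpy as np
def my_func(a, dim):
    # Push `dim` to the back, keep its length as the second output axis,
    # and merge all remaining axes into the first output axis.
    return np.moveaxis(a, dim, -1).reshape(-1, a.shape[dim])
a = np.random.rand(2, 3, 4)
print(my_func(a, 0).shape)  # (12, 2)
print(my_func(a, 1).shape)  # (8, 3)
print(my_func(a, 2).shape)  # (6, 4)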