I'm working with 4-dimensional tensors and need to do a couple of calculations that work like the following example. Take A to be a tensor with shape (6,64,64,64). I want to use the function tf.where to obtain the voxels of each (64,64,64) volume that have a value larger than 0.75. The only way I have managed to do this is like this:
X = tf.convert_to_tensor([tf.where(A[i,:,:,:] > 0.75) for i in range(A.shape[0])])
This seems to be a very crude solution. Is there a better way to achieve this?
The problem with what you are trying to do is that it requires that each (64, 64, 64) volume has the same number of values greater than 0.75. If that is the case, you could just do the following:
X = tf.reshape(tf.where(A > 0.75)[:, 1:], (A.shape[0], -1, A.shape.ndims - 1))
But if that is not the case, you simply cannot have a tensor like that, because the second dimension would need a different size for each volume.
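To illustrate the first case, here is a tiny sketch with a made-up tensor in which every volume has, by construction, the same number of values above the threshold:

import tensorflow as tf

# Made-up example: 3 identical (4, 4, 4) volumes, so each has the same
# number of entries above 0.75 and the reshape is valid.
A = tf.reshape(tf.stack([tf.linspace(0.0, 1.0, 4 * 4 * 4)] * 3), (3, 4, 4, 4))
X = tf.reshape(tf.where(A > 0.75)[:, 1:], (A.shape[0], -1, A.shape.ndims - 1))
print(X.shape)  # (3, n, 3) -- one row of 3 indices per voxel above the threshold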
Regarding the answer posted here, when I try to use the equations to obtain the parameters of the transposed convolution, I run into problems. For example, I have a tensor of size [16, 256, 16, 160, 160] and I want to upsample it to size [16, 256, 16, 224, 224]. Based on the transposed convolution equation, when I pick stride = 2 and solve the height equation for k (the kernel size), I get the following equation, in which the kernel size comes out as a large negative value.
224 = (160 - 1)×2 + 1×(k - 1) + 1
What is wrong with my calculations, and how can I find the right parameters?
I don't think you applied the formula incorrectly; the issue is primarily that the input and output dimensions you want are not possible with stride = 2.
Transposed or dilated convolutions scale the output up really quickly. Let's say, for example, you took these parameters for your transposed convolution (I'm simplifying the values to 1D just to make the calculations clear):
Input Size = 160
Stride = 2
Kernel = 1
Padding = 0
Output Padding = 0
Now we apply the formula from the official docs for calculating output shape:
H_out = (H_in − 1) × stride[0] − 2 × padding[0] + dilation[0] × (kernel_size[0] − 1) + output_padding[0] + 1
OR we can simplify the formula a bit:
Output Size = ((Input Size - 1) * Stride) - (2 * Padding) + Filter_Size + Output Padding + 1
Here, Filter_Size = dilation_factor * (kernel_size - 1), to make the formula seem less scary.
Now let's take our example and plug the values in to see what transposed output size we get with stride = 2 and the smallest kernel size possible, that is, kernel = 1:
Output_Size = ((160 - 1) * 2) - (2 * 0) + 1 * (1 - 1) + 0 + 1
Output_Size = 318 - 0 + 0 + 0 + 1
Output_Size = 319
So, with the stride you want, you will have an output size of at least 319, while you want 224, hence the negative kernel_size.
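To double-check the arithmetic, here is a minimal sketch, assuming the formula above corresponds to PyTorch's nn.ConvTranspose2d:

import torch
import torch.nn as nn

# Smallest-kernel transposed convolution with stride 2 on a 160x160 input.
up = nn.ConvTranspose2d(1, 1, kernel_size=1, stride=2, padding=0, output_padding=0)
x = torch.randn(1, 1, 160, 160)
print(up(x).shape)  # torch.Size([1, 1, 319, 319]) -- already well past 224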
I hope that answers your question.
Reference links to understand transposed convolution calculations better, with examples:
Paperspace: Transpose Convolution Explained for Up-Sampling Images
Calculating the Output Size of Convolutions and Transpose Convolutions
There is no good constructive answer to this question.
Being in some sense the inverse of conv2d, which downsamples an image by a factor of stride, transposed_conv2d upsamples by a factor of stride. You cannot use it for an arbitrary resize and get uniformly good results; there is torchvision.transforms.Resize or adaptive pooling for that.
torchvision.transforms.Resize is the default choice; it is simple and flexible, and you can feed it a PIL image or a torch.Tensor. Use the former if input sizes vary dynamically, the latter if not.
Adaptive pooling, usually AdaptiveAvgPool2d, is more sophisticated; it is supposed to be part of the architecture. Inserted at the beginning of a network, it works as a (batched) image resize. No magic: it is usually implemented on the CPU, and you will have a hard time implementing it on tensor hardware. In embedded solutions it is typical to have a special image processor for such work.
Well, you could still formally solve the task with transposed_conv2d by playing with padding, but that would just mean cutting off part of the image, probably losing information, or inserting a lot of useless spacing.
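For the sizes in the question, a minimal sketch of the resize route, assuming a torchvision version whose transforms.Resize accepts tensors (the leading dims are flattened into a batch so Resize only sees the last two dims):

import torch
from torchvision import transforms

x = torch.randn(16, 256, 16, 160, 160)  # tensor from the question
resize = transforms.Resize((224, 224))
# Resize works on the last two dims; flatten the leading dims into a batch first.
y = resize(x.reshape(-1, 1, 160, 160)).reshape(16, 256, 16, 224, 224)
print(y.shape)  # torch.Size([16, 256, 16, 224, 224])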
I have a PyTorch 2D tensor with normally distributed values.
Is there a fast way to zero out the top 10% largest values of this tensor using Python?
I see two possible ways here:
Flatten the tensor to 1D and just sort it
A non-vectorized approach using native Python loops (for/if)
But neither of these looks fast enough.
So, what is the fastest way to set the X largest values of a tensor to zero?
Well, it seems that PyTorch has a useful operator, torch.quantile(), that helps a lot here.
The solution (for a 1D tensor):
import torch

x = torch.randn(100)
y = torch.tensor(0.)                  # new value to assign
split_val = torch.quantile(x, 0.9)    # 90th percentile: threshold for the top 10%
x = torch.where(x < split_val, x, y)  # keep values below the threshold, zero out the rest
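The 2D tensor from the question needs no change, since torch.quantile without a dim argument reduces over all elements; a quick sketch:

x2d = torch.randn(100, 100)
split_val = torch.quantile(x2d, 0.9)                       # 90th percentile over all entries
x2d = torch.where(x2d < split_val, x2d, torch.tensor(0.))  # zero out the top 10%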
I want to average a slice of a numpy array (it's an image).
Currently I'm iterating over each pixel as follows, but it's dreadfully slow. I know there is a better way, but I can't work it out. It's probably numpy fancy indexing, but I'm stuck.
I've used OpenCV to read the image into a numpy array with shape (640, 480, 3), and I want to change each last-axis triple, e.g. [123, 121, 234], to the average of that slice for each of the 640x480 pixels.
You don't have to give me the answer but a shove in the right direction would be helpful.
This is what's slow for me:
def bw_image_arr(self):
    for x in self.images:
        for y in x:
            for z in y:
                z = z.mean()
Use the axis argument to do the mean-reduction along the last axis and then broadcast back to the original shape with np.broadcast_to -
np.broadcast_to(images.mean(axis=-1,keepdims=True),images.shape)
np.broadcast_to gives us memory efficiency by returning an original-shaped view into the averaged array. If you need the final output to have its own memory, append .copy() -
np.broadcast_to(images.mean(axis=-1,keepdims=True),images.shape).copy()
Alternatively, we can use np.repeat -
images.mean(axis=-1,keepdims=True).repeat(images.shape[-1],axis=-1)
The posted solutions work for ndarrays of generic dimensions, so they will work on one image or a set of images, with the average along the last axis being broadcast/replicated/repeated along that axis.
Also, note that the final output will be of float dtype, so you might want to round and/or convert to the usual unsigned-int image dtype.
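A minimal sketch tying this together (the image here is random, made-up data):

import numpy as np

image = np.random.randint(0, 256, (640, 480, 3), dtype=np.uint8)  # made-up image
avg = np.broadcast_to(image.mean(axis=-1, keepdims=True), image.shape)
avg = avg.round().astype(np.uint8)  # back to the usual image dtype
print(avg.shape)  # (640, 480, 3), each channel now holds the per-pixel average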
You need to average over the x and y axes, in your case axes 1 and 2 (you can pass them to numpy.mean as a tuple). Then, if you have for example 50 images along the first dimension, you will get a (50, 3)-shaped array.
I hope I'm not missing anything obvious here, but I've scoured the inter-webs to no avail, and finally come to ask here...
Here's a really dry and simple description of what I'd like to do:
Say I've got a tensor of shape (20, 40, 3, 5) and another tensor of shape (20, 40, 5, 7). The first two dimension sizes are to be kept as they are and are purposely identical for the two tensors. The last two dimensions, on the other hand, are to be (matrix-)multiplied, matmul style, so my resulting tensor would be of shape (20, 40, 3, 7). How can that be done?
I realize I can theoretically just loop over the first two dimensions and use tf.matmul() directly, but that's an absolute no-go due to runtime, efficiency, model-trainer and GPU world-wide protests, and my conscience if that's of any weight :-).
I've unfortunately disregarded as "not what I need" the following options:
tf.tensordot would give me an output of shape (20, 40, 3, 20, 40, 7). No good.
tf.scan is only good for the first dimension if I'm reading it correctly (suitable for RNNs maybe? Not my case anyhow).
tf.matmul works for tensors of rank >= 2, but as far as I could tell it multiplies over the last and first dimensions respectively. Again, not my case.
So again - how can this be done?
A numpy answer that helps me get in the right direction would also be very helpful, but I'll need a tf implementation at the end of the day.
Thanks in advance, and sorry if I'm missing something dumb.
The following is closer to what I need, but less clear and so is being written separately:
The first two dimensions are spatial dimensions of an image. The last two are actually square matrices, obtained via tf.contrib.distributions.fill_triangular, and they are multiplied (along with an appropriate transpose on one of them) to obtain covariance matrices associated with each spatial coordinate. I don't know if that helps in any way, but it gives some context at the very least. Also, there might or might not be a batch dimension as well, but I'm assuming that solving the 4-D tensor case would generalize well enough.
Posting this for future reference:
From the numpy matmul docs:
If either argument is N-D, N > 2, it is treated as a stack of matrices
residing in the last two indexes and broadcast accordingly.
For dimensions > 2, it treats the arguments as stacks of matrices and matmuls the last two dimensions, giving a numpy array of the shape the OP required.
For example:
import numpy as np
A = np.ones((1,2,1,2))
B = np.ones((1,2,2,1))
print(A.shape)
print(B.shape)
print(np.matmul(A,B).shape)
with result:
(1, 2, 1, 2)
(1, 2, 2, 1)
(1, 2, 1, 1)
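Since the question ultimately needs a TensorFlow implementation, here is a minimal sketch under the assumption that tf.matmul batches over the leading dimensions in the same way:

import tensorflow as tf

a = tf.random.normal((20, 40, 3, 5))
b = tf.random.normal((20, 40, 5, 7))
c = tf.matmul(a, b)  # matmul over the last two dims, batched over (20, 40)
print(c.shape)       # (20, 40, 3, 7)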
I have been struggling with this for quite some time. All I want is a torch.diff() function. However, many matrix operations do not appear to be easily compatible with tensor operations.
I have tried an enormous amount of various pytorch operation combinations, yet none of them work.
Since pytorch hasn't implemented this basic feature, I started by simply trying to subtract element i+1 from element i along a specific axis.
However, you can't simply do this element-wise (due to the tensor limitations), so I tried to construct another tensor, with the elements shifted along one axis:
ix_plus_one = [0]+list(range(0,prediction.size(1)-1))
ix_differential_tensor = torch.LongTensor(ix_plus_one)
diff_one_tensor = prediction[:,ix_differential_tensor]
But now we have a different problem - indexing doesn't really work to mimic numpy in pytorch as advertised, so you can't index with a "list-like" tensor like this. I also tried using the tensor scatter functions.
So I'm still stuck with this simple problem of trying to get a gradient on a pytorch tensor.
All of my searching leads to the marvelous capabilities of PyTorch's autograd feature, which has nothing to do with this problem.
A 1D convolution with a fixed filter should do the trick:
import numpy as np
import torch

# A 1D convolution with a fixed [-1, 1] kernel computes consecutive differences.
filter = torch.nn.Conv1d(in_channels=1, out_channels=1, kernel_size=2, stride=1, padding=1, groups=1, bias=False)
kernel = np.array([-1.0, 1.0], dtype=np.float32)  # float32 to match the layer's default weight dtype
kernel = torch.from_numpy(kernel).view(1, 1, 2)
filter.weight.data = kernel
filter.weight.requires_grad = False
Then use filter like you would any other layer in torch.nn.
Also, you might want to change padding to suit your specific needs.
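A quick check of what the layer above computes, as a sketch (with padding=1 the first and last outputs are boundary terms coming from the zero padding):

x = torch.arange(6, dtype=torch.float32).view(1, 1, -1)  # shape (batch, channels, length)
print(filter(x))  # interior values are x[i+1] - x[i]; the two ends come from the padding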
There appears to be a simpler solution to this (I needed something similar), referenced here: https://discuss.pytorch.org/t/equivalent-function-like-numpy-diff-in-pytorch/35327/2
diff = x[1:] - x[:-1]
which can be done along different dimensions such as
diff = polygon[:, 1:] - polygon[:, :-1]
I would recommend writing a unit test that verifies identical behavior though.
For all those running into this question after March 2021:
As of torch 1.8 there is torch.diff, which works exactly as the OP expected.
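A quick example, assuming torch >= 1.8:

import torch

x = torch.tensor([1., 4., 9., 16.])
print(torch.diff(x))                    # tensor([3., 5., 7.])
print(torch.diff(x.view(2, 2), dim=1))  # also works along a chosen dimension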