I have sensor data stored as tensors; each tensor has shape (4, 1500).
That is 1500 time points, with 4 features for each time point.
I want to "smooth" the sequences with a rolling average or other rolling statistics. The end goal is to try to improve an LSTM autoencoder by feeding it rolling statistics instead of the long raw sequence.
I am familiar with rolling windows of pandas and currently I am doing this:
#tensor shape:
data.shape
(4,1500)
#convert data to numpy array and then to dataframe and perform rolling mean
rolled_data=pd.DataFrame(data.numpy().swapaxes(1,0)).rolling(10).mean()[::10]
rolled_data.shape
(150, 4)
# convert back the dataframe to tensor
tensor_rolled_data=torch.Tensor(rolled_data.to_numpy().swapaxes(1,0))
tensor_rolled_data.shape
torch.Size([4, 150])
My question is: is there a better way to do it? Is there a function in numpy/torch that can compute rolling statistics in a cleaner or more efficient way?
Since you're striding the output by the size of the window, this is actually more akin to downsampling by averaging than to computing rolling statistics. We can take advantage of the fact that there are no overlaps by simply reshaping the initial tensor.
Using Tensor.reshape
Assuming your data tensor has a shape divisible by 10 then you can just reshape the tensor to shape (4, 150, 10) and compute the statistic along the last dimension. For example
win_size = 10
tensor_rolled_data = data.reshape(data.shape[0], -1, win_size).mean(dim=2)
This solution doesn't give exactly the same results as your tensor_rolled_data: here the first entry contains the mean of the first 10 samples, the second entry the mean of the second 10 samples, and so on. The pandas solution is a "causal filter", so its first entry contains the mean of the 10 most recent samples up to and including sample 0, the second the mean of the 10 most recent samples up to and including sample 10, and so on. (Note that the first entry is nan in the pandas solution, since fewer than 10 preceding samples exist.)
If this difference is unacceptable you can recreate the pandas result by first padding with 9 nan values and clipping off the last 9 samples.
import torch.nn.functional as F
win_size = 10
# pad with `nan` to match behavior of pandas
data_padded = F.pad(data[None, :, :-(win_size - 1)], (win_size - 1, 0), 'constant', float('nan')).squeeze(0)
# find mean of groups of N samples
tensor_rolled_data = data_padded.reshape(data.shape[0], -1, win_size).mean(dim=2)
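Since the question also asks about NumPy, here is a minimal sketch of the same reshape idea on a plain array. The helper name `block_stats` and the stand-in array `data_np` are made up for illustration, and dropping an incomplete trailing window is an assumption about what is wanted when the length is not divisible by the window size.

```python
import numpy as np

def block_stats(x, win_size):
    """Non-overlapping window statistics along the last axis of a 2-D array."""
    usable = (x.shape[-1] // win_size) * win_size   # drop the incomplete tail, if any
    blocks = x[:, :usable].reshape(x.shape[0], -1, win_size)
    return {
        "mean": blocks.mean(axis=2),
        "std": blocks.std(axis=2),
        "max": blocks.max(axis=2),
    }

# Stand-in for data.numpy(); 1505 is deliberately not a multiple of 10.
data_np = np.arange(4 * 1505, dtype=float).reshape(4, 1505)
stats = block_stats(data_np, win_size=10)
print(stats["mean"].shape)  # (4, 150)
```

Any reduction that accepts an `axis` argument (median, min, percentiles) can be applied to `blocks` in the same way.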
Using Tensor.unfold
To address the comment about what to do when there are overlaps. If you're only interested in the mean statistic then there are a number of ways to compute this (e.g. convolution, average pooling, tensor unfolding). That said, Tensor.unfold gives the most general solution since it could be used to compute any statistic over a window. For example
# same as first example above
win_size = 10
tensor_rolled_data = data.unfold(dimension=1, size=win_size, step=win_size).mean(dim=2)
or
# same as second example above
import torch.nn.functional as F
win_size = 10
data_padded = F.pad(data.unsqueeze(0), (win_size - 1, 0), 'constant', float('nan')).squeeze(0)
tensor_rolled_data = data_padded.unfold(dimension=1, size=win_size, step=win_size).mean(dim=2)
In the above cases, unfolding produces the same result as reshape since size and step are equal. However, unlike reshape, unfolding also supports size != step.
win_size = 10
stride = 2
tensor_rolled_data = data.unfold(1, win_size, stride).mean(dim=2)
# produces shape [4, 746]
or you can pad the front of the features with win_size - 1 values to achieve the same result as pandas.
import torch.nn.functional as F
win_size = 10
stride = 2
data_padded = F.pad(data.unsqueeze(0), (win_size - 1, 0), 'constant', float('nan')).squeeze(0)
tensor_rolled_data = data_padded.unfold(1, win_size, stride).mean(dim=2)
# produces shape [4, 750]
Note: In practice you probably don't want to pad with NaN, since the NaN values propagate through later computations and quickly become a headache. Instead you could use zero padding, 'replicate' padding, or mirror padding (called 'reflect' in F.pad).
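As a rough illustration of the NaN-free variant, the sketch below uses 'replicate' padding and then computes several statistics from the same unfolded view. The random `data` tensor is only a stand-in for the sensor tensor, and the choice of statistics is arbitrary.

```python
import torch
import torch.nn.functional as F

win_size = 10
stride = 2
data = torch.randn(4, 1500)  # stand-in for the sensor tensor

# Repeat the first sample win_size - 1 times at the front instead of inserting NaN.
# The extra leading dimension mirrors the unsqueeze/squeeze used in the examples above.
data_padded = F.pad(data.unsqueeze(0), (win_size - 1, 0), mode='replicate').squeeze(0)

windows = data_padded.unfold(1, win_size, stride)  # shape [4, 750, 10]
rolled_mean = windows.mean(dim=2)
rolled_std = windows.std(dim=2)
rolled_max = windows.max(dim=2).values             # max returns (values, indices)
print(rolled_mean.shape)  # torch.Size([4, 750])
```

With this padding the first window consists of ten copies of sample 0, so the first rolling mean equals sample 0 itself rather than NaN.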
Related
I have a numpy array of shape (100,4,30). This represents 100 samples of 4 samples of encodings of length 30. The 4 samples, per row, are related.
I want to get a TensorFlow dataset, batched, where related samples are in the same batch.
I'm trying to do:
first, use np.vsplit to get a list of length 100, where each element in the list is a list of the 4 related samples.
Now if I call tf.data.Dataset.from_tensor_slices(...).batch(1) on this list of lists, I get a batch that contains a tensor of shape (4,1,30).
I want this batch to contain 4 tensors of shape (1,30).
How can I achieve this?
I may have misunderstood you, but if you just leave out the "vsplit":
import numpy as np
import tensorflow as tf
data = np.zeros((100, 4, 30))
data_ds = tf.data.Dataset.from_tensor_slices(data).batch(1)
for element in data_ds.take(1):
    print(element.shape)
you will get:
(1, 4, 30)
(so one batch contains all 4 related encodings).
If you really want the dimensions inside a batch to be 4 times (1, 30) you can do:
data = np.expand_dims(data, axis=2)
before dataset creation.
EDIT:
I think I just understood your question. You want every batch to have 4 elements and those are the related encodings? You can achieve this by:
# reshape directly (no swapaxes): rows 4*i .. 4*i+3 are the 4 related encodings of sample i
data = np.reshape(data, (100*4, -1))
data_ds = tf.data.Dataset.from_tensor_slices(data).batch(4)
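Which rows end up together in a batch of 4 depends only on the row order after flattening, and that can be inspected in NumPy without TensorFlow. A small sketch, using a made-up `data` array in which every encoding is filled with its sample index:

```python
import numpy as np

# Label every encoding with its sample index so the grouping is easy to read.
data = np.repeat(np.arange(100), 4 * 30).reshape(100, 4, 30)

flat = data.reshape(100 * 4, -1)                             # reshape only
flat_swapped = np.swapaxes(data, 0, 1).reshape(100 * 4, -1)  # swapaxes, then reshape

print(flat[:4, 0])          # sample index of the first 4 rows
print(flat_swapped[:4, 0])
```

With the plain reshape, each run of 4 consecutive rows comes from a single sample, which is what `batch(4)` needs. Swapping the first two axes beforehand places rows from different samples next to each other.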
I would like to figure out a way to apply a function which calculates pairwise distances, let's call it dists(A, B), row-wise for every input element in a batch, meaning:
(100, 16, 3) -- input, 100 is the batch size so 100 instances, 16 is let's say image size, and 3 filters (asking for Conv2D)
(5, 3) -- tensor for which I want to calculate the row-wise distance (assume it's A in dists(A, B) and is fixed)
Now, for every instance I am supposed to get back a matrix of shape (5, 16). Naturally, I could use a for to span the batch and get my final (100,5,16) result. However, I would love to know if there is an easier way to apply my function row-wise, in parallel, using GPU.
Thank you very much for your time.
Suppose we are using the L1 distance:
import torch
# data and target
a = torch.randn(100, 16, 3)
b = torch.randn(5, 3)
# Reshape the tensors
a = a.unsqueeze(1)
b = b.unsqueeze(0).unsqueeze(2)
print(a.shape, b.shape)
# Compute distance
dist = (a-b).abs().sum(3)
print(dist.shape)
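For p-norm distances specifically, `torch.cdist` may be a convenient alternative to manual broadcasting. The sketch below assumes the same shapes as above; the name `b_batched` is introduced here only for illustration.

```python
import torch

a = torch.randn(100, 16, 3)   # batch of inputs
b = torch.randn(5, 3)         # fixed reference rows

# Give b an explicit batch dimension matching a (expand creates a view, no copy).
b_batched = b.unsqueeze(0).expand(a.shape[0], -1, -1)   # (100, 5, 3)

dist_l1 = torch.cdist(b_batched, a, p=1)   # (100, 5, 16), L1 distance
dist_l2 = torch.cdist(b_batched, a, p=2)   # (100, 5, 16), Euclidean distance
print(dist_l1.shape, dist_l2.shape)
```

The broadcasting approach remains the more general one, since it works for any elementwise distance function and not only for p-norms.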
I have the following situation. I have an array of size (3, 128, n) (where n is large). (This array represents a picture.) I have a superresolution deep learning model that takes as input a (3, 128, 128) picture and gives it back in better quality. I want to apply my model to the whole picture.
My existing solution
My first solution to this problem is to split my array into arrays of size (3, 128, 128). I then have a list of square images, and I can apply my model to each of these squares and then concatenate all the results to get a new (3, 128, n) image. The problem with this method is that the model does not perform as well on the edges of the image.
My desired solution
To get around this problem, I have thought of an alternative solution. Instead of considering non-overlapping square images, I can consider all square images that can be extracted from my original image. I can pass all those images to my model. Then, to reconstruct a point of coordinates (a, b, c), I will consider all reconstructed square pictures that contain c, and take an average of them. I want this average to give more weight to the squares where c is near the center.
To be more specific :
I start with a 3*128*n array (let's call it A). I pad on the left and on the right which gives me a new array (let's call it A_pad) of size 3*128*(n+2*127)
For i in range(0,n+127), let A_i = A_pad[:, :, i:i+128], A_i is of size (3*128*128) and can be fed to my model which creates a new array B_i of the same size.
Now I want a new array B of the same size as A, defined like this: for each (x, y, z), B[x, y, z] is the mean of the 128 values B_i[x, y, z+127-i] with z <= i < z+128, weighted by 1 + min(z+127-i, i-z). That corresponds to averaging all the windows that contain z, with a weight proportional to the distance to the closest edge.
My question is based on the computation of B. Given what I've described, I could write multiple for loops that would yield the correct results, but I'm afraid it would be slow. I'm looking for a solution using numpy that is as fast as possible.
This is an example implementation that follows the steps you outlined in the section "My desired solution". It makes extensive use of np.lib.stride_tricks.as_strided which at first glance might not seem obvious at all; I added detailed comments to each usage for clarification. Also note that in your description you use z to denote the column position within images while in comments I use the term n-position in order to comply with the shape specification via n.
Regarding efficiency it's not obvious whether this is a winner or not. Computation happens all in numpy but the expression sliding_128 * weights builds a large array (128x the size of the original image) before reducing it along the frame dimension. This definitely comes at its cost, memory might even be an issue. A loop might come in handy at this position.
Lines which contain a comment prefixed with # [TEST] were added for testing purposes. Concretely this means we're overwriting the weights for the final sum of frames with 1 / 128 in order to eventually recover the original image (since no ML model transformation is applied either).
import numpy as np
n = 640 # For example.
image = np.random.randint(0, 256, size=(3, 128, n))
print('image.shape: ', image.shape) # (3, 128, 640)
padded = np.pad(image, ((0, 0), (0, 0), (127, 127)), mode='edge')
print('padded.shape: ', padded.shape) # (3, 128, 894)
sliding = np.lib.stride_tricks.as_strided(
padded,
# Frames stored along first dimension; sliding across last dimension of `padded`.
shape=(padded.shape[-1]-128+1, 3, 128, 128),
# First dimension: Moving one frame ahead -> move across last dimension of `padded`.
# Remaining three dimensions: Move as within `padded`.
strides=(padded.strides[-1:] + padded.strides)
)
print('sliding.shape: ', sliding.shape) # (767, 3, 128, 128)
# Now at this part we would feed the frames `sliding` to the ML model,
# where the first dimension is the batch size.
# Assume the output is assigned to `sliding` again.
# Since we're not using an ML model here, we create a copy instead
# in order to update the strides of `sliding` with its actual shape (as defined above).
sliding = sliding.copy()
sliding_128 = np.lib.stride_tricks.as_strided(
# Reverse last dimension since we want the last column from the first frame.
# Need to copy again because `[::-1]` creates a view with negative stride,
# but we want actual reversal to work with the strides below.
# (There's perhaps a smart way of adjusting the strides below in order to not make a copy here.)
sliding[:, :, :, ::-1].copy(),
# Second dimension corresponds to the 128 consecutive frames.
# Previous last dimension is dropped since we're selecting the
# column that corresponds to the current n-position.
shape=(128, n, 3, 128),
# First dimension (frame position): Move one frame and one column ahead
# (actually want to move one column less in `sliding` but since we reverted order of columns
# we need to move one ahead now) -> move across first dimension of `sliding` + last dimension of `sliding`.
# Second dimension (n-position): Moving one frame ahead -> move across first dimension of `sliding`.
# Remaining two dimensions: Move within frames (channel and row dimensions).
strides=((sliding.strides[0] + sliding.strides[-1],) + sliding.strides[:1] + sliding.strides[1:3])
)
print('sliding_128.shape: ', sliding_128.shape) # (128, 640, 3, 128)
# Weights are independent of the n-position -> we can precompute.
weights = 1 + np.concatenate([np.arange(64), np.arange(64)[::-1]])
weights = np.ones(shape=128) # [TEST] Assign weights for testing -> want to obtain the original image back.
weights = weights.astype(float) / weights.sum() # Normalize?
weights = weights[:, None, None, None] # Prepare for broadcasting.
weighted_image = np.moveaxis(np.sum(sliding_128 * weights, axis=0), 0, 2)
print('weighted_image.shape: ', weighted_image.shape) # (3, 128, 640)
assert np.array_equal(image, weighted_image.astype(int)) # [TEST]
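If the 128x intermediate array is a concern, the reduction can be done one window column at a time instead. The following is only a sketch of that loop variant: the function name `blend_frames` and the small demo sizes are made up, the frames are assumed to be stacked along the first axis as above, and an identity "model" is used so that the result can be checked against the input.

```python
import numpy as np

def blend_frames(frames, n, win=128):
    """Weighted average of overlapping frames, one window column at a time.

    frames: shape (n + win - 1, C, H, win); frame i covers padded columns i .. i + win - 1.
    """
    cols = np.arange(win)
    weights = 1.0 + np.minimum(cols, win - 1 - cols)   # larger near the window centre
    weights /= weights.sum()

    out = np.zeros(frames.shape[1:3] + (n,), dtype=float)
    for c in range(win):
        # Column c of frame i lands on image column z = i + c - (win - 1).
        start = win - 1 - c
        out += weights[c] * np.moveaxis(frames[start:start + n, :, :, c], 0, 2)
    return out

# Small demo with an identity "model": the frames are just the sliding windows.
win, n = 8, 20
image = np.random.randint(0, 256, size=(3, 4, n)).astype(float)
padded = np.pad(image, ((0, 0), (0, 0), (win - 1, win - 1)), mode='edge')
frames = np.moveaxis(np.lib.stride_tricks.sliding_window_view(padded, win, axis=2), 2, 0)
blended = blend_frames(frames, n, win)
print(np.allclose(blended, image))
```

Each iteration only allocates a temporary of the size of the output image, at the price of `win` Python-level loop steps.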
So I am a little new to using matrices in Python, and I am looking for the best way to perform the following operation.
Say I have a vector of an arbitrary length, like this:
data = np.array(range(255))
And I want to fit this data inside a matrix with a shape like so:
concept = np.zeros((3, 9, 6))
Now, obviously this will not fit, and results in an error:
ValueError: cannot reshape array of size 255 into shape (3,9,6)
What would be the best way to go about fitting as much of the data vector inside the first matrix with the shape (3, 9, 6) while making sure any "overflow" is stored in a second (or third, fourth, etc.) matrix?
Does this make sense?
Basically, I want to be able to take a vector of any size and produce an arbitrary amount of matrices that have the data shaped according to the 3, 9, 6 dimensions.
Thank you for your help.
import numpy as np
def each_matrix(a, dims):
    size = dims.prod()
    # padding with size-1 zeros makes the integer division below round up
    padded = np.concatenate([ a, np.zeros(size-1) ])
    for i in range(len(padded) // size):
        yield padded[i*size : (i+1)*size].reshape(dims)
for matrix in each_matrix(np.array(range(255)),
                          dims=np.array([ 3, 9, 6 ])):
    print(str(matrix) + '\n\n-------\n')
This will fill the last matrix with zeros.
Here is a rough solution to your problem.
def split_padded(a,n):
    # use the argument `a` here, not the global `data`
    padding = n - len(a)%n
    numOfsplit = len(a)//n + 1
    print(padding, numOfsplit)
    return np.split(np.concatenate((a,np.zeros(padding))),numOfsplit)
data = np.array(range(255))
splitnum = 3*9*6
splitdata = split_padded(data,splitnum)
for mat in splitdata:
    print(mat.reshape(3,9,6))
It is very rough and only works for 1D input arrays.
It first calculates the number of zeros needed for padding (padding), then the number of matrices we can get out of the input data (numOfsplit), and does the splitting in the last line.
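Both answers boil down to "pad with zeros, then reshape". A compact sketch of that idea, returning one stacked array instead of a list or generator (the helper name `to_blocks` is made up here):

```python
import numpy as np

def to_blocks(a, dims):
    """Zero-pad a 1-D array and stack it into blocks of shape dims."""
    size = int(np.prod(dims))
    padding = -len(a) % size            # 0 when len(a) is already a multiple of size
    return np.pad(a, (0, padding)).reshape(-1, *dims)

blocks = to_blocks(np.arange(255), (3, 9, 6))
print(blocks.shape)  # (2, 3, 9, 6)
```

Here `blocks[0]` holds the first 162 values and `blocks[1]` the remaining 93 followed by zeros.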
I would like to use a generic filter to calculate the mean of values within a given window (or kernel), for values that fulfill a couple of conditions. I expected the following code to produce a mean filter of the first array in a 3-layer window, using the other two arrays to mask values from the mean calculation.
from scipy import ndimage
import numpy as np
#some test data
tstArr = np.random.rand(3,7,7)
tstArr = tstArr*10
tstArr = np.int_(tstArr)
tstArr[1] = tstArr[1]*100
tstArr[2] = tstArr[2] *1000
#mean function
def testFun(tstData,processLayer,nLayers,kernelSize):
    funData= tstData.reshape((nLayers,kernelSize,kernelSize))
    meanLayer = funData[processLayer]
    maskedData = meanLayer[(funData[1]>1)&(funData[2]<9000)]
    returnMean = np.mean(maskedData)
    return returnMean
#number of layers in the array
nLayers = np.shape(tstArr)[0]
#window size
kernelSize = 5
#create a sampling window of 5x5 elements from each array
footprnt = np.ones((nLayers,kernelSize,kernelSize),dtype = int)
# calculate the mean of the first layer in the array (other two are for masking)
processLayer = 0
tstOut = ndimage.generic_filter(tstArr, testFun, footprint=footprnt, extra_arguments = (processLayer,nLayers,kernelSize))
I thought this would yield a 7x7 array of masked mean values from the first layer in the input array. The output is a 3x7x7 array, and I don't understand what the values represent. I'm not sure how to produce the "masked" mean-filtered array, or how to interpret the output as given.
Your code does produce a mean filter of the first array in a 3-layer window, using the other two arrays to mask values from the mean calculation. You will find the result in tstOut[1].
What is going on? When you call ndimage.generic_filter with tstArr of shape (3, 7, 7) and footprint=np.ones((3, 5, 5)), then for all i from 0 to 2, for all j from 0 to 6 and for all k from 0 to 6, testFun is called with the subarray of tstArr centered at (i, j, k) and of shape (3, 5, 5) (the array is reflected at the boundary to supply missing values).
In the end:
tstOut[0] is the mean filter of tstArr[0] with tstArr[0] and tstArr[1] as masks
tstOut[1] is the mean filter of tstArr[0] with tstArr[1] and tstArr[2] as masks
tstOut[2] is the mean filter of tstArr[1] with tstArr[2] and tstArr[2] as masks
Again, the wanted result is in tstOut[1].
I hope this will help you.
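If only the 7x7 masked mean is needed, one possible shortcut is to work on 2-D layers directly and divide the windowed sum of kept values by the windowed count of kept values. This is a sketch under assumptions, not a drop-in replacement: float test data is used so that the means keep their fractional part, the names `mask`, `num`, `den` and `masked_mean` are made up, and windows without any kept value are set to NaN.

```python
import numpy as np
from scipy import ndimage

rng = np.random.default_rng(0)
tstArr = np.floor(rng.random((3, 7, 7)) * 10)   # float test data, same layout as above
tstArr[1] *= 100
tstArr[2] *= 1000

mask = (tstArr[1] > 1) & (tstArr[2] < 9000)

# uniform_filter returns window means; the ratio of the two means equals
# (sum of kept values) / (number of kept values) in each 5x5 window.
num = ndimage.uniform_filter(tstArr[0] * mask, size=5, mode='reflect')
den = ndimage.uniform_filter(mask.astype(float), size=5, mode='reflect')

# A window with at least one kept value has den >= 1/25; guard against empty windows.
masked_mean = np.divide(num, den, out=np.full_like(num, np.nan), where=den > 0.5 / 25)
print(masked_mean.shape)  # (7, 7)
```

This avoids calling a Python function once per output element and skips the two unused output layers.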