Large alloc error when extracting 3D image patches in Python

I'm trying to extract small 3D patches (for example, of size 20x20x4) from a 3D image of size 250x250x250, with stride 1 along every axis. I need all possible patches because I'll run a function on each patch and return the result as a 3D image, with each patch's result assigned to the center voxel of that patch. To extract the patches I'm using the code below:
import numpy as np
from numpy.lib import stride_tricks

def cutup(data, blck, strd):
    sh = np.array(data.shape)
    blck = np.asanyarray(blck)
    strd = np.asanyarray(strd)
    nbl = (sh - blck) // strd + 1    # number of windows along each axis
    strides = np.r_[data.strides * strd, data.strides]    # between-window strides, then within-window strides
    dims = np.r_[nbl, blck]
    data6 = stride_tricks.as_strided(data, strides=strides, shape=dims)
    return data6.reshape(-1, *blck)
#demo
x = np.zeros((250,250,250), int)
y = cutup(x, (20, 20, 4), (1, 1, 1))
I'm running this on Google Colab, which has around 12 GB of RAM. Since the result is a very large number of patches, I get a large alloc error and the kernel restarts. I think splitting the image into parts would work, but if I do that, how should I write the code so that it still considers the neighbouring voxels? Is there a smart way to do this?
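(For what it's worth, a minimal sketch of the splitting idea, assuming chunks that overlap their neighbours by patch - 1 voxels along an axis so that no window straddling a chunk boundary is lost; split_with_halo is a hypothetical helper, not from the original code:)
def split_with_halo(n, chunk, patch):
    # Hypothetical helper: yield (start, stop) bounds along one axis such
    # that every length-`patch` window lies entirely inside some chunk.
    step = chunk - (patch - 1)    # consecutive chunks overlap by patch - 1
    for start in range(0, n - patch + 1, step):
        yield start, min(start + chunk, n)

print(list(split_with_halo(250, 100, 20)))
# [(0, 100), (81, 181), (162, 250)]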

Don't reshape the newly strided array/view before returning.
def cutup(data, blck, strd):
    sh = np.array(data.shape)
    blck = np.asanyarray(blck)
    strd = np.asanyarray(strd)
    nbl = (sh - blck) // strd + 1
    strides = np.r_[data.strides * strd, data.strides]
    dims = np.r_[nbl, blck]
    data6 = stride_tricks.as_strided(data, strides=strides, shape=dims)
    return data6    # return the 6-D view; reshaping here would force a huge copy
Then iterate over the patches.
p = np.zeros((250,250,250), int)
q = cutup(p, (20, 20, 4), (1, 1, 1))
print(f'windowed shape : {q.shape}')
print()
for i, x in enumerate(q):
    print(f'x.shape:{x.shape}')
    for j, y in enumerate(x):
        print(f'\ty.shape:{y.shape}')
        for k, z in enumerate(y):
            print(f'\t\tz.shape:{z.shape}')
            if k == 5: break
        break
    break
>>>
windowed shape : (231, 231, 247, 20, 20, 4)
x.shape:(231, 247, 20, 20, 4)
y.shape:(247, 20, 20, 4)
z.shape:(20, 20, 4)
z.shape:(20, 20, 4)
z.shape:(20, 20, 4)
z.shape:(20, 20, 4)
z.shape:(20, 20, 4)
z.shape:(20, 20, 4)
Your example will produce a view of the array with shape (231, 231, 247, 20, 20, 4), i.e. over thirteen million 3-D patches.
That will solve your memory allocation problem.
When I try to reshape it to (231, 231, 247, -1), I get a large alloc error.
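Back-of-the-envelope arithmetic (assuming 8-byte int64 elements, the default for np.zeros(..., int) on 64-bit platforms) shows why that reshape fails: the overlapping windows can't be expressed with strides alone, so NumPy has to materialize a full copy.
n_patches = 231 * 231 * 247               # window positions, ~13.2 million
elems = 20 * 20 * 4                       # voxels per window
print(n_patches * elems * 8 / 1e9)        # ~168.7 GB needed, vs ~12 GB of Colab RAM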
If your operation requires the last three dimensions to be flattened, do that in your iteration.
for i, x in enumerate(q):
    for j, y in enumerate(x):
        for k, z in enumerate(y):
            z = z.reshape(-1)
            print(f'\t\tz.shape:{z.shape}')
            if k == 5: break
        break
    break
Looks like you can do that reshape in the outermost loop - at least for a zeros array.
for i, x in enumerate(q):
    zero, one, *last = x.shape
    x = x.reshape(zero, one, -1)
    print(f'x.shape:{x.shape}')
    for j, y in enumerate(x):
        print(f'\ty.shape:{y.shape}')
        for k, z in enumerate(y):
            print(f'\t\tz.shape:{z.shape}')
            break
        break
    break
>>>
x.shape:(231, 247, 1600)
y.shape:(247, 1600)
z.shape:(1600,)
Is there a smart way to do this?
If you can figure out how to vectorize your operation so that you only need to iterate over the first dimension or the first and second dimensions you can speed up your processing. That should be a separate question if you encounter problems.
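For instance, here is a minimal sketch of a partially vectorized pass, assuming (purely for illustration) that the per-patch operation is a mean; reducing over the window axes of the strided view never materializes the full 6-D array:
p = np.random.rand(250, 250, 250).astype(np.float32)
q = cutup(p, (20, 20, 4), (1, 1, 1))          # the view from above, shape (231, 231, 247, 20, 20, 4)
result = np.empty(q.shape[:3], np.float32)
for i in range(q.shape[0]):                   # iterate only over the first axis
    result[i] = q[i].mean(axis=(-3, -2, -1))  # vectorized over the remaining two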

Related

Iterating over a multi-dimensional array

I have an array of shape (3, 5, 96, 96), where channels = 3, number of frames = 5, and height and width = 96.
I want to iterate over the dimension of size 5 to get images of shape (3, 96, 96). The code I have tried is below.
b = frame.shape[1]
for i in range(b):
    fr = frame[:, i, :, :]
But this is not working.
You could swap axes (using numpy.swapaxes(a, axis1, axis2)) to move the second axis (frames) into the first position:
import numpy as np
m = np.zeros((3, 5, 96, 96))
n = np.swapaxes(m, 0, 1)
print(n.shape)
(5, 3, 96, 96)
You need to iterate over the first axis to achieve your desired result; this means you need to move the axis you want to iterate over into the first position. You can achieve this with np.moveaxis:
m = np.zeros((3, 5, 96, 96))
np.moveaxis(m, 1, 0).shape
(5, 3, 96, 96)
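Either way, with the frame axis in front, plain iteration yields the desired (3, 96, 96) images; a small usage sketch:
frame = np.zeros((3, 5, 96, 96))
for img in np.moveaxis(frame, 1, 0):    # 5 iterations, one per frame
    print(img.shape)                    # (3, 96, 96)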

Multidimensional Tensor slicing

First things first: I'm relatively new to TensorFlow.
I'm trying to implement a custom layer in tensorflow.keras, and I'm having a relatively hard time trying to achieve the following:
I've got 3 Tensors (x,y,z) of shape (?,49,3,3,32) [where ? is the batch size]
On each Tensor I compute the sum over the 3rd and 4th axes [thus I end up with 3 Tensors of shape (?,49,32)]
By doing an argmax (A) on the above 3 Tensors of shape (?,49,32), I get a single (?,49,32) Tensor.
Now I want to use this tensor to select slices from the initial x,y,z Tensors in the following form:
Each element in the last dimension of A corresponds to the selected Tensor.
(aka: 0 = X, 1 = Y, 2 = Z)
The index along the last dimension of A corresponds to the slice that I would like to extract from the Tensor's last dimension.
I've tried to achieve the above using tf.gather but I had no luck. Then I tried using a series of tf.map_fn, which is ugly and computationally costly.
To simplify the above:
let's say we've got an array A of shape (3,3,3,32). Then the numpy equivalent of what I'm trying to achieve is this:
import numpy as np
x = np.random.rand(3, 3, 32)
y = np.random.rand(3, 3, 32)
z = np.random.rand(3, 3, 32)
x_sums = np.sum(np.sum(x, axis=0), 0)
y_sums = np.sum(np.sum(y, axis=0), 0)
z_sums = np.sum(np.sum(z, axis=0), 0)
max_sums = np.argmax([x_sums, y_sums, z_sums], 0)
A = np.array([x, y, z])
tmp = []
for i in range(0, len(max_sums)):
    tmp.append(A[max_sums[i], :, :, i])    # slice from the winning tensor for channel i
output = np.transpose(np.stack(tmp))
Any suggestions?
PS: I tried tf.gather_nd, but I had no luck.
This is how you can do something like that with tf.gather_nd:
import tensorflow as tf
# Make example data
tf.random.set_seed(0)
b = 10 # Batch size
x = tf.random.uniform((b, 49, 3, 3, 32))
y = tf.random.uniform((b, 49, 3, 3, 32))
z = tf.random.uniform((b, 49, 3, 3, 32))
# Stack tensors together
data = tf.stack([x, y, z], axis=2)
# Put reduction axes last
data_t = tf.transpose(data, (0, 1, 5, 2, 3, 4))
# Reduce
s = tf.reduce_sum(data_t, axis=(4, 5))
# Find largest sums
idx = tf.argmax(s, 3)
# Make gather indices
data_shape = tf.shape(data_t, idx.dtype)
bb, ii, jj = tf.meshgrid(*(tf.range(data_shape[i]) for i in range(3)), indexing='ij')
# Gather result
output_t = tf.gather_nd(data_t, tf.stack([bb, ii, jj, idx], axis=-1))
# Reorder axes
output = tf.transpose(output_t, (0, 1, 3, 4, 2))
print(output.shape)
# TensorShape([10, 49, 3, 3, 32])
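As a quick, hypothetical sanity check (assuming TF2 eager execution so .numpy() is available), each gathered slice should match a direct lookup into data_t:
import numpy as np
sel = idx.numpy()       # (10, 49, 32), values in {0, 1, 2} choosing x, y or z
dt = data_t.numpy()     # (10, 49, 32, 3, 3, 3)
out = output.numpy()    # (10, 49, 3, 3, 32)
for c in range(32):
    assert np.allclose(out[0, 0, :, :, c], dt[0, 0, c, sel[0, 0, c]])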

How to implement maxpool: taking a maximum over a sliding window on an image or tensor

In short: I am looking for a simple numpy (maybe a one-liner) implementation of Maxpool - the maximum over a window on a numpy.ndarray, for all locations of the window across the dimensions.
In more detail: I am implementing a convolutional neural network ("CNN"); one of the typical layers in such a network is a MaxPool layer (look for example here). Writing y = MaxPool(x, S), where x is an input ndarray and S is a parameter, the output of MaxPool in pseudocode is given by:
y[b, h, w, c] = max(x[b, s*h + i, s*w + j, c]) over i = 0, ..., S-1; j = 0, ..., S-1.
That is, y is an ndarray where the value at indices b, h, w, c equals the maximum taken over a window of size S x S along the second and third dimensions of the input x, with the window "corner" placed at indices b, h, w, c.
Some additional details: The network is implemented using numpy. A CNN has many "layers", where the output of one layer is the input to the next layer. The inputs to a layer are numpy.ndarrays called "tensors". In my case the tensors are 4-dimensional numpy.ndarrays, x; that is, x.shape is a tuple (B, H, W, C). Each dimension's size can change after the tensor is processed by a layer; for example, the input to layer i = 4 can have size B = 10, H = 24, W = 24, C = 3, while the output, aka the input to layer i+1, has B = 10, H = 12, W = 12, C = 5. As indicated in the comments, the size after application of MaxPool is (B, H - S + 1, W - S + 1, C).
For concreteness: if I use
import numpy as np
y = np.amax(x, axis = (1,2))
where x.shape is, say, (2, 3, 3, 4), this gives me what I want, but only for the degenerate case where the window I am maximizing over has size 3 x 3 (the full size of the second and third dimensions of x), which is not exactly what I want.
Here's a solution using np.lib.stride_tricks.as_strided to create sliding windows, resulting in a 6D array of shape (B, H-S+1, W-S+1, S, S, C), and then simply performing max along the fourth and fifth axes, resulting in an output array of shape (B, H-S+1, W-S+1, C). The intermediate 6D array is a view into the input array and as such won't occupy any more memory. The subsequent max, being a reduction, efficiently utilizes the sliding views.
Thus, an implementation would be -
# Based on http://stackoverflow.com/a/41850409/3293881
def patchify(img, patch_shape):
    a, X, Y, b = img.shape
    x, y = patch_shape
    shape = (a, X - x + 1, Y - y + 1, x, y, b)
    a_str, X_str, Y_str, b_str = img.strides
    strides = (a_str, X_str, Y_str, X_str, Y_str, b_str)
    return np.lib.stride_tricks.as_strided(img, shape=shape, strides=strides)
out = patchify(x, (S,S)).max(axis=(3,4))
Sample run -
In [224]: x = np.random.randint(0,9,(10,24,24,3))
In [225]: S = 5
In [226]: np.may_share_memory(patchify(x, (S,S)), x)
Out[226]: True
In [227]: patchify(x, (S,S)).shape
Out[227]: (10, 20, 20, 5, 5, 3)
In [228]: patchify(x, (S,S)).max(axis=(3,4)).shape
Out[228]: (10, 20, 20, 3)
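If you also need a stride s > 1, as in the pseudocode above, one hedged option is to subsample the window positions of the same view before reducing (s = 2 here is purely illustrative):
In [229]: s = 2   # keep the window corner at every s-th position
In [230]: patchify(x, (S,S))[:, ::s, ::s].max(axis=(3,4)).shape
Out[230]: (10, 10, 10, 3)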

Pairing images as np arrays into a specific format

So I have 2 images, X and Y, as numpy arrays, each of shape (3, 30, 30): that is, 3 channels (RGB), each of height and width 30 pixels. I'd like to pair them up into a numpy array to get a specific output shape:
my_pair = pair_up_images(X, Y)
my_pair.shape = (2, 3, 30, 30)
Such that I can get the original images by slicing:
my_pair[0] == X
my_pair[1] == Y
After a few attempts, I keep getting either:
my_pair.shape = (2,)  # by converting the images into lists and adding them.
This works for slicing as well, but the next step in the pipeline requires shape (2, 3, 30, 30).
my_pair.shape = (6, 30, 30) # using np.vstack
my_pair.shape = (3, 60, 30) # using np.hstack
Thanks!
Simply:
Z = np.array([X, Y])
Z.shape
Out[62]: (2, 3, 30, 30)
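Equivalently, np.stack makes the new pairing axis explicit and lets you place it elsewhere if needed:
Z = np.stack([X, Y])   # axis=0 is the default, same result as np.array([X, Y])
Z.shape
Out[63]: (2, 3, 30, 30)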

Theano/numpy advanced indexing

I have a 4d theano tensor (with the shape (1, 700, 16, 95000) for example) and a 4d 'mask' tensor with the shape (1, 700, 16, 1024) such that every element in the mask is an index that I need from the original tensor. How can I use my mask to index my tensor? Things like sample[mask] or sample[:, :, :, mask] don't really seem to work.
I also tried using a binary mask but since the tensor is rather large I get a 'device out of memory' exception.
Other ideas on how to get my indices from the tensor would also be very appreciated.
Thanks
So, in the absence of an answer, I've decided to use a more computationally intensive solution: flattening both my data and my indices tensors, adding an offset to the indices to bring them to global (flat) positions, indexing the data, and reshaping it back to the original shape.
I'm adding here my test code, including a (commented-out) solution for matrices.
import numpy as np
import theano
import theano.tensor as T

def theano_convertion(els, inds, offsets):
    els = T.flatten(els)
    inds = T.flatten(inds) + offsets
    return T.reshape(els[inds], (2, 3, 16, 5))

if __name__ == '__main__':
    # command: np.transpose(t[range(2), indices])
    # t = np.random.randint(0, 10, (2, 20))
    # indices = np.random.randint(0, 10, (5, 2))
    t = np.random.randint(0, 10, (2, 3, 16, 20)).astype('int32')
    indices = np.random.randint(0, 10, (2, 3, 16, 5)).astype('int32')
    offsets = np.asarray(range(1, 2 * 3 * 16 + 1), dtype='int32')
    offsets = (offsets * 20) - 20    # flat start position of each length-20 row
    offsets = np.repeat(offsets, 5)  # one offset per gathered index
    offsets_tens = T.ivector('offsets')
    inds_tens = T.itensor4('inds')
    t_tens = T.itensor4('t')
    func = theano.function(
        [t_tens, inds_tens, offsets_tens],
        [theano_convertion(t_tens, inds_tens, offsets_tens)]
    )
    shaped_elements = []
    flattened_elements = []
    [tmp] = func(t, indices, offsets)
    for i in range(2):
        for j in range(3):
            for k in range(16):
                shaped_elements.append(t[i, j, k, indices[i, j, k, :]])
                flattened_elements.append(tmp[i, j, k, :])
                print(shaped_elements[-1] == flattened_elements[-1])
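For the plain-NumPy half of the question, newer NumPy (1.15+) can express the same per-row gather directly with np.take_along_axis, with no manual offsets; a short sketch on the same shapes:
import numpy as np
t = np.random.randint(0, 10, (2, 3, 16, 20))
indices = np.random.randint(0, 10, (2, 3, 16, 5))
out = np.take_along_axis(t, indices, axis=-1)   # gather along the last axis
print(out.shape)   # (2, 3, 16, 5)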
