Theano/numpy advanced indexing - python

I have a 4d theano tensor (with the shape (1, 700, 16, 95000) for example) and a 4d 'mask' tensor with the shape (1, 700, 16, 1024) such that every element in the mask is an index that I need from the original tensor. How can I use my mask to index my tensor? Things like sample[mask] or sample[:, :, :, mask] don't really seem to work.
I also tried using a binary mask but since the tensor is rather large I get a 'device out of memory' exception.
Other ideas on how to gather these indexed values from the tensor would also be much appreciated.
Thanks

So, in the absence of an answer, I've decided to use the more computationally intensive solution: flattening both the data and the indices tensors, adding an offset to the indices to turn them into global positions, indexing the flattened data, and reshaping the result back to the original shape.
I'm adding here my test code, including a (commented-out) solution for matrices.
import numpy as np
import theano
import theano.tensor as T


def theano_convertion(els, inds, offsets):
    els = T.flatten(els)
    inds = T.flatten(inds) + offsets
    return T.reshape(els[inds], (2, 3, 16, 5))


if __name__ == '__main__':
    # command: np.transpose(t[range(2), indices])
    # t = np.random.randint(0, 10, (2, 20))
    # indices = np.random.randint(0, 10, (5, 2))
    t = np.random.randint(0, 10, (2, 3, 16, 20)).astype('int32')
    indices = np.random.randint(0, 10, (2, 3, 16, 5)).astype('int32')
    offsets = np.asarray(range(1, 2 * 3 * 16 + 1), dtype='int32')
    offsets = (offsets * 20) - 20
    offsets = np.repeat(offsets, 5)
    offsets_tens = T.ivector('offsets')
    inds_tens = T.itensor4('inds')
    t_tens = T.itensor4('t')
    func = theano.function(
        [t_tens, inds_tens, offsets_tens],
        [theano_convertion(t_tens, inds_tens, offsets_tens)]
    )
    shaped_elements = []
    flattened_elements = []
    [tmp] = func(t, indices, offsets)
    for i in range(2):
        for j in range(3):
            for k in range(16):
                shaped_elements.append(t[i, j, k, indices[i, j, k, :]])
                flattened_elements.append(tmp[i, j, k, :])
                print(shaped_elements[-1] == flattened_elements[-1])
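For reference, if only the NumPy side is needed, the same gather along the last axis can be done without the flatten/offset trick via np.take_along_axis (available since NumPy 1.15). A minimal sketch using the same shapes as the test above:

import numpy as np

t = np.random.randint(0, 10, (2, 3, 16, 20))
indices = np.random.randint(0, 10, (2, 3, 16, 5))

# out[i, j, k, n] = t[i, j, k, indices[i, j, k, n]]
out = np.take_along_axis(t, indices, axis=-1)
print(out.shape)  # (2, 3, 16, 5)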

Related

large alloc error when extracting 3D Image patches in python

I'm trying to extract small 3D patches (example patch size 20x20x4) from a 3D image of size 250x250x250 with stride 1 along every axis. I'll be extracting all possible patches, since I'll be running a function on each patch and returning the result as a 3D image, with the result for the current patch assigned to the patch's center voxel. For extracting the patches I'll be using the code below:
import numpy as np
from numpy.lib import stride_tricks

def cutup(data, blck, strd):
    sh = np.array(data.shape)
    blck = np.asanyarray(blck)
    strd = np.asanyarray(strd)
    nbl = (sh - blck) // strd + 1
    strides = np.r_[data.strides * strd, data.strides]
    dims = np.r_[nbl, blck]
    data6 = stride_tricks.as_strided(data, strides=strides, shape=dims)
    return data6.reshape(-1, *blck)

# demo
x = np.zeros((250, 250, 250), int)
y = cutup(x, (20, 20, 4), (1, 1, 1))
I'm running this on Google Colab, which has around 12 GB of RAM. Since the result is a large number of patches, I get a large alloc error and then the kernel restarts. I think splitting the image into parts would work, but if I do so, how should I write the code so that it considers the neighbouring voxels? Is there a smart way to do this?
Don't reshape the newly strided array/view before returning.
def cutup(data, blck, strd):
    sh = np.array(data.shape)
    blck = np.asanyarray(blck)
    strd = np.asanyarray(strd)
    nbl = (sh - blck) // strd + 1
    strides = np.r_[data.strides * strd, data.strides]
    dims = np.r_[nbl, blck]
    data6 = stride_tricks.as_strided(data, strides=strides, shape=dims)
    return data6
Then iterate over the patches.
p = np.zeros((250, 250, 250), int)
q = cutup(p, (20, 20, 4), (1, 1, 1))
print(f'windowed shape : {q.shape}')
print()
for i, x in enumerate(q):
    print(f'x.shape:{x.shape}')
    for j, y in enumerate(x):
        print(f'\ty.shape:{y.shape}')
        for k, z in enumerate(y):
            print(f'\t\tz.shape:{z.shape}')
            if k == 5: break
        break
    break
>>>
windowed shape : (231, 231, 247, 20, 20, 4)
x.shape:(231, 247, 20, 20, 4)
y.shape:(247, 20, 20, 4)
z.shape:(20, 20, 4)
z.shape:(20, 20, 4)
z.shape:(20, 20, 4)
z.shape:(20, 20, 4)
z.shape:(20, 20, 4)
z.shape:(20, 20, 4)
Your example will produce an array (or a view of the array) with a shape of (231, 231, 247, 20, 20, 4), or thirteen million+ 3D patches.
That will solve your memory allocation problem.
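A rough size estimate (assuming 64-bit integers, which is what np.zeros(..., int) typically gives) shows why the reshaped copy fails while the as_strided view is essentially free:

import numpy as np

n_patches = 231 * 231 * 247      # 13,180,167 patches
patch_elems = 20 * 20 * 4        # 1,600 elements per patch
bytes_if_copied = n_patches * patch_elems * 8
print(bytes_if_copied / 1e9)     # ~168.7 GB -- the reshape must copy; the view reuses the original ~125 MB array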
When I try to reshape it to (231, 231, 247, -1), I get a large alloc error.
If your operation requires the last three dimensions to be flattened, do that in your iteration.
for i, x in enumerate(q):
    for j, y in enumerate(x):
        for k, z in enumerate(y):
            z = z.reshape(-1)
            print(f'\t\tz.shape:{z.shape}')
            if k == 5: break
        break
    break
Looks like you can do that reshape in the outermost loop - at least for a zeros array.
for i, x in enumerate(q):
    zero, one, *last = x.shape
    x = x.reshape(zero, one, -1)
    print(f'x.shape:{x.shape}')
    for j, y in enumerate(x):
        print(f'\ty.shape:{y.shape}')
        for k, z in enumerate(y):
            print(f'\t\tz.shape:{z.shape}')
            break
        break
    break
>>>
x.shape:(231, 247, 1600)
y.shape:(247, 1600)
z.shape:(1600,)
Is there a smart way to do this?
If you can figure out how to vectorize your operation so that you only need to iterate over the first dimension or the first and second dimensions you can speed up your processing. That should be a separate question if you encounter problems.
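For instance, if the per-patch function can be expressed as a NumPy reduction (mean here is only a placeholder assumption), you can iterate over just the first axis of the windowed view q from the demo above and reduce the three patch axes of each slice in one vectorized call, keeping peak usage to a single (231, 247, 20, 20, 4) slice:

out = np.empty(q.shape[:3])              # one result per patch, shape (231, 231, 247)
for i, x in enumerate(q):                # loop only over the first window axis
    out[i] = x.mean(axis=(-3, -2, -1))   # vectorized reduction over each (20, 20, 4) patch
print(out.shape)  # (231, 231, 247)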

Indexing a vector with a matrix of indices with numpy, similar to MATLAB

I want to pull out a matrix filled with the values from a vector indexed with a matrix of indices
i.e. output(i, j) = vector(indices(i, j))
In Matlab, this can be achieved with output = vector(indices).
In Python/numpy I have the following loop for this purpose but I was wondering if there was a more efficient way to do it:
idx = np.random.randint(0, 100, (25, 10))
data = np.random.random(100)
output = np.empty((np.size(idx, 0), np.size(idx, 1)))
for i in range(0, np.size(idx, 0)):
    output[i, :] = np.squeeze(data[idx[i, :]])
Many thanks
In [547]: idx = np.random.randint(0, 100, (25, 10))
     ...: data = np.random.random(100)
     ...: output = np.empty((np.size(idx, 0), np.size(idx, 1)))
     ...: for i in range(0, np.size(idx, 0)):
     ...:     output[i, :] = np.squeeze(data[idx[i, :]])

In [553]: idx.shape
Out[553]: (25, 10)

In [554]: output.shape
Out[554]: (25, 10)
Simply index; no need to iterate
In [555]: np.allclose(output, data[idx])
Out[555]: True
There are differences between MATLAB and numpy when indexing with two arrays, one for each dimension. To put it simply, in MATLAB it's easier to index a block; in numpy, indexing a diagonal is more direct. But that's not relevant here.
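A tiny illustration of that difference (just for context, not needed for this question): paired index arrays in NumPy pick out individual elements, while np.ix_ recovers MATLAB-style block indexing.

import numpy as np

a = np.arange(16).reshape(4, 4)
print(a[[0, 1], [0, 1]])           # [0 5]      -- element-wise (diagonal-like) picks
print(a[np.ix_([0, 1], [0, 1])])   # [[0 1]
                                   #  [4 5]]    -- the 2x2 block, like MATLAB's a([1 2], [1 2])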
output = vector[x, y]
Here x and y are the index arrays you choose. Similarly, if you want to specify an interval:
output = vector[x1:x2, y1:y2]
In short, use [] instead of ().

Multidimensional Tensor slicing

First things first: I'm relatively new to TensorFlow.
I'm trying to implement a custom layer in tensorflow.keras and I'm having a relatively hard time when I try to achieve the following:
I've got 3 Tensors (x,y,z) of shape (?,49,3,3,32) [where ? is the batch size]
On each Tensor I compute the sum over the 3rd and 4th axes [thus I end up with 3 Tensors of shape (?,49,32)]
By doing an argmax (A) across the above 3 Tensors of shape (?,49,32), I get a single (?,49,32) Tensor
Now I want to use this tensor to select slices from the initial x,y,z Tensors in the following form:
Each element in the last dimension of A corresponds to the selected Tensor.
(aka: 0 = X, 1 = Y, 2 = Z)
The index along the last dimension of A corresponds to the slice that I would like to extract from the Tensor's last dimension.
I've tried to achieve the above using tf.gather but I had no luck. Then I tried using a series of tf.map_fn, which is ugly and computationally costly.
To simplify the above:
Let's say we've got an array A of shape (3,3,3,32). Then the numpy equivalent of what I'm trying to achieve is this:
import numpy as np

x = np.random.rand(3, 3, 32)
y = np.random.rand(3, 3, 32)
z = np.random.rand(3, 3, 32)

x_sums = np.sum(np.sum(x, axis=0), 0)
y_sums = np.sum(np.sum(y, axis=0), 0)
z_sums = np.sum(np.sum(z, axis=0), 0)
max_sums = np.argmax([x_sums, y_sums, z_sums], 0)

A = np.array([x, y, z])
tmp = []
for i in range(0, len(max_sums)):
    tmp.append(A[max_sums[i], :, :, i])
output = np.transpose(np.stack(tmp))
Any suggestions?
PS: I tried tf.gather_nd, but I had no luck.
This is how you can do something like that with tf.gather_nd:
import tensorflow as tf
# Make example data
tf.random.set_seed(0)
b = 10 # Batch size
x = tf.random.uniform((b, 49, 3, 3, 32))
y = tf.random.uniform((b, 49, 3, 3, 32))
z = tf.random.uniform((b, 49, 3, 3, 32))
# Stack tensors together
data = tf.stack([x, y, z], axis=2)
# Put reduction axes last
data_t = tf.transpose(data, (0, 1, 5, 2, 3, 4))
# Reduce
s = tf.reduce_sum(data_t, axis=(4, 5))
# Find largest sums
idx = tf.argmax(s, 3)
# Make gather indices
data_shape = tf.shape(data_t, idx.dtype)
bb, ii, jj = tf.meshgrid(*(tf.range(data_shape[i]) for i in range(3)), indexing='ij')
# Gather result
output_t = tf.gather_nd(data_t, tf.stack([bb, ii, jj, idx], axis=-1))
# Reorder axes
output = tf.transpose(output_t, (0, 1, 3, 4, 2))
print(output.shape)
# TensorShape([10, 49, 3, 3, 32])

How to get numpy to broadcast an operation after a reduction operation

I am trying to normalize some data over the last dimensions.
import numpy

# sample data
x = numpy.random.random((3, 1, 4, 16, 16))
x[1] = x[1] * 2
x[2] = x[2] * 4
I can get the mean,
m = x.mean((-3, -2, -1))
Now x.shape is (3, 1, 4, 16, 16) and m.shape is (3, 1). I want to subtract the mean from each sample. So far I have:
for i in range(x.shape[0]):
    for j in range(x.shape[1]):
        x[i, j] = x[i, j] - m[i, j]
That works, but it has two drawbacks: I'm using explicit loops, and it requires the shape to have 5 dimensions.
Simply keep the reduced dimensions with the keepdims arg and then subtract:
m = x.mean((-3, -2, -1),keepdims=True)
x -= m
This would work regardless of the axes that are used for the reduction and should be a clean solution.
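As a quick sanity check (a sketch using random data), the keepdims version reproduces the explicit double loop from the question:

import numpy as np

x = np.random.random((3, 1, 4, 16, 16))

# explicit loops, as in the question
m_flat = x.mean((-3, -2, -1))                  # shape (3, 1)
expected = x.copy()
for i in range(x.shape[0]):
    for j in range(x.shape[1]):
        expected[i, j] -= m_flat[i, j]

# keepdims + broadcasting
m = x.mean((-3, -2, -1), keepdims=True)        # shape (3, 1, 1, 1, 1)
print(np.allclose(x - m, expected))            # True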

Numpy: How to stack 3D arrays in rows iteratively?

I am trying to stack, along rows (axis=0), the results of a calculation that produces 3D arrays. I don't know the results ahead of time.
import numpy as np

h = 10
w = 20
c = 30
result_4d = np.???  # empty
for i in range(5):
    result_3d = np.zeros((h, w, c))  # fake calculation
    result_4d = np.???  # stacked result_3ds on axis=0
return result_4d
I've tried various permutations of the numpy *stack calls but I inevitably run into shape mismatch errors.
Put it in a list first and then stack.
h = 10
w = 20
c = 30

l = []
for i in range(5):
    result_3d = np.zeros((h, w, c))  # fake calculation
    l.append(result_3d)

res = np.stack(l, axis=-1)
res.shape  # (10, 20, 30, 5)

# move stacked axis around ...
np.transpose(res, (3, 0, 1, 2)).shape  # (5, 10, 20, 30)
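A minor variant: if the new axis should come first, stacking the list with axis=0 gives that shape directly and the transpose is unnecessary.

res = np.stack(l, axis=0)
res.shape  # (5, 10, 20, 30)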
If you want to update inside the loop, you can potentially do this:
res = ''
for i in range(5):
    result_3d = np.zeros((h, w, c))  # fake calculation
    if type(res) is str:
        res = np.array([result_3d])  # add a leading dimension
        continue
    res = np.vstack((res, np.array([result_3d])))  # stack on that dimension

res.shape  # (5, 10, 20, 30)
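And if the number of results is known up front (5 in this toy example), preallocating avoids the copy that vstack makes on every iteration. A sketch under that assumption:

res = np.empty((5, h, w, c))
for i in range(5):
    res[i] = np.zeros((h, w, c))  # fake calculation
res.shape  # (5, 10, 20, 30)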
