Numpy: Broadcasting from submatrix - python

Given two 2D arrays:
A =[[1, 1, 2, 2],
[1, 1, 2, 2],
[3, 3, 4, 4],
[3, 3, 4, 4]]
B =[[1, 2],
[3, 4]]
A - B = [[ 0, -1, 1, 0],
[-2, -3, -1, -2],
[ 2, 1, 3, 2],
[ 0, -1, 1, 0]]
B has shape (2, 2) and A has shape (4, 4). I want to broadcast the subtraction of B over A, i.e. subtract B from every 2x2 block of A, giving the A - B shown above.
I specifically want to use broadcasting because the arrays I am dealing with are very large (8456, 8456), and I am hoping it will give a small performance improvement.
I've tried reshaping the arrays, but with no luck, and I'm stumped. Scikit is not available to me.

You can expand B by tiling it twice in both dimensions:
import numpy
print(A - numpy.tile(B, (2, 2)))
yields
[[ 0 -1 1 0]
[-2 -3 -1 -2]
[ 2 1 3 2]
[ 0 -1 1 0]]
However for big matrices this may create a lot of overhead in RAM.
Alternatively, you can view A in blocks using scikit-image's skimage.util.view_as_blocks and modify it in place:
import skimage.util
Atmp = skimage.util.view_as_blocks(A, block_shape=(2, 2))
Atmp -= B
print(A)
which gives the following result without needlessly repeating B:
[[ 0 -1 1 0]
[-2 -3 -1 -2]
[ 2 1 3 2]
[ 0 -1 1 0]]
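Since the question rules out scikit-image, the same in-place block update can be sketched with plain NumPy. This is a minimal illustration, assuming A is C-contiguous (so that reshape returns a view) and that A's shape is an exact multiple of B's; the name blocks is only illustrative.

```python
import numpy as np

A = np.array([[1, 1, 2, 2],
              [1, 1, 2, 2],
              [3, 3, 4, 4],
              [3, 3, 4, 4]])
B = np.array([[1, 2],
              [3, 4]])

m1, n1 = A.shape
m2, n2 = B.shape

# For a C-contiguous A this reshape is a view, so writes go through to A.
# blocks[i, r, j, c] is A[i*m2 + r, j*n2 + c].
blocks = A.reshape(m1 // m2, m2, n1 // n2, n2)
assert np.shares_memory(blocks, A)

# B[:, None, :] has shape (m2, 1, n2) and broadcasts over both block axes.
blocks -= B[:, None, :]
print(A)
```

No tiled copy of B is built here; the only array that is written to is A itself.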

Approach #1 : Here's an approach using strides. as_strided builds a replicated 4D view of B without copying its data. Note that the final reshape to A's shape cannot be expressed as a view, so it does materialize one A-sized temporary before the subtraction -
m,n = B.strides
m1,n1 = A.shape
m2,n2 = B.shape
s1,s2 = m1//m2, n1//n2
strided = np.lib.stride_tricks.as_strided
out = A - strided(B,shape=(s1,m2,s2,n2),strides=(0,n2*n,0,n)).reshape(A.shape)
Sample run -
In [78]: A
Out[78]:
array([[29, 53, 30, 25, 92, 10],
[ 2, 20, 35, 87, 0, 9],
[46, 30, 20, 62, 79, 63],
[44, 9, 78, 33, 6, 40]])
In [79]: B
Out[79]:
array([[35, 60],
[21, 86]])
In [80]: m,n = B.strides
...: m1,n1 = A.shape
...: m2,n2 = B.shape
...: s1,s2 = m1//m2, n1//n2
...: strided = np.lib.stride_tricks.as_strided
...:
In [81]: # Replicated view
...: strided(B,shape=(s1,m2,s2,n2),strides=(0,n2*n,0,n)).reshape(A.shape)
Out[81]:
array([[35, 60, 35, 60, 35, 60],
[21, 86, 21, 86, 21, 86],
[35, 60, 35, 60, 35, 60],
[21, 86, 21, 86, 21, 86]])
In [82]: A - strided(B,shape=(s1,m2,s2,n2),strides=(0,n2*n,0,n)).reshape(A.shape)
Out[82]:
array([[ -6, -7, -5, -35, 57, -50],
[-19, -66, 14, 1, -21, -77],
[ 11, -30, -15, 2, 44, 3],
[ 23, -77, 57, -53, -15, -46]])
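To see which steps of the strided approach are views and which allocate, a quick check along these lines may help. It is only a sketch: the names view4d and tiled are illustrative, and np.shares_memory is used purely as a diagnostic.

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

A = np.arange(24).reshape(4, 6)
B = np.array([[35, 60],
              [21, 86]])

m, n = B.strides
m1, n1 = A.shape
m2, n2 = B.shape
s1, s2 = m1 // m2, n1 // n2

# 4D replicated array: zero strides repeat B along the two block axes.
view4d = as_strided(B, shape=(s1, m2, s2, n2), strides=(0, n2 * n, 0, n))
# Collapsing to 2D cannot be described by strides alone, so NumPy copies.
tiled = view4d.reshape(A.shape)

print(np.shares_memory(view4d, B))   # the 4D array reuses B's buffer
print(np.shares_memory(tiled, B))    # the 2D result lives in its own buffer
print(np.array_equal(tiled, np.tile(B, (s1, s2))))
```

In other words, the memory saving applies to the 4D view; once it is reshaped to 2D, the footprint is comparable to np.tile.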
Approach #2 : We can reshape both A and B to 4D, giving B two singleton dimensions along which its elements are broadcast when it is subtracted from the 4D version of A. After the subtraction, we reshape back to 2D for the final output. The implementation would look like this -
m1,n1 = A.shape
m2,n2 = B.shape
out = (A.reshape(m1//m2,m2,n1//n2,n2) - B.reshape(1,m2,1,n2)).reshape(m1,n1)
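Approach #2 can be wrapped in a small function and checked against np.tile. The helper name subtract_blockwise and the divisibility check are additions for illustration, not part of the original answer.

```python
import numpy as np

def subtract_blockwise(A, B):
    """Subtract B from every B-shaped block of A (hypothetical helper)."""
    m1, n1 = A.shape
    m2, n2 = B.shape
    if m1 % m2 or n1 % n2:
        raise ValueError("A's shape must be a multiple of B's shape")
    A4 = A.reshape(m1 // m2, m2, n1 // n2, n2)   # split each axis into (block, offset)
    B4 = B.reshape(1, m2, 1, n2)                 # singleton axes broadcast over blocks
    return (A4 - B4).reshape(m1, n1)

rng = np.random.default_rng(0)
A = rng.integers(0, 100, size=(6, 8))
B = rng.integers(0, 100, size=(2, 4))

out = subtract_blockwise(A, B)
expected = A - np.tile(B, (3, 2))
print(np.array_equal(out, expected))
```

The only A-sized allocation is the result of the subtraction; B is never replicated in memory.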

This should work if A has dimensions that are multiples of B's dimensions:
A - np.tile(B, (A.shape[0] // B.shape[0], A.shape[1] // B.shape[1]))
And the result:
array([[ 0, -1, 1, 0],
[-2, -3, -1, -2],
[ 2, 1, 3, 2],
[ 0, -1, 1, 0]])

If you do not want to tile, you can reshape A to extract (2, 2) blocks, and use broadcasting to subtract B:
C = A.reshape(A.shape[0]//2, 2, A.shape[1]//2, 2).swapaxes(1, 2)
C - B
array([[[[ 0, -1],
[-2, -3]],
[[ 1, 0],
[-1, -2]]],
[[[ 2, 1],
[ 0, -1]],
[[ 3, 2],
[ 1, 0]]]])
And then swap the axes back and reshape:
(C - B).swapaxes(1, 2).reshape(A.shape[0], A.shape[1])
This avoids materializing a tiled copy of B, since C is a view on A, not a newly constructed array; the subtraction itself still allocates a result the size of A.

Related

What's the fastest way to slice a portion of a tensor to another in PyTorch?

I have three tensors as shown below:
a = tensor([[5, 2, 3, 24],
[8, 66, 7, 89],
[9, 10, 1, 12]])
b = tensor([[10, 22, 13, 1],
[35, 6, 17, 3],
[11, 13, 5,8]])
c = tensor([[0, 0, 0, 0],
[0, 0, 0, 0],
[0, 0, 0,0]])
I want to change the values of c using this formula:
c[:,:-1]= a[:,:-1] -a[:, 1:] - b[:, 1:]
Note that the last column of c is not changed at this point.
This means I will have
c = tensor([[5-2-22, 2-3-13, 3-24-1, 0],
[8-66-6, 66-7-17, 7-89-3, 0],
[9-10-13, 10-1-5, 1-12-8,0]])
>>> c = tensor([[-19, -14, -22, 0],
[-64, 42, -85, 0],
[-14, 4, -19,0]])
Finally, to change the last column, I want to use c[:, -1] = b[:, -1] - 1
And my final tensor will look like this:
c = tensor([[-19, -14, -22, 0],
[-64, 42, -85, 2],
[-14, 4, -19,7]])
I think the errors come from overwriting the elements of c in place.
Try creating c "from scratch" by concatenating its two parts:
c = torch.cat([a[:,:-1] -a[:, 1:] - b[:, 1:],
b[:, -1:] - 1], dim=-1)
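The arithmetic of this construction can be checked with the same slicing in NumPy. Note the swap: np.concatenate stands in for torch.cat here, so this is a sanity check of the indexing rather than PyTorch code.

```python
import numpy as np

a = np.array([[5, 2, 3, 24],
              [8, 66, 7, 89],
              [9, 10, 1, 12]])
b = np.array([[10, 22, 13, 1],
              [35, 6, 17, 3],
              [11, 13, 5, 8]])

left = a[:, :-1] - a[:, 1:] - b[:, 1:]   # all columns except the last, shape (3, 3)
last = b[:, -1:] - 1                      # the -1: slice keeps this 2D, shape (3, 1)
c = np.concatenate([left, last], axis=-1)
print(c)
```

Slicing with -1: instead of -1 is what keeps the last column two-dimensional, so it can be joined to the other part along the last axis.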

How to append a new element into 3D numpy array in a loop?

I want to add a new element to each row of a 3D NumPy array in a loop, but it didn't work with insert or append.
import numpy as np
a = np.array([[[24,24,3],[25,28,1],[13,34,1],[3,4,5]]])
a = np.insert(a,3,0,axis = 2)
print(a)
[[[24 24 3 0]
[25 28 1 0]
[13 34 1 0]
[ 3 4 5 0]]]
I don't want to insert 0 into each row; instead I want to set the new values with a for loop:
for i in range(4):
.......
The result should be like this:
[[[24 24 3 0]
[25 28 1 1]
[13 34 1 2]
[ 3 4 5 3]]]
You can't do this piecemeal. Think rather in terms of concatenating a whole column or plane at once. The array has to remain 'rectangular' - no raggedness.
In [276]: a = np.array([[[24,24,3],[25,28,1],[13,34,1],[3,4,5]]])
In [277]: a.shape
Out[277]: (1, 4, 3)
In [278]: x = np.arange(4).reshape(1,4,1)
In [279]: x
Out[279]:
array([[[0],
[1],
[2],
[3]]])
In [280]: arr1 =np.concatenate((a,x), axis=2)
In [281]: arr1.shape
Out[281]: (1, 4, 4)
In [282]: arr1
Out[282]:
array([[[24, 24, 3, 0],
[25, 28, 1, 1],
[13, 34, 1, 2],
[ 3, 4, 5, 3]]])
Alternatively, preallocate the target array and fill it in:
In [290]: arr2=np.zeros((1,4,4),int)
In [291]: arr2[:,:,:3]=a
...: arr2
Out[291]:
array([[[24, 24, 3, 0],
[25, 28, 1, 0],
[13, 34, 1, 0],
[ 3, 4, 5, 0]]])
In [292]: for i in range(4):
...: arr2[:,i,3]=i
...:
In [293]: arr2
Out[293]:
array([[[24, 24, 3, 0],
[25, 28, 1, 1],
[13, 34, 1, 2],
[ 3, 4, 5, 3]]])
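If the new values really do come out of a loop, one way to combine both ideas is to collect them in a Python list and concatenate once at the end. This is a sketch: the loop body is a stand-in for whatever computation produces each value, and the names values and new_col are illustrative.

```python
import numpy as np

a = np.array([[[24, 24, 3], [25, 28, 1], [13, 34, 1], [3, 4, 5]]])

values = []
for i in range(a.shape[1]):
    values.append(i)          # stand-in for the per-row value computed in the loop

# Shape (1, 4, 1) so that it lines up with a along every axis except the last.
new_col = np.array(values).reshape(1, -1, 1)
result = np.concatenate((a, new_col), axis=2)
print(result)
```

The array is built in a single concatenate call, so it stays rectangular at every step.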

Numpy: Combine list of arrays by another array (np.choose alternative)

I have a list of numpy arrays, each of the same shape. Let's say:
a = [np.array([[1, 2, 3],
[4, 5, 6],
[7, 8, 9]]),
np.array([[11, 12, 13],
[14, 15, 16],
[17, 18, 19]]),
np.array([[99, 98, 97],
[96, 95, 94],
[93, 92, 91]])]
And I have another array of the same shape that gives the list indices I want to take the elements from:
b = np.array([[0, 0, 1],
[2, 1, 0],
[2, 1, 2]])
What I want to get is the following:
np.array([[1, 2, 13],
[96, 15, 6],
[93, 18, 91]])
There was a simple solution that worked fine:
np.choose(b, a)
However, this is limited to 32 arrays at most, and in my case I have to combine more than 100 arrays, so I need another way.
I guess it has to involve advanced indexing or maybe the np.take method. So probably the first step is a = np.array(a), followed by something like a[np.arange(a.shape[0]), b], but I cannot get it to work.
Can somebody help? :)
You can try using np.ogrid (based on this answer). Of course, you will have to convert a to a NumPy array first:
a = np.array(a)
i, j = np.ogrid[0:3, 0:3]
print(a[b, i, j])
# array([[ 1, 2, 13],
# [96, 15, 6],
# [93, 18, 91]])
In [129]: a = [np.array([[1, 2, 3],
...: [4, 5, 6],
...: [7, 8, 9]]),
...: np.array([[11, 12, 13],
...: [14, 15, 16],
...: [17, 18, 19]]),
...: np.array([[99, 98, 97],
...: [96, 95, 94],
...: [93, 92, 91]])]
In [130]: b = np.array([[0, 0, 1],
...: [2, 1, 0],
...: [2, 1, 2]])
In [131]: A = np.array(a)
In [132]: A.shape
Out[132]: (3, 3, 3)
You want to use b to index the first dimension. For the other dimensions you need indices that broadcast with b, i.e. a column vector and a row vector:
In [133]: A[b, np.arange(3)[:,None], np.arange(3)]
Out[133]:
array([[ 1, 2, 13],
[96, 15, 6],
[93, 18, 91]])
There are various convenience functions for creating these arrays, e.g.
In [134]: np.ix_(range(3),range(3))
Out[134]:
(array([[0],
[1],
[2]]), array([[0, 1, 2]]))
and ogrid as mentioned in the other answer.
Here's a relatively new function that also does the job:
In [138]: np.take_along_axis(A, b[None,:,:], axis=0)
Out[138]:
array([[[ 1, 2, 13],
[96, 15, 6],
[93, 18, 91]]])
I had to think a bit before I got the adjustment to b right.
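Since the original motivation was having more than 32 arrays, here is a sketch with 100 randomly generated arrays, checking both the advanced-indexing form and take_along_axis against a slow element-by-element reference. The data and the names arrays, picked, same and expected are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_arrays, shape = 100, (3, 3)
arrays = [rng.integers(0, 1000, size=shape) for _ in range(n_arrays)]
b = rng.integers(0, n_arrays, size=shape)   # which array to take each element from

A = np.stack(arrays)                         # shape (100, 3, 3)
i, j = np.ogrid[:shape[0], :shape[1]]

picked = A[b, i, j]                                      # advanced indexing
same = np.take_along_axis(A, b[None, :, :], axis=0)[0]   # drop the leading length-1 axis

# Slow reference built element by element.
expected = np.array([[arrays[b[r, c]][r, c] for c in range(shape[1])]
                     for r in range(shape[0])])
print(np.array_equal(picked, expected), np.array_equal(same, expected))
```

Neither form depends on the number of stacked arrays, which is the point of moving away from np.choose here.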

Use 2D matrix as indexes for a 3D matrix in numpy?

Say I have an array of shape 2x3x3, which is a 3D matrix. I also have a 2D matrix of shape 3x3 that I would like to use as indices for the 3D matrix along the first axis. Example is below.
Example run:
>>> np.random.randint(0,2,(3,3)) # index
array([[0, 1, 0],
[1, 0, 1],
[1, 0, 0]])
>>> np.random.randint(0,9,(2,3,3)) # 3D matrix
array([[[4, 4, 5],
[2, 6, 7],
[2, 6, 2]],
[[4, 0, 0],
[2, 7, 4],
[4, 4, 0]]])
>>> np.array([[4,0,5],[2,6,4],[4,6,2]]) # result
array([[4, 0, 5],
[2, 6, 4],
[4, 6, 2]])
It seems you are using the 2D array as an index array and the 3D array as the source of values. Thus, you could use NumPy's advanced-indexing -
# a : 2D array of indices, b : 3D array from where values are to be picked up
m,n = a.shape
I,J = np.ogrid[:m,:n]
out = b[a, I, J] # or b[a, np.arange(m)[:,None],np.arange(n)]
If you meant to use a to index into the last axis instead, just move a there : b[I, J, a].
Sample run -
>>> np.random.seed(1234)
>>> a = np.random.randint(0,2,(3,3))
>>> b = np.random.randint(11,99,(2,3,3))
>>> a # Index array
array([[1, 1, 0],
[1, 0, 0],
[0, 1, 1]])
>>> b # values array
array([[[60, 34, 37],
[41, 54, 41],
[37, 69, 80]],
[[91, 84, 58],
[61, 87, 48],
[45, 49, 78]]])
>>> m,n = a.shape
>>> I,J = np.ogrid[:m,:n]
>>> out = b[a, I, J]
>>> out
array([[91, 84, 37],
[61, 54, 41],
[37, 49, 78]])
If your matrices get much bigger than 3x3, to the point that memory involved in np.ogrid is an issue, and if your indexes remain binary, you could also do:
np.where(a, b[1], b[0])
But other than that corner case (or if you like code golfing one-liners) the other answer is probably better.
There is a numpy function off-the-shelf: np.choose.
It also comes with some handy broadcast options.
import numpy as np
cube = np.arange(18).reshape((2,3,3))
sel = np.array([[1, 0, 1], [0, 1, 1], [0,1,0]])
the_selection = np.choose(sel, cube)
>>> the_selection
array([[ 9, 1, 11],
[ 3, 13, 14],
[ 6, 16, 8]])
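This method works with any 3D array, provided the length of its first axis stays within np.choose's limit on the number of choice arrays (32 in many NumPy releases).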
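For binary indices the three suggestions in this thread (advanced indexing, np.where and np.choose) should agree. A small check along these lines, on made-up random data, can confirm that before picking one:

```python
import numpy as np

rng = np.random.default_rng(1234)
a = rng.integers(0, 2, size=(3, 3))        # binary index array
b = rng.integers(11, 99, size=(2, 3, 3))   # values array

I, J = np.ogrid[:a.shape[0], :a.shape[1]]
by_index = b[a, I, J]                 # advanced indexing
by_where = np.where(a, b[1], b[0])    # valid only because a holds 0/1
by_choose = np.choose(a, b)           # b is treated as a sequence of 2 choice arrays

print(np.array_equal(by_index, by_where), np.array_equal(by_index, by_choose))
```

Only the advanced-indexing form carries over unchanged when the first axis has many entries.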

numpy array slicing, get one from each third dimension

I have a 3D array of data. I have a 2D array of indices, whose shape matches the first two dimensions of the data array, and it specifies the indices I want to pluck from the data array to make a 2D array, e.g.:
import numpy as np
a = np.arange(3 * 5 * 7).reshape((3, 5, 7))
getters = np.array([0, 1, 2] * 5).reshape(3, 5)
What I'm looking for is a syntax like a[:, :, getters] which returns an array of shape (3,5) by indexing independently into the third dimension of each item. However, a[:, :, getters] returns an array of shape (3,5,3,5). I can do it by iterating and building a new array, but this is pretty slow:
np.array([[col[getters[ri, ci]] for ci, col in enumerate(row)] for ri, row in enumerate(a)])
# gives array([[ 0, 8, 16, 21, 29],
# [ 37, 42, 50, 58, 63],
# [ 71, 79, 84, 92, 100]])
Is there a neat+fast way?
If I understand you correctly, I've done something like this using fancy indexing:
>>> k,j = np.meshgrid(np.arange(a.shape[1]),np.arange(a.shape[0]))
>>> k
array([[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4],
[0, 1, 2, 3, 4]])
>>> j
array([[0, 0, 0, 0, 0],
[1, 1, 1, 1, 1],
[2, 2, 2, 2, 2]])
>>> a[j,k,getters]
array([[ 0, 8, 16, 21, 29],
[ 37, 42, 50, 58, 63],
[ 71, 79, 84, 92, 100]])
Of course, you can keep k and j around and use them as often as you'd like. As pointed out by DSM in comments below, j,k = np.indices(a.shape[:2]) should also work instead of meshgrid. Which one is faster (apparently) depends on the number of elements you are using.
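For completeness, np.take_along_axis (available in newer NumPy releases) can do the same selection without building j and k explicitly. A minimal sketch, compared against the fancy-indexing result:

```python
import numpy as np

a = np.arange(3 * 5 * 7).reshape((3, 5, 7))
getters = np.array([0, 1, 2] * 5).reshape(3, 5)

# Fancy indexing with explicit row/column index grids.
j, k = np.indices(a.shape[:2])
fancy = a[j, k, getters]

# take_along_axis needs the index array to have the same number of
# dimensions as a, so add a trailing axis and drop it again afterwards.
picked = np.take_along_axis(a, getters[:, :, None], axis=2)[:, :, 0]
print(picked)
```

Both forms return an array of shape (3, 5).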
