I have a 3d array of values,
vals = np.array([
[
[10, 20, 30],
[40, 50, 60],
],
[
[15, 25, 35],
[45, 55, 65],
],
])
and a corresponding 3d array of coordinates
coords = np.array([
[
[0,1],
[0,2],
[1,1]
],
[
[0,0],
[1,1],
[1,2]
]
])
Each inner-most array of coords represents (x,y) coordinates corresponding to one of the 2d arrays within vals. For example, the coordinate [0,1] in coords corresponds to the value 20 and the coordinate [1,2] in coords corresponds to the value 65.
How do I use coords to subset vals in this manner?
I can solve this specific example like so
np.array([
vals[0][coords[0][:, 0], coords[0][:, 1]],
vals[1][coords[1][:, 0], coords[1][:, 1]]
])
array([[20, 30, 50],
[15, 55, 65]])
but obviously I'd like a more dynamic solution.
Funny how writing my questions always seems to lead me to an answer. Staring at the answer matrix,
array([[20, 30, 50],
[15, 55, 65]])
I asked myself, "how would I reproduce this matrix from raw index values?". For example, to extract the value 20, I know I can do
vals[0, 0, 1]
If I wanted to extract the first row of values in the answer, [20, 30, 50] I should do
vals[[0,0,0], [0,0,1], [1,2,1]]
Then to get the full answer matrix, I should do
vals[[[0,0,0],[1,1,1]], [[0,0,1],[0,1,1]], [[1,2,1],[0,1,2]]]
From here, I set my focus on producing those three index matrices. They can be constructed as follows:
i1 = np.arange(coords.shape[0])[:, None].repeat(coords.shape[1], axis=1)
i2 = coords[:,:,0]
i3 = coords[:,:,1]
# Thus the generalized solution
vals[i1, i2, i3]
This answer is extremely similar to the advanced indexing solution mentioned by @Psidom in the comments, but perhaps less elegant.
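As a minimal sketch of a more compact variant (not taken from the comments, just an illustration of the same advanced-indexing idea), the first index array does not need to be repeated explicitly: a column vector of block indices broadcasts against the two coordinate arrays. The name block_idx is only illustrative.

```python
import numpy as np

vals = np.array([[[10, 20, 30], [40, 50, 60]],
                 [[15, 25, 35], [45, 55, 65]]])
coords = np.array([[[0, 1], [0, 2], [1, 1]],
                   [[0, 0], [1, 1], [1, 2]]])

# Column vector of block indices, shape (2, 1). Broadcasting stretches it
# across the (2, 3) coordinate arrays, so no explicit repeat is needed.
block_idx = np.arange(coords.shape[0])[:, None]

result = vals[block_idx, coords[..., 0], coords[..., 1]]
print(result)
```

The three index arrays broadcast to a common shape of (2, 3), which is also the shape of the result.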
I'm writing a script to reduce the number of colors in a list by finding clusters. The problem I seem to run into is that the clusters will have different dimensions. Here is my jumping-off point, after the original list of 6 colors has already been separated into 3 clusters:
import numpy
a = numpy.array([
[12, 44, 52],
[27, 0, 71],
[81, 99, 92]
])
b = numpy.array([
[ 12, 13, 93],
[128, 128, 128]
])
c = numpy.array([
[ 57, 14, 255]
])
clusters = numpy.array([a,b,c])
print(numpy.min(clusters, axis=1))
However now the function numpy.min() starts to throw an error - I suspect it's because of the differently sized arrays.
The cluster arrays will always have the shape (x, 3) (x colors, 3 components each). I want to get an array of shape (n, 3) (n is the number of clusters) holding the per-component minimum of the colors in each cluster, so array([[12, 0, 52], [12, 13, 93], [57, 14, 255]]) in this case.
Is there a way to do this? As I mentioned it works as long as all clusters have multiple values.
Since your arrays a, b and c don't have an equal shape, you can't put them in the same array (at least if you don't pad with some value). You could calculate the minimum first and then generate an array from these minima:
numpy.array([arr.min(axis=0) for arr in (a, b, c)])
Which gives you:
array([[ 12, 0, 52],
[ 12, 13, 93],
[ 57, 14, 255]])
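If a loop-free variant is preferred, one possible sketch (an alternative, not part of the answer above) is to concatenate the clusters into one array and reduce each segment with np.minimum.reduceat. The names stacked, lengths and starts are illustrative, and the sketch assumes every cluster has at least one row, because reduceat does not treat empty segments as empty.

```python
import numpy as np

a = np.array([[12, 44, 52], [27, 0, 71], [81, 99, 92]])
b = np.array([[12, 13, 93], [128, 128, 128]])
c = np.array([[57, 14, 255]])

clusters = [a, b, c]  # a plain list, since the shapes differ

# Stack all colors into one (6, 3) array and remember where each cluster starts.
stacked = np.concatenate(clusters, axis=0)
lengths = np.array([len(arr) for arr in clusters])
starts = np.concatenate(([0], np.cumsum(lengths)[:-1]))  # [0, 3, 5]

# Minimum over each run of rows [start_i, start_{i+1}), column by column.
minima = np.minimum.reduceat(stacked, starts, axis=0)
print(minima)
```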
I came across this problem while helping on this question, where the OP does some image processing. Regardless of whether there are other ways to do the whole thing, in one part I have a 2D np.array filled with integers. The integers are just mask values, each standing for an RGB color.
I have a dictionary with integers as keys and arrays of RGB colors as values. This is the mapping, and the goal is to replace each int in the array with its color.
I start with this array, in which every RGB array has already been replaced by an integer, so it now has shape (2,3) (originally it had shape (2,3,3)):
import numpy as np
arr = np.array([0,2,4,1,3,5]).reshape(2,3)
print(arr)
array([[0, 2, 4],
[1, 3, 5]])
Here is the dictionary (chosen numbers are just random for the example):
dic = {0 : [10,20,30], 1 : [12,22,32], 2 : [15,25,35], 3 : [40,50,60], 4 : [100,200,300], 5 : [250,350,450]}
Replacing all these values with the arrays should turn it into an array of shape (2,3,3) like this:
array([[[ 10, 20, 30],
[ 15, 25, 35],
[100, 200, 300]],
[[ 12, 22, 32],
[ 40, 50, 60],
[250, 350, 450]]])
I looked into np.where because it seemed the most obvious choice to me, but I always got an error saying that the shapes are incorrect.
I don't know exactly where I'm stuck. While googling, I came across np.dstack and np.concatenate, and read about changing the shape with np.newaxis / None, but I just can't get it done. Maybe I should create a new array with np.zeros_like and go from there.
Do I need to create something like a placeholder before I'm able to insert an array holding these 3 RGB values?
Since every key is present in the array (the array was created from the keys), I thought about looping through the dict, checking for each key in the array and replacing it with the corresponding dict value. Am I at least heading in the right direction, or does that lead nowhere?
Any help much appreciated!!!
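We can unpack the dictionary values into an array and then index that array with arr. This works when the keys are the consecutive integers 0..n-1 and appear in that order in the dictionary, so that row k of the values array holds the color for key k: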
np.array([*dic.values()])[arr]
If the dictionary keys are not in sorted order, we can build an index array with np.argsort that puts the values in key order. After reordering the array of dictionary values with it, we can index with arr as before, e.g.:
dic = {0: [10, 20, 30], 2: [15, 25, 35], 3: [40, 50, 60], 1: [12, 22, 32], 4: [100, 200, 300], 5: [250, 350, 450]}
sort_mask = np.array([*dic.keys()]).argsort()
# [0 3 1 2 4 5]
np.array([*dic.values()])[sort_mask][arr]
# [[[ 10 20 30]
# [ 15 25 35]
# [100 200 300]]
#
# [[ 12 22 32]
# [ 40 50 60]
# [250 350 450]]]
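Both variants above rely on the keys being exactly 0..n-1. As a hedged sketch for the more general case (an assumption on my part, not something the question requires), a lookup table indexed by key also tolerates gaps in the keys, as long as they are non-negative integers. The names lut and colors are illustrative.

```python
import numpy as np

arr = np.array([0, 2, 4, 1, 3, 5]).reshape(2, 3)
dic = {0: [10, 20, 30], 2: [15, 25, 35], 3: [40, 50, 60],
       1: [12, 22, 32], 4: [100, 200, 300], 5: [250, 350, 450]}

keys = np.array(list(dic.keys()))
values = np.array(list(dic.values()))

# One row per possible key value; rows for keys that do not occur stay zero.
# Assumes the keys are non-negative integers.
lut = np.zeros((keys.max() + 1, values.shape[1]), dtype=values.dtype)
lut[keys] = values

# Indexing the (6, 3) table with the (2, 3) array gives shape (2, 3, 3).
colors = lut[arr]
print(colors)
```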
I have two two dimensional arrays a and b (#columns of a <= #columns in b). I would like to find an efficient way of matching a row in array a to a contiguous part of a row in array b.
a = np.array([[ 25, 28],
[ 84, 97],
[105, 24],
[ 28, 900]])
b = np.array([[ 25, 28, 84, 97],
[ 22, 25, 28, 900],
[ 11, 12, 105, 24]])
The output should be np.array([[0,0], [0,1], [1,0], [2,2], [3,1]]). Row 0 in array a matches Row 0 in array b (first two positions). Row 1 in array a matches row 0 in array b (third and fourth positions).
We can leverage scikit-image's view_as_windows, which is based on np.lib.stride_tricks.as_strided, for efficient patch extraction, and then compare those patches against each row of a, all in a vectorized manner. Then, get the matching indices with np.argwhere -
# a and b from posted question
In [325]: from skimage.util.shape import view_as_windows
In [428]: w = view_as_windows(b,(1,a.shape[1]))
In [429]: np.argwhere((w == a).all(-1).any(-2))[:,::-1]
Out[429]:
array([[0, 0],
[1, 0],
[0, 1],
[3, 1],
[2, 2]])
Alternatively, we could get the indices by the order of rows in a by pushing forward the first axis of a while performing broadcasted comparisons -
In [444]: np.argwhere((w[:,:,0] == a[:,None,None,:]).all(-1).any(-1))
Out[444]:
array([[0, 0],
[0, 1],
[1, 0],
[2, 2],
[3, 1]])
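If scikit-image is not available, a similar result can probably be obtained with NumPy alone. The following is a minimal sketch that swaps view_as_windows for np.lib.stride_tricks.sliding_window_view, which assumes NumPy 1.20 or newer; the names windows, matches and pairs are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.array([[ 25,  28],
              [ 84,  97],
              [105,  24],
              [ 28, 900]])
b = np.array([[ 25,  28,  84,  97],
              [ 22,  25,  28, 900],
              [ 11,  12, 105,  24]])

# windows[r, s] is b[r, s:s + a.shape[1]]; shape (3, 3, 2) for these inputs.
windows = sliding_window_view(b, a.shape[1], axis=1)

# Compare every row of a with every window: shape (4, 3, 3, 2) before reducing.
# all(-1): the whole window matches; any(-1): some window in that row of b matches.
matches = (windows[None] == a[:, None, None, :]).all(-1).any(-1)

# Each output row is (row index in a, row index in b), ordered by row of a.
pairs = np.argwhere(matches)
print(pairs)
```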
Another way I can think of is to loop over each row in a and perform a 2D correlation between b, which you can consider as a 2D signal, and that row of a.
We would look for results that are equal to the sum of squares of all values in that row of a. If we subtract this sum of squares from the correlation result, matches show up as zeros. Any row of b that gives a 0 result is a candidate for containing the subarray. Note that a zero is necessary for a match but not sufficient (for the row [3, 4], the window [7, 1] also correlates to 25), so candidates should be confirmed with a direct comparison if false positives matter. If you are using floating-point numbers, for example, you may want to compare with some small threshold that is just above 0.
If you can use SciPy, the scipy.signal.correlate2d method is what I had in mind.
import numpy as np
from scipy.signal import correlate2d
a = np.array([[ 25, 28],
[ 84, 97],
[105, 24]])
b = np.array([[ 25, 28, 84, 97],
[ 22, 25, 28, 900],
[ 11, 12, 105, 24]])
EPS = 1e-8
result = []
for (i, row) in enumerate(a):
    out = correlate2d(b, row[None,:], mode='valid') - np.square(row).sum()
    locs = np.where(np.abs(out) <= EPS)[0]
    unique_rows = np.unique(locs)
    for res in unique_rows:
        result.append((i, res))
We get:
In [32]: result
Out[32]: [(0, 0), (0, 1), (1, 0), (2, 2)]
The time complexity of this could be better, especially since we're looping over each row of a to find any subarrays in b.
I have one matrix, like
a = np.array([[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]])
and I want to get a new array, where each element is the outer product of a row of a with itself:
np.array([
np.dot(np.array([a[0]]).T, np.array([a[0]])),
np.dot(np.array([a[1]]).T, np.array([a[1]])),
np.dot(np.array([a[2]]).T, np.array([a[2]])),
np.dot(np.array([a[3]]).T, np.array([a[3]])),
])
which will be an array of four 3x3 matrices, i.e. of shape (4, 3, 3).
After this I can sum over the 0 axis to get a new 3x3 matrix.
Is there a more elegant way to implement this than using a loop?
Use NumPy broadcasting to keep the first axis aligned and perform the outer product along the second axis -
a[:,:,None]*a[:,None,:] # or a[...,None]*a[:,None]
With np.einsum, translates to -
np.einsum('ij,ik->ijk',a,a)
I might be missing something, but isn't this just matrix multiplication?
>>> a.T @ a
array([[30, 40, 50],
[40, 54, 68],
[50, 68, 86]])
>>> np.sum(np.array([
np.dot(np.array([a[0]]).T, np.array([a[0]])),
np.dot(np.array([a[1]]).T, np.array([a[1]])),
np.dot(np.array([a[2]]).T, np.array([a[2]])),
np.dot(np.array([a[3]]).T, np.array([a[3]])),
]), axis=0)
array([[30, 40, 50],
[40, 54, 68],
[50, 68, 86]])
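To tie the answers together, here is a small sketch that checks the three routes against each other on the array from the question: the broadcasted outer products summed over axis 0, an einsum that sums over the row index directly, and the plain matrix product. The variable names are illustrative.

```python
import numpy as np

a = np.array([[1, 2, 3], [2, 3, 4], [3, 4, 5], [4, 5, 6]])

# Per-row outer products, shape (4, 3, 3), then summed over the rows.
outer = a[:, :, None] * a[:, None, :]
summed = outer.sum(axis=0)

# Dropping i from the einsum output sums over it in one step.
direct = np.einsum('ij,ik->jk', a, a)

# The same 3x3 matrix as an ordinary matrix product.
product = a.T @ a

print(summed)
```

When only the summed 3x3 matrix is needed, the matrix product avoids building the intermediate (4, 3, 3) array.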
I have two ndarrays :
a = [[30,40],
[60,90]]
b = [[0,0,1],
[1,0,1],
[1,1,1]]
Please note that a might be larger, but it is always a square array, e.g. (50,50) or (100,100).
The wanted result is:
Result = [[a*0,a*0,a*1],
[a*1,a*0,a*1],
[a*1,a*1,a*1]]
I managed to get the right answer with this code, but I think there should be a built-in function in NumPy that accomplishes this task quickly:
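def block_multiply(a, b):  # function name is illustrative; the original snippet had none
    totalrows = []
    for row in range(b.shape[0]):
        cells = []
        for column in range(b.shape[1]):
            print(row, column)
            # scale the whole of a by one entry of b
            cells.append(b[row, column] * a)
        # join the scaled copies of a side by side
        totalrows.append(np.concatenate(cells, axis=1))
    # stack the block rows vertically
    return np.concatenate(totalrows, axis=0)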
Indeed there's a NumPy built-in np.kron for such block-based elementwise multiplication problems. To solve your case, it could be used like so -
np.kron(b,a)
Sample run -
In [50]: a
Out[50]:
array([[30, 40],
[60, 90]])
In [51]: b
Out[51]:
array([[0, 0, 1],
[1, 0, 1],
[1, 1, 1]])
In [52]: np.kron(b,a)
Out[52]:
array([[ 0, 0, 0, 0, 30, 40],
[ 0, 0, 0, 0, 60, 90],
[30, 40, 0, 0, 30, 40],
[60, 90, 0, 0, 60, 90],
[30, 40, 30, 40, 30, 40],
[60, 90, 60, 90, 60, 90]])
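To illustrate what np.kron computes here, the same block layout can be sketched with plain broadcasting and a reshape. This is only meant as an explanation for the 2D case shown above, not as a replacement for np.kron; the names blocks and tiled are illustrative.

```python
import numpy as np

a = np.array([[30, 40],
              [60, 90]])
b = np.array([[0, 0, 1],
              [1, 0, 1],
              [1, 1, 1]])

# blocks[i, p, j, q] = b[i, j] * a[p, q], shape (3, 2, 3, 2).
blocks = b[:, None, :, None] * a[None, :, None, :]

# Merging axes (i, p) and (j, q) places each scaled copy of a at block (i, j).
tiled = blocks.reshape(b.shape[0] * a.shape[0], b.shape[1] * a.shape[1])

print(np.array_equal(tiled, np.kron(b, a)))
```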
3D array case
Now, let's say a is a 3D array of shape (m,n,p) and b is of shape (q,r), and you want to perform this block-wise multiplication separately for each slice along the last axis of a. The shapes are then multiplied along the first two axes of the two inputs to give the output shape. To achieve this, we extend b with a singleton dimension as its last axis. The final output has shape (m*q,n*r,p*1). The implementation is simply -
np.kron(b[...,None],a)
Shape check -
In [161]: a = np.random.randint(0,99,(4,5,2))
...: b = np.random.randint(0,99,(6,7))
...:
In [162]: np.kron(b[...,None],a).shape
Out[162]: (24, 35, 2)
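Beyond the shape check, a quick sketch can confirm that the 3D call agrees with applying the 2D np.kron to each slice along the last axis of a. The seeded generator and the names out and per_slice are assumptions made for the sake of a reproducible example.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded only to make the example reproducible
a = rng.integers(0, 99, size=(4, 5, 2))
b = rng.integers(0, 99, size=(6, 7))

# Single call on the 3D input, with b extended by a trailing singleton axis.
out = np.kron(b[..., None], a)

# Reference: 2D Kronecker product for each slice of a, stacked on the last axis.
per_slice = np.stack([np.kron(b, a[..., k]) for k in range(a.shape[-1])], axis=-1)

print(out.shape, np.array_equal(out, per_slice))
```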