Determine the indexes where matrix B is a submatrix of matrix A - python

I'm coding in python and I would like to know how I can get the indexes of a matrix A where the matrix B is contained in A. For example, if we have
A = [[1,2,3],
[4,5,6],
[7,8,9]]
and
B = [[2,3],
[5,6]]
Then it should return the indexes ([0,0,1,1], [1,2,1,2]), where the first list corresponds to the x-axis and the second to the y-axis, or something like this.
Thank you for your help!

You can check this question to find if a matrix is a submatrix of another one.
Then you can get the coordinates of each element using NumPy's where function:
import numpy as np
A = np.linspace(1, 9, 9).reshape([3, 3])
B = np.asarray([2, 3, 5, 6]).reshape([2, 2])
# Locate each element of B in A (assumes every value occurs exactly once in A)
submatrix_tuple_coord = [np.where(A == b) for bb in B for b in bb]
submatrix_xy = [[int(x[0]), int(y[0])] for x, y in submatrix_tuple_coord]
# A list of [row, column] index pairs
submatrix_xy
>>> [[0, 1], [0, 2], [1, 1], [1, 2]]
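If the values of B are not unique in A, looking up each element separately can pick up the wrong positions. As a rough sketch (not the method from the linked question), one could instead compare B against every window of A with sliding_window_view, which needs NumPy 1.20 or later. The helper name find_submatrix is made up for illustration, and the sketch assumes B is no larger than A.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def find_submatrix(A, B):
    """Return the (row, col) top-left corner of every occurrence of B in A."""
    A = np.asarray(A)
    B = np.asarray(B)
    # windows has shape (H - h + 1, W - w + 1, h, w)
    windows = sliding_window_view(A, B.shape)
    # A window matches when all of its elements equal B
    matches = (windows == B).all(axis=(2, 3))
    return np.argwhere(matches)


A = np.arange(1, 10).reshape(3, 3)
B = np.array([[2, 3], [5, 6]])

corners = find_submatrix(A, B)

# Expand the first match into per-element row and column indices
r0, c0 = corners[0]
rows, cols = np.indices(B.shape)
row_idx = (rows + r0).ravel()
col_idx = (cols + c0).ravel()
```

Here corners holds the top-left corner of each match, and row_idx, col_idx are the two index lists in the form the question asks for.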

Related

I need to remove every point that has the same Y coordinate in an array

Basically I have a list of [x, y] points that goes: [0,1][1,2][2,4][3,1][4,3], and so on. I want to remove every point whose y coordinate has already appeared, keeping only the first occurrence in order. The output I would like is: [0,1][1,2][2,4][4,3]. How can I do this? I have tried using np.unique, but I can't manage to keep the first appearance or to remove based on the y coordinate.
Thanks
You can use HYRY's solution from numpy.unique with order preserved, you just need to select the Y column.
import numpy as np
a = np.array([[0,1], [1,2], [2,4], [3,1], [4,3]])
_, idx = np.unique(a[:, 1], return_index=True)
a[np.sort(idx)]
result:
[[0 1]
[1 2]
[2 4]
[4 3]]
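To see why the np.sort(idx) step is needed, it may help to look at the intermediate values. This is only a walk-through of the same example with extra variable names (values, kept, unsorted) added for illustration.

```python
import numpy as np

a = np.array([[0, 1], [1, 2], [2, 4], [3, 1], [4, 3]])

# np.unique sorts the y values; idx holds the first position of each value
values, idx = np.unique(a[:, 1], return_index=True)

# Indexing with idx directly would order the rows by y value
unsorted = a[idx]

# Sorting idx puts the kept rows back in their original order
kept = a[np.sort(idx)]
```

Here values is [1 2 3 4] and idx is [0 1 4 2], so a[idx] would return the rows ordered by y rather than by position.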
array = [[0,1],[1,2],[2,4],[3,1],[4,3]]
occured = set()
result = []
for element in array:
    if element[1] not in occured:
        result.append(element)
        occured.add(element[1])
array.clear()
array.extend(result)
print(array)
>> [[0, 1], [1, 2], [2, 4], [4, 3]]

Extracting subarray based on logical indexing

Take a look at the following code in MATLAB:
a = [1,2; 5,6]
b = [-1,1; -1,1]
d = a(b(:)>0)
Now d will be the 2x1 array [2;6]. This is because b has positive entries only at positions (1,2) and (2,2), and the third line of the code extracts the elements of a at those positions.
Is there an equivalent method in Python that does this? I searched numpy documentation but could not find any. In my actual code, I have multiple large, multidimensional arrays from which I would want to extract elements based on the elements of other arrays. Of course, this can be done with nested for loops but it would be much better if there is a nicer way like MATLAB does.
Assuming a and b are numpy arrays, use:
d = a[b > 0]
In numpy, indexing is done with the [] operator.
Without using any libraries:
a = [[1, 2], [5, 6]]
b = [[-1, 1], [-1, 1]]
d = [
    a_xy
    for a_x, b_x in zip(a, b)
    for a_xy, b_xy in zip(a_x, b_x)
    if b_xy > 0
]
Using numpy:
import numpy as np
a = np.array([[1, 2], [5, 6]])
b = np.array([[-1, 1], [-1, 1]])
d = a[b > 0]
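One difference worth keeping in mind when porting from MATLAB: a(b(:)>0) walks the array column by column, whereas NumPy's a[b > 0] walks it row by row. For the example above both give [2, 6], but for other masks the order differs. A small sketch with a different, made-up mask b:

```python
import numpy as np

a = np.array([[1, 2], [5, 6]])
b = np.array([[-1, 1], [1, 1]])  # hypothetical mask values, not from the question

# NumPy's default: elements come out in row-major order
row_major = a[b > 0]

# Transposing both arrays yields the column-major order MATLAB would give
col_major = a.T[b.T > 0]
```

Here row_major is [2 5 6] and col_major is [5 2 6]. If the order of the extracted elements does not matter, the plain a[b > 0] is enough.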

NumPy apply function to groups of rows corresponding to another numpy array

I have a NumPy array with each row representing some (x, y, z) coordinate like so:
a = array([[0, 0, 1],
[1, 1, 2],
[4, 5, 1],
[4, 5, 2]])
I also have another NumPy array with unique values of the z-coordinates of that array like so:
b = array([1, 2])
How can I apply a function, let's call it "f", to each of the groups of rows in a which correspond to the values in b? For example, the first value of b is 1 so I would get all rows of a which have a 1 in the z-coordinate. Then, I apply a function to all those values.
In the end, the output would be an array the same shape as b.
I'm trying to vectorize this to make it as fast as possible. Thanks!
Example of an expected output (assuming that f is count()):
c = array([2, 2])
because there are 2 rows in array a which have a z value of 1 in array b and also 2 rows in array a which have a z value of 2 in array b.
A trivial solution would be to iterate over array b like so:
c = []
for val in b:
    # apply the function to the rows of a whose z-coordinate equals val
    c.append(f(a[a[:, 2] == val]))
My attempt:
I tried doing something like this, but it just returns an empty array.
func(a[a[:, 2]==b])
The problem is that the groups of rows with the same z can have different sizes, so you cannot stack them into one 3D NumPy array, which would make it easy to apply a function along the third dimension. One solution is to use a for-loop; another is to use np.split:
import numpy as np
a = np.array([[0, 0, 1],
              [1, 1, 2],
              [4, 5, 1],
              [4, 5, 2],
              [4, 3, 1]])
a_sorted = a[a[:,2].argsort()]
inds = np.unique(a_sorted[:,2], return_index=True)[1]
a_split = np.split(a_sorted, inds)[1:]
# [array([[0, 0, 1],
# [4, 5, 1],
# [4, 3, 1]]),
# array([[1, 1, 2],
# [4, 5, 2]])]
f = np.sum # example of a function
result = list(map(f, a_split))
# [19, 15]
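When f is a reduction that exists as a NumPy ufunc (sum, maximum, minimum and so on), the split step can possibly be skipped. The following sketch uses np.add.reduceat on the same data. It is an illustrative alternative that was not part of the benchmark below, and it only covers ufunc reductions, not arbitrary functions.

```python
import numpy as np

a = np.array([[0, 0, 1],
              [1, 1, 2],
              [4, 5, 1],
              [4, 5, 2],
              [4, 3, 1]])

# A stable sort keeps rows of equal z in their original relative order
a_sorted = a[a[:, 2].argsort(kind="stable")]

# Start index of each run of equal z values in the sorted array
z_values, starts = np.unique(a_sorted[:, 2], return_index=True)

# Column sums per group, computed without a Python-level loop
col_sums = np.add.reduceat(a_sorted, starts, axis=0)

# Total per group, matching f = np.sum applied to each split
group_sums = col_sums.sum(axis=1)
```

For this data group_sums is [19 15], the same values the np.split version produces with f = np.sum.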
But imho the best solution is to use pandas and groupby as suggested by FBruzzesi. You can then convert the result to a numpy array.
EDIT: For completeness, here are the other two solutions
List comprehension:
b = np.unique(a[:,2])
result = [f(a[a[:,2] == z]) for z in b]
Pandas:
import pandas as pd
df = pd.DataFrame(a, columns=list('XYZ'))
result = df.groupby(['Z']).apply(lambda x: f(x.values)).tolist()
This is the performance plot I got for a = np.random.randint(0, 100, (n, 3)):
As you can see, approximately up to n = 10^5 the "split solution" is the fastest, but after that the pandas solution performs better.
If you are allowed to use pandas:
import pandas as pd
df=pd.DataFrame(a, columns=['x','y','z'])
df.groupby('z').agg(f)
Here f can be any custom function working on grouped data.
Numeric example:
a = np.array([[0, 0, 1],
[1, 1, 2],
[4, 5, 1],
[4, 5, 2]])
df=pd.DataFrame(a, columns=['x','y','z'])
df.groupby('z').size()
z
1 2
2 2
dtype: int64
Note that .size() is the way to count the number of rows per group.
To keep it in pure NumPy, maybe this suits your case:
tmp = np.array([a[a[:,2]==i] for i in b])
tmp
array([[[0, 0, 1],
[4, 5, 1]],
[[1, 1, 2],
[4, 5, 2]]])
which is an array holding each group of rows. Note that this only gives a regular 3D array when every group has the same number of rows.
c = np.array([])
for x in np.nditer(b):
    c = np.append(c, np.where((a[:,2] == x))[0].shape[0])
Output:
[2. 2.]
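For the specific case where f is a count, a loop-free variant might look like the sketch below. It relies on np.unique with return_counts=True and assumes that every value in b actually occurs in the z column of a.

```python
import numpy as np

a = np.array([[0, 0, 1],
              [1, 1, 2],
              [4, 5, 1],
              [4, 5, 2]])
b = np.array([1, 2])

# Distinct z values (sorted) and how many rows carry each of them
values, counts = np.unique(a[:, 2], return_counts=True)

# Look up the count for each entry of b (assumes each entry of b is in values)
c = counts[np.searchsorted(values, b)]
```

This keeps the integer dtype, so c is [2 2] rather than [2. 2.].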

python numpy `np.take` with 2 dimensional array

I'm trying to take a list of elements from a 2D numpy array with a given list of coordinates, and I want to avoid using a loop. I saw that np.take works with 1D arrays, but I can't make it work with 2D arrays.
Example:
a = np.array([[1,2,3], [4,5,6]])
print(a)
# [[1 2 3]
# [4 5 6]]
np.take(a, [[1,2]])
# gives [[2, 3]] but I want just [6]
I want to avoid a loop because I think it will be slower (I need speed). But if you can persuade me that a loop is as fast as an existing numpy function solution, then I can go for it.
If I understand it correctly, you have a list of coordinates like this:
coords = [[y0, x0], [y1, x1], ...]
To get the values of array a at these coordinates you need:
a[[y0, y1, ...], [x0, x1, ...]]
So a[coords] will not work. One way to do it is:
Y = [c[0] for c in coords]
X = [c[1] for c in coords]
or
Y = np.transpose(coords)[0]
X = np.transpose(coords)[1]
Then
a[Y, X]
Does fancy indexing do what you want? np.take seems to flatten the array before operating.
import numpy as np
a = np.arange(1, 10).reshape(3,3)
a
# array([[1, 2, 3],
# [4, 5, 6],
# [7, 8, 9]])
rows = [1, 1, 2, 0]
cols = [0, 1, 1, 2]
# Use the indices to access items in a
a[rows, cols]
# array([4, 5, 8, 3])
a[1,0], a[1,1], a[2,1], a[0,2]
# (4, 5, 8, 3)
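Starting from a list of [row, col] pairs, as in the question, the two approaches can be sketched as follows. The coords values are made up for illustration. The np.take variant works on the flattened array, so the pairs are first converted to flat indices with np.ravel_multi_index.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])
coords = np.array([[1, 2], [0, 0]])  # hypothetical (row, col) pairs

# Split the pairs into one index array per axis and index directly
picked = a[tuple(coords.T)]

# The same lookup through np.take, using indices into the flattened array
flat = np.ravel_multi_index(tuple(coords.T), a.shape)
picked_take = np.take(a, flat)
```

Both picked and picked_take are [6 1] here. Neither version uses a Python-level loop.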

Numpy built in elementwise append

I have an array of 3D points onto which I would like to append a corresponding value from a flat array.
points = [[1,2,3],[4,5,6]]
info = [1,2]
Is there a built-in way to append elements from the second array to the corresponding positions in the first?
output = [[1,2,3,1],[4,5,6,2]]
Use np.hstack:
points = np.array([[1,2,3],[4,5,6]])
info = np.array([1, 2])
output = np.hstack([points, info.reshape(2,1)])
Output:
array([[1, 2, 3, 1],
[4, 5, 6, 2]])
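A possible variation that avoids hard-coding the shape in reshape(2,1) is np.column_stack, which treats a 1D input as a column. A minimal sketch with the same data:

```python
import numpy as np

points = np.array([[1, 2, 3], [4, 5, 6]])
info = np.array([1, 2])

# The 1D array info is appended as an extra column
output = np.column_stack([points, info])
```

This works for any number of points, as long as info has one entry per row of points.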
