How to find intersect indexes and values in Python? - python

I am trying to convert code from MATLAB to Python.
I have this code in MATLAB:
[value, iA, iB] = intersect(netA{i},netB{j});
I am looking for Python code that finds the values common to both A and B, as well as the index vectors ia and ib (for each common element, its first index in A and its first index in B).
I tried several solutions, but I received index vectors of different lengths. I also tried numpy.in1d and numpy.intersect1d, but they do not return the same values as MATLAB.
What I tried:
def FindoverlapIndx(self, a, b):
    bool_a = np.in1d(a, b)
    ind_a = np.arange(len(a))
    ind_a = ind_a[bool_a]
    ind_b = np.array([np.argwhere(b == a[x]) for x in ind_a]).flatten()
    return ind_a, ind_b
IS=np.arange(IDs[i].shape[0])[np.in1d(IDs[i], R_IDs[j])]
IR = np.arange(R_IDs[j].shape[0])[np.in1d(R_IDs[j],IDs[i])]
I received index vectors of different lengths, but both must have the same length, as they do with MATLAB's intersect.

MATLAB's intersect(a, b) returns:
common values of a and b, sorted
the first position of each of them in a
the first position of each of them in b
NumPy's intersect1d does only the first part. So I read its source and modified it to return indices as well.
import numpy as np
def intersect_mtlb(a, b):
    a1, ia = np.unique(a, return_index=True)
    b1, ib = np.unique(b, return_index=True)
    aux = np.concatenate((a1, b1))
    aux.sort()
    c = aux[:-1][aux[1:] == aux[:-1]]
    return c, ia[np.isin(a1, c)], ib[np.isin(b1, c)]

a = np.array([7, 1, 7, 7, 4])
b = np.array([7, 0, 4, 4, 0])
c, ia, ib = intersect_mtlb(a, b)
print(c, ia, ib)
This prints [4 7] [4 0] [2 0], which is consistent with the output in the MATLAB documentation, as I used the same example as they did. Of course, indices are 0-based in Python, unlike MATLAB.
Explanation: the function takes the unique elements from each array, concatenates them, and sorts the result: [0 1 4 4 7 7]. Each number appears at most twice here; when it is repeated, that means it was in both arrays. This is what aux[1:] == aux[:-1] selects for.
The array ia contains the first index of each element of a1 in the original array a. Filtering it by isin(a1, c) leaves only the indices of the elements that are in c. The same is done for ib.
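To make the explanation concrete, here is a small trace of the intermediate arrays for the same a and b. It only repeats the steps of intersect_mtlb one at a time; the variable names mirror the function body and nothing new is introduced.

```python
import numpy as np

a = np.array([7, 1, 7, 7, 4])
b = np.array([7, 0, 4, 4, 0])

# Unique values and the index of their first occurrence in each input
a1, ia = np.unique(a, return_index=True)   # a1 = [1 4 7], ia = [1 4 0]
b1, ib = np.unique(b, return_index=True)   # b1 = [0 4 7], ib = [1 2 0]

# Concatenate and sort: a value present in both arrays now appears twice in a row
aux = np.sort(np.concatenate((a1, b1)))    # [0 1 4 4 7 7]
c = aux[:-1][aux[1:] == aux[:-1]]          # [4 7]

# Keep only the first-occurrence indices whose value is in c
print(ia[np.isin(a1, c)], ib[np.isin(b1, c)])  # [4 0] [2 0]
```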
EDIT:
Since version 1.15.0, intersect1d does the second and third part if you pass return_indices=True:
x = np.array([1, 1, 2, 3, 4])
y = np.array([2, 1, 4, 6])
xy, x_ind, y_ind = np.intersect1d(x, y, return_indices=True)
This gives xy = array([1, 2, 4]), x_ind = array([0, 2, 4]) and y_ind = array([1, 0, 2]).
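As a quick check, the built-in can be run on the same arrays as the MATLAB documentation example used earlier. This is a minimal sketch assuming NumPy 1.15 or later; the final line adds 1 only to compare against MATLAB's 1-based indices.

```python
import numpy as np

a = np.array([7, 1, 7, 7, 4])
b = np.array([7, 0, 4, 4, 0])

# return_indices=True requires NumPy >= 1.15
c, ia, ib = np.intersect1d(a, b, return_indices=True)
print(c, ia, ib)        # [4 7] [4 0] [2 0]

# Shift to 1-based indices only when comparing with MATLAB output
print(ia + 1, ib + 1)   # [5 1] [3 1]
```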

Related

Finding an index numpy python

Consider a NumPy array of shape (8, 8).
My Question: What is the index (x,y) of the 50th element?
Note: For counting the elements go row-wise.
Example, in array A, where A = [[1, 5, 9], [3, 0, 2]] the 5th element would be '0'.
Can someone explain how to find the general solution for this and, what would be the solution for this specific problem?
You can use unravel_index to find the coordinates corresponding to an index into the flattened array. NumPy indices start at 0, so you have to subtract 1 from the 1-based element number.
import numpy as np
a = np.arange(64).reshape(8,8)
np.unravel_index(50-1, a.shape)
Out:
(6, 1)
In a NumPy array a of shape (r, c) (just like a list of lists), the n-th element is
a[(n-1) // c][(n-1) % c],
assuming that n starts from 1 as in your example.
It has nothing to do with r. Thus, when r = c = 8 and n = 50, the above formula is exactly
a[6][1].
Let me show more using your example:
from numpy import *
a = array([[1, 5, 9], [3, 0, 2]])
r = len(a)
c = len(a[0])
print(f'(r, c) = ({r}, {c})')
print(f'Shape: {a.shape}')
for n in range(1, r * c + 1):
    print(f'Element {n}: {a[(n-1) // c][(n-1) % c]}')
Below is the result:
(r, c) = (2, 3)
Shape: (2, 3)
Element 1: 1
Element 2: 5
Element 3: 9
Element 4: 3
Element 5: 0
Element 6: 2
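The formula can be cross-checked against NumPy's own helpers. In this sketch, nth_position is a hypothetical helper name (not a NumPy function); it uses divmod to compute the quotient and remainder in one step, and np.ravel_multi_index goes back from (row, col) to the flat index.

```python
import numpy as np

def nth_position(n, shape):
    """Hypothetical helper: (row, col) of the n-th element, counting from 1, row-wise."""
    rows, cols = shape
    return divmod(n - 1, cols)

shape = (8, 8)
print(nth_position(50, shape))                    # (6, 1)

# NumPy's built-in for the same conversion (takes a 0-based flat index)
print(np.unravel_index(50 - 1, shape))

# Inverse direction: (row, col) back to the 1-based element number
print(np.ravel_multi_index((6, 1), shape) + 1)    # 50
```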
numpy.ndarray.flatten(a) returns a copy of the array a collapsed into one dimension. Please note that counting starts from 0, so in your example 0 is at index 4 and 1 is at index 0.
import numpy as np
arr = np.array([[1, 5, 9], [3, 0, 2]])
fourth_element = np.ndarray.flatten(arr)[4]
or
fourth_element = arr.flatten()[4]
The same works for an 8x8 matrix.
First, create a 2D NumPy array with 8*8 elements using np.array and range, then reshape it to (8, 8).
In the output you can check that the index of the 50th element is [6,1]:
import numpy as np
arr = np.array(range(1,(8*8)+1)).reshape(8,8)
print(arr[6,1])
The output will be 50.
Alternatively, you can do it in a generic way with NumPy's where function:
import numpy as np
def getElementIndex(array: np.ndarray, element):
    # np.where returns one index array per axis; take the first match
    elementIndex = np.where(array == element)
    return f'[{elementIndex[0][0]},{elementIndex[1][0]}]'

def getXYOrderNumberArray(x: int, y: int):
    return np.array(range(1, (x*y)+1)).reshape(x, y)

arr = getXYOrderNumberArray(8,8)
print(getElementIndex(arr,50))

Python equivalent of (matrix)*(vector) in R

In R, when I execute the code below:
> X=matrix(1,2,3)
> c=c(1,2,3)
> X*c
R gives out the following output:
     [,1] [,2] [,3]
[1,]    1    3    2
[2,]    2    1    3
But when I do the below on Python:
>>> import numpy as np
>>> X=np.array([[1,1,1],[1,1,1]])
>>> c=np.array([1,2,3])
>>> X*c
the Python code above gives the following output:
array([[1, 2, 3],
       [1, 2, 3]])
Is there any way to make Python produce the same output as R? I think I somehow have to tell NumPy to multiply the elements of the matrix X by the elements of the vector c going down the columns instead of along the rows, but I am not sure how to go about this.
In [18]: np.reshape([1,2,3]*2,(2,3),order='F')
Out[18]:
array([[1, 3, 2],
       [2, 1, 3]])
This starts with a list multiply, which is replication:
In [19]: [1,2,3]*2
Out[19]: [1, 2, 3, 1, 2, 3]
The rest uses numpy to reshape it into a (2,3) array, but with consecutive values going down, 'F' order.
Not knowing R, and in particular the c(1,2,3) expression, I can't say whether that's what's going on in R.
===
You talk about rows with columns, but I don't see how that works in your example. That said, we can easily perform outer like products
===
This reproduces your R_product (at least in a few test cases):
In [138]: def foo(X,c):
     ...:     X1 = X.ravel()
     ...:     Y = np.resize(c,X1.shape)*X1
     ...:     return Y.reshape(X.shape, order='F')
     ...:
In [139]: foo(np.ones((2,3)),np.arange(1,4))
Out[139]:
array([[1., 3., 2.],
       [2., 1., 3.]])
In [140]: foo(np.arange(6).reshape(2,3),np.arange(1,4))
Out[140]:
array([[ 0,  6,  8],
       [ 2,  3, 15]])
I'm using the resize function to replicate c to match the total number of elements of X. And order F to stack them in the desired column order. The default for numpy is order C.
In numpy replicating an array to match another is not common, at least not in this sense. Replicating by row or column, as in broadcasting is common. And of course reshaping.
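For comparison, here is a minimal sketch of the recycling rule as I understand it from R: matrices are stored column by column, and the shorter vector is repeated along that storage order. The name r_multiply is hypothetical. Unlike foo, it also flattens X in column-major order, which makes a difference once X is not constant.

```python
import numpy as np

def r_multiply(X, c):
    """Hypothetical helper mimicking R's X * c for a 2-D array X and a 1-D vector c."""
    X = np.asarray(X)
    flat = X.ravel(order='F')              # column-major, as R stores matrices
    recycled = np.resize(c, flat.shape)    # repeat c until it covers every element
    return (flat * recycled).reshape(X.shape, order='F')

print(r_multiply(np.ones((2, 3), dtype=int), [1, 2, 3]))
# [[1 3 2]
#  [2 1 3]]
print(r_multiply(np.arange(6).reshape(2, 3), np.arange(1, 4)))
# [[ 0  3  4]
#  [ 6  4 15]]
```

Element [i, j] of the result is X[i, j] * c[(j * nrow + i) % len(c)], which is the same indexing the loop-based R_product performs.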
I am the OP.
I was looking for a quick and easy solution, but I guess there is no straightforward functionality in Python that allows us to do this. So, I had to make a function that multiplies a matrix with a vector in the same manner that R does:
def R_product(X,c):
    """
    Computes the regular R product
    (not same as the matrix product) between
    a 2D Numpy Array X, and a numpy vector c.
    Args:
        X: 2D Numpy Array
        c: A Numpy vector
    Returns: the output of X*c in R.
    (This is different than X %*% c in R)
    """
    X_nrow = X.shape[0]
    X_ncol = X.shape[1]
    X_dummy = np.zeros(shape=((X_nrow * X_ncol),1))
    nrow = X_dummy.shape[0]
    nc = nrow // len(c)
    Y = np.zeros(shape=(nrow,1))
    # Flatten X column by column into X_dummy
    for j in range(X_ncol):
        for u in range(X_nrow):
            X_element = X[u,j]
            if u == X_nrow - 1:
                idx = X_nrow * (j+1) - 1
            else:
                idx = X_nrow * j + (u+1) - 1
            X_dummy[idx,0] = X_element
    # Multiply by c, repeating c as many whole times as it fits
    for i in range(nc):
        for j in range(len(c)):
            Y[(i*len(c)+j):(i*len(c)+j+1),:] = (X_dummy[(i*len(c)+j):(i*len(c)+j+1),:]) * c[j]
    # Handle the leftover elements when len(c) does not divide the total
    for z in range(nrow-nc*len(c)):
        Y[(nc*len(c)+z):(nc*len(c)+z+1),:] = (X_dummy[(nc*len(c)+z):(nc*len(c)+z+1),:]) * c[z]
    return Y.reshape(X_ncol, X_nrow).transpose() # the answer I am looking for
Should work.

Aggregate elements based on position vector

I'm trying to vectorize a very simple operation but can't seem to figure out how.
Given a very large numerical vector (over 1M positions) and another array of size n holding sets of positions, I would like to get back a vector of size n whose elements are the averages of the values of the first vector at the positions given by the second:
a = np.array([1,2,3,4,5,6,7])
b = np.array([[0,1],[2],[3,5],[4,6]], dtype=object)  # ragged, so recent NumPy needs dtype=object
c = [1.5,3,5,6]
I need to repeat this operation many times so performance is an issue.
Vanilla python solution:
import numpy as np
import time
a = np.array([1,2,3,4,5,6,7])
b = np.array([[0,1],[2],[3,5],[4,6]], dtype=object)  # ragged, so recent NumPy needs dtype=object
begin = time.time()
for i in range(100000):
    c = []
    for d in b:
        c.append(np.mean(a[d]))
print(time.time() - begin, c)
# 3.7529971599578857 [1.5, 3.0, 5.0, 6.0]
I'm not sure if this is necessarily faster but you may as well try:
import numpy as np
a = np.array([1, 2, 3, 4, 5, 6, 7])
b = np.array([[0, 1], [2], [3, 5], [4, 6]], dtype=object)  # ragged, so recent NumPy needs dtype=object
# Get the length of each subset of indices
lens = np.fromiter((len(bi) for bi in b), count=len(b), dtype=np.int32)
# Compute reduction indices
reduce_idx = np.roll(np.cumsum(lens), 1)
reduce_idx[0] = 0
# Make flattened array of index lists
idx = np.fromiter((i for bi in b for i in bi), count=lens.sum(), dtype=np.int32)
# Reorder according to indices
a2 = a[idx]
# Sum reordered array at reduction indices and divide by number of indices
c = np.add.reduceat(a2, reduce_idx) / lens
print(c)
# [1.5 3. 5. 6. ]
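Another option, offered as a sketch and not as a measured improvement, is to label every flattened index with its group number and let np.bincount do the per-group sums. Here b is assumed to be a plain list of index lists, and groups is a name introduced only for this illustration. If b is reused across many calls, lens, idx and groups can be computed once up front.

```python
import numpy as np

a = np.array([1, 2, 3, 4, 5, 6, 7])
b = [[0, 1], [2], [3, 5], [4, 6]]   # plain list of index lists

# Precompute once if b is reused across many calls
lens = np.array([len(bi) for bi in b])
idx = np.concatenate(b)                        # flattened indices
groups = np.repeat(np.arange(len(b)), lens)    # group id for each flattened index

# Sum the selected values per group, then divide by the group size
c = np.bincount(groups, weights=a[idx], minlength=len(b)) / lens
print(c)  # [1.5 3.  5.  6. ]
```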

Returning an numpy array of array based on a list of parameters

I have some very simple code to compute vertical motion. I have set some initial conditions (in this case v0s). Instead of running a for loop over each of the v0s, is there any way to "apply" each v0 to the x linspace and get an array of NumPy arrays?
import numpy as np
v0s = [1, 2, 3]
g = 9.81
def VerticalPosition(v0,g,t):
    return(v0*t - 0.5 * g * t**2)

def Solution(v0,g):
    return(2*v0/g)

def Apex(v0,g):
    return(VerticalPosition(v0,g,v0/g))

x=np.linspace(0,Solution(max(v0s),g),101)
y=[]
for v0 in v0s:
    y.append(VerticalPosition(v0,g,x))
While @pekapa's answer (which returns a 2d array of floats) is what most would recommend, here is a method that produces an array of arrays.
y = np.frompyfunc(lambda a, b: VerticalPosition(a, b, x), 2, 1)(v0s, g)
Arrays of arrays are useful when the inner arrays have different shapes. (Not the case in the present example).
Regarding the use of x in the above expression: it is taken from the enclosing (not necessarily global) scope, but with a bit of care that can be managed. The easiest way is to wrap it in a function and make it explicit. Since the inner functions are evaluated immediately and then discarded, x being mutable poses no problem here.
def capsule(v0s, g, x):
    return np.frompyfunc(lambda a, b: VerticalPosition(a, b, x), 2, 1)(v0s, g)
Here is an example that essentially only works with an array of arrays:
a,b = np.ogrid[1:4, 5:9:2]
np.frompyfunc(np.arange, 2, 1)(a, b)
# array([[array([1, 2, 3, 4]), array([1, 2, 3, 4, 5, 6])],
#        [array([2, 3, 4]), array([2, 3, 4, 5, 6])],
#        [array([3, 4]), array([3, 4, 5, 6])]], dtype=object)
You just need to work with arrays throughout, and in your case that's quite simple.
Try making v0s a column vector:
v0s = np.array([[1], [2], [3]])
Note that it's a 3x1 array; v0s.shape should be (3, 1).
Your x linspace is already a 1-D array; x.shape is (101,).
Now you can just multiply them, and broadcasting gives a result of shape (3, 101). Or call VerticalPosition directly with your new v0s array, i.e.
y = VerticalPosition(v0s, g, x)
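Put together, the broadcasting approach can be sketched as below. VerticalPosition is copied from the question; the time grid is built from the largest v0 in the same way as the question's Solution function.

```python
import numpy as np

def VerticalPosition(v0, g, t):
    return v0 * t - 0.5 * g * t**2

g = 9.81
v0s = np.array([[1], [2], [3]])              # shape (3, 1)
x = np.linspace(0, 2 * v0s.max() / g, 101)   # shape (101,)

# (3, 1) combined with (101,) broadcasts to (3, 101): one row per v0
y = VerticalPosition(v0s, g, x)
print(y.shape)                               # (3, 101)

# Row 1 matches what the loop version computes for v0 = 2
print(np.allclose(y[1], VerticalPosition(2, g, x)))  # True
```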

Indexing from an ndimensional array - numpy/ python

I am working with matrices of (x,y,z) dimensions, and would like to index numerous values from this matrix simultaneously.
ie. if the index A[0,0,0] = 5
and A[1,1,1] = 10
A[[1,1,1], [5,5,5]] = [5, 10]
However, indexing like this seems to return huge chunks of the matrix.
Does anyone know how I can accomplish this? I have a large array of indices (n, x, y, z) that I need to use to index into A.
Thanks
You are trying to use 1 as the first index 3 times and 5 as the index into the second dimension (again three times). This will give you the element at A[1,5,:] repeated three times.
A = np.random.rand(6,6,6)
B = A[[1,1,1], [5,5,5]]
# [[ 0.17135991, 0.80554887, 0.38614418, 0.55439258, 0.66504806, 0.33300839],
# [ 0.17135991, 0.80554887, 0.38614418, 0.55439258, 0.66504806, 0.33300839],
# [ 0.17135991, 0.80554887, 0.38614418, 0.55439258, 0.66504806, 0.33300839]]
B.shape
# (3, 6)
Instead, you will want to specify [1,5] for each axis of your matrix.
A[[1,5], [1,5], [1,5]] = [5, 10]
Advanced indexing works like this:
A[I, J, K][n] == A[I[n], J[n], K[n]]
with A, I, J, and K all arrays. That's not the full, general rule, but it's what the rules simplify down to for what you need.
For example, if you want output[0] == A[0, 0, 0] and output[1] == A[1, 1, 1], then your I, J, and K arrays should look like np.array([0, 1]). Lists also work:
A[[0, 1], [0, 1], [0, 1]]
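When the indices arrive as one array with a row per point, they can be split into one index array per axis. This is a minimal sketch assuming the index array has shape (n, 3), one (i, j, k) triple per row; the sample values are made up for illustration.

```python
import numpy as np

A = np.arange(6 * 6 * 6).reshape(6, 6, 6)

# Assumed layout: one (i, j, k) triple per row
idx = np.array([[0, 0, 0],
                [1, 1, 1],
                [1, 5, 2]])

# Split the columns into one index array per axis
values = A[idx[:, 0], idx[:, 1], idx[:, 2]]
print(values)   # [ 0 43 68]

# Equivalent: transpose and pass a tuple of index arrays
print(np.array_equal(values, A[tuple(idx.T)]))  # True
```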
