Python - Obtain indices of intersecting values in two arrays [duplicate] - python

This question already has answers here:
Finding indices of matches of one array in another array
(4 answers)
Closed 2 years ago.
If I have two arrays:
A=[1,2,3,4,5,6,7]
B=[2,4,7]
I would like to obtain an array C that contains the indices in A of the values that are also found in B:
C=[1,3,6]
I'm quite new to Python, and I'm frustrated at not being able to find an elegant solution to such a simple task without using a loop combined with numpy.where().
Thanks in advance!!

Here's a linear-time solution: to efficiently test whether an element is in B, convert B to a set first.
B_set = set(B)
C = [i for i, x in enumerate(A) if x in B_set]
For large inputs, this is better than using .index in a loop, since that repeatedly searches the list and takes O(mn) time, where m and n are the sizes of A and B. In comparison, the solution above takes O(m + n) time: O(n) to build the set and O(m) to build the result list.
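If the indices are wanted in the order of B rather than the order of A, a dictionary from value to position keeps the work linear. This is only a sketch: the helper name indices_in_order_of_b is made up for illustration, and it assumes that the first occurrence in A is the one of interest and that values of B missing from A should be skipped.

```python
def indices_in_order_of_b(A, B):
    """Return, for each value of B that occurs in A, its first index in A."""
    first_pos = {}
    for i, x in enumerate(A):
        # setdefault keeps the first index seen for a repeated value
        first_pos.setdefault(x, i)
    # Values of B that never occur in A are skipped (an assumption of this sketch)
    return [first_pos[x] for x in B if x in first_pos]


A = [1, 2, 3, 4, 5, 6, 7]
B = [7, 2, 4, 99]
print(indices_in_order_of_b(A, B))  # [6, 1, 3]
```

Building the dictionary is O(m) and the lookups are O(n), so the total stays O(m + n).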

You can use np.isin and np.nonzero.
import numpy as np
a = np.array([1, 2, 3, 4, 5, 6, 7])
b = np.array([2, 4, 7])
c = np.nonzero(np.isin(a, b))[0]
# array([1, 3, 6], dtype=int64)

NumPy has a dedicated function for this, intersect1d. Passing return_indices=True makes it also return the indices of the common values in each input array.
import numpy as np
a = np.array([1,2,3,4,5,6,7])
b = np.array([2,4,7])
c = np.intersect1d(a, b, return_indices=True)[1]
# array([1, 3, 6], dtype=int64)
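The two NumPy answers agree on the example because a is sorted and has no repeats. The sketch below uses a deliberately unsorted a (chosen here for illustration, not taken from the question) to show where they differ: np.isin reports positions in the order of a, while np.intersect1d reports them in the order of the sorted common values.

```python
import numpy as np

a = np.array([7, 2, 4, 1])  # unsorted on purpose
b = np.array([2, 4, 7])

# Positions in a whose value occurs in b, in the order of a
idx_isin = np.nonzero(np.isin(a, b))[0]

# Sorted common values, plus the index of each one in a and in b
common, idx_a, idx_b = np.intersect1d(a, b, return_indices=True)

print(idx_isin)  # [0 1 2]
print(common)    # [2 4 7]
print(idx_a)     # [1 2 0]
```

Which ordering is preferable depends on whether the result should follow a or the sorted intersection.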

You can iterate over B and call index() on A for each value:
c = [A.index(i) for i in B]
As per @kaya3's comment, if B can contain values that are not in A, you can add a membership check:
c = [A.index(i) for i in B if i in A]

You can use the index function.
A = [1,2,3,4,5,6,7]
B = [2,4,7]
C = [A.index(b) for b in B]
# or, equivalently, with map:
C = list(map(lambda b: A.index(b), B))

Related

finding index/location of value in 1D NumPy array (with identical values) [duplicate]

This question already has answers here:
Is there a NumPy function to return the first index of something in an array?
(20 answers)
Closed 3 years ago.
I want to find the index/location of a randomly chosen number in a 1D NumPy array within the array itself, but when I try the following:
a = np.array(np.linspace(1,10,10))
b = np.random.choice(a)
print(a.index(b))
It doesn't work, and I can't figure out where the problem is. Does anyone have an idea?
Thanks in advance!
EDIT: how do you only index the randomly chosen value if the values in the NumPy array are identical, for example:
a = np.array(np.linspace(10,10,10))
You have to use the where function, as already answered in "Is there a NumPy function to return the first index of something in an array?":
import numpy as np
a = np.array(np.linspace(1,10,10))
b = np.random.choice(a)
print(np.where(a==b))
If the values are all the same, where returns multiple indices. For example:
a = np.array(np.linspace(10,10,10))
print(np.where(a==10))
>>> (array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9]),)
Here all indices are returned because 10 is in every position.
This will give you your desired output:
np.where(a==b)[0][0]
NumPy's where() function does what you want, as described in other answers. If you only have 1D arrays and you only want the index of the first element in a which equals b, where() is both clunky and inefficient. Instead, you may use
import numpy as np
a = np.linspace(1, 10, 10) # Hint: the np.array() is superfluous
b = np.random.choice(a)
index = next(i for i, el in enumerate(a) if el == b)
print(index, a[index])
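Note that next() raises StopIteration when no element matches. A small variation, sketched here with a hypothetical helper name first_index and an arbitrary sentinel of -1, passes a default to next() so that a missing value is reported instead of raising:

```python
import numpy as np


def first_index(arr, value, default=-1):
    """Return the index of the first element equal to value, or default."""
    # The second argument to next() is returned when the generator is exhausted
    return next((i for i, el in enumerate(arr) if el == value), default)


a = np.linspace(10, 10, 10)  # ten identical values, as in the question's edit
print(first_index(a, 10.0))  # 0
print(first_index(a, 3.0))   # -1
```

The scan stops at the first match, so identical values later in the array are never visited.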

Check how many numpy array within a numpy array are equal to other numpy arrays within another numpy array of different size

My problem
Suppose I have
a = np.array([ np.array([1,2]), np.array([3,4]), np.array([5,6]), np.array([7,8]), np.array([9,10])])
b = np.array([ np.array([5,6]), np.array([1,2]), np.array([3,192])])
They are two arrays, of different sizes, containing other arrays (the inner arrays have same sizes!)
I want to count how many items of b (i.e. inner arrays) are also in a. Notice that I am not considering their position!
How can I do that?
My Try
count = 0
for bitem in b:
    for aitem in a:
        if aitem==bitem:
            count+=1
Is there a better way? Especially in one line, maybe with some comprehension..
The numpy_indexed package contains efficient (nlogn, generally) and vectorized solutions to these types of problems:
import numpy_indexed as npi
count = len(npi.intersection(a, b))
Note that this is subtly different than your double loop, discarding duplicate entries in a and b for instance. If you want to retain duplicates in b, this would work:
count = npi.in_(b, a).sum()
Duplicate entries in a could also be handled by calling npi.count(a) and factoring in the result; I'm only mentioning this for illustration, since I imagine the distinction probably does not matter to you.
Here is a simple way to do it:
a = np.array([ np.array([1,2]), np.array([3,4]), np.array([5,6]), np.array([7,8]), np.array([9,10])])
b = np.array([ np.array([5,6]), np.array([1,2]), np.array([3,192])])
count = np.count_nonzero(
np.any(np.all(a[:, np.newaxis, :] == b[np.newaxis, :, :], axis=-1), axis=0))
print(count)
>>> 2
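To make the broadcasting in that one-liner easier to follow, here is the same computation split into named steps with the intermediate shapes noted. The variable names are chosen for illustration, and a and b are written as plain 2D arrays, which is what the nested np.array calls produce anyway.

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])  # shape (5, 2)
b = np.array([[5, 6], [1, 2], [3, 192]])                 # shape (3, 2)

# Compare every row of a with every row of b via broadcasting
elementwise = a[:, np.newaxis, :] == b[np.newaxis, :, :]  # shape (5, 3, 2)
rows_equal = elementwise.all(axis=-1)                     # shape (5, 3)
b_row_in_a = rows_equal.any(axis=0)                       # shape (3,)

print(b_row_in_a)                    # [ True  True False]
print(np.count_nonzero(b_row_in_a))  # 2
```

The intermediate array holds len(a) * len(b) * row_length booleans, so memory use grows with the product of the two sizes.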
You can do what you want in one liner as follows:
count = sum([np.array_equal(x,y) for x,y in product(a,b)])
Explanation
Here's an explanation of what's happening:
1. Iterate through the two arrays using itertools.product, which creates an iterator over the Cartesian product of the two arrays.
2. Compare the two arrays in each tuple (x,y) coming from step 1 using np.array_equal.
3. True counts as 1 when using sum on a list.
Full example:
The final code looks like this:
import numpy as np
from itertools import product
a = np.array([ np.array([1,2]), np.array([3,4]), np.array([5,6]), np.array([7,8]), np.array([9,10])])
b = np.array([ np.array([5,6]), np.array([1,2]), np.array([3,192])])
count = sum([np.array_equal(x,y) for x,y in product(a,b)])
# output: 2
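The product-based count compares every pair, so it takes time proportional to len(a) * len(b), and a row of b is counted once for each matching row of a. As an alternative sketch (not taken from the answers here), the rows can be turned into tuples and looked up in a set, which counts each row of b at most once:

```python
import numpy as np

a = np.array([[1, 2], [3, 4], [5, 6], [7, 8], [9, 10]])
b = np.array([[5, 6], [1, 2], [3, 192]])

# tolist() yields plain Python ints, so the row tuples hash and compare cleanly
a_rows = {tuple(row) for row in a.tolist()}

# Count the rows of b that also appear somewhere in a
count = sum(tuple(row) in a_rows for row in b.tolist())
print(count)  # 2
```

This assumes exact equality is what is wanted; for floating-point rows with rounding noise, a tolerance-based comparison would be needed instead.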
You can view each row as a single element of dtype np.void and then use np.in1d on the resulting 1D arrays:
def void_arr(a):
    return np.ascontiguousarray(a).view(np.dtype((np.void, a.dtype.itemsize * a.shape[1])))
b[np.in1d(void_arr(b), void_arr(a))]
array([[5, 6],
[1, 2]])
If you just want the number of intersections, it's
np.in1d(void_arr(b), void_arr(a)).sum()
2
Note: if there are repeat items in b or a, then np.in1d(void_arr(b), void_arr(a)).sum() likely won't be equal to np.in1d(void_arr(a), void_arr(b)).sum(). I've reversed the order from my original answer to match your question (i.e. how many elements of b are in a?)
For more information, see the third answer here

Check if all elements in a 2d array are equal to 1 in Python [duplicate]

This question already has answers here:
How to check if all elements of a list match a condition?
(5 answers)
Closed 6 years ago.
What's the most Pythonic way to do this, other than iterating through nested loops and checking if each value is equal to 1?
If you're using numpy, you can use its per-element equality check and then call the resulting array's all method to check whether every element satisfies the condition:
>>> import numpy as np
>>> c = np.array([[1,2], [3,4]])
>>> (c==1).all()
False
>>> c = np.array([[1,1], [1,1]])
>>> (c==1).all()
True
Use itertools.chain to iterate over the "flattened" list:
import itertools
all(x == 1 for x in itertools.chain(*my_list))
Using sets:
from itertools import chain
array = [[1,1,1],[1,0,1]]
if set(chain.from_iterable(array)) == {1}:
    print("all ones")
You can use np.searchsorted along with np.apply_along_axis to find the elements in any dimensional array.
A generator expression combined with all is probably the way to go.
array_2D = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
all(all(item == 1 for item in items) for items in array_2D)
Another option would be to flatten the array first. Python has no built-in for flattening arbitrarily nested lists, but that approach would handle any number of dimensions (if you flatten deeply).
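A deep flatten along those lines could be sketched as a recursive generator. The name flatten is a hypothetical helper, and the sketch assumes that strings and bytes should be treated as single items rather than iterated character by character.

```python
from collections.abc import Iterable


def flatten(items):
    """Yield the leaf values of an arbitrarily nested iterable."""
    for item in items:
        # Recurse into nested iterables, but treat str/bytes as leaves
        if isinstance(item, Iterable) and not isinstance(item, (str, bytes)):
            yield from flatten(item)
        else:
            yield item


nested = [[1, [1, 1]], [1], 1]
print(all(x == 1 for x in flatten(nested)))    # True
print(all(x == 1 for x in flatten([[1, 0]])))  # False
```

Because flatten is a generator, all() can stop at the first value that is not 1 without flattening the rest.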

Find whether a numpy array is a subset of a larger array in Python

I have 2 arrays, for the sake of simplicity let's say the original one is a random set of numbers:
import numpy as np
a=np.random.rand(N)
Then I sample and shuffle a subset from this array:
b = np.array(...)  # shuffled sample of a, size < N
The shuffling I do does not store the index values, so b is an unordered subset of a.
Is there an easy way to recover the original indices of b, that is, where each element of b sits in a? For example, if element 2 of b has index 4 in a, I would like an array that records that assignment.
I could use a for loop and check element by element, but perhaps there is a more Pythonic way.
Thanks
I think the most computationally efficient thing to do is to keep track of the indices that associate b with a as b is created.
For example, instead of sampling a, sample the indices of a:
import random
indices = random.sample(range(len(a)), k) # k < N
b = a[indices]
On the off chance a happens to be sorted you could do:
>>> from numpy import array
>>> a = array([1, 3, 4, 10, 11])
>>> b = array([11, 1, 4])
>>> a.searchsorted(b)
array([4, 0, 2])
If a is not sorted, you're probably best off going with something like @unutbu's answer.
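For an unsorted a, one possible sketch combines np.argsort with the sorter argument of np.searchsorted. It assumes that the values in a are unique and that every element of b actually occurs in a; if either assumption fails, the indices it returns may be wrong. The example values are made up for illustration.

```python
import numpy as np

a = np.array([10, 3, 11, 1, 4])  # unsorted
b = np.array([11, 1, 4])         # shuffled subset of a

order = np.argsort(a)                      # indices that would sort a
pos = np.searchsorted(a, b, sorter=order)  # positions of b within sorted a
idx = order[pos]                           # map back to positions in original a

print(idx)     # [2 3 4]
print(a[idx])  # [11  1  4]
```

Checking that a[idx] equals b afterwards is a cheap way to confirm that the assumptions held.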

Excluding common element from list python [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Python, compute list difference
I have two lists
For example:
A = [1,3,5,7]
B = [1,2,3,4,5,6,7,8]
Now, A is always a subset of B
I want to generate a third list C:
which has elements which are present in B but absent in A
like
C = [2,4..]
Thanks
List comprehensions are one way to do this:
[x for x in B if x not in A]
If you use Python, I recommend gaining familiarity with list comprehensions. They're a very powerful tool.
(Several people have suggested using set. While this is a very good idea if you only care about whether or not an element is in the set, note that it will not preserve the order of the elements; a list comprehension will.)
>>> set(B) - set(A)
set([8, 2, 4, 6])
or
>>> sorted(set(B) - set(A))
[2, 4, 6, 8]
An easy way to do this is
C = [x for x in B if x not in A]
This will become slow for big lists, so it would be better to use a set for A:
A = set(A)
C = [x for x in B if x not in A]
If you have multiple operations like this, using sets all the time might be the best option. If A and B are sets, you can simply do
C = B - A
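The difference between the two approaches shows up once B is unsorted or contains repeats. The lists below are made up for illustration (they are not the question's data): the comprehension keeps B's order and duplicates, while the set difference keeps neither.

```python
A = [1, 3]
B = [8, 1, 2, 2, 3, 6]  # unsorted, with a repeated 2

exclude = set(A)  # set membership tests are O(1) on average

# Keeps the order of B and any repeated values
kept_in_order = [x for x in B if x not in exclude]

# Set difference drops repeats; sorted() imposes an order of its own
as_sorted_set = sorted(set(B) - exclude)

print(kept_in_order)  # [8, 2, 2, 6]
print(as_sorted_set)  # [2, 6, 8]
```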
C = sorted(list(set(B) - set(A)))
That should do it.
