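I have been trying to construct the matrix Dij, defined as
Dij = -(2/N) Σ_k k sin[k π (2 i + 1) / (2 N)] cos[k π (2 j + 1) / (2 N)] / sin[π (2 i + 1) / (2 N)]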
I want to plot it for points located at xi = -cos[ π (2 i + 1) / (2 N)] on the interval [-1,1] and subsequently use it to take derivatives of a function. However, I am having problems constructing the differentiation matrix Dij.
I have written a python script as:
import numpy as np
N = 100
x = np.linspace(-1,1,N-1)
for i in range(0, N - 1):
    x[i] = -np.cos(np.pi*(2*i + 1)/2*N)
def Dmatrix(x,N):
    m_ij = np.zeros(3)
    for k in range(len(x)):
        for j in range(len(x)):
            for i in range(len(x)):
                m_ij[i,j,k] = -2/N*((k*np.sin(k*np.pi*(2*i + 1)/2*N(np.cos(k*np.pi*(2*j +1))/2*N)/(np.sin(np.pi*(2*i + 1)/2*N)))
    return m_ij
xx = Dmatrix(x,N)
This thus returns the error:
IndexError: too many indices for array
Is there a way one could more efficiently construct this and successfully compute it over all k ?
The goal will be to multiply this matrix by a function and sum over j to get the first order derivative of given function.
m_ij = np.zeros(3) doesn't make a three-dimensional array, it makes an array with one dimension of length 3.
In [1]: import numpy as np
In [2]: m_ij = np.zeros(3)
In [3]: print(m_ij)
[0. 0. 0.]
I suspect you want (as a simple fix)
len_x = len(x)
m_ij = np.zeros((len_x, len_x, len_x))
Look at your x calc by itself
In [418]: N = 10
     ...: x = np.linspace(-1,1,N-1)
     ...: y = np.zeros(N)
     ...: for i in range(N):
     ...:     y[i] = -np.cos(np.pi*(2*i + 1)/2*N)
     ...:
In [419]: x
Out[419]: array([-1. , -0.75, -0.5 , -0.25, 0. , 0.25, 0.5 , 0.75, 1. ])
In [420]: y
Out[420]: array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])
In [421]: (2*np.arange(N)+1)
Out[421]: array([ 1, 3, 5, 7, 9, 11, 13, 15, 17, 19])
In [422]: (2*np.arange(N)+1)/2*N
Out[422]: array([ 5., 15., 25., 35., 45., 55., 65., 75., 85., 95.])
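I separated x and y, because otherwise it doesn't make any sense to create x and then overwrite it.
The y values don't look interesting because they are all just -cos of odd whole multiples of pi: the expression (2*i + 1)/2*N divides by 2 and then multiplies by N, whereas the formula calls for division by (2*N).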
Note how I use np.arange instead of looping on range.
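The node formula from the question can be sketched in vectorized form as follows. This is a minimal sketch: the helper name chebyshev_nodes is introduced here for illustration, and it assumes the intended expression is xi = -cos[π (2 i + 1) / (2 N)] as stated in the question.

```python
import numpy as np

def chebyshev_nodes(N):
    """Return x_i = -cos(pi * (2*i + 1) / (2*N)) for i = 0..N-1."""
    i = np.arange(N)
    # Parenthesize the denominator: "/2*N" would divide by 2, then multiply by N.
    return -np.cos(np.pi * (2 * i + 1) / (2 * N))

x = chebyshev_nodes(10)
print(x)
```

The resulting points lie strictly inside (-1, 1) and increase with i.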
The formula for Dij can be implemented as
def D(N):
    from numpy import zeros, pi, sin, cos
    D = zeros((N, N))
    for i in range(N):
        for j in range(N):
            for k in range(N):
                D[i,j] -= k*sin(k*pi*(i+i+1)/2/N)*cos(k*pi*(j+j+1)/2/N)
            D[i,j] /= sin(pi*(i+i+1)/2/N)
    return D*2/N
It could be convenient to vectorize the inner loop.
On second thought, the whole procedure can be vectorized using np.einsum (at the end I have also included some timings; the einsum version is, of course, far faster than the triple loop):
In [1]: from numpy import set_printoptions ; set_printoptions(linewidth=120)
In [2]: def D(N):
   ...:     from numpy import zeros, pi, sin, cos
   ...:     D = zeros((N, N))
   ...:     for i in range(N):
   ...:         for j in range(N):
   ...:             for k in range(N):
   ...:                 D[i,j] -= k * sin(k*pi*(2*i+1)/2/N) * cos(k*pi*(2*j+1)/2/N)
   ...:             D[i,j] /= sin(pi*(2*i+1)/2/N)
   ...:     return D*2/N
In [3]: def E(N):
   ...:     from numpy import arange, cos, einsum, outer, pi, sin
   ...:     i = j = k = arange(N)
   ...:     s_i = sin((2*i+1)*pi/2/N)
   ...:     s_ki = sin(outer(k,(2*i+1)*pi/2/N))
   ...:     c_kj = cos(outer(k,(2*j+1)*pi/2/N))
   ...:     return -2/N*einsum('k, ki, kj -> ij', k, s_ki, c_kj) / s_i[:,None]
In [4]: for N in (3,4,5):
   ...:     print(D(N)) ; print(E(N)) ; print('==========')
   ...:
[[-1.73205081e+00 2.30940108e+00 -5.77350269e-01]
[-5.77350269e-01 1.22464680e-16 5.77350269e-01]
[ 5.77350269e-01 -2.30940108e+00 1.73205081e+00]]
[[-1.73205081e+00 2.30940108e+00 -5.77350269e-01]
[-5.77350269e-01 1.22464680e-16 5.77350269e-01]
[ 5.77350269e-01 -2.30940108e+00 1.73205081e+00]]
==========
[[-3.15432203 4.46088499 -1.84775907 0.5411961 ]
[-0.76536686 -0.22417076 1.30656296 -0.31702534]
[ 0.31702534 -1.30656296 0.22417076 0.76536686]
[-0.5411961 1.84775907 -4.46088499 3.15432203]]
[[-3.15432203 4.46088499 -1.84775907 0.5411961 ]
[-0.76536686 -0.22417076 1.30656296 -0.31702534]
[ 0.31702534 -1.30656296 0.22417076 0.76536686]
[-0.5411961 1.84775907 -4.46088499 3.15432203]]
==========
[[-4.97979657e+00 7.20682930e+00 -3.40260323e+00 1.70130162e+00 -5.25731112e-01]
[-1.05146222e+00 -4.49027977e-01 2.10292445e+00 -8.50650808e-01 2.48216561e-01]
[ 3.24919696e-01 -1.37638192e+00 2.44929360e-16 1.37638192e+00 -3.24919696e-01]
[-2.48216561e-01 8.50650808e-01 -2.10292445e+00 4.49027977e-01 1.05146222e+00]
[ 5.25731112e-01 -1.70130162e+00 3.40260323e+00 -7.20682930e+00 4.97979657e+00]]
[[-4.97979657e+00 7.20682930e+00 -3.40260323e+00 1.70130162e+00 -5.25731112e-01]
[-1.05146222e+00 -4.49027977e-01 2.10292445e+00 -8.50650808e-01 2.48216561e-01]
[ 3.24919696e-01 -1.37638192e+00 2.44929360e-16 1.37638192e+00 -3.24919696e-01]
[-2.48216561e-01 8.50650808e-01 -2.10292445e+00 4.49027977e-01 1.05146222e+00]
[ 5.25731112e-01 -1.70130162e+00 3.40260323e+00 -7.20682930e+00 4.97979657e+00]]
==========
In [5]: %timeit D(20)
36 ms ± 277 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [6]: %timeit E(20)
146 µs ± 777 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [7]: %timeit D(100)
4.35 s ± 30.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [8]: %timeit E(100)
7.7 ms ± 2.82 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
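Since the stated goal is to multiply the matrix by sampled function values and sum over j to obtain a first derivative, here is a minimal sketch of that step. It assumes the nodes are xi = -cos[π (2 i + 1) / (2 N)] from the question and reuses the same einsum contraction as E(N); the name diff_matrix and the test function x**3 are illustrative choices, not part of the original answer.

```python
import numpy as np

def diff_matrix(N):
    """Differentiation matrix built with the same einsum contraction as E(N)."""
    idx = np.arange(N)
    theta = (2 * idx + 1) * np.pi / (2 * N)   # angles of the nodes
    s_ki = np.sin(np.outer(idx, theta))       # sin(k * theta_i)
    c_kj = np.cos(np.outer(idx, theta))       # cos(k * theta_j)
    summed = np.einsum('k,ki,kj->ij', idx, s_ki, c_kj)
    return -2 / N * summed / np.sin(theta)[:, None]

N = 8
x = -np.cos((2 * np.arange(N) + 1) * np.pi / (2 * N))
f = x**3
df = diff_matrix(N) @ f                  # sum over j of D[i, j] * f[j]
print(np.max(np.abs(df - 3 * x**2)))     # deviation from the exact derivative
```

For a polynomial of degree below N the product should agree with the exact derivative up to rounding error; for other functions the accuracy depends on N.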
I am trying to compute a convolution on a scipy.sparse matrix. Here is the code:
import numpy as np
import scipy.sparse, scipy.signal
M = scipy.sparse.csr_matrix([[0,1,0,0],[1,0,0,1],[1,0,1,0],[0,0,0,0]])
kernel = np.ones((3,3))
kernel[1,1]=0
X = scipy.signal.convolve(M, kernel, mode='same')
Which produces the following error:
ValueError: volume and kernel should have the same dimensionality
Computing scipy.signal.convolve(M.todense(), kernel, mode='same') provides the expected result. However, I would like to keep the computation sparse.
More generally speaking, my goal is to compute the 1-hop neighbourhood sum of the sparse matrix M. If you have any good ideas on how to calculate this on a sparse matrix, I would love to hear them!
EDIT:
I just tried a solution for this specific kernel (sum of neighbors) that is not really faster than the dense version (I didn't try in a very high dimension though). Here is the code:
row_ind, col_ind = M.nonzero()
X = scipy.sparse.csr_matrix((M.shape[0]+2, M.shape[1]+2))
for i in [0, 1, 2]:
    for j in [0, 1, 2]:
        if i!= 1 or j !=1:
            X += scipy.sparse.csr_matrix( (M.data, (row_ind+i, col_ind+j)), (M.shape[0]+2, M.shape[1]+2))
X = X[1:-1, 1:-1]
In [1]: from scipy import sparse, signal
In [2]: M = sparse.csr_matrix([[0,1,0,0],[1,0,0,1],[1,0,1,0],[0,0,0,0]])
...: kernel = np.ones((3,3))
...: kernel[1,1]=0
In [3]: X = signal.convolve(M.A, kernel, mode='same')
In [4]: X
Out[4]:
array([[2., 1., 2., 1.],
[2., 4., 3., 1.],
[1., 3., 1., 2.],
[1., 2., 1., 1.]])
Why do posters show runnable code, but not the results? Most of us can't run code like this in our heads.
In [5]: M.A
Out[5]:
array([[0, 1, 0, 0],
[1, 0, 0, 1],
[1, 0, 1, 0],
[0, 0, 0, 0]])
Your alternative - while the result is a sparse matrix, all values are filled. Even if M is larger and sparser, X will be denser.
In [7]: row_ind, col_ind = M.nonzero()
   ...: X = sparse.csr_matrix((M.shape[0]+2, M.shape[1]+2))
   ...: for i in [0, 1, 2]:
   ...:     for j in [0, 1, 2]:
   ...:         if i!= 1 or j !=1:
   ...:             X += sparse.csr_matrix( (M.data, (row_ind+i, col_ind+j)), (M.shape[0]+2, M.shape[1]+2))
   ...: X = X[1:-1, 1:-1]
In [8]: X
Out[8]:
<4x4 sparse matrix of type '<class 'numpy.float64'>'
with 16 stored elements in Compressed Sparse Row format>
In [9]: X.A
Out[9]:
array([[2., 1., 2., 1.],
[2., 4., 3., 1.],
[1., 3., 1., 2.],
[1., 2., 1., 1.]])
Here's an alternative that builds the coo style inputs, and only makes the matrix at the end. Keep in mind that repeated coordinates are summed. That's handy in FEM stiffness matrix construction, and fits nicely here as well.
In [10]: row_ind, col_ind = M.nonzero()
    ...: data, row, col = [],[],[]
    ...: for i in [0, 1, 2]:
    ...:     for j in [0, 1, 2]:
    ...:         if i!= 1 or j !=1:
    ...:             data.extend(M.data)
    ...:             row.extend(row_ind+i)
    ...:             col.extend(col_ind+j)
    ...: X = sparse.csr_matrix( (data, (row, col)), (M.shape[0]+2, M.shape[1]+2))
    ...: X = X[1:-1, 1:-1]
In [11]: X
Out[11]:
<4x4 sparse matrix of type '<class 'numpy.int64'>'
with 16 stored elements in Compressed Sparse Row format>
In [12]: X.A
Out[12]:
array([[2, 1, 2, 1],
[2, 4, 3, 1],
[1, 3, 1, 2],
[1, 2, 1, 1]])
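The duplicate-summing behaviour this construction relies on can be seen in isolation with a tiny example. This is only an illustration with made-up data; it uses toarray() rather than the .A shorthand.

```python
import numpy as np
from scipy import sparse

# Two entries share the coordinate (0, 0); one entry sits at (1, 1).
data = [1, 2, 3]
row = [0, 0, 1]
col = [0, 0, 1]

X = sparse.coo_matrix((data, (row, col)), shape=(2, 2))
dense = X.toarray()   # duplicates are added together on conversion
print(dense)
```

The two values stored at (0, 0) end up as a single entry 3 in the dense result.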
===
My approach is noticeably faster (but still well behind the dense convolution). sparse.csr_matrix(...) is pretty slow, so it isn't a good idea to do repeatedly. And sparse addition isn't very good either.
In [13]: %%timeit
    ...: row_ind, col_ind = M.nonzero()
    ...: data, row, col = [],[],[]
    ...: for i in [0, 1, 2]:
    ...:     for j in [0, 1, 2]:
    ...:         if i!= 1 or j !=1:
    ...:             data.extend(M.data)
    ...:             row.extend(row_ind+i)
    ...:             col.extend(col_ind+j)
    ...: X = sparse.csr_matrix( (data, (row, col)), (M.shape[0]+2, M.shape[1]+2))
    ...: X = X[1:-1, 1:-1]
    ...:
793 µs ± 20 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [14]: %%timeit
    ...: row_ind, col_ind = M.nonzero()
    ...: X = sparse.csr_matrix((M.shape[0]+2, M.shape[1]+2))
    ...: for i in [0, 1, 2]:
    ...:     for j in [0, 1, 2]:
    ...:         if i!= 1 or j !=1:
    ...:             X += sparse.csr_matrix( (M.data, (row_ind+i, col_ind+j)), (M.shape[0]+2, M.shape[1]+2))
    ...: X = X[1:-1, 1:-1]
    ...:
4.72 ms ± 92.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [15]: timeit X = signal.convolve(M.A, kernel, mode='same')
85.9 µs ± 339 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
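For the specific neighbour-sum kernel, another possibility that stays sparse throughout is to write the 3x3 box sum as a product with tridiagonal matrices of ones and then subtract M to drop the centre. This is a different technique from the ones timed above, sketched under the assumption that the kernel is exactly the all-ones 3x3 kernel with a zero centre and that cells outside the matrix count as zero; the name neighbour_sum is hypothetical.

```python
import numpy as np
from scipy import sparse

def neighbour_sum(M):
    """Sum of the 8 neighbours of every cell of a 2-D sparse matrix M."""
    n_rows, n_cols = M.shape
    # Tridiagonal matrices of ones: T @ M adds each row to its two neighbours.
    T_rows = sparse.diags([1, 1, 1], [-1, 0, 1], shape=(n_rows, n_rows),
                          format='csr', dtype=M.dtype)
    T_cols = sparse.diags([1, 1, 1], [-1, 0, 1], shape=(n_cols, n_cols),
                          format='csr', dtype=M.dtype)
    # 3x3 box sum, then remove the centre contribution.
    return T_rows @ M @ T_cols - M

M = sparse.csr_matrix([[0, 1, 0, 0], [1, 0, 0, 1], [1, 0, 1, 0], [0, 0, 0, 0]])
X = neighbour_sum(M)
print(X.toarray())
```

The result is generally denser than M, so whether this beats the dense convolution depends on the size and sparsity of the input.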
I have an array of values arr with shape (N,) and an array of coordinates coords with shape (2,N) (one row of row indices and one row of column indices). I want to represent this in an (M,M) array grid such that grid is 0 at coordinates that do not appear in coords and, at every coordinate that does appear, holds the sum of all values in arr that share that coordinate. So if M=3, arr = np.arange(4)+1, and coords = np.array([[0,0,1,2],[0,0,2,2]]) then grid should be:
array([[3., 0., 0.],
[0., 0., 3.],
[0., 0., 4.]])
The reason this is nontrivial is that I need to be able to repeat this step many times and the values in arr change each time, and so can the coordinates. Ideally I am looking for a vectorized solution. I suspect that I might be able to use np.where somehow but it's not immediately obvious how.
Timing the solutions
I have timed the solutions posted so far, and it appears that the accumulator method is slightly faster than the sparse matrix method, with the second accumulation method being the slowest for the reasons explained in the comments:
%timeit for x in range(100): accumulate_arr(np.random.randint(100,size=(2,10000)),np.random.normal(0,1,10000))
%timeit for x in range(100): accumulate_arr_v2(np.random.randint(100,size=(2,10000)),np.random.normal(0,1,10000))
%timeit for x in range(100): sparse.coo_matrix((np.random.normal(0,1,10000),np.random.randint(100,size=(2,10000))),(100,100)).A
47.3 ms ± 1.79 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
103 ms ± 255 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
48.2 ms ± 36 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
One way would be to create a sparse.coo_matrix and convert that to dense:
from scipy import sparse
sparse.coo_matrix((arr,coords),(M,M)).A
# array([[3, 0, 0],
# [0, 0, 3],
# [0, 0, 4]])
With np.bincount -
def accumulate_arr(coords, arr):
    # Get output array shape
    m,n = coords.max(1)+1
    # Get linear indices to be used as IDs with bincount
    lidx = np.ravel_multi_index(coords, (m,n))
    # Or lidx = coords[0]*(coords[1].max()+1) + coords[1]
    # Accumulate arr with IDs from lidx
    return np.bincount(lidx,arr,minlength=m*n).reshape(m,n)
Sample run -
In [58]: arr
Out[58]: array([1, 2, 3, 4])
In [59]: coords
Out[59]:
array([[0, 0, 1, 2],
[0, 0, 2, 2]])
In [60]: accumulate_arr(coords, arr)
Out[60]:
array([[3., 0., 0.],
[0., 0., 3.],
[0., 0., 4.]])
Another with np.add.at on similar lines and might be easier to follow -
def accumulate_arr_v2(coords, arr):
    m,n = coords.max(1)+1
    out = np.zeros((m,n), dtype=arr.dtype)
    np.add.at(out, tuple(coords), arr)
    return out
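Both functions above infer the output shape from coords.max, whereas the question asks for a fixed (M,M) grid. A minimal variant with an explicit grid size could look like this; the function name accumulate_on_grid and its M parameter are assumptions added for illustration, and coords is taken to have shape (2,N) as in the question's example.

```python
import numpy as np

def accumulate_on_grid(coords, arr, M):
    """Sum arr into an (M, M) grid at the (row, col) pairs given by coords."""
    # Flatten each (row, col) pair into a single index of the M*M grid.
    lidx = np.ravel_multi_index(tuple(coords), (M, M))
    # bincount adds up the weights that share an index.
    return np.bincount(lidx, weights=arr, minlength=M * M).reshape(M, M)

arr = np.arange(4) + 1
coords = np.array([[0, 0, 1, 2], [0, 0, 2, 2]])
grid = accumulate_on_grid(coords, arr, 3)
print(grid)
```

Fixing the shape up front keeps the output size stable when the coordinates change between repetitions.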
I have a sorted array of float32 values. I want to split this array into a list of lists, each containing only equal values, like this:
>>> split_sorted(array) # array is [1., 1., 1., 2., 2., 3.]
[[1., 1., 1.], [2., 2.], [3.]]
My current approach is this function:
def split_sorted(array):
split = [[array[0]]]
s_index = 0
a_index = 1
while a_index < len(array):
while a_index < len(array) and array[a_index] == split[s_index][0]:
split[s_index].append(array[a_index])
a_index += 1
else:
if a_index < len(array):
s_index += 1
a_index += 1
split.append([array[a_index]])
My question now is: is there a more Pythonic way to do this, maybe even with numpy? And is this the most performant way?
Thanks a lot!
Approach #1
With a as the array, we can use np.split -
np.split(a,np.flatnonzero(a[:-1] != a[1:])+1)
Sample run -
In [16]: a
Out[16]: array([1., 1., 1., 2., 2., 3.])
In [17]: np.split(a,np.flatnonzero(a[:-1] != a[1:])+1)
Out[17]: [array([1., 1., 1.]), array([2., 2.]), array([3.])]
Approach #2
Another, more performant way would be to get the splitting indices, then slice the array and zip -
idx = np.flatnonzero(np.r_[True, a[:-1] != a[1:], True])
out = [a[i:j] for i,j in zip(idx[:-1],idx[1:])]
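To make the index bookkeeping in Approach #2 concrete, the following sketch wraps it in a function and runs it on the sample array. The wrapper name split_runs and the empty-input guard are additions for illustration.

```python
import numpy as np

def split_runs(a):
    """Split a sorted 1-D array into slices of equal consecutive values."""
    a = np.asarray(a)
    if a.size == 0:
        return []
    # True marks the start of each run, plus one sentinel at the very end.
    starts = np.r_[True, a[:-1] != a[1:], True]
    idx = np.flatnonzero(starts)          # for the sample: [0, 3, 5, 6]
    return [a[i:j] for i, j in zip(idx[:-1], idx[1:])]

a = np.array([1., 1., 1., 2., 2., 3.], dtype=np.float32)
parts = split_runs(a)
print([p.tolist() for p in parts])
```

Each element of the result is a slice of a, so the values themselves are not copied.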
Approach #3
If you have to get a list of sublists as output, we could re-create with list duplication -
mask = np.r_[True, a[:-1] != a[1:], True]
c = np.diff(np.flatnonzero(mask))
out = [[i]*j for i,j in zip(a[mask[:-1]],c)]
Benchmarking
Timings for vectorized approaches on 1000000 elements with 10000 unique elements -
In [145]: np.random.seed(0)
...: a = np.sort(np.random.randint(1,10000,(1000000)))
In [146]: x = a
# Approach #1 from this post
In [147]: %timeit np.split(a,np.flatnonzero(a[:-1] != a[1:])+1)
100 loops, best of 3: 10.5 ms per loop
# Approach #2 from this post
In [148]: %%timeit
...: idx = np.flatnonzero(np.r_[True, a[:-1] != a[1:], True])
...: out = [a[i:j] for i,j in zip(idx[:-1],idx[1:])]
100 loops, best of 3: 5.18 ms per loop
# Approach #3 from this post
In [197]: %%timeit
...: mask = np.r_[True, a[:-1] != a[1:], True]
...: c = np.diff(np.flatnonzero(mask))
...: out = [[i]*j for i,j in zip(a[mask[:-1]],c)]
100 loops, best of 3: 11.1 ms per loop
# @RafaelC's soln
In [149]: %%timeit
...: v,c = np.unique(x, return_counts=True)
...: out = [[a]*b for (a,b) in zip(v,c)]
10 loops, best of 3: 25.6 ms per loop
You can use numpy.unique and zip
v,c = np.unique(x, return_counts=True)
[[a]*b for (a,b) in zip(v,c)]
Outputs
[[1.0, 1.0, 1.0], [2.0, 2.0], [3.0]]
Timings for a 6,000,000 sized array
%timeit v,c = np.unique(x, return_counts=True); [[a]*b for (a,b) in zip(v,c)]
18.2 ms ± 236 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
%timeit np.split(x,np.flatnonzero(x[:-1] != x[1:])+1)
424 ms ± 11.5 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit [list(group) for value, group in itertools.groupby(x)]
180 ms ± 4.42 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
The function itertools.groupby has this exact behavior.
>>> from itertools import groupby
>>> [list(group) for value, group in groupby(array)]
[[1.0, 1.0, 1.0], [2.0, 2.0], [3.0]]
>>> from itertools import groupby
>>> a = [1., 1., 1., 2., 2., 3.]
>>> for k, g in groupby(a):
...     print(k, list(g))
...
1.0 [1.0, 1.0, 1.0]
2.0 [2.0, 2.0]
3.0 [3.0]
You may collect the groups into a single list, if you like:
>>> result = []
>>> for k, g in groupby(a):
...     result.append(list(g))
...
>>> result
[[1.0, 1.0, 1.0], [2.0, 2.0], [3.0]]
I improved your code a bit. It's not especially Pythonic, but it doesn't use external libraries (also, your code didn't work on the last element in the array):
def split_sorted(array):
    splitted = [[]]
    standard = array[0]
    li = 0 # inner lists index
    n = len(array)
    for i in range(n):
        if standard != array[i]:
            standard = array[i]
            splitted.append([]) # appending empty list
            li += 1
        splitted[li].append(array[i])
    return splitted
# test
array = [1,2,2,2,3]
a = split_sorted(array)
print(a)
Basically, I have two matrices A and B, and I want C (dimensions marked by the side of the matrices), with computation like this:
The formula below is what I do now. I take advantage of some broadcasting, but I am still left with a loop. I am new to Python so maybe I am wrong, but I have a hunch that this loop can be eliminated. Can anyone share some ideas?
EDIT: 2018-04-27 09:48:28
as requested, an example:
In [5]: A
Out[5]:
array([[0, 1],
[2, 3],
[4, 5],
[6, 7],
[8, 9]])
In [6]: B
Out[6]:
array([[0, 1],
[2, 3],
[4, 5],
[6, 7]])
In [7]: C = np.zeros ((B.shape[0], A.shape[0]))
In [8]: for m in range (B.shape[0]):
   ...:     C[m] = np.sum (np.square (B[m] - A), axis=1).flatten ()
   ...:
In [9]: C
Out[9]:
array([[ 0., 8., 32., 72., 128.],
[ 8., 0., 8., 32., 72.],
[ 32., 8., 0., 8., 32.],
[ 72., 32., 8., 0., 8.]])
This appears to work at the cost of some extra memory:
C = ((B[:, :, None] - A.T)**2).sum(axis=1)
Testing:
import numpy
D = 10
N = 20
M = 30
A = numpy.random.rand(N, D)
B = numpy.random.rand(M, D)
C = numpy.empty((M, N))
Timing:
for m in range(M):
    C[m] = numpy.sum((B[m, :] - A)**2, axis=1)
514 µs ± 13.7 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
C2 = ((B[:, :, None] - A.T)**2).sum(axis=1)
53.6 µs ± 529 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
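If the temporary (M, D, N) array becomes too large, the same quantity can be obtained from the expansion |b - a|^2 = |b|^2 + |a|^2 - 2 b·a, which only needs a matrix product. This is a different formulation from the broadcasting one above, offered as a sketch; the function name pairwise_sq_dists is hypothetical, and with floating-point input rounding may leave tiny negative values where the exact result is zero.

```python
import numpy as np

def pairwise_sq_dists(B, A):
    """C[m, n] = sum over d of (B[m, d] - A[n, d])**2, without a 3-D temporary."""
    b_sq = np.sum(B**2, axis=1)[:, None]   # shape (M, 1)
    a_sq = np.sum(A**2, axis=1)[None, :]   # shape (1, N)
    return b_sq + a_sq - 2 * (B @ A.T)

A = np.arange(10).reshape(5, 2)   # same A as in the question's example
B = np.arange(8).reshape(4, 2)    # same B as in the question's example
C = pairwise_sq_dists(B, A)
print(C)
```

On the question's example this reproduces the matrix C shown there.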
I need not just the values, but the locations of elements in one numpy array that also appear in a second numpy array, and I need the locations in that second array too.
Here's an example of the best I've been able to do:
>>> a=np.arange(0.,15.)
>>> a
array([ 0., 1., 2., 3., 4., 5., 6., 7., 8., 9., 10.,
11., 12., 13., 14.])
>>> b=np.arange(4.,8.,.5)
>>> b
array([ 4. , 4.5, 5. , 5.5, 6. , 6.5, 7. , 7.5])
>>> [ (i,j) for (i,alem) in enumerate(a) for (j,blem) in enumerate(b) if alem==blem]
[(4, 0), (5, 2), (6, 4), (7, 6)]
Anybody have anything faster, numpy specific, or more "pythonic"?
Here is an O((n+k)log(n+k)) (the naive algorithm is O(nk)) solution with np.unique
uniq, inv = np.unique(np.r_[a, b], return_inverse=True)
map = -np.ones((len(uniq),), dtype=int)
map[inv[:len(a)]] = np.arange(len(a))
bina = map[inv[len(a):]]
inds_in_b = np.where(bina != -1)[0]
elements, inds_in_a = b[inds_in_b], bina[inds_in_b]
or you could simply sort a for O((n+k)log(k))
inds = np.argsort(a)
aso = a[inds]
bina = np.searchsorted(aso[:-1], b)
inds_in_b = np.where(b == aso[bina])[0]
elements, inds_in_a = b[inds_in_b], inds[bina[inds_in_b]]
For sorted array a, here's another approach with np.searchsorted making use of its optional argument - side set as left and right -
lidx = np.searchsorted(a,b,'left')
ridx = np.searchsorted(a,b,'right')
mask = lidx != ridx
out = lidx[mask], np.flatnonzero(mask)
# for zipped o/p : zip(lidx[mask], np.flatnonzero(mask))
Runtime test
Approaches -
def searchsorted_where(a,b): # @Paul Panzer's soln
    inds = np.argsort(a)
    aso = a[inds]
    bina = np.searchsorted(aso[:-1], b)
    inds_in_b = np.where(b == aso[bina])[0]
    return b[inds_in_b], inds_in_b
def in1d_masking(a,b): # @Psidom's soln
    logic = np.in1d(b, a)
    return b[logic], np.where(logic)[0]
def searchsorted_twice(a,b): # Proposed in this post
    lidx = np.searchsorted(a,b,'left')
    ridx = np.searchsorted(a,b,'right')
    mask = lidx != ridx
    return lidx[mask], np.flatnonzero(mask)
Timings -
Case #1 (Using sample data from question and scaling it up) :
In [2]: a=np.arange(0.,15000.)
...: b=np.arange(4.,15000.,0.5)
...:
In [3]: %timeit searchsorted_where(a,b)
...: %timeit in1d_masking(a,b)
...: %timeit searchsorted_twice(a,b)
...:
1000 loops, best of 3: 721 µs per loop
1000 loops, best of 3: 1.76 ms per loop
1000 loops, best of 3: 1.28 ms per loop
Case #2 (same as case #1, but with comparatively fewer elements in b than in a) :
In [4]: a=np.arange(0.,15000.)
...: b=np.arange(4.,15000.,5)
...:
In [5]: %timeit searchsorted_where(a,b)
...: %timeit in1d_masking(a,b)
...: %timeit searchsorted_twice(a,b)
...:
10000 loops, best of 3: 77.4 µs per loop
1000 loops, best of 3: 428 µs per loop
10000 loops, best of 3: 128 µs per loop
Case #3 (with far fewer elements in b) :
In [6]: a=np.arange(0.,15000.)
...: b=np.arange(4.,15000.,10)
...:
In [7]: %timeit searchsorted_where(a,b)
...: %timeit in1d_masking(a,b)
...: %timeit searchsorted_twice(a,b)
...:
10000 loops, best of 3: 42.8 µs per loop
1000 loops, best of 3: 392 µs per loop
10000 loops, best of 3: 71.9 µs per loop
You can use numpy.in1d to find the elements of b that are also in a; logical indexing and numpy.where then give those elements and their indices in b, respectively:
logic = np.in1d(b, a)
list(zip(b[logic], np.where(logic)[0]))
# [(4.0, 0), (5.0, 2), (6.0, 4), (7.0, 6)]
b[logic], np.where(logic)[0]
# (array([ 4., 5., 6., 7.]), array([0, 2, 4, 6]))
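Since the question asks for the positions in both arrays, np.intersect1d with return_indices=True (available in NumPy 1.15 and later) may also be worth a look. A short sketch on the question's sample data follows; note that it reports only the first occurrence of each common value.

```python
import numpy as np

a = np.arange(0., 15.)
b = np.arange(4., 8., .5)

# common values, their indices in a, and their indices in b
common, inds_in_a, inds_in_b = np.intersect1d(a, b, return_indices=True)
pairs = list(zip(inds_in_a.tolist(), inds_in_b.tolist()))
print(pairs)
```

The pairs are (index in a, index in b), in the same order as the list comprehension in the question.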