Say I have a NumPy array:
>>> X = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
>>> X
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
and an array of indexes that I want to select for each row:
>>> ixs = np.array([[1, 3], [0, 1], [1, 2]])
>>> ixs
array([[1, 3],
[0, 1],
[1, 2]])
How do I index the array X so that for every row in X I select the two indices specified in ixs?
So for this case, I want to select elements 1 and 3 for the first row, elements 0 and 1 for the second row, and so on. The output should be:
array([[2, 4],
[5, 6],
[10, 11]])
A slow solution would be something like this:
output = np.array([row[ix] for row, ix in zip(X, ixs)])
However, this can get slow for extremely long arrays. Is there a faster way to do this without a loop using NumPy?
EDIT: Some very approximate speed tests on a 2.5K * 1M array with 2K wide ixs (10GB):
np.array([row[ix] for row, ix in zip(X, ixs)]) 0.16s
X[np.arange(len(ixs)), ixs.T].T 0.175s
X.take(ixs+np.arange(0, X.shape[0]*X.shape[1], X.shape[1])[:,None]) 33s
np.fromiter((X[i, j] for i, row in enumerate(ixs) for j in row), dtype=X.dtype).reshape(ixs.shape) 2.4s
You can use this:
X[np.arange(len(ixs)), ixs.T].T
See the NumPy documentation on advanced (integer array) indexing for reference.
I believe you can use .take thusly:
In [185]: X
Out[185]:
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
In [186]: idx
Out[186]:
array([[1, 3],
[0, 1],
[1, 2]])
In [187]: X.take(idx + (np.arange(X.shape[0]) * X.shape[1]).reshape(-1, 1))
Out[187]:
array([[ 2, 4],
[ 5, 6],
[10, 11]])
If your array dimensions are massive, it might be faster, albeit uglier, to do:
idx+np.arange(0, X.shape[0]*X.shape[1], X.shape[1])[:,None]
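To see why the offset works, here is a minimal sketch on the small X and idx from above (the names n_rows, n_cols, row_starts and flat are mine, for illustration only). It assumes that take is called without an axis, so it indexes the flattened array, where element (i, j) sits at position i * n_cols + j.

```python
import numpy as np

X = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
idx = np.array([[1, 3], [0, 1], [1, 2]])

n_rows, n_cols = X.shape

# Flat position of the first element of each row, as a column vector.
row_starts = (np.arange(n_rows) * n_cols)[:, None]

# Broadcasting adds each row's start to that row's column indices.
flat = idx + row_starts

# take() without an axis indexes the flattened array.
result = X.take(flat)
print(flat)
print(result)
```

Both ways of building the offsets in this answer produce the same row_starts; they differ only in how the arithmetic is written.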
Just for fun, see how the following performs:
np.fromiter((X[i, j] for i, row in enumerate(ixs) for j in row), dtype=X.dtype, count=ixs.size).reshape(ixs.shape)
Edit to add timings
In [15]: X = np.arange(1000*10000, dtype=np.int32).reshape(1000,-1)
In [16]: ixs = np.random.randint(0, 10000, (1000, 2))
In [17]: ixs.sort(axis=1)
In [18]: ixs
Out[18]:
array([[2738, 3511],
[3600, 7414],
[7426, 9851],
...,
[1654, 8252],
[2194, 8200],
[5497, 8900]])
In [19]: %timeit np.array([row[ix] for row, ix in zip(X, ixs)])
928 µs ± 23.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [20]: %timeit X[np.arange(len(ixs)), ixs.T].T
23.6 µs ± 491 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [21]: %timeit X.take(idx+np.arange(0, X.shape[0]*X.shape[1], X.shape[1])[:,None])
20.6 µs ± 530 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [22]: %timeit np.fromiter((X[i, j] for i, row in enumerate(ixs) for j in row), dtype=X.dtype, count=ixs.size).reshape(ixs.shape)
1.42 ms ± 9.94 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
@mxbi I've added some timings, and my results aren't really consistent with yours; you should check it out.
Here's a larger array:
In [33]: X = np.arange(10000*100000, dtype=np.int32).reshape(10000,-1)
In [34]: ixs = np.random.randint(0, 100000, (10000, 2))
In [35]: ixs.sort(axis=1)
In [36]: X.shape
Out[36]: (10000, 100000)
In [37]: ixs.shape
Out[37]: (10000, 2)
With some results:
In [42]: %timeit np.array([row[ix] for row, ix in zip(X, ixs)])
11.4 ms ± 177 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [43]: %timeit X[np.arange(len(ixs)), ixs.T].T
596 µs ± 17.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [44]: %timeit X.take(ixs+np.arange(0, X.shape[0]*X.shape[1], X.shape[1])[:,None])
540 µs ± 16.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Now we use 500 column indices per row instead of two, and we see the list comprehension start to win out:
In [45]: ixs = np.random.randint(0, 100000, (10000, 500))
In [46]: ixs.sort(axis=1)
In [47]: %timeit np.array([row[ix] for row, ix in zip(X, ixs)])
93 ms ± 1.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [48]: %timeit X[np.arange(len(ixs)), ixs.T].T
133 ms ± 638 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [49]: %timeit X.take(ixs+np.arange(0, X.shape[0]*X.shape[1], X.shape[1])[:,None])
87.5 ms ± 1.13 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
The usual suggestion for indexing items from rows is:
X[np.arange(X.shape[0])[:,None], ixs]
That is, make a row index of shape (n,1) (column vector), which will broadcast with the (n,m) shape of ixs to give a (n,m) solution.
This is basically the same as:
X[np.arange(len(ixs)), ixs.T].T
which broadcasts a (n,) index against a (m,n), and transposes.
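As a small sketch of the two broadcasting patterns, using the X and ixs from the question (the variable names below are mine): np.take_along_axis is included as a third option and, as far as I know, expresses the same per-row selection in NumPy 1.15 and later.

```python
import numpy as np

X = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
ixs = np.array([[1, 3], [0, 1], [1, 2]])

# (n, 1) row index broadcast against the (n, m) column index -> (n, m)
rows = np.arange(X.shape[0])[:, None]
by_column_vector = X[rows, ixs]

# (n,) row index broadcast against the (m, n) transposed index -> (m, n),
# then transposed back to (n, m)
by_transpose = X[np.arange(len(ixs)), ixs.T].T

# Same selection written with take_along_axis
by_take_along = np.take_along_axis(X, ixs, axis=1)

print(by_column_vector)
```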
Timings are essentially the same:
In [299]: X = np.ones((1000,2000))
In [300]: ixs = np.random.randint(0,2000,(1000,200))
In [301]: timeit X[np.arange(len(ixs)), ixs.T].T
6.58 ms ± 71.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [302]: timeit X[np.arange(X.shape[0])[:,None], ixs]
6.57 ms ± 129 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
and for comparison:
In [307]: timeit np.array([row[ix] for row, ix in zip(X, ixs)])
6.63 ms ± 229 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
I'm a little surprised that this list comprehension does so well. I wonder how the relative advantages compare when the dimensions change, particularly in the relative shape of X and ixs (long, wide etc).
The first solution is the style of indexing produced by ix_:
In [303]: np.ix_(np.arange(3), np.arange(2))
Out[303]:
(array([[0],
[1],
[2]]), array([[0, 1]]))
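A short sketch of the difference, with index values I picked for illustration: np.ix_ builds an open mesh, so it selects a rectangular block (every listed row crossed with every listed column), whereas the (n,1) row index paired with ixs picks different columns in each row.

```python
import numpy as np

X = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
ixs = np.array([[1, 3], [0, 1], [1, 2]])

# ix_ returns index arrays of shape (2, 1) and (1, 2): rows 0 and 2
# crossed with columns 1 and 3.
mesh = np.ix_([0, 2], [1, 3])
block = X[mesh]

# Per-row selection: same (n, 1) row shape, but a full (n, m) column index.
rows = np.arange(X.shape[0])[:, None]
per_row = X[rows, ixs]

print(block)
print(per_row)
```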
This should work
[X[i][y] for i, y in enumerate(ixs)]
EDIT: I just noticed you wanted no loop solution.
I have the following array, is there a quick way of doing this using numpy or array?
[ ['one','two','three'] [1,2,3] ]
Need to convert it to the following
[ ['one',1], ['two',2], ['three',3] ]
Numpy or array
a = np.array([['one','two','three'], [1,2,3]])
aT = a.T
print(aT)
q = [['one', 'two', 'three'], [1,2,3]]
a = [[s, n] for s, n in zip(*q)]
# a = [['one', 1], ['two', 2], ['three', 3]]
You can use zip.
a = [['one','two','three'],[1,2,3]]
new_a = [[i, j] for i, j in zip(a[0],a[1])]
print(new_a)
[['one', 1], ['two', 2], ['three', 3]]
Extra
After reading the comments and answers, I got curious about how the answer by @lhopital, my answer, and the answer by @moe compare in running time. So I created a 2D list with 260 strings and 260 values.
%timeit [[s, n] for s, n in zip(*a)]
40.2 µs ± 2.21 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit [[i, j] for i, j in zip(a[0],a[1])]
27.2 µs ± 1.15 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
a = np.array(a)
%timeit a.T
146 ns ± 2.47 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)
Apparently, the NumPy transpose (a.T) is the fastest method here, though the timing excludes the np.array(a) conversion.
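One caveat when reading that comparison, sketched below with the example data from the question: a NumPy array has a single dtype, so mixing strings and integers converts the integers to strings, and .T only creates a view rather than building new rows. The zip version keeps the original Python types.

```python
import numpy as np

q = [['one', 'two', 'three'], [1, 2, 3]]

a = np.array(q)
print(a.dtype.kind)    # 'U': the integers were converted to strings
print(a.T.base is a)   # True: .T is a view, no data is copied
print(a.T.tolist())    # [['one', '1'], ['two', '2'], ['three', '3']]

# zip keeps the integers as integers
pairs = [list(p) for p in zip(*q)]
print(pairs)           # [['one', 1], ['two', 2], ['three', 3]]
```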
I have many lists that represent a sparse matrix (i.e., the columns that have nonzero entries) that I need to represent as a SciPy sparse csc_matrix. However, note that there is only one row in my sparse matrix, so each list simply points to the columns within this row that have nonzero entries. For example:
sparse_input = [4, 10, 21] # My lists are much, much longer but very sparse
This list tells me which columns of my single-row sparse matrix hold nonzero values. This is what the dense matrix would look like:
x = np.array([[0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1]])
I could use the (data, (row, col)) syntax, but since my lists are very long, the csc_matrix takes a lot of time and memory to build. So I was thinking about using the indptr interface, but I'm having trouble figuring out how to quickly and automatically build the indptr directly from a given sparse list of nonzero column entries. I tried looking at csc_matrix(x).indptr and I see that the indptr looks like:
array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
3], dtype=int32)
I've read the SciPy docs and the Sparse Matrix Wikipedia page but I can't seem to come up with an efficient method to construct the indptr directly from a list of nonzero columns. It just feels like indptr shouldn't be this long in length considering that there are only three nonzero entries in the sparse matrix.
How about just making the matrices and exploring their attributes?
In [144]: from scipy import sparse
In [145]: x = np.array([[0,0,0,0,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,1]])
In [146]: M = sparse.coo_matrix(x)
In [147]: M
Out[147]:
<1x22 sparse matrix of type '<class 'numpy.int64'>'
with 3 stored elements in COOrdinate format>
In [148]: M.row
Out[148]: array([0, 0, 0], dtype=int32)
In [149]: M.col
Out[149]: array([ 4, 10, 21], dtype=int32)
In [150]: M.data
Out[150]: array([1, 1, 1])
csr:
In [152]: Mr = M.tocsr()
In [153]: Mr.indptr
Out[153]: array([0, 3], dtype=int32)
In [155]: Mr.indices
Out[155]: array([ 4, 10, 21], dtype=int32)
In [156]: Mr.data
Out[156]: array([1, 1, 1], dtype=int64)
csc:
In [157]: Mc = M.tocsc()
In [158]: Mc.indptr
Out[158]:
array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
3], dtype=int32)
In [159]: Mc.indices
Out[159]: array([0, 0, 0], dtype=int32)
In [160]: Mc.data
Out[160]: array([1, 1, 1], dtype=int64)
And the direct nonzero on x:
In [161]: np.nonzero(x)
Out[161]: (array([0, 0, 0]), array([ 4, 10, 21]))
For a 1-row matrix like this, I doubt you'll save much time by creating the csr indptr directly. Most of the work will be in the nonzero step. But feel free to experiment.
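The attributes above suggest why the two indptr arrays differ so much in length: csr keeps one entry per row plus one, while csc keeps one entry per column plus one. Here is a rough NumPy-only sketch of building both directly from the column list. It assumes the columns are sorted and unique, and n_cols is a value I chose to match the example; the resulting arrays could then be passed to sparse.csr_matrix((data, indices, indptr), shape=(1, n_cols)) or the csc equivalent.

```python
import numpy as np

sparse_input = [4, 10, 21]   # nonzero columns of the single row
n_cols = 22                  # assumed total number of columns

cols = np.asarray(sparse_input)

# CSR: indptr has n_rows + 1 entries, so a single row needs only [0, nnz].
csr_data = np.ones(len(cols), dtype=np.int64)
csr_indices = cols
csr_indptr = np.array([0, len(cols)])

# CSC: indptr has n_cols + 1 entries; entry j counts the nonzeros
# stored in columns before j.
per_column = np.bincount(cols, minlength=n_cols)
csc_indptr = np.concatenate(([0], np.cumsum(per_column)))
csc_indices = np.zeros(len(cols), dtype=np.int64)  # every nonzero is in row 0

print(csr_indptr)
print(csc_indptr)
```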
===
Some timings
In [162]: timeit sparse.coo_matrix(x)
95.8 µs ± 110 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [163]: timeit sparse.csr_matrix(x)
335 µs ± 2.59 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [164]: timeit M.tocsr()
115 µs ± 948 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [165]: timeit M.tocsc()
117 µs ± 90.4 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [166]: sparse.csr_matrix?
In [167]: timeit M.tocsc()
117 µs ± 1.17 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [168]: timeit sparse.csc_matrix(x)
335 µs ± 257 ns per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [169]: timeit sparse.coo_matrix(x).tocsr()
219 µs ± 3.34 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
I'm a little surprised that the csr_matrix is slower than coo followed by conversion.
Now let's try to make the matrix with indptr etc.
In [170]: timeit np.nonzero(x)
2.52 µs ± 65.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [173]: timeit sparse.csr_matrix((Mr.data, Mr.indices, Mr.indptr))
92.5 µs ± 79.3 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [174]: %%timeit
...: indices = np.nonzero(x)[1]
...: data = np.ones_like(indices)
...: indptr = np.array([0,len(indices)])
...: sparse.csr_matrix((data, indices, indptr))
...:
...:
161 µs ± 605 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
I have two numpy arrays, A and B. A contains unique values and B is a sub-array of A.
Now I am looking for a way to get the index of B's values within A.
For example:
A = np.array([1,2,3,4,5,6,7,8,9,10])
B = np.array([1,7,10])
# I need a function fun() that:
fun(A,B)
>> 0,6,9
You can use np.in1d with np.nonzero -
np.nonzero(np.in1d(A,B))[0]
You can also use np.searchsorted, if you care about maintaining the order -
np.searchsorted(A,B)
For a generic case, when A & B are unsorted arrays, you can bring in the sorter option in np.searchsorted, like so -
sort_idx = A.argsort()
out = sort_idx[np.searchsorted(A,B,sorter = sort_idx)]
I would add in my favorite broadcasting too in the mix to solve a generic case -
np.nonzero(B[:,None] == A)[1]
Sample run -
In [125]: A
Out[125]: array([ 7, 5, 1, 6, 10, 9, 8])
In [126]: B
Out[126]: array([ 1, 10, 7])
In [127]: sort_idx = A.argsort()
In [128]: sort_idx[np.searchsorted(A,B,sorter = sort_idx)]
Out[128]: array([2, 4, 0])
In [129]: np.nonzero(B[:,None] == A)[1]
Out[129]: array([2, 4, 0])
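One thing to keep in mind with the searchsorted approach: it returns insertion points, so a value of B that is missing from A still yields an index. The sketch below wraps the sorter version in a hypothetical helper (index_of is my name, not a NumPy function) that checks the result before returning it.

```python
import numpy as np

def index_of(A, B):
    """Return the positions of B's values in A; raise if any value is missing."""
    sort_idx = np.argsort(A)
    pos = np.searchsorted(A, B, sorter=sort_idx)
    # An insertion point can equal len(A); clip so the lookup stays in range.
    pos = np.clip(pos, 0, len(A) - 1)
    out = sort_idx[pos]
    if not np.array_equal(A[out], B):
        raise ValueError("some values of B are not present in A")
    return out

A = np.array([7, 5, 1, 6, 10, 9, 8])
B = np.array([1, 10, 7])
print(index_of(A, B))   # [2 4 0]
```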
Have you tried searchsorted?
A = np.array([1,2,3,4,5,6,7,8,9,10])
B = np.array([1,7,10])
A.searchsorted(B)
# array([0, 6, 9])
Just for completeness: If the values in A are non negative and reasonably small:
lookup = np.empty((np.max(A) + 1), dtype=int)
lookup[A] = np.arange(len(A))
indices = lookup[B]
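A variation on this idea, as a sketch: np.empty leaves the slots for values that are not in A uninitialized, so I fill the table with -1 instead (my choice of sentinel), which makes missing values easy to detect. The sample arrays are the unsorted ones from the earlier answer.

```python
import numpy as np

A = np.array([7, 5, 1, 6, 10, 9, 8])
B = np.array([1, 10, 7])

# One slot per possible value; -1 marks values that do not occur in A.
lookup = np.full(A.max() + 1, -1, dtype=np.intp)
lookup[A] = np.arange(len(A))

indices = lookup[B]
print(indices)     # [2 4 0]
print(lookup[3])   # -1, because 3 is not in A
```

The table needs max(A) + 1 entries, which is why this only pays off when the values are non-negative and reasonably small.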
I had the same question recently, and timing performance is critical for me, so I think a timing comparison of the different solutions may be useful to others.
As Divakar mentioned, you can use np.in1d(A, B) with np.where or np.nonzero. You can also combine np.searchsorted with np.intersect1d (based on this page), or use np.searchsorted on its own, which is another useful approach for sorted arrays.
I want to add another simple solution: a list comprehension. It may take longer than the previous ones. However, if you take advantage of the Numba package, it is much less time-consuming.
In [1]: import numpy as np
In [2]: from numba import njit
In [3]: a = np.array([1,2,3,4,5,6,7,8,9,10])
In [4]: b = np.array([1,7,10])
In [5]: np.where(np.in1d(a, b))[0]
...: array([0, 6, 9])
In [6]: np.nonzero(np.in1d(a, b))[0]
...: array([0, 6, 9])
In [7]: np.searchsorted(a, b)
...: array([0, 6, 9])
In [8]: np.searchsorted(a, np.intersect1d(a, b))
...: array([0, 6, 9])
In [9]: [i for i, x in enumerate(a) if x in b]
...: [0, 6, 9]
In [10]: @njit
...: def func(a, b):
...:     return [i for i, x in enumerate(a) if x in b]
In [11]: func(a, b)
...: [0, 6, 9]
Now, let's compare the timing performance of these solutions.
In [12]: %timeit np.where(np.in1d(a, b))[0]
4.26 µs ± 6.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [13]: %timeit np.nonzero(np.in1d(a, b))[0]
4.39 µs ± 14.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [14]: %timeit np.searchsorted(a, b)
800 ns ± 6.04 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [15]: %timeit np.searchsorted(a, np.intersect1d(a, b))
8.8 µs ± 73.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [16]: %timeit [i for i, x in enumerate(a) if x in b]
15.4 µs ± 18.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [17]: %timeit func(a, b)
336 ns ± 0.579 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
I'm having a hard time trying to solve this problem. The main issue is that I'm running a simulation, so for loops are essentially forbidden. I have an N x M NumPy array; in this case mine is about 10000 x 20.
stoploss = 19.9 # condition to apply
simulation = monte_carlo_simulation(20, 1.08, 10000, 20)  # gives me that 10000x20 np array
mask_trues = np.where(np.any(simulation <= stoploss, axis=1))  # indices of rows with at least one value at or below stoploss
I need code that produces a new vector of length 10000 holding, for every row, all the positions where the condition is True. For example:
function([[False,True,True],[False,False,True]])
output = [[1,2],[2]]
Again, the main problem resides in not using loops.
Simply this:
list(map(np.where, my_array))
performance comparison against Kasrâmvd's solution:
def f(a):
    return list(map(np.where, a))
def g(a):
    x, y = np.where(a)
    return np.split(y, np.where(np.diff(x) != 0)[0] + 1)
a = np.random.randint(2, size=(10000,20))
%timeit f(a)
%timeit g(a)
7.66 ms ± 38.1 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
13.3 ms ± 188 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
For completeness I'll demonstrate a sparse matrix approach:
In [57]: A = np.array([[False,True,True],[False,False,True]])
In [58]: A
Out[58]:
array([[False, True, True],
[False, False, True]])
In [59]: M = sparse.lil_matrix(A)
In [60]: M
Out[60]:
<2x3 sparse matrix of type '<class 'numpy.bool_'>'
with 3 stored elements in LInked List format>
In [61]: M.data
Out[61]: array([list([True, True]), list([True])], dtype=object)
In [62]: M.rows
Out[62]: array([list([1, 2]), list([2])], dtype=object)
And to make a large sparse one:
In [63]: BM = sparse.random(10000,20,.05, 'lil')
In [64]: BM
Out[64]:
<10000x20 sparse matrix of type '<class 'numpy.float64'>'
with 10000 stored elements in LInked List format>
In [65]: BM.rows
Out[65]:
array([list([3]), list([]), list([6, 15]), ..., list([]), list([11]),
list([])], dtype=object)
Rough time tests:
In [66]: arr = BM.A
In [67]: timeit sparse.lil_matrix(arr)
19.5 ms ± 421 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [68]: timeit list(map(np.where,arr))
11 ms ± 55.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [69]: %%timeit
...: x,y = np.where(arr)
...: np.split(y, np.where(np.diff(x) != 0)[0] + 1)
...:
13.8 ms ± 24.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Generating a csr sparse format matrix is faster:
In [70]: timeit sparse.csr_matrix(arr)
2.68 ms ± 120 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [71]: Mr = sparse.csr_matrix(arr)
In [72]: Mr.indices
Out[72]: array([ 3, 6, 15, ..., 8, 16, 11], dtype=int32)
In [73]: Mr.indptr
Out[73]: array([ 0, 1, 1, ..., 9999, 10000, 10000], dtype=int32)
In [74]: np.where(arr)[1]
Out[74]: array([ 3, 6, 15, ..., 8, 16, 11])
Its indices attribute is just like the column output of where, while the indptr is like the split indices.
Here is one way using np.split() and np.diff():
x, y = np.where(boolean_array)
np.split(y, np.where(np.diff(x) != 0)[0] + 1)
Demo:
In [12]: a = np.array([[False,True,True],[False,False,True]])
In [13]: x, y = np.where(a)
In [14]: np.split(y, np.where(np.diff(x) != 0)[0] + 1)
Out[14]: [array([1, 2]), array([2])]
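One caveat with the np.diff version: a row that contains no True values never appears in x, so it is silently dropped and the result has fewer entries than the array has rows. A possible variant, sketched below with a hypothetical helper name, derives the split points from per-row counts instead, so empty rows come back as empty arrays.

```python
import numpy as np

def true_columns_per_row(mask):
    """Return one array of True column positions per row, keeping empty rows."""
    mask = np.asarray(mask, dtype=bool)
    cols = np.nonzero(mask)[1]               # column indices, grouped by row
    counts = np.count_nonzero(mask, axis=1)  # number of True values per row
    # Cumulative counts mark where each row's run of columns ends.
    return np.split(cols, np.cumsum(counts)[:-1])

a = np.array([[False, True, True],
              [False, False, False],
              [False, False, True]])
result = true_columns_per_row(a)
print([r.tolist() for r in result])   # [[1, 2], [], [2]]
```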
I have a M x N matrix X and a 1 x N matrix Y. What I would like to do is replace any 0-entry in X with the appropriate value from Y based on its column.
So if
X = np.array([[0, 1, 2], [3, 0, 5]])
and
Y = np.array([10, 20, 30])
The desired end result would be [[10, 1, 2], [3, 20, 5]].
This can be done straightforwardly by generating a M x N matrix where every row is Y and then using filter arrays:
Y = np.ones((X.shape[0], 1)) * Y.reshape(1, -1)
X[X==0] = Y[X==0]
But could this be done using numpy's broadcasting functionality?
Sure. Instead of physically repeating Y, create a broadcasted view of Y with the shape of X, using numpy.broadcast_to:
expanded = numpy.broadcast_to(Y, X.shape)
mask = X==0
X[mask] = expanded[mask]
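Another option that relies on broadcasting directly, offered as a sketch rather than a replacement for the approach above: np.where broadcasts its three arguments against each other, so Y with shape (N,) lines up with the columns of X without any explicit expansion. np.copyto with a where mask does the same thing in place.

```python
import numpy as np

X = np.array([[0, 1, 2], [3, 0, 5]])
Y = np.array([10, 20, 30])

# Y has shape (3,), which broadcasts across the rows of the (2, 3) array X.
result = np.where(X == 0, Y, X)
print(result)

# In-place variant: overwrite only the masked positions of a copy of X.
X_inplace = X.copy()
np.copyto(X_inplace, Y, where=(X_inplace == 0))
print(X_inplace)
```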
Expand X to make it a bit more general:
In [306]: X = np.array([[0, 1, 2], [3, 0, 5],[0,1,0]])
np.where identifies the 0s; the second array in its result gives their columns:
In [307]: idx = np.where(X==0)
In [308]: idx
Out[308]: (array([0, 1, 2, 2]), array([0, 1, 0, 2]))
In [309]: Z = X.copy()
In [310]: Z[idx]
Out[310]: array([0, 0, 0, 0]) # flat list of where to put the values
In [311]: Y[idx[1]]
Out[311]: array([10, 20, 10, 30]) # matching list of values by column
In [312]: Z[idx] = Y[idx[1]]
In [313]: Z
Out[313]:
array([[10, 1, 2],
[ 3, 20, 5],
[10, 1, 30]])
Not doing broadcasting, but reasonably clean numpy.
Times compared to broadcast_to approach
In [314]: %%timeit
...: idx = np.where(X==0)
...: Z[idx] = Y[idx[1]]
...:
9.28 µs ± 157 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [315]: %%timeit
...: exp = np.broadcast_to(Y,X.shape)
...: mask=X==0
...: Z[mask] = exp[mask]
...:
19.5 µs ± 513 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Faster, though the sample size is small.
Another way to make the expanded Y, is with repeat:
In [319]: %%timeit
...: exp = np.repeat(Y[None,:],3,0)
...: mask=X==0
...: Z[mask] = exp[mask]
...:
10.8 µs ± 55.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Its time is close to that of my where approach. It turns out that broadcast_to is relatively slow:
In [321]: %%timeit
...: exp = np.broadcast_to(Y,X.shape)
...:
10.5 µs ± 52.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [322]: %%timeit
...: exp = np.repeat(Y[None,:],3,0)
...:
3.76 µs ± 11.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
We'd have to do more tests to see whether that is just due to a setup cost, or if the relative times still apply with much larger arrays.