Python Numpy appending multiple lists from objects - python

I am calling a function several times, and each call returns a NumPy array:
for x in range(0,100):
    d = simulation3()
d = [0, 1, 2, 3]
d = [4, 5, 6, 7]
..and many more
I want to take each list and append it to a 2D array.
final_array = [[0, 1, 2, 3],[4, 5, 6, 7]...and so forth]
I tried creating an empty array (final_array = np.zeros((4, 4))) and appending to it, but the values are appended after the 4x4 matrix of zeros instead of replacing it.
Can anyone help me with this? thank you!

You can use np.fromiter to create an array from an iterable. Since, by default, this function only works with scalars, you can use itertools.chain to help:
import numpy as np
from itertools import chain
np.random.seed(0)
def simulation3():
    return np.random.randint(0, 10, 4)
n = 5
# flatten the n results into one stream of scalars, then reshape to n rows
d = np.fromiter(chain.from_iterable(simulation3() for _ in range(n)), dtype='i')
d.shape = n, 4
print(d)
array([[5, 0, 3, 3],
[7, 9, 3, 5],
[2, 4, 7, 6],
[8, 8, 1, 6],
[7, 7, 8, 1]], dtype=int32)
But this is relatively inefficient. NumPy performs best with fixed size arrays. If you know the size of your array in advance, you can define an empty array and update rows sequentially. See the alternatives described by @norok2.

There are multiple ways to do this in NumPy; the easiest is to use vstack, like this:
For example:
# you have these arrays that you want to concatenate
d1 = [0, 1, 2, 3]
d2 = [4, 5, 6, 7]
d3 = [4, 5, 6, 7]
# initialize your variable with zero rows
X = np.zeros((0,4))
# then each time you call your function, use np.vstack like this
# (each new row is stacked on top of the existing ones):
X = np.vstack((np.array(d1),X))
X = np.vstack((np.array(d2),X))
X = np.vstack((np.array(d3),X))
# and finally you have your array like below
#array([[4., 5., 6., 7.],
# [4., 5., 6., 7.],
# [0., 1., 2., 3.]])

The optimal solution depends on the numbers / sizes you are dealing with.
My favorite solution (which only works if you already know the size of the final result) is to initialize the array that will contain your results and then fill it row by row using views.
This is the most memory-efficient solution.
If you do not know the size of the final result, then you are better off generating a list of lists, which can be converted (or stacked) into a NumPy array at the end of the process.
Here are some examples, where gen_1d_list() is used to generate some random numbers to mimic the result of simulation3() (meaning that in the following code, you should replace gen_1d_list(n, dtype) with simulation3()):
stacking1() implements the filling using views
stacking2() implements the list generation and converting to NumPy array
stacking3() implements the list generation and stacking to NumPy array
stacking4() implements the dynamic modification of a NumPy array using vstack() as proposed earlier.
import numpy as np
def gen_1d_list(n, dtype=int):
    return list(np.random.randint(1, 100, n, dtype))
def stacking1(n, m, dtype=int):
    # pre-allocate the result and fill it row by row
    arr = np.empty((n, m), dtype=dtype)
    for i in range(n):
        arr[i] = gen_1d_list(m, dtype)
    return arr
def stacking2(n, m, dtype=int):
    # build a list of lists, then convert once
    items = [gen_1d_list(m, dtype) for i in range(n)]
    arr = np.array(items)
    return arr
def stacking3(n, m, dtype=int):
    # build a list of lists, then stack once
    items = [gen_1d_list(m, dtype) for i in range(n)]
    # the second positional argument of np.stack is axis, so dtype is not passed there
    arr = np.stack(items)
    return arr
def stacking4(n, m, dtype=int):
    # grow the array on every iteration (each vstack copies the whole array)
    arr = np.zeros((0, m), dtype=dtype)
    for i in range(n):
        arr = np.vstack((gen_1d_list(m, dtype), arr))
    return arr
Time-wise, stacking1() and stacking2() are more or less equally fast, while stacking3() and stacking4() are slower (and, in proportion, much slower for small size inputs).
Some numbers, for small size inputs:
n, m = 4, 10
%timeit stacking1(n, m)
# 15.7 µs ± 182 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit stacking2(n, m)
# 14.2 µs ± 141 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit stacking3(n, m)
# 22.7 µs ± 282 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit stacking4(n, m)
# 31.8 µs ± 270 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
and for larger size inputs:
n, m = 4, 1000000
%timeit stacking1(n, m)
# 344 ms ± 1.64 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit stacking2(n, m)
# 350 ms ± 1.65 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit stacking3(n, m)
# 370 ms ± 2.75 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
%timeit stacking4(n, m)
# 369 ms ± 3.01 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
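For the case where the number of rows is not known in advance, the list-accumulation idea behind stacking2() might look like the minimal sketch below. Here simulation3() is only a stand-in stub, and both the should_stop callback and the max_rows cap are hypothetical names introduced for illustration; substitute whatever condition ends your own loop.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulation3():
    # Stand-in for the real simulation: one 1D array of length 4.
    return rng.integers(0, 10, size=4)

def collect_rows(should_stop, max_rows=1000):
    """Accumulate rows in a plain list, then convert to a 2D array once."""
    rows = []
    while len(rows) < max_rows:      # hypothetical safety cap
        row = simulation3()
        rows.append(row)             # list append is cheap, no array copy
        if should_stop(row):         # hypothetical stopping rule
            break
    return np.array(rows)            # single conversion at the end

final_array = collect_rows(lambda row: row.sum() > 30)
print(final_array.shape)
```

The conversion happens once, so the cost of copying does not grow with every appended row as it does with repeated vstack() calls.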

Related

Boolean indexing array through array of boolean indexes without loop

I want to index an array with a boolean mask through multiple boolean arrays without a loop.
This is what I want to achieve but without a loop and only with numpy.
import numpy as np
a = np.array([[0, 1],[2, 3]])
b = np.array([[[1, 0], [1, 0]], [[0, 0], [1, 1]]], dtype=bool)
r = []
for x in b:
    print(a[x])
    r.extend(a[x])
# => array([0, 2])
# => array([2, 3])
print(r)
# => [0, 2, 2, 3]
# what I would like to do is something like this
r = some_fancy_indexing_magic_with_b_and_a
print(r)
# => [0, 2, 2, 3]
Approach #1
Simply broadcast a to b's shape with np.broadcast_to and then mask it with b -
In [15]: np.broadcast_to(a,b.shape)[b]
Out[15]: array([0, 2, 2, 3])
Approach #2
Another would be getting all the indices and mod those by the size of a, which would also be the size of each 2D block in b and then indexing into flattened a -
a.ravel()[np.flatnonzero(b)%a.size]
Approach #3
On the same lines as App#2, but keeping the 2D format and using non-zero indices along the last two axes of b -
_,r,c = np.nonzero(b)
out = a[r,c]
Timings on large arrays (given sample shapes scaled up by 100x) -
In [50]: np.random.seed(0)
...: a = np.random.rand(200,200)
...: b = np.random.rand(200,200,200)>0.5
In [51]: %timeit np.broadcast_to(a,b.shape)[b]
45.5 ms ± 381 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [52]: %timeit a.ravel()[np.flatnonzero(b)%a.size]
94.6 ms ± 1.64 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [53]: %%timeit
...: _,r,c = np.nonzero(b)
...: out = a[r,c]
128 ms ± 1.46 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
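As a quick sanity check, the three approaches can be compared against the loop from the question on the small sample arrays. This is only a sketch; the variable names expected, r1, r2 and r3 are made up for the comparison.

```python
import numpy as np

a = np.array([[0, 1], [2, 3]])
b = np.array([[[1, 0], [1, 0]], [[0, 0], [1, 1]]], dtype=bool)

# Reference result: the loop from the question, one mask at a time.
expected = np.concatenate([a[mask] for mask in b])

# Approach #1: broadcast a to b's shape, then mask.
r1 = np.broadcast_to(a, b.shape)[b]

# Approach #2: flat indices of True entries, wrapped to a's size.
r2 = a.ravel()[np.flatnonzero(b) % a.size]

# Approach #3: row/column indices along the last two axes of b.
_, rows, cols = np.nonzero(b)
r3 = a[rows, cols]

print(expected, r1, r2, r3)  # each is [0 2 2 3]
```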

Get indexes of chosen array elements in the order of these elements from a different array [duplicate]

I have two numpy arrays, A and B. A contains unique values and B is a sub-array of A.
Now I am looking for a way to get the index of B's values within A.
For example:
A = np.array([1,2,3,4,5,6,7,8,9,10])
B = np.array([1,7,10])
# I need a function fun() that:
fun(A,B)
>> 0,6,9
You can use np.in1d with np.nonzero -
np.nonzero(np.in1d(A,B))[0]
You can also use np.searchsorted, if you care about maintaining the order -
np.searchsorted(A,B)
For a generic case, when A & B are unsorted arrays, you can bring in the sorter option in np.searchsorted, like so -
sort_idx = A.argsort()
out = sort_idx[np.searchsorted(A,B,sorter = sort_idx)]
I would add in my favorite broadcasting too in the mix to solve a generic case -
np.nonzero(B[:,None] == A)[1]
Sample run -
In [125]: A
Out[125]: array([ 7, 5, 1, 6, 10, 9, 8])
In [126]: B
Out[126]: array([ 1, 10, 7])
In [127]: sort_idx = A.argsort()
In [128]: sort_idx[np.searchsorted(A,B,sorter = sort_idx)]
Out[128]: array([2, 4, 0])
In [129]: np.nonzero(B[:,None] == A)[1]
Out[129]: array([2, 4, 0])
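The difference in ordering between the membership-mask approach and the searchsorted approach is easy to miss, so here is a small sketch on the same sample arrays. It uses np.isin in place of np.in1d (newer NumPy releases deprecate np.in1d in favour of it); the names idx_mask and idx_sorted are only for illustration.

```python
import numpy as np

A = np.array([7, 5, 1, 6, 10, 9, 8])
B = np.array([1, 10, 7])

# Membership mask: the indices come out in A's order, not B's.
idx_mask = np.nonzero(np.isin(A, B))[0]

# searchsorted with a sorter: the indices follow B's order.
sort_idx = A.argsort()
idx_sorted = sort_idx[np.searchsorted(A, B, sorter=sort_idx)]

print(idx_mask)    # [0 2 4]
print(idx_sorted)  # [2 4 0]
```

So if the positions must line up with the elements of B, the searchsorted (or broadcasting) variant is the one to reach for.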
Have you tried searchsorted?
A = np.array([1,2,3,4,5,6,7,8,9,10])
B = np.array([1,7,10])
A.searchsorted(B)
# array([0, 6, 9])
Just for completeness: If the values in A are non negative and reasonably small:
lookup = np.empty((np.max(A) + 1), dtype=int)
lookup[A] = np.arange(len(A))
indices = lookup[B]
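A runnable sketch of this lookup-table idea on the unsorted sample arrays from the earlier answer follows. One deliberate change, which is my assumption and not part of the answer: the table is created with np.full and a -1 sentinel instead of np.empty, so that a value of B that does not occur in A is flagged rather than mapped to uninitialized memory.

```python
import numpy as np

A = np.array([7, 5, 1, 6, 10, 9, 8])
B = np.array([1, 10, 7])

# One slot per possible value; -1 marks values that never occur in A.
lookup = np.full(A.max() + 1, -1, dtype=int)
lookup[A] = np.arange(len(A))   # value -> position in A

indices = lookup[B]             # positions follow B's order
print(indices)  # [2 4 0]
```

The table needs max(A) + 1 entries, which is why this only pays off when the values are non-negative and reasonably small.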
I had the same question these days. However, the timing performance is very critical for me. Therefore, I guess the timing comparison of different solutions may be useful for others.
As Divakar mentioned, you can use np.in1d(A, B) with np.where, np.nonzero. Moreover, you can use the np.in1d(A, B) with np.intersect1d (based on this page). Also, you can use np.searchsorted as another useful approach for sorted arrays.
I want to add another simple solution. You can use a list comprehension. It may take longer than the previous ones. However, if you take advantage of the Numba package, it is much less time-consuming.
In [1]: import numpy as np
In [2]: from numba import njit
In [3]: a = np.array([1,2,3,4,5,6,7,8,9,10])
In [4]: b = np.array([1,7,10])
In [5]: np.where(np.in1d(a, b))[0]
...: array([0, 6, 9])
In [6]: np.nonzero(np.in1d(a, b))[0]
...: array([0, 6, 9])
In [7]: np.searchsorted(a, b)
...: array([0, 6, 9])
In [8]: np.searchsorted(a, np.intersect1d(a, b))
...: array([0, 6, 9])
In [9]: [i for i, x in enumerate(a) if x in b]
...: [0, 6, 9]
In [10]: @njit
...: def func(a, b):
...:     return [i for i, x in enumerate(a) if x in b]
In [11]: func(a, b)
...: [0, 6, 9]
Now, let's compare the timing performance of these solutions.
In [12]: %timeit np.where(np.in1d(a, b))[0]
4.26 µs ± 6.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [13]: %timeit np.nonzero(np.in1d(a, b))[0]
4.39 µs ± 14.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [14]: %timeit np.searchsorted(a, b)
800 ns ± 6.04 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
In [15]: %timeit np.searchsorted(a, np.intersect1d(a, b))
8.8 µs ± 73.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [16]: %timeit [i for i, x in enumerate(a) if x in b]
15.4 µs ± 18.4 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [17]: %timeit func(a, b)
336 ns ± 0.579 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Numpy: Can you use broadcasting to replace values by row?

I have a M x N matrix X and a 1 x N matrix Y. What I would like to do is replace any 0-entry in X with the appropriate value from Y based on its column.
So if
X = np.array([[0, 1, 2], [3, 0, 5]])
and
Y = np.array([10, 20, 30])
The desired end result would be [[10, 1, 2], [3, 20, 5]].
This can be done straightforwardly by generating a M x N matrix where every row is Y and then using filter arrays:
Y = np.ones((X.shape[0], 1)) * Y.reshape(1, -1)
X[X==0] = Y[X==0]
But could this be done using numpy's broadcasting functionality?
Sure. Instead of physically repeating Y, create a broadcasted view of Y with the shape of X, using numpy.broadcast_to:
expanded = np.broadcast_to(Y, X.shape)
mask = X==0
X[mask] = expanded[mask]
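Another option that relies on broadcasting directly is np.where, which is not used in the answers here, so treat this as an alternative sketch rather than the answer's method. np.where(condition, Y, X) broadcasts Y against X and picks from Y wherever the condition holds.

```python
import numpy as np

X = np.array([[0, 1, 2], [3, 0, 5]])
Y = np.array([10, 20, 30])

# Y has shape (3,) and is broadcast across the rows of X (shape (2, 3)).
result = np.where(X == 0, Y, X)
print(result)
# [[10  1  2]
#  [ 3 20  5]]
```

Unlike the masked assignment, this returns a new array and leaves X unchanged.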
Expand X to make it a bit more general:
In [306]: X = np.array([[0, 1, 2], [3, 0, 5],[0,1,0]])
where identifies the 0s; the 2nd array identifies the columns
In [307]: idx = np.where(X==0)
In [308]: idx
Out[308]: (array([0, 1, 2, 2]), array([0, 1, 0, 2]))
In [309]: Z = X.copy()
In [310]: Z[idx]
Out[310]: array([0, 0, 0, 0]) # flat list of where to put the values
In [311]: Y[idx[1]]
Out[311]: array([10, 20, 10, 30]) # matching list of values by column
In [312]: Z[idx] = Y[idx[1]]
In [313]: Z
Out[313]:
array([[10, 1, 2],
[ 3, 20, 5],
[10, 1, 30]])
Not doing broadcasting, but reasonably clean numpy.
Times compared to broadcast_to approach
In [314]: %%timeit
...: idx = np.where(X==0)
...: Z[idx] = Y[idx[1]]
...:
9.28 µs ± 157 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [315]: %%timeit
...: exp = np.broadcast_to(Y,X.shape)
...: mask=X==0
...: Z[mask] = exp[mask]
...:
19.5 µs ± 513 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
Faster, though the sample size is small.
Another way to make the expanded Y, is with repeat:
In [319]: %%timeit
...: exp = np.repeat(Y[None,:],3,0)
...: mask=X==0
...: Z[mask] = exp[mask]
...:
10.8 µs ± 55.3 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Whose time is close to my where. It turns out that broadcast_to is relatively slow:
In [321]: %%timeit
...: exp = np.broadcast_to(Y,X.shape)
...:
10.5 µs ± 52.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [322]: %%timeit
...: exp = np.repeat(Y[None,:],3,0)
...:
3.76 µs ± 11.6 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
We'd have to do more tests to see whether that is just due to a setup cost, or if the relative times still apply with much larger arrays.

What is a vectorized way to create multiple powers of a NumPy array?

I have a NumPy array:
arr = [[1, 2],
[3, 4]]
I want to create a new array that contains powers of arr up to a power order:
# arr_new = [arr^0, arr^1, arr^2, arr^3,...arr^order]
arr_new = [[1, 1, 1, 2, 1, 4, 1, 8],
[1, 1, 3, 4, 9, 16, 27, 64]]
My current approach uses for loops:
# Pre-allocate an array for powers
arr = np.array([[1, 2],[3,4]])
order = 3
rows, cols = arr.shape
arr_new = np.zeros((rows, (order+1) * cols))
# Iterate over each exponent
for i in range(order + 1):
    arr_new[:, (i * cols) : (i + 1) * cols] = arr**i
print(arr_new)
Is there a faster (i.e. vectorized) approach to creating powers of an array?
Benchmarking
Thanks to @hpaulj, @Divakar and @Paul Panzer for the answers. I benchmarked the loop-based and broadcasting-based operations on the following test arrays.
arr = np.array([[1, 2],
[3,4]])
order = 3
arrLarge = np.random.randint(0, 10, (100, 100)) # 100 x 100 array
orderLarge = 10
The loop_based function is:
def loop_based(arr, order):
    # pre-allocate an array for powers
    rows, cols = arr.shape
    arr_new = np.zeros((rows, (order+1) * cols))
    # iterate over each exponent
    for i in range(order + 1):
        arr_new[:, (i * cols) : (i + 1) * cols] = arr**i
    return arr_new
The broadcast_based function using hstack is:
def broadcast_based_hstack(arr, order):
    # Create a 3D exponent array for a 2D input array to force broadcasting
    powers = np.arange(order + 1)[:, None, None]
    # Generate values (first axis contains array at various powers)
    exponentiated = arr ** powers
    # Stack the per-power blocks side by side and return the array
    return np.hstack(exponentiated) # <== using hstack function
The broadcast_based function using reshape is:
def broadcast_based_reshape(arr, order):
    # Create a 2D exponent array; arr[:, None] below adds the axis needed for broadcasting
    powers = np.arange(order + 1)[:, None]
    # Generate values (middle axis contains array at various powers)
    exponentiated = arr[:, None] ** powers
    # reshape and return array
    return exponentiated.reshape(arr.shape[0], -1) # <== using reshape function
The broadcast_based function using cumulative product cumprod and reshape:
def broadcast_cumprod_reshape(arr, order):
    rows, cols = arr.shape
    # Create 3D empty array where the middle dimension is
    # the array at powers 0 through order
    out = np.empty((rows, order + 1, cols), dtype=arr.dtype)
    out[:, 0, :] = 1 # 0th power is always 1
    a = np.broadcast_to(arr[:, None], (rows, order, cols))
    # Cumulatively multiply arrays so each multiplication produces the next order
    np.cumprod(a, axis=1, out=out[:,1:,:])
    return out.reshape(rows, -1)
On Jupyter notebook, I used the timeit command and got these results:
Small arrays (2x2):
%timeit -n 100000 loop_based(arr, order)
7.41 µs ± 174 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit -n 100000 broadcast_based_hstack(arr, order)
10.1 µs ± 137 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit -n 100000 broadcast_based_reshape(arr, order)
3.31 µs ± 61.5 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
%timeit -n 100000 broadcast_cumprod_reshape(arr, order)
11 µs ± 102 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
Large arrays (100x100):
%timeit -n 1000 loop_based(arrLarge, orderLarge)
261 µs ± 5.82 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit -n 1000 broadcast_based_hstack(arrLarge, orderLarge)
225 µs ± 4.15 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit -n 1000 broadcast_based_reshape(arrLarge, orderLarge)
223 µs ± 2.16 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
%timeit -n 1000 broadcast_cumprod_reshape(arrLarge, orderLarge)
157 µs ± 1.02 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Conclusions:
It seems that the broadcast based approach using reshape is faster for smaller arrays. However, for large arrays, the cumprod approach scales better and is faster.
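Before comparing timings it is worth confirming that the variants return the same values. The sketch below restates three of the functions in compact form and checks them on the small sample array. One assumption differs from the benchmark code: loop_based here pre-allocates with arr.dtype instead of the default float, so that the results can be compared exactly.

```python
import numpy as np

def loop_based(arr, order):
    rows, cols = arr.shape
    # Assumption: keep the input dtype (the benchmark version used float zeros).
    out = np.zeros((rows, (order + 1) * cols), dtype=arr.dtype)
    for i in range(order + 1):
        out[:, i * cols:(i + 1) * cols] = arr ** i
    return out

def broadcast_based_reshape(arr, order):
    powers = np.arange(order + 1)[:, None]           # shape (order+1, 1)
    return (arr[:, None] ** powers).reshape(arr.shape[0], -1)

def broadcast_cumprod_reshape(arr, order):
    rows, cols = arr.shape
    out = np.empty((rows, order + 1, cols), dtype=arr.dtype)
    out[:, 0, :] = 1                                  # power 0
    a = np.broadcast_to(arr[:, None], (rows, order, cols))
    np.cumprod(a, axis=1, out=out[:, 1:, :])          # powers 1..order
    return out.reshape(rows, -1)

arr = np.array([[1, 2], [3, 4]])
order = 3
reference = loop_based(arr, order)
same_reshape = np.array_equal(reference, broadcast_based_reshape(arr, order))
same_cumprod = np.array_equal(reference, broadcast_cumprod_reshape(arr, order))
print(reference)
print(same_reshape, same_cumprod)  # True True
```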
Extend arrays to higher dims and let broadcasting do its magic with some help from reshaping -
In [16]: arr = np.array([[1, 2],[3,4]])
In [17]: order = 3
In [18]: (arr[:,None]**np.arange(order+1)[:,None]).reshape(arr.shape[0],-1)
Out[18]:
array([[ 1, 1, 1, 2, 1, 4, 1, 8],
[ 1, 1, 3, 4, 9, 16, 27, 64]])
Note that arr[:,None] is essentially arr[:,None,:], but we can skip the trailing ellipsis for brevity.
Timings on a bigger dataset -
In [40]: np.random.seed(0)
...: arr = np.random.randint(0,9,(100,100))
...: order = 10
# @hpaulj's soln with broadcasting and stacking
In [41]: %timeit np.hstack(arr **np.arange(order+1)[:,None,None])
1000 loops, best of 3: 734 µs per loop
In [42]: %timeit (arr[:,None]**np.arange(order+1)[:,None]).reshape(arr.shape[0],-1)
1000 loops, best of 3: 401 µs per loop
That reshaping part is practically free, and that is where we gain performance here (along with the broadcasting part, of course), as seen in the breakdown below -
In [52]: %timeit (arr[:,None]**np.arange(order+1)[:,None])
1000 loops, best of 3: 390 µs per loop
In [53]: %timeit (arr[:,None]**np.arange(order+1)[:,None]).reshape(arr.shape[0],-1)
1000 loops, best of 3: 401 µs per loop
Use broadcasting to generate the values, and reshape or rearrange the values as desired:
In [34]: arr **np.arange(4)[:,None,None]
Out[34]:
array([[[ 1, 1],
[ 1, 1]],
[[ 1, 2],
[ 3, 4]],
[[ 1, 4],
[ 9, 16]],
[[ 1, 8],
[27, 64]]])
In [35]: np.hstack(_)
Out[35]:
array([[ 1, 1, 1, 2, 1, 4, 1, 8],
[ 1, 1, 3, 4, 9, 16, 27, 64]])
Here is a solution using cumulative multiplication which scales better than power based approaches, especially if the input array is of float dtype:
import numpy as np
def f_mult(a, k):
    m, n = a.shape
    out = np.empty((m, k, n), dtype=a.dtype)
    out[:, 0, :] = 1
    a = np.broadcast_to(a[:, None], (m, k-1, n))
    a.cumprod(axis=1, out=out[:, 1:])
    return out.reshape(m, -1)
Timings:
int up to power 9
divakar: 0.4342731796205044 ms
hpaulj: 0.794165057130158 ms
pp: 0.20520629966631532 ms
float up to power 39
divakar: 29.056487752124667 ms
hpaulj: 31.773792404681444 ms
pp: 1.0329263447783887 ms
Code for timings, thanks @Divakar:
def f_divakar(a, k):
    return (a[:,None]**np.arange(k)[:,None]).reshape(a.shape[0],-1)
def f_hpaulj(a, k):
    return np.hstack(a**np.arange(k)[:,None,None])
from timeit import timeit
np.random.seed(0)
a = np.random.randint(0,9,(100,100))
k = 10
print('int up to power 9')
print('divakar:', timeit(lambda: f_divakar(a, k), number=1000), 'ms')
print('hpaulj: ', timeit(lambda: f_hpaulj(a, k), number=1000), 'ms')
print('pp: ', timeit(lambda: f_mult(a, k), number=1000), 'ms')
a = np.random.uniform(0.5,2.0,(100,100))
k = 40
print('float up to power 39')
print('divakar:', timeit(lambda: f_divakar(a, k), number=1000), 'ms')
print('hpaulj: ', timeit(lambda: f_hpaulj(a, k), number=1000), 'ms')
print('pp: ', timeit(lambda: f_mult(a, k), number=1000), 'ms')
You are creating a Vandermonde matrix with a reshape, so it is probably best to use numpy.vander to make it, and let someone else take care of the best algorithm.
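This way your code is just the following (note the order + 1 columns, increasing=True, and the axis swap, which are needed to reproduce the column layout from the question):
np.vander(arr.ravel(), order + 1, increasing=True).reshape(*arr.shape, -1).swapaxes(1, 2).reshape(arr.shape[0], -1)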
That said, it seems like they use something like Paul Panzer's cumprod method under the hood so it should scale well.

Index a NumPy array row-wise [duplicate]

Say I have a NumPy array:
>>> X = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
>>> X
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
and an array of indexes that I want to select for each row:
>>> ixs = np.array([[1, 3], [0, 1], [1, 2]])
>>> ixs
array([[1, 3],
[0, 1],
[1, 2]])
How do I index the array X so that for every row in X I select the two indices specified in ixs?
So for this case, I want to select element 1 and 3 for the first row, element 0 and 1 for the second row, and so on. The output should be:
array([[2, 4],
[5, 6],
[10, 11]])
A slow solution would be something like this:
output = np.array([row[ix] for row, ix in zip(X, ixs)])
however this can get kinda slow for extremely long arrays. Is there a faster way to do this without a loop using NumPy?
EDIT: Some very approximate speed tests on a 2.5K * 1M array with 2K wide ixs (10GB):
np.array([row[ix] for row, ix in zip(X, ixs)]) 0.16s
X[np.arange(len(ixs)), ixs.T].T 0.175s
X.take(idx+np.arange(0, X.shape[0]*X.shape[1], X.shape[1])[:,None]) 33s
np.fromiter((X[i, j] for i, row in enumerate(ixs) for j in row), dtype=X.dtype).reshape(ixs.shape) 2.4s
You can use this:
X[np.arange(len(ixs)), ixs.T].T
Here is the reference for complex indexing.
I believe you can use .take thusly:
In [185]: X
Out[185]:
array([[ 1, 2, 3, 4],
[ 5, 6, 7, 8],
[ 9, 10, 11, 12]])
In [186]: idx
Out[186]:
array([[1, 3],
[0, 1],
[1, 2]])
In [187]: X.take(idx + (np.arange(X.shape[0]) * X.shape[1]).reshape(-1, 1))
Out[187]:
array([[ 2, 4],
[ 5, 6],
[10, 11]])
If your array dimensions are massive, it might be faster, albeit uglier, to do:
idx+np.arange(0, X.shape[0]*X.shape[1], X.shape[1])[:,None]
Just for fun, see how the following performs:
np.fromiter((X[i, j] for i, row in enumerate(ixs) for j in row), dtype=X.dtype, count=ixs.size).reshape(ixs.shape)
Edit to add timings
In [15]: X = np.arange(1000*10000, dtype=np.int32).reshape(1000,-1)
In [16]: ixs = np.random.randint(0, 10000, (1000, 2))
In [17]: ixs.sort(axis=1)
In [18]: ixs
Out[18]:
array([[2738, 3511],
[3600, 7414],
[7426, 9851],
...,
[1654, 8252],
[2194, 8200],
[5497, 8900]])
In [19]: %timeit np.array([row[ix] for row, ix in zip(X, ixs)])
928 µs ± 23.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [20]: %timeit X[np.arange(len(ixs)), ixs.T].T
23.6 µs ± 491 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [21]: %timeit X.take(idx+np.arange(0, X.shape[0]*X.shape[1], X.shape[1])[:,None])
20.6 µs ± 530 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [22]: %timeit np.fromiter((X[i, j] for i, row in enumerate(ixs) for j in row), dtype=X.dtype, count=ixs.size).reshape(ixs.shape)
1.42 ms ± 9.94 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
@mxbi I've added some timings and my results aren't really consistent with yours, you should check it out
Here's a larger array:
In [33]: X = np.arange(10000*100000, dtype=np.int32).reshape(10000,-1)
In [34]: ixs = np.random.randint(0, 100000, (10000, 2))
In [35]: ixs.sort(axis=1)
In [36]: X.shape
Out[36]: (10000, 100000)
In [37]: ixs.shape
Out[37]: (10000, 2)
With some results:
In [42]: %timeit np.array([row[ix] for row, ix in zip(X, ixs)])
11.4 ms ± 177 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [43]: %timeit X[np.arange(len(ixs)), ixs.T].T
596 µs ± 17.8 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
In [44]: %timeit X.take(ixs+np.arange(0, X.shape[0]*X.shape[1], X.shape[1])[:,None])
540 µs ± 16.6 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
Now we use 500 column indices per row instead of two, and we see the list comprehension start winning out over the fancy-indexing version:
In [45]: ixs = np.random.randint(0, 100000, (10000, 500))
In [46]: ixs.sort(axis=1)
In [47]: %timeit np.array([row[ix] for row, ix in zip(X, ixs)])
93 ms ± 1.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
In [48]: %timeit X[np.arange(len(ixs)), ixs.T].T
133 ms ± 638 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [49]: %timeit X.take(ixs+np.arange(0, X.shape[0]*X.shape[1], X.shape[1])[:,None])
87.5 ms ± 1.13 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
The usual suggestion for indexing items from rows is:
X[np.arange(X.shape[0])[:,None], ixs]
That is, make a row index of shape (n,1) (column vector), which will broadcast with the (n,m) shape of ixs to give a (n,m) solution.
This is basically the same as:
X[np.arange(len(ixs)), ixs.T].T
which broadcasts a (n,) index against a (m,n), and transposes.
Timings are essentially the same:
In [299]: X = np.ones((1000,2000))
In [300]: ixs = np.random.randint(0,2000,(1000,200))
In [301]: timeit X[np.arange(len(ixs)), ixs.T].T
6.58 ms ± 71.6 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
In [302]: timeit X[np.arange(X.shape[0])[:,None], ixs]
6.57 ms ± 129 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
and for comparison:
In [307]: timeit np.array([row[ix] for row, ix in zip(X, ixs)])
6.63 ms ± 229 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
I'm a little surprised that this list comprehension does so well. I wonder how the relative advantages compare when the dimensions change, particularly in the relative shape of X and ixs (long, wide etc).
The first solution is the style of indexing produced by ix_:
In [303]: np.ix_(np.arange(3), np.arange(2))
Out[303]:
(array([[0],
[1],
[2]]), array([[0, 1]]))
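The two broadcasting forms discussed above can be checked against each other on the sample from the question. The sketch also includes np.take_along_axis, which is not mentioned in the answers but, as far as I know, expresses the same per-row selection directly; the names out_a, out_b and out_c are only for the comparison.

```python
import numpy as np

X = np.array([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
ixs = np.array([[1, 3], [0, 1], [1, 2]])

# (n, 1) row index broadcast against the (n, m) column index.
out_a = X[np.arange(X.shape[0])[:, None], ixs]

# (n,) row index broadcast against the (m, n) transposed index, then transposed back.
out_b = X[np.arange(len(ixs)), ixs.T].T

# Per-row selection along axis 1.
out_c = np.take_along_axis(X, ixs, axis=1)

print(out_a)
# [[ 2  4]
#  [ 5  6]
#  [10 11]]
```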
This should work
[X[i][[y]] for i, y in enumerate(ixs)]
EDIT: I just noticed you wanted no loop solution.
