Check the number of equal sub-arrays for two arrays of arrays - Python

I want to count how many of the sub-arrays (rows) inside one NumPy array differ from the corresponding rows of another NumPy array.
The solution should not use a list comprehension.
Something along these lines (note that a and b differ in the last row):
a = np.array( [[1,1,1],[2,2,2],[3,3,3],[4,4,4],[5,5,5]] )
b = np.array( [[1,1,1],[2,2,2],[3,3,3],[4,4,4],[5,0,0]] )
y = diff_count( a,b )
print y
>> 1

Approach #1
Perform an element-wise comparison for non-equality, then take an ANY reduction along the last axis and finally count -
(a!=b).any(-1).sum()
Approach #2
Probably faster one with np.count_nonzero for counting booleans -
np.count_nonzero((a!=b).any(-1))
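For completeness, here's a minimal sketch wrapping approach #2 into the diff_count function the question asks for (assuming both inputs are 2D arrays of the same shape):
import numpy as np

def diff_count(a, b):
    # Count rows in which at least one element differs.
    return np.count_nonzero((a != b).any(axis=-1))

a = np.array([[1,1,1],[2,2,2],[3,3,3],[4,4,4],[5,5,5]])
b = np.array([[1,1,1],[2,2,2],[3,3,3],[4,4,4],[5,0,0]])
print(diff_count(a, b))  # 1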
Approach #3
Much faster one with views -
# https://stackoverflow.com/a/45313353/ @Divakar
def view1D(a, b): # a, b are arrays
    a = np.ascontiguousarray(a)
    b = np.ascontiguousarray(b)
    void_dt = np.dtype((np.void, a.dtype.itemsize * a.shape[1]))
    return a.view(void_dt).ravel(), b.view(void_dt).ravel()
a1D,b1D = view1D(a,b)
out = np.count_nonzero(a1D!=b1D)
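If you want approach #3 in diff_count form as well, a minimal sketch (assuming both inputs are 2D with the same shape and dtype, since the void view compares raw row bytes):
def diff_count_view(a, b):
    # Collapse each row to a single void scalar, then count differing rows.
    a1D, b1D = view1D(a, b)
    return np.count_nonzero(a1D != b1D)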
Benchmarking
In [32]: np.random.seed(0)
...: m,n = 10000,100
...: a = np.random.randint(0,9,(m,n))
...: b = a.copy()
...:
...: # Let's set 10% of rows as different ones
...: b[np.random.choice(len(a), len(a)//10, replace=0)] = 0
In [33]: %timeit (a!=b).any(-1).sum() # app#1 from this soln
...: %timeit np.count_nonzero((a!=b).any(-1)) # app#2
...: %timeit np.any(a - b, axis=1).sum() # @Graipher's soln
1000 loops, best of 3: 1.14 ms per loop
1000 loops, best of 3: 1.08 ms per loop
100 loops, best of 3: 2.33 ms per loop
In [34]: %%timeit # app#3
...: a1D,b1D = view1D(a,b)
...: out = np.count_nonzero(a1D!=b1D)
1000 loops, best of 3: 797 µs per loop

You can try np.ravel() if you want an element-wise comparison:
(a.ravel()!=b.ravel()).sum()
or count the columns that differ:
(a-b).any(axis=0).sum()
Both of the above give 2 as output here.
If you want a row-wise comparison, you can use:
(a-b).any(axis=1).sum()
This gives 1 as output.

You can use numpy.any for this:
y = np.any(a - b, axis=1).sum()

Would this work?
y = sum((a[i] != b[i]).any() for i in range(len(a)))
Sorry that I can't test this myself right now.

Accumulating numbers in an array without a loop. (python)

So I have a (seemingly) simple problem, which I am currently solving via a for loop.
Basically, I want to increment specific cells in a numpy matrix, but I want to do it without a for-loop if possible.
To give more details: I have a 100 x 100 numpy matrix, X. I also have a 2 x 1000 numpy matrix P. P just stores indices into X; for example, each column of P holds the row-column index of a cell that I want to increment in X.
What I do right now is this:
for p in range(P.shape[1]):
    X[P[0,p], P[1,p]] += 1
My question is, is there a way to do this without a for-loop?
Thanks!
Use the at method of the add ufunc with advanced indexing:
numpy.add.at(X, (P[0], P[1]), 1)
or just advanced indexing if P is guaranteed to never select the same cell of X twice:
X[P[0], P[1]] += 1
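To see the difference when P does contain repeats, here's a minimal sketch (toy sizes, values chosen for illustration): the buffered fancy-indexing form increments a repeated cell only once, while np.add.at accumulates every occurrence.
import numpy as np

X1 = np.zeros((2, 2), dtype=int)
X2 = np.zeros((2, 2), dtype=int)
P = np.array([[0, 0, 1],
              [1, 1, 0]])        # cell (0, 1) appears twice

X1[P[0], P[1]] += 1              # buffered: the repeated cell gets +1 only once
np.add.at(X2, (P[0], P[1]), 1)   # unbuffered: the repeated cell gets +2

print(X1)
# [[0 1]
#  [1 0]]
print(X2)
# [[0 2]
#  [1 0]]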
Using linear-indices and bincount -
lidx = np.ravel_multi_index(P, X.shape)
X += np.bincount(lidx, minlength=X.size).reshape(X.shape)
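For intuition, a small sketch (toy sizes) of what the linear-index/bincount route computes:
import numpy as np

X = np.zeros((3, 3), dtype=int)
P = np.array([[0, 0, 2],
              [1, 1, 2]])      # cell (0, 1) repeated

lidx = np.ravel_multi_index(P, X.shape)            # flat indices: [1, 1, 8]
X += np.bincount(lidx, minlength=X.size).reshape(X.shape)
print(X)
# [[0 2 0]
#  [0 0 0]
#  [0 0 1]]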
Benchmarking
For the case when indices are not repeated, the advanced-indexing based approach as suggested in @user2357112's post seems to be very efficient.
For the case with repeated indices, we have np.add.at and np.bincount, and the performance numbers seem to depend on the size of the indices array relative to the size of the input array.
Approaches -
def app0(X, P): # @user2357112's soln1
    np.add.at(X, (P[0], P[1]), 1)

def app1(X, P): # proposed in this post
    lidx = np.ravel_multi_index(P, X.shape)
    X += np.bincount(lidx, minlength=X.size).reshape(X.shape)
Here are a few timing tests to show that -
Case #1 :
In [141]: X = np.random.randint(0,9,(100,100))
...: P = np.random.randint(0,100,(2,1000))
...:
In [142]: %timeit app0(X, P)
...: %timeit app1(X, P)
...:
10000 loops, best of 3: 68.9 µs per loop
100000 loops, best of 3: 15.1 µs per loop
Case #2 :
In [143]: X = np.random.randint(0,9,(1000,1000))
...: P = np.random.randint(0,1000,(2,10000))
...:
In [144]: %timeit app0(X, P)
...: %timeit app1(X, P)
...:
1000 loops, best of 3: 687 µs per loop
1000 loops, best of 3: 1.48 ms per loop
Case #3 :
In [145]: X = np.random.randint(0,9,(1000,1000))
...: P = np.random.randint(0,1000,(2,100000))
...:
In [146]: %timeit app0(X, P)
...: %timeit app1(X, P)
...:
100 loops, best of 3: 11.3 ms per loop
100 loops, best of 3: 2.51 ms per loop

Vectorized searchsorted numpy

Assume that I have two arrays A and B, where both A and B are m x n. My goal is now, for each row of A and B, to find where I should insert the elements of row i of A in the corresponding row of B. That is, I wish to apply np.digitize or np.searchsorted to each row of A and B.
My naive solution is to simply iterate over the rows. However, this is far too slow for my application. My question is therefore: is there a vectorized implementation of either algorithm that I haven't managed to find?
We can add to each row an offset that separates it from the previous row by more than the value range, using the same offsets for both arrays. The idea is then to use np.searchsorted on the flattened versions of the input arrays, so that each row from b is restricted to finding sorted positions within the corresponding row in a. Additionally, to make it work for negative numbers too, we just need to offset by the minimum values as well.
So, we would have a vectorized implementation like so -
def searchsorted2d(a,b):
    m,n = a.shape
    max_num = np.maximum(a.max() - a.min(), b.max() - b.min()) + 1
    r = max_num*np.arange(a.shape[0])[:,None]
    p = np.searchsorted( (a+r).ravel(), (b+r).ravel() ).reshape(m,-1)
    return p - n*(np.arange(m)[:,None])
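To see the offset trick concretely, here's a minimal sketch with toy values (row lengths and numbers chosen for illustration; the rows of a are assumed sorted):
import numpy as np

a = np.array([[10, 20, 30],
              [10, 20, 30]])
b = np.array([[15, 25],
              [ 5, 35]])

max_num = max(a.max() - a.min(), b.max() - b.min()) + 1   # 31
r = max_num * np.arange(a.shape[0])[:, None]              # per-row offsets: 0, 31

flat = np.searchsorted((a + r).ravel(), (b + r).ravel()).reshape(b.shape)
out = flat - a.shape[1] * np.arange(a.shape[0])[:, None]
print(out)
# [[1 2]
#  [0 3]]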
Runtime test -
In [173]: def searchsorted2d_loopy(a,b):
...:     out = np.zeros(a.shape,dtype=int)
...:     for i in range(len(a)):
...:         out[i] = np.searchsorted(a[i],b[i])
...:     return out
...:
In [174]: # Setup input arrays
...: a = np.random.randint(11,99,(10000,20))
...: b = np.random.randint(11,99,(10000,20))
...: a = np.sort(a,1)
...: b = np.sort(b,1)
...:
In [175]: np.allclose(searchsorted2d(a,b),searchsorted2d_loopy(a,b))
Out[175]: True
In [176]: %timeit searchsorted2d_loopy(a,b)
10 loops, best of 3: 28.6 ms per loop
In [177]: %timeit searchsorted2d(a,b)
100 loops, best of 3: 13.7 ms per loop
The solution provided by @Divakar is ideal for integer data, but beware of precision issues for floating point values, especially if they span multiple orders of magnitude (e.g. [[1.0, 2.0, 3.0, 1.0e+20],...]). In some cases r may be so large that applying a+r and b+r wipes out the original values you're trying to run searchsorted on, and you're just comparing r to r.
To make the approach more robust for floating-point data, you could embed the row information into the arrays as part of the values (as a structured dtype), and run searchsorted on these structured dtypes instead.
def searchsorted_2d(a, v, side='left', sorter=None):
    import numpy as np
    # Make sure a and v are numpy arrays.
    a = np.asarray(a)
    v = np.asarray(v)
    # Augment a with row id
    ai = np.empty(a.shape, dtype=[('row', int), ('value', a.dtype)])
    ai['row'] = np.arange(a.shape[0]).reshape(-1, 1)
    ai['value'] = a
    # Augment v with row id
    vi = np.empty(v.shape, dtype=[('row', int), ('value', v.dtype)])
    vi['row'] = np.arange(v.shape[0]).reshape(-1, 1)
    vi['value'] = v
    # Perform searchsorted on augmented array.
    # The row information is embedded in the values, so only the equivalent rows
    # between a and v are considered.
    result = np.searchsorted(ai.flatten(), vi.flatten(), side=side, sorter=sorter)
    # Restore the original shape, decode the searchsorted indices so they apply to the original data.
    result = result.reshape(vi.shape) - vi['row'] * a.shape[1]
    return result
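A quick hedged sanity check of the structured-dtype version against the plain row-wise loop (small random float arrays, chosen for illustration):
import numpy as np

a = np.sort(np.random.rand(4, 5), axis=1)   # each row sorted
v = np.random.rand(4, 3)

ref = np.array([np.searchsorted(a[i], v[i]) for i in range(len(a))])
assert np.array_equal(searchsorted_2d(a, v), ref)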
Edit: The timing on this approach is abysmal!
In [21]: %timeit searchsorted_2d(a,b)
10 loops, best of 3: 92.5 ms per loop
You would be better off just using map over the arrays:
In [22]: %timeit np.array(list(map(np.searchsorted,a,b)))
100 loops, best of 3: 13.8 ms per loop
For integer data, @Divakar's approach is still the fastest:
In [23]: %timeit searchsorted2d(a,b)
100 loops, best of 3: 7.26 ms per loop

Numpy fast check for complete array equality, like Matlab's isequal

In Matlab, the builtin isequal does a check if two arrays are equal. If they are not equal, this might be very fast, as the implementation presumably stops checking as soon as there is a difference:
>> A = zeros(1e9, 1, 'single');
>> B = A(:);
>> B(1) = 1;
>> tic; isequal(A, B); toc;
Elapsed time is 0.000043 seconds.
Is there any equivalent in Python/numpy? all(A==B) or all(equal(A, B)) is far slower, because it compares all elements, even if the initial one differs:
In [13]: A = zeros(1e9, dtype='float32')
In [14]: B = A.copy()
In [15]: B[0] = 1
In [16]: %timeit all(A==B)
1 loops, best of 3: 612 ms per loop
Is there any numpy equivalent? It should be very easy to implement in C, but slow to implement in Python because this is a case where we do not want to broadcast, so it would require an explicit loop.
Edit:
It appears array_equal does what I want. However, it is not faster than all(A==B), because it's not a built-in, but just a short Python function doing A==B. So it does not meet my need for a fast check.
In [12]: %timeit array_equal(A, B)
1 loops, best of 3: 623 ms per loop
First, it should be noted that in the OP's example the arrays have identical elements because B=A[:] is just a view onto the array, so:
>>> print A[0], B[0]
1.0, 1.0
But, although the test isn't a fair one, the basic complaint is true: Numpy does not have a short-circuiting equivalence check.
One can easily see from the source that allclose, array_equal, and array_equiv are all just variations upon all(A==B), tailored to their respective details, and are not notably faster.
An advantage of numpy, though, is that slices are just views and are therefore very fast, so one can write a short-circuiting comparison fairly easily (I'm not saying this is ideal, but it does work):
from numpy import *
A = zeros(10**8, dtype='float32')
B = A[:]
B[0] = 1
C = array(B)
C[0] = 2
D = array(A)
D[-1] = 2

def short_circuit_check(a, b, n):
    # Compare in n chunks and bail out at the first chunk that differs.
    L = len(a) // n
    for i in range(n):
        j = i * L
        if not all(a[j:j+L] == b[j:j+L]):
            return False
    return True
In [26]: %timeit short_circuit_check(A, C, 100) # 100x faster
1000 loops, best of 3: 1.49 ms per loop
In [27]: %timeit all(A==C)
1 loops, best of 3: 158 ms per loop
In [28]: %timeit short_circuit_check(A, D, 100)
10 loops, best of 3: 144 ms per loop
In [29]: %timeit all(A==D)
10 loops, best of 3: 160 ms per loop

Fast numpy fancy indexing

My code for slicing a numpy array (via fancy indexing) is very slow. It is currently a bottleneck in my program.
a.shape
(3218, 6)
ts = time.time(); a[rows][:, cols]; te = time.time(); print('%.8f' % (te-ts));
0.00200009
What is the correct numpy call to get an array consisting of the subset of rows 'rows' and columns 'cols' of the matrix a? (In fact, I need the transpose of this result.)
Let me try to summarize the excellent answers by Jaime and TheodrosZelleke and mix in some comments.
Advanced (fancy) indexing always returns a copy, never a view.
a[rows][:,cols] implies two fancy indexing operations, so an intermediate copy a[rows] is created and discarded. Handy and readable, but not very efficient. Moreover, beware that [:,cols] usually generates a Fortran-contiguous copy from a C-contiguous source.
a[rows.reshape(-1,1),cols] is a single advanced indexing expression, based on the fact that rows.reshape(-1,1) and cols are broadcast to the shape of the intended result.
A common experience is that indexing in a flattened array can be more efficient than fancy indexing, so another approach is
indx = rows.reshape(-1,1)*a.shape[1] + cols
a.take(indx)
or
a.take(indx.flat).reshape(rows.size,cols.size)
Efficiency will depend on memory access patterns and whether the starting array is C-contiguous or Fortran-contiguous, so experimentation is needed.
Use fancy indexing only if really needed: basic slicing a[rstart:rstop:rstep, cstart:cstop:cstep] returns a view (although not contiguous) and should be faster!
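A minimal sketch comparing the forms mentioned above on a toy array (sizes and values chosen for illustration; all of them should yield the same sub-array):
import numpy as np

a = np.arange(20).reshape(4, 5)
rows = np.array([0, 2, 3])
cols = np.array([1, 4])

r1 = a[rows][:, cols]                                   # two fancy-indexing passes
r2 = a[rows.reshape(-1, 1), cols]                       # single broadcasted fancy index
r3 = a.take(rows.reshape(-1, 1) * a.shape[1] + cols)    # flat linear indices

assert np.array_equal(r1, r2) and np.array_equal(r2, r3)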
To my surprise, this kind of lengthy expression, which calculates linear 1D indices first, is more than 50% faster than the consecutive array indexing presented in the question:
(a.ravel()[(
cols + (rows * a.shape[1]).reshape((-1,1))
).ravel()]).reshape(rows.size, cols.size)
UPDATE: OP updated the description of the shape of the initial array. With the updated size the speedup is now above 99%:
In [93]: a = np.random.randn(3218, 1415)
In [94]: rows = np.random.randint(a.shape[0], size=2000)
In [95]: cols = np.random.randint(a.shape[1], size=6)
In [96]: timeit a[rows][:, cols]
10 loops, best of 3: 186 ms per loop
In [97]: timeit (a.ravel()[(cols + (rows * a.shape[1]).reshape((-1,1))).ravel()]).reshape(rows.size, cols.size)
1000 loops, best of 3: 1.56 ms per loop
INITIAL ANSWER:
Here is the transcript:
In [79]: a = np.random.randn(3218, 6)
In [80]: a.shape
Out[80]: (3218, 6)
In [81]: rows = np.random.randint(a.shape[0], size=2000)
In [82]: cols = np.array([1,3,4,5])
Time method 1:
In [83]: timeit a[rows][:, cols]
1000 loops, best of 3: 1.26 ms per loop
Time method 2:
In [84]: timeit (a.ravel()[(cols + (rows * a.shape[1]).reshape((-1,1))).ravel()]).reshape(rows.size, cols.size)
1000 loops, best of 3: 568 us per loop
Check that results are actually the same:
In [85]: result1 = a[rows][:, cols]
In [86]: result2 = (a.ravel()[(cols + (rows * a.shape[1]).reshape((-1,1))).ravel()]).reshape(rows.size, cols.size)
In [87]: np.sum(result1 - result2)
Out[87]: 0.0
You can get some speed up if you slice using fancy indexing and broadcasting:
from __future__ import division
import numpy as np
def slice_1(a, rs, cs) :
return a[rs][:, cs]
def slice_2(a, rs, cs) :
return a[rs[:, None], cs]
>>> rows, cols = 3218, 6
>>> rs = np.unique(np.random.randint(0, rows, size=(rows//2,)))
>>> cs = np.unique(np.random.randint(0, cols, size=(cols//2,)))
>>> a = np.random.rand(rows, cols)
>>> import timeit
>>> print timeit.timeit('slice_1(a, rs, cs)',
'from __main__ import slice_1, a, rs, cs',
number=1000)
0.24083110865
>>> print timeit.timeit('slice_2(a, rs, cs)',
'from __main__ import slice_2, a, rs, cs',
number=1000)
0.206566124519
If you think in terms of percentages, doing something 15% faster is always good, but in my system, for the size of your array, this is taking 40 µs less to do the slicing, and it is hard to believe that an operation taking 240 µs will be your bottleneck.
Using np.ix_ you can get a similar speed to the ravel/reshape approach, but with code that is clearer:
a = np.random.randn(3218, 1415)
rows = np.random.randint(a.shape[0], size=2000)
cols = np.random.randint(a.shape[1], size=6)
%timeit (a.ravel()[(cols + (rows * a.shape[1]).reshape((-1,1))).ravel()]).reshape(rows.size, cols.size)
#101 µs ± 2.36 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
%timeit ix_ = np.ix_(rows, cols); a[ix_]
#135 µs ± 7.47 µs per loop (mean ± std. dev. of 7 runs, 10000 loops each)
ix_ = np.ix_(rows, cols)
result1 = a[ix_]
result2 = (a.ravel()[(cols + (rows * a.shape[1]).reshape((-1,1))).ravel()]).reshape(rows.size, cols.size)
np.sum(result1 - result2)
0.0
