Given a 2D array, I'm looking for a pythonic way to get an array of same shape, with only the maximum element per each row.
See the max_row_filter function below:
import numpy as np

def max_row_filter(mat2d):
    m = np.zeros(mat2d.shape)
    for r in range(mat2d.shape[0]):
        c = np.argmax(mat2d[r])
        m[r, c] = mat2d[r, c]
    return m
p = np.array([[1,2,3],[5,4,3,],[9,10,3]])
max_row_filter(p)
Out: array([[ 0.,  0.,  3.],
            [ 5.,  0.,  0.],
            [ 0., 10.,  0.]])
I'm looking for an efficient way to do this, suitable to be done on big arrays.
Alternative answer (this will keep duplicates):
p * (p==p.max(axis=1, keepdims=True))
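For example, with a row that contains a tied maximum (the array q below is just an illustration, not from the question):

import numpy as np

q = np.array([[1, 3, 3],
              [5, 4, 3]])
# The boolean mask marks every position equal to its row maximum,
# so tied maxima are all kept.
print(q * (q == q.max(axis=1, keepdims=True)))
# [[0 3 3]
#  [5 0 0]]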
If there are no duplicates, you could use numpy.argmax:
import numpy as np
p = np.array([[1, 2, 3],
              [5, 4, 3],
              [9, 10, 3]])
result = np.zeros_like(p)
rows, cols = zip(*enumerate(np.argmax(p, axis=1)))
result[rows, cols] = p[rows, cols]
print(result)
Output
[[ 0  0  3]
 [ 5  0  0]
 [ 0 10  0]]
Note that, for multiple occurrences, argmax returns the first occurrence.
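A quick illustration of that behaviour (the tied row is chosen arbitrarily):

np.argmax(np.array([9, 10, 10]))  # -> 1, the index of the first maximum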
Prerequisite
This question is an extension of this post, so some of the introduction to the problem will be similar to that post.
Problem
Let's say result is a 2D array and values is a 1D array. values holds some values associated with each element in result. The mapping of an element in values to result is stored in x_mapping and y_mapping. A position in result can be associated with different values. The (x, y) pair from x_mapping and y_mapping is associated with result[-y, x]. I have to find the unique count of the values grouped by associations.
An example for better clarification.
result array:
[[ 0., 0.],
 [ 0., 0.],
 [ 0., 0.],
 [ 0., 0.]]
values array:
[ 1., 2., 1., 1., 5., 6., 7., 1.]
Note: Here the result array and values happen to have the same number of elements, but that might not be the case. There is no relation between the sizes at all.
x_mapping and y_mapping have mappings from 1D values to 2D result. The sizes of x_mapping, y_mapping and values will be the same.
x_mapping - [0, 1, 0, 0, 0, 0, 0, 0]
y_mapping - [0, 3, 2, 2, 0, 3, 2, 0]
Here, the 1st value (values[0]), the 5th value (values[4]) and the 8th value (values[7]) have x as 0 and y as 0 (x_mapping[0] and y_mapping[0]) and hence are associated with result[0, 0]. If we compute the count of distinct values in this group, (1, 5, 1), we get 2 as the result.
@WarrenWeckesser, let's see how the [1, 3] (x, y) pair from x_mapping and y_mapping contributes to result. Since there is only one value, i.e. 2, associated with this particular group, result[-3, 1] will be one, as the number of distinct values associated with that cell is one.
Another example: let's compute the value of result[-1, 1]. From the mappings, since there is no value associated with that cell, result[-1, 1] will be zero. Similarly, the position [-2, 0] in result will have the value 2.
Note that if there is no association at all then the default value for result will be zero.
The result after the computation:
[[ 2., 0.],
 [ 1., 1.],
 [ 2., 0.],
 [ 0., 0.]]
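As a sanity check, the count for a single cell can be computed directly from the example mappings above (a minimal sketch, not the required solution):

import numpy as np

x_mapping = np.array([0, 1, 0, 0, 0, 0, 0, 0])
y_mapping = np.array([0, 3, 2, 2, 0, 3, 2, 0])
values = np.array([1., 2., 1., 1., 5., 6., 7., 1.])

# values associated with result[0, 0], i.e. x == 0 and y == 0
mask = (x_mapping == 0) & (y_mapping == 0)
print(np.unique(values[mask]).size)   # 2 distinct values: {1.0, 5.0}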
Current working solution
Using the answer from @Divakar, I was able to find a working solution.
x_mapping = np.array([0, 1, 0, 0, 0, 0, 0, 0])
y_mapping = np.array([0, 3, 2, 2, 0, 3, 2, 0])
values = np.array([ 1., 2., 1., 1., 5., 6., 7., 1.], dtype=np.float32)
result = np.zeros([4, 2], dtype=np.float32)
m,n = result.shape
out_dtype = result.dtype
lidx = ((-y_mapping)%m)*n + x_mapping
sidx = lidx.argsort()
idx = lidx[sidx]
val = values[sidx]
m_idx = np.flatnonzero(np.r_[True,idx[:-1] != idx[1:]])
unq_ids = idx[m_idx]
r_res = np.zeros(m_idx.size, dtype=np.float32)
for i in range(0, m_idx.shape[0]):
    _next = None
    arr = None
    if i == m_idx.shape[0]-1:
        _next = val.shape[0]
    else:
        _next = m_idx[i+1]
    _start = m_idx[i]
    if _start >= _next:
        arr = val[_start]
    else:
        arr = val[_start:_next]
    r_res[i] = np.unique(arr).size
result.flat[unq_ids] = r_res
Question
Now, the above solution takes 15 ms to operate on 19943 values.
I'm looking for a way to compute the result faster. Is there any more performant way to do this?
Side note
I'm using Numpy version 1.14.3 with Python 3.5.2
Edits
Thanks to @WarrenWeckesser for pointing out that I hadn't explained how an element in result is associated with (x, y) from the mappings. I have updated the post and added examples for clarity.
Here is one solution
import numpy as np
x_mapping = np.array([0, 1, 0, 0, 0, 0, 0, 0])
y_mapping = np.array([0, 3, 2, 2, 0, 3, 2, 0])
values = np.array([ 1., 2., 1., 1., 5., 6., 7., 1.], dtype=np.float32)
result = np.zeros([4, 2], dtype=np.float32)
# Get flat indices
idx_mapping = np.ravel_multi_index((-y_mapping, x_mapping), result.shape, mode='wrap')
# Sort flat indices and reorders values accordingly
reorder = np.argsort(idx_mapping)
idx_mapping = idx_mapping[reorder]
values = values[reorder]
# Get unique values
val_uniq = np.unique(values)
# Find where each unique value appears
val_uniq_hit = values[:, np.newaxis] == val_uniq
# Find reduction indices (slices with the same flat index)
reduce_idx = np.concatenate([[0], np.nonzero(np.diff(idx_mapping))[0] + 1])
# Reduce slices
reduced = np.logical_or.reduceat(val_uniq_hit, reduce_idx)
# Count distinct values on each slice
counts = np.count_nonzero(reduced, axis=1)
# Put counts in result
result.flat[idx_mapping[reduce_idx]] = counts
print(result)
# [[2. 0.]
# [1. 1.]
# [2. 0.]
# [0. 0.]]
This method takes more memory (O(len(values) * len(np.unique(values)))), but a small benchmark comparing with your original solution shows a significant speedup (although that depends on the actual size of the problem):
import numpy as np
np.random.seed(100)
result = np.zeros([400, 200], dtype=np.float32)
values = np.random.randint(100, size=(20000,)).astype(np.float32)
x_mapping = np.random.randint(result.shape[1], size=values.shape)
y_mapping = np.random.randint(result.shape[0], size=values.shape)
res1 = solution_orig(x_mapping, y_mapping, values, result)
res2 = solution(x_mapping, y_mapping, values, result)
print(np.allclose(res1, res2))
# True
# Original solution
%timeit solution_orig(x_mapping, y_mapping, values, result)
# 76.2 ms ± 623 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
# This solution
%timeit solution(x_mapping, y_mapping, values, result)
# 13.8 ms ± 51.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
Full code of benchmark functions:
import numpy as np
def solution(x_mapping, y_mapping, values, result):
    result = np.array(result)
    idx_mapping = np.ravel_multi_index((-y_mapping, x_mapping), result.shape, mode='wrap')
    reorder = np.argsort(idx_mapping)
    idx_mapping = idx_mapping[reorder]
    values = values[reorder]
    val_uniq = np.unique(values)
    val_uniq_hit = values[:, np.newaxis] == val_uniq
    reduce_idx = np.concatenate([[0], np.nonzero(np.diff(idx_mapping))[0] + 1])
    reduced = np.logical_or.reduceat(val_uniq_hit, reduce_idx)
    counts = np.count_nonzero(reduced, axis=1)
    result.flat[idx_mapping[reduce_idx]] = counts
    return result
def solution_orig(x_mapping, y_mapping, values, result):
    result = np.array(result)
    m, n = result.shape
    out_dtype = result.dtype
    lidx = ((-y_mapping)%m)*n + x_mapping
    sidx = lidx.argsort()
    idx = lidx[sidx]
    val = values[sidx]
    m_idx = np.flatnonzero(np.r_[True, idx[:-1] != idx[1:]])
    unq_ids = idx[m_idx]
    r_res = np.zeros(m_idx.size, dtype=np.float32)
    for i in range(0, m_idx.shape[0]):
        _next = None
        arr = None
        if i == m_idx.shape[0]-1:
            _next = val.shape[0]
        else:
            _next = m_idx[i+1]
        _start = m_idx[i]
        if _start >= _next:
            arr = val[_start]
        else:
            arr = val[_start:_next]
        r_res[i] = np.unique(arr).size
    result.flat[unq_ids] = r_res
    return result
I have two 1D numpy arrays A and B of size (n, ) and (m, ) respectively which correspond to the x positions of points on a line. I want to calculate the distance between every point in A to every point in B. I then need to use these distances at a set y distance, d, to work out the potential at each point in A.
I'm currently using the following:
V = numpy.zeros(n)
for i in range(n):
    xdist = A[i] - B
    r = numpy.sqrt(xdist**2 + d**2)
    dV = 1/r
    V[i] = numpy.sum(dV)
This works, but for large data sets it can take a while. I would like to use a function similar to scipy.spatial.distance.cdist, but it doesn't work for 1D arrays, and I don't want to add another dimension to the arrays as they would become too large.
Vectorized approach
One vectorized approach after extending A to 2D with the introduction of a new axis using np.newaxis/None and thus making use of broadcasting would be -
(1/(np.sqrt((A[:,None] - B)**2 + d**2))).sum(1)
Hybrid approach for large arrays
Now, for large arrays, we might have to divide the data into chunks.
Thus, with BSZ as the block size, we would have a hybrid approach, like so -
dsq = d**2
V = np.zeros((n//BSZ,BSZ))
for i in range(n//BSZ):
    V[i] = (1/(np.sqrt((A[i*BSZ:(i+1)*BSZ,None] - B)**2 + dsq))).sum(1)
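Note that the loop above assumes n is an exact multiple of BSZ. If that might not hold, a minimal sketch that also handles a trailing remainder chunk could look like so -

dsq = d**2
V = np.zeros(n)
for start in range(0, n, BSZ):
    stop = min(start + BSZ, n)
    V[start:stop] = (1/np.sqrt((A[start:stop, None] - B)**2 + dsq)).sum(1)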
Runtime test
Approaches -
def original_app(A,B,d):
    V = np.zeros(n)
    for i in range(n):
        xdist = A[i] - B
        r = np.sqrt(xdist**2 + d**2)
        dV = 1/r
        V[i] = np.sum(dV)
    return V

def vectorized_app1(A,B,d):
    return (1/(np.sqrt((A[:,None] - B)**2 + d**2))).sum(1)

def vectorized_app2(A,B,d, BSZ = 100):
    dsq = d**2
    V = np.zeros((n//BSZ,BSZ))
    for i in range(n//BSZ):
        V[i] = (1/(np.sqrt((A[i*BSZ:(i+1)*BSZ,None] - B)**2 + dsq))).sum(1)
    return V.ravel()
Timings and verification -
In [203]: # Setup inputs
...: n,m = 10000,2000
...: A = np.random.rand(n)
...: B = np.random.rand(m)
...: d = 10
...:
In [204]: out1 = original_app(A,B,d)
...: out2 = vectorized_app1(A,B,d)
...: out3 = vectorized_app2(A,B,d, BSZ = 100)
...:
...: print np.allclose(out1, out2)
...: print np.allclose(out1, out3)
...:
True
True
In [205]: %timeit original_app(A,B,d)
10 loops, best of 3: 133 ms per loop
In [206]: %timeit vectorized_app1(A,B,d)
10 loops, best of 3: 138 ms per loop
In [207]: %timeit vectorized_app2(A,B,d, BSZ = 100)
10 loops, best of 3: 65.2 ms per loop
We can play around with the parameter block size BSZ -
In [208]: %timeit vectorized_app2(A,B,d, BSZ = 200)
10 loops, best of 3: 74.5 ms per loop
In [209]: %timeit vectorized_app2(A,B,d, BSZ = 50)
10 loops, best of 3: 67.4 ms per loop
Thus, the best one seems to be giving a 2x speedup with a block size of 100 at my end.
EDIT: My answer turned out to be nearly identical to Divakar's after a closer look. However, you can save some memory by doing the operations in-place. Taking the sum along the second axis is more efficient than along the first.
import numpy
a = numpy.random.randint(0,10,10) * 1.
b = numpy.random.randint(0,10,10) * 1.
xdist = a[:,None] - b      # pairwise differences via broadcasting
xdist **= 2
xdist += d**2              # d is the fixed y-separation from the question
xdist **= -0.5             # 1/sqrt(xdist**2 + d**2), computed in place
V = numpy.sum(xdist, axis=1)
which gives the same solution as your code.
I would like to use a function similar to scipy.spatial.distance.cdist which doesn't work for 1D arrays and I don't want to add another dimension to the arrays as they become too large.
cdist works fine, you just have to reshape the arrays to have shape (n, 1) instead of (n,). You can add another dimension to a one-dimensional array A without copying the underlying data by using A[:, None] or A.reshape(-1, 1).
For example,
In [56]: from scipy.spatial.distance import cdist
In [57]: A
Out[57]: array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
In [58]: B
Out[58]: array([0, 2, 4, 6, 8])
In [59]: A[:, None]
Out[59]:
array([[0],
       [1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8],
       [9]])
In [60]: cdist(A[:, None], B[:, None])
Out[60]:
array([[ 0., 2., 4., 6., 8.],
       [ 1., 1., 3., 5., 7.],
       [ 2., 0., 2., 4., 6.],
       [ 3., 1., 1., 3., 5.],
       [ 4., 2., 0., 2., 4.],
       [ 5., 3., 1., 1., 3.],
       [ 6., 4., 2., 0., 2.],
       [ 7., 5., 3., 1., 1.],
       [ 8., 6., 4., 2., 0.],
       [ 9., 7., 5., 3., 1.]])
To compute V as shown in your code, you can use cdist with metric='sqeuclidean', as follows:
In [72]: d = 3.
In [73]: r = np.sqrt(cdist(A[:,None], B[:,None], metric='sqeuclidean') + d**2)
In [74]: V = (1/r).sum(axis=1)
Given two numpy arrays of shape nx3 and mx3, what is an efficient way to determine the row indices (counter) where the rows are common to the two arrays? For instance, I have the following solution, which is significantly slow even for arrays that are not much larger.
def arrangment(arr1, arr2):
    hits = []
    for i in range(arr2.shape[0]):
        current_row = np.repeat(arr2[i,:][None,:], arr1.shape[0], axis=0)
        x = current_row - arr1
        for j in range(arr1.shape[0]):
            if np.isclose(x[j,0], 0.0) and np.isclose(x[j,1], 0.0) and np.isclose(x[j,2], 0.0):
                hits.append(j)
    return hits
It checks if rows of arr2 exist in arr1 and returns the row indices of arr1 where the rows match. I need this arrangement to be always sequentially ascending in terms of rows of arr2. For instance given
arr1 = np.array([[-1., -1., -1.],
                 [ 1., -1., -1.],
                 [ 1.,  1., -1.],
                 [-1.,  1., -1.],
                 [-1., -1.,  1.],
                 [ 1., -1.,  1.],
                 [ 1.,  1.,  1.],
                 [-1.,  1.,  1.]])
arr2 = np.array([[-1.,  1., -1.],
                 [ 1.,  1., -1.],
                 [ 1.,  1.,  1.],
                 [-1.,  1.,  1.]])
The function should return:
[3, 2, 6, 7]
quick and dirty answer
(arr1[:, None] == arr2).all(-1).argmax(0)
array([3, 2, 6, 7])
Better answer
Takes care of the chance that a row in arr2 doesn't match anything in arr1:
t = (arr1[:, None] == arr2).all(-1)
np.where(t.any(0), t.argmax(0), np.nan)
array([ 3., 2., 6., 7.])
As pointed out by @Divakar, np.isclose accounts for rounding errors when comparing floats:
t = np.isclose(arr1[:, None], arr2).all(-1)
np.where(t.any(0), t.argmax(0), np.nan)
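For instance, appending a row to arr2 that has no counterpart in arr1 (the extra row is just an illustration):

arr2_miss = np.vstack([arr2, [[9., 9., 9.]]])
t = np.isclose(arr1[:, None], arr2_miss).all(-1)
np.where(t.any(0), t.argmax(0), np.nan)

array([ 3.,  2.,  6.,  7., nan])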
I had a similar problem in the past and I came up with a fairly optimised solution for it.
First you need a generalisation of numpy.unique for multidimensional arrays, which, for the sake of completeness, I will copy here:
def unique2d(arr, consider_sort=False, return_index=False, return_inverse=False):
    """Get unique values along an axis for 2D arrays.

    input:
        arr:
            2D array
        consider_sort:
            Does permutation of the values within the axis matter?
            Two rows can contain the same values but with
            different arrangements. If consider_sort
            is True then those rows would be considered equal.
        return_index:
            Similar to numpy unique
        return_inverse:
            Similar to numpy unique

    returns:
        2D array of unique rows
        If return_index is True also returns indices
        If return_inverse is True also returns the inverse array
    """
    if consider_sort is True:
        a = np.sort(arr, axis=1)
    else:
        a = arr
    b = np.ascontiguousarray(a).view(np.dtype((np.void,
                                               a.dtype.itemsize * a.shape[1])))

    if return_inverse is False:
        _, idx = np.unique(b, return_index=True)
    else:
        _, idx, inv = np.unique(b, return_index=True, return_inverse=True)

    if return_index == False and return_inverse == False:
        return arr[idx]
    elif return_index == True and return_inverse == False:
        return arr[idx], idx
    elif return_index == False and return_inverse == True:
        return arr[idx], inv
    else:
        return arr[idx], idx, inv
Now all you need is to concatenate (np.vstack) your arrays and find the unique rows. The reverse mapping together with np.searchsorted will give you the indices you need. So let's write another function, similar to numpy.in1d but for multidimensional (2D) arrays:
def in2d_unsorted(arr1, arr2, axis=1, consider_sort=False):
    """Find the elements in arr1 which are also in
    arr2 and sort them as they appear in arr2"""

    assert arr1.dtype == arr2.dtype

    if axis == 0:
        arr1 = np.copy(arr1.T, order='C')
        arr2 = np.copy(arr2.T, order='C')

    if consider_sort is True:
        sorter_arr1 = np.argsort(arr1)
        arr1 = arr1[np.arange(arr1.shape[0])[:,None], sorter_arr1]
        sorter_arr2 = np.argsort(arr2)
        arr2 = arr2[np.arange(arr2.shape[0])[:,None], sorter_arr2]

    arr = np.vstack((arr1, arr2))
    _, inv = unique2d(arr, return_inverse=True)

    size1 = arr1.shape[0]
    size2 = arr2.shape[0]
    arr3 = inv[:size1]
    arr4 = inv[-size2:]

    # Sort the indices as they appear in arr2
    sorter = np.argsort(arr3)
    idx = sorter[arr3.searchsorted(arr4, sorter=sorter)]

    return idx
Now all you need to do is call in2d_unsorted with your input parameters
>>> in2d_unsorted(arr1,arr2)
array([ 3, 2, 6, 7])
While it may not be fully optimised, this approach is much faster. Let's benchmark it against @piRSquared's solution:
def indices_piR(arr1, arr2):
    t = np.isclose(arr1[:, None], arr2).all(-1)
    return np.where(t.any(0), t.argmax(0), np.nan)
with the following arrays
n=150
arr1 = np.random.permutation(n).reshape(n//3, 3)
idx = np.random.permutation(n//3)
arr2 = arr1[idx]
In [13]: np.allclose(in2d_unsorted(arr1,arr2),indices_piR(arr1,arr2))
True
In [14]: %timeit indices_piR(arr1,arr2)
10000 loops, best of 3: 181 µs per loop
In [15]: %timeit in2d_unsorted(arr1,arr2)
10000 loops, best of 3: 85.7 µs per loop
Now, for n=1500
In [24]: %timeit indices_piR(arr1,arr2)
100 loops, best of 3: 10.3 ms per loop
In [25]: %timeit in2d_unsorted(arr1,arr2)
1000 loops, best of 3: 403 µs per loop
and for n=15000
In [28]: %timeit indices_piR(A,B)
1 loop, best of 3: 1.02 s per loop
In [29]: %timeit in2d_unsorted(arr1,arr2)
100 loops, best of 3: 4.65 ms per loop
So for largeish arrays this is over 200x faster compared to @piRSquared's vectorised solution.
How can I create a numpy matrix with its elements being a function of its indices?
For example, a multiplication table: a[i,j] = i*j
An un-numpy and un-pythonic way would be to create an array of zeros and then loop through it.
There is no doubt that there is a better way to do this, without a loop.
However, even better would be to create the matrix straight-away.
A generic solution would be to use np.fromfunction()
From the doc:
numpy.fromfunction(function, shape, **kwargs)
Construct an array by executing a function over each coordinate. The resulting array therefore has a value fn(x, y, z) at coordinate (x, y, z).
The below snippet should provide the required matrix.
import numpy as np
np.fromfunction(lambda i, j: i*j, (5,5))
Output:
array([[ 0., 0., 0., 0., 0.],
       [ 0., 1., 2., 3., 4.],
       [ 0., 2., 4., 6., 8.],
       [ 0., 3., 6., 9., 12.],
       [ 0., 4., 8., 12., 16.]])
The first parameter to the function is a callable which is executed for each of the coordinates. If foo is a function that you pass as the first argument, foo(i, j) will be the value at (i, j). This holds for higher dimensions too. Note that fromfunction calls the callable once with whole index arrays rather than scalar coordinates, so the callable has to work elementwise on arrays; that is why the edit below wraps the scalar function with np.vectorize. The shape of the coordinate array can be modified using the shape parameter.
Edit:
Based on the comment on using custom functions like lambda x,y: 2*x if x > y else y/2, the following code works:
import numpy as np
def generic_f(shape, elementwise_f):
    fv = np.vectorize(elementwise_f)
    return np.fromfunction(fv, shape)

def elementwise_f(x, y):
    return 2*x if x > y else y/2
print(generic_f( (5,5), elementwise_f))
Output:
[[0. 0.5 1. 1.5 2. ]
 [2. 0.5 1. 1.5 2. ]
 [4. 4. 1. 1.5 2. ]
 [6. 6. 6. 1.5 2. ]
 [8. 8. 8. 8. 2. ]]
The user is expected to pass a scalar function that defines the elementwise operation. np.vectorize is used to vectorize the user-defined scalar function and is passed to np.fromfunction().
Here's one way to do that:
>>> indices = numpy.indices((5, 5))
>>> a = indices[0] * indices[1]
>>> a
array([[ 0, 0, 0, 0, 0],
       [ 0, 1, 2, 3, 4],
       [ 0, 2, 4, 6, 8],
       [ 0, 3, 6, 9, 12],
       [ 0, 4, 8, 12, 16]])
To further explain, numpy.indices((5, 5)) generates two arrays containing the x and y indices of a 5x5 array like so:
>>> numpy.indices((5, 5))
array([[[0, 0, 0, 0, 0],
        [1, 1, 1, 1, 1],
        [2, 2, 2, 2, 2],
        [3, 3, 3, 3, 3],
        [4, 4, 4, 4, 4]],

       [[0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4],
        [0, 1, 2, 3, 4]]])
When you multiply these two arrays, numpy multiplies the value of the two arrays at each position and returns the result.
For the multiplication
np.multiply.outer(np.arange(5), np.arange(5)) # a_ij = i * j
and in general
np.frompyfunc(
    lambda i, j: f(i, j), 2, 1
).outer(
    np.arange(5),
    np.arange(5),
).astype(np.float64)  # a_ij = f(i, j)
basically you create an np.ufunc via np.frompyfunc and then outer it with the indices.
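For example, with a hypothetical f (any elementwise scalar function will do; this one is not from the question):

f = lambda i, j: i**2 + j
np.frompyfunc(f, 2, 1).outer(np.arange(3), np.arange(3)).astype(np.float64)

array([[0., 1., 2.],
       [1., 2., 3.],
       [4., 5., 6.]])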
Edit
Speed comparison between the different solutions.
Small matrices:
Eyy![1]: %timeit np.multiply.outer(np.arange(5), np.arange(5))
100000 loops, best of 3: 4.97 µs per loop
Eyy![2]: %timeit np.array( [ [ i*j for j in xrange(5)] for i in xrange(5)] )
100000 loops, best of 3: 5.51 µs per loop
Eyy![3]: %timeit indices = np.indices((5, 5)); indices[0] * indices[1]
100000 loops, best of 3: 16.1 µs per loop
Bigger matrices:
Eyy![4]: %timeit np.multiply.outer(np.arange(4096), np.arange(4096))
10 loops, best of 3: 62.4 ms per loop
Eyy![5]: %timeit indices = np.indices((4096, 4096)); indices[0] * indices[1]
10 loops, best of 3: 165 ms per loop
Eyy![6]: %timeit np.array( [ [ i*j for j in xrange(4096)] for i in xrange(4096)] )
1 loops, best of 3: 1.39 s per loop
I'm away from my python at the moment, but does this one work?
array( [ [ i*j for j in xrange(5)] for i in xrange(5)] )
Just wanted to add that @Senderle's response can be generalized for any function and dimension:
dims = (3,3,3) #i,j,k
ii = np.indices(dims)
You could then calculate a[i,j,k] = i*j*k as
a = np.prod(ii,axis=0)
or a[i,j,k] = (i-1)*j*k:
a = (ii[0,...]-1)*ii[1,...]*ii[2,...]
etc
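Putting that together, a minimal self-contained sketch for the 3D case (the spot-checked values are worked out by hand) could be:

import numpy as np

dims = (3, 3, 3)
ii = np.indices(dims)            # ii[0], ii[1], ii[2] hold i, j, k at every position

a = np.prod(ii, axis=0)          # a[i, j, k] = i*j*k
b = (ii[0] - 1) * ii[1] * ii[2]  # b[i, j, k] = (i-1)*j*k

print(a[2, 1, 2])                # 4
print(b[2, 1, 2])                # 2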