Sorry for the confusing title; I'm not sure how to make it more concise. Here are my requirements:
arr1 = np.array([3,5,9,1])
arr2 = ?(arr1)
arr2 would then be:
[
[0,1,2,0,0,0,0,0,0],
[0,1,2,3,4,0,0,0,0],
[0,1,2,3,4,5,6,7,8],
[0,0,0,0,0,0,0,0,0]
]
It doesn't need to vary based on the max; the shape is known in advance. So to start, I've been able to get a shape of zeros:
arr2 = np.zeros((len(arr1),max_len))
And then of course I could do a for loop over arr1 like this:
for i, element in enumerate(arr1):
    arr2[i, 0:element] = np.arange(element)
but that would likely take a long time, since both dimensions here are rather large (arr1 is a few million rows, max_len is around 500). Is there a clean, optimized way to do this in NumPy?
Building on a 'padding' idea posted by @Divakar some years ago:
In [161]: res = np.arange(9)[None,:].repeat(4,0)
In [162]: res[res>=arr1[:,None]] = 0
In [163]: res
Out[163]:
array([[0, 1, 2, 0, 0, 0, 0, 0, 0],
       [0, 1, 2, 3, 4, 0, 0, 0, 0],
       [0, 1, 2, 3, 4, 5, 6, 7, 8],
       [0, 0, 0, 0, 0, 0, 0, 0, 0]])
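A variation on the same masking idea (my sketch, not part of the original answer): np.where with broadcasting builds the result in one expression, without mutating a repeated array in place:
import numpy as np

arr1 = np.array([3, 5, 9, 1])
max_len = 9

col = np.arange(max_len)                     # the template row, shape (max_len,)
res = np.where(col < arr1[:, None], col, 0)  # broadcasts to (len(arr1), max_len)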
Try this with itertools.zip_longest -
import numpy as np
import itertools
l = map(range, arr1)
arr2 = np.column_stack((itertools.zip_longest(*l, fillvalue=0)))
print(arr2)
array([[0, 1, 2, 0, 0, 0, 0, 0, 0],
       [0, 1, 2, 3, 4, 0, 0, 0, 0],
       [0, 1, 2, 3, 4, 5, 6, 7, 8],
       [0, 0, 0, 0, 0, 0, 0, 0, 0]])
I am adding a slight variation on @hpaulj's answer because you mentioned that max_len is around 500 and you have millions of rows. In this case, you can precompute a 500-by-500 matrix containing all possible rows and index into it using arr1:
import numpy as np
np.random.seed(0)
max_len = 500
arr = np.random.randint(0, max_len, size=10**5)
# generate all unique rows first, then index
# can be faster if max_len << len(arr)
# 53 ms
template = np.tril(np.arange(max_len)[None,:].repeat(max_len,0), k=-1)
res = template[arr,:]
# 173 ms
res1 = np.arange(max_len)[None,:].repeat(arr.size,0)
res1[res1>=arr[:,None]] = 0
assert (res == res1).all()
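A hedged tweak on the template approach: with max_len around 500, the values fit in int16, which roughly quarters the memory of a multi-million-row result (assuming the smaller output dtype is acceptable downstream):
# same template trick with a compact dtype (my assumption, not in the answer)
template16 = np.tril(np.arange(max_len, dtype=np.int16)[None, :].repeat(max_len, 0), k=-1)
res16 = template16[arr, :]
assert (res16 == res1).all()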
In Python, for my application it is usually best to create a sparse matrix by building a COO matrix from rows, columns, and values arrays and then converting it to CSC (or CSR) format.
Now, say I want to condense the CSC matrix. What is an efficient way to do so? The condensed rows/columns vary during the code and are much smaller in number than the dimensions of the sparse matrix, so I do not believe rebuilding the COO matrix is efficient.
The following MWE shows an example of creating the condensed matrix, but without any optimization attempts. There is a sparse efficiency warning because the number of nonzeros is increased. In this MWE I use dia_array to create the sparse matrix, for simplicity.
import numpy as np
from scipy.sparse import dia_array

def main():
    n = 10
    m = 6
    data = np.tile(np.concatenate((np.arange(1, m+1),
                                   np.arange(m-1, 0, -1)))[:, np.newaxis], (1, n))
    offsets = np.arange(-m+1, m)
    A = dia_array((data, offsets), shape=(n, n)).tocsc()
    print("Matrix A:")
    print(repr(A.toarray()))
    # condense these rows/columns
    cond_rowscols = np.arange(n-8, n, 2)
    print("Condensed rows/columns of A:")
    print(repr(cond_rowscols))
    # IMPROVE HERE
    # condensation algorithm
    B = A.copy()
    B[[cond_rowscols[0]]] += B[cond_rowscols[1:]].sum(axis=0)
    B[:, [cond_rowscols[0]]] += B[:, cond_rowscols[1:]].sum(axis=1)[:, np.newaxis]
    free_rowscols = np.ones(n, dtype=bool)
    free_rowscols[cond_rowscols[1:]] = False
    B = B[np.ix_(free_rowscols, free_rowscols)]
    print("Condensed matrix A:")
    print(repr(B.toarray()))
    print('Done')

if __name__ == "__main__":
    main()
The output is:
Matrix A:
array([[6, 5, 4, 3, 2, 1, 0, 0, 0, 0],
       [5, 6, 5, 4, 3, 2, 1, 0, 0, 0],
       [4, 5, 6, 5, 4, 3, 2, 1, 0, 0],
       [3, 4, 5, 6, 5, 4, 3, 2, 1, 0],
       [2, 3, 4, 5, 6, 5, 4, 3, 2, 1],
       [1, 2, 3, 4, 5, 6, 5, 4, 3, 2],
       [0, 1, 2, 3, 4, 5, 6, 5, 4, 3],
       [0, 0, 1, 2, 3, 4, 5, 6, 5, 4],
       [0, 0, 0, 1, 2, 3, 4, 5, 6, 5],
       [0, 0, 0, 0, 1, 2, 3, 4, 5, 6]])
Condensed rows/columns of A:
array([2, 4, 6, 8])
Condensed matrix A:
array([[ 6,  5,  6,  3,  1,  0,  0],
       [ 5,  6,  9,  4,  2,  0,  0],
       [ 6,  9, 56, 14, 16, 14,  9],
       [ 3,  4, 14,  6,  4,  2,  0],
       [ 1,  2, 16,  4,  6,  4,  2],
       [ 0,  0, 14,  2,  4,  6,  4],
       [ 0,  0,  9,  0,  2,  4,  6]])
Done
Edit: As per hpaulj's comment, we can create a condensation matrix T:
# condensation with matrix multiplication
from scipy.sparse import csc_array  # import needed for T below

n_conds = cond_rowscols.shape[0]  # number of condensed rows/cols
t_vals = np.ones(n, dtype=int)
t_rows = np.arange(n)
t_cols = np.empty_like(t_rows)
t_cols[free_rowscols] = np.arange(n-n_conds+1)
t_cols[cond_rowscols[1:]] = cond_rowscols[0]
T = csc_array((t_vals, (t_rows, t_cols)), shape=(n, n-n_conds+1))
Such that the condensed matrix A is B = T.T @ A @ T.
T is:
>>> print(repr(T.toarray()))
array([[1, 0, 0, 0, 0, 0, 0],
       [0, 1, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 1, 0, 0],
       [0, 0, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 1, 0],
       [0, 0, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 1]], dtype=int32)
This does not yield any sparse efficiency warnings! I still have to time it for larger problems, though. Does scipy.sparse use sparse BLAS for sparse matrix multiplication?
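To convince myself T is right, a quick check against the in-place construction (my sketch; it assumes A, B and T from the MWE above are in scope):
# sanity check: the product route reproduces the summed-and-sliced B
B_mult = T.T @ A @ T
assert (B_mult.toarray() == B.toarray()).all()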
I have created a test case for both options shown in the MWE, with a simple performance comparison using a smaller set of values than I intend to use in practice.
It seems that the in-place sum option takes about five times longer than the multiplication option.
Since solving linear systems is always much more expensive, I think I'll settle for the sparse matrix multiplication option for now.
import numpy as np
from scipy.sparse import dia_array, csr_array
from time import perf_counter

def build_sp_mat(n=None, m=None):
    """Generates a square sparse CSC matrix of size `n` and half bandwidth `m`"""
    if n is None:
        n = 1_000  # matrix size
    if m is None:
        m = 50  # half bandwidth
    data = np.tile(np.concatenate((np.arange(1, m+1),
                                   np.arange(m-1, 0, -1)))[:, np.newaxis], (1, n))
    offsets = np.arange(-m+1, m)
    A = dia_array((data, offsets), shape=(n, n)).tocsc()
    return A

def condensed_rowscols(n, n_conds=None):
    """Returns the indices of the condensed rows/columns

    Also returns a boolean mask of the non-condensed rows/columns
    """
    if n_conds is None:
        n_conds = 50  # total number of condensed rows/columns
    # condense these rows/columns
    cond_rowscols = np.arange(n//n_conds-1, n, n//n_conds)
    free_rowscols = np.ones(n, dtype=bool)
    free_rowscols[cond_rowscols[1:]] = False
    print(repr(cond_rowscols))
    return cond_rowscols, free_rowscols

def condense_in_place_sum(A, cond_rowscols, free_rowscols):
    """Performs condensation via a sum of the condensed rows and columns"""
    A[[cond_rowscols[0]]] += A[cond_rowscols[1:]].sum(axis=0)
    A[:, [cond_rowscols[0]]] += A[:, cond_rowscols[1:]].sum(axis=1)[:, np.newaxis]
    Acond = A[np.ix_(free_rowscols, free_rowscols)]  # indices are sorted
    return Acond

def condense_mat_mult(A, cond_rowscols, free_rowscols, n, n_conds):
    """Performs condensation via sparse matrix multiplications"""
    t_vals = np.ones(n, dtype=int)
    t_rows = np.arange(n)
    t_cols = np.empty_like(t_rows)
    t_cols[free_rowscols] = np.arange(n-n_conds+1)
    t_cols[cond_rowscols[1:]] = cond_rowscols[0]
    T = csr_array((t_vals, (t_rows, t_cols)), shape=(n, n-n_conds+1))
    Acond = T.T @ A @ T
    Acond.sort_indices()  # indices have to be sorted
    return Acond, T

if __name__ == "__main__":
    n, m, n_conds = 500_000, 54, 1000
    A = build_sp_mat(n, m)
    cond_rowscols, free_rowscols = condensed_rowscols(n, n_conds)
    A.sort_indices()  # this is important
    B = A.copy()
    stime = perf_counter()
    Acond1 = condense_in_place_sum(B, cond_rowscols, free_rowscols)
    print(f"Condensation via in-place sum took {perf_counter()-stime} s")
    # Acond1 comes with its indices array sorted automatically
    stime = perf_counter()
    Acond2, _ = condense_mat_mult(A, cond_rowscols, free_rowscols, n, n_conds)
    print(f"Condensation via sparse multiplication took {perf_counter()-stime} s")
    # Acond2's indices array has to be sorted, as shown in the function
    print("Done")
The output is
Condensation via in-place sum took 9.0079096 s
Condensation via sparse multiplication took 1.4657909 s
Done
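One check the script omits (my addition): the two condensed matrices should agree exactly, which is cheap to assert on the sparse results:
# compare the two routes entry for entry; the inequality must have no stored values
assert (Acond1 != Acond2).nnz == 0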
I have a matrix M with integer values 0 through N-1 within it. I'd like to unroll this matrix to create a new array A where each submatrix A[i, :, :] represents whether or not M == i.
The solution below uses a loop.
# Example Setup
import numpy as np
np.random.seed(0)
N = 5
M = np.random.randint(0, N, size=(5,5))
# Solution with Loop
A = np.zeros((N, M.shape[0], M.shape[1]))
for i in range(N):
    A[i, :, :] = M == i
This yields:
M
array([[4, 0, 3, 3, 3],
       [1, 3, 2, 4, 0],
       [0, 4, 2, 1, 0],
       [1, 1, 0, 1, 4],
       [3, 0, 3, 0, 2]])
M.shape
# (5, 5)
A
array([[[0, 1, 0, 0, 0],
        [0, 0, 0, 0, 1],
        [1, 0, 0, 0, 1],
        [0, 0, 1, 0, 0],
        [0, 1, 0, 1, 0]],
       ...
       [[1, 0, 0, 0, 0],
        [0, 0, 0, 1, 0],
        [0, 1, 0, 0, 0],
        [0, 0, 0, 0, 1],
        [0, 0, 0, 0, 0]]])
A.shape
# (5, 5, 5)
Is there a faster way, or a way to do it in a single numpy operation?
Broadcasted comparison is your friend:
B = (M[None, :] == np.arange(N)[:, None, None]).view(np.int8)
np.array_equal(A, B)
# True
The idea is to expand the dimensions in such a way that the comparison can be broadcasted in the manner desired.
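Spelled out (my annotation, not part of the original answer), the operand shapes before the comparison:
M[None, :].shape                   # (1, 5, 5) for the example M
np.arange(N)[:, None, None].shape  # (N, 1, 1)
# == broadcasts both operands to (N, 5, 5), i.e. one boolean mask per label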
As pointed out by @Alex Riley in the comments, you can use np.equal.outer to avoid having to do the indexing stuff yourself:
B = np.equal.outer(np.arange(N), M).view(np.int8)
np.array_equal(A, B)
# True
You can make use of some broadcasting here:
P = np.arange(N)
Y = np.broadcast_to(P[:, None], M.shape)
T = np.equal(M, Y[:, None]).astype(int)
Alternative using indices:
X, Y = np.indices(M.shape)
Z = np.equal(M, X[:, None]).astype(int)
(Note that both variants lean on the example's N equaling M.shape[0].)
You can index into the identity matrix like so
A = np.identity(N, int)[:, M]
or so
A = np.identity(N, int)[M.T].T
Or use the new (v1.15.0) put_along_axis
A = np.zeros((N,5,5), int)
np.put_along_axis(A, M[None], 1, 0)
Note if N is much larger than 5 then creating an NxN identity matrix may be considered wasteful. We can mitigate this using stride tricks:
def read_only_identity(N, dtype=float):
    z = np.zeros(2*N-1, dtype)
    s, = z.strides
    z[N-1] = 1
    return np.lib.stride_tricks.as_strided(z[N-1:], (N, N), (-s, s))
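For example (hypothetical usage, mirroring the identity-indexing above; M and N are the question's example setup):
I = read_only_identity(N, int)
A = I[:, M]  # fancy indexing copies, so the read-only strided view is never written to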
I want to modify an empty bitmap at coordinates given by an array of indices (x and y values).
For every coordinate in the index array, the value should be raised by one.
So far so good, everything seems to work. But if the index array contains the same coordinate several times, the value is only raised once.
>>> img
array([[0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0]])
>>> inds
array([[0, 0],
       [3, 4],
       [3, 4]])
Operation:
>>> img[inds[:,1], inds[:,0]] += 1
Result:
>>> img
array([[1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0]])
Expected result:
>>> img
array([[1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 2, 0]])
Does someone have an idea how to solve this? Preferably a fast approach without loops.
This is one way. Counting algorithm courtesy of @AlexRiley.
For the performance implications of the relative sizes of img and inds, see @PaulPanzer's answer.
# count occurrences of each row and return array
counts = (inds[:, None] == inds).all(axis=2).sum(axis=1)
# apply indices and counts
img[inds[:,1], inds[:,0]] += counts
print(img)
array([[1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0],
       [0, 0, 0, 2, 0]])
You could use numpy.add.at with a bit of manipulation to get the indices ready.
np.add.at(img, tuple(inds[:, [1, 0]].T), 1)
If you have larger inds arrays, this approach should remain fast... (though Paul Panzer's solution is faster)
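Unpacked (my expansion, for clarity), the manipulation just swaps the columns so they match img's (row, column) order:
rows, cols = inds[:, 1], inds[:, 0]
np.add.at(img, (rows, cols), 1)  # unbuffered add, so repeated coordinates accumulate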
Two remarks on the other two answers:
1) @jpp's can be improved by using np.unique with the axis and return_counts keywords.
2) If we translate to flat indexing, we can use np.bincount, which often (but not always, see the last test case in the benchmarks) is faster than np.add.at.
Thanks @miradulo for the initial version of the benchmarks.
import numpy as np

def jpp(img, inds):
    counts = (inds[:, None] == inds).all(axis=2).sum(axis=1)
    img[inds[:, 1], inds[:, 0]] += counts

def jpp_pp(img, inds):
    unq, cnts = np.unique(inds, axis=0, return_counts=True)
    img[unq[:, 1], unq[:, 0]] += cnts

def miradulo(img, inds):
    np.add.at(img, tuple(inds[:, [1, 0]].T), 1)

def pp(img, inds):
    imgf = img.ravel()  # flat view into img, so the += below updates img in place
    indsf = np.ravel_multi_index(inds.T[::-1], img.shape[::-1])
    imgf += np.bincount(indsf, None, img.size)

inds = np.random.randint(0, 5, (3, 2))
big_inds = np.random.randint(0, 5, (10000, 2))
sml_inds = np.random.randint(0, 1000, (5, 2))

from timeit import timeit
for f in jpp, jpp_pp, miradulo, pp:
    print(f.__name__)
    for i, n, a in [(inds, 1000, 5), (big_inds, 10, 5), (sml_inds, 10, 1000)]:
        img = np.zeros((a, a), int)
        print(timeit("f(img, i)", globals=dict(img=img, i=i, f=f), number=n) * 1000 / n, 'ms')
Output:
jpp
0.011815106990979984 ms
2623.5026352020213 ms
0.04642329877242446 ms
jpp_pp
0.041291153989732265 ms
5.418520100647584 ms
0.05826510023325682 ms
miradulo
0.007099648006260395 ms
0.7788308983435854 ms
0.009103797492571175 ms
pp
0.0035401539935264736 ms
0.06540440081153065 ms
3.486583800986409 ms
Take a look at this piece of code:
import numpy as np
a = np.random.random(10)
indicies = [
    np.array([1, 4, 3]),
    np.array([2, 5, 8, 7, 3]),
    np.array([1, 2]),
    np.array([3, 2, 1])
]
result = np.zeros(2)
result[0] = a[indicies[0]].sum()
result[1] = a[indicies[2]].sum()
Is there any way to get result more efficiently? In my case a is a very large array.
In other words, I want to select elements from a with several varying-size index arrays and then sum over each selection in one operation, resulting in a single array.
With your a and indicies list:
In [280]: [a[i].sum() for i in indicies]
Out[280]:
[1.3986792680307709,
 2.6354365193743732,
 0.83324677494990895,
 1.8195179021311731]
Which of course could be wrapped in np.array().
For a subset of the indicies items use:
In [281]: [a[indicies[i]].sum() for i in [0,2]]
Out[281]: [1.3986792680307709, 0.83324677494990895]
A comment suggests indicies comes from an Adjacency matrix, possibly sparse.
I could recreate such an array with:
In [289]: A=np.zeros((4,10),int)
In [290]: for i in range(4): A[i,indicies[i]]=1
In [291]: A
Out[291]:
array([[0, 1, 0, 1, 1, 0, 0, 0, 0, 0],
       [0, 0, 1, 1, 0, 1, 0, 1, 1, 0],
       [0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
       [0, 1, 1, 1, 0, 0, 0, 0, 0, 0]])
and use a matrix product (np.dot) to do the selection and sum:
In [292]: A.dot(a)
Out[292]: array([ 1.39867927, 2.63543652, 0.83324677, 1.8195179 ])
A[[0,2],:].dot(a) would use a subset of rows.
A sparse matrix version has that list of row indices:
In [294]: Al=sparse.lil_matrix(A)
In [295]: Al.rows
Out[295]: array([[1, 3, 4], [2, 3, 5, 7, 8], [1, 2], [1, 2, 3]], dtype=object)
And a matrix product with that gives the same numbers:
In [296]: Al*a
Out[296]: array([ 1.39867927, 2.63543652, 0.83324677, 1.8195179 ])
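Another route (my sketch, not from the original answers): concatenate the index arrays once and let np.add.reduceat compute all the group sums in one shot. This assumes every index array is non-empty, since reduceat mishandles empty segments:
idx = np.concatenate(indicies)
# segment start offsets into a[idx]: 0, len(i0), len(i0)+len(i1), ...
offsets = np.concatenate(([0], np.cumsum([len(i) for i in indicies])[:-1]))
sums = np.add.reduceat(a[idx], offsets)  # one sum per index array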
If your array a is very large, you might run into memory issues if your list of index arrays contains many arrays of many indices when looping through it.
To avoid this issue, use an iterator instead of a list:
indices = iter(indices)
and then loop through your iterator.
Given a label map of dimensions W x H, where each element can take values from {0, .., K-1}, I want to output a label tensor of dimensions K x W x H, where each element in the k-th map is 1 only if the corresponding value in the label map was k. Currently my implementation uses two for loops and is very slow.
# p_label: label map with one channel, shape (W, H)
label = np.zeros((K, p_label.shape[0], p_label.shape[1]))
for i in xrange(p_label.shape[0]):
    for j in xrange(p_label.shape[1]):
        label[p_label[i, j], i, j] = 1
Is there a better way to do this operation in Numpy using broadcasting?
You can use the == operator with broadcasting.
For example,
In [19]: W = 5
In [20]: H = 8
In [21]: K = 10
Create a p_label for the example:
In [22]: p_label = np.random.randint(0, K, size=(W, H))
kvals is simply an array containing [0, 1, ..., K-1]:
In [23]: kvals = np.arange(K)
kvals.reshape(-1, 1, 1) converts kvals to an array with shape (K, 1, 1). This is compared using == to p_label. Broadcasting applies, so the result of the comparison has shape (K, W, H). It is a boolean array of the values that you want. .astype(int) converts the result to an integer array. (You can remove that if a boolean array would work for you.)
In [24]: label = (p_label == kvals.reshape(-1, 1, 1)).astype(int)
Here's the original p_label. Note, for example, the locations of the value 0:
In [25]: p_label
Out[25]:
array([[3, 3, 2, 6, 2, 2, 9, 3],
       [1, 8, 1, 1, 4, 3, 7, 8],
       [5, 9, 1, 0, 7, 2, 8, 0],
       [1, 3, 5, 4, 6, 0, 9, 5],
       [5, 7, 2, 0, 6, 4, 5, 3]])
label[0] is 1 in the positions where p_label is 0.
In [26]: label[0]
Out[26]:
array([[0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 1, 0, 0, 0, 1],
       [0, 0, 0, 0, 0, 1, 0, 0],
       [0, 0, 0, 1, 0, 0, 0, 0]])
label[p_label, np.arange(p_label.shape[0])[:, None], np.arange(p_label.shape[1])] = 1
The 3 index arrays broadcast against each other.
==============================
lmap = np.arange(12).reshape(3,4)
lbl = np.zeros((12,3,4),int)
lbl[lmap,np.arange(3)[:,None],np.arange(4)] = 1
In [5]: lbl
Out[5]:
array([[[1, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]],

       [[0, 1, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 0]],
       ...
       [[0, 0, 0, 0],
        [0, 0, 0, 0],
        [0, 0, 0, 1]]])
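As a round-trip check (my addition): argmax along axis 0 recovers the label map, since each (i, j) column holds exactly one 1:
assert (lbl.argmax(axis=0) == lmap).all()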