Combining numpy's fancy indexing with slicing - python

I currently have a two-dimensional numpy array of shape (m, n). Furthermore, I have two (m, p) arrays of indices i1 and i2. The indices are always contiguous!
import numpy as np

t = np.array([[-1, -1, 0, 0, 1, 2, 2],
              [-1, -1, 0, 1, 2, 3, 3],
              [ 0,  0, 1, 2, 2, 3, 3]])
i1 = np.array([3, 2, 2])
i2 = np.array([4, 3, 3])
How do I use the arrays i1 and i2 to slice t in order to obtain the following sub-matrix?
expected_t = np.array([
    [0, 1],
    [0, 1],
    [1, 2]
])
That is,
expected_t[0, :] = t[0, i1[0]:i2[0]+1]
expected_t[1, :] = t[1, i1[1]:i2[1]+1]
expected_t[2, :] = t[2, i1[2]:i2[2]+1]
Furthermore, is this possible to do without copying the data by creating a view?
Thanks in advance for any help!

Use fancy indexing in NumPy:
t[np.arange(3).reshape(3, 1), np.vstack((i1, i2)).T]
or
t[np.arange(3), np.vstack((i1, i2))].T
Both give the result:
array([[0, 1],
       [0, 1],
       [1, 2]])
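For reference, here is a minimal sketch (my own, not part of the original answer) that generalizes the same idea to an arbitrary number of rows, assuming that, as in the example, i2 is inclusive and every slice has the same width. Note that fancy (advanced) indexing always returns a copy, so the result cannot be a view of t:
import numpy as np

t = np.array([[-1, -1, 0, 0, 1, 2, 2],
              [-1, -1, 0, 1, 2, 3, 3],
              [ 0,  0, 1, 2, 2, 3, 3]])
i1 = np.array([3, 2, 2])
i2 = np.array([4, 3, 3])

width = i2[0] - i1[0] + 1                # constant, since all slices are equally wide
rows = np.arange(t.shape[0])[:, None]    # shape (m, 1), broadcasts against the columns
cols = i1[:, None] + np.arange(width)    # shape (m, width): each row's contiguous range
sub = t[rows, cols]                      # advanced indexing returns a copy, not a view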

I suggest this, though I don't know whether there is a faster way of indexing for this example:
import numpy as np

t = np.array([[-1, -1, 0, 0, 1, 2, 2],
              [-1, -1, 0, 1, 2, 3, 3],
              [ 0,  0, 1, 2, 2, 3, 3]])
i1 = np.array([3, 2, 2])
i2 = np.array([4, 3, 3])

output = []
for i, (min_, max_) in enumerate(zip(i1, i2)):
    output.append(t[i, min_:max_ + 1])
expected_t = np.array(output)
Or, shorter:
expected_t = np.array([t[i, j:k + 1] for (i, j, k) in zip(range(len(t)), i1, i2)])

Related

Is there any fast way to find identical rows of two sparse matrices with different sizes?

Consider A, an n-by-j matrix, and B, an m-by-j matrix, both SciPy sparse matrices, with m < n. Is there any way that I can find the indices of the rows of A which are identical to rows of B?
I have tried for loops and converting the matrices to NumPy arrays, but in my case neither works because I'm dealing with huge matrices.
Here is the link to the same question for Numpy arrays.
Edit:
An example of A, B, and the desired output:
>>> import numpy as np
>>> from scipy.sparse import csc_matrix
>>> row = np.array([0, 2, 2, 0, 1, 2])
>>> col = np.array([0, 0, 1, 2, 2, 2])
>>> data = np.array([1, 3, 3, 4, 5, 6])
>>> A = csc_matrix((data, (row, col)), shape=(5, 3))
>>> A.toarray()
array([[1, 0, 4],
       [0, 0, 5],
       [3, 3, 6],
       [0, 0, 0],
       [0, 0, 0]])
>>> row = np.array([0, 2, 2, 0, 1, 2])
>>> col = np.array([0, 0, 1, 2, 2, 2])
>>> data = np.array([1, 2, 3, 4, 5, 6])
>>> B = csc_matrix((data, (row, col)), shape=(4, 3))
>>> B.toarray()
array([[1, 0, 4],
       [0, 0, 5],
       [2, 3, 6],
       [0, 0, 0]])
Desired output:
def some_function(A, B):
    # Some operations
    return indices

>>> some_function(A, B)
[0, 1, 3, 4]
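One possible approach (a sketch of my own, not from the original thread): put both matrices in CSR form with sorted, zero-free rows, build a hashable key from each row's stored column indices and values, and look A's row keys up in a set built from B's rows:
def rows_of_A_in_B(A, B):
    # Canonicalize to CSR with sorted indices and no stored zeros, so that
    # identical rows produce identical (indices, values) keys.
    A, B = A.tocsr(), B.tocsr()
    for M in (A, B):
        M.eliminate_zeros()
        M.sort_indices()

    def row_key(M, i):
        start, end = M.indptr[i], M.indptr[i + 1]
        return (tuple(M.indices[start:end]), tuple(M.data[start:end]))

    b_keys = {row_key(B, i) for i in range(B.shape[0])}
    return [i for i in range(A.shape[0]) if row_key(A, i) in b_keys]
With the example A and B above, this returns [0, 1, 3, 4]. The per-row loop runs in Python, but each iteration only touches that row's stored entries, so it may still be workable for large sparse matrices.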

How to convert a 1d numpy array to a lower triangular matrix?

I have a numpy array like:
np.array([1,2,3,4])
and I want to convert it to a lower triangular matrix like
np.array([
    [4, 0, 0, 0],
    [3, 4, 0, 0],
    [2, 3, 4, 0],
    [1, 2, 3, 4]
])
without a for loop. How can I do it?
A solution similar to the one proposed in a comment by Michael Szczesny can be:
import numpy as np
a = np.array([1, 2, 3, 4])
b = np.arange(a.size)
result = np.tril(np.take(a, b - b[:, None] + a.size - 1, mode='clip'))
The result is:
array([[4, 0, 0, 0],
       [3, 4, 0, 0],
       [2, 3, 4, 0],
       [1, 2, 3, 4]])
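An alternative sketch (my own, not from the original answer), assuming the same layout in which the values run in reverse along each row, fills the lower triangle directly via np.tril_indices:
import numpy as np

a = np.array([1, 2, 3, 4])
n = a.size
result = np.zeros((n, n), dtype=a.dtype)
rows, cols = np.tril_indices(n)                 # coordinates of the lower triangle
result[rows, cols] = a[n - 1 - (rows - cols)]   # the value depends only on the diagonal offset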

How to efficiently unroll a matrix by value with numpy?

I have a matrix M with values 0 through N-1 in it. I'd like to unroll this matrix to create a new array A where each submatrix A[i, :, :] represents whether or not M == i.
The solution below uses a loop.
# Example setup
import numpy as np
np.random.seed(0)
N = 5
M = np.random.randint(0, N, size=(5, 5))

# Solution with a loop
A = np.zeros((N, M.shape[0], M.shape[1]))
for i in range(N):
    A[i, :, :] = M == i
This yields:
M
array([[4, 0, 3, 3, 3],
       [1, 3, 2, 4, 0],
       [0, 4, 2, 1, 0],
       [1, 1, 0, 1, 4],
       [3, 0, 3, 0, 2]])
M.shape
# (5, 5)
A
array([[[0, 1, 0, 0, 0],
        [0, 0, 0, 0, 1],
        [1, 0, 0, 0, 1],
        [0, 0, 1, 0, 0],
        [0, 1, 0, 1, 0]],
       ...
       [[1, 0, 0, 0, 0],
        [0, 0, 0, 1, 0],
        [0, 1, 0, 0, 0],
        [0, 0, 0, 0, 1],
        [0, 0, 0, 0, 0]]])
A.shape
# (5, 5, 5)
Is there a faster way, or a way to do it in a single numpy operation?
Broadcasted comparison is your friend:
B = (M[None, :] == np.arange(N)[:, None, None]).view(np.int8)
np.array_equal(A, B)
# True
The idea is to expand the dimensions in such a way that the comparison can be broadcasted in the manner desired.
As pointed out by @Alex Riley in the comments, you can use np.equal.outer to avoid having to do the indexing yourself:
B = np.equal.outer(np.arange(N), M).view(np.int8)
np.array_equal(A, B)
# True
You can make use of some broadcasting here:
P = np.arange(N)
Y = np.broadcast_to(P[:, None], M.shape)
T = np.equal(M, Y[:, None]).astype(int)
Alternative using indices:
X, Y = np.indices(M.shape)
Z = np.equal(M, X[:, None]).astype(int)
You can index into the identity matrix like so
A = np.identity(N, int)[:, M]
or so
A = np.identity(N, int)[M.T].T
Or use the new (v1.15.0) put_along_axis
A = np.zeros((N,5,5), int)
np.put_along_axis(A, M[None], 1, 0)
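A quick consistency check (my own sketch, assuming A still holds the loop-built result from the question and M, N are defined as above): the identity-indexing and put_along_axis variants produce the same array:
B1 = np.identity(N, int)[:, M]
B2 = np.zeros((N, 5, 5), int)
np.put_along_axis(B2, M[None], 1, 0)
np.array_equal(A, B1) and np.array_equal(B1, B2)
# True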
Note that if N is much larger than 5, creating an N x N identity matrix may be considered wasteful. We can mitigate this using stride tricks:
def read_only_identity(N, dtype=float):
    z = np.zeros(2 * N - 1, dtype)
    s, = z.strides
    z[N - 1] = 1
    return np.lib.stride_tricks.as_strided(z[N - 1:], (N, N), (-s, s))
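A brief usage sketch (my own illustration, assuming read_only_identity is defined as above): the strided identity can be indexed just like the dense one while using only O(N) memory:
I = read_only_identity(N, int)   # O(N) storage instead of O(N^2)
B = I[:, M]                      # same (N, 5, 5) result as np.identity(N, int)[:, M]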

Logical indexing in python for nd arrays

I am trying to extract all the indices from an (N x N x N) numpy array where the values in both arrays A and B are equal to some value x, i.e. find the common overlap.
I am trying:
A[A==1 and B==1]
but get an error:
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()
How do I get around this?
NumPy cannot overload the "and" keyword. However, it does overload the bitwise AND operator & for this. Try:
A[(A==1) & (B==1)]
The parentheses are important. I often (though not always) find it more readable than logical_and.
A == 1 and B == 1 are boolean arrays, and their elementwise product (A==1)*(B==1) is nonzero exactly where both conditions hold. You can find the nonzero entries of this array through NumPy's where:
np.where((A==1)*(B==1))
Demo
Consider the following 3-dimensional arrays, which are randomly populated with values -1, 0 and 1:
In [1066]: import numpy as np
In [1067]: np.random.seed(2016) # this is to get the same results on multiple runs
In [1068]: N = 3
...: A = np.random.randint(low=-1, high=2, size=(N, N, N))
...: B = np.random.randint(low=-1, high=2, size=(N, N, N))
In [1069]: A
Out[1069]:
array([[[ 1,  1,  0],
        [-1,  1, -1],
        [-1, -1, -1]],

       [[ 0,  1,  1],
        [-1,  1,  1],
        [ 0,  1,  0]],

       [[ 0,  1,  0],
        [-1,  1,  1],
        [-1,  1,  0]]])
In [1070]: B
Out[1070]:
array([[[-1,  0,  0],
        [-1, -1,  1],
        [ 0, -1, -1]],

       [[-1, -1, -1],
        [-1,  1,  1],
        [-1,  1,  1]],

       [[ 1,  1, -1],
        [-1,  0,  1],
        [-1,  1, -1]]])
The function where returns a tuple of integer arrays which triggers advanced indexing:
In [1071]: idx = np.where((A==1)*(B==1))
In [1072]: idx
Out[1072]:
(array([1, 1, 1, 2, 2, 2], dtype=int64),
 array([1, 1, 2, 0, 1, 2], dtype=int64),
 array([1, 2, 1, 1, 2, 1], dtype=int64))
In [1073]: A[idx]
Out[1073]: array([1, 1, 1, 1, 1, 1])
In [1074]: B[idx]
Out[1074]: array([1, 1, 1, 1, 1, 1])
Perhaps I was slightly hasty in posting this question. I used NumPy's
logical_and(x1, x2[, out])
in the end, which did the job perfectly!
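For completeness, a minimal sketch of that route (my own, assuming A and B are the arrays from the question): combine the two masks with logical_and, then pass the result to where to recover the indices:
mask = np.logical_and(A == 1, B == 1)   # elementwise AND of the two boolean masks
idx = np.where(mask)                    # tuple of index arrays, as in the demo above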

Binary numpy array to list of integers?

I have a binary array, and I would like to convert it into a list of integers, where each int is a row of the array.
For example:
from numpy import *
a = array([[1, 1, 0, 0], [0, 1, 0, 0], [0, 1, 1, 1], [1, 1, 1, 1]])
I would like to convert a to [12, 4, 7, 15].
@SteveTjoa's answer is fine, but for kicks, here's a numpy one-liner:
In [19]: a
Out[19]:
array([[1, 1, 0, 0],
       [0, 1, 0, 0],
       [0, 1, 1, 1],
       [1, 1, 1, 1]])
In [20]: a.dot(1 << arange(a.shape[-1] - 1, -1, -1))
Out[20]: array([12,  4,  7, 15])
(arange is numpy.arange.)
If the bits are in the opposite order, change the order of the values produced by arange:
In [25]: a.dot(1 << arange(a.shape[-1]))
Out[25]: array([ 3, 2, 14, 15])
I once asked a similar question here. Here was my answer, adapted for your question:
def bool2int(x):
    y = 0
    for i, j in enumerate(x):
        y += j << i
    return y
In [20]: a
Out[20]:
array([[1, 1, 0, 0],
       [0, 1, 0, 0],
       [0, 1, 1, 1],
       [1, 1, 1, 1]])
In [21]: [bool2int(x[::-1]) for x in a]
Out[21]: [12, 4, 7, 15]
You could also do this within numpy directly:
from numpy import *
a = array([[1, 1, 0, 0], [0, 1, 0, 0], [0, 1, 1, 1], [1, 1, 1, 1]])
b2i = 2**arange(a.shape[1] - 1, -1, -1)  # place values, most significant bit first
result = (a * b2i).sum(axis=1)  # [12  4  7 15]
If you like working directly with bitwise math, this one should work pretty well.
import numpy as np

def bits2int(a, axis=-1):
    return np.right_shift(np.packbits(a, axis=axis), 8 - a.shape[axis]).squeeze()
bits2int(a)
Out: array([12, 4, 7, 15], dtype=uint8)
Another one:
def row_bits2int(arr):
    n = arr.shape[1]  # number of columns
    # shift the bits of the first column to the left by n - 1
    a = arr[:, 0] << n - 1
    for j in range(1, n):
        # "overlay" with the shifted bits of the next column
        a |= arr[:, j] << n - 1 - j
    return a
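A quick usage sketch (my own, assuming a is the array defined in the question):
row_bits2int(a)
# array([12,  4,  7, 15])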
