I have a binary array, and I would like to convert it into a list of integers, where each int is a row of the array.
For example:
from numpy import *
a = array([[1, 1, 0, 0], [0, 1, 0, 0], [0, 1, 1, 1], [1, 1, 1, 1]])
I would like to convert a to [12, 4, 7, 15].
@SteveTjoa's answer is fine, but for kicks, here's a numpy one-liner:
In [19]: a
Out[19]:
array([[1, 1, 0, 0],
[0, 1, 0, 0],
[0, 1, 1, 1],
[1, 1, 1, 1]])
In [20]: a.dot(1 << arange(a.shape[-1] - 1, -1, -1))
Out[20]: array([12, 4, 7, 15])
(arange is numpy.arange.)
If the bits are in the opposite order, change the order of the values produced by arange:
In [25]: a.dot(1 << arange(a.shape[-1]))
Out[25]: array([ 3, 2, 14, 15])
I once asked a similar question here. Here was my answer, adapted for your question:
def bool2int(x):
    y = 0
    for i, j in enumerate(x):
        y += j << i
    return y
In [20]: a
Out[20]:
array([[1, 1, 0, 0],
[0, 1, 0, 0],
[0, 1, 1, 1],
[1, 1, 1, 1]])
In [21]: [bool2int(x[::-1]) for x in a]
Out[21]: [12, 4, 7, 15]
You could also do this within numpy directly:
from numpy import *
a = array([[1, 1, 0, 0], [0, 1, 0, 0], [0, 1, 1, 1], [1, 1, 1, 1]])
b2i = 2**arange(a.shape[1]-1, -1, -1)  # one weight per column (bit), most significant first
result = (a*b2i).sum(axis=1) #[12 4 7 15]
If you like working directly with bitwise math, this one should work pretty well.
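def bits2int(a, axis=-1):
    # packbits pads each row with zero bits up to 8, so shift the padding back out (rows of at most 8 bits)
    return np.right_shift(np.packbits(a, axis=axis), 8 - a.shape[axis]).squeeze()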
bits2int(a)
Out: array([12, 4, 7, 15], dtype=uint8)
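Note that np.packbits produces uint8 values, so the version above only covers rows of at most 8 bits. For wider rows, a minimal sketch that falls back on the dot-product idea from the earlier answer might look like the following. The name bits2int_dot and the 10-bit test row are made up for this example.

```python
import numpy as np

def bits2int_dot(bits):
    """Interpret the last axis of `bits` as a big-endian binary number."""
    bits = np.asarray(bits)
    n = bits.shape[-1]
    # Weights 2**(n-1), ..., 2, 1: the first column is the most significant bit.
    weights = 1 << np.arange(n - 1, -1, -1)
    return bits.dot(weights)

a = np.array([[1, 1, 0, 0], [0, 1, 0, 0], [0, 1, 1, 1], [1, 1, 1, 1]])
wide = np.array([[1, 0, 0, 0, 0, 0, 0, 0, 0, 1]])  # 10 bits: 512 + 1

print(bits2int_dot(a))     # [12  4  7 15]
print(bits2int_dot(wide))  # [513]
```

The result dtype follows the default integer type, so the row width is bounded by that type's size rather than by 8 bits.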
Another one:
def row_bits2int(arr):
    n = arr.shape[1]  # number of columns
    # shift the bits of the first column to the left by n - 1
    a = arr[:, 0] << n - 1
    for j in range(1, n):
        # "overlay" with the shifted bits of the next column
        a |= arr[:, j] << n - 1 - j
    return a
Suppose I have a 2D array with shape (3, 3), call it a, and an array of zeros with shape (7, 7, 5, 5), call it b. I want to modify b in the following way:
for p in range(5):
    for q in range(5):
        b[p:p + 3, q:q + 3, p, q] = a
Given:
a = np.array([[4, 2, 2],
[9, 0, 5],
[9, 9, 4]])
b = np.zeros((7, 7, 5, 5), dtype=int)
b would end up something like:
>>> b[:, :, 0, 0]
array([[4, 2, 2, 0, 0, 0, 0],
[9, 0, 5, 0, 0, 0, 0],
[9, 9, 4, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0]])
>>> b[:, :, 0, 1]
array([[0, 4, 2, 2, 0, 0, 0],
[0, 9, 0, 5, 0, 0, 0],
[0, 9, 9, 4, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0]])
One way to think about this is to make a sliding window view of b (6D), slice out the parts you want (3D or 4D), and assign a to them.
However, there is a simpler way to do this altogether. The way a sliding window view works is by creating a dimension that steps along less than the full size of the dimension you are viewing. For example:
>>> x = np.array([1, 2, 3, 4])
array([1, 2, 3, 4])
>>> window = np.lib.stride_tricks.as_strided(
x, shape=(x.shape[0] - 2, 3),
strides=x.strides * 2)
[[1 2 3]
[2 3 4]]
I'm deliberately using np.lib.stride_tricks.as_strided rather than np.lib.stride_tricks.sliding_window_view here because it has a certain flexibility that you need.
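For the plain sliding-window case, and assuming NumPy 1.20 or newer, sliding_window_view gives the same result without hand-written strides. A small sketch comparing the two (the variable names manual and builtin are just for illustration):

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided, sliding_window_view

x = np.array([1, 2, 3, 4])

# Manual version: both axes advance by one element per step
# (x.strides * 2 repeats the one-element strides tuple).
manual = as_strided(x, shape=(x.shape[0] - 2, 3), strides=x.strides * 2)

# Built-in version: all windows of length 3 along the only axis.
builtin = sliding_window_view(x, 3)

print(manual)
print(builtin)
```

Both print the rows [1 2 3] and [2 3 4]. The built-in returns a read-only view by default and does not let you choose arbitrary strides, which is why as_strided is used for the rest of this answer.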
You can have a stride that is larger than the axis you are viewing, as long as you are careful. Contiguous arrays are more forgiving in this case, but by no means a requirement. An example of this is np.diag. You can implement it something like this:
>>> x = np.arange(12).reshape(3, 4)
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11]])
>>> diag = np.lib.stride_tricks.as_strided(
x, shape=(min(x.shape),),
strides=(sum(x.strides),))
array([ 0, 5, 10])
The trick is to make a view of only the parts of b you care about in a way that makes the assignment easy. Because of broadcasting rules, you will want the last two dimensions of the view to be a.shape, and the strides to be b.strides[:2], since that's where you want to place a.
The first two dimensions of the view will be responsible for making the copies of a. You want 25 copies, so the shape will be (5, 5). The strides are the trickier part. Let's take a look at a 2D case, just because that's easier to visualize, and then attempt to generalize:
>>> a0 = np.array([1, 2])
>>> b0 = np.zeros((4, 3), dtype=int)
>>> b0[0:2, 0] = b0[1:3, 1] = b0[2:4, 2] = a0
The goal is to make a view that strides along the diagonal of b0 in the first axis. So:
>>> np.lib.stride_tricks.as_strided(
b0, shape=(b0.shape[0] - a0.shape[0] + 1, a0.shape[0]),
strides=(sum(b0.strides), b0.strides[0]))[:] = a0
>>> b0
array([[1, 0, 0],
[2, 1, 0],
[0, 2, 1],
[0, 0, 2]])
So that's what you do for b, but adding up every second dimension:
a = np.array([[4, 2, 2],
[9, 0, 5],
[9, 9, 4]])
b = np.zeros((7, 7, 5, 5), dtype=int)
vshape = (*np.subtract(b.shape[:a.ndim], a.shape) + 1,
*a.shape)
vstrides = (*np.add(b.strides[:a.ndim], b.strides[a.ndim:]),
*b.strides[:a.ndim])
np.lib.stride_tricks.as_strided(b, shape=vshape, strides=vstrides)[:] = a
TL;DR
def emplace_window(a, b):
    vshape = (*np.subtract(b.shape[:a.ndim], a.shape) + 1, *a.shape)
    vstrides = (*np.add(b.strides[:a.ndim], b.strides[a.ndim:]), *b.strides[:a.ndim])
    np.lib.stride_tricks.as_strided(b, shape=vshape, strides=vstrides)[:] = a
I've phrased it this way because now you can apply it to any number of dimensions. The only expectations are that 2 * a.ndim == b.ndim and that b.shape[a.ndim:] == b.shape[:a.ndim] - a.shape + 1 (elementwise).
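As a sanity check, here is a sketch that runs emplace_window against the explicit double loop from the question and compares the results. The array b_loop is a name introduced only for this comparison.

```python
import numpy as np

def emplace_window(a, b):
    # View shape: one copy of `a` per (p, q) offset, then a's own shape.
    vshape = (*np.subtract(b.shape[:a.ndim], a.shape) + 1, *a.shape)
    # Stepping the leading view axes moves diagonally through b's paired axes.
    vstrides = (*np.add(b.strides[:a.ndim], b.strides[a.ndim:]), *b.strides[:a.ndim])
    np.lib.stride_tricks.as_strided(b, shape=vshape, strides=vstrides)[:] = a

a = np.array([[4, 2, 2],
              [9, 0, 5],
              [9, 9, 4]])

# Strided version
b = np.zeros((7, 7, 5, 5), dtype=int)
emplace_window(a, b)

# Reference version: the loop from the question
b_loop = np.zeros((7, 7, 5, 5), dtype=int)
for p in range(5):
    for q in range(5):
        b_loop[p:p + 3, q:q + 3, p, q] = a

print(np.array_equal(b, b_loop))  # True
```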
I have a matrix M with values 0 through N within it. I'd like to unroll this matrix to create a new matrix A where each submatrix A[i, :, :] represents whether or not M == i.
The solution below uses a loop.
# Example Setup
import numpy as np
np.random.seed(0)
N = 5
M = np.random.randint(0, N, size=(5,5))
# Solution with Loop
A = np.zeros((N, M.shape[0], M.shape[1]))
for i in range(N):
    A[i, :, :] = M == i
This yields:
M
array([[4, 0, 3, 3, 3],
[1, 3, 2, 4, 0],
[0, 4, 2, 1, 0],
[1, 1, 0, 1, 4],
[3, 0, 3, 0, 2]])
M.shape
# (5, 5)
A
array([[[0, 1, 0, 0, 0],
[0, 0, 0, 0, 1],
[1, 0, 0, 0, 1],
[0, 0, 1, 0, 0],
[0, 1, 0, 1, 0]],
...
[[1, 0, 0, 0, 0],
[0, 0, 0, 1, 0],
[0, 1, 0, 0, 0],
[0, 0, 0, 0, 1],
[0, 0, 0, 0, 0]]])
A.shape
# (5, 5, 5)
Is there a faster way, or a way to do it in a single numpy operation?
Broadcasted comparison is your friend:
B = (M[None, :] == np.arange(N)[:, None, None]).view(np.int8)
np.array_equal(A, B)
# True
The idea is to expand the dimensions in such a way that the comparison can be broadcasted in the manner desired.
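To make the shape bookkeeping concrete, here is a small sketch on a 2x2 matrix (the values of M and N are chosen only for this illustration). The names left and right are not from the answer above.

```python
import numpy as np

N = 3
M = np.array([[0, 2],
              [1, 2]])

left = M[None, :]                     # shape (1, 2, 2): one copy of M
right = np.arange(N)[:, None, None]   # shape (3, 1, 1): one value per output slice

# Broadcasting stretches both operands to (3, 2, 2), so B[i] is M == i.
B = left == right

print(left.shape, right.shape, B.shape)  # (1, 2, 2) (3, 1, 1) (3, 2, 2)
print(B[2])
```

The .view(np.int8) in the answer works because a NumPy bool occupies one byte; .astype(int) is the more general way to get integers if a copy is acceptable.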
As pointed out by @Alex Riley in the comments, you can use np.equal.outer to avoid having to do the indexing stuff yourself:
B = np.equal.outer(np.arange(N), M).view(np.int8)
np.array_equal(A, B)
# True
You can make use of some broadcasting here:
P = np.arange(N)
Y = np.broadcast_to(P[:, None], M.shape)
T = np.equal(M, Y[:, None]).astype(int)
Alternative using indices:
X, Y = np.indices(M.shape)
Z = np.equal(M, X[:, None]).astype(int)
You can index into the identity matrix like so
A = np.identity(N, int)[:, M]
or so
A = np.identity(N, int)[M.T].T
Or use the new (v1.15.0) put_along_axis
A = np.zeros((N,5,5), int)
np.put_along_axis(A, M[None], 1, 0)
Note if N is much larger than 5 then creating an NxN identity matrix may be considered wasteful. We can mitigate this using stride tricks:
def read_only_identity(N, dtype=float):
    z = np.zeros(2*N-1, dtype)
    s, = z.strides
    z[N-1] = 1
    return np.lib.stride_tricks.as_strided(z[N-1:], (N, N), (-s, s))
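A quick check of the strided identity, sketched on a small made-up M: the view should match np.identity and can be indexed the same way. Since all N*N entries alias only 2*N-1 stored elements, the result should be treated as read-only even though nothing in the function enforces it.

```python
import numpy as np

def read_only_identity(N, dtype=float):
    z = np.zeros(2*N-1, dtype)
    s, = z.strides
    z[N-1] = 1  # the single 1 that every diagonal entry aliases
    # Entry [i, j] reads z[N-1 - i + j], which is 1 exactly when i == j.
    return np.lib.stride_tricks.as_strided(z[N-1:], (N, N), (-s, s))

N = 4
M = np.array([[0, 3],
              [2, 1]])

eye = read_only_identity(N, int)
A = eye[:, M]  # A[i] is the 0/1 mask of M == i

print(np.array_equal(eye, np.identity(N, int)))  # True
print(A.shape)                                   # (4, 2, 2)
```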
I have a matrix with some zeros:
x=np.array([[1,2,3,0],[4,0,5,0],[7,0,0,0],[0,9,8,0]])
>>> x
array([[1, 2, 3, 0],
[4, 0, 5, 0],
[7, 0, 0, 0],
[0, 9, 8, 0]])
I want to put random values only into the positions that are not zero. I can get the (row, col) positions as a tuple from np.where:
pos = np.where(x!=0)
>>> (array([0, 0, 0, 1, 1, 2, 3, 3], dtype=int64), array([0, 1, 2, 0, 2, 0, 1, 2], dtype=int64))
Is there a way to use np.random (or something else) on the matrix x at the positions in pos only, without changing the zeros?
# pseudocode
new_x = np.rand(x, at pos)
I assume you want to replace the non-zero values with random integers.
You can use the combination of numpy.place and numpy.random.randint functions.
>>> x=np.array([[1,2,3,0],[4,0,5,0],[7,0,0,0],[0,9,8,0]])
>>> x
array([[1, 2, 3, 0],
[4, 0, 5, 0],
[7, 0, 0, 0],
[0, 9, 8, 0]])
>>> lower_bound, upper_bound = 1, 5 # random function boundary
>>> np.place(x, x!=0, np.random.randint(lower_bound, upper_bound, np.count_nonzero(x)))
>>> x
array([[2, 2, 3, 0],
[1, 0, 3, 0],
[2, 0, 0, 0],
[0, 4, 3, 0]])
Well, you can use x.nonzero(), which gives you the indices of all nonzero values in the array,
and then you just need to put random values at those indices:
nz_indices = x.nonzero()
for i, j in zip(nz_indices[0], nz_indices[1]):
    x[i, j] = np.random.randint(1500)  # random integer in [0, 1500)
You can find more about randint() in the randint docs.
How about something simple like this:
import numpy as np
x = np.array([[1, 2, 3, 0], [4, 0, 5, 0], [7, 0, 0, 0], [0, 9, 8, 0]])
w = x != 0
x[w] = np.random.randint(10, size=x.shape)[w]
print(x)
[[2 2 2 0]
[0 0 4 0]
[1 0 0 0]
[0 3 1 0]]
You could also do
x = np.random.randint(1, 10, size=x.shape) * (x != 0)
Just index with np.nonzero
i = np.nonzero(x)
x[i] = np.random.randint(1, 10, i[0].size)
Note for reference that np.nonzero(x) <=> np.where(x) <=> np.where(x != 0)
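The same idea can also be written with the newer Generator API instead of the legacy np.random functions. This is only a sketch: the seed and the value range 1 to 9 are arbitrary choices for the example, not something the question specifies.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen arbitrarily for reproducibility

x = np.array([[1, 2, 3, 0],
              [4, 0, 5, 0],
              [7, 0, 0, 0],
              [0, 9, 8, 0]])

mask = x != 0  # True where a value should be replaced

# Draw exactly one random integer in [1, 10) per nonzero position.
x[mask] = rng.integers(1, 10, size=np.count_nonzero(mask))

print(x)
```

Keeping the lower bound at 1 guarantees that a replaced entry never turns into a new zero.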
I have a very large numpy.array of integers, where each integer is in the range [0, 31].
I would like to count, for every pair of integers (a, b) in the range [0, 31] (e.g. [0, 1], [7, 9], [18, 0]) how often b occurs right after a.
This would give me a (32, 32) matrix of counts.
I'm looking for an efficient way to do this with numpy. Raw python loops would be too slow.
Here's one way...
To make the example easier to read, I'll use a maximum value of 9 instead of 31:
In [178]: maxval = 9
Make a random input for the example:
In [179]: np.random.seed(123)
In [180]: x = np.random.randint(0, maxval+1, size=100)
Create the result, initially all 0:
In [181]: counts = np.zeros((maxval+1, maxval+1), dtype=int)
Now add 1 to each coordinate pair, using numpy.add.at to ensure that duplicates are counted properly:
In [182]: np.add.at(counts, (x[:-1], x[1:]), 1)
In [183]: counts
Out[183]:
array([[2, 1, 1, 0, 1, 0, 1, 1, 1, 1],
[2, 1, 1, 3, 0, 2, 1, 1, 1, 1],
[0, 2, 1, 1, 4, 0, 2, 0, 0, 0],
[1, 1, 1, 3, 3, 3, 0, 0, 1, 2],
[1, 1, 0, 1, 1, 0, 2, 2, 2, 0],
[1, 0, 0, 0, 0, 0, 1, 1, 0, 2],
[0, 4, 2, 3, 1, 0, 2, 1, 0, 1],
[0, 1, 1, 1, 0, 0, 2, 0, 0, 3],
[1, 2, 0, 1, 0, 0, 1, 0, 0, 0],
[2, 0, 2, 2, 0, 0, 2, 2, 0, 0]])
For example, the number of times 6 is followed by 1 is
In [188]: counts[6, 1]
Out[188]: 4
We can verify that with the following expression:
In [189]: ((x[:-1] == 6) & (x[1:] == 1)).sum()
Out[189]: 4
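An alternative that is sometimes used for very large inputs is to encode each pair as a single integer and count with np.bincount. This is a sketch of that swapped-in technique, not part of the answer above; it uses its own random input, and the names pair_codes, counts_add_at and counts_bincount are made up for the comparison.

```python
import numpy as np

maxval = 9
n = maxval + 1

rng = np.random.default_rng(123)  # arbitrary seed for the example
x = rng.integers(0, n, size=100)

# Reference: the np.add.at approach from the answer.
counts_add_at = np.zeros((n, n), dtype=int)
np.add.at(counts_add_at, (x[:-1], x[1:]), 1)

# Encode the pair (a, b) as a * n + b, count the codes, and reshape to (n, n).
pair_codes = x[:-1] * n + x[1:]
counts_bincount = np.bincount(pair_codes, minlength=n * n).reshape(n, n)

print(np.array_equal(counts_add_at, counts_bincount))  # True
```

Whether this is actually faster than np.add.at depends on the NumPy version and input size, so it is worth timing on the real data.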
You can use numpy's built-in diff routine together with boolean arrays.
import numpy as np
test_array = np.array([1, 2, 3, 1, 2, 4, 5, 1, 2, 6, 7])
a, b = (1, 2)
sum(np.bitwise_and(test_array[:-1] == a, np.diff(test_array) == b - a))
# 3
If your array is multi-dimensional, you will need to flatten it first or make some small modifications to the code above.
I have a matrix named xs:
array([[1, 1, 1, 1, 1, 0, 1, 0, 0, 2, 1],
[2, 1, 0, 0, 0, 1, 2, 1, 1, 2, 2]])
Now I want to replace the zeros with the nearest previous element in the same row (assuming that the first column is always nonzero).
My rough solution is as follows:
In [55]: row, col = xs.shape
In [56]: for r in range(row):
   ....:     for c in range(col):
   ....:         if xs[r, c] == 0:
   ....:             xs[r, c] = xs[r, c-1]
   ....:
In [57]: xs
Out[57]:
array([[1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 1],
[2, 1, 1, 1, 1, 1, 2, 1, 1, 2, 2]])
Any help will be greatly appreciated.
If you can use pandas, replace will explicitly show the replacement in one instruction:
import pandas as pd
import numpy as np
a = np.array([[1, 1, 1, 1, 1, 0, 1, 0, 0, 2, 1],
[2, 1, 0, 0, 0, 1, 2, 1, 1, 2, 2]])
df = pd.DataFrame(a, dtype=np.float64)
df.replace(0, method='pad', axis=1)  # note: the method argument is deprecated in recent pandas releases
My version, based on step-by-step rolling and masking of initial array, no additional libraries required (except numpy):
import numpy as np
a = np.array([[1, 1, 1, 1, 1, 0, 1, 0, 0, 2, 1],
[2, 1, 0, 0, 0, 1, 2, 1, 1, 2, 2]])
for i in range(a.shape[1]):
    a[a == 0] = np.roll(a, i)[a == 0]
    if not (a == 0).any():  # stop once all zeros
        break               # are filled
print(a)
## [[1 1 1 1 1 1 1 1 1 2 1]
## [2 1 1 1 1 1 2 1 1 2 2]]
Without going crazy with complicated indexing tricks that figure out consecutive zeros, you could have a while loop that goes for as many iterations as consecutive zeros there are in your array:
zero_rows, zero_cols = np.where(xs == 0)
while zero_cols.size:  # an array with several elements has no unambiguous truth value
    xs[zero_rows, zero_cols] = xs[zero_rows, zero_cols-1]
    zero_rows, zero_cols = np.where(xs == 0)
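For completeness, the fill can also be done without any Python-level loop by forward-filling column indices with np.maximum.accumulate. This is a sketch of a different technique from the loop above, under the question's assumption that the first column is nonzero; idx and filled are names introduced for the example.

```python
import numpy as np

xs = np.array([[1, 1, 1, 1, 1, 0, 1, 0, 0, 2, 1],
               [2, 1, 0, 0, 0, 1, 2, 1, 1, 2, 2]])

# For each cell, keep its own column index if nonzero, else 0.
idx = np.where(xs != 0, np.arange(xs.shape[1]), 0)

# A running maximum along each row turns that into
# "column index of the most recent nonzero value".
idx = np.maximum.accumulate(idx, axis=1)

# Gather the values at those indices, row by row.
filled = xs[np.arange(xs.shape[0])[:, None], idx]

print(filled)
# [[1 1 1 1 1 1 1 1 1 2 1]
#  [2 1 1 1 1 1 2 1 1 2 2]]
```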