NumPy: fill fields surrounding a 1 in an array - python

Suppose I have a 4x4 matrix that looks like the following:
[[0, 0, 0, 0]
[0, 0, 1, 0]
[0, 0, 0, 0]
[0, 0, 0, 0]]
I want to write a function that takes all 4 surrounding fields of the one and turns them into a 1 as well.
The above matrix would become:
[[0, 0, 1, 0]
[0, 1, 1, 1]
[0, 0, 1, 0]
[0, 0, 0, 0]]
I know that this is possible using if-statements, but I really want to optimize my code.
The matrix only contains 0s and 1s. If a 1 is at the edge of the matrix, the 1s should not wrap around, i.e. if the leftmost field is a 1, the rightmost field still stays 0. Also, I am using Python 3.5.
Is there a more mathematical or concise way to do this?

This looks like binary dilation. There's a function available in SciPy that implements this efficiently:
>>> from scipy.ndimage import binary_dilation
>>> x
array([[0, 0, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 0],
[0, 0, 0, 0]])
>>> binary_dilation(x).astype(int)
array([[0, 0, 1, 0],
[0, 1, 1, 1],
[0, 0, 1, 0],
[0, 0, 0, 0]])
1s at the edges are handled as you've specified they should be (i.e. no wrapping).
See the documentation for further options and arguments.

FWIW, here's a way to do it using only NumPy. We pad the original data with a border of zeros, and then bitwise-OR four offset windows of the padded array into a copy of the data.
import numpy as np
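def fill(data):
    rows, cols = data.shape
    # Surround the data with a one-cell border of zeros
    padded = np.pad(data, 1, 'constant', constant_values=0)
    result = np.copy(data)
    # Offsets of the up, left, right and down neighbours in the padded array
    for r, c in ((0, 1), (1, 0), (1, 2), (2, 1)):
        result |= padded[r:r+rows, c:c+cols]
    return result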
data = np.asarray(
[
[0, 0, 0, 0],
[0, 0, 1, 0],
[0, 0, 0, 0],
[0, 0, 0, 0],
], dtype='uint8')
print(data, '\n')
result = fill(data)
print(result)
Output:
[[0 0 0 0]
[0 0 1 0]
[0 0 0 0]
[0 0 0 0]]
[[0 0 1 0]
[0 1 1 1]
[0 0 1 0]
[0 0 0 0]]
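The padding step can also be skipped by OR-ing shifted slices straight into the result. This is only a minimal sketch, not part of the answer above: the name fill_by_slices is made up for illustration, and it assumes a 2-D integer or boolean array. Because every slice stops at the array border, nothing wraps around.

```python
import numpy as np

def fill_by_slices(data):
    """Set the 4 direct neighbours of every 1 to 1 (hypothetical helper)."""
    result = data.copy()
    result[1:, :] |= data[:-1, :]   # copy each 1 to the cell below it
    result[:-1, :] |= data[1:, :]   # copy each 1 to the cell above it
    result[:, 1:] |= data[:, :-1]   # copy each 1 to the cell on its right
    result[:, :-1] |= data[:, 1:]   # copy each 1 to the cell on its left
    return result

corner = np.zeros((4, 4), dtype='uint8')
corner[0, 0] = 1                    # a 1 in the top-left corner
filled = fill_by_slices(corner)
print(filled)
```

With the 1 in the top-left corner, only the cell to its right and the cell below it are set; the last row and last column stay 0.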

Related

How to mask a tensor's time steps prior to event boolean

I have time-series data of shape [batch_size, horizon, feature]. Events occur every so often, and I mark them in a separate "meta" tensor as a boolean flag: a tensor of the same shape that is all zeros except where an event occurs, in which case it holds a 1.
I need to be able to prevent my model from looking at data prior to the event if an event has occurred within the horizon; so by default within the 2nd dimension, the mask should be all ones, and timesteps before a detected event should be all zeros.
Only the last event should be considered, and all prior timesteps should be 0 even if there were prior events.
One-dimensional examples (meta -> mask):
[0, 0, 1, 0] -> [0, 0, 1, 1]
[0, 0, 0, 1] -> [0, 0, 0, 1]
[1, 0, 1, 0] -> [0, 0, 1, 1]
[1, 0, 0, 0] -> [1, 1, 1, 1]
[0, 0, 0, 0] -> [1, 1, 1, 1]
Maybe something like this:
# copy, paste, acknowledge
import tensorflow as tf
the_example = tf.constant([[0, 0, 1, 0],
[0, 0, 0, 1],
[1, 0, 1, 0],
[1, 0, 0, 0],
[0, 0, 0, 0]])
the_zero_mask = tf.where(tf.reduce_all(the_example == 0, axis=-1), True, False)
x = tf.boolean_mask(the_example, ~the_zero_mask)
this_shape = tf.shape(x)
something_special = tf.stack([tf.repeat(tf.where(~the_zero_mask), this_shape[-1]), tf.cast(tf.tile(tf.range(this_shape[-1]), [this_shape[0]]), dtype=tf.int64)], axis=-1)
tell_me_where = tf.where(x == 1)
here = tf.math.unsorted_segment_max(data = tell_me_where[:, 1], segment_ids = tell_me_where[:, 0], num_segments=this_shape[0])
raggidy_ragged = tf.reverse(tf.ones_like(tf.ragged.range(here, this_shape[-1])).to_tensor(), axis=[-1])
raggidy_ragged = tf.pad(raggidy_ragged , [[0, 0], [this_shape[1] - tf.shape(raggidy_ragged)[1], 0]])
we_made_it = tf.tensor_scatter_nd_update(tf.ones_like(the_example, dtype=tf.int64), something_special, tf.reshape(raggidy_ragged, [-1]))
print(we_made_it)
tf.Tensor(
[[0 0 1 1]
[0 0 0 1]
[0 0 1 1]
[1 1 1 1]
[1 1 1 1]], shape=(5, 4), dtype=int64)
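The same rule can be stated more compactly: a timestep stays visible exactly when no event occurs strictly after it. Below is a minimal sketch of that idea in NumPy rather than TensorFlow, so it is not a drop-in replacement for the code above; the function name mask_before_last_event is hypothetical, and it assumes the time axis is the last axis of a 0/1 integer array.

```python
import numpy as np

def mask_before_last_event(meta):
    """Return 1 from the last event onward and 0 before it (hypothetical helper)."""
    meta = np.asarray(meta)
    # Number of events at or after each position along the last axis
    at_or_after = np.cumsum(meta[..., ::-1], axis=-1)[..., ::-1]
    # Number of events strictly after each position
    strictly_after = at_or_after - meta
    # A timestep is visible when no event follows it
    return (strictly_after == 0).astype(np.int64)

meta = np.array([[0, 0, 1, 0],
                 [0, 0, 0, 1],
                 [1, 0, 1, 0],
                 [1, 0, 0, 0],
                 [0, 0, 0, 0]])
mask = mask_before_last_event(meta)
print(mask)
```

Rows without any event come out as all ones without special handling, because their count of later events is zero everywhere. In TensorFlow, tf.cumsum takes reverse and exclusive arguments that should express the same count directly, though that variant is untested here.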

Padding n rows and m columns of 0s to the each side of numpy array

So here's the problem
Given a 2D NumPy array 'a' of size n×m, pad the matrix with 0s so that its
dimensions become (n+2·n1)×(m+2·m1), i.e. n1 rows of zeros above and below and m1 columns of zeros on the left and right.
a = np.array([[1, 1], [1, 1]])
n1 = 1
m1 = 2
print(padding(a, n1, m1))
>>[[0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 0, 0],
[0, 0, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0]]
I thought of solving it with the pad() function, but it does not give the result I expect:
import numpy as np
def padding(a, n1, m1):
    return np.pad(a, [n1, m1], constant_values=0)
a = np.array([[1, 1], [1, 1]])
n1 = 1
m1 = 2
print(padding(a, n1, m1))
Result is
[[0 0 0 0 0]
[0 1 1 0 0]
[0 1 1 0 0]
[0 0 0 0 0]
[0 0 0 0 0]]
Run:
result = np.pad(a, [(n1, n1), (m1, m1)])
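The second parameter (pad_width) is a list of 2-tuples (before, after),
one for each axis. A flat pair such as [n1, m1] is instead read as a single (before, after) that is applied to every axis, which is why your version added 1 row/column before and 2 after.
Padding with zeros is the default mode, so you don't need to specify it.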
The result is:
array([[0, 0, 0, 0, 0, 0],
[0, 0, 1, 1, 0, 0],
[0, 0, 1, 1, 0, 0],
[0, 0, 0, 0, 0, 0]])
A more concise version is:
np.pad(a, [(n1,), (m1,)])
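To make the difference between the two pad_width forms concrete, here is a small comparison of the resulting shapes, using the same a, n1 and m1 as in the question. The variable names same_for_all_axes and per_axis are only illustrative.

```python
import numpy as np

a = np.array([[1, 1], [1, 1]])
n1, m1 = 1, 2

# Flat pair: read as (before, after) = (1, 2) for every axis
same_for_all_axes = np.pad(a, [n1, m1])

# One (before, after) pair per axis: rows get n1, columns get m1
per_axis = np.pad(a, [(n1, n1), (m1, m1)])

print(same_for_all_axes.shape)  # (5, 5)
print(per_axis.shape)           # (4, 6)
```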

Fast randomly sample a varied number of ones by row from one hot encoded matrix

I have a one hot encoded M x N matrix, A, with the following properties:
One or more columns in each row can equal 1
Every column in the matrix has exactly one cell with a value of 1 (all other cells are 0)
M << N
I also have an M x 1 array, B, that contains integers (i.e. the number of random samples I want to select from each row). Each cell of B has the following property:
B[i] <= np.sum(A[i])
I'm looking for the most efficient way to randomly sample a subset of the ones in each row of A. The number of samples returned for each row is given by the integer value in the corresponding cell of B. The output will be an M x N matrix, let's call it C, where B == np.sum(C, axis=1)
Example
A = np.array([[0, 0, 1, 0, 0, 1, 0, 0],
[0, 1, 0, 0, 0, 0, 1, 1],
[1, 0, 0, 1, 1, 0, 0, 0]])
B = np.array([1, 3, 2])
A valid output of running this algorithm would be
array([[0, 0, 1, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 0, 0, 1, 1],
[1, 0, 0, 0, 1, 0, 0, 0]])
Another possible output would be
array([[0, 0, 0, 0, 0, 1, 0, 0],
[0, 1, 0, 0, 0, 0, 1, 1],
[0, 0, 0, 1, 1, 0, 0, 0]])
Looking for the ability to generate X random samples as fast as possible
What about this?
import numpy as np
np.random.seed(42)
A = np.array([[0, 0, 1, 0, 0, 1, 0, 0],
[0, 1, 0, 0, 0, 0, 1, 1],
[1, 0, 0, 1, 1, 0, 0, 0]])
B = np.array([1, 3, 2])
C = np.c_[B, A]
def sample_ones(x):
    indices = np.where(x[1:] == 1)[0] # indices of 1s
    subset_indices = np.random.choice(a=indices, size=indices.size - x[0], replace=False) # indices of NON-sampled 1s
    x[subset_indices + 1] = 0 # replace non-sampled 1s with 0s
    return x[1:]
print(np.apply_along_axis(sample_ones, 1, C))
Output:
[[0 0 1 0 0 0 0 0]
[0 1 0 0 0 0 1 1]
[0 0 0 1 1 0 0 0]]
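Since apply_along_axis still calls a Python function once per row, a fully vectorized variant may be worth timing against it. The sketch below uses a different technique from the answer above: it gives every cell a random key, ranks the keys within each row, and keeps the B[i] best-ranked ones. The name sample_ones_vectorized is hypothetical, and it assumes NumPy 1.17 or later for np.random.default_rng. Whether it is actually faster for your M and N is something to measure, not assume.

```python
import numpy as np

def sample_ones_vectorized(A, B, rng):
    """Keep B[i] randomly chosen ones in row i of A (hypothetical helper)."""
    keys = rng.random(A.shape)          # random key in [0, 1) for every cell
    keys[A == 0] = -1.0                 # zeros always rank below every one
    order = np.argsort(-keys, axis=1)   # column indices, largest key first
    ranks = np.empty_like(order)
    rows = np.arange(A.shape[0])[:, None]
    ranks[rows, order] = np.arange(A.shape[1])  # rank of each cell in its row
    # Keep the B[i] highest-ranked cells of each row; they are all ones
    # as long as B[i] <= A[i].sum()
    return ((ranks < B[:, None]) & (A == 1)).astype(A.dtype)

A = np.array([[0, 0, 1, 0, 0, 1, 0, 0],
              [0, 1, 0, 0, 0, 0, 1, 1],
              [1, 0, 0, 1, 1, 0, 0, 0]])
B = np.array([1, 3, 2])
C = sample_ones_vectorized(A, B, np.random.default_rng(42))
print(C)
```

Sorting every row costs O(N log N) per row, so for very wide matrices np.argpartition might be a better fit than a full argsort.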

Create a NxN array for all diagonals

Given an integer n, create an n x n NumPy array such that all of the elements on both of its diagonals are 1 and all others are 0.
Input: 4
Output
[[1, 0, 0, 1],
[0, 1, 1, 0],
[0, 1, 1, 0],
[1, 0, 0, 1]]
How do I create this array?
You can use np.fill_diagonal to fill the elements of the principal diagonal, and apply it to np.fliplr(a) to fill the other diagonal. np.fliplr returns a view, so filling its diagonal writes into a itself.
import numpy as np
a = np.zeros((4, 4), int)
np.fill_diagonal(a, 1)
np.fill_diagonal(np.fliplr(a), 1)
Output :
array([[1, 0, 0, 1],
[0, 1, 1, 0],
[0, 1, 1, 0],
[1, 0, 0, 1]])
Create an identity matrix and its flipped view, then take the maximum of the two:
np.maximum(np.eye(5, dtype=int), np.fliplr(np.eye(5, dtype=int)))
#array([[1, 0, 0, 0, 1],
# [0, 1, 0, 1, 0],
# [0, 0, 1, 0, 0],
# [0, 1, 0, 1, 0],
# [1, 0, 0, 0, 1]])
Edited: changed [::-1] to np.fliplr (for better performance).
I would do (assuming n=5):
import numpy as np
d = np.diagflat(np.ones(5,int))
a = d | np.rot90(d)
print(a)
Output:
[[1 0 0 0 1]
[0 1 0 1 0]
[0 0 1 0 0]
[0 1 0 1 0]
[1 0 0 0 1]]
This relies on the fact that | (bitwise OR) gives the same result as max here, because the arrays hold only 0s and 1s.
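The answers above hard-code the size, so here is a minimal sketch that wraps the same OR-of-two-diagonals idea in a function of n. The name both_diagonals is made up for illustration.

```python
import numpy as np

def both_diagonals(n):
    """n x n array with 1s on both diagonals, 0s elsewhere (hypothetical helper)."""
    eye = np.eye(n, dtype=int)
    # OR keeps the shared centre cell at 1 when n is odd
    return eye | np.fliplr(eye)

print(both_diagonals(4))
```

For n = 4 this prints the array asked for in the question.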

How can I check if an 1-D array is in a 2-D array?

I have the following matrix in NumPy: [[1 0 0 1 1 1], [1 0 0 0 1 0], [1 1 0 0 1 0], [0 1 0 1 1 1], [0 0 0 1 0 1]], and I want to check whether the array [1 0 0 0 1 0] is in the matrix. I tried to use
if 1-array in 2-D array:
    print('True')
but I get the warning DeprecationWarning: elementwise comparison failed; this will raise an error in the future.
If I run
import numpy as np
arr_2d = np.array([[1, 0, 0, 1, 1, 1],
[1, 0, 0, 0, 1, 0],
[1, 1, 0, 0, 1, 0],
[0, 1, 0, 1, 1, 1],
[0, 0, 0, 1, 0, 1]])
arr_1d = np.array([1, 0, 0, 0, 1, 0])
print(arr_1d in arr_2d)
It returns True without warnings.
I would suggest posting the code you used to get to those arrays, so we can see if there's something wrong with them.
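One caveat worth knowing: as far as I know, the in operator on NumPy arrays compares elementwise and then checks whether any element matched, so it does not test for a whole matching row. A minimal sketch of an explicit row check follows; the name contains_row is hypothetical, and returning False on a shape mismatch is a design choice made here, not NumPy behaviour.

```python
import numpy as np

def contains_row(arr_2d, arr_1d):
    """True if arr_1d equals one full row of arr_2d (hypothetical helper)."""
    arr_2d = np.asarray(arr_2d)
    arr_1d = np.asarray(arr_1d)
    # Rows can only match if the lengths agree
    if arr_2d.ndim != 2 or arr_1d.shape != (arr_2d.shape[1],):
        return False
    # Compare against every row, require all columns to match in some row
    return bool((arr_2d == arr_1d).all(axis=1).any())

arr_2d = np.array([[1, 0, 0, 1, 1, 1],
                   [1, 0, 0, 0, 1, 0],
                   [1, 1, 0, 0, 1, 0],
                   [0, 1, 0, 1, 1, 1],
                   [0, 0, 0, 1, 0, 1]])
print(contains_row(arr_2d, np.array([1, 0, 0, 0, 1, 0])))
print(contains_row(arr_2d, np.array([1, 1, 1, 1, 1, 1])))
```

The first call checks a row that is present, the second a row that is not. One common cause of the "elementwise comparison failed" warning is comparing arrays whose shapes cannot be broadcast against each other, which is why the sketch checks the shapes first.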
