Converting predefined 2d Array into 2d bool array - python

I am given a predefined array
a = [[2 3 4]
[5 6 7]
[8 9 10]]
My issue is converting this array into a boolean array where all even numbers are True. Any help is appreciated!

I would use numpy, and it's pretty straightforward:
import numpy as np
a = [[2, 3, 4],
[5, 6, 7],
[8, 9, 10]]
a = np.asarray(a)
(a % 2) == 0

Do as follows using Numpy:
import numpy as np
a = [[2, 3, 4],
[5, 6, 7],
[8, 9, 10]]
aBool=(np.asarray(a)%2==0)
The variable aBool will contain True where there is an Even value and False for an Odd value.
array([[ True, False, True],
[False, True, False],
[ True, False, True]])

This is adapted from the answer found here:
a=np.array([1,2,3,4,5,6])
answer = (a[a%2==0])
print(answer)
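which is essentially what Andrew said, except that it uses boolean indexing to return the even values themselves rather than a boolean array. Note that it still uses NumPy.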
If you're interested in getting booleans and testing for certain conditions within a NumPy array you can find a neat post here
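To make the difference concrete, here is a small sketch (the variable names are mine, not taken from the answers) that contrasts a boolean mask with boolean indexing. The mask keeps the shape of the input, while indexing with the mask returns the selected values as a flat array.

```python
import numpy as np

a = np.array([[2, 3, 4],
              [5, 6, 7],
              [8, 9, 10]])

# Boolean mask: same shape as `a`, True where the value is even
mask = (a % 2 == 0)

# Boolean indexing: the even values themselves, flattened to 1-D
evens = a[mask]

print(mask.shape)  # (3, 3)
print(evens)       # [ 2  4  6  8 10]
```

If the goal is the boolean array from the question, `mask` is the result to keep.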

This can be done simply using one line (if you don't want to use numpy):
[[i % 2 == 0 for i in sublist] for sublist in a]
>>> [[True, False, True], [False, True, False], [True, False, True]]
Here, % is the modulus operator, which gives the remainder when the first number is divided by the second. For even numbers i % 2 is 0, and for odd numbers i % 2 is 1. The == operator then compares that remainder with 0, so the expression evaluates to a boolean.
The two for loops iterate over each list inside of the 2D list.
This can be extended if you find this format easier to understand, but it essentially does the same thing:
>>> newlist = []
>>> for sublist in a:
...     partial_list = []
...     for i in sublist:
...         partial_list.append(i % 2 == 0)
...     newlist.append(partial_list)
>>> newlist
[[True, False, True], [False, True, False], [True, False, True]]

Why not mention negation as well:
a = [[2, 3, 4],
[5, 6, 7],
[8, 9, 10]]
>>> ~np.array(a)%2
array([[1, 0, 1],
[0, 1, 0],
[1, 0, 1]], dtype=int32)
This is a boolean form:
>>> ~(np.array(a)%2).astype(bool)
array([[ True, False, True],
[False, True, False],
[ True, False, True]])
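A short sketch of why the negation trick works, with names of my own choosing. For integers, ~x equals -x - 1, so it flips parity, and because ~ binds tighter than %, the expression ~arr % 2 is evaluated as (~arr) % 2. With a positive divisor the remainder is non-negative, so even inputs give 1. The check below confirms that the three spellings agree on this input.

```python
import numpy as np

arr = np.array([[2, 3, 4],
                [5, 6, 7],
                [8, 9, 10]])

direct = (arr % 2 == 0)             # plain comparison
flipped = (~arr % 2).astype(bool)   # (~arr) % 2 is 1 for even values
negated = ~(arr % 2).astype(bool)   # logical NOT of the "is odd" mask

print(np.array_equal(direct, flipped) and np.array_equal(direct, negated))  # True
```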

Related

In a matrix for a given index, how do I check if any neighboring values are 2 smaller than it?

I was doing a python challenge and this one stumped me.
This is the input matrix (numpy format):
# [[1, 7, 2, 2, 1],
# [7, 7, 9, 3, 2],
# [2, 9, 4, 4, 2],
# [2, 3, 4, 3, 2],
# [1, 2, 2, 7, 1]]
and the function would output this matrix
# [[False, True, False, False, False],
# [True, False, True, False, False],
# [False, True, False, True, False],
# [False, False, False, False, False],
# [False, False, False, True, False]]
As you can see, the value will be True if any (up/down/left/right) neighbor is at least 2 smaller than the value itself. (We've been learning numpy, but this doesn't feel like too much of a numpy thing.)
I tried to do simple if-comparison checks, but I kept stumbling into out-of-index errors and I couldn't find any way to circumvent or ignore those.
Thanks in advance.
This is the essence of what I've tried so far. I've simplified the task here to simply check the first row horizontally. If I could get this to work, I would extend it to check the next row horizontally until the end, and then I would do the same thing but vertically.
import numpy as np
ex=np.array([[7, 2, 3, 4, 3, 4, 7]])
def count_peaks(A):
    matrixHeight=A.shape[0]
    matrixWidth=A.shape[1]
    peakTable=np.zeros(shape=(matrixHeight,matrixWidth))
    for i in range(matrixWidth):
        if A[i]-A[i+1]>=2 or A[i]-A[i-1]>=2:
            peakTable[0,i]=1
    return peakTable
... which of course outputs:
IndexError: index 1 is out of bounds for axis 0 with size 1
as I'm trying to find the value of A[-1] which doesn't exist.
You are using numpy arrays, so don't loop; use vectorized code instead:
import numpy as np
# get shape
x,y = a.shape
# generate row/col of infinites
col = np.full([x, 1], np.inf)
row = np.full([1, y], np.inf)
# shift left/right/up/down
# and compute difference from initial array
left = a - np.c_[col, a[:,:-1]]
right = a - np.c_[a[:,1:], col]
up = a - np.r_[row, a[:-1,:]]
down = a -np.r_[a[1:,:], row]
# get max of each shift and compare to threshold
peak_table = np.maximum.reduce([left,right,up,down])>=2
# NB. if we wanted to use a maximum threshold, we would use
# `np.minimum` instead and initialize the shifts with `-np.inf`
output:
array([[False, True, False, False, False],
[ True, False, True, False, False],
[False, True, False, True, False],
[False, False, True, False, False],
[False, False, False, True, False]])
input:
import numpy as np
a = np.array([[1, 7, 2, 2, 1],
[7, 7, 9, 3, 2],
[2, 9, 4, 4, 2],
[2, 3, 4, 3, 2],
[1, 2, 2, 7, 1]])
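As a variation on the same idea, here is a sketch that pads the array with infinities once and then compares shifted slices, instead of building a separate row and column. This is a swapped-in technique rather than the code above, and the function name and the threshold parameter are my own assumptions.

```python
import numpy as np

def peaks_by_padding(a, threshold=2):
    """True where a cell exceeds at least one direct neighbor by `threshold` or more."""
    # Surround the array with +inf so that border cells never "win" against the padding
    p = np.pad(a.astype(float), 1, constant_values=np.inf)
    center = p[1:-1, 1:-1]
    diffs = [
        center - p[1:-1, :-2],  # left neighbor
        center - p[1:-1, 2:],   # right neighbor
        center - p[:-2, 1:-1],  # upper neighbor
        center - p[2:, 1:-1],   # lower neighbor
    ]
    return np.maximum.reduce(diffs) >= threshold

a = np.array([[1, 7, 2, 2, 1],
              [7, 7, 9, 3, 2],
              [2, 9, 4, 4, 2],
              [2, 3, 4, 3, 2],
              [1, 2, 2, 7, 1]])
peak_table = peaks_by_padding(a)
print(peak_table)
```

The cast to float is needed because an integer array cannot hold np.inf.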
If you don't mind me not using numpy to get the solution, but converting to numpy at the end, here is my attempt:
import numpy as np
def check_neighbors(mdarray,i,j):
    neighbors = (-1, 0), (1, 0), (0, -1), (0, 1)
    for neighbor in neighbors:
        ni, nj = i+neighbor[0], j+neighbor[1]
        # Negative indices wrap around instead of raising IndexError, so skip them explicitly
        if ni < 0 or nj < 0:
            continue
        try:
            if mdarray[i][j]-mdarray[ni][nj]>=2:
                return True
        except IndexError:
            pass
    return False
mdarray= [[1, 7, 2, 2, 1],
          [7, 7, 9, 3, 2],
          [2, 9, 4, 4, 2],
          [2, 3, 4, 3, 2],
          [1, 2, 2, 7, 1]]
peak_matrix =[]
for i in range(len(mdarray)):
    row = []
    for j in range(len(mdarray[i])):
        #print(check_neighbors(mdarray,i,j))
        row.append(check_neighbors(mdarray,i,j))
    peak_matrix.append(row)
y=np.array([np.array(xi) for xi in peak_matrix])
print(y)
I use the try-except block to avoid errors when the index goes out of bounds.
Note: Row 4 Column 3 (starting counts at 1) of my output seems to differ from yours. I think that the 4 and 2 difference in the neighbors should make this entry true?
Output:
[[False True False False False]
[ True False True False False]
[False True False True False]
[False False True False False]
[False False False True False]]
Edit: changed from a bare except to except IndexError, as Neither suggests in the comments. Using pass or continue inside the except block makes no difference in this case.
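One caveat when relying on IndexError for the borders: Python sequences accept negative indices, so an index of -1 reads the last row or column instead of raising. The sketch below illustrates this with a hypothetical helper, in_bounds, which is my own name and not part of the answer. An explicit bounds check of this kind is a safer guard than the exception alone.

```python
def in_bounds(grid, i, j):
    """Return True if (i, j) is a valid non-negative index into a list of lists."""
    return 0 <= i < len(grid) and 0 <= j < len(grid[i])

grid = [[9, 1],
        [1, 1]]

# grid[-1][0] does not raise: it silently reads the last row
wrapped = grid[0 - 1][0]
print(wrapped)                 # 1
print(in_bounds(grid, -1, 0))  # False
print(in_bounds(grid, 1, 1))   # True
```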

Rowwise numpy.isin for 2D arrays [duplicate]

I have two arrays:
A = np.array([[3, 1], [4, 1], [1, 4]])
B = np.array([[0, 1, 5], [2, 4, 5], [2, 3, 5]])
Is it possible to use numpy.isin rowwise for 2D arrays? I want to check if A[i,j] is in B[i] and return this result into C[i,j]. At the end I would get the following C:
np.array([[False, True], [True, False], [False, False]])
It would be great, if this is also doable with the == operator, then I could use it also with PyTorch.
Edit:
I also considered check for identical rows in different numpy arrays. The question is somehow similar, but I am not able to apply its solutions to this slightly different problem.
I'm not sure that my code solves your problem perfectly, so please run it on more test cases to confirm. I would do something like the following, taking advantage of numpy's ability to do outer-style vector operations (similar to a vector outer product). If it works as intended, it should work with PyTorch as well.
import numpy as np
A = np.array([[3, 1], [4, 1], [1, 4]])
B = np.array([[0, 1, 5], [2, 4, 5], [2, 3, 5]])
AA = A.reshape(3, 2, 1)
BB = B.reshape(3, 1, 3)
(AA == BB).sum(-1).astype(bool)
output:
array([[False, True],
[ True, False],
[False, False]])
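The same broadcasting idea can be written without hard-coding the shapes. The sketch below wraps it in a hypothetical helper, rowwise_isin (my own name), and uses only == and any, so the same expression should carry over to other array libraries with NumPy-style broadcasting.

```python
import numpy as np

def rowwise_isin(A, B):
    """C[i, j] is True when A[i, j] occurs anywhere in row B[i]."""
    # Shapes (n, m, 1) and (n, 1, k) broadcast to (n, m, k);
    # reducing over the last axis asks "does any element of B[i] match?"
    return (A[:, :, None] == B[:, None, :]).any(axis=-1)

A = np.array([[3, 1], [4, 1], [1, 4]])
B = np.array([[0, 1, 5], [2, 4, 5], [2, 3, 5]])
C = rowwise_isin(A, B)
print(C.tolist())
# [[False, True], [True, False], [False, False]]
```

Note that the intermediate array has n * m * k elements, so memory use grows with the product of both row lengths.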
Updated answer
Here is a way to do this (each A[i, j] is compared against every element of B[i], then reduced over the last axis):
(A[:, :, None] == B[:, None, :]).any(axis=-1)
# > array([[False, True],
# [ True, False],
# [False, False]])
Previous answer
You could also do it inside a list comprehension:
[np.isin(a, b) for a,b in zip(A, B)]
# > [array([False, True]), array([ True, False]), array([False, False])]
np.array([np.isin(a, b) for a,b in zip(A, B)])
# > array([[False, True],
# [ True, False],
# [False, False]])
But, as @alex said, it defeats the purpose of numpy.

Numpy: When applying a boolean mask for an array of arrays, most efficient way to record which items were in the original arrays

I can perform a boolean mask on an array of arrays like this
import numpy as np
a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
b = [[True, False, False], [True, True, False], [False, False, False]]
np.array(a)[np.array(b)]
and get array([1, 4, 5])
How would I preserve the information of which numbers belonged to the same array?
something like this would work
is_in_original(1, 4)
> False
is_in_original(5, 4)
> True
One thing I could think of is this
def is_in_original(x, y):
    for arry in np.array(a):
        if x in arry and y in arry:
            return True
    return False
I am wondering if this is the most computationally efficient method. I will be working with very large array of arrays, and need the throughput to be as fast as possible.
You can use np.where(mask, array, 0) to preserve dimensions.
import numpy as np
a = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
b = [[True, False, False], [True, True, False], [False, False, False]]
ret = np.where(np.array(b), np.array(a), 0)
Output:
array([[1, 0, 0],
[4, 5, 0],
[0, 0, 0]])
Here the third argument of np.where is 0; you can change it to any other number, or to inf.
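Building on the idea of keeping the 2D shape, here is one possible vectorized take on the is_in_original check from the question. It is a sketch under my own reading of the requirement, namely that both values must be selected by the mask within the same row. It works on the mask directly instead of a fill value, so a real 0 in the data cannot be confused with a masked-out cell.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
b = np.array([[True, False, False], [True, True, False], [False, False, False]])

def is_in_original(x, y, values=a, mask=b):
    """True if x and y are both selected by `mask` within the same row of `values`."""
    has_x = ((values == x) & mask).any(axis=1)  # rows that contain a selected x
    has_y = ((values == y) & mask).any(axis=1)  # rows that contain a selected y
    return bool((has_x & has_y).any())

print(is_in_original(1, 4))  # False
print(is_in_original(5, 4))  # True
```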

How to find row of 2d array in 3d numpy array

I'm trying to find the row in which a 2d array appears in a 3d numpy ndarray. Here's an example of what I mean. Given:
arr = [[[0, 3], [3, 0]],
[[0, 0], [0, 0]],
[[3, 3], [3, 3]],
[[0, 3], [3, 0]]]
I'd like to find all occurrences of:
[[0, 3], [3, 0]]
The result I'd like is:
[0, 3]
I tried to use argwhere but that unfortunately got me nowhere. Any ideas?
Try
np.argwhere(np.all(arr==[[0,3], [3,0]], axis=(1,2)))
How it works:
arr == [[0,3], [3,0]] returns
array([[[ True, True],
[ True, True]],
[[ True, False],
[False, True]],
[[False, True],
[ True, False]],
[[ True, True],
[ True, True]]], dtype=bool)
This is a three dimensional array where the innermost axis is 2. The values at this axis are:
[True, True]
[True, True]
[True, False]
[False, True]
[False, True]
[True, False]
[True, True]
[True, True]
Now with np.all(arr==[[0,3], [3,0]], axis=2) you are checking if both elements on a row are True and its shape will be reduced to (4, 2) from (4, 2, 2). Like this:
array([[ True, True],
[False, False],
[False, False],
[ True, True]], dtype=bool)
You need one more reduction step, because both rows have to match (both [0, 3] and [3, 0]). You can do it by reducing the result again (the innermost axis is now 1):
np.all(np.all(arr == [[0, 3], [3, 0]], axis=2), axis=1)
Or you can also do it by giving a tuple for the axis parameter to do the same thing step by step (first innermost, then one step higher). The result will be:
array([ True, False, False, True], dtype=bool)
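Putting the steps together, here is a self-contained sketch. It assumes arr is converted to an ndarray first, since comparing a nested Python list with == does not broadcast. The names pattern, matches and indices are mine. np.flatnonzero is used to get the flat [0, 3] result asked for in the question.

```python
import numpy as np

arr = np.array([[[0, 3], [3, 0]],
                [[0, 0], [0, 0]],
                [[3, 3], [3, 3]],
                [[0, 3], [3, 0]]])
pattern = np.array([[0, 3], [3, 0]])

# Reduce over both inner axes at once: True where the whole 2D slice matches
matches = np.all(arr == pattern, axis=(1, 2))

# Convert the boolean mask to the indices of the matching slices
indices = np.flatnonzero(matches)
print(indices)  # [0 3]
```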
The 'contains' function in the numpy_indexed package (disclaimer: I am its author) can be used to make queries of this kind. It implements a solution similar to the one offered by Saullo.
import numpy_indexed as npi
test = [[[0, 3], [3, 0]]]
# check which elements of arr are present in test (checked along axis=0 by default)
flags = npi.contains(test, arr)
# if you want the indexes:
idx = np.flatnonzero(flags)
You can use np.in1d after defining a new data type that has the memory size of each row in your arr. To define such a data type:
mydtype = np.dtype((np.void, arr.dtype.itemsize*arr.shape[1]*arr.shape[2]))
then you have to convert your arr to a 1-D array where each row will have arr.shape[1]*arr.shape[2] elements:
aView = np.ascontiguousarray(arr).flatten().view(mydtype)
You are now ready to look for your 2-D array pattern [[0, 3], [3, 0]] which also has to be converted to dtype:
bView = np.array([[0, 3], [3, 0]]).flatten().view(mydtype)
You can now check the occurrences of bView in aView:
np.in1d(aView, bView)
#array([ True, False, False, True], dtype=bool)
This mask is easily converted to indices using np.where, for example.
Timings (updated)
The following function is used to implement this approach:
def check2din3d(b, a):
    """
    Return where `b` (2D array) appears in `a` (3D array) along `axis=0`
    """
    mydtype = np.dtype((np.void, a.dtype.itemsize*a.shape[1]*a.shape[2]))
    aView = np.ascontiguousarray(a).flatten().view(mydtype)
    bView = np.ascontiguousarray(b).flatten().view(mydtype)
    return np.in1d(aView, bView)
The updated timings, taking @ayhan's comments into account, show that this method can be faster than np.argwhere, but the difference is not significant, and for large arrays like the one below @ayhan's approach is considerably faster:
arrLarge = np.concatenate([arr]*10000000)
arrLarge = np.concatenate([arrLarge]*10, axis=2)
pattern = np.ascontiguousarray([[0,3]*10, [3,0]*10])
%timeit np.argwhere(np.all(arrLarge==pattern, axis=(1,2)))
#1 loops, best of 3: 2.99 s per loop
%timeit check2din3d(pattern, arrLarge)
#1 loops, best of 3: 4.65 s per loop

Converting Specified Elements of a NumPy Array by a New Value

I wanted to convert the specified elements of the NumPy array A: 1, 5, and 8 into 0.
So I did the following:
import numpy as np
A = np.array([[1,2,3,4,5],[6,7,8,9,10]])
bad_values = (A==1)|(A==5)|(A==8)
A[bad_values] = 0
print(A)
Yes, I got the expected result, i.e., new array.
However, in my real-world problem the given array (A) is very large and also 2-dimensional, and there are far too many bad values to write out one comparison for each. So I tried the following way of doing that:
bads = [1,5,8] # Suppose they are the values to be converted into 0
bad_values = A == x for x in bads # HERE is the problem I am facing
How can I do this?
Then, of course the remaining is the same as before.
A[bad_values] = 0
print(A)
If you want to get the index of where a bad value occurs in your array A, you could use in1d to find out which values are in bads:
>>> np.in1d(A, bads)
array([ True, False, False, False, True, False, False, True, False, False], dtype=bool)
So you can just write A[np.in1d(A, bads)] = 0 to set the bad values of A to 0.
EDIT: If your array is 2D, one way would be to use the in1d method and then reshape:
>>> B = np.arange(9).reshape(3, 3)
>>> B
array([[0, 1, 2],
[3, 4, 5],
[6, 7, 8]])
>>> np.in1d(B, bads).reshape(3, 3)
array([[False, True, False],
[False, False, True],
[False, False, True]], dtype=bool)
So you could do the following:
>>> B[np.in1d(B, bads).reshape(3, 3)] = 0
>>> B
array([[0, 0, 2],
[3, 4, 0],
[6, 7, 0]])
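As an alternative sketch, np.isin returns a mask with the same shape as its first argument, so the reshape step is not needed. The example reuses A and bads from the question.

```python
import numpy as np

A = np.array([[1, 2, 3, 4, 5],
              [6, 7, 8, 9, 10]])
bads = [1, 5, 8]

# np.isin keeps the shape of A, so the mask can index A directly
mask = np.isin(A, bads)
A[mask] = 0
print(A.tolist())  # [[0, 2, 3, 4, 0], [6, 7, 0, 9, 10]]
```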
