Convert boolean to integer location in python [duplicate] - python

I have a boolean list, say:
x = [True, False, False, True]
How do you convert this list to integer locations, so that you get the following result?
y = [1, 4]

You could use a list comprehension in combination with the enumerate function, for example:
>>> x = [True, False, False, True]
>>> [index for index, element in enumerate(x, start=1) if element]
[1, 4]
Alternatively, if you're willing to use NumPy and get a result of type numpy.ndarray, there's a NumPy function that (almost) does what you need: numpy.where.
>>> import numpy
>>> numpy.where(x)
(array([0, 3]),)
>>> numpy.where(x)[0] + 1
array([1, 4])
The strange [0] in the line above is there because numpy.where always returns its results in a tuple: one element of the tuple for each dimension of the input array. Since in this case the input array is one-dimensional, we don't really care about the outer tuple structure, so we use the [0] indexing operation to pull out the actual array we need. The + 1 is there to get from Python / NumPy's standard 0-based indexing to the 1-based indexing that it looks as though you want here.
If you're working with large input data (and especially if the input list is already in the form of a NumPy array), the NumPy solution is likely to be significantly faster than the list comprehension.
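If the data is already (or can cheaply become) a NumPy array, np.flatnonzero is a slightly shorter alternative to numpy.where(x)[0]. A minimal sketch, assuming 1-based positions are wanted as in the question; the names mask and positions are only illustrative:

```python
import numpy as np

x = [True, False, False, True]

# Convert once; for input that is already a boolean array this is a no-op.
mask = np.asarray(x, dtype=bool)

# flatnonzero returns the 0-based indices of the True entries as a 1-D array,
# so no [0] is needed to unwrap a tuple.
positions = np.flatnonzero(mask) + 1

print(positions.tolist())  # [1, 4]
```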

Use enumerate within a list comprehension:
>>> [i for i, j in enumerate(x, 1) if j]
[1, 4]

A simple one-liner would be:
[i+1 for i in range(len(x)) if x[i]]

Related

How to pythonically select a random index from a 2D list such that the corresponding element matches a value?

I have a 2D list of booleans. I want to select a random index from the list where the value is False. For example, given the following list:
[[True, False, False],
[True, True, True],
[False, True, True]]
The valid choices would be: [0, 1], [0, 2], and [2, 0].
I could keep a list of valid indices and then use random.choice to select from it, but it seems unpythonic to keep a variable and update it every time the underlying list changes for only this one purpose.
Bonus points if your answer runs quickly.
We can use a one-liner like:
import numpy as np
from random import choice
choice(np.argwhere(~a))
Here a is a NumPy array of booleans.
This works as follows: ~a negates the elements of the array, so the False entries become True. np.argwhere then constructs a k×2 array with one row per True entry of ~a; each row holds the row index and the column index of a position where a is False.
choice(..) thus selects a random row. We cannot use that row directly to access the element, however, because indexing with an array is interpreted differently from indexing with a tuple. We can use the tuple(..) constructor to convert it to a tuple:
>>> tuple(choice(np.argwhere(~a)))
(2, 0)
You can thus fetch the element then with:
t = tuple(choice(np.argwhere(~a)))
a[t]
But of course, it is not a surprise that:
>>> t = tuple(choice(np.argwhere(~a)))
>>> a[t]
False
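The same idea can be wrapped in a small helper that also handles the case where no False entry exists. This is only a sketch: the function name random_false_index and its rng parameter are made up for illustration, and it uses NumPy's own random generator instead of random.choice.

```python
import numpy as np

def random_false_index(grid, rng=None):
    """Return a random (row, col) tuple at which grid is False."""
    rng = np.random.default_rng() if rng is None else rng
    # One row per False entry: [row_index, col_index].
    candidates = np.argwhere(~np.asarray(grid, dtype=bool))
    if len(candidates) == 0:
        raise ValueError("no False entries to choose from")
    row = candidates[rng.integers(len(candidates))]
    # Convert to a tuple of plain ints so it can index the array directly.
    return tuple(int(v) for v in row)

a = np.array([[True, False, False],
              [True, True, True],
              [False, True, True]])
t = random_false_index(a)
print(t, a[t])  # one of (0, 1), (0, 2), (2, 0), followed by False
```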
My non-numpy version:
import random
result = random.choice([
    (i, j)
    for i in range(len(a))
    for j in range(len(a[i]))
    if not a[i][j]])
Like Willem's np version, this generates a list of valid tuples and invokes random.choice() to pick one.
Alternatively, if you hate seeing range(len(...)) as much as I do, here is an enumerate() version:
result = random.choice([
    (i, j)
    for i, row in enumerate(a)
    for j, cell in enumerate(row)
    if not cell])
Assuming you don't want to use numpy:
import random
matrix = [[True, False, False],
          [True, True, True],
          [False, True, True]]
valid_choices = [(i, j) for i, x in enumerate(matrix) for j, y in enumerate(x) if not y]
random.choice(valid_choices)
With list comprehensions, you can change the if condition (if not y) to suit your needs. This returns the randomly selected coordinate; you could instead change the value part of the list comprehension, (i, j) in this case, to y so that it returns False, though that's a bit redundant here.

Python numpy array sum over certain indices

How can I sum only the elements at a given list of indices in a numpy array? For example, if I have an array a = [1,2,3,4] and a list of indices to sum, indices = [0, 2], I want a fast operation that gives me the answer 4, because the values at index 0 and index 2 in a (1 and 3) sum to 4.
You can use sum directly after indexing with indices:
import numpy as np
a = np.array([1, 2, 3, 4])
indices = [0, 2]
a[indices].sum()  # 1 + 3 = 4
The accepted a[indices].sum() approach copies data and creates a new array, which might cause problems if the array is large. np.sum has a where argument (available in NumPy 1.17 and later) that restricts the sum to the elements selected by a boolean mask, so you can just do
np.sum(a, where=[True, False, True, False])
This avoids building the intermediate a[indices] array.
The mask array can be obtained by:
mask = np.full(4, False)
mask[np.array([0,2])] = True
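Put together, the mask-based sum might look like the sketch below, assuming NumPy 1.17 or later for the where argument. One caveat worth knowing: a mask can only mark an index once, so if indices contained repeats the result would differ from a[indices].sum().

```python
import numpy as np

a = np.array([1, 2, 3, 4])
indices = [0, 2]

# Build a boolean mask with the same shape as a, True at the wanted indices.
mask = np.zeros(a.shape, dtype=bool)
mask[indices] = True

# Sum only the elements where the mask is True.
total = np.sum(a, where=mask)

print(total)  # 4
```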
Try:
>>> a = [1,2,3,4]
>>> indices = [0, 2]
>>> sum(a[i] for i in indices)
4
Faster
If you have a lot of numbers and you want high speed, then you need to use numpy:
>>> import numpy as np
>>> a = np.array([1,2,3,4])
>>> a[indices]
array([1, 3])
>>> np.sum(a[indices])
4

Check if all elements in a 2d array are equal to 1 in Python [duplicate]

What's the most Pythonic way to do this, other than iterating through nested loops and checking if each value is equal to 1?
If you're using numpy, you can use its per-element equality check and then call the resulting array's all method to check whether all elements satisfy the condition:
>>> import numpy as np
>>> c = np.array([[1,2], [3,4]])
>>> (c==1).all()
False
>>> c = np.array([[1,1], [1,1]])
>>> (c==1).all()
True
Use itertools.chain to iterate over the "flattened" list:
import itertools
all(x == 1 for x in itertools.chain(*my_list))
Using sets:
from itertools import chain
array = [[1,1,1],[1,0,1]]
if set(chain.from_iterable(array)) == {1}:
    print("all ones")
You can use np.searchsorted along with np.apply_along_axis to find the elements in any dimensional array.
A generator expression combined with all is probably the way to go.
array_2D = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
all(all(item == 1 for item in items) for items in array_2D)
Another option would be to flatten the array first. Python has no built-in function for flattening arbitrarily nested lists, but code that flattens deeply would handle any number of dimensions.
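A deep flatten is short to write by hand. The following is only a sketch of that idea; flatten and all_ones are illustrative names, not standard library functions, and only lists and tuples are treated as nested containers.

```python
def flatten(nested):
    """Yield the leaf values of arbitrarily nested lists/tuples, depth first."""
    for item in nested:
        if isinstance(item, (list, tuple)):
            yield from flatten(item)
        else:
            yield item

def all_ones(nested):
    """True if every leaf value equals 1 (also True for empty input)."""
    return all(value == 1 for value in flatten(nested))

print(all_ones([[1, 1, 1], [1, 1, 1]]))    # True
print(all_ones([[1, [1, 0]], [1]]))        # False
```

Because all short-circuits, the generator stops at the first value that is not 1.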

appending numpy array with booleans

Can someone explain what this code is doing?
a = np.array([[1, 2], [3, 4]])
a[..., [True, False]]
What is the [True, False] doing there?
Ellipsis Notation and Booleans as Integers
From the numpy docs:
Ellipsis expand to the number of : objects needed to make a selection tuple of the same length as x.ndim. There may only be a single ellipsis present
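True and False are just obfuscated 1 and 0 (this answer describes older NumPy releases, in which booleans inside an index were treated as integers). Taking the example from the docs: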
x = np.array([[[1],[2],[3]], [[4],[5],[6]]])
x[...,0]
# outputs: array([[1, 2, 3],
# [4, 5, 6]])
x[..., False] # same thing when False is treated as the integer 0
The boolean values are specifying an index, just like the numbers 0 or 1 would.
In response to your question in the comments
At first it seems magical that
a = np.array([[1, 2], [3, 4]])
a[..., [True, True]] # = [[2,2],[4,4]] when the bools are treated as integers
But when we consider it as
a[..., [1,1]] # = [[2,2],[4,4]]
it seems less impressive.
Similarly:
b = np.array([[1,2,3],[4,5,6]])
b[...,[2,2]] # = [[3,3],[6,6]]
After applying the ellipsis rules, True and False grab column indices, just like 0, 1, or 17 would.
Boolean Arrays for Complex Indexing
There are some subtle differences (bools have a different type than ints). A lot of the hairy details can be found here. These do not seem to play any role in your code, but they are interesting for figuring out how numpy indexing works.
In particular, this line is probably what you're looking for:
In the future Boolean array-likes (such as lists of python bools) will
always be treated as Boolean indexes
On this page, they talk about boolean arrays, which are quite complex as an indexing tool
Boolean arrays used as indices are treated in a different manner
entirely than index arrays. Boolean arrays must be of the same shape
as the initial dimensions of the array being indexed
Skipping down a bit
Unlike in the case of integer index arrays, in the boolean case, the
result is a 1-D array containing all the elements in the indexed array
corresponding to all the true elements in the boolean array. The
elements in the indexed array are always iterated and returned in
row-major (C-style) order. The result is also identical to
y[np.nonzero(b)]. As with index arrays, what is returned is a copy of
the data, not a view as one gets with slices.
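A small sketch of the mask behaviour described in the quoted documentation, assuming a current NumPy release in which an array (or list) of bools in an index is treated as a mask rather than as the integers 0 and 1. The variable names are illustrative only.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])

# A 1-D boolean mask on the last axis keeps the columns marked True.
col_mask = np.array([True, False])
kept = a[..., col_mask]                          # column 0 only
via_nonzero = a[..., np.nonzero(col_mask)[0]]    # same selection with integer indices

# A mask with the full shape of a returns a 1-D array in row-major order,
# identical to indexing with np.nonzero of that mask.
big = a[a > 2]
big_via_nonzero = a[np.nonzero(a > 2)]

print(kept.tolist())  # [[1], [3]]
print(big.tolist())   # [3, 4]
```

Under this interpretation, a[..., [True, False]] selects the first column rather than repeating columns 1 and 0.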

What does this: s[s[1:] == s[:-1]] do in numpy?

I've been looking for a way to efficiently check for duplicates in a numpy array and stumbled upon a question that contained an answer using this code.
What does this line mean in numpy?
s[s[1:] == s[:-1]]
Would like to understand the code before applying it. Looked in the Numpy doc but had trouble finding this information.
The slices [1:] and [:-1] mean all but the first and all but the last elements of the array:
>>> import numpy as np
>>> s = np.array((1, 2, 2, 3)) # four element array
>>> s[1:]
array([2, 2, 3]) # last three elements
>>> s[:-1]
array([1, 2, 2]) # first three elements
therefore the comparison generates an array of boolean comparisons between each element s[x] and its "neighbour" s[x+1], which will be one shorter than the original array (as the last element has no neighbour):
>>> s[1:] == s[:-1]
array([False, True, False], dtype=bool)
and using that array to index the original array gets you the elements where the comparison is True, i.e. the elements that are the same as their neighbour:
>>> s[s[1:] == s[:-1]]
array([2])
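Note that this only identifies adjacent duplicate values. Also note that recent NumPy releases require a boolean index to have the same length as the axis it indexes, so there s[s[1:] == s[:-1]] raises an IndexError; index the equally long slice instead, for example s[:-1][s[1:] == s[:-1]].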
Check this out:
>>> s=numpy.array([1,3,5,6,7,7,8,9])
>>> s[1:] == s[:-1]
array([False, False, False, False, True, False, False], dtype=bool)
>>> s[s[1:] == s[:-1]]
array([7])
So s[1:] gives all numbers but the first, and s[:-1] all but the last.
Now compare these two vectors, i.e. check whether two adjacent elements are the same. Finally, select these elements.
s[1:] == s[:-1] compares s without the first element with s without the last element, i.e. 0th with 1st, 1st with 2nd etc, giving you an array of len(s) - 1 boolean elements. s[boolarray] will select only those elements from s which have True at the corresponding place in boolarray. Thus, the code extracts all elements that are equal to the next element.
It will show duplicates in a sorted array.
Basically, the inner expression s[1:] == s[:-1] compares the array with its shifted version. Imagine this:
1, [2, 3, ... n-1, n ]
- [1, 2, ... n-2, n-1] n
=> [F, F, ... F, F ]
In a sorted array, there will be no True in the resulting array unless there is a repetition. The outer expression s[array] then keeps only those elements for which the index array holds True.
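Putting the explanations above together, here is a sketch that runs on current NumPy (where the mask must be as long as the array it indexes, hence s[:-1] rather than s) and sorts first so that equal values become adjacent. The function name adjacent_duplicates is made up for illustration.

```python
import numpy as np

def adjacent_duplicates(values):
    """Return the values that equal their successor once the input is sorted."""
    s = np.sort(np.asarray(values))
    # Element i of the mask is True when s[i] == s[i + 1]; length is len(s) - 1.
    same_as_next = s[1:] == s[:-1]
    # Index the equally long slice s[:-1], not s itself.
    return s[:-1][same_as_next]

print(adjacent_duplicates([7, 1, 3, 7, 5]))  # [7]
```

A value that occurs three times is reported twice, so wrap the result in np.unique if each duplicate should appear only once.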
