Python: Mapping with Index of Previous Occurrence - python

I'm often met with an analog of the following problem, and have had trouble writing clean code to solve it. Usually, I have something involving a temporary variable and a for loop, but is there a more elegant way?
Suppose I have a list of booleans or values which evaluate to booleans:
[True, False, True, False, False, True]
How would I map this to a list of values, with the index of the previous True, inclusive?
[0, 0, 2, 2, 2, 5]
[EDIT] Have tried something along the lines of:
def example(lst):
    rst, tmp = [], None
    for i in range(len(lst)):
        if lst[i]:
            tmp = i
        rst.append(tmp)
    return rst
Assuming the first element of the list is always True.

While it still uses a for loop and a temporary variable, it's still relatively clean, I think. If you want, you could replace the yield with an append to a list and return that list.
def get_indexes(booleans):
    previous = 0
    for index, b in enumerate(booleans):
        if b:
            previous = index
        yield previous
>>> b = [True, False, True, False, False, True]
>>> list(get_indexes(b))
[0, 0, 2, 2, 2, 5]
This is even shorter (although potentially less readable):
def get_indexes(booleans):
    previous = 0
    for index, b in enumerate(booleans):
        previous = index if b else previous
        yield previous

Try this:
index = 0
bools = [True, False, True, False, False, True]
result = []
for i in range(len(bools)):
    index = i if bools[i] else index
    result.append(index)
Not tested, but should work.

[i if b else i-lst[i::-1].index(True) for i,b in enumerate(lst)]
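For a version without an explicit temporary variable, one possible sketch uses itertools.accumulate with max as the running function. The name previous_true_indexes is made up for illustration, and the -1 placeholder for "no True seen yet" is an assumption; the question guarantees the first element is True, so the placeholder never shows up in that case.

```python
from itertools import accumulate

def previous_true_indexes(flags):
    """Return, for each position, the index of the latest truthy item so far."""
    # Truthy positions keep their own index; falsy ones get -1 (assumed placeholder).
    marked = (i if flag else -1 for i, flag in enumerate(flags))
    # A running maximum carries the latest truthy index forward.
    return list(accumulate(marked, max))

print(previous_true_indexes([True, False, True, False, False, True]))
# [0, 0, 2, 2, 2, 5]
```

Unlike the slicing one-liner, this makes a single pass over the input.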

Related

Find intervals of true values in vector

I am looking for a quick way to find the start and end indexes of each "block" of consecutive trues in a Vector.
Both julia or python would do the job for me. I'll write my example in julia syntax:
Say I have a vector
a = [false, true, true, true, false, true, false, true, true, false]
what I want to get is something like this (with 1-based indexing):
[[2, 4], [6, 6], [8, 9]]
The exact form/type of the returned value does not matter, I am mostly looking for a quick and syntactically easy solution. Single trues surrounded by falses should also be detected, as given in my example.
My use-case with this is that I want to find intervals in a Vector of data where the values are below a certain threshold. So I get a boolean array from my data where this is true. Ultimately I want to shade these intervals in a plot, for which I need the start and end indices of each interval.
function intervals(a)
    jumps = diff([false; a; false])
    zip(findall(jumps .== 1), findall(jumps .== -1) .- 1)
end
Quick in terms of keystrokes, maybe not in performance or readability :)
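Since the question allows Python as well, here is a rough pure-Python rendering of the same padding-and-diff idea. The function name intervals is reused for illustration only, and the result uses 1-based inclusive indices to match the example in the question.

```python
def intervals(flags):
    """Return (start, end) pairs, 1-based and inclusive, for each run of truthy values."""
    # Pad with a 0 on both sides so runs touching either end still produce jumps.
    padded = [0] + [int(bool(f)) for f in flags] + [0]
    # +1 marks a False -> True transition, -1 marks a True -> False transition.
    jumps = [b - a for a, b in zip(padded, padded[1:])]
    starts = [i + 1 for i, j in enumerate(jumps) if j == 1]
    ends = [i for i, j in enumerate(jumps) if j == -1]
    return list(zip(starts, ends))

a = [False, True, True, True, False, True, False, True, True, False]
print(intervals(a))  # [(2, 4), (6, 6), (8, 9)]
```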
My use-case with this is that I want to find intervals in a Vector of data where the values are below a certain threshold.
Let's say your vector is v and your threshold is 7:
julia> println(v); threshold
[9, 6, 1, 9, 5, 9, 4, 5, 6, 1]
7
You can use findall to get the indices where the value is below the threshold, and get the boundaries from that:
julia> let start = 1, f = findall(<(threshold), v), intervals = Tuple{Int, Int}[]
           for i in Iterators.drop(eachindex(f), 1)
               if f[i] - f[i - 1] > 1
                   push!(intervals, (f[start], f[i - 1]))
                   start = i
               end
           end
           push!(intervals, (f[start], last(f)))
       end
3-element Vector{Tuple{Int64, Int64}}:
(2, 3)
(5, 5)
(7, 10)
Here's a version that avoids running findall first, and is a bit faster as a consequence:
function intervals(v)
    ints = UnitRange{Int}[]
    i = firstindex(v)
    while i <= lastindex(v)
        j = findnext(v, i) # find next true
        isnothing(j) && break
        k = findnext(!, v, j+1) # find next false
        isnothing(k) && (k = lastindex(v)+1)
        push!(ints, j:k-1)
        i = k+1
    end
    return ints
end
It also returns a vector of UnitRanges, since that seemed a bit more natural to me.
try this:
a = [False, True, True, True, False, True, False, True, True, False]
index = 0
foundTrue = False
booleanList = []
sublist = []
for i in a:
    index += 1
    if foundTrue:
        if i == False:
            foundTrue = False
            sublist.append(index-1)
            booleanList.append(sublist)
            sublist = []
    else:
        if i == True:
            foundTrue = True
            sublist.append(index)
print(booleanList)
output should be: [[2, 4], [6, 6], [8, 9]]
This iterates over the list a. When it finds a True, it sets a flag (foundTrue) and stores the index in sublist. While the flag is set, the next False closes the interval: the index just before that False is appended to sublist, sublist is appended to booleanList, and sublist is reset. Note that a run of Trues that reaches the end of the list is never closed, so it would not be reported.
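As an alternative to tracking a flag by hand, the runs can be grouped with itertools.groupby. This is only a sketch: true_intervals is a hypothetical name, indices are 1-based as in the question, and a run of True values that reaches the end of the list is also reported.

```python
from itertools import groupby

def true_intervals(flags):
    """Return 1-based inclusive (start, end) pairs for each run of truthy values."""
    result = []
    position = 1  # 1-based index of the first item in the current run
    for value, run in groupby(flags, key=bool):
        length = sum(1 for _ in run)
        if value:
            result.append((position, position + length - 1))
        position += length
    return result

a = [False, True, True, True, False, True, False, True, True, False]
print(true_intervals(a))  # [(2, 4), (6, 6), (8, 9)]
```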
This is not the shortest but very fast without using any find functions.
function find_intervals(v)
    i = 0
    res = Tuple{Int64, Int64}[]
    while (i+=1) <= length(v)
        v[i] || continue
        s = f = i
        while i < length(v) && v[i+=1]
            f = i
        end
        push!(res, (s,f))
    end
    res
end
For a = [false, true, true, true, false, true, false, true, true, false], it gives:
find_intervals(a)
3-element Vector{Tuple{Int64, Int64}}:
(2, 4)
(6, 6)
(8, 9)

Check if all values in the same index among several lists are false

I have a list of lists of boolean values. I'm trying to return a list of indexes where all values in the same positions are only False. So in the example below, position 3 and 4 of each inner list are False, so return [3,4].
Some assumptions to keep in mind, the list of lists could contain X number of lists, so can't rely on just three, it could be 50. Also, the inner lists will always have equal lengths, so no worrying about different-sized lists, but they could be longer than 6 like in the example I gave. So the solution must work for dynamic sizes/lengths.
list1 = [True, True, True, False, False, False]
list2 = [True, True, False, False, False, False]
list3 = [True, True, True, False, False, True]
list_of_lists = [list1, list2, list3]
result_list_of_all_false_indexes = []
# Pseudo code
# Look in position 0 of each list in the list of lists. If they're all false
# result_list_of_all_false_indexes.append(index)
# Look in position 1 of each list in the list of lists. If they're all false
# result_list_of_all_false_indexes.append(index)
# continue for entire length of inner lists
assert result_list_of_all_false_indexes == [3,4], "Try again..."
With some help from numpy, we can check your conditions by axis:
import numpy as np
results = np.where(~np.any(list_of_lists, axis=0))[0].tolist()
# Output:
[3, 4]
lol - list of lists
output - returns desired list of indexes.
def output(lol):
    res = []
    if lol: # Checks if list contains any list at all
        res = [i for i in range(len(lol[0]))]
        for list_ in lol:
            res = [i for i in res if not list_[i]]
            if not res:
                break
    return res
I would use zip to unpack the list_of_lists and enumerate to get the indexes. Then the any function can be used with not to test for all False values.
import random
n_lists = random.randrange(1, 20)
n_elements = random.randrange(3, 10)
# I set the relative weights to favor getting all False entries
list_of_lists = [
    random.choices([True, False], k=n_elements, weights=[1, 10])
    for i in range(n_lists)
]
result_list_of_all_false_indexes = [i for i, vals in enumerate(zip(*list_of_lists)) if not any(vals)]
result_list_of_all_false_indexes = []
for i in range(len(list_of_lists[0])):
    if not any(lst[i] for lst in list_of_lists):
        result_list_of_all_false_indexes.append(i)
EDIT: added explanation
Iterate over each possible index, and then check if each list at that index is False. If so, add the index to your results list.
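To make the column-wise check concrete, here is a small sketch using the three lists from the question. zip(*list_of_lists) yields one tuple per position, and not any(column) is true only when every value in that tuple is falsy. The variable names columns and all_false_indexes are introduced here for illustration.

```python
list1 = [True, True, True, False, False, False]
list2 = [True, True, False, False, False, False]
list3 = [True, True, True, False, False, True]
list_of_lists = [list1, list2, list3]

# One tuple per position, e.g. position 2 gives (True, False, True).
columns = list(zip(*list_of_lists))

# Keep the positions where no list holds a truthy value.
all_false_indexes = [i for i, column in enumerate(columns) if not any(column)]
print(all_false_indexes)  # [3, 4]
```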
Thanks to @Cleb in the comments, I was able to make this work:
import numpy as np
list1 = [True, True, True, False, False, False]
list2 = [True, True, False, False, False, False]
list3 = [True, True, True, False, False, True]
list_of_lists = [list1, list2, list3]
result = np.where(np.any(list_of_lists, axis=0) == False)[0]
print(result)
assert result == [3,4], "Try again..."
Rather than using all (which checks that ALL values along a given axis are True), I used any together with == False to accomplish what I was after.
The assertion fails because result is now a NumPy array rather than a plain list (comparing result.tolist() with [3, 4] would pass), but that's fine with me. It is a little more concise than using iteration, but either will work. Thanks all!

How to compare one 2d array with 1d array to check for elements?

So I have a matrix of 3xn. Something like
A=[[1,2,3],
[6,2,5],
[8,1,7],
[2,9,8],
[1,9,3],
[1,4,3]]
and another list of B= [1,2,5,6,8,9]
So, if every element from A[i] is in list B then I have to delete the row. Eg. row 2,4 will need to be removed.
I wrote something like.
copy=[]
for i in A:
    for j in B:
        if int(j) in i:
            for k in B[B.index(j):]:
                if int(k) in i:
                    for l in B[B.index(k):]:
                        if int(l) in i:
                            copy.append(i)
This keeps returning recurring values. It also removes more than what I have already written. What am I doing wrong?
I also tried
for i in A:
    copy=[x for x in i if x not in B]
    copy=np.array(copy)
    final.append(copy)
But it doesn't work.
Note: I am using numpy array for A and list for B so I always need to convert between them when I am doing comparing.
It is quite straightforward with numpy arrays, use isin to identify values present in B, then aggregate as a single boolean with all to get rows where all values are present, invert the mask with ~ and slice:
import numpy as np
A = np.array([[1,2,3],
              [6,2,5],
              [8,1,7],
              [2,9,8],
              [1,9,3],
              [1,4,3]])
B = np.array([1,2,5,6,8,9])
# or as list
B = [1,2,5,6,8,9]
A2 = A[~np.isin(A, B).all(1)]
output:
array([[1, 2, 3],
[8, 1, 7],
[1, 9, 3],
[1, 4, 3]])
intermediates:
np.isin(A, B)
array([[ True, True, False],
[ True, True, True],
[ True, True, False],
[ True, True, True],
[ True, True, False],
[ True, False, False]])
np.isin(A, B).all(1)
array([False, True, False, True, False, False])
~np.isin(A, B).all(1)
array([ True, False, True, False, True, True])
Loop through each sublist inside list "A". For each item in that sublist, we check if it is present in list "B". If it is, we increment count by 1.
When count reaches the length of our sublist, every item was found in "B", so we remove that sublist with the .remove() method, which takes the item to remove (here, the sublist itself). The loop runs over a copy of "A" (A[:]) so that removing rows does not make the iteration skip elements. This assumes "A" is a plain list of lists, not a NumPy array.
for lst in A[:]:  # iterate over a copy because A is modified inside the loop
    count = 0
    for subList_itm in lst:
        if subList_itm in B:
            count = count + 1
    if count == len(lst):
        A.remove(lst)
print(A)
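A general caveat with this pattern: calling remove on a list while a for loop is iterating over that same list can skip the element right after each removal. A minimal sketch of an alternative that builds a new list instead of modifying A in place (the sample rows and the name kept are made up for illustration):

```python
A = [[1, 2, 5], [2, 5, 6], [1, 4, 3]]
B = [1, 2, 5, 6, 8, 9]

# Keep a row only if at least one of its items is missing from B.
kept = [row for row in A if not all(item in B for item in row)]
print(kept)  # [[1, 4, 3]]
```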
Concept
Iterate through all of the elements inside A, and check if each array is a subset of B. If it is not then put the array into a result array.
Code
A=[[1,2,3],[6,2,5],[8,1,7],[2,9,8],[1,9,3],[1,4,3]]
B=[1,2,5,6,8,9]
set_B=set(B)
result=[]
for arr in A:
    if not set(arr).issubset(set_B):
        result.append(arr)
print(result)

I don't understand how they are initializing current solution. Could someone explain what this does?

I don't understand how a boolean can be multiplied by a length. I'm fairly new to coding.
def __init__(self, capacity, items):
    self.currentSolution = [False]*len(items)
The notation [value] * number builds a list containing value at each index, with a length of number
Example
[False]*2 => [False, False]
[False]*10 => [False, False, False, False, False, False, False, False, False, False]
When you multiply a list by N, it actually creates a new list made of N copies of the original list, concatenated.
Let me give you an example. When we use the following command:
[1, 2, 3] * 2
We'll get the following list:
[1, 2, 3, 1, 2, 3]
So performing [False]*len(items) will create a list of length len(items) in which every element is False.
Another way to do the same thing could be:
[False for _ in range(len(items))]
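A short sketch of how the repeated list behaves in practice. One caveat worth knowing, which goes slightly beyond the question: the * operator repeats references, so it is safe for immutable values such as False, but repeating a nested list produces several references to the same inner list. All variable names below are illustrative.

```python
items = ["a", "b", "c"]                 # stand-in for the real items
current_solution = [False] * len(items)
current_solution[1] = True              # only one slot changes
print(current_solution)                 # [False, True, False]

shared = [[False] * 2] * 3              # three references to ONE inner list
shared[0][0] = True
print(shared)                           # [[True, False], [True, False], [True, False]]

independent = [[False] * 2 for _ in range(3)]  # three separate inner lists
independent[0][0] = True
print(independent)                      # [[True, False], [False, False], [False, False]]
```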

How can I check that a Python list contains only True and then only False using one or two lines?

I would like to only allow lists where the first contiguous group of elements are True and then all of the remaining elements are False. I want lists like these examples to return True:
[True]
[False]
[True, False]
[True, False, False]
[True, True, True, False]
And lists like these to return False:
[False, True]
[True, False, True]
I am currently using this function, but I feel like there is probably a better way of doing this:
def my_function(x):
    n_trues = sum(x)
    should_be_true = x[:n_trues] # get the first n items
    should_be_false = x[n_trues:len(x)] # get the remaining items
    # return True only if all of the first n elements are True and the remaining
    # elements are all False
    return all(should_be_true) and all([not element for element in should_be_false])
Testing:
test_cases = [[True], [False],
[True, False],
[True, False, False],
[True, True, True, False],
[False, True],
[True, False, True]]
print([my_function(test_case) for test_case in test_cases])
# expected output: [True, True, True, True, True, False, False]
Is it possible to use a comprehension instead to make this a one/two line function? I know I could not define the two temporary lists and instead put their definitions in place of their names on the return line, but I think that would be too messy.
Method 1
You could use itertools.groupby. This would avoid doing multiple passes over the list and would also avoid creating the temp lists in the first place:
from itertools import groupby
def check(x):
    status = list(k for k, g in groupby(x))
    return len(status) <= 2 and (status[0] is True or status[-1] is False)
This assumes that your input is non-empty and already all boolean. If that's not always the case, adjust accordingly:
def check(x):
    status = list(k for k, g in groupby(map(bool, x)))
    return status and len(status) <= 2 and (status[0] or not status[-1])
If you want to have empty arrays evaluate to True, either special case it, or complicate the last line a bit more:
return not status or (len(status) <= 2 and (status[0] or not status[-1]))
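To see what status contains, here is a small sketch (run_keys is a hypothetical helper, not part of the answer). groupby collapses each run of equal consecutive values into one key, so a valid list can only produce [], [True], [False], or [True, False].

```python
from itertools import groupby

def run_keys(values):
    """Collapse consecutive equal truth values into a list of keys."""
    return [key for key, _run in groupby(map(bool, values))]

print(run_keys([True, True, True, False]))  # [True, False]
print(run_keys([True, False, True]))        # [True, False, True]
print(run_keys([False, True]))              # [False, True]
```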
Method 2
You can also do this in one pass using an iterator directly. This relies on the fact that any and all are guaranteed to short-circuit:
def check(x):
    iterator = iter(x)
    # process the true elements
    all(iterator)
    # check that there are no true elements left
    return not any(iterator)
Personally, I think method 1 is total overkill. Method 2 is much nicer and simpler, and achieves the same goals faster. It also stops immediately if the test fails, rather than having to process the whole group. It also doesn't allocate any temporary lists at all, even for the group aggregation. Finally, it handles empty and non-boolean inputs out of the box.
Since I'm writing on mobile, here's an IDEOne link for verification: https://ideone.com/4MAYYa
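The reason Method 2 works is that all consumes the iterator up to and including the first falsy element, so any only sees what comes after it. A small sketch of that behaviour (the sample list and the names it and leftover are illustrative):

```python
it = iter([True, True, False, True, False])

# all() stops at the first falsy item, which it has already consumed.
print(all(it))        # False

# Only the items after that first False remain in the iterator.
leftover = list(it)
print(leftover)       # [True, False]

def check(x):
    iterator = iter(x)
    all(iterator)              # skip the leading run of truthy items
    return not any(iterator)   # nothing truthy may follow

print(check([True, True, False]), check([True, False, True]))  # True False
```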
