Efficient way of counting True and False - python

This may be a trivial problem, but I want to learn more about other more clever and efficient ways of solving it.
I have a list of items and each item has a property a whose value is binary.
If every item in the list has a == 0, then I set a separate variable b = 0.
If every item in the list has a == 1, then I set b = 1.
If there is a mixture of a == 0 and a == 1 in the list, then I set b = 2.
I can use a set to keep track of the types of a value, such that if there are two items in the set after iterating through the list, then I can set b = 2, whereas if there is only one item in the set I just retrieve the item (either 0 or 1) and use it to set b.
Any better way?

One pass through the list, and no extra data structures constructed:
def zot(bs):
    n, s = len(bs), sum(bs)
    return 1 if n == s else 2 if s else 0

I would suggest using any and all. I would say that the benefit of this is readability rather than cleverness or efficiency. For example:
>>> vals0 = [0, 0, 0, 0, 0]
>>> vals1 = [1, 1, 1, 1, 1]
>>> vals2 = [0, 1, 0, 1, 0]
>>> def category(vals):
...     if all(vals):
...         return 1
...     elif any(vals):
...         return 2
...     else:
...         return 0
...
>>> category(vals0)
0
>>> category(vals1)
1
>>> category(vals2)
2
This can be shortened a bit if you like:
>>> def category(vals):
...     return 1 if all(vals) else 2 if any(vals) else 0
...
This works with anything that can be interpreted by __nonzero__ (or __bool__ in Python 3) as having a true or false value.

Somebody mentioned code golf, so I can't resist a variation on @senderle's:
[0,2,1][all(vals) + any(vals)]
Short explanation: This uses the boolean values as their integer equivalents to index a list of desired responses. If all is true then any must also be true, so their sum is 2. any by itself gives 1 and no matches gives 0. These indices return the corresponding values from the list.
If the original requirements could be modified to use 1 for any and 2 for all, it would be even simpler to just return the integer any(vals) + all(vals).
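As a small sketch of the indexing trick described above (the function name category_by_index is only an illustrative choice, and a non-empty list is assumed):

```python
def category_by_index(vals):
    # all(vals) and any(vals) are bools, which behave as the integers 0 and 1,
    # so their sum is 0 (no ones), 1 (mixed) or 2 (all ones) for a non-empty list.
    return [0, 2, 1][all(vals) + any(vals)]


print(category_by_index([0, 0, 0]))  # 0
print(category_by_index([1, 1, 1]))  # 1
print(category_by_index([0, 1, 0]))  # 2
```

For an empty list all() is True and any() is False, so this sketch returns 2; handle that case separately if it matters.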

Using a dictionary:
zonk_values = {frozenset([0]): 0, frozenset([1]): 1, frozenset([0, 1]): 2}
def zonk(a):
    return zonk_values[frozenset(a)]
This also only needs a single pass through the list.

You could also use sets.
s = set([i.a for i in your_list])
if len(s) == 1:
    b = s.pop()
else:
    b = 2

def zot(bs):
    return len(set(bs)) if sum(bs) else 0

You can define two boolean variables, hasZero and hasOne, and set each to True when the corresponding value is encountered while iterating over the list. Then b = 2 if both hasZero and hasOne are set, b = 1 if only hasOne is set, and b = 0 if only hasZero is set.
Another way: you can sum all the a values along the list. If sumA == len(list) then b = 1, if sumA == 0 then b = 0 and if 0 < sumA < len(list) then b = 2.
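The flag-based idea above could be sketched roughly as follows. The function name classify and the early return are illustrative choices, not part of the original answer; an empty list is treated as "all zeros" here.

```python
def classify(values):
    """Return 0 if all values are 0, 1 if all are 1, and 2 for a mixture."""
    has_zero = has_one = False
    for value in values:
        if value:
            has_one = True
        else:
            has_zero = True
        if has_zero and has_one:
            return 2  # mixture found, no need to look further
    return 1 if has_one else 0


print(classify([0, 0, 0]))  # 0
print(classify([1, 1, 1]))  # 1
print(classify([0, 1, 0]))  # 2
```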

Short-circuiting solution: all and any each stop at the first element that decides the result, so this is likely among the most efficient ways to do it in pure Python.
EDIT: Included any and all as per suggestion in comments.
EDIT2: It's now a one-liner.
b = 1 if all(A) else 2 if any(A) else 0

This is similar to senderle's suggestion, but written to access the objects' a properties.
from random import randint
class Item(object):
    def __init__(self, a):
        self.a = a
all_zeros = [Item(0) for _ in range(10)]
all_ones = [Item(1) for _ in range(10)]
mixture = [Item(randint(0, 1)) for _ in range(10)]
def check(items):
    if all(item.a for item in items):
        return 1
    if any(item.a for item in items):
        return 2
    else:
        return 0
print('check(all_zeros):', check(all_zeros))
print('check(all_ones):', check(all_ones))
print('check(mixture):', check(mixture))

You can use list iterators:
>>> L = [0, 0, 0, 0, 0]
>>> L1 = [1, 1, 1, 1, 1]
>>> L2 = [0, 1, 0, 1, 0]
>>> def fn(i):
...     i = iter(i)
...     if all(i): return 1
...     return 2 if any(i) else 0
...
>>> fn(L)
0
>>> fn(L1)
1
>>> fn(L2)
2
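One caveat worth checking, shown here as a hedged sketch: because all() and any() share a single iterator, all() consumes the first falsy element before any() runs. When the only zeros come after the ones, the remaining iterator holds nothing truthy, and the function reports 0 instead of 2. The name fn_fixed is illustrative and not part of the original answer.

```python
def fn(i):
    i = iter(i)
    if all(i):
        return 1
    return 2 if any(i) else 0


def fn_fixed(values):
    # Track whether a truthy element was seen before the first falsy one.
    it = iter(values)
    seen_true = False
    for v in it:
        if not v:
            # First falsy element: mixed if anything truthy came before or comes after.
            return 2 if seen_true or any(it) else 0
        seen_true = True
    return 1  # no falsy element found (also the result for an empty input)


print(fn([1, 0]))        # 0, although the list is a mixture
print(fn_fixed([1, 0]))  # 2
```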

Related

Swapping variables vs swapping elements of array using indices [duplicate]

This question already has answers here:
Python assignment quirk w/ list index assign, dict index assign, and dict.get
(1 answer)
Tuple unpacking order changes values assigned
(4 answers)
Closed 2 years ago.
If I were to initialize two variables and swap them 'pythonically', elements get swapped.
a, b = 1, 0
a, b = b, a
>>> a
0
>>> b
1
But if I try to use the same pythonic syntax for following code, array remains unchanged.
>>> arr = [1, 0]
>>> arr[0], arr[arr[0]] = arr[arr[0]], arr[0]
>>> arr
[1, 0]
Why does this not swap the elements of the array?
I went through this question but it is still not clear to me how LOAD_FAST & STORE_FAST work for the second example (swapping elements of an array using indices).
Your problem becomes even more acute if you consider the simplified assignment:
arr[0], arr[arr[0]] = 0, 1
There are actually two assignment statements here, executed sequentially:
arr[0] = 0 # arr: [0,0]
arr[arr[0]] = 1 # arr: [1,0], because arr[0] was 0
The first item of arr is changed twice, but the second is not changed at all.
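One way around this, sketched under the assumption that the goal is to swap arr[0] with the element it points to, is to pin down both indices before assigning anything. The names i, j and broken are illustrative.

```python
broken = [1, 0]
# The second target index, broken[broken[0]], is evaluated after broken[0] was overwritten.
broken[0], broken[broken[0]] = broken[broken[0]], broken[0]
print(broken)             # [1, 0]

arr = [1, 0]
i, j = 0, arr[0]          # fix both indices before anything is assigned
arr[i], arr[j] = arr[j], arr[i]
print(arr)                # [0, 1]
```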
To see when __getitem__ and __setitem__ are called, try this class:
from collections.abc import MutableSequence, Iterable
class PrintingMutable(MutableSequence):
    def __init__(self, source: (MutableSequence, Iterable) = None):
        try:
            self.arr = list(source)
        except TypeError:
            self.arr = []
    def __repr__(self):
        return repr(self.arr)
    def insert(self, index: int, o) -> None:
        self.arr.insert(index, o) # fail-fast
        self.on_insert(index, o)
    def __getitem__(self, i: (int, slice)):
        val = self.arr[i] # creating unnecessary name to fail-fast.
        if isinstance(i, slice):
            # slice.indices() fills in missing start/stop/step values
            for idx in range(*i.indices(len(self.arr))):
                self.on_get(idx)
        else:
            self.on_get(i)
        return val
    def __setitem__(self, i: int, o) -> None:
        self.arr[i] = o
        self.on_set(i)
    def __delitem__(self, i: int) -> None: # I don't think I'm gonna use it.
        item = self.arr[i] # remember the item before it is removed
        del self.arr[i]
        self.on_del(i, item)
    def __len__(self) -> int:
        return len(self.arr)
    def on_insert(self, idx, item):
        print(f"insert item {item} at {idx}")
        self.print_statistics()
    def on_get(self, idx):
        print(f"get item {self.arr[idx]} at {idx}")
        self.print_statistics()
    def on_set(self, idx):
        print(f"set item {self.arr[idx]} at {idx}")
        self.print_statistics()
    def on_del(self, idx, item):
        print(f"del item {item} at {idx}")
        self.print_statistics()
    def print_statistics(self):
        print(self)
Then play around in the Python console: you'll see the order in which Python evaluates the statement from your question.
Added comments to clarify steps.
a[0], a[1] = a[1], a[0]
get item 0 at 1
[1, 0]
get item 1 at 0
[1, 0]
set item 0 at 0
[0, 0]
set item 1 at 1
[0, 1]
Normally you'd expect __getitem__ and __setitem__ to be called in a matching order. In this case the two right-hand reads happen first, left to right, followed by the two assignments, left to right.
For your code:
>>> from _63866011 import PrintingMutable
>>> a = PrintingMutable([1, 0])
>>> a[0], a[a[0]] = a[a[0]], a[0]
get item 1 at 0 # get a[0] = 1
[1, 0]
get item 0 at 1 # get a[a[0]] = 0
[1, 0]
get item 1 at 0
[1, 0]
set item 0 at 0
[0, 0]
get item 0 at 0
[0, 0]
set item 1 at 0
[1, 0]
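The order is the same as above: the right-hand side is read first, then the two targets are assigned left to right.
However, the second target, a[a[0]], is evaluated only after a[0] has already been overwritten, so the index it uses has changed.
Keep playing with this until you get a feel for how assignment works on a MutableSequence.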

How to ignore False, so it doesn't count as 0

Is there a way to make 0 not count as equal to False?
I keep getting the wrong result and haven't found a solution yet.
def move_zeros(array):
    a = 0
    for i in array:
        if array[a] == 0:
            array.remove(array[a])
            array.append(0)
            a +=1
        else:
            a += 1
    return array
print(move_zeros([0,1,None,2,False,1,0]))
Input array:
[0,1,None,2,False,1,0]
Expected output:
[1,None,2,False,1,0,0]
Actual output:
[1, None, 2, 1, 0, 0, 0]
The bool type inherits from int, so False == 0 is true, but the two values have different types. This means you can distinguish 0 from False with a type check: False is a bool (and therefore also an int), but 0 is not a bool.
Also, .remove removes the first element equal to the given value, so even with this check the code wouldn't work: it removes the first element equal to 0 regardless of the value of a, so it can also grab False. Let's use del instead:
def move_zeros(array):
    a = 0
    for i in array:
        if array[a] == 0 and not isinstance(array[a], bool):
            del array[a]
            array.append(0)
            # no a += 1 here: after a removal the next element moves into this index, so check the same index again
        else:
            a += 1
    return array
print(move_zeros([0,1,None,2,False,1,0]))
Output:
[1, None, 2, False, 1, 0, 0]
Iterating on a list while updating it often leads to problems, so you should avoid it.
Just build a list of non-zero values in a list comprehension and append the needed zeros:
def move_zeros(array):
    out = [v for v in array if v != 0 or v is False]
    out.extend([0] * (len(array) - len(out)))
    return out
Note that 0 and False are equal, so 0 == False, but they are different objects, so 0 is not False.
Test with your data:
print(move_zeros([0,1,None,2,False,1,0]))
# [1, None, 2, False, 1, 0, 0]
Another test case that will fail with some solutions iterating on the list while updating it:
move_zeros([0,1,None, 0, 0, 2,False,1,0])
# [1, None, 2, False, 1, 0, 0, 0, 0]
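The difference between equality and type noted above can be checked directly. This is a minimal sketch; is_real_zero is a hypothetical helper name introduced only for illustration.

```python
print(False == 0)               # True: equal values
print(isinstance(False, int))   # True: bool is a subclass of int
print(isinstance(0, bool))      # False: a plain int is not a bool
print(type(False) is bool)      # True


def is_real_zero(v):
    # Hypothetical helper: true only for numeric zeros that are not booleans.
    return v == 0 and not isinstance(v, bool)


data = [0, 1, None, 2, False, 1, 0]
print([v for v in data if not is_real_zero(v)])  # [1, None, 2, False, 1]
```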
You can use the additional clause is not False to distinguish between 0 and False.
def move_zeros(array):
    zeroLocations = [i for i in range(len(array)) if array[i] == 0 and array[i] is not False]
    zeroLocations.reverse()
    for zeroId in zeroLocations:
        del array[zeroId]
        array.append(0)
    return array
You could check the type of the element you're working with, so that an element is removed (and a 0 appended to the list) only if it equals 0 and is not a bool. Iterating over the indices in reverse keeps earlier removals from shifting elements that have not been checked yet.
def move_zeros(items):
    for index in range(len(items) - 1, -1, -1):
        element = items[index]
        if element == 0 and not isinstance(element, bool):
            items.pop(index)
            items.append(0)
    return items
print(move_zeros([0,1,None,2,False,1,0]))

Storing every value of a changing variable

I am writing a piece of code that takes an input that varies according to discrete time steps. For each time step, I get a new value for the input.
How can I store each value as a list?
Here's an example:
"""when t = 0, d = a
when t = 1, d = b
when t = 2, d = c"""
n = []
n.append(d) #d is the changing variable
for i in range(t):
    n.append(d)
What I expect to get is:
for t = 0, n = [a]; for t = 1, n = [a,b]; and for t = 2, n = [a,b,c]
What I actually get is:
for t = 0, n = [a], for t = 1, n = [b,b]; and for t = 2, n = [c,c,c]
See comment below, but based on the additional info you've provided, replace this:
n.append(d)
with this:
n.append(d[:])
What type is the variable 'd'? If it is, for instance, a list, the code you are showing pushes onto the list 'n' a reference to 'd' rather than a copy of it. Thus, on each iteration of the loop you add another reference to the same 'd' object (like a pointer in C) to 'n', and when 'd' is updated in place all the entries in 'n' show the same value.
To fix it, you can modify the code to append a copy of 'd', using any one of:
n.append(d[:])
n.append(list(d))
n.append(tuple(d))
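A minimal sketch of the difference, assuming d is a list that is modified in place at each time step (the names by_reference and by_copy are illustrative):

```python
d = [0]
by_reference, by_copy = [], []
for t in range(3):
    d[0] = t                   # d is modified in place at every time step
    by_reference.append(d)     # stores the same list object each time
    by_copy.append(list(d))    # stores a snapshot of the current contents

print(by_reference)  # [[2], [2], [2]]
print(by_copy)       # [[0], [1], [2]]
```

If d itself contains nested mutable objects, a shallow copy shares them; copy.deepcopy(d) would be the safer choice in that case.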
You can simply do this:
n = []
for i in range(t + 1):
    n.append(chr(i + ord('a')))
If you do not want to store characters in the list, but rather specific values related to d, then you have to update d inside the for loop:
n = []
d = 1
for i in range(t + 1):
    n.append(d)
    d += 2
It is difficult to say without seeing the code, but this can happen if d is not an int. If d is a list, for instance, appending it stores a reference to the same object:
n = []
d = [1]
n.append(d)
d[0] = 2
n.append(d)
print(n)
# [[2], [2]]
So if each new d is just the same object modified in place, your problem arises. You can solve it by copying d:
from copy import copy
n = []
d = [1]
n.append(copy(d))
d[0] = 2
n.append(copy(d))
print(n)
# [[1], [2]]
If you wrap the variable inside an object, you can record every value assigned to it by overriding the __setattr__ method. A simple example:
class DummyClass(object):
    def __init__(self, x):
        self.history_of_x = []
        self.x = x
        self._locked = True
    def __setattr__(self, name, value):
        self.__dict__[name] = value
        if name == "x":
            self.history_of_x.append(value)
d = DummyClass(4)
d.x=0
d.x=2
d.x=3
d.x=45
print(d.history_of_x)
Output:
[4, 0, 2, 3, 45]

What would prevent filter from returning 0?

When I try to filter [1,2,0,3,8] with if x < 3: return x I end up with [1,2]. Why is the 0 not included in this list?
def TestFilter(x):
    if x < 3:
        return x
a = [1,2,0,3,8]
b = filter(TestFilter, a)
print b
filter() keeps each element for which your function returns a truthy value. Python treats 0 as false and any other number as true, so returning x itself drops 0 (and when x >= 3 the function implicitly returns None, which is also false). Therefore you will want the function to return True instead of the number.
def TestFilter(x):
    if x < 3:
        return True
EDIT: Here's a lambda example:
a = [1, 2, 3, 0, 4, 8]
print filter(lambda x: x < 3, a)
When filtering you want to be returning True or False. Here's what you want:
def TestFilter(x):
    return x < 3
When you filter using this, you'll get the results you're looking for.
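For comparison, here is a short sketch in Python 3 syntax (the thread's own code is Python 2); the function names returns_value and returns_bool are illustrative.

```python
a = [1, 2, 0, 3, 8]


def returns_value(x):
    if x < 3:
        return x      # 0 is falsy, so filter() drops it


def returns_bool(x):
    return x < 3      # always a real True/False


# In Python 3 filter() returns a lazy iterator, so wrap it in list() to see the result.
print(list(filter(returns_value, a)))  # [1, 2]
print(list(filter(returns_bool, a)))   # [1, 2, 0]
```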

How to test if a list contains another list as a contiguous subsequence?

How can I test if a list contains another list (i.e. as a contiguous subsequence)? Say there was a function called contains:
contains([1,2], [-1, 0, 1, 2]) # Returns [2, 3] (contains returns [start, end])
contains([1,3], [-1, 0, 1, 2]) # Returns False
contains([1, 2], [[1, 2], 3]) # Returns False
contains([[1, 2]], [[1, 2], 3]) # Returns [0, 0]
Edit:
contains([2, 1], [-1, 0, 1, 2]) # Returns False
contains([-1, 1, 2], [-1, 0, 1, 2]) # Returns False
contains([0, 1, 2], [-1, 0, 1, 2]) # Returns [1, 3]
If all items are unique, you can use sets.
>>> items = set([-1, 0, 1, 2])
>>> set([1, 2]).issubset(items)
True
>>> set([1, 3]).issubset(items)
False
There are all() and any() functions to do this.
To check if big contains ALL elements in small:
result = all(elem in big for elem in small)
To check if small contains ANY elements in big:
result = any(elem in big for elem in small)
The variable result will be a boolean (True/False). Note that these test membership only; they do not check order or contiguity.
Here is my version:
def contains(small, big):
    for i in xrange(len(big)-len(small)+1):
        for j in xrange(len(small)):
            if big[i+j] != small[j]:
                break
        else:
            return i, i+len(small)
    return False
It returns a tuple of (start, end+1) since I think that is more pythonic, as Andrew Jaffe points out in his comment. It does not slice any sublists so should be reasonably efficient.
One point of interest for newbies is that it uses the else clause on the for statement - this is not something I use very often but can be invaluable in situations like this.
This is identical to finding substrings in a string, so for large lists it may be more efficient to implement something like the Boyer-Moore algorithm.
Note: If you are using Python3, change xrange to range.
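For reference, the same approach in Python 3 syntax, with comments on the for/else clause. This is a sketch of the answer's idea, not a different algorithm.

```python
def contains(small, big):
    """Return (start, end + 1) of the first match of small in big, else False."""
    for i in range(len(big) - len(small) + 1):
        for j in range(len(small)):
            if big[i + j] != small[j]:
                break          # mismatch: try the next start position
        else:
            # The inner loop finished without break, so every element matched.
            return i, i + len(small)
    return False


print(contains([1, 2], [-1, 0, 1, 2]))   # (2, 4)
print(contains([1, 3], [-1, 0, 1, 2]))   # False
```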
May I humbly suggest the Rabin-Karp algorithm if the big list is really big. The link even contains almost-usable code in almost-Python.
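A rough sketch of how a Rabin-Karp style search could look for Python lists. The function name rabin_karp_find, the base and the modulus are illustrative assumptions, and the elements must be hashable (so nested lists would not work with this version).

```python
def rabin_karp_find(small, big, base=257, mod=(1 << 61) - 1):
    """Return the start index of small in big, or -1 if it does not occur."""
    m, n = len(small), len(big)
    if m == 0:
        return 0
    if m > n:
        return -1
    high = pow(base, m - 1, mod)  # weight of the element leaving the window
    target = window = 0
    for k in range(m):
        target = (target * base + hash(small[k])) % mod
        window = (window * base + hash(big[k])) % mod
    for start in range(n - m + 1):
        # Compare the slices only when the hashes agree, to rule out collisions.
        if window == target and big[start:start + m] == small:
            return start
        if start + m < n:
            # Roll the window: drop big[start], append big[start + m].
            window = ((window - hash(big[start]) * high) * base
                      + hash(big[start + m])) % mod
    return -1


print(rabin_karp_find([1, 2], [-1, 0, 1, 2]))  # 2
print(rabin_karp_find([1, 3], [-1, 0, 1, 2]))  # -1
```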
This works and is fairly fast since it does the linear searching using the builtin list.index() method and == operator:
def contains(sub, pri):
    M, N = len(pri), len(sub)
    i, LAST = 0, M-N+1
    while True:
        try:
            found = pri.index(sub[0], i, LAST) # find first elem in sub
        except ValueError:
            return False
        if pri[found:found+N] == sub:
            return [found, found+N-1]
        else:
            i = found+1
If we refine the problem to testing whether a list contains another list as a contiguous sequence, the answer could be the following one-liner:
def contains(subseq, inseq):
    return any(inseq[pos:pos + len(subseq)] == subseq for pos in range(0, len(inseq) - len(subseq) + 1))
Here are the unit tests I used to tune this one-liner:
https://gist.github.com/anonymous/6910a85b4978daee137f
After OP's edit:
def contains(small, big):
    for i in xrange(1 + len(big) - len(small)):
        if small == big[i:i+len(small)]:
            return i, i + len(small) - 1
    return False
I've summarized and evaluated the time taken by different techniques.
The methods used are:
def containsUsingStr(sequence, element:list):
    return str(element)[1:-1] in str(sequence)[1:-1]
def containsUsingIndexing(sequence, element:list):
    lS, lE = len(sequence), len(element)
    for i in range(lS - lE + 1):
        for j in range(lE):
            if sequence[i+j] != element[j]: break
        else: return True
    return False
def containsUsingSlicing(sequence, element:list):
    lS, lE = len(sequence), len(element)
    for i in range(lS - lE + 1):
        if sequence[i : i+lE] == element: return True
    return False
def containsUsingAny(sequence:list, element:list):
    lE = len(element)
    return any(element == sequence[i:i+lE] for i in range(len(sequence)-lE+1))
Code for time analysis (averaging over 1000 iterations):
from time import perf_counter
functions = (containsUsingStr, containsUsingIndexing, containsUsingSlicing, containsUsingAny)
fCount = len(functions)
for func in functions:
    print(str.ljust(f'Function : {func.__name__}', 32), end=' :: Return Values: ')
    print(func([1,2,3,4,5,5], [3,4,5,5]) , end=', ')
    print(func([1,2,3,4,5,5], [1,3,4,5]))
avg_times = [0]*fCount
for _ in range(1000):
    perf_times = []
    for func in functions:
        startTime = perf_counter()
        func([1,2,3,4,5,5], [3,4,5,5])
        timeTaken = perf_counter()-startTime
        perf_times.append(timeTaken)
    for t in range(fCount): avg_times[t] += perf_times[t]
minTime = min(avg_times)
print("\n\n Ratio of Time of Executions : ", ' : '.join(map(lambda x: str(round(x/minTime, 4)), avg_times)))
Output:
Conclusion: In this case, the slicing approach proves to be the fastest.
Here's a straightforward algorithm that uses list methods:
#!/usr/bin/env python
def list_find(what, where):
    """Find `what` list in the `where` list.
    Return index in `where` where `what` starts
    or -1 if no such index.
    >>> f = list_find
    >>> f([2, 1], [-1, 0, 1, 2])
    -1
    >>> f([-1, 1, 2], [-1, 0, 1, 2])
    -1
    >>> f([0, 1, 2], [-1, 0, 1, 2])
    1
    >>> f([1,2], [-1, 0, 1, 2])
    2
    >>> f([1,3], [-1, 0, 1, 2])
    -1
    >>> f([1, 2], [[1, 2], 3])
    -1
    >>> f([[1, 2]], [[1, 2], 3])
    0
    """
    if not what: # empty list is always found
        return 0
    try:
        index = 0
        while True:
            index = where.index(what[0], index)
            if where[index:index+len(what)] == what:
                return index # found
            index += 1 # try next position
    except ValueError:
        return -1 # not found
def contains(what, where):
    """Return [start, end+1] if found else empty list."""
    i = list_find(what, where)
    return [i, i + len(what)] if i >= 0 else [] #NOTE: bool([]) == False
if __name__=="__main__":
    import doctest; doctest.testmod()
Smallest code (here a is the larger list):
def contains(a,b):
    return str(a)[1:-1].find(str(b)[1:-1])>=0
Here is my answer. This function tells you whether B is a contiguous sub-list of A. Each slice comparison costs up to len(B) steps, so the worst-case time complexity is O(len(A) * len(B)).
def does_A_contain_B(A, B): # remember, A is the larger list
    b_size = len(B)
    for a_index in range(0, len(A)):
        if A[a_index : a_index+b_size]==B:
            return True
    return False
a=[[1,2] , [3,4] , [0,5,4]]
print(a.__contains__([0,5,4]))
It provides true output.
a=[[1,2] , [3,4] , [0,5,4]]
print(a.__contains__([1,3]))
It provides false output.
I tried to make this as efficient as possible.
It uses a generator; those unfamiliar with these beasts are advised to check out their documentation and that of yield expressions.
Basically it creates a generator of values from the subsequence that can be reset by sending it a true value. If the generator is reset, it starts yielding again from the beginning of sub.
Then it just compares successive values of sequence with the generator yields, resetting the generator if they don't match.
When the generator runs out of values, i.e. reaches the end of sub without being reset, that means that we've found our match.
Since it works for any sequence, you can even use it on strings, in which case it behaves similarly to str.find, except that it returns False instead of -1.
As a further note: I think that the second value of the returned tuple should, in keeping with Python standards, normally be one higher. i.e. "string"[0:2] == "st". But the spec says otherwise, so that's how this works.
It depends on if this is meant to be a general-purpose routine or if it's implementing some specific goal; in the latter case it might be better to implement a general-purpose routine and then wrap it in a function which twiddles the return value to suit the spec.
def reiterator(sub):
    """Yield elements of a sequence, resetting if sent ``True``."""
    it = iter(sub)
    while True:
        if (yield it.next()):
            it = iter(sub)
def find_in_sequence(sub, sequence):
    """Find a subsequence in a sequence.
    >>> find_in_sequence([2, 1], [-1, 0, 1, 2])
    False
    >>> find_in_sequence([-1, 1, 2], [-1, 0, 1, 2])
    False
    >>> find_in_sequence([0, 1, 2], [-1, 0, 1, 2])
    (1, 3)
    >>> find_in_sequence("subsequence",
    ...                  "This sequence contains a subsequence.")
    (25, 35)
    >>> find_in_sequence("subsequence", "This one doesn't.")
    False
    """
    start = None
    sub_items = reiterator(sub)
    sub_item = sub_items.next()
    for index, item in enumerate(sequence):
        if item == sub_item:
            if start is None: start = index
        else:
            start = None
        try:
            sub_item = sub_items.send(start is None)
        except StopIteration:
            # If the subsequence is depleted, we win!
            return (start, index)
    return False
I think this one is fast...
def issublist(subList, myList, start=0):
    if not subList: return 0
    lenList, lensubList = len(myList), len(subList)
    try:
        while lenList - start >= lensubList:
            start = myList.index(subList[0], start)
            for i in xrange(lensubList):
                if myList[start+i] != subList[i]:
                    break
            else:
                return start, start + lensubList - 1
            start += 1
        return False
    except:
        return False
Dave's answer is good, but I suggest this implementation, which avoids nested loops.
def contains(small_list, big_list):
    """
    Returns index of start of small_list in big_list if big_list
    contains small_list, otherwise -1.
    """
    loop = True
    i, curr_id_small= 0, 0
    while loop and i<len(big_list):
        if big_list[i]==small_list[curr_id_small]:
            if curr_id_small==len(small_list)-1:
                loop = False
            else:
                curr_id_small += 1
        else:
            curr_id_small = 0
        i=i+1
    if not loop:
        return i-len(small_list)
    else:
        return -1
Here's a simple and efficient function to check whether big list contains a small one in matching order:
def contains(big, small):
    i = 0
    for value in big:
        if value == small[i]:
            i += 1
            if i == len(small):
                return True
        else:
            i = 1 if value == small[0] else 0
    return False
Usage:
"""
>>> contains([1,2,3,4,5], [2,3,4])
True
>>> contains([4,2,3,2,4], [2,3,4])
False
>>> contains([1,2,3,2,3,2,2,4,3], [2,4,3])
True
"""
The problem with most of the answers is that they only work well for lists of unique items. If items are not unique and you still want to know whether one list's items are all present in the other, you should count items:
from collections import Counter as count
def listContains(l1, l2):
    list1 = count(l1)
    list2 = count(l2)
    return list1&list2 == list1
print( listContains([1,1,2,5], [1,2,3,5,1,2,1]) ) # Returns True
print( listContains([1,1,2,8], [1,2,3,5,1,2,1]) ) # Returns False
You can also get the intersection itself with list((list1 & list2).elements()); ''.join(list1&list2) joins only the distinct common items, and only when they are strings.
Here is a solution with fewer lines of code that is easy to understand (or at least I like to think so).
If you want to keep order (match only if the smaller list is found in the same order in the bigger list):
def is_ordered_subset(l1, l2):
    # First check that all elements of l1 are in l2 (without checking order)
    if not set(l1).issubset(l2):
        return False
    length = len(l1)
    # Make sublists of the same size as l1
    list_of_sublist = [l2[i:i+length] for i, x in enumerate(l2)]
    # Check whether one of these sublists is l1
    return l1 in list_of_sublist
You can use numpy (this checks membership only, not order, and assumes the items of l1 are unique):
import numpy as np
def contains(l1, l2):
    """ returns True if l2 contains l1 and False otherwise """
    if len(np.intersect1d(l1,l2))==len(l1):
        return True
    else:
        return False
