Python: list comprehension with try/except and a nested conditional

This code should find the mode of a list in O(n) linear time. I want to turn this into a list comprehension because I'm teaching myself Python, and am trying to improve my list comprehension skills.
These were informative but don't really answer my question:
Convert nested loops and conditions to a list comprehension
`elif` in list comprehension conditionals
Nested list comprehension equivalent
The problem I'm running into is nesting the ifs and the try/except. I'm sure this is a simple question, so a junior Python programmer might have the answer quickly.
def mode(L):
    # your code here
    d = {}; mode = 0; freq = 0
    for j in L:
        try:
            d[j] += 1
            if d[j] > freq:
                mode = j; freq = d[j]
        except KeyError:
            d[j] = 1
    return mode
Note that the L parameter is a list of ints, like this:
L = [3,4,1,20,102,3,5,67,39,10,1,4,34,1,6,107,99]
I was thinking something like:
[try (d[j] += 1) if d[j] > freq (mode = j; freq = d[j]) except(KeyError): d[j] = 1 for j in L]
But I don't have enough duct tape to fix how badly the syntax is off with that thing.

I know you're learning comprehensions, but you can do this with a default dictionary, or a Counter too.
import collections

def mode(L):
    # your code here
    d = collections.defaultdict(int); mode = 0; freq = 0
    for j in L:
        d[j] += 1  # missing keys default to 0, so no KeyError handling is needed
        if d[j] > freq:
            mode = j; freq = d[j]
    return mode
Better still, when you are not trying to learn comprehensions:
import collections

def mode(L):
    return collections.Counter(L).most_common(1)[0][0]
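A quick check with the sample list from the question:

L = [3, 4, 1, 20, 102, 3, 5, 67, 39, 10, 1, 4, 34, 1, 6, 107, 99]
print(mode(L))   # 1, which occurs three times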

While it might not be possible to do this directly within a list comprehension, there's also no reason to. You only really want to be checking for errors when you're actually retrieving the results. As such, you really want to use a generator instead of a list comprehension.
The syntax is largely the same, just using parens instead of brackets, so you would do something like this:
generator = (do_something(x) for x in items)
try:
    for thing in generator:
        ...
except KeyError:
    ...
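As a concrete sketch of that pattern (the dictionary d and the keys here are made up for illustration):

d = {'a': 1}
lookups = (d[k] for k in ['a', 'b'])   # nothing is evaluated yet

try:
    for value in lookups:              # the KeyError only surfaces here
        print(value)
except KeyError:
    print('missing key')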
That said, you really don't want to do this for your particular application. You want to use a Counter:

from collections import Counter

d = Counter(L)
mode = d.most_common(1)[0][0]

You can't incorporate try/except in a list comprehension. However, you can get around it by refactoring into a dict comprehension:
d = {i: L.count(i) for i in L}
You can then determine the maximum and corresponding key in a separate test. However, this would be O(n**2).
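That separate test can be a single max over the dictionary (using the d built above):

mode = max(d, key=d.get)   # key whose count is largest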

It's not possible to use try-except expressions in a list comprehension.
Quoting this answer:
It is not possible to handle exceptions in a list comprehension, for a list comprehension is an expression containing other expressions, nothing more (i.e., no statements, and only statements can catch/ignore/handle exceptions).
Edit 1:
What you could do instead of using the try-except clause is use the get method from the dictionary:

def mode(L):
    d = {}
    mode = 0
    freq = 0
    for j in L:
        d[j] = d.get(j, 0) + 1
        if d[j] > freq:
            mode = j
            freq = d[j]
    return mode
From Python docs:
get(key[, default]): Return the value for key if key is in the dictionary, else default. If default is not given, it defaults to None, so that this method never raises a KeyError.
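For instance:

d = {'a': 1}
print(d.get('a', 0))   # 1
print(d.get('b', 0))   # 0 -- no KeyError raised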
Edit 2:
This is my list comprehension approach; not very efficient, just for fun:
r2 = max(zip(L, [L.count(e) for e in L]), key=lambda x: x[1])[0]

Since you're trying to find the value that appears most often, an easy way to do that is with max:
def mode(L):
return max(L, key=L.count)
This is a bit less efficient than the other answers that suggest using collections.Counter (it is O(N^2) rather than O(N)), but for a modest sized list it will probably be fast enough.

Related

Why does swapping not occur in the Python code below?

def rev(n):
    for i in range(len(n) // 2):
        temp = n[i]
        n[i] = n[len(n)-i-1]
        n[len(n)-i-1] = temp
        # or:
        n[i], n[len(n)-i-1] = n[len(n)-i-1], n[i]
        # note: with both swap statements active, each pair is swapped twice
        # per iteration, which puts the elements back where they started
    return n

n = [34, 45, 56, 67]
print(rev(n))
The code above doesn't reverse the list even though the logic looks correct; the output is the same as the input.
Can anyone help me with this? I am a little bit confused.
The intention appears to be to reverse the contents of a list in situ.
The most common way to do this is to create a new list with:
mylist[::-1]
...then copy it over the original like this:
mylist[:] = mylist[::-1]
However, the intent appears to be to use a custom loop to reverse in place.
def rev(_list):
    # requires Python 3.8+ for the walrus operator
    i, j = 0, len(_list)
    while (j := j - 1) > i:
        _list[i], _list[j] = _list[j], _list[i]
        i += 1
    return _list
This demonstrates the correct element-swap syntax and also obviates the need to create a new list. It is, however, very slow in comparison to traditional techniques.
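For example:

n = [34, 45, 56, 67]
print(rev(n))   # [67, 56, 45, 34]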

Avoiding re-computing the same expression in Python

In the following code, I use abs(v - i) three times on the same line. Is this expression computed three times when the code is run? Is there a way to avoid this without having to complicate the code?
x = sum(abs(v-i) if s == 1 else int((abs(v-i)*(abs(v-i)+1))/2) for v in list)
Is this expression computed three times when the code is run?
No, once or twice for every list value.
Is there a way to avoid this without having to complicate the code?
Depends on what you consider complicating the code.
You could use the idiom that even got optimized in Python 3.9:
x = sum(a if s == 1 else int((a*(a+1))/2)
        for v in list_
        for a in [abs(v-i)])
Or if your list values are ints, you could use math.comb:
x = sum(abs(v-i) if s == 1 else comb(abs(v-i)+1, 2) for v in list_)
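This works because comb(n, 2) is n*(n-1)/2, so comb(a+1, 2) equals a*(a+1)/2, the same triangular-number formula used above:

from math import comb

a = 7
assert comb(a + 1, 2) == a * (a + 1) // 2   # both equal 28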
While https://stackoverflow.com/a/70268402/1126841 regarding the assignment operator is correct, this is a case where I really dislike the assignment expression, as you have to hunt for where a is actually defined. I would probably ditch sum and accumulate the value in a for loop instead.
x = 0
for v in list_:
    a = abs(v-i)
    if s == 1:
        x += a
    else:
        x += int(a*(a+1)/2)
However, since s never changes in the loop, I would refactor this into two separate loops chosen by the value of s, one of which can use sum without difficulty.
if s == 1:
    x = sum(abs(v-i) for v in list_)
else:
    x = 0
    for v in list_:
        a = abs(v-i)
        x += int(a*(a+1)/2)
My answer only shows that it's possible to do what you want with a one-liner, but I would still advise to use a longer approach with an explicit if/else + caching the value, or using numpy arrays and masks.
You can use the walrus operator := to store the value as a variable. This line is equivalent to your original code and will only compute a = abs(v-i) once per loop instead of 1-2 times:
x = sum(a if ((a := abs(v-i)) is not None) and s == 1 else int(a*(a+1)/2) for v in list_)
The problem is that in a conditional expression the walrus assignment has to live in the condition (which is evaluated first), so we need to add a check that's always true... It really doesn't help readability.
"Long" approach:
import numpy as np

v = np.array(list_)
a = np.abs(v - i)
x = np.sum(a if s == 1 else a * (a + 1) // 2)
Can't you just save abs(v-i) to a variable and then substitute that variable in?
I would create a variable, my_calc = abs(v-i), and then use that name. This will clean up your code.

Avoid nested loops when checking data in Python

I have two lists of dictionaries:
dict_list1 = [{'k1':1, 'k2':2}, {'k1':3, 'k2':4}]
dict_list2 = [{'k1':1, 'k2':2, 'k3':10}, {'k1':3, 'k2':4, 'k3':10}]
And now, for each dict_x in dict_list1, I want to know if there is a dict_y in dict_list2 that contains every key/value pair from dict_x.
I cannot think of another way of doing this other then:
for dict_x in dict_list1:
    for dict_y in dict_list2:
        count = len(dict_x)
        for key, val in dict_x.items():
            if key in dict_y and dict_y[key] == val:
                count -= 1
        if count == 0:
            print('YAY')
            break
dict views can perform quick "is subset" testing via the inequality operators. So:
if dict_x.items() <= dict_y.items(): # Use .viewitems() instead of .items() on Python 2.7
will only return true if every key/value pair in dict_x also appears in dict_y.
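For example:

small = {'k1': 1}
big = {'k1': 1, 'k2': 2}
print(small.items() <= big.items())       # True
print({'k1': 9}.items() <= big.items())   # False -- the value differs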
This won't change anything in terms of big-O performance, but it does make the code somewhat cleaner:
for dict_x in dict_list1:
    for dict_y in dict_list2:
        if dict_x.items() <= dict_y.items():
            print('YAY')
            break
Note that creating the views costs something (it's just a fixed cost, not dependent on dict size), so if performance matters, it may be worth caching the views; doing so for dict_list1 is free:
for dict_x in dict_list1:
    dict_x_view = dict_x.items()
    for dict_y in dict_list2:
        if dict_x_view <= dict_y.items():
            print('YAY')
            break
but some eager conversions would be needed to cache both:
# Convert all of dict_list2 to views up front; costs a little if
# not all views end up being tested (we always break before finishing)
# but usually saves some work at the cost of a tiny amount of memory
dict_list2_views = [x.items() for x in dict_list2]
for dict_x in dict_list1:
    dict_x_view = dict_x.items()
    for dict_y_view in dict_list2_views:
        if dict_x_view <= dict_y_view:
            print('YAY')
            break
You could also collapse the loop using any (which removes the need to break since any short-circuits), so the first (simplest) check could become:
for dict_x in dict_list1:
    if any(dict_x.items() <= dict_y.items() for dict_y in dict_list2):
        print('YAY')
This could be further collapsed to a single list comprehension that results in the various matches, but at that point the code is going to be pretty cramped/ugly:
for _ in (dict_x for dict_x in dict_list1 if any(dict_x.items() <= dict_y.items() for dict_y in dict_list2)):
    print('YAY')
though without knowing what you'd really do (as opposed to just printing YAY) that's getting a little pointless.
Below, I use the fact that the dict.items view implements set operations to check, for each d1.items(), whether there exists a d2.items() such that d1.items() is a subset of d2.items():
[any(d1.items() <= d2.items() for d2 in dict_list2) for d1 in dict_list1]
You can use any and all:
dict_list1 = [{'k1':1, 'k2':2}, {'k1':3, 'k2':4}]
dict_list2 = [{'k1':1, 'k2':2, 'k3':10}, {'k1':3, 'k2':4, 'k3':10}]
v = [any(all(c in i and i[c] == k for c, k in b.items()) for i in dict_list2)
     for b in dict_list1]
Output:
[True, True]

O(n) list subtraction

When working on an AoC puzzle, I found I wanted to subtract lists (preserving ordering):
def bag_sub(list_big, sublist):
    result = list_big[:]
    for n in sublist:
        result.remove(n)
    return result
I didn't like the way the list.remove call (which is itself O(n)) is contained within the loop, that seems needlessly inefficient. So I tried to rewrite it to avoid that:
from collections import Counter

def bag_sub(list_big, sublist):
    c = Counter(sublist)
    result = []
    for k in list_big:
        if k in c:
            c -= Counter({k: 1})
        else:
            result.append(k)
    return result
Is this now O(n), or does the Counter.__isub__ usage still screw things up?
This approach requires that elements must be hashable, a restriction which the original didn't have. Is there an O(n) solution which avoids creating this additional restriction? Does Python have any better "bag" datatype than collections.Counter?
You can assume sublist is half the length of list_big.
I'd use a Counter, but I'd probably do it slightly differently, and I'd probably do this iteratively...
from collections import Counter

def bag_sub(big_list, sublist):
    sublist_counts = Counter(sublist)
    result = []
    for item in big_list:
        if sublist_counts[item] > 0:
            sublist_counts[item] -= 1
        else:
            result.append(item)
    return result
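A quick check of the behavior:

print(bag_sub([1, 2, 2, 3, 4], [2, 3]))   # [1, 2, 4] -- one 2 and the 3 removed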
This is very similar to your solution, but it's probably not efficient to create an entire new Counter every time you want to decrement the count on something.[1]
Also, if you don't need to return a list, then consider a generator function...
This works as long as all of the elements in list_big and sublist can be hashed. This solution is O(N + M) where N and M are the lengths of list_big and sublist respectively.
If the elements cannot be hashed, you are out of luck unless you have other constraints (e.g. the inputs are sorted using the same criterion). If your inputs are sorted, you could do something similar to the merge stage of merge-sort to determine which elements from bag_sub are in sublist.
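A minimal sketch of that merge-style idea, assuming both inputs are already sorted by the same criterion (the function name is mine, not from the question):

def bag_sub_sorted(list_big, sublist):
    # O(N + M) bag subtraction for pre-sorted inputs; no hashing required.
    result = []
    j = 0
    for item in list_big:
        # Skip sublist entries smaller than item; they have no match left.
        while j < len(sublist) and sublist[j] < item:
            j += 1
        if j < len(sublist) and sublist[j] == item:
            j += 1   # cancel exactly one matching occurrence
        else:
            result.append(item)
    return result

print(bag_sub_sorted([1, 2, 2, 3, 4], [2, 3]))   # [1, 2, 4]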
[1] Note that Counters also behave a lot like a defaultdict(int), so it's perfectly fine to look up an item in a Counter that isn't there already.
Is this now O(n), or does the Counter.__isub__ usage still screw things up?
This would be expected-case O(n), except that when Counter.__isub__ discards nonpositive values, it goes through every key to do so. You're better off just subtracting 1 from the key value the "usual" way and checking c[k] instead of k in c. (c[k] is 0 for k not in c, so you don't need an in check.)
if c[k]:
    c[k] -= 1
else:
    result.append(k)
Is there an O(n) solution which avoids creating this additional restriction?
Only if the inputs are sorted, in which case a standard variant of a mergesort merge can do it.
Does Python have any better "bag" datatype than collections.Counter?
collections.Counter is Python's bag.
Removing an item from a list of length N is O(N) if the list is unordered, because you have to find it.
Removing k items from a list of length N, therefore, is O(kN) if we focus on "reasonable" cases where k << N.
So I don't see how you could get it down to O(N).
A concise way to write this:

new_list = [x for x in list_big if x not in sublist]

But that's still O(kN), and note that it removes every occurrence of each value found in sublist, rather than one occurrence per entry, so it is not an exact bag subtraction.

Expanding elements in a list

I'm looking for a "nice" way to process a list where some elements need to be expanded into more elements (only once, no expansion on the results).
Standard iterative way would be to do:
i = 0
while i < len(l):
    if needs_expanding(l[i]):
        new_is = expand(l[i])
        l[i:i+1] = new_is   # replace the element with its expansion
        i += len(new_is)
    else:
        i += 1
which is pretty ugly. I could rewrite the contents into a new list with:
nl = []
for x in l:
    if needs_expanding(x):
        nl += expand(x)
    else:
        nl.append(x)
But they both seem too long. Or I could simply do 2 passes and flatten the list later:
flatten(expand(x) if needs_expanding(x) else x for x in l)
# or
def try_expanding(x): ...
flatten(try_expanding(x) for x in l)
but this doesn't feel "right" either.
Are there any other clear ways of doing this?
Your last two answers are what I would do. I'm not familiar with flatten(), but if you have such a function then that looks ideal. You can also use the built-in sum() (the generator expression needs its own parentheses when a second argument is passed):

sum((expand(x) if needs_expanding(x) else [x] for x in l), [])
sum((needs_expanding(x) and expand(x) or [x] for x in l), [])
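If flatten() isn't already defined somewhere, a one-level version matching the question's mixed usage (bare items or lists of items) might look like this; it's a sketch that assumes expand() returns a list:

def flatten(items):
    # Flatten exactly one level: splice in lists, keep other items as-is.
    result = []
    for x in items:
        if isinstance(x, list):
            result.extend(x)
        else:
            result.append(x)
    return result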
If you do not need random access in the list you are generating, you could also write a generator.
def iter_new_list(old_list):
    for x in old_list:
        if needs_expanding(x):
            for y in expand(x):   # in Python 3.3+ this could be `yield from expand(x)`
                yield y
        else:
            yield x

new_list = list(iter_new_list(old_list))
This is functionally equivalent to your second example, but it might be more readable in your real-world situation.
Also, Python coding standards (PEP 8) forbid the use of lowercase-L as a variable name, as it is nearly indistinguishable from the numeral one.
The last one is probably the most Pythonic, but you could try an implied loop (or, in Python 3, a generator) with map:
flatten(map(lambda x: expand(x) if needs_expanding(x) else x, l))
flatten(map(try_expanding, l))
