Why are the results of map() and list comprehension different? [duplicate] - python

This question already has answers here:
What do lambda function closures capture?
(7 answers)
Closed 6 months ago.
The following test fails:
#!/usr/bin/env python
def f(*args):
    """
    >>> t = 1, -1
    >>> f(*map(lambda i: lambda: i, t))
    [1, -1]
    >>> f(*(lambda: i for i in t)) # -> [-1, -1]
    [1, -1]
    >>> f(*[lambda: i for i in t]) # -> [-1, -1]
    [1, -1]
    """
    alist = [a() for a in args]
    print(alist)

if __name__ == '__main__':
    import doctest; doctest.testmod()
In other words:
>>> t = 1, -1
>>> args = []
>>> for i in t:
...     args.append(lambda: i)
...
>>> list(map(lambda a: a(), args))
[-1, -1]
>>> args = []
>>> for i in t:
...     args.append((lambda i: lambda: i)(i))
...
>>> list(map(lambda a: a(), args))
[1, -1]
>>> args = []
>>> for i in t:
...     args.append(lambda i=i: i)
...
>>> list(map(lambda a: a(), args))
[1, -1]

They are different because the value of i in both the generator expression and the list comprehension is evaluated lazily, i.e. when the anonymous functions are invoked in f.
By that time, i is bound to the last value in t, which is -1.
So basically, this is what the list comprehension does (likewise for the genexp):
x = []
i = 1 # 1. from t
x.append(lambda: i)
i = -1 # 2. from t
x.append(lambda: i)
Now the lambdas carry around a closure that references i, but i is bound to -1 in both cases, because that is the last value it was assigned to.
If you want to make sure that the lambda receives the current value of i, do
f(*[lambda u=i: u for i in t])
This way, you force the evaluation of i at the time the closure is created.
Edit: There is one difference between generator expressions and list comprehensions: in Python 2, the latter leaks the loop variable into the surrounding scope. (In Python 3, list comprehensions got their own scope, so neither construct leaks.)
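A quick Python 3 check of that scoping (the name j here is just for the demo; under Python 2 the second expression would print -1 instead of raising):
>>> [j for j in (1, -1)]
[1, -1]
>>> j
Traceback (most recent call last):
  ...
NameError: name 'j' is not defined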

The lambda captures variables, not values; hence the code
lambda: i
will always return the value that i is currently bound to in the closure. By the time it gets called, that value has been set to -1.
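The late binding is easy to observe directly (a minimal REPL sketch):
>>> i = 1
>>> g = lambda: i
>>> i = -1
>>> g()  # i is looked up now, at call time, not at definition time
-1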
To get what you want, you'll need to capture the actual binding at the time the lambda is created, by:
>>> f(*(lambda i=i: i for i in t))
[1, -1]
>>> f(*[lambda i=i: i for i in t])
[1, -1]

The expression f = lambda: i is equivalent to:
def f():
    return i
The expression g = lambda i=i: i is equivalent to:
def g(i=i):
    return i
i is a free variable in the first case, while in the second case it is bound to the function parameter, i.e., it is a local variable there. Values for default parameters are evaluated at the time of function definition.
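That last point can be checked directly in the REPL (a minimal sketch):
>>> i = 1
>>> g = lambda i=i: i  # the default is evaluated here, while i == 1
>>> i = -1
>>> g()  # the captured default is unaffected by the later rebinding
1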
The generator expression is the nearest enclosing scope (where i is defined) for the name i in the lambda expression, therefore i is resolved in that block:
f(*(lambda: i for i in (1, -1)))  # -> [-1, -1]
Here, i is a local variable of the lambda i: ... block, therefore the object it refers to is defined in that block:
f(*map(lambda i: lambda: i, (1, -1)))  # -> [1, -1]

Related

Lambda function range() get one number always

def multipliers():
    return [lambda x: i * x for i in range(3)]

print([m(2) for m in multipliers()])
How do I fix this lambda function?
I expected:
[0, 2, 4]
I got:
[4, 4, 4]
This has to do with the fact that the lambdas capture the name i, not its value (and the last value of the name i within that list comprehension is 2).
Add one more function that will have a local name i:
def make_multiplier(x):
    return lambda y: x * y

def multipliers():
    return [make_multiplier(i) for i in range(3)]

print([m(2) for m in multipliers()])
A lambda is also a function with its own local scope. To tie each lambda to its respective i counter, you can pass i as an argument with a default value:
def multipliers():
    return [lambda x, i=i: i * x for i in range(3)]

print([m(2) for m in multipliers()])  # [0, 2, 4]

Function in Python list comprehension, don't eval twice

I'm composing a Python list from an input list run through a transforming function. I would like to include only those items in the output list for which the result isn't None. This works:
def transform(n):
    # expensive irl, so don't execute twice
    return None if n == 2 else n**2

a = [1, 2, 3]
lst = []
for n in a:
    t = transform(n)
    if t is not None:
        lst.append(t)
print(lst)
[1, 9]
I have a hunch that this can be simplified with a comprehension. However, the straightforward solution
def transform(n):
    return None if n == 2 else n**2

a = [1, 2, 3]
lst = [transform(n) for n in a if transform(n) is not None]
print(lst)
is no good since transform() is applied twice to each entry. Any way around this?
Use the := (walrus) operator for Python >= 3.8:
lst = [t for n in a if (t := transform(n)) is not None]
If you are unable to (or don't want to) use the walrus operator, you can use functools.lru_cache to cache the result of calling the function and avoid calling it twice:
import functools

eggs = [2, 4, 5, 3, 2]

@functools.lru_cache
def spam(foo):
    print(foo)  # to demonstrate each call
    return None if foo % 2 else foo

print([spam(n) for n in eggs if spam(n) is not None])
Output:
2
4
5
3
[2, 4, 2]
Compared with the walrus operator (the currently accepted answer), this will be the better option if there are duplicate values in the input list, since the walrus operator always runs the function once per element of the input list. Note that you may combine functools.lru_cache with the walrus operator, e.g. for readability. For comparison, here is the walrus operator alone, calling the function once per element:
eggs = [2, 4, 5, 3, 2]

def spam(foo):
    print(foo)  # to demonstrate each call
    return None if foo % 2 else foo

print([bar for n in eggs if (bar := spam(n)) is not None])
Output:
2
4
5
3
2
[2, 4, 2]
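For completeness, a sketch of the combination mentioned above: lru_cache removes the duplicate call for the trailing 2 while the walrus keeps the comprehension single-pass.
import functools

eggs = [2, 4, 5, 3, 2]

@functools.lru_cache
def spam(foo):
    print(foo)  # to demonstrate each call
    return None if foo % 2 else foo

print([bar for n in eggs if (bar := spam(n)) is not None])
Output:
2
4
5
3
[2, 4, 2]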

generating tuple vs list [duplicate]

This question already has answers here:
Why is there no tuple comprehension in Python?
(13 answers)
Closed 1 year ago.
When generating a list, we do not use the builtin list to specify that it is a list; the [] alone does it. But when the same style/pattern is used for a tuple, it does not produce a tuple.
l = [x for x in range(8)]
print(l)
y = ((x for x in range(8)))
print(y)
Output:
[0, 1, 2, 3, 4, 5, 6, 7]
<generator object <genexpr> at 0x000001D1DB7696D0>
When "tuple" is specified it displays it right.
Question is:- In the code "list" is not explicitly mentioned but "tuple". Could you tell me why?
l = [x for x in range(8)]
print(l)
y = tuple((x for x in range(8)))
print(y)
Output:
[0, 1, 2, 3, 4, 5, 6, 7]
(0, 1, 2, 3, 4, 5, 6, 7)
Using () creates a generator expression:
>>> ((x for x in range(8)))
<generator object <genexpr> at 0x0000013FE1AD6040>
>>>
As mentioned in the documentation:
Generator iterators are created by the yield keyword. The real difference between them and ordinary functions is that yield unlike return is both exit and entry point for the function’s body. That means, after each yield call not only the generator returns something but also remembers its state. Calling the next() method brings control back to the generator starting after the last executed yield statement. Each yield statement is executed only once, in the order it appears in the code. After all the yield statements have been executed iteration ends.
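A minimal REPL sketch of that resume-after-yield behaviour (the name count is just for the demo):
>>> def count():
...     yield 1  # the first next() stops here and remembers the position
...     yield 2  # the second next() resumes right after the first yield
...
>>> g = count()
>>> next(g)
1
>>> next(g)
2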
A generator in a class would be something like:
class Generator:
    def __init__(self, lst):
        self.lst = lst

    def __iter__(self):
        it = iter(self.lst)
        yield from it

    def __next__(self):
        # iter() on a generator returns that same generator, so when the
        # class is constructed from a genexpr this advances the one iterator
        it = iter(self.lst)
        return next(it)
Usage:
>>> x = Generator(i for i in range(5))
>>> next(x)
0
>>> next(x)
1
>>> next(x)
2
>>> for i in x:
...     print(i)
...
3
4
>>>

How to understand scope within generators?

I seem to misunderstand variable scope within generators. Why do the following two results differ? (Note the second use of tuple when producing the second result.)
import itertools

def f(x): return x

result1 = tuple(itertools.chain(*((f(val) for _ in range(2)) for val in (1, 2, 3))))
result2 = tuple(itertools.chain(*(tuple(f(val) for _ in range(2)) for val in (1, 2, 3))))
print(result1 == result2)  # False; why?
Fundamentally, scope works here the way it always works. A generator just creates a local, enclosing scope, like a function does. Essentially, you are creating a closure over val, and in Python closures are lexically scoped and late-binding, i.e., the closed-over value is looked up at the point of execution, not of definition.
The difference between the two is when the outer generator gets iterated over versus the inner generators. In your first example, the outer generator is iterated completely before any of the inner generators are; in the second example, tuple forces them to be evaluated in tandem.
The problem is that the * argument splatting immediately evaluates the outer generator, while the inner generators are not evaluated yet. Each inner generator is closed over val, and by the end of the outer iteration val is 3.
But, in your second example,
(tuple(f(val) for _ in range(2)) for val in (1, 2, 3))
The inner call to tuple forces f to be called while val is 1, 2, and 3, and thus f captures those values.
So, consider the following nested generator, and two different ways of iterating over them:
>>> def gen():
...     for i in range(3):
...         yield (i for _ in range(2))
...
>>> data = list(gen()) # essentially what you are doing with the splatting
>>> for item in data:
...     print(list(item))
...
[2, 2]
[2, 2]
[2, 2]
>>> for item in gen():
...     print(list(item))
...
[0, 0]
[1, 1]
[2, 2]
>>>
And finally, this should also be informative:
>>> gs = []
>>> for item in gen():
...     gs.append(item)
...
>>> gs
[<generator object gen.<locals>.<genexpr> at 0x1041ceba0>, <generator object gen.<locals>.<genexpr> at 0x1041cecf0>, <generator object gen.<locals>.<genexpr> at 0x1041fc200>]
>>> [list(g) for g in gs]
[[2, 2], [2, 2], [2, 2]]
Again, you have to think about what the closed-over value will be when the inner generator is actually evaluated. In the case above, I had already iterated over the outer generator, so i was 2, and I had merely appended the inner generators to another list; when I finally evaluate the inner generators, they see i as 2, because that is what it is by then.
To reiterate, this occurs because the * splatting forces the outer generator to be iterated over. Use chain.from_iterable instead and you'll get True for result1 == result2.
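A sketch of that fix, reusing f from the question: chain.from_iterable pulls each inner generator while the outer generator is still paused at the matching val.
import itertools

def f(x): return x

result1 = tuple(itertools.chain.from_iterable(
    (f(val) for _ in range(2)) for val in (1, 2, 3)))
print(result1)  # (1, 1, 2, 2, 3, 3) -- no stale val is ever observed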

mysterious behaviour of python built-in method filter in for loop

Consider the following behaviour:
a = list(range(10))
res = list(a)
for i in a:
    if i in {3, 5}:
        print('>>>', i)
        res = filter(lambda x: x != i, res)
print(list(res))
>>> 3
>>> 5
[0, 1, 2, 3, 4, 5, 6, 7, 8]
So neither 3 nor 5 was removed, but 9 is gone...
If I force-convert the filter object to a list, it works as expected:
a = list(range(10))
res = list(a)
for i in a:
    if i in {3, 5}:
        print('>>>', i)
        # here I force-convert the filter object to a list, and it works as expected
        res = list(filter(lambda x: x != i, res))
print(list(res))
>>> 3
>>> 5
[0, 1, 2, 4, 6, 7, 8, 9]
I sense that this is because the filter object is lazy like a generator, but I cannot explain exactly how that laziness causes this consistent, weird behaviour. Please help elaborate the underlying mechanics.
The behaviour arises from a combination of two facts:
The lambda function refers to the variable i from the surrounding scope, and i is only looked up at execution time. Consider this example:
>>> func = lambda x: x != i # i does not even need to exist yet
>>> i = 3
>>> func(3) # now i will be used
False
Because filter returns a lazy iterator, the function is evaluated lazily, when you actually iterate over the result, rather than when filter is called.
The combined effect of these, in the first example, is that by the time that you iterate over the filter object, i has the value of 9, and this value is used in the lambda function.
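In other words, the loop effectively builds the following unevaluated pipeline (a sketch; both lambdas look up the same i, which is 9 once the loop has finished):
res = filter(lambda x: x != i, filter(lambda x: x != i, a))
print(list(res))  # only 9 is missing, filtered out twice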
The desired behaviour can be obtained by removing either (or both) of the two combined factors mentioned above:
In the lambda, force early binding by using the value of i as the default value of a parameter (say j); so, in place of lambda x: x != i, you would use:
lambda x, j=i: x != j
The expression for the default value (i.e. i) is evaluated when the lambda is defined, and by calling the lambda with only one argument (x) this ensures that you do not override this default at execution time.
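Applied to the original loop, that early-binding fix looks like this sketch:
a = list(range(10))
res = list(a)
for i in a:
    if i in {3, 5}:
        res = filter(lambda x, j=i: x != j, res)
print(list(res))  # [0, 1, 2, 4, 6, 7, 8, 9]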
or:
Force early execution of all iterations of the generator by converting to list immediately (as you have observed).
