Confusion on how Python generator works - python

Below is a copy from a Python book, the explanation on how generator works is not clear to me at all.
gen.send.py
def counter(start=0):
    n = start
    while True:
        result = yield n             # A
        print(type(result), result)  # B
        if result == 'Q':
            break
        n += 1

c = counter()
print(next(c))         # C
print(c.send('Wow!'))  # D
print(next(c))         # E
print(c.send('Q'))     # F
And the output of the above is:
$ python gen.send.py
0
<class 'str'> Wow!
1
<class 'NoneType'> None
2
<class 'str'> Q
Traceback (most recent call last):
File "gen.send.py", line 14, in <module>
print(c.send('Q')) # F
StopIteration
Learning Python. . VitalBook file.
Explanation from book:
We start the generator execution with a call to next (#C). Within the generator, n is set to the same value of start. The while loop is entered, execution stops (#A) and n (0) is yielded back to the caller. 0 is printed on the console.
#Q1: At this point, n=1, right? because n+=1 should be executed since print(type(result), result) is executed.
We then call send (#D), execution resumes and result is set to 'Wow!' (still #A), then its type and value are printed on the console (#B). result is not 'Q', therefore n is incremented by 1 and execution goes back to the while condition, which, being True, evaluates to True (that wasn't hard to guess, right?). Another loop cycle begins, execution stops again (#A), and n (1) is yielded back to the caller. 1 is printed on the console.
Q2: 'Wow!' is sent to who? n, start, or result? and how? If n='Wow!' and what is the consequence of n+=1 then?
At this point, we call next (#E), execution is resumed again (#A), and because we are not sending anything to the generator explicitly, Python behaves exactly like functions that are not using the return statement: the yield n expression (#A) returns None.
Q3: Why None? whose value (start, n, result) exactly is suspended in this generator
result therefore is set to None, and its type and value are yet again printed on the console (#B). Execution continues, result is not 'Q' so n is incremented by 1, and we start another loop again. Execution stops again (#A) and n (2) is yielded back to the caller. 2 is printed on the console.
Q4: Why 2? Why not 4, or 5 because of n+=1 statement?
And now for the grand finale: we call send again (#F), but this time we pass in 'Q', therefore when execution is resumed, result is set to 'Q' (#A). Its type and value are printed on the console (#B), and then finally the if clause evaluates to True and the while loop is stopped by the break statement. The generator naturally terminates and this means a StopIteration exception is raised. You can see the print of its traceback on the last few lines printed on the console.
Thanks in advance.

One way to tackle this kind of learning problem is to visualize what is happening at each step. Let us start from the beginning:
def counter(start=0):
    n = start
    while True:
        result = yield n             # A
        print(type(result), result)  # B
        if result == 'Q':
            break
        n += 1
Nothing interesting happens here. It is just a function definition.
c = counter()
Normally it should execute the function right? But since there is a yield keyword, it simply returns an object which can be used to execute the function. Read the previous sentence again! That is the first thing to understand about generators.
print(next(c)) # C
This is how you execute the function body through the object c. You don't invoke it with (); you call next(c) instead. This is the very first time the function's instructions run, and they run until they reach the first yield. Since that yield is at A and n is 0 at that moment, next(c) returns 0, which gets printed, and the function exits; more precisely, the function pauses here. Remember that it has not even reached n += 1 yet! That answers your Q1.
print(c.send('Wow!')) # D
Now some more interesting stuff happens. The generator c, which had previously stopped at yield n, now resumes, and the very next thing it has to do is finish the assignment in result = yield n at A. The yield expression evaluates to the value you sent in! So result = 'Wow!' has just happened.
The rest of the execution is normal. The generator pauses again when it hits the next yield n. By then n is n + 1, because it was incremented at the bottom of the while loop. I hope you can guess how the rest of the code behaves.
print(c.send('Q')) # F
Now this statement is somewhat different because it sends in a value that actually breaks the loop, which in turn prevents any further yields in this case. Since the generator runs to the end of the function without reaching another yield, a StopIteration exception is raised at the c.send('Q') call. If there were a yield after the while loop, send would have returned that value and the generator would have paused again.
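To make that last point concrete, here is a minimal sketch of the same counter with one extra yield placed after the loop. The extra yield 'done' line is my own addition for illustration and is not in the book's version:

```python
def counter(start=0):
    n = start
    while True:
        result = yield n
        if result == 'Q':
            break
        n += 1
    # Hypothetical extra yield after the loop (not in the book's code).
    yield 'done'

c = counter()
first = next(c)      # runs to the first yield inside the loop
last = c.send('Q')   # breaks out of the loop and reaches the extra yield

try:
    next(c)          # nothing left to yield, so the generator finishes
    finished = False
except StopIteration:
    finished = True

print(first, last, finished)
```

So c.send('Q') only raises StopIteration when the function body ends before another yield is reached.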

Q1: Not sure what the question is here
Q2: 'Wow!' is sent to result.
Q3: result is None because execution was resumed with next(), so nothing was sent to the yield expression. As a result, the default value of None is sent instead.
Q4: Why would 4 or 5 be printed? n += 1 has only executed twice.

Q1: No, n is still 0 at this point. The generator stopped running at A, and the outside code has printed the first yielded value at C. The generator doesn't start running until send is called as part of line D.
Q2: The string "Wow!" becomes the value of the yield expression in line A, so it gets assigned to result in the generator. It, along with its type, gets printed out on line B, which is your second line of output. Then n gets incremented, and the loop starts over, with n (1) getting yielded as the return value from c.send. That gets printed on line D, for the third line of output.
Q3: You resume the generator on line E by calling next on it, which is equivalent to c.send(None). So result gets the value None (which is of type NoneType), and that gets printed in the generator code.
Q4: I'm not sure I understand what you're asking here. You seem to understand the execution flow. The code never prints more numbers because the generator has ended. It incremented n twice, but after it got the Q, it quit.
For what it's worth, you're very unlikely to ever need to write code quite like this example. It's very rare to mix next calls with send calls (except for the first next on a coroutine you're going to call send on the rest of the time).
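If you want to convince yourself that next(c) and c.send(None) really do the same thing, a small sketch like this may help. The echo generator is a made-up helper for this illustration; it simply records whatever value it is resumed with:

```python
def echo():
    """Hypothetical generator that records every value it is resumed with."""
    received = []
    while True:
        value = yield list(received)  # yield a copy of what arrived so far
        received.append(value)

a = echo()
next(a)                  # prime: run to the first yield
via_next = next(a)       # resume without sending anything

b = echo()
b.send(None)             # prime with send(None) instead of next()
via_send = b.send(None)  # resume by sending None explicitly

print(via_next, via_send)
```

Both generators see None as the value of the yield expression, which is why result was None at line E of your example.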

Think of yield like a special return statement. When you get to the result = yield n line, first the right side is executed, returning n, which is 0. The difference from a return is that the function doesn't stop, it pauses, so the next time you call c.send(17) or next(c) it will resume from the yield, replacing it by the value you send (17) or None if you use the next(c) way. So when you call the first time next(c) it returns 0 and pauses, when you call c.send('Wow!') it resumes printing the type and the value you send from inside the generator, returning 1 and pausing, and it goes on.
Maybe if you add the letters to the print statements, you can see more easily where each output line comes from:
def counter(start=0):
    n = start
    while True:
        result = yield n                   # A
        print("B:", type(result), result)  # B
        if result == 'Q':
            break
        n += 1

c = counter()
print("C:", next(c))         # C
print("D:", c.send('Wow!'))  # D
print("E:", next(c))         # E
print("F:", c.send('Q'))     # F
This would output:
$ python gen.send.py
C: 0
B: <class 'str'> Wow!
D: 1
B: <class 'NoneType'> None
E: 2
B: <class 'str'> Q
Traceback (most recent call last):
File "gen.send.py", line 14, in <module>
print("F:", c.send('Q')) # F
StopIteration
So answering your questions:
Q1: n is still 0, because the generator paused right after yielding n. The 0 you see comes from the print(next(c)) # C line. n += 1 only gets executed after you resume the generator with c.send('Wow!').
Q2: What you pass to the send method works as if it replaced the yield expression where the generator paused; in this case result = yield n becomes result = 'Wow!', so it is assigned to result.
Q3: Calling next(c) is equivalent to calling c.send(None), so it resumes execution with None in place of the yield expression (result = yield n becomes result = None).
Q4: You woke up the generator 4 times: the first next(c) didn't reach any n += 1; the c.send('Wow!') reached it once; the second next(c) reached it a second time; and the c.send('Q') didn't reach it because the break statement was executed first, exiting the while loop.

Related

Difference between two yield statements

What would be the difference between the following two generator functions?
def get_primes(number):
    while True:
        if is_prime(number):
            number = yield number
        number += 1
And:
def get_primes(number):
    while True:
        if is_prime(number):
            yield number
        number += 1
As far as I understand, I can call them as:
p = get_primes(0)
# first call works for both
next(p) # or p.send(None)
# second call different for both
next(p) # works for second way only
p.send(14) # works for first way only
I think my issue is I don't really understand how send works and how it's setting the value and all.
If you check out the docs, it says:
Resumes the execution and “sends” a value into the generator function. The value argument becomes the result of the current yield expression.
That may sound a little cryptic, so perhaps in other words:
Using send() the generator resumes where it yielded and the value you have sent is what yield returns (and can be assigned to any variable). You can also try the following code:
def get_num():
    number = 1
    while True:
        print(number)
        number = yield number

g = get_num()
g.send(None)  # haven't yielded yet, cannot send a value to it
g.send(2)
g.send(5)
It'll print:
1: value we've initially assigned to number
2: we did send(2) and that is what number = yield ... assigned to number, then we continued, looped back to print() and yielded again.
5: Same thing, but we did send(5).
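Applied to the two get_primes versions from the question, a rough sketch could look like this. The is_prime helper is my own stand-in (the question does not show one), and I renamed the two functions so they can live side by side:

```python
def is_prime(k):
    # Hypothetical helper; the question does not show its own is_prime.
    return k > 1 and all(k % d for d in range(2, int(k ** 0.5) + 1))

def primes_with_send(number):
    # First version: the sent value replaces number.
    while True:
        if is_prime(number):
            number = yield number
        number += 1

def primes_plain(number):
    # Second version: whatever is sent in is thrown away.
    while True:
        if is_prime(number):
            yield number
        number += 1

p = primes_plain(0)
plain = [next(p), next(p), p.send(14)]  # send() works here too; 14 is ignored

q = primes_with_send(0)
first = next(q)      # first prime at or after 0
jumped = q.send(14)  # number becomes 14, so the search restarts from 15

try:
    next(q)          # sends None, so number = None and number += 1 fails
    failure = None
except TypeError as exc:
    failure = exc

print(plain, first, jumped, type(failure).__name__)
```

So send also works on the second version (the value is simply discarded), while a plain next on the first version fails after the first yield, because None ends up in number.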

How does Python work with multiple "send" calls to a generator?

There's a number of good questions about similar matters, e.g.
python generator "send" function purpose?
What does the "yield" keyword do?
Let's get back to the definition of "send":
Resumes the execution and “sends” a value into the generator function.
The value argument becomes the result of the current yield expression.
The send() method returns the next value yielded by the generator, or
raises StopIteration if the generator exits without yielding another
value. When send() is called to start the generator, it must be called
with None as the argument, because there is no yield expression that
could receive the value
But I feel I am missing something important. Here's my example with 3 send calls, including the initial one with a None value just to initialize a generator:
def multiplier():
    while True:
        m = yield                # Line #3
        print('m = ' + str(m))   # Line #4
        yield str(m * 2)         # Line #5
        yield str(m * 3)         # Line #6
#------------------------
it = multiplier()
print('it.send(None): ')
print(str(it.send(None)))
print('--------------')
print('it.send(10): ')
print(it.send(10))
print('--------------')
print('it.send(100): ')
print(it.send(100))
print('--------------')
print('--------------')
And here's an output:
it.send(None):
None
--------------
it.send(10):
m = 10
20
--------------
it.send(100):
30
--------------
Questions:
What happens exactly when I use it.send(10) in a line #5. If we
follow the definition, the generator execution resumes. Generator
accepts 10 as input value and uses it in a current yield
expression. It is yield str(m * 2) in my example, but then how m
is set to 10. When did that happen. Is that because of the
reference between m and yield in a line #3?
What happens in line #6 with it.send(100), and why is the output still 30?
Does it mean that the reference in my previous question only worked
once?
Note:
If I've changed my example and added a line m = yield between lines #5 and #6 and then use next(it) after print(it.send(10)) - in that case the output starts to make sense: 20 and 300
Your generator function has three yield expressions, but you're throwing away the value from two of them (lines 5 and 6). If you did something with the values there, you'd see the 100 being used in the function. If you kept running your example, the fifth time you called send would cause the generator to update m to a new value.
Let's walk through the code that does the send calls in your example, and see what the generator is doing at the same time:
it = multiplier()
At this point the generator object has been created and saved to it. The generator code has not started running yet, it's paused at the start of the function's code.
print(str(it.send(None)))
This starts running the generator function's code. The value sent must be None or you'll get an error. The function never sees that value. It's more common to use next to start up a generator, since next(it) is equivalent to it.send(None).
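Here is a small sketch of that error, using your multiplier without the print so the output stays short. As far as I know, a failed first send leaves the generator unstarted, so it can still be started properly afterwards:

```python
def multiplier():
    # Same shape as the generator in the question, minus the print call.
    while True:
        m = yield
        yield str(m * 2)
        yield str(m * 3)

it = multiplier()

try:
    it.send(10)           # not started yet: only None is accepted here
    error = None
except TypeError as exc:
    error = exc

started = it.send(None)   # runs to the bare yield, which yields None
doubled = it.send(10)     # now there is a paused yield to receive the 10

print(type(error).__name__, started, doubled)
```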
The generator function runs until line 3, where the first yield appears. Since you're not yielding any particular value, the return value from send is None (which gets printed).
print(it.send(10))
This value gets sent to the generator and becomes the value of the yield expression on line 3. So 10 gets stored as m, and the code prints it out on line 4. The generator function keeps running to line 5, where it reaches the next yield expression. Since it's yielding str(m * 2), the calling code gets "20" and prints that.
print(it.send(100))
The 100 value gets sent into the generator as the value of the yield on line 5. That value is ignored, since you're not using the yield as an expression but as a statement. Just like putting 100 on a line by itself, this is perfectly legal, but maybe not very useful. The code goes on to line 6 where it yields str(m * 3), or "30", which gets printed by the calling code.
That's where your driving code stops, but the generator is still alive, and you could send more values to it (and get more values back). The next value you send to the generator would also be ignored, just like the 100 was, but the value after that would end up as a new m value when the while loop in the generator returned to the top and the line 3 yield was reached.
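If you want to see that happen, here is a sketch that keeps sending. The values 1000 and 7 are arbitrary picks of mine, and I dropped the print from the generator to keep it short:

```python
def multiplier():
    # Same yields as in the question; the print on line 4 is left out.
    while True:
        m = yield
        yield str(m * 2)
        yield str(m * 3)

it = multiplier()

# Five sends in a row: the start-up None, your 10 and 100, then two more.
replies = [it.send(value) for value in (None, 10, 100, 1000, 7)]
print(replies)
```

The 100 and the 1000 are both thrown away (they land on the yields at lines 5 and 6), the fourth reply is None because the loop is back at the bare yield on line 3, and only the fifth send, 7, becomes the new m.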
I suspect that some of your confusion with send in this code has to do with the fact that you're using yield both as an expression and as a statement. Probably you don't want to be doing both. Usually you'll either care about all the values being sent into the generator, or you don't care about any of them. If you want to yield several values together (like n*2 and n*3), you could yield a tuple rather than a single item.
Here's a modified version of your code that I think might be easier for you to play with and understand:
def multiplier():
    print("top of generator")
    m = yield  # nothing to yield the first time, just a value we get
    print("before loop, m =", m)
    while True:
        print("top of loop, m =", m)
        m = yield m * 2, m * 3  # we always care about the value we're sent
        print("bottom of loop, m =", m)

print("calling generator")
it = multiplier()
print("calling next")
next(it)  # this is equivalent to it.send(None)
print("sending 10")
print(it.send(10))
print("sending 20")
print(it.send(20))
print("sending 100")
print(it.send(100))
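Stripped of the tracing prints, the same idea can be sketched like this, which makes it easier to see that every value sent in is actually used:

```python
def multiplier():
    m = yield                   # first value arrives here
    while True:
        m = yield m * 2, m * 3  # every later send replaces m

it = multiplier()
next(it)                        # start the generator (same as it.send(None))

replies = [it.send(value) for value in (10, 20, 100)]
print(replies)
```

Each send gets back a tuple computed from the value it just sent, so nothing is silently dropped.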

What happens when closing a loop using an "infinite" iterable?

I have written the following python function(s):
import numpy
def primes_iterable():
    """Iterable giving the primes"""
    # The lowest primes
    primes = [2,3,5]
    for p in primes:
        yield p
    for n in potential_primes():
        m = int(numpy.sqrt(n))
        check = True
        for p in primes:
            if p > m:
                break
            if n%p == 0:
                check = False
        if check:
            primes.append(n)
            yield n

def potential_primes():
    """Iterable starting at 7 and giving back the non-multiples of 2,3,5"""
    yield 7
    n = 7
    gaps = [4,2,4,2,4,6,2,6]
    while 1:
        for g in gaps:
            n += g
            yield n
As you can see, both functions don't have a return statement. Suppose I was to write something like this:
for p in primes_iterable():
    if p > 1000:
        break
    print p
What happens at the level of the memory when the break statement is reached? If I understand correctly, calling primes_iterable() makes the function start, go until the next yield and then pause until it is needed again. When the break statement is reached, does the function instance close up, or does it continue existing in the background, completely useless?
Your function primes_iterable is a generator function. When you call it, nothing happens immediately (other than it returning a generator object). Only when next is called on it does it run to the next yield.
When you call the generator function, you get an iterable generator object. If you're doing that in a for loop, the loop will keep a reference to the generator object while it is running. If you break out of the loop, that reference is released and the generator object can be garbage collected.
But what happens to the code running in the generator function when the generator object is cleaned up? It gets interrupted by a GeneratorExit exception thrown into it at the yield where it was paused. If you need to, your generator function can catch this exception, but you can't do anything useful other than cleaning up your resources and exiting. That is often done with a try/finally pair, rather than an except clause.
Here's some example code that demonstrates the behavior:
def gen():
    print("starting")
    try:
        while 1:
            yield "foo"
    except GeneratorExit:
        print("caught GeneratorExit")
        raise
    finally:
        print("cleaning up")
Here's a sample run:
>>> for i, s in enumerate(gen()):
print(s)
if i >= 3:
break
starting
foo
foo
foo
foo
caught GeneratorExit
cleaning up
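If you would rather not depend on when the garbage collector gets around to it, you can close the generator yourself. This is only a sketch; the events list is something I added so the order of events can be inspected afterwards:

```python
events = []

def gen():
    events.append("starting")
    try:
        while True:
            yield "foo"
    finally:
        # Runs when GeneratorExit is raised at the paused yield.
        events.append("cleaning up")

g = gen()
first = next(g)   # run to the first yield
g.close()         # raises GeneratorExit inside the generator right now
g.close()         # closing an already finished generator does nothing

try:
    next(g)
    exhausted = False
except StopIteration:
    exhausted = True

print(first, events, exhausted)
```

After close() the cleanup has already run, and any further next() just raises StopIteration.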
When you break from the for loop there is no reference left to the generator so it will eventually be garbage collected...
Just for clarity calling primes_iterable() creates a generator. Calling next() on the generator passes control to the generator and it runs until it yields. The for implicitly calls next() each loop.
Consider this:
prime = primes_iterable()
print(next(prime))  # 2
for p in prime:
    if p > 1000:
        break
    print(p)  # 3, 5, 7, ...
Now you still have a reference to the generator called prime so you can always get the next prime:
print(next(prime)) # 1013
primes_iterable() returns an iterator. This is an object which spits out a new value whenever you call next on it. This is what a for loop does behind the scenes. Try this:
it = primes_iterable()
print(next(it))
print(next(it))
Important to note is that it isn't running forever behind the scenes here, it just runs far enough to spit out a new value whenever you ask it to. It keeps hold of its data so that it's ready to start running again whenever, but you can't access that data.
Now, in your code,
for p in primes_iterable():
As above primes_iterable has been called and has returned an iterator, although in this case the iterator has no name (i.e. it is not bound to a variable). For every step of the loop, p will be assigned to next of the iterator.
    if p > 1000:
        break
Now we break out and the for loop stops running next on the iterator. Nothing references the iterator any more (you can check this by calling dir() which shows you everything defined in the global namespace).
Therefore after a while Python frees up the memory that the iterator was taking up. This is called garbage collection. It's also what will happen if e.g. you type [1,2,3] into the interpreter but don't bind it to a variable name. It is created but then effectively deleted to free up space because it's pointless.
You can (and should) read more about iterators here:
https://docs.python.org/3/tutorial/classes.html#iterators

Python 3, yield expression return value influenced by its value just received via send()?

after reading documentation, questions, and making my own test code, I believe I have understood how a yield expression works.
Nevertheless, I am surprised by the behavior of the following example code:
def gen(n=0):
    while True:
        n = (yield n) or n+1

g=gen()
print( next(g) )
print( next(g) )
print( g.send(5) )
print( next(g) )
print( next(g) )
I would have expected that it returned 0, 1, 2, 5, 6, while instead it produces: 0, 1, 5, 6, 7.
I.e: I would have expected that the yield expression produce these effects:
1. calculate the value of the yield expression, and return it to the caller
2. get the value(s) from the caller's send() and use them as the value of the yield expression which the generator function code receives
3. suspend execution before anything else is executed; it will be resumed at the same point at the next next(g) or g.send() call
... and/or that Python would care to avoid any interference between the two
flows of information in (1) and (2), i.e. that they were guaranteed independent such as in a tuple assignment a, b = f(a,b), g(a,b)
(I would even wonder if it were better to make the suspension happen in between (1) and (2), but maybe it would be quite complicated because it would imply that only part of the statement is executed and the rest is held for the next resume)
Anyway, the order of the operations is rather (2), then (1), then (3), so that the assignment in (2) occurs before, and can influence the assignment in (1). I.e. the value injected by the g.send() call is used before calculating the yield expression itself, which is directly exposed to the caller as the value of the same g.send() expression.
I am astonished because from the point of view of the code in the generator function, the value received in its lhs can influence the value taken by the rhs!
To me, this is kind of misleading because one expects that in a statement like lhs expr = rhs expr, all calculations in the rhs expr are finished before doing the assignment, and frozen during the assignment. It looks really weird that the lhs of an assignment can influence its own rhs!
The question: which are the reasons why it was made this way? Any clue?
(I know that "We prefer questions that can be answered, not just discussed", but this is something I stumbled on and it consumed a lot of my time. I believe a bit of discussion won't do any harm and maybe will save someone else's time)
PS. of course I understand that I can separate the assignment into two steps, so that any value received from send() will be used only after resuming the operation. Like this:
def gen(n=0):
    while True:
        received = (yield n)
        n = received or (n+1)
Your confusion lies with generator.send(). Sending is just the same thing as using next(), with the difference being that the yield expression produces a different value. Put differently, next(g) is the same thing as g.send(None), both operations resume the generator there and then.
Remember that a generator starts paused, at the top. The first next() call advances to the first yield expression, hands the yielded value to the caller, and pauses the generator there. When the generator is paused at a yield expression and you call either next(g) or g.send(..), the generator is resumed where it is right now, and then runs until the next yield expression is reached, at which point it pauses again.
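You can watch these states from the outside with inspect.getgeneratorstate from the standard library. A minimal sketch using your gen:

```python
import inspect

def gen(n=0):
    while True:
        n = (yield n) or n + 1

g = gen()
states = [inspect.getgeneratorstate(g)]     # created, body not entered yet

next(g)
states.append(inspect.getgeneratorstate(g))  # paused at the yield

g.close()
states.append(inspect.getgeneratorstate(g))  # finished for good

print(states)
```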
For your code, this happens:
g is created, nothing happens in gen()
next(g) actually enters the function body, n = 0 is executed, yield n pauses g and yields 0. This is printed.
next(g) resumes the generator; None is returned for yield n (nothing was sent after all), so None or n + 1 is executed and n = 1 is set. The loop continues on and yield n is reached again, the generator pauses and 1 is yielded. This is printed.
g.send(5) resumes the generator. 5 or n + 1 means n = 5 is executed. The loop continues until yield n is reached, the generator is paused, 5 is yielded and you print 5.
next(g) resumes the generator; None is returned (nothing was sent again), so None or n + 1 is executed and n = 6 is set. The loop continues on and yield n is reached again, the generator pauses and 6 is yielded and printed.
next(g) resumes the generator; None is returned (nothing was sent again), so None or n + 1 is executed and n = 7 is set. The loop continues on and yield n is reached again, the generator pauses and 7 is yielded and printed.
Given your steps 1., 2. and 3., the actual order is 3., 2., 1. then, with the addition that next() also goes through step 2. producing None, and 1. being the next invocation of yield encountered after un-pausing.
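The same sequence, written as a sketch with the assignment split in two so the order is easier to follow. The extra send(0) at the end is my own addition; it shows a side effect of using or here, namely that a falsy value such as 0 is treated like "nothing was sent":

```python
def gen(n=0):
    while True:
        received = yield n     # pause here; resume with whatever was sent
        n = received or n + 1  # only runs after the generator is resumed

g = gen()
outputs = [next(g), next(g), g.send(5), next(g), next(g)]
after_zero = g.send(0)         # 0 is falsy, so n + 1 is used instead

print(outputs, after_zero)
```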

Behaviour of Python's "yield"

I'm reading about the yield keyword in python, and trying to understand running this sample:
def countfrom(n):
    while True:
        print "before yield"
        yield n
        n += 1
        print "after yield"

for i in countfrom(10):
    print "enter for loop"
    if i <= 20:
        print i
    else:
        break
The output is:
before yield
enter for loop
10
after yield
before yield
enter for loop
11
after yield
before yield
enter for loop
12
after yield
before yield
enter for loop
13
after yield
before yield
enter for loop
14
after yield
before yield
enter for loop
15
after yield
before yield
enter for loop
16
after yield
before yield
enter for loop
17
after yield
before yield
enter for loop
18
after yield
before yield
enter for loop
19
after yield
before yield
enter for loop
20
after yield
before yield
enter for loop
It looks like the yield will return the specified value, and will continue running the function till the end (in a parallel thread, maybe). Is my understanding correct?
If you could answer this without mentioning "generators", I would be thankful, because I'm trying to understand one at a time.
You can think of it as if the function which yields simply "pauses" when it comes across the yield. The next time you call it, it will resume after the yield keeping the state that it was in when it left.
No, there is only a single thread.
Each iteration of the for loop runs your countfrom function until it yields something, or returns. After the yield, the body of the for loop runs again and then, when a new iteration starts, the countfrom function picks up exactly where it left off and runs again until it yields (or returns).
This modified version of your example will hopefully make it clearer what path execution takes.
def countfrom(n):
    while n <= 12:
        print "before yield, n = ", n
        yield n
        n += 1
        print "after yield, n = ", n

for i in countfrom(10):
    print "enter for loop, i = ", i
    print i
    print "end of for loop iteration, i = ", i
Output
before yield, n = 10
enter for loop, i = 10
10
end of for loop iteration, i = 10
after yield, n = 11
before yield, n = 11
enter for loop, i = 11
11
end of for loop iteration, i = 11
after yield, n = 12
before yield, n = 12
enter for loop, i = 12
12
end of for loop iteration, i = 12
after yield, n = 13
..you cannot explain the meaning of the yield statement without mentioning generators; it would be like trying to explain what a stone is without mentioning rock. That is: the yield statement is the one responsible to transform a normal function into a generator.
While you find it well documented here: http://docs.python.org/reference/simple_stmts.html#the-yield-statement
..the brief explanation of it is:
When a function using the yield statement is called, it returns a "generator iterator", having a .next() method (the standard for iterator objects; in Python 3 the method is named __next__() and is normally called through the built-in next())
Each time the .next() method of the generator is called (e.g. by iterating the object with a for loop), the function runs until the first yield is encountered. Then the function execution is paused and the yielded value is passed back as the return value of the .next() method.
The next time .next() is called, the function execution is resumed until the next yield, and so on, until the function returns.
Some advantages in doing this are:
less memory usage since memory is allocated just for the currently yielded value, not the whole list of returned values (as it would be by returning a list of values)
"realtime" results return, as they are produced can be passed to the caller without waiting for the generation end (i used that to return output from a running process)
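A small sketch of that laziness, written for Python 3. The produced list is only there to record how far the function body actually ran, and the cut-off of 5 is arbitrary:

```python
import itertools

produced = []

def countfrom(n):
    while True:          # never ends on its own
        produced.append(n)
        yield n
        n += 1

# Take only the first five values from the endless generator.
first_five = list(itertools.islice(countfrom(10), 5))

print(first_five, produced)
```

Only the five requested values were ever computed, even though the loop inside countfrom has no end.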
The function countfrom is not run in a parallel thread. What happens here is that whenever the for-construct asks for the next value, the function will execute until it hits a yield statement. When the next value after that is required, the function resumes execution from where it left off.
And while you asked not to mention "generators", they are so intimately linked with yield that it doesn't really make sense to talk about them separately. What your countfrom function actually returns is a "generator object". It returns this object immediately after it is called, so the function body is not executed at all until something (e.g. a for-loop) requests values from the generator using its method .next().
The yield statement hands back the value that you yield and pauses the function until the generator is asked for its next value.
So when the for loop asks again, the function carries on running and gives you the next value.
The point being that it knows where it left off last time.
Python runs until it hits a yield and then stops and freezes execution. It's not continuing to run. It only reaches "after yield" when the for loop asks the generator for its next value.
It's easy to say that without making reference to generators but the fact is yield and generator are inextricably linked. To really understand it you've got to view them as the same topic.
It's easy to show yourself that what I (and others) have said is true by working with the generator from your example in a more manual way.
A function that yields instead of returning really returns a generator. You can then consume that generator by calling next. You are confused because your loop is taking care of all that in the background for you.
Here it is with the internals opened up:
def countfrom(n):
    while n <= 12:
        print "before yield, n = ", n
        yield n
        n += 1
        print "after yield, n = ", n

your_generator = countfrom(10)
next(your_generator)
print "see the after yield hasn't shown up yet, it's stopped at the first yield"
next(your_generator)
print "now it woke back up and printed the after... and continued through the loop until it got back to yield"
next(your_generator)
print "rinse and repeat"
Yield with and without for loop:
def f1():
    print('f1')
    yield 10
    print(f'f1 done')

def generator_with_for_loop():
    print(f'generator_with_for_loop')
    for f1_gen in f1():
        print(f'f1_gen={f1_gen}')

def generator_without_for_loop():
    print(f'\ngenerator_without_for_loop')
    gen = f1()
    print(f'f1_gen={gen}')
    print(gen.__next__())
    try:
        print(gen.__next__())
    except StopIteration:
        print('done')

if __name__ == '__main__':
    generator_with_for_loop()
    generator_without_for_loop()
"""
generator_with_for_loop
f1
f1_gen=10
f1 done
generator_without_for_loop
f1_gen=<generator object f1 at 0x7fd7201e54a0>
f1
10
f1 done
done
"""
