I have a simple coroutine
def id():
while True:
x = yield
yield x
I can use it to create a generator and prime it with next
gen = id()
next(gen)
for x in gen:
print(x)
This will print None forever. My intuition is that the generator from id would only yield a value once a value has been sent to it, but in this case it's producing None values all by itself.
Can someone explain this behaviour to me? Is the x = yield statement defaulting x to None?
As others have pointed out, calling next(gen) is equivalent to calling gen.send(None). You can see this if you look at the C code for generator objects:
PyDoc_STRVAR(send_doc,
"send(arg) -> send 'arg' into generator,\n\
return next yielded value or raise StopIteration.");
PyObject *
_PyGen_Send(PyGenObject *gen, PyObject *arg)
{
return gen_send_ex(gen, arg, 0);
}
...
static PyObject *
gen_iternext(PyGenObject *gen)
{
return gen_send_ex(gen, NULL, 0);
}
As you can see, iternext (which is what gets called when you call next(generator_object)), calls the exact same method as generator_object.send, only it always passes in a NULL object. Consider that your for x in gen loop is basically the equivalent of doing this:
iterable = iter(gen):
while True:
try:
x = next(iterable)
except StopIteration:
break
print(x)
And you see what the issue is.
It's important to realize that coroutines and generators are really completely separate concepts, even though they're both implemented using the yield keyword. In general, you shouldn't be iterating over a coroutine, and in general you don't need/want to send things into a generator. I highly recommend David Beazley's PyCon coroutine tutorial for more on this and coroutines in general.
The for statement calls the next method of the generator instead of the send method. According to the documentation (Python 2.7) this results in the yield statement returning None:
generator.next()
Starts the execution of a generator function or
resumes it at the last executed yield expression. When a generator
function is resumed with a next() method, the current yield expression
always evaluates to None.
yield is an expression that is only valid inside of functions. The value of a yield expression is None. So your function yields None, then assigns the result of yield (None) to x, then yields x. To demonstrate what is happening better let's yield something from the first yield and see what happens.
def id():
while True:
x = yield 5
yield x
This will now generate the following sequence:
5
None
5
None
5
None
5
None
...
Related
This question already has answers here:
Return in generator together with yield
(2 answers)
Closed last year.
Why does
yield [cand]
return
lead to different output/behavior than
return [[cand]]
Minimal viable example
uses recursion
the output of the version using yield [1]; return is different than the output of the version using return [[1]]
def foo(i):
if i != 1:
yield [1]
return
yield from foo(i-1)
def bar(i):
if i != 1:
return [[1]]
yield from bar(i-1)
print(list(foo(1))) # [[1]]
print(list(bar(1))) # []
Min viable counter example
does not use recurion
the output of the version using yield [1]; return is the same as the output of the version using return [[1]]
def foo():
yield [1]
return
def foofoo():
yield from foo()
def bar():
return [[1]]
def barbar():
yield from bar()
print(list(foofoo())) # [[1]]
print(list(barbar())) # [[1]]
Full context
I'm solving Leetcode #39: Combination Sum and was wondering why one solution works, but not the other:
Working solution
from functools import cache # requires Python 3.9+
class Solution:
def combinationSum(self, candidates: List[int], target: int) -> List[List[int]]:
#cache
def helper(targ, i=0):
if i == N or targ < (cand := candidates[i]):
return
if targ == cand:
yield [cand]
return
for comb in helper(targ - cand, i):
yield comb + [cand]
yield from helper(targ, i+1)
N = len(candidates)
candidates.sort()
yield from helper(target)
Non-working solution
from functools import cache # requires Python 3.9+
class Solution:
def combinationSum(self, candidates: List[int], target: int) -> List[List[int]]:
#cache
def helper(targ, i=0):
if i == N or targ < (cand := candidates[i]):
return
if targ == cand:
return [[cand]]
for comb in helper(targ - cand, i):
yield comb + [cand]
yield from helper(targ, i+1)
N = len(candidates)
candidates.sort()
yield from helper(target)
Output
On the following input
candidates = [2,3,6,7]
target = 7
print(Solution().combinationSum(candidates, target))
the working solution correctly prints
[[3,2,2],[7]]
while the non-working solution prints
[]
I'm wondering why yield [cand]; return works, but return [[cand]] doesn't.
In a generator function, return just defines the value associated with the StopIteration exception implicitly raised to indicate an iterator is exhausted. It's not produced during iteration, and most iterating constructs (e.g. for loops) intentionally ignore the StopIteration exception (it means the loop is over, you don't care if someone attached random garbage to a message that just means "we're done").
For example, try:
>>> def foo():
... yield 'onlyvalue' # Existence of yield keyword makes this a generator
... return 'returnvalue'
...
>>> f = foo() # Makes a generator object, stores it in f
>>> next(f) # Pull one value from generator
'onlyvalue'
>>> next(f) # There is no other yielded value, so this hits the return; iteration over
--------------------------------------------------------------------------
StopIteration Traceback (most recent call last)
...
StopIteration: 'returnvalue'
As you can see, your return value does get "returned" in a sense (it's not completely discarded), but it's never seen by anything iterating normally, so it's largely useless. Outside of rare cases involving using generators as coroutines (where you're using .send() and .throw() on instances of the generator and manually advancing it with next(genobj)), the return value of a generator won't be seen.
In short, you have to pick one:
Use yield anywhere in a function, and it's a generator (whether or not the code path of a particular call ever reaches a yield) and return just ends generation (while maybe hiding some data in the StopIteration exception). No matter what you do, calling the generator function "returns" a new generator object (which you can loop over until exhausted), it can never return a raw value computed inside the generator function (which doesn't even begin running until you loop over it at least once).
Don't use yield, and return works as expected (because it's not a generator function).
As an example to explain what happens to the return value in normal looping constructs, this is what for x in gen(): effectively expands to a C optimized version of:
__unnamed_iterator = iter(gen())
while True:
try:
x = next(__unnamed_iterator)
except StopIteration: # StopIteration caught here without inspecting it
break # Loop ends, StopIteration exception cleaned even from sys.exc_info() to avoid possible reference cycles
# body of loop goes here
# Outside of loop, there is no StopIteration object left
As you can see, the expanded form of the for loop has to look for a StopIteration to indicate the loop is over, but it doesn't use it. And for anything that's not a generator, the StopIteration never has any associated values; the for loop has no way to report them even if it did (it has to end the loop when it's told iteration is over, and the arguments to StopIteration are explicitly not part of the values iterated anyway). Anything else that consumes the generator (e.g. calling list on it) is doing roughly the same thing as the for loop, ignoring the StopIteration in the same way; nothing except code that specifically expects generators (as opposed to more generalized iterables and iterators) will ever bother to inspect the StopIteration object (at the C layer, there are optimizations that StopIteration objects aren't even produced by most iterators; they return NULL and leave the set exception empty, which all iterator protocol using things know is equivalent to returning NULL and setting a StopIteration object, so for anything but a generator, there isn't even an exception to inspect much of the time).
According to PEP 8 we should be consistent in our function declarations and ensure that they all have the same return-pattern, i.e. all should return an expression or all should not. However, I am not sure how to apply this to generators.
A generator will yield values as long as the code reaches them, unless a return statement is encountered in which case it will stop the iteration. However, I don't see any use-case in which returning a value from a generator function can happen. In that spirit, I don't see why it is useful - from a PEP 8 perspective - to end such a function with the explicit return None. In other words, why do we ought to verbalize a return statement for generators if the return expression is only reached when the yield'ing is over?
Example: in the following code, I don't see how hello() can be used to assign 100 to a variable (thus using the return statement). So why does PEP 8 expect us to write a return statement (be it 100 or None).
def hello():
for i in range(5):
yield i
return 100
h = [x for x in hello()]
g = hello()
print(h)
# [0, 1, 2, 3, 4]
print(g)
# <generator object hello at 0x7fd2f285a7d8>
# can we ever get 100?
You have misread PEP8. PEP8 states:
Be consistent in return statements. Either all return statements in a function should return an expression, or none of them should.
(bold emphasis mine)
You should be consistent with how you use return within a single function, not across your whole project.
Use return, it's the only return statement in the function.
However, I don't see any use-case in which returning a value from a generator function can happen.
The return value of a generator is attached to the StopIteration exception raised:
>>> def gen():
... if False: yield
... return 'Return value'
...
>>> try:
... next(gen())
... except StopIteration as ex:
... print(ex.value)
...
Return value
And this is also the mechanism by which yield from produces a value; the return value of yield from is the value attribute on the StopIteration exception. A generator can thus return a result to code using result = yield from generator by using return result:
>>> def bar():
... result = yield from gen()
... print('gen() returned', result)
...
>>> next(bar(), None)
gen() returned Return value
This feature is used in the Python standard library; e.g. in the asyncio library the value of StopIteration is used to pass along Task results, and the #coroutine decorator uses res = yield from ... to run a wrapped generator or awaitable and pass through the return value.
So, from a PEP-8 point of view, for generators and there are two possibilities:
You are using return to exit the generator early, say in a loop with if. Use return, no need to add None:
def foo():
while bar:
yield ham
if spam:
return
You are using return <something> to exit and set StopIteration.value. Use return <something> consistently throughout your generator, even when returning None:
def foo():
for bar in baz:
yield bar
if spam:
return 'The bar bazzed the spam'
return None
I was experimenting with creating a function that returns an object or works as a generator.
This is a bad idea because, as a best practice, you want functions to reliably return the same types of values, but in the interest of science...
I'm using Python 2, and so range returns a list, and xrange is an iterable (that interestingly also provides a __len__).
def xr(start, stop=None, step=1, gen=True):
if stop is None:
start, stop = 0, start
if gen == True:
for i in xrange(start, stop, step):
yield i
else:
return range(start, stop, step)
I get this error:
File "<stdin>", line 8
SyntaxError: 'return' with argument inside generator
Questions:
Why (beyond the obvious "you can't have both yield and return in a function," assuming that's right) does it do this? Looking at my code, it is not readily apparent why it would be bad to do this.
How would I attempt to get around this? I know I can return xrange instead of yielding each item from xrange, and so I could return a generator created in another function, but is there a better way?
How about using generator expression?
>>> def xr(start, stop=None, step=1, gen=True):
... if stop is None:
... start, stop = 0, start
... if gen == True:
... return (i for i in xrange(start, stop, step)) # <----
... else:
... return range(start, stop, step)
...
>>> xr(2, gen=False)
[0, 1]
>>> xr(2, gen=True)
<generator object <genexpr> at 0x0000000002C1C828>
>>> list(xr(2, gen=True))
[0, 1]
BTW, I would rather define a generator function only. Then use list(xr(..)) if I need a list.
UPDATE Alternatively you can use iter(xrange(start, stop, step)) instead of the generator expression as #DSM commented. See Built-in functions -- iter.
falsetru has given you a way to have a function that returns a generator or a list. I'm answering your other question about why you can't have a return and a yield in the same function.
When you call a generator function, it doesn't actually do anything immediately (see this question). It doesn't execute the function body, but waits until you start iterating over it (or call next on it). Therefore, Python has to know if the function is a generator function or not at the beginning, when you call it, to know whether to run the function body or not.
It doesn't make sense to then have it return some value, because if it's a generator function what it returns is the generator (i.e., the thing you iterate over). If you could have a return inside a generator function, it wouldn't be reached when you called the function, so the function would have to "spontaneously" return a value at some later point (when the return statement was reached as the generator was consumed), which would either be the same as yielding a value at that time, or would be something bizarre and confusing.
I agree with falsetru that it's probably not a great idea to have a function that sometimes returns a generator and sometimes a list. Just call list on the generator if you want a list.
You cannot have both because interpreter - when it encounters keyword yield - treats the function as a generator function. And obviously you cannot have a return statement in a generator function.
Additional point - you don't use generator functions directly, you use them to create (I am tempted to say instantiate) generators.
Simple example
def infinite():
cnt = 1
while True:
yield cnt
cnt += 1
infinite_gen = infinite()
infinite is generator function and infinite_gen is generator
There is this code:
def f():
return 3
return (i for i in range(10))
x = f()
print(type(x)) # int
def g():
return 3
for i in range(10):
yield i
y = g()
print(type(y)) # generator
Why f returns int when there is return generator statement? I guess that yield and generator expression both returns generators (at least when the statement return 3 is removed) but are there some other rules of function compilation when there is once generator expression returned and second time when there is yield keyword inside?
This was tested in Python 3.3
As soon as you use a yield statement in a function body, it becomes a generator. Calling a generator function just returns that generator object. It is no longer a normal function; the generator object has taken over control instead.
From the yield expression documentation:
Using a yield expression in a function definition is sufficient to cause that definition to create a generator function instead of a normal function.
When a generator function is called, it returns an iterator known as a generator. That generator then controls the execution of a generator function. The execution starts when one of the generator’s methods is called.
In a regular function, calling that function immediately switches control to that function body, and you are simply testing the result of the function, set by it's return statement. In a generator function, return still signals the end of the generator function, but that results in a StopIteration exception being raised instead. But until you call one of the 4 generator methods (.__next__(), .send(), .throw() or .close()), the generator function body is not executed at all.
For your specific function f(), you have a regular function, that contains a generator. The function itself is nothing special, other that that it exits early when return 3 is executed. The generator expression on the next line stands on its own, it does not influence the function in which it is defined. You can define it without the function:
>>> (i for i in range(10))
<generator object <genexpr> at 0x101472730>
Using a generator expression produces a generator object, just like using yield in a function, then calling that function produces a generator object. So you could have called g() in f() with the same result as using the generator expression:
def f():
return 3
return g()
g() is still a generator function, but using it in f() does not make f() a generator function too. Only yield can do that.
def f():
return 3
return (i for i in range(10))
is the same as
def f():
return 3
The second return statement never gets executed, just by having a generator expression within f does not make it a generator.
def f():
return 3
#unreachable code below
return (i for i in range(10))
I believe what you meant is:
def f():
yield 3
yield from (i for i in range(10))
Returning a generator doesn't make f a generator function. A generator is just an object, and generator objects can be returned by any function. If you want f to be a generator function, you have to use yield inside the function, as you did with g.
I'm familiar with yield to return a value thanks mostly to this question
but what does yield do when it is on the right side of an assignment?
#coroutine
def protocol(target=None):
while True:
c = (yield)
def coroutine(func):
def start(*args,**kwargs):
cr = func(*args,**kwargs)
cr.next()
return cr
return start
I came across this, on the code samples of this blog, while researching state machines and coroutines.
The yield statement used in a function turns that function into a "generator" (a function that creates an iterator). The resulting iterator is normally resumed by calling next(). However it is possible to send values to the function by calling the method send() instead of next() to resume it:
cr.send(1)
In your example this would assign the value 1 to c each time.
cr.next() is effectively equivalent to cr.send(None)
You can send values to the generator using the send function.
If you execute:
p = protocol()
p.next() # advance to the yield statement, otherwise I can't call send
p.send(5)
then yield will return 5, so inside the generator c will be 5.
Also, if you call p.next(), yield will return None.
You can find more information here.
yield returns a stream of data as per the logic defined within the generator function.
However, send(val) is a way to pass a desired value from outside the generator function.
p.next() doesn't work in python3, next(p) works (built-in) for both python 2,3
p.next() doesn't work with python 3, gives the following error,however it still works in python 2.
Error: 'generator' object has no attribute 'next'
Here's a demonstration:
def fun(li):
if len(li):
val = yield len(li)
print(val)
yield None
g = fun([1,2,3,4,5,6])
next(g) # len(li) i.e. 6 is assigned to val
g.send(8) # 8 is assigned to val