I'm having issues with Python 2.7, whereby an exception raised from a generator is not catchable.
I've lost a fair amount of time, twice, with this behavior.
def gen_function():
    raise Exception("Here.")
    for i in xrange(10):
        yield i
try:
    gen_function()
except Exception as e:
    print("Ex: %s" % (e,))
else:
    print("No exception.")
Output:
No exception.
gen_function() will give you a generator object.
You need to call next() on it to actually run the code.
You can do that directly with the next function:
g = gen_function()
next(g)
or
for i in g:
    pass  # or whatever you want
Both will trigger the exception.
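For example, here's a minimal rework of the snippet from the question (same gen_function as above) in which the exception is actually caught:

def gen_function():
    raise Exception("Here.")
    for i in xrange(10):  # never reached; Python 2 (use range on Python 3)
        yield i

try:
    for i in gen_function():  # iterating runs the body, so the raise fires here
        print(i)
except Exception as e:
    print("Ex: %s" % (e,))  # prints: Ex: Here.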
Calling a generator just gives you the generator object. No code in the generator is actually executed, yet. Usually this isn't obvious since you often apply the generator immediately:
for x in gen_function():
    print x
In this case the exception is raised. But where? To make it more explicit when this happens, I've expanded the for ... in loop into what it essentially does behind the scenes:
generator_obj = gen_function()  # no exception
it = iter(generator_obj)        # no exception (note: iter(generator_obj) is generator_obj)
while True:
    try:
        x = it.next()  # exception raised here
    except StopIteration:
        break
    print x
Related
Why does this code work fine and not throw any exceptions?
def myzip(*args):
    iters = [iter(arg) for arg in args]
    try:
        while True:
            yield tuple([next(it) for it in iters])
    except StopIteration:
        return

for x, y, z in myzip([1, 2], [3, 4], [5, 6]):
    print(x, y, z)
But if this line
    yield tuple([next(it) for it in iters])
is replaced by
    yield tuple(next(it) for it in iters)
then everything stops working and throws a RuntimeError?
This is a feature introduced in Python 3.5, rather than a bug. Per PEP 479, a StopIteration raised inside a generator is intentionally replaced with a RuntimeError, so that iteration over a generator can only be ended by the generator returning, at which point a StopIteration is raised on the caller's behalf. In your myzip, the list comprehension lets the StopIteration from next(it) propagate into your own try/except, where it is caught inside the generator's frame; the generator-expression version raises it inside a nested generator, where it becomes a RuntimeError that your except StopIteration does not catch.
Otherwise, prior to Python 3.5, a StopIteration exception raised anywhere in a generator would stop the generator rather than propagate, so that given:
    a = list(F(x) for x in xs)
    a = [F(x) for x in xs]
the former would get a truncated result if F(x) raised a StopIteration at some point during the iteration, which made it hard to debug, while the latter would propagate the exception raised from F(x). The goal of the change is to make the two statements behave the same, which is why it affects generators but not list comprehensions.
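As a quick illustration of that difference (F and xs here are stand-ins of my own, not from the question), on Python 3.7 and later:

def F(x):
    if x == 2:
        raise StopIteration  # simulates a stray next() on an exhausted iterator
    return x * 10

xs = [1, 2, 3]

try:
    a = list(F(x) for x in xs)
except RuntimeError as e:
    # PEP 479 converts the leaked StopIteration into a RuntimeError
    print("generator expression:", e)

try:
    a = [F(x) for x in xs]
except StopIteration:
    # a list comprehension simply propagates the exception
    print("list comprehension: StopIteration propagated")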
For a long time I thought you couldn't put return in front of a yield statement. But actually you can:
def gen():
    return (yield 42)
which is similar to
def gen():
    yield 42
    return
And the only usage I can think of is to attach the sent value to StopIteration: PEP 380
return expr in a generator causes StopIteration(expr) to be raised
upon exit from the generator.
def gen():
    return (yield 42)

g = gen()
print(next(g))  # 42
try:
    g.send('AAAA')
except StopIteration as e:
    print(e.value)  # 'AAAA'
But this can be done using an extra variable too, which is more explicit:
def gen():
    a = yield 42
    return a

g = gen()
print(next(g))
try:
    g.send('AAAA')
except StopIteration as e:
    print(e.value)  # 'AAAA'
So it seems return (yield xxx) is merely syntactic sugar. Am I missing something?
Inside a generator, the expression (yield 42) yields the value 42, but it also evaluates to a value: either None, if you use next(generator), or the given value if you use generator.send(value).
So as you say, you could use an intermediate variable to get the same behavior, not because this is syntactic sugar, but because the yield expression literally evaluates to the value you send it.
You could equally do something like
def my_generator():
    return (yield (yield 42) + 10)
If we call this, using the sequence of calls:
g = my_generator()
print(next(g))
try:
    print('first response:', g.send(1))
    print('second response:', g.send(22))
    print('third response:', g.send(3))
except StopIteration as e:
    print('stopped at', e.value)
First we get the output 42, and the generator is essentially paused in a state you could describe as return (yield <input will go here> + 10).
If we then call g.send(1), we get the output 11, and the generator is now in the state
return <input will go here>. Sending g.send(22) then throws a StopIteration(22), because of the way return is handled in generators.
So you never get to the third send, because of the exception.
I hope this example makes it a bit more apparent how yield works in generators, and why the syntax return (yield something) is nothing special or exotic: it works exactly how you'd expect.
As for the literal question, when would you do this? Whenever you want to yield something and then later raise a StopIteration echoing the value the user sent to the generator, because that is literally what the code states. I expect that such behavior is very rarely wanted.
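One place where the return value does genuinely matter is yield from delegation (PEP 380), where it becomes the value of the yield from expression. A small sketch (the names inner and outer are mine):

def inner():
    return (yield 42)  # the sent value becomes the generator's return value

def outer():
    result = yield from inner()  # yield from unpacks StopIteration.value
    print('inner returned:', result)
    yield 'done'

g = outer()
print(next(g))       # 42, yielded from within inner()
print(g.send('hi'))  # prints "inner returned: hi", then yields 'done'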
Why is it that when an exhausted generator is called several times, StopIteration is raised every time, rather than just on the first attempt? Aren't subsequent calls meaningless, and don't they indicate a likely bug in the caller's code?
def gen_func():
    yield 1
    yield 2

gen = gen_func()
next(gen)
next(gen)
next(gen)  # StopIteration, as expected
next(gen)  # why StopIteration again, and not something that warns me I'm doing something wrong?
This also leads to the following behavior when someone accidentally reuses an exhausted generator:
def do_work(gen):
    for x in gen:
        # do stuff with x
        pass

    # here I forgot that I already used up gen,
    # so this loop does nothing, without raising any exception or warning
    for x in gen:
        # do stuff with x
        pass

def gen_func():
    yield 1
    yield 2

gen = gen_func()
do_work(gen)
If second and later attempts to advance an exhausted generator raised a different exception, it would be easier to catch this type of bug.
Perhaps there's an important use case for calling exhausted generators multiple times and getting StopIteration?
Perhaps there's an important use case for calling exhausted generators multiple times and getting StopIteration?
There is: specifically, when you want to draw from the same iterator in more than one place. Here's an example from the itertools docs that relies on this behavior:
from itertools import zip_longest

def grouper(iterable, n, fillvalue=None):
    "Collect data into fixed-length chunks or blocks"
    # grouper('ABCDEFG', 3, 'x') --> ABC DEF Gxx
    args = [iter(iterable)] * n
    return zip_longest(*args, fillvalue=fillvalue)
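To see why this relies on repeated StopIteration: args holds n references to the same iterator, so once it is exhausted, zip_longest keeps calling next() on that same iterator to fill out the last group and expects StopIteration every time:

print(list(grouper('ABCDEFG', 3, 'x')))
# [('A', 'B', 'C'), ('D', 'E', 'F'), ('G', 'x', 'x')]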
It is part of the iterator protocol:
Once an iterator’s __next__() method raises StopIteration, it must
continue to do so on subsequent calls. Implementations that do not
obey this property are deemed broken.
Source: https://docs.python.org/3/library/stdtypes.html#iterator-types
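Built-in iterators behave exactly this way; for example:

it = iter([1])
print(next(it))  # 1
for _ in range(3):
    try:
        next(it)
    except StopIteration:
        print("StopIteration")  # raised consistently on every further call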
Here's an implementation of a wrapper that raises an error whenever StopIteration would be raised more than once. As already noted by VPfB, this implementation is considered broken by the iterator protocol:
#!/usr/bin/env python3.8
from typing import TypeVar, Iterator

"""
https://docs.python.org/3/library/stdtypes.html#iterator-types
This is considered broken by the iterator protocol, god knows why
"""


class IteratorExhaustedError(Exception):
    """Exception raised when exhausted iterators are ``next``ed"""


T = TypeVar("T")


class reuse_guard(Iterator[T]):
    """
    Wraps an iterator so that StopIteration is only raised once;
    after that, ``IteratorExhaustedError`` will be raised to detect
    misuse of fixed-size iterators.
    """

    def __init__(self, iterator: Iterator[T]):
        self._iterated: bool = False
        self._iterator = iterator

    def __next__(self) -> T:
        try:
            return next(self._iterator)
        except StopIteration as e:
            if self._iterated:
                raise IteratorExhaustedError(
                    "This iterator has already reached its end")
            self._iterated = True
            raise e

    def __iter__(self) -> Iterator[T]:
        return self
Example:
In [48]: iterator = reuse_guard(iter((1, 2, 3, 4)))

In [49]: list(iterator)
Out[49]: [1, 2, 3, 4]

In [50]: list(iterator)
---------------------------------------------------------------------------
StopIteration                             Traceback (most recent call last)
<ipython-input-47-456650faec86> in __next__(self)
     19         try:
---> 20             return next(self._iterator)
     21         except StopIteration as e:

StopIteration:

During handling of the above exception, another exception occurred:

IteratorExhaustedError                    Traceback (most recent call last)
<ipython-input-50-5070d0fe4365> in <module>
----> 1 list(iterator)

<ipython-input-47-456650faec86> in __next__(self)
     21         except StopIteration as e:
     22             if self._iterated:
---> 23                 raise IteratorExhaustedError(
     24                     "This iterator has already reached its end")
     25             self._iterated = True

IteratorExhaustedError: This iterator has already reached its end
Edit:
After revisiting the documentation on the iterator protocol, it seems to me that the statement that iterators which do not continue to raise StopIteration are to be deemed broken is aimed more at iterators that yield values again instead of raising the exception; in that case the rule makes it clear that an iterator should not be used once it has been exhausted. This is merely my interpretation, though.
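For illustration, here's a sketch (my own, hypothetical) of the kind of iterator I believe that wording targets: one that starts yielding values again after it has been exhausted:

class Flaky:
    """Broken per the protocol: it 'resets' after exhaustion."""

    def __init__(self, data):
        self._data = list(data)
        self._i = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._i >= len(self._data):
            self._i = 0  # resets instead of staying exhausted
            raise StopIteration
        value = self._data[self._i]
        self._i += 1
        return value

f = Flaky([1, 2])
print(list(f))  # [1, 2]
print(list(f))  # [1, 2] again -- code expecting a one-shot iterator is misled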
I have a basic generator function that raises an exception if its parameters are not correct before doing any yield.
def my_generator(n):
    if not isinstance(n, int):
        raise TypeError("Expecting an integer")
    for i in range(1, 3):
        yield n
I wanted to cover my whole project with unit tests, so I implemented this test function:
import pytest

from my_package import my_generator

@pytest.mark.parametrize("n, expected_exception", [
    ("1", TypeError), (1.0, TypeError), ([1], TypeError)
])
def test_my_generator_with_bad_parameters(n, expected_exception):
    with pytest.raises(expected_exception):
        my_generator(n)
But when I'm running pytest, I get:
Failed: DID NOT RAISE
However, if I modify my test to iterate over the resulting generator, the test passes.
def test_my_generator_with_bad_parameters(n, expected_exception):
    res = my_generator(n)
    with pytest.raises(expected_exception):
        next(res)
How am I supposed to write this test? Is there a way to modify my_generator so that the first implementation of my unit test passes (assuming the function remains a generator)?
Normally, it's fine to wait for the exception until your generator is actually used, since most of the time iteration happens right away, in the same for statement or list() call.
If you really need the checks to run at the time the generator is created, you can wrap the generator in an inner function:
def my_generator(n):
    if not isinstance(n, int):
        raise TypeError("Expecting an integer")

    def generator():
        for i in range(1, 3):
            yield n

    return generator()
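With this version, the first test from the question should pass, since the type check now runs at call time (a quick sanity check, assuming pytest is installed):

import pytest

with pytest.raises(TypeError):
    my_generator("1")  # raises before any next() call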
In Python 2 it was an error to have a return with a value together with yield in a function definition. But for this code in Python 3.3:
def f():
    return 3
    yield 2

x = f()
print(x.__next__())
there is no error about return being used in a function with yield. However, when __next__ is called, a StopIteration exception is raised. Why isn't the value 3 simply returned? Is the return somehow ignored?
This is a new feature in Python 3.3. Much like return in a generator has long been equivalent to raise StopIteration(), return <something> in a generator is now equivalent to raise StopIteration(<something>). For that reason, the exception you're seeing should be printed as StopIteration: 3, and the value is accessible through the attribute value on the exception object. If the generator is delegated to using the (also new) yield from syntax, that value becomes the result of the yield from expression. See PEP 380 for details.
def f():
    return 1
    yield 2

def g():
    x = yield from f()
    print(x)

# g is still a generator, so we need to iterate over it to run it:
for _ in g():
    pass
This prints 1, but not 2.
The return value is not ignored, but generators only yield values; a return just ends the generator, in this case early. Advancing the generator never reaches the yield statement in that case.

Whenever an iterator reaches the end of the values to yield, StopIteration must be raised. Generators are no exception. As of Python 3.3, however, any return expression becomes the value of the exception:
>>> def gen():
...     return 3
...     yield 2
...
>>> try:
...     next(gen())
... except StopIteration as ex:
...     e = ex
...
>>> e
StopIteration(3,)
>>> e.value
3
Use the next() function to advance iterators, instead of calling .__next__() directly:
print(next(x))