So I understand generator functions for lazy evaluation, and generator expressions (a.k.a. generator comprehensions) as their syntactic-sugar equivalent.
I understand classes like
class Itertest1:
    def __init__(self):
        self.count = 0
        self.max_repeats = 100

    def __iter__(self):
        print("in __iter__()")
        return self

    def __next__(self):
        if self.count >= self.max_repeats:
            raise StopIteration
        self.count += 1
        print(self.count)
        return self.count
as a way of implementing the iterator interface, i.e. __iter__() and __next__() in one and the same class.
But what then is
class Itertest2:
    def __init__(self):
        self.data = list(range(100))

    def __iter__(self):
        print("in __iter__()")
        for i, dp in enumerate(self.data):
            print("idx:", i)
            yield dp
which uses the yield statement within the __iter__ member function?
Also I noticed that upon calling the __iter__ member function
it = Itertest2().__iter__()
batch = it.__next__()
the print statement is only executed when calling next() for the first time.
Is this due to this weird mixture of yield and __iter__? I think this is quite counterintuitive...
Something equivalent to Itertest2 could be written using a separate iterator class.
class Itertest3:
    def __init__(self):
        self.data = list(range(100))

    def __iter__(self):
        return Itertest3Iterator(self.data)

class Itertest3Iterator:
    def __init__(self, data):
        self.state = enumerate(data)

    def __iter__(self):
        return self

    def __next__(self):
        print("in __iter__()")
        i, dp = next(self.state)  # let the StopIteration exception propagate
        print("idx:", i)
        return dp
Compare this to Itertest1, where the instance of Itertest1 itself carried the state of the iteration around in it. Each call to Itertest1.__iter__ returned the same object (the instance of Itertest1), so they couldn't iterate over the data independently.
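A quick check (assuming the classes above are defined) makes that difference visible:

it3 = Itertest3()
a = iter(it3)
b = iter(it3)
next(a)  # -> 0
next(a)  # -> 1
next(b)  # -> 0; b got its own Itertest3Iterator, so it starts from the beginning

it1 = Itertest1()
c = iter(it1)
d = iter(it1)
next(c)  # -> 1
next(c)  # -> 2
next(d)  # -> 3; d is the very same object as c, so the count just continues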
Notice I put print("in __iter__()") in __next__, not __iter__. As you observed, nothing in a generator function actually executes until the first call to __next__. Calling the generator function itself only creates a generator; it does not actually start executing the code in it.
Having the yield statement anywhere in any function wraps the function code in a (native) generator object, and replaces the function with a stub that gives you said generator object.
So, here, calling __iter__ will give you an anonymous generator object that executes the code you want.
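For instance, with the Itertest2 class from the question:

it = Itertest2().__iter__()   # or iter(Itertest2()); nothing is printed yet
print(type(it))               # <class 'generator'>
batch = next(it)              # only now "in __iter__()" and "idx: 0" are printed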
The main use case for __next__ is to provide a way to write an iterator without relying on (native) generators.
The use case of __iter__ is to distinguish between an object and an iteration state over said object. Consider code like
c = some_iterable()
for a in c:
    for b in c:
        ...  # do something with a and b
You would not want the two interleaved iterations to interfere with each other's state. This is why such a loop would desugar to something like
c = some_iterable()
_iter1 = iter(c)
try:
    while True:
        a = next(_iter1)
        _iter2 = iter(c)
        try:
            while True:
                b = next(_iter2)
                # do something with a and b
        except StopIteration:
            pass
except StopIteration:
    pass
Typically, custom iterators implement a stub __iter__ that returns self, so that iter(iter(x)) is equivalent to iter(x). This is important when writing iterator wrappers.
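For example, with a built-in list iterator:

it = iter([1, 2, 3])
assert iter(it) is it  # the stub __iter__ just returns the iterator itself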
I am learning python these days and have a small question as follows:
Define a class drop_first that has only two methods __init__ and __iter__.
The constructor of drop_first has parameter iterable, to which an iterable is given.
Method __init__ applies function iter to iterable and records the resulting iterator in some attribute. It then calls function next on the iterator once.
Method __iter__ simply returns the iterator recorded in the attribute.
For example, the instance created by drop_first([1,2,3]) allows iteration skipping the first element 1.
for x in drop_first([1,2,3]): print(x)
2 and 3 will be printed.
I answered with the following code:
class drop_first:
    def __init__(self, iterable):
        self.iterable = iterable
        self.index = 1

    def __iter__(self):
        return self

    def __next__(self):
        try:
            result = self.iterable[self.index]
        except IndexError:
            raise StopIteration
        self.index += 1
        return result
Nevertheless, I found that it was required to create a class with only 2 methods inside, and am a little bit confused about the specifications of those two methods. So I am wondering if anyone could give me some explanations about the requirements of the class... Thank you.
Define a class drop_first that has only two methods __init__ and __iter__.
A badly named instruction: this class should really be called DropFirst. But so be it.
The constructor of drop_first has parameter iterable, to which an iterable is given.
def __init__(self, iterable):
Method __init__ applies function iter to iterable and records the resulting iterator in some attribute. It then calls function next on the iterator once.
self.iterator = iter(iterable)
next(self.iterator)
Method __iter__ simply returns the iterator recorded in the attribute.
def __iter__(self):
    return self.iterator
This results in
class DropFirst:  # or, if your instructor insists on it, drop_first
    def __init__(self, iterable):
        self.iterator = iter(iterable)
        next(self.iterator)

    def __iter__(self):
        return self.iterator
But, to be honest, I'd take the following approach (which breaks the specifications, but is more versatile):
class DropFirst:
    def __init__(self, iterable):
        self.iterable = iterable

    def __iter__(self):
        iterator = iter(self.iterable)
        next(iterator)
        return iterator
This works in all cases the original one also does, but can be iterated over several times.
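For instance, with this second version:

d = DropFirst([1, 2, 3])
print(list(d))  # [2, 3]
print(list(d))  # [2, 3] -- a second pass works, unlike with the first version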
In Python 3, it is standard procedure to make a class an iterable and iterator at the same time by defining both the __iter__ and __next__ methods. But I have problems to wrap my head around this. Take this example which creates an iterator that produces only even numbers:
class EvenNumbers:
    def __init__(self, max_):
        self.max_ = max_

    def __iter__(self):
        self.n = 0
        return self

    def __next__(self):
        if self.n <= self.max_:  # edit: self.max --> self.max_
            result = 2 * self.n
            self.n += 1
            return result
        raise StopIteration

instance = EvenNumbers(4)
for entry in instance:
    print(entry)
To my knowledge (correct me if I'm wrong), when I create the loop, an iterator is created by calling something like itr = iter(instance) which internally calls the __iter__ method. This is expected to return an iterator object (which the instance is due to defining __next__ and therefore I can just return self). To get an element from it, next(itr) is called until the exception is raised.
My question here is now: if and how can __iter__ and __next__ be separated, so that the content of the latter function is defined somewhere else? And when could this be useful? I know that I have to change __iter__ so that it returns an iterator.
Btw the idea to do this comes from this site (LINK), which does not state how to implement this.
It sounds like you're confusing iterators and iterables. Iterables have an __iter__ method which returns an iterator. Iterators have a __next__ method which either returns their next value or raises StopIteration. Now in Python, it is stated that iterators are also iterables (but not vice versa) and that iter(iterator) is iterator, so an iterator, itr, should return only itself from its __iter__ method.
Iterators are required to have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted
In code:
class MyIter:
    def __iter__(self):
        return self

    def __next__(self):
        ...  # actual iterator logic
If you want to make a custom iterator class, the easiest way is to inherit from collections.abc.Iterator which you can see defines __iter__ as above (it is also a subclass of collections.abc.Iterable). Then all you need is
class MyIter(collections.abc.Iterator):
    def __next__(self):
        ...
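As a concrete sketch of that pattern (Countdown is just a made-up example class), a complete iterator could look like this:

import collections.abc

class Countdown(collections.abc.Iterator):
    # __iter__ is inherited from collections.abc.Iterator and returns self
    def __init__(self, start):
        self.current = start

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

print(list(Countdown(3)))  # [3, 2, 1]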
There is of course a much easier way to make an iterator, and that's with a generator function:
def fib():
    a = 1
    b = 1
    yield a
    yield b
    while True:
        b, a = a + b, b
        yield b

import itertools
list(itertools.takewhile(lambda x: x < 100, fib()))
# --> [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
Just for reference, this is (simplified) code for an abstract iterator and iterable
from abc import ABC, abstractmethod

class Iterable(ABC):
    @abstractmethod
    def __iter__(self):
        'Returns an instance of Iterator'
        pass

class Iterator(Iterable, ABC):
    @abstractmethod
    def __next__(self):
        'Return the next item from the iterator. When exhausted, raise StopIteration'
        pass

    # overrides Iterable.__iter__
    def __iter__(self):
        return self
I think I have grasped the concept now, even if I do not fully understand the passage from the documentation quoted by @FHTMitchell. I came across an example of how to separate the two methods and wanted to document this.
What I found is a very basic tutorial that clearly distinguishes between the iterable and the iterator (which is the cause of my confusion).
Basically, you define your iterable first as a separate class:
class EvenNumbers:
    def __init__(self, max_):
        self.max = max_

    def __iter__(self):
        self.n = 0
        return EvenNumbersIterator(self)
The __iter__ method only requires an object that has a __next__ method defined. Therefore, you can do this:
class EvenNumbersIterator:
    def __init__(self, source):
        self.source = source

    def __next__(self):
        if self.source.n <= self.source.max:
            result = 2 * self.source.n
            self.source.n += 1
            return result
        else:
            raise StopIteration
This separates the iterator part from the iterable class. It now makes sense that if I define __next__ within the iterable class, I have to return the reference to the instance itself as it basically does 2 jobs at once.
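Usage is unchanged (a quick check with the two classes above):

for entry in EvenNumbers(4):
    print(entry)  # prints 0, 2, 4, 6, 8

Note that the counter n still lives on the EvenNumbers instance, so two simultaneous iterations over the same instance would still share that state; a fully independent design would keep the counter on the iterator object instead.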
I would like to have a function that can, optionally, return or yield the result.
Here is an example.
def f(option=True):
    ...
    for...:
        if option:
            yield result
        else:
            results.append(result)
    if not option:
        return results
Of course, this doesn't work; I have tried it with Python 3 and I always get a generator no matter what value I set for option.
As far as I have understood, Python checks the body of the function and, if a yield is present, the result will be a generator.
Is there any way to get around this and make a function that can return or yield at will?
You can't. Any use of yield makes the function a generator.
You could wrap your function with one that uses list() to store all values the generator produces in a list object and returns that:
def f_wrapper(option=True):
    gen = f()
    if option:
        return gen        # return the generator unchanged
    return list(gen)      # return all values of the generator as a list
However, generally speaking, this is bad design. Don't have your functions alter behaviour like this; stick to one return type (a generator or an object) and don't have it switch between the two.
Consider splitting this into two functions instead:
def f():
    yield result

def f_as_list():
    return list(f())
and use f() if you need the generator, or f_as_list() if you want to have a list instead.
Since list() (and next(), to access just one value of a generator) are built-in functions, you rarely need to use a wrapper. Just call those functions directly:
# access elements one by one
gen = f()
one_value = next(gen)
# convert the generator to a list
all_values = list(f())
What about this?
def make_f_or_generator(option):
    def f():
        return "I am a function."

    def g():
        yield "I am a generator."

    if option:
        return f
    else:
        return g
This gives you at least the choice to create a function or a generator.
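A quick usage sketch:

f = make_f_or_generator(True)
print(f())        # I am a function.

g = make_f_or_generator(False)
print(next(g()))  # I am a generator.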
class based approach
class FunctionAndGenerator:
    def __init__(self):
        self.counter = 0

    def __iter__(self):
        return self

    # You need a variable to indicate if dunder next should return the string or raise StopIteration.
    # Raising StopIteration will stop the loop from iterating more.
    # You'll have to teach next to raise StopIteration at some point.
    def __next__(self):
        self.counter += 1
        if self.counter > 1:
            raise StopIteration
        return f"I'm a generator and I've generated {self.counter} times"

    def __call__(self):
        return "I'm a function"
x = FunctionAndGenerator()
print(x())
for i in x:
    print(i)
I'm a function
I'm a generator and I've generated 1 times
[Program finished]
I have a class that looks like this
class myClass:
    def __init__(self, key, value):
        self.key = key
        self.value = value
where key is a string and value is always a list of elements of myClass, possibly empty.
I want to define my own __iter__ method that returns value.key for each value in self.value. I tried
def __iter__(self):
    return self

def __next__(self):
    try:
        self.value.next().key
    except:
        raise StopIteration
but it's looping forever. What am I doing wrong?
Also if I want to have Python 2 / 3 compatibility, should I add the method
def next(self):
    return self.__next__()
There's no reason for you to implement __next__. You can use __iter__ to return a generator which will do what you want.
class Pair(object):
    def __init__(self, key, value):
        self.key = key
        self.value = value

    def __iter__(self):
        return (v.key for v in self.value)

    # alternative __iter__ function, that allows more complicated logic
    def __iter__(self):
        for v in self.value:
            yield v.key
p = Pair("parent", [Pair("child0", "value0"), Pair("child1", "value1")])
assert list(p) == ["child0", "child1"]
This way of doing things is compatible with both Python 2 and Python 3, as the returned generator will have the required next function in Python 2, and __next__ in Python 3.
You need to extract and preserve an iterator on list self.value -- you can't just call next on a list, you need an iterator on such a list.
So, you need an auxiliary iterator class:
class myClassIter(object):
    def __init__(self, theiter):
        self.theiter = theiter

    def __next__(self):
        return next(self.theiter).key

    next = __next__
which I've also made Py 2/3 compatible with the object base and appropriate aliasing.
Here, I'm assuming every item in the list has a key attribute (so the only expected exception is StopIteration, which you can just propagate). If that is not the case, and you want to just stop the iteration when an item without the attribute is met, the try/except is needed, but keep it tight! -- a crucial design aspect of good exception handling. I.e., if these are indeed your specs:
def __next__(self):
    try: return next(self.theiter).key
    except AttributeError: raise StopIteration
don't catch all exceptions -- only the ones you specifically expect!
Now, in myClass, you'll want:
def __iter__(self):
    return myClassIter(iter(self.value))
This means that myClass is an iterable, not an iterator, so you can, e.g., properly have more than one loop over a myClass instance:
mc = myClass(somekey, funkylist)
for ka in mc:
    for kb in mc:
        whatever(ka, kb)
If mc was itself an iterator, the inner loop would exhaust it and the semantics of the nested loops would therefore be completely different.
If you do indeed want such completely different semantics (i.e. you want mc to be itself an iterator, not just an iterable), then you must dispense with the auxiliary class (but still need to store the iterator over self.value as an instance attribute of myClass) -- that would be a strange, uncomfortable arrangement, but it is (just barely) possible that it is indeed the arrangement your application needs...
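A rough sketch of that less comfortable arrangement, just to make it concrete (the attribute name _iter is only illustrative):

class myClass(object):
    def __init__(self, key, value):
        self.key = key
        self.value = value
        self._iter = iter(self.value)  # iteration state lives on the instance itself

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._iter).key

    next = __next__  # Python 2 compatibility, as discussed in the question

Once such an instance has been iterated over, it is exhausted for good, and nested loops over it interfere with each other.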
My code does not run correctly:
class a(object):
    def __iter(self):
        return 33

b = {'a': 'aaa', 'b': 'bbb'}
c = a()
print b.itervalues()
print c.itervalues()
Please answer with code rather than text, because my English is not very good. Thank you.
a. Spell it right: not
def __iter(self):
but:
def __iter__(self):
with __ before and after iter.
b. Make the body right: not
return 33
but:
yield 33
or
return iter([33])
If you return a value from __iter__, return an iterator (an iterable, as in return [33], is almost as good but not quite...); or else, yield 1+ values, making __iter__ into a generator function (so it intrinsically returns a generator iterator).
c. Call it right: not
a().itervalues()
but, e.g.:
for x in a(): print x
or
print list(a())
itervalues is a method of dict, and has nothing to do with __iter__.
If you fix all three (!) mistakes, the code works better;-).
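Putting the three fixes together, a corrected version (kept in the question's Python 2 style) could look like:

class a(object):
    def __iter__(self):
        yield 33

for x in a(): print x   # prints 33
print list(a())         # [33]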
A few things about your code:
__iter should be __iter__
You're returning 33 in the __iter__ function. You should actually be returning an iterator object. An iterator is an object which keeps returning different values when its next() function is called (for example a sequence of values like 0, 1, 2, 3, etc.).
Here's a working example of an iterator:
class a(object):
    def __init__(self, x=10):
        self.x = x

    def __iter__(self):
        return self

    def next(self):
        if self.x > 0:
            self.x -= 1
            return self.x
        else:
            raise StopIteration

c = a()
for x in c:
    print x
Any object of class a is an iterator object. Calling the __iter__ function is supposed to return the iterator, so it returns itself; as you can see, the a class has a next() function, so its instances are iterator objects.
When the next function is called, it keeps returning consecutive values until it hits zero, and then it raises the StopIteration exception, which (appropriately) stops the iteration.
If this seems a little hazy, I would suggest experimenting with the code and then checking out the documentation here: http://docs.python.org/library/stdtypes.html
Here is a code example that implements the xrange builtin:
class my_xrange(object):
    def __init__(self, start, end, skip=1):
        self.curval = int(start)
        self.lastval = int(end)
        self.skip = int(skip)
        assert(int(skip) != 0)

    def __iter__(self):
        return self

    def next(self):
        if (self.skip > 0) and (self.curval >= self.lastval):
            raise StopIteration()
        elif (self.skip < 0) and (self.curval <= self.lastval):
            raise StopIteration()
        else:
            oldval = self.curval
            self.curval += self.skip
            return oldval

for i in my_xrange(0, 10):
    print i
You are using this language feature incorrectly.
http://docs.python.org/library/stdtypes.html#iterator-types
This above link will explain what the function should be used for.
You can try to see documentation in your native language here: http://wiki.python.org/moin/Languages