I'm trying to figure out how to make an iterator. Below is an iterator that works fine.
class DoubleIt:
    def __init__(self):
        self.start = 1
    def __iter__(self):
        self.max = 10
        return self
    def __next__(self):
        if self.start < self.max:
            self.start *= 2
            return self.start
        else:
            raise StopIteration
obj = DoubleIt()
i = iter(obj)
print(next(i))
However, when I pass 16 as the second argument to iter() (I expect the iterator to stop when it returns 16):
i = iter(DoubleIt(), 16)
print(next(i))
It throws TypeError: iter(v, w): v must be callable
So I tried passing the class itself instead:
i = iter(DoubleIt, 16)
print(next(i))
It prints <__main__.DoubleIt object at 0x7f4dcd4459e8>, which is not what I expected.
I checked the programiz page on iter(), https://www.programiz.com/python-programming/methods/built-in/iter
It says that a callable object must be passed as the first argument in order to use the second argument, but it doesn't say whether a user-defined object can be passed that way.
So my question is: is there a way to do this? Can the second argument be used with a user-defined object?
The documentation could be a bit clearer on this; it only states:
iter(object[, sentinel])
...
The iterator created in this case will call object with no arguments for each call to its __next__() method; if the value returned is equal to sentinel, StopIteration will be raised, otherwise the value will be returned.
What is maybe not said perfectly clearly is that what the iterator yields is whatever the callable returns. And since your callable is a class (with no arguments), it returns a new instance of the class every iteration.
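To see this for yourself, here is a small sketch (the class is cut down to just its constructor, and the variable names are mine) showing that every value produced by iter(DoubleIt, 16) is a brand-new instance, none of which ever compares equal to 16:

```python
class DoubleIt:
    def __init__(self):
        self.start = 1

# The class itself is the callable, so each next() calls DoubleIt()
calls = iter(DoubleIt, 16)
first = next(calls)
second = next(calls)

print(type(first).__name__)  # DoubleIt
print(first is second)       # False: a fresh object on every call
```

Since no instance equals the sentinel, this iterator would never stop on its own.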
One way around this is to make your class callable and delegate it to the __next__ method:
class DoubleIt:
    def __init__(self):
        self.start = 1
    def __iter__(self):
        return self
    def __next__(self):
        self.start *= 2
        return self.start
    __call__ = __next__
i = iter(DoubleIt(), 16)
print(next(i))
# 2
print(list(i))
# [4, 8]
This has the advantage (or disadvantage) that it is an infinite iterator that is only stopped by the sentinel value passed to iter.
Another way is to make the maximum an argument of the class:
class DoubleIt:
    def __init__(self, max=10):
        self.start = 1
        self.max = max
    def __iter__(self):
        return self
    def __next__(self):
        if self.start < self.max:
            self.start *= 2
            return self.start
        else:
            raise StopIteration
i = iter(DoubleIt(max=16))
print(next(i))
# 2
print(list(i))
# [4, 8, 16]
One difference to note is that iter stops when it encounters the sentinel value (and does not yield it), whereas this second way compares with < before doubling (like your code), so the last doubling can land exactly on the maximum, which is then yielded.
Here's an example of a doubler routine that would work with the two argument mode of iter:
count = 1
def nextcount():
    global count
    count *= 2
    return count
print(list(iter(nextcount, 16)))
# Produces [2, 4, 8]
This mode involves iter creating the iterator for us. Note that we need to reset count before it can work again; it only works given a callable (such as a function or bound method) that has side effects (changing the counter), and the iterator will only stop upon encountering exactly the sentinel value.
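If the global bothers you, one possible variation is to keep the counter in a closure, so each call to a factory gives a fresh callable and nothing has to be reset by hand. Note that make_doubler is a hypothetical helper name of my own, not something from the standard library:

```python
def make_doubler(start=1):
    """Return a zero-argument callable that doubles its own private counter."""
    count = start

    def nextcount():
        nonlocal count
        count *= 2
        return count

    return nextcount

# Each factory call starts from 1 again, so no manual reset is needed
print(list(iter(make_doubler(), 16)))  # [2, 4, 8]
print(list(iter(make_doubler(), 16)))  # [2, 4, 8]
```

The same caveat applies: the iterator only stops if the callable returns exactly the sentinel.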
Your DoubleIt class provided no particular protocol for setting a max value, and iter doesn't expect or use any such protocol either. The alternate mode of iter creates an iterator from a callable and a sentinel value, quite independent of the iterable or iterator protocols.
The behaviour you expected is more akin to what itertools.takewhile or itertools.islice do, manipulating one iterator to create another.
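As a rough sketch of that idea, assuming a plain generator function (doubles is my own name) as the underlying infinite iterator, takewhile cuts it off by a condition and islice by a count:

```python
from itertools import islice, takewhile

def doubles():
    """Yield 2, 4, 8, ... forever."""
    value = 1
    while True:
        value *= 2
        yield value

# Stop as soon as a value exceeds 16 (16 itself is still yielded)
print(list(takewhile(lambda v: v <= 16, doubles())))  # [2, 4, 8, 16]

# Stop after a fixed number of items
print(list(islice(doubles(), 3)))  # [2, 4, 8]
```

Unlike the sentinel form of iter, takewhile does not need a value to match exactly, so it also stops cleanly when the limit is not a power of two.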
Another way to make an iterable object is to implement the sequence protocol:
class DoubleSeq:
    def __init__(self, steps):
        self.steps = steps
    def __len__(self):
        return self.steps
    def __getitem__(self, iteration):
        if iteration >= self.steps:
            raise IndexError()
        return 2**iteration
print(list(iter(DoubleSeq(4))))
# Produces [1, 2, 4, 8]
Note that DoubleSeq isn't an iterator at all; iter created one for us using the sequence protocol. DoubleSeq doesn't hold the iteration counter, the iterator does.
If I understand properly, in Python we have:
Iterables = __iter__() is implemented
Iterators = __iter__() returns self & __next__() is implemented
Generators = an iterator created with a yield statement or a generator expression.
Question: Are there categories above that are always/never consumable?
By consumable I mean iterating through them "destroys" the iterable; like zip() (consumable) vs range() (not consumable).
All iterators are consumed; the reason you might not think so is that when you use an iterable with something like
for x in [1,2,3]:
the for loop is creating a new iterator for you behind the scenes. In fact, a list is not an iterator; iter([1,2,3]) returns something of type list_iterator, not the list itself.
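A quick way to check this yourself (a small sketch, the variable names are mine): an iterator is used up after one pass, while the list it came from can hand out a new iterator every time. The same test, iter(obj) is obj, also tells zip and range apart:

```python
numbers = [1, 2, 3]
it = iter(numbers)

print(type(it).__name__)  # list_iterator
print(list(it))           # [1, 2, 3]
print(list(it))           # []  (the iterator is exhausted)
print(list(numbers))      # [1, 2, 3]  (the list gives a fresh iterator)

pairs = zip(numbers, "abc")
r = range(3)
print(iter(pairs) is pairs)  # True: zip objects are their own iterator
print(iter(r) is r)          # False: range hands out separate iterators
```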
Regarding the example you linked to in a comment, instead of
class PowTwo:
    def __init__(self, max=0):
        self.max = max
    def __iter__(self):
        self.n = 0
        return self
    def __next__(self):
        if self.n <= self.max:
            result = 2 ** self.n
            self.n += 1
            return result
        else:
            raise StopIteration
which has the side effect of modifying the iterator in the act of returning it, I would do something like
class PowTwoIterator:
    def __init__(self, max=0):
        self.max = max
        self._restart()
    def _restart(self):
        self._n = 0
    def __iter__(self):
        return self
    def __next__(self):
        if self._n <= self.max:
            result = 2 ** self._n
            self._n += 1
            return result
        else:
            raise StopIteration
Now, the only way you can modify the state of the object is to do so explicitly (and even that should not be done lightly, since both _n and _restart are marked as not being part of the public interface).
The change in the name reminds you that this is first and foremost an iterator, not an iterable that you can get independent iterators from.
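For contrast, if you do want an iterable that gives out independent iterators, one minimal sketch (my own variation, not code from the linked example) is to write __iter__ as a generator function, so every call to iter() starts a fresh pass:

```python
class PowTwo:
    """Iterable over 2**0 .. 2**max; each iter() call starts from scratch."""
    def __init__(self, max=0):
        self.max = max

    def __iter__(self):
        # A generator function: calling it returns a new generator object
        for n in range(self.max + 1):
            yield 2 ** n

powers = PowTwo(3)
print(list(powers))  # [1, 2, 4, 8]
print(list(powers))  # [1, 2, 4, 8]  (not consumed by the first pass)
```

Here no iteration state lives on the PowTwo object at all, so there is nothing to restart.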
I am learning python these days and have a small question as follows:
Define a class drop_first that has only two methods, __init__ and __iter__.
The constructor of drop_first has parameter iterable, to which an iterable is given.
Method __init__ applies function iter to iterable and records the resulting iterator in some attribute. It then calls function next on the iterator once.
Method __iter__ simply returns the iterator recorded in the attribute.
For example, the instance created by drop_first([1,2,3]) allows iteration skipping the first element 1.
for x in drop_first([1,2,3]): print(x)
2 and 3 will be printed.
I answered with the following code:
class drop_first:
    def __init__(self, iterable):
        self.iterable = iterable
        self.index = 1
    def __iter__(self):
        return self
    def __next__(self):
        try:
            result = self.iterable[self.index]
        except IndexError:
            raise StopIteration
        self.index += 1
        return result
Nevertheless, I found that the class is required to have only 2 methods, and I am a little confused about the specifications of those two methods. So I am wondering if anyone could give me some explanation of the requirements for the class. Thank you.
Define a class drop_first that has only two methods, __init__ and __iter__.
Bad instruction. By convention this class should be called DropFirst. But so be it.
The constructor of drop_first has parameter iterable, to which an iterable is given.
def __init__(self, iterable):
Method __init__ applies function iter to iterable and records the resulting iterator in some attribute. It then calls function next on the iterator once.
    self.iterator = iter(iterable)
    next(self.iterator)
Method __iter__ simply returns the iterator recorded in the attribute.
def __iter__(self):
    return self.iterator
This results in
class DropFirst: # or, if your instructor insists on it, drop_first
    def __init__(self, iterable):
        self.iterator = iter(iterable)
        next(self.iterator)
    def __iter__(self):
        return self.iterator
But, to be honest, I'd take the following approach (which breaks the specifications, but is more versatile):
class DropFirst:
    def __init__(self, iterable):
        self.iterable = iterable
    def __iter__(self):
        iterator = iter(self.iterable)
        next(iterator)
        return iterator
This works in all cases the original one also does, but can be iterated over several times.
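Outside of the exercise, the same effect can be had from itertools.islice. The sketch below is only an illustration (drop_first_lazy is a made-up name, and it is a function rather than the class the assignment asks for). One practical difference: both class versions above let the bare next() call raise StopIteration when given an empty iterable, whereas islice simply yields nothing:

```python
from itertools import islice

def drop_first_lazy(iterable):
    """Return an iterator over iterable that skips its first element."""
    # islice(iterable, 1, None) means: start at index 1, no upper bound
    return islice(iterable, 1, None)

for x in drop_first_lazy([1, 2, 3]):
    print(x)  # prints 2, then 3

print(list(drop_first_lazy([])))  # []
```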
In Python 3, it is standard procedure to make a class an iterable and iterator at the same time by defining both the __iter__ and __next__ methods. But I have trouble wrapping my head around this. Take this example, which creates an iterator that produces only even numbers:
class EvenNumbers:
    def __init__(self, max_):
        self.max_ = max_
    def __iter__(self):
        self.n = 0
        return self
    def __next__(self):
        if self.n <= self.max_: # edit: self.max --> self.max_
            result = 2 * self.n
            self.n += 1
            return result
        raise StopIteration
instance = EvenNumbers(4)
for entry in instance:
    print(entry)
To my knowledge (correct me if I'm wrong), when I create the loop, an iterator is created by calling something like itr = iter(instance) which internally calls the __iter__ method. This is expected to return an iterator object (which the instance is due to defining __next__ and therefore I can just return self). To get an element from it, next(itr) is called until the exception is raised.
My question here is now: if and how can __iter__ and __next__ be separated, so that the content of the latter function is defined somewhere else? And when could this be useful? I know that I have to change __iter__ so that it returns an iterator.
Btw the idea to do this comes from this site (LINK), which does not state how to implement this.
It sounds like you're confusing iterators and iterables. Iterables have an __iter__ method which returns an iterator. Iterators have a __next__ method which either returns their next value or raises StopIteration. Now in Python, it is stated that iterators are also iterables (but not vice versa) and that iter(iterator) is iterator, so an iterator, itr, should return only itself from its __iter__ method.
Iterators are required to have an __iter__() method that returns the iterator object itself so every iterator is also iterable and may be used in most places where other iterables are accepted
In code:
class MyIter:
    def __iter__(self):
        return self
    def __next__(self):
        # actual iterator logic
        ...
If you want to make a custom iterator class, the easiest way is to inherit from collections.abc.Iterator which you can see defines __iter__ as above (it is also a subclass of collections.abc.Iterable). Then all you need is
import collections.abc

class MyIter(collections.abc.Iterator):
    def __next__(self):
        ...
There is of course a much easier way to make an iterator, and that's with a generator function
import itertools

def fib():
    a = 1
    b = 1
    yield a
    yield b
    while True:
        b, a = a + b, b
        yield b
list(itertools.takewhile(lambda x: x < 100, fib()))
# --> [1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
Just for reference, this is (simplified) code for an abstract iterator and iterable
from abc import ABC, abstractmethod
class Iterable(ABC):
    @abstractmethod
    def __iter__(self):
        'Returns an instance of Iterator'
        pass
class Iterator(Iterable, ABC):
    @abstractmethod
    def __next__(self):
        'Return the next item from the iterator. When exhausted, raise StopIteration'
        pass
    # overrides Iterable.__iter__
    def __iter__(self):
        return self
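To tie this together, here is a small sketch using the real collections.abc classes (Countdown is a made-up example class). It only defines __next__, inherits __iter__ from Iterator, and the isinstance checks show which side of the iterable/iterator line each object falls on:

```python
from collections.abc import Iterable, Iterator

class Countdown(Iterator):
    """Counts down from start to 1; __iter__ is inherited from Iterator."""
    def __init__(self, start):
        self.current = start

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

c = Countdown(3)
print(iter(c) is c)                  # True: an iterator returns itself
print(isinstance(c, Iterable))       # True: every iterator is iterable
print(isinstance([1, 2], Iterator))  # False: a list is only iterable
print(list(c))                       # [3, 2, 1]
```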
I think I have grasped the concept now, even if I do not fully understand the passage from the documentation quoted by @FHTMitchell. I came across an example on how to separate the two methods and wanted to document this.
What I found is a very basic tutorial that clearly distinguishes between the iterable and the iterator (which is the cause of my confusion).
Basically, you define your iterable first as a separate class:
class EvenNumbers:
    def __init__(self, max_):
        self.max = max_
    def __iter__(self):
        self.n = 0
        return EvenNumbersIterator(self)
The __iter__ method only needs to return an object that has a __next__ method defined. Therefore, you can do this:
class EvenNumbersIterator:
    def __init__(self, source):
        self.source = source
    def __next__(self):
        if self.source.n <= self.source.max:
            result = 2 * self.source.n
            self.source.n += 1
            return result
        else:
            raise StopIteration
This separates the iterator part from the iterable class. It now makes sense that if I define __next__ within the iterable class, I have to return the reference to the instance itself as it basically does 2 jobs at once.
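One caveat with the version above is that the counter n still lives on the iterable, so two loops over the same EvenNumbers object share it. A possible variation (my own sketch, not the tutorial's code) keeps the counter on the iterator instead and also gives the iterator an __iter__ that returns itself, so nested loops stay independent:

```python
class EvenNumbers:
    def __init__(self, max_):
        self.max = max_

    def __iter__(self):
        # Hand out a new iterator with its own counter on every call
        return EvenNumbersIterator(self.max)

class EvenNumbersIterator:
    def __init__(self, max_):
        self.max = max_
        self.n = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.n <= self.max:
            result = 2 * self.n
            self.n += 1
            return result
        raise StopIteration

evens = EvenNumbers(2)
print(list(evens))  # [0, 2, 4]

# Nested loops over the same object do not disturb each other
pairs = [(a, b) for a in evens for b in evens]
print(len(pairs))  # 9
```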
Can someone explain to me how the __iter__() and __next__() functions handle indices? Are they base 0 or base 1?
I have been playing around with it, but I'd like to know what Python is actually doing on the back end. I tried the example class below:
>>> class my_class:
...     def __init__(self, *stuff):
...         self.__stuff = stuff
...     def __iter__(self):
...         self.__n = 0
...         return iter(self.__stuff)
...     def __next__(self):
...         if self.__n <= len(self.__stuff):
...             self.__n += 1
...             return self.__stuff(self.__n)
...         else:
...             raise StopIteration
...
>>> x = my_class(1, 2, 3, 4)
>>> for each in x:
...     print(each)
...
1
2
3
4
Unless I'm mistaken, the first self.__n value that __next__() uses should be 1, which should produce this:
>>> for each in x:
...     print(each)
...
2
3
4
What am I missing? How does it know to start at self.__stuff[0]?
When you run for each in x:, the loop never touches the __next__() in your class definition, so it starts at 1, the first element of your attribute, rather than 2.
Even if you call something like print(next(x)) yourself, it will give you 'TypeError: 'tuple' object is not callable', because self.__stuff(self.__n) is invalid: self.__stuff is a tuple and self.__n is an integer. You can only index a tuple, tuple[int], not call it, tuple(int).
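If you fix that call to use indexing (self.__stuff[self.__n]) and then run the following after your code, it prints the output you expected and then raises an exception, because the <= check lets the index run one past the end of the tuple:
for each in x:
    print(next(x))
Result:
2
3
4
IndexError: tuple index out of range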
When you use my_class in a for loop, Python first calls __init__ (when the object is created), then __iter__, and finally __next__ on whatever __iter__ returned.
In your code, __iter__ returns iter(self.__stuff), a tuple iterator, so your own __next__ is never called. That is why you see the output you do.
If you want your __next__ to be called, return self from __iter__, like this (here the self.__n that __next__ uses starts from 1, so the output begins at 2):
class my_class:
    def __init__(self, *stuff):
        self.__stuff = stuff
    def __iter__(self):
        self.__n = 0
        print('__iter__ is called')
        return self
    def __next__(self):
        print('__next__ is called')
        # stop before the index runs past the end of the tuple
        if self.__n < len(self.__stuff) - 1:
            self.__n += 1
            # index the tuple with [], not ()
            return self.__stuff[self.__n]
        else:
            raise StopIteration
Tip: you can use print to help you understand what the code is doing, like the print calls in the code above.
The __iter__() method returns iter(self.__stuff) instead of self. As such, the tuple passed to __init__() is iterated over, not the object.
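A small experiment makes this visible. In the sketch below (Wrapper is a made-up stand-in for my_class), __next__ raises an error on purpose, and the loop still runs cleanly because only the tuple's own iterator is ever asked for values:

```python
class Wrapper:
    def __init__(self, *stuff):
        self._stuff = stuff

    def __iter__(self):
        # Delegates to the tuple: the returned object is a tuple_iterator
        return iter(self._stuff)

    def __next__(self):
        # Deliberately broken to prove that iteration never calls it
        raise RuntimeError("never reached by a for loop")

w = Wrapper(1, 2, 3, 4)
print(list(w))                 # [1, 2, 3, 4]
print(type(iter(w)).__name__)  # tuple_iterator
```

The tuple iterator keeps its own position internally, starting at index 0, which is why the output begins with the first element.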
I've learnt from the official documentation of Python 2.7.8 how to work with iterators and generators. I have a question that comes from curiosity.
it = iter("abcde")
print it
>>> <iterator object at 0x7ff4c2b3bad0>
class example1():
    def __init__(self, word):
        self.word = word
        self.index = len(word)
    def __iter__(self):
        for x in range(self.index - 1, -1, -1):
            yield self.word[x]
a = example1("altalena")
print iter(a)
>>> <generator object __iter__ at 0x7f24712000a0>
In the above examples, when I print the iterators, I read "iterator" or "generator" object and the hexadecimal ID. Why can't I get the same with the following code?
class example2():
    def __init__(self, word):
        self.word = word
        self.index = len(word)
    def __iter__(self):
        return self
    def next(self):
        if self.index == 0:
            raise StopIteration
        self.index = self.index - 1
        return self.word[self.index]
a = example2("altalena")
print iter(a)
>>> <__main__.example2 instance at 0x7f89ee2de440>
I think it is caused by "return self" in __iter__, which hands back the class instance, but I don't know how to get a more appropriate output. It may be a pointless question, but I don't know why it happens and how to avoid it.
Generators are a type of iterator. Your example1 class is returning a generator because you used yield in the __iter__; it is a separate iterator object, returned from __iter__. This makes your example1 class iterable, not an iterator itself.
In your second example your class __iter__ returns self. It is not just iterable, it is its own iterator. Because it is returning self, there is no separate object here.
Python makes an explicit distinction between iterable and iterator. An iterable can potentially be iterated over. An iterator does the actual work when iterating. By using iter() you ask Python to produce an iterator for a given iterable.
This is why iter(stringobject) returns a new object; you have produced an iterator from the iterable string.
You need this distinction because the process of iterating requires something that keeps state; namely where in the process are we now. An iterator is the object that keeps track of that, so that each time you call the .next() method on the iterator, you get the next value, or StopIteration is raised if there is no next value to produce anymore.
So in your example, the string and example1 are both just iterable, and calling iter() on them produced a new separate iterator.
However, your example2 is its own iterator. Calling iter() on it does not produce a separate object. You cannot create separate iterators from something that already is an iterator.
If you want to produce a new independent iterator for your example2 class, you need to return a distinct, separate object from __iter__:
class ExampleIterator(object):
    def __init__(self, source):
        self.source = source
        self.position = len(source)
    def __iter__(self):
        return self
    def next(self):
        if self.position <= 0:
            raise StopIteration
        self.position -= 1
        return self.source[self.position]
class Example2():
    def __init__(self, word):
        self.word = word
    def __iter__(self):
        # pass the word itself, so len() and indexing work in the iterator
        return ExampleIterator(self.word)
Here Example2 is just an iterable again, not an iterator. The __iter__ method returns a new, distinct ExampleIterator() instance, and it is responsible for keeping track of the iteration state.
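For readers on Python 3, the same split can be sketched as follows. This is my own adaptation under the assumption that only the method name changes (next becomes __next__). It also shows that two iterators taken from one Example2 keep separate positions:

```python
class ExampleIterator:
    def __init__(self, source):
        self.source = source
        self.position = len(source)

    def __iter__(self):
        return self

    def __next__(self):  # spelled next() in Python 2
        if self.position <= 0:
            raise StopIteration
        self.position -= 1
        return self.source[self.position]

class Example2:
    def __init__(self, word):
        self.word = word

    def __iter__(self):
        return ExampleIterator(self.word)

a = Example2("abc")
first, second = iter(a), iter(a)
print(first is second)           # False: two distinct iterator objects
print(next(first), next(first))  # c b
print(next(second))              # c  (unaffected by first)
```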
I think you don't have a problem, you just don't understand what happens here.
In general, iter(object) returns an iterator for this iterable object.
This is obtained either via calling __iter__() or, if that doesn't exist, by providing a wrapper object which calls __getitem__() until it is exhausted (raises IndexError).
The object returned by __iter__() can either be the iterable itself (like in your 2nd example) or something else. Especially, it can be a generator object if you make __iter__() a generator function as you do in your 1st example.
The iteration itself happens with the iterator object and its next() (Python 2) or __next__() (Python 3) method.