Iterator vs Iterable? - python

(For Python 3)
In the Python docs, you can see that the list() function takes an iterable.
In the Python docs, you can also see that the next() function takes an iterator.
So I did this in IDLE:
>>> v = map(lambda x: x+5, [1,2,3])
>>> v
>>> next(v)
>>> list(v)
Which gives the output:
<map object at 0x000000000375F978>
6
[7, 8]
Frankly, this isn't what I expected. Is a map object an iterator or an iterable? Is there even a difference? Clearly both the list() and next() functions work on the map object, whatever it is.
Why do they both work?

An iterator is an iterable, but an iterable is not necessarily an iterator.
An iterable is anything that has an __iter__ method defined - e.g. lists and tuples, as well as iterators.
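Iterators are the subset of iterables that also define a __next__ method: they hand out their values one at a time and are used up as you go. Many of them, such as the objects returned by map and filter or by functions using yield, compute their values on demand instead of storing them all in memory at once. Calling iter() on any iterable gives you an iterator for it.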
In your example, map returns an iterator, which is also an iterable, which is why both functions work with it. However, if we take a list for instance:
>>> lst = [1, 2, 3]
>>> list(lst)
[1, 2, 3]
>>> next(lst)
Traceback (most recent call last):
  File "<pyshell#3>", line 1, in <module>
    next(lst)
TypeError: 'list' object is not an iterator
we can see that next complains, because the list, an iterable, is not an iterator.
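If you want to check which category an object falls into, one option is the abstract base classes in collections.abc. This is only a small sketch; the variable names (mapped, lst_iter, first) are made up for illustration:

```python
from collections.abc import Iterable, Iterator

lst = [1, 2, 3]
mapped = map(lambda x: x + 5, lst)

# A list is iterable but not an iterator.
print(isinstance(lst, Iterable), isinstance(lst, Iterator))        # True False
# A map object is both.
print(isinstance(mapped, Iterable), isinstance(mapped, Iterator))  # True True

# iter() turns an iterable into an iterator that next() accepts.
lst_iter = iter(lst)
first = next(lst_iter)
print(first)  # 1
```

So next(lst) fails, but next(iter(lst)) works, because iter() asks the list for an iterator first.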


Why isn't lst.sort().reverse() valid?

Per title. I do not understand why it is not valid. I understand that they mutate the object, but if you call the sort method, after it's done then you'd call the reverse method so it should be fine. Why is it then that I need to type lst.sort() then on the line below, lst.reverse()?
Edit: Well, when it's pointed out like that, it's a bit embarrassing how I didn't get it before. I literally recognize that it mutated the object and thus returns a None, but I suppose it didn't register that also meant that you can't reverse a None-type object.
When you call lst.sort(), it does not return anything, it changes the list itself.
So the result of lst.sort() is None, thus you try to reverse None which is impossible.
Put simply, lst.sort() does not return the list sorted. It modifies itself.
>>> lst = [3,1,2,0]
>>> lst
[3, 1, 2, 0]
>>> lst.sort().reverse()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'NoneType' object has no attribute 'reverse'
>>>
Since lst.sort() doesn't return anything, Python automatically returns None for you. Since None doesn't have a reverse method, you get an error.
>>> lst.sort()
>>> lst.reverse()
>>> lst
[3, 2, 1, 0]
>>>
You can also reverse the list while sorting, like this:
lst.sort( reverse=True )
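If you really want an expression you can pass along or chain, the built-in sorted() may be closer to what you expected, since it returns a new list instead of None. A small sketch (the names descending and result are only for illustration):

```python
lst = [3, 1, 2, 0]

# sorted() returns a new list, so its result can be used directly.
descending = sorted(lst, reverse=True)
print(descending)  # [3, 2, 1, 0]

# The original list is left untouched by sorted().
print(lst)  # [3, 1, 2, 0]

# The in-place methods return None, which is why chaining them fails.
result = lst.sort()
print(result)  # None
print(lst)     # [0, 1, 2, 3]
```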

Unexpected results when comparing list comprehension with generator expression [duplicate]

I think I'm overlooking something simple, but I can't seem to figure out what exactly. Please consider the following code:
a = [2, 3, 4, 5]
lc = [ x for x in a if x >= 4 ] # List comprehension
lg = ( x for x in a if x >= 4 ) # Generator expression
a.extend([6,7,8,9])
for i in lc:
    print("{} ".format(i), end="")
for i in lg:
    print("{} ".format(i), end="")
I expected that both for-loops would produce the same result, so 4 5. However, the for-loop that prints the generator expression prints 4 5 6 7 8 9. I think it has something to do with the declaration of the list comprehension (which is declared before the extend). But why is the result of the generator different, as it is also declared before extending the list? I.e. what is going on internally?
Generators aren't evaluated until you call next() on them which is what makes them useful, while list comprehensions are evaluated immediately.
So lc is already [4, 5] before the extend call and is not affected by it.
lg, on the other hand, has not started walking over a yet. By the time the for loop begins consuming it, a has already been extended, so the generator runs over the longer list and prints the extra numbers as well.
Check it out like this:
>>> a = [2, 3, 4, 5]
>>> lg = ( x for x in a if x >= 4 )
>>> next(lg)
4
>>> next(lg)
5
>>> a.extend([6,7,8,9])
>>> next(lg)
6
However, if you call next() one more time before the extend, you get StopIteration: the generator is exhausted at that point, and it stays exhausted even after the list is extended.
>>> a = [2, 3, 4, 5]
>>> lg = ( x for x in a if x >= 4 )
>>> next(lg)
4
>>> next(lg)
5
>>> next(lg)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
>>> a.extend([6,7,8,9])
>>> next(lg)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
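If you want the generator to ignore later changes to the list, one option is to let it iterate over a snapshot. The sketch below assumes a copy of the list is acceptable; the names live and snapshot are made up for the example. It relies on the fact that the outermost iterable of a generator expression (here tuple(a)) is evaluated right away, while the rest runs lazily:

```python
a = [2, 3, 4, 5]

live = (x for x in a if x >= 4)             # iterates over a itself
snapshot = (x for x in tuple(a) if x >= 4)  # iterates over a copy made right now

a.extend([6, 7, 8, 9])

live_values = list(live)
snapshot_values = list(snapshot)
print(live_values)      # [4, 5, 6, 7, 8, 9]
print(snapshot_values)  # [4, 5]
```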
what is going on internally?
Generators are inherently lazy.
[ x for x in a if x >= 4 ] is evaluated as soon as it is executed.
( x for x in a if x >= 4 ) only creates the generator when it executes. The loop itself is only evaluated/executed when the generator is consumed in one of the many ways possible ('manually' calling next, converting to another iterable type [list, tuple, set etc] or with a for loop).
The main advantage of generators being lazy is memory consumption. They do not need to store all the elements in memory, but only the current (or next, I should say) element.
The generator expression is lazily evaluated, so when you get back the generator object the code x for x in a if x >= 4 is not yet executed.
The for-in loop internally calls the built-in next() function for each iteration of the loop for that generator object. The next() call actually evaluates the code and that code points to the updated list which has the new set of values you added after the generator object was created.
>>> lg = ( x for x in a if x >= 4)
#evaluates the code and returns the first value
>>> next(lg)
4
>>> next(lg)
5
# if new values are added here to the list
# the generator will return them
But in the case of the list comprehension, the loop is run immediately and all the values are added to a list container, using the values which were there at that moment.
The built-in list() takes an iterable object as a parameter and constructs a list with the values returned from it. This happens immediately when you pass the iterable (for example a generator object, which is an iterable) to the list constructor; the [] comprehension syntax is just as eager.
But on the other hand, if you simply execute the generator expression, you just get back the generator object, which is both an iterable and an iterator. So you either need to call next() on it to execute the code and get a value, or use it in a for ... in loop, which does that implicitly.
But remember once you exhaust the generator object by getting a StopIteration exception, and you add a new value in the list that value won't be returned from the next() call as the generator object can be consumed only once.
>>> a = [2, 3, 4, 5]
>>> lg = ( x for x in a if x >= 4)
>>> next(lg)
4
>>> next(lg)
5
>>> a.append(9)
>>> next(lg)
9
>>> next(lg)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
# lg is consumed
>>> a.append(10)
>>> next(lg)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
StopIteration
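To make the "for loop calls next() for you" part concrete, here is roughly what a for loop over the generator does, written out by hand. It is a simplified sketch rather than the actual interpreter code, and the names collected and iterator are just for illustration:

```python
a = [2, 3, 4, 5]
lg = (x for x in a if x >= 4)

a.append(9)

# Roughly what "for i in lg: collected.append(i)" does behind the scenes.
collected = []
iterator = iter(lg)          # for a generator this returns lg itself
while True:
    try:
        i = next(iterator)   # runs the generator body up to the next value
    except StopIteration:
        break                # the for loop ends silently here
    collected.append(i)

print(collected)          # [4, 5, 9]
print(iterator is lg)     # True
print(next(lg, "empty"))  # empty
```

Passing a default as the second argument to next() is a way to probe an exhausted generator without triggering the traceback.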

Why Generators are exhaustive and Lists/Tuples are not? [duplicate]

First of all, I have to say I read a lot of SO posts before coming to this one, because I could not find what I was looking for, or maybe I didn't understand.
So here it goes
I kind of understand what Iterables and Iterators are. So any container object like Lists/Tuples/Sets which contains items, which you can iterate over are called Iterables. Now to iterate over the Iterables you need Iterators and the way it happens is because of __iter__ method which gives you the Iterator object for the type and then calling the __next__ on the Iterator object to extract the values.
So to make any object iterable you need to define the __iter__ and __next__ methods, and I suppose that is true for lists as well. But here comes the weird part, which I discovered recently.
l1 = [1,2,3]
hasattr(l1, "__next__")
Out[42]: False
g = (x for x in range(3))
hasattr(g, "__next__")
Out[44]: True
Now because the lists do support Iterator protocol why the __next__ method is missing from their implementation, and if it indeed is missing then how does iteration for a list work ?
list_iterator = iter(l1)
next(list_iterator)
Out[46]: 1
next(list_iterator)
Out[47]: 2
next(list_iterator)
Out[48]: 3
next(list_iterator)
Traceback (most recent call last):
  File "C:\Users\RJ\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2910, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-49-56e733bbb896>", line 1, in <module>
    next(list_iterator)
StopIteration
gen0_iterator = iter(g)
gen_iterator = iter(g)
next(gen_iterator)
Out[57]: 0
next(gen_iterator)
Out[58]: 1
next(gen_iterator)
Out[59]: 2
next(gen_iterator)
Traceback (most recent call last):
  File "C:\Users\RJ\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2910, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-60-83622dd5d1b9>", line 1, in <module>
    next(gen_iterator)
StopIteration
gen_iterator1 = iter(g)
next(gen_iterator1)
Traceback (most recent call last):
  File "C:\Users\RJ\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py", line 2910, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-62-86f9b3cc341f>", line 1, in <module>
    next(gen_iterator1)
StopIteration
I created an iterator for a list and then called next method on it to get the elements and it works.
Now if the previous hasattr(l1, "__next__") returns False, then how are we able to call the next method on the iterator object for a list?
Now the original question which made me think about all this: no matter how many times I iterate over the list, it doesn't exhaust, and calling iter() gives back a new iterator object every time. In the case of a generator this does not happen: once the generator is exhausted, no matter how many times you call iter() it always gives you back the same object, which has already raised StopIteration. That is consistent, because once an iterator has raised StopIteration it always will, but why does it not happen with lists?
Further, this is in sync with what the Python docs say: container.__iter__ gives you an iterator object for the type, and iterator.__iter__ gives you the iterator object itself, which is precisely the reason that calling iter() on a generator returns the same object over and over again. But why, and more importantly, how?
One more thing to observe here is
isinstance(l1, collections.abc.Iterator)
Out[65]: False
isinstance(g, collections.abc.Iterator)
Out[66]: True
So this suggests that there is some implementation difference between Iterables and Iterators, but I could not find any such details, because both have __iter__ and __next__ methods implemented, so where does this variation in behavior come from? Is it that __iter__ for iterables returns something different from what is returned by __iter__ of iterators (generators)? If someone can explain with some examples of __iter__ for Iterables and Iterators, that would be really helpful. Finally, a puzzle about yield: since that is the magic word which makes a normal function a generator (so a type of iterator), what do the __iter__ and __next__ of yield look like?
I have tried my level best to explain the question, but if still something is missing, please do let me know i will try to clarify my question.
It's a bit different than that. Iterables have an __iter__ method that returns an iterator. Iterators have a __next__ method (and also an __iter__ that returns the iterator itself, so that iter() and for loops work on them; next() on its own only needs __next__).
Lists are iterable:
>>> l = [1,2,3]
>>> hasattr(l, "__iter__")
True
>>> hasattr(l, "__next__")
False
>>> l_iter = iter(l)
>>> hasattr(l_iter, "__next__")
True
>>> hasattr(l_iter, "__iter__")
True
>>> l_iter == iter(l_iter)
True
And give you new iterators that run through the elements each time you use them
>>> list(l)
[1, 2, 3]
>>> list(l)
[1, 2, 3]
>>> iter(l) == iter(l)
False
But the list iterator itself is one shot
>>> l_iter = iter(l)
>>> list(l_iter)
[1, 2, 3]
>>> list(l_iter)
[]
The generator is an iterator (and therefore also iterable, but iter(g) just returns g itself), so it is also one shot.
>>> g = (x for x in range(3))
>>> hasattr(g, "__iter__")
True
>>> hasattr(g, "__next__")
True
>>> g == iter(g)
True
>>>
>>> list(g)
[0, 1, 2]
>>> list(g)
[]
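Since you asked for examples of __iter__ for iterables and iterators, here is a hand-written pair plus the yield version. This is only an illustrative sketch: Countdown, CountdownIterator and countdown_gen are made-up names, not anything from the standard library, and the built-in list and generator types are implemented in C rather than like this.

```python
class Countdown:
    """Iterable: hands out a fresh iterator on every iter() call."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return CountdownIterator(self.start)


class CountdownIterator:
    """Iterator: keeps the position, so it can only be consumed once."""

    def __init__(self, current):
        self.current = current

    def __iter__(self):
        return self  # an iterator returns itself

    def __next__(self):
        if self.current <= 0:
            raise StopIteration
        value = self.current
        self.current -= 1
        return value


def countdown_gen(start):
    """Generator function: yield makes Python build the iterator for us."""
    while start > 0:
        yield start
        start -= 1


c = Countdown(3)
print(list(c), list(c))    # [3, 2, 1] [3, 2, 1]

it = iter(c)
print(list(it), list(it))  # [3, 2, 1] []

g = countdown_gen(3)
print(iter(g) is g)        # True
print(list(g), list(g))    # [3, 2, 1] []
```

The container behaves like a list (a new iterator per loop), while the iterator and the generator behave alike: their __iter__ returns the same object, so once it is exhausted there is nothing left to hand out.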

If range() is a generator in Python 3.3, why can I not call next() on a range?

Perhaps I've fallen victim to misinformation on the web, but I think it's more likely just that I've misunderstood something. Based on what I've learned so far, range() is a generator, and generators can be used as iterators. However, this code:
myrange = range(10)
print(next(myrange))
gives me this error:
TypeError: 'range' object is not an iterator
What am I missing here? I was expecting this to print 0, and to advance to the next value in myrange. I'm new to Python, so please accept my apologies for the rather basic question, but I couldn't find a good explanation anywhere else.
range is a class of immutable iterable objects. Their iteration behavior can be compared to lists: you can't call next directly on them; you have to get an iterator by using iter.
So no, range is not a generator.
You may be thinking, "why didn't they make it an iterator"? Well, ranges have some useful properties that wouldn't be possible that way:
They are immutable, so they can be used as dictionary keys.
They have the start, stop and step attributes (since Python 3.3), count and index methods and they support in, len and __getitem__ operations.
You can iterate over the same range multiple times.
>>> myrange = range(1, 21, 2)
>>> myrange.start
1
>>> myrange.step
2
>>> myrange.index(17)
8
>>> myrange.index(18)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: 18 is not in range
>>> it = iter(myrange)
>>> it
<range_iterator object at 0x7f504a9be960>
>>> next(it)
1
>>> next(it)
3
>>> next(it)
5
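The other properties from the list above can be tried out the same way. A short sketch (the dictionary labels and the names first_pass and second_pass are invented for the example):

```python
myrange = range(1, 21, 2)

# A range can be iterated any number of times; each pass gets a new iterator.
first_pass = list(myrange)
second_pass = list(myrange)
print(first_pass == second_pass)  # True

# Sequence-style operations work without iterating by hand.
print(len(myrange))    # 10
print(17 in myrange)   # True
print(myrange[3])      # 7
print(myrange[-1])     # 19

# Ranges are hashable, so they can serve as dictionary keys.
labels = {range(0, 10): "single digit"}
print(labels[range(0, 10)])  # single digit

# The iterator obtained from a range is what next() works on.
it = iter(myrange)
print(next(it), next(it))  # 1 3
```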

Understanding the behavior of Python's set

The documentation for the built-in type set says:
class set([iterable])
Return a new set or frozenset object whose elements are taken from iterable. The elements of a set must be hashable.
That is all right but why does this work:
>>> l = range(10)
>>> s = set(l)
>>> s
set([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])
And this doesn't:
>>> s.add([10])
Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    s.add([10])
TypeError: unhashable type: 'list'
Both are lists. Is some magic happening during the initialization?
When you initialize a set, you provide a list of values that must each be hashable.
s = set()
s.add([10])
is the same as
s = set([[10]])
which throws the same error that you're seeing right now.
In [13]: (2).__hash__
Out[13]: <method-wrapper '__hash__' of int object at 0x9f61d84>
In [14]: ([2]).__hash__ # nothing.
The thing is that set needs its items to be hashable, i.e. implement the __hash__ magic method (the hash value is used to place and look up the item in the set's underlying hash table). list does not implement that magic method (its __hash__ is set to None), hence it cannot be added to a set.
In this line:
s.add([10])
You are trying to add a list to the set, rather than the elements of the list. If you want to add the elements of the list, use the update method.
Think of the constructor being something like:
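class Set:
    def __init__(self, l):
        for elem in l:
            self.add(elem)  # each element, not the list itself, is added
So there is nothing mysterious about the constructor accepting a list while add(element) does not: the constructor adds the elements of the list one by one, whereas add() tries to store the list itself.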
It behaves according to the documentation: set.add() adds a single element (and since you give it a list, it complains it is unhashable - since lists are no good as hash keys). If you want to add a list of elements, use set.update(). Example:
>>> s = set([1,2,3])
>>> s.add(5)
>>> s
set([1, 2, 3, 5])
>>> s.update([8])
>>> s
set([8, 1, 2, 3, 5])
s.add([10]) works as documented. An exception is raised because [10] is not hashable.
There is no magic happening during initialisation.
set([0,1,2,3,4,5,6,7,8,9]) has the same effect as set(range(10)) and set(xrange(10)) and set(foo()) where
def foo():
    for i in (9,8,7,6,5,4,3,2,1,0):
        yield i
In other words, the arg to set is an iterable, and each of the values obtained from the iterable must be hashable.
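Putting the answers together, here is a small Python 3 sketch of add() versus update(), and of what to do when you really want a group of values stored as one element. The variable error is just a made-up name to hold the message:

```python
s = set(range(3))

# add() inserts its argument as one element, so the argument must be hashable.
try:
    s.add([10])
except TypeError as exc:
    error = str(exc)
print(error)  # the message says that a list is unhashable

# update() iterates over its argument and adds each value separately.
s.update([10, 11])
print(sorted(s))  # [0, 1, 2, 10, 11]

# To store a group of values as a single element, use a hashable type.
s.add((12, 13))
s.add(frozenset([14]))
print((12, 13) in s, frozenset([14]) in s)  # True True
```

Tuples and frozensets are hashable as long as their own contents are, which is why they can stand in for a list here.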
