This question already has answers here:
Strange result when removing item from a list while iterating over it
(8 answers)
Closed 1 year ago.
I am wondering about loop behavior when I pop each element, though the question could apply to any modification of an iterable.
Let's imagine this:
l = ["elem1", "elem2", "elem3", "elem4"]
for i, elem in enumerate(l):
    l.pop(i)
It is simple, but I am wondering: does Python keep an unmodified instance of l at each loop iteration, or does it update l? I could loop over l.copy(), but in that case I get an IndexError.
I know that I could easily solve the problem of removing the elements of the list one by one, but here I am trying to understand the behavior.
for i in iterable:
    # some code with i
is (with sufficient precision in this context) equivalent to
iterator = iter(iterable)
while True:
    try:
        i = next(iterator)
    except StopIteration:
        break
    # some code with i
You can see
i is reassigned in each iteration
the iterator is created exactly once
mutations to the iterable may or may not lead to unexpected behavior, depending on how iterable.__iter__ is implemented. __iter__ is the method responsible for creating the iterator.
In the case of lists, the iterator keeps track of an integer index, i.e. which element to pull next from the list. When you remove an item during iteration, the indices of the subsequent elements shift down by one, but the iterator is not informed of this.
>>> l = ['a', 'b', 'c', 'd']
>>> li = iter(l) # iterator pulls index 0 next
>>> next(li)
'a' # iterator pulled index 0, will pull index 1 next
>>> l.remove('a') # 'b' is now at index 0
>>> l
['b', 'c', 'd']
>>> next(li)
'c'
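To make the index bookkeeping above concrete, here is a minimal pure-Python sketch of how a list iterator might behave. ListIteratorSketch is a hypothetical name invented for this illustration; the real list iterator is implemented in C and may differ in detail.

```python
class ListIteratorSketch:
    """Hypothetical pure-Python stand-in for the built-in list iterator."""

    def __init__(self, seq):
        self.seq = seq      # a reference to the list, not a copy
        self.index = 0      # position of the element to pull next

    def __iter__(self):
        return self

    def __next__(self):
        # The length is re-checked on every call, so shrinking the list
        # can end the iteration early.
        if self.index >= len(self.seq):
            raise StopIteration
        item = self.seq[self.index]
        self.index += 1
        return item


letters = ['a', 'b', 'c', 'd']
seen = []
for letter in ListIteratorSketch(letters):
    seen.append(letter)
    letters.remove(letter)  # shifts the remaining items one slot left

print(seen)     # ['a', 'c']
print(letters)  # ['b', 'd']
```

Every removal moves the next element into the slot the iterator has already passed, so every second element is skipped.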
Why don't you just test it?
>>> for i, elem in enumerate(l):
...     print(i, l)
...     l.pop(i)
...
0 ['elem1', 'elem2', 'elem3', 'elem4']
'elem1'
1 ['elem2', 'elem3', 'elem4']
'elem3'
Under the hood, the for loop just calls next() on the iterator (enumerate itself is described in PEP 279). Because the list really does change on each iteration, the iterator raises StopIteration as soon as its index reaches 2, because by then the list has only two elements left.
When everything else fails, read the manual!
Note There is a subtlety when the sequence is being modified by the loop (this can only occur for mutable sequences, e.g. lists). An internal counter is used to keep track of which item is used next, and this is incremented on each iteration. When this counter has reached the length of the sequence the loop terminates. This means that if the suite deletes the current (or a previous) item from the sequence, the next item will be skipped (since it gets the index of the current item which has already been treated). Likewise, if the suite inserts an item in the sequence before the current item, the current item will be treated again the next time through the loop. This can lead to nasty bugs that can be avoided by making a temporary copy using a slice of the whole sequence, e.g., [...]
I'm not sure the bit about a "counter" that gets "incremented" is strictly true, but the gist is there: don't modify and iterate!
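The slice copy that the manual recommends can be sketched like this. The list contents are made up for the example; the point is only that the loop walks the copy while the removals hit the original.

```python
numbers = [1, 2, 2, 3, 2, 4]

# numbers[:] is a shallow copy, so the loop's iterator never sees
# the removals made on the original list.
for n in numbers[:]:
    if n == 2:
        numbers.remove(n)

print(numbers)  # [1, 3, 4]
```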
What does for row_number, row in enumerate(cursor): do in Python?
What does enumerate mean in this context?
The enumerate() function adds a counter to an iterable.
So for each element in cursor, a tuple is produced with (counter, element); the for loop binds that to row_number and row, respectively.
Demo:
>>> elements = ('foo', 'bar', 'baz')
>>> for elem in elements:
...     print(elem)
...
foo
bar
baz
>>> for count, elem in enumerate(elements):
...     print(count, elem)
...
0 foo
1 bar
2 baz
By default, enumerate() starts counting at 0, but if you give it a second integer argument, it'll start from that number instead:
>>> for count, elem in enumerate(elements, 42):
...     print(count, elem)
...
42 foo
43 bar
44 baz
If you were to re-implement enumerate() in Python, here are two ways of achieving that; one using itertools.count() to do the counting, the other manually counting in a generator function:
from itertools import count
def enumerate(it, start=0):
    # return an iterator that adds a counter to each element of it
    return zip(count(start), it)
and
def enumerate(it, start=0):
    count = start
    for elem in it:
        yield (count, elem)
        count += 1
The actual implementation in C is closer to the latter, with optimisations: it reuses a single tuple object for the common for i, ... unpacking case, and it keeps the counter in a standard C integer until the value becomes too large, which avoids creating a Python integer object (which is unbounded) for as long as possible.
It's a builtin function that returns an object that can be iterated over. See the documentation.
In short, it loops over the elements of an iterable (like a list), as well as an index number, combined in a tuple:
for item in enumerate(["a", "b", "c"]):
    print(item)
prints
(0, 'a')
(1, 'b')
(2, 'c')
It's helpful if you want to loop over a sequence (or other iterable thing), and also want to have an index counter available. If you want the counter to start from some other value (usually 1), you can give that as second argument to enumerate.
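As a small illustration of the second argument, the sketch below numbers items starting from 1. The shopping list is invented for the example; start is the real keyword name of enumerate's second parameter.

```python
shopping = ["eggs", "milk", "bread"]

# start=1 makes the counter begin at 1 instead of the default 0.
numbered = [f"{n}. {item}" for n, item in enumerate(shopping, start=1)]

print(numbered)  # ['1. eggs', '2. milk', '3. bread']
```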
I am reading a book (Effective Python) by Brett Slatkin and he shows another way to iterate over a list and also know the index of the current item in the list but he suggests that it is better not to use it and to use enumerate instead.
I know you asked what enumerate means, but when I understood the following, I also understood how enumerate makes iterating over a list while knowing the index of the current item easier (and more readable).
list_of_letters = ['a', 'b', 'c']
for i in range(len(list_of_letters)):
    letter = list_of_letters[i]
    print(i, letter)
The output is:
0 a
1 b
2 c
I also used to do something even sillier before I read about the enumerate function.
i = 0
for n in list_of_letters:
    print(i, n)
    i += 1
It produces the same output.
But with enumerate I just have to write:
list_of_letters = ['a', 'b', 'c']
for i, letter in enumerate(list_of_letters):
    print(i, letter)
As other users have mentioned, enumerate returns an iterator that attaches an incrementing index to each item of an iterable.
So if you have a list, say l = ["test_1", "test_2", "test_3"], then list(enumerate(l)) will give you this: [(0, 'test_1'), (1, 'test_2'), (2, 'test_3')].
Now, when is this useful? A possible use case is when you want to iterate over items and skip a specific item whose index in the list you know but whose value you do not (because the value is not known at the time).
for index, value in enumerate(joint_values):
    if index == 3:
        continue
    # Do something with the other `value`
So your code reads better because you could also do a regular for loop with range but then to access the items you need to index them (i.e., joint_values[i]).
Although another user mentioned an implementation of enumerate using zip, I think a more pure (but slightly more complex) way without using itertools is the following:
def enumerate(l, start=0):
    return zip(range(start, len(l) + start), l)
Example:
l = ["test_1", "test_2", "test_3"]
print(list(enumerate(l)))
print(list(enumerate(l, 10)))
Output:
[(0, 'test_1'), (1, 'test_2'), (2, 'test_3')]
[(10, 'test_1'), (11, 'test_2'), (12, 'test_3')]
As mentioned in the comments, this approach with range will not work with arbitrary iterables as the original enumerate function does.
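A short sketch of that limitation, using a generator as the "arbitrary iterable". range_enumerate is a hypothetical name for the range-based version, chosen so that it does not shadow the built-in enumerate.

```python
def range_enumerate(seq, start=0):
    # The range-based variant: it needs len(), so it only works on sized inputs.
    return zip(range(start, len(seq) + start), seq)


def letters():
    # A generator: iterable, but it has no length.
    yield "a"
    yield "b"


builtin_result = list(enumerate(letters()))

try:
    range_enumerate(letters())
    failed = False
except TypeError:
    failed = True  # len() is not defined for generators

print(builtin_result)  # [(0, 'a'), (1, 'b')]
print(failed)          # True
```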
The enumerate function works as follows:
doc = """I like movie. But I don't like the cast. The story is very nice"""
doc1 = doc.split('.')
for i in enumerate(doc1):
    print(i)
The output is
(0, 'I like movie')
(1, " But I don't like the cast")
(2, ' The story is very nice')
I am assuming that you know how to iterate over elements in some list:
for el in my_list:
    # do something
Now, sometimes you need not only the elements but also the index at each iteration. One way to do it is:
i = 0
for el in my_list:
    # do something, and use the value of "i" somehow
    i += 1
However, a nicer way is to use the function enumerate. enumerate receives a list (or any other iterable) and returns a list-like object (an iterator that you can loop over), but each element of this new object itself contains 2 elements: the index and the value from the original input:
So if you have
arr = ['a', 'b', 'c']
Then the command
enumerate(arr)
returns something like:
[(0,'a'), (1,'b'), (2,'c')]
Now, if you iterate over a list (or an iterable) where each element itself has 2 sub-elements, you can capture both of those sub-elements in the for loop like below:
for index, value in enumerate(arr):
    print(index, value)
which would print out the sub-elements of the output of enumerate.
And in general you can basically "unpack" multiple items from list into multiple variables like below:
idx,value = (2,'c')
print(idx)
print(value)
which would print
2
c
This is the kind of assignment happening in each iteration of that loop with enumerate(arr) as iterable.
The enumerate function gives you each element's index and value at the same time. I believe the following code will help explain what is going on.
for i, item in enumerate(initial_config):
    print(f'index{i} value{item}')
When I try to put a counter in an inline loop in Python, I get a syntax error. Apparently it expects me to assign a value to i, not k.
Could anyone help with rewriting the inline loop?
aa = [2, 2, 1]
k = 0
b = [k += 1 if i != 2 for i in aa ]
print(b)
You seem to misunderstand what you're doing. This:
[x for y in z]
is not an "inline for loop". A for loop can do anything, iterating on any iterable object. One of the things a for loop can do is create a list of items:
my_list = []
for i in other_list:
    if condition_is_met:
        my_list.append(i)
A list comprehension covers only this use case of a for loop:
my_list = [i for i in other_list if condition_is_met]
That's why it's called a "list comprehension" and not an "inline for loop": it only creates lists. The other things you might use a for loop for, like incrementing a counter, can't be done directly with a list comprehension.
For your particular problem, you're trying to use k += 1 in a list comprehension. k += 1 is a statement, not an expression: it doesn't produce a value, it just modifies the variable k. A list comprehension needs an expression for each list item, so Python rejects the code with a SyntaxError. If you want to count up with k, you should either just use a regular for loop:
for i in aa:
    if i != 2:
        k += 1
or use the list comprehension to indirectly measure what you want:
k += len([i for i in aa if i != 2])
Here, we use a list comprehension to construct a list of every element i in aa such that i != 2, then we take the number of elements in that list and add it to k. Since this operation actually produces a list of its own, the code will not crash, and it will have the same overall effect. This solution isn't always doable if you have more complicated things you'd like to do in a for loop - and it's slightly less efficient as well, because this solution requires actually creating the new list which isn't necessary for what you're trying to achieve.
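If the intermediate list is a concern, one possible alternative (not part of the original answer) is to count with sum() over a generator expression. This is only a sketch using the aa and k names from the question.

```python
aa = [2, 2, 1]
k = 0

# sum() consumes the generator expression one item at a time,
# so no intermediate list is built.
k += sum(1 for i in aa if i != 2)

print(k)  # 1
```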
You can use len() like so:
print(len([i for i in aa if i != 2]))
Suppose I have a list
myList = [a,b,c,d,e]
And a function
def doSomething(list):
    # Does something to the list
And I want to call the function iteratively like this:
doSomething([b,c,d,e])
doSomething([a,c,d,e])
doSomething([a,b,d,e])
doSomething([a,b,c,e])
doSomething([a,b,c,d])
The first thing that comes to mind would be something like this:
for x in range(0,len(myList)):
    del myList[x]
    doSomething(myList)
But this doesn't really work, because each time I call del it actually deletes the element. I sort of just want to 'hide' the element each time I call the function. Is there a way to do this?
You can use itertools.combinations for this:
import itertools
for sublist in itertools.combinations([a, b, c, d, e], 4):
    # 4 is the number of elements in each sublist.
    # If you do not know the length of the input list, use len() - 1
    doSomething(sublist)
This will make sublist a tuple. If you need it to be a list, you can call list() on it before passing it to doSomething().
If you care about the order in which the doSomething() calls are done, you will want to reverse the order of iteration so that it begins by removing the first element instead of the last element:
for sublist in reversed(list(itertools.combinations([a, b, c, d, e], 4))):
    doSomething(sublist)
This is less efficient because all of the sublists must be generated up front instead of one at a time. mgilson in the comments suggests reversing the input list and then reversing each sublist, which should be more efficient but the code may be harder to read.
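The reverse-then-reverse idea from the comments could look roughly like the sketch below. leave_one_out is a hypothetical helper name, and the code assumes a non-empty list (itertools.combinations rejects a negative length).

```python
import itertools


def leave_one_out(items):
    """Yield copies of items with one element left out, first element first."""
    reversed_items = items[::-1]
    # Combinations of the reversed input drop the original first element first;
    # reversing each combination restores the original element order.
    for combo in itertools.combinations(reversed_items, len(items) - 1):
        yield list(combo[::-1])


result = list(leave_one_out(["a", "b", "c", "d", "e"]))
for sublist in result:
    print(sublist)
```

This stays lazy: each sublist is produced one at a time rather than all being built up front.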
Normally, looping over indices is a bad idea -- but in this case, it seems that you want to remove elements at a given index (iteratively) so looping over indices actually seems appropriate for once.
You could use list.pop for this purpose, but it turns out that would be an extra O(N) operation for each turn of the loop (once to copy the list, once to remove the i'th element). We can do it differently by removing the element while we're copying...
for i in range(len(lst)):
    new_list = [x for j, x in enumerate(lst) if j != i]
    doSomething(new_list)
Note however that it isn't guaranteed that this will be faster than the naive approach:
for i in range(len(lst)):
    new_list = lst[:]  # lst.copy() in python3.x
    new_list.pop(i)
    doSomething(new_list)
The naive approach has the advantage that any indexing that needs to be done in .pop is pushed to C code, which is generally faster than doing Python comparisons.
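A third variant, not from the original answer, builds each reduced list from two slices. In the sketch below, do_something is a hypothetical stand-in for doSomething() that just records what it receives.

```python
calls = []


def do_something(items):
    # Hypothetical stand-in for doSomething(); it only records its input.
    calls.append(items)


lst = ["a", "b", "c", "d", "e"]
for i in range(len(lst)):
    # Everything before index i plus everything after it; lst is untouched.
    do_something(lst[:i] + lst[i + 1:])

print(calls[0])  # ['b', 'c', 'd', 'e']
print(lst)       # ['a', 'b', 'c', 'd', 'e']
```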
This question already has answers here:
Strange result when removing item from a list while iterating over it
(8 answers)
Closed 7 years ago.
This is the most common problem I face while trying to learn programming in Python. When I iterate over a list using the range() function to check whether a given item in the list meets a given condition, and delete it if so, I always get an IndexError. So, is there a particular way to do this without using any other intermediate list or a while statement? Below is an example:
l = range(20)
for i in range(0,len(l)):
    if l[i] == something:
        l.pop(i)
First of all, you never want to iterate over things like that in Python. Iterate over the actual objects, not the indices:
l = range(20)
for i in l:
    ...
The reason for your error was that you were removing an item, so the later indices cease to exist.
Now, you shouldn't modify a list while you are looping over it, but that isn't a problem. The better solution is to use a list comprehension here, to filter out the extra items.
l = range(20)
new_l = [i for i in l if not i == something]
You can also use the filter() builtin, although that tends to be unclear in most situations (and slower where you need lambda).
Also note that in Python 3.x, range() produces a lazy range object (an immutable sequence), not a list.
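In Python 3 this matters for the question's code, because the object returned by range() has no pop() method. A minimal sketch of the difference, with variable names invented for the example:

```python
r = range(20)
has_pop = hasattr(r, "pop")   # range objects cannot be mutated

as_list = list(r)             # an explicit copy into a real list
removed = as_list.pop(3)      # lists support pop()

print(has_pop)       # False
print(removed)       # 3
print(len(as_list))  # 19
```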
It would also be a good idea to use more descriptive variable names - I'll presume here it's for example, but names like i and l are hard to read and make it easier to introduce bugs.
Edit:
If you wish to update the existing list in place, as pointed out in the comments, you can use the slicing syntax to replace the contents of the list in one step (l[:] = new_l). That said, I would argue that that case is pretty bad design. You don't want one segment of code to rely on data being updated from another bit of code in that way.
Edit 2:
If, for any reason, you need the indices as you loop over the items, that's what the enumerate() builtin is for.
You can always do this sort of thing with a list comprehension:
newlist=[i for i in oldlist if not condition ]
As others have said, iterate over the list and create a new list with just the items you want to keep.
Use a slice assignment to update the original list in-place.
l[:] = [item for item in l if item != something]
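A small sketch of what "in-place" means here: after the slice assignment, every other name bound to the same list sees the filtered contents. The names values and alias are made up for the illustration.

```python
values = list(range(10))
alias = values  # a second name for the same list object

# Slice assignment replaces the contents but keeps the same list object.
values[:] = [v for v in values if v % 2 == 0]

print(alias)            # [0, 2, 4, 6, 8]
print(alias is values)  # True
```

With a plain values = [...] rebinding instead, alias would still refer to the old, unfiltered list.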
You should look at the problem from the other side: add an element to a list only when it is not equal to "something". With a list comprehension:
l = [i for i in range(20) if i != something]
You should not use for i in range(0,len(l)):. Use for i, item in enumerate(l): if you need the index, and for item in l: if not.
You should not manipulate a structure you are iterating over. When you have to, iterate over a copy instead.
Don't name a variable l (it may be mistaken for 1 or I).
If you want to filter a list, do so explicitly. Use filter() or a list comprehension.
BTW, in your case, you could also do:
while something in list_: list_.remove(something)
That's not very efficient, though. But depending on context, it might be more readable.
The reason you're getting an IndexError is because you're changing the length of the list as you iterate in the for-loop. Basically, here's the logic...
#-- Build the original list: [0, 1, 2, ..., 19]
l = range(20)
#-- Here, the range function builds ANOTHER list, in this case also [0, 1, 2, ..., 19]
#-- the variable "i" will be bound to each element of this list, so i = 0 (loop), then i = 1 (loop), i = 2, etc.
for i in range(0,len(l)):
    if i == something:
        #-- So, when i is equivalent to something, you "pop" the list, l.
        #-- the length of l is now *19* elements, NOT 20 (you just removed one)
        l.pop(i)
    #-- So...when the list has been shortened to 19 elements...
    #-- we're still iterating, i = 17 (loop), i = 18 (loop), i = 19 *CRASH*
    #-- There is no 19th element of l, as l (after you popped out an element) only
    #-- has indices 0, ..., 18, now.
NOTE also that you're making the "pop" decision based on the index of the list, not on what's in the indexed cell of the list. This is unusual -- was that your intention? Or did you mean something more like...
if l[i] == something:
    l.pop(i)
Now, in your specific example, (l[i] == i) but this is not a typical pattern.
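If the deletion really has to happen in place and by index, one commonly used workaround (not part of the original answer) is to walk the indices from the end, so that a pop never shifts an index that is still to be visited. A sketch with made-up data:

```python
values = [3, 7, 7, 1, 7]
something = 7  # hypothetical value to delete

# Walk from the last index down to 0; popping index i only moves
# elements at higher indices, which have already been checked.
for i in range(len(values) - 1, -1, -1):
    if values[i] == something:
        values.pop(i)

print(values)  # [3, 1]
```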
Rather than iterating over the list, try the filter function. It's a built-in (like a lot of other list processing functions: e.g. map, sorted, reversed, zip, etc.)
Try this...
#-- Create a function for testing the elements of the list.
def f(x):
    if (x == SOMETHING):
        return False
    else:
        return True
#-- Create the original list.
l = range(20)
#-- Apply the function f to each element of l.
#-- Where f(l[i]) is True, the element l[i] is kept and will be in the new list, m.
#-- Where f(l[i]) is False, the element l[i] is passed over and will NOT appear in m.
m = filter(f, l)
List processing functions go hand-in-hand with "lambda" functions - which, in Python, are brief, anonymous functions. So, we can re-write the above code as...
#-- Create the original list.
l = range(20)
#-- Apply the lambda to each element of l.
#-- Where lambda is True, the element l[i] is kept and will be in the new list, m.
#-- Where lambda is False, the element l[i] is passed over and will NOT appear in m.
m = filter(lambda x: (x != SOMETHING), l)
Give it a go and see how it works!
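One caveat when trying this on Python 3: filter() returns a lazy iterator there rather than a list, so wrapping it in list() is needed to get the list m described above. A minimal sketch, with SOMETHING set to an arbitrary value for the example:

```python
SOMETHING = 5  # arbitrary value chosen for this illustration

l = range(20)

# list() forces the lazy filter object into an actual list.
m = list(filter(lambda x: x != SOMETHING, l))

print(len(m))          # 19
print(SOMETHING in m)  # False
```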