Python 3 : IndexError: list index out of range [duplicate] - python

This question already has answers here:
python : list index out of range error while iteratively popping elements
(12 answers)
Closed 6 years ago.
I am trying to remove duplicates from a list, using the code below.
>>> X
['a', 'b', 'c', 'd', 'e', 'f', 'a', 'b']
>>> for i in range(X_length) :
...     j=i+1
...     if X[i] == X[j] :
...         X.pop([j])
But I am getting
Traceback (most recent call last):
File "<stdin>", line 2, in <module>
IndexError: list index out of range
Please help.

When you start to remove items from a list, it changes in size. So, the ith index may no longer exist after certain removals:
>>> x = ['a', 'b', 'c', 'd', 'e']
>>> x[4]
'e'
>>> x.pop()
'e'
>>> x[4]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
IndexError: list index out of range
A simpler way to remove duplicate items is to convert your list to a set, which can only contain unique items. If you must have it as a list, you can convert it back to a list: list(set(X)). However, order is not preserved here.
If you want to remove consecutive duplicates, consider using a new array to store items that are not duplicates:
unique_x = []
for i in range(len(x) - 1):
    if x[i] != x[i+1]:
        unique_x.append(x[i])
unique_x.append(x[-1])
Note that our range bound is len(x) - 1 because otherwise, we would exceed the array bounds when using x[i+1].
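As an alternative to indexing, the standard library's itertools.groupby groups runs of equal adjacent items, so keeping one key per group removes consecutive duplicates. This is a minimal sketch rather than part of the answer above; the function name dedupe_consecutive is an illustrative choice.

```python
from itertools import groupby

def dedupe_consecutive(items):
    """Return a new list with each run of equal adjacent items collapsed to one."""
    # groupby yields one (key, group) pair per run of equal neighbours.
    return [key for key, _group in groupby(items)]

x = ['a', 'b', 'b', 'c', 'd', 'd', 'd', 'e', 'f', 'f']
result = dedupe_consecutive(x)
print(result)  # ['a', 'b', 'c', 'd', 'e', 'f']
```

Unlike the index-based loop, this version also copes with an empty list, because it never reads x[-1].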

@Rushy's answer is great and probably what I would recommend.
That said, if you want to remove consecutive duplicates and you want to do it in-place (by modifying the list rather than creating a second one), one common technique is to work your way backwards through the list:
def remove_consecutive_duplicates(lst):
    # Walk from the last index down to 1, comparing each item with its predecessor.
    for i in range(len(lst) - 1, 0, -1):
        if lst[i] == lst[i-1]:
            lst.pop(i)
x = ['a', 'b', 'b', 'c', 'd', 'd', 'd', 'e', 'f', 'f']
remove_consecutive_duplicates(x)
print(x) # ['a', 'b', 'c', 'd', 'e', 'f']
By starting at the end of the list and moving backwards, you avoid the problem of running off the end of the list because you've shortened it.
E.g. if you start with 'aabc' and move forwards, you'll use the indexes 0, 1, 2, and 3.
0
|
aabc
(Found a duplicate, so remove that element.)
 1
 |
abc
  2
  |
abc
   3
   |
abc <-- Error! You ran off the end of the list.
Going backwards, you'll use the indexes 3, 2, 1, and 0:
   3
   |
aabc
  2
  |
aabc
 1
 |
aabc
(Found a duplicate so remove that element.)
0
|
abc <-- No problem here!

In the last iteration of your loop, j is set to i + 1, which equals the length of the list (8 in this case). You then try to access X[j], but j is beyond the end of the list.
Instead, simply convert the list to a set:
>>> set(X)
{'e', 'f', 'd', 'c', 'a', 'b'}
unless you need to preserve order, in which case you'll need to look elsewhere for an ordered set.
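One way to get ordered-set behaviour without an extra library is dict.fromkeys. This is a minimal sketch that assumes Python 3.7 or later (where dicts keep insertion order) and hashable list items; the name unique_in_order is illustrative.

```python
X = ['a', 'b', 'c', 'd', 'e', 'f', 'a', 'b']

# dict keys are unique and, in Python 3.7+, keep the order of first insertion.
unique_in_order = list(dict.fromkeys(X))
print(unique_in_order)  # ['a', 'b', 'c', 'd', 'e', 'f']
```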

It is generally not advised to mutate a sequence while iterating it since the sequence will be constantly changing. Here are some other approaches:
Given:
X = ['a', 'b', 'c', 'd', 'e', 'f', 'a', 'b']
If you are only interested in removing duplicates from a list (and order does not matter), you can use a set:
list(set(X))
['a', 'c', 'b', 'e', 'd', 'f']
If you want to maintain order and remove duplicates anywhere in the list, you can iterate while making a new list:
X_new = []
for i in X:
    if i not in X_new:
        X_new.append(i)
X_new
# Out: ['a', 'b', 'c', 'd', 'e', 'f']
If you would like to remove consecutive duplicates, consider @smarx's answer.

Related

Python, compute array difference in exact amount of elements [duplicate]

This question already has answers here:
Python removing overlap of lists
(2 answers)
Closed 2 years ago.
I have two lists in Python, like these:
temp1 = ['A', 'A', 'A', 'B', 'C', 'C','C']
temp2 = ['A','B','C','C']
I need to create a third list from the first list, removing exactly one occurrence for each element present in temp2. That is, I need to produce:
temp3 = ['A','A','C']
What is the best way of doing that? Using sets does not work as expected, so I would like to know whether there is a fast way to do it with Python's standard functions or whether I have to write my own function.
temp1 = ['A', 'A', 'A', 'B', 'C', 'C','C']
temp2 = ['A','B','C','C']
# create a copy of your first list
temp3 = list(temp1)
# remove every item from the second list of the copy
for e in temp2:
    temp3.remove(e)
Output:
['A', 'A', 'C']
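Note that list.remove raises ValueError when the item is not present, so the loop above assumes every element of temp2 also occurs in temp1. If that is not guaranteed, a guarded variant might look like the sketch below; the extra 'Z' entry is made up for illustration.

```python
temp1 = ['A', 'A', 'A', 'B', 'C', 'C', 'C']
temp2 = ['A', 'B', 'C', 'C', 'Z']  # 'Z' is a made-up item that is not in temp1

temp3 = list(temp1)  # work on a copy so temp1 stays untouched
for e in temp2:
    try:
        temp3.remove(e)  # removes the first matching occurrence only
    except ValueError:
        pass  # nothing left to remove for this item, so skip it
print(temp3)  # ['A', 'A', 'C']
```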
If the lists are guaranteed to be sorted you can do much better in terms of time complexity than list.remove or counting every iteration using:
temp1 = ['A', 'A', 'A', 'B', 'C', 'C', 'C']
temp2 = ['A', 'B', 'C', 'C']
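filtered = []
j = 0
for i, letter in enumerate(temp1):
    # Skip entries of temp2 that sort before the current letter.
    while j < len(temp2) and temp2[j] < letter:
        j += 1
    if j == len(temp2):
        # temp2 is exhausted, so everything left in temp1 is kept.
        filtered.extend(temp1[i:])
        break
    if temp2[j] > letter:
        filtered.append(letter)
    else:
        # Matching pair: consume one occurrence from temp2.
        j += 1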
Another solution
A more interesting solution I thought of:
from collections import Counter
result = []
for letter, count in (Counter(temp1)-Counter(temp2)).items():
    result.extend([letter]*count)
This is the same big O complexity as the above.
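The Counter subtraction above can also be expanded back into a list with Counter.elements(), which repeats each key as many times as its count. A short sketch of the same idea; the variable name remaining is illustrative.

```python
from collections import Counter

temp1 = ['A', 'A', 'A', 'B', 'C', 'C', 'C']
temp2 = ['A', 'B', 'C', 'C']

# Subtraction drops keys whose count falls to zero or below.
remaining = Counter(temp1) - Counter(temp2)
result = list(remaining.elements())
print(result)  # ['A', 'A', 'C']
```

Items come out grouped by key rather than in their original positions, so this fits best when the positions in temp1 do not matter.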
If lists are not sorted
If order is not important, these solutions are still much faster, since sorting the lists is cheaper than the O(n^2) solutions, and the second one doesn't even need sorting. If order is important, this still works: you just need to retain a mapping of element->index (which your temp1 already is) before sorting, though this might be out of scope for this question.
from collections import Counter
temp1 = ['A', 'A', 'A', 'B', 'C', 'C', 'C']
temp2 = ['A', 'B', 'C', 'C']
result = []
counts = Counter(temp2)
for item in temp1:
    if item in counts and counts[item]:
        counts[item] -= 1
    else:
        result.append(item)
print(result)
Output:
['A', 'A', 'C']
Scales O(n) and does not rely on sorted input.
This answer relies on the fact that Counter is just a subclass of dict, so we can use the instance as a mutable object in which to store the number of occurrences in temp2 that we still need to exclude from the result during the iteration over temp1. The documentation states explicitly that "Counter is a dict subclass" and that "Counter objects have a dictionary interface", which is a pretty good guarantee that item assignment will be supported, and that it is not necessary to treat it as a read-only object that must first be copied into a plain dict.
You can try
temp1 = ['A', 'A', 'A', 'B', 'C', 'C','C']
temp2 = ['A','B','C','C']
temp3 = []
for i in temp1:
    if temp1.count(i) - temp2.count(i) > temp3.count(i):
        temp3.append(i)
print(temp3)
For each item, this code checks whether temp3 already holds as many copies as the difference between the item's counts in temp1 and temp2. If not, it appends the item to temp3.
Output
['A', 'A', 'C']

Including first and last elements in list comprehension

I would like to keep the first and last elements of a list, and exclude others that meet defined criteria without using a loop. The first and last elements may or may not have the criteria of elements being removed.
As a very basic example,
aList = ['a','b','a','b','a']
[x for x in aList if x !='a']
returns ['b', 'b']
I need ['a','b','b','a']
I can split off the first and last values and then re-concatenate them together, but this doesn't seem very Pythonic.
You can use slice assignment:
>>> aList = ['a','b','a','b','a']
>>> aList[1:-1]=[x for x in aList[1:-1] if x !='a']
>>> aList
['a', 'b', 'b', 'a']
Yup, it looks like dawg’s and jez’s suggested answers are the right ones, here. Leaving the below for posterity.
Hmmm, your sample input and output don’t match what I think your question is, and it is absolutely pythonic to use slicing:
a_list = ['a','b','a','b','a']
# a_list = a_list[1:-1] # take everything but the first and last elements
a_list = a_list[:2] + a_list[-2:] # this gets you the [ 'a', 'b', 'b', 'a' ]
Here's a list comprehension that explicitly makes the first and last elements immune from removal, regardless of their value:
>>> aList = ['a', 'b', 'a', 'b', 'a']
>>> [ letter for index, letter in enumerate(aList) if letter != 'a' or index in [0, len(aList)-1] ]
['a', 'b', 'b', 'a']
Try this:
>>> list_ = ['a', 'b', 'a', 'b', 'a']
>>> [value for index, value in enumerate(list_) if index in {0, len(list_)-1} or value == 'b']
['a', 'b', 'b', 'a']
Although, the list comprehension is becoming unwieldy. Consider writing a generator like so:
>>> def keep_bookends_and_bs(list_):
...     for index, value in enumerate(list_):
...         if index in {0, len(list_)-1}:
...             yield value
...         elif value == 'b':
...             yield value
...
>>> list(keep_bookends_and_bs(list_))
['a', 'b', 'b', 'a']
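If the filtering criterion varies, the same idea can be wrapped in a small helper that takes the criterion as a function. This is only a sketch: the name filter_keep_ends and its keep parameter are hypothetical, not from any library.

```python
def filter_keep_ends(items, keep):
    """Filter the interior of items with keep(), always retaining both ends."""
    if len(items) <= 2:
        # Nothing lies strictly between the first and last elements.
        return list(items)
    # Unpack the filtered interior between the untouched first and last items.
    return [items[0], *filter(keep, items[1:-1]), items[-1]]

a_list = ['a', 'b', 'a', 'b', 'a']
kept = filter_keep_ends(a_list, lambda value: value != 'a')
print(kept)  # ['a', 'b', 'b', 'a']
```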

Remove element from list in Python [duplicate]

This question already has answers here:
How to remove items from a list while iterating?
(25 answers)
Closed 6 years ago.
Executing the following sample code to remove elements from list:
l = ['A', 'B', 'C', 'D']
for x in l:
    print(x, l)
    if x == 'A' or x == 'B':
        l.remove(x)
print(l)
The output in both Python 2.x and Python 3.x is:
$ python3 test.py
A ['A', 'B', 'C', 'D']
C ['B', 'C', 'D']
D ['B', 'C', 'D']
['B', 'C', 'D']
The expected output should be:
['C', 'D']
Why is Python behaving this way ?
What is the safest way to remove elements from a list ?
The problem is that once you have deleted 'A', 'B' becomes the first element, while the loop's internal position is now "pointing" at the second element, which is 'C'. Thus the check runs for 'C', and the check for 'B' is skipped.
To do what you meant to do, you want to write the following:
a = ['A', 'B', 'C', 'D']
i = 0
while i < len(a):
    if a[i] == 'A' or a[i] == 'B':
        del a[i]
    else:
        i += 1
print(a)
This way, if you delete the element your loop is currently looking at, you stay at the same index, which is the right place to look next because the remaining elements have shifted down by one. I.e.:
a b c d
  ^
(deleted)
a c d
  ^
On the other hand, if you do not remove the current element, you just proceed to the next one.
That is not the optimal way to delete all occurrences of 'A' or 'B' in Python, though.
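A more common idiom is to build the filtered list in one pass and assign it back. The sketch below uses slice assignment so that the existing list object is updated in place; the set name unwanted is illustrative.

```python
a = ['A', 'B', 'C', 'D']
unwanted = {'A', 'B'}

# The comprehension builds the filtered list first; a[:] = ... then replaces
# the contents of the existing list object instead of rebinding the name.
a[:] = [item for item in a if item not in unwanted]
print(a)  # ['C', 'D']
```

Because the comprehension finishes before the assignment happens, nothing is removed while the list is being iterated.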

Python insert operation on list

I am new to Python and I have a question about the insert operation on lists.
Example 1:
mylist = ['a','b','c','d','e']
mylist.insert(len(mylist),'f')
print(mylist)
Output:
['a', 'b', 'c', 'd', 'e', 'f']
Example 2:
mylist = ['a','b','c','d','e']
mylist.insert(10,'f')
print(mylist)
Output:
['a', 'b', 'c', 'd', 'e', 'f']
In the second example, why does it still insert the 'f' element into the list even though I am passing index 10 to the insert method?
The list.insert function will insert before the specified index. Since the list in your example is not that long, the item goes on the end. Why not just use list.append if you want to put items on the end anyway?
x = ['a']
x.append('b')
print(x)
Output is
['a', 'b']
The concept here is "insert before the element with this index". To be able to insert at the end, you have to allow the invalid off-the-end index. Since there are no well-defined "off-the-end iterators" or anything in Python, it makes more sense to just allow all indices than to allow one invalid one, but no others.
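The clamping described above applies in both directions, which a short experiment can show. This is a sketch using made-up values; contrast it with plain indexing, which raises IndexError for the same out-of-range index.

```python
mylist = ['a', 'b', 'c']

mylist.insert(10, 'z')   # index past the end: behaves like append
mylist.insert(-10, 'y')  # index before the start: inserts at the front
print(mylist)  # ['y', 'a', 'b', 'c', 'z']

try:
    mylist[10]  # reading an out-of-range index is still an error
except IndexError as exc:
    print('IndexError:', exc)
```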

How to maintain consistency in list?

I have a list like
lst = ['a', 'b', 'c', 'd', 'e', 'f']
I have a pop position list
p_list = [0,3]
[lst.pop(i) for i in p_list] changes the list to ['b', 'c', 'd', 'f'], because the list is modified after the first iteration and the next pop operates on the modified list.
But I want to pop the elements at indices [0, 3] of the original list, so my new list should be
['b', 'c', 'e', 'f']
Lots of reasonable answers, here's another perfectly terrible one:
[item for index, item in enumerate(lst) if index not in p_list]
You could pop the elements in order from largest index to smallest, like so:
lst = ['a', 'b', 'c', 'd', 'e', 'f']
p_list = [0,3]
p_list.sort()
p_list.reverse()
[lst.pop(i) for i in p_list]
lst
#output: ['b', 'c', 'e', 'f']
Do the pops in reversed order:
>>> lst = ['a', 'b', 'c', 'd', 'e', 'f']
>>> p_list = [0, 3]
>>> [lst.pop(i) for i in reversed(p_list)][::-1]
['a', 'd']
>>> lst
['b', 'c', 'e', 'f']
The important part here is that inside of the list comprehension you should always call lst.pop() on later indices first, so this will only work if p_list is guaranteed to be in ascending order. If that is not the case, use the following instead:
[lst.pop(i) for i in sorted(p_list, reverse=True)]
Note that this method makes it more complicated to get the popped items in the correct order from p_list, if that is important.
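If the popped items are needed in the order given by p_list, one option is to read them out before deleting anything, then delete from the highest index down. A minimal sketch, assuming every index in p_list is valid for lst; the name popped is illustrative.

```python
lst = ['a', 'b', 'c', 'd', 'e', 'f']
p_list = [3, 0]  # deliberately not in ascending order

# Read the items first, while the original indices are still valid.
popped = [lst[i] for i in p_list]

# Delete from the highest index down so earlier deletions do not shift later ones.
# set() guards against the same index appearing twice.
for i in sorted(set(p_list), reverse=True):
    del lst[i]

print(popped)  # ['d', 'a']
print(lst)     # ['b', 'c', 'e', 'f']
```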
Your method of modifying the list may be error-prone. Why not use numpy to access only the elements you want? That way everything stays in place (in case you need it later) and it's a snap to make a new pop list. Starting from your definitions of lst and p_list:
import numpy as np
lst = np.array(lst)
idx = np.ones(lst.shape, dtype=bool)
idx[p_list] = False
print(lst[idx])
Gives ['b' 'c' 'e' 'f'] as expected.
