Python: Removing a single element from a nested list - python

I'm having trouble figuring out how to remove something from within a nested list.
For example, how would I remove 'x' from the below list?
lst = [['x',6,5,4],[4,5,6]]
I tried del lst[0][0], but I get the following result:
TypeError: 'str' object doesn't support item deletion.
I also tried a for loop, but got the same error:
for char in lst:
del char[0]

Use the pop(i) function on the nested list. For example:
lst = [['x',6,5,4],[4,5,6]]
lst[0].pop(0)
print lst #should print [[6, 5, 4], [4, 5, 6]]
Done.

Your code works fine. Are you sure lst is defined as [['x',6,5,4],[4,5,6]]? Because if it is, del lst[0][0] effectively deletes 'x'.
Perhaps you have defined lst as ['x',6,5,4], in which case, you will indeed get the error you are mentioning.

You can also use "pop". E.g.,
list = [['x',6,5,4],[4,5,6]]
list[0].pop(0)
will result in
list = [[6,5,4],[4,5,6]]
See this thread for more: How to remove an element from a list by index in Python?

Related

How to 'format' a list in python [duplicate]

I'm trying to write a program that removes duplicates from a list, but my program keeps throwing the error "list index out of range" on line 5, if n/(sequence[k]) == 1:. I can't figure this out. Am I right in thinking that the possible values of "k" are 0, 1, and 2? How is "sequence" with any of those as the index outside of the possible index range?
def remove_duplicates(sequence):
new_list = sequence
for n in sequence:
for k in range(len(sequence)):
if n/(sequence[k]) == 1:
new_list.remove(sequence[k])
print new_list
remove_duplicates([1,2,3])
I strongly suggest Akavall's answer:
list(set(your_list))
As to why you get out of range errors: Python passes by reference, that is sequence and new_list still point at the same memory location. Changing new_list also changes sequence.
And finally, you are comparing items with themselves, and then remove them. So basically even if you used a copy of sequence, like:
new_list = list(sequence)
or
new_list = sequence[:]
It would return an empty list.
Your error is concurrent modification of the list:
for k in range(len(sequence)):
if n/(sequence[k]) == 1:
new_list.remove(sequence[k])
It may seem removing from new_list shouldn't effect sequence, but you did new_list = sequence at the beginning of the function. This means new_list actually literally is sequence, perhaps what you meant is new_list=list(sequence), to copy the list?
If you accept that they are the same list, the error is obvious. When you remove items, the length, and the indexes change.
P.S. As mentioned in a comment by #Akavall, all you need is:
sequence=list(set(sequence))
To make sequence contain no dupes. Another option, if you need to preserve ordering, is:
from collections import OrderedDict
sequence=list(OrderedDict.fromkeys(sequence))
If you don't like list(set(your_list)) because it's not guaranteed to preserved order, you could grab the OrderedSet recipe and then do:
from ordered_set import OrderedSet
foo = list("face a dead cabbage")
print foo
print list(set(foo)) # Order might change
print list(OrderedSet(foo)) # Order preserved
# like #Akavall suggested
def remove_duplicates(sequence):
# returns unsorted unique list
return list(set(sequence))
# create a list, if ele from input not in that list, append.
def remove_duplicates(sequence):
lst = []
for i in sequence:
if i not in lst:
lst.append(i)
# returns unsorted unique list
return lst

Organize/Formatting the python code to one line

Is there any way to rewrite the below python code in one line
for i in range(len(main_list)):
if main_list[i] != []:
for j in range(len(main_list[i])):
main_list[i][j][6]=main_list[i][j][6].strftime('%Y-%m-%d')
something like below,
[main_list[i][j][6]=main_list[i][j][6].strftime('%Y-%m-%d') for i in range(len(main_list)) if main_list[i] != [] for j in range(len(main_list[i]))]
I got SyntaxError for this.
Actually, i'm trying to storing all the values fetched from table into one list. Since the table contains date method/datatype, my requirement needs to convert it to string as i faced with malformed string error.
So my approach is to convert that element of list from datetime.date() to str. And i got it working. Just wanted it to work with one line
Use the explicit for loop. There's no better option.
A list comprehension is used to create a new list, not to modify certain elements of an existing list.
You may be able to update values via a list comprehension, e.g. [L.__setitem__(i, 'some_value') for i in range(len(L))], but this is not recommended as you are using a side-effect and in the process creating a list of None values which you then discard.
You could also write a convoluted list comprehension with a ternary statement indicating when you meet the 6th element in a 3rd nested sublist. But this will make your code difficult to maintain.
In short, use the for loop.
You're getting a syntax error because you're not allowed to perform assignments within a list comprehension. Python forbids assignments because it is discouraging over complex list comprehensions in favour of for loops.
Obviously you shouldn't do this on one line, but this is how to do it:
import datetime
# Example from your comment:
type1 = "some type"
main_list = [[], [],
[[1, 2, 3, datetime.date(2016, 8, 18), type1],
[3, 4, 5, datetime.date(2016, 8, 18), type1]], [], []]
def fmt_times(lst):
"""Format the fourth value of each element of each non-empty sublist"""
for i in range(len(lst)):
if lst[i] != []:
for j in range(len(lst[i])):
lst[i][j][3] = lst[i][j][3].strftime('%Y-%m-%d')
return lst
def fmt_times_one_line(main_list):
"""Format the fourth value of each element of each non-empty sublist"""
return [[] if main_list[i] == [] else [[main_list[i][j][k] if k != 3 else main_list[i][j][k].strftime('%Y-%m-%d') for k in range(len(main_list[i][j]))] for j in range(len(main_list[i])) ] for i in range(len(main_list))]
import copy
# Deep copy needed because fmt_times modifies the sublists.
assert fmt_times(copy.deepcopy(main_list)) == fmt_times_one_line(main_list)
The list comprehension is a functional thing. If you know how map() works in python or javascript then it's the same thing. In a map() or comprehension we generally don't mutate the data we're mapping over (and python discourages attempting it) so instead we recreate the entire object, substituting only the values we wanted to modify.
One line?
main_list = convert_list(main_list)
You will have to put a few more lines somewhere else though:
def convert_list(main_list):
for i, ml in enumerate(main_list):
if isinstance(ml, list) and len(ml) > 0:
main_list[i] = convert_list(ml)
elif isinstance(ml, datetime.date):
main_list[i] = ml.strftime('%Y-%m-%d')
return main_list
You might be able to whack this together with a list comprehension but it's a terrible idea (for reasons better explained in the other answer).

Removing duplicates from a list in Python [duplicate]

This question already has answers here:
Removing duplicates in lists
(56 answers)
Closed 4 years ago.
How would I use python to check a list and delete all duplicates? I don't want to have to specify what the duplicate item is - I want the code to figure out if there are any and remove them if so, keeping only one instance of each. It also must work if there are multiple duplicates in a list.
For example, in my code below, the list lseparatedOrbList has 12 items - one is repeated six times, one is repeated five times, and there is only one instance of one. I want it to change the list so there are only three items - one of each, and in the same order they appeared before. I tried this:
for i in lseparatedOrbList:
for j in lseparatedOrblist:
if lseparatedOrbList[i] == lseparatedOrbList[j]:
lseparatedOrbList.remove(lseparatedOrbList[j])
But I get the error:
Traceback (most recent call last):
File "qchemOutputSearch.py", line 123, in <module>
for j in lseparatedOrblist:
NameError: name 'lseparatedOrblist' is not defined
I'm guessing because it's because I'm trying to loop through lseparatedOrbList while I loop through it, but I can't think of another way to do it.
Use set():
woduplicates = set(lseparatedOrblist)
Returns a set without duplicates. If you, for some reason, need a list back:
woduplicates = list(set(lseperatedOrblist))
This will, however, have a different order than your original list.
Just make a new list to populate, if the item for your list is not yet in the new list input it, else just move on to the next item in your original list.
for i in mylist:
if i not in newlist:
newlist.append(i)
This should be faster and will preserve the original order:
seen = {}
new_list = [seen.setdefault(x, x) for x in my_list if x not in seen]
If you don't care about order, you can just:
new_list = list(set(my_list))
You can do this like that:
x = list(set(x))
Example: if you do something like that:
x = [1,2,3,4,5,6,7,8,9,10,2,1,6,31,20]
x = list(set(x))
x
you will see the following result:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 31]
There is only one thing you should think of: the resulting list will not be ordered as the original one (will lose the order in the process).
The modern way to do it that maintains the order is:
>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys(lseparatedOrbList))
as discussed by Raymond Hettinger in this answer. In python 3.5 and above this is also the fastest way - see the linked answer for details. However the keys must be hashable (as is the case in your list I think)
As of python 3.7 ordered dicts are a language feature so the above call becomes
>>> list(dict.fromkeys(lseparatedOrbList))
Performance:
"""Dedup list."""
import sys
import timeit
repeat = 3
numbers = 1000
setup = """"""
def timer(statement, msg='', _setup=None):
print(msg, min(
timeit.Timer(statement, setup=_setup or setup).repeat(
repeat, numbers)))
print(sys.version)
s = """import random; n=%d; li = [random.randint(0, 100) for _ in range(n)]"""
for siz, m in ((150, "\nFew duplicates"), (15000, "\nMany duplicates")):
print(m)
setup = s % siz
timer('s = set(); [i for i in li if i not in s if not s.add(i)]', "s.add(i):")
timer('list(dict.fromkeys(li))', "dict:")
timer('list(set(li))', 'Not order preserving: list(set(li)):')
gives:
3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)]
Few duplicates
s.add(i): 0.008242200000040611
dict: 0.0037373999998635554
Not order preserving: list(set(li)): 0.0029409000001123786
Many duplicates
s.add(i): 0.2839437000000089
dict: 0.21970469999996567
Not order preserving: list(set(li)): 0.102068700000018
So dict seems consistently faster although approaching list comprehension with set.add for many duplicates - not sure if further varying the numbers would give different results. list(set) is of course faster but does not preserve original list order, a requirement here
This should do it for you:
new_list = list(set(old_list))
set will automatically remove duplicates. list will cast it back to a list.
No, it's simply a typo, the "list" at the end must be capitalized. You can nest loops over the same variable just fine (although there's rarely a good reason to).
However, there are other problems with the code. For starters, you're iterating through lists, so i and j will be items not indices. Furthermore, you can't change a collection while iterating over it (well, you "can" in that it runs, but madness lies that way - for instance, you'll propably skip over items). And then there's the complexity problem, your code is O(n^2). Either convert the list into a set and back into a list (simple, but shuffles the remaining list items) or do something like this:
seen = set()
new_x = []
for x in xs:
if x in seen:
continue
seen.add(x)
new_xs.append(x)
Both solutions require the items to be hashable. If that's not possible, you'll probably have to stick with your current approach sans the mentioned problems.
It's because you are missing a capital letter, actually.
Purposely dedented:
for i in lseparatedOrbList: # capital 'L'
for j in lseparatedOrblist: # lowercase 'l'
Though the more efficient way to do it would be to insert the contents into a set.
If maintaining the list order matters (ie, it must be "stable"), check out the answers on this question
for unhashable lists. It is faster as it does not iterate about already checked entries.
def purge_dublicates(X):
unique_X = []
for i, row in enumerate(X):
if row not in X[i + 1:]:
unique_X.append(row)
return unique_X
There is a faster way to fix this:
list = [1, 1.0, 1.41, 1.73, 2, 2, 2.0, 2.24, 3, 3, 4, 4, 4, 5, 6, 6, 8, 8, 9, 10]
list2=[]
for value in list:
try:
list2.index(value)
except:
list2.append(value)
list.clear()
for value in list2:
list.append(value)
list2.clear()
print(list)
print(list2)
In this way one can delete a particular item which is present multiple times in a list : Try deleting all 5
list1=[1,2,3,4,5,6,5,3,5,7,11,5,9,8,121,98,67,34,5,21]
print list1
n=input("item to be deleted : " )
for i in list1:
if n in list1:
list1.remove(n)
print list1

Remove duplicates from list python

I'm trying to write a program that removes duplicates from a list, but my program keeps throwing the error "list index out of range" on line 5, if n/(sequence[k]) == 1:. I can't figure this out. Am I right in thinking that the possible values of "k" are 0, 1, and 2? How is "sequence" with any of those as the index outside of the possible index range?
def remove_duplicates(sequence):
new_list = sequence
for n in sequence:
for k in range(len(sequence)):
if n/(sequence[k]) == 1:
new_list.remove(sequence[k])
print new_list
remove_duplicates([1,2,3])
I strongly suggest Akavall's answer:
list(set(your_list))
As to why you get out of range errors: Python passes by reference, that is sequence and new_list still point at the same memory location. Changing new_list also changes sequence.
And finally, you are comparing items with themselves, and then remove them. So basically even if you used a copy of sequence, like:
new_list = list(sequence)
or
new_list = sequence[:]
It would return an empty list.
Your error is concurrent modification of the list:
for k in range(len(sequence)):
if n/(sequence[k]) == 1:
new_list.remove(sequence[k])
It may seem removing from new_list shouldn't effect sequence, but you did new_list = sequence at the beginning of the function. This means new_list actually literally is sequence, perhaps what you meant is new_list=list(sequence), to copy the list?
If you accept that they are the same list, the error is obvious. When you remove items, the length, and the indexes change.
P.S. As mentioned in a comment by #Akavall, all you need is:
sequence=list(set(sequence))
To make sequence contain no dupes. Another option, if you need to preserve ordering, is:
from collections import OrderedDict
sequence=list(OrderedDict.fromkeys(sequence))
If you don't like list(set(your_list)) because it's not guaranteed to preserved order, you could grab the OrderedSet recipe and then do:
from ordered_set import OrderedSet
foo = list("face a dead cabbage")
print foo
print list(set(foo)) # Order might change
print list(OrderedSet(foo)) # Order preserved
# like #Akavall suggested
def remove_duplicates(sequence):
# returns unsorted unique list
return list(set(sequence))
# create a list, if ele from input not in that list, append.
def remove_duplicates(sequence):
lst = []
for i in sequence:
if i not in lst:
lst.append(i)
# returns unsorted unique list
return lst

How to remove all duplicate items from a list [duplicate]

This question already has answers here:
Removing duplicates in lists
(56 answers)
Closed 4 years ago.
How would I use python to check a list and delete all duplicates? I don't want to have to specify what the duplicate item is - I want the code to figure out if there are any and remove them if so, keeping only one instance of each. It also must work if there are multiple duplicates in a list.
For example, in my code below, the list lseparatedOrbList has 12 items - one is repeated six times, one is repeated five times, and there is only one instance of one. I want it to change the list so there are only three items - one of each, and in the same order they appeared before. I tried this:
for i in lseparatedOrbList:
for j in lseparatedOrblist:
if lseparatedOrbList[i] == lseparatedOrbList[j]:
lseparatedOrbList.remove(lseparatedOrbList[j])
But I get the error:
Traceback (most recent call last):
File "qchemOutputSearch.py", line 123, in <module>
for j in lseparatedOrblist:
NameError: name 'lseparatedOrblist' is not defined
I'm guessing because it's because I'm trying to loop through lseparatedOrbList while I loop through it, but I can't think of another way to do it.
Use set():
woduplicates = set(lseparatedOrblist)
Returns a set without duplicates. If you, for some reason, need a list back:
woduplicates = list(set(lseperatedOrblist))
This will, however, have a different order than your original list.
Just make a new list to populate, if the item for your list is not yet in the new list input it, else just move on to the next item in your original list.
for i in mylist:
if i not in newlist:
newlist.append(i)
This should be faster and will preserve the original order:
seen = {}
new_list = [seen.setdefault(x, x) for x in my_list if x not in seen]
If you don't care about order, you can just:
new_list = list(set(my_list))
You can do this like that:
x = list(set(x))
Example: if you do something like that:
x = [1,2,3,4,5,6,7,8,9,10,2,1,6,31,20]
x = list(set(x))
x
you will see the following result:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 20, 31]
There is only one thing you should think of: the resulting list will not be ordered as the original one (will lose the order in the process).
The modern way to do it that maintains the order is:
>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys(lseparatedOrbList))
as discussed by Raymond Hettinger in this answer. In python 3.5 and above this is also the fastest way - see the linked answer for details. However the keys must be hashable (as is the case in your list I think)
As of python 3.7 ordered dicts are a language feature so the above call becomes
>>> list(dict.fromkeys(lseparatedOrbList))
Performance:
"""Dedup list."""
import sys
import timeit
repeat = 3
numbers = 1000
setup = """"""
def timer(statement, msg='', _setup=None):
print(msg, min(
timeit.Timer(statement, setup=_setup or setup).repeat(
repeat, numbers)))
print(sys.version)
s = """import random; n=%d; li = [random.randint(0, 100) for _ in range(n)]"""
for siz, m in ((150, "\nFew duplicates"), (15000, "\nMany duplicates")):
print(m)
setup = s % siz
timer('s = set(); [i for i in li if i not in s if not s.add(i)]', "s.add(i):")
timer('list(dict.fromkeys(li))', "dict:")
timer('list(set(li))', 'Not order preserving: list(set(li)):')
gives:
3.7.6 (tags/v3.7.6:43364a7ae0, Dec 19 2019, 00:42:30) [MSC v.1916 64 bit (AMD64)]
Few duplicates
s.add(i): 0.008242200000040611
dict: 0.0037373999998635554
Not order preserving: list(set(li)): 0.0029409000001123786
Many duplicates
s.add(i): 0.2839437000000089
dict: 0.21970469999996567
Not order preserving: list(set(li)): 0.102068700000018
So dict seems consistently faster although approaching list comprehension with set.add for many duplicates - not sure if further varying the numbers would give different results. list(set) is of course faster but does not preserve original list order, a requirement here
This should do it for you:
new_list = list(set(old_list))
set will automatically remove duplicates. list will cast it back to a list.
No, it's simply a typo, the "list" at the end must be capitalized. You can nest loops over the same variable just fine (although there's rarely a good reason to).
However, there are other problems with the code. For starters, you're iterating through lists, so i and j will be items not indices. Furthermore, you can't change a collection while iterating over it (well, you "can" in that it runs, but madness lies that way - for instance, you'll propably skip over items). And then there's the complexity problem, your code is O(n^2). Either convert the list into a set and back into a list (simple, but shuffles the remaining list items) or do something like this:
seen = set()
new_x = []
for x in xs:
if x in seen:
continue
seen.add(x)
new_xs.append(x)
Both solutions require the items to be hashable. If that's not possible, you'll probably have to stick with your current approach sans the mentioned problems.
It's because you are missing a capital letter, actually.
Purposely dedented:
for i in lseparatedOrbList: # capital 'L'
for j in lseparatedOrblist: # lowercase 'l'
Though the more efficient way to do it would be to insert the contents into a set.
If maintaining the list order matters (ie, it must be "stable"), check out the answers on this question
for unhashable lists. It is faster as it does not iterate about already checked entries.
def purge_dublicates(X):
unique_X = []
for i, row in enumerate(X):
if row not in X[i + 1:]:
unique_X.append(row)
return unique_X
There is a faster way to fix this:
list = [1, 1.0, 1.41, 1.73, 2, 2, 2.0, 2.24, 3, 3, 4, 4, 4, 5, 6, 6, 8, 8, 9, 10]
list2=[]
for value in list:
try:
list2.index(value)
except:
list2.append(value)
list.clear()
for value in list2:
list.append(value)
list2.clear()
print(list)
print(list2)
In this way one can delete a particular item which is present multiple times in a list : Try deleting all 5
list1=[1,2,3,4,5,6,5,3,5,7,11,5,9,8,121,98,67,34,5,21]
print list1
n=input("item to be deleted : " )
for i in list1:
if n in list1:
list1.remove(n)
print list1

Categories

Resources