First of all, let me acknowledge that this question has been asked before, but the answers seem either outdated or unsatisfactory. The question is: given a list of unsorted lists, how can we remove duplicates in the most efficient and elegant manner (i.e. shortest syntax vs. fastest running time)?
Example:
Given [[1,2,3],[],[2,-2],[3,2,1]], we want [[1,2,3],[],[2,-2]]. Note that it doesn't matter whether [1,2,3] or [3,2,1] is kept.
You can do:
>>> li=[[1,2,3],[],[2,-2],[3,2,1]]
>>> {frozenset(e) for e in li}
{frozenset({1, 2, 3}), frozenset({2, -2}), frozenset()}
>>> [list(x) for x in {frozenset(e) for e in li}]
[[1, 2, 3], [2, -2], []]
The key is to use frozenset since a set is not hashable. Note the order may change with this method.
If you want to maintain the same order, you can do:
>>> seen=set()
>>> [e for e in li if frozenset(e) not in seen and not seen.add(frozenset(e))]
[[1, 2, 3], [], [2, -2]]
If there is a possibility of repeated elements within the sublists, you can sort the sublists and use a representation of that:
li=[[1,2,3],[],[2,-2],[3,2,1],[1,1,2,2,3],[1,2,1,2,3]]
seen=set()
nli=[]
for e in li:
    re=repr(sorted(e))
    if re not in seen:
        seen.add(re)
        nli.append(e)
>>> nli
[[1, 2, 3], [], [2, -2], [1, 1, 2, 2, 3]]
(Note: You can use tuple instead of repr if desired. Either produces a hashable, immutable result.)
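A minimal sketch of the same idea wrapped in a reusable function, using tuple(sorted(...)) as the key instead of repr. The name dedupe_unordered is made up for illustration, and the sketch assumes the elements within each sublist can be compared with one another so that sorted() works.

```python
def dedupe_unordered(lists):
    """Drop sublists that hold the same elements in a different order.

    Keeps the first occurrence of each group and preserves the input order.
    """
    seen = set()
    result = []
    for sub in lists:
        key = tuple(sorted(sub))  # hashable, order-insensitive, keeps repeats
        if key not in seen:
            seen.add(key)
            result.append(sub)
    return result


li = [[1, 2, 3], [], [2, -2], [3, 2, 1], [1, 1, 2, 2, 3], [1, 2, 1, 2, 3]]
print(dedupe_unordered(li))
```

Unlike the frozenset version, [1, 2, 3] and [1, 1, 2, 2, 3] are treated as different here, because the sorted tuple keeps repeated elements.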
I have a nested list as an example:
lst_a = [[1,2,3,5], [1,2,3,7], [1,2,3,9], [1,2,6,8]]
I'm trying to check whether the first three elements of a nested list element are the same as those of another.
I.e.
if [1,2,3] appears at the start of other sublists, remove all the other sublists that start with it, so that each prefix in the nested list is unique.
I'm not sure what the most Pythonic way of doing this would be.
for i in range(0, len(lst_a)):
    if lst_a[i][:3] == lst_a[i-1][:3]:
        lst_a[i].pop()
Desired output:
lst_a = [[1,2,3,9], [1,2,6,8]]
If, as you said in comments, sublists that have the same first three elements are always next to each other (but the list is not necessarily sorted) you can use itertools.groupby to group those elements and then get the next from each of the groups.
>>> from itertools import groupby
>>> lst_a = [[1,2,3,5], [1,2,3,7], [1,2,3,9], [1,2,6,8]]
>>> [next(g) for k, g in groupby(lst_a, key=lambda x: x[:3])]
[[1, 2, 3, 5], [1, 2, 6, 8]]
Or use a list comprehension with enumerate and compare the current element with the last one:
>>> [x for i, x in enumerate(lst_a) if i == 0 or lst_a[i-1][:3] != x[:3]]
[[1, 2, 3, 5], [1, 2, 6, 8]]
This does not require any imports, but IMHO when using groupby it is much clearer what the code is supposed to do. Note, however, that unlike your method, both of those will create a new filtered list, instead of updating/deleting from the original list.
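The desired output in the question keeps the last sublist of each run rather than the first, and the original attempt modified the list in place. A hedged sketch that covers both, still assuming that sublists with the same first three elements are adjacent:

```python
from itertools import groupby

lst_a = [[1, 2, 3, 5], [1, 2, 3, 7], [1, 2, 3, 9], [1, 2, 6, 8]]

# list(g)[-1] takes the last sublist of each run instead of the first.
last_of_each = [list(g)[-1] for _, g in groupby(lst_a, key=lambda x: x[:3])]

# Slice assignment replaces the contents of the existing list object,
# so other references to lst_a see the filtered result as well.
lst_a[:] = last_of_each
print(lst_a)
```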
I think you are missing a second for loop if you want to check all possibilities. I guess it should look like this:
for i in range(0, len(lst_a)):
    for j in range(i + 1, len(lst_a)):  # start after i so an element is not compared with itself
        if lst_a[i][:3] == lst_a[j][:3]:
            lst_a[i].pop()
Deleting while going through the list is probably not the best idea; you should delete the unwanted elements at the end.
Going with your approach, here is the code:
lst=[lst_a[0]]
for li in lst_a[1:]:
    # compare with the most recently kept sublist
    if li[:3]!=lst[-1][:3]:
        lst.append(li)
print(lst)
Hope this helps!
You can use a dictionary to filter a list:
dct = {tuple(i[:3]): i for i in lst_a}
# {(1, 2, 3): [1, 2, 3, 9], (1, 2, 6): [1, 2, 6, 8]}
list(dct.values())
# [[1, 2, 3, 9], [1, 2, 6, 8]]
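Because later entries overwrite earlier ones for the same key, the comprehension above keeps the last sublist for each prefix. A small sketch for keeping the first one instead, using dict.setdefault. It relies on dicts preserving insertion order, which is guaranteed from Python 3.7 onwards, and the name first_seen is only illustrative.

```python
lst_a = [[1, 2, 3, 5], [1, 2, 3, 7], [1, 2, 3, 9], [1, 2, 6, 8]]

first_seen = {}
for sub in lst_a:
    # setdefault only stores the value if the key is not present yet
    first_seen.setdefault(tuple(sub[:3]), sub)

print(list(first_seen.values()))
```

Either dictionary variant also works when the matching sublists are not next to each other.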
I work in Python 3.6, and I am a beginner. Can anyone show me a proper way to slice a list into sub-lists of variable size? I tried a solution from this site, but it only gave me sub-lists of a fixed size. To clarify:
if I have this list:
inputList= [0,1,2,3,4,5,6,7]
I want the output to be like e.g.:
outputList=[[0,1,2], [3,4], [5], [6,7]]
each time it depends on (for example) user input or some variable size.
Just use itertools.islice(). This has the added advantage that if you request a slice that would normally take you out of bounds, you won't get an error. You'll just get as many items as are left.
>>> import itertools as it
>>> input_list = range(8)
>>> slices = (3, 2, 1, 2)
>>> iterable = iter(input_list)
>>> output_list = [list(it.islice(iterable, sl)) for sl in slices]
>>> output_list
[[0, 1, 2], [3, 4], [5], [6, 7]]
For example, if you had slices = (3, 2, 1, 3, 2), your result would be [[0, 1, 2], [3, 4], [5], [6, 7], []].
Basically, iter(input_list) creates an iterator over your list, so you can fetch the next k values with islice(). Each call continues where the previous one stopped.
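The same technique can be wrapped in a small generator. This is only a sketch; the function name split_by_sizes is made up for illustration.

```python
from itertools import islice


def split_by_sizes(items, sizes):
    """Yield consecutive chunks of `items` whose lengths follow `sizes`."""
    iterator = iter(items)  # one shared iterator, so each chunk resumes where the last stopped
    for size in sizes:
        yield list(islice(iterator, size))


print(list(split_by_sizes(range(8), (3, 2, 1, 2))))
print(list(split_by_sizes(range(8), (3, 2, 1, 3, 2))))
```

The second call asks for more items than exist, so the last two chunks come back shorter than requested (the final one is empty).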
You can do this in a loop using Python's [:n] slice notation.
Let's say the user's input for the chunk sizes is stored in a list. They want chunks of 3, 2, 1 and 2, so the list holds [3,2,1,2].
Then loop through that list and use it to get your slices:
input_list = [0,1,2,3,4,5,6,7]
chunk_list = [3,2,1,2]
output_list = []
for i in chunk_list:
    output_list.append(input_list[:i])
    input_list = input_list[i:]
print(output_list)
Prints:
[[0, 1, 2], [3, 4], [5], [6, 7]]
Probably you just want to learn about slicing. See Understanding Python's slice notation
To get your example output, you could do
outputList = [inputList[:3], inputList[3:5], inputList[5:6], inputList[6:]]
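If the chunk sizes are only known at run time, the slice boundaries can be computed instead of written out by hand. A sketch using itertools.accumulate; the variable names sizes, starts and ends are illustrative.

```python
from itertools import accumulate

input_list = [0, 1, 2, 3, 4, 5, 6, 7]
sizes = [3, 2, 1, 2]

ends = list(accumulate(sizes))   # running totals: [3, 5, 6, 8]
starts = [0] + ends[:-1]         # each chunk starts where the previous one ended
output_list = [input_list[s:e] for s, e in zip(starts, ends)]
print(output_list)
```

This leaves input_list unchanged, since it only reads slices from it.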
Is there a way in Python to sort a list where there are strings, floats and integers in it?
I tried to use list.sort() method but of course it did not work.
Here is an example of a list I would like to sort:
[2.0, True, [2, 3, 4, [3, [3, 4]], 5], "titi", 1]
I would like floats and ints to be sorted by value, and everything else by type: floats and ints first, then strings, then booleans and then lists. I would like to use Python 2.7 but I am not allowed to...
Expected output:
[1, 2.0, "titi", True, [2, 3, 4, [3, [3, 4]], 5]]
Python's comparison operators wisely refuse to work for variables of incompatible types. Decide on the criterion for sorting your list, encapsulate it in a function and pass it as the key option to sort(). For example, to sort by the repr of each element (a string):
l.sort(key=repr)
To sort by type first, then by the contents:
l.sort(key=lambda x: (str(type(x)), x))
The latter has the advantage that numbers get sorted numerically, strings alphabetically, etc. It will still fail if there are two sublists that cannot be compared, but then you must decide what to do-- just extend your key function however you see fit.
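One possible way to extend the key function so that sublists with mixed contents stay comparable is to apply the same key recursively. This is a sketch, not the only option: the name sort_key and the rank numbers are assumptions, chosen to follow the order asked for in the question (numbers, strings, booleans, lists), and anything that is not a list, bool or str is assumed to be an int or a float.

```python
def sort_key(item):
    """Return a (rank, value) pair so that mixed types never compare directly."""
    if isinstance(item, list):
        # Apply the same key to every element so nested lists stay comparable.
        return (3, [sort_key(sub) for sub in item])
    if isinstance(item, bool):
        # bool is a subclass of int, so test it before the numeric fall-through.
        return (2, item)
    if isinstance(item, str):
        return (1, item)
    return (0, item)  # assumed to be an int or a float


data = [2.0, True, [2, 3, 4, [3, [3, 4]], 5], "titi", 1]
print(sorted(data, key=sort_key))
print(sorted([[1, "a"], [1, 2]], key=sort_key))
```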
The key argument to list.sort or sorted can be used to sort the list the way you need. First you need to define how you want to order the types; the easiest (and probably fastest) way is a dictionary with types as keys and the order as values:
# define a dictionary that gives the ordering of the types
priority = {int: 0, float: 0, str: 1, bool: 2, list: 3}
To make this work one can use the fact that tuples and lists compare by first comparing the first element and if that is equal compare the second element, if that's equal compare the third (and so on).
# Define a function that converts the items to a tuple consisting of the priority
# and the actual value
def priority_item(item):
    return priority[type(item)], item
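The tuple comparison rule that this relies on can be checked directly. A short sketch; the variable names are illustrative:

```python
# Tuples compare element by element and stop at the first difference,
# so the second elements are only compared when the priorities are equal.
different_priority = (0, 2.0) < (1, "titi")   # 2.0 and "titi" are never compared
same_priority = (0, 1) < (0, 2.0)             # falls back to 1 < 2.0
print(different_priority, same_priority)

try:
    (1, "titi") < (1, 3)   # equal priorities with values that cannot be compared
except TypeError as exc:
    print("TypeError:", exc)
```

The last case shows why values that share a priority must be comparable with each other.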
Finally you can sort your input, I'm going to shuffle it because it's already sorted (as far as I understand your question):
>>> l = [1, 2.0, "titi", True, [2, 3, 4, [3, [3, 4]], 5]]
>>> import random
>>> random.shuffle(l)
>>> print(l)
[True, [2, 3, 4, [3, [3, 4]], 5], 'titi', 2.0, 1]
>>> # Now sort it
>>> sorted(l, key=priority_item)
[1, 2.0, 'titi', True, [2, 3, 4, [3, [3, 4]], 5]]
I am a Python beginner--headbangin'. This is a very basic question and I can't seem to find any straightforward answer, either using Google or Stack Overflow.
QUESTION:
I have a nested list:
l = [
    [1,4,3,n],
    [2,2,4,n],
    [3,1,5,n],
]
I want to sort the entire list by the second value smallest to largest. I will end up sorting the list again by the third value... and nth value of each nested list.
HOW WOULD ONE SORT based on the SECOND, THIRD, Nth value?
A key is mentioned, but "key=lambda" is often used and that just confuses me more.
EDIT: Thank you guys for your help. I was able to use your advice to solve the problem at hand. I would upvote you but apparently, I can't yet show my thanks in that form. I will return someday and give you your reputation bumps.
This can also be done using the operator module.
The argument to operator.itemgetter(1) indicates which index you want to sort on. In this case we specify 1, which is the second item.
import operator
l = [[1,4,3], [2,2,4], [3,1,5]]
print(sorted(l, key=operator.itemgetter(1)))
output:
[[3, 1, 5], [2, 2, 4], [1, 4, 3]]
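itemgetter also accepts several indices, in which case it returns a tuple that works as a multi-level sort key. A sketch with an extra row added to the example so that there is a tie on the second value:

```python
from operator import itemgetter

rows = [[1, 4, 3], [2, 2, 4], [3, 1, 5], [4, 2, 1]]

# Sort by the second value, breaking ties with the third value.
by_second_then_third = sorted(rows, key=itemgetter(1, 2))
print(by_second_then_third)

# Sort by an arbitrary column n (0-based index).
n = 2
by_nth = sorted(rows, key=itemgetter(n))
print(by_nth)
```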
You can try it like this:
>>> l = [[1,4,3], [2,2,4], [3,1,5]]
>>> sorted(l, key=lambda x: x[1])
[[3, 1, 5], [2, 2, 4], [1, 4, 3]]
or:
>>> l.sort(key=lambda x: x[1])
>>> l
[[3, 1, 5], [2, 2, 4], [1, 4, 3]]
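If key=lambda is the confusing part, it may help to see that a lambda is just a function without a name. The sketch below uses an ordinary named function instead; second_value is a made-up name.

```python
def second_value(row):
    """Return the element that the sort should compare."""
    return row[1]


l = [[1, 4, 3], [2, 2, 4], [3, 1, 5]]

# key=second_value behaves the same as key=lambda x: x[1]
result = sorted(l, key=second_value)
print(result)
```

sorted calls the key function once per sublist and orders the sublists by the values it returns.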
Assume we have a list that looks like this:
L = [0, 1, [2, [3, 4], 5], [6, 7]]
What would be the best way to clear out all the elements from each sublist without removing the sublists? For example, in this case the return value would be:
>>> clearsublists(L)
[[[]], []]
EDIT: There are ways of doing this using wonky string methods like converting the list to a string and counting the number of times the symbols '[', '(' and '{' appear, but that would screw up if you have a list of deques for example, as a deque would be displayed as deque([]), making the program think there are actually two subsets when it's really only one.
For arbitrary depth, use recursion:
def clearsublists(L):
    return [clearsublists(l) for l in L if isinstance(l, list)]
def clear(L):
    if not L:
        return []
    else:
        return [clear(i) for i in L if isinstance(i, list)]
Output:
>>> L = [0, 1, [2, [3, 4], 5], [6, 7]]
>>> clear(L)
[[[]], []]
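The question's EDIT mentions containers such as deques. A hedged sketch of how the same recursion might be extended to several container types; the name clear_nested and the containers parameter are assumptions, and every kept container is rebuilt as a plain list, which may or may not be what is wanted.

```python
from collections import deque


def clear_nested(seq, containers=(list, tuple, deque)):
    """Keep only the nested containers, emptied of their non-container items.

    Every kept container is rebuilt as a plain list.
    """
    return [clear_nested(item, containers) for item in seq
            if isinstance(item, containers)]


L = [0, 1, deque([2, [3, 4], 5]), (6, 7)]
print(clear_nested(L))
```

Strings are not listed in containers, so they are dropped like any other non-container element.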