Python: Remove Duplicate Items from Nested list

mylist = [[1,2],[4,5],[3,4],[4,3],[2,1],[1,2]]
I want to remove duplicate items, where a reversed copy of an item also counts as a duplicate. The result should be:
mylist = [[1,2],[4,5],[3,4]]
How do I achieve this in Python?

lst=[[1,2],[4,5],[3,4],[4,3],[2,1],[1,2]]
fset = set(frozenset(x) for x in lst)
lst = [list(x) for x in fset]
This won't preserve order from your original list, nor will it preserve order of your sublists.
>>> lst=[[1,2],[4,5],[3,4],[4,3],[2,1],[1,2]]
>>> fset = set(frozenset(x) for x in lst)
>>> lst = [list(x) for x in fset]
>>> lst
[[1, 2], [3, 4], [4, 5]]

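If the order matters, you can use OrderedDict (here lst is the original list from the question):
>>> from collections import OrderedDict
>>> unq_lst = OrderedDict()
>>> for e in lst:
...     unq_lst.setdefault(frozenset(e), []).append(e)
...
>>> list(map(list, unq_lst.keys()))
[[1, 2], [4, 5], [3, 4]]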

If order is not important:
from typing import Any, List
def rem_dup(l: List[List[Any]]) -> List[List[Any]]:
    # Sort each sublist so that reversed copies map to the same tuple
    tuples = map(lambda t: tuple(sorted(t)), l)
    return [list(t) for t in set(tuples)]
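The sorted-tuple key can also be combined with a "seen" set so that the first occurrence of each sublist is kept in its original position. This is only a sketch: the name dedupe_ignoring_order is made up for illustration, and it assumes the elements of each sublist can be sorted and hashed. Unlike a frozenset key, a sorted tuple does not collapse repeated elements, so [1, 1, 2] and [1, 2] stay distinct.

```python
def dedupe_ignoring_order(items):
    """Keep the first occurrence of each sublist, treating reorderings as duplicates."""
    seen = set()
    result = []
    for item in items:
        key = tuple(sorted(item))  # canonical form: [4, 3] and [3, 4] share a key
        if key not in seen:
            seen.add(key)
            result.append(item)  # keep the sublist exactly as it first appeared
    return result


mylist = [[1, 2], [4, 5], [3, 4], [4, 3], [2, 1], [1, 2]]
print(dedupe_ignoring_order(mylist))  # [[1, 2], [4, 5], [3, 4]]
```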

Related

Segmenting a list of lists in Python

I have a list of lists all of the same length. I would like to segment the first list into contiguous runs of a given value. I would then like to segment the remaining lists to match the segments generated from the first list.
For example:
Given value: 2
Given list of lists: [[0,0,2,2,2,1,1,1,2,3], [1,2,3,4,5,6,7,8,9,10], [1,1,1,1,1,1,1,1,1,1]]
Return: [ [[2,2,2],[2]], [[3,4,5],[9]], [[1,1,1],[1]] ]
The closest I have gotten is to get the indices by:
>>> import itertools
>>> import operator
>>> x = 2
>>> L = [[0,0,2,2,2,1,1,1,2,3],[1,2,3,4,5,6,7,8,9,10],[1,1,1,1,1,1,1,1,1,1]]
>>> I = [[i for i,value in it] for key,it in itertools.groupby(enumerate(L[0]), key=operator.itemgetter(1)) if key == x]
>>> print(I)
[[2, 3, 4], [8]]
This code was modified from another question on this site.
I would like to find the most efficient way possible, since these lists may be very long.
EDIT:
Maybe if I place the lists one on top of each other it might be clearer:
[[0,0,[2,2,2],1,1,1,[2],3], -> [2,2,2],[2]
[1,2,[3,4,5],6,7,8,[9],10],-> [3,4,5],[9]
[1,1,[1,1,1],1,1,1,[1],1]] -> [1,1,1],[1]

You can use groupby to create a list of groups in the form of a tuple of starting index and length of the group, and use this list to extract the values from each sub-list:
from itertools import groupby
from operator import itemgetter
def match(L, x):
    # Each group is (start index, run length) for a run of x in the first list
    groups = [(next(g)[0], sum(1 for _ in g) + 1)
              for k, g in groupby(enumerate(L[0]), key=itemgetter(1)) if k == x]
    return [[lst[i: i + length] for i, length in groups] for lst in L]
so that:
match([[0,0,2,2,2,1,1,1,2,3], [1,2,3,4,5,6,7,8,9,10], [1,1,1,1,1,1,1,1,1,1]], 2)
returns:
[[[2, 2, 2], [2]], [[3, 4, 5], [9]], [[1, 1, 1], [1]]]
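The same idea can be sketched with slice objects, which are computed once from the first list and then reused on every other list. The names run_slices and segment are hypothetical, and the sketch assumes all lists have the same length, as stated in the question.

```python
from itertools import groupby


def run_slices(row, value):
    """Return one slice per contiguous run of `value` in `row`."""
    slices = []
    start = 0
    for key, group in groupby(row):
        length = sum(1 for _ in group)  # length of the current run
        if key == value:
            slices.append(slice(start, start + length))
        start += length
    return slices


def segment(rows, value):
    """Cut every row at the positions where rows[0] has runs of `value`."""
    slices = run_slices(rows[0], value)
    return [[row[s] for s in slices] for row in rows]


L = [[0, 0, 2, 2, 2, 1, 1, 1, 2, 3],
     [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
     [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]
print(segment(L, 2))  # [[[2, 2, 2], [2]], [[3, 4, 5], [9]], [[1, 1, 1], [1]]]
```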

l=[[0,0,2,2,2,1,1,1,2,3], [1,2,3,4,5,6,7,8,9,10], [1,1,1,1,1,1,1,1,1,1]]
temp=l[0]
value=2
dict={}
k=-1
prev=-999
for i in range(0,len(temp)):
    if(temp[i]==value):
        if(prev!=-999 and prev==i-1):
            if(k in dict):
                dict[k].append(i)
            else:
                dict[k]=[i]
        else:
            k+=1
            if(k in dict):
                dict[k].append(i)
            else:
                dict[k]=[i]
        prev=i
output=[]
for i in range(0,len(l)):
    single=l[i]
    final=[]
    for keys in dict: #{0: [2, 3, 4], 1: [8]}
        ans=[]
        desired_indices=dict[keys]
        for j in range(0,len(desired_indices)):
            ans.append(single[desired_indices[j]])
        final.append(ans)
    output.append(final)
print(output) #[[[2, 2, 2], [2]], [[3, 4, 5], [9]], [[1, 1, 1], [1]]]
This is one possible approach: it first builds a dictionary mapping each run number to the indices of its contiguous elements, then looks up those indices in every list and stores the results in output.

Split lists and tuples in Python

I have a simple question.
I have list, or a tuple, and I want to split it into many lists (or tuples) that contain the same elements.
I'll try to be more clear using an example:
(1,1,2,2,3,3,4) --> (1,1),(2,2),(3,3),(4,)
(1,2,3,3,3,3) --> (1,),(2,),(3,3,3,3)
[2,2,3,3,2,3] --> [2,2],[3,3],[2],[3]
How can I do this? I know that tuples and lists do not have a "split" attribute, so I thought I could turn them into strings first. This is what I tried:
def splitt(l):
    x=str(l)
    for i in range (len(x)-1):
        if x[i]!=x[i+1]:
            x.split()
    return x

You can use groupby.
import itertools as it
Examples:
>>> t = (1,2,3,3,3,3)
>>> [list(grp) if isinstance(t, list) else tuple(grp) for k, grp in it.groupby(t)]
[(1,), (2,), (3, 3, 3, 3)]
>>> t = [2,2,3,3,2,3]
>>> [list(grp) if isinstance(t, list) else tuple(grp) for k, grp in it.groupby(t)]
[[2, 2], [3, 3], [2], [3]]
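The groupby approach can be wrapped in a small helper that returns groups of the same type as its input. This is a sketch only: split_runs is a made-up name, and it assumes the input is a list or a tuple (or another type whose constructor accepts an iterable).

```python
from itertools import groupby


def split_runs(seq):
    """Split seq into runs of equal adjacent elements, keeping seq's own type."""
    make = type(seq)  # list for lists, tuple for tuples
    return [make(group) for _, group in groupby(seq)]


print(split_runs((1, 1, 2, 2, 3, 3, 4)))  # [(1, 1), (2, 2), (3, 3), (4,)]
print(split_runs([2, 2, 3, 3, 2, 3]))     # [[2, 2], [3, 3], [2], [3]]
```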

You also may try with for-loop:
def group_lt(list_or_tuple):
    result = []
    for x in list_or_tuple:
        if not result or result[-1][0] != x:
            result.append(type(list_or_tuple)([x]))
        else:
            result[-1] += type(list_or_tuple)([x])
    return result
t = (1,1,2,2,3,3,4)
print(group_lt(t)) # [(1,1),(2,2),(3,3),(4,)]
l = [2,2,3,3,2,3]
print(group_lt(l)) # [[2,2],[3,3],[2],[3]]

Try this
from itertools import groupby
input_list = [1, 1, 2, 4, 6, 6, 7]
output = [list(g) for k, g in groupby(input_list)]

Python list comprehension: is there a way to do [func(x) for x in list1 or list2]

Or [func(x) for x in list1 and list2] (for some function func), without having to create a new list that happens to be the union or intersection of the two lists.

You can use itertools.chain to join the two lists without creating a new one:
from itertools import chain
lst = [x for x in chain(list1, list2)]
Below is a demonstration:
>>> from itertools import chain
>>> list1 = [1, 2, 3]
>>> list2 = [4, 5, 6]
>>> [x for x in chain(list1, list2)]
[1, 2, 3, 4, 5, 6]
>>> list(chain(list1, list2)) # Equivalent
[1, 2, 3, 4, 5, 6]
>>>

import itertools
[x for x in itertools.chain(list1,list2)]
Note that this will add duplicates, so it's neither a union nor an intersection. If you want a true union/intersection:
set.union(*map(set, [list1, list2])) # cast to list if you need
# union
set.intersection(*map(set, [list1, list2])) # cast to list if you need
# intersection
From your edit:
def func(x):
    pass
    # do something useful
for element in list1:
    if element in list2:
        func(element)
# Or, but less readably imo
# for element in filter(lambda x: x in list2, list1):
#     func(element)
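If a duplicate-free union or intersection is wanted without materializing a combined list, one option is a pair of generators that feed the comprehension directly. This is a rough sketch: iter_union, iter_intersection and the body of func are hypothetical, the items are assumed to be hashable, and the helpers still allocate sets internally for membership checks.

```python
from itertools import chain


def iter_union(first, second):
    """Yield each distinct item of first, then of second, in order of appearance."""
    seen = set()
    for item in chain(first, second):
        if item not in seen:
            seen.add(item)
            yield item


def iter_intersection(first, second):
    """Yield each distinct item of first that also occurs in second."""
    members = set(second)  # O(1) membership tests instead of scanning second
    seen = set()
    for item in first:
        if item in members and item not in seen:
            seen.add(item)
            yield item


def func(x):
    # Hypothetical stand-in for the question's func
    return x * 10


list1 = [1, 2, 3, 3]
list2 = [3, 4]
print([func(x) for x in iter_union(list1, list2)])         # [10, 20, 30, 40]
print([func(x) for x in iter_intersection(list1, list2)])  # [30]
```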

You want itertools.chain().
[... in itertools.chain(list1, list2)]

How do I remove duplicate arrays in a list in Python

I have a list in Python filled with arrays.
([4,1,2],[1,2,3],[4,1,2])
How do I remove the duplicate array?

A very simple way to remove duplicates (if you're okay with converting to tuples or another hashable type) is to use a set as an intermediate step. Note that the set does not preserve the original order.
lst = ([4,1,2],[1,2,3],[4,1,2])
# convert to tuples
tupled_lst = set(map(tuple, lst))
lst = list(map(list, tupled_lst))
If you have to preserve order or don't want the results converted to tuples, you can use a set to check whether you've seen an item before while iterating, i.e.,
def unique_generator(lst):
    seen = set()
    for item in lst:
        tupled = tuple(item)
        if tupled not in seen:
            seen.add(tupled)
            yield item
lst = list(unique_generator(lst))
This isn't great python, but you can write this as a crazy list comprehension too :)
seen = set()
lst = [item for item in lst if not(tuple(item) in seen or seen.add(tuple(item)))]

If order matters:
>>> from collections import OrderedDict
>>> items = ([4,1,2],[1,2,3],[4,1,2])
>>> list(OrderedDict((tuple(x), x) for x in items).values())
[[4, 1, 2], [1, 2, 3]]
Else it is much simpler:
>>> set(map(tuple, items))
set([(4, 1, 2), (1, 2, 3)])
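On Python 3.7 and later, plain dicts also preserve insertion order, so the order-preserving variant can be sketched without OrderedDict. The helper name unique_lists is made up for illustration, and the sketch assumes the inner items are hashable once converted to tuples.

```python
def unique_lists(items):
    """Drop repeated sublists, keeping the first occurrence of each in order."""
    # dict.fromkeys keeps only the first key for each distinct tuple
    unique_tuples = dict.fromkeys(tuple(item) for item in items)
    return [list(t) for t in unique_tuples]


items = ([4, 1, 2], [1, 2, 3], [4, 1, 2])
print(unique_lists(items))  # [[4, 1, 2], [1, 2, 3]]
```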

l = ([4,1,2],[1,2,3],[4,1,2])
uniq = []
for i in l:
    if i not in uniq:
        uniq.append(i)
print('l=%s' % str(l))
print('uniq=%s' % str(uniq))
which produces:
l=([4, 1, 2], [1, 2, 3], [4, 1, 2])
uniq=[[4, 1, 2], [1, 2, 3]]
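The membership test on a list compares by equality and scans the list each time, so this approach is quadratic in the worst case, but it also works when the items cannot be made hashable with tuple(). A small sketch of that trade-off, using a hypothetical helper name and made-up nested data:

```python
def unique_by_equality(items):
    """Order-preserving de-duplication that only needs ==, not hashing."""
    unique = []
    for item in items:
        if item not in unique:  # linear scan of the items kept so far
            unique.append(item)
    return unique


# Each item contains lists itself, so tuple(item) would still be unhashable.
nested = ([[1], [2]], [[3]], [[1], [2]])
print(unique_by_equality(nested))  # [[[1], [2]], [[3]]]

try:
    hash(tuple(nested[0]))
except TypeError:
    print("tuple(item) is not hashable when item contains lists")
```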

Use a set to keep track of seen items. Because sets can only contain hashable items, you may have to convert each item to a hashable value first (a tuple in this case).
Sets provide O(1) average-case lookup, so the overall complexity is O(N).
This generator function will preserve the order:
def solve(lis):
    seen = set()
    for x in lis:
        if tuple(x) not in seen:
            yield x
            seen.add(tuple(x))
>>> tuple( solve(([4,1,2],[1,2,3],[4,1,2])) )
([4, 1, 2], [1, 2, 3])
If the order doesn't matter then you can simply use set() here:
>>> lis = ([4,1,2],[1,2,3],[4,1,2]) # this contains mutable/unhashable items
>>> set( tuple(x) for x in lis) # apply tuple() to each item, to make them hashable
set([(4, 1, 2), (1, 2, 3)]) # sets don't preserve order
>>> lis = [1, 2, 2, 4, 1] #list with immutable/hashable items
>>> set(lis)
set([1, 2, 4])

Return a range of elements of each list inside a list of lists

From a list mylist = [[1, 2, 3], [4, 5, 6], [7, 8, 9]], how can I get a new list of lists composed of the first two elements of each "inside" list, i.e. newlist = [[1, 2], [4, 5], [7, 8]]? Is there a one-liner that can do this efficiently (for large lists of lists)?

The easiest way is probably to use a list comprehension:
newlist = [sublist[:2] for sublist in mylist]

Quick answer:
first_two = [sublist[:2] for sublist in mylist]
If having a list of tuples is ok, then a faster answer (about 2x by my measurements, taken on Python 2, where map returns a list):
import operator
map(operator.itemgetter(0, 1), mylist)
Measurements:
import timeit
t = timeit.Timer("[i[:2] for i in ll]", "ll = [[i, i + 1, i + 2] for i in xrange(1000)]")
t.timeit(10000)
# 2.2732808589935303
t2 = timeit.Timer("map(operator.itemgetter(0, 1), ll)", "import operator; ll = [[i, i + 1, i + 2] for i in xrange(1000)]")
t2.timeit(10000)
# 1.3041009902954102
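On Python 3, map returns a lazy iterator, so the itemgetter variant needs an explicit list() before it produces (or can be fairly timed against) a full result. A minimal sketch of both forms follows; no timing claims are made here, and the variable names are illustrative. One behavioral difference to keep in mind: slicing tolerates sublists shorter than two elements, while itemgetter(0, 1) raises IndexError on them.

```python
from operator import itemgetter

mylist = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Slicing keeps each result as a list
by_slice = [sublist[:2] for sublist in mylist]

# itemgetter returns tuples; list() forces the lazy map on Python 3
by_itemgetter = list(map(itemgetter(0, 1), mylist))

print(by_slice)       # [[1, 2], [4, 5], [7, 8]]
print(by_itemgetter)  # [(1, 2), (4, 5), (7, 8)]
```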

Use list comprehension.
newlist = [x[:2] for x in mylist]
