Remove common elements in two lists - Python

I have a problem with a Python function uncommon(l1, l2) that takes two lists sorted in ascending order as arguments and returns the list of all elements that appear in exactly one of the two lists. The returned list should be in ascending order. All such elements should be listed only once, even if they appear multiple times in l1 or l2.
Thus, uncommon([2,2,4], [1,3,3,4,5]) should return [1,2,3,5], while uncommon([1,2,3], [1,1,2,3,3]) should return [].
I have tried:
def uncommon(l1, l2):
    sl1 = set(l1)
    sl2 = set(l2)

Using sets is a good start: sets have built-in operations for exactly this kind of group logic.
If we think in terms of Venn diagrams, the "uncommon" elements are everything in the union of the two sets minus their intersection.
In Python, this is called the symmetric difference, and it is built into sets:
def uncommon(l1, l2):
    set1 = set(l1)
    set2 = set(l2)
    return sorted(set1.symmetric_difference(set2))
print(uncommon([2, 2, 4], [1, 3, 3, 4, 5])) # [1, 2, 3, 5]
print(uncommon([2, 2, 4], [1, 3, 3, 4, 5, 255])) # [1, 2, 3, 5, 255]
print(uncommon([1, 2, 3], [1, 1, 2, 3, 3])) # []

For a solution that avoids the O(n log n) sort, you can merge the already-sorted lists with heapq.merge after removing the duplicates and the common elements:
from heapq import merge

def uncommon(l1, l2):
    d1 = dict.fromkeys(l1)
    d2 = dict.fromkeys(l2)
    drop = set(d1).intersection(d2)
    return list(merge(*([x for x in d if x not in drop]
                        for d in [d1, d2])))
uncommon([2,2,4], [1,3,3,4,5,256])
# [1, 2, 3, 5, 256]
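This works without sorting because dict.fromkeys deduplicates while preserving the original (already ascending) order, so merge only has to interleave two sorted streams in linear time.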

To get the uncommon elements of two lists, remove the common elements (the intersection) from the union of the two lists.
def uncommon(l1, l2):
    l1 = set(l1)
    l2 = set(l2)
    return sorted(l1.union(l2) - l1.intersection(l2))
print(uncommon([2,2,4],[1,3,3,4,5]))
print(uncommon([1,2,3],[1,1,2,3,3]))
OUTPUT
[1, 2, 3, 5]
[]

Thanks for the help! Actually, I found a solution like this. What about this?
def uncommon(l1, l2):
    sl1 = set(l1)
    sl2 = set(l2)
    return list(sl1 ^ sl2)
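One caveat: iterating a set yields its elements in arbitrary order, so to guarantee the ascending order the problem requires, sort the result instead:
    return sorted(sl1 ^ sl2)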

The common elements are those contained in both lists. So first get rid of elements in list 1 that are also in list 2. Then get rid of elements in list 2 that are also in list 1.
Return the remaining elements.
def uncommon(l1, l2):
    s1, s2 = set(l1), set(l2)  # convert to sets first: lists do not support '-'
    d1 = s1 - s2
    d2 = s2 - s1
    return sorted(d1 | d2)  # sorted, to meet the ascending-order requirement
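For example, with the set-based version above:
print(uncommon([2, 2, 4], [1, 3, 3, 4, 5]))  # [1, 2, 3, 5]
print(uncommon([1, 2, 3], [1, 1, 2, 3, 3]))  # []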

Related

Removing duplicates from a list using a list comprehension leads to all Nones

I tried the following code to remove duplicates from a list using a list comprehension:
lists = [1, 2, 2, 3, 4, 3, 5, 6, 1]
unique_lists = []
[unique_lists.append(x) for x in lists if x not in unique_lists]
print(unique_lists)
unique_lists = []
g = [unique_lists.append(x) for x in lists if x not in unique_lists]
print(g)
The printed result are shown as below
[1, 2, 3, 4, 5, 6]
[None, None, None, None, None, None]
I'm trying to understand why the second method (simply assigning the list comprehension to g) returns all Nones. Thanks.
It's because list.append() always returns None. The easy way to do what you want is to convert it to a set.
unique_lists = list(set(lists))
list.append returns None. What you could do is use a set to remember the unique numbers:
lists = [1, 2, 2, 3, 4, 3, 5, 6, 1]
uniques = set()
unique_lists = [x for x in lists if not (x in uniques or uniques.add(x))]
Or, more efficiently, you can exploit the fact that dicts act like sets while preserving insertion order (guaranteed since Python 3.7):
unique_lists = list(dict.fromkeys(lists))
You can also check this question for more answers/explanation on how to remove dups from a list.
Using a set to select the 1st occurrence of each distinct element will be more efficient than a list. You can even embed this control set in the comprehension:
g = [x for s in [set()] for x in lists if x not in s and not s.add(x)]
If you're not concerned with performance, you could select items based on the index of their first occurrence:
g = [x for i,x in enumerate(lists) if i==lists.index(x)]
There is also a trick you can do using a dictionary:
g = list(dict(zip(lists,lists)))
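All three one-liners produce the same result for the example input:
print(g)  # [1, 2, 3, 4, 5, 6]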

Is there an elegant way to find an item in a list of lists with multiple layers in Python

List comprehensions are often cited as an elegant way to iterate through lists of lists, like [item for sublist in original_list for item in sublist].
However, that seems to handle only simple lists of lists with two layers: [[1,2],[3,4]].
Is there a universal, elegant way to find an item in an unknown number of layers of lists of lists, e.g.
[0,[1,[2,[3]]],[4,[5]],6]?
I have thought about flattening all the elements into a single list before checking, but are there any simpler methods?
If you don't want to flatten the list, you can use recursion: iterate over the list and recurse on each nested list inside it.
Code:
def check(lst, x):
    for item in lst:
        if isinstance(item, list):  # there is another list inside it
            if check(item, x):      # the value x exists somewhere in item
                return True
        elif item == x:             # a plain element matched
            return True
    return False                    # x cannot be found in lst
print(check([0, [1, [2, [3]]], [4, [5]], 6], 5)) # Output: True
print(check([0, [1, [2, [3]]], [4, [5]], 6], 9)) # Output: False
If you just want to check whether an element exists anywhere in the nested list, another option is to flatten it first, as below, then search the flattened list:
from pandas.core.common import flatten

lis = [0, [1, [2, [3]]], [4, [5]], 6]
new_lis = list(flatten(lis))
print(new_lis)       # [0, 1, 2, 3, 4, 5, 6]
print(6 in new_lis)  # True
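Note that pandas.core.common is a private pandas module, so the import above may break between pandas versions. A small self-contained generator does the same job; this is a minimal sketch, and flatten_all is just an illustrative name:
def flatten_all(lst):
    # Recursively yield the atoms of an arbitrarily nested list
    for item in lst:
        if isinstance(item, list):
            yield from flatten_all(item)
        else:
            yield item

lis = [0, [1, [2, [3]]], [4, [5]], 6]
print(list(flatten_all(lis)))  # [0, 1, 2, 3, 4, 5, 6]
print(6 in flatten_all(lis))   # True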

Unique list of lists

I have a nested list as an example:
lst_a = [[1,2,3,5], [1,2,3,7], [1,2,3,9], [1,2,6,8]]
I'm trying to check whether the first 3 indices of a nested list element are the same as another's.
I.e., if [1,2,3] exists in other sublists, remove all the other nested list elements that contain it, so that the nested list is unique.
I'm not sure what the most pythonic way of doing this would be.
for i in range(0, len(lst_a)):
    if lst_a[i][:3] == lst_a[i-1][:3]:
        lst_a[i].pop()
Desired output:
lst_a = [[1,2,3,9], [1,2,6,8]]
If, as you said in the comments, sublists that share the same first three elements are always next to each other (but the list is not necessarily sorted), you can use itertools.groupby to group those elements and then take the first element of each group.
>>> from itertools import groupby
>>> lst_a = [[1,2,3,5], [1,2,3,7], [1,2,3,9], [1,2,6,8]]
>>> [next(g) for k, g in groupby(lst_a, key=lambda x: x[:3])]
[[1, 2, 3, 5], [1, 2, 6, 8]]
Or use a list comprehension with enumerate and compare the current element with the last one:
>>> [x for i, x in enumerate(lst_a) if i == 0 or lst_a[i-1][:3] != x[:3]]
[[1, 2, 3, 5], [1, 2, 6, 8]]
This does not require any imports, but IMHO when using groupby it is much clearer what the code is supposed to do. Note, however, that unlike your method, both of those will create a new filtered list, instead of updating/deleting from the original list.
I think you are missing a for loop if you want to check all possibilities. I guess it should look like:
for i in range(0, len(lst_a)):
    for j in range(i + 1, len(lst_a)):  # start at i + 1 to avoid comparing an element with itself
        if lst_a[i][:3] == lst_a[j][:3]:
            lst_a[i].pop()
Deleting while going through the list is maybe not the best idea, though; you should remove unwanted elements at the end instead.
Going with your approach, see the code below:
lst = [lst_a[0]]
for li in lst_a[1:]:
    if li[:3] != lst[0][:3]:
        lst.append(li)
print(lst)
Hope this helps!
You can use a dictionary to filter the list:
dct = {tuple(i[:3]): i for i in lst_a}
# {(1, 2, 3): [1, 2, 3, 9], (1, 2, 6): [1, 2, 6, 8]}
list(dct.values())
# [[1, 2, 3, 9], [1, 2, 6, 8]]
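This keeps the last sublist for each distinct 3-element prefix, because later dict entries overwrite earlier ones with the same key; that is what produces the desired [1, 2, 3, 9] rather than [1, 2, 3, 5].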

Unable to create a duplicate-free list from an existing list using a list comprehension with an if condition

I have a sorted list with duplicate elements like
>>> randList = [1, 2, 2, 3, 4, 4, 5]
>>> randList
[1, 2, 2, 3, 4, 4, 5]
I need to create a list that removes the adjacent duplicate elements. I can do it like:
>>> dupList = []
>>> for num in randList:
...     if num not in dupList:
...         dupList.append(num)
But I want to do it with list comprehension. I tried the following code:
>>> newList = []
>>> newList = [num for num in randList if num not in newList]
But I get a result as if the if condition weren't working.
>>> newList
[1, 2, 2, 3, 4, 4, 5]
Any help would be appreciated.
Thanks!!
Edit 1: The wording of the question does seem to be confusing given the data I have provided. The for loop that I am using will remove all duplicates, but since I am sorting the list beforehand, that shouldn't be a problem when removing adjacent duplicates.
Using itertools.groupby is the simplest approach to remove adjacent (and only adjacent) duplicates, even for unsorted input:
>>> from itertools import groupby
>>> [k for k, _ in groupby(randList)]
[1, 2, 3, 4, 5]
Removing all duplicates while maintaining the order of occurrence can be efficiently achieved with an OrderedDict. This, too, works for both ordered and unordered input:
>>> from collections import OrderedDict
>>> list(OrderedDict.fromkeys(randList))
[1, 2, 3, 4, 5]
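In Python 3.7+, where plain dicts are guaranteed to preserve insertion order, dict.fromkeys works just as well:
>>> list(dict.fromkeys(randList))
[1, 2, 3, 4, 5]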
I need to create a list that removes the adjacent duplicate elements
Note that your for loop based solution will remove ALL duplicates, not only adjacent ones. Test it with this:
rand_list = [1, 2, 2, 3, 4, 4, 2, 5, 1]
according to your spec the result should be:
[1, 2, 3, 4, 2, 5, 1]
but you'll get
[1, 2, 3, 4, 5]
instead.
A working solution to only remove adjacent duplicates is to use a generator:
def dedup_adjacent(seq):
    prev = seq[0]
    yield prev
    for current in seq[1:]:
        if current == prev:
            continue
        yield current
        prev = current
rand_list = [1, 2, 2, 3, 4, 4, 2, 5, 1]
list(dedup_adjacent(rand_list))
=> [1, 2, 3, 4, 2, 5, 1]
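Note that this generator assumes a non-empty sequence: an empty seq would raise an IndexError on seq[0], so guard for that case if it can occur.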
Python first evaluates the list comprehension and then assigns it to newList, so you cannot refer to it during execution of the list comprehension.
You can remove duplicates in two ways:
1. Using a for loop:
rand_list = [1, 2, 2, 3, 3, 4, 5]
new_list = []
for i in rand_list:
    if i not in new_list:
        new_list.append(i)
2. Convert the list to a set, then convert the set back to a list, and finally sort the new list. Since a set stores values in arbitrary order, you need to sort after converting back so that the items come out in ascending order:
rand_list = [1, 2, 2, 3, 3, 4, 5]
sets = set(rand_list)
new_list = list(sets)
new_list.sort()
Update: Comparison of different Approaches
Three ways of achieving the goal of removing adjacent duplicate elements from a sorted list (i.e., removing all duplicates) have been proposed:
using groupby (only adjacent elements; requires the input to be sorted)
using OrderedDict (all duplicates removed)
using sorted(list(set(_))) (all duplicates removed; ordering restored by sorting)
I compared the running times of the different solutions using:
from timeit import timeit
print('groupby:', timeit('from itertools import groupby; l = [x // 5 for x in range(1000)]; [k for k, _ in groupby(l)]'))
print('OrderedDict:', timeit('from collections import OrderedDict; l = [x // 5 for x in range(1000)]; list(OrderedDict.fromkeys(l))'))
print('Set:', timeit('l = [x // 5 for x in range(1000)]; sorted(list(set(l)))'))
> groupby: 78.83623623599942
> OrderedDict: 94.54144410200024
> Set: 65.60372123999969
Note that the set approach is the fastest among all alternatives.
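(timeit runs each statement 1,000,000 times by default, so the numbers above are total seconds across all runs, not per-call times.)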
Old Answer
Python first evaluates the list comprehension and then assigns it to newList, so you cannot refer to it during execution of the list comprehension. To illustrate, consider the following code:
randList = [1, 2, 2, 3, 4, 4, 5]
newList = []
newList = [num for num in randList if print(newList)]
> []
> []
> []
> …
This becomes even more evident if you try:
# Do not initialize newList2
newList2 = [num for num in randList if print(newList2)]
> NameError: name 'newList2' is not defined
You can remove duplicates by turning randList into a set:
sorted(set(randList))
> [1, 2, 3, 4, 5]
Be aware that this does remove all duplicates (not just adjacent ones) and ordering is not preserved. The former also holds true for your proposed solution with the loop.
edit: added sorted() to satisfy the required ordering.
In the line newList = [num for num in randList if num not in newList], the list on the right-hand side is created first and only then assigned to newList. That's why num not in newList returns True every time: newList remains empty until the assignment happens.
You can try this:
randList = [1, 2, 2, 3, 4, 4, 5]
new_list = []
for i in randList:
    if i not in new_list:
        new_list.append(i)
print(new_list)
You cannot access the items in a list comprehension as you go along. The items in a list comprehension are only accessible once the comprehension is completed.
For large lists, checking for membership in a list will be expensive, albeit with minimal memory requirements. Instead, you can add items to a set:
randList = [1, 2, 2, 3, 4, 4, 5]
def gen_values(L):
    seen = set()
    for i in L:
        if i not in seen:
            seen.add(i)
            yield i
print(list(gen_values(randList)))
[1, 2, 3, 4, 5]
This algorithm has been implemented in the 3rd party toolz library. It's also known as the unique_everseen recipe in the itertools docs:
from toolz import unique
res = list(unique(randList))
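For reference, here is a simplified version of that recipe (omitting the optional key argument it normally supports):
from itertools import filterfalse

def unique_everseen(iterable):
    # Yield unique elements in order of first occurrence
    seen = set()
    for element in filterfalse(seen.__contains__, iterable):
        seen.add(element)
        yield element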
Since your list is sorted, using a set will be the fastest way to achieve your goal, as follows:
>>> randList = [1, 2, 2, 3, 4, 4, 5]
>>> randList
[1, 2, 2, 3, 4, 4, 5]
>>> remove_dup_list = list(set(randList))
>>> remove_dup_list
[1, 2, 3, 4, 5]
>>>

Unite lists if at least one value matches in Python

Let's say I have a list of lists, for example:
[[0, 2], [0, 1], [2, 3], [4, 5, 7, 8], [6, 4]]
and if at least one of the values in a list is the same as a value in a different list, I would like to unite the lists, so in the example the final result would be:
[[0, 1, 2, 3], [4, 5, 6, 7, 8]]
I really don't care about the order of the values inside the list [0, 1, 2, 3] or [0, 2, 1, 3].
I tried to do it but it doesn't work. So have you got any ideas? Thanks.
Edit (sorry for not posting the code I tried before):
What I tried to do was the following:
for p in llista:
    for q in p:
        for k in llista:
            if p == k:
                llista.remove(k)
            else:
                for h in k:
                    if p != k:
                        if q == h:
                            k.remove(h)
                            for t in k:
                                if t not in p:
                                    p.append(t)
llista_final = [x for x in llista if x != []]
Where llista is the list of lists.
I have to admit this is a tricky problem. I'm really curious what this problem represents and/or where you found it...
I initially thought this was just a graph connected-components problem, but I wanted to take a shortcut around creating an explicit representation of the graph, running BFS, etc.
The idea of the solution is this: for every sublist, check whether it has some element in common with any other sublist, and if so, replace that other sublist with their union.
Not very pythonic, but here it is:
def merge(l):
    l = list(map(tuple, l))
    for i, h in enumerate(l):
        sh = set(h)
        for j, k in enumerate(l):
            if i == j:
                continue
            sk = set(k)
            if sh & sk:  # h and k have some element in common
                l[j] = tuple(sh | sk)
    return list(map(list, set(l)))
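For larger inputs, the connected-components idea mentioned above can be made explicit with a union-find (disjoint-set) structure. This is a sketch under that interpretation; unite, find, and union are illustrative names:
def unite(lists):
    parent = {}  # maps each value to its current representative

    def find(x):
        # Follow parent links to the root, compressing the path as we go
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb

    for sub in lists:
        for v in sub:
            parent.setdefault(v, v)
        for v in sub[1:]:
            union(sub[0], v)  # all values in one sublist belong together

    # Group the values by their final representative
    groups = {}
    for v in parent:
        groups.setdefault(find(v), []).append(v)
    return list(groups.values())

print(unite([[0, 2], [0, 1], [2, 3], [4, 5, 7, 8], [6, 4]]))
# [[0, 2, 1, 3], [4, 5, 7, 8, 6]] (inner order may vary)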
Here is a function that does what you want. I tried to use self-documenting variable names and comments to help you understand how this code works. As far as I can tell, the code is pythonic. I used sets to speed up and simplify some of the operations. The downside of that is that the items in your input list-of-lists must be hashable, but your example uses integers which works perfectly well.
def cliquesfromlistoflists(inputlistoflists):
    """Given a list of lists, return a new list of lists that unites
    the old lists that have at least one element in common.
    """
    listofdisjointsets = []
    for inputlist in inputlistoflists:
        # Update the list of disjoint sets using the current sublist
        inputset = set(inputlist)
        unionofsetsoverlappinginputset = inputset.copy()
        listofdisjointsetsnotoverlappinginputset = []
        for aset in listofdisjointsets:
            # Unite the sets if they overlap the new input set, else just store them
            if aset.isdisjoint(inputset):
                listofdisjointsetsnotoverlappinginputset.append(aset)
            else:
                unionofsetsoverlappinginputset.update(aset)
        listofdisjointsets = (listofdisjointsetsnotoverlappinginputset
                              + [unionofsetsoverlappinginputset])
    # Return the information in a list-of-lists format
    return [list(aset) for aset in listofdisjointsets]

print(cliquesfromlistoflists([[0, 2], [0, 1], [2, 3], [4, 5, 7, 8], [6, 4]]))
# printout is [[0, 1, 2, 3], [4, 5, 6, 7, 8]]
This solution modifies the generic breadth-first search to gradually drain the initial deque, updating a result list with either a merged group when a match is found or an appended list when no grouping is discovered:
from collections import deque

d = deque([[0, 2], [0, 1], [2, 3], [4, 5, 7, 8], [6, 4]])
result = [d.popleft()]
while d:
    v = d.popleft()
    result = ([list(set(i + v)) if any(c in i for c in v) else i for i in result]
              if any(any(c in i for c in v) for i in result)
              else result + [v])
Output:
[[0, 1, 2, 3], [8, 4, 5, 6, 7]]
