I am trying to compare 2 different lists and find the differences between them. Say for example I have list 1 which consists of cat,dog,whale,hamster and list 2 which consists of dog,whale,hamster. How would I compare these two and then assign a variable to the difference which in this case is cat. Order does not matter however if there is more than one difference each of these differences should be assigned to an individual variable.
In my actual code im comparing html which consists of thousands of lines so I would prefer something as fast as possible but any is appreciated :)
str1 = 'cat,dog,whale,hamster'
str2 = 'dog,whale,hamster'
Change strings into python sets:
set1 = set(str1.split(','))
set2 = set(str2.split(','))
Get the difference:
result = set1 - set2
Which prints:
{'cat'}
You can convert it to a list or a string:
result_as_list = list(result)
result_as_string = ','.join(result)
If your lists can contain duplicates or if you need to know the elements that are only in one of the two lists, you can use Counter (from the collections module):
list1 = ['cat','dog','whale','hamster','dog']
list2 = ['dog','whale','hamster','cow','horse']
from collections import Counter
c1,c2 = Counter(list1),Counter(list2)
differences = [*((c1-c2)+(c2-c1)).elements()]
print(differences) # ['cat', 'dog', 'cow', 'horse']
This is how you are gonna do it. The function defined here will print the difference between the two lists
def Diff(list1, list2):
li_dif = [i for i in list1 + list2 if i not in list1 or i not in list2]
return li_dif
# Driver Code
list1 = ['cat','dog','whale','hamster']
list2 = ['dog','whale','hamster']
diff = Diff(list1, list2)
print(diff)
output:
['cat']
here cat is generated by the variable diff
Now if there is more than one difference, as follows:
def Diff(list1, list2):
li_dif = [i for i in list1 + list2 if i not in list1 or i not in list2]
return li_dif
# Driver Code
list1 = ['cat','dog','whale','hamster','ostrich','yak','sheep','lion','tiger']
list2 = ['dog','whale','hamster']
diff = Diff(list1, list2)
print(diff)
the output will be:
['cat','ostrich','yak','sheep','lion','tiger']
Your question is that if there is more than one difference, each of these differences should be assigned to an individual variable.
for that, we will treat the printed item as a list, let's name it list3
diff==list3
here, list3=['cat','ostrich','yak','sheep','lion','tiger']
Here, is only 6 list items, we can assign a variable to each of them as follows:
v1=list3[0]
v2=list3[1]
v3=list3[2]
v4=list3[3]
v5=list3[4]
v6=list3[5]
Related
list1 = ['/mnt/1m/a_pre.geojson','/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
I have multiple lists and I want to find all the elements of list1 which do not have entry in list2 with a filtering condition.
The condition is it should match 'm' like 1m,2m.. and name of geojson file excluding 'pre or post' substring.
For in e.g. list1 '/mnt/1m/a_pre.geojson' is processed but '/mnt/2m/b_pre.geojson' is not so the output should have a list ['/mnt/2m/b_pre.geojson']
I am using 2 for loops and then splitting the string which I am sure is not the only one and there might be easier way to do this.
for i in list1:
for j in list2:
pre_tile = i.split("/")[-1].split('_pre', 1)[0]
post_tile = j.split("/")[-1].split('_post', 1)[0]
if pre_tile == post_tile:
...
I believe you have similar first part of the file paths. If so, you can try this:
list1 = ['/mnt/1m/a_pre.geojson','/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
res = [x for x in list1 if x[:7] not in [y[:7] for y in list2]]
res:
['/mnt/2m/b_pre.geojson']
If I understand you correctly, using a regular expression to do this kind of string manipulation can be fast and easy.
Additionally, to do multiple member-tests in list2, it's more efficient to convert the list to a set.
import re
list1 = ['/mnt/1m/a_pre.geojson', '/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
pattern = re.compile(r'(.*?/[0-9]m/.*?)_pre.geojson')
set2 = set(list2)
result = [
m.string
for m in map(pattern.fullmatch, list1)
if m and f"{m[1]}_post.geojson" not in set2
]
print(result)
I have two lists
list1 = ['01:15', 'abc', '01:15', 'def', '01:45', 'ghi' ]
list2 = ['01:15', 'abc', '01:15', 'uvz', '01:45', 'ghi' ]
and when I loop through the list
list_difference = []
for item in list1:
if item not in list2:
list_difference.append(item)
and I managed to get the difference, but I need time as well
because it is a separate item and 'uvz' does not mean to me anything in the list with a few thousand entries.
I tried to convert it to the dictionary, but it overwrites with the last key:value {'01:15' : 'def'}.
Convert the two lists to sets of tuples, then use the set difference operator.
set1 = set((list1[i], list1[i+1]) for i in range(0, len(list1), 2))
set2 = set((list2[i], list2[i+1]) for i in range(0, len(list2), 2))
list_difference = list(set1 - set2)
reformat your data, then do whatever you have done before
list1=list(zip(list1[::2],list1[1::2]))
list2=list(zip(list2[::2],list2[1::2]))
I have the following 2 lists, and I want to obtain the elements of list2 that are not in list1:
list1 = ["0100","0300","0500"]
list2 = ["0100","0200","0300","0400","0500"]
My output should be:
list3 = ["0200","0400"]
I was checking for a way to subtract one from the other, but so far I can't be able to get the list 3 as I want
list3 = [x for x in list2 if x not in list1]
Or, if you don't care about order, you can convert the lists to sets:
set(list2) - set(list1)
Then, you can also convert this back to a list:
list3 = list(set(list2) - set(list1))
could this solution work for you?
list3 = []
for i in range(len(list2)):
if list2[i] not in list1:
list3.append(list2[i])
list1 = ["0100","0300","0500"]
list2 = ["0100","0200","0300","0400","0500"]
list3 = list(filter(lambda e: e not in list1,list2))
print(list3)
I believe this has been answered here:
Python find elements in one list that are not in the other
import numpy as np
list1 = ["0100","0300","0500"]
list2 = ["0100","0200","0300","0400","0500"]
list3 = np.setdiff1d(list2,list1)
print(list3)
set functions will help you to solve your problem in few lines of code...
set1=set(["0100","0300","0500"])
set2=set(["0100","0200","0300","0400","0500"])
set3=set2-set1
print(list(set3))
set gives you faster implementation in Python than the Lists...............
I have list of s3 objects like this:
list1 = ['uid=123/2020/06/01/625e2ghvh.parquet','uid=876/2020/04/01/hgdshct7.parquet','uid=0987/2019/03/01/323dc.parquet']
list2 = ['123','876']
result_list = ['uid=0987/2019/03/01/323dc.parquet']
With out using any loop is there any efficient way to achieve this considering large no of elements in list1?
You could build a set from list2 for a faster lookup and use a list comprehension to check for membership using the substring of interest:
list1 = ['uid=123/2020/06/01/625e2ghvh.parquet','uid=876/2020/04/01/hgdshct7.parquet',
'uid=0987/2019/03/01/323dc.parquet']
list2 = ['123','876']
set2 = set(list2)
[i for i in list1 if i.lstrip('uid=').split('/',1)[0] not in set2]
# ['uid=0987/2019/03/01/323dc.parquet']
The substring is obtained through:
s = 'uid=123/2020/06/01/625e2ghvh.parquet'
s.lstrip('uid=').split('/',1)[0]
# '123'
This does the job. For different patterns though, or to also cover slight variations, you could go for a regex. For this example you'd need something like:
import re
[i for i in list1 if re.search(r'^uid=(\d+).*?', i).group(1) not in set2]
# ['uid=0987/2019/03/01/323dc.parquet']
This is one way to do it without loops
def filter_function(item):
uid = int(item[4:].split('/')[0])
if uid not in list2:
return True
return False
list1 = ['uid=123/2020/06/01/625e2ghvh.parquet','uid=876/2020/04/01/hgdshct7.parquet','uid=0987/2019/03/01/323dc.parquet']
list2 = [123, 876]
result_list = list(filter(filter_function, list1))
How about this one:
_list2 = [f'uid={number}' for number in list2]
result = [item for item in list1 if not any([item.startswith(i) for i in _list2])] # ['uid=0987/2019/03/01/323dc.parquet']
I am trying to combine (2) 2D lists based on a common value in both lists.
The values in the list are unique so there is nothing to take in to account for a list entry having any of the same values.
The example is:
list1 = [['hdisk37', '00f7e0b88577106a']]
list2 = [['1', '00f7e0b8cee02cd6'], ['2', '00f7e0b88577106a']]
With the desired result of:
list3 = [['hdisk37', '00f7e0b88577106a','2']]
The common value is at list1[0][1] and list2[1][1].
The pythonic way to get the needed result using set objects:
list1 = [['hdisk37', '00f7e0b88577106a']]
list2 = [['1', '00f7e0b8cee02cd6'], ['2', '00f7e0b88577106a']]
set1 = set(list1[0])
list3 = [list(set1 | s) for s in map(set, list2) if set1 & s]
print(list3)
The output:
[['00f7e0b88577106a', '2', 'hdisk37']]
set1 & s is intersection of two sets(returns a new set with elements common to the first set and all others)
set1 | s is union of a specified sets
Try this:
result = []
for inner_list1 in list1:
for inner_list2 in list2:
set1 = set(inner_list1)
set2 = set(inner_list1)
if set1.intersection(set2):
result.append(list(set1.union(set2)))
For each inner list in both lists, check if the intersection between them is not empty. In case it isn't, they are both merged and added to the final result.
This method returns all the possible "second value" matches as a dict, from the second value to the resulting list. It also takes an arbitrary number of these lists of lists (not just two).
import collections
a = [['hdisk37', '00f7e0b88577106a']]
b = [['1', '00f7e0b8cee02cd6'], ['2', '00f7e0b88577106a']]
def combine(*lols): # list of lists
ret = collections.defaultdict(set)
for lol in lols:
for l in lol:
ret[l[1]].add(l[1])
ret[l[1]].add(l[0])
return {k:list(v) for k,v in ret.items()}
print combine(a,b)
Output:
$ python test.py
{'00f7e0b8cee02cd6': ['00f7e0b8cee02cd6', '1'], '00f7e0b88577106a': ['hdisk37', '2', '00f7e0b88577106a']}
To get your exact output requested, you'd do:
combine(list1, list2).get('00f7e0b88577106a')
If you wanna try something different you could do a
merger = lambda x,y : set(x)|set(y) if set(x)&set(y) else x
results = []
for item in list1:
result = reduce(merger,[item]+list2)
if isinstance(result,set):
results.append(result)
print results