Comparing lists. Which elements are NOT in a list? - python

I have the following 2 lists, and I want to obtain the elements of list2 that are not in list1:
list1 = ["0100","0300","0500"]
list2 = ["0100","0200","0300","0400","0500"]
My output should be:
list3 = ["0200","0400"]
I was looking for a way to subtract one from the other, but so far I haven't been able to get list3 as I want.

list3 = [x for x in list2 if x not in list1]
Or, if you don't care about order, you can convert the lists to sets:
set(list2) - set(list1)
Then, you can also convert this back to a list:
list3 = list(set(list2) - set(list1))
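If order matters and list1 is large, the two ideas above can be combined: keep the list comprehension but test membership against a set. This is only a minimal sketch, assuming the elements are hashable; the name `seen` is illustrative and not from the original answer.

```python
list1 = ["0100", "0300", "0500"]
list2 = ["0100", "0200", "0300", "0400", "0500"]

# Build the set once so each membership test is O(1) on average
seen = set(list1)

# Iterating over list2 keeps its original order (and any duplicates)
list3 = [x for x in list2 if x not in seen]
print(list3)
```

Unlike `set(list2) - set(list1)`, this keeps repeated elements of list2 and does not reorder them.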

Could this solution work for you?
list3 = []
for i in range(len(list2)):
    if list2[i] not in list1:
        list3.append(list2[i])

list1 = ["0100","0300","0500"]
list2 = ["0100","0200","0300","0400","0500"]
list3 = list(filter(lambda e: e not in list1,list2))
print(list3)

I believe this has been answered here:
Python find elements in one list that are not in the other
import numpy as np
list1 = ["0100","0300","0500"]
list2 = ["0100","0200","0300","0400","0500"]
list3 = np.setdiff1d(list2,list1)
print(list3)

Set operations will help you solve your problem in a few lines of code:
set1=set(["0100","0300","0500"])
set2=set(["0100","0200","0300","0400","0500"])
set3=set2-set1
print(list(set3))
Membership tests and differences on sets are generally faster in Python than the equivalent operations on lists, although the result loses the original order.

Related

Manipulate a list based on another list

I have a list of English words (list2), and I want to remove from it the words that contain any of the letters in list1.
For this example:
list1 = ['A','B']
list2 = ['AARON', 'ABAFT', 'ABASE', 'ABASK', 'ABAVE', 'ABBAS', 'ABBIE', 'ABDAL', 'ABEAM', 'ABELE', 'ABIDE', 'ABIES', 'ABKAR', 'ABLOW', 'ABNER', 'ABODE', 'ABOHM']
I want to write a loop to remove all of these elements, since each contains either A or B.
The result should be an empty list.
list1 = ['A','B']
list2 = ['AARON', 'ABAFT', 'ABASE', 'ABASK', 'ABAVE', 'ABBAS', 'ABBIE', 'ABDAL', 'ABEAM', 'ABELE', 'ABIDE', 'ABIES', 'ABKAR', 'ABLOW', 'ABNER', 'ABODE', 'ABOHM']
for x in list1:
    for y in list2:
        if x in y:
            list2.remove(y)
print(list2)
I was expecting an empty list but the result was:
['ABASK', 'ABDAL', 'ABIES', 'ABODE']
As commented by tripleee, changing a list while iterating over it can cause trouble. I'd suggest using a list comprehension and set to check for intersecting characters:
# Or list1 = {'A','B'}
list1 = set(list1)
# returns empty list
[w for w in list2 if not list1.intersection(w)]
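To see why the original loop left some words behind, here is a small sketch of the skipping behaviour, using a shorter hypothetical list. The function names are illustrative only. When an item is removed, the items after it shift left, so the loop's internal index jumps over the next one.

```python
def remove_in_place(words, letter):
    """Buggy pattern: removes from the list that is being iterated."""
    words = list(words)  # work on a copy so the caller's list is untouched
    for w in words:
        if letter in w:
            words.remove(w)  # shifts later items left; the next one is skipped
    return words


def remove_with_new_list(words, letter):
    """Safer pattern: build a new list instead of mutating during iteration."""
    return [w for w in words if letter not in w]


sample = ['AB', 'AC', 'AD', 'XY']
print(remove_in_place(sample, 'A'))       # 'AC' survives because it was skipped
print(remove_with_new_list(sample, 'A'))
```

The second function is the same idea as the list comprehension above: it never mutates the list it is reading from.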
You can also use a regex match if the entries in list1 are simple (no regex metacharacters). An example approach:
import re
matcher = re.compile("|".join(list1))
list2 = [s for s in list2 if not matcher.search(s)]
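If list1 might contain characters that are special in regular expressions, one option is to escape each entry with re.escape before joining. A minimal sketch with hypothetical inputs (`chars`, `words` and `kept` are illustrative names, not from the question):

```python
import re

# Hypothetical inputs: '.' is a regex metacharacter that would match any character
chars = ['.', 'B']
words = ['AARON', 'A.C', 'BOB']

# re.escape makes each entry match literally
matcher = re.compile("|".join(re.escape(c) for c in chars))
kept = [s for s in words if not matcher.search(s)]
print(kept)
```

Without the escaping, the unescaped '.' would match every non-empty word and the result would be empty.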
Using a composition of built-in functions (see the docs).
filter returns an iterator, so it should be converted to a list.
list1 = ['A','B']
list2 = ['ARON', 'ABAFT', 'ABASE', 'ABASK', 'ABAVE', 'ABBAS', 'ABBIE', 'ABDAL', 'ABEAM', 'ABELE', 'ABIDE', 'ABIES', 'ABKAR', 'ABLOW', 'ABNER', 'ABODE', 'ABOHM']
a = filter(lambda s: not any(map(s.__contains__, list1)), list2)
print(list(a))

Python - filter list from another other list with condition

list1 = ['/mnt/1m/a_pre.geojson','/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
I have multiple lists, and I want to find all the elements of list1 that do not have a matching entry in list2 under a filtering condition.
The condition is that the 'm' directory (1m, 2m, ...) must match, along with the name of the geojson file excluding the '_pre' or '_post' substring.
For example, in list1 '/mnt/1m/a_pre.geojson' has been processed but '/mnt/2m/b_pre.geojson' has not, so the output should be the list ['/mnt/2m/b_pre.geojson'].
I am using two for loops and then splitting the strings, but I am sure this is not the only way and there might be an easier one.
for i in list1:
    for j in list2:
        pre_tile = i.split("/")[-1].split('_pre', 1)[0]
        post_tile = j.split("/")[-1].split('_post', 1)[0]
        if pre_tile == post_tile:
            ...
I believe the first part of your file paths follows a similar pattern. If so, you can try this:
list1 = ['/mnt/1m/a_pre.geojson','/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
res = [x for x in list1 if x[:7] not in [y[:7] for y in list2]]
res:
['/mnt/2m/b_pre.geojson']
If I understand you correctly, using a regular expression to do this kind of string manipulation can be fast and easy.
Additionally, to do multiple member-tests in list2, it's more efficient to convert the list to a set.
import re
list1 = ['/mnt/1m/a_pre.geojson', '/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
pattern = re.compile(r'(.*?/[0-9]m/.*?)_pre\.geojson')
set2 = set(list2)
result = [
    m.string
    for m in map(pattern.fullmatch, list1)
    if m and f"{m[1]}_post.geojson" not in set2
]
print(result)
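The matching condition from the question can also be expressed as a comparison key, without a regex. This is only a sketch under the assumption that every file sits directly inside its resolution directory and that its name ends in '_pre' or '_post' before the extension; `tile_key` and `processed` are hypothetical names.

```python
import posixpath


def tile_key(path):
    """Return (resolution directory, tile name) for a path such as '/mnt/1m/a_pre.geojson'."""
    resolution = posixpath.basename(posixpath.dirname(path))  # e.g. '1m'
    stem = posixpath.basename(path).rsplit('.', 1)[0]          # e.g. 'a_pre'
    tile = stem.rsplit('_', 1)[0]                              # e.g. 'a'
    return resolution, tile


list1 = ['/mnt/1m/a_pre.geojson', '/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']

# Keys of everything that already has a post file
processed = {tile_key(p) for p in list2}
result = [p for p in list1 if tile_key(p) not in processed]
print(result)
```

Comparing keys in a set replaces the nested loops with a single pass over each list.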

remove element from python list based on match from another list

I have a list of S3 objects like this:
list1 = ['uid=123/2020/06/01/625e2ghvh.parquet','uid=876/2020/04/01/hgdshct7.parquet','uid=0987/2019/03/01/323dc.parquet']
list2 = ['123','876']
result_list = ['uid=0987/2019/03/01/323dc.parquet']
Is there an efficient way to achieve this without using an explicit loop, considering the large number of elements in list1?
You could build a set from list2 for a faster lookup and use a list comprehension to check for membership using the substring of interest:
list1 = ['uid=123/2020/06/01/625e2ghvh.parquet','uid=876/2020/04/01/hgdshct7.parquet',
         'uid=0987/2019/03/01/323dc.parquet']
list2 = ['123','876']
set2 = set(list2)
[i for i in list1 if i.lstrip('uid=').split('/',1)[0] not in set2]
# ['uid=0987/2019/03/01/323dc.parquet']
The substring is obtained through:
s = 'uid=123/2020/06/01/625e2ghvh.parquet'
s.lstrip('uid=').split('/',1)[0]
# '123'
This does the job. For different patterns though, or to also cover slight variations, you could go for a regex. For this example you'd need something like:
import re
[i for i in list1 if re.search(r'^uid=(\d+).*?', i).group(1) not in set2]
# ['uid=0987/2019/03/01/323dc.parquet']
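One caveat when extracting a prefix: str.lstrip takes a set of characters to strip, not a literal prefix. That is harmless for purely numeric uids, but it would eat leading letters such as 'd' or 'i'. The sketch below removes the literal prefix instead; `extract_uid` is a hypothetical helper and the non-numeric key is made up for illustration.

```python
def extract_uid(key, prefix='uid='):
    """Return the uid between the literal prefix and the first '/'."""
    head = key.split('/', 1)[0]
    if head.startswith(prefix):
        head = head[len(prefix):]  # drop the literal prefix only
    return head


list1 = ['uid=123/2020/06/01/625e2ghvh.parquet',
         'uid=876/2020/04/01/hgdshct7.parquet',
         'uid=0987/2019/03/01/323dc.parquet']
set2 = {'123', '876'}

result = [k for k in list1 if extract_uid(k) not in set2]
print(result)

# Character-set stripping goes too far on a non-numeric uid (hypothetical key):
print('uid=d1/x.parquet'.lstrip('uid='))  # the leading 'd' of the uid is stripped as well
print(extract_uid('uid=d1/x.parquet'))
```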
This is one way to do it without an explicit loop:
def filter_function(item):
    # list2 holds integers here (see below), so the uid is converted with int()
    uid = int(item[4:].split('/')[0])
    if uid not in list2:
        return True
    return False
list1 = ['uid=123/2020/06/01/625e2ghvh.parquet','uid=876/2020/04/01/hgdshct7.parquet','uid=0987/2019/03/01/323dc.parquet']
list2 = [123, 876]
result_list = list(filter(filter_function, list1))
How about this one:
_list2 = [f'uid={number}/' for number in list2]  # trailing '/' so 'uid=123' does not also match 'uid=1234/...'
result = [item for item in list1 if not any(item.startswith(i) for i in _list2)] # ['uid=0987/2019/03/01/323dc.parquet']

Python - Concatenate an item from a list with an item from another list

I need to concatenate each item from one list with each item from another list. In my case the items are strings (paths, more exactly). After the concatenation I want to obtain a list of all the possible combinations.
Example:
list1 = ['Library/FolderA/', 'Library/FolderB/', 'Library/FolderC/']
list2 = ['FileA', 'FileB']
I want to obtain a list like this:
[
'Library/FolderA/FileA',
'Library/FolderA/FileB',
'Library/FolderB/FileA',
'Library/FolderB/FileB',
'Library/FolderC/FileA',
'Library/FolderC/FileB'
]
Thank you!
In [11]: [d+f for (d,f) in itertools.product(list1, list2)]
Out[11]:
['Library/FolderA/FileA',
'Library/FolderA/FileB',
'Library/FolderB/FileA',
'Library/FolderB/FileB',
'Library/FolderC/FileA',
'Library/FolderC/FileB']
or, slightly more portably (and perhaps robustly):
In [16]: [os.path.join(*p) for p in itertools.product(list1, list2)]
Out[16]:
['Library/FolderA/FileA',
'Library/FolderA/FileB',
'Library/FolderB/FileA',
'Library/FolderB/FileB',
'Library/FolderC/FileA',
'Library/FolderC/FileB']
You can use a list comprehension:
>>> [d + f for d in list1 for f in list2]
['Library/FolderA/FileA', 'Library/FolderA/FileB', 'Library/FolderB/FileA', 'Library/FolderB/FileB', 'Library/FolderC/FileA', 'Library/FolderC/FileB']
You may want to use os.path.join() instead of simple concatenation though.
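As a rough illustration of that suggestion, the sketch below joins every folder with every file. It uses posixpath.join (the POSIX flavour of os.path.join) so the output is the same on every platform, and one folder deliberately lacks its trailing slash to show that the separator is added only where needed. That variation and the name `paths` are assumptions for the example.

```python
import itertools
import posixpath

# 'Library/FolderB' has no trailing slash on purpose (hypothetical variation)
list1 = ['Library/FolderA/', 'Library/FolderB', 'Library/FolderC/']
list2 = ['FileA', 'FileB']

# join() inserts a '/' only when the folder does not already end with one
paths = [posixpath.join(d, f) for d, f in itertools.product(list1, list2)]
print(paths)
```

Plain `d + f` would produce 'Library/FolderBFileA' for the folder without the slash.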
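The standard-library itertools module defines a product() function for this:
import itertools
# product() yields (folder, file) tuples, so join each pair into one string
result = [d + f for d, f in itertools.product(list1, list2)]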
A for loop can do this easily:
my_list, combo = [], ''
list1 = ['Library/FolderA/', 'Library/FolderB/', 'Library/FolderC/']
list2 = ['FileA', 'FileB']
for x in list1:
    for y in list2:
        combo = x + y
        my_list.append(combo)
print(my_list)
You can also just print them:
list1 = ['Library/FolderA/', 'Library/FolderB/', 'Library/FolderC/']
list2 = ['FileA', 'FileB']
for x in list1:
    for y in list2:
        print(x + y)

How do i add two lists' elements into one list?

For example, I have a list like this:
list1 = ['good', 'bad', 'tall', 'big']
list2 = ['boy', 'girl', 'guy', 'man']
and I want to make a list like this:
list3 = ['goodboy', 'badgirl', 'tallguy', 'bigman']
I tried something like these:
list3=[]
list3 = list1 + list2
but this would only contain the value of list1
So I used a for loop:
list3 = []
for a in list1:
    for b in list2:
        c = a + b
        list3.append(c)
but it resulted in too many elements (in this case, 4*4 = 16 of them).
You can use list comprehensions with zip:
list3 = [a + b for a, b in zip(list1, list2)]
zip pairs up elements from the iterables you give it, yielding tuples (a list of tuples in Python 2, a lazy iterator in Python 3). So in your case, it will return pairs of elements from list1 and list2, up to whichever is exhausted first.
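The "whichever is exhausted first" behaviour matters when the lists differ in length. A small sketch with hypothetical, deliberately uneven lists, comparing zip with itertools.zip_longest, which pads the shorter input instead of truncating:

```python
from itertools import zip_longest

# Hypothetical inputs of different lengths
adjectives = ['good', 'bad', 'tall']
nouns = ['boy', 'girl']

# zip stops at the shorter list, so 'tall' is dropped
truncated = [a + b for a, b in zip(adjectives, nouns)]

# zip_longest keeps going and fills the gap with an empty string
padded = [a + b for a, b in zip_longest(adjectives, nouns, fillvalue='')]

print(truncated)
print(padded)
```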
A loop like the one you tried is one way; this version is more beginner-friendly than Xion's solution.
list3 = []
for index, item in enumerate(list1):
    list3.append(list1[index] + list2[index])
This shorter solution will also work, using map() and a lambda. I prefer this over zip, but that's up to everyone:
# list() is needed in Python 3, where map() returns an iterator
list3 = list(map(lambda x, y: str(x) + str(y), list1, list2))
For this, or any two lists of the same size, you may also use an index-based loop:
list3 = [None] * len(list1)  # pre-size list3 so that index assignment works
for i in range(len(list1)):
    list3[i] = list1[i] + list2[i]
Using zip:
list3 = []
for l1, l2 in zip(list1, list2):
    list3.append(l1 + l2)
# list3 == ['goodboy', 'badgirl', 'tallguy', 'bigman']
