Manipulate a list based on another list - python

I have a list of English words (list2), and I want to remove the words that contain any of the letters in list1.
For this example:
list1 = ['A','B']
list2 = ['AARON', 'ABAFT', 'ABASE', 'ABASK', 'ABAVE', 'ABBAS', 'ABBIE', 'ABDAL', 'ABEAM', 'ABELE', 'ABIDE', 'ABIES', 'ABKAR', 'ABLOW', 'ABNER', 'ABODE', 'ABOHM']
I want to write a loop to remove all elements as they contain either A or B.
The result should be an empty list.
list1 = ['A','B']
list2 = ['AARON', 'ABAFT', 'ABASE', 'ABASK', 'ABAVE', 'ABBAS', 'ABBIE', 'ABDAL', 'ABEAM', 'ABELE', 'ABIDE', 'ABIES', 'ABKAR', 'ABLOW', 'ABNER', 'ABODE', 'ABOHM']
for x in list1:
    for y in list2:
        if x in y:
            list2.remove(y)
print(list2)
I was expecting an empty list but the result was:
['ABASK', 'ABDAL', 'ABIES', 'ABODE']

As commented by tripleee, changing a list while iterating over it can cause trouble. I'd suggest using a list comprehension and set to check for intersecting characters:
# Or list1 = {'A','B'}
list1 = set(list1)
# returns empty list
[w for w in list2 if not list1.intersection(w)]
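To make the failure mode concrete, here is a minimal sketch that contrasts removing while iterating with building a new list. The helper name `remove_words_containing` and the shortened word list (with 'CLOUD' added) are illustrative assumptions, not part of the question.

```python
def remove_words_containing(words, letters):
    """Return the words that contain none of the given letters."""
    banned = set(letters)
    # isdisjoint() is True when the word shares no character with `banned`.
    return [w for w in words if banned.isdisjoint(w)]


words = ['AARON', 'ABAFT', 'ABASE', 'CLOUD', 'ABASK']

# Removing while iterating: after each remove() the remaining items shift
# left, so the loop skips the element that moved into the freed slot.
mutated = list(words)
for w in mutated:
    if 'A' in w:
        mutated.remove(w)

filtered = remove_words_containing(words, ['A', 'B'])

print(mutated)   # ['ABAFT', 'CLOUD'] - 'ABAFT' was skipped
print(filtered)  # ['CLOUD']
```

Building a new list never changes the list being iterated, so no element can be skipped.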

You can also use a regex if list1 is not complex. An example approach is below (re.escape guards against entries that are regex metacharacters):
import re
matcher = re.compile("|".join(map(re.escape, list1)))
list2 = [s for s in list2 if not matcher.search(s)]

Using a composition of built-in functions.
filter returns an iterator, so it should be converted to a list.
list1 = ['A','B']
list2 = ['ARON', 'ABAFT', 'ABASE', 'ABASK', 'ABAVE', 'ABBAS', 'ABBIE', 'ABDAL', 'ABEAM', 'ABELE', 'ABIDE', 'ABIES', 'ABKAR', 'ABLOW', 'ABNER', 'ABODE', 'ABOHM']
a = filter(lambda s: not any(map(s.__contains__, list1)), list2)
print(list(a))

Related

Python - filter list from another list with condition

list1 = ['/mnt/1m/a_pre.geojson','/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
I have multiple lists and I want to find all the elements of list1 that have no matching entry in list2 under a filtering condition.
The condition is that the 'm' directory (1m, 2m, ...) must match, as must the name of the geojson file once the '_pre' or '_post' suffix is excluded.
For example, in list1 '/mnt/1m/a_pre.geojson' has been processed but '/mnt/2m/b_pre.geojson' has not, so the output should be the list ['/mnt/2m/b_pre.geojson'].
I am using two for loops and then splitting the strings, which I am sure is not the only way, and there might be an easier way to do this.
for i in list1:
    for j in list2:
        pre_tile = i.split("/")[-1].split('_pre', 1)[0]
        post_tile = j.split("/")[-1].split('_post', 1)[0]
        if pre_tile == post_tile:
            ...
I assume your file paths share a similar first part. If so, you can try this, which compares only the first seven characters of each path (for example '/mnt/1m'):
list1 = ['/mnt/1m/a_pre.geojson','/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
res = [x for x in list1 if x[:7] not in [y[:7] for y in list2]]
res:
['/mnt/2m/b_pre.geojson']
If I understand you correctly, using a regular expression to do this kind of string manipulation can be fast and easy.
Additionally, to do multiple member-tests in list2, it's more efficient to convert the list to a set.
import re
list1 = ['/mnt/1m/a_pre.geojson', '/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
# The dot before "geojson" is escaped so that it matches a literal "."
pattern = re.compile(r'(.*?/[0-9]m/.*?)_pre\.geojson')
set2 = set(list2)
result = [
    m.string
    for m in map(pattern.fullmatch, list1)
    if m and f"{m[1]}_post.geojson" not in set2
]
print(result)
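The matching condition from the question (same 'm' directory, same file name without the pre/post suffix) can also be expressed as a comparison key. The following is only a sketch: the pattern `TILE_RE` and the helper `tile_key` are hypothetical names, and the path layout `.../<N>m/<name>_pre.geojson` is an assumption based on the two sample paths.

```python
import re

# Assumed layout: .../<N>m/<name>_pre.geojson or .../<N>m/<name>_post.geojson
TILE_RE = re.compile(r'/(\d+m)/([^/]+)_(?:pre|post)\.geojson$')


def tile_key(path):
    """Return (resolution, name), e.g. ('1m', 'a'), or None if the path does not match."""
    m = TILE_RE.search(path)
    return m.groups() if m else None


list1 = ['/mnt/1m/a_pre.geojson', '/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']

# Build the set of processed keys once, then test each pre-file against it.
processed = {tile_key(p) for p in list2}
unprocessed = [p for p in list1 if tile_key(p) not in processed]
print(unprocessed)  # ['/mnt/2m/b_pre.geojson']
```

Because the key ignores everything before the 'm' directory, this version does not depend on both lists sharing the same path prefix.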

Comparing lists. Which elements are NOT in a list?

I have the following 2 lists, and I want to obtain the elements of list2 that are not in list1:
list1 = ["0100","0300","0500"]
list2 = ["0100","0200","0300","0400","0500"]
My output should be:
list3 = ["0200","0400"]
I was looking for a way to subtract one from the other, but so far I haven't been able to get list3 as I want.
list3 = [x for x in list2 if x not in list1]
Or, if you don't care about order, you can convert the lists to sets:
set(list2) - set(list1)
Then, you can also convert this back to a list:
list3 = list(set(list2) - set(list1))
Could this solution work for you?
list3 = []
for i in range(len(list2)):
    if list2[i] not in list1:
        list3.append(list2[i])
list1 = ["0100","0300","0500"]
list2 = ["0100","0200","0300","0400","0500"]
list3 = list(filter(lambda e: e not in list1,list2))
print(list3)
I believe this has been answered here:
Python find elements in one list that are not in the other
import numpy as np
list1 = ["0100","0300","0500"]
list2 = ["0100","0200","0300","0400","0500"]
list3 = np.setdiff1d(list2,list1)
print(list3)
Set operations will solve your problem in a few lines of code:
set1=set(["0100","0300","0500"])
set2=set(["0100","0200","0300","0400","0500"])
set3=set2-set1
print(list(set3))
Set membership tests are faster than list membership tests in Python, but note that a set does not preserve the original order.
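The two ideas in this thread (a list comprehension to keep the order, a set for fast lookups) can be combined. A minimal sketch, where the variable names `exclude` and `unordered` are illustrative:

```python
list1 = ["0100", "0300", "0500"]
list2 = ["0100", "0200", "0300", "0400", "0500"]

# Build the set once; each membership test is then O(1) on average.
exclude = set(list1)

# The comprehension keeps list2's order (and any duplicates in it).
list3 = [x for x in list2 if x not in exclude]
print(list3)  # ['0200', '0400']

# Plain set difference has the same elements, but no guaranteed order.
unordered = set(list2) - exclude
```

For lists this small the difference is negligible; the set lookup starts to matter when list1 is large.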

remove element from python list based on match from another list

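I have a list of S3 object keys (list1) and a list of IDs (list2), and I want result_list, which drops every key whose uid appears in list2: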
list1 = ['uid=123/2020/06/01/625e2ghvh.parquet','uid=876/2020/04/01/hgdshct7.parquet','uid=0987/2019/03/01/323dc.parquet']
list2 = ['123','876']
result_list = ['uid=0987/2019/03/01/323dc.parquet']
Without using an explicit loop, is there an efficient way to achieve this, considering the large number of elements in list1?
You could build a set from list2 for a faster lookup and use a list comprehension to check for membership using the substring of interest:
list1 = ['uid=123/2020/06/01/625e2ghvh.parquet','uid=876/2020/04/01/hgdshct7.parquet',
'uid=0987/2019/03/01/323dc.parquet']
list2 = ['123','876']
set2 = set(list2)
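[i for i in list1 if i.removeprefix('uid=').split('/',1)[0] not in set2]
# ['uid=0987/2019/03/01/323dc.parquet']
The substring is obtained through:
s = 'uid=123/2020/06/01/625e2ghvh.parquet'
s.removeprefix('uid=').split('/',1)[0]
# '123'
str.removeprefix requires Python 3.9+. It is used here instead of lstrip('uid='), which strips any leading run of the characters u, i, d and = rather than the literal prefix, and so would also eat into an ID that starts with one of those characters.
This does the job. For different patterns though, or to also cover slight variations, you could go for a regex. For this example you'd need something like: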
import re
[i for i in list1 if re.search(r'^uid=(\d+).*?', i).group(1) not in set2]
# ['uid=0987/2019/03/01/323dc.parquet']
This is one way to do it without loops
def filter_function(item):
    # Text between "uid=" and the first "/", compared as an integer
    uid = int(item[4:].split('/')[0])
    return uid not in list2
list1 = ['uid=123/2020/06/01/625e2ghvh.parquet','uid=876/2020/04/01/hgdshct7.parquet','uid=0987/2019/03/01/323dc.parquet']
list2 = [123, 876]
result_list = list(filter(filter_function, list1))
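How about this one:
# The trailing '/' keeps 'uid=123' from also matching 'uid=1234/...'
_list2 = tuple(f'uid={number}/' for number in list2)
# str.startswith accepts a tuple of prefixes
result = [item for item in list1 if not item.startswith(_list2)] # ['uid=0987/2019/03/01/323dc.parquet']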
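The answers in this thread share one shape: extract the ID from each key, then test it against a set. A minimal sketch of that shape follows; the helper name `uid_of` is hypothetical, and it assumes every key looks like 'uid=<id>/...' as in the sample data.

```python
def uid_of(key):
    """Extract the ID from a key shaped like 'uid=<id>/...' (assumed layout)."""
    head, _, _ = key.partition('/')   # e.g. 'uid=123'
    _, _, uid = head.partition('=')   # e.g. '123'
    return uid


list1 = ['uid=123/2020/06/01/625e2ghvh.parquet',
         'uid=876/2020/04/01/hgdshct7.parquet',
         'uid=0987/2019/03/01/323dc.parquet']
list2 = ['123', '876']

blocked = set(list2)  # one-time conversion for fast membership tests
result_list = [key for key in list1 if uid_of(key) not in blocked]
print(result_list)  # ['uid=0987/2019/03/01/323dc.parquet']
```

A comprehension still visits every element of list1, so the work stays linear in its length; what the set avoids is rescanning list2 for each key.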

list and elements comparison python

list1 = ['A','B']
list2 = ['a','c']
list3 = ['x','y','z']
list4 = [['A','b','c'],['a','x'],['Y','Z'],['d','g']]
I want to check whether all elements of a list (list1, list2, list3) are contained in any one of the lists inside another, bigger list (list4).
I want the comparison to be case-insensitive.
To be clear, here list1 and list2 are in list4, but list3 is not. How can I do it?
On another note, how would I know whether a list is a collection of lists?
In other words, how can I distinguish a list of lists from a plain list of elements if I am not the one defining the lists?
First, you want case-insensitive matching. The best way to do that is to convert everything to one case (upper or lower). So for each list, run
list1 = [x.lower() for x in list1]
That will convert your lists to lowercase. Let's assume you've done that.
Second, to compare two "simple" (non-nested) lists, you can simply say
if set(list1) <= set(list2):
to check whether list1 is a subset of list2 (use < for a proper subset). In your example, it would be false.
Finally, if you want to check whether a list is nested:
if isinstance(list4[0], list):
which in this case would be true. Then just iterate over the elements of list4 and do the set comparison above.
You can use lower() to convert every element of each list to lowercase, which makes the comparison case-insensitive.
def make_case_insensitive(lst):
    return [i.lower() for i in lst]
For example,
list1 = make_case_insensitive(list1)
Because the bigger list is slightly different (its elements are lists), the function has to change slightly.
def make_bigger_list_caseinsensitive(bigger_list):
    return [[i.lower() for i in element] for element in bigger_list]
list4 = make_bigger_list_caseinsensitive(list4)
Check whether any element of the bigger list is a superset of the smaller list, converting both to sets first. Print "Is in bigger list" if the condition is satisfied and "not in bigger list" otherwise.
print("Is in bigger list" if any(set(element).issuperset(set(list1)) for element in list4) else "not in bigger list")
To make it slightly more readable:
if any(set(element).issuperset(set(list1)) for element in list4):
    print("Is in bigger list")
else:
    print("not in bigger list")
Finally, to check whether the bigger list contains a nested list:
print(any(isinstance(element, list) for element in list4))
Using set is a good way.
from functools import reduce  # reduce is not a builtin in Python 3
list1 = ['A','B']
list2 = ['a','c']
list3 = ['x','y','z']
list4 = [['A','b','c'],['a','x'],['Y','Z'],['d','g']]
set1 = set(map(lambda s: s.lower(), list1))
set2 = set(map(lambda s: s.lower(), list2))
set3 = set(map(lambda s: s.lower(), list3))
# list() so that set4 can be traversed more than once (map is a one-shot iterator in Python 3)
set4 = list(map(lambda l: set(map(lambda s: s.lower(), l)), list4))
print(set1) # {'a', 'b'} (element order within a set may vary)
print(set2) # {'a', 'c'}
print(set3) # {'x', 'y', 'z'}
print(set4) # [{'a', 'b', 'c'}, {'a', 'x'}, {'y', 'z'}, {'d', 'g'}]
lor = lambda x, y: x or y
reduce(lor, map(lambda s: set1.issubset(s), set4)) # True
reduce(lor, map(lambda s: set2.issubset(s), set4)) # True
reduce(lor, map(lambda s: set3.issubset(s), set4)) # False
To do a case-insensitive string comparison, convert both strings to lowercase or uppercase.
To test whether all elements in list1 are contained in a list inside list4, use set.issubset.
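The lowercase conversion and the subset test described in these answers fit in one small function. This is a sketch only; the name `contained_in_any` is hypothetical and it assumes every element is a string.

```python
def contained_in_any(small, nested):
    """True if every item of `small` appears, ignoring case, in a single sublist of `nested`."""
    wanted = {s.lower() for s in small}
    # `<=` on sets is the subset test (same as set.issubset).
    return any(wanted <= {s.lower() for s in sub} for sub in nested)


list1 = ['A', 'B']
list2 = ['a', 'c']
list3 = ['x', 'y', 'z']
list4 = [['A', 'b', 'c'], ['a', 'x'], ['Y', 'Z'], ['d', 'g']]

results = [contained_in_any(lst, list4) for lst in (list1, list2, list3)]
print(results)  # [True, True, False]
```

any() stops at the first sublist that satisfies the test, so later sublists are not converted once a match has been found.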

How do I add two lists' elements into one list?

For example, I have a list like this:
list1 = ['good', 'bad', 'tall', 'big']
list2 = ['boy', 'girl', 'guy', 'man']
and I want to make a list like this:
list3 = ['goodboy', 'badgirl', 'tallguy', 'bigman']
I tried something like these:
list3=[]
list3 = list1 + list2
but this would only contain the value of list1
So I used nested for loops:
list3 = []
for a in list1:
    for b in list2:
        c = a + b
        list3.append(c)
but this produces too many elements (in this case, 4*4 = 16 of them).
You can use list comprehensions with zip:
list3 = [a + b for a, b in zip(list1, list2)]
zip pairs up elements from the iterables you give it (in Python 3 it returns an iterator of tuples rather than a list). So in your case, it yields pairs of elements from list1 and list2, stopping when the shorter one is exhausted.
The loop you tried is one way to go; this version is more beginner-friendly than Xion's solution.
list3 = []
for index, item in enumerate(list1):
    list3.append(item + list2[index])
This shorter solution will also work. It uses map() and a lambda; I prefer this over zip, but that's a matter of taste. In Python 3, wrap the call in list() to get a list.
list3 = list(map(lambda x, y: str(x) + str(y), list1, list2))
For these or any two lists of the same size, you can also index into both:
list3 = []
for i in range(len(list1)):
    list3.append(list1[i] + list2[i])
Using zip
list3 = []
for l1, l2 in zip(list1, list2):
    list3.append(l1 + l2)
list3 = ['goodboy', 'badgirl', 'tallguy', 'bigman']
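All the zip-based answers assume the two lists have the same length. The sketch below shows what happens when they do not; the shortened list2 and the names `truncated` and `padded` are illustrative assumptions, while itertools.zip_longest is part of the standard library.

```python
from itertools import zip_longest

list1 = ['good', 'bad', 'tall', 'big']
list2 = ['boy', 'girl', 'guy']  # deliberately one item short

# zip stops at the shorter input, so 'big' is silently dropped.
truncated = [a + b for a, b in zip(list1, list2)]
print(truncated)  # ['goodboy', 'badgirl', 'tallguy']

# zip_longest pads the shorter input with fillvalue instead.
padded = [a + b for a, b in zip_longest(list1, list2, fillvalue='')]
print(padded)  # ['goodboy', 'badgirl', 'tallguy', 'big']
```

Which behaviour is right depends on the data; for the equal-length lists in the question both give ['goodboy', 'badgirl', 'tallguy', 'bigman'].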
