If list1 has any items of list2 - python

Assume we have the following two lists,
list1 = ['text_svm_a', 'football_04', 'nice_sensor']
list2 = ['svm', 'sensor']
filtered_list = [item for item in list1 if item_contains_any_of_items_in_list2]
Any help with writing item_contains_any_of_items_in_list2 would be really appreciated.
Note: Both lists could be large, so I don't want to hard-code each condition.

You can use any:
filtered_list = [item for item in list1 if any(x in item for x in list2)]
# ['text_svm_a', 'nice_sensor']
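
If the check is needed in several places, the same idea can be wrapped in a small helper. This is only a minimal sketch; the function name filter_by_substrings is hypothetical and not part of the original answer.

```python
def filter_by_substrings(items, needles):
    """Keep the items that contain at least one of the needles as a substring."""
    # Materialise once so that a generator argument is not exhausted
    # after the first item has been checked.
    needles = tuple(needles)
    return [item for item in items if any(n in item for n in needles)]


list1 = ['text_svm_a', 'football_04', 'nice_sensor']
list2 = ['svm', 'sensor']
filtered_list = filter_by_substrings(list1, list2)
print(filtered_list)  # ['text_svm_a', 'nice_sensor']
```

Note that any() stops at the first matching needle, so items that match early are cheap to test; items that match nothing still cost one substring search per needle.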

Related

Manipulate a list based on another list

I have a list of English words (list2) and I want to remove from it every word that contains any of the letters in list1.
For this example:
list1 = ['A','B']
list2 = ['AARON', 'ABAFT', 'ABASE', 'ABASK', 'ABAVE', 'ABBAS', 'ABBIE', 'ABDAL', 'ABEAM', 'ABELE', 'ABIDE', 'ABIES', 'ABKAR', 'ABLOW', 'ABNER', 'ABODE', 'ABOHM']
I want to write a loop to remove all elements as they contain either A or B.
The result should be an empty list.
list1 = ['A','B']
list2 = ['AARON', 'ABAFT', 'ABASE', 'ABASK', 'ABAVE', 'ABBAS', 'ABBIE', 'ABDAL', 'ABEAM', 'ABELE', 'ABIDE', 'ABIES', 'ABKAR', 'ABLOW', 'ABNER', 'ABODE', 'ABOHM']
for x in list1:
    for y in list2:
        if x in y:
            list2.remove(y)
print(list2)
I was expecting an empty list but the result was:
['ABASK', 'ABDAL', 'ABIES', 'ABODE']
As commented by tripleee, changing a list while iterating over it can cause trouble. I'd suggest using a list comprehension and set to check for intersecting characters:
# Or list1 = {'A','B'}
list1 = set(list1)
# returns empty list
[w for w in list2 if not list1.intersection(w)]
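
To see why removing items inside the loop leaves some words behind, here is a small illustration. The shortened word list and the variable names are chosen only for this example.

```python
words = ['AARON', 'ABAFT', 'ABASE', 'ABASK']

# Removing while iterating: after each removal the remaining items shift left,
# so the loop's internal index steps over the item that moved into the freed slot.
mutated = list(words)
for w in mutated:
    if 'A' in w:
        mutated.remove(w)

# Iterating over a copy keeps the loop position independent of the removals.
safe = list(words)
for w in list(safe):
    if 'A' in w:
        safe.remove(w)

print(mutated)  # ['ABAFT', 'ABASK']
print(safe)     # []
```

Every second matching word survives in the first loop, which is the same effect seen in the question's output.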
You can also use a regex if the entries of list1 are simple strings. One possible approach:
import re
# re.escape keeps entries such as '.' or '+' from being read as regex syntax
matcher = re.compile("|".join(map(re.escape, list1)))
list2 = [s for s in list2 if not matcher.search(s)]
Using a composition of built-in functions (see the docs).
filter returns an iterator, so it should be converted to a list.
list1 = ['A','B']
list2 = ['ARON', 'ABAFT', 'ABASE', 'ABASK', 'ABAVE', 'ABBAS', 'ABBIE', 'ABDAL', 'ABEAM', 'ABELE', 'ABIDE', 'ABIES', 'ABKAR', 'ABLOW', 'ABNER', 'ABODE', 'ABOHM']
a = filter(lambda s: not any(map(s.__contains__, list1)), list2)
print(list(a))

Optimization: Remove difference between list of list

I need to know whether I can optimize my code, because I feel it can be.
Here is the context:
list1 = [[1,'name1'], [2,'name2'], [3,'name3']]
list2 = [2,3]
for item in list1:
    if item[0] not in list2:
        list1.remove(item)
To get list1 I'm doing this:
list(filter(lambda x: name in x[1].lower(), list_of_items))
So I'm asking whether it is really possible to optimize this code.
Update:
As asked: I can't use list2 directly in the lambda filter because I'm getting it with:
list2 = set(item[0] for item in list1) - set(object.id for object in list_in_my_bdd)
That looks very much like a job for a dictionary lookup, which is probably faster than an O(N^2) list comparison. So if you want all entries of list1 whose keys are contained in list2, and you can assume that the keys in list_of_items are unique, you can do:
list1 = [[1,'name1'], [2,'name2'], [3,'name3']]
dict1 = dict(list1)
list2 = [2,3]
result = [dict1[k] for k in list2]
This requires every key in list2 to actually be present in list1; otherwise dict1[k] raises a KeyError. With dict1.get(k), a missing key yields None in result instead.
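
The ways of handling a key that is missing from the dictionary can be compared side by side. This is a sketch under the assumption that list2 may contain ids absent from list1; the names wanted, present, with_none and missing_key are made up for the illustration.

```python
list1 = [[1, 'name1'], [2, 'name2'], [3, 'name3']]
dict1 = dict(list1)
wanted = [2, 3, 99]  # 99 is deliberately absent from list1

# Option 1: silently skip keys that are not in the dictionary.
present = [dict1[k] for k in wanted if k in dict1]

# Option 2: dict.get returns None for a missing key.
with_none = [dict1.get(k) for k in wanted]

# Option 3: plain indexing raises KeyError on the first missing key.
try:
    strict = [dict1[k] for k in wanted]
except KeyError as exc:
    missing_key = exc.args[0]

print(present)      # ['name2', 'name3']
print(with_none)    # ['name2', 'name3', None]
print(missing_key)  # 99
```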
Timing:
To reproduce the timings in a notebook:
import numpy as np
list1 = [[i, f"name{i}"] for i in range(10000)]
list2 = np.random.choice(range(10000), 1000).tolist()
dict1 = dict(list1)
%timeit [dict1.get(k) for k in list2]
%timeit [item for item in list1 if item[0] in list2]
>>>10000 loops, best of 5: 141 µs per loop
>>>10 loops, best of 5: 160 ms per loop
So the dictionary lookup is approximately 1000 times faster than the list comprehension.
As @Paul mentions, this is only faster if the setup of list1 can be replaced by building a dictionary directly:
import numpy as np
list_of_items = [(j, f"name{i}") for j, i in enumerate(np.random.randint(0, 50000, 10000))]
list1 = list(filter(lambda x: "name" in x[1].lower(), list_of_items))
dict1 = dict(filter(lambda x: "name" in x[1].lower(), list_of_items))
You could use a list comprehension instead and only include sub-lists of list1 whose first element is in list2:
[item for item in list1 if item[0] in list2]
What about a list comprehension and a set?
list1 = [[1,'name1'], [2,'name2'], [3,'name3']]
good = {2,3}
print([[a,b] for a,b in list1 if a in good])
# [[2, 'name2'], [3, 'name3']]
You could do it like this. Note that the variable names are the same as in your question but list2 is actually a set (for efficiency).
list1 = [[1,'name1'], [2,'name2'], [3,'name3']]
list2 = {2, 3}
list3 = [item for item in list1 if item[0] in list2]
print(list3)
To get list1 I'm doing this:
list(filter(lambda x: name in x[1].lower(), list_of_items))
Why don't you also include the check here:
list(filter(lambda x: name in x[1].lower() and x[0] in list2, list_of_items))

Comparing lists. Which elements are NOT in a list?

I have the following 2 lists, and I want to obtain the elements of list2 that are not in list1:
list1 = ["0100","0300","0500"]
list2 = ["0100","0200","0300","0400","0500"]
My output should be:
list3 = ["0200","0400"]
I was looking for a way to subtract one from the other, but so far I haven't been able to get list3 as I want.
list3 = [x for x in list2 if x not in list1]
Or, if you don't care about order, you can convert the lists to sets:
set(list2) - set(list1)
Then, you can also convert this back to a list:
list3 = list(set(list2) - set(list1))
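
If the order of list2 matters and the lists are large, the two ideas can be combined: keep the list comprehension, but test membership against a set. A minimal sketch (the name exclude is chosen here for illustration):

```python
list1 = ["0100", "0300", "0500"]
list2 = ["0100", "0200", "0300", "0400", "0500"]

# Build the set once; each membership test is then O(1) on average
# instead of a scan through list1.
exclude = set(list1)
list3 = [x for x in list2 if x not in exclude]
print(list3)  # ['0200', '0400']
```

Unlike the pure set difference, this keeps the original order and any duplicates in list2.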
could this solution work for you?
list3 = []
for i in range(len(list2)):
    if list2[i] not in list1:
        list3.append(list2[i])
list1 = ["0100","0300","0500"]
list2 = ["0100","0200","0300","0400","0500"]
list3 = list(filter(lambda e: e not in list1,list2))
print(list3)
I believe this has been answered here:
Python find elements in one list that are not in the other
import numpy as np
list1 = ["0100","0300","0500"]
list2 = ["0100","0200","0300","0400","0500"]
list3 = np.setdiff1d(list2,list1)
print(list3)
Set operations solve this problem in a few lines of code:
set1=set(["0100","0300","0500"])
set2=set(["0100","0200","0300","0400","0500"])
set3=set2-set1
print(list(set3))
Sets give you faster membership tests and differences than lists in Python, though the order of the result is not guaranteed.

remove element from python list based on match from another list

I have list of s3 objects like this:
list1 = ['uid=123/2020/06/01/625e2ghvh.parquet','uid=876/2020/04/01/hgdshct7.parquet','uid=0987/2019/03/01/323dc.parquet']
list2 = ['123','876']
result_list = ['uid=0987/2019/03/01/323dc.parquet']
Without using an explicit loop, is there an efficient way to achieve this, considering the large number of elements in list1?
You could build a set from list2 for a faster lookup and use a list comprehension to check for membership using the substring of interest:
list1 = ['uid=123/2020/06/01/625e2ghvh.parquet','uid=876/2020/04/01/hgdshct7.parquet',
         'uid=0987/2019/03/01/323dc.parquet']
list2 = ['123','876']
set2 = set(list2)
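# Slice off the 'uid=' prefix; str.lstrip('uid=') would strip any leading
# 'u', 'i', 'd' or '=' characters rather than the exact prefix.
[i for i in list1 if i[len('uid='):].split('/',1)[0] not in set2]
# ['uid=0987/2019/03/01/323dc.parquet']
The substring is obtained through:
s = 'uid=123/2020/06/01/625e2ghvh.parquet'
s[len('uid='):].split('/',1)[0]
# '123'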
This does the job. For different patterns though, or to also cover slight variations, you could go for a regex. For this example you'd need something like:
import re
[i for i in list1 if re.search(r'^uid=(\d+).*?', i).group(1) not in set2]
# ['uid=0987/2019/03/01/323dc.parquet']
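
One caveat when stripping a fixed prefix such as 'uid=': str.lstrip treats its argument as a set of characters, not as a prefix. That is harmless for purely numeric ids but can eat into ids that start with one of those letters. The sketch below shows the difference and a prefix-aware alternative; the helper name extract_uid and the id 'dave' are hypothetical.

```python
def extract_uid(key):
    """Return the text between the 'uid=' prefix and the first '/'."""
    prefix = 'uid='
    head = key.split('/', 1)[0]  # e.g. 'uid=123'
    return head[len(prefix):] if head.startswith(prefix) else head


# lstrip removes any leading run of the characters 'u', 'i', 'd', '='.
print('uid=123'.lstrip('uid='))   # '123'
print('uid=dave'.lstrip('uid='))  # 'ave'

list1 = ['uid=123/2020/06/01/625e2ghvh.parquet',
         'uid=876/2020/04/01/hgdshct7.parquet',
         'uid=0987/2019/03/01/323dc.parquet']
set2 = {'123', '876'}
result_list = [k for k in list1 if extract_uid(k) not in set2]
print(result_list)  # ['uid=0987/2019/03/01/323dc.parquet']
```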
This is one way to do it without loops
def filter_function(item):
    uid = int(item[4:].split('/')[0])
    return uid not in list2
list1 = ['uid=123/2020/06/01/625e2ghvh.parquet','uid=876/2020/04/01/hgdshct7.parquet','uid=0987/2019/03/01/323dc.parquet']
list2 = [123, 876]
result_list = list(filter(filter_function, list1))
How about this one:
_list2 = [f'uid={number}' for number in list2]
result = [item for item in list1 if not any([item.startswith(i) for i in _list2])] # ['uid=0987/2019/03/01/323dc.parquet']

How do i add two lists' elements into one list?

For example, I have a list like this:
list1 = ['good', 'bad', 'tall', 'big']
list2 = ['boy', 'girl', 'guy', 'man']
and I want to make a list like this:
list3 = ['goodboy', 'badgirl', 'tallguy', 'bigman']
I tried something like these:
list3=[]
list3 = list1 + list2
but this only concatenates the two lists, so list3 holds the eight original strings rather than the combined words.
So I used a for loop:
list3 = []
for a in list1:
    for b in list2:
        c = a + b
        list3.append(c)
but it results in too many elements (in this case, 4*4 = 16 of them).
You can use list comprehensions with zip:
list3 = [a + b for a, b in zip(list1, list2)]
zip pairs up elements from the iterables you give it and yields them as tuples (in Python 3 it returns a lazy iterator rather than a list). So in your case, it will yield pairs of elements from list1 and list2, stopping when the shorter one is exhausted.
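
The stopping behaviour is easy to check with lists of unequal length. In this sketch the shortened nouns list is made up for the illustration, and itertools.zip_longest is shown as one option when the leftover elements should be kept:

```python
from itertools import zip_longest

adjectives = ['good', 'bad', 'tall', 'big']
nouns = ['boy', 'girl', 'guy']  # one element short on purpose

# zip stops at the shorter input, so 'big' is silently dropped.
truncated = [a + b for a, b in zip(adjectives, nouns)]

# zip_longest keeps going and fills the gap with fillvalue.
padded = [a + b for a, b in zip_longest(adjectives, nouns, fillvalue='')]

print(truncated)  # ['goodboy', 'badgirl', 'tallguy']
print(padded)     # ['goodboy', 'badgirl', 'tallguy', 'big']
```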
Using a loop, as you tried, is one way; this version is more beginner-friendly than Xion's solution.
list3 = []
for index, item in enumerate(list1):
    list3.append(item + list2[index])
This shorter solution also works, using map() and a lambda. I prefer it over zip, but that is a matter of taste. In Python 3, map returns an iterator, so wrap it in list():
list3 = list(map(lambda x, y: str(x) + str(y), list1, list2))
For these or any two lists of the same size, you can also index both lists (list3 must be pre-sized before assigning by index):
list3 = [None] * len(list1)
for i in range(len(list1)):
    list3[i] = list1[i] + list2[i]
Using zip:
list3 = []
for l1, l2 in zip(list1, list2):
    list3.append(l1 + l2)
# list3 == ['goodboy', 'badgirl', 'tallguy', 'bigman']
