I am trying to see if I can make this code better using list comprehensions.
Let's say that I have the following lists:
import re

a_list = [
    'HELLO',
    'FOO',
    'FO1BAR',
    'ROOBAR',
    'SHOEBAR'
]
regex_list = [lambda x: re.search(r'FOO', x, re.IGNORECASE),
              lambda x: re.search(r'RO', x, re.IGNORECASE)]
I basically want to add all the elements that do not have any matches in the regex_list into another list.
For example:
newlist = []
for each in a_list:
    for regex in regex_list:
        if regex(each) is None:
            newlist.append(each)
How can I do this using list comprehensions? Is it even possible?
Sure, I think this should do it
newlist = [s for s in a_list if not any(r(s) for r in regex_list)]
EDIT: on closer inspection, I notice that your example code actually adds each string in a_list that fails to match at least one of the regexes, and what's more, it adds the string once for every regex it fails to match. My list comprehension does what I think you meant: it adds exactly one copy of each string that matches none of the regexes.
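To make the difference concrete, here is a small sketch that runs both versions side by side on the lists from the question. The precompiled `patterns` list is my own substitution for the lambdas in `regex_list`; the matching behaviour should be the same.

```python
import re

a_list = ['HELLO', 'FOO', 'FO1BAR', 'ROOBAR', 'SHOEBAR']
# Assumption: precompiled patterns stand in for the lambdas in the question.
patterns = [re.compile(r'FOO', re.IGNORECASE),
            re.compile(r'RO', re.IGNORECASE)]

# Nested loop from the question: one append per (string, non-matching regex) pair.
loop_result = []
for s in a_list:
    for p in patterns:
        if p.search(s) is None:
            loop_result.append(s)

# Comprehension with any(): keep a string only if no pattern matches it.
comp_result = [s for s in a_list if not any(p.search(s) for p in patterns)]

print(loop_result)
# ['HELLO', 'HELLO', 'FOO', 'FO1BAR', 'FO1BAR', 'ROOBAR', 'SHOEBAR', 'SHOEBAR']
print(comp_result)
# ['HELLO', 'FO1BAR', 'SHOEBAR']
```

Note that 'FOO' and 'ROOBAR' each show up once in the loop version because each of them fails exactly one of the two patterns.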
I'd work your code down to this:
import re

a_list = [
    'HELLO',
    'FOO',
    'FO1BAR',
    'ROOBAR',
    'SHOEBAR'
]
# True when neither alternative matches anywhere in x
regex_func = lambda x: not re.search(r'(FOO|RO)', x, re.IGNORECASE)
Then you have two options:
Filter
newlist = list(filter(regex_func, a_list))  # filter() returns a lazy iterator in Python 3
List comprehensions
newlist = [x for x in a_list if regex_func(x)]
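One practical difference between the two options, assuming Python 3: `filter` returns a one-shot iterator, while the comprehension builds a list right away. A minimal sketch (the helper name `has_no_match` is mine, not from the answer above):

```python
import re

a_list = ['HELLO', 'FOO', 'FO1BAR', 'ROOBAR', 'SHOEBAR']
pattern = re.compile(r'FOO|RO', re.IGNORECASE)

def has_no_match(s):
    """Return True when the combined pattern matches nowhere in s."""
    return pattern.search(s) is None

lazy = filter(has_no_match, a_list)   # nothing is evaluated yet
first_pass = list(lazy)               # consumes the iterator
second_pass = list(lazy)              # already exhausted, so this is empty

eager = [s for s in a_list if has_no_match(s)]

print(first_pass)   # ['HELLO', 'FO1BAR', 'SHOEBAR']
print(second_pass)  # []
print(eager)        # ['HELLO', 'FO1BAR', 'SHOEBAR']
```

If you need to iterate over the result more than once, the comprehension (or wrapping `filter` in `list`) is probably the safer choice.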
list1 = ['/mnt/1m/a_pre.geojson','/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
I have multiple lists, and I want to find all the elements of list1 that have no corresponding entry in list2 under a matching condition.
The condition is that both the 'm' directory (1m, 2m, ...) and the geojson file name, excluding the 'pre' or 'post' substring, must match.
For example, in list1 '/mnt/1m/a_pre.geojson' has been processed but '/mnt/2m/b_pre.geojson' has not, so the output should be the list ['/mnt/2m/b_pre.geojson'].
I am using two for loops and then splitting the strings, which I am sure is not the only way; there might be an easier way to do this.
for i in list1:
    for j in list2:
        pre_tile = i.split("/")[-1].split('_pre', 1)[0]
        post_tile = j.split("/")[-1].split('_post', 1)[0]
        if pre_tile == post_tile:
            ...
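I assume the first part of every file path has the same fixed layout. If so, you can compare just the first seven characters (for example '/mnt/1m'); note that this ignores the file name: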
list1 = ['/mnt/1m/a_pre.geojson','/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
res = [x for x in list1 if x[:7] not in [y[:7] for y in list2]]
res:
['/mnt/2m/b_pre.geojson']
If I understand you correctly, using a regular expression to do this kind of string manipulation can be fast and easy.
Additionally, to do multiple member-tests in list2, it's more efficient to convert the list to a set.
import re
list1 = ['/mnt/1m/a_pre.geojson', '/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']
pattern = re.compile(r'(.*?/[0-9]m/.*?)_pre\.geojson')  # escape the dot so it only matches a literal '.'
set2 = set(list2)
result = [
    m.string
    for m in map(pattern.fullmatch, list1)
    if m and f"{m[1]}_post.geojson" not in set2
]
print(result)
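If you would rather avoid the regex, a rough alternative is to reduce every path to a comparable key and do the set lookup on the keys. This is only a sketch: `tile_key` is a hypothetical helper, and it assumes the 'pre'/'post' marker is whatever follows the last underscore in the file name.

```python
from pathlib import PurePosixPath

list1 = ['/mnt/1m/a_pre.geojson', '/mnt/2m/b_pre.geojson']
list2 = ['/mnt/1m/a_post.geojson']

def tile_key(path):
    """Hypothetical helper: map '/mnt/1m/a_pre.geojson' to ('1m', 'a')."""
    p = PurePosixPath(path)
    # Assumption: the suffix to ignore is everything after the last underscore.
    name = p.stem.rsplit('_', 1)[0]
    return (p.parent.name, name)

# Build the lookup set once so each membership test is O(1) on average.
processed = {tile_key(path) for path in list2}
unprocessed = [path for path in list1 if tile_key(path) not in processed]

print(unprocessed)  # ['/mnt/2m/b_pre.geojson']
```

Unlike the fixed-width slice, this compares both the 'm' directory and the file name, which seems to be what the question asks for.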
The list ['a','a #2','a(Old)'] should become {'a'} because '#' and '(Old)' are to be excised and a list of duplicates isn't needed. I struggled to develop a list comprehension with a generator and settled on this since I knew it'd work and valued time more than looking good:
l = []
groups = ['a','a #2','a(Old)']
for i in groups:
    if '#' in i: l.append(i[:i.index('#')].strip())
    elif '(Old)' in i: l.append(i[:i.index('(Old)')].strip())
    else: l.append(i)
groups = set(l)
What's the slick way to get this result?
Here is a general solution if you want to strip every substring listed in wastes (and anything after it) from the elements of lst:
lst = ['a','a #2','a(Old)']
wastes = ['#', '(Old)']
cleaned_set = {
    # every candidate is a prefix of element, so min() picks the shortest one
    min([element.split(waste)[0].strip() for waste in wastes])
    for element in lst
}
You could write this whole expression as a single set comprehension:
>>> groups = ['a','a #2','a(Old)']
>>> {i.split('#')[0].split('(Old)')[0].strip() for i in groups}
{'a'}
This will get everything preceding a # and everything preceding '(Old)', then trim off whitespace. The remainder is placed into a set, which only keeps unique values.
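If the list of markers grows, chaining `.split()` calls gets unwieldy. One possible variation, sketched here with a `markers` list that I am assuming for illustration, is to build a single regex from the markers and split once:

```python
import re

groups = ['a', 'a #2', 'a(Old)']
markers = ['#', '(Old)']  # assumption: everything from a marker onward is discarded

# re.escape keeps characters such as '(' and ')' from being read as regex syntax.
pattern = re.compile('|'.join(map(re.escape, markers)))

# Split at the first marker only, keep the part before it, trim whitespace.
cleaned = {pattern.split(g, maxsplit=1)[0].strip() for g in groups}

print(cleaned)  # {'a'}
```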
You could define a helper function to apply all of the splits and then use a set comprehension.
For example:
lst = ['a','a #2','a(Old)', 'b', 'b #', 'b(New)']
splits = {'#', '(Old)', '(New)'}
def split_all(a):
    for s in splits:
        a = a.split(s)[0]
    return a.strip()
groups = {split_all(a) for a in lst}
# {'a', 'b'}
I have a large list like this:
mylist = [['pears','apples','40'],['grapes','trees','90','bears']]
I'm trying to remove all numbers within the lists of this list. So I made a list of numbers as strings from 1 to 100:
def integers(a, b):
    return list(range(a, b+1))

numb = integers(1,100)
numbs = []
for i in range(len(numb)):
    numbs.append(str(numb[i])) # strings
# numbs is now ['1','2',....'100']
How can I iterate through lists in mylist and remove the numbers in numbs? Can I use list comprehension in this case?
If the number is always the last item in each sublist:
mylist = [ x[:-1] for x in mylist ]
mylist = [[item for item in sublist if item not in numbs] for sublist in mylist] should do the trick.
However, this isn't quite what you've asked. Nothing was actually removed from mylist, we've just built an entirely new list and reassigned it to mylist. Same logical result, though.
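If you really do need the existing list objects to be modified (for example because something else holds a reference to them), slice assignment is one way to do it. A minimal sketch, where `alias` and `first` are names I made up to show that the objects stay the same:

```python
mylist = [['pears', 'apples', '40'], ['grapes', 'trees', '90', 'bears']]
numbs = {str(n) for n in range(1, 101)}  # a set makes the membership test cheap

alias = mylist      # a second reference to the outer list
first = mylist[0]   # a reference to the first inner list

for sublist in mylist:
    # Slice assignment replaces the contents without creating a new list object.
    sublist[:] = [item for item in sublist if item not in numbs]

print(alias)  # [['pears', 'apples'], ['grapes', 'trees', 'bears']]
print(first)  # ['pears', 'apples']
```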
If the numbers are always at the end and appear only once, you can remove the last item like this:
my_new_list = [x[:-1] for x in mylist]
If there are more of them (or if they are not in a fixed position), you have to loop through each element; in that case you can use:
my_new_list = [[elem for elem in x if elem not in integer_list] for x in mylist]
I would also recommend generating the list of integers as follows:
integer_list = list(map(str, range(1, 101)))  # '1' through '100'; range() excludes its end value
I hope it helps :)
Instead of enumerating all the integers you want to filter out, you can use str.isdigit to test whether each string consists only of digits:
mylist = [['pears','apples','40'],['grapes','trees','90','bears']]
mylist2 = [[x for x in aList if not x.isdigit()] for aList in mylist]
print(mylist2)
[['pears', 'apples'], ['grapes', 'trees', 'bears']]
If you have the following list:
mylist = [['pears','apples','40'],['grapes','trees','90','bears']]
numbs = [str(i) for i in range(1, 101)]  # '1' through '100'
Use a list comprehension to remove the elements that appear in numbs:
[[l for l in ls if l not in numbs] for ls in mylist]
This is a more general way to remove digit-only elements from the sublists:
[[l for l in ls if not l.isdigit()] for ls in mylist]
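It may be worth knowing what `str.isdigit` does and does not treat as a number before relying on it. The sample strings below are my own, chosen to show the edge cases:

```python
samples = ['40', '007', '-5', '4.0', '', 'bears', '90s']

# isdigit() is True only for non-empty strings made up entirely of digit characters,
# so signs, decimal points and mixed strings are all rejected.
digit_only = [s for s in samples if s.isdigit()]
kept = [s for s in samples if not s.isdigit()]

print(digit_only)  # ['40', '007']
print(kept)        # ['-5', '4.0', '', 'bears', '90s']
```

So if the data can contain negative numbers or decimals, the `isdigit` filter would leave them in place and a different test would be needed.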
I have
char=str('DOTR')
and
a=range(0,18)
How could I combine them to create a list with:
mylist=['DOTR00','DOTR01',...,'DOTR17']
If I combine them in a for loop then I lose the leading zero.
Use zfill:
>>> string = "DOTR"
>>> for i in range(0, 18):
... print("DOTR{}".format(str(i).zfill(2)))
...
DOTR00
DOTR01
DOTR02
DOTR03
DOTR04
DOTR05
DOTR06
DOTR07
DOTR08
DOTR09
DOTR10
DOTR11
DOTR12
DOTR13
DOTR14
DOTR15
DOTR16
DOTR17
>>>
And if you want a list:
>>> my_list = ["DOTR{}".format(str(i).zfill(2)) for i in range(18)]
>>> my_list
['DOTR00', 'DOTR01', 'DOTR02', 'DOTR03', 'DOTR04', 'DOTR05', 'DOTR06', 'DOTR07', 'DOTR08', 'DOTR09', 'DOTR10', 'DOTR11', 'DOTR12', 'DOTR13', 'DOTR14', 'DOTR15', 'DOTR16', 'DOTR17']
>>>
You can do it using a list comprehension like so:
>>> mylist = [char+'{0:02}'.format(i) for i in a]
>>> mylist
['DOTR00', 'DOTR01', 'DOTR02', 'DOTR03', 'DOTR04', 'DOTR05', 'DOTR06', 'DOTR07', 'DOTR08', 'DOTR09', 'DOTR10', 'DOTR11', 'DOTR12', 'DOTR13', 'DOTR14', 'DOTR15', 'DOTR16', 'DOTR17']
Simply use list comprehension and format:
mylist = ['DOTR%02d'%i for i in range(18)]
Or, given that char and a are variables:
mylist = ['%s%02d'%(char,i) for i in a]
As @juanpa.arrivillaga points out, you can also write it with str.format:
mylist = ['{}{:02d}'.format(char,i) for i in a]
A list comprehension is an expression of the form:
[<expr> for <var> in <iterable>]
Python iterates over the <iterable>, binds each item in turn to <var> (here i), evaluates <expr>, and appends the result to the list until the <iterable> is exhausted.
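Spelled out as a plain loop, the description above looks roughly like the sketch below. I am using an f-string for the formatting, which assumes Python 3.6 or later; the names `labels_loop` and `labels_comp` are just for illustration.

```python
char = 'DOTR'
a = range(0, 18)

# Explicit loop: bind each item to i, evaluate the expression, append the result.
labels_loop = []
for i in a:
    labels_loop.append(f'{char}{i:02d}')

# The equivalent list comprehension.
labels_comp = [f'{char}{i:02d}' for i in a]

print(labels_loop == labels_comp)        # True
print(labels_comp[:3], labels_comp[-1])  # ['DOTR00', 'DOTR01', 'DOTR02'] DOTR17
```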
You can also do it with a plain loop like this:
char = str('DOTR')
a = range(0, 18)
b = []
for i in a:
    b.append(char + str(i).zfill(2))
print(b)
From these two lists:
list_A = ["eyes", "clothes", "body", "etc"]
list_B = ["xxxx_eyes", "xxx_zzz", "xxxxx_bbbb_zzzz_clothes" ]
I want to populate a third list with those objects from the second list whose names contain one of the names from the first list.
In the previous example, the third list has to be:
["xxxx_eyes", "xxxxx_bbbb_zzzz_clothes"]
If you want to use a list comprehension, this will work:
list_C = [word for word in list_B if any(test in word for test in list_A)]
If you want to use regexes for this:
import re
search = re.compile("|".join(map(re.escape, list_A))).search
result = list(filter(search, list_B))  # filter() returns a lazy iterator in Python 3
Although Blender's answer might be enough in most cases.
In [1]: list_A = ["eyes", "clothes", "body", "etc"]
In [2]: list_B = ["xxxx_eyes", "xxx_zzz", "xxxxx_bbbb_zzzz_clothes" ]
In [7]: [x for x in list_B if any(y in list_A for y in x.split('_'))]
Out[7]: ['xxxx_eyes', 'xxxxx_bbbb_zzzz_clothes']
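The substring test (`test in word`) and the token test (`split('_')`) give the same answer on the sample data, but they are not equivalent in general. Here is a small sketch of where they diverge; the extra entry "yyy_eyesight" is something I added purely to illustrate the difference.

```python
list_A = ["eyes", "clothes", "body", "etc"]
# "yyy_eyesight" is a hypothetical extra entry, not part of the original question.
list_B = ["xxxx_eyes", "xxx_zzz", "xxxxx_bbbb_zzzz_clothes", "yyy_eyesight"]

wanted = set(list_A)

# Substring match: "eyes" is found inside "eyesight".
substring_hits = [w for w in list_B if any(t in w for t in list_A)]

# Token match: only whole underscore-separated parts count.
token_hits = [w for w in list_B if wanted.intersection(w.split('_'))]

print(substring_hits)  # ['xxxx_eyes', 'xxxxx_bbbb_zzzz_clothes', 'yyy_eyesight']
print(token_hits)      # ['xxxx_eyes', 'xxxxx_bbbb_zzzz_clothes']
```

Which one you want depends on whether partial-word matches should count.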
Slowest but simplest would be:
list_A = ["eyes", "clothes", "body", "etc"]
list_B = ["xxxx_eyes", "xxx_zzz", "xxxxx_bbbb_zzzz_clothes" ]
list_C = []
for a in list_A:
    for b in list_B:
        if a in b:
            # note: b is appended once for every entry of list_A it contains
            list_C.append(b)