I know I can use string.find() to find a substring in a string.
But what is the easiest way to find out if one of the array items has a substring match in a string without using a loop?
Pseudocode:
string = 'I would like an apple.'
search = ['apple','orange', 'banana']
string.find(search) # == True
You could use a generator expression (which is still a loop under the hood):
any(x in string for x in search)
The generator expression is the part inside the parentheses. It creates an iterable that yields the value of x in string for each x in the list search. x in string in turn returns whether string contains the substring x. Finally, the Python built-in any() iterates over the iterable it gets passed and returns True if any of its items is truthy. It stops at the first truthy item, so the remaining items are never tested.
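If you also need to know which item matched, the same kind of generator expression can be passed to next() instead of any(). This is only a sketch: first_match is a hypothetical helper name, not something from the question.

```python
string = 'I would like an apple.'
search = ['apple', 'orange', 'banana']

def first_match(haystack, needles):
    """Return the first needle that occurs in haystack, or None (hypothetical helper)."""
    # next() pulls items from the generator and stops at the first hit;
    # the second argument is the default returned when nothing matches.
    return next((n for n in needles if n in haystack), None)

print(first_match(string, search))    # apple
print(first_match(string, ['kiwi']))  # None
```

Like any(), this stops as soon as one item matches.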
Alternatively, you could use a regular expression to avoid the loop:
import re
re.search("|".join(search), string)
I would go for the first solution, since regular expressions have pitfalls (escaping etc.).
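To illustrate the escaping pitfall, here is a minimal sketch of the regex variant that passes every search item through re.escape() first, so that characters such as '.' or '+' are matched literally. The names pattern and found are only illustrative.

```python
import re

string = 'I would like an apple.'
search = ['apple', 'orange', 'banana']

# Escape each item so regex metacharacters in it are treated as plain text.
pattern = re.compile("|".join(re.escape(item) for item in search))

# search() returns a match object or None; compare against None for a bool.
found = pattern.search(string) is not None
print(found)  # True
```

One more difference to keep in mind: with an empty search list the joined pattern is the empty string, which matches everything, whereas any() over an empty list returns False.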
Strings in Python are sequences, and you can do a quick membership test by just asking if one string exists inside of another:
>>> mystr = "I'd like an apple"
>>> 'apple' in mystr
True
Sven got it right in his first answer above. To check whether any of several strings exists in some other string, you'd do:
>>> ls = ['apple', 'orange']
>>> any(x in mystr for x in ls)
True
Worth noting for future reference is that the built-in 'all()' function returns True only if all items in 'ls' are substrings of 'mystr':
>>> ls = ['apple', 'orange']
>>> all(x in mystr for x in ls)
False
>>> ls = ['apple', 'like']
>>> all(x in mystr for x in ls)
True
A simpler way is:
import re
regx = re.compile('[ ,;:!?.:]')
string = 'I would like an apple.'
search = ['apple','orange', 'banana']
print(any(x in regx.split(string) for x in search))
EDIT
Correction, after having read Sven's answer: evidently the string does not need to be split at all. any(x in string for x in search) works perfectly well.
If you want no loop:
import re
regx = re.compile('[ ,;:!?.:]')
string = 'I would like an apple.'
search = ['apple','orange', 'banana']
print(regx.split(string))
print(set(regx.split(string)) & set(search))
Result of the intersection:
{'apple'}
Related
I would like to know the quickest or the shortest way to check if all the strings in a list occur in another specific string.
Ex:
l = ['I','you']
s = ['I do like you']
In this case, I would like to see if both I and you appear in I do like you. Is there a one-liner, instead of a for loop that checks them one by one manually?
Use all(), which returns True if all elements of the iterable are truthy and False otherwise:
all(x in s[0] for x in l)
In code:
l = ['I','you']
s = ['I do like you']
print(all(x in s[0] for x in l))
# True
You can use the all() function, which returns True if every element of an iterable is truthy, or if the iterable is empty.
l = ['I', 'you']
s = 'I do like you'
print(all(x in s for x in l))
You might also be interested in the any() function, which returns True if at least one element is truthy.
I guess you want to match whole words and not just substrings. For this use:
all(_ in s[0].split() for _ in l)
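The difference between the substring test and the word test only shows up when a search item is part of a longer word. A small sketch with a made-up sentence (the string s here is an assumption, chosen to show the difference):

```python
l = ['I', 'you']
s = 'Is it your turn?'

# Substring test: 'I' is found inside 'Is' and 'you' inside 'your'.
substring_result = all(x in s for x in l)

# Word test: s.split() gives ['Is', 'it', 'your', 'turn?'],
# and neither 'I' nor 'you' is one of those words.
word_result = all(x in s.split() for x in l)

print(substring_result)  # True
print(word_result)       # False
```

Note that split() only splits on whitespace, so punctuation stays attached to the words ('turn?' above).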
How can I match the case below in Python? I want each and every word in the sentence to be matched against the list.
>>> l1=['there is a list of contents available in the fields']
>>> 'there' in l1
False
>>> 'there is a list of contents available in the fields' in l1
True
Simple way:
>>> l1=['there is a list of contents available in the fields']
>>> 'there' in l1[0]
True
A better way is to iterate over all elements of the list:
l1=['there is a list of contents available in the fields']
print(bool([i for i in l1 if 'there' in i]))
If you just want to know whether any of the strings in the list contains a word, no matter which string it is, you can do this:
if any('there' in element for element in li):
    pass
Now if you want to keep only the strings that contain the word, you can simply do:
li = filter(lambda x: 'there' in x, li)
Or in Python 3:
li = list(filter(lambda x: 'there' in x, li))
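For comparison, the same filtering can be written as a list comprehension, which behaves the same way in Python 2 and 3. The contents of li below are made up for the example.

```python
li = ['there is a list', 'of contents', 'available in there']

# Keep only the strings that contain the substring 'there'.
matches = [element for element in li if 'there' in element]

print(matches)        # ['there is a list', 'available in there']
print(bool(matches))  # True
```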
Is there a way to stop list() from breaking a string into its characters? For example:
>>> a = "foo"
>>> b = list()
>>> b.append(list(a))
>>> b
[['f', 'o', 'o']]
Is there a way to have a list inside of a list with string that is not "broken", for example [["foo"],["bar"]]?
Very easy:
>>> a = "foo"
>>> b = list()
>>> b.append([a])
>>> b
[['foo']]
Do this:
>>> a = "foo"
>>> b = list()
>>> b.append([a])
>>> b
[['foo']]
The reason this happens is that the list function works by taking each element of the sequence you pass it and putting them in a list. A string in Python is a sequence; the elements of the sequence are the individual characters.
Having this abstract concept of a "sequence" means that a lot of Python functions can work on multiple data types, as long as they accept a sequence. Once you get used to this idea, hopefully you'll start finding this concept more useful than surprising.
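A short sketch of the difference between calling list() on a sequence and wrapping a value in a list literal (the variable names chars, wrapped and from_tuple are only illustrative):

```python
a = "foo"

chars = list(a)                    # iterates over the string: ['f', 'o', 'o']
wrapped = [a]                      # list literal with one element: ['foo']
from_tuple = list(("foo", "bar"))  # iterates over the tuple: ['foo', 'bar']

# Building the nested list from the question.
b = []
b.append([a])
b.append(["bar"])
print(b)  # [['foo'], ['bar']]
```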
It sounds like you want to break on word boundaries instead of on each letter.
Try something like
a = "foo bar"
b = list()
b.append(a.split(' ')) # [['foo', 'bar']]
Example with RegEx (to support multiple consecutive spaces) :
import re
a = "foo   bar"
b = list()
b.append(re.split(r'\s+', a)) # [['foo', 'bar']]
If I need to say
if <this list has a string in it that matches this regex>:
    do_stuff()
I found this powerful construct to extract matching strings from a list:
[m.group(1) for l in my_list for m in [my_regex.search(l)] if m]
...but this is hard to read and overkill. I don't want the list, I just want to know if such a list would have anything in it.
Is there a simpler-reading way to get that answer?
You can simply use any. Demo:
>>> import re
>>> lst = ['hello', '123', 'SO']
>>> any(re.search(r'\d', s) for s in lst)
True
>>> any(re.search(r'\d{4}', s) for s in lst)
False
Use re.match if you want to enforce matching from the start of the string.
Explanation:
any will check if there is any truthy value in an iterable. In the first example, we pass the contents of the following list (in the form of a generator):
>>> [re.search(r'\d', s) for s in lst]
[None, <_sre.SRE_Match object at 0x7f15ef317d30>, None]
which has one match-object which is truthy, while None will always evaluate to False in a boolean context. This is why any will return False for the second example:
>>> [re.search(r'\d{4}', s) for s in lst]
[None, None, None]
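As a sketch of the same idea with a precompiled pattern, and of the difference between search and match mentioned above (the extra item 'abc4' and the name digit are added here purely for illustration):

```python
import re

lst = ['hello', '123', 'SO', 'abc4']
digit = re.compile(r'\d')  # compiled once, reused for every string

# any() stops at the first string for which search() returns a match object.
print(any(digit.search(s) for s in lst))  # True

# search() scans the whole string; match() only tries at position 0.
print(bool(digit.search('abc4')))  # True
print(bool(digit.match('abc4')))   # False
```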
I have the following:
import re
temp = "aaaab123xyz#+"
lists = ["abc", "123.35", "xyz", "AND+"]
for list in lists:
    if re.match(list, temp, re.I):
        print("The %s is within %s." % (list, temp))
re.match only matches at the beginning of the string. How do I match a substring in the middle too?
You can use re.search instead of re.match.
It also seems like you don't really need regular expressions here. Your regular expression 123.35 probably doesn't do what you expect, because the dot matches any character (except a newline, by default).
If this is the case then you can do simple string containment using x in s.
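The dot pitfall can be seen with a small sketch. The test string "123x35" is made up for the example; it is not from the question.

```python
import re

text = "123x35"
needle = "123.35"

# As a regex, '.' matches the 'x', so this finds a match.
as_regex = re.search(needle, text) is not None

# Escaped, the dot must be a literal '.', so there is no match.
escaped = re.search(re.escape(needle), text) is not None

# Plain containment never treats the dot specially.
contained = needle in text

print(as_regex)   # True
print(escaped)    # False
print(contained)  # False
```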
Use re.search, or just use the in operator: if l in temp:
Note: the built-in type list should not be shadowed, so for l in lists: is better.
You can do this with a slightly more complex check using map and any.
>>> temp = "aaaab123xyz#+"
>>> lists = ["abc", "123.35", "xyz", "AND+"]
>>> any(map(lambda match: match in temp, lists))
True
>>> temp = 'fhgwghads'
>>> any(map(lambda match: match in temp, lists))
False
I'm not sure if this is faster than a compiled regexp.
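One way to find out is to time both variants with timeit. This is only a rough sketch: the helper names with_any and with_regex are hypothetical, the regex treats the items as literal text via re.escape, and the numbers depend on the machine and the data, so no timings are shown here.

```python
import re
import timeit

temp = "aaaab123xyz#+"
lists = ["abc", "123.35", "xyz", "AND+"]

# One compiled alternation of the escaped (literal) search items.
pattern = re.compile("|".join(re.escape(item) for item in lists))

def with_any():
    return any(item in temp for item in lists)

def with_regex():
    return pattern.search(temp) is not None

# Both approaches should agree before their speed is compared.
assert with_any() == with_regex()

t_any = timeit.timeit(with_any, number=100_000)
t_regex = timeit.timeit(with_regex, number=100_000)
print(f"any(): {t_any:.3f}s  regex: {t_regex:.3f}s")
```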