Check if a string contains all words in Python

I want to check if all words are found in another string without any loops or iterations:
a = ['god', 'this', 'a']
sentence = "this is a god damn sentence in python"
all(a in sentence)
should return TRUE.

You could use a set depending on your exact needs as follows:
a = ['god', 'this', 'a']
sentence = "this is a god damn sentence in python"
print(set(a) <= set(sentence.split()))
This prints True; for sets, <= tests whether the left operand is a subset of the right one (the same as set.issubset).

It should be:
all(x in sentence for x in a)
Or:
>>> chk = list(filter(lambda x: x not in sentence, a))  # Python 3; in Python 2 filter already returns a list
>>> chk  # empty if all words from a are in sentence
[]
>>> if not chk:
...     print('All words are in sentence')
...
All words are in sentence
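One thing worth noting about the answers above: x in sentence is a substring test, while the set comparison matches whole words only. A minimal sketch of the difference, where the function names contains_all_words and contains_all_substrings are made up for illustration:

```python
def contains_all_words(words, sentence):
    """True if every item of `words` is a whole whitespace-separated word of `sentence`."""
    return set(words) <= set(sentence.split())


def contains_all_substrings(words, sentence):
    """True if every item of `words` occurs anywhere in `sentence`, even inside another word."""
    return all(w in sentence for w in words)


sentence = "this is a god damn sentence in python"
print(contains_all_words(['god', 'this', 'a'], sentence))       # True
print(contains_all_substrings(['god', 'this', 'a'], sentence))  # True
print(contains_all_words(['pyth'], sentence))                   # False
print(contains_all_substrings(['pyth'], sentence))              # True
```

Which one is right depends on whether 'pyth' should count as found in 'python'.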

Related

How to compare reverse strings in list of strings with the original list of strings in python?

Input a given string and check if any word in that string matches with its reverse in the same string then print that word else print $
I split the string into a list of words and reversed each word, but I was not able to compare the two lists.
str = input()
x = str.split()
for i in x:  # printing i shows the words in the list
    str1 = i[::-1]  # printing str1 shows each word reversed
    # now how to check if any reversed word matches any word of the old list
    if i == str:
        print(i)
        break
else:
    print('$')
Input: suman is a si boy.
Output: is ( since reverse of 'is' is present in the same string)
You almost have it; you just need another loop to compare each word against each reversed word. Try the following:
str = input()
x = str.split()
for i in x:
    str1 = i[::-1]
    for j in x:  # <-- this is the new nested loop you are missing
        if j == str1:  # compare each inverted word against each regular word
            if len(str1) > 1:  # optional condition to skip single-letter words
                print(i)
Update
To only print the first occurrence of a match, you could, in the second loop, only check the elements that come after. We can do this by keeping track of the index:
str = input()
x = str.split()
for index, i in enumerate(x):
    str1 = i[::-1]
    for j in x[index+1:]:  # <-- only consider words that are ahead
        if j == str1:
            if len(str1) > 1:
                print(i)
Note that I used index+1 in order to not consider single word palindromes a match.
a = 'suman is a si boy'
# Construct the list of words
words = a.split(' ')
# Construct the list of reversed words
reversed_words = [word[::-1] for word in words]
# Get an intersection of these lists converted to sets
print(set(words) & set(reversed_words))
will print:
{'si', 'is', 'a'}
Another way to do this is just in a list comprehension:
string = 'suman is a si boy'
output = [x for x in string.split() if x[::-1] in string.split()]
print(output)
The split on string creates a list split on spaces. Then the word is included only if the reverse is in the string.
Output is:
['is', 'a', 'si']
One note: you have a variable named str. It is best to avoid that, because str is the name of Python's built-in string type and shadowing it can cause confusing errors later in your code.
If you only want words that are more than one letter long, you can do:
string = 'suman is a si boy'
output = [x for x in string.split() if x[::-1] in string.split() and len(x) > 1]
print(output)
this gives:
['is', 'si']
Finally, to get just 'is':
string = 'suman is a si boy'
seen = []
output = [x for x in string.split() if x[::-1] not in seen and not seen.append(x) and x[::-1] in string.split() and len(x) > 1]
print(output)
output is:
['is']
However, I don't think this is a good way to do it: the comprehension appends to seen as a side effect while also reading from that same list, which is hard to follow.
This version does not report 'a', and it reports 'is' only once rather than both 'is' and 'si'.
str = input()  # get the input string
x = str.split()  # list of words
y = []  # words whose reverse appears later in the list
while len(x) > 0:
    a = x.pop(0)  # remove the first word from the list and assign it to a
    if a[::-1] in x:  # is the reversed word among the remaining words?
        # a was already popped, so a single 'a' cannot match itself,
        # and the pair 'is' / 'si' is evaluated only once
        y.append(a)
print(y)  # ['is']
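The question also asks to print $ when nothing matches, which none of the snippets above handle. A minimal sketch that puts the pieces together, assuming single-letter words should be skipped as in the answers above (the function name first_reversed_match and the fallback parameter are hypothetical):

```python
def first_reversed_match(text, fallback='$'):
    """Return the first word (longer than one letter) whose reverse appears later in `text`."""
    words = text.split()
    for index, word in enumerate(words):
        # Only look at the words after the current one, so each pair is reported once
        if len(word) > 1 and word[::-1] in words[index + 1:]:
            return word
    return fallback


print(first_reversed_match('suman is a si boy'))  # is
print(first_reversed_match('no match here'))      # $
```

Returning the value instead of printing inside the loop makes the "no match" case a plain return at the end.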

How to print out only elements of a list containing certain letters?

I am working on a project and I want to write code that finds the words in a sentence that contain certain letters and then returns them (prints them out).
sentence = "I am asking a question on Stack Overflow"
lst = []
# this gives me a list of all words in a sentence
change = sentence.split()
# NOTE: I know this isn't correct syntax, but that's basically what I want to do.
lst.append(only words containing "a")
print(lst)
Now the part I am struggling with is: how do I append only the words containing the letter "a", for example?
You can do it like this:
words = sentence.split()
lst = [word for word in words if 'a' in word]
print(lst)
# ['am', 'asking', 'a', 'Stack']
Try this; it checks every word of the sentence for the letter 'a':
sentence = "I am asking a question on Stack Overflow"
lst = []
change = sentence.split()
# check each word of the sentence for the letter 'a'
for word in change:
    if 'a' in word:
        print(word + " has an a! ")
        lst.append(word)
print(lst)
After one "... has an a!" line per matching word, the final print outputs:
['am', 'asking', 'a', 'Stack']
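If you need more than one letter, or want 'A' and 'a' treated alike, the same idea generalizes. This is only a sketch; the function name words_containing and its ignore_case parameter are invented here, not taken from the answers above:

```python
def words_containing(sentence, letters, ignore_case=True):
    """Return the words of `sentence` that contain at least one of `letters`."""
    wanted = set(letters.lower() if ignore_case else letters)
    result = []
    for word in sentence.split():
        candidate = word.lower() if ignore_case else word
        # A non-empty intersection means the word shares a letter with `letters`
        if wanted & set(candidate):
            result.append(word)
    return result


sentence = "I am asking a question on Stack Overflow"
print(words_containing(sentence, 'a'))   # ['am', 'asking', 'a', 'Stack']
print(words_containing(sentence, 'qO'))  # ['question', 'on', 'Overflow']
```

The original words are returned unchanged; lowercasing is applied only for the comparison.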

Python: how to find out the occurrences of a sentence in a list

I'm writing a function to implement the solution to finding the number of times a word occurs in a list of elements, retrieved from a text file which is pretty straightforward to achieve.
However, I have been at it for two days trying to figure out how to count occurrences of a string that contains two or more words.
So for example say the string is:
"hello bye"
and the list is:
["car", "hello","bye" ,"hello"]
The function should return the value 1 because the elements "hello" and "bye" only occur once consecutively.
The closest I've gotten to the solution is using
words[0:2] = [' '.join(words[0:2])]
which would join two elements together given the index. This however is wrong as the input given will be the element itself rather than an index.
Can someone point me to the right direction?
Two possibilities.
## laboriously
lookFor = 'hello bye'
words = ["car", "hello", "bye", "hello", 'tax', 'hello', 'horn', 'hello', 'bye']
strungOutWords = ' '.join(words)
count = 0
p = 0
while True:
    q = strungOutWords[p:].find(lookFor)  # search from position p onwards
    if q == -1:
        break
    else:
        p = p + q + 1  # continue just past the start of this match
        count += 1
print(count)
## using a regex
import re
# re.escape keeps any regex metacharacters in lookFor literal
print(len(re.compile(re.escape(lookFor)).findall(strungOutWords)))
Match the string with the join of the consecutive elements in the main list. Below is the sample code:
my_list = ["car", "hello","bye" ,"hello"]
sentence = "hello bye"
word_count = len(sentence.split())
c = 0
for i in range(len(my_list) - word_count + 1):
    if sentence == ' '.join(my_list[i:i+word_count]):
        c += 1
The final value held by c will be:
>>> c
1
If you are looking for a one-liner, you may use zip and sum as:
>>> my_list = ["car", "hello","bye" ,"hello"]
>>> sentence = "hello bye"
>>> words = sentence.split()
>>> sum(1 for i in zip(*[my_list[j:] for j in range(len(words))]) if list(i) == words)
1
Let's split this problem in two parts. First, we establish a function that will return ngrams of a given list, that is sublists of n consecutive elements:
def ngrams(l, n):
    return list(zip(*[l[i:] for i in range(n)]))
We can now get 2, 3 or 4-grams easily:
>>> ngrams(["car", "hello","bye" ,"hello"], 2)
[('car', 'hello'), ('hello', 'bye'), ('bye', 'hello')]
>>> ngrams(["car", "hello","bye" ,"hello"], 3)
[('car', 'hello', 'bye'), ('hello', 'bye', 'hello')]
>>> ngrams(["car", "hello","bye" ,"hello"], 4)
[('car', 'hello', 'bye', 'hello')]
Each item is made into a tuple.
Now make the phrase 'hello bye' into a tuple:
>>> as_tuple = tuple('hello bye'.split())
>>> as_tuple
('hello', 'bye')
>>> len(as_tuple)
2
Since this has 2 words, we need to generate bigrams from the sentence, and count the number of matching bigrams. We can generalize all this to
def ngrams(l, n):
    return list(zip(*[l[i:] for i in range(n)]))
def count_occurrences(sentence, phrase):
    phrase_as_tuple = tuple(phrase.split())
    sentence_ngrams = ngrams(sentence, len(phrase_as_tuple))
    return sentence_ngrams.count(phrase_as_tuple)
print(count_occurrences(["car", "hello","bye" ,"hello"], 'hello bye'))
# prints 1
I would suggest reducing the problem to counting occurrences of one string within another string.
words = ["hello", "bye", "hello", "car", "hello ", "bye me", "hello", "carpet", "shoplifter"]
sentence = "hello bye"
my_text = " %s " % " ".join([item for sublist in [x.split() for x in words] for item in sublist])
def count(sentence):
    my_sentence = " %s " % " ".join(sentence.split())
    return my_text.count(my_sentence)
print(count("hello bye"))
# 2
print(count("pet shop"))
# 0
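One caveat with the string-based count: str.count only counts non-overlapping matches, and because the padded phrase shares its surrounding spaces with its neighbours, a back-to-back repeat such as "hello bye hello bye" would be counted once. A sliding window over the tokens avoids that. This is a sketch, and the name count_phrase is made up for illustration:

```python
def count_phrase(words, phrase):
    """Count how often the words of `phrase` occur consecutively in `words`."""
    # Flatten first so that an element such as "bye me" contributes two tokens
    tokens = [token for element in words for token in element.split()]
    target = phrase.split()
    if not target:
        return 0
    size = len(target)
    # Compare every window of `size` consecutive tokens with the target
    return sum(
        tokens[start:start + size] == target
        for start in range(len(tokens) - size + 1)
    )


print(count_phrase(["car", "hello", "bye", "hello"], "hello bye"))  # 1
print(count_phrase(["hello", "bye", "hello", "bye"], "hello bye"))  # 2
```

If the phrase is longer than the list, the range is empty and the result is 0.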

Python - Check if there's only one element of multiple lists in a string

The following code allows me to check whether only one element of the lists is in ttext.
from itertools import product, chain
from string import punctuation
list1 = ['abra', 'hello', 'cfre']
list2 = ['dacc', 'ex', 'you', 'fboaf']
list3 = ['ihhio', 'oih', 'oihoihoo']
l = [list1, list2, list3]
def test(l, tt):
    counts = {word.strip(punctuation):0 for word in tt.split()}
    for word in chain(*product(*l)):
        if word in counts:
            counts[word] += 1
        if sum(v > 1 for v in counts.values()) > 1:
            return False
    return True
Output:
In [16]: ttext = 'hello my name is brian'
In [17]: test(l,ttext)
Out[17]: True
In [18]: ttext = 'hello how are you?'
In [19]: test(l,ttext)
Out[19]: False
Now, how can I do the same if the list elements contain spaces, such as "I have", "you are" and "he is"?
You could add a list comprehension that goes through and splits all the words:
def test(l, tt):
    counts = {word.strip(punctuation):0 for word in tt.split()}
    splitl = [[word for item in sublist for word in item.split(' ')] for sublist in l]
    for word in chain(*product(*splitl)):
        if word in counts:
            counts[word] += 1
        if sum(v > 1 for v in counts.values()) > 1:
            return False
    return True
You can simplify a lot by concatenating the lists with '+' rather than keeping a list of lists. This code also works if the strings have spaces in them.
import string
list1 = ['abra', 'hello', 'cfre']
list2 = ['dacc', 'ex', 'you', 'fboaf']
list3 = ['ihhio', 'oih', 'oihoihoo']
l = list1 + list2 + list3
def test(l, tt):
    # set of all punctuation to exclude
    exclude = set(string.punctuation)
    count = 0
    for word in l:
        # remove punctuation from word
        word = ''.join(ch for ch in word if ch not in exclude)
        if word in tt:
            count += 1
    if count > 1:
        return False
    else:
        return True
You could split all the list entries by iterating over them. Something like:
words = []
for sublist in l:
    for item in sublist:
        # extend (rather than append) keeps words a flat list of single words
        words.extend(item.split())
You may consider using sets for this kind of processing.
Here is a quick implementation :
from itertools import chain
from string import punctuation
list1 = ['abra', 'hello', 'cfre']
list2 = ['dacc', 'ex', 'you', 'fboaf']
list3 = ['ihhio', 'oih', 'oihoihoo']
l = list(chain(list1, list2, list3))
words = set(w.strip(punctuation) for word in l for w in word.split()) # 1
def test(words, text):
    text_words = set(word.strip(punctuation) for word in text.split()) # 2
    return len(words & text_words) == 1 # 3
A few comments:
1. The double for-loop in the comprehension yields every individual word; the set makes sure each word is unique.
2. The same is done for the input sentence.
3. The set intersection gives all words in the sentence that are also in your search set; its length tells you whether there is exactly one.
Well, first, let's rewrite the function to be more natural:
from itertools import chain
def only_one_of(lists, sentence):
    found = None
    for item in chain(*lists):
        if item in sentence:
            if found: return False
            else: found = item
    return found is not None
This already works with your constraints, as it only checks whether some string item is a substring of sentence. It does not matter whether the item includes spaces or not. But it may lead to unexpected results. Imagine:
list1 = ['abra', 'hello', 'cfre']
list2 = ['dacc', 'ex', 'you', 'fboaf']
list3 = ['ihhio', 'oih', 'oihoihoo']
l = [list1, list2, list3]
only_one_of(l, 'Cadabra')
This returns True because abra is a substring of Cadabra. If this is what you want, then you're done. But if not, you need to redefine what item in sentence really means. So, let's redefine our function:
def only_one_of(lists, sentence, is_in=lambda i, c: i in c):
    found = None
    for item in chain(*lists):
        if is_in(item, sentence):
            if found: return False
            else: found = item
    return found is not None
Now the last parameter is expected to be a function that takes two strings and returns True if the first is found in the second, or False otherwise.
You usually want to check whether the item is inside the sentence as a word (but a word that can contain spaces in the middle), so let's use regular expressions to do that:
import re
def inside(string, sentence):
    return re.search(r'\b%s\b' % string, sentence)
This function returns a match object (which is truthy) when string occurs in sentence as a whole word, and None otherwise (the special sequence \b in a regular expression stands for a word boundary).
So, the following code should meet your constraints:
import re
from itertools import chain
def inside(string, sentence):
    return re.search(r'\b%s\b' % string, sentence)
def only_one_of(lists, sentence, is_in=lambda i, c: i in c):
    found = None
    for item in chain(*lists):
        if is_in(item, sentence):
            if found: return False
            else: found = item
    return found is not None
list1 = ['abra', 'hello', 'cfre']
list2 = ['dacc', 'ex', 'you', 'fboaf']
list3 = ['ihhio', 'oih', 'oihoihoo']
list4 = ['I have', 'you are', 'he is']
l = [list1, list2, list3, list4]
only_one_of(l, 'hello my name is brian', inside) # True
only_one_of(l, 'hello how are you?', inside) # False
only_one_of(l, 'Cadabra', inside) # False
only_one_of(l, 'I have a sister', inside) # True
only_one_of(l, 'he is my ex-boyfriend', inside) # False, ex and boyfriend are two words
only_one_of(l, 'he is my exboyfriend', inside) # True, exboyfriend is only one word
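A possible refinement, offered as a sketch rather than a drop-in replacement: the word-boundary check interpolates the item directly into the pattern, so an item containing regex metacharacters such as '.' would be read as a pattern. Escaping it with re.escape avoids that. The names contains_phrase and exactly_one_of are hypothetical:

```python
import re
from itertools import chain


def contains_phrase(phrase, sentence):
    """True if `phrase` occurs in `sentence` delimited by word boundaries."""
    # re.escape keeps characters such as '.' in the phrase literal
    return re.search(r'\b' + re.escape(phrase) + r'\b', sentence) is not None


def exactly_one_of(lists, sentence):
    """True if exactly one item from all the lists occurs in `sentence`."""
    hits = 0
    for item in chain.from_iterable(lists):
        if contains_phrase(item, sentence):
            hits += 1
            if hits > 1:
                return False  # stop early on the second hit
    return hits == 1


l = [['abra', 'hello', 'cfre'], ['dacc', 'ex', 'you', 'fboaf'],
     ['ihhio', 'oih', 'oihoihoo'], ['I have', 'you are', 'he is']]
print(exactly_one_of(l, 'hello my name is brian'))  # True
print(exactly_one_of(l, 'hello how are you?'))      # False
print(exactly_one_of(l, 'Cadabra'))                 # False
```

Note that \b only works as expected when the phrase starts and ends with a word character (letter, digit or underscore).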

Python: replace nth word in string

What is the easiest way in Python to replace the nth word in a string, assuming each word is separated by a space?
For example, if I want to replace the tenth word of a string and get the resulting string.
I guess you may do something like this:
nreplace=1
my_string="hello my friend"
words=my_string.split(" ")
words[nreplace]="your"
" ".join(words)
Here is another way of doing the replacement:
nreplace=1
words=my_string.split(" ")
" ".join([words[word_index] if word_index != nreplace else "your" for word_index in range(len(words))])
Let's say your string is:
my_string = "This is my test string."
You can split the string up using split():
my_list = my_string.split()
Which will set my_list to
['This', 'is', 'my', 'test', 'string.']
You can replace the 4th list item using
my_list[3] = "new"
And then put it back together with
my_new_string = " ".join(my_list)
Giving you
"This is my new string."
A solution involving list comprehension:
text = "To be or not to be, that is the question"
replace = 6
replacement = 'it'
print(' '.join([x if index != replace else replacement for index, x in enumerate(text.split())]))
The above produces:
To be or not to be, it is the question
You could use a generator expression and the string join() method:
my_string = "hello my friend"
nth = 0
new_word = 'goodbye'
print(' '.join(word if i != nth else new_word
               for i, word in enumerate(my_string.split(' '))))
Output:
goodbye my friend
Through re.sub.
>>> import re
>>> my_string = "hello my friend"
>>> new_word = 'goodbye'
>>> re.sub(r'^(\s*(?:\S+\s+){0})\S+', r'\1'+new_word, my_string)
'goodbye my friend'
>>> re.sub(r'^(\s*(?:\S+\s+){1})\S+', r'\1'+new_word, my_string)
'hello goodbye friend'
>>> re.sub(r'^(\s*(?:\S+\s+){2})\S+', r'\1'+new_word, my_string)
'hello my goodbye'
Just replace the number within the curly braces with the position of the word you want to replace minus 1, i.e. 0 to replace the first word, 1 for the second word, and so on.
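The re.sub approach can be wrapped in a small helper so the index does not have to be edited into the pattern by hand. This is a sketch under a few assumptions: the name replace_nth_word is made up, n is 0-based, and a function is used as the replacement because a string like r'\1' + new_word would be misread when new_word starts with a digit (r'\1' + '2nd' becomes a reference to group 12):

```python
import re


def replace_nth_word(text, n, new_word):
    """Replace the n-th (0-based) whitespace-separated word, keeping the original spacing."""
    if n < 0:
        raise ValueError("n must be non-negative")
    # Group 1 captures the first n words plus the whitespace after each of them
    pattern = r'^(\s*(?:\S+\s+){%d})\S+' % n
    # A function replacement inserts new_word literally, without escape processing
    return re.sub(pattern, lambda m: m.group(1) + new_word, text, count=1)


print(replace_nth_word("hello my friend", 1, "2nd"))     # hello 2nd friend
print(replace_nth_word("hello  my   friend", 2, "pal"))  # hello  my   pal
print(replace_nth_word("hello my friend", 5, "x"))       # hello my friend
```

When n is past the last word the pattern does not match and the string comes back unchanged, whereas the split-and-assign versions above raise an IndexError in that case.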
