I am building a rhyming chatbot in Python. Is it possible to identify the last vowel (and all the letters after that vowel) in an arbitrary word and then append those letters to another string, without having to go through all the possible letters one by one as in the following example?
lastLetters = ''  # String we want to append the letters to
if user_answer.endswith("a"):
    lastLetters += "a"
elif user_answer.endswith("b"):
    lastLetters += "b"
For example, if the word was "right", we'd want to get "ight".
You need to find the index of the last vowel. For that, you could do something like this (a bit fancy):
s = input("Enter the word: ") # You can do this to get user input
last_index = len(s) - next((i for i, e in enumerate(reversed(s), 1) if e in "aeiou"), -1)
result = s[last_index:]
print(result)
Output
ight
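As a minimal sketch of the same idea wrapped in a reusable helper: the name rhyme_ending is hypothetical, it swaps the generator expression for str.rfind, and it assumes that only a, e, i, o and u count as vowels.

```python
VOWELS = "aeiou"  # assumption: only these five letters count as vowels

def rhyme_ending(word):
    """Return the last vowel of `word` and everything after it ('' if no vowel)."""
    lowered = word.lower()
    # Index of the last vowel, or -1 when the word contains none.
    last_vowel = max(lowered.rfind(v) for v in VOWELS)
    return word[last_vowel:] if last_vowel != -1 else ""

last_letters = ""                      # string we want to append the letters to
last_letters += rhyme_ending("right")
print(last_letters)                    # ight
```

Returning an empty string for a word without vowels is a design choice of this sketch; the caller can then decide what to do with such words.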
An alternative using regex:
import re
s = "right"
last_index = -1
match = re.search("[aeiou][^aeiou]*$", s)
if match:
    last_index = match.start()
result = s[last_index:]
print(result)
The pattern [aeiou][^aeiou]*$ means: match a vowel followed by any number of characters that are not vowels, up to the end of the string ([^aeiou] means "not a vowel"; the ^ sign inside brackets means negation in regex). So it basically matches from the last vowel onwards. Notice that this assumes a string composed only of lowercase consonants and vowels.
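To see how the pattern behaves on a few inputs, here is a small sketch. The helper name ending_by_regex is hypothetical, and returning an empty string when no vowel is found is an assumption of this example rather than what the snippet above does.

```python
import re

LAST_VOWEL = re.compile(r"[aeiou][^aeiou]*$")

def ending_by_regex(word):
    """Return the text from the last vowel to the end, or '' if there is no vowel."""
    match = LAST_VOWEL.search(word)
    return match.group(0) if match else ""

for w in ("right", "banana", "rhythm"):
    print(w, "->", repr(ending_by_regex(w)))
# right -> 'ight'
# banana -> 'a'
# rhythm -> ''
```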
Suppose that I have a string that I would like to modify at random with a defined set of options from another string. First, I created my original string and the potential replacement characters:
string1 = "abcabcabc"
replacement_chars = "abc"
Then I found this function on a forum that will randomly replace n characters:
import random
def randomlyChangeNChar(word, value):
    length = len(word)
    word = list(word)
    # This will select the distinct indices for us to replace
    k = random.sample(range(0, length), value)
    for index in k:
        # This will replace the character at the specified index with a randomly chosen one
        word[index] = random.choice(replacement_chars)
    # Finally, return the string in the modified format.
    return "".join(word)
This code does what I want with one exception -- it does not account for characters in string1 that match the random replacement character. I understand that the problem is in the function that I am trying to adapt, I predict under the for loop, but I am unsure what to add to prevent the substituting character from equaling the old character from string1. All advice appreciated, if I'm overcomplicating things please educate me!
In the function you retrieved, replacing:
word[index] = random.choice(replacement_chars)
with
word[index] = random.choice(replacement_chars.replace(word[index], ''))
will do the job. It simply replaces word[index] (the char you want to replace) with an empty string in the replacement_chars string, effectively removing it from the replacement characters.
Another approach, which will predictably be less efficient on average, is to redraw until you get a character different from the original one, that is, replacing:
word[index] = random.choice(replacement_chars)
with
char = word[index]
while char == word[index]:
    char = random.choice(replacement_chars)
word[index] = char
or
while True:
    char = random.choice(replacement_chars)
    if char != word[index]:
        word[index] = char
        break
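WARNING: if replacement_chars contains only one character, both methods fail when the original character is the same as that character: the first raises an IndexError because random.choice receives an empty string, and the redraw loops never terminate.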
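Putting the pieces together, a sketch of the whole function might look like the following. The name randomly_change_n_chars, the explicit replacement_chars parameter and the ValueError for the single-character case are assumptions of this example, not part of the original function.

```python
import random

def randomly_change_n_chars(word, n, replacement_chars):
    """Replace n distinct positions of word, each with a different character."""
    chars = list(word)
    for index in random.sample(range(len(chars)), n):
        # Candidates exclude the character currently at this position.
        candidates = [c for c in replacement_chars if c != chars[index]]
        if not candidates:
            raise ValueError(
                f"no alternative to {chars[index]!r} in replacement_chars"
            )
        chars[index] = random.choice(candidates)
    return "".join(chars)

original = "abcabcabc"
changed = randomly_change_n_chars(original, 3, "abc")
print(changed)  # exactly 3 positions differ from the original
```

Raising an exception makes the degenerate case visible instead of looping forever or failing inside random.choice.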
I am trying to make a simple function that gets three inputs: a list of words, list of guessed letters and a pattern. The pattern is a word with some letters hidden with an underscore. (for example the word apple and the pattern '_pp_e')
For some context it's a part of the game hangman where you try to guess a word and this function gives a hint.
I want this function to return a filtered list of the input words: only words that do not contain any letter from the list of guessed letters and that have the same letters in the same positions as the given pattern.
I tried making this work with three loops.
First loop that filters all words by the same length as the pattern.
Second loop that checks for similarity between the pattern and the given word. If a remaining word contains one of the pattern's letters, but not in the same position, I filter it out.
Final loop checks the filtered word that it does not contain any letters from the given guessed list.
I tried making it work without much success, and I would love some help. Any tips for making the code shorter (without using third-party libraries) would also be very much appreciated.
Thanks in advance!
Example: pattern "d _ _ _ _ a _ _ _ _", guessed letter list ['b','c'], and a word list containing all the words in English.
output list: ['delegating', 'derogation', 'dishwasher']
this is the code for more context:
def filter_words_list(words, pattern, wrong_guess_lst):
    lst_return = []
    lst_return_2 = []
    lst_return_3 = []
    new_word = ''
    for i in range(len(words)):
        if len(words[i]) == len(pattern):
            lst_return.append(words[i])
    pattern = list(pattern)
    for i in range(len(lst_return)):
        count = 0
        word_to_check = list(lst_return[i])
        for j in range(len(pattern)):
            if pattern[j] == word_to_check[j] or (pattern[j] == '_' and
                                                  (not (word_to_check[j] in
                                                        pattern))):
                count += 1
        if count == len(pattern):
            lst_return_2.append(new_word.join(word_to_check))
    for i in range(len(lst_return_2)):
        word_to_check = lst_return_2[i]
        for j in range(len(wrong_guess_lst)):
            if word_to_check.find(wrong_guess_lst[j]) == -1:
                lst_return_3.append(word_to_check)
    return lst_return_3
The easiest, and likely quite efficient, way to do this would be to translate your pattern into a regular expression, if regular expressions are in your "toolbox". (The re module is in the standard library.)
In a regular expression, . matches any single character. So, we replace all _s with .s and add "^" and "$" to anchor the regular expression to the whole string.
import re
def filter_words(words, pattern, wrong_guesses):
    re_pattern = re.compile("^" + re.escape(pattern).replace("_", ".") + "$")
    # get words that
    # (a) are the correct length
    # (b) aren't in the wrong guesses
    # (c) match the pattern
    return [
        word
        for word in words
        if (
            len(word) == len(pattern) and
            word not in wrong_guesses and
            re_pattern.match(word)
        )
    ]
all_words = [
    "cat",
    "dog",
    "mouse",
    "horse",
    "cow",
]
print(filter_words(all_words, "c_t", []))
print(filter_words(all_words, "c__", []))
print(filter_words(all_words, "c__", ["cat"]))
prints out
['cat']
['cat', 'cow']
['cow']
If you don't care for using regexps, you can instead translate the pattern to a dict mapping each defined position to the character that should be found there:
def filter_words_without_regex(words, pattern, wrong_guesses):
    # get a map of the pattern's defined positions to their letters
    letter_map = {i: letter for i, letter in enumerate(pattern) if letter != "_"}
    # get words that
    # (a) are the correct length
    # (b) aren't in the wrong guesses
    # (c) have the correct letters in the correct positions
    return [
        word
        for word in words
        if (
            len(word) == len(pattern) and
            word not in wrong_guesses and
            all(word[i] == ch for i, ch in letter_map.items())
        )
    ]
The result is the same.
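Both functions above treat wrong_guesses as a list of whole words. If the list holds wrongly guessed letters, as the question describes, a variant could look like the sketch below. The name filter_words_by_letters is hypothetical, and the rule that a blank position may not hide an already revealed letter is an assumption taken from the asker's own loop.

```python
def filter_words_by_letters(words, pattern, wrong_letters):
    """Keep words that fit the pattern and avoid every wrongly guessed letter."""
    wrong = set(wrong_letters)
    revealed = set(pattern) - {"_"}
    result = []
    for word in words:
        if len(word) != len(pattern) or wrong & set(word):
            continue
        # A revealed position must match exactly; a blank position must hold
        # a letter that has not already been revealed elsewhere.
        if all(p == c if p != "_" else c not in revealed
               for p, c in zip(pattern, word)):
            result.append(word)
    return result

sample = ["delegating", "derogation", "dishwasher", "dedication", "background"]
print(filter_words_by_letters(sample, "d____a____", ["b", "c"]))
# ['delegating', 'derogation', 'dishwasher']
```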
Probably not the most efficient, but this should work:
def filter_words_list(words, pattern, wrong_guess_lst):
    fewer_words = [w for w in words if not any([wgl in w for wgl in wrong_guess_lst])]
    equal_len_words = [w for w in fewer_words if len(w) == len(pattern)]
    pattern_indices = [idl for idl, ltr in enumerate(pattern) if ltr != '_']
    word_indices = [[idl for idl, ltr in enumerate(w) if ((ltr in pattern) and (ltr != '_'))] for w in equal_len_words]
    # all() is needed here: a bare generator expression is always truthy
    out = [w for wid, w in zip(word_indices, equal_len_words) if ((wid == pattern_indices) and all(w[pid] == pattern[pid] for pid in pattern_indices))]
    return out
The idea is to first remove all words that have letters in your wrong_guess_lst.
Then, remove everything which does not have the same length (you could also merge this condition in the first one..).
Next, for both pattern and your remaining words, you create a pattern mask, which indicates the positions of non '_' letters.
To be a candidate, the masks have to be identical AND the letters in these positions have to be identical as well.
Note that I replaced a lot of the for loops in your code with list comprehensions. A list comprehension is a very useful construct that helps a lot, especially if you don't want to use other libraries.
Edit: I cannot really tell you where your code went wrong, as it was a little too long for me.
The regex rule is constructed explicitly; in particular, no separate check on the word's length is needed. To achieve this, the groupby function from the itertools module of the standard library is used:
'_ b _ _ _' -- regex --> r'^.{1}b.{3}$'
Here is how to filter the dictionary by a guess string:
import itertools as it
import re
# sample dictionary
dictionary = "a ability able about above accept according account across act action activity actually add address"
dictionary = dictionary.split()
guess = '_ b _ _ _'
guess = guess.replace(' ', '') # remove white spaces
# construction of the regex rule
regex = r'^'
for _, i in it.groupby(guess, key=lambda x: x == '_'):
    if '_' in (l:=list(i)):
        regex += ''.join(f'.{{{len(l)}}}') # escape the curly brackets
    else:
        regex += ''.join(l)
regex += '$'
# processing the regex rule
pattern = re.compile(regex)
# filter the dictionary by the rule
l = [word for word in dictionary if pattern.match(word)]
print(l)
Output
['about', 'above']
My task is to turn the input text into an acronym and reverse it. Each word should be more than 3 characters long and should not contain symbols such as ,!'?. For example, if I have the sentence "That was quite easy?" the function should return EQT.
I have done so far:
def acr(message):
    words = message.split()
    if check_length(words) is False:
        return "the input long!"
    else:
        first_letters = []
        for word in words:
            first_letters.append(word[0])
        result = "".join(first_letters)
        return reverse(result.upper())
def check(word):
    if len(word) > 3:
        return False
def check_length(words):
    if len(words) > 50:
        return False
def rev(message):
    reversed_message = message[::-1]
    return reversed_message
I have problems with the check function. How do I correctly check the length of the words and the symbols?
A bit hacky in the sense that a comma is technically a special character (but you want the 'e' from easy), but this works perfectly for your example. Set up the "if" statement in the "for word in words" section.
def acronymize(message):
    """Turn the input text into the acronym and reverse it, if the text is not too long."""
    words = message.split()
    if check_message_length(words) is False:
        return "Sorry, the input's just too long!"
    else:
        first_letters = []
        for word in words:
            if len(word) > 3 and word.isalnum()== True or (len(word) > 4 and ',' in word): #satisfies all conditions. Allows commas, but no other special characters.
                first_letters.append(word[0])
        result = "".join(first_letters)
        return reverse(result.upper())
Basically, the 'if' condition became: if the word has more than 3 characters AND is alphanumeric, it satisfies all conditions; OTHERWISE (OR), if the word has a comma attached (so it counts len(word)+1 characters) and still meets the length requirement, it is accepted as well. In both cases the word's first letter is added to the first_letters list.
Otherwise, ignore the word.
This way you don't even have to set up a check_word function.
This spits out the answer
'EQT'
A couple more examples from my code:
Input: Holy cow, does this really work??
Output: 'RTDH'
** Note that it did NOT include the word 'cow' because it did not have more than 3 letters.
Input: Holy cows, this DOES work!!
Output: 'DTCH'
** Note, now the term 'cows' gets counted because it has more than 3 letters.
You can similarly add any exceptions that you want (!, ? and .) using the 'or' format:
Ex: or (len(word) > 4 and '!' in word) or (len(word) > 4 and '?' in word)
The only assumption made for this is that the sentence is grammatically correct (as in, it won't have exclamation marks followed by commas).
It can be further cleaned up by making a list of the special characters that you would allow and passing that list into the or clause.
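A minimal sketch of that clean-up, under the assumption that the tolerated symbols only appear at the end of a word. The names ALLOWED_PUNCTUATION and reversed_acronym are hypothetical, and the length check on the whole message is left out.

```python
ALLOWED_PUNCTUATION = ",!?."  # hypothetical list of trailing symbols to tolerate

def reversed_acronym(message, min_length=4):
    """Reversed, upper-cased acronym of the words with at least min_length letters."""
    first_letters = []
    for word in message.split():
        core = word.rstrip(ALLOWED_PUNCTUATION)  # drop tolerated trailing symbols
        if len(core) >= min_length and core.isalnum():
            first_letters.append(core[0])
    return "".join(reversed(first_letters)).upper()

print(reversed_acronym("That was quite easy?"))               # EQT
print(reversed_acronym("Holy cow, does this really work??"))  # WRTDH
```

Because this version also tolerates trailing ! and ?, words such as "work??" are counted, so its results differ from the comma-only condition above.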
Hope that helps!
re.findall(r'(\w)\w{3,}', sentence) finds the first letter of every word that has at least four letters:
''.join(reversed(re.findall(r'(\w)\w{3,}', sentence))).upper()
See the re module documentation.
If you want to ignore words preceding non-word characters, use (\w)\w{3,},?(?:$|\s) – this also allows a comma explicitly.
''.join(reversed(re.findall(r'(\w)\w{3,},?(?:$|\s)', sentence))).upper()
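To compare the two patterns side by side, here is a short sketch run on the sentence from the question. The variable names loose and strict are only illustrative.

```python
import re

sentence = "That was quite easy?"

# First letter of every run of four or more word characters.
loose = re.findall(r"(\w)\w{3,}", sentence)
# Same, but the word must be followed by whitespace or the end of the
# string, optionally after a comma.
strict = re.findall(r"(\w)\w{3,},?(?:$|\s)", sentence)

print("".join(reversed(loose)).upper())   # EQT
print("".join(reversed(strict)).upper())  # QT
```

The second pattern skips "easy?" because the word is directly followed by a question mark.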
I'm working on a hangman-like project: if the user inputs a letter that matches the solution, it replaces the asterisk that corresponds to the position of that letter in the solution. I'm trying to do this by getting the index of that letter in the solution and then replacing the character at the matching index in the asterisk string.
The thing here is that I only get the first instance of a recurring character when I used var.index(character) whereas I also have to replace the other instance of that letter. Here's the code:
word = 'turtlet'
astk = '******'
for i in word:
    if i == 't':
        astk[word.index(i)] = i
Here it just replaces the first instance of 't' every time. How can I possibly solve this?
index() gives you only the index of the first occurrence of the character (technically, substring) in a string. You should take advantage of using enumerate(). Also, instead of a string, your guess (hidden word) should be a list, since strings are immutable and do not support item assignment, which means you cannot reveal the character if the user's guess was correct. You can then join() it when you want to display it. Here is a very simplified version of the game so you can see it in action:
word = 'turtlet'
guess = ['*'] * len(word)
while '*' in guess:
    print(''.join(guess))
    char = input('Enter char: ')
    for i, x in enumerate(word):
        if x == char:
            guess[i] = char
print(''.join(guess))
print('Finished!')
Note that the find method of the string type has an optional parameter that tells it where to start the search. So if you are sure that the string word has at least two ts, you can use
firstindex = word.find('t')
secondindex = word.find('t', firstindex + 1)
I'm sure you can see how to adapt that to other uses.
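One possible adaptation is a loop that keeps calling find until it returns -1, which works for any number of occurrences. The helper name all_indices is hypothetical.

```python
def all_indices(text, char):
    """Collect every index at which char occurs in text, using str.find."""
    indices = []
    start = text.find(char)
    while start != -1:
        indices.append(start)
        start = text.find(char, start + 1)  # resume just past the last hit
    return indices

print(all_indices("turtlet", "t"))  # [0, 3, 6]
```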
I believe there's a better way to do your specific task.
Simply keep the word (or phrase) itself and, when you need to display the masked phrase, calculate it at that point. The following snippet shows how you can do this:
>>> import re
>>> phrase = 'My hovercraft is full of eels'
>>> letters = ' mefl'
>>> display = re.sub("[^"+letters+"]", '*', phrase, flags=re.IGNORECASE)
>>> display
'M* ***e****f* ** f*ll *f eel*'
Note that letters should start with the characters you want displayed regardless (space in my case but may include punctuation as well). As each letter is guessed, add it to letters and recalculate/redisplay the masked phrase.
The regular expression substitution replaces all characters that are not in letters, with an asterisk.
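Wrapped in a function, the same substitution could be sketched as follows. The name masked is hypothetical, letters is assumed to be non-empty, and re.escape is added here as a precaution in case a guessed character such as ] or ^ would otherwise alter the character class.

```python
import re

def masked(phrase, letters):
    """Show only the characters listed in letters; replace the rest with '*'."""
    # re.escape keeps characters like ']' or '-' literal inside the class.
    pattern = "[^" + re.escape(letters) + "]"
    return re.sub(pattern, "*", phrase, flags=re.IGNORECASE)

print(masked("My hovercraft is full of eels", " mefl"))
# M* ***e****f* ** f*ll *f eel*
```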
for i in range(len(word)):
    if word[i] == "t":
        astk = astk[:i] + word[i] + astk[i + 1:]
I have a two dimensional array called "beats" with a bunch of data. In the second column of the array, there is a list of words in alphabetical order.
I also have a sentence called "words" which was originally a string, which I've turned into an array.
I need to check whether any of the words in "words" matches any of the words in the second column of the array "beats". If a match is found, the program changes the matched word in the sentence "words" to "match" and then returns the words as a string. This is the code I'm using:
i = 0
while i < len(words):
    n = 0
    while n < len(beats):
        if words[i] == beats[n][1]:
            words[i] = "match"
        n = n + 1
    i = i + 1
mystring = ' '.join(words)
return mystring
So if I have the sentence:
"Money is the last money."
And "money" is in the second column of the array "beats", the result would be:
"match is the last match."
But since there's a period behind "match", it doesn't consider it a match.
Is there a way to ignore punctuation when comparing the two strings? I don't want to strip the sentence of punctuation because I want the punctuation to be intact when I return the string once my program's done replacing the matches.
You can create a new string that has the properties you want, and then compare with the new string(s). This will strip everything but numbers, letters, and spaces while making all letters lowercase.
''.join([letter.lower() for letter in ' '.join(words) if letter.isalnum() or letter == ' '])
To strip everything but letters from a string you can do something like:
from string import ascii_letters
''.join([letter for letter in word if letter in ascii_letters])
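Applied to the original problem, the idea could be sketched like this: compare a stripped, lower-cased copy of each word, but write the replacement back into the untouched token so the punctuation survives. The function name replace_matches and its parameters are hypothetical; with the two-dimensional beats array from the question, targets could be built as [row[1] for row in beats].

```python
import string

def replace_matches(sentence, targets, replacement="match"):
    """Replace words found in targets, keeping surrounding punctuation intact."""
    wanted = {t.lower() for t in targets}
    out = []
    for token in sentence.split():
        core = token.strip(string.punctuation)  # word without edge punctuation
        if core and core.lower() in wanted:
            token = token.replace(core, replacement, 1)
        out.append(token)
    return " ".join(out)

print(replace_matches("Money is the last money.", ["money", "nonsense"]))
# match is the last match.
```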
You could use a regex:
import re
st = "Money is the last money."
words = st.split()
beats = ['money', 'nonsense']
for i, word in enumerate(words):
    if word == 'match': continue
    for tgt in beats:
        # \b anchors the target at word boundaries; re.I ignores case
        word = re.sub(r'\b{}\b'.format(tgt), 'match', word, flags=re.I)
    words[i] = word
print(' '.join(words))
prints
match is the last match.
If it is only the full stop that you are worried about, you can add another if case to match that too. Similarly, you can add custom handling for other cases if they are limited in number. Otherwise, regex is the way to go.
words = "Money is the last money. This money is another money."
words = words.split()
i = 0
while i < len(words):
    if (words[i].lower() == "money".lower()):
        words[i] = "match"
    if (words[i].lower() == "money".lower() + '.'):
        words[i] = "match."
    i = i + 1
mystring = ' '.join(words)
print(mystring)
Output:
match is the last match. This match is another match.