I am working on the well-known Hamlet bot program in Python 3.7. I have a partial script (as a string) from Shakespeare's play Hamlet.
My task is to split the script into a list of sentences, and then split each sentence into a list of its words.
I am using the following code, copied from the internet:
'''
### BEGIN SOLUTION
def hamsplits__soln0():
    cleanham = ""
    for char in hamlet_text:
        swaplist = ["?", "!", "."]  # the punctuation marks we need to replace
        if char in swaplist:
            cleanham += "."  # replace every sentence-ending mark with "."
        elif char == " ":
            cleanham += char  # keep spaces as they are
        elif char.isalpha():
            cleanham += char.lower()  # keep letters, converted to lower case
    hamlist = cleanham.split(". ")  # split the text into a list of sentences
    for sentence in hamlist:
        hamsplits.append(sentence.split())  # split each sentence into a list of words
    if hamsplits[-1][-1][-1] == '.':
        hamsplits[-1][-1] = hamsplits[-1][-1][:-1]  # Remove trailing punctuation
'''
Here I want to understand the meaning of the last two lines of the code:
    if hamsplits[-1][-1][-1] == '.':
        hamsplits[-1][-1] = hamsplits[-1][-1][:-1]  # Remove trailing punctuation
Can anyone help me with this?
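You can think of hamsplits as having three levels: a list of sentences, where each sentence is a list of words, and each word is a string of characters.
The first line checks whether the last character of the last word of the last sentence is a dot. If it is, the second line replaces that word with a copy that omits its last character. Slicing with [:-1] works the same way on a list: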
>>> x = [1, 2, 3]
>>> x = x[:-1] # Remove last element
>>> x
[1, 2]
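For a list, this has the same effect as the following. Note that it does not work in the original code, where the innermost level is a string: strings are immutable, so del raises a TypeError there.
del hamsplits[-1][-1][-1]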
Let's take an example. Suppose hamsplits looks like this:
hamsplits = ['test', ['test1', ['test2', '.']]]
print(hamsplits[-1][-1][-1])  # prints '.'
if hamsplits[-1][-1][-1] == '.':  # compare the last element with "."
    hamsplits[-1][-1] = hamsplits[-1][-1][:-1]  # drop the '.' from the innermost list and store the remaining elements back
print(hamsplits[-1][-1])  # prints ['test2'], the innermost list without its last element
**Note**:
hamsplits[:-1] is a slice: it returns everything except the last element.
hamsplits[-1] accesses the last element.
Hope this helps!
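To see the three index levels on data shaped like the real hamsplits (a list of sentences, each a list of word strings), here is a small sketch. The sample sentences are made up for illustration and are not taken from the original script.

```python
# Hypothetical sample data: a list of sentences, each a list of words.
hamsplits = [
    ["to", "be", "or", "not", "to", "be"],
    ["that", "is", "the", "question."],
]

last_sentence = hamsplits[-1]      # the last sentence (a list of words)
last_word = hamsplits[-1][-1]      # the last word of that sentence (a string)
last_char = hamsplits[-1][-1][-1]  # the last character of that word

if last_char == '.':
    # Strings are immutable, so build a new string without the last
    # character and store it back into the list.
    hamsplits[-1][-1] = last_word[:-1]

print(hamsplits[-1])  # ['that', 'is', 'the', 'question']
```

Only the final word changes; every other sentence and word is left untouched.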
I am looking for a way to store the integer position of a character in a variable. Right now I'm using an approach I used in Delphi 2010, which Jupyter Notebook tells me is not right.
This is the code I have so far:
def animal_crackers(text):
    for index in text:
        if index==' ':
            if text[0] == text[pos(index)+1]:
                return True
            else:
                return False
        else:
            pass
The aim is to take two words (word + space + word) and return True if both words begin with the same letter, and False otherwise.
For getting the index of a letter in a string (as the title asks), just use str.index(), or str.find() if you don't want an error to be raised if the letter/substring could not be found:
>>> text = 'seal sheep'
>>> text.index(' ')
4
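As a quick illustration of the difference between the two methods when the substring is missing, here is a minimal sketch (the variable names are just for the example):

```python
text = 'seal sheep'

# str.find returns -1 when the substring is not present.
space_pos = text.find(' ')
missing_pos = text.find('z')
print(space_pos, missing_pos)  # 4 -1

# str.index raises ValueError instead of returning -1.
try:
    text.index('z')
    raised = False
except ValueError:
    raised = True
print(raised)  # True
```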
However for your program, you do not need to use str.index if you want to identify the first and second word. Instead, use str.split() to break up a given text into a list of substrings:
>>> words = text.split() # With no arguments, splits words by whitespace
>>> words
['seal', 'sheep']
Then, you can take the first letter of the first word and check if the second word begins with the same letter:
# For readability, you can assign the two words into their own variables
>>> first_word, second_word = words[0], words[1]
>>> first_word[0] == second_word[0]
True
Combined into a function, it may look like this:
def animal_crackers(text):
    words = text.split()
    first_word, second_word = words[0], words[1]
    return first_word[0] == second_word[0]
Assuming that text is a single line containing two words:
def animal_crackers(text):
    words = text.split()
    if len(words) < 2:
        return False  # fewer than two words, nothing to compare
    # here, .lower() is only necessary if the comparison should NOT be case-sensitive
    # if you do care about the case of the letters, remove it
    return words[0][0].lower() == words[1][0].lower()
I'm trying to compare the words in "alice_list" to "dictionary_list" and, if a word isn't found in "dictionary_list", print it and say it is probably misspelled. My problem is that nothing is printed when a word is not found. I convert the words in "alice_list" to uppercase, because "dictionary_list" is all in capitals. Any help with why it's not working would be appreciated, as I'm about to pull my hair out over it!
import re
# This function takes in a line of text and returns
# a list of words in the line.
def split_line(line):
    return re.findall('[A-Za-z]+(?:\'[A-Za-z]+)?', line)
# --- Read in a file from disk and put it in an array.
dictionary_list = []
alice_list = []
misspelled_words = []
for line in open("dictionary.txt"):
    line = line.strip()
    dictionary_list.extend(split_line(line))
for line in open("AliceInWonderLand200.txt"):
    line = line.strip()
    alice_list.extend(split_line(line.upper()))
def searching(word, wordList):
    first = 0
    last = len(wordList) - 1
    found = False
    while first <= last and not found:
        middle = (first + last)//2
        if wordList[middle] == word:
            found = True
        else:
            if word < wordList[middle]:
                last = middle - 1
            else:
                first = middle + 1
    return found
for word in alice_list:
    searching(word, dictionary_list)
--------- EDITED CODE THAT WORKED ----------
I updated a few things, in case anyone has the same issue, and used "if word not in" to double-check what the search was outputting.
"""-----Binary Search-----"""
# binary-search the dictionary for each word; print the word if it is not found
words = alice_list
for word in alice_list:
    first = 0
    last = len(dictionary_list) - 1
    found = False
    while first <= last and not found:
        middle = (first + last) // 2
        if dictionary_list[middle] == word:
            found = True
        else:
            if word < dictionary_list[middle]:
                last = middle - 1
            else:
                first = middle + 1
    if not found:
        print("NEW:", word)
# checking to make sure words match
for word in alice_list:
    if word not in dictionary_list:
        print(word)
Your function split_line() returns a list. You then take the output of the function and append it to the dictionary list, which means each entry in the dictionary is a list of words rather than a single word. The quick fix is to use extend instead of append.
dictionary_list.extend(split_line(line))
A set might be a better choice than a list here, then you wouldn't need the binary search.
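A minimal sketch of the set-based approach, assuming both lists already hold uppercase words. The function name find_unknown_words and the sample data are made up for illustration; the real lists would come from the two files.

```python
def find_unknown_words(words, known_words):
    """Return the words (in order, without repeats) that are not in known_words."""
    known = set(known_words)  # one-time conversion; membership tests are then fast
    seen = set()
    unknown = []
    for word in words:
        if word not in known and word not in seen:
            seen.add(word)
            unknown.append(word)
    return unknown


# Hypothetical sample data standing in for the file contents.
dictionary_list = ["A", "CAT", "HAT", "THE"]
alice_list = ["THE", "CAT", "SAAT", "A", "HATT", "SAAT"]
print(find_unknown_words(alice_list, dictionary_list))  # ['SAAT', 'HATT']
```

Unlike binary search, this does not require the dictionary to be sorted.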
--EDIT--
To print words not in the list, just filter the list based on whether your function returns False. Something like:
notfound = [word for word in alice_list if not searching(word, dictionary_list)]
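If binary search is a requirement, the standard library's bisect module can do the halving for you. The sketch below reuses the name searching from the question and assumes the word list is already sorted; the sample data is invented for the example.

```python
from bisect import bisect_left


def searching(word, word_list):
    """Binary search: return True if word is in the sorted word_list."""
    i = bisect_left(word_list, word)  # leftmost position where word could be inserted
    return i < len(word_list) and word_list[i] == word


# Hypothetical sorted sample data.
dictionary_list = ["A", "CAT", "HAT", "THE"]
alice_list = ["THE", "DOG", "CAT", "ZEBRA"]

notfound = [word for word in alice_list if not searching(word, dictionary_list)]
print(notfound)  # ['DOG', 'ZEBRA']
```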
Are you required to use binary search for this program? Python has a handy operator called "in". Given an element as the first operand and a list/set/dictionary/tuple as the second, it returns True if that element is in the structure, and False if it is not.
Examples:
1 in [1, 2, 3, 4] -> True
"APPLE" in ["HELLO", "WORLD"] -> False
So, for your case, most of the script can be simplified to:
for word in alice_list:
    if word not in dictionary_list:
        print(word)
This will print each word that is not in the dictionary list.
I'm working on a hangman-like project: if the user inputs a letter that matches the solution, it replaces the asterisk that corresponds to the position of that letter in the solution. I'm trying to do this by getting the index of the letter in the solution and then replacing the character at the matching index in the asterisk string.
The problem is that I only get the first instance of a recurring character when I use var.index(character), whereas I also have to replace the other instances of that letter. Here's the code:
word = 'turtlet'
astk = '*******'
for i in word:
    if i == 't':
        astk[word.index(i)] = i
Here it just replaces the first instance of 't' every time. How can I possibly solve this?
index() gives you only the index of the first occurrence of the character (technically, substring) in a string. You should take advantage of using enumerate(). Also, instead of a string, your guess (hidden word) should be a list, since strings are immutable and do not support item assignment, which means you cannot reveal the character if the user's guess was correct. You can then join() it when you want to display it. Here is a very simplified version of the game so you can see it in action:
word = 'turtlet'
guess = ['*'] * len(word)
while '*' in guess:
    print(''.join(guess))
    char = input('Enter char: ')
    for i, x in enumerate(word):
        if x == char:
            guess[i] = char
print(''.join(guess))
print('Finished!')
Note that the find method of the string type has an optional parameter that tells it where to start the search. So if you are sure that the string word has at least two ts, you can use
firstindex = word.find('t')
secondindex = word.find('t', firstindex + 1)
I'm sure you can see how to adapt that to other uses.
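One possible adaptation, as a sketch: keep calling find with a moving start position until it returns -1. The helper name find_all is hypothetical and not part of the str API.

```python
def find_all(text, sub):
    """Return the start index of every occurrence of sub in text."""
    indexes = []
    start = text.find(sub)
    while start != -1:
        indexes.append(start)
        # Continue searching just after the previous hit.
        start = text.find(sub, start + 1)
    return indexes


print(find_all('turtlet', 't'))  # [0, 3, 6]
```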
I believe there's a better way to do your specific task.
Simply keep the word (or phrase) itself and, when you need to display the masked phrase, calculate it at that point. The following snippet shows how you can do this:
>>> import re
>>> phrase = 'My hovercraft is full of eels'
>>> letters = ' mefl'
>>> display = re.sub("[^"+letters+"]", '*', phrase, flags=re.IGNORECASE)
>>> display
'M* ***e****f* ** f*ll *f eel*'
Note that letters should start with the characters you want displayed regardless (space in my case but may include punctuation as well). As each letter is guessed, add it to letters and recalculate/redisplay the masked phrase.
The regular expression substitution replaces all characters that are not in letters, with an asterisk.
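The same idea can be wrapped in a small function. This is only a sketch: the name mask is made up, and re.escape is added as a precaution so that guessed characters such as "]" or "^" are treated literally inside the character class.

```python
import re


def mask(phrase, letters):
    """Replace every character of phrase that is not in letters with '*', ignoring case."""
    if not letters:
        # An empty character class is not a valid pattern, so handle it directly.
        return '*' * len(phrase)
    pattern = "[^" + re.escape(letters) + "]"
    return re.sub(pattern, '*', phrase, flags=re.IGNORECASE)


print(mask('My hovercraft is full of eels', ' mefl'))
# M* ***e****f* ** f*ll *f eel*
```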
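Since strings are immutable, you can also rebuild astk with slicing each time a letter matches (astk must be the same length as word):
for i in range(len(word)):
    if word[i] == "t":
        astk = astk[:i] + word[i] + astk[i + 1:]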
I am trying to define a function that:
Splits a string called text at every newline (e.g. text="1\n2\n3")
Checks ONLY the first character of each of the split lines to see if it is a digit 0-9.
Returns the index of any line that starts with 0-9; there can be more than one such line
e.g. count_digit_leading_lines("AAA\n1st") → 1 # 2nd line starts w/ digit 1
So far my code looks like this, but I can't figure out how to get it to check only the first character of each split string:
def count_digit_leading_lines(text):
    for line in range(len(text.split('\n'))):
        for index, line in enumerate(line):
            if 0<=num<=9:
                return index
It accepts the argument text and iterates over each individual line (the new split strings). I think it then goes on to check only the first index, but this is where I get lost...
Inside your function, the code can be as simple as:
text = text.strip()            # strip surrounding whitespace, e.g. a trailing '\n'
text = text.replace('\t', '')  # remove tab characters
s = text.split('\n')           # split the text into lines
# s = [line.strip() for line in s]  # can also use this instead of replace('\t')
for i, sentence in enumerate(s):
    char = sentence[:1]        # first char of each line ('' if the line is empty)
    if char.isdigit():         # if the first char is a digit (0-9)
        return i
UPDATE:
I just noticed OP's comment on another answer saying that you don't want to use enumerate in your code (though using enumerate is good practice). The for loop without enumerate is:
for i in range(len(s)):
    char = s[i][:1]     # first char of each line ('' if the line is empty)
    if char.isdigit():  # if the first char is a digit (0-9)
        return i
This should do it:
texts = ["1\n2\n3", 'ABC\n123\n456\n555']
def _get_index_if_matching(text):
    split_text = text.split('\n')
    if split_text:
        for line_index, line in enumerate(split_text):
            try:
                num = int(line[0])
                if 0 <= num <= 9:
                    return line_index
            except (ValueError, IndexError):  # not a digit, or an empty line
                pass
for text in texts:
    print(_get_index_if_matching(text))
It will print 0 and then 1
You could change out your return statement for a yield, making your function a generator. Then you could get the indexes one by one in a loop, or make them into a list. Here's a way you could do it:
def count_digit_leading_lines(text):
    for index, line in enumerate(text.split('\n')):
        try:
            int(line[0])
            yield index
        except (ValueError, IndexError):  # not a digit, or an empty line
            pass
# Usage:
for index in count_digit_leading_lines(text):
    print(index)
# Or to get a list
print(list(count_digit_leading_lines(text)))
Example:
In : list(count_digit_leading_lines('he\n1\nhto2\n9\ngaga'))
Out: [1, 3]
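The function name and the example in the question can also be read as asking for a count rather than an index. If that reading is the intended one (an assumption, since the question is ambiguous), a minimal sketch could look like this:

```python
def count_digit_leading_lines(text):
    """Count the lines of text whose first character is a digit."""
    # line[:1] is '' for an empty line, and ''.isdigit() is False,
    # so empty lines are skipped without raising IndexError.
    # Note: str.isdigit also accepts some non-ASCII digit characters.
    return sum(1 for line in text.split('\n') if line[:1].isdigit())


print(count_digit_leading_lines("AAA\n1st"))                # 1
print(count_digit_leading_lines('he\n1\nhto2\n9\ngaga'))    # 2
```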
I have a two dimensional array called "beats" with a bunch of data. In the second column of the array, there is a list of words in alphabetical order.
I also have a sentence called "words" which was originally a string, which I've turned into an array.
I need to check whether any of the words in "words" matches any of the words in the second column of the array "beats". If a match is found, the program changes the matched word in the sentence "words" to "match" and then returns the words as a string. This is the code I'm using:
i = 0
while i < len(words):
    n = 0
    while n < len(beats):
        if words[i] == beats[n][1]:
            words[i] = "match"
        n = n + 1
    i = i + 1
mystring = ' '.join(words)
return mystring
So if I have the sentence:
"Money is the last money."
And "money" is in the second column of the array "beats", the result should be:
"match is the last match."
But since there's a period after the last "money", it isn't considered a match.
Is there a way to ignore punctuation when comparing the two strings? I don't want to strip the punctuation from the sentence, because I want it to be intact when I return the string once my program is done replacing the matches.
You can create a new string that has the properties you want, and then compare with the new string(s). This will strip everything but numbers, letters, and spaces while making all letters lowercase.
''.join([letter.lower() for letter in ' '.join(words) if letter.isalnum() or letter == ' '])
To strip everything but letters from a string you can do something like:
from string import ascii_letters
''.join([letter for letter in word if letter in ascii_letters])
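To compare without punctuation while keeping it in the output, one option is to strip punctuation only from a temporary copy of each word. This is a sketch under a few assumptions: the function name replace_matches is hypothetical, matching is case-insensitive, and only punctuation at the start or end of a word is ignored.

```python
import string


def replace_matches(sentence, targets, replacement="match"):
    """Replace words found in targets, keeping surrounding punctuation intact."""
    lowered_targets = {t.lower() for t in targets}
    result = []
    for token in sentence.split():
        # Compare a copy with leading/trailing punctuation removed.
        core = token.strip(string.punctuation)
        if core and core.lower() in lowered_targets:
            # Swap only the word itself; the punctuation around it stays.
            token = token.replace(core, replacement, 1)
        result.append(token)
    return ' '.join(result)


print(replace_matches("Money is the last money.", ["money", "nonsense"]))
# match is the last match.
```

With a two-dimensional beats array as in the question, the targets could be built with something like [row[1] for row in beats].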
You could use a regex:
import re
st = "Money is the last money."
words = st.split()
beats = ['money', 'nonsense']
for i, word in enumerate(words):
    if word == 'match': continue
    for tgt in beats:
        # re.escape keeps any special characters in tgt literal
        word = re.sub(r'\b{}\b'.format(re.escape(tgt)), 'match', word, flags=re.I)
    words[i] = word
print(' '.join(words))
prints
match is the last match.
If it is only the full stop that you are worried about, you can add another if case to match that too. Similarly, you can add custom handling if your cases are limited. Otherwise, a regex is the way to go.
words = "Money is the last money. This money is another money."
words = words.split()
i = 0
while i < len(words):
    if words[i].lower() == "money":
        words[i] = "match"
    if words[i].lower() == "money" + '.':
        words[i] = "match."
    i = i + 1
mystring = ' '.join(words)
print(mystring)
Output:
match is the last match. This match is another match.