How to make this code disregard all punctuation from the sentence? - python

I've written this code to analyse an input sentence so that the user can search for a certain word within it. However, I can't figure out how to make it disregard all the punctuation in the input sentence. I need this because, if a sentence such as "hello there, friend" is input, the word "there" is stored as "there," and so a search for "there" reports that it is not in the sentence. Please help me. I'm really new to Python.
print("Please enter a sentence")
sentence=input()
lowersen=(sentence.lower())
print(lowersen)
splitlowersen=(lowersen.split())
print (splitlowersen)
print("Enter word")
word=input()
lword=(word.lower())
if lword in splitlowersen:
    print(lword, "is in sentence")
for i, j in enumerate (splitlowersen):
    if j==lword:
        print(""+lword+"","is in position", i+1)
if lword not in splitlowersen:
    print (lword, "is not in sentence")

print("Please enter a sentence")
sentence = input()
lowersen = sentence.lower()
print(lowersen)
# Remove punctuation first, then split the cleaned sentence into words
cleaned = "".join(c for c in lowersen if c not in ('!', '.', ':', ',', ';', '?'))
splitlowersen = cleaned.split()
print("Enter word")
word = input()
lword = word.lower()
if lword in splitlowersen:
    print(lword, "is in sentence")
for i, j in enumerate(splitlowersen):
    if j == lword:
        print(lword, "is in position", i+1)
if lword not in splitlowersen:
    print(lword, "is not in sentence")
Output:
Please enter a sentence
hello, friend
hello, friend
Enter word
hello
hello is in sentence
hello is in position 1

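You could split the string on all punctuation marks and whitespace. str.split() treats a multi-character argument as one literal separator, so use re.split with a character class instead:
import re
s = "This, is a line."
# Split on runs of whitespace or punctuation and drop the empty trailing piece
f = [w for w in re.split(r"[\s.,!?]+", s) if w]
>>> f
['This', 'is', 'a', 'line']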

This is maybe a little long-winded, but it works in Python 3.
# This keeps only letters and spaces, removing every other character from the sentence
sentence = ''.join(filter(lambda x: x.isalpha() or x == ' ', sentence))
# the rest of your code will work after this.
There are a couple of advanced concepts in here.
filter takes a function and an iterable, and returns an iterator over the items for which the function returns true.
https://docs.python.org/3/library/functions.html#filter
lambda creates an anonymous function that checks each character for us.
https://docs.python.org/3/reference/expressions.html#lambda
x.isalpha() checks that the character in question is actually a letter,
followed by x == ' ' to see if it could be a space.
https://docs.python.org/3.6/library/stdtypes.html?highlight=isalpha#str.isalpha
''.join takes the results of the filter and puts them back into a string for you.
https://docs.python.org/3.6/library/stdtypes.html?highlight=isalpha#str.join
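As a rough sketch of how the filter approach plugs into the word search from the question: the helper name strip_punctuation and the sample sentence are made up for illustration. Note that this filter also drops digits and apostrophes, which may or may not be what you want.

```python
def strip_punctuation(sentence):
    # Keep letters and spaces only; everything else (commas, digits, ...) is dropped
    return ''.join(filter(lambda ch: ch.isalpha() or ch == ' ', sentence))

sentence = "Hello there, friend!"
words = strip_punctuation(sentence.lower()).split()
print(words)                     # ['hello', 'there', 'friend']
print("there" in words)          # True
print(words.index("there") + 1)  # 2 (1-based position, as in the question)
```

Cleaning the sentence before calling split() means "there," becomes "there", so the membership test behaves as the asker expected.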

Or you could use the nltk package to tokenize your text. It splits the sentence into tokens as you would expect, and it also avoids common punctuation pitfalls: for example, 'Mr.' is not broken apart at the period.
from nltk.tokenize import word_tokenize
string = "Hello there, friend"
words = word_tokenize(string)
print(words)
OUTPUT
['Hello', 'there', ',', 'friend']
So I guess you should try the nltk package and see if it works. Note that punctuation comes back as separate tokens (the ',' above), so filter those out before searching for a word.
Hope this helps :)

Related

How do I input more than one word for translation in Python?

I'm trying to make a silly translator game as practice. I'm replacing "Ben" with "Idiot" but it only works when the only word I input is "Ben". If I input "Hello, Ben" then the console prints out a blank statement. I'm trying to get "Hello, Idiot". Or if I enter "Hi there, Ben!" I would want to get "Hi there Idiot!". If I input "Ben" then it converts to "Idiot" but only when the name by itself is entered.
I'm using Python 3 and am using function def translate(word): so maybe I'm over-complicating the process.
def translate(word):
    translation = ""
    if word == "Ben":
        translation = translation + "Idiot"
    return translation
print(translate(input("Enter a phrase: ")))
I'm sorry if I explained all of this weird. Completely new to coding and using this website! Appreciate all of the help!
Use the str.replace() method for this:
sentence = "Hi there Ben!"
sentence = sentence.replace("Ben", "Idiot")
print(sentence)
Output: Hi there Idiot!
# str.replace() is case sensitive
First, you must split the string into words:
s.split()
But that function only splits the string on whitespace, which is not good enough:
s = "Hello Ben!"
print(s.split())
Out: ["Hello", "Ben!"]
In this example, you can't find "Ben" easily, because the "!" stays attached to it.
We use re in this case:
re.split('[^a-zA-Z]', s)
Out: ["Hello", "Ben", ""]
But now we have lost the "!" (and the space). Adding a capturing group keeps the separators:
re.split('([^a-zA-Z])', s)
Out: ['Hello', ' ', 'Ben', '!', '']
And finally:
import re
def translate(word):
    words_list = re.split('([^a-zA-Z])', word)
    translation = ""
    for item in words_list:
        if item == "Ben":
            translation += "Idiot"
        else:
            translation += item
    return translation
print(translate("Hello Ben! Benchmark is ok!"))
P.S:
If we use replace, we get a wrong answer, because it also replaces "Ben" inside longer words:
"Hello Ben! Benchmark is ok!".replace("Ben", "Idiot")
Out: Hello Idiot! Idiotchmark is ok!
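A shorter alternative, offered as a sketch rather than as part of the answer above, is re.sub with word boundaries. The \b anchors make the pattern match "Ben" only as a whole word, so "Benchmark" is left untouched while the punctuation and spacing are preserved.

```python
import re

def translate(phrase):
    # \b matches a word boundary, so "Ben" inside "Benchmark" is left alone
    return re.sub(r"\bBen\b", "Idiot", phrase)

print(translate("Hello Ben! Benchmark is ok!"))  # Hello Idiot! Benchmark is ok!
print(translate("Hi there, Ben!"))               # Hi there, Idiot!
```

Like str.replace(), this is case sensitive; passing flags=re.IGNORECASE to re.sub would relax that.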

Python: Iterate through string and print only specific words

I'm taking a class in python and now I'm struggling to complete one of the tasks.
The aim is to ask for an input, iterate through that string, and print only the words that start with letters > g. If the word starts with a letter larger than g, we print that word. Otherwise, we empty the word and iterate through the next word(s) in the string to do the same check.
This is the code I have, and the output. Would be grateful for some tips on how to solve the problem.
# [] create words after "G" following the Assignment requirements: use of functions, methods and keywords
# sample quote "Wheresoever you go, go with all your heart" ~ Confucius (551 BC - 479 BC)
# [] copy and paste in edX assignment page
quote = input("Enter a sentence: ")
word = ""
# iterate through each character in quote
for char in quote:
    # test if character is alpha
    if char.isalpha():
        word += char
    else:
        if word[0].lower() >= "h":
            print(word.upper())
        else:
            word=""
Enter a sentence: Wheresoever you go, go with all your heart
WHERESOEVER
WHERESOEVERYOU
WHERESOEVERYOUGO
WHERESOEVERYOUGO
WHERESOEVERYOUGOGO
WHERESOEVERYOUGOGOWITH
WHERESOEVERYOUGOGOWITHALL
WHERESOEVERYOUGOGOWITHALLYOUR
The output should look like,
Sample output:
WHERESOEVER
YOU
WITH
YOUR
HEART
Simply a list comprehension with split will do:
s = "Wheresoever you go, go with all your heart"
print(' '.join([word for word in s.split() if word[0].lower() > 'g']))
# Wheresoever you with your heart
Modifying to match with the desired output (Making all uppercase and on new lines):
s = "Wheresoever you go, go with all your heart"
print('\n'.join([word.upper() for word in s.split() if word[0].lower() > 'g']))
'''
WHERESOEVER
YOU
WITH
YOUR
HEART
'''
Without list comprehension:
s = "Wheresoever you go, go with all your heart"
for word in s.split():  # Split the sentence into words and iterate through each.
    if word[0].lower() > 'g':  # Check if the first character (lowercased) > g.
        print(word.upper())  # If so, print the word all capitalised.
Here is a readable and commented solution. The idea is to first split the sentence into a list of words using re.findall (from the re module) and iterate through this list, instead of iterating over each character as you did. It is then quite easy to print only the words starting with a letter greater than 'g':
import re
# Prompt for an input sentence
quote = input("Enter a sentence: ")
# Split the sentence into a list of words
words = re.findall(r'\w+', quote)
# Iterate through each word
for word in words:
    # Print the word if its 1st letter is greater than 'g'
    if word[0].lower() > 'g':
        print(word.upper())
To go further, here is also the one-line style solution based on exactly the same logic, using list comprehension:
import re
# Prompt for an input sentence
quote = input("Enter a sentence: ")
# Print each word starting by a letter greater than 'g', in upper case
print(*[word.upper() for word in re.findall(r'\w+', quote) if word[0].lower() > 'g'], sep='\n')
import string
s = "Wheresoever you go, go with all your heart"
# Replace every punctuation character with a space, then split into words
out = s.translate(str.maketrans(string.punctuation, " "*len(string.punctuation)))
desired_result = [word.upper() for word in out.split() if word and word[0].lower() > 'g']
print(*desired_result, sep="\n")
Your problem is that you're only resetting word to an empty string in the else clause. You need to reset it to an empty string immediately after the print(word.upper()) statement as well for the code as you've written it to work correctly.
That being said, if it's not explicitly disallowed for the class you're taking, you should look into string methods, specifically str.split().
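For reference, here is one possible corrected version of the character-by-character loop from the question. It is a sketch, not the course's official solution, and the function name words_after_g is invented for illustration. Besides resetting word after every separator, it guards against empty words (", " produces two separators in a row, which would otherwise make word[0] fail) and appends a space so the last word is not lost.

```python
def words_after_g(quote):
    """Return the upper-cased words whose first letter comes after 'g'."""
    found = []
    word = ""
    # Append a space so the final word is flushed as well
    for char in quote + " ":
        if char.isalpha():
            word += char
        else:
            # Skip empty words (two separators in a row, e.g. ", ")
            if word and word[0].lower() >= "h":
                found.append(word.upper())
            word = ""  # reset after every word, kept or not
    return found

print("\n".join(words_after_g("Wheresoever you go, go with all your heart")))
```

For the sample quote this prints WHERESOEVER, YOU, WITH, YOUR and HEART on separate lines, matching the expected output in the question.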

Can't get program to print "not in sentence" when word not in sentence

I have a program that asks for input of a sentence, then asks for a word, and tells you the position of that word:
sentence = input("enter sentence: ").lower()
askedword = input("enter word to locate position: ").lower()
words = sentence.split(" ")
for i, word in enumerate(words):
    if askedword == word :
        print(i+1)
    #elif keyword != words :
        #print ("this not")
However, I cannot get the program to work correctly when I edit it so that, if the input word is not in the sentence, it prints "this isn't in the sentence".
Lists are sequences, so you can use the in operator on them to test for membership in the words list. If the word is there, find its position in the sentence with words.index (note that the index is zero-based):
sentence = input("enter sentence: ").lower()
askedword = input("enter word to locate position: ").lower()
words = sentence.split(" ")
if askedword in words:
    print('Position of word:', words.index(askedword))
else:
    print("Word is not in the given sentence.")
With sample input:
enter sentence: hello world
enter word to locate position: world
Position of word: 1
and, a false case:
enter sentence: hello world
enter word to locate position: worldz
Word is not in the given sentence.
If you're looking to check against multiple matches then a list-comprehension with enumerate is the way to go:
r = [i for i, j in enumerate(words, start=1) if j == askedword]
Then check on whether the list is empty or not and print accordingly:
if r:
    print("Positions of word:", *r)
else:
    print("Word is not in the given sentence.")
Jim's answer—combining a test for askedword in words with a call to words.index(askedword)—is the best and most Pythonic approach in my opinion.
Another variation on the same approach is to use try-except:
try:
    print(words.index(askedword) + 1)
except ValueError:
    print("word not in sentence")
However, I just thought I'd point out that the structure of the OP code looks like you might have been attempting to adopt the following pattern, which also works:
for i, word in enumerate(words):
    if askedword == word :
        print(i+1)
        break
else:  # triggered if the loop runs out without breaking
    print ("word not in sentence")
In an unusual twist unavailable in most other programming languages, this else binds to the for loop, not to the if statement (that's right, get your editing hands off my indents). See the python.org documentation here.

Scanning for two word phrase in Python dictionary

I am trying to use a Python dictionary object to help translate an input string to other words or phrases. I am having success with translating single words from the input, but I can't seem to figure out how to translate multi-word phrases.
Example:
sentence = input("Please enter a sentence: ")
myDict = {"hello": "hi","mean adult":"grumpy elder", ...ect}
How can I return hi grumpy elder if the user enters hello mean adult for the input?
"fast car" is a key to the dictionary, so you can extract the value if you use the key coming back from it.
If you're taking the input straight from the user and using it to reference the dictionary, get is safer, as it allows you to provide a default value in case the key doesn't exist.
print(myDict.get(sentence, "Phrase not found"))
Since you've clarified your requirements a bit more, the hard part now is the splitting; the get doesn't change. If you can guarantee the order and structure of the sentences (that is, it's always going to be structured such that we have a phrase with 1 word followed by a phrase with 2 words), then split only on the first occurrence of a space character.
split_input = sentence.split(' ', 1)
print("{} {}".format(myDict.get(split_input[0]), myDict.get(split_input[1])))
More complex split requirements I leave as an exercise for the reader. A hint would be to use the keys of myDict to determine what valid tokens are present in the sentence.
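One way to act on that hint, sketched here under assumptions, is to build a regular expression out of the dictionary keys. The function name translate_phrases and the sample dictionary my_dict are invented for illustration, and the keys are assumed to start and end with word characters so that the \b anchors behave sensibly.

```python
import re

# Hypothetical dictionary for illustration
my_dict = {"hello": "hi", "mean adult": "grumpy elder"}

def translate_phrases(sentence, mapping):
    # Try longer keys first so multi-word phrases win over single words
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b")
    # Replace each matched key with its dictionary value; other text is untouched
    return pattern.sub(lambda m: mapping[m.group(1)], sentence)

print(translate_phrases("hello mean adult", my_dict))  # hi grumpy elder
```

Words that are not keys pass through unchanged, so no fixed sentence structure is required.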
The same way as you normally would.
translation = myDict['fast car']
A solution to your particular problem would be something like the following, where maxlen is the maximum number of words in a single phrase in the dictionary.
translation = []
words = sentence.split(' ')
maxlen = 3
index = 0
while index < len(words):
    # Try the longest candidate phrase first, then shorter ones
    for i in range(maxlen, 0, -1):
        phrase = ' '.join(words[index:index+i])
        if phrase in myDict:
            translation.append(myDict[phrase])
            index += i
            break
    else:
        # No phrase matched at this position: keep the word unchanged
        translation.append(words[index])
        index += 1
print(' '.join(translation))
Given the sentence hello this is a nice fast car, it outputs hi this is a sweet quick ride.
This will check for each word and also a two word phrase using the word before and after the current word to make the phrase:
myDict = {"hello": "hi",
          "fast car": "quick ride"}
sentence = input("Please enter a sentence: ")
words = sentence.split()
for i, word in enumerate(words):
    if word in myDict:
        print(myDict.get(word))
        continue
    if i:
        # Phrase made from the previous word and the current word
        phrase = ' '.join([words[i-1], word])
        if phrase in myDict:
            print(myDict.get(phrase))
            continue
    if i < len(words)-1:
        # Phrase made from the current word and the next word
        phrase = ' '.join([word, words[i+1]])
        if phrase in myDict:
            print(myDict.get(phrase))
            continue

Python: How to print a word one letter at a time

How do I use Python to print a word one letter at a time? Any help would be appreciated.
If I understood you correctly, then you can use the following code:
for word in text.split():
    print(word)
Or, if you need to print a word's letters:
for let in word:
    print(let)
In case you need to skip punctuation and so on, you can also use a regex:
tst = 'word1, word2 word3;'
from re import findall
print(findall(r'\w+', tst))
Or, not very Pythonic:
skipC = [':', '.', ',', ';']  # add any others if needed
text = 'word1, word2. word3;'
for x in skipC:
    text = text.replace(x, ' ')
for word in text.split():
    print(word)
