Replacing specific word in text with uppercase version of itself [duplicate] - python

This question already has answers here:
Why doesn't calling a string method (such as .replace or .strip) modify (mutate) the string?
(3 answers)
Closed 3 years ago.
Ok, so this shouldn't be nearly this difficult, but I'm drawing a blank. So, the idea of this program is that it finds words that only occur in a text once, then capitalizes them. Here's the complete code:
from collections import Counter
from string import punctuation
path = input("Path to file: ")
with open(path) as f:
    word_counts = Counter(word.strip(punctuation) for line in f for word in line.replace(")", " ").replace("(", " ")
                          .replace(":", " ").replace("", " ").split())
wordlist = open(path).read().replace("\n", " ").replace(")", " ").replace("(", " ").replace("", " ")
unique = [word for word, count in word_counts.items() if count == 1]
for word in unique:
    text = wordlist
    text.replace(word, str(word.upper()))
print(text)
It just prints the original text, with no modifications made.
I know for a fact the first part works; it's just the final for loop that's giving me trouble.
Any idea what I'm screwing up?

Replace this line
text.replace(word, str(word.upper()))
with
text = text.replace(word, str(word.upper()))
str.replace() does not modify the original string; strings are immutable, so it returns a new string that you have to assign.

You should assign it back to text.
text = text.replace(word, str(word.upper()))
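Putting the fix together, here is a minimal sketch rather than the asker's exact program. It assumes the text is already held in a string; the helper name `uppercase_unique_words` and the sample sentence are hypothetical. It also swaps plain `str.replace` for `re.sub` with word boundaries, so that a unique word is not uppercased where it happens to appear inside a longer word.

```python
import re
from collections import Counter
from string import punctuation


def uppercase_unique_words(text):
    """Return text with every word that occurs exactly once uppercased."""
    # Count words after stripping surrounding punctuation; skip empty tokens.
    words = [w.strip(punctuation) for w in text.split()]
    counts = Counter(w for w in words if w)
    unique = [w for w, n in counts.items() if n == 1]

    for word in unique:
        # re.sub (like str.replace) returns a new string, so rebind the name.
        pattern = r"\b" + re.escape(word) + r"\b"
        text = re.sub(pattern, lambda m: m.group().upper(), text)
    return text


sample = "the cat saw the other cat"
print(uppercase_unique_words(sample))
# the cat SAW the OTHER cat
```

Note that the result is accumulated in `text` across iterations; resetting it to the original inside the loop would discard all but the last replacement.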

Related

What is the simplest way to capitalize the first word in a sentence for multiple sentences in python 3.7?

For my homework I have tried to get the first word of each sentence to capitalize.
This is for python 3.7.
def fix_cap():
    if "." in initialInput:
        sentsplit = initialInput.split(". ")
        capsent = [x.capitalize() for x in sentsplit]
        joinsent = ". ".join(capsent)
        print("Number of words capitalized: " + str(len(sentsplit)))
        print("Edited text: " + joinsent)
    elif "!" in initialInput:
        sentsplit = initialInput.split("! ")
        capsent = [x.capitalize() for x in sentsplit]
        joinsent = "! ".join(capsent)
        print("Number of words capitalized: " + str(len(sentsplit)))
        print("Edited text: " + joinsent)
    elif "?" in initialInput:
        sentsplit = initialInput.split("? ")
        capsent = [x.capitalize() for x in sentsplit]
        joinsent = "? ".join(capsent)
        print("Number of words capitalized: " + str(len(sentsplit)))
        print("Edited text: " + joinsent)
    else:
        print(initialInput.capitalize())
This will work if only one type of punctuation is used, but I would like it to work with multiple types in a paragraph.
Correctly splitting a text into sentences is hard. For handling cases such as abbreviations or names with titles, please refer to other questions on this site. This is only a very simple version, based on your conditions, which, I assume, will suffice for your task.
As you noticed, your code only works for one type of punctuation, because of the if/elif/else construct. But you do not need that at all! If e.g. there is no ? in the text, then split("? ") will just return the text as a whole (wrapped in a list). You could just remove the conditions, or iterate a list of possible sentence-ending punctuation. However, note that capitalize will not just upper-case the first letter, but also lower-case all the rest, e.g. names, acronyms, or words previously capitalized for a different type of punctuation. Instead, you could just upper the first char and keep the rest.
text = "text with. multiple types? of sentences! more stuff."
for sep in (". ", "? ", "! "):
    # s[:1] (rather than s[0]) also copes with empty segments
    text = sep.join(s[:1].upper() + s[1:] for s in text.split(sep))
print(text)
# Text with. Multiple types? Of sentences! More stuff.
You could also use a regular expression to split by all sentence separators at once. This way, you might even be able to use capitalize, although it will still lower-case names and acronyms.
>>> import re
>>> ''.join(s.capitalize() for s in re.split(r"([\?\!\.] )", text))
'Text with. Multiple types? Of sentences! More stuff.'
Or using re.sub with a look-behind (note the first char is still lower-case):
>>> re.sub(r"(?<=[\?\!\.] ).", lambda m: m.group().upper(), text)
'text with. Multiple types? Of sentences! More stuff.'
However, unless you know what those are doing, I'd suggest going with the first loop-based version.
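For completeness, here is one more sketch that combines the ideas above: a single `re.sub` that uppercases the first letter of the text as well as the first letter after each `.`, `!` or `?`, without lower-casing anything else. The function name `capitalize_sentences` and the sample text are hypothetical, and like the versions above it will wrongly capitalize after abbreviations such as "e.g.".

```python
import re


def capitalize_sentences(text):
    """Uppercase the first letter of the text and of each sentence after . ! or ?"""
    # Group 1 is the start of the text, or an end mark plus whitespace;
    # group 2 is the first character of the sentence that follows.
    pattern = re.compile(r"(^|[.!?]\s+)(\w)")
    return pattern.sub(lambda m: m.group(1) + m.group(2).upper(), text)


text = "text with. multiple types? of sentences! more NASA stuff."
print(capitalize_sentences(text))
# Text with. Multiple types? Of sentences! More NASA stuff.
```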

How to replace the punctuation marks in words with effective code? [duplicate]

This question already has answers here:
Best way to strip punctuation from a string
(32 answers)
Closed 6 years ago.
I have been working on a file which has a lot of punctuation, and we need to ignore the punctuation so we can count the actual length of the words.
Example:
Is this stack overflow! ---> Is this stack overflow
While doing this I wrote a separate case for each and every punctuation mark, which made my code slow. So I am looking for a more efficient way to do the same thing using a module or function.
Code snippet:
with open(file_name,'r') as f:
    for line in f:
        for word in line.split():
            #print word
            '''
            Handling Punctuation
            '''
            word = word.replace('.','')
            word = word.replace(',','')
            word = word.replace('!','')
            word = word.replace('(','')
            word = word.replace(')','')
            word = word.replace(':','')
            word = word.replace(';','')
            word = word.replace('/','')
            word = word.replace('[','')
            word = word.replace(']','')
            word = word.replace('-','')
So from this logic I have written this; is there any way to shorten it?
This question is a "classic", but a lot of answers don't work in Python 3 because the string.maketrans function was removed in Python 3 (str.maketrans replaces it). A Python 3-compliant solution is:
Use string.punctuation to get the punctuation characters and str.translate to remove them:
import string
"hello, world !".translate({ord(k):"" for k in string.punctuation})
results in:
'hello world '
In Python 3, the argument of translate is a mapping such as a dictionary. Each key is the Unicode code point of a character (as returned by ord), and each value is its replacement string. I created it using a dictionary comprehension.
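An equivalent table can be built with `str.maketrans`, whose third argument lists the characters to delete. The sketch below applies it to the word-length use case from the question; the names `REMOVE_PUNCTUATION` and `word_lengths` are hypothetical, and only ASCII punctuation from `string.punctuation` is covered.

```python
import string

# Build the table once; the third argument lists characters to delete.
REMOVE_PUNCTUATION = str.maketrans("", "", string.punctuation)


def word_lengths(line):
    """Return the length of each word in line, ignoring ASCII punctuation."""
    cleaned = (word.translate(REMOVE_PUNCTUATION) for word in line.split())
    # Words that consisted only of punctuation become empty and are skipped.
    return [len(word) for word in cleaned if word]


print("Is this stack overflow!".translate(REMOVE_PUNCTUATION))
# Is this stack overflow
print(word_lengths("Is this (stack) overflow!"))
# [2, 4, 5, 8]
```

Building the table once and reusing it avoids one `replace` call per punctuation mark per word.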
You can also use a regular expression with a character class, where string holds the text to clean (here 'Is this stack overflow!'):
>>> import re
>>> re.sub(r'[]!,:)([/-]', '', string)
'Is this stack overflow'
[]!,:)([/-] is a character class that matches any one of ] ! , : ) ( [ / or -. Each match is replaced with the empty string ''.

How do I turn only the first letter uppercase? [duplicate]

This question already has answers here:
Capitalize a string
(9 answers)
Closed 6 years ago.
I have this:
word = raw_input("enter a word")
word[0].upper()
But it still doesn't make the first letter uppercase.
.upper() returns a new string because strings are immutable. You need to assign the return value to a variable.
You can use .capitalize instead of .upper if you want to make only the first letter uppercase.
>>> word = raw_input("enter a word")
>>> word = word.capitalize()
Please note that .capitalize turns the rest of the characters to lowercase. If you don't want it to happen, just go with [0].upper():
word = word[0].upper() + word[1:]
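The difference between the two approaches is easy to see on a mixed-case word. The sketch below uses a hypothetical helper name, `upper_first`, and slices with `[:1]` instead of indexing with `[0]` so that an empty input does not raise `IndexError`.

```python
def upper_first(word):
    """Uppercase only the first character, leaving the rest unchanged."""
    # word[:1] is "" for an empty string, whereas word[0] would raise IndexError.
    return word[:1].upper() + word[1:]


print(upper_first("macBook"))    # MacBook
print("macBook".capitalize())    # Macbook
print(repr(upper_first("")))     # ''
```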

Trying to delete vowels from a string fails [duplicate]

This question already has answers here:
def anti_vowel - codecademy (python)
(7 answers)
Closed 8 years ago.
What's wrong with this code? The aim is to check whether the entered string contains vowels and delete them.
Here is the code:
def anti_vowel(text):
    text = str(text)
    vowel = "aeiouAEIOU"
    for i in text:
        for i in vowel.lower():
            text = text.replace('i',' ')
        for i in vowel.upper():
            text = text.replace('i',' ')
    return text
It's a lesson on Codecademy
You are trying to replace the string with the value 'i', not the contents of the variable i.
Your code is also very inefficient; you don't need to loop over every character in text; a loop over vowel is enough. Because you already include both upper and lowercase versions, the two loops over the lowercased and uppercased versions are in essence checking for each vowel 4 times.
The following would be enough:
def anti_vowel(text):
    text = str(text)
    vowel = "aeiouAEIOU"
    for i in vowel:
        text = text.replace(i,' ')
    return text
You are also replacing vowels with spaces, not just deleting them.
The fastest way to delete (rather than replace) all vowels would be to use str.translate():
def anti_vowel(text):
    return text.translate(str.maketrans('', '', 'aeiouAEIOU'))
The str.maketrans() static method produces a mapping that'll delete all characters named in the 3rd argument.

Python: Best practice for dynamically constructing regex [duplicate]

This question already has answers here:
Escaping regex string
(4 answers)
Closed 9 months ago.
I have a simple function to remove a "word" from some text:
import re
def remove_word_from(word, text):
    if not text or not word: return text
    rec = re.compile(r'(^|\s)(' + word + ')($|\s)', re.IGNORECASE)
    return rec.sub(r'\1\3', text, 1)
The problem, of course, is that if word contains characters such as "(" or ")" things break, and it generally seems unsafe to stick a random word in the middle of a regex.
What's best practice for handling cases like this? Is there a convenient, secure function I can call to escape "word" so it's safe to use?
You can use re.escape(word) to escape the word.
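As a sketch of how that could look in the function from the question (same behaviour otherwise, with one capture group fewer because the word itself no longer needs to be captured):

```python
import re


def remove_word_from(word, text):
    """Remove the first whitespace-delimited occurrence of word from text."""
    if not text or not word:
        return text
    # re.escape makes every character in word match literally.
    pattern = re.compile(r"(^|\s)" + re.escape(word) + r"($|\s)", re.IGNORECASE)
    # Keep the surrounding whitespace (groups 1 and 2), drop the word itself.
    return pattern.sub(r"\1\2", text, count=1)


print(remove_word_from("(beta)", "alpha (beta) gamma"))
```

As in the original function, the whitespace on both sides of the removed word is kept, so the result contains two consecutive spaces where the word used to be.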
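Unless you're forced to use regexps, couldn't you use the replace method for strings instead?
text = text.replace(word, '')
This avoids the escaping problem entirely. Note, however, that it is case-sensitive and removes every occurrence of word, including matches inside longer words.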
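Write a sanitizer function and pass word through that first. Note that the character list below does not cover \, ^ or $, so re.escape is the more complete option.
import re
from functools import reduce  # needed on Python 3; reduce is a builtin on Python 2
def sanitize(word):
    def literalize(wd, escapee):
        return wd.replace(escapee, "\\%s"%escapee)
    return reduce(literalize, "()[]*?{}.+|", word)
def remove_word_from(word, text):
    if not text or not word: return text
    rec = re.compile(r'(^|\s)(' + sanitize(word) + ')($|\s)', re.IGNORECASE)
    return rec.sub(r'\1\3', text, 1)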
