How to use the isalpha function to remove special characters - python

I am trying to remove special characters from each word in a string. The code below counts the words, but I can't get .isalpha to remove the non-alphabetical characters. Is anyone able to assist? Thank you in advance.
input = 'Hello, Goodbye hello hello! bye byebye hello?'
word_list = input.split()
for word in word_list:
    if word.isalpha()==False:
        word[:-1]
di = dict()
for word in word_list:
    di[word] = di.get(word,0)+1
di

It seems you are expecting word[:-1] to remove the last character of word and have that change reflected in word_list. However, word is just a loop variable bound to each string in the list, and word[:-1] builds a new string that is never stored, so the list itself is not changed.
A simple fix would be to create a new list and append values to it. Note that your original string is called input, which shadows the built-in input() function; this is not a good idea:
input_string = 'Hello, Goodbye hello hello! bye byebye hello?'
word_list = input_string.split()
new = []
for word in word_list:
    if word.isalpha() == False:
        new.append(word[:-1])
    else:
        new.append(word)
di = dict()
for word in new:
    di[word] = di.get(word,0)+1
print(di)
# {'byebye': 1, 'bye': 1, 'Hello': 1, 'Goodbye': 1, 'hello': 3}
You could also remove the second for loop and use collections.Counter instead:
from collections import Counter
print(Counter(new))

You are nearly there with your for loop. The main stumbling block seems to be that word[:-1] on its own does nothing; you need to store the result somewhere, for example by appending it to a list.
You also need to specify what happens to strings which don't need modifying. I'm also not sure what purpose the dictionary serves.
So here's your for loop re-written:
mystring = 'Hello, Goodbye hello hello! bye byebye hello?'
word_list = mystring.split()
res = []
for word in word_list:
    if not word.isalpha():
        res.append(word[:-1])
    else:
        res.append(word)
mystring_out = ' '.join(res) # 'Hello Goodbye hello hello bye byebye hello'
The idiomatic way to write the above is via feeding a list comprehension to str.join:
mystring_out = ' '.join([word[:-1] if not word.isalpha() else word
                         for word in mystring.split()])
It goes without saying that this assumes word.isalpha() returns False due to an unwanted character at the end of a string, and that this is the only scenario you want to consider for special characters.
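If the unwanted characters can appear anywhere in a word, not only at the end, one option is to filter each word character by character with str.isalpha. The following is a minimal sketch under that assumption; strip_non_alpha is a hypothetical helper name, not something from the question. Note that str.isalpha also returns True for non-ASCII letters.

```python
def strip_non_alpha(text):
    """Drop every non-alphabetic character from each whitespace-separated word."""
    cleaned = []
    for word in text.split():
        # Keep only the characters for which str.isalpha() is True.
        letters = ''.join(ch for ch in word if ch.isalpha())
        if letters:  # skip words that consisted only of special characters
            cleaned.append(letters)
    return ' '.join(cleaned)


print(strip_non_alpha('Hello, Goodbye hello hello! bye byebye hello?'))
# Hello Goodbye hello hello bye byebye hello
```

Unlike word[:-1], this does not assume there is exactly one special character and that it sits at the end of the word.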

One solution using re:
In [1]: import re
In [2]: a = 'Hello, Goodbye hello hello! bye byebye hello?'
In [3]: ' '.join([i for i in re.split(r'[^A-Za-z]', a) if i])
Out[3]: 'Hello Goodbye hello hello bye byebye hello'
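Since the original goal was to count the words, the regex approach can be combined with collections.Counter. This is only a sketch: count_words is a hypothetical name, and it assumes that a "word" is a run of ASCII letters and that counting should stay case sensitive, as in the question's dictionary.

```python
import re
from collections import Counter


def count_words(text):
    # Assumption: a word is a run of ASCII letters; anything else separates words.
    return Counter(re.findall(r'[A-Za-z]+', text))


counts = count_words('Hello, Goodbye hello hello! bye byebye hello?')
print(counts['hello'])  # 3
print(counts['Hello'])  # 1
```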

Related

How can I join words in a list to form a single-line sentence without it spanning onto a new line

I have tried this:
mystring=input()
mystring=mystring.split()
for word in mystring:
    newword=word[1:]+word[0]
    print("".join(newword))
Input:
hello world
the output for the above input:
elloh
orldw
The expected output should be:
elloh orldw
The problem with your current approach is that you are just printing each word once in the loop, which by default is also printing a newline character. You seem to have the idea of populating a new list with the partially reversed version of each word. If so, then define a list and use this approach:
mystring = input()
words = mystring.split()
words_out = []
for word in words:
    newword = word[1:] + word[0]
    words_out.append(newword)
print(" ".join(words_out))
For an input of hello world, the above script prints:
elloh orldw
You can simply call join on the separator string " " to concatenate the list elements with spaces between them, like:
mystring=input()
mystring=mystring.split()
newstring = ' '.join(mystring)
print(newstring)
Try this:
myString = "hello world".split()
for word in myString:
    new = word[1:]+word[0]
    print(new, end=" ", flush=True)
Output:
elloh orldw
Or you can do this:
myString = "hello world".split()
new_word = []
for word in myString:
    new = word[1:]+word[0]
    new_word.append(new)
print(*new_word, sep=" ")
Output:
elloh orldw
One Line Solution:
print(*[word[1:]+word[0] for word in input().split()], sep=" ")
Input: hello world
Output: elloh orldw
You can change the input.
mystring="hello world"
mystring=[text[1:]+text[0] for text in mystring.split()]
print(" ".join(mystring))
You can also write it as a one-liner:
print(" ".join([(word[1:] + word[0]) for word in mystring.strip().split()]))
The mistake is in this line: print("".join(newword)). The print should be outside the loop, and even then join would have nothing to join, because newword is reassigned on every iteration and loses its previous value. Try this instead:
mystring=input()
mystring=mystring.split()
final=[]
for word in mystring:
    newword=word[1:]+word[0]
    final.append(newword)
print(" ".join(final))

How do I input more than one word for translation in Python?

I'm trying to make a silly translator game as practice. I'm replacing "Ben" with "Idiot" but it only works when the only word I input is "Ben". If I input "Hello, Ben" then the console prints out a blank statement. I'm trying to get "Hello, Idiot". Or if I enter "Hi there, Ben!" I would want to get "Hi there Idiot!". If I input "Ben" then it converts to "Idiot" but only when the name by itself is entered.
I'm using Python 3 and am using function def translate(word): so maybe I'm over-complicating the process.
def translate(word):
    translation = ""
    if word == "Ben":
        translation = translation + "Idiot"
    return translation
print(translate(input("Enter a phrase: ")))
I'm sorry if I explained all of this weird. Completely new to coding and using this website! Appreciate all of the help!
Use the str.replace() function for this:
sentence = "Hi there Ben!"
sentence = sentence.replace("Ben", "Idiot")
print(sentence)
Output: Hi there Idiot!
# str.replace() is case sensitive
First, you need to split the string into words:
s.split()
But split() only separates words at whitespace, which is not good enough:
s = "Hello Ben!"
print(s.split())
Out: ['Hello', 'Ben!']
In this example, "Ben" is still attached to the "!", so you can't find it easily.
We can use re instead:
re.split('[^a-zA-Z]', s)
Out: ['Hello', 'Ben', '']
But now we have lost the "!" and the space. Adding a capturing group keeps the separators:
re.split('([^a-zA-Z])', s)
Out: ['Hello', ' ', 'Ben', '!', '']
And finally:
import re
def translate(word):
    words_list = re.split('([^a-zA-Z])', word)
    translation = ""
    for item in words_list:
        if item == "Ben":
            translation += "Idiot"
        else:
            translation += item
    return translation
print(translate("Hello Ben! Benchmark is ok!"))
P.S:
If we use replace, we have a wrong answer!
"Hello Ben! Benchmark is ok!".replace("Ben", "Idiot")
Out: Hello Idiot! Idiotchmark is ok!
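A shorter way to get the same whole-word behaviour is re.sub with the \b word-boundary anchor. This is a sketch rather than the answer's own method; the old and new parameters are names introduced here for illustration.

```python
import re


def translate(phrase, old="Ben", new="Idiot"):
    # \b matches at a word boundary, so "Ben" inside "Benchmark" is left alone.
    # re.escape guards against regex metacharacters in the word being replaced.
    pattern = r'\b' + re.escape(old) + r'\b'
    return re.sub(pattern, new, phrase)


print(translate("Hello Ben! Benchmark is ok!"))
# Hello Idiot! Benchmark is ok!
```

Like str.replace(), this is case sensitive; passing flags=re.IGNORECASE to re.sub would relax that.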

Check for words in a sentence

I am writing a program in Python. The user enters a text message, and I need to check whether the message contains a given sequence of words. For example, for the message "Hello world, my friend." and the two words "Hello", "world", the result is True. But for the message "Hello, beautiful world" the result is False. When only two words need to be checked, it can be done the way I did it in the code below, but with combinations of 5 or more words it gets difficult. Is there a compact solution to this problem?
s=message.text
s=s.lower()
lst = s.split()
elif "hello" in lst and "world" in lst :
    if "hello" in lst:
        c=lst.index("hello")
        if lst[c+1]=="world" or lst[c-1]=="world":
            E=True
        else:
            E=False
The straightforward way is to use a loop. Split your message into individual words, and then check for each of those in the sentence in general.
word_list = message.split() # this gives you a list of words to find
word_found = True
for word in word_list:
    if word not in message2:
        word_found = False
print(word_found)
The flag word_found is True if and only if all words were found in the sentence. There are many ways to make this shorter and faster, especially using the built-in all function with the word list as an in-line expression.
word_found = all(word in message2 for word in message.split())
Now, if you need to restrict your "found" property to matching exact words, you'll need more preprocessing. The above code is too forgiving of substrings: it would find "are you ok" in the sentence "your joke is only barely funny", because "are" is in "barely", "you" is in "your", and "ok" is in "joke". For the more restrictive case, you should break message2 into words, strip those words of punctuation, convert them to lower-case (to make matching easier), and then look for each word (from message) in the list of words from message2.
Can you take it from there?
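One possible way to take it from there, as a sketch only: contains_all_words is a hypothetical name, and it assumes that punctuation attached to the start or end of a word should be ignored. It checks that every word is present as a whole word, not that the words are adjacent or in order.

```python
import string


def contains_all_words(words_to_find, sentence):
    # Normalize the sentence: split on whitespace, strip surrounding
    # punctuation from each word, and lower-case it.
    sentence_words = {w.strip(string.punctuation).lower() for w in sentence.split()}
    wanted = [w.strip(string.punctuation).lower() for w in words_to_find.split()]
    wanted = [w for w in wanted if w]  # drop tokens that were pure punctuation
    return all(w in sentence_words for w in wanted)


print(contains_all_words("Hello world", "Hello, beautiful world"))
# True
print(contains_all_words("Are you OK ?", "your joke is only barely funny"))
# False
```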
Let me first clarify the requirements as I understand them:
ignore case
match a consecutive sequence of words
match the words in any order, like a permutation or anagram
support duplicated words
If the number of words is not too large, you can try this approach, which is easy to understand but not the fastest:
split the text message into words
join them with ' '
list all the permutations of the words to check and join each with ' ' too. For example, to check the sequence ['Hello', 'beautiful', 'world'], the permutations are 'Hello beautiful world', 'Hello world beautiful', 'beautiful Hello world', and so on.
check whether any of these permutations, such as 'hello beautiful world', occurs in the joined text.
The sample code is here:
import itertools
import re
# permutations brute-force, O(nk!)
def checkWords(text, word_list):
    # split all words without space and punctuation
    text_words= re.findall(r"[\w']+", text.lower())
    # list all the permutations of word_list, and match
    for words in itertools.permutations(word_list):
        if ' '.join(words).lower() in ' '.join(text_words):
            return True
    return False
    # or use any, just one line
    # return any(' '.join(words).lower() in ' '.join(text_words) for words in list(itertools.permutations(word_list)))
def test():
    # True
    print(checkWords('Hello world, my friend.', ['Hello', 'world', 'my']))
    # False
    print(checkWords('Hello, beautiful world', ['Hello', 'world']))
    # True
    print(checkWords('Hello, beautiful world Hello World', ['Hello', 'world', 'beautiful']))
    # True
    print(checkWords('Hello, beautiful world Hello World', ['Hello', 'world', 'world']))
But this becomes expensive when the number of words is large: k words generate k! permutations, so the time complexity is O(nk!).
I think a more efficient solution is a sliding window, which reduces the time complexity to O(n):
import itertools
import re
import collections
# sliding window, O(n)
def checkWords(text, word_list):
    # split all words without space and punctuation
    text_words = re.findall(r"[\w']+", text.lower())
    counter = collections.Counter(map(str.lower, word_list))
    start, end, count, all_indexes = 0, 0, len(word_list), []
    while end < len(text_words):
        counter[text_words[end]] -= 1
        if counter[text_words[end]] >= 0:
            count -= 1
        end += 1
        # if you want all the index of match, you can change here
        if count == 0:
            # all_indexes.append(start)
            return True
        if end - start == len(word_list):
            counter[text_words[start]] += 1
            if counter[text_words[start]] > 0:
                count += 1
            start += 1
    # return all_indexes
    return False
I don't know if this is what you really need, but it works and you can test it:
message = 'hello world'
message2 = ' hello beautiful world'
if 'hello' in message and 'world' in message:
    print('yes')
else:
    print('no')
if 'hello' in message2 and 'world' in message2:
    print('yes')
Output:
yes
yes

Python: Modify Part of a String

I am taking an input string that is all one continuous group of letters and splitting it into a sentence. The problem is that as a beginner I can't figure out how to modify the string to ONLY capitalize the first letter and convert the others to lowercase. I know the string.lower but that converts everything to lowercase. Any ideas?
# This program asks user for a string run together
# with each word capitalized and gives back the words
# separated and only the first word capitalized
import re
def main():
    # ask the user for a string
    string = input( 'Enter some words each one capitalized, run together without spaces ')
    for ch in string:
        if ch.isupper() and not ch.islower():
            newstr = re.sub('[A-Z]',addspace,string)
    print(newstr)
def addspace(m) :
    return ' ' + m.group(0)
#call the main function
main()
You can use capitalize():
Return a copy of the string with its first character capitalized and
the rest lowercased.
>>> s = "hello world"
>>> s.capitalize()
'Hello world'
>>> s = "hello World"
>>> s.capitalize()
'Hello world'
>>> s = "hELLO WORLD"
>>> s.capitalize()
'Hello world'
Unrelated example. To capitalize only the first letter you can do:
>>> s = 'hello'
>>> s = s[0].upper()+s[1:]
>>> print(s)
Hello
>>> s = 'heLLO'
>>> s = s[0].upper()+s[1:]
>>> print(s)
HeLLO
For a whole string, you can do
>>> s = 'what is your name'
>>> print(' '.join(i[0].upper()+i[1:] for i in s.split()))
What Is Your Name
[EDIT]
You can also do:
>>> s = 'Hello What Is Your Name'
>>> s = ''.join(j.lower() if i>0 else j for i,j in enumerate(s))
>>> print(s)
Hello what is your name
If you only want to capitalize the start of sentences (and your string has multiple sentences), you can do something like:
>>> sentences = "this is sentence one. this is sentence two. and SENTENCE three."
>>> split_sentences = sentences.split('.')
>>> '. '.join([s.strip().capitalize() for s in split_sentences])
'This is sentence one. This is sentence two. And sentence three. '
If you don't want to change the case of the letters that don't start the sentence, then you can define your own capitalize function:
>>> def my_capitalize(s):
...     if s: # check that s is not ''
...         return s[0].upper() + s[1:]
...     return s
and then:
>>> '. '.join([my_capitalize(s.strip()) for s in split_sentences])
'This is sentence one. This is sentence two. And SENTENCE three. '
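Putting the pieces together for the original task (run-together capitalized words in, a sentence with only the first letter capitalized out) might look like the sketch below. The function name split_run_together is hypothetical, and the code assumes that every word starts with an ASCII uppercase letter.

```python
import re


def split_run_together(text):
    # Insert a space before every uppercase letter except one at the very start.
    spaced = re.sub(r'(?<!^)(?=[A-Z])', ' ', text)
    # capitalize() upper-cases the first character and lower-cases the rest.
    return spaced.capitalize()


print(split_run_together('StopAndSmellTheRoses'))
# Stop and smell the roses
```

The lookbehind (?<!^) avoids adding a leading space, so no strip() call is needed afterwards.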

How do I print words with only 1 vowel?

This is my code so far, but since I'm so lost it doesn't do anything close to what I want it to do:
vowels = 'a','e','i','o','u','y'
#Consider 'y' as a vowel
input = input("Enter a sentence: ")
words = input.split()
if vowels == words[0]:
    print(words)
so for an input like this:
"this is a really weird test"
I want it to only print:
this, is, a, test
because they contain only 1 vowel.
Try this:
vowels = set(('a','e','i','o','u','y'))
def count_vowels(word):
    return sum(letter in vowels for letter in word)
my_string = "this is a really weird test"
def get_words(my_string):
    for word in my_string.split():
        if count_vowels(word) == 1:
            print(word)
Result:
>>> get_words(my_string)
this
is
a
test
Here's another option:
import re
words = 'This sentence contains a bunch of cool words'
for word in words.split():
    if len(re.findall('[aeiouy]', word)) == 1:
        print(word)
Output:
This
a
bunch
of
words
You can translate all the vowels to a single vowel and count that vowel:
trans = str.maketrans('aeiouy', 'aaaaaa')
strs = 'this is a really weird test'
print([word for word in strs.split() if word.translate(trans).count('a') == 1])
>>> s = "this is a really weird test"
>>> [w for w in s.split() if len(w) - len(w.translate(str.maketrans('', '', 'aeiouy'))) == 1]
['this', 'is', 'a', 'test']
Not sure if words with no vowels are required. If so, just replace == 1 with < 2
You can use one for loop to collect the substrings into an array of strings, cutting a substring whenever the next character is a space.
Then, for each substring, check whether it contains exactly one vowel (a, e, i, o, u); if it does, add it to another array.
After that, join all the strings in that second array with a comma and a space.
Try this:
vowels = ('a','e','i','o','u','y')
words = [i for i in input('Enter a sentence ').split() if i != '']
interesting = [word for word in words if sum(1 for char in word if char in vowels) == 1]
I found so much nice code here, and I want to show my ugly one:
v = 'aoeuiy'
o = 'oooooo'
sentence = 'i found so much nice code here'
words = sentence.split()
trans = str.maketrans(v,o)
for word in words:
    if not word.translate(trans).count('o') >1:
        print(word)
I find your lack of regex disturbing.
Here's a plain regex only solution (ideone):
import re
text = "this is a really weird test"
words = re.findall(r"\b[^aeiouy\W]*[aeiouy][^aeiouy\W]*\b", text)
print(words)
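For reference, here is a plain Python 3 sketch that also handles uppercase letters and prints the result in the comma-separated form the question asks for. The name words_with_one_vowel is hypothetical, and 'y' is treated as a vowel, as in the question.

```python
VOWELS = set('aeiouy')  # 'y' counts as a vowel, as in the question


def words_with_one_vowel(sentence):
    result = []
    for word in sentence.split():
        # Lower-case the word first so that 'A' and 'a' are both counted.
        vowel_count = sum(ch in VOWELS for ch in word.lower())
        if vowel_count == 1:
            result.append(word)
    return result


print(', '.join(words_with_one_vowel('this is a really weird test')))
# this, is, a, test
```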
