IndexError: string index out of range in checking palindrome - python

Checking palindrome
I am new to Python. I tried debugging this, but I couldn't find the error.
import string
def is_palindrome(str_1, lowest_index, highest_index):
    punct = set(string.punctuation)
    print(punct)
    #remove punctuations
    no_punct = ""
    for char in str_1:
        if char not in punct:
            no_punct = no_punct + char
    print(no_punct)
    # rmv_whtspc = no_punct.rstrip()
    rmv_whtspc = no_punct.replace(' ','')
    print(rmv_whtspc)
    str_2 = rmv_whtspc.lower()
    print(str_2)
    if lowest_index > highest_index:
        return True
    else:
        if str_2[lowest_index] == str_2[highest_index]:
            return is_palindrome(str_2, lowest_index+1, highest_index-1)
        else:
            return False
Calling the function:
str_1 = "Madama I am adam"
lowest_index = 0
highest_index = len(str_1)-1
print(is_palindrome(str_1, lowest_index, highest_index))
The Output:
{'{', '<', '_', '$', '"', ',', '&', '\\', ']', '`', '%', "'", '#', '*', '+', '>', '/', '?', '=', '^', ')', '[', '(',
'~', '!', '@', '|', '}', ':', '.', ';', '-'}
Madama I am adam
MadamaIamadam
madamaiamadam
Traceback (most recent call last):
  File "recursion_5_2problem.py", line 27, in <module>
    print(is_palindrome(str_1, lowest_index, highest_index))
  File "recursion_5_2problem.py", line 19, in is_palindrome
    if str_2[lowest_index] == str_2[highest_index]:
IndexError: string index out of range

You compute the lowest and highest index before you clean the string (removing punctuation and whitespace). The cleaned string is shorter than the original, so highest_index can point past its end: here the input has 16 characters (so highest_index is 15), but the cleaned string has only 13.
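To make the mismatch concrete, here is a small check comparing the index computed from the raw input with the length of the cleaned string. It is only a sketch: the names `original` and `cleaned` are mine, not from the question.

```python
import string

original = "Madama I am adam"
# Same cleanup as in the question: drop punctuation and spaces, then lowercase.
cleaned = "".join(
    ch for ch in original if ch not in string.punctuation and ch != " "
).lower()

highest_index = len(original) - 1   # computed from the uncleaned string

print(len(original), len(cleaned))  # 16 13
print(highest_index)                # 15

try:
    cleaned[highest_index]
except IndexError as exc:
    print("IndexError:", exc)       # IndexError: string index out of range
```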
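I'd suggest cleaning the string before passing it to the palindrome function, then computing the lowest and highest index inside the function itself (i.e. after all the punctuation and whitespace has been removed).
import string

def clean_string(text):
    # remove punctuation and whitespace, then lowercase
    unwanted = set(string.punctuation + string.whitespace)
    return "".join(ch for ch in text if ch not in unwanted).lower()

def is_palindrome(text, low=0, high=None):
    # set high/low index on the already-cleaned string
    if high is None:
        high = len(text) - 1
    if low >= high:
        return True
    if text[low] != text[high]:
        return False
    return is_palindrome(text, low + 1, high - 1)

to_check = "race car!"
cleaned = clean_string(to_check)
print(is_palindrome(cleaned))  # True
Just a sketch, but I'm sure you get the point!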
Hope it helps! :)

The mistake you made is nicely described in Andrew Grass's answer.
Here is a suggestion for making all this a lot simpler:
For the cleanup you could use str.maketrans and str.translate; then you just compare the first half of the string to the first half of the reversed string:
from string import punctuation, whitespace
repl_table = str.maketrans("", "", punctuation + whitespace)
def normalize(strg):
    # remove all punctuation and whitespace and lowercase strg
    return strg.translate(repl_table).lower()
def ispalindrome(strg):
    n2 = len(strg) // 2
    return strg[:n2] == "".join(reversed(strg))[0:n2]
You could then use it like this:
strg = "Madama I am adam"
strg = normalize(strg) # madamaiamadam
print(ispalindrome(strg)) # True
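For comparison, a common shorter variant (not what the answer above does) compares the normalized string with its full reverse, obtained by slicing. A minimal sketch, with `is_palindrome_simple` and `_DELETE` as hypothetical names:

```python
from string import punctuation, whitespace

# Translation table that deletes every punctuation and whitespace character.
_DELETE = str.maketrans("", "", punctuation + whitespace)

def is_palindrome_simple(text):
    """Return True if text reads the same both ways, ignoring case,
    punctuation and whitespace."""
    normalized = text.translate(_DELETE).lower()
    return normalized == normalized[::-1]

print(is_palindrome_simple("Madama I am adam"))  # True
print(is_palindrome_simple("Hello, world"))      # False
```

It compares twice as many characters as the half-string version, which rarely matters for short inputs.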

Related

Using a List as 1 argument in replace()

I have a string from user input, and I want to remove all the special characters in this list:
[',', '.', '"', '\'', ':',]
Using the replace function I'm able to remove them one at a time, with something like:
string = "a,bhc:kalaej jff!"
string.replace(",", "")
But I want to remove all the special characters in one go. I have tried:
unwanted_specialchr = [',', '.', '"', '\'', ':',]
string = "a,bhc:kalaej jff!"
string.replace(unwanted_specialchr, "")
Figured it out:
def remove_specialchr(string):
    unwanted_specialchr = [',', '.', '"', '\'', ':',]
    for chr in string:
        if chr in unwanted_specialchr:
            string = string.replace(chr, '')
    return string
You can use re.sub:
import re
unwanted_specialchr = [',', '.', '"', '\'', ':',]
string = "a,bhc:kalaej jff!"
re.sub(f'[{"".join(unwanted_specialchr)}]', '', string)
output:
'abhckalaej jff!'
or you could use:
''.join(c for c in string if c not in unwanted_specialchr)
output:
'abhckalaej jff!'
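Another option that removes everything in one pass is `str.translate`. A minimal sketch, reusing the question's list and input; `table` and `cleaned` are names I picked:

```python
unwanted_specialchr = [',', '.', '"', '\'', ':']
string = "a,bhc:kalaej jff!"

# Build a table that maps each unwanted character to None (i.e. deletes it).
table = str.maketrans("", "", "".join(unwanted_specialchr))
cleaned = string.translate(table)
print(cleaned)  # abhckalaej jff!
```

Unlike the character-class regex, this needs no escaping if the list later gains characters such as `]` or `^`.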
Well, I think your solution could be improved by using a set for the lookup and building a new string instead of calling replace repeatedly:
def remove_specialchr(string):
    specialChr = {',', '.', '"', '\'', ':'}
    stringS = ''
    for chr in string:
        if chr not in specialChr:
            stringS += chr
    return stringS

How to remove my punctuation array from original text

I have a punctuation array like this:
punctuation_data = [ '=' '+' '_' '-' ')' '(' '*' '&' '^' '%'
'SSSS' 'AAAA' 'wwww' '!' '~' '،']
and I have text that I want to remove this punctuation from. I tried this, but it's not working:
list = [''.join(c for c in original_data if c not in punctuation_data)
for s in list]
Edit: The original post did not delete longer substrings. I included a function that loops through the punctuation data and deletes each substring.
You need to separate your list items with commas. Also, don't use built-in names like list as variable names.
This will work:
punctuation_data = [ '=', '+', '_', '-', ')', '(', '*', '&', '^', '%',
'SSSS', 'AAAA', 'wwww', '!', '~', '،']
orig_string = ['3+5=8']
def delete_substrings(orig_sub_string, punctuation_data):
    for element_to_delete in punctuation_data:
        orig_sub_string = orig_sub_string.replace(element_to_delete, "")
    return orig_sub_string
lst = [delete_substrings(orig_sub_string, punctuation_data) for orig_sub_string in orig_string]
print(lst) #['358']
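To see why the missing commas matter: Python concatenates adjacent string literals, so the question's list holds a single long string rather than separate items. A quick check, using a shortened list of my own:

```python
without_commas = ['=' '+' '_' 'SSSS']
with_commas = ['=', '+', '_', 'SSSS']

print(without_commas)       # ['=+_SSSS']
print(len(without_commas))  # 1
print(len(with_commas))     # 4

# List membership tests for equality with an item, so a single character
# never equals the one long concatenated string.
print('+' in without_commas)  # False
print('+' in with_commas)     # True
```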
Since you're trying to match a number of strings of varying lengths, it's best to use regex instead. Escape the strings with re.escape first so that they don't get interpreted as special characters in regex:
import re
punctuation_data = [ '=', '+', '_', '-', ')', '(', '*', '&', '^', '%', 'SSSS', 'AAAA', 'wwww', '!', '~', '،']
print(re.sub('|'.join(map(re.escape, punctuation_data)), '', 'abc*xyzAAAA123'))
This outputs:
abcxyz123
This worked for me:
original_data = 'What is hello'
punctuation_data = ['=', '+', '_', '-', ')', '(', '*', '&', '^', '%',
                    'SSSS', 'AAAA', 'wwww', '!', '~', '،']
original_data = original_data.split()
# keep only the words that are not themselves an entry in punctuation_data
resultwords = [word for word in original_data if
               word.lower() not in punctuation_data]
result = ' '.join(resultwords)
print(result)

remove certain symbols from a string array

I have a numpy.ndarray with Strings. I have created a character list, which I would like to use against the strings array, to remove all characters which appear in the character list. I want to put the symbol free strings in a new array. How can I do this?
Input:
symbols = string.printable[62:]
symbolsList = list(symbols)
symbolsList
Output:
['!',
'"',
'#',
'$',
'%',
'&',
"'",
'(',
')',
'*',
'+',
',',
'-',
'.',
'/',
':',
';',
'<',
'=',
'>',
'?',
'@',
'[',
'\\',
']',
'^',
'_',
'`',
'{',
'|',
'}',
'~',
' ',
'\t',
'\n',
'\r',
'\x0b',
'\x0c']
A sample of the string_array:
array(['[KFC] CHicken_Gravy_Coke_Biscuit This is my Order!!!<lf><lf>'], dtype=object)
I want it to look like this:
array(['KFC CHicken Gravy Coke Biscuit This is my Order lf lf'], dtype=object)
I tried:
cleanData = []
for i in string_array:
    cleanData.append(string_array[i].replace(symbolsList[i], " "))
and:
cleanData = []
for i in summary_data:
    cleanData = summary_data[i].replace(symbolsList[i], " ")
Both give the same output:
IndexError: only integers, slices (`:`), ellipsis (`...`), numpy.newaxis (`None`) and integer or boolean arrays are valid indices
Neither attempt works. How can I make this work, or otherwise get the result I want?
This is one way.
import re, string, numpy as np
def remove_chars_re(x):
    x = re.sub('[' + re.escape(''.join(string.printable[62:])) + ']', ' ', x)
    return re.sub(' +', ' ', x).strip()
arr = np.array(['[KFC] CHicken_Gravy_Coke_Biscuit This is my Order!!!<lf><lf>'], dtype=object)
list(map(remove_chars_re, arr))
# ['KFC CHicken Gravy Coke Biscuit This is my Order lf lf']
Explanation
The first re.sub replaces each unwanted character with a single space.
The second re.sub collapses runs of spaces into one space.
strip() removes whitespace from the start and end of the string.
Here is how I would go about it:
Iterate over each string, str.
For each str, iterate over each undesired character, chr (a nested loop).
Use str.replace(chr, ' ') to replace it.
Here is the code:
cleanData = []
for str in string_array:
    tmp_str = str #work on a copy, replacing one undesired character at a time
    for chr in symbolsList:
        tmp_str = tmp_str.replace(chr, ' ') #replace each undesired symbol with a space
    cleanData.append(tmp_str)
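The `IndexError` in the question comes from `for i in string_array`, which yields the strings themselves rather than integer positions, so `string_array[i]` ends up indexing with a string. Below is a sketch that iterates by value and uses `str.translate` to map every symbol to a space. A plain list stands in for the NumPy array here (an assumption, to keep the example dependency-free), and `clean` and `to_space` are hypothetical names:

```python
import string

symbols = string.printable[62:]  # punctuation followed by whitespace
to_space = str.maketrans({ch: " " for ch in symbols})

def clean(text):
    # Replace each symbol with a space, then collapse runs of whitespace.
    return " ".join(text.translate(to_space).split())

string_array = ['[KFC] CHicken_Gravy_Coke_Biscuit This is my Order!!!<lf><lf>']
clean_data = [clean(s) for s in string_array]  # iterate over values, not indices
print(clean_data)
# ['KFC CHicken Gravy Coke Biscuit This is my Order lf lf']
```

With a real array, `np.array(clean_data, dtype=object)` would turn the resulting list back into an object array.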

Removing punctuation in lists in Python

I am creating a Python program that converts a string to a list, uses a loop to remove any punctuation, then converts the list back into a string and prints the sentence without punctuation.
punctuation=['(', ')', '?', ':', ';', ',', '.', '!', '/', '"', "'"]
str=input("Type in a line of text: ")
alist=[]
alist.extend(str)
print(alist)
#Use loop to remove any punctuation (that appears on the punctuation list) from the list
print(''.join(alist))
This is what I have so far. I tried using something like: alist.remove(punctuation) but I get an error saying something like list.remove(x): x not in list. I didn't read the question properly at first and realized that I needed to do this by using a loop so I added that in as a comment and now I'm stuck. I was, however, successful in converting it from a list back into a string.
import string
punct = set(string.punctuation)
''.join(x for x in 'a man, a plan, a canal' if x not in punct)
Out[7]: 'a man a plan a canal'
Explanation: string.punctuation is pre-defined as:
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
The rest is a straightforward comprehension. A set is used to speed up the filtering step.
I found an easy way to do it:
punctuation = ['(', ')', '?', ':', ';', ',', '.', '!', '/', '"', "'"]
str = input("Type in a line of text: ")
for i in punctuation:
    str = str.replace(i,"")
print(str)
This way you will not get any error:
punctuation=['(', ')', '?', ':', ';', ',', '.', '!', '/', '"', "'"]
result = ""
for character in str:
    if character not in punctuation:
        result += character
print(result)
Here is how to tokenize the given statements using Python. The Python version I used is 3.4.4.
Assume the text is saved as one.txt, and the Python program is saved in the same directory as that file. The following is my Python program:
with open('one.txt','r') as myFile:
    str1=myFile.read()
print(str1) # print the given statements with punctuation (before removal)
# The list of punctuation marks to remove; add any more that I forgot
punctuation = ['(', ')', '?', ':', ';', ',', '.', '!', '/', '"', "'"]
for i in punctuation:
    str1 = str1.replace(i," ") # put a space where each punctuation mark was
myList=[]
myList.extend(str1.split(" "))
print(str1) # print the given statements without punctuation (after removal)
for i in myList:
    # print ("____________")
    print(i,end='\n')
print("____________")

strip punctuation with regex - python

I need to use regex to strip punctuation at the start and end of a word. It seems like regex would be the best option for this. I don't want punctuation removed from words like 'you're', which is why I'm not using .replace().
You don't need a regular expression for this task. Use str.strip with string.punctuation:
>>> import string
>>> string.punctuation
'!"#$%&\'()*+,-./:;<=>?@[\\]^_`{|}~'
>>> '!Hello.'.strip(string.punctuation)
'Hello'
>>> ' '.join(word.strip(string.punctuation) for word in "Hello, world. I'm a boy, you're a girl.".split())
"Hello world I'm a boy you're a girl"
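If a regex is still wanted, the same edge-only stripping can be sketched with anchors: match a run of punctuation at the very start or the very end of each word and leave everything in between alone. The names `EDGE_PUNCT` and `strip_edges` are hypothetical:

```python
import re
import string

_P = re.escape(string.punctuation)
# One or more punctuation characters at the very start or very end of a word.
EDGE_PUNCT = re.compile(rf"^[{_P}]+|[{_P}]+$")

def strip_edges(word):
    return EDGE_PUNCT.sub("", word)

sentence = "Hello, world. I'm a boy, you're a girl."
print(" ".join(strip_edges(w) for w in sentence.split()))
# Hello world I'm a boy you're a girl
```

Apostrophes inside words such as "you're" survive because the pattern only matches at the word boundaries.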
I think this function is a helpful and concise way to remove punctuation:
import re
def remove_punct(text):
    # text is expected to be a list of words; returns a new list with punctuation removed
    new_words = []
    for word in text:
        w = re.sub(r'[^\w\s]','',word) #remove everything except word characters and whitespace
        w = re.sub(r'_','',w) #remove underscores as well (\w matches them)
        new_words.append(w)
    return new_words
If you insist on using regex, I recommend this solution:
import re
import string
p = re.compile("[" + re.escape(string.punctuation) + "]")
print(p.sub("", "\"hello world!\", he's told me."))
### hello world hes told me
Note also that you can pass your own punctuation marks:
my_punct = ['!', '"', '#', '$', '%', '&', "'", '(', ')', '*', '+', ',', '.',
'/', ':', ';', '<', '=', '>', '?', '@', '[', '\\', ']', '^', '_',
'`', '{', '|', '}', '~', '»', '«', '“', '”']
punct_pattern = re.compile("[" + re.escape("".join(my_punct)) + "]")
re.sub(punct_pattern, "", "I've been vaccinated against *covid-19*!") # the "-" symbol should remain
### Ive been vaccinated against covid-19
You can remove punctuation from the words of a text file, or from a particular string, as follows:
new_data = []
with open('/home/rahul/align.txt','r') as f:
    f1 = f.read()
all_words = f1.split()
punctuations = r'''!()-[]{};:'"\,<>./?@#$%^&*_~'''
# You can add and remove punctuation marks as you like
# To clean a single string instead of a file, split it into all_words:
# all_words = "My name$## is . rahul -~".split()
for word in all_words:
    # strip punctuation characters from the word; skip words that were only punctuation
    cleaned = ''.join(ch for ch in word if ch not in punctuations)
    if cleaned:
        new_data.append(cleaned)
print(new_data)
Hope this helps!!
