(python) I keep getting an IndexError: string index out of range - python

I am trying to solve the following question:
Implement the mapper, mapFileToCount, which takes a string (text from a file) and returns the number of capitalized words in that string. A word is defined as a series of characters separated from other words by either a space or a newline. A word is capitalized if its first letter is capitalized (A vs a).
My Python code currently reads:
def mapFileToCount(s):
    lines = (str(s)).splitlines()
    words = (str(lines)).split(" ")
    up = 0
    for word in words:
        if word[0].isupper() == True:
            up = up + 1
    return up
However, I keep getting the error IndexError: string index out of range.
Please help.

For now, given the input Hi huy \n hi you there:
lines will be ['Hi huy ', ' hi you there']
words will be ["['Hi", 'huy', "',", "'", 'hi', 'you', "there']"], because you split str(lines), the string representation of the list, rather than the text itself.
I'd suggest splitting on any whitespace at once with words = re.split(r"\s+", s).
The IndexError then comes from cases like Hi where are you__ (where _ is a space): the split leaves an empty string at the end, and you can't access the first character of an empty string. So just add a condition to the if:
if word, because a zero-length string is falsy and any other string is truthy
if word[0].isupper() for your test
import re
def mapFileToCount(s):
    words = re.split(r"\s+", s)  # raw string, so \s reaches the regex engine unchanged
    up = 0
    for word in words:
        if word and word[0].isupper():  # skip empty strings before indexing
            up = up + 1
    return up
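To see where the empty string comes from, here is a minimal sketch comparing re.split with str.split() called without arguments. The name count_capitalized is a hypothetical stand-in for mapFileToCount, and the sketch assumes that any run of whitespace separates words.

```python
import re

text = "Hi where are you  "  # note the two trailing spaces

# re.split keeps an empty string after the trailing whitespace
print(re.split(r"\s+", text))  # ['Hi', 'where', 'are', 'you', '']

# str.split() with no arguments never yields empty strings
print(text.split())  # ['Hi', 'where', 'are', 'you']


def count_capitalized(s):
    """Hypothetical variant of mapFileToCount that relies on str.split()."""
    # Every item from split() has at least one character, so word[0] is safe.
    return sum(1 for word in s.split() if word[0].isupper())


print(count_capitalized("Hi huy \n hi you there"))  # 1
```

With str.split() the extra "if word" guard is not needed, at the cost of no longer using a regular expression.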

The error "string index out of range" means that the index you are trying to access does not exist in the string. In other words, you asked for the character at a position the string does not have.
In your code, that is word[0]: when word is an empty string, there is no character at index 0.
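A short sketch of the failure and one way to guard against it. The helper name starts_with_upper is hypothetical; it relies on the fact that slicing, unlike indexing, returns an empty string instead of raising.

```python
def starts_with_upper(word):
    """Hypothetical helper: True if the first character is an uppercase letter."""
    # word[:1] is "" for an empty string, and "".isupper() is False,
    # so no IndexError can occur here.
    return word[:1].isupper()


try:
    ""[0]  # indexing an empty string
except IndexError as exc:
    print(exc)  # string index out of range

print(starts_with_upper(""))       # False
print(starts_with_upper("Hello"))  # True
print(starts_with_upper("hello"))  # False
```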

Related

Why my function doesn't return a new string?

My assignment was to write a function with one string parameter that returns a string. The function should extract the words within this string, drop empty words as well as words that are equal to "end" and "exit", convert the
remaining words to upper case, join them with the joining token string ";" and return this
newly joined string.
This is my function, but if the string doesn't contain the words "exit" or "end", no new string is returned:
def fun(long_string):
    stop_words = ('end', 'exit', ' ')
    new_line = ''
    for word in long_string:
        if word in stop_words:
            new_line = long_string.replace(stop_words, " ")
    result = ';'.join(new_line.upper())
    return result
print(fun("this is a long string"))
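The if never compares whole words, since word is not a real "word": in your code it is each character of long_string. So the if is really comparing 't' with 'end', and so on, and 'end' or 'exit' can never match. The only entry a single character can match is ' ', and in that case long_string.replace(stop_words, " ") fails, because replace expects a string, not a tuple. For input without spaces, new_line therefore stays the empty string it was initialized to.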
You will need split to work with words:
def fun(long_string):
    # split into words, drop the stop words, upper-case the rest, join with ';'
    return ';'.join(word.upper() for word in long_string.split() if word not in ('end', 'exit'))
print(fun("this is a long string")) # THIS;IS;A;LONG;STRING
You don't need to check for empty words, because split() with no arguments treats any run of whitespace as a single separator and never produces empty strings.
for word in long_string will iterate over each character in long_string, not each word. The next line compares each character to the words in stop_words.
You probably want something like for word in long_string.split(' ') in order to iterate over the words.
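The difference between iterating over characters and iterating over words can be sketched as follows. The function name join_kept_words is hypothetical, and the sketch assumes words are separated by whitespace.

```python
def join_kept_words(long_string):
    """Hypothetical sketch: drop 'end'/'exit', upper-case the rest, join with ';'."""
    stop_words = ("end", "exit")
    kept = []
    for word in long_string.split():  # iterate over words, not characters
        if word not in stop_words:
            kept.append(word.upper())
    return ";".join(kept)


print(list("to end"))    # ['t', 'o', ' ', 'e', 'n', 'd']  (characters)
print("to end".split())  # ['to', 'end']                   (words)
print(join_kept_words("this is the end of a long string"))
# THIS;IS;THE;OF;A;LONG;STRING
```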

extracting words from a string without using the .split() function

I coded this in order to get a list of the words in a given string.
data=str(input("string"))
L=[]
word=""
for i in data:
    if i.isalpha() :
        word+=i
    elif :
        L.append(word)
        word=""
But when I run this code, it doesn't show the last word!
You can simply split words on a string using str.split() method, here is a demo:
data = input("string: ")
words = data.split()
L = []
for word in words:
    if word.isalpha():
        L.append(word)
print(L)
Note that .split() splits a string on any whitespace by default. If you want to split on commas instead, for example, you can use data.split(",").
You are not getting the last word in the list because no non-alphabetic character follows it, so the loop never reaches the else branch that saves the word.
Let's correct your code a little. I assume you want to check the words in the string rather than individual characters (right now you are checking each character, not each word):
data = input("Input the string: ")  # no need to cast: input() already returns a string
data = data + ' '  # trailing non-letter so the last word is saved too
l = []  # variable names should be lowercase
word = ""
for i in data:
    if i.isalpha():
        word += i
    else:  # a branch with no condition is else, not elif
        if word:  # skip empty words produced by consecutive non-letters
            l.append(word)
        word = ""  # reset so the next word starts from scratch
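An alternative to appending a space is to flush the last word once the loop has finished. A minimal sketch, with extract_words as a hypothetical name, that treats any non-letter as a separator:

```python
def extract_words(text):
    """Hypothetical helper: collect runs of letters without using str.split()."""
    words = []
    word = ""
    for ch in text:
        if ch.isalpha():
            word += ch          # still inside a word
        elif word:
            words.append(word)  # a separator ends the current word
            word = ""
    if word:                    # flush the last word if the text ends on a letter
        words.append(word)
    return words


print(extract_words("hello,  big world"))  # ['hello', 'big', 'world']
```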

Python - string index out of range issue

This is the question I was given to solve:
Create a program that inputs a phrase (like a famous quotation) and prints all of the words that start with h-z.
I solved the problem, but the first two methods didn't work and I wanted to know why:
#1 string index out of range
quote = input("enter a 1 sentence quote, non-alpha separate words: ")
word = ""
for character in quote:
    if character.isalpha():
        word += character.upper()
    else:
        if word[0].lower() >= "h":
            print(word)
            word = ""
        else:
            word = ""
I get the IndexError: string index out of range message for any words after "g". Shouldn't the else statement catch it? I don't get why it doesn't, because if I remove the brackets [] from word[0], it works.
#2: last word not printing
quote = input("enter a 1 sentence quote, non-alpha separate words: ")
word = ""
for character in quote:
    if character.isalpha():
        word += character.upper()
    else:
        if word.lower() >= "h":
            print(word)
            word = ""
        else:
            word = ""
In this example, it works to a degree. It eliminates any words before 'h' and prints words after 'h', but for some reason doesn't print the last word. It doesn't matter what quote I use; it doesn't print the last word even if it's after 'h'. Why is that?
You're indexing word[0]. This accesses the first character of the string word. If word is empty (that is, word == ""), there is no first character to access, so you get an IndexError. This happens whenever the loop reaches a non-alphabetic character while word is still empty, for example when the quote starts with a number or a dash, or when two separators appear in a row (such as a comma followed by a space).
The second problem, your second snippet leaving off the last word, comes from the approach itself. You walk through the sentence character by character and decide whether to print a word only after you have read past it (which you know because you hit a non-alphabetic character such as a space). The last character of your sentence usually isn't a space; it's just the last letter of the last word. So the else branch is never executed for that word, and it is never printed.
I'd recommend an entirely different approach, using the method str.split(). This method is built into Python and turns one string into a list of smaller strings, split on the character or substring you specify. So if I do
quote = "Hello this is a sentence"
words = quote.split(' ')
print(words)
you'll end up seeing this:
['Hello', 'this', 'is', 'a', 'sentence']
A couple of things to keep in mind on your next approach to this problem:
You need to account for empty words (like if I have two spaces in a row for some reason), and make sure they don't break the script.
You need to account for non-alphanumeric characters like numbers and dashes. You can either ignore them or handle them differently, but you have to have something in place.
You need to make sure that you handle the last word at some point, even if the sentence doesn't end in a space character.
Good luck!
Instead of what you're doing, you can iterate over each word in the string and check whether it begins with one of those letters. Read about the function str.split(): the parameter is the divider, in this case ' ' since you want to separate the words, and it returns a list of strings. Iterate over that list in the loop and it should work.
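The split-based approach suggested above could look like the following sketch. The function name words_h_to_z is hypothetical. It also assumes that stripping non-letters from each token is acceptable, so a hyphenated token such as well-known is treated as one word rather than two.

```python
def words_h_to_z(quote):
    """Hypothetical helper: return the upper-cased words that start with h-z."""
    selected = []
    for token in quote.split():  # split() copes with repeated spaces
        # Keep only letters, so "go," becomes "go" and "42" becomes ""
        word = "".join(ch for ch in token if ch.isalpha())
        if word and word[0].lower() >= "h":  # guard against empty words
            selected.append(word.upper())
    return selected


for word in words_h_to_z("Wheresoever you go, go with all your heart"):
    print(word)
```

Because the loop runs over a list of words, the last word is handled like any other and no trailing space is required.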

Python word in file change

I am trying to change the words that are nouns in a text to "noun".
I am having trouble. Here is what I have so far.
def noun(file):
    for word in file:
        for ch in word:
            if ch[-1:-3] == "ion" or ch[-1:-3] == "ism" or ch[-1:-3] == "ity":
                word = "noun"
        if file(word-1) == "the" and (file(word+1)=="of" or file(word+1) == "on"
            word = "noun"
        # words that appear after the
    return outfile
Any ideas?
Your slices are empty:
>>> 'somethingion'[-1:-3]
''
because the endpoint lies before the start. You could just use [-3:] here:
>>> 'somethingion'[-3:]
'ion'
But you'd be better off using str.endswith() instead:
ch.endswith(("ion", "ism", "ity"))
The function will return True if the string ends with any of the 3 given strings.
Note that ch is not actually a word; if word is a string, then for ch in word iterates over individual characters, and those are never going to end in 3-character strings, being only one character long themselves.
Your attempts to look at the next and previous words are also going to fail; you cannot use a list or file object as a callable, let alone use file(word - 1) as a meaningful expression (a string - 1 fails, as well as file(...)).
Instead of looping over the 'word', you could use a regular expression here:
import re
nouns = re.compile(r'(?<=\bthe\b)(\s*\w+(?:ion|ism|ity)\s*)(?=\b(?:of|on)\b)')
some_text = nouns.sub(' noun ', some_text)
This looks for words ending in your three substrings, but only if preceded by the and followed by of or on and replaces those with noun.
Demo:
>>> import re
>>> nouns = re.compile(r'(?<=\bthe\b)(\s*\w+(?:ion|ism|ity)\s*)(?=\b(?:of|on)\b)')
>>> nouns.sub(' noun ', 'the scion on the prism of doom')
'the noun on the noun of doom'
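If you would rather work on a list of words than use a regular expression, a sketch along these lines is possible. The name replace_nouns is hypothetical. It assumes the input is a plain string with whitespace-separated words, and it follows the question's code in treating the suffix test and the "the ... of/on" test as two independent conditions, which differs from the regular expression above (that one requires both). Original spacing and line breaks are not preserved.

```python
SUFFIXES = ("ion", "ism", "ity")


def replace_nouns(text):
    """Hypothetical sketch: replace likely nouns with the word 'noun'."""
    words = text.split()
    result = []
    for i, word in enumerate(words):
        # Look at the neighbours by index instead of calling file(word - 1)
        prev_word = words[i - 1] if i > 0 else None
        next_word = words[i + 1] if i + 1 < len(words) else None
        ends_like_noun = word.endswith(SUFFIXES)
        between_markers = prev_word == "the" and next_word in ("of", "on")
        result.append("noun" if ends_like_noun or between_markers else word)
    return " ".join(result)


print(replace_nouns("the scion on the prism of doom"))
# the noun on the noun of doom
```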

How might I create an acronym by splitting a string at the spaces, taking the character indexed at 0, joining it together, and capitalizing it?

My code
beginning = input("What would you like to acronymize? : ")
second = beginning.upper()
third = second.split()
fourth = "".join(third[0])
print(fourth)
I can't seem to figure out what I'm missing. The code is supposed to take the phrase the user inputs, put it all in caps, split it into words, join the first character of each word together, and print it. I feel like there should be a loop somewhere, but I'm not entirely sure if that's right or where to put it.
Say input is "Federal Bureau of Agencies"
Typing third[0] gives you the first element of the split, which is "FEDERAL" (the phrase was already upper-cased). You want the first character of each element in the split. Use a generator expression or list comprehension to apply [0] to each item in the list:
val = input("What would you like to acronymize? ")
print("".join(word[0] for word in val.upper().split()))
In Python, it would not be idiomatic to use an explicit loop here. Generator expressions are shorter and easier to read, and do not require the use of an explicit accumulator variable.
When you run the code third[0], Python will index the variable third and give you the first part of it.
The results of .split() are a list of strings. Thus, third[0] is a single string, the first word (all capitalized).
You need some sort of loop to get the first letter of each word, or else you could do something with regular expressions. I'd suggest the loop.
Try this:
fourth = "".join(word[0] for word in third)
There is a little for loop inside the call to .join(). Python calls this a "generator expression". The variable word will be set to each word from third, in turn, and then word[0] gets you the char you want.
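For comparison, here is a minimal sketch of the generator expression next to the equivalent explicit loop. Both function names are hypothetical, and the sketch assumes words are separated by whitespace.

```python
def acronym(phrase):
    """Hypothetical helper: first letter of each word, upper-cased."""
    # split() never yields empty strings, so word[0] is always safe
    return "".join(word[0] for word in phrase.split()).upper()


def acronym_with_loop(phrase):
    """Same result, written with an explicit accumulator variable."""
    letters = ""
    for word in phrase.split():
        letters += word[0]
    return letters.upper()


print(acronym("Federal Bureau of Agencies"))            # FBOA
print(acronym_with_loop("Federal Bureau of Agencies"))  # FBOA
```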
It works for me this way:
>>> a = "What would you like to acronymize?"
>>> a.split()
['What', 'would', 'you', 'like', 'to', 'acronymize?']
>>> ''.join([i[0] for i in a.split()]).upper()
'WWYLTA'
>>>
One intuitive approach would be:
get the sentence using input (or raw_input in Python 2)
split the sentence into a list of words
get the first letter of each word
join the letters with an empty string and capitalize the result
Here is the code:
sentence = input('What would you like to acronymize?: ')
words = sentence.split()  # split the sentence into words
just_first_letters = []  # a list containing just the first letter of each word
# traverse the list of words, adding the first letter of
# each word to just_first_letters
for word in words:
    just_first_letters.append(word[0])
result = "".join(just_first_letters).upper()  # join the first letters and capitalize them
print(result)
# acronym2.py
# Illustrates how to build an acronym (updated for Python 3)
import string
def main():
    sent = input("Enter the sentence: ")  # take the input sentence with spaces
    # capwords() capitalizes each word; split() turns the sentence into a list of words
    for word in string.capwords(sent).split():
        # print the first letter of each word on the same line,
        # so the letters are concatenated into the acronym
        print(word[0], end="")
    print()  # finish the line
main()
name = input("Enter a name with uppercase and lowercase letters: ")
print(f'the original string = {name}')
def uppercase(name):
    # keep only the uppercase characters of the name
    res = [char for char in name if char.isupper()]
    print("The uppercase characters in string are : " + "".join(res))
uppercase(name)
