Python Index Error: string index out of range

## A little helper program that capitalizes the first letter of a word
def Cap (s):
    s = s.upper()[0]+s[1:]
    return s
Giving me this error :
Traceback (most recent call last):
  File "\\prov-dc\students\jadewusi\crack2.py", line 447, in <module>
    sys.exit(main(sys.argv[1:]))
  File "\\prov-dc\students\jadewusi\crack2.py", line 398, in main
    foundit = search_method_3("passwords.txt")
  File "\\prov-dc\students\jadewusi\crack2.py", line 253, in search_method_3
    ourguess_pass = Cap(ourguess_pass)
  File "\\prov-dc\students\jadewusi\crack2.py", line 206, in Cap
    s = s.upper()[0]+s[1:]
IndexError: string index out of range

As others have already noted, the problem is that you're trying to access an item in an empty string. Instead of adding special handling in your implementation, you can simply use capitalize:
'hello'.capitalize()
=> 'Hello'
''.capitalize()
=> ''

It blows up, presumably, because there is no indexing an empty string.
>>> ''[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range
And as it has been pointed out, splitting a string to call str.upper() on a single letter can be supplanted by str.capitalize().
Additionally, if you should regularly encounter a situation where this would be passed an empty string, you can handle it a couple of ways:
…#whatever previous code comes before your function
if my_string:
    Cap(my_string) #or str.capitalize, or…
Here, if my_string is more or less equivalent to if len(my_string) > 0.
And there's always ye old try/except, though I think you'll want to consider ye olde refactor first:
#your previous code, leading us to here…
try:
    Cap(my_string)
except IndexError:
    pass
I wouldn't stay married to indexing a string to call str.upper() on a single character, but you may have a unique set of reasons for doing so. All things being equal, though, str.capitalize() does much the same job; just note that it also lowercases the rest of the string.

>>> s = 'macGregor'
>>> s.capitalize()
'Macgregor'
>>> s[:1].upper() + s[1:]
'MacGregor'
>>> s = ''
>>> s[0]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range
>>> s[:1].upper() + s[1:]
''
Why does s[1:] not bail on an empty string?
Tutorial on strings says:
Degenerate slice indices are handled gracefully: an index that is too
large is replaced by the string size, an upper bound smaller than the
lower bound returns an empty string.
See also Python's slice notation.
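Putting the slicing behaviour above to work, here is a minimal sketch of an empty-safe replacement for Cap. The name cap_first is made up for illustration; unlike str.capitalize(), it leaves the rest of the string untouched.

```python
def cap_first(s):
    """Upper-case the first character and leave the rest untouched."""
    # s[:1] is '' for an empty string, so no IndexError is raised.
    return s[:1].upper() + s[1:]


print(cap_first("macGregor"))  # MacGregor
print(repr(cap_first("")))     # ''
```

Because both halves are slices, the function needs no special case for the empty string.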

I just had the same error while I was sure that my string wasn't empty, so I thought I'd share this here so that people who get this error have as many potential causes as possible to check.
In my case, I declared a one-character string, and Python apparently saw it as a char type. It worked when I added another character. I don't know why it doesn't convert it automatically, but this might be a reason for an "IndexError: string index out of range" even if you think that the supposed string is not empty.
It might differ between Python versions. I see the original question refers to Python 3; I was using Python 2.6 when this happened.

Related

Python : Get count of successfully matched groups for regex

I want to capture data and numbers from a string in python. The string is a measurement from an RF sensor so it might be corrupted from bad transmission. Strings from the sensor look like this PA1015.7 TMPA20.53 HUM76.83.
My regex is:
s = re.search(r'^(\D+)([0-9.]+)', message)
Now before I proceed I want to check if I truly received exactly two matches properly or if the string is garbled.
So I tried :
len(s)
But that errors out :
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: object of type '_sre.SRE_Match' has no len()
I do need access to the match group elements for processing later. (I think that eliminates findall)
key= s.group(1)
data= s.group(2)
What's missing?
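Use findall instead of search:
s = re.findall(r'(\D+)([0-9.]+)', message)
print("matched " + str(len(s)))
search returns a match object for the first match, or None if nothing matches. A match object has no len(), which is why len(s) raises the TypeError. findall returns a list with one tuple of groups per match, so len() works on its result.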
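If you would rather keep search so that the groups stay available for later processing, a sketch along these lines may be enough. The function name parse_reading is hypothetical; the pattern is the one from the question. When the match succeeds, groups() always holds exactly two items for this pattern, so testing for None is the only check needed.

```python
import re

# Pattern taken from the question: a non-digit key followed by a number.
PATTERN = re.compile(r'^(\D+)([0-9.]+)')


def parse_reading(message):
    """Return (key, data) for a well-formed reading, or None if garbled."""
    m = PATTERN.search(message)
    if m is None:
        # No match at all: the string did not start with key + number.
        return None
    key, data = m.groups()
    return key, data


print(parse_reading("PA1015.7"))   # ('PA', '1015.7')
print(parse_reading("1015.7"))     # None
```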

Empty string returned instead of IndexError [duplicate]

I'm really confused now:
>>> string = "some string"
>>> string[100:105]
''
>>> string[100]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
IndexError: string index out of range
Why does string[100:105] return an empty string whereas string[100] raises an IndexError (as expected)? Why does string[100:105] not raise an IndexError as well?
Slicing a string in Python, as in string[100:105], was designed specifically to fail gracefully: an out-of-range slice returns the empty string ''. See the Informal Introduction for more information.
Accessing a specific index of a string, as in string[100], was not designed to fail gracefully, so it raises an exception.
String slicing in Python will not raise errors; it just returns whatever part of the string falls inside the range you asked for (in this case an empty string, because there's nothing there).
Slicing returns a subsequence of items, and out-of-range bounds are clamped to the length of the sequence rather than rejected. If nothing lies within the requested range, you get back an empty string, as you saw.
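To see the difference for yourself, the built-in slice object can report the bounds it would actually use for a sequence of a given length. This is only a small demonstration, reusing the string from the question:

```python
s = "some string"          # length 11

# A slice's bounds are clamped to the sequence length before use.
start, stop, step = slice(100, 105).indices(len(s))
print(start, stop, step)   # 11 11 1
print(repr(s[100:105]))    # ''

# Plain indexing does no clamping, so it raises instead.
try:
    s[100]
except IndexError as exc:
    print("IndexError:", exc)  # IndexError: string index out of range
```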

IndexError: list index out of range, for re.match

Below is part of a script I wrote, in which I have a problem in the if statement.
If I use re.match('ATOM|MODEL',lines[i]), I get an error message. If I remove the "|MODEL" from the re.match call, it works.
Can anyone give me some hints as to why this happens? Thank you very much!
new_pdb=open(pdb_out,'w')
i=0
while (i<len(lines)):
    frag=lines[i].split()
    # do not know why 'ATOM|MODEL' does not work
    if (re.match('ATOM',lines[i]) and "H" not in frag[2]):
        new_pdb.write(lines[i])
    i=i+1
new_pdb.close()
Below is the error message when I used re.match('ATOM|MODEL',lines[i]):
Traceback (most recent call last):
  File "non-h-1.py", line 17, in <module>
    if (re.match('ATOM|MODEL',lines[i]) and "H" not in frag[2]):
IndexError: list index out of range
At least one of the lines that starts with MODEL contains fewer than three whitespace-separated items, so frag[2] fails. If you remove |MODEL from the regex, re.match() returns None for those lines, the and short-circuits, and Python doesn't even try to evaluate frag[2], which is why the error doesn't occur in that situation.
Other than that, you shouldn't be iterating over lines using a while loop - Python is not C. Use
for line in lines:
    frag = line.split()
    if (re.match('ATOM',line) and "H" not in frag[2]):
        new_pdb.write(line)
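If you do want to match 'ATOM|MODEL', one way to avoid the IndexError is to check the number of fields before indexing. The sketch below is an assumption about what you want: the function name keep_non_hydrogen is made up, and short lines (such as MODEL records) are kept as-is, which may or may not suit your file.

```python
import re


def keep_non_hydrogen(lines):
    """Return the ATOM/MODEL lines whose third field does not contain 'H'."""
    kept = []
    for line in lines:
        if not re.match('ATOM|MODEL', line):
            continue
        frag = line.split()
        # MODEL records may have fewer than three fields; keep them as-is
        # instead of indexing frag[2]. (This policy is an assumption.)
        if len(frag) < 3 or "H" not in frag[2]:
            kept.append(line)
    return kept


sample = ["MODEL 1\n", "ATOM 1 N ALA\n", "ATOM 2 H ALA\n", "HETATM 3 O HOH\n"]
print(keep_non_hydrogen(sample))  # ['MODEL 1\n', 'ATOM 1 N ALA\n']
```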

Converting from string to int in file from the internet

My program is supposed to download a file from the internet and then guess at a person's salary based on certain factors such as age, work, etc. It seems that it is not letting me turn the string into an int, which I need to do. As I'm still new to Python, any help would be appreciated. The main error occurs here:
below_count = 0
for row in myfile:
    if ages_midpoint > int(row[0]):
        count_below50+=1
The error is:
ValueError: invalid literal for int() with base 10: ''
The error message tells you exactly what is wrong.
If you're looping over a text file with for..in, that means the value row is a string (one line from the file).
You are looking at row[0] which is the first character of the string.
That character is a space (I assume, because calling [0] on the empty string would throw an exception), which is not a legal representation of a decimal number.
You need to go back to what you're actually trying to do and re-think how to do it, because this isn't it.
This might help.
>>> int('3')
3
>>> int('')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: ''
So you can strip off any whitespace and wrap the conversion in a try/except:
below_count = 0
for row in myfile:
    try:
        if ages_midpoint > int(row.strip()[0]):
            count_below50+=1
    except ValueError:
        # row[0] is not an integer character
        # do something here
        pass
The value of row[0] is an empty string. You can recreate the error by doing the following on the command line interpreter...
>>> int('')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: invalid literal for int() with base 10: ''
You might want to validate the value stored in row; if row: will do it.
Unfortunately there's not enough information in your post to be able to help you much beyond that. The line of code if ages_midpoint > int(row[0]): is very suspicious. Is row a line of text from a text file? If so, row[0] will return the first character, which is probably not what you want. Use the string split function instead: word = row.split(<charToSplitOn>)[0]
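Combining the split suggestion with the try/except idea might look something like the sketch below. The function name count_below and the comma separator are assumptions, since the post doesn't show what the file looks like; rows whose first field isn't a number (including blank lines) are simply skipped.

```python
def count_below(rows, midpoint, sep=","):
    """Count rows whose first field is an integer smaller than midpoint."""
    below = 0
    for row in rows:
        # Take the whole first field, not just the first character.
        field = row.split(sep)[0].strip()
        try:
            age = int(field)
        except ValueError:
            # Blank line or non-numeric field: skip it.
            continue
        if age < midpoint:
            below += 1
    return below


rows = ["39, State-gov\n", "\n", " 50, Private\n", "abc, x\n", "25, Self\n"]
print(count_below(rows, 50))  # 2
```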

TypeError: expected a character buffer object ITS SO ANNOYING

This is what it says on the interpreter...
Traceback (most recent call last):
  File "<pyshell#1>", line 1, in <module>
    nth_term_rule(list_a)
  File "C:\Users\Jimmy\Desktop\Python 2.7 Functions Pack 1\nth_term_rule.py", line 5, in nth_term_rule
    f.write(n)
TypeError: expected a character buffer object
This is really bugging me. I'm currently in the middle of the nth term rule function for my function pack and I'm attempting to make sure the sequence is steady; if not, the nth term rule would be wrong. So I'm trying to append to a text file that will have every number in the list in it. Then the interpreter will go through each number, making sure the difference between it and the next one is the same as the difference between the previous numbers.
Unfortunately it comes up with the error above and I don't know why.
Here is my code if it helps...
def nth_term_rule(a):
    for n in a:
        str(n)
        f = open("C:\Users\Jimmy\Desktop\Python 2.7 Functions Pack 1\Numbers.txt","a")
        f.write(n)
        f.close()
    if a[0] - a[1] == a[len(a)-2] - a[len(a)-1]:
        b=a[1] - a[0]
        c=a[0] - b
        return (b,'n + ',c)
    else:
        return ("Error.")
Any help would be much appreciated.
You are ignoring the return value of str():
str(n)
str() returns the new string, you want to assign this back to n:
n = str(n)
You probably want to avoid re-opening the file each loop iteration; just open it once:
filename = r"C:\Users\Jimmy\Desktop\Python 2.7 Functions Pack 1\Numbers.txt"
with open(filename, "a") as f:
    for n in a:
        f.write(str(n) + '\n')
This adds a few more things:
Using the file as a context manager (with the with statement) makes sure that it is closed again automatically, when the block ends.
Using a raw string literal (r'...') prevents \ being interpreted as an escape sequence. That way filenames that start with a t or n or r, etc. are not interpreted as special. See string literals for more info.
I assumed you probably wanted to have newlines between your values when written to the file.
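For completeness, here is a self-contained sketch of the same idea that can be run anywhere. The helper names append_numbers and has_constant_difference are made up, and a temporary directory stands in for your Desktop path. The second helper checks the "steady sequence" condition directly on the list, on the assumption that the file is not actually needed for that check.

```python
import os
import tempfile


def append_numbers(path, numbers):
    """Append each number to the file at path, one per line."""
    with open(path, "a") as f:
        for n in numbers:
            # write() needs a string, so convert explicitly.
            f.write(str(n) + "\n")


def has_constant_difference(numbers):
    """Return True if every consecutive pair differs by the same amount."""
    diffs = [b - a for a, b in zip(numbers, numbers[1:])]
    return len(set(diffs)) <= 1


path = os.path.join(tempfile.mkdtemp(), "Numbers.txt")
append_numbers(path, [3, 5, 7, 9])
with open(path) as f:
    print(f.read().split())                    # ['3', '5', '7', '9']
print(has_constant_difference([3, 5, 7, 9]))   # True
print(has_constant_difference([3, 5, 8]))      # False
```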
