Getting an IndexError: string index out of range - python

I'm not sure why I'm getting an
IndexError: string index out of range
with this code.
s = 'oobbobobo'
a = 0
for b in range(len(s)-1):
    if (s[b] == 'b') and (s[b+1] == 'o') and (s[b+2] == s[b]):
        a += 1
    elif (s[b] == 'b') and (s[b+1] == 'o') and None:
        break
print("Number of times bob occurs is: ", a)
I thought the elif statement would fix the error, so I'm lost.

In this case, the length of s is 9 which means that you're looping over range(8) and therefore the highest value that b will have is 7 (Stay with me, I'm going somewhere with this ...)
When b = 7 (on the last iteration of the loop), the conditional expression in the if statement is being checked which contains:
(s[b+2] == s[b])
Well, since b = 7, b + 2 = 9, but s[9] will be out of bounds (remember, Python is 0-indexed, so the highest index in a string of length 9 is 8).
I'm guessing that the fix is to just modify the range statement:
for b in range(len(s)-2):
    ...
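A minimal sketch of the corrected loop, wrapped in a hypothetical helper named count_bob (the name and the slice comparison are my own, not from the question). Stopping the range at len(s) - 2 means the last start index is len(s) - 3, so every three-character window stays inside the string:

```python
def count_bob(s):
    """Count (possibly overlapping) occurrences of 'bob' in s."""
    count = 0
    # The last valid start index is len(s) - 3, so the range stops at len(s) - 2.
    for b in range(len(s) - 2):
        if s[b:b + 3] == 'bob':
            count += 1
    return count


print("Number of times bob occurs is:", count_bob('oobbobobo'))  # prints 2
```

For strings shorter than three characters the range is empty, so the function returns 0 without indexing at all.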

Related

If, else return else value even when the condition is true, inside a for loop

Here is the function i defined:
def count_longest(field, data):
    l = len(field)
    count = 0
    final = 0
    n = len(data)
    for i in range(n):
        count = 0
        if data[i:i + l] is field:
            while data[i - l: i] == data[i:i + l]:
                count = count + 1
                i = i + 1
        else:
            print("OK")
        if final == 0 or count >= final:
            final = count
    return final
a = input("Enter the field - ")
b = input("Enter the data - ")
print(count_longest(a, b))
It works in some cases but gives incorrect output in most. I checked by printing the strings being compared, and even when they match, the code prints "OK", which should only happen when the condition is false. Take the simplest example: if I enter 'as' for the field and 'asdf' for the data, I should get a count of 1, since the longest run of the substring 'as' in 'asdf' is one. Instead, final is still 0 at the end of the program. I added the else branch just to check whether the if condition was being satisfied, and the program printed "OK", so it was not, even though data[0 : 0 + 2] is 'as' right at the start (2 being the length of the field).
There are a few things I notice when looking at your code.
First, use == rather than is to test for equality. The is operator checks if the left and right are referring to the very same object, whereas you want to properly compare them.
The following code shows that even numerical results that are equal might not be one and the same Python object:
print(2 ** 31 is 2 ** 30 + 2 ** 30) # <- False
print(2 ** 31 == 2 ** 30 + 2 ** 30) # <- True
(note: the first expression could either be False or True—depending on your Python interpreter).
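To make the difference concrete, here is a small sketch with values of my own choosing. Whether two equal strings or integers happen to be the same object is an implementation detail, so the example uses lists for the identity check, where a copy is guaranteed to be a distinct object:

```python
field = "as"
data = "asdf"
piece = data[0:2]

# == compares the contents, which is what the question needs.
print(piece == field)    # True

original = [1, 2, 3]
duplicate = list(original)  # a new list with the same contents

print(original == duplicate)  # True: same contents
print(original is duplicate)  # False: two separate objects
```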
Second, the while-loop looks rather suspicious. If you know you have found your sequence "as" at position i, you are repeating the while-loop as long as it is the same as in position i-1—which is probably something else, though. So, a better way to do the while-loop might be like so:
while data[i: i + l] == field:
    count = count + 1
    i = i + l # <- increase by l (length of field) !
Finally, something that might be surprising: changing the variable i inside the while-loop has no effect on the for-loop. That is, in the following example, the output will still be 0, 1, 2, 3, ..., 9, although it looks like it should skip every other element.
for i in range(10):
    print(i)
    i += 1
This does not affect the result of the function, but while debugging you might notice that the function appears to jump backward after finding a run and walks through parts of it again, printing additional "OK"s.
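A short sketch of that behaviour, with variable names of my own. The for loop rebinds i from the range on every iteration, so the increment is discarded; a while loop, where you manage the index yourself, is one way to really skip elements:

```python
seen_for = []
for i in range(10):
    seen_for.append(i)
    i += 1  # discarded: the for loop assigns the next range value to i anyway

seen_while = []
i = 0
while i < 10:
    seen_while.append(i)
    i += 2  # here the step is under our control

print(seen_for)    # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
print(seen_while)  # [0, 2, 4, 6, 8]
```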
UPDATE: Here is the complete function according to my remarks above:
def count_longest(field, data):
    l = len(field)
    count = 0
    final = 0
    n = len(data)
    for i in range(n):
        count = 0
        while data[i: i + l] == field:
            count = count + 1
            i = i + l
        if count >= final:
            final = count
    return final
Note that I made two additional simplifications. First, with my changes you end up with an if and a while that share the same condition, i.e.:
if data[i:i+l] == field:
    while data[i:i+l] == field:
        ...
In that case, the if is superfluous since it is already included in the condition of while.
Secondly, the condition if final == 0 or count >= final: can be simplified to just if count >= final:.
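As a quick sanity check, here is a variant of the function above with a few sample calls. The guard for an empty field is my own assumption (an empty field would otherwise match forever in the while loop), and the names step, run and pos are mine:

```python
def count_longest(field, data):
    """Length of the longest run of back-to-back copies of field in data."""
    step = len(field)
    if step == 0:
        return 0  # assumption: an empty field counts as no run at all
    longest = 0
    for start in range(len(data)):
        run = 0
        pos = start  # separate index, so the for-loop variable is left alone
        while data[pos:pos + step] == field:
            run += 1
            pos += step
        longest = max(longest, run)
    return longest


print(count_longest('as', 'asdf'))       # 1
print(count_longest('ab', 'abababxab'))  # 3
```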

String of numbers converted to int and added to a list (Optimization issue)

I've managed to make the code work but i believe it could be optimized... a lot.
The input is a string of numbers separated with spaces. Something like - 4 2 8 6 or 1 2 3 4 5 6 7
It has to find every three numbers that satisfy a + b == c, where 'b' is always to the right of 'a', and print each match to the console in the format 'a + b == c'. If there isn't a single match, print 'No'.
The only restriction is for 'b' to be at least 1 index away from 'a'.
This is what I have come up with.
lineOfNums = input('Line of numbers: ') # User input: example - 4 2 6 8
arrNums = lineOfNums.split()
conMet = False # Is the condition met at least once
for a in range(0, len(arrNums)):
    for b in range(a + 1, len(arrNums)):
        for c in range(0, len(arrNums)):
            if int(arrNums[a]) + int(arrNums[b]) == int(arrNums[c]):
                print(f'{arrNums[a]} + {arrNums[b]} == {arrNums[c]}')
                conMet = True
if conMet == False: print('No')
You can do it with itertools, first of course convert to int
from itertools import combinations
# Convert to int
arr= [int(i) for i in arrNums]
# Get all the combinations
psums = {sum(i): i for i in combinations(arr, 2)}
# Then loop once
for i, v in enumerate(arr):
    if v in psums:
        print(f'{psums[v][0]} + {psums[v][1]} == {v}')
The big O for this algorithm is O(n^2) on average, which comes from the n choose r pairs that combinations generates, where n is the number of inputs (4 in this example) and r is the count of numbers you're summing, in this case 2.
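One thing to watch with a plain dict keyed by sum: two different pairs with the same sum overwrite each other, so only the last pair is reported. A hedged sketch that keeps every pair, using collections.defaultdict (the function name find_sums and the 'No' fallback wiring are my own):

```python
from collections import defaultdict
from itertools import combinations


def find_sums(nums):
    """Return 'a + b == c' lines for every pair a, b (b right of a) whose sum is in nums."""
    pairs_by_sum = defaultdict(list)
    # combinations keeps input order, so b always comes from the right of a.
    for a, b in combinations(nums, 2):
        pairs_by_sum[a + b].append((a, b))

    lines = []
    for c in nums:
        # .get avoids creating empty entries for sums that never occur.
        for a, b in pairs_by_sum.get(c, []):
            lines.append(f'{a} + {b} == {c}')
    return lines or ['No']


print('\n'.join(find_sums([int(x) for x in '4 2 6 8'.split()])))
```

The matches come out grouped by c rather than in the order of the original triple loop, which may or may not matter for the assignment.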
First, do the integer conversion once when you create arrNums, not every time through the loops.
arrNums = [int(x) for x in lineOfNums.split()]
The outer loop only needs to go to len(arrNums)-1, since it needs to leave room for b to the right of it.
for a in range(0, len(arrNums)-1):
    for b in range(a + 1, len(arrNums)):
        for c in range(0, len(arrNums)):
            if arrNums[a] + arrNums[b] == arrNums[c]:
                print(f'{arrNums[a]} + {arrNums[b]} == {arrNums[c]}')
                conMet = True

Formatting unknown output in a table in Python

Help! I'm a Python beginner given the assignment of displaying the Collatz Sequence from a user-inputted integer, and displaying the contents in columns and rows. As you may know, the results could be 10 numbers, 30, or 100. I'm supposed to use '\t'. I've tried many variations, but at best, only get a single column. e.g.
def sequence(number):
    if number % 2 == 0:
        return number // 2
    else:
        result = number * 3 + 1
        return result
n = int(input('Enter any positive integer to see Collatz Sequence:\n'))
while sequence != 1:
    n = sequence(int(n))
    print('%s\t' % n)
    if n == 1:
        print('\nThank you! The number 1 is the end of the Collatz Sequence')
        break
Which yields a single vertical column, rather than the results being displayed horizontally. Ideally, I'd like to display 10 results left to right, and then go to another line. Thanks for any ideas!
Something like this maybe:
def get_collatz(n):
    return [n // 2, n * 3 + 1][n % 2]
while True:
    user_input = input("Enter a positive integer: ")
    try:
        n = int(user_input)
        assert n > 1
    except (ValueError, AssertionError):
        continue
    else:
        break
sequence = [n]
while True:
    last_item = sequence[-1]
    if last_item == 1:
        break
    sequence.append(get_collatz(last_item))
print(*sequence, sep="\t")
Output:
Enter a positive integer: 12
12 6 3 10 5 16 8 4 2 1
>>>
EDIT Trying to keep it similar to your code:
I would change your sequence function to something like this:
def get_collatz(n):
    if n % 2 == 0:
        return n // 2
    return n * 3 + 1
I called it get_collatz because I think that is more descriptive than sequence, it's still not a great name though - if you wanted to be super explicit maybe get_collatz_at_n or something.
Notice, I took the else branch out entirely, since it's not required. If n % 2 == 0, then we return from the function, so either you return in the body of the if or you return one line below - no else necessary.
For the rest, maybe:
last_number = int(input("Enter a positive integer: "))
while last_number != 1:
    print(last_number, end="\t")
    last_number = get_collatz(last_number)
In Python 3, print has an optional keyword parameter named end, which defaults to \n. It specifies the string printed at the very end of each print call. By simply changing it to \t, you can print all elements of the sequence on one line, separated by tabs (since each number in the sequence gets its own print call).
With this approach, however, you'll have to make sure to print the trailing 1 after the while loop has ended, since the loop will terminate as soon as last_number becomes 1, which means the loop won't have a chance to print it.
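Putting those two points together, here is a sketch of the end approach that also prints the trailing 1 and starts a new line after every tenth value, as the question asked. The per_row and file parameters and the function name print_collatz are my own additions, and the sketch assumes the input is a positive integer:

```python
def get_collatz(n):
    if n % 2 == 0:
        return n // 2
    return n * 3 + 1


def print_collatz(start, per_row=10, file=None):
    """Print the sequence from start down to 1, per_row tab-separated values per line."""
    position = 0
    last_number = start
    while last_number != 1:
        position += 1
        # End the line after every per_row-th value, otherwise separate with a tab.
        line_end = "\n" if position % per_row == 0 else "\t"
        print(last_number, end=line_end, file=file)
        last_number = get_collatz(last_number)
    print(last_number, file=file)  # the loop stops before the final 1, so print it here


print_collatz(27)
```

Passing file=None lets print fall back to standard output, so the extra parameter only matters if you want to capture the text somewhere else.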
Another way of printing the sequence (with separating tabs), would be to store the sequence in a list, and then use str.join to create a string out of the list, where each element is separated by some string or character. Of course this requires that all elements in the list are strings to begin with - in this case I'm using map to convert the integers to strings:
result = "\t".join(map(str, [12, 6, 3, 10, 5, 16, 8, 4, 2, 1]))
print(result)
Output:
12 6 3 10 5 16 8 4 2 1
>>>

Python if statement is only executed once

I'm trying to write a simple spell-checking program that takes two strings and adapts the first to the second. If the strings have the same length my code works fine, but if their lengths differ, the problems start. It only executes the if statements once and stops after that. If I remove the break statements, I get an IndexError: list index out of range.
Here is my code:
#!python
# -*- coding: utf-8 -*-
def edit_operations(first,second):
    a = list(first)
    b = list(second)
    counter = 0
    l_a = len(a)
    l_b = len(b)
    while True:
        if a == b:
            break
        if l_a > l_b:
            if a[counter] != b[counter]:
                a[counter] = ""
                c = "".join(a)
                print "delete", counter+1, b[counter], c
                counter += 1
                l_a -= 1
                break
        if l_a < l_b:
            if a[counter] != b[counter]:
                c = "".join(a)
                c = c[:counter] + b[counter] + c[counter:]
                print "insert", counter+1, b[counter], c
                counter += 1
                l_a += 1
                break
        if a[counter] != b[counter]:
            a[counter] = b[counter]
            c = "".join(a)
            print "replace", counter+1, b[counter], c
            counter += 1
        else:
            counter += 1
if __name__ == "__main__":
    edit_operations("Reperatur","Reparatur")
    edit_operations("Singel","Single")
    edit_operations("Krach","Stall")
    edit_operations("wiederspiegeln","widerspiegeln")
    edit_operations("wiederspiglen","widerspiegeln")
    edit_operations("Babies","Babys")
    edit_operations("Babs","Babys")
    edit_operations("Babeeees","Babys")
This is the output I get:
replace 4 a Reparatur
replace 5 l Singll
replace 6 e Single
replace 1 S Srach
replace 2 t Stach
replace 4 l Stalh
replace 5 l Stall
delete 3 d widerspiegeln
replace 3 d widderspiglen
replace 4 e wideerspiglen
replace 5 r widerrspiglen
replace 6 s widersspiglen
replace 7 p widersppiglen
replace 8 i widerspiiglen
replace 9 e widerspieglen
replace 11 e widerspiegeen
replace 12 l widerspiegeln
delete 4 y Babes
insert 4 y Babys
delete 4 y Babeees
By the last 3 lines you can see my problem and I'm kinda desperate right now.
Hopefully someone could give me a hint what is wrong with it
The answer to the question in the title (the if statement being executed only once) is already in a comment to your question: there are two breaks, one in each of the two blocks if l_a > l_b: and if l_a < l_b:.
In general, a break statement exits the innermost enclosing loop, no matter how deeply the break is nested inside if blocks.
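A small illustration of that rule, using made-up loops rather than the code from the question. In the first part break sits two if blocks deep and still ends the while loop; in the second it ends only the inner for loop:

```python
steps = 0
while True:
    steps += 1
    if steps > 0:
        if steps == 1:
            break  # nested inside two ifs, but it terminates the while loop

print(steps)  # 1

visited = []
for outer in range(3):
    for inner in range(3):
        if inner == 1:
            break  # leaves only the inner loop; the outer loop continues
        visited.append((outer, inner))

print(visited)  # [(0, 0), (1, 0), (2, 0)]
```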
However, other problems do appear in your code:
The size of the list a stays the same, yet a single counter is used to iterate over the letters of both strings. When the two strings differ in length, this eventually leads to IndexError: list index out of range, because the only condition that exits the loop is the two strings being equal. Also, when l_a > l_b, the character of b that mismatched should next be compared with the character after the deleted one, but this does not happen because of the shared counter.
When l_a < l_b, the list a is not modified; only a new string c is created with the additional letter. Please look at the list documentation.
counter is not updated correctly: when the two strings differ in length, it is incremented only if the letters differ. This leads to an infinite loop.
In general, consider using a debugger in order to figure out the issues (look at the debuggers available in python https://wiki.python.org/moin/PythonDebuggingTools). It is possible to find online or in a bookstore many resources to learn how to debug code.
You should make use of the list.insert() function to insert a character into the list, the del operator to remove a single character from the list, and move the a==b comparison into the while loop conditional. The variable counter should indicate the index of the next character to be compared, and should not be incremented if the characters are not equal. Like this:
#! python3
def edit_operations(first,second):
    a = list(first)
    b = list(second)
    counter = 0
    while a != b:
        if a[counter] != b[counter]:
            if len(a) > len(b):
                print("delete", counter + 1, a[counter])
                del a[counter]
            elif len(b) > len(a):
                print("insert", counter + 1, b[counter])
                a.insert(counter, b[counter])
            else:
                print("replace", counter + 1, b[counter])
                a[counter] = b[counter]
        else:
            counter += 1
    print("".join(a))
if __name__ == "__main__":
    edit_operations("Reperatur","Reparatur")
    edit_operations("Singel","Single")
    edit_operations("Krach","Stall")
    edit_operations("wiederspiegeln","widerspiegeln")
    edit_operations("wiederspiglen","widerspiegeln")
    edit_operations("Babies","Babys")
    edit_operations("Babs","Babys")
    edit_operations("Babeeees","Babys")
I've changed the print statements a little.
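If you would rather not maintain the index bookkeeping yourself, the standard library's difflib can produce a list of edit operations as well. This is a different technique from the loop above, not a rewrite of it, and the helper names edit_ops and apply_ops are hypothetical:

```python
from difflib import SequenceMatcher


def edit_ops(first, second):
    """Return (tag, start, end, replacement) for every non-equal opcode."""
    matcher = SequenceMatcher(None, first, second)
    return [(tag, i1, i2, second[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != 'equal']


def apply_ops(first, ops):
    """Apply the operations to first; for a delete the replacement is empty."""
    result = first
    # Work from right to left so the earlier indexes stay valid.
    for tag, i1, i2, replacement in reversed(ops):
        result = result[:i1] + replacement + result[i2:]
    return result


for tag, i1, i2, replacement in edit_ops("Babeeees", "Babys"):
    print(tag, i1 + 1, repr(replacement))
print(apply_ops("Babeeees", edit_ops("Babeeees", "Babys")))
```

The operations difflib picks are not guaranteed to match the character-by-character output of the loop above; the sketch only relies on the fact that applying them turns the first string into the second.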
I really don't understand what your question is but if you need a spelling checker just use this library

Returning the index of a string that is not within brackets

Suppose I have a string:
x = '[1.3].[1.2]'
How do I find the first index of "." that is not within the square brackets ([])?
So for the above example the first "." is at index 5, it is not at index 2 since at index 2 the "." is within the square brackets.
I tried doing x.index(".") but that only returns the index of the first "." and that "." can be within brackets.
I also tried doing x.index('].[') + 1 but that would fail for this example:
x = '[[1.3].[9.10]].[1.2.[4.[5.6]]]'
x.index('].[') + 1
6
Since the first "." that is not within brackets is at index 14
If anyone can help me out with this that would be really appreciated.
Essentially, you have two strings that each start with '[' and end with ']', and you connect them using '.', so
s1 = "[1.2]"
s2 = "[2.3]"
s1 + "." + s2
and basically I'm trying to get the index of the '.' after the strings are connected.
A simple “parser” for this:
def findRootIndexes (s):
    nested = 0
    for i, c in enumerate(s):
        if c == '[':
            nested += 1
        elif c == ']':
            nested -= 1
        elif c == '.' and nested == 0:
            yield i
>>> list(findRootIndexes('[1.3].[1.2]'))
[5]
>>> list(findRootIndexes('[[1.3].[9.10]].[1.2.[4.[5.6]]]'))
[14]
>>> list(findRootIndexes('[1.2].[3.4].[5.6]'))
[5, 11]
This is essentially a pushdown automaton except that we don’t need to track different tokens but just the opening and closing bracket. So we just need to count how many open levels we still have.
If you want to take it even further, you can—as roippi suggested in the comments—add some syntax checking to prevent things like [[1.2]]]. Or you could also add some additional checks to make sure that an opening [ is always preceded by a dot or another opening [. To do this, you could make it a one-look-behind parser. Something like this:
def findRootIndexes (s):
    nested = 0
    last = None
    for i, c in enumerate(s):
        if c == '[':
            if last not in (None, '[', '.'):
                raise SyntaxError('Opening bracket must follow either `[` or `.`')
            nested += 1
        elif c == ']':
            if nested == 0:
                raise SyntaxError('Closing bracket for non-open group')
            nested -= 1
        elif c == '.' and nested == 0:
            yield i
        last = c
But of course, if you create that string yourself from components you know that are valid, such checks are not really necessary.
In this solution we're counting the opening brackets. This is the easiest way I can imagine:
x = '[[1.3].[9.10]].[1.2.[4.[5.6]]]'
brackets = 0
pos = 0
for y in x:
    if y == '[':
        brackets += 1
    elif y == ']':
        brackets -= 1
    if brackets == 0:
        print(pos) # Find first occurrence and break from the loop
        break
    pos += 1
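Prints 13, the index of the closing bracket where the nesting level first returns to 0; the "." itself follows at index 14.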
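If you want the index of the "." itself rather than of the bracket before it, the same counting idea can be wrapped in a function. This is a sketch under my own naming (first_top_level_dot), and returning -1 when there is no such dot is an assumption modelled on str.find:

```python
def first_top_level_dot(s):
    """Return the index of the first '.' outside all brackets, or -1 if there is none."""
    depth = 0
    for index, char in enumerate(s):
        if char == '[':
            depth += 1
        elif char == ']':
            depth -= 1
        elif char == '.' and depth == 0:
            return index
    return -1


print(first_top_level_dot('[1.3].[1.2]'))                     # 5
print(first_top_level_dot('[[1.3].[9.10]].[1.2.[4.[5.6]]]'))  # 14
```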
