Python IndexError : string index out of range in substring program - python

I am writing code for a class assignment that asks me to check whether one string is a substring of another using nested loops.
Basically my teacher wants us to show how the 'in' operator works, as in:
"ana" in "banana" will return True.
The goal of the program is to write a function of 2 parameters,
substring(subStr,fullStr)
that prints a sentence saying whether subStr is a substring of fullStr. My program is as follows:
def substring(subStr,fullStr):
    tracker=""
    for i in (0,(len(fullStr)-1)):
        for j in (0,(len(subStr)-1)):
            if fullStr[i]==subStr[j]:
                tracker=tracker+subStr[j]
                i+=1
                if i==(len(fullStr)-1):
                    break
        if tracker==subStr:
            print "Yes",subStr,"is a substring of",fullStr
When I called the function in the interpreter with substring("ana","banana"), it printed a traceback pointing at line 5, saying string index out of range:
if fullStr[i]==subStr[j]:
I'm banging my head trying to find the error. Any help would be appreciated

There are a few separate issues.
You are not resetting tracker in every iteration of the outer loop. This means that the leftovers from previous iterations contaminate later iterations.
You are not using range, and are instead looping over a tuple containing just 0 and the last index of each string.
You are incrementing the outer counter inside the inner loop, which bypasses the checks that the outer loop would otherwise perform.
You are not doing the bounds check correctly before trying to index into the outer string.
Here is a corrected version.
def substring(subStr,fullStr):
    for i in range(0,(len(fullStr))):
        tracker=""
        for j in range(0,(len(subStr))):
            if i + j >= len(fullStr):
                break
            if fullStr[i+j]==subStr[j]:
                tracker=tracker+subStr[j]
        if tracker==subStr:
            print "Yes",subStr,"is a substring of",fullStr
            return
substring("ana", "banana")

First off, your loops should be
for i in xrange(0,(len(fullStr))):
for example. i in (0, len(fullStr)-1) will have i take on the value of 0 the first time around, then take on len(fullStr)-1 the second time. I assume by your algorithm you want it to take on the intermediate values as well.
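To make the difference concrete, here is a small sketch (Python 3 syntax; the variable names are made up for illustration) that collects the values i takes when looping over the tuple versus looping over a range:

```python
word = "banana"

# Looping over a tuple visits only the two values stored in it.
tuple_values = [i for i in (0, len(word) - 1)]

# Looping over range visits every index of the string.
range_values = [i for i in range(len(word))]

print(tuple_values)  # [0, 5]
print(range_values)  # [0, 1, 2, 3, 4, 5]
```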
Now as for the error, consider i on the very last pass of the for loop. i is going to be equal to len(fullStr)-1. When we execute i+=1, i becomes equal to len(fullStr). This does not fulfill the condition i==len(fullStr)-1, so we do not break; the inner loop runs again, and fullStr[i] raises the IndexError. It would be better if you either made the condition if i>=len(fullStr)-1 or checked for i==len(fullStr) before your if fullStr[i]==subStr[j]: statement.
Lastly, though not related to the question specifically, you do not reset tracker each time you stop checking a certain match. You should place tracker = "" after the for i in xrange(0,(len(fullStr))): line. You also do not check whether tracker is correct after looping through the list starting at i, nor do you break from the inner loop when you get a mismatch (instead you continue and possibly pick up more letters that match, but not consecutively).
Here is a fully corrected version:
def substring(subStr,fullStr):
    for i in xrange(0,(len(fullStr))):
        tracker="" #this is going to contain the consecutive matches we find
        for j in xrange(0,(len(subStr))):
            if i==(len(fullStr)): #end of i; no match.
                break
            if fullStr[i]==subStr[j]: #okay, looks promising, check the next letter to see if it is a match,
                tracker=tracker+subStr[j]
                i+=1
            else: #found a mismatch, leave inner loop and check what we have so far.
                break
        if tracker==subStr:
            print "Yes",subStr,"is a substring of",fullStr
            return #we already know it is a substring, so we don't need to check the rest
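For comparison, here is one possible Python 3 sketch of the same nested-loop idea that returns a boolean instead of printing. The name is_substring is made up for this example, and it indexes with i + j rather than incrementing i, so treat it as an illustration rather than the required solution:

```python
def is_substring(sub, full):
    """Return True if sub occurs in full, using only nested loops."""
    # Only start positions where sub still fits inside full are tried.
    for i in range(len(full) - len(sub) + 1):
        for j in range(len(sub)):
            if full[i + j] != sub[j]:
                break  # mismatch: give up on this start position
        else:
            # The inner loop finished without a break, so every character matched.
            return True
    return False


print(is_substring("ana", "banana"))  # True
print(is_substring("nab", "banana"))  # False
```

Because the start positions stop at len(full) - len(sub), the index i + j can never run past the end of full, so no separate bounds check is needed.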

Related

Function result varies on each run

I have the following function that generates the longest palindrome of a string by removing and re-ordering the characters:
from collections import Counter
def find_longest_palindrome(s):
    count = Counter(s)
    chars = list(set(s))
    beg, mid, end = '', '', ''
    for i in range(len(chars)):
        if count[chars[i]] % 2 != 0:
            mid = chars[i]
            count[chars[i - 1]] -= 1
        else:
            for j in range(0, int(count[chars[i]] / 2)):
                beg += chars[i]
    end = beg
    end = ''.join(list(reversed(end)))
    return beg + mid + end
out = find_longest_palindrome('aacggg')
print(out)
I got this function by 'translating' this example from C++
Whenever I run my function, I get one of the following outputs, seemingly at random:
a
aca
agcga
The correct one in this case is 'agcga' as this is the longest palindrome for the input string 'aacggg'.
Could anyone suggest why this is occurring and how I could get the function to reliably return the longest palindrome?
P.S. The C++ code does not have this issue.
Your code depends on the order of list(set(s)).
But sets are unordered.
In CPython 3.4-3.7, the specific order you happen to get for sets of strings depends on the hash values for strings, which are explicitly randomized at startup, so it makes sense that you’d get different results on each run.
The reason you don’t see this in C++ is that the C++ set class template is not an unordered set, but a sorted set (based on a binary search tree, instead of a hash table), so you always get the same order in every run.
You could get the same behavior in Python by calling sorted on the set instead of just copying it to a list in whatever order it has.
But the code still isn’t correct; it just happens to work for some examples because the sorted order happens to give you the characters in most-repeated order. But that’s obviously not true in general, so you need to rethink your logic.
The most obvious difference introduced in your translation is this:
count[ch--]--;
… or, since you're looping over the characters by index instead of directly, more like:
count[chars[i--]]--;
Either way, this decrements the count of the current character, and then decrements the current character so that the loop will re-check the same character the next time through. You've turned this into something completely different:
count[chars[i - 1]] -= 1
This just decrements the count of the previous character.
In a for-each loop, you can't just change the loop variable and have any effect on the looping. To exactly replicate the C++ behavior, you'd either need to switch to a while loop, or put a while True: loop inside the for loop to get the same "repeat the same character" effect.
And, of course, you have to decrement the count of the current character, not decrement the count of the previous character that you're never going to see again.
for i in range(len(chars)):
    while True:
        if count[chars[i]] % 2 != 0:
            mid = chars[i]
            count[chars[i]] -= 1
        else:
            for j in range(0, int(count[chars[i]] / 2)):
                beg += chars[i]
            break
Of course you could obviously simplify this—starting with just looping for ch in chars:, but if you think about the logic of how the two loops work together, you should be able to see how to remove a whole level of indentation here. But this seems to be the smallest change to your code.
Notice that if you do this change, without the sorted change, the answer is chosen randomly when the correct answer is ambiguous—e.g., your example will give agcga one time, then aggga the next time.
Adding the sorted will make that choice consistent, but no less arbitrary.
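As one possible simplification (a sketch, not a translation of the C++ code), the function below loops over the sorted distinct characters, takes half of each count for the left side, and keeps the first character with an odd count as the middle. The name longest_palindrome and the "first odd character wins" rule are choices made for this example:

```python
from collections import Counter


def longest_palindrome(s):
    """Build a longest palindrome from the characters of s, deterministically."""
    counts = Counter(s)
    half = []
    middle = ''
    # sorted() fixes the iteration order, so every run gives the same answer.
    for ch in sorted(counts):
        pairs, leftover = divmod(counts[ch], 2)
        half.append(ch * pairs)
        # Only one odd character can sit in the middle; keep the first one seen.
        if leftover and not middle:
            middle = ch
    left = ''.join(half)
    return left + middle + left[::-1]


print(longest_palindrome('aacggg'))  # agcga
```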

Looping for loop in Python

I have the following code.
for idx in range(len(networks)):
    net_ = networks[idx]
    lastId=0
    for layerUptID in range(len(net_[1])):
        retNet,lastId=cn_.UpdateTwoConvLayers(deepcopy(net_),lastId)
        networks.append(retNet)
        if(lastId==-1):
            break
networks contains only one net at the beginning.
After running the line retNet,lastId=cn_.UpdateTwoConvLayers(deepcopy(net_),lastId) in the inner loop, six additional nets have been appended to networks.
After that lastId == -1, so control goes back to the first for loop, with len(networks) now 7.
The next idx is 1, and the loop continues.
Then len(networks) is 13, and control goes back to the first for loop again.
After this, the first for loop ends.
I am expecting it to continue with idx equal to 2, but it stops.
What could be the issue?
If you use a while loop instead of a for loop, the loop condition can check whether the loop has reached the last item in the networks collection.
This way the length of networks is recalculated on each loop iteration.
For starters: Iterating, or looping, over the list (or data) you're editing is bad practice. Keep that in mind while coding.
This means if you plan to edit what you're looping on, in your case networks, then you're going to have a bad time looping over it. I would advise to break it up into two code parts:
The first part creates a new list of whatever it is you want WHILE looping.
The second part replaces the list you've used to generate what you wanted.
Another thing which could go wrong is net_[i] may not be set up for some i, and you're trying to access it here:
for layerUptID in range(len(net_[1])):
What if there is nothing in net_[1]?
To avoid these errors, usually verifying your data is a great way to start. If it is not null, then proceed, otherwise, print an error.
This is what I can think of. Hope it helps.
If I understood correctly, your problem is that you have added new elements to networks, i.e. increased its length, and you expect the for loop to pick up these changes. It does not. Let's look at the following snippet:
elements = [1]
indices = range(len(elements))
for index in indices:
    print('index is', index)
    elements.append(2)
    print('elements count is', len(elements))
    print('indices count is', len(indices))
outputs are
index is 0
elements count is 2
indices count is 1
So, as we can see, although the length of the elements list has changed, the range object used in the for loop has not. This happens because range(len(elements)) is evaluated once, before the loop starts: len returns an int, which is immutable, so when the list grows its new length is a different object and the existing range knows nothing about the change.
Finally, we can use a while loop here, like:
while networks:
    net_ = networks.pop()
    lastId = 0
    for layerUptID in range(len(net_[1])):
        retNet, lastId = cn_.UpdateTwoConvLayers(deepcopy(net_), lastId)
        networks.append(retNet)
        if lastId == -1:
            break
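If the items should stay in the list rather than being popped, an index-based while loop is another option, because its condition re-reads len() on every pass. The sketch below uses a made-up expand function as a stand-in for the per-network update, since the real cn_.UpdateTwoConvLayers is not shown here:

```python
def expand(node, limit=3):
    """Hypothetical stand-in for the update step: returns new items to append."""
    return [node + 1] if node < limit else []


nodes = [0]
idx = 0
# len(nodes) is evaluated again on every pass, so appended items are visited too.
while idx < len(nodes):
    nodes.extend(expand(nodes[idx]))
    idx += 1

print(nodes)  # [0, 1, 2, 3]
```

The loop only terminates because expand eventually returns nothing; the real update step would need a similar stopping condition.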

Naive implementation of Karp-Rabin pattern matching algorithm

I'm having a problem implementing the naive version of the Karp-Rabin pattern matcher; I'm not getting the expected result. Here's my example:
string='today is a good day'
sub='good'
I would like to find the pattern good in the string above.
def kapr(n,m):
    for i in range(len(n)-len(m)+1):
        for j in range(len(m)):
            if n[i+j-1]!=m[j]:
                continue
        return i
    return 'not found'
print(kapr(string, sub))
Output=0
Expected output=11, should correspond with the offset of good in the string.
Thanks for your help.
You want break instead of continue. continue moves on to the next iteration of the inner loop, while break exits the inner loop. Furthermore, break on its own does not jump to the next iteration of the outer loop, so you would still hit the return i statement. To stop this from happening, you can use a for/else branch.
E.g.
for j in range(len(m)):
    if n[i+j-1]!=m[j]:
        break
else:
    return i
It will only return i if the inner loop completes normally.
The index it returns is also not zero-indexed, because the comparison uses n[i+j-1]; with the above modifications it will return 12. It should be simple to update if you want it to be zero-indexed: compare n[i+j] instead, which gives 11.
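Putting both fixes together, a complete zero-indexed version might look like the sketch below. The function name naive_search and the -1 return value for "not found" are choices made for this example, mirroring what str.find does:

```python
def naive_search(text, pattern):
    """Return the zero-based offset of pattern in text, or -1 if absent."""
    for i in range(len(text) - len(pattern) + 1):
        for j in range(len(pattern)):
            if text[i + j] != pattern[j]:
                break  # mismatch: try the next offset
        else:
            # No break happened, so the whole pattern matched at offset i.
            return i
    return -1


print(naive_search('today is a good day', 'good'))  # 11
```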

Trying to create a sorting algorithm from scratch in Python

I'm taking a course on programming (I'm a complete beginner) and the current assignment is to create a Python script that sorts a list of numbers in ascending order without using built-in functions like "sorted".
The script I started to come up with is probably laughably convoluted and inefficient, but I'd like to try to make it work eventually. In other words, I don't want to just copy someone else's script that's better.
Also, as far as I can tell, this script (if it ever functioned) would probably just put things in a NEW order that wouldn't necessarily be ascending. This is just a start, though, and hopefully I'll fix that later.
Anyway, I've run into several problems with the current incarnation, and the latest is that it just runs forever without printing anything.
So here it is (with hashes explaining what I was trying to accomplish). If someone could look over it and tell me why the code does not match my explanations of what each block is supposed to do, that would be great!
# The numbers to be inputted, could be anything
numList = [1, 25, 5, 6, 17, 4]
# The final (hopefully sorted) list
numSort = []
# The index
i = 0
# Run the loop until you run out of numbers
while len(numList) != 0:
    # If there's only one value left in numList,
    # attach it to the end of numSort.
    # (Variable 'basket' is just for transporting numbers)
    if len(numList) == 1:
        basket = numList.pop()
        numSort.append(basket)
    # The rest of the elifs are supposed to compare values
    # that sit next to each other.
    # If there's still a number in numList after 'i'
    # and 'i' is smaller than that next number
    # then pop 'i' and attach it to the end of numSort
    elif numList[i+1] != numList[-1] and numList[i] < numList[i+1]:
        basket = numList.pop(i)
        numSort.append(basket)
    # If there's NOT a number after 'i'
    # then compare 'i' to the first number in the list instead.
    elif numList[i+1] == numList[-1] and numList[i] < numList[0]:
        basket = numList.pop(i)
        numSort.append(basket)
    # If 'i' IS the last number in the list
    # and has nothing to compare itself to,
    # Then start over and go through it again
    # from the beginning
    elif numList[i+1] == numList[-1]:
        i = 0
    # If 'i' is not at the end of numList yet
    # and 'i' is NOT smaller than the next number
    # and there are still numbers left
    # then move on to the next pair
    # and continue comparing and moving numbers
    else:
        i = i+1
# Hopefully these will be in ascending order eventually.
print(numSort)
Here is a simple way to sort your list with a classic nested loop:
myList = [2,99,0,56,8,1]
for i in range(len(myList)):
    for j in range(len(myList)):
        # compare the current contents of the list, not values captured before earlier swaps
        if myList[i] < myList[j]: #for order desc use '>'
            myList[j],myList[i]=myList[i],myList[j]
print(myList)
The algorithm behind this code is:
fix one position in the list and compare its value with every other position
if the fixed value is smaller, swap the two values
I hope this will help you
Your conditions are essentially:
If there is only one number in the list
The current number is less than the next, which is not equal to the last number
The current number is less than the first number, and the next number is equal to the last number
The next number is equal to the last number
If you trace out the code by hand you will see how in many cases, none of these will evaluate to true and your "else" will be executed and i will be incremented. Under these conditions, certain numbers will never be removed from your original list (none of your elifs will catch them), i will increment until the next number is equal to the last number, i will be reset to zero, repeat. You are stuck in an infinite loop. You will need to update your if-statement in such a way that all numbers will eventually be caught by a block other than your final elif.
On a separate note, you are potentially comparing a number to only one number in the current list before appending it to the "sorted" list. You will likely need to compare the number you want to add to your "sorted" list and find its proper place in THAT list rather than merely appending.
You should also consider finding the end of list using a method more like
if i == len(numList) - 1:
This will compare the index to the length of the list rather than comparing more values in the list which are not necessarily relevant to the order you are trying to create.
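The suggestion of finding each number's proper place in the sorted list, rather than merely appending, could be sketched as below. This is only one way to do it (essentially an insertion sort); the name insertion_sort is made up here, and it uses list.insert, which is assumed to be allowed by the assignment:

```python
def insertion_sort(numbers):
    """Return a new list with the values of numbers in ascending order."""
    result = []
    for value in numbers:
        # Walk past every already-placed value that is not larger than value.
        pos = 0
        while pos < len(result) and result[pos] <= value:
            pos += 1
        # Put value in front of the first larger element (or at the end).
        result.insert(pos, value)
    return result


print(insertion_sort([1, 25, 5, 6, 17, 4]))  # [1, 4, 5, 6, 17, 25]
```

The outer loop always consumes one input value per pass, so it cannot get stuck the way the index-resetting loop can.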

how can i search for common elements in two integers with while loop

In my code I'm having a problem because I cannot compare two lists the way I want. What I am trying to do is compare the first elements of the two inputs, and if they are not the same, move on to the next element of the longer input, guess1. After finishing the comparisons for the first element, I want to do the same for the second element. What I mean is: first check (A-C)(A-A)(A-T), then (C-A)(C-T), and then (T-T).
I want a result list of (A,T) because of the ATT part of guess1.
However, I am stuck because I always find ACT, not A and T.
Where am I going wrong? I would be very glad if you could enlighten me.
Edit:
What I'm trying to do is look for the best similarity in the longer list, guess1, and find the most similar part, ATT.
GUESS1="CATTCG"
GUESS2="ACT"
if len(str(GUESS1))>len(str(GUESS2)):
    DNA_input_list=list((GUESS1))
    DNA_input1_list=list((GUESS2))
    common_elements=[]
    i=0
    while i<len(DNA_input1_list)-1:
        j=0
        while j<len(DNA_input_list)-len(DNA_input1_list):
            if DNA_input_list[i] == DNA_input1_list[j]:
                common_elements.append(DNA_input1_list[j])
                i+=1
            j+=1
            if j>len(DNA_input1_list)-1:
                break
    print(common_elements)
As far as I understand, you want to find a shorter string inside a longer string and, if it is not found, remove an element from the shorter string and repeat the search.
You can use the string find method in Python for that, i.e. "CATTCG".find('ACT'). This call returns -1 because there is no substring ACT. What you can then do is remove an element from the shorter string using the slice operator [x:] and repeat the search, like this:
>>> for x in range(len('ACT')):
...     if "CATTCG".find('ACT'[x:]) > -1 :
...         print("CATTCG".find('ACT'[x:]))
...         print("Match found for " + 'ACT'[x:])
In the code here, a range of offsets is generated first, i.e. [0, 1, 2]; this is the number of items we are going to slice off from the beginning.
In second line we do the slicing with 'ACT'[x:] (for x==0, we get 'ACT', for x == 1, we get 'CT' and for x==2, we get 'T').
The last two lines print out the position and the string that matched.
If I have understood everything correctly, you want to return the longest similar substring from GUESS2 which is included in GUESS1.
I would use something like this.
for count in range(len(GUESS2) + 1):
    if GUESS2[:count] in GUESS1:
        common_elements = GUESS2[:count]
print(common_elements) #if a function, return common_elements
Loop over every prefix length of the search string.
Then check whether that prefix is included in the other string.
If so, save it to a variable and print/return it after the loop has finished.
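Both answers above only try slices anchored at one end of the shorter string. If the goal is the longest piece from anywhere in the shorter string, a brute-force sketch like the following could be used. The name longest_shared_substring is made up, and the guess at what the question wants is an assumption:

```python
def longest_shared_substring(short, long_):
    """Return the longest slice of short that also occurs in long_ ('' if none)."""
    # Try the longest candidate lengths first, so the first hit is a longest one.
    for length in range(len(short), 0, -1):
        for start in range(len(short) - length + 1):
            piece = short[start:start + length]
            if piece in long_:
                return piece
    return ''


print(longest_shared_substring("ACT", "CATTCG"))  # A
print(longest_shared_substring("ATT", "CATTCG"))  # ATT
```

When several pieces of the same length match, this returns the one that starts earliest in the shorter string.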
