I have the following function that generates the longest palindrome of a string by removing and re-ordering the characters:
from collections import Counter
def find_longest_palindrome(s):
    count = Counter(s)
    chars = list(set(s))
    beg, mid, end = '', '', ''
    for i in range(len(chars)):
        if count[chars[i]] % 2 != 0:
            mid = chars[i]
            count[chars[i - 1]] -= 1
        else:
            for j in range(0, int(count[chars[i]] / 2)):
                beg += chars[i]
    end = beg
    end = ''.join(list(reversed(end)))
    return beg + mid + end
out = find_longest_palindrome('aacggg')
print(out)
I got this function by 'translating' this example from C++
Whenever I run my function, I get one of the following outputs, seemingly at random:
a
aca
agcga
The correct one in this case is 'agcga' as this is the longest palindrome for the input string 'aacggg'.
Could anyone suggest why this is occurring and how I could get the function to reliably return the longest palindrome?
P.S. The C++ code does not have this issue.
Your code depends on the order of list(set(s)).
But sets are unordered.
In CPython 3.4-3.7, the specific order you happen to get for sets of strings depends on the hash values for strings, which are explicitly randomized at startup, so it makes sense that you’d get different results on each run.
The reason you don’t see this in C++ is that the C++ set class template is not an unordered set, but a sorted set (based on a binary search tree, instead of a hash table), so you always get the same order in every run.
You could get the same behavior in Python by calling sorted on the set instead of just copying it to a list in whatever order it has.
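As a minimal illustration of that change (using the input string from the question), sorting the set gives a list whose order no longer depends on hash randomization:

```python
s = 'aacggg'

# list(set(s)) may come back in a different order on each run,
# because string hashes are randomized per process.
unordered = list(set(s))

# sorted() always returns the characters in ascending order,
# which mirrors how a C++ std::set iterates.
chars = sorted(set(s))
print(chars)  # ['a', 'c', 'g']
```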
But the code still isn’t correct; it just happens to work for some examples because the sorted order happens to give you the characters in most-repeated order. But that’s obviously not true in general, so you need to rethink your logic.
The most obvious difference introduced in your translation is this:
count[ch--]--;
… or, since you're looping over the characters by index instead of directly, more like:
count[chars[i--]]--;
Either way, this decrements the count of the current character, and then decrements the current character so that the loop will re-check the same character the next time through. You've turned this into something completely different:
count[chars[i - 1]] -= 1
This just decrements the count of the previous character.
In a for-each loop, you can't just change the loop variable and have any effect on the looping. To exactly replicate the C++ behavior, you'd either need to switch to a while loop, or put a while True: loop inside the for loop to get the same "repeat the same character" effect.
And, of course, you have to decrement the count of the current character, not decrement the count of the previous character that you're never going to see again.
for i in range(len(chars)):
    while True:
        if count[chars[i]] % 2 != 0:
            mid = chars[i]
            count[chars[i]] -= 1
        else:
            for j in range(0, int(count[chars[i]] / 2)):
                beg += chars[i]
            break
You could simplify this further, starting with looping directly with for ch in chars:, and if you think about how the two loops work together, you should be able to see how to remove a whole level of indentation. But this seems to be the smallest change to your code.
Notice that if you make this change without the sorted change, the answer is chosen randomly when the correct answer is ambiguous. For example, your input will give agcga on one run and aggga on the next.
Adding the sorted will make that choice consistent, but no less arbitrary.
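Putting the pieces above together, one possible simplified version might look like the sketch below. It is not the C++ logic line for line: it loops over the characters directly instead of by index, and it assumes it is acceptable to keep the first odd-count character (in sorted order) as the middle rather than the last one seen.

```python
from collections import Counter

def find_longest_palindrome(s):
    count = Counter(s)
    half = []   # one character from every matched pair
    mid = ''    # at most one leftover character can sit in the middle
    for ch in sorted(count):
        pairs, leftover = divmod(count[ch], 2)
        half.append(ch * pairs)
        if leftover and not mid:
            mid = ch
    beg = ''.join(half)
    # Mirror the first half to build the second half.
    return beg + mid + beg[::-1]

print(find_longest_palindrome('aacggg'))  # agcga
```

Every character contributes all of its pairs, so the result has the maximum possible length; only the choice of middle character is arbitrary.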
I've seen some posts about this same question, and I think my logic is pretty much the same as their answers. But I cannot find where exactly I'm wrong here.
My code first checks the length of the provided sequence; if it is 2 or less, it automatically returns True.
Next, it removes (pops) the first element and checks whether the rest are in ascending order.
If the sequence isn't in order, it restores the original sequence and repeats the second step, but this time it removes the next element (pop(i)).
This continues until there are no more elements to remove, at which point the function returns False.
If in any of the iterations, the list is found to be in ascending order, the function returns True.
This is the code:
def almostIncreasingSequence(sequence):
    original = sequence.copy()
    if len(sequence) <= 2: return True
    for i in range(len(sequence)):
        sequence.pop(i)
        # print(sequence)
        for j in range(len(sequence)-1):
            if sequence[j+1] <= sequence[j]:
                sequence = original.copy()
            elif j+1 == len(sequence)-1:
                return True
        if i == len(sequence)-1:
            return False
And this is my result :'(
I think my logic may not be correctly implemented in the code. But I don't know how to test it. It'd be helpful if you can give me a sequence where this function will give a wrong answer.
Solve almostIncreasingSequence (Codefights)
This is one of the posts I was referring to at the very beginning. It also explains the almostIncreasingSequence(sequence) question and the answer explains the logic behind the code.
You don't have to try every element. Just find the violation of the ascension, and try to resolve it by removing one of the violators. Then check the rest of the list.
More formally, suppose that sequence[:i+1] is in ascending order, but sequence[i+1] <= sequence[i]. You cannot keep them both; one must go. Which one depends on sequence[i-1].
If sequence[i+1] <= sequence[i-1], removing sequence[i] wouldn't help: a violation would remain. Therefore, remove sequence[i+1]. Otherwise, remove sequence[i] (do you see why?). If i is 0 there is no sequence[i-1], and removing sequence[0] is the safe choice. Finally, check that the rest of the sequence is ascending.
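A rough sketch of this idea follows. The names almost_increasing_sequence and is_increasing are made up for illustration, and instead of consulting sequence[i-1] to decide which violator to drop, this version simply tries both removals at the first violation, which is a slight variation on the rule described above.

```python
def almost_increasing_sequence(sequence):
    def is_increasing(seq):
        # Strictly ascending: every element is smaller than its successor.
        return all(a < b for a, b in zip(seq, seq[1:]))

    for i in range(len(sequence) - 1):
        if sequence[i + 1] <= sequence[i]:
            # First violation found: one of the two violators must go.
            without_current = sequence[:i] + sequence[i + 1:]
            without_next = sequence[:i + 1] + sequence[i + 2:]
            return is_increasing(without_current) or is_increasing(without_next)
    # No violation at all: the sequence is already strictly ascending.
    return True

print(almost_increasing_sequence([1, 3, 2]))     # True
print(almost_increasing_sequence([1, 3, 2, 1]))  # False
```

Only the first violation is examined, so the list is scanned a constant number of times rather than once per element.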
I'm trying to understand just how a Python for loop iterates. I know how to iterate in C++, but I have been asked to write this program in Python. Forgive my limited knowledge of Python; I am by no means an expert on the subject.
I've googled many possible solutions, however, they have not given actual guidance to my issue. Meaning that there was never an actual explanation as to how the coding works to iterate one by one and to be able to match 3 consecutive indexes.
for i in range(0, len(dna)):
if dna[i] == 'A' & dna[i+1] == 'T' & dna[i+2] == 'G':
protein_sequence[dna[i:i+3]]
//for i in range(0, len(dna)-(3+len(dna)%3), 3):
// if protein[dna[i:i+3]] == "ATG":
// protein_sequence += protein[dna[i:i+3]]
if protein[dna[i:i+3]] == "STOP" :
break
protein_sequence += protein[dna[i:i+3]]
What I am trying to do is to iterate through and match an "exact" three character sequence. Once the sequence is found then I can iterate through by sequences of 3's until I match the "Stop" sequence. The for loop that is commented out didn't work either as far as finding the "Start" trigger to initiate the for loop. Thank you in advance for assistance.
In Python, there is no such thing as a multiple index match; in case you need to look up the surrounding values of an element in an array, use a sliding window of size len(pattern):
def match(s, pattern):  # returns the index of the FIRST match, or None
    # the + 1 lets the window also cover a match at the very end of s
    for start in range(len(s) - len(pattern) + 1):
        if s[start:start + len(pattern)] == pattern:
            return start
    return None
idx = match(dna, "ATG")
if idx is not None:
    pass  # do something witty with it instead
Of course, this performs poorly on large data: its worst-case time complexity is O(n*m) for a text of length n and a pattern of length m. For large inputs you'll need faster algorithms, like Aho-Corasick or KMP.
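For a fixed pattern such as a start codon, Python's built-in str.find performs this kind of search and returns -1 when there is no match. The sketch below combines it with a step-by-three loop. The translate function and the codon_table dictionary are hypothetical names; the table is assumed to map codons to amino-acid letters and stop codons to "STOP", like the protein table in the question, and its three entries are only enough for the demo string.

```python
# Hypothetical, deliberately tiny codon table (assumption for the demo).
codon_table = {"ATG": "M", "GCT": "A", "TAA": "STOP"}

def translate(dna, table):
    start = dna.find("ATG")  # index of the first start codon, or -1
    if start == -1:
        return ""
    protein = []
    # Step through whole codons only, beginning at the start codon.
    for i in range(start, len(dna) - 2, 3):
        amino = table[dna[i:i + 3]]
        if amino == "STOP":
            break
        protein.append(amino)
    return "".join(protein)

print(translate("CCATGGCTTAAGG", codon_table))  # MA
```

Whether the start codon itself should contribute a letter to the result is a design choice; this sketch includes it.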
You could simplify this by using the split function, limited to the first occurrence of 'atg', and then running your three-letter loop over the remainder:
dna='cgatgxggctatgaatcttccggtaatg'
z=dna.split('atg',1)
Output:
z
['cg', 'xggctatgaatcttccggtaatg']
I'm taking a course on programming (I'm a complete beginner) and the current assignment is to create a Python script that sorts a list of numbers in ascending order without using built-in functions like "sorted".
The script I started to come up with is probably laughably convoluted and inefficient, but I'd like to try to make it work eventually. In other words, I don't want to just copy someone else's script that's better.
Also, as far as I can tell, this script (if it ever functioned) would probably just put things in a NEW order that wouldn't necessarily be ascending. This is just a start, though, and hopefully I'll fix that later.
Anyway, I've run into several problems with the current incarnation, and the latest is that it just runs forever without printing anything.
So here it is (with hashes explaining what I was trying to accomplish). If someone could look over it and tell me why the code does not match my explanations of what each block is supposed to do, that would be great!
# The numbers to be inputted, could be anything
numList = [1, 25, 5, 6, 17, 4]
# The final (hopefully sorted) list
numSort = []
# The index
i = 0
# Run the loop until you run out of numbers
while len(numList) != 0:
    # If there's only one value left in numList,
    # attach it to the end of numSort.
    # (Variable 'basket' is just for transporting numbers)
    if len(numList) == 1:
        basket = numList.pop()
        numSort.append(basket)
    # The rest of the elifs are supposed to compare values
    # that sit next to each other.
    # If there's still a number in numList after 'i'
    # and 'i' is smaller than that next number
    # then pop 'i' and attach it to the end of numSort
    elif numList[i+1] != numList[-1] and numList[i] < numList[i+1]:
        basket = numList.pop(i)
        numSort.append(basket)
    # If there's NOT a number after 'i'
    # then compare 'i' to the first number in the list instead.
    elif numList[i+1] == numList[-1] and numList[i] < numList[0]:
        basket = numList.pop(i)
        numSort.append(basket)
    # If 'i' IS the last number in the list
    # and has nothing to compare itself to,
    # Then start over and go through it again
    # from the beginning
    elif numList [i+1] == numList[-1]:
        i = 0
    # If 'i' is not at the end of numList yet
    # and 'i' is NOT smaller than the next number
    # and there are still numbers left
    # then move on to the next pair
    # and continue comparing and moving numbers
    else:
        i = i+1
# Hopefully these will be in ascending order eventually.
print(numSort)
Here is a simple way to sort your list with a classic loop:
myList = [2, 99, 0, 56, 8, 1]
for i in range(len(myList)):
    for j in range(len(myList)):
        # compare the current list contents, not values captured earlier
        if myList[i] < myList[j]:  # for descending order use '>'
            myList[j], myList[i] = myList[i], myList[j]
print(myList)
The algorithm behind this code is:
fix one position in the list and compare its value with every other value
if it is smaller, swap the two values
I hope this helps.
Your conditions are essentially:
If there is only one number in the list
The current number is less than the next, which is not equal to the last number
The current number is less than the first number, and the next number is equal to the last number
The next number is equal to the last number
If you trace out the code by hand you will see how in many cases, none of these will evaluate to true and your "else" will be executed and i will be incremented. Under these conditions, certain numbers will never be removed from your original list (none of your elifs will catch them), i will increment until the next number is equal to the last number, i will be reset to zero, repeat. You are stuck in an infinite loop. You will need to update your if-statement in such a way that all numbers will eventually be caught by a block other than your final elif.
On a separate note, you are potentially comparing a number to only one number in the current list before appending it to the "sorted" list. You will likely need to compare the number you want to add to your "sorted" list and find its proper place in THAT list rather than merely appending.
You should also consider detecting the end of the list with a check more like
if i == len(numList) - 1:
This will compare the index to the length of the list rather than comparing more values in the list which are not necessarily relevant to the order you are trying to create.
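To illustrate the suggestion of finding each number's proper place in the sorted list rather than merely appending, here is a minimal insertion-sort sketch. The function name insertion_sort is made up for illustration, and it assumes that list methods such as insert are allowed by the assignment (only sorted and similar built-ins being off limits).

```python
def insertion_sort(numbers):
    result = []
    for number in numbers:
        # Walk past every element that should stay in front of `number`.
        pos = 0
        while pos < len(result) and result[pos] <= number:
            pos += 1
        # Insert at the first position whose element is larger.
        result.insert(pos, number)
    return result

print(insertion_sort([1, 25, 5, 6, 17, 4]))  # [1, 4, 5, 6, 17, 25]
```

Because every number is placed relative to the already sorted result, the loop always terminates after one pass over the input.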
I have discovered something in python today. But haven't found a clear explanation for it yet.
In python it seems that this works:
variable += a_single_statement
So, following statements are correct:
variable += another_variable
variable += (another_variable - something_else)
But doing the following is incorrect:
variable += a_variable - b_variable
Could someone explain why this is the case, preferably with a link to the documentation to the syntactical structure that explains what the operands of a += operator are, what expressions are expected and what their structure is? Also, are my thoughts, outlined above, even correct?
The behavior seems to be different from other programming languages I'm used to, and that last 'statement' leads to a syntax error.
Edit: the code where it doesn't work. It might be a whitespace error instead :/
T = input()
counter = 0
# For each word, figure out edit length to palindrome
for _ in range(T):
    counter += 1
    word = raw_input()
    word_len = len(word) #stored for efficiency
    index = 0
    sum_edits = 0
    # Iterate half the word and always compare characters
    # at equal distance d from the beginning and from
    # the ending of the word
    while index < word_len/2.0:
        sum_edits += max(ord(word[index]), ord(word[word_len-index-1])) -
            min(ord(word[index]), ord(word[word_len - index - 1]))
        index += 1
    print sum_edits
It's code to detect how many edits it would take to make a word into a palindrome, if you could only change letters 'downwards' towards an 'a'.
Does this mean you can not arbitrarily break up a line in python code, if it's clear that the 'expression' has to continue anyway? Or can you only break up lines of code if they are surrounded with parentheses?
Sorry, I'm very new to python.
It has nothing to do with +=. It's just that Python doesn't let you split a statement across multiple lines unless there's an open (, {, or [, or unless you perform a line continuation with \. It might look obvious that those two lines are supposed to be one statement, but then when you have statements like
a = loooooooooooooooooooooooooooooong_thiiiiiiiiiiiiiiiiiiiiiiiing
+ ooooooooooooootheeeeeeeeeeeeer_thiiiiiiiiiiiiiiiiiiiing
is that one statement or two? If you allow
a = loooooooooooooooooooooooooooooong_thiiiiiiiiiiiiiiiiiiiiiiiing +
ooooooooooooootheeeeeeeeeeeeer_thiiiiiiiiiiiiiiiiiiiing
to be one statement, then either interpretation of having the + operator on the second line is confusing and bug-prone. JavaScript tries to allow this kind of thing, and its automatic semicolon insertion causes all kinds of problems.
It's usually recommended to use parentheses if you're not already inside brackets or braces.
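As a small sketch of that recommendation, the statement from the question can be wrapped in parentheses so that it may span two lines. The function name edits_to_palindrome is made up for illustration, and the code uses Python 3 syntax, whereas the question's code is Python 2.

```python
def edits_to_palindrome(word):
    total = 0
    for index in range(len(word) // 2):
        left = ord(word[index])
        right = ord(word[len(word) - index - 1])
        # The open parenthesis lets the expression continue on the next line.
        total += (max(left, right) -
                  min(left, right))
    return total

print(edits_to_palindrome("abc"))  # 2

# Without brackets, a line that begins with an operator is a new statement:
a = 10
+ 5        # valid Python, but it only evaluates +5 and discards the result
print(a)   # 10
```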
I am writing code for a class that wants me to check for a substring in a string using nested loops.
Basically my teacher wants to demonstrate how the 'in' operator works, as in:
"ana" in "banana" will return True.
The goal of the program is to make a function of 2 parameters,
substring(subStr,fullStr)
that will print out a sentence saying if subStr is a substring of fullStr, my program is as follows:
def substring(subStr,fullStr):
    tracker=""
    for i in (0,(len(fullStr)-1)):
        for j in (0,(len(subStr)-1)):
            if fullStr[i]==subStr[j]:
                tracker=tracker+subStr[j]
                i+=1
                if i==(len(fullStr)-1):
                    break
    if tracker==subStr:
        print "Yes",subStr,"is a substring of",fullStr
When i called the function in the interpreter 'substring("ana","banana")', it printed out a traceback error on line 5 saying string index out of range:
if fullStr[i]==subStr[j]:
I'm banging my head trying to find the error. Any help would be appreciated
There are a few separate issues.
You are not resetting tracker in every iteration of the outer loop. This means that the leftovers from previous iterations contaminate later iterations.
You are not using range, and are instead looping over a tuple of just the 0 and the length of each string.
You are trying to increment the outer counter and skipping checks for the iteration of the outer loop.
You are not doing the bounds check correctly before trying to index into the outer string.
Here is a corrected version.
def substring(subStr,fullStr):
    for i in range(0,(len(fullStr))):
        tracker=""
        for j in range(0,(len(subStr))):
            if i + j >= len(fullStr):
                break
            if fullStr[i+j]==subStr[j]:
                tracker=tracker+subStr[j]
        if tracker==subStr:
            print "Yes",subStr,"is a substring of",fullStr
            return
substring("ana", "banana")
First off, your loops should be
for i in xrange(0,(len(fullStr))):
for example. i in (0, len(fullStr)-1) will have i take on the value of 0 the first time around, then take on len(fullStr)-1 the second time. I assume by your algorithm you want it to take on the intermediate values as well.
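The difference is easy to see by collecting the values each loop visits. This small demo uses Python 3's range (the xrange shown above is its Python 2 counterpart) and the string from the question:

```python
fullStr = "banana"

# Looping over a tuple visits only the two values it contains.
tuple_indices = [i for i in (0, len(fullStr) - 1)]
print(tuple_indices)   # [0, 5]

# range() produces every index from 0 up to len(fullStr) - 1.
range_indices = [i for i in range(0, len(fullStr))]
print(range_indices)   # [0, 1, 2, 3, 4, 5]
```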
Now as for the error, consider i on the very last pass of the for loop. i is going to be equal to len(fullStr)-1. When we execute i+=1, i becomes equal to len(fullStr). This does not fulfill the condition i==len(fullStr)-1, so we do not break; we loop again and crash. It would be better to either make the check if i>=len(fullStr)-1 or test i==len(fullStr)-1 before your if fullStr[i]==subStr[j]: statement.
Lastly, though not related to the question specifically, you do not reset tracker each time you stop checking a certain match. You should place tracker = "" after the for i in xrange(0,(len(fullStr))): line. You also do not check whether tracker is correct after looping through the list starting at i, nor do you break from the loop when you get a mismatch (instead continuing and possibly picking up more letters that match, but not consecutively).
Here is a fully corrected version:
def substring(subStr,fullStr):
    for i in xrange(0,(len(fullStr))):
        tracker="" #this is going to contain the consecutive matches we find
        for j in xrange(0,(len(subStr))):
            if i==(len(fullStr)): #end of i; no match.
                break
            if fullStr[i]==subStr[j]: #okay, looks promising, check the next letter to see if it is a match,
                tracker=tracker+subStr[j]
                i+=1
            else: #found a mismatch, leave inner loop and check what we have so far.
                break
        if tracker==subStr:
            print "Yes",subStr,"is a substring of",fullStr
            return #we already know it is a substring, so we don't need to check the rest
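For comparison, here is a compact Python 3 sketch of the same nested-loop idea that returns a boolean instead of printing and does without the tracker string. The name is_substring is made up for illustration; the for/else construct runs the else branch only when the inner loop finishes without hitting break.

```python
def is_substring(sub, full):
    # Only start positions that leave room for the whole of `sub` are tried.
    for i in range(len(full) - len(sub) + 1):
        for j in range(len(sub)):
            if full[i + j] != sub[j]:
                break  # mismatch: move on to the next start position
        else:
            # Inner loop completed without a break: every character matched.
            return True
    return False

print(is_substring("ana", "banana"))  # True
print(is_substring("nab", "banana"))  # False
```

Limiting the outer range this way means the index i + j can never run past the end of full, so no separate bounds check is needed.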