I'm trying to write a function that finds the end of an open reading frame, using a dictionary that contains only the stop codons. The program reads three letters at a time, and if those three letters form a stop codon, it stops and returns the number of letters read so far (the stop codon is NOT counted, nor is anything after it). For example, nextStop2('AAAAAAAGTGGGTGCTAGGTTGGC') should return 15. I'm not sure why, but the code I wrote below doesn't work. Can anyone give me advice on how to fix it? Thanks!
def nextStop2(Seq):
    GeneticCodeStop = {'TAA':'X', 'TAG':'X', 'TGA':'X'}
    seq2 = ''.join(end_of_loop() if GeneticCodeStop[i]=='X' else i for i in Seq)
    return len((seq2)/3)
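Your parentheses are off: you should write len(seq2)/3, not len((seq2)/3). But you don't want to divide by 3 anyway (you expect 15, not 5), so just return len(seq2). Also, for i in Seq walks the sequence one character at a time, so you never look at a whole codon, and GeneticCodeStop[i] raises a KeyError for any key that is not in the dictionary. Step through the sequence in chunks of three instead: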
def nextStop2(Seq):
    GeneticCodeStop = ['TAA', 'TAG', 'TGA']
    seq2 = ''
    for i in range(0, len(Seq), 3):
        codon = Seq[i:i+3]
        if codon in GeneticCodeStop:
            break
        seq2 += codon
    return len(seq2)

print(nextStop2('AAAAAAAGTGGGTGCTAGGTTGGC'))
>>> 15
I don't know Biopython, but I would expect it to have a function for this.
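Since only the position of the first in-frame stop codon is needed, the intermediate string is not strictly necessary. Here is a minimal sketch of that variant. The helper name next_stop_index is hypothetical, and returning the full sequence length when no stop codon is found is an assumption about the desired behaviour (it matches what the loop above does in that case):

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}  # same stop codons as in the question


def next_stop_index(seq):
    """Return the number of letters before the first in-frame stop codon."""
    # Look at the sequence one codon (three letters) at a time.
    for i in range(0, len(seq), 3):
        if seq[i:i + 3] in STOP_CODONS:
            return i  # letters read so far; the stop codon is not counted
    # Assumption: with no stop codon, the whole sequence counts.
    return len(seq)


print(next_stop_index("AAAAAAAGTGGGTGCTAGGTTGGC"))  # 15
```

Using a set for the stop codons keeps the membership test cheap, though with only three entries a list works just as well.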
I'm working on a "word generator", but I have a problem.
Sometimes it creates unwanted words.
I would like to remove them without weighing down the loop.
As written, I would have to repeat the check in the code below for every letter of the alphabet.
That alone wouldn't be so bad, but there are a few other problems that I could deal with if only I knew how to shorten this.
# alph = "abcdefghijklmnopqrstuvwyzxą"
my_var = ["abc", "aabc", "cbd", "ccbd", "qwe", "qqwe"]
my_var2 = []

def removeDup():
    for x in my_var:
        if x.find("aa") == -1 and x.find("cc") == -1 and x.find("qq") == -1:
            my_var2.append(x)
    print(my_var2)

removeDup()
My idea was to use dynamic variables, but I can't nest one loop inside the other without creating chaos.
I tried something like the one in the picture, but I can only take out words with repeated letters.
There's no need for dynamic variables. Just make a list of all the doubled characters, built from alph (which has to be defined, so uncomment it first).
dups = [z*2 for z in alph]
for x in open('xxx.txt', encoding='utf-8'):
    if not any(dup in x for dup in dups):
        print(x.strip())
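The same idea can be tried without a file, using the list from the question. This is only a sketch: the function name without_doubles is hypothetical, and the plain a-z alphabet is an assumption that should be extended with any extra letters the generator uses.

```python
alph = "abcdefghijklmnopqrstuvwxyz"  # assumed alphabet; extend as needed
dups = [ch * 2 for ch in alph]       # "aa", "bb", ..., "zz"


def without_doubles(words):
    """Keep only the words that contain no doubled letter from alph."""
    return [w for w in words if not any(d in w for d in dups)]


my_var = ["abc", "aabc", "cbd", "ccbd", "qwe", "qqwe"]
print(without_doubles(my_var))  # ['abc', 'cbd', 'qwe']
```

Returning a new list instead of appending to a global keeps the function reusable for other word lists.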
I'm doing the DNA problem for CS50, in which I have to count the number of times an STR repeats in a DNA sequence. I had an idea for how to solve it, so I took one of the data sets and ran my code, but the program never ends. It has been running for about 10 minutes now. Here's the code:
text="AAGGTAAGTTTAGAATATAAAAGGTGAGTTAAATAGAATAGGTTAAAATTAAAGGAGATCAGATCAGATCAGATCTATCTATCTATCTATCTATCAGAAAAGAGTAAATAGTTAAAGAGTAAGATATTGAATTAATGGAAAATATTGTTGGGGAAAGGAGGGATAGAAGG"
length=len(text)
AGAT=0
tmp=0
for i in range(length):
    while text[i]=="A" and text[i+1]=="G" and text[i+2]=="A" and text[i+3]=="T":
        tmp+=1
        if tmp>AGAT:
            AGAT=tmp
        else:
            AGAT=AGAT
print("done")
As mentioned in the comments, your while loop never terminates: nothing in its body changes i, so once the condition is true it stays true. You could remove it and use a sliding-window approach instead, comparing each slice of 4 adjacent characters against the target sequence:
text = "AAGGTAAGTTTAGAATATAAAAGGTGAGTTAAATAGAATAGGTTAAAATTAAAGGAGATCAGATCAGATCAGATCTATCTATCTATCTATCTATCAGAAAAGAGTAAATAGTTAAAGAGTAAGATATTGAATTAATGGAAAATATTGTTGGGGAAAGGAGGGATAGAAGG"
search_seq = "AGAT"
count = 0
for i in range(len(text) - len(search_seq) + 1):
    if text[i:i+len(search_seq)] == search_seq:
        count += 1
print(f"Sequence {search_seq} found {count} times")
Output:
Sequence AGAT found 5 times
I know this is a weird way to solve your problem, but I wanted to do something a bit different...
Try this:
agat_string="AAGGTAAGTTTAGAATATAAAAGGTGAGTTAAATAGAATAGGTTAAAATTAAAGGAGATCAGATCAGATCAGATCTATCTATCTATCTATCTATCAGAAAAGAGTAAATAGTTAAAGAGTAAGATATTGAATTAATGGAAAATATTGTTGGGGAAAGGAGGGATAGAAGG"
agat_list=[AGAT for AGAT in range(len(agat_string)) if agat_string.find("AGAT", AGAT) == AGAT] #finds the indices of "AGAT" ;-)
print(len(agat_list))
The output:
5
Also, as someone said, tmp has nothing to do with the while condition. It just throws you in an infinite loop....
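Both answers count every occurrence of the sequence. If what is actually needed is the longest run of back-to-back repeats (this is an assumption about the assignment; the question only says "the number of times a STR repeats"), a small variation on the sliding window can be sketched as follows. The function name longest_run and the short test string are hypothetical.

```python
def longest_run(text, motif):
    """Return the largest number of consecutive copies of motif in text."""
    if not motif:
        raise ValueError("motif must not be empty")
    step = len(motif)
    best = 0
    for start in range(len(text)):
        run = 0
        pos = start
        # pos moves forward on every pass, so this loop always terminates.
        while text[pos:pos + step] == motif:
            run += 1
            pos += step
        best = max(best, run)
    return best


print(longest_run("GGAGATCAGATCAGATCTT", "AGATC"))  # 3
```

Unlike the loop in the question, the inner while advances pos each time it matches, and slicing past the end of the string simply yields a shorter string that cannot equal the motif.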
My brother and I are creating a simple text editor that changes entries to pig latin using Python. Code below:
our_word = ("cat")
vowels = ("a","e","i","o","u")
#remember I have to compare variables not strings
way = "way"
for i in range(len(our_word)):
    for j in range (len(vowels)):
        #checking if there is any vowel present
        if our_word[i] == vowels[j]:
            # if there were to be any vowels our_word[i] wil now be changed with way
            #.replace is our function the dot is what notates this in the python library
            our_word = our_word.replace(our_word[i], way)
print(our_word)
Right now we're testing the word 'cat', but when run the program returns the following:
/Users/x/PycharmProjects/pythonProject3/venv/bin/python /Users/x/PycharmProjects/pythonProject3/main.py
cwwayyt
Process finished with exit code 0
We're not sure why there is a double 'w' and a double 'y'. It seems the word 'cat' is edited once to 'cwayt' and then a second time to 'cwwayyt'.
Any suggestions are welcome!
The problem arises because, on the next iteration of the for loop after the substitution, you are looking at the next position, which is part of the "way" that you just substituted in. Instead, you need to skip past it. (Note also that str.replace replaces every occurrence of that character, not just the one at position i.) You would also hit another problem: the loop only runs up to the original length rather than the new, increased length. In this situation you are probably better off using a while loop with an index variable that you can manipulate to point to the correct place as needed. For example:
our_word = "cat"
vowels = "aeiou"
way = "way"
i = 0
while i < len(our_word):
    if our_word[i] in vowels:
        our_word = our_word[:i] + way + our_word[i + 1:]
        i += len(way)  # <=== if you made a substitution, skip over the bit
                       #      that you just substituted in place
    else:
        i += 1  # <=== if you didn't make any substitution
                #      just go to the next position next time
print(our_word)
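An alternative that avoids index bookkeeping altogether is to build a new string from the original characters, so the inserted "way" is never re-examined. This is only a sketch of the same substitution as above (each vowel replaced by "way"); the function name replace_vowels is hypothetical.

```python
def replace_vowels(word, vowels="aeiou", way="way"):
    """Return word with every vowel replaced by way."""
    # Iterating over the original word means substituted text is never re-scanned.
    return "".join(way if ch in vowels else ch for ch in word)


print(replace_vowels("cat"))  # cwayt
```

Because the loop reads from the untouched input and writes to a fresh string, neither the double substitution nor the changing length can occur.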
I'm making a method that takes a string and outputs parts of the string on separate lines according to a window.
For example:
I want to output every 3 letters of my string on a separate line.
Input : "Advantage"
Output:
Adv
ant
age
Input2: "23141515"
Output:
231
141
515
My code:
def print_method(input):
    mywindow = 3
    start_index = input[0]
    if(start_index == input[len(input)-1]):
        exit()
    print(input[1:mywindow])
    printmethod(input[mywindow:])
However I get a runtime error.... Can someone help?
I think this is what you're trying to get. Here's what I changed:
Renamed input to input_str. input is the name of a built-in function in Python, so using it as a variable name shadows the built-in and is best avoided.
Added the missing _ in the recursive call to print_method.
Printed from 0:mywindow instead of 1:mywindow (which would skip the first character). When you start at 0, you can also just write :mywindow to get the same result.
Changed the exit statement (was that sys.exit?) to a return (probably what is wanted) and changed the if condition to return once an empty string is given as the input. The last string printed might be shorter than 3 characters; if you don't want that, you could instead use if len(input_str) < 3: return
def print_method(input_str):
    mywindow = 3
    if not input_str:  # or you could do if len(input_str) == 0
        return
    print(input_str[:mywindow])
    print_method(input_str[mywindow:])
Edit: sorry, I missed the title. If this is not a learning exercise in recursion, you shouldn't use recursion here, because it is less efficient and creates a new slice of the string on every call.
def chunked_print(string, window=3):
    for i in range(0, len(string) // window + 1):
        print(string[i*window:(i+1)*window])
This works when the window size doesn't divide the string length, but prints an extra empty line when it does. You can modify it according to your needs.
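One way to avoid the extra empty line is to let range step by the window size directly, so the loop only visits start positions that exist in the string. A minimal sketch, with hypothetical names chunks and print_chunks:

```python
def chunks(string, window=3):
    """Split string into pieces of at most window characters."""
    # Stepping by window never produces a start index past the end of the string.
    return [string[i:i + window] for i in range(0, len(string), window)]


def print_chunks(string, window=3):
    for piece in chunks(string, window):
        print(piece)


print_chunks("Advantage")  # prints Adv, ant, age on separate lines
```

If the length is not a multiple of the window, the last piece is simply shorter; an empty string produces no output at all.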
I'm trying to write a function to count how many lines in my input file begin with 'AJ000012.1', but my function keeps returning None. I'm a beginner and not entirely sure what the problem is or why this keeps happening. The answer is supposed to be 13, and when I just write the code directly, e.g.:
count=0
input=BLASTreport
for line in input:
    if line.startswith('AJ000012.1'):
        count=count+1
print('Number of HSPs: {}'.format(count))
I get the right answer. When I try to make this a function and call it, it does not work:
def nohsps(input):
    count=0
    for line in input:
        if line.startswith('AJ000012.1'):
            count=count+1
        return

ans1=nohsps(BLASTreport)
print('Number of HSPs: {}'.format(ans1))
Any help would be seriously appreciated, thank you!
(HSP stands for high scoring segment pair if you're wondering. The input file is a BLAST report file that lists alignment results for a DNA sequence)
When you write a bare return without specifying a value, the function returns None. You want to return something, and based on your description that something is count. Furthermore, you are returning inside your for loop, so the function exits during the first iteration and never gets the count you expect. You want to count all occurrences of your match, so you need to move the return outside the loop:
def nohsps(input):
    count=0
    for line in input:
        if line.startswith('AJ000012.1'):
            count=count+1
    return count
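The same count can also be written as a single expression with sum and a generator. This is a sketch, not a replacement for the fix above: the function name count_prefixed is hypothetical, and the short list of lines stands in for the BLAST report, whose contents are not shown in the question.

```python
def count_prefixed(lines, prefix="AJ000012.1"):
    """Count how many lines start with prefix."""
    # The generator yields 1 for each matching line; sum adds them up.
    return sum(1 for line in lines if line.startswith(prefix))


# Made-up lines for illustration only.
sample = ["AJ000012.1 hit one", "XX000001.1 other", "AJ000012.1 hit two"]
print("Number of HSPs: {}".format(count_prefixed(sample)))  # Number of HSPs: 2
```

Any iterable of strings works here, including an open file object, since both yield one line at a time.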