translate my sequence? - python

I have to write a script to translate this sequence:
dict = {"TTT":"F|Phe","TTC":"F|Phe","TTA":"L|Leu","TTG":"L|Leu","TCT":"S|Ser","TCC":"S|Ser",
"TCA":"S|Ser","TCG":"S|Ser", "TAT":"Y|Tyr","TAC":"Y|Tyr","TAA":"*|Stp","TAG":"*|Stp",
"TGT":"C|Cys","TGC":"C|Cys","TGA":"*|Stp","TGG":"W|Trp", "CTT":"L|Leu","CTC":"L|Leu",
"CTA":"L|Leu","CTG":"L|Leu","CCT":"P|Pro","CCC":"P|Pro","CCA":"P|Pro","CCG":"P|Pro",
"CAT":"H|His","CAC":"H|His","CAA":"Q|Gln","CAG":"Q|Gln","CGT":"R|Arg","CGC":"R|Arg",
"CGA":"R|Arg","CGG":"R|Arg", "ATT":"I|Ile","ATC":"I|Ile","ATA":"I|Ile","ATG":"M|Met",
"ACT":"T|Thr","ACC":"T|Thr","ACA":"T|Thr","ACG":"T|Thr", "AAT":"N|Asn","AAC":"N|Asn",
"AAA":"K|Lys","AAG":"K|Lys","AGT":"S|Ser","AGC":"S|Ser","AGA":"R|Arg","AGG":"R|Arg",
"GTT":"V|Val","GTC":"V|Val","GTA":"V|Val","GTG":"V|Val","GCT":"A|Ala","GCC":"A|Ala",
"GCA":"A|Ala","GCG":"A|Ala", "GAT":"D|Asp","GAC":"D|Asp","GAA":"E|Glu",
"GAG":"E|Glu","GGT":"G|Gly","GGC":"G|Gly","GGA":"G|Gly","GGG":"G|Gly"}
seq = "TTTCAATACTAGCATGACCAAAGTGGGAACCCCCTTACGTAGCATGACCCATATATATATATATA"
a=""
for y in range( 0, len ( seq)):
    c=(seq[y:y+3])
    #print(c)
    for k, v in dict.items():
        if seq[y:y+3] == k:
            alle_amino = v[::3] # all amino acids in a row: a1.1 - a2.1 - a3.1 - a1.2, etc.
            print (v)
With this script I get the amino acids from the 3 frames under each other, but how can I sort this and get all the amino acids from frame 1 next to each other, and all the amino acids from frame 2 next to each other, and the same for frame 3?
For example, my results must be:
+3 SerIleLeuAlaStpProLysTrpGluProProTyrValAlaStpProIleTyrIleTyrIle
+2 PheAsnThrSerMetThrLysValGlyThrProLeuArgSerMetThrHisIleTyrIleTyr
+1 PheGlnTyrStpHisAspGlnSerGlyAsnProLeuThrStpHisAspProTyrIleTyrIle
TTTCAATACTAGCATGACCAAAGTGGGAACCCCCTTACGTAGCATGACCCATATATATATATATA
I use Python 3.
I have one more question: can I get these results by making some changes to my own script?

You can use the following. (Note that this would be much easier with Biopython's translate method.)
dicti = {your dictionary here}
def translate(seq):
    x = 0
    aaseq = []
    while True:
        try:
            # a short or empty slice at the end of seq raises KeyError and ends the loop
            aaseq.append(dicti[seq[x:x+3]])
            x += 3
        except (IndexError, KeyError):
            break
    return aaseq
seq = "TTTCAATACTAGCATGACCAAAGTGGGAACCCCCTTACGTAGCATGACCCATATATATATATATA"
for frame in range(3):
    print('+%i' %(frame+1), ''.join(item.split('|')[1] for item in translate(seq[frame:])))
Note that I renamed your dictionary to dicti, so that it does not shadow the built-in dict.
Some comments to help you understand:
translate takes your sequence and returns a list in which each item is the amino acid translation of the triplet at that position. For example:
aaseq = ["L|Leu","L|Leu","P|Pro", ....]
You could process this data further inside translate (to keep only the one-letter or three-letter code), or return it as it is and process it later, as I have done.
translate is called in
''.join(item.split('|')[1] for item in translate(seq[frame:]))
for each frame. For frame values 0, 1 and 2 it passes seq[frame:] to translate. That is, the sequences for the three reading frames are processed one after another. Then, in
item.split('|')[1]
I split the one-letter and three-letter codes of each amino acid and take the one at index 1 (the second). The results are then joined into a single string.
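As a rough Python 3 sketch of the same idea, the three reading frames can also be produced by stepping through the sequence three bases at a time from each starting offset. The names BASES, ONE_LETTER, THREE_LETTER, CODON_TABLE and translate_frame are made up for this illustration, and the table is rebuilt from the standard genetic code only so that the example is self-contained; the dictionary from the question would work just as well.

```python
from itertools import product

# Standard genetic code, listed in T, C, A, G order for each codon position.
BASES = "TCAG"
ONE_LETTER = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
              "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
THREE_LETTER = {
    "F": "Phe", "L": "Leu", "S": "Ser", "Y": "Tyr", "*": "Stp", "C": "Cys",
    "W": "Trp", "P": "Pro", "H": "His", "Q": "Gln", "R": "Arg", "I": "Ile",
    "M": "Met", "T": "Thr", "N": "Asn", "K": "Lys", "V": "Val", "A": "Ala",
    "D": "Asp", "E": "Glu", "G": "Gly",
}
# Map each of the 64 codons to its three-letter amino acid code.
CODON_TABLE = {
    "".join(codon): THREE_LETTER[aa]
    for codon, aa in zip(product(BASES, repeat=3), ONE_LETTER)
}


def translate_frame(seq, frame):
    """Translate seq starting at offset frame (0, 1 or 2), ignoring a trailing partial codon."""
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


seq = "TTTCAATACTAGCATGACCAAAGTGGGAACCCCCTTACGTAGCATGACCCATATATATATATATA"
for frame in range(3):
    print("+%d %s" % (frame + 1, translate_frame(seq, frame)))
```

Stopping the range at len(seq) - 2 means no lookup is ever attempted on an incomplete codon, so no exception handling is needed.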

Not too pretty, but does what you want
dct = {"TTT":"F|Phe","TTC":"F|Phe","TTA":"L|Leu","TTG":"L|Leu","TCT":"S|Ser","TCC":"S|Ser",
"TCA":"S|Ser","TCG":"S|Ser", "TAT":"Y|Tyr","TAC":"Y|Tyr","TAA":"*|Stp","TAG":"*|Stp",
"TGT":"C|Cys","TGC":"C|Cys","TGA":"*|Stp","TGG":"W|Trp", "CTT":"L|Leu","CTC":"L|Leu",
"CTA":"L|Leu","CTG":"L|Leu","CCT":"P|Pro","CCC":"P|Pro","CCA":"P|Pro","CCG":"P|Pro",
"CAT":"H|His","CAC":"H|His","CAA":"Q|Gln","CAG":"Q|Gln","CGT":"R|Arg","CGC":"R|Arg",
"CGA":"R|Arg","CGG":"R|Arg", "ATT":"I|Ile","ATC":"I|Ile","ATA":"I|Ile","ATG":"M|Met",
"ACT":"T|Thr","ACC":"T|Thr","ACA":"T|Thr","ACG":"T|Thr", "AAT":"N|Asn","AAC":"N|Asn",
"AAA":"K|Lys","AAG":"K|Lys","AGT":"S|Ser","AGC":"S|Ser","AGA":"R|Arg","AGG":"R|Arg",
"GTT":"V|Val","GTC":"V|Val","GTA":"V|Val","GTG":"V|Val","GCT":"A|Ala","GCC":"A|Ala",
"GCA":"A|Ala","GCG":"A|Ala", "GAT":"D|Asp","GAC":"D|Asp","GAA":"E|Glu",
"GAG":"E|Glu","GGT":"G|Gly","GGC":"G|Gly","GGA":"G|Gly","GGG":"G|Gly"}
seq = "TTTCAATACTAGCATGACCAAAGTGGGAACCCCCTTACGTAGCATGACCCATATATATATATATA"
def get_amino_list(s):
    for y in range(3):
        yield [s[x:x+3] for x in range(y, len(s) - 2, 3)]
for n, amn in enumerate(get_amino_list(seq), 1):
    print ("+%d " % n + "".join(dct[x][2:] for x in amn))
print(seq)

Here's my solution. I've called your "dict" variable "aminos". The function method3 returns a list of the values to the right of the "|". To merge them into a single string, just join them on "".
From looking at your code, I believe that your aminos dict contains all possible three-letter combinations. Therefore, I've removed the checks that verify this. It should run a lot faster as a result.
def overlapping_groups(seq, group_len=3):
    """Yields consecutive groups of `group_len` items, one whole group at a time
    """
    # Step by group_len so each codon is read once per frame;
    # a trailing partial group is dropped.
    for i in range(0, len(seq) - group_len + 1, group_len):
        yield seq[i:i+group_len]
def method3(seq, aminos):
    return [aminos[k][2:] for k in overlapping_groups(seq, 3)]
for i in range(3):
    print("%d: %s" % (i, "".join(method3(seq[i:], aminos))))

Related

Python insertion sorting a csv by row

My objective is to use an insertion sort to sort the contents of a CSV file by the numbers in the first column. For example, I want to take this:
[[7831703, Christian, Schmidt]
[2299817, Amber, Cohen]
[1964394, Gregory, Hanson]
[1984288, Aaron, White]
[9713285, Alexander, Kirk]
[7025528, Janice, Lee]
[6441979, Sarah, Browning]
[8815776, Rick, Wallace]
[2395480, Martin, Weinstein]
[1927432, Stephen, Morrison]]
and sort it to:
[[1927432, Stephen, Morrison]
[1964394, Gregory, Hanson]
[1984288, Aaron, White]
[2299817, Amber, Cohen]
[2395480, Martin, Weinstein]
[6441979, Sarah, Browning]
[7025528, Janice, Lee]
[7831703, Christian, Schmidt]
[8815776, Rick, Wallace]
[9713285, Alexander, Kirk]]
based on the numbers in the first column. My current Python code looks like this:
import csv
with open('EmployeeList.csv', newline='') as File:
    reader = csv.reader(File)
    readList = list(reader)
    for row in reader:
        print(row)
def insertionSort(readList):
    #Traverse through 1 to the len of the list
    for row in range(len(readList)):
        # Traverse through 1 to len(arr)
        for i in range(1, len(readList[row])):
            key = readList[row][i]
            # Move elements of arr[0..i-1], that are
            # greater than key, to one position ahead
            # of their current position
            j = i-1
            while j >=0 and key < readList[row][j] :
                readList[row] = readList[row]
                j -= 1
            readList[row] = key
insertionSort(readList)
print ("Sorted array is:")
for i in range(len(readList)):
    print ( readList[i])
The code can already sort the contents of a 2D array, but as written it tries to sort everything.
I think it would work if I got rid of the [], but in testing that hasn't given me what I need.
To clarify again: I want to sort the positions of the rows based on the numerical value in the first column.
Sorry if I have misunderstood what you need. But you have a list and you need to sort it? Why don't you just use the sort method of the list object?
>>> data = [[7831703, "Christian", "Schmidt"],
... [2299817, "Amber", "Cohen"],
... [1964394, "Gregory", "Hanson"],
... [1984288, "Aaron", "White"],
... [9713285, "Alexander", "Kirk"],
... [7025528, "Janice", "Lee"],
... [6441979, "Sarah", "Browning"],
... [8815776, "Rick", "Wallace"],
... [2395480, "Martin", "Weinstein"],
... [1927432, "Stephen", "Morrison"]]
>>> data.sort()
>>> from pprint import pprint
>>> pprint(data)
[[1927432, 'Stephen', 'Morrison'],
[1964394, 'Gregory', 'Hanson'],
[1984288, 'Aaron', 'White'],
[2299817, 'Amber', 'Cohen'],
[2395480, 'Martin', 'Weinstein'],
[6441979, 'Sarah', 'Browning'],
[7025528, 'Janice', 'Lee'],
[7831703, 'Christian', 'Schmidt'],
[8815776, 'Rick', 'Wallace'],
[9713285, 'Alexander', 'Kirk']]
>>>
Note that here the first element is parsed as an integer. This is important if you want to sort by numerical value (so that 99 comes before 100).
And don't be confused by the pprint import. You don't need it to sort. I just used it to get nicer output in the console.
Also note that list.sort() is an in-place method. It doesn't return a sorted list; it sorts the list itself.
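Rows read with csv.reader arrive as lists of strings, so the integer parsing mentioned above has to happen somewhere. A minimal sketch, assuming the ID is always in the first column and using an in-memory sample (the sample rows, including the short ID 99, are invented for illustration) in place of EmployeeList.csv:

```python
import csv
import io

# Stand-in for the real file; csv.reader yields every field as a string.
sample = "7831703,Christian,Schmidt\n2299817,Amber,Cohen\n99,Short,Id\n"
rows = list(csv.reader(io.StringIO(sample)))

# Compare IDs as integers, not as text.
rows.sort(key=lambda row: int(row[0]))

for row in rows:
    print(row)
```

Without the key function the comparison would be done on strings, and "99" would sort after "7831703".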
*** EDIT ***
Here are two different approaches to a sort function. Both could be heavily optimized, but I hope they give you some ideas about how this can be done. Both should work, and you can add some print calls in the loops to see what happens there.
First, the recursive version. It orders the list a little more on every pass until it is fully ordered.
def recursiveSort(readList):
    # You don't want to mess original data, so we handle copy of it
    data = readList.copy()
    changed = False
    res = []
    while len(data): #while 1 should work here as well because eventually we break the loop
        if len(data) == 1:
            # There is only one element left. Let's add it to end of our result.
            res.append(data[0])
            break
        if data[0][0] > data[1][0]:
            # We compare first two elements in list.
            # If first one is bigger, we remove second element from original list and add it next to the result set.
            # Then we raise changed flag to tell that we changed the order of original list.
            res.append(data.pop(1))
            changed = True
        else:
            # otherwise we remove first element from the list and add next to the result list.
            res.append(data.pop(0))
    if not changed:
        #if no changes has been made, the list is in order
        return res
    else:
        #if we made changes, we sort list one more time.
        return recursiveSort(res)
And here is an iterative version, closer to your original function.
def iterativeSort(readList):
    res = []
    for i in range(len(readList)):
        print (res)
        #loop through the original list
        if len(res) == 0:
            # if we don't have any items in our result list, we add first element here.
            res.append(readList[i])
        else:
            done = False
            for j in range(len(res)):
                #loop through the result list this far
                if res[j][0] > readList[i][0]:
                    #if our item in list is smaller than element in res list, we insert it here
                    res.insert(j, readList[i])
                    done = True
                    break
            if not done:
                #if our item in list is bigger than all the items in result list, we put it last.
                res.append(readList[i])
    print(res)
    return res
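For comparison, here is a sketch of a textbook in-place insertion sort that moves whole rows and compares only the first column, which is what the question asks for. The function name insertion_sort_rows is hypothetical, and the int() conversion assumes the IDs come from csv.reader as strings.

```python
def insertion_sort_rows(rows):
    """Sort rows in place by the integer value of their first column."""
    for i in range(1, len(rows)):
        current = rows[i]
        key = int(current[0])
        j = i - 1
        # Shift every row with a larger ID one position to the right.
        while j >= 0 and int(rows[j][0]) > key:
            rows[j + 1] = rows[j]
            j -= 1
        # Drop the saved row into the gap that was opened.
        rows[j + 1] = current
    return rows


employees = [
    ["7831703", "Christian", "Schmidt"],
    ["2299817", "Amber", "Cohen"],
    ["1964394", "Gregory", "Hanson"],
]
for row in insertion_sort_rows(employees):
    print(row)
```

The difference from the code in the question is that the inner loop indexes rows (rows[j]) and never the fields inside a row, so each row stays intact while its position changes.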

different result from recursive and dynamic programming

I am working on the problem below.
Problem:
Given an m * n grid in which one is allowed to move only up or right, find all the different paths between two grid points.
I wrote a recursive version and a dynamic programming version, but they return different results. Any thoughts on what is wrong?
Source code:
from collections import defaultdict
def move_up_right(remaining_right, remaining_up, prefix, result):
    if remaining_up == 0 and remaining_right == 0:
        result.append(''.join(prefix[:]))
        return
    if remaining_right > 0:
        prefix.append('r')
        move_up_right(remaining_right-1, remaining_up, prefix, result)
        prefix.pop(-1)
    if remaining_up > 0:
        prefix.append('u')
        move_up_right(remaining_right, remaining_up-1, prefix, result)
        prefix.pop(-1)
def move_up_right_v2(remaining_right, remaining_up):
    # key is a tuple (given remaining_right, given remaining_up),
    # value is solutions in terms of list
    dp = defaultdict(list)
    dp[(0,1)].append('u')
    dp[(1,0)].append('r')
    for right in range(1, remaining_right+1):
        for up in range(1, remaining_up+1):
            for s in dp[(right-1,up)]:
                dp[(right,up)].append(s+'r')
            for s in dp[(right,up-1)]:
                dp[(right,up)].append(s+'u')
    return dp[(right, up)]
if __name__ == "__main__":
    result = []
    move_up_right(2,3,[],result)
    print result
    print '============'
    print move_up_right_v2(2,3)
In version 2 you should be starting your for loops at 0 not at 1. By starting at 1 you are missing possible permutations where you traverse the bottom row or leftmost column first.
Change version 2 to:
def move_up_right_v2(remaining_right, remaining_up):
    # key is a tuple (given remaining_right, given remaining_up),
    # value is solutions in terms of list
    dp = defaultdict(list)
    dp[(0,1)].append('u')
    dp[(1,0)].append('r')
    for right in range(0, remaining_right+1):
        for up in range(0, remaining_up+1):
            for s in dp[(right-1,up)]:
                dp[(right,up)].append(s+'r')
            for s in dp[(right,up-1)]:
                dp[(right,up)].append(s+'u')
    return dp[(right, up)]
And then:
result = []
move_up_right(2,3,[],result)
set(move_up_right_v2(2,3)) == set(result)
True
And just for fun... another way to do it:
from itertools import permutations
list(map(''.join, set(permutations('r'*2+'u'*3, 5))))
The problem with the dynamic programming version is that it doesn't take into account the paths that start with more than one move up ('uu...') or more than one move right ('rr...').
Before executing the main loop you need to fill dp[(x,0)] for every x from 1 to remaining_right and dp[(0,y)] for every y from 1 to remaining_up.
In other words, replace this:
dp[(0,1)].append('u')
dp[(1,0)].append('r')
with this:
for right in range(1, remaining_right+1):
    dp[(right,0)].append('r'*right)
for up in range(1, remaining_up+1):
    dp[(0,up)].append('u'*up)
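Putting the pieces together, here is a self-contained Python 3 sketch of the table-filling approach. It is a slight variant of the fixes above, not the code from either answer: it seeds the table with a single empty path at (0, 0) and guards the two lookups, so no separate edge initialisation is needed. The name paths_dp is made up for this example.

```python
from collections import defaultdict
from itertools import permutations


def paths_dp(total_right, total_up):
    """Return every path made of total_right 'r' moves and total_up 'u' moves."""
    dp = defaultdict(list)
    dp[(0, 0)].append("")  # one way to stay put: the empty path
    for right in range(total_right + 1):
        for up in range(total_up + 1):
            if right > 0:
                # Extend every path that ends one step to the left.
                dp[(right, up)].extend(s + "r" for s in dp[(right - 1, up)])
            if up > 0:
                # Extend every path that ends one step below.
                dp[(right, up)].extend(s + "u" for s in dp[(right, up - 1)])
    return dp[(total_right, total_up)]


paths = paths_dp(2, 3)
expected = set(map("".join, permutations("rruuu")))
print(len(paths), set(paths) == expected)  # prints: 10 True
```

The count of 10 matches the number of ways to choose which 2 of the 5 moves go right.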

Is there a better way to combine multiple items in a python list

I've created a function to combine specific items in a python list, but I suspect there is a better way I can't find despite extreme googling. I need the code to be fast, as I'm going to be doing this thousands of times.
mergeleft takes a list of items and a list of indices. In the example below, I call it as mergeleft(fields,(2,4,5)). Items 5, 4, and 2 of list fields will be concatenated to the item immediately to the left. In this case, 3 and d get concatenated to c; b gets concatenated to a. The result is a list ('ab', 'cd3', 'f').
fields = ['a','b','c','d', 3,'f']
def mergeleft(x, fieldnums):
    if 1 in fieldnums: raise Exception('Cannot merge field 1 left')
    if max(fieldnums) > len(x): raise IndexError('Fieldnum {} exceeds available fields {}'.format(max(fieldnums),len(x)))
    y = []
    deleted_rows = ''
    for i,l in enumerate(reversed(x)):
        if (len(x) - i) in fieldnums:
            deleted_rows = str(l) + deleted_rows
        else:
            y.append(str(l)+deleted_rows)
            deleted_rows = ''
    y.reverse()
    return y
print(mergeleft(fields,(2,4,5)))
# Returns ['ab','cd3','f']
fields = ['a','b','c','d', 3,'f']
This assumes a list of indices in monotonic ascending order.
I reverse the order, so that I'm merging right-to-left.
For each given index, I merge that element into the one on the left, converting to string at each point.
Do note that I've changed the fieldnums type to list, so that it's easily reversible. You can also just traverse the tuple in reverse order.
def mergeleft(lst, fieldnums):
    fieldnums.reverse()
    for pos in fieldnums:
        # Merge this field left
        lst[pos-2] = str(lst[pos-2]) + str(lst[pos-1])
        lst = lst[:pos-1] + lst[pos:]
    return lst
print(mergeleft(fields,[2,4,5]))
Output:
['ab', 'cd3', 'f']
Here's a decently concise solution, probably among many.
def mergeleft(x, fieldnums):
    if 1 in fieldnums: raise Exception('Cannot merge field 1 left')
    if max(fieldnums) > len(x): raise IndexError('Fieldnum {} exceeds available fields {}'.format(max(fieldnums),len(x)))
    ret = list(x)
    for i in reversed(sorted(set(fieldnums))):
        # fieldnums are 1-based: field i lives at index i-1 and merges into index i-2
        ret[i-2] = str(ret[i-2]) + str(ret.pop(i-1))
    return ret
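Since speed matters here, one more option is a single left-to-right pass that never pops or slices the list. This is only a sketch under the same 1-based field numbering as the question; the name merge_left_single_pass is invented for this example and no timings have been measured.

```python
def merge_left_single_pass(items, fieldnums):
    """Concatenate each 1-based field in fieldnums onto the field to its left."""
    merge = set(fieldnums)
    if 1 in merge:
        raise ValueError("Cannot merge field 1 left")
    if merge and max(merge) > len(items):
        raise IndexError("Fieldnum {} exceeds available fields {}".format(max(merge), len(items)))
    out = []
    for pos, item in enumerate(items, 1):
        if pos in merge:
            out[-1] += str(item)   # glue onto the most recent kept field
        else:
            out.append(str(item))  # start a new output field
    return out


fields = ['a', 'b', 'c', 'd', 3, 'f']
print(merge_left_single_pass(fields, (2, 4, 5)))  # prints: ['ab', 'cd3', 'f']
```

Because the set lookup is constant time and each item is touched once, the work grows linearly with the length of the list, and the input list is left unmodified.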

Understanding another's text-mining function that removes similar strings

I’m trying to replicate the methodology from this article, 538 Post about Most Repetitive Phrases, in which the author mined US presidential debate transcripts to determine the most repetitive phrases for each candidate.
I'm trying to implement this methodology with another dataset in R with the tm package.
Most of the code (GitHub repository) concerns mining the transcripts and assembling counts of each ngram, but I get lost at the prune_substrings() function code below:
def prune_substrings(tfidf_dicts, prune_thru=1000):
    pruned = tfidf_dicts
    for candidate in range(len(candidates)):
        # growing list of n-grams in list form
        so_far = []
        ngrams_sorted = sorted(tfidf_dicts[candidate].items(), key=operator.itemgetter(1), reverse=True)[:prune_thru]
        for ngram in ngrams_sorted:
            # contained in a previous aka 'better' phrase
            for better_ngram in so_far:
                if overlap(list(better_ngram), list(ngram[0])):
                    #print "PRUNING!! "
                    #print list(better_ngram)
                    #print list(ngram[0])
                    pruned[candidate][ngram[0]] = 0
            # not contained, so add to so_far to prevent future subphrases
            else:
                so_far += [list(ngram[0])]
    return pruned
The input of the function, tfidf_dicts, is an array of dictionaries (one for each candidate) with ngrams as keys and tf-idf scores as values. For example, Trump's tf-idf dict begins like this:
trump.tfidf.dict = {'we don't win': 83.2, 'you have to': 72.8, ... }
so the structure of the input is like this:
tfidf_dicts = {trump.tfidf.dict, rubio.tfidf.dict, etc }
My understanding is that prune_substrings does the following things, but I'm stuck on the else if clause, which is a Pythonic thing I don't understand yet.
A. Create list pruned as tfidf_dicts: a list of tfidf dicts, one for each candidate
B. Loop through each candidate:
    so_far = start an empty list of the ngrams gone through so far
    ngrams_sorted = sorted member's tf-idf dict from smallest to biggest
    loop through each ngram in sorted
        loop through each better_ngram in so_far
            IF overlap b/w (below) == TRUE:
                better_ngram (from so_far) and
                ngram (from ngrams_sorted)
            THEN zero out tf-idf for ngram
            ELSE if (WHAT?!?)
                add ngram to list, so_far
C. Return pruned, i.e. list of unique ngrams sorted in order
Any help at all is much appreciated!
Note the indentation in your code... The else is lined up with the second for, not the if. This is a for-else construct, not an if-else.
In that case, the else is being used to initialize the inner loop, because it will be executed when so_far is empty the first time through, and each time the inner loop runs out of items to iterate through...
I am not sure that this is the most efficient way to achieve these comparisons, but conceptually you can get a sense of the flow with this snippet:
s=[]
for j in "ABCD":
    for i in s:
        print i,
    else:
        print "\nelse"
        s.append(j)
Output:
else
A
else
A B
else
A B C
else
I would think that in R there is a much better way to do this than nested loops....
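The for-else construct is easier to see when the loop contains a break, because the else branch runs only if the loop finishes without hitting one. The sketch below is a simplified Python 3 illustration, not the article's code: keep_distinct is a made-up name, a plain substring test stands in for the article's overlap() function, and the break is an addition that the quoted function does not show.

```python
def keep_distinct(phrases):
    """Keep a phrase only if it does not overlap a phrase that was kept earlier."""
    kept = []
    for phrase in phrases:
        for better in kept:
            # Stand-in overlap test: one phrase contains the other.
            if phrase in better or better in phrase:
                break  # overlap found: the else branch below is skipped
        else:
            # Reached only when the inner loop ended without a break,
            # including the case where kept was still empty.
            kept.append(phrase)
    return kept


ranked = ["we don't win", "don't win", "you have to", "have to"]
print(keep_distinct(ranked))  # prints: ["we don't win", 'you have to']
```

With phrases listed from best to worst score, each lower-scoring phrase that overlaps an earlier one is dropped.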
Four months later, but here's my solution. I'm sure there is a more efficient one, but it worked for my purposes. The Pythonic for-else doesn't translate to R, so the steps are different.
Take the top n ngrams.
Create a list, t, where each element is a logical vector of length n that says whether the ngram in question overlaps each of the other ngrams (but fix 1:x to be FALSE automatically).
Cbind together every element of t into a table, t2.
Return only the rows of t2 whose row sum is zero.
Set elements 1:n to FALSE (i.e. no overlap).
Voilà!
PrunedList Function
#' GetPrunedList
#'
#' takes a word freq df with columns Words and LenNorm, returns df of nonoverlapping strings
GetPrunedList <- function(wordfreqdf, prune_thru = 100) {
  #take only first n items in list
  tmp <- head(wordfreqdf, n = prune_thru) %>%
    select(ngrams = Words, tfidfXlength = LenNorm)
  #for each ngram in list:
  t <- (lapply(1:nrow(tmp), function(x) {
    #find overlap between ngram and all items in list (overlap = TRUE)
    idx <- overlap(tmp[x, "ngrams"], tmp$ngrams)
    #set overlap as false for itself and higher-scoring ngrams
    idx[1:x] <- FALSE
    idx
  }))
  #bind each ngram's overlap vector together to make a matrix
  t2 <- do.call(cbind, t)
  #find rows(i.e. ngrams) that do not overlap with those below
  idx <- rowSums(t2) == 0
  pruned <- tmp[idx,]
  rownames(pruned) <- NULL
  pruned
}
Overlap function
#' overlap
#' OBJ: takes two ngrams (as strings) and checks whether they overlap
#' INPUT: a,b ngrams as strings
#' OUTPUT: TRUE if overlap
overlap <- function(a, b) {
  max_overlap <- min(3, CountWords(a), CountWords(b))
  a.beg <- word(a, start = 1L, end = max_overlap)
  a.end <- word(a, start = -max_overlap, end = -1L)
  b.beg <- word(b, start = 1L, end = max_overlap)
  b.end <- word(b, start = -max_overlap, end = -1L)
  # b contains a's beginning
  w <- str_detect(b, coll(a.beg, TRUE))
  # b contains a's end
  x <- str_detect(b, coll(a.end, TRUE))
  # a contains b's beginning
  y <- str_detect(a, coll(b.beg, TRUE))
  # a contains b's end
  z <- str_detect(a, coll(b.end, TRUE))
  #return TRUE if any of above are true
  (w | x | y | z)
}

sorting pattern algorithms in python

What I am trying to do is prompt the user for a sort function type, sort pattern, array size, array size increment and number of tests, and then save the results. However, there are a couple of problems with this program.
When I choose the random pattern it gives me some weird output like:
1543 0.002
600 0.020
1400 0.08
It's not really in order. I think something is wrong with the for loop.
def rand_array(n):
    ''' returns sorted array of integers of size n'''
    R=[randint(1, 1000*n) for i in xrange(n)]
    return R
def sorted_array(n):
    ''' returns a sorted array of n integers'''
    return [i for i in xrange(1,n+1)]
def rev_array(n):
    '''returns an array of n integers in reverse order'''
    R= [i for i in reversed(xrange(1,n+1))]
    return R
def sort_timehelp(x,f):
    ''' This times the quick sort algorithm as it must take 3 variables'''
    high=len(x)
    low=0
    t0=clock()
    f(x,low,high)
    t1=clock()
    dt=t1-t0
    return dt
def main():
    myinfo()
    info()
    while True:
        print '==================== to quit enter Control-c=================='
        sortfunction=input("Choose a sort function: ")
        s=input("Choose a pattern: ")
        n=input("Array Size: ")
        increment=input("Increment size: ")
        y=input("Number of tests: ")
        if s == 1:
            x=rand_array(n)
        elif s ==2:
            x= sorted_array(n)
        elif s==3:
            x=rev_array(n)
        if sortfunction==1:
            i=0
            output="algorith: quick sort \n input data: %s" %s
            print output
            while i<y:
                i=i+1
                ff=0.0
                array=x[increment-1:n:increment]
                for my in array:
                    ff+=sort_timehelp(x,quick_sort)
                    output="%d\t %f" %(my, ff)
                    print output
        saving=input("You want to save data ? type 0 to continue or 1 to save " )
        if saving == 0:
            continue
        if saving == 1:
            ask=raw_input("Type the name file: ")
            fileout=open(ask+".csv","w")
            fileout.write(output)
            fileout.close()
The second problem is that when I try to save the data it only saves the last line, but I want to save everything.
I would appreciate any help.
Edit:
The timing function takes an array and a sorting algorithm.
I want to save the numbers by increments and the timing corresponding to each. (That's what my for loop is for.)
Your random pattern is actually a random pattern, not a sorted list as the docstring suggests.
To save everything, open your output file for appending, not just writing (which, as you've found, overwrites the previous contents). That is, use "a" instead of "w".
There are a lot of issues. Let's go through them...
def rand_array(n):
    ''' returns sorted array of integers of size n'''
    R=[randint(1, 1000*n) for i in xrange(n)]
    return R
This doesn't return a sorted array of random numbers. It returns an unsorted list of n random integers, each drawn from the range 1 to 1000*n. If you really want a sorted list, as the docstring says, you probably want:
def rand_array(n):
    ''' returns sorted array of integers of size n'''
    return sorted([randint(1, 1000) for i in xrange(n)])
def sorted_array(n):
    ''' returns a sorted array of n integers'''
    return [i for i in xrange(1,n+1)]
This should simply be:
def sorted_array(n):
    ''' returns a sorted array of n integers'''
    return range(1, n + 1)
def rev_array(n):
    '''returns an array of n integers in reverse order'''
    R= [i for i in reversed(xrange(1,n+1))]
    return R
is simply:
def rev_array(n):
    '''returns an array of n integers in reverse order'''
    # slice to get a list; reversed() alone returns an iterator, which cannot be sliced later
    return sorted_array(n)[::-1]
i=0
output="algorith: quick sort \n input data: %s" %s
print output
while i<y:
    i=i+1
    ff=0.0
    array=x[increment-1:n:increment]
    for my in array:
        ff+=sort_timehelp(x,quick_sort)
        output="%d\t %f" %(my, ff)
        print output
So you're sorting as many times (in the inner loop) as you have array elements? Not sure why. Anyway, the business with i should simply be done with a for loop:
print "algorith: quick sort \n input data: %s" %s
for i in range(y):
    ff = 0.0
    array = x[increment-1:n:increment]
    for my in array:
        ff += sort_timehelp(x, quick_sort)
        output = "%d\t %f" %(my, ff)
        print output
saving=input("You want to save data ? type 0 to continue or 1 to save " )
if saving == 0:
    continue
if saving == 1:
    ask=raw_input("Type the name file: ")
    fileout=open(ask+".csv","w")
    fileout.write(output)
    fileout.close()
The if saving==0 clause can be removed; any value of saving other than 1 will skip saving.
As Scott pointed out, you want "a" instead of "w" in open. Another thing you could do is move the open and close out of the loop. You might also want to use the built-in Python csv module.
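To make the last two suggestions concrete, here is a rough Python 3 sketch that collects every (size, time) pair in a list and writes them all in one go with the csv module. The function names time_sort, run_tests and save_results are invented for this example, list.sort stands in for the quick_sort function from the question (which is not shown), and the array size is assumed to grow by the increment on each test.

```python
import csv
import random
import time


def time_sort(sort_fn, data):
    """Return the seconds sort_fn takes on a fresh copy of data."""
    work = list(data)  # copy, so every run starts from unsorted input
    start = time.perf_counter()
    sort_fn(work)
    return time.perf_counter() - start


def run_tests(sort_fn, start_size, increment, num_tests):
    """Time sort_fn on random arrays of growing size; return (size, seconds) pairs."""
    results = []
    for k in range(num_tests):
        size = start_size + k * increment
        data = [random.randint(1, 1000 * size) for _ in range(size)]
        results.append((size, time_sort(sort_fn, data)))
    return results


def save_results(path, results):
    """Write a header line plus one line per result; the file is opened once."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["size", "seconds"])
        writer.writerows(results)


for size, seconds in run_tests(list.sort, 1000, 1000, 3):
    print("%d\t%f" % (size, seconds))
```

Because the results are kept in a list and not in a single overwritten string, a later call such as save_results("timings.csv", results) writes every row, and "w" mode is then fine since the file is opened only once.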
