I am trying to generate combinations of IDs.
Input: cid = SPARK
Output: a list of all the combinations shown below; the position of each element should stay constant. I am a beginner in Python, so any help is much appreciated.
'S****'
'S***K'
'S**R*'
'S**RK'
'S*A**'
'S*A*K'
'S*AR*'
'S*ARK'
'SP***'
'SP**K'
'SP*R*'
'SP*RK'
'SPA**'
'SPA*K'
'SPAR*'
'SPARK'
I tried the code below, but I need something dynamic:
cid = 'SPARK'
# print(cid.replace(cid[1],'*'))
# cu_len = length of cid [SPARK] here which is 5
# com_stars = how many stars i.e '*' or '**'
def cubiod_combo_gen(cu_len, com_stars, j_ite, i_ite):
    cubiodList = []
    crange = cu_len
    i = i_ite #2 #3
    j = j_ite #1
    # com_stars = ['*','**','***','****']
    while( i <= crange):
        # print(j,i)
        if len(com_stars) == 1:
            x = len(com_stars)
            n_cid = cid.replace(cid[j:i],com_stars)
            i += x
            j += x
            cubiodList.append(n_cid)
        elif len(com_stars) == 2:
            x = len(com_stars)
            n_cid = cid.replace(cid[j:i],com_stars)
            i += x
            j += x
            cubiodList.append(n_cid)
        elif len(com_stars) == 3:
            x = len(com_stars)
            n_cid = cid.replace(cid[j:i],com_stars)
            i += x
            j += x
            cubiodList.append(n_cid)
    return cubiodList
    #print(i)
    #print(n_cid)
    # for item in cubiodList:
    # print(item)
print(cubiod_combo_gen(5,'*',1,2))
print(cubiod_combo_gen(5,'**',1,3))
You can represent each candidate output as a binary string with one bit per character of the input: a 1 for a character that stays the same and a 0 for a character that is replaced with an asterisk.
def cubiod_combo_gen(string, count_star):
    str_list = [char0 for char0 in string] # a list with the characters of the string
    itercount = 2 ** (len(str_list)) # 2 to the power of the length of the input string
    results = []
    for config in range(itercount):
        # get the binary representation of config as a string
        binary_repr = bin(config)[2:]
        while len(binary_repr) < len(str_list):
            binary_repr = '0' + binary_repr # add padding
        # construct a list with asterisks
        i = -1
        result_list = str_list.copy() # shallow copy, this made me spend like 10 minutes debugging lol
        for char in binary_repr:
            i += 1
            if char == '0':
                result_list[i] = '*'
            if char == '1':
                result_list[i] = str_list[i]
        # now we have a possible string value
        if result_list.count('*') == count_star:
            # convert back to string and add to list of accepted strings
            result = ''
            for i in result_list:
                result = result + i
            results.append(result)
    return results
# this function returns the value, so you have to use `print(cubiod_combo_gen(args))`
# comment this stuff out if you don't want an interactive user prompt
string = input('Enter a string : ')
count_star = input('Enter number of stars : ')
print(cubiod_combo_gen(string, int(count_star)))
It processes a 16-character string in about 4 seconds and an 18-character string in about 17 seconds. You also misspelled "cuboid", but I kept your original spelling.
Enter a string : DPSCT
Enter number of stars : 2
['**SCT', '*P*CT', '*PS*T', '*PSC*', 'D**CT', 'D*S*T', 'D*SC*', 'DP**T', 'DP*C*', 'DPS**']
As a side effect of this binary counting, the list is ordered by the asterisks, where the earliest asterisk takes precedence, with next earliest asterisks breaking ties.
If you want the results for several asterisk counts, for example 1, 4, 5, and 6 asterisks in "ABCDEFG", you can use something like
star_counts = (1, 4, 5, 6)
string = 'ABCDEFG'
for i in star_counts:
    print(cubiod_combo_gen(string, i))
If you want the nice formatting you have in your question, try adding this block at the end of your code:
def formatted_cuboid(string, count_star):
    values = cubiod_combo_gen(string, count_star)
    for i in values:
        print(i)
I honestly do not know what your j_ite and i_ite are, but it seems like they have no use so this should work. If you still want to pass these arguments, change the first line to def cubiod_combo_gen(string, count_star, *args, **kwargs):
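As an alternative to enumerating all 2**n bit patterns and filtering them, here is a minimal sketch that picks the asterisk positions directly with itertools.combinations. The name star_combos is made up for this illustration and is not part of the code above; for the 'DPSCT' example with 2 stars it should give the same list in the same order.

```python
from itertools import combinations

def star_combos(text, count_star):
    """Return every variant of text with exactly count_star characters masked."""
    results = []
    # Each combination is a tuple of positions that receive an asterisk.
    for positions in combinations(range(len(text)), count_star):
        chars = list(text)
        for pos in positions:
            chars[pos] = '*'
        results.append(''.join(chars))
    return results

print(star_combos('DPSCT', 2))
```

Because only the patterns with the requested number of stars are generated, the amount of work grows with the number of results rather than with 2**n.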
I am not sure what com_stars does, but the following code produces your sample output.
def cuboid_combo(cid):
    fill_len = len(cid)-1
    items = []
    for i in range(2 ** fill_len):
        binary = f'{i:0{fill_len}b}'
        #print(binary, 'binary', 'num', i)
        s = cid[0]
        for idx, bit in enumerate(binary,start=1):
            if bit == '0':
                s += '*'
            else: # bit == '1'
                s += cid[idx]
        items.append(s)
    return items
#cid = 'ABCDEFGHI'
cid = 'DPSCT'
result = cuboid_combo(cid)
for item in result:
    print(item)
Prints:
D****
D***T
D**C*
D**CT
D*S**
D*S*T
D*SC*
D*SCT
DP***
DP**T
DP*C*
DP*CT
DPS**
DPS*T
DPSC*
DPSCT
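The same listing can also be sketched with itertools.product, which chooses between '*' and the original character at every position after the first. The function name masked_ids is hypothetical, and the sketch assumes cid is not empty.

```python
from itertools import product

def masked_ids(cid):
    """Keep the first character; every later character is either '*' or itself."""
    head, tail = cid[0], cid[1:]
    # One ('*', ch) pair per remaining character; product walks all choices
    # with the last position varying fastest, like binary counting.
    options = [('*', ch) for ch in tail]
    return [head + ''.join(choice) for choice in product(*options)]

for item in masked_ids('SPARK'):
    print(item)
```

Putting '*' first in each pair is what makes the output start with S**** and end with SPARK.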
I am practicing an algorithm on a website.
I want to insert a comma (,) into a number every 3 digits.
But 'a', the variable I built myself, does not give the correct answer.
'b', which I found by searching, gives the correct answer.
Can you tell me why 'a' is not the same as 'b'?
length = 8
data = "12421421"
inv_result = []
for index in range(length):
    if index % 3 == 0:
        inv_result.append(',')
        inv_result.append(str(data[index]))
    else:
        inv_result.append(str(data[index]))
result = inv_result[::-1]
#first comma delete
result.pop()
a = ''.join(result)
b = format(int(data),",")
print(a)
print(b)
print(a == b)
The result is:
12,412,421
12,421,421
False
Your problem is that you didn't reverse the data in the beginning. The following (slightly cleaned up) code works:
length = 8
data = "12421421"
inv_data = data[::-1]
inv_result = []
for index in range(length):
    if index % 3 == 0:
        inv_result.append(',')
    inv_result.append(str(inv_data[index]))
result = inv_result[::-1]
#first comma delete
result.pop()
a = ''.join(result)
b = format(int(data),",")
print(a)
print(b)
print(a == b)
It is because you reverse the result with this line:
result = inv_result[::-1]
If you didn't reverse the order, the digits would come out in the right order:
result = inv_result
result.pop(0) # remove first character which is a comma
But this only works if the number of digits is a multiple of three. For example, if your digits were 1234, doing it this way would give 123,4 instead of the desired 1,234.
So you have to reverse the string at the beginning or go through it in reverse order. Then keep the later inversion and pop() as you had them.
for index in range(length):
    if index % 3 == 0:
        inv_result.append(',')
    inv_result.append(str(data[-1-index])) # count from -1 to more negative, equivalent to going backwards through string
result = inv_result[::-1]
#first comma delete
result.pop()
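To check the point about lengths that are not a multiple of three, here is a small self-contained sketch of the "walk the digits from the right" idea. The name add_commas is invented for this example; it places a comma before every third digit counted from the right and skips the comma at position 0, so nothing has to be popped afterwards.

```python
def add_commas(digits):
    """Group a string of digits in threes, counting from the right."""
    pieces = []
    for index, ch in enumerate(reversed(digits)):
        # A comma goes between groups, never before the very first digit.
        if index and index % 3 == 0:
            pieces.append(',')
        pieces.append(ch)
    return ''.join(reversed(pieces))

for sample in ("1234", "12421421", "123"):
    print(sample, add_commas(sample), add_commas(sample) == format(int(sample), ","))
```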
A solution with comprehension:
data = "12421421"
len_data = len(data)
triplets_num = len_data // 3
remainder = len_data % 3
triplets = [data[:remainder]] if remainder else []
triplets += [data[remainder+i*3:remainder+3+i*3] for i in range(triplets_num)]
result = ','.join(triplets)
print(result)
Hello, I need your help with a complicated task.
Here is a file1.txt :
>Name1.1_1-40_-__Sp1
AAAAAACC-------------
>Name1.1_67-90_-__Sp1
------CCCCCCCCC------
>Name1.1_90-32_-__Sp1
--------------CCDDDDD
>Name2.1_20-89_-__Sp2
AAAAAACCCCCCCCCCC----
>Name2.1_78-200_-__Sp2
-------CCCCCCCCCCDDDD
and the idea is to create a new file called file1.txt_Hsp such as:
>Name1.1-3HSPs-__Sp1
AAAAAACCCCCCCCCCDDDDD
>Name3.1_-__Sp2
AAAAAACCCCCCCCCCC----
>Name4.1_-__Sp2
-------CCCCCCCCCCCCCC
So basically the idea is to:
Compare each sequence from the same SpN <-- (here it is very important only with the same SpN name) with each other in file1.txt.
For instance I will have to compare :
Name1.1_1-40_-__Sp1 vs Name1.1_67-90_-__Sp1
Name1.1_1-40_-__Sp1 vs Name1.1_90-32_-__Sp1
Name1.1_67-90_-__Sp1 vs Name1.1_90-32_-__Sp1
Name2.1_20-89_-__Sp2 vs Name2.1_78-200_-__Sp2
So, for example, when I compare:
Name1.1_1-40_-__Sp1 vs Name1.1_67-90_-__Sp1 I get :
>Name1.1_1-40_-__Sp1
AAAAAACC-------------
>Name1.1_67-90_-__Sp1
------CCCCCCCCC------
Here I want to concatenate the two sequences if the ratio (number of letters aligned with another letter) / (number of letters aligned with a -) is < 0.20.
Here, for example, there are 21 characters, and the number of letters aligned with another letter is 2 (C and C).
The number of letters aligned with a - is 13 (AAAAAA+CCCCCCC),
so
ratio = 2/13 = 0.1538462
and because this ratio is < 0.20, I want to concatenate these 2 sequences such as:
>Name1.1-2HSPs_-__Sp1
AAAAAACCCCCCCCC------
(As you can see, the name of the new sequence is now Name1.1-2HSPs_-__Sp1, with the 2 meaning that 2 sequences were concatenated.) So we replace the number-number part with XHSPs, X being the number of sequences concatenated.
and get the file1.txt_Hsp :
>Name1.1-2HSPs_-__Sp1
AAAAAACCCCCCCCC------
>Name1.1_90-32_-__Sp1
--------------CCDDDDD
>Name2.1_20-89_-__Sp2
AAAAAACCCCCCCCCCC----
>Name2.1_78-200_-__Sp2
-------CCCCCCCCCCDDDD
Then I do it again with Name1.1-2HSPs_-__Sp1 vs Name1.1_90-32_-__Sp1
>Name1.1-2HSPs_-__Sp1
AAAAAACCCCCCCCC------
>Name1.1_90-32-__Sp1
--------------CCDDDDD
Where ratio = 1/20 = 0.05
Then because the ratio is < 0.20 I want to concatenate this 2 sequences such as :
>Name1.1-3HSPs_-__Sp1
AAAAAACCCCCCCCCCDDDDD
(As you can see, the name of the new sequence is now Name1.1-3HSPs_-__Sp1, with the 3 meaning that 3 sequences were concatenated.)
file1.txt_Hsp:
>Name1.1-3HSPs_-__Sp1
AAAAAACCCCCCCCCCDDDDD
>Name2.1_20-89_-__Sp2
AAAAAACCCCCCCCCCC----
>Name2.1_78-200_-__Sp2
-------CCCCCCCCCCDDDD
Then I do it again with Name2.1_20-89_-__Sp2 vs Name2.1_78-200_-__Sp2
>Name2.1_20-89_-__Sp2
AAAAAACCCCCCCCCCC----
>Name2.1_78-200_-__Sp2
-------CCCCCCCCCCDDDD
Where ratio = 10/11 = 0.9090909
Then because the ratio is > 0.20 I do nothing and get the final file1.txt_Hsp:
>Name1.1-3HSPs_-__Sp1
AAAAAACCCCCCCCCCDDDDD
>Name2.1_20-89_-__Sp2
AAAAAACCCCCCCCCCC----
>Name2.1_78-200_-__Sp2
-------CCCCCCCCCCDDDD
Which is the final result I needed.
A simpler example would be:
>Name1.1_10-60_-__Seq1
AAA------
>Name1.1_70-120_-__Seq1
--AAAAAAA
>Name2.1_12-78_-__Seq2
--AAAAAAA
The ratio is 1/8 = 0.125, because only 1 letter is aligned with another letter and 8 letters are aligned with a (-).
Because the ratio is < 0.20, I concatenate the two Seq1 sequences into:
>Name1.1_2HSPs_-__Seq1
AAAAAAAAA
and the new file should be :
>Name1.1_2HSPs_-__Seq1
AAAAAAAAA
>Name2.1_-__Seq2
--AAAAAAA
** Here is an example from my real data **
>YP_009186705
MMSCQSWMMKYFTKVCNRSNLALPFDQSVNPVSFSMISSHDVMLKLDDEIFYKSLNQSNL
ALPFDQSVNPVSFSMISSHDLIA
>XO009980.1_26784332-20639090_-__Agapornis_vilveti
------------------------------------------------------LNQSNL
ALPFDQSVNPVSFSMISSHDLIA
>CM009917.1_20634332-20634508_-__Neodiprion_lecontei
---CDSWMIKFFARISQMC---IKIHSKYEEVSFFLFQSK--KKKIADSHFFRSLNQDTA
-------LNTVSY----------
>XO009980.1_20634508-20634890_-__Agapornis_vilveti
MMSCQSWMMKYFTKVCNRSNLALPFDQSVNPVSFSMISSHDVMLKL--------------
-----------------------
>YUUBBOX12
MMSCQSWMMKYFTKVCNRSNLALPFDQSVNPVSFSMISSHDVMLKLDDEIFYKSLNQSNL
ALPFDQSVNPVSFSMISSHDLIA
and I should get :
>YP_009186705
MMSCQSWMMKYFTKVCNRSNLALPFDQSVNPVSFSMISSHDVMLKLDDEIFYKSLNQSNL
ALPFDQSVNPVSFSMISSHDLIA
>XO009980.1_2HSPs_-__Agapornis_vilveti
MMSCQSWMMKYFTKVCNRSNLALPFDQSVNPVSFSMISSHDVMLKLLNQSNL
ALPFDQSVNPVSFSMISSHDLIA
>CM009917.1_20634332-20634508_-__Neodiprion_lecontei
---CDSWMIKFFARISQMC---IKIHSKYEEVSFFLFQSK--KKKIADSHFFRSLNQDTA
-------LNTVSY----------
>YUUBBOX12
MMSCQSWMMKYFTKVCNRSNLALPFDQSVNPVSFSMISSHDVMLKLDDEIFYKSLNQSNL
ALPFDQSVNPVSFSMISSHDLIA
the ratio between XO009980.1_26784332-20639090_-__Agapornis_vilveti and XO009980.1_20634508-20634890_-__Agapornis_vilveti was : 0/75 = 0
Here, as you can see, some sequences do not have the [\d]+[-]+[\d] pattern, such as YP_009186705 or YUUBBOX12. These do not have to be concatenated; they just have to be added to the output file.
Thanks a lot for your help.
First, let's read the text files into tuples of (name, seq):
with open('seq.txt', 'r+') as f:
    lines = f.readlines()
seq_map = []
for i in range(0, len(lines), 2):
    seq_map.append((lines[i].strip('\n'), lines[i+1].strip('\n')))
#[('>Name1.1_10-60_-__Seq1', 'AAA------'),
# ('>Name1.1_70-120_-__Seq1', '--AAAAAAA'),
# ('>Name2.1_12-78_-__Seq2', '--AAAAAAA')]
#
# or
#
# [('>Name1.1_1-40_-__Sp1', 'AAAAAACC-------------'),
# ('>Name1.1_67-90_-__Sp1', '------CCCCCCCCC------'),
# ('>Name1.1_90-32_-__Sp1', '--------------CCDDDDD'),
# ('>Name2.1_20-89_-__Sp2', 'AAAAAACCCCCCCCCCC----'),
# ('>Name2.1_78-200_-__Sp2', '-------CCCCCCCCCCDDDD')]
Then we define helper functions: check_concat decides whether two sequences should be concatenated, concat_seq merges the sequences, and concat_name merges the names (with a small helper, count_num, for reading the HSPs count):
import re
def count_num(x):
    num = re.findall(r'[\d]+?(?=HSPs)', x)
    count = int(num[0]) if num and 'HSPs' in x else 1
    return count
def concat_name(nx, ny):
    count, new_name = 0, []
    count += count_num(nx)
    count += count_num(ny)
    for ind, x in enumerate(nx.split('_')):
        if ind == 1:
            new_name.append('{}HSPs'.format(count))
        else:
            new_name.append(x)
    new_name = '_'.join([x for x in new_name])
    return new_name
def concat_seq(x, y):
    mash, new_seq = zip(x, y), ''
    for i in mash:
        if i.count('-') > 1:
            new_seq += '-'
        else:
            new_seq += i[0] if i[1] == '-' else i[1]
    return new_seq
def check_concat(x, y):
    mash, sim, dissim = zip(x, y), 0 ,0
    for i in mash:
        if i[0] == i[1] and '-' not in i:
            sim += 1
        if '-' in i and i.count('-') == 1:
            dissim += 1
    return False if not dissim or float(sim)/float(dissim) >= 0.2 else True
Then we will write a script to run over the tuples in sequence, checking for spn matches, then concat_checks, and taking forward the new pairing for the next comparison, adding to the final list where necessary:
tmp_seq_map = seq_map[:]
final_seq = []
for ind in range(1, len(seq_map)):
    end = True if ind == len(seq_map)-1 else False
    pair_a = tmp_seq_map[ind-1]
    pair_b = tmp_seq_map[ind]
    name_a = pair_a[0][:]
    name_b = pair_b[0][:]
    if name_a.split('__')[1] == name_b.split('__')[1]:
        if check_concat(pair_a[1], pair_b[1]):
            new_name = concat_name(pair_a[0], pair_b[0])
            new_seq = concat_seq(pair_a[1], pair_b[1])
            tmp_seq_map[ind] = (((new_name, new_seq)))
            if end:
                final_seq.append(tmp_seq_map[ind])
                end = False
        else:
            final_seq.append(pair_a)
    else:
        final_seq.append(pair_a)
    if end:
        final_seq.append(pair_b)
print(final_seq)
#[('>Name1.1_2HSPs_-__Seq1', 'AAAAAAAAA'),
# ('>Name2.1_12-78_-__Seq2', '--AAAAAAA')]
#
# or
#
#[('>Name1.1_3HSPs_-__Sp1', 'AAAAAACCCCCCCCCCDDDDD'),
# ('>Name2.1_20-89_-__Sp2', 'AAAAAACCCCCCCCCCC----'),
# ('>Name2.1_78-200_-__Sp2', '-------CCCCCCCCCCDDDD')]
Please note that I have checked for concatenation of only consecutive sequences from the text files, and that you would have to re-use the methods I've written in a different script for accounting for combinations. I leave that to your discretion.
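For the "all combinations" case mentioned above, one possible sketch is to keep merging any pair within a species group until no pair qualifies. Everything here (overlap_ratio, merge, merge_group, and the (count, sequence) tuples) is a hypothetical illustration, not the code above. It assumes the sequences of one species have already been collected into a list, that "letter matching with another letter" means both positions are non-gap, and that on a letter clash the first sequence wins.

```python
from itertools import combinations

def overlap_ratio(a, b):
    """Letter-vs-letter columns divided by letter-vs-gap columns."""
    both = sum(1 for x, y in zip(a, b) if x != '-' and y != '-')
    one = sum(1 for x, y in zip(a, b) if (x == '-') != (y == '-'))
    return both / one if one else float('inf')

def merge(a, b):
    """Fill the gaps of a with the characters of b."""
    return ''.join(y if x == '-' else x for x, y in zip(a, b))

def merge_group(seqs, threshold=0.2):
    """seqs is a list of (hsp_count, sequence) tuples for ONE species."""
    seqs = list(seqs)
    merged = True
    while merged:
        merged = False
        for i, j in combinations(range(len(seqs)), 2):
            (count_a, seq_a), (count_b, seq_b) = seqs[i], seqs[j]
            if overlap_ratio(seq_a, seq_b) < threshold:
                seqs[i] = (count_a + count_b, merge(seq_a, seq_b))
                del seqs[j]
                merged = True
                break  # restart, because the indices have shifted
    return seqs

sp1 = [(1, 'AAAAAACC-------------'),
       (1, '------CCCCCCCCC------'),
       (1, '--------------CCDDDDD')]
print(merge_group(sp1))
```

The HSP count carried in each tuple is what would go into the XHSPs part of the new name; building the name itself is left out of this sketch.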
Hope this helps. :)
You can do this as follows.
from collections import defaultdict
with open('lines.txt','r') as fp:
    lines=fp.readlines()
dnalist = defaultdict(list)
for i,line in enumerate(lines):
    line = line.replace('\n','')
    if i%2: #'Name' in line:
        dnalist[n].append(line)
    else:
        n = line.split('-')[-1]
That gives you a dictionary whose keys are the last part of each name (the text after the final '-', i.e. the SpN suffix) and whose values are lists of the DNA sequences for that key.
def calc_ratio(str1,str2):
    n_skipped,n_matched,n_notmatched=0,0,0
    print(len(str1),len(str2))
    for i,ch in enumerate(str1):
        if ch=='-' or str2[i]=='-':
            n_skipped += 1
        elif ch == str2[i]:
            n_matched += 1
        else:
            n_notmatched+=1
    retval = float(n_matched)/float(n_matched+n_notmatched+n_skipped)
    print(n_matched,n_notmatched,n_skipped)
    return retval
That gets you the ratio; you might want to consider the case where characters in the sequences don't match (and neither is '-'). Here I assumed that this case is no different from one of them being '-'.
A helper function to concatenate the strings: here I took the case of non-matching chars and put in an 'X' to mark it (if it ever happens) .
def dna_concat(str1,str2):
    outstr=[]
    for i,ch in enumerate(str1):
        if ch!=str2[i]:
            if ch == '-':
                outchar = str2[i]
            elif str2[i] == '-':
                outchar = ch
            else:
                outchar = 'X'
        else:
            outchar = ch
        outstr.append(outchar)
    outstr = ''.join(outstr)
    return outstr
And finally a loop through the dictionary lists to get the concatenated answers, in another dictionary with the same keys and lists of concatenations as values.
answers = defaultdict(list)
for filenum,seqs in dnalist.items():
    print(seqs)
    for i,seq in enumerate(seqs):
        for seq2 in seqs[i+1:]:
            ratio = calc_ratio(seq,seq2)
            print('i {} {} ratio {}'.format(seq,seq2,ratio))
            if ratio<0.2:
                answers[filenum].append(dna_concat(seq,seq2))
                print(dna_concat(seq,seq2))
I'm trying to augment a lengthy string that can contain multiple number of digits (0,1,2,3,4,5,6,?)
Consider a string "000000000004?0??100001??2?0?10000000".
I'm trying to replace all the question marks (?) by the neighbouring largest digit. The comparison should be done from both left character and right character to the question mark (?).
Input String: "000000000004?0??100001??2?0?10000000"
Output String: "000000000004401110000122220110000000"
I wrote a function that ends up replacing them during the first iteration of the loop itself which results in replacing the ? by the highest number i.e, 4 in this case. Check the code snippet below.
Wrong Output: "000000000004404410000144240410000000"
def method_augment(aug_str):
    global pos_of_sec_char, sec_char, preced_char
    flag_marks_string_end = 0
    list_predef = ['0', '1', '2', '3', '4', '5', '6']
    len_aug_str = len(aug_str)
    for m in range(0, (len_aug_str - 1)):
        if aug_str[m] == '?':
            pos_of_first_ques = m
            if m != 0:
                preced_char = aug_str[m - 1]
                # print("preced char:", preced_char)
                for n in range((pos_of_first_ques + 1), (len_aug_str - 1)):
                    if aug_str[n] in list_predef:
                        pos_of_sec_char = n
                        sec_char = aug_str[n]
                        print(sec_char)
                        if preced_char > sec_char:
                            aug_str = aug_str.replace(aug_str[pos_of_first_ques], preced_char)
                            del preced_char, sec_char, pos_of_first_ques, m
                        else:
                            aug_str = aug_str.replace(aug_str[pos_of_first_ques], sec_char)
                            del preced_char, sec_char, pos_of_first_ques
                        break
                    else:
                        flag_marks_string_end += 1
            else:
                for q in range((pos_of_first_ques + 1), (len_aug_str - 1)):
                    if aug_str[q] in list_predef:
                        pos_of_sec_char = q
                        sec_char = aug_str[q]
                        aug_str = aug_str.replace(aug_str[pos_of_first_ques], sec_char)
                        break
                # if preced_char > sec_char:
                #     aug_str = aug_str.replace(aug_str[m], preced_char)
                # else:
                #     aug_str = aug_str.replace(aug_str[m], sec_char)
        else:
            continue
    return aug_str
Input String: "000000000004?0??100001??2?0?10000000"
Expected Output String: "000000000004401110000122220110000000"
Actual Output String: "000000000004404410000144240410000000"
There are multiple strings like this with different combinations of digit and ?. I hope I have explained it well. Please help. Thanks.
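For context on the wrong output: str.replace(old, new) replaces every occurrence of old, so passing the character found at one position ('?') rewrites all question marks at once. The following minimal sketch contrasts that with replacing a single position by slicing; the variable names are only illustrative.

```python
text = "4?0??1"

# text[1] is "?", so replace() substitutes EVERY "?" in the string.
everywhere = text.replace(text[1], "4")

# Slicing rebuilds the string with only position 1 changed.
only_here = text[:1] + "4" + text[2:]

print(everywhere)
print(only_here)
```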
Here is a way to do it, with some tests:
def replace_with_largest(s):
    out = []
    last_left = None
    next_right = None
    for i, c in enumerate(s):
        if c in '0123456789':
            out.append(c)
            last_left = c
            next_right = None # now the next digit to the right is unknown
            continue
        # Now we have a '?'.
        # We need the next digit to the right
        if next_right is None:
            for j in range(i+1, len(s)):
                if s[j] != '?':
                    next_right = s[j]
                    break
            else:
                # No more digit right of here, we'll use the last one on the left
                next_right = last_left
        out.append(max(last_left, next_right) if last_left is not None else next_right)
    return ''.join(out)
The tests, some strings and the expected output:
tests = [("000000000004?0??100001??2?0?10000000", "000000000004401110000122220110000000"),
         ("??123??1", "11123331"),
         ("123???", "123333")]
for test in tests:
    print(test[0], replace_with_largest(test[0]), replace_with_largest(test[0]) == test[1])
000000000004?0??100001??2?0?10000000 000000000004401110000122220110000000 True
??123??1 11123331 True
123??? 123333 True
Your program sounds overly complicated. I didn't even try to understand. Can you read and understand this?
import re
def method_augment(text: str) -> str:
    while "?" in text:
        text = replacesingle("0" + text + "0")[1:-1] # consider starting and ending question marks
    return text
def replacesingle(text: str) -> str:
    match = re.search("\\d\\?+\\d", text)
    span = match.span(0)
    partialtext = text[span[0]:span[1]]
    left = int(partialtext[0])
    right = int(partialtext[-1])
    larger = left if left > right else right
    number_of_question_marks = len(partialtext) - 2
    text = text[:span[0] + 1] + str(larger) * number_of_question_marks + text[span[1] - 1:]
    return text
assert(method_augment("000000000004?0??100001??2?0?10000000") == "000000000004401110000122220110000000")
assert (method_augment("??123??1??") == "1112333111")
I'm not sure how efficient this is, but you can split the string so that each run of consecutive ?s becomes its own element, then pad that list with leading and trailing characters that will never win the max comparison against a digit (the padding also makes index access slightly more convenient), e.g.:
import re
def augment(text):
    w = [' ', *[el for el in re.split(r'(\?+)', text) if el], ' ']
    for i in range(1, len(w) - 1):
        w[i] = w[i].replace('?', max(w[i - 1][-1], w[i + 1][0]))
    return ''.join(w[1:-1]).strip() or None
Then to use it, eg:
cases = [
    '000000000004?0??100001??2?0?10000000',
    '?????????9',
    '9????????0',
    '0????????9',
    '?0???????9',
    '123???????',
    '12?????321',
    '??????????',
]
for case in cases:
    print(case, '->', augment(case))
Which gives you:
000000000004?0??100001??2?0?10000000 -> 000000000004401110000122220110000000
?????????9 -> 9999999999
9????????0 -> 9999999990
0????????9 -> 0999999999
?0???????9 -> 0099999999
123??????? -> 1233333333
12?????321 -> 1233333321
?????????? -> None
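Another way to express the same idea, shown here only as a sketch, is a single re.sub call whose replacement function looks at the characters on either side of each run of question marks. The name fill_question_marks is made up; unlike the version above, it leaves a string made only of question marks unchanged instead of returning None.

```python
import re

def fill_question_marks(text):
    def repl(match):
        # Neighbour on each side of the run, or '' at the string boundary.
        left = text[match.start() - 1] if match.start() > 0 else ''
        right = text[match.end()] if match.end() < len(text) else ''
        best = max(left, right)  # '' compares lower than any digit
        return best * len(match.group()) if best else match.group()
    return re.sub(r'\?+', repl, text)

print(fill_question_marks("000000000004?0??100001??2?0?10000000"))
```

Since runs of question marks are matched greedily, the neighbours of a run are never question marks themselves, so comparing the two characters as strings is enough for single digits.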
I need a Python function which gives reversed string with the following conditions.
$ position should not change in the reversed string.
Should not use Python built-in functions.
Function should be an efficient one.
Example : 'pytho$n'
Result : 'nohty$p'
I have already tried with this code:
list = "$asdasdas"
list1 = []
position = ''
for index, i in enumerate(list):
    if i == '$':
        position = index
    elif i != '$':
        list1.append(i)
reverse = []
for index, j in enumerate( list1[::-1] ):
    if index == position:
        reverse.append( '$' )
    reverse.append(j)
print(reverse)
Thanks in advance.
Recognise that it's a variation on the partitioning step of the Quicksort algorithm, using two pointers (array indices) thus:
data = list("foo$barbaz$$")
i, j = 0, len(data) - 1
while i < j:
    while i < j and data[i] == "$": i += 1
    while i < j and data[j] == "$": j -= 1
    data[i], data[j] = data[j], data[i]
    i, j = i + 1, j - 1
"".join(data)
'zab$raboof$$'
P.S. it's a travesty to write this in Python!
A Pythonic solution could look like this:
def merge(template, data):
    for c in template:
        yield c if c == "$" else next(data)
data = "foo$barbaz$$"
"".join(merge(data, reversed([c for c in data if c != "$"])))
'zab$raboof$$'
Wrote this without using any inbuilt functions. Hope it fulfils your criteria -
string = "zytho$n"
def reverse(string):
    string_new = string[::-1]
    i = 0
    position = 0
    position_new = 0
    for char in string:
        if char=="$":
            position = i
            break
        else:
            i = i + 1
    j = 0
    for char in string_new:
        if char=="$":
            position_new = j
            break
        else:
            j = j + 1
    final_string = string_new[:position_new]+string_new[position_new+1:position+1]+"$"+string_new[position+1:]
    return(final_string)
string_new = reverse(string)
print(string_new)
The output of this is-
nohty$z
To explain the code to you: first I used [::-1], which starts at the last position of the string and moves backward, so it reverses the string. Then I found the position of the $ in both the old and the new string by scanning each one and stopping at the first $, so the code assumes there is exactly one $ present. Next I stitched the string back together from four parts: the part of the new string up to the $ sign, the part of the new string from after the $ sign to the position of the $ sign in the old string, then the $ sign, and after that the rest of the new string. Note that this slicing assumes the $ sits at or after the middle of the original string, as in the example.
I'm currently learning to create generators and to use itertools. So I decided to make a string index generator, but I'd like to add some parameters such as a "start index" allowing to define where to start generating the indexes.
I came up with this ugly solution which can be very long and not efficient with large indexes:
import itertools
import string
class StringIndex(object):
    '''
    Generator that create string indexes in form:
    A, B, C ... Z, AA, AB, AC ... ZZ, AAA, AAB, etc.
    Arguments:
    - startIndex = string; default = ''; start increment for the generator.
    - mode = 'lower' or 'upper'; default = 'upper'; is the output index in
    lower or upper case.
    '''
    def __init__(self, startIndex = '', mode = 'upper'):
        if mode == 'lower':
            self.letters = string.ascii_lowercase
        elif mode == 'upper':
            self.letters = string.ascii_uppercase
        else:
            cmds.error ('Wrong output mode, expected "lower" or "upper", ' +
                        'got {}'.format(mode))
        if startIndex != '':
            if not all(i in self.letters for i in startIndex):
                cmds.error ('Illegal characters in start index; allowed ' +
                            'characters are: {}'.format(self.letters))
        self.startIndex = startIndex
    def getIndex(self):
        '''
        Returns:
        - string; current string index
        '''
        startIndexOk = False
        x = 1
        while True:
            strIdMaker = itertools.product(self.letters, repeat = x)
            for stringList in strIdMaker:
                index = ''.join([s for s in stringList])
                # Here is the part to simplify
                if self.startIndex:
                    if index == self.startIndex:
                        startIndexOk = True
                    if not startIndexOk:
                        continue
                ###
                yield index
            x += 1
Any advice or improvement is welcome. Thank you!
EDIT:
The start index must be a string!
You would have to do the arithmetic (in base 26) yourself to avoid looping over itertools.product. But you can at least set x=len(self.startIndex) or 1!
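One way to "do the arithmetic yourself" without converting to a number at all is to increment the index like an odometer: bump the last letter, and carry to the left whenever a letter rolls over. This is only a sketch under my own assumptions; next_index and string_indexes are hypothetical names, and the start index is assumed to be non-empty and made of characters from letters.

```python
import string
from itertools import islice

def next_index(index, letters=string.ascii_uppercase):
    """Return the index that follows `index` (Z -> AA, AZ -> BA, ZZ -> AAA)."""
    chars = list(index)
    pos = len(chars) - 1
    while pos >= 0:
        if chars[pos] != letters[-1]:
            # No rollover here: bump this letter and stop.
            chars[pos] = letters[letters.index(chars[pos]) + 1]
            return ''.join(chars)
        chars[pos] = letters[0]  # rollover, carry to the left
        pos -= 1
    # Every position rolled over, so the index grows by one letter.
    return letters[0] + ''.join(chars)

def string_indexes(start='A', letters=string.ascii_uppercase):
    """Yield start, then every following index, without scanning from 'A'."""
    index = start
    while True:
        yield index
        index = next_index(index, letters)

print(list(islice(string_indexes('Y'), 4)))
```

The generator begins at the start index immediately, so a large start index costs nothing extra.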
Old (incorrect) answer
If you would do it without itertools (assuming you start with a single letter), you could do the following:
letters = 'abcdefghijklmnopqrstuvwxyz'
def getIndex(start, case):
    lets = list(letters.lower()) if case == 'lower' else list(letters.upper())
    # default is 'upper', but can also be an elif
    for r in xrange(0,10):
        for l in lets[start:]:
            if l.lower() == 'z':
                start = 0
            yield ''.join(lets[:r])+l
I run until at most 10 rows of letters are created, but you could of course use an infinite while loop so that it can be called forever.
Correct answer
I found a solution in a different way: I used a base-26 number translator, based on (and fixed, since it didn't work perfectly) http://quora.com/How-do-I-write-a-program-in-Python-that-can-convert-an-integer-from-one-base-to-another
It uses itertools.count() to count and just loops over all the possibilities.
The code:
import time
from itertools import count
def toAlph(x, letters):
    # bijective base 26: 0 -> A, 25 -> Z, 26 -> AA, 27 -> AB, ...
    div = len(letters)
    x += 1
    r = ''
    while x > 0:
        x, rem = divmod(x - 1, div)
        r = letters[rem] + r
    return r
def getIndex(start, case='upper'):
    alphabet = 'abcdefghijklmnopqrstuvwxyz'
    letters = alphabet.upper() if case == 'upper' else alphabet
    started = False
    for num in count(0,1):
        l = toAlph(num, letters)
        if l == start:
            started = True
        if started:
            yield l
iterator = getIndex('AA')
for i in iterator:
    print(i)
    time.sleep(0.1)