how to get all value in list in same time - python

following show the my list value (variable is "data")
[{'letters': ['R', 'V', 'X', 'U', 'M', 'Z', 'B', 'O', 'R'],
'words': ['RVX', 'BOM', 'RUB', 'RUZ', 'MBOOO', 'RMR'],
'score': 51},
{'letters': ['P', 'X', 'M', 'R', 'D', 'S', 'I', 'C', 'E'],
'words': ['PXM', 'RDS', 'ICE', 'PRI', 'DSCE', 'PXM', 'MRE'],
'score': 54}]
Then I used the following code
print(*data)
output was
{'letters': ['R', 'V', 'X', 'U', 'M', 'Z', 'B', 'O', 'R'], 'words': ['RVX', 'BOM', 'RUB', 'RUZ', 'MBOOO', 'RMR'], 'score': 51} {'letters': ['P', 'X', 'M', 'R', 'D', 'S', 'I', 'C', 'E'], 'words': ['PXM', 'RDS', 'ICE', 'PRI', 'DSCE', 'PXM', 'MRE'], 'score': 54}
but I want to get that output with comma with saparate two JSON
Like this
{'letters': ['R', 'V', 'X', 'U', 'M', 'Z', 'B', 'O', 'R'], 'words': ['RVX', 'BOM', 'RUB', 'RUZ', 'MBOOO', 'RMR'], 'score': 51},
{'letters': ['P', 'X', 'M', 'R', 'D', 'S', 'I', 'C', 'E'], 'words': ['PXM', 'RDS', 'ICE', 'PRI', 'DSCE', 'PXM', 'MRE'], 'score': 54}
Any one can help me
Thank you

Is this what you are trying to accomplish?
print(*data, sep=",\n")
That will put your items on separate lines, with commas after all but the last.

You can try this:
print(*data, sep=',')

Related

Remove adjacent items group in list based on length of elements

I have a list of items from PDF text extraction in this way:
['performed three times. Data represent the mean±SEM of threeindependent experiments. *P<0.05, **P<0.005, ***P<0.001.', 'B','O-GlcNAc', 'AMPK', 'pAMPK', 'SREBP-1', 'ACC', 'FAS', 'actin', 'C','T', 'L', 'D', 'O', 'N', 'T', 'M', 'G', 'C', 'T', 'L', 'D', 'O', 'N','T', 'M', 'G', 'HaCaT HeLa', 'O', '-G', 'lN', 'A', 'c', 'le', 'v','e', 'l', '(A', '.U', ')', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L','0.0', '2.5', '1.5', '1.0', '0.5', 'H', 'a', 'C', 'a', 'T', 'D', 'O','N', 'H', 'a', 'C', 'a', 'T', 'T', 'M', 'G', '2.0', '**', '***', 'S','R', 'E', 'B', 'P', '-1', 'le', 'v', 'e', 'l', 'H', 'a', 'C', 'a','T', 'C', 'T', 'L', '0.0', '1.5', '1.0', '0.5', 'H', 'a', 'C', 'a','T', 'D', 'O', 'N', 'H', 'a', 'C', 'a', 'T', 'T', 'M', 'G', '2.0','F', 'A', 'S', '(A', '.U', ')', 'H', 'a', 'C', 'a', 'T', 'C', 'T','L', '0.0', '1.5', '1.0', '0.5', 'H', 'a', 'C', 'a', 'T', 'D', 'O','N', 'H', 'a', 'C', 'a', 'T', 'T', 'M', 'G', '2.0', 'A', 'C', 'C','(A', '.U', ')', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', '0.0', '2.5','1.5', '1.0', '0.5', 'H', 'a', 'C', 'a', 'T', 'D', 'O', 'N', 'H', 'a','C', 'a', 'T', 'T', 'M', 'G', '2.0', 'O', '-G', 'lc', 'N', 'A', 'c','le', 'v', 'e', 'l', '(A', '.U', ')', 'H', 'e', 'L', 'a', 'C', 'T','L', '0.0', '1.5', '1.0', '0.5', 'H', 'e', 'L', 'a', 'D', 'O', 'N','H', 'e', 'L', 'a', 'T', 'M', 'G', '2.0', '***', '***', 'S', 'R', 'E','B', 'P', '-1', 'le', 'v', 'e', 'l', 'H', 'e', 'L', 'a', 'C', 'T','L', '0.0', '1.5', '1.0', '0.5', 'H', 'e', 'L', 'a', 'D', 'O', 'N','H', 'e', 'L', 'a', 'T', 'M', 'G', '2.0', '***', 'F', 'A', 'S', '(A','.U', ')', 'H', 'e', 'L', 'a', 'C', 'T', 'L', '0.0', '1.5', '1.0','0.5', 'H', 'e', 'L', 'a', 'D', 'O', 'N', 'H', 'e', 'L', 'a', 'T','M', 'G', '2.0', '***', 'A', 'C', 'C', '(A', '.U', ')', 'H', 'e', 'L','a', 'C', 'T', 'L', '0.0', '1.5', '1.0', '0.5', 'H', 'e', 'L', 'a','D', 'O', 'N', 'H', 'e', 'L', 'a', 'T', 'M', 'G', '2.0', '***', '***','***', '***', '***', '***', '***', '*** ***', '***', 'O-GlcNAc','AMPK', 'pAMPK', 'SREBP-1', 'ACC', 'FAS', '�-actin', 'O', '-G', 'lN','A', 'c', 'le', 'v', 'e', 'l', '(A', '.U', ')', 'C', 'T', 'L', 'H','a', 'C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u', 'e', 'r','H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a', '2.0','Q', 'u', 'e', 'r', 'H', 'e', 'L', 'a', 'p', 'A', 'M', 'P', 'K', '(A','.U', ')', 'C', 'T', 'L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5','1.0', '0.5', 'Q', 'u', 'e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T','L', 'H', 'e', 'L', 'a', '2.5 ***', 'Q', 'u', 'e', 'r', 'H', 'e', 'L','a', 'S', 'R', 'E', 'B', 'P', '-1', 'le', 'v', 'e', 'l', 'C', 'T','L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u','e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a','**', 'Q', 'u', 'e', 'r', 'H', 'e', 'L', 'a', 'A', 'C', 'C', '(A','.U', ')', 'C', 'T', 'L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5','1.0', '0.5', 'Q', 'u', 'e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T','L', 'H', 'e', 'L', 'a', '2.0', '***', 'Q', 'u', 'e', 'r', 'H', 'e','L', 'a', 'F', 'A', 'S', '(A', '.U', ')', 'C', 'T', 'L', 'H', 'a','C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u', 'e', 'r', 'H','a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a', '***', 'Q','u', 'e', 'r', 'H', 'e', 'L', 'a', '2.0', '***', '*** ***', '*','******* *******', 'HaCaT HeLa', 'CTL Quer CTL Quer', 'A', 'Fig. 4.Quercetin regulates SREBP­1 and its target proteins']
In this list, I would like to remove all groups of adjacent elements (length of group > N) for which no element has length > M.
A pseudo code would be:
for item in list:
if len(item) <= M:
buffer.append(item_index)
active = True
if len(item) > M and active == True:
active = False
if len(buffer) > N:
list.replace_at_index(buffer_by_index,'')
buffer.clear()
Thanks for helping
Here is how you can use the built-in enumerate method to iterate through the elements in a list alongside each element's index:
lst = ['performed three times. Data represent the mean±SEM of threeindependent experiments. *P<0.05, **P<0.005, ***P<0.001.', 'B','O-GlcNAc', 'AMPK', 'pAMPK', 'SREBP-1', 'ACC', 'FAS', 'actin', 'C','T', 'L', 'D', 'O', 'N', 'T', 'M', 'G', 'C', 'T', 'L', 'D', 'O', 'N','T', 'M', 'G', 'HaCaT HeLa', 'O', '-G', 'lN', 'A', 'c', 'le', 'v','e', 'l', '(A', '.U', ')', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L','0.0', '2.5', '1.5', '1.0', '0.5', 'H', 'a', 'C', 'a', 'T', 'D', 'O','N', 'H', 'a', 'C', 'a', 'T', 'T', 'M', 'G', '2.0', '**', '***', 'S','R', 'E', 'B', 'P', '-1', 'le', 'v', 'e', 'l', 'H', 'a', 'C', 'a','T', 'C', 'T', 'L', '0.0', '1.5', '1.0', '0.5', 'H', 'a', 'C', 'a','T', 'D', 'O', 'N', 'H', 'a', 'C', 'a', 'T', 'T', 'M', 'G', '2.0','F', 'A', 'S', '(A', '.U', ')', 'H', 'a', 'C', 'a', 'T', 'C', 'T','L', '0.0', '1.5', '1.0', '0.5', 'H', 'a', 'C', 'a', 'T', 'D', 'O','N', 'H', 'a', 'C', 'a', 'T', 'T', 'M', 'G', '2.0', 'A', 'C', 'C','(A', '.U', ')', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', '0.0', '2.5','1.5', '1.0', '0.5', 'H', 'a', 'C', 'a', 'T', 'D', 'O', 'N', 'H', 'a','C', 'a', 'T', 'T', 'M', 'G', '2.0', 'O', '-G', 'lc', 'N', 'A', 'c','le', 'v', 'e', 'l', '(A', '.U', ')', 'H', 'e', 'L', 'a', 'C', 'T','L', '0.0', '1.5', '1.0', '0.5', 'H', 'e', 'L', 'a', 'D', 'O', 'N','H', 'e', 'L', 'a', 'T', 'M', 'G', '2.0', '***', '***', 'S', 'R', 'E','B', 'P', '-1', 'le', 'v', 'e', 'l', 'H', 'e', 'L', 'a', 'C', 'T','L', '0.0', '1.5', '1.0', '0.5', 'H', 'e', 'L', 'a', 'D', 'O', 'N','H', 'e', 'L', 'a', 'T', 'M', 'G', '2.0', '***', 'F', 'A', 'S', '(A','.U', ')', 'H', 'e', 'L', 'a', 'C', 'T', 'L', '0.0', '1.5', '1.0','0.5', 'H', 'e', 'L', 'a', 'D', 'O', 'N', 'H', 'e', 'L', 'a', 'T','M', 'G', '2.0', '***', 'A', 'C', 'C', '(A', '.U', ')', 'H', 'e', 'L','a', 'C', 'T', 'L', '0.0', '1.5', '1.0', '0.5', 'H', 'e', 'L', 'a','D', 'O', 'N', 'H', 'e', 'L', 'a', 'T', 'M', 'G', '2.0', '***', '***','***', '***', '***', '***', '***', '*** ***', '***', 'O-GlcNAc','AMPK', 'pAMPK', 'SREBP-1', 'ACC', 'FAS', '�-actin', 'O', '-G', 'lN','A', 'c', 'le', 'v', 'e', 'l', '(A', '.U', ')', 'C', 'T', 'L', 'H','a', 'C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u', 'e', 'r','H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a', '2.0','Q', 'u', 'e', 'r', 'H', 'e', 'L', 'a', 'p', 'A', 'M', 'P', 'K', '(A','.U', ')', 'C', 'T', 'L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5','1.0', '0.5', 'Q', 'u', 'e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T','L', 'H', 'e', 'L', 'a', '2.5 ***', 'Q', 'u', 'e', 'r', 'H', 'e', 'L','a', 'S', 'R', 'E', 'B', 'P', '-1', 'le', 'v', 'e', 'l', 'C', 'T','L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u','e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a','**', 'Q', 'u', 'e', 'r', 'H', 'e', 'L', 'a', 'A', 'C', 'C', '(A','.U', ')', 'C', 'T', 'L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5','1.0', '0.5', 'Q', 'u', 'e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T','L', 'H', 'e', 'L', 'a', '2.0', '***', 'Q', 'u', 'e', 'r', 'H', 'e','L', 'a', 'F', 'A', 'S', '(A', '.U', ')', 'C', 'T', 'L', 'H', 'a','C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u', 'e', 'r', 'H','a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a', '***', 'Q','u', 'e', 'r', 'H', 'e', 'L', 'a', '2.0', '***', '*** ***', '*','******* *******', 'HaCaT HeLa', 'CTL Quer CTL Quer', 'A', 'Fig. 4.Quercetin regulates SREBP­1 and its target proteins']
N = 3
M = 5
buffer = []
for i, v in enumerate(lst):
if len(v) <= M:
buffer.append(i)
else:
if len(buffer) > N:
for i in buffer:
lst[i] = None
buffer.clear()
print(list(filter(None, lst)))
Output:
['performed three times. Data represent the mean±SEM of threeindependent experiments. *P<0.05, **P<0.005, ***P<0.001.', 'B', 'O-GlcNAc', 'AMPK', 'pAMPK', 'SREBP-1', 'HaCaT HeLa', '*** ***', '***', 'O-GlcNAc', 'AMPK', 'pAMPK', 'SREBP-1', 'ACC', 'FAS', '�-actin', '2.5 ***', '*** ***', '*', '******* *******', 'HaCaT HeLa', 'CTL Quer CTL Quer', 'A', 'Fig. 4.Quercetin regulates SREBP\xad1 and its target proteins']
It is unclear exactly what is desired. Below is some code of what I believe is being asked for. Hope this helps. If this is off... let me know.
Psuedo algorithm
Given a list of strings.
Identify the len of each string using class cntseq.
Identify sequence of strings having the same len
For each sequence, if the length of sequence
x = ['performed three times. Data represent the mean±SEM of three independent experiments. P<0.05, P<0.005, P<0.001.', 'B', 'O-GlcNAc', 'AMPK', 'pAMPK', 'SREBP-1', 'ACC', 'FAS', 'actin', 'C', 'T', 'L', 'D', 'O', 'N', 'T', 'M', 'G', 'C', 'T', 'L', 'D', 'O', 'N', 'T', 'M', 'G', 'HaCaT HeLa', 'O', '-G', 'lN', 'A', 'c', 'le', 'v', 'e', 'l', '(A', '.U', ')', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', '0.0', '2.5', '1.5', '1.0', '0.5', 'H', 'a', 'C', 'a', 'T', 'D', 'O', 'N', 'H', 'a', 'C', 'a', 'T', 'T', 'M', 'G', '2.0', '', '', 'S', 'R', 'E', 'B', 'P', '-1', 'le', 'v', 'e', 'l', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', '0.0', '1.5', '1.0', '0.5', 'H', 'a', 'C', 'a', 'T', 'D', 'O', 'N', 'H', 'a', 'C', 'a', 'T', 'T', 'M', 'G', '2.0', 'F', 'A', 'S', '(A', '.U', ')', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', '0.0', '1.5', '1.0', '0.5', 'H', 'a', 'C', 'a', 'T', 'D', 'O', 'N', 'H', 'a', 'C', 'a', 'T', 'T', 'M', 'G', '2.0', 'A', 'C', 'C', '(A', '.U', ')', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', '0.0', '2.5', '1.5', '1.0', '0.5', 'H', 'a', 'C', 'a', 'T', 'D', 'O', 'N', 'H', 'a', 'C', 'a', 'T', 'T', 'M', 'G', '2.0', 'O', '-G', 'lc', 'N', 'A', 'c', 'le', 'v', 'e', 'l', '(A', '.U', ')', 'H', 'e', 'L', 'a', 'C', 'T', 'L', '0.0', '1.5', '1.0', '0.5', 'H', 'e', 'L', 'a', 'D', 'O', 'N', 'H', 'e', 'L', 'a', 'T', 'M', 'G', '2.0', '', '', 'S', 'R', 'E', 'B', 'P', '-1', 'le', 'v', 'e', 'l', 'H', 'e', 'L', 'a', 'C', 'T', 'L', '0.0', '1.5', '1.0', '0.5', 'H', 'e', 'L', 'a', 'D', 'O', 'N', 'H', 'e', 'L', 'a', 'T', 'M', 'G', '2.0', '', 'F', 'A', 'S', '(A', '.U', ')', 'H', 'e', 'L', 'a', 'C', 'T', 'L', '0.0', '1.5', '1.0', '0.5', 'H', 'e', 'L', 'a', 'D', 'O', 'N', 'H', 'e', 'L', 'a', 'T', 'M', 'G', '2.0', '', 'A', 'C', 'C', '(A', '.U', ')', 'H', 'e', 'L', 'a', 'C', 'T', 'L', '0.0', '1.5', '1.0', '0.5', 'H', 'e', 'L', 'a', 'D', 'O', 'N', 'H', 'e', 'L', 'a', 'T', 'M', 'G', '2.0', '', '', '', '', '', '', '', '* ', '', 'O-GlcNAc', 'AMPK', 'pAMPK', 'SREBP-1', 'ACC', 'FAS', '�-actin', 'O', '-G', 'lN', 'A', 'c', 'le', 'v', 'e', 'l', '(A', '.U', ')', 'C', 'T', 'L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u', 'e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a', '2.0', 'Q', 'u', 'e', 'r', 'H', 'e', 'L', 'a', 'p', 'A', 'M', 'P', 'K', '(A', '.U', ')', 'C', 'T', 'L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u', 'e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a', '2.5 ', 'Q', 'u', 'e', 'r', 'H', 'e', 'L', 'a', 'S', 'R', 'E', 'B', 'P', '-1', 'le', 'v', 'e', 'l', 'C', 'T', 'L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u', 'e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a', '', 'Q', 'u', 'e', 'r', 'H', 'e', 'L', 'a', 'A', 'C', 'C', '(A', '.U', ')', 'C', 'T', 'L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u', 'e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a', '2.0', '', 'Q', 'u', 'e', 'r', 'H', 'e', 'L', 'a', 'F', 'A', 'S', '(A', '.U', ')', 'C', 'T', 'L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u', 'e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a', '', 'Q', 'u', 'e', 'r', 'H', 'e', 'L', 'a', '2.0', '', '* ', '', '***** *******', 'HaCaT HeLa', 'CTL Quer CTL Quer', 'A', 'Fig. 4. Quercetin regulates SREBP\xad1 and its target proteins']
df = pd.DataFrame(x, columns=['v'])
df['len'] = df.v.apply(len)
N = 2
M = 2
class cntseq(object):
'''
Define class to track sequences across the Column
'''
def __init__(self, **kwargs):
self.prevLen = -1
self.cnt = 0
self.start = None
def countN(self, r):
if r.len == self.prevLen:
# if adjcent sequence found, mark it's start with "len.start"
self.cnt += 1
if self.start == None :
self.start = int(r.name)-1
return '%d.%d'%(r.len, self.start)
else:
# non-adjcent sequence found, mark None.None"
self.prevLen = r.len
self.cnt = 0
self.start = None
return 'None.None'
# Identify sequences of adjcent lengths.
cs = cntseq()
df['seq'] = df.apply(lambda r : cs.countN(r),axis=1)
print("\nOriginal DF info")
print(df.describe())
print("\nOriginal DF ")
print(df.head())
# Compute lookup of duplicate information
df2 = pd.DataFrame(df.groupby('seq').seq.count())
df2.columns=['M']
df2 = df2.reset_index()
n = df2[df2.seq == 'None.None'].index[0]
df2 = df2.drop(78, axis=0)
print("\nLookup DF, seq has count and index for start of 'adjcent elements'")
print(df2.head())
# Compute final DF without duplicates
df3 = df[df.seq.isin(list(df2[df2.M > M].seq))].head()
print("\nFinal DF without duplicated")
print(df3)
print("\nOriginal DF info")
print(df3.describe())
output:
Original DF info
len
count 578.000000
mean 1.730104
std 5.284275
min 0.000000
25% 1.000000
50% 1.000000
75% 1.000000
max 110.000000
Original DF
v len seq
0 performed three times. Data represent the mean... 110 None.None
1 B 1 None.None
2 O-GlcNAc 8 None.None
3 AMPK 4 None.None
4 pAMPK 5 None.None
Lookup DF, seq has count and index for start of 'adjcent elements'
seq M
0 0.221 1
1 0.325 6
2 0.70 1
3 1.111 2
4 1.116 8
Final DF without duplicated
v len seq
10 T 1 1.9
11 L 1 1.9
12 D 1 1.9
13 O 1 1.9
14 N 1 1.9
Original DF info
len
count 5.0
mean 1.0
std 0.0
min 1.0
25% 1.0
50% 1.0
75% 1.0
max 1.0

Trying to append modified lists within a for loop

So I'm trying to add a new list to a 2d array inside a for loop, by taking the last letter of the list and putting it to the start, and adding this new list to the 2d array.
But although the new lists work, when I append to the 2d array, it just appends the normal alphabet, not my new one.
Can anybody see what I'm doing wrong?
def tabulaRecta():
import string
alpha = list(string.ascii_uppercase)
l2d = []
l2d.append(alpha)
for i in range(26):
temp = alpha[len(alpha)-1]
alpha.pop()
alpha.insert(0,temp)
l2d.append(alpha)
print(l2d)
tabulaRecta()
The program is supposed to output something like this, but all of these within a list:
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
['Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y']
['Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X']
['X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W']
['W', 'X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V']
['V', 'W', 'X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U']
['U', 'V', 'W', 'X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T']
['T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S']
['S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R']
['R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q']
['Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P']
['P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O']
['O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N']
['N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M']
['M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L']
['L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K']
['K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J']
['J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I']
['I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H']
['H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G']
['G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F']
['F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E']
['E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'A', 'B', 'C', 'D']
['D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'A', 'B', 'C']
['C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'A', 'B']
['B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'A']
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
instead it prints this 26 times in a list:
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
Instead of saying print(...) you can append it to a list and print it when the loop is finished
s = list(string.ascii_uppercase)
>>> for i in range(len(s),-1,-1):
... print(s[i:]+s[:i])
['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z']
['Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y']
['Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X']
['X', 'Y', 'Z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W']
.
.
.
.
A 2d array is an array of pointers to arrays, and in your code each one points to the same list, that you are modifying 26 times eventually coming back to the original one.
You have to make a copy of the list each time you modify it.
For example, add a alpha = list(alpha) at the beginning of the for loop:
def tabulaRecta():
import string
alpha = list(string.ascii_uppercase)
l2d = []
l2d.append(alpha)
for i in range(26):
alpha = list(alpha)
temp = alpha[len(alpha) - 1]
alpha.pop()
alpha.insert(0, temp)
l2d.append(alpha)
print(l2d)
tabulaRecta()
You could also just do this:
import string
def tabula_rect():
alpha = list(string.ascii_uppercase)
l2d = []
l2d.append(alpha)
for i in range(len(alpha)):
temp = alpha[len(alpha)-1]
alpha = alpha[:-1]
alpha.insert(0, temp)
l2d.append(alpha)
return l2d

DNA To Protein Sequence

I am trying to create a program which translates a DNA sequence entered by the user to 3 alternative protein sequence using the following dictionary (codons are keys, amino acids are values):
{'TGA': '*', 'GCG': 'A', 'CGA': 'R', 'ATA': 'I', 'AGA': 'R', 'TAA': '*', 'TTT': 'F', 'GAG': 'E', 'CTT': 'L', 'CGT': 'R', 'CTC': 'L', 'CTG': 'L', 'TGT': 'C', 'CCA': 'P', 'AAT': 'N', 'GTC': 'V', 'GAC': 'D', 'GAT': 'D', 'TAT': 'Y', 'AAA': 'K', 'GTA': 'V', 'TAG': '*', 'CGC': 'R', 'GCA': 'A', 'TCG': 'S', 'GCT': 'A', 'GCC': 'A', 'TGG': 'W', 'TTC': 'F', 'CCC': 'P', 'TTG': 'L', 'CGG': 'R', 'GGC': 'G', 'AGG': 'R', 'TCC': 'S', 'CCT': 'P', 'GGT': 'G', 'GGG': 'G', 'TCA': 'S', 'AGC': 'S', 'CAG': 'Q', 'CAC': 'H', 'ATC': 'I', 'GAA': 'E', 'GTG': 'V', 'CCG': 'P', 'CAT': 'H', 'AAG': 'K', 'ATG': 'M', 'AAC': 'N', 'TAC': 'Y', 'TGC': 'C', 'CTA': 'L', 'TCT': 'S', 'ATT': 'I', 'ACG': 'T', 'AGT': 'S', 'GTT': 'V', 'TTA': 'L', 'CAA': 'Q', 'GGA': 'G', 'ACC': 'T', 'ACA': 'T', 'ACT': 'T'}
I want to just use mapping and not biopython.
So, when the program is run it should look like this:
Please enter a DNA sequence: GCTgttaagactatgaaaagaataagcaacaccatcaat
Frame 1 is AVKTMKRISNTIN
Frame 2 is LLRL*KE*ATPS
Frame 3 is C*DYEKNKQHHQ
I have created this dictionary from a file, but I'm unsure how to get started from here. Any help would be greatly appreciated. Thanks!
Declare the dictionary:
prot_match = {'TGA': '*', 'GCG': 'A', 'CGA': 'R', 'ATA': 'I', 'AGA': 'R', 'TAA': '*', 'TTT': 'F', 'GAG': 'E', 'CTT': 'L', 'CGT': 'R', 'CTC': 'L', 'CTG': 'L', 'TGT': 'C', 'CCA': 'P', 'AAT': 'N', 'GTC': 'V', 'GAC': 'D', 'GAT': 'D', 'TAT': 'Y', 'AAA': 'K', 'GTA': 'V', 'TAG': '*', 'CGC': 'R', 'GCA': 'A', 'TCG': 'S', 'GCT': 'A', 'GCC': 'A', 'TGG': 'W', 'TTC': 'F', 'CCC': 'P', 'TTG': 'L', 'CGG': 'R', 'GGC': 'G', 'AGG': 'R', 'TCC': 'S', 'CCT': 'P', 'GGT': 'G', 'GGG': 'G', 'TCA': 'S', 'AGC': 'S', 'CAG': 'Q', 'CAC': 'H', 'ATC': 'I', 'GAA': 'E', 'GTG': 'V', 'CCG': 'P', 'CAT': 'H', 'AAG': 'K', 'ATG': 'M', 'AAC': 'N', 'TAC': 'Y', 'TGC': 'C', 'CTA': 'L', 'TCT': 'S', 'ATT': 'I', 'ACG': 'T', 'AGT': 'S', 'GTT': 'V', 'TTA': 'L', 'CAA': 'Q', 'GGA': 'G', 'ACC': 'T', 'ACA': 'T', 'ACT': 'T'}
Slice the string to chunks of 3 characters:
def chunks (arr, size = 1, frame = 1):
return [arr[i: i+size] for i in range(frame - 1, len(arr), size)]
dna = input('Enter DNA: ').upper()
Then, map for the prot_match value of the current sequence (use get for safety in case you dont have that sequence for sure):
frames = 3
for frame in range(frames):
chunked_dna = chunks(dna, 3, frame + 1)
prot_seq = map(lambda seq: prot_match.get(seq, '0'), chunked_dna )
Finally, use join to get the protein sequence as a string from the map:
prot_seq = ''.join(prot_seq)
print('Protein sequence number', frame + 1, ':', prot_seq)
Example (these snippets together in a file):
Enter DNA: GCTgttaagactatgaaaagaataagcaacaccatcaat
Protein sequence number 1 : AVKTMKRISNTIN
Protein sequence number 2 : LLRL*KE*ATPS0
Protein sequence number 3 : C*DYEKNKQHHQ0

Why do I get the error TypeError: unhashable type: 'list'?

RNA_codon_dictonary = {
'UUU': 'F', 'CUU': 'L', 'AUU': 'I', 'GUU': 'V',
'UUC': 'F', 'CUC': 'L', 'AUC': 'I', 'GUC': 'V',
'UUA': 'L', 'CUA': 'L', 'AUA': 'I', 'GUA': 'V',
'UUG': 'L', 'CUG': 'L', 'AUG': 'M', 'GUG': 'V',
'UCU': 'S', 'CCU': 'P', 'ACU': 'T', 'GCU': 'A',
'UCC': 'S', 'CCC': 'P', 'ACC': 'T', 'GCC': 'A',
'UCA': 'S', 'CCA': 'P', 'ACA': 'T', 'GCA': 'A',
'UCG': 'S', 'CCG': 'P', 'ACG': 'T', 'GCG': 'A',
'UAU': 'Y', 'CAU': 'H', 'AAU': 'N', 'GAU': 'D',
'UAC': 'Y', 'CAC': 'H', 'AAC': 'N', 'GAC': 'D',
'UAA': 'Stop', 'CAA': 'Q', 'AAA': 'K', 'GAA': 'E',
'UAG': 'Stop', 'CAG': 'Q', 'AAG': 'K', 'GAG': 'E',
'UGU': 'C', 'CGU': 'R', 'AGU': 'S', 'GGU': 'G',
'UGC': 'C', 'CGC': 'R', 'AGC': 'S', 'GGC': 'G',
'UGA': 'Stop', 'CGA': 'R', 'AGA': 'R', 'GGA': 'G',
'UGG': 'W', 'CGG': 'R', 'AGG': 'R', 'GGG': 'G'
}
def RNA_to_Protien(mRNA_seq):
codon = []
if codon in RNA_codon_dictonary:
# return the aminoacid by looking up in the dictionary:
return RNA_codon_dictonary[codon]
else:
# return '' if we could not translate the codon:
return '?'
if __name__ == "__main__":
mRNA_seq = "UCAAUGUAACGCGCUACCCGGAGCUCUGGGCCCAAAUUUCAUCCACU"
print (RNA_to_Protien(mRNA_seq))
You are checking to see if the empty list is a key in your dictionary. There are two problems with that:
1) The answer will always be no, since your dictionary doesn't have any empty keys, and
2) That operation isn't even allowed, since lists are never allowed to be keys of a dict.
Based on your comment, the following code might be what you are looking for. It breaks the sequence in non-overlapping substrings of length 3, looks up each substring in the dict, and returns all of the results.
def RNA_to_Protien(mRNA_seq):
return [
RNA_codon_dictonary.get(mRNA_seq[i:i+3], '?')
for i in range(0, len(mRNA_seq), 3)
]
In your example sequence, this returns:
['S', 'M', 'Stop', 'R', 'A', 'T', 'R', 'S', 'S', 'G', 'P', 'K', 'F', 'H', 'P', '?']
Or, if you would rather lookup overlapping sequences, try this:
def RNA_to_Protien(mRNA_seq):
return [
RNA_codon_dictonary.get(mRNA_seq[i:i+3], '?')
for i in range(0, len(mRNA_seq)-2, 1)
]
It yields this result:
['S', 'Q', 'N', 'M', 'C', 'V', 'Stop', 'N', 'T', 'R', 'A', 'R', 'A', 'L', 'Y', 'T', 'P', 'P', 'R', 'G', 'E', 'S', 'A', 'L', 'S', 'L', 'W', 'G', 'G', 'A', 'P', 'P', 'Q', 'K', 'N', 'I', 'F', 'F', 'S', 'H', 'I', 'S', 'P', 'H', 'T']
And, based on your request to return a single string instead of a list of strings:
def RNA_to_Protien(mRNA_seq):
return ''.join(
RNA_codon_dictonary.get(mRNA_seq[i:i+3], '?')
for i in range(0, len(mRNA_seq), 3)
)
This yields:
'SMStopRATRSSGPKFHP?'

Need to reorder list and make a sentence in python

I need help rearranging this list alphabetically
list= ['z', 'a', 'b', 'y', 'c', 'x', 'd', 'w', 'e', 'v', 'f', 'g', 'u', 'h', 'i', 'j', 't' ,'k', 'l', 's', 'm', 'n', 'r', 'o', 'p', 'q', ' ']
into "hello world" by indexing into the array. How exactly do I do that? I'm a beginner and I'm doing this in python 2.7.
As has been mentioned, your list can be sorted alphabetically by using the sort() function as follows:
mylist = ['z', 'a', 'b', 'y', 'c', 'x', 'd', 'w', 'e', 'v', 'f', 'g', 'u', 'h', 'i', 'j', 't' ,'k', 'l', 's', 'm', 'n', 'r', 'o', 'p', 'q', ' ']
mylist.sort()
print mylist
Which results in your list looking like:
[' ', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
But you go on to say 'index into' for 'hello world'. If you mean you want to create a simple cypher then this could be easily be achieved as follows:
import string
s_from = 'abcdefghijklmnopqrstuvwxyz '
s_to = 'zabycxdwevfguhijtklsmnropq '
cypher_table = string.maketrans(s_from, s_to)
print "hello world".translate(cypher_table)
This would convert your text as follows:
wcggi rikgy
Please could you edit your question to give an example of what you are trying to achieve.
Since there is a empty element in the list. You can try to avoid them by using sorted method of list
>>> sorted(list_alphabet, key=list_alphabet.remove(' '))
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
sorted does create new copy of actual list and sort them. So you can refer the output of the sorted to new variable. Like
>>> sorted_list_alphabet = sorted(list_alphabet, key=list_alphabet.remove(' '))
>>> sorted_list_alphabet
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
PS: do not use list as a variable name because it brings conflict between actual list and your list variable

Categories

Resources