Is there a "base64.b85decode" equivalent in NodeJS? - python

Is there a "base64.b85decode" function in nodejs?
It uses the following character set -
0123456789
ABCDEFGHIJKLMNOPQRSTUVWXYZ
abcdefghijklmnopqrstuvwxyz
!#$%&()*+-;<=>?@^_`{|}~
It's not the regular Ascii85 encoding but a different base85 variant.
For example, for "Hello, world!!!!", it should return "NM&qnZ!92pZpv8At50l"

I solved it by using the ascii85 package with a customized character set:
var ascii85 = require('ascii85');
ascii85.decode(to_decode, ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X', 'Y', 'Z', 'a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z', '!', '#', '$', '%', '&', '(', ')', '*', '+', '-', ';', '<', '=', '>', '?', '@', '^', '_', '`', '{', '|', '}', '~']).toString('ascii');
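For cross-checking a port to another language, here is a minimal Python sketch of the decoding steps: pad the input to a multiple of 5 characters, turn each group of 5 into a 32-bit big-endian integer, then drop the padding bytes. The helper name b85decode_manual is hypothetical, and the sketch assumes valid input (unknown characters are not handled); it is compared against the standard library's base64.b85decode rather than presented as its implementation.

```python
import base64

# Alphabet used by Python's base64.b85encode / base64.b85decode.
B85_ALPHABET = (
    "0123456789"
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "!#$%&()*+-;<=>?@^_`{|}~"
)


def b85decode_manual(text):
    """Decode a base85 string (hypothetical helper, valid input assumed)."""
    index = {ch: i for i, ch in enumerate(B85_ALPHABET)}
    # Pad to a multiple of 5 with the highest-valued character.
    padding = (-len(text)) % 5
    text = text + "~" * padding
    out = bytearray()
    for start in range(0, len(text), 5):
        acc = 0
        for ch in text[start:start + 5]:
            acc = acc * 85 + index[ch]
        # Each group of 5 characters encodes 4 bytes, big-endian.
        out += acc.to_bytes(4, "big")
    if padding:
        del out[-padding:]
    return bytes(out)


encoded = base64.b85encode(b"Hello, world!!!!").decode("ascii")
print(b85decode_manual(encoded) == base64.b85decode(encoded))
```

The same three steps (pad, accumulate in base 85, trim) carry over directly to a JavaScript implementation if no package is wanted.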

Related

Joining individual elements of an array

I have an array consisting of labels, but each label has been broken down into individual characters. For example, these are the first two elements of the array:
array([['1', '.', ' ', 'I', 'd', 'e', 'n', 't', 'i', 'f', 'y', 'i', 'n',
'g', ',', ' ', 'A', 's', 's', 'e', 's', 's', 'i', 'n', 'g', ' ',
'a', 'n', 'd', ' ', 'I', 'm', 'p', 'r', 'o', 'v', 'i', 'n', 'g',
' ', 'C', 'a', 'r', 'e', '', ''],
['9', '.', ' ', 'N', 'o', 'n', '-', 'P', 'h', 'a', 'r', 'm', 'a',
'c', 'o', 'l', 'o', 'g', 'i', 'c', 'a', 'l', ' ', 'I', 'n', 't',
'e', 'r', 'v', 'e', 'n', 't', 'i', 'o', 'n', 's', '', '', '',
'', ''], ...
I would like it to be formatted as such:
array(['1. Identifying, Assessing and Improving Care',
'9. Non-Pharmacological Interventions', ...
I want to be able to iterate through the array and concatenate each label so the output is as shown above.
Any help in achieving this would be much appreciated :) Many thanks!
import numpy as np
k=np.array([['1', '.', ' ', 'I', 'd', 'e', 'n', 't', 'i', 'f', 'y', 'i', 'n',
'g', ',', ' ', 'A', 's', 's', 'e', 's', 's', 'i', 'n', 'g', ' ',
'a', 'n', 'd', ' ', 'I', 'm', 'p', 'r', 'o', 'v', 'i', 'n', 'g',
' ', 'C', 'a', 'r', 'e', '', ''],
['9', '.', ' ', 'N', 'o', 'n', '-', 'P', 'h', 'a', 'r', 'm', 'a',
'c', 'o', 'l', 'o', 'g', 'i', 'c', 'a', 'l', ' ', 'I', 'n', 't',
'e', 'r', 'v', 'e', 'n', 't', 'i', 'o', 'n', 's', '', '', '',
'', '']])
for x in k:
    print(''.join(x))
#output
1. Identifying, Assessing and Improving Care
9. Non-Pharmacological Interventions
Using List comprehension:
[''.join(x) for x in k]
#output
['1. Identifying, Assessing and Improving Care',
'9. Non-Pharmacological Interventions']
Considering the array as a list of lists, you could join all characters by looping through the list:
r = [['1', '.', ' ', 'I', 'd', 'e', 'n', 't', 'i', 'f', 'y', 'i', 'n',
'g', ',', ' ', 'A', 's', 's', 'e', 's', 's', 'i', 'n', 'g', ' ',
'a', 'n', 'd', ' ', 'I', 'm', 'p', 'r', 'o', 'v', 'i', 'n', 'g',
' ', 'C', 'a', 'r', 'e', '', ''],
['9', '.', ' ', 'N', 'o', 'n', '-', 'P', 'h', 'a', 'r', 'm', 'a',
'c', 'o', 'l', 'o', 'g', 'i', 'c', 'a', 'l', ' ', 'I', 'n', 't',
'e', 'r', 'v', 'e', 'n', 't', 'i', 'o', 'n', 's', '', '', '',
'', '']]
t = ["".join(i) for i in r]
print(t)
Output:
['1. Identifying, Assessing and Improving Care',
'9. Non-Pharmacological Interventions']
array = [['1', '.', ' ', 'I', 'd', 'e', 'n', 't', 'i', 'f', 'y', 'i', 'n',
'g', ',', ' ', 'A', 's', 's', 'e', 's', 's', 'i', 'n', 'g', ' ',
'a', 'n', 'd', ' ', 'I', 'm', 'p', 'r', 'o', 'v', 'i', 'n', 'g',
' ', 'C', 'a', 'r', 'e', '', ''],
['9', '.', ' ', 'N', 'o', 'n', '-', 'P', 'h', 'a', 'r', 'm', 'a',
'c', 'o', 'l', 'o', 'g', 'i', 'c', 'a', 'l', ' ', 'I', 'n', 't',
'e', 'r', 'v', 'e', 'n', 't', 'i', 'o', 'n', 's', '', '', '',
'', '']]
# array(['1. Identifying, Assessing and Improving Care',
# '9. Non-Pharmacological Interventions', ...
array = [''.join(i) for i in array]
print(array) #['1. Identifying, Assessing and Improving Care', '9. Non-Pharmacological Interventions']
Assuming from array([...]) that you are using numpy, here's a solution
import numpy as np
a = np.array([['1', '.', ' ', 'I', 'd', 'e', 'n', 't', 'i', 'f', 'y', 'i', 'n',
'g', ',', ' ', 'A', 's', 's', 'e', 's', 's', 'i', 'n', 'g', ' ',
'a', 'n', 'd', ' ', 'I', 'm', 'p', 'r', 'o', 'v', 'i', 'n', 'g',
' ', 'C', 'a', 'r', 'e', '', ''],
['9', '.', ' ', 'N', 'o', 'n', '-', 'P', 'h', 'a', 'r', 'm', 'a',
'c', 'o', 'l', 'o', 'g', 'i', 'c', 'a', 'l', ' ', 'I', 'n', 't',
'e', 'r', 'v', 'e', 'n', 't', 'i', 'o', 'n', 's', '', '', '',
'', '']])
b = np.empty(a.shape[0], dtype=object)
for i, x in enumerate(a): b[i] = ''.join(x)
If you loop over each row of your array, you can then use the string method str.join to get what you are looking for.
Something like:
arr = [['1', '.', ' ', 'I', ...], ...]
output = list()
for x in arr:
    output.append(''.join(x))
output
>>>
['1. Identifying, Assessing and Improving Care', ...]

How do I remove all the spaces in the end result? [duplicate]

How do I remove all the spaces in the end result?
letters = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k',
'l', 'm', 'n', 'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x',
'y', 'z', 'A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K',
'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T', 'U', 'V', 'W', 'X',
'Y', 'Z']
numbers = ['0', '1', '2', '3', '4', '5', '6', '7', '8', '9']
symbols = ['!', '#', '$', '%', '&', '(', ')', '*', '+']
x_letters = int(input("How many letters would you like in your password? "))
x_symbols = int(input(f"How many symbols would you like? "))
x_numbers = int(input(f"How many numbers would you like? "))
import random
password_list = []
for char in range(1, x_letters + 1):
    password_list.append(random.choice(letters))
for char in range(1, x_symbols + 1):
    password_list.append(random.choice(numbers))
for char in range(1, x_numbers + 1):
    password_list.append(random.choice(symbols))
random.shuffle(password_list)
print(" ".join(password_list))
When I input everything it types everything like this:
r 3 a $ w & % 6 1
Last line should be
print("".join(password_list))
Use an empty separator string in the join call,
that is, print("".join(password_list))
Whatever is inside that string is inserted between the elements.
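Putting the fix together, a possible sketch of the generator follows. It assumes the symbol loop in the question was meant to draw from symbols and the number loop from numbers (the question's code has them swapped). The function name make_password and its rng parameter are hypothetical, and string.ascii_letters and string.digits stand in for the hand-written lists.

```python
import random
import string

SYMBOLS = "!#$%&()*+"  # same symbols as in the question


def make_password(n_letters, n_symbols, n_numbers, rng=random):
    """Build a shuffled password with the requested character counts."""
    chars = (
        [rng.choice(string.ascii_letters) for _ in range(n_letters)]
        + [rng.choice(SYMBOLS) for _ in range(n_symbols)]
        + [rng.choice(string.digits) for _ in range(n_numbers)]
    )
    rng.shuffle(chars)
    # An empty separator joins the characters with nothing in between.
    return "".join(chars)


print(make_password(4, 2, 3))
```

For passwords that protect anything real, the standard library's secrets module is generally a better source of randomness than random.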

Remove adjacent items group in list based on length of elements

I have a list of items from PDF text extraction in this way:
['performed three times. Data represent the mean±SEM of threeindependent experiments. *P<0.05, **P<0.005, ***P<0.001.', 'B','O-GlcNAc', 'AMPK', 'pAMPK', 'SREBP-1', 'ACC', 'FAS', 'actin', 'C','T', 'L', 'D', 'O', 'N', 'T', 'M', 'G', 'C', 'T', 'L', 'D', 'O', 'N','T', 'M', 'G', 'HaCaT HeLa', 'O', '-G', 'lN', 'A', 'c', 'le', 'v','e', 'l', '(A', '.U', ')', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L','0.0', '2.5', '1.5', '1.0', '0.5', 'H', 'a', 'C', 'a', 'T', 'D', 'O','N', 'H', 'a', 'C', 'a', 'T', 'T', 'M', 'G', '2.0', '**', '***', 'S','R', 'E', 'B', 'P', '-1', 'le', 'v', 'e', 'l', 'H', 'a', 'C', 'a','T', 'C', 'T', 'L', '0.0', '1.5', '1.0', '0.5', 'H', 'a', 'C', 'a','T', 'D', 'O', 'N', 'H', 'a', 'C', 'a', 'T', 'T', 'M', 'G', '2.0','F', 'A', 'S', '(A', '.U', ')', 'H', 'a', 'C', 'a', 'T', 'C', 'T','L', '0.0', '1.5', '1.0', '0.5', 'H', 'a', 'C', 'a', 'T', 'D', 'O','N', 'H', 'a', 'C', 'a', 'T', 'T', 'M', 'G', '2.0', 'A', 'C', 'C','(A', '.U', ')', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', '0.0', '2.5','1.5', '1.0', '0.5', 'H', 'a', 'C', 'a', 'T', 'D', 'O', 'N', 'H', 'a','C', 'a', 'T', 'T', 'M', 'G', '2.0', 'O', '-G', 'lc', 'N', 'A', 'c','le', 'v', 'e', 'l', '(A', '.U', ')', 'H', 'e', 'L', 'a', 'C', 'T','L', '0.0', '1.5', '1.0', '0.5', 'H', 'e', 'L', 'a', 'D', 'O', 'N','H', 'e', 'L', 'a', 'T', 'M', 'G', '2.0', '***', '***', 'S', 'R', 'E','B', 'P', '-1', 'le', 'v', 'e', 'l', 'H', 'e', 'L', 'a', 'C', 'T','L', '0.0', '1.5', '1.0', '0.5', 'H', 'e', 'L', 'a', 'D', 'O', 'N','H', 'e', 'L', 'a', 'T', 'M', 'G', '2.0', '***', 'F', 'A', 'S', '(A','.U', ')', 'H', 'e', 'L', 'a', 'C', 'T', 'L', '0.0', '1.5', '1.0','0.5', 'H', 'e', 'L', 'a', 'D', 'O', 'N', 'H', 'e', 'L', 'a', 'T','M', 'G', '2.0', '***', 'A', 'C', 'C', '(A', '.U', ')', 'H', 'e', 'L','a', 'C', 'T', 'L', '0.0', '1.5', '1.0', '0.5', 'H', 'e', 'L', 'a','D', 'O', 'N', 'H', 'e', 'L', 'a', 'T', 'M', 'G', '2.0', '***', '***','***', '***', '***', '***', '***', '*** ***', '***', 'O-GlcNAc','AMPK', 'pAMPK', 'SREBP-1', 'ACC', 'FAS', '�-actin', 
'O', '-G', 'lN','A', 'c', 'le', 'v', 'e', 'l', '(A', '.U', ')', 'C', 'T', 'L', 'H','a', 'C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u', 'e', 'r','H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a', '2.0','Q', 'u', 'e', 'r', 'H', 'e', 'L', 'a', 'p', 'A', 'M', 'P', 'K', '(A','.U', ')', 'C', 'T', 'L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5','1.0', '0.5', 'Q', 'u', 'e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T','L', 'H', 'e', 'L', 'a', '2.5 ***', 'Q', 'u', 'e', 'r', 'H', 'e', 'L','a', 'S', 'R', 'E', 'B', 'P', '-1', 'le', 'v', 'e', 'l', 'C', 'T','L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u','e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a','**', 'Q', 'u', 'e', 'r', 'H', 'e', 'L', 'a', 'A', 'C', 'C', '(A','.U', ')', 'C', 'T', 'L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5','1.0', '0.5', 'Q', 'u', 'e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T','L', 'H', 'e', 'L', 'a', '2.0', '***', 'Q', 'u', 'e', 'r', 'H', 'e','L', 'a', 'F', 'A', 'S', '(A', '.U', ')', 'C', 'T', 'L', 'H', 'a','C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u', 'e', 'r', 'H','a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a', '***', 'Q','u', 'e', 'r', 'H', 'e', 'L', 'a', '2.0', '***', '*** ***', '*','******* *******', 'HaCaT HeLa', 'CTL Quer CTL Quer', 'A', 'Fig. 4.Quercetin regulates SREBP­1 and its target proteins']
In this list, I would like to remove all groups of adjacent elements (length of group > N) for which no element has length > M.
Pseudocode would be:
for item in list:
    if len(item) <= M:
        buffer.append(item_index)
        active = True
    if len(item) > M and active == True:
        active = False
        if len(buffer) > N:
            list.replace_at_index(buffer_by_index,'')
        buffer.clear()
Thanks for helping
Here is how you can use the built-in enumerate function to iterate through the elements of a list together with each element's index:
lst = ['performed three times. Data represent the mean±SEM of threeindependent experiments. *P<0.05, **P<0.005, ***P<0.001.', 'B','O-GlcNAc', 'AMPK', 'pAMPK', 'SREBP-1', 'ACC', 'FAS', 'actin', 'C','T', 'L', 'D', 'O', 'N', 'T', 'M', 'G', 'C', 'T', 'L', 'D', 'O', 'N','T', 'M', 'G', 'HaCaT HeLa', 'O', '-G', 'lN', 'A', 'c', 'le', 'v','e', 'l', '(A', '.U', ')', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L','0.0', '2.5', '1.5', '1.0', '0.5', 'H', 'a', 'C', 'a', 'T', 'D', 'O','N', 'H', 'a', 'C', 'a', 'T', 'T', 'M', 'G', '2.0', '**', '***', 'S','R', 'E', 'B', 'P', '-1', 'le', 'v', 'e', 'l', 'H', 'a', 'C', 'a','T', 'C', 'T', 'L', '0.0', '1.5', '1.0', '0.5', 'H', 'a', 'C', 'a','T', 'D', 'O', 'N', 'H', 'a', 'C', 'a', 'T', 'T', 'M', 'G', '2.0','F', 'A', 'S', '(A', '.U', ')', 'H', 'a', 'C', 'a', 'T', 'C', 'T','L', '0.0', '1.5', '1.0', '0.5', 'H', 'a', 'C', 'a', 'T', 'D', 'O','N', 'H', 'a', 'C', 'a', 'T', 'T', 'M', 'G', '2.0', 'A', 'C', 'C','(A', '.U', ')', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', '0.0', '2.5','1.5', '1.0', '0.5', 'H', 'a', 'C', 'a', 'T', 'D', 'O', 'N', 'H', 'a','C', 'a', 'T', 'T', 'M', 'G', '2.0', 'O', '-G', 'lc', 'N', 'A', 'c','le', 'v', 'e', 'l', '(A', '.U', ')', 'H', 'e', 'L', 'a', 'C', 'T','L', '0.0', '1.5', '1.0', '0.5', 'H', 'e', 'L', 'a', 'D', 'O', 'N','H', 'e', 'L', 'a', 'T', 'M', 'G', '2.0', '***', '***', 'S', 'R', 'E','B', 'P', '-1', 'le', 'v', 'e', 'l', 'H', 'e', 'L', 'a', 'C', 'T','L', '0.0', '1.5', '1.0', '0.5', 'H', 'e', 'L', 'a', 'D', 'O', 'N','H', 'e', 'L', 'a', 'T', 'M', 'G', '2.0', '***', 'F', 'A', 'S', '(A','.U', ')', 'H', 'e', 'L', 'a', 'C', 'T', 'L', '0.0', '1.5', '1.0','0.5', 'H', 'e', 'L', 'a', 'D', 'O', 'N', 'H', 'e', 'L', 'a', 'T','M', 'G', '2.0', '***', 'A', 'C', 'C', '(A', '.U', ')', 'H', 'e', 'L','a', 'C', 'T', 'L', '0.0', '1.5', '1.0', '0.5', 'H', 'e', 'L', 'a','D', 'O', 'N', 'H', 'e', 'L', 'a', 'T', 'M', 'G', '2.0', '***', '***','***', '***', '***', '***', '***', '*** ***', '***', 'O-GlcNAc','AMPK', 'pAMPK', 'SREBP-1', 'ACC', 'FAS', 
'�-actin', 'O', '-G', 'lN','A', 'c', 'le', 'v', 'e', 'l', '(A', '.U', ')', 'C', 'T', 'L', 'H','a', 'C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u', 'e', 'r','H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a', '2.0','Q', 'u', 'e', 'r', 'H', 'e', 'L', 'a', 'p', 'A', 'M', 'P', 'K', '(A','.U', ')', 'C', 'T', 'L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5','1.0', '0.5', 'Q', 'u', 'e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T','L', 'H', 'e', 'L', 'a', '2.5 ***', 'Q', 'u', 'e', 'r', 'H', 'e', 'L','a', 'S', 'R', 'E', 'B', 'P', '-1', 'le', 'v', 'e', 'l', 'C', 'T','L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u','e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a','**', 'Q', 'u', 'e', 'r', 'H', 'e', 'L', 'a', 'A', 'C', 'C', '(A','.U', ')', 'C', 'T', 'L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5','1.0', '0.5', 'Q', 'u', 'e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T','L', 'H', 'e', 'L', 'a', '2.0', '***', 'Q', 'u', 'e', 'r', 'H', 'e','L', 'a', 'F', 'A', 'S', '(A', '.U', ')', 'C', 'T', 'L', 'H', 'a','C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u', 'e', 'r', 'H','a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a', '***', 'Q','u', 'e', 'r', 'H', 'e', 'L', 'a', '2.0', '***', '*** ***', '*','******* *******', 'HaCaT HeLa', 'CTL Quer CTL Quer', 'A', 'Fig. 4.Quercetin regulates SREBP­1 and its target proteins']
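N = 3
M = 5
buffer = []
for i, v in enumerate(lst):
    if len(v) <= M:
        # short item: remember its index
        buffer.append(i)
    else:
        # long item ends the current run; blank the run if it is long enough
        if len(buffer) > N:
            for j in buffer:
                lst[j] = None
        buffer.clear()
print(list(filter(None, lst)))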
Output:
['performed three times. Data represent the mean±SEM of threeindependent experiments. *P<0.05, **P<0.005, ***P<0.001.', 'B', 'O-GlcNAc', 'AMPK', 'pAMPK', 'SREBP-1', 'HaCaT HeLa', '*** ***', '***', 'O-GlcNAc', 'AMPK', 'pAMPK', 'SREBP-1', 'ACC', 'FAS', '�-actin', '2.5 ***', '*** ***', '*', '******* *******', 'HaCaT HeLa', 'CTL Quer CTL Quer', 'A', 'Fig. 4.Quercetin regulates SREBP\xad1 and its target proteins']
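An alternative sketch uses itertools.groupby to split the list into runs of short and long items. Unlike the loop above, which only checks its buffer when a longer item follows, this version also handles a run of short items at the very end of the list, and it does not rely on filter(None, ...), which would also drop empty strings. The function name drop_short_runs and the sample list are made up for illustration.

```python
from itertools import groupby


def drop_short_runs(items, n, m):
    """Drop each run of more than n adjacent items whose lengths are all <= m."""
    kept = []
    # groupby yields consecutive runs sharing the same key (short or not short).
    for is_short, run in groupby(items, key=lambda s: len(s) <= m):
        run = list(run)
        if is_short and len(run) > n:
            continue  # a long enough run of short items: discard it
        kept.extend(run)
    return kept


sample = ['a long caption', 'B', 'AMPK', 'SREBP-1',
          'C', 'T', 'L', 'D', 'HaCaT HeLa', 'x', 'y', 'z', 'w']
print(drop_short_runs(sample, n=3, m=5))
# ['a long caption', 'B', 'AMPK', 'SREBP-1', 'HaCaT HeLa']
```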
It is unclear exactly what is desired. Below is some code of what I believe is being asked for. Hope this helps. If this is off... let me know.
Pseudo algorithm
Given a list of strings.
Compute the length of each string.
Identify sequences of adjacent strings having the same length, using class cntseq.
For each sequence, if the length of sequence
x = ['performed three times. Data represent the mean±SEM of three independent experiments. P<0.05, P<0.005, P<0.001.', 'B', 'O-GlcNAc', 'AMPK', 'pAMPK', 'SREBP-1', 'ACC', 'FAS', 'actin', 'C', 'T', 'L', 'D', 'O', 'N', 'T', 'M', 'G', 'C', 'T', 'L', 'D', 'O', 'N', 'T', 'M', 'G', 'HaCaT HeLa', 'O', '-G', 'lN', 'A', 'c', 'le', 'v', 'e', 'l', '(A', '.U', ')', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', '0.0', '2.5', '1.5', '1.0', '0.5', 'H', 'a', 'C', 'a', 'T', 'D', 'O', 'N', 'H', 'a', 'C', 'a', 'T', 'T', 'M', 'G', '2.0', '', '', 'S', 'R', 'E', 'B', 'P', '-1', 'le', 'v', 'e', 'l', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', '0.0', '1.5', '1.0', '0.5', 'H', 'a', 'C', 'a', 'T', 'D', 'O', 'N', 'H', 'a', 'C', 'a', 'T', 'T', 'M', 'G', '2.0', 'F', 'A', 'S', '(A', '.U', ')', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', '0.0', '1.5', '1.0', '0.5', 'H', 'a', 'C', 'a', 'T', 'D', 'O', 'N', 'H', 'a', 'C', 'a', 'T', 'T', 'M', 'G', '2.0', 'A', 'C', 'C', '(A', '.U', ')', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', '0.0', '2.5', '1.5', '1.0', '0.5', 'H', 'a', 'C', 'a', 'T', 'D', 'O', 'N', 'H', 'a', 'C', 'a', 'T', 'T', 'M', 'G', '2.0', 'O', '-G', 'lc', 'N', 'A', 'c', 'le', 'v', 'e', 'l', '(A', '.U', ')', 'H', 'e', 'L', 'a', 'C', 'T', 'L', '0.0', '1.5', '1.0', '0.5', 'H', 'e', 'L', 'a', 'D', 'O', 'N', 'H', 'e', 'L', 'a', 'T', 'M', 'G', '2.0', '', '', 'S', 'R', 'E', 'B', 'P', '-1', 'le', 'v', 'e', 'l', 'H', 'e', 'L', 'a', 'C', 'T', 'L', '0.0', '1.5', '1.0', '0.5', 'H', 'e', 'L', 'a', 'D', 'O', 'N', 'H', 'e', 'L', 'a', 'T', 'M', 'G', '2.0', '', 'F', 'A', 'S', '(A', '.U', ')', 'H', 'e', 'L', 'a', 'C', 'T', 'L', '0.0', '1.5', '1.0', '0.5', 'H', 'e', 'L', 'a', 'D', 'O', 'N', 'H', 'e', 'L', 'a', 'T', 'M', 'G', '2.0', '', 'A', 'C', 'C', '(A', '.U', ')', 'H', 'e', 'L', 'a', 'C', 'T', 'L', '0.0', '1.5', '1.0', '0.5', 'H', 'e', 'L', 'a', 'D', 'O', 'N', 'H', 'e', 'L', 'a', 'T', 'M', 'G', '2.0', '', '', '', '', '', '', '', '* ', '', 'O-GlcNAc', 'AMPK', 'pAMPK', 'SREBP-1', 'ACC', 'FAS', '�-actin', 'O', '-G', 'lN', 
'A', 'c', 'le', 'v', 'e', 'l', '(A', '.U', ')', 'C', 'T', 'L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u', 'e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a', '2.0', 'Q', 'u', 'e', 'r', 'H', 'e', 'L', 'a', 'p', 'A', 'M', 'P', 'K', '(A', '.U', ')', 'C', 'T', 'L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u', 'e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a', '2.5 ', 'Q', 'u', 'e', 'r', 'H', 'e', 'L', 'a', 'S', 'R', 'E', 'B', 'P', '-1', 'le', 'v', 'e', 'l', 'C', 'T', 'L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u', 'e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a', '', 'Q', 'u', 'e', 'r', 'H', 'e', 'L', 'a', 'A', 'C', 'C', '(A', '.U', ')', 'C', 'T', 'L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u', 'e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a', '2.0', '', 'Q', 'u', 'e', 'r', 'H', 'e', 'L', 'a', 'F', 'A', 'S', '(A', '.U', ')', 'C', 'T', 'L', 'H', 'a', 'C', 'a', 'T', '0.0', '1.5', '1.0', '0.5', 'Q', 'u', 'e', 'r', 'H', 'a', 'C', 'a', 'T', 'C', 'T', 'L', 'H', 'e', 'L', 'a', '', 'Q', 'u', 'e', 'r', 'H', 'e', 'L', 'a', '2.0', '', '* ', '', '***** *******', 'HaCaT HeLa', 'CTL Quer CTL Quer', 'A', 'Fig. 4. Quercetin regulates SREBP\xad1 and its target proteins']
import pandas as pd
df = pd.DataFrame(x, columns=['v'])
df['len'] = df.v.apply(len)
N = 2
M = 2
class cntseq(object):
    '''
    Define class to track sequences across the Column
    '''
    def __init__(self, **kwargs):
        self.prevLen = -1
        self.cnt = 0
        self.start = None
    def countN(self, r):
        if r.len == self.prevLen:
            # adjacent sequence found: mark its start as "len.start"
            self.cnt += 1
            if self.start is None:
                self.start = int(r.name)-1
            return '%d.%d'%(r.len, self.start)
        else:
            # no adjacent sequence: mark it as "None.None"
            self.prevLen = r.len
            self.cnt = 0
            self.start = None
            return 'None.None'
# Identify sequences of adjacent lengths.
cs = cntseq()
df['seq'] = df.apply(lambda r : cs.countN(r),axis=1)
print("\nOriginal DF info")
print(df.describe())
print("\nOriginal DF ")
print(df.head())
# Compute lookup of duplicate information
df2 = pd.DataFrame(df.groupby('seq').seq.count())
df2.columns=['M']
df2 = df2.reset_index()
# Drop the 'None.None' lookup row
n = df2[df2.seq == 'None.None'].index[0]
df2 = df2.drop(n, axis=0)
print("\nLookup DF, seq has count and index for start of 'adjcent elements'")
print(df2.head())
# Compute final DF without duplicates
df3 = df[df.seq.isin(list(df2[df2.M > M].seq))].head()
print("\nFinal DF without duplicated")
print(df3)
print("\nOriginal DF info")
print(df3.describe())
output:
Original DF info
              len
count  578.000000
mean     1.730104
std      5.284275
min      0.000000
25%      1.000000
50%      1.000000
75%      1.000000
max    110.000000
Original DF
                                                   v  len        seq
0  performed three times. Data represent the mean...  110  None.None
1                                                  B    1  None.None
2                                           O-GlcNAc    8  None.None
3                                               AMPK    4  None.None
4                                              pAMPK    5  None.None
Lookup DF, seq has count and index for start of 'adjcent elements'
     seq  M
0  0.221  1
1  0.325  6
2   0.70  1
3  1.111  2
4  1.116  8
Final DF without duplicated
    v  len  seq
10  T    1  1.9
11  L    1  1.9
12  D    1  1.9
13  O    1  1.9
14  N    1  1.9
Original DF info
       len
count  5.0
mean   1.0
std    0.0
min    1.0
25%    1.0
50%    1.0
75%    1.0
max    1.0

Python encoding using ASCII

I am designing a program that will encode messages imported from a CSV file. It will do this by converting each character to its ASCII value, adding 2, and then converting the result back to a character.
My current problem is that while my code encodes each character in each string, the characters of each message are no longer joined together.
Any help would be appreciated.
My code:
#importing csv file and allowing it to be read from
import csv
ifile = open("messages.csv","rb")
reader= csv.reader(ifile)
#creating lists
plain_text=[]
plain_ascii=[]
encrypted_ascii=[]
encrypted_text=[]
latitude=[]
longitude=[]
#appending csv data to separate lists
for row in reader:
    latitude.append(row[0])
    longitude.append(row[1])
    plain_text.append(row[2])
#encoding messages
encrypted_text=[[chr(ord(ch)+2) for ch in string] for string in plain_text]
print plain_text
print encrypted_text
ifile.close()
The current output:
['A famous Scottish victory in the First War of Scottish Independence - bit.ly/1yIAb8Q', "How high is Scotland's tallest mountain? - bit.ly/1q3Rj6D", 'What is the traditional instrument most often linked with Scotland? - http://#bit.ly/1lNdrk3', "A prickly problem Scotland's national symbol - bit.ly/1q3REpQ", 'Name the largest city in Scotland - bit.ly/T4OEuU']
[['C', '"', 'h', 'c', 'o', 'q', 'w', 'u', '"', 'U', 'e', 'q', 'v', 'v', 'k', 'u', 'j', '"', 'x', 'k', 'e', 'v', 'q', 't', '{', '"', 'k', 'p', '"', 'v', 'j', 'g', '"', 'H', 'k', 't', 'u', 'v', '"', 'Y', 'c', 't', '"', 'q', 'h', '"', 'U', 'e', 'q', 'v', 'v', 'k', 'u', 'j', '"', 'K', 'p', 'f', 'g', 'r', 'g', 'p', 'f', 'g', 'p', 'e', 'g', '"', '/', '"', 'd', 'k', 'v', '0', 'n', '{', '1', '3', '{', 'K', 'C', 'd', ':', 'S'], ['J', 'q', 'y', '"', 'j', 'k', 'i', 'j', '"', 'k', 'u', '"', 'U', 'e', 'q', 'v', 'n', 'c', 'p', 'f', ')', 'u', '"', 'v', 'c', 'n', 'n', 'g', 'u', 'v', '"', 'o', 'q', 'w', 'p', 'v', 'c', 'k', 'p', 'A', '"', '/', '"', 'd', 'k', 'v', '0', 'n', '{', '1', '3', 's', '5', 'T', 'l', '8', 'F'], ['Y', 'j', 'c', 'v', '"', 'k', 'u', '"', 'v', 'j', 'g', '"', 'v', 't', 'c', 'f', 'k', 'v', 'k', 'q', 'p', 'c', 'n', '"', 'k', 'p', 'u', 'v', 't', 'w', 'o', 'g', 'p', 'v', '"', 'o', 'q', 'u', 'v', '"', 'q', 'h', 'v', 'g', 'p', '"', 'n', 'k', 'p', 'm', 'g', 'f', '"', 'y', 'k', 'v', 'j', '"', 'U', 'e', 'q', 'v', 'n', 'c', 'p', 'f', 'A', '"', '/', '"', 'j', 'v', 'v', 'r', '<', '1', '1', 'd', 'k', 'v', '0', 'n', '{', '1', '3', 'n', 'P', 'f', 't', 'm', '5'], ['C', '"', 'r', 't', 'k', 'e', 'm', 'n', '{', '"', 'r', 't', 'q', 'd', 'n', 'g', 'o', '"', 'U', 'e', 'q', 'v', 'n', 'c', 'p', 'f', ')', 'u', '"', 'p', 'c', 'v', 'k', 'q', 'p', 'c', 'n', '"', 'u', '{', 'o', 'd', 'q', 'n', '"', '/', '"', 'd', 'k', 'v', '0', 'n', '{', '1', '3', 's', '5', 'T', 'G', 'r', 'S'], ['P', 'c', 'o', 'g', '"', 'v', 'j', 'g', '"', 'n', 'c', 't', 'i', 'g', 'u', 'v', '"', 'e', 'k', 'v', '{', '"', 'k', 'p', '"', 'U', 'e', 'q', 'v', 'n', 'c', 'p', 'f', '"', '/', '"', 'd', 'k', 'v', '0', 'n', '{', '1', 'V', '6', 'Q', 'G', 'w', 'W']]
You need to join the inner lists.
[''.join(chr(ord(ch)+2) for ch in string) for string in plain_text]
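As a self-contained illustration in Python 3 (the question's code is Python 2), the join can be wrapped in a small helper. The name shift_text is hypothetical, and the in-memory CSV row with made-up coordinates stands in for messages.csv, whose exact contents are not shown in the question.

```python
import csv
import io


def shift_text(text, offset=2):
    """Shift every character's code point by offset and rejoin into one string."""
    return ''.join(chr(ord(ch) + offset) for ch in text)


# Stand-in for messages.csv: latitude, longitude, message (coordinates are made up).
sample = io.StringIO('55.9,-3.2,Name the largest city in Scotland\n')
plain_text = [row[2] for row in csv.reader(sample)]

encrypted_text = [shift_text(message) for message in plain_text]
print(encrypted_text)
# ['Pcog"vjg"nctiguv"ekv{"kp"Ueqvncpf']
```

Calling shift_text with offset=-2 on an encoded message reverses the shift.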

How do I add a suffix to the end of each list element?

I have two lists of unequal length. The first list contains variable names for a longitudinal study; the other contains suffixes for these variables. The user supplies a CSV from which the variable names are read, and is then prompted to enter the number of iterations (n) of these variables and the names of the n suffixes.
Here is a fake example of what I mean
Number of iterations: 2
Suffix1: pre
Suffix2: 6month
List 1:
['mood1', 'mood2', 'mood3', 'dep1', 'dep2', 'dep3']
List 2:
['pre', '6month']
Desired concatenation:
['mood1_pre', 'mood2_pre', 'mood3_pre', 'dep1_pre', 'dep2_pre', 'dep3_pre', 'mood1_6month', ..., 'dep3_6month']
I have the program working fully, except the output splits each letter of the concatenated list into its own element, for example:
How many iterations of the variables do you need?: 3
Variable Suffix 1: pre
Variable Suffix 2: 6m
Variable Suffix 3: 12m
['B', 'o', 'b', '_', 'p', 'r', 'e', 'J', 'o', 'e', '_', 'p', 'r', 'e', 'J', 'i',
'm', '_', 'p', 'r', 'e', 'A', '_', 'p', 'r', 'e', 'B', '_', 'p', 'r', 'e', 'C',
'_', 'p', 'r', 'e', '1', '_', 'p', 'r', 'e', '2', '_', 'p', 'r', 'e', '3', '_',
'p', 'r', 'e', '1', '4', '_', 'p', 'r', 'e', 'B', 'o', 'b', '_', '6', 'm', 'J',
'o', 'e', '_', '6', 'm', 'J', 'i', 'm', '_', '6', 'm', 'A', '_', '6', 'm', 'B',
'_', '6', 'm', 'C', '_', '6', 'm', '1', '_', '6', 'm', '2', '_', '6', 'm', '3',
'_', '6', 'm', '1', '4', '_', '6', 'm', 'B', 'o', 'b', '_', '1', '2', 'm', 'J',
'o', 'e', '_', '1', '2', 'm', 'J', 'i', 'm', '_', '1', '2', 'm', 'A', '_', '1',
'2', 'm', 'B', '_', '1', '2', 'm', 'C', '_', '1', '2', 'm', '1', '_', '1', '2',
'm', '2', '_', '1', '2', 'm', '3', '_', '1', '2', 'm', '1', '4', '_', '1', '2',
'm']
I am using this to make the new list
newvarlist.extend((varlist[vars] + '_' + varsuffix[j]))
Here is one way to do it using list comprehension:
['{}_{}'.format(a, b) for b in b_list for a in a_list]
Demo:
>>> a_list = ['mood1', 'mood2', 'mood3', 'dep1', 'dep2', 'dep3']
>>> b_list = ['pre', '6month']
>>> result = ['{}_{}'.format(a, b) for b in b_list for a in a_list]
>>> result
['mood1_pre', 'mood2_pre', 'mood3_pre', 'dep1_pre', 'dep2_pre', 'dep3_pre', 'mood1_6month', 'mood2_6month', 'mood3_6month', 'dep1_6month', 'dep2_6month', 'dep3_6month']
If you are flexible on the ordering of the final list:
from itertools import product
l1 = ['mood1', 'mood2', 'mood3', 'dep1', 'dep2', 'dep3']
l2 = ['pre', '6month']
# the built-in map works on both Python 2 and 3 (itertools.imap is Python 2 only)
x = list(map('_'.join, product(l1, l2)))
This produces ['mood1_pre', 'mood1_6month', ...] rather than ['mood1_pre', 'mood2_pre', ...].
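The character-by-character output in the question most likely comes from list.extend, which iterates over its argument; a string is iterated one character at a time. A short sketch of the difference, reusing the question's variable names with shortened example data:

```python
varlist = ['mood1', 'mood2', 'dep1']
varsuffix = ['pre', '6month']

# extend() treats the string as an iterable of characters.
split_up = []
split_up.extend('mood1' + '_' + 'pre')
print(split_up)  # ['m', 'o', 'o', 'd', '1', '_', 'p', 'r', 'e']

# append() adds the whole string as a single element.
newvarlist = []
for suffix in varsuffix:
    for var in varlist:
        newvarlist.append(var + '_' + suffix)
print(newvarlist)
# ['mood1_pre', 'mood2_pre', 'dep1_pre', 'mood1_6month', 'mood2_6month', 'dep1_6month']
```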
