Python: append microseconds to the date element of a list

Hello, I have lists of data generated as below:
l_ele = line.split()
['2014-02-10T15:57:00.400733+00:00', 'coccus1', 'info="processing"']
['2014-02-10T15:57:02.734042+00:00', 'coccus1', 'info="processing"']
['2014-02-10T15:57:02+00:00', 'coccus1', 'info="processing"']
['2014-02-10T15:57:03+00:00', 'coccus1', 'info="looking for match"']
['2014-02-10T15:57:04+00:00', 'coccus1', 'info="sampling"']
['2014-02-10T15:57:06.771501+00:00', 'coccus1', 'info="sampling"']
I would like to add the microseconds part (.ssssss) as .000000 to the date element of each list when it is missing. How can I achieve this?
Expected Output:
['2014-02-10T15:57:00.400733+00:00', 'coccus1', 'info="processing"']
['2014-02-10T15:57:02.734042+00:00', 'coccus1', 'info="processing"']
['2014-02-10T15:57:02.000000+00:00', 'coccus1', 'info="processing"']
['2014-02-10T15:57:03.000000+00:00', 'coccus1', 'info="looking for match"']
['2014-02-10T15:57:04.000000+00:00', 'coccus1', 'info="sampling"']
['2014-02-10T15:57:06.771501+00:00', 'coccus1', 'info="sampling"']

It is not clear what format your data is in, but assuming two strings:
s1 = "2014-02-10T15:57:02+00:00"
s2 = "2014-02-10T15:57:02.734042+00:00"
you can ensure they both match formats by doing:
def process_string(s):
    # A timestamp that already has microseconds is 32 characters long
    return s if len(s) == 32 else "".join((s[:-6], ".000000", s[-6:]))
Or, in Python pre-2.5:
def process_string(s):
    if len(s) == 32:
        return s
    return "".join((s[:-6], ".000000", s[-6:]))
Examples:
>>> process_string(s1)
'2014-02-10T15:57:02.000000+00:00'
>>> process_string(s2)
'2014-02-10T15:57:02.734042+00:00'
Ordinarily, I would recommend using datetime to do this, but your timezone offset is not in the format supported by strptime.
To apply this to the first item in each list, simply access it by index, for example:
>>> l = ['2014-02-10T15:57:02+00:00', 'coccus1','info="processing"']
>>> l[0] = process_string(l[0])
>>> l
['2014-02-10T15:57:02.000000+00:00', 'coccus1', 'info="processing"']
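On Python 3.7 or later, the datetime route mentioned above becomes workable, because datetime.fromisoformat accepts offsets written as +00:00. The following is a minimal sketch under that assumption; the helper name ensure_microseconds is hypothetical and not part of the original answer.

```python
from datetime import datetime

def ensure_microseconds(stamp):
    """Return the ISO 8601 timestamp with an explicit six-digit microseconds field."""
    # fromisoformat (Python 3.7+) parses timestamps with or without a fractional part.
    parsed = datetime.fromisoformat(stamp)
    # timespec="microseconds" forces the .ffffff part even when it is zero.
    return parsed.isoformat(timespec="microseconds")

row = ['2014-02-10T15:57:02+00:00', 'coccus1', 'info="processing"']
row[0] = ensure_microseconds(row[0])
print(row)
```

Unlike the length check, this raises ValueError on a malformed timestamp, which may or may not be what you want.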

Related

How to convert numeric string from a sublist in Python

I'm a beginner. I would like to convert the numeric strings in each sublist to int in Python, but I am not getting the right results. 😔
countitem = 0
list_samp = [['1','2','blue'],['1','66','green'],['1','88','purple']]
for list in list_samp:
    countitem =+1
    for element in list:
        convert_element = int(list_samp[countitem][0])
        list_samp[countitem][1] = convert_element
You can do it like this:
list_samp = [['1','2','blue'],['1','66','green'],['1','88','purple']]
me = [[int(u) if u.isdecimal() else u for u in v] for v in list_samp]
print(me)
The correct way to do it:
list_samp = [['1','2','blue'],['1','66','green'],['1','88','purple']]
list_int = [[int(i) if i.isdecimal() else i for i in l] for l in list_samp]
print(list_int)
Let's go through the process step by step:
list_samp = [['1','2','blue'],['1','66','green'],['1','88','purple']]
# Traverse the outer list
for sublist in list_samp:  # each inner list
    for i in range(len(sublist)):  # index of each element in the inner list
        if sublist[i].isdecimal():  # True only if every character is a decimal digit
            sublist[i] = int(sublist[i])  # store the converted integer back at index i
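Digit checks such as str.isdecimal reject signed values like '-66', so those would be left as strings. If that matters, one alternative is to attempt the conversion and fall back to the original value. This is only a sketch; the helper name to_int_if_possible and the negative sample value are assumptions made for illustration.

```python
def to_int_if_possible(value):
    """Return int(value) when the string parses as an integer, else the value unchanged."""
    try:
        return int(value)
    except ValueError:
        return value

# Sample data with one negative number added for illustration
list_samp = [['1', '2', 'blue'], ['1', '-66', 'green'], ['1', '88', 'purple']]
converted = [[to_int_if_possible(item) for item in sublist] for sublist in list_samp]
print(converted)
```

Note that int() also accepts surrounding whitespace, so ' 7 ' would be converted as well.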

I want to print the letters in each word in matrix form given the number of columns

I have a string thestackoverflow
and I also have # columns = 4
Then I want the output as
thes
tack
over
flow
You can do it using Python slice notation.
Refer to this thread for a nice explanation of slice notation: Understanding slice notation
Example code for your question:
>>> input_string = "thestackoverflow"
>>> chop_size = 4
>>> while input_string:
...     print(input_string[:chop_size])
...     input_string = input_string[chop_size:]
...
thes
tack
over
flow
You can have a look at textwrap
import textwrap
string = 'thestackoverflow'
max_width = 4
result = textwrap.fill(string,max_width)
print(result)
thes
tack
over
flow
If you don't want to use any module
string = 'thestackoverflow'
max_width = 4
row = 0
result = ''
while row*max_width < len(string):
    result+='\n'+string[row*max_width:(row+1)*max_width]
    row+=1
result = result.strip()
print(result)
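The same slicing idea can be written as a list comprehension that steps through the string in fixed-width strides. This is a small sketch; the function name chunk is made up for the example. Unlike textwrap.fill, which breaks lines at whitespace, plain slicing always cuts at exact character positions.

```python
def chunk(text, width):
    """Split text into consecutive pieces of at most `width` characters."""
    return [text[start:start + width] for start in range(0, len(text), width)]

rows = chunk('thestackoverflow', 4)
print('\n'.join(rows))
```

The last piece is simply shorter when the length is not a multiple of the width.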

Replace a substring in a string according to a list

According to tutorialspoint:
The method replace() returns a copy of the string in which the occurrences of old have been replaced with new. https://www.tutorialspoint.com/python/string_replace.htm
Therefore one can use:
>>> text = 'fhihihi'
>>> text.replace('hi', 'o')
'fooo'
With this idea, given a list [1,2,3], and a string 'fhihihi' is there a method to replace a substring hi with 1, 2, and 3 in order? For example, this theoretical solution would yield:
'f123'
You can create a format string out of your initial string:
>>> text = 'fhihihi'
>>> replacement = [1,2,3]
>>> text.replace('hi', '{}').format(*replacement)
'f123'
Use re.sub:
import re
counter = 0
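def replacer(match):
    global counter
    counter += 1
    return str(counter)
re.sub(r'hi', replacer, text)
This makes a single pass over the string. Note that it inserts a running counter (1, 2, 3, ...) rather than the items of a list, and that counter has to be reset to 0 before each call. Whether it is faster than the str.replace approach depends on the input, so measure before relying on it.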
One solution with re.sub:
text = 'fhihihi'
lst = [1,2,3]
import re
print(re.sub(r'hi', lambda g, l=iter(lst): str(next(l)), text))
Prints:
f123
Other answers gave good solutions. If you want to re-invent the wheel, here is one way.
text = "fhihihi"
target = "hi"
replacements = [1, 2, 3]
l = len(target)
i = 0
c = 0  # index of the next replacement to use
new_string_list = []
while i < len(text):
    if text[i:i + l] == target:
        new_string_list.append(str(replacements[c]))
        i += l
        c += 1
        continue
    new_string_list.append(text[i])
    i += 1
print("".join(new_string_list))
The pieces are collected in a list and joined once at the end, which avoids creating a new string on every concatenation.
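None of the answers above say what should happen when the string contains more occurrences than the list has items. One possible choice is to leave the extra occurrences untouched. Here is a sketch of that, assuming a literal (non-regex) target; the function name replace_in_order is hypothetical.

```python
import re

def replace_in_order(text, target, replacements):
    """Replace successive occurrences of target with successive items of replacements."""
    items = iter(replacements)

    def pick(match):
        # When the replacements run out, fall back to the matched text itself.
        return str(next(items, match.group(0)))

    # re.escape makes sure the target is matched literally.
    return re.sub(re.escape(target), pick, text)

print(replace_in_order('fhihihi', 'hi', [1, 2, 3]))
print(replace_in_order('fhihihi', 'hi', [1]))
```

Because the iterator is created inside the function, repeated calls do not share state.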

How to cut a string with duplicates?

I have this string:
'fhsdkfhskdslshsdkhlghs
bksjvsfgsdnfsfbjfgzfga
avzaeafeaeaddacbytt!tw
fhsdkfhskdslshsdkhlghs
bksjvsfgsdnfsfbjfgzfga
avzaeafeaeaddacbytt!tw'
I want to use this code to cut it into pieces of length 22:
from textwrap import wrap
w_str = wrap(str, 22)
Then I get this:
fhsdkfhskdslshsdkhlghs
bksjvsfgsdnfsfbjfgzfga
avzaeafeaeaddacbytt!tw
The next step should take the last four letters of each string and paste them at the beginning of the next one, and so on.
Just like this, with an ID:
e_1
fhsdkfhskdslshsdkhlghs
bksjvsfgsdnfsfbjfgzfgaavza
e_2
avzaeafeaeaddacbytt!tw
fhsdkfhskdslshsdkhlghslghs
e_3
lghsbksjvsfgsdnfsfbjfgzfga
zfgaavzaeafeaeaddacbytt!tw
Once you have your string as such:
_str = """fhsdkfhskdslshsdkhlghs
bksjvsfgsdnfsfbjfgzfga
avzaeafeaeaddacbytt!tw"""
You can do the following:
>>> _str = _str.split()
>>> new = [_str[i-1][len(_str[i-1])-4:len(_str[i-1])]+_str[i] if i > 0 else _str[i] for i in range(len(_str))]
>>> print('\n'.join(new))
fhsdkfhskdslshsdkhlghs
lghsbksjvsfgsdnfsfbjfgzfga
zfgaavzaeafeaeaddacbytt!tw
>>>
Edit
To add the IDs, zip the list of IDs together with the list of strings in a list comprehension, as such:
'\n'.join(['\n'.join(item) for item in zip(['e_'+str(num) for num in range(1, len(new)+1)], new)])
>>> _str = _str.split()
>>> new = [_str[i-1][len(_str[i-1])-4:len(_str[i-1])]+_str[i] if i > 0 else _str[i] for i in range(len(_str))]
>>> print('\n'.join(['\n'.join(item) for item in zip(['e_'+str(num) for num in range(1, len(new)+1)], new)]))
e_1
fhsdkfhskdslshsdkhlghs
e_2
lghsbksjvsfgsdnfsfbjfgzfga
e_3
zfgaavzaeafeaeaddacbytt!tw
>>>
In some ways, strings are like lists in Python: you can reference their contents by index, slice them, and so on.
So you could use slicing to pull out the last 4 characters of each wrapped string:
from textwrap import wrap
input_string = 'fhsdkfhskdslshsdkhlghsbksjvsfgsdnfsfbjfgzfgaavzaeafeaeaddacbytt!tw'
split_strings = wrap(input_string, 22)
add_string = ''  # nothing there at first, but will be updated as we process each wrapped string
for output_string in split_strings:
    print(add_string + output_string)
    add_string = output_string[-4:]  # "[-4:]" means: "from the fourth-last char of the string to the end"
outputs:
fhsdkfhskdslshsdkhlghs
lghsbksjvsfgsdnfsfbjfgzfga
zfgaavzaeafeaeaddacbytt!tw
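The two steps above (cutting into fixed-width pieces, then prefixing each piece with the tail of the previous one) can be wrapped in one function and combined with the e_N labels from the question. This is a sketch under the assumption that the pieces are cut by plain slicing rather than textwrap; the name overlapping_chunks is made up.

```python
def overlapping_chunks(text, width, overlap):
    """Cut text into width-sized pieces, each prefixed with the previous piece's tail."""
    pieces = [text[i:i + width] for i in range(0, len(text), width)]
    result = []
    tail = ''  # the first piece has no predecessor
    for piece in pieces:
        result.append(tail + piece)
        tail = piece[-overlap:] if overlap > 0 else ''
    return result

sample = ('fhsdkfhskdslshsdkhlghs'
          'bksjvsfgsdnfsfbjfgzfga'
          'avzaeafeaeaddacbytt!tw')
for number, line in enumerate(overlapping_chunks(sample, 22, 4), start=1):
    print('e_{}'.format(number))
    print(line)
```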

Encoding a numeric string into a shortened alphanumeric string, and back again

Quick question. I'm trying to find or write an encoder in Python to shorten a string of numbers by using upper and lower case letters. The numeric strings look something like this:
20120425161608678259146181504021022591461815040210220120425161608667
The length is always the same.
My initial thought was to write some simple encoder to utilize upper and lower case letters and numbers to shorten this string into something that looks more like this:
a26Dkd38JK
That was completely arbitrary, just trying to be as clear as possible.
I'm certain that there is a really slick way to do this, probably already built in. Maybe this is an embarrassing question to even be asking.
Also, I need to be able to take the shortened string and convert it back to the longer numeric value.
Should I write something and post the code, or is this a one line built in function of Python that I should already know about?
Thanks!
This is a pretty good compression:
import base64
def num_to_alpha(num):
    # Python 2 only: relies on str.decode('hex') and on the trailing "L" of long integers
    num = hex(num)[2:].rstrip("L")
    if len(num) % 2:
        num = "0" + num
    return base64.b64encode(num.decode('hex'))
It first turns the integer into a bytestring and then base64 encodes it. Here's the decoder:
def alpha_to_num(alpha):
    num_bytes = base64.b64decode(alpha)
    return int(num_bytes.encode('hex'), 16)
Example:
>>> num_to_alpha(20120425161608678259146181504021022591461815040210220120425161608667)
'vw4LUVm4Ea3fMnoTkHzNOlP6Z7eUAkHNdZjN2w=='
>>> alpha_to_num('vw4LUVm4Ea3fMnoTkHzNOlP6Z7eUAkHNdZjN2w==')
20120425161608678259146181504021022591461815040210220120425161608667
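The functions above depend on str.decode('hex'), which exists only in Python 2. A rough Python 3 counterpart can use int.to_bytes and int.from_bytes instead. This is a sketch that assumes non-negative integers; the names int_to_b64 and b64_to_int are hypothetical.

```python
import base64

def int_to_b64(num):
    """Encode a non-negative int as a base64 text string."""
    # Use just enough bytes to hold the number (at least one, so 0 works too).
    length = max(1, (num.bit_length() + 7) // 8)
    raw = num.to_bytes(length, 'big')
    return base64.b64encode(raw).decode('ascii')

def b64_to_int(alpha):
    """Decode a string produced by int_to_b64 back into an int."""
    return int.from_bytes(base64.b64decode(alpha), 'big')

n = 20120425161608678259146181504021022591461815040210220120425161608667
encoded = int_to_b64(n)
print(encoded)
print(b64_to_int(encoded) == n)
```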
Here are two custom functions (not based on base64) that produce shorter output:
chrs = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
l = len(chrs)
def int_to_cust(i):
    result = ''
    while i:
        result = chrs[i % l] + result
        i = i // l
    if not result:
        result = chrs[0]
    return result
def cust_to_int(s):
    result = 0
    for char in s:
        result = result * l + chrs.find(char)
    return result
And the results are:
>>> int_to_cust(20120425161608678259146181504021022591461815040210220120425161608667)
'9F9mFGkji7k6QFRACqLwuonnoj9SqPrs3G3fRx'
>>> cust_to_int('9F9mFGkji7k6QFRACqLwuonnoj9SqPrs3G3fRx')
20120425161608678259146181504021022591461815040210220120425161608667L
You can also shorten the generated string, if you add other characters to the chrs variable.
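How much a larger alphabet helps can be estimated up front: any number with d decimal digits fits in ceil(d * log 10 / log b) characters of a b-character alphabet. The sketch below applies that formula to a 68-digit number like the one in the question. The function name min_encoded_length is made up, and since it uses floating point the result should be read as an estimate.

```python
import math

def min_encoded_length(num_digits, alphabet_size):
    """Characters needed to represent any num_digits-digit decimal number."""
    return math.ceil(num_digits * math.log(10) / math.log(alphabet_size))

# 62 = digits + letters, 64 = base64 alphabet, 94 = printable ASCII without space
for size in (62, 64, 94):
    print(size, min_encoded_length(68, size))
```

The gain from adding characters is logarithmic, so going well beyond 62 symbols saves only a few characters.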
Or do it with a class:
VALID_CHRS = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'
BASE = len(VALID_CHRS)
# Map each character to its numeric value
MAP_CHRS = {char: index for index, char in enumerate(VALID_CHRS)}
class TinyNum:
    """Compact number representation in alphanumeric characters."""
    def __init__(self, n):
        result = ''
        while n:
            result = VALID_CHRS[n % BASE] + result
            n //= BASE
        if not result:
            result = VALID_CHRS[0]
        self.num = result
    def to_int(self):
        """Return the number as an int."""
        result = 0
        for char in self.num:
            result = result * BASE + MAP_CHRS[char]
        return result
Sample usage:
>>> n = 4590823745
>>> tn = TinyNum(n)
>>> print(n)
4590823745
>>> print(tn.num)
50GCYh
>>> print(tn.to_int())
4590823745
(Based on Tadeck's answer.)
>>> s="20120425161608678259146181504021022591461815040210220120425161608667"
>>> import base64, zlib
>>> base64.b64encode(zlib.compress(s))
'eJxly8ENACAMA7GVclGblv0X4434WrKFVW5CtJl1HyosrZKRf3hL5gLVZA2b'
>>> zlib.decompress(base64.b64decode(_))
'20120425161608678259146181504021022591461815040210220120425161608667'
So zlib isn't very good at compressing a short string of digits like this one :(
