Parsing and rewriting a date-time string [duplicate]

I have a string of time data, "2019-08-02 18:18:06.02887", and I am trying to rewrite it as another string, "EV190802_181802", in another file.
My current approach is to split the string into lists and rebuild the new string from the pieces:
hello=data.split(' ')
date=hello[0]
time=hello[1]
world=hello[0].split('-')
stack=time.split('.')
overflow=stack[0].split(':')
print('EV' + world[0] + world[1] + world[2] + '_' + overflow[0] + overflow[1] + overflow[2])
However, I have no idea how to remove the "20" from "2019" (world[0]). Is there a way to do that?
Suggestions for alternative ways to rewrite the string are welcome as well.

Another way to solve the problem is to parse the string with datetime:
>>> from datetime import datetime
>>>
>>> format_ = datetime.strptime("2019-08-02 18:18:06.02887",
...                             "%Y-%m-%d %H:%M:%S.%f")
>>>
>>> print(
...     format_.strftime('EV%y%m%d_%H%M') + format_.strftime('%f')[:2]
... )
EV190802_181802
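If the seconds rather than the fractional digits are wanted at the end (the question's expected output is ambiguous on this point), the same parse can produce the whole string with a single strftime call. This is only a minimal sketch; the function name to_event_id is made up for illustration.

```python
from datetime import datetime

def to_event_id(stamp: str) -> str:
    """Format 'YYYY-MM-DD HH:MM:SS.ffffff' as 'EVyymmdd_HHMMSS' (hypothetical helper)."""
    parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
    # %y is the two-digit year, so no manual slicing of "2019" is needed.
    return parsed.strftime("EV%y%m%d_%H%M%S")

print(to_event_id("2019-08-02 18:18:06.02887"))  # EV190802_181806
```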

Using regex:
import re
data = "2019-08-02 18:18:06.02887"
res = re.match(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})\s(?P<hours>\d{2}):(?P<minutes>\d{2}):(?P<seconds>\d{2})\.(?P<miliseconds>\d+)',data)
out = f"EV{res.group('year')[2:]}{res.group('month')}{res.group('day')}_{res.group('hours')}{res.group('minutes')}{res.group('miliseconds')[:2]}"
print(out)
Output will be:
EV190802_181802

Remove all occurrences of -, . and : (str.replace returns a new string, so the result has to be assigned):
hello = data.replace("-", "").replace(".", "").replace(":", "")
Then build the string in one line:
print("EV" + hello[2:8] + "_" + hello[9:15])
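Because Python strings are immutable, any removal has to produce a new string. As an alternative to chaining replace calls, str.translate can delete several characters in one pass. The sketch below assumes the input always has the fixed layout shown in the question; the variable names are illustrative only.

```python
data = "2019-08-02 18:18:06.02887"

# Build a table that deletes '-', ':' and '.'; the space is kept on purpose
# so that the date and time parts stay at predictable offsets.
strip_table = str.maketrans("", "", "-:.")
compact = data.translate(strip_table)  # "20190802 18180602887"

# Skip the leading "20", then take six date digits and six time digits.
event_id = "EV" + compact[2:8] + "_" + compact[9:15]
print(event_id)  # EV190802_181806
```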

Without regex, just string splits and joins:
string = "2019-08-02 18:18:06.02887"
target = "EV190802_181806"
d, t = string[2:].split(".")[0].split(" ")  # drop the leading "20", then the fractional part
print("date:", d, "\ntime:", t)
d = "".join(d.split('-'))
t = "".join(t.split(':'))
result = "EV" + d + "_" + t
print("\nresult: ", result)
assert result == target
# >> out:
# date: 19-08-02
# time: 18:18:06
#
# result: EV190802_181806
I'm assuming that you want the seconds ("06") at the end of the target string (sorry if I'm mistaken!).

I would use re.sub here for a regex approach:
import re
inp = "2019-08-02 18:18:06.02887"
output = re.sub(r'^\d{2}(\d{2})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2}).*$',
                'EV\\1\\2\\3_\\4\\5\\6',
                inp)
print(output)
This prints:
EV190802_181806
Note: Your expected output was given as EV190802_181802, but I assume that is a typo, as I see no reason why you would report hundredths of a second instead of seconds.

Related

Split a string if character is present else don't split

I have a string like the one below in Python:
testing_abc
I want to split the string on _ and extract the second element.
I have done it like this:
split_string = string.split('_')[1]
and I get the expected output:
abc
Now I want this to work for the strings below as well.
1) xyz
When I use
split_string = string.split('_')[1]
I get the error below:
list index out of range
The output I expect is xyz.
2) testing_abc_bbc
When I use
split_string = string.split('_')[1]
I get abc as output.
The output I expect is abc_bbc.
Basically, what I want is:
1) If the string contains `_`, store everything after the first `_` in the variable
2) If the string doesn't contain `_`, store the whole string in the variable
How can I achieve this?

Set the maxsplit argument of split to 1 and then take the last element of the resulting list.
>>> "testing_abc".split("_", 1)[-1]
'abc'
>>> "xyz".split("_", 1)[-1]
'xyz'
>>> "testing_abc_bbc".split("_", 1)[-1]
'abc_bbc'

You can use list slicing and str.join in case _ is in the string, and you can just get the first element of the split (which is the only element) in the other case:
sp = string.split('_')
result = '_'.join(sp[1:]) if len(sp) > 1 else sp[0]

All of these approaches work, but there is also a very simple way to do this with str.index:
s = 'aaabbnkkjbg_gghjkk_ttty'
try:
    ans = s[s.index('_')+1:]
except ValueError:  # raised by str.index when '_' is not in s
    ans = s

The error is expected: you are splitting on '_', but the string doesn't contain it, so the resulting list has only one element.
See How to check a string for specific characters? for character checking.
If you want to split only if the string contains a '_', and only on the first one:
input_string = "blah_end"
delimiter = '_'
if delimiter in input_string:
    result = input_string.split(delimiter, 1)[1]  # The ", 1" says to split only once
else:
    # Do whatever you need here. If you want a space, " ", to be a delimiter too, you can try that.
    result = input_string
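The check-then-split logic can also be written with str.partition, which always returns a 3-tuple (head, separator, tail) and leaves the separator empty when it is not found. A minimal sketch, with a made-up function name:

```python
def after_first_underscore(text: str) -> str:
    """Return everything after the first '_', or the whole string if there is none."""
    head, sep, tail = text.partition("_")
    # sep is '' when no underscore was found; in that case head is the full string.
    return tail if sep else head

for sample in ("testing_abc", "xyz", "testing_abc_bbc"):
    print(sample, "->", after_first_underscore(sample))
```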

This code will solve your problem:
txt = "apple_banana_cherry_orange"
# setting the maxsplit parameter to 1 returns a list with at most 2 elements
x = txt.split("_", 1)
print(x[-1])

Extract some letters from a string and add hyphen in Python

I have files with the date attached at the end of the filename, like this:
string = 'blablablabla_20210812.jpg'
I extract the date like this:
string[-12:][:-4]
I want to add '-' between the year, month and day. This is how I do it:
string[-12:][:-4][:4] + '-' + string[-12:][:-4][4:][:2] + '-' + string[-12:][:-4][6:]
In my opinion, this is harder to read than machine code. Could you show me a more pragmatic way?

One solution is to use a regular expression and re.sub:
import re
s = "blablablabla_20210812.jpg"
s = re.sub(r"_(\d{4})(\d{2})(\d{2})\.", r"_\1-\2-\3.", s)
print(s)
Prints:
blablablabla_2021-08-12.jpg

You can also join the captured groups, using a lambda as the replacement:
>>> import re
>>> string = 'blablablabla_20210812.jpg'
>>> re.sub(r'(\d{4})(\d{2})(\d{2})', lambda m: '-'.join(m.groups()), string)
'blablablabla_2021-08-12.jpg'

You can put both indices in a single pair of square brackets and use str.join:
'-'.join([string[-12:-8], string[-8:-6], string[-6:-4]])
Also, I personally prefer to keep code readable. You can name the variables first:
def extractDataInfo(string):
    year, month, day = string[-12:-8], string[-8:-6], string[-6:-4]
    return '-'.join([year, month, day])

You can compress the string[-12:][:-4][:4] to string[-12:-8] to make it look cleaner. The code will look like this:
string = 'blablablabla_20210812.jpg'
print(string[-12:-8] + '-' + string[-8:-6]+ '-' + string[-6:-4])
# 2021-08-12
Or this, if you want to keep the text before the date:
print(string[:-8] + '-' + string[-8:-6]+ '-' + string[-6:-4])
# blablablabla_2021-08-12

A simple solution that works only if you know the filename always has the form:
_data.extension
If so, the solution is:
string = 'blablablabla_20210812.jpg'
s = string.replace("_", ".")
s = s.split(".")
data = s[1]
s[1] = data[:4] + "-" + data[4:6] + "-" + data[6:]
print(s[1]) # OUTPUT: 2021-08-12
# Taking all together:
print(s[0] + "_" + s[1] + "." + s[2]) # OUTPUT: blablablabla_2021-08-12.jpg

You can also parse the date with datetime:
import datetime
print(datetime.datetime.strptime(string[-12:-4], '%Y%m%d').strftime('%Y-%m-%d'))
Output:
2021-08-12
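The regex and datetime approaches can be combined so that the date is located by pattern rather than by fixed offsets, and is also checked for validity. This is a sketch under the assumption that the date is eight digits between an underscore and the extension dot; hyphenate_date is a hypothetical name.

```python
import re
from datetime import datetime

def hyphenate_date(filename: str) -> str:
    """Rewrite '..._YYYYMMDD.ext' as '..._YYYY-MM-DD.ext'; leave other names alone."""
    match = re.search(r"_(\d{8})\.", filename)
    if match is None:
        return filename
    # strptime raises ValueError for impossible dates such as 20211340.
    iso = datetime.strptime(match.group(1), "%Y%m%d").strftime("%Y-%m-%d")
    # Splice by position so that only the matched digits are replaced.
    return filename[:match.start(1)] + iso + filename[match.end(1):]

print(hyphenate_date("blablablabla_20210812.jpg"))  # blablablabla_2021-08-12.jpg
```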

Unable to remove certain symbols from a string?

#Loading of the .csv data from the Happy Planet Index and WHO, defining and clearing the data for RDF conversion.
hsi_data = pd.read_csv("HPI_Main.csv", sep=';')
hsi_data = hsi_data.replace(to_replace=[" ", "%"], value="", regex=True)
hsi_data = hsi_data.replace(to_replace=",", value=".", regex=True)
hsi_data = hsi_data.fillna("unknown")
pd.set_option("display.max_rows", None, "display.max_columns", None)
hsi_data.columns = hsi_data.columns.str.replace(' ', '_')
for x in hsi_data["GDP/capita"]:
    re.sub(r'$', ' ', x)
    print(x)
The whole point is to remove the $ sign from the data in GDP/capita and convert it to an integer. However, nothing seems to remove the symbol, neither re.sub nor replace nor remove. It's as if it isn't detecting it.

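I looked at the Happy Planet data. The problem with your re replacement is that $ is a special character in a regular expression (it matches the end of the string), so it must be escaped. Note also that re.sub returns a new string rather than changing x in place, so the result has to be assigned. This works:
import re
x = " $67,646 "
z = re.sub("\\$", " ", x)
print(z)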
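To make the difference concrete, the sketch below contrasts the unescaped and escaped patterns and then converts one cell to an integer. It assumes the comma in the sample value is a thousands separator, which may not match the cleaning done in the question; parse_dollars is a made-up helper name.

```python
import re

cell = " $67,646 "

# Unescaped, '$' only matches the (empty) end of the string, so nothing is removed.
unchanged = re.sub(r"$", "", cell)

def parse_dollars(text: str) -> int:
    """Strip a literal '$', thousands commas and whitespace, then convert to int."""
    cleaned = re.sub(re.escape("$"), "", text)  # re.escape builds the escaped pattern
    cleaned = cleaned.replace(",", "").strip()  # assumption: ',' separates thousands
    return int(cleaned)

print(repr(unchanged))      # ' $67,646 '
print(parse_dollars(cell))  # 67646
```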

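If you want to remove the $ sign and convert the values to integers (assuming what remains in each cell is plain digits), you could replace the last 3 lines of the above code with the following, passing regex=False so that $ is treated literally:
hsi_data['GDP/capita'] = hsi_data['GDP/capita'].str.replace('$', '', regex=False).astype(int)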

python: how can I make this code with regular expression?

If I input "1995y 05m 05d", I want the program to print "950505". Another example: "1949y 05m 23d" --> "490523".
import re
Birthday = str(input("insert your birth<(ex) xxxxy **m 00d> : "))
p= re.sub('[ymd ]','',Birthday)
print(p) #result is "xxxx**00"
Here is my code. How do I fix it?

Since you're basically working with date strings, you can use datetime.strptime() to parse them:
>>> from datetime import datetime
>>> birthday = '1995y 05m 05d'
>>> datetime.strptime(birthday, '%Yy %mm %dd').strftime('%y%m%d')
'950505'

Your existing code prints the full year, where you want only two digits. Just skip the first two digits on print.
print(p[2:])
That prints p starting from index 2 (the third character, since string indices start at 0) with no end to the range, so it prints the entire string except the first two characters (19 in your sample).

Using a regular expression:
>>> import re
>>> a = re.findall(r"\d+", "1995y 05m 05d")
>>> a[0] = a[0][2:]
>>> output = ""
>>> for item in a:
...     output += item
...
>>> int(output)
950505

Requested regex solution:
import re
s = '1995y 05m 05d'
print(''.join(re.findall(r'\d{2}(?=[ymd])', s)))
# 950505
This uses findall to find every pair of digits that comes immediately before y, m or d, and join to combine them into the required format.
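If malformed input should be rejected instead of silently producing a partial result, capture groups with re.fullmatch are one option. A minimal sketch, assuming the exact "YYYYy MMm DDd" layout from the question; the names BIRTHDAY and short_birthday are hypothetical.

```python
import re

# Skip the first two year digits and capture the rest of each field.
BIRTHDAY = re.compile(r"\d{2}(\d{2})y (\d{2})m (\d{2})d")

def short_birthday(text: str) -> str:
    """Convert '1995y 05m 05d' to '950505'; raise ValueError on any other layout."""
    match = BIRTHDAY.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unexpected birthday format: {text!r}")
    return "".join(match.groups())

print(short_birthday("1995y 05m 05d"))  # 950505
print(short_birthday("1949y 05m 23d"))  # 490523
```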

birthday = input("birthday").split()
a = ''.join([''.join([i for i in c if i.isdigit()][-2:]) for c in birthday])
This does it without any libraries.

Parsing a MAC address with python

How can I convert a hex value "0000.0012.13a4" into "00:00:00:12:13:A4"?

text = '0000.0012.13a4'
text = text.replace('.', '').upper() # a little pre-processing
# chunk into groups of 2 and re-join
out = ':'.join([text[i : i + 2] for i in range(0, len(text), 2)])
print(out)
00:00:00:12:13:A4

import re
old_string = "0000.0012.13a4"
new_string = ':'.join(s for s in re.split(r"(\w{2})", old_string.upper()) if s.isalnum())
print(new_string)
OUTPUT
> python3 test.py
00:00:00:12:13:A4
>
Without modification, this approach can handle some other MAC formats that you might run into like, "00-00-00-12-13-a4"
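A variant that accepts several separators and also checks the length could look like the sketch below. It assumes a 48-bit address (12 hex digits) and simply discards every non-hex character; normalize_mac is a made-up name.

```python
import re

def normalize_mac(raw: str) -> str:
    """Return a MAC address as upper-case, colon-separated pairs, e.g. 00:00:00:12:13:A4."""
    digits = re.sub(r"[^0-9A-Fa-f]", "", raw).upper()  # drop '.', '-', ':' and so on
    if len(digits) != 12:
        raise ValueError(f"expected 12 hex digits, got {len(digits)}: {raw!r}")
    # Re-join the digits in groups of two.
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))

print(normalize_mac("0000.0012.13a4"))     # 00:00:00:12:13:A4
print(normalize_mac("00-00-00-12-13-a4"))  # 00:00:00:12:13:A4
```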

Try the following code:
import re
hx = '0000.0012.13a4'.replace('.', '').upper()
print(':'.join(re.findall('..', hx)))
Output: 00:00:00:12:13:A4

There is a pretty simple three-step solution:
First we strip those pesky periods.
hexStrBad = '0000.0012.13a4'
step1 = hexStrBad.replace('.','')
Then, if the formatting is consistent:
step2 = step1[0:2] + ':' + step1[2:4] + ':' + step1[4:6] + ':' + step1[6:8] + ':' + step1[8:10] + ':' + step1[10:12]
step3 = step2.upper()
It's not the prettiest, but it will do what you need!

It's unclear what you're asking exactly, but if all you want is to make a string all uppercase, use .upper()
Try to clarify your question somewhat, because if you're asking about converting some weirdly formatted string into what looks like a MAC address, we need to know that to answer your question.
