Extract some letters from a string and add hyphen in Python

I have files with the date attached to the last part of the filename, like this:
string = 'blablablabla_20210812.jpg'
I extract the date like this:
string[-12:][:-4]
I want to add '-' between the year, month and day. This is how I do it:
string[-12:][:-4][:4] + '-' + string[-12:][:-4][4:][:2] + '-' + string[-12:][:-4][6:]
In my opinion, this is harder to read than machine code. Could you suggest more pragmatic ways to do it?

One solution is to use a regular expression and re.sub:
import re
s = "blablablabla_20210812.jpg"
s = re.sub(r"_(\d{4})(\d{2})(\d{2})\.", r"_\1-\2-\3.", s)
print(s)
Prints:
blablablabla_2021-08-12.jpg

You can also join the captured groups with a lambda:
>>> import re
>>> string = 'blablablabla_20210812.jpg'
>>> re.sub(r'(\d{4})(\d{2})(\d{2})', lambda m: '-'.join(m.groups()), string)
'blablablabla_2021-08-12.jpg'

You can put both indices in one pair of square brackets and use the join function:
'-'.join([string[-12:-8], string[-8:-6], string[-6:-4]])
Also, I personally prefer to keep code readable. You can name the variables first:
def extractDataInfo(string):
    year, month, day = string[-12:-8], string[-8:-6], string[-6:-4]
    return '-'.join([year, month, day])

You can compress the string[-12:][:-4][:4] to string[-12:-8] to make it look cleaner. The code will look like this:
string = 'blablablabla_20210812.jpg'
print(string[-12:-8] + '-' + string[-8:-6]+ '-' + string[-6:-4])
# 2021-08-12
Or this, if you also want the text before the date:
print(string[:-8] + '-' + string[-8:-6]+ '-' + string[-6:-4])
# blablablabla_2021-08-12
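The negative indices above assume a three-letter extension. As a minimal sketch, assuming the input is a bare filename whose stem always ends in eight date digits, pathlib can split off the extension first so its length no longer matters. The helper name hyphenate_date is hypothetical, not something from the answers above.

```python
from pathlib import Path


def hyphenate_date(filename):
    """Return filename with its trailing YYYYMMDD rewritten as YYYY-MM-DD."""
    path = Path(filename)
    stem, suffix = path.stem, path.suffix   # 'blablablabla_20210812', '.jpg'
    prefix, date = stem[:-8], stem[-8:]     # assumption: the stem ends in 8 date digits
    if len(date) != 8 or not date.isdigit():
        raise ValueError(f"no trailing date in {filename!r}")
    return f"{prefix}{date[:4]}-{date[4:6]}-{date[6:]}{suffix}"


print(hyphenate_date('blablablabla_20210812.jpg'))   # blablablabla_2021-08-12.jpg
print(hyphenate_date('report_20210812.jpeg'))        # report_2021-08-12.jpeg
```

This keeps the slicing idea but moves the one fragile assumption (extension length) into the standard library.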

A simple solution for this question will work only if you know that the filename always ends with:
_data.extension
and contains no other underscores or dots. If so, the solution will be:
string = 'blablablabla_20210812.jpg'
s = string.replace("_", ".")
s = s.split(".")
data = s[1]
s[1] = data[:4] + "-" + data[4:6] + "-" + data[6:]
print(s[1]) # OUTPUT: 2021-08-12
# Taking all together:
print(s[0] + "_" + s[1] + "." + s[2]) # OUTPUT: blablablabla_2021-08-12.jpg

You can also try it like this:
import datetime
string = 'blablablabla_20210812.jpg'
print(datetime.datetime.strptime(string[-12:-4], '%Y%m%d').strftime('%Y-%m-%d'))
Output:
2021-08-12
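One side benefit of going through datetime is that impossible dates are rejected instead of being silently reformatted. A small sketch, assuming the same slice as above (eight characters before a four-character extension); extract_date is a hypothetical helper name:

```python
from datetime import datetime


def extract_date(filename):
    """Return the date in filename as 'YYYY-MM-DD', or None if it is not a real date."""
    raw = filename[-12:-4]  # assumption: 8 date characters before a 4-character extension
    try:
        return datetime.strptime(raw, '%Y%m%d').strftime('%Y-%m-%d')
    except ValueError:
        # strptime raises ValueError for non-digits and for dates such as 31 February
        return None


print(extract_date('blablablabla_20210812.jpg'))  # 2021-08-12
print(extract_date('blablablabla_20210231.jpg'))  # None
```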

Related

How to split a string from the end after a certain number of occurrences of a character

How do I split the string below after the 2nd occurrence of '/' from the end?
/u01/dbms/orcl/product/11.2.0.4/db_home
Expected output is :
/u01/dbms/orcl/product/
Thanks.

Do not use split, use rsplit instead! It's simpler, and it stops after two splits instead of splitting the whole string.
s = '/u01/dbms/orcl/product/11.2.0.4/db_home'
result = s.rsplit('/', 2)[0] + '/'

string = "/u01/dbms/orcl/product/11.2.0.4/db_home"
split_string = string.split('/')
expected_output = "/".join(split_string[:-2]) + "/"
You can change -2 to drop however many trailing path components you need.
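If the number of components to drop varies, the rsplit approach from the earlier answer can be wrapped in a small function. This is only a sketch; drop_last_components is a hypothetical name, and it assumes the path contains at least n slashes.

```python
def drop_last_components(path, n):
    """Remove the last n '/'-separated components, keeping a trailing slash."""
    head = path.rsplit('/', n)[0]  # rsplit splits from the right, at most n times
    return head + '/'


s = '/u01/dbms/orcl/product/11.2.0.4/db_home'
print(drop_last_components(s, 2))  # /u01/dbms/orcl/product/
print(drop_last_components(s, 1))  # /u01/dbms/orcl/product/11.2.0.4/
```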

If you can parse it as a file path, I recommend pathlib. Try:
from pathlib import Path
p = Path('/u01/dbms/orcl/product/11.2.0.4/db_home')
p.parent.parent # Returns a Path object for /u01/dbms/orcl/product (no trailing slash)
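Chained .parent calls can also be written with the parents sequence, which is easier to adjust when the depth changes. A brief sketch, using PurePosixPath (an assumption on my part) so that the result looks the same on any operating system:

```python
from pathlib import PurePosixPath

p = PurePosixPath('/u01/dbms/orcl/product/11.2.0.4/db_home')

# parents[0] is the immediate parent, parents[1] the grandparent, and so on
grandparent = p.parents[1]
print(grandparent)             # /u01/dbms/orcl/product

# pathlib drops the trailing slash, so add it back if the caller needs it
result = str(grandparent) + '/'
print(result)                  # /u01/dbms/orcl/product/
```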

path = '/u01/dbms/orcl/product/11.2.0.4/db_home'  # avoid shadowing the built-in input()
output = '/'.join(path.split('/')[:-2]) + '/'

Parsing and rewriting a date-time string [duplicate]

I have got a string of time data: "2019-08-02 18:18:06.02887" and I am trying to rewrite it as another string "EV190802_181802" in another file.
What I am trying now is splitting the string into lists and reconstructing another string by those lists:
hello=data.split(' ')
date=hello[0]
time=hello[1]
world=hello[0].split('-')
stack=time.split('.')
overflow=stack[0].split(':')
print('EV' + world[0] + world[1] + world[2] + '_' + overflow[0] + overflow[1] + overflow[2])
However, I have no idea how to remove the '20' from '2019' (world[0]). Is there any way I could remove it?
Alternative ways to rewrite the string are welcome as well.

Just another way to solve the problem:
>>> from datetime import datetime
>>>
>>> format_ = datetime.strptime("2019-08-02 18:18:06.02887",
...                             "%Y-%m-%d %H:%M:%S.%f")
>>>
>>> print(
...     format_.strftime('EV%y%m%d_%H%M') + format_.strftime('%f')[:2]
... )
EV190802_181802

Using regex:
import re
data = "2019-08-02 18:18:06.02887"
res = re.match(r'(?P<year>\d{4})-(?P<month>\d{2})-(?P<day>\d{2})\s(?P<hours>\d{2}):(?P<minutes>\d{2}):(?P<seconds>\d{2})\.(?P<miliseconds>\d+)',data)
out = f"EV{res.group('year')[2:]}{res.group('month')}{res.group('day')}_{res.group('hours')}{res.group('minutes')}{res.group('miliseconds')[:2]}"
print(out)
Output will be:
EV190802_181802

Remove all occurrences of -, . and : (replace() returns a new string, so the result has to be assigned):
hello = data.replace("-","")
hello = hello.replace(".","")
hello = hello.replace(":","")
Get the string in one line:
print("EV" + hello[2:8] + "_" + hello[9:15])
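The three replace() calls can also be collapsed into a single pass with str.translate. A minimal sketch, assuming the input always has the fixed layout shown in the question:

```python
data = "2019-08-02 18:18:06.02887"

# str.maketrans('', '', chars) builds a table that deletes every character in chars
table = str.maketrans('', '', '-.:')
cleaned = data.translate(table)
print(cleaned)  # 20190802 18180602887

# Same fixed slices as above: skip the century, then take HHMMSS after the space
result = "EV" + cleaned[2:8] + "_" + cleaned[9:15]
print(result)   # EV190802_181806
```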

Without regex, just string splits and joins:
string = "2019-08-02 18:18:06.02887"
target = "EV190802_181806"
d, t = string[2:].split(".")[0].split(" ")  # slice off the century; split("20") would also split on a "20" elsewhere in the string
print("date:", d, "\ntime:", t)
d = "".join(d.split('-'))
t = "".join(t.split(':'))
result = "EV" + d + "_" + t
print("\nresult: ", result)
assert result == target
# >> out:
# date: 19-08-02
# time: 18:18:06
#
# result: EV190802_181806
I make the assumption that you'd like to have "06" at the end of the target string (sorry if I'm mistaken!).

I would use re.sub here for a regex approach:
import re
inp = "2019-08-02 18:18:06.02887"
output = re.sub(r'^\d{2}(\d{2})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2}).*$',
                'EV\\1\\2\\3_\\4\\5\\6',
                inp)
print(output)
This prints:
EV190802_181806
Note: Your expected output was given as EV190802_181802, but I assume that is a typo, as I see no reason why you would report hundredths of a second instead of seconds.
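If seconds are indeed what is wanted, the same result can be produced without a regex by parsing the timestamp and formatting it again. A short sketch under that assumption:

```python
from datetime import datetime

inp = "2019-08-02 18:18:06.02887"
parsed = datetime.strptime(inp, "%Y-%m-%d %H:%M:%S.%f")

# %y is the two-digit year, so no slicing is needed to drop the "20"
output = parsed.strftime("EV%y%m%d_%H%M%S")
print(output)  # EV190802_181806
```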

Better way to change specific chars in a string separated by underscores without using re

I have files with names like centerOne_camera_2_2018-04-11_15:11:21_2.0.jpg. I want to change the last string i.e. image_name.split('_')[5].split('.')[0] to some other string. I can't seem to find a neat way to do this and ended up doing the following which is very crude
new_name = image_base.split('_')[0] + image_base.split('_')[1] + image_base.split('_')[2] + image_base.split('_')[3] + image_base.split('_')[4] + frameNumber
That is, my output should be centerOne_camera_2_2018-04-11_15:11:21_<some string>.0.jpg
Any better way is appreciated. Note: I want to retain the rest of the string too.

I think you may be looking for this:
>>> "centerOne_camera_2_2018-04-11_15:11:21_2.0.jpg".rpartition("_")
('centerOne_camera_2_2018-04-11_15:11:21', '_', '2.0.jpg')
That is for the last element. But from the comments I gather you want to split at delimiter n.
>>> n = 3
>>> temp = "centerOne_camera_2_2018-04-11_15:11:21_2.0.jpg".split("_",n)
>>> "_".join(temp[:n]),temp[n]
('centerOne_camera_2', '2018-04-11_15:11:21_2.0.jpg')
I'm not sure what your objection to using + is, but you can do this if you like:
>>> temp="centerOne_camera_2_2018-04-11_15:11:21_2.0.jpg".rpartition("_")
>>> "{0}<some_string>{2}".format(*temp)
'centerOne_camera_2_2018-04-11_15:11:21<some_string>2.0.jpg'

You can try rsplit:
"centerOne_camera_2_2018-04-11_15:11:21_2.0.jpg".rsplit("_", 1)
['centerOne_camera_2_2018-04-11_15:11:21', '2.0.jpg']
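To actually swap in the new value while keeping the rest of the name, the right-hand piece can be split once more at its first dot. This is a sketch only; replace_last_field is a hypothetical helper, and it assumes the last underscore-separated field always contains a dot.

```python
def replace_last_field(name, new_value):
    """Swap the text between the last '_' and the following '.' for new_value."""
    head, tail = name.rsplit('_', 1)   # tail is '2.0.jpg'
    _, rest = tail.split('.', 1)       # rest is '0.jpg'
    return f"{head}_{new_value}.{rest}"


image_name = "centerOne_camera_2_2018-04-11_15:11:21_2.0.jpg"
print(replace_last_field(image_name, "17"))
# centerOne_camera_2_2018-04-11_15:11:21_17.0.jpg
```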

Parsing a MAC address with python

How can I convert a hex value "0000.0012.13a4" into "00:00:00:12:13:A4"?

text = '0000.0012.13a4'
text = text.replace('.', '').upper() # a little pre-processing
# chunk into groups of 2 and re-join
out = ':'.join([text[i : i + 2] for i in range(0, len(text), 2)])
print(out)
00:00:00:12:13:A4

import re
old_string = "0000.0012.13a4"
new_string = ':'.join(s for s in re.split(r"(\w{2})", old_string.upper()) if s.isalnum())
print(new_string)
OUTPUT
> python3 test.py
00:00:00:12:13:A4
>
Without modification, this approach can handle some other MAC formats that you might run into like, "00-00-00-12-13-a4"
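For input that may arrive in several of these formats, one option is to strip every non-hex character first and check the length before regrouping. A minimal sketch; normalize_mac is a hypothetical name, and the 12-digit check assumes standard 48-bit MAC addresses.

```python
import re


def normalize_mac(raw):
    """Return raw as colon-separated uppercase octets, e.g. '00:00:00:12:13:A4'."""
    digits = re.sub(r'[^0-9A-Fa-f]', '', raw).upper()  # drop '.', '-', ':' and other separators
    if len(digits) != 12:
        raise ValueError(f"expected 12 hex digits, got {len(digits)} in {raw!r}")
    return ':'.join(digits[i:i + 2] for i in range(0, 12, 2))


print(normalize_mac('0000.0012.13a4'))     # 00:00:00:12:13:A4
print(normalize_mac('00-00-00-12-13-a4'))  # 00:00:00:12:13:A4
```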

Try the following code:
import re
hx = '0000.0012.13a4'.replace('.','').upper()
print(':'.join(re.findall('..', hx)))
Output: 00:00:00:12:13:A4

There is a pretty simple three-step solution:
First we strip those pesky periods.
hexStrBad = '0000.0012.13a4'
step1 = hexStrBad.replace('.','')
Then, if the formatting is consistent:
step2 = step1[0:2] + ':' + step1[2:4] + ':' + step1[4:6] + ':' + step1[6:8] + ':' + step1[8:10] + ':' + step1[10:12]
step3 = step2.upper()
It's not the prettiest, but it will do what you need!

It's unclear what you're asking exactly, but if all you want is to make a string all uppercase, use .upper()
Try to clarify your question somewhat, because if you're asking about converting some weirdly formatted string into what looks like a MAC address, we need to know that to answer your question.

Using 'Replace' Function in Python

I have an Access table that has a bunch of coordinate values in degrees, minutes and seconds, formatted like this:
90-12-28.15
I want to reformat it like this:
90° 12' 28.15"
essentially replacing the dashes with the degrees minutes and seconds characters and a space between the degrees and minutes and another one between the minutes and seconds.
I'm thinking about using the 'Replace' function, but I'm not sure how to replace the first instance of the dash with a degree character (°) and space and then detect the second instance of the dash and place the minute characters and a space and then finally adding the seconds character at the end.
Any help is appreciated.
Mike

While regular expressions and split() are fine solutions, doing this with replace() is rather easy.
lat = "90-12-28.15"
lat = lat.replace("-", "° ", 1)
lat = lat.replace("-", "' ", 1)
lat = lat + '"'
Or you can do it all on one line:
lat = lat.replace("-", "° ", 1).replace("-", "' ", 1) + '"'
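For a whole column of values, the same conversion can be wrapped in a function that also rejects malformed input. A small sketch that uses split() rather than replace(); format_dms is a hypothetical name, and it assumes unsigned values with exactly two dashes.

```python
def format_dms(value):
    """Turn 'D-M-S' (for example '90-12-28.15') into degree/minute/second notation."""
    parts = value.split('-')
    if len(parts) != 3:
        raise ValueError(f"expected degrees-minutes-seconds, got {value!r}")
    degrees, minutes, seconds = parts
    return f"{degrees}° {minutes}' {seconds}\""


print(format_dms("90-12-28.15"))  # 90° 12' 28.15"
```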

I would just split your first string:
# -*- coding: utf-8 -*-
s = '90-12-28.15'  # renamed from str to avoid shadowing the built-in type
arr = s.split('-')
s2 = arr[0] + '° ' + arr[1] + "' " + arr[2] + '"'
print(s2)

You might want to use Python's regular expressions module re, particularly re.sub(). Check the Python docs here for more information.
If you're not familiar with regular expressions, check out this tutorial here, also from the Python documentation.
import re
text = 'replace "-" in 90-12-28.15'
print(re.sub(r'(\d\d)-(\d\d)-(\d\d)\.(\d\d)', r'''\1° \2' \3.\4"''', text))
# use \d{1,2} instead of \d\d if single digits are allowed

The python "replace" string method should be easy to use. You can find the documentation here.
In your case, you can do something like this:
my_str = "90-12-28.15"
my_str = my_str.replace("-", "° ", 1)  # 1 means only the first occurrence is replaced
my_str = my_str.replace("-", "' ", 1)
my_str = my_str + "\""
