Validating date format with Python regex - python

I want to check if the format of the date input by user matches the below:
Jan 5 2018 6:10 PM
Month: First letter should be caps, followed 2 more in small. (total 3 letters)
<Space>: single space, must exist
Date: For single digit it should not be 05, but 5
<Space>: single space, must exist
Hour: 0-12, for single digit it should not be 06, but 6
Minute: 00-59
AM/PM
I'm using the below regex and trying to match:
import re,sys
usr_date = str(input("Please enter the older date until which you want to scan ? \n[Date Format Example: Jan 5 2018 6:10 PM] : "))
valid_usr_date = re.search("^(\s+)*[A-Z]{1}[a-z]{2}\s{1}[1-31]{1}\s{1}[1-2]{1}[0-9]{1}[0-9]{1}[0-9]{1}\s{1}[0-12]{1}:[0-5]{1}[0-9]{1}\s{1}(A|P)M$",usr_date,re.M)
if not valid_usr_date:
    print("The date format is incorrect. Please follow the exact date format as shown in example. Exiting Program!")
    sys.exit()
But even for input in the correct format, it reports that the format is wrong. What am I doing wrong?

I would not use a regex for that, as it gives you no way to validate the date itself (e.g., a regex will happily accept Abc 99 9876 9:99 PM).
Instead, use strptime:
from datetime import datetime
string = 'Jan 5 2018 6:10 PM'
datetime.strptime(string, '%b %d %Y %I:%M %p')
If the string is in the wrong format, you get a ValueError.
The only apparent problem with this approach is that you require the day and hour not to be zero-padded, and strptime has no directive that enforces this (it accepts both 5 and 05).
A table with all available directives is here.
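As for the regex itself: [1-31] and [0-12] are character classes, not numeric ranges. [1-31] matches a single character out of 1, 2 and 3, so a day of 5 can never match, which is why even correct input is rejected. A minimal sketch that combines both ideas follows: a regex checks only the shape (capitalisation, single spaces, no zero padding) and strptime validates the values. The names DATE_SHAPE and parse_user_date are made up for illustration, the hour is assumed to run 1-12 because that is what %I accepts, and %b assumes English month names.

```python
import re
from datetime import datetime

# Shape only: "Jan 5 2018 6:10 PM" (no leading zeros on day or hour).
# DATE_SHAPE and parse_user_date are hypothetical names for this sketch.
DATE_SHAPE = re.compile(
    r"[A-Z][a-z]{2} "            # month: one capital, two lowercase letters
    r"(?:[1-9]|[12]\d|3[01]) "   # day 1-31, not zero-padded
    r"[12]\d{3} "                # four-digit year starting with 1 or 2
    r"(?:[1-9]|1[0-2]):"         # hour 1-12, not zero-padded
    r"[0-5]\d "                  # minute 00-59
    r"[AP]M"                     # AM or PM
)

def parse_user_date(text):
    """Return a datetime, or raise ValueError for malformed or impossible dates."""
    if not DATE_SHAPE.fullmatch(text):
        raise ValueError("unexpected date format: {!r}".format(text))
    # strptime rejects what the regex cannot, e.g. 'Abc' or 'Feb 30'.
    return datetime.strptime(text, "%b %d %Y %I:%M %p")

print(parse_user_date("Jan 5 2018 6:10 PM"))  # 2018-01-05 18:10:00
```

Splitting the work this way keeps the regex short: it only has to enforce the formatting rules that strptime is lenient about.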

You could use a function which parses the input string and tries to return a datetime object; if it can't, it raises an error (here an argparse.ArgumentTypeError, so the function can serve as an argparse type):
import argparse
from datetime import datetime
def valid_date(s):
    try:
        return datetime.strptime(s, '%Y-%m-%d %H:%M')
    except ValueError:
        msg = "Not a valid date: '{0}'.".format(s)
        raise argparse.ArgumentTypeError(msg)
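A function like this is typically passed as the type of a command-line argument. The sketch below assumes the date format from the question rather than the one in the answer, and the option name --until is invented for illustration.

```python
import argparse
from datetime import datetime

def valid_date(s):
    """argparse type: turn 'Jan 5 2018 6:10 PM' into a datetime."""
    try:
        return datetime.strptime(s, "%b %d %Y %I:%M %p")
    except ValueError:
        raise argparse.ArgumentTypeError("Not a valid date: {!r}.".format(s))

parser = argparse.ArgumentParser()
# "--until" is a hypothetical option name.
parser.add_argument("--until", type=valid_date)

# Passing an explicit list instead of reading sys.argv, for demonstration.
args = parser.parse_args(["--until", "Jan 5 2018 6:10 PM"])
print(args.until)  # 2018-01-05 18:10:00
```

With an invalid value, argparse prints the message from ArgumentTypeError together with the usage text and exits.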

Related

Python Date and Time format

I have a pandas column with a date and time as value - 20131019T150000. I would like to convert these to something like 3pm, 19th October 2013.
Example -
ID DATE
1 20131019T150000
2 20180202T100000
output should be something like:
ID DATE
1 3pm, 19th October 2013
2 10am, 2nd February 2018
Thanks in advance.
First you'll need to convert the string to a Python datetime.datetime object so that it is easy to work with. To do that you can use the classmethod datetime.strptime(date_string, format) (see the docs), which returns a datetime corresponding to date_string, parsed according to format.
Then, to print the datetime object as any string you want, there is the method datetime.strftime(format) (see the docs), which returns a string representing the date and time, controlled by an explicit format string.
(Note: For more about the formatting directives, follow this link to the docs)
So for the given string you could proceed as follows:
from datetime import datetime
def get_suffix(day: int) -> str:
    # https://stackoverflow.com/a/739266/7771926
    if 4 <= day <= 20 or 24 <= day <= 30:
        suffix = "th"
    else:
        suffix = ["st", "nd", "rd"][day % 10 - 1]
    return suffix
def process_date(date: str) -> str:
    dt_obj = datetime.strptime(date, '%Y%m%dT%H%M%S')
    return dt_obj.strftime('%I%p, %d{} %B %Y').format(get_suffix(dt_obj.day))
def main():
    date_str = '20131019T150000'
    print(process_date(date_str))
if __name__ == '__main__':
    main()
If you execute the script, this is what is printed to console: 03PM, 19th October 2013
Glad if it helps.
You could use Python's built-in datetime module. Here's a function which does what you want:
import datetime
def func(input_st):
    dt_obj = datetime.datetime.strptime(input_st, '%Y%m%dT%H%M%S')
    return dt_obj.strftime('%I%p, %d %B %Y')
Hope this solves it! You could apply this function to every item, for example with Python's built-in map function.
from datetime import datetime
s="20131019T150000"
dt_obj1=datetime.strptime(s,'%Y%m%dT%H%M%S')
print(dt_obj1.strftime("%I %p,%d %B %Y"))
Output:
03 PM,19 October 2013
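The snippets above print zero-padded, upper-case forms (03PM, 03 PM) rather than the requested 3pm, 19th October 2013. A minimal sketch that builds the hour, the am/pm marker and the day by hand, and so avoids any platform-specific strftime flags, could look like this. The names ordinal_suffix and friendly are made up for illustration, and %B assumes English month names.

```python
from datetime import datetime

def ordinal_suffix(day):
    """Return 'st', 'nd', 'rd' or 'th' for a day of the month."""
    if 11 <= day % 100 <= 13:       # 11th, 12th, 13th
        return "th"
    return {1: "st", 2: "nd", 3: "rd"}.get(day % 10, "th")

def friendly(stamp):
    """'20131019T150000' -> '3pm, 19th October 2013'."""
    dt = datetime.strptime(stamp, "%Y%m%dT%H%M%S")
    hour12 = dt.hour % 12 or 12     # 0 -> 12, 13 -> 1, ...
    meridiem = "am" if dt.hour < 12 else "pm"
    return "{}{}, {}{} {}".format(
        hour12, meridiem, dt.day, ordinal_suffix(dt.day), dt.strftime("%B %Y")
    )

print(friendly("20131019T150000"))  # 3pm, 19th October 2013
print(friendly("20180202T100000"))  # 10am, 2nd February 2018
```

For a pandas column, the same function could be applied element by element, for example with Series.map.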

Converting string to datetime with milliseconds and timezone - Python

I have the following python snippet:
from datetime import datetime
timestamp = '05/Jan/2015:17:47:59:000-0800'
datetime_object = datetime.strptime(timestamp, '%d/%m/%y:%H:%M:%S:%f-%Z')
print datetime_object
However when I execute the code, I'm getting the following error:
ValueError: time data '05/Jan/2015:17:47:59:000-0800' does not match format '%d/%m/%y:%H:%M:%S:%f-%Z'
What's wrong with my matching expression?
EDIT 2: According to this post, strptime in Python 2 doesn't support %z (despite what the documentation suggests); Python 3.2 and later do support it. To get around this in Python 2, you can simply drop the timezone offset before parsing:
from datetime import datetime
timestamp = '05/Jan/2015:17:47:59:000-0800'
# only take the first 24 characters of `timestamp` by using [:24]
dt_object = datetime.strptime(timestamp[:24], '%d/%b/%Y:%H:%M:%S:%f')
print(dt_object)
Gives the following output:
$ python date.py
2015-01-05 17:47:59
EDIT: Your datetime.strptime argument should be '%d/%b/%Y:%H:%M:%S:%f%z' (the sign is part of what %z matches, so there is no literal - before it)
With strptime(), %y refers to
Year without century as a zero-padded decimal number
I.e. 01, 99, etc.
If you want to use the full 4-digit year, you need to use %Y
Similarly, if you want to use the 3-letter month, you need to use %b, not %m
I haven't looked at the rest of the string, but there are possibly more mismatches. You can find out how each section can be defined in the table at https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior
And UTC offset is lowercase z.
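On Python 3.2 or later, where strptime does handle %z, the whole timestamp can be parsed in one go. A minimal sketch, assuming Python 3:

```python
from datetime import datetime, timedelta

timestamp = '05/Jan/2015:17:47:59:000-0800'

# %b = abbreviated month name, %Y = four-digit year,
# %f = fractional seconds, %z = UTC offset including its sign.
dt = datetime.strptime(timestamp, '%d/%b/%Y:%H:%M:%S:%f%z')

print(dt.isoformat())   # 2015-01-05T17:47:59-08:00
print(dt.utcoffset())   # -1 day, 16:00:00 (i.e. minus eight hours)
```

The result is a timezone-aware datetime, so the offset is kept rather than discarded.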

converting written date to date format in python

I am using Python 2.7.
I have an Adobe PDF form document that has a date field. I extract the values using pdfminer. The problem I need to solve is that a user in Adobe Acrobat Reader is allowed to type in strings like april 3rd 2017 or 3rd April 2017 or Apr 3rd 2017 or 04/04/2017 as well as 4 3 2017. The date field in Adobe is set to mm/dd/yyyy format, so when a user types in one of the values above, that typed value is what pdfminer pulls. Adobe displays it as 04/03/2017, but when you click on the field it shows you the actual value, like the ones above. Adobe allows this and then, I think, does its own conversion to display the date as mm/dd/yyyy. It is possible to use JavaScript with Adobe for more control, but I can't do that: the users can only have and use the PDF form, without any accompanying JavaScript file.
So I am looking for a method with datetime in Python that can accept a written date such as the examples above as a string and convert it into a true mm/dd/yyyy format. I saw methods for converting long and short month names, but nothing that would handle day numbers with suffixes like 1st, 2nd, 3rd, 4th.
You could just try each possible format in turn. First remove any st nd rd specifiers to make the testing easier:
from datetime import datetime
formats = ["%B %d %Y", "%d %B %Y", "%b %d %Y", "%m/%d/%Y", "%m %d %Y"]
dates = ["april 3rd 2017", "3rd April 2017", "Apr 3rd 2017", "04/04/2017", "4 3 2017"]
for date in dates:
    date = date.lower().replace("rd", "").replace("nd", "").replace("st", "")
    for format in formats:
        try:
            print datetime.strptime(date, format).strftime("%m/%d/%Y")
        except ValueError:
            pass
Which would display:
04/03/2017
04/03/2017
04/03/2017
04/04/2017
04/03/2017
This approach has the benefit of validating each date. For example a month greater than 12. You could flag any dates that failed all allowed formats.
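One caveat with plain replace() is that it also strips "st" from August and does not handle "th". A hedged Python 3 sketch of the same try-each-format idea, removing a suffix only when it directly follows a digit and returning None when no format fits, might look like this (normalize_date and ORDINAL are illustrative names; English month names are assumed):

```python
import re
from datetime import datetime

FORMATS = ["%B %d %Y", "%d %B %Y", "%b %d %Y", "%m/%d/%Y", "%m %d %Y"]

# Matches st/nd/rd/th only when directly preceded by a digit,
# so the "st" in "August" is left alone.
ORDINAL = re.compile(r"(?<=\d)(?:st|nd|rd|th)\b", re.IGNORECASE)

def normalize_date(text):
    """Return the date as mm/dd/yyyy, or None if no known format matches."""
    cleaned = ORDINAL.sub("", text.strip())
    for fmt in FORMATS:
        try:
            return datetime.strptime(cleaned, fmt).strftime("%m/%d/%Y")
        except ValueError:
            continue   # try the next format
    return None

for raw in ["april 3rd 2017", "1st August 2017", "4 3 2017", "13 45 2017"]:
    print(raw, "->", normalize_date(raw))
```

Returning None instead of printing makes it easy to collect and flag the inputs that failed every format.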
Just write a regular expression to get the number out of the string.
import re
s = '30Apr'
n = s[:re.match(r'[0-9]+', s).span()[1]]
print(n) # Will print 30
The other things should be easy.
Based on @MartinEvans's answer, but using the arrow library (it handles more cases than datetime, so you don't have to use replace() or lower()):
First install arrow:
pip install arrow
Then try each possible format:
import arrow
dates = ['april 3rd 2017', '3rd April 2017', 'Apr 3rd 2017', '04/04/2017', '4 3 2017']
formats = ['MMMM Do YYYY', 'Do MMMM YYYY', 'MMM Do YYYY', 'MM/DD/YYYY', 'M D YYYY']
def convert_datetime(date):
    for format in formats:
        try:
            print arrow.get(date, format).format('MM/DD/YYYY')
        except arrow.parser.ParserError:
            pass
for date in dates:
    convert_datetime(date)
Will output:
04/03/2017
04/03/2017
04/03/2017
04/04/2017
04/03/2017
If you are unsure what could be wrong with your date, you can also print an error message when it matches none of the formats:
def convert_datetime(date):
    for format in formats:
        try:
            print arrow.get(date, format).format('MM/DD/YYYY')
            break
        except (arrow.parser.ParserError, ValueError) as e:
            pass
    else:
        print 'For date: "{0}", {1}'.format(date, e)
convert_datetime('124 5 2017') # test invalid date
Will output the following error message:
'For date: "124 5 2017", month must be in 1..12'

Convert timestring into date - Python

If I have the following timestring:
20150505
How would I convert this into the date May 5, 2015 in Python? So far I've tried:
from datetime import datetime
sd = datetime.strptime('20150504', '%Y%M%d')
But this outputs:
2015-01-04 00:05:00
The capital M denotes minute, not month. Use the lowercase m and then call the strftime method to produce the output format you want:
>>> datetime.strptime('20150504', '%Y%m%d').strftime('%b %d, %Y')
'May 04, 2015'
You can remove the zero padding from the day by using the %-d directive in place of %d:
%-d Day of the month as a decimal number. (Platform specific)
For longer month names, you can use the directive %B in place of %b to get the full month name.
Reference:
http://strftime.org/
If you know it's a date and not a datetime, or you don't know the format, you can use dateutil.
from dateutil.parser import parse
print(parse('20150504'))
This is the answer without a leading zero for the day, as in the OP's example:
print(sd.strftime('%b %-d, %Y'))
# Jan 4, 2015 (Jan because sd was parsed with the wrong format, '%Y%M%d'; note that %-d is platform specific)
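Since %-d is platform specific, a portable alternative is to take the day from the datetime's day attribute, which is a plain integer. A minimal sketch, assuming Python 3.6+ for the f-string and English month names for %b:

```python
from datetime import datetime

d = datetime.strptime("20150505", "%Y%m%d")

# d.day is an int, so it is never zero-padded; the month name and
# year still come from the usual strftime codes via the format spec.
pretty = f"{d:%b} {d.day}, {d:%Y}"
print(pretty)  # May 5, 2015
```

This avoids relying on a flag that some platforms' strftime implementations do not accept.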

Difficulty with the replace method

I must have the user enter a date in mm/dd/yy format and then output the string in long-date format like January, ##, ####. I cannot for the life of me get the month replaced by the word.
def main():
    get_date=input('Input a date in mm/dd/yy format!\nIf you would like to enter a 1-digit number, enter a zero first, then the number\nDate:')
    month= int(get_date[:2])
    day=int(get_date[3:5])
    year=int(get_date[6:])
    validate(month, day, year)#validates input
    get_month(get_date)
def validate(month,day,year):
    while month>12 or month<1 or day>31 or day<1 or year!=15:
        print("if you would like to enter a one-digit number, enter a zero first, then the number\n theres only 12 months in a year\n only up to 31 days in a month, and\n you must enter 15 as the year")
        get_date=input('Input a date in mm/dd/yy format!:')
        month= int(get_date[:2])
        day=int(get_date[3:5])
        year=int(get_date[6:])
def get_month(get_date):
    if get_date.startswith('01'):
        get_date.replace('01','January')
        print(get_date)
I have tried a plethora of things to fix this but I cannot make January appear instead of 01.
Strings in Python are immutable; they don't change once they're created. That means any method that appears to modify a string must return a new string. You need to capture that new value.
get_date = get_date.replace('01','January')
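A short illustration of the point, with made-up variable names. It also shows a second pitfall in this approach: replace() substitutes every occurrence unless a count is given, so the day 01 would be replaced as well.

```python
date_text = "01/01/15"

# The return value is discarded, so date_text is unchanged.
date_text.replace("01", "January")
print(date_text)        # 01/01/15

# Without a count, both the month and the day are replaced.
everything = date_text.replace("01", "January")
print(everything)       # January/January/15

# A count of 1 limits the replacement to the first occurrence.
month_only = date_text.replace("01", "January", 1)
print(month_only)       # January/01/15
```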
You can do this (and simplify the code) using Python's datetime module.
The strptime function will parse a date from a string using format codes. If it can't parse it correctly, it will raise a ValueError, so there is no need for your custom validation function.
https://docs.python.org/2.7/library/datetime.html#datetime.datetime.strptime
The strftime function will return that date as a string formatted according to the same codes.
https://docs.python.org/2.7/library/datetime.html#datetime.datetime.strftime
Updated, your code would look something like this:
from datetime import datetime
parsed = None
while not parsed:
    get_date=input('Input a date in mm/dd/yy format!\nIf you would like to enter a 1-digit number, enter a zero first, then the number\nDate:')
    try:
        parsed = datetime.strptime(get_date, '%m/%d/%y')
    except ValueError:
        parsed = None
print(parsed.strftime('%B %d, %Y'))
Why don't you use the datetime module?
year = 2007; month=11; day=3
import datetime
d = datetime.date(year, month, day)
print(d.strftime("%d %B %Y"))
You might be better off using Python's datetime module for this:
from datetime import datetime
entered_date = input('Input a date in mm/dd/yy format!\nIf you would like to enter a 1-digit number, enter a zero first, then the number\nDate:')
d = datetime.strptime(entered_date, '%m/%d/%y')
entered_date = d.strftime('%B, %d, %Y')
e.g.
'February, 29, 2016'
This way you catch invalid dates (such as 02/29/15) as well as badly-formatted ones.
