Python unit-testing: Check if datestring is in the allowed format

I'm unittesting my code and I want to make sure that the timestamps I get in are:
strings
In the format YYYY<space>DDD:HH:MM:SS.sss<space> where DDD represents the day of the year.
What I've got is:
from datetime import datetime
def test_time(self, time_stamp):
    self.assertIsInstance(time_stamp, str, msg="%s not a string" % time_stamp)
    self.assertIsInstance(datetime.strptime(time_stamp, "%Y %j:%H:%M:%S.%f"), datetime)
The problem is that the second assert passes for both 2014 031:09:59:59.862 (correct timestamp) and 2014 31:9:59:59.862 (incorrect timestamp).
How can I check that the timestamp has the correct format?

You can follow strptime by strftime to normalize the string to your desired format, and check if the original string equals the normalized string.
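A minimal sketch of that round-trip check, assuming millisecond precision as in the question. The helper name is_canonical_timestamp is made up for illustration. One detail to watch: strftime always writes %f as six digits, so the sketch trims the last three before comparing.

```python
from datetime import datetime

FMT = "%Y %j:%H:%M:%S.%f"

def is_canonical_timestamp(time_stamp):
    """Return True if time_stamp survives a strptime/strftime round trip unchanged."""
    if not isinstance(time_stamp, str):
        return False
    try:
        parsed = datetime.strptime(time_stamp, FMT)
    except ValueError:
        return False
    # %f is formatted as six digits (microseconds); drop the last three
    # so the result can be compared with a millisecond-precision input.
    normalized = parsed.strftime(FMT)[:-3]
    return normalized == time_stamp

print(is_canonical_timestamp("2014 031:09:59:59.862"))  # True
print(is_canonical_timestamp("2014 31:9:59:59.862"))    # False
```

In a test case this could be used as self.assertTrue(is_canonical_timestamp(time_stamp)).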

Related

How to add leading zeros and convert my date field to YYYY-MM-DD? [duplicate]

I have some dates in my json file that look like this:
2020-12-11
2020-5-1
2020-3-21
and I want to convert them to YYYY-MM-DD format. They are already in a similar format, but I want to add leading zeros for single-digit month and day numbers.
The output should look like this:
2020-12-11
2020-05-01
2020-03-21
How can I do this?
The parser in dateutil can be used as follows (d1 is the original date string):
from dateutil import parser
d2 = parser.parse(d1).date()
This produces date objects (which can be converted to strings with strftime() if required):
2020-12-11
2020-05-01
2020-03-21
There is also an option (dayfirst=True) to declare that the day comes before the month.
The usual way to reformat a date string is to use the datetime module (see How to convert a date string to different format).
Here you want to use the format codes (quoting the descriptions from the documentation)
%Y - Year with century as a decimal number.
%m - Month as a zero-padded decimal number.
%d - Day of the month as a zero-padded decimal number.
According to the footnote (9) in the documentation of datetime, %m and %d accept month and day numbers without leading zeros when used with strptime, but will output zero-padded numbers when used with strftime.
So you can use the same format string %Y-%m-%d to do a round-trip with strptime and strftime to add the zero-padding.
from datetime import datetime
def reformat(date_str):
    fmt = '%Y-%m-%d'
    return datetime.strptime(date_str, fmt).strftime(fmt)
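Since the dates come from a JSON file, here is a rough sketch of applying the same round trip to parsed JSON. The inline JSON text, the "date" key, and the helper name pad_date are assumptions for illustration; the question does not show the real file layout.

```python
import json
from datetime import datetime

FMT = "%Y-%m-%d"

def pad_date(date_str):
    # strptime accepts "2020-5-1"; strftime writes zero-padded "2020-05-01".
    return datetime.strptime(date_str, FMT).strftime(FMT)

# Hypothetical JSON content; the real structure may differ.
raw = '[{"date": "2020-12-11"}, {"date": "2020-5-1"}, {"date": "2020-3-21"}]'
records = json.loads(raw)
for record in records:
    record["date"] = pad_date(record["date"])

padded = [record["date"] for record in records]
print(padded)  # ['2020-12-11', '2020-05-01', '2020-03-21']
```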

Time value does not match the format. ValueError in python

I am trying to convert a string to datetime object using the strptime function.
I am encountering a ValueError that says format doesn't match, so I did double checking and confirmed that the format in the string matches the format I am passing as the parameter for strptime.
I have also referenced this question: time data does not match format but there the month and year were swapped.
So does this only work with the '%y-%m-%d %H:%M:%S' format or is it dynamic as per the user input like in my case '%y-%m-%d-%H:%M:%S' ?
Input:
from datetime import datetime
stg = "2022-10-31-01:17:46"
do = datetime.strptime(stg, '%y-%m-%d-%H:%M:%S')
Output:
ValueError: time data '2022-10-31-01:17:46' does not match format '%y-%m-%d-%H:%M:%S'
Expected output:
#while printing 'do'
2022-10-31-01:17:46
You're almost there. You need %Y instead of %y since you're providing the year with the century (2022 instead of 22).
Your code would be
from datetime import datetime
stg = "2022-10-31-01:17:46"
do = datetime.strptime(stg, '%Y-%m-%d-%H:%M:%S')
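To answer the other half of the question: the format string is not fixed, and any separators work as long as each directive matches the text. A short sketch of the difference between %y and %Y follows. It also shows that printing the datetime uses a default layout, so getting the dashed form back takes an explicit strftime call.

```python
from datetime import datetime

stg = "2022-10-31-01:17:46"
fmt = "%Y-%m-%d-%H:%M:%S"

do = datetime.strptime(stg, fmt)
print(do)                # default str() layout: 2022-10-31 01:17:46
print(do.strftime(fmt))  # same layout as the input: 2022-10-31-01:17:46

# %y consumes only two digits ("20"), so the rest of the string no longer lines up.
try:
    datetime.strptime(stg, "%y-%m-%d-%H:%M:%S")
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```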

Date formatting to month

I want to change the format of "Date" column from 10/15/2019 to m/d/y format.
tax['AsOfdate']= pd.to_datetime(tax['date'])
How do I do it?
Like this (the format argument is described in the pandas documentation):
tax['AsOfdate']= pd.to_datetime(tax['date'], format="%m/%d/%Y" )
Here is an example with today's date formatting:
from datetime import date
today = date.today()
new_format = today.strftime("%m/%d/%y")
print(today, new_format)
The pandas to_datetime function accepts a format argument, which takes strftime notation.
Pandas docs: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.to_datetime.html
Strftime docs: https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior
m/d/y notation would be:
tax['AsOfdate']= pd.to_datetime(tax['date'], format='%m/%d/%y')
This assumes you want everything zero-padded with two digits, like 01/01/19 for January 1st, 2019. If you need something else, the strftime formatting link shows all the codes that let you choose padding or not, four-digit or two-digit year, and so on.
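One point that is easy to mix up: a format passed to a parsing call describes the input text, while the format passed to strftime describes the output. The sketch below illustrates this with the standard library only (not pandas), using the sample value from the question. The variable names are illustrative.

```python
from datetime import datetime

raw = "10/15/2019"

# Parsing: the format must describe the input, which has a four-digit year.
parsed = datetime.strptime(raw, "%m/%d/%Y")

# Formatting: choose the output layout independently of the input.
two_digit_year = parsed.strftime("%m/%d/%y")
four_digit_year = parsed.strftime("%m/%d/%Y")
print(two_digit_year)   # 10/15/19
print(four_digit_year)  # 10/15/2019

# Parsing the same text with %y fails, because %y only accepts two digits.
try:
    datetime.strptime(raw, "%m/%d/%y")
    parse_with_short_year_failed = False
except ValueError:
    parse_with_short_year_failed = True
print(parse_with_short_year_failed)  # True
```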

How do I parse a date without zero padding, in the format (1 or 2-digit year)-(Month abbreviation)?

I need to parse a few dates that are roughly in the format (1 or 2-digit year)-(Month abbreviation), for example:
5-Jun (June 2005)
13-Jan (January 2013)
I tried using strptime with the format %b-%y but it did not consistently produce the desired date. Per the documentation, this is because some years in my dataset are not zero-padded.
Further, when I tested the dateutil module (please see below for my code) on the string "5-Jun", I got "2019-06-05" instead of the desired result (June 2005), even when I set yearfirst=True when calling parse.
from dateutil.parser import parse
parsed = parse("5-Jun",yearfirst=True)
print(parsed)
It is easier if single-digit years are zero-padded, because the string can then be parsed directly with a format. A regular expression is used here to replace any standalone single digit with its zero-padded value. I've used the regex from here.
Sample code:
import re
from datetime import datetime
match_condn = r'\b([0-9])\b'
replace_str = r'0\1'
datetime.strptime(re.sub(match_condn, replace_str, '15-Jun'), '%y-%b').strftime("%B %Y")
Output:
June 2015
One approach is to use str.zfill
Ex:
import datetime
d = ["5-Jun", "13-Jan"]
for item in d:
    year, month = item.split("-")
    year = year.zfill(2)  # pad single-digit years: "5" -> "05"
    print(datetime.datetime.strptime(year + "-" + month, "%y-%b").strftime("%B %Y"))
Output:
June 2005
January 2013
Ah. I see from @Rakesh's answer what your data looks like. I thought you needed to parse the full name of the month. So you had your two terms %b and %y backwards, and then you also had the problem with single-digit years. I get it now. Here's a much simpler way to get what you want, if you can assume your dates are always in one of the two formats you indicate:
import time
inp = "5-Jun"
# prepend "0" and keep the last 6 characters: "5-Jun" -> "05-Jun", "13-Jan" -> "13-Jan"
t = time.strptime(("0" + inp)[-6:], "%y-%b")

Converting string to datetime with milliseconds and timezone - Python

I have the following python snippet:
from datetime import datetime
timestamp = '05/Jan/2015:17:47:59:000-0800'
datetime_object = datetime.strptime(timestamp, '%d/%m/%y:%H:%M:%S:%f-%Z')
print(datetime_object)
However when I execute the code, I'm getting the following error:
ValueError: time data '05/Jan/2015:17:47:59:000-0800' does not match format '%d/%m/%y:%H:%M:%S:%f-%Z'
what's wrong with my matching expression?
EDIT 2: According to this post, strptime doesn't support %z in Python 2 (despite what the documentation suggests). To work around this, you can simply drop the timezone offset:
from datetime import datetime
timestamp = '05/Jan/2015:17:47:59:000-0800'
# only take the first 24 characters of `timestamp` by using [:24]
dt_object = datetime.strptime(timestamp[:24], '%d/%b/%Y:%H:%M:%S:%f')
print(dt_object)
Gives the following output:
$ python date.py
2015-01-05 17:47:59
EDIT: Your datetime.strptime argument should be '%d/%b/%Y:%H:%M:%S:%f%z' (with no literal '-' before %z, because the sign is part of the UTC offset that %z matches).
With strptime(), %y refers to
Year without century as a zero-padded decimal number
I.e. 01, 99, etc.
If you want to use the full 4-digit year, you need to use %Y
Similarly, if you want to use the 3-letter month, you need to use %b, not %m
I haven't looked at the rest of the string, but there are possibly more mismatches. You can find out how each section can be defined in the table at https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior
And the UTC offset directive is lowercase %z, not %Z (which matches a time zone name).
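For reference, a sketch of parsing the full string, offset included. This assumes Python 3, where strptime does understand %z, and an English locale so that %b matches "Jan". The variable name dt_aware is illustrative.

```python
from datetime import datetime, timedelta

timestamp = "05/Jan/2015:17:47:59:000-0800"

# %z consumes the sign as well, so there is no literal "-" between %f and %z.
dt_aware = datetime.strptime(timestamp, "%d/%b/%Y:%H:%M:%S:%f%z")

print(dt_aware.isoformat())                           # 2015-01-05T17:47:59-08:00
print(dt_aware.utcoffset() == timedelta(hours=-8))    # True
```

The result is a timezone-aware datetime, so the offset is preserved rather than discarded.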
