I'm using datetime.strptime to parse strings of the form %Y-%m-%dT%H:%M:%SZ into datetime values, but the data is dirty: sometimes the time part is missing, and sometimes the date arrives in yyyy/mm/dd format instead of yyyy-mm-dd. I can think of hacky regex and try-catch ways to parse this and get what I need, but is there a clean way to use datetime.strptime and obtain the datetime in '%Y-%m-%dT%H:%M:%SZ' format, with 00:00:00 or something similar as the default time if there is no time information?
Currently doing:
time = datetime.strptime(data['time'], '%Y-%m-%dT%H:%M:%SZ').replace(tzinfo=pytz.utc)
which throws an error if the data is in an unexpected format.
Just catch the ValueError and try again with an augmented value.
fmt = '%Y-%m-%dT%H:%M:%SZ'
try:
    time = datetime.strptime(data['time'], fmt)
except ValueError:
    time = datetime.strptime(data['time'] + "T00:00:00Z", fmt)
Alternatively, try the same string with a date-only format, since the resulting value will already default to 00:00:00.
date_and_time = '%Y-%m-%dT%H:%M:%SZ'
date_only = '%Y-%m-%d'
try:
    time = datetime.strptime(data['time'], date_and_time)
except ValueError:
    time = datetime.strptime(data['time'], date_only)
The second approach is a bit easier to adapt to multiple possible formats. Make a list, and iterate over them until one succeeds.
formats = ['%Y-%m-%dT%H:%M:%SZ', '%Y-%m-%d', ...]
for fmt in formats:
    try:
        time = datetime.strptime(data['time'], fmt)
        break
    except ValueError:
        pass
else:
    # raise ValueError(f'{data["time"]} does not match any expected format')
    time = datetime.now()  # Or some other completely artificial value
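The loop over formats can also be wrapped in a small helper. The following is only a minimal sketch: the name parse_timestamp and the KNOWN_FORMATS tuple (including the slash-separated variants mentioned in the question) are assumptions made for illustration, and it uses the standard library's timezone.utc instead of pytz.

```python
from datetime import datetime, timezone

# Hypothetical list of accepted formats; extend it as new variants show up.
KNOWN_FORMATS = (
    "%Y-%m-%dT%H:%M:%SZ",
    "%Y/%m/%dT%H:%M:%SZ",
    "%Y-%m-%d",
    "%Y/%m/%d",
)


def parse_timestamp(value, formats=KNOWN_FORMATS):
    """Return a UTC-aware datetime parsed with the first matching format."""
    for fmt in formats:
        try:
            # Date-only formats leave the time at 00:00:00.
            return datetime.strptime(value, fmt).replace(tzinfo=timezone.utc)
        except ValueError:
            continue
    raise ValueError(f"{value!r} does not match any expected format")


# Re-emit both inputs in the canonical layout from the question.
print(parse_timestamp("2019-10-14T20:14:50Z").strftime("%Y-%m-%dT%H:%M:%SZ"))
print(parse_timestamp("2019/10/14").strftime("%Y-%m-%dT%H:%M:%SZ"))
```

Raising at the end, rather than substituting an artificial value, keeps bad records visible to the caller; whether that is preferable depends on how the data is used.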
If you're okay with third-party dependencies, you may also try the dateutil library:
from dateutil import parser
time = parser.isoparse(data['time']).replace(tzinfo=pytz.utc)
Or, if you want to have more control over the default values:
import datetime
from dateutil import parser
time = parser.parse(data['time'], default=datetime.datetime(2019, 10, 14, 20, 14, 50), yearfirst=True).replace(tzinfo=pytz.utc)
Both of them allow more missing fields in the date string (like YYYY or YYYY-MM, etc.). See https://dateutil.readthedocs.io/en/stable/parser.html for more details.
Related
I am trying to convert a string to datetime object using the strptime function.
I am encountering a ValueError that says format doesn't match, so I did double checking and confirmed that the format in the string matches the format I am passing as the parameter for strptime.
I have also referenced this question: time data does not match format but there the month and year were swapped.
So does this only work with the '%y-%m-%d %H:%M:%S' format or is it dynamic as per the user input like in my case '%y-%m-%d-%H:%M:%S' ?
input:-
from datetime import datetime
stg = "2022-10-31-01:17:46"
do = datetime.strptime(stg, '%y-%m-%d-%H:%M:%S')
output
ValueError: time data '2022-10-31-01:17:46' does not match format '%y-%m-%d-%H:%M:%S'
Expected output:
#while printing 'do'
2022-10-31-01:17:46
You're almost there. You need %Y instead of %y since you're providing the year with the century (2022 instead of 22).
Your code would be
from datetime import datetime
stg = "2022-10-31-01:17:46"
do = datetime.strptime(stg, '%Y-%m-%d-%H:%M:%S')
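As a quick illustration of the difference between the two directives (a small sketch; the variable names are made up for this example), %Y consumes a four-digit year while %y consumes only two digits:

```python
from datetime import datetime

# %Y expects a four-digit year, %y a two-digit year.
four_digit = datetime.strptime("2022-10-31-01:17:46", "%Y-%m-%d-%H:%M:%S")
two_digit = datetime.strptime("22-10-31-01:17:46", "%y-%m-%d-%H:%M:%S")

# Both strings describe the same moment.
print(four_digit == two_digit)

# strftime accepts the same directives, so the input layout can be reproduced.
print(four_digit.strftime("%Y-%m-%d-%H:%M:%S"))
```

So the format string is not restricted to any particular layout; it only has to describe the input exactly, directive by directive.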
I have a set of variables (epoch_time, normal_date, date_time, date_time_zone), any of which can be passed in, and based on the string format I convert the value into my required date format (%Y-%m-%d). The variable can be a string holding an epoch value, a datetime with time zone, a datetime, or only a date. I have tried the following, but it only ever tries the first item in ALLOWED_STRING_FORMATS. Can someone suggest a better approach or help me resolve the issue?
from datetime import datetime
epoch_time='1481883402'
normal_date="2014-09-03"
date_time=str("2014-05-12 00:00:00")
date_time_zone=str("2015-01-20 08:28:16 UTC")
OP_FORMAT="%Y-%m-%d"
ALLOWED_STRING_FORMATS=["%Y-%m-%d %H:%M:%S %Z","%Y-%m-%d %H:%M:%S","%Y-%m-%d"]
def convert_timestamp(date_timestamp=None):
    for format in ALLOWED_STRING_FORMATS:
        if datetime.strptime(date_timestamp,format):
            d=datetime.strptime(date_timestamp,"%Y-%m-%d")
        else:
            d = datetime.fromtimestamp((float(date_timestamp) / 1000.), tz=None)
    return d.strftime(OP_FORMAT)
print(convert_timestamp(normal_date))
Error that i am getting is
ValueError: time data '2014-09-03' does not match format '%Y-%m-%d %H:%M:%S %Z'
You can use try-except for this.
from datetime import datetime
def convert_timestamp(date_timestamp, output_format="%Y-%m-%d"):
    ALLOWED_STRING_FORMATS = [
        "%Y-%m-%d %H:%M:%S %Z",
        "%Y-%m-%d %H:%M:%S",
        "%Y-%m-%d",
    ]
    # Try each known string format in turn; the first one that parses wins.
    for format in ALLOWED_STRING_FORMATS:
        try:
            d = datetime.strptime(date_timestamp, format)
            return d.strftime(output_format)
        except ValueError:
            pass
    try:
        # Unix epoch timestamp. Dividing by 1000 assumes milliseconds;
        # drop the division for values in seconds such as '1481883402'.
        epoch = int(date_timestamp) / 1000
        return datetime.fromtimestamp(epoch).strftime(output_format)
    except ValueError:
        raise ValueError('The timestamp did not match any of the allowed formats')
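One detail worth checking in the epoch branch is the unit: the sample value '1481883402' from the question is in seconds, while dividing by 1000 treats the input as milliseconds. The sketch below guesses the unit from the magnitude. The helper name epoch_to_date and the 1e11 threshold are assumptions for illustration, not part of the original answer, and the heuristic would misread millisecond timestamps from before 1973.

```python
from datetime import datetime, timezone


def epoch_to_date(value, output_format="%Y-%m-%d"):
    """Format a numeric epoch string, guessing seconds vs. milliseconds."""
    number = float(value)
    # Assumption: anything above 1e11 is milliseconds, since 1e11 seconds
    # would be roughly the year 5138.
    if abs(number) > 1e11:
        number /= 1000.0
    # Use UTC so the result does not depend on the machine's time zone.
    return datetime.fromtimestamp(number, tz=timezone.utc).strftime(output_format)


print(epoch_to_date("1481883402"))     # seconds
print(epoch_to_date("1481883402000"))  # the same instant in milliseconds
```

If the source of the data is known to use one unit consistently, a fixed conversion is simpler and safer than guessing.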
Do you need to make sure that only specific formats are allowed?
Otherwise you might consider using the automatic parser from dateutil:
from dateutil import parser
normal_date="2014-09-03"
print(parser.parse(normal_date))
I'm trying to see if a list of dates are valid dates. I'm using the dateutil library, but I'm getting weird results. For example, when I try the following:
import dateutil.parser as parser
x = '10/84'
date = (parser.parse(x))
print(date.isoformat())
I get the result 1984-10-12T00:00:00 which is wrong. Does anyone know why this 12 gets added to the date?
The parse() method parses the string and updates a default datetime object with the parsed information. If no default is passed to the function, it uses the first second of today (midnight of the current date).
This means that the 12 in your result is today's day of the month (at the time you ran the code); only the year and the month were updated from parsing the string.
If you need to parse the date string but you're not sure if it's a valid date value, then you may use a try ... except block to catch parse errors.
import dateutil.parser as parser
x = '10/84'
try:
    date = (parser.parse(x))
    print(date.isoformat())
except ValueError as err:
    pass  # handle the error
The 12 is the current day of the month. dateutil takes any component that is missing from the string (year, month, or day) from the current date/time. As another example, a date like "January 20" would be parsed as 2015/01/20, taking the year 2015 from the current datetime.
Sadly I have not yet found any options or such to stop this behavior.
I believe the best option for you would be to come up with a list of the valid datetime formats that you are expecting, and then manually try datetime.datetime.strptime on each of them, catching ValueError. Example:
import datetime
def isdate(dt, fmt):
    try:
        datetime.datetime.strptime(dt, fmt)
        return True
    except ValueError:
        return False
validformats = [...]
dates =[...]
for x in dates:
    if any(isdate(x,fmt) for fmt in validformats):
        print(x, 'is valid date')
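To make the whitelist idea concrete for the '10/84' input, here is a minimal sketch. The helper first_match and the two formats in valid_formats are assumptions chosen for illustration. Unlike dateutil, strptime fills missing fields from a fixed default (1900-01-01 00:00:00) rather than from today's date, so the result does not change from day to day.

```python
from datetime import datetime


def first_match(text, formats):
    """Return the datetime for the first format that fits, or None."""
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None


# Hypothetical whitelist: month/two-digit-year and full day/month/year.
valid_formats = ["%m/%y", "%d/%m/%Y"]

print(first_match("10/84", valid_formats))       # the missing day defaults to 1
print(first_match("31/02/2015", valid_formats))  # February 31 is rejected
```

Returning the parsed value instead of a boolean lets the caller both validate and reuse the result without parsing twice.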
My issue is that the user can input a date in any format, e.g. 12-feb-2015 or 12/10/2015, and I need to convert it to the format below:
12-feb-2015 00:00:00
This will then be fed into a MySQL query that fetches data in given date ranges.
So I have two questions:
Is there any standard way to convert any input format to my required one?
How can I append hh:mm:ss to it?
I saw a lot of methods in SO threads, but none seem to help me out.
Normally SO isn't a code writing service but... :)
This is only a start to what you could do. I'm unaware of any way to have one test catch multiple formats. Instead I've always "gone through" the available formats. Since we're talking about two, here's something to kick-start your thinking:
from datetime import datetime
def parse_date(thedate):
    result = None
    # Try the first format; a mismatch raises ValueError
    try:
        result = datetime.strptime(thedate, "%d-%b-%Y")
    except ValueError:
        pass
    # Let the last one "blow" up
    if result is None:
        result = datetime.strptime(thedate, "%d/%m/%Y")
    print("{} parsed into {}".format(thedate, result.strftime("%d-%b-%Y %H:%M:%S")))
parse_date("12/10/2015") yields 12/10/2015 parsed into 12-Oct-2015 00:00:00
and
parse_date("12-feb-2015") yields 12-feb-2015 parsed into 12-Feb-2015 00:00:00
That should get you going. Check out the strptime/strftime formats here (scroll down to the strftime function).
The main problem with your approach is that "any format" is effectively unlimited: there are as many possible formats as you can imagine.
Inspired by #al-g's answer, I propose an approach using a set of known date formats.
from datetime import datetime
def convert(dtm):
    formats = ['%d/%m/%Y', '%d-%m-%Y', '%d/%m/%Y %H:%M:%S', '%d-%m-%Y %H:%M:%S']
    # Return the result of the first format that parses successfully
    for fmt in formats:
        try:
            return datetime.strptime(dtm, fmt).strftime('%d/%b/%Y %H:%M:%S')
        except ValueError:
            pass
    print('Format not recognized')
>>> convert('15072015')
Format not recognized
>>> convert('15-07-2015')
'15/Jul/2015 00:00:00'
>>> convert('15/07/2015')
'15/Jul/2015 00:00:00'
You can update the set of formats every time you find a new one.
I have a datetime object with integer number of seconds (ex: 2010-04-16 16:51:23). I am using the following command to extract exact time
dt = datetime.datetime.strptime(time, '%Y-%m-%d %H:%M:%S.%f')
Generally I have decimals (ex: 2010-04-16 16:51:23.1456), but sometimes I don't. So when I run this command, I get an error message:
ValueError: time data '2010-04-16 16:51:23' does not match format '%Y-%m-%d %H:%M:%S.%f'
How do I go about resolving this?
It's because you don't have the format you specified. You have the format:
'%Y-%m-%d %H:%M:%S'
There are multiple solutions. First, always generate the data in the same format (adding .00 if you need to).
A second solution is that you try to decode in one format and if you fail, you decode using the other format:
try:
    dt = datetime.datetime.strptime(time, '%Y-%m-%d %H:%M:%S.%f')
except ValueError:
    dt = datetime.datetime.strptime(time, '%Y-%m-%d %H:%M:%S')
Another way avoiding using the exception handling mechanism is to default the field if not present and just try processing with the one format string:
from datetime import datetime
s = '2010-04-16 16:51:23.123'
dt, secs = s.partition('.')[::2]
print(datetime.strptime('{}.{}'.format(dt, secs or '0'), '%Y-%m-%d %H:%M:%S.%f'))
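A related way to avoid the exception is to pick the format string based on whether the input contains a fractional part. This is only a sketch under the assumption that a '.' never appears elsewhere in the input; the function name parse_optional_fraction is made up for illustration.

```python
from datetime import datetime


def parse_optional_fraction(text):
    """Parse 'YYYY-MM-DD HH:MM:SS' with or without fractional seconds."""
    # %f accepts one to six digits, so '.1456' means 145600 microseconds.
    fmt = "%Y-%m-%d %H:%M:%S.%f" if "." in text else "%Y-%m-%d %H:%M:%S"
    return datetime.strptime(text, fmt)


print(repr(parse_optional_fraction("2010-04-16 16:51:23.1456")))
print(repr(parse_optional_fraction("2010-04-16 16:51:23")))
```

Inputs that match neither layout still raise ValueError, so malformed data is not silently accepted.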
If you're using a recent Python (3.2+), simple-date will do this kind of thing for you:
>>> from simpledate import *
>>> SimpleDate('2010-04-16 16:51:23.1456')
SimpleDate('2010-04-16 16:51:23.145600', tz='America/Santiago')
>>> SimpleDate('2010-04-16 16:51:23')
SimpleDate('2010-04-16 16:51:23', tz='America/Santiago')
it works by extending the python template format. so you could also write (it's not needed because ISO8601-like formats are handled by default):
>>> SimpleDate('2010-04-16 16:51:23', format='Y-m-d H:M:S(.f)?')
SimpleDate('2010-04-16 16:51:23', tz='America/Santiago')
Note how the fractional seconds are written as (.f)?: as in a regexp, the ? means the group is optional (also, it will add % signs if there are none).
PS and you can access the datetime via an attribute. if you wanted to discard the tzinfo (which is taken from the locale by default - i live in chile, hence America/Santiago above) to get a naive datetime:
>>> SimpleDate('2010-04-16 16:51:23').naive.datetime
datetime.datetime(2010, 4, 16, 16, 51, 23)