Right now I have:
timestamp = datetime.strptime(date_string, '%Y-%m-%d %H:%M:%S.%f')
This works great unless I'm converting a string that doesn't have the microseconds. How can I specify that the microseconds are optional (and should be considered 0 if they aren't in the string)?
You could use a try/except block:
try:
    timestamp = datetime.strptime(date_string, '%Y-%m-%d %H:%M:%S.%f')
except ValueError:
    timestamp = datetime.strptime(date_string, '%Y-%m-%d %H:%M:%S')
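The same idea can be wrapped in a small helper that tries each format in turn. This is only a sketch: the name parse_timestamp and the FORMATS tuple are illustrative and not part of the answer above.

```python
from datetime import datetime

# Formats to try, most specific first (illustrative list).
FORMATS = ('%Y-%m-%d %H:%M:%S.%f', '%Y-%m-%d %H:%M:%S')

def parse_timestamp(date_string):
    """Return a datetime, treating missing microseconds as 0."""
    for fmt in FORMATS:
        try:
            return datetime.strptime(date_string, fmt)
        except ValueError:
            continue
    raise ValueError(f'unrecognised timestamp: {date_string!r}')

print(parse_timestamp('2021-06-08 14:36:09.5'))
print(parse_timestamp('2021-06-08 14:36:09'))
```

Adding another accepted layout then only means adding another entry to FORMATS.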
What about just appending it if it doesn't exist?
if '.' not in date_string:
    date_string = date_string + '.0'
timestamp = datetime.strptime(date_string, '%Y-%m-%d %H:%M:%S.%f')
I'm late to the party, but if you don't care about the fractional seconds, this will lop off the .%f part for you:
date_string.split('.')[0]
I prefer using regex matches instead of try and except. This allows for many fallbacks of acceptable formats.
def parse_iso_timestamp(date_string):
    # full timestamp with fractional seconds
    match = re.match(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z", date_string)
    if match:
        return datetime.strptime(date_string, "%Y-%m-%dT%H:%M:%S.%fZ")
    # timestamp missing fractional seconds
    match = re.match(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z", date_string)
    if match:
        return datetime.strptime(date_string, "%Y-%m-%dT%H:%M:%SZ")
    # timestamp missing fractional seconds & seconds
    match = re.match(r"\d{4}-\d{2}-\d{2}T\d{2}:\d{2}Z", date_string)
    if match:
        return datetime.strptime(date_string, "%Y-%m-%dT%H:%MZ")
    # unknown timestamp format
    return None
Don't forget to import "re" as well as "datetime" for this method.
datetime(*map(int, re.findall(r'\d+', date_string)))
can parse both '%Y-%m-%d %H:%M:%S.%f' and '%Y-%m-%d %H:%M:%S'. It is too permissive if your input is not filtered, and the fractional digits are read as a whole number of microseconds, so the result is only right when the fraction has exactly six digits.
It is quick-and-dirty, but sometimes strptime() is too slow. It can be used if you know that the input has the expected date format.
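One caveat with the findall approach is that a short fraction such as '.5' would be passed to datetime() as 5 microseconds rather than 500000. A possible variant that right-pads the fraction is sketched below; fast_parse is a hypothetical name, and the sketch still assumes trusted input in the 'YYYY-MM-DD HH:MM:SS[.f]' layout.

```python
import re
from datetime import datetime

def fast_parse(date_string):
    """Build a datetime from the digit groups in 'YYYY-MM-DD HH:MM:SS[.f]'."""
    head, _, fraction = date_string.partition('.')
    parts = [int(p) for p in re.findall(r'\d+', head)]
    if fraction:
        # Right-pad (or truncate) to six digits so '.5' means 500000
        # microseconds, not 5.
        parts.append(int(fraction[:6].ljust(6, '0')))
    return datetime(*parts)

print(fast_parse('2021-06-08 14:36:09.5'))
print(fast_parse('2021-06-08 14:36:09'))
```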
If you are using pandas, you can also filter the Series by format, parse each part separately, and concatenate the results. The index is joined automatically.
import pandas as pd
# Every other row has a different format
df = pd.DataFrame({"datetime_string": ["21-06-08 14:36:09", "21-06-08 14:36:09.50", "21-06-08 14:36:10", "21-06-08 14:36:10.50"]})
df["datetime"] = pd.concat([
    pd.to_datetime(df["datetime_string"].iloc[1::2], format="%y-%m-%d %H:%M:%S.%f"),
    pd.to_datetime(df["datetime_string"].iloc[::2], format="%y-%m-%d %H:%M:%S"),
])
   datetime_string        datetime
0  21-06-08 14:36:09      2021-06-08 14:36:09
1  21-06-08 14:36:09.50   2021-06-08 14:36:09.500000
2  21-06-08 14:36:10      2021-06-08 14:36:10
3  21-06-08 14:36:10.50   2021-06-08 14:36:10.500000
Using one regular expression and a list comprehension:
time_str = "12:34.567"
# time format is [HH:]MM:SS[.FFF]
sum([a*b for a,b in zip(map(lambda x: int(x) if x else 0, re.match(r"(?:(\d{2}):)?(\d{2}):(\d{2})(?:\.(\d{3}))?", time_str).groups()), [3600, 60, 1, 1/1000])])
# result = 754.567
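For readability, the one-liner above could be unpacked into a small function along these lines. The names TIME_RE and to_seconds are made up for this sketch, and unlike the one-liner it uses fullmatch and raises on input that does not fit the pattern.

```python
import re

# [HH:]MM:SS[.FFF], the same shape as the one-liner above
TIME_RE = re.compile(r'(?:(\d{2}):)?(\d{2}):(\d{2})(?:\.(\d{3}))?')

def to_seconds(time_str):
    """Convert '[HH:]MM:SS[.FFF]' to a number of seconds."""
    match = TIME_RE.fullmatch(time_str)
    if match is None:
        raise ValueError(f'unexpected time format: {time_str!r}')
    # Groups that did not take part in the match are None and count as 0.
    hours, minutes, seconds, millis = (int(g) if g else 0 for g in match.groups())
    return hours * 3600 + minutes * 60 + seconds + millis / 1000

print(to_seconds('12:34.567'))
print(to_seconds('01:02:03'))
```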
For a similar problem in jq, I used the following:
|split("Z")[0]|split(".")[0]|strptime("%Y-%m-%dT%H:%M:%S")|mktime
This let me sort my list by time properly.
Related
I have the following string, 20211208_104755, representing a date_time format. I want to convert it to a Python datetime object using the datetime.strptime() method.
mydatetime = "20211208_104755"
datetime_object = datetime.strptime(mydatetime, '%y/%m/%d')
However I am getting the following error.
ValueError: time data '20211208' does not match format '%y/%m/%d'
The second argument in strptime has to match the pattern of your datetime string. You can find the patterns and their meaning on https://docs.python.org/3/library/datetime.html
In your case, you can format it as
from datetime import datetime
mydatetime = "20211208_104755"
datetime_object = datetime.strptime(mydatetime, '%Y%m%d_%H%M%S')
print(datetime_object)
>>> 2021-12-08 10:47:55
What should work is to describe how the mydatetime string is composed.
For example, %Y is the four-digit year; see the format code reference (section "strftime() Date Format Codes") for the others.
So in your example I would assume it's like this:
mydatetime = "20211208_104755"
datetime_object = datetime.strptime(mydatetime, '%Y%m%d_%H%M%S')
print(datetime_object)
result
2021-12-08 10:47:55
and
type(datetime_object)
datetime.datetime
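To see why the format has to mirror the string exactly, a quick check like the following may help. The helper name matches is hypothetical; it simply reports whether strptime accepts a string under a given format.

```python
from datetime import datetime

mydatetime = "20211208_104755"

def matches(date_string, fmt):
    """Return True if strptime accepts date_string under fmt."""
    try:
        datetime.strptime(date_string, fmt)
        return True
    except ValueError:
        return False

print(matches(mydatetime, '%y/%m/%d'))        # False: no slashes in the string
print(matches(mydatetime, '%Y%m%d_%H%M%S'))   # True: every character is accounted for
```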
I am using Django. I want to convert the following time string into an hours:minutes am/pm format.
string_time = '2022-09-13 11:00:00.996795+00'
expected output:
11:00 am
actual output is :
ValueError: time data '2022-09-13 11:00:00.996795+00' does not match format '%m/%d/%y %H:%M:%S'
my code :
def time_slots(self,string_time='2022-09-13 11:00:00.996795+00'):
    print(datetime.strptime(string_time, '%m/%d/%y %H:%M:%S'),type(start_time))
    start_time = datetime.strptime(string_time, '%m/%d/%y %H:%M:%S')
    return formated_start_time
If you remove the last three characters ('+00') and replace the space with T, you can use datetime.fromisoformat() to get a datetime object.
from datetime import datetime
timestr = '2022-09-13 11:00:00.996795+00'
# slice off the offset; rstrip() would strip any trailing '+' or '0' characters
timestr = timestr[:-3].replace(' ', 'T')
date = datetime.fromisoformat(timestr)
From there you can use date.hour and date.minute to get the values you want.
e.g.:
# 0 and 12 both display as 12 on a 12-hour clock
hour = date.hour % 12 or 12
minute = date.minute
# noon and later is 'pm'
addition = 'pm' if date.hour >= 12 else 'am'
print(f'{hour}:{minute:02d} {addition}')
I'm not sure whether the trailing +00 is useful.
If not, the following implementation can help you.
from datetime import datetime
def time_slots(string_time='2022-09-13 11:00:00.996795+00'):
    # drop the trailing '+00' before parsing
    date = datetime.strptime(string_time[:-3], '%Y-%m-%d %H:%M:%S.%f')
    # %I is the 12-hour clock that pairs with %p (AM/PM); %H would give 24-hour values
    return date.strftime("%I:%M %p")
output = time_slots()
print(output) # the output is: 11:00 AM
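If the offset does matter, one possible way to keep it is to extend '+00' to '+0000' so that strptime's %z directive accepts it. This is a sketch under the assumption that the offset is always given as hours only; time_slot_label is a hypothetical name.

```python
from datetime import datetime

def time_slot_label(string_time):
    # strptime's %z expects at least hours and minutes, so '+00' is
    # extended to '+0000' (assumes the offset is always hours only).
    dt = datetime.strptime(string_time + '00', '%Y-%m-%d %H:%M:%S.%f%z')
    # %I is the 12-hour clock; lower() turns 'AM' into 'am'.
    return dt.strftime('%I:%M %p').lower()

print(time_slot_label('2022-09-13 11:00:00.996795+00'))
```

The resulting datetime is timezone-aware, so it can be converted to another zone with astimezone() before formatting if needed.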
You can use the parse function provided by dateutil:
from dateutil.parser import parse
string_time = '2022-09-13 11:00:00.996795+00'
dt = parse(string_time)
print(dt.strftime("%I:%M %p"))
Result: 11:00 AM
We are struggling with formatting datetimes in Python 3, and we can't seem to figure it out on our own. So far, we have converted our dataframe column to datetime, so that it should be '%Y-%m-%d %H:%M:%S':
before
02-01-2011 22:00:00
after
2011-01-02 22:00:00
For some very odd reason, when datetime is
13-01-2011 00:00:00
it is changed to this
2011-13-01 00:00:00
And from there it's mixing months with days and is therefore counting months instead of days.
This is all of our code for this datetime formatting:
df['local_date']=df['local_date'] + ':00'
df['local_date'] = pd.to_datetime(df.local_date)
df['local_date']=df['local_date'].dt.strftime('%Y-%m-%d %H:%M:%S')
UPDATED CODE WHICH WORKS:
df['local_date']=df['local_date'] + ':00'
df['local_date'] = pd.to_datetime(df.local_date.str.strip(), format='%d-%m-%Y %H:%M:%S')
df['local_date']=df['local_date'].dt.strftime('%Y-%m-%d %H:%M:%S')
Can't say for sure, but I believe this has to do with the warning mentioned in the documentation of to_datetime:
dayfirst : boolean, default False
Specify a date parse order if arg is str or its list-likes. If True, parses dates with the day first, eg 10/11/12 is parsed as 2012-11-10. Warning: dayfirst=True is not strict, but will prefer to parse with day first (this is a known bug, based on dateutil behavior).
I think the way to get around this is by explicitly passing a format string to to_datetime:
df['local_date'] = pd.to_datetime(df.local_date, format='%d-%m-%Y %H:%M:%S')
This way it won't accidentally mix months and days (but it will raise an error if any line has a different format).
import pandas as pd
local_date = "13-01-2011 00:00"
local_date = local_date + ":00"
local_date = pd.to_datetime(local_date, format='%d-%m-%Y %H:%M:%S')
local_date = local_date.strftime('%Y-%m-%d %H:%M:%S')
print(local_date)
The output is:
2011-01-13 00:00:00
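The ambiguity itself can be illustrated without pandas. The sketch below parses the question's first sample with plain datetime.strptime under both orderings; the variable names are made up for the example.

```python
from datetime import datetime

raw = '02-01-2011 22:00:00'

# The same text gives two different dates depending on the assumed order.
day_first = datetime.strptime(raw, '%d-%m-%Y %H:%M:%S')
month_first = datetime.strptime(raw, '%m-%d-%Y %H:%M:%S')

print(day_first.strftime('%Y-%m-%d %H:%M:%S'))    # 2011-01-02 22:00:00
print(month_first.strftime('%Y-%m-%d %H:%M:%S'))  # 2011-02-01 22:00:00
```

A value such as '13-01-2011 00:00:00' only fits the day-first reading, which is why an explicit format removes the guesswork.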
I have a set of variables (epoch_time, normal_date, date_time, date_time_zone), any of which can be passed in, and based on the string format I convert it to my required date format (%Y-%m-%d). The variable can be a string holding an epoch value, a date and time with a time zone, a date and time, or only a date. I tried the following, but it only ever tries the first item in ALLOWED_STRING_FORMATS. Can someone suggest a better approach or help me resolve the issue?
from datetime import datetime
epoch_time='1481883402'
normal_date="2014-09-03"
date_time=str("2014-05-12 00:00:00")
date_time_zone=str("2015-01-20 08:28:16 UTC")
OP_FORMAT="%Y-%m-%d"
ALLOWED_STRING_FORMATS=["%Y-%m-%d %H:%M:%S %Z","%Y-%m-%d %H:%M:%S","%Y-%m-%d"]
def convert_timestamp(date_timestamp=None):
    for format in ALLOWED_STRING_FORMATS:
        if datetime.strptime(date_timestamp,format):
            d=datetime.strptime(date_timestamp,"%Y-%m-%d")
        else:
            d = datetime.fromtimestamp((float(date_timestamp) / 1000.), tz=None)
    return d.strftime(OP_FORMAT)
print(convert_timestamp(normal_date))
Error that i am getting is
ValueError: time data '2014-09-03' does not match format '%Y-%m-%d %H:%M:%S %Z'
You can use try-except for this.
from datetime import datetime
def convert_timestamp(date_timestamp, output_format="%Y-%m-%d"):
    ALLOWED_STRING_FORMATS = [
        "%Y-%m-%d %H:%M:%S %Z",
        "%Y-%m-%d %H:%M:%S",
        "%Y-%m-%d",
    ]
    # try each format in turn; strptime raises ValueError on a mismatch
    for format in ALLOWED_STRING_FORMATS:
        try:
            d = datetime.strptime(date_timestamp, format)
            return d.strftime(output_format)
        except ValueError:
            pass
    try:
        # unix epoch timestamp, assumed to be in milliseconds as in the question's code;
        # drop the division if the value is in seconds
        epoch = int(date_timestamp) / 1000
        return datetime.fromtimestamp(epoch).strftime(output_format)
    except ValueError:
        raise ValueError('The timestamp did not match any of the allowed formats')
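Note that the question's sample epoch value '1481883402' looks like seconds rather than milliseconds. A variant that guesses the unit might look like the sketch below. The 1e11 threshold, the use of UTC, and the name to_date_string are all assumptions made for illustration.

```python
from datetime import datetime, timezone

ALLOWED_STRING_FORMATS = (
    "%Y-%m-%d %H:%M:%S %Z",
    "%Y-%m-%d %H:%M:%S",
    "%Y-%m-%d",
)

def to_date_string(value, output_format="%Y-%m-%d"):
    for fmt in ALLOWED_STRING_FORMATS:
        try:
            return datetime.strptime(value, fmt).strftime(output_format)
        except ValueError:
            continue
    # Fall back to a Unix epoch value. Assumption: values of 1e11 or more
    # are milliseconds, smaller values are seconds.
    seconds = float(value)  # raises ValueError for non-numeric input
    if seconds >= 1e11:
        seconds /= 1000
    # Assumption: interpret the epoch in UTC rather than local time.
    return datetime.fromtimestamp(seconds, tz=timezone.utc).strftime(output_format)

for sample in ('1481883402', '2014-09-03', '2014-05-12 00:00:00',
               '2015-01-20 08:28:16 UTC'):
    print(sample, '->', to_date_string(sample))
```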
Do you need to make sure that only specific formats are allowed?
Otherwise you might consider using the automatic parser from dateutil:
from dateutil import parser
normal_date="2014-09-03"
print(parser.parse(normal_date))
I tried:
df["datetime_obj"] = df["datetime"].apply(lambda dt: datetime.strptime(dt, "%d/%m/%Y %H:%M"))
but got this error:
ValueError: time data '10/11/2006 24:00' does not match format '%d/%m/%Y %H:%M'
How to solve it correctly?
This does not work because the %H directive only accepts values from 00 to 23 (inclusive). As the error says, 24:00 is therefore not a valid time string.
We don't have many options other than converting the string to a valid format. We can do this by first replacing 24:00 with 00:00 and then incrementing the day for those timestamps.
Like:
from datetime import timedelta
import pandas as pd
df['datetime_zero'] = df['datetime'].str.replace('24:00', '0:00')
df['datetime_er'] = pd.to_datetime(df['datetime_zero'], format='%d/%m/%Y %H:%M')
selrow = df['datetime'].str.contains('24:00')
df['datetime_obj'] = df['datetime_er'] + selrow * timedelta(days=1)
The last line adds one day to the rows that contain 24:00, so that '10/11/2006 24:00' gets converted to '11/11/2006 00:00'. Note, however, that the above is rather fragile: whether it works depends on the format of the timestamp. Here it will (probably) work, since there is only one colon. But if the datetimes have seconds as well, the filter could also be triggered by 00:24:00, so it might need some extra work.
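For a single string outside pandas, the same replace-then-add-a-day idea could be sketched as follows. Checking only the end of the string avoids the accidental match described above; parse_with_24h is a hypothetical name, and the default format is the one from the question.

```python
from datetime import datetime, timedelta

def parse_with_24h(date_string, fmt='%d/%m/%Y %H:%M'):
    """Parse a timestamp, treating a trailing '24:00' as 00:00 the next day."""
    if date_string.endswith(' 24:00'):
        # Swap the time for 00:00, parse, then move to the following day.
        midnight = date_string[:-len('24:00')] + '00:00'
        return datetime.strptime(midnight, fmt) + timedelta(days=1)
    return datetime.strptime(date_string, fmt)

print(parse_with_24h('10/11/2006 24:00'))
print(parse_with_24h('12/11/2006 15:00'))
```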
Your data doesn't follow the conventions used by Python / Pandas datetime objects. There should be only one way of storing a particular datetime, i.e. '10/11/2006 24:00' should be rewritten as '11/11/2006 00:00'.
Here's one way to approach the problem:
# find datetimes which have '24:00' and rewrite
twenty_fours = df['strings'].str[-5:] == '24:00'
df.loc[twenty_fours, 'strings'] = df['strings'].str[:-5] + '00:00'
# construct datetime series
df['datetime'] = pd.to_datetime(df['strings'], format='%d/%m/%Y %H:%M')
# add one day where applicable
df.loc[twenty_fours, 'datetime'] += pd.DateOffset(1)
Here's some data to test:
dateList = ['10/11/2006 24:00', '11/11/2006 00:00', '12/11/2006 15:00']
df = pd.DataFrame({'strings': dateList})
Result after transformations described above:
print(df['datetime'])
0 2006-11-11 00:00:00
1 2006-11-11 00:00:00
2 2006-11-12 15:00:00
Name: datetime, dtype: datetime64[ns]
As indicated in the documentation (https://docs.python.org/2/library/datetime.html#strftime-strptime-behavior), hours go from 00 to 23. 24:00 is then an error.