One column of a CSV file contains a time together with its time zone.
Here is one value from that column: 2018-05-20 15:05:51.065 America/New_York. How can I convert the value to the 2018-05-20 format? There are over half a million rows in the CSV file.
Split the column into date, time and zone using string methods or a regex, and pick a standard time zone to convert to (e.g. UTC).
Then get the time difference between the zone and UTC; see
How to convert string timezones in form (Country/city) into datetime.tzinfo
Apply this difference to the time you have already split out, and adjust the date if the result crosses midnight.
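The steps above can be sketched with the standard library alone. This is a minimal sketch, assuming Python 3.9+ (for zoneinfo), an available IANA time zone database on the machine, and that every value has the form "<date> <time> <zone>". The helper name to_utc is made up for illustration.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # standard library since Python 3.9


def to_utc(value: str) -> datetime:
    """Convert '<date> <time> <Region/City>' to an aware UTC datetime."""
    # the zone name is the last space-separated field
    stamp, zone_name = value.rsplit(" ", 1)
    naive = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
    # attach the source zone, then let astimezone() apply the offset
    local = naive.replace(tzinfo=ZoneInfo(zone_name))
    return local.astimezone(timezone.utc)


utc_dt = to_utc("2018-05-20 15:05:51.065 America/New_York")
print(utc_dt.date())
```

Converting first and taking the date afterwards matters for rows close to midnight, where the UTC date can differ from the local one.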
If you just want it to be a string, just strip away everything past the first space:
"2018-05-20 15:05:51.065 America/New_York".split(' ')[0]
EDIT:
If you want it to be a timezone-aware datetime object, you can do it easily with the pytz package:
from datetime import datetime
from pytz import timezone
string_date = "2018-05-20 15:05:51.065 America/New_York"
# the zone name is the last space-separated field; the rest is the naive timestamp
unaware, tz_name = string_date.rsplit(' ', 1)
tz = timezone(tz_name)
unaware_datetime = datetime.strptime(unaware, "%Y-%m-%d %H:%M:%S.%f")
# pytz zones must be attached with localize(); replace(tzinfo=tz) would pick up the zone's LMT offset
aware_datetime = tz.localize(unaware_datetime)
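If only the date string is needed, the split from the start of this answer can be applied to the whole file row by row. Here is a sketch using the csv module. The column name "timestamp" and the sample rows are assumptions for illustration, and io.StringIO stands in for the real file.

```python
import csv
import io

# stand-in for the real CSV file; the column name "timestamp" is hypothetical
sample = io.StringIO(
    "id,timestamp\n"
    "1,2018-05-20 15:05:51.065 America/New_York\n"
    "2,2018-05-21 09:30:00.000 Europe/Berlin\n"
)

dates = []
for row in csv.DictReader(sample):
    # keep only the text before the first space, i.e. the date part
    dates.append(row["timestamp"].split(" ", 1)[0])

print(dates)
# ['2018-05-20', '2018-05-21']
```

Because this is plain string work with no parsing, half a million rows should not be a problem.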
The following code converts a string into a timestamp. The timestamp comes out to: 1646810127.
However, if I use Excel to convert this date and time into a float I get: 44629,34.
I need Excel's output from the Python script.
I have tried with a few different datetime strings to see if there is any pattern in between the two numbers, but cannot seem to find any.
Any thoughts on how I get the code to output 44629,34?
Much appreciated
import datetime
date_time_str = '2022-03-09 08:15:27'
date_time_obj = datetime.datetime.strptime(date_time_str, '%Y-%m-%d %H:%M:%S')
print('Date:', date_time_obj.date())
print('Time:', date_time_obj.time())
print('Date-time:', date_time_obj)
print(date_time_obj.timestamp())
>>output:
Date: 2022-03-09
Time: 08:15:27
Date-time: 2022-03-09 08:15:27
1646810127.0
Calculate the timedelta between your datetime object and Excel's "day zero", then divide its total_seconds() by the number of seconds in a day to get the Excel serial date:
import datetime
date_time_str = '2022-03-09 08:15:27'
UTC = datetime.timezone.utc
dt_obj = datetime.datetime.fromisoformat(date_time_str).replace(tzinfo=UTC)
day_zero = datetime.datetime(1899,12,30, tzinfo=UTC)
excel_serial_date = (dt_obj-day_zero).total_seconds()/86400
print(excel_serial_date)
# 44629.3440625
Note: I'm setting time zone to UTC here to avoid any ambiguities - adjust as needed.
Since the question is tagged pandas, you can do the same thing there. You don't need to set UTC explicitly, because both timestamps are naive and pandas treats naive datetimes as UTC by default:
import pandas as pd
ts = pd.Timestamp('2022-03-09 08:15:27')
excel_serial_date = (ts-pd.Timestamp('1899-12-30')).total_seconds()/86400
print(excel_serial_date)
# 44629.3440625
See also:
background: What is story behind December 30, 1899 as base date?
inverse operation: Convert Excel style date with pandas
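Both directions can be wrapped in two small helpers. This is a sketch under the same assumptions as the answer: naive datetimes, the 1899-12-30 base date, and dates from 1900-03-01 onwards (Excel treats 1900 as a leap year, so earlier serial numbers are off by one). The names to_excel_serial and from_excel_serial are made up for illustration.

```python
from datetime import datetime, timedelta

# Excel's "day zero" for the 1900 date system, as used in the answer above
DAY_ZERO = datetime(1899, 12, 30)


def to_excel_serial(dt: datetime) -> float:
    """Naive datetime -> Excel serial date (days since DAY_ZERO)."""
    return (dt - DAY_ZERO).total_seconds() / 86400


def from_excel_serial(serial: float) -> datetime:
    """Excel serial date -> naive datetime."""
    return DAY_ZERO + timedelta(days=serial)


serial = to_excel_serial(datetime(2022, 3, 9, 8, 15, 27))
print(round(serial, 2))  # two decimals, like the value Excel displayed
print(from_excel_serial(serial))
```

The integer part of the serial date counts days and the fractional part is the time of day, which is why dividing by 86400 is all that is needed.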
I have a string and I need to convert it first to utc and then extract the date from it.
times = '2021-04-15T21:53:00:000-06'
I am first doing:
datetime.datetime.strptime(times, "%Y-%m-%dT%H:%M:%S.%f%z")
It's giving me exception as:
ValueError: time data '2021-04-15T21:53:00-06' does not match format
'%Y-%m-%dT%H:%M:%S.%f%z'
I want to replace the timezone to utc replace(tzinfo=datetime.timezone.utc)
and extract only yyyy-mm-dd.
Assuming the format is consistent in your data (length of the strings is constant), you can do a bit of string slicing to separate date/time and UTC offset. Parse the first to datetime and add the latter as a timezone constructed from a timedelta. Then convert to UTC.
Ex:
from datetime import datetime, timedelta, timezone
s = '2021-04-15T21:53:00:000-06'
# first part to datetime
dt = datetime.fromisoformat(s[:-3])
# set time zone
dt = dt.replace(tzinfo=timezone(timedelta(hours=int(s[-3:]))))
# to UTC
dt_utc = dt.astimezone(timezone.utc)
print(dt_utc.date())
# 2021-04-16
Note that this will fail if the format is not consistent, e.g. if some strings have +0530 while others only have e.g. -06.
In that case, another option is to use strptime, but that requires modifying the input as well. %z expects ±HH:MM or ±HHMM, so you can add the minutes like
if len(s) == 26:  # minutes missing
    s += '00'
dt = datetime.strptime(s, "%Y-%m-%dT%H:%M:%S:%f%z")
and then convert to UTC as described above.
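Put together, the strptime route might look like the sketch below. It checks for a bare ±HH offset with a regex instead of the string length, so inputs that already carry minutes pass through unchanged. The function name utc_date is hypothetical.

```python
import re
from datetime import date, datetime, timezone


def utc_date(s: str) -> date:
    """Parse 'YYYY-MM-DDTHH:MM:SS:fff±HH[MM]' and return the UTC date."""
    # pad a bare ±HH offset to ±HHMM so that %z accepts it
    if re.search(r"[+-]\d{2}$", s):
        s += "00"
    dt = datetime.strptime(s, "%Y-%m-%dT%H:%M:%S:%f%z")
    return dt.astimezone(timezone.utc).date()


print(utc_date("2021-04-15T21:53:00:000-06"))    # 2021-04-16
print(utc_date("2021-04-15T21:53:00:000+0530"))  # 2021-04-15
```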
What is this date format: 2020-01-13T09:25:19-0330? And how can I get the current datetime in this format in Python?
Edited: Also note there are only 4 digits after last -. The API which I need to hit accepts exactly this format.
2nd Edit: Confirmed from api's dev team, last 4 digits are milliseconds, with 0 prepended. ex, 330 is the milliseconds, and they mention it as 0330.
It's an ISO 8601 timestamp format.
In order to get the current time in that format:
from datetime import datetime
print(datetime.now().isoformat())
In your case, the iso format is truncated to seconds, and has a timezone:
from datetime import datetime, timezone, timedelta
tz = timezone(timedelta(hours=-3.5))
current_time = datetime.now(tz)
print(current_time.isoformat(timespec="seconds"))
Where -3.5 is the UTC offset.
If you wish to use the system's local timezone, you can do so like this:
from datetime import datetime
current_time = datetime.now().astimezone()
print(current_time.isoformat(timespec="seconds"))
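One detail worth checking against the API: isoformat() writes the offset with a colon (-03:30), while the example in the question has none (-0330). A sketch of two ways to produce the exact string follows. The function names are made up, and the second one follows the reading from the question's second edit, where the trailing four digits are zero-padded milliseconds rather than a UTC offset.

```python
from datetime import datetime, timedelta, timezone


def api_stamp(dt: datetime) -> str:
    # %z renders the offset as ±HHMM (no colon), unlike isoformat()
    return dt.strftime("%Y-%m-%dT%H:%M:%S%z")


def api_stamp_millis(dt: datetime) -> str:
    # variant for the question's second edit: the trailing four
    # digits are milliseconds, zero-padded to width 4
    return f"{dt:%Y-%m-%dT%H:%M:%S}-{dt.microsecond // 1000:04d}"


tz = timezone(timedelta(hours=-3, minutes=-30))
example = datetime(2020, 1, 13, 9, 25, 19, 330000, tzinfo=tz)
print(api_stamp(example))         # 2020-01-13T09:25:19-0330
print(api_stamp_millis(example))  # 2020-01-13T09:25:19-0330
```

For the current time, pass datetime.now().astimezone() to api_stamp, or datetime.now() to api_stamp_millis.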
I have the following two date/time which are date_time1 and date_time2 respectively:
2017-04-15 00:00:00
2017-04-17 15:35:19+00:00
parsed1 = dateutil.parser.parse(date_time1)
parsed2 = dateutil.parser.parse(date_time2)
If I receive another date/time called input_date_time (e.g. 2017-04-16 12:11:42+00:00), I would like to do the following:
# Would like to check if `input_date_time` is within the range
if parsed1 <= input_date_time <= parsed2:
    …
This raises an error: TypeError: can't compare offset-naive and offset-aware datetimes
I thought of breaking each value down into year, month, day, hour, minute, and second, and comparing every single one.
What would be the proper way to do so?
Here is my edited (again) example.
I think every datetime object should carry time zone data.
Assume that date_time1 is a local time.
Rather than clearing the tzinfo of the other values (my first example), add time zone data to date_time1:
import dateutil.parser
import datetime
from pytz import utc
date_time1 ='2017-04-15 00:00:00'
date_time2 ='2017-04-17 15:35:19+00:00'
input_date_time = '2017-04-16 12:11:42+00:00'
parsed1 = dateutil.parser.parse(date_time1).astimezone(utc)
parsed2 = dateutil.parser.parse(date_time2)
input_parsed = dateutil.parser.parse(input_date_time)
if parsed1 <= input_parsed <= parsed2:
    print('input is between')
This checks whether the input is between parsed1 and parsed2.
Assuming you have Python datetime objects,
two objects can be compared with the "<", "==", and ">" operators.
You don't need to parse them to compare them.
if date_time1 <= input_date_time <= date_time2:
    # do work
    pass
If you don't have datetime objects, the datetime module also has a class called datetime, which lets you create them, if you find that useful.
You need to apply a timezone to the 'naive' datetime object (2017-04-15 00:00:00 in your example) to make it TZ-aware, OR convert the 'aware' datetime objects (2017-04-17 15:35:19+00:00 in your example, and the date you are trying to compare) to 'naive' objects.
Then your TypeError will disappear.
Since your second date has a timezone offset of +00:00 and your input_datetime is also +00:00, let's apply UTC to the naive first date (assuming that it's the correct timezone) and then convert it to whatever timezone you need (you can skip the conversion if UTC is correct - the comparison will now work.)
import dateutil.parser
import pytz
parsed1 = dateutil.parser.parse(date_time1)
parsed2 = dateutil.parser.parse(date_time2)
# make parsed1 timezone aware (UTC)
parsed1 = parsed1.replace(tzinfo=pytz.utc)
Now your comparison should work.
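The same idea also works without third-party packages. The sketch below swaps dateutil and pytz for datetime.fromisoformat and timezone.utc from the standard library, and assumes (as above) that a value without an offset is meant to be UTC. The helper name parse_as_utc is hypothetical.

```python
from datetime import datetime, timezone

date_time1 = "2017-04-15 00:00:00"
date_time2 = "2017-04-17 15:35:19+00:00"
input_date_time = "2017-04-16 12:11:42+00:00"


def parse_as_utc(text: str) -> datetime:
    dt = datetime.fromisoformat(text)
    if dt.tzinfo is None:
        # assumption: values without an offset are already in UTC
        dt = dt.replace(tzinfo=timezone.utc)
    return dt


start, end, candidate = (
    parse_as_utc(t) for t in (date_time1, date_time2, input_date_time)
)
in_range = start <= candidate <= end
print(in_range)  # True
```

Once all three values are aware, the chained comparison no longer raises the TypeError.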
If you want to apply another timezone to any of the dates, you can use the astimezone method. Let's change the timezone to the one applicable to Sydney, Australia. Here is a list of timezones: https://gist.github.com/heyalexej/8bf688fd67d7199be4a1682b3eec7568
syd_tz = pytz.timezone('Australia/Sydney')
syd_parsed1 = parsed1.astimezone(syd_tz)
You can now check which timezone is applied to each of your datetime objects using the %z and %Z directives for strftime. Using %c prints the date and time in the locale's format, as do %x and %X for the date and the time respectively.
Using Python3+:
print("Local time: %s" % syd_parsed1.strftime('%c'))
print("Offset-Timezone-Date-Time: %s" % syd_parsed1.strftime("%z-%Z-%x-%X"))
Hope that helps, the timezone functions did my head in when I used them the first time when I didn't know about %c.
I want to convert 2014-08-14 20:01:28.242 into a Unix timestamp such as 245293529385 and subtract it from the current timestamp to work out how many days have passed, and ultimately how many remain, by subtracting that value from 14.
Scenario: user signs up and I want to count down the number of days remaining in their trial.
time.strptime to the rescue! Use the format string %Y-%m-%d %H:%M:%S.%f. For example:
import time
t = '2014-08-14 20:01:28.242'
ts = time.strptime(t, '%Y-%m-%d %H:%M:%S.%f')
timestamp = time.mktime(ts)
Now to convert it to a datetime (from: How do you convert a Python time.struct_time object into a datetime object? ):
from datetime import datetime
dt = datetime.fromtimestamp(timestamp)
There are two parts:
Convert input time string into datetime object
#!/usr/bin/env python
from datetime import datetime
dt = datetime.strptime('2014-08-14 20:01:28.242', '%Y-%m-%d %H:%M:%S.%f')
Convert datetime object to Unix time ("seconds since epoch")
The result depends on what time zone is used for the input time e.g., if the input is in UTC then the corresponding POSIX timestamp is:
timestamp = (dt - datetime(1970,1,1)).total_seconds()
# -> 1408046488.242
If your input is in the local timezone then see How do I convert local time to UTC in Python?
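For the trial countdown itself, no Unix timestamp is strictly needed, because two datetime objects can be subtracted directly. This is a minimal sketch, assuming the sign-up time and "now" are both naive and in the same time zone. The function name days_remaining is made up, and the 14-day trial length comes from the question.

```python
from datetime import datetime

TRIAL_DAYS = 14  # trial length taken from the question


def days_remaining(signup: str, now: datetime) -> int:
    started = datetime.strptime(signup, "%Y-%m-%d %H:%M:%S.%f")
    elapsed = now - started  # both naive and assumed to share a time zone
    # whole days used so far, never reporting a negative remainder
    return max(TRIAL_DAYS - elapsed.days, 0)


print(days_remaining("2014-08-14 20:01:28.242", datetime(2014, 8, 20, 12, 0, 0)))
# 9
```

In production code, pass datetime.now() (or datetime.utcnow() if sign-up times are stored in UTC) as the second argument.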