I get the following error for my timestamp field when converting from UTC to CET, due to the transition to daylight saving time last Saturday/Sunday:
AmbiguousTimeError: Cannot infer dst time from 2020-07-31 11:17:18+00:00, try using the 'ambiguous' argument
# converting timestamp fields to CET (Europe/Berlin)
df['timestamp_berlin_time'] = df['timestamp'].dt.tz_localize('Europe/Berlin')
I tried the following snippet:
df['timestamp_berlin_time'] = df['timestamp'].dt.tz_localize('CET', ambiguous='infer')
but this then gives me this error:
AmbiguousTimeError: 2020-07-31 11:17:18+00:00
Data sample:
0 2020-07-31 11:17:18+00:00
1 2020-07-31 11:17:18+00:00
2 2020-08-31 16:26:42+00:00
3 2020-10-20 07:28:46+00:00
4 2020-10-01 22:11:33+00:00
Name: timestamp, dtype: datetime64[ns, UTC]
If your input represents UTC but the timezone isn't set yet (i.e. the datetimes are naive), you can localize to UTC first, e.g.:
df['timestamp'] = df['timestamp'].dt.tz_localize('UTC')
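For instance, a minimal sketch of the full chain for a naive series, using a naive version of the first timestamp from your data sample:
import pandas as pd
s_naive = pd.Series(pd.to_datetime(['2020-07-31 11:17:18']))
s_naive.dt.tz_localize('UTC').dt.tz_convert('Europe/Berlin')
# 0   2020-07-31 13:17:18+02:00
# dtype: datetime64[ns, Europe/Berlin]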
If your input is already tz-aware UTC (as in your data sample, dtype datetime64[ns, UTC]), you can simply tz_convert, e.g.:
s = pd.Series(pd.to_datetime(['2020-10-25 00:40:03.925000',
                              '2020-10-25 01:40:03.925000',
                              '2020-10-25 02:40:03.925000'], utc=True))
s.dt.tz_convert('Europe/Berlin')
# 0 2020-10-25 02:40:03.925000+02:00
# 1 2020-10-25 02:40:03.925000+01:00
# 2 2020-10-25 03:40:03.925000+01:00
# dtype: datetime64[ns, Europe/Berlin]
If your input timestamps represent local time (here: Europe/Berlin time zone), you can try to infer the DST transition based on order:
s = pd.Series(pd.to_datetime(['2020-10-25 02:40:03.925000',
                              '2020-10-25 02:40:03.925000',
                              '2020-10-25 03:40:03.925000']))
s.dt.tz_localize('Europe/Berlin', ambiguous='infer')
# 0 2020-10-25 02:40:03.925000+02:00
# 1 2020-10-25 02:40:03.925000+01:00
# 2 2020-10-25 03:40:03.925000+01:00
# dtype: datetime64[ns, Europe/Berlin]
Note: CET is a time zone abbreviation, not a time zone in the geographical sense. pytz can handle some of these abbreviations for historical reasons, but don't count on it. In any case, it might give you a static UTC offset, which is not what you want if you expect DST transitions to be taken into account; prefer the IANA name 'Europe/Berlin'.
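Putting this together: since your df['timestamp'] column is already tz-aware UTC (dtype datetime64[ns, UTC]), a minimal sketch of the fix would just be:
# no tz_localize needed; the column is already UTC-aware,
# and tz_convert handles the CET/CEST (DST) transitions of Europe/Berlin
df['timestamp_berlin_time'] = df['timestamp'].dt.tz_convert('Europe/Berlin')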
Related
I have a dataframe with columns:
time: time in UTC format
timezone: the corresponding timezone.
time timezone
0 2022-12-28T20:16:31.373Z Europe/Athens
1 2022-07-28T20:16:31.373Z Europe/Athens
2 2022-11-01T21:35:35.865Z Europe/Dublin
3 2022-08-03T19:44:07.611Z America/Los_Angeles
4 2022-08-02T12:44:44.360Z Europe/Minsk
I want to:
Convert the UTC time to local time (using the timezone column)
Remove the Timezone and just keep the datetime
It seems to me that this solution works, but I want to make sure that I am not missing something (e.g. that it doesn't handle daylight saving time correctly).
import pandas as pd

# example dataframe
df = pd.DataFrame({
    'time': ['2022-12-28T20:16:31.373Z', '2022-07-28T20:16:31.373Z', '2022-11-01T21:35:35.865Z', '2022-08-03T19:44:07.611Z', '2022-08-02T12:44:44.360Z'],
    'timezone': ['Europe/Athens', 'Europe/Athens', 'Europe/Dublin', 'America/Los_Angeles', 'Europe/Minsk']
})

# function
def get_local_time(timestamp: pd.Timestamp, timezone: str) -> pd.Timestamp:
    timestamp = pd.to_datetime(timestamp).tz_convert(timezone).replace(tzinfo=None)
    return timestamp

df['local_time'] = df.apply(lambda row: get_local_time(row['time'], row['timezone']), axis=1).dt.round(freq='S')
print(df)
---
OUT:
time timezone local_time
0 2022-12-28T20:16:31.373Z Europe/Athens 2022-12-28 22:16:31
1 2022-07-28T20:16:31.373Z Europe/Athens 2022-07-28 23:16:31
2 2022-11-01T21:35:35.865Z Europe/Dublin 2022-11-01 21:35:36
3 2022-08-03T19:44:07.611Z America/Los_Angeles 2022-08-03 12:44:08
4 2022-08-02T12:44:44.360Z Europe/Minsk 2022-08-02 15:44:44
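For what it's worth, a quick check with the two Europe/Athens rows from the sample (a minimal sketch) suggests that tz_convert does apply the seasonal offsets from the IANA tz database:
import pandas as pd
# winter (EET, +02:00):
pd.Timestamp('2022-12-28T20:16:31.373Z').tz_convert('Europe/Athens')
# Timestamp('2022-12-28 22:16:31.373000+0200', tz='Europe/Athens')
# summer (EEST, +03:00):
pd.Timestamp('2022-07-28T20:16:31.373Z').tz_convert('Europe/Athens')
# Timestamp('2022-07-28 23:16:31.373000+0300', tz='Europe/Athens')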
I've got some date and time data as a string that is formatted like this, in UTC:
,utc_date_and_time, api_calls
0,2022-10-20 00:00:00,12
1,2022-10-20 00:05:00,14
2,2022-10-20 00:10:00,17
Is there a way to create another column here that always represents that time, but converted to Europe/London?
,utc_date_and_time, api_calls, london_date_and_time
0,2022-10-20 00:00:00,12,2022-10-20 01:00:00
1,2022-10-20 00:05:00,14,2022-10-20 01:05:00
2,2022-10-20 00:10:00,17,2022-10-20 01:10:00
I want to write some code that, for any time of the year, will display the time in London, but I'm worried that my code will break when the clocks change in the UK.
With pandas, you'd convert to datetime, specify UTC, and then call tz_convert:
df
Out[9]:
utc_date_and_time api_calls
0 2022-10-20 00:00:00 12
1 2022-10-20 00:05:00 14
2 2022-10-20 00:10:00 17
df["utc_date_and_time"] = pd.to_datetime(df["utc_date_and_time"], utc=True)
df["london_date_and_time"] = df["utc_date_and_time"].dt.tz_convert("Europe/London")
df
Out[12]:
utc_date_and_time api_calls london_date_and_time
0 2022-10-20 00:00:00+00:00 12 2022-10-20 01:00:00+01:00
1 2022-10-20 00:05:00+00:00 14 2022-10-20 01:05:00+01:00
2 2022-10-20 00:10:00+00:00 17 2022-10-20 01:10:00+01:00
In vanilla Python >= 3.9, you'd let zoneinfo handle the conversion:
from datetime import datetime
from zoneinfo import ZoneInfo
t = "2022-10-20 00:00:00"
# to datetime, set UTC
dt = datetime.fromisoformat(t).replace(tzinfo=ZoneInfo("UTC"))
# to london time
dt_london = dt.astimezone(ZoneInfo("Europe/London"))
print(dt_london)
2022-10-20 01:00:00+01:00
You should use the UTC timezone:
from datetime import datetime, timezone
datetime.now(timezone.utc).isoformat()
Outputs:
2022-10-25T15:27:08.874057+00:00
I'm trying to convert an object-type variable to datetime. My df['Time'] column looks like this:
0 13:08:00
1 10:29:00
2 13:23:00
3 20:33:00
4 10:37:00
When I run pd.to_datetime(df['Time']), I get:
Error: <class 'datetime.time'> is not convertible to datetime
How can I convert this to datetime and merge it with a date variable?
What you have are datetime.time objects, as the error tells you. You can use their string representation and parse it to pandas datetime or timedelta, depending on your needs. Here are three options, for example:
import datetime
import pandas as pd
df = pd.DataFrame({'Time': [datetime.time(13,8), datetime.time(10,29), datetime.time(13,23)]})
# 1)
# use string representation and parse to datetime
# (the date part defaults to the day the code is run):
pd.to_datetime(df['Time'].astype(str))
# 0 2022-01-19 13:08:00
# 1 2022-01-19 10:29:00
# 2 2022-01-19 13:23:00
# Name: Time, dtype: datetime64[ns]
# 2)
# add as timedelta to a certain date:
pd.Timestamp('2020-1-1') + pd.to_timedelta(df['Time'].astype(str))
# 0 2020-01-01 13:08:00
# 1 2020-01-01 10:29:00
# 2 2020-01-01 13:23:00
# Name: Time, dtype: datetime64[ns]
# 3)
# add the cumulated sum of the timedelta to a starting date:
pd.Timestamp('2020-1-1') + pd.to_timedelta(df['Time'].astype(str)).cumsum()
# 0 2020-01-01 13:08:00
# 1 2020-01-01 23:37:00
# 2 2020-01-02 13:00:00
# Name: Time, dtype: datetime64[ns]
df['col'] = df['col'].astype('datetime64[ns]')
This worked for me.
The starting date format I currently have is 2019-09-04 16:00 UTC+3 and I'm trying to convert it into a datetime format of 2019-09-04 16:00:00+0300.
The format I thought would work was format='%Y-%m-%d %H:%M %Z%z', but when I run it I get the error message ValueError: Cannot parse both %Z and %z.
Does anyone know the correct format to use, or should I be trying a different method altogether? Thanks.
Edit
Sorry, I had a hard time putting into words what it is I am looking to do, hopefully I can clarify.
I'm looking to change all the date and times in a dataframe into the datetime format.
This is the method I was trying to use, which presented me with an error:
df['datepicker'] = pd.to_datetime(df['datepicker'], format='%Y-%m-%d %H:%M %Z%z')
And here is a sample of the data I currently have.
datepicker
2019-09-07 16:00 UTC+2
2019-09-04 18:30 UTC+4
2019-09-06 17:00 UTC±0
2019-09-10 16:00 UTC+1
2019-09-04 18:00 UTC+3
And this is what I'm looking to convert them into, a timestamp format.
datepicker
2019-09-07 16:00:00+0200
2019-09-04 18:30:00+0400
2019-09-06 17:00:00+0000
2019-09-10 16:00:00+0100
2019-09-04 18:00:00+0300
pandas.to_datetime should parse this happily if you tweak the strings slightly:
import pandas as pd
df = pd.DataFrame({"datepicker": ["2019-09-07 16:00 UTC+2", "2019-09-04 18:30 UTC+4",
                                  "2019-09-06 17:00 UTC±0", "2019-09-10 16:00 UTC+1",
                                  "2019-09-04 18:00 UTC+3"]})
df['datetime'] = pd.to_datetime(df['datepicker'].str.replace('±', '+'))
# df['datetime']
# 0 2019-09-07 16:00:00-02:00
# 1 2019-09-04 18:30:00-04:00
# 2 2019-09-06 17:00:00+00:00
# 3 2019-09-10 16:00:00-01:00
# 4 2019-09-04 18:00:00-03:00
# Name: datetime, dtype: object
Note that due to the mixed UTC offsets, the column's data type is 'object' (datetime objects). If you wish, you can also convert to UTC straight away, to get a column of dtype datetime64[ns, UTC]:
df['UTC'] = pd.to_datetime(df['datepicker'].str.replace('±', '+'), utc=True)
# df['UTC']
# 0 2019-09-07 18:00:00+00:00
# 1 2019-09-04 22:30:00+00:00
# 2 2019-09-06 17:00:00+00:00
# 3 2019-09-10 17:00:00+00:00
# 4 2019-09-04 21:00:00+00:00
# Name: UTC, dtype: datetime64[ns, UTC]
When I define it as below, it works as you expect:
from datetime import datetime, timedelta, timezone
UTC = timezone(timedelta(hours=+3))
dt = datetime(2019, 1, 1, 12, 0, 0, tzinfo=UTC)
timestampStr = dt.strftime("%Y-%m-%d %H:%M %Z%z")
print(timestampStr)
With the output of:
2019-01-01 12:00 UTC+03:00+0300
I have a dataframe that has entries like this, where the times are in UTC:
start_date_time timezone
1 2017-01-01 14:00:00 America/Los_Angeles
2 2017-01-01 14:00:00 America/Denver
3 2017-01-01 14:00:00 America/Phoenix
4 2017-01-01 14:30:00 America/Los_Angeles
5 2017-01-01 14:30:00 America/Los_Angeles
I need to be able to group by date (local date, not UTC date) and I need to be able to create indicators for whether the event happened between certain times (local times, not UTC times).
I have successfully done the above in R by:
Creating a time variable in each of the timezones
Converting those to strings
Pulling each of the string date/time variables into one column (which one I pull depends on the appropriate timezone)
Then, splitting that column to get a string date column and a string time column
I can then convert everything back to datetime objects for comparisons; e.g. now I can say whether something happened between 2 and 3 pm, and it will correctly identify everything that happened between 2 and 3 pm locally.
I have tried a bunch of things in Python and have the dates as
2017-01-02 04:30:00-08:00
but I can't figure out how to go from there to
2017-01-01 20:30:00
Thanks!
Your example is incorrect: your timezone is eight hours behind UTC, which means you need to add eight hours to 4:30 AM, giving 12:30 PM UTC.
The datetime method astimezone(...) will do the conversion for you. For ease of use, I recommend pytz.
However, in pure Python:
import datetime as dt
local_tz = dt.timezone(dt.timedelta(hours=-8))
utc = dt.timezone.utc
d = dt.datetime(2017, 1, 2, 4, 30, 0, 0, local_tz)
print(d, d.astimezone(utc))
Will print:
2017-01-02 04:30:00-08:00 2017-01-02 12:30:00+00:00
Here's an example using pytz to look up time zones:
import datetime as dt
import pytz
dates = [("2017-01-01 14:00:00", "America/Los_Angeles"),
         ("2017-01-01 14:00:00", "America/Denver"),
         ("2017-01-01 14:00:00", "America/Phoenix"),
         ("2017-01-01 14:30:00", "America/Los_Angeles"),
         ]
for d, tz_str in dates:
    start = dt.datetime.strptime(d, "%Y-%m-%d %H:%M:%S")
    start = start.replace(tzinfo=pytz.utc)
    local_tz = pytz.timezone(tz_str)  # convert to desired timezone
    print(start, local_tz.zone, "\t", start.astimezone(local_tz))
This produces:
2017-01-01 14:00:00+00:00 America/Los_Angeles 2017-01-01 06:00:00-08:00
2017-01-01 14:00:00+00:00 America/Denver 2017-01-01 07:00:00-07:00
2017-01-01 14:00:00+00:00 America/Phoenix 2017-01-01 07:00:00-07:00
2017-01-01 14:30:00+00:00 America/Los_Angeles 2017-01-01 06:30:00-08:00
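If you need this across the whole dataframe from the question rather than for individual datetimes, a minimal pandas sketch (assuming the start_date_time and timezone column names from the question, with start_date_time given in UTC) could look like this:
import pandas as pd

df = pd.DataFrame({
    'start_date_time': ['2017-01-01 14:00:00', '2017-01-01 14:00:00',
                        '2017-01-01 14:00:00', '2017-01-01 14:30:00'],
    'timezone': ['America/Los_Angeles', 'America/Denver',
                 'America/Phoenix', 'America/Los_Angeles'],
})

# parse as tz-aware UTC
df['start_date_time'] = pd.to_datetime(df['start_date_time'], utc=True)

# convert each timezone group to its local time, then drop the tz info so the
# local dates/times can be grouped and compared as naive values
parts = []
for tz, grp in df.groupby('timezone'):
    parts.append(grp['start_date_time'].dt.tz_convert(tz).dt.tz_localize(None))
df['local_time'] = pd.concat(parts)
df['local_date'] = df['local_time'].dt.date
Converting per timezone group keeps the work vectorized instead of row-by-row, and it stays DST-aware because it goes through the IANA zone names.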