I'm trying to import CSV data from a file produced by a device whose system clock is set to 'Australia/Adelaide' time but doesn't switch from standard to daylight time in summer. I can import it fine as tz-naive, but I need to correlate it with data that is tz-aware.
The following is incorrect, as it assumes the data transitions to summer time on '2017-10-01':
import pandas as pd

data = pd.read_csv('~/dev/datasets/data.csv', parse_dates=['timestamp'], index_col=['timestamp'])
data.index = data.index.tz_localize('Australia/Adelaide')
tz_localize accepts a number of arguments to deal with ambiguous dates, but I don't see any way to tell it that the data doesn't transition at all. Is there a way to specify a "custom" timezone that's 'Australia/Adelaide' with no daylight saving?
Edit: I found this question - Create New Timezone in pytz - which has given me some ideas. In this case the timestamps are a constant offset from UTC, so I can probably add that to the date after importing, localise as UTC, then convert to 'Australia/Adelaide'. I'll report back...
The solution I came up with is as follows:
Since the data is 'Australia/Adelaide' with no DST transition, the UTC offset is a constant (+10:30) all year. Hence a solution is to import the data as tz-naive, subtract 10 hours and 30 minutes, localise as UTC, then convert to 'Australia/Adelaide', i.e.
data = pd.read_csv('~/dev/datasets/data.csv', parse_dates=['timestamp'], index_col=['timestamp'])
data.index = data.index - pd.DateOffset(hours=10) - pd.DateOffset(minutes=30)
data.index = data.index.tz_localize('UTC').tz_convert('Australia/Adelaide')
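Alternatively, pandas can localise directly to a fixed UTC offset, which avoids the manual subtraction - a minimal sketch, using the same constant +10:30 offset assumed above:

import pandas as pd
import pytz

data = pd.read_csv('~/dev/datasets/data.csv', parse_dates=['timestamp'], index_col=['timestamp'])
# Localise to the constant +10:30 offset (630 minutes east of UTC),
# then convert to the named zone.
data.index = (data.index
              .tz_localize(pytz.FixedOffset(630))
              .tz_convert('Australia/Adelaide'))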
Related
I have an array of Unix epoch timestamps that I need to convert to datetime format in Python.
I can make the conversion ok using numpy and pandas, as below:
import numpy as np
import pandas as pd

tim = [1627599600, 1627599600, 1627599601, 1627649998, 1627649998, 1627649999]
tim_np = np.asarray(tim, dtype='datetime64[s]')
tim_pd = pd.to_datetime(tim, unit='s', utc=True)
print(tim_np)
print(tim_pd)
The problem I am running into is that the time zone is wrong: I am in NY, so I need it set to "EST".
I tried addressing this by setting utc=True in the pd.to_datetime function, but it still keeps defaulting to "GMT" (5 hours ahead).
I also tried datetime.fromtimestamp, but it seemingly only works on single elements and not arrays - https://note.nkmk.me/en/python-unix-time-datetime/
Is there any efficient method to set the time zone when converting epochs?
Found that this can be achieved with pandas using the .tz_* methods:
.tz_localize - https://pandas.pydata.org/docs/reference/api/pandas.Series.dt.tz_localize.html
.tz_convert - https://pandas.pydata.org/docs/reference/api/pandas.Series.dt.tz_convert.html
Working code:
tim = [1627599600, 1627599600, 1627599601, 1627649998, 1627649998, 1627649999]
tim_pd = (pd.to_datetime(tim, unit='s')
            .tz_localize('UTC')
            .tz_convert('America/New_York'))
print(tim_pd)
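As an aside, the original attempt was close: utc=True already produces a tz-aware (UTC) result, so a single tz_convert finishes the job. A one-step sketch of the same conversion:

import pandas as pd

tim = [1627599600, 1627599600, 1627599601, 1627649998, 1627649998, 1627649999]
print(pd.to_datetime(tim, unit='s', utc=True).tz_convert('America/New_York'))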
Related
The question in the title may look familiar, as I could see a lot of example blog posts and SO posts about it. However, I couldn't find a question similar to the issue I am facing. I have a netcdf file in which the variable time has a single data value, 10643385. The unit of this time variable is minutes since 2000-01-01 00:00:00, which is different from many examples I found on the internet. I am also aware that the actual value of time is 27-03-2020 05:45. My query is: how do I get this epoch value into a datetime format like '27-03-2020 05:45'? Here is the sample code I have been trying, which results in the reference datetime rather than the actual datetime of the file:
print(datetime.datetime.fromtimestamp(int(epoch_time_value)).strftime('%Y-%m-%d %H:%M:%S'))
The above single line of code results in 1970-05-04 09:59:45. Can someone help me get the correct date?
import datetime
t = datetime.datetime(2000, 1, 1) + datetime.timedelta(minutes=10643385)
outputs
datetime.datetime(2020, 3, 27, 5, 45)
Python epoch time is in seconds, so we must first convert the value to seconds by multiplying by 60.
The Python epoch starts on 1 Jan 1970. Since the netcdf reference starts at 2000-01-01, we must adjust by adding the number of seconds from 1970 to 2000 (which happens to be 946684800).
Putting these together we get:
>>> import datetime
>>> epoch_time_value = 10643385
>>> _epoch_time_value = epoch_time_value * 60 + 946684800
>>> print(datetime.datetime.fromtimestamp(int(_epoch_time_value)).strftime('%Y-%m-%d %H:%M:%S'))
2020-03-26 22:45:00
The output is shifted because datetime.fromtimestamp converts to your local timezone, so make sure the timezones in your calculations are synced when you do this!
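If you want the result in UTC regardless of the machine's local zone, a small variation on the above (using the same _epoch_time_value) does it:

>>> print(datetime.datetime.utcfromtimestamp(int(_epoch_time_value)).strftime('%Y-%m-%d %H:%M:%S'))
2020-03-27 05:45:00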
Related
One column of a CSV file includes both the time and the time zone.
Here is one value under the column: 2018-05-20 15:05:51.065 America/New_York. I wonder how I can convert the value to the 2019-05-20 format? There are over half a million rows in the CSV file.
Split your column into date, time, and zone using string manipulation, regex, etc., and pick a standard time zone to normalise to (e.g. UTC).
Then get the time difference between the zone and UTC as described here:
How to convert string timezones in form (Country/city) into datetime.tzinfo
Apply that difference to the time you have already split out, and adjust the date if the result crosses midnight.
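For half a million rows, a vectorised pandas version of this idea may be faster. A minimal sketch, assuming every row shares the same zone and a column named 'timestamp' (adjust the names to your CSV):

import pandas as pd

df = pd.DataFrame({'timestamp': ['2018-05-20 15:05:51.065 America/New_York']})

# Split each value into the datetime part and the zone part.
parts = df['timestamp'].str.rsplit(' ', n=1, expand=True)
dt = pd.to_datetime(parts[0], format='%Y-%m-%d %H:%M:%S.%f')

# Localise the whole column at once, then convert to UTC.
utc = dt.dt.tz_localize('America/New_York').dt.tz_convert('UTC')
print(utc.dt.strftime('%Y-%m-%d'))  # date-only strings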
If you just want the date as a string, strip away everything past the first space:
"2018-05-20 15:05:51.065 America/New_York".split(' ')[0]
EDIT:
If you want it to be a timezone-aware datetime object, you can do it easily with the pytz package:
from datetime import datetime
from pytz import timezone

string_date = "2018-05-20 15:05:51.065 America/New_York"
tz = timezone(string_date.split(' ')[-1])
unaware = " ".join(string_date.split(' ')[:-1])
unaware_datetime = datetime.strptime(unaware, "%Y-%m-%d %H:%M:%S.%f")
# Use localize() rather than replace(tzinfo=...): pytz timezones attach
# the wrong (LMT) offset when passed directly to replace().
aware_datetime = tz.localize(unaware_datetime)
Related
Let's assume that I have the following data:
25/01/2000 05:50
When I convert it using datetime.toordinal, it returns this value:
730144
That's nice, but this value only considers the date itself. I also want it to take the hour and minutes (05:50) into account. How can I do that using datetime?
EDIT:
I want to convert a whole Pandas Series.
An ordinal date by definition considers only the year and the day of the year, i.e. its resolution is one day.
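If you need a number on the same ordinal scale that keeps the time of day, one option (a sketch, not a standard datetime API) is to add the elapsed fraction of the day:

from datetime import datetime

dt = datetime.strptime('25/01/2000 05:50', '%d/%m/%Y %H:%M')
# Whole days plus the fraction of the current day already elapsed.
frac = (dt - dt.replace(hour=0, minute=0)).total_seconds() / 86400
print(dt.toordinal() + frac)  # 730144.2430555555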
Alternatively, you can get the seconds since the epoch (as a float with sub-second precision) using
datetime.datetime.strptime('25/01/2000 05:50', '%d/%m/%Y %H:%M').timestamp()
For a pandas Series you can do
import pandas as pd

s = pd.Series(['25/01/2000 05:50', '25/01/2000 05:50', '25/01/2000 05:50'])
s = pd.to_datetime(s, format='%d/%m/%Y %H:%M')  # make sure you're dealing with datetime instances
s.apply(lambda v: v.timestamp())
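For a large Series, a vectorised alternative avoids the Python-level apply (note that pandas treats tz-naive timestamps as UTC here, while datetime.timestamp() assumes local time):

epoch_seconds = (s - pd.Timestamp('1970-01-01')) / pd.Timedelta(seconds=1)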
If you use Python 3.x, you can get the date and time as seconds since 1970-01-01 00:00:
from datetime import datetime
dt = datetime.today() # Get timezone naive now
seconds = dt.timestamp()
Related
I have a CSV with epoch GMT timestamps at irregular intervals, paired with a value. I tried reading it from the CSV, but all the timestamps are shifted to my local timezone. How can I make it read in as-is (in GMT)? Then I would like to resample to one-minute intervals; HOWEVER, I would like to skip gaps which are larger than a user-specified value. If that is not possible, is there a way to resample to one minute, but put an arbitrary value like 0.0 in the gaps?
Data:
Time,Data
1354979750250,0.2343
1354979755250,2.3433
1354979710250,1.2343
from datetime import datetime
from dateutil.parser import parse
from dateutil.tz import tzutc
from pandas import read_csv

def date_utc(s):
    return parse(s, tzinfos=tzutc)

x = read_csv("time2.csv", date_parser=date_utc, converters={'Time': lambda x: datetime.fromtimestamp(int(x)/1000.)}).set_index('Time')
Convert a local datetime to a GMT datetime like this:
gmtDatetime = localdatetime - datetime.timedelta(hours=8)
(Here the local time zone is UTC+8, China.)
Or use 'datetime.utcfromtimestamp':
classmethod datetime.utcfromtimestamp(timestamp)
Return the UTC datetime corresponding to the POSIX timestamp, with
tzinfo None. This may raise OverflowError, if the timestamp is out of
the range of values supported by the platform C gmtime() function, and
OSError on gmtime() failure. It’s common for this to be restricted to
years in 1970 through 2038. See also fromtimestamp().
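Applying this to the CSV above, a sketch (column names taken from the sample data; the resample step is one way to fill the gaps with 0.0 as asked):

import pandas as pd
from datetime import datetime

df = pd.read_csv(
    "time2.csv",
    converters={'Time': lambda ms: datetime.utcfromtimestamp(int(ms) / 1000.0)},
).set_index('Time')

# Resample to one-minute intervals; gaps become NaN, which can then be
# replaced with an arbitrary value such as 0.0.
df = df.resample('1min').mean().fillna(0.0)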