I'm working with CMIP5 data that has time units of "days since 1-1-1850". To find the date for a given datapoint in the file, I would normally just add a timedelta of the datapoint's time value (in days) to 1-1-1850. However, CMIP5 (or at least the file I'm using) uses a 'noleap' calendar, meaning that all years are only 365 days.
In my current case, when dealing with the datapoint that corresponds to January 1, 1980, I add its time value of 47450 days to the original date of January 1, 1850. However, I get back an answer of December 1, 1979, because datetime counts all the Feb. 29ths between 1850 and 1980 that the noleap calendar excludes. Is there an additional argument in timedelta, or in datetime in general, that deals with calendars that exclude leap days?
netCDF4's num2date is the function you are looking for:
import netCDF4
ncfile = netCDF4.Dataset('./foo.nc', 'r')
time = ncfile.variables['time'] # note that we do not cast to numpy array yet
time_convert = netCDF4.num2date(time[:], time.units, time.calendar)
Note that the CMIP5 models do not all use the standard calendar, so the time.calendar argument is important to include when doing this conversion.
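For example, with the numbers from the question (a standalone call for illustration, not tied to any particular file), the 'noleap' calendar gives back the expected date:
import netCDF4
# 47450 days since 1850-01-01 in a 'noleap' calendar, as in the question
d = netCDF4.num2date(47450, units='days since 1850-01-01', calendar='noleap')
print(d)  # 1980-01-01 00:00:00 -- a cftime object, not a plain datetime.datetime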
Related
I'm working with some telemetry that uses timestamps measured in hours since January 1st at midnight of the current year.
So I get value 1 at time 8668.12034
I'd like to convert it to a more useful date format, and of course I've been doing so with hardcoded math: dividing into days, remainder hours, minutes, etc., accounting for leap years... It works, but I'm sure there's a simpler way using the datetime library or something, right?
I'm thinking timedelta is the way to go, since it gives me a delta from the beginning of the year, but does it account for leap years?
Curious how others would approach this issue, thanks for any advice.
# import packages we need
import datetime
From elapsed hours to datetime.datetime object
You can for example do:
hours_elapsed = 1000
your_date = datetime.datetime(2020,1,1,0,0)+datetime.timedelta(hours=hours_elapsed)
(Of course change hours_elapsed to whatever hours elapsed in your case.)
your_date will be: datetime.datetime(2020, 2, 11, 16, 0)
Yes, timedelta does know about leap years.
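As a quick sanity check, 2020 was a leap year, and adding 24 hours to Feb 28 lands on Feb 29 rather than Mar 1:
datetime.datetime(2020, 2, 28) + datetime.timedelta(hours=24)
# datetime.datetime(2020, 2, 29, 0, 0)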
Further processing
If you want to process this further, you can do so using getattr():
timeunits = ['year', 'month', 'day', 'hour', 'minute', 'second']
[getattr(your_date, timeunit) for timeunit in timeunits]
Resulting in:
[2020, 2, 11, 16, 0, 0]
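Alternatively, if a formatted string is all you need, strftime does the same in one call:
your_date.strftime('%Y-%m-%d %H:%M:%S')
# '2020-02-11 16:00:00'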
I have a program that generates datetimes in several formats, like below.
1 day, 21:21:00.561566
11:19:26.056148
They may also be in a month or year format, and I want to know whether there is any way to add up all of these times that I get from the program.
- 1 day, 21:21:00.561566 is the string representation of a datetime.timedelta object. If you need to parse from string to timedelta, pandas has a suitable method. There are other third party parsers; I'm just using this one since pandas is quite common.
import pandas as pd
td = pd.to_timedelta('- 11:19:26.056148')
# Timedelta('-1 days +12:40:33.943852')
td.total_seconds()
# -40766.056148
If you need to find the sum of multiple timedelta values, you can sum up their total_seconds and convert them back to timedelta:
td_strings = ['- 1 day, 21:21:00.561566', '- 11:19:26.056148']
td_sum = pd.Timedelta(seconds=sum([pd.to_timedelta(s).total_seconds() for s in td_strings]))
td_sum
# Timedelta('-1 days +10:01:34.505418')
...or leverage some tools from the Python standard lib:
from functools import reduce
from operator import add
td_sum = reduce(add, map(pd.to_timedelta, td_strings))
# Timedelta('-1 days +10:01:34.505418')
td_sum.total_seconds()
# -50305.494582
You can subtract datetimes, as in the answer linked here, to find how far apart two times are:
https://stackoverflow.com/a/1345852/2415706
Adding two dates doesn't really make any sense though. Like, if you try to add Jan 1st of 2020 to Jan 1st of 1995, what are you expecting?
You can use the datetime.timedelta class for this purpose.
You can find the documentation here.
You will need to parse your string and build a timedelta object.
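If you'd rather stay in the standard library, here is a minimal sketch of such a parser for strings shaped like the ones above; parse_td is a hypothetical helper, and its format assumptions (optional leading '-', optional 'N day(s),' part, then HH:MM:SS.ffffff) may need adjusting for your actual input:
import datetime

def parse_td(s):
    # hypothetical parser for strings like '- 1 day, 21:21:00.561566'
    s = s.strip()
    sign = 1
    if s.startswith('-'):
        sign = -1
        s = s.lstrip('- ')
    days = 0
    if 'day' in s:
        day_part, s = s.split(',')
        days = int(day_part.split()[0])
        s = s.strip()
    hours, minutes, seconds = s.split(':')
    td = datetime.timedelta(days=days, hours=int(hours),
                            minutes=int(minutes), seconds=float(seconds))
    return sign * td

print(parse_td('1 day, 21:21:00.561566'))  # 1 day, 21:21:00.561566
print(parse_td('- 11:19:26.056148'))       # -1 day, 12:40:33.943852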
I'm trying to import CSV data from a file produced by a device which has a system clock which is set to 'Australia/Adelaide' time, but doesn't switch from standard to daylight time in summer. I can import it no problem as tz-naive but I need to correlate it with data which is tz-aware.
The following is incorrect, as it assumes the data transitions to summer time on '2017-10-01':
data = pd.read_csv('~/dev/datasets/data.csv', parse_dates=['timestamp'], index_col=['timestamp'])
data.index.tz_localize('Australia/Adelaide')
tz_localize contains a number of arguments to deal with ambiguous dates - but I don't see any way to tell it that the data doesn't transition at all. Is there a way to specify a "custom" timezone that's 'Australia/Adelaide', no daylight savings?
Edit: I found this question - Create New Timezone in pytz - which has given me some ideas. In this case the timestamps are a constant offset from UTC, so I can probably add that to the date after importing, localise as UTC, then convert to 'Australia/Adelaide'. I'll report back...
The solution I came up with is as follows:
Since the data is 'Australia/Adelaide' with no DST transition, the UTC offset is a constant (+10:30) all year. Hence a solution is to import the data as tz-naive, subtract 10 hours and 30 minutes, localise as UTC, then convert to 'Australia/Adelaide', i.e.
data = pd.read_csv('~/dev/datasets/data.csv', parse_dates=['timestamp'], index_col=['timestamp'])
data.index = data.index - pd.DateOffset(hours=10) - pd.DateOffset(minutes=30)
data.index = data.index.tz_localize('UTC').tz_convert('Australia/Adelaide')
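An alternative sketch (the sample timestamps below are made up for illustration): since the offset never changes, you can also localise directly to a fixed +10:30 offset built with the standard library, which pandas accepts in tz_localize:
import datetime
import pandas as pd

# a fixed +10:30 offset with no DST rules attached
fixed = datetime.timezone(datetime.timedelta(hours=10, minutes=30))
idx = pd.DatetimeIndex(['2017-06-01 00:00', '2017-12-01 00:00'])  # hypothetical samples
aware = idx.tz_localize(fixed)
print(aware.tz_convert('UTC'))  # both shift by exactly 10:30, winter or summer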
I have exported a list of AD Users out of AD and need to validate their login times.
The output from the PowerShell script gives lastlogin as LDAP/FILE time.
EXAMPLE 130305048577611542
I am having trouble converting this to a readable time in pandas.
I'm using the following code:
df['date of login'] = pd.to_datetime(df['FileTime'], unit='ns')
The column FileTime contains time formatted like the EXAMPLE above.
I'm getting the following output in my new column date of login
EXAMPLE 1974-02-17 03:50:48.577611542
I know this is being parsed incorrectly, as when I input this datetime into an online converter I get this output:
EXAMPLE:
Epoch/Unix time: 1386031258
GMT: Tuesday, December 3, 2013 12:40:58 AM
Your time zone: Monday, December 2, 2013 4:40:58 PM GMT-08:00
Anyone have an idea of what's occurring here? Why are all my dates in the 1970s?
I know this answer is very late to the party, but for anyone else looking in the future.
The 18-digit Active Directory (LDAP) timestamps are also known as 'Windows NT time format', 'Win32 FILETIME or SYSTEMTIME', or NTFS file time. These are used in Microsoft Active Directory for pwdLastSet, accountExpires, LastLogon, LastLogonTimestamp and LastPwdSet. The timestamp is the number of 100-nanosecond intervals (1 nanosecond = one billionth of a second) since Jan 1, 1601 UTC.
Therefore, 130305048577611542 does indeed relate to December 3, 2013.
When this value is parsed with unit='ns', it is interpreted as nanoseconds since the Unix epoch (Jan 1, 1970). 130305048577611542 nanoseconds is only about 130305048 seconds, i.e. roughly four years past 1970, which does result in a 1974 date!
In order to get the correct Unix timestamp you need to do:
(130305048577611542 / 10000000) - 11644473600
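As a quick check in Python, that arithmetic reproduces the converter's output above:
import datetime

filetime = 130305048577611542
unix_ts = filetime / 10**7 - 11644473600  # 100-ns intervals -> seconds, epoch 1601 -> 1970
print(unix_ts)  # 1386031257.7611542
print(datetime.datetime.fromtimestamp(unix_ts, tz=datetime.timezone.utc))
# 2013-12-03 00:40:57.761154+00:00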
Here's a solution I did in Python that worked well for me:
import datetime
import numpy as np

def ad_timestamp(timestamp):
    # 100-ns intervals since 1601-01-01 -> datetime; 0 is the fill value used below
    if timestamp != 0:
        return datetime.datetime(1601, 1, 1) + datetime.timedelta(seconds=timestamp / 10000000)
    return np.nan
So then if you need to convert a Pandas column:
df.lastLogonTimestamp = df.lastLogonTimestamp.fillna(0).apply(ad_timestamp)
Note: I needed to use fillna before using apply. Also, since I filled with 0's, I check for that in the conversion function above (if timestamp != 0). Hope that makes sense. It's extra stuff, but you may need it to convert the column in question.
I was stuck on this for a couple of days, but now I'm ready to share a really working solution in an easier-to-use form:
import datetime
timestamp = 132375402928051110
value = datetime.datetime(1601, 1, 1) + datetime.timedelta(seconds=timestamp / 10000000)
print(value.strftime('%Y-%m-%d %H:%M:%S'))
I have two datetime objects; a start date and an end date. I need to enumerate the days, weeks and months between the two, inclusive.
Ideally the results would be in datetime form, though any compatible form is fine. Weeks and months are represented by a date corresponding to the first day of the week/month, where Monday is the first day of a week, as in ISO-8601. This means that the result may contain a date earlier than the start date.
For example, given 2010-11-28 to 2010-12-01, the results would be as follows:
days: 2010-11-28, 2010-11-29, 2010-11-30, 2010-12-01
weeks: 2010-11-22, 2010-11-29
months: 2010-11-01, 2010-12-01
I realize that the list of days is by itself straightforward, but I'd like a clean and consistent solution that uses a similar approach for all three. It seems like the calendar module should be useful, but I'm not seeing a good way to use it for this purpose.
Using dateutil:
import datetime
import dateutil.rrule as drrule
import dateutil.relativedelta as drel

def dt2d(date):
    '''
    Convert a datetime.datetime to a datetime.date object
    '''
    return datetime.date(date.year, date.month, date.day)

def enumerate_dates(start, end):
    days = [dt2d(d) for d in drrule.rrule(drrule.DAILY, dtstart=start, until=end)]
    # Find the Monday on or before start (and likewise for end)
    start_week = start + drel.relativedelta(weekday=drel.MO(-1))
    end_week = end + drel.relativedelta(weekday=drel.MO(-1))
    weeks = [dt2d(d) for d in drrule.rrule(drrule.WEEKLY, dtstart=start_week, until=end_week)]
    # Find the first day of the month
    start_month = start.replace(day=1)
    end_month = end.replace(day=1)
    months = [dt2d(d) for d in drrule.rrule(drrule.MONTHLY, dtstart=start_month, until=end_month)]
    return days, weeks, months

if __name__ == '__main__':
    days, weeks, months = enumerate_dates(datetime.date(2010, 11, 28),
                                          datetime.date(2010, 12, 1))
    print('''\
days: {d}
weeks: {w}
months: {m}'''.format(d=[str(x) for x in days],
                      w=[str(x) for x in weeks],
                      m=[str(x) for x in months]))
yields
days: ['2010-11-28', '2010-11-29', '2010-11-30', '2010-12-01']
weeks: ['2010-11-22', '2010-11-29']
months: ['2010-11-01', '2010-12-01']