I have generated day-wise nested lists of login and logout times. I want to calculate the duration of each session (logout minus login) and store those values in a nested list of durations, organized by the day on which the login happened.
My Python script is:
import datetime
import itertools
Logintime = [
datetime.datetime(2021,1,1,8,10,10),
datetime.datetime(2021,1,1,10,25,19),
datetime.datetime(2021,1,2,8,15,10),
datetime.datetime(2021,1,2,9,35,10)
]
Logouttime = [
datetime.datetime(2021,1,1,10,10,11),
datetime.datetime(2021,1,1,17,0,10),
datetime.datetime(2021,1,2,9,30,10),
datetime.datetime(2021,1,2,17,30,12)
]
Logintimedaywise = [list(group) for k, group in itertools.groupby(Logintime,
                    key=datetime.datetime.toordinal)]
Logouttimedaywise = [list(group) for j, group in itertools.groupby(Logouttime,
                     key=datetime.datetime.toordinal)]
print(Logintimedaywise)
print(Logouttimedaywise)
# calculate total duration
temp = []
l = []
for p,q in zip(Logintimedaywise,Logouttimedaywise):
    for a,b in zip(p, q):
        tdelta = (b-a)
        diff = int(tdelta.total_seconds()) / 3600
        if diff not in temp:
            temp.append(diff)
l.append(temp)
print(l)
This script generates the following output (the durations in l come out as a flat list inside a single-element list):
[[datetime.datetime(2021, 1, 1, 8, 10, 10), datetime.datetime(2021, 1, 1, 10, 25, 19)], [datetime.datetime(2021, 1, 2, 8, 15, 10), datetime.datetime(2021, 1, 2, 9, 35, 10)]]
[[datetime.datetime(2021, 1, 1, 10, 10, 11), datetime.datetime(2021, 1, 1, 17, 0, 10)], [datetime.datetime(2021, 1, 2, 9, 30, 10), datetime.datetime(2021, 1, 2, 17, 30, 12)]]
[[2.000277777777778, 6.5808333333333335, 1.25, 7.917222222222223]]
But my desired output format is the following nested list of durations (each item in the list should be the list of durations for a given login day):
[[2.000277777777778, 6.5808333333333335] , [1.25, 7.917222222222223]]
Can anyone help me store the durations as a nested list grouped by login day?
Thanks in advance.
Try changing this piece of code:
# calculate total duration
temp = []
l = []
for p,q in zip(Logintimedaywise,Logouttimedaywise):
    for a,b in zip(p, q):
        tdelta = (b-a)
        diff = int(tdelta.total_seconds()) / 3600
        if diff not in temp:
            temp.append(diff)
l.append(temp)
print(l)
To:
# calculate total duration
l = []
for p,q in zip(Logintimedaywise,Logouttimedaywise):
    l.append([])
    for a,b in zip(p, q):
        tdelta = (b-a)
        diff = int(tdelta.total_seconds()) / 3600
        if diff not in l[-1]:
            l[-1].append(diff)
print(l)
Then the output would be:
[[datetime.datetime(2021, 1, 1, 8, 10, 10), datetime.datetime(2021, 1, 1, 10, 25, 19)], [datetime.datetime(2021, 1, 2, 8, 15, 10), datetime.datetime(2021, 1, 2, 9, 35, 10)]]
[[datetime.datetime(2021, 1, 1, 10, 10, 11), datetime.datetime(2021, 1, 1, 17, 0, 10)], [datetime.datetime(2021, 1, 2, 9, 30, 10), datetime.datetime(2021, 1, 2, 17, 30, 12)]]
[[2.000277777777778, 6.5808333333333335], [1.25, 7.917222222222223]]
This appends a new sublist for each day (each iteration of the outer loop), so the durations stay grouped by day.
Your solution and the answer by #U11-Forward will break if the login and logout for the same session happen on different days, since the inner lists in Logintimedaywise and Logouttimedaywise will then have different numbers of elements.
To avoid that, a simpler solution is to first calculate the duration for every (login, logout) pair and then build the nested lists based only on the login day (or the logout day, if you prefer), like this:
import datetime
import itertools
import numpy
# define the login and logout times
Logintime = [datetime.datetime(2021,1,1,8,10,10),datetime.datetime(2021,1,1,10,25,19),datetime.datetime(2021,1,2,8,15,10),datetime.datetime(2021,1,2,9,35,10)]
Logouttime = [datetime.datetime(2021,1,1,10,10,11),datetime.datetime(2021,1,1,17,0,10), datetime.datetime(2021,1,2,9,30,10),datetime.datetime(2021,1,2,17,30,12) ]
# calculate the duration and the unique days in the set
duration = [ int((logout - login).total_seconds())/3600 for login,logout in zip(Logintime,Logouttime) ]
login_days = numpy.unique([login.day for login in Logintime])
# create the nested list of durations
# each inner list correspond to a unique login day
Logintimedaywise = [[ login for login in Logintime if login.day == day ] for day in login_days ]
Logouttimedaywise = [[ logout for login,logout in zip(Logintime,Logouttime) if login.day == day ] for day in login_days ]
duration_daywise = [[ d for d,login in zip(duration,Logintime) if login.day == day ] for day in login_days ]
# check
print(Logintimedaywise)
print(Logouttimedaywise)
print(duration_daywise)
Outputs
[[datetime.datetime(2021, 1, 1, 8, 10, 10), datetime.datetime(2021, 1, 1, 10, 25, 19)], [datetime.datetime(2021, 1, 2, 8, 15, 10), datetime.datetime(2021, 1, 2, 9, 35, 10)]]
[[datetime.datetime(2021, 1, 1, 10, 10, 11), datetime.datetime(2021, 1, 1, 17, 0, 10)], [datetime.datetime(2021, 1, 2, 9, 30, 10), datetime.datetime(2021, 1, 2, 17, 30, 12)]]
[[2.000277777777778, 6.5808333333333335], [1.25, 7.917222222222223]]
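A standard-library-only variant of the same idea is sketched below. It groups by the full login date rather than by login.day, on the assumption that the data may span more than one month (login.day alone would merge, say, 1 January and 1 February). The function name durations_by_login_day and the session that crosses midnight are illustrative and not part of the original post.

```python
import datetime
from collections import defaultdict

def durations_by_login_day(logins, logouts):
    """Return a list of per-day duration lists (in hours), grouped by login date."""
    by_day = defaultdict(list)
    for login, logout in zip(logins, logouts):
        hours = int((logout - login).total_seconds()) / 3600
        # key on the full date so days from different months never collide
        by_day[login.date()].append(hours)
    # sort the dates so the outer list is chronological
    return [by_day[day] for day in sorted(by_day)]

logins = [
    datetime.datetime(2021, 1, 1, 8, 10, 10),
    datetime.datetime(2021, 1, 1, 22, 0, 0),   # hypothetical session crossing midnight
    datetime.datetime(2021, 1, 2, 8, 15, 10),
]
logouts = [
    datetime.datetime(2021, 1, 1, 10, 10, 11),
    datetime.datetime(2021, 1, 2, 1, 0, 0),
    datetime.datetime(2021, 1, 2, 9, 30, 10),
]
result = durations_by_login_day(logins, logouts)
print(result)
```

Because each duration is computed before any grouping happens, the session that logs out after midnight is still counted under its login day.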
I am trying to generate an array of time intervals. For example:
time_array = ["2016-09-02T17:30:00Z", "2016-09-02T17:45:00Z",
"2016-09-02T18:00:00Z", "2016-09-02T18:15:00Z",
"2016-09-02T18:30:00Z", "2016-09-02T18:45:00Z"]
It should create elements like the ones above, in Zulu time, until 9 pm every day.
It should also generate the elements for the next day and the day after that.
Start time is 7:00 am and end time is 9:00 pm.
If current_time is > start_time, generate the 15-minute interval array from the current time until 9 pm, and then generate it for the next day and for day + 2.
The intervals should fall on quarter hours such as 7:00, 7:15, and so on, not on times like 7:12 or 8:32.
Here's a generic datetime_range for you to use.
Code
from datetime import datetime, timedelta
def datetime_range(start, end, delta):
    current = start
    while current < end:
        yield current
        current += delta
dts = [dt.strftime('%Y-%m-%d T%H:%M Z') for dt in
       datetime_range(datetime(2016, 9, 1, 7), datetime(2016, 9, 1, 9+12),
       timedelta(minutes=15))]
print(dts)
Output
['2016-09-01 T07:00 Z', '2016-09-01 T07:15 Z', '2016-09-01 T07:30 Z', '2016-09-01 T07:45 Z', '2016-09-01 T08:00 Z', '2016-09-01 T08:15 Z', '2016-09-01 T08:30 Z', '2016-09-01 T08:45 Z', '2016-09-01 T09:00 Z', '2016-09-01 T09:15 Z', '2016-09-01 T09:30 Z', '2016-09-01 T09:45 Z' ... ]
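datetime_range starts from whatever start value it is given, so to meet the question's requirement that slots fall on 7:00, 7:15 and so on, the start has to be snapped to the grid first. A minimal sketch of rounding a datetime up to the next multiple of the step is shown here; the helper name ceil_dt is an assumption for illustration and is not part of the answer above.

```python
from datetime import datetime, timedelta

def ceil_dt(dt, delta):
    """Round dt up to the next multiple of delta, counted from midnight."""
    midnight = dt.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = dt - midnight
    remainder = elapsed % delta          # timedelta supports the modulo operator
    if remainder:                        # timedelta(0) is falsy: already on the grid
        elapsed += delta - remainder
    return midnight + elapsed

quarter = timedelta(minutes=15)
rounded = ceil_dt(datetime(2016, 9, 2, 17, 22, 5), quarter)
on_grid = ceil_dt(datetime(2016, 9, 2, 17, 30), quarter)
print(rounded, on_grid)
```

The rounded value can then be passed as the start argument of datetime_range.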
Here is a Pandas solution:
import pandas as pd
l = (pd.DataFrame(columns=['NULL'],
                  index=pd.date_range('2016-09-02T17:30:00Z', '2016-09-04T21:00:00Z',
                                      freq='15T'))
       .between_time('07:00','21:00')
       .index.strftime('%Y-%m-%dT%H:%M:%SZ')
       .tolist()
)
Output:
In [165]: l
Out[165]:
['2016-09-02T17:30:00Z',
'2016-09-02T17:45:00Z',
'2016-09-02T18:00:00Z',
'2016-09-02T18:15:00Z',
'2016-09-02T18:30:00Z',
'2016-09-02T18:45:00Z',
'2016-09-02T19:00:00Z',
'2016-09-02T19:15:00Z',
'2016-09-02T19:30:00Z',
'2016-09-02T19:45:00Z',
'2016-09-02T20:00:00Z',
'2016-09-02T20:15:00Z',
'2016-09-02T20:30:00Z',
'2016-09-02T20:45:00Z',
'2016-09-02T21:00:00Z',
'2016-09-03T07:00:00Z',
'2016-09-03T07:15:00Z',
'2016-09-03T07:30:00Z',
'2016-09-03T07:45:00Z',
'2016-09-03T08:00:00Z',
'2016-09-03T08:15:00Z',
'2016-09-03T08:30:00Z',
'2016-09-03T08:45:00Z',
'2016-09-03T09:00:00Z',
'2016-09-03T09:15:00Z',
'2016-09-03T09:30:00Z',
'2016-09-03T09:45:00Z',
'2016-09-03T10:00:00Z',
'2016-09-03T10:15:00Z',
'2016-09-03T10:30:00Z',
'2016-09-03T10:45:00Z',
'2016-09-03T11:00:00Z',
'2016-09-03T11:15:00Z',
'2016-09-03T11:30:00Z',
'2016-09-03T11:45:00Z',
'2016-09-03T12:00:00Z',
'2016-09-03T12:15:00Z',
'2016-09-03T12:30:00Z',
'2016-09-03T12:45:00Z',
'2016-09-03T13:00:00Z',
'2016-09-03T13:15:00Z',
'2016-09-03T13:30:00Z',
'2016-09-03T13:45:00Z',
'2016-09-03T14:00:00Z',
'2016-09-03T14:15:00Z',
'2016-09-03T14:30:00Z',
'2016-09-03T14:45:00Z',
'2016-09-03T15:00:00Z',
'2016-09-03T15:15:00Z',
'2016-09-03T15:30:00Z',
'2016-09-03T15:45:00Z',
'2016-09-03T16:00:00Z',
'2016-09-03T16:15:00Z',
'2016-09-03T16:30:00Z',
'2016-09-03T16:45:00Z',
'2016-09-03T17:00:00Z',
'2016-09-03T17:15:00Z',
'2016-09-03T17:30:00Z',
'2016-09-03T17:45:00Z',
'2016-09-03T18:00:00Z',
'2016-09-03T18:15:00Z',
'2016-09-03T18:30:00Z',
'2016-09-03T18:45:00Z',
'2016-09-03T19:00:00Z',
'2016-09-03T19:15:00Z',
'2016-09-03T19:30:00Z',
'2016-09-03T19:45:00Z',
'2016-09-03T20:00:00Z',
'2016-09-03T20:15:00Z',
'2016-09-03T20:30:00Z',
'2016-09-03T20:45:00Z',
'2016-09-03T21:00:00Z',
'2016-09-04T07:00:00Z',
'2016-09-04T07:15:00Z',
'2016-09-04T07:30:00Z',
'2016-09-04T07:45:00Z',
'2016-09-04T08:00:00Z',
'2016-09-04T08:15:00Z',
'2016-09-04T08:30:00Z',
'2016-09-04T08:45:00Z',
'2016-09-04T09:00:00Z',
'2016-09-04T09:15:00Z',
'2016-09-04T09:30:00Z',
'2016-09-04T09:45:00Z',
'2016-09-04T10:00:00Z',
'2016-09-04T10:15:00Z',
'2016-09-04T10:30:00Z',
'2016-09-04T10:45:00Z',
'2016-09-04T11:00:00Z',
'2016-09-04T11:15:00Z',
'2016-09-04T11:30:00Z',
'2016-09-04T11:45:00Z',
'2016-09-04T12:00:00Z',
'2016-09-04T12:15:00Z',
'2016-09-04T12:30:00Z',
'2016-09-04T12:45:00Z',
'2016-09-04T13:00:00Z',
'2016-09-04T13:15:00Z',
'2016-09-04T13:30:00Z',
'2016-09-04T13:45:00Z',
'2016-09-04T14:00:00Z',
'2016-09-04T14:15:00Z',
'2016-09-04T14:30:00Z',
'2016-09-04T14:45:00Z',
'2016-09-04T15:00:00Z',
'2016-09-04T15:15:00Z',
'2016-09-04T15:30:00Z',
'2016-09-04T15:45:00Z',
'2016-09-04T16:00:00Z',
'2016-09-04T16:15:00Z',
'2016-09-04T16:30:00Z',
'2016-09-04T16:45:00Z',
'2016-09-04T17:00:00Z',
'2016-09-04T17:15:00Z',
'2016-09-04T17:30:00Z',
'2016-09-04T17:45:00Z',
'2016-09-04T18:00:00Z',
'2016-09-04T18:15:00Z',
'2016-09-04T18:30:00Z',
'2016-09-04T18:45:00Z',
'2016-09-04T19:00:00Z',
'2016-09-04T19:15:00Z',
'2016-09-04T19:30:00Z',
'2016-09-04T19:45:00Z',
'2016-09-04T20:00:00Z',
'2016-09-04T20:15:00Z',
'2016-09-04T20:30:00Z',
'2016-09-04T20:45:00Z',
'2016-09-04T21:00:00Z']
Looking at the data file, you should use Python's built-in datetime objects, followed by strftime to format your dates.
Broadly, you can modify the code below to produce however many datetimes you would like.
First, create a starting date.
import datetime
Today = datetime.datetime.today()
Replace 100 with whatever number of time intervals you want.
date_list = [Today + datetime.timedelta(minutes=15*x) for x in range(0, 100)]
Finally, format the list in the way that you would like, using code like that below.
datetext = [x.strftime('%Y-%m-%d T%H:%M Z') for x in date_list]
Here is an example using an arbitrary date time
from datetime import datetime
start = datetime(1900,1,1,0,0,0)
end = datetime(1900,1,2,0,0,0)
Now you need to get the timedelta (the difference between two dates or times) between the start and end, in seconds.
seconds = (end - start).total_seconds()
Define the 15 minutes interval
from datetime import timedelta
step = timedelta(minutes=15)
Iterate over the range of seconds in steps of 15 minutes (900 seconds), adding each offset to start.
array = []
for i in range(0, int(seconds), int(step.total_seconds())):
    array.append(start + timedelta(seconds=i))
print(array)
[datetime.datetime(1900, 1, 1, 0, 0),
datetime.datetime(1900, 1, 1, 0, 15),
datetime.datetime(1900, 1, 1, 0, 30),
datetime.datetime(1900, 1, 1, 0, 45),
datetime.datetime(1900, 1, 1, 1, 0),
...
At the end you can format the datetime objects to their str representation.
array = [i.strftime('%Y-%m-%d %H:%M:%S') for i in array]
print(array)
['1900-01-01 00:00:00',
'1900-01-01 00:15:00',
'1900-01-01 00:30:00',
'1900-01-01 00:45:00',
'1900-01-01 01:00:00',
...
You can also format the datetime objects inside the first loop, but it may hurt your eyes:
    array.append((start + timedelta(seconds=i)).strftime('%Y-%m-%d %H:%M:%S'))
I'll provide a solution that does not handle timezones, since the problem is generating dates and times and you can set the timezone afterwards however you want.
You have a starting date, a starting and ending time (for each day), and an interval (in minutes) between these datetimes. The idea is to create a timedelta object that represents the time interval and repeatedly advance the datetime until we reach the ending time; then we advance by one day, reset the time to the initial one, and repeat.
A simple implementation could be:
import datetime
def make_dates(start_date, number_of_days, start_time, end_time, interval, timezone):
    if isinstance(start_date, datetime.datetime):
        start_date = start_date.date()
    start_date = datetime.datetime.combine(start_date, start_time)
    cur_date = start_date
    num_days_passed = 0
    step = datetime.timedelta(seconds=interval*60)
    while True:
        new_date = cur_date + step
        if new_date.time() > end_time:
            num_days_passed += 1
            if num_days_passed > number_of_days:
                break
            new_date = start_date + datetime.timedelta(days=num_days_passed)
        ret_date, cur_date = cur_date, new_date
        yield ret_date
In [31]: generator = make_dates(datetime.datetime.now(), 3, datetime.time(hour=17), datetime.time(hour=19), 15, None)
In [32]: next(generator)
Out[32]: datetime.datetime(2016, 9, 2, 17, 0)
In [33]: next(generator)
Out[33]: datetime.datetime(2016, 9, 2, 17, 15)
In [34]: list(generator)
Out[34]:
[datetime.datetime(2016, 9, 2, 17, 30),
datetime.datetime(2016, 9, 2, 17, 45),
datetime.datetime(2016, 9, 2, 18, 0),
datetime.datetime(2016, 9, 2, 18, 15),
datetime.datetime(2016, 9, 2, 18, 30),
datetime.datetime(2016, 9, 2, 18, 45),
datetime.datetime(2016, 9, 2, 19, 0),
datetime.datetime(2016, 9, 3, 17, 0),
datetime.datetime(2016, 9, 3, 17, 15),
datetime.datetime(2016, 9, 3, 17, 30),
datetime.datetime(2016, 9, 3, 17, 45),
datetime.datetime(2016, 9, 3, 18, 0),
datetime.datetime(2016, 9, 3, 18, 15),
datetime.datetime(2016, 9, 3, 18, 30),
datetime.datetime(2016, 9, 3, 18, 45),
datetime.datetime(2016, 9, 3, 19, 0),
datetime.datetime(2016, 9, 4, 17, 0),
datetime.datetime(2016, 9, 4, 17, 15),
datetime.datetime(2016, 9, 4, 17, 30),
datetime.datetime(2016, 9, 4, 17, 45),
datetime.datetime(2016, 9, 4, 18, 0),
datetime.datetime(2016, 9, 4, 18, 15),
datetime.datetime(2016, 9, 4, 18, 30),
datetime.datetime(2016, 9, 4, 18, 45),
datetime.datetime(2016, 9, 4, 19, 0),
datetime.datetime(2016, 9, 5, 17, 0),
datetime.datetime(2016, 9, 5, 17, 15),
datetime.datetime(2016, 9, 5, 17, 30),
datetime.datetime(2016, 9, 5, 17, 45),
datetime.datetime(2016, 9, 5, 18, 0),
datetime.datetime(2016, 9, 5, 18, 15),
datetime.datetime(2016, 9, 5, 18, 30),
datetime.datetime(2016, 9, 5, 18, 45)]
Once you have the datetimes you can use the strftime method to convert them to strings.
This is the final script I have written based on the answers posted on my question:
from datetime import datetime
from datetime import timedelta
import calendar
current_utc = datetime.utcnow().strftime("%Y-%m-%d-%H-%M-%S")
current_year = int(current_utc.split("-")[0])
current_month = int(current_utc.split("-")[1])
current_date = int(current_utc.split("-")[2])
current_hour = int(current_utc.split("-")[3])
current_min = int(current_utc.split("-")[4])
current_sec = int(current_utc.split("-")[5])
#### To make minutes round to quarter ####
min_range_1 = range(1,16)
min_range_2 = range(16,31)
min_range_3 = range(31,46)
min_range_4 = range(46,60)
if current_min in min_range_1:
    current_min = 15
elif current_min in min_range_2:
    current_min = 30
elif current_min in min_range_3:
    current_min = 45
elif current_min in min_range_4:
    current_hour = current_hour + 1
    current_min = 0
else:
    print("Please check current minute.")
current_sec = 00
date_range_31 = range(1,32)
date_range_30 = range(1,31)
month_days_31 = [1,3,5,7,8,10,12]
month_days_30 = [4,6,9,11]
if current_month in month_days_31:
    if current_date == 31:
        next_day_month = current_month + 1
        next_day_date = 1
    else:
        next_day_month = current_month
        next_day_date = current_date
elif current_month == 2:
    if calendar.isleap(current_year):
        if current_date == 29:
            next_day_month = current_month + 1
            next_day_date = 1
        else:
            next_day_month = current_month
            next_day_date = current_date
    else:
        if current_date == 28:
            next_day_month = current_month + 1
            next_day_date = 1
        else:
            next_day_month = current_month
            next_day_date = current_date
elif current_month in month_days_30:
    if current_date == 30:
        next_day_month = current_month + 1
        next_day_date = 1
    else:
        next_day_month = current_month
        next_day_date = current_date
else:
    print("Please check the current month and date to proceed further.")
if current_hour < 11:
    current_hour = 11
    current_min = 15
next_day_date = current_date + 1
current_start = datetime(current_year,current_month,current_date,current_hour,current_min,current_sec)
current_end = datetime(current_year,current_month,current_date,21,15,0)
next_day_start = datetime(current_year,next_day_month,next_day_date,11,15,0)
next_day_end = datetime(current_year,next_day_month,next_day_date,21,15,0)
current_seconds = (current_end - current_start).total_seconds()
next_day_seconds = (next_day_end - next_day_start).total_seconds()
step = timedelta(minutes=15)
current_day_array = []
next_day_array = []
for i in range(0, int(current_seconds), int(step.total_seconds())):
    current_day_array.append(current_start + timedelta(seconds=i))
for i in range(0, int(next_day_seconds), int(step.total_seconds())):
    current_day_array.append(next_day_start + timedelta(seconds=i))
current_day_array = [i.strftime('%Y-%m-%dT%H:%M:%SZ') for i in current_day_array]
print(current_day_array)
Which produces the following output:
['2016-09-03T11:15:00Z', '2016-09-03T11:30:00Z', '2016-09-03T11:45:00Z', '2016-09-03T12:00:00Z', '2016-09-03T12:15:00Z', '2016-09-03T12:30:00Z', '2016-09-03T12:45:00Z', '2016-09-03T13:00:00Z', '2016-09-03T13:15:00Z', '2016-09-03T13:30:00Z', '2016-09-03T13:45:00Z', '2016-09-03T14:00:00Z', '2016-09-03T14:15:00Z', '2016-09-03T14:30:00Z', '2016-09-03T14:45:00Z', '2016-09-03T15:00:00Z', '2016-09-03T15:15:00Z', '2016-09-03T15:30:00Z', '2016-09-03T15:45:00Z', '2016-09-03T16:00:00Z', '2016-09-03T16:15:00Z', '2016-09-03T16:30:00Z', '2016-09-03T16:45:00Z', '2016-09-03T17:00:00Z', '2016-09-03T17:15:00Z', '2016-09-03T17:30:00Z', '2016-09-03T17:45:00Z', '2016-09-03T18:00:00Z', '2016-09-03T18:15:00Z', '2016-09-03T18:30:00Z', '2016-09-03T18:45:00Z', '2016-09-03T19:00:00Z', '2016-09-03T19:15:00Z', '2016-09-03T19:30:00Z', '2016-09-03T19:45:00Z', '2016-09-03T20:00:00Z', '2016-09-03T20:15:00Z', '2016-09-03T20:30:00Z', '2016-09-03T20:45:00Z', '2016-09-03T21:00:00Z', '2016-09-04T11:15:00Z', '2016-09-04T11:30:00Z', '2016-09-04T11:45:00Z', '2016-09-04T12:00:00Z', '2016-09-04T12:15:00Z', '2016-09-04T12:30:00Z', '2016-09-04T12:45:00Z', '2016-09-04T13:00:00Z', '2016-09-04T13:15:00Z', '2016-09-04T13:30:00Z', '2016-09-04T13:45:00Z', '2016-09-04T14:00:00Z', '2016-09-04T14:15:00Z', '2016-09-04T14:30:00Z', '2016-09-04T14:45:00Z', '2016-09-04T15:00:00Z', '2016-09-04T15:15:00Z', '2016-09-04T15:30:00Z', '2016-09-04T15:45:00Z', '2016-09-04T16:00:00Z', '2016-09-04T16:15:00Z', '2016-09-04T16:30:00Z', '2016-09-04T16:45:00Z', '2016-09-04T17:00:00Z', '2016-09-04T17:15:00Z', '2016-09-04T17:30:00Z', '2016-09-04T17:45:00Z', '2016-09-04T18:00:00Z', '2016-09-04T18:15:00Z', '2016-09-04T18:30:00Z', '2016-09-04T18:45:00Z', '2016-09-04T19:00:00Z', '2016-09-04T19:15:00Z', '2016-09-04T19:30:00Z', '2016-09-04T19:45:00Z', '2016-09-04T20:00:00Z', '2016-09-04T20:15:00Z', '2016-09-04T20:30:00Z', '2016-09-04T20:45:00Z', '2016-09-04T21:00:00Z']
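The month-length tables in the script above can probably be avoided, because adding a timedelta to a date already rolls over month and year boundaries. The sketch below is one possible compact version under stated assumptions: the function name slots and its parameters are hypothetical, now is treated as a naive UTC datetime, and the 7:00 to 21:00 window is taken from the question rather than from the script's 11:15 to 21:15 window.

```python
from datetime import datetime, time, timedelta

def slots(now, days=3, open_at=time(7, 0), close_at=time(21, 0),
          step=timedelta(minutes=15)):
    """Return Zulu-formatted slot strings for `days` consecutive days from now's date."""
    out = []
    for offset in range(days):
        # date + timedelta handles month and year rollover automatically
        day = now.date() + timedelta(days=offset)
        current = datetime.combine(day, open_at)
        end = datetime.combine(day, close_at)
        while current <= end:
            if current >= now:           # drop slots that are already in the past
                out.append(current.strftime('%Y-%m-%dT%H:%M:%SZ'))
            current += step
    return out

# Example at a month boundary: 30 September rolls over into October.
result = slots(datetime(2016, 9, 30, 20, 40))
print(result[:3], len(result))
```

Since every day starts from open_at, all slots sit on the quarter-hour grid, and filtering on current >= now has the same effect as rounding the current time up to the next quarter.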
While I understand that you can get the oldest date in a list of dates by using min(list_of_dates), say I have have a list of dictionaries which contain arbitrary keys that have date values:
[{key1: date1}, {key2: date2}, {key3: date3}]
Is there a built-in method to return the dictionary with the oldest date value? Do I need to iterate over the list, and if so what would that look like?
You can get the minimum date value per dictionary:
min(list_of_dictionaries, key=lambda d: min(d.values()))
This would work with just 1 or with multiple values per dictionary in the list, provided they are all date objects.
Demo:
>>> from datetime import date
>>> import random, string
>>> def random_date(): return date.fromordinal(random.randint(730000, 740000))
...
>>> def random_key(): return ''.join([random.choice(string.ascii_lowercase) for _ in range(10)])
...
>>> list_of_dictionaries = [{random_key(): random_date() for _ in range(random.randint(1, 3))} for _ in range(5)]
>>> list_of_dictionaries
[{'vsiaffoloi': datetime.date(2018, 1, 3)}, {'omvhscpvqg': datetime.date(2020, 10, 7), 'zyvrtvptuw': datetime.date(2001, 7, 25), 'hvcjgsiicz': datetime.date(2019, 11, 30)}, {'eoltbkssmj': datetime.date(2016, 2, 27), 'xqflazzvyv': datetime.date(2024, 9, 1), 'qaszxzxbsg': datetime.date(2014, 11, 26)}, {'noydyjtmjf': datetime.date(2013, 6, 4), 'okieejoiay': datetime.date(2020, 12, 15), 'ddcqoxkpdn': datetime.date(2002, 7, 13)}, {'vbwstackcq': datetime.date(2025, 12, 14)}]
>>> min(list_of_dictionaries, key=lambda d: min(d.values()))
{'omvhscpvqg': datetime.date(2020, 10, 7), 'zyvrtvptuw': datetime.date(2001, 7, 25), 'hvcjgsiicz': datetime.date(2019, 11, 30)}
or just one value per dictionary:
>>> list_of_dictionaries = [{random_key(): random_date()} for _ in range(5)]
>>> list_of_dictionaries
[{'vmlrfbyybp': datetime.date(2001, 10, 25)}, {'tvenffnapv': datetime.date(2003, 1, 1)}, {'ivypocbyuz': datetime.date(2026, 8, 9)}, {'trywaosiqm': datetime.date(2022, 7, 29)}, {'ndqmejmfqj': datetime.date(2001, 2, 13)}]
>>> min(list_of_dictionaries, key=lambda d: min(d.values()))
{'ndqmejmfqj': datetime.date(2001, 2, 13)}
Per the official docs, min supports an arbitrary key function to specify what to compare on. If you require more specific behavior, you may also consider sorted instead.
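As a small illustration of the sorted alternative mentioned above, the same key function can order all the dictionaries from oldest to newest. The sample data and the helper name earliest are made up for this sketch.

```python
from datetime import date

records = [
    {'a': date(2018, 1, 3)},
    {'b': date(2020, 10, 7), 'c': date(2001, 7, 25)},
    {'d': date(2014, 11, 26)},
]

def earliest(d):
    """Oldest date stored in one dictionary."""
    return min(d.values())

# sorted returns every dictionary, ordered by its oldest date
oldest_first = sorted(records, key=earliest)
oldest = oldest_first[0]
print(oldest)
```

If only the single oldest dictionary is needed, min with the same key avoids sorting the whole list.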
I want to write a function that returns a tuple (start, end), where start is the Monday at 00:00:00.000000 and end is the Sunday at 23:59:59.999999 of the week containing a given date. start and end are datetime objects. No other information is given about day, month or year. I tried this function:
from datetime import datetime
def week_start_end(date):
    start = date.strptime("00:00:00.000000", "%H:%M:%S.%f")
    end = date.strptime("23:59:59.999999", "%H:%M:%S.%f")
    return (start, end)
print(week_start_end(datetime(2013, 8, 15, 12, 0, 0)))
This should return (datetime(2013, 8, 11, 0, 0, 0, 0), datetime(2013, 8, 17, 23, 59, 59, 999999)),
but the function returns the tuple (datetime.datetime(1900, 1, 1, 0, 0), datetime.datetime(1900, 1, 1, 23, 59, 59, 999999)).
I think using datetime.isocalendar is a nice solution. This gives the correct output for your example:
import datetime
def iso_year_start(iso_year):
    "The gregorian calendar date of the first day of the given ISO year"
    fourth_jan = datetime.date(iso_year, 1, 4)
    delta = datetime.timedelta(fourth_jan.isoweekday()-1)
    return fourth_jan - delta
def iso_to_gregorian(iso_year, iso_week, iso_day):
    "Gregorian calendar date for the given ISO year, week and day"
    year_start = iso_year_start(iso_year)
    return year_start + datetime.timedelta(days=iso_day-1, weeks=iso_week-1)
def week_start_end(date):
    year = date.isocalendar()[0]
    week = date.isocalendar()[1]
    d1 = iso_to_gregorian(year, week, 0)
    d2 = iso_to_gregorian(year, week, 6)
    d3 = datetime.datetime(d1.year, d1.month, d1.day, 0,0,0,0)
    d4 = datetime.datetime(d2.year, d2.month, d2.day, 23,59,59,999999)
    return (d3,d4)
As an example:
>>> d = datetime.datetime(2013, 8, 15, 12, 0, 0)
>>> print(week_start_end(d))
(datetime.datetime(2013, 8, 11, 0, 0), datetime.datetime(2013, 8, 17, 23, 59, 59, 999999))
This should help you with your problem.
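A shorter route that may be worth considering uses weekday() and timedelta directly. Note that the question's wording asks for a Monday to Sunday week, while its expected output (and the answer above) runs Sunday to Saturday; the sketch below follows the Monday to Sunday wording, and the function name week_bounds is an assumption for illustration.

```python
from datetime import datetime, time, timedelta

def week_bounds(dt):
    """Return (Monday 00:00:00.000000, Sunday 23:59:59.999999) of dt's week."""
    # weekday() is 0 for Monday, so subtracting it lands on Monday
    monday = dt.date() - timedelta(days=dt.weekday())
    sunday = monday + timedelta(days=6)
    # time.min is 00:00:00 and time.max is 23:59:59.999999
    return datetime.combine(monday, time.min), datetime.combine(sunday, time.max)

start, end = week_bounds(datetime(2013, 8, 15, 12, 0, 0))
print(start, end)
```

For a Sunday to Saturday week, subtracting one further day from both bounds (with an adjustment when dt itself is a Sunday) would be needed.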
I want to split the calendar into two-week intervals starting at 2008-May-5, or any arbitrary starting point.
So I start with several date objects:
import datetime as DT
raw = ("2010-08-01",
"2010-06-25",
"2010-07-01",
"2010-07-08")
transactions = [(DT.datetime.strptime(datestring, "%Y-%m-%d").date(),
"Some data here") for datestring in raw]
transactions.sort()
By manually analyzing the dates, I am quite able to figure out which dates fall within the same fortnight interval. I want to get grouping that's similar to this one:
# Fortnight interval 1
(datetime.date(2010, 6, 25), 'Some data here')
(datetime.date(2010, 7, 1), 'Some data here')
(datetime.date(2010, 7, 8), 'Some data here')
# Fortnight interval 2
(datetime.date(2010, 8, 1), 'Some data here')
import datetime as DT
import itertools
start_date=DT.date(2008,5,5)
def mkdate(datestring):
    return DT.datetime.strptime(datestring, "%Y-%m-%d").date()
def fortnight(date):
    return (date-start_date).days // 14
raw = ("2010-08-01",
       "2010-06-25",
       "2010-07-01",
       "2010-07-08")
transactions=[(date,"Some data") for date in map(mkdate,raw)]
# Python 3 removed tuple-unpacking lambdas, so index into the tuple instead
transactions.sort(key=lambda t: t[0])
for key,grp in itertools.groupby(transactions,key=lambda t: fortnight(t[0])):
    print(key,list(grp))
yields
# 55 [(datetime.date(2010, 6, 25), 'Some data')]
# 56 [(datetime.date(2010, 7, 1), 'Some data'), (datetime.date(2010, 7, 8), 'Some data')]
# 58 [(datetime.date(2010, 8, 1), 'Some data')]
Note that 2010-6-25 is in the 55th fortnight from 2008-5-5, while 2010-7-1 is in the 56th. If you want them grouped together, simply change start_date (to something like 2008-5-16).
PS. The key tool used above is itertools.groupby, which is explained in detail here.
Edit: The lambdas are simply a way to make "anonymous" functions. (They are anonymous in the sense that they are not given names like functions defined by def). Anywhere you see a lambda, it is also possible to use a def to create an equivalent function. For example, you could do this:
import operator
transactions.sort(key=operator.itemgetter(0))
def transaction_fortnight(transaction):
    date,data=transaction
    return fortnight(date)
for key,grp in itertools.groupby(transactions,key=transaction_fortnight):
    print(key,list(grp))
Use itertools.groupby with a lambda function that integer-divides the distance from the starting point by the length of the period.
>>> from itertools import groupby
>>> for i, group in groupby(range(30), lambda x: x // 7):
...     print(list(group))
[0, 1, 2, 3, 4, 5, 6]
[7, 8, 9, 10, 11, 12, 13]
[14, 15, 16, 17, 18, 19, 20]
[21, 22, 23, 24, 25, 26, 27]
[28, 29]
So with dates:
import itertools as it
start = DT.date(2008,5,5)
lenperiod = 14
for fnight,info in it.groupby(transactions,lambda data: (data[0]-start).days // lenperiod):
    print(list(info))
You can also use week numbers from strftime, with lenperiod given in weeks (for example, 2 for a fortnight):
for fnight,info in it.groupby(transactions,lambda data: int(data[0].strftime('%W')) // lenperiod):
    print(list(info))
Using a pandas DataFrame with resample works too. This uses OP's data, but with "Some data here" replaced by the letters of 'abcd'.
>>> import datetime as DT
>>> raw = ("2010-08-01",
... "2010-06-25",
... "2010-07-01",
... "2010-07-08")
>>> transactions = [(DT.datetime.strptime(datestring, "%Y-%m-%d"), data) for
... datestring, data in zip(raw,'abcd')]
[(datetime.datetime(2010, 8, 1, 0, 0), 'a'),
(datetime.datetime(2010, 6, 25, 0, 0), 'b'),
(datetime.datetime(2010, 7, 1, 0, 0), 'c'),
(datetime.datetime(2010, 7, 8, 0, 0), 'd')]
Now try using pandas. First create a DataFrame, naming the columns and setting the indices to the dates.
>>> import pandas as pd
>>> df = pd.DataFrame(transactions,
... columns=['date','data']).set_index('date')
data
date
2010-08-01 a
2010-06-25 b
2010-07-01 c
2010-07-08 d
Now use the offset aliases to resample into 2-week bins ending on Sundays; sum() concatenates the strings in each bin.
>>> fortnight = df.resample('2W-SUN').sum()
data
date
2010-06-27 b
2010-07-11 cd
2010-07-25 0
2010-08-08 a
Now drill into the data as needed by weekstart
>>> fortnight.loc['2010-06-27']['data']
b
or index
>>> fortnight.iloc[0]['data']
b
or by a slice of positions
>>> data = fortnight.iloc[:2]['data']
>>> data
date
2010-06-27 b
2010-07-11 cd
Freq: 2W-SUN, Name: data, dtype: object
>>> data[0]
b
>>> data[1]
cd