python, datetime.date: difference between two days - python

I'm playing around with two datetime.date objects (http://docs.python.org/library/datetime.html#datetime.date).
I would like to calculate all the days between them, assuming that date 1 >= date 2, and print them out. Here is an example of what I would like to achieve, but I don't think it is efficient at all. Is there a better way to do this?
# i think +2 because this calc gives only days between the two days,
# i would like to include them
daysDiff = (dateTo - dateFrom).days + 2
while (daysDiff > 0):
    rptDate = dateFrom.today() - timedelta(days=daysDiff)
    print rptDate.strftime('%Y-%m-%d')
    daysDiff -= 1

I don't see this as particularly inefficient, but you could make it slightly cleaner without the while loop:
delta = dateTo - dateFrom
for delta_day in range(0, delta.days+1): # Or use xrange in Python 2.x
    print(dateFrom + datetime.timedelta(delta_day))
(Also, notice how printing or using str on a date produces that '%Y-%m-%d' format for you for free)
It might be inefficient, however, to do it this way if you were creating a long list of days in one go instead of just printing, for example:
[dateFrom + datetime.timedelta(delta_day) for delta_day in range(0, delta.days+1)]
This could easily be rectified by creating a generator instead of a list. Either replace [...] with (...) in the above example, or:
def gen_days_inclusive(start_date, end_date):
    delta_days = (end_date - start_date).days
    for day in range(delta_days + 1): # Or use xrange in Python 2.x
        yield start_date + datetime.timedelta(day)
Whichever suits your syntax palate better.
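As a minimal, self-contained sketch of the generator approach in Python 3 (the two dates below are made-up sample values, not taken from the question):

```python
import datetime


def gen_days_inclusive(start_date, end_date):
    """Yield every date from start_date through end_date, both included."""
    delta_days = (end_date - start_date).days
    for day in range(delta_days + 1):
        yield start_date + datetime.timedelta(days=day)


# Hypothetical sample dates, chosen to cross a month boundary.
date_from = datetime.date(2012, 2, 27)
date_to = datetime.date(2012, 3, 1)

for day in gen_days_inclusive(date_from, date_to):
    print(day)  # str() of a date is already in YYYY-MM-DD form
```

Because it is a generator, no list of dates is built up front; wrap the call in list(...) only if you actually need all the dates at once.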

Related

python 3 datetime difference in microseconds giving wrong answer for a long operation

I'm doing a delete operation of 3000 elements from a binary search tree of size 6000 ( sorted therefore one sided tree). I need to calculate the time taken for completing all the deletes
I did this
bst2 = foo.BinarySearchTree() #init
insert_all_to_tree(bst2,insert_lines) #insert 6000 elements
start = datetime.now() #start time
for idx, line in enumerate(lines):
    bst2.delete(line) #deleting
    if (idx%10 == 0):
        print("deleted ", (idx+1), "th element - ", line)
end = datetime.now() #completion time
duration = end - start
print(duration.microseconds) #duration in microseconds
I got the answer 761716 microseconds which is less than even a minute when my actual code ran for about 5 hours. I expected something in the ranges of 10^9 - 10^10. I even checked the max integer allowed in python to see if it's related to that but apparently that's not the problem.
Why am I getting a wrong answer for the duration?
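datetime.now() returns a datetime, and subtracting two of them gives a timedelta. Its .microseconds attribute holds only the sub-second component (0 to 999999), not the whole duration, which is why you see 761716; duration.total_seconds() gives the full elapsed time. For timing code it is better not to use datetime at all: use time.time() (Python < v3.3), time.perf_counter() (Python v3.3 and later) or time.perf_counter_ns() (Python v3.7 and later).
time.time() and time.perf_counter() both return float, and time.perf_counter_ns() returns int.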
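To illustrate why the printed number looks too small, here is a short sketch. The 5-hour timedelta is constructed by hand to mimic the situation in the question, and the sum(range(...)) call is only a stand-in for the real workload:

```python
import time
from datetime import timedelta

# A duration resembling the one in the question: about 5 hours plus a fraction.
duration = timedelta(hours=5, microseconds=761716)

print(duration.microseconds)     # only the sub-second component: 761716
print(duration.total_seconds())  # the whole duration, in seconds

# Timing with a monotonic clock instead of wall-clock datetimes.
start = time.perf_counter()
sum(range(100_000))              # stand-in for the real workload
elapsed = time.perf_counter() - start
print(f"elapsed: {elapsed:.6f} s")
```

A timedelta stores days, seconds and microseconds separately, so reading a single attribute never gives the total.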

Convert date with more than 31 days to regular date format in python

I have exported some data from another program, where I added up the waiting time for a station.
So after some time, I have the format '32:00:00:33.7317' for the waiting time.
This is my function to convert every date into the format I want:
def Datum_formatieren(Datensatz):
    if len(str(Datensatz)) == 24:
        return datetime.datetime.strptime(Datensatz, "%d.%m.%Y %H:%M:%S.%f").strftime("%d%H%M")
    elif len(str(Datensatz)) == 3:
        return 0
        #return datetime.datetime.strptime(Datensatz, "%S.%f").strftime("%d%H%M")
    elif len(str(Datensatz)) == 5:
        return str(Datensatz)
    elif len(str(Datensatz)) == 7:
        return str(Datensatz)
    elif len(str(Datensatz)) == 6:
        return datetime.datetime.strptime(str(Datensatz), "%S.%f").strftime("%d%H%M")
    elif len(str(Datensatz)) == 9 or len(str(Datensatz))==10:
        return datetime.datetime.strptime(str(Datensatz), "%M:%S.%f").strftime("%d%H%M")
    elif len(str(Datensatz)) == 12 or len(str(Datensatz)) ==13:
        return datetime.datetime.strptime(str(Datensatz), "%H:%M:%S.%f").strftime("%d%H%M")
    elif len(str(Datensatz)) == 15 or len(str(Datensatz)) == 16:
        return datetime.datetime.strptime(str(Datensatz), "%d:%H:%M:%S.%f").strftime("%d%H%M")
I get the following error since python does not recognize days above 30 or 31:
ValueError: time data '32:00:00:33.7317' does not match format '%d:%H:%M:%S.%f'
How do I convert all entries with days above 31 into a format, which python can recognize?
You cannot use datetime.datetime.strptime() to construct datetimes that are invalid (see the other answer for why).
You can, however, leverage datetime.timedelta:
import datetime
def Datum_formatieren(Datensatz):
    # other cases omitted for brevity
    # Input: "days:hours:minutes:seconds.ms"
    if len(Datensatz) in (15,16):
        k = list(map(float,Datensatz.split(":")))
        secs = k[0]*60*60*24 + k[1]*60*60 + k[2]*60 + k[3]
        td = datetime.timedelta(seconds=secs)
        days = td.total_seconds() / 24 / 60 // 60
        hours = (td.total_seconds() - days * 24*60*60) / 60 // 60
        minutes = (td.total_seconds() - days *24*60*60 - hours * 60*60) // 60
        print(td)
        return f"{td.days}{int(hours):02d}{int(minutes):02d}"
print(Datum_formatieren("32:32:74:33.731"))
Output for "32:32:74:33.731":
33 days, 9:14:33.731000 # timedelta
330914 # manually parsed
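The same conversion can be sketched more compactly by handing the four fields straight to timedelta and splitting its seconds with divmod. The helper names parse_duration and format_ddhhmm are made up for this illustration, and the sketch assumes the input always has exactly four colon-separated fields:

```python
from datetime import timedelta


def parse_duration(text):
    """Parse 'days:hours:minutes:seconds[.fraction]' into a timedelta."""
    days, hours, minutes, seconds = (float(part) for part in text.split(":"))
    # timedelta normalizes overflowing fields (e.g. 74 minutes) by itself.
    return timedelta(days=days, hours=hours, minutes=minutes, seconds=seconds)


def format_ddhhmm(td):
    """Format a timedelta as days followed by zero-padded hours and minutes."""
    hours, remainder = divmod(td.seconds, 3600)
    minutes = remainder // 60
    return f"{td.days}{hours:02d}{minutes:02d}"


print(format_ddhhmm(parse_duration("32:00:00:33.7317")))  # 320000
print(format_ddhhmm(parse_duration("32:32:74:33.731")))   # 330914
```

td.seconds is always in the range 0 to 86399, so no manual subtraction of whole days is needed.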
You are misusing datetime, which only maps to valid dates with times, not to "any amount of time passed".
Use a timedelta instead.
Adapted from the datetime.timedelta documentation:
from datetime import datetime, timedelta
delta = timedelta(days=50, seconds=27, microseconds=10,
                  milliseconds=29000, minutes=5, hours=8, weeks=2)
print(datetime.now() + delta)
You can add any timedelta to a normal datetime and get the resulting value.
If you want to stick with your approach, you may want to shorten the length checks, replacing
if len(str(Datensatz)) == 9 or len(str(Datensatz))==10:
with
if len(Datensatz) in (9,10):
Related: How to construct a timedelta object from a simple string (look at its answers and take inspiration with attribution from it)
You're taking the Datensatz variable, converting it to string using str(), then parsing it back into an internal representation; there is almost always a better way to do it.
Can you check what type the Datensatz variable has, perhaps print(type(Datensatz)) or based on the rest of your code?
Most likely the Datensatz variable already has fields for the number of days, hours, minutes and seconds. It's usually much better to base your logic on those directly, rather than converting to string and back.
As others have pointed out, you're trying to use a datetime.datetime to represent a time interval; this is incorrect. Instead, you need to either:
Use the datetime.timedelta type, which is designed for time intervals. It can handle periods over 30 days correctly:
>>> print(datetime.timedelta(days=32, seconds=12345))
32 days, 3:25:45
>>>
Since your function is named Datum_formatieren, perhaps you intend to take Datensatz and convert it to string, for output to the user or to another system.
In that case, you should take the fields directly in Datensatz and convert them appropriately, perhaps using f-strings or % formatting. Depending on the situation, you may need to do some arithmetic. The details will depend on the type of Datensatz and the format you need on the output.

How do I compare two list with a condition to check?

I am trying to implement an attendance system. I have two lists which I have already converted to Unix timestamps. One list contains a fixed timetable and the other is the log of when students clocked in.
For example the timetable list might include
timetable[1519650000, 1519740000, 1519743600]
timetableEnd[1519653600, 1519743600, 1519747200]
log[1519739987, 1519744087]
In human-readable form (this is not how it appears in the code):
timetable[2018-02-26 13:00:00, 2018-02-27 14:00:00, 2018-02-27 15:00:00]
timetableEnd[2018-02-26 14:00:00, 2018-02-27 15:00:00, 2018-02-27 16:00:00]
log[2018-02-27 13:59:47, 2018-02-27 15:08:07]
Is there a way to loop over every log element and check it against the timetable elements for this condition:
a <= x <= b
where a = timetable begin time - 5 min
x = log time
b = timetable end time
Example: (1519650000 - 300) <= x <= 1519653600
returns false, since log doesn't contain a value that satisfies this.
Can I get some advice or guidance on how I should proceed?
You can build pairs of neighbors of a list using the pattern zip(a[:-1], a[1:]) and you can pair up the elements of two equally long lists using zip(a, b). Using these two things you can try this:
if all(start-300 <= log_element <= end
       for ((start, end), log_element) in zip(
           zip(timetable[:-1], timetable[1:]), log)):
    print("All logs are in their boundaries.")
In your case this will not succeed because log[1] is not between timetable[1]-300 and timetable[2].
In case you are not familiar with Python's elegant functional style, you might find it easier to understand it phrased this (less elegant) way:
def all_logs_in_boundaries(timetable, log):
    for ((start, end), log_element) in zip(
            zip(timetable[:-1], timetable[1:]), log):
        if not (start-300 <= log_element <= end):
            return False
    return True # or: print("All logs are in their boundaries.")
Why don't you use datetime?
Use something like this
import datetime
logtime = datetime.datetime.fromtimestamp(x)
begintime = datetime.datetime.fromtimestamp(a)
arrivaltime = begintime - datetime.timedelta(minutes=5)
return (arrivaltime <= logtime and logtime <= begintime)
logtime is the datetime from x
begintime is the datetime from a
and arrivaltime is 5 minutes earlier than begintime
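For comparison, here is a sketch that works directly on the Unix timestamps from the question and uses both the timetable and timetableEnd lists. It assumes a slot counts as attended if any log entry falls between five minutes before its start and its end; the function name slots_attended and the grace parameter are made up for this illustration:

```python
def slots_attended(timetable, timetable_end, log, grace=300):
    """For each slot, report whether any log entry lies in [start - grace, end]."""
    return [
        any(start - grace <= entry <= end for entry in log)
        for start, end in zip(timetable, timetable_end)
    ]


# Values copied from the question.
timetable = [1519650000, 1519740000, 1519743600]
timetable_end = [1519653600, 1519743600, 1519747200]
log = [1519739987, 1519744087]

print(slots_attended(timetable, timetable_end, log))  # [False, True, True]
```

The first slot comes out False because neither log entry falls inside its window, which matches the example given in the question.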

Python Nested list -Time intervals - intersection and difference

I have a problem with a nested list, time as elements
time=[(2017-01-01T00:00:00.000000Z,2017-01-01T00:00:39.820000Z),
(2017-01-01T00:00:38.840000Z,2017-01-01T01:36:33.260000Z),
(2017-01-01T01:36:45.960000Z,2017-01-01T03:06:15.340000Z),
(2017-01-01T03:06:24.320000Z,2017-01-01T03:31:00.420000Z),
(2017-01-01T03:31:22.880000Z,2017-01-01T03:48:43.500000Z),
(2017-01-01T03:48:53.280000Z,2017-01-01T04:14:53.660000Z),
(2017-01-01T04:15:15.160000Z,2017-01-01T04:34:44.060000Z),
(2017-01-01T04:34:57.440000Z,2017-01-01T04:46:31.100000Z),
(2017-01-01T04:46:53.320000Z,2017-01-01T05:22:20.340000Z),
(2017-01-01T05:22:24.920000Z,2017-01-01T06:17:30.900000Z),
(2017-01-01T06:18:02.280000Z,2017-01-01T07:01:45.740000Z),
(2017-01-01T07:02:04.640000Z,2017-01-01T07:39:48.780000Z),
(2017-01-01T07:40:12.400000Z,2017-01-01T08:19:46.140000Z),
(2017-01-01T08:20:13.520000Z,2017-01-01T10:17:45.380000Z),
(2017-01-01T10:17:59.880000Z,2017-01-01T15:01:29.100000Z),
(2017-01-01T15:01:55.840000Z,2017-01-01T15:08:45.460000Z),
(2017-01-01T15:09:04.000000Z,2017-01-01T15:42:13.180000Z),
(2017-01-01T15:42:30.360000Z,2017-01-01T16:14:07.340000Z),
(2017-01-01T16:14:24.560000Z,2017-01-01T17:11:28.420000Z),
(2017-01-01T17:11:32.960000Z,2017-01-01T17:46:07.660000Z),
(2017-01-01T17:46:30.280000Z,2017-01-01T18:02:17.860000Z),
(2017-01-01T18:02:35.240000Z,2017-01-01T18:16:17.740000Z),
(2017-01-01T18:16:26.720000Z,2017-01-01T18:39:10.540000Z),
(2017-01-01T18:39:19.360000Z,2017-01-01T19:45:25.860000Z),
(2017-01-01T19:45:34.720000Z,2017-01-01T20:41:00.220000Z),
(2017-01-01T20:41:21.520000Z,2017-01-01T21:13:51.660000Z),
(2017-01-01T21:14:13.360000Z,2017-01-01T21:41:16.220000Z),
(2017-01-01T21:41:28.640000Z,2017-01-01T22:03:03.820000Z),
(2017-01-01T22:03:29.400000Z,2017-01-01T23:14:13.500000Z),
(2017-01-01T23:14:35.200000Z,2017-01-01T23:59:59.980000Z)]
As you can see, all the elements belong to the same day, 2017-01-01. What I want is the difference (in seconds or ms) between the entire day (86400 s) and the total time covered by the intervals in the list. Some intervals overlap, so I think I first need some kind of "intersection check"; once the overlaps are resolved, I can subtract the total from 86400. But how can I do that intersection check? Any suggestion would be highly appreciated. Thanks in advance!
Desired Output:
86400(day) - 85000(possible time in seconds after time intersection of list) = 1400
The problem is twofold:
to replace overlapping intervals with their unions;
to sum the resulting non-overlapping intervals.
The first part can be done like this:
time.sort()
new_time = [list(time[0])]
for t in time[1:]:
    if t[0] <= new_time[-1][1]:
        if t[1] > new_time[-1][1]:
            new_time[-1][1] = t[1]
    else:
        new_time.append(list(t))
while the second part is best done using datetime module:
import datetime
total = sum([ ( datetime.datetime.strptime(t[1], '%Y-%m-%dT%H:%M:%S.%fZ') -
                datetime.datetime.strptime(t[0], '%Y-%m-%dT%H:%M:%S.%fZ') ).total_seconds()
              for t in new_time ])
print(86400 - total)
After converting strings to numbers, you could use the top answer from Python find continuous interesctions of intervals
You can sort and then merge any overlaps
time.sort()
noOverlapList = []
start = time[0][0] # start of first interval
end = time[0][1] # end of first interval
for interval in time:
    # if interval overlaps the current merged interval, extend it
    if interval[0] <= end and interval[1] > end:
        end = interval[1]
    elif interval[0] > end:
        noOverlapList.append((start, end)) # merged non overlapping interval
        start = interval[0]
        end = interval[1]
noOverlapList.append((start, end)) # the last merged interval
Then just sum the intervals contained in noOverlapList, and get the difference
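Putting the merge-then-sum idea together, a runnable sketch might look like this. It assumes the timestamps are stored as strings in the format shown in the question, and it uses only the first three intervals as sample data; the function names are made up for this illustration:

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%S.%fZ"


def merge_intervals(intervals):
    """Merge overlapping (start, end) pairs into sorted, disjoint intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous interval: extend it if this one ends later.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return merged


def uncovered_seconds(raw_intervals, day_seconds=86400):
    """Seconds of the day not covered by any of the given intervals."""
    parsed = [(datetime.strptime(a, FMT), datetime.strptime(b, FMT))
              for a, b in raw_intervals]
    covered = sum((end - start).total_seconds()
                  for start, end in merge_intervals(parsed))
    return day_seconds - covered


# First three intervals from the question, as strings.
sample = [
    ("2017-01-01T00:00:00.000000Z", "2017-01-01T00:00:39.820000Z"),
    ("2017-01-01T00:00:38.840000Z", "2017-01-01T01:36:33.260000Z"),
    ("2017-01-01T01:36:45.960000Z", "2017-01-01T03:06:15.340000Z"),
]
print(uncovered_seconds(sample))
```

Parsing to datetime before merging means the comparison does not depend on the string format happening to sort correctly.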

Add up the value of data[x] to data[x+1]

I have a long list of data which I am working with, containing 'timestamp' versus 'quantity' values. However, the timestamps in the list are not all in order (for example, timestamp[x] can be 140056 while timestamp[x+1] can be 560). I am not going to sort them; instead, I want to add the value of timestamp[x] to timestamp[x+1] when this happens.
PS: The quantities need to stay in the same order as in the list when plotting.
I have been working on this using the following code, where timestamp is the name of the list containing all the timestamp values:
for t in timestamp:
    previous = timestamp[t-1]
    increment = 0
    if previous > timestamp[t]:
        increment = previous
    t += increment
    delta = datetime.timedelta(0, (t - startTimeStamp) / 1000);
    timeAtT = fileStartDate + (delta + startTime)
    print("time at t=" + str(t) + " is: " + str(timeAtT));
    previous = t
However it comes out with TypeError: list indices must be integers, not tuples. May I know how to solve this, or any other ways of doing this task? Thanks!
The problem is that you're treating t as if it is an index of the list. In your case, t holds the actual values of the list, so constructions like timestamp[t] are not valid. You either want:
for t in range(len(timestamp)):
Or if you want both an index and the value:
for (t, value) in enumerate(timestamp):
When you write for t in timestamp, you are making t take on the value of each item in timestamp. But then you try to use t as an index to get previous. To do this, try:
for i, t in enumerate(timestamp):
    previous = timestamp[i - 1] # for i == 0 this wraps around to the last item
    current = t
Also when you get TypeErrors like this make sure you try printing out the intermediate steps, so you can see exactly what is going wrong.
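One possible reading of the task is that the timestamps restart from a low value now and then, and each restart should continue from where the previous run stopped. Under that assumption (it is a guess at the intent, and make_monotonic is a made-up name), a sketch could carry a running offset instead of indexing back into the list:

```python
def make_monotonic(timestamps):
    """Return timestamps shifted by an accumulated offset so they never decrease."""
    adjusted = []
    offset = 0
    previous_raw = None
    for raw in timestamps:
        # A drop in the raw value means the counter restarted:
        # continue from the last raw value seen before the restart.
        if previous_raw is not None and raw < previous_raw:
            offset += previous_raw
        adjusted.append(raw + offset)
        previous_raw = raw
    return adjusted


print(make_monotonic([100, 140056, 560, 900, 50]))
# [100, 140056, 140616, 140956, 141006]
```

The output list has the same length and order as the input, so the matching quantity values can be plotted against it unchanged.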
