Negative time difference in Pandas

I get this strange result when subtracting a later timestamp from an earlier one:
pd.to_datetime('2021-05-21 06:00:00') - pd.to_datetime('2021-05-21 06:02:00')
Output:
Timedelta('-1 days +23:58:00')
Expected Output:
Timedelta('-0 days 00:02:00')
What is the correct way to calculate a negative time difference? Thank you!

Timedelta('-1 days +23:58:00') is the proper representation of a negative time difference in pandas (and also in pure Python):
# using pure python
from datetime import datetime
datetime(2021,5,21,6,0,0) - datetime(2021,5,21,6,2,0)
datetime.timedelta(days=-1, seconds=86280)
This is because the difference is correctly calculated as -120 seconds, but the timedelta components are normalized so that each individual time element stays within its allowed range. To represent negative 2 minutes, a negative day and a positive time component are used.
From the Python datetime module's documentation:
and days, seconds and microseconds are then normalized so that the representation is unique, with
0 <= microseconds < 1000000
0 <= seconds < 3600*24 (the number of seconds in one day)
-999999999 <= days <= 999999999
Note that normalization of negative values may be surprising at first. For example:
from datetime import timedelta
d = timedelta(microseconds=-1)
(d.days, d.seconds, d.microseconds)
(-1, 86399, 999999)
It is possible to retrieve the total seconds as a negative number (a float) using the method Timedelta.total_seconds.
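As a small check of the normalization described above, here is a sketch using only the standard library (the variable name recombined is just for illustration). It recombines the stored components by hand and compares the result with total_seconds():

```python
from datetime import datetime

delta = datetime(2021, 5, 21, 6, 0, 0) - datetime(2021, 5, 21, 6, 2, 0)

# Normalized components: only `days` may be negative.
print(delta.days, delta.seconds, delta.microseconds)  # -1 86280 0

# Recombining the components by hand gives the signed value back.
recombined = delta.days * 86400 + delta.seconds + delta.microseconds / 1_000_000
print(recombined)             # -120.0
print(delta.total_seconds())  # -120.0
```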

We can use total_seconds:
(pd.to_datetime('2021-05-21 06:00:00') - pd.to_datetime('2021-05-21 06:02:00')).total_seconds()
Out[9]: -120.0

Use abs to get the time delta:
>>> abs(pd.to_datetime('2021-05-21 06:00:00') - pd.to_datetime('2021-05-21 06:02:00'))
Timedelta('0 days 00:02:00')

Well, your code is giving the correct output.
Your result is Timedelta('-1 days +23:58:00'), which is equal to -24:00:00 + 23:58:00 = -00:02:00, i.e. minus 2 minutes.
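If what you want is a display like the "expected output" in the question, one option is to format the value yourself. The helper below is a rough sketch, not a pandas or datetime feature: format_signed is a made-up name, it uses only the standard library, and it drops any sub-second part.

```python
from datetime import timedelta

def format_signed(td):
    """Render a timedelta as [-]HH:MM:SS with the sign in front (hypothetical helper)."""
    sign = "-" if td < timedelta(0) else ""
    total = int(abs(td).total_seconds())  # whole seconds; sub-second part is dropped
    hours, rest = divmod(total, 3600)
    minutes, seconds = divmod(rest, 60)
    return f"{sign}{hours:02d}:{minutes:02d}:{seconds:02d}"

print(format_signed(timedelta(days=-1, seconds=86280)))  # -00:02:00
print(format_signed(timedelta(hours=1, minutes=30)))     # 01:30:00
```

Hours are not wrapped at 24 here, so a difference of two days would print as 48:00:00.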

You can divide by np.timedelta64 to express the time delta in your desired unit.
As others have said, the negative pandas Timedelta object is the correct output in Python.
import numpy as np
import pandas as pd
delta = pd.to_datetime('2021-05-21 06:00:00') - pd.to_datetime('2021-05-21 06:02:00')
print(delta)
-1 days +23:58:00
# minutes
print(delta / np.timedelta64(1, 'm'))
-2.0
# seconds
print(delta / np.timedelta64(1, 's'))
-120.0

Related

Why doesn't python datetime give negative seconds and hours when subtracting two dates?

I would like to know why I do not get minus hours and minus seconds from datetime.timedelta.
I have the following method:
def time_diff(external_datetime, internal_datetime):
    from pandas.core.indexes.period import parse_time_string  # include for completeness
    time_external = parse_time_string(external_datetime)[0]
    time_internal = parse_time_string(internal_datetime)[0]
    diff = time_external - time_internal
    return diff
All is as expected when the datetimes look like these:
external_datetime = "2020-01-01T00:00:00"
internal_datetime = "2020-01-02T00:00:00"
The function returns a datetime.timedelta of -1 days (datetime.timedelta(days=-1)).
Why then, when I change the times to
external_datetime = "2020-01-01T00:00:00"
internal_datetime = "2020-01-01T01:30:00"
do I get a diff of datetime.timedelta(days=-1, seconds=81000)?
I did wonder if it was due to being 'near' midnight, but
external_datetime = "2020-01-01T11:00:00"
internal_datetime = "2020-01-01T11:30:00"
results in datetime.timedelta(days=-1, seconds=84600).
versions
python 3.8.2
pandas 1.1.4

From the documentation for timedelta:
Only days, seconds and microseconds are stored internally. Arguments
are converted to those units:
A millisecond is converted to 1000 microseconds.
A minute is converted to 60 seconds.
An hour is converted to 3600 seconds.
A week is converted to 7 days.
and days, seconds and microseconds are then normalized so that the
representation is unique, with
0 <= microseconds < 1000000
0 <= seconds < 3600*24 (the number of seconds in one day)
-999999999 <= days <= 999999999
So the number of seconds and microseconds are guaranteed to be non-negative, with the number of days being positive or negative as necessary. It makes sense to let the larger unit days be positive or negative, as that will account for most of an arbitrary interval. days can be slightly more negative than necessary, with the smaller, limited units used to make up the difference.
Note that with this representation, the sign of the interval is determined solely by the sign of the days.
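To illustrate the point about the sign, here is a minimal sketch (is_negative is a hypothetical helper name, not part of the datetime module) that checks the sign of days against an ordinary comparison with a zero timedelta:

```python
from datetime import timedelta

def is_negative(td):
    # After normalization only `days` carries the sign of the interval.
    return td.days < 0

samples = [
    timedelta(seconds=-1),
    timedelta(0),
    timedelta(hours=-30),
    timedelta(minutes=5),
]

for td in samples:
    # Both ways of asking "is this interval negative?" agree.
    assert is_negative(td) == (td < timedelta(0))
    print(repr(td), is_negative(td))
```

In everyday code, comparing against timedelta(0) is probably the more readable choice; the days check is shown only to mirror the explanation.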

Time in computing is generally based on number of seconds (or some unit of seconds) past a reference time. I'd assume Python's DateTime represents data as hours, minutes, seconds, and subunits of seconds after the start of a day. Therefore, seconds will never be negative. Because it is always seconds after, it makes sense to go -1 day + 84600 seconds so that seconds are positive.

Why is the diff of two datetime objects so?

from datetime import datetime
datetime1 = datetime.fromisoformat('2020-08-19 10:13:19')
datetime2 = datetime.fromisoformat('2020-08-19 19:00:00')
diff = datetime1 - datetime2
The diff is a timedelta object, with:
diff.days = -1
diff.seconds = 54799 (about 15.22 hours)
There are only about 9 hours between the two datetimes. Why does it show the number of days as -1 together with 15.22 hours? How should I understand the diff of two datetimes?

If you subtract the earlier datetime from the later datetime, you get a positive timedelta, as one would expect.
The other way around, you get a negative timedelta in the unusual format.
But when you calculate -1 day + 15 hours = -24 hours + 15 hours = -9 hours, the result is correct.
Of course, doing this calculation manually is not what we want.
So, either avoid subtracting a later datetime from an earlier datetime:
# to get an absolute timedelta
if datetime2 > datetime1:
    print(datetime2 - datetime1)
else:
    print(datetime1 - datetime2)
Or use .total_seconds():
print((datetime1 - datetime2).total_seconds())
-31601.0
print((datetime2 - datetime1).total_seconds())
31601.0
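For reference, here is a self-contained version of the same idea using only the standard library. It is a sketch that assumes the two values start out as strings in the format shown in the question, so they are parsed first; abs() replaces the if/else branch:

```python
from datetime import datetime

fmt = "%Y-%m-%d %H:%M:%S"
datetime1 = datetime.strptime("2020-08-19 10:13:19", fmt)
datetime2 = datetime.strptime("2020-08-19 19:00:00", fmt)

diff = datetime1 - datetime2
print(diff.days, diff.seconds)  # -1 54799
print(diff.total_seconds())     # -31601.0
print(abs(diff))                # 8:46:41
```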

In this example, the difference between two datetime objects has a negative number of days, and a positive number of hours.
import datetime
datetime1 = datetime.datetime.fromisoformat('2020-08-19 10:13:19')
datetime2 = datetime.datetime.fromisoformat('2020-08-19 19:00:00')
print(datetime1 - datetime2)
-1 day, 15:13:19
# divide by a timedelta() (with an argument of hours, minutes, seconds, etc.)
print((datetime1 - datetime2) / datetime.timedelta(hours=1)) # in hours
-8.778055555555556
Here is an interesting interview with the core developer who maintains date / time in CPython: https://talkpython.fm/episodes/show/271/unlock-the-mysteries-of-time-pythons-datetime-that-is
UPDATE
You can calculate the time difference in minutes, days, or other units by passing a different argument to datetime.timedelta():
print((datetime1 - datetime2) / datetime.timedelta(minutes=1)) # in minutes
-526.68 (rounded)
print((datetime1 - datetime2) / datetime.timedelta(days=1)) # in days
-0.3658 (rounded)

How to get a time interval between two strings?

I have 2 times stored in separate strings in the form H:M. I need to get the difference between these two and be able to tell how many minutes it equals. I was trying datetime and timedelta, but I'm only a beginner and I don't really understand how they work. I'm getting attribute errors every time.
So I have times a and b, and I have to get their difference in minutes.
E.g. if a = 14:08 and b = 14:50, the difference should be 42.
How do I do that in Python in the simplest way possible? Also, what formats do I need to use for each step?

I assume the difference is 42, not 4 (since there are 42 minutes between 14:08 and 14:50).
If the times always consist of a 5-character string, then it's reasonably easy.
time1 = '14:08'
time2 = '15:03'
hours = int(time2[:2]) - int(time1[:2])
mins = int(time2[3:]) - int(time1[3:])
print(hours)
print(mins)
print(hours * 60 + mins)
Prints:
1
-5
55
hours is the integer value of the left two digits ([:2]) of the second time minus that of the first time.
mins is the integer value of the right two digits ([3:]) of the second time minus that of the first time.
This prints 55; with your values it prints 42 (the above example is to show that it also works when crossing a whole hour).

You can use datetime.strptime.
Also, the difference is 42, not 4 (50 - 8 == 42); I assume that was a typo.
from datetime import datetime
a,b = "14:08", "14:50"
#convert to datetime
time_a = datetime.strptime(a, "%H:%M")
time_b = datetime.strptime(b, "%H:%M")
#get timedelta from the difference of the two datetimes
delta = time_b - time_a
#get the minutes elapsed
minutes = int(delta.total_seconds() // 60)  # total_seconds() keeps whole hours and the sign
print(minutes)
#42

You can get the difference between the datetime.timedelta objects created from the given time strings a and b by subtracting the former from the latter, and use the total_seconds method to obtain the time interval in seconds, with which you can convert to minutes by dividing it by 60:
from datetime import timedelta
from operator import sub
sub(*(timedelta(**dict(zip(('hours', 'minutes'), map(int, t.split(':'))))) for t in (b, a))).total_seconds() // 60
So that given a = '29:50' and b = '30:08', this returns:
18.0
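Putting the strptime and total_seconds ideas from the answers above together, a small helper could look like the sketch below. minutes_between is a hypothetical name, and the sketch assumes both times fall on the same day and use a valid 24-hour H:M format (so '29:50' would not parse here):

```python
from datetime import datetime

def minutes_between(a, b, fmt="%H:%M"):
    """Signed number of whole minutes from time string a to time string b."""
    delta = datetime.strptime(b, fmt) - datetime.strptime(a, fmt)
    # total_seconds() keeps the sign and does not discard whole hours.
    return int(delta.total_seconds() // 60)

print(minutes_between("14:08", "14:50"))  # 42
print(minutes_between("14:08", "15:03"))  # 55
print(minutes_between("14:50", "14:08"))  # -42
```

If b can be on the following day, the negative result would need to be adjusted by 24 * 60; that case is left out of this sketch.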

Python timedelta object with negative values

I don't quite understand how negative arguments in datetime.timedelta are interpreted.
With Positive values:
>>> from datetime import timedelta
>>> d = timedelta(days=1,seconds=1,microseconds=1,milliseconds=1,minutes=1,hours=1,weeks=1)
>>> (d.days, d.seconds, d.microseconds)
(8, 3661, 1001)
This is pretty straightforward. A similar example with negative values looks like:
>>> from datetime import timedelta
>>> d = timedelta(days=-1,seconds=-1,microseconds=-1,milliseconds=-1,minutes=-1,hours=-1,weeks=-1)
>>> (d.days, d.seconds, d.microseconds)
(-9, 82738, 998999)
As per my understanding seconds and microseconds are derived like:
seconds = 86399 - (-60-3600-1)
microseconds = 999999 - (-1-1000)
Is this correct? How come days equals -9?
I am reading this section of docs. But still don't quite understand the working with negative values. Please share explanations or relevant documentation links. Thanks :)

Because of the way timedeltas are stored internally, only the days attribute can take on negative values. This can be surprising when the timedelta is printed back. An illuminating example from the docs,
>>> d = timedelta(microseconds=-1)
>>> (d.days, d.seconds, d.microseconds)
(-1, 86399, 999999)
i.e. -1d + 86399s + 999999µs = -1µs

It makes complete sense: (-1 week + -1 day) + (-1 hour) + (-1 minute) + (-1 second) + (-1 millisecond) + (-1 microsecond) equals (-8 days) + (-1 hour) + (-1 minute) + (-1 second) + (-1 millisecond) + (-1 microsecond).
The total is a little less than -8 days. Since only days can be negative and the other stored fields (seconds and microseconds) are always non-negative, days has to go one step further down, to -9, so that the remainder can be expressed as a positive time of day.
If you print d, you will get -9 days, 22:58:58.998999, with negative 9 days and a positive 22+ hours. Looking at the str of the timedelta can help you understand how a negative timedelta is represented.
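The values quoted above can be reproduced with a few lines of standard-library code; negating d shows the positive interval that the normalized form stands for:

```python
from datetime import timedelta

d = timedelta(days=-1, seconds=-1, microseconds=-1, milliseconds=-1,
              minutes=-1, hours=-1, weeks=-1)

print((d.days, d.seconds, d.microseconds))  # (-9, 82738, 998999)
print(d)                                    # -9 days, 22:58:58.998999

# Negating gives the same magnitude with all components positive.
print(-d)                                   # 8 days, 1:01:01.001001
```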

dateutil.relativedelta - How to get duration in days?

I wish to get the total duration of a relativedelta in terms of days.
Expected:
relativedelta(months=1, days=24) -> 55 days
What I tried:
relativedelta(months=1, days=24).days -> 24 (WRONG)
Is there a simple way to do this? Thanks!

This one bothered me as well. There isn't a very clean way to get the span of time in a particular unit. This is partly because of the date-range dependency on units.
relativedelta() takes an argument for months. But when you think about how long a month is, the answer is "it depends". With that said, it's technically impossible to convert a relativedelta() directly to days, without knowing which days the delta lands on.
Here is what I ended up doing.
from datetime import datetime, timedelta
from dateutil.relativedelta import relativedelta
rd = relativedelta(years=3, months=7, days=19)
# I use 'now', but you may want to adjust your start and end range to a specific set of dates.
now = datetime.now()
# calculate the date difference from the relativedelta span
then = now - rd
# unlike normal timedelta 'then' is returned as a datetime
# subtracting two dates will give you a timedelta which contains the value you're looking for
diff = now - then
print(diff.days)

Simple date diff does it actually.
>>> from datetime import datetime
>>> (datetime(2017, 12, 1) - datetime(2018, 1, 1)).days
-31
To get a positive number, you can swap the dates or use abs:
>>> abs((datetime(2017, 12, 1) - datetime(2018, 1, 1)).days)
31

In many situations you have a much more restricted relativedelta. In my case, my relativedelta had only relative fields set (years, months, weeks, and days) and no other field. You may be able to get away with the simple method.
This is definitely off by a few days, but it may be all you need:
(365 * duration.years) + (30 * duration.months) + (duration.days)
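To see how much the start date matters, here is a standard-library sketch that does not use dateutil at all. add_months and span_in_days are hypothetical helpers written for this illustration, and clipping the day to the end of the month is an assumption about how a month shift should behave:

```python
import calendar
from datetime import date

def add_months(start, months):
    """Shift a date by whole months, clipping the day to the month's length."""
    index = start.year * 12 + (start.month - 1) + months
    year, month0 = divmod(index, 12)
    last_day = calendar.monthrange(year, month0 + 1)[1]
    return start.replace(year=year, month=month0 + 1, day=min(start.day, last_day))

def span_in_days(start, years=0, months=0, days=0):
    """Days covered by a years/months/days offset counted from `start`."""
    end = add_months(start, 12 * years + months)
    return (end - start).days + days

print(span_in_days(date(2020, 1, 1), months=1, days=24))  # 55
print(span_in_days(date(2020, 2, 1), months=1, days=24))  # 53
print(30 * 1 + 24)                                        # 54 (fixed 30-day month)
```

The same "1 month, 24 days" offset covers 55 days from 1 January 2020 but 53 days from 1 February 2020, which is why a single answer is only possible once a reference date is chosen.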
