I'm stuck on this because I'm not quite sure how to ask the question, so here's my best attempt!
I have a list of tuples which represent a temperature reading at a particular timestamp.
[
(datetime.datetime(2022, 11, 30, 8, 25, 10, 261853), 19.82),
(datetime.datetime(2022, 11, 30, 8, 27, 22, 479093), 20.01),
(datetime.datetime(2022, 11, 30, 8, 27, 36, 984757), 19.96),
(datetime.datetime(2022, 11, 30, 8, 36, 46, 651432), 21.25),
(datetime.datetime(2022, 11, 30, 8, 41, 27, 230438), 21.42),
...
(datetime.datetime(2022, 11, 30, 11, 57, 4, 689363), 17.8)
]
As you can see, the deltas between the records are all over the place - some are a few seconds apart, while others are minutes apart.
From these, I want to create a new list of tuples (or other data structure - I am happy to use NumPy or Pandas) where the timestamp value is exactly every 5 minutes, while the temperature reading is calculated as the assumed value at that timestamp given the data that is available. Something like this:
[
(datetime.datetime(2022, 11, 30, 8, 25, 0, 0), ??),
(datetime.datetime(2022, 11, 30, 8, 30, 0, 0), ??),
(datetime.datetime(2022, 11, 30, 8, 35, 0, 0), ??),
(datetime.datetime(2022, 11, 30, 8, 40, 0, 0), ??),
...
(datetime.datetime(2022, 11, 30, 11, 30, 0, 0), ??),
]
My end goal is to be able to plot this data using PIL, but not MatPlotLib as I'm on very constrained hardware. I want to plot a smooth temperature line over a given time period, given the imperfect data I have on hand.
Assuming lst is the input list, you can use:
import pandas as pd
out = (
pd.DataFrame(lst).set_index(0).resample('5min')
.mean().interpolate('linear')
.reset_index().to_numpy().tolist()
)
If you really want a list of tuples:
out = list(map(tuple, out))
Output:
[[Timestamp('2022-11-30 08:25:00'), 19.930000000000003],
[Timestamp('2022-11-30 08:30:00'), 20.590000000000003],
[Timestamp('2022-11-30 08:35:00'), 21.25],
[Timestamp('2022-11-30 08:40:00'), 21.42],
[Timestamp('2022-11-30 08:45:00'), 21.32717948717949],
[Timestamp('2022-11-30 08:50:00'), 21.234358974358976],
...
[Timestamp('2022-11-30 11:45:00'), 17.985641025641026],
[Timestamp('2022-11-30 11:50:00'), 17.892820512820514],
[Timestamp('2022-11-30 11:55:00'), 17.8]]
For datetime types:
out = (
pd.DataFrame(lst).set_index(0).resample('5min')
.mean().interpolate('linear')[1]
)
out = list(zip(out.index.to_pydatetime(), out))
Output:
[(datetime.datetime(2022, 11, 30, 8, 25), 19.930000000000003),
(datetime.datetime(2022, 11, 30, 8, 30), 20.590000000000003),
(datetime.datetime(2022, 11, 30, 8, 35), 21.25),
(datetime.datetime(2022, 11, 30, 8, 40), 21.42),
(datetime.datetime(2022, 11, 30, 8, 45), 21.32717948717949),
(datetime.datetime(2022, 11, 30, 8, 50), 21.234358974358976),
...
(datetime.datetime(2022, 11, 30, 11, 45), 17.985641025641026),
(datetime.datetime(2022, 11, 30, 11, 50), 17.892820512820514),
(datetime.datetime(2022, 11, 30, 11, 55), 17.8)]
Before/after resampling:
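If pandas is too heavy for the constrained hardware mentioned in the question, plain linear interpolation can be sketched with the standard library alone. Note that this is a different technique from the resample/mean/interpolate chain above: it interpolates between the two readings on either side of each grid point instead of averaging each 5-minute bin first, so the numbers will differ somewhat. The function name resample_linear and the sample readings are made up for illustration.

```python
from datetime import datetime, timedelta


def resample_linear(readings, step=timedelta(minutes=5)):
    """Linearly interpolate (timestamp, value) pairs onto a regular grid.

    Grid points are aligned to multiples of `step` counted from midnight of
    the first reading's day, and only points inside the data range are emitted.
    """
    readings = sorted(readings)
    if len(readings) < 2:
        return list(readings)

    first, last = readings[0][0], readings[-1][0]
    midnight = first.replace(hour=0, minute=0, second=0, microsecond=0)
    # First grid point at or after the first reading.
    t = midnight + ((first - midnight) // step) * step
    if t < first:
        t += step

    out = []
    i = 0
    while t <= last:
        # Advance until readings[i] and readings[i + 1] bracket t.
        while readings[i + 1][0] < t:
            i += 1
        (t0, v0), (t1, v1) = readings[i], readings[i + 1]
        if t1 == t0:
            # Duplicate timestamps: no slope to interpolate along.
            value = v0
        else:
            frac = (t - t0) / (t1 - t0)  # timedelta / timedelta gives a float
            value = v0 + frac * (v1 - v0)
        out.append((t, value))
        t += step
    return out


# Hypothetical sample readings, not the data from the question.
sample = [
    (datetime(2022, 11, 30, 8, 24), 10.0),
    (datetime(2022, 11, 30, 8, 26), 20.0),
    (datetime(2022, 11, 30, 8, 36), 30.0),
]
for stamp, value in resample_linear(sample):
    print(stamp, round(value, 2))
```

The result is already a list of (datetime, float) tuples, so it can be fed straight into whatever PIL drawing code is used for the plot.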
from datetime import datetime
from datetime import timedelta
dateIs = datetime.today()
thisDay = datetime.strftime(dateIs, "%d")
thisDayInt = int(thisDay)
dateListTest = []
if(thisDayInt > 0):
    while thisDayInt > 1:
        thisDayInt -= 1
        dateIs.replace(day=thisDayInt,hour=0,minute=0,second=0)
        dateListTest.append(dateIs)
print(dateListTest)
Output
[datetime.datetime(2021, 11, 18, 23, 38, 23, 687371),
datetime.datetime(2021, 11, 18, 23, 38, 23, 687371),
datetime.datetime(2021, 11, 18, 23, 38, 23, 687371),
datetime.datetime(2021, 11, 18, 23, 38, 23, 687371),
datetime.datetime(2021, 11, 18, 23, 38, 23, 687371),
datetime.datetime(2021, 11, 18, 23, 38, 23, 687371),
datetime.datetime(2021, 11, 18, 23, 38, 23, 687371),
datetime.datetime(2021, 11, 18, 23, 38, 23, 687371),
datetime.datetime(2021, 11, 18, 23, 38, 23, 687371),
datetime.datetime(2021, 11, 18, 23, 38, 23, 687371),
datetime.datetime(2021, 11, 18, 23, 38, 23, 687371),
datetime.datetime(2021, 11, 18, 23, 38, 23, 687371),
datetime.datetime(2021, 11, 18, 23, 38, 23, 687371),
datetime.datetime(2021, 11, 18, 23, 38, 23, 687371),
datetime.datetime(2021, 11, 18, 23, 38, 23, 687371),
datetime.datetime(2021, 11, 18, 23, 38, 23, 687371),
datetime.datetime(2021, 11, 18, 23, 38, 23, 687371)]
What I want is a list of the days from today back to the beginning of the month, one by one. But when I build it in the while loop, every entry is the same and the day never decrements. I don't know where I'm going wrong, or whether there is a shorter way to do this.
List comprehension
You could use a list comprehension for this.
from datetime import datetime
dateIs = datetime.today().replace(hour=0, minute=0, second=0, microsecond=0)
thisDay = dateIs.day
dateListTest = [dateIs.replace(day=d) for d in range(thisDay-1, 0, -1)]
print(dateListTest)
[datetime.datetime(2021, 11, 17, 0, 0),
datetime.datetime(2021, 11, 16, 0, 0),
datetime.datetime(2021, 11, 15, 0, 0),
datetime.datetime(2021, 11, 14, 0, 0),
datetime.datetime(2021, 11, 13, 0, 0),
datetime.datetime(2021, 11, 12, 0, 0),
datetime.datetime(2021, 11, 11, 0, 0),
datetime.datetime(2021, 11, 10, 0, 0),
datetime.datetime(2021, 11, 9, 0, 0),
datetime.datetime(2021, 11, 8, 0, 0),
datetime.datetime(2021, 11, 7, 0, 0),
datetime.datetime(2021, 11, 6, 0, 0),
datetime.datetime(2021, 11, 5, 0, 0),
datetime.datetime(2021, 11, 4, 0, 0),
datetime.datetime(2021, 11, 3, 0, 0),
datetime.datetime(2021, 11, 2, 0, 0),
datetime.datetime(2021, 11, 1, 0, 0)]
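As for why the original loop printed the same value every time: datetime objects are immutable, so replace returns a new object and leaves dateIs untouched, and the loop then appended the unchanged dateIs. A minimal sketch of the same loop with that one fix is below. The fixed date stands in for datetime.today() only to make the result reproducible, and the variable names are illustrative.

```python
from datetime import datetime

# Fixed date instead of datetime.today() so the result is reproducible (assumption).
date_is = datetime(2021, 11, 18, 23, 38, 23, 687371)

day = date_is.day
date_list = []
while day > 1:
    day -= 1
    # replace() returns a NEW datetime; keep the result instead of discarding it.
    date_list.append(
        date_is.replace(day=day, hour=0, minute=0, second=0, microsecond=0)
    )

print(date_list[0])   # 2021-11-17 00:00:00
print(date_list[-1])  # 2021-11-01 00:00:00
```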
Pandas
You could also do it with date_range from Pandas, though you will end up with a list of datetime.date rather than datetime.datetime.
from datetime import datetime
import pandas as pd
dateIs = datetime.today().replace(hour=0, minute=0, second=0, microsecond=0)
thisDay = dateIs.day
dateListTestPandas = pd.date_range(start=dateIs.replace(day=thisDay-1), end=dateIs.replace(day=1), freq="-1D").date.tolist()
print(dateListTestPandas)
[datetime.date(2021, 11, 17),
datetime.date(2021, 11, 16),
datetime.date(2021, 11, 15),
datetime.date(2021, 11, 14),
datetime.date(2021, 11, 13),
datetime.date(2021, 11, 12),
datetime.date(2021, 11, 11),
datetime.date(2021, 11, 10),
datetime.date(2021, 11, 9),
datetime.date(2021, 11, 8),
datetime.date(2021, 11, 7),
datetime.date(2021, 11, 6),
datetime.date(2021, 11, 5),
datetime.date(2021, 11, 4),
datetime.date(2021, 11, 3),
datetime.date(2021, 11, 2),
datetime.date(2021, 11, 1)]
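A further option, sketched here with only the standard library, is to subtract timedelta steps from midnight today. The helper name days_back_to_month_start is made up for this example. It returns an empty list on the 1st of the month, since there are no earlier days to list.

```python
from datetime import datetime, timedelta


def days_back_to_month_start(today):
    """Return midnight datetimes from yesterday back to the 1st of the month."""
    midnight = today.replace(hour=0, minute=0, second=0, microsecond=0)
    # n = 1 is yesterday; n = midnight.day - 1 lands on the 1st of the month.
    return [midnight - timedelta(days=n) for n in range(1, midnight.day)]


# Fixed date used for illustration; pass datetime.today() in real use.
print(days_back_to_month_start(datetime(2021, 11, 18, 23, 38)))
```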
I have this list:
list = [datetime.datetime(2019, 10, 20, 16, 37), datetime.datetime(2019, 10, 20, 16, 50, 7), datetime.datetime(2019, 11, 21, 16, 51, 47), datetime.datetime(2019, 11, 21, 10, 51, 20), datetime.datetime(2019, 10, 21, 10, 53, 22), datetime.datetime(2019, 11, 21, 10, 58, 27)]
I want to group the elements that fall on the same date. How can I do this? The expected result is:
[datetime.datetime(2019, 10, 20, 16, 37) , datetime.datetime(2019, 10, 20, 16, 50, 7) ]
[datetime.datetime(2019, 10, 21, 10, 53, 22)]
[datetime.datetime(2019, 11, 21, 16, 51, 47), datetime.datetime(2019, 11, 21, 10, 51, 20),datetime.datetime(2019, 11, 21, 10, 58, 27)]
Use itertools.groupby:
from itertools import groupby
# l is the input list of datetime objects
f = lambda x: x.date()
[list(g) for k, g in groupby(sorted(l, key=f), key=f)]
Output:
[[datetime.datetime(2019, 10, 20, 16, 37),
datetime.datetime(2019, 10, 20, 16, 50, 7)],
[datetime.datetime(2019, 10, 21, 10, 53, 22)],
[datetime.datetime(2019, 11, 21, 16, 51, 47),
datetime.datetime(2019, 11, 21, 10, 51, 20),
datetime.datetime(2019, 11, 21, 10, 58, 27)]]
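The sorted call matters here because groupby only merges adjacent items with equal keys. A small sketch using the list from the question shows the difference (the variable names are illustrative):

```python
import datetime
from itertools import groupby

data = [
    datetime.datetime(2019, 10, 20, 16, 37),
    datetime.datetime(2019, 10, 20, 16, 50, 7),
    datetime.datetime(2019, 11, 21, 16, 51, 47),
    datetime.datetime(2019, 11, 21, 10, 51, 20),
    datetime.datetime(2019, 10, 21, 10, 53, 22),
    datetime.datetime(2019, 11, 21, 10, 58, 27),
]


def by_date(d):
    return d.date()


# Without sorting, 2019-11-21 is split in two because 2019-10-21 sits in between.
unsorted_groups = [list(g) for _, g in groupby(data, key=by_date)]
# Sorting by the same key first makes equal dates adjacent.
sorted_groups = [list(g) for _, g in groupby(sorted(data, key=by_date), key=by_date)]

print(len(unsorted_groups))  # 4
print(len(sorted_groups))    # 3
```

Because Python's sort is stable, items within each date keep their original relative order when sorting by date alone.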
You could just use itertools.groupby like,
>>> import itertools
>>> import datetime
>>> import pprint
>>> x
[datetime.datetime(2019, 10, 20, 16, 37), datetime.datetime(2019, 10, 20, 16, 50, 7), datetime.datetime(2019, 11, 21, 16, 51, 47), datetime.datetime(2019, 11, 21, 10, 51, 20), datetime.datetime(2019, 10, 21, 10, 53, 22), datetime.datetime(2019, 11, 21, 10, 58, 27)]
>>> s = [list(v) for k,v in itertools.groupby(sorted(x), lambda x: (x.year, x.month, x.day))]
>>> pprint.pprint(s)
[[datetime.datetime(2019, 10, 20, 16, 37),
datetime.datetime(2019, 10, 20, 16, 50, 7)],
[datetime.datetime(2019, 10, 21, 10, 53, 22)],
[datetime.datetime(2019, 11, 21, 10, 51, 20),
datetime.datetime(2019, 11, 21, 10, 58, 27),
datetime.datetime(2019, 11, 21, 16, 51, 47)]]
>>>
Using dict.setdefault and a simple iteration.
Ex:
import datetime
data = [datetime.datetime(2019, 10, 20, 16, 37), datetime.datetime(2019, 10, 20, 16, 50, 7), datetime.datetime(2019, 11, 21, 16, 51, 47), datetime.datetime(2019, 11, 21, 10, 51, 20), datetime.datetime(2019, 10, 21, 10, 53, 22), datetime.datetime(2019, 11, 21, 10, 58, 27)]
result = {}
for i in data:
    result.setdefault(i.strftime("%Y%m%d"), []).append(i)
print(list(result.values()))
Output:
[[datetime.datetime(2019, 10, 20, 16, 37),
datetime.datetime(2019, 10, 20, 16, 50, 7)],
[datetime.datetime(2019, 11, 21, 16, 51, 47),
datetime.datetime(2019, 11, 21, 10, 51, 20),
datetime.datetime(2019, 11, 21, 10, 58, 27)],
[datetime.datetime(2019, 10, 21, 10, 53, 22)]]
I have a list of tuples like the following:
[('John', datetime.datetime(2016, 11, 27, 18, 18), datetime.datetime(2017, 11, 27, 18, 18)), ('Adam', datetime.datetime(2017, 11, 27, 18, 18), datetime.datetime(2018, 11, 27, 18, 18)), ('Adam', datetime.datetime(2001, 5, 27, 18, 18), datetime.datetime(2002, 9, 27, 18, 18)), ('Adam', datetime.datetime(2015, 11, 27, 18, 18), datetime.datetime(2016, 11, 27, 18, 18))]
How can I sort this list based on the datetime objects? Each tuple is in the format (name, from date, to date).
Just use the key parameter of sorted on from_date:
import datetime
data = [('John', datetime.datetime(2016, 11, 27, 18, 18), datetime.datetime(2017, 11, 27, 18, 18)),
('Adam', datetime.datetime(2017, 11, 27, 18, 18), datetime.datetime(2018, 11, 27, 18, 18)),
('Adam', datetime.datetime(2001, 5, 27, 18, 18), datetime.datetime(2002, 9, 27, 18, 18)),
('Adam', datetime.datetime(2015, 11, 27, 18, 18), datetime.datetime(2016, 11, 27, 18, 18))]
result = sorted(data, key=lambda x: x[1])
for e in result:
    print(e)
Output
('Adam', datetime.datetime(2001, 5, 27, 18, 18), datetime.datetime(2002, 9, 27, 18, 18))
('Adam', datetime.datetime(2015, 11, 27, 18, 18), datetime.datetime(2016, 11, 27, 18, 18))
('John', datetime.datetime(2016, 11, 27, 18, 18), datetime.datetime(2017, 11, 27, 18, 18))
('Adam', datetime.datetime(2017, 11, 27, 18, 18), datetime.datetime(2018, 11, 27, 18, 18))
I'm not sure if I got the question right. But, I think you're looking for something like this:
sorted(a, key=lambda x: x[1])
Since datetime objects can be compared with each other, all that remains is to tell sorted which element to sort on (here a is your list, and the key picks the second element of each tuple, the from date).
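An equivalent sketch uses operator.itemgetter in place of the lambda, with reverse=True added to show newest-first ordering. The data is the list from the question, and the variable names are illustrative.

```python
import datetime
from operator import itemgetter

data = [
    ('John', datetime.datetime(2016, 11, 27, 18, 18), datetime.datetime(2017, 11, 27, 18, 18)),
    ('Adam', datetime.datetime(2017, 11, 27, 18, 18), datetime.datetime(2018, 11, 27, 18, 18)),
    ('Adam', datetime.datetime(2001, 5, 27, 18, 18), datetime.datetime(2002, 9, 27, 18, 18)),
    ('Adam', datetime.datetime(2015, 11, 27, 18, 18), datetime.datetime(2016, 11, 27, 18, 18)),
]

# itemgetter(1) picks the from date, same as lambda x: x[1].
by_from_date = sorted(data, key=itemgetter(1))
# reverse=True puts the most recent from date first.
newest_first = sorted(data, key=itemgetter(1), reverse=True)

print([row[1].year for row in by_from_date])  # [2001, 2015, 2016, 2017]
print(newest_first[0][0])                     # Adam
```

Use itemgetter(2) instead to sort on the to date.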
I am trying to create a stacked histogram with datetime objects, but I get the following error:
TypeError: unorderable types: datetime.datetime() < float()
The code does work when I either convert the objects to timestamps or when I use one range of data (no stacking).
import datetime
import matplotlib.pyplot as plt
data = [[datetime.datetime(2015, 12, 24, 21, 13, 45), datetime.datetime(2015, 12, 30, 23, 37, 8), datetime.datetime(2015, 12, 30, 19, 43, 18), datetime.datetime(2015, 12, 30, 16, 14, 12), datetime.datetime(2015, 12, 30, 11, 32, 8), datetime.datetime(2015, 12, 29, 6, 29, 25), datetime.datetime(2015, 12, 28, 22, 54, 49), datetime.datetime(2015, 12, 28, 18, 41, 50), datetime.datetime(2015, 12, 28, 14, 25, 42), datetime.datetime(2015, 12, 28, 3, 1, 34), datetime.datetime(2015, 12, 27, 21, 10, 20), datetime.datetime(2015, 12, 27, 11, 29, 38), datetime.datetime(2015, 12, 26, 20, 34, 14), datetime.datetime(2015, 12, 26, 16, 58, 47), datetime.datetime(2015, 12, 26, 10, 54, 40), datetime.datetime(2015, 12, 25, 18, 17, 42), datetime.datetime(2015, 12, 24, 15, 44, 58), datetime.datetime(2015, 12, 25, 17, 25, 9), datetime.datetime(2015, 12, 25, 12, 33, 7), datetime.datetime(2015, 12, 30, 19, 26, 15), datetime.datetime(2015, 12, 30, 12, 46, 13), datetime.datetime(2015, 12, 30, 3, 38, 24), datetime.datetime(2015, 12, 25, 21, 11, 59), datetime.datetime(2015, 12, 25, 13, 30, 34), datetime.datetime(2015, 12, 24, 14, 6, 20)], [datetime.datetime(2015, 12, 28, 20, 59, 53), datetime.datetime(2015, 12, 27, 14, 3, 41), datetime.datetime(2015, 12, 26, 9, 37, 17)], [datetime.datetime(2015, 12, 29, 17, 18, 32)], [datetime.datetime(2015, 12, 29, 23, 15, 24)]]
fig, histograms = plt.subplots(5, 1, sharex=True, squeeze=False)
h = histograms[1][0]
h.hist(data, stacked=True)
This is the code without stacking:
import datetime
import matplotlib.pyplot as plt
data = [datetime.datetime(2015, 12, 24, 21, 13, 45), datetime.datetime(2015, 12, 30, 23, 37, 8), datetime.datetime(2015, 12, 30, 19, 43, 18), datetime.datetime(2015, 12, 30, 16, 14, 12), datetime.datetime(2015, 12, 30, 11, 32, 8), datetime.datetime(2015, 12, 29, 6, 29, 25), datetime.datetime(2015, 12, 28, 22, 54, 49), datetime.datetime(2015, 12, 28, 18, 41, 50), datetime.datetime(2015, 12, 28, 14, 25, 42), datetime.datetime(2015, 12, 28, 3, 1, 34), datetime.datetime(2015, 12, 27, 21, 10, 20), datetime.datetime(2015, 12, 27, 11, 29, 38), datetime.datetime(2015, 12, 26, 20, 34, 14), datetime.datetime(2015, 12, 26, 16, 58, 47), datetime.datetime(2015, 12, 26, 10, 54, 40), datetime.datetime(2015, 12, 25, 18, 17, 42), datetime.datetime(2015, 12, 24, 15, 44, 58), datetime.datetime(2015, 12, 25, 17, 25, 9), datetime.datetime(2015, 12, 25, 12, 33, 7), datetime.datetime(2015, 12, 30, 19, 26, 15), datetime.datetime(2015, 12, 30, 12, 46, 13), datetime.datetime(2015, 12, 30, 3, 38, 24), datetime.datetime(2015, 12, 25, 21, 11, 59), datetime.datetime(2015, 12, 25, 13, 30, 34), datetime.datetime(2015, 12, 24, 14, 6, 20), datetime.datetime(2015, 12, 28, 20, 59, 53), datetime.datetime(2015, 12, 27, 14, 3, 41), datetime.datetime(2015, 12, 26, 9, 37, 17), datetime.datetime(2015, 12, 29, 17, 18, 32), datetime.datetime(2015, 12, 29, 23, 15, 24)]
fig, histograms = plt.subplots(5, 1, sharex=True, squeeze=False)
h = histograms[1][0]
h.hist(data, stacked=True)
NOTE:
According to the answers, this is considered a bug. For future visitors: I have filed a bug report at https://github.com/matplotlib/matplotlib/issues/5898 in case you want to track progress.
This is a bug, revealed by version 1.5.x supporting histograms of single series of datetime type data. Previous versions of matplotlib would not histogram datetime data whether stacked or not, failing with a similar error that said datetime could not be compared with float.
The exception is thrown by this line of code. As you can see, it is called only when bin edges are not specified, and it tries to find the minimum of the time series (comparing it with np.inf and taking the minimum of the two). You can work around this by specifying bin edges in the call, but that leads to a further failure, because the numpy histogram function called under the hood checks for bins of less than zero width.
"Under the hood" when a single list of datetime.datetime objects is passed to the pyplot.hist() function, the data are actually converted to UNIX epoch timestamps (you can guess this from the labels of the x axis). This is not done when the input is a list of lists of datetime.datetime objects.
At that stage, I think we have to call it a bug and you will have to use timestamp as you have already discovered - e.g. h.hist([[t.timestamp() for t in s] for s in data], stacked=True). You can still give the bin labels in date format, even though the actual data being histogrammed are timestamps, thus this should be transparent to the user.
I'll have a look to see whether I can find a nicer workaround / fix and possibly raise an issue on the matplotlib github.
Code that works (matplotlib 1.5.1, Python 3), albeit a bit messy
import datetime
import matplotlib.pyplot as plt
data = [[datetime.datetime(2015, 12, 24, 21, 13, 45), datetime.datetime(2015, 12, 30, 23, 37, 8), datetime.datetime(2015, 12, 30, 19, 43, 18), datetime.datetime(2015, 12, 30, 16, 14, 12), datetime.datetime(2015, 12, 30, 11, 32, 8), datetime.datetime(2015, 12, 29, 6, 29, 25), datetime.datetime(2015, 12, 28, 22, 54, 49), datetime.datetime(2015, 12, 28, 18, 41, 50), datetime.datetime(2015, 12, 28, 14, 25, 42), datetime.datetime(2015, 12, 28, 3, 1, 34), datetime.datetime(2015, 12, 27, 21, 10, 20), datetime.datetime(2015, 12, 27, 11, 29, 38), datetime.datetime(2015, 12, 26, 20, 34, 14), datetime.datetime(2015, 12, 26, 16, 58, 47), datetime.datetime(2015, 12, 26, 10, 54, 40), datetime.datetime(2015, 12, 25, 18, 17, 42), datetime.datetime(2015, 12, 24, 15, 44, 58), datetime.datetime(2015, 12, 25, 17, 25, 9), datetime.datetime(2015, 12, 25, 12, 33, 7), datetime.datetime(2015, 12, 30, 19, 26, 15), datetime.datetime(2015, 12, 30, 12, 46, 13), datetime.datetime(2015, 12, 30, 3, 38, 24), datetime.datetime(2015, 12, 25, 21, 11, 59), datetime.datetime(2015, 12, 25, 13, 30, 34), datetime.datetime(2015, 12, 24, 14, 6, 20)], [datetime.datetime(2015, 12, 28, 20, 59, 53), datetime.datetime(2015, 12, 27, 14, 3, 41), datetime.datetime(2015, 12, 26, 9, 37, 17)], [datetime.datetime(2015, 12, 29, 17, 18, 32)], [datetime.datetime(2015, 12, 29, 23, 15, 24)]]
fig, histograms = plt.subplots(5, 1, sharex=True, squeeze=False)
h = histograms[1][0]
h.hist([[t.timestamp() for t in l] for l in data], stacked=True)
locs, labels = plt.xticks()
plt.xticks(locs,[datetime.datetime.fromtimestamp(t) for t in locs], rotation='vertical')
plt.gcf().subplots_adjust(bottom=0.4)
fig.set_size_inches(4, 15)
plt.show()
This produces a stacked histogram with date labels on the x axis.