How to use datetime.time to plot in Python - python

I have a list of timestamps in HH:MM:SS format and want to plot them against some values using datetime.time. Python does not seem to like the way I do it. Can someone please help?
import datetime
import matplotlib.pyplot as plt
# random data
x = [datetime.time(12,10,10), datetime.time(12, 11, 10)]
y = [1,5]
# plot
plt.plot(x,y)
plt.show()
*TypeError: float() argument must be a string or a number*

Getting a clean plot takes two steps.
Step 1: prepare the data in a proper format
Convert from datetime to the float representation that matplotlib uses for dates and times.
As usual, the devil is in the details.
matplotlib's date numbers are close to, but not identical with, a plain day count:
# mPlotDATEs.date2num.__doc__
#
# *d* is either a class `datetime` instance or a sequence of datetimes.
#
# Return value is a floating point number (or sequence of floats)
# which gives the number of days (fraction part represents hours,
# minutes, seconds) since 0001-01-01 00:00:00 UTC, *plus* *one*.
# The addition of one here is a historical artifact. Also, note
# that the Gregorian calendar is assumed; this is not universal
# practice. For details, see the module docstring.
So it is highly recommended to use matplotlib's own conversion tools:
from matplotlib import dates as mPlotDATEs # helper functions num2date()
# # and date2num()
# # to convert to/from.
Step 2: manage axis labels, formatting and scale (min/max) as a separate issue
matplotlib provides tools for this part too.
Check code in this answer for all details
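The two steps above could look roughly like the sketch below. It is only an illustration: the anchor date is an arbitrary placeholder, and the names `anchor_day`, `stamps` and `x_num` are mine, not from the question. The absolute numbers returned by date2num depend on the matplotlib version and its epoch setting, so only differences between them should be relied on.

```python
import datetime

import matplotlib.pyplot as plt
from matplotlib import dates as mdates

# Data from the question
times = [datetime.time(12, 10, 10), datetime.time(12, 11, 10)]
values = [1, 5]

# Step 1: attach an arbitrary date (assumption: any fixed day will do),
# then convert to matplotlib's float representation.
anchor_day = datetime.date(2000, 1, 1)
stamps = [datetime.datetime.combine(anchor_day, t) for t in times]
x_num = mdates.date2num(stamps)

# Step 2: plot the floats, tell the axis they are dates, and show only the time.
fig, ax = plt.subplots()
ax.plot(x_num, values)
ax.xaxis_date()
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M:%S"))
```

Call plt.show() or fig.savefig(...) afterwards to see the result.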

This is still an issue in Python 3.5.3 and Matplotlib 2.1.0.
A workaround is to use datetime.datetime objects instead of datetime.time ones:
import datetime
import matplotlib.pyplot as plt
# random data
x = [datetime.time(12,10,10), datetime.time(12, 11, 10)]
x_dt = [datetime.datetime.combine(datetime.date.today(), t) for t in x]
y = [1,5]
# plot
plt.plot(x_dt, y)
plt.show()
By default the date part should not be visible. Otherwise you can always use DateFormatter (set it before calling plt.show()):
import matplotlib.dates as mdates
ax = plt.gca()
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H-%M-%S'))

I came to this page because I have a similar issue. I have a Pandas DataFrame df with a datetime column df.dtm and a data column df.x, spanning several days, but I want to plot them using matplotlib.pyplot as a function of time of day, not date and time (datetime, datetimeindex). I.e., I want all data points to be folded into the same 24h range in the plot. I can plot df.x vs. df.dtm without issue, but I've just spent two hours trying to figure out how to convert df.dtm to df.time (containing the time of day without a date) and then plotting it. The (to me) straightforward solution does not work:
df.dtm = pd.to_datetime(df.dtm)
ax.plot(df.dtm, df.x)
# Works (with times on different dates; a range >24h)
df['time'] = df.dtm.dt.time
ax.plot(df.time, df.x)
# DOES NOT WORK: ConversionError('Failed to convert value(s) to axis '
matplotlib.units.ConversionError: Failed to convert value(s) to axis units:
array([datetime.time(0, 0), datetime.time(0, 5), etc.])
This does work:
pd.plotting.register_matplotlib_converters() # Needed to plot Pandas df with Matplotlib
df.dtm = pd.to_datetime(df.dtm, utc=True) # NOTE: MUST add a timezone, even if undesired
ax.plot(df.dtm, df.x)
# Works as before
df['time'] = df.dtm.dt.time
ax.plot(df.time, df.x)
# WORKS!!! (with time of day, all data in the same 24h range)
Note that the differences are in the first two lines. The first line allows better collaboration between Pandas and Matplotlib, the second seems redundant (or even wrong), but that doesn't matter in my case, since I use a single timezone and it is not plotted.
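An alternative that avoids plotting datetime.time objects altogether is to fold every timestamp onto one fixed reference day, so the x values stay ordinary datetimes. This is only a sketch under my own assumptions: the sample data, the `folded` column and the `reference_day` value are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data spanning two days
df = pd.DataFrame({
    "dtm": pd.to_datetime([
        "2021-03-01 00:05", "2021-03-01 13:30",
        "2021-03-02 00:10", "2021-03-02 13:45",
    ]),
    "x": [1.0, 2.0, 3.0, 4.0],
})

# Arbitrary reference day; only its time-of-day part is meant to be shown.
reference_day = pd.Timestamp("2000-01-01")

# Timedelta since midnight of each row's own day
time_of_day = df["dtm"] - df["dtm"].dt.normalize()

# Same time of day, but all on the reference day
df["folded"] = reference_day + time_of_day

# Sort so that a line plot does not jump back and forth
df = df.sort_values("folded")
```

ax.plot(df["folded"], df["x"]) together with a DateFormatter such as '%H:%M' should then show all points in one 24h range.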

Related

Problems with matplotlib and datetime

I am trying to plot the observed and calculated values of a time series parameter using matplotlib. The observed data are stored in an Excel file which I read using openpyxl and convert to a pandas dataframe. The simulated values are read as elapsed days, which I convert to numpy datetime64 values using
delt = get_simulated_time()
t0 = np.datetime64('2004-01-01T00:00:00')
tsim = t0 + np.asarray(delt).astype('timedelta64[D]')
I plot the data using the following code snippet
df = obs_data_df.query("block=='blk-7'")
pobs = df['pres']
tobs = df['date']
tobs = np.array(tobs, dtype='datetime64')
print(type(tobs), np.min(tobs), np.max(tobs))
axs.plot(tobs, pobs, '.', color='g', label='blk-7, obs', markersize=8)
tsim = np.array(curr_sim_obj.tsim, dtype='datetime64')
print(type(tsim), np.min(tsim), np.max(tsim))
axs.plot(tsim, curr_sim_obj.psim[:, 0], '-', color='g', label='blk-7, sim', linewidth=1)
The results of the print statements are:
print(type(tobs), np.min(tobs), np.max(tobs))
... <class 'numpy.ndarray'> 2004-06-01T00:00:00.000000000 2020-06-01T00:00:00.000000000
print(type(tsim), np.min(tsim), np.max(tsim))
... <class 'numpy.ndarray'> 2004-01-01T00:00:00 2020-07-20T00:00:00
These types look OK but I get this error message from matplotlib:
ValueError: view limit minimum -36907.706903627106 is less than 1 and is an invalid Matplotlib date value. This often happens if you pass a non-datetime value to an axis that has datetime units
I don't understand why I am getting this message since the print statements indicate that the data are consistent. I tried investigating further using
print(np.dtype(tsim), np.min(tobs), np.max(tobs))
but get this error message:
TypeError: data type not understood
This has confused me even further since I set the tobs data type in the preceding statement. I have to say that I am really confused about the differences in the way that python, pandas and numpy handle dates and the various code kludges above reflect workarounds that I have picked up along the way. I would basically like to know how to plot the two different time series on the same plot so all suggestions very welcome. Thank you in advance!
Update:
While cutting down the code to get a simpler case that reproduced the error I found the following code buried in the plotting routine:
axs.plot(10*np.random.randn(100), 10*np.random.randn(100), 'o')
This was left over from testing the plot routine. Once I removed this the errors disappeared. I guess I need to check my code more carefully ...
The solution was to use the matplotlib.dates.num2date function on both sets of data.
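For the elapsed-days conversion mentioned in the question, here is a small sketch that keeps fractional days. As far as I know, casting floats with astype('timedelta64[D]') truncates to whole days, so this version goes through whole seconds instead. The helper name `elapsed_days_to_datetime64` and the sample values are mine, not from the original code.

```python
import numpy as np

SECONDS_PER_DAY = 86400


def elapsed_days_to_datetime64(elapsed_days, start):
    """Convert elapsed days (possibly fractional) to datetime64[s] values."""
    elapsed = np.asarray(elapsed_days, dtype=float)
    # Round to whole seconds before casting, so nothing is silently truncated
    seconds = np.round(elapsed * SECONDS_PER_DAY).astype("int64")
    return np.datetime64(start, "s") + seconds.astype("timedelta64[s]")


# Hypothetical elapsed times in days since the simulation start
tsim = elapsed_days_to_datetime64([0.0, 0.5, 152.25], "2004-01-01T00:00:00")
```

Both series can then be passed to axs.plot as datetime64 arrays, provided nothing else plots plain floats on the same axis.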

Plotting time on the x-axis with Python's matplotlib

I am reading in data from a text file which contains data in the format (date time; microVolts):
e.g. 07.03.2017 23:14:01,000; 279
And I wish to plot a graph using matplotlib by capturing only the time (x-axis) and plotting it against microVolts (y-axis). So far, I've managed to extract the time element from the string and convert it into datetime format (shown below).
I tried to append each value of time into x to plot, but the program just freezes and displays nothing.
Here is part of the code:
from datetime import datetime
import matplotlib.pyplot as plt
ecg = open(file2).readlines()
x = []
for line in range(len(ecg)):
    ecgtime = ecg[7:][line][:23]
    ecgtime = datetime.strptime(ecgtime, '%d.%m.%Y %H:%M:%S,%f')
    x.append(ecgtime.time())
I'm aware the datetime format is causing the issue but I can't figure out how to convert it into float/int as it says:
'invalid literal for float(): 23:14:01,000'
I don't have enough reputation to comment, so I am posting this as an answer.
datetime.datetime.time() returns a datetime.time object, but matplotlib needs a float here.
Could you try datetime.datetime.timestamp() instead?
See the last line:
from datetime import datetime
import matplotlib.pyplot as plt
ecg = open(file2).readlines()
x = []
for line in range(len(ecg)):
    ecgtime = ecg[7:][line][:23]
    ecgtime = datetime.strptime(ecgtime, '%d.%m.%Y %H:%M:%S,%f')
    x.append(ecgtime.timestamp())
EDIT: timestamp() is available since Python 3.3. For Python 2 you can use
from time import mktime
...
x.append(mktime(ecgtime.timetuple()))
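If only the time of day matters, another option is to plot seconds since midnight instead of a full timestamp. A minimal sketch, assuming every data line looks like the example in the question; the helper name `parse_sample` is hypothetical.

```python
from datetime import datetime


def parse_sample(line):
    """Split 'date time; microVolts' into (seconds since midnight, value)."""
    stamp_text, value_text = line.split(";")
    stamp = datetime.strptime(stamp_text.strip(), "%d.%m.%Y %H:%M:%S,%f")
    midnight = stamp.replace(hour=0, minute=0, second=0, microsecond=0)
    seconds = (stamp - midnight).total_seconds()
    return seconds, float(value_text)


# Example line taken from the question
sample = parse_sample("07.03.2017 23:14:01,000; 279")
```

The first element of each result is a plain float, so a list of them can go straight into plt.plot as x values.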

Smart x-axis for bokeh periodic time series

I have time series data in a pandas Series object. The values are floats representing the size of an event. Together, event times and sizes tell me how busy my system is. I'm using a scatterplot.
I want to look at traffic patterns over various time periods, to answer questions like "Is there a daily spike at noon?" or "Which days of the week have the most traffic?" Therefore I need to overlay data from many different periods. I do this by converting timestamps to timedeltas (first by subtracting the start of the first period, then by doing a mod with the period length).
Now my index uses time intervals relative to an "abstract" time period, like a day or a week. I would like to produce plots where the x-axis shows something other than just nanoseconds. Ideally it would show month, day of the week, hour, etc. depending on timescale as you zoom in and out (as Bokeh graphs generally do for time series).
The code below shows an example of how I currently plot. The resulting graph has an x-axis in units of nanoseconds, which is not what I want. How do I get a smart x-axis that behaves more like what I would see for timestamps?
import numpy as np
import pandas as pd
from bokeh.charts import show, output_file
from bokeh.plotting import figure
oneDay = np.timedelta64(24*60*60,'s')
fourHours = 4 * 60 * 60 * 10**9 # four hours in nanoseconds (ugly)
time = [pd.Timestamp('2015-04-27 01:00:00'), # a Monday
        pd.Timestamp('2015-05-04 02:00:00'), # a Monday
        pd.Timestamp('2015-05-11 03:00:00'), # a Monday
        pd.Timestamp('2015-05-12 04:00:00') # a Tuesday
        ]
resp = [2.0, 1.3, 2.6, 1.3]
ts = pd.Series(resp, index=time)
days = dict(list(ts.groupby(lambda x: x.weekday)))
monday = days[0] # this TimeSeries consists of all data for all Mondays
# Now convert timestamps to timedeltas
# First subtract the timestamp of the starting date
# Then take the remainder after dividing by one day
# Result: each index value is in the 24 hour range [00:00:00, 23:59:59]
tdi = monday.index - pd.Timestamp(monday.index.date[0])
x = pd.TimedeltaIndex([td % oneDay for td in tdi])
y = monday.values
output_file('bogus.html')
xmax = fourHours # why doesn't np.timedelta64 work for xmax?
fig = figure(x_range=[0,xmax], y_range=[0,4])
fig.circle(x, y)
show(fig)
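One possible direction, sketched here with pandas only: after taking the remainder, add the timedeltas back onto an arbitrary reference timestamp, so the x values are real datetimes again. Such values could then be given to a figure created with x_axis_type='datetime', which I would expect to produce the usual time-aware ticks (the reference date itself would still appear at coarse zoom levels). The function name `fold_onto_reference` and the reference date are my own choices, and the sketch assumes the first timestamp falls in the first period.

```python
import pandas as pd


def fold_onto_reference(index, period, reference):
    """Map each timestamp to reference + (offset within its period)."""
    start = index[0].normalize()        # midnight of the first day
    offsets = (index - start) % period  # position inside the period
    return reference + offsets


# The three Monday timestamps from the question
mondays = pd.DatetimeIndex([
    "2015-04-27 01:00:00",
    "2015-05-04 02:00:00",
    "2015-05-11 03:00:00",
])

# Arbitrary reference day (2000-01-03 happens to be a Monday)
reference = pd.Timestamp("2000-01-03")
folded = fold_onto_reference(mondays, pd.Timedelta(days=1), reference)
```

For a weekly period the start would have to be aligned to the beginning of a week rather than to midnight of the first sample.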

Manually setting xticks with xaxis_date() in Python/matplotlib

I've been looking into how to make plots against time on the x axis and have it pretty much sorted, with one strange quirk that makes me wonder whether I've run into a bug or (admittedly much more likely) am doing something I don't really understand.
Simply put, below is a simplified version of my program. If I put this in a .py file and execute it from an interpreter (ipython) I get a figure with an x axis with the year only, "2012", repeated a number of times, like this.
However, if I comment out the line (40) that sets the xticks manually, namely 'plt.xticks(tk)' and then run that exact command in the interpreter immediately after executing the script, it works great and my figure looks like this.
Similarly it also works if I just move that line to be after the savefig command in the script, that's to say to put it at the very end of the file. Of course in both cases only the figure drawn on screen will have the desired axis, and not the saved file. Why can't I set my x axis earlier?
Grateful for any insights, thanks in advance!
import matplotlib.pyplot as plt
import datetime
from matplotlib.dates import date2num
# define arrays for x, y and errors
x=[16.7,16.8,17.1,17.4]
y=[15,17,14,16]
e=[0.8,1.2,1.1,0.9]
xtn=[]
# convert x to datetime format
for t in x:
    hours=int(t)
    mins=int((t-int(t))*60)
    secs=int(((t-hours)*60-mins)*60)
    dt=datetime.datetime(2012,1,1,hours,mins,secs)
    xtn.append(date2num(dt))
# set up plot
fig=plt.figure()
ax=fig.add_subplot(1,1,1)
# plot
ax.errorbar(xtn,y,yerr=e,fmt='+',elinewidth=2,capsize=0,color='k',ecolor='k')
# set x axis range
ax.xaxis_date()
t0=date2num(datetime.datetime(2012,1,1,16,35)) # x axis startpoint
t1=date2num(datetime.datetime(2012,1,1,17,35)) # x axis endpoint
plt.xlim(t0,t1)
# manually set xtick values
tk=[]
tk.append(date2num(datetime.datetime(2012,1,1,16,40)))
tk.append(date2num(datetime.datetime(2012,1,1,16,50)))
tk.append(date2num(datetime.datetime(2012,1,1,17,0)))
tk.append(date2num(datetime.datetime(2012,1,1,17,10)))
tk.append(date2num(datetime.datetime(2012,1,1,17,20)))
tk.append(date2num(datetime.datetime(2012,1,1,17,30)))
plt.xticks(tk)
plt.show()
# save to file
plt.savefig('savefile.png')
I don't think you need that call to xaxis_date(), since you are already providing the x-axis data in a format that matplotlib knows how to deal with. I also think there's something slightly wrong with your secs formula.
We can make use of matplotlib's built-in formatters and locators to:
set the major xticks to a regular interval (minutes, hours, days, etc.)
customize the display using a strftime formatting string
It appears that if a formatter is not specified, the default is to display the year; which is what you were seeing.
Try this out:
import datetime as dt
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter, MinuteLocator
x = [16.7,16.8,17.1,17.4]
y = [15,17,14,16]
e = [0.8,1.2,1.1,0.9]
xtn = []
for t in x:
    h = int(t)
    m = int((t-int(t))*60)
    xtn.append(dt.datetime.combine(dt.date(2012,1,1), dt.time(h,m)))
def larger_alim( alim ):
    ''' simple utility function to expand axis limits a bit '''
    amin,amax = alim
    arng = amax-amin
    nmin = amin - 0.1 * arng
    nmax = amax + 0.1 * arng
    return nmin,nmax
plt.errorbar(xtn,y,yerr=e,fmt='+',elinewidth=2,capsize=0,color='k',ecolor='k')
plt.gca().xaxis.set_major_locator( MinuteLocator(byminute=range(0,60,10)) )
plt.gca().xaxis.set_major_formatter( DateFormatter('%H:%M:%S') )
plt.gca().set_xlim( larger_alim( plt.gca().get_xlim() ) )
plt.show()
Result:
FWIW the utility function larger_alim was originally written for this other question: Is there a way to tell matplotlib to loosen the zoom on the plotted data?
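On the secs remark: truncating with int() can turn 16.7 hours into 16:41:59 because of floating-point error. A sketch that lets timedelta do the arithmetic and then rounds to the nearest second; the helper name `decimal_hours_to_datetime` is made up for this example.

```python
import datetime as dt


def decimal_hours_to_datetime(hours, day):
    """Convert decimal hours (e.g. 16.7) on a given day to a datetime."""
    midnight = dt.datetime.combine(day, dt.time(0, 0))
    stamp = midnight + dt.timedelta(hours=hours)
    # Round to the nearest whole second instead of truncating
    if stamp.microsecond >= 500000:
        stamp += dt.timedelta(seconds=1)
    return stamp.replace(microsecond=0)


day = dt.date(2012, 1, 1)
xtn = [decimal_hours_to_datetime(t, day) for t in [16.7, 16.8, 17.1, 17.4]]
```

The resulting list can replace xtn in the plotting code above.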

Assigning dates to samples in matplotlib

Given some arbitrary numpy array of data, how can I plot it to make dates appear on the x axis? In this case, sample 0 will be some time, say 7:00, and every sample afterwards will be spaced one minute apart, so that for 60 samples, the time displayed should be 7:00, 7:01, ..., 7:59.
I looked at some of the other questions on here but they all required actually setting a date and some other stuff that felt very over the top compared to what I'd like to do.
Thanks!
Christoph
If you use a sequence of datetime objects for your x-axis, the plot() function will behave like you want (assuming that you don't want all 60 labels from 7:00 to 7:59 to be displayed). Here is some sample code:
import datetime as dt
import numpy as np
import matplotlib.pyplot as plt
N = 60
t0 = dt.datetime.combine(dt.date.today(), dt.time(7, 0, 0))
delta_t = dt.timedelta(minutes=1)
x_axis = [t0 + i * delta_t for i in range(N)]
plt.plot(x_axis, np.random.random(N))
plt.show()
Concerning the use of the combine() function, see question python time + timedelta equivalent
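The same one-minute spacing can also be built with numpy's datetime64 type. This is just a sketch: the date part is an arbitrary placeholder, and the `labels` list is only there to show what the 60 time stamps look like as text.

```python
import numpy as np

n_samples = 60

# Arbitrary placeholder date; only the time of day matters here
start = np.datetime64("2000-01-01T07:00")

# One time stamp per sample, spaced one minute apart
x_axis = start + np.arange(n_samples) * np.timedelta64(1, "m")

# Text labels such as '07:00', built via plain datetime objects
labels = [stamp.strftime("%H:%M")
          for stamp in x_axis.astype("datetime64[s]").tolist()]
```

x_axis can be passed to plt.plot together with the data array.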
