How to chart times (mm:ss) in Matplotlib (formatting output values) - python

I'm plotting line graphs in Python Matplotlib of times which I get in mm:ss.tttt format.
I've already converted the values to ten-thousandths of a second and I can create a nice plot. But that means the Y axis shows a value like "832323" instead of the easier-to-read "1:23.2323".
Is there some way I can format the output values appropriately?

I worked this out myself shortly after I wrote this. Use the set_major_formatter() method of Matplotlib's axis.
I wrote a quick formatting function that takes a value in ten-thousandths of a second and turns it back into mm:ss.tttt, and then passed this formatter to the axis.
Import the 'ticker' module along with the plotting stuff:
import matplotlib.pyplot as plt
from matplotlib import ticker
Create your own value formatting function:
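def format_10Kth_time(time, pos=None):
    # time is in ten-thousandths of a second; pos is the tick position (unused)
    mins = time // (10000 * 60)
    secs = (time - (mins * 10000 * 60)) // 10000
    fracsecs = time % 10000
    # zero-pad the fraction to four digits so that 600005 becomes 1:00.0005
    return "%d:%02d.%04d" % (mins, secs, fracsecs)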
Then in my plot code I did this to alter the Y axis formatting:
plt.gca().yaxis.set_major_formatter(ticker.FuncFormatter(format_10Kth_time))
plt.plot(...)
plt.show()
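The question mentions converting mm:ss.tttt strings to ten-thousandths of a second but does not show that step. A minimal sketch of one way to do it follows. The helper name parse_mmss is hypothetical, and it assumes every input has the form minutes:seconds.fraction with at most four fractional digits.

```python
def parse_mmss(text):
    """Convert 'mm:ss.tttt' to an integer count of ten-thousandths of a second.

    Hypothetical helper; assumes 'minutes:seconds.fraction' with at most
    four fractional digits.
    """
    minutes, rest = text.split(":")
    seconds, _, fraction = rest.partition(".")
    # Right-pad the fraction so that ".5" means 5000 ten-thousandths, not 5.
    fraction = fraction.ljust(4, "0")[:4]
    return (int(minutes) * 60 + int(seconds)) * 10000 + int(fraction)


lap_times = [parse_mmss(t) for t in ["1:23.2323", "1:00.0005", "0:59.5"]]
print(lap_times)  # [832323, 600005, 595000]
```

The resulting integers can be plotted directly and labelled with a tick formatter such as the one above.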

Related

How can I calculate the time lag between two similar time series?

I'm trying to compute/visualize the time lag between 2 time series (I want to know the time lag between the humidity progression of outside and inside a room).
Each data point of my series was taken hourly. Plotting the 2 series together, I can clearly see a shift between them (plot not included; sorry for hiding the axis).
Here is part of my time series data. I will pack it into 2 arrays:
inside_humidity = [11.77961297, 11.59755268, 12.28761522, 11.88797553, 11.78122077, 11.5694668,
11.70421932, 11.78122077, 11.74272005, 11.78122077, 11.69438733, 11.54126933,
11.28460592, 11.05624965, 10.9611012, 11.07527934, 11.25417308, 11.56040908,
11.6657186, 11.51171572, 11.49246536, 11.78594142, 11.22968373, 11.26840678,
11.26840678, 11.29447992, 11.25553344, 11.19711371, 11.17764047, 11.11922075,
11.04132778, 10.86996123, 10.67410607, 10.63493504, 10.74922916, 10.74922916,
10.6294765, 10.61011497, 10.59075345, 10.80373021, 11.07479154, 11.15223764,
11.19711371, 11.17764047, 11.15816723, 11.22250051, 11.22250051, 11.202915,
11.18332948, 11.16374396, 11.14415845, 11.12457293, 11.10498742, 11.14926578,
11.16896413, 11.16896413, 11.14926578, 10.8307902, 10.51742195, 10.28187137,
10.12608544, 9.98977276, 9.62267727, 9.31289289, 8.96438546, 8.77077022,
8.69332413, 8.51907042, 8.30609366, 8.38353975, 8.4513867, 8.47085994,
8.50980642, 8.52927966, 8.50980642, 8.55887037, 8.51969934, 8.48052831,
8.30425867, 8.2177078, 7.98402891, 7.92560918, 7.89950166, 7.83489682,
7.75789537, 7.5984808, 7.28426807, 7.39778913, 7.71943214, 8.01149931,
8.18276652, 8.23009255, 8.16215295, 7.93822471, 8.00350215, 7.93843482,
7.85072729, 7.49778011, 7.31782649, 7.29862668, 7.60162032, 8.29665484,
8.58797834, 8.50011383, 8.86757784, 8.76600556, 8.60491125, 8.4222628,
8.24923231, 8.14470714, 8.17351638, 8.52530093, 8.72220151, 9.26745883,
9.1580007, 8.61762692, 8.22187405, 8.43693644, 8.32414835, 8.32463974,
8.46833012, 8.55865487, 8.72647164, 9.04112806, 9.35578449, 9.59465974,
10.47339785, 11.07218093, 10.54091351, 10.56138918, 10.46099958, 10.38129168,
10.16434831, 10.10612612, 10.009246, 10.53502351, 10.8307902, 11.13420052,
11.64337309, 11.18958511, 10.49630791, 10.60856932, 10.37029108, 9.86281478,
9.64699826, 9.95341012, 10.24329812, 10.6848196, 11.47604231, 11.30505352,
10.72194974, 10.30058448, 10.05022037, 10.06318411, 9.90118897, 9.68530059,
9.47790657, 9.48585784, 9.61639418, 9.86244265, 10.29009361, 10.28297229,
10.32073088, 10.65389513, 11.09656351, 11.20188562, 11.24124169, 10.40503955,
9.74632512, 9.07606098, 8.85145589, 9.37080152, 9.65082743, 10.0707891,
10.68776091, 11.25879751, 11.0416348, 10.89558456, 10.7908258, 10.66539685,
10.7297755, 10.77571398, 10.9268264, 11.16021492, 11.60961709, 11.43827534,
11.96155427, 12.16116437, 12.80412266, 12.52540805, 11.96752965, 11.58099292]
outside_humidity = [10.17449206, 10.4823292, 11.06818167, 10.82768699, 11.27582592, 11.4196233,
10.99393027, 11.4122507, 11.18192837, 10.87247831, 10.68664321, 10.37949651,
9.57155882, 10.86611665, 11.62547196, 11.32004266, 11.75537602, 11.51292063,
11.03107569, 10.7297755, 10.4345622, 10.61271497, 9.49271162, 10.15594248,
9.99053828, 9.80915398, 9.6452438, 10.06900573, 11.18075689, 11.8289847,
11.83334752, 11.27480708, 11.14370467, 10.88149985, 10.73930381, 10.7236597,
10.26210496, 11.01260226, 11.05428228, 11.58321342, 12.70523808, 12.5181118,
11.90023799, 11.67756426, 11.28859471, 10.86878222, 9.73984486, 10.18253902,
9.80915398, 10.50980784, 11.38673459, 11.22751685, 10.94171823, 10.56484228,
10.38220753, 10.05388847, 9.96147203, 9.90698862, 9.7732203, 9.85262125,
8.7412938, 8.88281702, 8.07919545, 8.02883587, 8.32341424, 8.07357711,
7.27302616, 6.73660684, 6.66722819, 7.29408637, 7.00046542, 6.46322019,
6.07150988, 6.00207234, 5.8818402, 6.82443881, 7.20212882, 7.52167696,
7.88857771, 8.351627, 8.36547023, 8.24802846, 8.18520693, 7.92420816,
7.64926024, 7.87944972, 7.82118727, 8.02091833, 7.93071882, 7.75789457,
7.5416447, 6.94430133, 6.65907535, 6.67454591, 7.25493614, 7.76939457,
7.55357806, 6.61479472, 7.17641357, 7.24664082, 8.62732387, 8.66913548,
8.70925667, 9.0477017, 8.24558224, 8.4330502, 8.44366397, 8.17995798,
8.1875752, 9.33296518, 9.66567041, 9.88581085, 8.95449382, 8.3587624,
9.20584448, 8.90605388, 8.87494884, 9.12694892, 8.35055177, 7.91879933,
7.78867253, 8.22800878, 9.03685287, 12.49630018, 11.11819755, 10.98869374,
10.65897176, 10.36444573, 10.052609, 10.87627021, 10.07379564, 10.02233847,
9.62022856, 11.21575473, 10.85483543, 11.67324627, 11.89234248, 11.10068132,
10.06942096, 8.50405894, 8.13168561, 8.83616476, 8.35675085, 8.33616802,
8.35675085, 9.02209801, 9.5530404, 9.44738836, 10.89645958, 11.44771721,
11.79943601, 10.7765335, 11.1453622, 10.74874776, 10.55195175, 10.34494483,
9.83813522, 11.26931785, 11.20641798, 10.51555027, 10.90808954, 11.80923545,
11.68300879, 11.60313809, 7.95163365, 7.77213815, 7.54209557, 7.30603673,
7.17842173, 8.25899805, 8.56494995, 10.44245578, 11.08542758, 11.74129079,
11.67979686, 12.94362214, 11.96285343, 11.8289847, 11.01388413, 10.6793698,
11.20662595, 11.97684701, 12.46383177, 11.34178655, 12.12477078, 12.48698059,
12.89325064, 12.07470295, 12.6777319, 10.91689448, 10.7676326, 10.66710434]
I know cross correlation is the right term to use, but after a while I still don't get the idea of using scipy.signal.correlate and numpy.correlate, because all I got is an array full of NaNs. So clearly I need some more knowledge in this area.
What I expect to achieve is probably a plot like those in the answer section of this thread How to make a correlation plot with a certain lag of two time series where I can see at how many hours the time lag is most likely.
Thank you a lot in advance!
With the given data, you can use the numpy and matplotlib modules to achieve the desired result.
You can do something like this:
import numpy as np
from matplotlib import pyplot as plt
x = np.array(inside_humidity)
y = np.array(outside_humidity)
# fit a straight line (degree-1 polynomial) through the points
a, b = np.polyfit(x, y, 1)
y_fit = a * x + b
# scatter plot of the data with the fitted line drawn on top
plt.scatter(x, y)
plt.plot(x, y_fit)
plt.show()
which gives a scatter plot of the two series with the fitted line drawn through it (image not included).
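The line fit relates the two series at zero lag only. For the lag itself, which is what the question asks about, a cross-correlation along the following lines may help. This is a sketch rather than a definitive method: estimate_lag is a hypothetical helper, the demo uses synthetic data instead of the humidity arrays, and it assumes equally spaced samples without NaN values (NaNs in the input propagate into the result, which may explain the NaN output described in the question).

```python
import numpy as np


def estimate_lag(a, b):
    """Return (best_lag, lags, corr) for two equally sampled 1-D series.

    A positive best_lag means `a` is delayed relative to `b` by that
    many samples. Hypothetical helper for illustration.
    """
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    # Remove the means so the peak reflects the shape, not the offset level.
    a = a - a.mean()
    b = b - b.mean()
    corr = np.correlate(a, b, mode="full")
    # In "full" mode, output index k corresponds to lag k - (len(b) - 1).
    lags = np.arange(-(len(b) - 1), len(a))
    return lags[np.argmax(corr)], lags, corr


# Synthetic demo: `inside` is `outside` delayed by 3 samples.
rng = np.random.default_rng(0)
outside = rng.normal(size=200)
inside = np.roll(outside, 3)
best_lag, lags, corr = estimate_lag(inside, outside)
print(best_lag)
```

Plotting corr against lags (for example with plt.plot(lags, corr)) gives a correlation-versus-lag picture; with hourly samples the lag is in hours.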

Making a plot that has an x-axis that has neg. values representing hours prior to the start of the event, then pos. values representing hours after

I'm not sure if my question makes sense, so apologies on that.
Basically, I am plotting some data that is ~100 hours long. On the x-axis, I want to make it so that the range goes from -50 to 50, with -1 to -50 representing the 50 hours prior to the event, 0 being in the middle representing the start of the event, and 1-50 representing the 50 hours following the start of the event. Basically, there are 107 hours worth of data and I want to try to divide the hours between each side of 0.
I initially tried using the plt.xlim() function, but that just shifts all the data to one side of the plot.
I've tried using plt.xticks and then labeling the x ticks with "-50", "-25", "0", "25", and "50", and while that somewhat works, it still does not look great. I'll add an example figure of doing it this way to add better clarification of what I'm trying to do, as well as the original plot:
Original plot: (image not included)
Goal: (image not included)
Edit: here's my code for plotting it:
fig_1 = plt.figure(figsize=(30,20))
file.plot(x='start',y='value')
plt.xlabel('hour')
plt.ylabel('value')
plt.xticks([0,25,50,75,100],["-50","-25","0","25","50"])
You could obtain a zero mean for the ticks using df.sub(df.mean()) or np.mean().
Alternative 1:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# generate data
left = np.linspace(10,60, 54)
right = np.linspace(60,10, 53)
noise_left = np.random.normal(0, 1, 54)
noise_right = np.random.normal(0, 1, 53)
# named "values" to avoid shadowing the built-in all()
values = np.append(left + noise_left, right + noise_right)
file = pd.DataFrame({'start':np.linspace(1,107,107),'value':values})
# subtract mean
file['start'] = file['start'].sub(file['start'].mean())
# create the figure and axes first, then draw into them so that figsize takes effect
fig_1, ax = plt.subplots(figsize=(30,20))
file.plot(x='start',y='value',ax=ax)
plt.xlabel('hour')
plt.ylabel('value')
Output:
Alternative 2:
# subtract the mean from start to obtain zero mean ticks
ticks = file['start'] - np.mean(file['start'])
# set distance between each tick to 10
plt.xticks(file['start'][::10], ticks[::10],rotation=45)
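Subtracting the mean centres the axis on the middle of the record. If the event does not start exactly in the middle, subtracting the event's own position is another option. The sketch below assumes the event starts at sample 53 of 107 (event_index is a made-up value for illustration) and that samples are one hour apart.

```python
import numpy as np

n_hours = 107
event_index = 53  # assumed position of the event start; not from the original post

# Hours relative to the event: negative before it, zero at its start, positive after.
hours = np.arange(n_hours) - event_index
values = np.random.normal(size=n_hours)  # placeholder data

print(hours[0], hours[event_index], hours[-1])  # -53 0 53
```

Passing hours as the x values (for example plt.plot(hours, values)) lets Matplotlib choose signed tick labels by itself, with no relabelling through plt.xticks.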

How to use datetime.time to plot in Python

I have a list of timestamps in the format HH:MM:SS and want to plot them against some values using datetime.time. It seems Python doesn't like the way I do it. Can someone please help?
import datetime
import matplotlib.pyplot as plt
# random data
x = [datetime.time(12,10,10), datetime.time(12, 11, 10)]
y = [1,5]
# plot
plt.plot(x,y)
plt.show()
TypeError: float() argument must be a string or a number
Well, it's a two-step story to get them plotted really nicely.
Step 1: prepare the data in a proper format
Convert from datetime to a float that follows matplotlib's convention for dates/times.
As usual, the devil is in the details.
matplotlib's date numbers are almost, but not exactly, what you might expect:
# mPlotDATEs.date2num.__doc__
#
# *d* is either a class `datetime` instance or a sequence of datetimes.
#
# Return value is a floating point number (or sequence of floats)
# which gives the number of days (fraction part represents hours,
# minutes, seconds) since 0001-01-01 00:00:00 UTC, *plus* *one*.
# The addition of one here is a historical artifact. Also, note
# that the Gregorian calendar is assumed; this is not universal
# practice. For details, see the module docstring.
So, highly recommended to re-use their "own" tool:
# helper functions num2date() and date2num() convert to/from matplotlib's float dates
from matplotlib import dates as mPlotDATEs
Step 2: manage axis labels, formatting and scale (min/max) as the next issue
matplotlib gives you the tools for this part too.
Check the code in this answer for all the details.
This is still a valid issue in Python 3.5.3 and Matplotlib 2.1.0.
A workaround is to use datetime.datetime objects instead of datetime.time ones:
import datetime
import matplotlib.pyplot as plt
# random data
x = [datetime.time(12,10,10), datetime.time(12, 11, 10)]
x_dt = [datetime.datetime.combine(datetime.date.today(), t) for t in x]
y = [1,5]
# plot
plt.plot(x_dt, y)
plt.show()
By default the date part should not be visible. Otherwise you can always use DateFormatter (set it before calling plt.show()):
import matplotlib.dates as mdates
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%H-%M-%S'))
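Put together, the workaround and the formatter might look like the sketch below. The fixed anchor_day is an arbitrary placeholder (any date works, since only the time of day is shown), and the Agg backend is selected only so the sketch runs without a display.

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; omit for interactive use
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

times = [datetime.time(12, 10, 10), datetime.time(12, 11, 10)]
values = [1, 5]

# Attach an arbitrary placeholder date so Matplotlib can convert the values.
anchor_day = datetime.date(2000, 1, 1)
stamps = [datetime.datetime.combine(anchor_day, t) for t in times]

fig, ax = plt.subplots()
ax.plot(stamps, values)

# Show only the time of day on the tick labels.
formatter = mdates.DateFormatter("%H:%M:%S")
ax.xaxis.set_major_formatter(formatter)
fig.canvas.draw()  # render once; use plt.show() in an interactive session

first_label = formatter(mdates.date2num(stamps[0]))
print(first_label)
```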
I came to this page because I have a similar issue. I have a Pandas DataFrame df with a datetime column df.dtm and a data column df.x, spanning several days, but I want to plot them using matplotlib.pyplot as a function of time of day, not date and time (datetime, datetimeindex). I.e., I want all data points to be folded into the same 24h range in the plot. I can plot df.x vs. df.dtm without issue, but I've just spent two hours trying to figure out how to convert df.dtm to df.time (containing the time of day without a date) and then plotting it. The (to me) straightforward solution does not work:
df.dtm = pd.to_datetime(df.dtm)
ax.plot(df.dtm, df.x)
# Works (with times on different dates; a range >24h)
df['time'] = df.dtm.dt.time
ax.plot(df.time, df.x)
# DOES NOT WORK: ConversionError('Failed to convert value(s) to axis '
matplotlib.units.ConversionError: Failed to convert value(s) to axis units:
array([datetime.time(0, 0), datetime.time(0, 5), etc.])
This does work:
pd.plotting.register_matplotlib_converters() # Needed to plot Pandas df with Matplotlib
df.dtm = pd.to_datetime(df.dtm, utc=True) # NOTE: MUST add a timezone, even if undesired
ax.plot(df.dtm, df.x)
# Works as before
df['time'] = df.dtm.dt.time
ax.plot(df.time, df.x)
# WORKS!!! (with time of day, all data in the same 24h range)
Note that the differences are in the first two lines. The first line allows better collaboration between Pandas and Matplotlib, the second seems redundant (or even wrong), but that doesn't matter in my case, since I use a single timezone and it is not plotted.
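Another route to a time-of-day plot, which sidesteps datetime.time objects entirely, is to plot a plain number such as hours since midnight. The sketch below assumes the same column names dtm and x as above; the sample rows are made up for illustration.

```python
import pandas as pd

# Made-up sample spanning two days.
df = pd.DataFrame({
    "dtm": pd.to_datetime([
        "2021-03-01 00:05:00",
        "2021-03-01 18:30:00",
        "2021-03-02 00:05:00",
        "2021-03-02 06:15:00",
    ]),
    "x": [1.0, 2.0, 3.0, 4.0],
})

# Time elapsed since midnight of the same day, expressed in hours.
since_midnight = df["dtm"] - df["dtm"].dt.normalize()
df["hour_of_day"] = since_midnight.dt.total_seconds() / 3600.0

print(df["hour_of_day"].tolist())
```

Plotting df.x against df.hour_of_day (for example as a scatter plot) folds all days into the same 0 to 24 range, and the axis is an ordinary numeric axis.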

Manually setting xticks with xaxis_date() in Python/matplotlib

I've been looking into how to make plots against time on the x axis and have it pretty much sorted, with one strange quirk that makes me wonder whether I've run into a bug or (admittedly much more likely) am doing something I don't really understand.
Simply put, below is a simplified version of my program. If I put this in a .py file and execute it from an interpreter (ipython) I get a figure with an x axis with the year only, "2012", repeated a number of times, like this.
However, if I comment out the line (40) that sets the xticks manually, namely 'plt.xticks(tk)' and then run that exact command in the interpreter immediately after executing the script, it works great and my figure looks like this.
Similarly it also works if I just move that line to be after the savefig command in the script, that's to say to put it at the very end of the file. Of course in both cases only the figure drawn on screen will have the desired axis, and not the saved file. Why can't I set my x axis earlier?
Grateful for any insights, thanks in advance!
import matplotlib.pyplot as plt
import datetime
from matplotlib.dates import date2num
# define arrays for x, y and errors
x=[16.7,16.8,17.1,17.4]
y=[15,17,14,16]
e=[0.8,1.2,1.1,0.9]
xtn=[]
# convert x to datetime format
for t in x:
    hours=int(t)
    mins=int((t-int(t))*60)
    secs=int(((t-hours)*60-mins)*60)
    dt=datetime.datetime(2012,1,1,hours,mins,secs)
    xtn.append(date2num(dt))
# set up plot
fig=plt.figure()
ax=fig.add_subplot(1,1,1)
# plot
ax.errorbar(xtn,y,yerr=e,fmt='+',elinewidth=2,capsize=0,color='k',ecolor='k')
# set x axis range
ax.xaxis_date()
t0=date2num(datetime.datetime(2012,1,1,16,35)) # x axis startpoint
t1=date2num(datetime.datetime(2012,1,1,17,35)) # x axis endpoint
plt.xlim(t0,t1)
# manually set xtick values
tk=[]
tk.append(date2num(datetime.datetime(2012,1,1,16,40)))
tk.append(date2num(datetime.datetime(2012,1,1,16,50)))
tk.append(date2num(datetime.datetime(2012,1,1,17,0)))
tk.append(date2num(datetime.datetime(2012,1,1,17,10)))
tk.append(date2num(datetime.datetime(2012,1,1,17,20)))
tk.append(date2num(datetime.datetime(2012,1,1,17,30)))
plt.xticks(tk)
plt.show()
# save to file
plt.savefig('savefile.png')
I don't think you need that call to xaxis_date(), since you are already providing the x-axis data in a format that matplotlib knows how to deal with. I also think there's something slightly wrong with your secs formula.
We can make use of matplotlib's built-in formatters and locators to set the major xticks to a regular interval (minutes, hours, days, etc.) and to customize the display using a strftime formatting string.
It appears that if a formatter is not specified, the default is to display the year, which is what you were seeing.
Try this out:
import datetime as dt
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter, MinuteLocator
x = [16.7,16.8,17.1,17.4]
y = [15,17,14,16]
e = [0.8,1.2,1.1,0.9]
xtn = []
for t in x:
    h = int(t)
    m = int((t-int(t))*60)
    xtn.append(dt.datetime.combine(dt.date(2012,1,1), dt.time(h,m)))
def larger_alim( alim ):
    ''' simple utility function to expand axis limits a bit '''
    amin,amax = alim
    arng = amax-amin
    nmin = amin - 0.1 * arng
    nmax = amax + 0.1 * arng
    return nmin,nmax
plt.errorbar(xtn,y,yerr=e,fmt='+',elinewidth=2,capsize=0,color='k',ecolor='k')
plt.gca().xaxis.set_major_locator( MinuteLocator(byminute=range(0,60,10)) )
plt.gca().xaxis.set_major_formatter( DateFormatter('%H:%M:%S') )
plt.gca().set_xlim( larger_alim( plt.gca().get_xlim() ) )
plt.show()
Result:
FWIW the utility function larger_alim was originally written for this other question: Is there a way to tell matplotlib to loosen the zoom on the plotted data?
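If hand-picked tick positions are preferred over a locator, one possible approach is to pair them with an explicit formatter, as sketched below. Whether this resolves the asker's original symptom is not confirmed by the thread. The data points are the question's values converted to clock times by hand, and the Agg backend is selected only so the sketch runs without a display.

```python
import datetime as dt

import matplotlib
matplotlib.use("Agg")  # headless backend for this sketch; omit for interactive use
import matplotlib.pyplot as plt
from matplotlib.dates import DateFormatter, date2num

day = dt.date(2012, 1, 1)
# 16.7 h, 16.8 h, 17.1 h and 17.4 h written out as clock times
xtn = [dt.datetime.combine(day, dt.time(h, m))
       for h, m in [(16, 42), (16, 48), (17, 6), (17, 24)]]
y = [15, 17, 14, 16]
e = [0.8, 1.2, 1.1, 0.9]

# Six ticks, ten minutes apart, from 16:40 to 17:30.
first_tick = dt.datetime.combine(day, dt.time(16, 40))
ticks = [first_tick + dt.timedelta(minutes=10 * i) for i in range(6)]

fig, ax = plt.subplots()
ax.errorbar(xtn, y, yerr=e, fmt='+', elinewidth=2, capsize=0,
            color='k', ecolor='k')
ax.set_xlim(date2num(dt.datetime(2012, 1, 1, 16, 35)),
            date2num(dt.datetime(2012, 1, 1, 17, 35)))
ax.set_xticks(date2num(ticks))
formatter = DateFormatter('%H:%M')
ax.xaxis.set_major_formatter(formatter)
fig.canvas.draw()

labels = [formatter(t) for t in ax.get_xticks()]
print(labels)
```

To write the figure to a file, calling fig.savefig(...) before plt.show() is the safer order in a script.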

Assigning dates to samples in matplotlib

Given some arbitrary numpy array of data, how can I plot it to make dates appear on the x axis? In this case, sample 0 will be some time, say 7:00, and every sample afterwards will be spaced one minute apart, so that for 60 samples, the time displayed should be 7:00, 7:01, ..., 7:59.
I looked at some of the other questions on here but they all required actually setting a date and some other stuff that felt very over the top compared to what I'd like to do.
Thanks!
Christoph
If you use an array of datetime objects for your x-axis, the plot() function will behave like you want (assuming that you don't want all 60 labels from 7:00 to 7:59 to be displayed). Here is some sample code:
import datetime as dt
import numpy as np
import matplotlib.pyplot as plt
N = 60
t0 = dt.datetime.combine(dt.date.today(), dt.time(7,0,0))
delta_t = dt.timedelta(minutes=1)
# one datetime per sample, spaced one minute apart
x_axis = [t0 + i*delta_t for i in range(N)]
plt.plot(x_axis, np.random.random(N))
plt.show()
Concerning the use of the combine() function, see question python time + timedelta equivalent
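If attaching a date feels like too much, another option is to keep plain sample indices on the x axis and only change how the ticks are labelled. A sketch under the question's assumptions (sample 0 at 7:00, one sample per minute); sample_to_clock and its start_minutes parameter are hypothetical names.

```python
def sample_to_clock(index, pos=None, start_minutes=7 * 60):
    """Format a sample index as H:MM, with one sample per minute from 7:00."""
    total = start_minutes + int(round(index))
    hours, minutes = divmod(total, 60)
    return "%d:%02d" % (hours % 24, minutes)


print(sample_to_clock(0), sample_to_clock(59))  # 7:00 7:59
```

After plotting the data against its indices, the function can be installed with plt.gca().xaxis.set_major_formatter(ticker.FuncFormatter(sample_to_clock)), where ticker is matplotlib.ticker.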
