I have an array of timestamps in the format (HH:MM:SS.mmmmmm) and another array of floating point numbers, each corresponding to a value in the timestamp array.
Can I plot time on the x axis and the numbers on the y-axis using Matplotlib?
I was trying to, but somehow it was only accepting arrays of floats. How can I get it to plot the time? Do I have to modify the format in any way?
Update:
This answer is outdated since matplotlib version 3.5. The plot function now handles datetime data directly. See https://matplotlib.org/3.5.1/api/_as_gen/matplotlib.pyplot.plot_date.html
The use of plot_date is discouraged. This method exists for historic
reasons and may be deprecated in the future.
datetime-like data should directly be plotted using plot.
If you need to plot plain numeric data as Matplotlib date format or
need to set a timezone, call ax.xaxis.axis_date / ax.yaxis.axis_date
before plot. See Axis.axis_date.
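For the format in the question, a minimal sketch of the direct approach could look like the following. The sample timestamps and values are made up for illustration, and the Agg backend is only chosen so the snippet runs without a display; in an interactive session you would call plt.show() instead.

```python
from datetime import datetime

import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# Hypothetical sample data in the HH:MM:SS.mmmmmm format from the question.
timestamps = ["12:00:00.000000", "12:00:01.500000", "12:00:03.250000"]
values = [1.0, 3.0, 2.0]

# strptime fills in 1900-01-01 as the date when only a time is given.
times = [datetime.strptime(t, "%H:%M:%S.%f") for t in timestamps]

fig, ax = plt.subplots()
ax.plot(times, values, marker="o")  # plot accepts datetime objects directly

# Show only the time of day on the tick labels.
ax.xaxis.set_major_formatter(mdates.DateFormatter("%H:%M:%S"))
fig.autofmt_xdate()
```

No manual date2num call is needed here; the conversion happens inside plot.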
Old, outdated answer:
You must first convert your timestamps to Python datetime objects (use datetime.strptime). Then use date2num to convert the dates to matplotlib format.
Plot the dates and values using plot_date:
import matplotlib.pyplot
import matplotlib.dates
from datetime import datetime
x_values = [datetime(2021, 11, 18, 12), datetime(2021, 11, 18, 14), datetime(2021, 11, 18, 16)]
y_values = [1.0, 3.0, 2.0]
dates = matplotlib.dates.date2num(x_values)
matplotlib.pyplot.plot_date(dates, y_values)
You can also plot the timestamp, value pairs using pyplot.plot (after parsing them from their string representation). (Tested with matplotlib versions 1.2.0 and 1.3.1.)
Example:
import datetime
import random
import matplotlib.pyplot as plt
# make up some data
x = [datetime.datetime.now() + datetime.timedelta(hours=i) for i in range(12)]
y = [i+random.gauss(0,1) for i,_ in enumerate(x)]
# plot
plt.plot(x,y)
# beautify the x-labels
plt.gcf().autofmt_xdate()
plt.show()
Resulting image:
Here's the same as a scatter plot:
import datetime
import random
import matplotlib.pyplot as plt
# make up some data
x = [datetime.datetime.now() + datetime.timedelta(hours=i) for i in range(12)]
y = [i+random.gauss(0,1) for i,_ in enumerate(x)]
# plot
plt.scatter(x,y)
# beautify the x-labels
plt.gcf().autofmt_xdate()
plt.show()
Produces an image similar to this:
7 years later and this code has helped me.
However, my times still were not showing up correctly.
Using Matplotlib 2.0.0 and I had to add the following bit of code from Editing the date formatting of x-axis tick labels in matplotlib by Paul H.
import matplotlib.dates as mdates
myFmt = mdates.DateFormatter('%d')
ax.xaxis.set_major_formatter(myFmt)
I changed the format to (%H:%M) and the time displayed correctly.
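The fragment above assumes an existing `ax`. A self-contained sketch of the same idea, with made-up data and a hypothetical start time, might look like this:

```python
from datetime import datetime, timedelta

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; use plt.show() interactively
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

# Hypothetical data: one point every 10 minutes.
start = datetime(2021, 1, 1, 14, 30)
x = [start + timedelta(minutes=10 * i) for i in range(6)]
y = [0, 1, 4, 9, 16, 25]

fig, ax = plt.subplots()
ax.plot(x, y)

# '%H:%M' shows hours and minutes; '%d' would show only the day of the month.
my_fmt = mdates.DateFormatter("%H:%M")
ax.xaxis.set_major_formatter(my_fmt)
fig.autofmt_xdate()
```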
All thanks to the community.
I had trouble with this using matplotlib version: 2.0.2. Running the example from above I got a centered stacked set of bubbles.
I "fixed" the problem by adding another line:
plt.plot([],[])
The entire code snippet becomes:
import datetime
import random
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
# make up some data
x = [datetime.datetime.now() + datetime.timedelta(minutes=i) for i in range(12)]
y = [i+random.gauss(0,1) for i,_ in enumerate(x)]
# plot
plt.plot([],[])
plt.scatter(x,y)
# beautify the x-labels
plt.gcf().autofmt_xdate()
myFmt = mdates.DateFormatter('%H:%M')
plt.gca().xaxis.set_major_formatter(myFmt)
plt.show()
plt.close()
This produces an image with the bubbles distributed as desired.
Pandas dataframes haven't been mentioned yet. I wanted to show how these solved my datetime problem. I have datetime to the millisecond, e.g. 2021-04-01 16:05:37. I am pulling Linux/haproxy throughput from /proc, so I can really format it however I like. This is nice for feeding data into a live graph animation.
Here's a look at the CSV. (Ignore the packets per second column; I'm using that in another graph.)
head -2 ~/data
date,mbps,pps
2021-04-01 16:05:37,113,9342.00
...
By using print(dataframe.dtypes) I can see how the data was read in:
(base) ➜ graphs ./throughput.py
date object
mbps int64
pps float64
dtype: object
Pandas reads the date column in as "object", which here means plain Python strings. Using this as-is in a script:
import matplotlib.pyplot as plt
import pandas as pd
dataframe = pd.read_csv("~/data")
dates = dataframe["date"]
mbps = dataframe["mbps"]
plt.plot(dates, mbps, label="mbps")
plt.title("throughput")
plt.xlabel("time")
plt.ylabel("mbps")
plt.legend()
plt.xticks(rotation=45)
plt.show()
Matplotlib treats each date string as a separate category, so it renders a tick label for every timestamp. I've added plt.xticks(rotation=45) to tilt the dates, but it's not what I want. I can convert the date "object" column to datetime64[ns], which matplotlib does know how to render.
dataframe["date"] = pd.to_datetime(dataframe["date"])
This time my date is type datetime64[ns]
(base) ➜ graphs ./throughput.py
date datetime64[ns]
mbps int64
pps float64
dtype: object
Same script with 1 line difference.
#!/usr/bin/env python
import matplotlib.pyplot as plt
import pandas as pd
dataframe = pd.read_csv("~/data")
# convert object to datetime64[ns]
dataframe["date"] = pd.to_datetime(dataframe["date"])
dates = dataframe["date"]
mbps = dataframe["mbps"]
plt.plot(dates, mbps, label="mbps")
plt.title("throughput")
plt.xlabel("time")
plt.ylabel("mbps")
plt.legend()
plt.xticks(rotation=45)
plt.show()
This might not have been ideal for your use case, but it might help someone else.
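As a possible shortcut, read_csv can do the conversion while loading via its parse_dates parameter. The sketch below uses an in-memory copy of the CSV so it runs on its own; the second data row is invented for illustration.

```python
import io

import pandas as pd

# In-memory stand-in for ~/data; the second row is a made-up example.
csv_text = (
    "date,mbps,pps\n"
    "2021-04-01 16:05:37,113,9342.00\n"
    "2021-04-01 16:05:38,120,9400.00\n"
)

# parse_dates converts the named column to a datetime dtype while reading.
dataframe = pd.read_csv(io.StringIO(csv_text), parse_dates=["date"])
print(dataframe.dtypes)
```

The resulting column can be passed to plt.plot exactly like the one produced by pd.to_datetime.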
I have two lists containing the sunset and sunrise times and the corresponding dates.
It looks like:
sunrises = ['06:30', '06:28', '06:27', ...]
dates = ['3.21', '3.22', '3.23', ...]
I want to make a plot of the sunrise times as the Y axis and the dates as the X axis.
Simply using
ax.plot(dates, sunrises)
ax.xaxis.set_major_locator(matplotlib.ticker.MultipleLocator(7))
ax.yaxis.set_major_locator(matplotlib.ticker.MultipleLocator(7))
plt.show()
can plot the dates correctly, but the time is wrong:
And actually, the sunrise time isn't supposed to be a straight line.
How do I solve this problem?
You need to convert the time strings into datetime objects, which matplotlib can understand:
from matplotlib import pyplot as plt
import matplotlib as mpl
from datetime import datetime
import matplotlib.dates as mdates
sunrises = ['06:30', '06:28', '06:27',]
sunrises_dt = [datetime.strptime(item,'%H:%M') for item in sunrises]
dates = ['3.21', '3.22', '3.23',]
fig,ax = plt.subplots()
ax.plot(dates, sunrises_dt)
ax.yaxis.set_major_formatter(mdates.DateFormatter('%H:%M',))
ax.xaxis.set_major_locator(mpl.ticker.MultipleLocator(1))
plt.show()
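In the code above the dates are still plain strings, so the x-axis is categorical. If a true date axis is wanted, the dates can be parsed as well. This is only a sketch: the year is not given in the question, so YEAR below is an assumption.

```python
from datetime import datetime

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; use plt.show() interactively
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

sunrises = ['06:30', '06:28', '06:27']
dates = ['3.21', '3.22', '3.23']

YEAR = 2021  # hypothetical; the question does not state the year

# Parse both axes into datetime objects.
dates_dt = [datetime.strptime(f"{YEAR}.{d}", "%Y.%m.%d") for d in dates]
sunrises_dt = [datetime.strptime(s, "%H:%M") for s in sunrises]

fig, ax = plt.subplots()
ax.plot(dates_dt, sunrises_dt)

# One tick per day on x, time of day on y.
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%m.%d"))
ax.yaxis.set_major_formatter(mdates.DateFormatter("%H:%M"))
```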
This is because your sunrises are not numerical. I'm assuming you'd want them in a form such that "6:30" means 6.5, which is calculated below:
import matplotlib.pyplot as plt
sunrises = ['06:30', '06:28', '06:27']
# This converts to decimals
sunrises = [float(x[0:2])+(float(x[-2:])/60) for x in sunrises]
dates = ['3.21', '3.22', '3.23']
plt.plot(sunrises, dates)
plt.xlabel('sunrises')
plt.ylabel('dates')
plt.show()
Note, your dates are being treated as decimals. Is this correct?
If I run the following, it appears to work as expected, but the y-axis is limited to the earliest and latest times in the data. I want it to show midnight to midnight. I thought I could do that with the code that's commented out. But when I uncomment it, I get the correct y-axis, yet nothing plots. Where am I going wrong?
from datetime import datetime
import matplotlib.pyplot as plt
data = ['2018-01-01 09:28:52', '2018-01-03 13:02:44', '2018-01-03 15:30:27', '2018-01-04 11:55:09']
x = []
y = []
for i in range(0, len(data)):
    t = datetime.strptime(data[i], '%Y-%m-%d %H:%M:%S')
    x.append(t.strftime('%Y-%m-%d')) # X-axis = date
    y.append(t.strftime('%H:%M:%S')) # Y-axis = time
plt.plot(x, y, '.')
# begin = datetime.strptime('00:00:00', '%H:%M:%S').strftime('%H:%M:%S')
# end = datetime.strptime('23:59:59', '%H:%M:%S').strftime('%H:%M:%S')
# plt.ylim(begin, end)
plt.show()
Edit: I also noticed that the x-axis isn't right either. The data skips Jan 2, but I want that on the axis so the data is to scale.
This is a dramatically simplified version of code dealing with over a year's worth of data with over 2,500 entries.
If Pandas is available to you, consider this approach:
import pandas as pd
data = pd.to_datetime(data, yearfirst=True)
plt.plot(data.date, data.time)
_=plt.ylim(["00:00:00", "23:59:59"])
Update per comments
X-axis date formatting can be adjusted using the Locator and Formatter methods of the matplotlib.dates module. Locator finds the tick positions, and Formatter specifies how you want the labels to appear.
Sometimes Matplotlib/Pandas just gets it right, other times you need to call out exactly what you want using these extra methods. In this case, I'm not sure why those numbers are showing up, but this code will remove them.
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
f, ax = plt.subplots()
data = pd.to_datetime(data, yearfirst=True)
ax.plot(data.date, data.time)
ax.set_ylim(["00:00:00", "23:59:59"])
days = mdates.DayLocator()
d_fmt = mdates.DateFormatter('%m-%d')
ax.xaxis.set_major_locator(days)
ax.xaxis.set_major_formatter(d_fmt)
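Without pandas, one way to get a midnight-to-midnight y-axis is to keep real date objects on x and move every time of day onto a single reference day on y. This is a sketch of that idea; the reference day (2000-01-01) is arbitrary and only serves as a carrier for the time.

```python
from datetime import datetime, timedelta

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; use plt.show() interactively
import matplotlib.dates as mdates
import matplotlib.pyplot as plt

data = ['2018-01-01 09:28:52', '2018-01-03 13:02:44',
        '2018-01-03 15:30:27', '2018-01-04 11:55:09']
parsed = [datetime.strptime(s, '%Y-%m-%d %H:%M:%S') for s in data]

ref_day = datetime(2000, 1, 1)  # arbitrary reference day for the y-axis

x = [t.date() for t in parsed]                                  # real dates
y = [datetime.combine(ref_day.date(), t.time()) for t in parsed]  # time of day

fig, ax = plt.subplots()
ax.plot(x, y, '.')

# Midnight to midnight on y, labelled as HH:MM.
ax.set_ylim(ref_day, ref_day + timedelta(days=1))
ax.yaxis.set_major_formatter(mdates.DateFormatter('%H:%M'))

# One tick per calendar day on x, so skipped days still take up space.
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%m-%d'))
```

Because both axes hold real date values rather than strings, Jan 2 appears on the x-axis even though it has no data.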
I'm basically trying to plot a graph where the x axis represents the month of the year. The data is stored in a numpy.array, with dimensions k x months. Here is a minimal example (my data is not this crazy):
import numpy
import matplotlib
import matplotlib.pyplot as plt
cmap = plt.get_cmap('Set3')
colors = [cmap(i) for i in numpy.linspace(0, 1, len(complaints))]
data = numpy.random.rand(18,12)
y = range(data.shape[1])
plt.figure(figsize=(15, 7), dpi=200)
for i in range(data.shape[0]):
    plt.plot(y, data[i,:], color=colors[i], linewidth=5)
plt.legend(loc='center left', bbox_to_anchor=(1, 0.5))
plt.xticks(numpy.arange(0, 12, 1))
plt.xlabel('Hour of the Day')
plt.ylabel('Number of Complaints')
plt.title('Number of Complaints per Hour in 2015')
I'd like to have the xticks as strings instead of numbers. I'm wondering if I have to create a list of strings, manually, or if there is another way to translate the numbers to months. I have to do the same for weekdays, for example.
I've been looking to these examples:
http://matplotlib.org/examples/pylab_examples/finance_demo.html
http://matplotlib.org/examples/pylab_examples/date_demo2.html
But I'm not using datetime.
Although this answer works well, in this case you can avoid defining your own FuncFormatter by using the predefined date formatters from matplotlib.dates rather than matplotlib.ticker:
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import pandas as pd
# Define time range with 12 different months:
# `MS` stands for month start frequency
x_data = pd.date_range('2018-01-01', periods=12, freq='MS')
# Check what these dates look like:
print(x_data)
y_data = np.random.rand(12)
fig, ax = plt.subplots()
ax.plot(x_data, y_data)
# Make ticks on occurrences of each month:
ax.xaxis.set_major_locator(mdates.MonthLocator())
# Get only the month to show in the x-axis:
ax.xaxis.set_major_formatter(mdates.DateFormatter('%b'))
# '%b' means month as locale’s abbreviated name
plt.show()
Obtaining:
DatetimeIndex(['2018-01-01', '2018-02-01', '2018-03-01', '2018-04-01',
'2018-05-01', '2018-06-01', '2018-07-01', '2018-08-01',
'2018-09-01', '2018-10-01', '2018-11-01', '2018-12-01'],
dtype='datetime64[ns]', freq='MS')
This is an alternative plotting method, plot_date, which you might want to use if your independent variable is datetime-like, instead of using the more general plot method:
import datetime
data = np.random.rand(24)
#a list of time: 00:00:00 to 23:00:00
times = [datetime.datetime.strptime(str(i), '%H') for i in range(24)]
#fmt is the plot format string: 'H' draws hexagon markers
#it does not control the tick label format; use a DateFormatter for that
plt.plot_date(times, data, fmt='H')
plt.setp(plt.gca().xaxis.get_majorticklabels(),
'rotation', 90)
The benefit is that you can now easily control the density of xticks. To have a tick every hour, insert these lines after plot_date:
##import it if not already imported
#import matplotlib.dates as mdates
plt.gca().xaxis.set_major_locator(mdates.HourLocator())
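To thin the ticks out and control their labels at the same time, a locator and a formatter can be combined. The sketch below uses plot instead of plot_date and a made-up day; the `byhour` argument pins ticks to fixed hours of the day.

```python
from datetime import datetime, timedelta

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; use plt.show() interactively
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical day with one random value per hour.
start = datetime(2021, 1, 1)
times = [start + timedelta(hours=i) for i in range(24)]
data = np.random.rand(24)

fig, ax = plt.subplots()
ax.plot(times, data, "o")

# Ticks at 00:00, 03:00, 06:00, ... and labels showing only the time of day.
locator = mdates.HourLocator(byhour=range(0, 24, 3))
formatter = mdates.DateFormatter("%H:%M")
ax.xaxis.set_major_locator(locator)
ax.xaxis.set_major_formatter(formatter)
plt.setp(ax.xaxis.get_majorticklabels(), rotation=90)
```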
You can still use formatters to format your results in the way you want. For example, to have month names printed, let us first define a function taking an integer to a month abbreviation:
def getMonthName(month_number):
    testdate=datetime.date(2010,int(month_number),1)
    return testdate.strftime('%b')
Here, I have created an arbitrary date with the correct month and returned that month's abbreviation. Check the datetime documentation for available format codes if needed. Whether that is always easier than just setting a list by hand is another question. Now let us plot some monthly test data:
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import numpy as np
x_data=np.arange(1,12.5,1)
y_data=x_data**2 # Just some arbitrary data
plt.plot(x_data,y_data)
plt.gca().xaxis.set_major_locator(mtick.FixedLocator(x_data)) # Set tick locations
plt.gca().xaxis.set_major_formatter(mtick.FuncFormatter(lambda x,p:getMonthName(x)))
plt.show()
The message here is that you can use matplotlib.ticker.FuncFormatter to use any function to obtain a tick label. The function takes two arguments (value and position) and returns a string.
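As a variation on the same idea, the standard library's calendar module already holds the month abbreviations, so no date needs to be constructed. This is a sketch, and `month_label` is a name made up for the example.

```python
import calendar

import matplotlib
matplotlib.use("Agg")  # assumption: headless run; use plt.show() interactively
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import numpy as np


def month_label(value, pos):
    """Map a tick value 1..12 to a month abbreviation, anything else to ''."""
    month = int(round(value))
    return calendar.month_abbr[month] if 1 <= month <= 12 else ""


x_data = np.arange(1, 13)
y_data = x_data ** 2  # arbitrary data

fig, ax = plt.subplots()
ax.plot(x_data, y_data)
ax.xaxis.set_major_locator(mtick.FixedLocator(x_data))         # tick positions
ax.xaxis.set_major_formatter(mtick.FuncFormatter(month_label))  # tick labels
```

For weekdays the same pattern should work with calendar.day_abbr, which is indexed from 0 (Monday) to 6 (Sunday). Both lists follow the current locale.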
I am plotting time series using pandas .plot() and want to see every month shown as an x-tick.
Here is the dataset structure
Here is the result of the .plot()
I was trying to use examples from other posts and matplotlib documentation and do something like
ax.xaxis.set_major_locator(
dates.MonthLocator(revenue_pivot.index, bymonthday=1,interval=1))
But that removed all the ticks :(
I also tried to pass xticks = df.index, but it has not changed anything.
What would be the right way to show more ticks on the x-axis?
No need to pass any args to MonthLocator. Make sure to use x_compat in the df.plot() call per @Rotkiv's answer.
import pandas as pd
import numpy as np
import matplotlib.pylab as plt
import matplotlib.dates as mdates
df = pd.DataFrame(np.random.rand(100,2), index=pd.date_range('1-1-2018', periods=100))
ax = df.plot(x_compat=True)
ax.xaxis.set_major_locator(mdates.MonthLocator())
plt.show()
formatted x-axis with set_major_locator
unformatted x-axis
You could also format the x-axis ticks and labels of a pandas DateTimeIndex "manually" using the attributes of a pandas Timestamp object.
I found that much easier than using locators from matplotlib.dates which work on other datetime formats than pandas (if I am not mistaken) and thus sometimes show an odd behaviour if dates are not converted accordingly.
Here's a generic example that shows the first day of each month as a label based on attributes of pandas Timestamp objects:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
# data
dim = 8760
idx = pd.date_range('1/1/2000 00:00:00', freq='h', periods=dim)
df = pd.DataFrame(np.random.randn(dim, 2), index=idx)
# select tick positions based on timestamp attribute logic. see:
# https://pandas.pydata.org/pandas-docs/stable/generated/pandas.Timestamp.html
positions = [p for p in df.index
if p.hour == 0
and p.is_month_start
and p.month in range(1, 13, 1)]
# for date formatting, see:
# https://docs.python.org/2/library/datetime.html#strftime-and-strptime-behavior
labels = [l.strftime('%m-%d') for l in positions]
# plot with adjusted labels
ax = df.plot(kind='line', grid=True)
ax.set_xlabel('Time (h)')
ax.set_ylabel('Foo (Bar)')
ax.set_xticks(positions)
ax.set_xticklabels(labels)
plt.show()
yields:
Hope this helps!
The right way to do that is described here:
Using the x_compat parameter, it is possible to suppress automatic tick resolution adjustment
df.A.plot(x_compat=True)
If you want to just show more ticks, you can also dive deep into the structure of pd.plotting._converter:
dai = ax.xaxis.minor.formatter.plot_obj.date_axis_info
dai['fmt'][dai['fmt'] == b''] = b'%b'
After plotting, the formatter is a TimeSeries_DateFormatter and _set_default_format has been called, so self.plot_obj.date_axis_info is not None. You can now manipulate the structured array .date_axis_info to your liking, namely so that it contains fewer b'' and more b'%b' entries.
Remove tick labels:
ax = df.plot(x='date', y=['count'])
every_nth = 10
for n, label in enumerate(ax.xaxis.get_ticklabels()):
    if n % every_nth != 0:
        label.set_visible(False)
Lower every_nth to include more labels, raise to keep fewer.
I'm having trouble plotting data. I only want the plot to show HH:MM:SS, but it shows HH:MM:SS.sssss or HH:MM:SS.%f. Below I detail what I did (matplotlib and numpy are already imported).
Method I
Loading files to plot
import datetime as dt
data=genfromtxt('27JAN12.K7O', delimiter=2*[4]+5*[2]+8*[7])
f245 = data[:, 7]
Generating array for time (1 data per second)
base = dt.datetime(2014, 1, 27, 11, 7, 59)
time = array([base + dt.timedelta(seconds=i) for i in range(len(data))])
plot(time,f245)
When I did this, I got this plot (with unnecessary precision)
Here I got times like 18:15:00.000000 (I would just like 18:15:00)
Method II
The data is loaded the same way; in this case I only took the time column of the data and converted it to time strings
t = data[:,1] #in decimals e.g. 18,5 represents 18:30:00
tstr = map(str, [dt.timedelta(seconds=x) for x in t])
time = []
for i in tstr:
    try:
        time.append(dt.datetime.strptime(i, "%H:%M:%S"))
    except ValueError:
        time.append(dt.datetime.strptime(i, "%H:%M:%S.%f"))
plot(time,f245)
In this case, I got times like 18:15:00.%f
So, how can I fix this?
You should use plot_date to plot datetime objects using matplotlib.
You can use matplotlib.dates.date2num to convert the datetime objects into the matplotlib format.
Furthermore you can use DateFormatter objects to set the formatting of the x-axis tick labels.
I have now included a small example using generated data, hopefully this will explain everything for you.
import numpy as numpy
import matplotlib.pyplot as plt
from matplotlib.dates import date2num, DateFormatter
import datetime as dt
base = dt.datetime(2014, 1, 27, 11, 7, 59)
x = [base + dt.timedelta(seconds=i) for i in range(10)]
y = [i**2 for i in range(10)]
x = date2num(x) # Convert datetime objects to the correct format for matplotlib.
fig, ax = plt.subplots()
ax.plot_date(x, y) # Use plot_date rather than plot
# Set the xaxis major formatter as a DateFormatter object
# The string argument shows what format you want (HH:MM:SS)
ax.xaxis.set_major_formatter(DateFormatter('%H:%M:%S'))
# This simply makes them look pretty by setting them diagonal.
fig.autofmt_xdate()
plt.show()
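For Method II in the question, the comment says the time column holds decimal hours (18.5 meaning 18:30:00). If that is the case, the string round trip can probably be skipped by adding the hours to a base date directly. This is a sketch under that assumption; the sample values and the helper name are made up.

```python
from datetime import datetime, timedelta

# Day of the measurement, taken from the question's base date.
day = datetime(2014, 1, 27)


def decimal_hours_to_datetime(hours, day=day):
    """Convert decimal hours (e.g. 18.5) to a datetime on the given day."""
    return day + timedelta(hours=hours)


# Hypothetical values standing in for data[:, 1].
decimal_hours = [18.25, 18.5, 18.75]
times = [decimal_hours_to_datetime(h) for h in decimal_hours]

for t in times:
    print(t.strftime("%H:%M:%S"))
```

The resulting datetime objects can be plotted and formatted with DateFormatter('%H:%M:%S') as in the example above.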