I'm trying to graph contaminants measured in a sample over time, and some sample dates are closer together than others. How do I plot this line with the current datetime values, but make each xtick equidistant?
This is what I've got so far. Currently the ticks are bunched together where the samples were taken closer together.
date = df_TCE.SAMPLEDATE.unique()
date_IA14 = df_TCE.SAMPLEDATE[df_TCE.SYS_LOC_CODE == 'IA-14']
IA14 = df_TCE.AL_RESULT_VALUE[df_TCE.SYS_LOC_CODE == 'IA-14']
plt.plot(date_IA14, IA14)
plt.title('TCE Time Series')
plt.xlabel('Date')
plt.ylabel('Contaminant Level')
ax = plt.subplot()
ax.set_xticks(date_IA14)
ax.set_yticks([1, 2, 3, 4, 5, 6, 7])
ax.set_facecolor('seashell')
plt.show()
This is the output with the ticks bunched: [figure: Output]
There are a few things you can try.
First, ensure that the SAMPLEDATE column of your dataframe holds datetime objects by running df_TCE['SAMPLEDATE'] = pandas.to_datetime(df_TCE.SAMPLEDATE). Note that to_datetime returns a new Series, so the result has to be assigned back. Resolve any parsing errors that arise so that you're truly dealing with a datetime x-axis rather than strings.
Then, check out fig.autofmt_xdate() instead of ax.set_xticks(date_IA14). Once the x-axis holds proper datetime objects, matplotlib usually picks reasonable xtick spacing on its own.
If you dislike the defaults, check out matplotlib.dates.DayLocator() or the HourLocator() or the MonthLocator(), whatever meets your regular interval needs. You can apply it to your axes object like this:
ax.xaxis.set_major_locator(matplotlib.dates.DayLocator())
https://matplotlib.org/3.1.1/api/dates_api.html#matplotlib.dates.DayLocator
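Putting those steps together, here is a minimal sketch of what this could look like. The dataframe contents are made up for illustration (the real df_TCE is not shown in the question), and MonthLocator is just one possible choice of regular interval.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for interactive use
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for df_TCE: irregularly spaced sample dates stored as strings.
df_TCE = pd.DataFrame({
    "SAMPLEDATE": ["2019-01-03", "2019-01-10", "2019-03-21", "2019-06-05", "2019-06-12"],
    "SYS_LOC_CODE": ["IA-14"] * 5,
    "AL_RESULT_VALUE": [1.2, 3.4, 2.1, 5.6, 4.3],
})

# to_datetime returns a new Series, so assign it back to the column.
df_TCE["SAMPLEDATE"] = pd.to_datetime(df_TCE["SAMPLEDATE"])
ia14 = df_TCE[df_TCE["SYS_LOC_CODE"] == "IA-14"]

fig, ax = plt.subplots()
ax.plot(ia14["SAMPLEDATE"], ia14["AL_RESULT_VALUE"])
ax.set_title("TCE Time Series")
ax.set_xlabel("Date")
ax.set_ylabel("Contaminant Level")

# One tick at the start of every month, regardless of when samples were taken.
ax.xaxis.set_major_locator(mdates.MonthLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %Y"))
fig.autofmt_xdate()  # rotate and right-align the date labels
```

The ticks are then evenly spaced in time, while the data points keep their true (irregular) positions along the axis.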
I need to reduce or manually set the number of ticks on the x-axis of a Matplotlib line plot. This question has been asked many times here. I've gone through as many of those answers as I can find, and through the Matplotlib docs, but I haven't found a solution I can get working, so I'm hoping for some help.
I have a Python dictionary with two sets of key:value pairs - datetime.datetime and float. There are hundreds of values in each set - but here's a snippet of the first elements just for reference:
ws_kline_dict_01 = {'time': [datetime.datetime(2023, 2, 15, 10, 35, 8)], 'close': [22183.07]}
I've converted that dictionary to a Pandas dataframe so I can see it more easily in Jupyter and also stripped out the year, month and day from 'time' using:
df_kline_dict_01 = pd.DataFrame(ws_kline_dict_01)
df_kline_dict_01['time'] = df_kline_dict_01['time'].dt.strftime('%H:%M:%S')
When I plot this via Matplotlib using 'time' as the x-axis - it prints every value as a tick which is way too cluttered (see 'Plot: Post-Panda format' below).
If I leave the datetime.datetime in its original form - Matplotlib seems to auto-select how many values it displays and it displays "Day Hour:Minutes" instead of "Hour:Minutes:Seconds" - which isn't working for me (see 'Plot: Pre-Panda format' below).
I've tried plt.locator_params(axis='x', nbins=n) - but this is giving me an error message:
"UserWarning: 'set_params()' not defined for locator of type <class 'matplotlib.category.StrCategoryLocator'>".
For reference - this is the code I'm using to produce the plot:
plt.plot(df_kline_dict_01['time'], df_kline_dict_01['close'], color = 'green', label = 'close')
plt.xticks(rotation=45, ha='right')
plt.show()
How do I (at least) reduce or (ideally) explicitly set the number of values/ticks shown on the x-axis?
Seems like this should be a pretty simple formatting task - but so far it's beating me and I'd appreciate some help getting this sorted.
Plot: Pre-Panda format
Plot: Post-Panda format
Here is a possible solution using the .xaxis.set_major_locator() method. You can adjust the max_xticks variable to suit your use-case.
...
df_kline_dict_01['time'] = df_kline_dict_01['time'].dt.strftime('%H:%M:%S')
fig, ax = plt.subplots()
ax.plot(df_kline_dict_01['time'], df_kline_dict_01['close'], color='green', label='close')
max_xticks = 6
ax.xaxis.set_major_locator(ticker.MaxNLocator(max_xticks))
plt.xticks(rotation=45, ha='right')
plt.show()
Note: I assigned max_xticks = 6 as a separate variable to make the code easier to follow; you could just as well pass the value directly, as in .MaxNLocator(6).
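For completeness, a self-contained sketch of the same idea, including the matplotlib.ticker import that the snippet relies on. The data is invented (one value per second for two minutes). One detail worth knowing: the argument to MaxNLocator is the maximum number of intervals, so the number of visible ticks can be up to one more than that.

```python
import datetime

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import pandas as pd

# Hypothetical data: one closing price per second for two minutes.
start = datetime.datetime(2023, 2, 15, 10, 35, 8)
ws_kline_dict_01 = {
    "time": [start + datetime.timedelta(seconds=i) for i in range(120)],
    "close": [22183.07 + 0.5 * i for i in range(120)],
}
df_kline_dict_01 = pd.DataFrame(ws_kline_dict_01)
df_kline_dict_01["time"] = df_kline_dict_01["time"].dt.strftime("%H:%M:%S")

fig, ax = plt.subplots()
ax.plot(df_kline_dict_01["time"], df_kline_dict_01["close"], color="green", label="close")

# Limit how many ticks the (categorical) x-axis may show.
max_xticks = 6
ax.xaxis.set_major_locator(ticker.MaxNLocator(max_xticks))
plt.xticks(rotation=45, ha="right")
```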
Pass explicit tick locations to plt.xticks, for example plt.xticks(np.arange(start, stop, step), rotation=45, ha='right').
Fill in start, stop and step as you wish.
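One thing to keep in mind when the x values are strings, as in the question: Matplotlib places the i-th string at x position i, so the locations passed to plt.xticks are plain integer indices. A rough sketch with invented data (the names times, close and step are only for illustration):

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: 60 time strings, one per second.
times = [f"10:35:{s:02d}" for s in range(60)]
close = np.linspace(22183.0, 22190.0, len(times))

fig, ax = plt.subplots()
ax.plot(times, close, color="green", label="close")

# On a categorical (string) axis the i-th label sits at x position i,
# so every 10th label corresponds to positions 0, 10, 20, ...
step = 10
positions = np.arange(0, len(times), step)
plt.xticks(positions, rotation=45, ha="right")
```

The axis keeps its string labels; only the positions at which they are drawn are thinned out.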
The title probably does not make sense, but I will try to explain.
I am plotting chemical concentrations over time. The x axis should be hours since midnight local time (i.e., 0,4,8,12,16,20). However, when I do this all of the xticks get smushed together to the left.
xticks = range(0,24,4)
ozoneest["mean"].plot(ax=ax, xticks=xticks,)
Results in:
xticks is only accepting arrays of datetime variables, which have values: 00:00, 04:00, 08:00, 12:00, 16:00, 20:00.
xticks = pd.date_range("2000/01/01", end="2000/01/02", freq="4H").time
ozoneest["mean"].plot(ax=ax, xticks=xticks,)
results in:
This is close to what I want, but I want just the number of the hour
Thanks!
I assume that your data is stored in a pandas dataframe with a DatetimeIndex that has an "Hour" frequency. I cannot exactly reproduce your problem seeing as you have not shared the code generating the ax object. Whether it is created with matplotlib or pandas, the problem is that the x-axis unit is based on the number of time periods (based on the DatetimeIndex frequency in pandas, days in matplotlib) that have passed since 1970-01-01. So the xticks = range(0,24,4) land far to the left relative to your datetimes. You can check the x-axis values of the default xticks with ax.get_xticks().
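To see the mismatch for yourself, here is a small sketch that prints the x-axis limits and the default tick positions for an hourly series. The series is made up (it is not the asker's data), and the exact numbers depend on whether pandas or matplotlib sets up the axis, but in both cases they are far larger than 24.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical hourly series covering one day. A Timedelta is used for the
# frequency because the hourly string alias differs between pandas versions.
time = pd.date_range("2000-01-01", periods=24, freq=pd.Timedelta(hours=1))
ozone = pd.Series(np.arange(24.0), index=time, name="mean")

fig, ax = plt.subplots()
ozone.plot(ax=ax)

# Inspect where the axis places the data and the default ticks.
print("x limits:", ax.get_xlim())
print("default xticks:", ax.get_xticks())
```

Ticks requested at 0, 4, 8, ... therefore sit a long way to the left of the plotted data.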
Here are two ways of formatting the xticks and labels as you want. I suggest that you do not create a new DatetimeIndex for the hours, as this makes the code harder to reuse. Instead, use the DatetimeIndex of the dataframe, as shown in the second solution.
Create sample dataframe
import numpy as np # v 1.20.2
import pandas as pd # v 1.2.5
rng = np.random.default_rng(seed=123) # random number generator
time = pd.date_range(start="2000/01/01", end="2000/01/02", freq="H")[:-1]
mean = rng.normal(size=len(time))
ozoneest = pd.DataFrame(dict(mean=mean), index=time)
ozoneest.head()
Pandas plot with default xticks
ozoneest["mean"].plot()
Simple solution: do not use the DatetimeIndex as the x-axis
xticks = range(0,24,4)
ax = ozoneest["mean"].plot(use_index=False, xticks=xticks)
General solution: select xticks from DatetimeIndex and create labels with strftime
xticks = ozoneest.index[::4]
xticklabels = xticks.strftime("%H")
ax = ozoneest["mean"].plot()
ax.set_xticks(xticks)
ax.set_xticks([], minor=True)
ax.set_xticklabels(xticklabels)
This solution is more general because you do not need to manually adjust the xticks if the range of time of your dataset changes and the tick labels can be easily customized in many ways.
If you want to remove the leading zeros, you can use the following list comprehension:
xticklabels = [tick[1:] if tick[0] == "0" else tick for tick in xticks.strftime("%H")]
I'm obviously making a very basic mistake in adding a rolling mean plot to my figure.
The basic plot of close prices works fine, but as soon as I add the rolling mean to the plot, the x-axis dates get screwed up and I can't see what it's trying to do.
Here's the code:
import pandas as pd
import matplotlib.pyplot as plot
df = pd.read_csv('historical_price_data.csv')
df['Date'] = pd.to_datetime(df.Date, infer_datetime_format=True)
df.sort_index(inplace=True)
ax = df[['Date', 'Close']].plot(figsize=(14, 7), x='Date', color='black')
rolling_mean = df.Close.rolling(window=7).mean()
plot.plot(rolling_mean, color='blue', label='Rolling Mean')
plot.show()
With this sample data set I am getting this figure:
Given the simplicity of this code, I'm obviously making a very basic mistake, I just can't see what it is.
EDIT: Interesting. Although @AndreyPortnoy's suggestion to set the index to Date results in the odd error that Date is not in the index, when I use the built-in plotting per his suggestion the figure is no longer a complete mess. But for some reason the x-axis is reversed, and the ticks are no longer dates but apparently ints (?), even though df.dtypes shows Date is datetime64[ns].
@Sandipan Dey: Here's what the dataset looks like. Per the code above I'm using pd.to_datetime() to convert to datetime64, and have tried df[::-1] to fix the problem where it is reversed when the 2nd plot (mov_avg) is added to the figure (but not reversed when the figure only has the 1 plot).
The fact that your dates for the moving averages start at 1970 suggests that an integer range index is used. It was generated by default when you read in the csv file. Try inserting
df.set_index('Date', inplace=True)
before
df.sort_index(inplace=True)
Then you can do
ax = df['Close'].plot(figsize=(14, 7), color='black')
rolling_mean = df.Close.rolling(window=7).mean()
plot.plot(rolling_mean, color='blue', label='Rolling Mean')
Note that I'm not passing x explicitly, letting pandas and matplotlib infer it.
You can simplify your code by using the builtin plotting facilities like so:
df['mov_avg'] = df['Close'].rolling(window=7).mean()
df[['Close', 'mov_avg']].plot(figsize=(14, 7))
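Here is a self-contained sketch of that approach. Since the CSV file is not available, it uses an invented price series stored newest-first (an assumption, based on the reversed axis described in the question). The key points are that the dates become the index and that the index is sorted before the rolling mean is computed.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for historical_price_data.csv, newest row first.
dates = pd.date_range("2020-01-01", periods=30, freq="D")[::-1]
df = pd.DataFrame({
    "Date": dates.strftime("%Y-%m-%d"),
    "Close": np.linspace(130.0, 100.0, 30),
})

df["Date"] = pd.to_datetime(df["Date"])
# Dates become the index (and so the x-axis), sorted oldest first.
df = df.set_index("Date").sort_index()

# Both columns share the same DatetimeIndex, so they line up on one axis.
df["mov_avg"] = df["Close"].rolling(window=7).mean()
ax = df[["Close", "mov_avg"]].plot(figsize=(14, 7))
```

The first six values of mov_avg are NaN because a 7-row window is not yet full, so the moving-average line starts slightly later than the price line.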
I have a dataframe like this:
data_ = list(range(106))
index_ = pd.period_range('3/1/2004', '12/1/2012', freq='M')
df2_ = pd.DataFrame(data = data_, index = index_, columns = ['data'])
I want to plot this dataframe. Currently, I am using:
df2_.plot()
Now I would like to control the labels (and possibly ticks) on the x axis. In particular, I would like to have monthly ticks on the axis and possibly a label at every other month, or quarterly labels. I would also like to have vertical grid lines.
I started looking at this example but I am already failing at constructing the timedelta.
With regards to constructing the timedelta, datetime.timedelta() doesn’t have a parameter to specify months, so it’s probably convenient to stick to pd.date_range(). However, I found that objects of type pandas.tslib.Timestamp don’t play nicely with matplotlib ticks, so you could convert them to datetime.date objects like so:
index_ = [pd.to_datetime(date, format='%Y-%m-%d').date()
          for date in pd.date_range('2004-03-01', '2012-12-01', freq="M")]
It’s possible to add gridlines and customise axes labels by first defining a matplotlib axes object, and then passing this to DataFrame.plot()
ax = plt.axes()
df2_.plot(ax=ax)
Now you can add vertical gridlines to your plot
ax.xaxis.grid(True)
And specify quarterly xticks labels by using matplotlib.dates.MonthLocator and setting the interval to 3
ax.xaxis.set_major_locator(dates.MonthLocator(interval=3))
And finally, I found the ticks to be very crowded, so I formatted them to get a nicer fit
ax.xaxis.set_major_formatter(dates.DateFormatter('%b %y'))
labels = ax.get_xticklabels()
plt.setp(labels, rotation=85, fontsize=8)
To produce the following:
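Putting the steps of this answer together, a runnable sketch might look like the following. Two things here are my own assumptions rather than part of the answer: the monthly periods are converted with to_timestamp() and plotted through Matplotlib's ax.plot instead of DataFrame.plot, so that MonthLocator sees ordinary Matplotlib dates, and a minor locator is added to get the unlabelled monthly ticks asked for in the question.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

index_ = pd.period_range("3/1/2004", "12/1/2012", freq="M")
df2_ = pd.DataFrame(data=list(range(len(index_))), index=index_, columns=["data"])

# Convert the month periods to timestamps (start of each month).
x = df2_.index.to_timestamp()

fig, ax = plt.subplots(figsize=(12, 4))
ax.plot(x, df2_["data"])

ax.xaxis.set_major_locator(mdates.MonthLocator(interval=3))  # labelled tick every 3 months
ax.xaxis.set_minor_locator(mdates.MonthLocator())            # unlabelled tick every month
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %y"))
ax.xaxis.grid(True)                                          # vertical grid lines at major ticks
ax.tick_params(axis="x", labelrotation=85, labelsize=8)      # keep crowded labels readable
```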
So I am new to programming with matplotlib. I have created a color plot using imshow() and an array. At first the axes were just the row and column numbers of my array. I used extent = (xmin,xmax,ymin,ymax) to get the x-axis in unix time and the y-axis in altitude.
I want to change the x-axis from unix time (982376726, 982377321) to UT (02:25:26, 02:35:21). I have created a list of the time range in HH:MM:SS. I am not sure how to replace my current x-axis with these new numbers, without changing the color plot (or making it disappear).
I was looking at datetime.time but I got confused with it.
Any help would be greatly appreciated!
I have put together some example code which should help you with your problem.
The code first generates some randomised data using numpy.random. It then calculates your x-limits and y-limits where the x-limits will be based off of two unix timestamps given in your question and the y-limits are just generic numbers.
The code then plots the randomised data and uses pyplot methods to convert the x-axis formatting to nicely represented strings (rather than unix timestamps or array numbers).
The code is well commented and should explain everything you need, if not please comment and ask for clarification.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import datetime as dt
# Generate some random data for imshow
N = 10
arr = np.random.random((N, N))
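# Create your x-limits. Using two of your unix timestamps you first
# create a list of timezone-aware (UTC) datetime.datetime objects.
# A bare fromtimestamp(ts) would use the machine's local time zone instead of UT.
x_lims = [dt.datetime.fromtimestamp(ts, tz=dt.timezone.utc)
          for ts in [982376726, 982377321]]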
# You can then convert these datetime.datetime objects to the correct
# format for matplotlib to work with.
x_lims = mdates.date2num(x_lims)
# Set some generic y-limits.
y_lims = [0, 100]
fig, ax = plt.subplots()
# Using ax.imshow we set two keyword arguments. The first is extent.
# We give extent the values from x_lims and y_lims above.
# We also set the aspect to "auto" which should set the plot up nicely.
ax.imshow(arr, extent=[x_lims[0], x_lims[1], y_lims[0], y_lims[1]],
          aspect='auto')
# We tell Matplotlib that the x-axis is filled with datetime data,
# this converts it from a float (which is the output of date2num)
# into a nice datetime string.
ax.xaxis_date()
# We can use a DateFormatter to choose how this datetime string will look.
# I have chosen HH:MM:SS though you could add DD/MM/YY if you had data
# over different days.
date_format = mdates.DateFormatter('%H:%M:%S')
ax.xaxis.set_major_formatter(date_format)
# This rotates and right-aligns the x-axis tick labels so they fit better.
fig.autofmt_xdate()
plt.show()