Dates in the xaxis for a matplotlib plot with imshow - python

I am new to programming with matplotlib. I have created a color plot using imshow() and an array. At first the axes were just the row and column numbers of my array. I used extent = (xmin,xmax,ymin,ymax) to get the x-axis in unix time and the y-axis in altitude.
I want to change the x-axis from unix time (982376726,982377321) to UT(02:25:26, 02:35:21). I have created a list of the time range in HH:MM:SS. I am not sure how to replace my current x-axis with these new numbers, without changing the color plot (or making it disappear).
I was looking at datetime.time but I got confused with it.
Any help would be greatly appreciated!

I have put together some example code which should help you with your problem.
The code first generates some randomised data using numpy.random. It then calculates your x-limits and y-limits where the x-limits will be based off of two unix timestamps given in your question and the y-limits are just generic numbers.
The code then plots the randomised data and uses pyplot methods to convert the x-axis formatting to nicely represented strings (rather than unix timestamps or array numbers).
The code is well commented and should explain everything you need, if not please comment and ask for clarification.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import datetime as dt
# Generate some random data for imshow
N = 10
arr = np.random.random((N, N))
# Create your x-limits. Using two of your unix timestamps you first
# create a list of datetime.datetime objects using map.
x_lims = list(map(dt.datetime.fromtimestamp, [982376726, 982377321]))
# You can then convert these datetime.datetime objects to the correct
# format for matplotlib to work with.
x_lims = mdates.date2num(x_lims)
# Set some generic y-limits.
y_lims = [0, 100]
fig, ax = plt.subplots()
# Using ax.imshow we set two keyword arguments. The first is extent.
# We give extent the values from x_lims and y_lims above.
# We also set the aspect to "auto" which should set the plot up nicely.
ax.imshow(arr, extent=[x_lims[0], x_lims[1], y_lims[0], y_lims[1]],
          aspect='auto')
# We tell Matplotlib that the x-axis is filled with datetime data,
# this converts it from a float (which is the output of date2num)
# into a nice datetime string.
ax.xaxis_date()
# We can use a DateFormatter to choose how this datetime string will look.
# I have chosen HH:MM:SS though you could add DD/MM/YY if you had data
# over different days.
date_format = mdates.DateFormatter('%H:%M:%S')
ax.xaxis.set_major_formatter(date_format)
# This simply sets the x-axis data to diagonal so it fits better.
fig.autofmt_xdate()
plt.show()
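One caveat worth noting: dt.datetime.fromtimestamp called without a tz argument converts to the machine's local time zone, so the labels only match UT when the machine itself runs on UTC. Here is a minimal sketch, assuming the two timestamps from the question, that makes the UTC conversion explicit using the standard library's dt.timezone.utc. The names unix_limits, utc_limits and labels are only illustrative.

```python
import datetime as dt
import matplotlib.dates as mdates

# Unix timestamps taken from the question
unix_limits = [982376726, 982377321]

# Build timezone-aware datetimes in UTC instead of local time
utc_limits = [dt.datetime.fromtimestamp(ts, tz=dt.timezone.utc)
              for ts in unix_limits]

# Same conversion step as in the answer; the result can be used in extent
x_lims = mdates.date2num(utc_limits)

# The labels an HH:MM:SS formatter should show at the two ends of the axis
labels = [d.strftime('%H:%M:%S') for d in utc_limits]
print(labels)
```

With these inputs the labels come out as 02:25:26 and 02:35:21, which is the UT range quoted in the question.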

Related

How to plot x int date values from array matplotlib correctly?

I am having an issue when trying to plot some of the date values into a matplotlib side by side bar graph.
I first define my Series x = new_df['month'] which contains the following values:
0,2021-01-01
1,2021-02-01
2,2021-03-01
3,2021-04-01
4,2021-05-01
5,2021-06-01
6,2021-07-01
7,2021-08-01
8,2021-09-01
9,2021-10-01
10,2021-11-01
11,2021-12-01
12,2022-01-01
13,2022-02-01
14,2022-03-01
15,2022-04-01
16,2022-05-01
17,2022-06-01
18,2022-07-01
19,2022-08-01
20,2022-09-01
21,2022-10-01
22,2022-11-01
After this I define the function to plot my graph:
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import matplotlib.dates as mdates
import numpy as np
def side_by_side_bar_chart(x, y, labels, file_name):
    width = 0.25 # set bar width
    ind = np.arange(len(x)) # Get the number of x labels
    fig, ax = plt.subplots(figsize=(10, 8))
    # Get average number in order to set labels formatting
    ymax = int(max([mean(x) for x in y]))
    plt.xticks(ind, x) # sets x labels with values in x list (months)
    # These two lines format ax labels
    dtFmt = mdates.DateFormatter('%b-%y') # define the formatting
    plt.gca().xaxis.set_major_formatter(dtFmt)
    plt.savefig("charts/"+ file_name + ".png", dpi = 300)
However, my x values are plotted as Jan 70 for all xticks:
Wrong labeled x ticks
I suspect that this has something to do with formatting. The same is causing similar issues in a different part of the script where I use twin(x) for a side by side chart with a trendline on top and my values are plotted wrong in the graph:
Wrong plotted graph
Does anybody have an idea how to fix these bugs? Thank you for your help in advance!
Pass the dates in the x array and plot all values correspondingly in the graphs.
The thing is that your "x" is not a date. It is obviously a string. So formatter can't interpret it correctly.
Let's try to reproduce your problem (this is the kind of minimal reproducible example I was mentioning earlier) :
import datetime
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np # just to generate something to plot
# Generate a dummy set of 20 dates, starting from Mar 1 2020, spaced 31 days apart
dt=datetime.timedelta(days=31)
x0=[datetime.date(2020,3,1) + k*dt for k in range(20)]
x=[d.strftime("%Y-%m-%d") for d in x0] # This looks like your x: 20 strings
# And some y to have something to plot
y=np.cumsum(np.random.normal(0,1,20)) # Don't overthink it, it is just 20 numbers :)
# Plot y vs x (x being the strings)
plt.plot(x,y)
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%b-%y'))
plt.show()
Result
Now, the solution for that is very simple: x must contain dates, not strings.
From my example, I could just plt.plot(x0,y) instead of x, since x0 is the list of dates from which I computed x. But if, as it appears, you only have the strings available, you can parse them. For example, using [datetime.date.fromisoformat(d) for d in x].
Or, since you already have pandas at hand: pd.to_datetime(x) (it is not exactly the same datetime type, but both are understood by matplotlib)
import pandas as pd
xx=pd.to_datetime(x)
plt.plot(xx,y)
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter('%b-%y'))
plt.show()
Note that, without any action from me, it also stops printing all labels. That is because in the first case, matplotlib wasn't aware of any logical progression of x values. From its point of view, those were all just labels. And you can't, a priori, skip a label, since the reader could not guess what is between two labels separated by a gap (it seems obvious to us, since we know they are dates. But matplotlib doesn't know that. It is just as if x contained ['red', 'green', 'yellow', 'purple', 'black', 'blue', ...]. You would not expect every other label to be just arbitrarily skipped).
Whereas, now that we passed real dates to matplotlib, it is as if x was numerical: there is a logical progression of its values. Matplotlib knows it, and, more importantly, knows that we know it. So it is acceptable to just skip some to make the figure more readable: everybody knows what is between "Mar 20" and "May 20".
So, short answer: convert your string to dates.
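For the bar chart in the question there is one more wrinkle: the bars sit at np.arange(len(x)), so the tick positions are plain integers, and a DateFormatter reads those as day counts from matplotlib's epoch, which is presumably where "Jan 70" comes from. A minimal sketch of one way around it, assuming three sample month strings and made-up bar heights (values is hypothetical), is to parse the strings and format the labels by hand instead of attaching a DateFormatter:

```python
import datetime
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt

month_strings = ["2021-01-01", "2021-02-01", "2021-03-01"]  # sample of the question's data
values = [3, 5, 4]  # hypothetical bar heights

# Parse the ISO strings into real dates, then format them as e.g. "Jan-21"
months = [datetime.date.fromisoformat(s) for s in month_strings]
tick_labels = [d.strftime("%b-%y") for d in months]

# Bars at integer positions; ticks and labels are set together
positions = list(range(len(months)))
fig, ax = plt.subplots()
ax.bar(positions, values, width=0.25)
ax.set_xticks(positions)
ax.set_xticklabels(tick_labels)
```

Setting the tick positions and the labels together keeps each label attached to its own bar.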

Format tick labels in scatter plot to % in matplotlib - python [duplicate]

I have a line chart based on a simple list of numbers. By default the x-axis is just an increment of 1 for each value plotted. I would like it to be a percentage instead but can't figure out how. So instead of having an x-axis from 0 to 5, it would go from 0% to 100% (but keeping reasonably spaced tick marks). Code below. Thanks!
from matplotlib import pyplot as plt
from mpl_toolkits.axes_grid.axislines import Subplot
data=[8,12,15,17,18,18.5]
fig=plt.figure(1,(7,4))
ax=Subplot(fig,111)
fig.add_subplot(ax)
plt.plot(data)
The code below will give you a simplified x-axis which is percentage based. It assumes that your values are spaced equally between 0% and 100%.
It creates a perc array which holds evenly-spaced percentages that can be used to plot with. It then adjusts the formatting for the x-axis so it includes a percentage sign using matplotlib.ticker.FormatStrFormatter. Unfortunately this uses old-style (printf-style) string formatting, as opposed to the new style.
import matplotlib.pyplot as plt
import numpy as np
import matplotlib.ticker as mtick
data = [8,12,15,17,18,18.5]
perc = np.linspace(0,100,len(data))
fig = plt.figure(1, (7,4))
ax = fig.add_subplot(1,1,1)
ax.plot(perc, data)
fmt = '%.0f%%' # Format you want the ticks, e.g. '40%'
xticks = mtick.FormatStrFormatter(fmt)
ax.xaxis.set_major_formatter(xticks)
plt.show()
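As a quick illustration of what the old-style format string does, the sketch below applies the same '%.0f%%' pattern to a few sample tick values using Python's % operator. The sample values are arbitrary, and the doubled %% is what produces the literal percent sign.

```python
fmt = '%.0f%%'  # zero decimal places, followed by a literal percent sign

# Apply the pattern to a few sample tick values
labels = [fmt % value for value in (0.0, 40.0, 100.0)]
print(labels)
```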
This is a few months late, but I have created PR#6251 with matplotlib to add a new PercentFormatter class. With this class you can do as follows to set the axis:
import matplotlib.ticker as mtick
# Actual plotting code omitted
ax.xaxis.set_major_formatter(mtick.PercentFormatter(5.0))
This will display values from 0 to 5 on a scale of 0% to 100%. The formatter is similar in concept to what @Ffisegydd suggests doing except that it can take any arbitrary existing ticks into account.
PercentFormatter() accepts three arguments, xmax, decimals, and symbol. xmax allows you to set the value that corresponds to 100% on the axis (in your example, 5).
The other two parameters allow you to set the number of digits after the decimal point and the symbol. They default to None and '%', respectively. decimals=None will automatically set the number of decimal points based on how much of the axes you are showing.
Note that this formatter will use whatever ticks would normally be generated if you just plotted your data. It does not modify anything besides the strings that are output to the tick marks.
Update
PercentFormatter was accepted into Matplotlib in version 2.1.0.
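To see the scaling without drawing anything, here is a small sketch that calls the formatter's format_pct method directly on a few sample values. The sample values, and fixing decimals=0 so the output does not depend on the visible axis range, are choices made for this illustration.

```python
import matplotlib.ticker as mtick

# 5.0 on the axis corresponds to 100%; no digits after the decimal point
formatter = mtick.PercentFormatter(xmax=5.0, decimals=0)

# format_pct takes the value and the displayed range; the range only
# matters when decimals is None, so any positive number works here
labels = [formatter.format_pct(value, 5.0) for value in (0, 2.5, 5)]
print(labels)
```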
Totally late in the day, but I wrote this and thought it could be of use:
def transformColToPercents(x, rnd, navalue):
    # Returns a pandas Series that can be put in a new dataframe column,
    # where all values are scaled to the range 0-100%.
    # rnd = number of decimal places to round to
    # navalue = value used to replace NaN
    hv = x.max(axis=0)
    lv = x.min(axis=0)
    pp = pd.Series(((x-lv)*100)/(hv-lv)).round(rnd)
    return pp.fillna(navalue)
df['new column'] = transformColToPercents(df['a'], 2, 0)

Plotting the hours of the day instead of time

The title probably does not make sense, but I will try to explain.
I am plotting chemical concentrations over time. The x axis should be hours since midnight local time (i.e., 0,4,8,12,16,20). However, when I do this all of the xticks get smushed together to the left.
xticks = range(0,24,4)
ozoneest["mean"].plot(ax=ax, xticks=xticks,)
Results in:
xticks is only accepting arrays of datetime variables, which have values: 00:00, 04:00, 08:00, 12:00, 16:00, 20:00.
xticks = pd.date_range("2000/01/01", end="2000/01/02", freq="4H").time
ozoneest["mean"].plot(ax=ax, xticks=xticks,)
results in:
This is close to what I want, but I want just the number of the hour
Thanks!
I assume that your data is stored in a pandas dataframe with a DatetimeIndex that has an "Hour" frequency. I cannot exactly reproduce your problem seeing as you have not shared the code generating the ax object. Whether it is created with matplotlib or pandas, the problem is that the x-axis unit is based on the number of time periods (based on the DatetimeIndex frequency in pandas, days in matplotlib) that have passed since 1970-01-01. So the xticks = range(0,24,4) land far to the left relative to your datetimes. You can check the x-axis values of the default xticks with ax.get_xticks().
Here are two ways of formatting the xticks and labels as you want. I suggest that you do not create a new DatetimeIndex for the hours as this makes the code less easy to reuse, use instead the DatetimeIndex of the dataframe as shown in the second solution.
Create sample dataframe
import numpy as np # v 1.20.2
import pandas as pd # v 1.2.5
rng = np.random.default_rng(seed=123) # random number generator
time = pd.date_range(start="2000/01/01", end="2000/01/02", freq="H")[:-1]
mean = rng.normal(size=len(time))
ozoneest = pd.DataFrame(dict(mean=mean), index=time)
ozoneest.head()
Pandas plot with default xticks
ozoneest["mean"].plot()
Simple solution: do not use the DatetimeIndex as the x-axis
xticks = range(0,24,4)
ax = ozoneest["mean"].plot(use_index=False, xticks=xticks)
General solution: select xticks from DatetimeIndex and create labels with strftime
xticks = ozoneest.index[::4]
xticklabels = xticks.strftime("%H")
ax = ozoneest["mean"].plot()
ax.set_xticks(xticks)
ax.set_xticks([], minor=True)
ax.set_xticklabels(xticklabels)
This solution is more general because you do not need to manually adjust the xticks if the range of time of your dataset changes and the tick labels can be easily customized in many ways.
If you want to remove the leading zeros, you can use the following list comprehension:
xticklabels = [tick[1:] if tick[0] == "0" else tick for tick in xticks.strftime("%H")]
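An alternative to trimming strings, sketched here under the assumption of the same hourly index as in the sample dataframe, is to read the hour as an integer from the DatetimeIndex and convert it to text. The index is rebuilt locally so the snippet runs on its own.

```python
import pandas as pd

# Hourly index for one day, equivalent to the sample dataframe's index
index = pd.date_range(start="2000-01-01", periods=24,
                      freq=pd.Timedelta(hours=1))

# Every fourth timestamp becomes a tick
xticks = index[::4]

# .hour gives integers, so there are no leading zeros to strip
xticklabels = [str(hour) for hour in xticks.hour]
print(xticklabels)
```

The resulting list can be passed to ax.set_xticklabels in place of the strftime-based labels.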

axes.set_xticklabels breaks datetime format

I'm trying to force my will onto this matplotlib graph. When I set ax1.xaxis.set_major_formatter(myFmt) it works fine like in the upper graph.
However, when I add ax1.set_xticklabels((date),rotation=45) the time format reverts to matplotlib time like in the lower graph.
Both use the same input time variable. I also tried ax1.plot_date() but that only changes the look of the graph, not the time format.
date_1 = np.vectorize(dt.datetime.fromtimestamp)(time_data) # makes a datetimeobject from unix timestamp
date = np.vectorize(mdates.date2num)(date_1) # from datetime makes matplotib time
myFmt = mdates.DateFormatter('%d-%m-%Y/%H:%M')
ax1 = plt.subplot2grid((10,3), (0,0), rowspan=4, colspan=4)
ax1.xaxis_date()
ax1.plot(date, x)
ax1.xaxis.set_major_formatter(myFmt)
ax1.set_xticklabels((date),rotation=45)#ignores time format
Any ideas how I can force the custom timeformat onto the xticklabels? I get that xticklabels directly reads and displays the date variable but shouldnt it be possible to make it stick to the format? Especially if you later want to add xticks in custom date locations.
All ideas appreciated. Cheers
A locator specifies the locations of the ticks. A formatter formats the ticklabels at those positions. Using a formatter, like
ax1.xaxis.set_major_formatter(mdates.DateFormatter('%d-%m-%Y/%H:%M'))
hence works well. However, using set_xticklabels after specifying the formatter removes the DateFormatter and replaces it with a FixedFormatter. You will hence get ticklabels at automatically chosen positions but with labels that do not correspond to those positions. The graph will hence be labelled incorrectly.
Therefore, you should never use set_xticklabels without specifying a custom locator, e.g. via set_xticks, as well.
Here there is no need to use set_xticklabels at all. The formatter alone is enough.
import datetime as dt
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
time_data = np.array([1.5376248e+09,1.5376932e+09,1.5377112e+09])
x = np.array([1,3,2])
date_1 = np.vectorize(dt.datetime.fromtimestamp)(time_data)
date = np.vectorize(mdates.date2num)(date_1)
myFmt = mdates.DateFormatter('%d-%m-%Y/%H:%M')
ax1 = plt.subplot2grid((4,4), (0,0), rowspan=4, colspan=4)
ax1.xaxis_date()
ax1.plot(date, x)
ax1.xaxis.set_major_formatter(myFmt)
plt.setp(ax1.get_xticklabels(), rotation=45, ha="right")
plt.show()
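If ticks are wanted at custom date locations, as the question mentions, one option is to set only the positions with set_xticks and let the DateFormatter produce the text. This is a minimal sketch, assuming three made-up datetimes (the dates list and the values in x are hypothetical):

```python
import datetime as dt
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Hypothetical sample data
dates = [dt.datetime(2018, 9, 22, 12, 0),
         dt.datetime(2018, 9, 23, 6, 0),
         dt.datetime(2018, 9, 23, 12, 0)]
x = [1, 3, 2]

myFmt = mdates.DateFormatter('%d-%m-%Y/%H:%M')

fig, ax1 = plt.subplots()
ax1.plot(dates, x)

# Choose the tick positions explicitly; the formatter still writes the labels
ax1.set_xticks(mdates.date2num(dates))
ax1.xaxis.set_major_formatter(myFmt)
plt.setp(ax1.get_xticklabels(), rotation=45, ha="right")

# The formatter can also be called directly on a matplotlib date number
first_label = myFmt(mdates.date2num(dates[0]))
print(first_label)
```

Because the labels are never set by hand, they stay in sync with the tick positions.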
Alright I think I got it now.
str_dates = []
for i in time_data:
    j = dt.datetime.fromtimestamp(i)
    k = j.strftime('%d-%m-%Y/%H:%M')
    str_dates.append(k)
print(str_dates)
ax1.set_xticklabels((str_dates),rotation=45)
I'm not sure why this doesn't work with vectorize, but taking each date one by one removes the error source that the arrays are giving me.
@iDrwish: thanks again, you pushed me in the right direction.
You can coerce your time format by converting the datetime object to string.
You will have to do special handling of the dates if they are in UTC:
from datetime import datetime
str_dates = [datetime.utcfromtimestamp(timestamp).strftime('%d-%m-%Y/%H:%M') for timestamp in time_data]
ax1.set_xticklabels((str_dates),rotation=45)

matplotlib: manually change yaxis values to differ from the actual value (NOT: change ticks!) [duplicate]

I am trying to plot a data and function with matplotlib 2.0 under python 2.7.
The x values of the function are evolving with time: x first decreases to a certain value, then increases again.
If the function is plotted against time, it looks like this (plot of data against time).
I need the same x axis evolution for plotting against real x values. Unfortunately, as the x values are the same for both parts before and after, both values are mixed together. This gives me the wrong data plot:
In this example it means I need the x-axis to start at value 2.4 and decrease to 1.0, then increase again to 2.4. I swear I found before that this is possible, but unfortunately I can't find a trace of that again.
A matplotlib axis is by default linearly increasing. More importantly, there must be an injective mapping of the number line to the axis units. So changing the data range is not really an option (at least when the aim is to keep things simple).
It would hence be good to keep the original numbers and only change the ticks and ticklabels on the axis. E.g. you could use a FuncFormatter to map the original numbers to
np.abs(x-tp)+tp
where tp would be the turning point.
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker
x = np.linspace(-10,20,151)
y = np.exp(-(x-5)**2/19.)
plt.plot(x,y)
tp = 5
fmt = lambda x,pos:"{:g}".format(np.abs(x-tp)+tp)
plt.gca().xaxis.set_major_formatter(matplotlib.ticker.FuncFormatter(fmt))
plt.show()
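To make the effect of the formatter concrete, the sketch below calls a FuncFormatter built on the same mapping for a few sample axis positions. The function name mirrored_label and the sample positions are only illustrative.

```python
import matplotlib.ticker as mticker

tp = 5  # turning point, as in the answer


def mirrored_label(x, pos):
    # Positions on either side of tp at equal distance get the same label
    return "{:g}".format(abs(x - tp) + tp)


formatter = mticker.FuncFormatter(mirrored_label)

# Labels for positions left of, at, and right of the turning point
labels = [formatter(x, i) for i, x in enumerate((3, 5, 7))]
print(labels)
```

Positions 3 and 7 both get the label 7, so the labels decrease towards the turning point and increase again after it, while the underlying axis stays linear.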
One option would be to use two axes, and plot your two timespans separately on each axes.
for instance, if you have the following data:
myX = np.linspace(1,2.4,100)
myY1 = -1*myX
myY2 = -0.5*myX-0.5
plt.plot(myX,myY1, c='b')
plt.plot(myX,myY2, c='g')
you can instead create two subplots with a shared y-axis and no space between the two axes, plot each time span independently, and finally adjust the limits of one of your x-axes to reverse the order of the points:
fig, (ax1,ax2) = plt.subplots(1,2, gridspec_kw={'wspace':0}, sharey=True)
ax1.plot(myX,myY1, c='b')
ax2.plot(myX,myY2, c='g')
ax1.set_xlim((2.4,1))
ax2.set_xlim((1,2.4))
