While practicing visualization as a Python newbie, I ran into a conceptual issue that got me thinking.
I managed to change the price format on the y axis of a boxplot from scientific notation to something clearer. Here are the outputs before and after formatting the y axis:
[image: boxplot before formatting the y axis]
[image: boxplot after formatting the y axis]
import seaborn as sns
from matplotlib.ticker import FuncFormatter

boxy = sns.boxplot(x="waterfront", y="price", data=df)
# my successful experiment: show y tick labels as plain integers
f = lambda x, pos: f'{x:.0f}'
boxy.yaxis.set_major_formatter(FuncFormatter(f))
The problem is that, as I understand it, the attribute yaxis should belong to an Axis object, whereas what I call 'boxy' is an Axes object (at least according to the seaborn documentation).
Can anyone explain this?
You're right that seaborn's boxplot returns a matplotlib Axes object. And, as this answer explains, Axes and Axis objects are different things.
You don't need to inspect the code, but under the hood seaborn uses matplotlib; you can see this in the boxplot code on GitHub here.
When you call sns.boxplot, part of drawing your plot creates Axis objects, which are instances of classes in the matplotlib.axis module.
The y axis is the first part of boxy.yaxis.set_major_formatter(FuncFormatter(f)):
it is accessed with boxy.yaxis, and on it you call the method .set_major_formatter(FuncFormatter(f)).
To see this, yaxis = boxy.get_yaxis() returns the same YAxis object that belongs to the boxy Axes.
EDIT IN RESPONSE TO COMMENT:
Again, you're correct in the comment that this is not documented, as far as I could find. But if we look at the matplotlib source on GitHub here, we see in the YAxis class declaration:
class YAxis(Axis):
    __name__ = 'yaxis'
Note that this __name__ line does not rename the class: type(boxy.yaxis).__name__ is still 'YAxis'. As far as I can tell, it only sets a class-level label that matplotlib uses internally. The attribute boxy.yaxis itself is an instance of this class, which the Axes creates and stores on itself when it is initialised.
So the attribute does exist.
boxy's yaxis inherits the set_major_formatter method from its base class, Axis. To confirm this hierarchy, look at the method resolution order:
print(type(boxy.get_yaxis()).__mro__)
Should display:
(<class 'matplotlib.axis.YAxis'>, <class 'matplotlib.axis.Axis'>, <class 'matplotlib.artist.Artist'>, <class 'object'>)
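To make the Axes/Axis distinction concrete, here is a minimal sketch using plain matplotlib instead of seaborn. The price values are made up and the non-interactive Agg backend is an assumption, chosen only so the example runs without a display. It checks that the plotting object is an Axes, that its yaxis attribute is a YAxis, and that the formatter is attached to that Axis.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
from matplotlib.axes import Axes
from matplotlib.axis import YAxis
from matplotlib.ticker import FuncFormatter

fig, ax = plt.subplots()
ax.boxplot([[100000, 250000, 400000, 1200000]])  # made-up prices

# The object we plot on is an Axes; the Axis objects hang off it as attributes.
is_axes = isinstance(ax, Axes)
is_yaxis = isinstance(ax.yaxis, YAxis)
same_object = ax.get_yaxis() is ax.yaxis

# set_major_formatter is a method of the Axis, not of the Axes.
ax.yaxis.set_major_formatter(FuncFormatter(lambda x, pos: f"{x:.0f}"))
label = ax.yaxis.get_major_formatter()(250000, 0)
print(is_axes, is_yaxis, same_object, label)
```

With seaborn, the object returned by sns.boxplot plays the role of ax here.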
I am using the basic axis.annotate(str(i)) function to show values next to the points of my graph. The problem is that they quickly get bunched together. So I have two questions: how can I remove an annotation, and how can I make one smaller (font size)?
Here is the reference to the annotation method:
http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.annotate
I have done my research and surprisingly found nothing. Cheers.
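axis.annotate(str(i)) returns an Annotation object. Assign it to a variable; then you can manipulate it however you want, for example resize it with set_fontsize() or delete it with remove().
import matplotlib.pyplot as plt

fig, ax = plt.subplots(1, 1)
ax.plot(range(5))
# pass the text positionally; the keyword for it differs between matplotlib versions
text = ax.annotate('asdf', xy=(2, 2))
# use any set_ function to change the corresponding property
text.set_fontsize(20)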
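Both parts of the question can be sketched together as follows. The data values are made up and the Agg backend is an assumption for running without a display. Each Annotation is kept in a list so that it can be resized or removed later.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
values = [0, 1, 4, 9, 16]  # made-up data
ax.plot(values)

# Keep a handle to every annotation so it can be changed afterwards.
labels = [ax.annotate(str(v), xy=(i, v), fontsize=8) for i, v in enumerate(values)]

labels[0].set_fontsize(6)    # make one annotation smaller
count_before = len(ax.texts)
labels[-1].remove()          # delete one annotation from the axes
count_after = len(ax.texts)
print(count_before, count_after)
```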
from datetime import datetime
import matplotlib.pyplot as plt
import pandas as pd

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
data = pd.read_csv(r"C:\Users\champion\Desktop\ch02\spx.csv")
spx = data["SPX"]
spx.plot(ax=ax, style="k-")
I can't understand what "ax=ax" means here.
From the documentation of plot():
DataFrame.plot(x=None, y=None, kind='line', ax=None, subplots=False,
sharex=None, sharey=False, layout=None, figsize=None, use_index=True,
title=None, grid=None, legend=True, style=None, logx=False,
logy=False, loglog=False, xticks=None, yticks=None, xlim=None,
ylim=None, rot=None, fontsize=None, colormap=None, table=False,
yerr=None, xerr=None, secondary_y=False, sort_columns=False, **kwds)
Parameters: ax : matplotlib axes object, default None
You can see that ax is a keyword argument here. You just happened to name your variable ax as well, and you are passing it as the value of that keyword argument to plot().
ax is the keyword for the part of the overall figure in which a chart/plot is drawn. So, when you type "spx.plot(ax=", you are specifying which part of the figure to draw on. The reason you write "ax=ax" is, as Nahal rightly pointed out, that you defined a variable named "ax" earlier in your code (ax=fig.add_subplot(1,1,1)) and you are using it as the value of the ax keyword.
Here's an article with some helpful visuals.
https://towardsdatascience.com/what-are-the-plt-and-ax-in-matplotlib-exactly-d2cf4bf164a9
The previous explanations are correct, but note that here ax is an argument of the Series.plot() method.
When you import the data with
data=pd.read_csv(r"C:\Users\champion\Desktop\ch02\spx.csv")
what you get is a DataFrame.
And then:
spx=data["SPX"]
The code above gives you a Series back. So we are dealing with the Series.plot() method, although DataFrame.plot() has the same argument.
First of all, it is important to understand that this plot() method is related to, but not the same as, the plot() function from matplotlib.
When you create a figure with:
fig=plt.figure()
It creates something like a blank sheet, and you can't plot directly on a blank sheet: you first need a subplot (an Axes) on it.
The next line creates a subplot on which you can finally plot something.
ax=fig.add_subplot(1,1,1)
And now we are getting to the question.
spx.plot(ax=ax,style="k-")
This line calls the plot method of a Series, and this method has an optional argument called 'ax'.
According to its description, this argument is the matplotlib axes object on which to draw the plot. If nothing is specified, the method uses matplotlib's currently active subplot.
Long story short, in your example there is only one active subplot, the 'ax' that was created before, so you could run your code without 'ax=ax' and get the same result.
But it makes sense when you have more than one subplot, because then you can specify which one the spx Series should be plotted on.
We could have created a second and a third subplot, like this:
ax1 = fig.add_subplot(2, 2, 2)
ax2 = fig.add_subplot(2, 2, 3)
In this case, if I want to plot that Series on 'ax1', I can pass it as the argument:
spx.plot(ax=ax1,style="k-")
And now it is plotted in exactly the box I wanted in the figure.
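A self-contained sketch of this idea, with the caveat that the Series values below are a made-up stand-in for the SPX column (the CSV file is not available here) and the Agg backend is an assumption for running without a display:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for data["SPX"].
spx = pd.Series([100.0, 102.5, 101.0, 105.0], name="SPX")

fig = plt.figure()
ax1 = fig.add_subplot(2, 2, 2)  # top-right cell of a 2x2 grid
ax2 = fig.add_subplot(2, 2, 3)  # bottom-left cell of the same grid

# ax=ax1 tells pandas which subplot to draw on; ax2 stays empty.
returned = spx.plot(ax=ax1, style="k-")
lines_on_ax1 = len(ax1.lines)
lines_on_ax2 = len(ax2.lines)
print(returned is ax1, lines_on_ax1, lines_on_ax2)
```

The method returns the Axes it drew on, so the return value can be used to check where the line ended up.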
I want to name figures like this:
import matplotlib as plt
for i in range(0,3):
    plt.figure('Method%s',%i)
But it seems this is not possible this way.
Another way I found is using a super title, but it still does not work:
from pylab import *
for i in range(0,3):
    fig = gcf()
    fig.suptitle('Method%s',%i)
Do you know any solutions?
If you need to use the figures you are going to create, it may be a good idea to store them in some kind of data structure. In my example I use a list, and I also show how to use one of the instantiated figures later.
As for naming your figures according to a sequence number, you have the right general idea but not the details: to give a figure a user-defined name in plt.figure(), you have to use a keyword argument that is not called name, as one might expect, but num.
figures = [plt.figure(num="Figure n.%d"%(i+1)) for i in range(3)]
#                     ^^^
...
figures[1].add_axes(...)
...
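Applied to the loop from the question, a minimal sketch could look like the following. The "Method%d" labels come from the question; the Agg backend is an assumption for running without a display. Note the string formatting is "Method%d" % i, without the comma used in the question.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt

plt.close("all")  # start from a clean state

# num accepts a string, which becomes the figure's label.
figures = [plt.figure(num="Method%d" % i) for i in range(3)]
labels = plt.get_figlabels()

# A super title can be set per figure through the stored objects.
for i, fig in enumerate(figures):
    fig.suptitle("Method%d" % i)

# Asking for an existing label returns the existing figure, not a new one.
same = plt.figure(num="Method1") is figures[1]
print(labels, same)
```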
I'm new to this. Look at this snippet of code, which produces a graph.
import matplotlib.pyplot as plt
import pandas as pd

df = pd.read_csv("data/GOOG.csv")
df['High'].plot()
plt.show()
My question is: how does plt.show() (matplotlib.pyplot) get the values of x and y when plt is not called with any parameters? The plot function belongs to the dataframe object. Does it store the values in some default place from which plt can get them?
plt.show() does not 'plot' the values; it shows the figure(s) that have already been created.
It is df['High'].plot() that creates this figure under the hood. The pandas plotting functions are implemented by calling out to matplotlib. In this snippet no figure exists yet, so pandas creates a new one; alternatively, you can pass an existing subplot with the ax keyword argument to add the plot to it.
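This can be checked with a small sketch. The DataFrame below is a made-up stand-in for GOOG.csv and the Agg backend is an assumption for running without a display. It shows that the figure and its data already exist after the pandas call, before plt.show() would be reached.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend so no window is needed
import matplotlib.pyplot as plt
import pandas as pd

plt.close("all")  # make sure no figure exists yet

# Made-up stand-in for the "High" column of data/GOOG.csv.
df = pd.DataFrame({"High": [10.0, 12.0, 11.5, 13.0]})

figs_before = plt.get_fignums()
ax = df["High"].plot()           # pandas creates the figure and draws the line
figs_after = plt.get_fignums()

# The plotted values live in the Line2D object on the axes.
y = list(ax.lines[0].get_ydata())
print(figs_before, len(figs_after), y)
```

plt.show() then only has to display the figures that pyplot is already tracking.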