How does the plotting function of pandas and matplotlib sync? - python

I'm new to this. The following snippet produces a graph:
import pandas as pd
import matplotlib.pyplot as plt

df = pd.read_csv("data/GOOG.csv")
df['High'].plot()
plt.show()
My question is: how does plt.show() (matplotlib.pyplot) get the values of x and y when plt is not called with any parameters? The plot function belongs to the DataFrame object. Does it store the values somewhere by default, from which plt can get them?

plt.show() does not 'plot' the values; it shows the figure(s) that have already been created.
It is df['High'].plot() that creates this figure under the hood. The pandas plotting functions are implemented by calling out to matplotlib. By default, the call creates a new figure, unless you use the ax keyword argument to specify a subplot on which to add the plot.
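
A minimal sketch of this hand-off, using a small made-up DataFrame in place of GOOG.csv (the values and the Agg backend are assumptions, chosen so the snippet runs without a display). It suggests that the pandas call returns the matplotlib Axes it drew on, that this Axes belongs to pyplot's current figure, and that the data is already stored in the line on that Axes before plt.show() is ever called.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend, no window needed
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for the 'High' column of GOOG.csv
df = pd.DataFrame({"High": [10.0, 12.5, 11.0, 13.2]})

ax = df["High"].plot()  # pandas asks matplotlib for a figure/axes and draws the line
fig = plt.gcf()         # pyplot's "current figure", the one plt.show() would display

# The plotted values live in the Line2D artist attached to the axes
line = ax.get_lines()[0]
y_values = list(line.get_ydata())

print(ax.figure is fig)
print(y_values)
# plt.show() would now simply display this already-built figure
```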

Related

How can the following: "axes.yaxis.set_major_formatter(FuncFormatter(f))" work since yaxis should be an attribute of AXIS and not of AXES objects?

While practicing visualization as a Python newbie, I ran into a conceptual issue that got me thinking.
I managed to change the price format on the y axis of a boxplot from scientific notation to something clearer. Here are the outputs before and after formatting the y axis:
[images: before, after]
import seaborn as sns
from matplotlib.ticker import FuncFormatter

boxy = sns.boxplot(x="waterfront", y="price", data=df)
# format the y tick labels as plain integers instead of scientific notation
f = lambda x, pos: f'{x:.0f}'
boxy.yaxis.set_major_formatter(FuncFormatter(f))
The problem is that I realized the attribute yaxis should refer to an Axis object, whereas what I call 'boxy' here is an Axes object (at least according to the seaborn documentation).
Can anyone explain it?
You're right that seaborn's boxplot returns a matplotlib Axes object, and, as this answer explains, Axes and Axis objects are different.
You don't need to inspect the code, but under the hood seaborn uses matplotlib, as the boxplot source on GitHub shows.
When you call sns.boxplot, part of drawing the plot creates Axis objects, which are instances of classes in the matplotlib.axis module.
In boxy.yaxis.set_major_formatter(FuncFormatter(f)), the first part, boxy.yaxis, accesses the y axis of the Axes. You then call the method .set_major_formatter(FuncFormatter(f)) on it.
To see this, yaxis = boxy.get_yaxis() returns the same y-axis object of the boxy Axes.
EDIT IN RESPONSE TO COMMENT:
Again, you're correct in the comment that this does not seem to be documented, as far as I could find. If we look at the matplotlib source on GitHub, the YAxis class declaration reads:
class YAxis(Axis):
    __name__ = 'yaxis'
Note that this class attribute does not rename the class, which is still called YAxis. The yaxis attribute you are using lives on the Axes instance: each Axes holds a YAxis object under that name, and that is what boxy.yaxis returns.
So it exists!
boxy's yaxis inherits the set_major_formatter method from its base class, Axis. To confirm this hierarchy, look at the method resolution order:
print(type(boxy.get_yaxis()).__mro__)
Should display:
(<class 'matplotlib.axis.YAxis'>, <class 'matplotlib.axis.Axis'>, <class 'matplotlib.artist.Artist'>, <class 'object'>)
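
The relationship can also be checked without seaborn. This is only a sketch on a plain matplotlib Axes with made-up data (the Agg backend is an assumption so it runs headless): the yaxis attribute of an Axes is a YAxis instance, get_yaxis() hands back that same object, and the formatter is set on it.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
from matplotlib.axis import Axis, YAxis
from matplotlib.ticker import FuncFormatter

fig, ax = plt.subplots()
ax.plot([0, 1, 2], [100000, 2500000, 40000000])  # made-up prices

# An Axes (the plot area) owns Axis objects (the x and y axes)
is_yaxis = isinstance(ax.yaxis, YAxis)
is_axis = isinstance(ax.yaxis, Axis)       # YAxis subclasses Axis
same_object = ax.get_yaxis() is ax.yaxis   # two ways to reach the same object

# set_major_formatter is defined on Axis and inherited by YAxis
formatter = FuncFormatter(lambda x, pos: f"{x:.0f}")
ax.yaxis.set_major_formatter(formatter)

print(is_yaxis, is_axis, same_object)
```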

Merge two plots in one figure out of class in python

It is my first time posting in stack overflow and I am sorry if the question seems basic.
I have defined a class in another folder, and a plot function is defined in this class. Is there a way to merge two plots into one figure from outside the class definition? An example of the plot function defined in the class is below:
def plot_ls(self):
    plt.figure()
    plt.semilogx(self.lk['ni_x'],self.lk['si_x'],'r')
    plt.semilogx(self.lk['ni_X'],self.lk['si_X'],'k')
    plt.grid(True)
Then, from another script, the plot function is called twice for different data series:
data1.plot_ls()
data2.plot_ls()
There will be 2 plots in 2 different diagrams. Is there a method to combine them?
You would need to call plt.figure() outside of your function:
def plot_ls(self):
    plt.semilogx(self.lk['ni_x'],self.lk['si_x'],'r')
    plt.semilogx(self.lk['ni_X'],self.lk['si_X'],'k')
And then:
plt.figure()
data1.plot_ls()
data2.plot_ls()
plt.grid(True)
I'm not super familiar with the grid part, but I think this should do what you're looking for.
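
A possible variation, sketched under assumptions: the class name Measurement, the optional ax parameter, and the sample values are all hypothetical and not from the question. Instead of relying on the current figure, the method accepts the Axes to draw on, so the caller decides whether two data series share one plot.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt


class Measurement:
    """Hypothetical stand-in for the class in the question."""

    def __init__(self, lk):
        self.lk = lk

    def plot_ls(self, ax=None):
        # Draw on the axes we are given; otherwise fall back to the current axes
        if ax is None:
            ax = plt.gca()
        ax.semilogx(self.lk['ni_x'], self.lk['si_x'], 'r')
        ax.semilogx(self.lk['ni_X'], self.lk['si_X'], 'k')
        return ax


# Made-up data series
data1 = Measurement({'ni_x': [1, 10, 100], 'si_x': [1, 2, 3],
                     'ni_X': [1, 10, 100], 'si_X': [2, 3, 4]})
data2 = Measurement({'ni_x': [1, 10, 100], 'si_x': [3, 4, 5],
                     'ni_X': [1, 10, 100], 'si_X': [4, 5, 6]})

fig, ax = plt.subplots()  # one figure, one axes, created by the caller
data1.plot_ls(ax)
data2.plot_ls(ax)
ax.grid(True)
```

Passing the Axes explicitly keeps the class free of any figure management, which may be easier to reuse than depending on pyplot's global state.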

How to delete and resize matplotlib annotations?

I am using the basic axis.annotate(str(i)) function to show values along the points of my graph. The problem is that they quickly get too bunched together. So I have two questions: How can I remove an annotation? And how can I make one smaller (font size)?
Here is the reference to the annotation method:
http://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.annotate
I have done my research and surprisingly found nothing. Cheers.
axis.annotate(str(i)) returns an Annotation object. Assign it to a variable, and then you can manipulate it however you want.
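import matplotlib.pyplot as plt

fig, ax = plt.subplots(1, 1)
ax.plot(range(5))
# pass the label text positionally; the old s= keyword was renamed to text= in newer matplotlib
text = ax.annotate('asdf', xy=(2, 2))
# use any set_* method to change its properties
text.set_fontsize(20)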
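
The removal half of the question can be handled the same way. A small sketch with made-up points (the Agg backend is an assumption so it runs headless): keep the returned Annotation objects in a list, shrink one with set_fontsize, and delete another with its remove() method.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot(range(5))

# Keep a handle on every annotation so it can be changed or removed later
notes = [ax.annotate(str(i), xy=(i, i)) for i in range(5)]

notes[0].set_fontsize(6)  # make one annotation smaller
notes[-1].remove()        # take one annotation off the axes entirely

print(len(ax.texts))      # annotations still attached to the axes
```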

I can't understand why "ax=ax" meaning in matplotlib

from datetime import datetime
fig=plt.figure()
ax=fig.add_subplot(1,1,1)
data=pd.read_csv(r"C:\Users\champion\Desktop\ch02\spx.csv")
spx=data["SPX"]
spx.plot(ax=ax, style="k-")
I can't understand what "ax=ax" means here.
From the documentation of plot():
DataFrame.plot(x=None, y=None, kind='line', ax=None, subplots=False,
sharex=None, sharey=False, layout=None, figsize=None, use_index=True,
title=None, grid=None, legend=True, style=None, logx=False,
logy=False, loglog=False, xticks=None, yticks=None, xlim=None,
ylim=None, rot=None, fontsize=None, colormap=None, table=False,
yerr=None, xerr=None, secondary_y=False, sort_columns=False, **kwds)
Parameters: ax : matplotlib axes object, default None
You can see that ax is a keyword argument here. It just happens that you also named your variable as ax and you are sending it as the value of that keyword argument to the function plot().
ax is the keyword for the Axes, the part of the overall figure in which a chart/plot is drawn. So, when you type "spx.plot(ax=", you are specifying which part of the figure to draw on. The reason you write "ax=ax" is, as Nahal rightly pointed out, that you defined a variable named "ax" on the third line of code and you are passing it as the value of the ax keyword.
Here's an article with some helpful visuals.
https://towardsdatascience.com/what-are-the-plt-and-ax-in-matplotlib-exactly-d2cf4bf164a9
The previous explanations are correct, but note that here ax is an argument of the Series.plot() method.
Reading the file with
data=pd.read_csv(r"C:\Users\champion\Desktop\ch02\spx.csv")
gives you a DataFrame.
And then:
spx=data["SPX"]
gives you a Series back. So we are dealing with the Series.plot() method, although DataFrame.plot() has the same argument.
First of all, it is important to understand that this plot() method is related to, but not the same as, the plot() function from matplotlib.
When you create a figure with:
fig=plt.figure()
It creates something like a blank sheet, and you can't plot directly on a blank sheet.
The next line creates a subplot, on which you can finally plot something.
ax=fig.add_subplot(1,1,1)
And now we are getting to the question.
spx.plot(ax=ax,style="k-")
This line calls the plot method of a Series, and this method has an optional argument called 'ax'.
According to its description, this argument is the matplotlib axes object on which the plot should be drawn. If nothing is specified, pandas uses matplotlib's currently active subplot.
Long story short, in your example there is only one active subplot, the 'ax' created before, so you could run your code without 'ax=ax' and get the same result.
But it makes sense when you have more than one subplot object, because then you can specify which one the spx Series should be plotted on.
We could have created a second and a third subplot, like this:
ax1 = fig.add_subplot(2, 2, 2)
ax2 = fig.add_subplot(2, 2, 3)
In this case, if I want to plot that Series on the 'ax1', I could have passed that to the argument:
spx.plot(ax=ax1,style="k-")
And now it is plotting on the exact box I wanted in the figure.
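
To make the effect of the ax argument concrete, here is a small sketch with a made-up Series instead of spx.csv (the values, the 1x2 layout, and the Agg backend are assumptions). Only the subplot passed as ax receives the line; the other one stays empty.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import matplotlib.pyplot as plt
import pandas as pd

# Made-up stand-in for the SPX column
spx = pd.Series([1400.0, 1420.5, 1410.2, 1435.8], name="SPX")

fig = plt.figure()
ax1 = fig.add_subplot(1, 2, 1)  # left subplot
ax2 = fig.add_subplot(1, 2, 2)  # right subplot (currently the active one)

# Without ax=..., pandas would draw on the active subplot (ax2);
# with ax=ax1 we direct the plot to the left subplot explicitly.
returned = spx.plot(ax=ax1, style="k-")

print(returned is ax1)
print(len(ax1.get_lines()), len(ax2.get_lines()))
```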

Additional keyword arguments in seaborn jointplot

I'm trying to find out how matplotlib and seaborn plotting functions are associated. Particularly, I'd like to know what pyplot arguments can be passed into keyword dicts marginal_kws and annot_kws in function seaborn.jointplot().
Suppose we have DataFrame data with columns c0 and c1. I guessed that joint_kws accepts arguments from pyplot.hexbin(), so when I tried to tune the appearance with arguments from there, it worked fine:
import seaborn as sns
sns.jointplot('c0', 'c1', data=data, kind='hex',
              joint_kws={'gridsize':100, 'bins':'log', 'xscale':'log', 'yscale':'log'})
Then I tried to set a log scale on the histogram axes with the argument log=True from pyplot.hist():
sns.jointplot('c0', 'c1', data=data, kind='hex',
              joint_kws={'gridsize':100, 'bins':'log', 'xscale':'log', 'yscale':'log'},
              marginal_kws={'log':True})
This results in
TypeError: distplot() got an unexpected keyword argument 'log'
How do I get this right?
P.S. This question is not about setting log scales in seaborn (with JointGrid, I know), but rather about passing matplotlib arguments into seaborn functions in general.
The dictionary of keyword arguments gets passed to distplot, which takes a dictionary for hist_kws. So you'll have to do something like marginal_kws={'hist_kws': {'log': True}}. With that said, shared log axes remain an enduring headache with jointplot, and I couldn't get something that looked good out of the box when adapting your code. Some tweaking might get it working, though.
It may also be useful to try and use JointGrid directly to avoid this kind of complexity.
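
One way the JointGrid route might look, as a rough sketch only: the lognormal sample data, bin counts, and the Agg backend are assumptions, and the result is not claimed to be well styled. The point is that JointGrid exposes its three matplotlib Axes (ax_joint, ax_marg_x, ax_marg_y), so plain matplotlib calls with their usual keyword arguments can be made on them directly.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive backend
import numpy as np
import seaborn as sns

# Made-up positive data standing in for columns c0 and c1
rng = np.random.default_rng(0)
c0 = rng.lognormal(mean=0.0, sigma=1.0, size=500)
c1 = rng.lognormal(mean=0.0, sigma=1.0, size=500)

g = sns.JointGrid(x=c0, y=c1)

# Joint plot: call matplotlib's hexbin on the joint axes with its own keywords
g.ax_joint.hexbin(c0, c1, gridsize=30, bins="log", xscale="log", yscale="log")

# Marginals: matplotlib's hist accepts log=True directly;
# log-spaced bins match the log-scaled shared axes
x_bins = np.logspace(np.log10(c0.min()), np.log10(c0.max()), 30)
y_bins = np.logspace(np.log10(c1.min()), np.log10(c1.max()), 30)
g.ax_marg_x.hist(c0, bins=x_bins, log=True)
g.ax_marg_y.hist(c1, bins=y_bins, orientation="horizontal", log=True)
```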
