Additional keyword arguments in seaborn jointplot - python

I'm trying to find out how matplotlib and seaborn plotting functions are associated. Particularly, I'd like to know what pyplot arguments can be passed into keyword dicts marginal_kws and annot_kws in function seaborn.jointplot().
Suppose we have DataFrame data with columns c0 and c1. I guessed that joint_kws accepts arguments from pyplot.hexbin(), so when I tried to tune the appearance with arguments from there, it worked fine:
import seaborn as sns
sns.jointplot('c0', 'c1', data=data, kind='hex',
joint_kws={'gridsize':100, 'bins':'log', 'xscale':'log', 'yscale':'log'})
Then I tried to set log scale at histogram axes with an argument log=True from pyplot.hist():
sns.jointplot('c0', 'c1', data=data, kind='hex',
joint_kws={'gridsize':100, 'bins':'log', 'xscale':'log', 'yscale':'log'},
marginal_kws={'log':True})
This results in
TypeError: distplot() got an unexpected keyword argument 'log'
How do I fix this?
P.S. This question is not about setting log scales in seaborn (I know that can be done with JointGrid), but about passing matplotlib arguments into seaborn functions in general.

The dictionary of keyword arguments gets passed to distplot, which takes a dictionary for hist_kws. So you'll have to do something like marginal_kws={'hist_kws': {'log': True}}. With that said, shared log axes remain an enduring headache with jointplot, and I couldn't get something that looked good out of the box when adapting your code. Some tweaking might get it working, though.
It may also be useful to try and use JointGrid directly to avoid this kind of complexity.
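To illustrate why the nesting matters, here is a minimal pure-Python sketch of the forwarding chain. The three functions are toy stand-ins with hypothetical names, not seaborn's or matplotlib's actual code; they only mimic how each layer unpacks the dict it receives.

```python
# Toy stand-ins (hypothetical names) for the three layers involved.

def hist_like(values, bins=10, log=False):
    """Innermost layer: plays the role of pyplot.hist and accepts 'log'."""
    return {"n": len(values), "bins": bins, "log": log}

def distplot_like(values, hist_kws=None, color=None):
    """Middle layer: plays the role of distplot; it has no 'log' parameter."""
    hist_kws = {} if hist_kws is None else dict(hist_kws)
    return hist_like(values, **hist_kws)

def jointplot_like(x, marginal_kws=None):
    """Outer layer: forwards marginal_kws to the middle layer unchanged."""
    marginal_kws = {} if marginal_kws is None else dict(marginal_kws)
    return distplot_like(x, **marginal_kws)

data = [1, 10, 100, 1000]

# Passing 'log' one level too high fails, much like the TypeError in the question.
try:
    jointplot_like(data, marginal_kws={"log": True})
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# Nesting it under hist_kws reaches the layer that understands it.
result = jointplot_like(data, marginal_kws={"hist_kws": {"log": True}})

print(error_message)
print(result)
```

The general rule this suggests: each *_kws dict is unpacked into exactly one function, so an argument has to sit at the level of the function that declares it.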

Related

seaborn rc parameters for set_context and set_style

In the tutorial for setting up the aesthetics of your plots, there are a few different methods:
set_style
set_context
axes_style
Each one of these accepts an rc keyword parameter dictionary. In each individual API page for the above three functions, it says:
rcdict, optional:
Parameter mappings to override the values in the preset seaborn style dictionaries. This only updates parameters that are considered part of the style definition.
Back in the tutorial page, under axes_style it goes on to say exactly how you can see what parameters are available for the rc dictionary for this one function:
If you want to see what parameters are included, you can just call the function with no arguments, which will return the current settings:
However, using this on the other functions always returns None. So, for example, I am using the following mix of matplotlib and seaborn to set parameters:
mpl.rcParams['figure.figsize'] = [16,10]
viz_dict = {
'axes.titlesize':18,
'axes.labelsize':16,
}
sns.set_context("notebook", rc=viz_dict)
sns.set_style("whitegrid")
I also noticed that putting my dictionary in the set_style method does nothing; at least for those parameters, it only works in set_context. This suggests that each function accepts its own, mutually exclusive set of parameters. However, this is not outlined anywhere in the docs.
I want to know which one of these three functions will accept a parameter for figsize. I'd also be curious to see what else they accept that might help me fine-tune things. My goal is to use the seaborn interface exclusively as often as possible. I don't need the fine-grained control that matplotlib provides, and often find it awkward anyway.
It would appear that the answer is 'none of the above'. The valid keys for set_style and set_context are listed here:
_style_keys = [
"axes.facecolor", "axes.edgecolor",
"axes.grid", "axes.axisbelow", "axes.labelcolor",
"figure.facecolor", "grid.color",
"grid.linestyle", "text.color",
"xtick.color", "ytick.color",
"xtick.direction", "ytick.direction",
"lines.solid_capstyle",
"patch.edgecolor", "patch.force_edgecolor",
"image.cmap", "font.family", "font.sans-serif",
"xtick.bottom", "xtick.top",
"ytick.left", "ytick.right",
"axes.spines.left", "axes.spines.bottom",
"axes.spines.right", "axes.spines.top",]
_context_keys = [
"font.size", "axes.labelsize",
"axes.titlesize", "xtick.labelsize",
"ytick.labelsize", "legend.fontsize",
"axes.linewidth", "grid.linewidth",
"lines.linewidth", "lines.markersize",
"patch.linewidth",
"xtick.major.width", "ytick.major.width",
"xtick.minor.width", "ytick.minor.width",
"xtick.major.size", "ytick.major.size",
"xtick.minor.size", "ytick.minor.size",]
Also note that set_style is just a convenience function which calls axes_style.
So you will have to use matplotlib.rcParams. If the usual rcParams['figure.figsize'] = [16,10] syntax is not to your liking, you could of course create your own style.
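As a rough way to check this yourself, the sketch below inspects the keys seaborn reports and then sets the figure size through matplotlib instead. It assumes seaborn and matplotlib are installed and selects the non-interactive Agg backend purely so it runs without a display; the variable names are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, only so the sketch runs anywhere
import matplotlib.pyplot as plt
import seaborn as sns

# Called with no arguments, both functions return the current values of
# the parameters they manage.
style_keys = set(sns.axes_style())
context_keys = set(sns.plotting_context())

# Keys that are not part of the style definition are dropped from the rc dict.
style = sns.axes_style("whitegrid", rc={"figure.figsize": (16, 10)})
figsize_in_style = "figure.figsize" in style

# figure.figsize is a plain matplotlib rc parameter, so set it there.
with plt.rc_context({"figure.figsize": (16, 10)}):
    fig = plt.figure()
    size = tuple(fig.get_size_inches())
plt.close(fig)

print("figure.figsize" in style_keys, "figure.figsize" in context_keys)
print(figsize_in_style, size)
```

Using rc_context keeps the change local to the with block; assigning to matplotlib.rcParams directly, as in the question, makes it global for the session.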

Can anyone explain seaborn's set_context() to me?

I am trying to learn visualization with python and stuck here:
sns.set_context('notebook')
ax = data.plot.hist(bins=25, alpha=0.42)
ax.set_xlabel('Size (cm)');
Can anyone help explain what this code sample means?
From the documentation:
seaborn.set_context(context=None, font_scale=1, rc=None)
Set the plotting context parameters.
This affects things like the size of the labels, lines, and other elements of the plot, but not the overall style. The base context is “notebook”, and the other contexts are “paper”, “talk”, and “poster”, which are version of the notebook parameters scaled by .8, 1.3, and 1.6, respectively.
Parameters:
context : dict, None, or one of {paper, notebook, talk, poster}
A dictionary of parameters or the name of a preconfigured set.
font_scale : float, optional
Separate scaling factor to independently scale the size of the font elements.
rc : dict, optional
Parameter mappings to override the values in the preset seaborn context dictionaries. This only updates parameters that are considered part of the context definition.
In your example, sns.set_context('notebook') sets a number of matplotlib rc parameters (label sizes, line widths, and so on). These apply to every plot drawn afterwards, including the pandas histogram on the next line, which is drawn with matplotlib.
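To see the effect concretely, here is a small sketch that switches between the four named contexts and reads back one of the affected matplotlib settings. It assumes seaborn and matplotlib are installed; the Agg backend and the sizes dictionary are my own additions for illustration.

```python
import matplotlib as mpl
mpl.use("Agg")  # assumption: headless backend so the sketch runs without a display
import seaborn as sns

sizes = {}
for name in ["paper", "notebook", "talk", "poster"]:
    sns.set_context(name)  # updates matplotlib's global rcParams
    sizes[name] = mpl.rcParams["axes.labelsize"]

sns.set_context("notebook")  # leave the base context active again
print(sizes)
```

The exact numbers depend on the seaborn version, but the label size should grow from "paper" to "poster".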

Where can I find abbreviations of kwargs in the matplotlib documentation?

The documentation of various matplotlib methods lists all keyword arguments for each of those methods.
I appreciate that, but I cannot find any hint about the abbreviations available for those keyword arguments.
For example it is possible to write the following code:
fig, ax = plt.subplots() # creates figure with axes
line, = ax.plot([1,2], [1,2], color='blue', lw=2)
When I look up the documentation of matplotlib.axes.Axes.plot I am able to find the keyword argument "linewidth". But there is no hint that this argument can be abbreviated with "lw".
Assuming I didn't know that "lw" stands for "linewidth" how am I supposed to find out?
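One possible way to look these up programmatically, offered as a sketch rather than an official recipe: matplotlib's ArtistInspector can list the aliases registered for an artist's properties. The example reuses the line from the question and assumes a headless Agg backend so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, only for running the sketch
import matplotlib.pyplot as plt
from matplotlib.artist import ArtistInspector

fig, ax = plt.subplots()  # creates figure with axes
line, = ax.plot([1, 2], [1, 2], color="blue", lw=2)

# Map each full property name to its registered short aliases.
aliases = ArtistInspector(line).get_aliases()

print(sorted(aliases["linewidth"]))
print(line.get_linewidth())
plt.close(fig)
```

Calling plt.setp(line) with no further arguments prints a similar overview, listing each settable property together with its alias.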

I can't understand what "ax=ax" means in matplotlib

import matplotlib.pyplot as plt
import pandas as pd
from datetime import datetime
fig=plt.figure()
ax=fig.add_subplot(1,1,1)
data=pd.read_csv(r"C:\Users\champion\Desktop\ch02\spx.csv")
spx=data["SPX"]
spx.plot(ax=ax,style="k-")
I can't understand what "ax=ax" means in matplotlib.
From the documentation of plot():
DataFrame.plot(x=None, y=None, kind='line', ax=None, subplots=False,
sharex=None, sharey=False, layout=None, figsize=None, use_index=True,
title=None, grid=None, legend=True, style=None, logx=False,
logy=False, loglog=False, xticks=None, yticks=None, xlim=None,
ylim=None, rot=None, fontsize=None, colormap=None, table=False,
yerr=None, xerr=None, secondary_y=False, sort_columns=False, **kwds)
Parameters: ax : matplotlib axes object, default None
You can see that ax is a keyword argument here. It just happens that you also named your variable ax, and you are passing it as the value of that keyword argument to plot().
ax is the keyword for the part of the overall figure in which a chart/plot is drawn. So, when you type "spx.plot(ax=", you are specifying which part of the figure to draw on. The reason you are saying "ax=ax" is, as Nahal rightly pointed out, because you defined a variable named "ax" on the third line of code and you are using that to say what the ax keyword should be.
Here's an article with some helpful visuals.
https://towardsdatascience.com/what-are-the-plt-and-ax-in-matplotlib-exactly-d2cf4bf164a9
The previous explanations are essentially correct, but strictly speaking ax here is an argument of the Series.plot() method.
When you read the file with:
data=pd.read_csv(r"C:\Users\champion\Desktop\ch02\spx.csv")
what you get is a DataFrame.
And then:
spx=data["SPX"]
The code above gives you a Series back. So we are dealing with the Series.plot() method, although DataFrame.plot() takes the same argument.
First of all, it is important to understand that this plot() method is related to, but not the same as, the plot() function from matplotlib.
When you create a figure with:
fig=plt.figure()
This creates something like a blank sheet, which has no axes to draw on yet.
The next line creates a subplot, where you can finally plot something.
ax=fig.add_subplot(1,1,1)
And now we are getting to the question.
spx.plot(ax=ax,style="k-")
This line calls the plot method of a Series, which has an optional argument called 'ax'.
According to its description, this argument is the matplotlib axes object on which the plot should be drawn. If nothing is specified, the method uses the currently active matplotlib axes.
Long story short, in your example there is only one active subplot, the 'ax' created before, so you could run your code without 'ax=ax' and get the same result.
But the argument makes sense when you have more than one subplot, because it lets you specify which one the spx Series should be plotted on.
We could have created a second and a third subplot, like this:
ax1 = fig.add_subplot(2, 2, 2)
ax2 = fig.add_subplot(2, 2, 3)
In this case, to plot the Series on 'ax1', I would pass it as the argument:
spx.plot(ax=ax1,style="k-")
And now it is plotting on the exact box I wanted in the figure.
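Here is a self-contained sketch of that last step. Since the CSV file from the question is not available, it uses a small made-up Series as a stand-in for the SPX column, and a headless Agg backend so it runs without a display; both are assumptions for illustration.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, only for running the sketch
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in for the SPX column from the question's CSV file.
spx = pd.Series([100.0, 101.5, 99.8, 102.3], name="SPX")

fig = plt.figure()
ax1 = fig.add_subplot(2, 2, 2)
ax2 = fig.add_subplot(2, 2, 3)

# ax2 was created last, so it is the current axes; ax=ax1 overrides that.
spx.plot(ax=ax1, style="k-")

lines_on_ax1 = len(ax1.lines)
lines_on_ax2 = len(ax2.lines)
print(lines_on_ax1, lines_on_ax2)
plt.close(fig)
```

Counting the lines on each axes is just a quick way to confirm where the Series ended up without looking at the rendered figure.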

How does the plotting function of pandas and matplotlib sync?

I'm new. Look at the snippet of the code that results in a graph.
df = pd.read_csv("data/GOOG.csv")
df['High'].plot()
plt.show()
My question is: how does plt.show() (matplotlib.pyplot) get the values of x and y when plt is not called with any parameters? The plot function belongs to the DataFrame object. Does it store the values in some default place from which plt can get them?
plt.show() does not 'plot' the values, it will show the current already created figure(s).
It is df['High'].plot() that creates this figure under the hood. The pandas plotting functions are implemented by calling out to matplotlib. By default, a new figure is created, unless you use the ax keyword argument to specify a subplot on which to add the plot.
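A short sketch that makes this visible, using a made-up Series in place of df['High'] (the GOOG.csv file is not available here) and a headless Agg backend, both assumptions for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend, only for running the sketch
import matplotlib.pyplot as plt
import pandas as pd

plt.close("all")  # start with no open figures

# Hypothetical stand-in for df['High'] from the question's CSV file.
high = pd.Series([10.0, 12.5, 11.0, 13.2], name="High")

ax = high.plot()  # pandas creates the figure and axes through matplotlib

# The figure pandas drew on is the one pyplot considers current,
# and the data already lives in the line object on its axes.
is_current_figure = ax.figure is plt.gcf()
plotted_y = list(ax.lines[0].get_ydata())

print(is_current_figure, plotted_y)
plt.close("all")
```

So by the time plt.show() is called, the data is already stored in the figure that pyplot is tracking; show() only has to display it.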
