matplot and seaborn figure parameters/customizations - python

I'm so confused between the two. Every time I make a chart with either pyplot or seaborn, I have to guess which syntax to use. For example, seaborn doesn't have a title setter, so I have to remember to use plt.title. Or, for seaborn charts, plt.xlabel doesn't work, so I have to use sns.axlabel(x, y).
I also randomly run into the following problem. I'm simply trying to make my seaborn jointplot bigger, but I have had no success with either the plt or the seaborn methods (any tips on good documentation showing all the chart parameters? I find them scattered across the web, and it seems like each solution on Stack Overflow is unique, which adds to the overall confusion).
Here's my code:
a = plt.figure(figsize=(30,30))
a.set_size_inches(30,30)
sns.jointplot(x='COAST',y='NORTH',data = data_df, kind = 'kde')
Notice I used both plt.figure(figsize=...) and set_size_inches on the resulting figure. Both gave me a small chart.
So frustrated with the random overlaps of the two libraries. Any pro tips to lessen the confusion will be greatly appreciated!
edit: This is also true for seaborn's pairplot. I have no success in changing the pairplot's size.

sns.jointplot creates its own figure instance (as @tcaswell suspected). It doesn't appear that you can tell jointplot to use an existing figure. I think you have two options:
1. You can give sns.jointplot the height option (it was called size before seaborn 0.9), e.g.:
sns.jointplot(x='COAST', y='NORTH', data=data_df, kind='kde', height=30)
2. You can alter the JointGrid figure size after creating it, using:
g = sns.jointplot(x='COAST', y='NORTH', data=data_df, kind='kde')
g.fig.set_size_inches(30, 30)
I presume option 1 is the better option, as it is a built-in seaborn option.

Related

Data visualization in python (matplotlib) [duplicate]

I'm not really new to matplotlib, and I'm deeply ashamed to admit I have always used it as a tool for getting a solution as quickly and easily as possible. So I know how to get basic plots, subplots and so on, and I have quite a bit of code that gets reused from time to time... but I have no "deep(er) knowledge" of matplotlib.
Recently I thought I should change this and work through some tutorials. However, I am still confused about matplotlib's plt, fig(ure) and ax(arr). What is really the difference?
In most cases, for some quick-and-dirty plotting, I see people using just pyplot as plt and plotting directly with plt.plot. Since I often have several things to plot, I frequently use f, axarr = plt.subplots()... but most of the time you only see code putting data into axarr and ignoring the figure f.
So, my question is: what is a clean way to work with matplotlib? When should I use plt only, and what is a figure for (or what should it be used for)? Should subplots just contain data? Or is it valid and good practice to do everything, like styling, clearing a plot, ..., inside the subplots?
I hope this is not too wide-ranging. Basically I am asking for some advice on the true purposes of plt <-> fig <-> ax(arr) (and when/how to use them properly).
Tutorials would also be welcome. The matplotlib documentation is rather confusing to me. When one searches for something really specific, like rescaling a legend, different plot markers and colors and so on, the official documentation is really precise, but the more general information is not that good in my opinion. Too many different examples, no real explanations of the purposes... it looks more or less like a big listing of all possible API methods and arguments.
pyplot is the 'scripting' level API in matplotlib (its highest-level API for doing a lot with matplotlib). It lets you use matplotlib through a procedural interface, much as you would in MATLAB. pyplot has a notion of 'current figure' and 'current axes' that all the functions delegate to (in @tacaswell's words). So, when you use the functions available in the pyplot module, you are plotting to the 'current figure' and 'current axes'.
If you want fine-grained control of where and what you are plotting, then you should use the object-oriented API, working with instances of Figure and Axes.
Most plotting functions in pyplot have an equivalent method on the Axes (for example, plt.plot and ax.plot, or plt.title and ax.set_title).
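To make the correspondence concrete, here is a small sketch that draws the same plot both ways. The data and label strings are arbitrary. Note that the Axes setters are usually named set_<something> (set_title, set_xlabel, ...), while pyplot uses the bare name.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]
y = [0, 1, 4, 9]

# pyplot style: every call acts on the implicit "current" figure and axes
plt.figure()
plt.plot(x, y)
plt.title("pyplot style")
plt.xlabel("x")
plt.ylabel("y")
pyplot_ax = plt.gca()  # the axes that the calls above were drawing on

# Object-oriented style: the same plot, addressed through an explicit Axes
fig, ax = plt.subplots()
ax.plot(x, y)
ax.set_title("Axes style")
ax.set_xlabel("x")
ax.set_ylabel("y")
```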
From the repo anatomy of matplotlib:
The Figure is the top-level container in this hierarchy. It is the overall window/page that everything is drawn on. You can have multiple independent figures and Figures can contain multiple Axes.
But...
Most plotting occurs on an Axes. The axes is effectively the area that we plot data on and any ticks/labels/etc associated with it. Usually we'll set up an Axes with a call to subplot (which places Axes on a regular grid), so in most cases, Axes and Subplot are synonymous.
Each Axes has an XAxis and a YAxis. These contain the ticks, tick locations, labels, etc.
If you want to know the anatomy of a plot you can visit this link.
I think that this tutorial explains well the basic notions of the object hierarchy of matplotlib like Figure and Axes, as well as the notion of current figure and current Axes.
If you want a quick answer: there is the Figure object, which is the container that wraps one or more Axes (not to be confused with axis). Each Axes in turn contains smaller objects like legends, lines, tick marks ... as shown in this image taken from the matplotlib documentation
So when we do
>>> import matplotlib.pyplot as plt
>>> fig, ax = plt.subplots()
>>> type(fig)
<class 'matplotlib.figure.Figure'>
>>> type(ax)
<class 'matplotlib.axes._subplots.AxesSubplot'>
We have created a Figure object and an Axes object that is contained in that figure.
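pyplot is a MATLAB-like API for those who are familiar with MATLAB and want to make quick-and-dirty plots.
Figure (with its Axes) is the object-oriented API, for those who don't care about MATLAB-style plotting.
You can use either one. They can be mixed, since pyplot calls act on the same Figure and Axes objects, but sticking to one style within a piece of code is usually less confusing.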
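A short sketch of how the two interfaces relate: after plt.subplots(), the new figure and axes are also pyplot's "current" ones, so a pyplot call and an Axes method end up modifying the same objects. The label strings are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# pyplot's "current figure" and "current axes" are the objects we just got back
same_figure = plt.gcf() is fig
same_axes = plt.gca() is ax

# A pyplot call and an Axes method therefore modify the same Axes
plt.title("set through pyplot")
ax.set_xlabel("set through the Axes")
title_seen_by_ax = ax.get_title()
```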

python multiple stacked plots along y axis

I have binned data: one n-length x-axis vector and three n-length y-axis vectors, for three different histograms on the same x-axis.
Now I want this kind of stacked bar plot, or anything similar, as below.
The nearest I have found is Qtiplot (which is not Python). It can generate exactly this kind of histogram plot. But it computes the histogram by itself and requires the actual data samples, which are not available in my case (I only have the histogram itself).
Please note that I don't know Python very well. So I don't have a clue where to start, nor am I really in the mood to learn programming in Python. I need this only to make a nice vector-graphics plot for my research thesis.
I have tagged python as I think python is the most obvious language. In case someone knows any better solution other than in python (but not Matlab, I cannot install that huge pile), I will thankfully add the proper tag.
Thanks in advance for any help.
Use the matplotlib package in Python. To stack three histograms vertically, one subplot each:
import matplotlib.pyplot as plt
apple_weight=[3,3,3,10,10,1,1,1,4,4,4,4,7,7,7]
banana_weight=[3,3,3,10,10,1,1,1,4,4,4,4,7,7,7]
mango_weight=[3,3,3,10,10,1,1,1,4,4,4,4,7,7,7]
fig=plt.figure()
ax1=fig.add_subplot(311)
ax2=fig.add_subplot(312)
ax3=fig.add_subplot(313)
ax1.hist(apple_weight)
ax2.hist(banana_weight)
ax3.hist(mango_weight)
plt.show()
Alternatively, to draw them on a single subplot with a second y-axis:
import matplotlib.pyplot as plt
apple_weight=[3,3,3,10,10,1,1,1,4,4,4,4,7,7,7]
banana_weight=[3,3,3,10,10,1,1,1,4,4,4,4,7,7,7]
mango_weight=[3,3,3,10,10,1,1,1,4,4,4,4,7,7,7]
fig=plt.figure()
ax1=fig.add_subplot(111)
ax2=ax1.twinx()
# there are only two y axes, so the third list has to go on one of them
ax1.hist(apple_weight)
ax2.hist(banana_weight)
ax1.hist(mango_weight)
plt.show()
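Since the question says only the binned counts are available, not the raw samples, a variant using ax.bar instead of ax.hist may be closer to what is needed. This is only a sketch: the bin centers, the counts and the labels are invented, and it assumes evenly spaced bins.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import matplotlib.pyplot as plt

# Invented pre-binned data: one x vector (bin centers) and three count vectors
x = np.array([1.0, 3.0, 5.0, 7.0, 9.0])
counts = {
    "apple": np.array([3, 7, 4, 2, 1]),
    "banana": np.array([1, 2, 6, 5, 2]),
    "mango": np.array([2, 2, 3, 6, 4]),
}
bin_width = x[1] - x[0]  # assumes evenly spaced bins

# Three panels stacked along y, all sharing the same x-axis
fig, axes = plt.subplots(3, 1, sharex=True, figsize=(6, 6))
for ax, (label, y) in zip(axes, counts.items()):
    ax.bar(x, y, width=bin_width, edgecolor="black")  # bars centered on x
    ax.set_ylabel(label)
axes[-1].set_xlabel("x")
fig.subplots_adjust(hspace=0)  # remove the gap between panels

# fig.savefig("stacked_histograms.pdf") would write a vector-graphics file
```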

Determine kind of Matplotlib Axes subplot

Given a matplotlib.axes._subplots.AxesSubplot object, how do I tell what type of plot it contains? Is there a matplotlib feature that will determine this for me? For example...
I commonly plot data with pandas
import pandas as pd
df = pd.DataFrame({'y':range(10)})
line_ax = df.plot()
or
bar_ax = df.plot(kind='bar')
or
barh_ax = df.plot(kind='barh')
The matplotlib axes does not care about which plot it contains and it does not even know about it.
The question would also be how to distinguish "kinds" of plots. What kind of plot is in an axes which contains 2 bars, several markers, 2 lines and 3 arrows?
The kind argument to pandas plot function is simply a flag by which pandas decides which plotting function to call. This is independent of the axes and you may of course also have a plot produced by kind='bar' and kind='scatter' in the same axes.
So the answer is: no, there is no general way to determine the kind of plot in an axes, mainly because there is no such thing as a "kind of plot".
Of course, depending on what you'd need this type of information for, there are probably alternative ways to accomplish what you need.
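One such alternative, sketched here as a heuristic rather than a real "kind" detector, is to look at which artists the axes holds. The describe helper below is hypothetical (it is not part of matplotlib), and the sketch uses plain matplotlib calls instead of pandas.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.container import BarContainer

fig, (line_ax, bar_ax) = plt.subplots(1, 2)
line_ax.plot(range(10))
bar_ax.bar([0, 1, 2], [3, 1, 2])

def describe(ax):
    """Count some artists an axes holds (hypothetical helper, a rough heuristic)."""
    return {
        "lines": len(ax.lines),
        "patches": len(ax.patches),
        "bar_containers": sum(isinstance(c, BarContainer) for c in ax.containers),
    }

line_info = describe(line_ax)
bar_info = describe(bar_ax)
```

An axes that mixes bars, lines and markers would report several non-zero counts, which is exactly why no single "kind" can be reported.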

Pyplot/Subplot APIs Matplotlib

I'm making something using Matplotlib where I have multiple subplots on a figure. It seems to me like the subplot API is limited compared to the PyPlot API: for example, I can't seem to make custom axes labels in my subplot although it is possible using PyPlot.
My question is: Is there a richer subplot API besides the tiny one on the PyPlot page (http://matplotlib.org/api/pyplot_api.html), and/or is there a way to get the full functionality of a PyPlot on a subplot?
Basically, what is a subplot? I can't find it in the documentation. Even more generally, when should I use a figure vs an axis vs a subplot? They all seem to do essentially the same thing.
Consider the following code:
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(2,1,1)
Then ax is an axis? Can I use the pyplot API to customize ax?
Thanks for your help.
While I suggest using the Axes methods, there is also the plt.sca function (set current axes).
So
plt.sca(ax)
does what you want, I think.
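For illustration, here is a sketch of both routes on a two-row figure: Axes methods on one subplot, and plt.sca followed by ordinary pyplot calls on the other. The label strings and tick positions are arbitrary.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig = plt.figure()
ax_top = fig.add_subplot(2, 1, 1)
ax_bottom = fig.add_subplot(2, 1, 2)  # the most recently added axes is "current"

# Route 1: Axes methods, no notion of a current axes needed
ax_bottom.set_xlabel("set via Axes method")
ax_bottom.set_xticks([0, 1, 2])

# Route 2: make ax_top the current axes, then use plain pyplot calls
plt.sca(ax_top)
plt.xlabel("set via pyplot")
plt.xticks([0, 1, 2])
current_is_top = plt.gca() is ax_top
```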

How to change the amount of increments in pyplot axis

Hi probably quite a simple question but..
When plotting a graph using matplotlib.pyplot my Y axis goes from -0.04 to 0.03 which is fine but there are 8 labels for increments (eg 0.03,0.02,0.01 etc.). I need more maybe 16 or so.
Thanks for your help
Matplotlib has several different algorithms for choosing tick locations automatically, and e.g. LinearLocator or MaxNLocator may suit your purpose. See the major_minor demo for how to use Locators in general, and the ticker api documentation for the various Locators available. The documentation for the individual classes is somewhat sparse, but guessing based on the argument names tends to work fine.
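A minimal sketch of the two locators mentioned above, using the y-range from the question. The data is made up. LinearLocator places exactly numticks evenly spaced ticks, while MaxNLocator picks "nice" tick values with at most nbins intervals, so it may give fewer ticks than requested.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LinearLocator, MaxNLocator

y = np.linspace(-0.04, 0.03, 50)  # made-up data spanning the range in the question

fig, (ax_linear, ax_maxn) = plt.subplots(1, 2)
for ax in (ax_linear, ax_maxn):
    ax.plot(y)
    ax.set_ylim(-0.04, 0.03)

# Exactly 16 evenly spaced ticks between the current y limits
ax_linear.yaxis.set_major_locator(LinearLocator(numticks=16))

# Up to 16 intervals, placed at "nice" round values
ax_maxn.yaxis.set_major_locator(MaxNLocator(nbins=16))

n_linear_ticks = len(ax_linear.get_yticks())
```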
Use set_yticks() to change the tick locations. For example:
import numpy as np
import matplotlib.pyplot as plt
fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.plot(np.random.randn(8))
# place a tick every 0.25, from -1.5 up to (but not including) 1.5
ax.set_yticks(np.arange(-1.5, 1.5, 0.25))
plt.show()
plt.yticks() is another option.
