Data visualization in python (matplotlib) [duplicate] - python

I'm not really new to matplotlib and I'm deeply ashamed to admit I have always used it as a tool for getting a solution as quickly and easily as possible. So I know how to get basic plots, subplots and so on, and I have quite a bit of code that gets reused from time to time... but I have no "deep(er) knowledge" of matplotlib.
Recently I thought I should change this and work my way through some tutorials. However, I am still confused about matplotlib's plt, fig(ure) and ax(arr). What really is the difference?
In most cases, for some "quick'n'dirty" plotting, I see people using just pyplot as plt and plotting directly with plt.plot. Since I often have multiple things to plot, I frequently use f, axarr = plt.subplots()... but most of the time you only see code putting data into the axarr and ignoring the figure f.
So, my question is: what is a clean way to work with matplotlib? When should plt alone be used, and what is, or what should, a figure be used for? Should subplots just contain data? Or is it valid and good practice to do everything, like styling, clearing a plot, ..., inside of subplots?
I hope this is not too wide-ranging. Basically I am asking for some advice on the true purposes of plt <-> fig <-> ax(arr) (and when/how to use them properly).
Tutorials would also be welcome. The matplotlib documentation is rather confusing to me. When one searches for something really specific, like rescaling a legend, different plot markers and colors and so on, the official documentation is really precise, but the more general information is not that good in my opinion. Too many different examples, no real explanations of the purposes... it looks more or less like a big listing of all possible API methods and arguments.

pyplot is the 'scripting'-level API in matplotlib (its highest-level API). It lets you use matplotlib through a procedural interface, much as you would in MATLAB. pyplot has a notion of 'current figure' and 'current axes' that all its functions delegate to (as @tacaswell put it). So, when you use the functions available in the pyplot module, you are plotting to the 'current figure' and 'current axes'.
If you want fine-grained control of where and what you are plotting, you should use the object-oriented API, working with instances of Figure and Axes.
Most plotting functions available in pyplot have an equivalent method on Axes, although the names sometimes differ (for example, plt.title corresponds to Axes.set_title).
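To make the 'current axes' idea concrete, here is a minimal sketch (the variable names and data are made up for illustration, and the Agg backend is chosen only so it runs without a display). It draws on one Axes through pyplot and on another through the Axes methods directly:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed here so no window is needed
import matplotlib.pyplot as plt

fig, (ax_left, ax_right) = plt.subplots(ncols=2)

# pyplot route: choose the "current axes", then call module-level functions.
plt.sca(ax_left)                  # make ax_left the current axes
plt.plot([0, 1, 2], [0, 1, 4])    # draws on whatever the current axes is
plt.title("via pyplot")           # delegates to the current axes' set_title

# Object-oriented route: the target Axes is explicit in every call.
ax_right.plot([0, 1, 2], [4, 1, 0])
ax_right.set_title("via Axes")
```

Both routes end up calling the same Axes methods. The pyplot route just looks up the target for you, which is convenient for one-off plots and easy to get wrong once several Axes are involved.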
From the repo anatomy of matplotlib:
The Figure is the top-level container in this hierarchy. It is the overall window/page that everything is drawn on. You can have multiple independent figures and Figures can contain multiple Axes.
But...
Most plotting occurs on an Axes. The axes is effectively the area that we plot data on and any ticks/labels/etc associated with it. Usually we'll set up an Axes with a call to subplot (which places Axes on a regular grid), so in most cases, Axes and Subplot are synonymous.
Each Axes has an XAxis and a YAxis. These contain the ticks, tick locations, labels, etc.
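The hierarchy described above can be inspected directly. The following is only a small sketch (the label text and variable names are illustrative, and the Agg backend is an assumption so that it runs headless):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the sketch runs headless
import matplotlib.pyplot as plt
from matplotlib.axis import XAxis, YAxis

# One Figure containing two Axes placed on a regular 1x2 grid.
fig, axs = plt.subplots(nrows=1, ncols=2)
left_ax, right_ax = axs

# The Figure keeps track of the Axes it contains.
contained_axes = fig.axes

# Each Axes owns an XAxis and a YAxis, which hold ticks and labels.
left_ax.set_xlabel("time")
x_label_text = left_ax.xaxis.get_label().get_text()
```

Here set_xlabel is a convenience on the Axes, while the label itself lives on the XAxis object underneath.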
If you want to know the anatomy of a plot you can visit this link.

I think that this tutorial explains well the basic notions of the object hierarchy of matplotlib like Figure and Axes, as well as the notion of current figure and current Axes.
If you want a quick answer: the Figure object is the container that wraps one or more Axes (which are different from an axis), and each Axes in turn contains smaller objects like legends, lines, tick marks, ... as shown in this image taken from the matplotlib documentation
So when we do
>>> import matplotlib.pyplot as plt
>>> fig, ax = plt.subplots()
>>> type(fig)
<class 'matplotlib.figure.Figure'>
>>> type(ax)
<class 'matplotlib.axes._subplots.AxesSubplot'>
We have created a Figure object and an Axes object that is contained in that figure.

pyplot is a MATLAB-like API for those who are familiar with MATLAB and want to make quick and dirty plots.
The Figure/Axes interface is the object-oriented API for those who don't care about MATLAB-style plotting.
So you can use either one, but it is perhaps best not to mix both together.

Related

Python3 - Plotting the same matplotlib axes object on multiple figures?

I have a script which I'm adapting to include a GUI. In it, I create a plot with subplots (the arrangement of which depends on the number of plots - e.g. 4 plots go into a square rather than 4-across). That plot (with a subplot for each of the "targets" analyzed) gets saved to a .png.
In building the GUI, I'm writing up the 'results' frame and would like to show these individual subplots on their own tabs. I've written the code to lay out the frame how I want it, but in order to separate the subplots into their own plots, I need to draw the completed Axes object (e.g. the entire subplot for that target) onto a new figure in the frame.
Since the number of subplots isn't known before runtime, I already have my Axes objects/subplots in an array (/list?) axs, whose members are the individual Axes objects (each containing data points created with ax.scatter() and several lines and annotations created with ax.plot() and ax.annotate).
When I initially create the axes, I do so with
fig, axs = plt.subplots(num='Title', nrows=numrow, ncols=numcol,
                        figsize=[numcol*5, numrow*5],
                        subplot_kw={'adjustable':'box', 'aspect':1})
Is there a way to now take these axes and draw them onto a new figure (the one that will be contained in the 'results' frame of the GUI)? In my searches, I only came up with ways to plot multiple axes onto a single figure (i.e. how to use subplots()) but nothing came up on how I'd throw a pre-existing Axes object into a new figure that it wasn't originally associated with. I'd rather not re-draw the axes from scratch -- there's quite a bit of decoration and multiple datasets / lines plotted onto them already.
Any ideas? Happy to post code as requested, but since this is more of a "How do I do this" than a "why doesn't my code work", I didn't post much of it.
Thank you!
I believe that's not possible and you will have to recreate the Axes objects inside the other figure. Which is just a matter of code reorganization. Note that your approach would not noticeably improve rendering performance. Matplotlib would have to re-render the Axes objects anyway, and that's the computationally expensive part. Creating the objects is relatively cheap.
What you're trying to do is pretty much this:
from matplotlib import pyplot
pyplot.ion()
figure1 = pyplot.figure()
axes = figure1.add_subplot()
axes.plot([0, 1], [0, 1])
figure2 = pyplot.figure()
figure2.add_axes(axes)
Which raises:
ValueError: The Axes must have been created in the present figure
And the documentation of add_axes() notes:
In rare circumstances, add_axes may be called with a single argument, an Axes instance already created in the present figure but not in the figure's list of Axes.
So that's a pretty clear indication that this is not a supported use case.
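One way to do the "code reorganization" mentioned above is to move all drawing for one target into a function that accepts the Axes to draw on, and call it once per figure. This is only a sketch: draw_target, the targets dictionary and its data are hypothetical stand-ins for the real script.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the sketch runs headless
import matplotlib.pyplot as plt


def draw_target(ax, xs, ys, label):
    """Hypothetical helper: draw one target's data and decoration on a given Axes."""
    ax.scatter(xs, ys)
    ax.plot(xs, ys, linestyle="--")
    ax.annotate(label, xy=(xs[0], ys[0]))
    ax.set_aspect(1, adjustable="box")


# Made-up example data: target name -> (x values, y values).
targets = {"A": ([0, 1, 2], [0, 1, 4]), "B": ([0, 1, 2], [2, 1, 0])}

# Overview figure with one subplot per target (this is what gets saved to .png).
overview_fig, overview_axs = plt.subplots(nrows=1, ncols=len(targets), squeeze=False)
for ax, (name, (xs, ys)) in zip(overview_axs.flat, targets.items()):
    draw_target(ax, xs, ys, name)

# One separate figure per target (e.g. one per GUI tab), drawn by the same function.
tab_figs = {}
for name, (xs, ys) in targets.items():
    tab_fig, tab_ax = plt.subplots()
    draw_target(tab_ax, xs, ys, name)
    tab_figs[name] = tab_fig
```

The decoration is written once and replayed on whichever Axes is passed in, so nothing has to be moved between figures.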

What are the two styles of matplotlib?

I'm reading the documentation for matplotlib. Under the 'Coding Styles' section, it says:
When viewing this documentation and examples, you will find different coding styles and usage patterns.
Later...
Of the different styles, there are two that are officially supported. Therefore, these are the preferred ways to use matplotlib.
For the pyplot style...
But then in the rest of that section they never explicitly explain or mention the 'second' supported coding style. They say something about a 'MATLAB-style' but it is unclear from the context if that is referring to the pyplot style (as if it is like MATLAB) or if it is a separate style itself.
Question
What is the second supported matplotlib coding style and how does it relate / differ from the pyplot style?
Arguably this part of the usage guide is a bit hard to understand in its current form. There was however an update recently (#14223), which might make it clearer. A preview version of this can be found here:
https://matplotlib.org/devdocs/tutorials/introductory/usage.html#the-object-oriented-interface-and-the-pyplot-interface
As noted above, there are essentially two ways to use Matplotlib:
Explicitly create figures and axes, and call methods on them (the "object-oriented (OO) style").
Rely on pyplot to automatically create and manage the figures and axes, and use pyplot functions for plotting.
The next level down in the hierarchy is the first level of the object-oriented
interface, in which pyplot is used only for a few functions such as figure
creation, and the user explicitly creates and keeps track of the figure and axes
objects. At this level, the user uses pyplot to create figures, and through those
figures, one or more axes objects can be created. These axes objects are then used
for most plotting actions.
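As a rough side-by-side of the two styles described above (the data and labels are made up, and the Agg backend is an assumption so no window is needed), the same plot can be written both ways:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the sketch runs headless
import matplotlib.pyplot as plt

x = [0, 1, 2, 3]
y = [v ** 2 for v in x]

# Object-oriented style: create the figure and axes, then call methods on them.
fig_oo, ax_oo = plt.subplots()
ax_oo.plot(x, y, label="quadratic")
ax_oo.set_xlabel("x")
ax_oo.set_ylabel("y")
ax_oo.legend()

# pyplot style: pyplot creates and tracks the figure and axes implicitly.
fig_py = plt.figure()
plt.plot(x, y, label="quadratic")
plt.xlabel("x")
plt.ylabel("y")
plt.legend()
ax_py = plt.gca()  # the axes pyplot created behind the scenes
```

In the object-oriented version pyplot is used only to create the figure and axes; everything after that goes through the objects.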

Create a figure of figures with matplotlib

I would like to know if there is a way to combine several figures created with matplotlib into one single figure.
Most of the existing topics are related to multiple plots within one figure. But here, I have several functions which each create one elaborate figure (not just a plot; the figure itself is a multiple plot with texts, title, legends, ...).
So instead of just doing the layout of those several figures using software like Word, is there a way to directly combine all my figures into one single figure in Python?
Thank you in advance !
The concept of a figure in matplotlib does not allow for a figure inside a figure. The figure is the canvas for other artists, like axes. You may of course add as many axes to a figure as you like. So for example instead of one figure with 4 axes and another figure with 6 axes, you can create a figure with 10 axes.
A good choice may be to use gridspec, as detailed on the respective matplotlib page.
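As a sketch of the "4 axes plus 6 axes in one figure" idea, a nested gridspec can hold both groups side by side. The 2x2 and 2x3 arrangement, the figure size and the width ratios below are assumptions chosen purely for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the sketch runs headless
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(12, 5))

# Outer grid: one cell per former "figure".
outer = fig.add_gridspec(nrows=1, ncols=2, width_ratios=[2, 3])

# Inner grids: a 2x2 block (4 axes) and a 2x3 block (6 axes).
left = outer[0, 0].subgridspec(2, 2)
right = outer[0, 1].subgridspec(2, 3)

left_axes = [fig.add_subplot(left[r, c]) for r in range(2) for c in range(2)]
right_axes = [fig.add_subplot(right[r, c]) for r in range(2) for c in range(3)]
```

Each function that used to build its own figure would then need to accept the relevant list of axes instead of creating them itself.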
After additional research, it seems my problem has no easy solution within Matplotlib itself. Laying out multiple figures requires external post-processing of the plots.
For those having the same problem, here is an interesting link :
Publication-quality figures with matplotlib and svgutils

matplot and seaborn figure parameters/customizations

I'm so confused between the two. Every time I make a chart with either pyplot or seaborn, I have to guess what syntax to use. For example, seaborn doesn't have a title setter, so I have to remember to use plt.title. Or, for seaborn charts, plt.xlabel doesn't work, so I have to use sns.axlabel(x, y).
Also, I randomly run into the following problem. I'm simply trying to make my seaborn jointplot bigger, but I have had no success with either the plt or the seaborn methods (any tips on good documentation showing all the chart parameters? I find them scattered around the web, and it seems like each solution on Stack Overflow is unique... which adds to the overall confusion).
Here's my code:
a = plt.figure(figsize=(30,30))
a.set_size_inches(30,30)
sns.jointplot(x='COAST',y='NORTH',data = data_df, kind = 'kde')
Notice I used both the plt figsize method and set_size_inches. Both gave me a small chart.
So frustrated with the random overlaps of the two libraries. Any pro tips to lessen the confusion will be greatly appreciated!
edit: This is also true for seaborn's pairplot. I have had no success changing the pairplot's size.
sns.jointplot creates its own figure instance (as @tcaswell suspected). It doesn't appear that you can tell jointplot to use an existing figure. I think you have two options:
You can give sns.jointplot the size option (renamed to height in seaborn 0.9 and later), e.g.:
sns.jointplot(x='COAST', y='NORTH', data=data_df, kind='kde', size=30)
You can alter the JointGrid figure size after creating it, using:
g=sns.jointplot(x='COAST', y='NORTH', data=data_df, kind='kde')
g.fig.set_size_inches(30,30)
I presume option 1 is the better option, as it is a built-in seaborn option.

Embed matplotlib figure in larger figure

I am writing a bunch of scripts and functions for processing astronomical data. I have a set of galaxies, for which I want to plot some different properties in a 3-panel plot. I have an example of the layout here:
Now, this is not a problem. But sometimes, I want to create this plot just for a single galaxy. In other cases, I want to make a larger plot consisting of subplots that are each made up of the three-pane structure, like this mockup:
For the sake of modularity and reusability of my code, I would like to do something to the effect of just letting my function return a matplotlib.figure.Figure object and then let the caller - function or interactive session - decide whether to show() or savefig the object or embed it in a larger figure. But I cannot seem to find any hints of this in the documentation or elsewhere, it doesn't seem to be something people do that often.
Any suggestions as to what would be the best road to take? I have speculated whether using axes_grid would be the solution, but it doesn't seem quite clean and caller-agnostic to me. Any suggestions?
The best solution is to separate the figure layout logic from the plotting logic. Write your plotting code something like this:
def three_panel_plot(data, plotting_args, ax1, ax2, ax3):
    # what you do to plot, drawing on ax1, ax2 and ax3
    ...
So now the code that turns data -> images takes as arguments the data and where it should plot that data to.
If you want to do just one, it's easy. If you want to do a 3x3 grid, you just need to generate the layout and then loop over the axes sets + data.
The way you are suggesting (returning an object out of your plotting routine) would be very hard in matplotlib because the internals are too connected.
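A rough sketch of what that separation could look like follows. The three plot calls inside the function, the made-up data and the nested-gridspec layout are all placeholders, not the actual galaxy plots:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, assumed so the sketch runs headless
import matplotlib.pyplot as plt


def three_panel_plot(data, plotting_args, ax1, ax2, ax3):
    """Draw one data set on three caller-supplied Axes (placeholder plots)."""
    ax1.plot(data, **plotting_args)
    ax2.hist(data)
    ax3.scatter(range(len(data)), data)


# Case 1: a single object, so the caller builds a simple 1x3 layout.
fig_single, axes_single = plt.subplots(ncols=3, figsize=(9, 3))
three_panel_plot([1, 3, 2, 5, 4], {"color": "k"}, *axes_single)

# Case 2: a 3x3 grid of objects, each cell holding its own 1x3 group of panels.
datasets = [[i, i + 2, i + 1, i + 3] for i in range(9)]  # made-up data
fig_grid = plt.figure(figsize=(18, 9))
outer = fig_grid.add_gridspec(3, 3)
for index, data in enumerate(datasets):
    row, col = divmod(index, 3)
    inner = outer[row, col].subgridspec(1, 3)
    panel_axes = [fig_grid.add_subplot(inner[0, k]) for k in range(3)]
    three_panel_plot(data, {"color": "k"}, *panel_axes)
```

The plotting function never creates a figure, so the caller stays free to show, save, or embed the result however it likes.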
