How does matplotlib know which figure to plot on


I am trying to understand how matplotlib works (my question is probably more general though).
To plot a curve, I can do either of the following:
fig,ax=plt.subplots()
plt.plot([1,2,3])
#%%
fig,ax=plt.subplots()
ax.plot([1,2,3])
In the first case, Python knows that the plot should be drawn on the set of axes "ax". In the second case I "tell it" that the plot is for the set of axes "ax", because I use the method of ax.
First question:
How does the first way work? How does Python know that the plot has to be drawn on ax? What is the mechanism behind this? I would expect to have to tell it what to plot on, but it "deduces" it.
Second question:
As a related question: is there any difference between the two ways of plotting, or are they totally equivalent?
Third question:
plt.plot has an "analog method" on the axes: ax.plot
plt.pcolor has an "analog method" on the axes: ax.pcolor
We can find many other examples of that.
Is this true for all plotting methods? Inside matplotlib, is it indeed exactly the same code behind plt.FUNCTION and ax.FUNCTION (using the same "FUNCTION" for both)?

Related

Limiting Number of ticks in Matplotlib

Apologies for the really long set of questions.
I am trying to plot a graph in matplotlib. I was faced with this issue of limiting the number of ticks on both of the axes. Looking into pyplot I could not find any solution.
The only solution I came across was by creating a subplot in the following manner.
ax = plt.subplot(111)
ax.xaxis.set_major_locator(plt.MaxNLocator(4))
Although the above works, I am left with a few unsolved questions most of them in relation to how the matplotlib library is structured.
Is there no feature whereby an object of pyplot.plot() can have the number of ticks limited? Do I always have to depend on subplotting?
When I create an object ax = plt.subplot(111), I find that it creates an instance as below
type(ax)
Out[228]: matplotlib.axes._subplots.AxesSubplot
Why does the documentation say that the subplot method returns a class ---> axes.SubplotBase?
Also, I see that we need to use the xaxis attribute of ax (or is it a method?), which helps set the properties related to the ticks.
type(ax.xaxis)
Out[233]: matplotlib.axis.XAxis
When ax is an object of some subclass of matplotlib.axes (not sure if it is SubplotBase or AxesSubplot), how come we can refer to ax.xaxis? The xaxis (or axis.XAxis) attribute is not mentioned in the documentation of matplotlib.axes.
I am pretty confused about the hierarchy and structure of matplotlib. It would be helpful if someone could point me to an article or blog which details the structure of these features.
Looking through the documentation, I could not figure out a suitable attribute of the subplot class which could help solve this problem related to the number of ticks. I am not sure how I am going to solve the next problem if I can't go through the documentation and figure it out.
Thanks,
Sree

Why does matplotlib have so many options for 'making' a plot?

So I have been using matplotlib for a while now, and one thing that confuses me is how it handles subplots.
It can be done as such:
fig = plt.figure()
ax = fig.add_subplot(1,2,1)
Or. It could be done as such:
plt.subplots(1,2)
Or, one could do this instead:
plt.subplot(211)
Or of course, if we only need one plot we can immediately run
plt.plot(x,y) # or .scatter or whatever.
Why? Is there any actual reason why you should use one over the others?

Matplotlib is a Python library that was significantly influenced by MATLAB and aimed in part at former and current MATLAB users. It therefore has two types of syntax:
MATLAB syntax, plt.subplot(211); plt.plot(); plt.colorbar();, which implies that every time you create a figure or subplot it is stored internally as the last active object, and all plotting and parameter changes are applied to it. This makes it comfortable for those who transitioned from MATLAB. The idea is that you create an element and immediately apply all actions to it, then create the next one and never return to the previous one until you call plt.show.
Classic programming syntax with explicit object declaration and operations on said objects. It is comfortable for everyone else and allows one to go back to previously created objects (figures and axes) and make additional changes.
The MATLAB way makes it hard to work with multiple figures (a figure is an independent picture; an axes is a region of that picture that you plot data in). pyplot functions always act on the figure and axes that are currently active, usually the last ones you created. Example:
import matplotlib.pyplot as plt
plt.figure() # new figure created and stored as the current figure
plt.subplot(211) # new axes created in the current figure and stored as the current axes
plt.plot([1, 2, 3]) # data plotted in the current axes
plt.subplot(212) # second axes added to the figure and made current
plt.plot([3, 2, 1]) # plotted in the current axes, which is now the second one
plt.figure() # a new figure is created; the old one still exists and can be reactivated by number, e.g. plt.figure(1),
# but until then every pyplot call, including plt.savefig, applies to the new figure.
But some people find it better for quick and dirty plotting.
You can transition between the two by using fig_1 = plt.gcf() and ax_1 = plt.gca(), which are "get current figure" and "get current axes" respectively. There are also multiple ways to change appearance, one for the MATLAB style
plt.yticks([]) # remove the y ticks from the current axes
and one for the object-oriented style
fig = plt.figure()
ax = fig.add_subplot(111)
yaxis = ax.yaxis
yaxis.set_ticks([]) # remove the y ticks from this particular axes
or something like that.
Multiple interfaces sure make learning it harder, but they make it easier to transition from other similar tools and make the library less rigid, meaning there are sometimes very simple ways to make frequently used but convoluted changes (see the ticks example).
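To make the switch between the two styles concrete, here is a minimal sketch. The variable names and the "Agg" backend line are my own choices (the backend is only there so the snippet runs without a display); the calls themselves are ordinary pyplot functions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption so this runs headless
import matplotlib.pyplot as plt

# pyplot style: every call acts on the "current" figure and axes
plt.figure()
plt.plot([1, 2, 3])

# Grab the implicit objects so we can continue in the object-oriented style
fig_1 = plt.gcf()
ax_1 = plt.gca()

# Creating a second figure changes what "current" means
fig_2, ax_2 = plt.subplots()
current_is_second = plt.gcf() is fig_2

# The saved references still reach the first figure directly ...
ax_1.set_title("first figure")

# ... and plt.figure(number) makes an existing figure current again
plt.figure(fig_1.number)
current_is_first = plt.gcf() is fig_1
```

Keeping references such as fig_1 and ax_1 around is usually less error-prone than re-activating figures by number, since the code then says explicitly which object it modifies.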

Data visualization in python (matplotlib) [duplicate]

I'm not really new to matplotlib, and I'm deeply ashamed to admit I have always used it as a tool for getting a solution as quickly and easily as possible. So I know how to get basic plots, subplots and stuff, and I have quite a bit of code which gets reused from time to time... but I have no "deep(er) knowledge" of matplotlib.
Recently I thought I should change this and work through some tutorials. However, I am still confused about matplotlib's plt, fig(ure) and ax(arr). What is really the difference?
In most cases, for some "quick'n'dirty" plotting, I see people using just pyplot as plt and plotting directly with plt.plot. Since I quite often have multiple things to plot, I frequently use f, axarr = plt.subplots()... but most of the time you only see code putting data into the axarr and ignoring the figure f.
So, my question is: what is a clean way to work with matplotlib? When should one use plt only, and what is, or what should, a figure be used for? Should subplots just contain data? Or is it valid and good practice to do everything, like styling, clearing a plot, ..., inside of subplots?
I hope this is not too wide-ranging. Basically I am asking for some advice on the true purposes of plt <-> fig <-> ax(arr) (and when/how to use them properly).
Tutorials would also be welcome. The matplotlib documentation is rather confusing to me. When one searches for something really specific, like rescaling a legend, different plot markers and colors and so on, the official documentation is really precise, but the more general information is not that good in my opinion. Too many different examples, no real explanations of the purposes... it looks more or less like a big listing of all possible API methods and arguments.

pyplot is the 'scripting' level API in matplotlib (its highest level API to do a lot with matplotlib). It allows you to use matplotlib through a procedural interface, in a similar way to how you would do it in MATLAB. pyplot has a notion of 'current figure' and 'current axes' that all the functions delegate to (as @tacaswell put it). So, when you use the functions available in the pyplot module, you are plotting to the 'current figure' and 'current axes'.
If you want 'fine-grained' control of where/what you are plotting, then you should use the object-oriented API, using instances of Figure and Axes.
Most plotting functions available in pyplot have an equivalent method on Axes.
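A small sketch of that delegation, under the assumption of a non-interactive backend so it runs anywhere. To my understanding, plt.plot in current releases is a thin wrapper that looks up the current Axes with gca() and calls its plot method, so both calls below end up on the same Axes.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption so this runs headless
import matplotlib.pyplot as plt

fig, ax = plt.subplots()

# The Axes that was just created is the "current" one
same_axes = plt.gca() is ax

# pyplot call: no Axes is named, so the call goes to the current Axes
lines_from_plt = plt.plot([1, 2, 3])

# Explicit call on the Axes object
lines_from_ax = ax.plot([3, 2, 1])

# Both lines ended up on the same Axes
n_lines = len(ax.get_lines())
```

The only difference between the two calls is how the target Axes is chosen: implicitly through pyplot's bookkeeping, or explicitly through the object you hold.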
From the repo anatomy of matplotlib:
The Figure is the top-level container in this hierarchy. It is the overall window/page that everything is drawn on. You can have multiple independent figures and Figures can contain multiple Axes.
But...
Most plotting occurs on an Axes. The axes is effectively the area that we plot data on and any ticks/labels/etc associated with it. Usually we'll set up an Axes with a call to subplot (which places Axes on a regular grid), so in most cases, Axes and Subplot are synonymous.
Each Axes has an XAxis and a YAxis. These contain the ticks, tick locations, labels, etc.
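As an illustration of the XAxis and YAxis objects, here is a hedged sketch that limits the number of ticks. The data and variable names are made up for the example; MaxNLocator and locator_params are regular matplotlib APIs.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption so this runs headless
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator

fig, ax = plt.subplots()
ax.plot(range(100))

# Each Axes owns one XAxis and one YAxis object
xaxis = ax.xaxis
yaxis = ax.yaxis

# Ask for at most 4 intervals between major ticks on x
xaxis.set_major_locator(MaxNLocator(4))

# Similar request for y through the Axes convenience method,
# which adjusts the parameters of the existing default locator
ax.locator_params(axis="y", nbins=4)
```

With pure pyplot code the same objects are reachable through plt.gca().xaxis, so creating the Axes via subplot is not strictly required.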
If you want to know the anatomy of a plot you can visit this link.

I think that this tutorial explains well the basic notions of the object hierarchy of matplotlib like Figure and Axes, as well as the notion of current figure and current Axes.
If you want a quick answer: there is the Figure object, which is the container that wraps multiple Axes (which is different from an axis); each Axes in turn contains smaller objects like legends, lines, tick marks ... as shown in this image taken from the matplotlib documentation
So when we do
>>> import matplotlib.pyplot as plt
>>> fig, ax = plt.subplots()
>>> type(fig)
<class 'matplotlib.figure.Figure'>
>>> type(ax)
<class 'matplotlib.axes._subplots.AxesSubplot'>
We have created a Figure object and an Axes object that is contained in that figure.
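The containment can be checked directly. This is only a sketch with names of my own choosing; it walks from the Figure down to a line and back up again.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption so this runs headless
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
(line,) = ax.plot([1, 2, 3])

# Top down: the Figure lists its Axes, the Axes lists its lines
axes_in_figure = fig.axes
lines_in_axes = ax.get_lines()

# Bottom up: every artist knows which Axes and Figure it belongs to
parent_axes = line.axes
parent_figure = ax.figure
xaxis_owner = ax.xaxis.axes
```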

pyplot is a MATLAB-like API for those who are familiar with MATLAB and want to make quick and dirty plots.
figure is the object-oriented API for those who don't care about MATLAB-style plotting.
So you can use either one, but it is perhaps better not to mix both.

Creating two completely independent plots in matplotlib and going back and forth between them

I'd like to create two independent matplotlib plots within a python script, and potentially jump back and forth between them as I add lines, annotations, etc. to the various plots (for example, perhaps I call a function which adds lines to both plots, and then another function which adds the annotations).
I expect that by working off matplotlib examples I'd be able to figure out some solution that works, but I'd like to know what the preferred and cleanest way of doing this is. I tend to get confused about when I should be doing things like
fig,ax=plt.subplots()
and when I should be doing things like:
fig=plt.figure()
Furthermore, how should I be switching back and forth between plots. If I did something like
fig1,ax1=plt.subplots()
fig2,ax2=plt.subplots()
can I then just refer to these plots by doing something like:
ax1.plt.plot([some stuff])
ax2.plt.plot([otherstuff])
? I ask this because often in the matplotlib examples they don't refer to the plot like this after calling plt.subplot() but instead call commands like
plt.plot([stuff])
where presumably it doesn't matter that they didn't specify ax1 or ax2 because there's only one plot in the example. At the end I'd like to save both plots to file using something like
plt.savefig(....)
although I need, again, to be able to refer to both plots independently. So what's the proper way of implementing this?

If you want to be able to write code that clearly applies commands to distinct axes, you want to use the object oriented interface.
Actually, both of your first two examples are using this interface. The difference is that plt.subplots() will create both a figure object and a grid of axes, while plt.figure() just creates the figure.
The figure object has methods to create axes within it. So, these two blocks of code are equivalent:
fig, ax = plt.subplots()
and
fig = plt.figure()
ax = fig.add_subplot(111)
Generally, the latter approach is only going to be more useful when you want multiple axes within the figure that don't follow a regular grid. So, you could do:
fig = plt.figure()
ax = fig.add_axes([.1, .1, .2, .8])
which will add a tall axes object on the left side of the figure.
Next, how do you get multiple axes to plot on?
The subplots function takes two positional arguments specifying the number of rows and columns in the grid (these default to (1, 1)). So if you want two axes side by side, you would do
fig, axes = plt.subplots(1, 2)
Now axes is going to be a 1-D object array of shape (2,) that is filled with Axes objects (by default, subplots squeezes out dimensions of length one). It's often more convenient, for a small grid, to use Python's tuple unpacking and get direct references to the objects:
fig, (ax1, ax2) = plt.subplots(1, 2)
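For reference, a short sketch of what subplots hands back for different grids. The variable names are mine; squeeze is a documented keyword of plt.subplots.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption so this runs headless
import matplotlib.pyplot as plt

# One row, two columns: singleton dimension is squeezed away
fig_a, row = plt.subplots(1, 2)
row_shape = row.shape

# A full grid keeps both dimensions
fig_b, grid = plt.subplots(2, 3)
grid_shape = grid.shape

# squeeze=False always returns a 2-D array, handy for generic loops
fig_c, always_2d = plt.subplots(1, 2, squeeze=False)
always_2d_shape = always_2d.shape

# With no arguments there is no array at all, just a single Axes
fig_d, single = plt.subplots()

# .flat iterates over every Axes regardless of the grid shape
for ax in grid.flat:
    ax.plot([0, 1])
```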
Now, what do you do with these objects, and what's the relationship between them and the MATLAB-style procedure interface?
Most functions in the pyplot namespace also exist as methods on either Figure or Axes objects. Matplotlib (and MATLAB) has a concept of the "current" figure and axes. When you call a function like plt.plot, it draws on the current axes (usually the most recently created one). When you call a function like plt.savefig, it saves the current figure.
For simple tasks, this is a bit more direct and usually easier than using the object-oriented interface. However, when you start making more complex plots, e.g. a grid of axes where each axes has multiple layers (maybe a scatterplot and a regression line), being able to structure the code around what you are doing rather than where you are doing it has substantial advantages. Generally, plotting code that is written in an object-oriented fashion will scale much better than code written in a procedural fashion.
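Applied to the two-independent-plots case from the question, a sketch could look like the following. The helper functions add_lines and add_annotations, the data, and the output file names are all hypothetical; the point is only that each figure is addressed through its own objects and saved with its own savefig method.

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption so this runs headless
import matplotlib.pyplot as plt

fig1, ax1 = plt.subplots()
fig2, ax2 = plt.subplots()


def add_lines(ax_a, ax_b):
    """Hypothetical helper: draw one line on each of the two Axes."""
    ax_a.plot([0, 1, 2], [0, 1, 4])
    ax_b.plot([0, 1, 2], [4, 1, 0])


def add_annotations(ax_a, ax_b):
    """Hypothetical helper: annotate both plots."""
    ax_a.annotate("end", xy=(2, 4))
    ax_b.annotate("start", xy=(0, 4))


# Jump back and forth freely; nothing depends on which figure is "current"
add_lines(ax1, ax2)
add_annotations(ax1, ax2)
ax1.set_title("plot one")
ax2.set_title("plot two")

# Save each figure through its own method rather than plt.savefig
out_dir = tempfile.mkdtemp()
path1 = os.path.join(out_dir, "plot_one.png")
path2 = os.path.join(out_dir, "plot_two.png")
fig1.savefig(path1)
fig2.savefig(path2)
```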

Embed matplotlib figure in larger figure

I am writing a bunch of scripts and functions for processing astronomical data. I have a set of galaxies, for which I want to plot some different properties in a 3-panel plot. I have an example of the layout here:
Now, this is not a problem. But sometimes, I want to create this plot just for a single galaxy. In other cases, I want to make a larger plot consisting of subplots that are each made up of the three-panel structure, like this mockup:
For the sake of modularity and reusability of my code, I would like to do something to the effect of just letting my function return a matplotlib.figure.Figure object and then let the caller - function or interactive session - decide whether to show() or savefig the object or embed it in a larger figure. But I cannot seem to find any hints of this in the documentation or elsewhere, it doesn't seem to be something people do that often.
Any suggestions as to what would be the best road to take? I have speculated whether using axes_grid would be the solution, but it doesn't seem quite clean and caller-agnostic to me. Any suggestions?

The best solution is to separate the figure layout logic from the plotting logic. Write your plotting code something like this:
def three_panel_plot(data, plotting_args, ax1, ax2, ax3):
    # what you do to plot goes here, drawing only on ax1, ax2 and ax3
    ...
So now the code that turns data -> images takes as arguments the data and where it should plot that data to.
If you want to do just one, it's easy; if you want to do a 3x3 grid, you just need to generate the layout and then loop over the axes sets + data.
The way you are suggesting (returning an object out of your plotting routine) would be very hard in matplotlib, because the internals are too connected.
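Here is a minimal sketch of that separation. The body of three_panel_plot, the random data, and the choice of one galaxy per row are my own assumptions for illustration; a real version would draw whatever the three panels are supposed to show.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, an assumption so this runs headless
import matplotlib.pyplot as plt
import numpy as np


def three_panel_plot(data, plotting_args, ax1, ax2, ax3):
    """Hypothetical plotting routine: draws only on the Axes it is given."""
    ax1.plot(data, **plotting_args)
    ax2.hist(data)
    ax3.plot(np.cumsum(data), **plotting_args)


rng = np.random.default_rng(0)

# Case 1: a single galaxy, laid out by the caller as one row of three panels
single_data = rng.normal(size=50)
fig_single, (a, b, c) = plt.subplots(1, 3)
three_panel_plot(single_data, {"color": "k"}, a, b, c)

# Case 2: several galaxies, one three-panel row per galaxy
datasets = [rng.normal(size=50) for _ in range(3)]
fig_grid, axes = plt.subplots(3, 3)
for row, data in zip(axes, datasets):
    three_panel_plot(data, {"color": "C0"}, *row)
```

The plotting routine never creates or shows a figure, so the caller stays free to decide on the layout and on whether to show or save the result.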
