What is being plotted by plt.plot with a tuple argument? - python

This code snippet:
import matplotlib.pyplot as plt
plt.plot(([1, 2, 3], [1, 2, 3]))
plt.show()
produces:
What function is being plotted here? Is this use case described in matplotlib documentation?
This snippet:
plt.plot(([1, 2, 3], [1, 2, 3], [2, 3, 4]))
produces:

From the new test case you provided, we can see that it picks the i-th element of each list and builds a series from them.
So it ends up plotting the series y = {1, 1, 2}, y = {2, 2, 3} and y = {3, 3, 4}.
More generally, passing a tuple of lists plots multiple series, one per position in the lists.
Honestly, writing the input like that doesn't look very user-friendly, but there may be cases where it is more convenient.
The x-values are chosen by default, according to the docs:
The horizontal / vertical coordinates of the data points. x values are optional and default to range(len(y)).
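To check this reading, one can inspect the Line2D objects that plt.plot returns. The following is a minimal sketch, assuming a non-interactive backend (Agg) so that it runs without a display; the variable name series is only illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Same call as in the question: a single positional argument that is a tuple of lists.
lines = plt.plot(([1, 2, 3], [1, 2, 3], [2, 3, 4]))

# One Line2D object is returned per plotted series; collect its x and y data.
series = [(line.get_xdata().tolist(), line.get_ydata().tolist()) for line in lines]
for x, y in series:
    print("x =", x, "y =", y)
```

Each printed y should be built from the i-th elements of the lists, and each x should be the default range(len(y)).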

Calling plt.plot(y) calls the plot method of the Axes class. Looking at the source code, the description closest to your problem is the following one, on plotting multiple sets of data:
- If *x* and/or *y* are 2D arrays a separate data set will be drawn
for every column. If both *x* and *y* are 2D, they must have the
same shape. If only one of them is 2D with shape (N, m) the other
must have length N and will be used for every data set m.
Example:
>>> x = [1, 2, 3]
>>> y = np.array([[1, 2], [3, 4], [5, 6]])
>>> plot(x, y)
is equivalent to:
>>> for col in range(y.shape[1]):
...     plot(x, y[:, col])
The main difference compared to your example is that x is defined implicitly from the length of your tuple (described elsewhere in the documentation), and that you are passing a tuple rather than an np.array. I tried digging further into the source code to see where tuples become arrays. In particular, line 1632, lines = [*self._get_lines(*args, data=data, **kwargs)], seems to be where the different lines are generated, but that is as far as I got.
Note that this is one of three ways to plot multiple lines of data, and it is the most compact.
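As a rough comparison, here is a sketch of the three ways as I read them in the Axes.plot documentation: one call per data set, a single 2D array, and several x, y groups in one call. The helper line_data is a hypothetical name introduced only for this check, and the Agg backend is assumed so that no window is needed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

x = [1, 2, 3]
y = np.array([[1, 2], [3, 4], [5, 6]])  # two data sets, one per column

# 1) Call plot once per data set.
fig1, ax1 = plt.subplots()
for col in range(y.shape[1]):
    ax1.plot(x, y[:, col])

# 2) Pass the 2D array directly; each column becomes a line.
fig2, ax2 = plt.subplots()
ax2.plot(x, y)

# 3) Pass several x, y groups in a single call.
fig3, ax3 = plt.subplots()
ax3.plot(x, y[:, 0], x, y[:, 1])

def line_data(ax):
    """Return the y-values of every line drawn on the given Axes (hypothetical helper)."""
    return [line.get_ydata().tolist() for line in ax.get_lines()]

# All three Axes should hold the same two lines.
same = line_data(ax1) == line_data(ax2) == line_data(ax3)
print(same)
```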

Related

Visualising entity density on a 2D plane using pcolormesh in matplotlib, Python

I am trying to recreate the following heatmap (created with R) with Python.
This heatmap represents the entity concentration in a room, where a lighter color means a denser concentration of entities. The X axis is length in meters and the Y axis is height in meters.
Currently, I have been trying to recreate the data with matplotlib's pcolormesh. Unfortunately, I cannot seem to understand how to define X and Y columns for the heatmap.
The following code produces the heatmap below:
df = pd.DataFrame({"x": [0, 0, 1, 1, 2, 2],
"y": [3, 4, 3, 4, 3, 4],
"concentration": [1123, 1238, 1285, 1394, 5123, 8712]})
plt.pcolormesh(df)
plt.show()
Whereas I would like to see something like this:
As you can see, for X and Y I actually use values from columns in the dataframe.
Basically, X and Y are coordinates and the color at those pixels is dependent on the value of concentration (third column).
I tried to pass the arguments like that:
plt.pcolormesh([df["x"], df["y"]], df["concentration"])
plt.show()
But this leads to the following error:
TypeError: pcolormesh() takes 1 or 3 positional arguments but 2 were given
How am I to represent the concentration data as a 2D array?
Am I even on the correct path?
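One possible way to get a 2D array out of such a table is to pivot it so that rows follow y and columns follow x. This is only a sketch under assumptions: it presumes every (x, y) pair occurs exactly once, uses shading="nearest" (available in Matplotlib 3.3 and later), and the names grid and mesh are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
import pandas as pd

df = pd.DataFrame({"x": [0, 0, 1, 1, 2, 2],
                   "y": [3, 4, 3, 4, 3, 4],
                   "concentration": [1123, 1238, 1285, 1394, 5123, 8712]})

# Reshape the long table into a 2D grid: rows are y values, columns are x values.
grid = df.pivot(index="y", columns="x", values="concentration")

# pcolormesh takes either C alone or X, Y, C as three positional arguments.
# shading="nearest" centers each colored cell on its (x, y) coordinate.
mesh = plt.pcolormesh(grid.columns.to_numpy(), grid.index.to_numpy(),
                      grid.to_numpy(), shading="nearest")
plt.colorbar(mesh, label="concentration")
plt.xlabel("x")
plt.ylabel("y")
```

This also shows why the call with two positional arguments fails: the coordinates and the values have to be passed as three separate arguments.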

ValueError: color kwarg must have one color per data set. 4462 data sets and 1 colors were provided

I have an error that I do not understand. I am quite new to this, so thank you all in advance!
I am using Jupyter Notebook (Anaconda3). The link here shows my code and error message.
The problem is the dimensionality of your all_unbalanced_data array/list.
If you pass N data sets (N different lists or arrays of data) as input to plt.hist, then the color kwarg must contain N colors.
You passed a single color, so for the script to work your data must be shaped as a 1-dimensional array.
A rule of thumb
Suppose your data are contained in a numpy array:
all_unbalanced_data = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])
Then you can extract the shape (dimensionality) of the array:
all_unbalanced_data.shape
>>> (2, 4)
plt.hist treats each column of a 2D array as a separate data set, so the number of colors it expects here is 4:
color = ['color_code_1', 'color_code_2', 'color_code_3', 'color_code_4']
So in your case plt.hist is expecting 4462 different colors.
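The two usual ways out can be sketched as follows, using the small example array from above rather than the real data (which is not shown in the question). The color names are arbitrary choices, and the Agg backend is assumed.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
import numpy as np

all_unbalanced_data = np.array([[1, 2, 3, 4], [5, 6, 7, 8]])

# Option 1: flatten to a single data set, so a single color is enough.
counts_flat, _, _ = plt.hist(all_unbalanced_data.ravel(), color="tab:blue")

# Option 2: keep the 2D array and give one color per column (one per data set).
n_datasets = all_unbalanced_data.shape[1]
colors = ["tab:blue", "tab:orange", "tab:green", "tab:red"][:n_datasets]
counts_2d, _, _ = plt.hist(all_unbalanced_data, color=colors)
```

Which option is appropriate depends on whether the columns are meant to be separate histograms or one pooled sample.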

LinearNDInterpolatorExtrapolate returns error with trivial example

I'm trying to use scipy's LinearNDInterpolatorExtrapolate.
The following minimal code should be as trivial as possible, yet it returns an error
from scipy.interpolate import NearestNDInterpolator
points = [[0,0,0], [1,0,0], [1,1,0],[0,1,0],[.5,.5,1]]
values = [1,2,3,4,5]
interpolator = NearestNDInterpolator(points,values)
interpolator([.5,.5,.8])
returns
TypeError: only integer scalar arrays can be converted to a scalar index
The error seems to come from line 81 of scipy.interpolate.ndgriddata [source]. Unfortunately I could not chase the error further, as I don't understand what tree.query is returning.
Is this a bug, or am I doing something wrong?
In your case, it seems to be a problem with the input types rather than a bug. points and values are plain Python lists, and the message in the traceback is what you get when a Python list is indexed with a NumPy integer array, so the lookup into values is the likely culprit.
The following fixes your code and returns the correct answer, which is [5]:
import numpy as np
from scipy.interpolate import NearestNDInterpolator
points = np.array([[0, 0, 0], [1, 0, 0], [1, 1, 0],[0, 1, 0],[.5, .5, 1]])
values = np.array([1, 2, 3, 4, 5])
interpolator = NearestNDInterpolator(points, values)
interpolator(np.array([[.5, .5, .8]]))
>>> array([5])
Notice two things:
I imported numpy and used np.array. This is the preferable way to work with scipy, because np.array, although statically typed, is much faster than Python's list and provides a wide range of mathematical operations.
When calling interpolator, I used [[...]] instead of your [...]. Why? It highlights the fact that NearestNDInterpolator can interpolate values in multiple points.
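The difference between a list and an array can be seen without scipy at all. The sketch below is only an illustration of that TypeError, not a trace of what scipy does internally; the index array nearest is a hypothetical stand-in for the result of a nearest-neighbour lookup.

```python
import numpy as np

values = [1, 2, 3, 4, 5]    # plain Python list, as in the question
nearest = np.array([4, 4])  # hypothetical indices of the nearest data points

try:
    values[nearest]         # a list cannot be indexed with an integer array
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# The same lookup works once the values are a NumPy array.
picked = np.array(values)[nearest]
print(error_message)
print(picked)
```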
Pass your input as arrays:
interpolator = NearestNDInterpolator(np.array(points), np.array(values))
You can even pass many points:
interpolator([np.array([.5,.5,.8]),np.array([1,1,2])])
>>> array([5, 5])
Just pass the query point as a tuple of coordinates rather than a list:
from scipy.interpolate import NearestNDInterpolator
points = [[0,0,0], [1,0,0], [1,1,0],[0,1,0],[.5,.5,1]]
values = [1,2,3,4,5]
interpolator = NearestNDInterpolator(points,values)
interpolator((.5,.5,.8))
# 5
If you want to stick to passing lists, you can unpack the list contents using * as
interpolator(*[.5,.5,.8])
To interpolate at more than one point, you can map the interpolator over your list of points (tuples):
answer = list(map(interpolator, [(.5,.5,.8), (.05, 1.6, 2.9)]))
# [5, 5]

How to plot contours from multidimensional data in MatPlotLib (NumPy)?

I have many measurements of several quantities in an array, like this:
m = array([[2, 1, 3, 2, 1, 4, 2], # measurements for quantity A
[8, 7, 6, 7, 5, 6, 8], # measurements for quantity B
[0, 1, 2, 0, 3, 2, 1], # measurements for quantity C
[5, 6, 7, 5, 6, 5, 7]] # measurements for quantity D
)
The quantities are correlated and I need to plot various contour plots, such as "contours of B vs. D x A".
It is true that in the general case the functions might be not well defined -- for example in the above data, columns 0 and 3 show that for the same (D=5,A=2) point there are two distinct values for B (B=8 and B=7). But still, for some combinations I know there is a functional dependence, which I need plotted.
The contour() function from MatPlotLib expects three arrays: X and Y can be 1D arrays, and Z has to be a 2D array with corresponding values. How should I prepare/extract these arrays from m?
You will probably want to use something like scipy.interpolate.griddata to prepare your Z arrays. This will interpolate your data to a regularly spaced 2D array, given your input X and Y, and a set of sorted, regularly spaced X and Y arrays which you will need for eventual plotting. For example, if X and Y contain data points between 1 and 10, then you need to construct a set of new X and Y with a step size that makes sense for your data, e.g.
Xout = numpy.linspace(1,10,10)
Yout = numpy.linspace(1,10,10)
To turn your Xout and Yout arrays into 2D arrays you can use numpy.meshgrid, e.g.
Xout_2d, Yout_2d = numpy.meshgrid(Xout,Yout)
Then you can use those new regularly spaced arrays to construct your interpolated Z array that you can use for plotting, e.g.
Zout = scipy.interpolate.griddata((X,Y),Z,(Xout_2d,Yout_2d))
This interpolated 2D Zout should be usable for a contour plot with Xout_2d and Yout_2d.
Extracting your arrays from m is simple, you just do something like this:
A, B, C, D = (row for row in m)
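Put together, the steps above might look like the following sketch. The scattered data here is synthetic and hypothetical (random points plus the four corners of the domain, with Z chosen as a known linear function so the result is easy to check); with real measurements, X, Y and Z would be three rows of m instead.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend
import matplotlib.pyplot as plt
import numpy as np
from scipy.interpolate import griddata

# Hypothetical scattered measurements: the four corners of the domain plus random points.
rng = np.random.default_rng(0)
corners = np.array([[1.0, 1.0], [1.0, 10.0], [10.0, 1.0], [10.0, 10.0]])
xy = np.vstack([corners, rng.uniform(1, 10, size=(200, 2))])
X, Y = xy[:, 0], xy[:, 1]
Z = X + 2 * Y  # hypothetical quantity with a known dependence on X and Y

# Regular output grid, kept inside the region covered by the measurements.
Xout = np.linspace(2, 9, 10)
Yout = np.linspace(2, 9, 10)
Xout_2d, Yout_2d = np.meshgrid(Xout, Yout)

# Linear interpolation of the scattered values onto the regular grid.
Zout = griddata((X, Y), Z, (Xout_2d, Yout_2d), method="linear")

# The interpolated 2D array can now be passed to contour.
contours = plt.contour(Xout_2d, Yout_2d, Zout)
```

Note that with linear interpolation, grid points outside the convex hull of the measurements come back as NaN, which is why the grid here stays inside the sampled region.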

Stitching grids of varying grid spacing

I read data from binary files into numpy arrays with np.fromfile. These data represent Z values on a grid for which the spacing and shape are known, so there is no problem reshaping the 1D array into the shape of the grid and plotting with plt.imshow. So if I have N grids, I can plot N subplots showing all the data in one figure, but what I'd really like to do is plot them as one image.
I can't just stack the arrays because the data in each array is spaced differently and because they have different shapes.
My idea was to "supersample" all grids to the spacing of the finest grid, stack and plot but I am not sure that is such a good idea as these grid files can become quite large.
By the way: Let's say I wanted to do that, how do I go from:
0, 1, 2
3, 4, 5
to:
0, 0, 1, 1, 2, 2
0, 0, 1, 1, 2, 2
3, 3, 4, 4, 5, 5
3, 3, 4, 4, 5, 5
I'm open to any suggestions.
Thanks,
Shahar
The answer, if you just want to plot, is: don't. plt.imshow has a keyword argument extent which you can use to zoom the image when plotting. Other than that, I would suggest scipy.ndimage.zoom; with order=0 it is equivalent to repeating values, but you can zoom to any size easily, or use a different order to get some smooth interpolation. np.repeat along each axis could be an option for very simple zooming too.
Here is an example:
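import numpy as np
import matplotlib.pyplot as plt
a = np.arange(9).reshape(3,3)
b = np.arange(36).reshape(6,6)
# draw the two grids side by side; extent is (left, right, bottom, top)
plt.imshow(a, extent=[0,1,0,1], interpolation='none')
plt.imshow(b, extent=(1,2,0,1), interpolation='none')
# note: the axis scaling is "broken" after the second call, so reset the limits
plt.xlim(0,2)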
Of course, to get the same color range for both, you should add the vmin=... and vmax=... keywords.
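For the literal question of how to turn the 2x3 grid into the 4x6 one, a minimal sketch with np.repeat is shown below. It assumes an integer zoom factor; the names coarse, fine and factor are only illustrative.

```python
import numpy as np

coarse = np.array([[0, 1, 2],
                   [3, 4, 5]])
factor = 2  # hypothetical integer zoom factor

# Repeat every row, then every column, `factor` times.
fine = np.repeat(np.repeat(coarse, factor, axis=0), factor, axis=1)
print(fine)
```

This copies the data, so for large grid files the extent approach above avoids the extra memory.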
