Contour plot on 2 parameter data in python

Hi, I have a NumPy array with shape (1000,3) that I wish to make a contour plot of. The first two columns are the x and y values, and the third is the associated density value at the point given by x and y. These are NOT evenly spaced, as the x and y values were generated by MCMC sampling. I wish to plot the x and y values and demarcate the points whose density is at a certain level.
I have tried calling the contour function but it does not appear to work.
Presuming I have a data object such that np.shape(data) gives (1000,3):
plt.figure()
plt.contour(data[:,0],data[:,1],data[:,2])
plt.show()
This does not work and gives the following error:
TypeError: Input z must be a 2D array
I understand that z, the 3rd column, needs to be some sort of meshgrid, but all the examples I have seen seem to rely on constructing one from evenly spaced x and y, which I do not have. Help appreciated on how I can resolve this.
EDIT: I have figured it out. Need to use the method for unevenly spaced points as described here.
https://matplotlib.org/devdocs/gallery/images_contours_and_fields/griddata_demo.html#sphx-glr-gallery-images-contours-and-fields-griddata-demo-py
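For readers who cannot follow the link, here is a minimal sketch of one approach for unevenly spaced points: matplotlib's tricontour, which triangulates the scattered (x, y) points itself, so no meshgrid is needed. The synthetic data, the helper name contour_scattered and the chosen levels are assumptions for illustration, not the asker's actual samples.

```python
import numpy as np
import matplotlib.pyplot as plt


def contour_scattered(data, levels):
    """Draw contour lines for rows of (x, y, z) that are not on a regular grid."""
    x, y, z = data[:, 0], data[:, 1], data[:, 2]
    fig, ax = plt.subplots()
    # tricontour builds a Delaunay triangulation of the (x, y) points internally
    cs = ax.tricontour(x, y, z, levels=levels)
    # show where the samples actually are
    ax.plot(x, y, "k.", markersize=2)
    return fig, ax, cs


# Synthetic stand-in for the MCMC output (assumption): shape (1000, 3)
rng = np.random.default_rng(0)
xy = rng.normal(size=(1000, 2))
density = np.exp(-0.5 * (xy ** 2).sum(axis=1))
data = np.column_stack([xy, density])

fig, ax, cs = contour_scattered(data, levels=[0.2, 0.5, 0.8])
```

Calling plt.show() afterwards displays the figure. If a regular grid is really needed, scipy.interpolate.griddata can interpolate the third column onto one before calling contour.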

Related

Contour plot of 2D Numpy array in Python

My aim is to get a contour plot in Python for a (100,100) array created with Fortran and imported into Python.
I imported the array from Fortran in the following way:
x = np.linspace(0.02, 10, 100)
y = np.linspace(0.47, 4, 100)
f = np.fromfile('/path/result.dat', dtype=np.float64).reshape((len(x), len(y)), order="F")
So the result depends on x and y and gives a value for every combination of x and y.
How can I create a corresponding contour plot? So far what I tried was:
X, Y= np.meshgrid(x, y)
plt.contourf(X, Y, f, colors='black')
plt.show()
But the resulting contour plot shows values that don't make sense. I also tried imshow() but it did not work. If you could help me, I would be very grateful!
The arrangement of X,Y, and f plays a role here. Without looking at how the result.dat was generated, though, it is difficult to answer this question. Intuition tells me, the values of f(x,y) may not match with the meshgrid.
The improper values might be arising because the values of X and Y don't correspond to the values of f. Try order = "C" or order = "A". Also, your x and y should really be defined before reshaping the data.
x=np.linspace(0.02,10,100)
y=np.linspace(0.47,4,100)
f = np.fromfile(('/path/result.dat'), dtype=np.float64).reshape((len(x), len(y)), order="<>")
Maybe try reordering X and Y if this doesn't work.
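To make the "arrangement of X, Y and f" point concrete, here is a small sketch. It assumes the file holds f(i, j) for x(i), y(j) in column-major order, which cannot be confirmed without seeing how result.dat was written; the tiny grid sizes and the test function are made up so the shapes are easy to see. np.meshgrid defaults to indexing="xy", which returns arrays of shape (len(y), len(x)), so an f of shape (len(x), len(y)) is transposed relative to them.

```python
import numpy as np

x = np.linspace(0.02, 10, 4)   # deliberately different lengths (assumption)
y = np.linspace(0.47, 4, 3)

# Hypothetical stand-in for the Fortran result: f(i, j) = x(i) + 10 * y(j)
truth = x[:, None] + 10 * y[None, :]       # shape (len(x), len(y))
raw = truth.tobytes(order="F")             # column-major bytes, as assumed for the file

f = np.frombuffer(raw, dtype=np.float64).reshape((len(x), len(y)), order="F")

X, Y = np.meshgrid(x, y)                     # default "xy": shape (len(y), len(x))
Xij, Yij = np.meshgrid(x, y, indexing="ij")  # shape (len(x), len(y)), same as f

# Either of these pairs the values with the right coordinates:
#   plt.contourf(X, Y, f.T)
#   plt.contourf(Xij, Yij, f)
print(f.shape, X.shape, Xij.shape)
```

With 100 points on both axes the shapes agree by accident, so plt.contourf(X, Y, f) runs without error but draws the transposed field.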

Line/surface of best fit for 3D scatterplot from Excel data

I have a database of the height, weight, and age of 100s of people. Using matplotlib, I've been able to create a 3D scatterplot of these 3 variables with the xyz co-ordinates of each point representing the (height,weight,age) of one person.
Is it possible to create a (i) line (ii) surface of best fit for the data? The meshgrid would be incomplete since I don't have an age (z) value for each pairing of height and weight (x,y) values. Can we draw the line/surface regardless? Do I have to impute the missing z-values in the meshgrid, and if so, how would I do that?
Most other answers I've seen on this topic assume z is a function of x and y, or that the meshgrid is complete, both of which are not the case here.
Can you try using numpy.meshgrid and then fill your unknown z values with numpy.nan? Matplotlib should ignore numpy.nan from the plots.
By 'best fit' did you mean an interpolation? If so you can pass your data through scipy.interpolate.RectBivariateSpline. I think that would suit your problem?
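Note that scipy.interpolate.RectBivariateSpline expects values on a rectangular grid. If "best fit" instead means a least-squares fit to the scattered points, no meshgrid of data is needed at all. The following is a rough sketch of fitting a plane z = a*x + b*y + c with np.linalg.lstsq; the synthetic height, weight and age values and the helper name fit_plane are assumptions for illustration only.

```python
import numpy as np


def fit_plane(x, y, z):
    """Least-squares coefficients (a, b, c) of the plane z = a*x + b*y + c."""
    A = np.column_stack([x, y, np.ones_like(x)])
    coeffs, *_ = np.linalg.lstsq(A, z, rcond=None)
    return coeffs


# Made-up data standing in for the spreadsheet (assumption)
rng = np.random.default_rng(0)
height = rng.uniform(150, 200, 300)
weight = rng.uniform(50, 100, 300)
age = 0.1 * height + 0.3 * weight + 5 + rng.normal(scale=0.5, size=300)

a, b, c = fit_plane(height, weight, age)

# The fitted surface can be evaluated on any complete grid for plotting,
# e.g. with Axes3D.plot_surface(H, W, Z)
H, W = np.meshgrid(np.linspace(150, 200, 20), np.linspace(50, 100, 20))
Z = a * H + b * W + c
```

The grid here is only used to draw the fitted plane, so missing (height, weight) combinations in the data do not matter.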

Heatmap from 1D array or list, what should x-axis be?

I have a 1D array consisting of over 100,000 values. If I were to plot it on a scatterplot, it would pretty much just be one solid color block. So, I want to use a heatmap instead.
I saw various methods, but they either want a 2D array or have "x" and "y" values. If my 1D array values were y, what should the x-axis be? I only want to see how highly "concentrated" those values are in one area of the plot.
plt.imshow() requires at least a 2D array.
plt.pcolormesh() requires X and Y.
plt.hexbin() requires X and Y.
np.histogram2d() requires X and Y (from an example I saw).
Thank you.
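One possible reading of the question, offered only as a sketch: if the values are taken as y, the position of each value in the array can serve as x, and np.histogram2d then counts how many values fall into each (index, value) bin. The random data and the bin counts below are assumptions.

```python
import numpy as np

# Stand-in for the 1D array of 100,000 values (assumption)
rng = np.random.default_rng(0)
values = rng.normal(size=100_000)

# Use the position in the array as the x coordinate
idx = np.arange(values.size)

# counts[i, j] = number of samples in x-bin i and y-bin j
counts, xedges, yedges = np.histogram2d(idx, values, bins=(50, 40))

# To draw it: plt.pcolormesh(xedges, yedges, counts.T)
print(counts.shape, int(counts.sum()))
```

If the order of the values carries no meaning, a plain 1D histogram of the values (np.histogram or plt.hist) shows the same concentration without needing an x-axis.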

Python fastKDE beyond limits of data points

I'm trying to use the fastKDE package (https://pypi.python.org/pypi/fastkde/1.0.8) to find the KDE of a point in a 2D plot. However, I want to know the KDE beyond the limits of the data points, and cannot figure out how to do this.
Using the code listed on the site linked above:
#!python
import numpy as np
from fastkde import fastKDE
import pylab as PP
#Generate two random variables dataset (representing 200000 pairs of datapoints)
N = int(2e5)  # size must be an integer, not a float
var1 = 50*np.random.normal(size=N) + 0.1
var2 = 0.01*np.random.normal(size=N) - 300
#Do the self-consistent density estimate
myPDF,axes = fastKDE.pdf(var1,var2)
#Extract the axes from the axis list
v1,v2 = axes
#Plot contours of the PDF; they should be a set of concentric ellipsoids centered on
#(0.1, -300). Comparatively, the y axis range should be tiny and the x axis range
#should be large
PP.contour(v1,v2,myPDF)
PP.show()
I'm able to find the KDE for any point within the limits of the data, but how do I find the KDE at, say, the point (0,300) without having to include it in var1 and var2? I don't want the KDE to be calculated with this data point; I want to know the KDE at that point.
I guess what I really want to be able to do is give the fastKDE a histogram of the data, so that I can set its axes myself. I just don't know if this is possible?
Cheers
I, too, have been experimenting with this code and have run into the same issue. What I've done (in lieu of a good N-D extrapolator) is to build a KDTree (with scipy.spatial) from the grid points that fastKDE returns and find the nearest grid point to the point I want to evaluate. I then look up the corresponding pdf value at that grid point (it should be small near the edge of the pdf grid, if not identically zero) and assign that value accordingly.
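A rough sketch of that nearest-grid-point lookup, using scipy.spatial.cKDTree. It does not call fastKDE; the axes v1, v2 and the pdf array are small made-up stand-ins, and the pdf is assumed to have shape (len(v2), len(v1)), which is the layout the contour(v1, v2, myPDF) call in the question implies.

```python
import numpy as np
from scipy.spatial import cKDTree

# Hypothetical stand-ins for the axes and pdf returned by the density estimate
v1 = np.linspace(-3.0, 3.0, 7)
v2 = np.linspace(-1.0, 1.0, 5)
V1, V2 = np.meshgrid(v1, v2)            # both of shape (len(v2), len(v1))
pdf = np.exp(-(V1 ** 2 + V2 ** 2))      # made-up density on the grid

# One tree entry per grid point, in the same order as pdf.ravel()
tree = cKDTree(np.column_stack([V1.ravel(), V2.ravel()]))

# Query a point outside the grid; the nearest grid point is on the border
query = np.array([[10.0, 0.1]])
dist, flat = tree.query(query)
value = pdf.ravel()[flat]
print(value)
```

Because the nearest neighbour is found with plain Euclidean distance, axes with very different scales (as in the question) may need rescaling first.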
I came across this post while searching for a solution to this problem. Similar to building a KDTree, you could just calculate the step size in each grid dimension and then get the index of your query point by subtracting the start of the axis from the point value, dividing by the step size of that dimension, rounding, and converting to an integer. So, for example, in 1D:
def fastkde_test(test_x):
    # num_p (the number of grid points) is assumed to be defined elsewhere
    kde, axes = fastKDE.pdf(test_x, numPoints=num_p)
    # spacing between neighbouring grid points (n points span n-1 intervals)
    x_step = (max(axes) - min(axes)) / (len(axes) - 1)
    x_ind = np.int32(np.round((test_x - min(axes)) / x_step))
    return kde[x_ind]
where test_x in this case is both the set defining the KDE and the query set. Doing it this way is faster by about a factor of 10 in my case (at least in 1D; higher dimensions not yet tested) and does basically the same thing as the KDTree query.
I hope this helps anyone coming across this problem in the future, as I just did.
Edit: if you're querying points outside the range over which the KDE was calculated, this method can of course only give you the same result as the KDTree query, namely the value at the corresponding border of your KDE grid. You would, however, have to hardcode this by clipping the resulting x_ind at the highest index, i.e. `len(axes)-1` (and at 0 at the lower end).
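The clipping described in the edit can be sketched with np.clip. This version is independent of fastKDE: it assumes a uniformly spaced 1D axis and a pdf array of the same length, both made up here, and the helper name lookup_on_grid is hypothetical.

```python
import numpy as np


def lookup_on_grid(pdf, axis, query):
    """Return pdf at the grid point nearest to each query value (1D, uniform axis)."""
    step = (axis[-1] - axis[0]) / (len(axis) - 1)
    idx = np.rint((np.asarray(query) - axis[0]) / step).astype(int)
    # queries outside the grid are mapped to the first or last grid point
    idx = np.clip(idx, 0, len(axis) - 1)
    return pdf[idx]


# Made-up grid and density (assumption)
axis = np.linspace(-5.0, 5.0, 11)
pdf = np.exp(-0.5 * axis ** 2) / np.sqrt(2 * np.pi)

vals = lookup_on_grid(pdf, axis, [-100.0, 0.2, 4.6, 100.0])
print(vals)
```

Points far outside the grid simply receive the border value, which is only a reasonable stand-in when the density has already decayed to nearly zero at the edge of the grid.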

Matplotlib: Reversing axis order based on coordinates given to pcolor/pcolormesh

I'm building an application in which there is a colorplot plotted with pcolor. The user can apply various operations on this data like gradients, offsets and filters. One of these operations is a flip, where the x or y axis gets reversed. I represent every operation by a function which takes a matrix of datapoints and their x and y coordinates, and returns the same. I want to know if it's possible to reverse an axis with the data given to pcolor.
I have tried reversing the x/y coordinates but this will make matplotlib just reverse the data in the x or y direction in order to keep the axes positively increasing up and to the right.
So my question is as follows: can I make pcolor reverse an axis so that it increases in the opposite direction when I supply a reversed X and Y? Or is there maybe a function option that I haven't seen yet?
If you are doing this,
pcolor(x, y, dat)
Then this should do what you want,
pcolor(y, x, dat.T)
In addition to swapping the y and x values, you must also take the transpose (.T) of your data. This should work equivalently for pcolormesh.
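The call above exchanges the two axes with each other. If the goal is instead for one axis to run in the opposite direction, a possible alternative (not part of the answer above) is the Axes method invert_xaxis() or invert_yaxis(), which flips the axis direction without touching the data. A minimal sketch with made-up coordinates and values:

```python
import numpy as np
import matplotlib.pyplot as plt

# Made-up cell edges and data (assumption): dat has one value per cell
x = np.linspace(0.0, 3.0, 4)
y = np.linspace(0.0, 2.0, 3)
dat = np.arange(6.0).reshape(2, 3)   # shape (len(y) - 1, len(x) - 1)

fig, ax = plt.subplots()
mesh = ax.pcolormesh(x, y, dat)

# Flip the direction of the x axis; values now increase from right to left
ax.invert_xaxis()
print(ax.get_xlim())
```

Since this is a property of the Axes rather than of the arrays, an operation function that only returns data and coordinates would need some extra flag to tell the plotting code to invert the axis.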
