Natgrid interpolation in Matplotlib producing Null Values - python

Everyone,
I am currently having some difficulty with the interpolation produced by griddata in matplotlib. I have a data set that contains 3 values per data point, and I am attempting to plot this data as a contour. The problem is that the built-in Delaunay interpolation breaks on some data sets, so I have tried to move over to the natgrid interpolation. This works for some data; for other data sets, however, griddata produces null values in its output. I should also mention that these null values occur in an area where most values are the same (0 in this case).
Are there any suggestions as to what could cause this?
Edit: Here are some more details.
The source data is rather large and cannot be pasted in here. So, I used a pastebin to hold it. I hope this is allowed. That is available here: http://pastebin.com/C7Nvvcaw. This is formatted like x, y, z. So the first column is the x value and so on. This is the result for printing every item in the xyz list and piping it into a file.
The same applies for the post-interpolation data, so I have used a pastebin as well. Here it is: http://pastebin.com/ZB6S2qFk. Each [] corresponds to a particular y value. The y values are not included because they are in separate lists; they simply range from 0-50 on the Y axis and 0-100 on the X axis. This is the output of printing every item in the zi list from my code and piping it into a file.
Now for the code. I have not attached all of it because a large part is out of the scope of this question, so I will post the relevant components. The initial data is in a list of lists, providing the ability to navigate the data in 3 dimensions. I suppose this is similar to a 3-dimensional array, though it is not an array, just a list of lists. The source data's list is called xyz.
import numpy as np
from matplotlib.mlab import griddata  # the natgrid-backed griddata

#Function to get an individual column of data
def column(matrix, i):
    return [row[i] for row in matrix]

#Getting max and min of each axis
xmin = float(min(column(xyz, 0)))
xmax = float(max(column(xyz, 0)))
ymin = float(min(column(xyz, 1)))
ymax = float(max(column(xyz, 1)))

#Resolution for interpolation (x and y list lengths)
resx = 100
resy = 50
xi = np.linspace(xmin, xmax, resx)
yi = np.linspace(ymin, ymax, resy)

x = np.array(column(xyz, 0))
y = np.array(column(xyz, 1))
z = np.array(column(xyz, 2))
zi = griddata(x, y, z, xi, yi, interp='nn')
EDIT: I found a solution. The 0 values are breaking the interpolation. I was able to catch them and change them all to 0.1, and the interpolation no longer produces the null values. This is acceptable in my case because 0 and 0.1 fall under the same contour color. For other users: if you need to retain absolute accuracy of the data, I would suggest shifting all the values by +1, interpolating, then shifting the interpolated values back by -1. This retains the accuracy without breaking the interpolation method.
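The shift-and-unshift workaround can be sketched like this. Here np.interp stands in for the actual natgrid griddata call (which needs matplotlib's natgrid toolkit installed); the idea relies only on the interpolation being unaffected by a constant offset:

```python
import numpy as np

# Stand-in 1-D data; in the question this would be the scattered
# (x, y, z) triples fed to matplotlib's griddata with interp='nn'.
x = np.array([0.0, 1.0, 2.0, 3.0])
z = np.array([0.0, 0.0, 2.0, 4.0])   # zeros like the ones that broke natgrid

xi = np.linspace(0.0, 3.0, 7)

# Shift by +1, interpolate, then shift the result back by -1
zi = np.interp(xi, x, z + 1.0) - 1.0
```

Because the interpolation is linear in the data values, adding a constant before interpolating and subtracting it afterwards reproduces the unshifted result exactly.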
Thanks for any help,
Daniel

Related

Contour plot of 2D Numpy array in Python

My aim is to get a contour plot in Python for a (100,100) array imported and created with Fortran.
I imported the array from Fortran in the following way:
x = np.linspace(0.02, 10, 100)
y = np.linspace(0.47, 4, 100)
f = np.fromfile('/path/result.dat', dtype=np.float64).reshape((len(x), len(y)), order="F")
So the result is dependent from x and y and gives a value for every combination of x and y.
How can I create a corresponding contour plot? So far what I tried was:
X, Y= np.meshgrid(x, y)
plt.contourf(X, Y, f, colors='black')
plt.show()
But the resulting contour plot shows values that don't make sense. I also tried imshow(), but it did not work. If you could help me, I would be very grateful!
The arrangement of X,Y, and f plays a role here. Without looking at how the result.dat was generated, though, it is difficult to answer this question. Intuition tells me, the values of f(x,y) may not match with the meshgrid.
The improper values might be arising because the values of X and Y don't correspond to the values of f. Try order = "C" or order = "A". Also, your x and y should really be defined before reshaping the data.
x=np.linspace(0.02,10,100)
y=np.linspace(0.47,4,100)
f = np.fromfile(('/path/result.dat'), dtype=np.float64).reshape((len(x), len(y)), order="<>")
Maybe try reordering X and Y if this doesn't work.
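One quick thing to check is the orientation of f: with X, Y = np.meshgrid(x, y) (the default 'xy' indexing), plt.contourf expects its z argument to have shape (len(y), len(x)), so an array stored as (len(x), len(y)) must be transposed. A minimal sketch with a synthetic array in place of the file read (and deliberately unequal lengths so the two shapes are distinguishable):

```python
import numpy as np

x = np.linspace(0.02, 10, 100)
y = np.linspace(0.47, 4, 60)

# Synthetic stand-in for the data read from result.dat,
# stored with shape (len(x), len(y)) as in the question:
f = np.add.outer(x, y)        # f[i, j] = x[i] + y[j]

X, Y = np.meshgrid(x, y)      # X and Y have shape (len(y), len(x))
Z = f.T                       # transpose so Z matches X and Y
# plt.contourf(X, Y, Z) would now line up correctly
```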

Contour plot on 2 parameter data in python

Hi, I have a numpy array with shape (1000,3) that I wish to make a contour plot of. The first two columns represent x and y values, and the 3rd is the associated density value at the point denoted by those x and y values. These are NOT evenly spaced, as the x and y values were generated by MCMC sampling methods. I wish to plot the x and y values and demarcate points which have density at a certain level.
I have tried calling the contour function but it does not appear to work.
Presuming I have a data object such that np.shape(data) gives (1000, 3):
plt.figure()
plt.contour(data[:,0], data[:,1], data[:,2])
plt.show()
this does not seem to work and gives the following error
TypeError: Input z must be a 2D array
I understand that z, the 3rd column, needs to be some sort of meshgrid, but all the examples I have seen seem to rely on constructing one from evenly spaced x and y, which I do not have. Help appreciated on how I can resolve this.
EDIT: I have figured it out. Need to use the method for unevenly spaced points as described here.
https://matplotlib.org/devdocs/gallery/images_contours_and_fields/griddata_demo.html#sphx-glr-gallery-images-contours-and-fields-griddata-demo-py
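For scattered (x, y, z) triples like these, one option (the approach in the linked demo) is to resample onto a regular grid with scipy.interpolate.griddata and then call contour; matplotlib's plt.tricontour / plt.tricontourf can also triangulate the points directly. A sketch of the gridding approach, with a made-up density column standing in for the MCMC samples:

```python
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(0)
data = np.empty((1000, 3))
data[:, :2] = rng.uniform(0, 1, size=(1000, 2))   # unevenly spaced x, y
data[:, 2] = data[:, 0] + data[:, 1]              # made-up density values

# Resample the scattered points onto a regular 2-D grid:
xi = np.linspace(0.1, 0.9, 50)
yi = np.linspace(0.1, 0.9, 50)
Xi, Yi = np.meshgrid(xi, yi)
Zi = griddata((data[:, 0], data[:, 1]), data[:, 2], (Xi, Yi), method='linear')
# Zi is a 2-D array, so plt.contour(Xi, Yi, Zi) now works
```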

How to vectorize a python code that needs interpolation for specific data points

I have a problem where I use a computer program called MCNP to calculate the energy deposition in a square geometry from a particle flux. The square geometry is broken down into a mesh grid with 50 cubic meshes in length, width and height. The data is placed into a text file listing the centroid position of each mesh in cartesian coordinates (x, y and z position) and the energy deposition at that (x, y, z) coordinate. The data is then extracted with a Python script. I have a script that allows me to take a slice in the z plane and plot a heat map of energy deposition on that plane. It works, but I don't think it is very efficient, and I am looking for ways to vectorize the process.
The code reads in the X, Y and Z coordinates as three separate 1-D numpy arrays, and also reads in the energy deposition at each coordinate as a 1-D numpy array. For the sake of this description, let's assume I want to take a slice at the Z coordinate of zero. If none of the mesh centroids are at z = 0, I have to (and do) cycle through all of the data points in the Z-coordinate array until I find one that is greater than zero (array index i) whose preceding array index (i-1) is less than zero. I then use those points in Z-space, along with the slice location (in this case 0) and the energy deposition at those array indices, and interpolate to find the correct energy deposition at the slice's z-location. Since the X and Y arrays are unaffected, I then have the X, Y coordinates and can plot a heat map of the energy deposition at the slice location. The code also needs to determine whether the slice location is already in the data set, in which case no interpolation is needed. The code I have works, but I could not see how to use the built-in scipy interpolation schemes, so I wrote a function to do the interpolation myself, and I had to use a for loop to iterate until I found the positions above and below the slice location (z = 0 in this instance). I am attaching my example code and asking for help to better vectorize this snippet (if it can be better vectorized), and hopefully I will learn something in the process.
# - This transforms the read-in data from a list to a numpy array,
#   where Magnitude represents the energy deposition
XArray = np.array(XArray); YArray = np.array(YArray)
ZArray = np.array(ZArray); Magnitude = np.array(Magnitude)
#==============================================================
# - This section creates planar data for a 2-D plot
# Interpolation function for determining a 2-D slice of 3-D data
def Interpolate(X1, X2, Y1, Y2, X3):
    Slope = (Y2 - Y1) / (X2 - X1)
    Y3 = (X3 - X1) * Slope + Y1
    return Y3

# This represents the location on the Z-axis where a slice is taken
Slice_Location = 0.0
XVal = []; YVal = []; ZVal = []
Tally = []; Error = []
counter = 1          # integer index (a float here would break array indexing)
length = len(XArray) - 1
for numbers in range(length):
    # - If data falls on the selected plane location then use existing data
    if ZArray[counter] == Slice_Location:
        XVal.append(XArray[counter])
        YVal.append(YArray[counter])
        ZVal.append(ZArray[counter])
        Tally.append(float(Magnitude[counter]))
    # - If existing data does not exist on the selected plane then interpolate
    if ZArray[counter-1] < Slice_Location and ZArray[counter] > Slice_Location:
        XVal.append(XArray[counter])
        YVal.append(YArray[counter])
        ZVal.append(Slice_Location)
        Value = Interpolate(ZArray[counter-1], ZArray[counter],
                            Magnitude[counter-1], Magnitude[counter],
                            Slice_Location)
        Tally.append(float(Value))
    counter += 1
XVal = np.array(XVal); YVal = np.array(YVal); ZVal = np.array(ZVal)
Tally = np.array(Tally)
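The loop above can be replaced with boolean masks and array arithmetic. The sketch below follows the same logic (exact matches kept as-is, sign changes between consecutive z values interpolated linearly); the output ordering differs from the loop version (exact hits first, then crossings), which does not matter for a scatter/heat-map plot:

```python
import numpy as np

def slice_plane(XArray, YArray, ZArray, Magnitude, loc):
    # Points that fall exactly on the slice plane
    exact = ZArray == loc
    # Crossings: ZArray[i-1] < loc < ZArray[i], found without a Python loop
    cross = (ZArray[:-1] < loc) & (ZArray[1:] > loc)
    i = np.flatnonzero(cross) + 1            # index of the upper point
    # Linear interpolation of the magnitude at each crossing
    t = (loc - ZArray[i - 1]) / (ZArray[i] - ZArray[i - 1])
    interp = Magnitude[i - 1] + t * (Magnitude[i] - Magnitude[i - 1])
    XVal = np.concatenate([XArray[exact], XArray[i]])
    YVal = np.concatenate([YArray[exact], YArray[i]])
    Tally = np.concatenate([Magnitude[exact], interp])
    return XVal, YVal, Tally

# Tiny made-up check: one exact hit (z = 0 at index 2) and one crossing
Xd = np.array([0.0, 1.0, 2.0, 3.0])
Yd = np.zeros(4)
Zd = np.array([-1.0, 1.0, 0.0, 2.0])
Md = np.array([0.0, 10.0, 5.0, 7.0])
xs, ys, tally = slice_plane(Xd, Yd, Zd, Md, 0.0)
```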

From 1D graph to 2D mask

I have calculated the boundaries in which I want to sample points.
For example, one dataset looks like:
Now I want to find points in the red area, which I do in the following way:
The plot consists of 10 lines, so I reshape to get the region limits per value of x.
limits = data.reshape(data.shape + (5, 2))
Now, for a particular value of x, the data looks like:
limits[20] = array([[ 5.65624197, 6.70331962],
[ 13.68248989, 14.77227669],
[ 15.50973796, 16.61491606],
[ 24.03948128, 25.14907398],
[ 26.41541777, 27.53475798]])
I thought to make a mesh and mask the area as follows:
X, Y = np.meshgrid(xs, ys)
bool_array = np.zeros(Y.shape, dtype=bool)
for j, y in enumerate(limits):
    for min_y, max_y in y:
        inds = np.where(np.logical_and(ys >= min_y, ys <= max_y))[0]
        bool_array[inds, j] = True
plt.imshow(bool_array[::-1])
(I don't know why the graph needs to be plotted inverted.)
results in
which is indeed the data I'm looking for; now I can use the True values to take points with a different function.
The problem is that this code is very slow, and my datasets will get much bigger.
I would like to find a more efficient way of finding this "mask".
I tried several things and ended up with the following result which worked for my simple cases
low_bound = limits[:,:,0]
upp_bound = limits[:,:,1]
mask = np.any((low_bound[:,None,:] <= Y.T[:,:,None]) & ( Y.T[:,:,None] <= upp_bound[:,None,:]),axis=-1).T
I know it looks ugly. What I do is introduce an additional dimension in which I check whether each y value lies between a pair of endpoints; at the end I collapse the additional dimension using np.any.
I don't know how much faster it is compared to your code. However, given that I don't use a single for loop there should be a performance boost.
Check the code with your data and tell me if something goes wrong.
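A tiny self-contained check of the broadcasting trick, with made-up limits (two x positions, one band per x) instead of the real data:

```python
import numpy as np

xs = np.arange(2)                    # 2 x positions
ys = np.arange(5)                    # 5 y positions
X, Y = np.meshgrid(xs, ys)           # Y has shape (5, 2)

# shape (n_x, n_bands, 2): one made-up (min, max) band per x position
limits = np.array([[[0.5, 1.5]],
                   [[2.5, 3.5]]])

low_bound = limits[:, :, 0]          # (n_x, n_bands)
upp_bound = limits[:, :, 1]
mask = np.any((low_bound[:, None, :] <= Y.T[:, :, None]) &
              (Y.T[:, :, None] <= upp_bound[:, None, :]), axis=-1).T
# mask has the same (n_y, n_x) shape as Y; only y=1 at x=0
# and y=3 at x=1 fall inside their bands
```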
Edit:
plt.imshow plots (0,0) in the lower left edge when you use
plt.imshow(mask,origin='lower')

trouble with performing coordinate map/interpolation with interp2d

I have what is essentially a 4 column lookup table: cols 1, 2 are the respective xi,yj coordinates which map to x'i, y'j coordinates in the respective 3rd and 4th cols.
My goal is to provide a method to enter some (xnew,ynew) position within the range of my look-up values in the 1st and 2nd columns(xi,yj) then map that position to an interpolated (x'i,y'j) from the range of positions in the 3rd and 4th cols of the lut.
I have tried using interp2d, but have not been able to figure out how to enter the arrays in the proper format. For example, I don't understand why scipy.interpolate.interp2d(x'i, y'j, [xi,yj], kind='linear') gives me the following error:
ValueError: Invalid length for input z for non rectangular grid'.
This seems so simple, but I have not been able to figure it out. I will gladly provide more information if required.
interp2d requires that the interpolated function be 1D, see the docs:
z : 1-D ndarray The values of the function to interpolate at the data
points. If z is a multi-dimensional array, it is flattened before use.
So when you enter [xi,yj], it gets converted from its (2, n) shape to (2*n,), hence the error.
You can get around this by setting up two different interpolating functions, one for each coordinate. If your lut is a single array of shape (n, 4), you would do something like (note that lut[:, 0] is the first column, whereas lut[0] would be the first row):
x_interp = scipy.interpolate.interp2d(lut[:, 0], lut[:, 1], lut[:, 2], kind='linear')
y_interp = scipy.interpolate.interp2d(lut[:, 0], lut[:, 1], lut[:, 3], kind='linear')
And you can now do things like:
new_x, new_y = x_interp(x, y), y_interp(x, y)
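On recent SciPy versions interp2d is deprecated; a robust alternative for this kind of scattered coordinate mapping is scipy.interpolate.griddata, interpolating each output coordinate separately. A sketch with a made-up affine lookup table (x' = 2x + 1, y' = 3y) standing in for the real lut:

```python
import numpy as np
from scipy.interpolate import griddata

# Made-up 4-column LUT on a small grid: (x, y) -> (x', y')
gx, gy = np.meshgrid(np.arange(4.0), np.arange(4.0))
lut = np.column_stack([gx.ravel(), gy.ravel(),
                       2 * gx.ravel() + 1,     # x' = 2x + 1
                       3 * gy.ravel()])        # y' = 3y

points = lut[:, :2]
query = np.array([[1.5, 1.5]])                 # some (xnew, ynew)
new_x = griddata(points, lut[:, 2], query, method='linear')
new_y = griddata(points, lut[:, 3], query, method='linear')
```

Linear interpolation reproduces an affine mapping exactly at interior query points, which makes the sketch easy to sanity-check.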
