Finding nearest xy-point in numpy array and second nearest with condition - python

My problem is like the problem in the thread Finding index of nearest point in numpy arrays of x and y coordinates, but it's extended:
For better visualization, here is an image (manipulated; original by 112BKS, own work, original graph/data from [.. ? ..], CC BY-SA 3.0):
On the one hand there is an array datafield: a numpy array with elements [value x y]. Those are the thin blue lines with the numbers (the numbers are the values). On the other hand there is the array orangeline, a numpy array with elements [x y].
What I want to do is calculate the value of every element in orangeline. I visualized one concrete element of orangeline with the green circle. Its value can be interpolated from the two elements of datafield visualized with the triangles. As a result I get a value between 225 and 230 for the green circle.
First step: find for every element in orangeline the closest element in datafield (in the example that is the pink triangle).
Second step: find for every element in orangeline the closest element in datafield with a value different from the one found in the first step (in the example that is the brown triangle).
Third step: interpolate the value for every element in orangeline from those two values and the distances to the corresponding elements.
First step can be solved with
mytree = scipy.spatial.cKDTree(datafield[:, 1:3])
dist1, indexes1 = mytree.query(orangeline)
But now I don't know how to filter the datafield for the second step. Is there a solution?
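For illustration, one possible direct sketch of steps 2 and 3 (just a sketch, assuming the cKDTree from the first step; k=10 is an arbitrary choice and may need to be larger if many neighbouring points share the same value), not the grid-based solution below:
import numpy as np
import scipy.spatial

mytree = scipy.spatial.cKDTree(datafield[:, 1:3])
dists, idxs = mytree.query(orangeline, k=10)            # several neighbours per point
values = np.empty(len(orangeline))
for i, (d_row, i_row) in enumerate(zip(dists, idxs)):
    v_near = datafield[i_row[0], 0]                     # step 1: value of the nearest element
    # step 2: first neighbour with a different value (raises StopIteration if k is too small)
    j = next(j for j in range(1, len(i_row)) if datafield[i_row[j], 0] != v_near)
    v_other, d_near, d_other = datafield[i_row[j], 0], d_row[0], d_row[j]
    # step 3: inverse-distance weighting between the two values
    values[i] = (v_near * d_other + v_other * d_near) / (d_near + d_other)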

With help from @unutbu's comment I found this solution, which works quite well, also in the cases where the orangeline does not pass through the field.
Here are the functions for the grid:
import matplotlib.mlab as mlab
import numpy as np
import scipy.spatial

def define_grid(rawdata):
    xmin, xmax = np.amin(rawdata[:, 1]), np.amax(rawdata[:, 1])
    ymin, ymax = np.amin(rawdata[:, 2]), np.amax(rawdata[:, 2])
    x, y, z = rawdata[:, 1], rawdata[:, 2], rawdata[:, 0]
    # Size of regular grid
    ny, nx = int(ymax - ymin), int(xmax - xmin)
    # Generate a regular grid to interpolate the data.
    xi = np.linspace(xmin, xmax, nx)
    yi = np.linspace(ymin, ymax, ny)
    xi, yi = np.meshgrid(xi, yi)
    # Interpolate using Delaunay triangularization
    zi = mlab.griddata(x, y, z, xi, yi)
    return xi, yi, zi
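Note that matplotlib.mlab.griddata has since been removed from matplotlib. As a hedged alternative (an assumption on my part, not part of the original solution), roughly the same grid can be built with scipy.interpolate.griddata:
import numpy as np
from scipy.interpolate import griddata

def define_grid_scipy(rawdata):
    # hypothetical drop-in replacement for define_grid, using scipy instead of mlab
    x, y, z = rawdata[:, 1], rawdata[:, 2], rawdata[:, 0]
    xi = np.linspace(x.min(), x.max(), int(x.max() - x.min()))
    yi = np.linspace(y.min(), y.max(), int(y.max() - y.min()))
    xi, yi = np.meshgrid(xi, yi)
    # linear interpolation on a Delaunay triangulation; NaN outside the convex hull
    zi = griddata((x, y), z, (xi, yi), method='linear')
    return xi, yi, zi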
def grid_as_array(xi, yi, zi):
    xi_flat, yi_flat, zi_flat = np.ravel(xi), np.ravel(yi), np.ravel(zi)
    # reduce arrays for faster calculation, take only every second element
    xi_red, yi_red, zi_red = xi_flat[1::2], yi_flat[1::2], zi_flat[1::2]
    # stack to an array with elements [x y z]; some z values are 'nan'
    xyz_with_nan = np.hstack((xi_red[:, np.newaxis], yi_red[:, np.newaxis],
                              zi_red[:, np.newaxis]))
    # sort out the elements with 'nan'
    xyz = xyz_with_nan[~np.isnan(xyz_with_nan).any(axis=1)]
    return xyz
Another function to find the closest point from the grid for the values from orangeline:
def closest_node(points, datafield):
    mytree = scipy.spatial.cKDTree(datafield)
    dist, indexes = mytree.query(points)
    return indexes
And now the code:
# use function to create from the raw data an interpolated datafield
xi, yi, zi = define_grid(datafield)
# rearrange those values to bring them in the form of an array with [x y z]
xyz = grid_as_array(xi, yi, zi)
# search closest values from grid for the points of the orangeline
# orangeline_xy is the array with elements [x y]
indexes = closest_node(orangeline_xy, xyz[:, 0:2])
# take z values from the grid which we found before
orangeline_z = xyz[indexes, 2]
# add those z values to the points of the orangeline
orangeline_xyz = np.hstack((orangeline_xy,orangeline_z[:, np.newaxis]))


Problem using scipy.spatial.cKDTree / query_ball_point with random 2D points and nearest neighbors between two datasets

I have P_0 points spread randomly in a 2d box. Then I divide them in two groups S and I. If some points of S come too close to I, they are deleted from the S group and added to the I group. The problem I am facing is that sometimes they are not correctly deleted from S, but they are properly added to I. Hence, the total number of points keeps erroneously growing.
Here is the code:
from scipy.spatial import cKDTree
import numpy as np
import matplotlib.pyplot as plt
P_0 = 100 # initial susceptible population
# dimensions of box
Lx = 5.0
Ly = 5.0
# generate P_0 random points inside box
X = np.random.uniform(0, Lx, P_0)
Y = np.random.uniform(0, Ly, P_0)
pts = np.column_stack((X, Y)) # array of 2d points
S = np.arange(10, P_0) # indices of the susceptible
I = np.arange(10) # indices of the infected
# Divide points into infected and susceptible groups
r_I = pts[I]
r_S = pts[S]
tree = cKDTree(r_S)
# idx represents the indices to points in r_S which are closer than r to
# points in r_I
idx = tree.query_ball_point(r_I, r=0.4)
idx = np.hstack(idx) # flatten the lists into one numpy array
idx = idx.astype(int) # Make sure idx indices have int type
print(idx)
# plot points
plt.figure()
plt.plot (r_S[:, 0], r_S[:, 1], 'bo') # plot all r_S points
plt.plot (r_S[idx, 0], r_S[idx, 1], 'ko') # color those points nearest to r_I
plt.plot (r_I[:, 0], r_I[:, 1], 'ro') # identify the r_I points
print(len(S), len(I), len(S) + len(I))
I = np.append(I, S[idx]) # add the closest points to I
S = np.delete(S, idx) # delete the closest points from S
# idx represents the indices to points in r_S which are closer than r to
# points in r_I
idx = tree.query_ball_point(r_I, r=0.4)
idx = np.hstack(idx) # flatten the lists into one numpy array
idx = idx.astype(int) # Make sure idx indices have int type
print(idx)
# plot points
plt.figure()
plt.plot (r_S[:, 0], r_S[:, 1], 'bo') # plot all r_S points
plt.plot (r_S[idx, 0], r_S[idx, 1], 'ko') # color those points nearest to r_I
plt.plot (r_I[:, 0], r_I[:, 1], 'ro') # identify the r_I points
print(len(S), len(I), len(S) + len(I))
I = np.append(I, S[idx]) # add the closest points to I
S = np.delete(S, idx) # delete the closest points from S
plt.figure('S group')
plt.plot (pts[S, 0], pts[S, 1], 'bo') # plot the updated r_S points
plt.figure('I group')
plt.plot (pts[I, 0], pts[I, 1], 'ro') # plot the updated r_I points
print(len(S), len(I), len(S) + len(I), len(idx))
plt.show()
So, I don't know why the points in r_S that are closer than r are sometimes not all deleted from S.
One might have to run the code a few times for the error to appear, or just increase P_0 to 1000, for example, or increase the value of r. It might be a problem with idx and the way I am using numpy delete.
You could double-check your assumption by swapping the two actions: do the deletion first and the addition second, and only add when the deletion succeeded, testing the deletion in a separate variable.
Compare the size after the deletion with the original size of the group. If the sizes match, no deletion occurred (for whatever reason), which is a signal not to add anything on the other side.
Then you could print the groups in that case and inspect the participating indexes to shed some light on the problem. A minimal sketch of this check follows below.
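A minimal sketch of that check (hypothetical, reusing the names from the question):
S_before = len(S)
S_new = np.delete(S, idx)                     # test the deletion in a separate variable first
if len(S_new) == S_before - len(idx):
    I = np.append(I, S[idx])                  # add to I only after the expected number was removed
    S = S_new
else:
    print("fewer points than expected were deleted - inspect idx for duplicates:", idx)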
As I commented, I just had to eliminate the duplicates in idx.
I added the line
idx = np.unique(idx)
just below idx = idx.astype(int)
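The reason is that a point in r_S can lie within r of several infected points, so its index shows up more than once in the flattened idx; np.append then adds it to I multiple times while np.delete removes it from S only once, and the total count grows. A sketch of the corrected block:
idx = tree.query_ball_point(r_I, r=0.4)
idx = np.hstack(idx)          # flatten the per-point lists into one numpy array
idx = idx.astype(int)         # make sure idx indices have int type
idx = np.unique(idx)          # drop duplicate indices before updating the groups
I = np.append(I, S[idx])      # add the closest points to I
S = np.delete(S, idx)         # delete the closest points from S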

quiver plot in Python using 2D arrays

Hi, I am trying to use quiver to create a vector field plot. Here is my logic and approach: I first create the x, y coordinates for position with np.arange, using a step size of 0.1. Then I mesh the grid for x, y. Then I import the x component of the function, Fx, and the y component, Fy, into Python as .dat files. The .dat files are each 2D arrays (just square matrices). I then use the quiver command with the meshed x, y coordinates and the Fx, Fy 2D arrays. However, the quiver plot output does not make much sense at all in terms of what I was expecting.
Is there a problem with my code that I am overlooking? Does np.arange work if the step size is not an integer amount? I printed out all the arrays to manually check the data and everything seems fine.
Could it be that my four 2D arrays do not all have the same shape? The two .dat files I import are each 40x40 square matrices. Not sure if this is matching up well with the grid I meshed.
Other than that, I am unsure as to what the issue is. Any help or suggestions would be greatly appreciated. I can add the data in my .dat file if that will help. Thanks! ( I have checked all other examples on stack overflow for this problem and it seems my code is logically correct so I am very stuck)
import numpy as np
import matplotlib.pyplot as plt
data = np.genfromtxt('file1.dat')
data2 = np.genfromtxt('file2.dat')
nx = 2
ny = 2
x=np.arange(-nx,nx,0.1)
y=np.arange(-ny,ny,0.1)
xi,yi=np.meshgrid(x,y)
Fx = data[::5] #picks out every 5 rows in the matrix
Fy = data2[::5]
#print(Fx)
#print(Fy)
#print(xi)
#print(yi)
plt.axes([0.065, 0.065, 0.9, 0.9])
plt.quiver(xi,yi,Fx,Fy, alpha=.5)
plt.quiver(xi,yi,Fx,Fy, edgecolor='k',facecolor='none', linewidth=.5)
plt.show()
EDIT: .dat files below, as asked. If there is a better way to share the .dat files, let me know; I realize this is a lot of numbers and formatted horribly. Fx is listed first, then the Fy array. I am expecting a very nice quiver plot with some kind of circular pattern/flow: the arrows should all form a clockwise and/or counter-clockwise flow.
-30.9032192 0.512708426 0.511251688 0.508112907 0.503038108 0.495766401 0.486015081 0.473499298 0.457935333 0.439051390 0.416606665 0.390406251 0.360321403 0.326310992 0.288441181 0.246901810 0.202013552 0.154238343 0.104165822 5.24933599E-02 0.00000000 -5.24933599E-02 -0.104165822 -0.154238343 -0.202013552 -0.246901810 -0.288441181 -0.326310992 -0.360321403 -0.390406251 -0.416606665 -0.439051390 -0.457935333 -0.473499298 -0.486015081 -0.495766401 -0.503038108 -0.508112907 -0.511251688 -0.512708426 30.9032192
0.640149713 0.648661256 0.646115780 0.638335168 -13.4731970 -13.0613079 0.587181866 0.561966598 0.533295572 0.501472771 0.466741979 0.429292738 0.389282435 0.346857786 0.302170664 0.255400449 0.206771404 0.156560570 0.105099753 5.27719632E-02 2.10129052E-08 -5.27718328E-02 -0.105099864 -0.156560570 -0.206771582 -0.255400449 -0.302170008 -0.346857607 -0.389282405 -0.429292321 -0.466741502 -0.501472294 -0.533295095 -0.561966538 -0.587181747 13.0613060 13.4731960 -0.638335109 -0.646115661 -0.648661256 -0.640149713
0.799892545 0.824215114 0.801061392 0.776797950 0.753669202 0.730814993 0.707295001 0.682291210 0.655105412 -8.68122292 -8.12608242 0.554765701 0.513439834 0.467435867 0.416336209 0.359773695 0.297508597 0.229575798 0.156477526 7.93530941E-02 6.53175791E-10 -7.93530941E-02 -0.156477645 -0.229576021 -0.297508597 -0.359773695 -0.416336179 -0.467435598 -0.513440192 -0.554765582 8.12608242 8.68122387 -0.655105233 -0.682291508 -0.707294881 -0.730815291 -0.753669143 -0.776797950 -0.801061392 -0.824215114 -0.799892545
0.940612555 0.983826339 0.933131218 0.884394646 0.842061043 0.804476202 0.769944012 0.737089813 0.704840183 0.672395170 0.639202237 0.604933023 0.569452882 0.532750905 0.494812310 -2.68859553 -2.16188312 0.365726620 0.304749787 0.205249593 6.78142031E-09 -0.205249622 -0.304749817 -0.365726680 2.16188359 2.68859553 -0.494812399 -0.532750905 -0.569453001 -0.604932964 -0.639202118 -0.672395170 -0.704840362 -0.737089515 -0.769943893 -0.804476202 -0.842061162 -0.884394407 -0.933131695 -0.983826339 -0.940612555
0.999167860 1.05166125 0.986028075 0.923735499 0.870001256 0.822448075 0.778727889 0.736939847 0.695574820 0.653458953 0.609715879 0.563743949 0.515199065 0.463976830 0.410177410 0.354019582 0.295616359 0.234412342 0.167968050 9.07804966E-02 -8.54922577E-10 -9.07804891E-02 -0.167968005 -0.234412268 -0.295616418 -0.354019672 -0.410177410 -0.463976830 -0.515199006 -0.563743949 -0.609715819 -0.653458893 -0.695574880 -0.736939907 -0.778727889 -0.822448075 -0.870001316 -0.923735559 -0.986028075 -1.05166125 -0.999167860
0.940612555 0.983826339 0.932870448 0.884094179 0.841758013 0.804004610 0.768958390 0.735091329 0.701199591 0.666386902 0.630052805 0.591893077 0.551910400 0.510422051 0.468044579 0.425626040 0.384017974 0.343483299 0.302600116 -0.377980769 8.43500270E-10 0.377980769 -0.302600116 -0.343483359 -0.384017944 -0.425625950 -0.468044549 -0.510422230 -0.551910520 -0.591892898 -0.630052805 -0.666386902 -0.701199770 -0.735090971 -0.768958986 -0.804005086 -0.841758251 -0.884094059 -0.932870448 -0.983826339 -0.940612555
0.799892545 0.824215114 0.807587028 0.790868759 0.775763810 0.761242151 0.746228993 0.729784787 0.711097538 0.689466000 0.664264023 -6.33222771 -5.70436525 0.561126649 0.514991641 0.460934460 0.396892428 0.320130050 0.227872163 0.119494393 -1.02303694E-08 -0.119494416 -0.227872089 -0.320129842 -0.396892160 -0.460934043 -0.514991641 -0.561126769 5.70436525 6.33222771 -0.664264023 -0.689466000 -0.711097836 -0.729784369 -0.746228993 -0.761242330 -0.775764227 -0.790868759 -0.807587445 -0.824215114 -0.799892545
0.640149713 0.648661256 0.658376634 0.663496077 0.663335323 -12.7135134 -12.2490902 0.630356669 0.608760655 0.581994295 0.550120413 0.513214111 0.471384048 0.424800932 0.373717010 0.318486720 0.259573966 0.197552294 0.133099481 6.69753179E-02 -1.07370708E-08 -6.69753179E-02 -0.133099481 -0.197552368 -0.259573698 -0.318486512 -0.373717397 -0.424800485 -0.471384078 -0.513214111 -0.550120771 -0.581994355 -0.608760655 -0.630356669 12.2490902 12.7135134 -0.663335383 -0.663496077 -0.658376753 -0.648661256 -0.640149713
-30.9032192 0.512708426 0.511251688 0.508112907 0.503038108 0.495766401 0.486015081 0.473499298 0.457935333 0.439051390 0.416606665 0.390406251 0.360321403 0.326310992 0.288441181 0.246901810 0.202013552 0.154238343 0.104165822 5.24933599E-02 0.00000000 -5.24933599E-02 -0.104165822 -0.154238343 -0.202013552 -0.246901810 -0.288441181 -0.326310992 -0.360321403 -0.390406251 -0.416606665 -0.439051390 -0.457935333 -0.473499298 -0.486015081 -0.495766401 -0.503038108 -0.508112907 -0.511251688 -0.512708426 30.9032192
Now Fy array:
-0.205083355 -0.525830388 -0.552687049 -0.580741763 -0.609929502 -0.640149713 -0.671258569 -0.703064799 -0.735320449 -0.767719150 -0.799892545 -0.831412077 -0.861791074 -0.890495777 -0.916961849 -0.940612555 -0.960886896 -0.977269113 -0.989315629 -0.996686459 -0.999167860 -0.996686459 -0.989315629 -0.977269113 -0.960886896 -0.940612555 -0.916961849 -0.890495777 -0.861791074 -0.831412077 -0.799892545 -0.767719150 -0.735320449 -0.703064799 -0.671258569 -0.640149713 -0.609929502 -0.580741763 -0.552687049 -0.525830388 -0.205083355
-0.495766401 -0.496165156 -0.509083092 -0.549605310 13.5129404 13.0519953 -0.646288395 -0.672055602 -0.695797563 -0.717920899 -0.738660455 -0.758110344 -0.776252687 -0.792979062 -0.808119476 -0.821464479 -0.832787395 -0.841867268 -0.848508835 -0.852558434 -0.853919387 -0.852558374 -0.848508716 -0.841867328 -0.832787514 -0.821464896 -0.808119833 -0.792978704 -0.776252151 -0.758110642 -0.738660395 -0.717920780 -0.695797503 -0.672055602 -0.646288335 13.0519953 13.5129395 -0.549605191 -0.509083092 -0.496165156 -0.495766401
-0.416606665 -0.387658477 -0.370003909 -0.412325561 -0.451486528 -0.484789789 -0.512974977 -0.536900580 -0.557342112 8.73137856 8.12754345 -0.604040861 -0.616312325 -0.627466083 -0.637651145 -0.646887839 -0.655064702 -0.661947429 -0.667217672 -0.670547307 -0.671688557 -0.670547426 -0.667217493 -0.661947429 -0.655064702 -0.646887779 -0.637651086 -0.627466381 -0.616312623 -0.604041040 8.12754345 8.73137951 -0.557341993 -0.536900103 -0.512975276 -0.484789670 -0.451485991 -0.412325561 -0.370003909 -0.387658477 -0.416606665
-0.246901810 -0.228335708 -0.217398927 -0.246074528 -0.271431714 -0.291785061 -0.307664692 -0.319617361 -0.328106791 -0.333535194 -0.336277753 -0.336733580 -0.335400879 -0.333002120 -0.330682963 2.81363893 2.24033999 -0.348281264 -0.372185618 -0.395866930 -0.403591305 -0.395866960 -0.372185677 -0.348281264 2.24033999 2.81363893 -0.330682874 -0.333002120 -0.335400909 -0.336733490 -0.336277664 -0.333535045 -0.328106642 -0.319617361 -0.307664692 -0.291785270 -0.271431714 -0.246074289 -0.217398927 -0.228335708 -0.246901810
0.00000000 -3.97699699E-02 -8.22334886E-02 -9.01840925E-02 -9.43243951E-02 -9.68469381E-02 -9.79287177E-02 -9.75681171E-02 -9.57226083E-02 -9.23085213E-02 -8.71856511E-02 -8.01347122E-02 -7.08276853E-02 -5.87978214E-02 -4.34263758E-02 -2.40071025E-02 -4.12676527E-05 2.79203784E-02 5.66387177E-02 7.90976062E-02 8.76100808E-02 7.90975988E-02 5.66387326E-02 2.79204026E-02 -4.12871887E-05 -2.40071043E-02 -4.34263758E-02 -5.87978400E-02 -7.08276406E-02 -8.01346377E-02 -8.71856511E-02 -9.23085883E-02 -9.57226381E-02 -9.75680798E-02 -9.79286432E-02 -9.68469679E-02 -9.43244398E-02 -9.01841149E-02 -8.22335258E-02 -3.97699960E-02 0.00000000
0.246901810 0.149554759 5.41899577E-02 6.69130459E-02 8.30149651E-02 9.62892994E-02 0.106718197 0.114569001 0.119987577 0.122970015 0.123354375 0.120809816 0.114815064 0.104622498 8.91864598E-02 6.69886991E-02 3.55363674E-02 -1.02187870E-02 -8.21609423E-02 -0.177876130 -0.191068053 -0.177876085 -8.21608678E-02 -1.02187609E-02 3.55363339E-02 6.69886544E-02 8.91865119E-02 0.104622573 0.114814982 0.120810024 0.123354279 0.122969493 0.119987287 0.114568666 0.106718197 9.62890834E-02 8.30147490E-02 6.69130459E-02 5.41902333E-02 0.149555355 0.246901810
0.416606665 0.324635506 0.239433557 0.271107137 0.304715306 0.333829224 0.358776420 0.380251735 0.398895025 0.415270001 0.429880798 -6.52393579 -5.84947205 0.467720896 0.479777455 0.492111117 0.504699171 0.516976655 0.527697802 0.535157621 0.537844956 0.535157681 0.527697802 0.516976714 0.504699290 0.492111027 0.479777277 0.467720628 -5.84947205 -6.52393579 0.429880500 0.415270001 0.398895413 0.380252063 0.358776003 0.333829224 0.304715246 0.271106362 0.239433587 0.324635804 0.416606665
0.495766401 0.468931794 0.452914894 0.491556555 0.528390408 -12.8101072 -12.3052654 0.617275119 0.641844690 0.664552093 0.685565233 0.704941750 0.722658634 0.738638997 0.752775729 0.764953554 0.775063336 0.783014059 0.788738489 0.792190075 0.793342948 0.792190075 0.788738668 0.783013999 0.775063157 0.764953852 0.752775729 0.738638759 0.722658694 0.704941571 0.685565174 0.664552152 0.641844690 0.617275119 -12.3052645 -12.8101072 0.528390408 0.491556555 0.452914953 0.468931794 0.495766401
0.512708426 0.525830388 0.552687049 0.580741763 0.609929502 0.640149713 0.671258569 0.703064799 0.735320449 0.767719150 0.799892545 0.831412077 0.861791074 0.890495777 0.916961849 0.940612555 0.960886896 0.977269113 0.989315629 0.996686459 0.999167860 0.996686459 0.989315629 0.977269113 0.960886896 0.940612555 0.916961849 0.890495777 0.861791074 0.831412077 0.799892545 0.767719150 0.735320449 0.703064799 0.671258569 0.640149713 0.609929502 0.580741763 0.552687049 0.525830388 0.512708426
There appear to be unusually large values (perhaps an indication of an asymptotic singularity?) along the lines y=x and y=-x.
You can see this in the data you posted. Consider for example, the first line:
-31.3490391 6.68895245E-02 6.68859407E-02 ... -6.68895245E-02 31.3490391
The first value is large and negative, followed by numbers which are small and positive. Near the end of the line the numbers are small and negative, while the last value is large and positive. Clearly, as it stands, this data is not going to produce a smoothly varying quiver plot.
If we remove these unusually large values:
data[np.abs(data) > 1] = np.nan
data2[np.abs(data2) > 1] = np.nan
then
import numpy as np
import matplotlib.pyplot as plt
data = np.genfromtxt('file1.dat')
data2 = np.genfromtxt('file2.dat')
data[np.abs(data) > 1] = np.nan
data2[np.abs(data2) > 1] = np.nan
N = 10
Fx = data[::N, ::N]
Fy = data2[::N, ::N]
nrows, ncols = Fx.shape
nx = 2
ny = 2
x = np.linspace(-nx, nx, ncols)
y = np.linspace(-ny, ny, nrows)
xi, yi = np.meshgrid(x, y, indexing='ij')
plt.axes([0.065, 0.065, 0.9, 0.9])
plt.quiver(xi, yi, Fx, Fy, alpha=.5)
plt.quiver(xi, yi, Fx, Fy, edgecolor='k', facecolor='none', linewidth=.5)
plt.show()
yields
data is a 2D array of shape (301, 301):
In [109]: data.shape
Out[109]: (301, 301)
If we slice data using data[::10] then the result has shape
In [113]: data[::10].shape
Out[113]: (31, 301)
Notice that only the first axis gets sliced. To slice both the first and second axes, use data[::10, ::10]:
In [114]: data[::10, ::10].shape
Out[114]: (31, 31)
See the docs for more on multidimensional slicing.
Always pay attention to the shape of NumPy arrays. It is often the key to understanding NumPy operations.
Although plt.quiver can sometimes accept arrays of different shape,
it is easiest to use plt.quiver by passing four arrays of the same shape.
To ensure that xi, yi, Fx, Fy all have the same shape, slice data and data2 to form Fx and Fy first, and then build xi and yi to conform to the (common) shape of Fx and Fy:
nrows, ncols = Fx.shape
x = np.linspace(-nx, nx, ncols)
y = np.linspace(-ny, ny, nrows)
The third argument to np.linspace indicates the number of elements in the
returned array.
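As a quick side note on the np.arange question: np.arange does accept a non-integer step, but the number of points it yields is implicit (and the endpoint is excluded), so np.linspace is the safer way to match a known array shape. A small illustrative sketch:
import numpy as np

x1 = np.arange(-2, 2, 0.1)    # 40 points here; the count is implied by the step and 2.0 is excluded
x2 = np.linspace(-2, 2, 40)   # exactly 40 points, endpoints included, count given explicitly
print(x1.shape, x2.shape)     # (40,) (40,)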
First make sure the dimensions of Fx and Fy are the same to avoid any confusion. Then generate the grid space dimension based on the data dimension. You can use np.linspace instead of np.arange as:
x = np.linspace(-nx, nx, Fx.shape[1])
y = np.linspace(-ny, ny, Fx.shape[0])
Update:
The complete code looks like:
import numpy as np
import matplotlib.pyplot as plt
# Fxdata.dat contains the Fx data and Fydata.dat the Fy data, downloaded from the provided link
Fx = np.genfromtxt('Fxdata.dat')
Fy = np.genfromtxt('Fydata.dat')
# input data dimensions
num_Fx = Fx.shape[0] # number of lines for the data in file1.dat
length_Fx = Fx.shape[1] # length of each row for file1.dat
nx = 15
ny = 15
# you can generate the grid points based on the dimensions of the input data
x = np.linspace(-nx, nx, length_Fx)
y = np.linspace(-ny, ny, num_Fx)
# grid points
xi,yi=np.meshgrid(x,y)
#
plt.axes([0.065, 0.065, 0.9, 0.9])
plt.quiver(xi,yi,Fx,Fy, alpha=.5)
#plt.quiver(xi,yi,Fx,Fy, edgecolor='k',facecolor='none', linewidth=.5)
plt.show()
Not sure if it makes sense now, but the resulting plot looks like this:

Correct usage of scipy.interpolate.RegularGridInterpolator

I am a little confused by the documentation for scipy.interpolate.RegularGridInterpolator.
Say for instance I have a function f: R^3 => R which is sampled on the vertices of the unit cube. I would like to interpolate so as to find values inside the cube.
import numpy as np
# Grid points / sample locations
X = np.array([[0,0,0], [0,0,1], [0,1,0], [0,1,1], [1,0,0], [1,0,1], [1,1,0], [1,1,1.]])
# Function values at the grid points
F = np.random.rand(8)
Now, RegularGridInterpolator takes a points argument, and a values argument.
points : tuple of ndarray of float, with shapes (m1, ), ..., (mn, )
The points defining the regular grid in n dimensions.
values : array_like, shape (m1, ..., mn, ...)
The data on the regular grid in n dimensions.
I interpret this as meaning I can call it like this:
import scipy.interpolate as irp
rgi = irp.RegularGridInterpolator(X, F)
However, when I do so, I get the following error:
ValueError: There are 8 point arrays, but values has 1 dimensions
What am I misinterpreting in the docs?
OK, I feel silly answering my own question, but I found my mistake with help from the documentation of the original regulargrid lib:
https://github.com/JohannesBuchner/regulargrid
points should be a list of arrays that specifies how the points are spaced along each axis.
For example, to take the unit cube as above, I should set:
pts = ( np.array([0,1.]), )*3
or if I had data which was sampled at higher resolution along the last axis, I might set:
pts = ( np.array([0,1.]), np.array([0,1.]), np.array([0,0.5,1.]) )
Finally, values has to be of shape corresponding to the grid laid out implicitly by points. For example,
val_size = [q.shape[0] for q in pts]
vals = np.zeros(val_size)
# make an arbitrary function to test:
func = lambda pt: (pt**2).sum()
# collect func's values at grid pts
for i in range(pts[0].shape[0]):
    for j in range(pts[1].shape[0]):
        for k in range(pts[2].shape[0]):
            vals[i, j, k] = func(np.array([pts[0][i], pts[1][j], pts[2][k]]))
So finally,
rgi = irp.RegularGridInterpolator(points=pts, values=vals)
runs and performs as desired.
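For instance, a quick hedged check at an interior point (the exact number is just the multilinear approximation of func there):
print(rgi(np.array([0.5, 0.5, 0.25])))   # interpolated approximation of func at (0.5, 0.5, 0.25)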
Your answer is nicer, and it's perfectly OK for you to accept it. I'm just adding this as an "alternate" way to script it.
import numpy as np
import scipy.interpolate as spint
RGI = spint.RegularGridInterpolator
x = np.linspace(0, 1, 3) # or 0.5*np.arange(3.) works too
# populate the 3D array of values (re-using x because lazy)
X, Y, Z = np.meshgrid(x, x, x, indexing='ij')
vals = np.sin(X) + np.cos(Y) + np.tan(Z)
# make the interpolator, (list of 1D axes, values at all points)
rgi = RGI(points=[x, x, x], values=vals) # can also be [x]*3 or (x,)*3
tst = (0.47, 0.49, 0.53)
print(rgi(tst))
print(np.sin(tst[0]) + np.cos(tst[1]) + np.tan(tst[2]))
returns:
1.93765972087
1.92113615659
The small difference is expected: with only three sample points per axis, the piecewise-linear interpolator can only approximate the smooth test function.

Correspondence between a "ij" meshgrid and a long meshgrid

Consider a matrix Z that contains grid-based results for z = z(a,m,e). Z has shape (len(aGrid), len(mGrid), len(eGrid)). Z[0,1,2] contains z(a=aGrid[0], m=mGrid[1], e=eGrid[2]). However, we may have removed some elements of the state space from the object (for example and simplicity, all (a,m,e) with a > 3). Say the size of the valid state space is x.
I was given code to transform this object into an object Z2 of shape (x, 3). Row i of Z2 corresponds to one element of the state space: (aGrid[a[i]], mGrid[m[i]], eGrid[e[i]]).
# first create Z, a mesh grid based matrix that has some invalid states (we set them to NaN)
aGrid = np.arange(0, 10, dtype=float)
mGrid = np.arange(100, 110, dtype=float)
eGrid = np.arange(1000, 1200, dtype=float)
A,M,E = np.meshgrid(aGrid, mGrid, eGrid, indexing='ij')
Z = A
Z[Z > 3] = np.NaN #remove some states from being "allowed"
# now, translate them from shape (len(aGrid), len(mGrid), len(eGrid)) to shape (x, 3)
grids = [A,M,E]
grid_bc = np.broadcast_arrays(*grids)
Z2 = np.column_stack([g.ravel() for g in grid_bc])
Z2[np.isnan(Z.ravel())] = np.nan
Z3 = Z2[~np.isnan(Z2)]
Through some computation, I then get a matrix V4 that has the shape of Z3 but contains 4 columns.
I am given
Z2 (as above)
Z3 (as above)
V4, which is a matrix of shape (Z3.shape[0], Z3.shape[1]+1): Z3 with an additional column appended
(if necessary, I still have access to the grids A, M, E)
and I need to recreate
V, the matrix that contains the values (of the last column) of V4, but transformed back to the shape of Z.
That is, if there is a row in V4 that reads (aGrid[0], mGrid[1], eGrid[2], v1), then the value of V at V[0,1,2] is v1, and so on for all rows in V4.
Efficiency is key.
Given your original problem conditions, recreated as follows, modified so that Z is a copy of A rather than a reference to it:
aGrid = np.arange(0, 10, dtype=float)
mGrid = np.arange(100, 110, dtype=float)
eGrid = np.arange(1000, 1200, dtype=float)
A,M,E = np.meshgrid(aGrid, mGrid, eGrid, indexing='ij')
Z = A.copy()
Z[Z > 3] = np.NaN
grids = [A,M,E]
grid_bc = np.broadcast_arrays(*grids)
Z2 = np.column_stack([g.ravel() for g in grid_bc])
Z2[np.isnan(Z.ravel())] = np.nan
Z3 = Z2[~np.isnan(Z2)]
A function can be defined as follows to recreate a dense N-D matrix from a sparse 2D matrix of shape (number of data points, number of dimensions + 1). The first argument of the function is the aforementioned 2D matrix; the remaining (optional) arguments are the grid coordinates for each dimension:
import numpy as np

def map_array_to_index(uniq_arr):
    return np.vectorize(dict(map(reversed, enumerate(uniq_arr))).__getitem__)

def recreate(arr, *coord_arrays):
    if len(coord_arrays) != arr.shape[1] - 1:
        coord_arrays = [np.unique(col) for col in arr.T[0:-1]]
    lookups = [map_array_to_index(c) for c in coord_arrays]
    new_array = np.nan * np.ones([len(c) for c in coord_arrays])
    new_array[tuple(l(c) for c, l in zip(arr.T[0:-1], lookups))] = arr[:, -1]
    new_grids = np.meshgrid(*coord_arrays, indexing='ij')
    return new_array, new_grids
Given a 2D matrix V4, defined above with values derived from Z,
V4 = np.column_stack([g.ravel() for g in grid_bc] + [Z.ravel()])
it is possible to recreate Z as follows:
V4_orig_form, V4_grids = recreate(V4, aGrid, mGrid, eGrid)
All non-NaN values correctly test for equality:
np.all(Z[~np.isnan(Z)] == V4_orig_form[~np.isnan(V4_orig_form)])
The function also works without aGrid, mGrid, eGrid passed in, but in this case it will not include any coordinate that is not present in the corresponding column of the input array.
So Z has the same shape as A, M, E; and Z2 has shape (Z.size, len(grids)) = (10*10*200, 3) in this case (if you do not filter out the NaN elements).
This is how you recreate your grids from the values of Z2:
grids = Z2.T
A,M,E = [g.reshape(A.shape) for g in grids]
Z = A # or whatever other calculation you need here
The only thing you need is the shape to which you want to go back. NaN will propagate to the final array.
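As a concrete sketch of that reverse step (an assumption on my part: V4 keeps one row per valid, non-NaN state, in the same ravel order as Z; if nothing was filtered out, a plain reshape of the last column is enough):
valid = ~np.isnan(Z.ravel())          # mask of the states that were kept
V_flat = np.full(Z.size, np.nan)      # start from an all-NaN flat array
V_flat[valid] = V4[:, -1]             # scatter the values back in ravel order
V = V_flat.reshape(Z.shape)           # same shape as Z (and A, M, E)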

How can an almost arbitrary plane in a 3D dataset be plotted by matplotlib?

Given an array containing 3D data, of shape e.g. (64,64,64), how do you plot a plane through this dataset defined by a point and a normal (similar to hkl planes in crystallography)?
Similar to what can be done in MayaVi by rotating a plane through the data.
The resulting plot will contain non-square planes in most cases.
Can those be done with matplotlib (some sort of non-rectangular patch)?
Edit: I almost solved this myself (see below) but still wonder how non-rectangular patches can be plotted in matplotlib...?
Edit: Due to discussions below I restated the question.
Funny, I replied to a similar question just today. The way to go is interpolation. You can use griddata from scipy.interpolate:
Griddata
This page features a very nice example, and the signature of the function is really close to your data.
You still have to somehow define the points on your plane for which you want to interpolate the data. I will have a look at this; my linear algebra lessons were a couple of years ago.
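A rough sketch of that idea (my own illustration, not a complete answer: nearest-neighbour griddata over the flattened volume, with the plane sampled via two in-plane unit vectors; all names here are hypothetical):
import numpy as np
from scipy.interpolate import griddata

A = np.random.rand(64, 64, 64)                             # stand-in for the data array
xx, yy, zz = np.mgrid[0:64, 0:64, 0:64]
points = np.column_stack((xx.ravel(), yy.ravel(), zz.ravel()))

n = np.array([-1.0, -1.0, 1.0]); n /= np.linalg.norm(n)    # plane normal
u = np.cross(n, [0.0, 0.0, 1.0]); u /= np.linalg.norm(u)   # pick another helper axis if n is parallel to z
v = np.cross(n, u)                                         # second in-plane unit vector
center = np.array([32.0, 32.0, 32.0])                      # a point on the plane

s, t = np.meshgrid(np.arange(-32, 32), np.arange(-32, 32))
plane_pts = center + s[..., None] * u + t[..., None] * v   # (64, 64, 3) sample points on the plane

vals = griddata(points, A.ravel(), plane_pts.reshape(-1, 3),
                method='nearest').reshape(s.shape)          # 'linear' also works but is much slower here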
I have the penultimate solution for this problem. Partially solved by using the second answer to Plot a plane based on a normal vector and a point in Matlab or matplotlib:
# coding: utf-8
import numpy as np
from matplotlib.pyplot import imshow, show

A = np.empty((64, 64, 64))  # This is the data array

def f(x, y):
    return np.sin(x / (2 * np.pi)) + np.cos(y / (2 * np.pi))

xx, yy = np.meshgrid(range(64), range(64))
for x in range(64):
    A[:, :, x] = f(xx, yy) * np.cos(x / np.pi)

N = np.zeros((64, 64))
"""This is the plane we cut from A.
It should be larger than 64, due to diagonal planes being larger.
Will be fixed."""

normal = np.array([-1, -1, 1])  # Define cut plane here. Normal vector components restricted to integers
point = np.array([0, 0, 0])
d = -np.sum(point * normal)

def plane(x, y):  # Get plane's z values
    # integer division so the result can be used directly as an index
    return (-normal[0] * x - normal[1] * y - d) // normal[2]

def getZZ(x, y):  # Get z for all values x, y. If z > 64 it's out of range
    for i in x:
        for j in y:
            if plane(i, j) < 64:
                N[i, j] = A[i, j, plane(i, j)]

getZZ(range(64), range(64))
imshow(N, interpolation="Nearest")
show()
It's not the ultimate solution, since the plot is not restricted to points that have a z value, planes larger than 64 x 64 are not accounted for, and the plane has to pass through (0,0,0).
For the reduced requirements, I prepared a simple example
import numpy as np
import pylab as plt

data = np.arange((64**3))
data.resize((64, 64, 64))

def get_slice(volume, orientation, index):
    orientation2slicefunc = {
        "x": lambda ar: ar[index, :, :],
        "y": lambda ar: ar[:, index, :],
        "z": lambda ar: ar[:, :, index],
    }
    return orientation2slicefunc[orientation](volume)

plt.subplot(221)
plt.imshow(get_slice(data, "x", 10), vmin=0, vmax=64**3)
plt.subplot(222)
plt.imshow(get_slice(data, "x", 39), vmin=0, vmax=64**3)
plt.subplot(223)
plt.imshow(get_slice(data, "y", 15), vmin=0, vmax=64**3)
plt.subplot(224)
plt.imshow(get_slice(data, "z", 25), vmin=0, vmax=64**3)
plt.show()
This leads to the following plot:
The main trick is the dictionary mapping orientations to lambda functions, which saves us from writing annoying if-then-else blocks. Of course you can decide to give the orientations different names, e.g. numbers.
Maybe this helps you.
Thorsten
P.S.: I didn't handle the index-out-of-range case; for me it's OK to let this exception pop up, since it is perfectly understandable in this context.
I had to do something similar for an MRI data enhancement:
Probably the code can be optimized, but it works as it is.
My data is a 3-dimensional numpy array representing an MRI scan. It has size [128,128,128], but the code can be modified to accept any dimensions. When the plane lies outside the cube boundary, you have to give a default value to the variable fill in the main function; in my case I chose data_cube[0:5,0:5,0:5].mean().
def create_normal_vector(x, y, z):
    normal = np.asarray([x, y, z])
    normal = normal / np.sqrt(sum(normal**2))
    return normal

def get_plane_equation_parameters(normal, point):
    a, b, c = normal
    d = np.dot(normal, point)
    return a, b, c, d  # ax+by+cz=d

def get_point_plane_proximity(plane, point):
    # just an approximation
    return np.dot(plane[0:-1], point) - plane[-1]

def get_corner_interesections(plane, cube_dim=128):  # to reduce the search space
    # dimension is 128,128,128
    corners_list = []
    only_x = np.zeros(4)
    min_prox_x = 9999
    min_prox_y = 9999
    min_prox_z = 9999
    min_prox_yz = 9999
    for i in range(cube_dim):
        temp_min_prox_x = abs(get_point_plane_proximity(plane, np.asarray([i, 0, 0])))
        # print("pseudo distance x: {0}, point: [{1},0,0]".format(temp_min_prox_x, i))
        if temp_min_prox_x < min_prox_x:
            min_prox_x = temp_min_prox_x
            corner_intersection_x = np.asarray([i, 0, 0])
            only_x[0] = i
        temp_min_prox_y = abs(get_point_plane_proximity(plane, np.asarray([i, cube_dim, 0])))
        # print("pseudo distance y: {0}, point: [{1},{2},0]".format(temp_min_prox_y, i, cube_dim))
        if temp_min_prox_y < min_prox_y:
            min_prox_y = temp_min_prox_y
            corner_intersection_y = np.asarray([i, cube_dim, 0])
            only_x[1] = i
        temp_min_prox_z = abs(get_point_plane_proximity(plane, np.asarray([i, 0, cube_dim])))
        # print("pseudo distance z: {0}, point: [{1},0,{2}]".format(temp_min_prox_z, i, cube_dim))
        if temp_min_prox_z < min_prox_z:
            min_prox_z = temp_min_prox_z
            corner_intersection_z = np.asarray([i, 0, cube_dim])
            only_x[2] = i
        temp_min_prox_yz = abs(get_point_plane_proximity(plane, np.asarray([i, cube_dim, cube_dim])))
        # print("pseudo distance yz: {0}, point: [{1},{2},{2}]".format(temp_min_prox_yz, i, cube_dim))
        if temp_min_prox_yz < min_prox_yz:
            min_prox_yz = temp_min_prox_yz
            corner_intersection_yz = np.asarray([i, cube_dim, cube_dim])
            only_x[3] = i
    corners_list.append(corner_intersection_x)
    corners_list.append(corner_intersection_y)
    corners_list.append(corner_intersection_z)
    corners_list.append(corner_intersection_yz)
    corners_list.append(only_x.min())
    corners_list.append(only_x.max())
    return corners_list

def get_points_intersection(plane, min_x, max_x, data_cube, shape=128):
    fill = data_cube[0:5, 0:5, 0:5].mean()  # this can be a parameter
    extended_data_cube = np.ones([shape + 2, shape, shape]) * fill
    extended_data_cube[1:shape + 1, :, :] = data_cube
    diag_image = np.zeros([shape, shape])
    min_x_value = 999999
    for i in range(shape):
        for j in range(shape):
            for k in range(int(min_x), int(max_x) + 1):
                current_value = abs(get_point_plane_proximity(plane, np.asarray([k, i, j])))
                # print("current_value:{0}, val: [{1},{2},{3}]".format(current_value, k, i, j))
                if current_value < min_x_value:
                    diag_image[i, j] = extended_data_cube[k, i, j]
                    min_x_value = current_value
            min_x_value = 999999
    return diag_image
The way it works is the following:
You create a normal vector, for example [5,0,3]:
normal1 = create_normal_vector(5, 0, 3)  # this just normalizes the vector
Then you create a point (my cube data shape is [128,128,128]):
point = [64,64,64]
You calculate the plane equation parameters [a,b,c,d], where ax+by+cz=d:
plane1 = get_plane_equation_parameters(normal1, point)
Then, to reduce the search space, you can calculate the intersection of the plane with the cube:
corners1 = get_corner_interesections(plane1, 128)
where corners1 = [intersection [x,0,0], intersection [x,128,0], intersection [x,0,128], intersection [x,128,128], min intersection [x,y,z], max intersection [x,y,z]]
With all these you can calculate the intersection between the cube and the plane:
image1 = get_points_intersection(plane1,corners1[-2],corners1[-1],data_cube)
Some examples:
normal is [1,0,0] point is [64,64,64]
normal is [5,1,0],[5,1,1],[5,0,1] point is [64,64,64]:
normal is [5,3,0],[5,3,3],[5,0,3] point is [64,64,64]:
normal is [5,-5,0],[5,-5,-5],[5,0,-5] point is [64,64,64]:
Thank you.
The other answers here do not appear to be very efficient: they use explicit loops over pixels or scipy.interpolate.griddata, which is designed for unstructured input data. Here is an efficient (vectorized) and generic solution.
There is a pure numpy implementation (for nearest-neighbor "interpolation") and one for linear interpolation, which delegates the interpolation to scipy.ndimage.map_coordinates. (The latter function probably didn't exist in 2013, when this question was asked.)
import numpy as np
from scipy.ndimage import map_coordinates

def slice_datacube(cube, center, eXY, mXY, fill=np.nan, interp=True):
    """Get a 2D slice from a 3-D array.

    Copyright: Han-Kwang Nienhuys, 2020.
    License: any of CC-BY-SA, CC-BY, BSD, GPL, LGPL
    Reference: https://stackoverflow.com/a/62733930/6228891

    Parameters:
    - cube: 3D array, assumed shape (nx, ny, nz).
    - center: shape (3,) with coordinates of center.
      can be float.
    - eXY: unit vectors, shape (2, 3) - for X and Y axes of the slice.
      (unit vectors must be orthogonal; normalization is optional).
    - mXY: size tuple of output array (mX, mY) - int.
    - fill: value to use for out-of-range points.
    - interp: whether to interpolate (rather than using 'nearest')

    Return:
    - slice: array, shape (mX, mY).
    """
    center = np.array(center, dtype=float)
    assert center.shape == (3,)
    eXY = np.array(eXY)/np.linalg.norm(eXY, axis=1)[:, np.newaxis]
    if not np.isclose(eXY[0] @ eXY[1], 0, atol=1e-6):
        raise ValueError(f'eX and eY not orthogonal.')

    # R: rotation matrix: data_coords = center + R @ slice_coords
    eZ = np.cross(eXY[0], eXY[1])
    R = np.array([eXY[0], eXY[1], eZ], dtype=np.float32).T

    # setup slice points P with coordinates (X, Y, 0)
    mX, mY = int(mXY[0]), int(mXY[1])
    Xs = np.arange(0.5 - mX/2, 0.5 + mX/2)
    Ys = np.arange(0.5 - mY/2, 0.5 + mY/2)
    PP = np.zeros((3, mX, mY), dtype=np.float32)
    PP[0, :, :] = Xs.reshape(mX, 1)
    PP[1, :, :] = Ys.reshape(1, mY)

    # Transform to data coordinates (x, y, z) - idx.shape == (3, mX, mY)
    if interp:
        idx = np.einsum('il,ljk->ijk', R, PP) + center.reshape(3, 1, 1)
        slice = map_coordinates(cube, idx, order=1, mode='constant', cval=fill)
    else:
        idx = np.einsum('il,ljk->ijk', R, PP) + (0.5 + center.reshape(3, 1, 1))
        idx = idx.astype(np.int16)
        # Find out which coordinates are out of range - shape (mX, mY)
        badpoints = np.any([
            idx[0, :, :] < 0,
            idx[0, :, :] >= cube.shape[0],
            idx[1, :, :] < 0,
            idx[1, :, :] >= cube.shape[1],
            idx[2, :, :] < 0,
            idx[2, :, :] >= cube.shape[2],
            ], axis=0)
        idx[:, badpoints] = 0
        slice = cube[idx[0], idx[1], idx[2]]
        slice[badpoints] = fill
    return slice

# Demonstration
nx, ny, nz = 50, 70, 100
cube = np.full((nx, ny, nz), np.float32(1))
cube[nx//4:nx*3//4, :, :] += 1
cube[:, ny//2:ny*3//4, :] += 3
cube[:, :, nz//4:nz//2] += 7
cube[nx//3-2:nx//3+2, ny//2-2:ny//2+2, :] = 0  # black dot

Rz, Rx = np.pi/6, np.pi/4  # rotation angles around z and x
cz, sz = np.cos(Rz), np.sin(Rz)
cx, sx = np.cos(Rx), np.sin(Rx)
Rmz = np.array([[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]])
Rmx = np.array([[1, 0, 0], [0, cx, -sx], [0, sx, cx]])
eXY = (Rmx @ Rmz).T[:2]

slice = slice_datacube(
    cube,
    center=[nx/3, ny/2, nz*0.7],
    eXY=eXY,
    mXY=[80, 90],
    fill=np.nan,
    interp=False
    )

import matplotlib.pyplot as plt
plt.close('all')
plt.imshow(slice.T)  # imshow expects shape (mY, mX)
plt.colorbar()
Output (for interp=False):
For this test case (50x70x100 datacube, 80x90 slice size) the run time is 376 µs (interp=False) and 550 µs (interp=True) on my laptop.
