quiver plot in Python using 2D arrays

Hi, I am trying to use quiver to create a vector field plot. Here is my logic and approach: I first create the x, y coordinates for position with np.arange, using a step size of 0.1. Then I mesh the grid for x, y. Then I import the x component of the function, Fx, and the y component, Fy, into Python from .dat files. The .dat files each hold a 2D array (a square matrix). I then call quiver with the meshed x, y coordinates and the Fx, Fy 2D arrays. However, the quiver plot output does not look anything like what I was expecting.
Is there a problem with my code that I am overlooking? Does np.arange work if the step size is not an integer amount? I printed out all the arrays to manually check the data and everything seems fine.
Could it be that my four 2D arrays do not all have the same shape? The two .dat files I import are each 40x40 square matrices; I am not sure whether this matches the grid I meshed.
Other than that, I am unsure what the issue is. Any help or suggestions would be greatly appreciated. I can add the data from my .dat files if that will help. Thanks! (I have checked the other examples on Stack Overflow for this problem and my code seems logically correct, so I am very stuck.)
import numpy as np
import matplotlib.pyplot as plt
data = np.genfromtxt('file1.dat')
data2 = np.genfromtxt('file2.dat')
nx = 2
ny = 2
x=np.arange(-nx,nx,0.1)
y=np.arange(-ny,ny,0.1)
xi,yi=np.meshgrid(x,y)
Fx = data[::5] #picks out every 5 rows in the matrix
Fy = data2[::5]
#print(Fx)
#print(Fy)
#print(xi)
#print(yi)
plt.axes([0.065, 0.065, 0.9, 0.9])
plt.quiver(xi,yi,Fx,Fy, alpha=.5)
plt.quiver(xi,yi,Fx,Fy, edgecolor='k',facecolor='none', linewidth=.5)
plt.show()
EDIT: .dat files below, as asked. If there is a better way to share the .dat files, let me know; I realize this is a lot of numbers and formatted horribly. Fx is listed first, then the Fy array. I am expecting a very nice quiver plot with some kind of circular pattern/flow: the arrows should all form a clockwise and/or counterclockwise flow.
-30.9032192 0.512708426 0.511251688 0.508112907 0.503038108 0.495766401 0.486015081 0.473499298 0.457935333 0.439051390 0.416606665 0.390406251 0.360321403 0.326310992 0.288441181 0.246901810 0.202013552 0.154238343 0.104165822 5.24933599E-02 0.00000000 -5.24933599E-02 -0.104165822 -0.154238343 -0.202013552 -0.246901810 -0.288441181 -0.326310992 -0.360321403 -0.390406251 -0.416606665 -0.439051390 -0.457935333 -0.473499298 -0.486015081 -0.495766401 -0.503038108 -0.508112907 -0.511251688 -0.512708426 30.9032192
0.640149713 0.648661256 0.646115780 0.638335168 -13.4731970 -13.0613079 0.587181866 0.561966598 0.533295572 0.501472771 0.466741979 0.429292738 0.389282435 0.346857786 0.302170664 0.255400449 0.206771404 0.156560570 0.105099753 5.27719632E-02 2.10129052E-08 -5.27718328E-02 -0.105099864 -0.156560570 -0.206771582 -0.255400449 -0.302170008 -0.346857607 -0.389282405 -0.429292321 -0.466741502 -0.501472294 -0.533295095 -0.561966538 -0.587181747 13.0613060 13.4731960 -0.638335109 -0.646115661 -0.648661256 -0.640149713
0.799892545 0.824215114 0.801061392 0.776797950 0.753669202 0.730814993 0.707295001 0.682291210 0.655105412 -8.68122292 -8.12608242 0.554765701 0.513439834 0.467435867 0.416336209 0.359773695 0.297508597 0.229575798 0.156477526 7.93530941E-02 6.53175791E-10 -7.93530941E-02 -0.156477645 -0.229576021 -0.297508597 -0.359773695 -0.416336179 -0.467435598 -0.513440192 -0.554765582 8.12608242 8.68122387 -0.655105233 -0.682291508 -0.707294881 -0.730815291 -0.753669143 -0.776797950 -0.801061392 -0.824215114 -0.799892545
0.940612555 0.983826339 0.933131218 0.884394646 0.842061043 0.804476202 0.769944012 0.737089813 0.704840183 0.672395170 0.639202237 0.604933023 0.569452882 0.532750905 0.494812310 -2.68859553 -2.16188312 0.365726620 0.304749787 0.205249593 6.78142031E-09 -0.205249622 -0.304749817 -0.365726680 2.16188359 2.68859553 -0.494812399 -0.532750905 -0.569453001 -0.604932964 -0.639202118 -0.672395170 -0.704840362 -0.737089515 -0.769943893 -0.804476202 -0.842061162 -0.884394407 -0.933131695 -0.983826339 -0.940612555
0.999167860 1.05166125 0.986028075 0.923735499 0.870001256 0.822448075 0.778727889 0.736939847 0.695574820 0.653458953 0.609715879 0.563743949 0.515199065 0.463976830 0.410177410 0.354019582 0.295616359 0.234412342 0.167968050 9.07804966E-02 -8.54922577E-10 -9.07804891E-02 -0.167968005 -0.234412268 -0.295616418 -0.354019672 -0.410177410 -0.463976830 -0.515199006 -0.563743949 -0.609715819 -0.653458893 -0.695574880 -0.736939907 -0.778727889 -0.822448075 -0.870001316 -0.923735559 -0.986028075 -1.05166125 -0.999167860
0.940612555 0.983826339 0.932870448 0.884094179 0.841758013 0.804004610 0.768958390 0.735091329 0.701199591 0.666386902 0.630052805 0.591893077 0.551910400 0.510422051 0.468044579 0.425626040 0.384017974 0.343483299 0.302600116 -0.377980769 8.43500270E-10 0.377980769 -0.302600116 -0.343483359 -0.384017944 -0.425625950 -0.468044549 -0.510422230 -0.551910520 -0.591892898 -0.630052805 -0.666386902 -0.701199770 -0.735090971 -0.768958986 -0.804005086 -0.841758251 -0.884094059 -0.932870448 -0.983826339 -0.940612555
0.799892545 0.824215114 0.807587028 0.790868759 0.775763810 0.761242151 0.746228993 0.729784787 0.711097538 0.689466000 0.664264023 -6.33222771 -5.70436525 0.561126649 0.514991641 0.460934460 0.396892428 0.320130050 0.227872163 0.119494393 -1.02303694E-08 -0.119494416 -0.227872089 -0.320129842 -0.396892160 -0.460934043 -0.514991641 -0.561126769 5.70436525 6.33222771 -0.664264023 -0.689466000 -0.711097836 -0.729784369 -0.746228993 -0.761242330 -0.775764227 -0.790868759 -0.807587445 -0.824215114 -0.799892545
0.640149713 0.648661256 0.658376634 0.663496077 0.663335323 -12.7135134 -12.2490902 0.630356669 0.608760655 0.581994295 0.550120413 0.513214111 0.471384048 0.424800932 0.373717010 0.318486720 0.259573966 0.197552294 0.133099481 6.69753179E-02 -1.07370708E-08 -6.69753179E-02 -0.133099481 -0.197552368 -0.259573698 -0.318486512 -0.373717397 -0.424800485 -0.471384078 -0.513214111 -0.550120771 -0.581994355 -0.608760655 -0.630356669 12.2490902 12.7135134 -0.663335383 -0.663496077 -0.658376753 -0.648661256 -0.640149713
-30.9032192 0.512708426 0.511251688 0.508112907 0.503038108 0.495766401 0.486015081 0.473499298 0.457935333 0.439051390 0.416606665 0.390406251 0.360321403 0.326310992 0.288441181 0.246901810 0.202013552 0.154238343 0.104165822 5.24933599E-02 0.00000000 -5.24933599E-02 -0.104165822 -0.154238343 -0.202013552 -0.246901810 -0.288441181 -0.326310992 -0.360321403 -0.390406251 -0.416606665 -0.439051390 -0.457935333 -0.473499298 -0.486015081 -0.495766401 -0.503038108 -0.508112907 -0.511251688 -0.512708426 30.9032192
Now Fy array:
-0.205083355 -0.525830388 -0.552687049 -0.580741763 -0.609929502 -0.640149713 -0.671258569 -0.703064799 -0.735320449 -0.767719150 -0.799892545 -0.831412077 -0.861791074 -0.890495777 -0.916961849 -0.940612555 -0.960886896 -0.977269113 -0.989315629 -0.996686459 -0.999167860 -0.996686459 -0.989315629 -0.977269113 -0.960886896 -0.940612555 -0.916961849 -0.890495777 -0.861791074 -0.831412077 -0.799892545 -0.767719150 -0.735320449 -0.703064799 -0.671258569 -0.640149713 -0.609929502 -0.580741763 -0.552687049 -0.525830388 -0.205083355
-0.495766401 -0.496165156 -0.509083092 -0.549605310 13.5129404 13.0519953 -0.646288395 -0.672055602 -0.695797563 -0.717920899 -0.738660455 -0.758110344 -0.776252687 -0.792979062 -0.808119476 -0.821464479 -0.832787395 -0.841867268 -0.848508835 -0.852558434 -0.853919387 -0.852558374 -0.848508716 -0.841867328 -0.832787514 -0.821464896 -0.808119833 -0.792978704 -0.776252151 -0.758110642 -0.738660395 -0.717920780 -0.695797503 -0.672055602 -0.646288335 13.0519953 13.5129395 -0.549605191 -0.509083092 -0.496165156 -0.495766401
-0.416606665 -0.387658477 -0.370003909 -0.412325561 -0.451486528 -0.484789789 -0.512974977 -0.536900580 -0.557342112 8.73137856 8.12754345 -0.604040861 -0.616312325 -0.627466083 -0.637651145 -0.646887839 -0.655064702 -0.661947429 -0.667217672 -0.670547307 -0.671688557 -0.670547426 -0.667217493 -0.661947429 -0.655064702 -0.646887779 -0.637651086 -0.627466381 -0.616312623 -0.604041040 8.12754345 8.73137951 -0.557341993 -0.536900103 -0.512975276 -0.484789670 -0.451485991 -0.412325561 -0.370003909 -0.387658477 -0.416606665
-0.246901810 -0.228335708 -0.217398927 -0.246074528 -0.271431714 -0.291785061 -0.307664692 -0.319617361 -0.328106791 -0.333535194 -0.336277753 -0.336733580 -0.335400879 -0.333002120 -0.330682963 2.81363893 2.24033999 -0.348281264 -0.372185618 -0.395866930 -0.403591305 -0.395866960 -0.372185677 -0.348281264 2.24033999 2.81363893 -0.330682874 -0.333002120 -0.335400909 -0.336733490 -0.336277664 -0.333535045 -0.328106642 -0.319617361 -0.307664692 -0.291785270 -0.271431714 -0.246074289 -0.217398927 -0.228335708 -0.246901810
0.00000000 -3.97699699E-02 -8.22334886E-02 -9.01840925E-02 -9.43243951E-02 -9.68469381E-02 -9.79287177E-02 -9.75681171E-02 -9.57226083E-02 -9.23085213E-02 -8.71856511E-02 -8.01347122E-02 -7.08276853E-02 -5.87978214E-02 -4.34263758E-02 -2.40071025E-02 -4.12676527E-05 2.79203784E-02 5.66387177E-02 7.90976062E-02 8.76100808E-02 7.90975988E-02 5.66387326E-02 2.79204026E-02 -4.12871887E-05 -2.40071043E-02 -4.34263758E-02 -5.87978400E-02 -7.08276406E-02 -8.01346377E-02 -8.71856511E-02 -9.23085883E-02 -9.57226381E-02 -9.75680798E-02 -9.79286432E-02 -9.68469679E-02 -9.43244398E-02 -9.01841149E-02 -8.22335258E-02 -3.97699960E-02 0.00000000
0.246901810 0.149554759 5.41899577E-02 6.69130459E-02 8.30149651E-02 9.62892994E-02 0.106718197 0.114569001 0.119987577 0.122970015 0.123354375 0.120809816 0.114815064 0.104622498 8.91864598E-02 6.69886991E-02 3.55363674E-02 -1.02187870E-02 -8.21609423E-02 -0.177876130 -0.191068053 -0.177876085 -8.21608678E-02 -1.02187609E-02 3.55363339E-02 6.69886544E-02 8.91865119E-02 0.104622573 0.114814982 0.120810024 0.123354279 0.122969493 0.119987287 0.114568666 0.106718197 9.62890834E-02 8.30147490E-02 6.69130459E-02 5.41902333E-02 0.149555355 0.246901810
0.416606665 0.324635506 0.239433557 0.271107137 0.304715306 0.333829224 0.358776420 0.380251735 0.398895025 0.415270001 0.429880798 -6.52393579 -5.84947205 0.467720896 0.479777455 0.492111117 0.504699171 0.516976655 0.527697802 0.535157621 0.537844956 0.535157681 0.527697802 0.516976714 0.504699290 0.492111027 0.479777277 0.467720628 -5.84947205 -6.52393579 0.429880500 0.415270001 0.398895413 0.380252063 0.358776003 0.333829224 0.304715246 0.271106362 0.239433587 0.324635804 0.416606665
0.495766401 0.468931794 0.452914894 0.491556555 0.528390408 -12.8101072 -12.3052654 0.617275119 0.641844690 0.664552093 0.685565233 0.704941750 0.722658634 0.738638997 0.752775729 0.764953554 0.775063336 0.783014059 0.788738489 0.792190075 0.793342948 0.792190075 0.788738668 0.783013999 0.775063157 0.764953852 0.752775729 0.738638759 0.722658694 0.704941571 0.685565174 0.664552152 0.641844690 0.617275119 -12.3052645 -12.8101072 0.528390408 0.491556555 0.452914953 0.468931794 0.495766401
0.512708426 0.525830388 0.552687049 0.580741763 0.609929502 0.640149713 0.671258569 0.703064799 0.735320449 0.767719150 0.799892545 0.831412077 0.861791074 0.890495777 0.916961849 0.940612555 0.960886896 0.977269113 0.989315629 0.996686459 0.999167860 0.996686459 0.989315629 0.977269113 0.960886896 0.940612555 0.916961849 0.890495777 0.861791074 0.831412077 0.799892545 0.767719150 0.735320449 0.703064799 0.671258569 0.640149713 0.609929502 0.580741763 0.552687049 0.525830388 0.512708426

There appear to be unusually large values (perhaps indication of an asymptotic singularity?) along the lines y=x and y=-x.
You can see this in the data you posted. Consider, for example, the first line:
-31.3490391 6.68895245E-02 6.68859407E-02 ... -6.68895245E-02 31.3490391
The first value is large and negative, followed by numbers which are small and positive. Near the end of the line the numbers are small and negative, while the last value is large and positive. Clearly, as it stands, this data is not going to produce a smoothly varying quiver plot.
If we remove these unusually large values:
data[np.abs(data) > 1] = np.nan
data2[np.abs(data2) > 1] = np.nan
then
import numpy as np
import matplotlib.pyplot as plt
data = np.genfromtxt('file1.dat')
data2 = np.genfromtxt('file2.dat')
data[np.abs(data) > 1] = np.nan
data2[np.abs(data2) > 1] = np.nan
N = 10
Fx = data[::N, ::N]
Fy = data2[::N, ::N]
nrows, ncols = Fx.shape
nx = 2
ny = 2
x = np.linspace(-nx, nx, ncols)
y = np.linspace(-ny, ny, nrows)
xi, yi = np.meshgrid(x, y, indexing='ij')
plt.axes([0.065, 0.065, 0.9, 0.9])
plt.quiver(xi, yi, Fx, Fy, alpha=.5)
plt.quiver(xi, yi, Fx, Fy, edgecolor='k', facecolor='none', linewidth=.5)
plt.show()
yields a sensible quiver plot (image not reproduced here).
The main problem is the slicing. data is a 2D array of shape (301, 301):
In [109]: data.shape
Out[109]: (301, 301)
If we slice data using data[::10] then the result has shape
In [113]: data[::10].shape
Out[113]: (31, 301)
Notice that only the first axis gets sliced. To slice both the first and second axes, use data[::10, ::10]:
In [114]: data[::10, ::10].shape
Out[114]: (31, 31)
See the docs for more on multidimensional slicing.
Always pay attention to the shape of NumPy arrays. It is often the key to understanding NumPy operations.
Although plt.quiver can sometimes accept arrays of different shapes,
it is easiest to call plt.quiver with four arrays of the same shape.
To ensure that xi, yi, Fx, and Fy all have the same shape, slice data and data2 to form Fx and Fy first, and then build xi and yi to conform to the (common) shape of Fx and Fy:
nrows, ncols = Fx.shape
x = np.linspace(-nx, nx, ncols)
y = np.linspace(-ny, ny, nrows)
The third argument to np.linspace indicates the number of elements in the
returned array.
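As a side note on the question "Does np.arange work if the step size is not an integer amount?": yes, a float step is allowed, but np.arange excludes the stop value and its length is easy to miscount, which is one reason to prefer np.linspace, where you state the number of points directly. A quick check, using the same -2..2 range as in the question:
import numpy as np
x_arange = np.arange(-2, 2, 0.1)     # float step is allowed; the stop value 2 is excluded
x_linspace = np.linspace(-2, 2, 40)  # exactly 40 points, endpoints included
print(x_arange.shape)                # (40,) -- last value is about 1.9, not 2.0
print(x_linspace.shape)              # (40,) -- last value is exactly 2.0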

First, make sure Fx and Fy have the same dimensions, to avoid any confusion. Then generate the grid dimensions from the data dimensions. You can use np.linspace instead of np.arange:
x = np.linspace(-nx, nx, Fx.shape[1])
y = np.linspace(-ny, ny, Fx.shape[0])
Update:
The complete code looks like:
import numpy as np
import matplotlib.pyplot as plt
# Fxdata.dat contains the Fx data and Fydata.dat the Fy data, downloaded from the provided link
Fx = np.genfromtxt('Fxdata.dat')
Fy = np.genfromtxt('Fydata.dat')
# input data dimensions
num_Fx = Fx.shape[0] # number of rows in Fxdata.dat
length_Fx = Fx.shape[1] # length of each row in Fxdata.dat
nx = 15
ny = 15
# you can generate the grid points based on the dimensions of the input data
x = np.linspace(-nx, nx, length_Fx)
y = np.linspace(-ny, ny, num_Fx)
# grid points
xi,yi=np.meshgrid(x,y)
#
plt.axes([0.065, 0.065, 0.9, 0.9])
plt.quiver(xi,yi,Fx,Fy, alpha=.5)
#plt.quiver(xi,yi,Fx,Fy, edgecolor='k',facecolor='none', linewidth=.5)
plt.show()
Not sure if it makes sense now, but this is what the resulting plot looks like (image not reproduced here).
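For readers who do not have the original .dat files, here is a minimal self-contained sketch with a made-up circular field (not the data from the question) showing the same recipe: build four arrays of one common shape, then call quiver.
import numpy as np
import matplotlib.pyplot as plt
# synthetic rotational field standing in for the Fx/Fy .dat data
x = np.linspace(-2, 2, 21)
y = np.linspace(-2, 2, 21)
xi, yi = np.meshgrid(x, y)   # both have shape (21, 21)
Fx = -yi                     # the field (-y, x) rotates counter-clockwise
Fy = xi                      # same shape as xi and yi
plt.quiver(xi, yi, Fx, Fy, alpha=.5)
plt.gca().set_aspect('equal')
plt.show()
Because xi, yi, Fx, and Fy share one shape, the arrows form the circular flow the question expects.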

Related

Estimate joint density with 2d Gaussian kernel

I have the following data set where I have to estimate the joint density of 'bwt' and 'age' using kernel density estimation with a 2-dimensional Gaussian kernel and width h=5. I can't use modules such as scipy, where there are ready-made functions for this; I have to build the functions to calculate the density myself. Here's what I've gotten so far.
import numpy as np
import pandas as pd
babies_full = pd.read_csv("https://www2.helsinki.fi/sites/default/files/atoms/files/babies2.txt", sep='\t')
#Getting the columns I need
babies_full1=babies_full[['gestation', 'age']]
x=np.array(babies_full1,'int')
#2d Gaussian kernel
def k_2dgauss(x):
    return np.exp(-np.sum(x**2, 1)/2) / np.sqrt(2*np.pi)
#Multivariate kernel density
def mv_kernel_density(t, x, h):
    d = x.shape[1]
    return np.mean(k_2dgauss((t - x)/h))/h**d
t = np.linspace(1.0, 5.0, 50)
h=5
print(mv_kernel_density(t, x, h))
However, I get 'ValueError: operands could not be broadcast together with shapes (50,) (1173,2)', which I think is because the matrices have different shapes. I also don't understand why k_2dgauss(x) returns an array of zeros, since I thought it should return only one value. In general, I am new to the concept of kernel density estimation and I don't really know if I've written the functions right, so any hints would help!
Following on from my comments on your original post, I think this is what you want to do, but if not then come back to me and we can try again.
# info supplied by OP
import numpy as np
import pandas as pd
babies_full = pd.read_csv("https://www2.helsinki.fi/sites/default/files/atoms/files/babies2.txt", sep='\t')
#Getting the columns I need
babies_full1=babies_full[['gestation', 'age']]
x=np.array(babies_full1,'int')
# my contributions
from math import floor, ceil
def binMaker(arr, base):
    """Function I already use for this sort of thing.
    arr is the array I want to make bins for;
    base is the bin separation. It requires floor and ceil to be imported,
    otherwise you can make these bins manually yourself."""
    binMin = floor(arr.min() / base) * base
    binMax = ceil(arr.max() / base) * base
    return np.arange(binMin, binMax + base, base)
bins1 = binMaker(x[:,0], 20.)  # bins from 140. to 360. spaced 20 apart
bins2 = binMaker(x[:,1], 5.)   # bins from 15. to 45. spaced 5. apart
counts = np.zeros((len(bins1)-1, len(bins2)-1))  # empty array for the counts to go in
for i in range(0, len(bins1)-1):      # loop over the intervals, hence the -1
    boo = (x[:,0] >= bins1[i]) * (x[:,0] < bins1[i+1])
    for j in range(0, len(bins2)-1):  # loop over the intervals, hence the -1
        counts[i,j] = np.count_nonzero((x[boo,1] >= bins2[j]) *
                                       (x[boo,1] < bins2[j+1]))
# if you want your PDF to be a fraction of the total
# rather than the number of counts, do the next line
counts /= x.shape[0]
# plotting
import matplotlib.pyplot as plt
from matplotlib.colors import BoundaryNorm
# setting the levels so that each number in counts has its own colour
levels = np.linspace(-0.5, counts.max()+0.5, int(counts.max())+2)
cmap = plt.get_cmap('viridis') # or any colormap you like
norm = BoundaryNorm(levels, ncolors=cmap.N, clip=True)
fig, ax = plt.subplots(1, 1, figsize=(6,5), dpi=150)
pcm = ax.pcolormesh(bins2, bins1, counts, ec='k', lw=1)
fig.colorbar(pcm, ax=ax, label='Counts (%)')
ax.set_xlabel('Age')
ax.set_ylabel('Gestation')
ax.set_xticks(bins2)
ax.set_yticks(bins1)
plt.title('Manually making a 2D (joint) PDF')
If this is what you wanted, then there is an easier way with np.histogram2d, although I think you specified it had to be done with your own methods and not built-in functions. I've included it anyway for completeness' sake.
pdf = np.histogram2d(x[:,0], x[:,1], bins=(bins1,bins2))[0]
pdf /= x.shape[0] # again for normalising and making a percentage
levels = np.linspace(-0.5, pdf.max()+0.5, int(pdf.max())+2)
cmap = plt.get_cmap('viridis') # or any colormap you like
norm = BoundaryNorm(levels, ncolors=cmap.N, clip=True)
fig, ax = plt.subplots(1, 1, figsize=(6,5), dpi=150)
pcm = ax.pcolormesh(bins2, bins1, pdf, ec='k', lw=1)
fig.colorbar(pcm, ax=ax, label='Counts (%)')
ax.set_xlabel('Age')
ax.set_ylabel('Gestation')
ax.set_xticks(bins2)
ax.set_yticks(bins1)
plt.title('using np.histogram2d to make a 2D (joint) PDF')
Final note - in this example, the only place where counts doesn't equal pdf is the bin with 40 <= age < 45 and 280 <= gestation < 300, which I think is due to how, in my manual case, I've used <= and <; I'm a little unsure how np.histogram2d handles values outside the bin ranges, or on the bin edges, etc. We can see the element of x that is responsible:
>>> print(x[1011])
[280 45]
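As a quick check on that uncertainty, here is a small sketch (with made-up values, not the babies data) of how np.histogram2d treats bin edges and out-of-range values:
import numpy as np
edges = np.array([0., 1., 2.])
# 2.0 sits exactly on the last edge; 2.5 lies outside the range
a = np.array([0.5, 1.0, 2.0, 2.5])
b = np.array([0.5, 0.5, 0.5, 0.5])
H, _, _ = np.histogram2d(a, b, bins=(edges, edges))
# Every bin is half-open except the last, which includes its right edge,
# so 1.0 lands in the second bin, 2.0 is counted in the last bin,
# and 2.5 falls outside the edges and is not counted at all.
print(H)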

Finding nearest xy-point in numpy array and second nearest with condition

My problem is like the problem in the thread Finding index of nearest point in numpy arrays of x and y coordinates, but it's extended:
For better visualization, here is an image (not reproduced here; manipulated from an original by 112BKS - own work, original graph/data from [.. ? ..], CC BY-SA 3.0):
On the one hand there is an array, datafield. It is a numpy array with elements [value x y]. Those are the thin blue lines with the numbers (the numbers are the values). On the other hand there is the array orangeline, a numpy array with elements [x y].
What I want to do is calculate the value of every element in orangeline. I visualized one concrete element of orangeline with the green circle. Its value can be interpolated from the two elements of datafield visualized with the triangles. As a result, I get a value between 225 and 230 for the green circle.
First step: for every element in orangeline, find the closest element in datafield. (In the example that is the pink triangle.)
Second step: for every element in orangeline, find the closest element in datafield that has a different value from the one found in the first step. (In the example that is the brown triangle.)
Third step: interpolate the value for every element in orangeline from the two values found and the distances to those elements.
First step can be solved with
mytree = scipy.spatial.cKDTree(datafield[:, 1:3])
dist1, indexes1 = mytree.query(orangeline)
But now I don't know how to filter the datafield for the second step. Is there a solution?
With help from @unutbu's comment, I found this solution, which works quite well even in cases where the orangeline does not go through the field.
Here are the functions for the grid:
import matplotlib.mlab as mlab
import numpy as np
import scipy
def define_grid(rawdata):
    xmin, xmax = np.amin(rawdata[:, 1]), np.amax(rawdata[:, 1])
    ymin, ymax = np.amin(rawdata[:, 2]), np.amax(rawdata[:, 2])
    x, y, z = rawdata[:, 1], rawdata[:, 2], rawdata[:, 0]
    # Size of regular grid
    ny, nx = (ymax - ymin), (xmax - xmin)
    # Generate a regular grid to interpolate the data.
    xi = np.linspace(xmin, xmax, nx)
    yi = np.linspace(ymin, ymax, ny)
    xi, yi = np.meshgrid(xi, yi)
    # Interpolate using delaunay triangularization
    zi = mlab.griddata(x, y, z, xi, yi)
    return xi, yi, zi
def grid_as_array(xi, yi, zi):
    xi_flat, yi_flat, zi_flat = np.ravel(xi), np.ravel(yi), np.ravel(zi)
    # reduce arrays for faster calculation, take only every second element
    xi_red, yi_red, zi_red = xi_flat[1::2], yi_flat[1::2], zi_flat[1::2]
    # stack to an array with elements [x y z], but there are z values that are 'nan'
    xyz_with_nan = np.hstack((xi_red[:, np.newaxis], yi_red[:, np.newaxis],
                              zi_red[:, np.newaxis]))
    # sort out those elements with 'nan'
    xyz = xyz_with_nan[~np.isnan(xyz_with_nan).any(axis=1)]
    return xyz
Another function to find the closest point from the grid for the values from orangeline:
def closest_node(points, datafield):
    mytree = scipy.spatial.cKDTree(datafield)
    dist, indexes = mytree.query(points)
    return indexes
And now the code:
# use function to create from the raw data an interpolated datafield
xi, yi, zi = define_grid(datafield)
# rearrange those values to bring them in the form of an array with [x y z]
xyz = grid_as_array(xi, yi, zi)
# search closest values from grid for the points of the orangeline
# orangeline_xy is the array with elements [x y]
indexes = closest_node(orangeline_xy, xyz[:,0:2])
# take z values from the grid which we found before
orangeline_z = xyz[indexes, 2]
# add those z values to the points of the orangeline
orangeline_xyz = np.hstack((orangeline_xy,orangeline_z[:, np.newaxis]))
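The question's second step (the nearest datafield element whose value differs from the first hit) is not addressed directly above. Here is a minimal sketch (the helper name is made up), assuming datafield rows are [value x y] as the question describes:
import numpy as np
import scipy.spatial
def nearest_and_nearest_other_value(datafield, orangeline_xy):
    # For each orangeline point, return the index of the nearest datafield row
    # and the index of the nearest row whose value differs from that first hit.
    tree = scipy.spatial.cKDTree(datafield[:, 1:3])
    dist1, idx1 = tree.query(orangeline_xy)
    idx2 = np.empty_like(idx1)
    for k, (pt, i1) in enumerate(zip(orangeline_xy, idx1)):
        # keep only rows whose value differs from the value of the first hit
        # (assumes at least one such row exists)
        keep = datafield[:, 0] != datafield[i1, 0]
        subtree = scipy.spatial.cKDTree(datafield[keep, 1:3])
        _, j = subtree.query(pt)
        idx2[k] = np.flatnonzero(keep)[j]
    return idx1, idx2
Rebuilding a tree for every point is slow on large arrays; grouping the orangeline points by the value of their first hit would let each reduced tree be reused.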

Correct usage of scipy.interpolate.RegularGridInterpolator

I am a little confused by the documentation for scipy.interpolate.RegularGridInterpolator.
Say for instance I have a function f: R^3 => R which is sampled on the vertices of the unit cube. I would like to interpolate so as to find values inside the cube.
import numpy as np
# Grid points / sample locations
X = np.array([[0,0,0], [0,0,1], [0,1,0], [0,1,1], [1,0,0], [1,0,1], [1,1,0], [1,1,1.]])
# Function values at the grid points
F = np.random.rand(8)
Now, RegularGridInterpolator takes a points argument, and a values argument.
points : tuple of ndarray of float, with shapes (m1, ), ..., (mn, )
The points defining the regular grid in n dimensions.
values : array_like, shape (m1, ..., mn, ...)
The data on the regular grid in n dimensions.
I interpret this as being able to call as such:
import scipy.interpolate as irp
rgi = irp.RegularGridInterpolator(X, F)
However, when I do so, I get the following error:
ValueError: There are 8 point arrays, but values has 1 dimensions
What am I misinterpreting in the docs?
Ok I feel silly when I answer my own question, but I found my mistake with help from the documentation of the original regulargrid lib:
https://github.com/JohannesBuchner/regulargrid
points should be a list of arrays that specifies how the points are spaced along each axis.
For example, to take the unit cube as above, I should set:
pts = ( np.array([0,1.]), )*3
or if I had data which was sampled at higher resolution along the last axis, I might set:
pts = ( np.array([0,1.]), np.array([0,1.]), np.array([0,0.5,1.]) )
Finally, values has to be of shape corresponding to the grid laid out implicitly by points. For example,
val_size = [q.shape[0] for q in pts]  # grid shape implied by pts
vals = np.zeros(val_size)
# make an arbitrary function to test:
func = lambda pt: (pt**2).sum()
# collect func's values at grid pts
for i in range(pts[0].shape[0]):
    for j in range(pts[1].shape[0]):
        for k in range(pts[2].shape[0]):
            vals[i,j,k] = func(np.array([pts[0][i], pts[1][j], pts[2][k]]))
So finally,
rgi = irp.RegularGridInterpolator(points=pts, values=vals)
runs and performs as desired.
Your answer is nicer, and it's perfectly OK for you to accept it. I'm just adding this as an "alternate" way to script it.
import numpy as np
import scipy.interpolate as spint
RGI = spint.RegularGridInterpolator
x = np.linspace(0, 1, 3) # or 0.5*np.arange(3.) works too
# populate the 3D array of values (re-using x because lazy)
X, Y, Z = np.meshgrid(x, x, x, indexing='ij')
vals = np.sin(X) + np.cos(Y) + np.tan(Z)
# make the interpolator, (list of 1D axes, values at all points)
rgi = RGI(points=[x, x, x], values=vals) # can also be [x]*3 or (x,)*3
tst = (0.47, 0.49, 0.53)
print(rgi(tst))
print(np.sin(tst[0]) + np.cos(tst[1]) + np.tan(tst[2]))
returns:
1.93765972087
1.92113615659

Referencing Data From a 2D Histogram

I have the following code that reads data from a CSV file and creates a 2D histogram:
import numpy as np
import pandas as pd
import matplotlib as mpl
import matplotlib.pyplot as plt
#Read in CSV data
filename = 'Complete_Storms_All_US_Only.csv'
df = pd.read_csv(filename)
min_85 = df.min85
min_37 = df.min37
verification = df.one_min_15
#Numbers
x = min_85
y = min_37
H = verification
#Estimate the 2D histogram
nbins = 33
H, xedges, yedges = np.histogram2d(x,y,bins=nbins)
#Rotate and flip H
H = np.rot90(H)
H = np.flipud(H)
#Mask zeros
Hmasked = np.ma.masked_where(H==0,H)
#Calculate Averages
avgarr = np.zeros((nbins, nbins))
xbins = np.digitize(x, xedges[1:-1])
ybins = np.digitize(y, yedges[1:-1])
for xb, yb, v in zip(xbins, ybins, verification):
    avgarr[yb, xb] += v
divisor = H.copy()
divisor[divisor==0.0] = np.nan
avgarr /= divisor
binavg = np.around((avgarr * 100), decimals=1)
binper = np.ma.array(binavg, mask=np.isnan(binavg))
#Plot 2D histogram using pcolor
fig1 = plt.figure()
plt.pcolormesh(xedges,yedges,binper)
plt.title('1 minute at +/- 0.15 degrees')
plt.xlabel('min 85 GHz PCT (K)')
plt.ylabel('min 37 GHz PCT (K)')
cbar = plt.colorbar()
cbar.ax.set_ylabel('Probability of CG Lightning (%)')
plt.show()
Each pixel in the histogram contains the probability of lightning for a given range of temperatures at two different frequencies on the x and y axis (min_85 on the x axis and min_37 on the y axis). I am trying to reference the probability of lightning from the histogram based on a wide range of temperatures that vary on an individual basis for any given storm. Each storm has a min_85 and min_37 that corresponds to a probability from the 2D histogram. I know there is a brute-force method where you can create a ridiculous amount of if statements, with one for each pixel, but this is tedious and inefficient when trying to incorporate over multiple 2D histograms. Is there a more efficient way to reference the probability from the histogram based on the given min_85 and min_37? I have a separate file with the min_85 and min_37 data for a large amount of storms, I just need to assign the corresponding probability of lightning from the histogram to each one.
It sounds like all you need to do is turn the min_85 and min_37 values into indices. Something like this will work:
# min85data and min37data from your file
dx = xedges[1] - xedges[0]
dy = yedges[1] - yedges[0]
min85inds = np.floor((min85data - xedges[0]) / dx).astype(int)
min37inds = np.floor((min37data - yedges[0]) / dy).astype(int)
# Pretend you didn't do all that flipping of H, or make a copy of it first
hvals = h_orig[min85inds, min37inds]
But do make sure that the resulting indices are valid before you extract them.
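An equivalent lookup can be written with np.digitize, which also makes it easy to clip out-of-range temperatures to valid bin indices. A small self-contained sketch with made-up data (in the real code, x and y would be the storms' min_85 and min_37 values, and H whichever un-flipped 2D array holds the per-bin probabilities):
import numpy as np
# made-up stand-ins for the storm temperatures
x = np.random.uniform(150, 300, 500)   # pretend min_85 values
y = np.random.uniform(150, 300, 500)   # pretend min_37 values
H, xedges, yedges = np.histogram2d(x, y, bins=33)
# digitize returns 1-based bin numbers, so subtract 1, then clip so the exact
# maximum and any out-of-range value still map to a valid bin index
ix = np.clip(np.digitize(x, xedges) - 1, 0, len(xedges) - 2)
iy = np.clip(np.digitize(y, yedges) - 1, 0, len(yedges) - 2)
vals = H[ix, iy]                        # one histogram value per storm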

2D histogram of 1D function of random number

I have a 1D function of N values. I want to make a 2D histogram so that I get an image of nx*ny pixels and then cut the image, summing along one dimension. The function before and after should be the same. I tried with a Gaussian, but I am missing a factor of sqrt. Please see the code. Am I missing something in my function?
I draw r from a random number generator:
import numpy as np
import random
import matplotlib.pyplot as plt
sigma=0.5
N=100000
r=np.random.normal(0, sigma, N)
ind = np.where( r>=0 )
r =r[ind]
N=len(r)
phi=2*np.pi*np.random.rand(N)
x=r*np.cos(phi)
y=r*np.sin(phi)
Ir=np.zeros(N)
Ir[:]=1
Now I want to see the distribution Ir = f(x, y) as a 2D image.
def getImage2D(x, y, fun, nx=100, ny=100, xmin=-1, xmax=1, ymin=-1, ymax=1):
    dx = (xmax - xmin) * (1.0 / nx)
    dy = (ymax - ymin) * (1.0 / ny)
    image = np.zeros((nx, ny), dtype=float)
    for i in range(len(fun)):
        mr = (x[i] - xmin) / dx
        nr = (y[i] - ymin) / dy
        m, n = int(mr), int(nr)
        image[m, n] = image[m, n] + fun[i]
    return image
P=getImage2D(x, y, Ir, nx=101, ny=101, xmin=-3, xmax=3, ymin=-3, ymax=3)
#P=getImage2D(x, y, r**0.5*Ir, nx=101, ny=101, xmin=-3, xmax=3, ymin=-3, ymax=3)
Px=np.sum(P, axis=1)
Px=Px/np.max(Px)
plt.figure()
plt.imshow(P)
plt.show(block=False)
If I plot the cut Px (after summing along y), I do not get the same Gaussian of width sigma!
v=np.linspace(np.min(r), np.max(r), len(r))
v=v/np.max(v)
plt.figure()
plt.plot(np.linspace(-3, 3, 101), Px)
plt.plot(np.sort(r)[::-1], v, 'g')
plt.show(block=False)
Why is the width not the same in both cases? If I use a weight of r**0.5 with Ir, then the width is the same (sigma = 0.5). Is there a mistake in my getImage2D function?
