I am experimenting with gradient descent and want to plot a contour of the gradient given independent variables x and y.
The optimization objective is to estimate a point given only a list of points and the distances to each of those points. I have a list of vectors of form [(x_1, y_1, d_1), ..., (x_n, y_n, d_n)] where d_i is the measured distance from the point to be estimated to the point (x_i, y_i), and I have a function g(x, y) that returns the gradient at the point (x, y). (The function g(x, y) uses the training vectors to calculate the gradient.)
The gradient descent algorithm works fine and arrives at a close estimate of the actual point coordinates. I now want to visualize the gradient as a contour map. I have the following for x and y values:
xlist = np.linspace(min([v[0] for v in vectors])-1, max([v[0] for v in vectors])+1, 100)
ylist = np.linspace(min([v[1] for v in vectors])-1, max([v[1] for v in vectors])+1, 100)
X, Y = np.meshgrid(xlist, ylist)
But now I need a Z value that maps each pair of coordinates in the grid mesh to g(x, y), and it needs to be the correct shape for the matplotlib contour plot. The examples I have seen have been useless because they all simply multiplied the x and y arrays to generate z values (which obviously will not work in this case), and all the tips, tricks, and SO answers I have encountered ultimately did not help.
How do I use my custom function g(x, y) to create the 2D Z array necessary for constructing a valid contour plot?
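One possible way to build Z is sketched below. The function g here is a hypothetical stand-in (the gradient of the summed squared distance residuals), and the training vectors are made up. Since g returns a two-component gradient and a contour plot needs one scalar per grid point, this sketch uses the gradient norm as Z. That choice is an assumption; a single component would work the same way.

```python
import numpy as np

# Made-up training vectors (x_i, y_i, d_i); distances are measured from the point (1, 2).
vectors = [(0.0, 0.0, np.hypot(1, 2)), (4.0, 0.0, np.hypot(3, 2)), (0.0, 5.0, np.hypot(1, 3))]

def g(x, y):
    """Hypothetical stand-in for g: gradient of sum((dist_i - d_i)**2)."""
    gx = gy = 0.0
    for xi, yi, di in vectors:
        dist = np.hypot(x - xi, y - yi)
        if dist == 0.0:  # gradient is undefined exactly at an anchor; skip that term
            continue
        gx += 2.0 * (dist - di) * (x - xi) / dist
        gy += 2.0 * (dist - di) * (y - yi) / dist
    return gx, gy

xlist = np.linspace(min(v[0] for v in vectors) - 1, max(v[0] for v in vectors) + 1, 100)
ylist = np.linspace(min(v[1] for v in vectors) - 1, max(v[1] for v in vectors) + 1, 100)
X, Y = np.meshgrid(xlist, ylist)

# Evaluate g once per grid point and keep the gradient norm as the scalar Z.
Z = np.array([np.hypot(*g(x, y)) for x, y in zip(X.ravel(), Y.ravel())]).reshape(X.shape)
```

Z has the same shape as X and Y, so it can be passed straight to plt.contour(X, Y, Z) or plt.contourf(X, Y, Z).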
I have a function that returns the density estimate for points (x, y). I would like to iterate over all (x, y) points for a given 2-D grid and have the density function compute the estimate for each point so that I can have a matrix of density values which I can then plot.
Say the function is called density(x, y), that takes any point (x, y) and returns the density estimate (z) for that (x, y). I would like to be able to apply the function to each point within a 2-Dimensional grid and store the density estimate wherein I could use, say, plt.pcolormesh() to view the density.
How can I do this?
I think you want something along the lines of this.
First, define a density function. For simplicity, I am taking the function |x| + |y|.
def density(x, y):
    return np.abs(x) + np.abs(y)
Now let's define the points along x and y dimensions and populate the arrays. In the following example, x and y are 1D arrays which store n_x and n_y points each sampled uniformly in [-1, 1].
n_x = 100
n_y = 100
x = np.linspace(-1, 1, n_x)
y = np.linspace(-1, 1, n_y)
Compute the grid in terms of pairs of points and compute the density D over each point in the grid.
xx, yy = np.meshgrid(x, y)
D = density(xx, yy)
Note that you don't need to explicitly iterate over the meshgrid. Because density() is built from NumPy functions, it accepts the arrays xx and yy directly even though it looks like a scalar function. For details about meshgrid, see this page.
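If the density function only accepts plain scalars, direct array evaluation fails. A minimal sketch of one workaround, using a hypothetical scalar_density built on the math module, is to wrap it with np.vectorize:

```python
import math
import numpy as np

def scalar_density(x, y):
    """Hypothetical density that only accepts scalars (math.exp rejects arrays)."""
    return math.exp(-(x * x + y * y))

x = np.linspace(-1, 1, 5)
y = np.linspace(-1, 1, 4)
xx, yy = np.meshgrid(x, y)

# np.vectorize wraps the scalar function in a Python-level loop over the grid.
D = np.vectorize(scalar_density)(xx, yy)
```

np.vectorize is a convenience rather than a speed-up, so rewriting the function with NumPy operations is usually preferable when that is possible.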
Next simply use pcolormesh() to display or save.
plt.pcolormesh(x, y, D)
plt.title('Density function = |x| + |y|')
plt.savefig('density.png')
The output is: [figure: pcolormesh plot of the density function |x| + |y|]
A hyperboloid has the formula
-x^2/a^2 - y^2/b^2 + z^2/c^2 = 1.
How can I generate samples from this hyperboloid in Python? (Say, with a=b=c=1.)
I was thinking to pick random x and y in [0,1] and then fill in the z value that would make the formula equal 1. However this would not sample uniformly. Is there a better way?
This is only a partial answer.
J.F. Williamson, "Random selection of points distributed on curved surfaces", Physics in Medicine & Biology 32(10), 1987, describes a general method of choosing a uniformly random point on a parametric surface. It is an acceptance/rejection method that accepts or rejects each candidate point depending on its stretch factor (norm-of-gradient). To use this method for a parametric surface, several things have to be known about the surface, namely:
x(u, v), y(u, v) and z(u, v), which are functions that generate 3-dimensional coordinates from two dimensional coordinates u and v,
The ranges of u and v,
g(point), the norm of the gradient ("stretch factor") at each point on the surface, and
gmax, the maximum value of g for the entire surface.
The algorithm is then:
Generate a point on the surface, xyz.
If g(xyz) >= RNDU01()*gmax, where RNDU01() is a uniform random number in [0, 1), accept the point. Otherwise, repeat this process.
In the case of a hyperboloid with parameters a=b=c=1:
The gradient is [-2*x, -2*y, 2*z].
The maximum value of the gradient norm is 2*sqrt(3), if x, y, and z are all in the interval [0, 1].
The only thing left is to turn the implicit formula into a parametric equation that is a function of two-dimensional coordinates u and v. I know this algorithm works for parametric surfaces, but I don't know if it still works if we "pick random x and y in [0,1] and then fill in the z value that would make the formula equal" in step 1.
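A minimal sketch of the "pick x and y, then fill in z" variant follows, under a stated assumption: instead of the norm of the implicit gradient, it uses the area element of the graph z = sqrt(1 + x^2 + y^2), which is sqrt(1 + (dz/dx)^2 + (dz/dy)^2), as the stretch factor. This is a swapped-in choice for the graph parametrization, not something the cited paper is confirmed here to prescribe. It covers only the patch with x and y in [0, 1] and a=b=c=1; the function name is made up.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_hyperboloid_patch(n):
    """Rejection-sample n points on z = sqrt(1 + x^2 + y^2) for x, y in [0, 1]."""
    gmax = np.sqrt(5.0 / 3.0)  # area element at x = y = 1, its maximum on the patch
    samples = []
    while len(samples) < n:
        x, y = rng.random(2)
        z = np.sqrt(1.0 + x * x + y * y)
        # Area element of the graph, using dz/dx = x/z and dz/dy = y/z.
        stretch = np.sqrt(1.0 + (x * x + y * y) / (z * z))
        if stretch >= rng.random() * gmax:
            samples.append((x, y, z))
    return np.array(samples)

pts = sample_hyperboloid_patch(1000)
```

The area element grows with x^2 + y^2, so candidates near the corner x = y = 1 are accepted more often than those near the origin, which compensates for the surface being steeper there.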
I can't quite wrap my head around how to extrapolate from a dataset where the points are not ordered along x, i.e. x is not monotonically increasing or decreasing, like so:
http://www.pic-host.org/images/2014/07/21/0b5ad6a11266f549.png
I understand that I need to create a plot for the x and y values separately. So this is the code I have so far (the points are ordered):
x = bananax
y = bananay
t = np.arange(x.shape[0], dtype=float)
t /= t[-1]
nt = np.linspace(0, 1, 100)
x1 = scipy.interpolate.spline(t, x, nt)
y1 = scipy.interpolate.spline(t, y, nt)
plt.plot(nt, x1, label='data x')
plt.plot(nt, y1, label='data y')
Now I have the interpolated splines. I guess I have to do the extrapolation for f(nt)=x1 and f(nt)=y1 respectively. I understand how to extrapolate from the data with a simple linear regression, but I'm missing how to get a more complex spline(?) extrapolated from it.
The aim is to let the extrapolated function follow the curvature of the datapoints. (At one end at least)
Cheers, and thanks!
I believe that you're on the right track in that you're creating a parametric curve (creating x(t) and y(t)) because the points are ordered. Part of the issue seems to be that the spline function is giving you back discrete values rather than the form and parameters of the spline. scipy.optimize has some nice tools that will help you find functions rather than calculating points.
If you've got any insight into the underlying process generating the data I suggest that you use that to help select a functional form for fitting. These more free-form methods will give you a degree of flexibility to do so.
Fit x(t) and y(t) and hold onto the resulting fitting functions. They'll be generated with data from t=0 to t=1 but nothing* will stop you from evaluating them outside that range.
I can recommend the following links for guidance on curve fitting procedure:
short: http://glowingpython.blogspot.com/2011/05/curve-fitting-using-fmin.html
long: http://nbviewer.ipython.org/gist/keflavich/4042018
*almost nothing
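A minimal sketch of the fit-then-extrapolate idea, using scipy.optimize.curve_fit. The data and the quadratic form are made up for illustration; the functional form in particular is an assumption that should be replaced by whatever suits the process behind the data.

```python
import numpy as np
from scipy.optimize import curve_fit

# Made-up ordered points standing in for bananax / bananay.
t = np.linspace(0, 1, 20)
x = 1.0 + 2.0 * t - 0.5 * t**2
y = 3.0 - 1.0 * t + 4.0 * t**2

def quadratic(t, a, b, c):
    """Assumed functional form for both x(t) and y(t)."""
    return a + b * t + c * t**2

# Fit x(t) and y(t) separately and keep the fitted parameters.
px, _ = curve_fit(quadratic, t, x)
py, _ = curve_fit(quadratic, t, y)

# The fitted functions can be evaluated outside [0, 1] to extrapolate.
t_ext = np.linspace(-0.5, 1.5, 50)
x_ext = quadratic(t_ext, *px)
y_ext = quadratic(t_ext, *py)
```

Plotting x_ext against y_ext then shows the curve continued past both ends of the data. How far the extrapolation can be trusted depends on how well the chosen form matches the data.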
Thanks, this got me on the right track. What worked for me was:
x = bananax
y = bananay
#------ fit a spline to the coordinates; x and y are each interpolated as functions of t
t = np.arange(x.shape[0], dtype=float) # t indexes the points
t /= t[-1] # t now runs from 0 to 1
nt = np.linspace(0, 1, 100) # 100 evenly spaced parameter values from 0 to 1
x1 = scipy.interpolate.spline(t, x, nt) # spline through x(t), evaluated at the values in nt
y1 = scipy.interpolate.spline(t, y, nt)
#------ create a new linear space nnt on which the interpolated spline is extrapolated
nnt = np.linspace(-1, 1, 100) # values < 0 are extrapolated (interpolation started at the tip, t = 0)
x1fit = np.polyfit(nt,x1,3) # fits a 3rd-order polynomial to the spline; the output is the polynomial coefficients
y1fit = np.polyfit(nt,y1,3)
xpoly = np.poly1d(x1fit) # generates the polynomial function from the coefficients obtained by polyfit
ypoly = np.poly1d(y1fit)
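scipy.interpolate.spline has been removed from recent SciPy releases, so the snippet above no longer runs there. A hedged sketch of the same steps with scipy.interpolate.make_interp_spline, on made-up points, might look like this. The boundary handling of the two functions may differ, so the interpolated values are not guaranteed to match the old output exactly.

```python
import numpy as np
from scipy.interpolate import make_interp_spline

# Made-up ordered points standing in for bananax / bananay.
x = np.array([0.0, 1.0, 2.5, 3.0, 2.0, 0.5])
y = np.array([0.0, 0.5, 1.0, 2.5, 3.5, 4.0])

t = np.arange(x.shape[0], dtype=float)
t /= t[-1]                      # parameter runs from 0 to 1
nt = np.linspace(0, 1, 100)

# Cubic interpolating splines x(t) and y(t), evaluated on the fine grid.
x1 = make_interp_spline(t, x, k=3)(nt)
y1 = make_interp_spline(t, y, k=3)(nt)

# Cubic polynomial fits to the splined curves, as in the snippet above.
xpoly = np.poly1d(np.polyfit(nt, x1, 3))
ypoly = np.poly1d(np.polyfit(nt, y1, 3))

nnt = np.linspace(-1, 1, 100)   # values below 0 are extrapolated
x_ext, y_ext = xpoly(nnt), ypoly(nnt)
```

Plotting x_ext against y_ext shows the extrapolated curve; the part with nnt < 0 extends past the first data point.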
I would like to know how does numpy.gradient work.
I used gradient to try to calculate group velocity (the group velocity of a wave packet is the derivative of frequency with respect to wavenumber, not a group of velocities). I fed a 3-column array to it: the first 2 columns are the x and y coordinates, and the third column is the frequency at that point (x,y). I need to calculate the gradient, and I expected a 2D vector, since the definition of the gradient is
df/dx*i+df/dy*j+df/dz*k
and since my function depends only on x and y, I expected something like
df/dx*i+df/dy*j
But I got 2 arrays with 3 columns each, i.e. 2 3D vectors. At first I thought that the sum of the two would give me the vector I was looking for, but the z component doesn't vanish. I hope I've been sufficiently clear in my explanation. I would like to know how numpy.gradient works and whether it's the right choice for my problem. Otherwise, I would like to know if there's any other Python function I can use.
What I mean is: I want to calculate the gradient of an array of values:
data=[[x1,x2,x3]...[x1,x2,x3]]
where x1,x2 are point coordinates on a uniform grid (my points in the Brillouin zone) and x3 is the value of the frequency at that point. I also pass in the step sizes for differentiation in the 2 directions:
stepx=abs(max(unique(data[:,0]))-min(unique(data[:,0])))/(len(unique(data[:,0]))-1)
the same for y direction.
I didn't build my data on a grid; I already have a grid, and this is why the kind examples given here in the answers do not help me.
A more fitting example should have a grid of points and values like the one I have:
data=[]
for i in range(10):
    for j in range(10):
        data.append([i,j,i**2+j**2])
data=array(data,dtype=float)
gx,gy=gradient(data)
Another thing I can add is that my grid is not square but has the shape of a polygon, being the Brillouin zone of a 2D crystal.
I've understood that numpy.gradient works properly only on a rectangular grid of values, which is not what I'm looking for. Even if I turn my data into a grid, it would have lots of zeroes outside the polygon of my original data, and those would add very large vectors to my gradient, hurting the precision of the calculation. This module seems to me more a toy than a tool; it has severe limitations, in my opinion.
Problem solved using dictionaries.
You need to give gradient a matrix that describes your angular frequency values for your (x,y) points. e.g.
def f(x,y):
    return np.sin((x + y))
x = y = np.arange(-5, 5, 0.05)
X, Y = np.meshgrid(x, y)
zs = np.array([f(x,y) for x,y in zip(np.ravel(X), np.ravel(Y))])
Z = zs.reshape(X.shape)
# np.gradient returns the derivative along axis 0 (rows, i.e. y) first, then axis 1 (x)
gy,gx = np.gradient(Z,0.05,0.05)
You can see that plotting Z as a surface gives: [figure: surface plot of Z]
Here is how to interpret your gradient:
gx is a matrix that gives the change dz/dx at all points, e.g. gx[0][0] is dz/dx at (x0,y0). Visualizing gx helps in understanding: [figure: plot of gx]
Since my data was generated from f(x,y) = sin(x+y) gy looks the same.
Here is a more obvious example using f(x,y) = sin(x). [figures: f(x,y) and its gradients]
Update: let's take a look at the xy pairs.
This is the code I used:
def f(x,y):
    return np.sin(x)
x = y = np.arange(-3,3,.05)
X, Y = np.meshgrid(x, y)
zs = np.array([f(x,y) for x,y in zip(np.ravel(X), np.ravel(Y))])
xy_pairs = np.array([str(x)+','+str(y) for x,y in zip(np.ravel(X), np.ravel(Y))])
Z = zs.reshape(X.shape)
xy_pairs = xy_pairs.reshape(X.shape)
gy,gx = np.gradient(Z,.05,.05)
Now we can look and see exactly what is happening. Say we wanted to know which point is associated with the value at Z[20][30]. Then...
>>> Z[20][30]
-0.99749498660405478
And the point is
>>> xy_pairs[20][30]
'-1.5,-2.0'
Is that right? Let's check.
>>> np.sin(-1.5)
-0.99749498660405445
Yes.
And what are our gradient components at that point?
>>> gy[20][30]
0.0
>>> gx[20][30]
0.070707731517679617
Do those check out?
dz/dy is always 0: check.
dz/dx = cos(x), and...
>>> np.cos(-1.5)
0.070737201667702906
Looks good.
You'll notice the values aren't exactly equal. That is because my Z data isn't continuous: it is sampled with a step size of 0.05, so gradient can only approximate the rate of change with finite differences.
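For data that is already stored as (x, y, value) rows, as in the question, one possible approach is to scatter the third column into a 2D array first and only then call np.gradient. The sketch below reuses the question's example data; marking cells outside the sampled region with NaN is my assumption, not something the question or answer prescribes.

```python
import numpy as np

# Same layout as the question: rows of (x, y, value) on a regular grid.
data = np.array([[i, j, i**2 + j**2] for i in range(10) for j in range(10)], dtype=float)

xs = np.unique(data[:, 0])
ys = np.unique(data[:, 1])

# Scatter the values into a 2D array indexed [row = y, col = x]; NaN marks missing cells.
Z = np.full((ys.size, xs.size), np.nan)
ix = np.searchsorted(xs, data[:, 0])
iy = np.searchsorted(ys, data[:, 1])
Z[iy, ix] = data[:, 2]

# Passing coordinate arrays lets np.gradient use the actual grid spacing.
gy, gx = np.gradient(Z, ys, xs)
```

For a polygon-shaped region, cells with no data stay NaN instead of zero. The NaN propagates into the gradient of neighbouring cells, so boundary entries come out as NaN rather than as spuriously large vectors.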
I checked the available interpolation methods in scipy, but could not find a proper solution for my case.
Assume I have 100 points whose coordinates are random,
e.g., their x and y positions are:
x=np.random.rand(100)*100
y=np.random.rand(100)*100
z = f(x,y) #the point value calculated by certain function
Now I want to get the point values z at new, evenly sampled coordinates (xnew and ynew):
xnew = range(100)
ynew = range(100)
How should I do this using bilinear sampling?
I know it is possible to do it point by point, e.g., find the 4 nearest random points and do the interpolation, but there has to be an easier existing function to do this.
Thanks a lot!
Use scipy.interpolate.griddata. It does exactly what you need.
# griddata expects the query coordinates as a tuple of arrays (or an (m, 2) array),
# so expand the two axes into a full grid first
grid_x, grid_y = numpy.meshgrid(xnew, ynew)
# defaults to linear interpolation
znew = scipy.interpolate.griddata((x, y), z, (grid_x, grid_y))
http://docs.scipy.org/doc/scipy/reference/generated/scipy.interpolate.griddata.html#scipy.interpolate.griddata
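A self-contained sketch of the griddata call, with a made-up linear function standing in for f(x, y) so the result is easy to check:

```python
import numpy as np
from scipy.interpolate import griddata

rng = np.random.default_rng(0)
x = rng.random(100) * 100
y = rng.random(100) * 100
z = 2.0 * x + 3.0 * y          # made-up stand-in for f(x, y)

# Regular target grid; meshgrid turns the two axes into per-point coordinates.
xnew = np.arange(100)
ynew = np.arange(100)
grid_x, grid_y = np.meshgrid(xnew, ynew)

# Linear interpolation on a triangulation of the scattered points.
znew = griddata((x, y), z, (grid_x, grid_y), method="linear")
```

Grid points outside the convex hull of the scattered points come back as NaN with the linear method. Note that this interpolates linearly over triangles of scattered points, which is not strictly bilinear sampling from 4 neighbours.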