I am attempting a non-linear fit of the Fresnel equations to data of reflectance against angle of incidence. On the Wikipedia page http://en.wikipedia.org/wiki/Fresnel_equations there are two graphs with a red and a blue curve; I basically need to fit the blue curve (with n1 = 1) to my data.
Here I use the following code where th is theta, the angle of incidence.
def Rperp(th, n, norm, constant):
    numerator = np.cos(th) - np.sqrt(n**2.0 - np.sin(th)**2.0)
    denominator = 1.0 * np.cos(th) + np.sqrt(n**2.0 - np.sin(th)**2.0)
    return ((numerator / denominator)**2.0) * norm + constant
The parameters I'm looking for are:
the index of refraction n
some normalization to multiply by and
a constant to shift the baseline of the graph.
My attempt is the following:
xdata = angle[1:] * 1.0 # angle of incidence
ydata = greenDD[1:] # reflectance
params = curve_fit(Rperp, xdata, ydata)
What I apparently get is a division by zero, and the parameters come back as [1, 1, 1]. The Fresnel equation itself is the part of Rperp without the normalizer and the constant; theta in that equation is also the angle of incidence. Overall I am just not sure whether I am going about getting the parameters the right way at all.
The idea seems to be that the first argument of the function is the independent variable and the rest are the parameters to be found. Then you just plug the function into scipy's curve_fit and it gives you a fit to your data for those parameters. If it is just a matter of getting around the division by zero, which I had thought might be integer division, then it seems like I should be set. Any help is appreciated, and let me know if things need to be clarified (such as that np is numpy).
Make sure to pass the arguments to the trigonometric functions, like sine, in radians, not degrees.
As for why you're getting a negative refractive index back: it is because your function only ever squares the refractive index, so the curve_fit algorithm may end up in a local minimum where (by accident) n is negative, since the function value is identical for -n and +n.
Ideally you'd add constraints to the minimization problem, but for this (simple) problem, just look at your formula and remember that a negative n result is fixed simply by flipping its sign, as you did.
You could also try passing an initial guess to the algorithm and you might observe that it will not end up in the local minimum with negative value.
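For example, a minimal sketch of both suggestions, assuming (as in the question) that angle holds the incidence angles in degrees and greenDD the measured reflectance:

import numpy as np
from scipy.optimize import curve_fit

xdata = np.radians(angle[1:])   # trig functions expect radians, not degrees
ydata = greenDD[1:] * 1.0       # force float arithmetic

# Rough physical starting values: n ~ 1.5 (glass), norm ~ 1, no baseline shift
p0 = [1.5, 1.0, 0.0]
popt, pcov = curve_fit(Rperp, xdata, ydata, p0=p0)
n_fit, norm_fit, const_fit = popt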
Suppose I have a curve, and I estimate its gradient via finite differences using np.gradient. Given an initial point x[0] and the gradient vector, how can I reconstruct the original curve? Mathematically I see it's possible given this system of equations, but I'm not certain how to do it programmatically.
Here is a simple example of my problem, where I have sin(x) and I compute the numerical difference, which matches cos(x).
import numpy as np
import matplotlib.pyplot as plt

x = np.linspace(1, 30, 100)
test = np.vectorize(np.sin)(x)
numerical_grad = np.gradient(test, 30./100)
analytical_grad = np.vectorize(np.cos)(x)

## Plot data.
fig, ax = plt.subplots()
ax.plot(test, label='data', marker='o')
ax.plot(numerical_grad, label='gradient')
ax.plot(analytical_grad, label='proof', alpha=0.5)
ax.legend();
I found how to do it, by using numpy's trapz function (trapezoidal rule integration).
Following up on the code I presented on the question, to reproduce the input array test, we do:
x = np.linspace(1, 30, 100)
integral = list()
for t in range(len(x)):
    integral.append(test[0] + np.trapz(numerical_grad[:t+1], x[:t+1]))
The integral array then contains the results of the numerical integration.
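As a side note, the loop above recomputes the trapezoidal sum from scratch for every point; SciPy's cumulative trapezoidal routine (cumulative_trapezoid, called cumtrapz in older SciPy versions) gives the same result in one call. A sketch using the arrays from above:

import numpy as np
from scipy.integrate import cumulative_trapezoid  # cumtrapz in older SciPy

x = np.linspace(1, 30, 100)
test = np.sin(x)
numerical_grad = np.gradient(test, x)

# initial=0 keeps the output the same length as x; add the known starting value
reconstructed = test[0] + cumulative_trapezoid(numerical_grad, x, initial=0)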
You can restore the initial curve using integration.
As a real-life example: if you have the position function for 1-D motion, you can get the velocity function as its derivative (gradient)
v(t) = s'(t) = ds / dt
And having the velocity, you can get the position back (not all functions are integrable analytically; in that case numerical integration is used), up to some unknown constant (shift); with the initial position you can recover the exact value
s(T) = Integral[from 0 to T](v(t) dt) + s(0)
I've generated a bunch of data for the (x,y,z) coordinates of a planet as it orbits around the Sun. Now I want to fit an ellipse through this data.
What I tried to do:
I created a dummy ellipse based on five parameters: the semi-major axis & eccentricity, which define the size & shape, and the three Euler angles, which rotate the ellipse in space. Since my data is not always centered at the origin, I also need to translate the ellipse, requiring three additional variables (dx,dy,dz).
Once I initialise this function with these eight variables I get back N points that lie on this ellipse (N = the number of data points I am fitting the ellipse through).
I calculate the deviation of these dummy points from the actual data and then I minimise this deviation using some minimisation method to find the best fitting values for these eight variables.
My problem is with the very last part: minimising the deviation and finding the variables' values.
To minimise the deviation I use scipy.optimize.minimize to try to find the best-fitting variables, but it just doesn't do a good enough job:
Here is an image of what one of my best fits looks like and that's with a very generously accurate initial guess. (blue = data, red = fit)
Here is the entire code. (No data required, it generates its own phony data)
In short, I use this scipy function:
initial_guess = [0.3,0.2,0.1,0.7,3,0.0,-0.1,0.0]
bnds = ((0.2, 0.5), (0.1, 0.3), (0, 2*np.pi), (0, 2*np.pi), (0, 2*np.pi), (-0.5,0.5), (-0.5,0.5), (-0.3,0.3)) #reasonable bounds for the variables
result = optimize.minimize(deviation, initial_guess, args=(data,), method='L-BFGS-B', bounds=bnds, tol=1e-8) #perform minimisation
semi_major,eccentricity,inclination,periapsis,longitude,dx,dy,dz = result["x"]
To minimize this error (or deviation) function:
def deviation(variables, data):
    """
    This function calculates the cumulative separation between the ellipse fit points and data points and returns it
    """
    num_pts = len(data[:,0])
    semi_major,eccentricity,inclination,periapsis,longitude,dx,dy,dz = variables
    dummy_ellipse = generate_ellipse(num_pts,semi_major,eccentricity,inclination,periapsis,longitude,dz,dy,dz)
    deviations = np.zeros(len(data[:,0]))
    pair_deviations = np.zeros(len(data[:,0]))
    # Calculate separation between each pair of points
    for j in range(len(data[:,0])):
        for i in range(len(data[:,0])):
            pair_deviations[i] = np.sqrt((data[j,0]-dummy_ellipse[i,0])**2 + (data[j,1]-dummy_ellipse[i,1])**2 + (data[j,2]-dummy_ellipse[i,2])**2)
        deviations[j] = min(pair_deviations) # only pick the closest point to the data point j.
    total_deviation = sum(deviations)
    return total_deviation
(My code may be a bit messy & inefficient, I'm new to this)
I may be making some logical error in my coding, but I think it comes down to the scipy.optimize.minimize function. I don't know enough about how it works and what to expect of it. I was also recommended to try Markov chain Monte Carlo when dealing with this many variables. I did take a look at emcee, but it's a little above my head right now.
First, you have a typo in your objective function that prevents optimization of one of the variables:
dummy_ellipse = generate_ellipse(...,dz,dy,dz)
should be
dummy_ellipse = generate_ellipse(...,dx,dy,dz)
Also, taking the sqrt out and minimizing the sum of squared Euclidean distances makes it numerically somewhat easier for the optimizer.
Your objective function is also not everywhere differentiable because of the min(), whereas the BFGS solver assumes a smooth objective, so its performance will be suboptimal.
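A rough sketch of both points (squared distances, no sqrt, and no explicit Python loops), reusing generate_ellipse from the question:

import numpy as np

def deviation_sq(variables, data):
    # Sum of squared distances from each data point to its nearest ellipse sample.
    semi_major, eccentricity, inclination, periapsis, longitude, dx, dy, dz = variables
    dummy_ellipse = generate_ellipse(len(data), semi_major, eccentricity,
                                     inclination, periapsis, longitude, dx, dy, dz)
    # (N_data, N_ellipse) matrix of squared distances via broadcasting
    d2 = ((data[:, None, :] - dummy_ellipse[None, :, :])**2).sum(axis=2)
    return d2.min(axis=1).sum()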
Also, approaching the problem from analytical geometry perspective may help: an ellipse in 3d is defined as a solution of two equations
f1(x,y,z,p) = 0
f2(x,y,z,p) = 0
Where p are the parameters of the ellipse. Now, to fit the parameters to a data set, you could try to minimize
F(p) = sum_{j=1}^N [f1(x_j,y_j,z_j,p)**2 + f2(x_j,y_j,z_j,p)**2]
where the sum goes over data points.
Even better, in this problem formulation you could use optimize.leastsq, which may be more efficient in least squares problems.
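A skeleton of that formulation might look like the sketch below, where f1 and f2 are placeholders for your two implicit ellipse equations, p_initial is your starting guess, and data is your (N, 3) array of points:

import numpy as np
from scipy import optimize

def residuals(p, data, f1, f2):
    # Stack both implicit equations evaluated at every data point;
    # leastsq then minimizes the sum of squares of this vector.
    x, y, z = data[:, 0], data[:, 1], data[:, 2]
    return np.concatenate([f1(x, y, z, p), f2(x, y, z, p)])

p_opt, ier = optimize.leastsq(residuals, p_initial, args=(data, f1, f2))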
There is a function which determines the intensity of the Fraunhofer diffraction pattern of a circular aperture... (more information)
The integral of the function over the distance x = [-3.8317, 3.8317] should be about 83.8% (if we assume that I0 is 100), and when you increase the distance to [-13.33, 13.33] it should be about 95%.
But when I compute the integral in Python, the answer is wrong. I don't know what's going wrong in my code :(
from scipy.integrate import quad
from scipy import special as sp
I0=100.0
dist=3.8317
I= quad(lambda x:( I0*((2*sp.j1(x)/x)**2)) , -dist, dist)[0]
print(I)
The result of the integral can't be bigger than 100 (I0), because this is the diffraction of I0... I don't know, maybe it's the scaling, maybe the method! :(
The problem seems to be in the function's behaviour near zero. If the function is plotted, it looks smooth:
However, scipy.integrate.quad complains about round-off errors, which seems strange for such a smooth curve. The reason is that the function is not defined at 0 (of course, you are dividing by zero!), so the integration does not go well.
You could use a simpler integration method or do something about your function. You could also integrate from both sides up to a point very close to zero. Even so, with these numbers the integral does not match the results you expect.
I have a hunch, though, about what your real problem is. As far as I remember, the integral you have shown is actually the intensity (power/area) of Fraunhofer diffraction as a function of distance from the center. If you want the total power within some radius, you have to integrate in two dimensions.
By simple area integration rules you should multiply your function by 2 pi r before integrating (or x instead of r in your case). Then it becomes:
f = lambda r: r * (sp.j1(r)/r)**2
or
f = lambda r: sp.j1(r)**2 / r
or even better:
f = lambda r: r * (sp.j0(r) + sp.jn(2, r))**2
The last form is best as it does not suffer from any singularities; it is based on Jaime's comment on the original answer.
(Note that I omitted a couple of constants.) Now you can integrate it from zero to infinity (no negative radii):
fullpower = quad(f, 1e-9, np.inf)[0]
Then you can integrate from some other radius and normalize by the full intensity:
pwr = quad(f, 1e-9, 3.8317)[0] / fullpower
And you get 0.839 (which is quite close to 84 %). If you try the farther radius (13.33):
pwr = quad(f, 1e-9, 13.33)[0] / fullpower
which gives 0.954.
It should be noted that we introduce a small error by starting the integration from 1e-9 instead of 0. The magnitude of the error can be estimated by trying different values for the starting point. The integration result changes very little between 1e-9 and 1e-12, so they seem to be safe. Of course, you could use, e.g., 1e-30, but then there may be numerical instability in the division. (In this case there isn't, but in general singularities are numerically evil.)
Let us do one more thing:
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0.01, 20, 1000)
intg = np.array([quad(f, 1e-9, xx)[0] for xx in x])
plt.plot(x, intg/fullpower)
plt.grid(True)
plt.show()
And this is what we get:
At least this looks right: the dark fringes of the Airy disk are clearly visible.
What comes to the last part of the question: I0 defines the maximum intensity (the units may be, e.g. W/m2), whereas the integral gives total power (if the intensity is in W/m2, the total power is in W). Setting the maximum intensity to 100 does not guarantee anything about the total power. That is why it is important to calculate the total power.
There actually exists a closed form equation for the total power radiated onto a circular area:
P(x) = P0 ( 1 - J0(x)^2 - J1(x)^2 ),
where P0 is the total power.
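As a quick sanity check, this closed form can be compared against the radial integral used above (the omitted constants cancel in the ratio, so P0 can be taken as 1):

import numpy as np
from scipy import special as sp
from scipy.integrate import quad

def encircled_fraction(a):
    # Closed form: fraction of the total power inside radius a
    return 1.0 - sp.j0(a)**2 - sp.j1(a)**2

print(encircled_fraction(3.8317))   # ~0.84
print(encircled_fraction(13.33))    # ~0.95

# Cross-check against the numerical radial integral
f = lambda r: r * (sp.j0(r) + sp.jn(2, r))**2
fullpower = quad(f, 1e-9, np.inf)[0]
print(quad(f, 1e-9, 3.8317)[0] / fullpower)  # should agree with the closed form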
Note that you also can get a closed form solution for your integration using Sympy:
import sympy as sy
sy.init_printing() # LaTeX like pretty printing in IPython
x,d = sy.symbols("x,d", real=True)
I0=100
dist=3.8317
f = I0*((2*sy.besselj(1,x)/x)**2) # the integrand
F = f.integrate((x, -d, d)) # symbolic integration
print(F.evalf(subs={d:dist})) # numeric evalution
F evaluates to:
1600*d*besselj(0, Abs(d))**2/3 + 1600*d*besselj(1, Abs(d))**2/3 - 800*besselj(1, Abs(d))**2/(3*d)
with besselj(0,r) corresponding to sp.j0(r).
There might be a singularity in the integration algorithm when it evaluates the integrand at x = 0. You can exclude this point from the integration with the "points" argument:
f = lambda x: I0 * (2*sp.j1(x)/x)**2
I = quad(f, -dist, dist, points=[0])[0]
I then get the following result (is this your desired result?)
331.4990321315221
I'm trying to simulate a simple diffusion based on Fick's 2nd law.
from pylab import *
import numpy as np
gridpoints = 128
def profile(x):
    range = 2.
    straggle = .1576
    dose = 1
    return dose/(sqrt(2*pi)*straggle)*exp(-(x-range)**2/2/straggle**2)
x = linspace(0,4,gridpoints)
nx = profile(x)
dx = x[1] - x[0] # use np.diff(x) if x is not uniform
dxdx = dx**2
figure(figsize=(12,8))
plot(x,nx)
timestep = 0.5
steps = 21
diffusion_coefficient = 0.002
for i in range(steps):
    coefficients = [-1.785714e-3, 2.539683e-2, -0.2e0, 1.6e0,
                    -2.847222e0,
                    1.6e0, -0.2e0, 2.539683e-2, -1.785714e-3]
    ccf = (np.convolve(nx, coefficients) / dxdx)[4:-4] # second order derivative
    nx = timestep*diffusion_coefficient*ccf + nx
    plot(x,nx)
For the first few time steps everything looks fine, but then I start to get high-frequency noise, due to build-up of numerical errors which are amplified through the second derivative. Since it seems to be hard to increase the float precision, I'm hoping there is something else I can do to suppress this? I already increased the number of points that are used to construct the 2nd derivative.
I don't have the time to study your solution in detail, but it seems that you are solving the partial differential equation with a forward Euler scheme. This is pretty easy to implement, as you show, but it can become numerically unstable if your timestep is too large. Your only options are to reduce the timestep or to coarsen the spatial grid (increase dx).
The easiest way to explain this is for the 1-D case: assume your concentration is a function of spatial coordinate x and timestep i. If you do all the math (write down your equations, substitute the partial derivatives with finite differences, should be pretty easy), you will probably get something like this:
C(x, i+1) = [1 - 2 * k] * C(x, i) + k * [C(x - 1, i) + C(x + 1, i)]
so the concentration of a point on the next step depends on its previous value and the ones of its two neighbors. It is not too hard to see that when k = 0.5, every point gets replaced by the average of its two neighbors, so a concentration profile of [...,0,1,0,1,0,...] will become [...,1,0,1,0,1,...] on the next step. If k > 0.5, such a profile will blow up exponentially. You calculate your second order derivative with a longer convolution (I effectively use [1,-2,1]), but I guess that does not change anything for the instability problem.
I don't know about normal diffusion, but based on experience with thermal diffusion, I would guess that k scales with dt * diffusion_coeff / dx^2. You thus have to choose your timestep small enough that your simulation does not become unstable. To make the simulation stable but still as fast as possible, choose your parameters so that k is a bit smaller than 0.5. Something similar can be derived for the 2-D and 3-D cases. The easiest way to achieve this is to increase dx, since your total calculation time will scale with 1/dx^3 for a 1-D (linear) problem, 1/dx^4 for 2-D problems, and even 1/dx^5 for 3-D problems.
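Plugging the numbers from the question into this estimate (a rough check only; the nine-point stencil used in the question has a slightly different exact bound, but the order of magnitude is what matters):

import numpy as np

# Parameters from the question
gridpoints = 128
x = np.linspace(0, 4, gridpoints)
dx = x[1] - x[0]
timestep = 0.5
diffusion_coefficient = 0.002

k = timestep * diffusion_coefficient / dx**2
print(k)   # ~1.0, i.e. above the ~0.5 stability limit of the explicit scheme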
There are better methods for solving diffusion equations; I believe Crank-Nicolson is more or less the standard for solving heat equations (which are also diffusion problems). The 'problem' is that it is an implicit method, which means you have to solve a set of equations to calculate your 'concentration' at the next timestep, which is a bit of a pain to implement. But this method is guaranteed to be numerically stable, even for big timesteps.
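For reference, a minimal Crank-Nicolson step for the 1-D problem might look like the sketch below. It uses the standard [1, -2, 1] stencil for the second derivative and implicitly assumes the concentration is zero just outside the grid, so adapt the boundary rows if you need different boundary conditions:

import numpy as np
from scipy.sparse import diags, identity
from scipy.sparse.linalg import spsolve

def crank_nicolson_step(n, dt, D, dx):
    # One Crank-Nicolson step for dn/dt = D * d2n/dx2
    N = len(n)
    r = D * dt / dx**2
    L = diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(N, N), format='csc')
    A = identity(N, format='csc') - 0.5 * r * L   # implicit half
    B = identity(N, format='csc') + 0.5 * r * L   # explicit half
    return spsolve(A, B.dot(n))

# Usage with the question's variables (stays stable even for timestep = 0.5):
# for i in range(steps):
#     nx = crank_nicolson_step(nx, timestep, diffusion_coefficient, dx)
#     plot(x, nx)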
I spent some time on a problem these past few days. I have a set of data:
y = f(t), where y is a very small concentration (~10^-7) and t is in seconds; t varies from 0 to around 12000.
The measurements follow an established model:
y = Vs * t - ((Vs - Vi) * (1 - np.exp(-k * t)) / k)
And I need to find Vs, Vi, and k. So I used curve_fit, which returns the best fitting parameters, and I plotted the curve.
And then I used a similar model:
y = (Vs * t/3600 - ((Vs - Vi) * (1 - np.exp(-k * t/3600)) / k)) * 10**7
With that, t is a number of hours and y is a number between 0 and about 10. The parameters returned are of course different. But when I plot each curve, here is what I get:
http://i.imgur.com/XLa4LtL.png
The green fit is the first model, the blue one with the "normalized" model. And the red dots are the experimental values.
The fitting curves are different. I did not expect that, and I don't understand why. Are the calculations more accurate if the numbers are "reasonable"?
The docstring for optimize.curve_fit says,
p0 : None, scalar, or M-length sequence
    Initial guess for the parameters. If None, then the initial
    values will all be 1 (if the number of parameters for the function
    can be determined using introspection, otherwise a ValueError
    is raised).
Thus, to begin with, the initial guess for the parameters is by default 1.
Moreover, curve fitting algorithms have to sample the function for various values of the parameters. The "various values" are initially chosen with an initial step size on the order of 1. The algorithm will work better if your data varies somewhat smoothly with changes in the parameter values that are on the order of 1.
If the function varies wildly with parameter changes on the order of 1, then the algorithm may tend to miss the optimum parameter values.
Note that even if the algorithm uses an adaptive step size when it tweaks the parameter values, if the initial tweak is so far off the mark as to produce a big residual, and if tweaking in some other direction happens to produce a smaller residual, then the algorithm may wander off in the wrong direction and miss the local minimum. It may find some other (undesired) local minimum, or simply fail to converge. So using an algorithm with an adaptive step size won't necessarily save you.
The moral of the story is that scaling your data can improve the algorithm's chances of finding the desired minimum.
Numerical algorithms in general all tend to work better when applied to data whose magnitude is on the order of 1. This bias enters into the algorithm in numerous ways. For instance, optimize.curve_fit relies on optimize.leastsq, and the call signature for optimize.leastsq is:
def leastsq(func, x0, args=(), Dfun=None, full_output=0,
            col_deriv=0, ftol=1.49012e-8, xtol=1.49012e-8,
            gtol=0.0, maxfev=0, epsfcn=None, factor=100, diag=None):
Thus, by default, the tolerances ftol and xtol are on the order of 1e-8. If finding the optimum parameter values requires much smaller tolerances, then these hard-coded default numbers will cause optimize.curve_fit to miss the optimum parameter values.
To make this more concrete, suppose you were trying to minimize f(x) = 1e-100*x**2. The factor of 1e-100 squashes the y-values so much that a wide range of x-values (the parameter values mentioned above) will fit within the tolerance of 1e-8. So, with un-ideal scaling, leastsq will not do a good job of finding the minimum.
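A tiny illustration of that point, using optimize.minimize for brevity (the same tolerance effect applies to leastsq):

import numpy as np
from scipy import optimize

def f_tiny(x):
    return 1e-100 * x[0]**2   # badly scaled objective

def f_scaled(x):
    return x[0]**2            # same minimizer, sensible scale

res_tiny = optimize.minimize(f_tiny, x0=[3.0])
res_scaled = optimize.minimize(f_scaled, x0=[3.0])

print(res_tiny.x)    # typically stays near 3.0: the gradient ~1e-100 already looks "converged"
print(res_scaled.x)  # ~0, the true minimizer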
Another reason to use floats on the order of 1 is because there are many more (IEEE754) floats in the interval [-1,1] than there are far away from 1. For example,
import struct
def floats_between(x, y):
    """
    http://stackoverflow.com/a/3587987/190597 (jsbueno)
    """
    a = struct.pack("<dd", x, y)
    b = struct.unpack("<qq", a)
    return b[1] - b[0]
In [26]: floats_between(0,1) / float(floats_between(1e6,1e7))
Out[26]: 311.4397707054894
This shows there are over 300 times as many floats representing numbers between 0 and 1 as there are in the interval [1e6, 1e7].
Thus, all else being equal, you'll typically get a more accurate answer if working with small numbers than very large numbers.
I would imagine it has more to do with the initial parameter estimates you are passing to curve fit. If you are not passing any I believe they all default to 1. Normalizing your data makes those initial estimates closer to the truth. If you don't want to use normalized data just pass the initial estimates yourself and give them reasonable values.
Others have already mentioned that you probably need a good starting guess for your fit. In cases like this, I usually try to find some quick and dirty tricks to get at least a ballpark estimate of the parameters. In your case, for large t the exponential decays pretty quickly to zero, so for large t you have
y == Vs * t - (Vs - Vi) / k
Doing a first-order linear fit like
[slope1, offset1] = polyfit(t[t > 2000], y[t > 2000], 1)
you will get slope1 == Vs and offset1 == (Vi - Vs) / k.
Subtracting this straight line from all the points you have, you get the exponential
residual == y - slope1 * t - offset1 == (Vs - Vi) * exp(-t * k) / k
Taking the log of both sides, you get
log(residual) == log((Vs - Vi) / k) - t * k
So doing a second fit
[slope2, offset2] = polyfit(t, log(y - slope1 * t - offset1), 1)
will give you slope2 == -k and offset2 == log((Vs - Vi) / k), which you can solve for Vi since you already know Vs and k. You might have to limit the second fit to small values of t, otherwise you might be taking the log of negative numbers. Collect all the parameters you obtained with these fits and use them as the starting points for your curve_fit.
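Putting those quick-and-dirty fits together, a sketch might look like this (t and y are assumed to be the measured arrays from the question; the cutoff of 2000 s is the one used above):

import numpy as np
from scipy.optimize import curve_fit

def model(t, Vs, Vi, k):
    return Vs * t - (Vs - Vi) * (1 - np.exp(-k * t)) / k

# 1) Linear fit of the late-time straight part: slope1 ~ Vs, offset1 ~ (Vi - Vs) / k
late = t > 2000
slope1, offset1 = np.polyfit(t[late], y[late], 1)
Vs0 = slope1

# 2) Log-linear fit of the early-time exponential residual
residual = y - slope1 * t - offset1              # ~ (Vs - Vi) * exp(-k * t) / k
early = (t < 2000) & (residual > 0)              # avoid log of non-positive values
slope2, offset2 = np.polyfit(t[early], np.log(residual[early]), 1)
k0 = -slope2
Vi0 = Vs0 - k0 * np.exp(offset2)

# 3) Use the ballpark values as the starting point for the full fit
popt, pcov = curve_fit(model, t, y, p0=[Vs0, Vi0, k0])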
Finally, you might want to look into doing some sort of weighted fit. The information about the exponential part of your curve is contained in just the first few points, so maybe you should give those a higher weight. Doing this in a statistically correct way is not trivial.