Integral of Intensity function in python

There is a function which determines the intensity of the Fraunhofer diffraction pattern of a circular aperture.
The integral of the function over x = [-3.8317, 3.8317] should be about 83.8% (assuming that I0 is 100), and when you widen the interval to [-13.33, 13.33] it should be about 95%.
But when I integrate it in Python, the answer is wrong. I don't know what's going wrong in my code :(
from scipy.integrate import quad
from scipy import special as sp
I0=100.0
dist=3.8317
I= quad(lambda x:( I0*((2*sp.j1(x)/x)**2)) , -dist, dist)[0]
print(I)
The result of the integral can't be bigger than 100 (I0), because this is the diffraction of I0. I don't know, maybe it is the scaling, maybe the method! :(

The problem seems to be in the function's behaviour near zero. If the function is plotted, it looks smooth:
However, scipy.integrate.quad complains about round-off errors, which is very strange for such a smooth curve. The function is not defined at 0 (you are dividing by zero there), hence the integration does not go well.
You may use a simpler integration method or do something about your function. You may also be able to integrate it to very close to zero from both sides. Even so, with these numbers the integral does not match the results you expect.
I have a hunch about what your problem is. As far as I remember, the function you have shown is actually the intensity (power/area) of Fraunhofer diffraction as a function of distance from the center. If you want to integrate the total power within some radius, you will have to do it in two dimensions.
By simple area integration rules you should multiply your function by 2 pi r before integrating (or x instead of r in your case). Then it becomes:
f = lambda r: r*(sp.j1(r)/r)**2
or
f = lambda r: sp.j1(r)**2/r
or even better:
f = lambda r: r * (sp.j0(r) + sp.jn(2,r))**2
The last form is best as it does not suffer from any singularities. It is based on Jaime's comment to the original answer (see the comment below this answer!).
(Note that I omitted a couple of constants.) Now you can integrate it from zero to infinity (no negative radii):
fullpower = quad(f, 1e-9, np.inf)[0]
Then you can integrate from some other radius and normalize by the full intensity:
pwr = quad(f, 1e-9, 3.8317)[0] / fullpower
And you get 0.839 (which is quite close to 84 %). If you try the farther radius (13.33):
pwr = quad(f, 1e-9, 13.33)[0] / fullpower
which gives 0.954.
It should be noted that we introduce a small error by starting the integration from 1e-9 instead of 0. The magnitude of the error can be estimated by trying different values for the starting point. The integration result changes very little between 1e-9 and 1e-12, so they seem to be safe. Of course, you could use, e.g., 1e-30, but then there may be numerical instability in the division. (In this case there isn't, but in general singularities are numerically evil.)
Let us do one thing still:
import matplotlib.pyplot as plt
import numpy as np
x = np.linspace(0.01, 20, 1000)
intg = np.array([ quad(f, 1e-9, xx)[0] for xx in x])
plt.plot(x, intg/fullpower)
plt.grid(True)
plt.show()
And this is what we get:
At least this looks right, the dark fringes of the Airy disk are clearly visible.
As for the last part of the question: I0 defines the maximum intensity (the units may be, e.g., W/m2), whereas the integral gives the total power (if the intensity is in W/m2, the total power is in W). Setting the maximum intensity to 100 does not guarantee anything about the total power. That is why it is important to calculate the total power.
There actually exists a closed form equation for the total power radiated onto a circular area:
P(x) = P0 ( 1 - J0(x)^2 - J1(x)^2 ),
where P0 is the total power.
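As a quick check of the closed form above, the following sketch compares it with direct numerical integration of the radially weighted intensity. The function names `encircled_closed_form` and `encircled_numeric` are made up for this illustration, and the normalisation by 2 assumes that the integral of r*(J0(r) + J2(r))^2 from zero to infinity equals 2.

```python
from scipy import special as sp
from scipy.integrate import quad

def encircled_closed_form(x):
    """Fraction of the total power inside radius x: 1 - J0(x)^2 - J1(x)^2."""
    return 1.0 - sp.j0(x)**2 - sp.j1(x)**2

def encircled_numeric(x):
    """Same fraction by integrating r*(J0(r) + J2(r))^2, which is finite at r = 0."""
    integrand = lambda r: r * (sp.j0(r) + sp.jv(2, r))**2
    # Assumed normalisation: the integral over [0, inf) is 2.
    return quad(integrand, 0.0, x)[0] / 2.0

for radius in (3.8317, 13.33):
    print(radius, encircled_closed_form(radius), encircled_numeric(radius))
```

Both columns should agree to within the quadrature tolerance, which avoids having to integrate the oscillating tail out to infinity.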

Note that you can also get a closed-form solution for your integral using Sympy:
import sympy as sy
sy.init_printing() # LaTeX like pretty printing in IPython
x,d = sy.symbols("x,d", real=True)
I0=100
dist=3.8317
f = I0*((2*sy.besselj(1,x)/x)**2) # the integrand
F = f.integrate((x, -d, d)) # symbolic integration
print(F.evalf(subs={d:dist})) # numeric evaluation
F evaluates to:
1600*d*besselj(0, Abs(d))**2/3 + 1600*d*besselj(1, Abs(d))**2/3 - 800*besselj(1, Abs(d))**2/(3*d)
with besselj(0,r) corresponding to sp.j0(r).

There might be a problem in the integration algorithm at x = 0, where the integrand is singular. You can exclude this point from the integration with "points":
f = lambda x:( I0*((2*sp.j1(x)/x)**2))
I = quad(f, -dist, dist, points = [0])
I then get the following result (is this your desired result?)
331.4990321315221

Related

Double antiderivative computation in python

I have the following problem. I have a function f defined in python using numpy functions. The function is smooth and integrable on positive reals. I want to construct the double antiderivative of the function (assuming that both the value and the slope of the antiderivative at 0 are 0) so that I can evaluate it on any positive real smaller than 100.
Definition of antiderivative of f at x:
integrate f(s) with s from 0 to x
Definition of double antiderivative of f at x:
integrate (integrate f(t) with t from 0 to s) with s from 0 to x
The actual form of f is not important, so I will use a simple one for convenience. But please note that even though my example has a known closed form, my actual function does not.
import numpy as np
f = lambda x: np.exp(-x)*x
My solution is to construct the antiderivative as an array using naive numerical integration:
N = 10000
delta = 100/N
xs = np.linspace(0,100,N+1)
vs = f(xs)
avs = np.cumsum(vs)*delta
aavs = np.cumsum(avs)*delta
This of course works but it gives me arrays instead of functions. But this is not a big problem as I can interpolate aavs using a spline to get a function and get rid of the arrays.
from scipy.interpolate import UnivariateSpline
aaf = UnivariateSpline(xs, aavs)
The function aaf is approximately the double antiderivative of f.
The problem is that even though it works, there is quite a bit of overhead before I can get my function, and precision is expensive.
My other idea was to interpolate f by a spline and take the antiderivative of that. However, this introduces numerical errors that are too big for what I want to use the function for.
Is there any better way to do that? By better I mean faster without sacrificing accuracy.
Edit: What I hope is possible is to use some kind of Fourier transform to avoid integrating twice. I hope that there is some convenient transform of vs that allows me to multiply the values component-wise with xs and transform back to get the double antiderivative. I played with this a bit, but I got lost.
Edit: I figured out that using the trapezoidal rule instead of a naive sum increases the accuracy quite a bit. Using Simpson's rule should increase the accuracy further, but it's somewhat fiddly to do with numpy arrays.
Edit: As @user202729 rightfully complains, this seems off. The reason it seems off is that I have skipped some details. I explain here why what I say makes sense, but it does not affect my question.
My actual goal is not to find the double antiderivative of f, but to find a transformation of it. I have skipped that because I think it only confuses the matter.
The function f decays exponentially as x approaches 0 or infinity. I am minimizing the numerical error in the integration by starting the sum from 0 and going up to approximately the peak of f. This ensures that the relative error is approximately constant. Then I start from the opposite direction, from some very big x, and go back to the peak. Then I do the same for the antiderivative values.
Then I transform the aavs by another function which is sensitive to numerical errors. Then I find the region where the errors are big (the values oscillate violently) and drop these values. Finally I approximate what I believe are good values by a spline.
Now if I use a spline to approximate f, it introduces an absolute error which is the dominant term in a rather large interval. This gets "integrated" twice and ends up being a rather large relative error in aavs. Then once I transform aavs, I find that the 'good region' has shrunk considerably.
EDIT: The actual form of f is something I'm still looking into. However, it is going to be a generalisation of the lognormal distribution. Right now I am playing with the following family.
I start by defining a generalization of the normal distribution:
from scipy import special
def pdf_n(params, center=0.0, slope=8):
    scale, min, diff = params
    if diff > 0:
        r = min
        l = min + diff
    else:
        r = min - diff
        l = min
    def retfun(m):
        x = (m - center)/scale
        # exponent blends smoothly from l (left) to r (right)
        E = special.expit(slope*x)*(r - l) + l
        return np.exp( -np.power(1 + x*x, E)/2 )
    return np.vectorize(retfun)
It may not be obvious what is happening here, but the result is quite simple. The function decays as exp(-x^(2l)) on the left and as exp(-x^(2r)) on the right. For min=1 and diff=0, this is the normal distribution. Note that this is not normalized. Then I define
g = pdf(params)
f = np.vectorize(lambda x:g(np.log(x))/x/area)
where area is the normalization constant.
Note that this is not the actual code I use. I stripped it down to the bare minimum.
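The trapezoidal-rule variant mentioned in the edits above can be sketched with `scipy.integrate.cumulative_trapezoid` (named `cumtrapz` in older SciPy releases). This is only an illustration on the simple example integrand; the closed-form reference `exact` is available here only because that example has one, and it is an assumption of this sketch rather than part of the original code.

```python
import numpy as np
from scipy.integrate import cumulative_trapezoid

f = lambda x: np.exp(-x) * x            # example integrand from the question

N = 10000
xs = np.linspace(0.0, 100.0, N + 1)
vs = f(xs)

# initial=0 prepends the value at x = 0, so each result has the same length as xs
avs = cumulative_trapezoid(vs, xs, initial=0)     # antiderivative
aavs = cumulative_trapezoid(avs, xs, initial=0)   # double antiderivative

# Closed form of the double antiderivative for this particular example only
exact = xs - 2.0 + (xs + 2.0) * np.exp(-xs)
max_err = np.max(np.abs(aavs - exact))
print(max_err)
```

The arrays can then be passed to an interpolator exactly as in the cumsum version.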

You can compute the two np.cumsum (and the divisions) at once more efficiently using Numba. This is significantly faster since there is no need for several temporary arrays to be allocated, filled, read again and freed. Here is a naive implementation:
import numba as nb
@nb.njit('float64[::1](float64[::1], float64)') # Assume vs is contiguous
def doubleAntiderivative_naive(vs, delta):
    res = np.empty(vs.size, dtype=np.float64)
    sum1, sum2 = 0.0, 0.0
    for i in range(vs.size):
        sum1 += vs[i] * delta
        sum2 += sum1 * delta
        res[i] = sum2
    return res
However, the sum is not very good in terms of numerical stability. A Kahan summation is needed to improve the accuracy (or possibly the alternative Kahan–Babuška-Klein algorithm if you are paranoid about the accuracy and performance does not matter so much). Note that Numpy uses a pair-wise algorithm which is quite good but far from being perfect in terms of accuracy (this is a good compromise for both performance and accuracy).
Moreover, delta can be factored out of the summation (i.e. the result just needs to be premultiplied by delta**2).
Here is an implementation using the more accurate Kahan summation:
@nb.njit('float64[::1](float64[::1], float64)')
def doubleAntiderivative_accurate(vs, delta):
    res = np.empty(vs.size, dtype=np.float64)
    delta2 = delta * delta
    sum1, sum2 = 0.0, 0.0
    c1, c2 = 0.0, 0.0
    for i in range(vs.size):
        # Kahan summation of the antiderivative of vs
        y1 = vs[i] - c1
        t1 = sum1 + y1
        c1 = (t1 - sum1) - y1
        sum1 = t1
        # Kahan summation of the double antiderivative of vs
        y2 = sum1 - c2
        t2 = sum2 + y2
        c2 = (t2 - sum2) - y2
        sum2 = t2
        res[i] = sum2 * delta2
    return res
Here is the performance of the approaches on my machine (with an i5-9600KF processor):
Numpy cumsum: 51.3 us
Naive Numba: 11.6 us
Accurate Numba: 37.2 us
Here is the relative error of the approaches (based on the provided input function):
Numpy cumsum: 1e-13
Naive Numba: 5e-14
Accurate Numba: 2e-16
Perfect precision: 1e-16 (assuming 64-bit numbers are used)
If f can be easily computed using Numba (this is the case here), then vs[i] can be replaced by calls to f (inlined by Numba). This helps to reduce the memory consumption of the computation (N can be huge without saturating your RAM).
As for the interpolation, splines often give good numerical results, but they are quite expensive to compute and AFAIK they require the whole array to be computed (each item of the array affects the whole spline, although some items may have a negligible impact on their own). For your needs, you could consider using Lagrange polynomials. You should be careful when using Lagrange polynomials near the edges. In your case, you can easily solve the numerical divergence issue at the edges by extending the array with the border values (since you know the derivative at each edge of vs is 0). With this method you can apply the interpolation on the fly, which can be good for both performance (typically if the computation is parallelized) and memory usage.

First, I created a version of the code I found more intuitive. Here I multiply cumulative sum values by bin widths. I believe there is a small error in the original version of the code related to the bin width issue.
import numpy as np
f = lambda x: np.exp(-x)*x
N = 1000
xs = np.linspace(0,100,N+1)
domainwidth = ( np.max(xs) - np.min(xs) )
binwidth = domainwidth / N
vs = f(xs)
avs = np.cumsum(vs)*binwidth
aavs = np.cumsum(avs)*binwidth
Next, for visualization here is some very simple plotting code:
import matplotlib
import matplotlib.pyplot as plt
plt.figure()
plt.scatter( xs, vs )
plt.figure()
plt.scatter( xs, avs )
plt.figure()
plt.scatter( xs, aavs )
plt.show()
The first integral matches the known result of the example expression, which can be checked on Wolfram.
Below is a simple function that extracts an element from the second antiderivative. Note that int is a bad rounding function. I assume this is what you have implemented already.
def extract_double_antideriv_value(x):
    return aavs[int(x/binwidth)]
singleresult = extract_double_antideriv_value(50.24)
print('singleresult', singleresult)
Whatever full computation steps are required, we need to know them before we can start optimizing. Do you have a million different functions to integrate? If you only need to query a single double anti-derivative many times, your original solution should be fairly ideal.
Symbolic Approximation:
Have you considered approximations to the original function f that have closed-form integrals? You have a limited domain on which the function lives. Perhaps approximate f with a Taylor series (which can be constructed with a known maximum error) and then integrate exactly? (Consider Padé, Taylor, Fourier, Chebyshev, Lagrange (as suggested by another answer), etc.)
Log Tricks:
Another alternative for dealing with spiky errors would be to take the log of your original function. Is f always positive? Is the integration error caused by the neighborhood around the maximum being very small? If so, you can study ln(f) or even ln(ln(f)) instead. It would really help to understand better what f looks like.
Approximation Integration Tricks
There exist countless integration tricks in general which can produce approximate closed-form solutions to otherwise intractable integrals. A very common one when exponential functions are involved (I think yours is exponential?) is Laplace's method. But which trick to pull out of the bag is highly dependent upon the conditions which f satisfies.

Integrating a gaussian over a very long interval

I want to integrate a Gaussian function over a very large interval. I chose the scipy.integrate.quad function for the integration. The function seems to work only when I select a small enough interval. When I run the code below,
from scipy.integrate import quad
from math import pi, exp, sqrt
def func(x, mean, sigma):
    return 1/(sqrt(2*pi)*sigma) * exp(-1/2*((x-mean)/sigma)**2)
print(quad(func, 0, 1e+31, args=(1e+29, 1e+28))[0]) # case 1
print(quad(func, 0, 1e+32, args=(1e+29, 1e+28))[0]) # case 2
print(quad(func, 0, 1e+33, args=(1e+29, 1e+28))[0]) # case 3
print(quad(func, 1e+25, 1e+33, args=(1e+29, 1e+28))[0]) # case 4
then the following is printed.
1.0
1.0000000000000004
0.0
0.0
To obtain a reasonable result, I had to change the lower and upper bounds of the integral several times and empirically settle on [0, 1e+32]. This seems risky to me, because whenever the mean and sigma of the Gaussian change, I have to try different bounds again.
Is there a clear way to integrate the function from 0 to 1e+50 without bothering with bounds? If not, how can you tell in advance which bounds would give a non-zero value?

In short, you can't.
On this long interval, the region where the gaussian is non-zero is tiny, and the adaptive procedure which works under the hood of integrate.quad fails to see it. And so would pretty much any adaptive routine, unless by chance.
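One way to choose bounds in advance, given the point above that the adaptive routine cannot find a tiny bump on its own, is to clip the requested interval to a few standard deviations around the mean before calling quad. The sketch below assumes a cutoff of k = 10 standard deviations and a helper name `gauss_integral`; both are illustrative choices, not part of the original question.

```python
from math import exp, pi, sqrt
from scipy.integrate import quad

def func(x, mean, sigma):
    return 1.0 / (sqrt(2.0 * pi) * sigma) * exp(-0.5 * ((x - mean) / sigma) ** 2)

def gauss_integral(a, b, mean, sigma, k=10.0):
    """Integrate func over [a, b] after clipping to mean +/- k*sigma (k is an assumed cutoff)."""
    lo = max(a, mean - k * sigma)
    hi = min(b, mean + k * sigma)
    if lo >= hi:
        # The whole interval lies in a tail beyond the cutoff; treated as negligible here.
        return 0.0
    return quad(func, lo, hi, args=(mean, sigma))[0]

result = gauss_integral(0.0, 1e50, 1e29, 1e28)
print(result)
```

The mass discarded outside the clipped interval is below the double-precision resolution for k = 10, so the requested bounds no longer need to be tuned by hand.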

Notice that the integral of a normal density with mean m and standard deviation s over [a, b] is a difference of two CDF values. The CDF of a standard normal random variable is known as ϕ(x), as it cannot be expressed by an elementary function. So take ϕ((b-m)/s) - ϕ((a-m)/s). Also note that ϕ(x) = 1/2(1 + erf(x/sqrt(2))), so you need not call quad to actually perform an integration and may have better luck with erf from scipy.
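import numpy as np
from scipy.special import erf
def prob(mu, sigma, a, b):
    # CDF of a normal distribution with mean mu and standard deviation sigma
    phi = lambda x: 1/2*(1 + erf((x - mu)/(sigma*np.sqrt(2))))
    return phi(b) - phi(a)
This may give more accurate results (it does compared with the quad calls above)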
>>> print(prob(0, 1e+31, 0, 1e+50))
0.5
>>> print(prob(0, 1e+32, 1e+28, 1e+29))
0.000359047985937333
>>> print(prob(0, 1e+33, 1e+28, 1e+29))
3.5904805169684195e-05
>>> print(prob(1e+25, 1e+33, 1e+28, 1e+29))
3.590480516979522e-05
and avoid the intense floating point error you are experiencing. However, the regions you integrate are so small in area that you may still see 0.

Why isn't my code using 4th-order Runge-Kutta giving me the expected values?

I'm having a little trouble trying to understand what's wrong with my code; any help would be appreciated.
I wanted to solve this simple equation, y' = 3y - 4exp(-t) with y(0) = 1 (as implemented in the code below).
However, the values my code gives don't match the ones in my book or the Wolfram ones: y goes up as x grows.
import matplotlib.pyplot as plt
from numpy import exp
from scipy.integrate import ode
# initial values
y0, t0 = [1.0], 0.0
def f(t, y):
    f = [3.0*y[0] - 4.0/exp(t)]
    return f
# initialize the 4th order Runge-Kutta solver
r = ode(f).set_integrator('dopri5')
r.set_initial_value(y0, t0)
t1 = 10
dt = 0.1
x, y = [], []
while r.successful() and r.t < t1:
    x.append(r.t+dt); y.append(r.integrate(r.t+dt))
    print(r.t+dt, r.integrate(r.t+dt))

Your equation in general has the solution
y(x) = (y0-1)*exp(3*x) + exp(-x)
Due to the choice of initial conditions, the exact solution does not contain the growing component of the first term. However, small perturbations due to discretization and floating point errors will generate a non-zero coefficient in the growing term. At the end of the integration interval this random coefficient is multiplied by exp(3*10)=1.069e+13, which will magnify small discretization errors of size 1e-7 to contributions in the result of size 1e+6, as observed when running the original code.
You can force the integrator to be more precise in its internal steps without reducing the output step size dt by setting error thresholds like in
r = ode(f).set_integrator('dopri5', atol=1e-16, rtol=1e-20)
However, you can not avoid the deterioration of the result completely as the floating point errors of size 1e-16 get magnified to global error contributions of size 1e-3.
Also, you should notice that each call of r.integrate(r.t+dt) advances the integrator by dt, so the original loop takes two steps per iteration: the stored values and the printed values belong to alternating time points. If you just want to print the current state of the integrator use
print(r.t,r.y,yexact(r.t,y0))
where the last is to compare to the exact solution which is, as already said,
def yexact(x,y0):
    return [ (y0[0]-1)*exp(3*x)+exp(-x) ]
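Putting the points above together, a loop that calls integrate only once per step and compares against the exact solution could look like the sketch below. The tolerances and the shortened end time (t = 2 instead of 10) are assumptions chosen so that the amplification by exp(3t) stays small; `yexact` here takes a scalar y0 for brevity.

```python
import numpy as np
from scipy.integrate import ode

def f(t, y):
    return [3.0 * y[0] - 4.0 * np.exp(-t)]

def yexact(t, y0):
    return (y0 - 1.0) * np.exp(3.0 * t) + np.exp(-t)

y0, t0 = 1.0, 0.0
dt, n_steps = 0.1, 20   # assumed: stop at t = 2, before errors are amplified much

r = ode(f).set_integrator('dopri5', atol=1e-12, rtol=1e-12)
r.set_initial_value([y0], t0)

ts, ys = [], []
for k in range(1, n_steps + 1):
    r.integrate(t0 + k * dt)    # exactly one output step per iteration
    if not r.successful():
        break
    ts.append(r.t)
    ys.append(r.y[0])

max_err = max(abs(y - yexact(t, y0)) for t, y in zip(ts, ys))
print(ts[-1], ys[-1], max_err)
```

Extending n_steps towards t = 10 shows the error growing roughly like exp(3t), as described above.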

How to calculate the perimeter of an ellipse

I want to calculate the perimeter of an ellipse with given values for minor and major axis. I'm currently using Python.
I have calculated the minor axis and major axis lengths for the ellipse i.e. a and b.
It’s easy to calculate the area but I want to calculate the perimeter of the ellipse for calculating a rounded length. Do you have any idea?

According to Ramanujan's first approximation formula for the perimeter of an ellipse:
>>> import math
>>>
>>> def calculate_perimeter(a,b):
...     perimeter = math.pi * ( 3*(a+b) - math.sqrt( (3*a + b) * (a + 3*b) ) )
...     return perimeter
...
>>> calculate_perimeter(2,3)
15.865437575563961
You can compare the result with google calculator also

There is a definition problem: the major and minor axes differ from the semi-major and semi-minor axes.
The OP should be clear about which is meant, and so should anyone comparing against online solutions.
You can get sympy to solve the problem numerically. I'm using the full-axes definition:
from sympy import *
a, b, w = symbols('a b w')
x = a/2 * cos(w)
y = b/2 * sin(w)
dx = diff(x, w)
dy = diff(y, w)
ds = sqrt(dx**2 + dy**2)
def perimeter(majr, minr):
    return Integral(ds.subs([(a,majr),(b,minr)]), (w, 0, 2*pi)).evalf().doit()
print('test1: a, b = 1 gives dia = 1 circle, perimeter/pi = ',
perimeter(1, 1)/pi.evalf())
print('test2: a, b = 4,6 ellipse perimeter = ', perimeter(4,6))
test1: a, b = 1 gives dia = 1 circle, perimeter/pi = 1.00000000000000
test2: a, b = 4,6 ellipse perimeter = 15.8654395892906
It's also possible to export the symbolic ds expression as a function to try with the integration functions of other Python libraries:
import numpy as np
func_dw = lambdify((w, a, b), ds)
from scipy import integrate
print(integrate.quad(func_dw, 0, 2*np.pi, args=(4, 6)))
(15.865439589290586, 2.23277254813499e-12)
scipy.integrate.quad(func, a, b, args=()...
Returns:
y : float, The integral of func from a to b.
abserr : float, An estimate of the
absolute error in the result

As Mark stated in a comment, you can simply use scipy.special.ellipe. This implementation uses the complete elliptic integral of the second kind as approximated in the original C function ellpe.c. As described in scipy's docs:
the computation uses the approximation,
E(m) ~ P(1-m) - (1-m) log(1-m) Q(1-m)
where P and Q are tenth-order polynomials
from scipy.special import ellipe
a = 3.5
b = 2.1
# eccentricity squared
e_sq = 1.0 - b**2/a**2
# circumference formula
C = 4 * a * ellipe(e_sq)
17.868899204378693

This is kind of a meta answer comparing the ones above.
Actually, Ramanujan's second approximation is more accurate and a bit more complex than the formula in Rezwan4029's answer (which uses Ramanujan's first approximation). The second approximation is:
π * ((a+b) + (3(a-b)²) / (10*(a+b) + sqrt(a² + 14ab + b²)))
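As a rough illustration, the second approximation can be written in Python and checked against the elliptic-integral value. The function names `ramanujan2` and `exact_perimeter` are made up for this sketch, and a and b are taken to be the semi-axes.

```python
import math
from scipy.special import ellipe

def ramanujan2(a, b):
    """Ramanujan's second approximation; a and b are the semi-axes."""
    return math.pi * ((a + b) + 3.0 * (a - b) ** 2
                      / (10.0 * (a + b) + math.sqrt(a * a + 14.0 * a * b + b * b)))

def exact_perimeter(a, b):
    """Reference value via the complete elliptic integral of the second kind."""
    a, b = max(a, b), min(a, b)
    return 4.0 * a * ellipe(1.0 - (b / a) ** 2)

approx = ramanujan2(3.5, 2.1)
reference = exact_perimeter(3.5, 2.1)
print(approx, reference, abs(approx - reference) / reference)
```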
But I looked at all the answers above and compared their results. For good reasons which will become apparent later I chose Gabriel's version as the truth source, i.e. the value to compare the others against.
For the answer Rezwan4029 gave, I plotted the error in percent over a grid of 2**(-10) .. 2**9. This is the result (both axes are the power, so the point (3|5) shows the error for an ellipse of radii 2**3, 2**5):
It is obvious that only the difference in the power is relevant for the error, so I also plotted this:
What emerges in any case is that the error ranges from 0 for circles to 0.45% for extremely eccentric ellipses. Depending on your application this might be completely acceptable or render the solution unusable.
For Ramanujan's 2nd approximation formula the situation is very similar, the error is about 1/10 of the former:
The sympy solution of Mark Dickinson and the scipy solution of Gabriel still show some differences, but they are at most in the range of 1e-6, so a different ballpark. But the sympy solution is extremely slow, so the scipy version should probably be used in most cases.
For the sake of completeness, here's a distribution of the error (this time the logarithm of the error is on the z-axis, otherwise it wouldn't tell us very much, so the height corresponds roughly with the negative of the number of valid digits):
Conclusion: Use the scipy method. It's fast and very likely very accurate, maybe even the most accurate of the three proposed methods.

Use the improvement made by a Russian mathematician a few years ago (not an infinite series calculation but a convergence calculation using the AGM and MAGM): http://www.ams.org/notices/201208/rtx120801094p.pdf or
https://indico-hlit.jinr.ru/event/187/contributions/1769/attachments/543/931/SAdlaj.pdf
An example of its use is in the question "surface plots in matplotlib using a function z = f(x,y) where f cannot be written in standard functions. HowTo?" (a script for drawing a surface including isoperimeter curves, i.e. all X-Y points on such a curve are the half-parameters (semi-axes) of ellipses having the same perimeter). Or contact the mathematician directly, or buy at springernature.com the article "An Arithmetic-Geometric Mean of a Third Kind!", Semjon Adlaj, Federal Research Center “Informatics and Control” of the Russian Academy of Sciences, Vavilov St. 44, Moscow 119333, Russia, SemjonAdlaj@gmail.com
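A minimal pure-Python sketch of the AGM/MAGM approach follows, assuming the perimeter formula 2*pi*N(a^2, b^2)/M(a, b), where M is the arithmetic-geometric mean and N the modified one, with a and b as semi-axes. The function names, the tolerance and the iteration cap are illustrative choices, not taken from the linked papers, and very eccentric ellipses may need more care with rounding.

```python
import math

def agm(x, y, tol=1e-13):
    """Arithmetic-geometric mean of two positive numbers."""
    while abs(x - y) > tol * abs(x):
        x, y = 0.5 * (x + y), math.sqrt(x * y)
    return x

def magm(x, y, tol=1e-13, max_iter=30):
    """Modified arithmetic-geometric mean, using an auxiliary sequence z that starts at 0."""
    z = 0.0
    for _ in range(max_iter):
        if abs(x - y) <= tol * abs(x):
            break
        s = math.sqrt((x - z) * (y - z))
        x, y, z = 0.5 * (x + y), z + s, z - s
    return x

def perimeter_agm(a, b):
    """Ellipse perimeter from semi-axes a and b (assumed formula, see the lead-in)."""
    return 2.0 * math.pi * magm(a * a, b * b) / agm(a, b)

print(perimeter_agm(2.0, 1.0))
```

Both iterations converge quadratically, so only a handful of steps are needed for moderately eccentric ellipses.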

There are some good answers but I wanted to clarify things in terms of exact/approximate calculations, as well as computational speed.
For the exact circumference using pure python, check out my pyellipse code https://gist.github.com/TimSC/4be20baeac7890e15773d31efb752d23 The approach I implemented was proposed by Adlaj 2012 (as suggested by @floppy_molly).
Alternatively, for the exact circumference, use scipy.special.ellipe as described by @Gabriel. This is twice as slow as Adlaj 2012.
For a good approximation that is fast to compute and has no scipy dependency, see Ramanujan's 2nd approximation as described by @Alfe.
For another good approximation that is fast to compute (that avoids using square root), use the Padé approximation by Jacobsen and Waadeland 1985 http://www.numericana.com/answer/ellipse.htm#hudson
h = pow(a-b, 2.0) / pow(a+b, 2.0)
C = (math.pi * (a+b) * (256.0 - 48.0 * h - 21.0 * h*h)
/(256.0 - 112.0 * h + 3.0 * h*h))
There are many other approaches but these are the most useful for normal applications.

Fitting Fresnel Equations Using Scipy

I am attempting a non-linear fit of Fresnel equations with data of reflectance against angle of incidence. Found on this site http://en.wikipedia.org/wiki/Fresnel_equations are two graphs that have a red and blue line. I need to basically fit the blue line when n1 = 1 to my data.
Here I use the following code where th is theta, the angle of incidence.
def Rperp(th, n, norm, constant):
    numerator = np.cos(th) - np.sqrt(n**2.0 - np.sin(th)**2.0)
    denominator = 1.0 * np.cos(th) + np.sqrt(n**2.0 - np.sin(th)**2.0)
    return ((numerator / denominator)**2.0) * norm + constant
The parameters I'm looking for are:
the index of refraction n
some normalization to multiply by and
a constant to shift the baseline of the graph.
My attempt is the following:
xdata = angle[1:] * 1.0 # angle of incidence
ydata = greenDD[1:] # reflectance
params = curve_fit(Rperp, xdata, ydata)
What I get is apparently a division by zero, and the fit gives me [1, 1, 1] for the parameters. The Fresnel equation itself is the part of Rperp without the normalizer and the constant. Theta in the equation is also the angle of incidence. Overall I am just not sure whether I am doing this right at all to get the parameters.
The idea seems to be that the first argument of the function is the independent variable and the rest are the parameters to be fitted. Then you just plug it into scipy's curve_fit and it will give you a fit of the parameters to your data. If it is just a matter of getting around the division by zero, which I had thought might be integer division, then it seems like I should be set. Any help is appreciated, and let me know if things need to be clarified (such as np being numpy).

Make sure to pass the arguments to the trigonometric functions, like sine, in radians, not degrees.
As for why you're getting a negative refractive index returned: it is because in your function, you're always squaring the refractive index. The curve_fit algorithm might end up in a local minimum state where (by accident) n is negative, because it has the same value as n positive.
Ideally, you'd add constraints to the minimization problem, but for this (simple) problem, just observe your formula and remember that a result of negative n is simply solved by changing the sign, as you did.
You could also try passing an initial guess to the algorithm and you might observe that it will not end up in the local minimum with negative value.
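The suggestions above (radians, an initial guess, and constraints) can be combined in a short sketch. The data here are synthetic and noise-free, generated from assumed "true" parameters purely for illustration, and the starting values and bounds are arbitrary choices that would need adapting to real measurements.

```python
import numpy as np
from scipy.optimize import curve_fit

def Rperp(th, n, norm, constant):
    root = np.sqrt(n**2 - np.sin(th)**2)
    return ((np.cos(th) - root) / (np.cos(th) + root))**2 * norm + constant

# Synthetic stand-in for measured data (angles recorded in degrees)
angles_deg = np.linspace(5.0, 85.0, 30)
th = np.deg2rad(angles_deg)              # trigonometric functions expect radians
true_params = (1.5, 2.0, 0.1)            # assumed values for the illustration
ydata = Rperp(th, *true_params)

popt, pcov = curve_fit(
    Rperp, th, ydata,
    p0=[1.3, 1.0, 0.0],                          # initial guess
    bounds=([1.0, 0.0, -1.0], [3.0, 10.0, 1.0])  # keeps n positive (n >= 1)
)
print(popt)
```

Passing bounds makes curve_fit switch to a bounded least-squares method, which rules out the mirror solution with negative n.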
