Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
Questions concerning problems with code you've written must describe the specific problem — and include valid code to reproduce it — in the question itself. See SSCCE.org for guidance.
Closed 9 years ago.
I have two lists of data, one with x values and the other with corresponding y values. How can I find the best fit? I've tried messing with scipy.optimize.leastsq but I just can't seem to get it right.
Any help is greatly appreciated
I think it would be simpler to use numpy.polyfit, which performs a least-squares polynomial fit. Here is a simple snippet:
import numpy as np
x = np.array([0,1,2,3,4,5])
y = np.array([2.1, 2.9, 4.15, 4.98, 5.5, 6])
z = np.polyfit(x, y, 1)
p = np.poly1d(z)
#plotting
import matplotlib.pyplot as plt
xp = np.linspace(-1, 6, 100)
plt.plot(x, y, '.', xp, p(xp))
plt.show()
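To read the fitted line off rather than only plot it, the coefficients returned by polyfit can be printed directly. The sketch below reuses the data from the snippet above and cross-checks np.polyfit against np.linalg.lstsq; the names slope, intercept, A, and sse are illustrative and not part of the original answer.

```python
import numpy as np

x = np.array([0, 1, 2, 3, 4, 5], dtype=float)
y = np.array([2.1, 2.9, 4.15, 4.98, 5.5, 6])

# polyfit returns coefficients ordered from highest degree to lowest
slope, intercept = np.polyfit(x, y, 1)
print(f"y = {slope:.4f} * x + {intercept:.4f}")

# The same straight-line fit expressed through the design matrix [x, 1]
A = np.column_stack([x, np.ones_like(x)])
coeffs, residuals, rank, _ = np.linalg.lstsq(A, y, rcond=None)

# Sum of squared residuals of the polyfit line, as a goodness-of-fit number
sse = np.sum((np.polyval([slope, intercept], x) - y) ** 2)
print("sum of squared residuals:", sse)
```

For a higher-degree polynomial, only the degree argument of polyfit (and the columns of A) would change.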
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 2 years ago.
I have been coding in MATLAB for a few years and recently switched to Python. How could I convert the MATLAB function below into Python 3 code?
function [estimates, model] = curvefitting(x, y, numOfpoint)
    model = @expfun;
    estimates = point(model, numOfpoint);
    function [sse, FittedCurve] = expfun(params)
        A = params(1);
        B = params(2);
        C = params(3);
        FittedCurve = A*(x-B).^C;
        ErrorVector = FittedCurve - y;
        sse = sum(ErrorVector .^ 2);
    end
end
What does @expfun mean in Python? How could I make model = @expfun work in Python?
Instead of doing a line-by-line translation, I'd recommend using the tools available in Python. In this case, if you are trying to perform a curve fit, first define your fit function:
import numpy as np
def func(x, a, b, c):
    return a * np.power(x - b, c)
Then use scipy.optimize.curve_fit:
from scipy.optimize import curve_fit
# Assume these were already populated
x = np.array([...])
y = np.array([...])
# Perform curve fit
popt, pcov = curve_fit(func, x, y)
# Get fitted y-values at each x point
fit_y = func(x, *popt)
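On the function-handle part of the question: in Python a function is an ordinary object, so it can be assigned or passed by name without any special syntax. The sketch below is a rough structural counterpart of the MATLAB code, not a definitive translation. It assumes the goal is to minimise the sum of squared errors, swaps in scipy.optimize.minimize with the Nelder-Mead method as the optimiser, and uses a hypothetical start_point argument and made-up synthetic data. Unlike the MATLAB version, expfun here returns only the error sum.

```python
import numpy as np
from scipy.optimize import minimize


def curvefitting(x, y, start_point):
    """Rough Python counterpart of the MATLAB function in the question."""

    def expfun(params):
        # Nested function: it sees x and y from the enclosing scope,
        # just as the nested MATLAB function does.
        a, b, c = params
        if np.any(x - b <= 0):
            return np.inf  # keep the base of the power positive
        fitted_curve = a * np.power(x - b, c)
        return np.sum((fitted_curve - y) ** 2)

    model = expfun  # functions are objects; no handle syntax is needed
    result = minimize(model, start_point, method="Nelder-Mead")
    return result.x, model


# Synthetic, noise-free data (made up for this example)
x = np.linspace(1.0, 5.0, 30)
y = 2.0 * np.power(x - 0.5, 1.5)

start = [1.0, 0.0, 1.0]
estimates, model = curvefitting(x, y, start)
print("estimates:", estimates)
print("SSE at start:", model(start), "SSE at estimates:", model(estimates))
```

The returned model can be called like any other function, for example model(estimates) to get the error at the fitted parameters.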
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 5 years ago.
I have a set of data in (x, y, z) format where z is the output of some formula involving x and y. I want to find out what the formula is, and my Internet research suggests that statistical regression is the way to do this.
However, all of the examples I have found while researching only deal with two-dimensional data sets (x, y) which is not useful for my situation. Said examples also don't seem to provide a way to see what the resulting formula is, they just provide a function for predicting future outputs based on data not in a training data set.
The level of precision needed is that the formula for z needs to produce results within +/- 0.5 of actual values.
Can anyone tell me how I can do what I want to do? Please note I was not asking for specific recommendations on a software library to use.
If the formula is a linear function, check out this tutorial. It uses ordinary least squares (OLS), which is quite powerful, to fit your data.
Assume that you have data points (x1, y1, z1), (x2, y2, z2), ..., (xn, yn, zn). Transform them into three separate NumPy arrays X, Y and Z.
import numpy as np
X = np.array([x1, x2, ..., xn])
Y = np.array([y1, y2, ..., yn])
Z = np.array([z1, z2, ..., zn])
Then, use ols to fit them!
import pandas
from statsmodels.formula.api import ols
# Your data.
# z = a*x + b*y + c
data = pandas.DataFrame({'x': X, 'y': Y, 'z': Z})
# Fit your data with the OLS model; the formula uses the DataFrame column names.
model = ols("z ~ x + y", data).fit()
# Get your model summary.
print(model.summary())
# Get your model parameters (ordered as Intercept, x, y).
print(model.params)
# should be approximately [c, a, b]
If there are more variables
Add as many variables to the DataFrame as you like, and list them in the formula as well (for example, "z ~ v1 + v2 + v3 + v4").
# Your data.
data = pandas.DataFrame({'v1': V1, 'v2': V2, 'v3': V3, 'v4': V4, 'z': Z})
Reference
Python package statsmodels
The most basic tool you need is multiple linear regression. The basic method models z as a linear function of x and y plus Gaussian noise e: f(x,y) = a1*x + a2*y + a3, and z is then produced as f(x,y) + e, where e is usually zero-mean Gaussian with unknown variance. You need to find the coefficients a1 and a2 and the bias a3. These are usually estimated by maximum likelihood, which under the Gaussian assumption reduces to ordinary least squares and has a closed-form analytic solution.
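As a minimal sketch of that least-squares solution, the coefficients can be recovered with NumPy alone and then read off as an explicit formula. The data here are synthetic: the "true" values 1.5, -2.0, and 0.7 and the noise level are made up for illustration, and the final check mirrors the +/- 0.5 tolerance mentioned in the question.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic (x, y, z) data; the true coefficients are made up
x = rng.uniform(-5, 5, 200)
y = rng.uniform(-5, 5, 200)
a1_true, a2_true, a3_true = 1.5, -2.0, 0.7
z = a1_true * x + a2_true * y + a3_true + rng.normal(0, 0.1, 200)

# Design matrix with a column of ones for the bias term a3
A = np.column_stack([x, y, np.ones_like(x)])

# Least-squares solution of A @ [a1, a2, a3] ~= z
coef, *_ = np.linalg.lstsq(A, z, rcond=None)
a1, a2, a3 = coef
print(f"z = {a1:.3f}*x + {a2:.3f}*y + {a3:.3f}")

# Largest absolute deviation between the fitted formula and the data
max_err = np.max(np.abs(A @ coef - z))
print("max abs error:", max_err)
```

If max_err stays well above the required tolerance on real data, that suggests the underlying formula is not linear in x and y.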
Since you have access to Python, take a look at linear regression in scikit-learn:
http://scikit-learn.org/stable/modules/linear_model.html#ordinary-least-squares
If you can reuse code from an existing Python 3 tkinter GUI application on GitHub, take a look at my tkInterFit project, which can fit the linear polynomial surface equation that you mentioned. It will also create fitted surface and contour plots. The GitHub source code is at https://github.com/zunzun/tkInterFit with a BSD license.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 6 years ago.
I know there is a weighted OLS solver, and a constrained OLS solver.
Is there a routine that combines the two?
You can simulate OLS weighting by modifying the X and y inputs. In OLS, you solve for β in
$$X^T X \beta = X^T y.$$
In weighted OLS, you solve
$$X^T W X \beta = X^T W y,$$
where W is a diagonal matrix with nonnegative entries. It follows that $W^{1/2}$ exists, and you can rewrite this as
$$(W^{1/2} X)^T (W^{1/2} X) \beta = (W^{1/2} X)^T (W^{1/2} y),$$
which is an OLS problem with inputs $W^{1/2} X$ and $W^{1/2} y$.
Consequently, by modifying the inputs, you can use a constrained solver (for example, one with non-negativity constraints) that does not directly recognize weights.
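A minimal sketch of this rescaling trick, assuming NumPy and SciPy are available. It scales each row of X and each entry of y by the square root of its weight, confirms that plain least squares on the scaled inputs matches the weighted normal equations, and then hands the same scaled inputs to scipy.optimize.nnls as one example of a constrained solver with no weights argument. The data and weights are randomly generated for illustration.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(1)

# Made-up regression problem with per-observation weights
X = rng.normal(size=(50, 3))
beta_true = np.array([1.0, 0.5, 2.0])
y = X @ beta_true + rng.normal(scale=0.1, size=50)
w = rng.uniform(0.5, 2.0, size=50)  # diagonal of W

# Scale rows of X and entries of y by sqrt(w)
sqrt_w = np.sqrt(w)
Xw = X * sqrt_w[:, None]
yw = y * sqrt_w

# Ordinary least squares on the scaled inputs
beta_scaled, *_ = np.linalg.lstsq(Xw, yw, rcond=None)

# Direct solution of the weighted normal equations X^T W X beta = X^T W y
beta_normal = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))

# Weighted, non-negative least squares via an unweighted constrained solver
beta_nn, _ = nnls(Xw, yw)

print(beta_scaled)
print(beta_normal)
print(beta_nn)
```

Any other constrained least-squares routine that takes only a matrix and a right-hand side could be substituted for nnls in the same way.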
Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 7 years ago.
I would like to ask whether it is possible to calculate the area under a fitted distribution curve.
The curve would look like this
I've seen some posts online about using trapz, but I'm not sure whether it will work for a curve like that. Please enlighten me, and thank you for the help!
If your distribution, f, is discretized on a known set of points, x, then you can use scipy.integrate.trapz or scipy.integrate.simps directly (pass f, x as arguments in that order). Recent SciPy releases name these functions trapezoid and simpson. For a quick check (e.g. that your distribution is normalized), just sum the values of f and multiply by the grid spacing:
import numpy as np
from scipy.integrate import trapz, simps
x, dx = np.linspace(-100, 250, 50, retstep=True)
mean, sigma = 90, 20
f = np.exp(-((x-mean)/sigma)**2/2) / sigma / np.sqrt(2 * np.pi)
print('{:18.16f}'.format(np.sum(f)*dx))
print('{:18.16f}'.format(trapz(f, x)))
print('{:18.16f}'.format(simps(f, x)))
Output:
1.0000000000000002
0.9999999999999992
1.0000000000000016
First, you have to find a function that describes the graph. You can check here. Then you can integrate it in Python with SciPy. You can check here for integration.
It is just math stuff, as Daniel Sanchez says.
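Once a function for the fitted curve is known, the area between two limits can be obtained by numerical integration. The sketch below assumes the fitted curve is the normal density with mean 90 and sigma 20 used in the earlier answer; the limits a and b are made up. It integrates with scipy.integrate.quad and compares the result with the difference of the normal CDF from scipy.stats.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

mean, sigma = 90, 20  # parameters of the fitted normal curve (assumed)


def pdf(x):
    """Normal probability density with the parameters above."""
    return np.exp(-((x - mean) / sigma) ** 2 / 2) / (sigma * np.sqrt(2 * np.pi))


a, b = 70, 110  # integration limits, chosen for illustration

# quad returns the integral and an estimate of the absolute error
area, abs_err = quad(pdf, a, b)

# For a normal distribution the same area is a difference of CDF values
expected = norm.cdf(b, loc=mean, scale=sigma) - norm.cdf(a, loc=mean, scale=sigma)

print("area from quad:", area)
print("area from CDF :", expected)
```

For a fitted distribution available as a scipy.stats object, the CDF difference alone is usually enough; quad is the fallback when only the curve function is known.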
Closed. This question is not reproducible or was caused by typos. It is not currently accepting answers.
This question was caused by a typo or a problem that can no longer be reproduced. While similar questions may be on-topic here, this one was resolved in a way less likely to help future readers.
Closed 8 years ago.
import matplotlib.pyplot as plt
import numpy as np
import scipy as sc
import math
t,theta1=np.loadtxt('Single Small Angle 1.txt',unpack=True,skiprows=2)
t2,theta2=np.loadtxt('Single Small Angle 3.txt',unpack=True,skiprows=2)
theta=[]
omega=np.arange(int(len(theta1)/5)-1)
for x in range(int(len(theta1)/5-1)):
    omega[x]=(theta1[x*5]-theta1[(x+1)*5])/.005
    theta[x]=theta1[x*5]
plt.plot(theta1,omega)
plt.xlabel("${\Theta}$ [rad]")
plt.ylabel("${\Omega}$ [rad/s]")
plt.title("Small Angle Approximation Phase Space")
plt.show()
Traceback (most recent call last):
theta[x]=theta1[x*5]
IndexError: list assignment index out of range
[Finished in 0.6s with exit code 1]
I have no clue what I'm doing and I just want to fix the error. I'm just trying to make a phase space and I need the derivative of my theta1 stuff so I can have d(theta1)/dt.
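The code below should work -- your problem was not initializing theta with the proper length. You could equally well preallocate theta the way the question preallocates omega, for example theta=np.zeros(int(len(theta1)/5)-1); np.zeros gives a floating-point array, whereas np.arange gives an integer array that truncates the values assigned to it. The version below creates both as lists, then appends to them, and I think it reads a bit better.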
theta, omega = [],[]
for x in range(int(len(theta1)/5-1)):
    omega.append((theta1[x*5]-theta1[(x+1)*5])/.005)
    theta.append(theta1[x*5])
omega = np.array(omega)
theta = np.array(theta)
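The same arrays can also be built without an explicit loop by slicing every fifth sample. This is a sketch on synthetic data, since the text files from the question are not available: the cosine signal and sampling interval are made up, and dt = 0.005 is taken from the question's code. It keeps the question's subtraction order (earlier sample minus later sample) so that it matches the loop; a conventional forward difference for d(theta)/dt would use the opposite order.

```python
import numpy as np

dt = 0.005  # time between every fifth sample, as in the question's code

# Synthetic stand-in for the measured angles (made up for this example)
t = np.arange(0, 0.1, 0.001)
theta1 = 0.1 * np.cos(2 * np.pi * t)

# Number of points the original loop produces
n = len(theta1) // 5 - 1

# Every fifth sample: theta1[0], theta1[5], ..., theta1[n*5]
sub = theta1[: n * 5 + 1 : 5]

theta = sub[:-1]
# Same sign convention as the question; use (sub[1:] - sub[:-1]) / dt
# for the conventional forward difference.
omega = (sub[:-1] - sub[1:]) / dt

print(len(theta), len(omega))
```

Because theta and omega now have the same length, they can be passed together to plt.plot(theta, omega) for the phase-space plot.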