Weighted Non-negative Least Square Linear Regression in python [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 6 years ago.
I know there is a weighted OLS solver and a constrained OLS solver.
Is there a routine that combines the two?

You can simulate OLS weighting by modifying the X and y inputs. In OLS, you solve for β in
$X^T X \beta = X^T y$.
In weighted OLS, you solve
$X^T W X \beta = X^T W y$,
where W is a diagonal matrix with nonnegative entries. It follows that $W^{1/2}$ exists (take the square root of each diagonal entry), and since $W = (W^{1/2})^T W^{1/2}$ you can formulate this as
$(W^{1/2} X)^T (W^{1/2} X) \beta = (W^{1/2} X)^T (W^{1/2} y)$,
which is an OLS problem with $W^{1/2} X$ and $W^{1/2} y$ as inputs.
Consequently, by modifying the inputs, you can use a non-negative least-squares solver that does not directly support weights.
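As a minimal sketch of this reformulation, assuming SciPy is available: scale each row of X and each entry of y by the square root of its weight, then pass the result to scipy.optimize.nnls. The helper name weighted_nnls and the synthetic data are made up for illustration.

```python
import numpy as np
from scipy.optimize import nnls


def weighted_nnls(X, y, w):
    """Hypothetical helper: weighted non-negative least squares via row scaling."""
    sqrt_w = np.sqrt(np.asarray(w, dtype=float))
    X_scaled = np.asarray(X, dtype=float) * sqrt_w[:, None]  # W^(1/2) X
    y_scaled = np.asarray(y, dtype=float) * sqrt_w           # W^(1/2) y
    beta, _residual_norm = nnls(X_scaled, y_scaled)
    return beta


# Synthetic, noise-free data (assumption: only for demonstration).
rng = np.random.default_rng(0)
X = rng.random((50, 3))
true_beta = np.array([1.5, 0.0, 2.0])
y = X @ true_beta
w = rng.random(50) + 0.1  # strictly positive weights

beta = weighted_nnls(X, y, w)
print(beta)
```

Because the weights only rescale rows, any solver that accepts a plain (X, y) pair can be reused this way without modification.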

Related

Calculate the standard deviation for the variable in the matrix created by tensorflow [closed]

import tensorflow as tf
input=[50,10]
O1 = layers.fully_connected(input, 20, tf.sigmoid)
Why is my input wrong?
I am not sure I understand the question, but...
The sigmoid layer will output an array with numbers between 0 and 1, but you can't really calculate what the standard deviation will be before feeding your network.
If you are talking about the matrix that contains the weight parameters, then this depends on how you initialize them. But after the training of the network, the deviation will not be the same as before the training.
EDIT:
OK, so you simply want to calculate the standard deviation of a matrix. In that case, use numpy:
import numpy as np
a = np.array([[1, 2], [3, 4]])  # or your 50 by 50 matrix
print(np.std(a))  # standard deviation over all elements
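If a per-column value or the sample standard deviation is what you need, np.std also accepts axis and ddof arguments. A small sketch on the same 2 by 2 matrix (the variable names are illustrative):

```python
import numpy as np

a = np.array([[1.0, 2.0], [3.0, 4.0]])

std_all = np.std(a)              # population std over all four elements
std_cols = np.std(a, axis=0)     # one value per column
std_sample = np.std(a, ddof=1)   # sample std, divides by N - 1 instead of N

print(std_all, std_cols, std_sample)
```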

Normalize a vector with pre-defined mean [closed]

I would like to normalize a vector such that the mean of the normalized vector would be a certain pre-defined value. For instance, I want the mean to be 0.1 in the following example:
import numpy as np
from sklearn.preprocessing import normalize
array = np.arange(1,11)
array_norm = normalize(array[:,np.newaxis], axis=0).ravel()
Of course, np.mean(array_norm) is 0.28 and not 0.1. Is there a way to do this in Python?
You could just multiply each element by mean_you_want / current_mean. If you multiply each element by a scalar, the mean will also be multiplied by that scalar. In your case, that would be 0.1/np.mean(array_norm)
array_norm *= 0.1/np.mean(array_norm)
This should do the trick.
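The same idea wrapped in a small function, as a sketch. The name rescale_to_mean is made up, and the zero-mean guard is an added assumption, since the scaling factor is undefined when the current mean is zero.

```python
import numpy as np


def rescale_to_mean(values, target_mean):
    """Hypothetical helper: scale values so that their mean equals target_mean."""
    values = np.asarray(values, dtype=float)
    current_mean = values.mean()
    if current_mean == 0:
        raise ValueError("cannot rescale a vector whose mean is zero")
    return values * (target_mean / current_mean)


array = np.arange(1, 11)
array_norm = array / np.linalg.norm(array)  # L2 normalization, as in the question
rescaled = rescale_to_mean(array_norm, 0.1)
print(rescaled.mean())
```

Multiplying by a single scalar keeps the ratios between elements unchanged, so only the overall scale of the vector moves.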

Statistical regression on multi-dimensional data [closed]

I have a set of data in (x, y, z) format where z is the output of some formula involving x and y. I want to find out what the formula is, and my Internet research suggests that statistical regression is the way to do this.
However, all of the examples I have found while researching only deal with two-dimensional data sets (x, y) which is not useful for my situation. Said examples also don't seem to provide a way to see what the resulting formula is, they just provide a function for predicting future outputs based on data not in a training data set.
The level of precision needed is that the formula for z needs to produce results within +/- 0.5 of actual values.
Can anyone tell me how I can do what I want to do? Please note I was not asking for specific recommendations on a software library to use.
If the formula is a linear function, check out this tutorial. It uses ordinary least squares to fit your data, which is quite powerful.
Assume that you have data points (x1, y1, z1), (x2, y2, z2), ..., (xn, yn, zn), and transform them into three separate numpy arrays X, Y and Z.
import numpy as np
X = np.array([x1, x2, ..., xn])
Y = np.array([y1, y2, ..., yn])
Z = np.array([z1, z2, ..., zn])
Then, use ols to fit them:
import pandas
from statsmodels.formula.api import ols
# Your data, assuming the model z = a*x + b*y + c.
data = pandas.DataFrame({'x': X, 'y': Y, 'z': Z})
# Fit the OLS model; the formula names must match the DataFrame columns.
model = ols("z ~ x + y", data).fit()
# Print the model summary.
print(model.summary())
# Print the fitted parameters.
print(model.params)
# Intercept, x and y should be approximately c, a and b
If there are more variables
Add as many variables to the DataFrame as you like, and include them in the formula.
# Your data.
data = pandas.DataFrame({'v1': V1, 'v2': V2, 'v3': V3, 'v4': V4, 'z': Z})
Reference
Python package statsmodels
The most basic tool you need is multiple linear regression. It models z as a linear function of x and y plus Gaussian noise: f(x,y) = a1*x + a2*y + a3, and z is produced as f(x,y) + e, where e is usually zero-mean Gaussian noise with unknown variance. You need to find the coefficients a1, a2 and the bias a3. These are usually estimated by maximum likelihood, which boils down to ordinary least squares under the Gaussian assumption and has a closed-form analytic solution.
Since you have access to Python, take a look at linear regression in scikit-learn:
http://scikit-learn.org/stable/modules/linear_model.html#ordinary-least-squares
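To see the resulting formula explicitly rather than only a prediction function, the least-squares fit can also be sketched directly with NumPy. The data below is synthetic and the true coefficients are made-up values chosen for illustration; with real data you would replace x, y and z with your own arrays.

```python
import numpy as np

# Synthetic data (assumption): z = 2*x - 3*y + 0.5 plus small Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(-5, 5, 200)
y = rng.uniform(-5, 5, 200)
z = 2.0 * x - 3.0 * y + 0.5 + rng.normal(0.0, 0.1, 200)

# Design matrix with a column of ones for the bias term a3.
A = np.column_stack([x, y, np.ones_like(x)])
coeffs, _, _, _ = np.linalg.lstsq(A, z, rcond=None)
a1_hat, a2_hat, a3_hat = coeffs

# Largest absolute deviation between the fitted formula and the data.
max_error = np.max(np.abs(A @ coeffs - z))

print(f"z = {a1_hat:.3f}*x + {a2_hat:.3f}*y + {a3_hat:.3f}")
print("max abs error:", max_error)
```

Comparing max_error with the required tolerance (+/- 0.5 in the question) is a quick way to judge whether a linear formula is adequate or a richer model is needed.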
If you can reuse code from an existing Python 3 tkinter GUI application on GitHub, take a look at fitting the linear polynomial surface equation that you mentioned using my tkInterFit project; it will also create fitted surface and contour plots. The GitHub source code is at https://github.com/zunzun/tkInterFit with a BSD license.

python - how to find area under curve? [closed]

I would like to ask whether it is possible to calculate the area under the curve for a fitted distribution curve.
The curve would look like this
I've seen some posts online regarding the usage of trapz, but I'm not sure if it will work for a curve like that. Please enlighten me, and thank you for the help!
If your distribution, f, is discretized on a set of points, x, that you know about, then you can use scipy.integrate.trapezoid or scipy.integrate.simpson directly (these were named trapz and simps in older SciPy releases; pass f and x as arguments). For a quick check (e.g. that your distribution is normalized), just sum the values of f and multiply by the grid spacing:
import numpy as np
from scipy.integrate import trapezoid, simpson
# Grid of 50 points and its spacing.
x, dx = np.linspace(-100, 250, 50, retstep=True)
mean, sigma = 90, 20
# Normal probability density evaluated on the grid.
f = np.exp(-((x-mean)/sigma)**2/2) / sigma / np.sqrt(2 * np.pi)
print('{:18.16f}'.format(np.sum(f)*dx))  # simple Riemann sum
print('{:18.16f}'.format(trapezoid(f, x=x)))
print('{:18.16f}'.format(simpson(f, x=x)))
Output (from the original trapz/simps run; the last digits can vary with the SciPy version):
1.0000000000000002
0.9999999999999992
1.0000000000000016
First, you have to find a function that fits the graph. Then you can integrate that function in Python with scipy (see scipy.integrate).
It is just math stuff as Daniel Sanchez says.
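When the fitted distribution is available as a function rather than as sampled points, scipy.integrate.quad can integrate it directly. The following is a sketch under assumptions: the samples are synthetic, a normal distribution is assumed as the fitted model, and the integration limits are arbitrary.

```python
import numpy as np
from scipy import stats
from scipy.integrate import quad

# Synthetic samples (assumption): stand-in for the data behind the fitted curve.
rng = np.random.default_rng(0)
samples = rng.normal(90.0, 20.0, 5000)

# Fit a normal distribution; returns the estimated mean and standard deviation.
mu, sigma = stats.norm.fit(samples)


def fitted_pdf(x):
    """Density of the fitted normal distribution."""
    return stats.norm.pdf(x, loc=mu, scale=sigma)


# Area under the fitted curve within three standard deviations of the mean.
lower, upper = mu - 3 * sigma, mu + 3 * sigma
area, abs_err = quad(fitted_pdf, lower, upper)

# Cross-check against the closed-form CDF difference.
area_from_cdf = stats.norm.cdf(upper, mu, sigma) - stats.norm.cdf(lower, mu, sigma)
print(area, area_from_cdf)
```

For distributions that scipy.stats already provides, the CDF difference is the cheaper route; quad is useful when the fitted curve is an arbitrary function without a known antiderivative.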

Least square method in python [closed]

I have two lists of data, one with x values and the other with corresponding y values. How can I find the best fit? I've tried messing with scipy.optimize.leastsq but I just can't seem to get it right.
Any help is greatly appreciated
I think it would be simpler to use numpy.polyfit, which performs a least-squares polynomial fit. This is a simple snippet:
import numpy as np
x = np.array([0,1,2,3,4,5])
y = np.array([2.1, 2.9, 4.15, 4.98, 5.5, 6])
# Fit a degree-1 polynomial (a straight line); z holds [slope, intercept].
z = np.polyfit(x, y, 1)
# Wrap the coefficients in a callable polynomial.
p = np.poly1d(z)
# Plot the data points and the fitted line.
import matplotlib.pyplot as plt
xp = np.linspace(-1, 6, 100)
plt.plot(x, y, '.', xp, p(xp))
plt.show()
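Since the question mentions scipy.optimize.leastsq, here is a sketch of the same straight-line fit done with it, using the data from the snippet above. The residual function and the initial guess are illustrative choices; leastsq minimizes the sum of squares of whatever the residual function returns.

```python
import numpy as np
from scipy.optimize import leastsq

x = np.array([0, 1, 2, 3, 4, 5], dtype=float)
y = np.array([2.1, 2.9, 4.15, 4.98, 5.5, 6])


def residuals(params, x, y):
    """Difference between the observed y and the line slope*x + intercept."""
    slope, intercept = params
    return y - (slope * x + intercept)


initial_guess = [1.0, 0.0]  # arbitrary starting point (assumption)
params, status = leastsq(residuals, initial_guess, args=(x, y))
slope, intercept = params
print(slope, intercept, status)  # status 1 to 4 means a solution was found
```

For a model that is linear in its parameters, this should agree with np.polyfit; leastsq becomes the more useful tool when the model is nonlinear in its parameters.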
