How to calculate sums of squares in Python?

First, is the formula TSS = ESS + RSS always correct, even for an exponential model? If it is, I do not understand where I am going wrong.
I have 2 arrays of x and y values, where y depends on x.
x = np.array([1.5, 2.1, 2.4, 2.7, 3.2, 3.4, 3.6, 3.7, 4.0, 4.5, 5.1, 5.6])
y = np.array([0.6, 1.2, 1.3, 1.4, 1.45, 1.5, 1.6, 1.8, 1.9, 1.95, 2.1, 2.2])
I have a function that determines coefficients a and b and returns an equation of linear regression (or just a and b if needed)
def Linear(x, y, getAB = False):
    AVG_X = np.average(x)
    AVG_Y = np.average(y)
    DISP_X = np.var(x)
    DISP_Y = np.var(y)
    STD_X = np.std(x)
    STD_Y = np.std(y)
    AVG_prod = np.average(x*y)
    cov = AVG_prod - (AVG_X*AVG_Y)
    b = cov/DISP_X
    a = AVG_Y - b*AVG_X
    if getAB:
        return a, b
    return lambda X: a + b*X
I have a function that determines coefficients a and b and returns an equation of EXPONENTIAL regression
def Exponential(x, y, getAB = False):
    LOG_Y_array = [math.log(value) for value in y]
    A, B = Linear(x, LOG_Y_array, getAB = True)
    a = math.exp(A)
    b = math.exp(B)
    if getAB:
        return a, b
    return lambda X: a * (b**X)
I created the array of calculated y values based on the exponential model
Exponential_Prediction = Exponential(x, y)
Exponential_Prediction_y = [Exponential_Prediction(value) for value in x]
And finally, that is how I calculate TSS, ESS and RSS
TSS = np.sum((y - np.average(y))**2)
ESS_Exp = np.sum((Exponential_Prediction_y - np.average(y))**2)
RSS_Exp = np.sum((y-Exponential_Prediction_y)**2)
That is all pretty clear, except the output of this
print(str(TSS) + " = " + str(ESS_Exp) + " + " + str(RSS_Exp))
is 2.18166666667 = 2.75523753042 + 0.432362713806
I do not understand how ESS could be more than TSS

You're missing a term that is zero when the predictions come from an ordinary linear regression. Since yours do not (the line is fitted to log(y), not to y), you have to add it. In the link that Vince commented, you can see that TSS = ESS + RSS + 2*sum((y - yhat)*(yhat - ybar)).
You need to include that extra term in order for it to add up:
extra_term = 2 * np.sum((y - Exponential_Prediction_y) * (Exponential_Prediction_y - y.mean()))
print(str(TSS) + " = " + str(ESS_Exp) + " + " + str(RSS_Exp) + " + " + str(extra_term))
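To see the identity hold numerically, here is a minimal sketch on the question's data. The helper name `decompose` and the use of `np.polyfit` in place of the question's `Linear` and `Exponential` functions are my own assumptions for brevity. Both compute ordinary least-squares lines, so the fitted curves should match.

```python
import numpy as np

x = np.array([1.5, 2.1, 2.4, 2.7, 3.2, 3.4, 3.6, 3.7, 4.0, 4.5, 5.1, 5.6])
y = np.array([0.6, 1.2, 1.3, 1.4, 1.45, 1.5, 1.6, 1.8, 1.9, 1.95, 2.1, 2.2])

def decompose(y, y_hat):
    """Return TSS, ESS, RSS and the cross term 2*sum((y - yhat)*(yhat - ybar))."""
    y_bar = y.mean()
    tss = np.sum((y - y_bar) ** 2)
    ess = np.sum((y_hat - y_bar) ** 2)
    rss = np.sum((y - y_hat) ** 2)
    cross = 2 * np.sum((y - y_hat) * (y_hat - y_bar))
    return tss, ess, rss, cross

# Linear model fitted directly to y: the cross term vanishes.
slope, intercept = np.polyfit(x, y, 1)
lin = decompose(y, intercept + slope * x)

# Exponential model: a straight line fitted to log(y), then mapped back with exp.
B, A = np.polyfit(x, np.log(y), 1)
expo = decompose(y, np.exp(A + B * x))

for name, (tss, ess, rss, cross) in (("linear", lin), ("exponential", expo)):
    print(f"{name}: TSS={tss:.6f}  ESS+RSS={ess + rss:.6f}  cross={cross:.6f}")
```

For the linear fit the cross term is zero up to rounding, so TSS = ESS + RSS. For the exponential fit it is not, and only the three-term sum reproduces TSS.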


Effectively solve an overdetermined nonlinear equation system using fitted data in python

I think it’s the easiest way to describe my problem with a small example.
I have this data which is my input data. I have 3 LEDs and each LED is represented by 4 color-parts (value 1 to 4 in each array). If I increase the intensity of the LEDs (in this example from 10% to 30%) the color-parts change in different ways.
LED1_10 = np.array([1.5, 1, 0.5, 0.5])
LED1_20 = np.array([2.5, 1.75, 1.2, 1.2])
LED1_30 = np.array([3, 2.3, 1.7, 1.7])
LED2_10 = np.array([0.2, 0.8, 0.4, 0.4])
LED2_20 = np.array([0.6, 1.6, 0.5, 0.5])
LED2_30 = np.array([1.0, 2.0, 0.55, 0.55])
LED3_10 = np.array([1, 0.1, 0.4, 0.4])
LED3_20 = np.array([2.5, 0.8, 0.9, 0.9])
LED3_30 = np.array([3.25, 1, 1.3, 1.3])
The column elements of the arrays belong together. So if I bring LED1 from 10% to 30% the value in column 1 rises from 1.5 to 2.5 and then to 3. I want to find a polynomial for the rise of each of the LED color-parts, so I rearrange the data and use a polynomial fit to get the equations which describe the way the values are rising for each LED.
### Rearrange the values
LED1 = np.stack((LED1_10, LED1_20, LED1_30)).T
LED2 = np.stack((LED2_10, LED2_20, LED2_30)).T
LED3 = np.stack((LED3_10, LED3_20, LED3_30)).T
### Create x-vector
x = np.array([10,20,30])
### Polynomial fits
Fit_LED1 = []
for i in range(len(LED1)):
    z = np.polyfit(x, LED1[i], 2)
    Fit_LED1.append(z)
Fit_LED2 = []
for i in range(len(LED2)):
    z = np.polyfit(x, LED2[i], 2)
    Fit_LED2.append(z)
Fit_LED3 = []
for i in range(len(LED3)):
    z = np.polyfit(x, LED3[i], 2)
    Fit_LED3.append(z)
Now I want to generate light of a specific color by mixing the light of the 3 different LEDs. Therefore I need to find out which intensity to use for each LED to get the best possible result. Color-parts 1-4 are represented by the solution vector b = [7, 8, 2, 5]. I do this by solving the overdetermined nonlinear equation system like this:
def f(x):
    x1, x2, x3 = x
    return np.asarray(((-2.50000000e-03*x1**2 + 1.75000000e-01*x1 + -5.91091254e-15) + (-2.03207837e-18*x2**2 + 4.00000000e-02*x2 + -2.00000000e-01) + (-0.00375*x3**2 + 0.2625*x3 + -1.25),
                       (-0.001*x1**2 + 0.105*x1 + 0.05) + (-0.002*x2**2 + 0.14*x2 + -0.4) + (-0.0025*x3**2 + 0.145*x3 + -1.1),
                       (-0.001*x1**2 + 0.1*x1 + -0.4 ) + (-0.00025*x2**2 + 0.0175*x2 + 0.25) + (-0.0005*x3**2 + 0.065*x3 + -0.2),
                       (-0.001*x1**2 + 0.1*x1 + -0.4 ) + (-0.00025*x2**2 + 0.0175*x2 + 0.25) + (-0.0005*x3**2 + 0.065*x3 + -0.2)))
def system(x,b):
    return (f(x)-b)
b = [7, 8, 2, 5]
x = scipy.optimize.leastsq(system, np.asarray((1,1,1)), args=b)[0]
I used the polynomials I got from the fit for each LED and added them to each other to get an equation for each of the four color parts. This would give the same result and is maybe a bit easier to read:
def g(x):
    x1, x2, x3 = x
    # Each LED's polynomial is evaluated at that LED's own intensity (x1, x2, x3).
    return np.asarray(((Fit_LED1[0][0]*x1**2 + Fit_LED1[0][1]*x1 + Fit_LED1[0][2]) + (Fit_LED2[0][0]*x2**2 + Fit_LED2[0][1]*x2 + Fit_LED2[0][2]) + (Fit_LED3[0][0]*x3**2 + Fit_LED3[0][1]*x3 + Fit_LED3[0][2]),
                       (Fit_LED1[1][0]*x1**2 + Fit_LED1[1][1]*x1 + Fit_LED1[1][2]) + (Fit_LED2[1][0]*x2**2 + Fit_LED2[1][1]*x2 + Fit_LED2[1][2]) + (Fit_LED3[1][0]*x3**2 + Fit_LED3[1][1]*x3 + Fit_LED3[1][2]),
                       (Fit_LED1[2][0]*x1**2 + Fit_LED1[2][1]*x1 + Fit_LED1[2][2]) + (Fit_LED2[2][0]*x2**2 + Fit_LED2[2][1]*x2 + Fit_LED2[2][2]) + (Fit_LED3[2][0]*x3**2 + Fit_LED3[2][1]*x3 + Fit_LED3[2][2]),
                       (Fit_LED1[3][0]*x1**2 + Fit_LED1[3][1]*x1 + Fit_LED1[3][2]) + (Fit_LED2[3][0]*x2**2 + Fit_LED2[3][1]*x2 + Fit_LED2[3][2]) + (Fit_LED3[3][0]*x3**2 + Fit_LED3[3][1]*x3 + Fit_LED3[3][2])))
def system(x,b):
    return (g(x)-b)
b = [5, 8, 4, 12]
x = scipy.optimize.leastsq(system, np.asarray((1,1,1)), args=b)[0]
Now my problem is that I need to type each of the functions separately, which is a lot of work, especially because my real-world application consists of 40 LEDs and more than 1000 color-parts for each of the LEDs. Is there an easier and more efficient way to define the equations of the system rather than typing each of them separately like I did here?
I hope I was able to make my problem clear and would be very thankful if anyone could help me solving this task.
Thank you very much in advance :)
There is a pattern in your equations which you can vectorise. First of all, gather all the LED fits in a 3D array.
fits = np.array([Fit_LED1, Fit_LED2, Fit_LED3])
And then define g(x) as
def g(x):
    X = np.array([x**2, x, np.ones_like(x)]).T
    return np.sum(fits * X[:,None], axis=(0, 2))
You can also confirm the result is correct with np.isclose(f(x), g(x)).
Of course you should do the same to LED1, LED2, etc so you don't have to hardcode Fit_LED1, etc. Just put everything in a 3d array and loop for each LED index.
LEDs = np.array([LED1, LED2, LED3])
fits = [
[np.polyfit(np.array([10, 20, 30]), LEDs[i,j], 2) for j in range(LEDs.shape[1])]
for i in range(LEDs.shape[0])
]
fits = np.array(fits)
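Putting the pieces together, here is a self-contained sketch under a few assumptions of mine. The array layout `leds[i, k, j]` (LED, intensity level, color-part), the names `coefs` and `residuals`, and the use of `np.einsum` are illustrative choices, not part of the original code. It relies on `np.polyfit` accepting a 2-D `y` and fitting every column independently, so no Python loop over LEDs or color-parts is needed.

```python
import numpy as np

levels = np.array([10.0, 20.0, 30.0])

# leds[i, k, j]: LED i, intensity level k, color-part j (values from the question)
leds = np.array([
    [[1.5, 1.0, 0.5, 0.5], [2.5, 1.75, 1.2, 1.2], [3.0, 2.3, 1.7, 1.7]],
    [[0.2, 0.8, 0.4, 0.4], [0.6, 1.6, 0.5, 0.5], [1.0, 2.0, 0.55, 0.55]],
    [[1.0, 0.1, 0.4, 0.4], [2.5, 0.8, 0.9, 0.9], [3.25, 1.0, 1.3, 1.3]],
])
n_leds, n_levels, n_parts = leds.shape

# One polyfit call for all curves: each column of y is one (LED, color-part) pair.
y = leds.transpose(1, 0, 2).reshape(n_levels, n_leds * n_parts)
coefs = np.polyfit(levels, y, 2).reshape(3, n_leds, n_parts)  # coefs[p, i, j]

def g(x):
    """Sum over LEDs of each LED's quadratic, evaluated at that LED's intensity."""
    x = np.asarray(x, dtype=float)
    powers = np.stack([x**2, x, np.ones_like(x)])  # powers[p, i]
    return np.einsum('pi,pij->j', powers, coefs)

def residuals(x, b):
    """Residual vector for a least-squares solver: model output minus target."""
    return g(x) - np.asarray(b, dtype=float)

print(g([10, 10, 10]))  # sum of the three 10% rows, since a quadratic through 3 points is exact
```

The `residuals` function has the same signature as `system` in the question, so it could be passed to `scipy.optimize.leastsq(residuals, np.ones(n_leds), args=(b,))` unchanged when going to 40 LEDs and 1000 color-parts.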

How to not print the j in Python complex numbers?

Let's say one of the answers is supposed to be 3.00, it will be printed as 3.00+0.00j.
How do I remove the j and have it as 3.00 only?
# Viete's Algorithm
def result_2(a3,a2,a0,a1):
    b = b_cof(a3,a2,a1,a0)
    a = a_cof(a3,a2,a1)
    p = P(a3, a2)
    r = -(b / 2.0)
    q = (a / 3.0)
    if ((r**2)+(q**3))<= 0.0:
        if q==0:
            theta = 0
        if q<0:
            theta = cmath.acos(r/(-q**(3.0/2.0)))
        phi1 = theta / 3.0
        phi2 = phi1 - ((2*cmath.pi) / 3.0)
        phi3 = phi1 + ((2*cmath.pi) / 3.0)
        print("X1 = ", "{:.2f}".format(2*math.sqrt(-q)*cmath.cos(phi1)-p/3.0))
        print("X2 = ", "{:.2f}".format(2*math.sqrt(-q)*cmath.cos(phi2)-p/3.0))
        print("X3 = ", "{:.2f}".format(2*math.sqrt(-q)*cmath.cos(phi3)-p/3.0))
You could drop the imaginary part from the number if it is zero:
>>> x=3+0j
>>> print(f"X1 = {x if x.imag else x.real:.2f}")
X1 = 3.00
>>> x=3+1j
>>> print(f"X1 = {x if x.imag else x.real:.2f}")
X1 = 3.00+1.00j
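Since values coming out of `cmath` can carry a tiny nonzero imaginary part from rounding, a small helper with a tolerance may be more robust than testing `x.imag` for exact zero. This is only a sketch: the name `format_root` and the default tolerance `1e-9` are my own assumptions.

```python
def format_root(z, tol=1e-9, spec=".2f"):
    """Format z as a real number when its imaginary part is negligible."""
    z = complex(z)
    if abs(z.imag) <= tol:
        return format(z.real, spec)
    return format(z, spec)

print("X1 =", format_root(3 + 0j))
print("X2 =", format_root(3 + 1e-15j))
print("X3 =", format_root(3 + 1j))
```

The first two calls print the value without the `j` part; the third keeps it because the imaginary part is genuinely nonzero.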

Numpy: efficiently apply function that takes surrounding cells

I have a numpy array (Potential) and I would like to compute the electromagnetic field. Right now it is the bottleneck of my program.
I have an array V of shape (n+2, m+2) and I want to create an array E of shape (n, m). Each cell is computed roughly as:
sqrt((Cell_left-Cell_right)^2+(Cell_top-Cell_bottom)^2)
I would like to know if there is a way to apply a function to the whole array to avoid the expensive computation of "for loop" :)
right now my code is :
def set_e(self):
    for i in range(0, n):
        for j in range(0, m):
            self.E[i, j] = self.get_local_e(i, j)
def get_local_e(self, i, j):
    return (
        ((self.solution[i + 2, j + 1] - self.solution[i, j + 1]) / unt_y) ** 2
        + ((self.solution[i + 1, j + 2] - self.solution[i + 1, j]) / unt_x) ** 2
    ) ** 0.5
Thanks
For anyone interested in this issue, the calculation can be done on whole arrays like this:
def set_e(self):
    y_tmp = np.power((self.solution[:-2, 1:-1] - self.solution[2:, 1:-1]) / unt_y, 2)
    x_tmp = np.power((self.solution[1:-1, :-2] - self.solution[1:-1, 2:]) / unt_x, 2)
    self.E = np.power(x_tmp + y_tmp, 0.5)
It solved my issue
Let's work on this here.
There is something strange about your equation, as it only computes the gradient along one row; see y_tmp.
The gradient function calculates along all rows and columns, which is why the output has the same shape as the input.
import numpy as np
solution = np.array([[1.0, 2.0, 3.0, 4.0],
[3.0, 5.0, 7.0, 9.0],
[5.0, 8.0, 11.0, 14.0],
[7.0, 11.0, 15.0, 19.0]])
unt_y = 1
unt_x = 1
g = np.gradient(solution, unt_y, unt_x)
print(g)
a,b = g
c = np.power(a+b, 2.0)
print(c)
def set_e():
    y_tmp = np.power((solution[:-2, 1:-1] - solution[2:, 1:-1]) / unt_y, 2)
    print('y_tmp', y_tmp)
    x_tmp = np.power((solution[1:-1, :-2] - solution[1:-1, 2:]) / unt_x, 2)
    E = np.power(x_tmp + y_tmp, 0.5)
    print(E)
set_e()
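To compare the approaches side by side, here is a small sketch with standalone functions (the names `field_loop`, `field_sliced` and `field_gradient` are mine, and `dy`/`dx` stand in for `unt_y`/`unt_x`). It assumes uniform spacing. On interior cells `np.gradient` uses the central difference (V[i+1] - V[i-1]) / (2*h), so it is half of the difference used in the question, and the border has to be trimmed to get an (n, m) result.

```python
import numpy as np

def field_loop(V, dy, dx):
    """Reference implementation: the double loop from the question."""
    n, m = V.shape[0] - 2, V.shape[1] - 2
    E = np.empty((n, m))
    for i in range(n):
        for j in range(m):
            E[i, j] = (((V[i + 2, j + 1] - V[i, j + 1]) / dy) ** 2
                       + ((V[i + 1, j + 2] - V[i + 1, j]) / dx) ** 2) ** 0.5
    return E

def field_sliced(V, dy, dx):
    """Same computation on whole-array slices."""
    d_rows = (V[2:, 1:-1] - V[:-2, 1:-1]) / dy
    d_cols = (V[1:-1, 2:] - V[1:-1, :-2]) / dx
    return np.hypot(d_rows, d_cols)

def field_gradient(V, dy, dx):
    """Same values via np.gradient: rescale by 2 and drop the border cells."""
    g_rows, g_cols = np.gradient(V, dy, dx)
    return 2.0 * np.hypot(g_rows, g_cols)[1:-1, 1:-1]

solution = np.array([[1.0, 2.0, 3.0, 4.0],
                     [3.0, 5.0, 7.0, 9.0],
                     [5.0, 8.0, 11.0, 14.0],
                     [7.0, 11.0, 15.0, 19.0]])
print(field_sliced(solution, 1.0, 1.0))
```

All three functions should agree on the interior cells; the sliced version avoids both the Python loop and the extra work `np.gradient` does on the border.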

Forward kinematics and dynamics of 3D robot(UR5) arm with sympy

I am using sympy 1.4 to do kinematics and dynamics of a UR5 robot. The express function in sympy seems to return the wrong answer. The overall aim is to obtain the mass, Coriolis, centripetal and gravity matrices as symbolic expressions. In the attached code, I am trying out a 3R manipulator.
I am expecting the position vector of the J3 frame to be 0.707106781186548*Base.i + 2.70710678118655*Base.j with respect to B, whereas the expression returned by the express function has a negative j component.
Any idea where I am making the mistake? Or is there a better way to convert a Denavit-Hartenberg representation into the coordinate frames of the joints?
edit 1: I find that the sign convention used in sympy is just the opposite of what I have learned. For example, for a z axis rotation, the rotation matrix of B with respect to A is defined in sympy as
[cos(a) sin(a) 0;
-sin(a) cos(a) 0;
0 0 1]
Is there a way to go to the sign convention where R_z =
[cos(a) -sin(a) 0;
sin(a) cos(a) 0;
0 0 1]
without using the transpose function always?
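import numpy as np
from sympy import *
from sympy.physics.mechanics import *
from sympy.physics.vector import ReferenceFrame, Vector
from sympy.physics.vector import time_derivative
from sympy.tensor.array import Array
# CoordSys3D and the express() that works with it live in sympy.vector
from sympy.vector import CoordSys3D, express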
# Planar 3R manipulator (minimal code)
# DH representation
a = Array([0, 1, 2])
d = Array([0.0, 0.0, 0.0])
alpha = Array([0.0, 0.0, 0.0])
# q1, q2, q3 = dynamicsymbols('q1:4')
# q = [q1, q2, q3]
q = [np.pi/4, np.pi/4, 0.0,]
x_p = a[1]*cos(q[0]) + a[2]*cos(q[0] + q[1])
y_p = a[1]*sin(q[0]) + a[2]*sin(q[0] + q[1])
print('x_p, y_p:', x_p, y_p)
def transformationMatrix():
    q_i = Symbol("q_i")
    alpha_i = Symbol("alpha_i")
    a_i = Symbol("a_i")
    d_i = Symbol("d_i")
    T = Matrix([[cos(q_i), -sin(q_i), 0, a_i],
                [sin(q_i) * cos(alpha_i), cos(q_i) * cos(alpha_i), -sin(alpha_i), -sin(alpha_i) * d_i],
                [sin(q_i) * sin(alpha_i), cos(q_i) * sin(alpha_i), cos(alpha_i), cos(alpha_i) * d_i],
                [0, 0, 0, 1]])
    return T
T = transformationMatrix()
q_i = Symbol("q_i")
alpha_i = Symbol("alpha_i")
a_i = Symbol("a_i")
d_i = Symbol("d_i")
T01 = T.subs(alpha_i, alpha[0]).subs(a_i, a[0]).subs(d_i, d[0]).subs(q_i, q[0])
T12 = T.subs(alpha_i, alpha[1]).subs(a_i, a[1]).subs(d_i, d[1]).subs(q_i, q[1])
T23 = T.subs(alpha_i, alpha[2]).subs(a_i, a[2]).subs(d_i, d[2]).subs(q_i, q[2])
T02 = T01*T12
T03 = T02*T23
B = CoordSys3D('Base') # Base (0) reference frame
J1 = CoordSys3D('Joint1', location=T01[0, 3]*B.i + T01[1, 3]*B.j + T01[2, 3]*B.k, rotation_matrix=T01[0:3, 0:3], parent=B)
J2 = CoordSys3D('Joint2', location=T12[0, 3]*J1.i + T12[1, 3]*J1.j + T12[2, 3]*J1.k, rotation_matrix=T12[0:3, 0:3], parent=J1)
J3 = CoordSys3D('Joint3', location=T23[0, 3]*J2.i + T23[1, 3]*J2.j + T23[2, 3]*J2.k, rotation_matrix=T23[0:3, 0:3], parent=J2)
express(J3.position_wrt(B), B)
expected result : produces 0.707106781186548*Base.i + 2.70710678118655*Base.j
actual result: produces 0.707106781186548*Base.i + (-2.70710678118655)*Base.j
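As a numeric cross-check that does not depend on sympy's frame classes, the same chain can be evaluated with plain NumPy. This is only a sketch: the helper names `dh_transform` and `end_position` are hypothetical, and it says nothing definitive about how `CoordSys3D` interprets `rotation_matrix`. It only shows which arithmetic yields each of the two reported vectors.

```python
import numpy as np

def dh_transform(q, alpha, a, d):
    """Homogeneous transform with the same layout as T in the question."""
    cq, sq = np.cos(q), np.sin(q)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([
        [cq,      -sq,      0.0, a],
        [sq * ca, cq * ca, -sa,  -sa * d],
        [sq * sa, cq * sa,  ca,  ca * d],
        [0.0,     0.0,      0.0, 1.0],
    ])

def end_position(transforms, transpose_rotations=False):
    """Accumulate frame origins link by link, optionally transposing each rotation block."""
    R = np.eye(3)
    p = np.zeros(3)
    for T in transforms:
        p = p + R @ T[:3, 3]
        R = R @ (T[:3, :3].T if transpose_rotations else T[:3, :3])
    return p

a = [0.0, 1.0, 2.0]
d = [0.0, 0.0, 0.0]
alpha = [0.0, 0.0, 0.0]
q = [np.pi / 4, np.pi / 4, 0.0]
transforms = [dh_transform(*args) for args in zip(q, alpha, a, d)]

print(end_position(transforms))                            # matches the expected result
print(end_position(transforms, transpose_rotations=True))  # matches the reported result
```

With the rotation blocks used as written, the end position agrees with x_p, y_p. With every rotation block transposed, the j component flips sign, which is consistent with the convention difference described in edit 1.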

How to calculate weight to minimize variance?

Given several vectors:
x1 = [3 4 6]
x2 = [2 8 1]
x3 = [5 5 4]
x4 = [6 2 1]
I want to find weights w1, w2, w3, one per component, and compute the weighted sum of each vector: yi = w1*xi[1] + w2*xi[2] + w3*xi[3]. For example, y1 = 3*w1 + 4*w2 + 6*w3.
The goal is to minimize the variance of these values (y1, y2, y3, y4).
Note: w1, w2, w3 should be > 0, and w1 + w2 + w3 = 1.
I don't know what kind of problem this is, or how to solve it in Python or MATLAB.
You can start by building a loss function from the variance and the constraints on the w's. The mean is m = (1/4)*(y1 + y2 + y3 + y4). The variance is then (1/4)*((y1-m)^2 + (y2-m)^2 + (y3-m)^2 + (y4-m)^2), and the equality constraint enters as a*(w1+w2+w3 - 1), where a is the Lagrange multiplier. This looks to me like a convex optimisation problem with convex constraints, since the loss function is quadratic in the target variables (w1, w2, w3) and the constraints are linear. You can look into projected gradient descent algorithms, which respect the given constraints. Take a look here: http://www.ifp.illinois.edu/~angelia/L5_exist_optimality.pdf In general there is no straightforward analytic solution to this kind of problem.
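To make the "quadratic in w" remark concrete, here is a short sketch with the four vectors from the question stacked as rows of a matrix. The names `X`, `C` and `variance_of_scores` are my own. It checks that the variance of the scores equals the quadratic form w^T C w, where C is the population covariance matrix of the columns.

```python
import numpy as np

# Rows are the vectors x1..x4 from the question.
X = np.array([[3, 4, 6],
              [2, 8, 1],
              [5, 5, 4],
              [6, 2, 1]], dtype=float)

def variance_of_scores(w):
    """Variance of y_i = w . x_i over the four vectors (1/N normalisation)."""
    return np.var(X @ w)

# Population covariance of the three components (bias=True gives the 1/N version).
C = np.cov(X, rowvar=False, bias=True)

w = np.array([0.2, 0.3, 0.5])
print(variance_of_scores(w), w @ C @ w)  # the two numbers agree
```

Because C is positive semidefinite, the objective is convex, and the constraints w >= 0, sum(w) = 1 describe the unit simplex.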
w = [5, 6, 7]
x1 = [3, 4, 6]
x2 = [2, 8, 1]
x3 = [5, 5, 4]
y1, y2, y3 = 0, 0, 0
for index, i in enumerate(w):
    y1 = y1 + i * x1[index]
    y2 = y2 + i * x2[index]
    y3 = y3 + i * x3[index]
print(min(y1, y2, y3))
I think I may understand the purpose of your problem. If you want to find the smallest value, I hope this can help you.
I just used fixed values; you can turn this into a function once you see that it is one way to solve your question.
I don't know much about optimization problems, but I get the idea of gradient descent, so I tried shifting weight between the max score and the min score. My script is below:
# coding: utf-8
import numpy as np
#7.72
#7.6
#8.26
def get_max(alist):
    max_score = max(alist)
    idx = alist.index(max_score)
    return max_score, idx
def get_min(alist):
    min_score = min(alist)
    idx = alist.index(min_score)
    return min_score, idx
def get_weighted(alist,aweight):
    res = []
    for i in range(0, len(alist)):
        res.append(alist[i]*aweight[i])
    return res
def get_sub(list1, list2):
    res = []
    for i in range(0, len(list1)):
        res.append(list1[i] - list2[i])
    return res
def grad_dec(w,dist, st = 0.001):
    # Move a small amount of weight from the largest gap to the smallest one.
    max_item, max_item_idx = get_max(dist)
    min_item, min_item_idx = get_min(dist)
    w[max_item_idx] = w[max_item_idx] - st
    w[min_item_idx] = w[min_item_idx] + st
def cal_score(w, x):
    score = []
    print('weight', w, x)
    for i in range(0, len(x)):
        score_i = 0
        for j in range(0,5):
            score_i = w[j]*x[i][j] + score_i
        score.append(score_i)
    # check variance is small enough
    print('score', score)
    return score
# cal_score(w,x)
if __name__ == "__main__":
    init_w = [0.2, 0.2, 0.2, 0.2, 0.2, 0.2]
    x = [[7.3, 10, 8.3, 8.8, 4.2], [6.8, 8.9, 8.4, 9.7, 4.2], [6.9, 9.9, 9.7, 8.1, 6.7]]
    score = cal_score(init_w,x)
    variance = np.var(score)
    for round_idx in range(0, 100):
        if variance < 0.012:
            print('ok')
            break
        max_score, idx = get_max(score)
        min_score, idx2 = get_min(score)
        weighted_1 = get_weighted(x[idx], init_w)
        weighted_2 = get_weighted(x[idx2], init_w)
        dist = get_sub(weighted_1, weighted_2)
        # print(max_score, idx, min_score, idx2, dist)
        grad_dec(init_w, dist)
        score = cal_score(init_w, x)
        variance = np.var(score)
    print('variance', variance)
    print(score)
In practice it really does reduce the variance. I am very glad, but I don't know whether my solution is mathematically sound.
My full solution can be viewed in PDF.
The trick is to put the vectors x_i as columns of a matrix X.
The problem then becomes a convex problem with the solution constrained to lie on the unit simplex.
I solved it using the Projected Sub Gradient Method.
I calculated the gradient of the objective function and created a projection onto the unit simplex.
Now all that is needed is to iterate them.
I validated my solution using CVX.
% StackOverflow 44984132
% How to calculate weight to minimize variance?
% Remarks:
% 1. sa
% TODO:
% 1. ds
% Release Notes
% - 1.0.000 08/07/2017
% * First release.
%% General Parameters
run('InitScript.m');
figureIdx = 0; %<! Continue from Question 1
figureCounterSpec = '%04d';
generateFigures = OFF;
%% Simulation Parameters
dimOrder = 3;
numSamples = 4;
mX = randi([1, 10], [dimOrder, numSamples]);
vE = ones([dimOrder, 1]);
%% Solve Using CVX
cvx_begin('quiet')
cvx_precision('best');
variable vW(numSamples)
minimize( (0.5 * sum_square_abs( mX * vW - (1 / numSamples) * (vE.' * mX * vW) * vE )) )
subject to
sum(vW) == 1;
vW >= 0;
cvx_end
disp([' ']);
disp(['CVX Solution - [ ', num2str(vW.'), ' ]']);
%% Solve Using Projected Sub Gradient
numIterations = 20000;
stepSize = 0.001;
simplexRadius = 1; %<! Unit Simplex Radius
stopThr = 1e-6;
hKernelFun = @(vW) ((mX * vW) - ((1 / numSamples) * ((vE.' * mX * vW) * vE)));
hObjFun = @(vW) 0.5 * sum(hKernelFun(vW) .^ 2);
hGradFun = @(vW) (mX.' * hKernelFun(vW)) - ((1 / numSamples) * vE.' * (hKernelFun(vW)) * mX.' * vE);
vW = rand([numSamples, 1]);
vW = vW(:) / sum(vW);
for ii = 1:numIterations
vGradW = hGradFun(vW);
vW = vW - (stepSize * vGradW);
% Projecting onto the Unit Simplex
% sum(vW) == 1, vW >= 0.
vW = ProjectSimplex(vW, simplexRadius, stopThr);
end
disp([' ']);
disp(['Projected Sub Gradient Solution - [ ', num2str(vW.'), ' ]']);
%% Restore Defaults
% set(0, 'DefaultFigureWindowStyle', 'normal');
% set(0, 'DefaultAxesLooseInset', defaultLoosInset);
You can see the full code in StackOverflow Q44984132 (PDF is available as well).
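For readers who prefer Python, the same idea (a gradient step followed by projection onto the unit simplex) can be sketched with NumPy on the data from the question. This is not a port of the MATLAB code above. The function names, the sort-based simplex projection, the step size 1/(2*lambda_max) and the iteration count are all my own assumptions, and the four vectors are stacked as rows here rather than columns.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1} (sort-based method)."""
    u = np.sort(v)[::-1]
    css = np.cumsum(u) - 1.0
    k = np.arange(1, len(v) + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]
    theta = css[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def min_variance_weights(X, iters=5000):
    """Projected gradient descent on f(w) = w^T C w, with C the covariance of X's columns."""
    C = np.cov(X, rowvar=False, bias=True)
    # The gradient 2*C*w is Lipschitz with constant 2*lambda_max, so 1/(2*lambda_max) is a safe step.
    step = 1.0 / (2.0 * np.linalg.eigvalsh(C)[-1])
    w = np.full(X.shape[1], 1.0 / X.shape[1])  # start from uniform weights
    for _ in range(iters):
        w = project_to_simplex(w - step * 2.0 * (C @ w))
    return w

# Rows are the vectors x1..x4 from the question.
X = np.array([[3, 4, 6],
              [2, 8, 1],
              [5, 5, 4],
              [6, 2, 1]], dtype=float)

w = min_variance_weights(X)
print("weights:", w, "variance:", np.var(X @ w))
```

The resulting weights are nonnegative and sum to one by construction. A weight can come out exactly zero if the optimum lies on the boundary of the simplex, so the strict w > 0 requirement in the question would need a small lower bound instead.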
