I am wondering what my best approach is in the following scenario. I have 8 unknowns but a virtually unlimited number of non-linear equations, which makes the system over-determined.
unknowns:
U
M
V
N
J
S
W
T
equations:
U*M + V*Catime1 - V*M - Mgtime1 = 0
J*M + W*Catime1 - W*M - Srtime1 = 0
U*N + V*Catime2 - V*N - Mgtime2 = 0
J*N + W*Catime2 - W*N - Srtime2 = 0
U*S + V*Catime3 - V*S - Mgtime3 = 0
J*S + W*Catime3 - W*S - Srtime3 = 0
U*T + V*Catime4 - V*T - Mgtime4 = 0
J*T + W*Catime4 - W*T - Srtime4 = 0
Here's what I need help with:
1) Identify which MATLAB (or Python) function will solve this set of equations.
2) Generate the input (equations) in Python from a large Catime(i-1) and Srtime(i-1) data set.
Write the problem as a non-linear least squares problem and try to minimize it.
This will correctly handle the additional data you have.
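In SciPy this setup could look like the sketch below, using scipy.optimize.least_squares (MATLAB's lsqnonlin plays the same role). The placeholder arrays Ca, Mg, Sr and the assumption that every time step contributes the same pair of equations as above are mine, not part of the original question:

import numpy as np
from scipy.optimize import least_squares

# Placeholder data; in practice these come from the large Catime/Mgtime/Srtime data set.
Ca = np.array([1.0, 1.1, 1.2, 1.3])
Mg = np.array([0.5, 0.6, 0.7, 0.8])
Sr = np.array([0.2, 0.3, 0.4, 0.5])

def residuals(p):
    # unknowns: U, M, V, N, J, S, W, T
    U, M, V, N, J, S, W, T = p
    theta = np.array([M, N, S, T])         # the per-time unknowns appearing in each pair
    r_mg = U*theta + V*Ca - V*theta - Mg   # first equation of each pair
    r_sr = J*theta + W*Ca - W*theta - Sr   # second equation of each pair
    return np.concatenate([r_mg, r_sr])    # more time steps simply add more residual rows

sol = least_squares(residuals, x0=np.ones(8))
print(sol.x)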
I'm working with an ensemble of precipitation data that I want to post-process. One technique that I want to try is EMOS (Ensemble Model Output Statistics). There is an R package called EnsembleMOS that does the job. However, since most of my other scripts are in Python, I want to translate the ensembleMOScsg0 part of that package to Python.
The problem that I'm facing is that scipy.optimize.minimize is returning the initial guess. I've found many different questions with a similar issue, and the answers were almost always related to some kind of problem on the function that is minimized.
My function:
# imports used by the snippet (numpy and scipy.stats, referenced as np and ss below)
import numpy as np
import scipy.stats as ss

def crps(train, obs, pars):
    a = pars[0]
    c = pars[1]
    d = pars[2]
    shift = pars[3]
    # print(a, c, d, pars[6])
    media = a
    crps = 0
    for i in range(len(train)):
        var = np.var(train.iloc[i])
        for j in range(len(train.columns) - 1):
            media += pars[j + 4]*train.iloc[i, j]
        dp = c + d*var
        shape = (media**2)/dp
        scale = dp/media
        gamma = ss.gamma.cdf
        beta = ss.beta.rvs
        x = obs.iloc[i]
        z = (x + shift**2)/scale
        cc = (shift**2)/scale
        crps += scale*z*(2*gamma(z, a=shape) - 1) \
            - scale*cc*(gamma(cc, a=shape))**2 \
            + media*(1 + 2*gamma(cc, a=shape)*gamma(cc, a=shape+1) - gamma(cc, a=shape))**2 \
            - 2*gamma(z, a=shape+1) \
            - media*(1 - gamma(2*cc, a=shape)) \
            * beta(0.5, shape) \
            / np.pi
    return crps
The objective of the minimization is to find the parameters of a linear regression equation and its variance; that's why I separated the first 4 pars (a, the intercept; c and d, for the variance; and shift, part of the gamma distribution) from the other 12 (the b's: since I have 12 ensemble members, I need 12 b's).
The minimization part is:
from functools import partial
from scipy.optimize import minimize

pars = [0.0001, 0.0001, 0.0001, 0.0001] + [0.0001]*12
bnds = ((.0001, 20),) * 4
bnds = bnds + ((.0001, 20),) * 12
result = minimize(partial(crps, train, obs), pars, method='L-BFGS-B', bounds=bnds)
The process ends with a success flag, but it always converges to the initial values.
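Not from the original post, but a quick diagnostic for this symptom is to look at the finite-difference gradient of the objective at the starting point; if it is numerically zero at the step size L-BFGS-B uses, the solver will report success without moving:

import numpy as np
from functools import partial
from scipy.optimize import approx_fprime

f = partial(crps, train, obs)        # objective as a function of pars only
x0 = np.array(pars)
grad = approx_fprime(x0, f, 1e-8)    # step comparable to L-BFGS-B's default eps
print(grad)                          # an (almost) all-zero gradient would explain stopping at x0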
Now I'm thinking about using rpy2 and calling the R package directly from Python.
More information about this procedure can be found in Baran and Nemoda (2016).
I tried to solve 1D coupled PDEs for an advection-diffusion-reaction problem with the Matlab function pdepe (https://www.mathworks.com/help/matlab/ref/pdepe.html). This function does not work properly in my case, where the advection term is large compared to the diffusion term.
Therefore, I searched and found this option of using the Python library FiPy to solve my PDEs system.
My initial conditions are u1 = 1 for 4*L/10 < x < 6*L/10 (0 elsewhere) and u2 = 1 everywhere.
My coupled equations are of the following form:
du1/dt = d/dx(D1 * du1/dx) + g * x * du1/dx - mu1 * u1 / (K + u1) * u2
du2/dt = d/dx(D2 * du2/dx) + g * x * du2/dx + mu2 * u1 / (K + u1) * u2
I tried to write it by combining the FiPy examples (examples.convection.exponential1DSource.mesh1D, examples.levelSet.advection.mesh1D, examples.cahnHilliard.mesh2DCoupled).
The following lines are not a working example, but my first attempt at the code. This is my first use of FiPy (beyond the tests and examples in the documentation), so this might completely miss the point for regular users.
from fipy import *
g = 0.66
L = 10.
nx = 1000
mu1 = 1.
mu2 = 1.
K = 1.
D1 = 1.
D2 = 1.
mesh = Grid1D(dx=L / 1000, nx=nx)
x = mesh.cellCenters[0]
convCoeff = g*(x-L/2)
u10 = 4*L/10 < x < 6*L/10
u20 = 1.
u1 = CellVariable(name="u1", mesh=mesh, value=u10)
u2 = CellVariable(name="u2", mesh=mesh, value=u20)
## Neumann boundary conditions
u1.faceGrad.constrain(0., where=mesh.facesLeft)
u1.faceGrad.constrain(0., where=mesh.facesRight)
u2.faceGrad.constrain(0., where=mesh.facesLeft)
u2.faceGrad.constrain(0., where=mesh.facesRight)
sourceCoeff1 = -1*mu1*u1/(K+u1)*u2
sourceCoeff2 = 1*mu2*u1/(K+u1)*u2
eq11 = (TransientTerm(var=u1) == DiffusionTerm(coeff=D1, var=u1) + ConvectionTerm(coeff=convCoeff))
eq21 = (TransientTerm(var=u2) == DiffusionTerm(coeff=D2, var=u2) + ConvectionTerm(coeff=convCoeff))
eq12 = ImplicitSourceTerm(coeff=sourceCoeff1, var=u1)
eq22 = ImplicitSourceTerm(coeff=sourceCoeff2, var=u2)
eq1 = eq11 & eq12
eq2 = eq21 & eq22
eqn = eq1 & eq2
vi = Viewer((u1, u2))
for t in range(100):
    u1.updateOld()
    u2.updateOld()
    eqn.solve(dt=1.e-3)
    vi.plot()
Thank you for any suggestion or correction.
If you happen to know a good tutorial for this specific kind of problem, I would be happy to read it, since I did not find anything better than the examples in the FiPy documentation.
Several issues:
Python chained comparisons do not work with NumPy arrays and therefore do not work in FiPy. So, write
u10 = (4*L/10 < x) & (x < 6*L/10)
Further, this makes u10 a field of Booleans, which confuses FiPy, so
write
u10 = ((4*L/10 < x) & (x < 6*L/10)) * 1.
or, better yet, write
u1 = CellVariable(name="u1", mesh=mesh, value=0., hasOld=True)
u2 = CellVariable(name="u2", mesh=mesh, value=1., hasOld=True)
u1.setValue(1., where=(4*L/10 < x) & (x < 6*L/10))
ConvectionTerm takes a vector coefficient. One way to get this is
convCoeff = g*(x-L/2) * [[1.]]
which represents a 1D rank-1 variable
If you declare which Variable a Term applies to, you must do it for all Terms, so write, e.g.,
ConvectionTerm(coeff=convCoeff, var=u1)
ConvectionTerm(coeff=g*x, var=u1)
does not represent g * x * du1/dx. It represents d(g * x * u1)/dx. So, I believe you'll want
ConvectionTerm(coeff=convCoeff, var=u1) - ImplicitSourceTerm(coeff=g, var=u1)
ImplicitSourceTerm(coeff=sourceCoeff1, var=u1) does not represent
-1*mu1*u1/(K+u1)*u2; rather, it represents -1*mu1*u1/(K+u1)*u2*u1. So, for best coupling between equations, write
sourceCoeff1 = -mu1*u1/(K+u1)
sourceCoeff2 = mu2*u2/(K+u1)
... ImplicitSourceTerm(coeff=sourceCoeff1, var=u2) ...
... ImplicitSourceTerm(coeff=sourceCoeff2, var=u1) ...
As pointed out by @wd15 in the comments, you are declaring four equations for two unknowns. & does not mean "add two equations together" (which can be accomplished with +); rather, it means "solve these two equations simultaneously". So, write
sourceCoeff1 = mu1*u1/(K+u1)
sourceCoeff2 = mu2*u2/(K+u1)
eq1 = (TransientTerm(var=u1)
       == DiffusionTerm(coeff=D1, var=u1)
       + ConvectionTerm(coeff=convCoeff, var=u1)
       - ImplicitSourceTerm(coeff=g, var=u1)
       - ImplicitSourceTerm(coeff=sourceCoeff1, var=u2))
eq2 = (TransientTerm(var=u2)
       == DiffusionTerm(coeff=D2, var=u2)
       + ConvectionTerm(coeff=convCoeff, var=u2)
       - ImplicitSourceTerm(coeff=g, var=u2)
       + ImplicitSourceTerm(coeff=sourceCoeff2, var=u1))
eqn = eq1 & eq2
A CellVariable must be declared with hasOld=True in order to call updateOld(), so
u1 = CellVariable(name="u1", mesh=mesh, value=u10, hasOld=True)
u2 = CellVariable(name="u2", mesh=mesh, value=u20, hasOld=True)
Full code that seems to work is here
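Since the linked script is not reproduced here, the following is a sketch that assembles the corrections from this answer into one file, reusing the names and parameter values from the question; it is not necessarily the answerer's exact code:

from fipy import (CellVariable, ConvectionTerm, DiffusionTerm, Grid1D,
                  ImplicitSourceTerm, TransientTerm, Viewer)

g = 0.66
L = 10.
nx = 1000
mu1 = mu2 = K = D1 = D2 = 1.

mesh = Grid1D(dx=L / nx, nx=nx)
x = mesh.cellCenters[0]

u1 = CellVariable(name="u1", mesh=mesh, value=0., hasOld=True)
u2 = CellVariable(name="u2", mesh=mesh, value=1., hasOld=True)
u1.setValue(1., where=(4 * L / 10 < x) & (x < 6 * L / 10))

# zero-flux (Neumann) boundary conditions
u1.faceGrad.constrain(0., where=mesh.facesLeft)
u1.faceGrad.constrain(0., where=mesh.facesRight)
u2.faceGrad.constrain(0., where=mesh.facesLeft)
u2.faceGrad.constrain(0., where=mesh.facesRight)

convCoeff = g * (x - L / 2) * [[1.]]   # rank-1 coefficient for the ConvectionTerm

sourceCoeff1 = mu1 * u1 / (K + u1)
sourceCoeff2 = mu2 * u2 / (K + u1)

eq1 = (TransientTerm(var=u1)
       == DiffusionTerm(coeff=D1, var=u1)
       + ConvectionTerm(coeff=convCoeff, var=u1)
       - ImplicitSourceTerm(coeff=g, var=u1)
       - ImplicitSourceTerm(coeff=sourceCoeff1, var=u2))
eq2 = (TransientTerm(var=u2)
       == DiffusionTerm(coeff=D2, var=u2)
       + ConvectionTerm(coeff=convCoeff, var=u2)
       - ImplicitSourceTerm(coeff=g, var=u2)
       + ImplicitSourceTerm(coeff=sourceCoeff2, var=u1))
eqn = eq1 & eq2

vi = Viewer((u1, u2))
for step in range(100):
    u1.updateOld()
    u2.updateOld()
    eqn.solve(dt=1.e-3)
    vi.plot()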
I'm trying to get SymPy to solve a system of equations but it gives me an error saying:
NotImplementedError: could not solve 3*sin(3*t0/2)*tan(t0) + 2*cos(3*t0/2) - 4
Is there another way to solve this system of equations:
sin(x)+(y-x)cos(x) = 0
-1.5(y-x)sin(1.5x)+cos(1.5x) = 2
I used:
from sympy import *
x, y = symbols('x y')
solve([sin(x) + (y - x)*cos(x), -1.5*(y - x)*sin(1.5*x) + cos(1.5*x) - 2], x, y)
SymPy could do better with this equation, but ultimately it's equivalent to some 10th degree polynomial the roots of which can only be represented abstractly. I'll describe the steps one can take and show how far SymPy can go. It's a semi-manual solution process which should be more automatic.
First of all, don't put 1.5, or other floating point numbers, in the equations. Instead, introduce a coefficient a = Rational(3, 2) and use that:
eq = [sin(x) + (y-x)*cos(x), -a*(y-x)*sin(a*x) + cos(a*x) - 2]
Variable y can be eliminated using the first equation: y=x-tan(x), which is easy for us to see, but SymPy sometimes misses the opportunity. Let's help it:
eq1 = eq[1].subs(y, x-tan(x)) # 3*sin(3*x/2)*tan(x)/2 + cos(3*x/2) - 2
As is, solve and solveset (an alternative SymPy solver) give up on the equation because of this mix of trigonometric functions of different arguments. Some of us remember from school days that trigonometric functions can be expressed as rational functions of the tangent of half-argument, so let's do that: rewrite the equation in terms of tan.
eq2 = eq1.rewrite(tan) # (-tan(3*x/4)**2 + 1)/(tan(3*x/4)**2 + 1) - 2 + 3*tan(3*x/4)*tan(x)/(tan(3*x/4)**2 + 1)
As mentioned, this halves the argument. Having fractions like x/4 in trig functions is bad. Introduce a new symbol, var('u'), and make u = x/4:
eq3 = eq2.subs(x, 4*u) # (-tan(3*u)**2 + 1)/(tan(3*u)**2 + 1) - 2 + 3*tan(3*u)*tan(4*u)/(tan(3*u)**2 + 1)
Now we can expand all these tangents in terms of tan(u), using expand_trig. The equation gets longer:
eq4 = expand_trig(eq3) # (1 - (-tan(u)**3 + 3*tan(u))**2/(-3*tan(u)**2 + 1)**2)/(1 + (-tan(u)**3 + 3*tan(u))**2/(-3*tan(u)**2 + 1)**2) - 2 + 3*(-4*tan(u)**3 + 4*tan(u))*(-tan(u)**3 + 3*tan(u))/((1 + (-tan(u)**3 + 3*tan(u))**2/(-3*tan(u)**2 + 1)**2)*(-3*tan(u)**2 + 1)*(tan(u)**4 - 6*tan(u)**2 + 1))
But it's also simpler because tan(u) can be treated as another unknown, say v.
eq5 = eq4.subs(tan(u), v) # (1 - (-v**3 + 3*v)**2/(-3*v**2 + 1)**2)/(1 + (-v**3 + 3*v)**2/(-3*v**2 + 1)**2) - 2 + 3*(-4*v**3 + 4*v)*(-v**3 + 3*v)/((1 + (-v**3 + 3*v)**2/(-3*v**2 + 1)**2)*(-3*v**2 + 1)*(v**4 - 6*v**2 + 1))
Great, now we have a rational function. It can be handled with solveset(eq5, v). By default solveset gives all complex solutions and we need only real roots among them, so let's specify the domain as Reals:
vsol = list(solveset(eq5, v, domain=S.Reals))
There is no algebraic formula for these, so they are recorded somewhat abstractly but these are actual numbers we can work with:
[CRootOf(3*v**10 + 9*v**8 - 78*v**6 + 22*v**4 - 21*v**2 + 1, 0),
CRootOf(3*v**10 + 9*v**8 - 78*v**6 + 22*v**4 - 21*v**2 + 1, 1),
CRootOf(3*v**10 + 9*v**8 - 78*v**6 + 22*v**4 - 21*v**2 + 1, 2),
CRootOf(3*v**10 + 9*v**8 - 78*v**6 + 22*v**4 - 21*v**2 + 1, 3)]
For example, we can go back to x and y now, and evaluate the solutions:
xsol = [4*atan(v) for v in vsol]
ysol = [x - tan(x) for x in xsol]
numsol = [(N(x), N(y)) for x, y in zip(xsol, ysol)]
Numeric values are
[(-4.35962510714700, -1.64344290066272),
(-0.877886785847899, 0.326585146723377),
(0.877886785847899, -0.326585146723377),
(4.35962510714700, 1.64344290066272)]
Of course there are infinitely more because the tangent is periodic. Finally, let's check these actually work:
residuals = [[e.subs({x: xv, y: yv}) for e in eq] for xv, yv in numsol]
These are a bunch of numbers of order 1e-15 or less, so yes, the equations hold within machine precision.
Unlike a purely numeric solution we'd get from SciPy or other numeric solvers, these can be evaluated with any accuracy without repeating the process. For example, 50 digits of the first x-solution:
xsol[0].evalf(50) # -4.3596251071470021258397061103704574594477338857831
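For convenience, here are the steps above collected into a single runnable script (the same operations, just in sequence):

from sympy import (Rational, S, symbols, sin, cos, tan, atan,
                   expand_trig, solveset, N)

x, y, u, v = symbols('x y u v')
a = Rational(3, 2)

eq = [sin(x) + (y - x)*cos(x), -a*(y - x)*sin(a*x) + cos(a*x) - 2]

eq1 = eq[1].subs(y, x - tan(x))   # eliminate y
eq2 = eq1.rewrite(tan)            # half-angle tangents
eq3 = eq2.subs(x, 4*u)            # avoid fractional arguments
eq4 = expand_trig(eq3)            # everything in terms of tan(u)
eq5 = eq4.subs(tan(u), v)         # a rational function of v

vsol = list(solveset(eq5, v, domain=S.Reals))
xsol = [4*atan(vv) for vv in vsol]
ysol = [xx - tan(xx) for xx in xsol]
print([(N(xx), N(yy)) for xx, yy in zip(xsol, ysol)])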
Just for the fun of it, here is a manual solution that only requires solving a polynomial of degree 5:
Write t = x/2, a = y-x, s = sin t, c = cos t, S = sin x and
C = cos x.
Then the given equations can be rewritten as
(1) 2 sc + a (c^2 - s^2) = 0
(2) 3 a s^3 - 9 a c^2 s - 6 c s^2 + 2 c^3 = 4
Multiplying (1) by 3 s and adding to (2):
(3) -6 a c^2 s + 2 c^3 = 4
Next we substitute a = -S / C and use S = 2sc and s^2 = 1 - c^2:
(4) 12 c^3 (1 - c^2) / C + 2 c^3 = 4
Multiply with C = 2 c^2 - 1:
(5) c^3 (12 - 12 c^2 + 4 c^2 - 2) = 8 c^2 - 4
Finally,
(6) 4 c^5 - 5 c^3 + 4 c^2 - 2 = 0
This has a pair of complex solutions, one real solution outside the domain of the cosine and another two solutions which give the four principal solutions for x.
(7) c_1/2 = 0.90520121, -0.57206084
(8) x_1/2/3/4 = +/- 2 arccos(c_1/2)
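As a quick numerical cross-check (not part of the original derivation), the real roots of (6) can be found with NumPy and mapped back to x; they reproduce the four principal solutions obtained with SymPy above:

import numpy as np

# roots of 4 c^5 - 5 c^3 + 4 c^2 - 2 = 0 (coefficients in descending powers)
roots = np.roots([4, 0, -5, 4, 0, -2])
c = roots[np.abs(roots.imag) < 1e-12].real
c = c[np.abs(c) <= 1]                        # discard the root outside the range of cos
xs = np.sort(np.concatenate([2*np.arccos(c), -2*np.arccos(c)]))
print(xs)                                    # approx [-4.3596, -0.8779, 0.8779, 4.3596]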
I am using Python's scikit-learn package to implement PCA. I am getting a math domain error:
C:\Users\Akshenndra\Anaconda2\lib\site-packages\sklearn\decomposition\pca.pyc in _assess_dimension_(spectrum, rank, n_samples, n_features)
78 for j in range(i + 1, len(spectrum)):
79 pa += log((spectrum[i] - spectrum[j]) *
---> 80 (1. / spectrum_[j] - 1. / spectrum_[i])) + log(n_samples)
81
82 ll = pu + pl + pv + pp - pa / 2. - rank * log(n_samples) / 2.
ValueError: math domain error
I already know that a math domain error is raised when we take the logarithm of a negative number, but I don't understand how there can be a negative number inside the logarithm here, because this code works fine for other datasets.
Maybe this is related to what is written on scikit-learn's website: "This implementation uses the scipy.linalg implementation of the singular value decomposition. It only works for dense arrays and is not scalable to large dimensional data." (There are a large number of 0 values in my data.)
I think you should add 1 instead, as on the NumPy log1p documentation page.
Since log(p + 1) = 0 when p = 0 (while log(1e-99) is a huge negative number), and as the quote in the link says:
"For real-valued input, log1p is accurate also for x so small that 1 + x == 1 in floating-point accuracy"
The code can be modified as follows to make what you are trying to do more reasonable:
for i in range(rank):
    for j in range(i + 1, len(spectrum)):
        pa += log((spectrum[i] - spectrum[j]) *
                  (1. / spectrum_[j] - 1. / spectrum_[i]) + 1) + log(n_samples + 1)

ll = pu + pl + pv + pp - pa / 2. - rank * log(n_samples + 1) / 2
I don't know whether I am right or not, but I did find a way to solve it.
I printed some debug information (the values of spectrum_[i] and spectrum_[j]), and I found:
sometimes, they are the same!!!
(Maybe they are not exactly the same, but they are too close, I guess.)
so, here
pa += log((spectrum[i] - spectrum[j]) *
          (1. / spectrum_[j] - 1. / spectrum_[i])) + log(n_samples)
it will raise an error when it computes log(0).
My way to solve it is to add a very small number, 1e-99, so it becomes log(0 + 1e-99).
so you can just change it to:
pa += log((spectrum[i] - spectrum[j]) *
          (1. / spectrum_[j] - 1. / spectrum_[i]) + 1e-99) + log(n_samples)
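For illustration only (not from either answer), this is the failure mode both answers describe: when two spectrum values coincide, the argument of the log becomes zero and math.log raises the error:

from math import log

s_i = s_j = 0.5          # two (near-)identical eigenvalues in the spectrum
n_samples = 100
try:
    log((s_i - s_j) * (1. / s_j - 1. / s_i)) + log(n_samples)
except ValueError as err:
    print(err)           # "math domain error", i.e. log(0)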
I'd like to implement Euler's method (the explicit and the implicit one)
(https://en.wikipedia.org/wiki/Euler_method) for the following model:
x'(t) = q * (x_M - x(t)) * x(t)
x(0) = x_0
where q, x_M and x_0 are real numbers.
I know already the (theoretical) implementation of the method. But I couldn't figure out where I can insert / change the model.
Could anybody help?
EDIT: You were right. I didn't understand the method correctly. Now, after a few hours, I think I really got it! With the explicit method I'm pretty sure (nevertheless, could anybody please have a look at my code?).
With the implicit implementation I'm not so sure it's correct. Could anyone please have a look at the implicit method and give me feedback on what is correct / not good?
def explizit_euler():
    ''' x(t)' = q(xM -x(t))x(t)
        x(0) = x0'''
    q = 2.
    xM = 2
    x0 = 0.5
    T = 5
    dt = 0.01
    N = T / dt
    x = x0
    t = 0.
    for i in range(0, int(N)):
        t = t + dt
        x = x + dt * (q * (xM - x) * x)
        print '%6.3f %6.3f' % (t, x)

def implizit_euler():
    ''' x(t)' = q(xM -x(t))x(t)
        x(0) = x0'''
    q = 2.
    xM = 2
    x0 = 0.5
    T = 5
    dt = 0.01
    N = T / dt
    x = x0
    t = 0.
    for i in range(0, int(N)):
        t = t + dt
        x = (1.0 / (1.0 - q * (xM + x) * x))
        print '%6.3f %6.3f' % (t, x)
Pre-emptive note: Although the general idea should be correct, I did all the algebra in place in the editor box so there might be mistakes there. Please, check it yourself before using for anything really important.
I'm not sure how you come to the "implicit" formula
x = (1.0 / (1.0 - q *(xM + x) * x))
but this is wrong and you can check it by comparing your "explicit" and "implicit" results: they should slightly diverge but with this formula they will diverge drastically.
To understand the implicit Euler method, you should first get the idea behind the explicit one. And the idea is really simple and is explained at the Derivation section in the wiki: since derivative y'(x) is a limit of (y(x+h) - y(x))/h, you can approximate y(x+h) as y(x) + h*y'(x) for small h, assuming our original differential equation is
y'(x) = F(x, y(x))
Note that the reason this is only an approximation rather than exact value is that even over small range [x, x+h] the derivative y'(x) changes slightly. It means that if you want to get a better approximation of y(x+h), you need a better approximation of "average" derivative y'(x) over the range [x, x+h]. Let's call that approximation just y'. One idea of such improvement is to find both y' and y(x+h) at the same time by saying that we want to find such y' and y(x+h) that y' would be actually y'(x+h) (i.e. the derivative at the end). This results in the following system of equations:
y'(x+h) = F(x+h, y(x+h))
y(x+h) = y(x) + h*y'(x+h)
which is equivalent to a single "implicit" equation:
y(x+h) - y(x) = h * F(x+h, y(x+h))
It is called "implicit" because here the target y(x+h) is also a part of F. And note that quite similar equation is mentioned in the Modifications and extensions section of the wiki article.
So now going to your case that equation becomes
x(t+dt) - x(t) = dt*q*(xM -x(t+dt))*x(t+dt)
or equivalently
dt*q*x(t+dt)^2 + (1 - dt*q*xM)*x(t+dt) - x(t) = 0
This is a quadratic equation with two solutions:
x(t+dt) = [(dt*q*xM - 1) ± sqrt((dt*q*xM - 1)^2 + 4*dt*q*x(t))]/(2*dt*q)
Obviously we want the solution that is "close" to the x(t) which is the + solution. So the code should be something like:
b = (q * xM * dt - 1)
x(t+h) = (b + (b ** 2 + 4 * q * x(t) * dt) ** 0.5) / 2 / q / dt
(Editor's note:) Multiplying numerator and denominator by the conjugate gives a form that is numerically more stable for small dt (where b < 0):
x(t+h) = 2 * x(t) / ((b ** 2 + 4 * q * x(t) * dt) ** 0.5 - b)
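For completeness, here is how the question's implizit_euler loop might look with this update formula, a sketch using the question's parameters and its Python 2 print style:

def implizit_euler():
    ''' x'(t) = q*(xM - x(t))*x(t), x(0) = x0,
        implicit Euler via the quadratic-formula update derived above '''
    q = 2.
    xM = 2
    x0 = 0.5
    T = 5
    dt = 0.01
    N = int(T / dt)
    x = x0
    t = 0.
    for i in range(N):
        t = t + dt
        b = q * xM * dt - 1
        # stable "+" root, written in the conjugate form (good for small dt, where b < 0)
        x = 2 * x / ((b ** 2 + 4 * q * x * dt) ** 0.5 - b)
        print '%6.3f %6.3f' % (t, x)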