Using Python 3.9.7 in VS Code. I'm doing some stress analysis in Python:
from math import sin, cos, tan, degrees, radians
from operator import eq
from sympy import *
x, y, z, a = symbols("x y z a")
vm_crit = ((x - y)**2 + (y - z)**2 + (z - x)**2)**(1/2) -(2**(1/2))*a
vm_crit_sub = vm_crit.subs(x, z)
vm_crit_solve = solveset(vm_crit_sub,y)
print(vm_crit_solve)
I get:
ConditionSet(y, Eq(-1.4142135623731*a + ((-y + z)**2 + (y - z)**2)**0.5, 0), Complexes)
The second argument is correct, but for some reason the function isn't solving for 'y'. If I get rid of the squares:
x, y, z, a = symbols("x y z a")
vm_crit = (x + y + z) - a
vm_crit_sub = vm_crit.subs(x, z)
vm_crit_solve = solveset(vm_crit_sub,y)
print(vm_crit_solve)
I get:
{a - 2*z}
This is correct.
In sympy do you have to expand before solving? If so, is there a way around it?
Thanks for any help.
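One observation worth illustrating with a sketch of my own (this is not an answer from the thread, and the exact output may differ between SymPy versions): Python evaluates **(1/2) to the float 0.5 before SymPy ever sees it, and solveset copes much better with exact exponents, so keeping the expression exact with sqrt (or rationalizing an expression that already contains floats with nsimplify) is usually the first thing to try:
from sympy import symbols, sqrt, nsimplify, solveset

x, y, z, a = symbols("x y z a")

# exact version: sqrt keeps the exponent as the rational 1/2 instead of the float 0.5
vm_crit = sqrt((x - y)**2 + (y - z)**2 + (z - x)**2) - sqrt(2)*a
print(solveset(vm_crit.subs(x, z), y))

# if an expression already contains floats, nsimplify can convert them back to exact values
vm_crit_float = ((x - y)**2 + (y - z)**2 + (z - x)**2)**(1/2) - (2**(1/2))*a
print(solveset(nsimplify(vm_crit_float.subs(x, z)), y))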
General:
I am using maximum entropy to find a distribution on positive integer vectors. I can estimate the mean and variance, and I have three equations from which I am trying to find a and b.
The equations:
integral(exp(a*x^2 + b*x + c), x, 0, infinity) - 1
integral(x*exp(a*x^2 + b*x + c), x, 0, infinity) - mean
integral(x^2*exp(a*x^2 + b*x + c), x, 0, infinity) - mean^2 - var
(all integrals are over [0, ∞))
The problem:
I am trying to use a numerical solver, and I used fsolve from scipy.optimize.
But I guess I am missing some knowledge.
My code:
import numpy as np
import sympy as sym
from scipy.optimize import *
def myFunction(x,*data):
    y = sym.symbols('y')
    m,v=data
    F = [0]*3
    x[0] = - abs(x[0])
    print(x)
    F[0] = (sym.integrate(sym.exp(x[0] * y ** 2 + x[1] * y + x[2]), (y, 0,sym.oo)) -1).evalf()
    F[1] = (sym.integrate(y*sym.exp(x[0] * y ** 2 + x[1] * y + x[2]), (y, 0,sym.oo))-m).evalf()
    F[2] = (sym.integrate((y**2)*sym.exp(x[0] * y ** 2 + x[1] * y + x[2]), (y,0,sym.oo)) -v-m).evalf()
    print(F)
    return F
data = (10,3.5) # mean and var for example
xGuess = [1, 1, 1]
z = fsolve(myFunction,xGuess,args = data)
print(z)
My results are not that accurate; is there a better way to solve it? The residuals I get are:
integral(exp(a*x^2 + b*x + c), x, 0, infinity) - 1 = 5.67659292676884
integral(x*exp(a*x^2 + b*x + c), x, 0, infinity) - mean = -1.32123173796713
integral(x^2*exp(a*x^2 + b*x + c), x, 0, infinity) - mean^2 - var = -2.20825624606312
Thanks
I have rewritten the problem, replacing sympy with numpy/scipy and lambdas (inline functions).
Also note that in your problem statement the third equation subtracts mean^2, but in your code you only subtract mean.
import numpy as np
from scipy.optimize import minimize
from scipy.integrate import quad
def myFunction(x, data):
    m, v = data
    F = np.zeros(3)  # use numpy array
    # use scipy.integrate.quad for integration of lambda functions
    # quad output is (result, error), so we just select the result value at the end
    F[0] = quad(lambda y: np.exp(x[0] * y ** 2 + x[1] * y + x[2]), 0, np.inf)[0] - 1
    F[1] = quad(lambda y: y * np.exp(x[0] * y ** 2 + x[1] * y + x[2]), 0, np.inf)[0] - m
    F[2] = quad(lambda y: (y ** 2) * np.exp(x[0] * y ** 2 + x[1] * y + x[2]), 0, np.inf)[0] - v - m ** 2
    # minimize the squared error
    return np.sum(F ** 2)

data = (10, 3.5)  # mean and var for example
xGuess = [-1, 1, 1]
z = minimize(lambda x: myFunction(x, data), x0=xGuess,
             bounds=((None, 0), (None, None), (None, None)))  # use bounds for a negative first coefficient
print(z)
# x: array([-0.99899311, 2.18819689, 1.85313181])
Does this seem more reasonable?
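As a follow-up check of my own (not part of the original answer), the residuals of the three constraints can be recomputed at the returned optimum to see how well they are satisfied:
# assumes z, data, quad and np from the snippet above are still in scope
a, b, c = z.x
norm = quad(lambda y: np.exp(a*y**2 + b*y + c), 0, np.inf)[0]
mean = quad(lambda y: y*np.exp(a*y**2 + b*y + c), 0, np.inf)[0]
second = quad(lambda y: y**2*np.exp(a*y**2 + b*y + c), 0, np.inf)[0]
print(norm - 1, mean - data[0], second - data[1] - data[0]**2)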
When I run the following code I get:
TypeError: can't multiply sequence by non-int of type 'Add'
Can anyone explain why I get this error?
from sympy.core.symbol import symbols
from sympy.solvers.solveset import nonlinsolve
x, y, z, r, R, a, m, n, b, k1, k2 = symbols('x,y,z,r,R,a,m,n,b,k1,k2', positive=True)
f1 = r * x * (1 - x / k1) - (a * z * x ** (n + 1)) / (x ** n + y ** n)
f2 = R * y * (1 - y / k2) - (b * z * y ** (n + 1)) / (x ** n + y ** n)
f3 = z * (a * x ** (n + 1) + b * y ** (n + 1)) / (x ** n + y ** n) - m * z
f = [f1, f2, f3]
nonlinsolve(f, [x, y, z])
The error message is not really descriptive but the full stack trace indicates where the problem was: SymPy tries to work with the expression as if it was a polynomial, and finds that impossible because the exponent n is a symbol rather than a concrete integer.
Simply put, SymPy does not have an algorithm for solving systems like that one (and I'm not sure if any CAS has).
When written in polynomial form, the system has monomials of total degree n+2. So, already for n = 1 this is utterly hopeless: a system of three cubic equations with three unknowns. SymPy can solve the case n = 0, and I wouldn't expect anything more than that.
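For illustration (my addition, not part of the original answer), substituting a concrete integer for n turns the system into a polynomial one; the n = 0 case mentioned above can then be attempted directly, although it may still take a while:
from sympy.core.symbol import symbols
from sympy.solvers.solveset import nonlinsolve

x, y, z, r, R, a, m, n, b, k1, k2 = symbols('x,y,z,r,R,a,m,n,b,k1,k2', positive=True)
f1 = r * x * (1 - x / k1) - (a * z * x ** (n + 1)) / (x ** n + y ** n)
f2 = R * y * (1 - y / k2) - (b * z * y ** (n + 1)) / (x ** n + y ** n)
f3 = z * (a * x ** (n + 1) + b * y ** (n + 1)) / (x ** n + y ** n) - m * z

# make the exponent concrete so the system is polynomial in x, y, z
f_concrete = [fi.subs(n, 0) for fi in (f1, f2, f3)]
print(nonlinsolve(f_concrete, [x, y, z]))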
Pretend I start with some simple dataset which is defined on R2 as follows:
DataPointsDomain = [0,1,2,3,4,5]
DataPointsRange = [3,6,5,7,9,1]
With scipy I can make a lazy polynomial spline using the following:
import scipy.interpolate

ScipySplineObject = scipy.interpolate.InterpolatedUnivariateSpline(
    DataPointsDomain,
    DataPointsRange,
    k=1,
)
What is the equivalent object in sympy??
SympySplineObject = ...???
(I want to define this object and do analytic sympy manipulation like taking integrals, derivatives, etc. on the sympy object.)
In SymPy versions above 1.1.1, including the current development version, there is a built-in method interpolating_spline which takes four arguments: the spline degree, the variable, domain values and range values.
from sympy import *
DataPointsDomain = [0,1,2,3,4,5]
DataPointsRange = [3,6,5,7,9,1]
x = symbols('x')
s = interpolating_spline(3, x, DataPointsDomain, DataPointsRange)
This returns
Piecewise((23*x**3/15 - 33*x**2/5 + 121*x/15 + 3, (x >= 0) & (x <= 2)),
(-2*x**3/3 + 33*x**2/5 - 55*x/3 + 103/5, (x >= 2) & (x <= 3)),
(-28*x**3/15 + 87*x**2/5 - 761*x/15 + 53, (x >= 3) & (x <= 5)))
which is a "not a knot" cubic spline through the given points.
Old answer
An interpolating spline can be constructed with SymPy, but this takes some effort. The method bspline_basis_set returns the basis of B-splines for given x-values, but then it's up to you to find their coefficients.
First, we need the list of knots, which is not exactly the same as the list of x-values (xv below). The endpoints xv[0] and xv[-1] will appear deg+1 times where deg is the degree of the spline, because at the endpoints all the coefficients change values (from something to zero). Also, some of the x-values close to them may not appear at all, as there will be no changes of coefficients there ("not a knot" conditions). Finally, for even-degree splines (yuck) the interior knots are placed midway between data points. So we need this helper function:
from sympy import *
def knots(xv, deg):
    if deg % 2 == 1:
        j = (deg+1) // 2
        interior_knots = xv[j:-j]
    else:
        j = deg // 2
        interior_knots = [Rational(a+b, 2) for a, b in zip(xv[j:-j-1], xv[j+1:-j])]
    return [xv[0]] * (deg+1) + interior_knots + [xv[-1]] * (deg+1)
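For example (my addition, as a quick sanity check of the helper), for the cubic case used below the knot vector repeats the endpoints four times and keeps only 2 and 3 as interior knots:
print(knots([0, 1, 2, 3, 4, 5], 3))   # [0, 0, 0, 0, 2, 3, 5, 5, 5, 5]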
After getting the B-splines from the bspline_basis_set method, one has to plug in the x-values and form a linear system from which to find the coefficients coeff. Finally, the spline is constructed:
xv = [0, 1, 2, 3, 4, 5]
yv = [3, 6, 5, 7, 9, 1]
deg = 3
x = Symbol("x")
basis = bspline_basis_set(deg, knots(xv, deg), x)
A = [[b.subs(x, v) for b in basis] for v in xv]
coeff = linsolve((Matrix(A), Matrix(yv)), symbols('c0:{}'.format(len(xv))))
spline = sum([c*b for c, b in zip(list(coeff)[0], basis)])
print(spline)
This spline is a SymPy object. Here it is for degree 3:
3*Piecewise((-x**3/8 + 3*x**2/4 - 3*x/2 + 1, (x >= 0) & (x <= 2)), (0, True)) + Piecewise((x**3/8 - 9*x**2/8 + 27*x/8 - 27/8, (x >= 3) & (x <= 5)), (0, True)) + 377*Piecewise((19*x**3/72 - 5*x**2/4 + 3*x/2, (x >= 0) & (x <= 2)), (-x**3/9 + x**2 - 3*x + 3, (x >= 2) & (x <= 3)), (0, True))/45 + 547*Piecewise((x**3/9 - 2*x**2/3 + 4*x/3 - 8/9, (x >= 2) & (x <= 3)), (-19*x**3/72 + 65*x**2/24 - 211*x/24 + 665/72, (x >= 3) & (x <= 5)), (0, True))/45 + 346*Piecewise((x**3/30, (x >= 0) & (x <= 2)), (-11*x**3/45 + 5*x**2/3 - 10*x/3 + 20/9, (x >= 2) & (x <= 3)), (31*x**3/180 - 25*x**2/12 + 95*x/12 - 325/36, (x >= 3) & (x <= 5)), (0, True))/45 + 146*Piecewise((-31*x**3/180 + x**2/2, (x >= 0) & (x <= 2)), (11*x**3/45 - 2*x**2 + 5*x - 10/3, (x >= 2) & (x <= 3)), (-x**3/30 + x**2/2 - 5*x/2 + 25/6, (x >= 3) & (x <= 5)), (0, True))/45
You can differentiate it, with
spline.diff(x)
You can integrate it:
integrate(spline, (x, 0, 5)) # 197/3
You can plot it and see it indeed interpolates the given values:
plot(spline, (x, 0, 5))
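And, as a quick check I added (not in the original answer), substituting the data points back in confirms the interpolation is exact:
print([spline.subs(x, v) for v in xv])   # [3, 6, 5, 7, 9, 1]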
I even plotted them for degrees 1, 2, 3 together.
Disclaimers:
The code given above works in the development version of SymPy and should work in 1.1.2+; there was a bug in the B-spline method in earlier versions.
Some of this takes a good deal of time because Piecewise objects are slow. In my experience, the basis construction takes longest.
I'm doing a very simple probability calculation of getting the subset X, Y, Z from the set A-Z (with corresponding probabilities x, y, z).
Because the formulas get very heavy, in order to handle them I'm trying to simplify (or collect, or factor - I don't know the exact term) these polynomial expressions using sympy.
So, having this (a very simple probability expression for getting the subset X, Y, Z from the set A-Z with corresponding probabilities x, y, z):
import sympy as sp
x, y, z = sp.symbols('x y z')
expression = (
    x * (1 - x) * y * (1 - x - y) * z +
    x * (1 - x) * z * (1 - x - z) * y +
    y * (1 - y) * x * (1 - y - x) * z +
    y * (1 - y) * z * (1 - y - z) * x +
    z * (1 - z) * y * (1 - z - y) * x +
    z * (1 - z) * x * (1 - z - x) * y
)
I want to get something like this
x * y * z * (6 * (1 - x - y - z) + (x + y) ** 2 + (y + z) ** 2 + (x + z) ** 2)
i.e. the polynomial rewritten in a way that uses as few operations (+, -, *, **, ...) as possible.
I tried factor(), collect(), and simplify(), but the result differs from my expectations. Mostly I get
2*x*y*z*(x**2 + x*y + x*z - 3*x + y**2 + y*z - 3*y + z**2 - 3*z + 3)
I know that sympy can combine polynomials into simple forms:
sp.factor(x**2 + 2*x*y + y**2) # gives (x + y)**2
But how do I make sympy combine the polynomials in the expression above like that?
If this is an impossible task in sympy, maybe there are other options?
Putting together some of the methods happens to give a nice answer this time. It would be interesting to see if this strategy works more often than not on the equations you generate or if, as the name implies, this is just a lucky result this time.
from sympy import Mul, cse, collect, expand, factor_terms, horner

def iflfactor(eq):
    """Return the "I'm feeling lucky" factored form of eq."""
    e = Mul(*[horner(e) if e.is_Add else e for e in
              Mul.make_args(factor_terms(expand(eq)))])
    r, e = cse(e)
    s = [ri[0] for ri in r]
    e = Mul(*[collect(ei.expand(), s) if ei.is_Add else ei for ei in
              Mul.make_args(e[0])]).subs(r)
    return e
>>> iflfactor(eq) # using your equation as eq
2*x*y*z*(x**2 + x*y + y**2 + (z - 3)*(x + y + z) + 3)
>>> _.count_ops()
15
BTW, a difference between factor_terms and gcd_terms is that factor_terms will work harder to pull out common terms while retaining the original structure of the expression, very much like you would do by hand (i.e. looking for common terms in Adds that can be pulled out).
>>> factor_terms(x/(z+z*y)+x/z)
x*(1 + 1/(y + 1))/z
>>> gcd_terms(x/(z+z*y)+x/z)
x*(y*z + 2*z)/(z*(y*z + z))
For what it's worth,
Chris
As far as I know, there is no function that does exactly that. I believe it is actually a very hard problem. See Reduce the number of operations on a simple expression for some discussion on it.
There are, however, quite a few simplification functions in SymPy that you can try. One that you haven't mentioned that gives a different result is gcd_terms, which factorizes out a symbolic gcd without doing an expansion. It gives
>>> gcd_terms(expression)
x*y*z*((-x + 1)*(-x - y + 1) + (-x + 1)*(-x - z + 1) + (-y + 1)*(-x - y + 1) + (-y + 1)*(-y - z + 1) + (-z + 1)*(-x - z + 1) + (-z + 1)*(-y - z + 1))
Another useful function is .count_ops, which counts the number of operations in an expression. For example
>>> expression.count_ops()
47
>>> factor(expression).count_ops()
22
>>> e = x * y * z * (6 * (1 - x - y - z) + (x + y) ** 2 + (y + z) ** 2 + (x + z) ** 2)
>>> e.count_ops()
18
(note that e.count_ops() is not the same as what you counted yourself, because SymPy automatically distributes 6*(1 - x - y - z) to 6 - 6*x - 6*y - 6*z).
Other useful functions:
cse: Performs a common subexpression elimination on the expression. Sometimes you can simplify the individual parts and then put it back together. This also helps in general to avoid duplicate computations.
horner: Applies the Horner scheme to a polynomial. This minimizes the number of operations if the polynomial is in one variable.
factor_terms: Similar to gcd_terms. I'm actually not entirely clear what the difference is.
Note that by default, simplify will try several simplifications, and return the one that is minimized by count_ops.
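To make the list above concrete, here is a small sketch of my own (horner and cse are documented SymPy functions; the exact printed form of the cse result may vary slightly by version):
from sympy import symbols, sqrt, horner, cse

x, y = symbols('x y')

# horner rewrites a univariate polynomial with nested multiplications:
print(horner(x**3 + 2*x**2 + 3*x + 4))   # x*(x*(x + 2) + 3) + 4

# cse pulls out the repeated subexpression x + y as a new symbol x0 and returns
# the replacement pairs together with the reduced expression:
print(cse((x + y)**2 + sqrt(x + y)))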
I have had a similar problem, and ended up implementing my own solution before I stumbled across this one. Mine seems to do a much better job of reducing the number of operations. However, mine also does a brute-force style set of collections over all combinations of variables. Thus, its runtime grows super-exponentially in the number of variables. OTOH, I've managed to run it on equations with 7 variables in a not-unreasonable (but far from real-time) amount of time.
It is possible that there are some ways to prune some of the search branches here, but I haven't bothered with it. Further optimizations are welcome.
import itertools
import math

import sympy

def collect_best(expr, measure=sympy.count_ops):
    # This method performs sympy.collect over all permutations of the free variables,
    # and returns the best collection
    best = expr
    best_score = measure(expr)
    perms = itertools.permutations(expr.free_symbols)
    permlen = math.factorial(len(expr.free_symbols))
    print(permlen)
    for i, perm in enumerate(perms):
        if (permlen > 1000) and not (i % int(permlen / 100)):
            print(i)
        collected = sympy.collect(expr, perm)
        if measure(collected) < best_score:
            best_score = measure(collected)
            best = collected
    return best

def product(args):
    arg = next(args)
    try:
        return arg * product(args)
    except StopIteration:
        return arg

def rcollect_best(expr, measure=sympy.count_ops):
    # This method performs collect_best recursively on the collected terms
    best = collect_best(expr, measure)
    best_score = measure(best)
    if expr == best:
        return best
    if isinstance(best, sympy.Mul):
        return product(map(rcollect_best, best.args))
    if isinstance(best, sympy.Add):
        return sum(map(rcollect_best, best.args))
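A usage sketch (my addition, not part of the original answer), assuming the expression from the question above is bound to the name expression; I have not reproduced the resulting collected form here:
best = rcollect_best(expression)
print(sympy.count_ops(expression), sympy.count_ops(best))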
To illustrate the performance, this paper (paywalled, sorry) has 7 formulae that are 5th-degree polynomials in 7 variables, with up to 29 terms and 158 operations in their expanded forms. After applying both rcollect_best and #smichr's iflfactor, the numbers of operations in the 7 formulae are:
[6, 15, 100, 68, 39, 13, 2]
and
[32, 37, 113, 73, 40, 15, 2]
respectively. iflfactor has 433% more operations than rcollect_best for one of the formulae. Also, the numbers of operations in the expanded formulae are:
[39, 49, 158, 136, 79, 27, 2]