Factoring polys in sympy - python

I'm doing a very simple probability calculations of getting subset of X, Y, Z from set of A-Z (with corresponding probabilities x, y, z).
And because of very heavy formulas, in order to handle them, I'm trying to simplify (or collect or factor - I dont know the exact definition) these polynomial expressions using sympy.
So.. having this (a very simple probability calculation expression of getting subset of X,Y,Z from set of A-Z with corresponding probabilities x, y, z)
import sympy as sp
x, y, z = sp.symbols('x y z')
expression = (
x * (1 - x) * y * (1 - x - y) * z +
x * (1 - x) * z * (1 - x - z) * y +
y * (1 - y) * x * (1 - y - x) * z +
y * (1 - y) * z * (1 - y - z) * x +
z * (1 - z) * y * (1 - z - y) * x +
z * (1 - z) * x * (1 - z - x) * y
)
I want to get something like this
x * y * z * (6 * (1 - x - y - z) + (x + y) ** 2 + (y + z) ** 2 + (x + z) ** 2)
a poly, rewritten in way to have as few operations (+, -, *, **, ...) as possible
I tried factor(), collect(), simplify(). But result differs from my expectations. Mostly I get
2*x*y*z*(x**2 + x*y + x*z - 3*x + y**2 + y*z - 3*y + z**2 - 3*z + 3)
I know that sympy can combine polynomials into simple forms:
sp.factor(x**2 + 2*x*y + y**2) # gives (x + y)**2
But how to make sympy to combine polynomials from expressions above?
If this is impossible task in sympy, may be there are any other options?

Putting together some of the methods happens to give a nice answer this time. It would be interesting to see if this strategy works more often than not on the equations you generate or if, as the name implies, this is just a lucky result this time.
def iflfactor(eq):
"""Return the "I'm feeling lucky" factored form of eq."""
e = Mul(*[horner(e) if e.is_Add else e for e in
Mul.make_args(factor_terms(expand(eq)))])
r, e = cse(e)
s = [ri[0] for ri in r]
e = Mul(*[collect(ei.expand(), s) if ei.is_Add else ei for ei in
Mul.make_args(e[0])]).subs(r)
return e
>>> iflfactor(eq) # using your equation as eq
2*x*y*z*(x**2 + x*y + y**2 + (z - 3)*(x + y + z) + 3)
>>> _.count_ops()
15
BTW, a difference between factor_terms and gcd_terms is that factor_terms will work harder to pull out common terms while retaining the original structure of the expression, very much like you would do by hand (i.e. looking for common terms in Adds that can be pulled out).
>>> factor_terms(x/(z+z*y)+x/z)
x*(1 + 1/(y + 1))/z
>>> gcd_terms(x/(z+z*y)+x/z)
x*(y*z + 2*z)/(z*(y*z + z))
For what it's worth,
Chris

As far as I know, there is no function that does exactly that. I believe it is actually a very hard problem. See Reduce the number of operations on a simple expression for some discussion on it.
There are however, quite a few simplification functions in SymPy that you can try. One that you haven't mentioned that gives a different result is gcd_terms, which factorizes out a symbolic gcd without doing an expansions. It gives
>>> gcd_terms(expression)
x*y*z*((-x + 1)*(-x - y + 1) + (-x + 1)*(-x - z + 1) + (-y + 1)*(-x - y + 1) + (-y + 1)*(-y - z + 1) + (-z + 1)*(-x - z + 1) + (-z + 1)*(-y - z + 1))
Another useful function is .count_ops, which counts the number of operations in an expression. For example
>>> expression.count_ops()
47
>>> factor(expression).count_ops()
22
>>> e = x * y * z * (6 * (1 - x - y - z) + (x + y) ** 2 + (y + z) ** 2 + (x + z) ** 2)
>>> e.count_ops()
18
(note that e.count_ops() is not the same as you counted yourself, because SymPy automatically distributes the 6*(1 - x - y - z) to 6 - 6*x - 6*y - 6*z).
Other useful functions:
cse: Performs a common subexpression elimination on the expression. Sometimes you can simplify the individual parts and then put it back together. This also helps in general to avoid duplicate computations.
horner: Applies the Horner scheme to a polynomial. This minimizes the number of operations if the polynomial is in one variable.
factor_terms: Similar to gcd_terms. I'm actually not entirely clear what the difference is.
Note that by default, simplify will try several simplifications, and return the one that is minimized by count_ops.

I have had a similar problem, and ended up implementing my own solution before I stumbled across this one. Mine seems to do a much better job reducing the number of operations. However, mine also does a brute-force style set of collections over all combinations of variables. Thus, it's runtime grows super-exponentially in the number of variables. OTOH, I've managed to run it on equations with 7 variables in a not-unreasonable (but far from real-time) amount of time.
It is possible that there are some ways to prune some of the search branches here, but I haven't bothered with it. Further optimizations are welcome.
def collect_best(expr, measure=sympy.count_ops):
# This method performs sympy.collect over all permutations of the free variables, and returns the best collection
best = expr
best_score = measure(expr)
perms = itertools.permutations(expr.free_symbols)
permlen = np.math.factorial(len(expr.free_symbols))
print(permlen)
for i, perm in enumerate(perms):
if (permlen > 1000) and not (i%int(permlen/100)):
print(i)
collected = sympy.collect(expr, perm)
if measure(collected) < best_score:
best_score = measure(collected)
best = collected
return best
def product(args):
arg = next(args)
try:
return arg*product(args)
except:
return arg
def rcollect_best(expr, measure=sympy.count_ops):
# This method performs collect_best recursively on the collected terms
best = collect_best(expr, measure)
best_score = measure(best)
if expr == best:
return best
if isinstance(best, sympy.Mul):
return product(map(rcollect_best, best.args))
if isinstance(best, sympy.Add):
return sum(map(rcollect_best, best.args))
To illustrate the performance, this paper(paywalled, sorry) has 7 formulae that are 5th degree polynomials in 7 variables with up to 29 terms and 158 operations in the expanded forms. After applying both rcollect_best and #smichr's iflfactor, the number of operations in the 7 formulae are:
[6, 15, 100, 68, 39, 13, 2]
and
[32, 37, 113, 73, 40, 15, 2]
respectively. iflfactor has 433% more operations than rcollect_best for one of the formulae. Also, the number of operations in the expanded formulae are:
[39, 49, 158, 136, 79, 27, 2]

Related

Is there a way to solve the constrained optimisation function (Lagrange) with n number of constraints?

I am trying to solve the constrained optimisation function using Lagrange multipliers.
The function contains two unknown variables: x,y whose values are to be found.
Here is my working code:
x, y, lamb = sp.symbols('x, y, lamb', real=True)
func = sp.sqrt((x - tax_amount20) ** 2 + (y - tax_amount10) ** 2) # Target function
const = (tax20 + 1)* x / tax20 + (tax10 +1) * y / tax10 - total # Constraint function
# define lagrangian
lagrang = func - lamb * const
# gradient of Lagrangian
grad_lagrang = [sp.diff(lagrang, var) for var in [x, y, lamb]]
# solving
spoints = sp.solve(grad_lagrang, [x, y, lamb], dict=True)
Can anyone advise how to dynamically build the function with n - number of targets and constraints?
For example
x,y,z would have:
1) target function
sp.sqrt((x - tax_amount20) ** 2 + (y - tax_amount10) ** 2 + (z - tax_amount15) ** 2)
2) constraint function
(tax20 + 1)* x / tax20 + (tax10 +1) * y / tax10 + (tax15 +1) * z / tax15 - total
Target and constraint functions should therefore be built dynamically depending on how many variables are available.
BTW. we always know the other variables of the function (i.e. tax20 and tax_amount20 for x ; tax10 and tax_amount10 for y and so on...)

Full factorization of polynomials over complexes with SymPy

I want to fully factorize a polynom, thus factorize it over complexes.
SymPy provide factor to do it, but I’m very surprised that factorization is done only over integer roots, e.g. :
>>> from sympy import *
>>> z = symbols('z')
>>> factor(z**2 - 1, z)
(z - 1)*(z + 1)
>>> factor(z**2 + 1, z)
z**2 + 1
or
>>> factor(2*z - 1, z)
2*z - 1
>>> factor(2*z - 1, z, extension=[Integer(1)/2])
2*(z - 1/2)
An answered question already exists : Factor to complex roots using sympy, and the solution given by asmeurer works :
>>> factor(z**2 + 1, z, extension=[I])
(z - I)*(z + I)
but you need to specify every divisor of non-integer roots, e.g. :
>>> factor(z**2 + 2, z, extension=[I])
z**2 + 2
>>> factor(z**2 + 2, z, extension=[I, sqrt(2)])
(z - sqrt(2)*I)*(z + sqrt(2)*I)
My question is : how to fully factorize a polynom (thus over complexes), without needing to give every divisor to extension ?
asmeurer gives a solution to do this :
>>> poly = z**2 + 2
>>> r = roots(poly, z)
>>> LC(poly, z)*Mul(*[(z - a)**r[a] for a in r])
/ ___ \ / ___ \
\z - \/ 2 *I/*\z + \/ 2 *I/
But it should exists a native way to do it, no ?
Someting like factor(poly, z, complex=True).
I looked for in the documentation of factor, but I did not find anything.
Futhermore, factor can take domain as optional argument that I believed allows to specified the set on which the factorization is made, but not
>>> factor(z**2 + 2, z, domain='Z')
2
z + 2
>>> factor(z**2 + 2, z, domain='R')
/ 2 \
2.0*\0.5*z + 1.0/
>>> factor(z**2 + 2, z, domain='C')
2
1.0*z + 2.0
The domain argument should work and in the case of Gaussian rationals you can also use gaussian=True which is equivalent to extension=I:
In [24]: factor(z**2 + 1, gaussian=True)
Out[24]: (z - ⅈ)⋅(z + ⅈ)
That doesn't work in your case though because the factorisation needs to be over QQ(I, sqrt(2)) rather than QQ(I). The reason that domains 'R' and 'C' don't work as expected is because they are inexact floating point domains rather than domains representing the real or complex numbers in the pure mathematical sense and factorisation is
The approaches above can be combined though with
In [28]: e = z**2 + 2
In [29]: factor(e, extension=roots(e))
Out[29]: (z - √2⋅ⅈ)⋅(z + √2⋅ⅈ)

Changing coefficients modulo p in a SymPy polynomial

I took a cryptography course this semester in graduate school, and once of the topics we covered was NTRU. I am trying to code this in pure Python, purely as a hobby. When I attempt to find a polynomial's inverse modulo p (in this example p = 3), SymPy always returns negative coefficients, when I want strictly positive coefficients. Here is the code I have. I'll explain what I mean.
import sympy as sym
from sympy import GF
def make_poly(N,coeffs):
"""Create a polynomial in x."""
x = sym.Symbol('x')
coeffs = list(reversed(coeffs))
y = 0
for i in range(N):
y += (x**i)*coeffs[i]
y = sym.poly(y)
return y
N = 7
p = 3
q = 41
f = [1,0,-1,1,1,0,-1]
f_poly = make_poly(N,f)
x = sym.Symbol('x')
Fp = sym.polys.polytools.invert(f_poly,x**N-1,domain=GF(p))
Fq = sym.polys.polytools.invert(f_poly,x**N-1,domain=GF(q))
print('\nf =',f_poly)
print('\nFp =',Fp)
print('\nFq =',Fq)
In this code, f_poly is a polynomial with degree at most 6 (its degree is at most N-1), whose coefficients come from the list f (the first entry in f is the coefficient on the highest power of x, continuing in descending order).
Now, I want to find the inverse polynomial of f_poly in the convolution polynomial ring Rp = (Z/pZ)[x]/(x^N - 1)(Z/pZ)[x] (similarly for q). The output of the print statements at the bottom are:
f = Poly(x**6 - x**4 + x**3 + x**2 - 1, x, domain='ZZ')
Fp = Poly(x**6 - x**5 + x**3 + x**2 + x + 1, x, modulus=3)
Fq = Poly(8*x**6 - 15*x**5 - 10*x**4 - 20*x**3 - x**2 + 2*x - 4, x, modulus=41)
These polynomials are correct in modulus, but I would like to have positive coefficients everywhere, as later on in the algorithm there is some centerlifting involved, so I need to have positive coefficients. The results should be
Fp = x^6 + 2x^5 + x^3 + x^2 + x + 1
Fq = 8x^6 + 26x^5 + 31x^4 + 21x^3 + 40x^2 + 2x + 37
The answers I'm getting are correct in modulus, but I think that SymPy's invert is changing some of the coefficients to negative variants, instead of staying inside the mod.
Is there any way I can update the coefficients of this polynomial to have only positive coefficients in modulus, or is this just an artifact of SymPy's function? I want to keep the SymPy Poly format so I can use some of its embedded functions later on down the line. Any insight would be much appreciated!
This seems to be down to how the finite field object implemented in GF "wraps" integers around the given modulus. The default behavior is symmetric, which means that any integer x for which x % modulo <= modulo//2 maps to x % modulo, and otherwise maps to (x % modulo) - modulo. So GF(10)(5) == 5, whereas GF(10)(6) == -4. You can make GF always map to positive numbers instead by passing the symmetric=False argument:
import sympy as sym
from sympy import GF
def make_poly(N, coeffs):
"""Create a polynomial in x."""
x = sym.Symbol('x')
coeffs = list(reversed(coeffs))
y = 0
for i in range(N):
y += (x**i)*coeffs[i]
y = sym.poly(y)
return y
N = 7
p = 3
q = 41
f = [1,0,-1,1,1,0,-1]
f_poly = make_poly(N,f)
x = sym.Symbol('x')
Fp = sym.polys.polytools.invert(f_poly,x**N-1,domain=GF(p, symmetric=False))
Fq = sym.polys.polytools.invert(f_poly,x**N-1,domain=GF(q, symmetric=False))
print('\nf =',f_poly)
print('\nFp =',Fp)
print('\nFq =',Fq)
Now you'll get the polynomials you wanted. The output from the print(...) statements at the end of the example should look like:
f = Poly(x**6 - x**4 + x**3 + x**2 - 1, x, domain='ZZ')
Fp = Poly(x**6 + 2*x**5 + x**3 + x**2 + x + 1, x, modulus=3)
Fq = Poly(8*x**6 + 26*x**5 + 31*x**4 + 21*x**3 + 40*x**2 + 2*x + 37, x, modulus=41)
Mostly as a note for my own reference, here's how you would get Fp using Mathematica:
Fp = PolynomialMod[Algebra`PolynomialPowerMod`PolynomialPowerMod[x^6 - x^4 + x^3 + x^2 - 1, -1, x, x^7 - 1], 3]
output:
1 + x + x^2 + x^3 + 2 x^5 + x^6

Equilibria of complex non linear system

When i run the following code i get
TypeError: can't multiply sequence by non-int of type "Add'
Can anyone explain why I get this error?
from sympy.core.symbol import symbols
from sympy.solvers.solveset import nonlinsolve
x, y, z, r, R, a, m, n, b, k1, k2 = symbols('x,y,z,r,R,a,m,n,b,k1,k2', positive=True)
f1 = r * x * (1 - x / k1) - (a * z * x ** (n + 1)) / (x ** n + y ** n)
f2 = R * y * (1 - y / k2) - (b * z * y ** (n + 1)) / (x ** n + y ** n)
f3 = z * (a * x ** (n + 1) + b * y ** (n + 1)) / (x ** n + y ** n) - m * z
f = [f1, f2, f3]
nonlinsolve(f, [x, y, z])
The error message is not really descriptive but the full stack trace indicates where the problem was: SymPy tries to work with the expression as if it was a polynomial, and finds that impossible because the exponent n is a symbol rather than a concrete integer.
Simply put, SymPy does not have an algorithm for solving systems like that one (and I'm not sure if any CAS has).
When written in polynomial form, the system has monomials of total degree n+2. So, already for n = 1 this is utterly hopeless: a system of three cubic equations with three unknowns. SymPy can solve the case n = 0, and I wouldn't expect anything more than that.

Finding the minimum of a function on a closed interval with Python

Updated: How do I find the minimum of a function on a closed interval [0,3.5] in Python? So far I found the max and min but am unsure how to filter out the minimum from here.
import sympy as sp
x = sp.symbols('x')
f = (x**3 / 3) - (2 * x**2) + (3 * x) + 1
fprime = f.diff(x)
all_solutions = [(xx, f.subs(x, xx)) for xx in sp.solve(fprime, x)]
print (all_solutions)
Since this PR you should be able to do the following:
from sympy.calculus.util import *
f = (x**3 / 3) - (2 * x**2) - 3 * x + 1
ivl = Interval(0,3)
print(minimum(f, x, ivl))
print(maximum(f, x, ivl))
print(stationary_points(f, x, ivl))
Perhaps something like this
from sympy import solveset, symbols, Interval, Min
x = symbols('x')
lower_bound = 0
upper_bound = 3.5
function = (x**3/3) - (2*x**2) - 3*x + 1
zeros = solveset(function, x, domain=Interval(lower_bound, upper_bound))
assert zeros.is_FiniteSet # If there are infinite solutions the next line will hang.
ans = Min(function.subs(x, lower_bound), function.subs(x, upper_bound), *[function.subs(x, i) for i in zeros])
Here's a possible solution using sympy:
import sympy as sp
x = sp.Symbol('x', real=True)
f = (x**3 / 3) - (2 * x**2) - 3 * x + 1
#f = 3 * x**4 - 4 * x**3 - 12 * x**2 + 3
fprime = f.diff(x)
all_solutions = [(xx, f.subs(x, xx)) for xx in sp.solve(fprime, x)]
interval = [0, 3.5]
interval_solutions = filter(
lambda x: x[0] >= interval[0] and x[0] <= interval[1], all_solutions)
print(all_solutions)
print(interval_solutions)
all_solutions is giving you all points where the first derivative is zero, interval_solutions is constraining those solutions to a closed interval. This should give you some good clues to find minimums and maximums :-)
The f.subs commands show two ways of displaying the value of the given function at x=3.5, the first as a rational approximation, the second as the exact fraction.

Categories

Resources