Changing coefficients modulo p in a SymPy polynomial - python

I took a cryptography course this semester in graduate school, and once of the topics we covered was NTRU. I am trying to code this in pure Python, purely as a hobby. When I attempt to find a polynomial's inverse modulo p (in this example p = 3), SymPy always returns negative coefficients, when I want strictly positive coefficients. Here is the code I have. I'll explain what I mean.
import sympy as sym
from sympy import GF
def make_poly(N,coeffs):
"""Create a polynomial in x."""
x = sym.Symbol('x')
coeffs = list(reversed(coeffs))
y = 0
for i in range(N):
y += (x**i)*coeffs[i]
y = sym.poly(y)
return y
N = 7
p = 3
q = 41
f = [1,0,-1,1,1,0,-1]
f_poly = make_poly(N,f)
x = sym.Symbol('x')
Fp = sym.polys.polytools.invert(f_poly,x**N-1,domain=GF(p))
Fq = sym.polys.polytools.invert(f_poly,x**N-1,domain=GF(q))
print('\nf =',f_poly)
print('\nFp =',Fp)
print('\nFq =',Fq)
In this code, f_poly is a polynomial with degree at most 6 (its degree is at most N-1), whose coefficients come from the list f (the first entry in f is the coefficient on the highest power of x, continuing in descending order).
Now, I want to find the inverse polynomial of f_poly in the convolution polynomial ring Rp = (Z/pZ)[x]/(x^N - 1)(Z/pZ)[x] (similarly for q). The output of the print statements at the bottom are:
f = Poly(x**6 - x**4 + x**3 + x**2 - 1, x, domain='ZZ')
Fp = Poly(x**6 - x**5 + x**3 + x**2 + x + 1, x, modulus=3)
Fq = Poly(8*x**6 - 15*x**5 - 10*x**4 - 20*x**3 - x**2 + 2*x - 4, x, modulus=41)
These polynomials are correct in modulus, but I would like to have positive coefficients everywhere, as later on in the algorithm there is some centerlifting involved, so I need to have positive coefficients. The results should be
Fp = x^6 + 2x^5 + x^3 + x^2 + x + 1
Fq = 8x^6 + 26x^5 + 31x^4 + 21x^3 + 40x^2 + 2x + 37
The answers I'm getting are correct in modulus, but I think that SymPy's invert is changing some of the coefficients to negative variants, instead of staying inside the mod.
Is there any way I can update the coefficients of this polynomial to have only positive coefficients in modulus, or is this just an artifact of SymPy's function? I want to keep the SymPy Poly format so I can use some of its embedded functions later on down the line. Any insight would be much appreciated!

This seems to be down to how the finite field object implemented in GF "wraps" integers around the given modulus. The default behavior is symmetric, which means that any integer x for which x % modulo <= modulo//2 maps to x % modulo, and otherwise maps to (x % modulo) - modulo. So GF(10)(5) == 5, whereas GF(10)(6) == -4. You can make GF always map to positive numbers instead by passing the symmetric=False argument:
import sympy as sym
from sympy import GF
def make_poly(N, coeffs):
"""Create a polynomial in x."""
x = sym.Symbol('x')
coeffs = list(reversed(coeffs))
y = 0
for i in range(N):
y += (x**i)*coeffs[i]
y = sym.poly(y)
return y
N = 7
p = 3
q = 41
f = [1,0,-1,1,1,0,-1]
f_poly = make_poly(N,f)
x = sym.Symbol('x')
Fp = sym.polys.polytools.invert(f_poly,x**N-1,domain=GF(p, symmetric=False))
Fq = sym.polys.polytools.invert(f_poly,x**N-1,domain=GF(q, symmetric=False))
print('\nf =',f_poly)
print('\nFp =',Fp)
print('\nFq =',Fq)
Now you'll get the polynomials you wanted. The output from the print(...) statements at the end of the example should look like:
f = Poly(x**6 - x**4 + x**3 + x**2 - 1, x, domain='ZZ')
Fp = Poly(x**6 + 2*x**5 + x**3 + x**2 + x + 1, x, modulus=3)
Fq = Poly(8*x**6 + 26*x**5 + 31*x**4 + 21*x**3 + 40*x**2 + 2*x + 37, x, modulus=41)
Mostly as a note for my own reference, here's how you would get Fp using Mathematica:
Fp = PolynomialMod[Algebra`PolynomialPowerMod`PolynomialPowerMod[x^6 - x^4 + x^3 + x^2 - 1, -1, x, x^7 - 1], 3]
output:
1 + x + x^2 + x^3 + 2 x^5 + x^6

Related

Minimize system of nonlinear equation (integral on exponent)

General:
I am using maximum entropy to find distribution for on positive integers vectors, I can estimate the mean and variance, and have three equation I am trying to find a and b,
The equations:
integral(exp(a*x^2+bx+c) from (0 , infinity))-1
integral(xexp(ax^2+bx+c)from (0 , infinity))- mean
integral(x^2*exp(a*x^2+bx+c) from (0 , infinity))- mean^2 - var
(integrals between [0,∞))
The problem:
I am trying to use numerical solver and I used fsolve of sympy
But I guess I am missing some knowledge.
My code:
import numpy as np
import sympy as sym
from scipy.optimize import *
def myFunction(x,*data):
y = sym.symbols('y')
m,v=data
F = [0]*3
x[0] = - abs(x[0])
print(x)
F[0] = (sym.integrate(sym.exp(x[0] * y ** 2 + x[1] * y + x[2]), (y, 0,sym.oo)) -1).evalf()
F[1] = (sym.integrate(y*sym.exp(x[0] * y ** 2 + x[1] * y + x[2]), (y, 0,sym.oo))-m).evalf()
F[2] = (sym.integrate((y**2)*sym.exp(x[0] * y ** 2 + x[1] * y + x[2]), (y,0,sym.oo)) -v-m).evalf()
print(F)
return F
data = (10,3.5) # mean and var for example
xGuess = [1, 1, 1]
z = fsolve(myFunction,xGuess,args = data)
print(z)
my result are not that accurate, is there a better way to solve it?
integral(exp(a*x^2+bx+c))-1 = 5.67659292676884
integral(xexp(ax^2+bx+c))- mean = −1.32123173796713
integral(x^2*exp(a*x^2+bx+c))- mean^2 - var = −2.20825624606312
Thanks
I have rewritten the problem replacing sympy with numpy and lambdas (inline functions).
Also note that in your problem statement you subtract the third equation with $mean^2$, but in your code you only subtract $mean$.
import numpy as np
from scipy.optimize import minimize
from scipy.integrate import quad
def myFunction(x,data):
m,v=data
F = np.zeros(3) # use numpy array
# use scipy.integrade.quad for integration of lambda functions
# quad output is (result, error), so we just select the result value at the end
F[0] = quad(lambda y: np.exp(x[0] * y ** 2 + x[1] * y + x[2]), 0, np.inf)[0] -1
F[1] = quad(lambda y: y*np.exp(x[0] * y ** 2 + x[1] * y + x[2]), 0, np.inf)[0] -m
F[2] = quad(lambda y: (y**2)*np.exp(x[0] * y ** 2 + x[1] * y + x[2]), 0, np.inf)[0] -v-m**2
# minimize the squared error
return np.sum(F**2)
data = (10,3.5) # mean and var for example
xGuess = [-1, 1, 1]
z = minimize(lambda x: myFunction(x, data), x0=xGuess,
bounds=((None, 0), (None, None), (None, None))) # use bounds for negative first coefficient
print(z)
# x: array([-0.99899311, 2.18819689, 1.85313181])
Does this seem more reasonable?

Remove the coefficients from a SymPy expression

I have a symbolic expression in SymPy that I need to remove all the coefficients from after I take the 2nd derivative of the expression. Here is the code
import sympy as sym
x, y = sym.symbols('x y')
# define the shape function
S_function = 0 +\
x + y +\
x**2 + x*y + y**2 +\
x**3 + x**2*y + x*y**2 + y**3
# take the 2nd derivative of the shape function for the x components
S_function_xx = sym.expand(sym.diff(S_function,x,2))
print(S_function_xx)
And I get an output of 6*x + 2*y + 2, I need to convert this to x + y. Any suggestions are welcome.
If the expression is in fact a polynomial (as here), the following can be used: convert it to a Poly object, get its list of monomials with monoms, then recombine, deleting the constant term.
monom_list = sym.Poly(S_function_xx, x, y).monoms()
new_function = sum([x**a*y**b for a, b in monom_list if a + b > 0])
If S_function_xx is not a polynomial, it's not so clear what is a "coefficient".

Implementing a custom datatype in Sympy

I want to perform calculations with a binary operation (Tensor) that takes two non-commutative arguments, converts them into something like a pair, and then does funny things when I multiply these pairs.
# a, b, c, d are non-commutative
Tensor(a, b) * Tensor(c, d) == Tensor(a*c, d*b) # yes, in this order
Furthermore, I want all integer constants to be taken modulo 2.
-Tensor(a, b) == Tensor(a, b)
2*Tensor(a, b) == 0
Tensor(2*a, b) == 0
My shot at doing this:
import sympy as sp
from sympy.core.expr import Expr
class Tensor(Expr):
__slots__ = ['is_commutative']
def __new__(cls, l, r):
l = sp.sympify(l)
r = sp.sympify(r)
obj = Expr.__new__(cls, l, r)
obj.is_commutative = False
return obj
def __neg__(self):
return self
def __mul__(self, other):
if isinstance(other, Tensor):
return Tensor(self.args[0] * other.args[0], other.args[1] * self.args[1])
elif other.is_number:
if other % 2 == 0:
return 0
else:
return self
else:
return sp.Mul(self, other)
x, y = sp.symbols('x, y', commutative=False)
Ym = Tensor(y, 1) - Tensor(1, y)
Yp = Tensor(y, 1) + Tensor(1, y)
Xm = Tensor(x, 1) - Tensor(1, x)
d1 = Ym * Yp + Xm * 0
print(d1)
print(sp.expand(d1))
d2 = Xm * Ym
print(d2)
print(sp.expand(d2))
Output:
(Tensor(1, y) + Tensor(y, 1))**2
Tensor(1, y**2) + 2*Tensor(y, y) + Tensor(y**2, 1)
(Tensor(1, x) + Tensor(x, 1))*(Tensor(1, y) + Tensor(y, 1))
Tensor(1, x)*Tensor(1, y) + Tensor(1, x)*Tensor(y, 1) + Tensor(x, 1)*Tensor(1, y) + Tensor(x, 1)*Tensor(y, 1)
Test #1 is correct.
Test #2 has a term 2*Tensor(y, y) which should be zero (since I'm doing all calculations modulo 2, and, 2 % 2 == 0). How do I enforce that?
Test #3 is correct.
Test #4 does not multiply different Tensors at all. Tensor(1, x)*Tensor(1, y) should be Tensor(1, y*x), for example. How do I enforce that?
Context (if you're interested in why I'm doing this):
This is for calculating a bimodule resolution of an algebra over a char=2 field. A bimodule resolution of an algebra R over a field K is a minimal projective resolution of the same algebra R over its enveloping algebra R⊗R^op. Here op means "multiplication works in the opposite direction". The enveloping algebra has both left and right action on the algebra:
(a⊗b)*r == a*r*b
r*(a⊗b) == b*r*a
There is a theorem that simplifies calculations in the case where a minimal projective resolution of the algebra over the field is known. Still, they are quite tedious and I want to stop doing them manually.

Finding the minimum of a function on a closed interval with Python

Updated: How do I find the minimum of a function on a closed interval [0,3.5] in Python? So far I found the max and min but am unsure how to filter out the minimum from here.
import sympy as sp
x = sp.symbols('x')
f = (x**3 / 3) - (2 * x**2) + (3 * x) + 1
fprime = f.diff(x)
all_solutions = [(xx, f.subs(x, xx)) for xx in sp.solve(fprime, x)]
print (all_solutions)
Since this PR you should be able to do the following:
from sympy.calculus.util import *
f = (x**3 / 3) - (2 * x**2) - 3 * x + 1
ivl = Interval(0,3)
print(minimum(f, x, ivl))
print(maximum(f, x, ivl))
print(stationary_points(f, x, ivl))
Perhaps something like this
from sympy import solveset, symbols, Interval, Min
x = symbols('x')
lower_bound = 0
upper_bound = 3.5
function = (x**3/3) - (2*x**2) - 3*x + 1
zeros = solveset(function, x, domain=Interval(lower_bound, upper_bound))
assert zeros.is_FiniteSet # If there are infinite solutions the next line will hang.
ans = Min(function.subs(x, lower_bound), function.subs(x, upper_bound), *[function.subs(x, i) for i in zeros])
Here's a possible solution using sympy:
import sympy as sp
x = sp.Symbol('x', real=True)
f = (x**3 / 3) - (2 * x**2) - 3 * x + 1
#f = 3 * x**4 - 4 * x**3 - 12 * x**2 + 3
fprime = f.diff(x)
all_solutions = [(xx, f.subs(x, xx)) for xx in sp.solve(fprime, x)]
interval = [0, 3.5]
interval_solutions = filter(
lambda x: x[0] >= interval[0] and x[0] <= interval[1], all_solutions)
print(all_solutions)
print(interval_solutions)
all_solutions is giving you all points where the first derivative is zero, interval_solutions is constraining those solutions to a closed interval. This should give you some good clues to find minimums and maximums :-)
The f.subs commands show two ways of displaying the value of the given function at x=3.5, the first as a rational approximation, the second as the exact fraction.

Factoring polys in sympy

I'm doing a very simple probability calculations of getting subset of X, Y, Z from set of A-Z (with corresponding probabilities x, y, z).
And because of very heavy formulas, in order to handle them, I'm trying to simplify (or collect or factor - I dont know the exact definition) these polynomial expressions using sympy.
So.. having this (a very simple probability calculation expression of getting subset of X,Y,Z from set of A-Z with corresponding probabilities x, y, z)
import sympy as sp
x, y, z = sp.symbols('x y z')
expression = (
x * (1 - x) * y * (1 - x - y) * z +
x * (1 - x) * z * (1 - x - z) * y +
y * (1 - y) * x * (1 - y - x) * z +
y * (1 - y) * z * (1 - y - z) * x +
z * (1 - z) * y * (1 - z - y) * x +
z * (1 - z) * x * (1 - z - x) * y
)
I want to get something like this
x * y * z * (6 * (1 - x - y - z) + (x + y) ** 2 + (y + z) ** 2 + (x + z) ** 2)
a poly, rewritten in way to have as few operations (+, -, *, **, ...) as possible
I tried factor(), collect(), simplify(). But result differs from my expectations. Mostly I get
2*x*y*z*(x**2 + x*y + x*z - 3*x + y**2 + y*z - 3*y + z**2 - 3*z + 3)
I know that sympy can combine polynomials into simple forms:
sp.factor(x**2 + 2*x*y + y**2) # gives (x + y)**2
But how to make sympy to combine polynomials from expressions above?
If this is impossible task in sympy, may be there are any other options?
Putting together some of the methods happens to give a nice answer this time. It would be interesting to see if this strategy works more often than not on the equations you generate or if, as the name implies, this is just a lucky result this time.
def iflfactor(eq):
"""Return the "I'm feeling lucky" factored form of eq."""
e = Mul(*[horner(e) if e.is_Add else e for e in
Mul.make_args(factor_terms(expand(eq)))])
r, e = cse(e)
s = [ri[0] for ri in r]
e = Mul(*[collect(ei.expand(), s) if ei.is_Add else ei for ei in
Mul.make_args(e[0])]).subs(r)
return e
>>> iflfactor(eq) # using your equation as eq
2*x*y*z*(x**2 + x*y + y**2 + (z - 3)*(x + y + z) + 3)
>>> _.count_ops()
15
BTW, a difference between factor_terms and gcd_terms is that factor_terms will work harder to pull out common terms while retaining the original structure of the expression, very much like you would do by hand (i.e. looking for common terms in Adds that can be pulled out).
>>> factor_terms(x/(z+z*y)+x/z)
x*(1 + 1/(y + 1))/z
>>> gcd_terms(x/(z+z*y)+x/z)
x*(y*z + 2*z)/(z*(y*z + z))
For what it's worth,
Chris
As far as I know, there is no function that does exactly that. I believe it is actually a very hard problem. See Reduce the number of operations on a simple expression for some discussion on it.
There are however, quite a few simplification functions in SymPy that you can try. One that you haven't mentioned that gives a different result is gcd_terms, which factorizes out a symbolic gcd without doing an expansions. It gives
>>> gcd_terms(expression)
x*y*z*((-x + 1)*(-x - y + 1) + (-x + 1)*(-x - z + 1) + (-y + 1)*(-x - y + 1) + (-y + 1)*(-y - z + 1) + (-z + 1)*(-x - z + 1) + (-z + 1)*(-y - z + 1))
Another useful function is .count_ops, which counts the number of operations in an expression. For example
>>> expression.count_ops()
47
>>> factor(expression).count_ops()
22
>>> e = x * y * z * (6 * (1 - x - y - z) + (x + y) ** 2 + (y + z) ** 2 + (x + z) ** 2)
>>> e.count_ops()
18
(note that e.count_ops() is not the same as you counted yourself, because SymPy automatically distributes the 6*(1 - x - y - z) to 6 - 6*x - 6*y - 6*z).
Other useful functions:
cse: Performs a common subexpression elimination on the expression. Sometimes you can simplify the individual parts and then put it back together. This also helps in general to avoid duplicate computations.
horner: Applies the Horner scheme to a polynomial. This minimizes the number of operations if the polynomial is in one variable.
factor_terms: Similar to gcd_terms. I'm actually not entirely clear what the difference is.
Note that by default, simplify will try several simplifications, and return the one that is minimized by count_ops.
I have had a similar problem, and ended up implementing my own solution before I stumbled across this one. Mine seems to do a much better job reducing the number of operations. However, mine also does a brute-force style set of collections over all combinations of variables. Thus, it's runtime grows super-exponentially in the number of variables. OTOH, I've managed to run it on equations with 7 variables in a not-unreasonable (but far from real-time) amount of time.
It is possible that there are some ways to prune some of the search branches here, but I haven't bothered with it. Further optimizations are welcome.
def collect_best(expr, measure=sympy.count_ops):
# This method performs sympy.collect over all permutations of the free variables, and returns the best collection
best = expr
best_score = measure(expr)
perms = itertools.permutations(expr.free_symbols)
permlen = np.math.factorial(len(expr.free_symbols))
print(permlen)
for i, perm in enumerate(perms):
if (permlen > 1000) and not (i%int(permlen/100)):
print(i)
collected = sympy.collect(expr, perm)
if measure(collected) < best_score:
best_score = measure(collected)
best = collected
return best
def product(args):
arg = next(args)
try:
return arg*product(args)
except:
return arg
def rcollect_best(expr, measure=sympy.count_ops):
# This method performs collect_best recursively on the collected terms
best = collect_best(expr, measure)
best_score = measure(best)
if expr == best:
return best
if isinstance(best, sympy.Mul):
return product(map(rcollect_best, best.args))
if isinstance(best, sympy.Add):
return sum(map(rcollect_best, best.args))
To illustrate the performance, this paper(paywalled, sorry) has 7 formulae that are 5th degree polynomials in 7 variables with up to 29 terms and 158 operations in the expanded forms. After applying both rcollect_best and #smichr's iflfactor, the number of operations in the 7 formulae are:
[6, 15, 100, 68, 39, 13, 2]
and
[32, 37, 113, 73, 40, 15, 2]
respectively. iflfactor has 433% more operations than rcollect_best for one of the formulae. Also, the number of operations in the expanded formulae are:
[39, 49, 158, 136, 79, 27, 2]

Categories

Resources