Z3 cannot check equivalence of two formulae - python

(Why are not the math formulae showing correctly?)
I am testing the Z3 library in Python (Colab) to see whether it can tell when two formulae are equivalent.
The test is the following: (1) I perform quantifier elimination on a formula $\phi_1$, (2) I rewrite the formula so that it remains semantically equivalent, for instance from $\phi_1 \equiv (a<b+1)$ to $\phi_2 \equiv (a<1+b)$, (3) I test whether $\phi_1=\phi_2$.
To check whether $\phi_1=\phi_2$, I ask whether the two formulae imply each other for all values of the variables, i.e. $\forall * . (\phi_1 \leftrightarrow \phi_2)$. Is this correct?
So, imagine I apply this on my machine:
from z3 import *
x, t1, t2 = Reals('x t1 t2')
g = Goal()
g.add(Exists(x, And(t1 < x, x < t2)))
t = Tactic('qe')
res = t(g)
The result res is [[Not(0 <= t1 + -1*t2)]], so a semantically equivalent formula is: [[Not(0 <= -1*t2 + t1)]] Am I right?
Let us check whether [[Not(0 <= t1 + -1*t2)]] = [[Not(0 <= -1*t2 + t1)]]. So I apply the universal double-implication formula above:
w = Goal()
w.add(ForAll(t1, ForAll(t2, And(
    Implies(Not(0 <= -1*t2 + t1), Not(0 <= t1 + -1*t2)),
    Implies(Not(0 <= t1 + -1*t2), Not(0 <= -1*t2 + t1)),
))))
tt = Tactic('qe')
areThey = tt(w)
print(areThey)
And the result is [[]]. I do not know how to interpret this. An optimistic reading is that it returns an empty goal because quantifier elimination managed to eliminate both quantifiers (i.e. the formula reduced to true).
It could also be that I am using the wrong tactic, or that Z3 does not handle universal quantifiers well.
Most likely, though, I am missing something basic and Z3 is perfectly able to tell that the two formulae are equivalent.
Any help?

This just means that the quantifier-elimination tactic reduced the goal to an empty goal, one with no remaining constraints; that is, it discharged the formula completely. You've nothing left to do.
In general, to check if two formulas are equivalent in z3, you assert the negation of their equivalence; and see if z3 can come up with a model: If the negation is satisfiable, then that is a counter-example for the original equivalence. If you get unsat, then you conclude that the original equivalence holds for all inputs. This is how you code that in z3:
from z3 import *
t1, t2 = Reals('t1 t2')
s = Solver()
fml1 = Not(0 <= -1*t2 + t1)
fml2 = Not(0 <= t1 + -1*t2)
s.add(Not(fml1 == fml2))
print(s.check())
If you run this, you'll see:
unsat
meaning the equivalence holds.


Create an unknown number of programmatically defined variables

I have a recursive function that can produce a hard-to-predict number of expressions, each of which needs to be multiplied by a new variable. These variables will later be eliminated by calculations involving integration or residues.
How can I create this unknown number of variables? Maybe indexed? All examples I've seen on the internet work with an object whose size is known a priori, e.g. "item" in How can you dynamically create variables via a while loop? or Accessing the index in Python 'for' loops
I think I can boil it down to this simple example to use in my real script:
import sympy as s
p0,p1,p2,p3,p4=s.symbols('p0 p1 p2 p3 p4')
l = [p0, p1, p2, p3, p4]
def f(n):
    if n == 0:
        return l[n]
    elif n == 1:
        return l[n]
    else:
        return f(n-1)*l[n]+f(n-2)
f(3) # works
f(6) # doesn't work - need to define ahead of time the
     # dummy variables l[6], l[5], ....
     # even if they are just symbols for (much) later numerical evaluation.
I need this above snippet to actually generate the needed unknowns ahead of time.
I saw some mentions of pandas, but couldn't find a good example for my need, nor even sure if that was the best route. Also saw things like, "...an unknown number of lines [file]...", or "...unknown number of arguments...", but those are, seemingly, not applicable.
Indexed objects represent an abstract thing with an index taking any values, with no restriction on how large the index can be.
import sympy as s
p = s.IndexedBase("p")
def f(n):
    if n == 0 or n == 1:
        return p[n]
    else:
        return f(n-1)*p[n] + f(n-2)
print(f(7))
Output
(p[0] + p[1]*p[2])*p[3] + (((p[0] + p[1]*p[2])*p[3] + p[1])*p[4] + p[0] + p[1]*p[2])*p[5] + (((p[0] + p[1]*p[2])*p[3] + p[1])*p[4] + ((p[0] + p[1]*p[2])*p[3] + (((p[0] + p[1]*p[2])*p[3] + p[1])*p[4] + p[0] + p[1]*p[2])*p[5] + p[1])*p[6] + p[0] + p[1]*p[2])*p[7] + p[1]
As an aside, things like p0,p1,p2,p3,p4=s.symbols('p0 p1 p2 p3 p4') can be done more easily with syms = s.symbols('p0:5') or even
n = ...
syms = s.symbols('p0:{}'.format(n))
This creates individual symbols, not an indexed object, so the number n has to be known at the time of creation. But still easier than listing p0 p1 and so on.
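When an upper bound on the index is known before the recursion starts, the range syntax above is enough. The following is a minimal sketch under that assumption; passing the symbol tuple `p` into `f` as a second argument is a choice made for this illustration and is not part of the original snippet.

```python
import sympy as s

def f(n, p):
    # same recursion as in the question, reading symbols from the tuple p
    if n < 2:
        return p[n]
    return f(n - 1, p)*p[n] + f(n - 2, p)

n = 6
# 'p0:7' expands to the seven symbols p0, p1, ..., p6
p = s.symbols("p0:{}".format(n + 1))
expr = f(n, p)
print(expr)
```

If the bound is not known in advance, the IndexedBase version is the simpler choice because no symbols have to be created up front.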

Optimize Python math code for fixed values of variables in function

I have a very long math formula (just to put you in context: it has 293095 characters) which in practice will be the body of a python function. This function has 15 input parameters as in:
def math_func(t,X,P,n1,n2,R,r):
    x,y,z = X
    a,b,c = P
    u1,v1,w1 = n1
    u2,v2,w2 = n2
    return <long math formula>
The formula uses simple math operations + - * ** / and one function call to arctan. Here an extract of it:
r*((-16*(r**6*t*u1**6 - 6*r**6*u1**5*u2 - 15*r**6*t*u1**4*u2**2 +
20*r**6*u1**3*u2**3 + 15*r**6*t*u1**2*u2**4 - 6*r**6*u1*u2**5 -
r**6*t*u2**6 + 3*r**6*t*u1**4*v1**2 - 12*r**6*u1**3*u2*v1**2 -
18*r**6*t*u1**2*u2**2*v1**2 + 12*r**6*u1*u2**3*v1**2 +
3*r**6*t*u2**4*v1**2 + 3*r**6*t*u1**2*v1**4 - 6*r**6*u1*u2*v1**4 -
3*r**6*t*u2**2*v1**4 + r**6*t*v1**6 - 6*r**6*u1**4*v1*v2 -
24*r**6*t*u1**3*u2*v1*v2 + 36*r**6*u1**2*u2**2*v1*v2 +
24*r**6*t*u1*u2**3*v1*v2 - 6*r**6*u2**4*v1*v2 -
12*r**6*u1**2*v1**3*v2 - 24*r**6*t*u1*u2*v1**3*v2 +
12*r**6*u2**2*v1**3*v2 - 6*r**6*v1**5*v2 - 3*r**6*t*u1**4*v2**2 + ...
Now the point is that in practice the bulk evaluation of this function will be done for fixed values of P, n1, n2, R and r, which reduces the set of free variables to only four, and "in theory" the formula with fewer parameters should be faster.
So the question is: How can I implement this optimization in Python?
I know I can put everything in a string and do some sort of replace,compile and eval like in
formula = formula.replace('r','1').replace('R','2')....
code = compile(formula,'formula-name','eval')
math_func = lambda t,x,y,z: eval(code)
It would be good if some operations (like power) are substituted by their value, for example 18*r**6*t*u1**2*u2**2*v1**2 should become 18*t for r=u1=u2=v1=1. I think compile should do so but in any case I'm not sure. Does compile actually perform this optimization?
My solution speeds up the computation but if I can squeeze it more it will be great. Note: preferable within standard Python (I could try Cython later).
In general I'm interested in a pythonic way to accomplish my goal, maybe with some extra libraries: what is a reasonably good way of doing this? Is my solution a good approach?
EDIT: (To give more context)
The huge expression is the output of a symbolic line integral over an arc of circle. The arc is given in space by the radius r, two ortho-normal vectors (like the x and y axis in a 2D version) n1=(u1,v1,w1),n2=(u2,v2,w2) and the center P=(a,b,c). The rest is the point over which I'm performing the integration X=(x,y,z) and a parameter R for the function I'm integrating.
Sympy and Maple just take ages to compute this, the actual output is from Mathematica.
If you are curious about the formula here it is (pseudo-pseudo-code):
G(u) = P + r*(1-u**2)/(1+u**2)*n1 + r*2*u/(1+u**2)*n2
integral of (1-|X-G(t)|^2/R^2)^3 over t
You could use Sympy:
>>> from sympy import symbols
>>> x,y,z,a,b,c,u1,v1,w1,u2,v2,w2,t,r = symbols("x,y,z,a,b,c,u1,v1,w1,u2,v2,w2,t,r")
>>> r=u1=u2=v1=1
>>> a = 18*r**6*t*u1**2*u2**2*v1**2
>>> a
18*t
Then you can create a Python function like this:
>>> from sympy import lambdify
>>> f = lambdify(t, a)
>>> f(1)
18
And that f function is indeed simply 18*t:
>>> import dis
>>> dis.dis(f)
1 0 LOAD_CONST 1 (18)
3 LOAD_FAST 0 (_Dummy_18)
6 BINARY_MULTIPLY
7 RETURN_VALUE
If you want to compile the resulting code into machine code, you can try a JIT compiler such as Numba, Theano, or Parakeet.
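The session above fixes values by rebinding the Python names `r`, `u1`, `u2`, `v1` before building the expression. When the expression already exists, `subs` does the same job without losing the symbols. This is a small sketch; the extra `r*t**2` term and the numeric values are made up for illustration.

```python
from sympy import symbols, lambdify

t, r, u1, u2, v1 = symbols("t r u1 u2 v1")

# stand-in for the long formula (the second term is invented for the example)
expr = 18*r**6*t*u1**2*u2**2*v1**2 + r*t**2

# values that stay fixed during bulk evaluation (illustrative numbers)
fixed = {r: 2, u1: 1, u2: 1, v1: 3}

# substitute once; SymPy folds the numeric factors automatically
reduced = expr.subs(fixed)

# build a plain Python function of the remaining free variable
f = lambdify(t, reduced, modules="math")
print(reduced, f(1))
```

The substitution cost is paid once, and every later call to `f` evaluates only the reduced formula.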
Here's how I would approach this problem:
1. compile() your function to an AST (Abstract Syntax Tree) instead of a normal bytecode function - see the standard ast module for details.
2. Traverse the AST, replacing all references to the fixed parameters with their fixed value. There are libraries such as macropy that may be useful for this, I don't have any specific recommendation.
3. Traverse the AST again, performing whatever optimizations this might enable, such as Mult(1, X) => X. You don't have to worry about operations between two constants, as Python (since 2.6) optimizes that already.
4. compile() the AST into a normal function. Call it, and hope that the speed was increased by a sufficient amount to justify all the pre-optimization.
Note that Python will never optimize things like 1*X on its own, as it cannot know what type X will be at runtime - it could be an instance of a class that implements the multiplication operation in an arbitrary way, so the result is not necessarily X. Only your knowledge that all the variables are ordinary numbers, obeying the usual rules of arithmetic, makes this optimization valid.
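The steps above can be sketched with the standard ast module alone. This is an illustration rather than a complete optimizer: the class names and the `specialize` helper are hypothetical, only multiplication by 1 is simplified, constants are folded explicitly instead of relying on the compiler, and `ast.unparse` needs Python 3.9 or later.

```python
import ast

class SubstituteConstants(ast.NodeTransformer):
    """Replace reads of the fixed variables with numeric literals."""
    def __init__(self, fixed):
        self.fixed = fixed

    def visit_Name(self, node):
        if isinstance(node.ctx, ast.Load) and node.id in self.fixed:
            return ast.copy_location(ast.Constant(value=self.fixed[node.id]), node)
        return node

class FoldConstants(ast.NodeTransformer):
    """Fold constant-only operations and drop multiplications by 1.

    Only valid under the assumption that every variable is an ordinary number.
    """
    def visit_BinOp(self, node):
        self.generic_visit(node)  # simplify the operands first
        left, right = node.left, node.right
        if isinstance(left, ast.Constant) and isinstance(right, ast.Constant):
            expr = ast.fix_missing_locations(ast.Expression(body=node))
            value = eval(compile(expr, "<fold>", "eval"))
            return ast.copy_location(ast.Constant(value=value), node)
        if isinstance(node.op, ast.Mult):
            if isinstance(left, ast.Constant) and left.value == 1:
                return right
            if isinstance(right, ast.Constant) and right.value == 1:
                return left
        return node

def specialize(formula, fixed, free):
    """Return (function of the free variables, simplified source text)."""
    source = "lambda {}: {}".format(", ".join(free), formula)
    tree = ast.parse(source, mode="eval")
    tree = SubstituteConstants(fixed).visit(tree)
    tree = FoldConstants().visit(tree)
    ast.fix_missing_locations(tree)
    func = eval(compile(tree, "<specialized>", "eval"))
    return func, ast.unparse(tree.body.body)

fast, text = specialize("18*r**6*t*u1**2*u2**2*v1**2",
                        {"r": 1, "u1": 1, "u2": 1, "v1": 1}, ["t"])
print(text)
```

Whether the specialized function is measurably faster depends on the formula, so it is worth timing both versions before committing to this route.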
The "right way" to solve a problem like this is one or more of:
1. Find a more efficient formulation
2. Symbolically simplify and reduce terms
3. Use vectorization (e.g. NumPy)
4. Punt to low-level libraries that are already optimized (e.g. in languages like C or Fortran that implicitly do strong expression optimization, rather than Python, which does nada).
Let's say for a moment, though, that approaches 1, 3, and 4 are not available, and you have to do this in Python. Then simplifying and "hoisting" common subexpressions is your primary tool.
The good news is, there are a lot of opportunities. The expression r**6, for example, is repeated 26 times. You could save 25 computations by simply assigning r_6 = r ** 6 once, then replacing r**6 every time it occurs.
When you start looking for common expressions here, you'll find them everywhere. It'd be nice to mechanize that process, right? In general, that requires a full expression parser (e.g. from the ast module) and is an exponential-time optimization problem. But your expression is a bit of a special case. While long and varied, it's not especially complicated. It has few internal parenthetical groupings, so we can get away with a quicker and dirtier approach.
Before the how, the resulting code is:
sa = r**6 # 26 occurrences
sb = u1**2 # 5 occurrences
sc = u2**2 # 5 occurrences
sd = v1**2 # 5 occurrences
se = u1**4 # 4 occurrences
sf = u2**3 # 3 occurrences
sg = u1**3 # 3 occurrences
sh = v1**4 # 3 occurrences
si = u2**4 # 3 occurrences
sj = v1**3 # 3 occurrences
sk = v2**2 # 1 occurrence
sl = v1**6 # 1 occurrence
sm = v1**5 # 1 occurrence
sn = u1**6 # 1 occurrence
so = u1**5 # 1 occurrence
sp = u2**6 # 1 occurrence
sq = u2**5 # 1 occurrence
sr = 6*sa # 6 occurrences
ss = 3*sa # 5 occurrences
st = ss*t # 5 occurrences
su = 12*sa # 4 occurrences
sv = sa*t # 3 occurrences
sw = v1*v2 # 5 occurrences
sx = sj*v2 # 3 occurrences
sy = 24*sv # 3 occurrences
sz = 15*sv # 2 occurrences
sA = sr*u1 # 2 occurrences
sB = sy*u1 # 2 occurrences
sC = sb*sc # 2 occurrences
sD = st*se # 2 occurrences
# revised formula
sv*sn - sr*so*u2 - sz*se*sc +
20*sa*sg*sf + sz*sb*si - sA*sq -
sv*sp + sD*sd - su*sg*u2*sd -
18*sv*sC*sd + su*u1*sf*sd +
st*si*sd + st*sb*sh - sA*u2*sh -
st*sc*sh + sv*sl - sr*se*sw -
sy*sg*u2*sw + 36*sa*sC*sw +
sB*sf*sw - sr*si*sw -
su*sb*sx - sB*u2*sx +
su*sc*sx - sr*sm*v2 - sD*sk
That avoids 81 computations. It's just a rough cut. Even the result could be further improved. The subexpressions sr*sw and su*sd for example, could be pre-computed as well. But we'll leave that next level for another day.
Note that this doesn't include the starting r*((-16*(. The majority of the simplification can be (and needs to be) done on the core of the expression, not on its outer terms. So I stripped those away for now; they can be added back once the common core is computed.
How do you do this?
f = """
r**6*t*u1**6 - 6*r**6*u1**5*u2 - 15*r**6*t*u1**4*u2**2 +
20*r**6*u1**3*u2**3 + 15*r**6*t*u1**2*u2**4 - 6*r**6*u1*u2**5 -
r**6*t*u2**6 + 3*r**6*t*u1**4*v1**2 - 12*r**6*u1**3*u2*v1**2 -
18*r**6*t*u1**2*u2**2*v1**2 + 12*r**6*u1*u2**3*v1**2 +
3*r**6*t*u2**4*v1**2 + 3*r**6*t*u1**2*v1**4 - 6*r**6*u1*u2*v1**4 -
3*r**6*t*u2**2*v1**4 + r**6*t*v1**6 - 6*r**6*u1**4*v1*v2 -
24*r**6*t*u1**3*u2*v1*v2 + 36*r**6*u1**2*u2**2*v1*v2 +
24*r**6*t*u1*u2**3*v1*v2 - 6*r**6*u2**4*v1*v2 -
12*r**6*u1**2*v1**3*v2 - 24*r**6*t*u1*u2*v1**3*v2 +
12*r**6*u2**2*v1**3*v2 - 6*r**6*v1**5*v2 - 3*r**6*t*u1**4*v2**2
""".strip()
from collections import Counter
import re

# raw strings keep the regex escapes intact
expre = re.compile(r'(?<!\w)\w+\*\*\d+')
multre = re.compile(r'(?<!\w)\w+\*\w+')
expr_saved = 0
stmts = []
secache = {}
seindex = 0

def subexpr(e):
    """Return the short name for sub-expression e, creating it if needed."""
    global seindex
    cached = secache.get(e)
    if cached:
        return cached
    base = ord('a') if seindex < 26 else ord('A') - 26
    name = 's' + chr(seindex + base)
    seindex += 1
    secache[e] = name
    return name

def hoist(e, flat, c):
    """
    Hoist the expression e into name defined by flat.
    c is the count of how many times seen in incoming
    formula.
    """
    global expr_saved
    assign = "{} = {}".format(flat, e)
    s = "{:30} # {} occurrence{}".format(assign, c, '' if c == 1 else 's')
    stmts.append(s)
    print("{} needless computations quashed with {}".format(c-1, flat))
    expr_saved += c - 1

def common_exp(form):
    """
    Replace ALL exponentiation operations with a hoisted
    sub-expression.
    """
    # find and count exponentiation operations
    expcount = Counter(re.findall(expre, form))
    # for each exponentiation, create a hoisted sub-expression
    for e, c in expcount.most_common():
        hoist(e, subexpr(e), c)
    # replace all exponentiation operations with their sub-expressions
    form = re.sub(expre, lambda x: subexpr(x.group(0)), form)
    return form

def common_mult(f):
    """
    Replace multiplication operations with a hoisted
    sub-expression if they occur > 1 time. Also, only
    replaces one sub-expression at a time (the most common)
    because it may affect further expressions
    """
    mults = re.findall(multre, f)
    for e, c in Counter(mults).most_common():
        # unlike exponents, only replace if >1 occurrence
        if c == 1:
            return f
        # occurs >1 time, so hoist
        hoist(e, subexpr(e), c)
        # replace in loop and return
        return re.sub(r'(?<!\w)' + re.escape(e), subexpr(e), f)
    return f

# fix all exponents
form = common_exp(f)
# fix selected multiplies
prev = form
while True:
    form = common_mult(form)
    if form == prev:
        # have converged; no more replacements possible
        break
    prev = form
print("--")
mults = re.split(r'\s*[+-]\s*', form)
smults = ['*'.join(sorted(terms.split('*'))) for terms in mults]
print(smults)
# print the hoisted statements and the revised expression
print('\n'.join(stmts))
print()
print("# revised formula")
print(form)
Parsing with regular expressions is dicey business. That journey is prone to error, sorrow, and regret. I guarded against bad outcomes by hoisting some exponentiations that didn't strictly need to be, and by plugging random values into both the before and after formulas to make sure they both give the same results. I recommend the "punt to C" strategy if this is production code. But if you can't...
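The random-value check mentioned above can be sketched as follows. The helper name `agrees_on_random_inputs`, the sampling range and the tolerances are choices made for this illustration; the two-term formula is a short excerpt of the real one, and only trusted formula strings should be passed to `exec`/`eval`.

```python
import math
import random

def agrees_on_random_inputs(before, hoisted, after, names, trials=100, seed=0):
    """Compare the original formula with its hoisted rewrite on random inputs."""
    rng = random.Random(seed)
    for _ in range(trials):
        env = {name: rng.uniform(-2.0, 2.0) for name in names}
        expected = eval(before, {"__builtins__": {}}, dict(env))
        # run the hoisted assignments, then evaluate the revised formula
        scope = dict(env)
        exec(hoisted, {"__builtins__": {}}, scope)
        actual = eval(after, {"__builtins__": {}}, scope)
        if not math.isclose(expected, actual, rel_tol=1e-9, abs_tol=1e-9):
            return False
    return True

before = "3*r**6*t*u1**4*v1**2 - 3*r**6*t*u1**4*v2**2"
hoisted = "sa = r**6\nse = u1**4\nst = 3*sa*t\nsD = st*se"
after = "sD*v1**2 - sD*v2**2"
print(agrees_on_random_inputs(before, hoisted, after, ["r", "t", "u1", "v1", "v2"]))
```

Agreement on random inputs does not prove the rewrite is correct, but a single mismatch is enough to show that the regex pass broke something.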

Corresponding Coefficients in Python SymPy Pattern Matching

I have a function named f = 0.5/(z-3). I would like to know what would the coefficients p and q be if f was written in the following form: q/(1-p*z) but unfortunately sympy match function returns None. Am I doing something wrong? or what is the right way of doing something like this?
Here is the code:
from sympy import symbols, Wild
z = symbols('z')
p, q = Wild('p'), Wild('q')
print((0.5/(z-3)).match(q/(1-p*z)))
EDIT:
My expected answer is: q=-1/6 and p = 1/3
One way of course is
p, q = symbols('p q')
f = 0.5/(z-3)
print(solve(f - q/(1-p*z), p, q, rational=True))
But I don't know how to do that in pattern matching, or if it's capable of doing something like this.
Thanks in Advance =)
If you start by converting to linear form,
1 / (2*z - 6) == q / (1 - p*z)
# multiply both sides
# by (2*z - 6) * (1 - p*z)
1 - p*z == q * (2*z - 6)
then
from sympy import Eq, solve, symbols, Wild
z = symbols("z")
p,q = symbols("p q", cls=Wild)
solve(Eq(1 - p*z, q*(2*z - 6)), (p,q))
gives
{p_: 1/3, q_: -1/6}
as expected.
Edit: I found a slightly different approach:
solve(Eq(f, g)) is equivalent to solve(f - g) (implicitly ==0)
We can reduce f - g like simplify(f - g), but by default it doesn't do anything because the resulting equation is more than 1.7 times longer than the original (default value for ratio argument).
If we specify a higher ratio, like simplify(f - g, ratio=5), we get
>>> simplify(1/(2*z-6) - q/(1-p*z), ratio=5)
(z*p_ + 2*q_*(z - 3) - 1)/(2*(z - 3)*(z*p_ - 1))
This is now in a form the solver will deal with:
>>> solve(_, (p,q))
{p_: 1/3, q_: -1/6}
SymPy's pattern matcher only does minimal algebraic manipulation to match things. It doesn't match in this case because there is no 1 in the denominator. It would be better to match against a/(b + c*z) and manipulate a, b, and c into the p and q. solve can show you the exact formula:
In [7]: solve(Eq(a/(b + c*z), q/(1 - p*z)), (q, p))
Out[7]: {p: -c/b, q: a/b}
Finally, it's always a good idea to use exclude when constructing Wild object, like Wild('a', exclude=[z]). Otherwise you can get unexpected behavior like
In [11]: a, b = Wild('a'), Wild('b')
In [12]: S(2).match(a + b*z)
Out[12]: {a: 0, b: 2/z}
which is technically correct, but probably not what you want.
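One way to put this advice into practice is sketched below. It is a variant of the suggestion rather than a literal match against a/(b + c*z): the fraction is split first and only the expanded denominator is matched against b + c*z, which keeps the pattern simple. The exact rational 1/2 is used in place of 0.5 so the result stays exact.

```python
from sympy import Rational, Wild, expand, fraction, symbols

z = symbols("z")
# exclude=[z] stops the wildcards from absorbing z itself
b = Wild("b", exclude=[z])
c = Wild("c", exclude=[z])

f = Rational(1, 2)/(z - 3)

# split into numerator and denominator, then match the linear denominator
a, denom = fraction(f)
m = expand(denom).match(b + c*z)

# a/(b + c*z) == q/(1 - p*z)  gives  p = -c/b  and  q = a/b
p = -m[c]/m[b]
q = a/m[b]
print(p, q)
```

Because p and q depend only on the ratios c/b and a/b, the result does not change if the numerator and denominator are scaled by a common factor.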

Writing a function for x * sin(3/x) in python

I have to write a function, s(x) = x * sin(3/x) in python that is capable of taking single values or vectors/arrays, but I'm having a little trouble handling the cases when x is zero (or has an element that's zero). This is what I have so far:
def s(x):
    result = zeros(size(x))
    for a in range(0,size(x)):
        if (x[a] == 0):
            result[a] = 0
        else:
            result[a] = float(x[a] * sin(3.0/x[a]))
    return result
Which...doesn't work for x = 0. And it's kinda messy. Even worse, I'm unable to use sympy's integrate function on it, or use it in my own simpson/trapezoidal rule code. Any ideas?
When I use integrate() on this function, I get the following error message: "Symbol" object does not support indexing.
This takes about 30 seconds per integrate call:
import sympy as sp
x = sp.Symbol('x')
int2 = sp.integrate(x*sp.sin(3./x),(x,0.000001,2)).evalf(8)
print(int2)
int1 = sp.integrate(x*sp.sin(3./x),(x,0,2)).evalf(8)
print(int1)
The results are:
1.0996940
-4.5*Si(zoo) + 8.1682775
Clearly you want to start the integration from a small positive number to avoid the problem at x = 0.
You can also assign x*sp.sin(3./x) to a variable, e.g.:
s = x*sp.sin(3./x)
int1 = sp.integrate(s, (x, 0.00001, 2))
My original answer using scipy to compute the integral:
import scipy.integrate
import math
def s(x):
    if abs(x) < 0.00001:
        return 0
    else:
        return x*math.sin(3.0/x)
s_exact = scipy.integrate.quad(s, 0, 2)
print(s_exact)
See the scipy docs for more integration options.
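Since the question also mentions a hand-written Simpson rule, here is a minimal standard-library sketch of that route. The `simpson` helper and the number of sub-intervals are assumptions made for illustration. The function is defined as 0 at x = 0, which is its limit there, so the lower bound can stay at 0.

```python
import math

def s(x):
    # x*sin(3/x) tends to 0 as x -> 0, so define s(0) = 0
    return 0.0 if x == 0 else x*math.sin(3.0/x)

def simpson(func, lo, hi, n):
    """Composite Simpson rule with n sub-intervals (rounded up to an even number)."""
    if n % 2:
        n += 1
    h = (hi - lo)/n
    total = func(lo) + func(hi)
    for i in range(1, n):
        total += (4 if i % 2 else 2)*func(lo + i*h)
    return total*h/3

approx = simpson(s, 0.0, 2.0, 200000)
print(approx)
```

The integrand oscillates ever faster near 0, but its amplitude shrinks with x, so a fine uniform grid should still land close to the value of about 1.09969 reported above.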
If you want to use SymPy's integrate, you need a symbolic function. A wrong value at a point doesn't really matter for integration (at least mathematically), so you shouldn't worry about it.
It seems there is a bug in SymPy that gives an answer in terms of zoo at 0, because it isn't using limit correctly. You'll need to compute the limits manually. For example, the integral from 0 to 1:
In [14]: res = integrate(x*sin(3/x), x)
In [15]: ans = limit(res, x, 1) - limit(res, x, 0)
In [16]: ans
Out[16]: -9*pi/4 + 3*cos(3)/2 + sin(3)/2 + 9*Si(3)/2
In [17]: ans.evalf()
Out[17]: -0.164075835450162

Why SymPy can't solve quadratic equation with complicated coefficients

SymPy can easily solve quadratic equations with short simple coefficients.
For example:
from pprint import pprint
from sympy import *
x,b,f,Lb,z = symbols('x b f Lb z')
eq31 = Eq((x*b + f)**2, 4*Lb**2*z**2*(1 - x**2))
pprint(eq31)
sol = solve(eq31, x)
pprint(sol)
But with a little bit larger coefficients - it can't:
from pprint import pprint
from sympy import *
c3,b,f,Lb,z = symbols('c3 b f Lb z')
phi,Lf,r = symbols('phi Lf r')
eq23 = Eq(
(
c3 * (2*Lb*b - 2*Lb*f + 2*Lb*r*cos(phi + pi/6))
+ (Lb**2 - Lf**2 + b**2 - 2*b*f + 2*b*r*cos(phi + pi/6) + f**2 - 2*f*r*cos(phi + pi/6) + r**2 + z**2)
)**2,
4*Lb**2*z**2*(1 - c3**2)
)
pprint(eq23)
print("\n\nSolve (23) for c3:")
solutions_23 = solve(eq23, c3)
pprint(solutions_23)
Why?
This is not specific to Sympy - other programs like Maple or Mathematica suffer from the same problem: When solving an equation, solve needs to choose a proper solution strategy (see e.g. Sympy's Solvers) based on assumptions about the variables and the structure of the equation. These choices are normally heuristic and often wrong (hence no solution is found, or unsuitable strategies are tried first). Furthermore, the assumptions about the variables are often too broad (e.g., complex instead of real).
Thus, for complex equations the solution strategy often has to be given by the user. For your example, you could use:
sol23 = roots(eq23.lhs - eq23.rhs, c3)
Since symbolic solutions are supported, one thing you can do is solve the generic quadratic and substitute in your specific coefficients:
>>> eq = eq23.lhs-eq23.rhs
>>> a,b,c = Poly(eq,c3).all_coeffs()
>>> var('A:C')
(A, B, C)
>>> ans=[i.xreplace({A:a,B:b,C:c}) for i in solve(A*x**2 + B*x + C,x)]
>>> print(filldedent(ans))
...
But you can get the same result if you just shut off simplification and checking:
>>> ans=solve(eq23,c3,simplify=False,check=False)
(Those are the really expensive parts of the call to solve.)
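The idea of solving the generic quadratic and substituting the coefficients can also be written out with the quadratic formula directly, which bypasses solve altogether. The sketch below uses the small equation from the question to keep the output readable; the numeric values at the end are arbitrary and serve only as a spot check. It assumes the leading coefficient is nonzero.

```python
from sympy import Poly, Rational, sqrt, symbols

x, b, f, Lb, z = symbols("x b f Lb z")

# left-hand side minus right-hand side of the first equation in the question
expr = (x*b + f)**2 - 4*Lb**2*z**2*(1 - x**2)

# coefficients of A*x**2 + B*x + C, each an expression in b, f, Lb, z
A, B, C = Poly(expr, x).all_coeffs()

# quadratic formula, with no simplification or checking applied
disc = sqrt(B**2 - 4*A*C)
roots = [(-B + disc)/(2*A), (-B - disc)/(2*A)]

# spot check with arbitrary numbers: each residual should be numerically zero
values = {b: 2, f: 1, Lb: 3, z: Rational(1, 2)}
residuals = [abs(float(expr.subs(x, root).subs(values).evalf())) for root in roots]
print(residuals)
```

The same three lines for A, B, C and the formula apply unchanged to the larger equation, with c3 in place of x.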
