Converting CNF format to DIMACS format - python

My lab partner and I are working on writing code to make our own SAT solver using Python for one of our courses. So far we have written this code to convert SoP to CNF. Now we are stuck as to how to convert the CNF to DIMACS format. We understand how the DIMACS format works when completing it by hand, but we are stuck on writing the actual code to go from CNF to DIMACS. Everything we have found so far takes input files that are already in the DIMACS format.
from sympy.logic.boolalg import to_cnf
from sympy.abc import A, B, C, D
f = to_cnf(~(A | B) | D)
g = to_cnf(~A&B&C | ~D&A)

The sympy boolalg module lets you build an abstract syntax tree (AST) for the expression. In CNF form you'll have a top-level And node, with one or more children; each child is an Or node with one or more literals; a literal is either a Not node of a symbol, or just a symbol directly.
From the DIMACS side, the top-level And is implicit. You are just listing the Or nodes, and if a symbol was in a Not you mark that with a - before the symbol's variable. You are essentially merely assigning new names for the symbols and writing it down in a new text form. (The fact that DIMACS variable names look like integers is just because it's convenient; they do not have integer semantics/arithmetic/etc.)
To track mapping between DIMACS variables and sympy symbols, something like this is helpful:
class DimacsMapping:
    def __init__(self):
        self._symbol_to_variable = {}
        self._variable_to_symbol = {}
        self._total_variables = 0

    @property
    def total_variables(self):
        return self._total_variables

    def new_variable(self):
        self._total_variables += 1
        return self._total_variables

    def get_variable_for(self, symbol):
        result = self._symbol_to_variable.get(symbol)
        if result is None:
            result = self.new_variable()
            self._symbol_to_variable[symbol] = result
            self._variable_to_symbol[result] = symbol
        return result

    def get_symbol_for(self, variable):
        return self._variable_to_symbol[variable]

    def __str__(self) -> str:
        return str(self._variable_to_symbol)
You can always ask for a new (fresh, never been used) variable with new_variable. DIMACS variables start from 1, not 0. (The 0 value is used to indicate not-a-variable, primarily for marking the end-of-clause.)
We don't want to just allocate new variables every time, but also remember which variables were assigned to a symbol. This maintains a mapping from symbol to variable and vice versa. You hand a sympy symbol to get_variable_for and either the previously used variable for that symbol is returned, or a new variable is allocated and returned, with the mapping noted.
It tracks the reverse mapping, so you can recover the original symbol given a variable in get_symbol_for; this is useful for turning a SAT assignment back into sympy assignments.
Next, we need something to store this mapping along with the clause list. You need both to emit valid DIMACS, since the header line contains both the variable count (which the mapping knows) and the clause count (which the clause list knows). This is basically a glorified tuple, with a __str__ that does the conversion to well-formed DIMACS text:
class DimacsFormula:
    def __init__(self, mapping, clauses):
        self._mapping = mapping
        self._clauses = clauses

    @property
    def mapping(self):
        return self._mapping

    @property
    def clauses(self):
        return self._clauses

    def __str__(self):
        header = f"p cnf {self._mapping.total_variables} {len(self._clauses)}"
        body = "\n".join(
            " ".join([str(literal) for literal in clause] + ["0"])
            for clause in self._clauses
        )
        return "\n".join([header, body])
Last, we just walk over the sympy AST to create DIMACS clauses:
from sympy.core.symbol import Symbol
from sympy.logic.boolalg import to_cnf, And, Or, Not

def to_dimacs_formula(sympy_cnf):
    dimacs_mapping = DimacsMapping()
    dimacs_clauses = []
    assert type(sympy_cnf) == And
    for sympy_clause in sympy_cnf.args:
        assert type(sympy_clause) == Or
        dimacs_clause = []
        for sympy_literal in sympy_clause.args:
            if type(sympy_literal) == Not:
                sympy_symbol, polarity = sympy_literal.args[0], -1
            elif type(sympy_literal) == Symbol:
                sympy_symbol, polarity = sympy_literal, 1
            else:
                raise AssertionError("invalid cnf")
            dimacs_variable = dimacs_mapping.get_variable_for(sympy_symbol)
            dimacs_literal = dimacs_variable * polarity
            dimacs_clause.append(dimacs_literal)
        dimacs_clauses.append(dimacs_clause)
    return DimacsFormula(dimacs_mapping, dimacs_clauses)
This just descends down the tree, until it gets the root symbol and whether or not it was negated (i.e., was in a Not indicating negative polarity). Once the symbol is mapped to its variable, we can leave it positive or negate it to maintain polarity and append it to the DIMACS clause.
Do this for all Or nodes and we have a mapped DIMACS formula.
f = to_cnf(~(A | B) | D)
print(f)
print()
f_dimacs = to_dimacs_formula(f)
print(f_dimacs)
print()
print(f_dimacs.mapping)
(D | ~A) & (D | ~B)
p cnf 3 2
1 -2 0
1 -3 0
{1: D, 2: A, 3: B}
As an aside, you probably don't want to use to_cnf to get a CNF for purposes of testing satisfiability. In general, converting a boolean formula to an equivalent CNF can result in exponential size increase.
Note in the above example for f, variable D only appeared once in the formula yet appeared twice in the CNF. If it had been more complicated, like (C | D), then that entire subformula gets copied:
f = to_cnf(~(A | B) | (C | D))
print(f)
(C | D | ~A) & (C | D | ~B)
If it was even more complicated, you can see how you end up with copies of copies of copies... and so on. For the purposes of testing satisfiability, we do not need an equivalent formula but merely an equisatisfiable one.
This is a formula that may not be equivalent, but is satisfiable if and only if the original was. It can have new clauses and different variables. This relaxation gives a linear sized translation instead.
To do this, rather than allow subformulas to be copied we will allocate a variable that represents the truth value of that subformula and use that instead. This is called a Tseitin transformation, and I go into more detail in this answer.
As a simple example, let's say we want to use a variable x to represent (a ∧ b). We would write this as x ≡ (a ∧ b), which can be done with three CNF clauses: (¬x ∨ a) ∧ (¬x ∨ b) ∧ (¬a ∨ ¬b ∨ x). Now x is true if and only if (a ∧ b) is.
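The three clauses can be checked by brute force. The following is a minimal sketch in plain Python (no sympy or SAT solver involved); the helper name clauses_hold is made up for illustration. It enumerates all eight assignments and confirms that the clauses hold exactly when x has the same value as (a ∧ b).

```python
from itertools import product

def clauses_hold(x, a, b):
    """Evaluate (¬x ∨ a) ∧ (¬x ∨ b) ∧ (¬a ∨ ¬b ∨ x) for one assignment."""
    return (not x or a) and (not x or b) and (not a or not b or x)

# The clauses should be true exactly when x agrees with (a and b).
for x, a, b in product([False, True], repeat=3):
    assert clauses_hold(x, a, b) == (x == (a and b))

print("the three clauses encode x <-> (a & b) on all 8 assignments")
```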
This top-level function kicks off the process, so that the recursive calls share the same mapping and clause set. The final outcome is a single variable representing the truth value of the entire formula. We must force this to be true (else a SAT solver will simply choose any input variables to the formula, follow the implications, and produce an evaluated formula of any output).
def to_dimacs_tseitin(sympy_formula):
    dimacs_mapping = DimacsMapping()
    dimacs_clauses = []
    # Convert the formula, with this final variable representing the outcome
    # of the entire formula. Since we are stating this formula should evaluate
    # to true, this variable is appended as a unit clause stating such.
    formula_literal = _to_dimacs_tseitin_literal(
        sympy_formula, dimacs_mapping, dimacs_clauses
    )
    dimacs_clauses.append([formula_literal])
    return DimacsFormula(dimacs_mapping, dimacs_clauses)
The bulk of the translation is the code that adds clauses specific to the operation being performed. The recursion happens at the point where we demand a single variable that represents the output of the subformula arguments.
def _to_dimacs_tseitin_literal(sympy_formula, dimacs_mapping, dimacs_clauses):
    # Base case, the formula is just a symbol.
    if type(sympy_formula) == Symbol:
        return dimacs_mapping.get_variable_for(sympy_formula)
    # Otherwise, it is some operation on one or more subformulas. First we
    # need to get a literal representing the outcome of each of those.
    args_literals = [
        _to_dimacs_tseitin_literal(arg, dimacs_mapping, dimacs_clauses)
        for arg in sympy_formula.args
    ]
    # As a special case, we won't bother wasting a new variable for `Not`.
    if type(sympy_formula) == Not:
        return -args_literals[0]
    # General case requires a new variable and new clauses.
    result = dimacs_mapping.new_variable()
    if type(sympy_formula) == And:
        for arg_literal in args_literals:
            dimacs_clauses.append([-result, arg_literal])
        dimacs_clauses.append(
            [result] + [-arg_literal for arg_literal in args_literals]
        )
    elif type(sympy_formula) == Or:
        for arg_literal in args_literals:
            dimacs_clauses.append([result, -arg_literal])
        dimacs_clauses.append(
            [-result] + [arg_literal for arg_literal in args_literals]
        )
    else:
        # TODO: Handle all the other sympy operation types.
        raise NotImplementedError()
    return result
Now boolean formulas do not need to be in CNF to become DIMACS:
f = ~(A | B) | (C | D)
print(f)
print()
f_dimacs = to_dimacs_tseitin(f)
print(f_dimacs)
print()
print(f_dimacs.mapping)
C | D | ~(A | B)
p cnf 6 8
5 -3 0
5 -4 0
-5 3 4 0
6 -1 0
6 -2 0
6 5 0
-6 1 2 -5 0
6 0
{1: C, 2: D, 3: A, 4: B}
Even after unit propagation the resulting formula is clearly larger in this small example. However, the linear scaling helps substantially when 'real' formulas are converted. Furthermore, modern SAT solvers understand that formulas will be translated this way and perform in-processing tailored toward it.
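To see equisatisfiability concretely, the eight clauses printed above can be checked by brute force. This is only a sketch in plain Python with made-up helper names (satisfies, extendable): for every assignment to the input variables 1 to 4 (C, D, A, B), it asks whether the auxiliary variables 5 and 6 can be chosen so that all clauses hold, and compares that with the truth value of the original formula.

```python
from itertools import product

# Clauses copied from the DIMACS output above (trailing 0 terminators dropped).
CLAUSES = [
    [5, -3], [5, -4], [-5, 3, 4],
    [6, -1], [6, -2], [6, 5], [-6, 1, 2, -5],
    [6],
]

def satisfies(assignment, clauses):
    """assignment maps variable number -> bool; each clause needs one true literal."""
    return all(
        any(assignment[abs(lit)] == (lit > 0) for lit in clause)
        for clause in clauses
    )

def extendable(c, d, a, b):
    """Can inputs 1-4 (C, D, A, B) be extended with values for variables 5 and 6?"""
    return any(
        satisfies({1: c, 2: d, 3: a, 4: b, 5: v5, 6: v6}, CLAUSES)
        for v5, v6 in product([False, True], repeat=2)
    )

# The translated clauses are satisfiable on exactly the inputs that
# make the original formula C | D | ~(A | B) true.
for c, d, a, b in product([False, True], repeat=4):
    assert extendable(c, d, a, b) == (c or d or not (a or b))
```

Enumerating assignments like this only works for tiny formulas, but it is a cheap sanity check when adding support for further operation types.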

Related

How to write Python class to solve for any missing variable given an equation and values for the other variables?

I am trying to write a Python class that, when given an equation (or set of equations, eventually) with n variables and values for any n-1 of the variables, will solve for the remaining undefined variable.
For example, if I wanted a class for simple conversion between Celsius and Fahrenheit in either direction, I would currently write it like this:
class TempConversion():
    def __init__(self, temp, unit):
        self.unit = unit
        if self.unit == "C":
            self.celsius = temp
        elif self.unit == "F":
            self.farenheit = temp

    def convert_temps(self):
        if self.unit == "C":
            self.farenheit = 9/5 * self.celsius + 32
        elif self.unit == "F":
            self.celsius = ( self.farenheit - 32 ) * 5/9
Even in this simple example, I currently have to write the equation twice under the .convert_temps() method. In a real simulation program that involves several dozen equations involving several variables each, this could potentially require writing dozens of lines of highly redundant code, prone to arithmetic errors in implementation and messy to read.
I suspect that there is a common way to write this efficiently/flexibly, either with native Python or with a particular package, such that each equation only needs to be 'written once.' I cannot seem to find the right search terms on SO, however.
It seems like SymPy might be an option from python solving equations for unknown variable but that SymPy would require me to define whichever variable was missing as a Symbol(), which would presumably still mean writing a lot of conditional statements etc.
Since you asked to make an example:
sympy uses symbols and does symbolic computation, not numeric calculations.
For each unknown in your equation you need a symbol. Luckily, the developers added a kind of symbol that you do not need to name explicitly (Dummy).
Use one of the conversion formulas and rearrange it into the form of a first-order equation in two unknowns (ax + by + c = 0).
Use the passed value as one of unknowns and solve the equation for the other one.
Dummy symbols:
Normally sympy requires you to create symbols and those symbols must have different names.
In your example you would need two symbols as below:
c = sp.Symbol("C")
f = sp.Symbol("F")
or
c, f = sp.symbols("C F")
However, in your case you might not know how many unknowns there are, so you can use Dummy. It takes no required argument and creates a fresh symbol each time.
c = sp.Dummy()
f = sp.Dummy()
Rearranging the formula
Here you can use one of the conversion formulas. Let us use C2F.
F = 9 / 5 * C + 32
Becomes:
9 / 5 * C + 32 - F = 0
Where F is the temperature in Fahrenheit and C is the temperature in Celsius.
Now the code would be something like:
import sympy as sp

class TempConversion:
    def __init__(self, temp: float, unit: str) -> None:
        self.unit = unit
        self.temp = temp
        self.f, self.c = sp.Dummy(), sp.Dummy()
        self.equation = 9 / 5 * self.c + 32 - self.f

    def convert(self) -> list:
        if "CELSIUS".startswith(self.unit.upper()):
            reduced = self.equation.subs(self.c, self.temp)
            return sp.solve(reduced, self.f)
        elif "FAHRENHEIT".startswith(self.unit.upper()):
            reduced = self.equation.subs(self.f, self.temp)
            return sp.solve(reduced, self.c)
        else:
            raise ValueError("Unknown Temperature Unit")

if __name__ == '__main__':
    t = TempConversion(22.75, "F")
    print(t.convert())
Please note that in this example there were two unknowns and one was provided by the user. In other cases you might need to parse the given equation or create an equation builder class, create a list of dummy symbols, and do other clever things.
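If pulling in sympy is not an option, the "write each equation once" goal can also be approximated in plain Python, at least for equations that are linear in the unknown. This is a different technique from the sympy approach above, not a replacement for it: the equation is written once as a residual (an expression that should equal zero), and the missing variable is recovered from two evaluations of that residual. The names temp_residual and solve_linear are hypothetical, and the sketch gives wrong answers for equations that are not linear in the missing variable.

```python
import inspect

def temp_residual(celsius, fahrenheit):
    """The conversion formula, written once, as an expression equal to zero."""
    return 9 / 5 * celsius + 32 - fahrenheit

def solve_linear(residual, **known):
    """Solve residual(...) == 0 for the one argument missing from `known`.

    Assumes the residual is linear in the missing variable, so evaluating
    it at 0 and at 1 is enough to locate its root.
    """
    names = list(inspect.signature(residual).parameters)
    missing = [name for name in names if name not in known]
    if len(missing) != 1:
        raise ValueError("exactly one variable must be left unspecified")
    unknown = missing[0]
    r0 = residual(**known, **{unknown: 0.0})
    r1 = residual(**known, **{unknown: 1.0})
    slope = r1 - r0
    if slope == 0:
        raise ValueError(f"residual does not depend on {unknown!r}")
    return unknown, -r0 / slope

print(solve_linear(temp_residual, celsius=100.0))
print(solve_linear(temp_residual, fahrenheit=22.75))
```

Each call returns the name of the variable that was solved for together with its value, so no per-variable conditional is needed.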

Finding Interpolated Data Value

This is a question I've had before: I have two arrays representing the inputs and corresponding outputs of a function. I need to find the input for a specific output that falls between data points. How do I do that?
For example:
import numpy as np
B = np.arange(0,10,1)
def fun(b):
    return b*3/5
A = fun(B)
How to get the value of "B" for fun to return 3.75?
This technique uses linear interpolation to approximate.
I start with this function:
def interpABS(A,B,Aval):
    if Aval>max(A) or Aval<min(A):
        print('Error: Extrapolating beyond given data')
    else:
        if len(A)==len(B):
            for i in np.arange(1,len(A),1):
                ihi = i
                ilo = i-1
                if A[i]>Aval:
                    break
            Alo = A[ilo]
            Blo = B[ilo]
            Ahi = A[ihi]
            Bhi = B[ihi]
            out = Blo + (Bhi-Blo)*(Aval-Alo)/(Ahi-Alo)
            return out
        else:
            print('Error: inputs of different sizes')
Note: I'm kind of an amateur and don't know how to set up exceptions, so instead the error outputs are just print commands on a different path from the rest of the function. Those more experienced than I am may recommend improvements.
Use the output array from your function as A, and the corresponding input array as B, then input your target value as Aval. interpABS will return an approximate input for your original function to get the target value.
So, for our example above, interpABS(A,B,3.75) will return a value of 6.25
This can be useful even if Aval is a value of A to find the corresponding B value, since the math simplifies to Blo + 0. For example, changing Aval in the above example to 3.0 will give 5.0, which is part of the original input set B.
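As for the note about exceptions: one possible variant raises ValueError instead of printing, so that a caller cannot mistake a failed lookup for a result. The sketch below is plain Python (lists rather than NumPy arrays), uses the made-up name interp_inverse, and assumes the outputs are sorted in strictly increasing order, as A is in the example.

```python
from bisect import bisect_right

def interp_inverse(outputs, inputs, target):
    """Linearly interpolate the input that would produce `target`.

    Assumes `outputs` is sorted in strictly increasing order.
    """
    if len(outputs) != len(inputs):
        raise ValueError("outputs and inputs must have the same length")
    if len(outputs) < 2:
        raise ValueError("need at least two data points")
    if not outputs[0] <= target <= outputs[-1]:
        raise ValueError("target lies outside the data; refusing to extrapolate")
    # Index of the first output strictly greater than target, clamped so
    # that hi is a valid index with a predecessor.
    hi = min(max(bisect_right(outputs, target), 1), len(outputs) - 1)
    lo = hi - 1
    fraction = (target - outputs[lo]) / (outputs[hi] - outputs[lo])
    return inputs[lo] + (inputs[hi] - inputs[lo]) * fraction

B = list(range(10))
A = [b * 3 / 5 for b in B]
print(interp_inverse(A, B, 3.75))
```

The binary search via bisect replaces the explicit loop; for data already held in NumPy arrays, the loop-based interpABS above works unchanged.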

Z3py: coercion and value extraction between Float32() and BitVecSort(32)

I am currently experimenting with the floating point (FP) theory, in combination with bit-vectors; I am working with Z3 4.6.0.0.
I have found limited documentation on the use of the FP API (it seems the only real hits are on z3.py itself), so I have tried to provide "complete" examples, which also act as a demonstration of how I believe the API should be used.
Here was my first attempt at using the FP theory inside of Z3py:
#!/usr/bin/env python
from z3 import *
s = Solver()
#
# Create our value
#
to_find = "-12.345"
fp_val = FPVal(to_find, Float32())
#
# Create our variable
#
fp_var = Const("fp_var", Float32())
#
# Create the constraints
#
s.add(fp_var == fp_val)
assert s.check() == sat, "assertion never fails because the instance is SAT"
m = s.model()
#
# Prints -12.3449993134
#
print eval(str(m.eval(fp_var, model_completion=True)))
# EOF
If you run this example, it works as expected, and we indeed get that fp_var is equal to (I presume) the float nearest to -12.345 (so far so good; apart from the use of Python's eval to obtain the value as a Python float).
My next step was to try and coerce a floating point value into a bit-vector, whilst checking a non-integer value:
#!/usr/bin/env python
from z3 import *
s = Solver()
#
# Create our value
#
to_find = "-12.345"
fp_val = FPVal(to_find, Float32())
#
# Create our variable
#
bv_var = Const("bv_var", BitVecSort(32))
#
# We now use a "cast" to allow us to do the floating point comparison
#
fp_var = fpSignedToFP(RNE(), bv_var, Float32())
#
# Create the constraints
#
s.add(fp_var == fp_val)
#
# This is UNSAT because fpSignedToFP only supports _integers_!
#
assert s.check() == unsat, "instance is UNSAT, so assertion doesn't fail"
# EOF
In this example, we attempt to "convert" from the bit-vector into a floating point value (using fpSignedToFP), and then assert that the floating point value is equal to the value we are looking for. However, and this matches the documentation for Z3, we get UNSAT, because fpSignedToFP only supports integers.
I then started to "get creative" and see if I could use transitivity with the fpToSBV API call, which does not state it has the limitation of being restricted to only integer floats:
#!/usr/bin/env python
from z3 import *
print get_full_version()
s = Solver()
#
# Create our value
#
to_find = "-12.345"
fp_val = FPVal(to_find, Float32())
#
# Create our variable
#
bv_var = Const("bv_var", BitVecSort(32))
#
# We now create an additional, ancillary variable
#
fp_var = Const("fp_var", Float32())
#
# Create the constraints
#
s.add(fp_var == fp_val)
s.add(bv_var == fpToSBV(RNE(), fp_var, BitVecSort(32)))
assert s.check() == sat, "this example works"
m = s.model()
#
# Prints -12.3449993134
#
print eval(str(m.eval(fp_var, model_completion=True)))
#
# To read out the value from the BV, we create a new variable, assert it is
# the same as the originating BV, and then extract our new variable
#
eval_fp = Const("eval_fp", Float32())
s.add(bv_var == fpToSBV(RNE(), eval_fp, BitVecSort(32)))
assert s.check() == sat, "this cannot change the satisfiability"
m = s.model()
#
# Prints +oo True True
#
print str(m.eval(eval_fp, model_completion=True)),
print m.eval(eval_fp, model_completion=True).isInf(),
print m.eval(eval_fp, model_completion=True).isPositive()
# EOF
To explain this example slightly: we encode our problem as normal, but rather than creating a term for our floating point expression, we create a whole new constant, and assert that our bit-vector is equal to the floating point value, via fpToSBV. Our assertion on the value we wish to find is then against our floating point value. If this model is satisfiable, we then create another floating point constant, that points to our original bit-vector and we try to extract the value for that constant.
This gives a chain like this:
bv_var == fpToSBV(fp_var)
fp_var == -12.345
fpToSBV(eval_fp) == bv_var
which I had hoped would, by transitivity, have given the same value for fp_var and eval_fp. However, while fp_var gets the correct value, eval_fp comes out as positive infinity!
I also tried experimenting with the use of fpToSBV and fpSignedToFP together to try and see if that would work (I was sceptical of this, due to the limitations in fpSignedToFP):
#!/usr/bin/env python
from z3 import *
s = Solver()
#
# Create our value
#
to_find = "-12.345"
bv_val = fpToSBV(RNE(), FPVal(to_find, Float32()), BitVecSort(32))
#
# Create our variable
#
bv_var = Const("bv_var", BitVecSort(32))
#
# Create the constraints
#
s.add(bv_var == bv_val)
assert s.check() == sat, "this example works"
m = s.model()
#
# Floating point constant to evaluate
#
eval_fp = fpSignedToFP(RNE(), bv_var, Float32())
#
# Prints -12.0
#
print eval(str(m.eval(eval_fp, model_completion=True)))
# EOF
This had the most success and did actually give an integer close to the floating point value of our constraint (i.e., -12.0). However, it still does not contain the digits after the decimal point.
So, my question is: in Z3, is it possible to coerce a floating point value into a bit-vector, and then "back out" again? If so, how should it be done? Alternatively, are the bit-vector/floating point operations only "partially interpreted" (like with int2bv and friends)?
Update
After experimenting some more, I found that the following works (using the Z3-specific extension fpToIEEEBV):
#!/usr/bin/env python
from z3 import *
s = Solver()
#
# Create our value
#
to_find = "-12.345"
fp_val = FPVal(to_find, Float32())
#
# Create our variable
#
bv_var = Const("bv_var", BitVecSort(32))
#
# Convert the value to check to a BV
#
bv_val = fpToIEEEBV(fp_val)
#
# Create the constraints
#
s.add(bv_var == bv_val)
assert s.check() == sat, "this example is SAT, so don't see this"
m = s.model()
#
# Evaluation expression
#
eval_expr = fpBVToFP(bv_var, Float32())
#
# Prints -12.3449993134
#
print eval(str(m.eval(eval_expr)))
# EOF
Is this the correct way of dealing with this issue?
Putting aside how the Python API works, this is already specified in the definition of the floating-point logic, see here: http://smtlib.cs.uiowa.edu/theories-FloatingPoint.shtml
In particular, note the partiality of some of these functions:
"All fp.to_* functions are unspecified for NaN and infinity input
values. In addition, fp.to_ubv and fp.to_sbv are unspecified for
finite number inputs that are out of range (which includes all
negative numbers for fp.to_ubv).
This means for instance that the formula
(= (fp.to_real (_ NaN 8 24)) (fp.to_real (fp c1 c2 c3)))
is satisfiable in this theory for all binary constants c1, c2, and
c3 (of the proper sort). "
So, what you're doing (pending Python specific aspects) is per the standard, with the above caveats.
Using the interchange format
For conversions to/from IEEE754 interchange format, the logic provides a way to convert a given-bit vector to a corresponding float:
; from single bitstring representation in IEEE 754-2008 interchange format,
; with m = eb + sb
((_ to_fp eb sb) (_ BitVec m) (_ FloatingPoint eb sb))
In the other direction, the logic has this to say:
There is no function for converting from (_ FloatingPoint eb sb) to
the corresponding IEEE 754-2008 binary format, as a bit vector (_
BitVec m) with m = eb + sb, because (_ NaN eb sb) has multiple,
well-defined representations. Instead, an encoding of the kind below
is recommended, where f is a term of sort (_ FloatingPoint eb sb):
(declare-fun b () (_ BitVec m))
(assert (= ((_ to_fp eb sb) b) f))
Again, all this is in http://smtlib.cs.uiowa.edu/theories-FloatingPoint.shtml
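The difference between the two kinds of conversion discussed in this thread can be illustrated outside Z3. The sketch below uses only Python's standard struct module and makes no Z3 calls; the comparisons to fpToIEEEBV, fpBVToFP and fpToSBV in the comments are analogies, not statements about how Z3 implements them. Reinterpreting the 32 bits of a single-precision float is lossless, whereas converting the number to an integer discards the fraction.

```python
import struct

value = -12.345

# Reinterpretation: pack as an IEEE 754 single, then read the same 4 bytes
# as an unsigned 32-bit integer (analogous to fpToIEEEBV).
bits = struct.unpack(">I", struct.pack(">f", value))[0]

# Reading those bits back as a float recovers the nearest single-precision
# value to -12.345 (analogous to fpBVToFP).
round_trip = struct.unpack(">f", struct.pack(">I", bits))[0]

# Numeric conversion: round to the nearest integer, ties to even
# (analogous to fpToSBV with RNE); the fractional part is lost.
as_integer = round(value)

print(f"bit pattern : 0x{bits:08X}")
print(f"round trip  : {round_trip!r}")
print(f"as integer  : {as_integer}")
```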

Python: Function takes 1 argument for 2 given

I have looked on this website for something similar, and attempted to debug using previous answers, and failed.
I'm testing (I did not write this module) a module that changes the grade value of a course's grades from a B- to say a B, but never going across base grade levels (ie, B+ to an A-).
The original module is called transcript.py
I'm testing it in my own testtranscript.py
I'm testing that module by importing it: 'import transcript' and 'import cornelltest'
I have ensured that all files are in the same folder/directory.
There is the function raise_grade present in transcript.py (there are multiple definitions in this module, but raise_grade is the only one giving me any trouble).
ti is in the form ('class name', 'gradvalue')
There's already another definition converting floats to strings and back (ie 3.0--> B).
def raise_grade(ti):
    """Raise gradeval of transcript line ti by a non-noticeable amount.
    """
    # value of the base letter grade, e.g., 4 (or 4.0) for a 4.3
    bval = int(ti.gradeval)
    print 'bval is:"' + str(bval) + '"'
    # part after decimal point in raised grade, e.g., 3 (or 3.0) for a 4.3
    newdec = min(int((ti.gradeval + .3)*10) % 10, 3)
    print 'newdec is:"' + str(newdec) + '"'
    # get result by adding the two values together, after shifting newdec one
    # decimal place
    newval = bval + round(newdec/10.0, 1)
    ti.gradeval = newval
    print 'newval is:"' + str(newval) + '"'
I will probably get rid of the print later.
When I run testtranscript, which imports transcript:
def test_raise():
    """test raise_grade"""
    testobj = transcript.Titem('CS1110','B-')
    transcript.raise_grade('CS1110','B-')
    cornelltest.assert_floats_equal(3.0,transcript.lettergrade_to_val("B-"))
I get this from the cmd shell:
TypeError: raise_grade takes exactly 1 argument (2 given)
Edit1: So now I see that I am giving it two parameters when raise_grade(ti) is just one, but perhaps it would shed more light if I just put out the rest of the code. I'm still stuck as to why I get a ['str' object has no gradeval error]
LETTER_LIST = ['B', 'A']
# List of valid modifiers to base letter grades.
MODIFIER_LIST = ['-','+']

def lettergrade_to_val(lg):
    """Returns: numerical value of letter grade lg.
    The usual numerical scheme is assumed: A+ -> 4.3, A -> 4.0, A- -> 3.7, etc.
    Precondition: lg is a 1 or 2-character string consisting of a "base" letter
    in LETTER_LIST optionally followed by a modifier in MODIFIER_LIST."""
    # if LETTER_LIST or MODIFIER_LIST change, the implementation of
    # this function must change.
    # get value of base letter. Trick: index in LETTER_LIST is shifted from value
    bv = LETTER_LIST.index(lg[0]) + 3
    # Trick with indexing in MODIFIER_LIST to get the modifier value
    return bv + ((MODIFIER_LIST.index(lg[1]) - .5)*.3/.5 if (len(lg) == 2) else 0)

class Titem(object):
    """A Titem is an 'item' on a transcript, like "CS1110 A+"
    Instance variables:
        course [string]: course name. Always at least 1 character long.
        gradeval [float]: the numerical equivalent of the letter grade.
                          Valid letter grades are 1 or 2 chars long, and consist
                          of a "base" letter in LETTER_LIST optionally followed
                          by a modifier in MODIFIER_LIST.
                          We store values instead of letter grades to facilitate
                          calculations of GPA later.
                          (In "real" life, one would write a function that,
                          when displaying a Titem, would display the letter
                          grade even though the underlying representation is
                          numerical, but we're keeping things simple for this
                          lab.)
    """
    def __init__(self, n, lg):
        """Initializer: A new transcript line with course (name) n, gradeval
        the numerical equivalent of letter grade lg.
        Preconditions: n is a non-empty string.
        lg is a string consisting of a "base" letter in LETTER_LIST
        optionally followed by modifier in MODIFIER_LIST.
        """
        # assert statements that cause an error when preconditions are violated
        assert type(n) == str and type(lg) == str, 'argument type error'
        assert (len(n) >= 1 and 0 < len(lg) <= 2 and lg[0] in LETTER_LIST and
                (len(lg) == 1 or lg[1] in MODIFIER_LIST)), 'argument value error'
        self.course = n
        self.gradeval = lettergrade_to_val(lg)
Edit2: I understand the original problem... but it seems that the original writer screwed up the code, since raise_grade doesn't work properly for grade values at 3.7 ---> 4.0, since bval takes the original float and makes it an int, which doesn't work in this case.
You are calling the function incorrectly, you should be passing the testobj:
def test_raise():
    """test raise_grade"""
    testobj = transcript.Titem('CS1110','B-')
    transcript.raise_grade(testobj)
    ...
The raise_grade function is expecting a single argument ti which has a gradeval attribute, i.e. a Titem instance.
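Both error messages from the question can be reproduced in isolation. The sketch below does not use the transcript module; Item and bump are hypothetical stand-ins for Titem and raise_grade, written in Python 3 syntax, and serve only to show the three call shapes side by side.

```python
class Item:
    """Hypothetical stand-in for transcript.Titem: it just holds a gradeval."""
    def __init__(self, course, gradeval):
        self.course = course
        self.gradeval = gradeval

def bump(ti):
    """Hypothetical one-argument function that, like raise_grade, reads ti.gradeval."""
    ti.gradeval = ti.gradeval + 0.3

item = Item("CS1110", 2.7)
bump(item)                 # correct: one argument, the object itself

try:
    bump("CS1110", "B-")   # two arguments: a TypeError, as in the question
except TypeError as exc:
    print("TypeError:", exc)

try:
    bump("B-")             # one argument, but a str has no gradeval attribute
except AttributeError as exc:
    print("AttributeError:", exc)
```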

Can I write a function that carries out symbolic calculations in Python 2.7?

I'm currently transitioning from Java to Python and have taken on the task of trying to create a calculator that can carry out symbolic operations on infix-notated mathematical expressions (without using custom modules like Sympy). Currently, it's built to accept strings that are space delimited and can only carry out the (, ), +, -, *, and / operators. Unfortunately, I can't figure out the basic algorithm for simplifying symbolic expressions.
For example, given the string '2 * ( ( 9 / 6 ) + 6 * x )', my program should carry out the following steps:
2 * ( 1.5 + 6 * x )
3 + 12 * x
But I can't get the program to ignore the x when distributing the 2. In addition, how can I handle 'x * 6 / x' so it returns '6' after simplification?
EDIT: To clarify, by "symbolic" I meant that it will leave letters like "A" and "f" in the output while carrying out the remaining calculations.
EDIT 2: I (mostly) finished the code. I'm posting it here if anyone stumbles on this post in the future, or if any of you were curious.
def reduceExpr(useArray):
    # Use Python's native eval() to compute if no letters are detected.
    if (not hasLetters(useArray)):
        return [calculate(useArray)] # Different from eval() because it returns string version of result
    # Base case. Returns useArray if the list size is 1 (i.e., it contains one string).
    if (len(useArray) == 1):
        return useArray
    # Base case. Returns the space-joined elements of useArray as a list with one string.
    if (len(useArray) == 3):
        return [' '.join(useArray)]
    # Checks to see if parentheses are present in the expression & sets.
    # Counts number of parentheses & keeps track of first ( found.
    parentheses = 0
    leftIdx = -1
    # This try/except block is essentially an if/else block. Since useArray.index('(') raises a ValueError
    # if it can't find '(' in useArray, the next line is not carried out, and parentheses is not incremented.
    try:
        leftIdx = useArray.index('(')
        parentheses += 1
    except Exception:
        pass
    # If a ValueError was raised, leftIdx = -1 and rightIdx = parentheses = 0.
    rightIdx = leftIdx + 1
    while (parentheses > 0):
        if (useArray[rightIdx] == '('):
            parentheses += 1
        elif (useArray[rightIdx] == ')'):
            parentheses -= 1
        rightIdx += 1
    # Provided parentheses pair isn't empty, runs contents through again; else, removes the parentheses
    if (leftIdx > -1 and rightIdx - leftIdx > 2):
        return reduceExpr(useArray[:leftIdx] + [' '.join(['(',reduceExpr(useArray[leftIdx+1:rightIdx-1])[0],')'])] + useArray[rightIdx:])
    elif (leftIdx > -1):
        return reduceExpr(useArray[:leftIdx] + useArray[rightIdx:])
    # If operator is + or -, hold the first two elements and process the rest of the list first
    if isAddSub(useArray[1]):
        return reduceExpr(useArray[:2] + reduceExpr(useArray[2:]))
    # Else, if operator is * or /, process the first 3 elements first, then the rest of the list
    elif isMultDiv(useArray[1]):
        return reduceExpr(reduceExpr(useArray[:3]) + useArray[3:])
    # Just placed this so the compiler wouldn't complain that the function had no return (since this was called by yet another function).
    return None
You need much more processing before you go into operations on symbols. The form you want to get to is a tree of operations with values in the leaf nodes. First you need to do a lexer run on the string to get elements - although if you always have space-separated elements it might be enough to just split the string. Then you need to parse that array of tokens using some grammar you require.
If you need theoretical information about grammars and parsing text, start here: http://en.wikipedia.org/wiki/Parsing If you need something more practical, go to https://github.com/pyparsing/pyparsing (you don't have to use the pyparsing module itself, but their documentation has a lot of interesting info) or http://www.nltk.org/book
From 2 * ( ( 9 / 6 ) + 6 * x ), you need to get to a tree like this:
        *
    2       +
         /     *
        9 6   6 x
Then you can visit each node and decide if you want to simplify it. Constant operations will be the simplest ones to eliminate - just compute the result and exchange the "/" node with 1.5 because all children are constants.
There are many strategies to continue, but essentially you need to find a way to go through the tree and modify it until there's nothing left to change.
If you want to print the result then, just walk the tree again and produce an expression which describes it.
If you are parsing expressions in Python, you might consider Python syntax for the expressions and parse them using the ast module (AST = abstract syntax tree).
The advantages of using Python syntax: you don't have to make a separate language for the purpose, the parser is built in, and so is the evaluator. Disadvantages: there's quite a lot of extra complexity in the parse tree that you don't need (you can avoid some of it by using the built-in NodeVisitor and NodeTransformer classes to do your work).
>>> import ast
>>> a = ast.parse('x**2 + x', mode='eval')
>>> ast.dump(a)
"Expression(body=BinOp(left=BinOp(left=Name(id='x', ctx=Load()), op=Pow(),
right=Num(n=2)), op=Add(), right=Name(id='x', ctx=Load())))"
Here's an example class that walks a Python parse tree and does recursive constant folding (for binary operations), to show you the kind of thing you can do fairly easily.
from ast import *

class FoldConstants(NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.left, Num) and isinstance(node.right, Num):
            expr = copy_location(Expression(node), node)
            value = eval(compile(expr, '<string>', 'eval'))
            return copy_location(Num(value), node)
        else:
            return node
>>> ast.dump(FoldConstants().visit(ast.parse('3**2 - 5 + x', mode='eval')))
"Expression(body=BinOp(left=Num(n=4), op=Add(), right=Name(id='x', ctx=Load())))"
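The class above targets the Python 2.7 era ast module. On current Python 3, ast.Num has been deprecated since 3.8 in favour of ast.Constant, so the same idea might look like the sketch below. It assumes Python 3.9 or later (for ast.unparse); the names FoldNumericConstants and simplify are made up, the operator table replaces the eval/compile step, and only constant folding is performed (no distribution or cancellation of symbols).

```python
import ast
import operator

# Map AST operator node types to the functions that implement them.
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Pow: operator.pow,
}

class FoldNumericConstants(ast.NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the children first
        op = _OPS.get(type(node.op))
        left, right = node.left, node.right
        if (
            op is not None
            and isinstance(left, ast.Constant)
            and isinstance(right, ast.Constant)
            and isinstance(left.value, (int, float))
            and isinstance(right.value, (int, float))
        ):
            try:
                value = op(left.value, right.value)
            except ZeroDivisionError:
                return node  # leave e.g. 1 / 0 in place unevaluated
            return ast.copy_location(ast.Constant(value=value), node)
        return node

def simplify(source):
    """Parse an expression, fold constant subexpressions, return source text."""
    tree = ast.parse(source, mode="eval")
    folded = ast.fix_missing_locations(FoldNumericConstants().visit(tree))
    return ast.unparse(folded)

print(simplify("2 * ((9 / 6) + 6 * x)"))
print(simplify("3**2 - 5 + x"))
```

In the first expression only 9 / 6 is folded, because every other operation has the name x somewhere beneath it; going further would need the tree rewriting described earlier in this answer.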
