Python: Function takes 1 argument for 2 given - python

I have looked on this website for something similar, and attempted to debug using previous answers, and failed.
I'm testing (I did not write this module) a module that changes the grade value of a course's grades from a B- to say a B, but never going across base grade levels (ie, B+ to an A-).
The original module is called transcript.py
I'm testing it in my own testtranscript.py
I'm testing that module by importing it: 'import transcript' and 'import cornelltest'
I have ensured that all files are in the same folder/directory.
There is the function raise_grade present in transcript.py (there are multiple definitions in this module, but raise_grade is the only one giving me any trouble).
ti is in the form ('class name', 'gradvalue')
There's already another definition converting floats to strings and back (ie 3.0--> B).
def raise_grade(ti):
    """Raise gradeval of transcript line ti by a non-noticeable amount.
    """
    # value of the base letter grade, e.g., 4 (or 4.0) for a 4.3
    bval = int(ti.gradeval)
    print 'bval is:"' + str(bval) + '"'
    # part after decimal point in raised grade, e.g., 3 (or 3.0) for a 4.3
    newdec = min(int((ti.gradeval + .3)*10) % 10, 3)
    print 'newdec is:"' + str(newdec) + '"'
    # get result by adding the two values together, after shifting newdec one
    # decimal place
    newval = bval + round(newdec/10.0, 1)
    ti.gradeval = newval
    print 'newval is:"' + str(newval) + '"'
I will probably get rid of the print later.
When I run testtranscript, which imports transcript:
def test_raise():
    """test raise_grade"""
    testobj = transcript.Titem('CS1110','B-')
    transcript.raise_grade('CS1110','B-')
    cornelltest.assert_floats_equal(3.0,transcript.lettergrade_to_val("B-"))
I get this from the cmd shell:
TypeError: raise_grade takes exactly 1 argument (2 given)
Edit1: So now I see that I am giving it two arguments when raise_grade(ti) takes just one, but perhaps it would shed more light if I posted the rest of the code. I'm still stuck as to why I get a 'str' object has no gradeval error.
LETTER_LIST = ['B', 'A']
# List of valid modifiers to base letter grades.
MODIFIER_LIST = ['-','+']
def lettergrade_to_val(lg):
    """Returns: numerical value of letter grade lg.
    The usual numerical scheme is assumed: A+ -> 4.3, A -> 4.0, A- -> 3.7, etc.
    Precondition: lg is a 1 or 2-character string consisting of a "base" letter
    in LETTER_LIST optionally followed by a modifier in MODIFIER_LIST."""
    # if LETTER_LIST or MODIFIER_LIST change, the implementation of
    # this function must change.
    # get value of base letter. Trick: index in LETTER_LIST is shifted from value
    bv = LETTER_LIST.index(lg[0]) + 3
    # Trick with indexing in MODIFIER_LIST to get the modifier value
    return bv + ((MODIFIER_LIST.index(lg[1]) - .5)*.3/.5 if (len(lg) == 2) else 0)
class Titem(object):
    """A Titem is an 'item' on a transcript, like "CS1110 A+"
    Instance variables:
        course [string]: course name. Always at least 1 character long.
        gradeval [float]: the numerical equivalent of the letter grade.
                          Valid letter grades are 1 or 2 chars long, and consist
                          of a "base" letter in LETTER_LIST optionally followed
                          by a modifier in MODIFIER_LIST.
                          We store values instead of letter grades to facilitate
                          calculations of GPA later.
                          (In "real" life, one would write a function that,
                          when displaying a Titem, would display the letter
                          grade even though the underlying representation is
                          numerical, but we're keeping things simple for this
                          lab.)
    """
    def __init__(self, n, lg):
        """Initializer: A new transcript line with course (name) n, gradeval
        the numerical equivalent of letter grade lg.
        Preconditions: n is a non-empty string.
        lg is a string consisting of a "base" letter in LETTER_LIST
        optionally followed by modifier in MODIFIER_LIST.
        """
        # assert statements that cause an error when preconditions are violated
        assert type(n) == str and type(lg) == str, 'argument type error'
        assert (len(n) >= 1 and 0 < len(lg) <= 2 and lg[0] in LETTER_LIST and
                (len(lg) == 1 or lg[1] in MODIFIER_LIST)), 'argument value error'
        self.course = n
        self.gradeval = lettergrade_to_val(lg)
Edit2: I understand the original problem... but it seems that the original writer screwed up the code, since raise_grade doesn't work properly for grade values at 3.7 ---> 4.0, since bval takes the original float and makes it an int, which doesn't work in this case.
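To make the Edit2 observation concrete, here is a minimal sketch of one way the rounding could be done, assuming grade values only ever end in .0, .3 or .7. The helper name raise_grade_value is hypothetical and works on a bare float; it is not part of transcript.py, and it uses Python 3 print syntax.

```python
def raise_grade_value(gradeval):
    """Return gradeval raised by one modifier step, capped at the '+' of its base letter."""
    # The base letter is the nearest whole number: 2.7 (B-) belongs to 3 (B),
    # whereas int(2.7) would give 2.
    base = int(round(gradeval))
    # Modifier in tenths: -3 for '-', 0 for no modifier, 3 for '+'.
    modifier = int(round((gradeval - base) * 10))
    # Move one step up, but never past '+', so the base letter never changes.
    new_modifier = min(modifier + 3, 3)
    return round(base + new_modifier / 10.0, 1)

for before in (2.7, 3.0, 3.3, 3.7, 4.0, 4.3):
    print(before, '->', raise_grade_value(before))
```

With this approach 3.7 goes to 4.0 and 2.7 goes to 3.0, while 3.3 and 4.3 stay where they are.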

You are calling the function incorrectly, you should be passing the testobj:
def test_raise():
    """test raise_grade"""
    testobj = transcript.Titem('CS1110','B-')
    transcript.raise_grade(testobj)
    ...
The raise_grade function is expecting a single argument ti which has a gradeval attribute, i.e. a Titem instance.
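Both error messages from the question can be reproduced in isolation. The sketch below uses hypothetical stand-ins for Titem and raise_grade (they only mimic the one-parameter signature and the gradeval attribute, not the real module) and Python 3 syntax.

```python
class Titem(object):
    """Hypothetical stand-in for transcript.Titem: stores the numeric grade directly."""
    def __init__(self, course, gradeval):
        self.course = course
        self.gradeval = gradeval

def raise_grade(ti):
    """Stand-in with the same one-parameter signature; it only reads ti.gradeval."""
    return ti.gradeval

def try_call(*args):
    """Call raise_grade with args and report which exception, if any, was raised."""
    try:
        raise_grade(*args)
        return 'ok'
    except (TypeError, AttributeError) as exc:
        return type(exc).__name__

print(try_call('CS1110', 'B-'))         # two arguments for a one-parameter function
print(try_call('CS1110'))               # one argument, but a str has no gradeval
print(try_call(Titem('CS1110', 2.7)))   # one argument of the expected kind
```

The first call fails on the argument count before the function body runs; the second gets past that check and fails when the body looks up gradeval on a string.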

Converting CNF format to DIMACS format

My lab partner and I are working on writing code to make our own SAT solver using Python for one of our courses. So far we have written this code to convert SoP to CNF. Now we are stuck as to how to convert the CNF to DIMACS format. We understand how the DIMACS format works when completing it by hand, but we are stuck on writing the actual code to go from CNF to DIMACS. Everything we have found so far takes input files that are already in the DIMACS format.
from sympy.logic.boolalg import to_cnf
from sympy.abc import A, B, C, D
f = to_cnf(~(A | B) | D)
g = to_cnf(~A&B&C | ~D&A)
The sympy boolalg module lets you build an abstract syntax tree (AST) for the expression. In CNF form you'll have a top-level And node, with one or more children; each child is an Or node with one or more literals; a literal is either a Not node of a symbol, or just a symbol directly.
From the DIMACS side, the top-level And is implicit. You are just listing the Or nodes, and if a symbol was in a Not you mark that with a - before the symbol's variable. You are essentially merely assigning new names for the symbols and writing it down in a new text form. (The fact that DIMACS variable names look like integers is just because it's convenient; they do not have integer semantics/arithmetic/etc.)
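As a small illustration of that text form, here is a sketch that writes a list of integer clauses as DIMACS. The function name clauses_to_dimacs and the numbering D=1, A=2, B=3 are assumptions made for this example only.

```python
def clauses_to_dimacs(num_variables, clauses):
    """Render clauses (lists of non-zero ints) as DIMACS CNF text."""
    # Header: "p cnf <variable count> <clause count>"
    lines = ["p cnf %d %d" % (num_variables, len(clauses))]
    for clause in clauses:
        # Each clause is its literals separated by spaces, terminated by 0.
        lines.append(" ".join(str(literal) for literal in clause) + " 0")
    return "\n".join(lines)

# (D | ~A) & (D | ~B) with the assumed numbering D=1, A=2, B=3
print(clauses_to_dimacs(3, [[1, -2], [1, -3]]))
```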
To track mapping between DIMACS variables and sympy symbols, something like this is helpful:
class DimacsMapping:
    def __init__(self):
        self._symbol_to_variable = {}
        self._variable_to_symbol = {}
        self._total_variables = 0
    @property
    def total_variables(self):
        return self._total_variables
    def new_variable(self):
        self._total_variables += 1
        return self._total_variables
    def get_variable_for(self, symbol):
        result = self._symbol_to_variable.get(symbol)
        if result is None:
            result = self.new_variable()
            self._symbol_to_variable[symbol] = result
            self._variable_to_symbol[result] = symbol
        return result
    def get_symbol_for(self, variable):
        return self._variable_to_symbol[variable]
    def __str__(self) -> str:
        return str(self._variable_to_symbol)
You can always ask for a new (fresh, never been used) variable with new_variable. DIMACS variables start from 1, not 0. (The 0 value is used to indicate not-a-variable, primarily for marking the end-of-clause.)
We don't want to just allocate new variables every time, but also remember which variables were assigned to a symbol. This maintains a mapping from symbol to variable and vice versa. You hand a sympy symbol to get_variable_for and either the previously used variable for that symbol is returned, or a new variable is allocated and returned, with the mapping noted.
It tracks the reverse mapping, so you can recover the original symbol given a variable in get_symbol_for; this is useful for turning a SAT assignment back into sympy assignments.
Next, we need something to store this mapping along with the clause list. You need both to emit valid DIMACS, since the header line contains both the variable count (which the mapping knows) and the clause count (which the clause list knows). This is basically a glorified tuple, with a __str__ that does the conversion to well-formed DIMACS text:
class DimacsFormula:
    def __init__(self, mapping, clauses):
        self._mapping = mapping
        self._clauses = clauses
    @property
    def mapping(self):
        return self._mapping
    @property
    def clauses(self):
        return self._clauses
    def __str__(self):
        header = f"p cnf {self._mapping.total_variables} {len(self._clauses)}"
        body = "\n".join(
            " ".join([str(literal) for literal in clause] + ["0"])
            for clause in self._clauses
        )
        return "\n".join([header, body])
Last, we just walk over the sympy AST to create DIMACS clauses:
from sympy.core.symbol import Symbol
from sympy.logic.boolalg import to_cnf, And, Or, Not
def to_dimacs_formula(sympy_cnf):
    dimacs_mapping = DimacsMapping()
    dimacs_clauses = []
    assert type(sympy_cnf) == And
    for sympy_clause in sympy_cnf.args:
        assert type(sympy_clause) == Or
        dimacs_clause = []
        for sympy_literal in sympy_clause.args:
            if type(sympy_literal) == Not:
                sympy_symbol, polarity = sympy_literal.args[0], -1
            elif type(sympy_literal) == Symbol:
                sympy_symbol, polarity = sympy_literal, 1
            else:
                raise AssertionError("invalid cnf")
            dimacs_variable = dimacs_mapping.get_variable_for(sympy_symbol)
            dimacs_literal = dimacs_variable * polarity
            dimacs_clause.append(dimacs_literal)
        dimacs_clauses.append(dimacs_clause)
    return DimacsFormula(dimacs_mapping, dimacs_clauses)
This just descends down the tree, until it gets the root symbol and whether or not it was negated (i.e., was in a Not indicating negative polarity). Once the symbol is mapped to its variable, we can leave it positive or negate it to maintain polarity and append it to the DIMACS clause.
Do this for all Or nodes and we have a mapped DIMACS formula.
f = to_cnf(~(A | B) | D)
print(f)
print()
f_dimacs = to_dimacs_formula(f)
print(f_dimacs)
print()
print(f_dimacs.mapping)
(D | ~A) & (D | ~B)
p cnf 3 2
1 -2 0
1 -3 0
{1: D, 2: A, 3: B}
As an aside, you probably don't want to use to_cnf to get a CNF for purposes of testing satisfiability. In general, converting a boolean formula to an equivalent CNF can result in exponential size increase.
Note in the above example for f, variable D only appeared once in the formula yet appeared twice in the CNF. If it had been more complicated, like (C | D), then that entire subformula gets copied:
f = to_cnf(~(A | B) | (C | D))
print(f)
(C | D | ~A) & (C | D | ~B)
If it was even more complicated, you can see how you end up with copies of copies of copies... and so on. For the purposes of testing satisfiability, we do not need an equivalent formula but merely an equisatisfiable one.
This is a formula that may not be equivalent, but is satisfiable if and only if the original was. It can have new clauses and different variables. This relaxation gives a linear sized translation instead.
To do this, rather than allow subformulas to be copied we will allocate a variable that represents the truth value of that subformula and use that instead. This is called a Tseitin transformation, and I go into more detail in this answer.
As a simple example, let's say we want to use a variable x to represent (a ∧ b). We would write this as x ≡ (a ∧ b), which can be done with three CNF clauses: (¬x ∨ a) ∧ (¬x ∨ b) ∧ (¬a ∨ ¬b ∨ x). Now x is true if and only if (a ∧ b) is.
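The three clauses can be checked by brute force over all eight assignments. This is only a sanity-check sketch in plain Python; the function name tseitin_and_clauses_hold is made up for the example.

```python
from itertools import product

def tseitin_and_clauses_hold(x, a, b):
    """Evaluate (~x | a) & (~x | b) & (~a | ~b | x) for boolean x, a, b."""
    return (not x or a) and (not x or b) and (not a or not b or x)

# The clauses are all satisfied exactly when x has the same value as (a and b).
for x, a, b in product([False, True], repeat=3):
    assert tseitin_and_clauses_hold(x, a, b) == (x == (a and b))
print("clauses hold exactly when x == (a and b)")
```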
This top-level function kicks off the process, so that the recursive calls share the same mapping and clause set. The final outcome is a single variable representing the truth value of the entire formula. We must force this to be true (else a SAT solver will simply choose any input variables to the formula, follow the implications, and produce an evaluated formula of any output).
def to_dimacs_tseitin(sympy_formula):
    dimacs_mapping = DimacsMapping()
    dimacs_clauses = []
    # Convert the formula, with this final variable representing the outcome
    # of the entire formula. Since we are stating this formula should evaluate
    # to true, this variable is appended as a unit clause stating such.
    formula_literal = _to_dimacs_tseitin_literal(
        sympy_formula, dimacs_mapping, dimacs_clauses
    )
    dimacs_clauses.append([formula_literal])
    return DimacsFormula(dimacs_mapping, dimacs_clauses)
The bulk of the translation is the code that adds clauses specific to the operation being performed. The recursion happens at the point where we demand a single variable that represents the output of the subformula arguments.
def _to_dimacs_tseitin_literal(sympy_formula, dimacs_mapping, dimacs_clauses):
    # Base case, the formula is just a symbol.
    if type(sympy_formula) == Symbol:
        return dimacs_mapping.get_variable_for(sympy_formula)
    # Otherwise, it is some operation on one or more subformulas. First we
    # need to get a literal representing the outcome of each of those.
    args_literals = [
        _to_dimacs_tseitin_literal(arg, dimacs_mapping, dimacs_clauses)
        for arg in sympy_formula.args
    ]
    # As a special case, we won't bother wasting a new variable for `Not`.
    if type(sympy_formula) == Not:
        return -args_literals[0]
    # General case requires a new variable and new clauses.
    result = dimacs_mapping.new_variable()
    if type(sympy_formula) == And:
        for arg_literal in args_literals:
            dimacs_clauses.append([-result, arg_literal])
        dimacs_clauses.append(
            [result] + [-arg_literal for arg_literal in args_literals]
        )
    elif type(sympy_formula) == Or:
        for arg_literal in args_literals:
            dimacs_clauses.append([result, -arg_literal])
        dimacs_clauses.append(
            [-result] + [arg_literal for arg_literal in args_literals]
        )
    else:
        # TODO: Handle all the other sympy operation types.
        raise NotImplementedError()
    return result
Now boolean formulas do not need to be in CNF to become DIMACS:
f = ~(A | B) | (C | D)
print(f)
print()
f_dimacs = to_dimacs_tseitin(f)
print(f_dimacs)
print()
print(f_dimacs.mapping)
C | D | ~(A | B)
p cnf 6 8
5 -3 0
5 -4 0
-5 3 4 0
6 -1 0
6 -2 0
6 5 0
-6 1 2 -5 0
6 0
{1: C, 2: D, 3: A, 4: B}
Even after unit propagation the resulting formula is clearly larger in this small example. However, the linear scaling helps substantially when 'real' formulas are converted. Furthermore, modern SAT solvers understand that formulas will be translated this way and perform in-processing tailored toward it.

Order by Progressive Number

I've recently started with Python, and I'm looking for a way to sort progressive (hierarchical) numbers.
For example, since I work in accounting, a basic balance sheet structure starts as:
1 - Asset
1.1 - Asset short term
1.1.1 - xxxxxxxxx
11102312313 - Cash (Accounting account)
and that repeats for Liabilities (2) , Expenses (3) Revenue (4)
How can I sort like the example above? If I sort directly in Excel, it comes out like this:
1
2
3
4
100000000
1.1
1.1.1
etc..
But I need the list ordered as in the first example.
nice! This isn't trivial for a Python beginner, yet pretty manageable. Won't do your work here, but set you on the right path:
Let's assume you have a list of strings, ["1 - Asset", "1.1 - Asset short term", ...]
You'll want to sort that list (tutorial), so you need to use Python's built-in sorted() on that list
But sorted will do an alphabetical sorting by default, which doesn't seem to be what you want
To teach it other sorting methods, you need to implement a key, i.e. some class (or more generally, type) or function return value that can be compared (like with <, >, >=) correctly. For example, a string that starts with a pattern like *.* should always be > than a string that starts with a plain number like 10000000.
Then it's just sortedlist = sorted(inputlist, key=accounting_key)
A typical key class might look like
def extract_ordinal_from_string(string):
    hyphenposition = string.find(" -")
    if hyphenposition < 0:
        raise Exception(f"Row doesn't contain ' -', can't be sorted: {string}")
    return string[:hyphenposition]
class accounting_key:
    def __init__(self, string):
        self.key = extract_ordinal_from_string(string)
        self.dot = "." in self.key
    def __lt__(self, other):
        """
        lt: less-than (<) comparison operator
        """
        # sorted() compares key objects with each other, so `other` is
        # another accounting_key instance, not a raw string.
        otherkey = other.key
        otherdot = other.dot
        if otherdot and self.dot:
            # Both contain dots.
            # Let's put them both in tuples and compare these:
            # (1,1,0) < (1,2)
            selftuple = tuple(int(substring) for substring in self.key.split("."))
            othertuple = tuple(int(substring) for substring in otherkey.split("."))
            return selftuple < othertuple
        if not self.dot and not otherdot:
            return int(self.key) < int(otherkey)
        if self.dot and not otherdot:
            ### And so on, up to you to implement
            raise NotImplementedError("mixed dotted/plain comparison not implemented yet")
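If the goal is exactly the hierarchical order from the question (1, then 1.1, then 1.1.1, then 11102312313, then 2, ...), one possible shortcut is a plain key function instead of a key class. This is a sketch under a loud assumption: every dotted level is a single digit and account numbers start with the digits of their parent group, so removing the dots and comparing as strings gives the wanted order. The name account_sort_key and the sample rows are made up for illustration.

```python
def account_sort_key(row):
    """Hypothetical key: the ordinal before " - ", with dots removed, compared as a string."""
    ordinal = row.split(" - ", 1)[0].strip()
    # "1.1.1" becomes "111", which is a string prefix of "11102312313",
    # so the group header sorts directly before its accounts.
    return ordinal.replace(".", "")

rows = [
    "2 - Liabilities",
    "11102312313 - Cash",
    "1.1 - Asset short term",
    "1 - Asset",
    "1.1.1 - xxxxxxxxx",
]
for row in sorted(rows, key=account_sort_key):
    print(row)
```

This would break as soon as a level needs two digits (1.10 and 1.1.0 both collapse to "110"), in which case a tuple-based comparison like the key class is the safer route.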

Python .join[] multiplys total string characters across multiple lines

I am working on a personal project to help my understanding of python 3.4.2 looping and concatenating strings from multiple sources.
My goal is to take characters from 'string' and use join (calling __len__() inside) to build up a new string, but it is multiplying my results. I would like the lengths to be 5, then 10, then 15. Right now they come out as 5, then 25, then 105. If I keep going I get 425, 1705, 6825, etc.
I hope I'm missing something simple, but any help would be amazing. I'm also trying to do my joins efficiently (I know the prints aren't, those are for debugging purposes.)
I used a visualized python tool online to step through it and see if I could figure it out. I just am missing something.
http://www.pythontutor.com/visualize.html#mode=edit
Thank you in advance!
import random
def main():
    #String values will be pulled from
    string = 'valuehereisheldbythebeholderwhomrolledadtwentyandcriticalmissed'
    #Initial string creation
    strTest = ''
    print('strTest Blank: ' + strTest)
    #first round string generation
    strTest = strTest.join([string[randomIndex(string.__len__())] for i in range(randomLength())])
    print('strTest 1: ' + strTest)
    print('strTest 1 length: ' + str(strTest.__len__()))
    #second round string generation
    strTest = strTest.join([string[randomIndex(string.__len__())] for i in range(randomLength())])
    print('strTest 2: ' + strTest)
    print('strTest 2 length: ' + str(strTest.__len__()))
    #final round string generation
    strTest = strTest.join([string[randomIndex(string.__len__())] for i in range(randomLength())])
    print('strTest 3: ' + strTest)
    print('strTest 3 length: ' + str(strTest.__len__()))
def randomIndex(index):
    #create random value between position 0 and total string length to generate string
    return random.randint(0,index)
def randomLength():
    #return random length for string creation, static for testing
    return 5
    #return random.randint(10,100)
main()
# output desired is
# strTest 1 length: 5
# strTest 2 length: 10
# strTest 3 length: 15
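The code runs without any issue. What's actually happening is that each time you call strTest.join(...), the previous value of strTest is used as the separator, so it gets inserted between each pair of the new random characters. With 5 new characters that means 4 copies of the old string: 5 + 4*5 = 25, then 5 + 4*25 = 105, and so on.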
Quoting from Python Doc:
str.join(iterable) Return a string which is the concatenation of the
strings in the iterable iterable. A TypeError will be raised if there
are any non-string values in iterable, including bytes objects. The
separator between elements is the string providing this method.
Example:
>>> s = 'ALPHA'
>>> '*'.join(s)
'A*L*P*H*A'
>>> s = 'TEST'
>>> ss = '-long-string-'
>>> ss.join(s)
'T-long-string-E-long-string-S-long-string-T'
So probably you want something like:
strTest = strTest + ''.join([string[randomIndex(string.__len__())] for i in range(randomLength())])
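Since the question also asks about joining efficiently, here is a sketch of an alternative: collect the pieces in a list and join them with an empty separator. The names SOURCE, random_chunk and pieces are made up for this example. It also uses random.choice, because random.randint(0, n) includes n itself, so indexing the string with randint(0, len(string)) can occasionally raise an IndexError.

```python
import random

SOURCE = 'valuehereisheldbythebeholderwhomrolledadtwentyandcriticalmissed'

def random_chunk(length):
    """Return `length` random characters drawn from SOURCE."""
    # random.choice always picks a valid element, so no index arithmetic is needed.
    return ''.join(random.choice(SOURCE) for _ in range(length))

pieces = []
for round_number in range(1, 4):
    pieces.append(random_chunk(5))
    # Join with an empty separator so nothing is inserted between pieces.
    print('round', round_number, 'length:', len(''.join(pieces)))

result = ''.join(pieces)
```

The lengths printed are 5, 10 and 15, because each round adds exactly one 5-character piece.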

Significant numbers digits of value by its error

I'm in need of a function returning only the significant part of a value with respect to a given error. Meaning something like this:
def (value, error):
""" This function takes a value and determines its significant
accuracy by its error.
It returns only the scientific important part of a value and drops the rest. """
magic magic magic....
return formated value as String.
What i have written so far to show what I mean:
import numpy as np
def signigicant(value, error):
    """ Returns a number in a scientific format. Meaning a value has an error
    and that error determines how many digits of the
    value are significant. e.g. value = 12.345MHz,
    error = 0.1MHz => 12.3MHz because the error is at the first digit.
    (in reality drop the MHz, it's just to show why.)"""
    xx = "%E"%error # I assume this is most ineffective.
    xx = xx.split("E")
    xx = int(xx[1])
    if error <= value: # this should be the normal case
        yy = np.around(value, -xx)
        if xx >= 0: # Error is 1 or bigger
            return "%i"%yy
        else: # Error is smaller than 1
            string = "%."+str(-xx) +"f"
            return string%yy
    if error > value: # This should not be usual but it can happen.
        return "%g"%value
What I don't want is a function like numpy's around or round. Those functions take a value and want to be told which part of this value is important. The point is that in general I don't know how many digits are significant. It depends on the size of the error of that value.
Another example:
value = 123, error = 12, => 120
One can drop the 3, because the error is at the size of 10. However this behaviour is not so important, because some people still write 123 for the value. Here it is okay but not perfectly right.
For big numbers the "g" string operator is a usable choice but not always what I need. For e.g.
If the error is bigger than the value.( happens e.g. when someone wants to measure something that does not exist.)
value = 10, error = 100
I still wish to keep the 10 as the value because I don't know it any better. The function should then return 10 and not 0.
The stuff I have written does work more or less, but it's clearly not efficient or elegant in any way. Also I assume this question concerns hundreds of people, because every scientist has to format numbers in that way. So I'm sure there is a ready-to-use solution somewhere, but I haven't found it yet.
Probably my Google skills aren't good enough, but I wasn't able to find a solution to this in two days, so now I'm asking here.
For testing my code I used the following, but more is needed.
errors = [0.2,1.123,1.0, 123123.1233215,0.123123123768]
values = [12.3453,123123321.4321432, 0.000321 ,321321.986123612361236,0.00001233214 ]
for value, error in zip(values, errors):
    print "Teste Value: ",value, "Error:", error
    print "Result: ", signigicant(value, error)
import math
def round_on_error(value, error):
    significant_digits = 10**math.floor(math.log(error, 10))
    return value // significant_digits * significant_digits
Example:
>>> errors = [0.2,1.123,1.0, 123123.1233215,0.123123123768]
>>> values = [12.3453,123123321.4321432, 0.000321 ,321321.986123612361236,0.00001233214 ]
>>> map(round_on_error, values, errors)
[12.3, 123123321.0, 0.0, 300000.0, 0.0]
And if you want to keep a value that is smaller than its error:
def round_on_error(value, error):
    if value < error:
        return value
    significant_digits = 10**math.floor(math.log(error, 10))
    return value // significant_digits * significant_digits
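Since the question asks for a formatted string, here is a sketch of a variant that rounds to the nearest step (rather than flooring with //) and formats the result. The name format_by_error is hypothetical, and the choice to fall back to "%g" when the error exceeds the value follows the question's description rather than any standard.

```python
import math

def format_by_error(value, error):
    """Format value with only the digits that are significant given error."""
    # If the error is unusable or bigger than the value, keep the value as-is.
    if error <= 0 or abs(value) < error:
        return "%g" % value
    # Decimal exponent of the error's leading digit, e.g. 0.1 -> -1, 12 -> 1.
    exponent = int(math.floor(math.log10(error)))
    rounded = round(value, -exponent)
    if exponent >= 0:
        # Error is 1 or bigger: no decimals needed.
        return "%d" % rounded
    # Error is smaller than 1: show as many decimals as the error reaches.
    return "%.*f" % (-exponent, rounded)

print(format_by_error(12.345, 0.1))
print(format_by_error(123, 12))
print(format_by_error(10, 100))
```

For the examples from the question this gives 12.3, 120 and 10.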

Can I write a function that carries out symbolic calculations in Python 2.7?

I'm currently transitioning from Java to Python and have taken on the task of trying to create a calculator that can carry out symbolic operations on infix-notated mathematical expressions (without using custom modules like Sympy). Currently, it's built to accept strings that are space delimited and can only carry out the (, ), +, -, *, and / operators. Unfortunately, I can't figure out the basic algorithm for simplifying symbolic expressions.
For example, given the string '2 * ( ( 9 / 6 ) + 6 * x )', my program should carry out the following steps:
2 * ( 1.5 + 6 * x )
3 + 12 * x
But I can't get the program to ignore the x when distributing the 2. In addition, how can I handle 'x * 6 / x' so it returns '6' after simplification?
EDIT: To clarify, by "symbolic" I meant that it will leave letters like "A" and "f" in the output while carrying out the remaining calculations.
EDIT 2: I (mostly) finished the code. I'm posting it here if anyone stumbles on this post in the future, or if any of you were curious.
def reduceExpr(useArray):
    # Use Python's native eval() to compute if no letters are detected.
    if (not hasLetters(useArray)):
        return [calculate(useArray)] # Different from eval() because it returns string version of result
    # Base case. Returns useArray if the list size is 1 (i.e., it contains one string).
    if (len(useArray) == 1):
        return useArray
    # Base case. Returns the space-joined elements of useArray as a list with one string.
    if (len(useArray) == 3):
        return [' '.join(useArray)]
    # Checks to see if parentheses are present in the expression & sets.
    # Counts number of parentheses & keeps track of first ( found.
    parentheses = 0
    leftIdx = -1
    # This try/except block is essentially an if/else block. Since useArray.index('(') raises a ValueError
    # if it can't find '(' in useArray, the next line is not carried out, and parentheses is not incremented.
    try:
        leftIdx = useArray.index('(')
        parentheses += 1
    except Exception:
        pass
    # If a ValueError was raised, leftIdx = -1 and rightIdx = parentheses = 0.
    rightIdx = leftIdx + 1
    while (parentheses > 0):
        if (useArray[rightIdx] == '('):
            parentheses += 1
        elif (useArray[rightIdx] == ')'):
            parentheses -= 1
        rightIdx += 1
    # Provided parentheses pair isn't empty, runs contents through again; else, removes the parentheses
    if (leftIdx > -1 and rightIdx - leftIdx > 2):
        return reduceExpr(useArray[:leftIdx] + [' '.join(['(',reduceExpr(useArray[leftIdx+1:rightIdx-1])[0],')'])] + useArray[rightIdx:])
    elif (leftIdx > -1):
        return reduceExpr(useArray[:leftIdx] + useArray[rightIdx:])
    # If operator is + or -, hold the first two elements and process the rest of the list first
    if isAddSub(useArray[1]):
        return reduceExpr(useArray[:2] + reduceExpr(useArray[2:]))
    # Else, if operator is * or /, process the first 3 elements first, then the rest of the list
    elif isMultDiv(useArray[1]):
        return reduceExpr(reduceExpr(useArray[:3]) + useArray[3:])
    # Just placed this so the compiler wouldn't complain that the function had no return (since this was called by yet another function).
    return None
You need much more processing before you go into operations on symbols. The form you want to get to is a tree of operations with values in the leaf nodes. First you need to do a lexer run on the string to get elements - although if you always have space-separated elements it might be enough to just split the string. Then you need to parse that array of tokens using some grammar you require.
If you need theoretical information about grammars and parsing text, start here: http://en.wikipedia.org/wiki/Parsing If you need something more practical, go to https://github.com/pyparsing/pyparsing (you don't have to use the pyparsing module itself, but their documentation has a lot of interesting info) or http://www.nltk.org/book
From 2 * ( ( 9 / 6 ) + 6 * x ), you need to get to a tree like this:
     *
  2     +
     /     *
    9 6   6 x
Then you can visit each node and decide if you want to simplify it. Constant operations will be the simplest ones to eliminate - just compute the result and exchange the "/" node with 1.5 because all children are constants.
There are many strategies to continue, but essentially you need to find a way to go through the tree and modify it until there's nothing left to change.
If you want to print the result then, just walk the tree again and produce an expression which describes it.
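The tree-walking idea can be sketched in a few lines. Here the tree is represented as nested tuples of the form (operator, left, right) with numbers and variable names as leaves; that representation and the names fold and to_infix are assumptions for this example, and only constant folding is shown, not distribution or cancellation.

```python
import operator

# Map each operator token to the function that computes it.
OPS = {'+': operator.add, '-': operator.sub,
       '*': operator.mul, '/': operator.truediv}

def fold(node):
    """Replace every subtree whose children are both numbers by its value."""
    if not isinstance(node, tuple):
        return node  # leaf: a number or a variable name
    op, left, right = node
    left, right = fold(left), fold(right)
    if isinstance(left, (int, float)) and isinstance(right, (int, float)):
        return OPS[op](left, right)
    return (op, left, right)

def to_infix(node):
    """Walk the tree again and produce a space-delimited expression string."""
    if not isinstance(node, tuple):
        return str(node)
    op, left, right = node
    return '( ' + to_infix(left) + ' ' + op + ' ' + to_infix(right) + ' )'

# Tree for 2 * ( ( 9 / 6 ) + 6 * x )
tree = ('*', 2, ('+', ('/', 9, 6), ('*', 6, 'x')))
folded = fold(tree)
print(to_infix(folded))
```

The "/" node with children 9 and 6 is replaced by 1.5, while the nodes that involve x are left in place for further simplification rules.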
If you are parsing expressions in Python, you might consider Python syntax for the expressions and parse them using the ast module (AST = abstract syntax tree).
The advantages of using Python syntax: you don't have to make a separate language for the purpose, the parser is built in, and so is the evaluator. Disadvantages: there's quite a lot of extra complexity in the parse tree that you don't need (you can avoid some of it by using the built-in NodeVisitor and NodeTransformer classes to do your work).
>>> import ast
>>> a = ast.parse('x**2 + x', mode='eval')
>>> ast.dump(a)
"Expression(body=BinOp(left=BinOp(left=Name(id='x', ctx=Load()), op=Pow(),
right=Num(n=2)), op=Add(), right=Name(id='x', ctx=Load())))"
Here's an example class that walks a Python parse tree and does recursive constant folding (for binary operations), to show you the kind of thing you can do fairly easily.
from ast import *
class FoldConstants(NodeTransformer):
    def visit_BinOp(self, node):
        self.generic_visit(node)
        if isinstance(node.left, Num) and isinstance(node.right, Num):
            expr = copy_location(Expression(node), node)
            value = eval(compile(expr, '<string>', 'eval'))
            return copy_location(Num(value), node)
        else:
            return node
>>> ast.dump(FoldConstants().visit(ast.parse('3**2 - 5 + x', mode='eval')))
"Expression(body=BinOp(left=Num(n=4), op=Add(), right=Name(id='x', ctx=Load())))"
