How do I transform str('+') into a mathematical operation?
For example:
a = [0,1,2] # or a = ['0','1','2']
b = ['+','-','*']
c = int(a[0]+b[0]+a[1])
In other words, how can I transform str('-1*2') into an int without writing for i in c: if i == '+': ...?
Thanks.
You can also use the operator module:
import operator as op
#Create a mapping between the string and the operator:
ops = {'+': op.add, '-': op.sub, '*': op.mul}
a = [0,1,2]
b = ['+','-','*']
#use the mapping
c = ops[b[0]](a[0], a[1])
I think you're looking for eval(), but I'd advise using something else.
However:
>>> eval('-1*2')
-2
eval executes the string you pass to it as code, so it is quite dangerous from a security standpoint, especially if the string comes from user input.
In this case I suggest using a parsing library, such as ply: http://www.dabeaz.com/ply/
For this kind of task it is really simple to use and very effective :)
If your math expressions will fit Python syntax but you are skeered of eval (you should be) you can look into python's ast module (docs). It will parse the expression into an abstract syntax tree you can iterate over. You can evaluate a limited subset of Python and throw errors if you encounter anything outside your expression grammar.
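A minimal sketch of that idea, assuming Python 3.8+ (where numeric literals parse to ast.Constant). The names ALLOWED_NODES and checked_eval are made up for illustration. It walks the tree, rejects anything outside a small arithmetic whitelist, and only then evaluates the vetted tree. It does not guard against expensive inputs such as 9**9**9 (this sketch simply leaves ast.Pow out of the whitelist).

```python
import ast

# Node types this sketch accepts; names, calls, attributes, etc. are rejected.
ALLOWED_NODES = (
    ast.Expression, ast.BinOp, ast.UnaryOp, ast.Constant,
    ast.Add, ast.Sub, ast.Mult, ast.Div, ast.USub, ast.UAdd,
)

def checked_eval(source):
    """Evaluate source only if its AST contains whitelisted arithmetic nodes."""
    tree = ast.parse(source, mode="eval")
    for node in ast.walk(tree):
        if not isinstance(node, ALLOWED_NODES):
            raise ValueError(f"disallowed syntax: {type(node).__name__}")
        if isinstance(node, ast.Constant) and not isinstance(node.value, (int, float)):
            raise ValueError("only numeric constants are allowed")
    # The tree was vetted above, so it contains no names or calls to resolve.
    return eval(compile(tree, "<expr>", "eval"), {"__builtins__": {}}, {})

print(checked_eval("-1*2"))
```

Anything like __import__('os') fails the check because it parses to a Call node, which is not in the whitelist.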
You can read about Reverse Polish notation: http://en.wikipedia.org/wiki/Reverse_Polish_notation
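To show why Reverse Polish notation is attractive here, a rough sketch of a stack-based evaluator follows. The helper name eval_rpn and the whitespace-separated token format are my own assumptions, not something from the question. Operands are pushed; an operator pops two values and pushes the result, so no precedence rules or parentheses are needed.

```python
import operator

# Map each operator token to the function that implements it.
OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}

def eval_rpn(tokens):
    """Evaluate a sequence of RPN tokens such as ['3', '4', '2', '*', '+']."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()   # the most recent operand is the right-hand side
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(float(tok))
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]

print(eval_rpn("3 4 2 * +".split()))  # 3 + (4 * 2) -> 11.0
```

An operator with too few operands shows up as an IndexError from stack.pop(); a real implementation would probably turn that into a clearer error.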
Use eval like everyone else is saying but filter it first.
>>> s = '1 + 12 / 2 - 12*31'
>>> allowed = set(' 1234567890()+-/*\\')
>>> if allowed.issuperset(s):
... eval(s)
...
-365
Use eval:
> eval(str('-1*2'))
> -2
Dett,
A simple eval on the whole string will do. However, be aware that if the user supplies the string, eval is risky unless you validate the input first.
x = '-1*2'
y = eval(x)
y will then be the integer value.
When creating grammar rules for a language I am making, I would like to be able to check syntax and step through it instead of the current method which often will miss syntax errors.
I've started off using regular expressions to define the grammar like so:
add = r"(\+)"
sub = r"(-)"
mul = r"(\*)"
div = r"(/)"
pow = r"(\^)"
bin_op = fr"({add}|{sub}|{mul}|{div}|{pow})"
open_br = r"(\()"
close_br = r"(\))"
open_sq = r"(\[)"
close_sq = r"(\])"
dot = r"(\.)"
short_id = r"([A-Za-z]\d*)" # i.e. "a1", "b1232", etc.
long_id = r"([A-Za-z0-9]+)" # i.e. "sin2", "tan", etc. for use in assignment
long_id_ref = r"('" + long_id + "')" # i.e. "'sin'", for referencing
#note that "'sin'" is one value while "sin" = "s" * "i" * "n"
id_assign = fr"({short_id}|{long_id})" # for assignment
id_ref = fr"({short_id}|{long_id_ref})" # for reference (apostrophes)
integer = r"(\d+)" # i.e 123
float = fr"(\d+{dot}\d+)" # i.e. 3.4
operand = fr"({integer}|{float}|{id_ref})"
Now the issue here is that definitions may be recursive, for example in expression = fr"{expression}{bin_op}{expression}|({open_br}{expression}{close_br})|({expression}{open_sq}{expression}{close_sq})|..." and as you can see, I have shown some possible expressions that would be recursive. The issue is, of course, that expression is not defined when defining expression therefore an error would be raised.
It seems that (?R) would not work since it would copy everything before it not the whole string. Does Python's regex have a way of dealing with this or do I have to create my own BNF or regex interpreter that supports recursion?
Alternatively would it be feasible to use regular expressions but not use any recursion?
I know that there are 3rd-party applications that can help with this but I'd like to be able to do it all myself without external code.
I want to code a unit converter and I need to extract the given value from the unit in the input string.
To provide a user friendly experience while using the converter I want the user to be able to input the value and the unit in the same string. My problem is that I want to extract the numbers and the letters so that I can tell the program the unit and the value and store them in two different variables. For extracting the letters, I used the in operator, and that works properly. I also found a solution for getting the numbers from the input, but that doesn't work for values with exponents.
a = str(input("Type in your wavelength: "))
if "mm" in a:
    print("Unit = Millimeter")
b = float(a.split()[0])
Storing simple inputs like 567 mm as a float in b works but I want to be able to extract inputs like 5*10**6 mm but it says
could not convert string to float: '5*10**6'.
So what can I use to extract more complex numbers like this into a float?
Traditionally, in Python, as in many other languages, exponents are prefixed by the letter e or E. While 5 * 10**6 is not a valid floating point literal, 5e6 most definitely is.
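As a quick illustration of the difference (parse_number is a hypothetical helper, not part of the question's code), float() accepts the e-notation form directly but rejects the arithmetic form:

```python
def parse_number(text):
    """Return text as a float, or None if it is not a plain float literal."""
    try:
        return float(text)
    except ValueError:
        return None

print(parse_number("5e6"))      # 5000000.0
print(parse_number("5E-8"))     # 5e-08
print(parse_number("5*10**6"))  # None
```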
This is something to keep in mind for the future, but it won't solve your issue with the in operator. The problem is that in can only check if something you already know is there. What if your input was 5e-8 km instead?
You should start by coming up with an unambiguously clear definition of how you identify the boundary between number and units in a string. For example, units could be the last contiguous bit of non-digit characters in your string.
You could then split the string using regular expressions. If the first part is a plain numeric literal such as 5e6, you can evaluate it with something as simple as ast.literal_eval. Note that literal_eval accepts only literals, so arithmetic such as 5 * 10**6 needs a real expression evaluator. The more complicated your expression can be, the more complicated your parser will have to be as well.
Here's an example to get you started:
from ast import literal_eval
import re
pattern = re.compile(r'(.*[\d\.])\s*(\D+)')
data = '5e6 mm'  # a literal; literal_eval would reject '5 * 10**6'
match = pattern.fullmatch(data)
if not match:
    raise ValueError('Invalid Expression')
num, units = match.groups()
num = literal_eval(num)
It seems that you are looking for the eval function, as noted in @Rasgel's answer. Documentation here
As some people have pointed out, it poses a big security risk.
To circumvent this, I can think of 2 ways:
1. Combine eval with regex
If you only want to do basic arithmetic operations like addition, subtraction and maybe 2**4 or something like that, then you can use a regex to first remove any characters that are neither digits nor arithmetic operators.
import re
a = str(input("Type in your wavelength: "))
if "mm" in a:
    print("Unit = Millimeter")
# After parsing the units,
# Remove anything other than digits, +, -, *, /, . (floats), ! (factorial?) and ()
# If you require any other symbols, add them in
pruned_a = re.sub(r'[^0-9\*\+\-\/\!\.\(\)]', "", a)
result = eval(pruned_a)
2. Make sure eval doesn't actually evaluate any of your local or global variables in your python code.
result = eval(expression, {'__builtins__': None}, {})
(the above code is from another Stackoverflow answer here: Math Expression Evaluation -- there might be other solutions there that you might be interested in)
Combined
import re
a = str(input("Type in your wavelength: "))
if "mm" in a:
    print("Unit = Millimeter")
# After parsing the units,
# Remove anything other than digits, +, -, *, /, . (floats), ! (factorial?) and ()
# If you require any other symbols, add them in
pruned_a = re.sub(r'[^0-9\*\+\-\/\!\.\(\)]', "", a)
result = eval(pruned_a, {'__builtins__': None}, {}) #to be extra safe :)
There are many ways to tackle this simple problem: str.split, regular expressions, eval, ast.literal_eval... Here I propose writing your own safe routine that evaluates simple mathematical expressions; code below:
import ast
import operator
def safe_eval(s):
    # Map AST operator node types to the functions that implement them.
    bin_ops = {
        ast.Add: operator.add,
        ast.Sub: operator.sub,
        ast.Mult: operator.mul,
        ast.Div: operator.truediv,
        ast.Mod: operator.mod,
        ast.Pow: operator.pow
    }
    tree = ast.parse(s, mode='eval')
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        elif isinstance(node, ast.Constant):
            # Numbers and strings (ast.Num and ast.Str are deprecated since Python 3.8)
            return node.value
        elif isinstance(node, ast.BinOp):
            return bin_ops[type(node.op)](_eval(node.left), _eval(node.right))
        else:
            raise Exception('Unsupported type {}'.format(node))
    return _eval(tree.body)
if __name__ == '__main__':
    text = input("Type in your wavelength: ")
    tokens = text.split()
    if len(tokens) < 2:
        raise Exception("expected input: <wavelength expression> <unit>")
    # Everything before the last token is the expression; the last token is the unit.
    wavelength = safe_eval("".join(tokens[:-1]))
    dtype = tokens[-1]
    print(f"You've typed {wavelength} in {dtype}")
I'll also recommend you read this post Why is using 'eval' a bad practice?
In case you have a string like 5*106 and want to convert this number into a float, you can use the eval() function.
>>> float(eval('5*106'))
530.0
I would like to output a user input expression to a string.
The reason is that the input expression is user defined. I want to output the result of the expression, and print the statement which lead to this result.
import sys
import shutil
expression1 = sys.path
expression2 = shutil.which
def get_expression_str(expression):
    if callable(expression):
        return expression.__module__ +'.'+ expression.__name__
    else:
        raise TypeError('Could not convert expression to string')
#print(get_expression_str(expression1))
# returns : builtins.TypeError: Could not convert expression to string
#print(get_expression_str(expression2))
# returns : shutil.which
#print(str(expression1))
#results in a list like ['/home/bernard/clones/it-should-work/unit_test', ... ,'/usr/lib/python3/dist-packages']
#print(repr(expression1))
#results in a list like ['/home/bernard/clones/it-should-work/unit_test', ... ,'/usr/lib/python3/dist-packages']
I looked into the Python inspect module but even
inspect.iscode(sys.path)
returns False
For those who wonder why it is the reverse of a string parsed to an expression using functools.partial see parse statement string
Background.
A program should work. It should, but it does not always, because a program needs specific resources: an OS, an OS version, other packages, files, etc. Every program has different requirements (resources) to function properly.
Which specific requirements are needed cannot be predicted. The system knows best which resources are and are not available. So instead of manually checking all settings and configurations, let a helper program do this for you.
So the user, or the developer of a program, specifies the requirements together with statements describing how to retrieve this information: expressions. These could be executed using eval. Could. As mentioned on StackOverflow, eval is evil.
Use of eval is hard to make secure using a blacklist, see : http://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html
Using multiple tips of SO I use a namedtuple, with a string, to compare with the user input string, and a function.
A whitelist is better than a blacklist. An expression is returned only if the parsed expression string matches a "bare_expression".
This whitelist contains more information on how to process, for example, the "unit_of_measurement". It goes too far to explain what and why, but this is needed. The list of namedtuples is much more than just a whitelist and is defined as:
Expr_UOfM = collections.namedtuple('Expr_UOfM', ['bare_expression', 'keylist', 'function', 'unit_of_measurement', 'attrlist'])
The namedtuple which match a (very limited) list:
Exp_list = [Expr_UOfM('sys.path', '' , sys.path, un.STR, []),
Expr_UOfM('shutil.which', '', shutil.which, None, [])]
This list may be very long and its content is crucial for correct further processing. Note that the first and third fields are very similar. There should be a single point of reference, but for me this is not possible at the moment. Note that the string 'sys.path' is equal to (a part of) the user input, and the expression sys.path is part of the namedtuple list. This is a good separation that limits possible abuse.
If the string and the expression are not 100% identical, weird behavior may occur which is very hard to debug.
So I want to use the get_expression_str function to check whether the first and third fields are identical, just for total robustness of the program.
I use Python 3.4
You could use inspect.getsource() and wrap your expression in a lambda. Then you can get an expression with this function:
import inspect
def lambda_to_expr_str(lambda_fn):
    """c.f. https://stackoverflow.com/a/52615415/134077"""
    if not lambda_fn.__name__ == "<lambda>":
        raise ValueError('Tried to convert non-lambda expression to string')
    else:
        lambda_str = inspect.getsource(lambda_fn).strip()
        # Everything after the first colon is taken to be the lambda body.
        expression_start = lambda_str.index(':') + 1
        expression_str = lambda_str[expression_start:].strip()
        if expression_str.endswith(')') and '(' not in expression_str:
            # i.e. l = lambda_to_expr_str(lambda x: x + 1) => x + 1)
            expression_str = expression_str[:-1]
        return expression_str
Usage:
$ lambda_to_expr_str(lambda: sys.executable)
> 'sys.executable'
OR
$ f = lambda: sys.executable
$ lambda_to_expr_str(f)
> 'sys.executable'
And then eval with
$ eval(lambda_to_expr_str(lambda: sys.executable))
> '/usr/bin/python3.5'
Note that you can take parameters with this approach and pass them with the locals param of eval.
$ l = lambda_to_expr_str(lambda x: x + 1) # now l == 'x + 1'
$ eval(l, None, {'x': 1})
> 2
Here be Dragons. There are many pathological cases with this approach:
$ l, z = lambda_to_expr_str(lambda x: x + 1), 1234
$ l
> 'x + 1), 1234'
This is because inspect.getsource gets the entire line of code the lambda was declared on. Getting the source of functions declared with def would avoid this problem; however, passing a function body to eval is not possible as there could be side effects, e.g. setting variables. Lambdas can produce side effects as well in Python 2, so even more dragons lie in pre-Python-3 land.
Why not use eval?
>>> exp1 = "sys.path"
>>> exp2 = "[x*x for x in [1,2,3]]"
>>> eval(exp1)
['', 'C:\\Python27\\lib\\site-packages\\setuptools-0.6c11-py2.7.egg', 'C:\\Pytho
n27\\lib\\site-packages\\pip-1.1-py2.7.egg', 'C:\\Python27\\lib\\site-packages\\
django_celery-3.1.1-py2.7.egg', 'C:\\Python27\\lib\\site-packages\\south-0.8.4-p
y2.7.egg', 'C:\\Windows\\system32\\python27.zip', 'C:\\Python27\\DLLs', 'C:\\Pyt
hon27\\lib', 'C:\\Python27\\lib\\plat-win', 'C:\\Python27\\lib\\lib-tk', 'C:\\Py
thon27', 'C:\\Python27\\lib\\site-packages', 'C:\\Python27\\lib\\site-packages\\
PIL']
>>> eval(exp2)
[1, 4, 9]
I have a situation with some code where eval() came up as a possible solution. Now I have never had to use eval() before but, I have come across plenty of information about the potential danger it can cause. That said, I'm very wary about using it.
My situation is that I have input being given by a user:
datamap = input('Provide some data here: ')
Where datamap needs to be a dictionary. I searched around and found that eval() could work this out. I thought that I might be able to check the type of the input before trying to use the data and that would be a viable security precaution.
datamap = eval(input('Provide some data here: '))
if not isinstance(datamap, dict):
    return
I read through the docs and I am still unclear whether this would be safe or not. Does eval evaluate the data as soon as it's entered, or only when the datamap variable is used?
Is the ast module's .literal_eval() the only safe option?
datamap = eval(input('Provide some data here: ')) means that you actually evaluate the code before you deem it to be unsafe or not. It evaluates the code as soon as the function is called. See also the dangers of eval.
ast.literal_eval raises an exception if the input isn't a valid Python datatype, so the code won't be executed if it's not.
Use ast.literal_eval whenever you need eval. You shouldn't usually evaluate literal Python statements.
ast.literal_eval() only considers a small subset of Python's syntax to be valid:
The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None.
Passing __import__('os').system('rm -rf /a-path-you-really-care-about') into ast.literal_eval() will raise an error, but eval() will happily delete your files.
Since it looks like you're only letting the user input a plain dictionary, use ast.literal_eval(). It safely does what you want and nothing more.
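For the dictionary case in the question, that might look roughly like the sketch below. read_dict is a hypothetical helper name, and returning None on bad input is just one possible policy.

```python
import ast

def read_dict(text):
    """Parse text as a Python literal and return it only if it is a dict."""
    try:
        value = ast.literal_eval(text)
    except (ValueError, SyntaxError):
        # Not a literal at all (e.g. a function call) or not valid syntax.
        return None
    return value if isinstance(value, dict) else None

print(read_dict("{'a': 2, 'b': 3}"))
print(read_dict("__import__('os').system('echo hi')"))
```

Unlike the eval version, the isinstance check here runs on a value that was never executed as code.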
eval:
This is very powerful, but is also very dangerous if you accept strings to evaluate from untrusted input. Suppose the string being evaluated is "os.system('rm -rf /')" ? It will really start deleting all the files on your computer.
ast.literal_eval:
Safely evaluate an expression node or a string containing a Python literal or container display. The string or node provided may only consist of the following Python literal structures: strings, bytes, numbers, tuples, lists, dicts, sets, booleans, and None.
Syntax:
eval(expression, globals=None, locals=None)
import ast
ast.literal_eval(node_or_string)
Example:
# python 2.x - doesn't accept operators in string format
import ast
ast.literal_eval('[1, 2, 3]') # output: [1, 2, 3]
ast.literal_eval('1+1') # output: ValueError: malformed string
# python 3.0 -3.6
import ast
ast.literal_eval("1+1") # output : 2
ast.literal_eval("{'a': 2, 'b': 3, 3:'xyz'}") # output : {'a': 2, 'b': 3, 3:'xyz'}
# type dictionary
ast.literal_eval("",{}) # output : TypeError, literal_eval() takes only one argument
ast.literal_eval("__import__('os').system('rm -rf /')") # output : error
eval("__import__('os').system('rm -rf /')")
# output : start deleting all the files on your computer.
# restricting using global and local variables
eval("__import__('os').system('rm -rf /')",{'__builtins__':{}},{})
# output : Error due to blocked imports by passing '__builtins__':{} in global
# But still eval is not safe. we can access and break the code as given below
s = """
(lambda fc=(
lambda n: [
c for c in
().__class__.__bases__[0].__subclasses__()
if c.__name__ == n
][0]
):
fc("function")(
fc("code")(
0,0,0,0,"KABOOM",(),(),(),"","",0,""
),{}
)()
)()
"""
eval(s, {'__builtins__':{}})
In the above code, ().__class__.__bases__[0] is nothing but object itself.
We then list all the subclasses of object; our main objective is to find the one class named n among them.
We need the code object and the function object from those subclasses. This is a CPython-specific way to reach the subclasses of object and attack the system.
From Python 3.7, ast.literal_eval() is stricter: addition and subtraction of arbitrary numbers are no longer allowed.
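A small probe of that behaviour, assuming Python 3.7 or later (literal_eval_accepts is a made-up helper). On such versions I would expect signed numbers and complex literals like 1+2j to be accepted, while 1+1 and 2*3 are rejected with ValueError.

```python
import ast

def literal_eval_accepts(text):
    """Return True if ast.literal_eval can evaluate text, False if it rejects it."""
    try:
        ast.literal_eval(text)
    except (ValueError, SyntaxError):
        return False
    return True

for text in ["-1", "1+2j", "1+1", "2*3"]:
    print(text, literal_eval_accepts(text))
```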
Python's eager in its evaluation, so eval(input(...)) (Python 3) will evaluate the user's input as soon as it hits the eval, regardless of what you do with the data afterwards. Therefore, this is not safe, especially when you eval user input.
Use ast.literal_eval.
As an example, entering this at the prompt could be very bad for you:
__import__('os').system('rm -rf /a-path-you-really-care-about')
In recent Python 3 versions ast.literal_eval() no longer evaluates simple arithmetic strings such as "1+1"; instead you are supposed to use ast.parse() to create an AST and then interpret it.
This is a complete example of using ast.parse() correctly in Python 3.6+ to evaluate simple arithmetic expressions safely.
import ast, operator, math
import logging
logger = logging.getLogger(__file__)
def safe_eval(s):
    def checkmath(x, *args):
        # Only public names from the math module may be called.
        if x not in [name for name in dir(math) if "__" not in name]:
            raise SyntaxError(f"Unknown func {x}()")
        fun = getattr(math, x)
        return fun(*args)
    binOps = {
        ast.Add: operator.add,
        ast.Sub: operator.sub,
        ast.Mult: operator.mul,
        ast.Div: operator.truediv,
        ast.Mod: operator.mod,
        ast.Pow: operator.pow,
        ast.Call: checkmath,
        ast.BinOp: ast.BinOp,
    }
    unOps = {
        ast.USub: operator.neg,
        ast.UAdd: operator.pos,
        ast.UnaryOp: ast.UnaryOp,
    }
    ops = tuple(binOps) + tuple(unOps)
    tree = ast.parse(s, mode='eval')
    def _eval(node):
        if isinstance(node, ast.Expression):
            logger.debug("Expr")
            return _eval(node.body)
        elif isinstance(node, ast.Constant):
            # Numbers and strings (ast.Num and ast.Str are deprecated since Python 3.8)
            logger.info("Const")
            return node.value
        elif isinstance(node, ast.BinOp):
            logger.debug("BinOp")
            if isinstance(node.left, ops):
                left = _eval(node.left)
            else:
                left = node.left.value
            if isinstance(node.right, ops):
                right = _eval(node.right)
            else:
                right = node.right.value
            return binOps[type(node.op)](left, right)
        elif isinstance(node, ast.UnaryOp):
            logger.debug("UpOp")
            if isinstance(node.operand, ops):
                operand = _eval(node.operand)
            else:
                operand = node.operand.value
            return unOps[type(node.op)](operand)
        elif isinstance(node, ast.Call):
            args = [_eval(x) for x in node.args]
            r = checkmath(node.func.id, *args)
            return r
        else:
            raise SyntaxError(f"Bad syntax, {type(node)}")
    return _eval(tree)
if __name__ == "__main__":
    logger.setLevel(logging.DEBUG)
    ch = logging.StreamHandler()
    logger.addHandler(ch)
    assert safe_eval("1+1") == 2
    assert safe_eval("1+-5") == -4
    assert safe_eval("-1") == -1
    assert safe_eval("-+1") == -1
    assert safe_eval("(100*10)+6") == 1006
    assert safe_eval("100*(10+6)") == 1600
    assert safe_eval("2**4") == 2**4
    assert safe_eval("sqrt(16)+1") == math.sqrt(16) + 1
    assert safe_eval("1.2345 * 10") == 1.2345 * 10
    print("Tests pass")
If all you need is a user-provided dictionary, a possibly better solution is json.loads. The main limitation is that JSON dicts ("objects") require string keys. Also, you can only provide literal data, but that is also the case for ast.literal_eval.
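A rough sketch of the json.loads route (read_json_object is a hypothetical name; returning None on bad input is just one option). Note that the input must be valid JSON, so keys and strings need double quotes.

```python
import json

def read_json_object(text):
    """Parse text as JSON and return it only if the top-level value is an object."""
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        return None
    return value if isinstance(value, dict) else None

print(read_json_object('{"a": 1, "b": [2, 3]}'))
print(read_json_object("{'a': 1}"))  # single quotes are not valid JSON
```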
In Z3Py, I need to check if something is a term using the standard grammar term := const | var | f(t1,...,tn). I have written the following function to determine that, but my way of checking the n-ary function case doesn't seem optimal.
Is there a better way to do so? These utility functions is_term, is_atom, is_literal, etc would be useful to be included in Z3. I will put them in the contrib section
CONNECTIVE_OPS = [Z3_OP_NOT,Z3_OP_AND,Z3_OP_OR,Z3_OP_IMPLIES,Z3_OP_IFF,Z3_OP_ITE]
REL_OPS = [Z3_OP_EQ,Z3_OP_LE,Z3_OP_LT,Z3_OP_GE,Z3_OP_GT]
def is_term(a):
    """
    term := const | var | f(t1,...,tn)
    """
    if is_const(a):
        return True
    else:
        r = (is_app(a) and \
             a.decl().kind() not in CONNECTIVE_OPS + REL_OPS and \
             all(is_term(c) for c in a.children()))
        return r
The function is reasonable, a few comments:
It depends on what you mean by "var" in your specification. Z3 represents variables as de Bruijn indices. There is a function in z3py, "is_var(a)", to check if "a" is a variable index.
There is another Boolean connective Z3_OP_XOR.
There are additional relational operations, such as operations that compare bit-vectors.
It depends on your intent and usage of the code, but you could alternatively check if the
sort of the expression is Boolean, and if it is ensure that the head function symbol is
uninterpreted.
is_const(a) is defined as return is_app(a) and a.num_args() == 0. So is_const is really handled by the default case.
Expressions that Z3 creates as a result of simplification, parsing or other transformations may have many shared sub-expressions. So a straight-forward recursive descent can take exponential time in the DAG size of the expression. You can deal with this by maintaining a hash table of visited nodes. From Python you can use Z3_get_ast_id to retrieve a unique number for the expression and maintain this in a set. The identifiers are unique as long as terms are not garbage collected, so
you should just maintain such a set as a local variable.
So, something along the lines of:
def get_expr_id(e):
    return Z3_get_ast_id(e.ctx.ref(), e.ast)
def is_term_aux(a, seen):
    if get_expr_id(a) in seen:
        return True
    else:
        seen[get_expr_id(a)] = True
        r = (is_app(a) and \
             a.decl().kind() not in CONNECTIVE_OPS + REL_OPS and \
             all(is_term_aux(c, seen) for c in a.children()))
        return r
def is_term(a):
    return is_term_aux(a, {})
The "text book" definitions of term, atom and literal used in first-order logic cannot be directly applied to Z3 expressions. In Z3, we allow expressions such as f(And(a, b)) > 0 and f(ForAll([x], g(x) == 0)), where f is a function from Boolean to Integer. These extensions do not increase the expressivity, but they are very convenient when writing problems. The SMT 2.0 standard also allows "term" if-then-else expressions. This is another feature that allows us to nest "formulas" inside "terms". Example: g(If(And(a, b), 1, 0)).
When implementing procedures that manipulate Z3 expressions, we sometimes need to distinguish between Boolean and non-Boolean expressions. In this case, a "term" is just an expression that does not have Boolean sort.
def is_term(a):
    return not is_bool(a)
In other instances, we want to process the Boolean connectives (And, Or, ...) in a special way. For example, we are defining a CNF translator. In this case, we define an "atom" as any Boolean expression that is not a quantifier, is a (free) variable or an application that is not one of the Boolean connectives.
def is_atom(a):
    return is_bool(a) and (is_var(a) or (is_app(a) and a.decl().kind() not in CONNECTIVE_OPS))
After we define an atom, a literal can be defined as:
def is_literal(a):
    return is_atom(a) or (is_not(a) and is_atom(a.arg(0)))
Here is an example that demonstrates these functions (also available online at rise4fun):
x = Int('x')
p, q = Bools('p q')
f = Function('f', IntSort(), BoolSort())
g = Function('g', IntSort(), IntSort())
print(is_literal(Not(x > 0)))
print(is_literal(f(x)))
print(is_atom(Not(x > 0)))
print(is_atom(f(x)))
print(is_atom(x))
print(is_term(f(x)))
print(is_term(g(x)))
print(is_term(x))
print(is_term(Var(1, IntSort())))