I have a text file (an output of a different process that I can't alter) which contains logical comparisons (only these three: >, <=, in) stored as strings. Let's say this is a line in my file, which should be evaluated:
myStr = "x>2 and y<=30 and z in ('def', 'abc')"
Some of my variables are categorical and I specify them, and the rest are numerical:
categoricalVars = ('z',)  # trailing comma makes this a tuple; ('z') is just the string 'z'
The values of my variables are stored in a dictionary, let's assume these are their values. Note that they always come in as strings, even for numeric variables:
x, y, z = '5', '6', 'abc'
So my question is how I can safely evaluate (i.e. without using eval()) the truth of myStr in reference to this last line.
What I have done is: First change myStr to reflect the data types:
import re
delim = r"(>|<=| in )" # Put in group to find later which delimiter is used
def pyRules(s):
    varName = re.split(delim, s)[0]
    rest = "".join(re.split(delim, s)[1:])
    if varName in categoricalVars:
        return varName + rest
    else:
        return "float(" + varName + ")" + rest
# Call:
[pyRules(e) for e in myStr.split(' and ')]
# Result:
['float(x)>2', 'float(y)<=30', "z in ('def', 'abc')"]
Now I can easily do:
[eval(pyRules(e)) for e in myStr.split(' and ')]
# Result:
[True, True, True]
But I want to avoid this. I tried ast.literal_eval() but got the following error:
import ast
[ast.literal_eval(pyRules(e)) for e in myStr.split(' and ')]
# Result:
Traceback (most recent call last):
File "<ipython-input-556-dae16951de03>", line 1, in <module>
ast.literal_eval(ast.parse(conds[0]))
File "C:\ProgramData\Anaconda2\lib\ast.py", line 80, in literal_eval
return _convert(node_or_string)
File "C:\ProgramData\Anaconda2\lib\ast.py", line 79, in _convert
raise ValueError('malformed string')
ValueError: malformed string
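For context, ast.literal_eval() only accepts Python literals (strings, numbers, tuples, lists, dicts, sets, booleans, None), so a call or a comparison such as float(x)>2 is rejected before anything is evaluated. A minimal sketch of that behaviour; is_literal is a hypothetical helper name used only for illustration, and the exact error message differs between Python versions:

```python
import ast

def is_literal(text):
    """Return True if ast.literal_eval() accepts text (hypothetical helper)."""
    try:
        ast.literal_eval(text)
        return True
    except (ValueError, SyntaxError):
        return False

print(is_literal("('def', 'abc')"))  # a tuple literal
print(is_literal("float(5)>2"))      # a call plus a comparison, not a literal
```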
Next, I tried the following approach, which almost gave me the right answer:
def pyRules(s):
    varName = re.split(delim, s)[0]
    operation = "".join(re.split(delim, s)[1:])
    if varName in categoricalVars:
        return "'{" + varName + "}'" + operation
    else:
        return "float({" + varName + "})" + operation
rules = [pyRules(e).format(x='5',y='6',z='abc') for e in myStr.split(' and ')]
# rules is:
['float(5)>2', 'float(6)<=30', "'abc' in ('def', 'abc')"]
I can again use the eval() on this and get [True, True, True] but to avoid it I defined my own inequality checker function:
def check(x):
    first, operation, second = re.split(delim, x)
    if operation == ">":
        return first > second
    elif operation == "<=":
        return first <= second
    elif operation == " in ":
        return first in second
# Call:
[check(pyRules(e).format(x='5',y='6',z='abc')) for e in myStr.split(' and ')]
# Result:
[True, False, True]
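It fails on the second item, 'float(6)<=30'. The cause is that check() compares the two sides as strings, so 'float(6)' <= '30' is a lexicographic comparison, not a numeric one. I also recreated this function using the operator module per this SO thread, which is essentially the same thing, and got the same result.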
I checked pyparsing, couldn't get it to work (which even looks scary, look at this!), and SymPy but unfortunately it also uses eval frequently, as documented in the hyperlink I provided.
Question 2: Is it okay to use eval given that I am 100% sure that I don't have any crazy string that can interfere with os and erase my disk and other crazy stuff like that?
Note: This is a piece of a big code that I built in Python 2, so Python 2 based answers would be ideal; but I can move to Python 3 if anybody thinks my answer is in that sphere.
After a few hours of work, I figured out a way to get ast.literal_eval() to work! My logic is to look at the two sides of, say, x>2, i.e. x and 2, make sure these are safe by evaluating both with literal_eval, and then run it through my check() function, which does the evaluation. Same for z in ('def', 'abc'): First make sure both z and ('def', 'abc') are safe, then do the actual boolean checking with the check() function.
Since I fully trusted my inputs I could've gone the easier eval() way, but I wanted to be doubly cautious. I also wanted to build some code for anybody out there who has security concerns (user inputs etc.) and needs to safely evaluate logical expressions. Hope it helps somebody!
Please see my full code below, any comments/recommendations are welcome.
import re
import ast
myStr = "x>2 and y<=30 and z in ('def', 'abc')"
categoricalVars = ('z',)
x, y, z = '5', '6', 'abc'
delim = r"(>|<=| in )" # Put in group to find in the func check() which delimiter is used
def pyRules(s):
    """
    Place {} around variable names so that we can str.format() in the func check()
    """
    varName = re.split(delim, s)[0]
    rest = "".join(re.split(delim, s)[1:])
    return "'{" + varName + "}'" + rest
def check(x):
    """
    If operation is > or <= then it is a numeric var, use double literal_eval to
    parse numbers e.g. "'5'" (dual quotes) to 5. This is similar to:
    float(ast.literal_eval(first)). Else it is categorical, just literal_eval once
    """
    first, operation, second = re.split(delim, x)
    if operation == ">":
        return ast.literal_eval(ast.literal_eval(first)) > ast.literal_eval(second)
    elif operation == "<=":
        return ast.literal_eval(ast.literal_eval(first)) <= ast.literal_eval(second)
    elif operation == " in ":
        return ast.literal_eval(first) in ast.literal_eval(second)
# These are my raw rules:
print [pyRules(e) for e in myStr.split(' and ')]
# These are my processed rules:
print [pyRules(e).format(x='5',y='6',z='abc') for e in myStr.split(' and ')]
# And these are my final results of logical evaluation:
print [check(pyRules(e).format(x='5',y='6',z='abc')) for e in myStr.split(' and ')]
Results of the three result lines:
["'{x}'>2", "'{y}'<=30", "'{z}' in ('def', 'abc')"]
["'5'>2", "'6'<=30", "'abc' in ('def', 'abc')"]
[True, True, True]
Thanks!
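As a variation on the approach above, the comparison can also be dispatched through the operator module, with ast.literal_eval() applied only to the right-hand side. This is a rough sketch under stated assumptions, not a drop-in replacement: evaluate, OPS and RULE are hypothetical names, the values are assumed to arrive as a dict of strings, and clauses are assumed to be joined only by ' and '.

```python
import ast
import operator
import re

# Hypothetical dispatch table: one callable per supported comparison.
OPS = {">": operator.gt, "<=": operator.le, " in ": lambda a, b: a in b}

# variable name, one of the three operators, then the literal on the right
RULE = re.compile(r"^\s*(\w+)\s*(>|<=| in )\s*(.+?)\s*$")

def evaluate(rule_string, values, categorical=()):
    """Return True if every ' and '-joined clause holds for values."""
    results = []
    for clause in rule_string.split(" and "):
        match = RULE.match(clause)
        if match is None:
            raise ValueError("unsupported clause: %r" % clause)
        name, op, rhs = match.groups()
        left = values[name]
        if name not in categorical:
            left = float(left)          # numeric variables arrive as strings
        right = ast.literal_eval(rhs)   # only literals are accepted here
        results.append(OPS[op](left, right))
    return all(results)

rule = "x>2 and y<=30 and z in ('def', 'abc')"
print(evaluate(rule, {"x": "5", "y": "6", "z": "abc"}, categorical=("z",)))
```

Because the left-hand side is looked up in a dict and never evaluated, nothing from the rule string is executed as code.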
I want to write a summation function, but can't figure out how I would parse the bottom expression and right expression.
def summation(count: int, bottom_var: str, expression: str):
    out = 0
    # parse bottom_var into a variable and its value
    value = ···
    var_name = ···
    expression.replace(var_name, value)
    ···
I want you to be able to use the inputs the same way as in normal sigma notation, as in bottom_var is 'n=13', not '13'.
You enter an assignment for bottom_var, and an expression using the variable defined in bottom_var in expression.
Example:
summation(4, 'x=1', 'x+1')
(would return 14, as 2+3+4+5=14)
First, parse the bottom_var to get the symbol and starting value:
var, value = bottom_var.split('=')
var = var.strip()
value = eval(value)
Using split we get the two parts of the equal sign easily. Using strip we even allow any number of spaces, like x = 1.
Then, we want to create a loop from the starting value to count:
for i in range(value, count+1):
    ...
Lastly, we use the loop to sum the expression, each time replacing the symbol with the current iteration's value. All in all:
def summation(count: int, bottom_var: str, expression: str):
    var, value = bottom_var.split('=')
    var = var.strip()
    value = eval(value)
    res = 0
    for i in range(value, count+1):
        res += eval(expression.replace(var, str(i)))
    return res
For example:
>>> summation(4, 'x=1', 'x+1')
14
Proposing the code in this answer, I feel the need to ask you to read about Why is using 'eval' a bad practice? and please make sure that it is OK for your application. Notice that depending on the context of the use of your code, using eval can be quite dangerous and lead to bad outcomes.
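If eval is not acceptable for your application, one alternative is to parse the expression with the ast module and walk the tree yourself, allowing only numbers, the summation variable and basic arithmetic. A minimal sketch, assuming Python 3.8+ (for ast.Constant) and an integer start value; summation_safe and _evaluate are hypothetical names, not part of the answer above:

```python
import ast
import operator

# Only these binary operators are allowed in the expression.
_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
}

def _evaluate(node, env):
    """Evaluate a parsed arithmetic expression, rejecting everything else."""
    if isinstance(node, ast.Expression):
        return _evaluate(node.body, env)
    if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
        return node.value
    if isinstance(node, ast.Name) and node.id in env:
        return env[node.id]
    if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
        left = _evaluate(node.left, env)
        right = _evaluate(node.right, env)
        return _BIN_OPS[type(node.op)](left, right)
    if isinstance(node, ast.UnaryOp) and isinstance(node.op, ast.USub):
        return -_evaluate(node.operand, env)
    raise ValueError("unsupported expression")

def summation_safe(count, bottom_var, expression):
    var, start = bottom_var.split("=")
    var = var.strip()
    start = int(start)  # assumption: the lower bound is a plain integer
    tree = ast.parse(expression.strip(), mode="eval")
    return sum(_evaluate(tree, {var: i}) for i in range(start, count + 1))

print(summation_safe(4, "x=1", "x+1"))
```

Anything outside the whitelist, such as a function call or attribute access, raises ValueError instead of being executed.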
This is how I did it:
def summation(count,bottom_var,expression):
    begin = False
    x = ""
    v = ""
    for char in bottom_var:
        if begin:
            x += char
        if char == "=":
            begin = True
        if begin == False:
            v += char
    x = int(x)
    expression = expression.replace(v,str("x"))
    print(expression)
    for n in range(count):
        x = eval(expression)
summation(4,"d=152",'d+145*2')
Python has built-in functions that execute code given as text (a str): exec() and eval().
For example:
>>> exec('n=13')
>>> print(n)
13
>>> eval('n+1')
14
You can use these in your code.
I'm trying to create a function which will solve for some numeric computation – which is given as a string.
Example:
def calculate(expression):
    # Solve the expression below
    return
# Result should be 19
calculate("5+8-3+9")
I have tried using .split() but got stuck.
For a problem like this, we can write a small string calculator:
'''
Calculates a string of digits with addition and subtraction only.
This calculator keeps a running total and the sign of the current number.
'''
import re # regular expression library, used to strip unwanted characters
def calculate(s: str) -> int:
    # remove letters and whitespace before parsing
    s = re.sub(r'[A-Za-z\s]+', '', s)
    res = 0
    num = 0
    sign = 1
    for ss in s:
        # checks if each element is a digit
        if ss.isdigit():
            num = 10 * num + int(ss)
        # if not a digit, checks if its + or - sign
        elif ss in ["-", "+"]:
            res = res + sign * num
            num = 0
            # int(True) = 1, int(False) = 0, so this indexes [-1, 1]:
            # sign is 1 if ss == "+", otherwise -1
            sign = [-1, 1][ss == "+"]
    return res + num * sign
s = input("Enter your string: ")
# OR if you'd like, can uncomment this line below and comment the line above.
# s = "5+8-3+9" # As an expression in a string
print(calculate(s))
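The loop above handles flat expressions only. If parentheses are needed, a stack can save the running total and sign at each "(" and restore them at the matching ")". A sketch of that idea, assuming well-balanced parentheses and non-negative integers; calculate_with_parens is a hypothetical name:

```python
def calculate_with_parens(s):
    """Evaluate +, - and parentheses using a stack of saved (total, sign) pairs."""
    result, num, sign = 0, 0, 1
    stack = []
    for ch in s:
        if ch.isdigit():
            num = 10 * num + int(ch)
        elif ch in "+-":
            result += sign * num
            num = 0
            sign = 1 if ch == "+" else -1
        elif ch == "(":
            # save the state outside the parentheses and start fresh inside
            stack.append((result, sign))
            result, sign = 0, 1
        elif ch == ")":
            result += sign * num
            num = 0
            # fold the parenthesised value back into the saved state
            prev_result, prev_sign = stack.pop()
            result = prev_result + prev_sign * result
        # any other character (e.g. whitespace) is ignored
    return result + sign * num

print(calculate_with_parens("5+8-3+9"))
print(calculate_with_parens("1-(2+3)"))
```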
I would suggest breaking the question down to its numbers and operators.
Also, I've made the assumption that only whole numbers will be used – and only addition and subtraction.
import re
def calculate(expression):
    # Get all components
    components = re.findall(r"[+-][0-9]*|^[0-9]*", expression)
    # get each number with its positive or negative operator
    operators = re.compile("[-+]")
    # Iterate and add to a list
    all_nums = []
    for x in components:
        # get the number
        n = int(re.sub(operators, "", x))
        # For all terms after the first
        if operators.search(x):
            op = operators.search(x).group()
            if op == "-":
                n = -n
        # Save the number
        all_nums.append(n)
    # Finally, add them up
    return sum(all_nums)
x = "5+8-3+9"
calculate(x)
# returns 19
First of all, it's okay to be a beginner - I was in your exact shoes just a few years ago!
I'm going to attempt to provide an elementary/beginner approach to solving this problem, with just the basics.
So first we want to determine what the limits of our function input will be. For this, I'll assume we only accept mathematical expressions with basic addition/subtraction operators.
import re
def calculate(expression: str) -> int:
    if not re.match(r"^[0-9\+-]*$", expression):
        return None
For this you'll see I opted for regex, which is a slightly more advanced concept, but think about it like a validity check for expression. Basically, the pattern I wrote checks that the whole string contains only digits, plus signs, and minus signs. If you want to learn more about the expression ^[0-9\+-]*$, I highly recommend https://regexr.com/.
For our purposes and understanding though, these test cases should suffice:
>>> re.match("^[0-9\+-]*$", "abcs")
>>> re.match("^[0-9\+-]*$", "1+2")
<re.Match object; span=(0, 3), match='1+2'>
>>> re.match("^[0-9\+-]*$", "1+2/3")
>>>
Now that we have verified our expression, we can get to work on calculating the final value.
Let's try your idea with str.split()! It won't be entirely straightforward, because split by definition splits a string up according to a delimiter but discards the delimiter in the output. Fear not, because there's another way! The re package I imported earlier comes with its own split function.
By using capture groups for our separator, we are able to split and keep our separators.
>>> re.split("(\d+)", "1+393958-3")
['', '1', '+', '393958', '-', '3', '']
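To see what the capture group changes, here is a small comparison on the same input string. Without the parentheses the numbers act as separators and are lost; with them they are kept in the result:

```python
import re

text = "1+393958-3"

# No capture group: the digits act as separators and are discarded.
without_group = re.split(r"\d+", text)

# Capture group: the matched digits are kept between the other pieces.
with_group = re.split(r"(\d+)", text)

print(without_group)
print(with_group)
```

The empty strings at both ends appear because the input starts and ends with a match, which is why the code slices with [1:-1].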
So, with this up our sleeve...
import re
def calculate(expression: str) -> int:
    if not re.match(r"^[0-9\+-]*$", expression):
        return None
    expression_arr = re.split(r"(\d+)", expression)[1:-1]
    while len(expression_arr) > 1:
        # TODO stuff
        break  # placeholder so the skeleton runs until the loop body is written
    return int(expression_arr[0])
We can now move onto our while loop. It stands to reason that as long as the array has more than one item, there is some sort of operation left to do.
import re
def calculate(expression: str) -> int:
    if not re.match(r"^[0-9\+-]*$", expression):
        return None
    expression_arr = re.split(r"(\d+)", expression)[1:-1]
    while len(expression_arr) > 1:
        # "result" rather than "eval", to avoid shadowing the built-in
        if expression_arr[1] == "+":
            result = int(expression_arr[0]) + int(expression_arr[2])
        elif expression_arr[1] == "-":
            result = int(expression_arr[0]) - int(expression_arr[2])
        del expression_arr[:3]
        expression_arr.insert(0, result)
    return int(expression_arr[0])
It's pretty straightforward from there - we check the next operator (which always has to be at expression_arr[1]) and either add or subtract, and make the corresponding changes to expression_arr.
We can verify that it passes the test case you provided. (I added some logging to help with visualization)
>>> calculate("5+8-3+9")
['5', '+', '8', '-', '3', '+', '9']
[13, '-', '3', '+', '9']
[10, '+', '9']
[19]
19
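Once the loop is clear, the same left-to-right arithmetic can be written more compactly by treating each sign as part of the number that follows it. A sketch, assuming the input contains only integers joined by single + or - signs; calculate_compact is a hypothetical name, not part of the answer above:

```python
import re

def calculate_compact(expression):
    # Validate: an optionally signed integer, then any number of signed integers
    if not re.fullmatch(r"[+-]?\d+(?:[+-]\d+)*", expression):
        return None
    # Each token carries its own sign, e.g. '5', '+8', '-3', '+9'
    return sum(int(token) for token in re.findall(r"[+-]?\d+", expression))

print(calculate_compact("5+8-3+9"))
```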
I would understand how to do this assuming that I was only looking for one specific character, but in this instance I am looking for any of the 4 operators, '+', '-', '*', '/'. The function returns -1 if there is no operator in the passed string, txt, otherwise it returns the position of the leftmost operator. So I'm thinking find() would be optimal here.
What I have so far:
def findNextOpr(txt):
    # txt must be a nonempty string.
    if not isinstance(txt, str) or len(txt) <= 0:
        print("type error: findNextOpr")
        return "type error: findNextOpr"
    if '+' in txt:
        return txt.find('+')
    elif '-' in txt:
        return txt.find('-')
    else:
        return -1
I think if I did what I did for the '+' and '-' operators for the other operators, it wouldn't work for multiple instances of that operator in one expression. Can a loop be incorporated here?
Your current approach is not very efficient, as you iterate over txt twice for each operator: once for in and once for find().
You could use index() instead of find() and just ignore the ValueError exception (note that this returns the position of the first operator, in '+-*/' order, that occurs anywhere in txt, which is not necessarily the leftmost one), e.g.:
def findNextOpr(txt):
    for o in '+-*/':
        try:
            return txt.index(o)
        except ValueError:
            pass
    return -1
You can do this in a single (perhaps more readable) pass by enumerate()ing the txt and return if you find the character, e.g.:
def findNextOpr(txt):
    for i, c in enumerate(txt):
        if c in '+-*/':
            return i
    return -1
Note: if you wanted all of the operators you could change the return to yield, and then just iterate over the generator, e.g.:
def findNextOpr(txt):
    for i, c in enumerate(txt):
        if c in '+-*/':
            yield i
In []:
for op in findNextOpr('1+2-3+4'):
    print(op)
Out[]:
1
3
5
You can improve your code a bit because you keep looking at the string a lot of times. '+' in txt actually searches through the string just like txt.find('+') does. So you can combine those easily to avoid having to search through it twice:
pos = txt.find('+')
if pos >= 0:
    return pos
But this still leaves you with the problem that it returns as soon as the first operator you look for is contained anywhere within the string. So you don't actually get the leftmost position at which any of these operators occurs.
So what you want to do is look for all operators separately, and then return the lowest non-negative number, since that's the first occurrence of any of the operators within the string:
plusPos = txt.find('+')
minusPos = txt.find('-')
multPos = txt.find('*')
divPos = txt.find('/')
return min((pos for pos in (plusPos, minusPos, multPos, divPos) if pos >= 0), default=-1)  # -1 if no operator is found (Python 3.4+)
First, you shouldn't be printing or returning error messages; you should be raising exceptions. TypeError and ValueError would be appropriate here. (A string that isn't long enough is the latter, not the former.)
Second, you can simply find the positions of all the operators in the string using a list comprehension, exclude results of -1, and return the lowest of the positions using min().
def findNextOpr(text, start=0):
    ops = "+-/*"
    assert isinstance(text, str), "text must be a string"
    # "text must not be empty" isn't strictly true:
    # you'll get a perfectly sensible result for an empty string
    assert text, "text must not be empty"
    op_idxs = [pos for pos in (text.find(op, start) for op in ops) if pos > -1]
    return min(op_idxs) if op_idxs else -1
I've added a start argument that can be used to find the next operator: simply pass in the index of the last-found operator, plus 1.
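Another option is a single regular-expression search over a character class of the four operators, which finds the leftmost one in one pass. A sketch, assuming the same -1 convention and start argument as above; find_next_operator is a hypothetical name:

```python
import re

# Character class with the four operators; '-' is escaped so it is not a range.
_OPERATOR = re.compile(r"[+\-*/]")

def find_next_operator(text, start=0):
    """Return the index of the leftmost operator at or after start, else -1."""
    match = _OPERATOR.search(text, start)
    return match.start() if match else -1

print(find_next_operator("1-2+3"))
print(find_next_operator("1+2-3+4", 2))
```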
I'm trying to pass the values from the tuples ('a',1), ('b',2), ('c',3) into the function dostuff, but I always get a return value of None or False. I'm new to this, so I'm sorry if this question is basic. I would appreciate any help.
I expect the result of this to be:
a1---8
b2---8
c3---8
Code:
def dostuff(stri,numb,char):
    cal = stri+str(numb)+'---'+str(char)
    return cal
def callit (tups,char):
    for x in range(len(tups)):
        dostuff(tups[x][0],tups[x][1],char)
print(callit([('a',1), ('b',2),('c',3)],8))
I think you're misunderstanding the return value of the functions: unless otherwise specified, all functions will return None at completion. Your code:
print(callit([('a',1), ('b',2),('c',3)],8))
is telling the Python interpreter "print the return value of this function call." This isn't printing what you expect because the callit function doesn't have a return value specified. You could either replace the return in your dostuff function with a print, like so:
def dostuff(stri,numb,char):
    cal = stri+str(numb)+'---'+str(char)
    print(cal)
def callit (tups,char):
    for x in range(len(tups)):
        dostuff(tups[x][0],tups[x][1],char)
callit([('a',1), ('b',2),('c',3)],8)
This changes the return on the third line into a print command, and removes the print command from the callit call.
Another option would be:
def dostuff(stri,numb,char):
    cal = stri+str(numb)+'---'+str(char)
    return cal
def callit (tups,char):
    for x in range(len(tups)):
        cal = dostuff(tups[x][0],tups[x][1],char)
        print(cal)
callit([('a',1), ('b',2),('c',3)],8)
This takes the return value from the dostuff function and stores it in a variable named cal, which could then be printed or written to a file on disk.
As @n1c9 said, every Python function must return some object, and if there's no return statement written in the function definition, the function will implicitly return the None object. (Implicitly means that under the hood, Python sees that there's no return statement and returns None.)
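A quick way to see this for yourself, using two throwaway functions (no_return and with_return are made-up names for illustration):

```python
def no_return():
    total = 1 + 1  # computed, then discarded when the function ends

def with_return():
    return 1 + 1

print(no_return())    # the call evaluates to None
print(with_return())  # the call evaluates to the returned value
```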
However, while there's nothing wrong in this case with printing a value in a function rather than returning it, it's generally considered bad practice. This is because if you ever want to test the function to aid in debugging, you have to write the test within the function definition. While if you returned the value you could just test the return value of calling the function.
So when you're debugging this code you might write something like this:
def test_callit():
    tups = [('a', 1), ('b', 2), ('c', 3)]
    expected = 'a1---8\nb2---8\nc3---8'
    result = callit(tups, 8)
    assert result == expected, (str(result) + " != " + expected)
If you're unfamiliar with the assert statement, you can read up about it here.
Now that you have a test function, you can go back and modify your code. callit needs a return value, which in this case should probably be a string. So for the function callit you might write:
def callit(tups, char):
    result = ''
    for x in range(len(tups)):
        result += dostuff(tups[x][0], tups[x][1], char) + '\n'
    result = result[:result.rfind('\n')] # trim off the last \n
    return result
When you run test_callit, if you get any assertion errors you can see how the result differs from what you expect in the traceback.
What I'm about to talk about isn't really relevant to your question, but I would say improves the readability of your code.
Python's for statement is very different from that of most other programming languages, because it actually acts like a foreach loop. Currently, the code ignores that feature and forces index-based for-loop behaviour. It's actually simpler and faster to write something like this:
def callit(tups, char):
    result = ''
    for tup in tups:
        result += dostuff(tup[0], tup[1], char) + '\n'
    result = result[:result.rfind('\n')] # trim off the last \n
    return result
For a function to return a value in python it must have a return statement. In your callit function you lack a value to return. A more Pythonic approach would have both a value and iterate through the tuples using something like this:
def callit(tups, char):
    x = [dostuff(a, b, char) for a, b in tups]
    return x
Since tups is a list of tuples, we can iterate through it using for a, b in tups - this grabs both elements in the pairs. Next dostuff(a, b, char) is calling your dostuff function on each pair of elements and the char specified. Enclosing that in brackets makes the result a list, which we then return using the return statement.
Note you don't need to do:
x = ...
return x
You can just use return [dostuff(a, b, char) for a, b in tups] but I used the former for clarity.
You can use a list comprehension to do it in one line:
char = 8
point_list = [('a', 1), ('b', 2),('c', 3)]
print("\n".join(["{}{}---{}".format(s, n, char) for s, n in point_list]))
"{}{}---{}".format(s, n, char) creates a string by replacing {} by each one of the input in format, therefore "{}{}---{}".format("a", 1, 8) will return "a1---8"
for s, n in point_list will create an implicit loop over the point_list list and for each element of the list (each tuple) will store the first element in s and the second in n
["{}{}---{}".format(s, n, char) for s, n in point_list] is therefore a list created by applying the format we want to each of the tuples: it will return ["a1---8","b2---8","c3---8"]
Finally, "\n".join(["a1---8","b2---8","c3---8"]) creates a single string from a list of strings by joining the elements with "\n" between them: "a1---8\nb2---8\nc3---8" ("\n" is a special character representing the end of a line).
I'm trying to make a function, f(x), that would add a "-" between each letter:
For example:
f("James")
should output as:
J-a-m-e-s-
I would love it if you could use simple python functions as I am new to programming. Thanks in advance. Also, please use the "for" function because it is what I'm trying to learn.
Edit:
yes, I do want the "-" after the "s".
You can try it like this:
>>> def f(n):
...     return '-'.join(n)
...
>>> f('james')
'j-a-m-e-s'
>>>
Not really sure if you require the last 'hyphen'.
Edit:
Even if you want a trailing '-', you can do it like this:
def f(n):
    return '-'.join(n) + '-'
As a learner, it is important to understand that str.join(iterable) is the better way to concatenate more than two strings in Python, whereas the + operator is fine for appending one string to another.
Please read following posts to explore further:
Any reason not to use + to concatenate two strings?
which is better to concat string in python?
How slow is Python's string concatenation vs. str.join?
Also, please use the "for" function because it is what I'm trying to learn
>>> def f(s):
        m = s[0]
        for i in s[1:]:
            m += '-' + i
        return m
>>> f("James")
'J-a-m-e-s'
m = s[0] character at the index 0 is assigned to the variable m
for i in s[1:]: iterate from the second character and
m += '-' + i append - + char to the variable m
Finally return the value of variable m
If you want a - at the end as well, you could do it like this.
>>> def f(s):
        m = ""
        for i in s:
            m += i + '-'
        return m
>>> f("James")
'J-a-m-e-s-'
text_list = [c+"-" for c in text]
text_strung = "".join(text_list)
As a function that takes a string as input:
def dashify(text):
    output = ""
    for ch in text:
        output = output + ch + "-"
    return output
Given you asked for a solution that uses for and a final -, simply iterate over the message and add the character and '-' to an intermediate list, then join it up. This avoids repeated string concatenation:
>>> def f(message):
        l = []
        for c in message:
            l.append(c)
            l.append('-')
        return "".join(l)
>>> print(f('James'))
J-a-m-e-s-
I'm sorry, but I just have to take Alexander Ravikovich's answer a step further:
f = lambda text: "".join([c+"-" for c in text])
print(f('James')) # J-a-m-e-s-
It is never too early to learn about list comprehension.
"".join(a_list) is self-explanatory: glueing elements of a list together with a string (empty string in this example).
lambda... well that's just a way to define a function in a line. Think
square = lambda x: x**2
square(2) # returns 4
square(3) # returns 9
Python is fun, it's not {enter-a-boring-programming-language-here}.