Python calculator - evaluating a string of statements using the decimal module

I'm writing a Discord bot that accepts user input as a string and then evaluates the expression inside it; nothing fancy, just simple arithmetic operations. I have two concerns: safety and decimals. First I used the simpleeval package. It worked fine but had trouble with decimals, e.g.
0.1 + 0.1 + 0.1 - 0.3 would return 5.551115123125783e-17. After a lot of googling I found an approach that works, but it uses the built-in eval() function, and apparently using it is a big no.
Is there a better/safer way of handling this? My approach uses the decistmt() method from https://docs.python.org/3/library/tokenize.html#examples, which substitutes Decimals for floats in a string of statements. But in the end I still call eval(), and even with all those checks I'm unsure whether it's safe and I'd rather avoid it.
This is what decistmt() does:
from tokenize import tokenize, untokenize, NUMBER, STRING, NAME, OP
from io import BytesIO

def decistmt(s):
    """Substitute Decimals for floats in a string of statements.

    >>> from decimal import Decimal
    >>> s = 'print(+21.3e-5*-.1234/81.7)'
    >>> decistmt(s)
    "print (+Decimal ('21.3e-5')*-Decimal ('.1234')/Decimal ('81.7'))"

    The format of the exponent is inherited from the platform C library.
    Known cases are "e-007" (Windows) and "e-07" (not Windows). Since
    we're only showing 12 digits, and the 13th isn't close to 5, the
    rest of the output should be platform-independent.

    >>> exec(s) #doctest: +ELLIPSIS
    -3.21716034272e-0...7

    Output from calculations with Decimal should be identical across all
    platforms.

    >>> exec(decistmt(s))
    -3.217160342717258261933904529E-7
    """
    result = []
    g = tokenize(BytesIO(s.encode('utf-8')).readline)  # tokenize the string
    for toknum, tokval, _, _, _ in g:
        if toknum == NUMBER and '.' in tokval:  # replace NUMBER tokens
            result.extend([
                (NAME, 'Decimal'),
                (OP, '('),
                (STRING, repr(tokval)),
                (OP, ')')
            ])
        else:
            result.append((toknum, tokval))
    return untokenize(result).decode('utf-8')
# example user input: "(20+5)^4 - 550 + 8"
@bot.command()
async def calc(context, *, user_input):
    # this is so the user can use both ^ and ** for power, and both "," and "." for decimals
    equation = user_input.replace('^', "**").replace(",", ".")
    valid_operators: list = ["+", "-", "/", "*", "%", "^", "**"]
    # checks if the string contains any element from the list; any() also returns
    # False for an empty iterable, so this covers the empty-input check too
    operator_check: bool = any(
        operator in equation for operator in valid_operators)

    # checks if an arithmetic operator is the last or first character of the equation,
    # to prevent input like ".calc 2+" or ".calc +2"
    def is_last_or_first(equation: str):
        for operator in valid_operators:
            if operator == equation[-1]:
                return True
            elif operator == equation[0]:
                if operator == "-":
                    return False
                else:
                    return True

    # isupper and islower check whether there are letters in the user input
    if not operator_check or is_last_or_first(equation) or equation.isupper() or equation.islower():
        return await context.send("Invalid input")

    result = eval(decistmt(equation))
    result = float(result)

    # returning user_input here so the user sees "^" instead of "**"
    async def result_generator(result: int or float):
        await context.send(f'**Input:** ```fix\n{user_input}```**Result:** ```fix\n{result}```')

    # this is so if the result is .0 it returns an int
    if result.is_integer():
        await result_generator(int(result))
    else:
        await result_generator(result)
This is what happens to the user input:
user_input = "0.1 + 0.1 + 0.1 - 0.3"
float_to_decimal = decistmt(user_input)
print(float_to_decimal)
print(type(float_to_decimal))
# Decimal ('0.1')+Decimal ('0.1')+Decimal ('0.1')-Decimal ('0.3')
# <class 'str'>
Now I need to evaluate this input, so I'm using eval(). My question is: is this safe (I assume not), and is there some other way to evaluate float_to_decimal?
EDIT: As requested, a more in-depth explanation:
The whole application is a chat bot. This "calc" function is one of the commands users can invoke. It is triggered by typing ".calc " in the chat; ".calc" is the prefix, and everything after it is the argument, which is concatenated into a single string, so a string is ultimately what I get. I perform a bunch of checks to limit the input (remove letters, etc.). After the checks I am left with a string consisting of numbers, arithmetic operators and brackets, and I want to evaluate the mathematical expression in that string. I pass the string to the decistmt function, which transforms each float in it into a Decimal object; the result is a string that looks like this: "Decimal ('2.5') + Decimal ('-5.2')". Now I need to evaluate the expression inside that string. I used the simpleeval module for that, but it is incompatible with the decimal module, so I'm evaluating with the built-in eval() function. My question is: is there a safer way of evaluating a mathematical expression in a string like that?
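Not part of the original thread, but as a hedged sketch of one eval-free direction: the ast module can parse the expression, and you can walk the tree yourself, allowing only arithmetic nodes and building Decimal values from the numeric literals. The name safe_decimal_eval and the exact node whitelist below are my own assumptions, not an established recipe:
# Minimal sketch (assumption, not the accepted answer): evaluate arithmetic
# with the ast module instead of eval(), converting numeric literals to Decimal.
# Requires Python 3.8+ for ast.Constant.
import ast
import operator
from decimal import Decimal

_BIN_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.Mod: operator.mod,
    ast.Pow: operator.pow,
}
_UNARY_OPS = {ast.UAdd: operator.pos, ast.USub: operator.neg}

def safe_decimal_eval(expression: str) -> Decimal:
    """Evaluate a pure-arithmetic expression with Decimal, rejecting anything else."""
    # NOTE: a real bot would also want to cap operand/exponent size (e.g. reject 9**9**9).
    def _eval(node):
        if isinstance(node, ast.Expression):
            return _eval(node.body)
        if isinstance(node, ast.Constant) and isinstance(node.value, (int, float)):
            return Decimal(str(node.value))  # build the Decimal from the literal's text
        if isinstance(node, ast.BinOp) and type(node.op) in _BIN_OPS:
            return _BIN_OPS[type(node.op)](_eval(node.left), _eval(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY_OPS:
            return _UNARY_OPS[type(node.op)](_eval(node.operand))
        raise ValueError(f"Disallowed expression element: {ast.dump(node)}")
    return _eval(ast.parse(expression, mode="eval"))

print(safe_decimal_eval("0.1 + 0.1 + 0.1 - 0.3"))  # 0.0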

A recent spinoff package from pyparsing is plusminus, a wrapper around pyparsing specifically for this case of embedded arithmetic evaluation. You can try it yourself at http://ptmcg.pythonanywhere.com/plusminus. Since plusminus uses its own parser, it is not subject to the common eval attacks, as Ned Batchelder describes in this still-timely blog post: https://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html. Plusminus also includes handling for expressions which, while valid and "safe", are so time-consuming that they could be used for a denial-of-service attack - such as "9**9**9".
I'm planning a refactoring pass on plusminus that will change the API a bit, but you could probably use it as-is for your Discord plugin. You'll find that plusminus also supports a number of additional operators and functions beyond the standard Python ones, such as "|4-7|" for absolute value or "√42" for sqrt(42); click the Examples button on that page for more supported expressions. You can also save values in a variable and then use that variable in later expressions (this may not work for your plugin, since the variable state might be shared by multiple Discordians).
Plusminus is also designed to support subclassing to define new operators, such as a dice roller that will evaluate "3d6+d20" as "3 rolls of a 6-sided die, plus 1 roll of a 20-sided die", including random die rolls each time it is evaluated.

Try using Decimal on the numbers, as @Shiva suggested.
There is also an answer from user @unutbu that uses the pyparsing library with a custom wrapper for evaluating math expressions.
Python or system expressions (e.g. import <package> or dir()) will then raise a ParseException.

Related

Is there a way to check if every character in a string conforms to the set RE conditions?

I've been re-learning Python over the last 2 days and decided to use regular expressions [hereafter referred to as RE] for the first time (in conjunction with tkinter); it's exceptionally confusing.
I'm trying to check that every character in a string is a digit or a period, but this has proven difficult for me to wrap my head around.
Here is the code:
import re

def matchFloat(string, search=re.compile(r'[0-9.]').search):
    return bool(search(string))

def matchInt(string, search=re.compile(r'[0-9]').search):
    return bool(search(string))

def callbackFloat(P):
    if matchFloat(P) or P == "":
        return True
    else:
        return False

def callbackInt(P):
    if matchInt(P) or P == "":
        return True
    else:
        return False
The first character entered into my entry box [see below] is forced to be a digit or "." (in the float case); however, RE search() only requires one character to meet the conditions for it to return True.
So, in short: is there a way to return True only if every character in a string conforms to the RE conditions?
Any help is appreciated, thank you in advance!
Image: disallowed characters in the entry box. (As you can see, I'm quite new to this.)
This thread may be helpful as it covers the topic of tkinter input validation.
However, quick answer to your question:
re.search(r"^[0-9]+$", string)
would match an integer string. The RE pattern means "one or more digits, extending from the beginning of the string to the end". This can also be shortened to r"^\d+$".
You could also use the re.fullmatch() method:
re.fullmatch(r"[0-9]+", string)
And for the sake of completeness, I'll point out that you don't need to do this work yourself. You can determine if a string represents an integer with string.isdigit(), string.isdecimal(), or string.isnumeric() methods.
As for checking whether a string is a float, the Pythonic way is to just try converting to a float and catch the exception if it fails:
try:
    num = float(string)
except ValueError:
    print("not a float")
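To make the fullmatch idea concrete, here is a small sketch of my own (the patterns and helper names are assumptions, not from the answer); it validates whole entry-box strings rather than single characters:
import re

# Hypothetical helpers; the patterns are assumptions, not part of the answer above.
_INT_RE = re.compile(r"[0-9]+")
_FLOAT_RE = re.compile(r"[0-9]*\.?[0-9]+")   # "12", "3.14", ".5" (no sign or exponent)

def is_int_text(text: str) -> bool:
    # Allow the empty string so the entry box can be cleared while typing.
    return text == "" or _INT_RE.fullmatch(text) is not None

def is_float_text(text: str) -> bool:
    # A partially typed value like "3." would need a looser pattern.
    return text == "" or _FLOAT_RE.fullmatch(text) is not None

print(is_float_text("3.14"))   # True
print(is_float_text("3.1.4"))  # False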

Testing a "very close number" in Python with doctest [duplicate]

It's well known that comparing floats for equality is a little fiddly due to rounding and precision issues.
For example: Comparing Floating Point Numbers, 2012 Edition
What is the recommended way to deal with this in Python?
Is a standard library function for this somewhere?
Python 3.5 adds the math.isclose and cmath.isclose functions as described in PEP 485.
If you're using an earlier version of Python, the equivalent function is given in the documentation.
def isclose(a, b, rel_tol=1e-09, abs_tol=0.0):
    return abs(a-b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)
rel_tol is a relative tolerance: it is multiplied by the greater of the magnitudes of the two arguments, so as the values get larger, so does the allowed difference between them while still considering them equal.
abs_tol is an absolute tolerance that is applied as-is in all cases. If the difference is less than either of those tolerances, the values are considered equal.
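A quick illustrative snippet (mine, not part of the original answer) showing both tolerances in action:
import math

print(0.1 + 0.2 == 0.3)                         # False: 0.1 + 0.2 is 0.30000000000000004
print(math.isclose(0.1 + 0.2, 0.3))             # True: within the default rel_tol of 1e-09
print(math.isclose(1e-10, 0.0))                 # False: a relative tolerance never matches 0
print(math.isclose(1e-10, 0.0, abs_tol=1e-09))  # True: an absolute tolerance is needed near 0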
Something as simple as the following may be good enough:
return abs(f1 - f2) <= allowed_error
I would agree that Gareth's answer is probably most appropriate as a lightweight function/solution.
But I thought it would be helpful to note that if you are using NumPy or are considering it, there is a packaged function for this.
numpy.isclose(a, b, rtol=1e-05, atol=1e-08, equal_nan=False)
A little disclaimer though: installing NumPy can be a non-trivial experience depending on your platform.
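For illustration, a short snippet of my own, assuming NumPy is installed:
import numpy as np

a = np.array([0.1 + 0.2, 1.0, 1e-9])
b = np.array([0.3, 1.0, 0.0])
print(np.isclose(a, b))   # [ True  True  True]  element-wise, default rtol=1e-05, atol=1e-08
print(np.allclose(a, b))  # True                 a single bool for the whole array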
Use Python's decimal module, which provides the Decimal class.
From the comments:
It is worth noting that if you're doing math-heavy work and you don't absolutely need the precision from decimal, this can really bog things down. Floats are way, way faster to deal with, but imprecise. Decimals are extremely precise but slow.
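As a small illustration of my own, tying this back to the 0.1 + 0.1 + 0.1 - 0.3 example from the calculator question:
from decimal import Decimal

# Binary floats accumulate representation error...
print(0.1 + 0.1 + 0.1 - 0.3)   # 5.551115123125783e-17
# ...while Decimal, constructed from strings, is exact for decimal literals.
print(Decimal('0.1') + Decimal('0.1') + Decimal('0.1') - Decimal('0.3'))   # 0.0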
The common wisdom that floating-point numbers cannot be compared for equality is inaccurate. Floating-point numbers are no different from integers: If you evaluate "a == b", you will get true if they are identical numbers and false otherwise (with the understanding that two NaNs are of course not identical numbers).
The actual problem is this: If I have done some calculations and am not sure the two numbers I have to compare are exactly correct, then what? This problem is the same for floating-point as it is for integers. If you evaluate the integer expression "7//3*3", it will not compare equal to "7*3//3".
So suppose we asked "How do I compare integers for equality?" in such a situation. There is no single answer; what you should do depends on the specific situation, notably what sort of errors you have and what you want to achieve.
Here are some possible choices.
If you want to get a "true" result if the mathematically exact numbers would be equal, then you might try to use the properties of the calculations you perform to prove that you get the same errors in the two numbers. If that is feasible, and you compare two numbers that result from expressions that would give equal numbers if computed exactly, then you will get "true" from the comparison. Another approach is that you might analyze the properties of the calculations and prove that the error never exceeds a certain amount, perhaps an absolute amount or an amount relative to one of the inputs or one of the outputs. In that case, you can ask whether the two calculated numbers differ by at most that amount, and return "true" if they are within the interval. If you cannot prove an error bound, you might guess and hope for the best. One way of guessing is to evaluate many random samples and see what sort of distribution you get in the results.
Of course, since we only set the requirement that you get "true" if the mathematically exact results are equal, we left open the possibility that you get "true" even if they are unequal. (In fact, we can satisfy the requirement by always returning "true". This makes the calculation simple but is generally undesirable, so I will discuss improving the situation below.)
If you want to get a "false" result if the mathematically exact numbers would be unequal, you need to prove that your evaluation of the numbers yields different numbers if the mathematically exact numbers would be unequal. This may be impossible for practical purposes in many common situations. So let us consider an alternative.
A useful requirement might be that we get a "false" result if the mathematically exact numbers differ by more than a certain amount. For example, perhaps we are going to calculate where a ball thrown in a computer game traveled, and we want to know whether it struck a bat. In this case, we certainly want to get "true" if the ball strikes the bat, and we want to get "false" if the ball is far from the bat, and we can accept an incorrect "true" answer if the ball in a mathematically exact simulation missed the bat but is within a millimeter of hitting the bat. In that case, we need to prove (or guess/estimate) that our calculation of the ball's position and the bat's position have a combined error of at most one millimeter (for all positions of interest). This would allow us to always return "false" if the ball and bat are more than a millimeter apart, to return "true" if they touch, and to return "true" if they are close enough to be acceptable.
So, how you decide what to return when comparing floating-point numbers depends very much on your specific situation.
As to how you go about proving error bounds for calculations, that can be a complicated subject. Any floating-point implementation using the IEEE 754 standard in round-to-nearest mode returns the floating-point number nearest to the exact result for any basic operation (notably multiplication, division, addition, subtraction, square root). (In case of tie, round so the low bit is even.) (Be particularly careful about square root and division; your language implementation might use methods that do not conform to IEEE 754 for those.) Because of this requirement, we know the error in a single result is at most 1/2 of the value of the least significant bit. (If it were more, the rounding would have gone to a different number that is within 1/2 the value.)
Going on from there gets substantially more complicated; the next step is performing an operation where one of the inputs already has some error. For simple expressions, these errors can be followed through the calculations to reach a bound on the final error. In practice, this is only done in a few situations, such as working on a high-quality mathematics library. And, of course, you need precise control over exactly which operations are performed. High-level languages often give the compiler a lot of slack, so you might not know in which order operations are performed.
There is much more that could be (and is) written about this topic, but I have to stop there. In summary, the answer is: There is no library routine for this comparison because there is no single solution that fits most needs that is worth putting into a library routine. (If comparing with a relative or absolute error interval suffices for you, you can do it simply without a library routine.)
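To make the half-ulp bound concrete, here is a small snippet of my own (math.ulp requires Python 3.9+):
import math
import sys

x = 0.1 + 0.2   # one rounded addition (the literals 0.1 and 0.2 are themselves already rounded)

# The spacing of doubles around the result; a single correctly rounded
# operation is off from the exact answer by at most half of this.
print(math.ulp(x))              # 5.551115123125783e-17
print(abs(x - 0.3))             # 5.551115123125783e-17, i.e. about one ulp here
print(sys.float_info.epsilon)   # 2.220446049250313e-16, which equals math.ulp(1.0)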
math.isclose() has been added in Python 3.5 for this (source code). Here is a port of it to Python 2. Its difference from Mark Ransom's one-liner is that it can handle inf and -inf properly.
import math

def isclose(a, b, rel_tol=1e-09, abs_tol=0.0):
    '''
    Python 2 implementation of Python 3.5 math.isclose()
    https://github.com/python/cpython/blob/v3.5.10/Modules/mathmodule.c#L1993
    '''
    # sanity check on the inputs
    if rel_tol < 0 or abs_tol < 0:
        raise ValueError("tolerances must be non-negative")

    # short circuit exact equality -- needed to catch two infinities of
    # the same sign. And perhaps speeds things up a bit sometimes.
    if a == b:
        return True

    # This catches the case of two infinities of opposite sign, or
    # one infinity and one finite number. Two infinities of opposite
    # sign would otherwise have an infinite relative tolerance.
    # Two infinities of the same sign are caught by the equality check
    # above.
    if math.isinf(a) or math.isinf(b):
        return False

    # now do the regular computation
    # this is essentially the "weak" test from the Boost library
    diff = math.fabs(b - a)
    result = (((diff <= math.fabs(rel_tol * b)) or
               (diff <= math.fabs(rel_tol * a))) or
              (diff <= abs_tol))
    return result
I'm not aware of anything in the Python standard library (or elsewhere) that implements Dawson's AlmostEqual2sComplement function. If that's the sort of behaviour you want, you'll have to implement it yourself. (In which case, rather than using Dawson's clever bitwise hacks you'd probably do better to use more conventional tests of the form abs(a-b) <= eps1*(abs(a)+abs(b)) + eps2 or similar. To get Dawson-like behaviour you might say something like abs(a-b) <= eps*max(EPS, abs(a), abs(b)) for some small fixed EPS; this isn't exactly the same as Dawson, but it's similar in spirit.)
If you want to use it in a testing/TDD context, I'd say this is a standard way:
from nose.tools import assert_almost_equals
assert_almost_equals(x, y, places=7) # The default is 7
In terms of absolute error, you can just check
if abs(a - b) <= error:
    print("Almost equal")
Some information on why floats act weird in Python:
Python 3 Tutorial 03 - if-else, logical operators and top beginner mistakes
You can also use math.isclose for relative errors.
This is useful for the case where you want to make sure two numbers are the same 'up to precision', and there isn't any need to specify the tolerance:
Find minimum precision of the two numbers
Round both of them to minimum precision and compare
def isclose(a, b):
    astr = str(a)
    aprec = len(astr.split('.')[1]) if '.' in astr else 0
    bstr = str(b)
    bprec = len(bstr.split('.')[1]) if '.' in bstr else 0
    prec = min(aprec, bprec)
    return round(a, prec) == round(b, prec)
As written, it only works for numbers without the 'e' in their string representation (meaning 0.9999999999995e-4 < number <= 0.9999999999995e11)
Example:
>>> isclose(10.0, 10.049)
True
>>> isclose(10.0, 10.05)
False
For some of the cases where you can affect the source number representation, you can represent them as fractions instead of floats, using integer numerator and denominator. That way you can have exact comparisons.
See Fraction from fractions module for details.
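A brief illustration of my own:
from fractions import Fraction

print(0.1 + 0.1 + 0.1 == 0.3)                                                   # False
print(Fraction('0.1') + Fraction('0.1') + Fraction('0.1') == Fraction('0.3'))   # True
print(Fraction(1, 10) * 3)                                                      # 3/10, exact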
I liked Sesquipedal's suggestion, but with a modification (in the special case where both values are 0, the original returns False). In my case I was on Python 2.7 and just used a simple function:
def isclose(f1, f2, tol):
    if f1 == 0 and f2 == 0:
        return True
    else:
        return abs(f1 - f2) < tol * max(abs(f1), abs(f2))
If you want to do it in a testing or TDD context using the pytest package, here's how:
import pytest

PRECISION = 1e-3

def assert_almost_equal():
    obtained_value = 99.99
    expected_value = 100.00
    assert obtained_value == pytest.approx(expected_value, PRECISION)
I found the following comparison helpful:
str(f1) == str(f2)
To compare up to a given decimal without atol/rtol:
def almost_equal(a, b, decimal=6):
    return '{0:.{1}f}'.format(a, decimal) == '{0:.{1}f}'.format(b, decimal)

print(almost_equal(0.0, 0.0001, decimal=4))  # False
print(almost_equal(0.0, 0.0001, decimal=3))  # True
This may be a bit of an ugly hack, but it works pretty well when you don't need more than the default float precision (about 11 decimals).
The round_to function uses the format method of the built-in str class to round the float to a string with the required number of decimals, and then applies the built-in eval function to that string to get back to a float.
The is_close function just applies a simple conditional to the rounded floats.
def round_to(float_num, prec):
    return eval("'{:." + str(int(prec)) + "f}'.format(" + str(float_num) + ")")

def is_close(float_a, float_b, prec):
    if round_to(float_a, prec) == round_to(float_b, prec):
        return True
    return False
>>> a = 10.0
>>> b = 10.0001
>>> print(is_close(a, b, prec=3))
True
>>> print(is_close(a, b, prec=4))
False
Update:
As suggested by @stepehjfox, a cleaner way to build a round_to function avoiding eval is to use nested formatting:
def round_to(float_num, prec):
    return '{:.{precision}f}'.format(float_num, precision=prec)
Following the same idea, the code can be even simpler using the great new f-strings (Python 3.6+):
def round_to(float_num, prec):
    return f'{float_num:.{prec}f}'
So we could even wrap it all up in one simple and clean 'is_close' function:
def is_close(a, b, prec):
    return f'{a:.{prec}f}' == f'{b:.{prec}f}'
If you want to compare floats, the options above are great, but in my case I ended up using Enums, since my use case only accepted a few valid floats.
from enum import Enum

class HolidayMultipliers(Enum):
    EMPLOYED_LESS_THAN_YEAR = 2.0
    EMPLOYED_MORE_THAN_YEAR = 2.5
Then running:
testable_value = 2.0
HolidayMultipliers(testable_value)
If the float is valid it's fine, but otherwise it will just raise a ValueError.
Using == is a simple, good way if you don't care about the tolerance precisely.
# Python 3.8.5
>>> 1.0000000000001 == 1
False
>>> 1.0000000000000001 == 1
True
But watch out for 0:
>>> 0 == 0.00000000000000000000000000000000000000000001
False
Zero is always exactly zero, so any nonzero value compares unequal to it.
Use math.isclose if you want to control the tolerance.
Plain a == b is exact equality, i.e. the same as math.isclose(a, b, rel_tol=0, abs_tol=0).
If you still want to use == with a self-defined tolerance:
>>> import math
>>> class MyFloat(float):
...     def __eq__(self, another):
...         return math.isclose(self, another, rel_tol=0, abs_tol=0.001)
...
>>> a = MyFloat(0)
>>> a
0.0
>>> a == 0.001
True
So far I haven't found a way to configure this globally for float. Besides, mock also doesn't work for float.__eq__.

Comparing bit representation of objects in Python

I am watching a video named The Mighty Dictionary which has the following code:
k1 = bits(hash('Monty'))
k2 = bits(hash('Money'))
diff = ('^ '[a == b] for a, b in zip(k1, k2))
print(k1, k2, ''.join(diff))
As I understand it, bits is not a built-in method in Python but his own function, similar to format(x, 'b'); or is it something that existed in Python 2? (I've never written code in Python 2.)
I've tried to accomplish the same thing: get the bit representation of the strings and check where the bits differ:
k1 = format(hash('Monty'), 'b')
k2 = format(hash('Money'), 'b')
diff = ('^ '[a == b] for a, b in zip(k1, k2))
print(k1, '\n', k2, '\n', ''.join(diff))
I do get the expected result (UPDATE: I had to shift the first line by one space to align the symbols):
110111010100001110100101100000100110111111110001001101111000110
-1000001111101001011101001010101101000111001011011000011110100
^ ^^^ ^ ^^ ^^^ ^^^^^^^ ^ ^^^^^ ^^ ^^ ^^^^^^^ ^ ^ ^^^
Also, the lengths of the bit strings are not the same. I understood that no matter the string, the hash would take the same number of bits (64 in my case), but I get 63 and 62.
print(len(format(hash('Monty'),'b')))
print(len(format(hash('Money'),'b')))
63
62
So, to sum up my question:
Is bits a built-in method in Python 2?
Is the following the recommended way to compare the bit representation of an object:
def fn():
    pass

print(format(hash(fn), 'b'))
# -111111111111111111111111111111111101111000110001011100000000101
Shouldn't all objects be represented by the same number of bits (depending on the processor)? If I run the following code several times I get these results:
def fn():
    pass

def nf():
    pass

print(format(hash(fn), 'b'))
print(format(hash(nf), 'b'))

# first time
# 10001001010011010111110000100
# -111111111111111111111111111111111101110110101100101000001000001
# second time
# 10001001010011010111111101010
# 10001001010011010111110000100
# third time
# 10001001010011010111101010001
# -111111111111111111111111111111111101110110101100101000001000001
No, bits is not a built-in function in Python 2 or Python 3.
By default format() doesn't show leading zeroes. Use the format string 032b to format the number in a 32-character field with leading zeroes.
>>> format(hash('Monty'), '032b')
'1001000100011010010110101101101011000010101011100110001010001'
Another problem you're running into is that hash() can return negative numbers. Maybe this couldn't happen in Python 2, or his bits() function shows the two's complement bits of the number. You can do this by normalizing the input:
def bits(n):
    if n < 0:
        n = 2**32 + n
    return format(n, '032b')
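On a 64-bit CPython build, hash() actually returns a 64-bit signed value, so a variation of the same idea (my own sketch, not part of the answer) masks to 64 bits and pads the field accordingly:
# Hypothetical 64-bit variant of the bits() helper above.
def bits64(n):
    return format(n & (2**64 - 1), '064b')   # two's complement view, zero-padded to 64

k1 = bits64(hash('Monty'))
k2 = bits64(hash('Money'))
print(k1)
print(k2)
print(''.join('^ '[a == b] for a, b in zip(k1, k2)))   # '^' marks the differing bits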
Every time you run the code, you define new fn and nf functions. Different functions will not necessarily have the same hash code, even if they have the same name.
If you don't redefine the functions, you should get the same hash codes each time.
Hashing strings and numbers just depends on the contents, but hashing more complex objects depends on the specific instance.
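A short demonstration of that point (my own snippet):
def fn():
    pass

def nf():
    pass

print(hash('Monty') == hash('Monty'))   # True: same contents, same hash within one run
print(hash(fn) == hash(fn))             # True: same object
print(hash(fn) == hash(nf))             # False (in practice): distinct objects hash by identity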

Why is eval('"\x27"') == eval('"\\x27"')?

I'm very confused by Python's eval():
I tried eval('"\x27"') == eval('"\\x27"') and it evaluates to True. Can somebody explain why this is the case? Both expressions evaluate to "'". I understand why eval('"\x27"') does (the string evaluated has a single character, which is an escaped hexadecimal representing an apostrophe), but shouldn't eval('"\\x27"') be equal to "\\x27"?
Secondly, adding to the confusion, if I set the following variables,
s = "\x27"
t = "\\x27"
then eval('s') is again "'", but eval('t') is "\\x27". Why is that?
According to the docs, eval "parses and evaluates the argument as a python expression". In other words, it's applying the same processing that is applied if you write x = "foobar \n" inside a program or the IDLE. In this example, \n gets turned into a newline-character (which, note, is not identical to the literal \n).
If you typed x = "\x27" into the IDLE, you'd get x == "'". The x27 is escaped because of the backslash and thus changed during evaluation. If you escape the backslash, then x27 is not changed during evaluation. Instead, you simply get a string with a backslash followed by x27.
Now if you evaluated that string again, you only have one backslash left - seemingly escaping x27. Thus, it is changed to '.
Another way to look at this: with eval('"\x27"') the text is processed twice (once when the literal is parsed, once inside eval), but it only changes the first time, to "'". With eval('"\\x27"') the text is also processed twice: first to "\x27", then to "'".
Here's an easier example to demonstrate how this works:
>>> x = "\"foobar\""
>>> x == "foobar"
False
>>> x == "\"foobar\""
True
>>> x = eval(x) # changes the value of x from '"foobar"' to just 'foobar'; still a str, just without the embedded quotes
>>> x == "foobar"
True
>>> x == "\"foobar\""
False
Look at it like this: The right hand side of y = "2" contains two components: the information that y should be of type string, expressed using the two ", and the value of that string, expressed by the character 2. The separation of these two aspects is done during evaluation of the code you write. The string object itself never sees the " during initialization.
So in the above example, after the first line, we have x of type str with value "foobar". If you evaluate that again, the " are interpreted this time not as part of the value of x but as the type of x. So eval("\"foobar\"") basically transforms the string "foobar" to foobar, which, if you want to use that using the language Python, you have to write as "\"foobar\"" and "foobar".
For '"\x27"' the backslash escape is expanded during parsing, so this is literally '"\'"'. '"\\x27"' only strips the backslash, i.e. it is equal to r'"\x27"'.
A direct eval() on the string literals adds another iteration of special character expansion: In the first case, \' is a valid escape sequence yielding '. The second case is the same as above.
When you use variable names, only one round of unescaping is performed when you assign the values. eval('s') simply expands to the value of s without further unescaping. If you want an emulation of the first case, you need to eval(s), i.e. evaluate the value of the string referenced by s.
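To tie the two cases together, here is a small snippet of my own showing the string lengths and the extra round of escape processing explicitly:
# The literals below include the quotes, so eval() parses them as string
# literals and expands escapes a second time.
print(len('"\x27"'), len('"\\x27"'))        # 3 6
print(eval('"\x27"') == eval('"\\x27"'))    # True: both evaluate to the 1-character string "'"

# eval('s') merely looks up the name s; the stored value is returned as-is,
# with no further escape processing.
s = "\x27"    # the 1-character string '
t = "\\x27"   # the 4-character string \x27
print(eval('s') == s, eval('t') == t)       # True True
print(eval('t') == "'")                     # False: t's value is never re-parsed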
