This question already has answers here:
Possible Duplicate: Inaccurate Logarithm in Python
Closed 10 years ago.
Why are the math.log10(x) and math.log(x,10) results different?
In [1]: from math import *
In [2]: log10(1000)
Out[2]: 3.0
In [3]: log(1000,10)
Out[3]: 2.9999999999999996
It's a known bug: http://bugs.python.org/issue3724
It seems logX(y) is generally more precise than the equivalent log(y, X).
math.log10 and math.log(x, 10) use different algorithms, and the former is usually more accurate. This is actually a known issue (Issue 6765): math.log, log10 inconsistency.
One way to think of it: log10(x) has a fixed base, so it can be computed directly by a mathematical approximation formula (e.g. a Taylor series), while log(x, 10) comes from the more general two-argument form, which is calculated indirectly as log(x) / log(10); at a minimum, the rounding error in log(10) limits the precision of the quotient. So it is natural that the dedicated function is both faster and more accurate, since it takes advantage of the pre-known logarithmic base (i.e. 10).
As others have pointed out, log(1000, 10) is computed internally as log(1000) / log(10). This can be verified empirically:
In [3]: math.log(1000, 10) == math.log(1000) / math.log(10)
Out[3]: True
In [4]: math.log10(1000) == math.log(1000) / math.log(10)
Out[4]: False
Neither log(1000) nor log(10) can be represented exactly as a float, so the quotient is also inexact.
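A short sketch making the intermediate values visible (the printed digits below are what standard IEEE-754 doubles produce):

```python
import math

num = math.log(1000)        # ln(1000), rounded to the nearest double
den = math.log(10)          # ln(10), also rounded
print(num / den)            # 2.9999999999999996, same as math.log(1000, 10)
print(math.log10(1000))     # 3.0, computed by a dedicated base-10 routine
```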
Related
I am working with Python 3.6.
I am really confused. Why did this happen?
In [1]: import numpy as np
In [2]: a = np.array(-1)
In [3]: a
Out[3]: array(-1)
In [4]: a ** (1/3)
/Users/wonderful/anaconda/bin/ipython:1: RuntimeWarning: invalid value encountered in power
#!/Users/wonderful/anaconda/bin/python
Out[4]: nan
NumPy does not seem to allow fractional powers of negative numbers, even when the power would not produce a complex result. (I actually ran into this same problem earlier today, unrelatedly.) One workaround is to use
np.sign(a) * (np.abs(a)) ** (1 / 3)
Alternatively, change the dtype to complex numbers:
a = np.array(-1, dtype=complex)  # np.complex is deprecated; use the built-in complex
The problem arises when you are working with roots of negative numbers.
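A minimal sketch of both workarounds; note that for cube roots specifically, np.cbrt is a real-valued root that accepts negative inputs directly:

```python
import numpy as np

a = np.array([-8.0, 27.0])

# sign-based workaround: take the root of the magnitude, restore the sign
roots = np.sign(a) * np.abs(a) ** (1 / 3)
print(roots)            # approximately [-2.  3.]

# for cube roots specifically, np.cbrt handles negatives out of the box
print(np.cbrt(a))

# or switch to a complex dtype and get the principal (complex) root
c = np.array(-1, dtype=complex)
print(c ** (1 / 3))     # principal root, not the real root -1
```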
This question already has answers here:
Is floating point math broken?
(31 answers)
Closed 2 years ago.
Maybe this was answered before, but I'm trying to understand the best way to handle subtraction in Pandas.
import pandas as pd
import random
import numpy as np
random.seed(42)
data = {'r': list([float(random.random()) for i in range(5)])}
for i in range(5):
    data['r'].append(float(0.7))
df = pd.DataFrame(data)
If I run the following, I get the expected results:
print(np.sum(df['r'] >= 0.7))
6
However, if I modify slightly the condition, I don't get the expected results:
print(np.sum(df['r']-0.5 >= 0.2))
1
The same happens if I try to fix it by casting into float or np.float64 (and combinations of this), like the following:
print(np.sum(df['r'].astype(np.float64)-np.float64(0.5) >= np.float64(0.2)))
1
For sure I'm not doing the casting properly, but any help on this would be more than welcome!
You're not doing anything improperly. This is a totally straightforward floating point error. It will always happen.
>>> 0.7 >= 0.7
True
>>> (0.7 - 0.5) >= 0.2
False
You have to remember that floating point numbers are represented in binary, so they can only represent sums of powers of 2 with perfect precision. Anything that can't be represented finitely as a sum of powers of two will be subject to error like this.
You can see why by forcing Python to display the full-precision value associated with the literal 0.7:
format(0.7, '.60g')
'0.6999999999999999555910790149937383830547332763671875'
To add to @senderle's answer: since this is a floating point issue, you can solve it with:
((df['r'] - 0.5) >= 0.19).sum()
On a slightly different note, I'm not sure why you use np.sum when you could just use pandas' .sum; it seems like an unnecessary import.
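A sketch of the tolerance approach applied to the comparison above; the value of eps is an assumption, pick it well below the meaningful scale of your data:

```python
import pandas as pd

s = pd.Series([0.7, 0.7, 0.65])
eps = 1e-9   # tolerance for accumulated rounding error (assumed scale)

# naive comparison misses the borderline 0.7 values,
# because 0.7 - 0.5 evaluates to 0.19999999999999998
naive = int(((s - 0.5) >= 0.2).sum())

# tolerant comparison keeps them
tolerant = int(((s - 0.5) >= 0.2 - eps).sum())
print(naive, tolerant)   # 0 2
```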
In a while loop in my code, the loop breaks because
>>> 3<=abs(math.log(1000,10))
False
but, of course, the log of 1000 in base 10 is exactly 3, so I was confident that the condition should work...
But in fact I get this:
>>> abs(math.log(1000,10))
2.9999999999999996
So I would welcome any suggestions: is there a "best practice" for rounding the result, or is there a smarter method of computing logarithms?
Thanks a lot!
For base 10 you should use:
math.log10(x)
It returns the base-10 logarithm of x. This is usually more accurate than log(x, 10).
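With the dedicated function, the original loop condition behaves as expected:

```python
import math

print(math.log10(1000))          # 3.0
print(3 <= math.log10(1000))     # True
print(3 <= math.log(1000, 10))   # False, due to the log(x)/log(10) rounding
```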
Maybe you can use a tolerance to account for the computational errors. Something like the following, for example (you can tune the power of 10 depending on the required accuracy):
abs(math.log(1000,10))-3 >= -1e-6
Using numpy, I also get np.log10(1000) == 3.0. You could use this as an alternative.
You're seeing float numeric imprecision in action, a fundamental limitation of floating-point arithmetic that you cannot entirely circumvent. Two standard practices are: (1) increasing numeric precision (e.g. float64), or (2) comparison against a threshold:
You can use NumPy, which works in float64 by default and is among the fastest Python libraries for numeric computing.
Ex: abs(math.log(1000, 10) - 3) < eps, where eps (epsilon) is some small number (e.g. 1e-7)
In fact, NumPy has a handy method just for this when comparing multiple values at once (vectors, matrices): np.allclose. To compare as in (2), set rtol=0 and use only atol; example:
vec1 = np.array([1, 1, 1, 1])
vec2 = np.array([.33, .33, .33, .33])
print(np.allclose(vec1/3, vec2, rtol=0, atol=0.01)) # True
print(np.allclose(vec1/3, vec2, rtol=0, atol=0.001)) # False
How to see true values: (1e-5 + 1e-5 + 1.) == 1.00002 shows True, but neither side is exactly 1.00002; Python shortens the printed representation for you. To see the actual numeric value, use Python's format:
print(format(1e-5 + 1e-5 + 1., '.32f'))
1.00001999999999990897947554913117
^ this is the true value stored in memory; the literal 1.00002 converts to the same double before == runs, which is why the comparison shows True.
You can use math.isclose to check if a value is close to another value.
import math
print(math.isclose(3,math.log(1000,10)))
You can change your condition like this
logvalue = math.log(1000,10)
print(3 <= logvalue or math.isclose(3,logvalue))
And get True as output
This question already has answers here:
What is the best way to compare floats for almost-equality in Python?
(18 answers)
Closed 6 years ago.
I have been looking around to find a general way of comparing two numerics in Python. In particular, I want to figure out whether they are the same or not.
The numeric types in Python are:
int, long, float & complex
For example, I can compare 2 integers (a type of numeric) by simply saying:
a == b
For floats, we have to be more careful due to rounding precision, but I can compare them within some tolerance.
Question
We are given two general numerics a and b: how do we compare them? I was thinking of casting both to complex (which would then have a 0 imaginary part if the type is, say, int) and comparing in that domain?
This question is more general than simply comparing floats directly. Certainly, it is related to this problem, but it is not the same.
In Python 3.5+ (and in NumPy) you can use isclose.
Read PEP 485, which describes it, the Python 3.5 math library documentation, and numpy.isclose for more. The NumPy version works in every Python version that NumPy supports.
Examples:
>>> from math import isclose
>>> isclose(1,1.00000000001)
True
>>> isclose(1,1.00001)
False
The relative and absolute tolerance can be changed.
Relative tolerance can be thought of as +- a percentage between the two values:
>>> isclose(100,98.9, rel_tol=0.02)
True
>>> isclose(100,97.1, rel_tol=0.02)
False
The absolute tolerance is an absolute difference between the two values. It is the same as the test abs(a-b) <= tolerance.
All of Python's numeric types are supported by the Python 3.5 version. (Use the cmath version for complex.)
I think, longer term, this is your best bet for numerics. For older Python, just import the source; there is a version on GitHub.
Or, (forgoing error checking and inf and NaN support) you can just use:
def myisclose(a, b, *, rel_tol=1e-09, abs_tol=0.0):
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)
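A quick check that this fallback agrees with the standard version on the examples above (assuming Python 3.5+ so that math.isclose is available):

```python
import math

def myisclose(a, b, *, rel_tol=1e-09, abs_tol=0.0):
    # same formula as math.isclose, minus error checking and inf/NaN handling
    return abs(a - b) <= max(rel_tol * max(abs(a), abs(b)), abs_tol)

for a, b, kw in [(1, 1.00000000001, {}),
                 (1, 1.00001, {}),
                 (100, 98.9, {'rel_tol': 0.02})]:
    assert myisclose(a, b, **kw) == math.isclose(a, b, **kw)
print("fallback matches math.isclose")
```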
If you are looking to compare different types of numerics, there is nothing wrong with the == operator: Python will handle the type-casting. Consider the following:
>>> 1 == 1 + 0j == 1.0
True
In cases where you are doing mathematical operations that could result in loss of precision (especially with floats), a common technique is to check if the values are within a certain tolerance. For example:
>>> (10**.5)**2
10.000000000000002
>>> (10**.5)**2 == 10
False
In this case, you can find the absolute value of the difference and make sure it is under a certain threshold:
>>> abs((10**.5)**2 - 10) < 1e-10
True
Why not just use == ?
>>> 1 == (1+0j)
True
>>> 1.0 == 1
True
I'm pretty sure this works for all numeric types.
Is it possible to calculate the value of the mathematical constant, e with high precision (2000+ decimal places) using Python?
I am particularly interested in a solution either in or that integrates with NumPy or SciPy.
You can set the precision you want with the decimal built-in module:
from decimal import *
getcontext().prec = 40
Decimal(1).exp()
This returns:
Decimal('2.718281828459045235360287471352662497757')
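The same approach scales to the 2000+ digits asked for; the few extra guard digits are an assumption to absorb rounding in the last places:

```python
from decimal import Decimal, getcontext

getcontext().prec = 2010   # 2000 digits plus a few guard digits
e = Decimal(1).exp()
print(str(e)[:42])         # first 40 significant digits of e
```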
This can also be done with sympy using numerical evaluation:
import sympy
print(sympy.N(sympy.E, 100))
Using a series sum you could calculate it:
from decimal import Decimal, getcontext
import math

getcontext().prec = 2000
e = Decimal(0)
i = 0
while True:
    fact = math.factorial(i)
    e += Decimal(1) / fact
    i += 1
    if fact > 10**2000:
        break
But that's not really necessary, as what Mermoz did agrees just fine with it:
>>> e
Decimal('2.7182818284590452353602874713526624977572470936999595749669676
277240766303535475945713821785251664274274663919320030599218174135966290
435729003342952605956307381323286279434907632338298807531952510190115738
341879307021540891499348841675092447614606680822648001684774118537423454
424371075390777449920695517027618386062613313845830007520449338265602976
067371132007093287091274437470472306969772093101416928368190255151086574
637721112523897844250569536967707854499699679468644549059879316368892300
987931277361782154249992295763514822082698951936680331825288693984964651
058209392398294887933203625094431173012381970684161403970198376793206832
823764648042953118023287825098194558153017567173613320698112509961818815
930416903515988885193458072738667385894228792284998920868058257492796104
841984443634632449684875602336248270419786232090021609902353043699418491
463140934317381436405462531520961836908887070167683964243781405927145635
490613031072085103837505101157477041718986106873969655212671546889570350
354021234078498193343210681701210056278802351930332247450158539047304199
577770935036604169973297250886876966403555707162268447162560798826517871
341951246652010305921236677194325278675398558944896970964097545918569563
802363701621120477427228364896134225164450781824423529486363721417402388
934412479635743702637552944483379980161254922785092577825620926226483262
779333865664816277251640191059004916449982893150566047258027786318641551
956532442586982946959308019152987211725563475463964479101459040905862984
967912874068705048958586717479854667757573205681288459205413340539220001
137863009455606881667400169842055804033637953764520304024322566135278369
511778838638744396625322498506549958862342818997077332761717839280349465
014345588970719425863987727547109629537415211151368350627526023264847287
039207643100595841166120545297030236472549296669381151373227536450988890
313602057248176585118063036442812314965507047510254465011727211555194866
850800368532281831521960037356252794495158284188294787610852639810')
>>> Decimal(1).exp()
Decimal('2.7182818284590452353602874713526624977572470936999595749669676
277240766303535475945713821785251664274274663919320030599218174135966290
435729003342952605956307381323286279434907632338298807531952510190115738
341879307021540891499348841675092447614606680822648001684774118537423454
424371075390777449920695517027618386062613313845830007520449338265602976
067371132007093287091274437470472306969772093101416928368190255151086574
637721112523897844250569536967707854499699679468644549059879316368892300
987931277361782154249992295763514822082698951936680331825288693984964651
058209392398294887933203625094431173012381970684161403970198376793206832
823764648042953118023287825098194558153017567173613320698112509961818815
930416903515988885193458072738667385894228792284998920868058257492796104
841984443634632449684875602336248270419786232090021609902353043699418491
463140934317381436405462531520961836908887070167683964243781405927145635
490613031072085103837505101157477041718986106873969655212671546889570350
354021234078498193343210681701210056278802351930332247450158539047304199
577770935036604169973297250886876966403555707162268447162560798826517871
341951246652010305921236677194325278675398558944896970964097545918569563
802363701621120477427228364896134225164450781824423529486363721417402388
934412479635743702637552944483379980161254922785092577825620926226483262
779333865664816277251640191059004916449982893150566047258027786318641551
956532442586982946959308019152987211725563475463964479101459040905862984
967912874068705048958586717479854667757573205681288459205413340539220001
137863009455606881667400169842055804033637953764520304024322566135278369
511778838638744396625322498506549958862342818997077332761717839280349465
014345588970719425863987727547109629537415211151368350627526023264847287
039207643100595841166120545297030236472549296669381151373227536450988890
313602057248176585118063036442812314965507047510254465011727211555194866
850800368532281831521960037356252794495158284188294787610852639814')
The excellent pure-Python library mpmath will certainly do the trick.
The sole focus of this library is multi-precision floating-point arithmetic.
E.g., mpmath can evaluate e to arbitrary precision:
In [2]: from mpmath import *
# set the desired precision on the fly
In [3]: mp.dps=20; mp.pretty=True
In [4]: +e
Out[4]: 2.7182818284590452354
# re-set the precision (50 digits)
In [5]: mp.dps=50; mp.pretty=True
In [6]: +e
Out[6]: 2.7182818284590452353602874713526624977572470937
As an aside, mpmath also integrates with Matplotlib for plotting.
I would think you could start from this webpage:
http://en.wikipedia.org/wiki/Taylor_series
This gives you the familiar power series for e. Since you're working with large factorials, you should probably use gmpy, which implements multi-precision arithmetic.
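If you'd rather avoid an extra dependency, here is a minimal sketch of the same power series using exact rational arithmetic from the standard-library fractions module (the helper name e_series is mine):

```python
from fractions import Fraction

def e_series(terms):
    """Sum 1/n! for n = 0..terms-1 exactly, as a Fraction."""
    total = Fraction(0)
    fact = 1                     # running value of n!
    for n in range(terms):
        total += Fraction(1, fact)
        fact *= n + 1
    return total

approx = e_series(30)            # error below double precision already
print(float(approx))             # 2.718281828459045
```

For 2000 digits you would feed the exact Fraction into Decimal with a suitably high precision, since float itself only carries about 16 significant digits.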
Using Sage:
N(e, digits=2000)