Python: Decimals with trigonometric functions

I'm having a little problem, take a look:
>>> import math
>>> math.sin(math.pi)
1.2246467991473532e-16
This is not what I learnt in my Calculus class (It was 0, actually)
So, now, my question:
I need to perform some heavy trigonometric calculations with Python. What library can I use to get correct values?
Can I use Decimal?
EDIT:
Sorry, what I meant was something else.
What I want is some way to do:
>>> awesome_lib.sin(180)
0
or this:
>>> awesome_lib.sin(Decimal("180"))
0
I need a library that performs accurate trigonometric calculations. Everybody knows that sin 180° is 0; I need a library that can get that right too.

1.2246467991473532e-16 is close to 0 -- there are 16 zeroes between the decimal point and the first significant digit -- much as 3.1415926535897931 (the value of math.pi) is close to pi. The answer is correct to sixteen decimal places!
So if you want sin(pi) to equal 0, simply round it to a reasonable number of decimal places. 15 looks good to me and should be plenty for any application:
print(round(math.sin(math.pi), 15))

Pi is an irrational number so it can't be represented exactly using a finite number of bits. However, you can use some library for symbolic computation such as sympy.
>>> sympy.sin(sympy.pi)
0
Regarding the second part of your question, if you want to use degrees instead of radians you can define a simple conversion function
def radians(x):
    return x * sympy.pi / 180
and use it as follows:
>>> sympy.sin(radians(180))
0
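sympy also ships a degree-to-radian helper of its own; assuming sympy.rad (which does the same pi/180 scaling as the function above), the call can be written as:
>>> sympy.sin(sympy.rad(180))
0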

If you find the result unexpected, I'd suggest you have a look at this text:
What Every Computer Scientist Should Know About Floating-Point Arithmetic
It's really worth it.

You can also try gmpy or real.
In gmpy you can specify the precision explicitly:
gmpy.pi(256)
In real.py you could use the pa() function:
from real import pa,pi
pa(pi)

Short Answer -
Decimal.cos() and Decimal.sin() can both be implemented from the Decimal.exp() implementation by splitting the even terms of its series into the cos() function and the odd terms into the sin() function, and alternating the sign of each term between positive and negative in both of those series. No change is needed in the loop, which only computes N terms based on the configured precision (decimal.getcontext().prec).
Long Answer -
Python's decimal.Decimal supports an exp() function that takes only a real-number argument (unlike exp() in the R language) and computes its infinite series only up to the number of terms dictated by the configured precision (decimal.getcontext().prec).
Currently the even terms compute cosh() and the odd terms compute sinh(); their sum is returned as the result of exp(). If the sign of each term were modified to alternate between positive and negative within each series, the even-term series would compute cos() and the odd-term series would compute sin().
Additionally, as in the R language, this change could enable Decimal.exp() to support complex arguments, so that exp(1j*x) could return Decimal.cos(x) + 1j * Decimal.sin(x).
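For illustration, here is a sketch of that even/odd-term split in user code, closely following the cos()/sin() recipe from the decimal module's documentation (dec_cos and dec_sin are local helper names, not stdlib methods):
from decimal import Decimal, getcontext

def dec_cos(x):
    # cos(x), x in radians: sum the even Taylor terms with alternating sign.
    getcontext().prec += 2          # extra working precision for intermediates
    i, lasts, s, fact, num, sign = 0, 0, 1, 1, 1, 1
    while s != lasts:               # stop when another term no longer changes the sum
        lasts = s
        i += 2
        fact *= i * (i - 1)
        num *= x * x
        sign = -sign
        s += sign * num / fact
    getcontext().prec -= 2
    return +s                       # unary plus rounds back to the context precision

def dec_sin(x):
    # sin(x), x in radians: sum the odd Taylor terms with alternating sign.
    getcontext().prec += 2
    i, lasts, s, fact, num, sign = 1, 0, x, 1, x, 1
    while s != lasts:
        lasts = s
        i += 2
        fact *= i * (i - 1)
        num *= x * x
        sign = -sign
        s += sign * num / fact
    getcontext().prec -= 2
    return +s

print(dec_sin(Decimal(1)))  # ~0.8414709848078965066525023216 at the default 28-digit precision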

Related

Python math.log and math.log10 giving different results

I was writing code to calculate the number of digits in a given whole number.
I was initially using
math.log(num,10)
but found out it was giving an incorrect (approximate) value at num = 1000:
math.log(1000,10)
>2.9999999999999996
I understand that the above might be due to how floating point arithmetic is done in computers; the same, however, works flawlessly using math.log10:
math.log10(1000)
>3.0
Is it correct to assume that log10 is more accurate than log and to use it wherever log base 10 is involved instead of going with the more generalized log function?
Python's math documentation specifically says:
math.log10(x)
Return the base-10 logarithm of x. This is usually more accurate than log(x, 10).
According to the Python Math module documentation:
math.log(x,[base])
With one argument, return the natural logarithm of x (to base e).
With two arguments, return the logarithm of x to the given base, calculated as log(x)/log(base).
Whereas in the math.log10 section:
math.log10(x)
Return the base-10 logarithm of x. This is usually more accurate than log(x, 10).
It might be due to rounding of the floating point numbers, because if I take the first method of using log(1000)/log(10), I get:
>>> log(1000)
6.907755278982137
>>> log(10)
2.302585092994046
>>> 6.907755278982137/2.302585092994046
2.9999999999999996
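A quick demonstration of the difference (and, since the goal here is counting digits, a way to sidestep floats entirely):
import math

print(math.log(1000, 10))  # 2.9999999999999996 -- computed as log(1000)/log(10)
print(math.log10(1000))    # 3.0 -- dedicated implementation, usually more accurate
print(len(str(1000)))      # 4  -- digit count with no floating point at all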

Does Python document its behavior for rounding to a specified number of fractional digits?

Is the algorithm used for rounding a float in Python to a specified number of digits specified in any Python documentation? The semantics of round with zero fractional digits (i.e. rounding to an integer) are simple to understand, but it's not clear to me how the case where the number of digits is nonzero is implemented.
The most straightforward implementation of the function that I can think of (given the existence of round to zero fractional digits) would be:
def round_impl(x, ndigits):
    return (10 ** -ndigits) * round(x * (10 ** ndigits))
I'm trying to write some C++ code that mimics the behavior of Python's round() function for all values of ndigits, and the above agrees with Python for the most part, when translated to equivalent C++ calls. However, there are some cases where it differs, e.g.:
>>> round(0.493125, 5)
0.49312
>>> round_impl(0.493125, 5)
0.49313
There is clearly a difference that occurs when the value to be rounded is at or very near the exact midpoint between two potential output values. Therefore, it seems important that I try to use the same technique if I want similar results.
Is the specific means for performing the rounding specified by Python? I'm using CPython 2.7.15 in my tests, but I'm specifically targeting v2.7+.
Also refer to What Every Programmer Should Know About Floating-Point Arithmetic, which has more detailed explanations for why this is happening as it is.
This is a mess. First of all, as far as float is concerned, there is no such number as 0.493125; when you write 0.493125, what you actually get is:
0.493124999999999980015985556747182272374629974365234375
So this number is not exactly between two decimals; it's actually closer to 0.49312 than it is to 0.49313, so it should definitely round to 0.49312. That much is clear.
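You can see that exact value from Python itself, since converting a float to Decimal is exact:
>>> from decimal import Decimal
>>> Decimal(0.493125)
Decimal('0.493124999999999980015985556747182272374629974365234375')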
The problem is that when you multiply by 10**5, you get the exact number 49312.5. So what happened here is the multiplication gave you an inexact result which by coincidence canceled out the rounding error in the original number. Two rounding errors canceled each other out, yay! But the problem is that when you do this, the rounding is actually incorrect... at least if you want to round up at midpoints; but Python 3 and Python 2 behave differently. Python 2 rounds away from 0, and Python 3 rounds towards even least-significant digits.
Python 2
if two multiples are equally close, rounding is done away from 0
Python 3
...if two multiples are equally close, rounding is done toward the even choice...
Summary
In Python 2,
>>> round(49312.5)
49313.0
>>> round(0.493125, 5)
0.49312
In Python 3,
>>> round(49312.5)
49312
>>> round(0.493125, 5)
0.49312
And in both cases, 0.493125 is really just a short way of writing 0.493124999999999980015985556747182272374629974365234375.
So, how does it work?
I see two plausible ways for round() to actually behave:
1. Choose the closest decimal number with the specified number of digits, and then round that decimal number to float precision. This is hard to implement, because it requires doing calculations with more precision than you can get from a float.
2. Take the two closest decimal numbers with the specified number of digits, round them both to float precision, and return whichever is closer. This will give incorrect results, because it rounds numbers twice.
And Python chooses... option #1! The exactly correct, but much harder to implement, version. Refer to double_round() at Objects/floatobject.c:927. It uses the following process:
1. Write the floating-point number to a string in decimal format, using the requested precision.
2. Parse the string back in as a float.
This uses code based on David Gay's dtoa library. If you want C++ code that gets the actual correct result like Python does, this is a good start. Fortunately you can just include dtoa.c in your program and call it, since its licensing is very permissive.
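If you only need to reproduce the observed behaviour from within Python, here is a rough sketch of the same format-then-reparse idea (round_via_repr is a made-up name, and the real implementation goes through dtoa rather than str.format):
def round_via_repr(x, ndigits):
    # Format x with ndigits decimal digits (correctly rounded), then parse back.
    return float('{:.{}f}'.format(x, ndigits))

print(round_via_repr(0.493125, 5))  # 0.49312, matching round(0.493125, 5)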
The Python documentation for 2.7 specifies the behaviour:
Values are rounded to the closest multiple of 10 to the power minus ndigits; if two multiples are equally close, rounding is done away from 0.
For 3.7:
For the built-in types supporting round(), values are rounded to the closest multiple of 10 to the power minus ndigits; if two multiples are equally close, rounding is done toward the even choice.
Update:
The (CPython) implementation can be found in floatobject.c in the function float___round___impl, which calls round if ndigits is not given, but double_round if it is.
double_round has two implementations.
One converts the double to a string (aka decimal) and back to a double.
The other one does some floating point calculations, with calls to pow, and at its core calls round. It seems to have more potential for overflow problems, since it actually multiplies the input by 10**-ndigits.
For the precise algorithm, look at the linked source file.

Dealing with large numbers in R [Inf] and Python

I am learning Python these days, and this is probably my first post on Python. I am relatively new to R as well, and have been using R for about a year. I am comparing both the languages while learning Python. I apologize if this question is too basic.
I am unsure why R outputs Inf for something Python doesn't. Let's take 2^1500 as an example.
In R:
nchar(2^1500)
[1] 3
2^1500
[1] Inf
In Python:
len(str(2**1500))
Out[7]: 452
2**1500
Out[8]: 3507466211043403874...
I have two questions:
a) Why is it that R gives Inf when Python doesn't?
b) I researched the How to work with large numbers in R? thread. It seems that Brobdingnag could help with dealing with large numbers. However, even in that case, I am unable to compute nchar. How do I compute the above expression, i.e. 2^1500, in R?
2^Brobdingnag::as.brob(500)
[1] +exp(346.57)
> nchar(2^Brobdingnag::as.brob(500))
Error in nchar(2^Brobdingnag::as.brob(500)) :
no method for coercing this S4 class to a vector
In answer to your questions:
a) They use different representations for numbers. Most numbers in R are represented as double precision floating point values. These are all 64 bits long, and give about 15 digit precision throughout the range, which goes from -double.xmax to double.xmax, then switches to signed infinite values. R also uses 32 bit integer values sometimes. These cover the range of roughly +/- 2 billion. R chooses these types because it is geared towards statistical and numerical methods, and those rarely need more precision than double precision gives. (They often need a bigger range, but usually taking logs solves that problem.)
Python is more of a general purpose platform, and it has types discussed in MichaelChirico's comment.
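For instance, Python's built-in int simply grows as needed, so the 452-digit result from the question is computed exactly:
n = 2 ** 1500
print(len(str(n)))  # 452 -- same count as len(str(2**1500)) in the question
print(n % 10)       # 6  -- last digit, computed exactly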
b) Besides Brobdingnag, the gmp package can handle arbitrarily large integers. For example,
> as.bigz(2)^1500
Big Integer ('bigz') :
[1] 35074662110434038747627587960280857993524015880330828824075798024790963850563322203657080886584969261653150406795437517399294548941469959754171038918004700847889956485329097264486802711583462946536682184340138629451355458264946342525383619389314960644665052551751442335509249173361130355796109709885580674313954210217657847432626760733004753275317192133674703563372783297041993227052663333668509952000175053355529058880434182538386715523683713208549376
> nchar(as.character(as.bigz(2)^1500))
[1] 452
I imagine the as.character() call would also be needed with Brobdingnag.
Apparently Python uses arbitrary precision integers by default when needed. R does not. However, there are many useful R packages that perform arbitrary precision arithmetic. Which package to pick depends on the use case.
To bring up a package that hasn't been discussed yet, consider the Rmpfr package:
> library(Rmpfr)
> a <- 2^mpfr(1500, 10000)
> a
1 'mpfr' number of precision 10000 bits
[1] 35074662110434038747627587960280857993524015880330828824075798024790963850563322203657080886584969261653150406795437517399294548941469959754171038918004700847889956485329097264486802711583462946536682184340138629451355458264946342525383619389314960644665052551751442335509249173361130355796109709885580674313954210217657847432626760733004753275317192133674703563372783297041993227052663333668509952000175053355529058880434182538386715523683713208549376
It requires you to set a precision, but if you make it large enough it can hold 2^1500 as integer.
However, it also doesn't seem to define an as.character() function:
> as.character(a)
[1] "<S4 object of class \"mpfr1\">"
So if your problem is specifically to count digits, then the gmp package as discussed in this answer is probably the way to go. On the other hand, if you're interested in arbitrary precision floating point arithmetic, Rmpfr might be a better choice.

How to avoid floating point arithmetics issues?

Python (and almost anything else) has known limitations while working with floating point numbers (nice overview provided here).
While the problem is described well in the documentation, it stops short of suggesting any approach to fixing it. With this question I am seeking a more or less robust way to avoid situations like the following:
print(math.floor(0.09/0.015)) # >> 6
print(math.floor(0.009/0.0015)) # >> 5
print(99.99-99.973) # >> 0.016999999999825377
print(.99-.973) # >> 0.017000000000000015
var = 0.009
step = 0.0015
print(var < math.floor(var/step)*step+step) # False
print(var < (math.floor(var/step)+1)*step) # True
And unlike what is suggested in this question, that solution does not help with a problem like the next piece of code failing seemingly at random:
total_bins = math.ceil((data_max - data_min) / width) # round to upper
new_max = data_min + total_bins * width
assert new_max >= data_max
# fails. because for example 1.9459999999999997 < 1.946
If you deal in discrete quantities, use int.
Sometimes people use float in places where they definitely shouldn't. If you're counting something (like number of cars in the world) as opposed to measuring something (like how much gasoline is used per day), floating-point is probably the wrong choice. Currency is another example where floating point numbers are often abused: if you're storing your bank account balance in a database, it's really not 123.45 dollars, it's 12345 cents. (But also see below about Decimal.)
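A toy sketch of the cents idea (the amounts here are made up):
balance_cents = 12345        # $123.45 stored exactly as integer cents
balance_cents += 250         # deposit $2.50: integer addition, no drift
print(balance_cents / 100)   # 125.95 -- convert only at display time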
Most of the rest of the time, use float.
Floating-point numbers are general-purpose. They're extremely accurate; they just can't represent certain fractions, like finite decimal numbers can't represent the number 1/3. Floats are generally suited for any kind of analog quantity where the measurement has error bars: length, mass, frequency, energy -- if there's uncertainty on the order of 2^(-52) or greater, there's probably no good reason not to use float.
If you need human-readable numbers, use float but format it.
"This number looks weird" is a bad reason not to use float. But that doesn't mean you have to display the number to arbitrary precision. If a number with only three significant figures comes out to 19.99909997918947, format it to one decimal place and be done with it.
>>> from math import e, pi
>>> print('{:0.1f}'.format(e**pi - pi))
20.0
If you need precise decimal representation, use Decimal.
Sraw's answer refers to the decimal module, which is part of the standard library. I already mentioned currency as a discrete quantity, but you may need to do calculations on amounts of currency in which not all numbers are discrete, for example calculating interest. If you're writing code for an accounting system, there will be rules that say when rounding is applied and to what accuracy various calculations are done, and those specifications will be written in terms of decimal places. In this situation and others where the decimal representation is inherent to the problem specification, you'll want to use a decimal type.
>>> from decimal import Decimal
>>> rate = Decimal('0.0345')
>>> principal = Decimal('3412.65')
>>> interest = rate*principal
>>> interest
Decimal('117.736425')
>>> interest.quantize(Decimal('0.01'))
Decimal('117.74')
But most importantly, use data types and operations that make sense in context.
Several of your examples use math.floor, which takes a float and chops off the fractional part. In any situation where you should use math.floor, floating-point error doesn't matter. (If you want to round to the nearest integer, use round instead.) Yes, there are ways to use floating-point operations that have wrong results from a mathematical standpoint. But real-world quantities usually fall into one of these categories:
Exact, and therefore should not be put in a float;
Imprecise to a degree far exceeding the likely accumulation of floating-point error.
As a programmer, it's part of your job to know the quantities you're dealing with and choose appropriate data types. So there's no "fix" for floating point numbers, because there's no "problem" really -- just people using the wrong type for the wrong thing.
Let's talk about decimal. This library converts numbers into string-like objects and then does its arithmetic digit by digit on those characters.
Because of this, it can handle very large numbers with almost perfect precision.
But, as it calculates on characters, it costs much more.
Further, if you want to use decimal, you need to use it consistently to preserve precision. If you mix decimal with normal types such as float, it may cause unexpected problems.
Finally, when you construct a Decimal object, it is better to pass a string rather than a number.
>>> print(Decimal(99.99) - Decimal(99.973))
0.01699999999999590727384202182
>>> print(Decimal("99.99") - Decimal("99.973"))
0.017
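On the point about mixing types: decimal refuses to combine Decimal and float in arithmetic, which surfaces the problem loudly instead of silently losing precision (traceback abbreviated):
>>> Decimal("99.99") + 0.01
Traceback (most recent call last):
  ...
TypeError: unsupported operand type(s) for +: 'decimal.Decimal' and 'float'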
It depends what your end goal is - there is no way to "perfectly" store floating point numbers. Only "good enough".
If you are working with money, for example (dollars and cents), it is common practice to store only cents (1 dollar = 100 cents) - this is how PayPal stores your account balance on their servers.
There is also the python Decimal class for fixed point arithmetic.

Exponential of very small number in python

I am trying to calculate the exponential of -1200 in python (it's an example, I don't need -1200 in particular but a collection of numbers that are around -1200).
>>> math.exp(-1200)
0.0
It is giving me an underflow; how can I get around this problem?
Thanks for any help :)
In the standard library, you can look at the decimal module:
>>> import decimal
>>> decimal.Decimal(-1200)
Decimal('-1200')
>>> decimal.Decimal(-1200).exp()
Decimal('7.024601888177132554529322758E-522')
If you need more functions than decimal supports, you could look at the library mpmath, which I use and like a lot:
>>> import mpmath
>>> mpmath.exp(-1200)
mpf('7.0246018881771323e-522')
>>> mpmath.mp.dps = 200
>>> mpmath.exp(-1200)
mpf('7.0246018881771325545293227583680003334372949620241053728126200964731446389957280922886658181655138626308272350874157946618434229308939128146439669946631241632494494046687627223476088395986988628688095132e-522')
but if possible, you should see if you can recast your equations to work entirely in the log space.
Try calculating in the logarithmic domain for as long as possible, i.e. avoid computing the exact value and keep working with exponents.
exp(-1200) IS a very very small number (just as exp(1200) is a very very big one), so maybe the exact value is not really what you are interested in. If you only need to compare these numbers then logarithmic space should be enough.
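For instance, a minimal sketch of staying in log space (logaddexp here is a hand-rolled helper; numpy ships an equivalent under the same name):
import math

# Comparing exp(-1200) and exp(-1201): just compare the exponents.
print(-1200 > -1201)  # True, hence exp(-1200) > exp(-1201)

# Adding two such numbers without underflow (the log-sum-exp trick):
def logaddexp(x, y):
    m = max(x, y)
    return m + math.log1p(math.exp(-abs(x - y)))

print(logaddexp(-1200, -1201))  # about -1199.687, i.e. log(exp(-1200) + exp(-1201))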
