I am playing around with the math module in Python 3.4 and I got some curious results when using the fmod function, for which I am having a hard time finding detailed info on the Python website.
One simple example is the following:
from math import *
x = 99809175801648148531
y = 6.5169020832937505
sqrt(x)-cos(x)**fmod(x, y)*log10(x)
it returns:
(9990454237.014296+8.722374238018135j)
How to interpret this result? What is j?
Is it an imaginary number like i?
If so, why j and not i?
Any info, as well as links to some resources about fmod are very welcome.
The result you got was a complex number because you exponentiated a negative number. i and j are just different notational choices for the imaginary unit: i is more common in mathematics, while j is more common in engineering. You can see in the docs that Python has chosen to use j:
https://docs.python.org/2/library/cmath.html#conversions-to-and-from-polar-coordinates
Here, j is the same as i, the square root of -1. It is a convention commonly used in engineering, where i is used to denote electrical current.
The reason complex numbers arise in your case is that you're raising a negative number to a fractional power. See How do you compute negative numbers to fractional powers? for further discussion.
cos(x) is a negative number. When you raise a negative number to a non-integral power, it is not surprising to get a complex result. Most roots of negative numbers are complex.
>>> x = 99809175801648148531
>>> y = 6.5169020832937505
>>> cos(x)
-0.7962325418899466
>>> fmod(x,y)
3.3940870272073056
>>> cos(x)**fmod(x,y)
(-0.1507219382442201-0.436136801343955j)
Imaginary numbers can be represented with either an 'i' or a 'j'. I believe the reasons are historical. Mathematicians preferred 'i' for imaginary. Electrical engineers didn't want to get an imaginary 'i' confused with an 'i' for current, so they used 'j'. Now, both are used.
I found some Python code that claims to check primality based on Fermat's little theorem:
def CheckIfProbablyPrime(x):
    return (2 << x - 2) % x == 1
My questions:
How does it work?
What's its relation to Fermat's little theorem?
How accurate is this method?
If it's not accurate, what's the advantage of using it?
I found it here.
1. How does it work?
Fermat's little theorem says that if a number x is prime, then for any integer a:
a^x ≡ a (mod x)
If we divide both sides by a, then we can re-write the equation as follows:
a^(x-1) ≡ 1 (mod x)
I'm going to punt on proving how this works (your first question) because there are many good proofs (better than I can provide) on this wiki page and under some Google searches.
2. Relation between code and theorem
So, the function you posted checks if (2 << x - 2) % x == 1.
First off, (2 << x-2) is the same thing as writing 2**(x-1), or in math-form:
2^(x-1)
That's because << is the logical left-shift operator, which is explained better here. The relation between bit-shifting and multiplying by powers of 2 is specific to the way that numbers are represented on computers (in binary), but it all boils down to
2 << (x-1) = 2 * 2^(x-1) = 2^x
I can subtract 1 from the exponent on both sides, which gives
2 << (x-2) = 2^(x-1)
Now, we know from above that for any number a,
a^(x-1) ≡ 1 (mod x)
Let's say then that a = 2. That gives us
2^(x-1) ≡ 1 (mod x)
Well heck, that's the same as 2 << (x-2)! So then we can write:
(2 << (x-2)) ≡ 1 (mod x)
Which leads to the final relation:
(2 << (x-2)) mod x = 1
Now, the math version of mod looks kind of odd, but we can write the equivalent code as follows:
(2 << x - 2) % x == 1
And that's the relation.
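A quick interactive check of the identity and the relation, using a small prime (my own example):
>>> x = 13
>>> 2 << x - 2
4096
>>> 2**(x - 1)
4096
>>> (2 << x - 2) % x == 1
True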
3. Accuracy of method
So, I think "accuracy" is a bad term here, because Fermat's little theorem is definitely true for all prime numbers. However, that does not mean that it's true or false for all numbers -- which is to say, if I have some number i, and I'm not sure if i is prime, using Fermat's Little Relation will only tell me if it is definitely NOT prime. If Fermat's Little Relation is true, then i could not be prime. These kinds of numbers are called pseudoprime numbers, or more specifically in this case Fermat Pseudoprime numbers.
If this sort of thing sounds interesting, take a look at the Carmichael numbers, AKA the absolute Fermat pseudoprimes, which are composite yet pass the Fermat test in every base coprime to them. In our case we run into numbers which pass in base 2; a base-2 pseudoprime might still be caught by the test in some other base, but the Carmichael numbers pass it for all bases coprime to x.
On the wiki page for the Carmichael numbers there is a discussion of their distribution over the range of natural numbers -- their count grows with the size of the range you're searching roughly as a power law, though the exponent is less than 1 (about 1/3). So, if you're searching for primes over a big range, you're going to run into more and more Carmichael numbers, which are effectively false positives for this method CheckIfProbablyPrime. That might be okay, depending on your input and how much you care about running into false positives.
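For a concrete false positive (my example, not from the original post), assuming the CheckIfProbablyPrime from the question: 341 = 11 * 31 is the smallest base-2 Fermat pseudoprime, and 561 = 3 * 11 * 17 is the smallest Carmichael number; both slip past the check:
>>> CheckIfProbablyPrime(341)   # 341 = 11 * 31, composite
True
>>> CheckIfProbablyPrime(561)   # 561 = 3 * 11 * 17, a Carmichael number
True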
4. Why is this useful?
In short, it's an optimization.
The main reason to use something like this is to speed up a search for prime numbers. That's because actually checking whether a number is prime is expensive -- certainly more than O(1) running time. So, if we can avoid doing that full check for some numbers, we'll be able to devote more time to checking actual candidates. Since Fermat's little relation will only say yes if a number is possibly prime (it will never say no if the number is prime), and it is very cheap to evaluate, we can toss it into an is_prime loop as a pre-filter to throw out a fair amount of numbers early. So, we can speed things up, as in the sketch below.
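Here is a minimal sketch (mine, not from the original post) of that pre-filter idea, reusing the CheckIfProbablyPrime from the question in front of a slow but exact trial-division test:
def is_definitely_prime(x):
    # exact but slow: plain trial division
    if x < 2:
        return False
    d = 2
    while d * d <= x:
        if x % d == 0:
            return False
        d += 1
    return True

def is_prime(x):
    # cheap Fermat-style filter first (skipped for x <= 2, where the
    # relation doesn't apply), then the expensive check for survivors
    if x > 2 and not CheckIfProbablyPrime(x):
        return False
    return is_definitely_prime(x)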
There are many primality checks like this one; you can find some coded prime checkers here.
Final Note
One of the confusing things about this optimization is that it uses the bit shift operator << instead of the exponentiation operator **. This is because bit shifting is one of the fastest operations your computer can do, while exponentiation is slower by some amount. It is not always the best optimization, because most modern languages know how to replace things we write with more optimized operations. But that's my guess as to why the authors of this code used the bit shift instead of 2**(x-1).
Edit: As MarkDickinson notes, raising a number to a large power and then taking the modulus explicitly is not the best way to do it. This is called modular exponentiation, and there exist algorithms which can do it faster than the way we've written it. Python's builtin pow actually implements one of these algorithms, and takes an optional third argument to mod by. So we can write a final version of this function:
def CheckIfProbablyPrime(x):
    return pow(2, x-1, x) == 1
Which is not only more readable but also faster than the confusing bit-shift crap. You know what they say.
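If you want to see the difference yourself, here is a rough timing sketch (mine, not from the original post; absolute numbers will vary with your machine and Python version):
import timeit

x = 10007  # a prime, big enough that the full 2**(x-1) integer gets large

bit_shift = timeit.timeit(lambda: (2 << x - 2) % x == 1, number=1000)
mod_pow = timeit.timeit(lambda: pow(2, x - 1, x) == 1, number=1000)
print("bit shift then mod:", bit_shift)
print("three-argument pow:", mod_pow)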
I believe the code in your example is incorrect, because the binary left-shift operator is not the same as raising a number to a power, which is what Fermat's little theorem uses. With a base of two, a binary left shift by x is equal to 2 to the power of x + 1, which is NOT what this version of Fermat's little theorem uses.
Instead, use ** for power of integer in Python.
def CheckIfProbablyPrime(x):
    return (2 ** x - 2) % x == 0
" p − a is an integer multiple of p " therefore for primes, following theorem, result of 2 in power of x - 2 divided by x will leave a leftover of 0 (modulo '%' checks for number left over after division.
For x - 1 version,
def CheckIfProbablyPrime(a, x):
    return (a ** (x-1) - 1) % x == 0
Both variations should return True for prime numbers, because they represent Fermat's little theorem in Python.
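A quick sanity check of both variants (my example); note that, like the version in the question, they are only probabilistic -- composite base-2 pseudoprimes such as 341 still pass:
>>> (2 ** 13 - 2) % 13 == 0          # 13 is prime
True
>>> (2 ** (13 - 1) - 1) % 13 == 0
True
>>> (2 ** 341 - 2) % 341 == 0        # 341 = 11 * 31 is composite, but passes
True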
My code:
import math
import cmath
print "E^ln(-1)", cmath.exp(cmath.log(-1))
What it prints:
E^ln(-1) (-1+1.2246467991473532E-16j)
What it should print:
-1
(For Reference, Google checking my calculation)
According to the documentation at python.org, cmath.exp(x) returns e^(x) and cmath.log(x) returns ln(x), so unless I'm missing a semicolon or something, this is a pretty straightforward three-line program.
When I test cmath.log(-1) it returns πi (technically 3.141592653589793j). Which is right. Euler's identity says e^(πi) = -1, yet Python says when I raise e^(πi), I get some kind of crazy talk (specifically -1+1.2246467991473532E-16j).
Why does Python hate me, and how do I appease it?
Is there a library to include to make it do math right, or a sacrifice I have to offer to van Rossum? Is this some kind of floating point precision issue perhaps?
The big problem I'm having is that the precision is off enough to have other values appear closer to 0 than actual zero in the final function (not shown), so boolean tests are worthless (i.e. if(x==0)) and so are local minimums, etc...
For example, in an iteration below:
X = 2 Y= (-2-1.4708141202500006E-15j)
X = 3 Y= -2.449293598294706E-15j
X = 4 Y= -2.204364238465236E-15j
X = 5 Y= -2.204364238465236E-15j
X = 6 Y= (-2-6.123233995736765E-16j)
X = 7 Y= -2.449293598294706E-15j
3 & 7 are both actually equal to zero, yet they appear to have the largest imaginary parts of the bunch, and 4 and 5 don't have their real parts at all.
Sorry for the tone. Very frustrated.
As you've already demonstrated, cmath.log(-1) doesn't return exactly i*pi. Of course, returning pi exactly is impossible as pi is an irrational number...
Now you raise e to the power of something that isn't exactly i*pi and you expect to get exactly -1. However, if cmath returned that, you would be getting an incorrect result. (After all, exp(i*pi+epsilon) shouldn't equal -1 -- Euler doesn't make that claim!).
For what it's worth, the result is very close to what you expect -- the real part is -1 with an imaginary part close to floating point precision.
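If what you really need is the test "is this effectively -1", compare against a tolerance instead of using ==; a minimal check along those lines (my sketch):
>>> z = cmath.exp(cmath.log(-1))
>>> abs(z - (-1)) < 1e-9
True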
It appears to be a rounding issue. While -1+1.2246467991473532e-16j is not exactly the value you expected, the imaginary part 1.2246467991473532e-16 is pretty close to zero. I don't know how you could fix this in general, but a quick and dirty way could be rounding the number to a certain number of digits after the dot (14, maybe?).
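For instance (a quick sketch of that rounding idea):
>>> z = cmath.exp(cmath.log(-1))
>>> complex(round(z.real, 14), round(z.imag, 14))
(-1+0j)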
Anything less than about 10^-15 can normally be treated as zero. Computer calculations have a certain error that is often in that range. Floating-point representations are approximations, not exact values.
The problem is inherent to representing irrational numbers (like π) in finite space as floating points.
The best you can do is filter your result and set it to zero if its value is within a given range.
>>> tolerance = 1e-15
>>> def clean_complex(c):
...     real, imag = c.real, c.imag
...     if -tolerance < real < tolerance:
...         real = 0
...     if -tolerance < imag < tolerance:
...         imag = 0
...     return complex(real, imag)
...
>>> clean_complex( cmath.exp(cmath.log(-1)) )
(-1+0j)
I am wondering about the way Python (3.3.0) prints complex numbers. I am looking for an explanation, not a way to change the print.
Example:
>>> complex(1,1)-complex(1,1)
0j
Why doesn't it just print "0"? My guess is: to keep the output of type complex.
Next example:
>>> complex(0,1)*-1
(-0-1j)
Well, a simple "-1j" or "(-1j)" would have done. And why "-0"?? Isn't that the same as +0? It doesn't seem to be a rounding problem:
>>> (complex(0,1)*-1).real == 0.0
True
And when the imaginary part gets positive, the -0 vanishes:
>>> complex(0,1)
1j
>>> complex(0,1)*-1
(-0-1j)
>>> complex(0,1)*-1*-1
1j
Yet another example:
>>> complex(0,1)*complex(0,1)*-1
(1-0j)
>>> complex(0,1)*complex(0,1)*-1*-1
(-1+0j)
>>> (complex(0,1)*complex(0,1)*-1).imag
-0.0
Am I missing something here?
It prints 0j to indicate that it's still a complex value. You can also type it back in that way:
>>> 0j
0j
The rest is probably the result of the magic of IEEE 754 floating point representation, which makes a distinction between 0 and -0, the so-called signed zero. Basically, there's a single bit that says whether the number is positive or negative, regardless of whether the number happens to be zero. This explains why 1j * -1 gives something with a negative zero real part: the positive zero got multiplied by -1.
-0 is required by the standard to compare equal to +0, which explains why (1j * -1).real == 0.0 still holds.
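A quick interactive check shows the sign bit surviving even though the comparison succeeds:
>>> from math import copysign
>>> (1j * -1).real
-0.0
>>> (1j * -1).real == 0.0
True
>>> copysign(1.0, (1j * -1).real)
-1.0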
The reason that Python still decides to print the -0 is that, in the complex world, these signs make a difference for branch cuts, for instance in the phase function:
>>> phase(complex(-1.0, 0.0))
3.141592653589793
>>> phase(complex(-1.0, -0.0))
-3.141592653589793
This is about the imaginary part, not the real part, but it's easy to imagine situations where the sign of the real part would make a similar difference.
The answer lies in the Python source code itself.
I'll work with one of your examples. Let
a = complex(0,1)
b = complex(-1, 0)
When you do a*b, you're calling this function:
real_part = a.real*b.real - a.imag*b.imag
imag_part = a.real*b.imag + a.imag*b.real
And if you do that in the python interpreter, you'll get
>>> real_part
-0.0
>>> imag_part
-1.0
From IEEE754, you're getting a negative zero, and since that's not +0, you get the parens and the real part when printing it.
if (v->cval.real == 0. && copysign(1.0, v->cval.real)==1.0) {
/* Real part is +0: just output the imaginary part and do not
include parens. */
...
else {
/* Format imaginary part with sign, real part without. Include
parens in the result. */
...
I guess (but I don't know for sure) that the rationale comes from the importance of that sign when calculating with elementary complex functions (there's a reference for this in the wikipedia article on signed zero).
0j is an imaginary literal which indeed indicates a complex number rather than an integer or floating-point one.
The +-0 ("signed zero") is a result of Python's conformance to IEEE 754 floating point representation since in Python, complex is by definition a pair of floating point numbers. Due to the latter, there's no need to print or specify zero fraction parts for a complex too.
The -0 part is printed in order to accurately represent the contents as repr()'s documentation demands (repr() is implicitly called whenever an operation's result is output to the console).
Regarding the question of why -0+1j evaluates to 1j while 1j*-1 gives (-0+1j):
Note that -0+1j or -0.0+1j aren't single complex numbers but expressions -- an int/float added to a complex. To compute the result, the first number is converted to a complex (-0 becomes (0.0, 0.0), since integers don't have signed zeros; -0.0 becomes (-0.0, 0.0)). Then its .real and .imag are added to the corresponding parts of 1j, which are (+0.0, 1.0). The result is (+0.0, 1.0) :^) . To construct such a complex directly, use complex(-0.0, 1).
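A quick interactive illustration of that difference:
>>> -0.0 + 1j
1j
>>> complex(-0.0, 1)
(-0+1j)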
As far as the first question is concerned: if it just printed 0 it would be mathematically correct, but you wouldn't know you were dealing with a complex object vs an int. As long as you don't ask for .real, you will always get a j component.
I'm not sure why you would ever get -0; it's not technically incorrect (-1 * 0 = 0) but it's syntactically odd.
As far as the rest goes, it's strange that it isn't consistent, however none are technically correct, just an artifact of the implementation.
How can we truncate (not round) the cube root of a given number after the 10th decimal place in python?
For Example:
If the number is 8, the required output is 2.0000000000, and for 33076161 it is 321.0000000000.
Scale - truncate - unscale:
n = 10.0
cube_root = 1e-10 * int(1e10 * n**(1.0/3.0))
You should only do such truncations (unless you have a serious reason otherwise) when printing out results. There is no exact binary representation in floating-point format for a whole host of everyday decimal values:
print 33076161**(1.0/3.0)
A calculator gives you a different answer than Python gives you. Even Windows calculator does a passable job on cuberoot(33076161), whereas the answer given by python will be minutely incorrect unless you use rounding.
So, the question you ask is fundamentally unanswerable since it assumes capabilities that do not exist in floating point math.
Wrong Answer #1: This actually rounds instead of truncating, but for the cases you specified, it provides the correct output, probably due to rounding compensating for the inherent floating point precision problem you will hit in case #2:
print "%3.10f" % 10**(1.0/3.0)
Wrong Answer #2: But you could truncate (as a string) an 11-digit rounded value, which, as has been pointed out to me, would fail for values very near rollover, and in other strange ways, so DON'T do this:
print ("%3.11f" % 10**(1.0/3.0))[:-1]
Reasonably Close Answer #3: I wrote a little function that is for display only:
import math
def str_truncate(f, d):
    # scale up, truncate to an integer, then re-insert the decimal point as a string
    s = f*(10.0**(d))
    str = `math.trunc(s)`.rstrip('L')   # Python 2 repr; strip the long-integer 'L' suffix
    n = len(str)-d
    w = str[0:n]          # whole part
    if w=='':
        w='0'
    ad = str[n:d+n]       # the d truncated decimal digits
    return w+'.'+ad
d = 8**(1.0/3.0)
t=str_truncate(d,10)
print 'case 1',t
d = 33076161**(1.0/3.0)
t=str_truncate(d,10)
print 'case 2',t
d = 10000**(1.0/3.0)
t=str_truncate(d,10)
print 'case 3',t
d = 0.1**(1.0/3.0)
t=str_truncate(d,10)
print 'case 4',t
Note that Python fails to perform exactly as per your expectations in case #2 due to your friendly neighborhood floating point precision being non-infinite.
You should maybe know about this document too:
What Every Computer Scientist Should Know About Floating-Point Arithmetic
And you might be interested to know that Python has add-ons that provide arbitrary-precision features that will allow you to calculate the cube root of something to any number of decimals you might want. Using packages like mpmath, you can free yourself from the accuracy limitations of conventional floating-point math, but at a considerable cost in performance (speed).
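For example, with mpmath (assuming you have it installed; a minimal sketch, not a full solution):
from mpmath import mp, cbrt

mp.dps = 30            # work with 30 significant decimal digits
print(cbrt(33076161))  # 321.0, exact to the working precision
print(cbrt(10))        # the cube root of 10 to ~30 significant digits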
It is interesting to me that the built-in decimal module does not solve this problem either, since 1/3 has a repeating, non-terminating expansion in decimal and thus can't be represented exactly in decimal notation any more than in binary floating point:
import decimal
third = decimal.Decimal(1)/decimal.Decimal(3)
print decimal.Decimal(33076161)**third # cuberoot using decimal
output:
320.9999999999999999999999998
Update: Sven provided this cool use of logs, which works for this particular case: it outputs the desired 321 instead of 320.99999.... Nifty. I love log(). However, while it works for 321 cubed, it fails in the case of 320 cubed:
exp(log(33076161)/3)
It seems that fractions doesn't solve this problem, but I wish it did:
import fractions
third = fractions.Fraction(1,3)
def cuberoot(n):
    return n ** third
print '%.14f'%cuberoot(33076161)
num = 17**(1.0/3.0)
num = int(num * 100000000000)/100000000000.0
print "%.10f" % num
What about this code? I created it for my personal use; although it is quite simple, it works well.
def truncation_machine(original,edge):
    '''
    Function of the function :) :
    it performs truncation operation on long decimal numbers.
    Input:
    a) the number that needs to undergo truncation.
    b) the no. of decimals that we want to KEEP.
    Output:
    A clean truncated number.
    Example: original=1.123456789
             edge=4
             output=1.1234
    '''
    import math
    g = original*(10**edge)
    h = math.trunc(g)
    T = h/(10**edge)
    print('The original number ('+str(original)+') underwent a '+str(edge)+'-digit truncation to be in the form: '+str(T))
    return T
I have been asked to test a library provided by a 3rd party. The library is known to be accurate to n significant figures. Any less-significant errors can safely be ignored. I want to write a function to help me compare the results:
def nearlyequal( a, b, sigfig=5 ):
The purpose of this function is to determine if two floating-point numbers (a and b) are approximately equal. The function will return True if a==b (exact match) or if a and b have the same value when rounded to sigfig significant-figures when written in decimal.
Can anybody suggest a good implementation? I've written a mini unit-test. Unless you can see a bug in my tests then a good implementation should pass the following:
assert nearlyequal(1, 1, 5)
assert nearlyequal(1.0, 1.0, 5)
assert nearlyequal(1.0, 1.0, 5)
assert nearlyequal(-1e-9, 1e-9, 5)
assert nearlyequal(1e9, 1e9 + 1 , 5)
assert not nearlyequal( 1e4, 1e4 + 1, 5)
assert nearlyequal( 0.0, 1e-15, 5 )
assert not nearlyequal( 0.0, 1e-4, 6 )
Additional notes:
Values a and b might be of type int, float or numpy.float64. Values a and b will always be of the same type. It's vital that conversion does not introduce additional error into the function.
Let's keep this numerical, so functions that convert to strings or use non-mathematical tricks are not ideal. This program will be audited by a mathematician who will want to be able to prove that the function does what it is supposed to do.
Speed... I've got to compare a lot of numbers so the faster the better.
I've got numpy, scipy and the standard-library. Anything else will be hard for me to get, especially for such a small part of the project.
As of Python 3.5, the standard way to do this (using the standard library) is with the math.isclose function.
It has the following signature:
isclose(a, b, rel_tol=1e-9, abs_tol=0.0)
An example of usage with absolute error tolerance:
from math import isclose
a = 1.0
b = 1.00000001
assert isclose(a, b, abs_tol=1e-8)
If you want it with precision of n significant digits, simply replace the last line with:
assert isclose(a, b, abs_tol=10**-n)
There is a function assert_approx_equal in numpy.testing (source here) which may be a good starting point.
def assert_approx_equal(actual,desired,significant=7,err_msg='',verbose=True):
"""
Raise an assertion if two items are not equal up to significant digits.
.. note:: It is recommended to use one of `assert_allclose`,
`assert_array_almost_equal_nulp` or `assert_array_max_ulp`
instead of this function for more consistent floating point
comparisons.
Given two numbers, check that they are approximately equal.
Approximately equal is defined as the number of significant digits
that agree.
Here's a take.
def nearly_equal(a,b,sig_fig=5):
    return (a == b or
            int(a*10**sig_fig) == int(b*10**sig_fig))
I believe your question is not defined well enough, and the unit-tests you present prove it:
If by 'round to N sig-fig decimal places' you mean 'N decimal places to the right of the decimal point', then the test assert nearlyequal(1e9, 1e9 + 1 , 5) should fail, because even when you round 1000000000 and 1000000001 to 0.00001 accuracy, they are still different.
And if by 'round to N sig-fig decimal places' you mean 'The N most significant digits, regardless of the decimal point', then the test assert nearlyequal(-1e-9, 1e-9, 5) should fail, because 0.000000001 and -0.000000001 are totally different when viewed this way.
If you meant the first definition, then the first answer on this page (by Triptych) is good.
If you meant the second definition, please say it, I promise to think about it :-)
There are already plenty of great answers, but here's a thought:
import math

def closeness(a, b):
    """Returns measure of equality (for two floats), in unit
    of decimal significant figures."""
    if a == b:
        return float("infinity")
    difference = abs(a - b)
    avg = (a + b)/2
    return math.log10(avg / difference)
if closeness(1000, 1000.1) > 3:
    print "Joy!"
This is a fairly common issue with floating point numbers. I solve it based on the discussion in Section 1.5 of Demmel[1]. (1) Calculate the roundoff error. (2) Check that the roundoff error is less than some epsilon. I haven't used python in some time and only have version 2.4.3, but I'll try to get this correct.
Step 1. Roundoff error
def roundoff_error(exact, approximate):
    return abs(approximate/exact - 1.0)
Step 2. Floating point equality
def float_equal(float1, float2, epsilon=2.0e-9):
    return (roundoff_error(float1, float2) < epsilon)
There are a couple obvious deficiencies with this code.
Division by zero error if the exact value is Zero.
Does not verify that the arguments are floating point values.
Revision 1.
def roundoff_error(exact, approximate):
    if (exact == 0.0 or approximate == 0.0):
        return abs(exact + approximate)
    else:
        return abs(approximate/exact - 1.0)
def float_equal(float1, float2, epsilon=2.0e-9):
    if not isinstance(float1, float):
        raise TypeError, "First argument is not a float."
    elif not isinstance(float2, float):
        raise TypeError, "Second argument is not a float."
    else:
        return (roundoff_error(float1, float2) < epsilon)
That's a little better. If either the exact or the approximate value is zero, then the error is equal to the absolute value of the other. If something besides a floating-point value is provided, a TypeError is raised.
At this point, the only difficult thing is setting the correct value for epsilon. I noticed in the documentation for version 2.6.1 that there is an epsilon attribute in sys.float_info, so I would use twice that value as the default epsilon. But the correct value depends on both your application and your algorithm.
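For reference, a sketch of that default (my own addition, assuming Python 2.6+ where sys.float_info exists, and reusing the roundoff_error defined above):
import sys

# machine epsilon for the platform's double type (about 2.22e-16 on most systems)
DEFAULT_EPSILON = 2.0 * sys.float_info.epsilon

def float_equal(float1, float2, epsilon=DEFAULT_EPSILON):
    return roundoff_error(float1, float2) < epsilon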
[1] James W. Demmel, Applied Numerical Linear Algebra, SIAM, 1997.
"Significant figures" in decimal is a matter of adjusting the decimal point and truncating to an integer.
>>> int(3.1415926 * 10**3)
3141
>>> int(1234567 * 10**-3)
1234
>>>
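Turning that idea into a comparison helper might look something like this (my sketch; to_sigfig_int is a hypothetical helper name). It treats the problem strictly as significant digits, so cases that need an absolute tolerance near zero would need extra handling:
import math

def to_sigfig_int(x, sigfig=5):
    # shift the decimal point so the first `sigfig` significant digits
    # sit to its left, then truncate to an integer
    if x == 0:
        return 0
    exponent = math.floor(math.log10(abs(x)))
    return int(x * 10 ** (sigfig - 1 - exponent))

def nearlyequal(a, b, sigfig=5):
    return a == b or to_sigfig_int(a, sigfig) == to_sigfig_int(b, sigfig)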
Oren Shemesh identified part of the problem with the problem as stated, but there's more:
assert nearlyequal( 0.0, 1e-15, 5 )
also fails under the second definition (and that's the definition I learned in school).
No matter how many digits you are looking at, 0 will not equal a non-zero value. This could prove to be a headache for such tests if you have a case whose correct answer is zero.
There is an interesting solution to this by B. Dawson (with C++ code)
at "Comparing Floating Point Numbers". His approach relies on strict IEEE representation of two numbers and the enforced lexicographical ordering when said numbers are represented as unsigned integers.
I have been asked to test a library provided by a 3rd party
If you are using the default Python unittest framework, you can use assertAlmostEqual
self.assertAlmostEqual(a, b, places=5)
There are lots of ways of comparing two numbers to see if they agree to N significant digits. Roughly speaking you just want to make sure that their difference is less than 10^-N times the largest of the two numbers being compared. That's easy enough.
But, what if one of the numbers is zero? The whole concept of relative-differences or significant-digits falls down when comparing against zero. To handle that case you need to have an absolute-difference as well, which should be specified differently from the relative-difference.
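A minimal sketch combining the two (the names and default tolerances are mine, not from the blog post below):
def nearly_equal(a, b, sig=5, abs_tol=1e-12):
    # relative check against the larger magnitude, with an absolute floor
    # so comparisons involving zero don't demand the impossible
    return abs(a - b) <= max(abs_tol, 10.0 ** -sig * max(abs(a), abs(b)))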
I discuss the problems of comparing floating-point numbers -- including a specific case of handling zero -- in this blog post:
http://randomascii.wordpress.com/2012/02/25/comparing-floating-point-numbers-2012-edition/