I am using np.divide to divide two vectors. The numerator has all floats and the denominator is a mix of nice sized floats, extremely small floats, and np.inf. The resulting vector has a np.nan in every place, even though only a handful of entries should have that. How can I fix this to have np.nan where appropriate and floats everywhere else?
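Try this code (pass out= as well, because entries skipped by where= are otherwise left uninitialized):
np.true_divide(A, B, out=np.zeros_like(A), where=(A!=0) | (B!=0))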
or
C = A / B # may print warnings, suppress them with np.errstate (or np.seterr) if you want
C[np.isnan(C)] = 0
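If you would rather keep np.nan in the undefined slots, as the question asks, here is a minimal sketch. The sample arrays A and B are made up for illustration, and it assumes that "appropriate" means 0/0 positions only:

```python
import numpy as np

# Hypothetical sample data: ordinary floats, a tiny float, a zero and np.inf.
A = np.array([1.0, 2.0, 0.0, 4.0])
B = np.array([2.0, 1e-300, 0.0, np.inf])

# Prefill the output with nan; positions where the mask is False keep that value.
out = np.full_like(A, np.nan)
mask = (A != 0) | (B != 0)

# Silence floating point warnings only inside this block.
with np.errstate(divide="ignore", invalid="ignore", over="ignore"):
    C = np.divide(A, B, out=out, where=mask)

print(C)
```

Only the 0/0 entry stays nan here; dividing by np.inf gives 0.0 and dividing by a tiny float gives a large but finite number.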
Suppose I've got an array of strings and I know that they are numbers but don't know whether they are intish or floatish. If I knew, I could simply convert using .astype(int) or .astype(float). What is a good (readable, performant, ideally not involving off the cuff regular expressions) way of finding out? Is there some Python or numpy machinery one could hijack?
One thing you could certainly do is parse them all as floats, since ints will also parse as floats. Then check whether the remainder after dividing by 1 is 0 (if it is, the value is an integer). That gives you a boolean array telling you where the integers are, like so:
import numpy as np
array = np.array(['454.13', '1.243', '8'])
floats = array.astype(float)
remainders = np.remainder(floats, 1)
int_indices = remainders == 0
ints = floats[int_indices]
I have deliberately done it here in 'too many' steps to make it clearer what I mean. Hope this helps.
The output is:
>>>ints
array([8.00000000e+00])
>>>int_indices
array([False, False, True])
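The same idea can be wrapped into a helper that picks the dtype for the whole array. This is only a sketch: the name parse_numeric_strings is made up, and it assumes the values are small enough to survive a round trip through float (roughly below 2**53).

```python
import numpy as np

def parse_numeric_strings(strings):
    """Return an int array if every value is whole, otherwise a float array."""
    floats = np.asarray(strings).astype(float)
    # A remainder of 0 after dividing by 1 means the value is a whole number.
    if np.all(np.remainder(floats, 1) == 0):
        return floats.astype(int)
    return floats

print(parse_numeric_strings(['454.13', '1.243', '8']).dtype)
print(parse_numeric_strings(['3', '8', '12']).dtype)
```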
I'm trying to estimate market shares with the following formula:
c = np.exp(-Mu*a)/(np.exp(-Mu*a)+np.exp(-Mu*b))
in which a and b are 9x9 matrices with cell values that can be larger than 1000. Because the numbers are so small, Python returns NaN values. In order to enhance the precision of the estimation I have already tried np.float128, but all this does is raise the error that numpy doesn't have an attribute called float128. I have also tried longdouble, again without success. Are there other ways to make Python show the actual values of the cells instead of NaN?
You have:
c = np.exp(-Mu*a)/(np.exp(-Mu*a)+np.exp(-Mu*b))
Multiplying the numerator and denominator by e^(Mu*a), you get:
c = 1/(1+np.exp(Mu*(a-b)))
This is just a reformulation of the same formula.
Now, if the exp term is still too small to represent and you do not need a more precise result, then your c is very close to 1. (Conversely, if the exp term overflows, c is effectively 0.) And if you still need to control precision, you can take the log of both sides and use the Taylor expansion of log(1+x).
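For what it's worth, the reformulated expression can also be evaluated without overflow warnings by going through np.logaddexp. This is a sketch under assumptions: the function name market_share and the sample matrices are made up, and it uses the identity 1/(1+exp(x)) = exp(-log(1+exp(x))).

```python
import numpy as np

def market_share(a, b, mu):
    """Evaluate exp(-mu*a) / (exp(-mu*a) + exp(-mu*b)) in a stable way."""
    x = mu * (np.asarray(a, dtype=float) - np.asarray(b, dtype=float))
    # np.logaddexp(0, x) computes log(exp(0) + exp(x)) = log(1 + exp(x))
    # without ever forming exp(x) directly, so large x does not overflow.
    return np.exp(-np.logaddexp(0.0, x))

# Hypothetical 2x2 example with values large enough to break the naive formula.
a = np.array([[1000.0, 0.0], [5.0, 2000.0]])
b = np.array([[0.0, 1000.0], [5.0, 2000.0]])
c = market_share(a, b, mu=1.0)
print(c)
```

Cells where a equals b come out as 0.5 even when both are huge, which is where the original formula produces 0/0 and therefore NaN.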
I'm just working on Project Euler problem 12, so I need to do some testing against numbers that are multiples of over 500 unique factors.
I figured that the array [1, 2, 3... 500] would be a good starting point, since the product of that array is the lowest possible such number. However, numpy.prod() returns zero for this array. I'm sure I'm missing something obvious, but what the hell is it?
>>> import numpy as np
>>> array = []
>>> for i in range(1,100):
...     array.append(i)
...
>>> np.prod(array)
0
>>> array.append(501)
>>> np.prod(array)
0
>>> array.append(5320934)
>>> np.prod(array)
0
Note that Python uses "unlimited" integers, but in numpy everything is typed, and so it is a "C"-style (probably 64-bit) integer here. You're probably experiencing an overflow.
If you look at the documentation for numpy.prod, you can see the dtype parameter:
The type of the returned array, as well as of the accumulator in which the elements are multiplied.
There are a few things you can do:
Drop back to Python, and multiply using its "unlimited integers" (see this question for how to do so).
Consider whether you actually need to find the product of such huge numbers. Often, when you're working with the product of very small or very large numbers, you switch to sums of logarithms. As @WarrenWeckesser notes, this is obviously imprecise (it's not like taking the exponent at the end will give you the exact solution) - rather, it's used to gauge whether one product is growing faster than another.
Those numbers get very big, fast.
>>> np.prod(array[:25])
7034535277573963776
>>> np.prod(array[:26])
-1569523520172457984
>>> type(_)
numpy.int64
You're actually overflowing numpy's data type here, hence the wack results. If you stick to python ints, you won't have overflow.
>>> import operator
>>> reduce(operator.mul, array, 1)
933262154439441526816992388562667004907159682643816214685929638952175999932299156089414639761565182862536979208272237582511852109168640000000000000000000000L
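On Python 3, reduce lives in functools, and math.prod (3.8+) does the same job directly. A short sketch comparing the exact Python product, the wrapped int64 product, and the sum-of-logs alternative mentioned above (the variable names are made up):

```python
import math
import numpy as np

numbers = list(range(1, 100))

# Exact product using Python's arbitrary-precision ints (Python 3.8+).
exact = math.prod(numbers)

# The same product in a fixed-width 64-bit accumulator silently wraps around.
wrapped = np.prod(np.array(numbers, dtype=np.int64))

# Sum of logarithms: approximate, but never overflows for inputs like these.
log_product = np.sum(np.log(numbers))

print(exact == math.factorial(99))
print(wrapped)
print(log_product)
```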
You get the result 0 because of the large number of factors of 2 in the product: there are more than 450 of them. Thus, in a reduction modulo 2^64, the result is zero.
Why the data type forces this reduction is explained in the other answers.
250+125+62+31+15+7+3+1 = 494 is the multiplicity of 2 in 500!
added 12/2020: or, in closer reading the question and its code,
49+24+12+6+3+1 = 95 as the multiplicity of 2 in 99!
which is the product of the first part of your list. Still enough binary zeros at the end of the number to fill all the bit positions of a 64bit integer. Just to compare, you get
19+3 = 22 factors of 5 in 99!
which is also the number of trailing zeros in the decimal expression of this factorial.
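The sums above follow Legendre's formula: the multiplicity of a prime p in n! is floor(n/p) + floor(n/p^2) + ... A small sketch to reproduce them (the function name is made up):

```python
def multiplicity_in_factorial(p, n):
    """Count how many times the prime p divides n! (Legendre's formula)."""
    count = 0
    power = p
    while power <= n:
        count += n // power  # multiples of p, p^2, p^3, ... up to n
        power *= p
    return count

print(multiplicity_in_factorial(2, 500))
print(multiplicity_in_factorial(2, 99))
print(multiplicity_in_factorial(5, 99))
```

Any count of 64 or more for p = 2 means the product is a multiple of 2^64, so a 64-bit integer wraps to exactly 0.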
My code:
import math
import cmath
print "E^ln(-1)", cmath.exp(cmath.log(-1))
What it prints:
E^ln(-1) (-1+1.2246467991473532E-16j)
What it should print:
-1
(For Reference, Google checking my calculation)
According to the documentation at python.org, cmath.exp(x) returns e^(x) and cmath.log(x) returns ln(x), so unless I'm missing a semicolon or something, this is a pretty straightforward three line program.
When I test cmath.log(-1) it returns πi (technically 3.141592653589793j). Which is right. Euler's identity says e^(πi) = -1, yet Python says when I raise e^(πi), I get some kind of crazy talk (specifically -1+1.2246467991473532E-16j).
Why does Python hate me, and how do I appease it?
Is there a library to include to make it do math right, or a sacrifice I have to offer to van Rossum? Is this some kind of floating point precision issue perhaps?
The big problem I'm having is that the precision is off enough to have other values appear closer to 0 than actual zero in the final function (not shown), so boolean tests are worthless (i.e. if(x==0)) and so are local minimums, etc...
For example, in an iteration below:
X = 2 Y= (-2-1.4708141202500006E-15j)
X = 3 Y= -2.449293598294706E-15j
X = 4 Y= -2.204364238465236E-15j
X = 5 Y= -2.204364238465236E-15j
X = 6 Y= (-2-6.123233995736765E-16j)
X = 7 Y= -2.449293598294706E-15j
3 & 7 are both actually equal to zero, yet they appear to have the largest imaginary parts of the bunch, and 4 and 5 don't have their real parts at all.
Sorry for the tone. Very frustrated.
As you've already demonstrated, cmath.log(-1) doesn't return exactly i*pi. Of course, returning pi exactly is impossible as pi is an irrational number...
Now you raise e to the power of something that isn't exactly i*pi and you expect to get exactly -1. However, if cmath returned that, you would be getting an incorrect result. (After all, exp(i*pi+epsilon) shouldn't equal -1 -- Euler doesn't make that claim!).
For what it's worth, the result is very close to what you expect -- the real part is -1 with an imaginary part close to floating point precision.
It appears to be a rounding issue. While -1+1.22460635382e-16j is not the exact value, 1.22460635382e-16j is pretty close to zero. I don't know how you could fix this, but a quick and dirty way could be rounding the number to a certain number of digits after the decimal point (14, maybe?).
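Anything with a magnitude below about 10^-15, next to values of order 1, can normally be treated as zero. Double-precision calculations carry a relative rounding error of roughly that size (machine epsilon is about 2.2e-16). Floating point representations are approximations, not exact values.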
The problem is inherent to representing irrational numbers (like π) in finite space as floating points.
The best you can do is filter your result and set it to zero if its value is within a given range.
>>> tolerance = 1e-15
>>> def clean_complex(c):
...     real, imag = c.real, c.imag
...     if -tolerance < real < tolerance:
...         real = 0
...     if -tolerance < imag < tolerance:
...         imag = 0
...     return complex(real, imag)
...
>>> clean_complex( cmath.exp(cmath.log(-1)) )
(-1+0j)
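For the boolean tests mentioned in the question (if(x==0)), another option on Python 3.5+ is cmath.isclose, which compares within a tolerance instead of exactly. A minimal sketch; the abs_tol value of 1e-12 is an assumption and should be chosen to suit the scale of your data:

```python
import cmath

value = cmath.exp(cmath.log(-1))

# The default relative tolerance (1e-09) is enough when comparing against -1.
is_minus_one = cmath.isclose(value, -1)

# Comparing against zero needs an absolute tolerance, because a relative
# tolerance scales with the operands and shrinks to nothing near 0.
y = -2.449293598294706e-15j
is_zero_default = cmath.isclose(y, 0)
is_zero_abs = cmath.isclose(y, 0, abs_tol=1e-12)

print(is_minus_one, is_zero_default, is_zero_abs)
```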
I need to get a list of numbers derived from a range(15,20,1), where each number is 15*x*100:
X Y
15 100
16 93,75
17 88,2352941176
18 83,3333333333
19 78,9473684211
20 75
The result should be an array of y with two decimals.
I tried with python and it took me a while to find out that it will only handle integers. Then I tried numpy, but I am still not getting there. I am a nub, so sorry for the stupidity of the question, but after trying for two hours I decided to post a question.
Best,
Mace
Do you need something like:
ylist = [float(x)*15*100 for x in range(15,21)]
?
This would return:
[22500.0, 24000.0, 25500.0, 27000.0, 28500.0, 30000.0]
I'm not quite sure what your Y column means, since your formula 15*x*100 doesn't generate those values.
If you actually mean 15/x*100, it would be:
ylist = [15/float(x)*100 for x in range(15,21)]
Or even simpler:
ylist = [15.0/x*100.0 for x in range(15,21)]
In Python 2, if all the values in a division are of type int, Python produces an int as the result (the fractional part is discarded). If, on the other hand, one of them is a float, the result is a float.
This coercion can be done either explicitly using float(x), or simply by writing one of your constants as a floating point value, like 100.0.
As to the 2 decimal places need, it depends on what you need to do with the values.
One way is to use round to 2 decimal places, like:
ylist = [round(15.0/x*100.0, 2) for x in range(15,21)]
If you always need two decimal places, you'll probably want to use string formatting; check @mgilson's reply for that.
Perhaps you're looking for something like:
>>> print [float(15)/x*100 for x in range(15,21)]
[100.0, 93.75, 88.23529411764706, 83.33333333333334, 78.94736842105263, 75.0]
This doesn't give you the number to 2 decimal places. For that you'll need round or string formatting ... (I'm not sure exactly what you want to do with the numbers after the fact, so it's hard to give a recommendation here). Here's an example with string formatting:
>>> print ['{0:.2f}'.format(float(15)/x*100) for x in range(15,21)]
['100.00', '93.75', '88.24', '83.33', '78.95', '75.00']
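Since the question also mentions numpy, here is a sketch of the same calculation as an array, assuming the intended formula is 15/x*100 (the one that matches the Y column):

```python
import numpy as np

x = np.arange(15, 21)               # 15, 16, ..., 20
y = np.round(15.0 / x * 100.0, 2)   # float array rounded to two decimals

# Rounding changes the stored values; for display with exactly two
# decimal places, format each value as a string instead.
labels = ["{0:.2f}".format(v) for v in y]

print(y)
print(labels)
```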