A function determines y (an integer) from a given x (integer) and s (float) as follows:
y = floor(x * s)
If x and y are known, how can I calculate s so that floor(x * s) is guaranteed to be exactly equal to y?
If I simply perform s = y / x is there any chance that floor(x * s) won't be equal to y due to floating point operations?
If I simply perform s = y / x is there any chance that floor(x * s) won't be equal to y due to floating point operations?
Yes, there is a chance it won't be equal. @Eric Postpischil offered a simple counterexample: y = 1 and x = 49.
(For discussion, let us limit x,y > 0.)
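A quick check of that counterexample in plain Python (the exact printed digits may vary, but on IEEE-754 doubles the product lands just below 1):
import math

x, y = 49, 1
s = y / x                      # naive scale factor
print(x * s)                   # slightly less than 1.0
print(math.floor(x * s) == y)  # False: floor gives 0, not 1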
To find a scale factor s that often works for a given x, y, we need to reverse y = floor(x * s) mathematically, accounting for the multiplication error (see ULP) and the floor truncation.
# Pseudo code
e = ULP(x*s)
y < (x*s + 0.5*e) + 1
y >= (x*s - 0.5*e)
# Estimate e
est = ULP((float)y)
s_lower = ((float)y - 1 - 0.5*est)/(float)x
s_upper = ((float)y + 0.5*est)/(float)x
A candidate s will lie in the range s_lower < s <= s_upper.
Perform the above with higher-precision routines. Then I recommend using the float closest to the midpoint of s_lower and s_upper.
Alternatively, an initial stab at s could use:
s_first_attempt = ((float)y - 0.5)/(float)x
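Independently of the bound derivation above, a brute-force sketch is to start from a candidate and nudge it one ULP at a time with math.nextafter (Python 3.9+), verifying floor(x * s) == y directly. This assumes x, y > 0 and that such a float s exists:
import math

def find_scale(x, y):
    # Search near y/x for a float s with floor(x * s) == y.
    s = y / x
    while math.floor(x * s) < y:
        s = math.nextafter(s, math.inf)    # nudge s up by one ULP
    while math.floor(x * s) > y:
        s = math.nextafter(s, -math.inf)   # nudge s down by one ULP
    return s

s = find_scale(49, 1)
print(math.floor(49 * s))  # 1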
If we rephrase your question, you are wondering whether the equation y = floor(x * (y/x)) holds for integers x and y, where y/x translates in Python into a 64-bit floating-point value, and the subsequent multiplication also produces a 64-bit floating-point value.
Python's 64-bit floats follow the IEEE-754 standard, which gives them 15-17 significant decimal digits of precision. To perform the division and multiplication, both x and y are converted to floats, and each of these operations may cost roughly one more bit of precision in the worst case, but they will certainly not increase the precision. As such, you can only expect about 15-17 significant digits in this operation. This means that y values above roughly 10^15 can exhibit rounding errors.
More practically, one example of this can be (and you can reuse this code for other examples):
import numpy as np
print("{:f}".format(np.floor(1.3 * (1.1e24 / 1.3))))
#> 1100000000000000008388608.000000
I want to generate some samples to estimate the expectation and variance of a random variable.
Given the probability density function: f(x) = {2x, 0 <= x <= 1; 0 otherwise}
I already found that E(X) = 2/3 and Var(X) = 1/18; my detailed solution is here: https://math.stackexchange.com/questions/4430163/simulating-expectation-of-continuous-random-variable
But here is what I have when simulating using python:
import numpy as np
N = 100_000
X = np.random.uniform(size=N, low=0, high=1)
Y = [2*x for x in X]
np.mean(Y) # 1.00221 <- not equal to 2/3
np.var(Y) # 0.3323 <- not equal to 1/18
What am I doing wrong here? Thank you in advance.
You are generating the mean and variance of Y = 2X, where X is Uniform(0,1), when what you want are draws whose density actually is f(x) = 2x. You know the density, but the CDF is more useful than the PDF for random variate generation. For your problem, the density is:
f(x) = 2x for 0 <= x <= 1, and 0 otherwise,
so the CDF is:
F(x) = x^2 for 0 <= x <= 1.
Given that the CDF is an easily invertible function on the range [0, 1], you can use inverse transform sampling to generate X values by setting F(X) = U, where U is a Uniform(0,1) random variable, and inverting the relationship to solve for X. For your problem, this yields X = U^(1/2), i.e. X = sqrt(U).
In other words, you can generate X values with
import numpy as np
N = 100_000
X = np.sqrt(np.random.uniform(size = N))
and then do anything you want with the data, such as calculate mean and variance, plot histograms, use in simulation models, or whatever.
A histogram will confirm that the generated data have the desired density:
import matplotlib.pyplot as plt
plt.hist(X, bins = 100, density = True)
plt.show()
produces a histogram that rises roughly linearly from 0 at x = 0 to 2 at x = 1, matching the target density f(x) = 2x.
The mean and variance estimates can then be calculated directly from the data:
print(np.mean(X), np.var(X)) # => 0.6661509538922444 0.05556962913014367
But wait! There’s more...
Margin of error
Simulation generates random data, so estimates of mean and variance will be variable across repeated runs. Statisticians use confidence intervals to quantify the magnitude of the uncertainty in statistical estimates. When the sample size is sufficiently large to invoke the central limit theorem, an interval estimate of the mean is calculated as (x-bar ± half-width), where x-bar is the estimate of the mean. For a so-called 95% confidence interval, the half-width is 1.96 * s / sqrt(n) where:
s is the estimated standard deviation;
n is the number of samples used in the estimates of mean and standard deviation; and
1.96 is a scaling constant derived from the normal distribution and the desired level of confidence.
The half-width is a quantitative measure of the margin of error, a.k.a. precision, of the estimate. Note that as n gets larger, the estimate has a smaller margin of error and becomes more precise, but there are diminishing returns to increasing the sample size due to the square root. Increasing the precision by a factor of 2 would require 4 times the sample size if independent sampling is used.
In Python:
var = np.var(X)
print(np.mean(X), var, 1.96 * np.sqrt(var / N))
produces results such as
0.6666763186360812 0.05511848269208021 0.0014551397290634852
where the third column is the confidence interval half-width.
Improving precision
Inverse transform sampling can yield greater precision for a given sample size if we use a clever trick based on fundamental properties of expectation and variance. In intro prob/stats courses you probably were told that Var(X + Y) = Var(X) + Var(Y). The true relationship is actually Var(X + Y) = Var(X) + Var(Y) + 2Cov(X,Y), where Cov(X,Y) is the covariance between X and Y. If they are independent, the covariance is 0 and the general relationship becomes the one we learn/teach in intro courses, but if they are not independent the more general equation must be used. Variance is always a positive quantity, but covariance can be either positive or negative. Consequently, it’s easy to see that if X and Y have negative covariance the variance of their sum will be less than when they are independent. Negative covariance means that when X is above its mean Y tends to be below its mean, and vice-versa.
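As a quick numerical check of that identity (using the negatively correlated pair sqrt(U) and sqrt(1 - U) that appears in the code below):
import numpy as np

rng = np.random.default_rng(42)
u = rng.uniform(size=100_000)
x1, x2 = np.sqrt(u), np.sqrt(1.0 - u)   # identically distributed, negatively correlated

# Check Var(X + X') = Var(X) + Var(X') + 2*Cov(X, X') numerically
lhs = np.var(x1 + x2)
rhs = np.var(x1) + np.var(x2) + 2 * np.cov(x1, x2, bias=True)[0, 1]
print(lhs, rhs)   # the two agree up to floating-point noise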
So how does that help? It helps because we can use the inverse transform, along with a technique known as antithetic variates, to create pairs of random variables which are identically distributed but have negative covariance. If U is a random variable with a Uniform(0,1) distribution, U' = 1 - U also has a Uniform(0,1) distribution. (In fact, flipping any symmetric distribution will produce the same distribution.) As a result, X = F^-1(U) and X' = F^-1(U') are identically distributed since they're defined by the same CDF, but will have negative covariance because they fall on opposite sides of their shared median and thus strongly tend to fall on opposite sides of their mean. If we average each pair to get A = (F^-1(U) + F^-1(1 - U)) / 2, the expected value is E[A] = E[(X + X')/2] = 2E[X]/2 = E[X], while the variance is Var(A) = [Var(X) + Var(X') + 2Cov(X,X')]/4 = 2[Var(X) + Cov(X,X')]/4 = [Var(X) + Cov(X,X')]/2. In other words, we get a random variable A whose average is an unbiased estimate of the mean of X but which has less variance.
To fairly compare antithetic results head-to-head with independent sampling, we take the original sample size and allocate it with half the data being generated by the inverse transform of the U’s, and the other half generated by antithetic pairing using 1-U’s. We then average the paired values and generate statistics as before. In Python:
U = np.random.uniform(size = N // 2)
antithetic_avg = (np.sqrt(U) + np.sqrt(1.0 - U)) / 2
anti_var = np.var(antithetic_avg)
print(np.mean(antithetic_avg), anti_var, 1.96*np.sqrt(anti_var / (N / 2)))
which produces results such as
0.6667222935263972 0.0018911848781598295 0.0003811869837216061
Note that the half-width produced with independent sampling is nearly 4 times as large as the half-width produced using antithetic variates. To put it another way, we would need more than an order of magnitude more data for independent sampling to achieve the same precision.
To approximate the integral of some function of x, say g(x), over S = [0, 1] using Monte Carlo simulation, you:
1. generate N random numbers in [0, 1] (i.e. draw from the uniform distribution U[0, 1]);
2. calculate the arithmetic mean of g(x_i) over i = 1 to i = N, where x_i is the ith random number, i.e. (1 / N) times the sum from i = 1 to i = N of g(x_i).
The result of step 2 is the approximation of the integral.
The expected value of continuous random variable X with pdf f(x) and set of possible values S is the integral of x * f(x) over S. The variance of X is the expected value of X-squared minus the square of the expected value of X.
Expected value: to approximate the integral of x * f(x) over S = [0, 1] (i.e. the expected value of X), set g(x) = x * f(x) and apply the method outlined above.
Variance: to approximate the integral of (x * x) * f(x) over S = [0, 1] (i.e. the expected value of X-squared), set g(x) = (x * x) * f(x) and apply the method outlined above. Subtract the result of this by the square of the estimate of the expected value of X to obtain an estimate of the variance of X.
Adapting your method:
import numpy as np
N = 100_000
X = np.random.uniform(size = N, low = 0, high = 1)
Y = [x * (2 * x) for x in X]
E = [(x * x) * (2 * x) for x in X]
# mean
print((a := np.mean(Y)))
# variance
print(np.mean(E) - a * a)
Output
0.6662016482614397
0.05554821798023696
Instead of making Y and E lists, a much better approach is
Y = X * (2 * X)
E = (X * X) * (2 * X)
Y and E in this case are NumPy arrays. This approach is much more efficient. Try making N = 100_000_000 and compare the execution times of both methods; the second should be much faster.
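For a rough sense of the difference, here is a small timing sketch (using a smaller N than the suggested 100_000_000 so it finishes quickly):
import numpy as np
from timeit import timeit

N = 1_000_000
X = np.random.uniform(size=N)

t_list = timeit(lambda: [x * (2 * x) for x in X], number=10)   # Python list comprehension
t_numpy = timeit(lambda: X * (2 * X), number=10)               # vectorized NumPy
print(t_list, t_numpy)   # the vectorized version should be dramatically faster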
Is there a difference in accuracy between math.pow, numpy.power, numpy.float_power, pow() and ** in Python, for two floating-point numbers x, y?
I assume x is very close to 1, and y is large.
One way in which you would lose precision in all cases is if you are computing a small number (z say) and then computing
p = pow( 1.0+z, y)
The problem is that doubles have around 16 significant figures, so if z is say 1e-8, in forming 1.0+z you will lose half of those figures. Worse, if z is smaller than 1e-16, 1.0+z will be exactly 1.
You can get round this by using the numpy function log1p. This computes the log of its argument plus one, without actually adding 1 to its argument, so not losing precision.
You can compute p above as
p = exp( log1p(z)*y)
which will eliminate the loss of precision due to calculating 1 + z.
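A small illustration of the difference, with z and y chosen (hypothetically) so the effect is obvious:
import math

z = 1e-18
y = 1e18

naive = (1.0 + z) ** y                 # 1.0 + z rounds to exactly 1.0, so this is 1.0
better = math.exp(math.log1p(z) * y)   # keeps the information carried by z

print(naive)            # 1.0
print(better, math.e)   # both approximately 2.718281828459045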
I have to calculate the exponential of the following array for my project:
w = [-1.52820754859, -0.000234000845064, -0.00527938881237, 5797.19232191, -6.64682108484,
18924.7087966, -69.308158911, 1.1158892974, 1.04454511882, 116.795573742]
But I've been getting overflow due to the number 18924.7087966.
The goal is to avoid using extra packages such as bigfloat (except "numpy") and get a close result (which has a small relative error).
1. So far I've tried using higher precision (i.e. float128):
def getlogZ_robust(w):
    Z = sum(np.exp(np.dot(x, w).astype(np.float128)) for x in iter_all_observations())
    return np.log(Z)
But I still get "inf" which is what I want to avoid.
2. I've tried clipping it using numpy.clip():
def getlogZ_robust(w):
    Z = sum(np.exp(np.clip(np.dot(x, w).astype(np.float128), -11000, 11000)) for x in iter_all_observations())
    return np.log(Z)
But the relative error is too big.
Can you help me solve this problem, if it is possible?
Only significantly extended or arbitrary precision packages will be able to handle the huge differences in numbers. The exponential of the largest and most negative numbers in w differ by 8000 (!) orders of magnitude. float (i.e. double precision) has 'only' 15 digits of precision (meaning 1+1e-16 is numerically equal to 1), such that adding the small numbers to the huge exponential of the largest number has no effect. As a matter of fact, exp(18924.7087966) is so huge, that it dominates the sum. Below is a script performing the sum with extended precision in mpmath: the ratio of the sum of exponentials and exp(18924.7087966) is basically 1.
w = [-1.52820754859, -0.000234000845064, -0.00527938881237, 5797.19232191, -6.64682108484,
18924.7087966, -69.308158911, 1.1158892974, 1.04454511882, 116.795573742]
u = min(w)
v = max(w)
import mpmath
#using plenty of precision
mpmath.mp.dps = 32768
print('%.5e' % mpmath.log10(mpmath.exp(v)/mpmath.exp(u)))
#exp(w) differs by 8000 orders of magnitude for largest and smallest number
s = sum([mpmath.exp(mpmath.mpf(x)) for x in w])
print('%.5e' % (mpmath.exp(v)/s))
#largest exp(w) dominates such that ratio over the sums of exp(w) and exp(max(w)) is approx. 1
If losing digits in the final result due to the hugely different orders of magnitude of the added terms is not a concern, one can also transform the log of the sum of exponentials mathematically in the following way, avoiding exp of large numbers:
log(sum(exp(w)))
= log(sum(exp(w-wmax)*exp(wmax)))
= wmax + log(sum(exp(w-wmax)))
In python:
import numpy as np
v = np.array(w)
m = np.max(v)
print(m + np.log(np.sum(np.exp(v-m))))
Note that np.log(np.sum(np.exp(v-m))) is numerically zero as the exponential of the largest number completely dominates the sum here.
Numpy has a function called logaddexp which computes
logaddexp(x1, x2) == log(exp(x1) + exp(x2))
without explicitly computing the intermediate exp() values. This way it avoids the overflow. So here is the solution:
import numpy as np

def getlogZ_robust(w):
    Z = -np.inf  # log(0), so no spurious exp(0) term is added to the sum
    for x in iter_all_observations():
        Z = np.logaddexp(Z, np.dot(x, w))
    return Z
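Applied directly to the w array from the question, np.logaddexp.reduce gives the same value as the max-shift trick above, without ever forming a huge intermediate exponential:
import numpy as np

w = np.array([-1.52820754859, -0.000234000845064, -0.00527938881237, 5797.19232191,
              -6.64682108484, 18924.7087966, -69.308158911, 1.1158892974,
              1.04454511882, 116.795573742])

# Accumulate log(sum(exp(w))) pairwise, entirely in log space.
print(np.logaddexp.reduce(w))                          # ~18924.7087966
print(w.max() + np.log(np.sum(np.exp(w - w.max()))))   # same value via the max-shift trick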
I calculated the sum over an array and over a zero padded version of the same array:
import numpy as np
np.random.seed(3635250408)
n0, n1 = int(2**16.9), 2**17
xx = np.random.randn(n0)
yy = np.zeros(n1)
yy[:n0] = xx
sx, sy = np.sum(xx), np.sum(yy)
print(f"sx = {sx}, sy = {sy}") # -> sx = -508.33773983674155, sy = -508.3377398367416
print(f"sy - sx:", sy - sx) # -> sy - sx: -5.68434188608e-14
print("np.ptp(yy[:n0] - xx) =", np.ptp(yy[:n0] - xx)) # -> 0
Why don't I get identical results?
Interestingly, I am able to show similar effects in Mathematica. I am using Python 3.6 (Anaconda 5.0 with MKL support) and NumPy 1.13.3. Could it be an MKL issue?
Update: @rich-l and @jkim noted that rounding problems might be the cause. I am not convinced, because adding zero should not alter a floating-point number. (The problem arose when investigating a data set of that size, where the deviations were significantly larger.)
You might be running into floating-point precision issues at this point.
By default, numpy uses double-precision floats for storing the values, with about 16 significant decimal digits of precision. The first result prints 17 digits.
I suspect that in the former case the fluctuations in the values cause the two sums to be rounded slightly differently, with the former rounding down to a half (5.5e-16) and the latter exceeding the threshold and rounding up to a full unit (6.0e-16).
However, this is just a hypothesis - I don't know for sure how numpy does rounding for the least significant digit.
Floating-point arithmetic is not associative:
In [129]: ((0.1+0.2)+0.3) == (0.1+(0.2+0.3))
Out[129]: False
So the order in which the items are added affects the result.
numpy.sum usually uses pairwise summation. It reverts to naive summation (from left to right) when the length of the array is less than 8 or when summing over a strided axis.
Since pairwise summation recursively breaks the sequence into two groups, the
addition of zero padding affects the midpoint where the sequence gets divided and hence
alters the order in which the values are added. And since floating-point
arithmetic is not associative, zero padding can affect the result.
For example, consider
import numpy as np
np.random.seed(3635250408)
n0, n1 = 6, 8
xx = np.random.randn(n0)
# array([ 1.8545852 , -0.30387171, -0.57164897, -0.40679684, -0.8569989 ,
# 0.32546545])
yy = np.zeros(n1)
yy[:n0] = xx
# array([ 1.8545852 , -0.30387171, -0.57164897, -0.40679684, -0.8569989 ,
# 0.32546545, 0. , 0. ])
xx.sum() and yy.sum() are not the same value:
In [138]: xx.sum()
Out[138]: 0.040734223419930771
In [139]: yy.sum()
Out[139]: 0.040734223419930826
In [148]: xx.sum() == yy.sum()
Out[148]: False
Since len(xx) < 8, the values in xx are summed from left to right:
In [151]: xx.sum() == (((((xx[0]+xx[1])+xx[2])+xx[3])+xx[4])+xx[5])
Out[151]: True
Since len(yy) >= 8, pairwise summation is used to compute yy.sum():
In [147]: yy.sum() == (yy[0]+yy[1]+yy[2]+yy[3])+(yy[4]+yy[5]+yy[6]+yy[7])
Out[147]: True
Related NumPy developer discussions:
numpy.sum is not stable
implementation of pairwise summation
implementing a numerically stable sum
numpy.sum uses neither Kahan nor Shewchuk summation (the latter is used by math.fsum). I believe these algorithms would produce a stable result under the zero-padding scenario you've raised, but I'm not expert enough to say for sure.
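As a quick check, math.fsum (which returns the correctly rounded sum of its inputs) does give identical results for the original and zero-padded arrays from the question:
import math
import numpy as np

np.random.seed(3635250408)
n0, n1 = int(2**16.9), 2**17
xx = np.random.randn(n0)
yy = np.zeros(n1)
yy[:n0] = xx

# fsum tracks the rounding error exactly, so appending zeros cannot change the result.
print(math.fsum(xx) == math.fsum(yy))  # True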
I want to convert floating point sin values to fixed point values.
import numpy as np
Fs = 8000
f = 5
sample = 8000
x = np.arange(sample)
y = np.sin(2 * np.pi * f * x / Fs)
How can I easily convert these y floating-point samples to fixed-point samples?
Each element should be 16 bits wide, with a 1-bit integer part and a 15-bit fractional part, so that I can pass these samples to a DAC chip.
To convert the samples from float to Q1.15, multiply the samples by 2 ** 15. However, as mentioned in the comments, you can't represent 1.0 in Q1.15, since the MSB represents the sign. Therefore you should clamp your values to the range [-1, MAX_Q1_15], where MAX_Q1_15 = 1.0 - (2 ** -15). This can be done with a few helpful numpy functions.
y_clamped = np.clip(y, -1.0, float.fromhex("0x0.fffe"))
y_fixed = np.multiply(y_clamped, 32768).astype(np.int16)
Although you may fear this representation does not accurately represent the value of 1.0, it is close enough to do computation with. For example, if you were to square 1.0:
fmul_16x16 = lambda x, y: x * y >> 15
fmul_16x16(32767, 32767) # Result --> 32766
Which is very close, with 1-bit error.
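Putting this together with the samples from the question, a short round-trip sketch (the variable names here are just for illustration):
import numpy as np

Fs = 8000
f = 5
sample = 8000
x = np.arange(sample)
y = np.sin(2 * np.pi * f * x / Fs)

# Clamp to the largest representable Q1.15 value, scale by 2**15, truncate to int16.
MAX_Q1_15 = 1.0 - 2.0 ** -15
y_clamped = np.clip(y, -1.0, MAX_Q1_15)
y_fixed = np.multiply(y_clamped, 2 ** 15).astype(np.int16)

# Convert back to float to inspect the quantization error (below one LSB = 2**-15).
y_back = y_fixed.astype(np.float64) / 2 ** 15
print(np.max(np.abs(y_back - y_clamped)))   # < 2**-15 ~= 3.05e-5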
Hopefully it helps.
You can use fxpmath to convert float values to fractional fixed-point. It supports Numpy arrays as inputs, so:
from fxpmath import Fxp
# your example code here
y_fxp = Fxp(y, signed=True, n_word=16, n_frac=15)
# plotting code here
15 fractional bits give a very fine amplitude resolution (a quantization step of only 2**-15), so I plot Q5.4 instead to show the conversion in an exaggerated way.