I fit a polynomial to points from list data using the polyfit() function:
import numpy as np
data = [1,4,5,7,8,11,14,15,16,19]
x = np.arange(0,len(data))
y = np.array(data)
z = np.polyfit(x,y,2)
print (z)
print ("{0}x^2 + {1}x + {2}".format(*z))
Output:
[0.00378788 1.90530303 1.31818182]
0.003787878787878751x^2 + 1.9053030303030298x + 1.3181818181818175
How can I get the fit with the coefficients rounded to, say, three decimal places? For example, to get:
[0.004 1.905 1.318]
0.004x^2 + 1.905x + 1.318
There is no rounding option in polyfit. IIUC, you can use round on the result after applying polyfit:
import numpy as np
data = [1,4,5,7,8,11,14,15,16,19]
x = np.arange(0,len(data))
y = np.array(data)
z = np.polyfit(x, y, 2).round(decimals=3)
print(z)  # [0.004 1.905 1.318]
For values exactly halfway between rounded decimal values, NumPy rounds to the nearest even value. Thus 1.5 and 2.5 round to 2.0, -0.5 and 0.5 round to 0.0, etc. Results may also be surprising due to the inexact representation of decimal fractions in the IEEE floating point standard and errors introduced when scaling by powers of ten. -- Cited from numpy.around
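A quick illustration of that round-half-to-even behaviour (a small aside, not part of the original answer):
import numpy as np
print(np.round([0.5, 1.5, 2.5]))  # [0. 2. 2.] -- ties go to the nearest even value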
Is there an established way to account for inexact decimals while calculating a weighted median? The issue I'm running into is the case where np.cumsum(wt) / np.sum(wt) == 0.5 evaluates to False because of decimals that are inexactly represented in binary.
For example...
arr = np.array([[40, 38.8182],
[40, 38.8182],
[50, 38.8182],
[60, 38.8182],
[70, 38.8182],
[70, 38.8182]])
arr = arr[arr[:,0].argsort()]
val = arr[:,0]
wt = arr[:,1]
wt_percentile = np.cumsum(wt) / np.sum(wt)
arr_filter = (wt_percentile == 0.5) | (np.cumsum(wt_percentile > 0.5) == 1)
median = np.nanmean(np.where(arr_filter, val, np.nan), dtype='float64')
print(median)
50.0 # returned value
55.0 # expected value
To show the intermediate values:
print(wt_percentile)
[0.16666667 0.33333333 0.5 0.66666667 0.83333333 1. ]
print(wt_percentile == 0.5)
[False False False False False False] # Third value should be True
Approaches I've considered so far:
wt = np.around(wt * 10000) # (A) Convert weights to integers
wt_percentile = np.around(np.cumsum(wt) / np.sum(wt), 15) # (B) Round the result of the wt_percentile calculation
(A) The problem with converting my weights to integers is that (1) it requires knowing the number of decimal places beforehand and (2) I might need to revert the weights for other calculations.
(B) Rounding the wt_percentile result might be okay, but I'm not sure if there's a universal rounding precision that would work for most, if not all, scenarios.
Appreciate any guidance.
This problem occurs when evaluating equalities or inequalities with floating point numbers. Many decimal values can only be represented approximately in binary, and thus divisions may cause truncation errors, yielding inexact results.
What should be done is define a tolerance, e.g. tol = 1e-9 (the machine precision of 64-bit floats is about 15 to 16 decimal digits, so you could go lower), and then change:
(wt_percentile == 0.5)
to
(np.abs(wt_percentile - 0.5) < tol)
Approach (B) is a possible solution, yes.
Though one could argue that you cannot know beforehand what a good number of decimals to round to is, because of truncation errors. Then again, the same could be said about choosing the tolerance tol.
Another point is that, especially in a larger codebase, using a tolerance for floating-point equality expresses the intent more clearly and is easier to maintain. You could, for example, create a function floatIsEqual(val1, val2, tol), where val1 and val2 can be scalars, arrays, or any object that can be evaluated as a float, returning boolean values/arrays.
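A minimal sketch of such a helper (the name, signature, and default tolerance are illustrative choices, not an established API); note that NumPy's built-in np.isclose / np.allclose implement the same idea with combined relative and absolute tolerances:
import numpy as np

def floatIsEqual(val1, val2, tol=1e-9):
    # Elementwise floating-point "equality" within an absolute tolerance
    return np.abs(np.asarray(val1, dtype=float) - np.asarray(val2, dtype=float)) < tol

# Applied to the weighted-median filter from the question:
# arr_filter = floatIsEqual(wt_percentile, 0.5) | (np.cumsum(wt_percentile > 0.5) == 1)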
A function determines an integer y from a given integer x and a float s as follows:
floor(x * s)
If x and y are known, how can I calculate s so that floor(x * s) is guaranteed to be exactly equal to y?
If I simply perform s = y / x is there any chance that floor(x * s) won't be equal to y due to floating point operations?
If I simply perform s = y / x is there any chance that floor(x * s) won't be equal to y due to floating point operations?
Yes, there is a chance it won't be equal. @Eric Postpischil offers a simple counterexample: y = 1 and x = 49.
(For discussion, let us limit x,y > 0.)
To find a scale factor s that works for a given x, y, we need to invert y = floor(x * s) mathematically, accounting for the rounding error of the multiplication (see ULP) and for the floor truncation.
# Pseudo code
e = ULP(x*s)
# the computed product may differ from the exact x*s by up to 0.5*e, so require
# the exact product to stay inside [y, y+1) with that margin on both sides:
x*s - 0.5*e >= y          # the rounded product cannot fall below y
x*s + 0.5*e <  y + 1      # the rounded product cannot reach y + 1
# Estimate e
est = ULP((float)y)
s_lower = ((float)y + 0.5*est)/(float)x
s_upper = ((float)y + 1 - 0.5*est)/(float)x
A candidate s will lie in s_lower <= s < s_upper.
Perform the above with higher-precision routines. Then I recommend using the float closest to the midpoint of s_lower and s_upper.
Alternatively, an initial stab at s could use:
s_first_attempt = ((float)y + 0.5)/(float)x
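A quick check of the counterexample and of the mid-interval choice, assuming IEEE-754 doubles as used by Python floats:
import math

x, y = 49, 1                 # Eric Postpischil's counterexample

s = y / x                    # naive scale factor
print(math.floor(x * s))     # 0 -- the rounded quotient times 49 lands just below 1.0

s = (y + 0.5) / x            # aim for the middle of the valid interval instead
print(math.floor(x * s))     # 1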
If we rephrase your question, you are asking whether the equation y == floor(x * (y/x)) holds for integers x and y, where y/x is evaluated in Python as a 64-bit floating-point value and the subsequent multiplication also produces a 64-bit floating-point value.
Python's 64-bit floats follow the IEEE-754 standard, which gives them roughly 15-17 significant decimal digits. To perform the division and multiplication, both x and y are converted to floats, and each of these operations can lose a little further precision (at worst about one unit in the last place), but they will certainly not gain any. As such, you can only expect about 15-17 significant decimal digits from this computation, which means that y values around or above 10^15 may exhibit rounding errors.
More practically, one example of this can be (and you can reuse this code for other examples):
import numpy as np
print("{:f}".format(np.floor(1.3 * (1.1e24 / 1.3))))
#> 1100000000000000008388608.000000
Is there a difference in accuracy between math.pow, numpy.power, numpy.float_power, pow() and ** in Python, when applied to two floating-point numbers x, y?
I assume x is very close to 1, and y is large.
One way in which you would lose precision in all cases is if you are computing a small number (z say) and then computing
p = pow( 1.0+z, y)
The problem is that doubles have around 16 significant figures, so if z is, say, 1e-8, then in forming 1.0 + z you will lose half of those figures. Worse, if z is smaller than about 1e-16, 1.0 + z will be exactly 1.0.
You can get around this by using the NumPy (or math) function log1p. It computes the logarithm of one plus its argument without actually adding 1 to the argument, and so does not lose precision.
You can compute p above as
p = exp( log1p(z)*y)
which eliminates the loss of precision due to forming 1 + z.
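A small numeric sketch of the difference, using a z far below the ~1e-16 resolution of doubles:
import math

z, y = 1e-18, 1.0e6

naive  = (1.0 + z) ** y               # 1.0 + z rounds to exactly 1.0, so this stays 1.0
better = math.exp(math.log1p(z) * y)  # keeps z's contribution

print(naive)   # 1.0
print(better)  # ~1.000000000001, i.e. about exp(1e-12)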
I calculated the sum over an array and over a zero padded version of the same array:
import numpy as np
np.random.seed(3635250408)
n0, n1 = int(2**16.9), 2**17
xx = np.random.randn(n0)
yy = np.zeros(n1)
yy[:n0] = xx
sx, sy = np.sum(xx), np.sum(yy)
print(f"sx = {sx}, sy = {sy}") # -> sx = -508.33773983674155, sy = -508.3377398367416
print(f"sy - sx:", sy - sx) # -> sy - sx: -5.68434188608e-14
print("np.ptp(yy[:n0] - xx) =", np.ptp(yy[:n0] - xx)) # -> 0
Why don't I get identical results?
Interestingly, I am able to reproduce similar effects in Mathematica. I am using Python 3.6 (Anaconda 5.0 with MKL support) and NumPy 1.13.3. Could it perhaps be an MKL issue?
Update: @rich-l and @jkim noted that rounding problems might be the cause. I am not convinced, because adding zero should not alter a floating-point number. (The problem arose while investigating a data set of this size, where the deviations were significantly larger.)
You might be running into floating-point precision issues at this point.
By default, numpy uses double-precision floats for storing the values, which carry about 16 significant decimal digits; the first result is printed with 17 digits.
I suspect that in the former case the fluctuations in the values result in the two sums being rounded slightly differently, with the former rounding to a half (5.5e-16) and the latter exceeding the threshold and being rounded to a full number (6.0e-16).
However, this is just a hypothesis - I don't know for sure how numpy does rounding for the least significant digit.
Floating-point arithmetic is not associative:
In [129]: ((0.1+0.2)+0.3) == (0.1+(0.2+0.3))
Out[129]: False
So the order in which the items are added affects the result.
numpy.sum usually uses pairwise summation. It reverts to naive summation (from left to right) when the length of the array is less than 8 or when summing over a strided axis.
Since pairwise summation recursively breaks the sequence into two groups, the addition of zero padding affects the midpoint where the sequence gets divided and hence alters the order in which the values are added. And since floating-point arithmetic is not associative, zero padding can affect the result.
For example, consider
import numpy as np
np.random.seed(3635250408)
n0, n1 = 6, 8
xx = np.random.randn(n0)
# array([ 1.8545852 , -0.30387171, -0.57164897, -0.40679684, -0.8569989 ,
# 0.32546545])
yy = np.zeros(n1)
yy[:n0] = xx
# array([ 1.8545852 , -0.30387171, -0.57164897, -0.40679684, -0.8569989 ,
# 0.32546545, 0. , 0. ])
xx.sum() and yy.sum() are not the same value:
In [138]: xx.sum()
Out[138]: 0.040734223419930771
In [139]: yy.sum()
Out[139]: 0.040734223419930826
In [148]: xx.sum() == yy.sum()
Out[148]: False
Since len(xx) < 8, the values in xx are summed from left to right:
In [151]: xx.sum() == (((((xx[0]+xx[1])+xx[2])+xx[3])+xx[4])+xx[5])
Out[151]: True
Since len(yy) >= 8, pairwise summation is used to compute yy.sum():
In [147]: yy.sum() == (yy[0]+yy[1]+yy[2]+yy[3])+(yy[4]+yy[5]+yy[6]+yy[7])
Out[147]: True
Related NumPy developer discussions:
numpy.sum is not stable
implementation of pairwise summation
implementing a numerically stable sum
numpy.sum uses neither Kahan nor Shewchuk summation (the latter is used by math.fsum). I believe these algorithms would produce a stable result under the zero-padding scenario you've raised, but I'm not expert enough to say for sure.
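As a quick check of that last point, math.fsum (which uses Shewchuk's algorithm and returns the correctly rounded sum) is unaffected by the zero padding:
import math
import numpy as np

np.random.seed(3635250408)
n0, n1 = int(2**16.9), 2**17
xx = np.random.randn(n0)
yy = np.zeros(n1)
yy[:n0] = xx

# The exact sum is unchanged by appended zeros, so fsum agrees for both arrays
print(math.fsum(xx) == math.fsum(yy))  # True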
I want to convert floating point sin values to fixed point values.
import numpy as np
Fs = 8000
f = 5
sample = 8000
x = np.arange(sample)
y = np.sin(2 * np.pi * f * x / Fs)
How can I easily convert these floating-point samples y to fixed point?
Each element should be 16 bits wide, with 1 integer (sign) bit and 15 fractional bits (Q1.15), so that I can pass the samples to a DAC chip.
To convert the samples from float to Q1.15, multiply them by 2 ** 15. However, as mentioned in the comments, you can't represent 1.0 in Q1.15, since the most significant bit represents the sign. Therefore you should clamp your values to the range [-1, MAX_Q1_15], where MAX_Q1_15 = 1.0 - (2 ** -15). This can be done with a few helpful NumPy functions.
y_clamped = np.clip(y, -1.0, float.fromhex("0x0.fffe"))
y_fixed = np.multiply(y_clamped, 32768).astype(np.int16)
Although you may fear this representation does not accurately represent the value of 1.0, it is close enough to do computation with. For example, if you were to square 1.0:
fmul_16x16 = lambda x, y: x * y >> 15
fmul_16x16(32767, 32767) # Result --> 32766
That is very close, with only 1 bit of error.
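As an extra sanity check (not part of the original answer), converting the Q1.15 samples back to float should reproduce the clamped signal to within one LSB:
y_restored = y_fixed.astype(np.float64) / 32768.0
print(np.max(np.abs(y_restored - y_clamped)) <= 2**-15)  # True -- truncation error stays below one LSB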
Hopefully it helps.
You can use fxpmath to convert float values to fractional fixed-point. It supports Numpy arrays as inputs, so:
from fxpmath import Fxp
# your example code here
y_fxp = Fxp(y, signed=True, n_word=16, n_frac=15)
# plotting code here
Fifteen fractional bits give you a very fine amplitude resolution, so the quantization is hard to see in a plot; plotting a Q5.4 conversion instead shows the effect in an exaggerated way.