If I have a numpy array like this:
[2.15295647e+01, 8.12531501e+00, 3.97113829e+00, 1.00777250e+01]
how can I move the decimal point and format the numbers so I end up with a numpy array like this:
[21.53, 8.13, 3.97, 10.08]
np.around(a, decimals=2) only gives me [2.15300000e+01, 8.13000000e+00, 3.97000000e+00, 1.00800000e+01], which I don't want, and I haven't found another way to do it.
In order to make numpy display float arrays in an arbitrary format, you can define a custom function that takes a float value as its input and returns a formatted string:
In [1]: float_formatter = "{:.2f}".format
The f here means fixed-point format (not 'scientific'), and the .2 means two decimal places (you can read more about string formatting here).
Let's test it out with a float value:
In [2]: float_formatter(1.234567E3)
Out[2]: '1234.57'
To make numpy print all float arrays this way, you can pass the formatter= argument to np.set_printoptions:
In [3]: np.set_printoptions(formatter={'float_kind':float_formatter})
Now numpy will print all float arrays this way:
In [4]: np.random.randn(5) * 10
Out[4]: array([5.25, 3.91, 0.04, -1.53, 6.68])
Note that this only affects numpy arrays, not scalars:
In [5]: np.pi
Out[5]: 3.141592653589793
It also won't affect non-floats, complex floats etc - you will need to define separate formatters for other scalar types.
You should also be aware that this only affects how numpy displays float values - the actual values that will be used in computations will retain their original precision.
For example:
In [6]: a = np.array([1E-9])
In [7]: a
Out[7]: array([0.00])
In [8]: a == 0
Out[8]: array([False], dtype=bool)
numpy prints a as if it were equal to 0, but it is not - it still equals 1E-9.
If you actually want to round the values in your array in a way that affects how they will be used in calculations, you should use np.round, as others have already pointed out.
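The difference between changing the display and changing the values can be sketched as follows. This is a minimal illustration, assuming the array from the question; np.array2string is used here only so the formatted text can be captured without touching the global print options.

```python
import numpy as np

a = np.array([2.15295647e+01, 8.12531501e+00, 3.97113829e+00, 1.00777250e+01])

# Display only: each float is rendered with two decimals, `a` is untouched.
shown = np.array2string(a, formatter={'float_kind': "{:.2f}".format})

# Actual rounding: a new array whose stored values have changed.
rounded = np.round(a, 2)

print(shown)          # formatted text, no scientific notation
print(a[0])           # still the full-precision value
print(rounded[0])     # the rounded value
```

The first approach is the one to use for output; the second changes what later computations will see.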
You can use the round function. Here is an example:
numpy.round([2.15295647e+01, 8.12531501e+00, 3.97113829e+00, 1.00777250e+01],2)
array([ 21.53, 8.13, 3.97, 10.08])
If you only want to change the display representation, I would not recommend altering the print format globally, as suggested above. I would format the output in place:
>>> a = np.array([2.15295647e+01, 8.12531501e+00, 3.97113829e+00, 1.00777250e+01])
>>> print([ "{:0.2f}".format(x) for x in a ])
['21.53', '8.13', '3.97', '10.08']
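Another way to keep the change local is a temporary print-options scope. The sketch below assumes a NumPy version that provides the np.printoptions context manager (1.15 or later, as far as I know); outside the with block the previous settings apply again.

```python
import numpy as np

a = np.array([2.15295647e+01, 8.12531501e+00, 3.97113829e+00, 1.00777250e+01])

# Print options are changed only inside this block.
with np.printoptions(precision=2, suppress=True):
    inside = str(a)

# Back to whatever the settings were before the block.
outside = str(a)

print(inside)
print(outside)
```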
You're confusing actual precision and display precision. Most decimal fractions cannot be represented exactly in binary floating point, so rounding the stored value does not control how it prints. You should try:
> np.set_printoptions(precision=2)
> np.array([5.333333])
array([ 5.33])
[ round(x,2) for x in [2.15295647e+01, 8.12531501e+00, 3.97113829e+00, 1.00777250e+01]]
In [25]: np.power(10,-100)
Out[25]: 0
In [26]: math.pow(10,-100)
Out[26]: 1e-100
I would expect both the commands to return 1e-100. This is not a precision issue either, since the issue persists even after increasing precision to 500. Is there some setting which I can change to get the correct answer?
Oh, it's much "worse" than that:
In [2]: numpy.power(10,-1)
Out[2]: 0
But this is a hint to what's going on: 10 is an integer, and numpy.power doesn't coerce the numbers to floats. But this works:
In [3]: numpy.power(10.,-1)
Out[3]: 0.10000000000000001
In [4]: numpy.power(10.,-100)
Out[4]: 1e-100
Note, however, that the power operator, **, does convert to float:
In [5]: 10**-1
Out[5]: 0.1
The numpy method assumes you want an integer returned, since you supplied an integer.
np.power(10.0,-100)
works as you would expect.
(Just a footnote to the two other answers on this page.)
Given two input values, you can check the datatype of the object that np.power will return by inspecting the types attribute:
>>> np.power.types
['bb->b', 'BB->B', 'hh->h', 'HH->H', 'ii->i', 'II->I', 'll->l', 'LL->L', 'qq->q',
'QQ->Q', 'ee->e', 'ff->f', 'dd->d', 'gg->g', 'FF->F', 'DD->D', 'GG->G', 'OO->O']
Python-compatible integer types are denoted by l, Python-compatible floats by d (docs).
np.power effectively decides what to return by checking the types of the arguments passed and using the first matching signature from this list.
So given 10 and -100, np.power matches the integer integer -> integer signature and returns the integer 0.
On the other hand, if one of the arguments is a float then the integer argument will also be cast to a float, and the float float -> float signature is used (and the correct float value is returned).
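The signature matching described above can be checked directly. This is a small sketch rather than a statement about every NumPy version: the outputs quoted in this thread come from an old release, and recent releases raise a ValueError for an integer raised to a negative integer power instead of returning 0, so that case is wrapped in try/except here.

```python
import numpy as np

# Integer base, integer exponent: an integer signature is selected.
int_result = np.power(10, 2)

# Negative integer exponent with an integer base: old releases returned 0,
# recent ones raise ValueError, so both outcomes are tolerated.
try:
    negative_int = np.power(10, -1)
except ValueError as exc:
    negative_int = exc

# Making either argument a float selects the float signature.
float_result = np.power(10.0, -100)

# The dtype argument forces the float loop for integer inputs.
forced = np.power(np.array([10, 2]), -1, dtype=float)

print(int_result.dtype, negative_int, float_result, forced)
```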
I'm new to Python and the numpy library. I'm doing PCA on my custom dataset.
I calculate the mean of each row of my pandas dataframe, but I get the result below as the mean array:
[ 7.433148e+46
7.433148e+47
7.433148e+47
7.433148e+46
7.433148e+46
7.433148e+46
7.433148e+46
7.433148e+45
7.433148e+47]
And my code is :
np.set_printoptions(precision=6)
np.set_printoptions(suppress=False)
df['mean']=df.mean(axis=1)
mean_vector = np.array(df.iloc[:,15],dtype=np.float64)
print('Mean Vector:\n', mean_vector)
What is the meaning of these numbers?
And how can I remove the e from the numbers?
Any help is really appreciated.
Thanks in advance.
Are these large numbers realistic, and, if so how do you want to display them?
Copy and paste from your question:
In [1]: x=np.array([7.433148e+46,7.433148e+47])
The default numpy display adds a few decimal pts.
In [2]: x
Out[2]: array([ 7.43314800e+46, 7.43314800e+47])
changing precision doesn't change much
In [5]: np.set_printoptions(precision=6)
In [6]: np.set_printoptions(suppress=True)
In [7]: x
Out[7]: array([ 7.433148e+46, 7.433148e+47])
suppress does less. It suppresses small floating point values, not large ones:
suppress : bool, optional
Whether or not suppress printing of small floating point values using
scientific notation (default False).
The default python display for one of these numbers - also scientific:
In [8]: x[0]
Out[8]: 7.4331480000000002e+46
With a formatting command we can display it in its 46+ character glory (or gory detail):
In [9]: '%f'%x[0]
Out[9]: '74331480000000001782664341808476383296708673536.000000'
If that was a real value I'd prefer to see the scientific notation.
In [11]: '%.6g'%x[0]
Out[11]: '7.43315e+46'
To illustrate what suppress does, print the inverse of this array:
In [12]: 1/x
Out[12]: array([ 0., 0.])
In [13]: np.set_printoptions(suppress=False)
In [14]: 1/x
Out[14]: array([ 1.345325e-47, 1.345325e-48])
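The same comparison can be reproduced without changing the global settings. The sketch below assumes a NumPy version with the np.printoptions context manager; the exact spacing of the printed arrays differs between releases, so only the presence or absence of an exponent is of interest.

```python
import numpy as np

big = np.array([7.433148e+46, 7.433148e+47])
small = 1 / big

with np.printoptions(precision=6, suppress=True):
    big_text = str(big)        # large values keep scientific notation
    small_text = str(small)    # tiny values are printed as 0.

with np.printoptions(precision=6, suppress=False):
    small_sci = str(small)     # tiny values shown with an exponent

print(big_text)
print(small_text)
print(small_sci)
```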
===============
I'm not that familiar with pandas, but I wonder if your mean calculation makes sense. What does pandas print for df.iloc[:,15]? For the mean to be this large, the original data has to have values of similar size. How does the source display them? I wonder if most of your values are smaller, normal values, and you have a few excessively large ones (outliers) that 'distort' the mean.
I think you can simplify the array extraction with values:
mean_vector = np.array(df.iloc[:,15],dtype=np.float64)
mean_vector = df.iloc[:,15].values
I have a numpy array, some of whose elements are in scientific format, and I want to convert them into decimal format. My numpy array looks like this:
[array([ 93495052.96955582, 98555123.06146193])]
[array([ 1.00097681e+09, 9.98276347e+08])]
[array([ 6.86812785e+09, 6.90391125e+09])]
[array([ 7.75127468e+08, 8.02369833e+08])]
and this is formed using this line in my code:
list1.append(np.array(regr.predict(data),dtype = np.float))
Now I want to convert elements in list1 from scientific format to decimal format. I looked around for some solution and found out that print format(0.00001357, 'f') converts numbers from scientific format to decimal format but how do I use it to convert elements of my array?
First off, as several people have noted, there's a very large difference between how the numbers are displayed and how they're stored.
If you want to convert them to strings, then use '{:f}'.format(x) (or the % equivalent).
However, it sounds like you're only wanting the numbers to be displayed differently when you're working interactively (or through a print statement).
Changing how numpy arrays are printed
The way that numpy arrays are displayed interactively is controlled by numpy.set_printoptions.
Note that this does not convert the numbers to strings or change them in any way.
As a quick example:
In [1]: import numpy as np
In [2]: x = 1e9 * np.random.random(5)
In [3]: x
Out[3]:
array([ 4.96602724e+08, 5.42486095e+08, 4.74495681e+08,
7.37709684e+07, 9.75410927e+08])
In [4]: np.set_printoptions(formatter={'float_kind':'{:f}'.format})
In [5]: x
Out[5]:
array([496602723.824146, 542486095.316912, 474495680.688025,
73770968.413642, 975410926.873148])
We've only changed how numpy will display the numbers. They're still floats.
We can operate on them mathematically, and they'll behave like numbers:
In [6]: x[0]
Out[6]: 496602723.82414573
In [7]: x[0] * 2
Out[7]: 993205447.64829147
Converting to strings
Now let's say we had converted them to a list of strings:
In [1]: import numpy as np
In [2]: x = 1e9 * np.random.random(5)
In [3]: x
Out[3]:
array([ 2.56619581e+08, 2.55721261e+08, 3.36984986e+08,
2.67541556e+08, 9.01048842e+08])
In [4]: x = ['{:f}'.format(item) for item in x]
In [5]: x
Out[5]:
['256619580.697790',
'255721261.271977',
'336984986.430552',
'267541556.373619',
'901048842.193849']
Now they're a list of strings. If we operate on them mathematically, they'll behave like strings, not numbers:
In [6]: x[0] * 2
Out[6]: '256619580.697790256619580.697790'
Controlling how numpy arrays are saved with savetxt
Finally, if you're using numpy.savetxt, and would like to control how the data is output to disk, consider using the fmt parameter instead of manually converting elements of the array to strings.
For example, if we were to do:
np.savetxt('temp.txt', x)
By default, the ascii representation of the array would use scientific notation if it is more compact:
8.702970453168644905e+08
9.991634082796489000e+08
5.032002956810175180e+08
2.382398232565869987e+08
1.868727085152311921e+08
However, we can control that using fmt. Note that it expects the "old-style" % formatting strings:
np.savetxt('temp2.txt', x, fmt='%f')
And we'll get:
870297045.316864
999163408.279649
503200295.681018
238239823.256587
186872708.515231
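To see the effect of fmt without writing a file, savetxt can be pointed at an in-memory buffer. This is a minimal sketch; the two sample values and the '%.2f' format are chosen only for illustration.

```python
import io
import numpy as np

x = np.array([870297045.316864, 999163408.279649])

# savetxt accepts a file handle, so an in-memory text buffer works too.
buf = io.StringIO()
np.savetxt(buf, x, fmt='%.2f')   # old-style % format, two decimals
text = buf.getvalue()

print(text)
```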
If you just want to print them without using scientific notation, you can do np.set_printoptions(suppress=True).
I'm working with the mpmath Python library to gain precision during some computations, but I need to cast the result to a numpy native type.
More precisely, I need to cast an mpmath matrix (that contains mpf object types) to a numpy.ndarray (that contains float types).
I have solved the problem with a raw approach:
# My input Matrix:
matr = mp.matrix(
[[ '115.80200375', '22.80402473', '13.69453064', '54.28049263'],
[ '22.80402473', '86.14887381', '53.79999432', '42.78548627'],
[ '13.69453064', '53.79999432', '110.9695448' , '37.24270321'],
[ '54.28049263', '42.78548627', '37.24270321', '95.79388469']])
# multiple precision computation
D = MPDBiteration(matr)
# Create a new ndarray
Z = numpy.ndarray((matr.cols,matr.rows),dtype=numpy.float)
# I fill it pretty "manually"
for i in range(0,matr.rows):
    for j in range(0,matr.cols):
        Z[i,j] = D[i,j] # or float(D[i,j]) seems to work the same
My question is:
Is there a better/more elegant/easier/clever way to do it?
UPDATE:
Reading again and again the mpmath documentation I've found this very useful method: tolist() , it can be used as follows:
Z = np.array(matr.tolist(),dtype=np.float32)
It seems slightly better and more elegant (no for loops needed).
Are there better ways to do it? Does my second solution round or chop extra digits?
Your second method is to be preferred, but using np.float32 means casting numbers to single precision. For your matrix, this precision is too low: 115.80200375 becomes 115.80200195 because single precision cannot hold that many digits. You can set double precision explicitly with numpy.float64, or just pass Python's float type as an argument, which means the same.
Z = numpy.array(matr.tolist(), dtype=float)
or, to keep the matrix structure,
Z = numpy.matrix(matr.tolist(), dtype=float)
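The precision loss mentioned above can be checked with NumPy alone. This sketch uses the first matrix entry as a plain Python float, so it does not involve mpmath at all; it only illustrates the difference between the two dtypes.

```python
import numpy as np

value = 115.80200375

single = np.float32(value)   # single precision, roughly 7 significant digits
double = np.float64(value)   # double precision, same as Python's float

print(float(single))   # differs from the input after the 7th digit or so
print(float(double))   # matches the input
```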
You can do that when vectorizing a function (which is what we usually want to do anyway). The following example vectorizes and converts the theta function
import numpy as np
import mpmath as mpm
jtheta3_fn = lambda z,q: mpm.jtheta(n=3,z=z,q=q)
jtheta3_fn = np.vectorize(jtheta3_fn,otypes=(float,))
I have an array that grows with each iteration of a loop:
for i in range(100):
    frac[i] = some fraction between 0 and 1 with many decimal places
This all works fine. When I check the type(frac[i]), I am told that it is 'numpy.float64'.
For my code to be as precise as I need it to be, I need to use the decimal module and change each frac[i] to the decimal type.
I updated my code:
for i in range(100):
    frac[i] = some fraction between 0 and 1 with many decimal places
    frac[i] = decimal.Decimal(frac[i])
But when I check the type, I am STILL told that frac[i] is 'numpy.float64'.
I have managed to change other variables to decimal in this way before, so I wonder if you could tell me why this doesn't seem to work.
Thank you.
Depending where your fractions are coming from, you may find it ideal to use the fractions module. Some examples from the docs:
>>> from fractions import Fraction
>>> Fraction(16, -10)
Fraction(-8, 5)
>>> Fraction(123)
Fraction(123, 1)
>>> Fraction()
Fraction(0, 1)
>>> Fraction('3/7')
Fraction(3, 7)
>>> Fraction(' -3/7 ')
Fraction(-3, 7)
>>> Fraction('1.414213 \t\n')
Fraction(1414213, 1000000)
>>> Fraction('-.125')
Fraction(-1, 8)
>>> Fraction('7e-6')
Fraction(7, 1000000)
>>> Fraction(2.25)
Fraction(9, 4)
>>> Fraction(1.1)
Fraction(2476979795053773, 2251799813685248)
>>> from decimal import Decimal
>>> Fraction(Decimal('1.1'))
Fraction(11, 10)
You can also perform all of the regular arithmetic operations; if the result can't be expressed as a fraction, it will be converted to a float:
>>> Fraction(3, 4) + Fraction(1, 16)
Fraction(13, 16)
>>> Fraction(3, 4) * Fraction(1, 16)
Fraction(3, 64)
>>> Fraction(3, 4) ** Fraction(1, 16)
0.982180548555
Note: I haven't used numpy at all, so what follows is mostly just an educated guess.
It sounds like you are using a typed array of type float64. Typed arrays are a particular feature of numpy — the elements of arrays (actually Lists) in Python itself can change dynamically from type to type, and there is no need for all elements of a Python list to have the same type.
With a float64-type array, your values are being cast to floats as they are assigned to array elements, undoing whatever type-casting you've done to them before that point.
The documentation for numpy array creation mentions that the default array type is float64. You probably need to change this to Decimal.
Adding the keyword argument dtype=Decimal to a call to np.arange should do this. You should then have an array of type Decimal, and any float or float64 values you assign it should be cast to Decimal. I don't know enough about what you're doing, or about numpy, to know if this is a sensible thing to be doing with a numpy array.
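The casting behaviour guessed at above can be checked with a short sketch. It assumes nothing about the asker's real data; the arrays here are placeholders. Note that NumPy has no native Decimal dtype: an array holding Decimal values ends up with dtype object.

```python
from decimal import Decimal
import numpy as np

# A float64 array converts whatever is assigned back to float64.
frac = np.zeros(3)
frac[0] = Decimal('0.1')
print(type(frac[0]))        # a NumPy float64, not Decimal

# An object array stores the Decimal instances themselves.
obj = np.array([Decimal('0.1'), Decimal('0.2')], dtype=object)
obj[0] = Decimal('0.3')
print(type(obj[0]))         # Decimal

# Arithmetic on the object array is delegated to Decimal.
total = obj + obj
```

Object arrays lose most of NumPy's speed advantage, so whether this is worthwhile depends on how much of the computation really needs Decimal.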
I was just playing around with Decimals with Numpy.
I found that Numpy offers a function called np.vectorize that allows you to take a function and apply it over a numpy array.
In [23]:
import numpy as np
import decimal
D = decimal.Decimal
We'll create a regular np float array
In [24]:
f10 = np.random.ranf(10)
f10
Out[24]:
array([ 0.45410583, 0.35353919, 0.5976785 , 0.12030978, 0.00976334,
0.47035594, 0.76010096, 0.09229687, 0.24842551, 0.30564141])
Trying to convert the array to Decimal type using np.asarray doesn't work. Specifying decimal.Decimal as the dtype sets the array's dtype to object, which is to be expected, but if you access an individual element of the array it still has a float data type.
In [25]:
f10todec = np.asarray(f10, dtype = decimal.Decimal)
print f10todec.dtype, f10todec
print type(f10todec[0])
object [0.454105831376884 0.3535391906233327 0.5976785016396975 0.1203097778312584
0.009763339031407026 0.47035593879363524 0.7601009625324361
0.09229687387940333 0.24842550566826282 0.30564141425653435]
<type 'float'>
If you give np.array a homogeneous Python list of Decimal objects, then it seems to preserve the type, hence the list comprehension below, which builds a list of the values in the first array as Decimal objects. So I had to make the decimal array this way.
In [26]:
D10 = np.array([D(d) for d in f10])
D10
Out[26]:
array([Decimal('0.4541058313768839838076019077561795711517333984375'),
Decimal('0.35353919062333272194109667907468974590301513671875'),
Decimal('0.597678501639697490332991947070695459842681884765625'),
Decimal('0.12030977783125840208100498784915544092655181884765625'),
Decimal('0.00976333903140702563661079693702049553394317626953125'),
Decimal('0.47035593879363524205672320022131316363811492919921875'),
Decimal('0.76010096253243608632743644193396903574466705322265625'),
Decimal('0.09229687387940332943259136300184763967990875244140625'),
Decimal('0.24842550566826282487653543284977786242961883544921875'),
Decimal('0.30564141425653434946951847450691275298595428466796875')], dtype=object)
basic math operations seem to work ok
In [27]:
D10/2
Out[27]:
array([Decimal('0.2270529156884419919038009539'),
Decimal('0.1767695953116663609705483395'),
Decimal('0.2988392508198487451664959735'),
Decimal('0.06015488891562920104050249392'),
Decimal('0.004881669515703512818305398469'),
Decimal('0.2351779693968176210283616001'),
Decimal('0.3800504812662180431637182210'),
Decimal('0.04614843693970166471629568150'),
Decimal('0.1242127528341314124382677164'),
Decimal('0.1528207071282671747347592373')], dtype=object)
In [28]:
np.sqrt(D10)
Out[28]:
array([Decimal('0.6738737503248542354573624759'),
Decimal('0.5945916166776426405934196108'),
Decimal('0.7730966961769384578392278689'),
Decimal('0.3468569991095154505863255680'),
Decimal('0.09880961001545864636229121433'),
Decimal('0.6858250059553349663476168402'),
Decimal('0.8718376927688066448819998853'),
Decimal('0.3038040057000620415496242404'),
Decimal('0.4984230187985531079935481296'),
Decimal('0.5528484550548498633920483390')], dtype=object)
Until you try a trig function, for which there is no corresponding function in the decimal module:
In [29]:
np.sin(D10)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-29-31ba62da35b8> in <module>()
----> 1 np.sin(D10)
AttributeError: 'Decimal' object has no attribute 'sin'
So let's use np.vectorize so we can use decimal's quantize function to do rounding.
In [30]:
npquantize = np.vectorize(decimal.Decimal.quantize)
qnt_D10 = npquantize(D10, D('.000001'))
qnt_D10
Out[30]:
array([Decimal('0.454106'), Decimal('0.353539'), Decimal('0.597679'),
Decimal('0.120310'), Decimal('0.009763'), Decimal('0.470356'),
Decimal('0.760101'), Decimal('0.092297'), Decimal('0.248426'),
Decimal('0.305641')], dtype=object)
You also need to be careful with some regular Python math functions, because they will automatically change the return type to float. I assume this is because functions like sin or cos can't compute the result exactly as a Decimal.
So I guess the short answer is: use a list comprehension to convert the items of the numpy array into a Python list of Decimals, then create a new array from that list.
To return numpy arrays with their type intact I guess you could use the vectorize function to wrap any function that works with Decimal type to apply over the np array.
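The wrapping idea can be sketched like this. The helper names to_decimal and quantize_6 are made up for the example, and otypes=[object] is passed so that np.vectorize does not have to guess the output type from the first element.

```python
from decimal import Decimal
import numpy as np

floats = np.array([0.45410583, 0.35353919])

# Hypothetical helper: float -> Decimal via the float's shortest repr.
to_decimal = np.vectorize(lambda v: Decimal(repr(float(v))), otypes=[object])
decimals = to_decimal(floats)

# Hypothetical helper: round each Decimal to six decimal places.
quantize_6 = np.vectorize(lambda d: d.quantize(Decimal('0.000001')),
                          otypes=[object])
rounded = quantize_6(decimals)

print(rounded)
```

Going through repr gives Decimal('0.45410583') rather than the long binary expansion shown earlier; which of the two is wanted depends on the use case.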
On a side note, there is a package on PyPI that provides numpy-style arrays with IEEE Decimals: https://pypi.python.org/pypi/decimalpy/0.1
Try doing decimal.Decimal.from_float(frac[i])
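For reference, here is a small sketch of what from_float returns. It converts the exact binary value of the float, so it does not recover digits the float never had, and the result still has to be stored somewhere other than a float64 array to remain a Decimal.

```python
from decimal import Decimal
import numpy as np

value = np.float64(0.1)

# Exact conversion of the stored binary value.
exact = Decimal.from_float(value)

# Conversion via the shortest string that round-trips to the same float.
short = Decimal(str(float(value)))

print(exact)
print(short)
```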