Are Decimal 'dtypes' available in NumPy?

Are Decimal 'dtypes' available in NumPy? - python

Are Decimal data type objects (dtypes) available in NumPy?
>>> import decimal, numpy
>>> d = decimal.Decimal('1.1')
>>> s = [['123.123','23'],['2323.212','123123.21312']]
>>> ss = numpy.array(s, dtype=numpy.dtype(decimal.Decimal))
>>> a = numpy.array(s, dtype=float)
>>> type(d)
<class 'decimal.Decimal'>
>>> type(ss[1,1])
<class 'str'>
>>> type(a[1,1])
<class 'numpy.float64'>
I suppose numpy.array doesn't support every dtype, but I sort of thought that it would at least let a dtype propagate as far as it could as long as the right operations were defined. Am I missing something? Is there some way for this to work?

NumPy doesn't recognize decimal.Decimal as a specific type. The closest it can get is the most general dtype, object. So when converting the elements to the desired dtype, the conversion is a no operation.
>>> ss.dtype
dtype('object')
Keep in mind that because the elements of the array are Python objects, you won't get much of a speedup using them. For example, if you try to add this to any other array, the other elements will have to be boxed back into Python objects and added via the normal Python addition code. You might gain some speed in that the iteration will be in C, but not that much.

Unfortunately, you have to cast each of your items to Decimal when you create the numpy.array. Something like
s = [['123.123','23'],['2323.212','123123.21312']]
decimal_s = [[decimal.Decimal(x) for x in y] for y in s]
ss = numpy.array(decimal_s)

Important caveat: this is a bad answer
You would probably do best to skip to the next answer.
It seems that Decimal is available:
>>> import decimal, numpy
>>> d = decimal.Decimal('1.1')
>>> a = numpy.array([d,d,d],dtype=numpy.dtype(decimal.Decimal))
>>> type(a[1])
<class 'decimal.Decimal'>
I'm not sure exactly what you are trying to accomplish. Your example is more complicated than is necessary for simply creating a decimal NumPy array.

Related

Why is a NumPy int not an instance of a Python int, but a NumPy float is an instance of a Python float?

Consider the following:
>>> import numbers
>>> import numpy
>>> a = numpy.int_(0)
>>> isinstance(a, int)
False
>>> isinstance(a, numbers.Integral)
True
>>> b = numpy.float_(0)
>>> isinstance(b, float)
True
>>> isinstance(b, numbers.Real)
True
NumPy's numpy.int_ and numpy.float_ types are both in Python's numeric abstract base class hierarchy, but it is strange to me that a np.int_ object is not an instance of the built-in int class, while a np.float_ object is an instance of the built-in float type.
Why is this the case?

Python integers can be arbitrary length: type(10**1000) is still int, and will print out a one and then a thousand zeros on your screen if you output it.
Numpy int64 (which is what int_ is on my machine) are integers represented by 8 bytes (64 bits), and anything over that cannot be represented. For example, np.int_(10)**1000 will give you a wrong answer - but quickly ;).
Thus, they are different kinds of numbers; subclassing one under the other makes as much sense as subclassing int under float would, is what I assume numpy people thought. It is best to keep them separate, so that no-one is confused about the fact that it would be unwise to confuse them.
The split is done because arbitrary-size integers are slow, while numpy tries to speed up computation by sticking to machine-friendly types.
On the other hand, floating point is the standard IEEE floating point, both in Python and in numpy, supported out-of-the-box by our processors.

Because numpy.int_() is actually 64-bit, and int can have an arbitrary size, it uses about 4 extra bytes for every 2^30 worth of bits you put in. int64 has constant size:
>>> import numpy as np
>>> a = np.int_(0)
>>> type(a)
<type 'numpy.int64'>
>>> b = 0
>>> type(b)
<type 'int'>

Find maximum integer type at runtime in numpy

I would like to know if there is a way to find out the maximum, for the sake of having something specific let's say, integer type (or unsigned integer, or float, or complex - any "fixed size" type) supported by numpy at runtime. That is, let's assume that I know (from documentation) that largest unsigned integer type in the current version of numpy is np.uint64 and I have a line of code such as:
y = np.uint64(x)
I would like my code to use whatever is the largest, let's say, unsigned integer type available in the version of numpy that my code uses. That is, I would be interested in replacing the above hardcoded type with something like this:
y = np.largest_uint_type(x)
Is there such a method?

You can use np.sctypes:
>>> def largest_of_kind(kind):
... return max(np.sctypes[kind], key=lambda x: np.dtype(x).itemsize)
...
>>> largest_of_kind('int')
<class 'numpy.int64'>
>>> largest_of_kind('uint')
<class 'numpy.uint64'>
>>> largest_of_kind('float')
<class 'numpy.float128'>
>>> largest_of_kind('complex')
<class 'numpy.complex256'>

While I do like #PaulPanzer solution, I also found that numpy defines a function maximum_sctype() not documented in numpy's standard docs. This function fundamentally does the same thing as #PaulPanzer solution (plus some edge case analysis). From the code it is clear that sctype types are sorted in the increasing size order. Using this function, what I need can be done as follows:
y = np.maximum_sctype(np.float)(x) # currently np.float128 on OSX
y = np.maximum_sctype(np.uint8)(x) # currently np.uint64
etc.

Not so elegant, but using the prior knowledge that np.uint is always an exponent of 2, you can do something like that:
for i in range(4,100):
try:
eval('np.uint'+str(2**i)+'(0)')
except:
c=i-1
break
answer='np.uint'+str(2**c)
>>answer
Out[657]: 'np.uint64'
and you can use it as
y=eval(answer+'('+str(x)+')')
or, alternatively without the assumption of exp(2) and with no eval (check all the numbers up to N, here 1000):
for i in range(1000):
if hasattr(np,'uint'+str(i)):
x='uint'+str(i)
>>x
Out[662]: 'uint64'

numpy.subtract performs subtraction wrong on 1-dimensional arrays

I read the post is-floating-point-math-broken and get Why it
happens, but I couldn't find a solution that could help me..
How can I do the correct subtraction?
Python version 2.6.6, Numpy version 1.4.1.
I have two numpy.ndarray each one contain float32 values, origin and new. I'm trying to use numpy.subtract to subtract them but I get the following (odd) result:
>>> import numpy as
>>> with open('base_R.l_BSREM_S.9.1_001.bin', 'r+') as fid:
origin = np.fromfile(fid, np.float32)
>>> with open('new_R.l_BSREM_S.9.1_001.bin', 'r+') as fid:
new = np.fromfile(fid, np.float32)
>>> diff = np.subtract(origin, new)
>>> origin[5184939]
0.10000000149011611938
>>> new[5184939]
0.00000000023283064365
>>> diff[5184939]
0.10000000149011611938
Also when I try to subtract the arrays at 5184939 I get the same result as diff[5184939]
>>> origin[5184939] - new[5184939]
0.10000000149011611938
But when I do the following I get this results:
>>> 0.10000000149011611938 - 0.00000000023283064365
0.10000000125728548
and that's not equal to diff[5184939]
How the right subtraction can be done? (0.10000000125728548 is the one that I need)
Please help, and Thanks in advance

You might add your Python and numpy versions to the question.
Differences can arise from np.float32 v np.float64 dtype, the default Python float type, as well as display standards. numpy uses different display rounding than the underlying Python.
The subtraction itself does not differ.
I can reproduce the 0.10000000125728548 value, which may also display as 0.1 (out 8 decimals).
I'm not sure where the 0.10000000149011611938 comes from. That looks as though new[5184939] was identically 0, not just something small like 0.00000000023283064365.

converting a float to decimal type in Python

I have an array that grows with each iteration of a loop:
for i in range(100):
frac[i] = some fraction between 0 and 1 with many decimal places
This all works fine. When I check the type(frac[i]), I am told that it is 'numpy.float64'.
For my code to be as precise as I need it to be, I need to use the decimal module and change each frac[i] to the decimal type.
I updated my code:
for i in range(100):
frac[i] = some fraction between 0 and 1 with many decimal places
frac[i] = decimal.Decimal(frac[i])
But when I check the type, I am STILL told that frac[i] is 'numpy.float64'.
I have managed to change other variables to decimal in this way before, so I wonder if you could tell me why this doesn't seem to work.
Thank you.

Depending where your fractions are coming from, you may find it ideal to use the fractions module. Some examples from the docs:
>>> from fractions import Fraction
>>> Fraction(16, -10)
Fraction(-8, 5)
>>> Fraction(123)
Fraction(123, 1)
>>> Fraction()
Fraction(0, 1)
>>> Fraction('3/7')
Fraction(3, 7)
>>> Fraction(' -3/7 ')
Fraction(-3, 7)
>>> Fraction('1.414213 \t\n')
Fraction(1414213, 1000000)
>>> Fraction('-.125')
Fraction(-1, 8)
>>> Fraction('7e-6')
Fraction(7, 1000000)
>>> Fraction(2.25)
Fraction(9, 4)
>>> Fraction(1.1)
Fraction(2476979795053773, 2251799813685248)
>>> from decimal import Decimal
>>> Fraction(Decimal('1.1'))
Fraction(11, 10)
You can also perform all of the regular arithmetic operations; if the result can't be expressed as a fraction, it will be converted to a float:
>>> Fraction(3, 4) + Fraction(1, 16)
Fraction(13, 16)
>>> Fraction(3, 4) * Fraction(1, 16)
Fraction(3, 64)
>>> Fraction(3, 4) ** Fraction(1, 16)
0.982180548555

Note: I haven't used numpy at all, so what follows is mostly just an educated guess.
It sounds like you are using a typed array of type float64. Typed arrays are a particular feature of numpy — the elements of arrays (actually Lists) in Python itself can change dynamically from type to type, and there is no need for all elements of a Python list to have the same type.
With a float64-type array, your values are being cast to floats as they are assigned to array elements, undoing whatever type-casting you've done to them before that point.
The documentation for numpy array creation mentions that the default array type is float64. You probably need to change this to Decimal.
Adding the keyword argument dtype=Decimal to a call to np.arange should do this. You should then have an array of type Decimal, and any float or float64 values you assign it should be cast to Decimal. I don't know enough about what you're doing, or about numpy, to know if this is a sensible thing to be doing with a numpy array.

I was just playing around with Decimals with Numpy.
I found that Numpy offers a function called np.vectorize that allows you to take a function and apply it over a numpy array.
In [23]:
import numpy as np
import decimal
D = decimal.Decimal

We'll create a regular np float array
In [24]:
f10 = np.random.ranf(10)
f10
Out[24]:
array([ 0.45410583, 0.35353919, 0.5976785 , 0.12030978, 0.00976334,
0.47035594, 0.76010096, 0.09229687, 0.24842551, 0.30564141])
trying to convert the array using np.asarray to Decimal type doesn't work. It seems that trying to use np.asarray and specifying the decimal.Decimal type sets the array to object which is to be expected but if you actually access an individual element of the array it still has a float data type.
In [25]:
f10todec = np.asarray(f10, dtype = decimal.Decimal)
print f10todec.dtype, f10todec
print type(f10todec[0])
object [0.454105831376884 0.3535391906233327 0.5976785016396975 0.1203097778312584
0.009763339031407026 0.47035593879363524 0.7601009625324361
0.09229687387940333 0.24842550566826282 0.30564141425653435]
<type 'float'>
If you give np.array a homogenous python list of Decimal types then it seems to preserve the type, hense the list comprehension below to get a list of the values in the first array as Decimal datatype. So I had to make the decimal array this way.
In [26]:
D10 = np.array([D(d) for d in f10])
D10
Out[26]:
array([Decimal('0.4541058313768839838076019077561795711517333984375'),
Decimal('0.35353919062333272194109667907468974590301513671875'),
Decimal('0.597678501639697490332991947070695459842681884765625'),
Decimal('0.12030977783125840208100498784915544092655181884765625'),
Decimal('0.00976333903140702563661079693702049553394317626953125'),
Decimal('0.47035593879363524205672320022131316363811492919921875'),
Decimal('0.76010096253243608632743644193396903574466705322265625'),
Decimal('0.09229687387940332943259136300184763967990875244140625'),
Decimal('0.24842550566826282487653543284977786242961883544921875'),
Decimal('0.30564141425653434946951847450691275298595428466796875')], dtype=object)
basic math operations seem to work ok
In [27]:
D10/2
Out[27]:
array([Decimal('0.2270529156884419919038009539'),
Decimal('0.1767695953116663609705483395'),
Decimal('0.2988392508198487451664959735'),
Decimal('0.06015488891562920104050249392'),
Decimal('0.004881669515703512818305398469'),
Decimal('0.2351779693968176210283616001'),
Decimal('0.3800504812662180431637182210'),
Decimal('0.04614843693970166471629568150'),
Decimal('0.1242127528341314124382677164'),
Decimal('0.1528207071282671747347592373')], dtype=object)
In [28]:
np.sqrt(D10)
Out[28]:
array([Decimal('0.6738737503248542354573624759'),
Decimal('0.5945916166776426405934196108'),
Decimal('0.7730966961769384578392278689'),
Decimal('0.3468569991095154505863255680'),
Decimal('0.09880961001545864636229121433'),
Decimal('0.6858250059553349663476168402'),
Decimal('0.8718376927688066448819998853'),
Decimal('0.3038040057000620415496242404'),
Decimal('0.4984230187985531079935481296'),
Decimal('0.5528484550548498633920483390')], dtype=object)
Untill you try a trig function for which there is no corresponding function in the decimal module
In [29]:
np.sin(D10)
---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
<ipython-input-29-31ba62da35b8> in <module>()
----> 1 np.sin(D10)
AttributeError: 'Decimal' object has no attribute 'sin'
so lets use np.vectorize so we can use decimal's quantize function to do rounding.
In [30]:
npquantize = np.vectorize(decimal.Decimal.quantize)
qnt_D10 = npquantize(D10, D('.000001'))
qnt_D10
Out[30]:
array([Decimal('0.454106'), Decimal('0.353539'), Decimal('0.597679'),
Decimal('0.120310'), Decimal('0.009763'), Decimal('0.470356'),
Decimal('0.760101'), Decimal('0.092297'), Decimal('0.248426'),
Decimal('0.305641')], dtype=object)
You also need to be careful about some regular python math functions because they will automaticaly change the return type to float. I assume this is because the number can't be calculated accuratly based on the function like SIN or COS.
so i guess the short answer is use a list comprehension to get and convert the items in a numpy array to a python list then create that array from the list of Decimals.
To return numpy arrays with their type intact I guess you could use the vectorize function to wrap any function that works with Decimal type to apply over the np array.
On a side note there is a module in the pip that provides numpy style arrays with IEEE Decimals https://pypi.python.org/pypi/decimalpy/0.1

Try doing decimal.Decimal.from_float(frac[i])

How can I change what python interprets as a integer?

How can I change what python interprets as a integer? For example: 94*n would be a valid integer.

Anything is possible when you smell like Old Spice and use Python's language services to generate a AST.

On the off chance that you're not trying to modify Python's grammar, you could use int():
>>> n = 1.2
>>> x = 94*n
>>> type(x)
<type 'float'>
>>> y = int(94*n) # use int()
>>> type(y)
<type 'int'>

You can use int() and float() to convert numeric types. If you want a computer algebra system in Python, then you may be interested in taking a look at sympy which lets you do something like:
from sympy import *
n = Symbol('n')
x = 94*n
print x
print x.subs(n, 5)
If you are trying to write a computer algebra system, I would recommend using Sympy if it meets your needs or contributing to Sympy to enhance it rather than creating a whole new system from scratch.

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

Are Decimal 'dtypes' available in NumPy? - python

Unfortunately, you have to cast each of your items to Decimal when you create the numpy.array. Something like s = [['123.123','23'],['2323.212','123123.21312']] decimal_s = [[decimal.Decimal(x) for x in y] for y in s] ss = numpy.array(decimal_s)

Related

Why is a NumPy int not an instance of a Python int, but a NumPy float is an instance of a Python float?

Find maximum integer type at runtime in numpy

numpy.subtract performs subtraction wrong on 1-dimensional arrays

converting a float to decimal type in Python

How can I change what python interprets as a integer?

Categories

Resources