Apologies in advance - I seem to have a very fundamental misunderstanding that I can't clear up. I have a four-vector class with variables for ct and the position vector, and I'm writing code to perform an x-direction Lorentz boost. The problem I'm running into is that, as it's written below, ct returns with a proper float value, but x does not. Messing around, I find that tempx is a float, but assigning tempx to r[0] does not make that element a float; instead it rounds down to an int. I have previously posted a question on mutability vs. immutability, and I suspect this is the issue. If so, I clearly have a deeper misunderstanding than I expected. Regardless, I have a couple of questions:
1a) If I instantiate a with a = FourVector(ct=5,r=[55,2.,3]), then type(a._r[0]) returns numpy.float64 as opposed to numpy.int32. What is going on here? I expected just a._r[1] to be a float; instead it changes the type of the whole list?
1b) How do I get the above behaviour (the whole list being floats) without having to instantiate the variables as floats? I read the documentation and have tried various methods, like using astype(float), but everything I do seems to keep it as an int. Again, I think this is the mutable/immutable problem I'm having.
2) I had thought that, in the tempx=... line, multiplying by 1.0 would convert the result to a float, as that appears to be the reason ct converts to a float, but for some reason it doesn't. Perhaps for the same reason as the others?
import numpy as np

class FourVector():
    def __init__(self, ct=0, x=0, y=0, z=0, r=[]):
        self._ct = ct
        self._r = np.array(r)
        if r == []:
            self._r = np.array([x, y, z])

    def boost(self, beta):
        gamma = 1 / np.sqrt(1 - (beta ** 2))
        tempct = (self._ct * gamma - beta * gamma * self._r[0])
        tempx = (-1.0 * self._ct * beta * gamma + self._r[0] * gamma)
        self._ct = tempct
        print(type(self._r[0]))
        self._r[0] = tempx.astype(float)
        print(type(self._r[0]))

a = FourVector(ct=5, r=[55, 2, 3])
b = FourVector(ct=1, r=[4, 5, 6])
print(a._r)
a.boost(.5)
print(a._r)
All your problems are indeed related.
A numpy array is an array that holds objects efficiently. It does this by having these objects be of the same type, like strings (of equal length) or integers or floats. It can then easily calculate just how much space each element needs and how many bytes it must "jump" to access the next element (we call these the "strides").
When you create an array from a list, numpy will try to determine a suitable data type ("dtype") from that list, to ensure all elements can be represented well. Only when you specify the dtype explicitly, will it not make an educated guess.
Consider the following example:
>>> import numpy as np
>>> integer_array = np.array([1,2,3]) # pass in a list of integers
>>> integer_array
array([1, 2, 3])
>>> integer_array.dtype
dtype('int64')
As you can see, on my system it returns a data type of int64, which is a representation of integers using 8 bytes. It chooses this because:
numpy recognizes all elements of the list are integers
my system is a 64-bit system
Now consider an attempt at changing that array:
>>> integer_array[0] = 2.4 # attempt to put a float in an array with dtype int
>>> integer_array # it is automatically converted to an int!
array([2, 2, 3])
As you can see, once a datatype for an array was set, automatic casting to that datatype is done.
Let's now consider what happens when you pass in a list that has at least one float:
>>> float_array = np.array([1., 2,3])
>>> float_array
array([ 1., 2., 3.])
>>> float_array.dtype
dtype('float64')
Once again, numpy determines a suitable datatype for this array.
Blindly attempting to change the datatype of an array is not wise:
>>> integer_array.dtype = np.float32
>>> integer_array
array([ 2.80259693e-45, 0.00000000e+00, 2.80259693e-45,
0.00000000e+00, 4.20389539e-45, 0.00000000e+00], dtype=float32)
Those numbers are gibberish, you might say. That's because numpy reinterprets the memory of that array as 4-byte floats (skilled readers could convert those numbers to their binary representation and recover the original integer values from there).
If you want to cast, you'll have to do it explicitly and numpy will return a new array:
>>> integer_array.dtype = np.int64 # go back to the previous interpretation
>>> integer_array
array([2, 2, 3])
>>> integer_array.astype(np.float32)
array([ 2., 2., 3.], dtype=float32)
Now, to address your specific questions:
1a) If I instantiate a with a = FourVector(ct=5,r=[55,2.,3]), then type(a._r[0]) returns numpy.float64 as opposed to numpy.int32. What is going on here? I expected just a._r[1] to be a float; instead it changes the type of the whole list?
That's because numpy has to determine a datatype for the entire array (unless you use a structured array), ensuring all elements fit in that datatype. Only then can numpy iterate over the elements of that array efficiently.
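A structured array is the exception mentioned above: each named field carries its own dtype, so an integer and a float can sit side by side in one array. A minimal sketch (the field names 'ct' and 'x' are my own, purely for illustration):

```python
import numpy as np

# One record type with an integer field and a float field living side by side.
rec = np.zeros(1, dtype=[('ct', np.int64), ('x', np.float64)])
rec['ct'][0] = 5    # stays an integer
rec['x'][0] = 2.5   # stays a float, no truncation
```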
1b) How do I get the above behaviour (the whole list being floats) without having to instantiate the variables as floats? I read the documentation and have tried various methods, like using astype(float), but everything I do seems to keep it as an int. Again, I think this is the mutable/immutable problem I'm having.
Specify the dtype when you are creating the array. In your code, that would be:
self._r = np.array(r, dtype=float)
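With the dtype fixed at creation, element assignment keeps its fractional part; a quick check outside the class:

```python
import numpy as np

r = np.array([55, 2, 3], dtype=float)  # dtype chosen explicitly at creation
r[0] = 27.5                            # no truncation to int now
```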
2) I had thought that, in the tempx=... line, multiplying by 1.0 would convert the result to a float, as that appears to be the reason ct converts to a float, but for some reason it doesn't. Perhaps for the same reason as the others?
That is true. Try printing the datatype of tempx; it is indeed a float. However, later on you reinsert that value into the array self._r, which has an int dtype, and as you saw previously, that casts the float back to an integer type.
I have a value variable and it can be either
-- numpy number
-- string
-- python primitive -- float or int
How do I identify the type? In particular, how do I tell whether a number is a numpy number or a Python primitive number?
When I try the following in the Python interpreter
np.issubdtype(10, np.number)
I get
arg1 = dtype(arg1).type
TypeError: data type not understood
Thanks!
Have you tried this?
isinstance(10, (int, float, str, np.number))
According to the numpy docs (https://docs.scipy.org/doc/numpy/reference/generated/numpy.issubdtype.html), the arguments to np.issubdtype should be dtypes or their string representations, not actual values.
So you can try like this:
np.issubdtype(type(10), np.number)
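Wrapping the value in type() makes issubdtype usable on all three kinds of value from the question; a quick check:

```python
import numpy as np

# issubdtype wants a type or dtype, so pass type(value) rather than the value itself.
is_int_number = np.issubdtype(type(10), np.number)             # Python int counts as a number
is_np_number = np.issubdtype(type(np.float32(1)), np.number)   # so does a numpy scalar
is_str_number = np.issubdtype(type('10'), np.number)           # a string does not
```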
If your variable is A, type(A) should give you your answer.
For example, if you have
import numpy as np
A = np.array([3, 4, 5])
type(A) should give you numpy.ndarray as the answer. Similarly, type(234) simply returns int.
If you're looking to find the type of the variable, would type(var) just work?
You can use the type() method in Python.
>>> r = np.float32(1.0)
>>> type(r).__name__
'float32'
>>> s = '1.0'
>>> type(s).__name__
'str'
>>> t = 1.0
>>> type(t).__name__
'float'
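Putting the isinstance and type() approaches together, a small helper along these lines can label the three cases from the question (the function name classify and the labels are my own, purely illustrative). Note that the numpy check has to come first, because numpy scalars like np.float64 are also subclasses of Python's float:

```python
import numpy as np

def classify(value):
    # Check numpy scalars first: np.float64 subclasses Python float,
    # so the order of the isinstance checks matters.
    if isinstance(value, np.number):
        return 'numpy number'
    if isinstance(value, (int, float)):
        return 'python number'
    if isinstance(value, str):
        return 'string'
    return 'other'
```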
I would like to know if there is a way to find out, at runtime, the largest integer type (or unsigned integer, or float, or complex - any "fixed size" type) supported by numpy. That is, let's assume I know (from the documentation) that the largest unsigned integer type in the current version of numpy is np.uint64, and I have a line of code such as:
y = np.uint64(x)
I would like my code to use whatever is the largest unsigned integer type available in the version of numpy that my code runs against. That is, I would be interested in replacing the above hardcoded type with something like this:
y = np.largest_uint_type(x)
Is there such a method?
You can use np.sctypes:
>>> def largest_of_kind(kind):
... return max(np.sctypes[kind], key=lambda x: np.dtype(x).itemsize)
...
>>> largest_of_kind('int')
<class 'numpy.int64'>
>>> largest_of_kind('uint')
<class 'numpy.uint64'>
>>> largest_of_kind('float')
<class 'numpy.float128'>
>>> largest_of_kind('complex')
<class 'numpy.complex256'>
While I do like @PaulPanzer's solution, I also found that numpy defines a function maximum_sctype(), not documented in numpy's standard docs. This function fundamentally does the same thing as @PaulPanzer's solution (plus some edge-case analysis). From the code it is clear that sctype types are sorted in increasing size order. Using this function, what I need can be done as follows:
y = np.maximum_sctype(float)(x)    # currently np.float128 on OSX
y = np.maximum_sctype(np.uint8)(x) # currently np.uint64
etc.
Not so elegant, but using the prior knowledge that np.uint widths are always powers of 2, you can do something like this:
for i in range(4, 100):
    try:
        eval('np.uint' + str(2**i) + '(0)')
    except AttributeError:
        c = i - 1
        break
answer = 'np.uint' + str(2**c)

>>> answer
'np.uint64'
and you can use it as
y = eval(answer + '(' + str(x) + ')')
or, alternatively, without the power-of-two assumption and with no eval (checking all widths up to N, here 1000):
for i in range(1000):
    if hasattr(np, 'uint' + str(i)):
        x = 'uint' + str(i)

>>> x
'uint64'
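The same hasattr scan can be written without eval and without carrying the result around as a string; a sketch of my own, same idea as above:

```python
import numpy as np

def largest_uint_type():
    # Probe the power-of-two widths; keep the widest one this numpy build defines.
    found = None
    for bits in (8, 16, 32, 64, 128):
        t = getattr(np, 'uint%d' % bits, None)
        if t is not None:
            found = t
    return found

y = largest_uint_type()(300)  # e.g. a numpy.uint64 on most platforms
```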
I'm working with the mpmath python library to gain precision during some computations, but I need to cast the result to a numpy native type.
More precisely, I need to cast an mpmath matrix (that contains mpf object types) to a numpy.ndarray (that contains float types).
I have solved the problem with a raw approach:
# My input Matrix:
matr = mp.matrix(
    [['115.80200375', '22.80402473', '13.69453064', '54.28049263'],
     ['22.80402473', '86.14887381', '53.79999432', '42.78548627'],
     ['13.69453064', '53.79999432', '110.9695448', '37.24270321'],
     ['54.28049263', '42.78548627', '37.24270321', '95.79388469']])

# multiple precision computation
D = MPDBiteration(matr)

# Create a new ndarray
Z = numpy.ndarray((matr.cols, matr.rows), dtype=float)

# I fill it pretty "manually"
for i in range(0, matr.rows):
    for j in range(0, matr.cols):
        Z[i, j] = D[i, j]  # float(D[i,j]) seems to work the same
My question is:
Is there a better/more elegant/easier/clever way to do it?
UPDATE:
Reading the mpmath documentation again, I've found a very useful method, tolist(), which can be used as follows:
Z = np.array(matr.tolist(),dtype=np.float32)
It seems slightly better and more elegant (no for loops needed).
Are there better ways to do it? Does my second solution round or chop extra digits?
Your second method is to be preferred, but using np.float32 means casting numbers to single precision. For your matrix, this precision is too low: 115.80200375 becomes 115.80200195 due to truncation. You can set double precision explicitly with numpy.float64, or just pass Python's float type as an argument, which means the same thing.
Z = numpy.array(matr.tolist(), dtype=float)
or, to keep the matrix structure,
Z = numpy.matrix(matr.tolist(), dtype=float)
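The truncation is easy to verify with numpy alone, no mpmath needed:

```python
import numpy as np

value = 115.80200375
# float32 keeps only ~7 significant decimal digits, float64 about 15-16,
# so the round-trip through float32 loses the trailing digits.
loss32 = abs(float(np.float32(value)) - value)
loss64 = abs(float(np.float64(value)) - value)
```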
You can do this when vectorizing a function (which is what we usually want to do anyway). The following example vectorizes the jtheta function and converts its output:
import numpy as np
import mpmath as mpm
jtheta3_fn = lambda z,q: mpm.jtheta(n=3,z=z,q=q)
jtheta3_fn = np.vectorize(jtheta3_fn,otypes=(float,))
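The same otypes mechanism works for any scalar function, mpmath or not; a numpy-only sketch:

```python
import numpy as np

# vectorize maps a scalar function over an array element by element; otypes pins
# the output dtype, so the result is a native float64 ndarray regardless of what
# type the wrapped function returns.
parse = np.vectorize(float, otypes=(float,))
Z = parse(np.array(['1.5', '2.25', '3.0']))
```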