Is there a faster way of reading a buffer of integers to an array of complex numbers?
This works well (if the buffer contains floats):
import numpy, struct
binary_string = struct.pack('2f', 1,2)
print numpy.frombuffer(binary_string, dtype=numpy.complex64)
# [ 1. + 2.j]
But if the read buffer contains integers, there is a problem:
import numpy, struct
binary_string = struct.pack('2i', 1,2)
print numpy.frombuffer(binary_string, dtype=numpy.complex64)
# [ 1.40129846e-45 +2.80259693e-45j]
So I can't find any faster way to convert it except by slicing:
import numpy, struct
#for int32
binary_string = struct.pack('2i', 1,2)
ints = numpy.frombuffer(binary_string, dtype=numpy.int32)
print ints[::2] + 1j*ints[1::2]
# [ 1. + 2.j]
#for int16
binary_string = struct.pack('2h', 1,2)
ints = numpy.frombuffer(binary_string, dtype=numpy.int16)
print ints[::2] + 1j*ints[1::2]
# [ 1. + 2.j]
Also, is there any "complex number with integer parts" datatype, so the result could look like:
[1 + 2j]
Thanks.
For a string packed with 4-byte ints you could use:
In [35]: np.frombuffer(struct.pack('2i', 1,2), dtype='i4').astype(np.float32).view(np.complex64)
Out[35]: array([ 1.+2.j], dtype=complex64)
For a string packed with 2-byte ints you could use:
In [34]: np.frombuffer(struct.pack('2h', 1,2), dtype='i2').astype(np.float32).view(np.complex64)
Out[34]: array([ 1.+2.j], dtype=complex64)
The idea here is to let np.frombuffer read the string using an integer dtype appropriate for the string. Then use astype to preserve the integer values while changing the underlying representation to float32. Then use view to reinterpret the underlying data as complex64 (so every two float32 values are treated as one complex64).
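These steps can be wrapped in a small helper; the function name `ints_to_complex` and the dtype-string parameter are my own choice, not from the answer, but the pipeline is the same:

```python
import struct
import numpy as np

def ints_to_complex(buf, int_dtype):
    """Read interleaved (re, im) integers from a byte buffer as complex64."""
    return np.frombuffer(buf, dtype=int_dtype).astype(np.float32).view(np.complex64)

print(ints_to_complex(struct.pack('2i', 1, 2), 'i4'))  # [1.+2.j]
print(ints_to_complex(struct.pack('2h', 1, 2), 'i2'))  # [1.+2.j]
```

The same helper covers both the 4-byte and 2-byte cases by varying the integer dtype string.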
Related
I am writing a program which gets an unformatted string as an input and should output a numpy int array.
The string contains an id, a timestamp, etc. and a hexadecimal data array. Say the input string is data_string = '01190810000235a5000235b4000234c5000211a5'; then 01 is the id, 190810 is the timestamp, and 000235a5000235b4000234c5000211a5 is the data array with the values 000235a5, 000235b4, 000234c5, 000211a5. (The real input string is several MB in size.)
I am having problems converting the data array to a numpy integer array. I have come up with:
import numpy as np
data_dict['data array'] = np.core.defchararray.asarray(data_string[8:], 8)
but this way I am only getting a string array. I tried fiddling with np.fromstring(data_string[8:], np.int32), but this changed the given values of the input string. Is there any way to get an int array from a string? Using a for loop (or similar implementations) is not an option because this code is performance critical.
EDIT:
To clarify my problem...
Input string is
>>> import numpy as np
>>> s = "000235a5000235b4000234c5000211a5"
Converting it with np.core.defchararray.asarray() results in a chararray. But I want an integer-typed array.
>>> s1 = np.core.defchararray.asarray(s, 8)
>>> s1
chararray(['000235a5', '000235b4', '000234c5', '000211a5'], dtype='<U8')
Converting s with np.fromstring() results in an integer array, but it seems that it does not like hexadecimal numbers.
>>> s2 = np.fromstring(s, dtype=np.int32)
>>> s2
array([842018864, 895563059, 842018864, 878851379, 842018864, 895693875,
842018864, 895562033])
array([0x000235a5, 0x000235b4, 0x000234c5, 0x000211a5]), i.e. the integer values rather than strings, is the result I actually want to get.
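One way to get exactly that (a sketch, assuming the data is a sequence of big-endian 32-bit words, as the 8-digit grouping suggests) is to decode the hex digits to raw bytes with bytes.fromhex and reinterpret them with np.frombuffer, with no Python-level loop over the values:

```python
import numpy as np

s = "000235a5000235b4000234c5000211a5"
# decode pairs of hex digits into raw bytes, then view them as big-endian uint32
arr = np.frombuffer(bytes.fromhex(s), dtype='>u4')
print(arr)                    # [144805 144820 144581 135589]
print([hex(v) for v in arr])  # ['0x235a5', '0x235b4', '0x234c5', '0x211a5']
```

Both bytes.fromhex and np.frombuffer run in C, so this should stay fast on multi-MB inputs.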
I am trying to convert a c_byte array into different datatypes in Python, e.g. converting an eight-entry c_byte array into an int64 or a double.
In my project, I read a long c_byte array (n > 500) containing multiple sensor values with different datatypes. So maybe the first entry is a bool, the second and third entry represent an int8, and entries 4-11 store a double. I am looking for a convenient way of casting those array entries into the required datatypes.
At the moment, I am transcribing the byte array into strings containing the binary digits. I was thinking about manually writing functions to convert those strings into floats and ints, but I hope there is a more elegant way of doing so. Also, I run into problems converting signed ints...
def convert_byte_to_binary(array):
    binary = ''
    for i in array:
        binary += format(i, '#010b')[2:]
    return binary

def convert_binary_to_uint(binary):
    return int(binary, 2)

>>> array = read_cbyte_array(address, length)  # reads an array of size length, starting at address
>>> array
[15, 30, 110, 7, 65]
>>> convert_byte_to_binary(array)
'0000111100011110011011100000011101000001'
I found the bitstring library, which does something very similar to what I want. Unfortunately, I did not find any support for 64-bit integers or double floats.
Ideally, I would have a set of functions that can convert the ctypes.c_byte array into the corresponding ctypes types.
The struct module is intended for exactly this.
Here's a short example. The format string '<?2b8d' specifies:
< little-endian
? one bool
2b two 1-byte signed ints
8d eight doubles
import ctypes
import struct
# Build the raw data
org_data = struct.pack('<?2b8d',True,-1,2,1.1,2.2,3.3,4.4,5.5,6.6,7.7,8.8)
print(org_data)
# Just to demo using a c_byte array...
byte_data = (ctypes.c_byte * len(org_data))(*org_data)
print(byte_data)
# convert to the original data
data = struct.unpack('<?2b8d',byte_data)
print(data)
Output:
b'\x01\xff\x02\x9a\x99\x99\x99\x99\x99\xf1?\x9a\x99\x99\x99\x99\x99\x01@ffffff\n@\x9a\x99\x99\x99\x99\x99\x11@\x00\x00\x00\x00\x00\x00\x16@ffffff\x1a@\xcd\xcc\xcc\xcc\xcc\xcc\x1e@\x9a\x99\x99\x99\x99\x99!@'
<__main__.c_byte_Array_67 object at 0x0000025D4066B0C8>
(True, -1, 2, 1.1, 2.2, 3.3, 4.4, 5.5, 6.6, 7.7, 8.8)
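For the longer mixed-layout buffers described in the question (a bool here, a couple of int8s there, doubles at some offset), struct.unpack_from can read individual fields at a given byte offset without slicing the buffer first; the layout and offsets below are purely illustrative:

```python
import struct

# pack a small mixed-layout buffer: one bool, two signed bytes, eight doubles
buf = struct.pack('<?2b8d', True, -1, 2, *[1.1 * k for k in range(1, 9)])

# read single fields back at known byte offsets
flag, = struct.unpack_from('<?', buf, 0)          # bool at offset 0
lo, hi = struct.unpack_from('<2b', buf, 1)        # two int8s at offset 1
first_double, = struct.unpack_from('<d', buf, 3)  # first double at offset 3
print(flag, lo, hi, first_double)  # True -1 2 1.1
```

struct.calcsize can compute those offsets from a format prefix, which avoids hand-counting bytes in longer layouts.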
I want to convert a numpy array, which is of float32 data type to its equivalent hexadecimal format, in Python 3.
This is the implementation I tried but it doesn't seem to work:
import numpy as np
np.set_printoptions(formatter={'float':hex})
np.array([1.2,3.4,2.6,2.1], dtype = np.float32)
Python's float type has a built-in .hex() method. In the formatter, you can use a lambda to first cast the value to float, and then call .hex():
np.set_printoptions(formatter={'float':lambda x:float(x).hex()})
For the following array:
arr = np.array([1.2,3.4,2.6,2.1], dtype = np.float32)
print(arr)
The output is:
[0x1.3333340000000p+0 0x1.b333340000000p+1 0x1.4ccccc0000000p+1
0x1.0ccccc0000000p+1]
The float.hex() method converts a floating-point number to its hexadecimal representation. Similarly, the float.fromhex() method converts a hexadecimal string back to its floating-point value. hex() is an instance method, while fromhex() is a class method.
Below is code that demonstrates both directions:
import numpy as np

#define numpy array
np_arr = np.array([1.2,3.4,2.6,2.1,15,10], dtype = np.float32)
#convert numpy array to hex strings
np_arr_hex = np.array([float.hex(float(x)) for x in np_arr])
#back to float, rounded to 1 decimal place
np_arr_float = np.array([round(float.fromhex(x),1) for x in np_arr_hex])
#print both arrays
np_arr_hex, np_arr_float
Output:
(array(['0x1.3333340000000p+0', '0x1.b333340000000p+1',
        '0x1.4ccccc0000000p+1', '0x1.0ccccc0000000p+1',
        '0x1.e000000000000p+3', '0x1.4000000000000p+3'], dtype='<U20'),
 array([ 1.2,  3.4,  2.6,  2.1, 15. , 10. ]))
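If you want the raw IEEE-754 bit pattern of each float32 rather than C99 hex-float notation, a vectorized alternative (my own sketch, not from the answers above) is to view the array as uint32 and format those integers:

```python
import numpy as np

arr = np.array([1.2, 3.4, 2.6, 2.1], dtype=np.float32)
# reinterpret the same bytes as unsigned 32-bit integers (no copy, no rounding)
bits = arr.view(np.uint32)
hex_strings = [f'0x{b:08x}' for b in bits]
print(hex_strings[0])  # 0x3f99999a  (the float32 bit pattern of 1.2)
```

Note that float.hex() goes through Python's 64-bit float, so it shows the value's double representation, whereas the view shows the actual 32 bits stored in the array.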
Considering the following example array :
a = np.array([0,1,1,0,1,1,1,0,1,0])
which could be of any dtype (int, float, ...).
How would I get the following output without using nasty loops and string casts?
np.array([0b01,0b10,0b11,0b10,0b10])
a = a.astype(int)
output = a[0::2] * 2 + a[1::2]
Gives the array you've described (though it doesn't print in binary).
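The same pairing can also be written with a reshape and a matrix product, which generalizes to wider bit groups; this is just an equivalent sketch of the slicing answer:

```python
import numpy as np

a = np.array([0, 1, 1, 0, 1, 1, 1, 0, 1, 0])
# group the flat array into pairs, then weight the columns as binary digits
pairs = a.astype(int).reshape(-1, 2)
output = pairs @ np.array([2, 1])
print(output)  # [1 2 3 2 2]
```

For groups of k bits, reshape to (-1, k) and use the weight vector [2**(k-1), ..., 2, 1].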
I am new to programming and numpy... While reading tutorials and experimenting in a Jupyter notebook, I thought of converting the dtype of a numpy array as follows:
import numpy as np
c = np.random.rand(4)*10
print c
#Output1: [ 0.12757225 5.48992242 7.63139022 2.92746857]
c.dtype = int
print c
#Output2: [4593764294844833304 4617867121563982285 4620278199966380988 4613774491979221856]
I know the proper way of changing is:
c = c.astype(int)
But I want to know the reason behind those strange numbers in Output2. What are they and what do they signify?
Floats and integers (numpy.float64s and numpy.int64s) are represented differently in memory. The value 42 stored in these different types corresponds to a different bit pattern in memory.
When you're reassigning the dtype attribute of an array, you keep the underlying data unchanged, and you're telling numpy to interpret that pattern of bits in a new way. Since the interpretation now doesn't match the original definition of the data, you end up with gibberish (meaningless numbers).
On the other hand, converting your array via .astype() will actually convert the data in memory:
>>> import numpy as np
>>> arr = np.random.rand(3)
>>> arr.dtype
dtype('float64')
>>> arr
array([ 0.7258989 , 0.56473195, 0.20885672])
>>> arr.data
<memory at 0x7f10d7061288>
>>> arr.dtype = np.int64
>>> arr.data
<memory at 0x7f10d7061348>
>>> arr
array([4604713535589390862, 4603261872765946451, 4596692876638008676])
Proper conversion:
>>> arr = np.random.rand(3)*10
>>> arr
array([ 3.59591191, 1.21786042, 6.42272461])
>>> arr.astype(np.int64)
array([3, 1, 6])
As you can see, using astype will meaningfully convert the original values of the array, in this case it will truncate to the integer part, and return a new array with corresponding values and dtype.
Note that assigning a new dtype doesn't trigger any checks, so you can do very weird stuff with your array. In the above example, 64 bits of floats were reinterpreted as 64 bits of integers. But you can also change the bit size:
>>> arr = np.random.rand(3)
>>> arr.shape
(3,)
>>> arr.dtype
dtype('float64')
>>> arr.dtype = np.float32
>>> arr.shape
(6,)
>>> arr
array([ 4.00690371e+35, 1.87285304e+00, 8.62005305e+13,
1.33751166e+00, 7.17894062e+30, 1.81315207e+00], dtype=float32)
By telling numpy that each element occupies half the space it originally did, numpy deduces that your array has twice as many elements! Clearly not something you would normally want to do.
Another example: consider the 8-bit unsigned integer 255==2**8-1: it corresponds to 11111111 in binary. Now, try to reinterpret two of these numbers as a single 16-bit unsigned integer:
>>> arr = np.array([255,255],dtype=np.uint8)
>>> arr.dtype = np.uint16
>>> arr
array([65535], dtype=uint16)
As you can see, the result is the single number 65535. If that doesn't ring a bell, it's exactly 2**16-1, with 16 ones in its binary pattern. The two all-ones patterns were reinterpreted as a single 16-bit number, and the result changed accordingly. The reason you often see weirder numbers is that reinterpreting floats as ints or vice versa leads to a much stronger mangling of the data, due to how floating-point numbers are represented in memory.
As hpaulj noted, you can directly perform this reinterpretation of the data by constructing a new view of the array with a modified dtype. This is probably more useful than having to reassign the dtype of a given array, but then again changing the dtype is only useful in fairly rare, very specific use cases.
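A minimal sketch of that view-based reinterpretation (same bit pattern, no copy, and the original array keeps its own dtype):

```python
import numpy as np

arr = np.array([1.0, 2.0])    # float64
# view() returns a new array object over the same buffer,
# interpreted with the new dtype; nothing is converted or copied
as_ints = arr.view(np.int64)
print(as_ints)    # [4607182418800017408 4611686018427387904]
print(arr.dtype)  # float64 -- the original array is unchanged
```

Those two integers are exactly the IEEE-754 bit patterns 0x3FF0000000000000 and 0x4000000000000000 of the doubles 1.0 and 2.0.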