I need to process some MATLAB data in Python.
The data is stored as an array of doubles in MATLAB. Although it is stated here that MATLAB doubles are converted to float when handled by Python, I get this error when I retrieve the data:
TypeError: unorderable types: double() < float()
What I'm trying to do is this
import matlab.engine
eng=matlab.engine.connect_matlab()
x = eng.workspace['MyData']
x = x[len(x)-1]
if x < 0.01:
    # do stuff
    pass
How can I convert the double number stored in the array to a float so that I can use it alongside my other Python variables?
Converting doubles into floats in Matlab is as simple as calling the single function:
A = rand(10);
whos A;
B = single(A);
whos B;
As per console output:
  Name       Size      Bytes  Class     Attributes
  A         10x10        800  double

  Name       Size      Bytes  Class     Attributes
  B         10x10        400  single
Be careful about the loss of precision, since you are converting 64-bit numeric values into 32-bit numeric values.
EDIT
Since you can't manipulate your MATLAB data, I suggest using either NumPy (see ndarray.astype: https://docs.scipy.org/doc/numpy-1.12.0/reference/generated/numpy.ndarray.astype.html) or struct for a direct conversion (see this answer: convert double to float in Python).
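As a rough sketch of the struct route mentioned above: the snippet below pulls a scalar out of a nested sequence as a plain Python float, and rounds a Python float (already a 64-bit double) to 32-bit precision. The nested list is only a stand-in for the matlab.double array, on the assumption that a 1xN MATLAB array arrives as one row holding N values; the helper names to_single_precision and last_scalar are made up for illustration.

```python
import struct

def to_single_precision(value):
    """Round a Python float (a 64-bit double) to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", value))[0]

def last_scalar(rows):
    """Return the last element of the last row as a plain Python float."""
    return float(rows[-1][-1])

# Stand-in for eng.workspace['MyData']; assumed to behave like a nested sequence.
my_data = [[0.5, 0.25, 0.005]]

x = last_scalar(my_data)
if x < 0.01:
    print("below threshold:", x)

# 0.1 is not exactly representable in 32 bits, so the rounded value differs.
print(to_single_precision(0.1) == 0.1)   # False
print(to_single_precision(0.5) == 0.5)   # True
```

Note that an explicit narrowing to 32 bits is only needed if single precision is really wanted; for a comparison such as x < 0.01, a plain Python float is enough.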
I use statistics.mean() to calculate the mean of a sampled distribution. However, in the following code, the returned value is rounded to an integer. If I use numpy.mean() instead, I get the correct float result. What is going on here?
import statistics
from scipy import stats
posterior_sample = stats.beta.rvs(3, 19, size = 1000)
predictive_sample = stats.binom.rvs(100, posterior_sample, size = 1000)
print(statistics.mean(predictive_sample))
print(statistics.mean([(data >= 15).astype(int) for data in predictive_sample]))
statistics.mean does not support the numpy.int64 data type.
From the docs for statistics:
Unless explicitly noted otherwise, these functions support int, float,
decimal.Decimal and fractions.Fraction. Behaviour with other types
(whether in the numeric tower or not) is currently unsupported. Mixed
types are also undefined and implementation-dependent. If your input
data consists of mixed types, you may be able to use map() to ensure a
consistent result, e.g. map(float, input_data).
To get around this, you can do as suggested, and convert your data to float before passing to statistics.mean().
print(statistics.mean(map(float, predictive_sample)))
Now for the underlying reason behind this behaviour:
At the end of the source code for statistics.mean, there is a call to statistics._convert, which is meant to convert the returned value to an appropriate type (i.e. Fraction if inputs are fractions, float if inputs are int etc).
A single line in _convert is meant to catch other data types, and ensure that the returned value is consistent with the provided data (T is the data type for each input value, value is the calculated mean):
try:
    return T(value)
If your input is numpy.int64, then the _convert function tries to convert the calculated mean to the numpy.int64 data type. NumPy happily converts the value to an integer (discarding the fractional part, I think). Hence the mean function returns a mean truncated to a whole number, encoded as numpy.int64.
If your input data is numpy.float64, then you won't have this problem.
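To make the conversion step concrete, here is a minimal sketch that mimics it with the standard library only. It is not the actual statistics source: convert_like is a hypothetical helper, the exact mean is modelled as a Fraction, and plain int stands in for numpy.int64 on the assumption that both discard the fractional part in the same way.

```python
from fractions import Fraction

def convert_like(value, T):
    """Hypothetical stand-in for the final conversion step: coerce value to type T."""
    return T(value)

data = [3, 4]                                  # two integer samples
exact_mean = Fraction(sum(data), len(data))    # 7/2, kept exact

as_int = convert_like(exact_mean, int)         # fractional part is discarded
as_float = convert_like(exact_mean, float)     # fractional part is kept

print(as_int)    # 3
print(as_float)  # 3.5
```

This is why converting the inputs to float first, as in map(float, predictive_sample), gives the expected result: the final conversion then targets float rather than an integer type.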
I am looking for a function that can cast every element of an array to float in C++ (like what astype() does in Python).
Do you know any?
Thanks
You can cast individual items to float, but you can't cast an entire array at once.
You can pretty easily create a vector of floats that contains the values from your original array, with each individually cast to type float:
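#include <cstddef>
#include <vector>

template <class T, std::size_t N>
std::vector<float> asFloat(T (&input)[N]) {
    // Build a new vector, converting each element to float.
    return std::vector<float>(input, input + N);
}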
Note, however, that this creates a new container of float values copied from the original array, without changing the original array at all. Also note that since this is a template, it can be applied to any type for which a conversion from T to float is defined, even if that conversion might not make a whole lot of sense or produce particularly useful results (e.g., char):
char input[] = { 'a', 'b', 'c' };
auto result = asFloat(input);
If you print out the contents of result, you'll typically get:
97 98 99
That's the encoding of a, b and c in most common character sets (ASCII, Unicode, etc.) but those don't really make a whole lot of sense as floating point values.
Based on comments, however, you're just trying to create an array of float. There are a couple of minor details to deal with here. One is that you probably really want a vector instead of an array. Another is that floating point literals are of type double by default. Initializing a float variable with a double value can (and frequently will) lead to a warning about possibly losing data, or something similar.
So, when you're initializing an array or vector of float, you usually want to include a suffix to force the values to type float:
std::vector<float> values { 1.0f, 2.0f, 3.0f};
I am trying to convert a number stored as a list of ints to a float type. I got the number via a serial console and want to reassemble it back together into a float.
The way I would do it in C is something like this:
bit_data = ((int16_t)byte_array[0] << 8) | byte_array[1];
result = (float)bit_data;
What I tried to use in Python is a much simpler conversion:
result = int_list[0]*256.0 + int_list[1]
However, this does not preserve the sign of the result, as the C code does.
What is the right way to do this in python?
UPDATE:
Python version is 2.7.3.
My byte array has a length of 2.
In the Python code, byte_array is a list of ints. I've renamed it to avoid misunderstanding. I cannot just use the float() function because it will not preserve the sign of the number.
I'm a bit confused about what data you have and how it is represented in Python. As I understand it, you have received two unsigned bytes over a serial connection, which are now represented by a list of two Python ints. This data represents a big-endian 16-bit signed integer, which you want to extract and turn into a float, e.g. [0xFF, 0xFE] -> -2 -> -2.0
import array, struct
two_unsigned_bytes = [255, 254] # represented by ints
byte_array = array.array("B", two_unsigned_bytes)
# change above to "b" if the ints represent signed bytes ie. in range -128 to 127
signed_16_bit_int, = struct.unpack(">h", byte_array)
float_result = float(signed_16_bit_int)
I think what you want is the struct module.
Here's a round trip snippet:
import struct
sampleValue = 42.13
somebytes = struct.pack('=f', sampleValue)
print(somebytes)
result = struct.unpack('=f', somebytes)
print(result)
result may be surprising to you. unpack returns a tuple. So to get to the value you can do
result[0]
or modify the result setting line to be
result = struct.unpack('=f', somebytes)[0]
I personally hate that, so use the following instead
result, = struct.unpack('=f', somebytes) # tuple unpacking on assignment
The second thing you'll notice is that the value has extra digits of noise. That's because Python's native floating point representation is double, so the 32-bit value is widened to 64 bits when unpacked and its rounding error becomes visible.
(This is python3 btw, adjust for using old versions of python as appropriate)
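The "extra digits" effect can be checked directly. The following sketch, which assumes nothing beyond the struct module, round-trips the same value once as a 32-bit float and once as a 64-bit double:

```python
import struct

value = 42.13

# Pack as a 32-bit float, then unpack: the value is rounded to single precision.
single, = struct.unpack("=f", struct.pack("=f", value))

# Pack as a 64-bit double, then unpack: the value comes back unchanged.
double, = struct.unpack("=d", struct.pack("=d", value))

print(single == value)              # False
print(double == value)              # True
print(abs(single - value) < 1e-5)   # True
```

The difference is small, but it is enough to make an equality test against the original value fail.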
I am not sure I really understand what you are doing, but I think you got 4 bytes from a stream and know them to represent a float32 value. The way you are handling this suggests big-endian byte order.
Python has the struct package (https://docs.python.org/2/library/struct.html) to handle bytestreams.
import struct
stream = struct.pack(">f", 2/3.)
len(stream) # 4
reconstructed_float, = struct.unpack(">f", stream) # unpack returns a tuple
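If the four bytes arrive as a list of ints rather than as a byte string, they need to be turned into bytes first. A minimal sketch, where float32_from_bytes is a hypothetical helper name and big-endian order is assumed as above:

```python
import struct

def float32_from_bytes(byte_values):
    """Interpret four ints in 0..255 as a big-endian IEEE 754 float32."""
    if len(byte_values) != 4:
        raise ValueError("expected exactly 4 byte values")
    raw = bytes(bytearray(byte_values))   # works on Python 2.7 and 3
    return struct.unpack(">f", raw)[0]

print(float32_from_bytes([0x3F, 0x80, 0x00, 0x00]))  # 1.0
print(float32_from_bytes([0xC0, 0x00, 0x00, 0x00]))  # -2.0
```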
Okay, so I think int_list isn't really just a list of ints. The ints are constrained to 0-255 and represent bytes that can be built into a signed integer. You then want to turn that into a float. The trick is to set the sign of the first byte properly and then proceed much like you did.
float((byte_array[0]-256 if byte_array[0]>127 else byte_array[0])*256 + byte_array[1])
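One way to gain confidence in hand-written sign handling is to compare it against struct for every possible pair of bytes. The sketch below does that; both function names are made up for illustration, and the arithmetic version treats a high byte above 127 as negative by subtracting 256 (two's complement).

```python
import struct

def int16_by_arithmetic(high, low):
    """Combine two byte values (0..255) into a signed big-endian 16-bit integer."""
    if high > 127:
        high -= 256          # two's complement: reinterpret the high byte as signed
    return high * 256 + low

def int16_by_struct(high, low):
    """Same conversion, delegated to struct's '>h' format."""
    return struct.unpack(">h", bytes(bytearray([high, low])))[0]

# Compare the two approaches over every possible pair of bytes.
mismatches = [
    (h, l)
    for h in range(256)
    for l in range(256)
    if int16_by_arithmetic(h, l) != int16_by_struct(h, l)
]

print(len(mismatches))                        # 0
print(float(int16_by_arithmetic(255, 254)))   # -2.0
```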
I have allocated an array and cast it using the Python ctypes module:
dataC = ctypes.cast(crt.malloc(size), ctypes.POINTER(ctypes.c_ubyte))
in order to get byte data from a C library:
someClib.getData(handle, dataC)
Now this array is actually an array of C float types. How can I convert it to a Python list of floating type numbers?
You can cast to a pointer to float:
floatPtr = ctypes.cast(dataC, ctypes.POINTER(ctypes.c_float))
And then use a list comprehension, for example, to pull out the floats:
floatList = [floatPtr[i] for i in range(arrayLength)]
Now, only you know the value of arrayLength, but it seems plausible to me that it is equal to size // ctypes.sizeof(ctypes.c_float).
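The cast can be tried out without any C library. In this sketch the byte buffer is filled from Python with struct.pack instead of by someClib.getData, so the sample values and the buf variable are purely hypothetical stand-ins:

```python
import ctypes
import struct

values = [1.0, 2.5, -3.0]                         # exactly representable in float32
raw = struct.pack("%df" % len(values), *values)   # native byte order, like a C float[]
size = len(raw)

# Stand-in for the buffer filled by the C library (hypothetical data).
buf = (ctypes.c_ubyte * size).from_buffer_copy(raw)
dataC = ctypes.cast(buf, ctypes.POINTER(ctypes.c_ubyte))

# Reinterpret the same memory as floats.
floatPtr = ctypes.cast(dataC, ctypes.POINTER(ctypes.c_float))
arrayLength = size // ctypes.sizeof(ctypes.c_float)

floatList = [floatPtr[i] for i in range(arrayLength)]
print(floatList)                 # [1.0, 2.5, -3.0]
print(floatPtr[:arrayLength])    # slicing a ctypes pointer also returns a list
```

Keep in mind that the pointer does not own the memory: buf (or the malloc'ed block in the original code) has to stay alive for as long as floatPtr is read.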
I have C code which uses a variable data, which is a large 2d array created with malloc with variable size. Now I have to write an interface, so that the C functions can be called from within Python. I use ctypes for that.
C code:
FOO* pytrain(float **data){
    FOO *foo = foo_autoTrain((float (*)[])data);
    return foo;
}
with
FOO *foo_autoTrain(float data[nrows][ncols]) {...}
Python code:
autofoo=cdll.LoadLibrary("./libfoo.so")
... data gets filled ...
foo = autofoo.pytrain(pointer(pointer(p_data)))
My problem is that when I try to access data in foo_autoTrain, I only get 0.0 or other random values, followed by a seg fault. So how can I pass (float (*)[])data to foo_autoTrain from Python?
Please don't hesitate to point out any part of my problem that I have not described sufficiently.
It looks to me like the issue is a misunderstanding of how multidimensional arrays work in C. An expression like a[r][c] can mean one of two things depending on the type of a. If the type of a were float **, then the expression would mean a double pointer-offset dereference, something like this if written out longhand:
float *row = a[r]; // First dereference yields a pointer to the row array
return row[c]; // Second dereference yields the value
If the type of a were instead float (*)[ncols], then the expression becomes simply shorthand for shaping a contiguous, one-dimensional memory region as a multi-dimensional array:
float *flat = (float *)a;
return flat[(r * ncols) + c]; // Same as a[r][c]
So in your C code, the type of pytrain()'s argument should be either float * or float (*)[ncols] and your Python code should look something like the following (assuming you're using NumPy for your array data):
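import ctypes
import numpy

c_float_p = ctypes.POINTER(ctypes.c_float)
autofoo.pytrain.argtypes = [c_float_p]
autofoo.pytrain.restype = ctypes.c_void_p  # pytrain returns a pointer (FOO *), not an int
data = numpy.array([[0.1, 0.1], [0.2, 0.2], [0.3, 0.3]], dtype=numpy.float32)
data_p = data.ctypes.data_as(c_float_p)
foo = autofoo.pytrain(data_p)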
And if you are in fact using NumPy, check out the ctypes page of the SciPy wiki.
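The row-major indexing rule described above can also be checked from Python with ctypes alone, without NumPy or the C library. In this sketch the matrix size and fill values are arbitrary assumptions chosen so that every value is exactly representable as a float:

```python
import ctypes

nrows, ncols = 3, 2

# A contiguous nrows x ncols block of C floats, like float data[nrows][ncols].
Matrix = (ctypes.c_float * ncols) * nrows
data = Matrix()
for r in range(nrows):
    for c in range(ncols):
        data[r][c] = r * ncols + c + 0.5

# View the same memory as a flat float*.
flat = ctypes.cast(data, ctypes.POINTER(ctypes.c_float))

for r in range(nrows):
    for c in range(ncols):
        # Two-index access and manual row-major indexing hit the same element.
        assert data[r][c] == flat[r * ncols + c]

print(flat[2 * ncols + 1])   # 5.5
```

A float ** argument, by contrast, would have to point at an array of row pointers, which is not what a contiguous block like this contains.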