I have data-points of hex-strings in a list.
I tried converting the list to string and then to a byte array. As I try to convert the byte array to float it only returns one value.
The code I used is:
byteArrObj = bytearray(n, 'utf-8')
byteObj = bytes(byteArrObj)
byte8=bytearray.fromhex(b)
print(byte8)
floatvalue = struct.unpack('<f', byte8[:4])
This produces a tuple, like (0.09273222088813782,).
How do I print all the float values from the list?
First, let's make a function that converts one of the values:
def hexdump_to_float(text):
    return struct.unpack('<f', bytes.fromhex(text))[0]
Notice:
I skip the step of finding byteArrObj or byteObj from your code, because they had no effect in your code and do not help solve the problem.
I use the type bytes rather than bytearray because we don't need to modify the underlying data. (It's analogous to using a tuple rather than list.)
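I do not bother with slicing the data, because we already know there will be only 4 bytes. (Note that struct.unpack does not ignore extra data: it raises struct.error unless the buffer is exactly the size the format requires, so longer input would need to be sliced first.)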
To get the value out of the tuple that struct.unpack returns, I simply index into the tuple. That gives me a single float value.
So this is a simple one-line function, but it helps to make a function anyway since it gives a clear name for what we are doing.
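As a small illustration of the points above, here is a sketch of what struct.unpack returns for a single value and how indexing extracts the float. The hex string is a made-up sample (the little-endian encoding of 1.0), not data from the question.

```python
import struct

# Hypothetical sample: 4 bytes holding the little-endian float 1.0.
raw = bytes.fromhex("0000803f")

as_tuple = struct.unpack("<f", raw)  # always a tuple, even for one value
value = as_tuple[0]                  # index into the tuple to get the float

# A bytearray works too; it is only needed if the bytes must be modified.
same = struct.unpack("<f", bytearray.fromhex("0000803f"))[0]

print(as_tuple, value, same)
```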
The next step is to apply that to each element of the list. You can do this easily with, for example, a list comprehension:
my_floats = [hexdump_to_float(x) for x in my_hexdumps]
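Putting the two steps together, a complete run might look like the sketch below. The contents of my_hexdumps are invented sample values, each encoding exactly 4 bytes; substitute your own list.

```python
import struct

def hexdump_to_float(text):
    """Interpret 8 hex digits as one little-endian 32-bit float."""
    return struct.unpack("<f", bytes.fromhex(text))[0]

# Hypothetical sample data, not taken from the question.
my_hexdumps = ["0000803f", "00000040", "0000c03f", "000020c1"]

my_floats = [hexdump_to_float(x) for x in my_hexdumps]
print(my_floats)  # [1.0, 2.0, 1.5, -10.0]
```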
I am new to this sort of stuff, so sorry if it's really simple and I am just being stupid.
So I have this variable with some bytes in it (not sure if that's the right name.)
data = b'red\x00XY\x001\x00168.93\x00859.07\x00'
I need to convert this to a list. The intended output would be something like:
["red","XY","1","168.93","859.07"]
How would I go about doing this?
Thank you for your help.
We can use the following line:
[x.decode("utf8") for x in data.split(b"\x00") if len(x)]
Going part by part:
x.decode("utf8"): x will be a bytes object, so we need to convert it into a string via .decode("utf8").
for x in data.split(b"\x00"): We can use Python's built-in bytes.split method to split the bytes object on the null bytes and get a list of the individual pieces.
if len(x): This is equivalent to if len(x) > 0, since we want to discard the empty string at the end.
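To make the three parts concrete, here is a short sketch that runs them on the data from the question. The intermediate name pieces is introduced only for illustration.

```python
data = b'red\x00XY\x001\x00168.93\x00859.07\x00'

# Splitting on the null byte leaves an empty bytes object after the
# trailing separator.
pieces = data.split(b"\x00")
print(pieces)

# Decode each non-empty piece into a str.
fields = [x.decode("utf8") for x in pieces if len(x)]
print(fields)
```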
This version may be easier to follow if you prefer an explicit loop that produces the same output.
data = b'red\x00XY\x001\x00168.93\x00859.07\x00'
new_list = []  # start with an empty list
for item in data.split(b'\x00'):  # split the bytes on the null byte
    if item:  # skip the empty piece left after the trailing null byte
        new_list.append(item.decode('utf-8'))  # convert bytes to str
print(new_list)
I followed the TensorFlow guide to save my string data using:
def _create_string_feature(values):
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[values.encode('utf-8')]))
I also used ["tf.string", "FixedLenFeature"] as my feature original type, and "tf.string" as my feature convert type.
However, during training, when I run my session and create iterators, my string feature for a batch size of 2 (for example, ['food fruit', 'cupcake food']) looks like the output below. The problem is that this list has size 1, not 2 (batch_size=2). Why are the instances in one batch stuck together rather than split?
[b'food fruit' b'cupcake food']
My other features, which are int or float, are numpy arrays of shape (batch_size, feature_len), which is fine, but I am not sure why the string features are not separated within a single batch.
Any help would be appreciated.
This will convert a BytesList or bytes_list string object to a string:
my_bytes_list_object.value[0].decode()
Or, in the case one is extracting the string from a TFRecord Example object:
my_example.features.feature['MyFeatureName'].bytes_list.value[0].decode()
From what I can see, bytes_list returns a BytesList object, from which we can read the value field. This will return a RepeatedScalarContainer, which operates like a simple list object. In fact, if you wrap it with the list() operation it will convert it to a list. However, instead we can just access it as if it were a list and use [0] to get the zeroth item. The returned item is a bytes object, which can be converted to a standard str object with the decode() method.
I know that for an ndarray containing strings, the dtype returned will be of the form dtype(S#), where # denotes the length of the string.
As shown in the figure, array 'a' is generated from the list [1,'2','3']. Once the array is created, all the elements become strings. Array 'b' is created from the list ['1',2,'3'].
a.dtype gives S21 while b.dtype gives S1. The length of every element in both a and b is 1. Why is the element length in the first array taken as 21 even though all the elements have length 1?
The dtype remains 'S21' even if 1 is replaced with 9223372036854775807. Once we use 9223372036854775808, the dtype becomes 'S20'. How does this happen?
Could somebody please explain?
np.array is compiled code, so we'd have to dig into that to see exactly what is going on. I don't recall seeing any documentation. So the easiest thing is to just try some values and look for a pattern.
If the 1st element is a string it appears to use the longest string (or str(i) for numbers).
If the 1st is a number it appears to start with some default size.
Unless the dtype is truncating some of the strings, I wouldn't worry too much about this behavior. If it matters, I'd suggest defining your own length.
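One way to define your own length is to pass an explicit dtype when building the array. The following is a minimal sketch, assuming NumPy on Python 3 (where 'S' arrays hold bytes); the width of 4 is an arbitrary choice for illustration.

```python
import numpy as np

# Explicit 4-byte string dtype, so the width no longer depends on
# which element comes first in the list.
a = np.array([1, '2', '3'], dtype='S4')
b = np.array(['1', 2, '3'], dtype='S4')

print(a.dtype, b.dtype)  # both arrays share the same fixed-width dtype
print(a.tolist())        # elements come back as bytes objects
```

Pick a width at least as long as the longest value you expect, since shorter widths truncate.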
Say I have a list or a tuple containing numbers of type long long:
x = [12974658, 638364, 53637, 63738363]
If I want to struct.pack them individually, I have to use
struct.pack('<Q', 12974658)
or, if I want to pack several at once, I have to list them explicitly like this:
struct.pack('<4Q', 12974658, 638364, 53637, 63738363)
But how can I pass the items of a list or tuple to struct.pack? I tried using a for loop like this:
struct.pack('<4Q', ','.join(i for i in x))
I got an error saying expected string, int found, so I converted the list of ints into strs, but now it is much more complicated to pack them, because the whole list gets converted into a single string (like one sentence).
As of now I'm doing something like
binary_data = b''
x = [12974658, 638364, 53637, 63738363]
for i in x:
    binary_data += struct.pack('<Q', i)
And I unpack them like
struct.unpack('<4Q', binary_data)
My question: is there a better way, such as passing a list or tuple directly to struct.pack, or perhaps a one-liner?
You can splat, I'm sorry "unpack the argument list":
>>> struct.pack("<4Q", *[1,2,3,4])
b'\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00\x03\x00\x00\x00\x00\x00\x00\x00\x04\x00\x00\x00\x00\x00\x00\x00'
If the length of the list is dynamic, you can of course build the format string at runtime too:
>>> x = [1, 2] # This could be any list of integers, of course.
>>> struct.pack("<%uQ" % len(x), *x)
b'\x01\x00\x00\x00\x00\x00\x00\x00\x02\x00\x00\x00\x00\x00\x00\x00'
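For completeness, here is a sketch of the full round trip with the list from the question, assuming Python 3. The names fmt, packed and unpacked are introduced only for this example.

```python
import struct

x = [12974658, 638364, 53637, 63738363]

# Build the format string from the list length, then unpack the
# argument list into struct.pack with *.
fmt = "<%dQ" % len(x)          # "<4Q" for a four-element list
packed = struct.pack(fmt, *x)

# The same format string reads the values back as a tuple.
unpacked = struct.unpack(fmt, packed)
print(len(packed), unpacked)
```

Reusing one format string for both directions keeps the packed size and the unpacked item count in agreement.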
I have a list variable with one element,
x=['2']
and I want to convert it to a float:
x=2.0
I tried float(x), or int(x) - without success.
Can anyone please help me?
You need to convert the first item in your one-item list to a float. The approaches you tried already are trying to convert the whole list to a float (or an int - not sure where you were going with that!).
Python is zero-indexed (index numbers start from zero) which means that the first item in your list is referred to as x[0].
So the snippet you need is:
x = float(x[0])
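A short sketch of the difference, using the list from the question. The name value is chosen here so the original list stays available for comparison.

```python
x = ['2']

# Passing the whole list fails, because float() expects a string or a number.
try:
    float(x)
except TypeError as exc:
    print("float(x) failed:", exc)

# Converting the first (and only) item works.
value = float(x[0])
print(value)  # 2.0
```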