I would like to flip from big to little endian this string:
\x00\x40
to have it like this:
\x40\x00
I guess the proper function to use would be struct.pack, but I can't find a way to make it properly work.
A small help would be very appreciated !
Thanks
You're not showing the whole code, so the simplest solution would be:
data = data[1] + data[0]
If you insist on using struct:
>>> from struct import pack, unpack
>>> unpack('<H', '\x12\x13')
(4882,)
>>> pack('>H', *unpack('<H', '\x12\x13'))
'\x13\x12'
Which first unpacks the string as a little-endian unsigned short, and then packs it back as big-endian unsigned short. You can have it the other way around, of course. When converting between BE and LE it doesn't matter which way you're converting - the conversion function is bi-directional.
data[::-1] works for any number of bytes.
little_endian = big_endian[1] + big_endian[0]
I could image that what would really want to do here is to convert the input data from big endian (or network) byteorder to your host byteorder (whatever that may be).
>>> from struct import unpack
>>> result = unpack('>H', '\x00\x40')
This would be a more portable approach than just swapping, which is bound to fail when the code is moved to a big endian machine that would not need to swap at all.
Related
I'd like to have a very simple solution in displaying the raw bytes for a float value (or more consecutive ones in memory). in my understanding, this is named typecasting (reading the memory values in byte) not to be misunderstood as casting (reading the value and interpreting that in byte).
The simplest test seem to be:
import numpy
a=3.14159265
print(a.hex())
# Returns 0x1.921fb53c8d4f1p+1
b=numpy.array(a)
print(b.tobytes())
# returns b'\xf1\xd4\xc8S\xfb!\t#'
# expected is something like 'F1' 'D4' 'C8' '53' 'FB' '21' '09' '40'
but the method hex() returns an Interpretation of the IEEE FLOAT represantation in hex. The second method shows four hex-Byte markers \x but I wonder as a float64 should read 8 Bytes. Further I'm wondering by the other characters.
I good old simple C I would have implemented that by simply using an unsigned int pointer on the memory address of the float and printing 8 values from that unsigned int "Array" (pointer.)
I know, that I can use C within python - but are there other simple solutions?
Maybe as it is of interest: I need that functionalaty to save many big float vectors to a BLOB into a database.
I think, similiar Problems are to be found in Correct interpretation of hex byte, convert it to float reading that if possible (displayable characters) they are not printed in \x-form. How can I change that?
You can use the built-in module struct for this. If you have Python 3.5 or later:
import struct
struct.pack('d', a).hex()
It gives:
'f1d4c853fb210940'
If you have Python older than 3.5:
import binascii
binascii.hexlify(struct.pack('d', a))
Or:
hex(struct.unpack('>Q', struct.pack('d', a))[0])
If you have an array of floats and want to use NumPy:
import numpy as np
np.set_printoptions(formatter={'int':hex})
np.array([a]).view('u8')
I've two bytes \x22\x38. (Read from from a Process Memory so preferably little endian)
I am pretty sure these bytes get converted 0x588, But don't know how.
I want to know how python struct module can be used to convert \x22\x38 to 0x588.
There's something else going on if somehow 2216/3816 maps to anything other than 223816 or 382216, but in any event:
>>> import struct
>>> data = b'\x22\x38'
>>> struct.unpack('<h', data)
(14370,)
>>> struct.unpack('>h', data)
(8760,)
Note that unpack() returns a tuple. h is for short (as in a C short), the < or > sets the endianness. See the struct package docs for full info.
Python 2.6 on Redhat 6.3
I have a device that saves 32 bit floating point value across 2 memory registers, split into most significant word and least significant word.
I need to convert this to a float.
I have been using the following code found on SO and it is similar to code I have seen elsewhere
#!/usr/bin/env python
import sys
from ctypes import *
first = sys.argv[1]
second = sys.argv[2]
reading_1 = str(hex(int(first)).lstrip("0x"))
reading_2 = str(hex(int(second)).lstrip("0x"))
sample = reading_1 + reading_2
def convert(s):
i = int(s, 16) # convert from hex to a Python int
cp = pointer(c_int(i)) # make this into a c integer
fp = cast(cp, POINTER(c_float)) # cast the int pointer to a float pointer
return fp.contents.value # dereference the pointer, get the float
print convert(sample)
an example of the register values would be ;
register-1;16282 register-2;60597
this produces the resulting float of
1.21034872532
A perfectly cromulent number, however sometimes the memory values are something like;
register-1;16282 register-2;1147
which, using this function results in a float of;
1.46726675314e-36
which is a fantastically small number and not a number that seems to be correct. This device should be producing readings around the 1.2, 1.3 range.
What I am trying to work out is if the device is throwing bogus values or whether the values I am getting are correct but the function I am using is not properly able to convert them.
Also is there a better way to do this, like with numpy or something of that nature?
I will hold my hand up and say that I have just copied this code from examples on line and I have very little understanding of how it works, however it seemed to work in the test cases that I had available to me at the time.
Thank you.
If you have the raw bytes (e.g. read from memory, from file, over the network, ...) you can use struct for this:
>>> import struct
>>> struct.unpack('>f', '\x3f\x9a\xec\xb5')[0]
1.2103487253189087
Here, \x3f\x9a\xec\xb5 are your input registers, 16282 (hex 0x3f9a) and 60597 (hex 0xecb5) expressed as bytes in a string. The > is the byte order mark.
So depending how you get the register values, you may be able to use this method (e.g. by converting your input integers to byte strings). You can use struct for this, too; this is your second example:
>>> raw = struct.pack('>HH', 16282, 1147) # from two unsigned shorts
>>> struct.unpack('>f', raw)[0] # to one float
1.2032617330551147
The way you've converting the two ints makes implicit assumptions about endianness that I believe are wrong.
So, let's back up a step. You know that the first argument is the most significant word, and the second is the least significant word. So, rather than try to figure out how to combine them into a hex string in the appropriate way, let's just do this:
import struct
import sys
first = sys.argv[1]
second = sys.argv[2]
sample = int(first) << 16 | int(second)
Now we can just convert like this:
def convert(i):
s = struct.pack('=i', i)
return struct.unpack('=f', s)[0]
And if I try it on your inputs:
$ python floatify.py 16282 60597
1.21034872532
$ python floatify.py 16282 1147
1.20326173306
In Python, long integers have unlimited precision. I would like to write a 16 byte (128 bit) integer to a file. struct from the standard library supports only up to 8 byte integers. array has the same limitation. Is there a way to do this without masking and shifting each integer?
Some clarification here: I'm writing to a file that's going to be read in from non-Python programs, so pickle is out. All 128 bits are used.
I think for unsigned integers (and ignoring endianness) something like
import binascii
def binify(x):
h = hex(x)[2:].rstrip('L')
return binascii.unhexlify('0'*(32-len(h))+h)
>>> for i in 0, 1, 2**128-1:
... print i, repr(binify(i))
...
0 '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00'
1 '\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x01'
340282366920938463463374607431768211455 '\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff'
might technically satisfy the requirements of having non-Python-specific output, not using an explicit mask, and (I assume) not using any non-standard modules. Not particularly elegant, though.
Two possible solutions:
Just pickle your long integer. This will write the integer in a special format which allows it to be read again, if this is all you want.
Use the second code snippet in this answer to convert the long int to a big endian string (which can be easily changed to little endian if you prefer), and write this string to your file.
The problem is that the internal representation of bigints does not directly include the binary data you ask for.
The PyPi bitarray module in combination with the builtin bin() function seems like a good combination for a solution that is simple and flexible.
bytes = bitarray(bin(my_long)[2:]).tobytes()
The endianness can be controlled with a few more lines of code. You'll have to evaluate the efficiency.
Why not use struct with the unsigned long long type twice?
import struct
some_file.write(struct.pack("QQ", var/(2**64), var%(2**64)))
That's documented here (scroll down to get the table with Q): http://docs.python.org/library/struct.html
This may not avoid the "mask and shift each integer" requirement. I'm not sure that avoiding mask and shift means in the context of Python long values.
The bytes are these:
def bytes( long_int ):
bytes = []
while long_int != 0:
b = long_int%256
bytes.insert( 0, b )
long_int //= 256
return bytes
You can then pack this list of bytes using struct.pack( '16b', bytes )
With Python 3.2 and later, you can use int.to_bytes and int.from_bytes: https://docs.python.org/3/library/stdtypes.html#int.to_bytes
You could pickle the object to binary, use protocol buffers (I don't know if they allow you to serialize unlimited precision integers though) or BSON if you do not want to write code.
But writing a function that dumps 16 byte integers by shifting it should not be so hard to do if it's not time critical.
This may be a little late, but I don't see why you can't use struct:
bigint = 0xFEDCBA9876543210FEDCBA9876543210L
print bigint,hex(bigint).upper()
cbi = struct.pack("!QQ",bigint&0xFFFFFFFFFFFFFFFF,(bigint>>64)&0xFFFFFFFFFFFFFFFF)
print len(cbi)
The bigint by itself is rejected, but if you mask it with &0xFFFFFFFFFFFFFFFF you can reduce it to an 8 byte int instead of 16. Then the upper part is shifted and masked as well. You may have to play with byte ordering a bit. I used the ! mark to tell it to produce a network endian byte order. Also, the msb and lsb (upper and lower bytes) may need to be reversed. I will leave that as an exercise for the user to determine. I would say saving things as network endian would be safer so you always know what the endianess of your data is.
No, don't ask me if network endian is big or little endian...
Based on #DSM's answer, and to support negative integers and varying byte sizes, I've created the following improved snippet:
def to_bytes(num, size):
x = num if num >= 0 else 256**size + num
h = hex(x)[2:].rstrip("L")
return binascii.unhexlify("0"*((2*size)-len(h))+h)
This will properly handle negative integers and let the user set the number of bytes
I'm working on a program where I store some data in an integer and process it bitwise. For example, I might receive the number 48, which I will process bit-by-bit. In general the endianness of integers depends on the machine representation of integers, but does Python do anything to guarantee that the ints will always be little-endian? Or do I need to check endianness like I would in C and then write separate code for the two cases?
I ask because my code runs on a Sun machine and, although the one it's running on now uses Intel processors, I might have to switch to a machine with Sun processors in the future, which I know is big-endian.
Python's int has the same endianness as the processor it runs on. The struct module lets you convert byte blobs to ints (and viceversa, and some other data types too) in either native, little-endian, or big-endian ways, depending on the format string you choose: start the format with # or no endianness character to use native endianness (and native sizes -- everything else uses standard sizes), '~' for native, '<' for little-endian, '>' or '!' for big-endian.
This is byte-by-byte, not bit-by-bit; not sure exactly what you mean by bit-by-bit processing in this context, but I assume it can be accomodated similarly.
For fast "bulk" processing in simple cases, consider also the array module -- the fromstring and tostring methods can operate on large number of bytes speedily, and the byteswap method can get you the "other" endianness (native to non-native or vice versa), again rapidly and for a large number of items (the whole array).
If you need to process your data 'bitwise' then the bitstring module might be of help to you. It can also deal with endianness between platforms.
The struct module is the best standard method of dealing with endianness between platforms. For example this packs and unpack the integers 1, 2, 3 into two 'shorts' and one 'long' (2 and 4 bytes on most platforms) using native endianness:
>>> from struct import *
>>> pack('hhl', 1, 2, 3)
'\x00\x01\x00\x02\x00\x00\x00\x03'
>>> unpack('hhl', '\x00\x01\x00\x02\x00\x00\x00\x03')
(1, 2, 3)
To check the endianness of the platform programmatically you can use
>>> import sys
>>> sys.byteorder
which will either return "big" or "little".
The following snippet will tell you if your system default is little endian (otherwise it is big-endian)
import struct
little_endian = (struct.unpack('<I', struct.pack('=I', 1))[0] == 1)
Note, however, this will not affect the behavior of bitwise operators: 1<<1 is equal to 2 regardless of the default endianness of your system.
Check when?
When doing bitwise operations, the int in will have the same endianess as the ints you put in. You don't need to check that. You only need to care about this when converting to/from sequences of bytes, in both languages, afaik.
In Python you use the struct module for this, most commonly struct.pack() and struct.unpack().