compute CRC of struct in Python

compute CRC of struct in Python - python

I have the following struct, from the NRPE daemon code in C:
typedef struct packet_struct {
int16_t packet_version;
int16_t packet_type;
uint32_t crc32_value;
int16_t result_code;
char buffer[1024];
} packet;
I want to send this data format to the C daemon from Python. The CRC is calculated when crc32_value is 0, then it is put into the struct. My Python code to do this is as follows:
cmd = '_NRPE_CHECK'
pkt = struct.pack('hhIh1024s', 2, 1, 0, 0, cmd)
# pkt has length of 1034, as it should
checksum = zlib.crc32(pkt) & 0xFFFFFFFF
pkt = struct.pack('hhIh1024s', 2, 1, checksum, 0, cmd)
socket.send(....)
The daemon is receiving these values: version=2 type=1 crc=FE4BBC49 result=0
But it is calculating crc=3731C3FD
The actual C code to compute the CRC is:
https://github.com/KristianLyng/nrpe/blob/master/src/utils.c
and it is called via:
calculate_crc32((char *)packet, sizeof(packet));
When I ported those two functions to Python, I get the same as what zlib.crc32 returns.
Is my struct.pack call correct? Why is my CRC computation differing from the server's?

From the Python struct documentation:
To handle platform-independent data formats or omit implicit pad
bytes, use standard size and alignment instead of native size and
alignment: see Byte Order, Size, and Alignment for details.
Use '!' as the first format character to make the packed structure platform-independent. It forces big-endian, standard type sizes, and no pad bytes. Then the CRCs should be consistent.

Related

How to unpack a C-style structure inside another structure?

I am receiving data via socket interface from an application (server) written in C. The data being posted has the following structure. I am receiving data with a client written in Python.
struct hdr
{
int Id;
char PktType;
int SeqNo;
int Pktlength;
};
struct trl
{
char Message[16];
long long info;
};
struct data
{
char value[10];
double result;
long long count;
short int valueid;
};
typedef struct
{
struct hdr hdr_buf;
struct data data_buf[100];
struct trl trl_buf;
} trx_unit;
How do I unpack the received data to access my inner data buffer?

Using the struct library is the way to go. However, you will have to know a bit more about the C program that is serializing the data. Consider the hdr structure. If the C program is sending it using the naive approach:
struct hdr header;
send(sd, &hdr, sizeof(header), 0);
Then your client cannot safely interpret the bytes that are sent to it because there is an indeterminate amount of padding inserted between the struct members. In particular, I would expect three bytes of padding following the PktType member.
The safest way to approach sending around binary data is to have the server and client serialize the bytes directly to ensure that there is no additional padding and to make the byte ordering of multibyte integers explicit. For example:
/*
* Send a header over a socket.
*
* The header is sent as a stream of packed bytes with
* integers in "network" byte order. For example, a
* header value of:
* Id: 0x11223344
* PktType: 0xff
* SeqNo: 0x55667788
* PktLength: 0x99aabbcc
*
* is sent as the following byte stream:
* 11 22 33 44 ff 55 66 77 88 99 aa bb cc
*/
void
send_header(int sd, struct hdr const* header)
{ /* NO ERROR HANDLING */
uint32_t num = htonl((uint32_t)header->Id);
send(sd, &num, sizeof(num), 0);
send(sd, &header->PktType, sizeof(header->PktType), 0);
num = htonl((uint32_t)header->SeqNo);
send(sd, &num, sizeof(num), 0);
num = htonl((uint32_t)header->PktLength);
send(sd, &num, sizeof(num), 0);
}
This will ensure that your client can safely decode it using the struct module:
buf = s.recv(13) # packed data is 13 bytes long
id_, pkt_type, seq_no, pkt_length = struct.unpack('>IBII', buf)
If you cannot modify the C code to fix the serialization indeterminacy, then you will have to read the data from the stream and figure out where the C compiler is inserting padding and manually build struct format strings to match using the padding byte format character to ignore padding values.
I usually write a decoder class in Python that reads a complete value from the socket. In your case it would look something like:
class PacketReader(object):
def __init__(self, sd):
self._socket = sd
def read_packet(self):
id_, pkt_type, seq_no, pkt_length = self._read_header()
data_bufs = [self._read_data_buf() for _ in range(0, 100)]
message, info = self._read_trl()
return {'id': id_, 'pkt_type': pkt_type, 'seq_no': seq_no,
'data_bufs': data_bufs, 'message': message,
'info': info}
def _read_header(self):
"""
Read and unpack a ``hdr`` structure.
:returns: a :class:`tuple` of the header data values
in order - *Id*, *PktType*, *SeqNo*, and *PktLength*
The header is assumed to be packed as 13 bytes with
integers in network byte order.
"""
buf = self._socket.read(13)
# > Multibyte values in network order
# I Id as 32-bit unsigned integer value
# B PktType as 8-bit unsigned integer value
# I SeqNo as 32-bit unsigned integer value
# I PktLength as 32-bit unsigned integer value
return struct.unpack('>IBII', buf)
def _read_data_buf(self):
"""
Read and unpack a single ``data`` structure.
:returns: a :class:`tuple` of data values in order -
*value*, *result*, *count*, and *value*
The data structure is assumed to be packed as 28 bytes
with integers in network byte order and doubles encoded
as IEEE 754 binary64 in network byte order.
"""
buf = self._socket.read(28) # assumes double is binary64
# > Multibyte values in network order
# 10s value bytes
# d result encoded as IEEE 754 binary64 value
# q count encoded as a 64-bit signed integer
# H valueid as a 16-bit unsigned integer value
return struct.unpack('>10sdqH', buf)
def _read_trl(self):
"""
Read and unpack a ``trl`` structure.
:returns: a :class:`tuple` of trl values in order -
*Message* as byte string, *info*
The structure is assumed to be packed as 24 bytes with
integers in network byte order.
"""
buf = self.socket.read(24)
# > Multibyte values in network order
# 16s message bytes
# q info encoded as a 64-bit signed value
return struct.unpack('>16sq', buf)
Mind you that this is untested and probably contains syntax errors but that is how I would approach the problem.

The struct library has all you need to do this.

Different compression sizes using Python and C++

I am using zlib for compressing some data, and I am running into a weird issue: data compressed with python is smaller than the one using C++. I have 130MB of simulation data I want to save compressed(too many files for all the necessary data).
Using C++, I have something of the sort:
//calculate inputData (double * 256 * 256 * 256)
unsigned int length = inputLength;
unsigned int outLength = length + length/1000 + 12 + 1;
printf("Length: %d %d\n", length, outLength);
Byte *outData = new Byte[outLength];
z_stream strm;
strm.zalloc = Z_NULL;
strm.zfree = Z_NULL;
strm.next_in = (Byte *) inputData;
strm.avail_in = length;
deflateInit(&strm, -1);
do {
strm.next_out = outData;
strm.avail_out = outLength;
deflate(&strm, Z_FINISH);
unsigned int have = outLength - strm.avail_out;
fwrite(outData, 1, have, output);
} while(strm.avail_out == 0);
deflateEnd(&strm);
delete[] outData;
The result using C++ is around 120MB, which is hardly what I expect, as the original is close to 130MB.
In python:
from array import array
import zlib
// read data from uncompressed file
arrD = array.array('d', data)
file.write(zlib.compress(arrD))
The result using Python is around 50MB using the same input data, less than the half. The C++ code is mostly based on the one using in python's implementation, which makes this issue even weirder.
For C++, I am using Visual Studio 2010 Profession with Zlib 1.2.8 compiled by myself.
For Python, I am using the official python 3.4.2.

How to send an array by i2c?

I've been trying for several days now to send a python array by i2c.
data = [x,x,x,x] # `x` is a number from 0 to 127.
bus.write_i2c_block_data(i2c_address, 0, data)
bus.write_i2c_block_data(addr, cmd, array)
In the function above: addr - arduino i2c adress; cmd - Not sure what this is; array - python array of int numbers. Can this be done? What is actually the cmd?
FWIW, Arduino code, where I receive the array and put it on the byteArray:
void receiveData(int numByte){
int i = 0;
while(wire.available()){
if(i < 4){
byteArray[i] = wire.read();
i++;
}
}
}
It gives me this error: bus.write_i2c_block_data(i2c_adress, 0, decodedArray) IOError: [Errno 5] Input/output error. I tried with this: bus.write_byte(i2c_address, value), and it worked, but only for a value that goes from 0 to 127, but, I need to pass not only a value, but a full array.

The function is the good one.
But you should take care of some points:
bus.write_i2c_block_data(addr, cmd, []) send the value of cmd AND the values in the list on the I2C bus.
So
bus.write_i2c_block_data(0x20, 42, [12, 23, 34, 45])
doesn't send 4 bytes but 5 bytes to the device.
I doesn't know how the wire library work on arduino, but the device only read 4 bytes, it doesn't send the ACK for the last bytes and the sender detect an output error.
Two convention exist for I2C device address. The I2C bus have 7 bits for device address and a bit to indicate a read or a write. An other (wrong) convention is to write the address in 8 bits, and say that you have an address for read, and an other for write. The smbus package use the correct convention (7 bits).
Exemple: 0x23 in 7 bits convention, become 0x46 for writing, and 0x47 for reading.

It took me a while,but i got it working.
On the arduino side:
int count = 0;
...
...
void receiveData(int numByte){
while(Wire.available()){
if(count < 4){
byteArray[count] = Wire.read();
count++;
}
else{
count = 0;
byteArray[count] = Wire.read();
}
}
}
On the raspberry side:
def writeData(arrayValue):
for i in arrayValue:
bus.write_byte(i2c_address, i)
And that's it.

cmd is offset on which you want to write a data.
so its like
bus.write_byte(i2c_address, offset, byte)
but if you want to write array of bytes then you need to write block data so your code will look like this
bus.write_i2c_block_data(i2c_address, offset, [array_of_bytes])

How to print values of a string full of "chaos question marks"

I'm debugging with python audio, having a hard time with the audio coding.
Here I have a string full of audio data, say, [10, 20, 100].
However the data is stored in a string variable,
data = "����������������"
I want to inspect the values of this string.
Below is the things I tried
Print as int
I tried to use print "%i" % data[0]
ended up with
Traceback (most recent call last):
File "wire.py", line 28, in <module>
print "%i" % data[i]
TypeError: %d format: a number is required, not str
Convert to int
int(data[0]) ended up with
Traceback (most recent call last):
File "wire.py", line 27, in <module>
print int(data[0])
ValueError: invalid literal for int() with base 10: '\xd1'
Any idea on this? I want to print the string in a numerical way since the string is actually an array of sound wave.
EDIT
All your answers turned out to be really helpful.
The string is actually generated from the microphone so I believe it to be raw wave form, or vibration data. Further this should be referred to the audio API document, PortAudio.
After looking into PortAudio, I find this helpful example.
** This routine will be called by the PortAudio engine when audio is needed.
** It may called at interrupt level on some machines so don't do anything
** that could mess up the system like calling malloc() or free().
static int patestCallback( const void *inputBuffer, void *outputBuffer,
unsigned long framesPerBuffer,
const PaStreamCallbackTimeInfo* timeInfo,
PaStreamCallbackFlags statusFlags,
void *userData )
{
paTestData *data = (paTestData*)userData;
float *out = (float*)outputBuffer;
unsigned long i;
(void) timeInfo; /* Prevent unused variable warnings. */
(void) statusFlags;
(void) inputBuffer;
for( i=0; i<framesPerBuffer; i++ )
{
*out++ = data->sine[data->left_phase]; /* left */
*out++ = data->sine[data->right_phase]; /* right */
data->left_phase += 1;
if( data->left_phase >= TABLE_SIZE ) data->left_phase -= TABLE_SIZE;
data->right_phase += 3; /* higher pitch so we can distinguish left and right. */
if( data->right_phase >= TABLE_SIZE ) data->right_phase -= TABLE_SIZE;
}
return paContinue;
}
This indicates that there is some way that I can interpret the data as float

To be clear, your audio data is a byte string. The byte string is a representation of the bytes stored in the audio file. You are not going to simply be able to convert those bytes into meaningful values without knowing what is in the binary first.
As an example, the mp3 specification says that each mp3 contains header frames (described here: http://en.wikipedia.org/wiki/MP3). To read the header you would either need to use something like bitstring, or if you feel comfortable doing the bitwise manipulation yourself then you would just need to unpack an integer (4 bytes) and do some math to figure out the values of the 32 individual bits.
It really all depends on what you are trying to read, and how the data was generated. If you have whole byte numbers, then struct will serve you well.

If you're ok with the \xd1 mentioned above:
for item in data: print repr(item),
Note that for x in data will iterate over each value in the list rather than its location. If you want the location you can use for i in range(len(data)): ...
If you want them in numerical form, replace repr(item) with ord(item).

It is better if you use the new {}.format method:
data = "����������������"
print '{0}'.format(data[3])

You could use ord to map each byte to its numeric value between 0-255:
print map(ord, data)
Or, for Python 3 compatibility, do:
print([ord(c) for c in data])
It will also work with Unicode glyphs, which might not be what you want, so make sure you have a bytearray or an actual str or bytes object in Python 2.

Accessing bitfields while reading/writing binary data structures

I'm writing a parser for a binary format. This binary format involves different tables which are again in binary format containing varying field sizes usually (somewhere between 50 - 100 of them).
Most of these structures will have bitfields and will look something like these when represented in C:
struct myHeader
{
unsigned char fieldA : 3
unsigned char fieldB : 2;
unsigned char fieldC : 3;
unsigned short fieldD : 14;
unsigned char fieldE : 4
}
I came across the struct module but realized that its lowest resolution was a byte and not a bit, otherwise the module pretty much was the right fit for this work.
I know bitfields are supported using ctypes, but I'm not sure how to interface ctypes structs containing bitfields here.
My other option is to manipulate the bits myself and feed it into bytes and use it with the struct module - but since I have close to 50-100 different types of such structures, writing the code for that becomes more error-prone. I'm also worried about efficiency since this tool might be used to parse large gigabytes of binary data.
Thanks.

Using bitstring (which you mention you're looking at) it should be easy enough to implement. First to create some data to decode:
>>> myheader = "3, 2, 3, 14, 4"
>>> a = bitstring.pack(myheader, 1, 0, 5, 1000, 2)
>>> a.bin
'00100101000011111010000010'
>>> a.tobytes()
'%\x0f\xa0\x80'
And then decoding it again is just
>>> a.readlist(myheader)
[1, 0, 5, 1000, 2]
Your main concern might well be the speed. The library is well optimised Python, but that's not nearly as fast as a C library would be.

I haven't rigorously tested this, but it seems to work with unsigned types (edit: it works with signed byte/short types, too).
Edit 2: This is really hit or miss. It depends on the way the library's compiler packed the bits into the struct, which is not standardized. For example, with gcc 4.5.3 it works as long as I don't use the attribute to pack the struct, i.e. __attribute__ ((__packed__)) (so instead of 6 bytes it gets packed into 4 bytes, which you can check with __alignof__ and sizeof). I can make it almost work by adding _pack_ = True to the ctypes Structure definition, but it fails for fieldE. gcc notes: "Offset of packed bit-field ‘fieldE’ has changed in GCC 4.4".
import ctypes
class MyHeader(ctypes.Structure):
_fields_ = [
('fieldA', ctypes.c_ubyte, 3),
('fieldB', ctypes.c_ubyte, 2),
('fieldC', ctypes.c_ubyte, 3),
('fieldD', ctypes.c_ushort, 14),
('fieldE', ctypes.c_ubyte, 4),
]
lib = ctypes.cdll.LoadLibrary('C/bitfield.dll')
hdr = MyHeader()
lib.set_header(ctypes.byref(hdr))
for x in hdr._fields_:
print("%s: %d" % (x[0], getattr(hdr, x[0])))
Output:
fieldA: 3
fieldB: 1
fieldC: 5
fieldD: 12345
fieldE: 9
C:
typedef struct _MyHeader {
unsigned char fieldA : 3;
unsigned char fieldB : 2;
unsigned char fieldC : 3;
unsigned short fieldD : 14;
unsigned char fieldE : 4;
} MyHeader, *pMyHeader;
int set_header(pMyHeader hdr) {
hdr->fieldA = 3;
hdr->fieldB = 1;
hdr->fieldC = 5;
hdr->fieldD = 12345;
hdr->fieldE = 9;
return(0);
}

Develop Reference

Python is a programming language that lets you work quickly and integrate systems more effectively.

compute CRC of struct in Python - python

Related

How to unpack a C-style structure inside another structure?

Different compression sizes using Python and C++

How to send an array by i2c?

How to print values of a string full of "chaos question marks"

Accessing bitfields while reading/writing binary data structures

Categories

Resources