I have a C++ application which writes blocks of unsigned char data; that is, it writes unsigned char data[8].
Now, I am using Python (specifically the ctypes module) to read and buffer the data in my tool for further processing.
Problem
When I read the data from the file and break it down into chunks of 8, all the resulting data is in string format. I have the following structure:
class MyData(Union):
    _fields_ = [("data", c_ubyte * 8), ("overlap", SelfStructure)]
Now, I am trying to pass the data as follows
dataObj = MyData(str[0:8])
It throws an error: expected c_ubyte_Array_8 instance, got str. I think I need to convert the string into an array of 8 c_ubyte values. I tried with bytearray but did not succeed. Please let me know how to do this.
Try this:
(ctypes.c_ubyte * 8)(*[ctypes.c_ubyte(ord(c)) for c in str[:8]])
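If you already have the chunk as bytes (e.g. on Python 3), a from_buffer_copy sketch is another option; chunk below is just a placeholder for the 8-byte slice read from the file:
import ctypes

chunk = b"\x01\x02\x03\x04\x05\x06\x07\x08"  # placeholder for the 8-byte slice

# from_buffer_copy fills the array directly from the raw bytes; the length must match exactly.
arr = (ctypes.c_ubyte * 8).from_buffer_copy(chunk)
print(list(arr))  # [1, 2, 3, 4, 5, 6, 7, 8]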
Related
I'm using Python ctypes to call a function from a shared library.
The function is called with a char* buffer in which it writes its result. The return value of the function is the number of bytes written to the buffer.
Calling the function works fine; however, I'm struggling to access the individual bytes of the buffer.
I create the buffer and call the function like this:
buf = (c_void_p * RECBUFFERSIZE)()
n = functionInLibrary(buf)
Now how to read the individual bytes stored in buf?
I already tried using
cast(buf, c_char_p).value, which yields a bytes object with the contents of buf, BUT it is terminated by the first null byte in buf.
And this is exactly what I do not want. I need to read the first n bytes of buf.
Never mind, I found it out myself:
cast(buf, POINTER(c_char))[0:n]
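For anyone landing here later, a small self-contained sketch of the same idea; buf and n are stand-ins for the library-filled buffer and the byte count it returned:
import ctypes

# Stand-in for the buffer the library writes into: 16 bytes, of which the
# function is pretended to have written n = 11, including an embedded null byte.
buf = ctypes.create_string_buffer(16)
buf.value = b"hello\x00world"
n = 11

data = ctypes.cast(buf, ctypes.POINTER(ctypes.c_char))[0:n]
print(data)  # b'hello\x00world' - the slice is not cut off at the embedded null byte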
Main question
I would like to understand how to read a C++ unsigned short in Python. I was trying to use np.fromfile('file.bin',np.uint16) but it seems it doesn't work. Refer to this as the main question.
Case study:
To give some more context:
I have an array of unsigned shorts exported as a binary file using C++ and Qt's QDataStream.
Header:
QVector<unsigned short> rawData;
main.cpp
QFile rawFile(QString("file.bin"));
rawFile.open(QIODevice::Truncate | QIODevice::ReadWrite);
QDataStream rawOut(&rawFile);
rawOut.writeRawData((char *) &rawData, 2*rawData.size());
rawFile.close();
I'm trying to read it using Python and numpy, but I can't find out how to read unsigned shorts. From the literature, an unsigned short should be 2 bytes, so I have tried to read the file using:
import numpy as np
np.fromfile('file.bin', np.uint16)
However, if I compare a single unsigned_value read with Python against the same value printed as a string in C++ using:
QString single_value = QString::number(unsigned_value);
the two values are different.
I'd experiment with endianness. Try '<u2' or '>u2'.
https://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html
'>' marks big-endian and '<' little-endian; they store the 2 bytes in opposite order:
In [674]: np.array(123, np.dtype('>u2')).tostring()
Out[674]: b'\x00{'
In [675]: np.array(123, np.dtype('<u2')).tostring()
Out[675]: b'{\x00'
In [678]: np.array(123, np.uint16).tostring()
Out[678]: b'{\x00'
rawOut.writeRawData((char *) &rawData, 2*rawData.size()); is writing loads of rubbish into your file. A QVector is not directly castable to an array of unsigned short the way you are trying to do it.
Use the code below to write your data
for(const auto& singleVal : rawData)
rawOut << singleVal;
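One caveat worth adding (based on Qt's documented default rather than anything in the question): QDataStream serializes in big-endian byte order unless you call setByteOrder(), so once the values are written with operator<< the matching NumPy read would look something like this:
import numpy as np

# QDataStream writes big-endian by default, so read the 16-bit values back as '>u2'.
data = np.fromfile('file.bin', dtype='>u2')
print(data[:10])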
Take a look at the struct module:
import struct
with open('file.bin', 'rb') as f:
    unsigned_shorts = struct.iter_unpack('H', f.read())
    print(list(unsigned_shorts))
Example output:
[(1,), (2,), (3,)]
I find Python's struct.unpack() quite handy for reading binary data generated by other programs.
Question: how do I read a 16-byte long double out of a binary file?
The following C code writes 1.01 three times to a binary file, using 4-byte float, 8-byte double and 16-byte long double respectively.
FILE* file = fopen("test_bin.bin","wb");
float f = 1.01;
double d = 1.01;
long double ld = 1.01;
fwrite(&f, sizeof(f),1,file);
fwrite(&d, sizeof(d),1,file);
fwrite(&ld, sizeof(ld),1,file);
fclose(file);
In Python, I can read the float and double with no problem.
file=open('test_bin.bin','rb')
struct.unpack('<fd',file.read(12)) # (1.0099999904632568, 1.01) as expected.
I cannot find a format character for a 16-byte long double in the struct module's format character documentation.
Python does not support binary128s natively, hence you won't find support for them in the standard library. You will need to use NumPy (specifically numpy.frombuffer()) to convert from bytes to a binary128.
f128 = numpy.frombuffer(file.read(16), dtype=numpy.float128)
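For completeness, a minimal end-to-end sketch for the file written above, assuming an x86-64 Linux build where long double is the 80-bit extended format padded to 16 bytes (numpy.longdouble maps to that same 16-byte type there and is spelled numpy.float128 on such platforms):
import struct
import numpy as np

with open('test_bin.bin', 'rb') as f:
    f_val, d_val = struct.unpack('<fd', f.read(12))             # 4-byte float + 8-byte double
    ld_val = np.frombuffer(f.read(16), dtype=np.longdouble)[0]  # 16-byte long double

print(f_val, d_val, ld_val)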
I have captured some packets using the pcap library in C. Now I am using a Python program to read the saved packet file, but I have a problem here. The file first contains a pkthdr (provided by the library) and then the actual packet.
The format of pcap_pkthdr is:
struct pcap_pkthdr {
struct timeval ts;      /* time stamp, 32 bit */
bpf_u_int32 caplen; /* length of portion present */
bpf_u_int32 len; /* length this packet (off wire) */
};
Now I want to read the len field, so I have skipped timeval and caplen and printed the len field in binary form using Python. The binary output I got is:
01001010 00000000 00000000 00000000
Now, how do I read this as a u_int32? I don't think it is the correct value (it looks too large); the actual len field value should be 74 bytes (checked in Wireshark). Please tell me what I am doing wrong.
Thanks in advance.
Alternatively, have a look at the pylibpcap module, the pypcap module, or the pcapy module, which let you just call pcap APIs with relative ease. That way you don't have to care about the details of pcap files, and your code will, with libpcap 1.1 or later, also be able to read at least some of the pcap-ng files that Wireshark can produce and that it will produce by default in the 1.8 release.
Writing your own code to read pcap files, rather than relying on libpcap/WinPcap to do so, is rarely worth doing. (Wireshark does so, as part of its library that reads a number of capture file formats and supports pcap-ng format in ways that the current pcap API can't, but the library in question also supports pcap-ng....)
Have a look at the struct module, which lets you unpack such binary data with relative ease, for example:
struct.unpack('LLL', yourbuffer)
This will give you a tuple of the three (L = unsigned long) values. If the len value doesn't seem right, the byte order of the file is different from your native one. In that case prefix the format string with either > (big-endian) or < (little-endian):
struct.unpack('>LLL', yourbuffer)
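For reference, a minimal sketch that reads the per-record header directly, assuming the classic little-endian pcap file layout (24-byte global header followed by a 16-byte record header of ts_sec, ts_usec, incl_len, orig_len); note that the bytes shown above, 01001010 00000000 00000000 00000000, already decode to 74 when interpreted as a little-endian 32-bit integer:
import struct

with open('capture.pcap', 'rb') as f:   # 'capture.pcap' is a placeholder file name
    f.read(24)                          # skip the 24-byte pcap global header
    ts_sec, ts_usec, incl_len, orig_len = struct.unpack('<IIII', f.read(16))

print(orig_len)                         # e.g. 74 for the packet described above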
I have created a buffer object in Python like so:
import io

f = io.open('some_file', 'rb')
byte_stream = buffer(f.read(4096))
I'm now passing byte_stream as a parameter to a C function, through SWIG. I have a typemap for converting the data which looks like this:
%typemap(in) unsigned char * byte_stream {
PyObject *buf = $input;
//some code to read the contents of buf
}
I have tried a few different things but can't get to the actual content/value of my byte_stream. How do I convert or access the content of my byte_stream using the C API? There are many different methods for converting C data to a buffer, but none that I can find for going the other way around.
I have tried looking at this object in gdb, but neither it nor the values it points to contain my data.
(I'm using buffers because I want to avoid the overhead of converting the data to a string when reading it from the file)
I'm using python 2.6 on Linux.
--
Thanks Pavel
I'm using buffers because I want to avoid the overhead of converting the data to a string when reading it from the file
You are not avoiding anything. The string is already built by the read() method. Calling buffer() just builds an additional buffer object pointing to that string.
As for getting at the memory pointed to by the buffer object, try PyObject_AsReadBuffer(). See also http://docs.python.org/c-api/objbuffer.html.
As soon as you use the read method on your file object, the data will be converted to a str object; calling buffer() does not convert it into a stream of any kind. If you want to avoid the overhead of creating the string object, you could simply pass the file object to your C code and then use it via its C API.