python read 16 bytes long double from binary file

I find Python's struct.unpack() quite handy for reading binary data generated by other programs.
Question: how do I read a 16-byte long double out of a binary file?
The following C code writes 1.01 three times to a binary file, as a 4-byte float, an 8-byte double and a 16-byte long double respectively.
FILE* file = fopen("test_bin.bin","wb");
float f = 1.01;
double d = 1.01;
long double ld = 1.01;
fwrite(&f, sizeof(f),1,file);
fwrite(&d, sizeof(d),1,file);
fwrite(&ld, sizeof(ld),1,file);
fclose(file);
In Python, I can read the float and double with no problem.
import struct
file = open('test_bin.bin', 'rb')
struct.unpack('<fd', file.read(12)) # (1.0099999904632568, 1.01) as expected.
I cannot find a 16-byte long double among the format characters documented for the struct module.

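The struct module has no format character for C long double, so you won't find support for it in the standard library. You will need to use NumPy (specifically numpy.frombuffer()) to convert the raw bytes. Note that on x86 Linux a 16-byte long double is usually the 80-bit x87 extended format padded to 16 bytes rather than a true IEEE binary128; numpy.float128 is an alias of numpy.longdouble there and matches that layout, but the alias is not available on every platform.
f128 = numpy.frombuffer(file.read(16), dtype=numpy.float128)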
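If NumPy is not available, the value can also be decoded by hand. The sketch below assumes the file was written on x86 or x86-64 by a compiler (such as GCC on Linux) whose 16-byte long double is the little-endian 80-bit x87 extended format followed by 6 padding bytes. The helper name decode_x87_long_double is made up for illustration. The result is rounded to a Python float (a 64-bit double), so the extra precision is lost, and values outside the double range will overflow or underflow.

```python
import math
import struct


def decode_x87_long_double(raw: bytes) -> float:
    """Decode a little-endian x87 80-bit extended value stored in 16 bytes."""
    # Bytes 0-7: 64-bit significand with an explicit integer bit.
    # Bytes 8-9: sign bit plus 15-bit biased exponent. Bytes 10-15: padding.
    significand, sign_exp = struct.unpack("<QH", raw[:10])
    sign = -1.0 if sign_exp & 0x8000 else 1.0
    exponent = sign_exp & 0x7FFF
    if exponent == 0x7FFF:
        # Infinity when the fraction bits are zero, NaN otherwise.
        fraction = significand & 0x7FFFFFFFFFFFFFFF
        return sign * math.inf if fraction == 0 else math.nan
    if exponent == 0:
        exponent = 1  # subnormals share the smallest normal exponent
    # value = significand * 2**(exponent - 16383 - 63), where 16383 is the bias
    return sign * math.ldexp(significand, exponent - 16383 - 63)


# Hand-built encoding of 1.0: significand 0x8000000000000000, exponent 0x3FFF.
one = bytes.fromhex("0000000000000080ff3f") + bytes(6)
print(decode_x87_long_double(one))  # 1.0
```

With the file from the question, the call would be decode_x87_long_double(file.read(16)) after the first 12 bytes have been consumed.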

Related

file.write ints/floats in python for binary files without using struct

I have a file open in binary mode and I want to output ints and doubles (np.float64) to it. Pretty much every binary file tutorial I've seen says to use the struct module:
fout.write(struct.pack('i', np.int32(pca.components_.shape[0])))
If I don't use struct.pack, the operation is still legal and I still seem to be able to read out the bytes if I open the file in a C program later as the correct int values.
fout.write(np.int32(pca.components_.shape[0]))
Is struct.pack absolutely necessary? What happens if you write a number value to a binary file without packing? Thanks.
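As a rough illustration of what struct.pack contributes: file.write() accepts any bytes-like object, which appears to be why the NumPy scalar in the question is accepted, so the real question is which bytes end up in the file. The sketch below uses only the standard library and an arbitrary sample value to compare the native layout with an explicitly specified one.

```python
import struct
import sys

value = 5  # arbitrary sample value

native = struct.pack("i", value)     # native size and byte order of a C int
explicit = struct.pack("<i", value)  # always 4 bytes, little-endian
manual = value.to_bytes(4, "little", signed=True)

print(explicit == manual)                 # True
print(native == explicit, sys.byteorder)  # depends on the machine
```

Writing raw native bytes works as long as the reader runs on a machine with the same integer size and byte order. An explicit prefix such as < makes the file layout independent of the machine that wrote it.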

How can I read unsigned shorts using Python?

Main question
I would like to understand how to read a C++ unsigned short in Python. I was trying to use np.fromfile('file.bin',np.uint16) but it seems it doesn't work. Refer to this as the main question.
Case study:
To give some more context:
I have an array of unsigned shorts exported as a binary file using C++ and the QDataStream class of Qt.
Header:
QVector<unsigned short> rawData;
main.cpp
QFile rawFile(QString("file.bin"));
rawFile.open(QIODevice::Truncate | QIODevice::ReadWrite);
QDataStream rawOut(&rawFile);
rawOut.writeRawData((char *) &rawData, 2*rawData.size());
rawFile.close();
I'm trying to read it using Python and numpy but I can't find how to read unsigned shorts. From literature unsigned shorts should be 2 bytes so I have tried to read it using:
import numpy as np
np.fromfile('file.bin', np.uint16)
However, if I compare a single unsigned_value read with Python against the same value printed as a string in C++:
QString single_value = QString::number(unsigned_value);
they are different.

I'd experiment with endianness. Try '<u2' or '>u2'.
https://docs.scipy.org/doc/numpy/reference/arrays.dtypes.html
'>' reverses the order of the 2 bytes
In [674]: np.array(123, np.dtype('>u2')).tobytes()
Out[674]: b'\x00{'
In [675]: np.array(123, np.dtype('<u2')).tobytes()
Out[675]: b'{\x00'
In [678]: np.array(123, np.uint16).tobytes()
Out[678]: b'{\x00'
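The same check can be made in the other direction with the standard struct module: take two bytes from the file and interpret them with both byte orders. This is only a sketch; the two sample bytes are the ones from the session above, not data from the asker's file.

```python
import struct

raw = b"\x00{"  # the two bytes 0x00 0x7B

(big,) = struct.unpack(">H", raw)     # big-endian unsigned short
(little,) = struct.unpack("<H", raw)  # little-endian unsigned short
print(big, little)  # 123 31488
```

If the values read in Python are off by this kind of factor compared with the C++ side, a byte-order mismatch is the likely cause.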

rawOut.writeRawData((char *) &rawData, 2*rawData.size()); is writing loads of rubbish in your file. QVector is not directly castable to an array of short as you are trying to do.
Use the code below to write your data
for (const auto& singleVal : rawData)
    rawOut << singleVal;

Take a look at struct module
import struct
with open('file.bin', 'rb') as f:
    unsigned_shorts = struct.iter_unpack('H', f.read())
    print(list(unsigned_shorts))
Example output:
>>>[(1,), (2,), (3,)]
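A slightly extended sketch of the same idea, with an explicit byte order and the one-element tuples flattened into plain integers. The helper name read_u16 and its byte_order parameter are made up for illustration; an in-memory stream stands in for the real file.

```python
import io
import struct


def read_u16(stream, byte_order="<"):
    """Read all unsigned 16-bit values from a binary stream."""
    data = stream.read()
    # iter_unpack needs a whole number of items, so drop a trailing odd byte.
    data = data[: len(data) - len(data) % 2]
    return [value for (value,) in struct.iter_unpack(byte_order + "H", data)]


sample = io.BytesIO(struct.pack("<3H", 1, 2, 3))
print(read_u16(sample))  # [1, 2, 3]
```

For a real file, pass open('file.bin', 'rb') instead of the BytesIO object and switch byte_order to ">" if the writer produced big-endian data.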

.bin to .cfile flowgraph for GRC 3.7.2.1

I have tried opening the flow graph for converting a .bin file (data captured via RTL-SDR) to a .cfile for analysis. I downloaded the file from the link http://sdr.osmocom.org/trac/attachment/wiki/rtl-sd...
However, I am unable to get it working on GRC 3.7.2.1. I get a long list of error messages (given below) when I just try to open the file.
I am using Ubuntu v14.04.1.
I would be really grateful for any help to solve this or any alternate ways to convert the .bin file to .cfile (python source code?)
=======================================================
<<< Welcome to GNU Radio Companion 3.7.2.1 >>>
Showing: ""
Loading: "/home/zorro/Downloads/rtl2832-cfile.grc"
Error:
/home/zorro/Downloads/rtl2832-cfile.grc:2:0:ERROR:VALID:DTD_UNKNOWN_ELEM:
No declaration for element html
/home/zorro/Downloads/rtl2832-cfile.grc:2:0:ERROR:VALID:DTD_UNKNOWN_ATTRIBUTE:
No declaration for attribute xmlns of element html
/home/zorro/Downloads/rtl2832-cfile.grc:9:0:ERROR:VALID:DTD_UNKNOWN_ELEM:
No declaration for element head
/home/zorro/Downloads/rtl2832-cfile.grc:10:0:ERROR:VALID:DTD_UNKNOWN_ELEM:

The cause of the errors you are seeing is that your link is bad — it is truncated and points to a HTML page, not a GRC file. The errors come from GRC trying to interpret the HTML as GRC XML instead. The correct link to the download is: http://sdr.osmocom.org/trac/raw-attachment/wiki/rtl-sdr/rtl2832-cfile.grc
However, note that that flowgraph was built for GNU Radio 3.6 and will not work in GNU Radio 3.7 due to many blocks being internally renamed. I would recommend rebuilding it from scratch using the provided picture.
Since there are no variables in this flowgraph, you can simply drag out the blocks and set the parameters as shown. Doing so will be a good exercise for familiarizing yourself with the GNU Radio Companion user interface, too.

If you look at the flowgraph posted by @Kevin Reid above, you can see that it takes the input data, subtracts 127, multiplies by 0.008, and converts pairs to complex.
What is missing is the exact types. It is in the GNU Radio FAQ. From there we learn that the uchar is an unsigned char (8 bits) and the complex data type is a 'complex64' in python.
If done in numpy, as an in-memory operation, it looks like this:
import sys
import numpy as np

scriptName, inFileName, outFileName = sys.argv
ubytes = np.fromfile(inFileName, dtype='uint8', count=-1)
# we need an even number of bytes
# discard last byte if the count is odd
if len(ubytes) % 2 == 1:
    ubytes = ubytes[0:-1]
print("read " + str(len(ubytes)) + " bytes from " + inFileName)
# center the unsigned byte data on zero and scale it to roughly -1.0 to 1.0
ufloats = 0.008 * (ubytes.astype(float) - 127.0)
ufloats.shape = (len(ubytes) // 2, 2)
# turn the pairs of floats into complex numbers, needed by gqrx and other gnuradio software
IQ_data = (ufloats[:, 0] + 1j * ufloats[:, 1]).astype('complex64')
IQ_data.tofile(outFileName)
I've tested this translating from the rtl_sdr file format to the gqrx IQ sample input file format and it seems to work fine within what can fit in memory.
But beware this script only works with data where both input and output files can fit in memory. For input files larger than about 1/5 of system memory, which sdr recording can easily exceed, it would be better to read the bytes one at a time.
We can avoid memory-hogging by reading the data 1 byte at a time with a loop, as with the following program in gnu C. This isn't the cleanest code, I should probably add fclose and check ferror, but it works as-is for hobby purposes.
#include <complex.h>
#include <stdio.h>
#include <stdlib.h>
// rtlsdr-to-gqrx Copyright 2014 Paul Brewer KI6CQ
// License: CC BY-SA 3.0 or GNU GPL 3.0
// IQ file converter
// from rtl_sdr recording format -- interleaved unsigned char
// to gqrx/gnuradio .cfile playback format -- complex64
int main(int argc, char *argv[])
{
    int byte1, byte2; // int -- not unsigned char -- see fgetc man page
    float _Complex fc;
    const size_t fc_size = sizeof(fc);
    FILE *infile, *outfile;
    const float scale = 1.0/128.0;
    if (argc < 3) {
        printf("usage: rtlsdr-to-gqrx infile outfile\n");
        exit(1);
    }
    // read the file names only after argc has been checked
    const char *infilename = argv[1];
    const char *outfilename = argv[2];
    // printf("in= %s out= %s \n", infilename, outfilename);
    infile = fopen(infilename, "rb");
    outfile = fopen(outfilename, "wb");
    if ((infile == NULL) || (outfile == NULL)) {
        printf("Error opening files\n");
        exit(1);
    }
    while ((byte1 = fgetc(infile)) != EOF) {
        if ((byte2 = fgetc(infile)) == EOF) {
            exit(0);
        }
        fc = scale*(byte1-127) + I*scale*(byte2-127);
        fwrite(&fc, fc_size, 1, outfile);
    }
    return 0;
}
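The same streaming idea can be sketched in Python without NumPy by converting the input in fixed-size chunks instead of one byte at a time. This is a minimal sketch, not a tested replacement for the programs above: the function name rtl_to_cfile and its parameters are invented here, the default scale of 0.008 follows the NumPy script (the C program uses 1/128), and the output is assumed to be little-endian float32 pairs, which is what complex64 looks like on a little-endian machine.

```python
import io
import struct


def rtl_to_cfile(src, dst, chunk_size=1 << 16, scale=0.008, offset=127.0):
    """Stream interleaved unsigned I/Q bytes from src into float32 pairs in dst."""
    leftover = b""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break  # a dangling odd byte left in `leftover` is discarded
        data = leftover + chunk
        usable = len(data) - len(data) % 2  # only convert complete I/Q pairs
        leftover = data[usable:]
        samples = [scale * (b - offset) for b in data[:usable]]
        # complex64 is stored as two consecutive float32 values (I, then Q)
        dst.write(struct.pack("<%df" % usable, *samples))


# Tiny in-memory demonstration with an odd number of input bytes.
src = io.BytesIO(bytes([127, 255, 0, 127, 9]))
dst = io.BytesIO()
rtl_to_cfile(src, dst, chunk_size=3)
print(struct.unpack("<4f", dst.getvalue()))
```

For real recordings, pass two files opened in 'rb' and 'wb' mode; memory use is then bounded by chunk_size rather than by the size of the recording.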

python : convert string to c_ubyte_Array_8

I have a c++ application which writes blocks of unsigned char data. So I would be writing unsigned char data[8].
Now, I am using python (read ctypes functionality in python), to read and buffer it in my tool for further processing.
Problem
When I read the data from the file and break it down into chunks of 8, all the resulting data is in string format. I have the following structure:
class MyData(Union):
    _fields_ = [("data", 8 * c_ubyte), ("overlap", SelfStructure)]
Now, I am trying to pass the data as follows
dataObj = MyData(str[0:8])
It throws an error: expected c_ubyte_Array_8 instance, got str. I think I need to convert the string to an array of 8 c_ubyte values. I tried bytearray but did not succeed. Please let me know how to do this.

Try this:
(ctypes.c_ubyte * 8)(*[ctypes.c_ubyte(ord(c)) for c in str[:8]])
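In Python 3, where data read from a binary file is a bytes object, a shorter route is from_buffer_copy(), which ctypes types provide for building an instance from raw bytes. The sketch below is illustrative only: the Fields structure is a hypothetical stand-in for SelfStructure, whose definition is not shown in the question.

```python
import ctypes


class Fields(ctypes.Structure):  # hypothetical stand-in for SelfStructure
    _fields_ = [("low", ctypes.c_uint32), ("high", ctypes.c_uint32)]


class MyData(ctypes.Union):
    _fields_ = [("data", ctypes.c_ubyte * 8), ("overlap", Fields)]


chunk = bytes(range(1, 9))  # stands in for 8 bytes read from the file

# Build the whole union directly from the raw bytes.
obj = MyData.from_buffer_copy(chunk)
print(list(obj.data))  # [1, 2, 3, 4, 5, 6, 7, 8]

# Or build just the array; iterating over bytes yields integers in Python 3.
arr = (ctypes.c_ubyte * 8)(*chunk)
print(bytes(arr) == chunk)  # True
```

from_buffer_copy() copies the data, so the resulting object does not depend on the lifetime of the buffer that was read from the file.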

Read pcap header length field with python

I have captured some packets using the pcap library in C. Now I am using a Python program to read the saved packet file, but I have a problem. For each packet, the file first has a pkthdr (provided by the library) and then the actual packet.
The format of pkthdr is:
struct pcap_pkthdr {
struct timeval ts; /* time stamp 32bit */ 32bit
bpf_u_int32 caplen; /* length of portion present */
bpf_u_int32 len; /* length this packet (off wire) */
};
Now I want to read the len field, so I skipped the timeval and caplen fields and printed the len field in binary form using Python. The binary value I got is:
01001010 00000000 00000000 00000000
How do I read it as a u_int32? I don't think it is the correct value (it is too large); the actual len field should be 74 bytes (checked in Wireshark). Please tell me what I am doing wrong.
Thanks in advance.

Or have a look at the pylibpcap module, the pypcap module, or the pcapy module, which let you just call pcap APIs with relative ease. That way you don't have to care about the details of pcap files, and your code will, with libpcap 1.1 or later, also be able to read at least some of the pcap-ng files that Wireshark can produce and that it will produce by default in the 1.8 release.
Writing your own code to read pcap files, rather than relying on libpcap/WinPcap to do so, is rarely worth doing. (Wireshark does so, as part of its library that reads a number of capture file formats and supports pcap-ng format in ways that the current pcap API can't, but the library in question also supports pcap-ng....)

Have a look at the struct module, which lets you unpack such binary data with relative ease, for example:
struct.unpack('LLL', yourbuffer)
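This will give you a tuple of the three values (L = unsigned long). Note that without a prefix struct uses native sizes, so L may be 8 bytes on 64-bit Unix-like systems; with a < or > prefix it is always 4 bytes. If the len value doesn't seem right, the byte order of the file may differ from your native one. In that case prefix the format string with either > (big-endian) or < (little-endian):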
struct.unpack('>LLL', yourbuffer)
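Applied to the bytes in the question, a little-endian read gives the expected 74. The sketch below assumes the classic pcap file layout, in which each record header consists of four 32-bit fields (timestamp seconds, timestamp microseconds, caplen, len), and that the file is little-endian. The sample header is made up, and the real byte order should be taken from the magic number at the start of the file.

```python
import struct

# ts_sec, ts_usec, caplen, len: four unsigned 32-bit little-endian fields
RECORD_HEADER = struct.Struct("<IIII")

# Made-up sample header with caplen = len = 74.
header = struct.pack("<IIII", 1400000000, 250000, 74, 74)

ts_sec, ts_usec, caplen, length = RECORD_HEADER.unpack(header)
print(header[12:16], length)  # the last four bytes are 4A 00 00 00, i.e. 74
```

The bit pattern 01001010 00000000 00000000 00000000 from the question is exactly these four bytes, so the value only looks too large when it is read as big-endian.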
