Python Struct.Unpack in C++ (Bitshift?) - python

data = struct.unpack('!10H', buf[:20])
Now assuming C++ where buf is a std::string.
Could I just write:
uint8_t f1 = (buf[0] << 8) | buf[1];
uint8_t f2 = (buf[2] << 8) | buf[3];
?
I have to translate a Python ROS IMU driver to ROS C++ and have to deal with a lot of struct unpacks. I read about different ways to translate the code: some said to declare a corresponding struct and use memcpy or reinterpret_cast, others said to use bit shifts. The lines above are what I have compiling so far. Would this do what I want it to do? Or how do I convert a std::string or uint8_t array to the corresponding values?
And what does the percent sign (%) before dB mean in this unpack? In the python manual this parameter is not listed under Format Parameters.
data = struct.unpack('!%dB'%length, buf[:-1])
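A minimal sketch of what the two Python calls do, using made-up buffer contents. The % here is ordinary string formatting applied to the format string before struct sees it, so '!%dB' % length becomes something like '!4B' (length unsigned bytes). The helper unpack_be_u16 is a hypothetical name introduced only to show that '!10H' matches the bit-shift approach; note that each result needs 16 bits, so a C++ port would need a 16-bit unsigned type rather than uint8_t, and each char should be converted to an unsigned byte before shifting.

```python
import struct

# '%' is plain string formatting: the count is substituted into the format.
length = 4
fmt = '!%dB' % length            # same as '!4B': four unsigned bytes
buf = bytes([1, 2, 3, 4, 0xFF])  # made-up data; the last byte is dropped by buf[:-1]
data = struct.unpack(fmt, buf[:-1])
print(fmt, data)


def unpack_be_u16(raw, count):
    """Hypothetical helper: big-endian unsigned 16-bit values via bit shifts."""
    return tuple((raw[2 * i] << 8) | raw[2 * i + 1] for i in range(count))


words = bytes(range(20))         # made-up 20-byte buffer
shifted = unpack_be_u16(words, 10)
unpacked = struct.unpack('!10H', words)
print(shifted == unpacked)
```

In Python, indexing a bytes object already yields unsigned integers from 0 to 255, which is why no explicit conversion is needed in this sketch.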

Related

Need a way to properly determine c data type sizes from within python

The company I work for has a proprietary file format that's old. REALLY old. I'm building a python library to read/write from the files (database type files) and have some questions.
The runtime that originally reads/writes to the files reads out the first XX bytes dynamically based on the sizeof the struct in question. For example:
struct fhdr {
union {
unsigned char ifflag[2]; /* file type and psw */
int fh_flag; /* alignment to old version */
} ufh;
unsigned reclen; /* record length in bytes */
DWORD fsize; /* byte size/reclen */
struct {
short typ;
short offset; /* in bytes */
} fmt[MAXITMS]; /* struct for formatted file (MAXITMS is 65?) */
};
My issue is that we have customers across a wide array of platforms. A long is 8 bytes on one customer's machine, but on another customer's old SCO 6 box (they're out there!) it might be 4 bytes.
Right now, I have this:
#include <stdio.h>
int main(void){
/* sizeof yields a size_t, so cast to int to match the %d conversion */
printf("char=%d\n", (int)sizeof(char));
printf("int=%d\n", (int)sizeof(int));
printf("short=%d\n", (int)sizeof(short));
printf("long=%d\n", (int)sizeof(long));
printf("float=%d\n", (int)sizeof(float));
printf("double=%d\n", (int)sizeof(double));
printf("long double=%d\n", (int)sizeof(long double));
printf("DWORD=%d\n", (int)sizeof(long)); /* DWORD is treated as a long here */
printf("unsigned=%d\n", (int)sizeof(unsigned));
return 0;
}
It just prints out the sizes in this format:
char=1
int=4
short=2
long=8
float=4
double=8
long double=16
DWORD=8
unsigned=4
and it's parsed when the class is instantiated. I can then go and build an array based on the platform's real variable sizes.
My question is: Is there a way, in python 3.x, for me to find an individual server's data type sizes or am I just better off parsing a simple c program as above?
It's not hard, it just feels tedious and repetitive and feels WRONG to go and create custom functions to retrieve each datatype.
header_fields = {
'ifflag': IMS.char() * 2,
'fh_flag': IMS.int(),
'reclen': IMS.unsigned(),
'fsize': IMS.DWORD(),
'typ': IMS.short(),
'offset': IMS.short()
}
(yes, I know that a char is always 1 byte. I just like uniformity.)
What I have works and it does its job rather well. I just want to learn how to improve on it, if possible.
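One possible way to get these sizes without a helper C program is the standard ctypes module, sketched below. The mapping from names to ctypes types is an assumption made for illustration (DWORD is left out because its definition is specific to the original runtime). The values reported are those of the C compiler that built the running Python interpreter, which may not match the compiler that built the legacy runtime on the same machine.

```python
import ctypes
import struct

# Assumed mapping from the C type names used above to ctypes types.
C_TYPES = {
    "char": ctypes.c_char,
    "short": ctypes.c_short,
    "int": ctypes.c_int,
    "unsigned": ctypes.c_uint,
    "long": ctypes.c_long,
    "float": ctypes.c_float,
    "double": ctypes.c_double,
    "long double": ctypes.c_longdouble,
}

# ctypes.sizeof reports the size in bytes on the current platform.
sizes = {name: ctypes.sizeof(ctype) for name, ctype in C_TYPES.items()}

for name, size in sizes.items():
    print(f"{name}={size}")

# struct.calcsize in native mode (no byte-order prefix) gives the same answer.
native_long = struct.calcsize("l")
print("long via struct:", native_long)
```

A dictionary like sizes could then feed the header_fields table directly instead of one custom function per type.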

Python ctypes: how to pass row outputs from a C function into a pandas DataFrame?

My question is how to parse tab-delimited output from a C function into a pandas DataFrame via ctypes:
I am writing a Python wrapper in Python3.x around a C library using ctypes. The C library currently does database queries. The C function I am accessing return_query() returns tab-delimited rows from a query, given the path to a file, an index, and a query-string:
int return_query(structname **output, const char *input_file,
const char *index, const char *query_string);
As you can see, I'm using output as the location to store all records from the query, where structname is the struct type for a row.
I also have a function which prints to STDOUT:
int print_query(const char *input_file,
const char *index, const char *query_string);
My goal is to access these functions via ctypes, and pass the tab-delimited row outputs into a pandas DataFrame.
My problem is this:
(1) I could try to parse the STDOUT of print_query(); however, these queries could produce large tab-delimited outputs. I worry this solution isn't efficient, as it might not scale to tens of thousands of rows. Other questions have roughly covered how to catch STDOUT from C functions in Python via ctypes:
Capturing print output from shared library called from python with ctypes module
(2) Could I access output somehow, and pass this to a pandas DataFrame? I'm currently not sure how this would work, e.g.
import ctypes
lib = ctypes.CDLL("../libshared.so") ### reference to shared library, *.so
lib.return_query.restype = ctypes.c_char
lib.return_query.argtypes = (???, ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p)
What should the first argument be, and how would I pass it into something which could be a pandas DataFrame?
(3) Perhaps it would be better to re-write the C functions which return tab-delimited rows into something more accessible via ctypes?
I was going to post this as a comment, but Stack Overflow won't let me.
1- Pandas objects are passed to C functions as PyObject *, so lib.return_query.argtypes = (ctypes.c_void_p, ctypes.c_char_p, ctypes.c_char_p, ctypes.c_char_p)
2- If you are returning tab-delimited rows, that sounds more like ctypes.c_char_p than lib.return_query.restype = ctypes.c_char. And your function int return_query should be char * return_query.
These are comments and observations, not a full answer.
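If the C function were changed to hand back a tab-delimited buffer as point 2 suggests, ctypes would deliver it to Python as a bytes object when the restype is ctypes.c_char_p. A minimal sketch of turning such bytes into records follows; the sample data, the column names, and the helper rows_from_tsv are all made up for illustration, and the real return_query is not called here.

```python
import csv
import io


def rows_from_tsv(raw, columns):
    """Hypothetical helper: decode tab-delimited bytes into a list of dicts."""
    text = raw.decode("utf-8")
    reader = csv.reader(io.StringIO(text, newline=""), delimiter="\t")
    # Skip empty lines; every field stays a string at this stage.
    return [dict(zip(columns, row)) for row in reader if row]


# Made-up stand-in for the bytes a c_char_p result would produce.
sample = b"1\talice\t3.5\n2\tbob\t4.0\n"
records = rows_from_tsv(sample, ["id", "name", "score"])
print(records)
```

A list of dicts in this shape is one of the inputs the pandas.DataFrame constructor accepts, so the DataFrame step would be a single call on records.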

I want to create something like a python dictionary in C++

I'm using a struct. Is there some way to iterate through all the items of type "number"?
struct number { int value; string name; };
In C++, std::map works like a Python dictionary, but there is a basic difference between the two languages: C++ is statically typed, while Python uses duck typing. A C++ map has fixed key and value types, so it cannot accept arbitrary (key, value) pairs the way a Python dictionary can.
A sample code to make it more clear -
map<int, char> mymap;
mymap[1] = 'a';
mymap[4] = 'b';
cout<<"my map is -"<<mymap[1]<<" "<<mymap[4]<<endl;
There are tricks for building a map that accepts any type of key; see http://www.cplusplus.com/forum/general/14982/
As I understand it, you want to look up a value and a name by number. You could use an array of structs, such as
number n[5]; where the elements are n[0], n[1], ..., n[4]
but C++ also offers the predefined containers map and set for this.
You can find lots of examples for map.
You can use std::map (or unordered_map)
// Key Value Types.
std::map<int, std::string> data {{1, "Test"}, {2, "Plop"}, {3, "Kill"}, {4, "Beep"}};
for(auto item: data) {
// Key Value
std::cout << item.first << " : " << item.second << "\n";
}
Compile and run:
> g++ -std=c++14 test.cpp
> ./a.out
1 : Test
2 : Plop
3 : Kill
4 : Beep
The difference between std::map and std::unordered_map is for std::map the items are ordered by the Key while in std::unordered_map the values are not ordered (thus they will be printed in a seemingly random order).
Internally they use very different structures but I am sure you are not interested in that level of detail.

Parse C Array Size with Python & LibClang

I am currently using Python to parse a C file using LibClang. I've encountered a problem while reading a C array whose size is defined by a #define macro.
With node.get_children I can read the following array without problems:
int myarray[20][30][10];
As soon as the array size is replaced with a variable, the array won't be read correctly. The following array code can't be read.
#define MAX 60;
int myarray[MAX][30][10];
Actually the parser stops at MAX and in the dump there is the error: invalid sloc.
How can I solve this?
Thanks
Run the code through a C preprocessor before trying to parse it. That will cause all preprocessor-symbols to be replaced by their values, i.e. your [MAX] will become [60].
Note that C code can also do this:
const int three[] = { 1, 2, 3 };
i.e. let the compiler deduce the length of the array from the number of initializer values given.
Or, from C99, even this:
const int hundred[] = { [99] = 4711 };
So a naive approach might still break, but I don't know anything about the capabilities of the parser you're using, of course.
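The semicolon in the #define directive may be causing the error: with #define MAX 60; the declaration expands to int myarray[60;][30][10];, which is not valid C.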
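To make the effect of the stray semicolon concrete, here is a crude textual imitation of macro substitution. The function expand_macro is a hypothetical helper written for this illustration; a real C preprocessor works on tokens rather than regular expressions, so treat this only as a sketch of what the parser ends up seeing.

```python
import re


def expand_macro(source_line, name, replacement):
    """Hypothetical helper: replace whole-word uses of a macro name."""
    pattern = r"\b" + re.escape(name) + r"\b"
    return re.sub(pattern, replacement, source_line)


declaration = "int myarray[MAX][30][10];"

# '#define MAX 60;' makes the replacement text '60;', semicolon included.
broken = expand_macro(declaration, "MAX", "60;")
# '#define MAX 60' substitutes just the number.
fixed = expand_macro(declaration, "MAX", "60")

print(broken)
print(fixed)
```

The first result has a semicolon inside the array bounds, which matches the point where the parser is reported to stop.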

Write Raw Numbers to Disk

It occurred to me that I have no idea how to write raw numerical values to disk.
How would I do this in Python or C++?!
I'm running some simulations and writing intermediate results to disk so that it doesn't start from scratch if it crashes.
Sadly these values chomp up gigabytes upon gigabytes of space on my hard drive.
Would writing the numerical values to disk as floats take up significantly less disk space or is there some other overhead I'm not considering?
The most versatile and powerful option is to use the HDF5 format, with the help of the Python interface. From the website:
It lets you store huge amounts of numerical data, and easily
manipulate that data from NumPy. For example, you can slice into
multi-terabyte datasets stored on disk, as if they were real NumPy
arrays. Thousands of datasets can be stored in a single file,
categorized and tagged however you want
It also has a C++ API.
The HDF5 format is widely used in the scientific computing community and is read and written by many software packages. Data in the HDF5 format can be manipulated rapidly with the parallel utility tools.
You can roll your own binary format and use that, but it's probably a bad idea.
If you're using Python to deal with numeric data, you're almost certainly using numpy. If you're not, you should look into it; it's great.
Once you've got your data in a numpy array, you can just save it with numpy.save.
The general method in Python is to use the struct module.
import struct
print(struct.pack("!d", 3.14159))
(You can choose what byte order to use—I use ! to indicate network byte order for portability—or use no indicator to use the native byte ordering. Actually, I'm not sure if IEEE 754 specifies a byte ordering, so I'm not sure what to recommend. Maybe using the default is best.)
Before you optimize, make sure you are at least doing something like this (storing your numeric type in its binary representation on disk). If you are at this point and the file sizes are still too large, you can consider different types of compressed formats.
#include <cstdint>
#include <iostream>
#include <fstream>
typedef int32_t my_numeric_type;
int main()
{
using namespace std;
{
ofstream output_file("numbers.dat", ios::binary);
if( !output_file )
{
cout << "Failed to open file for writing" << endl;
return 1;
}
for( my_numeric_type i = 0 ; i <= 1000; ++i )
output_file.write(reinterpret_cast<const char*>(&i), sizeof(i));
}
{
ifstream input_file("numbers.dat", ios::binary);
if( !input_file )
{
cout << "Failed to open file for reading" << endl;
return 1;
}
my_numeric_type i;
while( input_file.read(reinterpret_cast<char*>(&i), sizeof(i)) )
cout << i << endl;
}
return 0;
}
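The same idea can be sketched in Python with only the standard library, which also gives a rough feel for the size difference the question asks about. The values and file names below are made up, and the array module is used here as one option among several (struct or numpy would serve as well). Each double takes 8 bytes in binary form, with no per-value overhead.

```python
import array
import os
import tempfile

# Made-up sample data standing in for simulation results.
values = [i / 7.0 for i in range(1000)]

with tempfile.TemporaryDirectory() as tmp:
    text_path = os.path.join(tmp, "values.txt")
    bin_path = os.path.join(tmp, "values.bin")

    # Text form: one decimal representation per line.
    with open(text_path, "w") as f:
        for v in values:
            f.write(repr(v) + "\n")

    # Binary form: raw native-endian doubles, back to back.
    with open(bin_path, "wb") as f:
        array.array("d", values).tofile(f)

    text_size = os.path.getsize(text_path)
    bin_size = os.path.getsize(bin_path)

    # Read the binary file back to confirm nothing was lost.
    restored = array.array("d")
    with open(bin_path, "rb") as f:
        restored.fromfile(f, len(values))

print("text bytes:", text_size)
print("binary bytes:", bin_size)
print("round trip ok:", restored.tolist() == values)
```

The binary file written this way uses the machine's native byte order, so it is only directly readable on platforms with the same byte order and double format.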
