Sending data from python to c++ as parameters or file? - python

I have two programs. My main program is in Python and the other one is in C++, because I have to connect to a machine and the only way to use this machine is through a DLL/lib written in C++ that the manufacturer gave me.
In my Python program, I fetch thousands of points from a database, do some operations on them, and store those points in different arrays in a class.
Now I need to send those points to the C++ program and I'm wondering how. I could send them as parameters to the C++ program's main, but the command line becomes too long, and I think that even if it weren't, this would be really bad practice (if someone can explain a little why, that would be great!).
Right now I'm thinking of generating a file with all the coordinates in my Python program and passing it to the C++ program, but then I have to parse it in C++.
Is this a good way to do it, or would another solution be better?

Some ways:
Use subprocess.Popen() with stdin=subprocess.PIPE, write the data one point per line, and parse it on the C++ side
Use temporary files, either text or binary
Use ctypes to load the C++ part as a shared library and call the C++ functions from Python directly
Use Python's struct module to encode your data into C structs, which avoids the string formatting and parsing -- this can be combined with any of the three above: subprocess.Popen (read binary data from stdin in C++), temporary files (opened in binary mode) or ctypes (pass a pointer to the struct directly to the library function)
As to the question why passing arguments on the command line is problematic, there are limits imposed by the OS, e.g. getconf ARG_MAX.
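A minimal sketch of the struct-plus-subprocess combination (binary data over stdin). The point data and the ./machine_client name are made up for illustration, and wc -c stands in for the C++ side so the snippet actually runs:

```python
import struct
import subprocess

# Hypothetical points produced on the Python side.
points = [(1.0, 2.0), (3.5, 4.25), (-1.0, 0.5)]

# Pack a little-endian count header followed by (x, y) double pairs.
# The C++ side would fread() the count, then an array of
# struct Point { double x, y; } with the matching layout.
payload = struct.pack("<I", len(points))
payload += b"".join(struct.pack("<dd", x, y) for x, y in points)

# Stand-in for the real C++ executable: pipe the bytes through `wc -c`
# to show the transfer; replace with ["./machine_client"] in practice.
proc = subprocess.Popen(["wc", "-c"], stdin=subprocess.PIPE,
                        stdout=subprocess.PIPE)
out, _ = proc.communicate(payload)
print(out.decode().strip())  # 4 + 3 * 16 = 52 bytes
```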

Related

How to autogenerate python data parser for C++ structs?

I have a C++ application that logs data to files, and I want to load that data in Python so I can explore it. The data files are flat files with a known number of records per file. The data records are represented as a struct (with nested structs) in my C++ application. This struct (and its sub-structs) changes regularly during my development process, so I also have to make the corresponding changes to the Python code that loads the data. This is obviously tedious and doesn't scale well. What I'm interested in is a way to automate the process of updating the Python code (or some other way of handling this problem altogether). I have been exploring libraries that convert my C++ structs to other formats such as JSON, but I have yet to find a solid solution. Can anyone suggest something?
Consider using data serialization system / format that has C++ and Python bindings: https://en.wikipedia.org/wiki/Comparison_of_data-serialization_formats
(e.g. protobuf or even json or csv)
Alternatively, consider writing a library in C that reads the data and exposes it as structures, then use https://docs.python.org/3.7/library/ctypes.html to call this C library and retrieve the records.
Of course, if the semantics of the data change (e.g. a new important field needs to be analyzed), you will have to handle that new stuff in the Python code. No free lunch.
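The ctypes route can look roughly like this; the record layout (Sample, its fields, _pack_) is invented for illustration and must mirror the real C++ struct exactly:

```python
import ctypes
import io

# Hypothetical mirror of the C++ record; the field names and types are
# assumptions -- they must match the C++ struct's layout exactly.
class Sample(ctypes.Structure):
    _pack_ = 1  # assumes the C++ side writes packed records
    _fields_ = [
        ("timestamp", ctypes.c_uint32),
        ("value", ctypes.c_double),
    ]

def read_records(stream, n):
    """Read n fixed-size records from a binary stream."""
    records = []
    for _ in range(n):
        rec = Sample()
        stream.readinto(rec)  # fill the struct directly from the bytes
        records.append(rec)
    return records

# Simulate a log file written by the C++ application.
raw = bytes(Sample(timestamp=42, value=3.5)) * 2
recs = read_records(io.BytesIO(raw), 2)
print(recs[0].timestamp, recs[0].value)  # 42 3.5
```

The maintenance problem from the question remains: when the C++ struct changes, the _fields_ list must change with it, which is why a schema-based format like protobuf scales better.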

Python 3: capture a matrix return from subprocess with Rscript

I am using subprocess to run an Rscript. The script returns an R matrix. I am using subprocess.check_output in Python and get a string back. Is there a way to get the output matrix directly in Python?
Thanks
Exchanging objects between two languages is not an easy task.
The generic solution
This solution works for all languages:
You launch your script
After computation, you write your results in a generic format, for example .csv, .txt or .json
You reload the result in the other language
Regarding R and python
There is an existing package for this, rpy, but it can be tricky to use, and sometimes the errors are not very explicit (because, as I said, exchanging objects between two languages is tricky).
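A sketch of the generic solution, assuming the R script is changed to print its matrix as CSV (e.g. write.csv(m, stdout(), row.names = FALSE)); here echo stands in for the Rscript invocation so the snippet runs:

```python
import csv
import io
import subprocess

# Stand-in for: subprocess.run(["Rscript", "script.R"], ...)
# ("script.R" is a placeholder; echo just emits two CSV rows here.)
result = subprocess.run(["echo", "1,2\n3,4"],
                        capture_output=True, text=True)

# Parse the CSV text back into a list-of-lists matrix.
matrix = [[float(cell) for cell in row]
          for row in csv.reader(io.StringIO(result.stdout)) if row]
print(matrix)  # [[1.0, 2.0], [3.0, 4.0]]
```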

How to make Python ssl module use data in memory rather than pass file paths?

The full explanation of what I want to do and why would take a while. Basically, I want to use a private SSL connection in a publicly distributed application without handing out my private SSL keys, because that negates the purpose! I.e. I want secure remote database operations which no one can see into - including the client.
My core question is : How could I make the Python ssl module use data held in memory containing the ssl pem file contents instead of hard file system paths to them?
The constructor for class SSLSocket calls load_verify_locations(ca_certs) and load_cert_chain(certfile, keyfile) which I can't trace into because they are .pyd files. In those black boxes, I presume those files are read into memory. How might I short circuit the process and pass the data directly? (perhaps swapping out the .pyd?...)
Other thoughts I had were: I could use io.StringIO to create a virtual file, and then pass the file descriptor around. I've used that concept with classes that will take a descriptor rather than a path. Unfortunately, these classes aren't designed that way.
Or, maybe use a virtual file system / ram drive? That could be trouble though because I need this to be cross platform. Plus, that would probably negate what I'm trying to do if someone could access those paths from any external program...
I suppose I could keep them as real files, but "hide" them somewhere in the file system.
I can't be the first person to have this issue.
UPDATE
I found the source for the "black boxes"...
https://github.com/python/cpython/blob/master/Modules/_ssl.c
They work as expected. They just read the file contents from the paths, but you have to dig down into the C layer to get to this.
I can write C, but I've never tried to recompile the underlying Python source. It looks like I should follow the directions at https://devguide.python.org/ to pull the Python repo and make changes to it. I guess I can then submit my update to the Python community to see if they want to make a new standardized feature like the one I'm describing... Lots of work ahead, it seems...
It took some effort, but I did, in fact, solve this in the manner I suggested. I revised the underlying code in the _ssl.c Python module/extension and rebuilt Python as a whole. After figuring out the process for building Python from source, I had to learn the details of how to pass variables between Python and C, and I needed to dig into the guts of OpenSSL (over which the Python module is a wrapper).
Fortunately, OpenSSL already has functions for this exact purpose, so it was just a matter of swapping out how Python passes the file paths into the C layer, bypassing the file-reading step and using the CA/cert/key data directly instead.
For the moment, I only did this for Windows. Since I'm ultimately creating a cross platform program, I'll have to repeat the build process for the other platforms I'll support - so that's a hassle. Consider how badly you want this, if you are going to pursue it yourself...
Note that when I rebuilt Python, I didn't use that as my actual Python installation. I just kept it off to the side.
One thing that was really nice about this process was that after that rebuild, all I needed to do was drop the single new _ssl.pyd into my working directory. With that file in place, I could pass my direct cert data. If I removed it, I could pass the normal file paths instead. It will use either the normal Python source, or implicitly use the override if the .pyd file is simply put in the program's directory.
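One note for readers on recent Python versions: since Python 3.4, SSLContext.load_verify_locations already accepts in-memory CA data via its cadata parameter, so the rebuild above is mainly needed for load_cert_chain, which still takes only file paths. A minimal sketch (the PEM text is a placeholder, not a real certificate, so loading it fails):

```python
import ssl

# Placeholder PEM text -- not a real certificate.
ca_pem = "-----BEGIN CERTIFICATE-----\n...\n-----END CERTIFICATE-----\n"

ctx = ssl.SSLContext(ssl.PROTOCOL_TLS_CLIENT)
try:
    # Since Python 3.4, CA data can come straight from memory.
    ctx.load_verify_locations(cadata=ca_pem)
except (ssl.SSLError, ValueError):
    # The placeholder above is not valid PEM, so loading fails here;
    # with real certificate text it would succeed.
    pass
```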

Many input parameters to a binary file

I have a simulation program written in C++. Normally, I want to run the binary with different parameters for different runs. First I tried passing all my input parameters as command-line arguments, but that turned out to be too confusing; there are just too many parameters. Like:
./mysimulation 5 55 True ... output_file_name.txt
Then I decided to parse only a filename with input parameters written in it.
Like:
./mysimulation input_parameters.txt
But then the input file looks messy and confuses me, because there are multiple lines and I forget which lines correspond to which parameters. The file would look like this:
input_parameters.txt
55
5
True
...
output_file_name.txt
I could add comments, but then handling the input file in C++ becomes a hassle, as it requires extra effort to parse the parameters.
Now I'm thinking about using XML files, since they are structured and human-readable. I'm not sure how easy it would be to generate these XML files in Python (Python generates the input parameters and launches the C++ code multiple times) and read them in C++. Maybe there is a better solution?
In the future, I'd like to run multiple binaries in parallel on a many-CPU server. It would be nice if the solution worked well with Python's multiprocessing module.
I tried linking C++ and Python with Boost (as I did in other projects), but I had trouble setting up the CMake files because the C++ code has multiple classes, and I gave up after a couple of hours. Also, in one of my projects, Python multiprocessing crashed when using such a shared library, for a reason unknown to me. That is why I'd like to avoid Boost.Python. (Maybe I did something incorrectly.)
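One lightweight alternative to XML is to have Python write one self-describing JSON file per run and pass its path to the binary. A sketch, where the parameter names (n_steps, seed, ...) and ./mysimulation invocation are placeholders:

```python
import json
import tempfile

# Hypothetical parameter sets; the key names are placeholders.
runs = [
    {"n_steps": 55, "seed": 5, "verbose": True, "output": "run_a.txt"},
    {"n_steps": 60, "seed": 7, "verbose": False, "output": "run_b.txt"},
]

paths = []
for params in runs:
    # One self-describing JSON file per run: named keys instead of
    # positional lines, so nothing needs to be memorized.
    with tempfile.NamedTemporaryFile("w", suffix=".json",
                                     delete=False) as f:
        json.dump(params, f, indent=2)
        paths.append(f.name)

# Each run would then be launched independently, e.g.:
#   subprocess.run(["./mysimulation", path])
# On the C++ side, a library such as nlohmann/json parses the file
# in a few lines, and independent processes parallelize trivially.
print(paths)
```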

how to call a c++ file from python without using any of the spam bindings?

I have an encryption algorithm written in C++, but the values that have to be encrypted are taken as input and stored in a file by a Python program. How can I call this C++ program from Python?
Look at the subprocess module. It is the recommended way to invoke processes from within Python. The os.system function is a viable alternative sometimes, if your needs are very simple (no pipes, simple arguments, etc.); it will invoke an arbitrary command line from Python.
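A minimal subprocess sketch; ./encrypt and values.txt are placeholders for the C++ binary and the file written by the Python program, and echo stands in so the example runs:

```python
import subprocess

# Real usage would look like (placeholders, not actual files):
#   result = subprocess.run(["./encrypt", "values.txt"],
#                           capture_output=True, text=True, check=True)
# check=True raises CalledProcessError on a non-zero exit status.

# Runnable stand-in using a command that exists everywhere:
result = subprocess.run(["echo", "encrypted"],
                        capture_output=True, text=True)
print(result.stdout.strip())  # encrypted
```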
