Many input parameters to a binary file


I have a simulation program written in C++. I usually want to start the binary with different parameters for different runs. First I tried passing all my input parameters as command-line arguments, but that turned out to be too confusing because there are simply too many of them. For example:
./mysimulation 5 55 True ... output_file_name.txt
Then I decided to pass only the name of a file that contains the input parameters, like:
./mysimulation input_parameters.txt
But then the input file looks messy and confuses me: it has multiple lines, and I forget which line corresponds to which parameter. The file looks like this:
input_parameters.txt
55
5
True
...
output_file_name.txt
I could add comments, but then handling the input file in C++ becomes a hassle, since parsing the parameters would require additional effort.
Now I am thinking about using XML files, since they are structured and human-readable. I'm not sure whether it will be easy to generate these XML files in Python (Python generates the input parameters and launches the C++ code multiple times) and to read them in C++. Maybe a better solution exists?
In the future, I'd like to run multiple binaries in parallel on a many-CPU server. It would be nice if the solution worked well with Python's multiprocessing module.
I tried linking C++ and Python with Boost (as I did in my other projects), but I had trouble setting up the CMake files because the C++ code has multiple classes, and I gave up after a couple of hours. Also, in one of my projects, multiprocessing in Python crashed when using the resulting shared library, for a reason I could not identify, which is why I'd like to avoid Boost.Python. (Maybe I did something incorrectly.)
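One possible middle ground between a bare list of values and XML is a plain "key = value" file with "#" comment lines, which stays readable and is still easy to parse line by line in C++. The sketch below is only an illustration under assumptions: the parameter names, the `run_id` key used to name each file, and the helper names are hypothetical, and the binary is assumed to take the parameter file as its only argument, as in the question.

```python
import subprocess
from multiprocessing.pool import ThreadPool
from pathlib import Path


def write_params(path, params, comments=None):
    """Write parameters as 'key = value' lines, with optional '# comment' lines."""
    comments = comments or {}
    lines = []
    for key, value in params.items():
        if key in comments:
            lines.append(f"# {comments[key]}")
        lines.append(f"{key} = {value}")
    Path(path).write_text("\n".join(lines) + "\n")


def run_simulation(binary, params, workdir):
    """Write one parameter file and run the binary on it (blocks until it exits)."""
    Path(workdir).mkdir(parents=True, exist_ok=True)
    # 'run_id' is a hypothetical key used only to give each run its own file.
    param_file = Path(workdir) / f"input_{params['run_id']}.txt"
    write_params(param_file, params)
    return subprocess.run([binary, str(param_file)], check=False).returncode


def run_all(binary, param_sets, workdir, workers=4):
    """Run several simulations concurrently.

    Threads are enough here because each worker only waits on an
    external process; the heavy work happens in the C++ binary.
    """
    with ThreadPool(workers) as pool:
        return pool.map(lambda p: run_simulation(binary, p, workdir), param_sets)
```

Calling `run_all("./mysimulation", param_sets, "runs")` with a list of parameter dictionaries would then launch up to four runs at a time and return their exit codes in order.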

Related

Sending data from python to c++ as parameters or file?

I have two programs. My main program is in Python, and I have another one in C++ because I have to connect to a machine, and the only way I can use this machine is through a DLL/lib written in C++ that the manufacturer gave me.
In my Python program, I fetch thousands of points from a database and do some operations on them; those points are stored in different arrays in a class.
Now I need to send those points to the C++ program, and I was wondering how. I can send them as arguments to the main of the C++ program, but then the command line is too long, and I think that even if it weren't too long, this would be really bad practice (if someone can explain a little why, that would be cool!).
Currently, I'm thinking of generating a file with all the coordinates in my Python program and sending it to the C++ program, but then I have to parse it in C++.
Is this a good way to do it? Or would another solution be better?

Some ways:
Use subprocess.Popen() with stdin=subprocess.PIPE and write the data one per line, parsing it on the C++ side
Use temporary files, either text files or other files
Use ctypes and load the C++ part as shared library and then call into the C++ functions from Python directly
Use Python's struct module to encode your data from Python into C structs, which avoids the string formatting and string parsing -- this can be combined with any of the three above: subprocess.Popen (read binary data from stdin in C++), temporary files (open in binary mode) or ctypes (pass a pointer to the struct directly to the library function)
As to the question why passing arguments on the command line is problematic, there are limits imposed by the OS, e.g. getconf ARG_MAX.
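As a rough sketch of the stdin-plus-struct combination from the list above: the points are packed into a small binary layout (a count followed by pairs of doubles, a layout assumed here purely for illustration) and written to the child's stdin. `subprocess.run` with `input=` is used as a convenience wrapper around `Popen` with a stdin pipe. The child is a Python one-liner standing in for the C++ program, which would read the same bytes from standard input.

```python
import struct
import subprocess
import sys


def encode_points(points):
    """Pack (x, y) pairs as a little-endian uint32 count followed by doubles."""
    payload = struct.pack("<I", len(points))
    for x, y in points:
        payload += struct.pack("<2d", x, y)
    return payload


def send_to_child(command, payload):
    """Start the child, write the payload to its stdin, and return its stdout."""
    result = subprocess.run(command, input=payload, capture_output=True, check=True)
    return result.stdout


# Stand-in for the C++ program: reports how many bytes arrived on stdin.
CHILD = [sys.executable, "-c", "import sys; print(len(sys.stdin.buffer.read()))"]

if __name__ == "__main__":
    data = encode_points([(0.0, 1.0), (2.5, -3.0)])
    # 4 bytes for the count plus 2 points * 16 bytes each.
    print(send_to_child(CHILD, data).decode().strip())
```

The "<" prefix in the format strings fixes the byte order and disables padding, so the C++ side can read a 4-byte unsigned count and then 8-byte doubles without guessing about alignment.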

How to run c code within python

How can I run c/c++ code within python in the form:
def run_c_code(code):
    # Do something to run the code
    pass

code = """
Arbitrary code
"""
run_c_code(code)
It would be great if someone could provide an easy solution which does not involve installing packages. I know that C is not a scripting language but it would be great if it could do a 'mini'-compile that is able to run the code into the console. The code should run as it would compiled normally but this needs to be able to work on the fly as the rest of the code runs it and if possible, run as fast as normal and be able to create and edit variables so that python can use it. If necessary, the code can be pre-compiled into the code = """something""".
Sorry for all the requirements, but if you can make the C code run in Python, that would be great. Thanks in advance for all the answers.

As somebody else already pointed out, to run C/C++ code from "within" Python, you'd have to write said C/C++ code into its own file, compile it correctly, and then execute that program from your Python code.
You can't just type one command, compile it, and execute it. You always have to have the whole "framework" set up. You can't compile a program when you haven't yet written the } that ends the class/function/statement 20 lines later on. At this point you'd already have to write the whole C/C++ program for it to work. It's simply not meant to be interpreted on the run, line by line. You can do that with python, bash/dash/batch, and a few others. But C/C++ definitely isn't one of them.
With those come several issues. Firstly, the C/C++ part probably needs data from the Python part. I don't know of any way of doing it in RAM alone (maybe there is one, but I don't know), so the Python part would have to write it into a file, the C/C++ part would read and process it, then put the processed data into another file, and then the Python part would have to read that and continue.
Which brings up another point. Here we're already getting into concurrency territory, because the moment you execute that C/C++ program you're dealing with a second process. So, somehow, you'd have to coordinate those programs so that the Python part only continues once the C/C++ part is done. That shouldn't be a huge problem to get running, but it can be a nightmare for performance and RAM if done wrongly.
Without knowing to what extent you use that program, I'd also like to add that C/C++ isn't platform-independent like Python. You'll have to compile that program for every single OS that you run it on. That may require minor changes to the code and, in general, a lot of work, because you have to debug and test it on every system.
To sum up, I think it may be better to find another solution. I don't know why you'd want to run this specific part in C/C++, but I'd recommend trying to get it done in one language. If there's absolutely no way you can get it done in Python (which I doubt, there are libraries for almost everything), you should port your Python code to C/C++ instead.
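For reference, the write-compile-execute workflow described in this answer can be sketched roughly as follows. It assumes a Unix-like system with a C compiler reachable as `cc`, `gcc`, or `clang`; the file names are arbitrary. Data is exchanged through the child's stdin and stdout instead of intermediate files, and `subprocess.run` blocks until the compiled program exits, so Python only continues once the C part is done.

```python
import os
import shutil
import subprocess
import tempfile


def run_c_code(code, stdin_text=""):
    """Compile C source with the system compiler, run it, and return its stdout.

    Returns None when no C compiler is found on PATH.
    """
    compiler = shutil.which("cc") or shutil.which("gcc") or shutil.which("clang")
    if compiler is None:
        return None
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "snippet.c")
        binary = os.path.join(workdir, "snippet")
        with open(source, "w") as handle:
            handle.write(code)
        # The whole program must be present; a compile error raises CalledProcessError.
        subprocess.run([compiler, source, "-o", binary], check=True)
        # Blocks until the child process has finished.
        result = subprocess.run([binary], input=stdin_text,
                                capture_output=True, text=True, check=True)
        return result.stdout


code = r"""
#include <stdio.h>
int main(void) {
    int a, b;
    if (scanf("%d %d", &a, &b) == 2) printf("%d\n", a + b);
    return 0;
}
"""

if __name__ == "__main__":
    print(run_c_code(code, "2 3\n"))
```

Every call pays the full compile cost, so this only makes sense when the compiled program runs long enough, or is reused often enough, to make up for it.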

If you want to run C/C++ code - you'll need either a C/C++ compiler, or a C/C++ interpreter.
The former is quite easy to arrange (though probably not suitable for an end user product) and you can just compile the code and run as required.
The latter requires that you attempt to process the code yourself and generate python code that you can then import. I'm not sure this one is worth the effort at all given that even websites that offer compilation tools wrap gcc/g++ rather than implement it in javascript.
I suspect that this is an XY problem; you may wish to take a couple of steps back and try to explain why you want to run c++ code from within a python script.

How do I speed up repeated calls to a Ruby program (GitHub's linguist) from Python?

I'm using github's linguist to identify unknown source code files. Running this from the command line after a gem install github-linguist is insanely slow. I'm using python's subprocess module to make a command-line call on a stock Ubuntu 14 installation.
Running against an empty file, linguist __init__.py takes about 2 seconds (with similar results for other files). I initially assumed this was entirely due to the startup time of Ruby, but as @MartinKonecny points out, it seems to be the linguist program itself.
Is there some way to speed this process up -- or a way to bundle the calls together?

One possibility is to just adapt the linguist program (https://github.com/github/linguist/blob/master/bin/linguist) to take multiple paths on the command-line. It requires mucking with a bit of Ruby, sure, but it would make it possible to pass multiple files without the startup overhead of Linguist each time.
A script this simple could suffice:
require 'linguist/file_blob'
ARGV.each do |path|
  blob = Linguist::FileBlob.new(path, Dir.pwd)
  # print out blob.name, blob.language, blob.sloc, etc.
end
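On the Python side, calling such a script could look roughly like the sketch below. The script name `batch_linguist.rb` is hypothetical, and so is the assumption that the script prints one line of output per file. Paths are sent in batches so that a very long file list does not exceed the operating system's command-line length limit.

```python
import subprocess


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def build_commands(script, paths, batch_size=200):
    """Build one `ruby <script> <paths...>` command per batch of paths."""
    return [["ruby", script, *batch] for batch in chunked(list(paths), batch_size)]


def run_batches(script, paths, batch_size=200):
    """Run each batch once and return the combined stdout lines."""
    lines = []
    for command in build_commands(script, paths, batch_size):
        result = subprocess.run(command, capture_output=True, text=True, check=True)
        lines.extend(result.stdout.splitlines())
    return lines
```

With this arrangement the startup cost is paid once per batch instead of once per file.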

generating simulation input files with Python

I am using a scientific simulation package that requires several text-based input files for each 'experiment' to be conducted. These files can be quite lengthy and have a lot of boilerplate sections in them; however, specific 'experiment-specific' values must be entered at many locations within these files.
I would like to automate the generation of these files and do so in a way that is maintainable.
Right now, I am using a Python script I wrote that employs triple quoted blocks of text and variable substitution (using % and .format()) to create sections in the files. I then write out these blocks to the appropriate files.
Accounting for proper aesthetic indentation in the resulting input files is proving to be difficult; moreover, the autogenerator script is becoming more and more opaque as I enhance the types of simulations and options that can be handled.
Does anyone have suggestions about how to manage this task in a more elegant and maintainable way?
I am aware of templating packages like jinja. Do these have benefits outside of generating html-like files? Has anyone used these for the above-stated purpose?
Perhaps a totally different approach would be better.
Any suggestions would be greatly appreciated.

Jinja doesn't care what type of file you make. Text is text is text, unless it's binary. Not even sure Jinja cares then either.
IPython, and in particular, nbconvert, uses Jinja2 to export LaTeX, ipynb, markdown, etc.
There is also an IPython notebook with Jinja2 magics in case you want a demo.

My usual approach to this sort of problem is to create a small library of functions that help me generate and customise the boiler-plate. I don't know what your experiment-definition language looks like but generally I'd need to write a function that writes out the text to initialise the simulation, a function that writes out the text to wrap up the simulation and some other functions to write out the different chunks of text that define each type of experiment.
Having put those functions in a file called mysim, say, I could then use them like this:
from mysim import sim_init, sim_conclude, experimentType1, experimentType2
sim_init(name="Today's Simulation", author="Simon")
for param1 in [0, 1, 2, 3, 4, 5, 6, 7, 8, 20, 30, 40, 50, 60, 70]:
    experimentType1(param1)
    for param2 in ["A", "B", "C"]:
        experimentType2(param1, param2)
sim_conclude(savefile="output.txt")
This Python script would generate a simulation input file that would run experiment type 1 for each value of param1 and experiment type 2 for each combination of param1 and param2.
The function implementations themselves might look messy, but the script that creates a particular simulation file will be simple and clear.
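To give an idea of what the inside of such a mysim module might look like, here is a minimal sketch. The input-file keywords (BEGIN, EXPERIMENT, SAVE and so on) are invented placeholders, since the real simulation format is unknown, and `set_output` is a hypothetical helper that was not part of the script above. `textwrap.dedent` is used so the templates can be indented naturally in the source while the generated text keeps only its own relative indentation.

```python
import sys
import textwrap

_out = sys.stdout  # stream that receives the generated input-file text


def set_output(stream):
    """Redirect the generated text, e.g. to an open file (hypothetical helper)."""
    global _out
    _out = stream


def _emit(block):
    """Remove the source-code indentation from a template block and write it."""
    _out.write(textwrap.dedent(block).lstrip("\n"))


def sim_init(name, author):
    _emit(f"""
        # simulation: {name}
        # author: {author}
        BEGIN
        """)


def experimentType1(param1):
    _emit(f"""
        EXPERIMENT type1
          value {param1}
        END_EXPERIMENT
        """)


def experimentType2(param1, param2):
    _emit(f"""
        EXPERIMENT type2
          value {param1}
          mode {param2}
        END_EXPERIMENT
        """)


def sim_conclude(savefile):
    _emit(f"""
        SAVE {savefile}
        END
        """)
```

Keeping each chunk of boilerplate in its own function means an indentation fix or a new option only has to be made in one place.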

Testing full program by comparing output file to reference file: what's it called, and is there a package for it?

I have a completely non-interactive python program that takes some command-line options and input files and produces output files. It can be fairly easily tested by choosing simple cases and writing the input and expected output files by hand, then running the program on the input files and comparing output files to the expected ones.
1) What's the name for this type of testing?
2) Is there a python package to do this type of testing?
It's not difficult to set up by hand in the most basic form, and I did that already. But then I ran into cases like output files containing the date and other information that can legitimately change between the runs - I considered writing something that would let me specify which sections of the reference files should be allowed to be different and still have the test pass, and realized I might be getting into "reinventing the wheel" territory.
(I rewrote a good part of unittest functionality before I caught myself last time this happened...)

I guess you're referring to a form of system testing.
No package would know which parts can legitimately change. My suggestion is to mock out the sections of code that produce the changes, so that you can be sure the output is always the same; you can use tools like Mock (available as unittest.mock in the standard library) for that. Comparing two files is pretty straightforward: read each into a string and compare the strings.
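When mocking is not practical, the alternative the question describes, tolerating known volatile sections, can be approximated by normalizing both texts before comparing them. This is only a sketch: the date and time patterns are assumptions about what the output files contain and would need to be adapted to the real format.

```python
import difflib
import re

# Patterns for content that may legitimately differ between runs (assumed formats).
VOLATILE = [
    (re.compile(r"\d{4}-\d{2}-\d{2}"), "<DATE>"),
    (re.compile(r"\d{2}:\d{2}:\d{2}"), "<TIME>"),
]


def normalize(text):
    """Replace every volatile fragment with a fixed placeholder."""
    for pattern, placeholder in VOLATILE:
        text = pattern.sub(placeholder, text)
    return text


def diff_against_reference(actual, expected):
    """Return unified-diff lines; an empty list means the outputs match."""
    return list(difflib.unified_diff(
        normalize(expected).splitlines(),
        normalize(actual).splitlines(),
        "expected", "actual", lineterm=""))
```

Inside a test, `assert diff_against_reference(actual, expected) == []` fails with the diff lines available for inspection, which is usually more helpful than a bare string inequality.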

Functional testing. Or regression testing, if that is its purpose. Or code coverage, if you structure your data to cover all code paths.
