Confusion wrapping a C++ library to Python

I have a .cpp and .h source file pair which is a manager (I guess a wrapper also) for a C++ library I have made. I want to let people use this manager to work with my library from Python. I have heard about several different ways to wrap this library for Python, like Cython and Boost.Python, but I'm having trouble understanding the process.
If I want to make this manager usable in Python, do I need to wrap it in a different way for each version of Python (2.7 vs. 3.4)? Do I also need to wrap it in a different way for each operating system, for each version? So 2.7/3.4 for Windows vs. 2.7/3.4 for Linux?

Concerning your confusion about the process: just follow any tutorial for any of the wrapper libraries you found or that were suggested in the comments.
If I want to make this manager usable in Python, do I need to wrap it in a different way for each version of Python? (2.7 vs. 3.4)
Yes. You might be able to load binary modules compiled for Python 3.4 into Python 3.5 (for example, if they target the stable ABI), but it will certainly not work across major versions such as 2.7 vs. 3.4.
Do I also need to wrap it in a different way for each operating system for each version?
Yes. Just as you need to compile your C++ code separately for different operating systems (and possibly OS versions) and CPU architectures, Python extension modules are no different. However, "wrap it in a different way" here just means "compile for the target environment".
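To make that concrete, here is a minimal sketch of a setuptools build for a C++ extension; the file name manager_binding.cpp and the module name manager are invented for illustration, and Cython or Boost.Python each layer their own steps on top of this:

# setup.py -- minimal sketch of building a C++ extension with setuptools.
from setuptools import setup, Extension

ext = Extension(
    "manager",                        # import name: `import manager`
    sources=["manager_binding.cpp"],  # your C++ wrapper source (placeholder)
    language="c++",
)

setup(name="manager", version="0.1", ext_modules=[ext])

You would then run the build once per target, e.g. python2.7 setup.py build and python3.4 setup.py build on Linux, and again with each interpreter on Windows; the source stays the same, only the produced binaries differ.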

Related

How to safely create and import .so files in Python 3?

I am trying to use Cython to create .so binary files from our .py files and share them with our team.
However, even though we all use Python 3, most of the time it has to be exactly the same revision (let us say 3.7.8), otherwise we get an error when importing them.
Is this behavior expected?
Some revisions are compatible. For example, if we build the .so with Python 3.5.2 and import it in 3.6.8 it works, but it does not work in 3.7.8.
Where does this mess come from, and what is the safest way to do this?
To follow up on my comments:
On the same platform, extension modules should work within a "minor version" (i.e. modules built with 3.7.2 and 3.7.3 should be compatible), though I'm struggling to find a source for this. Beyond that, some effort was made in the past to ensure compatibility between releases, but not so much any more, so you may be lucky and things may happen to work.
distutils/setuptools and other similar build mechanisms tag extension modules with a suffix indicating the version and some other details. For example, an extension would be called foo.cpython-37m.so instead of just foo.so. These tags prevent the module from being used with other Python versions and are a good thing. If you are removing these tags, then this mess is entirely on you.
Python now defines a more limited stable ABI that should be compatible across Python versions. Cython is working towards supporting it, but at the moment that support is not in a usable state. In a year or so it should be a good solution.
In summary, .so files are not expected to be portable between different Python versions. You should either standardise on a Python version or build the .so files locally.
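If you want to see what tag your own interpreter uses, here is a quick sketch (the suffix shown in the comment is just an example):

# Print the suffix this interpreter expects on extension modules; an
# .so whose tag doesn't match generally won't import (Python 3 shown).
import importlib.machinery
import sysconfig

print(sysconfig.get_config_var("EXT_SUFFIX"))
# e.g. ".cpython-37m-x86_64-linux-gnu.so"
print(importlib.machinery.EXTENSION_SUFFIXES)
# every extension suffix this interpreter will try when importing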

Does compiling and linking Python in a C++ program mean that target users won't need python installed?

I have a C++ application that uses machine learning from Python, and my current approach is making a single-file executable with PyInstaller and then just running it from C++. This has obvious drawbacks, notably inter-application communication. At the moment I'm using an intermediate JSON file so the two can talk to each other, but this is massively suboptimal for my future requirements. What's beautiful about this approach is that it works on all major platforms without too much hassle.
Section 1.6 of Python's manual is titled "Compiling and Linking under Unix-like systems".
Does this mean that the Python interpreter will be inside my application binary, and that the target system doesn't need to have Python installed because the program will always use the embedded interpreter? If so, what about Python libraries? Can I embed a whole conda environment?
Also, what about the
"(...) under Unix-like systems"
part? Does this mean this approach is not multi-platform?
Thanks in advance.
Embedding the Python interpreter is possible on all platforms. However, it will only be the interpreter; embedding any libraries will be a lot harder or even impossible.
But since you seem to deploy the Python libs already, you can use them just fine from the embedded interpreter. And then you could bridge C++ and Python without IPC, since they are both running in the same process.
pybind11 is very nice for embedding and generating C++ <-> Python interfaces.
A possible alternative, depending on the libraries in use, may be to export the model and use a C++ library to load and use it (for instance Tensorflow -> ONNX -> ONNX runtime).
It means that CPython (the Python interpreter) will be inside your application. You will be able to run Python code and observe and manipulate the virtual machine state directly from C++ code (the C API reference is a good entry point). Your application might have some additional dynamic library dependencies (which ones depends on the compilation options of the embedded Python). Also, the interpreter isn't completely self-contained: it depends on some external .py modules normally shipped with the Python distribution (the standard library). If you plan to import external modules that expect the standard library, you will have to ship it with your application. There are also ways to build modules into the binary (freezing), but you might run into issues, especially with modules that rely on the filesystem.
As far as I have tried, this procedure works on UNIX-like systems and on Windows (where the easiest way is to link against the Python DLL, which you then ship with your application). On Windows you also need to make sure that you compile with the same compiler that was used to compile the DLL (or compile the Python DLL from source yourself). Here is additional information about embedding on Windows: https://docs.python.org/3/faq/windows.html#how-can-i-embed-python-into-a-windows-application
Just note that embedding Python and shipping 3rd party modules with your application might have some licensing consequences.

Managing Python 3 code with SCons

At work I have the task of converting a large library of Python 2.7 code to Python 3.x.
This library contains a lot of scripts, as well as C++ extensions made with Boost.Python.
All of this is built with SCons, which does not work with a Python 3.x interpreter, but my supervisor and I want to know if there is a way around this.
The SConstruct file contains expressions using sys.version to determine the correct module directories to import (numpy etc.). I do not know SCons or its syntax, so I cannot give a lot of information about this topic.
Can we use SCons to build Python 3 code with the given extensions, or do we have to wait until SCons is compatible with Python 3?
At the time of writing this, there are plans to support both Python 2.7 and 3.x in a single branch/version. Work on this feature has started, but it will take some more time to reach this goal.
So it looks as if your best bet would be to start right away. SCons itself should run fine under Python 2.7 for compiling the Boost extensions. The problem in your case is the added checks and detection mechanisms for deriving paths and module names from the version of the current Python interpreter.
Since you can't give any more detail about this process, my answer is somewhat vague here, sorry. In principle you'd have to find the place in the SConstructs/SConscripts where the version of the currently running Python interpreter is determined. Just hardcode this to the 3.x version that you have additionally installed on the machine, and keep your fingers crossed that the rest will work automatically (a sketch of what that could look like follows below).
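Purely as a hypothetical illustration, since I don't know your build files (every name and path below is invented):

# SConstruct sketch: pin the *target* Python instead of deriving it
# from the interpreter that runs SCons.
#
# Old pattern (follows whatever interpreter runs SCons):
#   import sys
#   python_version = sys.version[:3]

python_version = "3.4"                                    # hardcoded target
python_include = "/usr/include/python" + python_version   # placeholder path

env = Environment(CPPPATH=[python_include])
env.SharedLibrary(
    "mymodule", ["mymodule.cpp"],
    SHLIBPREFIX="",     # Python wants "mymodule.so", not "libmymodule.so"
    LIBS=["python" + python_version, "boost_python"],
)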
Note how there is a clear separation here between "compiling code for a Python version" vs "compiling code under a Python version".
In general, a better understanding of SCons internal workings and basic principles might be helpful. If you find the time, check out the UserGuide ( http://scons.org/doc/production/HTML/scons-user.html ) or consult our user mailing list ( see http://scons.org/lists.php ) for larger questions and discussions.

Running python script without installed libraries

I have a working Python script that uses scipy and numpy functions, and I need to run it on a computer that has Python installed but not the scipy and numpy modules. How should I do that? Is .pyc the answer, or should I do something more complex?
Notes:
I don't want to use py2exe. I am aware of it, but it doesn't fit the problem.
I have read these questions (What is the difference between .py and .pyc files?, Python pyc files (main file not compiled?)), which have an obvious connection to this problem, but since I am a physicist, not a programmer, I got totally lost.
It is not possible.
A .pyc file is nothing more than a Python file compiled into bytecode. It does not contain any of the modules that the file imports!
Additionally, the numpy module is an extension written in C (and some Python). A substantial part of it consists of shared libraries that are loaded into Python at runtime. You need those for numpy to work!
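You can verify this yourself with a small sketch (on Windows the compiled files end in .pyd rather than .so):

# List the compiled extension files inside the installed numpy package,
# showing that it is not pure Python.
import glob
import os
import numpy

pkg_dir = os.path.dirname(numpy.__file__)
for path in glob.glob(os.path.join(pkg_dir, "**", "*.so"), recursive=True):
    print(path)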
Python first "compiles" a program into bytecode, and then throws this bytecode through an interpreter.
So if your code is all Python code, you would be able to generate the bytecode once and then have the Python runtime use it. In fact, I've seen projects where the developer just looked through the bytecode spec and implemented a bytecode parsing engine. It's very lightweight, so it's useful for e.g. "Python on a chip" etc.
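For illustration, here is a minimal sketch of generating and inspecting bytecode (myscript.py is a placeholder name):

# Pre-compile a script to bytecode. The resulting .pyc still needs a
# matching CPython interpreter, and none of the modules it imports are
# bundled into it.
import dis
import py_compile

py_compile.compile("myscript.py", cfile="myscript.pyc")

# Peek at the bytecode the interpreter actually executes:
def add(a, b):
    return a + b

dis.dis(add)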
The problem comes with external libraries that are not written entirely in Python (e.g. numpy, scipy).
Python provides a C API, allowing you to create (using C/C++ code) objects that appear to Python as native Python objects. This is useful for speeding things up, interacting with hardware, and making use of C/C++ libraries.
Take a look at Nuitka. If you are able to compile your code with it (not necessarily a possible or easy task), you'll get what you want.

Python portability issues

Basically, I am a Java programmer who wants to learn the Python language. I want to understand why some Python libraries are distributed in a non-portable manner.
Let me explain my thoughts. If someone creates a regular library using Java, he prepares one JAR file which can be used on different platforms:
my-great-lib-1.2.4.jar
I can use this lib (the same file) on any version of Windows or Linux.
In contrast to Java, python libraries may look like this:
bsdiff4-1.1.4.win-amd64-py2.5.exe
bsdiff4-1.1.4.win-amd64-py2.6.exe
bsdiff4-1.1.4.win-amd64-py2.7.exe
bsdiff4-1.1.4.win-amd64-py3.2.exe
bsdiff4-1.1.4.win-amd64-py3.3.exe
bsdiff4-1.1.4.win32-py2.5.exe
bsdiff4-1.1.4.win32-py2.6.exe
bsdiff4-1.1.4.win32-py2.7.exe
bsdiff4-1.1.4.win32-py3.2.exe
bsdiff4-1.1.4.win32-py3.3.exe
See the full list on the project's page.
This looks very strange to me. Even 32-bit and 64-bit platforms require different installers. Installers! Why do I need an installer in order to use one library? Moreover, the installers listed are only for Windows, and each of them is bound to a particular Python version. Where is the portability?
Could anyone explain the necessity of the 10 different files above?
In general, Python libraries are portable across platforms. Problems appear between different major Python versions (3 introduced some big changes relative to 2, though 2.7 is backwards compatible with 2.6) or when you use C code to optimize CPU-intensive parts. On Linux, compiling that yourself is not a problem: when you call pip install package, it will do it for you. The problem is on Windows, where it is much more difficult to compile a C program, especially because not everybody has a compiler. So on Windows, for packages that need something compiled from C, you usually get an installer.
Also, installers are used because they set up everything nicely: they look in the registry for the appropriate place to put everything, offer a standard way to uninstall (the ones from Christopher Gohlke's site can be removed using Add/Remove Programs in the Control Panel), and follow the standard on Windows: most programs on Windows are installed via an exe, because Windows doesn't have a standard, widespread package manager.
All these libraries are still portable in that sense: you can use them from any platform, but how you install them is what differs.
There are many complications. In Java, where your code (and then bytecode) is interpreted by the JVM, the underlying computer architecture does not play much of a role, as long as the JVM runs on it. In fact, that is one of the primary reasons Java became so popular: your code only has to target the JVM.
However, in Python the situation is different. I will try to summarize some of the reasons I think are important in the following lines:
The language itself is evolving (although slowly, if you think about it!) and changes are happening inside the language. New features are added and sometimes the language is even remodeled (Python 2.x to Python 3.x).
Python relies heavily on its C extensions, and so do the applications written in Python. If you write a Python program that has some CPU-intensive code, you can choose to write that part in C. This adds to the necessity of creating a number of builds for the various distributions.
For one, Python versions jump around. In Python 3, the names of some builtins changed completely. For example:
raw_input()
changed to:
input()
Also, a lot of the standard library has changed, even in the alpha of 3.4. As for the 32/64-bit question, I cannot fully answer. I know that certain platforms have trouble when trying to mix 32-bit and 64-bit binaries, and that may be the point there.
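A common way to bridge that particular builtins change when code must run under both versions is a small compatibility shim, sketched here:

# Make input() behave the same under Python 2 and 3: Python 2's
# raw_input() returns the raw string, which is what Python 3's input()
# does (Python 2's input() would eval the string instead).
try:
    input = raw_input   # Python 2
except NameError:
    pass                # Python 3: input() already behaves this way

name = input("Your name: ")
print(name)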
