Can you recommend a well-structured Python module combining both compiled C code (e.g. using distutils) and interpreted source code? I gather that "packages" can roll up interpreted modules and compiled modules, but I'm at a loss as to whether it's possible to combine both compiled and interpreted sources into a single module. Does such a thing exist?
If not, is The Right Thing (TM) to have a package with from-import statements loading the public symbols from separated compiled and interpreted submodules?
You cannot have one module with both Python and C. Every .py file is a module, and C files are compiled and built into .so or .pyd files, each of which is a module. You can import the compiled module into the Python file and use them together.
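A minimal sketch of the package pattern described above: `__init__.py` re-exports names from submodules so callers see one flat namespace, regardless of whether the implementation behind each name is pure Python or a compiled extension. All names here (`mypkg`, `_pure`, `greet`) are hypothetical, and the sketch writes the package to a temporary directory purely so it is self-contained.

```python
# Sketch: a package whose __init__.py re-exports names from a
# pure-Python submodule. A compiled submodule (built e.g. with
# distutils/setuptools) would be re-exported the same way.
import os
import sys
import tempfile

pkg_root = tempfile.mkdtemp()
os.makedirs(os.path.join(pkg_root, "mypkg"))

# A pure-Python submodule; in real life a sibling _fast.so could exist.
with open(os.path.join(pkg_root, "mypkg", "_pure.py"), "w") as f:
    f.write("def greet():\n    return 'hello'\n")

# __init__.py pulls the public symbols into the package namespace.
with open(os.path.join(pkg_root, "mypkg", "__init__.py"), "w") as f:
    f.write("from mypkg._pure import greet\n")

sys.path.insert(0, pkg_root)
import mypkg

print(mypkg.greet())  # hello
```

Users then write `from mypkg import greet` and never need to know which submodule, compiled or interpreted, provides the symbol.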
If you want some ultra-simple examples, you might like A Whirlwind Excursion through Python C Extensions.
I'm just now reading up on Cython, and I'm wondering whether Cython compiles imported modules into the executable, or if you still need to have the modules installed on the target machine to run the Cython binary.
The "interface" of a Cython module remains at the Python level. When you import a module in Cython, the module becomes available only at the Python level of the code and uses the regular Python import mechanism.
So:
Cython does not "compile in" the dependencies.
You need to install the dependencies on the target machine.
For "Cython level" code, including cimported modules, Cython uses the equivalent of C headers (the .pxd declaration files) plus dynamically loaded libraries to access external code. The shared libraries (.so on Linux, .dll on Windows, .dylib on macOS) need to be present on the target machine.
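To illustrate the point about Python-level imports: whether or not a module was compiled by Cython, its `import` statements resolve through the regular import machinery at run time, so a missing dependency fails the same way in both cases. A small sketch using the stdlib to probe whether a dependency is installed (the module name `not_a_real_mod` is deliberately fictitious):

```python
# Imports in Cython-compiled code still go through the normal Python
# import system at run time; find_spec probes availability without
# actually importing the module.
import importlib.util

print(importlib.util.find_spec("json") is not None)        # True: importable
print(importlib.util.find_spec("not_a_real_mod") is None)  # True: missing
```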
In Python (CPython) we can import a module with import module, where the module can be a *.py file (containing Python code) or a module written in C/C++ (extending Python). Such a module is a compiled object file (e.g. a *.so on Unix).
I would like to know exactly how the interpreter executes it.
I think a Python module is compiled to bytecode, which is then interpreted. In the case of a C/C++ module, functions from the module are executed directly: jump to the address and start execution.
Please correct me if I am wrong, and please elaborate.
When you import a C extension, Python uses the platform's shared library loader to load the library and then, as you say, jumps to a function in the library. But you can't load just any library or jump to any function this way. It only works for libraries specifically implemented to support Python, and only for functions that the library exports as Python objects. The library must understand Python objects and use those objects to communicate.
Alternately, instead of importing, you can use a foreign-function library like ctypes to load the library and convert data to the C view of data to make calls.
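A minimal sketch of the ctypes alternative mentioned above, calling a plain C function that knows nothing about Python. This is Unix-specific: `CDLL(None)` opens the process's own symbol namespace (on Windows you would pass a DLL name instead), and declaring `argtypes`/`restype` is the "convert data to the C view" step.

```python
# Call plain C code via ctypes: no Python-aware extension needed.
import ctypes

libc = ctypes.CDLL(None)          # the process's symbols (Unix only)
libc.abs.restype = ctypes.c_int   # tell ctypes the C signature
libc.abs.argtypes = [ctypes.c_int]

print(libc.abs(-5))  # 5
```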
I'm reading PEP338 .Some words confused me:
If the module is found, and is of type PY_SOURCE or PY_COMPILED, then the command line is effectively reinterpreted from python <options> -m <module> <args> to python <options> <filename> <args>.
Do modules have types in Python?
Modules can be loaded from different sources. The author refers to 2 specific sources the module was loaded from, see the imp module documentation:
imp.PY_SOURCE
The module was found as a source file.
[...]
imp.PY_COMPILED
The module was found as a compiled code object file.
[...]
imp.C_EXTENSION
The module was found as dynamically loadable shared library.
These values are used in the return value of the imp.get_suffixes() function, among others.
The PEP states that only modules loaded from source (.py files) and from a bytecode cache file (.pyc) are supported; the -m switch does not support C extension modules (typically .so or .dll dynamically loaded libraries).
The resulting module object is still just a module object; the word type in the text you found is not referring to Python's type system.
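The same three-way classification still exists in the modern import machinery; the `imp` module itself was deprecated and removed in Python 3.12, but `importlib.machinery` keeps a separate suffix list per module type. A quick sketch (the extension-suffix output is platform-dependent):

```python
# Modern equivalent of imp's PY_SOURCE / PY_COMPILED / C_EXTENSION:
# each module type has its own list of recognized file suffixes.
import importlib.machinery as machinery

print(machinery.SOURCE_SUFFIXES)     # includes '.py'
print(machinery.BYTECODE_SUFFIXES)   # ['.pyc']
print(machinery.EXTENSION_SUFFIXES)  # e.g. ['.cpython-311-x86_64-linux-gnu.so', ...]
```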
Quoting from PEP 338:
Proposed Semantics
The semantics proposed are fairly simple: if -m is used to execute a module the PEP 302 import mechanisms are used to locate the module and retrieve its compiled code, before executing the module in accordance with the semantics for a top-level module.
Now let us refer to the documentation of imp (the import mechanism) and determine the different types of modules that can be imported
imp.get_suffixes()
Return a list of 3-element tuples, each describing a particular type of module. Each triple has the form (suffix, mode, type), where suffix is a string to be appended to the module name to form the filename to search for, mode is the mode string to pass to the built-in open() function to open the file (this can be 'r' for text files or 'rb' for binary files), and type is the file type, which has one of the values PY_SOURCE, PY_COMPILED, or C_EXTENSION, described below.
and subsequently it explains what the different types are
imp.PY_SOURCE
The module was found as a source file.
imp.PY_COMPILED
The module was found as a compiled code object file.
imp.C_EXTENSION
The module was found as dynamically loadable shared library.
So, the "types" mentioned in PEP 338 are simply the types of modules that can be imported, and of the three listed above, only for PY_SOURCE and PY_COMPILED is the command line effectively reinterpreted from python -m <module> to python <filename>.
The "type" of a module here means the type of the file the module is stored in, since Python modules come in several file types (and extensions).
The most common are plain Python source files (.py) and compiled Python files (.pyc).
There are many other Python file extensions; see the (almost) full list here: https://stackoverflow.com/a/18032741/6575931.
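You can see which file a module was actually loaded from via its `__spec__.origin`. In a typical CPython build, `json` ships as Python source while `_ctypes` is a C extension, though this is build-dependent, so treat the second line as illustrative:

```python
# __spec__.origin reveals the module's file type (source vs extension).
import json
import _ctypes

print(json.__spec__.origin)    # path ending in __init__.py
print(_ctypes.__spec__.origin) # path ending in .so / .pyd on typical builds
```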
This question is about how to detect the version of the Python interpreter being used to execute an extension module, from within an extension module (i.e. one written in C).
As background, it's straightforward inside a Python extension module to get the version of Python against which the extension was compiled. You can just use one of the macros defined in patchlevel.h that are included when including the standard Python header Python.h (e.g. the macro PY_VERSION).
My question is whether it's possible from within an extension module to get at run-time the version of the interpreter currently being used to run the extension. For example, if Python 2.5 is accidentally being used to run an extension module compiled against Python 2.7, I'd like to be able to detect that at run-time from within the extension module. For concreteness, let's assume the extension module is compiled against Python 2.7.
Use the Py_GetVersion() API:
Return the version of this Python interpreter. This is a string that looks something like
"1.5 (#67, Dec 31 1997, 22:34:28) [GCC 2.7.2.2]"
The first word (up to the first space character) is the current Python version; the first three characters are the major and minor version separated by a period. The returned string points into static storage; the caller should not modify its value. The value is available to Python code as sys.version.
Import sys and examine sys.hexversion.
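For completeness, the same information is visible from the Python side: `sys.version` is the string `Py_GetVersion()` returns, and `sys.hexversion` packs the version into a single integer, which makes ordered comparisons convenient. A short sketch:

```python
# Inspect the running interpreter's version from Python.
import sys

print(sys.version.split()[0])  # e.g. '3.11.4' -- same string as Py_GetVersion()
print(hex(sys.hexversion))     # e.g. '0x30b04f0' -- one comparable integer

# An "at least Python 2.7 final" check, mirroring the runtime test the
# question asks about:
if sys.hexversion < 0x020700f0:
    raise RuntimeError("interpreter too old")
```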
I'm trying to write a software plug-in that embeds Python. On Windows the plug-in is technically a DLL (this may be relevant). The Python Windows FAQ says:
Do not build Python into your .exe file directly. On Windows, Python must be a DLL to handle importing modules that are themselves DLLs. (This is the first key undocumented fact.) Instead, link to pythonNN.dll; it is typically installed in C:\Windows\System. NN is the Python version, a number such as "23" for Python 2.3.
My question is why exactly Python must be a DLL? If, as in my case, the host application is not an .exe, but also a DLL, could I build Python into it? Or, perhaps, this note means that third-party C extensions rely on pythonN.N.dll to be present and other DLL won't do? Assuming that I'd really want to have a single DLL, what should I do?
I see there's the dynload_win.c file, which appears to be the module to import C extensions on Windows and, as far as I can see, it scans the extension file to find which pythonX.X.dll it imports; but I'm not experienced with Windows and I don't quite understand all the code there.
You need to link to pythonXY.dll as a DLL, instead of linking the relevant code directly into your executable, because otherwise the Python runtime can't load other DLLs (the extension modules it relies on.) If you make your own DLL you could theoretically link all the Python code in that DLL directly, since it doesn't end up in the executable but still in a DLL. You'll have to take care to do the linking correctly, however, as pretty much none of the standard tools (like distutils) will do this for you.
However, regardless of how you embed Python, you can't make do with just the DLL, nor can you make do with just any DLL. The ABI changes between Python versions, so if you compiled your code against Python 2.6, you need python26.dll; you can't use python25.dll or python27.dll. And Python isn't just a DLL; it also needs its standard library, which includes extension modules (which are DLLs themselves, although they have the .pyd extension). The code in dynload_win.c you ran into is for loading those DLLs, and is not related to loading of pythonXY.dll.
In short, in order to embed Python in your plugin, you need to either ship Python with the plugin, or require that the right Python version is already installed.
(Sorry, I did a stupid thing, I first wrote the question, and then registered, and now I cannot alter it or comment on the replies, because StackOverflow's engine doesn't think I'm the author. I cannot even properly thank those who replied :( So this is actually an update to the question and comments.)
Thanks for all the advice, it's very valuable. As far as I understand with some effort I can link Python statically into a custom DLL, provided that I compile other dynamically loaded extensions myself and link them against the same DLL. (I know I need to ship the standard library too; my plan was to append a zipped archive to the DLL file. As far as I understand, I will even be able to import pure Python modules from it.)
I also found an interesting place in dynload_win.c. (I understand it loads dynamic extensions that use Python C API, e.g. _ctypes.) As far as I can see it not only looks for init_ctypes symbol or whatever the extension name is, but also scans the .pyd file's import table looking for (regex) python\d+\. and then compares the found symbol with known pythonNN. string to make sure the extension was compiled for this version of Python. If the import table doesn't have such a symbol or it refers to another version, it raises an error.
For me it means that:
If I link an extension against pythonNN.dll and try to load it from my custom DLL that includes a statically linked Python, it will pass the check, but — well, here I'm not sure: will it fail because there's no pythonNN.dll (i.e. even before getting to the check) or it will happily load the symbols?
And if I link it against my custom DLL, it will find symbols, but won't pass the check :)
I guess I could rewrite this piece to suit my needs... Are there any other such places, I wonder.
Python needs to be a DLL (with a standard name) so that your application and the plugin can use the same instance of Python.
Plugin DLLs already expect to load (and use Python from) a python26.dll (or whichever version); if your Python is statically embedded in your exe, then two different instances of the Python library would be managing the same data structures.
If the Python libraries use no static variables at all, and the compile settings are exactly the same, this should not be a problem. However, it's generally far safer to simply ensure that only one instance of the Python interpreter is being used.
On *nix, all shared objects in a process, including the executable, contribute their exported names into a common pool; any of the shared objects can then pull any of the names from the pool and use them as they like. This allows e.g. cStringIO.so to pull the relevant Python library functions from the main executable when the Python library is statically-linked.
On Windows, each shared object has its own independent pool of names it can use. This means each shared object must explicitly name the specific shared objects it imports functions from. Since it would be a lot of work to re-export all the needed names from the main executable, the Python functions are separated out into their own DLL.
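The *nix symbol-pool behavior described above can be observed directly from ctypes: on typical Linux/macOS CPython builds the interpreter exports its C-API symbols into the process-wide pool, so `CDLL(None)` (i.e. `dlopen(NULL)`) can resolve `Py_GetVersion` without naming any library, which is exactly what a Windows DLL cannot do. This may fail on unusual static builds, so treat it as a demonstration rather than a guarantee:

```python
# *nix only: resolve a Python C-API symbol from the process-wide pool.
import ctypes

Py_GetVersion = ctypes.CDLL(None).Py_GetVersion
Py_GetVersion.restype = ctypes.c_char_p

print(Py_GetVersion().decode().split()[0])  # e.g. '3.11.4'
```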