As far as I can make out, its build system generates a shared object, and also a Python module that acts as a proxy to that shared object. But how does the Python runtime end up doing a dlopen of the generated shared object and binding Python method calls to the corresponding functions in the shared library?
I also found a reference to shared libraries in Python's import system documentation, but nothing beyond that. Does CPython treat .so files on the module path as Python modules?
When doing import module, Python looks for files with various extensions that could be Python modules. That can be module.py, but also module.so (on Linux) or module.pyd (on Windows).
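You can ask the interpreter which suffixes it accepts; a minimal sketch (Python 3, with output shown for a typical Linux build):

import importlib.machinery

# Suffixes the import system will try for a given module name
print(importlib.machinery.SOURCE_SUFFIXES)     # ['.py']
print(importlib.machinery.EXTENSION_SUFFIXES)  # e.g. ['.cpython-312-x86_64-linux-gnu.so', '.abi3.so', '.so']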
When loading a shared object, Python loads it like any other dynamic library and then calls the module's init function: it must be named PyInit_{module_name_here} (on Python 3) and be exported by the shared library.
You can read more about it here.
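To see that mechanism from the Python side, here is a minimal sketch that loads an extension module from an explicit path (the module name and path are hypothetical); this is essentially what import mymodule does once the file has been found:

import importlib.machinery
import importlib.util

# dlopen() the shared library and run its exported PyInit_mymodule
loader = importlib.machinery.ExtensionFileLoader("mymodule", "/path/to/mymodule.so")
spec = importlib.util.spec_from_loader("mymodule", loader)
mymodule = importlib.util.module_from_spec(spec)
spec.loader.exec_module(mymodule)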
Recently I have been reading some of the official documentation about modules and the import system in Python.
https://docs.python.org/3/reference/import.html
https://docs.python.org/3/tutorial/modules.html
https://docs.python.org/3/reference/simple_stmts.html#import
I noticed sys.modules, which holds all the modules that have been loaded.
If I run a script like
import sys
print(sys.modules.keys())
I get the names of the modules that have been loaded by default, plus the sys module. (I imported sys explicitly, but it may also be loaded by default, because the import statement first checks whether the module to be imported has already been loaded, and only does the loading and initialization if it has not.)
I found that there is a set of modules called built-in modules, but when I checked with sys.builtin_module_names I found that only some of them are loaded by default. I also noticed that some of the modules loaded by default come from the Python standard library: https://docs.python.org/3/tutorial/modules.html#standard-modules and https://docs.python.org/3/library/. (Maybe the standard library should also be taken to contain all the built-in modules.)
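For reference, the comparison described above can be done like this (a minimal sketch):

import sys

# Modules compiled directly into the interpreter binary
print(sys.builtin_module_names)

# The subset of built-in modules already initialized at startup
print(sorted(set(sys.builtin_module_names) & set(sys.modules)))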
So I want to know which modules are loaded by default, and in what form. Is there any official explanation of this?
The answer is in the documentation for "standard modules" that you linked to:
Python comes with a library of standard modules, described in a separate document, the Python Library Reference ("Library Reference" hereafter). Some modules are built into the interpreter; these provide access to operations that are not part of the core of the language but are nevertheless built in, either for efficiency or to provide access to operating system primitives such as system calls.
In Python (CPython) we can import a module with import module, where module can be just a *.py file (containing Python code) or a module written in C/C++ (extending Python). Such a module is a compiled shared object file (like a *.so on Unix).
I would like to know exactly how the interpreter executes it.
I think a Python module is compiled to bytecode, which is then interpreted. In the case of a C/C++ module, functions from the module are simply executed: jump to the address and start execution.
Please correct me if I am wrong. Please say more.
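For the pure-Python half of this, the bytecode that the interpreter loop executes is easy to inspect with the standard dis module; a minimal sketch:

import dis

def add_one(x):
    return x + 1

# Show the bytecode the CPython evaluation loop will interpret
dis.dis(add_one)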
When you import a C extension, Python uses the platform's shared library loader to load the library and then, as you say, jumps to a function in the library. But you can't load just any library or jump to any function this way. It only works for libraries specifically implemented to support Python, and for functions that the library exports as Python objects. The library must understand Python objects and use those objects to communicate.
Alternatively, instead of importing, you can use a foreign-function library like ctypes to load the library and convert data to its C representation to make calls.
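A minimal ctypes sketch; libm is used here only as an example of a plain C library that knows nothing about Python (the lookup is platform-specific and may need adjusting):

import ctypes
import ctypes.util

# Load the C math library, a plain C shared library
libm = ctypes.CDLL(ctypes.util.find_library("m"))

# Describe the C view of the data: double cos(double)
libm.cos.restype = ctypes.c_double
libm.cos.argtypes = [ctypes.c_double]

print(libm.cos(0.0))  # 1.0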
When writing a Cython implementation file (.pyx), is it possible to define functions that serve as the __attribute__((__constructor__)) or __attribute__((__destructor__)) of the shared library that Cython will create as the extension module?
It wouldn't make sense to place these as prototype declarations in a header file unless you also then defined them in a C file and compiled it such that the function implementations could be cimported... but then the __constructor__-ness of the function would apply to that compiled library, not to the one compiled by Cython.
Note that I am not referring to __cinit__ or __dealloc__ for extension types but instead to C-specific functions that are designed to run upon library load or unload.
It is not clear to me whether the requirements of a module built with the CPython API (e.g. Py_InitModule) prevent the module from also having other functions built as entry hooks that are automatically invoked when the library is loaded (or unloaded, in the destructor case).
I am wrapping C++ code for use in Python using SWIG. The C++ module I am wrapping depends on other C++ modules located within a different package. However, rather than directly importing/including those files, I would like to import a previously created Python library/dynamic shared library to deal with the dependencies. I want to use this method because I do not want to hard-code files in this package for it to work. I will simply have access to the shared library.
Currently, without importing the library, compiling the new module with the wrapper file results in the error:
fatal error: string: No such file or directory
compilation terminated.
because a header file the new module depends on is not available within this package. I do not want to copy the required headers into this package.
I would like to know if this is possible within either the SWIG interface file or CMake.
Thanks for your help.
I am working on Python version 2.7.
I have a module extension for Python written in C.
The module initialization function (PyMODINIT_FUNC initmymodule) contains some code that initializes the OpenSSL library. My module is built as a shared library and loaded via imp.load_dynamic.
This module may be loaded many times, and I can't control that: Django and Python do it. When it is loaded twice, the OPENSSL_config function is called twice as well, and that crashes the process.
I can't control this from the C code, and I can't control it from the Python code.
Here, look at the docs:
http://docs.python.org/2.7/library/imp.html
It says:
imp.load_dynamic: Load and initialize a module implemented as a dynamically loadable shared library and return its module object. If the module was already initialized, it will be initialized again.
Nice.
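A minimal illustration of that behavior, assuming a hypothetical mymodule.so whose init function runs the OpenSSL setup:

import imp  # Python 2.7

# Each call runs the extension's init function again
m1 = imp.load_dynamic("mymodule", "./mymodule.so")
m2 = imp.load_dynamic("mymodule", "./mymodule.so")  # initmymodule runs a second time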
I found that a similar problem was solved in Python 3.4:
http://hg.python.org/cpython/file/ad51ed93377c/Python/import.c#l459
Modules which do support multiple initialization set their m_size field to a non-negative number (indicating the size of the module-specific state). They are still recorded in the extensions dictionary, to avoid loading shared libraries twice.
But what shall I do in Python 2.7?
Maybe you can work around it by registering your own custom import hook, in which you could handle the case that causes your problem (prevent double initialization); see the sketch after the list below. Some references for writing custom import hooks:
Python import hooks article
PEP-302 New Import Hooks - python 2.3+
create and register custom import/reload functions - example of implementation in project lazy_reload
This is a hackish solution, so I suggest extra caution if it is to be used in production systems.
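As a simplified sketch of the idea (a guard in the spirit of those hooks rather than a full PEP 302 finder/loader; the helper name is hypothetical):

import sys
import imp  # Python 2.7

_loaded_extensions = {}

def load_dynamic_once(name, path):
    # Return the cached module instead of re-running its init function
    if name in _loaded_extensions:
        return _loaded_extensions[name]
    module = imp.load_dynamic(name, path)
    _loaded_extensions[name] = module
    sys.modules[name] = module
    return module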
I have found the cause of my problem. It happens because my Django application uses a driver to connect to PostgreSQL, and that driver loads the OpenSSL library as well. This leads to a conflict, just as user315052 showed in this comment.
I think I have to move all the crypto functionality of my application out into a separate process.