Statically linking Python, but still supporting external .pyd modules - python

I'm looking at statically linking Python with my application. The reason is that in some test cases I've seen a 10% speed increase. My application uses the Python C-API heavily, and it seems that Whole Program Optimization is able to do some good optimizations. I expect Profile Guided Optimizations will gain a little more too. This is all being done in MSVC2015.
So far I've recompiled the pythoncore project (python35.dll) into a static library and linked that with my application (let's call it myapp.exe). FYI other than changing the project type to static, the only other thing that needs doing is setting the define Py_NO_ENABLE_SHARED during the static lib compile, and when compiling myapp.exe. This works fine and it's how I was able to obtain the 10% speed improvement test result.
So the next step is continuing to support external python modules that have .pyd files (which are .dll files renamed to .pyd). These modules will have been compiled expecting to dynamically link with python35.dll, so I need to provide a workaround for that requirement, since all of the python functions are now embedded into myapp.exe.
First I use a .def file to export all of the public Python functions from myapp.exe. This works fine.
The missing piece is how to create a python35.dll that redirects all the calls to the functions exported from myapp.exe.
My first attempt is using DLL forwarding. I made a custom python35.dll which has a .def file with lines such as:
PyArg_Parse=myapp.PyArg_Parse
In theory, this works. If I use Dependency Walker on socket.pyd, it correctly opens my python35.dll and shows that all the calls are being forwarded to myapp.exe.
However, when actually running myapp.exe and trying to import socket, it fails to load the required entry points from myapp.exe. 'import socket' in Python will cause a LoadLibrary("socket.pyd") to occur. This will load my custom python35.dll implicitly. The failure occurs while trying to load python35.dll: it's unable to find the entry points for its forwards. It seems the reason is that myapp.exe won't be part of the library search path. I seem to be able to verify this by copying myapp.exe to myapp.dll. If I do that, then the python35.dll load works; however, this isn't a solution, since it results in 2 copies of the Python environment (one in myapp.exe, one in myapp.dll).
Possible other avenues I've also looked into but haven't found the right solution for:
Somehow getting .exe files to be part of the library search path
Using Windows manifest/configuration to redirect the library somehow
Manually using __declspec(naked) and jmp statements to more explicitly wrap the .dll. I'm working in x64, so I don't think that's possible anymore?
I could manually do the whole Python API and wrap each function manually. This is doable if I can find a way to create the function definitions of all the exports so it's not an insane amount of manual work.
In summary, is there a way to redirect/forward calls to a .dll to functions/data exported from an .exe? Thanks!
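The forwarding .def file described in the question contains one line per export, in the form PyArg_Parse=myapp.PyArg_Parse. A minimal sketch of generating such a file, assuming the list of exported function names is already available (for example, taken from an export listing of the stock python35.dll). The function name make_forwarding_def and its parameters are hypothetical, and data exports would need extra handling that is not shown here. This only produces the file; it does not address the loader problem described above.

```python
def make_forwarding_def(library, target, names):
    """Build the text of a .def file that forwards each name to `target`.

    library: name for the LIBRARY statement (e.g. "python35").
    target:  module the exports are forwarded to (e.g. "myapp").
    names:   iterable of exported function names (assumed to be known).
    """
    lines = [f"LIBRARY {library}", "EXPORTS"]
    # Sort and de-duplicate so the generated file is stable between runs.
    for name in sorted(set(names)):
        lines.append(f"    {name}={target}.{name}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical short list; a real one would hold every public export.
    print(make_forwarding_def("python35", "myapp",
                              ["PyArg_Parse", "Py_Initialize"]))
```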

I ended up going with the solution that @martineau suggested in the comments, which was to put all of my application, including Python, into a single .dll instead of an .exe. Then the .exe is just a simple file that calls into the .dll and does nothing else.

Related

Dynamically importing .py files after compiling

I've tried looking online and I'm honestly lost at this point.
I'm trying to find if there's a way to import python scripts and run them AFTER my Python program has been compiled.
For an example, let's say I have a main.py such that:
from modules.NewModule import NewModuleClass
a = NewModuleClass()
a.showYouWork()
then I compile main.py with pyinstaller so that my directory looks like:
main.exe
modules/NewModule.py
My end goal is to make a program that can dynamically read new Python files in a folder (this will be coded in) and use them (the part I'm struggling with). I know it's possible, since that's how add-ons work in Blender 3D but I've struggled for many hours to figure this out. I think I'm just bad at choosing the correct terms in Google.
Maybe I just need to convert all of the Python files in the modules directory to .pyc files? Then, how would I use them?
Also, if this is a duplicate on here (it probably is), please let me know. I couldn't find this issue on this site either.
You may find no detailed answer simply because there is no problem. PyInstaller does not really compile Python scripts into machine code executables. It just assembles them into a folder along with an embedded Python interpreter, or alternatively creates a compressed single-file executable that automatically uncompresses itself at run time into a temporary folder containing the same thing.
From then on, you have an almost standard Python environment, with normal .pyc files which can contain normal Python instructions, such as calls to importlib to dynamically load other Python modules. You just have to append the directory containing the modules to sys.path before importing them. Another possible caveat is that PyInstaller only bundles the required modules and not a full Python installation, so you must either make sure that the dynamic modules do not rely on missing standard modules, or be prepared to face an ImportError.
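The approach above (append the directory to sys.path, then import by name with importlib) could look roughly like this. It is a sketch under assumptions: load_plugins is a hypothetical helper, the plugin directory is passed in by the caller, and each plugin is a plain .py file directly inside that directory.

```python
import importlib
import sys
from pathlib import Path


def load_plugins(plugin_dir):
    """Import every top-level .py file in plugin_dir and return {name: module}."""
    plugin_dir = Path(plugin_dir).resolve()
    # Make the directory importable, as the answer suggests.
    if str(plugin_dir) not in sys.path:
        sys.path.append(str(plugin_dir))
    # Files may have been added after startup, so refresh the finder caches.
    importlib.invalidate_caches()

    loaded = {}
    for path in sorted(plugin_dir.glob("*.py")):
        if path.stem.startswith("_"):
            continue  # skip private helpers such as __init__.py
        try:
            loaded[path.stem] = importlib.import_module(path.stem)
        except ImportError as exc:
            # A plugin may need a standard module the bundle does not ship.
            print(f"Skipping {path.name}: {exc}")
    return loaded
```

In a bundled application the directory would typically be resolved relative to the executable rather than the current working directory; how to do that depends on the bundling mode and is left out here.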

How to compile single python scripts (not to exe)?

I know there is a lot of debate within this topic.
I did some research and looked into some of the questions here, but none matched exactly.
I'm developing my app in Django, using Python 3.7. I'm not looking to convert my app into a single .exe file; it wouldn't be reasonable to do so, if it is even possible.
However, I have seen some apps developed in JavaScript that use bytenode to compile code to .jsc.
Is there such a thing for python? I know there is .pyc, but for all I know those are just runtime compiled files, not actually a bytecode precompiled script.
I wanted to protect the source code on some files that can compromise the security of the app. After all, deploying my app means deploying a fully fledged python installation with a web port open and an app that works on it.
What do you think, is there a way to do it, does it even make sense to you?
Thank you
The precompiled (.pyc) files are what you are looking for. They contain pre-optimized bytecode that can be run by the interpreter even when the original .py file is absent.
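You can build the .pyc files directly using python -m py_compile <filename>. Running the interpreter with -O or -OO produces a more optimized variant: -O removes assert statements and -OO additionally removes docstrings, which reduces the file size; identifier names are kept. Before Python 3.5 these files used the .pyo extension; since then they are written as .opt-1.pyc and .opt-2.pyc.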
Note that it might still be possible to decompile the generated bytecode with enough effort, so don't use it as a security measure.
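A minimal sketch of shipping a module without its source, using the standard py_compile module. One detail worth knowing on Python 3: for an import to work when the .py file is absent, the .pyc has to sit where the source would have been (module.pyc), not inside __pycache__. The helper name compile_sourceless is hypothetical.

```python
import importlib
import py_compile
import sys
from pathlib import Path


def compile_sourceless(src):
    """Compile src (a .py path) to a .pyc next to it and return the .pyc path."""
    src = Path(src)
    target = src.with_suffix(".pyc")
    # cfile overrides the default __pycache__ location; doraise surfaces errors.
    py_compile.compile(str(src), cfile=str(target), doraise=True)
    return target
```

After calling compile_sourceless on a file, the original .py can be removed from the deployment and the module still imports by name from that directory. As the answer notes, the bytecode can still be decompiled, so this hides the source from casual inspection only.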

Python ctypes and DLL that uses a COM object

Under Windows, I'm trying to use a 3rd party DLL (SomeLib.dll) programmed in C++ from Python 2.7 using ctypes.
For some of its features, this library uses another COM DLL (SomeCOMlib.dll), which itself uses other DLL (LibA.dll).
Please note that this isn't about using a COM DLL directly from Python, but it's about using a DLL that uses it from Python.
To make the integration with Python easier, I've grouped the calls I want to use into my own functions in a new DLL (MyLib.dll), also programmed in C++, simply to make the calls from ctypes more convenient (using extern "C" and tailoring the functions for specific scenarios).
Essentially, my library exposes 2 functions: doSomethingSimple(), doSomethingWithCOMobj() (all returning void, no parameters).
The "effective" dependency hierarchy is as follows:
MyLib.dll
SomeLib.dll
SomeCOMlib.dll
LibA.dll
I'm able to write a simple C++ console application (Visual C++) that uses MyLib.dll and makes those 2 consecutive calls without problem.
Using Python/ctypes, the first call works fine, but the call that makes use of COM throws a WindowsError: [Error -529697949] Windows Error 0xE06D7363.
From the rest of the library behaviour, I can see that the problem comes exactly where that COM call is made.
(The simple test C++ application also fails more or less at the same place if LibA.dll is missing, but I'm not sure if it's related.)
I've looked at the dependency hierarchy using Dependency Walker.
SomeCOMlib.dll isn't listed as a dependency of SomeLib.dll, although it's clearly required, and LibA.dll isn't listed as a dependency of SomeCOMlib.dll, although it's also clearly required at run time.
I'm running everything from the command line, from within the directory where these DLLs are (and the C++ sample executable works fine).
I've tried to force the PATH to include that directory, and I've also tried to copy the DLLs to various places where I guessed they might be picked up (C:\Windows\System32 and C:\Python27\DLLs), without success.
SomeCOMlib.dll was also registered with regasm.exe.
What could cause this difference between using this DLL from a plain C++ application and from Python's ctypes when it comes to its own usage of the COM mechanism (and possibly the subsequent loading of other DLLs there)?
Which steps would give at least a bit more information than Windows Error 0xE06D7363 from Python, to be able to investigate the problem further?
The Python code looks like this:
import ctypes
myDll = ctypes.WinDLL("MyLib.dll")
myDll.doSomethingSimple()
myDll.doSomethingWithCOMobj() # This statement throws the Windows Error
(The test C++ standalone application that works, linked to MyLib.dll makes exactly the same calls within main.)
When you need an in-proc COM object, you don't link directly to the implementing DLL. You usually use CoCreateInstance/CoCreateInstanceEx, which will load the DLL for you.
The lookup goes through the application's manifest and its dependent assembly manifests. This is done to support registration-free COM.
If there's no application manifest or if none of the dependent assembly manifests declare your class in a comClass XML element, the lookup defaults to the registry, which will check HKEY_CLASSES_ROOT\CLSID [1] for a subkey named {<your-CLSID>}, itself with an InprocServer32 subkey stating the DLL.
This explains why SomeCOMlib.dll doesn't appear as a dependency. It doesn't explain why LibA.dll doesn't appear as a dependency of SomeCOMlib.dll; that is probably because it's loaded dynamically. If you profile your app within Dependency Walker, you'll see a log of LoadLibrary calls in the bottom pane. To profile it, open your Python executable in Dependency Walker, then go to menu option Profile->Start profiling..., set the parameters to run your .py file and click Ok.
The 0xE06D7363 exception code is a Visual C++ exception code. You should check the source code of doSomethingWithCOMobj. To debug it, use your preferred tool (Visual C++, WinDbg, etc.), open Python's executable, setting up the arguments to run your .py file, and enable a breakpoint on the first statement of the function before running the application. Then run it and step through each instruction.
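The two numbers in the error message are the same value: ctypes reports the exception code as a signed 32-bit integer, and -529697949 is 0xE06D7363 when viewed as unsigned. The low three bytes of that code are the ASCII characters "msc". A small sketch that makes the relationship visible (the function names are illustrative only):

```python
def to_unsigned32(code):
    """Reinterpret a signed 32-bit error code as unsigned."""
    return code & 0xFFFFFFFF


def describe_exception_code(code):
    """Format a Windows exception code and show its low three bytes as ASCII."""
    value = to_unsigned32(code)
    # Big-endian bytes: E0 6D 73 63; drop the leading 0xE0 flag byte.
    tag = value.to_bytes(4, "big")[1:].decode("ascii", errors="replace")
    return f"0x{value:08X} (low bytes spell {tag!r})"


if __name__ == "__main__":
    print(describe_exception_code(-529697949))
    # → 0xE06D7363 (low bytes spell 'msc')
```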
It's really hard to guess what's different about your native C++ application and Python, but it may be that different COM initialization arguments are used by Python and doSomethingWithCOMobj, or that you haven't declared it __stdcall (although being a void function that shouldn't matter), or that it tries to write to stdout and you're using pythonw.exe which isn't a console application, etc.
[1] HKEY_CLASSES_ROOT is a merged view of HKEY_CURRENT_USER\Software\Classes and HKEY_LOCAL_MACHINE\Software\Classes.
You could use something like Process Explorer to find out if all the required libraries are loaded into the running process.
Did you initialize the COM runtime somewhere in your Python code? In your C++ code?
One difference between C++ code and Python/ctypes is that in the C++ code the dll's DllMain() function is called when the dll is loaded, Python/ctypes doesn't do this because of the fully dynamic loading of the dll.
I'm going to take a guess that you have the same issue that I had, although I don't really know because unlike this case, I know nothing about the dll I am using. 6 years later I don't know if you can still test this, but I wonder if importing pythoncom would solve the problem.
import ctypes
import pythoncom
myDll = ctypes.WinDLL("MyLib.dll")
myDll.doSomethingSimple()
myDll.doSomethingWithCOMobj() # This statement throws the Windows Error

Embedding Python on Windows: why does it have to be a DLL?

I'm trying to write a software plug-in that embeds Python. On Windows the plug-in is technically a DLL (this may be relevant). The Python Windows FAQ says:
1. Do not build Python into your .exe file directly. On Windows, Python must be a DLL to handle importing modules that are themselves DLL’s. (This is the first key undocumented fact.) Instead, link to pythonNN.dll; it is typically installed in C:\Windows\System. NN is the Python version, a number such as “23” for Python 2.3.
My question is: why exactly must Python be a DLL? If, as in my case, the host application is not an .exe but also a DLL, could I build Python into it? Or, perhaps, does this note mean that third-party C extensions rely on pythonNN.dll being present, and another DLL won't do? Assuming that I really want to have a single DLL, what should I do?
I see there's the dynload_win.c file, which appears to be the module to import C extensions on Windows and, as far as I can see, it scans the extension file to find which pythonX.X.dll it imports; but I'm not experienced with Windows and I don't quite understand all the code there.
You need to link to pythonXY.dll as a DLL, instead of linking the relevant code directly into your executable, because otherwise the Python runtime can't load other DLLs (the extension modules it relies on.) If you make your own DLL you could theoretically link all the Python code in that DLL directly, since it doesn't end up in the executable but still in a DLL. You'll have to take care to do the linking correctly, however, as pretty much none of the standard tools (like distutils) will do this for you.
However, regardless of how you embed Python, you can't make do with just the DLL, nor can you make do with just any DLL. The ABI changes between Python versions, so if you compiled your code against Python 2.6, you need python26.dll; you can't use python25.dll or python27.dll. And Python isn't just a DLL; it also needs its standard library, which includes extension modules (which are DLLs themselves, although they have the .pyd extension). The code in dynload_win.c you ran into is for loading those DLLs, and is not related to the loading of pythonXY.dll.
In short, in order to embed Python in your plugin, you need to either ship Python with the plugin, or require that the right Python version is already installed.
(Sorry, I did a stupid thing, I first wrote the question, and then registered, and now I cannot alter it or comment on the replies, because StackOverflow's engine doesn't think I'm the author. I cannot even properly thank those who replied :( So this is actually an update to the question and comments.)
Thanks for all the advice, it's very valuable. As far as I understand with some effort I can link Python statically into a custom DLL, provided that I compile other dynamically loaded extensions myself and link them against the same DLL. (I know I need to ship the standard library too; my plan was to append a zipped archive to the DLL file. As far as I understand, I will even be able to import pure Python modules from it.)
I also found an interesting place in dynload_win.c. (I understand it loads dynamic extensions that use Python C API, e.g. _ctypes.) As far as I can see it not only looks for init_ctypes symbol or whatever the extension name is, but also scans the .pyd file's import table looking for (regex) python\d+\. and then compares the found symbol with known pythonNN. string to make sure the extension was compiled for this version of Python. If the import table doesn't have such a symbol or it refers to another version, it raises an error.
For me it means that:
If I link an extension against pythonNN.dll and try to load it from my custom DLL that includes a statically linked Python, it will pass the check, but — well, here I'm not sure: will it fail because there's no pythonNN.dll (i.e. even before getting to the check) or it will happily load the symbols?
And if I link it against my custom DLL, it will find symbols, but won't pass the check :)
I guess I could rewrite this piece to suit my needs... Are there any other such places, I wonder.
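The check described above can be modelled in a few lines. This is only a sketch of the behaviour as described in this post, not a port of dynload_win.c: it assumes the list of DLL names from the extension's import table has already been extracted, and the function names are hypothetical.

```python
import re

# Pattern from the post: "python", one or more digits, then a dot.
PYTHON_DLL_RE = re.compile(r"python\d+\.", re.IGNORECASE)


def find_python_import(imported_dlls):
    """Return the first imported DLL name that looks like pythonNN.dll, else None."""
    for name in imported_dlls:
        if PYTHON_DLL_RE.match(name):
            return name
    return None


def extension_is_compatible(imported_dlls, expected_dll):
    """True only if the extension imports exactly the expected pythonNN.dll."""
    found = find_python_import(imported_dlls)
    # Per the description above: a missing or mismatched import is rejected.
    return found is not None and found.lower() == expected_dll.lower()
```

With this model, an extension linked against a custom DLL name (the second case in the list above) is rejected because no pythonNN. entry appears among its imports.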
Python needs to be a dll (with a standard name) such that your application, and the plugin, can use the same instance of python.
Plugin dlls are already going to expect to be loading (and using python from) a python26.dll (or whichever version) - if your python is statically embedded in your exe, then two different instances of the python library would be managing the same data structures.
If the Python libraries use no static variables at all, and the compile settings are exactly the same, this should not be a problem. However, generally it's far safer to simply ensure that only one instance of the Python interpreter is being used.
On *nix, all shared objects in a process, including the executable, contribute their exported names into a common pool; any of the shared objects can then pull any of the names from the pool and use them as they like. This allows e.g. cStringIO.so to pull the relevant Python library functions from the main executable when the Python library is statically-linked.
On Windows, each shared object has its own independent pool of names it can use. This means that it must explicitly name the specific shared objects it needs functions from. Since it is a lot of work to get all the names from the main executable, the Python functions are separated out into their own DLL.

Can I use zipimport to ship an embedded python?

Currently, I'm deploying a full Python distribution (the original Python 2.7 MSI) with my app, which is an embedded web server made with Delphi.
Reading this, I wonder if it is possible to embed the necessary Python files with my app, to reduce the number of files loaded and avoid conflicts between several Python versions.
I have previous experience with Python for Delphi, so I only need to know whether shipping just the Python DLL + a zip with the distro + my own scripts will work (and whether there are any caveats I must know about, or a sample I can look at).
zipimport should work just fine for you -- I'm not familiar with Python for Delphi, but I doubt it disables that functionality (an embedding application can do that, but it's an unusual choice). Just remember that what you can zip up and import directly are the Python-coded modules (or just their corresponding .pyc or .pyo byte codes) -- DLLs (even if renamed as .pyds;-) need to be on disk to be loaded (so if you have a zipfile with them it will need to be unzipped at the start of the app, e.g. into a temporary directory).
Moreover, you don't even need to zip up all modules, just those you actually need (by transitive closure) -- and you can easily find out exactly which modules those are, with the modulefinder module of the standard Python library. The example on the documentation page I just pointed to should clarify things. Happy zipping!
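A minimal sketch of the zip-based import described above: pure-Python modules are written into an archive, the archive path is put on sys.path, and the standard import machinery (zipimport) does the rest. The helper names and the idea of building the archive from in-memory strings are assumptions made to keep the example self-contained.

```python
import importlib
import sys
import zipfile


def build_bundle(zip_path, sources):
    """Write {archive_name: source_text} entries into a new zip archive."""
    with zipfile.ZipFile(zip_path, "w") as archive:
        for name, text in sources.items():
            archive.writestr(name, text)


def import_from_zip(zip_path, module_name):
    """Put the archive on sys.path and import module_name from it."""
    zip_path = str(zip_path)
    if zip_path not in sys.path:
        sys.path.insert(0, zip_path)
    importlib.invalidate_caches()
    return importlib.import_module(module_name)
```

As the answer points out, this works for Python-coded modules only; .pyd extension modules in the archive would still have to be extracted to disk before they can be loaded.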
Yes it is possible.
I'm actually writing automation scripts in Python with the zipimport library. I included every .py file in my zip, as well as the configuration and XML files needed by those scripts.
Then I call a .command file targeting a __main__.py file that redirects to the desired script according to my sys.argv parameters, which is really useful!
