Minimize python distribution size - python

The hindrance we have to ship python is the large size of the standard library.
Is there a minimal python distribution or an easy way to pick and choose what we want from
the standard library?
The platform is linux.

If all you want is to get the minimum subset you need (rather than build an exe which would constrain you to Windows systems), use the standard library module modulefinder to list all modules your program requires (you'll get all dependencies, direct and indirect). Then you can zip all the relevant .pyo or .pyc files (depending on whether you run Python with or without the -O flag) and just use that zipfile as your sys.path (plus a directory for all the .pyd or .so native-code dynamic libraries you may need -- those need to live directly in the filesystem to let the OS load them in as needed, can't be loaded directly from a zipfile the way Python bytecode modules can, unfortunately).

Have you looked at py2exe? It provides a way to ship Python programs without requiring a Python installation.

Like Hank Gay and Alex Martelli suggest, you can use py2exe. In addition I would suggest looking into using something like IronPython. Depending on your application, you can use libraries that are built into the .NET framework (or MONO if for Linux). This reduces your shipping size, but adds minimum requirements to your program.
Further, if you are using functions from a library, you can use from module import x instead of doing a wildcard import. This reduces your ship size as well, but maybe not by too much
Hope this helps

Related

Call a C library from Python

I've found a library which is in C (?), that I want to try from Python:
https://github.com/centaurean/density
I'd like to see if I can store compressed files on disk and decompress them in memory for faster full-file reading times.
I'm new to using external code from Python. How can I create a Python function that uses this library with a little overhead as possible?
(I will work with Windows or Linux)
One way is to compile the library into a dynamic library (.dll on Windows or .so on Linux) and use ctypes (https://docs.python.org/2/library/ctypes.html) to access it.
ctypes provides an easy path into doing exactly this through FFI, with the alternative of using something like SWIG to interface with the python libraries at a much lower layer.
I would recommend generating a little micro-benchmark program to record what the performance is before and after such an experiment, so that you have concrete values to use in choosing if you want to run with ctypes or python-swig code long term over doing something natively in python, which you could trial in c-python or a variant (such as pypy)
Fortunately for you, Python comes with an FFI built-in called ctypes. It works with both Linux and Windows, and there are plenty of tutorials out there to learn how to use it.
Note that you will need to compile the library beforehand into a shared lib (DLL in Windows, SO in Linux).

Distributing a Python module with ffmpeg dependency

Rookie software developer here. I’m working on a Python module that harnesses some functionality from the FFmpeg framework - specifically, the ebur128 filter function. Ideally the module will stand on its own as an independent, platform agnostic tool for verifying that audio clips comply with EBU loudness standards. It’s being designed so that end users need only perform one simple, (hopefully!) painless installation procedure, which will encompass the installation of both the FFmpeg libraries and my Python wrapper/GUI.
I apologize for the rather vague question, but does anyone have general advice for creating Python module with external dependencies, or specific advice for standardizing the FFmpeg installation across platforms? Distutils seems pretty helpful – are there other guidelines or standard practices for developing a neatly packaged Python tool? I want to minimize any installation headaches for end users.
Thanks very much.
For Windows
I think it will be easy to find ffmpeg binaries that work on any system, just like for Qt or whatever GUI library you are using. You can ship these binaries with your project and things will work (you may want to distinguish 32 bit and 64 bit systems, though).
It looks like you want to create a software that is self-contained and easily installable for end-users. Inkscape is such an example -- its installer contains Python and all other dependencies, in binary form (if required). That is, for Windows, you do not need to create a real Python package (which would allow installation with pip), and you do not need to look into distutils (which supports building C extensions). Both you do not need/want, I guess.
Maybe it will be enough for you to assemble a good directory structure and to distribute a ZIP archive with your software. This is enough if you do not need to interact with the Windows registry, for instance. Such programs are usually called "standalone", in the Windows world. However, you might still want to have a real Windows installer (even if it is just a self-extracting archive). The following article covers your requirements, I believe: http://cyrille.rossant.net/create-a-standalone-windows-installer-for-your-python-application/
It suggests using http://www.jrsoftware.org/isinfo.php for creating such an installer.
Other platforms
On other operating systems it will be more difficult. For instance, I think it will be almost impossible to create ffmpeg binaries that run on every Linux system, because ffmpeg itself has so many binary dependencies. I do not know whether you can statically build ffmpeg at all.

Insert Image into PPT using Standard Libraries

I know this is possible to do using additional libraries such as win32com or python-pptx, but I wasn wondering if anyone knew of a way to insert an image into a powerpoint slide using the standard libraries. Lots of googling has indicated that the best solution is probably win32com, but since I can guarantee that every system this script will be deployed to will have win32com, I am looking for an implemention leveraging libraries all systems with a standard python 2.7 install will have.
It is probably possible to modify a .pptx file with the standard library without much effort: these new generation of files are meant to be zip-compressed XML + external images files, and can be handled by ziplib and standard xml parsers.
Legacy .ppt files however are a binary closed format, with little documentation, and hundrededs of corner cases. It would alwasys "be possible" to change them, since they are still just bytes, but it would take considerable effort.
That said, starting with Python 3.4, the Python installer "PIP" comes default with the language install: probably the best way to go would be to script the installation of external libraries based on the built-in PIP - that way one would not have to all external library usage.

Embedding Python on Windows: why does it have to be a DLL?

I'm trying to write a software plug-in that embeds Python. On Windows the plug-in is technically a DLL (this may be relevant). The Python Windows FAQ says:
1.Do not build Python into your .exe file directly. On Windows, Python must be a DLL to handle importing modules that are themselves DLL’s. (This is the first key undocumented fact.) Instead, link to pythonNN.dll; it is typically installed in C:\Windows\System. NN is the Python version, a number such as “23” for Python 2.3.
My question is why exactly Python must be a DLL? If, as in my case, the host application is not an .exe, but also a DLL, could I build Python into it? Or, perhaps, this note means that third-party C extensions rely on pythonN.N.dll to be present and other DLL won't do? Assuming that I'd really want to have a single DLL, what should I do?
I see there's the dynload_win.c file, which appears to be the module to import C extensions on Windows and, as far as I can see, it scans the extension file to find which pythonX.X.dll it imports; but I'm not experienced with Windows and I don't quite understand all the code there.
You need to link to pythonXY.dll as a DLL, instead of linking the relevant code directly into your executable, because otherwise the Python runtime can't load other DLLs (the extension modules it relies on.) If you make your own DLL you could theoretically link all the Python code in that DLL directly, since it doesn't end up in the executable but still in a DLL. You'll have to take care to do the linking correctly, however, as pretty much none of the standard tools (like distutils) will do this for you.
However, regardless of how you embed Python, you can't make do with just the DLL, nor can you make do with just any DLL. The ABI changes between Python versions, so if you compiled your code against Python 2.6, you need python26.dll; you can't use python25.dll or python27.dll. And Python isn't just a DLL; it also needs its standard library, which includes extension modules (which are DLLs themselves, although they have the .pyd extension.) The code in dynload_win.c you ran into is for loading those DLLs, and are not related to loading of pythonXY.dll.
In short, in order to embed Python in your plugin, you need to either ship Python with the plugin, or require that the right Python version is already installed.
(Sorry, I did a stupid thing, I first wrote the question, and then registered, and now I cannot alter it or comment on the replies, because StackOverflow's engine doesn't think I'm the author. I cannot even properly thank those who replied :( So this is actually an update to the question and comments.)
Thanks for all the advice, it's very valuable. As far as I understand with some effort I can link Python statically into a custom DLL, provided that I compile other dynamically loaded extensions myself and link them against the same DLL. (I know I need to ship the standard library too; my plan was to append a zipped archive to the DLL file. As far as I understand, I will even be able to import pure Python modules from it.)
I also found an interesting place in dynload_win.c. (I understand it loads dynamic extensions that use Python C API, e.g. _ctypes.) As far as I can see it not only looks for init_ctypes symbol or whatever the extension name is, but also scans the .pyd file's import table looking for (regex) python\d+\. and then compares the found symbol with known pythonNN. string to make sure the extension was compiled for this version of Python. If the import table doesn't have such a symbol or it refers to another version, it raises an error.
For me it means that:
If I link an extension against pythonNN.dll and try to load it from my custom DLL that includes a statically linked Python, it will pass the check, but — well, here I'm not sure: will it fail because there's no pythonNN.dll (i.e. even before getting to the check) or it will happily load the symbols?
And if I link it against my custom DLL, it will find symbols, but won't pass the check :)
I guess I could rewrite this piece to suit my needs... Are there any other such places, I wonder.
Python needs to be a dll (with a standard name) such that your application, and the plugin, can use the same instance of python.
Plugin dlls are already going to expect to be loading (and using python from) a python26.dll (or whichever version) - if your python is statically embedded in your exe, then two different instances of the python library would be managing the same data structures.
If the python libraries use no static variables at all, and the compile settings are exactly the same this should not be a problem. However, generally its far safer to simply ensure that only one instance of the python interpreter is being used.
On *nix, all shared objects in a process, including the executable, contribute their exported names into a common pool; any of the shared objects can then pull any of the names from the pool and use them as they like. This allows e.g. cStringIO.so to pull the relevant Python library functions from the main executable when the Python library is statically-linked.
On Windows, each shared object has its own independent pool of names it can use. This means that it must read the relevant different shared objects it needs functions from. Since it is a lot of work to get all the names from the main executable, the Python functions are separated out into their own DLL.

Can I use zipimport to ship a embedded python?

Currently, I'm deploying a full python distribution (the original python 2.7 msi) with my app. Which is an embedded web server made with delphi.
Reading this, I wonder if is possible to embed the necessary python files with my app, to decrease load files and avoid conflict with several python versions.
I have previous experience with python for delphi so I only need to know if only shipping the python dll + zip with the distro + own scripts will work (and if exist any caveats I must know or a sample where I can look)
zipimport should work just fine for you -- I'm not familiar with Python for Delphi, but I doubt it disables that functionality (an embedding application can do that, but it's an unusual choice). Just remember that what you can zip up and import directly are the Python-coded modules (or just their corresponding .pyc or .pyo byte codes) -- DLLs (even if renamed as .pyds;-) need to be on disk to be loaded (so if you have a zipfile with them it will need to be unzipped at the start of the app, e.g. into a temporary directory).
Moreover, you don't even need to zip up all modules, just those you actually need (by transitive closure) -- and you can easily find out exactly which modules those are, with the modulefinder module of the standard Python library. The example on the documentation page I just pointed to should clarify things. Happy zipping!
Yes it is possible.
I'm actually writing automatisation script in Python with the Zipimport library. I actually included every .py files in my zip as well as configuration or xml files needed by those script.
Then, I call a .command file targeting a __main__.py class that redirect towards the desired script according to my sys.argv parameters which is really useful!

Categories

Resources