What are the limitations of distributing .pyc files? - python

I've started working on a commercial application in Python, and I'm weighing my options for how to distribute the application.
Aside from the obvious (distribute sources with an appropriate commercial license), I'm considering distributing just the .pyc files without their corresponding .py sources. But I'm not familiar enough with Python's compatibility guarantees to know if this is even workable, much less whether it's a good idea or not.
Are .pyc files independent of the underlying OS? For example, would a .pyc file generated on a 64-bit Linux machine work on a 32-bit Windows machine?
I've found that .pyc file should be compatible across bugfix releases, but what about major and minor releases? For example, would a file generated with Python 3.1.5 be compatible with Python 3.2.x? Or would a .pyc file generated with Python 2.7.3 be compatible with a Python 3.x release?
Edit:
Primarily, I may have to appease stakeholders who are uncomfortable distributing sources. Distributing .pyc's without sources may give them some level of comfort, since it would require the extra step of decompiling to get at the sources, even if that step is somewhat trivial. Just enough of a barrier to keep honest people honest.

For example, would a file generated with Python 3.1.5 be compatible with Python 3.2.x?
No.
Or would a .pyc file generated with Python 2.7.3 be compatible with a Python 3.x release?
Doubly no.
I'm considering distributing just the .pyc files without their corresponding .py sources.
Python bytecode is high-level and trivially decompilable.

You certainly could distribute the .pyc files only. As Cat mentioned, no it would not be compatible with different major version of Python. It might prevent some people from viewing the source code, but the .pyc files are very easy to decompile. Basically if you can compile it, you can decompile it.
You could use a binary packager like py2exe / py2app / freeze. I've never tried them but someone could still decompile them if they wanted to.

As Cat said, pyc files are not cross version safe. Though what you're trying to hide from the users determines what you need to do.
As for source code, there is no good way to hide Python source code in a distributed application. If you just trying to hide specific details you could pack those into a C extension -- which would be much harder to decompile.
So if you're worried about code use, put a license attached to the code for no-use or translate the sections you don't want stolen to a compiled language. If you just want code to not be obviously Python, you can create a binary executable that wraps the Python code (though doesn't hide the actual details if someone extracts them from the file).

Related

How can I wrap Python processes in an OSGi deployment

I need to integrate a body of Python code into an existing OSGi (Apache Felix) deployment.
I assume, or at least hope, that packages exist to help with this effort.
If it helps, the Python code is still relatively new and small, so can probably re re-architected to meet whatever constraints are needed. However, it must remain in Python, because of dependencies on third-party libraries.
What are suggested best practices?
The trick is to make this an extender, see 1 and 2. You want your Python code to be separate from the code that handles the interaction with the interpreter. So what you do is wrap the Python code and any native libraries in a bundle. This is trivial since it is just a zip file.
You then develop a bundle that listens to starting bundle (see the BundleTracker) that have python code. A manifest is often used but you can also look in a directory in the JAR. If you detect this code, you extract any native libraries and run the code in the interpeter of your choice.
If can use JYthon then that would be highly recommended. You can then carry the interpreter as an OSGi bundle that runs on the VM. If you need to use a native compiler your life is less rosy. You can rely on the environment to provide you with an interpreter but then why use OSGi in the first place. You basically lose the write once run anywhere advantage. You could go the full monty by creating bundles that contain Python installers for all platforms you support. Can be done, not even that hard, but a maintenance nightmare. Believe me, native code suck, it only does it a bit faster than Java.

How to safely create and import .so files in Python 3?

I am trying to use Cython to create .so binary files from our .py files and shared it with our team.
However even if we all use Python3, most of times it should be exactly similar revision (let us say 3.7.8), otherwise we get an error to import them.
Is this behavior expected?
Some of revisions are compatible. For example if we make .so with python 3.5.2 and import in 3.6.8 it works but it does not work in 3.7.8
Where does this mess comes from and what is the safest way to do this?
To follow up on my comments:
On the same platform, extension modules should work within a "minor version" (i.e. modules built with 3.7.2 and 3.7.3 should be compatible). I'm struggling to find a source for this though. Beyond that some effort has made in the past to ensure compatibility between releases, but not so much any more so it's possible you may be lucky and things work.
distutils/setuptools and other similar build mechanisms tag extension modules with a suffix indicating the version and some other details. For example, and extension would be called foo.cpython-37m.so instead of just foo.so. These tags prevent the module from being used with other Python versions and are a good thing. If you are removing these tags then this mess is entirely on you.
Python now defines a more limited stable ABI that should be compatible across Python versions. Cython is working towards supporting that but at the moment it is not in a usable state. In a year or so it should be a good solution.
In summary, .so files are not expected to be portable between different Python versions. You should either standardise on a Python version or build the .so files locally.

Distributing a Python module with ffmpeg dependency

Rookie software developer here. I’m working on a Python module that harnesses some functionality from the FFmpeg framework - specifically, the ebur128 filter function. Ideally the module will stand on its own as an independent, platform agnostic tool for verifying that audio clips comply with EBU loudness standards. It’s being designed so that end users need only perform one simple, (hopefully!) painless installation procedure, which will encompass the installation of both the FFmpeg libraries and my Python wrapper/GUI.
I apologize for the rather vague question, but does anyone have general advice for creating Python module with external dependencies, or specific advice for standardizing the FFmpeg installation across platforms? Distutils seems pretty helpful – are there other guidelines or standard practices for developing a neatly packaged Python tool? I want to minimize any installation headaches for end users.
Thanks very much.
For Windows
I think it will be easy to find ffmpeg binaries that work on any system, just like for Qt or whatever GUI library you are using. You can ship these binaries with your project and things will work (you may want to distinguish 32 bit and 64 bit systems, though).
It looks like you want to create a software that is self-contained and easily installable for end-users. Inkscape is such an example -- its installer contains Python and all other dependencies, in binary form (if required). That is, for Windows, you do not need to create a real Python package (which would allow installation with pip), and you do not need to look into distutils (which supports building C extensions). Both you do not need/want, I guess.
Maybe it will be enough for you to assemble a good directory structure and to distribute a ZIP archive with your software. This is enough if you do not need to interact with the Windows registry, for instance. Such programs are usually called "standalone", in the Windows world. However, you might still want to have a real Windows installer (even if it is just a self-extracting archive). The following article covers your requirements, I believe: http://cyrille.rossant.net/create-a-standalone-windows-installer-for-your-python-application/
It suggests using http://www.jrsoftware.org/isinfo.php for creating such an installer.
Other platforms
On other operating systems it will be more difficult. For instance, I think it will be almost impossible to create ffmpeg binaries that run on every Linux system, because ffmpeg itself has so many binary dependencies. I do not know whether you can statically build ffmpeg at all.

Why do some Python modules have to be "compiled"?

I've noticed that Python modules/packages come in two sorts. Some are just pure Python scripts and can simply be copied and pasted to the Python directory. Others however require, and I think these are usually wrappers for or based on C/C++ code, that the code is "built" and/or "compiled" with setup.py to produce a set of new files.
My questions is about the second type of module/package. Why is it that they have to be compiled, is there a particular reason for it? Couldn't the distributor just provide all the files from the beginning?
The reason I ask is because I wish to distribute such C++ based packages as part of my own packages, so that the user don't have to worry about installing dependencies on their own, and not having to worry about compiling, etc. I've always wondered why module distributors don't just include the dependencies instead of asking the user to install them themselves.
I suspect the answer might be bc those extra files have to be written in a specific way depending on what type of OS the computer is, and whether it is 32 or 64 bit. Could this mean that distributing the compiled files will work, but only if the user has the same specific OS and bit-system as the one where the files were compiled.
Anyway, curious to know the answer.
Why packages just don't include dependencies: License. You can't just add somebody else's code, compiled or not, without asking the person/company or even pay fees.
In Python, these built modules you talk about are Python Extensions. They are usually there to improve performance or access low-level functionality which is not in Python. Sometimes also to include proprietary functionality.

Python portability issues

Basically, I am a Java programmer who wants to learn Python language. I want to clarify why some of python libaries are distributing using non-portable manner.
Let me explain my thoughts. If someone creates a regular library using Java he prepares 1 (one) JAR file which can be used on different platforms:
my-great-lib-1.2.4.jar
I can use this lib (the same file) on any version of Windows or Linux.
In contrast to Java, python libraries may look like this:
bsdiff4-1.1.4.win-amd64-py2.5.exe
bsdiff4-1.1.4.win-amd64-py2.6.exe
bsdiff4-1.1.4.win-amd64-py2.7.exe
bsdiff4-1.1.4.win-amd64-py3.2.exe
bsdiff4-1.1.4.win-amd64-py3.3.exe
bsdiff4-1.1.4.win32-py2.5.exe
bsdiff4-1.1.4.win32-py2.6.exe
bsdiff4-1.1.4.win32-py2.7.exe
bsdiff4-1.1.4.win32-py3.2.exe
bsdiff4-1.1.4.win32-py3.3.exe
See full list on page.
It looks very strange for me. Even 32bit and 64bit platforms require different installers. Installers! Why do I need an installer in order to use one library? Moreover, outlined installers are only for Windows. Each of them is bind to particular python version. Where is portability?
Could anyone explain a necessity of 10 different files above?
In general, Python libraries are portable across platforms. Problems appear between different major Python versions (3 introduced some big changes from 2, but 2.7 is backwards compatible with 2.6) or when you use C code for optimizing CPU intensive code. On Linux, compiling it yourself is not a problem, when you call pip install package, it will do it for you. The problem is on Windows, where it is much more difficult to compile a C program, especially because not everybody has a compiler. So, for Windows, packages that need something in C, you usually get an installer.
Also, installers are used because they set up everything nicely, look in the registry for the appropriate place to put everything, offer a standard way to uninstall them (the ones from Chrisopther Goelke's site can be removed using Add/Remove programs in Control Panel) and because that's the standard on Windows: most of the programs on Windows are installed via an exe, because it doesn't have a standard and widespread package manager.
All these libraries are then portable: you can use them from any platform, but installing them is what differs.
There are many complications. In Java where your code and then byte-code is interpreted by JVM, the inherent computer architecture do not play lot of role as long as your code is interpreted well by JVM. In fact, that is one of the primary reason Java got so popular because your code should only worry about rightly compiled by JVM.
However, in Python situation is different. I am trying to summarize some of the reason which I think is important in following lines:
The language itself is evolving (although it is long in the scenario if you think!) and changes are happening inside the language. New features are added and sometime, even some remodeling of language is done ( Python 2.x to Python 3.x)
Python relies heavily on its C extensions and so does the applications written in Python. If you write a python program and have some CPU intensive code, you can choose to write it in C. This also adds in the necessity of creating number of libraries for various distribution.
For one python versions jump around. In python 3, the syntax of some builtins completely changed. For example:
raw_input()
changed to:
input()
also, a lot of the standard library has changed even in the alpha of 3.4. As for the 32/64 bit question, I cannot fully answer. I know that certain platforms have trouble when trying to run 32/64, and that may be the point there.

Categories

Resources