How to include external files in a Python wheel?

I'm working on a Python package (built as a wheel), which requires some external files – i.e. ones that aren't in my package/repository. At runtime, it currently gets the path to these from a config file.
I'd like to instead bundle these files inside the wheel itself, so the package is all you need to install. Normally you'd use package_data to do this kind of thing, but I don't think that works for files outside of the package tree. I've wondered about making a build script, which first copies these into a local temporary directory. Would that work, or is there a more elegant way to do this?
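A minimal sketch of the copy-before-build idea from the question, assuming the external files can simply be copied into a subdirectory of the package before the wheel is built. The names `stage_external_files`, `_external`, `mypackage` and `model.bin` are hypothetical and used only for illustration; the demo uses throwaway directories in place of the real locations.

```python
import shutil
import tempfile
from pathlib import Path


def stage_external_files(external_files, package_dir, subdir="_external"):
    """Copy files from outside the source tree into package_dir/subdir.

    Returns the staged paths relative to package_dir, which is the form
    a package_data entry would use.
    """
    target = Path(package_dir) / subdir
    target.mkdir(parents=True, exist_ok=True)
    staged = []
    for src in map(Path, external_files):
        dest = target / src.name
        shutil.copy2(src, dest)  # copy contents and metadata
        staged.append(dest.relative_to(package_dir).as_posix())
    return staged


# Demo with temporary directories standing in for the real locations.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    external = tmp / "elsewhere" / "model.bin"  # hypothetical external file
    external.parent.mkdir()
    external.write_bytes(b"\x00\x01")
    pkg = tmp / "mypackage"                     # hypothetical package directory
    pkg.mkdir()
    staged = stage_external_files([external], pkg)
    staged_exists = (pkg / staged[0]).is_file()
```

In a build script, the returned list could then be passed as `package_data={"mypackage": staged}` to `setup()`, so that the staged copies end up inside the wheel. Whether this is more elegant than other approaches is a judgment call; it only shows that the copy step itself is small.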

Related

Converting a python package into a single importable file

Is there a way to convert a python package, i.e. a folder of python files, into a single file that can be copied and then directly imported into a python script without needing to run any extra shell commands? I know it is possible to zip all of the files and then unzip them from python when they are needed, but I'm hoping that there is a more elegant solution.
It's not totally clear what the question is. I could interpret it two ways.
If you are looking to manage the symbols from many modules in a more organized way:
You'll want to put an __init__.py file in your directory and make it a package. In it you can define the symbols for your package, and create a graceful import packagename behavior. Details on packages.
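As a rough illustration of the `__init__.py` approach, the sketch below builds a tiny package in a temporary directory and re-exports one symbol from a submodule. The names `demo_pkg`, `shapes` and `area` are made up for this example.

```python
import importlib
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    pkg = Path(tmp) / "demo_pkg"  # hypothetical package name
    pkg.mkdir()
    # A submodule holding the actual implementation.
    (pkg / "shapes.py").write_text("def area(w, h):\n    return w * h\n")
    # __init__.py defines which symbols the package exposes.
    (pkg / "__init__.py").write_text(
        "from .shapes import area\n"
        "__all__ = ['area']\n"
    )
    sys.path.insert(0, tmp)
    importlib.invalidate_caches()
    try:
        demo_pkg = importlib.import_module("demo_pkg")
        result = demo_pkg.area(3, 4)  # reachable directly from the package
    finally:
        sys.path.remove(tmp)
```

Callers can then write `import demo_pkg` and use `demo_pkg.area` without knowing which submodule defines it.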
If you are looking to make your code portable to another environment:
One way or the other, the package needs to be accessible in whatever environment it is run in. That means it either needs to be installed in the python environment (likely using pip), copied into a location that is in a subdirectory relative to the running code, or in a directory that is listed in the PYTHONPATH environment variable.
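A small sketch of what "accessible" means here: a module becomes importable once its directory is on the search path. The helper `is_importable` and the module name `portable_mod_demo` are hypothetical. Appending to `sys.path` at runtime has a similar effect to listing the directory in PYTHONPATH, although PYTHONPATH entries are placed earlier in the search order.

```python
import importlib
import importlib.util
import sys
import tempfile
from pathlib import Path


def is_importable(name):
    """Return True if the import system can locate a top-level module."""
    return importlib.util.find_spec(name) is not None


with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "portable_mod_demo.py").write_text("VALUE = 1\n")
    before = is_importable("portable_mod_demo")  # directory not on the path yet
    sys.path.append(tmp)
    importlib.invalidate_caches()
    after = is_importable("portable_mod_demo")   # now it can be found
    sys.path.remove(tmp)
```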
The most straightforward way to package up code and make it portable is to use setuptools to create a portable package that can be installed into any python environment. The manual page for Packaging Projects gives the details of how to go about building a package archive, and optionally uploading to PyPI for public distribution. If it is for private use, the resulting archive can be passed around without uploading it to the public repository.

Using a shared python library in Visual Studio

We have a python library that needs to be shared among many projects and we're trying to find a way to organize and link the shared library to the specific projects that want to use it.
It also needs to work without Visual Studio, meaning that if the whole project is moved to a different machine, it will still work and use the "shared library". This means that the linked library needs to be placed statically in every project that uses it (and of course each time it's updated, the library directory will be updated in each project).
Is there any way this can be done?
The directory structure looks like this:
Project1
    main.py <--- (One of the projects that uses the library)
    ...
Libs
    PyLib <--- (This is the shared library)
        __init__.py
        ps_lib.py
        another.py
    CWinLib
    CNixLib
Some ways that I've tested are:
Working with linked files - The problem is that it doesn't copy the whole package to the project (which means that it doesn't work outside of visual studio)
Adding a search path - The same problem as before, doesn't work outside of visual studio
Using sys.path.append - It means that we'll need to copy the exact directory structure that's in place and that is something I want to avoid
Is there another way to solve this?

Why is it recommended to add extra files in a python source package?

There are python tools like check-manifest that verify that all the files under your VCS are also included in your MANIFEST.in, and release helpers like zest.releaser recommend using them.
I think files in the tests or docs directories are never used directly from the python package. Usually services like Read the Docs or travis-ci access those files, and they get them from the VCS, not from the package. I have also seen packages that include .travis.yml files, which makes even less sense to me.
What is the advantage of including all the files in the python package?

Python compile all modules into a single python file

I am writing a program in python to be sent to other people, who are running the same python version; however, there are some 3rd party modules that need to be installed to use it.
Is there a way to compile it into a .pyc (I only say .pyc because it's a python compiled file) that has all the dependent modules inside it as well?
So they can run the programme without needing to install the modules separately?
Edit:
Sorry if it wasn't clear, but I am aware of things such as cx_freeze etc. What I'm trying to get is just a single python file.
So they can just type "python myapp.py" and then it will run. No installation of anything. As if all the module codes are in my .py file.
If you are on python 2.3 or later and your dependencies are pure python:
If you don't want to go the setuptools or distutils route, you can provide a zip file with the .pyc files for your code and all of its dependencies. You will have to do a little work to make any complex paths inside the zip file available (if the dependencies are just lying around at the root of the zip, this is not necessary). Then just add the zip location to your path, and it should work just as if the dependency files had been installed.
If your dependencies include .pyds or other binary dependencies you'll probably have to fall back on distutils.
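The zip approach above can be sketched as follows. This is a minimal illustration, not a packaging recipe: the archive name `deps.zip` and the module `bundled_dep` are invented for the demo, and the module is stored as `.py` source rather than `.pyc` to keep the example short (Python's zip import handles both).

```python
import importlib
import sys
import tempfile
import zipfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    archive = Path(tmp) / "deps.zip"  # hypothetical archive name
    # Put a pure-Python module at the root of the zip file.
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr(
            "bundled_dep.py",
            "def greet(name):\n    return 'hello ' + name\n",
        )
    # Adding the zip file itself to sys.path makes its contents importable.
    sys.path.insert(0, str(archive))
    try:
        bundled_dep = importlib.import_module("bundled_dep")
        greeting = bundled_dep.greet("world")
    finally:
        sys.path.remove(str(archive))
```

This only works for pure-Python code; compiled extension modules cannot be imported from inside a zip file, which is why the binary case needs a different route.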
You can simply include .pyc files for the libraries required, but no, a .pyc cannot work as a container for multiple files (unless you collect all the source into one .py file and then compile it).
It sounds like what you're after is the ability for your end users to run one command, e.g. install my_custom_package_and_all_required_dependencies, and have it assemble everything it needs.
This is a perfect use case for distutils, with which you can make manifests for your own code that link out to external dependencies. If your 3rd party modules are available publicly in a standard format (they should be, and if they're not, it's pretty easy to package them yourself), then this approach has the benefit of allowing you to very easily change what versions of 3rd party libraries your code runs against (see this section of the above linked doc). If you're dead set on packaging others' code with your own, you can always include the required files in the .egg you create with distutils.
Two options:
Build a package that will install the dependencies for them (I don't recommend this if the only dependencies are python packages that are installed with pip)
Use virtual environments. You use an existing python on their system but python modules are installed into the virtualenv.
or I suppose you could just punt, and create a shell script that installs them, and tell them to run it once before they run your stuff.
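The virtual environment option can be sketched with the standard-library `venv` module. This only creates the environment; installing the dependencies into it (for example with pip) is left out, and `with_pip=False` is chosen here just to keep the demo fast and offline. The directory name `env` is arbitrary.

```python
import tempfile
import venv
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / "env"  # arbitrary location for the demo
    # Create an isolated environment that reuses the running interpreter.
    venv.create(str(env_dir), with_pip=False)
    # Every venv contains a pyvenv.cfg file describing its base interpreter.
    has_cfg = (env_dir / "pyvenv.cfg").is_file()
```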

Package python directory for different architectures

I have a personal python library consisting of several modules of scientific programs that I use. These live in a directory with the structure:
root/__init__.py
root/module1/__init__.py
root/module1/someprog.py
root/module1/ (...)
root/module2/__init__.py
root/module2/someprog2.py
root/module2/somecython.pyx
root/module2/somecython.so
root/module2/somefortran.f
root/module2/somefortran.so
(...)
I am constantly making changes to these programs and adding new files. With my current setup at work, I share the same directory with several machines of different architectures. What I want is a way to use these packages from python in the different architectures. If the packages were all pure python, this would be no problem. But the issue is that I have several compiled binaries (as shown in the example) from Cython and from f2py.
Is there a clever way to repackage these binaries so that python in the different systems only imports the relevant binaries? I'd like to keep the code organised in the same directory.
Obviously the simplest way would be to duplicate the directory or create another directory of symlinks. But this would mean that when new files are created, I'd have to update the symlinks manually.
Has anyone bumped into a similar problem, or can suggest a more pythonic approach to this organisation problem?
Probably you should use setuptools/distribute. You can then define a setup.py that compiles all files according to your current platform, copies them to some adequate directory and makes sure they are available in your sys.path.
You would do the following when you compile the source code of python:
Pass the --exec-prefix flag with the directory to ./configure
For more info: ./configure --help will give you the following:
Installation directories:
  --prefix=PREFIX         install architecture-independent files in PREFIX
                          [/usr/local]
  --exec-prefix=EPREFIX   install architecture-dependent files in EPREFIX
                          [PREFIX]
Hope this helps :)
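For reference, a running interpreter exposes the two prefixes through `sys.prefix` and `sys.exec_prefix`, and `sysconfig` reports where pure-Python and platform-specific packages are installed. A short sketch for inspecting them (the values depend entirely on how the local Python was built or whether a virtual environment is active, so no output is shown):

```python
import sys
import sysconfig

# Prefix for architecture-independent files.
prefix = sys.prefix
# Prefix for architecture-dependent files.
exec_prefix = sys.exec_prefix

# Install locations for pure-Python and platform-specific packages.
purelib = sysconfig.get_path("purelib")
platlib = sysconfig.get_path("platlib")

print("prefix:     ", prefix)
print("exec_prefix:", exec_prefix)
print("purelib:    ", purelib)
print("platlib:    ", platlib)
```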
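There is unfortunately no way to do this. A regular python package must reside entirely in one directory. PEP 382 proposed support for namespace packages that could be split across different directories, but it was rejected. (And in any case, those would be special packages.) Its successor, PEP 420, was accepted for Python 3.3, but the namespace packages it adds are likewise special packages without an __init__.py.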
Given that python packages have to be in a single directory, it is not possible to mix compiled extension modules for different architectures. There are two ways to mitigate this problem:
Keep binary extensions in a separate directory, and have all the python packages in a common directory that can be shared between architectures. The separate directory for binary extensions can then be selected for different architectures with PYTHONPATH.
Keep a common directory with all the python files and extensions for different architectures. For each architecture, create a new directory with the package name. Then symlink all the python files and binaries in each of these directories. This will still allow a single place where the code lives, at the expense of having to create new symlinks for each new file.
The option suggested by Thorsten Krans is unfortunately not viable for this problem. Using distutils/setuptools/distribute still requires all the python source files to be installed in a directory for each architecture, negating the advantage of having them in a single directory. (This is not a finished package, but always a work in progress.)
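The first mitigation (a separate directory of binary extensions per architecture) could also be selected at runtime instead of through PYTHONPATH. The sketch below is an assumption-laden illustration: the `_binaries` directory layout, the `/shared/root` path and the helper names are hypothetical, and extensions placed there would be imported as top-level modules rather than as submodules of the package.

```python
import sys
import sysconfig
from pathlib import Path


def binary_dir(root):
    """Return the per-architecture directory, e.g. root/_binaries/linux-x86_64."""
    # sysconfig.get_platform() identifies the OS and machine type.
    return Path(root) / "_binaries" / sysconfig.get_platform()


def enable_arch_binaries(root):
    """Put the directory for the current architecture on sys.path."""
    path = str(binary_dir(root))
    if path not in sys.path:
        sys.path.insert(0, path)
    return path


# Hypothetical shared root; the directory does not need to exist for the demo.
added = enable_arch_binaries("/shared/root")
```

Each machine would then compile its extensions into its own subdirectory once, while the pure-Python sources stay in a single shared location.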
