How to install writable shared and user specific data files with setuptools? - python

I know package_data. But it is for readonly data inside the package. Or is this assumption wrong? How to install shared or user specific writeable data? For example to ProgramData or AppData on Windows. I'm interested in a solution for linux, too.

Your assumption seems right to me (package data should be read-only). For writable data: either let the user choose a target directory, or pick one directory according to a common convention. But this can not happen at install time. It's probably better to have your library or application check if these shared directories and files exist when they are needed, and if they don't then create them on the fly.
For example a pretty common convention is the XDG Base Directory Specification. These libraries can help write code according to this specification:
platformdirs (preferred, up-to-date and well maintained)
appdirs (outdated)
xdgappdirs (outdated)

Related

Packaging python resources (Manifest.in vs package_data vs data_files)

It seems that non-python resources are included in python distribution packages one of 4 ways:
Manifest.in file (I'm not sure when this is preferred over package_data or data_files)
package_data in setup.py (for including resources within python import packages)
data_files in setup.py (for including resources outside python import packages)
something called setuptools-scm (which I believe uses your version control system to find resources instead of manifest.in or something)
Which of these are accessible from importlib.resources?
(It is my understanding that importlib.resources is the preferred way to access such resources.) If any of these are not accessible via importlib.resources, then how could/should one access such resources?
Other people online have been scolded for the suggestion of using __file__ to find the path to a resource because installed wheel distributions may be stored as zip files and therefore there won't even be a proper path to your resources. When are wheels extracted into site-packages and when do they remain zipped?
All of (1)-(3) will put files into your package (don't know about (4)).
At runtime, importlib.resources will then be able to access any data in your package.
At least with Python 3.9, which can access resources in subdirectories.
Before, you had to make each subdirectory a package by adding a __init__.
As for why not use __file__: Pythons import system has some weird ways to resolve packages. For example it can look them up in a zip file if you use Zipapp.
You may even have a custom loader for a package you are asked to load some resources from.
Who knows where those resources are located? Ans: importlib.resources.
Afaik, wheels are not a contender as those are unpacked.

How to properly use 3rd party dependencies with Sublime Text plugins?

I'm trying to write a plugin for Sublime Text 3.
I have to use several third party packages in my code. I have managed to get the code working by manually copying the packages into /home/user/.config/sublime-text-3/Packages/User/, then I used relative imports to get to the needed code. How would I distribute the plugin to the end users? Telling them to copy the needed dependencies to the appropriate location is certainly not the way to go. How are 3rd party modules supposed to be used properly with Sublime Text plugins? I can't find any documentation online; all I see is the recommendation to put the modules in the folder.
Sublime uses it's own embedded Python interpreter (currently Python 3.3.6 although the next version will also support Python 3.8 as well) and as such it will completely ignore any version of Python that you may or may not have installed on your system, as well as any libraries that are installed for that version.
For that reason, if you want to use external modules (hereafter dependencies) you need to do extra work. There are a variety of ways to accomplish this, each with their own set of pros and cons.
The following lists the various ways that you can achieve this; all of them require a bit of an understanding about how modules work in Python in order to understand what's going on. By and large except for the paths involved there's nothing too "Sublime Text" about the mechanisms at play.
NOTE: The below is accurate as of the time of this answer. However there are plans for Package Control to change how it works with dependencies that are forthcoming that may change some aspect of this.
This is related to the upcoming version of Sublime supporting multiple versions of Python (and the manner in which it supports them) which the current Package Control mechanism does not support.
It's unclear at the moment if the change will bring a new way to specify dependencies or if only the inner workings of how the dependencies are installed will change. The existing mechanism may remain in place regardless just for backwards compatibility, however.
All roads to accessing a Python dependency from a Sublime plugin involve putting the code for it in a place where the Python interpreter is going to look for it. This is similar to how standard Python would do things, except that locations that are checked are contained within the area that Sublime uses to store your configuration (referred to as the Data directory) and instead of a standalone Python interpreter, Python is running in the plugin host.
Populate the library into the Lib folder
Since version 3.0 (build 3143), Sublime will create a folder named Lib in the data directory and inside of it a directory based on the name of the Python version. If you use Preferences > Browse Packages and go up one folder level, you'll see Lib, and inside of it a folder named e.g. python3.3 (or if you're using a newer build, python33 and python38).
Those directories are directly on the Python sys.path by default, so anything placed inside of them will be immediately available to any plugin just as a normal Python library (or any of those built in) would be. You could consider these folders to be something akin to the site-packages folder in standard Python.
So, any method by which you could install a standard Python library can be used so long as the result is files ending up in this folder. You could for example install a library via pip and then manually copy the files to that location from site-packages, manually install from sources, etc.
Lib/python3.3/
|-- librarya
| `-- file1.py
|-- libraryb
| `-- file2.py
`-- singlefile.py
Version restrictions apply here; the dependency that you want to use must support the version of Python that Sublime is using, or it won't work. This is particularly important for Python libraries with a native component (e.g. a .dll, .so or .dylib), which may require hand compiling the code.
This method is not automatic; you would need to do it to use your package locally, and anyone that wants to use your package would also need to do it as well. Since Sublime is currently using an older version of Python, it can be problematic to obtain a correct version of libraries as well.
In the future, Package Control will install dependencies in this location (Will added the folder specifically for this purpose during the run up to version 3.0), but as of the time I'm writing this answer that is not currently the case.
Vendor your dependencies directly inside of your own package
The Packages folder is on the sys.path by default as well; this is how Sublime finds and loads packages. This is true of both the physical Packages folder, as well as the "virtual" packages folder that contains the contents of sublime-package files.
For example, one can access the class that provides the exec command via:
from Default.exec import ExecCommand
This will work even though the exec.py file is actually stored in Default.sublime-package in the Sublime text install folder and not physically present in the Packages folder.
As a result of this, you can vendor any dependencies that you require directly inside of your own package. Here this could be the User package or any other package that you're creating.
It's important to note that Sublime will treat any Python file in the top level of a package as a plugin and try to load it as one. Hence it's important that if you go this route you create a sub-folder in your package and put the library in there.
MyPackage/
|-- alibrary
| `-- code.py
`-- my_plugin.py
With this structure, you can access the module directly:
import MyPackage.alibrary
from MyPackage.alibrary import someSymbol
Not all Python modules lend themselves to this method directly without modification; some code changes in the dependency may be required in order to allow different parts of the library to see other parts of itself, for example if it doesn't use relative import to get at sibling files. License restrictions may also get in the way of this as well, depending on the library that you're using.
On the other hand, this directly locks the version of the library that you're using to exactly the version that you tested with, which ensures that you won't be in for any undue surprises further on down the line.
Using this method, anything you do to distribute your package will automatically also distribute the vendored library that's contained inside. So if you're distributing by Package Control, you don't need to do anything special and it will Just Work™.
Modify the sys.path to point to a custom location
The Python that's embedded into Sublime is still standard Python, so if desired you can manually manipulate the sys.path that describes what folders to look for packages in so that it will look in a place of your choosing in addition to the standard locations that Sublime sets up automatically.
This is generally not a good idea since if done incorrectly things can go pear shaped quickly. It also still requires you to manually install libraries somewhere yourself first, and in that case you're better off using the Lib folder as outlined above, which is already on the sys.path.
I would consider this method an advanced solution and one you might use for testing purposes during development but otherwise not something that would be user facing. If you plan to distribute your package via Package Control, the review of your package would likely kick back a manipulation of the sys.path with a request to use another method.
Use Package Control's Dependency system (and the dependency exists)
Package control contains a dependency mechanism that uses a combination of the two prior methods to provide a way to install a dependency automatically. There is a list of available dependencies as well, though the list may not be complete.
If the dependency that you're interested in using is already available, you're good to go. There are two different ways to go about declaring that you need one or more dependencies on your package.
NOTE: Package Control doesn't currently support dependencies of dependencies; if a dependency requires that another library also be installed, you need to explicitly mention them both yourself.
The first involves adding a dependencies key to the entry for your package in the package control channel file. This is a step that you'd take at the point where you're adding your package to Package Control, which is something that's outside the scope of this answer.
While you're developing your package (or if you decide that you don't want to distribute your package via Package Control when you're done), then you can instead add a dependencies.json file into the root of your package (an example dependencies.json file is available to illustrate this).
Once you do that, you can choose Package Control: Satisfy Dependencies from the command Palette to have Package Control download and install the dependency for you (if needed).
This step is automatic if your package is being distributed and installed by Package Control; otherwise you need to tell your users to take this step once they install the package.
Use Package Control's Dependency system (but the dependency does not exist)
The method that Package Control uses to install dependencies is, as outlined at the top of the question subject to change at some point in the (possibly near) future. This may affect the instructions here. The overall mechanism may remain the same as far as setup is concerned, with only the locations of the installation changing, but that remains to be seen currently.
Package Control installs dependencies via a special combination of vendoring and also manipulation of the sys.path to allow things to be found. In order to do so, it requires that you lay out your dependency in a particular way and provide some extra metadata as well.
The layout for the package that contains the dependency when you're building it would have a structure similar to the following:
Packages/my_dependency/
├── .sublime-dependency
└── prefix
└── my_dependency
└── file.py
Package Control installs a dependency as a Package, and since Sublime treats every Python file in the root of a package as a plugin, the code for the dependency is not kept in the top level of the package. As seen above, the actual content of the dependency is stored inside of the folder labeled as prefix above (more on that in a second).
When the dependency is installed, Package Control adds an entry to it's special 0_package_control_loader package that causes the prefix folder to be added to the sys.path, which makes everything inside of it available to import statements as normal. This is why there's an inherent duplication of the name of the library (my_dependency in this example).
Regarding the prefix folder, this is not actually named that and instead has a special name that determines what combination of Sublime Text version, platform and architecture the dependency is available on (important for libraries that contain binaries, for example).
The name of the prefix folder actually follows the form {st_version}_{os}_{arch}, {st_version}_{os}, {st_version} or all. {st_version} can be st2 or st3, {os} can be windows, linux or osx and {arch} can be x32 or x64.
Thus you could say that your dependency supports only st3, st3_linux, st3_windows_x64 or any combination thereof. For something with native code you may specify several different versions by having multiple folders, though commonly all is used when the dependency contains pure Python code that will work regardless of the Sublime version, OS or architecture.
In this example, if we assume that the prefix folder is named all because my_dependency is pure Python, then the result of installing this dependency would be that Packages/my_dependency/all would be added to the sys.path, meaning that if you import my_dependency you're getting the code from inside of that folder.
During development (or if you don't want to distribute your dependency via Package Control), you create a .sublime-dependency file in the root of the package as shown above. This should be a text file with a single line that contains a 2 digit number (e.g. 01 or 50). This controls in what order each installed dependency will get added to the sys.path. You'd typically pick a lower number if your dependency has no other dependencies and a higher value if it does (so that it gets injected after those).
Once you have the initial dependency laid out in the correct format in the Packages folder, you would use the command Package Control: Install Local Dependency from the Command Palette, and then select the name of your dependency.
This causes Package Control to "install" the dependency (i.e. update the 0_package_control_loader package) to make the dependency active. This step would normally be taken by Package Control automatically when it installs a dependency for the first time, so if you are also manually distributing your dependency you need to provide instructions to take this step.

Python packaging: create directory in homedir during installation

I am creating a Python package whose features include logging certain actions in a database. Thus I would like to store the database in a location such as ~/.package-name/database.sqlite. Additionally, ~/.package-name could be a directory to hold configuration files.
What is the best practice for doing this? I am using setuptools to handle package installation. I imagine that within one of my modules, I would have code that checks for the existence of the database file and config file(s), creating them if necessary.
Reading the documentation, it states
you can’t actually install data files to some arbitrary location on a user’s machine; this is a feature, not a bug. You can always include a script in your distribution that extracts and copies your the documentation or data files to a user-specified location, at their discretion.
It seems that I cannot create the location ~/.package-name using setuptools. So should I create this directory the first time the user runs the program by checking for the directory and invoking a script or function?
Is there a standard sort of example I might look at? I had some trouble searching for my problem.

Identify directories that are packages in Mac OS X with Python

The Mac OS X Finder uses the concept of "packages" to make the contents of certain folders opaque to the user. I'm using os.walk() to enumerate a directory tree and I want to skip enumeration on packages such as application bundles.
The mdls commandline utility can be used to check whether com.apple.package is in the kMDItemContentTypeTree attribute. Is the only/best way to detect whether a folder is a package to drop into os.system and use mdls after detecting that the OS is indeed darwin?
As an aside, this solution seems to depend on Spotlight metadata which I understand is populated from the files/directories themselves. This makes me wonder whether there is a method to check whether a directory is a package outside of mdls. Perhaps I'm missing something.
OS X packages (and bundles) are usually defined by their extension. Simply create a directory with an .app extension to see it be presented as a (broken) application in the Finder.
The official documentation lists the following ways to define bundles:
The Finder considers a directory to be a package if any of the following conditions are true:
The directory has a known filename extension: .app, .bundle, .framework, .plugin, .kext, and so on.
The directory has an extension that some other application claims represents a package type; see “Document Packages.”
The directory has its package bit set.
The preferred way to specify a package is to give the package directory a known filename extension. For the most part, Xcode takes care of this for you by providing templates that apply the correct extension. All you have to do is create an Xcode project of the appropriate type.
The simplest way to detect packages then is to detect those extensions. The quick and dirty way is to simply look for a hard-coded list of extensions, using the above documentation as your guide.
The next step up is to query the OS if a given extension has been registered as a Document Package. See How to check whether directories with a given extension are shown by the Finder as a package?
To detect the package bit on directories, you'll have to use the xattr library to retrieve the u'com.apple.FinderInfo' key and then use the Finder.h header info to decode the binary data returned; the kHasBundle flag is 0x2000:
attrs = xattr.getxattr('/path/to/dir', u'com.apple.FinderInfo')
ispackage = bool(ord(attrs[8]) & 0x20) # I *think* this is correct; works for hidden dirs and & 0x40

Package python directory for different architectures

I have a personal python library consisting of several modules of scientific programs that I use. These live on a directory with the structure:
root/__init__.py
root/module1/__init__.py
root/module1/someprog.py
root/module1/ (...)
root/module2/__init__.py
root/module2/someprog2.py
root/module2/somecython.pyx
root/module2/somecython.so
root/module2/somefortran.f
root/module2/somefortran.so
(...)
I am constantly making changes to these programs and adding new files. With my current setup at work, I share the same directory with several machines of different architectures. What I want is a way to use these packages from python in the different architectures. If the packages were all pure python, this would be no problem. But the issue is that I have several compiled binaries (as shown in the example) from Cython and from f2py.
Is there a clever way to repackage these binaries so that python in the different systems only imports the relevant binaries? I'd like to keep the code organised in the same directory.
Obviously the simplest way would be to duplicate the directory or create another directory of symlinks. But this would mean that when new files are created, I'd have to update the symlinks manually.
Has anyone bumped into a similar problem, or can suggest a more pythonic approach to this organisation problem?
Probably you should use setuptools/distribute. You can then define a setup.py that compiles all files according to your current platform, copies them to some adequate directory and makes sure they are available in your sys.path.
You would do the following when you compile the source code of python.
Pass the exec-prefix flag with the directory to ./configure
For more info: ./configure --help will give you the following:
Installation directories:
--prefix=PREFIX install architecture-independent files in PREFIX
[/usr/local]
--exec-prefix=EPREFIX install architecture-dependent files in EPREFIX
[PREFIX]
Hope this helps :)
There is unfortunately no way to do this. A python package must reside entirely in one directory. PEP 382 proposed support for namespace-packages that could be split in different directories, but it was rejected. (And in any case, those would be special packages.)
Given that python packages have to be in a single directory, it is not possible to mix compiled extension modules for different architectures. There are two ways to mitigate this problem:
Keep binary extensions on a separate directory, and have all the python packages in a common directory that can be shared between architectures. The separate directory for binary extension can then be selected for different architectures with PYTHONPATH.
Keep a common directory with all the python files and extensions for different architectures. For each architecture, create a new directory with the package name. Then symlink all the python files and binaries in each of these directories. This will still allow a single place where the code lives, at the expense of having to create new symlinks for each new file.
The option suggested by Thorsten Krans is unfortunately not viable for this problem. Using distutils/setuptools/distribute still requires all the python source files to be installed in a directory for each architecture, negating the advantage of having them in a single directory. (This is not a finished package, but always work in progress.)

Categories

Resources