I am creating a Python package whose features include logging certain actions in a database. Thus I would like to store the database in a location such as ~/.package-name/database.sqlite. Additionally, ~/.package-name could be a directory to hold configuration files.
What is the best practice for doing this? I am using setuptools to handle package installation. I imagine that within one of my modules, I would have code that checks for the existence of the database file and config file(s), creating them if necessary.
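Something along these lines is what I have in mind for the first-run check (the ensure_storage helper name is just a placeholder; the paths come from my description above):

import os
import sqlite3

CONFIG_DIR = os.path.expanduser('~/.package-name')
DB_PATH = os.path.join(CONFIG_DIR, 'database.sqlite')

def ensure_storage():
    # Create ~/.package-name the first time it is needed; no-op if it already exists.
    os.makedirs(CONFIG_DIR, exist_ok=True)
    # sqlite3.connect creates the database file if it does not exist yet.
    sqlite3.connect(DB_PATH).close()
    return DB_PATH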
Reading the documentation, I see that it states:
you can’t actually install data files to some arbitrary location on a user’s machine; this is a feature, not a bug. You can always include a script in your distribution that extracts and copies the documentation or data files to a user-specified location, at their discretion.
It seems that I cannot create the location ~/.package-name using setuptools. So should I create this directory the first time the user runs the program by checking for the directory and invoking a script or function?
Is there a standard sort of example I might look at? I had some trouble searching for my problem.
Is there a way to convert a Python package, i.e. a folder of Python files, into a single file that can be copied and then directly imported into a Python script without needing to run any extra shell commands? I know it is possible to zip all of the files and then unzip them from Python when they are needed, but I'm hoping that there is a more elegant solution.
It's not totally clear what the question is. I could interpret it two ways.
If you are looking to manage the symbols from many modules in a more organized way:
You'll want to put an __init__.py file in your directory and make it a package. In it you can define the symbols for your package, and create a graceful import packagename behavior. See the Python documentation on packages for the details.
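A minimal sketch of what that __init__.py might look like (the module and symbol names here are hypothetical):

# packagename/__init__.py
from .module_a import useful_function
from .module_b import UsefulClass

__all__ = ["useful_function", "UsefulClass"]

A caller can then simply do import packagename and call packagename.useful_function().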
If you are looking to make your code portable to another environment:
One way or the other, the package needs to be accessible in whatever environment it is run in. That means it either needs to be installed in the python environment (likely using pip), copied into a location that is in a subdirectory relative to the running code, or in a directory that is listed in the PYTHONPATH environment variable.
The most straightforward way to package up code and make it portable is to use setuptools to create a portable package that can be installed into any Python environment. The manual page for Packaging Projects gives the details of how to go about building a package archive, and optionally uploading it to PyPI for public distribution. If it is for private use, the resulting archive can be passed around without uploading it to the public repository.
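For reference, a minimal setup.py along those lines (the project name and version are placeholders, not something the answer prescribes):

from setuptools import setup, find_packages

setup(
    name="my-package",
    version="0.1.0",
    packages=find_packages(),
)

Running python setup.py sdist (or python -m build, with the build package installed) then produces an archive that can be installed with pip into any environment.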
I would like to gather many libraries I have made while working on my projects in some kind of container, so that I can easily use any of them in future projects of mine. It is pretty clear to me how to do this, except one part.
I am assuming that every service will have its own config file (for instance, the Cache service will have a config file with the cache host and port, and so on). Now the problem is: when I want to use this container in an arbitrary project, I will have to make assumptions about the project directory structure to know where to find these config files.
For instance, one might assume that on the same path as my library there is a config folder where I will find the config files of my services. However, this might conflict with the project's directory structure (the project might already have its own config directory, for instance).
So, all in all, my question is: is there a safe, standard way to ship a library which expects to find some config files someplace, or for which example config files are shipped along with the library itself?
Well, you should not keep config files, or anything you want to modify, along with your code, in Python or in any other language. Each OS has folders for that purpose.
Either it's system wide, in which case on Unix it's /etc, or it's per user, in which case it's in ~/.config. You have the Library folders on OS X, and I'm sure there's something similar for Windows beyond \Windows\SYSTEM32 😉.
What that means is that the path to your configuration files shall not be considered relative to your code at any point. Never. Ever.
You can include some assets in a Python package using MANIFEST.in, but, as they'll live within your Python package, you should assume you won't have the rights to write where they are installed (installed by admin, run by user).
You can also specify some of those assets to install at specific places using setup.py's data_files directive, which installs them relative to sys.prefix.
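A sketch of that data_files directive (the package name, target directory, and file names are placeholders); relative target paths are interpreted relative to sys.prefix at install time:

from setuptools import setup

setup(
    name="my-library",
    version="0.1.0",
    packages=["my_library"],
    # installs config/example.ini into <sys.prefix>/etc/my-library/
    data_files=[("etc/my-library", ["config/example.ini"])],
)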
Common practice is to provide example configuration files via a link from the documentation or, better, to generate those files when the application first starts.
Also, another trend for desktops is to use the XDG Base Directory Specification to decide where to look for, or where to place, your configuration files.
To sum it up:
make a list of default paths where your code expects to find the configuration (see the sketch after this list),
make it possible to specify the path to the configuration manually on the command line, e.g. python foo.py --config bar.ini
write a feature for your tool to generate the configuration (with a series of questions)
deploy your default configurations to standard places (XDG paths, $prefix/etc…)
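A minimal sketch of that lookup order, assuming a hypothetical application name myapp and configuration file name config.ini:

import argparse
import os

def candidate_config_paths():
    # Per-user XDG location first, then the system-wide one.
    xdg = os.environ.get("XDG_CONFIG_HOME", os.path.expanduser("~/.config"))
    return [os.path.join(xdg, "myapp", "config.ini"),
            "/etc/myapp/config.ini"]

def find_config():
    parser = argparse.ArgumentParser()
    parser.add_argument("--config", help="explicit path to the configuration file")
    args = parser.parse_args()
    # An explicit --config wins; otherwise fall back to the default locations.
    paths = [args.config] if args.config else candidate_config_paths()
    for path in paths:
        if os.path.exists(path):
            return path
    return None  # nothing found: the tool could generate a configuration here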
I'm struggling to figure out how to create a setup.py module that will overwrite a series of specific modules in another location.
For example, let's say I've got new_file_1.py and new_file_2.py (in the right directory structure next to setup.py):
/setup.py
/my_modules/__init__.py
/my_modules/new_file_1.py
/my_modules/new_file_2.py
and I need the new files to overwrite old files that live in a path somewhere else, one that is NOT one of the generic Python library paths.
I realize it's brute force, but is there a way to hard-code, or take in via the command line, an absolute path that isn't going to staple on all sorts of generic path stuff? E.g. if I tell setup.py --prefix=/some/path/, it will then install to /some/path/.../python2.x/dist-packages/ or something like that.
I recognize a shell script would be simpler, but the work I am doing at the moment necessitates that everything stay within Python scripts.
tl;dr: can I force distutils to simply put the files in a particular specified location?
I need to ship a collection of Python programs that use multiple packages stored in a local Library directory: the goal is to avoid having users install packages before using my programs (the packages are shipped in the Library directory). What is the best way of importing the packages contained in Library?
I tried three methods, but none of them appears perfect: is there a simpler and robust method? or is one of these methods the best one can do?
In the first method, the Library folder is simply added to the library path:
import sys
import os
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'Library'))
import package_from_Library
The Library folder is put at the beginning so that the packages shipped with my programs have priority over the same modules installed by the user (this way I am sure that they have the correct version to work with my programs). This method also works when the Library folder is not in the current directory, which is good. However, this approach has drawbacks. Each and every one of my programs adds a copy of the same path to sys.path, which is a waste. In addition, all programs must contain the same three path-modifying lines, which goes against the Don't Repeat Yourself principle.
An improvement over the above problems consists in trying to add the Library path only once, by doing it in an imported module:
# In module add_Library_path:
import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'Library'))
and then to use, in each of my programs:
import add_Library_path
import package_from_Library
This way, thanks to the caching mechanism of CPython, the module add_Library_path is only run once, and the Library path is added only once to sys.path. However, a drawback of this approach is that import add_Library_path has an invisible side effect, and that the order of the imports matters: this makes the code less legible and more fragile. Also, this forces my distribution of programs to include an add_Library_path.py module that users will not use directly.
Python modules from Library can also be imported by making it a package (empty __init__.py file stored inside), which allows one to do:
from Library import module_from_Library
However, this breaks for packages in Library, as they might do something like from xlutils.filter import …, which fails because xlutils itself is not found in sys.path. So, this method works, but only for plain modules in Library, not for packages.
All these methods have some drawback.
Is there a better way of shipping programs with a collection of packages (that they use) stored in a local Library directory? or is one of the methods above (method 1?) the best one can do?
PS: In my case, all the packages from Library are pure Python packages, but a more general solution that works for any operating system is best.
PPS: The goal is that the user be able to use my programs without having to install anything (beyond copying the directory I ship them regularly), like in the examples above.
PPPS: More precisely, the goal is to have the flexibility of easily updating both my collection of programs and their associated third-party packages from Library by having my users do a simple copy of a directory containing my programs and the Library folder of "hidden" third-party packages. (I do frequent updates, so I prefer not forcing the users to update their Python distribution too.)
Messing around with sys.path leads to pain... The modern package template and Distribute contain a vast array of information and were in part set up to solve your problem.
What I would do is set up setup.py to install all your packages to a specific site-packages location or, if you can, to the system's site-packages. In the former case, the local site-packages directory would then be added to the PYTHONPATH of the system/user. In the latter case, nothing needs to change.
You could use a batch file to set the Python path as well. Or change the Python executable to point to a shell script that sets a modified PYTHONPATH and then executes the Python interpreter. The latter, of course, means that you have to have access to the user's machine, which you do not. However, if your users only run scripts and do not import your own libraries, you could use your own wrapper for scripts:
#!/path/to/my/python
And the /path/to/my/python script would be something like:
#!/bin/sh
PYTHONPATH=/whatever/lib/path:$PYTHONPATH /usr/bin/python "$@"
I think you should have a look at path import hooks, which allow you to modify the behaviour of Python when it searches for modules.
For example, you could try to do something like KDE's script engine does for Python plugins[1].
It adds a special token to sys.path (like "<plasmaXXXXXX>", with XXXXXX being a random number just to avoid name collisions), and then, when Python tries to import modules and can't find them in the other paths, it will call your importer, which can deal with it.
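A rough sketch of that idea using the importlib machinery (the token name and the Library directory are assumptions for illustration, not something the answer prescribes):

import importlib.machinery
import os
import sys

LIBRARY_TOKEN = "<my-library>"  # hypothetical marker, akin to KDE's "<plasmaXXXXXX>"
LIBRARY_DIR = os.path.join(os.path.dirname(__file__), "Library")

def library_hook(path_entry):
    # Only handle our special token; any other sys.path entry is not ours.
    if path_entry != LIBRARY_TOKEN:
        raise ImportError
    # Delegate the actual lookup to the standard file finder, pointed at Library.
    return importlib.machinery.FileFinder(
        LIBRARY_DIR,
        (importlib.machinery.SourceFileLoader,
         importlib.machinery.SOURCE_SUFFIXES),
    )

sys.path_hooks.append(library_hook)
sys.path.append(LIBRARY_TOKEN)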
A simpler alternative is to have a main script used as a launcher, which simply adds the path to sys.path and executes the target file (so that you can safely avoid putting the sys.path.append(...) line in every file).
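A minimal sketch of such a launcher (the launcher.py name is hypothetical; Library is the directory from the question):

# launcher.py -- run as: python launcher.py my_program.py
import os
import runpy
import sys

sys.path.insert(0, os.path.join(os.path.dirname(__file__), 'Library'))
runpy.run_path(sys.argv[1], run_name='__main__')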
Yet another alternative, which works on Python 2.6+, is to install the library under the per-user site-packages directory.
[1] You can find the source code under /usr/share/kde4/apps/plasma_scriptengine_python in a Linux installation with KDE.
The Mac OS X Finder uses the concept of "packages" to make the contents of certain folders opaque to the user. I'm using os.walk() to enumerate a directory tree and I want to skip enumeration on packages such as application bundles.
The mdls commandline utility can be used to check whether com.apple.package is in the kMDItemContentTypeTree attribute. Is the only/best way to detect whether a folder is a package to drop into os.system and use mdls after detecting that the OS is indeed darwin?
As an aside, this solution seems to depend on Spotlight metadata which I understand is populated from the files/directories themselves. This makes me wonder whether there is a method to check whether a directory is a package outside of mdls. Perhaps I'm missing something.
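For reference, the mdls call I'm referring to could be wrapped like this (via subprocess rather than os.system; assumes macOS with Spotlight metadata available):

import subprocess

def is_package_mdls(path):
    # mdls prints the kMDItemContentTypeTree attribute for the given path.
    out = subprocess.run(['mdls', '-name', 'kMDItemContentTypeTree', path],
                         capture_output=True, text=True, check=True).stdout
    return 'com.apple.package' in out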
OS X packages (and bundles) are usually defined by their extension. Simply create a directory with an .app extension to see it be presented as a (broken) application in the Finder.
The official documentation lists the following ways to define bundles:
The Finder considers a directory to be a package if any of the following conditions are true:
The directory has a known filename extension: .app, .bundle, .framework, .plugin, .kext, and so on.
The directory has an extension that some other application claims represents a package type; see “Document Packages.”
The directory has its package bit set.
The preferred way to specify a package is to give the package directory a known filename extension. For the most part, Xcode takes care of this for you by providing templates that apply the correct extension. All you have to do is create an Xcode project of the appropriate type.
The simplest way to detect packages then is to detect those extensions. The quick and dirty way is to simply look for a hard-coded list of extensions, using the above documentation as your guide.
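A quick-and-dirty sketch of that check, pruning such directories during an os.walk (the extension set is hard-coded from the documentation excerpt above):

import os

PACKAGE_EXTENSIONS = {'.app', '.bundle', '.framework', '.plugin', '.kext'}

def walk_skipping_packages(root):
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune directories whose extension marks them as packages/bundles.
        dirnames[:] = [d for d in dirnames
                       if os.path.splitext(d)[1].lower() not in PACKAGE_EXTENSIONS]
        yield dirpath, dirnames, filenames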
The next step up is to query the OS if a given extension has been registered as a Document Package. See How to check whether directories with a given extension are shown by the Finder as a package?
To detect the package bit on directories, you'll have to use the xattr library to retrieve the u'com.apple.FinderInfo' key and then use the Finder.h header info to decode the binary data returned; the kHasBundle flag is 0x2000:
import xattr  # third-party xattr package
attrs = xattr.getxattr('/path/to/dir', u'com.apple.FinderInfo')
ispackage = bool(ord(attrs[8]) & 0x20)  # I *think* this is correct; works for hidden dirs and & 0x40