This question follows up on "The way to make namespace packages in Python" and "How do I create a namespace package in Python?".
Note PEP 420, and the distribute docs, which state:
You must NOT include any other code and data in a namespace package’s __init__.py. Even though it may appear to work during development, or when projects are installed as .egg files, it will not work when the projects are installed using “system” packaging tools – in such cases the __init__.py files will not be installed, let alone executed.
This all seems to make it impossible to have a "main library" package with independently distributed extension sub-packages. What I want is to be able to:
define a core library package, to be used like this:
import mylibrary
mylibrary.some_function()
allow library extensions, packaged and distributed separately, to be used like this:
import mylibrary.myextension
mylibrary.myextension.some_other_function()
I would've expected to be able to do this with namespace packages, but it seems not to be the case, based on the questions and links above. Can this be done at all?
It is indeed not possible to have code in a top level __init__.py for a PEP 420 namespace package.
If I were you, I'd either:
create 2 packages, one called mylibrary (a normal package) which contains your actual library code, and the other called mylibrary_plugins which is a namespace package.
or, create mylibrary.lib, which is a normal package and contains your code, and mylibrary.plugins, which is a namespace package.
Personally I'd use option 1; a rough sketch of that layout follows below.
The rationale section of PEP 420 explains why __init__.py cannot contain any code.
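As a rough sketch of option 1 (the directory and plugin names below are illustrative, not part of the answer):

# the "mylibrary" distribution: a normal package, so real code in __init__.py is fine
mylibrary/
    __init__.py
    core.py

# each plugin is its own distribution contributing to the mylibrary_plugins
# namespace package: no __init__.py anywhere directly under mylibrary_plugins/
mylibrary_plugins/
    some_plugin/
        __init__.py

# usage
import mylibrary
import mylibrary_plugins.some_plugin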
Strictly speaking, you can have attributes under mylibrary; you just won't be able to define them there. You can, for instance:
# mylibrary/core.py
import mylibrary

def some_function():
    pass

mylibrary.some_function = some_function
and your users can use it like:
import mylibrary.core
mylibrary.some_function()
That is to say, mylibrary.core monkey-patches mylibrary so that, apart from the import, it looks as though some_function is defined in mylibrary rather than in a sub-package.
Related
I'm trying to work in an environment where I cannot install packages normally using Pip. As such, I need to bundle all my program's dependencies alongside the full program. This works great for simpler dependencies, but for ones that have their own dependencies, imports fail due to their dependencies not being globally available (since neither of them can be imported using absolute imports).
Currently, my modules are as follows:
main.py            (uses my_module)
my_module/         (depends on foo)
    somefile.py
    anotherfile.py
    foo/           (depends on bar)
        # Contents of third-party module
    bar/
        # Contents of other third-party module
Is there a way that I can alias a module that I have imported relatively, so that whenever some other module tries to import it, it will be referred to the place where I have directed it? I'd much rather do that than have to risk modifying thousands of lines of unfamiliar code to make all the import statements relative.
One way is to create a custom pip package as explained here:
Use custom Python package
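Another way, closer to the aliasing the question describes, is to register the vendored copies in sys.modules before anything else tries to import them. A minimal sketch, assuming the layout above and that my_module/__init__.py does not itself import foo or bar at import time:

# main.py
import sys

# register bar first, because foo's own "import bar" must already resolve
import my_module.bar as bar
sys.modules['bar'] = bar

import my_module.foo as foo
sys.modules['foo'] = foo

# from here on, any plain "import foo" or "import bar" (e.g. inside
# my_module's own code) picks up the bundled copies registered above
import my_module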
Based on the documentation for Packaging Python Projects, __init__.py should be empty. I want to know why, because I'm placing certain objects in the __init__.py file that are used in every module in the package. On checking a bunch of __init__.py files in my local environment for standard packages like importlib, multiprocessing, etc., all of them have a bunch of code in the file.
The sole purpose of __init__.py is to indicate that the folder containing it should be treated as a package; that's why it is recommended to leave it empty.
Consider the following hierarchy:
foo/
    __init__.py
    bar.py
When you use from foo import bar or import foo.bar, the Python interpreter looks for __init__.py in the foo folder; if it finds it, the bar module is imported, otherwise it isn't. However, this behavior has changed over time, and Python may be able to import the modules/packages successfully even if __init__.py is missing; but remember the Zen of Python: "Explicit is better than implicit", so it's always safe to have it.
If you need some package-level variables, you can define them inside the __init__.py file, and all the modules inside the package will be able to use them.
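For example (a minimal sketch; the names are made up):

# foo/__init__.py
DEFAULT_TIMEOUT = 30               # package-level variable

# foo/bar.py
from foo import DEFAULT_TIMEOUT    # any module in the package can use it

def wait():
    print(DEFAULT_TIMEOUT)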
In fact, if you look at PEP 257, it mentions that __init__.py is also where package-level documentation (the package docstring) goes.
You're taking that statement as more general than it's meant. You're reading a statement from a tutorial, where they walk you through creating a simple example project. That particular example project's __init__.py should be empty, simply because the example doesn't need to do anything in __init__.py.
Most projects' __init__.py files will not be empty. Taking a few examples from popular packages, such as numpy, requests, flask, sortedcontainers, or the stdlib asyncio, none of these example __init__.py files are empty. They may perform package initialization, import things from submodules into the main package namespace, or include metadata like __all__, __version__, or a package docstring. The example project is just simplified to the point where it doesn't have any of that.
To my knowledge, there are three things to be aware of when you create a non-empty __init__.py file:
it might be more difficult to follow the code. If you instantiate objects there (e.g. a = B()) it's even worse; I know developers who dislike non-empty __init__.py files for this reason alone.
on package import, the contents of __init__.py are evaluated. Sometimes that is computationally heavy or simply not needed.
namespace conflicts. You can't really define a name bar in __init__.py and also have a bar.py file in your package.
I like importing package contents in __init__.py, as otherwise import statements in bigger projects become ugly. Overall it's neither a good nor a bad practice; the advice to keep it empty applies only to the project in that particular example.
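For instance, re-exporting names in __init__.py keeps client imports short (a sketch with made-up names):

# foo/__init__.py
from foo.bar import TaskList
from foo.baz import run_tasks

# client code: instead of "from foo.bar import TaskList"
from foo import TaskList, run_tasks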
In some cases you don't have any shared components in your package. Suppose you're defining a little package for calculating some algorithms; then you don't need anything shared in your __init__.py.
I've created python modules but they are in different directories.
/xml/xmlcreator.py
/tasklist/tasks.py
Here, tasks.py is trying to import xmlcreator, but both are in different paths. One way to do it is to include xmlcreator.py's directory in PYTHONPATH. But, considering that I'll be publishing the code, this doesn't seem like the right way to go about it, as suggested here. So how do I include xmlcreator, or rather any module written by me, which might live in various directories and subdirectories?
Are you going to publish both modules separately or together in one package?
If the former, then you'll probably want to have your users install your xml module (I'd call it something else :) so that it is, by default, already on Python's path, and declare it as a dependency of the tasklist module.
If both are distributed as a bundle, then relative imports seem to be the best option, since you can control where the paths are relative to each other.
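For instance, if both end up as subpackages of one top-level package (as the next answer lays out), the relative import in tasks.py would look like this (a sketch assuming that layout):

# mypackage/tasklist/tasks.py
from ..xml import xmlcreator   # works because tasklist and xml share the parent package mypackage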
The best way is to create subpackages in a single top-level package that you define. You then ship these together in one package. If you are using setuptools/Distribute and you want to distribute them separately, then you may also define a "namespace package" that the packages will be installed in. You don't need to use any ugly sys.path hacks.
Make a directory tree like this:
mypackage/__init__.py
mypackage/xml/__init__.py
mypackage/xml/xmlcreator.py
mypackage/tasklist/__init__.py
mypackage/tasklist/tasks.py
The __init__.py files may be empty. They define the directory to be a package that Python will search in.
The exception is if you want to use namespace packages; then mypackage/__init__.py should contain:
__import__('pkg_resources').declare_namespace(__name__)
And your setup.py file should contain:
...
namespace_packages=["mypackage"],
...
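Put together, a minimal setup.py might look like the following (a sketch; the metadata values are placeholders):

# setup.py
from setuptools import setup, find_packages

setup(
    name="mypackage",
    version="0.1",
    packages=find_packages(),
    # only needed for the pkg_resources-style namespace package described above
    namespace_packages=["mypackage"],
)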
Then in your code:
from mypackage.xml import xmlcreator
from mypackage.tasklist import tasks
will get them anywhere you need them. You only need to make one name globally unique in this case: the mypackage name.
For developing the code you can put the package in "develop mode", by doing
python setup.py develop --user
This will set up the local python environment to look for your package in your workspace.
When I start a new Python project, I immediately write its setup.py and declare my Python modules/packages, so that then I just do:
python setup.py develop
and everything gets magically added to my PYTHONPATH. If you do it from a virtualenv it's even better, since you don't need to install it system-wide.
Here's more about it:
http://packages.python.org/distribute/setuptools.html#development-mode
From Namespace Packages in distribute, I know I can make use of namespace packages to separate a big Python package into several smaller ones. It is really awesome. The document also mentions:
Note, by the way, that your project's source tree must include the namespace packages' __init__.py files (and the __init__.py of any parent packages), in a normal Python package layout. These __init__.py files must contain the line:
__import__('pkg_resources').declare_namespace(__name__)
This code ensures that the namespace package machinery is operating and that the current package is registered as a namespace package.
I'm wondering: are there any benefits to keeping the hierarchy of directories the same as the hierarchy of packages? Or is this just a technical requirement of the namespace packages feature of distribute/setuptools?
For example, I would like to provide a sub-package foo.bar, so I have to build the following hierarchy of folders and prepare an __init__.py to make setup.py set up the namespace package:
~foo.bar/
~foo.bar/setup.py
~foo.bar/foo/__init__.py <= one-lined file dedicated to namespace packages
~foo.bar/foo/bar/__init__.py
~foo.bar/foo/bar/foobar.py
I'm not familiar with namespace packages, but it looks to me that (1) the foo/bar directory nesting and (2) the (nearly) one-line __init__.py are routine tasks. They do provide some hint of "this is a namespace package", but I think we already have that information in setup.py?
edit:
As illustrated in the following block, can I have a namespace package without that nested directory and one-line __init__.py in my working directory? That is, can we ask setup.py to generate those automatically, just by adding the single line namespace_packages=['foo']?
~foo.bar/
~foo.bar/setup.py
~foo.bar/src/__init__.py <= for bar package
~foo.bar/src/foobar.py
A namespace package mainly has an effect when it comes time to import a sub-package. Basically, here's what happens when importing foo.bar (a runnable sketch follows the list below):
the importer scans through sys.path looking for something that looks like foo.
when it finds something, it will look inside of the discovered foo for bar.
if bar is not found:
if foo is a normal package, an ImportError is raised, indicating that foo.bar doesn't exist.
if foo is a namespace package, the importer goes back to looking through sys.path for the next match of foo. the ImportError is only raised if all paths have been exhausted.
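To see the lookup in action, here is a small self-contained sketch using Python 3.3+ implicit (PEP 420) namespace packages, which follow the same multi-path search; the paths and names are purely illustrative:

# namespace_demo.py -- two independent directories each provide part of "foo"
import os, sys, tempfile

root = tempfile.mkdtemp()
for dist, module in [("dist1", "bar"), ("dist2", "baz")]:
    pkg_dir = os.path.join(root, dist, "foo")      # note: no __init__.py
    os.makedirs(pkg_dir)
    with open(os.path.join(pkg_dir, module + ".py"), "w") as f:
        f.write("WHO = %r\n" % dist)
    sys.path.append(os.path.join(root, dist))

import foo.bar, foo.baz                  # each found on a different sys.path entry
print(foo.bar.WHO, foo.baz.WHO)          # -> dist1 dist2
print(list(foo.__path__))                # both directories are searched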
So that's what it does, but that doesn't explain why you might want it. Suppose you designed a big, useful library (foo), but as part of that you also developed a small but very useful utility (foo.bar) that other Python programmers find useful, even when they don't have a use for the bigger library.
You could distribute them together as one big blob of a package (as you designed it) even though most of the people using it only ever import the sub-module. Your users would find this terribly inconvenient because they'd have to download the whole thing (all 200MB of it!) even though they are only really interested in a 10 line utility class. If you have an open license, you'll probably find that several people end up forking it and now there are a half dozen diverging versions of your utility module.
You could rewrite your whole library so that the utility lives outside the foo namespace (just bar instead of foo.bar). You'd be able to distribute the utility separately, and some of your users would be happy, but that's a lot of work, especially considering that there actually are lots of users using the whole library, so they'd have to rewrite their programs to use the new name.
So what you really want is a way to install foo.bar on its own, but happily coexist with foo when that's desired too.
A namespace package allows exactly this, two totally independent installations of a foo package can coexist. setuptools will recognize that the two packages are designed to live next to each other and politely shift the folders/files in such a way that both are on the path and appear as foo, one containing foo.bar and the other containing the rest of foo.
You'll have two different setup.py scripts, one for each. The foo/__init__.py in both packages has to indicate that it is a namespace package, so the importer knows to continue regardless of which package is discovered first.
I would like to create a library, say foolib, but to keep different subpackages separated, so to have barmodule, bazmodule, all under the same foolib main package. In other words, I want the client code to be able to do
import foolib.barmodule
import foolib.bazmodule
but to distribute barmodule and bazmodule as two independent entities. Replace module with package as well: ba[rz]module can be a full-fledged library with complex content.
The reason behind this choice is manifold:
I would like a user to be able to install only barmodule if that's all they need.
I would like to keep the modules relatively independent and lightweight.
but I would like to keep them under a common namespace.
jQuery has a similar structure with the plugins.
Is it feasible in Python with the standard setuptools and install procedure?
You may be looking for namespace packages. See also PEP 382.
Yes, simply create a foolib directory, add an __init__.py to it, and make each sub-module a .py file.
/foolib
    barmodule.py
    bazmodule.py
then you can import them like so:
from foolib import barmodule
barmodule.some_function()