Neat way to define __all__? - python

Hi I'm building my own package and I have a question on __all__.
Are there any neat way to define __all__, other than explicitly typing each and every function in the module?
I find it very tedious...
I'm trying to make some code which wraps on frequently used libraries such as numpy, pytorch, os. The problem is, the libraries I used to create my modules also gets imported when I import my package.
I want to import every function / class that I defined, but I don't want the third-party libraries that I used in the process to get imported.
I use from .submodule import * in my __init__.py so that I can access my functions inside the submodule directly. (Just like we can access functions directly from the top package like np.sum(), torch.sum() )
My submodule has a lot of functions, and I want to import all of them to __init__.py, except for the third-party packages that I used.
I see that __all__ defines what to import when from package import * is called.
For example,
utils.py
__all__ = ['a']
def a():
pass
def b():
pass
__init__.py
from .utils import *
and
>>> import package
>>> package.a()
None
>>> package.b()
NameError: 'package.b' is not defined
What I want is something like
__all__ = Some_neat_fancy_method()
I tried locals() and dir(), but got lost along the way.
Any suggestions?

As others have pointed out, the whole point of __all__ is to explicitly specify what gets exposed to star-imports. By default everything is. If you really want to specify what doesn't get exposed instead, you can do a little trick and include all modules in __all__ and then remove the ones you want to exclude.
For example:
def _exclude(exclusions: list) -> list:
import types
# add everything as long as it's not a module and not prefixed with _
functions = [name for name, function in globals().items()
if not (name.startswith('_') or isinstance(function, types.ModuleType))]
# remove the exclusions from the functions
for exclusion in exclusions:
if exclusion in functions:
functions.remove(exclusion)
del types # deleting types from scope, introduced from the import
return functions
# the _ prefix is important, to not add these to the __all__
_exclusions = ["function1", "function2"]
__all__ = _exclude(_exclusions)
You can of course repurpose this to simply include everything that's not a function or prefixed with _ but it serves little use since everything is included in star-imports if you don't specify the __all__, so I thought it was better to include the exclusion idea. This way you can simply tell it to exclude specific functions.

Are there any neat way to define all, other than explicitly typing each and every function in the module?
Not built-in no. But defining __all__ by hand is basically the entire point, if you want to include everything in __all__ you can just do nothing at all:
If __all__ is not defined, the statement from sound.effects import * [...] ensures that the package sound.effects has been imported (possibly running any initialization code in __init__.py) and then imports whatever names are defined in the package.
The entire point of __all__ is to restricts what gets "exported" by star-imports. There's no real way for Python to know that except by having you tell it, for each symbol, whether it should be there or not.

One easy workaround is to alias all of your imports with a leading underscore. Anything with a leading underscore is excluded from from x import * style imports.
import numpy as _np
import pandas as _pd
def my_fn():
...

Related

Export decorator that manages __all__

A proper Python module will list all its public symbols in a list called __all__. Managing that list can be tedious, since you'll have to list each symbol twice. Surely there are better ways, probably using decorators so one would merely annotate the exported symbols as #export.
How would you write such a decorator? I'm certain there are different ways, so I'd like to see several answers with enough information that users can compare the approaches against one another.
In Is it a good practice to add names to __all__ using a decorator?, Ed L suggests the following, to be included in some utility library:
import sys
def export(fn):
"""Use a decorator to avoid retyping function/class names.
* Based on an idea by Duncan Booth:
http://groups.google.com/group/comp.lang.python/msg/11cbb03e09611b8a
* Improved via a suggestion by Dave Angel:
http://groups.google.com/group/comp.lang.python/msg/3d400fb22d8a42e1
"""
mod = sys.modules[fn.__module__]
if hasattr(mod, '__all__'):
name = fn.__name__
all_ = mod.__all__
if name not in all_:
all_.append(name)
else:
mod.__all__ = [fn.__name__]
return fn
We've adapted the name to match the other examples. With this in a local utility library, you'd simply write
from .utility import export
and then start using #export. Just one line of idiomatic Python, you can't get much simpler than this. On the downside, the module does require access to the module by using the __module__ property and the sys.modules cache, both of which may be problematic in some of the more esoteric setups (like custom import machinery, or wrapping functions from another module to create functions in this module).
The python part of the atpublic package by Barry Warsaw does something similar to this. It offers some keyword-based syntax, too, but the decorator variant relies on the same patterns used above.
This great answer by Aaron Hall suggests something very similar, with two more lines of code as it doesn't use __dict__.setdefault. It might be preferable if manipulating the module __dict__ is problematic for some reason.
You could simply declare the decorator at the module level like this:
__all__ = []
def export(obj):
__all__.append(obj.__name__)
return obj
This is perfect if you only use this in a single module. At 4 lines of code (plus probably some empty lines for typical formatting practices) it's not overly expensive to repeat this in different modules, but it does feel like code duplication in those cases.
You could define the following in some utility library:
def exporter():
all = []
def decorator(obj):
all.append(obj.__name__)
return obj
return decorator, all
export, __all__ = exporter()
export(exporter)
# possibly some other utilities, decorated with #export as well
Then inside your public library you'd do something like this:
from . import utility
export, __all__ = utility.exporter()
# start using #export
Using the library takes two lines of code here. It combines the definition of __all__ and the decorator. So people searching for one of them will find the other, thus helping readers to quickly understand your code. The above will also work in exotic environments, where the module may not be available from the sys.modules cache or where the __module__ property has been tampered with or some such.
https://github.com/russianidiot/public.py has yet another implementation of such a decorator. Its core file is currently 160 lines long! The crucial points appear to be the fact that it uses the inspect module to obtain the appropriate module based on the current call stack.
This is not a decorator approach, but provides the level of efficiency I think you're after.
https://pypi.org/project/auto-all/
You can use the two functions provided with the package to "start" and "end" capturing the module objects that you want included in the __all__ variable.
from auto_all import start_all, end_all
# Imports outside the start and end functions won't be externally availab;e.
from pathlib import Path
def a_private_function():
print("This is a private function.")
# Start defining externally accessible objects
start_all(globals())
def a_public_function():
print("This is a public function.")
# Stop defining externally accessible objects
end_all(globals())
The functions in the package are trivial (a few lines), so could be copied into your code if you want to avoid external dependencies.
While other variants are technically correct to a certain extent, one might also be sure that:
if the target module already has __all__ declared, it is handled correctly;
target appears in __all__ only once:
# utils.py
import sys
from typing import Any
def export(target: Any) -> Any:
"""
Mark a module-level object as exported.
Simplifies tracking of objects available via wildcard imports.
"""
mod = sys.modules[target.__module__]
__all__ = getattr(mod, '__all__', None)
if __all__ is None:
__all__ = []
setattr(mod, '__all__', __all__)
elif not isinstance(__all__, list):
__all__ = list(__all__)
setattr(mod, '__all__', __all__)
target_name = target.__name__
if target_name not in __all__:
__all__.append(target_name)
return target

Import symbols starting with underscore

I'm writing a simple Python library in which I have several "private" functions starting with underscore:
def _a():
pass
def _b():
pass
def public_interface_call():
_a()
_b()
This way my library users can simply do from MyLib.Module import * and their namespace won't be cluttered with implementation detail.
However I'm also writing unit tests in which I'd love to test these functions separately and simple importing truly all symbols from my module would be very handy. Currently I'm doing from Mylib.Module import _a _b public_interface_call but I wonder if there's any better/quicker/cleaner way to achieve what I want?
I'm not sure if it was a blackout or something when I wrote that question but today I realized (inspired by Underyx's comment) that I can simply do this:
import MyLib.Module
MyLib.Module._a()
MyLib.Module._b()
Or even to shorten things a little (because I'm a lazy bastard):
import MyLib.Module as mm
mm._a()
mm._b()
According to docs,
There is even a variant to import all names that a module defines:
from fibo import *
...
This imports all names except those beginning with an underscore (_).
Not sure why this is the case however.
The best and most common solution for your problem already has been given:
import MyLib.Module as mm
If one still wants to make use of the variant from MyLib.Module import *, there is the possibility to override its default behavior: Add a list __all__ to the module's source file MyLib/Module.py and declare which objects should be exported.
Note that you need to do this for every object you want to be visible. This does not affect the behavior of the other import mechanism above.
Example:
# MyLib/Module.py
__all__ = ["_a", "_b"]
def _a(): pass
def _b(): pass
def c(): pass
# ...
# Import
from MyLib.Module import *
# => this imports _a() and _b(), but not c()
To specify a package index __all__ can make sense to control which submodules should be loaded when importing a package. See here for more information, or this SO thread on a related question.

Export __all__ as something that is not itself

I want my_module to export __all__ as empty list, i.e.
from my_module import *
assert '__all__' in dir() and __all__ == []
I can export __all__ like this (in 'my_module.py'):
__all__ = ['__all__']
However it predictably binds __all__ to itself , so that
from my_module import *
assert '__all__' in dir() and __all__ == ['__all__']
How can I export __all__ as an empty list? Failing that, how can I hook into import process to put __all__ into importing module's __dict__ on every top level import my_module statement, circumventing module caching.
I'll start with saying this is, in my mind, a terrible idea. You really should not implicitly alter what is exported from a module, this goes counter to the Zen of Python: Explicit is better than implicit..
I also agree with the highest-voted answer on the question you cite; Python already has a mechanism to mark functions 'private', by convention we use a leading underscore to indicate a function should not be considered part of the module API. This approach works with existing tools, vs. the decorator dynamically setting __all__ which certainly breaks static code analysers.
That out of the way, here is a shotgun pointing at your foot. Use it with care.
What you want here is a way to detect when names are imported. You cannot normally do this; there are no hooks for import statements. Once a module has been imported from source, a module object is added to sys.modules and re-used for subsequent imports, but that object is not notified of imports.
What you can do is hook into attribute access. Not with the default module object, but you can stuff any object into sys.modules and it'll be treated as a module. You could just subclass the module type even, then add a __getattribute__ method to that. It'll be called when importing any name with from module import name, for all names listed in __all__ when using from module import *, and in Python 3, __spec__ is accessed for all import forms, even when doing just import module.
You can then use this to hack your way into the calling frame globals, via sys._getframe():
import sys
import types
class AttributeAccessHookModule(types.ModuleType):
def __getattribute__(self, name):
if name == '__all__':
# assume we are being imported with from module import *
g = sys._getframe(1).f_globals
if '__all__' not in g:
g['__all__'] = []
return super(AttributeAccessHook, self).__getattribute__(name)
# replace *this* module with our hacked-up version
# this part goes at the *end* of your module.
replacement = sys.module[__name__] = AttributeAccessHook(__name__, __doc__)
for name, obj in globals().items():
setattr(replacement, name, obj)
The guy there sets __all__ on first decorator application, so not explicitly exporting anything causes it to implicitly export everything. I am trying to improve on this design: if the decorator is imported, then export nothing my default, regardless of it's usage.
Just set __all__ to an empty list at the start of your module, e.g.:
# this is my_module.py
from utilitymodule import public
__all__ = []
# and now you could use your #public decorator to optionally add module to it

Should I define __all__ even if I prefix hidden functions and variables with underscores in modules?

From the perspective of an external user of the module, are both necessary?
From my understanding, by correctly prefix hidden functions with an underscore it essentially does the same thing as explicitly define __all__, but I keep seeing developers doing both in their code. Why is that?
When importing from a module with from modulename import * names starting with underscores are indeed skipped.
However, a module rarely contains only public API objects. Usually you've made imports to support the code as well, and those names are global in the module as well. Without __all__, those names would be part of the import too.
In other words, unless you want to 'export' os in the following example you should use __all__:
import os
from .implementation import some_other_api_call
_module_path = os.path.dirname(os.path.abspath(__file__))
_template = open(os.path.join(_module_path, 'templates/foo_template.txt')).read()
VERSION = '1.0.0'
def make_bar(baz, ham, spam):
return _template.format(baz, ham, spam)
__all__ = ['some_other_api_call', 'make_bar']
because without the __all__ list, Python cannot distinguish between some_other_api_call and os here and divine which one should not be imported when using from ... import *.
You could work around this by renaming all your imports, so import os as _os, but that'd just make your code less readable.
And an explicit export list is always nice. Explicit is better than implicit, as the Zen of Python tells you.
I also use __all__: that explictly tells module users what you intend to export. Searching the module for names is tedious, even if you are careful to do, e.g., import os as _os, etc. A wise man once wrote "explicit is better than implicit" ;-)
Defining all will overide the default behaviour. There is actually might be one reason to define __all__
When importing a module, you might want that doing from mod import * will import only a minimal amount of things. Even if you prefix everything correctly, there could be reasons not to import everything.
The other problem that I had once was defining a gettext shortcut. The translation function was _ which would not get imported. Even if it is prefixed "in theory" I still wanted it to get exported.
One other reason as stated above is importing a module that imports a lot of thing. As python cannot make the difference between symbols created by imports and the one defined in the actual module. It will automatically reexport everything that can be reexported. For that reason, it could be wise to explicitely limit the thing exported to the things you want to export.
With that in mind, you might want to have some prefixed symbols exported by default. Usually, you don't redefine __all__. Whenever you need it to do something unusual then it may make sense to do it.

How should I perform imports in a python module without polluting its namespace?

I am developing a Python package for dealing with some scientific data. There are multiple frequently-used classes and functions from other modules and packages, including numpy, that I need in virtually every function defined in any module of the package.
What would be the Pythonic way to deal with them? I have considered multiple variants, but every has its own drawbacks.
Import the classes at module-level with from foreignmodule import Class1, Class2, function1, function2
Then the imported functions and classes are easily accessible from every function. On the other hand, they pollute the module namespace making dir(package.module) and help(package.module) cluttered with imported functions
Import the classes at function-level with from foreignmodule import Class1, Class2, function1, function2
The functions and classes are easily accessible and do not pollute the module, but imports from up to a dozen modules in every function look as a lot of duplicate code.
Import the modules at module-level with import foreignmodule
Not too much pollution is compensated by the need to prepend the module name to every function or class call.
Use some artificial workaround like using a function body for all these manipulations and returning only the objects to be exported... like this
def _export():
from foreignmodule import Class1, Class2, function1, function2
def myfunc(x):
return function1(x, function2(x))
return myfunc
myfunc = _export()
del _export
This manages to solve both problems, module namespace pollution and ease of use for functions... but it seems to be not Pythonic at all.
So what solution is the most Pythonic? Is there another good solution I overlooked?
Go ahead and do your usual from W import X, Y, Z and then use the __all__ special symbol to define what actual symbols you intend people to import from your module:
__all__ = ('MyClass1', 'MyClass2', 'myvar1', …)
This defines the symbols that will be imported into a user's module if they import * from your module.
In general, Python programmers should not be using dir() to figure out how to use your module, and if they are doing so it might indicate a problem somewhere else. They should be reading your documentation or typing help(yourmodule) to figure out how to use your library. Or they could browse the source code yourself, in which case (a) the difference between things you import and things you define is quite clear, and (b) they will see the __all__ declaration and know which toys they should be playing with.
If you try to support dir() in a situation like this for a task for which it was not designed, you will have to place annoying limitations on your own code, as I hope is clear from the other answers here. My advice: don't do it! Take a look at the Standard Library for guidance: it does from … import … whenever code clarity and conciseness require it, and provides (1) informative docstrings, (2) full documentation, and (3) readable code, so that no one ever has to run dir() on a module and try to tell the imports apart from the stuff actually defined in the module.
One technique I've seen used, including in the standard library, is to use import module as _module or from module import var as _var, i.e. assigning imported modules/variables to names starting with an underscore.
The effect is that other code, following the usual Python convention, treats those members as private. This applies even for code that doesn't look at __all__, such as IPython's autocomplete function.
An example from Python 3.3's random module:
from warnings import warn as _warn
from types import MethodType as _MethodType, BuiltinMethodType as _BuiltinMethodType
from math import log as _log, exp as _exp, pi as _pi, e as _e, ceil as _ceil
from math import sqrt as _sqrt, acos as _acos, cos as _cos, sin as _sin
from os import urandom as _urandom
from collections.abc import Set as _Set, Sequence as _Sequence
from hashlib import sha512 as _sha512
Another technique is to perform imports in function scope, so that they become local variables:
"""Some module"""
# imports conventionally go here
def some_function(arg):
"Do something with arg."
import re # Regular expressions solve everything
...
The main rationale for doing this is that it is effectively lazy, delaying the importing of a module's dependencies until they are actually used. Suppose one function in the module depends on a particular huge library. Importing the library at the top of the file would mean that importing the module would load the entire library. This way, importing the module can be quick, and only client code that actually calls that function incurs the cost of loading the library. Further, if the dependency library is not available, client code that doesn't need the dependent feature can still import the module and call the other functions. The disadvantage is that using function-level imports obscures what your code's dependencies are.
Example from Python 3.3's os.py:
def get_exec_path(env=None):
"""[...]"""
# Use a local import instead of a global import to limit the number of
# modules loaded at startup: the os module is always loaded at startup by
# Python. It may also avoid a bootstrap issue.
import warnings
Import the module as a whole: import foreignmodule. What you claim as a drawback is actually a benefit. Namely, prepending the module name makes your code easier to maintain and makes it more self-documenting.
Six months from now when you look at a line of code like foo = Bar(baz) you may ask yourself which module Bar came from, but with foo = cleverlib.Bar it is much less of a mystery.
Of course, the fewer imports you have, the less of a problem this is. For small programs with few dependencies it really doesn't matter all that much.
When you find yourself asking questions like this, ask yourself what makes the code easier to understand, rather than what makes the code easier to write. You write it once but you read it a lot.
For this situation I would go with an all_imports.py file which had all the
from foreignmodule import .....
from another module import .....
and then in your working modules
import all_imports as fgn # or whatever you want to prepend
...
something = fgn.Class1()
Another thing to be aware of
__all__ = ['func1', 'func2', 'this', 'that']
Now, any functions/classes/variables/etc that are in your module, but not in your modules's __all__ will not show up in help(), and won't be imported by from mymodule import * See Making python imports more structured? for more info.
I would compromise and just pick a short alias for the foreign module:
import foreignmodule as fm
It saves you completely from the pollution (probably the bigger issue) and at least reduces the prepending burden.
I know this is an old question. It may not be 'Pythonic', but the cleanest way I've discovered for exporting only certain module definitions is, really as you've found, to globally wrap the module in a function. But instead of returning them, to export names, you can simply globalize them (global thus in essence becomes a kind of 'export' keyword):
def module():
global MyPublicClass,ExportedModule
import somemodule as ExportedModule
import anothermodule as PrivateModule
class MyPublicClass:
def __init__(self):
pass
class MyPrivateClass:
def __init__(self):
pass
module()
del module
I know it's not much different than your original conclusion, but frankly to me this seems to be the cleanest option. The other advantage is, you can group any number of modules written this way into a single file, and their private terms won't overlap:
def module():
global A
i,j,k = 1,2,3
class A:
pass
module()
del module
def module():
global B
i,j,k = 7,8,9 # doesn't overwrite previous declarations
class B:
pass
module()
del module
Though, keep in mind their public definitions will, of course, overlap.

Categories

Resources