Make methods available at a higher level in a package - python

Consider the following package structure:
foo/                  # package name
    spam/             # module name
        __init__.py
        eggs.py       # contains "bar" method
        exceptions.py # contains "BarException" class
Now in order to call the bar method, we have to do
import spam
spam.eggs.bar()
I'd like to lose eggs.
Now I know it is possible to import ... as (and from ... import), but is there no way to make methods available higher up in a tree?
Things I do not want to resort to:
lots of from ... import ...
putting my eggs.py code in __init__.py instead
starred imports
long names like spam.exceptions.BarException (possibly longer)
An example would be to have exceptions.py where I define my exception classes.
Whenever I want to make them available to a user, I wouldn't want them to have to use spam.exceptions.BarException, but rather spam.BarException.
Goal:
import spam

try:
    spam.bar()  # in this case throws BarException
except spam.BarException:
    pass

Note that, contrary to your comments, the top foo is not the package name, it's just a directory that's (presumably) on your sys.path somewhere, and spam is not the module name but the package name, and eggs is the module name. So:
foo/                  # directory package is in
    spam/             # package name
        __init__.py
        eggs.py       # contains "bar" method
        exceptions.py # contains "BarException" class
The key to what you want to do is this:
Any global names in spam/__init__.py are members of the spam package. It doesn't matter whether they were actually defined in __init__.py, or imported from somewhere else.
So, if you want to make the spam.eggs.bar function available as spam.bar, all you have to do is add this line to spam/__init__.py:
from .eggs import bar
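Putting it together, here is a minimal sketch of spam/__init__.py that satisfies the goal from the question (assuming bar lives in eggs.py and BarException in exceptions.py, as in the tree above):

# spam/__init__.py -- re-export the public names at package level
from .eggs import bar
from .exceptions import BarException

With those two lines, the try/except example from the question works exactly as written.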
If you have an __all__ attribute in spam/__init__.py to define the public attributes of spam, you will want to add bar to that list:
__all__ = ['other', 'stuff', 'directly', 'in', 'spam', 'bar']
If you want to re-export everything public from spam.eggs as a public part of spam, you can just do this:
from .eggs import *
__all__ = ['other', 'stuff', 'directly', 'in', 'spam'] + eggs.__all__
And of course you can extend this to more than one child module:
from .eggs import *
from .exceptions import *
__all__ = (['other', 'stuff', 'directly', 'in', 'spam'] +
           eggs.__all__ +
           exceptions.__all__)
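For eggs.__all__ and exceptions.__all__ to exist, each child module must define its own __all__. A sketch of what spam/eggs.py might look like (the body of bar is hypothetical):

# spam/eggs.py -- declares its own public names so the package can re-export them
__all__ = ['bar']

def bar():
    pass  # hypothetical body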
This is common in the stdlib, and in popular third-party packages. For a good example, see the source to asyncio/__init__.py from Python 3.4.
However, it's only really common in this exact case: you want your users to be able to treat your package as if it were a simple, flat module, but it actually has some internal structure (either because the implementation would be too complicated otherwise, or because occasionally users will need that structure). If you're pulling in names from a grandchild, sibling, or parent instead of a child, you're probably abusing the idiom (or at least you should stop and convince yourself that you're not).

In your __init__.py, you can import things from other modules in the package. If in __init__.py you do from .eggs import bar, then someone can do import spam and access spam.bar. If in __init__.py you do from .exceptions import BarException, then someone can do import spam and then do spam.BarException.
However, you should be wary of going too far with this. Using nesting in packages and modules has a purpose, namely to create separate namespaces. Explicitly importing a few common things from a submodule to the top level is fine, but if you start trying to implicitly make everything available at the top level, you set yourself up for name collisions down the road (e.g., if one module defines something called Blah and then later another module also does so, without realizing they will collide when they're both imported to the top level).
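A sketch of that hazard (module names hypothetical):

# spam/__init__.py -- star-imports silently shadow colliding names
from .module_a import *  # suppose this defines Blah
from .module_b import *  # if this also defines Blah, it wins without warning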
"Forcing" users to use from is not an onerous requirement. If cumbersome imports are required to use your library, that may be a sign that your package/module structure is too cumbersome, and you should combine some things rather than splitting them up into separate directories/files.
Incidentally, the file structure you have indicated in your post has some problems. The top-level foo as you have shown is not a package, since it doesn't have an __init__.py. The second-level spam is not a module, since it is not a file. In your example, spam is a package, and it has inside it a module called eggs (in the file eggs.py); the top-level foo directory has no status in the Python packaging system.

Related

Multi layer package in python

I have a python package with packages in it. This explanation seems strange, so I'll include my package's structure:
package/
    __init__.py
    subpackage1/
        __init__.py
        file1.py
    subpackage2/
        __init__.py
        file2.py
(I'm simplifying it for easier understanding).
The __init__.py on the top level looks like this:
__all__ = ["subpackage1", "subpackage2"]
And, for some reason, when importing the package, it doesn't recognise anything from file1.py or file2.py. Any ideas how to fix it?
If you need more details, here's the project on GitHub: https://github.com/Retr0MrWave/mathModule. The directory I called package is mathmodule_pkg in the actual project.
Filling the __all__ field with names does not make imports possible; it merely serves as a hint of what you mean to make importable. This hint is picked up by star-imports to restrict what is imported, and IDEs like PyCharm also use it to get an idea of what is and isn't exposed - but that's about it.
If you want to enable top-level imports of your nested classes and functions, you need to
import them into the top-level __init__.py
bind them to names that can be used for the import
optionally, reference said names in __all__ to make the API nice and obvious
Using the project you're referencing as an example, this is what it would look like:
mathmodule_pkg/__init__.py
import mathmodule_pkg.calculus.DerrivativeAndIntegral #1
integral = mathmodule_pkg.calculus.DerrivativeAndIntegral.integral #2
__all__ = ['integral'] # 3
Using the very common form of from some.package import some_name we can combine steps 1 and 2 and reduce the potential for bugs when re-binding the name:
from mathmodule_pkg.calculus.DerrivativeAndIntegral import integral # 1 and 2
__all__ = ['integral'] # 3
Using either form, after installing your package the following will be possible:
>>> from mathmodule_pkg import integral
>>> integral(...)

Import method from Python submodule in __init__, but not submodule itself

I have a Python module with the following structure:
mymod/
    __init__.py
    tools.py

# __init__.py
from .tools import foo

# tools.py
def foo():
    return 42
Now, when I import mymod, I see that it has the following members:
mymod.foo()
mymod.tools.foo()
I don't want the latter though; it just pollutes the namespace.
Funnily enough, if tools.py is called foo.py you get what you want:
mymod.foo()
(Obviously, this only works if there is just one function per file.)
How do I avoid importing tools? Note that putting foo() into __init__.py is not an option. (In reality, there are many functions like foo which would absolutely clutter the file.)
The existence of the mymod.tools attribute is crucial to maintaining proper function of the import system. One of the normal invariants of Python imports is that if a module x.y is registered in sys.modules, then the x module has a y attribute referring to the x.y module. Otherwise, things like
import x.y
x.y.y_function()
break, and depending on the Python version, even
from x import y
can break. Even if you don't think you're doing any of the things that would break, other tools and modules rely on these invariants, and trying to remove the attribute causes a slew of compatibility problems that are nowhere near worth it.
Trying to make tools not show up in your mymod module's namespace is kind of like trying to not make "private" (leading-underscore) attributes show up in your objects' namespaces. It's not how Python is designed to work, and trying to force it to work that way causes more problems than it solves.
The leading-underscore convention isn't just for instance variables. You could mark your tools module with a leading underscore, renaming it to _tools. This would prevent it from getting picked up by from mymod import * imports (unless you explicitly put it in an __all__ list), and it'd change how IDEs and linters treat attempts to access it directly.
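A sketch of that rename, keeping foo public while marking the module non-public:

# mymod/__init__.py -- _tools is non-public by convention, foo stays public
from ._tools import foo

mymod._tools.foo is still reachable, but the underscore signals it is an implementation detail.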
You are not importing the tools module; it's just available when you import the package, as you're doing here:
import mymod
You will have access to everything defined in the __init__ file and all the modules of this package:
import mymod
# Reference a module
mymod.tools
# Reference a member of a module
mymod.tools.foo
# And any other modules from this package
mymod.tools.subtools.func
When you import foo inside __init__, you are just making foo available there, just as if you had defined it there; but of course you defined it in tools, which is a way to organize your package, so now, since you imported it inside __init__, you can:
import mymod
mymod.foo()
Or you can import foo alone:
from mymod import foo
foo()
But you can import foo without making it available inside __init__ at all; the following works even then, and has exactly the same effect as the example above:
from mymod.tools import foo
foo()
You can use both approaches; they're both right. In all these examples you are not "cluttering the file": as you can see, accessing foo as mymod.tools.foo is namespaced, so you can have multiple foos defined in other modules.
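For instance, the namespacing means two modules can each define a foo without conflict (the second module here is hypothetical):

from mymod.tools import foo as tools_foo
from mymod.other import foo as other_foo  # hypothetical second module

tools_foo()  # returns 42
other_foo()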
Try putting this in your __init__.py file:
from .tools import foo
del tools

Approach to Python package contents that are both part of the 'public interface' and 'used internally'?

I am in the midst of refactoring some single-file Python modules into multi-file packages, and I am encountering the same problem pattern repeatedly: I have objects that are part of the public interface of the package, but must also be used internally by submodules of the package.
mypackage/
    __init__.py # <--- Contains object 'cssURL'
    views.py    # <--- Needs to use object 'cssURL'
In this case, it's important that clients of mypackage have access to mypackage.cssURL. However, my submodule, views.py, also needs it, but has no access to the contents of __init__.py. Sure, I can create another submodule like so:
mypackage/
    __init__.py
    views.py
    style.py    # <--- New home for 'cssURL'
However, if I did this every time, it seems like it would multiply the number of submodules exceedingly. Moreover, clients must now refer to mypackage.cssURL as mypackage.style.cssURL, or else I must create a synonym in __init__.py like this:
import style
cssURL = style.cssURL
I think I am doing something wrong. Is there a better way to handle these kinds of package members that are both part of the public interface and used internally?
You can refer to the current package as .:
# views.py
from . import cssURL
See here for more information.
I would structure it as follows:
/mypackage
    __init__.py
        from .style import cssURL
        ...
    style.py
        cssURL = '...' # or whatever
        ...
    views.py
        from .style import cssURL
        ...
If other modules within the same package need them, I wouldn't define names in __init__.py; just create an alias there for external consumers to use.
As far as I know, the preferred way is to create a "synonym" in __init__.py with "from .style import cssURL"; cf. the source for the json module.

Module name different than directory name?

Let's assume I have a python package called bestpackage.
Convention dictates that bestpackage would also be a directory on sys.path that contains an __init__.py to make the interpreter assume it can be imported from.
Is there any way I can set a variable for the package name so the directory could be named something different than the directive I import it with? Is there any way to make the namespacing not care about the directory name and honor some other config instead?
My super trendy client-side devs are just so much in love with these sexy something.otherthing.js project names for one of our smaller side projects!
EDIT:
To clarify, the main purpose of my question was to allow my client side guys continue to call the directories in their "projects" (the one we all have added to our paths) folder using their existing convention (some.app.js), even though in some cases they are in fact python packages that will be on the path and sourced to import statements internally. I realize this is in practice a pretty horrible thing and so I ask more out of curiosity. So part of the big problem here is circumventing the fact that the . in the directory name (and thereby the assumed package name) implies attribute access. It doesn't really surprise me that this cannot be worked around, I was just curious if it was possible deeper in the "magic" behind import.
There's some great responses here, but all rely on doing a classical import of some kind where the attribute accessor . will clash with the directory names.
A directory with an __init__.py file is called a package.
And no, the package name is always the same as the directory. That's how Python discovers packages: it matches the name against directory names found on the search path, and if there is an __init__.py file in that directory it has found a match and imports the __init__.py file contained.
You can always import something into your local module namespace under a shorter, easier to use name using the from module import something or the import module as alias syntax:
from something.otherthing.js import foo
from something.otherthing import js as bar
import something.otherthing.js as hamspam
There is one solution which needs one initial import somewhere:
>>> import sys
>>> sys.modules['blinot_existing_blubb'] = sys
>>> import blinot_existing_blubb
>>> blinot_existing_blubb
<module 'sys' (built-in)>
Without a change to the import mechanism you cannot import under another name. This is intended, I think, to make Python easier to understand.
However if you want to change the import mechanism I recommend this: Getting the Most Out of Python Imports
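Applied to the question, the same sys.modules trick can alias a real, importable package under a second name - a sketch with hypothetical names:

import sys
import realpackage                       # the directory's actual importable name

sys.modules['prettyname'] = realpackage  # register the alias

import prettyname                        # now resolves to realpackage

Note this only helps if the real directory name is itself importable; it cannot rescue a directory name containing dots.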
Well, first I would say that Python is not Java/Javascript/C/C++/Cobol/YourFavoriteLanguageThatIsntPython. Of course, in the real world, some of us have to answer to bosses who disagree. So if all you want is some indirection, use smoke and mirrors, as long as they don't pay too much attention to what's under the hood. Write your module the Python way, then provide an API on the side in the dot-heavy style that your coworkers want. Ex:
pythonic_module.py
def func_1():
    pass

def func_2():
    pass

def func_3():
    pass

def func_4():
    pass
indirection
/dotty_api_1/__init__.py
from pythonic_module import func_1 as foo, func_2 as bar
/dotty_api_2/__init__.py
from pythonic_module import func_3 as foo, func_4 as bar
Now they can dot to their hearts' content, but you can write things the Pythonic way under the hood.
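And a sketch of the facade in use:

import dotty_api_1
import dotty_api_2

dotty_api_1.foo()  # actually pythonic_module.func_1
dotty_api_2.bar()  # actually pythonic_module.func_4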
Actually yes!
You could do a canonical import Whatever or newmodulename = __import__("Whatever").
Python keeps track of your modules, and you can inspect that by doing:
import sys
print(sys.modules)
See this article for more details.
But maybe that's not your problem? Let's guess: you have a module in a different path, which your current project can't access because it's not on the sys.path? Well, then just add:
import sys
sys.path.append('path_to_the_other_package_or_module_directory')
prior to your import statement or see this SO-post for a more permanent solution.
I was looking for this to happen with setup.py at sdist and install time, rather than runtime, and found the directive package_dir:
https://docs.python.org/3.5/distutils/setupscript.html#listing-whole-packages
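A minimal sketch of that directive (the on-disk directory name is hypothetical; see the linked docs for the exact semantics):

# setup.py -- map the importable name onto a differently-named source directory
from distutils.core import setup

setup(
    name='bestpackage',
    version='0.1',
    packages=['bestpackage'],
    # users still write "import bestpackage"; only the on-disk name differs
    package_dir={'bestpackage': 'something.otherthing.js'},
)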

What does __all__ mean in Python?

I see __all__ in __init__.py files. What does it do?
Linked to, but not explicitly mentioned here, is exactly when __all__ is used. It is a list of strings defining what symbols in a module will be exported when from <module> import * is used on the module.
For example, the following code in a foo.py explicitly exports the symbols bar and baz:
__all__ = ['bar', 'baz']
waz = 5
bar = 10
def baz(): return 'baz'
These symbols can then be imported like so:
from foo import *
print(bar)
print(baz)
# The following will trigger an exception, as "waz" is not exported by the module
print(waz)
If the __all__ above is commented out, this code will then execute to completion, as the default behaviour of import * is to import all symbols that do not begin with an underscore, from the given namespace.
Reference: https://docs.python.org/tutorial/modules.html#importing-from-a-package
NOTE: __all__ affects the from <module> import * behavior only. Members that are not mentioned in __all__ are still accessible from outside the module and can be imported with from <module> import <member>.
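For example, continuing with the foo.py module above:

# waz is absent from __all__, but an explicit import still works
from foo import waz
print(waz)  # 5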
It's a list of public objects of that module, as interpreted by import *. It overrides the default of hiding everything that begins with an underscore.
Explain __all__ in Python?
I keep seeing the variable __all__ set in different __init__.py files.
What does this do?
What does __all__ do?
It declares the semantically "public" names from a module. If there is a name in __all__, users are expected to use it, and they can have the expectation that it will not change.
It also will have programmatic effects:
import *
__all__ in a module, e.g. module.py:
__all__ = ['foo', 'Bar']
means that when you import * from the module, only those names in the __all__ are imported:
from module import * # imports foo and Bar
Documentation tools
Documentation and code autocompletion tools may (in fact, should) also inspect the __all__ to determine what names to show as available from a module.
__init__.py makes a directory a Python package
From the docs:
The __init__.py files are required to make Python treat the directories as containing packages; this is done to prevent directories with a common name, such as string, from unintentionally hiding valid modules that occur later on the module search path.
In the simplest case, __init__.py can just be an empty file, but it can also execute initialization code for the package or set the __all__ variable.
So the __init__.py can declare the __all__ for a package.
Managing an API:
A package is typically made up of modules that may import one another, but that are necessarily tied together with an __init__.py file. That file is what makes the directory an actual Python package. For example, say you have the following files in a package:
package
├── __init__.py
├── module_1.py
└── module_2.py
Let's create these files with Python so you can follow along - you could paste the following into a Python 3 shell:
from pathlib import Path
package = Path('package')
package.mkdir()
(package / '__init__.py').write_text("""
from .module_1 import *
from .module_2 import *
""")
package_module_1 = package / 'module_1.py'
package_module_1.write_text("""
__all__ = ['foo']
imp_detail1 = imp_detail2 = imp_detail3 = None
def foo(): pass
""")
package_module_2 = package / 'module_2.py'
package_module_2.write_text("""
__all__ = ['Bar']
imp_detail1 = imp_detail2 = imp_detail3 = None
class Bar: pass
""")
And now you have presented a complete api that someone else can use when they import your package, like so:
import package
package.foo()
package.Bar()
And the package won't have all the other implementation details you used when creating your modules cluttering up the package namespace.
__all__ in __init__.py
After more work, maybe you've decided that the modules are too big (like many thousands of lines?) and need to be split up. So you do the following:
package
├── __init__.py
├── module_1
│   ├── foo_implementation.py
│   └── __init__.py
└── module_2
    ├── Bar_implementation.py
    └── __init__.py
First make the subpackage directories with the same names as the modules:
subpackage_1 = package / 'module_1'
subpackage_1.mkdir()
subpackage_2 = package / 'module_2'
subpackage_2.mkdir()
Move the implementations:
package_module_1.rename(subpackage_1 / 'foo_implementation.py')
package_module_2.rename(subpackage_2 / 'Bar_implementation.py')
Create __init__.pys for the subpackages that declare the __all__ for each:
(subpackage_1 / '__init__.py').write_text("""
from .foo_implementation import *
__all__ = ['foo']
""")
(subpackage_2 / '__init__.py').write_text("""
from .Bar_implementation import *
__all__ = ['Bar']
""")
And now you still have the api provisioned at the package level:
>>> import package
>>> package.foo()
>>> package.Bar()
<package.module_2.Bar_implementation.Bar object at 0x7f0c2349d210>
And you can easily add things to your API that you can manage at the subpackage level instead of the subpackage's module level. If you want to add a new name to the API, you simply update the __init__.py, e.g. in module_2:
from .Bar_implementation import *
from .Baz_implementation import *
__all__ = ['Bar', 'Baz']
And if you're not ready to publish Baz in the top level API, in your top level __init__.py you could have:
from .module_1 import * # also constrained by __all__'s
from .module_2 import * # in the __init__.py's
__all__ = ['foo', 'Bar'] # further constraining the names advertised
and if your users are aware of the availability of Baz, they can use it:
import package
package.Baz()
but if they don't know about it, other tools (like pydoc) won't inform them.
You can later change that when Baz is ready for prime time:
from .module_1 import *
from .module_2 import *
__all__ = ['foo', 'Bar', 'Baz']
Prefixing _ versus __all__:
By default, Python will export all names that do not start with an _ when imported with import *. As demonstrated by the shell session here, import * does not bring in the _us_non_public name from the us.py module:
$ cat us.py
USALLCAPS = "all caps"
us_snake_case = "snake_case"
_us_non_public = "shouldn't import"
$ python
Python 3.10.0 (default, Oct 4 2021, 17:55:55) [GCC 10.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from us import *
>>> dir()
['USALLCAPS', '__annotations__', '__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'us_snake_case']
You certainly could rely on this mechanism. Some packages in the Python standard library, in fact, do rely on this, but to do so, they alias their imports, for example, in ctypes/__init__.py:
import os as _os, sys as _sys
Using the _ convention can be more elegant because it removes the redundancy of naming the names again. But it adds the redundancy for imports (if you have a lot of them) and it is easy to forget to do this consistently - and the last thing you want is to have to indefinitely support something you intended to only be an implementation detail, just because you forgot to prefix an _ when naming a function.
I personally write an __all__ early in my development lifecycle for modules so that others who might use my code know what they should use and not use.
Most packages in the standard library also use __all__.
When avoiding __all__ makes sense
It makes sense to stick to the _ prefix convention in lieu of __all__ when:
You're still in early development mode and have no users, and are constantly tweaking your API.
Maybe you do have users, but you have unittests that cover the API, and you're still actively adding to the API and tweaking in development.
An export decorator
The downside of using __all__ is that you have to write the names of functions and classes being exported twice - and the information is kept separate from the definitions. We could use a decorator to solve this problem.
I got the idea for such an export decorator from David Beazley's talk on packaging. This implementation seems to work well in CPython's traditional importer. If you have a special import hook or system, I do not guarantee it, but if you adopt it, it is fairly trivial to back out - you'll just need to manually add the names back into the __all__.
So in, for example, a utility library, you would define the decorator:
import sys

def export(fn):
    mod = sys.modules[fn.__module__]
    if hasattr(mod, '__all__'):
        mod.__all__.append(fn.__name__)
    else:
        mod.__all__ = [fn.__name__]
    return fn
and then, where you would define an __all__, you do this:
$ cat > main.py
from lib import export

__all__ = [] # optional - we create a list if __all__ is not there.

@export
def foo(): pass

@export
def bar():
    'bar'

def main():
    print('main')

if __name__ == '__main__':
    main()
And this works fine whether run as main or imported by another module.
$ cat > run.py
import main
main.main()
$ python run.py
main
And API provisioning with import * will work too:
$ cat > run.py
from main import *
foo()
bar()
main() # expected to error here, not exported
$ python run.py
Traceback (most recent call last):
  File "run.py", line 4, in <module>
    main() # expected to error here, not exported
NameError: name 'main' is not defined
I'm just adding this to be precise:
All other answers refer to modules. The original question explicitly mentioned __all__ in __init__.py files, so this is about Python packages.
Generally, __all__ only comes into play when the from xxx import * variant of the import statement is used. This applies to packages as well as to modules.
The behaviour for modules is explained in the other answers. The exact behaviour for packages is described here in detail.
In short, __all__ on the package level does approximately the same thing as for modules, except it deals with modules within the package (in contrast to specifying names within the module). So __all__ specifies all modules that shall be loaded and imported into the current namespace when we use from package import *.
The big difference is, that when you omit the declaration of __all__ in a package's __init__.py, the statement from package import * will not import anything at all (with exceptions explained in the documentation, see link above).
On the other hand, if you omit __all__ in a module, the "starred import" will import all names (not starting with an underscore) defined in the module.
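A sketch of the package case, using the docs' classic sound.effects example:

# sound/effects/__init__.py
# "from sound.effects import *" will now import these three submodules
__all__ = ['echo', 'surround', 'reverse']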
It also changes what pydoc will show:
module1.py
a = "A"
b = "B"
c = "C"
module2.py
__all__ = ['a', 'b']
a = "A"
b = "B"
c = "C"
$ pydoc module1

Help on module module1:

NAME
    module1

FILE
    module1.py

DATA
    a = 'A'
    b = 'B'
    c = 'C'

$ pydoc module2

Help on module module2:

NAME
    module2

FILE
    module2.py

DATA
    __all__ = ['a', 'b']
    a = 'A'
    b = 'B'
I declare __all__ in all my modules, as well as underscore-prefix internal details; these really help when using things you've never used before in live interpreter sessions.
__all__ customizes * in from <module> import *
and from <package> import *.
A module is a .py file meant to be imported.
A package is a directory with a __init__.py file. A package usually contains modules.
MODULES
""" cheese.py - an example module """
__all__ = ['swiss', 'cheddar']
swiss = 4.99
cheddar = 3.99
gouda = 10.99
__all__ lets humans know the "public" features of a module.[#AaronHall] Also, pydoc recognizes them.[#Longpoke]
from module import *
See how swiss and cheddar are brought into the local namespace, but not gouda:
>>> from cheese import *
>>> swiss, cheddar
(4.99, 3.99)
>>> gouda
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'gouda' is not defined
Without __all__, any symbol (that doesn't start with an underscore) would have been available.
Imports without * are not affected by __all__
import module
>>> import cheese
>>> cheese.swiss, cheese.cheddar, cheese.gouda
(4.99, 3.99, 10.99)
from module import names
>>> from cheese import swiss, cheddar, gouda
>>> swiss, cheddar, gouda
(4.99, 3.99, 10.99)
import module as localname
>>> import cheese as ch
>>> ch.swiss, ch.cheddar, ch.gouda
(4.99, 3.99, 10.99)
PACKAGES
In the __init__.py file of a package __all__ is a list of strings with the names of public modules or other objects. Those features are available to wildcard imports. As with modules, __all__ customizes the * when wildcard-importing from the package.[#MartinStettner]
Here's an excerpt from the Python MySQL Connector __init__.py:
__all__ = [
    'MySQLConnection', 'Connect', 'custom_error_exception',

    # Some useful constants
    'FieldType', 'FieldFlag', 'ClientFlag', 'CharacterSet', 'RefreshOption',
    'HAVE_CEXT',

    # Error handling
    'Error', 'Warning',
    ...etc...
]
The default case, asterisk with no __all__ for a package, is complicated, because the obvious behavior would be expensive: to use the file system to search for all modules in the package. Instead, in my reading of the docs, only the objects defined in __init__.py are imported:
If __all__ is not defined, the statement from sound.effects import * does not import all submodules from the package sound.effects into the current namespace; it only ensures that the package sound.effects has been imported (possibly running any initialization code in __init__.py) and then imports whatever names are defined in the package. This includes any names defined (and submodules explicitly loaded) by __init__.py. It also includes any submodules of the package that were explicitly loaded by previous import statements.
And lastly, a venerated tradition for stack overflow answers, professors, and mansplainers everywhere, is the bon mot of reproach for asking a question in the first place:
Wildcard imports ... should be avoided, as they [confuse] readers and many automated tools.
[PEP 8, #ToolmakerSteve]
Short answer
__all__ affects from <module> import * statements.
Long answer
Consider this example:
foo
├── bar.py
└── __init__.py
In foo/__init__.py:
(Implicit) If we don't define __all__, then from foo import * will only import names defined in foo/__init__.py.
(Explicit) If we define __all__ = [], then from foo import * will import nothing.
(Explicit) If we define __all__ = [ <name1>, ... ], then from foo import * will only import those names.
Note that in the implicit case, python won't import names starting with _. However, you can force importing such names using __all__.
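A sketch of that last point:

# foo/__init__.py -- __all__ can force-export an underscore name
_hidden = 42
__all__ = ['_hidden']  # "from foo import *" will now bind _hidden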
You can view the Python document here.
__all__ is used to document the public API of a Python module. Although it is optional, __all__ should be used.
Here is the relevant excerpt from the Python language reference:
The public names defined by a module are determined by checking the module’s namespace for a variable named __all__; if defined, it must be a sequence of strings which are names defined or imported by that module. The names given in __all__ are all considered public and are required to exist. If __all__ is not defined, the set of public names includes all names found in the module’s namespace which do not begin with an underscore character ('_'). __all__ should contain the entire public API. It is intended to avoid accidentally exporting items that are not part of the API (such as library modules which were imported and used within the module).
PEP 8 uses similar wording, although it also makes it clear that imported names are not part of the public API when __all__ is absent:
To better support introspection, modules should explicitly declare the names in their public API using the __all__ attribute. Setting __all__ to an empty list indicates that the module has no public API.
[...]
Imported names should always be considered an implementation detail. Other modules must not rely on indirect access to such imported names unless they are an explicitly documented part of the containing module's API, such as os.path or a package's __init__ module that exposes functionality from submodules.
Furthermore, as pointed out in other answers, __all__ is used to enable wildcard importing for packages:
The import statement uses the following convention: if a package’s __init__.py code defines a list named __all__, it is taken to be the list of module names that should be imported when from package import * is encountered.
__all__ affects how from foo import * works.
Code that is inside a module body (but not in the body of a function or class) may use an asterisk (*) in a from statement:
from foo import *
The * requests that all attributes of module foo (except those beginning with underscores) be bound as global variables in the importing module. When foo has an attribute __all__, the attribute's value is the list of the names that are bound by this type of from statement.
If foo is a package and its __init__.py defines a list named __all__, it is taken to be the list of submodule names that should be imported when from foo import * is encountered. If __all__ is not defined, the statement from foo import * imports whatever names are defined in the package. This includes any names defined (and submodules explicitly loaded) by __init__.py.
Note that __all__ doesn't have to be a list. As per the documentation on the import statement, if defined, __all__ must be a sequence of strings which are names defined or imported by the module. So you may as well use a tuple to save some memory and CPU cycles. Just don't forget a comma in case the module defines a single public name:
__all__ = ('some_name',)
See also Why is “import *” bad?
This is defined in PEP8 here:
Global Variable Names
(Let's hope that these variables are meant for use inside one module only.) The conventions are about the same as those for functions.
Modules that are designed for use via from M import * should use the __all__ mechanism to prevent exporting globals, or use the older convention of prefixing such globals with an underscore (which you might want to do to indicate these globals are "module non-public").
PEP8 provides coding conventions for the Python code comprising the standard library in the main Python distribution. The more you follow this, the closer you are to the original intent.
