Import symbols starting with underscore - python

I'm writing a simple Python library in which I have several "private" functions starting with underscore:
def _a():
pass
def _b():
pass
def public_interface_call():
_a()
_b()
This way my library users can simply do from MyLib.Module import * and their namespace won't be cluttered with implementation detail.
However I'm also writing unit tests in which I'd love to test these functions separately and simple importing truly all symbols from my module would be very handy. Currently I'm doing from Mylib.Module import _a _b public_interface_call but I wonder if there's any better/quicker/cleaner way to achieve what I want?

I'm not sure if it was a blackout or something when I wrote that question but today I realized (inspired by Underyx's comment) that I can simply do this:
import MyLib.Module
MyLib.Module._a()
MyLib.Module._b()
Or even to shorten things a little (because I'm a lazy bastard):
import MyLib.Module as mm
mm._a()
mm._b()

According to docs,
There is even a variant to import all names that a module defines:
from fibo import *
...
This imports all names except those beginning with an underscore (_).
Not sure why this is the case however.

The best and most common solution for your problem already has been given:
import MyLib.Module as mm
If one still wants to make use of the variant from MyLib.Module import *, there is the possibility to override its default behavior: Add a list __all__ to the module's source file MyLib/Module.py and declare which objects should be exported.
Note that you need to do this for every object you want to be visible. This does not affect the behavior of the other import mechanism above.
Example:
# MyLib/Module.py
__all__ = ["_a", "_b"]
def _a(): pass
def _b(): pass
def c(): pass
# ...
# Import
from MyLib.Module import *
# => this imports _a() and _b(), but not c()
To specify a package index __all__ can make sense to control which submodules should be loaded when importing a package. See here for more information, or this SO thread on a related question.

Related

Neat way to define __all__?

Hi I'm building my own package and I have a question on __all__.
Are there any neat way to define __all__, other than explicitly typing each and every function in the module?
I find it very tedious...
I'm trying to make some code which wraps on frequently used libraries such as numpy, pytorch, os. The problem is, the libraries I used to create my modules also gets imported when I import my package.
I want to import every function / class that I defined, but I don't want the third-party libraries that I used in the process to get imported.
I use from .submodule import * in my __init__.py so that I can access my functions inside the submodule directly. (Just like we can access functions directly from the top package like np.sum(), torch.sum() )
My submodule has a lot of functions, and I want to import all of them to __init__.py, except for the third-party packages that I used.
I see that __all__ defines what to import when from package import * is called.
For example,
utils.py
__all__ = ['a']
def a():
pass
def b():
pass
__init__.py
from .utils import *
and
>>> import package
>>> package.a()
None
>>> package.b()
NameError: 'package.b' is not defined
What I want is something like
__all__ = Some_neat_fancy_method()
I tried locals() and dir(), but got lost along the way.
Any suggestions?
As others have pointed out, the whole point of __all__ is to explicitly specify what gets exposed to star-imports. By default everything is. If you really want to specify what doesn't get exposed instead, you can do a little trick and include all modules in __all__ and then remove the ones you want to exclude.
For example:
def _exclude(exclusions: list) -> list:
import types
# add everything as long as it's not a module and not prefixed with _
functions = [name for name, function in globals().items()
if not (name.startswith('_') or isinstance(function, types.ModuleType))]
# remove the exclusions from the functions
for exclusion in exclusions:
if exclusion in functions:
functions.remove(exclusion)
del types # deleting types from scope, introduced from the import
return functions
# the _ prefix is important, to not add these to the __all__
_exclusions = ["function1", "function2"]
__all__ = _exclude(_exclusions)
You can of course repurpose this to simply include everything that's not a function or prefixed with _ but it serves little use since everything is included in star-imports if you don't specify the __all__, so I thought it was better to include the exclusion idea. This way you can simply tell it to exclude specific functions.
Are there any neat way to define all, other than explicitly typing each and every function in the module?
Not built-in no. But defining __all__ by hand is basically the entire point, if you want to include everything in __all__ you can just do nothing at all:
If __all__ is not defined, the statement from sound.effects import * [...] ensures that the package sound.effects has been imported (possibly running any initialization code in __init__.py) and then imports whatever names are defined in the package.
The entire point of __all__ is to restricts what gets "exported" by star-imports. There's no real way for Python to know that except by having you tell it, for each symbol, whether it should be there or not.
One easy workaround is to alias all of your imports with a leading underscore. Anything with a leading underscore is excluded from from x import * style imports.
import numpy as _np
import pandas as _pd
def my_fn():
...

Is there an accepted way to import all from the global namespace that is not "from module import *"

I have a small application that I would like to split into modules so that my code is better structured and readable. The drawback to this has been that for every module that I import using:
import module
I then have to use module.object for any object that I want to access from that module.
In this case I don't want to pollute the global namespace, I want to fill it with the proper module names so that I don't have to use
from module import *
in order to call an object without the module prepend.
Is there a means to do this that isn't consider to be poor use of from import or import?
Two reasonable options are to import with a shorter name to prepend. E.g.
import module as m
m.foo()
Or explicitly import names that you plan to use:
from module import (foo,bar)
foo()
You should avoid using an asterisk in your imports always. So to answer your question, I would say no, there isn't a better way than just:
import module
module.method()
OR
import really_long_module_name as mm
mm.method()
Take a look here at the pep8 guide "Imports" section:
https://www.python.org/dev/peps/pep-0008/#imports
Wildcard imports ( from import * ) should be avoided, as they make it unclear which names are present in the namespace, confusing both readers and many automated tools. There is one defensible use case for a wildcard import, which is to republish an internal interface as part of a public API (for example, overwriting a pure Python implementation of an interface with the definitions from an optional accelerator module and exactly which definitions will be overwritten isn't known in advance).
Specific is safer than globbing and I try to only import what I need if I can help it. When I'm learning a new module I'll import the whole thing and then once it's in a good state I go back and refactor by specifically importing the methods I need:
from module import method
method()
I would have to say that you should use the module's name. It's a better way of usage, and makes your code free of namespace confusions and also very understandable.
To make your code more beautiful, you could use the as import:
import random as r
# OR
from random import randint as rint
One solution that I think is not very beautiful, but works, that comes to mind, in case you don't want to pollute the global namespace, you can try to use the import statements, for example, inside functions.
For example:
>>> def a():
... from random import randint
... x = randint(0,2)
... print x
...
>>> a()
1
>>> randint(0,2)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'randint' is not defined
This way, the local namespace of the specific function is filled with the values from the module but the global one is clean.

Why can't Python's import work like C's #include?

I've literally been trying to understand Python imports for about a year now, and I've all but given up programming in Python because it just seems too obfuscated. I come from a C background, and I assumed that import worked like #include, yet if I try to import something, I invariably get errors.
If I have two files like this:
foo.py:
a = 1
bar.py:
import foo
print foo.a
input()
WHY do I need to reference the module name? Why not just be able to write import foo, print a? What is the point of this confusion? Why not just run the code and have stuff defined for you as if you wrote it in one big file? Why can't it work like C's #include directive where it basically copies and pastes your code? I don't have import problems in C.
To do what you want, you can use (not recommended, read further for explanation):
from foo import *
This will import everything to your current namespace, and you will be able to call print a.
However, the issue with this approach is the following. Consider the case when you have two modules, moduleA and moduleB, each having a function named GetSomeValue().
When you do:
from moduleA import *
from moduleB import *
you have a namespace resolution issue*, because what function are you actually calling with GetSomeValue(), the moduleA.GetSomeValue() or the moduleB.GetSomeValue()?
In addition to this, you can use the Import As feature:
from moduleA import GetSomeValue as AGetSomeValue
from moduleB import GetSomeValue as BGetSomeValue
Or
import moduleA.GetSomeValue as AGetSomeValue
import moduleB.GetSomeValue as BGetSomeValue
This approach resolves the conflict manually.
I am sure you can appreciate from these examples the need for explicit referencing.
* Python has its namespace resolution mechanisms, this is just a simplification for the purpose of the explanation.
Imagine you have your a function in your module which chooses some object from a list:
def choice(somelist):
...
Now imagine further that, either in that function or elsewhere in your module, you are using randint from the random library:
a = randint(1, x)
Therefore we
import random
You suggestion, that this does what is now accessed by from random import *, means that we now have two different functions called choice, as random includes one too. Only one will be accessible, but you have introduced ambiguity as to what choice() actually refers to elsewhere in your code.
This is why it is bad practice to import everything; either import what you need:
from random import randint
...
a = randint(1, x)
or the whole module:
import random
...
a = random.randint(1, x)
This has two benefits:
You minimise the risks of overlapping names (now and in future additions to your imported modules); and
When someone else reads your code, they can easily see where external functions come from.
There are a few good reasons. The module provides a sort of namespace for the objects in it, which allows you to use simple names without fear of collisions -- coming from a C background you have surely seen libraries with long, ugly function names to avoid colliding with anybody else.
Also, modules themselves are also objects. When a module is imported in more than one place in a python program, each actually gets the same reference. That way, changing foo.a changes it for everybody, not just the local module. This is in contrast to C where including a header is basically a copy+paste operation into the source file (obviously you can still share variables, but the mechanism is a bit different).
As mentioned, you can say from foo import * or better from foo import a, but understand that the underlying behavior is actually different, because you are taking a and binding it to your local module.
If you use something often, you can always use the from syntax to import it directly, or you can rename the module to something shorter, for example
import itertools as it
When you do import foo, a new module is created inside the current namespace named foo.
So, to use anything inside foo; you have to address it via the module.
However, if you use from from foo import something, you don't have use to prepend the module name, since it will load something from the module and assign to it the name something. (Not a recommended practice)
import importlib
# works like C's #include, you always call it with include(<path>, __name__)
def include(file, module_name):
spec = importlib.util.spec_from_file_location(module_name, file)
mod = importlib.util.module_from_spec(spec)
# spec.loader.exec_module(mod)
o = spec.loader.get_code(module_name)
exec(o, globals())
For example:
#### file a.py ####
a = 1
#### file b.py ####
b = 2
if __name__ == "__main__":
print("Hi, this is b.py")
#### file main.py ####
# assuming you have `include` in scope
include("a.py", __name__)
print(a)
include("b.py", __name__)
print(b)
the output will be:
1
Hi, this is b.py
2

How should I perform imports in a python module without polluting its namespace?

I am developing a Python package for dealing with some scientific data. There are multiple frequently-used classes and functions from other modules and packages, including numpy, that I need in virtually every function defined in any module of the package.
What would be the Pythonic way to deal with them? I have considered multiple variants, but every has its own drawbacks.
Import the classes at module-level with from foreignmodule import Class1, Class2, function1, function2
Then the imported functions and classes are easily accessible from every function. On the other hand, they pollute the module namespace making dir(package.module) and help(package.module) cluttered with imported functions
Import the classes at function-level with from foreignmodule import Class1, Class2, function1, function2
The functions and classes are easily accessible and do not pollute the module, but imports from up to a dozen modules in every function look as a lot of duplicate code.
Import the modules at module-level with import foreignmodule
Not too much pollution is compensated by the need to prepend the module name to every function or class call.
Use some artificial workaround like using a function body for all these manipulations and returning only the objects to be exported... like this
def _export():
from foreignmodule import Class1, Class2, function1, function2
def myfunc(x):
return function1(x, function2(x))
return myfunc
myfunc = _export()
del _export
This manages to solve both problems, module namespace pollution and ease of use for functions... but it seems to be not Pythonic at all.
So what solution is the most Pythonic? Is there another good solution I overlooked?
Go ahead and do your usual from W import X, Y, Z and then use the __all__ special symbol to define what actual symbols you intend people to import from your module:
__all__ = ('MyClass1', 'MyClass2', 'myvar1', …)
This defines the symbols that will be imported into a user's module if they import * from your module.
In general, Python programmers should not be using dir() to figure out how to use your module, and if they are doing so it might indicate a problem somewhere else. They should be reading your documentation or typing help(yourmodule) to figure out how to use your library. Or they could browse the source code yourself, in which case (a) the difference between things you import and things you define is quite clear, and (b) they will see the __all__ declaration and know which toys they should be playing with.
If you try to support dir() in a situation like this for a task for which it was not designed, you will have to place annoying limitations on your own code, as I hope is clear from the other answers here. My advice: don't do it! Take a look at the Standard Library for guidance: it does from … import … whenever code clarity and conciseness require it, and provides (1) informative docstrings, (2) full documentation, and (3) readable code, so that no one ever has to run dir() on a module and try to tell the imports apart from the stuff actually defined in the module.
One technique I've seen used, including in the standard library, is to use import module as _module or from module import var as _var, i.e. assigning imported modules/variables to names starting with an underscore.
The effect is that other code, following the usual Python convention, treats those members as private. This applies even for code that doesn't look at __all__, such as IPython's autocomplete function.
An example from Python 3.3's random module:
from warnings import warn as _warn
from types import MethodType as _MethodType, BuiltinMethodType as _BuiltinMethodType
from math import log as _log, exp as _exp, pi as _pi, e as _e, ceil as _ceil
from math import sqrt as _sqrt, acos as _acos, cos as _cos, sin as _sin
from os import urandom as _urandom
from collections.abc import Set as _Set, Sequence as _Sequence
from hashlib import sha512 as _sha512
Another technique is to perform imports in function scope, so that they become local variables:
"""Some module"""
# imports conventionally go here
def some_function(arg):
"Do something with arg."
import re # Regular expressions solve everything
...
The main rationale for doing this is that it is effectively lazy, delaying the importing of a module's dependencies until they are actually used. Suppose one function in the module depends on a particular huge library. Importing the library at the top of the file would mean that importing the module would load the entire library. This way, importing the module can be quick, and only client code that actually calls that function incurs the cost of loading the library. Further, if the dependency library is not available, client code that doesn't need the dependent feature can still import the module and call the other functions. The disadvantage is that using function-level imports obscures what your code's dependencies are.
Example from Python 3.3's os.py:
def get_exec_path(env=None):
"""[...]"""
# Use a local import instead of a global import to limit the number of
# modules loaded at startup: the os module is always loaded at startup by
# Python. It may also avoid a bootstrap issue.
import warnings
Import the module as a whole: import foreignmodule. What you claim as a drawback is actually a benefit. Namely, prepending the module name makes your code easier to maintain and makes it more self-documenting.
Six months from now when you look at a line of code like foo = Bar(baz) you may ask yourself which module Bar came from, but with foo = cleverlib.Bar it is much less of a mystery.
Of course, the fewer imports you have, the less of a problem this is. For small programs with few dependencies it really doesn't matter all that much.
When you find yourself asking questions like this, ask yourself what makes the code easier to understand, rather than what makes the code easier to write. You write it once but you read it a lot.
For this situation I would go with an all_imports.py file which had all the
from foreignmodule import .....
from another module import .....
and then in your working modules
import all_imports as fgn # or whatever you want to prepend
...
something = fgn.Class1()
Another thing to be aware of
__all__ = ['func1', 'func2', 'this', 'that']
Now, any functions/classes/variables/etc that are in your module, but not in your modules's __all__ will not show up in help(), and won't be imported by from mymodule import * See Making python imports more structured? for more info.
I would compromise and just pick a short alias for the foreign module:
import foreignmodule as fm
It saves you completely from the pollution (probably the bigger issue) and at least reduces the prepending burden.
I know this is an old question. It may not be 'Pythonic', but the cleanest way I've discovered for exporting only certain module definitions is, really as you've found, to globally wrap the module in a function. But instead of returning them, to export names, you can simply globalize them (global thus in essence becomes a kind of 'export' keyword):
def module():
global MyPublicClass,ExportedModule
import somemodule as ExportedModule
import anothermodule as PrivateModule
class MyPublicClass:
def __init__(self):
pass
class MyPrivateClass:
def __init__(self):
pass
module()
del module
I know it's not much different than your original conclusion, but frankly to me this seems to be the cleanest option. The other advantage is, you can group any number of modules written this way into a single file, and their private terms won't overlap:
def module():
global A
i,j,k = 1,2,3
class A:
pass
module()
del module
def module():
global B
i,j,k = 7,8,9 # doesn't overwrite previous declarations
class B:
pass
module()
del module
Though, keep in mind their public definitions will, of course, overlap.

Importing modules inside python class

I'm currently writing a class that needs os, stat and some others.
What's the best way to import these modules in my class?
I'm thinking about when others will use it, I want the 'dependency' modules to be already
imported when the class is instantiated.
Now I'm importing them in my methods, but maybe there's a better solution.
If your module will always import another module, always put it at the top as PEP 8 and the other answers indicate. Also, as #delnan mentions in a comment, sys, os, etc. are being used anyway, so it doesn't hurt to import them globally.
However, there is nothing wrong with conditional imports, if you really only need a module under certain runtime conditions.
If you only want to import them if the class is defined, like if the class is in an conditional block or another class or method, you can do something like this:
condition = True
if condition:
class C(object):
os = __import__('os')
def __init__(self):
print self.os.listdir
C.os
c = C()
If you only want it to be imported if the class is instantiated, do it in __new__ or __init__.
import sys
from importlib import import_module
class Foo():
def __init__(self):
# Depends on the configuration of the application.
self.condition = True # "True" Or "False"
if self.condition:
self.importedModule = import_module('moduleName')
# ---
if 'moduleName' in sys.modules:
self.importedModule.callFunction(params)
#or
if self.condition:
self.importedModule.callFunction(params)
# ---
PEP 8 on imports:
Imports are always put at the top of the file, just after any module
comments and docstrings, and before module globals and constants.
This makes it easy to see all modules used by the file at hand, and avoids having to replicate the import in several places when a module is used in more than one place. Everything else (e.g. function-/method-level imports) should be an absolute exception and needs to be justified well.
This (search for the section "Imports") official paper states, that imports should normally be put in the top of your source file. I would abide to this rule, apart from special cases.

Categories

Resources