Python Etiquette: Importing Modules - python

Say I have two Python modules:
module1.py:
import module2
def myFunct(): print "called from module1"
module2.py:
def myFunct(): print "called from module2"
def someFunct(): print "also called from module2"
If I import module1, is it better etiquette to re-import module2, or just refer to it as module1.module2?
For example (someotherfile.py):
import module1
module1.myFunct() # prints "called from module1"
module1.module2.myFunct() # prints "called from module2"
I can also do this: module2 = module1.module2. Now, I can directly call module2.myFunct().
However, I can change module1.py to:
from module2 import *
def myFunct(): print "called from module1"
Now, in someotherfile.py, I can do this:
import module1
module1.myFunct() # prints "called from module1"; overrides module2
module1.someFunct() # prints "also called from module2"
Also, by importing *, help('module1') shows all of the functions from module2.
On the other hand, (assuming module1.py uses import module2), I can do:
someotherfile.py:
import module1, module2
module1.myFunct() # prints "called from module1"
module2.myFunct() # prints "called from module2"
Again, which is better etiquette and practice? To import module2 again, or to just refer to module1's importation?

Quoting the PEP 8 style guide:
When importing a class from a class-containing module, it's usually okay to spell this:
from myclass import MyClass
from foo.bar.yourclass import YourClass
If this spelling causes local name clashes, then spell them
import myclass
import foo.bar.yourclass
Emphasis mine.
Don't use module1.module2; you are relying on the internal implementation details of module1, which may later change what imports it is using. You can import module2 directly, so do so unless otherwise documented by the module author.
You can use the __all__ convention to limit what is imported from a module with from modulename import *; the help() command honours that list as well. Listing the names you explicitly export in __all__ helps clean up the help() text presentation:
The public names defined by a module are determined by checking the module’s namespace for a variable named __all__; if defined, it must be a sequence of strings which are names defined or imported by that module. The names given in __all__ are all considered public and are required to exist. If __all__ is not defined, the set of public names includes all names found in the module’s namespace which do not begin with an underscore character ('_'). __all__ should contain the entire public API. It is intended to avoid accidentally exporting items that are not part of the API (such as library modules which were imported and used within the module).

Just import module2. Re-importing is relatively costless, since Python caches module objects in sys.modules.
Moreover, chaining dots as in module1.module2.myFunct is a violation of the Law of Demeter. Perhaps some day you will want to replace module1 with some other module module1a which does not import module2. By using import module2, you will avoid having to rewrite all occurrences of module1.module2.myFunct.
from module2 import * is generally a bad practice since it makes it hard to trace where variables come from. And mixing module namespaces can create variable-name conflicts. For example, from numpy import * is a definite no-no, since doing so would override Python's builtin sum, min, max, any, all, abs and round.

Related

Excluding modules when importing everything in __init__.py

Problem
Consider the following layout:
package/
main.py
math_helpers/
mymath.py
__init__.py
mymath.py contains:
import math
def foo():
pass
In main.py I want to be able to use code from mymath.py like so:
import math_helpers
math_helpers.foo()
In order to do so, __init__.py contains:
from .mymath import *
However, modules imported in mymath.py are now in the math_helpers namespace, e.g. math_helpers.math is accessible.
Current approach
I'm adding the following at the end of mymath.py.
import types
__all__ = [name for name, thing in globals().items()
if not (name.startswith('_') or isinstance(thing, types.ModuleType))]
This seems to work, but is it the correct approach?
On the one hand there are many good reasons not to do star imports, but on the other hand, python is for consenting adults.
__all__ is the recommended approach to determining what shows up in a star import. Your approach is correct, and you can further sanitize the namespace when finished:
import types
__all__ = [name for name, thing in globals().items()
if not (name.startswith('_') or isinstance(thing, types.ModuleType))]
del types
While less recommended, you can also sanitize elements directly out of the module, so that they don't show up at all. This will be a problem if you need to use them in a function defined in the module, since every function object has a __globals__ reference that is bound to its parent module's __dict__. But if you only import math_helpers to call math_helpers.foo(), and don't require a persistent reference to it elsewhere in the module, you can simply unlink it at the end:
del math_helpers
Long Version
A module import runs the code of the module in the namespace of the module's __dict__. Any names that are bound at the top level, whether by class definition, function definition, direct assignment, or other means, live in the that dictionary. Sometimes, it is desirable to clean up intermediate variables, as I suggested doing with types.
Let's say your module looks like this:
test_module.py
import math
import numpy as np
def x(n):
return math.sqrt(n)
class A(np.ndarray):
pass
import types
__all__ = [name for name, thing in globals().items()
if not (name.startswith('_') or isinstance(thing, types.ModuleType))]
In this case, __all__ will be ['x', 'A']. However, the module itself will contain the following names: 'math', 'np', 'x', 'A', 'types', '__all__'.
If you run del types at the end, it will remove that name from the namespace. Clearly this is safe because types is not referenced anywhere once __all__ has been constructed.
Similarly, if you wanted to remove np by adding del np, that would be OK. The class A is fully constructed by the end of the module code, so it does not require the global name np to reference its parent class.
Not so with math. If you were to do del math at the end of the module code, the function x would not work. If you import your module, you can see that x.__globals__ is the module's __dict__:
import test_module
test_module.__dict__ is test_module.x.__globals__
If you delete math from the module dictionary and call test_module.x, you will get
NameError: name 'math' is not defined
So you under some very special circumstances you may be able to sanitize the namespace of mymath.py, but that is not the recommended approach as it only applies to certain cases.
In conclusion, stick to using __all__.
A Story That's Sort of Relevant
One time, I had two modules that implemented similar functionality, but for different types of end users. There were a couple of functions that I wanted to copy out of module a into module b. The problem was that I wanted the functions to work as if they had been defined in module b. Unfortunately, they depended on a constant that was defined in a. b defined its own version of the constant. For example:
a.py
value = 1
def x():
return value
b.py
from a import x
value = 2
I wanted b.x to access b.value instead of a.value. I pulled that off by adding the following to b.py (based on https://stackoverflow.com/a/13503277/2988730):
import functools, types
x = functools.update_wrapper(types.FunctionType(x.__code__, globals(), x.__name__, x.__defaults__, x.__closure__), x)
x.__kwdefaults__ = x.__wrapped__.__kwdefaults__
x.__module__ = __name__
del functools, types
Why am I telling you all this? Well, you can make a version of your module that does not have any stray names in your namespace. You won't be able to see changes to global variables in your functions though. This is just an exercise in pushing python beyond its normal usage. I highly don't recommend doing this, but here is a sample module that effectively freezes its __dict__ as far as the functions are concerned. This has the same members as test_module above, but with no modules in the global namespace:
import math
import numpy as np
def x(n):
return math.sqrt(n)
class A(np.ndarray):
pass
import functools, types, sys
def wrap(obj):
""" Written this way to be able to handle classes """
for name in dir(obj):
if name.startswith('_'):
continue
thing = getattr(obj, name)
if isinstance(thing, FunctionType) and thing.__module__ == __name__:
setattr(obj, name,
functools.update_wrapper(types.FunctionType(thing.func_code, d, thing.__name__, thing.__defaults__, thing.__closure__), thing)
getattt(obj, name).__kwdefaults__ = thing.__kwdefaults__
elif isinstance(thing, type) and thing.__module__ == __name__:
wrap(thing)
d = globals().copy()
wrap(sys.modules[__name__])
del d, wrap, sys, math, np, functools, types
So yeah, please don't ever do this! But if you do, stick it in a utility class somewhere.

Import method from Python submodule in __init__, but not submodule itself

I have a Python module with the following structure:
mymod/
__init__.py
tools.py
# __init__.py
from .tools import foo
# tools.py
def foo():
return 42
Now, when import mymod, I see that it has the following members:
mymod.foo()
mymod.tools.foo()
I don't want the latter though; it just pollutes the namespace.
Funnily enough, if tools.py is called foo.py you get what you want:
mymod.foo()
(Obviously, this only works if there is just one function per file.)
How do I avoid importing tools? Note that putting foo() into __init__.py is not an option. (In reality, there are many functions like foo which would absolutely clutter the file.)
The existence of the mymod.tools attribute is crucial to maintaining proper function of the import system. One of the normal invariants of Python imports is that if a module x.y is registered in sys.modules, then the x module has a y attribute referring to the x.y module. Otherwise, things like
import x.y
x.y.y_function()
break, and depending on the Python version, even
from x import y
can break. Even if you don't think you're doing any of the things that would break, other tools and modules rely on these invariants, and trying to remove the attribute causes a slew of compatibility problems that are nowhere near worth it.
Trying to make tools not show up in your mymod module's namespace is kind of like trying to not make "private" (leading-underscore) attributes show up in your objects' namespaces. It's not how Python is designed to work, and trying to force it to work that way causes more problems than it solves.
The leading-underscore convention isn't just for instance variables. You could mark your tools module with a leading underscore, renaming it to _tools. This would prevent it from getting picked up by from mymod import * imports (unless you explicitly put it in an __all__ list), and it'd change how IDEs and linters treat attempts to access it directly.
You are not importing the tools module, it's just available when you import the package like you're doing:
import mymod
You will have access to everything defined in the __init__ file and all the modules of this package:
import mymod
# Reference a module
mymod.tools
# Reference a member of a module
mymod.tools.foo
# And any other modules from this package
mymod.tools.subtools.func
When you import foo inside __init__ you are are just making foo available there just like if you have defined it there, but of course you defined it in tools which is a way to organize your package, so now since you imported it inside __init__ you can:
import mymod
mymod.foo()
Or you can import foo alone:
from mymod import foo
foo()
But you can import foo without making it available inside __init__, you can do the following which is exactly the same as the example above:
from mymod.tools import foo
foo()
You can use both approaches, they're both right, in all these example you are not "cluttering the file" as you can see accessing foo using mymod.tools.foo is namespaced so you can have multiple foos defined in other modules.
Try putting this in your __init__.py file:
from .tools import foo
del tools

Importing modules in python (3 modules)

Let's say i have 3 modules within the same directory. (module1,module2,module3)
Suppose the 2nd module imports the 3rd module then if i import module2 in module 1. Does that automatically import module 3 to module 1 ?
Thanks
No. The imports only work inside a module. You can verify that by creating a test.
Saying,
# module1
import module2
# module2
import module3
# in module1
module3.foo() # oops
This is reasonable because you can think in reverse: if imports cause a chain of importing, it'll be hard to decide which function is from which module, thus causing complex naming conflicts.
No, it will not be imported unless you explicitly specify python to, like so:
from module2 import *
What importing does conceptually is outlined below.
import some_module
The statement above is equivalent to:
module_variable = import_module("some_module")
All we have done so far is bind some object to a variable name.
When it comes to the implementation of import_module it is also not that hard to grasp.
def import_module(module_name):
if module_name in sys.modules:
module = sys.modules[module_name]
else:
filename = find_file_for_module(module_name)
python_code = open(filename).read()
module = create_module_from_code(python_code)
sys.modules[module_name] = module
return module
First, we check if the module has been imported before. If it was, then it will be available in the global list of all modules (sys.modules), and so will simply be reused. In the case that the module is not available, we create it from the code. Once the function returns, the module will be assigned to the variable name that you have chosen. As you can see the process is not inefficient or wasteful. All you are doing is creating an alias for your module. In most cases, transparency is prefered, hence having a quick look at the top of the file can tell you what resources are available to you. Otherwise, you might end up in a situation where you are wondering where is a given resource coming from. So, that is why you do not get modules inherently "imported".
Resource:
Python doc on importing

Imports inside python packages

I was facing import error (ImportError: cannot import name 'ClassB') in following code:
dir structure:
main.py
test_pkg/
__init__.py
a.py
b.py
main.py:
from test_pkg import ClassA, ClassB
__init__.py:
from .a import ClassA
from .b import ClassB
a.py:
from test_pkg import ClassB
class ClassA:
pass
b.py:
class ClassB:
pass
in past i fixed it by quick 'experiment' by adding full name in import in a.py:
from test_pkg.b import ClassB
class ClassA:
pass
I have read about import machinery and according to :
This name will be used in various phases of the import search, and it may be the dotted path to a submodule, e.g. foo.bar.baz. In this case, Python first tries to import foo, then foo.bar, and finally foo.bar.baz. link2doc
I was expecting it will fail again, because it will try to import test_pkg during test_pkg import, but it is working. My question is why?
Also 2 additional questions:
is it proper approach to have cross modules dependencies?
is it ok to have modules imported in package init.py?
My analysis:
Based on readings i recognized that high probably issue is that, because off
__init__.py:
from .a import ClassA
from .b import ClassB
ClassA and ClassB import is executed as part of test_pkg import, but then hits import statement in a.py:
a.py:
from test_pkg import ClassB
class ClassA:
pass
and it fail because circular dependency occured.
But is working when is imported using:
from test_pkg.b import ClassB
and according to my understanding it shouldnt, because:
This name will be used in various phases of the import search, and it
may be the dotted path to a submodule, e.g. foo.bar.baz. In this case,
Python first tries to import foo, then foo.bar, and finally
foo.bar.baz. If any of the intermediate imports fail, a
ModuleNotFoundError is raised.
so i was expecting same behavior for both imports.
Looks like import with full path is not launching problematic test_pkg import process
from test_pkg.b import ClassB
Your file is named b.py, but you're trying to import B, not import b.
Depending on your platform (see PEP 235 for details), this may work, or it may not. If it doesn't work, the symptoms will be exactly what you're seeing: ImportError: cannot import name 'B'.
The fix is to from test_okg import b. Or, if you want the module to be named B, rename the file to B.py.
This actually has nothing to do with packages (except that the error message you get says cannot import name 'B' instead of No module named 'B', because in a failed from … import statement, Python can't tell whether you were failing to import a module from a package, or some global name from a module).
So, why does this work?
from test_pkg.b import B
I was expecting it will fail again, because it will try to import test_pkg during test_pkg import, but it is working. My question is why?
Because importing test_pkg isn't the problem in the first place; importing test_pkg.B is. And you solved that problem by importing test_pkg.b instead.
test_pkg.b is successfully found in test_pkg/b.py.
And then, test_pkg.b.B is found within that module, and imported into your module, because of course there's a class B: statement in b.py.
For your followup questions:
is it proper approach to have cross modules dependencies?
There's nothing wrong with cross-module dependencies as long as they aren't circular, and yours aren't.
It's perfectly fine for test_pkg.a to import test_pkg.b with an absolute import, like from test_pkg import b.
However, it's usually better to use a relative import, like from . import b (unless you need dual-version code that works the same on Python 2.x and 3.x). PEP 328 explains the reasons why relative imports are usually better for intra-package dependencies (and why it's only "usually" rather than "always").
is it ok to have modules imported in package __init__.py?
Yes. In fact, this is a pretty common idiom, used for multiple purposes.
For example, see asyncio.__init__.py, which imports all of the public exports from each of its submodules and re-exports them. There are a handful of rarely-used names in the submodules, which don't start with _ but aren't included in __all__, and if you want to use those you need to import the submodule explicitly. But everything you're likely to need in a typical program is included in __all__ and re-exported by the package, so you can just write, e.g., asyncio.Lock instead of asyncio.locks.Lock.
The code I would prefer to write is probably:
main.py:
from test_pkg import A, B
b.py:
class B:
pass
a.py:
from .b import B
class A:
pass
__init__.py:
from .a import A
from .b import B
main.py:
from test_pkg import A, B
The proper approach is to have cross module dependencies but not circular. You should figure out the hierarchy of your project and arrange your dependency graph in a DAG (directed acyclic graph).
what you put in package __init__.py will be what you can access thru the package. Also you can refer to this question for the use of __all__ in __init__.py.

Importing the same object from the same module, in two different modules:

module1.py:
from somemod import something
import module2
module2.py:
from somemod import something
Is something in module1 the exact same object as something in module2?
For example, if before importing module2, module1 would do something.val = 10. Could module2 get the value by doing something.val? Or does it get a different object?
Asked differently: does import execute an imported module even if it was already imported in the interpreter session, in a different module?
Also, is it necessary to from somemod import something in module2 if module1 already imported it?
It's the same object. Modules, like everything else in Python, are objects that exist in memory and referenced by a name. The import statement does two things: one, if the requested module does not yet exist, executes the code in the imported file and two, makes it available as a module. Subsequent import statements will skip the first step. This means that in module1, the names module1.something and module1.module2.something both refer to the same object created the first time somemod was imported.

Categories

Resources