Best practice for nested Python module imports with * [duplicate] - python

I have a file, myfile.py, which imports Class1 from file.py and file.py contains imports to different classes in file2.py, file3.py, file4.py.
In my myfile.py, can I access these classes or do I need to again import file2.py, file3.py, etc.?
Does Python automatically add all the imports included in the file I imported, and can I use them automatically?

Best practice is to import every module that defines identifiers you need, and use those identifiers as qualified by the module's name; I recommend using from only when what you're importing is a module from within a package. The question has often been discussed on SO.
Importing a module, say moda, from many modules (say modb, modc, modd, ...) that need one or more of the identifiers moda defines, does not slow you down: moda's bytecode is loaded (and possibly build from its sources, if needed) only once, the first time moda is imported anywhere, then all other imports of the module use a fast path involving a cache (a dict mapping module names to module objects that is accessible as sys.modules in case of need... if you first import sys, of course!-).

Python doesn't automatically introduce anything into the namespace of myfile.py, but you can access everything that is in the namespaces of all the other modules.
That is to say, if in file1.py you did from file2 import SomeClass and in myfile.py you did import file1, then you can access it within myfile as file1.SomeClass. If in file1.py you did import file2 and in myfile.py you did import file1, then you can access the class from within myfile as file1.file2.SomeClass. (These aren't generally the best ways to do it, especially not the second example.)
This is easily tested.

In the myfile module, you can either do from file import ClassFromFile2 or from file2 import ClassFromFile2 to access ClassFromFile2, assuming that the class is also imported in file.
This technique is often used to simplify the API a bit. For example, a db.py module might import various things from the modules mysqldb, sqlalchemy and some other helpers. Than, everything can be accessed via the db module.

If you are using wildcard import, yes, wildcard import actually is the way of creating new aliases in your current namespace for contents of the imported module. If not, you need to use the namespace of the module you have imported as usual.

Related

How to get source file which python class/function/variable is imported from?

Let's say I have imported two modules like this:
from module0 import hello_func
from directory.module1 import hello_var
Where in module0:
def hello_func(): return "hello from module0"
And module1:
hello_var = "hello from module1"
How can I know from which file is each object being imported?
I tried to check locals() function but nothing in there giving reference to the file...
Actually, you kind of answered your question yourself:
Let's say I have imported two modules
(insert "from xxx import *" here)
How can I know from which file is each object being imported?
One of the reasons of NOT using wildcard imports is precisely to make it clear where names are imported from (the other one being to avoind one imported name to shadow a previously imported one - something that tends to break you code in the most unexpected - and sometimes quite hard to spot - ways).
Note that in your edited question:
from module0 import hello_func
from directory.module1 import hello_var
you already have a much better idea where a name comes from. Not the exact files path yet, but at least the name of the package/module.
And that's one of the main reasons why one should NOT use wildcard imports.
Now if you want to know the exact files path, you have two distinct cases.
Some objects keep trace of where they were created (mostly modules, classes, functions, etc - cf the list of types supported by inspect.getfile()), and then, well, you already know the answer (use inspect.getfile() xD).
But most types wont (because there's no reason for it). In this case, you have to know which module they were imported from and call inspect.getfile() on the module itself. In this case, if you used wildcard imports, you will have to manually inspect all the modules you imported from to find out which one defined this name. Enjoy. Specially if one of those modules did also use wildcard imports...
one question please: where they does keep traces? and how these traces look like?
Modules keep it in their __file__ attribute. Functions and classes keep a reference to their module's name in their __module__ attribute, and from this you can use it to retrieve the module from the sys.modules dict (a cache of all modules already imported in the current process), which will gives you the file.
I never had a need to search this info for tracebacks, frames, code objects etc so you'll have to check it yourself I'm afraid ;-)
You can define in each module a constant with its path, something like this should work:
import os
FILE_PATH = os.path.abspath(__file__)
When you import that module you can access its location like this:
import module
print(module.FILE_PATH)
Another solution using the inspect and os modules.
import module0
import os
import inspect
print(os.path.abspath(inspect.getfile(module0.hello_func)))
If you are looking for the absolute path of where the script is being run, this should work for sure:
import os
abs_path = os.path.dirname(os.path.abspath(__file__))
print(abs_path)

Import method from Python submodule in __init__, but not submodule itself

I have a Python module with the following structure:
mymod/
__init__.py
tools.py
# __init__.py
from .tools import foo
# tools.py
def foo():
return 42
Now, when import mymod, I see that it has the following members:
mymod.foo()
mymod.tools.foo()
I don't want the latter though; it just pollutes the namespace.
Funnily enough, if tools.py is called foo.py you get what you want:
mymod.foo()
(Obviously, this only works if there is just one function per file.)
How do I avoid importing tools? Note that putting foo() into __init__.py is not an option. (In reality, there are many functions like foo which would absolutely clutter the file.)
The existence of the mymod.tools attribute is crucial to maintaining proper function of the import system. One of the normal invariants of Python imports is that if a module x.y is registered in sys.modules, then the x module has a y attribute referring to the x.y module. Otherwise, things like
import x.y
x.y.y_function()
break, and depending on the Python version, even
from x import y
can break. Even if you don't think you're doing any of the things that would break, other tools and modules rely on these invariants, and trying to remove the attribute causes a slew of compatibility problems that are nowhere near worth it.
Trying to make tools not show up in your mymod module's namespace is kind of like trying to not make "private" (leading-underscore) attributes show up in your objects' namespaces. It's not how Python is designed to work, and trying to force it to work that way causes more problems than it solves.
The leading-underscore convention isn't just for instance variables. You could mark your tools module with a leading underscore, renaming it to _tools. This would prevent it from getting picked up by from mymod import * imports (unless you explicitly put it in an __all__ list), and it'd change how IDEs and linters treat attempts to access it directly.
You are not importing the tools module, it's just available when you import the package like you're doing:
import mymod
You will have access to everything defined in the __init__ file and all the modules of this package:
import mymod
# Reference a module
mymod.tools
# Reference a member of a module
mymod.tools.foo
# And any other modules from this package
mymod.tools.subtools.func
When you import foo inside __init__ you are are just making foo available there just like if you have defined it there, but of course you defined it in tools which is a way to organize your package, so now since you imported it inside __init__ you can:
import mymod
mymod.foo()
Or you can import foo alone:
from mymod import foo
foo()
But you can import foo without making it available inside __init__, you can do the following which is exactly the same as the example above:
from mymod.tools import foo
foo()
You can use both approaches, they're both right, in all these example you are not "cluttering the file" as you can see accessing foo using mymod.tools.foo is namespaced so you can have multiple foos defined in other modules.
Try putting this in your __init__.py file:
from .tools import foo
del tools

Importing modules in python (3 modules)

Let's say i have 3 modules within the same directory. (module1,module2,module3)
Suppose the 2nd module imports the 3rd module then if i import module2 in module 1. Does that automatically import module 3 to module 1 ?
Thanks
No. The imports only work inside a module. You can verify that by creating a test.
Saying,
# module1
import module2
# module2
import module3
# in module1
module3.foo() # oops
This is reasonable because you can think in reverse: if imports cause a chain of importing, it'll be hard to decide which function is from which module, thus causing complex naming conflicts.
No, it will not be imported unless you explicitly specify python to, like so:
from module2 import *
What importing does conceptually is outlined below.
import some_module
The statement above is equivalent to:
module_variable = import_module("some_module")
All we have done so far is bind some object to a variable name.
When it comes to the implementation of import_module it is also not that hard to grasp.
def import_module(module_name):
if module_name in sys.modules:
module = sys.modules[module_name]
else:
filename = find_file_for_module(module_name)
python_code = open(filename).read()
module = create_module_from_code(python_code)
sys.modules[module_name] = module
return module
First, we check if the module has been imported before. If it was, then it will be available in the global list of all modules (sys.modules), and so will simply be reused. In the case that the module is not available, we create it from the code. Once the function returns, the module will be assigned to the variable name that you have chosen. As you can see the process is not inefficient or wasteful. All you are doing is creating an alias for your module. In most cases, transparency is prefered, hence having a quick look at the top of the file can tell you what resources are available to you. Otherwise, you might end up in a situation where you are wondering where is a given resource coming from. So, that is why you do not get modules inherently "imported".
Resource:
Python doc on importing

Is there a module cache in python? [duplicate]

If a large module is loaded by some submodule of your code, is there any benefit to referencing the module from that namespace instead of importing it again?
For example:
I have a module MyLib, which makes extensive use of ReallyBigLib. If I have code that imports MyLib, should I dig the module out like so
import MyLib
ReallyBigLib = MyLib.SomeModule.ReallyBigLib
or just
import MyLib
import ReallyBigLib
Python modules could be considered as singletons... no matter how many times you import them they get initialized only once, so it's better to do:
import MyLib
import ReallyBigLib
Relevant documentation on the import statement:
https://docs.python.org/2/reference/simple_stmts.html#the-import-statement
Once the name of the module is known (unless otherwise specified, the term “module” will refer to both packages and modules), searching for the module or package can begin. The first place checked is sys.modules, the cache of all modules that have been imported previously. If the module is found there then it is used in step (2) of import.
The imported modules are cached in sys.modules:
This is a dictionary that maps module names to modules which have already been loaded. This can be manipulated to force reloading of modules and other tricks. Note that removing a module from this dictionary is not the same as calling reload() on the corresponding module object.
As others have pointed out, Python maintains an internal list of all modules that have been imported. When you import a module for the first time, the module (a script) is executed in its own namespace until the end, the internal list is updated, and execution of continues after the import statement.
Try this code:
# module/file a.py
print "Hello from a.py!"
import b
# module/file b.py
print "Hello from b.py!"
import a
There is no loop: there is only a cache lookup.
>>> import b
Hello from b.py!
Hello from a.py!
>>> import a
>>>
One of the beauties of Python is how everything devolves to executing a script in a namespace.
It makes no substantial difference. If the big module has already been loaded, the second import in your second example does nothing except adding 'ReallyBigLib' to the current namespace.
WARNING: Python does not guarantee that module will not be initialized twice.
I've stubled upon such issue. See discussion:
http://code.djangoproject.com/ticket/8193
The internal registry of imported modules is the sys.modules dictionary, which maps module names to module objects. You can look there to see all the modules that are currently imported.
You can also pull some useful tricks (if you need to) by monkeying with sys.modules - for example adding your own objects as pseudo-modules which can be imported by other modules.
It is the same performancewise. There is no JIT compiler in Python yet.

How do I import all the submodules of a Python namespace package?

A Python namespace package can be spread over many directories, and zip files or custom importers. What's the correct way to iterate over all the importable submodules of a namespace package?
Here is a way that works well for me. Create a new submodule all.py, say, in one of the packages in the namespace.
If you write
import mynamespace.all
you are given the object for the mynamespace module. This object contains all of the loaded modules in the namespace, irrespective of where they were loaded, since there is only one instance of mynamespace around.
So, just load all the packages in the namespace in all.py!
# all.py
from pkgutil import iter_modules
# import this module's namespace (= parent) package
pkg = __import__(__package__)
# iterate all modules in pkg's paths,
# prefixing the returned module names with namespace-dot,
# and import the modules by name
for m in iter_modules(pkg.__path__, __package__ + '.'):
__import__(m.name)
Or in a one-liner that keeps the all module empty, if you care about that sort of thing:
# all.py
(lambda: [__import__(_.name) for _ in __import__('pkgutil').iter_modules(__import__(__package__).__path__, __package__ + '.')])() # noqa
After importing the all module from your namespace, you then actually receive a fully populated namespace module:
import mynamespace.all
mynamespace.mymodule1 # works
mynamespace.mymodule2 # works
...
Of course, you can use the same mechanism to enumerate or otherwise process the modules in the namespace, if you do not want to import them immediately.
Please read import confusion.
It very clearly distinguishes all the different ways you can import packages and its sub modules and in the process answers your question. When you need a certain submodule from a package, it’s often much more convenient to write from io.drivers import zip than import io.drivers.zip, since the former lets you refer to the module simply as zip instead of its full name.
from modname import *, this provides an easy way to import all the items from a module into the current namespace; however, this statement should be used sparingly.

Categories

Resources