Import method from Python submodule in __init__, but not submodule itself

I have a Python module with the following structure:
mymod/
    __init__.py
    tools.py

# __init__.py
from .tools import foo

# tools.py
def foo():
    return 42
Now, when I import mymod, I see that it has the following members:
mymod.foo()
mymod.tools.foo()
I don't want the latter though; it just pollutes the namespace.
Funnily enough, if tools.py is called foo.py, you get what you want:
mymod.foo()
(Obviously, this only works if there is just one function per file.)
How do I avoid importing tools? Note that putting foo() into __init__.py is not an option. (In reality, there are many functions like foo which would absolutely clutter the file.)

The existence of the mymod.tools attribute is crucial to maintaining proper function of the import system. One of the normal invariants of Python imports is that if a module x.y is registered in sys.modules, then the x module has a y attribute referring to the x.y module. Otherwise, things like
import x.y
x.y.y_function()
break, and depending on the Python version, even
from x import y
can break. Even if you don't think you're doing any of the things that would break, other tools and modules rely on these invariants, and trying to remove the attribute causes a slew of compatibility problems that are nowhere near worth it.
Trying to make tools not show up in your mymod module's namespace is kind of like trying to not make "private" (leading-underscore) attributes show up in your objects' namespaces. It's not how Python is designed to work, and trying to force it to work that way causes more problems than it solves.
The leading-underscore convention isn't just for instance variables. You could mark your tools module with a leading underscore, renaming it to _tools. This would prevent it from getting picked up by from mymod import * imports (unless you explicitly put it in an __all__ list), and it'd change how IDEs and linters treat attempts to access it directly.
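For illustration, a minimal sketch of that rename (assuming nothing else in the package changes):

# mymod/__init__.py, after renaming tools.py to _tools.py
from ._tools import foo

# optional: only the names listed here are picked up by "from mymod import *"
__all__ = ["foo"]

Callers still use mymod.foo(), and mymod._tools.foo() remains reachable, but the leading underscore signals that it is an implementation detail.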

You are not importing the tools module; it's just made available when you import the package, as you're doing:
import mymod
You will have access to everything defined in the __init__ file and all the modules of this package:
import mymod
# Reference a module
mymod.tools
# Reference a member of a module
mymod.tools.foo
# And any other modules from this package
mymod.tools.subtools.func
When you import foo inside __init__, you are just making foo available there, as if you had defined it there. Of course, you actually defined it in tools, which is a way to organize your package. Since you imported it inside __init__, you can now do:
import mymod
mymod.foo()
Or you can import foo alone:
from mymod import foo
foo()
But you can also import foo without it having to be made available inside __init__; the following is exactly equivalent to the example above:
from mymod.tools import foo
foo()
You can use both approaches; they're both correct. In none of these examples are you "cluttering the file": as you can see, accessing foo via mymod.tools.foo is namespaced, so you can have multiple foos defined in other modules.
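For example, a small sketch assuming a second, hypothetical module mymod/other.py that also defines a foo:

# mymod/other.py (hypothetical)
def foo():
    return "a different foo"

# mymod/__init__.py
from .tools import foo
from .other import foo as other_foo   # aliased here only to avoid clobbering tools.foo

# calling code
import mymod
mymod.tools.foo()   # the foo defined in tools.py
mymod.other.foo()   # the foo defined in other.py -- no clash, thanks to the namespaces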

Try putting this in your __init__.py file:
from .tools import foo
del tools

Related

Best practice for nested Python module imports with * [duplicate]

I have a file, myfile.py, which imports Class1 from file.py and file.py contains imports to different classes in file2.py, file3.py, file4.py.
In my myfile.py, can I access these classes or do I need to again import file2.py, file3.py, etc.?
Does Python automatically add all the imports included in the file I imported, and can I use them automatically?
Best practice is to import every module that defines identifiers you need, and use those identifiers as qualified by the module's name; I recommend using from only when what you're importing is a module from within a package. The question has often been discussed on SO.
Importing a module, say moda, from many modules (say modb, modc, modd, ...) that need one or more of the identifiers moda defines does not slow you down: moda's bytecode is loaded (and possibly built from its sources, if needed) only once, the first time moda is imported anywhere; all other imports of the module then use a fast path involving a cache (a dict mapping module names to module objects that is accessible as sys.modules in case of need... if you first import sys, of course!-).
Python doesn't automatically introduce anything into the namespace of myfile.py, but you can access everything that is in the namespaces of all the other modules.
That is to say, if in file1.py you did from file2 import SomeClass and in myfile.py you did import file1, then you can access it within myfile as file1.SomeClass. If in file1.py you did import file2 and in myfile.py you did import file1, then you can access the class from within myfile as file1.file2.SomeClass. (These aren't generally the best ways to do it, especially not the second example.)
This is easily tested.
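For instance, a quick sketch using the file1/file2 naming from above (the file contents are made up for illustration):

# file2.py
class SomeClass:
    pass

# file1.py
from file2 import SomeClass   # variant A
# import file2                # variant B

# myfile.py
import file1
obj = file1.SomeClass()          # works with variant A
# obj = file1.file2.SomeClass()  # works with variant B instead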
In the myfile module, you can either do from file import ClassFromFile2 or from file2 import ClassFromFile2 to access ClassFromFile2, assuming that the class is also imported in file.
This technique is often used to simplify the API a bit. For example, a db.py module might import various things from the modules mysqldb, sqlalchemy and some other helpers. Then, everything can be accessed via the db module.
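A rough sketch of that pattern (the helper module and its names are invented for illustration; sqlalchemy's create_engine is real):

# db.py -- a small facade module
from sqlalchemy import create_engine            # re-export a third-party name
from db_helpers import get_session, put_row     # hypothetical local helper module

# callers only ever do "import db" and then use:
#   db.create_engine(...), db.get_session(), db.put_row(...)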
If you are using a wildcard import, then yes: a wildcard import creates new names in your current namespace for the contents of the imported module. If not, you need to qualify names with the imported module's name, as usual.

Python import function from package

(Python 3.6)
I have this folder structure:
package/
    start.py
    subpackage/
        __init__.py
        submodule.py
submodule.py:
def subfunc():
    print("This is submodule")

__init__.py:
from subpackage.submodule import subfunc

start.py:
import subpackage
subpackage.subfunc()
subpackage.submodule.subfunc()
I understand how and why
subpackage.subfunc()
works.
But I don't understand why:
subpackage.submodule.subfunc()
also works, if I have not done:
from subpackage import submodule
Nor:
import subpackage.submodule
Neither in __init__.py nor in start.py.
Thank you very much if anyone can clear up my doubt.
When issuing from subpackage.submodule import subfunc, Python does two things for you: first, it searches for and evaluates the module named subpackage.submodule and puts it into the sys.modules cache; second, it looks up the subpackage.submodule.subfunc object and binds the name "subfunc" in the namespace of the current module:
The import statement combines two operations; it searches for the named module, then it binds the results of that search to a name in the local scope.
When subpackage.submodule is imported, the parent of submodule also gets imported:
While certain side-effects may occur, such as the importing of parent packages, and the updating of various caches (including sys.modules) ...
In the last stage of importing subpackage.submodule, Python sets the module as an attribute on its parent, subpackage. This behavior is documented:
When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module’s namespace to the submodule object.
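To see both effects with the layout above, you could add something like this to start.py (just a sketch):

import sys
import subpackage

# the from-import in __init__.py imported the submodule as a side effect...
print(sys.modules["subpackage.submodule"])
# ...and bound it as an attribute of its parent package
print(subpackage.submodule is sys.modules["subpackage.submodule"])   # True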
If I'm getting this right, you have a folder called "package" in which there are 2 things: a .py file and another folder called "subpackage".
Inside "subpackage" you have __init__.py and submodule.py, the latter of which contains a function that just prints "This is submodule".
Now, when you call import subpackage, you run its __init__.py, which "pulls in" submodule and therefore the subfunc() function.
When you write subpackage.submodule.subfunc() there's really nothing amazing going on there: you just name the main folder/container (subpackage.), then the .py file (submodule.), and finally the function itself (subfunc()).

Unknown python import behaviour from relative package

I've stumbled across some odd Python (2.7) import behaviour which, whilst easy to work around, has me scratching my head.
Given the following folder structure:
test/
    __init__.py
    x.py
    package/
        __init__.py
        x.py
Where test/package/__init__.py contains the following
from .. import x
print x
from .x import hello
print x
print x.hello
And test/package/x.py contains the following
hello = 1
Why would running import test.package from a REPL result in the following output?
<module 'test.x' from 'test/x.pyc'>
<module 'test.package.x' from 'test/package/x.pyc'>
1
I would have expected x to reference the top-level x module; instead, the second import imports the whole local x module (not just hello, as I expected), effectively trampling on the first import.
Can anyone explain the mechanics of the import here?
The from .x import hello form first works out that test.package.x needs to be a module. It then checks the corresponding entry in sys.modules; if it is found there, then sys.modules['test.package.x'].hello is imported into the calling module.
However, if sys.modules['test.package.x'] does not exist yet, the module is loaded; and as the last step of loading, sys.modules['test.package'].x is set to point to the newly loaded module, even though you did not explicitly ask for it. Thus the second import overrides the name bound by the first import.
This is by design, otherwise
import foo.bar.baz
foo.bar.baz.x()
and
from foo.bar import baz
baz.x()
wouldn't be interchangeable.
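You can see the same interchangeability with a standard-library package/module pair, for instance os and os.path:

import os.path
os.path.join("a", "b")

from os import path
path.join("a", "b")   # same module object, reached either way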
I am unable to find good documentation on this behaviour in the Python 2 documentation, but the Python 3 behaviour is essentially the same in this case:
When a submodule is loaded using any mechanism (e.g. importlib APIs, the import or import-from statements, or built-in __import__()) a binding is placed in the parent module’s namespace to the submodule object. For example, if package spam has a submodule foo, after importing spam.foo, spam will have an attribute foo which is bound to the submodule.
[...]
The invariant holding is that if you have sys.modules['spam'] and sys.modules['spam.foo'] (as you would after the above import), the latter must appear as the foo attribute of the former.
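Back to the question's layout, one possible workaround sketched against it (the alias name top_x is mine, for illustration only):

# test/package/__init__.py
from .. import x as top_x   # keep the top-level test/x.py under a name of its own
from .x import hello        # still sets the "x" attribute on test.package, but top_x is untouched

print top_x    # <module 'test.x' ...>
print hello    # 1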

Is there a reason why when importing python files, you still need to name the file.function_name?

I am currently doing a Python tutorial, but they use IDLE, and I opted to use the interpreter in the terminal. So I had to find out how to import a module I created. At first I tried
import my_file
then I tried calling the function inside the module by itself, and it failed. I looked around and doing
my_file.function
works. I am very confused about why this needs to be done if it was imported. Also, is there a way around it so that I can just call the function? Can anyone point me in the right direction? Thanks in advance.
If you wanted to use my_file.function by just calling function, try using the from keyword.
Instead of import my_file try from my_file import *.
You can also do this to import only parts of a module, like so:
from my_file import function1, function2, class1
To avoid clashes in names, you can import things with a different name:
from my_file import function as awesomePythonFunction
EDIT:
Be careful with this: if you import two modules (my_file, my_file2) that both define the same function, function will point to the function in whichever module you imported last. This can make interesting things happen if you are unaware of it.
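A small sketch of that pitfall (my_file2 and the function bodies are made up):

# my_file.py
def function():
    return "from my_file"

# my_file2.py
def function():
    return "from my_file2"

# interpreter session
from my_file import *
from my_file2 import *
function()   # "from my_file2" -- the second star import silently shadowed the first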
This is a central concept in Python. It uses namespaces (see the last line of import this). The idea is that with thousands of people writing many different modules, the likelihood of a name collision is reasonably high. For example, I write a module foo which provides a function baz, and Joe Smith writes a module bar which also provides a function baz. My baz is not the same as Joe Smith's, so in order to differentiate the two, we put them in namespaces (foo and bar), so mine can be called with foo.baz() and Joe's with bar.baz().
Of course, typing foo.baz() all the time gets annoying if you just want baz() and are sure that none of your other modules imported will provide any problems... That is why python provides the from foo import * syntax, or even from foo import baz to only import the function/object/constant baz (as others have already noted).
Note that things can get even more complex:
Assume you have a module foo which provides function bar and baz, below are a few ways to import and then call the functions contained inside foo...
import foo # >>> foo.bar();foo.baz()
import foo as bar # >>> bar.bar();bar.baz()
from foo import bar,baz # >>> bar(); baz()
from foo import * # >>> bar(); baz()
from foo import bar as cow # >>> cow() # This calls bar(), baz() is not available
...
A basic import statement is an assignment of the module object (everything's an object in Python) to the specified name. I mean this literally: you can use an import anywhere in your program you can assign a value to a variable, because they're the same thing. Behind the scenes, Python is calling a built-in function called __import__() to do the import, then returning the result and assigning it to the variable name you provided.
import foo
means "import module foo and assign it the name foo in my namespace. This is the same as:
foo = __import__("foo")
Similarly, you can do:
import foo as f
which means "import module foo and assign it the name f in my namespace." This is the same as:
f = __import__("foo")
Since in this case, you have only a reference to the module object, referring to things contained by the module requires attribute access: foo.bar etc.
You can also do from foo import bar. This creates a variable named bar in your namespace that points to the bar function in the foo module. It's syntactic sugar for:
bar = __import__("foo").bar
I don't really understand your confusion. You've imported the name my_file, not anything underneath it, so that's how you reference it.
If you want to import functions or classes inside a module directly, you can use:
from my_file import function
I'm going to incorporate many of the comments already posted.
To have access to function without having to refer to the module my_file, you can do one of the following:
from my_file import function
or
from my_file import *
For a more in-depth description of how modules work, I would refer to the documentation on python modules.
The first is the preferred solution, and the second is not recommended for many reasons:
It pollutes your namespace
It is not a good practice for maintainability (it becomes more difficult to find where specific names reside).
You typically don't know exactly what is imported
You can't use tools such as pyflakes to statically detect errors in your code
Python imports work differently than the #includes/imports in a static language like C or Java, in that Python executes the statements in a module. Thus, if two modules need to import a specific name (or *) out of each other, you can run into circular-reference problems, such as an ImportError when importing a specific name, or simply not getting the expected names defined (in the case where you from ... import *). When you don't request specific names, you don't run into the risk of circular references, as long as each name is defined by the time you actually want to use it.
The from ... import * also doesn't guarantee you get everything. As stated in the documentation on Python modules, a module (or a package's __init__.py) can define the __all__ name, in which case from ... import * imports only the names (or submodules) listed in __all__.
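A minimal sketch of __all__ in action (the module and its functions are made up):

# shapes.py
__all__ = ["area"]

def area(w, h):
    return w * h

def perimeter(w, h):   # not listed in __all__
    return 2 * (w + h)

# elsewhere:
#   from shapes import *
#   area(2, 3)                     # available
#   perimeter(2, 3)                # NameError: not brought in by the star import
#   from shapes import perimeter   # still fine when asked for explicitly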

from <module> import ... in __init__.py makes module name visible?

Take the following code example:
File package1/__init__.py:
from moduleB import foo
print moduleB.__name__
File package1/moduleB.py:
def foo(): pass
Then from the current directory:
>>> import package1
package1.moduleB
This code works in CPython. What surprises me about it is that the from ... import statement in __init__.py makes the moduleB name visible. According to the Python documentation, this should not be the case:
The from form does not bind the module name
Could someone please explain why CPython works that way? Is there any documentation describing this in detail?
The documentation misled you as it is written to describe the more common case of importing a module from outside of the parent package containing it.
For example, using "from example import submodule" in my own code, where "example" is some third party library completely unconnected to my own code, does not bind the name "example". It does still import both the example/__init__.py and example/submodule.py modules, create two module objects, and assign example.submodule to the second module object.
But, "from..import" of names from a submodule must set the submodule attribute on the parent package object. Consider if it didn't:
1. package/__init__.py executes when package is imported.
2. That __init__ does "from submodule import name".
3. At some point later, other completely different code does "import package.submodule".
At step 3, either sys.modules["package.submodule"] doesn't exist, in which case loading it again will give you two different module objects in different scopes; or sys.modules["package.submodule"] will exist but "submodule" won't be an attribute of the parent package object (sys.modules["package"]), and "import package.submodule" will do nothing. However, if it does nothing, the code using the import cannot access submodule as an attribute of package!
Theoretically, how importing a submodule works could be changed if the rest of the import machinery was changed to match.
If you just need to know what importing a submodule S from package P will do, then in a nutshell:
1. Ensure P is imported, or import it otherwise. (This step recurses to handle "import A.B.C.D".)
2. Execute S.py to get a module object. (Skipping details of .pyc files, etc.)
3. Store the module object in sys.modules["P.S"].
4. setattr(sys.modules["P"], "S", sys.modules["P.S"])
5. If that import was of the form "import P.S", bind "P" in the local scope.
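The same steps can be spelled out with importlib (the names "P" and "S" are placeholders for a real package and submodule):

import importlib
import sys

p = importlib.import_module("P")      # step 1: make sure the parent package is imported
s = importlib.import_module("P.S")    # steps 2-3: execute S and register it in sys.modules
assert sys.modules["P.S"] is s
assert sys.modules["P"].S is s        # step 4: the submodule is now an attribute of its parent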
This is because __init__.py represents the package1 module object itself at runtime, so every .py file in the directory becomes a submodule of it, and rewriting __all__ will not change that. If you make another file, e.g. example.py, and fill it with the same code as __init__.py, it will raise a NameError.
I think the CPython runtime uses a special algorithm when __init__.py looks up variables, different from other Python files, maybe like this:
looking for variable named "moduleB"
if not found:
    if __file__ == '__init__.py':
        # don't raise NameError; look for a file named moduleB.py
        if current dir contains a file named "moduleB.py":
            import moduleB
        else:
            raise NameError
