Where to import?

I'm learning Python, and today while writing some code I was trying to decide where to put an import statement.
It seems I can put an import statement just about anywhere, but how does the placement affect performance, the namespace, and anything else I don't know about yet?

The official good practice is to put all your imports at the beginning of your module or script, starting with standard library modules and packages, then third-party ones, then project-specific ones; cf. http://www.python.org/dev/peps/pep-0008/#imports
In practice, you sometimes have to defer an import into a function as a quick-and-dirty workaround for a circular dependency (the correct way to solve a circular dependency is to extract the relevant parts into another module, but with some frameworks you may have to accept the Q&D workaround).
Deferring an import into a function for performance reasons is not a good idea IMHO, but once again you sometimes have to break the rules.
Importing a module really means:
search the module_or_package in `sys.modules`
if not found:
    search the module_or_package_source in `sys.path`
    if not found:
        raise an ImportError
    create a `module` instance from the module_or_package_source
    # -> implies executing the top-level source code, which may raise anything
    store the `module` instance in `sys.modules`
bind the `module` name (or whatever name was imported from it) in the current namespace
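The steps above can be approximated in a few lines with importlib. This is only a sketch, assuming top-level modules (no packages or relative imports); import_sketch is a hypothetical helper, not part of the standard library, and unlike the outline above it registers the module in sys.modules before running its code, which is closer to what the real machinery does.

```python
import importlib.util
import sys


def import_sketch(name):
    """Hypothetical helper: roughly what `import name` does for a top-level module."""
    # 1. Look in the cache first.
    if name in sys.modules:
        return sys.modules[name]
    # 2. Ask the import finders (which search sys.path) for the module.
    spec = importlib.util.find_spec(name)
    if spec is None:
        raise ImportError("No module named %r" % name)
    # 3. Create the module object, cache it, then run its top-level code.
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)
    return module


sys.modules.pop("colorsys", None)       # make sure the first call really loads it
colorsys = import_sketch("colorsys")    # 4. bind the name in the current namespace
same_again = import_sketch("colorsys")  # served from sys.modules, nothing re-executed
print(colorsys is same_again)           # True
```

The second call returns the cached object, which is why repeating an import statement is cheap.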
As for what "current namespace" means, it is simply the namespace (the module's globals, a function's locals, or a class statement's body) in which the import statement is executed. Here's a simple script with all three cases:
try:
    re
except NameError:
    print("name 're' is not yet defined in the module's namespace")
    print("module namespace : %s" % globals())
import re
print("name 're' is now defined in the module's namespace")
print("module namespace : %s" % globals())
def foo():
    try:
        os
    except NameError:
        print("name 'os' is not yet defined in the function's namespace")
        print("function namespace : %s" % locals())
        print("name 'os' is not defined in the module's namespace either")
        print("module namespace : %s" % globals())
    import os
    print("name 'os' is now defined in the function's namespace")
    print("function namespace : %s" % locals())
    print("name 'os' is still not defined in the module's namespace")
    print("module namespace : %s" % globals())
foo()
print("After calling foo(), name 'os' is still not defined in the module's namespace")
print("module namespace : %s" % globals())
class Foo(object):
    try:
        os
    except NameError:
        print("name 'os' is not yet defined in the class namespace")
        print("but we cannot inspect this namespace now, so you have to take my word for it")
        print("but if you read the code you'll notice we can only get here if we have a NameError, so we have an indirect proof at least ;)")
        print("name 'os' is not defined in the module's namespace either, obviously")
        print("module namespace : %s" % globals())
    import os
    print("name 'os' is now defined in the class namespace")
    print("we still cannot inspect this namespace now but wait...")
    print("name 'os' is still not defined in the module's namespace either")
    print("module namespace : %s" % globals())
print("class namespace is now accessible via Foo.__dict__")
print("Foo.__dict__ is %s" % (Foo.__dict__,))
print("'os' is now an attribute of Foo - Foo.os = %s" % Foo.os)
print("name 'os' is still not defined in the module's namespace")
print("module namespace : %s" % globals())

When you import a module, you actually execute that module's code. So if you can control when that happens (for example, you only need the import when some condition holds), put it wherever you need it.
if some_condition:
    import foo
If you always need it, regardless of any condition, put it at the top of your file.
For starters, I would suggest always putting import statements at the top of the file.

Imports are usually placed at the top of a file, like in other programming languages. Having them all together makes it easy to see the dependencies of a module at a glance.
However, since an import executes a module's code it may be an expensive operation, and for this reason you'll sometimes see imports inside functions. NLTK is a notoriously heavy module, so when I use that, I sometimes do
def _tokenize(text):
    import nltk
    return nltk.word_tokenize(text)
def process_some_text(text):
    if isinstance(text, basestring):  # use str on Python 3
        text = _tokenize(text)
    # now do the actual processing
Because imports are cached, only the first call to _tokenize does the import. This also has the effect of making the dependency optional, since the import isn't attempted until the caller requests the relevant functionality.
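The "optional dependency" effect can be made explicit by catching ImportError around the deferred import. The following is a minimal sketch, not the answer's own code: hypothetical_fast_tokenizer is a made-up module name standing in for a heavy third-party package, and the whitespace split is an assumed fallback.

```python
def tokenize(text):
    """Split text into tokens, preferring an optional third-party tokenizer."""
    try:
        # Hypothetical optional dependency; the name is invented for this sketch.
        import hypothetical_fast_tokenizer
    except ImportError:
        # The dependency is missing: fall back to a plain whitespace split.
        return text.split()
    return hypothetical_fast_tokenizer.tokenize(text)


print(tokenize("imports can be deferred"))
```

Callers that never reach tokenize() never pay for the import, and callers without the package installed still get a working (if cruder) result.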

When a module is first imported, Python searches for the module and, if it is found, creates a module object.
Depending on the size of the module and how often it is used in your code, you may want to import it once and for all at the top of your file, or you may want to import it only when a particular condition is met.
Memory is always limited. If the condition that requires the module will be met only rarely, it makes sense to import the module only when the condition check passes.
This can be particularly useful if you need to import multiple heavy modules which each eat a lot of memory but are needed in different places. So rather than doing it like
import module1
import module2
def foo1():
    module1.function()
def foo2():
    module2.function()
foo1()
foo2()
Try something like
def foo1():
    import module1
    module1.function()
def foo2():
    import module2
    module2.function()
foo1()
foo2()
If the Python modules are simple enough, it makes sense to import them at the top of the file. That way, anyone else reading the code can see up front which modules your code uses.

You can put it anywhere in the file before you use it. You should not normally put it in loops (it will not do quite what you expect: the module is only loaded the first time), but you may put it in conditionals. Because the imported module's initialisation code is executed on import, you might save a bit of time by only loading it if you know that you will need it.
You may put it within functions, but then the name will only be in scope within that function, e.g.
>>> def f1():
...     import sys
...     print sys.version
...
>>> def f2():
...     print sys.version
...
>>> f2()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in f2
NameError: global name 'sys' is not defined
>>> f1()
2.5.4 (r254:67916, Dec 23 2008, 15:10:54) [MSC v.1310 32 bit (Intel)]
>>> f2()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 2, in f2
NameError: global name 'sys' is not defined
>>>
A good convention to follow is to put imports at the top of the file so that they are always available and easy to find.
You may also find, especially when testing package components, that you need to modify sys.path prior to some imports, so that modification should come early.
A convention I personally find useful is to have all your system imports first, then project package imports, then local imports, with appropriate comments between them.
If you use import modulename, from module import submodule, or import module as alias, then import order should make no major difference. But if you use from module import *, all bets are off: various modules can define the same name, and the last one imported is what you get. This is just one of the reasons it is discouraged.
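The name clash caused by star imports is easy to reproduce with two standard library modules that both define sqrt. This is a small sketch; the scratch dict named namespace is an assumption of the example, used so that the star imports (which are only legal at module level) can be run from anywhere.

```python
import cmath
import math

# A scratch dict stands in for a module's global namespace.
namespace = {}
exec("from math import *", namespace)
sqrt_after_math = namespace["sqrt"]

exec("from cmath import *", namespace)
sqrt_after_cmath = namespace["sqrt"]

print(sqrt_after_math is math.sqrt)    # True
print(sqrt_after_cmath is cmath.sqrt)  # True: the later import silently won
print(namespace["sqrt"](-1))           # 1j, where math.sqrt(-1) would raise ValueError
```

Nothing warns about the replacement, which is why the order of star imports can change what a program does.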

Related

Python 'from x import z' imports more than just 'z' [duplicate]

I've noticed that asyncio/__init__.py from Python 3.6 uses the following construct:
from .base_events import *
...
__all__ = (base_events.__all__ + ...)
The base_events symbol is not imported anywhere in the source code, yet the module still contains a local variable for it.
I've checked this behavior with the following code, put into an __init__.py with a dummy test.py next to it:
test = "not a module"
print(test)
from .test import *
print(test)
not a module
<module 'testpy.test' from 'C:\Users\MrM\Desktop\testpy\test.py'>
Which means that the test variable got shadowed after using a star import.
I fiddled with it a bit, and it turns out that it doesn't have to be a star import, but it has to be inside an __init__.py, and it has to be relative. Otherwise the module object is not being assigned anywhere.
Without the assignment, running the above example from a file that isn't an __init__.py will raise a NameError.
Where is this behavior coming from? Has this been outlined in the spec for import system somewhere? What's the reason behind __init__.py having to be special in this way? It's not in the reference, or at least I couldn't find it.

This behavior is defined in the import system documentation, section 5.4.2, Submodules:
When a submodule is loaded using any mechanism (e.g. importlib APIs,
the import or import-from statements, or built-in __import__()) a
binding is placed in the parent module’s namespace to the submodule
object. For example, if package spam has a submodule foo, after
importing spam.foo, spam will have an attribute foo which is bound to
the submodule.
A package namespace includes the namespace created in __init__.py plus extras added by the import system. The reason is namespace consistency:
Given Python’s familiar name binding rules this might seem surprising,
but it’s actually a fundamental feature of the import system. The
invariant holding is that if you have sys.modules['spam'] and
sys.modules['spam.foo'] (as you would after the above import), the
latter must appear as the foo attribute of the former.
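The invariant quoted above can be checked with a throwaway package. This is a sketch under stated assumptions: spam_demo and its submodule foo are hypothetical names written to a temporary directory purely for the demonstration.

```python
import importlib
import os
import sys
import tempfile

# Build a tiny package on disk: spam_demo/__init__.py (empty) and spam_demo/foo.py.
root = tempfile.mkdtemp()
package_dir = os.path.join(root, "spam_demo")
os.mkdir(package_dir)
with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
    handle.write("")  # deliberately empty: the package imports nothing itself
with open(os.path.join(package_dir, "foo.py"), "w") as handle:
    handle.write("value = 1\n")

sys.path.insert(0, root)
importlib.invalidate_caches()

spam_demo = importlib.import_module("spam_demo")
had_foo_before = hasattr(spam_demo, "foo")

# Importing the submodule makes the import system set spam_demo.foo.
importlib.import_module("spam_demo.foo")
print(had_foo_before)                                  # False
print(spam_demo.foo is sys.modules["spam_demo.foo"])   # True
```

Because the package's namespace is also the global namespace of its __init__.py, code inside __init__.py sees the new name too.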

This appears to have everything to do with how the interpreter resolves variable assignments at the module/submodule level. We may be able to learn more if we instead inspect the assignments using code executed outside the module we are investigating.
In my example, I have the following:
Code listing for src/example/package/module.py:
from logging import getLogger
__all__ = ['fn1']
logger = getLogger(__name__)
def fn1():
    logger.warning('running fn1')
    return 'fn1'
Code listing for src/example/package/__init__.py:
def print_module():
    print("`module` is assigned with %r" % module)
Now execute the following in the interactive interpreter:
>>> from example.package import print_module
>>> print_module()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/example.package/src/example/package/__init__.py", line 2, in print_module
    print("`module` is assigned with %r" % module)
NameError: name 'module' is not defined
So far so good, the exception looks perfectly normal. Now let's see what happens if example.package.module gets imported:
>>> import example.package.module
>>> print_module()
`module` is assigned with <module 'example.package.module' from '/tmp/example.package/src/example/package/module.py'>
Given that a relative import is shorthand for the full import, let's modify __init__.py to contain the absolute import (the same one that was just run in the interactive interpreter) and see what happens:
import example.package.module
def print_module():
    print("`module` is assigned with %r" % module)
Launching the interactive interpreter once more, we see this:
>>> from example.package import print_module
>>> print_module()
`module` is assigned with <module 'example.package.module' from '/tmp/example.package/src/example/package/module.py'>
Note that __init__.py actually represents the module bound to example.package. An intuition might be that when example.package.module is imported, the interpreter assigns module in example.package to aid with the resolution of example.package.module, regardless of whether an absolute or relative import is used. This seems to be a particular quirk of executing code in a module that may have submodules (i.e. __init__.py).
Actually, one more test. Let's see if there is just something weird to do with variable assignments. Modify src/example/package/__init__.py to:
import example.package.module
def print_module():
    print("`module` is assigned with %r" % module)
def delete_module():
    del module
The new function would test whether or not module was actually assigned to the scope at __init__.py. Executing this we learn that:
>>> from example.package import print_module, delete_module
>>> print_module()
`module` is assigned with <module 'example.package.module' from '/tmp/example.package/src/example/package/module.py'>
>>> delete_module()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/tmp/example.package/src/example/package/__init__.py", line 7, in delete_module
    del module
UnboundLocalError: local variable 'module' referenced before assignment
That UnboundLocalError does not actually prove the name was missing, though: a del statement inside a function makes module local to that function, so Python never looks at the global binding. The import system really does set module as an attribute of the example.package module object, and that object's namespace is the global scope of __init__.py, which is why print_module() can resolve the name once example.package.module has been imported.
I haven't looked at the specific PEPs that deal with assignment and name resolution for modules and imports. Still, this little exercise shows that the behaviour does not depend on relative imports and that the assignment is triggered regardless of when or where the import is done. Hopefully this provides a better understanding of how Python's import system resolves names relating to imported modules.

Can we use a fully qualified identifier in a module, without importing the module?

From Dynamic linking in C/C++ (dll) vs JAVA (JAR)
when i want to use this jar file in another project we use "package" or "import" keyword
You don't have to. This is just a short hand. You can use full
package.ClassName and there is no need for an import. Note: this
doesn't import any code or data, just allow you to use a shorter name
for the class.
e.g. there is no difference between
java.util.Date date = new java.util.Date();
and
import java.util.Date;
Date date = new Date(); // don't need to specify the full package name.
Is it the same for import in Python 3?
Can we use an identifier defined in a module without importing the module? Did I miss something in the following to make that happen?
What are the differences between Java's and Python's import?
>>> random.randint(1,25)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'random' is not defined
>>> import random
>>> random.randint(1,25)
18

Python is not Java. In Python you can only access names that are either builtins or defined in the current scope or one of its enclosing scopes, the "top-level" scope being the module namespace (AKA the "global" namespace).
The import statement (which is an executable statement, FWIW) does two things: first it loads the module (this actually happens only once per process; the module is then cached in sys.modules), then it binds the imported name(s) in the current scope. IOW this:
import foo
is syntactic sugar for
foo = __import__("foo")
and
from foo import bar
is roughly syntactic sugar for
foo = __import__("foo")
bar = getattr(foo, "bar")
del foo
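Both equivalences can be tried directly with the built-in __import__(). The snippet below is only an illustration using the standard json module; _module is a throwaway name chosen for this sketch.

```python
import sys

# Equivalent of `import json`: load (or fetch from the cache) and bind the name.
json = __import__("json")
print(json is sys.modules["json"])  # True

# Equivalent of `from json import dumps`: only `dumps` is meant to be bound here;
# `_module` is a temporary name used for this sketch.
_module = __import__("json", globals(), locals(), ["dumps"], 0)
dumps = _module.dumps
print(dumps({"a": 1}))  # {"a": 1}
```

In everyday code the import statement (or importlib.import_module) is preferable; __import__() is shown here only to expose what the statement does.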
Also you have to understand what "loading a module" really means: executing all the code at the module's top-level.
As I mentioned, import is an executable statement, but so are class and def. The def statement creates a code object from the function's body and signature, then creates a function object with this code object, and finally binds this function object to the function's name in the current scope. The class statement does the same thing for a class: it executes all the code at the class statement's top level in a temporary namespace, uses this namespace to create the class object, then binds the class object to its name.
IOW, everything happens at runtime, everything is an object (including functions, classes and modules), and everything you do with an import, class or def statement can be done "manually" too (more or less easily, though: manually creating a function is quite an involved process).
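As a rough illustration of doing things "manually", the sketch below builds a class with type() and a module with types.ModuleType. The names Manual, shout and handmade are invented for the example, and this is a simplification of what the class statement and the import machinery really do.

```python
import types

# What a `class` statement does: run the body in a temporary namespace,
# then build the class object from that namespace.
body_namespace = {}
class_body = "greeting = 'hello'\ndef shout(self):\n    return self.greeting.upper()\n"
exec(class_body, {}, body_namespace)
Manual = type("Manual", (object,), body_namespace)
print(Manual().shout())  # HELLO

# What loading a module does: create a module object and execute code in it.
handmade = types.ModuleType("handmade")
exec("answer = 6 * 7", handmade.__dict__)
print(handmade.answer)  # 42
```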
So as you can see, this really has nothing to do with how either Java or C++ work.

Short answer: No, you can't implicitly import a module in Python by using a fully qualified name.
Slightly longer answer:
In Python, importing a module can have side effects: a module can have module-level code, not all of its code has to be wrapped in functions or classes. Therefore, importing modules at arbitrary and unexpected locations could be confusing, since it would trigger those side effects when you don't expect them.
The recommended style (see https://www.python.org/dev/peps/pep-0008/ for details) is to put all your imports at the top of your module, and not to hide imports at unexpected places.

Python Importing - explaination

Similar Question: Understanding A Chain of Imports in Python
NB: I'm using Python 3.3
I have setup the following two files in the same directory to explain importing to myself, however I still don't get exactly what it's doing. I understand function and class definitions are statements that need to run.
untitled.py:
import string
class testing:
    def func(self):
        try:
            print(string.ascii_lowercase)
        except:
            print('not imported')
class second:
    x=1
print('print statement in untitled executed')
stuff.py:
from untitled import testing
try:
    t=testing()
    t.func()
except NameError:
    print('testing not imported')
try:
    print(string.ascii_uppercase)
except NameError:
    print('string not imported')
try:
    print(untitled.string.ascii_uppercase)
except NameError:
    print('string not imported in untitled')
try:
    s=second()
    print(s.x)
except NameError:
    print('second not imported')
This is the output I get from running stuff.py:
print statement in untitled executed
abcdefghijklmnopqrstuvwxyz
string not imported
string not imported in untitled
second not imported
The print statement in untitled.py is executed even though the import in stuff.py names only the testing class. Moreover, what is the string module's relation to stuff.py, given that it can be used from within the testing class yet not from outside?
Could somebody please explain this behaviour to me? What exactly does a "from import" statement do (what does it run)?

You can think of python modules as namespaces. Keep in mind that imports are not includes:
modules are only imported once
the first time, the top level code is executed
any imports, variable, function or class declarations affect only the module's local namespace
Suppose you have a module called foo.py:
import eggs
bar = "Let's drink, it's a bar"
So when you do a from foo import bar in another module, you make bar available in the current namespace. The module eggs will be available as foo.eggs if you do an import foo. If you do a from foo import *, then eggs, bar and everything else in the module namespace will also be in the current namespace, but never do that: wildcard imports are frowned upon in Python.
If you do an import foo and then an import eggs, the top-level code of eggs will be executed only once and the module namespace will be stored in the module cache: if another module imports it, the information will be pulled from this cache. If you are going to use it, then import it; there is no need to worry about multiple imports executing the top-level code multiple times.
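The "executed only once" behaviour can be observed with a throwaway module. In this sketch eggs_demo is a hypothetical module written to a temporary directory, and its output is captured so the number of executions can be counted.

```python
import contextlib
import importlib
import io
import os
import sys
import tempfile

# Write a throwaway module whose top-level code announces itself.
module_dir = tempfile.mkdtemp()
with open(os.path.join(module_dir, "eggs_demo.py"), "w") as handle:
    handle.write("print('eggs_demo top-level code running')\n")

sys.path.insert(0, module_dir)
importlib.invalidate_caches()

captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    first = importlib.import_module("eggs_demo")   # executes the module
    second = importlib.import_module("eggs_demo")  # served from sys.modules

run_count = captured.getvalue().count("running")
print(run_count, first is second)  # 1 True
```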
Python programmers are very fond of namespaces; I always try to use import foo and then foo.bar instead of from foo import bar if possible, because it keeps the namespace clean and prevents name clashes.
That said, the import mechanism is hackable, you can make python import statement work even with files that are not python.

The from statement is no different from import with regard to loading behaviour: the top-level code is always executed when the module is loaded. from just controls which parts of the loaded module are added to the current scope (the first point is the most important):
The from form uses a slightly more complex process:
1. find the module specified in the from clause, loading and initializing it if necessary;
2. for each of the identifiers specified in the import clauses:
   1. check if the imported module has an attribute by that name
   2. if not, attempt to import a submodule with that name and then check the imported module again for that attribute
   3. if the attribute is not found, ImportError is raised.
   4. otherwise, a reference to that value is bound in the local namespace, using the name in the as clause if it is present, otherwise using the attribute name
Thus you can access the contents of a module partially imported with from with this inelegant trick:
print(sys.modules['untitled'].string.ascii_uppercase)
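To see exactly which names a from import binds, one can run it against a scratch namespace. This is a small sketch using the standard json module in place of untitled; the dict named scope is an assumption of the example and stands in for the importing module's globals.

```python
import sys

# A scratch dict stands in for the importing module's namespace.
scope = {}
exec("from json import dumps", scope)

bound_names = sorted(name for name in scope if not name.startswith("__"))
print(bound_names)                                   # ['dumps']
print("json" in sys.modules)                         # True: the whole module was loaded
print(sys.modules["json"].dumps is scope["dumps"])   # True
```

Only dumps lands in the namespace, yet the complete module sits in sys.modules, which is what makes the trick above work.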

In your first file (untitled.py), when Python compiles and runs this file (because you imported it), it creates the two class objects and executes the print statement. Note that it will print even if you run untitled.py from the command line.
In your second file (stuff.py), to add to Paulo's comments, you have only imported the testing class into your namespace, so of the two classes in untitled.py only that one is available.
However, if you just say
import untitled
your third try statement will work, since untitled will then be in the namespace.
Next thing: try importing untitled.testing :)

Bug in Python's documentation?

I am reading http://docs.python.org/2/tutorial/modules.html#more-on-modules and wonder if the following is correct:
Modules can import other modules. It is customary but not required to
place all import statements at the beginning of a module (or script,
for that matter). The imported module names are placed in the
importing module’s global symbol table.
Apparently not:
>>> def foo(): import sys
...
>>> foo()
>>> sys.path
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
NameError: name 'sys' is not defined
See http://ideone.com/cLK09v for an online demo.
So, is it a bug in Python's documentation, or am I misunderstanding something?

Yes, this is a documentation error. The import statement imports the names to the current namespace. Usually import is used outside of functions and classes, but as you've discovered, it does work within them. In your example function, the module is imported into the function's local namespace when the function is called. (Which you didn't do, but that wouldn't make it available outside the function anyway.)
The global keyword does work here, however:
def foo():
    global sys
    import sys
foo()
sys.path

I don't think this is actually an error in the documentation, but more of a mis-interpretation. You simply have a scope issue. You are importing it in the scope of the function foo(). You could certainly do as the documentation suggests and put the import at the bottom of the file or somewhere else in the file that would still have the same global scope as your module. The problem is "The imported module names are placed in the importing module’s global symbol table", where the scope of the module you are importing into is contained in the function foo(), not at the module's global level.

Some confusion regarding imports in Python

I'm new to Python and there's something that's been bothering me for quite some time. I read in "Learning Python" by Mark Lutz that when we use a from statement to import a name present in a module, it first imports the module, then assigns a new name to it (i.e. the name of the function, class, etc. present in the imported module) and then deletes the module object with the del statement. However what happens if I try to import a name using from that references a name in the imported module that itself is not imported? Consider the following example in which there are two modules mod1.py and mod2.py:
#mod1.py
from mod2 import test
test('mod1.py')
#mod2.py
def countLines(name):
    print len(open(name).readlines())
def countChars(name):
    print len(open(name).read())
def test(name):
    print 'loading...'
    countLines(name)
    countChars(name)
    print '-'*10
Now see what happens when I run or import mod1:
>>> import mod1
loading...
3
44
----------
Here when I imported and ran the test function, it ran successfully although I didn't even import countChars or countLines, and the from statement had already deleted the mod2 module object.
So I basically need to know why this code works even though considering the problems I mentioned it shouldn't.
EDIT: Thanx alot to everyone who answered :)

Every function has a __globals__ attribute, which holds a reference to the environment where it looks up global variables and functions.
The test function is therefore linked to the global variables of mod2. So when it calls countLines, the interpreter will always find the right function, even if you wrote a new one with the same name in the module importing the function.

I think you're wrestling with the way python handles namespaces. when you type from module import thing you are bringing thing from module into your current namespace. So, in your example, when mod1 gets imported, the code is evaluated in the following order:
from mod2 import test #Import mod2, bring test function into current module namespace
test("mod1.py") #run the test function (defined in mod2)
And now for mod2:
#create a new function named 'test' in the current (mod2) namespace
#the first time this module is imported. Note that this function has
#access to the entire namespace where it is defined (mod2).
def test(name):
    print 'loading...'
    countLines(name)
    countChars(name)
    print '-'*10
All of this is important because Python lets you choose exactly what you want to pull into your namespace. For example, say you have a module1 which defines a function cool_func. Now you are writing another module (module2), and it makes sense for module2 to have a function cool_func as well. Python allows you to keep those separate. In module3 you could do:
import module1
import module2
module1.cool_func()
module2.cool_func()
Or, you could do:
from module1 import cool_func
import module2
cool_func() #module1
module2.cool_func()
or you could do:
from module1 import cool_func as cool
from module2 import cool_func as cooler
cool() #module1
cooler() #module2
The possibilities go on ...
Hopefully my point is clear. When you import an object from a module, you are choosing how you want to reference that object in your current namespace.

The other answers are better articulated than this one, but if you run the following you can see that countChars and countLines are actually both defined in test.__globals__:
from pprint import pprint
from mod2 import test
pprint(test.__globals__)
test('mod1')
You can see that importing test brings along the other globals defined in mod2, letting you run the function without worrying about having to import everything you need.

Each module has its own scope. Within mod1, you cannot use the names countLines or countChars (or mod2).
mod2 itself isn't affected in the least by how it happens to be imported elsewhere; all names defined in it are available within the module.
If the webpage you reference really says that the module object is deleted with the del statement, it's wrong. del only removes names, it doesn't delete objects.
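The point about del can be checked with any standard library module. This is a minimal sketch using textwrap as a stand-in for mod2: deleting the name leaves the module object cached, and a function taken from it keeps resolving its globals there.

```python
import sys
import textwrap

dedent = textwrap.dedent
del textwrap  # removes the name only; the module object stays alive

print("textwrap" in sys.modules)                                # True
print(dedent.__globals__ is sys.modules["textwrap"].__dict__)   # True
print(dedent("    still works"))                                # still works
```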

From A GUIDE TO PYTHON NAMESPACES,
Even though modules have their own global namespaces, this doesn’t mean that all names can be used from everywhere in the module. A scope refers to a region of a program from where a namespace can be accessed without a prefix. Scopes are important for the isolation they provide within a module. At any time there are a number of scopes in operation: the scope of the current function you’re in, the scope of the module and then the scope of the Python builtins. This nesting of scopes means that one function can’t access names inside another function.
Namespaces are also searched for names inside out. This means that if there is a certain name declared in the module’s global namespace, you can reuse the name inside a function while being certain that any other function will get the global name. Of course, you can force the function to use the global name by prefixing the name with the ‘global’ keyword. But if you need to use this, then you might be better off using classes and objects.

An import statement loads the whole module into memory, which is why the test() function ran successfully.
Because you used a from statement, you can't use countLines and countChars directly, but test can certainly call them.
A from statement basically loads the whole module and binds only the imported function, variable, etc. in the current (here global) namespace.
For example:
>>> from math import sin
>>> sin(90) #now sin() is a global variable in the module and can be accesed directly
0.89399666360055785
>>> math
Traceback (most recent call last):
  File "<pyshell#2>", line 1, in <module>
    math
NameError: name 'math' is not defined
>>> vars() #shows the current namespace, and there's sin() in it
{'__builtins__': <module '__builtin__' (built-in)>, '__file__': '/usr/bin/idle', '__package__': None, '__name__': '__main__', 'main': <function main at 0xb6ac702c>, 'sin': <built-in function sin>, '__doc__': None}
consider a simple file, file.py:
def f1():
    print 2+2
def f2():
    f1()
import only f2:
>>> from file import f2
>>> f2()
4
Although I imported only f2() and not f1(), f2() ran f1() successfully. That's because the whole module is loaded into memory: we can only access f2(), but f2() can access the other parts of its module.
