Just after the standard Python module imports?
If I postpone it to the main function and do my module-specific imports before it, it gives an error (which is quite obvious). The Python style guide nowhere mentions the correct location for it.
It should go before the import or from statements that need it (which as you say is obvious). So for example a module could start with:
import sys
import os
import math
try:
    import foo
except ImportError:
    if 'foopath' in sys.path:
        raise
    sys.path.append('foopath')
    import foo
Note that I've made the append conditional (on the import failing and the specific module's path not yet being on sys.path) to avoid the risk of sys.path ending up with dozens of occurrences of the string 'foopath', which would not be particularly helpful ;-).
One reason this isn't mentioned in PEP 8 or other good Python style guides is that modifying sys.path isn't something you want to do in a real program; it makes your program less robust and portable. A better solution might be to put your package somewhere that is already on sys.path, or to define PYTHONPATH system-wide to include your package.
I often use a shell script to launch my Python applications: I put my sys.path.insert (or append) statement just after the "standard" Python module imports in that launcher script.
Using sys.path.insert(0, ...) puts your directory at the front of the search path, so your imports take priority over everything else on the list.
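For example, a minimal sketch (the directory and module names here are hypothetical):

import sys

# Prepend so this directory is searched before everything else;
# sys.path.append() would instead make it the last place searched.
sys.path.insert(0, '/opt/myapp/lib')   # hypothetical directory

import mymodule   # hypothetical module living in /opt/myapp/lib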
I generally do it before importing anything. If you're worried that your module names might conflict with the Python stdlib names, then change your module names!
I think it's a matter of taste, but most people tend to put it right after import sys :-)
I prefer wrapping it in an extra function:
import sys

def importmod_abs(name, path):
    sys.path.append(path)      # temporarily extend the search path
    try:
        return __import__(name)
    finally:
        sys.path.pop()         # always restore sys.path, even on failure
This way sys.path remains clean. Of course, that's only applicable to certain module structures. In any case, I'd first import everything that works without altering sys.path.
I'm wondering about the preferred way to import packages in a Python application. I have a package structure like this:
project.app1.models
project.app1.views
project.app2.models
project.app1.views imports project.app1.models and project.app2.models. There are two ways to do this that come to mind.
With absolute imports:
import A.A
import A.B.B
or with explicit relative imports, as introduced in Python 2.5 with PEP 328:
# explicit relative
from .. import A
from . import B
What is the most pythonic way to do this?
Python relative imports are no longer strongly discouraged, but using absolute_import is strongly suggested when you do use them.
Please see this discussion citing Guido himself:
"Isn't this mostly historical? Until the new relative-import syntax
was implemented there were various problems with relative imports. The
short-term solution was to recommend not using them. The long-term
solution was to implement an unambiguous syntax. Now it is time to
withdraw the anti-recommendation. Of course, without going overboard
-- I still find them an acquired taste; but they have their place."
The OP correctly links to PEP 328, which says:
Several use cases were presented, the most important of which is being able to rearrange the structure of large packages without having to edit sub-packages. In addition, a module inside a package can't easily import itself without relative imports.
Also see the almost-duplicate question When or why to use relative imports in Python.
Of course, it still remains partly a matter of taste. While it's easier to move code around with relative imports, that might also unexpectedly break things; and renaming the imports is not that difficult.
To force the new behaviour from PEP 328 use:
from __future__ import absolute_import
In this case, implicit relative imports will no longer be possible (e.g. import localfile will no longer work, only from . import localfile). For clean and future-proof behaviour, using absolute_import is advisable.
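As a small sketch of the difference (module names are hypothetical; assume pkg/ contains __init__.py, consumer.py and localfile.py):

# pkg/consumer.py, running under Python 2
from __future__ import absolute_import

# import localfile         # implicit relative import: now raises ImportError
from . import localfile    # explicit relative import: still works
import os                  # absolute import: always resolves to the stdlib os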
An important caveat is that, because of PEP 338 and PEP 366, relative imports require the Python file to be imported as a module; you cannot directly execute a file.py that contains a relative import, or you'll get a ValueError: Attempted relative import in non-package.
This limitation should be taken into account when evaluating the best approach. Guido is against running scripts from a module in any case:
I'm -1 on this and on any other proposed twiddlings of the __main__ machinery.
The only use case seems to be running scripts that happen to be living inside a module's directory, which I've always seen as an antipattern.
To make me change my mind you'd have to convince me that it isn't.
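To illustrate the caveat about relative imports and direct execution, a hedged sketch (the package and module names are hypothetical):

# pkg/mod.py  (pkg/ also contains __init__.py and helper.py)
from . import helper   # relative import: pkg.mod must be imported as a module

# $ python pkg/mod.py
# ValueError: Attempted relative import in non-package
#
# $ python -m pkg.mod    # works: the file runs as a module inside its package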
Exhaustive discussions on the matter can be found on SO; regarding Python 3, this one is quite comprehensive:
Relative imports in Python 3
Absolute imports. From PEP 8:
Relative imports for intra-package imports are highly discouraged. Always use the absolute package path for all imports. Even now that PEP 328 [7] is fully implemented in Python 2.5, its style of explicit relative imports is actively discouraged; absolute imports are more portable and usually more readable.
Explicit relative imports are a nice language feature (I guess), but they're not nearly as explicit as absolute imports. The more readable form is:
import A.A
import A.B.B
especially if you import several different namespaces. If you look at some well-written projects/tutorials that include imports from within packages, they usually follow this style.
The few extra keystrokes you take to be more explicit will save others (and perhaps you) plenty of time in the future when they're trying to figure out your namespace (especially if you migrate to 3.x, in which some of the package names have changed).
Relative imports not only leave you free to rename your package later without changing dozens of internal imports; I have also had success with them in solving certain problems, such as circular imports and namespace packages, because they do not send Python "back to the top" to restart the search for the next module from the top-level namespace.
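To make the two styles concrete for the question's layout, here is how project/app1/views.py might import its dependencies either way (a sketch; it assumes project, app1 and app2 are all packages with __init__.py files):

# project/app1/views.py

# absolute imports
import project.app1.models
import project.app2.models

# explicit relative imports (PEP 328); the alias avoids rebinding 'models'
from . import models
from ..app2 import models as app2_models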
A few hours ago I was careless enough to name my short script code.py. Apparently there is such a stdlib module, used e.g. by ptvsd and pdb. This led to my code.py being imported instead, which caused a bunch of nested unhandled exceptions with missing imports when I tried to debug my code. What made it more frustrating was that the traceback showed no sign of my code.py file being imported, so I spent quite a while finding the source of the problem.
I'd like to avoid such situations in the future, so my question is: what's the best practice to ensure that the modules you use aren't importing your code by mistake due to such a name collision?
This is a common gotcha, and actually there's no failsafe way to avoid it. At the very least you can make sure your modules all live in packages (at least one package, if it's a small project with no reusable code), so that you use them as from mypackage import code instead of import code (and make sure you use absolute imports), and that you always run your code from the directory containing the package(s), not from within a package directory itself (Python inserts the current working directory at the first position of sys.path).
This won't prevent ALL possible name-masking issues, but it should minimize them. From experience, once you've run into this kind of issue at least once, you usually spot the symptoms very quickly; the most common and quite obvious one is that some totally unrelated stdlib or third-party module starts crashing with ImportErrors or AttributeErrors (with "module X has no attribute Y" messages). At this point, if you just added a new module to your own code, chances are it's the new module that breaks everything, so you can just rename it (making sure you clean up the .pyo/.pyc files, if any) and see if that solves the issue. Otherwise, check the traceback to find out which import fails; most of the time you'll find you have a module or package with the same name in your current working directory.
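When you suspect this kind of shadowing, a quick check is to ask Python where it actually found the module; a minimal sketch, using code as the example name:

import code
print(code.__file__)   # a path inside your project means your own file is
                       # shadowing the stdlib 'code' module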
You can't completely prevent somebody from importing your module by mistake.
You can structure your code better in subpackages, going from "well-known" to "less-known" names. E.g., if you are developing code for a certain company, you might want a structure like:
company.country.location.department.function
If your code then becomes more widely accepted and used by others, you can move it up the namespace, making it available as company.country.location.department rather than only company.country.location.department.function.
You can modify sys.path at the beginning of your main module, before you start importing other modules:
import sys
sys.path.append(sys.path.pop(0))
so that the main module's starting directory is moved from the front of the module search path to the end, letting other modules with the same name take precedence over your local files.
EDIT: To all the downvoters, this answer actually works.
For example, running code.py with the following content:
import pdb
pdb.run('print("Hello world")')
would raise:
AttributeError: module 'pdb' has no attribute 'run'
because pdb itself imports a module named code, which here resolves to our own code.py; during that recursive import pdb is only partially initialized and has no run defined yet. Running code.py with the following content instead:
import sys
sys.path.append(sys.path.pop(0))
import pdb
pdb.run('print("Hello world")')
would execute pdb.run properly:
> <string>(1)<module>()
(Pdb)
I can't wrap my head around how the import statement works in Python.
It is said to search for packages in the directories listed in sys.path. However, even though the sys module is available automatically in every Python program, it's not imported automatically. So does the import statement import the sys module under the hood?
There are two phases here, which you are conflating a little.
Python has to find the actual file (containing code) that you want to import, parse it, execute it, and store it somewhere.
It then has to bind the name of the imported module locally to the module object.
That is, the process "find the module sys and turn it into a module object" is not the same as "define the variable sys to mean the module".
You can check which modules have been loaded by looking in sys.modules.
As a separate issue, there are a few basic modules that are actually hardcoded into the interpreter rather than represented as separate files on disk. sys is one of them: there is no sys.py file; instead, it's compiled C code built into the Python executable itself.
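You can verify both points from the interpreter itself; a short sketch:

import sys

print('sys' in sys.modules)               # True: sys was loaded before your code ran
print('sys' in sys.builtin_module_names)  # True: compiled into the interpreter
print(hasattr(sys, '__file__'))           # False: there is no sys.py on disk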
sys module vs. import statement - a.k.a. "chicken or the egg?"
I believe the sys module documentation says it all:
This module provides access to some variables used or maintained by the interpreter and to functions that interact strongly with the interpreter. It is always available.
So, to put it shortly and in different words: the interpreter maintains some variables that you can access through the sys module; sys.path is one such variable.
Question no. 2. - How import works
When it comes to how the import statement works, you can read about it here: http://docs.python.org/2/reference/simple_stmts.html#import. However, it is not really related to the main part of your question: the relationship between the import statement and importing the sys module.
Specs: Python 2.7
I'm working on a project that has several modules, and I want to activate some features from the __future__ module in all of them. I would like to import all the features I need in one module, then import that single module into every other one, and have those features be active in all of them, or something to that effect.
I tried:
[A.py]
from __future__ import division
[B.py]
import A
print(1/2)
Running B.py, the division was still integer division. Then I tried:
[A.py]
print(1/2)
[B.py]
from __future__ import division
import A
Running B.py gave the same result. With both previous examples I also tried switching 'import A' by 'from A import *' with the same results.
I searched Google for a while and, obviously enough, found the best description of how the __future__ module works in the Python documentation. There I could only find the assurance that the features would be active in the module they were imported into, without any mention of how to do it globally.
So I'd like to know if there is a way of doing this, either the way I described, or creating some sort of runtime configuration file, or through some other means.
There's no way to do this in-language; you really can't make __future__ imports global in this sense. (Well, you probably can replace the normal import statements with something complicated around imp or something. See the Future statement documentation and scroll down to "Code compiled by…" But anything like this is almost certainly a bad idea.)
The reason is that from __future__ import division isn't really a normal import. Or, rather, it's more than a normal import. You actually do get a name called division that you can inspect, but just having that value has no effect—so passing it to other modules doesn't affect those modules. On top of the normal import, Python has special magic that detects __future__ imports at the top of a module, or in the interactive interpreter, and changes the way your code is compiled. See future for the "real import" part, and Future statements for the "magic" part, if you want all the details.
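You can poke at the "normal import" half yourself: division is just an ordinary _Feature object recording when the feature was introduced and becomes mandatory, and inspecting or passing it around has no effect on compilation. A sketch:

from __future__ import division
import __future__

print(division)   # something like _Feature((2, 2, 0, 'alpha', 2), (3, 0, 0, 'alpha', 0), 8192)
print(__future__.division is division)   # True: it's a plain module attribute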
And there's no configuration file that lets you do this. But there is a command-line parameter:
python -Qnew main.py
This has the same effect as doing a from __future__ import division everywhere.
You can add -Qnew to your #! lines, or alias pyfuturediv='python -Qnew' (or even alias python='python -Qnew') in your shell, or whatever, which may be as good as a configuration file for your purposes.
But really, if you want to make sure module B gets new-style division, you probably should have the __future__ declaration in B in the first place.
Or, of course, you could just write for Python 3.0+ instead of 2.3-2.7. (Note that some of the core devs were against having command-line arguments, because "the right way to get feature X globally is to use a version of Python >= feature X's MandatoryRelease".) Or write // whenever you mean floor division.
Another possibility is to use six, a module designed to let you write code that's almost Python 3.3 and have it work properly in 2.4-2.7 (and 3.0-3.2). For example, you don't get a print function, but you do get a print_ function that works exactly the same. You don't get Unicode literals, but you get u() fake literals—which, together with a UTF-8 encoding declaration in the source, is almost good enough. And it provides a whole lot of stuff that you can't get from __future__ as well—StringIO and BytesIO, exec as a function, the next function, etc.
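A brief sketch of what that looks like with six (these helpers all exist in six, though version support varies):

# -*- coding: utf-8 -*-
import six

six.print_('hello')          # print as a function, without any __future__ import
s = six.u('caf\u00e9')       # a unicode string on both Python 2 and 3
buf = six.StringIO()         # text buffer that works on both
six.exec_('x = 1')           # exec as a function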
If the problem is that you have 1000 source files, and it's a pain to edit them all, you could use sed, or use 3to2 with just the option that fixes division, or…
Another approach would be using isort. isort has a -a command line flag to add imports to files that you specify. Simply running isort without arguments will run it recursively on all python files in the current working directory and all subdirectories.
If, like me, you have a virtual environment inside that folder, and are using git (or have an equivalent way of listing only your files) and don't want to run it on all files inside that virtual environment, you can use something like:
git ls-tree -r HEAD --name-only | grep "\.py$" | xargs isort -y -a "from __future__ import division"