Why does python not import every module at startup automatically? - python

I was having a play around with Python 2.7 and everybody knows that at the start of every program, you always have to import modules. For example:
import random
import time
for x in range(1, 300):
print random.randint(1,100)
time.sleep(1)
print "Done!"
Anyway, I was thinking, why do I have to import all my modules manually? Why doesn't Python just import them all like this.
Sure, I can understand why it does not import like this:
from random import randint
from time import *
for x in range(1, 300):
print randint(1,100)
sleep(1)
print "Done!"
As some function names may clash. But, if you have to define where the function is at the start, so for example random. in random.randint(1,100).
Now modern computers are so powerful, it seems logical to import every module automatically instead of wasting lines of code, and time by having to find which module you need then importing it manually when it can easily be automated.
So, why does python not import every module at startup automatically?
EDIT:
I have made a new version of a little program that imports every module that I can find by running:
import sys
sys.builtin_module_names
Here are the results:
x = int(1000000)
def test():
global x
x -= 1
print "Iterations Left: ", x
import __builtin__
import __main__
import _ast
import _bisect
import _codecs
import _codecs_cn
import _codecs_hk
import _codecs_iso2022
import _codecs_jp
import _codecs_kr
import _codecs_tw
import _collections
import _csv
import _functools
import _heapq
import _hotshot
import _io
import _json
import _locale
import _lsprof
import _md5
import _multibytecodec
import _random
import _sha
import _sha256
import _sha512
import _sre
import _struct
import _subprocess
import _symtable
import _warnings
import _weakref
import _winreg
import array
import audioop
import binascii
import cPickle
import cStringIO
import cmath
import datetime
import errno
import exceptions
import future_builtins
import gc
import imageop
import imp
import itertools
import marshal
import math
import mmap
import msvcrt
import nt
import operator
import parser
import signal
import strop
import sys
import thread
import time
import xxsubtype
import zipimport
import zlib
def start():
from timeit import Timer
t = Timer("test()", "from __main__ import test")
print t.timeit()
start()

Because you don't need all of it. There is no point in loading every library if you don't need them.
EDIT:
I copied my libs folder to a test directory and made it into a package by adding an __init__.py file to it. In this file I added:
import os
import glob
__all__ = [ os.path.basename(f)[:-3] for f in glob.glob(os.path.dirname(__file__)+"/*.py")]
I created a test script that contains:
from Lib import *
print('Hello')
When I try to run it in the shell all it does is print 'The Zen of Python' by Tim Peters, opens this webcomic in my browser (2 things I absolutely did not see coming) and throws the following error:
Traceback (most recent call last):
File "C:\Users\Hannah\Documents\dropBox\Python\test\test.py", line 1, in <module>
from Lib import *
AttributeError: 'module' object has no attribute 'crypt'
It takes a noticable amount of time before it does any of this, about 10-15 seconds

Maybe what you would like is a feature that automatically imports the libraries that are used in your script without needing to specify them at the beginning. I found this on the Internet http://www.connellybarnes.com/code/autoimp/
You just need one import at the beginning of your script
from autoimp import *
All other modules are loaded "lazily", i.e. when they are first used.
Example in the interactive shell:
>>> random.random()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'random' is not defined
>>> from autoimp import *
>>> random.random()
0.0679000238267422
From the docs:
For ultimate laziness, place the command "from autoimp import *" in your PYTHONSTARTUP file. Now your interactive session has all modules available by default.

Every module you import takes time to import. Importing every built-in module every time you start Python would kill performance in a lot of important scenarios where new Python interpreters are started frequently.

Python does have a set of modules that are always loaded, its call __builtins__ :).
Python's builtins provide the import statement for you to extend your scope with even more modules! But as other posts have said, deciding your script needs these modules it up to you. -- I have looked into mutating __builtins__ and I promise you, explicitly importing what you need is the better option.
((Big rant about not using from name import * cut from here))
Since most of writing python ultimately becomes packaging and installing that writen python somewhere, this is my goto set of resources for getting a handle on python's infamous import:
Start by sticking to standard tools and libraries (https://packaging.python.org/current/)
Reading and understand The Google Python Standards Guide (https://google.github.io/styleguide/pyguide.html),
Read the Zen of Python (https://www.python.org/dev/peps/pep-0020/)
Be Pythonic (basically adhere to "The Zen of Python"), https://www.youtube.com/watch?v=wf-BqAjZb8M
Supplement your problem-space with tips from The Hitchhikers Guid to Python (http://docs.python-guide.org/en/latest/)
Be preapred to package your code (https://packaging.python.org/distributing/ (Doc), https://github.com/pypa/sampleproject/ (Example))
Being prepared to debug someone else's and, Your own code by getting familiar with tools like:
pdb (import pdb; pdb.set_trace(), > pp variable),
print(help(variable)),
dir(variable),
and pprint.pprint( variable.__dict__ )

Related

ModuleNotFoundError while importing from alias

e.g.
import os as my_os
import my_os.path
ModuleNotFoundError: No module named 'my_os'
but the following script is ok
import os
import os.path
You can't do that in Python.
import statements are importing from Python file names.
You aren't renaming the file of os to my_os, therefore this wouldn't work.
As mentioned in the documentation:
The import statement combines two operations; it searches for the named module, then it binds the results of that search to a name in the local scope.
Tried the similar way as os.py did:
import json as my_json
from my_json.decoder import * # ModuleNotFoundError: No module named 'my_json'
import sys
import json.decoder as my_decoder
sys.modules['my_json.decoder'] = my_decoder
from my_json.decoder import * # it's ok now

Module undefined after importing it in a function

I'm trying to use a function called start to set up my enviroment in python. The function imports os.
After I run the function and do the following
os.listdir(simdir+"main")
I get a error that says os not defined
code
>>> def setup ():
import os.path
import shutil
simdir="e:\\"
maindir="c:\\backup\\bitcois\\test exit\\"
>>> setup()
>>> os.listdir(simdir+"main")
Traceback (most recent call last):
File "<pyshell#10>", line 1, in <module>
os.listdir(simdir+"main")
NameError: name 'os' is not defined
The import statement is scoped. When importing modules they are defined for the local namespace.
From the documentation:
Import statements are executed in two steps: (1) find a module, and initialize it if necessary; (2) define a name or names in the local namespace (of the scope where the import statement occurs). [...]
So in your case the os package is only defined within function setup.
You are getting this error because you are NOT importing the whole os library but just the os.path module. In this way, the other resources at the os library are not made available for your use.
In order to be able to use the os.listdir method, you need to either import it alongside the os.path like this:
>>> def setup ():
import os.path, os.listdir
import shutil
simdir="e:\\"
maindir="c:\\backup\\bitcois\\test exit\\"
or import the full library:
>>> def setup ():
import os
import shutil
simdir="e:\\"
maindir="c:\\backup\\bitcois\\test exit\\"
You can read more here:
https://docs.python.org/2/tutorial/modules.html
try:
import os.path
import shutil
import glob
def setup ():
global simdir
simdir="e:\\"
maindir="c:\\backup\\bitcois\\test exit\\"
setup()
os.listdir(simdir+"main")
You need to return the paths and assign the returned values in the global scope. Also, import os too:
import os
def setup():
# retain existing code
return simdir, maindir
simdir, maindir = setup()
When you import os or do any sort of command within a function, the command's effect only last while that function itself is running. What you need to do is
import os
...Do your function and other code
This way, your import lasts for the whole program :).

How to check if a module/library/package is part of the python standard library?

I have installed sooo many libraries/modules/packages with pip and now I cannot differentiate which is native to the python standard library and which is not. This causes problem when my code works on my machine but it doesn't work anywhere else.
How can I check if a module/library/package that I import in my code is from the python stdlib?
Assume that the checking is done on the machine with all the external libraries/modules/packages, otherwise I could simply do a try-except import on the other machine that doesn't have them.
For example, I am sure these imports work on my machine, but when it's on a machine with only a plain Python install, it breaks:
from bs4 import BeautifulSoup
import nltk
import PIL
import gensim
You'd have to check all modules that have been imported to see if any of these are located outside of the standard library.
The following script is not bulletproof but should give you a starting point:
import sys
import os
external = set()
exempt = set()
paths = (os.path.abspath(p) for p in sys.path)
stdlib = {p for p in paths
if p.startswith((sys.prefix, sys.real_prefix))
and 'site-packages' not in p}
for name, module in sorted(sys.modules.items()):
if not module or name in sys.builtin_module_names or not hasattr(module, '__file__'):
# an import sentinel, built-in module or not a real module, really
exempt.add(name)
continue
fname = module.__file__
if fname.endswith(('__init__.py', '__init__.pyc', '__init__.pyo')):
fname = os.path.dirname(fname)
if os.path.dirname(fname) in stdlib:
# stdlib path, skip
exempt.add(name)
continue
parts = name.split('.')
for i, part in enumerate(parts):
partial = '.'.join(parts[:i] + [part])
if partial in external or partial in exempt:
# already listed or exempted
break
if partial in sys.modules and sys.modules[partial]:
# just list the parent name and be done with it
external.add(partial)
break
for name in external:
print name, sys.modules[name].__file__
Put this is a new module, import it after all imports in your script, and it'll print all modules that it thinks are not part of the standard library.
The standard library is defined in the documentation of python. You can just search there, or put the module names into a list and check programmatically with that.
Alternatively, in python3.4 there's a new isolated mode that allows to ignore a certain number of user-defined library paths. In previous versions of python you can use -s to ignore the per-user environment and -E to ignore the system defined variables.
In python2 a very simple way to check if a module is part of the standard library is to clear the sys.path:
>>> import sys
>>> sys.path = []
>>> import numpy
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named numpy
>>> import traceback
>>> import os
>>> import re
However this doesn't work in python3.3+:
>>> import sys
>>> sys.path = []
>>> import traceback
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named 'traceback'
[...]
This is because starting with python3.3 the import machinery was changed, and importing the standard library uses the same mechanism as importing any other module (see the documentation).
In python3.3 the only way to make sure that only stdlib's imports succeed is to add only the standard library path to sys.path, for example:
>>> import os, sys, traceback
>>> lib_path = os.path.dirname(traceback.__file__)
>>> sys.path = [lib_path]
>>> import traceback
>>> import re
>>> import numpy
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ImportError: No module named 'numpy'
I used the traceback module to get the library path, since this should work on any system.
For the built-in modules, which are a subset of the stdlib modules, you can check sys.builtin_module_names
#Bakuriu's answer was very useful to me. The only problem I experienced is if you want to check if a particular module is stdlib however is has been imported already. In that case, sys.modules will only have an entry for it so even if sys.path is stripped, the import will succeed:
In [1]: import sys
In [2]: import virtualenv
In [3]: sys.path = []
In [4]: try:
__import__('virtualenv')
except ImportError:
print(False)
else:
print(True)
...:
True
vs
In [1]: import sys
In [2]: sys.path = []
In [3]: try:
__import__('virtualenv')
except ImportError:
print(False)
else:
print(True)
...:
False
I whipped out the following solution which seems to work in both Python2 and Python3:
from __future__ import unicode_literals, print_function
import sys
from contextlib import contextmanager
from importlib import import_module
#contextmanager
def ignore_site_packages_paths():
paths = sys.path
# remove all third-party paths
# so that only stdlib imports will succeed
sys.path = list(filter(
None,
filter(lambda i: 'site-packages' not in i, sys.path)
))
yield
sys.path = paths
def is_std_lib(module):
if module in sys.builtin_module_names:
return True
with ignore_site_packages_paths():
imported_module = sys.modules.pop(module, None)
try:
import_module(module)
except ImportError:
return False
else:
return True
finally:
if imported_module:
sys.modules[module] = imported_module
You can keep track of the source code here

ImportError when attempting to mock a module

I have a module that I am testing that depends on another module that won't be available at the time of testing. To get around this, I wrote (essentially):
import mock
import sys
sys.modules['parent_module.unavailable_module'] = mock.MagicMock()
import module_under_test
This works fine as long as module_under_test is doing one of the following import parent_module, import parent_module.unavailable_module. However, the following code generates a traceback:
>>> from parent_module import unavailable_module
ImportError: cannot import name unavailable_module
What's up with this? What can I do in my test code (without changing the import statement) to avoid this error?
Alright, I think I've figured it out. It appears that in the statement:
from parent_module import unavailable_module
Python looks for an attribute of parent_module called unavailable_module. Therefore, the following set up code fully replaces unavailable_module within parent_module:
import mock
import sys
fake_module = mock.MagicMock()
sys.modules['parent_module.unavailable_module'] = fake_module
setattr(parent_module, 'unavailable_module', fake_module)
I tested the four import idioms of which I am aware:
import parent_module
import parent_module.unavailable_module
import parent_module.unavailable_module as unavailabe_module
from parent_module import unavailable_module
and each worked with the above set up code.

Importing a module based on installed python version?

My module currently imports the json module, which is only available in 2.6. I'd like to make a check against the python version to import simplejson, which can be built for 2.5 (and is the module adopted in 2.6 anyway). Something like:
if __version__ 2.5:
import simplejson as json
else:
import json
What's the best way to approach this?
try:
import simplejson as json
except ImportError:
import json
of course, it doesn't work around cases when in python-2.5 you don't have simplejson installed, the same as your example.
Though the ImportError approach (SilentGhost's answer) is definitely best for this example, anyone wanting to do that __version__ thing would use something like this:
import sys
if sys.version_info < (2, 6):
import simplejson as json
else:
import json
To be absolutely clear though, this is not the "best way" to do what you wanted... it's merely the correct way to do what you were trying to show with __version__.
You can import one or more modules without Handling ImportError error:
import sys
major_version = sys.version_info.major
if major_version == 2:
import SocketServer
import SimpleHTTPServer
import urllib2
elif major_version == 3:
import http.server as SimpleHTTPServer
import socketserver as SocketServer
import urllib.request as urllib2

Categories

Resources