Python: How to load a module twice?

Is there a way to load a module twice in the same python session?
To fill this question with an example: Here is a module:
Mod.py
x = 0
Now I would like to import that module twice, like creating two instances of a class to have actually two copies of x.
To already answer the questions in the comments, "why anyone would want to do that if they could just create a class with x as a variable":
You are correct, but there exists some huge amount of source that would have to be rewritten, and loading a module twice would be a quick fix^^.

Yes, you can load a module twice:
import mod
import sys
del sys.modules["mod"]
import mod as mod2
Now, mod and mod2 are two instances of the same module.
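For example, with the module from the question (containing just x = 0), the two copies really are independent:
>>> mod.x = 5
>>> mod2.x
0
>>> mod is mod2
False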
That said, I doubt this is ever useful. Use classes instead -- eventually it will be less work.
Edit: In Python 2.x, you can also use the following code to "manually" import a module:
import imp

def my_import(name):
    file, pathname, description = imp.find_module(name)
    code = compile(file.read(), pathname, "exec", dont_inherit=True)
    file.close()
    module = imp.new_module(name)
    exec code in module.__dict__
    return module
This solution might be more flexible than the first one. You no longer have to "fight" the import mechanism since you are (partly) rolling your own one. (Note that this implementation doesn't set the __file__, __path__ and __package__ attributes of the module -- if these are needed, just add code to set them.)
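For instance, inside my_import above you could set the missing attributes from the values find_module already returned (a rough sketch; __path__ only applies to packages):
module.__file__ = pathname
module.__package__ = None  # or the enclosing package's name for a submodule
# __path__ is only set on packages, so a plain module can omit it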

Deleting an entry from sys.modules will not necessarily work (for example, it fails when importing recurly twice, if you want to work with multiple recurly accounts in the same app, etc.).
Another way to accomplish this is:
>>> import importlib.util
>>> spec = importlib.util.find_spec(module_name)
>>> instance_one = importlib.util.module_from_spec(spec)
>>> instance_two = importlib.util.module_from_spec(spec)
>>> spec.loader.exec_module(instance_one)   # run the module body in each copy
>>> spec.loader.exec_module(instance_two)
>>> instance_one is instance_two
False
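If module_name here is "mod" from the first answer, each instance now carries its own copy of x:
>>> instance_one.x = 42
>>> instance_two.x
0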

You could use the __import__ function.
module1 = __import__("module")
module2 = __import__("module")
Edit: As it turns out, this does not import two separate versions of the module, instead module1 and module2 will point to the same object, as pointed out by Sven.
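You can confirm that with a quick identity check:
print(module1 is module2)  # True -- both names refer to the same cached module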

Related

Importing modules in python (3 modules)

Let's say I have 3 modules within the same directory (module1, module2, module3).
Suppose module2 imports module3. If I then import module2 in module1, does that automatically import module3 into module1?
Thanks
No. An import only binds names inside the module that performs it. You can verify that with a small test.
Say:
# module1
import module2
# module2
import module3
# in module1
module3.foo() # NameError: name 'module3' is not defined
This is reasonable because you can think of it in reverse: if imports propagated down the chain automatically, it would be hard to tell which function came from which module, leading to complex naming conflicts.
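If module1 needs module3, it has to import it explicitly (assuming module3 defines foo()):
# module1
import module2
import module3
module3.foo()          # works now
module2.module3.foo()  # also works, since module3 is an attribute of module2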
No, it will not be imported into module1 unless you explicitly tell Python to, like so:
from module2 import *
What importing does conceptually is outlined below.
import some_module
The statement above is equivalent to:
module_variable = import_module("some_module")
All we have done so far is bind some object to a variable name.
When it comes to the implementation of import_module it is also not that hard to grasp.
def import_module(module_name):
    if module_name in sys.modules:
        module = sys.modules[module_name]
    else:
        filename = find_file_for_module(module_name)
        python_code = open(filename).read()
        module = create_module_from_code(python_code)
        sys.modules[module_name] = module
    return module
First, we check if the module has been imported before. If it was, then it will be available in the global list of all modules (sys.modules), and so will simply be reused. In the case that the module is not available, we create it from the code. Once the function returns, the module will be assigned to the variable name that you have chosen. As you can see, the process is not inefficient or wasteful. All you are doing is creating an alias for your module. In most cases, transparency is preferred, hence a quick look at the top of the file can tell you what resources are available to you. Otherwise, you might end up in a situation where you are wondering where a given resource is coming from. So that is why you do not get modules inherently "imported".
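You can observe this caching behaviour directly in an interactive session (using the standard math module as an example):
>>> import sys
>>> import math
>>> 'math' in sys.modules
True
>>> import math as math_again
>>> math_again is math
True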
Resource:
Python doc on importing

Is there a module cache in Python?

If a large module is loaded by some submodule of your code, is there any benefit to referencing the module from that namespace instead of importing it again?
For example:
I have a module MyLib, which makes extensive use of ReallyBigLib. If I have code that imports MyLib, should I dig the module out like so
import MyLib
ReallyBigLib = MyLib.SomeModule.ReallyBigLib
or just
import MyLib
import ReallyBigLib
Python modules can be considered singletons: no matter how many times you import them, they get initialized only once, so it's better to do:
import MyLib
import ReallyBigLib
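Both names end up bound to the same module object, so the big module is only initialized once. A quick check, using the hypothetical names from the question:
import MyLib
import ReallyBigLib
print(MyLib.SomeModule.ReallyBigLib is ReallyBigLib)  # True: one cached module object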
Relevant documentation on the import statement:
https://docs.python.org/2/reference/simple_stmts.html#the-import-statement
Once the name of the module is known (unless otherwise specified, the term “module” will refer to both packages and modules), searching for the module or package can begin. The first place checked is sys.modules, the cache of all modules that have been imported previously. If the module is found there then it is used in step (2) of import.
The imported modules are cached in sys.modules:
This is a dictionary that maps module names to modules which have already been loaded. This can be manipulated to force reloading of modules and other tricks. Note that removing a module from this dictionary is not the same as calling reload() on the corresponding module object.
As others have pointed out, Python maintains an internal list of all modules that have been imported. When you import a module for the first time, the module (a script) is executed in its own namespace until the end, the internal list is updated, and execution continues after the import statement.
Try this code:
# module/file a.py
print "Hello from a.py!"
import b
# module/file b.py
print "Hello from b.py!"
import a
There is no loop: there is only a cache lookup.
>>> import b
Hello from b.py!
Hello from a.py!
>>> import a
>>>
One of the beauties of Python is how everything devolves to executing a script in a namespace.
It makes no substantial difference. If the big module has already been loaded, the second import in your second example does nothing except adding 'ReallyBigLib' to the current namespace.
WARNING: Python does not guarantee that a module will not be initialized twice.
I've stumbled upon such an issue. See the discussion:
http://code.djangoproject.com/ticket/8193
The internal registry of imported modules is the sys.modules dictionary, which maps module names to module objects. You can look there to see all the modules that are currently imported.
You can also pull some useful tricks (if you need to) by monkeying with sys.modules - for example adding your own objects as pseudo-modules which can be imported by other modules.
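For example, here is a rough sketch of that pseudo-module trick (the module name app_config and its contents are made up for illustration):
import sys
import types

fake = types.ModuleType("app_config")
fake.DEBUG = True
sys.modules["app_config"] = fake

# any later import finds the injected object in the cache
import app_config
print(app_config.DEBUG)  # True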
It is the same performance-wise. There is no JIT compiler in Python yet.

Why can't Python's import work like C's #include?

I've literally been trying to understand Python imports for about a year now, and I've all but given up programming in Python because it just seems too obfuscated. I come from a C background, and I assumed that import worked like #include, yet if I try to import something, I invariably get errors.
If I have two files like this:
foo.py:
a = 1
bar.py:
import foo
print foo.a
input()
WHY do I need to reference the module name? Why not just be able to write import foo, print a? What is the point of this confusion? Why not just run the code and have stuff defined for you as if you wrote it in one big file? Why can't it work like C's #include directive where it basically copies and pastes your code? I don't have import problems in C.
To do what you want, you can use (not recommended, read further for explanation):
from foo import *
This will import everything to your current namespace, and you will be able to call print a.
However, the issue with this approach is the following. Consider the case when you have two modules, moduleA and moduleB, each having a function named GetSomeValue().
When you do:
from moduleA import *
from moduleB import *
you have a namespace resolution issue*: which function are you actually calling with GetSomeValue(), moduleA.GetSomeValue() or moduleB.GetSomeValue()?
In addition to this, you can use the Import As feature:
from moduleA import GetSomeValue as AGetSomeValue
from moduleB import GetSomeValue as BGetSomeValue
Or, when GetSomeValue is itself a submodule rather than a function (the import ... as form only works for modules):
import moduleA.GetSomeValue as AGetSomeValue
import moduleB.GetSomeValue as BGetSomeValue
This approach resolves the conflict manually.
I am sure you can appreciate from these examples the need for explicit referencing.
* Python has its namespace resolution mechanisms, this is just a simplification for the purpose of the explanation.
Imagine you have a function in your module which chooses some object from a list:
def choice(somelist):
    ...
Now imagine further that, either in that function or elsewhere in your module, you are using randint from the random library:
a = randint(1, x)
Therefore we
import random
Your suggestion, that import should do what is now done by from random import *, means that we would now have two different functions called choice, as random includes one too. Only one would be accessible, and you have introduced ambiguity as to what choice() actually refers to elsewhere in your code.
This is why it is bad practice to import everything; either import what you need:
from random import randint
...
a = randint(1, x)
or the whole module:
import random
...
a = random.randint(1, x)
This has two benefits:
You minimise the risks of overlapping names (now and in future additions to your imported modules); and
When someone else reads your code, they can easily see where external functions come from.
There are a few good reasons. The module provides a sort of namespace for the objects in it, which allows you to use simple names without fear of collisions -- coming from a C background you have surely seen libraries with long, ugly function names to avoid colliding with anybody else.
Also, modules themselves are also objects. When a module is imported in more than one place in a python program, each actually gets the same reference. That way, changing foo.a changes it for everybody, not just the local module. This is in contrast to C where including a header is basically a copy+paste operation into the source file (obviously you can still share variables, but the mechanism is a bit different).
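A small illustration of that sharing, using the foo.py from the question plus two hypothetical modules:
# foo.py
a = 1

# first.py
import foo
foo.a = 42

# second.py
import first   # running first.py mutates foo.a
import foo
print(foo.a)   # 42, not 1 -- both modules see the same foo object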
As mentioned, you can say from foo import * or better from foo import a, but understand that the underlying behavior is actually different, because you are taking a and binding it to your local module.
If you use something often, you can always use the from syntax to import it directly, or you can rename the module to something shorter, for example
import itertools as it
When you do import foo, the module is loaded (or fetched from the cache) and bound to the name foo in the current namespace.
So, to use anything inside foo, you have to address it via the module.
However, if you use from foo import something, you don't have to prepend the module name, since it loads something from the module and binds it to the name something. (Not a recommended practice.)
import importlib.util

# works roughly like C's #include; always call it as include(<path>, __name__)
def include(file, module_name):
    spec = importlib.util.spec_from_file_location(module_name, file)
    mod = importlib.util.module_from_spec(spec)
    # spec.loader.exec_module(mod)  # would instead run the file in its own namespace
    code = spec.loader.get_code(module_name)
    exec(code, globals())           # run the file's code in the including module's namespace
For example:
#### file a.py ####
a = 1
#### file b.py ####
b = 2
if __name__ == "__main__":
    print("Hi, this is b.py")
#### file main.py ####
# assuming you have `include` in scope
include("a.py", __name__)
print(a)
include("b.py", __name__)
print(b)
the output will be:
1
Hi, this is b.py
2

Does python ever multiply-load a module?

I'm noticing some weird situations where tests like the following fail:
x = <a function from some module, passed around some big application for a while>
mod = __import__(x.__module__)
x_ref = getattr(mod, x.__name__)
assert x_ref is x # Fails
(Code like this appears in the pickle module)
I don't think I have any import hooks, reload calls, or sys.modules manipulation that would mess with python's normal import caching behavior.
Is there any other reason why a module would be loaded twice? I've seen claims about this (e.g, https://stackoverflow.com/a/10989692/1332492), but I haven't been able to reproduce it in a simple, isolated script.
I believe you misunderstood how __import__ works:
>>> from my_package import my_module
>>> my_module.function.__module__
'my_package.my_module'
>>> __import__(my_module.function.__module__)
<module 'my_package' from './my_package/__init__.py'>
From the documentation:
When the name variable is of the form package.module, normally, the top-level package (the name up till the first dot) is returned, not the module named by name. However, when a non-empty fromlist argument is given, the module named by name is returned.
As you can see, __import__ does not return the sub-module, but only the top package. If you also have the function defined at package level, you will indeed get different references to it.
If you want to just load a module you should use importlib.import_module instead of __import__.
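For comparison, continuing the interactive example above, importlib.import_module returns the submodule itself (the path shown is illustrative):
>>> import importlib
>>> importlib.import_module(my_module.function.__module__)
<module 'my_package.my_module' from './my_package/my_module.py'>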
To answer your actual question: AFAIK there is no way to import the same module, with the same name, twice without messing around with the import mechanism. However, you could have a submodule of a package that is also available on sys.path; in this case you can import it twice under different names:
from some.package import submodule
import submodule as submodule2
print(submodule is submodule2) # False. They have no relationship.
This sometimes can cause problems with, e.g., pickle. If you pickle something referenced by submodule you cannot unpickle it using submodule2 as reference.
However this doesn't address the specific example you gave us, because using the __module__ attribute the import should return the correct module.

How to change a Python module name?

Is it only possible if I rename the file? Or is there a __module__ variable in the file to define its name?
If you really want to import the file 'oldname.py' with the statement 'import newname', there is a trick that makes it possible: Import the module somewhere with the old name, then inject it into sys.modules with the new name. Subsequent import statements will also find it under the new name. Code sample:
# this is in file 'oldname.py'
...module code...
Usage:
# inject the 'oldname' module with a new name
import oldname
import sys
sys.modules['newname'] = oldname
Now you can import your module everywhere with import newname.
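After the injection has run, both names resolve to the same module object:
import oldname, newname
print(newname is oldname)  # True -- one module, two names in sys.modules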
You can change the name used for a module when importing by using as:
import foo as bar
print bar.baz
Yes, you should rename the file. Once you have done that, it is best to remove the oldname.pyc and oldname.pyo compiled files (if present) from your system; otherwise the module will still be importable under the old name.
When you do import module_name the Python interpreter looks for a file module_name.extension in PYTHONPATH. So there's no changing that name without changing the name of the file. But of course you can do:
import module_name as new_module_name
or even
import module_name.submodule.subsubmodule as short_name
Useful e.g. for writing DB code.
import sqlite3 as sql
sql.whatever..
And then to switch e.g. sqlite3 to pysqlite you just change the import line.
Every class has an __module__ property, although I believe changing this will not change the namespace of the class.
If it is possible, it would probably involve using setattr to insert the methods or class into the desired module, although you run the risk of making your code very confusing to your future peers.
Your best bet is to rename the file.
Where would you like to have this __module__ variable, so your original script knows what to import? Modules are recognized by file name and looked up in the paths defined in the sys.path variable.
So, you have to rename the file, then remove the oldname.pyc, just to make sure everything works right.
I had an issue like this with bsddb. I was forced to install the bsddb3 module, but hundreds of scripts imported bsddb. Instead of changing the import in all of them, I extracted the bsddb3 egg and created a soft link in the site-packages directory so that "bsddb" and "bsddb3" were one and the same to Python.
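An alternative to the filesystem link is to do the aliasing in Python itself, much like the sys.modules trick above (a sketch, assuming bsddb3 is installed; it must run before anything imports bsddb, e.g. in sitecustomize.py or a common bootstrap module):
import sys
import bsddb3
sys.modules['bsddb'] = bsddb3  # old scripts doing "import bsddb" now get bsddb3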
You can set the module via the __module__ attribute, as below.
func.__module__ = module
You can even create a decorator to change the module names for specific functions in a file for example:
def set_module(module):
    """Decorator for overriding __module__ on a function or class.

    Example usage::

        @set_module('numpy')
        def example():
            pass
        assert example.__module__ == 'numpy'
    """
    def decorator(func):
        if module is not None:
            func.__module__ = module
        return func
    return decorator
and then use
@set_module('my_module')
def some_func(...):
Note that this decorator changes the reported module name for individual functions or classes; it does not rename the module itself.
This example is taken from numpy source code: https://github.com/numpy/numpy/blob/0721406ede8b983b8689d8b70556499fc2aea28a/numpy/core/numeric.py#L289
